One Column To Two Columns

Having difficulty reformatting a 1 column file into a 2 column file as shown below:

Example Input

A
01
02
05
07
08
11
B
1193941
7231222
5192121
3221312
2211149
1783289

Required Output

A  B
01 1193941
02 7231222
05 5192121
07 3221312
08 2211149
11 1783289

Field widths are always 2 in Section A and 7 in Section B.
Within each Section the field range is min 5 max 22.
Section A is in ascending numerical order
The Section titles are the only non 0-9 characters.

Any help would be much appreciated.

cat filename | tr '\n' '\t' | awk ' { for (i=1; i<=NF/2; i++) { print $i, $(i+(NF/2)), "\n" } }'

Solution edited and fixed

try also:

awk '
/[A-Za-z]+/ {a[c++]=$0; n=0}
!/[A-Za-z]+/ {o[a[c-1], n++]=$0}
END {
  for (i=0; i<c; i++) printf (i<c-1) ? a"\t":a"\n";
  for (j=0; j<n; j++) {
    for (i=0; i<c; i++) printf (i<c-1) ? o[a, j]"\t":o[a, j]"\n";
  }
}' input.txt

If there's the same number of rows each time:

$ cat file
A
01
02
05
07
08
11
B
1193941
7231222
5192121
3221312
2211149
1783289

$ split -l7 < file

$ ll
total 24
drwxr-xr-x   5 scott  staff   170 30 Dec 19:05 .
drwxr-xr-x  79 scott  staff  2686 30 Dec 19:05 ..
-rw-r--r--   1 scott  staff    70 30 Dec 19:05 file
-rw-r--r--   1 scott  staff    20 30 Dec 19:05 xaa
-rw-r--r--   1 scott  staff    50 30 Dec 19:05 xab

$ paste x*
A	B
01	1193941
02	7231222
05	5192121
07	3221312
08	2211149
11	1783289

I checked it and I find a small bug and here how I fix my first solution

cat filename | tr '\n' '\t' | awk ' { for (i=1; i<=NF/2; i++) { print $i, $(i+(NF/2)), "\n" } }'
$ cat test
A
01
02
05
07
08
11
B
1193941
7231222
5192121
3221312
2211149
1783289

$ cat test | tr '\n' '\t' | awk ' { for (i=1; i<=NF/2; i++) { print $i, $(i+(NF/2)), "\n" } }'
A B 

01 1193941 

02 7231222 

05 5192121 

07 3221312 

08 2211149 

11 1783289

If you want to verify each of the conditions stated and print warnings when the conditions aren't met, try the following:

#!/bin/ksh
awk '/[^[:digit:]]/ {
        h[++hc] = $0
        next
}
{       o[hc,++oc[hc]] = $0
}
END {   if(hc != 2) {
                printf("Expected 2 heading lines; found %d: ", hc)
                for(i = 1; i <= hc; i++)
                        printf("\"%s\"%s", h, i == hc ? "\n" : " ")
                ec = 1
        }
        if(oc[1] != oc[2] || oc[1] < 5 || oc[1] > 22) {
                printf("%s\n%s %d %s \"%s\" and %d %s \"%s\".\n",
                        "Expected 5 <= x <= 22 entries under both headings;",
                        "found", oc[1], "under heading", h[1], oc[2],
                        "under heading", h[2])
                ec += 2
        }
        printf("%-3s%s\n", h[1], h[2])
        c = oc[1] > oc[2] ? oc[1] : oc[2]
        last = -1
        for(i = 1; i <= c; i++) {
                if(i <= oc[1] && o[1, i] <= last) {
                        ec += ec % 8 < 4 ? 4 : 0
                        m = sprintf(" %d not > %d.",
                                o[1, i], last)
                } else  m = ""
                if(i <= oc[1] && length(o[1, i]) != 2) {
                        ec += ec % 16 < 8 ? 8 : 0
                        m = sprintf("%s 1st field not 2 digits.", m)
                }
                if(i <= oc[2] && length(o[2, i]) != 7) {
                        ec += ec < 16 ? 16 : 0
                        m = sprintf("%s 2nd field not 7 digits.\n", m)
                } else  m = sprintf("%s\n", m)
                printf("%s %s%s", o[1, i], o[2, i], m)
                last = o[1, i]
        }
        exit ec
}' Input

Replace /bin/ksh in #!/bin/ksh with an absolute pathname of the Korn shell (or any shell that conforms to POSIX shell command language requirements) on your system.

If you're on a Solaris system, use /usr/xpg4/bin/awk or nawk instead of awk .

The exit code will be 0 if all conditions are met, or 1 through 31 depending on which set of conditions fail.

Many thanks for the very quick replies.

Scott: the number of rows varies between 5(min) and 22(max)

It's a hard choice between between fuad's second post code and rtrdx1's code.

fuad's output rows are separated by blank lines.

rdtx1's output columns are separated by a tab(?) instead of single space.

On balance I'm inclined to go with fuad's single-liner largely because I know how to delete
blank lines :slight_smile:

---------- Post updated at 05:06 PM ---------- Previous update was at 04:44 PM ----------

Don Cragun

Thanks for a very comprehensive reply but as I'm able to verify the file meets the stated requirements before reformatting I don't have to cater for all the exception conditions.

OK. If you've pre-verified the input file, you can strip my suggestion down to:

#!/bin/ksh
awk '/[^[:digit:]]/ { h[++hc] = $0; next}
{       o[hc,++oc[hc]] = $0}
END {   printf("%-3s%s\n", h[1], h[2])
        for(i = 1; i <= oc[1]; i++) printf("%s %s\n", o[1, i], o[2, i])
}' Input

and get the same output.

1 Like
pr -t -s' ' -2 -l 7 input
awk '/[a-zA-Z]/{i=0}
{i++;A=A?A"\t"$0:$0}
END{for(j=1;j<=i;j++){print A[j]}}' file
1 Like
awk '/[^0-9]/{f=NR} f>1{print A[NR-f+1],$0; next} {A[NR]=$0}' file
1 Like

You can just add

sed 's/ /\t/g'

to the end to replace the space to tab or you can edit the original code to be like this:

cat test | tr '\n' '\t' | awk ' { for (i=1; i<=NF/2; i++) { print $i, "\t", $(i+(NF/2)), "\n" } }'

I see people give you the codes but I think it will be more benefit to you if you're doing it yourself. So, instead of give you the code, I would rather show you one of many ways on how to do it. You can split the file into 2 files, column_a and column_b, then merge them back to a single file with two columns. You can do that first by "grep" for where column b starts with the "-n" option to show line number. Then with the combination of using "head" and "tail" with a "-number" based on your "grep" result (you cannot use that exact number, why?), you can split the file in to two files: column_a and column_b. You can then use the "paste" command to merger the two files in to one with 2 columns, separated by tab.

You can also add "pre" check stuffs to check the input file is correct such as the line where column b start is right in the middle of the file.

Have fun writing your codes :slight_smile:

Toiday

You're quite right about the need to understand the options available rather than just using code snippets. Meanwhile I'm still at the cut-and-paste stage of my scripting learning curve so thanks again to all those who replied to my original post.