Normal text to table format

eboye · January 26, 2013, 7:48pm

Hi,

I am trying to convert my output from a simple list into a table (rows and columns).

Currently my output looks like this (the format will always be the same). The first 3 lines must end up aligned on one output line, the next 3 on the second line, and so on:

INT1:
STR1
STR2
EXT1:
STR1
STR2
INT2:
STR1
STR2

And the output format I'm trying to get is as follows:

INT1:     STR2     STR1
EXT1:     STR2     STR1
INT2:     STR2     STR1

I was wondering if you could give me some hints on how to achieve this.

Thank You

Yoda · January 26, 2013, 8:35pm

awk 'BEGIN{c=0}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' filename

If you want the results to be tab separated, then try:

awk 'BEGIN{c=0;FS="\t"}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' file

Note: Use nawk instead for Solaris or SunOS
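For readers new to the ORS trick, the first one-liner above can be spelled out rule by rule. This is only a commented sketch of the same idea, with the separator written as a literal tab and the command wrapped in a shell function; the name `join3` is made up for illustration.

```shell
# join3 is a hypothetical helper; it reads the list from stdin or from file arguments.
join3() {
    awk '
    { c++ }                        # position of this line within its group of three
    c < 3  { ORS = "\t" }          # lines 1 and 2: follow the line with a tab
    c == 3 { ORS = "\n"; c = 0 }   # line 3: follow it with a newline, start a new group
    { print }                      # print the line plus the current ORS
    ' "$@"
}

printf '%s\n' INT1: STR1 STR2 EXT1: STR1 STR2 | join3
```

The rules run top to bottom for every input line, so ORS is already set by the time the final `print` runs.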

eboye · January 26, 2013, 11:52pm

thx for the quick reply.

awk 'BEGIN{c=0;FS="\t"}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' file

This line of code is doing what I asked for. But I have one question: there is only one space between the values on a line, so the columns are not well aligned.

aaa bbb ccc
aaaaa bbb ccc

To fix this I tried to modify the code like this

awk 'BEGIN{c=0;FS="     "}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' file

and

awk 'BEGIN{c=0;FS="\t\t"}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' file

But no luck. If the solution to my problem is easy, can you please tell me which part I should change?

thanks

Yoda · January 27, 2013, 12:00am

Set OFS as well and try:

awk 'BEGIN{c=0;FS="\t";OFS="\t"}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' file
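If a single separator character still leaves the columns ragged, one alternative is to pad each value to a fixed width with printf instead of changing separators. A minimal sketch, assuming a width of 8 characters is enough for the first two values of every row; the width and the helper name `pad3` are illustrative and do not come from the posts above.

```shell
# pad3 WIDTH: hypothetical helper, reads the list from stdin.
pad3() {
    awk -v w="${1:-8}" '
    {
        # lines 1 and 2 of each group are left-justified in w characters,
        # line 3 is printed as-is and ends the row
        fmt = (NR % 3) ? ("%-" w "s") : "%s\n"
        printf fmt, $0
    }'
}

printf '%s\n' aaa bbb ccc aaaaa bbb ccc | pad3 8
```

A value longer than the chosen width will still push its row out of alignment, so the width has to be picked with the data in mind.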

Don_Cragun · January 27, 2013, 12:11am

Here are three different awk scripts to do this, making different assumptions about the number of columns per row and the width of the columns. If every input line is always less than 8 output columns (characters) wide and you always have three input lines per output row, the 1st script is very simple.

If your input data has input lines that vary in length, but there are always three input lines per output row, the 2nd awk script below adjusts the output to match input field widths.

If you have a variable number of input lines per output row but the column 1 input line data always ends with a colon, the 3rd awk script below will adjust the number of rows and column widths based on the input file contents.

Here are the three awk scripts:

echo 'Following assumes 3 lines/row, tab separator:'
awk '{printf("%s%s", $0, NR % 3 ? "\t" : "\n")}' input
echo
echo 'Following assumes 3 lines/row, field width based on input:'
awk '
{       o[int((NR + 2)/3),++c] = $0
        if(length($0) > w[c]) w[c] = length($0)
        if(c == 3) c = 0
}
END {   fmt = sprintf("%%-%ds%%-%ds%%s\n", w[1] + 2, w[2] + 2)
#printf("fmt=%s\n", fmt)
        for(i = 1; i <= NR / 3; i++)
                printf(fmt, o[i,1], o[i,2], o[i,3])
}' input
echo
echo 'Following assumes Column 1 data ends with ":", field width based on input:'
awk '
/:$/ {  r++
        if(c > mc) mc = c
        c = 0
}       
{       o[r,++c] = $0
        if(length($0) > w[c]) w[c] = length($0)
}       
END {   for(i = 1; i <= r; i++) {
                for(j = 1; j < mc; j++)
                        printf("%-*s", w[j] + 2, o[i, j])
                printf("%s\n", o[i, mc])
        }       
}' input

When these three scripts are given a file named input containing:

INT1:
STR1
STR2
EXT1:
STR1
STR2
INT2:
STR1
STR2
Longer Column 1:
Column2
column three
column four
column 5
c6
2nd C1:
2nd C2
2nd C3
2nd C4
2nd C5
Second Column 6

the output produced is:

Following assumes 3 lines/row, tab separator:
INT1:   STR1    STR2
EXT1:   STR1    STR2
INT2:   STR1    STR2
Longer Column 1:        Column2 column three
column four     column 5        c6
2nd C1: 2nd C2  2nd C3
2nd C4  2nd C5  Second Column 6

Following assumes 3 lines/row, field width based on input:
INT1:             STR1      STR2
EXT1:             STR1      STR2
INT2:             STR1      STR2
Longer Column 1:  Column2   column three
column four       column 5  c6
2nd C1:           2nd C2    2nd C3
2nd C4            2nd C5    Second Column 6

Following assumes Column 1 data ends with ":", field width based on input:
INT1:             STR1     STR2
EXT1:             STR1     STR2
INT2:             STR1     STR2
Longer Column 1:  Column2  column three  column four  column 5  c6
2nd C1:           2nd C2   2nd C3        2nd C4       2nd C5    Second Column 6

As always, if you're using a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk.

eboye · January 27, 2013, 1:17am

Thank you very much for the detailed explanation, I got more than I asked for.
Everything is working as expected

Jotne · January 27, 2013, 5:07am

@bipinajith
You set c=0 in all your examples.
This is not needed, since an uninitialized variable evaluates to 0 in a numeric context, so it can be removed from all your examples.

awk 'BEGIN{c=0}{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' filename

is the same as

awk '{c++}c<3{ORS=FS}c==3{ORS=RS;c=0}1' filename
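A quick way to see this behaviour for yourself is the small check below. It is only a sketch; the function name `demo_uninit` is invented so the check can be rerun, and the variable `c` is never assigned before it is tested.

```shell
# demo_uninit is a hypothetical name; the awk program needs no input.
demo_uninit() {
    awk 'BEGIN {
        if (c == 0)  print "c compares equal to 0"
        if (c == "") print "c compares equal to the empty string"
        c++          # first use as a number starts from 0
        print "after c++, c is " c
    }'
}

demo_uninit
```

An unset awk variable acts as 0 in numeric contexts and as the empty string in string contexts, which is why the explicit BEGIN{c=0} changes nothing.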

RudiC · January 27, 2013, 5:22am

Don't know if it's relevant, but in the sample in post #1 the requester had input line 3 printed before input line 2, which every proposal so far has ignored, and even the requester did not insist on it in his acknowledgements. Anyhow, should this be necessary, try

$ awk '{getline x; getline y; print $0"\t"y"\t"x}' file
INT1:    STR2    STR1
EXT1:    STR2    STR1
INT2:    STR2    STR1

The other proposals would be easy to convert as well, of course.
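One thing to keep in mind with the getline approach: when getline hits end of input it leaves its variable unchanged, so a final group with fewer than three lines would print values left over from the previous row. A hedged sketch of a guarded variant, assuming an incomplete last group should simply get empty cells (the helper name `swap3` and the variable names are made up):

```shell
# swap3 is a hypothetical helper; it reads from stdin or from file arguments.
swap3() {
    awk '
    {
        first = $0
        # getline returns 1 on success, 0 at end of input, -1 on error
        if ((getline second) <= 0) second = ""
        if ((getline third)  <= 0) third  = ""
        print first "\t" third "\t" second   # input line 3 goes before line 2
    }' "$@"
}

printf '%s\n' INT1: STR1 STR2 EXT1: | swap3
```

With complete groups of three it behaves like the one-liner above.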

drl · January 27, 2013, 8:58am

Hi.

% paste - - - < data1
INT1:	STR1	STR2
EXT1:	STR1	STR2
INT2:	STR1	STR2

Also without considering the order of the tokens, as noted by RudiC.

Best wishes ... cheers, drl

eboye · January 27, 2013, 5:14pm

Thanks for noticing the order. I managed to fix it by running the following code on the output of the code provided in the previous posts.

awk '{ t=$2 ; $2=$3; $3=t; print }'
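Note that this one-liner splits on any whitespace and rebuilds the line with single spaces, so tabs are lost and a value that itself contains a space would be split into two fields. A sketch of a tab-aware variant, assuming the first step produced tab-separated output (for example from `paste - - -`); the helper name `swap23` is invented for illustration.

```shell
# swap23 is a hypothetical helper; it expects tab-separated rows on stdin or in files.
swap23() {
    # split and rejoin on tabs only, so values containing spaces stay intact
    awk -F '\t' -v OFS='\t' '{ t = $2; $2 = $3; $3 = t; print }' "$@"
}

printf '%s\n' 'Longer Column 1:' Column2 'column three' | paste - - - | swap23
```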

Don_Cragun · January 27, 2013, 7:43pm

Thanks RudiC for noting the point of order we all missed.

The three awk scripts I originally posted could not be fixed as easily by the method shown in message #10 in this thread as some of the other proposed solutions. For the record, here are the three scripts I provided earlier, updated to reverse the output order of input fields 2 and 3 no matter how many output fields there are. (The third script also corrects a bug that would be revealed if the final set of input lines increased the number of columns to be output.)

echo 'Following assumes 3 lines/row, tab separator:'
awk '{getline o3;getline o2;print $0, o2, o3}' OFS="\t" input
echo
echo 'Following assumes 3 lines/row, field width based on input:'
awk '
BEGIN { s[1] = 1; s[2] = 3; s[3] = 2} # s[input_column#] = output_column#
{       o[int((NR + 2)/3),s[++c]] = $0
        if(length($0) > w[s[c]]) w[s[c]] = length($0)
        if(c == 3) c = 0
}
END {   fmt = sprintf("%%-%ds%%-%ds%%s\n", w[1] + 2, w[2] + 2)
        for(i = 1; i <= NR / 3; i++)
                printf(fmt, o[i,1], o[i,2], o[i,3])
}' input
echo
echo 'Following assumes Column 1 data ends with ":", field width based on input:'
awk '
# Usage: oc = oo(ic)
# NAME oo -- convert input column # to output column #
function oo(ic) {
        return(ic == 2 ? 3 : ic == 3 ? 2 : ic)
}
/:$/ {  r++
        if(c > mc) mc = c
        c = 0
}
{       o[r,oo(++c)] = $0
        if(length($0) > w[oo(c)]) w[oo(c)] = length($0)
}
END {   if(c > mc) mc = c
        for(i = 1; i <= r; i++) {
                for(j = 1; j < mc; j++)
                        printf("%-*s", w[j] + 2, o[i, j])
                printf("%s\n", o[i, mc])
        }
}' input