For no particular reason, I would like to use awk on a file that contains multiple columns, with spaces as the column delimiter. Only columns 1 and 2 hold single values; the remainder of each line is text that I would like to treat as one column, e.g.:
alpha 200 this is a comment for this record
bravo 400 this is another comment for this record
I would like awk to output $1, $2 and $3 as
$1 = alpha
$2 = 200
$3 = this is a comment for this record
It looks better. Sorry, maybe I did not express myself correctly: I still need the columns to show on the same row. I just want awk to treat everything from $3 onward (however many delimiters that line may have: $4, $5, $6, etc.) as $3, so the output would be $1 $2 $3, e.g.
print $3 would print "this is a comment for this record"
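One way to get that effect, as a minimal sketch: leave $1 and $2 alone and build the third "column" by deleting the first two fields and their delimiters from a copy of $0. The function name rest_as_third and the "|" separators are only illustrative (they are not from this thread), and the sketch assumes fields are separated by spaces or tabs.

```shell
# Hypothetical helper: print $1, $2 and the untouched remainder of the line.
rest_as_third() {
    awk '{
        rest = $0
        # Delete optional leading blanks, field 1, blanks, field 2, blanks.
        sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+/, "", rest)
        print $1 "|" $2 "|" rest
    }' "$@"
}

printf 'alpha 200 this is a comment for this record\n' | rest_as_third
```

Because the regex describes the position of the first two fields rather than the text of $3, it does not matter how often a word is repeated in the comment. A line with fewer than three fields is not handled specially here.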
@ctsgnb: It does work for file t1 in post #8. You might also want to take a look at this. Maybe the awk installed on my system is having a hangover:
[root@host dir]# cat file
alpha 200 this is a comment for this record
bravo 400 this is another comment for this record
[root@host dir]#
[root@host dir]# awk 'sub(".*"$3,$1RS$2RS$3)' file
alpha
200
this record
bravo
400
this record
[root@host dir]#
Instead of "this is a comment for this record", it just prints "this record". I'm using GNU Awk 3.1.5
Same. I tried on a different version too (CYGWIN, GNU Awk 4.0.0) and got the same result. Baffling!
[user@home-pc ~]$ cat file
alpha 200 this is a comment for this record
bravo 400 this is another comment for this record
[user@home-pc ~]$
[user@home-pc ~]$ awk '{a=$1"\n"$2;sub(".*"$3,$3);print a"\n"$0}' file
alpha
200
this record
bravo
400
this record
[user@home-pc ~]$
Strange that it works on GNU Awk 3.1.1! Anyway, the one by Scrutinizer is pretty creative. Good one, mate!
@cts, @bala: the difference occurs not because of the awk versions but because you are using different data samples. With bala's data sample, this part of ctsgnb's code is problematic: sub(".*"$3,$3), which matches up to the last occurrence of "this" because of greedy matching.
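To see the greedy match in isolation, here is a small sketch that applies the same substitution to a copy of the line and prints the result. The function name greedy_demo and the second sample line are made up for illustration.

```shell
# Hypothetical helper: apply the problematic substitution to a copy of the line.
greedy_demo() {
    awk '{
        line = $0
        # ".*" is greedy, so the match runs up to the LAST occurrence of $3.
        sub(".*" $3, $3, line)
        print line
    }' "$@"
}

# $3 ("this") occurs twice, so everything up to the second "this" is replaced.
printf 'alpha 200 this is a comment for this record\n' | greedy_demo

# $3 ("only") occurs once, so the substitution appears to work.
printf 'alpha 200 only one comment here\n' | greedy_demo
```

The first line comes out as "this record", the second as "only one comment here", which is why the result depends on the data sample and not on the awk version.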
The $1=$1 is not strictly necessary with the sample provided, but it makes the script more robust: if the data were to include TABs or multiple spaces, or if there were a space before the first field, then it might break otherwise.
Yes, the $1=$1 makes awk rebuild the record, so we can be sure there is no leading whitespace and only a single space separating the fields, and the subs can succeed.
For example:
$ printf 'alpha 200 this is a comment for this record\n' | awk '{sub(FS,RS); sub(FS,RS)}1'
alpha
200
this is a comment for this record
$ printf ' alpha \t 200 this is a comment for this record\n' | awk '{sub(FS,RS); sub(FS,RS)}1'
alpha 200 this is a comment for this record
$ printf ' alpha \t 200 this is a comment for this record\n' | awk '{$1=$1; sub(FS,RS); sub(FS,RS)}1'
alpha
200
this is a comment for this record
sub(FS,RS) substitutes the Field Separator (whose default value is a space " ") with the Record Separator (whose default value is a newline "\n").
It does this substitution only once, so the first FS met is changed into an RS.
That is why it is important to make sure that the first FS encountered is the one between the first field $1 and the second field $2.
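The "only once" behaviour is easiest to see next to gsub(), which replaces every match. A minimal sketch, with made-up function names first_only and every_one and a made-up three-word input:

```shell
# Hypothetical helpers: sub() changes the first match, gsub() changes all of them.
first_only() {
    awk '{ sub(FS, RS) } 1' "$@"
}
every_one() {
    awk '{ gsub(FS, RS) } 1' "$@"
}

printf 'a b c\n' | first_only   # two lines: "a" and "b c"
printf 'a b c\n' | every_one    # three lines: "a", "b" and "c"
```

Calling sub(FS,RS) twice therefore splits off exactly the first two fields and leaves the rest of the line in one piece.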
sub(".*"$3,$3) replaces, in the current line ($0), everything up to and including the last match of $3's text with the current value of $3.
print a"\n"$0 prints the variable a and the current line, separated by a newline.
Referring to file t1 above, the current value of $3 is 'this', right? I am a bit confused. Could you please explain how this code works:
sub(".*"$3,$3)
Will it return the value 'this' or 'This is a comment'?
Please help me, I am a newbie.
---------- Post updated at 06:34 PM ---------- Previous update was at 04:45 PM ----------
got it....
It looks like the above code does not give the desired results if the file contains a string (this) which is repeated, as below:
# cat t1
1 2 This is a comment this is not a comment this this1
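A sketch of one way to avoid matching on the text of $3 at all: rebuild the remainder by joining fields 3 through NF in a loop. The function name join_from_third is hypothetical, and the sketch assumes it is acceptable for runs of blanks inside the comment to collapse to single spaces.

```shell
# Hypothetical helper: join $3..$NF with single spaces, no regex on field text.
join_from_third() {
    awk '{
        rest = $3
        for (i = 4; i <= NF; i++)
            rest = rest " " $i
        print rest
    }' "$@"
}

printf '1 2 This is a comment this is not a comment this this1\n' | join_from_third
```

Since nothing is searched for, repeated words in the comment cannot change where the third column starts.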