awk cut column based on string

Using awk, I need to cut out the column containing the word "-Tag", regardless of where it appears in the line, and case-insensitively.

-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical

Please Guide ......

--Shirish Shukla

---------- Post updated at 05:58 AM ---------- Previous update was at 05:50 AM ----------

I have come up with this, but it is case-sensitive:

# echo "-tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical" | awk '{for(i=1;i<=NF;i++) if($i ~ /tag/) print $i}'
-tag:messages

:rolleyes:

 
$ cat test.txt
-tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-TAG:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-tAG:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
$ nawk '{for(i=1;i<=NF;i++){if($i~/[Tt][Aa][Gg]/)print $i}}' test.txt
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages
grep -io "\-tag:messages" inputfile

@itamaraj

Sorry, nawk is not installed; I have to achieve this with awk only ...
Thanks..

---------- Post updated at 06:19 AM ---------- Previous update was at 06:16 AM ----------

@balajesuri

Thanks, I am aware of it ... but I want to achieve this with awk only ...

--Shirish

use awk instead of nawk

Thanks All !!!

Here is what I used: IGNORECASE=1 with awk

[root@nagios Shirish@Shukla]# echo "-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical" | \
> awk 'IGNORECASE=1 {for(i=1;i<=NF;i++) if($i ~ /tAg/) print $i}'
-Tag:messages
[root@nagios Shirish@Shukla]# echo "-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical" | \
> awk 'IGNORECASE=1 {for(i=1;i<=NF;i++) if($i ~ /tag/) print $i}'
-Tag:messages
[root@nagios Shirish@Shukla]# echo "-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical" | \
> awk 'IGNORECASE=1 {for(i=1;i<=NF;i++) if($i ~ /taG/) print $i}'
-Tag:messages
[root@nagios Shirish@Shukla]#

Whenever you post a question, include your OS and shell details.

That makes it easier to give suggestions and ideas.

---------- Post updated at 12:59 PM ---------- Previous update was at 12:58 PM ----------

Note that IGNORECASE only works in GNU awk.
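A quick way to see whether a given awk honors IGNORECASE is to set it and try a match that can only succeed case-insensitively. This is just a minimal sketch; the variable name `result` and the messages are made up for illustration.

```shell
# Probe whether this awk treats IGNORECASE specially.
# In an awk that does not, IGNORECASE is an ordinary variable
# and "A" ~ /a/ stays false.
result=$(awk 'BEGIN {
    IGNORECASE = 1
    if ("A" ~ /a/) { print "IGNORECASE honored" } else { print "IGNORECASE ignored" }
}')
printf '%s\n' "$result"
```

If it prints "IGNORECASE ignored", the tolower() style of comparison is the safer route on that system.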

The standard awk is fairly weak. If you don't have access to GNU awk, install it. All the above solutions rely on GNU awk or nawk or at least Sun's xpg awk (which is an old version of nawk).

awk -v IGNORECASE=1 '{if( match($0,/-Tag:([^[:space:]]*)/,found)) print found[1]; }'

With nawk you might do something similar, but using sub() because nawk's match() isn't as cool as GNU's.
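As a rough sketch of that idea, avoiding both IGNORECASE and the three-argument match(): find the field with tolower() and strip the "-Tag:" prefix with sub(). The sample line and the names `line`, `value` and `v` are made up for illustration, and it assumes an awk that has tolower() and sub() (nawk, gawk, mawk).

```shell
line='-P:/var/log/messages -TAG:messages -K:Error'
value=$(printf '%s\n' "$line" | awk '{
    for (i = 1; i <= NF; i++) {
        if (tolower($i) ~ /^-tag:/) {   # case-insensitive test on a lowercased copy
            v = $i
            sub(/^[^:]*:/, "", v)       # drop everything up to and including the first colon
            print v
        }
    }
}')
printf '%s\n' "$value"
```

Like the GNU awk command above, this prints only the part after the colon, and it does not care where on the line the tag field sits.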

awk -F"[ :-]" 'tolower($2)~/tag/{print "-"$2":"$3}' yourfile

or

awk '{split($1,a,":")}tolower(a[1])~/-tag/{print $1}' yourfile

or

awk '{NF=1;split($1,a,":")}tolower(a[1])~/-tag/' yourfile
$ cat tst
-tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-TAG:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-tAG:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
$ awk -F"[ :-]" 'tolower($2)~/tag/{print "-"$2":"$3}' tst
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages
$ awk '{split($1,a,":")}tolower(a[1])~/-tag/{print $1}' tst
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages
$ awk '{NF=1;split($1,a,":")}tolower(a[1])~/-tag/' tst
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages

Are you certain about that, otheus? I was under the impression that /usr/xpg4/bin/awk was introduced to Solaris later and does more to approach the POSIX standard than nawk on Solaris does. nawk stands for "new awk", but that is only relative to the ancient original awk...

Not 100% sure, but I know Kernighan was maintaining nawk at least through 2007, and the OpenBSD project has been maintaining it since. As for Solaris, I think they brought awk over from System V back in the 90s, or maybe even before then with SunOS 4.x.

---------- Post updated at 03:54 PM ---------- Previous update was at 03:16 PM ----------

Follow-up:
From the FIXES file in awk.zip downloaded from Kernighan's web page:

Jun 1, 2003:
	subtle change to split: if source is empty, number of elems
	is always 0 and the array is not set.

From Solaris 10 (2005) xpg-awk:

$ /usr/xpg4/bin/awk 'BEGIN { print split(null,out,FS) }' </dev/null
0

So it would seem Solaris DID keep nawk up-to-date w.r.t Kernighan's version.

Then again....

Jan 1, 2002:
	length(arrayname) returns number of elements; thanks to 
	arnold robbins for suggestion

And on Sun's implementation:

$ /usr/xpg4/bin/awk 'BEGIN { split("test",out,/es/); print out[1]; print length(out)}' </dev/null
t
0

@otheus, interesting. I think, though, that you should be comparing these with Solaris nawk, not /usr/xpg4/bin/awk, which should not be following Kernighan's changes but rather striving to be POSIX compliant, no? What is the output of the same commands with nawk?

First, I think you should split this thread into the Underground forum, for instance, and link to it :slight_smile:

Second, Kernighan *is* the author of nawk. What Solaris did to what they call nawk is anyone's guess.

Third, Solaris lists the nawk man page and xpg4/awk man page as the same entity (yet oddly, the files differ vastly in size).

Fourth, nawk explicitly errors with length(arrayname):

$ nawk 'BEGIN { split("test",out,/es/); print out[1]; print length(out)}' </dev/null
t
nawk: can't read value of out; it's an array name.
 source line number 1

I think you are right, let's do that if you think it is interesting (I do), but what shall we call the thread? /usr/xpg4/bin/awk vs. nawk on Solaris? I thought in post#8 you meant on Solaris /usr/xpg4/bin/awk is an old version of nawk , i.e. the current version on Solaris. And my point was/is that nawk on Solaris is not as compliant as /usr/xpg4/bin/awk and therefore the latter is preferable to nawk on Solaris.

But on rereading you seem to be referring to a recent version of nawk on different systems. But in many other systems nawk is either non-existing or a link to gawk or mawk and on yet others awk is nawk (or bwk).

Yes, Kernighan is the author of nawk, but length() operating on an array is an added feature and is not part of the Posix specification (and unnecessary).
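Where length(arrayname) is not available, the element count can still be had without it: split() returns the number of pieces, and a for (k in array) loop can count them. A small sketch reusing the split("test", out, /es/) call from the earlier posts; the names `counts`, `n` and `c` are made up for illustration.

```shell
counts=$(awk 'BEGIN {
    n = split("test", out, /es/)   # split() itself returns the element count
    c = 0
    for (k in out) c++             # counting by iteration also works
    print n, c
}')
printf '%s\n' "$counts"
```

Both numbers should agree, since "test" split on /es/ gives the two pieces "t" and "t".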

Sure .. Have to be ...

But I was not aware that IGNORECASE only works in GNU awk! Thanks.

I have checked that this works fine on HP-UX, Solaris, and AIX, on various Linux flavours (SUSE/Red Hat/CentOS), and in sh/ksh too ...

--Shirish

You can use the tolower() or toupper() function when testing the match, which in effect gives a case-insensitive comparison.
(See the examples given in post #9.)
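For example, here is a sketch that lowercases each field before testing it, so the "-Tag" column is found wherever it sits on the line and is printed with its original spelling. The two sample lines and the name `found` are made up for illustration; it should work in any awk that has tolower().

```shell
found=$(printf '%s\n' \
    '-P:/var/log/messages -TAG:messages -K:Error' \
    '-K:Warning -P:/var/log/maillog -tAg:maillog' |
awk '{
    for (i = 1; i <= NF; i++)
        if (tolower($i) ~ /^-tag:/)   # compare a lowercased copy, print the original
            print $i
}')
printf '%s\n' "$found"
```

Anchoring the pattern with ^-tag: keeps it from matching fields that merely contain "tag" somewhere else, such as a path.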