extract column with awk

This is a example TXT file:

ID CDR1 CDR2 CDR3
1 CDR1   CDR3
2 CDR1 CDR2 CDR3
3     CDR3   

What I want to do is extract the fourth column with awk, however, there are some blank space in some field. I get wrong result when it print out awk result.

awk '{print $1,$4}' awk_column_test.txt

How will i deal with these blank spaces?

If I want print 'NULL' when the field is blank, how will I should do?

Thank you very much!

cat file

ID CDR1 CDR2 CDR3
1 CDR1   CDR3
2 CDR1 CDR2 CDR3
3     CDR3

awk '$4 == ""{$4="NULL"}{print $4}' file

CDR3
NULL
CDR3
NULL
1 Like

What separator is used in your text file? Tab, space?

You might be able to tell it to use a single space or tab as the separator rather than globbing all whitespace...

awk -F"\t" '{ print $1,$4 }'
awk -F" " '{ print $1,$4 }'
1 Like

Try this

~/unix.com$ awk -F\| 'gsub(/ /,"|")&&$4=$4?$4:"NULL"{print $1,$4}' f

The above script is overcomplicated, use Corona688's or in2nix4life's suggestion instead :stuck_out_tongue:

1 Like

Sorry, I can't express clearly what I mean.

The txt file could be understand like this:

ID CDR1 CDR2 CDR3
1 CDR1 NULL CDR3
2 CDR1 CDR2 CDR3
3 NULL NULL CDR3

The results I want to get should be four "CDR3". "NULL" here represents blank space, so maybe I should deal with the original file?

Thanks a lot!

---------- Post updated at 12:08 PM ---------- Previous update was at 11:50 AM ----------

Sorry, I can't express clearly what I mean. I use space as the separator.

The txt file could be understand like this:

ID CDR1 CDR2 CDR3
1 CDR1 NULL CDR3
2 CDR1 CDR2 CDR3
3 NULL NULL CDR3

The results I want to get should be four "CDR3". "NULL" here represents blank space, so maybe I should deal with the original file?

Thanks a lot!

Try -F"[ ]" if you're on Linux

1 Like

From your sample code, it's impossible to find CDR3 on second and fourth line (ID 1 & 3). Is there any [tab] and [spaces] mixed?

1 Like

maybe this helps you:

[root@sandbox tmp]# awk '{print $NF}' test
CDR3
CDR3
CDR3
CDR3

$NF = last position

1 Like

This input file is worse than any of you seem to have imagined. This file uses <space> as the field separator and uses <space> as the contents of an "empty" field. Since the field separator is also field contents in some cases, there is no obvious way to specify an ERE that matches just the field separators. The following awk script converts the spaces that are data to the string "NULL" and leaves the spaces that are separators alone. It seem to do what was requested. (Note that the 2nd printf statement just shows how the input is modified along with the desired output. Remove it to just get the desired output.)

#!/bin/ksh
awk ' { gsub("  ", " NULL")
        printf("%s %s\n", $1, $4)
        printf("%s %s from %s\n", $1, $4, $0)
}' input
1 Like