I have a requirement to check whether the mandatory columns in a pipe-delimited file are null and print an error message.
For example, I have to check whether the 3rd, 5th, 6th, 7th and 8th columns are null and print the message "<column name> is null".
The data file will have around 100,000 records.
Please help!
awk -F \| '{for(i=3; i<=8; i++) if(i!=4) if($i=="") printf "%s\n","column " i " is null at line " NR}' file
--- @stomp:
You cannot use !$i or for example !$3 as a test, since then the condition will also become true if the field equals 0 rather than "" (null value).
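A quick way to see the difference, as a minimal sketch on a made-up record whose second field is 0 and whose third field is empty (the record and the message wording are only for illustration):

```shell
# Made-up record: field 1 is "a", field 2 is 0, field 3 is empty.
out=$(printf 'a|0|\n' | awk -F'|' '{
    for (i = 1; i <= NF; i++) {
        if (!$i)      print "field " i ": !$i is true"   # also fires for 0
        if ($i == "") print "field " i ": is empty"      # fires only for ""
    }
}')
printf '%s\n' "$out"
```

Field 2 trips the !$i test even though it holds a value, while the comparison with "" only reports field 3.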
awk '
NR==1 {MX = split(CHK, T, ",")
}
{for (i=1; i<=MX; i++)
if ($(T[i]) == "") print "column " T[i] " is null at line " NR
}
' FS="|" CHK="3,5,6,7,8" file
column 3 is null at line 1
column 6 is null at line 3
column 7 is null at line 6
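Since the original request was for "<column name> is null", the same approach could be extended with a second list that holds one name per checked column. This is only a sketch: the NAMES variable, the column names in it, and the two sample records are made up for illustration, and the names are matched to CHK by position.

```shell
# NAMES is a hypothetical list of column names, one per entry in CHK.
# The trailing "-" makes awk read standard input after the assignments.
out=$(printf '%s\n' 'a|b||d|e|f|g|h' 'a|b|c|d|e||g|h' |
awk '
NR==1 { MX = split(CHK, T, ","); split(NAMES, N, ",") }
{ for (i = 1; i <= MX; i++)
      if ($(T[i]) == "") print N[i] " is null at line " NR
}
' FS="|" CHK="3,5,6,7,8" NAMES="cust_id,order_date,amount,currency,status" -)
printf '%s\n' "$out"
```

With a real data file, replace the printf pipeline and the trailing "-" with the file name.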
awk -F'|' '{
if( $3=="" || $5=="" || $7=="" || $8=="" ) {
if ( $3=="" ) { print "Line ",NR, " Field 3: Mandatory field is null"; }
if ( $5=="" ) { print "Line ",NR, " Field 5: Mandatory field is null"; }
# ...
} else
do_something ;
}' yourfile.txt
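Without the braces around the outer if block, the else pairs with the last if test only, so your code will execute the else clause only if the last field tested is not an empty string, regardless of the other fields. And if the 1st if test does not find any empty strings, the remaining if tests will never succeed either: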
awk -F'|' '{
if( $3=="" || $5=="" || $7=="" || $8=="" ) {
if ( $3=="" ) { print "Line ",NR, " Field 3: Mandatory field is null"; }
}
if ( $5=="" ) { { print "Line ",NR, " Field 5: Mandatory field is null"; }
# ...
} else
do_something ;
}' yourfile.txt
And, of course, if do_something; is more than one statement, you'll also need braces around all statements in the else clause. The braces around your print statements don't hurt anything, but aren't strictly required, since only one statement is being executed in each of those if clauses.
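To see how an else pairs with the nearest preceding if when the outer braces are left out, here is a small sketch on a made-up record with an empty 3rd field and a non-empty 8th field (only two fields are tested to keep it short, and the else message is invented for the demonstration):

```shell
# No braces around the outer if: the else belongs to the $8 test alone.
out=$(printf '%s\n' 'a|b||d|e|f|g|h' | awk -F'|' '{
    if ($3 == "" || $8 == "")
        if ($3 == "") print "Line " NR " Field 3: Mandatory field is null"
        if ($8 == "") print "Line " NR " Field 8: Mandatory field is null"
    else
        print "Line " NR " else branch ran"
}')
printf '%s\n' "$out"
```

The record is reported as having a null field 3 and still falls into the else branch, which is usually not what was intended.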