Filtering out lines in a .csv file

dev.devil.1983 · February 27, 2013, 7:53pm

Hi Guys,

I need your expert help with the following situation.

I have a comma-separated .csv file with a header row and data as follows:

H1,H2,H3,H4,H5.....    (header row)
0,0,0,0,0,1,2....         (data rows follow)
0,0,0,0,0,0,1
.........
.........

I need code that triggers a set of instructions (an if-then condition) when any row has a value greater than 1 in H3, H4 or H5, but not in H1 or H2.

thanks,
dev

jim_mcnamara · February 27, 2013, 9:58pm

awk has builtin column names $1, $2, ... $n

What exactly are you trying to do? Building a pile of if-then-else constructs is possible in awk in a very simple way, but huge blocks of if-then-else are prone to errors as well.

Please give us sample input -> expected output.
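
To illustrate the built-in column names mentioned above, here is a minimal sketch. The three-column sample data and the variable names tmp and fields_out are made up for the demo and are not the poster's real file:

```shell
# Hypothetical sample data, written to a temporary file for the demo.
tmp=$(mktemp)
printf '%s\n' 'H1,H2,H3' '0,0,2' '1,0,0' > "$tmp"

# -F, splits each line on commas. NR is the current line number,
# $1 is the first field and $NF is the last field of the line.
fields_out=$(awk -F, 'NR > 1 { print "row " (NR-1) ": first=" $1 ", last=" $NF }' "$tmp")
printf '%s\n' "$fields_out"

rm -f "$tmp"
```

The NR > 1 pattern skips the header row, so only data rows are reported.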

Chubler_XL · February 27, 2013, 10:01pm

You could use sed to delete the heading line and then read the columns into an array. Remember bash arrays are indexed from zero, so H[0] is column 1 and H[5] is column 6:

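#!/bin/bash
# Drop the header line, then split each data row on commas into array H.
# Setting IFS only for read avoids changing it for the rest of the script.
sed '1d' infile | while IFS=, read -r -a H
do
   # Columns 1 and 2 must not exceed 1; at least one of columns 3-5 must.
   if [ "${H[0]}" -le 1 ] && [ "${H[1]}" -le 1 ] &&
      { [ "${H[4]}" -gt 1 ] || [ "${H[3]}" -gt 1 ] || [ "${H[2]}" -gt 1 ]; }
   then
      echo "set of instructions"
   fi
done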

dev.devil.1983 · March 2, 2013, 12:46pm

Hi Jim,

The sample file looks like this. There are 18 columns in total, of which the last 3 are amount columns (D1 D2 D3) holding certain values. Nothing has to be changed in the file. The requirement is to read the last 3 columns and, if any row has a value greater than 1, execute a set of instructions.

H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11 H12 H13 H14 H15 D1 D2 D3
                                                  0   0   0
                                                  0   0   1
                                                  0   0   0
                                                  1   0   1
 
.....................
..............

thanks,
Dev

Yoda · March 2, 2013, 1:02pm

awk '
{
        if ( NR == 1)
        {
                print
                next
        }
        if ( $16 > 1 || $17 > 1 || $18 > 1 )
        {
                # Put your set of instructions here
        }
} ' file

Note: Set the field separator to a comma if the fields in your file are separated by commas:

awk -F, '
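
Since the set of instructions is presumably a group of shell commands rather than awk code, one possible approach (a sketch, not necessarily what the poster needs) is to let awk only report through its exit status and branch in the shell. The sample rows, the pad filler and the status variable are hypothetical:

```shell
# Hypothetical 18-column sample: 15 filler columns plus D1, D2, D3.
pad='x,x,x,x,x,x,x,x,x,x,x,x,x,x,x'
tmp=$(mktemp)
{
  echo 'H1,H2,H3,H4,H5,H6,H7,H8,H9,H10,H11,H12,H13,H14,H15,D1,D2,D3'
  echo "$pad,0,0,0"
  echo "$pad,0,2,1"
  echo "$pad,1,0,1"
} > "$tmp"

# awk stops at the first data row with an amount above 1 and exits 0;
# if no such row exists, the END rule makes it exit 1.
if awk -F, 'NR > 1 && ($16 > 1 || $17 > 1 || $18 > 1) { found = 1; exit }
            END { exit !found }' "$tmp"
then
    status="amount above 1 found"    # run the set of instructions here
else
    status="no amount above 1"       # run any alternative instructions here
fi
echo "$status"

rm -f "$tmp"
```

With this layout the instructions run once per file, not once per matching row.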

dev.devil.1983 · March 9, 2013, 3:44pm

Thanks bipin, it seems to be working, but can this code be made a one-liner?

Yoda · March 9, 2013, 3:50pm

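Just put it on one line. A # comment cannot be used as the placeholder here, because on a single line it would also comment out the closing brace, so a print stands in for your instructions:

awk -F, 'NR==1{print;next}$16>1||$17>1||$18>1{print "set of instructions"}' file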

dev.devil.1983 · March 9, 2013, 4:01pm

Thanks for the quick response vipin. One last question though: how do I introduce an else here? Let's suppose the amounts are less than 1; then a different set of instructions should be executed.

Yoda · March 9, 2013, 4:16pm

awk -F, 'NR==1{print;next}$16>1||$17>1||$18>1{print "action";next}{print "else action"}' file
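
The pattern-plus-next form is equivalent to an explicit if/else inside a single action block, which may be easier to read once the actions grow. A minimal sketch, assuming the same 18-column layout; the sample rows, the pad filler and the verdicts variable are hypothetical:

```shell
# Hypothetical 18-column sample: 15 filler columns plus D1, D2, D3.
pad='x,x,x,x,x,x,x,x,x,x,x,x,x,x,x'
tmp=$(mktemp)
{
  echo 'H1,H2,H3,H4,H5,H6,H7,H8,H9,H10,H11,H12,H13,H14,H15,D1,D2,D3'
  echo "$pad,0,0,0"
  echo "$pad,0,2,1"
  echo "$pad,1,0,1"
} > "$tmp"

verdicts=$(awk -F, '
NR == 1 { next }    # skip the header row
{
    if ($16 > 1 || $17 > 1 || $18 > 1)
        print "success"    # first set of instructions
    else
        print "fail"       # alternative set of instructions
}' "$tmp")
printf '%s\n' "$verdicts"

rm -f "$tmp"
```

Note that a row whose amounts are exactly 1 falls into the else branch, since the test is strictly greater than 1.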

dev.devil.1983 · March 9, 2013, 5:44pm

It's working now, thanks bipin.

There's one last glitch though: I wanted to echo this entire command into a script file, but the file doesn't end up with the correct command.

code:
echo "awk -F, 'NR==1{print;next}$16>1||$17>1||$18>1{print "success";next}{print "fail"}' file " > script.sh

but in the file it comes out as

awk -F, 'NR==1{print;next}6>1||7>1||8>1{print success;next}{print fail}' file

The field references $16, $17 and $18 have turned into 6, 7 and 8, and the double quotes around success and fail are gone.

Thanks,
Dev

Yoda · March 9, 2013, 5:47pm

When you echo a double-quoted string, you have to escape meta-characters such as the dollar sign $ and the double quote " to preserve them.

Correction:

echo "awk -F, 'NR==1{print;next}\$16>1||\$17>1||\$18>1{print \"success\";next}{print \"fail\"}' file" > script.sh
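
An alternative to escaping each character, offered as an option and not something used earlier in the thread, is a here-document with a quoted delimiter. The shell then copies the text verbatim, so $16 and the double quotes survive without backslashes:

```shell
# Quoting the delimiter ('EOF') turns off all expansion inside the here-document.
cat > script.sh <<'EOF'
awk -F, 'NR==1{print;next}$16>1||$17>1||$18>1{print "success";next}{print "fail"}' file
EOF

# Show what was written.
cat script.sh
```

The closing EOF has to start in the first column of its own line.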