Fetching values in CSV file based on column name

bharathbangalor · November 7, 2013, 6:38am

input.csv:

Field1,Field2,Field3,Field4,Field4
abc   ,123   ,xyz   ,000   ,pqr
mno   ,123   ,dfr   ,111   ,bbb

output:

Field2,Field4
123   ,000
123   ,111

how to fetch the values of Field4 where Field2='123'

I don't want to fetch the values based on column position. Instead want to fetch the values based on field names(header name)

Akshay_Hegde · November 7, 2013, 6:44am

Try

$ awk -F, 'NR==1 || $2==123{print $2,$4}' OFS="," file
Field2,Field4
123,000 
123,111

bharathbangalor · November 7, 2013, 6:52am

Thanks for the reply Ajay.

But here I don't fetch values based on column position ($2==123). Instead i want to fetch value by the header name .

Akshay_Hegde · November 7, 2013, 7:09am

Try by the way myself Akshay not Ajay

$ cat Header_list.txt
Field2
Field4

$ cat file
Field1,Field2,Field3,Field4,Field4
abc ,123 ,xyz ,000 ,pqr
mno ,123 ,dfr ,111 ,bbb

awk 'NR==FNR{
                Cols=Cols (Cols?"|":"")$1
                next
            }

     FNR==1{
                for (i=1;i<=NF;i++) 
                if (match($i,Cols)) 
                Ar[++n]=i
           }

           {
               for (i=1;i<=n;i++) 
               printf (i<n)? $(Ar)  FS : $(Ar)
               printf "\n"
            }' FS=","  Header_list.txt file

Resulting

Field2 ,Field4, Field4 
123,  000,  pqr 
123,  111,  bbb

bharathbangalor · November 7, 2013, 7:09am

Or else , Is there a way to find the position of a particular field .
For example say Field 3,its in 3rd position. How to pragmatically fetch the position of field3.

Akshay_Hegde · November 7, 2013, 7:14am

@bharathbangalor

Did you try #4 ?

CarloM · November 7, 2013, 7:15am

Something like:

awk '
NR==1 {
   for (i=1;i<=NF;i++) {
      if ($i==SEARCHHDR) {
         srchfld=i;
      }
      if ($i==OUTHDR) {
         outfld=i;
      }
   }

   print $srchfld OFS $outfld
}

NR>1 && $srchfld ~ SEARCHVAL {
   print $srchfld OFS $outfld
}
' SEARCHHDR="Field2" SEARCHVAL="123" OUTHDR="Field4" FS=, OFS=, file

bharathbangalor · November 8, 2013, 1:36am

akshay hegde:

Try by the way myself Akshay not Ajay

$ cat Header_list.txt
Field2
Field4

$ cat file
Field1,Field2,Field3,Field4,Field4
abc ,123 ,xyz ,000 ,pqr
mno ,123 ,dfr ,111 ,bbb

awk 'NR==FNR{
   Cols=Cols (Cols?"|":"")$1
   next
   }

   FNR==1{
   for (i=1;i<=NF;i++) 
   if (match($i,Cols)) 
   Ar[++n]=i
   }

   {
   for (i=1;i<=n;i++) 
   printf (i<n)? $(Ar)  FS : $(Ar)
   printf "\n"
   }' FS=","  Header_list.txt file

Resulting

Field2 ,Field4, Field4 
123,  000,  pqr 
123,  111,  bbb

:o Akshay :
The above code displays entire column values mentioned in the Headerlist.
But it should output the values based on certain field value.
Say I need to extract Field2 value where Field4=pqr.
The output should be in the following manner.

Field4, Field2
pqr,123

---------- Post updated at 01:36 AM ---------- Previous update was at 01:32 AM ----------

carlom:

Something like:

awk '
NR==1 {
   for (i=1;i<=NF;i++) {
   if ($i==SEARCHHDR) {
   srchfld=i;
   }
   if ($i==OUTHDR) {
   outfld=i;
   }
   }

   print $srchfld OFS $outfld
}

NR>1 && $srchfld ~ SEARCHVAL {
   print $srchfld OFS $outfld
}
' SEARCHHDR="Field2" SEARCHVAL="123" OUTHDR="Field4" FS=, OFS=, file

Hi CarloM,

The above code is filtering records based on Field2 correctly . But its displaying all fields instead of displaying only Field4 .
Can u please help me.

Akshay_Hegde · November 8, 2013, 1:44am

what is the meaning of this ??? you posted in #3

Use this code when you post next time be specific about your requirement and make sure that others will understand your requirement.

$ cat file
Field1,Field2,Field3,Field4,Field4
abc ,123 ,xyz ,000 ,pqr
mno ,123 ,dfr ,111 ,bbb

awk '
                   NR==1{
                                 for (i=1;i<=NF;i++)
                                   {
                                     s=(match(tolower($i),tolower(Header_to_Search)))?i:s
                                     if(match(tolower($i),tolower(Column_to_print)))
                                     Ar[++n]=i    
                                   }    
                        }

NR==1 || $s~Search_value{
                                    printf $s FS
                                    for(i=1;i<=n;i++)
                                    printf (i<n)? $(Ar)  FS : $(Ar)
                                    printf "\n"         
                        }

    ' FS="," Header_to_Search="Field1" Search_value="abc" Column_to_print="Field2|Field3" file

Resulting

Field1,Field2,Field3
abc ,123 ,xyz

CarloM · November 8, 2013, 3:03am

Post the code and the output you get from your sample data, please.

RudiC · November 8, 2013, 7:04am

Based on the data in your post#1, try

awk     'FNR==1         {for (n=1; n<=NF; n++) {if ($n==ID) IDCOL=n; if ($n==RES) RESCOL=n}; print $IDCOL, $RESCOL; next}
         $IDCOL==SRCH   {print $IDCOL, $RESCOL}
        '  RES=Field4  ID=Field2  SRCH=123 FS="," OFS=","  file
Field2,Field4
123   ,pqr
123   ,bbb