print contents of file2 for matching pattern in file1 - AWK

i.scientist · September 1, 2009, 2:00am

Each row of file1 is a value that appears in column 2 of file2.
Every line of file2 starts with A, B or C,
and the 3rd column of file2 is always F2.

When column 2 of file2 matches a row of file1, print all of those rows into a separate file.

Here is an example.

file 1:

100
103
104
108

file 2:

A|100|F2|hello
B|100|F2|djhbsdhjf
B|100|F2|dksadbkdfd
C|100|F2|djsbdjinldf
A|101|F2|hellodfd
B|101|F2|djhbsdhjdff
B|101|F2|dksadbkdfgd
C|101|F2|djsbdjinlgfg
A|102|F2|hellodfgfd
B|102|F2|djhbsdhjfgf
C|102|F2|djsbdjinlhgf
A|103|F2|hellohggg
B|103|F2|djhbsdhjhjhj
B|103|F2|dksadbkdfdr
C|103|F2|djsbdjinlfgf
A|104|F2|hellofg
B|104|F2|djhbsdhjfgf
B|104|F2|dksadbkhfgg
C|104|F2|djsbdjinlhgh
A|105|F2|hellohgh
B|105|F2|djhbsdhjdsgh
B|105|F2|dksadbkds
C|105|F2|djsbdjinlds
A|108|F2|hello
B|108|F2|djhbsdhj
B|108|F2|dksadbk
C|108|F2|djsbdjinl

OUTPUT:

A|100|F2|hello
B|100|F2|djhbsdhjf
B|100|F2|dksadbkdfd
C|100|F2|djsbdjinldf
A|103|F2|hellohggg
B|103|F2|djhbsdhjhjhj
B|103|F2|dksadbkdfdr
C|103|F2|djsbdjinlfgf
A|104|F2|hellofg
B|104|F2|djhbsdhjfgf
B|104|F2|dksadbkhfgg
C|104|F2|djsbdjinlhgh
A|108|F2|hello
B|108|F2|djhbsdhj
B|108|F2|dksadbk
C|108|F2|djsbdjinl

I am trying awk, but no luck. Here is what I am trying:

awk -v i="1" 'BEGIN { FS="|" }
FR==NR
{
a=$2
if (a==a[i-1]) {  h[$2,i]=$0; i++ }
else { if (i==1) { h[$2,i]=$0; i++;   } 
       if (i!=1) { h[$2,i]=$0; i=1; ; }
       }
       next
}
{
         for (j=1;j<1000;j++) 
         {
         if (h[$0,j]!="") { print h[$0,j]
                          }
          }
          next
                  }' file2 file1   >  ouputfile

************************************************************
I do not want to use Unix for/while loops, as they are not efficient.

Vi-Curious · September 1, 2009, 4:04am

Not awk but this should work:

perl -nle '{if (/^(\d+)$/) {$x .= "|$1";} else { $y = substr($x,1);print $_ if /[|]$y[|]/ }}' file1 file2

tpietschmann · September 1, 2009, 8:24am

you can get the results this way; it creates a field array and compares value1 in file1 with value2 in file2 and prints the matching values

nawk '{FS="|"} NF==1 {acc[$1]=1} NF>1 {if( ( $2 in acc ) ) {print $1"|"$2"|"$3"|"$4} }' file1.txt file2.txt
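For anyone who wants the one-liner above unpacked, here is a minimal commented sketch of the same NF-based idea. It assumes file1 holds one key per line with no | in it. Two things are my changes rather than part of the original post: the FS assignment is moved into a BEGIN block so the separator is set before the first line is read, and plain awk is used on the assumption that the local awk is POSIX-compatible. The sample data and the `result` variable are made up for illustration.

```shell
# Hypothetical sample data, trimmed from the example in the question.
tmp=$(mktemp -d)
printf '%s\n' 100 103 > "$tmp/file1"
printf '%s\n' 'A|100|F2|hello' 'B|101|F2|djhbsdhjdff' 'C|103|F2|djsbdjinlfgf' > "$tmp/file2"

result=$(awk '
  BEGIN   { FS = "|" }            # split fields on | from the very first line
  NF == 1 { acc[$1] = 1; next }   # one field: a key line from file1, remember it
  $2 in acc                       # file2 line: print it if column 2 is a known key
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Telling the two files apart by field count only works because file1 lines never contain the separator; if that might not hold, a test on which file is being read is safer.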

i.scientist · September 7, 2009, 7:00pm

This worked, but can you please explain?
Sorry for the late response.

tpietschmann:

you can get the results this way; it creates a field array and compares value1 in file1 with value2 in file2 and prints the matching values
nawk '{FS="|"} NF==1 {acc[$1]=1} NF>1 {if( ( $2 in acc ) ) {print $1"|"$2"|"$3"|"$4} }' file1.txt file2.txt

This worked well, even faster than the perl command given above. Thank you very much.

By the way, what's the difference between nawk and awk?

vgersh99 · September 7, 2009, 7:12pm

nawk -F'|' -v OFS='|' '
  FNR==NR {f1[$1];next}
  $2 in f1' file1 file2
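Since an explanation was asked for earlier in the thread, here is a hedged, commented sketch of how a two-file command like the one above works. NR counts lines across all input files while FNR restarts at 1 for each file, so FNR == NR is true only while the first file is being read (assuming that file is not empty). The sample data and the `matched` variable are illustrative only, and plain awk stands in for nawk on the assumption that the local awk is POSIX-compatible. OFS is left out here because the lines are printed unmodified.

```shell
# Hypothetical sample data, trimmed from the example in the question.
tmp2=$(mktemp -d)
printf '%s\n' 104 108 > "$tmp2/file1"
printf '%s\n' 'A|104|F2|hellofg' 'B|105|F2|dksadbkds' 'C|108|F2|djsbdjinl' > "$tmp2/file2"

matched=$(awk -F'|' '
  FNR == NR { keys[$1]; next }   # first file only: referencing keys[$1] creates the key
  $2 in keys                     # second file: a pattern with no action prints the line
' "$tmp2/file1" "$tmp2/file2")

printf '%s\n' "$matched"
rm -rf "$tmp2"
```

To send the matching rows to a separate file, as the original question wanted, redirect the awk output with > outputfile instead of capturing it in a variable.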

Vi-Curious · September 7, 2009, 8:35pm

Quick/dirty and slightly inefficient but ok....

perl -nle '<perl expression>' file1 file2

Read each line from file1 and file2 and perform <perl expression> on each line.

if (/^(\d+)$/) {$x .= "|$1";}

If the line consists of a single integer number, append it to variable x using | as a separator. The ^ is beginning of line, the $ is end of line and (\d+) represents one or more decimal digits. After your file1 example is processed, $x will equal |100|103|104|108.

else { $y = substr($x,1);print $_ if /[|]$y[|]/ }

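If the line is not a single integer (so it is one of your entries from file2), strip the first character off variable x and save what remains in y. This is inefficient because it is done for every line that is processed. This places 100|103|104|108 in y. The line is then printed if it matches /[|]$y[|]/. Note that the alternation is not grouped, so the pattern really means |100, or 103, or 104, or 108| anywhere in the line; that is enough for your sample data, but writing it as /[|](?:$y)[|]/ would require each value to sit between two | characters.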
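The build-one-regex idea can also be sketched in awk instead of perl. This is a swapped-in illustration, not the perl code from this post: it joins the file1 values into an alternation once, when file2 starts, and wraps it in parentheses so that every alternative has to sit between two | characters. It assumes the keys contain no regex metacharacters, and like the perl version it matches a key in any column, not only column 2. The sample data and the `regex_hits` variable are hypothetical.

```shell
# Hypothetical sample data; the second file2 line contains 103 but not as a whole field.
tmp3=$(mktemp -d)
printf '%s\n' 100 103 > "$tmp3/file1"
printf '%s\n' 'A|100|F2|hello' 'B|101|F2|x103x' 'C|103|F2|djsbdjinlfgf' > "$tmp3/file2"

regex_hits=$(awk '
  FNR == NR { alt = (alt == "" ? $0 : alt "|" $0); next }  # file1: build "100|103"
  FNR == 1  { re = "[|](" alt ")[|]" }                     # file2 starts: group the alternation
  $0 ~ re                                                  # print lines containing |key|
' "$tmp3/file1" "$tmp3/file2")

printf '%s\n' "$regex_hits"
rm -rf "$tmp3"
```

The array lookup used in the awk answers is usually the simpler choice, since it compares column 2 exactly and needs no escaping.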

i.scientist · September 7, 2009, 11:45pm

thanks "Vi curious" for your "perl" explanation.
Appreciated.

bye