Shell Programming and Scripting

Hi,

I have file1 as follows:

ERTYUIOU|1234567689089767688
FDHJHKJH|6817738971783893499
JFKDKLLUI|9080986766433498444

FILE2

ERTYUIOU|1234567689089767688   resh@abc_com     767637218328322332                     893589893499                         asdsddssd                            2008 80930`2323232
JFKDKFDF|0980897489377324734     UIYUEEIIXCZHOPOW[OGTI   U IUEOIWERIWERERRE          78978978123823   9 90990990-033-93-0909  
JKDDFJLKJFDLFKD

I have given only 2 lines of file2 here.

There are lakhs of records in file2.

I need to take the 1st line in file1 and fetch that record from file2, and so on for every line.

I have used the grep command to fetch records from file2, but it takes hours to fetch lakhs of records.

So I don't want to use grep. Instead, how can I fetch the records from file2 using the records in file1 as keys?

Thanks in advance.

The main goal is to reduce the time taken by the grep step.
I want to finish the entire fetch within a few minutes.

Instead of grep, use sed to fetch the record from file2, followed by the q command, so that you don't process the remaining file. Something like:

sed -n '/pattern/{p;q;}' file2
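For sed to quit only after a match, p and q have to be grouped in braces; otherwise q runs on the very first line regardless of the pattern. A minimal sketch, assuming a made-up three-line data file and a single key (the file contents and the `workdir` and `first_match` names are illustrative, not from the thread):

```shell
# Hypothetical sample data; 'ZXCVBNML' stands in for one key from file1.
workdir=$(mktemp -d)
printf '%s\n' 'ABCEFGHI first' 'ZXCVBNML second' 'ZXCVBNML third' > "$workdir/file2"

# -n suppresses the default output. On a matching line, p prints it and
# q makes sed stop, so later lines are not processed.
# The braces limit q to matching lines only.
first_match=$(sed -n '/^ZXCVBNML/{p;q;}' "$workdir/file2")
echo "$first_match"
```

This fetches only the first match for one pattern, so it would need one run per key. With many keys in file1, a single pass over file2 is likely to be much cheaper.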

Try this :

############ Search.run ##############
FILE1=file1   # Your lookup file
FILE2=file2   # Your data file with more than .1 MM recs
awk ' NR==FNR { A[$0]=1; next; }
{ if (substr($0,1,8) in A) { A[$0]=0; } }
END { for (k in A) { if (A[k]!=1) { print k; } } } ' $FILE1 $FILE2
############ Search.run ##############

Courtesy : one post from this forum.

Your explanation is somewhat vague.
What is the desired output, given your 2 sample files?

From the above file1 and file2, I need to get the following output file:

ERTYUIOU|1234567689089767688   resh@abc_com     767637218328322332                     893589893499                         asdsddssd                            2008 80930`2323232

The above commands are not working. Please give a solution.

Did you try #kanu_kanu's suggestion, please?
If you face any issue with it, please let us know.

awk ' NR==FNR { A[$0]=1; next; }
{ if (substr($0,1,8) in A) { A[$0]=0; } }
END { for (k in A) { if (A[k]!=1) { print k; } } } ' file1.txt file2.txt >out.txt

There is no output from this command.

awk ' NR==FNR { A[substr($0,1,8)]=1; next; }
{ if (substr($0,1,8) in A) { A[substr($0,1,8)]=0; } }
END { for (k in A) { if (A[k]!=1) { print k; } } } ' file1.txt file2.txt >out.txt

Please try this.

It prints the following for the above file1 and file2:

ERTYUIOU

Oh, you need the entire line?

Then the following may help you.

awk ' NR==FNR { A[substr($0,1,8)]=1; next; }
{ if (substr($0,1,8) in A) { print $0 } } ' file1.txt file2.txt >out.txt

Sorry, I made a mistake. The input files are:

file1
ERTYUIOU1234567689089767688
FDHJHKJH6817738971783893499
JFKDKLLUI9080986766433498444
file2

ERTYUIOU1234567689089767688   resh@abc_com     767637218328322332                     893589893499                         asdsddssd                            2008 80930`2323232
JFKDKFDF0980897489377324734     UIYUEEIIXCZHOPOW[OGTI   U IUEOIWERIWERERRE          78978978123823   9 90990990-033-93-0909  
JKDDFJLKJFDLFKD

Neither file1 nor file2 has a pipe after the 8th character.

Sorry for the trouble.

Please give a suggestion.

awk ' NR==FNR { A[$0]=1; next; }
{ if ($1 in A) { print $0 } }

No, this command is not working. It is not giving any output.

awk ' NR==FNR { A[$0]=1; next; }
{ if ($1 in A) { print $0 } }' file1.txt file2.txt >out.txt

I didn't pass the file names; I thought you would have added them. :)

Sorry, I ran it exactly like this, with the file names, but there is no output.

It is working for me.

I have created a script 'kanu.awk' with the following content:

#!/usr/bin/awk -f

NR==FNR { A[$0]=1; next; }
{ if ($1 in A) { print $0 } }

Then I ran 'chmod u+x kanu.awk'

and executed the script in the following way:

./kanu.awk file1.txt file2.txt

where file1.txt is your lookup file and file2.txt is your data file

This gave me the following output:

ERTYUIOU1234567689089767688 resh@abc_com 76763721832832233 893589893499 asdsddssd 200880930`2323232

I don't know what happened in your case. It is worth double-checking your script against mine.
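For anyone comparing against the script above, here is a self-contained sketch of the same two-file lookup, using the sample keys from the thread with the file2 lines shortened for readability. The `workdir` directory and the `keys` array name are illustrative choices, not something from the posts.

```shell
workdir=$(mktemp -d)

# Lookup file: one key per line (no pipe after the 8th character).
printf '%s\n' 'ERTYUIOU1234567689089767688' \
              'FDHJHKJH6817738971783893499' > "$workdir/file1.txt"

# Data file: key first, then whitespace, then the rest of the record.
printf '%s\n' 'ERTYUIOU1234567689089767688   resh@abc_com     767637218328322332' \
              'JFKDKFDF0980897489377324734     UIYUEEIIXCZHOPOW[OGTI' \
              'JKDDFJLKJFDLFKD' > "$workdir/file2.txt"

# NR==FNR holds only while awk reads the first file: remember each key.
# For the second file, print the whole line when its first field is a known key.
awk 'NR == FNR { keys[$0] = 1; next }
     $1 in keys' "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/out.txt"

cat "$workdir/out.txt"
```

The lookup file is loaded into memory once and the data file is read in a single pass, which is why this tends to scale better than running one search per key. It only matches when the whole line of the lookup file is exactly equal to the first whitespace-separated field of the data line.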

FILE 1:

ABCEFGHI|0000000000003537
ABCEFGHI|0000000000132807
ZXCVBNML|0000000000132000

FILE 2

ABCEFGHI0000000000003537   name@yahoo_com                                                                                2008-02-020000823.15 0011676 00017.00 2008-03-01ROJER,TERASA C                                000000000000051.66 000000000000040.00 CBB00010000000906
ABCEFGHI0000000000027601   cat@yahoo_com                                                                                  2008-02-020014243.99 0000758 00284.00 2008-03-01ROJER,  WERASA E                              000000000000016.03 000000000000000.00 CBB00010000000920
ABCEFGHI0000000000116214   taj@yahoo_com                                                                                2008-02-030001935.75 0001064 00056.00 2008-03-02IM,TOM   CRUSE                                000000000000030.74 000000000000020.00 CBB00010000000915
ABCEFGHI0000000000132807   pocketfull@yahoo_com                                                                    2008-02-030000231.67 0002268 00015.00 2008-03-02JACK,LILIA P                                  000000000000003.41 000000000000000.00 CBB00010000000906
ZXCVBNML0000000000132000   pocketfull@yahoo_com                                                                    2008-02-030000231.67 0002268 00015.00 2008-03-02JACK,LILIA P                                  000000000000003.41 000000000000000.00 CBB00010000000906

OUTPUT FILE I NEED TO GET

ABCEFGHI0000000000003537   name@yahoo_com                                                                                2008-02-020000823.15 0011676 00017.00 2008-03-01ROJER,TERASA C                                000000000000051.66 000000000000040.00 CBB00010000000906
ABCEFGHI0000000000132807   pocketfull@yahoo_com                                                                    2008-02-030000231.67 0002268 00015.00 2008-03-02JACK,LILIA P                                  000000000000003.41 000000000000000.00 CBB00010000000906
ZXCVBNML0000000000132000   pocketfull@yahoo_com                                                                    2008-02-030000231.67 0002268 00015.00 2008-03-02JACK,LILIA P                                  000000000000003.41 000000000000000.00 CBB00010000000906

Both file1 and file2 have many records.

awk -f ' NR==FNR { A[$0]=1; next; }
{ if ($1 in A) { print $0 } } ' file1 file2 > out.txt

When I use the above command, no output is produced.

Sorry, the command I used is:

awk ' NR==FNR { A[$0]=1; next; }
{ if ($1 in A) { print $0 } } ' file1 file2 > out.txt

Ah, what did you say some time back?
You said there is no '|' symbol in the files...

And now you are using a file with the '|' symbol!

Can you paste here the content of file1.txt and file2.txt that you actually need to process?
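If the lookup file really does carry a '|' after the 8th character while the data file does not, as in the FILE 1 / FILE 2 sample above, the keys can never match as they stand. One possible workaround, sketched here with shortened file2 lines and illustrative names (`workdir`, `keys`, `matches`), is to strip the pipe while loading the keys:

```shell
workdir=$(mktemp -d)

# Lookup file with a pipe after the 8th character, as in FILE 1 above.
printf '%s\n' 'ABCEFGHI|0000000000003537' \
              'ABCEFGHI|0000000000132807' \
              'ZXCVBNML|0000000000132000' > "$workdir/file1"

# Data file without the pipe (records shortened for readability).
printf '%s\n' 'ABCEFGHI0000000000003537   name@yahoo_com' \
              'ABCEFGHI0000000000027601   cat@yahoo_com' \
              'ABCEFGHI0000000000116214   taj@yahoo_com' \
              'ABCEFGHI0000000000132807   pocketfull@yahoo_com' \
              'ZXCVBNML0000000000132000   pocketfull@yahoo_com' > "$workdir/file2"

# While reading file1, remove every '|' from the line before storing it as a key.
# While reading file2, print lines whose first field is one of those keys.
matches=$(awk 'NR == FNR { key = $0; gsub(/[|]/, "", key); keys[key] = 1; next }
               $1 in keys' "$workdir/file1" "$workdir/file2")
echo "$matches"
```

This avoids having to edit file1 by hand before every run.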

Sorry, I confused you. Sorry again.

I have made my file1 as:

ABCEFGHI0000000000003537
ABCEFGHI0000000000132807
ZXCVBNML0000000000132000

For a small file1 (5 records)
and file2 (9 records),

it works fine. But when I ran this command on file1 and file2 with 85 thousand records, it did not give any output.
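The thread does not say why the large files produce nothing. One possible cause, and this is only an assumption, is invisible characters in the real files: Windows line endings (a trailing carriage return) or trailing blanks in file1 would make every key differ from the first field of file2. Running `head -1 file1 | od -c` shows whether a `\r` is present. A sketch that normalises the keys before comparing, on made-up data:

```shell
workdir=$(mktemp -d)

# Hypothetical files with CRLF line endings; KEY2 also has trailing blanks.
printf 'KEY1\r\nKEY2  \r\n' > "$workdir/file1"
printf 'KEY1   data\r\nKEY3   data\r\nKEY2\r\n' > "$workdir/file2"

# Strip trailing blanks, tabs and carriage returns from each lookup key.
# For data lines, also drop a carriage return stuck to the first field,
# which happens when the line consists of the key alone.
awk 'NR == FNR { key = $0; sub(/[ \t\r]+$/, "", key); keys[key] = 1; next }
     { key = $1; sub(/\r$/, "", key); if (key in keys) print }' \
    "$workdir/file1" "$workdir/file2" > "$workdir/out.txt"

count=$(wc -l < "$workdir/out.txt" | tr -d ' ')
echo "$count matching records"
```

The printed records are left exactly as they appear in file2, including any carriage return at the end of the line.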