Performance issue with a file search command

Hi All,

This question is about improving the performance of a command.

I have a list of IDs in a file (say file1, with a single ID column), and file2 holds the data rows.

I need to take the IDs from file1, search for them in file2, and write the matching rows from file2 to file3.

For this I have been using the command below:

 for ID in $(cat file1); do grep "$ID" file2; done > file3

This command is very slow. Can someone please let me know how I can improve the performance here?

Thanks in advance,
Tanu

From man grep:

       -f FILE, --file=FILE
              Obtain patterns  from  FILE,  one  per  line.   The  empty  file
              contains zero patterns, and therefore matches nothing.

So, instead of spawning one grep per ID and rescanning file2 each time, you can hand grep the whole list and make a single pass:

grep -f file1 file2 > file3

Thank you for your reply, @Corona688, but the grep command you mentioned is also running very slowly. I am working with files that have millions of records.

Please let me know if there is any other way to speed up the search.

How big is file1? If it's larger than available memory, it will of course be slow.
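
If it does fit, two more things are worth trying. Adding -F makes grep treat each ID as a fixed string instead of a regular expression, which is usually much faster with a large pattern list. And awk can do the same job with a hash lookup; here is a minimal sketch, assuming the ID is the first whitespace-separated field of file2 (adjust $1 if yours is in a different column):

 # Pass 1 (NR==FNR): load every ID from file1 into the array "ids".
 # Pass 2: print each line of file2 whose first field is a known ID.
 awk 'NR==FNR { ids[$1]; next } $1 in ids' file1 file2 > file3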

Hi Corona688,

grep -F -f file1 file2 > file3

The above command (with -F added) ran really fast for me: I could complete the search on a file with 1.5 million records in 5-6 seconds.

I really appreciate your response.

Thanks,
Tanu
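
Glad it worked. One caveat for anyone reading along: -F still matches substrings, so an ID like 123 would also match a row containing 1234. If that could be a problem with your data, adding -w restricts matches to whole words:

 grep -Fwf file1 file2 > file3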
