Grep strings on multiple files and output to multiple files

Hi All,

I want to run egrep on multiple files and send the results to separate output files, one per input file. I am using the code below in my shell script (ksh). However, this code does not produce the desired results.

#!/bin/ksh
(
a="/path/file1"
b="path/file2"

for file in $a $b
do
egrep -iv '513|519|532' "$file" > test1 > test2
done
)
exit

The above code produces 2 files, 'test1' and 'test2'. 'test1' does not contain anything (it is an empty file), while 'test2' contains the lines of '/path/file2' with the 513, 519 and 532 records filtered out.

I want my results should be like below:

'test1' should contain the lines of '/path/file1' with the 513, 519 and 532 records filtered out, and
'test2' should contain the lines of '/path/file2' with the 513, 519 and 532 records filtered out.

Can any of you suggest a solution for this?

Thanks in advance.
Regards,
am24

First, note that /path/file2 and path/file2 are only guaranteed to identify the same file if you are sitting in the system's root directory when you invoke this script.

Second, that is not the way grep (or egrep) works. If you want different output files for different input files, you'll need to invoke grep once for each desired output file.
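For example, a minimal sketch of one invocation per input file (using made-up sample files in the current directory in place of /path/file1 and /path/file2):

```shell
#!/bin/ksh
# Demo input files standing in for /path/file1 and /path/file2.
printf '513 drop\nkeep A\n' > file1
printf '532 drop\nkeep B\n' > file2

# One egrep invocation per input file, each redirected
# to its own sequentially numbered output file.
n=1
for f in file1 file2
do
	egrep -iv '513|519|532' "$f" > "test$n"
	n=$((n + 1))
done
```

After this runs, test1 holds the surviving lines of the first file and test2 those of the second.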

Alternatively, you could use something like awk to simulate multiple invocations of egrep -v (assuming that the output files are sequentially numbered and all have the base name test):

#!/bin/ksh
awk '
FNR == 1 {
	if(NR > 1)
		close(of)
	of = "test" ++nf
}
! /51[39]|532/ {
	print > of
}' "$@"

and invoke this script with the pathnames of the files you want to process as operands.

Or, you could change the last line of this script from:

}' "$@"

to include an explicit list of the files you want to process:

}' /path/file[12]

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.

Although written and tested using the Korn shell, this will work with any shell that uses basic Bourne shell syntax.
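To illustrate, here is a self-contained run of the awk approach on two small sample files (the file names and contents are invented for the demo):

```shell
# Sample inputs standing in for the real pathnames.
printf '513 drop\nAAA\n' > in1
printf '519 drop\nBBB\n' > in2

awk '
FNR == 1 {
	if(NR > 1)
		close(of)	# finish the previous output file
	of = "test" ++nf	# open the next sequentially numbered one
}
! /51[39]|532/ {
	print > of
}' in1 in2
```

This leaves the surviving line of in1 in test1 and the surviving line of in2 in test2.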


Hi Don,

Thank you so much for your suggestion. I have tried the code below and was able to get the desired results.

#!/bin/ksh
nawk '
FNR == 1 {
	if(NR > 1)
		close(of)
	of = "test" ++nf
}
! /5130|532/ {
	print > of
}' /path/file[12]

I did not understand the relation between FNR and NR in this code. Can you please explain that?

Regards,
am24

FNR is the current line number in the current input file. So, FNR == 1 is true when you are looking at the 1st line in a file. NR is the number of lines read from all input files. So, when FNR is 1 and NR is also 1, you are looking at the 1st line in the 1st input file. And, when FNR is 1 and NR is greater than 1, you are looking at the 1st line in the 2nd or 3rd or ... (but not the 1st) input file.
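A quick way to see the difference (the file names and contents here are just for the demo):

```shell
# Two tiny files: f1 has two lines, f2 has one.
printf 'a\nb\n' > f1
printf 'c\n' > f2

# Print the current file name, the per-file line number (FNR),
# and the running line count across all files (NR).
awk '{ print FILENAME, FNR, NR }' f1 f2
```

which prints `f1 1 1`, `f1 2 2`, `f2 1 3`: FNR resets to 1 at each new file while NR keeps counting, so `FNR == 1 && NR > 1` is true exactly at the start of every file after the first.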

Note also that you used:

! /5130|532/ {

which will select any line that does not contain the string "5130" and does not contain the string "532". The code I suggested used:

! /51[30]|532/ {

which was a typo. It should have been:

! /51[39]|532/ {

which would select any line that does not contain the string "513", does not contain the string "519", and does not contain the string "532" (which is the same as what either of the commands:

egrep -v '513|519|532'
egrep -v '51[39]|532'

would select). I will edit my earlier post to correct that typo in a few minutes.
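You can check that equivalence on a small sample (contents invented for the demo):

```shell
# One line for each filtered string, plus one line that should survive.
printf '513x\n519y\n532z\nkeep\n' > sample

# Both patterns should reject the same lines.
egrep -v '513|519|532' sample > out1
egrep -v '51[39]|532' sample > out2

cmp -s out1 out2 && cat out1
```

Both output files contain only the line `keep`, so cmp succeeds and that line is printed.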


Hello Don,

Thank you for the explanation of the code. I have a clear understanding now.

Thanks again for your time to look into this :)

Regards,
am24