Grep command

Theo_Score · March 20, 2017, 11:56pm

Hi All,

I have a directory of simulation data files named dump*.data. An example of one such file is given below.

ITEM: TIMESTEP
10000
ITEM: NUMBER OF ATOMS
6
ITEM: BOX BOUNDS ff ff ff
-0.15 0.15
0 0.5
-0.15 0.15
ITEM: ATOMS id type x y z vx vy vz fx fy fz omegax omegay omegaz radius c_nconts 
1 1 -0.0506691 0.371195 -0.0540098 0 -1 -0 0 0 0 0 0 0 0.03 0 
4 2 -0.0400495 0.270148 0.0120722 0 -1.25925 0 0 -0.345173 0 0 0 0 0.02 0 
2 1 -0.0102719 0.415304 0.0113916 0 -1 -0 0 0 0 0 0 0 0.01 0 
6 2 0.0192567 0.33773 0.0256545 0 -1 -0 0 0 0 0 0 0 0.01 0 
3 1 0.0371105 0.373507 -0.00483919 0 -1 -0 0 0 0 0 0 0 0.02 0 
5 2 0.0426929 0.366908 0.057645 0 -1 -0 0 0 0 0 0 0 0.03 0

I have put together the script below, which filters on a specific number in the first column and reads through all the data files:

grep -h "^$1 " dump[1-9]????.data > particle$1.dat 
grep -h "^$1 " dump[1-9]?????.data >> particle$1.dat
grep -h "^$1 " dump[1-9]??????.data >> particle$1.dat

Now I want the code to first read the second row (10000 in this case) and add it as the first column of the results of the current code. Row 2 in the data files has values of 10000, 20000, 30000, ..., 1790000, 1800000. For example, if I search for the row that starts with 4, I want the output to be

10000 4 2 -0.0400495 0.270148 0.0120722 0 -1.25925 0 0 -0.345173 0 0 0 0 0.02 0

Thank you.

Don_Cragun · March 21, 2017, 2:48am

You haven't told us what operating system or shell you're using, and grep is not capable of joining text from different lines in a file. For that you need something more like awk. For example:

#!/bin/ksh
IAm=${0##*/}
if [ $# -ne 1 ]
then	printf 'Usage: %s column_1_value\n' "$IAm" >&2
	exit 1
fi
awk -v pattern="^$1 " '
FNR == 2 {
	line2 = $0
	next
}
$0 ~ pattern {
	print line2, $0
}' dump[1-9]????.data dump[1-9]?????.data dump[1-9]??????.data > particle"$1".dat

Although written and tested using a Korn shell, this will also work with any shell that meets basic POSIX shell requirements for parameter expansions and uses Bourne shell syntax.

If you want to try this on a Solaris/SunOS system, change awk in this script to /usr/xpg4/bin/awk or nawk.
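To see what the awk approach above produces, here is a minimal sketch that runs the same two awk rules against one throwaway file. The scratch directory, the file name dump10000.data, the shortened ATOMS columns, and the variable names id and result are all made up for this demo. It assumes mktemp -d is available, as it is on Linux and the BSDs.

```shell
#!/bin/sh
# Sketch only: build a tiny sample dump file in a scratch directory.
# The file name and the shortened column list are assumptions for the demo.
tmpdir=$(mktemp -d)
cat > "$tmpdir/dump10000.data" <<'EOF'
ITEM: TIMESTEP
10000
ITEM: NUMBER OF ATOMS
2
ITEM: ATOMS id type x y z
1 1 -0.0506691 0.371195 -0.0540098
4 2 -0.0400495 0.270148 0.0120722
EOF

id=4   # value to look for in column 1

# Remember line 2 of each file, then print it in front of every
# line that starts with "$id " (the id followed by a space).
result=$(awk -v pattern="^$id " '
FNR == 2 { line2 = $0; next }
$0 ~ pattern { print line2, $0 }' "$tmpdir"/dump*.data)

printf '%s\n' "$result"
# prints: 10000 4 2 -0.0400495 0.270148 0.0120722

rm -rf "$tmpdir"
```

Because FNR restarts at 1 for every input file, line2 is refreshed at the second line of each file, so each matching row is paired with the timestep of the file it came from.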

zaxxon · March 21, 2017, 2:52am

Is the line to join with the 2nd line always the one starting with 4 as its identifier? Or is there some other criterion for choosing which line to join with the 2nd?

Theo_Score · March 21, 2017, 6:35pm

Thank you Don,

I am using a bash shell (#!/bin/bash) and would appreciate your help.

Thank you, Theo

Don_Cragun · March 21, 2017, 6:43pm

Hi Theo,
Help with what?

What operating system are you using?

Did you try the script I suggested (using either ksh or bash)? In what way did it fail to do what you wanted?

Theo_Score · March 21, 2017, 6:54pm

Hi Don,

I am using Red Hat Linux. I tried the script with both ksh and bash. The new script is:

#!/bin/bash
IAm=${0##*/}
if [ $# -ne 1 ]
then	printf 'Usage: %s column_1_value\n' "$IAm" >&2
	exit 1
fi
awk -v pattern="^$1 " '
FNR == 2 {
	line2 = $0
	next
}
$0 ~ pattern {
	print line2, $0
}' dump[1-9]????.data dump[1-9]?????.data dump[1-9]??????.data > particle"$1".dat

And I got the error

./script.sh: /bin/bash^M: bad interpreter: No such file or directory

But this script is in the same directory where I have the data files.

Thank you.

Don_Cragun · March 21, 2017, 7:19pm

Shells on BSD, Linux, and UNIX systems expect the scripts they read to be text files consisting of zero or more lines, each of which is terminated by a <newline> character. At least the first line in your script is terminated by a <carriage-return><newline> character pair, which is what one would expect in a DOS/Windows text file. The error message is telling you that your operating system can't find a file named /bin/bash<carriage-return> (where <carriage-return> is the ASCII <CR> carriage return character).

Please get rid of the <carriage-return> characters in your script, and try running it again.
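One way to do that, as a sketch rather than the only option, is to delete the carriage returns with tr. The demo below writes a small script with DOS line endings into a scratch directory, counts the affected lines, strips the carriage returns, and runs the cleaned copy. The names demo_crlf.sh and demo_lf.sh are made up for the illustration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Sketch only: demo_crlf.sh and demo_lf.sh are hypothetical file names.
tmpdir=$(mktemp -d)

# Write a two-line script whose lines end in <carriage-return><newline>.
printf '#!/bin/sh\r\necho hello\r\n' > "$tmpdir/demo_crlf.sh"

# Hold a literal carriage return so grep can search for it.
cr=$(printf '\r')

# Count the lines that contain a carriage return before the fix.
before=$(grep -c "$cr" "$tmpdir/demo_crlf.sh" || true)

# Delete every carriage return and write the result to a new file.
tr -d '\r' < "$tmpdir/demo_crlf.sh" > "$tmpdir/demo_lf.sh"

# Count again on the cleaned copy; grep exits non-zero when nothing matches.
after=$(grep -c "$cr" "$tmpdir/demo_lf.sh" || true)

# The cleaned copy now runs normally.
output=$(sh "$tmpdir/demo_lf.sh")

echo "lines with CR before: $before, after: $after, script prints: $output"
rm -rf "$tmpdir"
```

For the real script, the same tr command can be pointed at script.sh, writing to a new file that then replaces the original. Where it is installed, the dos2unix utility does the same conversion.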

Aia · March 21, 2017, 10:13pm

Some Perl suggestions:

perl -nle '
    if($.==2){$timestep=$_; next}
    print "$timestep $_" if /^4 /;
    $.=1 unless $ARGV eq $p; $p=$ARGV
' dump[1-9]{????,?????,??????}.data > particle4

or

export search='4'
perl -nle '
    if($.==2){$timestep=$_; next}
    print "$timestep $_" if /^$ENV{search} /;
    $.=1 unless $ARGV eq $p; $p=$ARGV
' dump[1-9]{????,?????,??????}.data > particle"$search"