grep not working ????

Hi,

I have a problem with grep. I want to grep the line starting with the number 531250 in the 1st column of a file (example in picture attached below)

using the command

grep -w "531250" file

my ideal result should be

531250  1       21      42.1    100     1e-05   rubber_UT_velvet.seq.Contig4570

instead it gives this output

21079   1       21      42.1    100     1e-05   NODE_50158_length_96_cov_5.531250
21079   1       21      42.1    100     1e-05   NODE_50158_length_96_cov_5.531250
23133   1       21      42.1    100     1e-05   NODE_50158_length_96_cov_5.531250
436591  1       21      42.1    100     1e-05   NODE_6486_length_160_cov_25.531250
531250  1       21      42.1    100     1e-05   rubber_UT_velvet.seq.Contig4570
594834  1       21      42.1    100     1e-05   NODE_50158_length_96_cov_5.531250
190670  1       21      42.1    100     1e-05   NODE_22903_length_224_cov_3.531250
287934  1       21      42.1    100     1e-05   NODE_58392_length_96_cov_2.531250

From the wrong result, I can see that grep also finds lines that contain 531250 somewhere else, not only the exact one I want. I've already used the grep -w option, but it still shows the same problem.

Hope somebody can help me with this.

Thanks in advance

grep '^531250' file

From man grep (Linux), section "Anchoring":
The caret ^ and the dollar sign $ are meta-characters that respectively match the empty string at the beginning and end of a line.
So you need to use:

grep -w '^531250' file

grep '^531250' inputfile

gives the expected output.

regards

Not to be a pita, but:

grep '^531250' inputfile

is not correct, for while it will match:

531250  1       21      42.1    100     1e-05   rubber_UT_velvet.seq.Contig4570

it will also match:

531250187423987423987324987432

which was not intended, as per the original request for the number 531250 in the 1st column.

The -w option is needed, or the regex needs to be ^531250[[:space:]], although that would not select a line containing only the specified number.
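
One way to cover both cases at once (the number followed by whitespace, or the number alone on a line) might be an extended regex with an alternation. This is only a sketch, assuming a grep that supports -E and POSIX character classes; the temporary sample file and its contents are made up for illustration:

```shell
# Hypothetical sample data, tab-separated like the original file.
sample=$(mktemp)
printf '531250\t1\t21\trubber_UT_velvet.seq.Contig4570\n' > "$sample"
printf '5312501874\t1\t21\tsome_other_contig\n' >> "$sample"
printf '21079\t1\t21\tNODE_50158_length_96_cov_5.531250\n' >> "$sample"
printf '531250\n' >> "$sample"

# 531250 at the start of the line, followed by whitespace OR the end of the line.
matches=$(grep -E '^531250([[:space:]]|$)' "$sample")
printf '%s\n' "$matches"
rm -f "$sample"
```

With this sample, only the first and the last line are printed; 5312501874 and the line ending in .531250 are skipped.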

The given solution is based on the given input file; I don't see any line like that in it.

Hi all,

Thanks for the solutions. Really helpful.

After banging my head against the wall several times, I've come up with another solution using awk:

awk '{if ($1==531250) print}' file

This also gives the output that I want.
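
A slightly shorter form of the same idea, as a sketch: awk prints a line by default when a pattern is true, so the if/print wrapper is optional. The variable name id and the temporary sample file are placeholders for illustration only:

```shell
# Hypothetical two-line sample, tab-separated like the original file.
sample=$(mktemp)
printf '531250\t1\t21\trubber_UT_velvet.seq.Contig4570\n' > "$sample"
printf '21079\t1\t21\tNODE_50158_length_96_cov_5.531250\n' >> "$sample"

# Pass the number in with -v; a bare pattern prints every line where it is true.
awk_matches=$(awk -v id=531250 '$1 == id' "$sample")
printf '%s\n' "$awk_matches"
rm -f "$sample"
```

Because awk compares only the first field, the 531250 at the end of the NODE line is never considered.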

But from my understanding of grep, the '-w' option should only match the exact string, right? In other words, apart from my ideal output, the other lines should not have appeared, right? Or am I missing something?

From the grep manual:

The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character.
Similarly, it must be either at the end of the line or followed by a non-word constituent character.   
Word-constituent  characters are letters, digits, and the underscore.

So the extra lines match because your required string is at the end of the line and it is preceded by a non-word constituent (the decimal point character).
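
The rule can be checked on a few made-up lines. This is a small sketch (the sample lines are invented, not taken from the original file): a dot before 531250 is a non-word character, so -w accepts the match, while an underscore or a digit before it is a word character and blocks it.

```shell
# Hypothetical lines that differ only in what surrounds 531250.
w_matches=$(printf '%s\n' \
  'NODE_1_cov_5.531250' \
  'NODE_2_cov_531250' \
  'NODE_3_cov_9531250' \
  '531250 1 21' | grep -w '531250')
printf '%s\n' "$w_matches"
```

Only the first and last lines are printed: in the first, 531250 follows a dot and ends the line; in the last, it starts the line and is followed by a space.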

Thanks Chubler xl,

It seems I didn't understand the manual itself well. Thanks for pointing this out.