Extract words before and after a certain word.

I have a sample text file named sample.txt.

It contains the following text:

this is an example text where we have to extract certain words before and after certain word these words can be used later to get more information

I want to extract n (a constant) words before and after a certain word. For example, I want to extract 2 words before and after the word "words" in the sample text above. The output that I expect is:

extract certain words before and
word these words can be

This is what I have attempted:

grep -owP '.{0,2}words.{0,2}' sample.txt

and the output that I get is this:

[white-space]words[white-space]
[white-space]words[white-space]

where [white-space] represents a single space character (" "). When I increase the window size from 2 to 10 in the command above, like this:

grep -owP '.{0,10}words.{0,10}' sample.txt

I get the following output:

[white-space]certain words before[white-space]
[white-space]these words can be[white-space]

So the pattern above counts characters before and after the match, not words, because "." matches a single character. I am using Bash.

Hello shoaibjameel123,

The following may help you with this.

awk '{for(i=1;i<=NF;i++){if($i == "words"){print $(i-2) OFS $(i-1) OFS $i OFS $(i+1) OFS $(i+2)}}}'  Input_file

Output will be as follows.

extract certain words before and
word these words can be
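
The one-liner above hard-codes a window of 2 and assumes the target has at least two words on each side; if it is the first field, $(i-2) refers to a negative field number, which awk rejects. A minimal sketch of a more general variant follows, assuming a hypothetical function name context_words and that the word and the window size n are passed as arguments (neither is part of the original command):

```shell
# Hypothetical helper (not from the original post): print up to N words
# on each side of every exact occurrence of WORD.
# Usage: context_words WORD N [FILE...]   (reads stdin when no FILE is given)
context_words() {
    word=$1
    n=$2
    shift 2
    awk -v w="$word" -v n="$n" '
    {
        for (i = 1; i <= NF; i++) {
            if ($i != w) continue
            # Clamp the window so it never reaches before field 1 or past field NF.
            first = (i - n < 1)  ? 1  : i - n
            last  = (i + n > NF) ? NF : i + n
            line = $first
            for (j = first + 1; j <= last; j++) line = line OFS $j
            print line
        }
    }' "$@"
}

# Example with the sample text from the question:
printf '%s\n' 'this is an example text where we have to extract certain words before and after certain word these words can be used later to get more information' |
    context_words words 2
# extract certain words before and
# word these words can be
```

Because each occurrence is handled independently, two occurrences that sit close together each get their own output line, with overlapping context.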
 

Thanks,
R. Singh


Perhaps

grep -owP '(?:\w+\s){0,2}words(?:\s\w+){0,2}' sample.txt
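
In that pattern, (?:\w+\s){0,2} matches up to two "word plus one whitespace character" groups before the target, and (?:\s\w+){0,2} matches up to two "whitespace plus word" groups after it, so the window is counted in words rather than characters. A minimal sketch that makes the window size a parameter, assuming GNU grep built with PCRE support (-P), a hypothetical function name grep_context, and a target word made of plain letters or digits (it is pasted into the regex unescaped):

```shell
# Hypothetical wrapper (not from the original post) around the grep command above.
# Usage: grep_context WORD N [FILE...]   (reads stdin when no FILE is given)
grep_context() {
    word=$1
    n=$2
    shift 2
    # (?:\w+\s){0,N}  up to N "word + whitespace" groups before the target
    # (?:\s\w+){0,N}  up to N "whitespace + word" groups after the target
    grep -owP "(?:\\w+\\s){0,${n}}${word}(?:\\s\\w+){0,${n}}" "$@"
}

# Example with the sample text from the question:
printf '%s\n' 'this is an example text where we have to extract certain words before and after certain word these words can be used later to get more information' |
    grep_context words 2
# extract certain words before and
# word these words can be
```

One caveat: grep -o reports non-overlapping matches, so when two occurrences of the target lie within n words of each other, the second can be absorbed into the context of the first instead of getting its own line.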