awk strings search + print next column after match

sdf · October 24, 2011, 6:48am

Hi,

I have a file of search strings that contain blanks and look like this:

S. g. Ehr.
o. Jg.
v. d. Chijs
g. Ehr.

Now I would like to search for these strings and also return the next column after each match.

awk -v FILE="search_strings.txt" 'BEGIN { while(getline < FILE) A[$1$2]=2; }{for(N=1; N<=NF; N++) if(A[$N]) print( FILENAME, $1, $N, $n+1);}'

My code failed, however. How can this be done?

---------- Post updated at 12:48 PM ---------- Previous update was at 12:35 PM ----------

No, I can't, since I use gawk on Windows.

zaxxon · October 24, 2011, 6:53am

Ok, but I deleted my answer as I just noticed that you are looking for the next column, not next row. My example would only have helped if you were looking for rows.

ahamed101 · October 24, 2011, 6:56am

Can you provide a sample output also?

--ahamed

sdf · October 24, 2011, 6:56am

This is the text.

"In the meantime v. d. Chijs found the text"

The search pattern is "v. d. Chijs", and what is needed is the search pattern plus the next column, here the word "found".

ahamed101 · October 24, 2011, 7:03am

Like this?...

srch="v. d. Chijs"
sed "s/.*\($srch [a-z]*\) .*/\1/g" input_file

--ahamed

sdf · October 24, 2011, 7:08am

Is there a way to do this in awk?

ahamed101 · October 24, 2011, 7:26am

Try this... It's nasty, though!

awk -v srch="v. d. Chijs" 'BEGIN{l=length(srch)}
{t=match($0,srch);if(!t){next}$0=substr($0,t+l);print srch" "$1}' input_file

--ahamed

---------- Post updated at 04:26 AM ---------- Previous update was at 04:19 AM ----------

Or

awk -v srch="v. d. Chijs" '{for(i=1;i<=NF;i++){if(match(srch,$i)){val=$(i+1)}}}
END{print srch" "val}' input_file

--ahamed
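
A note on the first snippet above: match() treats srch as a regular expression, so every "." in "v. d. Chijs" matches any character, not just a literal dot. Below is a minimal sketch that searches for the literal text instead, using index(). The function name next_word is made up for illustration, and the sketch only reports the first occurrence on each line.

```shell
# Hypothetical helper: $1 = literal search string, text is read from stdin.
next_word() {
    awk -v srch="$1" '
    {
        pos = index($0, srch)                  # literal substring search, no regex
        if (!pos) next                         # skip lines without the string
        rest = substr($0, pos + length(srch))  # everything after the match
        n = split(rest, w, " ")                # split the remainder on whitespace
        if (n > 0) print srch, w[1]            # search string plus the next word
    }'
}

echo "In the meantime v. d. Chijs found the text" | next_word "v. d. Chijs"
# prints: v. d. Chijs found
```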

sdf · October 24, 2011, 7:40am

Thanks, the first code you posted works! Though since the search patterns are stored in a file, I guess the second one is more suitable.
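
For the case where the search strings live in a file, one possible approach is to load them into an array first and then test every line of the text against each of them. This is only a sketch under a few assumptions: one search string per line, literal (non-regex) matching, and the hypothetical function name match_from_file. Note that a short string such as "g. Ehr." will also be reported wherever "S. g. Ehr." occurs, and the order in which the strings are tried is not defined.

```shell
# Hypothetical helper: $1 = file with one search string per line, $2 = text file.
match_from_file() {
    awk '
    NR == FNR {                      # first file: remember each non-empty line
        if ($0 != "") pat[$0] = 1
        next
    }
    {                                # second file: try every stored string
        for (p in pat) {
            pos = index($0, p)       # literal search, dots are not wildcards
            if (!pos) continue
            rest = substr($0, pos + length(p))
            if (split(rest, w, " ") > 0) print p, w[1]
        }
    }' "$1" "$2"
}

# Usage (file names are placeholders):
#   match_from_file search_strings.txt input_file
```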

smb · March 30, 2012, 7:13pm

Hello Ahamed101,

This script is great. I've adapted it for a project I'm working on. I've got something that looks like this:

awk -v srch="foo" 'BEGIN{l=length(srch)}
{t=match($0,srch);if(!t){next}$0=substr($0,t+l);print "type " srch" "$1}'

I am parsing a line that has "foo" over and over again on the same line. When using this script example, it gets the first "foo" plus one column over.

But then it stops processing that line. The lines look something like this:

this is the entry foo bar1 and perhaps foo bar2 or foo bar3
this is the entry foo bar2-1 and perhaps foo bar2-2 or foo bar3-3
this is the entry foo bar3-1 and foo bar3-2 maybe or foo bar3-3

So the current output from the adaptation above looks like:

type bar1
type bar2-1
type bar3-1

But I need the column after every foo on each line, so that my output looks more like this:

type bar1
type bar2
type bar3
type bar2-1
type bar2-2
type bar2-3
type bar3-1
type bar3-2
type bar3-3

Any assistance you can provide would be helpful.

Corona688 · March 30, 2012, 7:52pm

Why not just use 'foo' as your field separator? Then every field except the first will have what you want right at the beginning, no need to check every field or mess with match(). You'll need to do the splitting yourself, but for the trouble saved it's worth it.

$ awk -v FS="foo" '{ for(N=2; N<=NF; N++) { split($N, A, " "); print "type", A[1]; } }' data

type bar1
type bar2
type bar3
type bar2-1
type bar2-2
type bar3-3
type bar3-1
type bar3-2
type bar3-3

$
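
One caveat with FS="foo": the separator is a regular expression matched anywhere in the line, so a word such as "food" would also split the line and produce a spurious entry. If the keyword always stands alone as a whitespace-separated word, a plain loop over the default fields avoids that. A minimal sketch, with after_word as a made-up name:

```shell
# Hypothetical helper: $1 = keyword, text is read from stdin.
after_word() {
    awk -v word="$1" '
    {
        # Stop at NF-1 so that $(i + 1) always exists.
        for (i = 1; i < NF; i++)
            if ($i == word) print "type", $(i + 1)   # exact comparison, no regex
    }'
}

echo "this is the entry foo bar1 and perhaps foo bar2 or foo bar3" | after_word foo
# prints:
# type bar1
# type bar2
# type bar3
```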

smb · March 30, 2012, 8:09pm

Thank you! You were absolutely right about changing the field separator. I hadn't thought of it in those terms. Have a great day!