awk search strings from array in textfile

I want to take a list of search strings and look for matches in a list of text files, preferably with awk, reading the search strings into an array.

// Search_strings.txt

tag
string
dummy
stuff
things

// List of files to search in

textfile1.txt
textfile2.txt

The desired output should look like this:

{print $1, searchstring, Fieldnumber_searchstring_found}

Can anybody help?

What have you tried?

I tried to read Search_strings.txt into an array, count the entries in the array, and loop through each file with a for loop, but I got lost in the code.

$ cat array.awk

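# Load each search string as an array key; "> 0" ends the loop on EOF or on a read error.
BEGIN { while((getline < "Search_strings.txt") > 0) A[$1]=1; }

# For every input line, print the first field, the matching field and its field number.
{
        for(N=1; N<=NF; N++)
        if($N in A)
                printf("%s, %s, %d\n", $1, $N, N);
}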

$ xargs awk -f array.awk < list_of_files.txt
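
To see what this approach produces, here is a minimal self-contained sketch for a Unix-like shell. The sample file contents and the names `LIST`, `word` and `wanted` are made up for the illustration; it also reads each whole line into `word` instead of using `$1`, which is equivalent only as long as every search string is a single word.

```shell
# Hypothetical sample data, created in a scratch directory for the demo.
dir=$(mktemp -d)
printf 'tag\nstring\ndummy\nstuff\nthings\n' > "$dir/Search_strings.txt"
printf 'alpha string beta\nstuff gamma\n' > "$dir/textfile1.txt"
printf 'delta epsilon things\n' > "$dir/textfile2.txt"

# Load the search strings as array keys, then test every field of every line.
result=$(awk -v LIST="$dir/Search_strings.txt" '
BEGIN {
    # "> 0" stops the loop at end-of-file and also on a read error (-1)
    while ((getline word < LIST) > 0) wanted[word] = 1
    close(LIST)
}
{
    for (n = 1; n <= NF; n++)
        if ($n in wanted)
            print $1, $n, n
}' "$dir/textfile1.txt" "$dir/textfile2.txt")

printf '%s\n' "$result"
rm -rf "$dir"
```

With that sample data it should print:

alpha string 2
stuff stuff 1
delta things 3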

Since I need to use gawk on Windows, I rewrote the script as:

-v FILE="Search_strings.txt" "BEGIN { while(getline < FILE) A[$1]=1; }{for(N=1; N<=NF; N++) if(A[$N]) print( $1, $N, N);}" textfiles*.txt > Concordance.txt

However, the script never seems to finish, and the output file stays at 0 bytes.

That would have been nice to know.

Unfortunately, Windows CMD does not expand * at all. Any glob-like expansion is done inside the individual program, not by CMD, so it's an entirely optional feature which most commands don't support, and those which do often don't handle it the same way as others.
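
If you would rather not depend on anyone expanding the wildcard, one option is to let awk read the list of filenames itself and append them to ARGV in the BEGIN block. This is only a sketch, shown in a Unix-like shell so it can be run as-is; the variable names `LIST` and `FILES` and the sample files are invented for the example, and the quoting would need adapting for CMD.

```shell
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
printf 'stuff\n' > "$dir/strings.txt"
printf 'a stuff\n' > "$dir/data1.txt"
printf 'b c stuff\n' > "$dir/data2.txt"
printf '%s\n' "$dir/data1.txt" "$dir/data2.txt" > "$dir/list_of_files.txt"

# No file operands on the command line: awk builds its own input list.
listed=$(awk -v LIST="$dir/strings.txt" -v FILES="$dir/list_of_files.txt" '
BEGIN {
    while ((getline word < LIST) > 0) wanted[word] = 1
    close(LIST)
    # Append each listed filename to ARGV so awk treats it as an input file.
    while ((getline name < FILES) > 0) ARGV[ARGC++] = name
    close(FILES)
}
{
    for (n = 1; n <= NF; n++)
        if ($n in wanted)
            print $1, $n, n
}')

printf '%s\n' "$listed"
rm -rf "$dir"
```

Because the filenames come from a plain text file, this works the same way whether or not the shell or the awk build does any wildcard expansion.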

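Also, all your data files are going to be full of garbage carriage returns. Windows text files end each line with CR+LF, so the last field on a line can end up with a trailing \r stuck to it and then fail to match.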
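
A possible way to deal with that is to strip the trailing carriage return from every line before comparing anything. The sketch below fakes CRLF files with printf in a Unix-like shell; the file contents and the names `LIST`, `word` and `wanted` are assumptions for the demo. Some Windows gawk builds may already drop the \r on input, in which case the two sub() calls simply do nothing.

```shell
# Simulate a CRLF search list and a CRLF data file (hypothetical content).
dir=$(mktemp -d)
printf 'stuff\r\nthings\r\n' > "$dir/strings.txt"
printf 'one two stuff\r\n' > "$dir/data.txt"

cleaned=$(awk -v LIST="$dir/strings.txt" '
BEGIN {
    while ((getline word < LIST) > 0) {
        sub(/\r$/, "", word)   # drop the trailing carriage return, if any
        wanted[word] = 1
    }
    close(LIST)
}
{
    sub(/\r$/, "")             # same cleanup on the data line; fields are re-split
    for (n = 1; n <= NF; n++)
        if ($n in wanted)
            print $1, $n, n
}' "$dir/data.txt")

printf '%s\n' "$cleaned"
rm -rf "$dir"
```

Here the match is in the last field of the line, which is exactly the case a stray \r would break; with the cleanup in place it prints "one stuff 3".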

CMD doesn't have real quoting, either. Strings, arguments and quotes all get passed as-is. Yes, all quoting in CMD is an optional feature. Try to avoid spaces in filenames.

I got this to work:

> gawk -v FILE="strings.txt" "BEGIN { while(getline < FILE) A[$1]=1; }{for(N=1; N<=NF; N++) if(A[$N]) print( FILENAME, $1, $N, N); }" data*.txt
data1.txt string string 1
data1.txt string string 1
data1.txt stuff stuff 1
data2.txt stuff stuff 1
>

I added FILENAME to make sure it was opening all the files correctly. It's a built-in variable.

@corona688

Thanks, this works the way I need it!