Shell Variables passed to awk to return certain rows

Hi Forum.

I have the following test.txt file and need to extract certain rows based on "starting position", "length of string" and "string to search for":

1a2b3d
2a3c4d
.....

My script accepts 3 parameters (starting column position, length to search for, string to search for), and I would like to pass these parameters to awk to extract the matching records.

Running the script as:

script 2 3 a2b test.txt

would return the first record from the file:

1a2b3d

This is the awk command I am using:

awk -v search_col_pos=$search_col_pos search_str_len=$search_str_len string_to_search=$string_to_search 'substr($0, search_col_pos, search_str_len) == "string_to_search"' test.txt

where $search_col_pos, $search_str_len, $string_to_search are passed as input parameters to the script.

But the code is not working and returns the following error:

awk: fatal: cannot open file

Please help.

Thanks.

I am quite sure there is an easier way of doing what you are trying to do, and it would help if you had provided some real input and output files. Nevertheless, I would like to point out the following:

-v search_col_pos=$search_col_pos -v search_str_len=$search_str_len -v string_to_search=$string_to_search
 'substr($0, search_col_pos, search_str_len) == "string_to_search"' test.txt

What you are saying here is: if that expression evaluates to true (any non-zero value), perform the default action, which is to print $0.
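
As a small illustration of that default action, here is a sketch using the two sample lines from the first post. The start position, length and search string are hard-coded, and the shell variable name matched is only for this example.

```shell
# A pattern with no { action } block prints the whole record ($0)
# whenever the pattern is true.
# Input: the two sample lines from test.txt in the first post.
matched=$(printf '1a2b3d\n2a3c4d\n' | awk 'substr($0, 2, 3) == "a2b"')
printf '%s\n' "$matched"
```

Only the first line has "a2b" starting at column 2, so only that line is printed.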

This is a sample of what the data looks like:

H2015051303:52:46CUSTOMER LEVEL DETAIL   8987DET





B2015051303:52:463570





D3570                          1057            3570 000000000000040000C5360885903549753         00189DEBIT CASH BAL-REVERSAL                  0189DEBITCASHBAL-REVERSA 75360885132132
000004319CAD20150512201505128987CTPC NE0000000012  89878888880060              03                      CAD0000000000000400000000000245100




D3570                          1057            3570 000000000000040000D5360884924480379         00184CREDIT TO PURCHASE BAL-REVERSAL          CREDITTOPURCHASEBAL-REVE 75360885132132
000004137CAD20150512201505128987CTPC NE0000000012  89878888880060              03                      CAD0000000000000400000000000245300

...

Thanks for pointing out the missing -v. This is what the updated code looks like, but it still doesn't work as coded.

awk -v search_col_pos=$search_col_pos -v search_str_len=$search_str_len -v segment_type=$segment_type 'substr($0, search_col_pos, search_str_len) == "segment_type"' test.txt

If I hard-code "D" in place of "segment_type" in the awk command, it works, but it does not work when I use the variable segment_type.

Essentially I want to extract all records with a "D" in certain column position for a length of one in this example.

I have other files in which the search string can be at any position and be 4 characters long.

You need a -v option for each variable you're setting; not just the first one.
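
This also explains the "cannot open file" error: awk treats the first argument that is not an option as the program text, so with a single -v the second assignment becomes the program and the quoted program ends up being read as a file name. A minimal sketch of the working form, with made-up short variable names (pos, len, str) standing in for yours:

```shell
# Each variable gets its own -v, so all three are set before the
# program starts and are visible even inside BEGIN.
vars=$(awk -v pos=2 -v len=3 -v str=a2b 'BEGIN { print pos, len, str }')
printf '%s\n' "$vars"

# With only the first -v, as in:
#   awk -v pos=2 len=3 str=a2b 'BEGIN { ... }'
# awk would take "len=3" as the program text and treat the quoted
# program as an input file operand.
```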

If there is any chance that string_to_search could contain any whitespace characters or characters that have special meaning to the shell, it will also need to be quoted.

It is generally a good idea to quote all shell variable expansions, but I will assume for now that your script has already verified that $search_col_pos and $search_str_len expand to numeric strings.
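
To show why the quoting matters, here is a sketch with a search string that contains a space. The input line is cut down from the sample data, and the position and length (7 and 8) are chosen only to fit this example.

```shell
string_to_search='CASH BAL'   # contains a space

# Quoted expansion: awk receives the single argument
#   string_to_search=CASH BAL
hit=$(printf 'DEBIT CASH BAL-REVERSAL\n' |
    awk -v string_to_search="$string_to_search" \
        'substr($0, 7, 8) == string_to_search')
printf '%s\n' "$hit"

# Unquoted, the shell would split the value into two words, and awk
# would take the second word (BAL) as its program text.
```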

And, inside an awk script, putting a variable name in quotes makes it a string literal, so the comparison is made against the name of the variable instead of the contents of the variable.
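
The difference can be seen side by side in this sketch. The two input lines are just the first five characters of a D record and a B record from the sample data, and the shell variable names quoted and unquoted are only for illustration.

```shell
# "segment_type" in double quotes is the literal text segment_type,
# which can never equal a one-character substring, so nothing matches.
quoted=$(printf 'D3570\nB2015\n' |
    awk -v segment_type=D 'substr($0, 1, 1) == "segment_type"')

# Without the quotes, awk compares against the variable's value (D).
unquoted=$(printf 'D3570\nB2015\n' |
    awk -v segment_type=D 'substr($0, 1, 1) == segment_type')

printf 'quoted: [%s]\nunquoted: [%s]\n' "$quoted" "$unquoted"
```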

Try:

awk -v search_col_pos=$search_col_pos -v search_str_len=$search_str_len -v string_to_search="$string_to_search" 'substr($0, search_col_pos, search_str_len) == string_to_search' test.txt
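
If it helps, the same command could be wrapped so that it takes the arguments in the order shown in the first post (position, length, string, file). This is only a sketch: the function name extract_rows is made up, all expansions are quoted here to be safe, and mktemp is used just to create a throwaway copy of the two sample lines.

```shell
# extract_rows POS LEN STRING FILE   (hypothetical helper name)
# Prints every line of FILE whose LEN characters starting at
# column POS are exactly STRING.
extract_rows() {
    awk -v search_col_pos="$1" \
        -v search_str_len="$2" \
        -v string_to_search="$3" \
        'substr($0, search_col_pos, search_str_len) == string_to_search' "$4"
}

# Try it on the two sample lines from the first post.
tmp=$(mktemp)
printf '1a2b3d\n2a3c4d\n' > "$tmp"
result=$(extract_rows 2 3 a2b "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```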

If that still doesn't work, show us what happens with the above corrections.


Hi Don.

Your solution worked perfectly!!!

thanks.