Parsing out data with multiple field separators

I have a large file that I need to print certain sections out of.

file.txt

/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01   12   13   14   -9    1   -9   -9   -9   -9   -9    1    2    3    4    5   -9   -9

I need to print the "USC00015420" and field 16 (in this case "5").

The code I was hoping would work was

 awk -F" " '{ printf substr($1,53,57)};{print ("   "$16)}' file.txt

But it prints all of the content between the periods (".").

I need the final output to be

USC00015420   5

This seems to work:

awk '{ print substr($1,62,11) "   " $16}' file.txt

The third parameter of substr is the number of characters to return, not the end position.
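
To illustrate the point, here is a minimal sketch using a made-up string (the value of s is hypothetical, chosen only so the offsets are easy to count). It also suggests why substr($1,53,57) printed so much: it asks for 57 characters starting at position 53, which runs to the end of the field.

```shell
# Hypothetical string, chosen only to make the offsets easy to count.
s='abcdefghij'

# substr(string, start, length): start is 1-based, length is a character count.
by_length=$(awk -v s="$s" 'BEGIN { print substr(s, 4, 3) }')

# Omitting the length returns everything from start to the end of the string.
to_end=$(awk -v s="$s" 'BEGIN { print substr(s, 4) }')

# A length that runs past the end is simply cut off at the end of the string.
too_long=$(awk -v s="$s" 'BEGIN { print substr(s, 4, 57) }')

printf '%s\n' "$by_length" "$to_end" "$too_long"
```

The first call returns three characters; the other two both return the remainder of the string.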

Alternatives based on field separators:

awk '{ split($1,F,".*/|\\."); print F[2] "   " $16}' file.txt

or

awk '{ split($1,F,/.*\/|\./); print F[2] "   " $16}' file.txt

or

awk '{ s=$1; gsub(".*/|\\..*",x,s); print s "   " $16}' file.txt
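
To see why F[2] is the piece that holds the station ID, it may help to list everything split() produces. The sketch below feeds only the first field of the sample line from the question to the same regex and prints each array element with its index; the variable names field and pieces are just for illustration.

```shell
# First field of the sample line from the question.
field='/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01'

# Split on "everything up to the last slash" or "a literal dot",
# then print each resulting piece together with its array index.
pieces=$(printf '%s\n' "$field" |
  awk '{ n = split($1, F, /.*\/|\./)
         for (i = 1; i <= n; i++) printf "F[%d]=%s\n", i, F[i] }')

printf '%s\n' "$pieces"
```

F[1] is empty because the field starts with a separator (the whole leading path matches ".*/"), so the ID lands in F[2].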

Hello ncwxpanther,

Could you please try the following as well and let me know if it helps?
Solution 1st:

awk '{split($1,a,"/");sub(/\..*/,"",a[8]);print a[8]"\t"$16}'   Input_file

Solution 2nd:

awk -F'[/. ]' '{print $8"\t"$65}'   Input_file

Thanks,
R. Singh

The basic problem is that you need to define what a "section" is. The line you presented uses several different separators to delimit what I suppose you mean by "section": spaces (or maybe tabs), dots, slashes and colons.

Will these four (five) characters always be delimiters?

Will the lines always have the same structure? (like "First 7 parts separated by "/", the seventh part consists of 6 parts separated by ".", then a colon, then ....")

If you could answer these questions, we could perhaps provide solutions that work more reliably. Without this information we can never be sure that we have really solved the problem.

I hope this helps.

bakunin

Thanks!
I was able to get this to perform as expected:

awk '{ print substr($1,53,11) "   " $16}' file.txt

Certainly NOT with the sample you gave in post #1: there, substr($1,53,11) returns "0.01.str:US".
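
The mismatch suggests the real paths are a different length from the sample's, which is the weakness of any fixed offset. As a rough sketch under that assumption, the two input lines below are invented (same file name layout, different directory depth, columns padded so that $16 exists) to compare a position-based extraction with a separator-based one.

```shell
# Two hypothetical lines: same file name layout under paths of different length.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/a/b/USC00015420.x.str:USC00015420Y2017 2 3 4 5 6 7 8 9 10 11 12 13 14 15 5
/a/bb/ccc/USC00099999.x.str:USC00099999Y2017 2 3 4 5 6 7 8 9 10 11 12 13 14 15 7
EOF

# Position-based: offset 6 is only correct for the first line's path length.
by_pos=$(awk '{ print substr($1, 6, 11) "   " $16 }' "$tmp")

# Separator-based: drop everything up to the last slash, then everything
# from the first dot onward, so the path length no longer matters.
by_sep=$(awk '{ s = $1; sub(/.*\//, "", s); sub(/\..*/, "", s); print s "   " $16 }' "$tmp")

rm -f "$tmp"

printf 'by position:\n%s\nby separator:\n%s\n' "$by_pos" "$by_sep"
```

The position-based version only gets the first line right, while the separator-based version returns the ID for both.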