extract string from file name between two underscores

1988PF · July 19, 2012, 12:44pm

Hi,

Here is my question,

I need to extract the string between two underscores in a filename.

For example, if the filename is

atmos_8xdaily_instant_300x300_1_12.nc

then what I want to extract is 300x300.

There are many such files in my directory, so I guess the code should be like:

for file in *1_12;do
NAME= command to extract string from($file)
NAME1=${NAME}.slp.nc
done

Thanks!

jim_mcnamara · July 19, 2012, 12:55pm

one way:

# the pattern must include the .nc suffix, or it will not match the example file
for fname in *_1_12.nc
do
  tmp=$(echo "$fname" | awk -F '_' '{print $4}' )
  newfname=${tmp}.slp.nc
  echo "$newfname"
done

1988PF · July 19, 2012, 12:58pm

Thanks. Could you explain a little bit about this line?

tmp=$(echo "$fname" | awk -F '_' '{print $4}' )

jim_mcnamara · July 19, 2012, 1:08pm

awk parses the filename, using the underscore character to separate fields. A few shells, such as bash, can also do this. Field $4 is the one you want, so awk prints the $4 field. $[field number] is the awk syntax for a field of the current input line; the default field separator is whitespace (spaces and tabs). The -F '_' option tells awk to use an underscore instead.
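As a rough illustration of the remark that a shell can do the same splitting on its own, here is a minimal sketch that uses only built-ins. The function name fourth_field and the variable names are made up for this example, and it assumes the wanted part is always the fourth underscore-separated field.

```shell
# Hypothetical helper: print the 4th underscore-separated field of a name,
# using only shell built-ins (no awk).
fourth_field() {
  # IFS=_ applies only to this read. f1..f3 take the first three fields,
  # f4 takes the fourth, and rest swallows whatever is left over.
  IFS=_ read -r f1 f2 f3 f4 rest <<EOF
$1
EOF
  printf '%s\n' "$f4"
}

fourth_field "atmos_8xdaily_instant_300x300_1_12.nc"   # prints 300x300
```

This avoids starting an extra process per file, which may matter when the directory holds many files.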

shamrock · July 19, 2012, 2:17pm

Yet another way of doing the same thing...

echo "atmos_8xdaily_instant_300x300_1_12.nc" | sed 's/.*_\([0-9][0-9]*x[0-9][0-9]*\)_.*/\1/'

1988PF · July 19, 2012, 2:19pm

Could you also explain a little bit about this line? Thanks!

shamrock · July 19, 2012, 2:34pm

The sed command looks at a line and keeps only the part that matches the pattern "(number)x(number)" delimited by underscores. So in this pattern

.*_\([0-9][0-9]*x[0-9][0-9]*\)_.*

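the leading ".*" matches everything to the left of the underscore before the first 300, and the trailing ".*" matches everything to the right of the underscore after the second 300. Each [0-9][0-9]* matches one or more digits, and the "x" and "_" are taken literally. The \( \) pair captures the 300x300 part, and \1 in the replacement puts that captured text in place of the whole line.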
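One thing to watch: when a name does not match the pattern, sed prints it unchanged. A possible variant, sketched here with a hypothetical helper name extract_grid, uses -n together with the p flag so that nothing is printed for names that do not match.

```shell
# Hypothetical helper: print the NUMBERxNUMBER part of a name, or nothing
# when the name has no such part between underscores.
extract_grid() {
  # -n suppresses sed's default output; the trailing p prints the line
  # only if the substitution actually took place.
  printf '%s\n' "$1" |
    sed -n 's/.*_\([0-9][0-9]*x[0-9][0-9]*\)_.*/\1/p'
}

extract_grid "atmos_8xdaily_instant_300x300_1_12.nc"   # prints 300x300
extract_grid "no_grid_here.nc"                         # prints nothing
```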

alister · July 19, 2012, 3:02pm

Probably the simplest solution is

cut -d_ -f4
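cut reads from standard input, so inside the loop the filename has to be piped into it. A minimal sketch follows; the helper name new_name is made up for this example, and the glob assumes the files end in _1_12.nc like the sample filename.

```shell
# Hypothetical helper: build the new name from the 4th underscore-separated field.
new_name() {
  field=$(printf '%s\n' "$1" | cut -d_ -f4)
  printf '%s.slp.nc\n' "$field"
}

# Assumed glob: files that end in _1_12.nc, as in the example filename.
for file in *_1_12.nc; do
  [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
  new_name "$file"
done
```

For atmos_8xdaily_instant_300x300_1_12.nc this prints 300x300.slp.nc.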

Regards,
Alister