extract string from file name between two underscores

Hi,

Here is my question:

I need to extract the string between two underscores from a filename.

For example, the filename is

atmos_8xdaily_instant_300x300_1_12.nc

what I want to extract is 300x300.

There are many such files in my directory, so I guess the code should be like:

for file in *1_12;do
NAME= command to extract string from($file)
NAME1=${NAME}.slp.nc
done

Thanks!

one way:

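# the sample name ends in .nc, so the glob has to include the extension
for fname in *_1_12.nc
do
  # split the name on "_" and keep the 4th field, e.g. 300x300
  tmp=$(echo "$fname" | awk -F '_' '{print $4}')
  newfname="${tmp}.slp.nc"
  echo "$newfname"
done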

Thanks. Could you explain this line a little?

tmp=$(echo "$fname" | awk -F '_' '{print $4}' )

awk parses the filename, using the underscore character to separate fields. Some shells, bash among them, can also do this on their own. Field $4 is the one you want, so awk prints $4. $N (where N is the field number) is the awk syntax for the Nth field of each input line. The default field separator is whitespace (spaces or tabs); the -F '_' tells awk to use underscore instead.
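
For the remark that the shell can do this itself, here is a minimal sketch using only parameter expansion, so no external command is started. It assumes the wanted string is always the 4th underscore-separated field, as in the sample name; the variable names rest and field4 are made up for illustration.

```shell
fname="atmos_8xdaily_instant_300x300_1_12.nc"   # sample name from the question

# Drop the shortest prefix that contains three underscores,
# i.e. the first three fields: "atmos_8xdaily_instant_"
rest=${fname#*_*_*_}

# Drop the longest suffix that starts with an underscore: "_1_12.nc"
field4=${rest%%_*}

newfname="${field4}.slp.nc"
echo "$newfname"
```

Both expansions are plain POSIX, so this should behave the same in sh, bash, and ksh.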

Yet another way of doing the same thing...

echo "atmos_8xdaily_instant_300x300_1_12.nc" | sed 's/.*_\([0-9][0-9]*x[0-9][0-9]*\)_.*/\1/'

Could you also explain this line a little? Thanks!

The sed command looks at a line and keeps only the part that matches the pattern "(number)x(number)" delimited by underscores. So in this pattern

.*_\([0-9][0-9]*x[0-9][0-9]*\)_.*

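the leading ".*_" matches everything on the line up to and including the underscore before the first 300, and the trailing "_.*" matches everything from the underscore after the second 300 to the end of the line. Each [0-9][0-9]* matches one or more digits, and the "x" and "_" are taken literally. The \( and \) capture the 300x300 part, and \1 in the replacement puts that captured text in place of the whole line.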
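
One thing to watch with the sed approach: when a name does not match the pattern, sed prints it unchanged. A small sketch of one way around that, using -n together with the p flag so that only successful substitutions are printed. The second file name is a made-up example of a name that does not match.

```shell
# Same expression as above, with the p flag added to the substitution
pattern='s/.*_\([0-9][0-9]*x[0-9][0-9]*\)_.*/\1/p'

# Matching name: prints the captured NxN part
match=$(echo "atmos_8xdaily_instant_300x300_1_12.nc" | sed -n "$pattern")

# Non-matching (hypothetical) name: -n suppresses the line, so the result is empty
nomatch=$(echo "README_notes_v2.txt" | sed -n "$pattern")

echo "match=[$match] nomatch=[$nomatch]"
# prints: match=[300x300] nomatch=[]
```

An empty result can then be tested with [ -n "$match" ] before building the new file name.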

Probably the simplest solution is

cut -d_ -f4

Regards,
Alister
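
For completeness, a sketch of how the cut suggestion might fit into the loop from the question. The helper name field4 and the second file name are made up for illustration; in a real directory the list would come from a glob such as *_1_12.nc.

```shell
# Hypothetical helper: print the 4th "_"-separated field of its argument
field4() {
  printf '%s\n' "$1" | cut -d_ -f4
}

# Sample names standing in for the real directory listing
for fname in atmos_8xdaily_instant_300x300_1_12.nc atmos_8xdaily_instant_600x600_1_12.nc
do
  newfname="$(field4 "$fname").slp.nc"
  echo "$newfname"
done
```

Note that cut prints a line unchanged when it contains no delimiter at all; adding -s suppresses such lines.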