How to parse a string into variables

I'm working in korn shell and have a variable which contains a string like:
aa_yyyymmdd_bbb_ccc_ddd.abc. I want to treat the _ and . as delimiters and parse the string so I end up with 6 values in variables that I can manipulate. My original plan was to use
var1=`echo $string1 | cut -c1-2` but now the requirements have changed so the position is no longer fixed.

Please help.


$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc"
aa_yyyymmdd_bbb_ccc_ddd.abc
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"." -f 2
abc
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"_" -f 1
aa
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"_" -f 2
yyyymmdd
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"_" -f 3
bbb
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"_" -f 4
ccc
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"_" -f 5
ddd.abc
$ echo "aa_yyyymmdd_bbb_ccc_ddd.abc" | cut -d"_" -f 5 | cut -d"." -f 1
ddd
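
To get those values into variables rather than just printed, the same cut calls can be wrapped in command substitution. This is only a sketch: the names string1, var1, var2, last and ext are made up for illustration, and it still assumes the field you want sits at a known position.

```shell
string1=aa_yyyymmdd_bbb_ccc_ddd.abc

# Each value is captured with command substitution; printf is used
# instead of echo so odd characters in the string are passed through as-is.
var1=$(printf '%s\n' "$string1" | cut -d_ -f1)
var2=$(printf '%s\n' "$string1" | cut -d_ -f2)
ext=$(printf '%s\n' "$string1" | cut -d. -f2)
# The last underscore field still carries the extension, so cut it again on "."
last=$(printf '%s\n' "$string1" | cut -d_ -f5 | cut -d. -f1)

printf '%s\n' "$var1" "$var2" "$last" "$ext"
```

Each assignment forks a pipeline, so for many fields the IFS-based approaches are cheaper.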

$ oldIFS=$IFS
$ IFS=._
$ set aa_yyyymmdd_bbb_ccc_ddd.abc
$ set $1
$ echo $1
aa
$ echo $2
yyyymmdd
$ echo $3
bbb
$ IFS=$oldIFS
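
If you know you want exactly six named variables, a variation on the IFS idea is to let read do the splitting. A minimal sketch, assuming the string is in a variable called string1 (the other variable names are invented here); because IFS is set only on the read command line, the shell's own IFS is left alone and the positional parameters are not overwritten.

```shell
string1=aa_yyyymmdd_bbb_ccc_ddd.abc

# IFS=._ applies only to this read; -r keeps backslashes literal.
IFS=._ read -r prefix date f3 f4 f5 ext <<EOF
$string1
EOF

printf '%s\n' "$prefix" "$date" "$f3" "$f4" "$f5" "$ext"
```

If the string has fewer than six fields, the leftover variables are simply empty.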

Thanks for the help!

What if we are not sure about the length? For example, the variable can be

aaa_bbb_ccc_ddd.abc
or aaa_bbb.abc
or aaa.abc

Well, you can loop over them.

-bash-3.2$ cat test.txt
aaa_bbb_ccc_ddd.abc

-bash-3.2$ awk -F"[_.]" '{ for (i = 1; i<=NF;i++) { print $(i); } }' test.txt
aaa
bbb
ccc
ddd
abc

Ygor's solution works no matter what the lengths are. The only change I'd make is that I'd replace "set $1" with:

set -f
set -- $1
set +f
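
Put together, the whole thing for a string with an unknown number of fields might look like the sketch below. The variable names var, count, i and field are just placeholders, and note that "set --" replaces the script's own positional parameters.

```shell
var=aaa_bbb.abc

oldIFS=$IFS
IFS=._
set -f            # turn off globbing so a field such as * is not expanded
set -- $var       # unquoted on purpose: this is where the splitting happens
set +f
IFS=$oldIFS

count=$#          # number of fields found
i=0
for field         # "for" without "in" walks the positional parameters
do
    i=$((i + 1))
    echo "field $i of $count: $field"
done
```

$# tells you how many fields there were, so the same code copes with aaa.abc, aaa_bbb.abc and longer strings.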

This comes up often enough that I thought I'd drop it here (so the next time I need it I can google for it).

Using "IFS", "set", and "for without the in keyword".

Note this will stomp on any arguments passed to the script, and messing with IFS has dangerous side effects to the rest of the script (remember to set it back!)

  SEP=:  # arbitrary
  FIELDS='yack":blah)::3wasempty:blec!:'

  OLDIFS="$IFS"; IFS=$SEP
  set -- $FIELDS
  for STR
  do
    echo STR=$STR
  done
  IFS="$OLDIFS" 

Also, you can use ##, #, %%, % to walk through values but it's a little more painful

  SEP=:  # arbitrary
  FIELDS='yack":blah)::3wasempty:blec!:'

  REST="$FIELDS"
  for x in 1 2 3 4 5
  do
    # chop off trailing fields
    echo ${REST%%$SEP*}
    # chop off leading field
    REST=${REST#*$SEP}
    # detect last field, works with empty fields too (trailing :)
    [ "$REST" = "${REST#*$SEP}" ] && break
  done

And finally, you can always use sed/awk/cut/python/ruby/perl ...
--
qneill

Note what happens if:

FIELDS='yack":blah):*:3wasempty:blec!:'

And there's no need for that loop if all you're doing is printing the fields:

set -f
printf "%s\n" $FIELDS
set +f

Your code could also cause failure later in the script if IFS was unset; the behaviour when IFS is unset is not the same as when it is empty.

There's no need to limit the number of fields; see below.

sep=:  # arbitrary
fields='yack":blah)::3wasempty:blec!:'

while [ -n "$fields" ]
do
  # chop off leading field
  rest=${fields#*"$sep"}

  # print the first field
  printf "%s\n" "${fields%"$sep$rest"}"

  fields=$rest
done
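
One caveat with the loop above: it relies on the string ending with the separator. If the last field has no trailing separator, the ${fields#*"$sep"} expansion leaves the string unchanged and the loop never ends. A possible variant that handles both cases is sketched below; split_print is a hypothetical helper name, not something from the earlier posts.

```shell
# Print each field of $1 on its own line, using $2 as the separator.
split_print() {
  fields=$1
  sep=$2
  while :
  do
    case $fields in
      *"$sep"*)
        printf '%s\n' "${fields%%"$sep"*}"   # print the first field
        fields=${fields#*"$sep"}             # drop it and one separator
        ;;
      *)
        printf '%s\n' "$fields"              # no separator left: last field
        break
        ;;
    esac
  done
}

split_print 'a:b::c' :
```

Unlike the loop above, this treats a trailing separator as being followed by one more (empty) field, so 'a:' prints a and then an empty line.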

Nice catch, set -f is the answer as you explained in your post.

Agreed, I was printing as an illustration that uses the extracted value, the assumption being the programmer using the code will do more with it.

Doh!

The "for x in 1 2 3 4 5" was test code I used to limit the loop while I tested the code (I originally had "while [ true ]").

Funny, after a full 15 minutes of cut-n-paste testing I still forgot to restore the code.

I envision an online test harness which provides an assertion authoring mechanism, a test harness, code posting and code verification mechanism, all tied back to the guide.

qneill "but its all in your head, melman"