Grabbing fields without using external commands

data:

bblah1_blah2_blah3_blah4_blah5
bblahA_blahB_blahC_blahD_blahE

I'm using the following code to grab the field I want:

cat data | while IFS='_' read v1 v2 v3 v4 v5; do printf '%s\n' "${v4}"; done

I'm catting data here. In the real world, the exact content of data will be in a variable, so instead of catting I'll be doing something like:

printf '%s\n' "${datacontents}" | while IFS='_' read v1 v2 v3 v4 v5; do printf '%s\n' "${v4}"; done

This works for me, but there are many instances where I have to grab the last field or the second-to-last field. How do I do that using the above command, without having to call any external utilities?

If I were to call an external utility, I could grab the last field with this:

awk -F"_" '{print $NF}' data

and I believe I can get the second to last with this:

awk -F"_" '{print $(NF-1)}' data

I don't want to use awk or any external utilities. I gave the above examples only to make clear what I'm asking.

I intend to use this code across multiple Unix platforms, so it should be portable.

If you change the read to read into an array, you can access each element of the array by index, e.g.:

unset line
while IFS=_ read -A line; do
  NF=$((${#line[@]}-1))   # index of the last element (arrays are zero-based)
  echo "${line[NF]}"      # last field
  echo "${line[NF-1]}"    # second last field
done < file

You could also drop the NF variable and compute the index inside each of those two echo lines:

echo "${line[${#line[@]}-1]}" # last field
echo "${line[${#line[@]}-2]}" # second last field

but it's a bit messier to look at.

edit: read -A is the ksh spelling; in Bash it's read -a (lowercase -a).

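edit 2: Another option, using shell positional parameters:

while read line; do
 oldIFS=$IFS; IFS=_   # IFS must be set before $line is expanded, so not on the set line itself
 set -- $line         # split on underscores into $1, $2, ...
 IFS=$oldIFS
 ...
done < file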
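
To fill in the "..." for the last and second-to-last fields, one way is to shift fields away until only two are left. This is only a rough sketch: last_two_fields is a made-up helper name, and it assumes a shell that understands set -f and shell functions. Note that a trailing underscore does not produce an empty last field with this kind of splitting.

```shell
# Sketch: print the second-to-last and last underscore-separated fields
# using only shell built-ins. last_two_fields is a hypothetical helper name.
# POSIX sh has no local variables, so line and oldIFS are globals here.
last_two_fields() {
    line=$1
    oldIFS=$IFS
    IFS=_
    set -f              # disable globbing so * or ? in the data stay literal
    set -- $line        # split on "_" into $1, $2, ...
    set +f
    IFS=$oldIFS
    # Discard leading fields until at most two remain.
    while [ "$#" -gt 2 ]; do
        shift
    done
    if [ "$#" -eq 2 ]; then
        printf '%s\n' "$1" "$2"
    else
        printf '%s\n' "" "$1"   # fewer than two fields: no second-to-last
    fi
}

last_two_fields bblah1_blah2_blah3_blah4_blah5
```

With the sample line this prints blah4 and then blah5, each on its own line.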

This works for bash:

#!/bin/bash

DATA="bblah1_blah2_blah3_blah4_blah5
bblahA_blahB_blahC_blahD_blahE"

printf '%s\n' "${DATA}" | while IFS=_ read -a line; do
  NF=$((${#line[@]}-1))   # index of the last field
  echo "${line[NF-1]}"    # second-to-last field
done

but it does not work for sh. I get the following error:

read: Illegal option -a

If I try with the original -A which you had, I still get:

read: Illegal option -A

I care about this because some of the old systems we have here don't have bash. They just have sh. These are systems such as AIX and SunOS.

Any suggestions on how to get around this?

If the data is in a variable as in:

var=bblah1_blah2_blah3_blah4_blah5

you can get the last field with:

last=${var##*_}

and the next to the last field with:

tmp=${var%_*}
second_last=${tmp##*_}

and to see the results:

printf '%s\n' "$second_last" "$last"

which with var set as shown above will print:

blah4
blah5
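
If the variable holds several lines, as in the original question, the same expansions can be applied line by line. The following is just a sketch under the assumption of a POSIX shell: print_last_two is a made-up name, datacontents is the variable name from the question, and a here-document stands in for the printf pipeline.

```shell
# datacontents mirrors the variable name used in the question.
datacontents='bblah1_blah2_blah3_blah4_blah5
bblahA_blahB_blahC_blahD_blahE'

# print_last_two is a hypothetical helper: for each line of its argument it
# prints the second-to-last and last underscore-separated fields.
print_last_two() {
    while IFS= read -r line; do
        last=${line##*_}          # remove the longest prefix ending in "_"
        rest=${line%_*}           # remove the shortest suffix starting with "_"
        second_last=${rest##*_}   # last field of what remains
        printf '%s %s\n' "$second_last" "$last"
    done <<EOF
$1
EOF
}

print_last_two "$datacontents"
```

For the sample data this prints "blah4 blah5" and "blahD blahE". A line with no underscore at all comes back unchanged in both positions, so check for that case if it can occur in your data.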

P.S.: I hadn't seen your last post (post #3) when I originally posted this and didn't know what OS you are using. The above suggestion will work with any shell that conforms to the POSIX standards, including /usr/xpg4/bin/sh, ksh, and bash. It will not work with /bin/sh on Solaris 10 systems, which is a legacy Bourne shell without the ${var##pattern} and ${var%pattern} expansions. If some of your old systems were built before 1993, you may need to use expr to split substrings out of your variables, unless you know how many underscores appear in your variables before you start splitting out fields from them.
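
For those pre-POSIX shells, the expr fallback might look something like the sketch below. expr is an external utility, so this is only for shells that offer nothing better. The helper names are made up, and the leading "_" added in front of the value is my own trick so that a value with only two fields still matches and a value starting with "-" is not mistaken for an option.

```shell
# Hypothetical helpers; expr is an external command, used here only as a
# fallback for shells without the ${var##pattern} expansions.
# Caveat: expr returns a non-zero exit status when the extracted field is
# empty or "0", even though it still prints the field.
last_field_expr() {
    expr "_$1" : '.*_\([^_]*\)$'
}

second_last_field_expr() {
    expr "_$1" : '.*_\([^_]*\)_[^_]*$'
}

var=bblah1_blah2_blah3_blah4_blah5
second_last_field_expr "$var"
last_field_expr "$var"
```

With var set as shown this prints blah4 and then blah5.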


Well, AIX uses ksh as its sh, not a Bourne shell, but you can always trust Solaris to have some dilapidated default junk right out of the box ;) If you have /usr/xpg4/bin/sh available, as Don mentions, that's an option.
