How to reverse cut or read rows of lines

doer · July 18, 2007, 10:13pm

Hi,

I want to extract the value before _97|

This command
BSC_ID=`echo $DATA | cut -f5 -d"_"`
gives me
_97|, 4, 11

and by using the command
echo $DATA | awk -F_ '{print $(NF-1)}'
I get LIMJM1-3, 4, 11.

I want to extract 3, 4, and 11 only.

please help.

vgersh99 · July 18, 2007, 10:51pm

sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/' myFile

ghostdog74 · July 18, 2007, 11:04pm

try this:

echo $DATA | awk -F[_-] '{print $(NF-1)}'

doer · July 18, 2007, 11:55pm

when i use the BSC_ID=`echo $DATA | awk -F[_-] '{print $(NF-1)}`

which is incorrect

doer · July 19, 2007, 12:02am

To be more precise, the number of underscores is not fixed in my file.

That is the reason why I want to read in reverse and get the value before _97|

divz · July 19, 2007, 12:07am

Can you please explain how this works?

vgersh99 · July 19, 2007, 12:16am

Firstly, your command is exactly what was originally suggested by ghostdog74 (which does work for your sample input):

echo $DATA | awk -F[_-] '{print $(NF-1)}'

Secondly, have you tried the 'sed' suggestion yet?

vgersh99 · July 19, 2007, 12:20am

If you take a closer look at the previous suggestions, you'll see that there are no assumptions about the number of underscores/dashes in the file. The only assumption (based on your sample file) is that you want to get the 'next to last' field in the underscoreORdash separated record/line.

Is the above a correct description of the objective?
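
To make the 'next to last field' idea concrete, here is a minimal sketch using two of the sample records quoted later in this thread. The helper name next_to_last is made up for illustration, and the field separator is quoted so the shell cannot expand the brackets.

```shell
# Sample records taken from later in this thread.
line1='BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|'
line2='BSC403_JAIN03|3410_PantaiAcehPCEHM1_4_97|'

# Hypothetical helper: split on '_' or '-' and print the next-to-last field.
next_to_last() {
    printf '%s\n' "$1" | awk -F'[_-]' '{ print $(NF-1) }'
}

next_to_last "$line1"   # prints 3
next_to_last "$line2"   # prints 4
```

line1 splits into 6 fields and line2 into 5, yet $(NF-1) picks the wanted value in both cases, which is why the number of underscores does not matter.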

vgersh99 · July 19, 2007, 12:29am

sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/' myFile

from left to right:

.* - any character repeated 0 or more times - greedy - will consume ALL the characters leading to the LAST run of non-underscore/non-dash chars followed by a dashORunderscore.
[-_] - followed by either a '-' or a '_' char
\([^-_][^-_]*\) - followed by a 'capture' of one character other than '-' or '_', followed by more such characters repeated 0 or more times.
[-_] - followed by either a '-' or a '_' char
.* - any character repeated 0 or more times - greedy
\1 - replace the 'matched' string with the FIRST 'capture'

I know it might be a bit confusing reading the regEx expressions at times, but try to think 'pattern matching'....
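
As a small sketch of the walkthrough above, the same expression can be run against one sample record from this thread. The helper name extract_id is hypothetical and only wraps the sed command for reuse.

```shell
# One sample record from this thread.
line='BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|'

# Hypothetical helper: keep only the capture between the last two separators.
extract_id() {
    printf '%s\n' "$1" | sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/'
}

extract_id "$line"   # prints 3
```

The leading greedy .* swallows everything up to the '-' before the 3, the capture holds the 3, and the trailing [-_].* matches '_97|'.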

doer · July 19, 2007, 12:42am

Yes, your description of the objective is absolutely correct, but how do I use this command

sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/' myFile in my script below?

Myscript

for DATA in `cat $IN_FILE/a.txt`
do
BSC_ID=`echo $DATA | awk -F[_-] '{print $(NF-1)}`
echo $BSC_ID
done

and the output is

which is incorrect.

divz · July 19, 2007, 12:45am

Thanks a lot for the explanation....
This is really good work.

vgersh99 · July 19, 2007, 12:52am

.... the same way you've used the 'awk' suggestion - although you've missed a single-quote from the original suggestion.

Here's the modified 'awk' way with optimized non-UUOC code - what's the purpose of the 'for' loop?:

awk -F[_-] '{print $(NF-1)}' $IN_FILE/a.txt

The same results can be achieved with the similar 'sed' solution - no need for the 'for' loop either:

sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/' $IN_FILE/a.txt
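
A quick way to check that both one-liners agree is sketched below. The temporary file is a hypothetical stand-in for $IN_FILE/a.txt, filled with the three sample records that appear in this thread.

```shell
# Hypothetical stand-in for $IN_FILE/a.txt.
tmp_file=$(mktemp)
cat > "$tmp_file" <<'EOF'
BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|
BSC403_JAIN03|3410_PantaiAcehPCEHM1_4_97|
BSC406_BMIN02|1433_JomHebohTV3_COW7M1_11_97|
EOF

# Both commands read the file directly: no loop and no cat.
awk_out=$(awk -F'[_-]' '{ print $(NF-1) }' "$tmp_file")
sed_out=$(sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/' "$tmp_file")

printf '%s\n' "$awk_out"   # prints 3, 4 and 11 on separate lines
rm -f "$tmp_file"
```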

aajan · July 19, 2007, 2:05am

cat filename | sed 's/-/_/' | awk -F"_" '{print $5}'

vgersh99 · July 19, 2007, 2:08am

cat and sed and awk...... why?

doer · July 19, 2007, 2:16am

Thanks. Your sed suggestion worked, but I still could not get it right with awk. Where did I go wrong?

BSC_ID=`echo $DATA | awk -F[_-] '{print $(NF-1)}'`
Result
BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|
BSC403_JAIN03|3410_PantaiAcehPCEHM1_4_97|
BSC406_BMIN02|1433_JomHebohTV3_COW7M1_11_97|

BSC_ID=`echo $DATA | sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/'`
Result
3
4
11

The reason why I loop is that I execute a lot of other commands in this loop using the extracted value.

vgersh99 · July 19, 2007, 2:22am

dunno, this seems to work just dandy:

echo 'BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|' | awk -F[_-] '{print $(NF-1)}'

don't know - try using 'nawk' instead of 'awk' - see if it works....

Speaking of loops:

nawk -F[_-] '{print $(NF-1)}' $IN_FILE/a.txt | while read myValue
do
   echo "here I do more stuff with the extracted value: [${myValue}]"
done
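
If the loop body needs the whole record as well as the extracted value, one possible variation is to read the file line by line and extract inside the loop. This is only a sketch: in_file is a hypothetical stand-in for $IN_FILE/a.txt and ids is an illustrative accumulator. Feeding the loop by redirection instead of a pipe matters because in many shells a piped 'while' runs in a subshell, so variables set inside it are lost afterwards.

```shell
# Hypothetical stand-in for $IN_FILE/a.txt.
in_file=$(mktemp)
printf '%s\n' \
    'BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|' \
    'BSC403_JAIN03|3410_PantaiAcehPCEHM1_4_97|' \
    'BSC406_BMIN02|1433_JomHebohTV3_COW7M1_11_97|' > "$in_file"

ids=""
# 'read -r' with an empty IFS keeps each line intact, unlike for DATA in `cat file`.
while IFS= read -r DATA
do
    BSC_ID=$(printf '%s\n' "$DATA" | sed 's/.*[-_]\([^-_][^-_]*\)[-_].*/\1/')
    ids="$ids $BSC_ID"    # still set after the loop, since no pipe is involved
    echo "record [$DATA] has id [$BSC_ID]"
done < "$in_file"
rm -f "$in_file"
```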

doer · July 19, 2007, 2:26am

nawk works
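
One plausible explanation, stated here as an assumption rather than something confirmed in this thread: on some systems (Solaris is the usual example) plain 'awk' is an older implementation that does not treat the -F value as a regular expression, while 'nawk' does. A minimal sketch that prefers nawk when it is installed and falls back to awk otherwise (the variable name AWK is illustrative):

```shell
# Prefer nawk when it exists; otherwise fall back to whatever 'awk' is on PATH.
if command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

result=$(echo 'BSC403_JAIN03|3153_TropicalFarm_LIMJM1-3_97|' |
    "$AWK" -F'[_-]' '{ print $(NF-1) }')
echo "$result"   # prints 3 when the chosen awk accepts a regex separator
```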