Hi,
I want to fetch the number that follows a particular word. The number of characters before it can vary. How can I do that?
Below is an example of the data and the expected output.
sample data
03 xxxx occurs 1090 times.
04 aslkja occurs 10 times.
I want to fetch 10 and 1090 separately.
If "occurs" is the word in your sample data, and you want to fetch the longest string of digits after that, then -
$ cat f2
03 xxxx occurs 1090 times.
04 aslkja occurs 10 times.
$ awk '{x=gensub(/.*occurs ([0-9]+) .*/,"\\1",1); print x}' f2
1090
10
tyler_durden
Or is this sufficient for your purpose?
awk '{print $(NF-1)}' file
anbu23:
What is the gensub function?
It's a gawk-specific function:
gensub - The GNU Awk User's Guide
Thanks guys! However, gensub doesn't work, so I replaced it with gsub, and that doesn't work either. The number is not always the second-to-last field, so I can't use the $(NF-1) option. What other ways are there to do this?
gensub is gawk-specific. Did you use gawk?
Post a better example of your input file.
Yes, I tried gawk, but it is not installed. Here is a better example:
askd sslkajdf OCCURS 10 Times.
a;lkjsfdj alkjsfd OCCURS 100 times depending on XYZ.
al;ksfjas OCCURS 10.
Maybe something like this?
sed 's/.*OCCURS \([^ .]*\).*/\1/' file
Yes, it works with sed. I was looking for an awk solution, though, as this is going to be part of a larger awk script.
You could use the match function:
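awk '{
# wl: length of the keyword plus the one space that follows it
wl = length(w) + 1
# match() sets RSTART (start position) and RLENGTH (length) of the match
if (match($0, w " *[0-9]*"))
# skip the keyword and the space, print only the digits
print substr($0, RSTART + wl, RLENGTH - wl)
}' w=OCCURS infile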
With a slight modification, you could also handle multiple occurrences on the same line.
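One possible way to do that, as a rough sketch only: loop over the line, calling match on whatever is left after the previous hit. The function name extract_all and the sample input are made up for illustration. The pattern here assumes exactly one space between the keyword and the number, and at least one digit, which is slightly stricter than the one-liner above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every number that
# follows the keyword given as $1, one per line, even when the keyword
# occurs several times on the same input line.
extract_all() {
  awk -v w="$1" '{
    s = $0
    wl = length(w) + 1                  # keyword plus the single space after it
    while (match(s, w " [0-9]+")) {
      print substr(s, RSTART + wl, RLENGTH - wl)
      s = substr(s, RSTART + RLENGTH)   # keep searching after this match
    }
  }'
}

# Made-up sample input with two occurrences on the first line
printf '%s\n' 'a OCCURS 10 times, b OCCURS 200 times.' 'c OCCURS 3.' | extract_all OCCURS
```

On the made-up input this should print 10, 200 and 3 on separate lines. Because the loop works on a copy of the line, $0 stays intact for the rest of a larger awk script.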
It's working perfectly fine. A dissection of the match statement would be very nice!
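As a small illustration of what match does (a sketch, not part of the original answer): match(string, regex) returns the position where the regex first matches, or 0 if it does not, and as a side effect sets RSTART to that position and RLENGTH to the length of the matched text. The substr call then skips the keyword and the following space. The function name show_match is invented for this demo, and the keyword OCCURS is hard-coded.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): show what match() reports
# for a single input line, using the same pattern as the one-liner.
show_match() {
  printf '%s\n' "$1" | awk -v w=OCCURS '{
    if (match($0, w " *[0-9]*")) {
      wl = length(w) + 1   # keyword plus one space
      printf "RSTART=%d RLENGTH=%d matched=[%s] number=[%s]\n",
             RSTART, RLENGTH,
             substr($0, RSTART, RLENGTH),
             substr($0, RSTART + wl, RLENGTH - wl)
    }
  }'
}

show_match 'al;ksfjas OCCURS 10.'
# prints: RSTART=11 RLENGTH=9 matched=[OCCURS 10] number=[10]
```

Here OCCURS starts at character 11 and the whole match "OCCURS 10" is 9 characters long, so the number starts at 11 + 7 = 18 and is 9 - 7 = 2 characters long.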
grep -o "occurs [0-9]*" urfile |awk '{print $2}'
kurumi
July 15, 2010, 9:00pm
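#!/bin/bash
# Print the number that follows "OCCURS " on each matching line.
while IFS= read -r LINE
do
case "$LINE" in
*"OCCURS "*)
# drop everything up to and including the last "OCCURS "
LINE=${LINE##*OCCURS }
# cut at the first non-digit, e.g. a space or a trailing "."
echo "${LINE%%[!0-9]*}"
;;
esac
done <"file"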