Extracting a string matching a pattern from a line

Hi All,

I am pretty new to pattern matching and extraction in shell scripts. Could anyone please help me extract the word matching a pattern from a line in bash?

Input Sample (can vary between any of the 3 samples below):
1) Adaptec SCSI RAID 5445
2) Adaptec SCSI 5445S RAID
3) ICP ICP5085BL SATA RAID

Expected Output:
1) 5445
2) 5445S
3) ICP5085BL

I tried various options and zeroed in on the sed command below:

echo "$1" | sed 's/.*\([0-9]\{4\}.*\)/\1/'

Although the above script works for the first 2 inputs, it does not work for the third: it gives only 5085BL instead of ICP5085BL. Any suggestions would be deeply appreciated.

try this..

echo "Adaptec SCSI RAID 5445" | awk '{print $4}'

echo "Adaptec SCSI 5445S RAID" | awk '{print $3}'

echo "ICP ICP5085BL SATA RAID" | awk '{print $2}'

What do you want to achieve? Looks like you want to match strings containing 4 digits :confused:

try:

#!/bin/bash
input=$1
echo "${input}"                    # echoes the full input
echo "${input//[^0-9]/}"           # echoes only the digits found within the input

This assumes all digits are together within the input.

edit...
Sorry, I didn't pay attention... this only extracts the numbers.

Hi Tuxidow,

Thanks for your reply. But I need one single script that handles all three inputs, as "$1" can be any of them depending on the available RAID controller.

Hi pseudocoder,

I am trying to extract the complete word that contains 4 digits.

The RAID controller model number can be 5445 or 5445S or ICP5085BL or some other valid Adaptec model.

Try this...

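#!/bin/bash
input=( $* )                       # split all arguments into words
for w in "${input[@]}"; do
    # strip everything from the first digit onward; an empty result
    # means the word starts with a digit
    if [ -z "${w//[0-9]*/}" ]; then
      echo "$w"
    fi
done

[^ ]*[0-9][0-9][0-9][0-9][^ ]*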

should be the proper regex.

$ cat sedinput
1) Adaptec SCSI RAID 5445
2) Adaptec SCSI 5445S RAID
3) ICP ICP5085BL SATA RAID
$ grep -o "[^ ]*[0-9][0-9][0-9][0-9][^ ]*" sedinput
5445
5445S
ICP5085BL
$ 

or

$ echo "1) Adaptec SCSI RAID 5445" | grep -o "[^ ]*[0-9][0-9][0-9][0-9][^ ]*"
5445
$ echo "2) Adaptec SCSI 5445S RAID" | grep -o "[^ ]*[0-9][0-9][0-9][0-9][^ ]*"
5445S
$ echo "3) ICP ICP5085BL SATA RAID" | grep -o "[^ ]*[0-9][0-9][0-9][0-9][^ ]*"
ICP5085BL
$

Let me know if you absolutely want to do it with sed and have trouble building an appropriate sed command from the regex above.
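
For reference, here is one way the same regex could be carried over to sed. This is only a sketch: the function name extract_with_sed is made up for illustration, and it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do). The optional leading group swallows everything up to the space before the matching word, so only the word itself is kept.

```bash
#!/bin/bash
# Hypothetical helper: reads lines on stdin and prints the word that
# contains four consecutive digits. Assumes sed supports -E.
extract_with_sed() {
    # (.* )?  : optional prefix ending in a space (greedy)
    # group 2 : the whole word around the four digits
    # -n + p  : print only lines where a substitution happened
    sed -nE 's/^(.* )?([^ ]*[0-9]{4}[^ ]*).*/\2/p'
}

echo "ICP ICP5085BL SATA RAID" | extract_with_sed
```

Because the prefix is greedy, a line with several matching words yields the last one, whereas grep -o prints all of them.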

You can extract the word with this script:

awk '{for(i=1;i<=NF;i++) {if($i ~/[0-9]/) {print $i}}}'  filename
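
Note that /[0-9]/ matches any field with at least one digit, so on the numbered sample file it would also print fields like "1)". A stricter variant, sketched below, requires four consecutive digits. The function name find_models is just an illustrative wrapper, not part of the original suggestion.

```bash
#!/bin/bash
# Hypothetical wrapper: print every whitespace-separated field that
# contains four consecutive digits. Reads the files given as arguments,
# or stdin when called without arguments.
find_models() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /[0-9][0-9][0-9][0-9]/) print $i }' "$@"
}

printf '%s\n' "1) Adaptec SCSI RAID 5445" \
              "2) Adaptec SCSI 5445S RAID" \
              "3) ICP ICP5085BL SATA RAID" | find_models
```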

@pseudocoder and @gaur84 : Thanks a lot, your solutions work like a charm. :smiley:

@dunkar70 : Thanks for your help too. Your solution does not work for the 3rd input though :frowning:
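
The loop approach fails on the third input because it only accepts words that start with a digit. A possible pure-bash variant is sketched below: it tests each word for four consecutive digits anywhere in it, using bash's [[ =~ ]] operator. The function name extract_model is made up for this example.

```bash
#!/bin/bash
# Hypothetical helper: print each word of its arguments that contains
# four consecutive digits (needs bash for [[ =~ ]] and arrays).
extract_model() {
    local w
    local -a words
    # split the arguments into words without glob expansion
    read -r -a words <<< "$*"
    for w in "${words[@]}"; do
        if [[ $w =~ [0-9]{4} ]]; then
            printf '%s\n' "$w"
        fi
    done
}

extract_model "ICP ICP5085BL SATA RAID"
```

In a script, calling extract_model "$1" would cover all three sample inputs with a single code path.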