Parse XML using xmllint

Hi All,

I need help parsing an XML file in a shell script using xmllint. Below is the sample XML file.

<CARS>
 <AUDI>
  <Speed="45"/>
  <speed="55"/>
  <speed="75"/>
  <speed="95"/>
 </AUDI>
 
 <BMW>
  <Speed="30"/>
  <speed="75"/>
  <speed="120"/>
  <speed="135"/>
 </BMW>
</CARS> 

From the above XML file, I need to get all the speed values for BMW and assign them to a variable that will be used in the shell script. I need to use xmllint to achieve this.

Thanks in advance...

Try

$ awk '/<Speed|<speed/ && gsub(/.*="|"\/>/,x)' file
45
55
75
95
30
75
120
135
$ grep -Po -i '(?<=<Speed=").*(?="/>)' file
45
55
75
95
30
75
120
135

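Your XML document is not well-formed: a tag such as <Speed="45"/> puts ="45" directly after the element name, with no attribute name. You have to fix that before you can run xmllint on it.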
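
For what it's worth, here is a minimal sketch of what the xmllint route could look like once the file is well-formed. The file name cars_fixed.xml and the attribute name "value" are assumptions made up for illustration, since the original post does not show the real data, and the sketch also assumes an xmllint build that supports the --xpath option.

```shell
# Well-formed rewrite of the sample data. The file name and the "value"
# attribute name are assumptions, not taken from the original post.
cat > cars_fixed.xml <<'EOF'
<CARS>
 <AUDI>
  <speed value="45"/>
  <speed value="55"/>
  <speed value="75"/>
  <speed value="95"/>
 </AUDI>
 <BMW>
  <speed value="30"/>
  <speed value="75"/>
  <speed value="120"/>
  <speed value="135"/>
 </BMW>
</CARS>
EOF

# Select the value attribute of every speed element under BMW.
# xmllint prints the matches as value="..." pairs, so grep keeps only the
# digits, one number per line.
speeds=$(xmllint --xpath '//BMW/speed/@value' cars_fixed.xml | grep -Eo '[0-9]+')

printf '%s\n' "$speeds"
```

With the data above, $speeds should hold the four BMW values separated by newlines, so it can be looped over with: for s in $speeds; do ...; done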

Hello,

Here is one more approach that may help.

awk '/[0-9]/ {print $0}' file_name | sed 's/[a-zA-Z]//g;s/[[:punct:]]//g'

Output will be as follows.

  45
  55
  75
  95
  30
  75
  120
  135

Thanks,
R. Singh

I noticed that all the solutions posted so far extract every speed value. But the OP wants the speed values for BMW only, assigned to a shell variable.

Here is an approach using bash that stores each speed value in an array:

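#!/bin/bash

flag=0
S=()
while IFS= read -r line
do
        # Track whether the current line is inside the <BMW> element
        [[ "$line" == *"<BMW>"* ]] && flag=1
        [[ "$line" == *"</BMW>"* ]] && flag=0
        if [ "$flag" -eq 1 ] && [[ "$line" =~ [Ss]peed ]]
        then
                line="${line##*=\"}"      # drop everything up to and including ="
                S+=( "${line%%\"*}" )     # drop the closing quote and the rest
        fi
done < file.xml

# Print one array element per line
for k in "${S[@]}"
do
        echo "$k"
done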

My apologies for the second post; I didn't read the requirement properly. Since the requirement is for a single array, we can also do it like this:

$ cat test.sh
#!/bin/bash

# Extract the speed values between the <BMW> tags and write them to a temporary file
  awk '/<BMW>/,/<\/BMW>/{if(/<Speed|<speed/){gsub(/.*="|"\/>/,x);print}}' file.xml >test.tmp

# Create the array (one element per whitespace-separated word in the file)
  arr=( $( < test.tmp  ) )

# Or, to load only a specific column instead of the entire line, use cut:
# arr=( $(cut -f1 test.tmp) )

# Delete the temporary file created by awk
  rm -f test.tmp

# Print the array elements
  for i in $(seq 0 $((${#arr[@]} - 1)))
  do
    echo "${arr[$i]}"
  done

# Or:
# for i in "${arr[@]}"
# do
#    echo "$i"
# done
$ bash test.sh
30
75
120
135
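
The temporary file is not strictly needed. As a rough sketch, assuming bash 4 or later (for mapfile) and the same sample input as in the first post, the awk output can be read straight into the array through process substitution. The heredoc below only recreates file.xml so the snippet runs on its own.

```shell
#!/bin/bash

# Recreate the sample input from the first post. It is not well-formed XML,
# so it is treated as plain text here.
cat > file.xml <<'EOF'
<CARS>
 <AUDI>
  <Speed="45"/>
  <speed="55"/>
  <speed="75"/>
  <speed="95"/>
 </AUDI>

 <BMW>
  <Speed="30"/>
  <speed="75"/>
  <speed="120"/>
  <speed="135"/>
 </BMW>
</CARS>
EOF

# Read one array element per line of awk output; no temporary file involved.
mapfile -t arr < <(
  awk '/<BMW>/,/<\/BMW>/{if(/<[Ss]peed/){gsub(/.*="|"\/>/,"");print}}' file.xml
)

# Print the array elements, one per line
printf '%s\n' "${arr[@]}"
```

Because mapfile splits on newlines only, this variant would also cope with values that contain spaces, which the arr=( $( ... ) ) form would break into separate elements.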

Hello,

Here is one approach that may help.

awk '/<\/BMW>/ {f=0} f {gsub(/[[:alpha:][:punct:]]/,""); print} /<BMW>/ {f=1}' check_all_BMW_values

Output will be as follows.

  30
  75
  120
  135

Thanks,
R. Singh