Hello,
I have an XML file and a keywords file. My aim is to read each line of the keywords file and search for that string in the XML file.
When a keyword is found in the XML file, I expect the script to go to the previous line in the XML file and grab the value between two strings. It's almost working, with one error.
Tab-separated keywords.txt:
test1 qqq98
test35 sss32
test26 Rsiw
1.xml file:
<id="229954e70d6b702f8d570b4be11af181">
<display-name>test44 lgi3d</display-name>
<id="229954e70d6b702f8d51331cbe11af181">
<display-name>test35 kkld</display-name>
<id="2223230did3s2Qafevrgvve1cbe11af181">
<display-name>test26 Rsiw</display-name>
Expected output:
test1 qqq98 id=""
test35 sss32 id=""
test26 Rsiw id="2223230did3s2Qafevrgvve1cbe11af181"
My script:
while read COL1 COL2 && read -r line <&3; do
A=$(grep -B1 "$COL1.*$COL2" 1.xml | grep -v "display-name" | sed -e 's/<id=\"\(.*\)\">/\1/' )
#A=$(grep -B1 "$COL1.*$COL2" 1.xml | grep -v "display-name" | grep -o -P '(?<=<id=\").*(?=\">)')
echo "$COL1 $COL2 id=\"$A\""
done < keywords.txt 3<1.xml
This gives:
test1 qqq98 id=""
test35 sss32 id=""
test26 Rsiw id=" 2223230did3s2Qafevrgvve1cbe11af181"
I wonder why there are two spaces before the value of $A in the console output.
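One possible cause, sketched here under the assumption that the lines in my real 1.xml are indented with two spaces (the sample above does not show any indentation): sed replaces only the part of the line that the pattern matches, so whitespace in front of <id= would be left in $A. The sketch below anchors the pattern at the start of the line and lets it consume leading whitespace. The function name lookup_ids and the temporary files are made up for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: the XML lines are indented with two spaces.
tmp=$(mktemp -d)
printf 'test1\tqqq98\ntest35\tsss32\ntest26\tRsiw\n' > "$tmp/keywords.txt"
cat > "$tmp/1.xml" <<'EOF'
  <id="229954e70d6b702f8d570b4be11af181">
  <display-name>test44 lgi3d</display-name>
  <id="229954e70d6b702f8d51331cbe11af181">
  <display-name>test35 kkld</display-name>
  <id="2223230did3s2Qafevrgvve1cbe11af181">
  <display-name>test26 Rsiw</display-name>
EOF

# lookup_ids is a hypothetical helper: $1 = keywords file, $2 = XML file.
lookup_ids() {
  while read -r col1 col2; do
    # grep -B1 also prints the line before each match.
    # sed -n ... p prints only lines where the substitution succeeded,
    # so the display-name line is dropped without a second grep.
    # ^[[:space:]]* consumes the indentation instead of leaving it in place.
    id=$(grep -B1 -- "$col1.*$col2" "$2" |
         sed -n 's/^[[:space:]]*<id="\(.*\)">.*$/\1/p')
    printf '%s %s id="%s"\n' "$col1" "$col2" "$id"
  done < "$1"
}

result=$(lookup_ids "$tmp/keywords.txt" "$tmp/1.xml")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With the indented sample data, the third line comes out as id="2223230did3s2Qafevrgvve1cbe11af181" with no leading spaces. If the real file turns out not to be indented, the extra spaces must come from somewhere else and this sketch would not explain them.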
Thank you
Boris