Help with XML tag value extraction based on matching condition

Sample portion of the XML file:

<DocumentMinorVersion>0</DocumentMinorVersion>
  <DocumentVersion>1</DocumentVersion>
  <EffectiveDate>2017-05-30T00:00:00Z</EffectiveDate>
  <FollowOnFrom>
    <ContractRequest _LoadId="export_AJ6iAFoh6g0rE9">
      <_LocalId>CRW2218451</_LocalId>
      <Active>true</Active>
      <ActualTemplateObject _LoadId="export_AJ6iAFoag0rxLm" _Logical="true" class="ariba.collaborate.contracts.ContractRequest" ref="true">
        <Workspace>/Templates/Contract Templates/Contract Request/</Workspace>

I want to get the value between the <_LocalId> and </_LocalId> tags, but only if the <Workspace> tag has a value containing the string "Contract Request".
Remember that the snippet above is only a portion of a big file, and the same tags may repeat with other values elsewhere in the file. I also do not want to search all the way to the end of the file.
I want to stop the search at the very first match in the file and get that one value out.

Expected output:

CRW2218451

Hello paul1234,

Could you please try the following and let me know if it helps?
Solution 1st: Look for the string <_LocalId>, using [><] as the field separator (-F'[><]'), and print the third field, as follows.

awk -F'[><]' '/<_LocalId>/{print $3}'   Input_file

Solution 2nd: Look for the string <_LocalId>, delete everything up to (but not including) the first >, then remove that > together with everything from the next < onward, and print what is left of the line.

awk '/<_LocalId>/{sub(/.[^>]*/,"");gsub(/>|<.*/,"");print}'   Input_file
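To see why Solution 1st prints $3, here is a small sketch that splits one line on [><] and lists every field. The sample line is copied from the question; the variable names line and fields are only for illustration.

```shell
# One line from the sample, including its leading spaces.
line='      <_LocalId>CRW2218451</_LocalId>'

# Split on either > or < and print each field with its index.
fields=$(printf '%s\n' "$line" |
  awk -F'[><]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

Field 1 is the leading whitespace, field 2 is the tag name, and field 3 is the value, which is why the one-liner prints $3.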

Thanks,
R. Singh

Hi Ravinder,
I had tried this already. The _LocalId tag repeats in the file, so I want to get the value of the _LocalId tag only if the Workspace value contains the string "Contract Request".

Hello paul1234,

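Apologies, I missed that. Please put only code inside the code tags and keep any explanation outside of them, as I am doing here. Could you please try the following and let me know if it helps? Both solutions remember the most recent <_LocalId> value, print it when the first <Workspace> line containing "Contract Request" is found, and then exit without reading the rest of the file.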
Solution 1st:

awk '/<_LocalId>/{sub(/.[^>]*/,"");gsub(/>|<.*/,"");val=$0;next} /<Workspace>.*Contract Request/{print val;exit}'  Input_file

Solution 2nd:

awk -F'[><]' '/<_LocalId>/{val=$3;next} /<Workspace>.*Contract Request/{print val;exit}'   Input_file
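A quick way to check the "first match only" behaviour is to run Solution 2nd against a small test file. This is only a sketch: the entries CRW0000001 and CRW9999999 and the /Templates/Other/ workspace are made-up test data, not from the real file, and the XML is trimmed to the tags that matter.

```shell
# Build a throwaway sample file (the first and third entries are hypothetical).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
<FollowOnFrom>
  <ContractRequest>
    <_LocalId>CRW0000001</_LocalId>
    <Workspace>/Templates/Other/</Workspace>
  </ContractRequest>
  <ContractRequest>
    <_LocalId>CRW2218451</_LocalId>
    <Workspace>/Templates/Contract Templates/Contract Request/</Workspace>
  </ContractRequest>
  <ContractRequest>
    <_LocalId>CRW9999999</_LocalId>
    <Workspace>/Templates/Contract Templates/Contract Request/</Workspace>
  </ContractRequest>
</FollowOnFrom>
EOF

# Remember the latest _LocalId; print it at the first matching Workspace and stop.
result=$(awk -F'[><]' '
  /<_LocalId>/ { val = $3; next }
  /<Workspace>.*Contract Request/ { print val; exit }
' "$tmpfile")

echo "$result"
rm -f "$tmpfile"
```

This prints CRW2218451: the first entry is skipped because its Workspace does not match, and the third is never read because of the exit. Note that this relies on <_LocalId> appearing before <Workspace> within each record, as it does in the posted sample.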

Thanks,
R. Singh

Hi Ravinder, this was very useful. Thank you for your help :)