Show distinct values of a key from a single line

Hi All,

I am a newbie to Linux. Can somebody help me with the following requirement?

I have one huge line. I need to find a particular key/value pair in it and list the distinct values of that key.

Portion of the String:-
<?xml version="1.1" encoding="UTF-8"?> <Data><Val Ti="1342750845538" Du="0" De="blackberry8520_ver1RIM" Db="encyclopedia" Pdb="" Uq="0" Dq="0" qry="http://google.com/sdsds?q=dsds&dsdsds=dsds&ss?" ab="dsds" Dc="4" Te=" Ca="xxx" Sc="320.240" Us="" Cd="X"</Val><Val Ti="1342750845538" Du="0" De="blackberry8520_ver1RIM" Db="encyclopedia" Pdb="" Uq="0" Dq="0" qry="http://google.com/sdsds?q=dsds&dsdsds=dsds&ss?" ab="dsds" Dc="4" Te=" Ca="xxx" Sc="320.240" Us="" Cd="X"</Val> ..../>

Need to search :-

qry="ALL_THE_DISTINCT_VALUES"

So I need to find the values of the qry parameter. Each value is going to be a URL.

Looking for your help.
Thanks in advance,
KM

This will print every qry value on its own line (duplicates included):

awk '/^<\?xml/ {while(match($0,/qry="[^"]*"/)){print substr($0,RSTART+5,RLENGTH-6);sub(/qry="[^"]*"/,"")}}' inputfile
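
The command above prints every match, so a value that occurs several times is printed several times. A minimal sketch of one way to keep only the first occurrence of each value, assuming any POSIX awk; the sample line, the example URLs and the seen array name are made up for illustration and are not from the real data:

```shell
# Hypothetical sample: one long line in which one qry value is repeated.
sample='<?xml version="1.1"?> <Data><Val qry="http://a.example/x?q=1" Cd="X"</Val><Val qry="http://b.example/y" Cd="Y"</Val><Val qry="http://a.example/x?q=1" Cd="X"</Val>'

# Same match/substr loop as the original command; seen[] suppresses repeats.
distinct=$(printf '%s\n' "$sample" | awk '/^<\?xml/ {
    while (match($0, /qry="[^"]*"/)) {
        val = substr($0, RSTART + 5, RLENGTH - 6)   # value without the label and quotes
        if (!seen[val]++) print val                 # print a value only the first time
        sub(/qry="[^"]*"/, "")                      # remove this match, then look again
    }
}')

printf '%s\n' "$distinct"
```

Piping the original command through sort -u should also give distinct values, but in sorted order rather than in order of first appearance.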

See if this helps

tr "\"" "\n" < input | awk '/qry/ { getline url; printf "qry=\""; print url"\"" }'

Regards
Peasant.

Hi elixir_sinari,

Thanks a lot for your code. It is working as expected.
Could you spend a little more time explaining your code, so that I can adapt it to my new requirements? What I did not understand in your code is:
i) The regex (/Rp="[^"]"): does [^"] mean "all"?
ii) RLENGTH-6: why 6?

Basically, I also need to pick up the corresponding Cd parameter along with each qry value.

Thanks in advance.

KM

1) The regular expression qry="[^"]*" matches the literal qry=" followed by any number of characters other than " , followed by a literal " . The bracket expression is needed because an expression such as .* would otherwise match as many characters as possible, including " .

2) Since you asked only about RLENGTH-6 , I assume that you already know the match function and the variables it sets (RSTART and RLENGTH). The number depends on what you have as the "label" (qry in this case). Because I advance the start position in substr by 5 (from RSTART, to skip the length of qry=" ), the length given to substr has to be reduced by 5 + 1 (the extra 1 is for the closing " ).
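
Both points can be tried on a short line. The following is only a sketch under assumptions: the sample line and its URLs are made up, the variable names (greedy, bounded, rest, url, cd) are hypothetical, and it assumes that each Cd attribute comes after its qry inside the same Val element, as in the portion posted above. It first compares .* with [^"]* and then reuses the same offset arithmetic for Cd, whose prefix Cd=" is 4 characters long, so the numbers become 4 and 5.

```shell
# Hypothetical sample line (made up for illustration, much shorter than the real data).
sample='<Data><Val Ti="1" qry="http://a.example/x?q=1&r=2" ab="dsds" Cd="X"</Val><Val Ti="2" qry="http://b.example/y" ab="dsds" Cd="Y"</Val>'

# Point 1: .* runs on to the last quote of the line, [^"]* stops at the first one.
greedy=$(printf '%s\n' "$sample" | awk '{ if (match($0, /qry=".*"/)) print substr($0, RSTART, RLENGTH) }')
bounded=$(printf '%s\n' "$sample" | awk '{ if (match($0, /qry="[^"]*"/)) print substr($0, RSTART, RLENGTH) }')
printf 'greedy : %s\nbounded: %s\n' "$greedy" "$bounded"

# Point 2: the same arithmetic for two labels, pairing each qry with the next Cd.
pairs=$(printf '%s\n' "$sample" | awk '{
    rest = $0
    while (match(rest, /qry="[^"]*"/)) {
        # the qry prefix is 5 characters: skip 5, shorten by 5 + 1 for the closing quote
        url  = substr(rest, RSTART + 5, RLENGTH - 6)
        rest = substr(rest, RSTART + RLENGTH)      # keep only the text after this qry
        cd = ""
        if (match(rest, /Cd="[^"]*"/)) {
            # the Cd prefix is 4 characters: skip 4, shorten by 4 + 1
            cd = substr(rest, RSTART + 4, RLENGTH - 5)
        }
        print url, cd
    }
}')
printf '%s\n' "$pairs"
```

If a Val element had no Cd attribute, this sketch would pick up the Cd of a later element, so it would need a stricter split (for example on </Val>) before being used on real data.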

Thank you for the explanation. Got it.