sed / awk to get specific word in line

I have an HTTP log, and I want to extract the text that follows a specific "tag". This is a sample line from the log:

98,POST,200 OK,www.facebook.com,Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1,/ajax/updatestatus.php?__a=1,datr=P_H1TgjTczCHxiGwdIF5tvpC; lu=Si1fMkcrU2SInpY8tk_7tAnw; c_user=728445064; xs=61%3Ab9ee26a8f2fc53efb960a6bd6c1c0042%3A0%3A1328685197; presence=EDvFA22A2EtimeF1328685268EuserFA2728445064A2EstateFDutF1328685268207EvisF1EvctF0H0EblcF0EsndF1ODiFA21014445485A2C_5dEfFA21014445485A2EuctF1328685227EsF0CEchFDp_5f728445064F15CC; p=3; act=1328685304452%2F17%3A2; _e_0Hb1_9=%5B%220Hb1%22%2C1328685304455%2C%22act%22%2C1328685304452%2C17%2C%22http%3A%2F%2Fwww.facebook.com%2Fajax%2Fupdatestatus.php%22%2C%22f%22%2C%22submit%22%2C%22wall%22%2C%22r%22%2C%22%2Fmeemelati%22%2C%7B%22ft%22%3A%7B%7D%2C%22gt%22%3A%7B%22profile_owner%22%3A%22731612557%22%2C%22ref%22%3A%22mf%22%7D%7D%2C0%2C0%2C0%2C0%2C16%5D; x-src=%2Fajax%2Fupdatestatus.php%7Cprofile_stream_composer,548,application/x-www-form-urlencoded; charset=UTF-8,0,application/x-javascript; charset=utf-8,gzip,chunked,post_form_id=a012a7a073bc1d990a6c449643ea4570&fb_dtsg=AQDnwj7O&xhpc_composerid=ux0ih_13&xhpc_targetid=731612557&xhpc_context=profile&xhpc_fbx=1&xhpc_timeline=&xhpc_ismeta=1&xhpc_message_text=Hello%20Londoners&xhpc_message=Hello%20Londoners&composertags_place=102173726491792&composertags_place_name=&composer_predicted_city=102173726491792&composer_session_id=1328685296&is_explicit_place=&composertags_city=102173726491792&disable_location_sharing=false&nctr[_mod]=pagelet_wall&lsd&post_form_id_source=AsyncRequest&__user=728445064&phstamp=16581681101191065579516,<EOH>

For example, when awk finds the tag "xhpc_message_text=",
it should output "Hello Londoners" (it should decode the URL encoding too, such as the "%20" in this sample).
The value ends at the next "&" character, i.e. just before "&xhpc_message".

Thanks for any suggestions on solving this problem.

With awk you can use the 'index' function to get the position of the tag.
Adding the tag's length to that position gives the start of the value, and 'substr' can cut from there to the end of the line.
Another 'index' call on that remainder finds the position of the next '&'.
Now you have the start and end positions of the result you expect.
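
The steps above could be sketched roughly as follows. This is only an illustration, not a tested solution for the full log: the function name extract_tag is made up for this example, and the input is a shortened stand-in for the real log line. It prints the value still URL-encoded.

```shell
# Hypothetical helper: print the raw (still URL-encoded) value that follows
# a tag, up to the next "&" or the end of the line. Reads lines from stdin.
extract_tag() {
    awk -v tag="$1" '
    {
        start = index($0, tag)                   # 1-based position of the tag, 0 if absent
        if (start == 0) next                     # skip lines without the tag
        rest = substr($0, start + length(tag))   # everything after the tag
        stop = index(rest, "&")                  # position of the next "&" in the remainder
        if (stop > 0) rest = substr(rest, 1, stop - 1)
        print rest
    }'
}

# Shortened sample input, not the full log line from the question.
printf '%s\n' 'a=1&xhpc_message_text=Hello%20Londoners&xhpc_message=x' |
    extract_tag 'xhpc_message_text='
```

With the shortened sample this prints Hello%20Londoners; decoding the %20 is a separate step.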

ok, I'll try that

Hi,

Try this one,

awk -F"&xhpc_message_text=" 'NF>1{l=substr($2,1,match($2,"&")-1);gsub(/%20/," ",l);print l;}' file

Cheers,
Ranga:)

You can retrieve the value using awk, but converting the URL encoding to ASCII is the real task here.

 
$ nawk -F\& '{for(i=1;i<=NF;i++)if($i~/xhpc_message_text/){split($i,a,"=");print a[2]}}' test.txt
Hello%20Londoners

In Perl you can do it easily. Let me know if you are interested in a Perl solution.

itkamaraj, thank you, I like your solution...

$ perl -F\& -lane 'foreach(@F){if($_=~m/xhpc_message_text/){($a,$b)=split("=",$_);$b=~tr/+/ /;$b=~s/%([a-fA-F0-9]{2,2})/chr(hex($1))/eg;$b=~s/<!--(.|\n)*-->//g;print $b}}' input.txt
Hello Londoners
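
If Perl is not available, the decoding step can also be approximated in plain awk. The sketch below is an assumption-laden alternative, not part of the solutions above: urldecode is a hypothetical name, the lookup table is built with sprintf("%c", i), and LC_ALL=C is set so that codes above %7F are emitted as single bytes. Like the Perl version, it treats "+" as a space.

```shell
# Hypothetical helper: decode %XX escapes (and "+" as space) on each stdin line.
urldecode() {
    LC_ALL=C awk '
    BEGIN {
        # Map every two-digit hex code to its character: "20" -> " ", "41" -> "A".
        # %00 is deliberately left unmapped and decodes to nothing.
        for (i = 1; i < 256; i++)
            chr[sprintf("%02X", i)] = sprintf("%c", i)
    }
    {
        gsub(/[+]/, " ")                  # form encoding uses "+" for a space
        out = ""
        # Consume the line left to right, one %XX escape at a time.
        while (match($0, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            code = toupper(substr($0, RSTART + 1, 2))
            out = out substr($0, 1, RSTART - 1) chr[code]
            $0 = substr($0, RSTART + 3)
        }
        print out $0                      # append whatever is left after the last escape
    }'
}

printf '%s\n' 'Hello%20Londoners' | urldecode
```

This prints Hello Londoners. It could be chained after the nawk command above, for example by piping that command's output into urldecode.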