Extract Specific Information from a particular field

rramkrishnas · June 1, 2015, 6:51am

Hi,

I am trying to extract a specific piece of information from a file that contains more than 200 million records. I have attached the input file for your reference.

My file contains records like the one below:

HGSDC:MSISDN=917807000032,SUD=CAT-10&DBSG-1&BS3G-1&TS11-1&TS21-1&TS22-1&RSA-1&DCSIST-0&GPRSCSIST-0&MCSIST-0&OCSIST-1&OSMCSIST-1&TCSIST-1&VTCSIST-0&CSP-6&NAM-0&TSMO-0&SCHAR-4&PWD-0000&OFA-0&OICK-50&HOLD-1&MPTY-1&CLIP-1&CFU-1&CFB-1&CFNRY-1&CFNRC-1&BAOC-1&CAW-1&SOCFU-1&SOCFB-1&SOCFRY-1&SOCFRC-1&SODCF-0&SOSDCF-4&SOCB-0&SOCLIP-0&SOCLIR-0&SOCOLP-0;

I need to extract only the CSP information (CSP-6 in the record above).

Currently I extract each CSP value individually with grep and then append the results to a single file, which takes a long time and requires a lot of manual work.

I would appreciate your help with a simple script that avoids most of the manual work.

Thanks
Ram

sea · June 1, 2015, 7:01am

Hi

You could try:

while IFS='&' read -r line a b c d e f g h i j k l m n rest
do	printf '%s\n' "$n"	# CSP is the 15th &-separated field in the sample record
done<input.txt

Hope this helps

rramkrishnas · June 1, 2015, 7:09am

Hi Sea,

Thanks for your quick reply. However, the position of CSP is not fixed; it varies depending on the other information in the record. I need something that finds CSP wherever it is and then extracts the CSP information up to the next "&".

sea · June 1, 2015, 7:26am

untested:

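while IFS= read -r line
do	# bash: split the record on & into an array
	IFS='&' read -r -a ARGS <<< "$line"
	for a in "${ARGS[@]}";do
		case $a in
			# print the CSP field without the trailing ; of the record
			CSP-*) printf '%s\n' "${a%;}"; break;;
		esac
	done
done<input.txt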

If that does not work either, please post your grep code.

MadeInGermany · June 1, 2015, 8:00am

For each line that has one, print the CSP field that follows a &, up to the end of the field (if a line had several, the greedy .* would pick the last one):

sed -n 's/.*&\(CSP[^&]*\).*/\1/p' 1x.txt
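
To check the idea against the record from the first post, here is a minimal sketch. The function names extract_csp and extract_csp_awk are made up for illustration, and the pattern assumes the field always has the form CSP-<value>; it also excludes ; so the record terminator is not printed if CSP happens to be the last field. The awk variant is an alternative that does the same thing by splitting each line on & and ;, not something suggested in the thread.

```shell
#!/bin/sh
# Record copied from the first post.
record='HGSDC:MSISDN=917807000032,SUD=CAT-10&DBSG-1&BS3G-1&TS11-1&TS21-1&TS22-1&RSA-1&DCSIST-0&GPRSCSIST-0&MCSIST-0&OCSIST-1&OSMCSIST-1&TCSIST-1&VTCSIST-0&CSP-6&NAM-0&TSMO-0&SCHAR-4&PWD-0000&OFA-0&OICK-50&HOLD-1&MPTY-1&CLIP-1&CFU-1&CFB-1&CFNRY-1&CFNRC-1&BAOC-1&CAW-1&SOCFU-1&SOCFB-1&SOCFRY-1&SOCFRC-1&SODCF-0&SOSDCF-4&SOCB-0&SOCLIP-0&SOCLIR-0&SOCOLP-0;'

# Hypothetical helper: reads records on stdin and prints the CSP field
# of every line that contains one (lines without &CSP- print nothing).
extract_csp() {
	sed -n 's/.*&\(CSP-[^&;]*\).*/\1/p'
}

# Hypothetical alternative: split each line on & or ; and print the
# first field that starts with CSP-.
extract_csp_awk() {
	awk -F'[&;]' '{ for (i = 2; i <= NF; i++) if ($i ~ /^CSP-/) { print $i; break } }'
}

printf '%s\n' "$record" | extract_csp        # prints CSP-6
printf '%s\n' "$record" | extract_csp_awk    # prints CSP-6
```

On the real file either function can read it directly, for example extract_csp < input.txt > csp_values.txt (csp_values.txt is just a placeholder name). Both make a single pass over the file, so there is no need to grep for each CSP value separately.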