Nawk command not working for Question mark (?)

kirans.229 · November 23, 2016, 9:59am

Hi Folks,

I am facing an issue with the nawk command.

The data is as below:

ABC0022,BASC,Scene Package,INR,02May17,XXX4266,be?. Hotel,3,AW01,Twin Room,61272,41308,39590,39590,X,X

ABC0022,BASC,Scene Package,INR,02May17,XXX4266,be?. Hotel,3,AW02,Twin Room with Balcony,9272,85638,4520,9590,X,X

If the first 8 columns match in two records, the two records should be merged into one, with the remaining columns of the second record appended to the first, as shown below:

ABC0022,BASC,Scene Package,INR,02May17,XXX4266,be?. Hotel,3,AW01,Twin Room,61272,41308,39590,39590,X,X,AW02,Twin Room with Balcony,9272,85638,4520,9590,X,X

I am using nawk to achieve this.

nawk -F, '
RES ~ "^" $1","$2","$3","$4","$5","$6","$7","$8 FS {
        X = $1","$2","$3","$4","$5","$6","$7","$8
        sub(X FS, "")
        RES = RES FS $0
        next
}
{
        if (NR > 1)
                print RES
        RES = $0
}
END     { print RES }
' holiday_file.csv > holiday_file_merge.csv

---

Although the first 8 columns match, the question mark (?) in the 7th column ("be?. Hotel") prevents the merge, so I do not get the expected output on a single line.

Instead, I get 2 separate rows.

When there is no question mark (?) in the data, nawk gives the proper output.

Can you please help me understand why a question mark (?) causes this issue and how to work around it?

RudiC · November 23, 2016, 10:13am

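The question mark is a special character in regular expressions (see man regex): in an extended regular expression, which is what awk uses, it makes the preceding item optional, so "e?" matches zero or one "e". Because your script builds its pattern from the data, "be?. Hotel" is read as a regex, and that regex does not match the literal text "be?. Hotel".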
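To see the effect in isolation, here is a small sketch (not taken from the original script) that tests the 7th field against itself, once as a dynamic regex and once as a literal string. It uses awk; nawk is assumed to behave the same way. The variable name demo_out is only for illustration.

```shell
# Compare regex matching with literal matching for the problematic field.
demo_out=$(awk 'BEGIN {
  s = "be?. Hotel"
  print (s ~ "^" s)            # 0: "e?" is an optional "e", "." is any character
  print ("be. Hotel" ~ "^" s)  # 1: a different string matches the pattern instead
  print (index(s, s) == 1)     # 1: index() compares the characters literally
}')
printf '%s\n' "$demo_out"
```

The first line printed is 0, which is why the first rule of the script never fires for these records.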

You can't use it unescaped in a pattern when you want it matched literally; you need to take extra measures, e.g. escape it or replace it.
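Another option, sketched below, is to avoid the regex altogether and compare the first 8 fields as plain strings, so that no character in the data needs escaping. This is one possible rewrite under assumptions, not the only fix: merge_rows is a hypothetical wrapper name, the input is assumed to have matching records on adjacent lines with at least 9 fields, and awk is used where nawk should work as well.

```shell
# merge_rows (hypothetical name): reads CSV on stdin, writes merged rows to stdout.
merge_rows() {
  awk -F, '
    {
      # Build the key from the first 8 fields, including a trailing comma.
      key = $1
      for (i = 2; i <= 8; i++) key = key FS $i
      key = key FS
    }
    # Plain string comparison: "?" and "." are ordinary characters here.
    NR > 1 && key == prev {
      # Append everything after the key to the pending record.
      res = res FS substr($0, length(key) + 1)
      next
    }
    {
      # New key: flush the pending record and start a new one.
      if (NR > 1) print res
      res = $0
      prev = key
    }
    END { if (NR > 0) print res }
  '
}
```

It could then be run as: merge_rows < holiday_file.csv > holiday_file_merge.csv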