Could you please help me remove newline characters that appear between double quotes, replacing each one with a space?
For example ...
Every field is wrapped in double quotes, with a comma as the delimiter, so I need to traverse from the first double-quote occurrence to the second, and if any newline character is present, replace it with a space; similarly from double-quote occurrence 3 to 4, etc.
Input :
"ABCD RENT-A-
CAR XYZ LTD","00N0H","Enterprise Lake
View Way"
Output would be like this ...
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake View Way"
Like zaxxon, I also like this approach; clever use of FS and NF.
However, it does have a bug. If the value of $NF is the number zero, !$NF will be true (since $NF is evaluated numerically, instead of as a string), which would be incorrect. The solution would be to use length($NF) or concatenate a null string to force conversion to a string type, $NF"".
Example:
$ cat data
"ABCD RENT-A-
CAR XYZ LTD","00N0H","Enterprise Lake","0
View Way"
$ # Incorrect
$ awk -F"\"" '!$NF{print;next}{printf("%s ", $0)}' data
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake","0
View Way"
$ # Correct
$ awk -F"\"" '!($NF""){print;next}{printf("%s ", $0)}' data
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake","0 View Way"
$ # Correct
$ awk -F"\"" '!length($NF){print;next}{printf("%s ", $0)}' data
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake","0 View Way"
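To see the coercion in isolation, here is a minimal sketch that feeds a single field through all three negated tests. The function name negated_tests is made up for this illustration; it assumes a POSIX awk, where an input field that looks like a number is compared numerically.

```shell
# Hypothetical helper: shows how awk evaluates each negated test on field 1.
negated_tests() {
    awk '{
        num = !$1          # a numeric-looking field is tested as a number
        str = !($1 "")     # concatenation forces a string test
        len = !length($1)  # length is 0 only for an empty field
        print num, str, len
    }'
}

printf '0\n' | negated_tests   # prints: 1 0 0
```

Only the first test treats the field "0" as empty, which is why the plain !$NF version prints the line too early.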
A golfed version of franklin52's approach:
$ awk -F'"' '$NF""{printf("%s ", $0);next}1' data
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake","0 View Way"
Even so, I'm not sure this approach meets the original poster's needs. If a line with an odd number of quotes ends on a quote, it will not have the trailing newline replaced with a space.
I believe the goal is to not naively join all lines in the data, but only those lines which span quoted text. Otherwise, a simple paste command would do the job.
paste -sd' ' data
Regards,
Alister
---------- Post updated at 03:42 PM ---------- Previous update was at 03:26 PM ----------
Here's a solution that only replaces a newline with a space when that newline occurs between an opening quote character and its corresponding close quote.
sed -n 'H;g;/^[^"]*"[^"]*\("[^"]*"[^"]*\)*$/d; s/^\n//; y/\n/ /; p; s/.*//; h' data
So long as the total number of quote characters encountered is odd, a line is appended to its predecessor. When finally an even number of quote characters have been seen, the resulting concatenation of lines is printed, with all embedded newlines converted to spaces.
Trial run with sample data:
$ cat data
"leave me alone"
"ABCD RENT-A-
CAR XYZ LTD","00N0H","Enterprise Lake","
100 View Way"
$ sed -n 'H;g;/^[^"]*"[^"]*\("[^"]*"[^"]*\)*$/d; s/^\n//; y/\n/ /; p; s/.*//; h' data
"leave me alone"
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake"," 100 View Way"
The AWK solutions would mishandle the continuation of "100 View Way":
$ awk -F'"' '$NF""{printf("%s ", $0);next}1' data
"leave me alone"
"ABCD RENT-A- CAR XYZ LTD","00N0H","Enterprise Lake","
100 View Way"
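For comparison, the quote-counting idea behind the sed command can also be sketched in AWK. This is not one of the solutions posted above, just an illustration of the same parity test; the function name join_quoted and the END rule that flushes an unterminated record are my own assumptions.

```shell
# Hypothetical helper: joins lines with a space while the number of
# double quotes accumulated so far is odd.
join_quoted() {
    awk '
    {
        # Append to the pending record if one is open, else start afresh.
        buf = (pending ? buf " " $0 : $0)
        # gsub on a scratch copy returns how many quotes the record holds.
        tmp = buf
        pending = gsub(/"/, "", tmp) % 2
        if (!pending) print buf
    }
    # Assumption: print whatever is left if the input ends inside a quote.
    END { if (pending) print buf }
    ' "$@"
}
```

Called as join_quoted data on the sample file above, it prints the same two lines as the sed command, including the joined " 100 View Way" field.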