Hi All,
I have a file with a single row having the following text
ABC.ABC.ABC,Database,New123,DBNAME,F,ABC.ABC.ABC_APP,"@FUNCTION1("ENT1") ,@FUNCTION2("ENT2")",R,
I want an output in the following format
ABC.ABC.ABC DBNAME ABC.ABC.ABC_APP '@FUNCTION1("ENT1") ,@FUNCTION2("ENT2")' R
Note: the commas should be replaced with spaces, except for the ones between the double quotes. Also, the outer double quotes have to be replaced with single quotes.
Lastly, the length of the 'functions' (the statements between the double quotes) is not fixed and can vary.
Looking forward to your expert advice. Thanks in advance.
Regards,
Devender
Assuming that your input sample is representative:
awk -F, -v sq="'" '{match($0, /".*"/); print $1, $4, $6, sq substr($0, RSTART+1, RLENGTH-2) sq, $(NF-1)}' file
seems to do what you want.
If you are running this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
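To try the one-liner without creating a file first, here is a small sketch that pipes the sample line from the question into the same awk command. The variable names line and result are only placeholders for this illustration, and it assumes a POSIX-style shell and awk.

```shell
# The sample line from the question, kept in a shell variable
# (single quotes keep the embedded double quotes intact).
line='ABC.ABC.ABC,Database,New123,DBNAME,F,ABC.ABC.ABC_APP,"@FUNCTION1("ENT1") ,@FUNCTION2("ENT2")",R,'

# Feed the line to the same awk command and capture what it prints.
result=$(printf '%s\n' "$line" |
    awk -F, -v sq="'" '{match($0, /".*"/); print $1, $4, $6, sq substr($0, RSTART+1, RLENGTH-2) sq, $(NF-1)}')

# Show the converted line.
printf '%s\n' "$result"
# prints: ABC.ABC.ABC DBNAME ABC.ABC.ABC_APP '@FUNCTION1("ENT1") ,@FUNCTION2("ENT2")' R
```

This only confirms the behaviour for lines shaped like the sample; a line without a double-quoted part would need separate handling.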
Thanks Don, seems to work.
Could you please also explain how it works?
Regards,
Dev
Hi dev.devil.1983,
Here is a copy of the awk script with comments...
# Invoke the awk utility with the input field separator set to a comma (-F,)
# and a variable named sq that is set to a string containing a single-quote
# character (-v sq="'").
awk -F, -v sq="'" '
{   # For each line read from the input file...
    # Search for the longest string in the current input line that starts
    # and ends with a double-quote character. If a match is found, set
    # RSTART to the index of the 1st double-quote character and set RLENGTH
    # to the number of characters matched:
    match($0, /".*"/)
    # Print the:
    #   1st field from the input line ($1),
    #   the output field separator (each comma in the print statement
    #       stands for OFS; the default OFS is a <space> character and
    #       the default is not overridden in this script),
    #   4th field from the input line ($4),
    #   the output field separator (,),
    #   6th field from the input line ($6),
    #   the output field separator (,),
    #   a single-quote character (sq),
    #   the substring of the input line starting after the 1st
    #       double-quote character up to, but not including, the last
    #       double-quote character (substr(...)),
    #   a single-quote character (sq),
    #   the output field separator (,),
    #   the next to the last field from the input line ($(NF-1)),
    #   and the output record separator. (Note that the default
    #       ORS is a <newline> character and the default is not overridden
    #       in this script.)
    print $1, $4, $6, sq substr($0, RSTART + 1, RLENGTH - 2) sq, $(NF - 1)
}' file # Terminate the awk script and name the input file(s) to be processed.
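If it helps to see the intermediate values, the following sketch prints what match() and the comma field splitting produce for the sample line. The variable names line, positions, inner and fields are placeholders made up for this illustration; the awk calls themselves are the same ones used in the script above.

```shell
# The sample line from the question.
line='ABC.ABC.ABC,Database,New123,DBNAME,F,ABC.ABC.ABC_APP,"@FUNCTION1("ENT1") ,@FUNCTION2("ENT2")",R,'

# RSTART is the position of the 1st double quote and RLENGTH is the length
# of the match, counting both the 1st and the last double quote.
positions=$(printf '%s\n' "$line" | awk '{ match($0, /".*"/); print RSTART, RLENGTH }')
printf 'RSTART RLENGTH: %s\n' "$positions"

# Skipping 1 character at the start and dropping 2 from the length removes
# the outer double quotes and keeps everything in between.
inner=$(printf '%s\n' "$line" | awk '{ match($0, /".*"/); print substr($0, RSTART + 1, RLENGTH - 2) }')
printf 'inner text: %s\n' "$inner"

# The trailing comma creates an empty last field, which is why the script
# prints $(NF-1) rather than $NF to get the R.
fields=$(printf '%s\n' "$line" | awk -F, '{ print NF, "[" $(NF-1) "]", "[" $NF "]" }')
printf 'NF, next-to-last, last: %s\n' "$fields"
```

Note that the comma inside the quoted text also splits fields, so $7 and $8 each hold only part of the quoted string; that is why the script takes that part from $0 with substr() instead of from the fields.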
Does this help?