awk stuff

Hi,

My input file contains:

|ABCD|EFGH|IJKL|MNOP
|ABCD|EF\|GH|IJKL|MNOP

The output I expect is:

|"ABCD"|"EFGH"|"IJKL"|"MNOP"
|"ABCD"|"EF|GH"|"IJKL"|"MNOP"

Note: the file is pipe-delimited, and some column values contain a literal |. Whenever | is part of a value rather than a delimiter, it is always escaped with a backslash (\|).

Please share your ideas. I am not only looking for awk; other approaches are welcome too.

Thanks in advance for your input.

Hi Nandy,

Please always use code tags for commands, as per forum rules.
The following may help.

EDIT: I edited this post because the previous version did not give the requested output. Thank you, Scrutinizer, for pointing that out.
This is not the best solution, but it produces the requested output for the input provided.

awk  '{gsub(/\\\|/,"#",$0);gsub(/\|/,"\"|\"");gsub(/^\"\|/,"|",$0);gsub(/$/,"\"",$0);gsub("#","|",$0); print $0}'   filename
OR
awk  '{gsub(/\\\|/,"#",$0);gsub(/\|/,"\"|\"");gsub(/^\"\|/,"|",$0);gsub(/$/,"\"",$0);gsub("#","|",$0)} 1'  filename

Output will be as follows.

|"ABCD"|"EFGH"|"IJKL"|"MNOP"
|"ABCD"|"EF|GH"|"IJKL"|"MNOP"
 

Thanks,
R. Singh
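
A possible variant of the gsub approach above, sketched under the same assumption that every line starts with a |. The # placeholder would also turn any # already present in the data into |, so this version hides the escaped pipes behind awk's SUBSEP (the control character \034 by default), which is unlikely to occur in text. The function name quote_fields is only for illustration.

```shell
# Hypothetical helper: quote every field of a |-delimited stream in which
# literal pipes are written as \| and every line starts with a |.
quote_fields() {
  awk '{
    gsub(/\\\|/, SUBSEP)     # hide escaped pipes behind a placeholder
    gsub(/\|/, "\"|\"")      # put quotes around every real delimiter
    sub(/^"/, "")            # drop the stray quote before the leading |
    $0 = $0 "\""             # close the last field
    gsub(SUBSEP, "|")        # restore the hidden pipes, now unescaped
    print
  }' "$@"
}

# Small demo; the second line contains a # that must survive unchanged.
printf '%s\n' '|ABCD|EF\|GH|IJKL|MNOP' '|A#B|C' | quote_fields
```

Called with a file name (quote_fields file) it reads the file; without arguments it reads standard input.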

You could try something like:

sed '	s/\\|/@/g
	s/^@/"@/
	s/@$/@"/
	s/^|/@"/
	s/|$/"@/
	s/|/"|"/g
	s/[^"@]$/&"/
	s/^[^"@]/"&/
	s/@/|/g' file

which if file contains:

|ABCD|EFGH|IJKL|MNOP
|ABCD|EF\|GH|IJKL|MNOP
ABC|D\|\|E|LMN|
\|\|\|\|

produces the output:

|"ABCD"|"EFGH"|"IJKL"|"MNOP"
|"ABCD"|"EF|GH"|"IJKL"|"MNOP"
"ABC"|"D||E"|"LMN"|
"||||"

awk '{gsub(/\|/, "\"|\"");gsub(/\\"\|"/,"|");$0=("\""$0"\"");sub(/(^"")|(""$)/,X)}1' file

This assumes the first field is empty, i.e. every line starts with a |.

Bash:

while IFS=\| read -a fields
do
  unset 'fields[0]'   # drop the empty field before the leading |; quoted so the shell cannot glob it
  printf '|"%s"' "${fields[@]}"
  echo
done < file

---
General shell with 5 fields:

while IFS=\| read a b c d e
do
  printf '|"%s"' "$b" "$c" "$d" "$e"
  echo
done < file
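
Both loops above depend on read being called without -r: in that mode a backslash escapes the next character, so \| reaches the variable as a plain | and is not treated as a delimiter. A minimal sketch of the difference, using two throwaway helper functions (the names are made up for this example) that print only the third field:

```shell
# Third field with backslash processing (read without -r).
field3_escaped() { IFS='|' read a b c d e; printf '%s\n' "$c"; }

# Third field with backslashes kept literal (read -r).
field3_raw() { IFS='|' read -r a b c d e; printf '%s\n' "$c"; }

line='|ABCD|EF\|GH|IJKL|MNOP'

printf '%s\n' "$line" | field3_escaped   # prints: EF|GH
printf '%s\n' "$line" | field3_raw       # prints: EF\
```

The flip side is that without -r every other backslash in the data is consumed as well, so this only suits files where backslashes appear solely as pipe escapes.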

Another awk

awk -F\| '{gsub(/\\\|/,"_");for (i=/^\|/?2:1;i<=(/\|$/?NF-1:NF);i++) $i="\""$i"\"";gsub(/_/,"|")}1' OFS=\| file

|"ABCD"|"EFGH"|"IJKL"|"MNOP"
|"ABCD"|"EF|GH"|"IJKL"|"MNOP"

Still assuming the first field is empty, i.e. every line starts with a |

awk '{for(i=1; i<=NF; i++) gsub(/\|/,"\"|\"",$i); sub(/"/,x); sub(/$/,"\"")}1' FS='\\\\\\|' OFS='|' file

perl -pe 's/(?<!\\)\|/"|"/g; s/\\\|/|/g; s/^"//; s/$/"/' file

---
@Jotne, Ravinder: that output is not correct.

@Scrutinizer
Updated my post.

PS: yours does not handle this:

|ABCD|EFGH|IJKL|MNOP
ABCD|EF\|GH|IJKL|

@Jotne, correct. I added a remark that my suggestion expects a leading |, as does my shell suggestion. The empty string before the leading | is not really a field; otherwise I would expect the output to be:

""|"ABCD"|"EFGH"|"IJKL"|"MNOP"
""|"ABCD"|"EF|GH"|"IJKL"|"MNOP"
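
For comparison, a rough sketch of that other reading, in which every | separates two fields and empty fields are quoted too. It assumes the control character held in awk's SUBSEP (\034 by default) never occurs in the data; the function name quote_every_field is made up for this example.

```shell
# Hypothetical helper: treat every unescaped | as a separator between two
# fields, and quote all of them, including empty ones.
quote_every_field() {
  awk '{
    gsub(/\\\|/, SUBSEP)     # hide escaped pipes behind a placeholder
    gsub(/\|/, "\"|\"")      # close one field and open the next at each delimiter
    $0 = "\"" $0 "\""        # open the first field and close the last
    gsub(SUBSEP, "|")        # restore the hidden pipes, now unescaped
    print
  }' "$@"
}

printf '%s\n' '|ABCD|EF\|GH|IJKL|MNOP' 'ABCD|EF\|GH|IJKL|' | quote_every_field
```

With this reading, Jotne's second test line gets a quoted empty last field instead of a bare trailing |.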