Column with New Line in CSV file

Hello,

Got a CSV file which contains 21 columns

Need to convert the file to Pipe delimiter and Few columns text data contains new line

Example

1,2,3,"ABC" 8" ABC,5,6,7 
1,2,3,"ABC"
8" ABC,5,6,7  ( New Line)
1,2,3,""ABC" 8" ABC", 5,6,7
1,2,3,"ABC"
,5,6,7(New line)

Expected output

1|2|3|"ABC" 8" ABC|5|6|7
1|2|3|"ABC" 8" ABC|5|6|7
1|2|3|"ABC" 8" ABC|5|6|7
1|2|3|"ABC"|5|6|7

For your sample input, try something like this:-

awk -F, 'NF!=7{ORS=(ORS!=" "?" ":RS)}{$1=$1}1' OFS=\| file
1 Like

Yoda Thanks a lot, Can you please explain me whats going on with that statement.

I am new to AWK commands.

awk -F, '
        # if NF not equal to 7
        NF != 7 {
                # If ORS not equal to " " set ORS = " " else set ORS = RS (newline by default)
                ORS = ( ORS != " " ? " " : RS )
        }
        {
                # This stmt rebuilds current record, this is to set OFS = |
                $1 = $1
        }
        # 1 == true. If true, default awk operation is to print current record
        1
' OFS=\| file
1 Like

Yoda, I always love your awk replies. They're very economical and clever with the concepts of awk. Thanks.

1 Like

Nice Yoda.. Just for fun ORS=(ORS!=" "?" ":RS) could be reduced to ORS=ORS?x:RS still

1 Like

Hello,

One more approach for same.

awk -F, '/^1/ {a=1; OFS="|";v=$0;b=1} {if(a==1 && $0 !~ /^1/){v=v" "$0;b=0}} !b{gsub(/\,/,"|",v);print v} (a && b && NF==7){gsub(/\,/,"|",v);print v}' filename

Output will be as follows.

1|2|3|"ABC" 8" ABC|5|6|7
1|2|3|"ABC" 8" ABC|5|6|7  ( New Line)
1|2|3|""ABC" 8" ABC"| 5|6|7
1|2|3|"ABC" |5|6|7(New line)

EDIT: Adding one more solution for same.

awk -F"," '{if($NF ~ "ABC"){ORS=" "} else {ORS="\n"}}{$1=$1} 1' OFS="|"  test14

Output will be as follows.

1|2|3|"ABC" 8" ABC|5|6|7
1|2|3|"ABC" 8" ABC|5|6|7
1|2|3|""ABC" 8" ABC"| 5|6|7
1|2|3|"ABC" |5|6|7

Where test14 is input file name.

Thanks,
R. Singh

Perfect. Thanks guys. I will build the script to handle other situation too.

The solution you get here can only be as good as your description of the problem.

You say you actual CSV file has 21 columns (fields?) you the sample data you supply only has 4 line of 7 fields.

Yoda states his solution is based on the sample data, probably due to this confusion.
RavinderSingh13's solution will only work if your CSV files always starts with the digit "1" and/or contains the string "ABC".

If these are working for you all is good but if you need something that works for your actual data you might need to provide a better description of what is needed.

For example:

  • how many fields are in the actual data
  • are records split over more than 2 lines
  • do you always need a space inserted when lines are joined (like what happens when lines 2 and 3 are joined from you sample data)