Column

SUMNIX · August 7, 2008, 7:55am

Hi All,

I am getting a pipe-separated flat file which is to be loaded into the database. The issue is that the last column of each record is scattered across multiple lines. Ideally each record should be on a single line (please refer to the last three lines of the attached flat file), but the flat file generation utility is not generating it correctly and I have to live with this problem.

I would really appreciate it if anybody could suggest a solution to fix this issue.

Thanks in advance.

radoulov · August 7, 2008, 8:42am

Your input lines are split by CRLF (probably the file has been generated on Windows).
The Awk script below outputs something similar to what you want on my Linux box
(you should use nawk or /usr/xpg4/bin/awk if you want to use the code on Solaris):

awk -F\| 'END { print _ }
NF > 1 && _ { print  _ ; _ = "" }
{ _ = _ ? _ $0 : $0 }
' RS='\r\n' ORS='\r\n' flat_file.txt

Logically, the spaces that are missing in your input will not be present in the output either (SAINDAS instead of SAIN DAS).

If you're running awk on Windows, you should use a script file, rather than a command.

If the above code does not work (because you saved the sample on Windows, modifying the record separator),
just try it without setting the RS and ORS variables.
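
For awk implementations that do not accept a multi-character RS, one possible variant strips the trailing carriage return inside the script instead. This is only a sketch: the sample data and the names sample, buf and joined are made up for illustration, and it assumes, like the script above, that any line with more than one pipe-separated field starts a new record.

```shell
# Hypothetical sample: 3 fields per record, CRLF line ends, and the last
# field of the first record broken over three lines.
sample=$(mktemp)
printf 'DAT|1|SAIN\r\nDAS\r\nGUPTA\r\nDAT|2|JOHN\r\n' > "$sample"

joined=$(awk -F'|' '
  { sub(/\r$/, "") }                           # drop a trailing carriage return, if any
  NF > 1 && buf != "" { print buf; buf = "" }  # a new record starts: flush the previous one
  { buf = buf $0 }                             # append the current line to the pending record
  END { if (buf != "") print buf }             # flush the last record
' "$sample")

printf '%s\n' "$joined"
rm -f "$sample"
```

A continuation line that itself contains a pipe would be mistaken for the start of a new record, so this only suits data where the broken-off fragments never contain the separator.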

SUMNIX · August 7, 2008, 1:12pm

I am running it on Solaris and used the xpg4 version of awk to execute the awk script below. I have removed the RS and ORS variables because they were not helping in generating the output file.

awk -F\| 'END { print _ }
NF > 1 && _ { print _ ; _ = "" }
{ _ = _ ? _ $0 : $0 }' flat_file.txt
The output file which I got is still not the correct one.

The original file records have 9 fields separated by "|".
The awk script I am running creates one extra record at the start and puts the scattered column characters at the beginning of the record. In short, it creates one more record at the beginning and in some cases it loses characters. Please observe the second record in the attached output file which I got after running the above awk.

Please help in generating the correct flat file.

Thanks...

sudhamacs · August 7, 2008, 1:21pm

Assuming that each record starts with "DAT|":

awk -F'|' '$1=="DAT"{printf "\n" }; {printf $0 ;}' flat_file.txt
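
One caveat with the one-liner above: printf $0 treats the data as a format string, so a "%" in the input could be misread, and the output begins with an empty line and ends without a newline. A slightly more defensive sketch of the same idea might look like this; the sample data and the names sample and joined are hypothetical, and it assumes the file already has Unix line endings.

```shell
# Hypothetical sample: the last field of the first record contains a "%"
# and is broken over two lines.
sample=$(mktemp)
printf 'DAT|1|100%%\nOFF\nDAT|2|JOHN\n' > "$sample"

joined=$(awk -F'|' '
  $1 == "DAT" && NR > 1 { printf "\n" }  # end the previous record before a new one starts
  { printf "%s", $0 }                    # print the line as data, not as a format string
  END { if (NR > 0) printf "\n" }        # terminate the last record
' "$sample")

printf '%s\n' "$joined"
rm -f "$sample"
```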

radoulov · August 7, 2008, 1:50pm

Could you please run:

dos2unix flat_file.txt flat_file_unix.txt

And then rerun the code I posted with the correct file - flat_file_unix.txt. I suppose you'll get the correct result.
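
A leftover carriage return in the middle of a joined line typically makes a terminal jump back to the start of the line, which may be why the fragments appeared at the beginning of the records. Where dos2unix is not available, tr can be used to check for and remove the carriage returns. The following is a rough sketch with a made-up sample file; the names sample, cr_before and cr_after are illustrative only.

```shell
# Hypothetical sample with CRLF line endings.
sample=$(mktemp)
printf 'DAT|1|SAIN\r\nDAS\r\n' > "$sample"

# Count carriage returns: delete every byte except CR, then count what is left.
# The $(( )) wrapper strips any padding that wc may add.
cr_before=$(( $(tr -cd '\r' < "$sample" | wc -c) ))

# Write a copy with every carriage return removed.
tr -d '\r' < "$sample" > "$sample.unix"
cr_after=$(( $(tr -cd '\r' < "$sample.unix" | wc -c) ))

echo "CRs before: $cr_before, after: $cr_after"
rm -f "$sample" "$sample.unix"
```

Note that tr -d removes every carriage return, not only those at the end of a line, which is usually what is wanted for this kind of text file.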

SUMNIX · August 8, 2008, 12:56am

I tried the solution given by radoulov and it worked perfectly this time.

I have also tried sudhamacs' awk solution on the original file and it also worked as expected.

Thanks a lot, guys.