I am receiving a pipe-separated flat file that has to be loaded into a database. The problem is that the last column of some records is scattered across multiple lines. Ideally each record should be on a single line (please refer to the last three lines of the attached flat file), but the utility that generates the file does not produce it correctly, and I have to live with that.
I would really appreciate it if anybody could suggest a way to fix this.
Your input lines are split by CRLF (probably the file has been generated on Windows).
The Awk script below outputs something similar to what you want on my Linux box
(you should use nawk or /usr/xpg4/bin/awk if you want to use the code on Solaris):
Naturally, spaces that are missing from your input will also be missing from the output (SAINDAS instead of SAIN DAS).
If you're running awk on Windows, you should use a script file, rather than a command.
If the above code does not work (because you saved the sample on Windows, modifying the record separator),
just try it without setting the RS and ORS variables.
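For what it's worth, here is a minimal sketch of the same joining idea that strips the trailing carriage return from each line itself instead of relying on RS and ORS. The sample data, the file name sample.txt and the function name join_records are made up for illustration. It assumes a continuation line never contains a "|", and the output will have plain LF line endings.

```shell
# Hypothetical sample: CRLF line endings, last field of the first record split in two.
printf 'a|b|SAIN\r\nDAS\r\nc|d|e\r\n' > sample.txt

# join_records FILE: print each record on one line (hypothetical helper name).
join_records() {
    awk -F'|' '
        { sub(/\r$/, "") }                           # drop the trailing CR, if any
        NF > 1 && buf != "" { print buf; buf = "" }  # a line with a "|" starts a new record
        { buf = buf $0 }                             # append the line to the current record
        END { if (buf != "") print buf }             # flush the last record
    ' "$1"
}

join_records sample.txt
```

On the sample this should print a|b|SAINDAS and c|d|e on two lines. On Solaris the awk inside the function would need to be nawk or /usr/xpg4/bin/awk, as noted above.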
I am running it on Solaris and used the xpg4 version of awk to execute the awk script below. I have removed the RS and ORS variables because they were not helping with the output file generation.
awk -F\| 'END { print _ }
NF > 1 && _ { print _ ; _ = "" }
{ _ = _ ? _ $0 : $0 }' flat_file.txt
The output file I got is still not correct.
Each record in the original file has 9 fields separated by "|".
The awk script creates one extra record at the start of the file and puts the scattered column characters at the beginning of the record. In short, it creates an additional record at the beginning and in some cases it loses characters. Please look at the second record in the attached output file, which I got after running the above awk.
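Two things might be worth checking here, though neither can be confirmed without seeing the file. Fragments that appear at the start of a record, and characters that seem to vanish, are what leftover carriage returns typically look like on a terminal, and dropping RS leaves the CR at the end of every input line. Also, since each record should have 9 fields, counting fields in the joined output can show which records are really malformed. A rough sketch, where check_fields, joined.txt and the 3-field sample are hypothetical:

```shell
# Hypothetical sample with 3 fields per record; the second record is short.
printf 'a|b|c\nd|e\n' > joined.txt

# check_fields FILE EXPECTED: list lines whose field count differs from EXPECTED.
check_fields() {
    awk -F'|' -v want="$2" '
        { sub(/\r$/, "") }                        # ignore a trailing CR when counting
        NF != want { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit bad + 0 }                      # non-zero exit status if any line is off
    ' "$1"
}

check_fields joined.txt 3 || echo "some records are malformed"
```

For the real data the call would be something like check_fields output.txt 9, with output.txt standing in for whatever the joined file is called.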