Hello all,
I have data like this:
"1"|"My_name"|"My_Email"|"My_Last"|My_other"
"2"|"My_name"|"My_Email"|"My_Last"|My_other"
"3"|"My_name"|"My_Email"|"
"|My_other"
"1"|"My_name"|"My_Email"|"My_Last"|My_other"
I need output like this:
"1"|"My_name"|"My_Email"|"My_Last"|My_other"
"2"|"My_name"|"My_Email"|"My_Last"|My_other"
"3"|"My_name"|"My_Email"|""|My_other"
"1"|"My_name"|"My_Email"|"My_Last"|My_other"
So far I have a working sed command, but it runs out of memory when the file grows to millions of rows.
For awk I have this:
nawk -F"\|" 'NR==1 || $NF !~ /\|\"\n/{printf("%s",$0)} $NF ~ /\|\"\n/{printf("\n%s",$0)} END{print ""}'
It prints all the rows on a single line.
Please help.
Yoda
August 7, 2013, 3:55pm
2
Assuming all your lines should contain string: My_other
awk '/My_other/{ORS=RS}!/My_other/{ORS=""}1' file
Actually no, since what I posted is only sample data.
The best regular expression to look for is
|" \n "
because the line feed comes right after the "My_first name" field, and there are more fields after "My_other".
Yoda
August 7, 2013, 4:32pm
4
Please use code tags for posting data samples as well.
Check if this helps:
awk '/\|"[ ]*$/{ORS=""}!/\|"[ ]*$/{ORS=RS}1' file
Jotne
August 7, 2013, 4:40pm
5
A somewhat shorter version:
awk '{ORS=(/\|"[ ]*$/)?"":RS}1' file
Thanks, that worked. I would also like to know how it works, that is, what ORS and the rest do to make it happen. Any details about this command would help.
Thanks!
Yoda
August 7, 2013, 5:35pm
7
Explanation:
awk '
# Match a pipe | followed by a double quote " followed by zero or more spaces [ ]* at the end of the line $
/\|"[ ]*$/ {
# Set ORS (Output Record Separator - newline by default) to "" so that no newline is printed after this record
ORS = ""
}
# Same pattern used above, but using logical not !
!/\|"[ ]*$/ {
# Set ORS (Output Record Separator - newline by default) to RS (Record Separator - newline by default)
ORS = RS
}
# 1 == true (default awk action is to print current record)
1
' file
ORS and RS are special awk variables. Check the manual page for further reference.
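To watch the ORS switch at work, here is a minimal sketch on a cut-down copy of the sample data. The scratch file from mktemp and the variable name joined are only assumptions for the demo; the awk program is the short form posted above.

```shell
# Hypothetical scratch file standing in for the real data
tmp=$(mktemp)
printf '%s\n' \
  '"1"|"My_name"|"My_Email"|"My_Last"|My_other"' \
  '"3"|"My_name"|"My_Email"|"' \
  '"|My_other"' \
  '"4"|"My_name"|"My_Email"|"My_Last"|My_other"' > "$tmp"

# A line ending in |" is printed with an empty ORS, so the next line
# is glued onto it; every other line is followed by a normal newline.
joined=$(awk '{ ORS = (/\|"[ ]*$/) ? "" : RS } 1' "$tmp")
printf '%s\n' "$joined"
# "1"|"My_name"|"My_Email"|"My_Last"|My_other"
# "3"|"My_name"|"My_Email"|""|My_other"
# "4"|"My_name"|"My_Email"|"My_Last"|My_other"

rm -f "$tmp"
```

Since awk reads and prints one line at a time here, memory use should not grow with the number of rows.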
Thanks a lot for the help.
---------- Post updated 08-08-13 at 12:40 PM ---------- Previous update was 08-07-13 at 04:39 PM ----------
Hello, could you also please explain how the awk command above works? I am curious and would like to know for my own understanding.
Thanks a lot.
Yoda
August 8, 2013, 1:42pm
10
The logic is the same, but Jotne used an awk conditional expression to shorten it: ORS=(condition)?"":RS sets ORS to "" when the line matches the pattern and to RS otherwise.
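If it helps, the two forms can be compared side by side. This is only a sketch on a made-up three-line input (the scratch file and the variable names long and short are assumptions for the demo):

```shell
# Hypothetical scratch file with one broken record
tmp=$(mktemp)
printf '%s\n' 'a|"' '"|b' 'c' > "$tmp"

# Two pattern-action rules, as in the longer version
long=$(awk '/\|"[ ]*$/{ORS=""} !/\|"[ ]*$/{ORS=RS} 1' "$tmp")

# One conditional expression: condition ? value_if_true : value_if_false
short=$(awk '{ORS = (/\|"[ ]*$/) ? "" : RS} 1' "$tmp")

[ "$long" = "$short" ] && echo "same output"
printf '%s\n' "$short"
# a|""|b
# c

rm -f "$tmp"
```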
RudiC
August 8, 2013, 4:38pm
12
Also try this, which assumes every complete record has 5 fields:
awk -F"|" 'NF<5 {getline x; $0=$0 x} 1' file
"1"|"My_name"|"My_Email"|"My_Last"|My_other"
"2"|"My_name"|"My_Email"|"My_Last"|My_other"
"3"|"My_name"|"My_Email"|""|My_other"
"1"|"My_name"|"My_Email"|"My_Last"|My_other"