Hi guys,
I have a question regarding AWK or SED.
I have a data file which reads like this:
1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850
2,
3,333 41998 55300 20000 83915 85650 90710 91129=
.........
There are about 100 or so lines like this.
What I need to do is append the line ending in = to the data line above it, so the data file reads like this:
1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850 333 41998 55300 20000 83915 85650 90710 91129=
2,
3,
........
Any ideas guys?
My PC is about to get thrown out of the window....
Cheers,
Ian
mirni
January 12, 2012, 10:36am
2
Is this happening on every other line or is the pattern not regular?
---------- Post updated at 05:36 AM ---------- Previous update was at 05:06 AM ----------
Regardless, try this:
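awk '
# remember the most recent line whose data does not end in "="
$2~/[0-9][^=]$/ {nr=NR}
# a line ending in "=": move its data onto the remembered line
$2~/=$/ { store2=$2; $2=""; b[a[nr]]=b[a[nr]] " " store2 }
{ a[NR]=$1; b[$1]=$2 }
END{
for(i=1;i<=NR; i++)
print a[i],b[a[i]]
}' FS="," OFS="," input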
aigles
January 12, 2012, 10:52am
3
Another awk script :
awk -F, '
function printLine() {
for (i=1; i<=cnt; i++) print line[i];
cnt = 0;
}
{
cnt++
if (cnt==1) {
line[1] = $0;
} else {
line[1] = line[1] " " $2;
line[cnt] = $1 FS;
}
if (/=$/) printLine();
}
END { printLine() }
' inputfile
Input file:
1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850
2,
3,333 41998 55300 20000 83915 85650 90710 91129=
4,123 456 789 012
5,abc def ghi jkl
7,AAA BBB CCC DDD EEE FFF
6,xyz=
Output :
1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850 333 41998 55300 20000 83915 85650 90710 91129=
2,
3,
4,123 456 789 012 abc def ghi jkl AAA BBB CCC DDD EEE FFF xyz=
5,
7,
6,
Jean-Pierre.
Hi Guys,
Thanks for replying.
First, it's not a regular pattern, as some of the lines go like this:
1 03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850
2
3 333 41998 55300 20000 83915 85650 90710 91129=
4
5 03089 41460 73215 10018 20000 39990 40092 333 51035=
Generally, if the data on a line starts with 333, it needs to be appended to the line above.
BTW, as I'm using AWK to read and process these files, can I make it read through the whole record until it gets to the terminating = (the line matching /=$/)?
If so, how is this done with awk? I have not been able to find any documentation on it.
Cheers,
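One possible way to treat everything up to the terminating = as a single record is to collect lines in a buffer and print the buffer once it ends in =. This is only a minimal sketch, assuming space-separated lines that each begin with a line number, as in the sample above. The function name join_until_eq is made up for illustration, and the line numbers are dropped from the output.

```shell
# Hypothetical helper: joins physical lines into one record per "=" terminator.
join_until_eq() {
    awk '
    {
        sub(/[ \t\r]+$/, "")             # trailing blanks or CR would hide the "="
        sub(/^[ \t]*[0-9]+[ \t]*/, "")   # drop the leading line number
        if ($0 != "") buf = (buf == "" ? $0 : buf " " $0)
        if (buf ~ /=$/) { print buf; buf = "" }
    }
    END { if (buf != "") print buf }     # unterminated tail, if any
    ' "$@"
}
```

It can be called as join_until_eq infile, or used at the end of a pipe.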
---------- Post updated 01-13-12 at 03:27 AM ---------- Previous update was 01-12-12 at 02:41 PM ----------
Hi Guys,
I have tried both scripts, but I was unable to get either of them to work.
mirni,
When I tried yours, it just printed out the same as what I had.
And Jean-Pierre,
It produced syntax errors on the last printLine:
END { printLine() }
I removed this and more errors appeared throughout the script.
aigles
January 13, 2012, 5:46am
5
Please show us your script code and the errors produced.
Try replacing awk with nawk or gawk (depending on your Unix).
Jean-Pierre.
Hi Jean-Pierre,
I don't have nawk installed, but I do have gawk. However, it came up with the same error as awk:
END printLine() }
^ syntax error
thanks
Ian
Yeah,
END { printLine() }
is in there, but it still produces the same error.
I am trying this from a slightly different approach:
awk 'BEGIN {RS=" "} { for (i-1;i<NF;i++) printf $(i) " "} END { print ""}' infile
This kind of works, apart from removing the last set of numbers from the top line and adding a space at the start of the following line.
infile
1 03005 12671 73505 10027 21024 30197 40301 52007 60001 81434 91150
2
3 333 41998 55307 20344 81632 86075=
4
outfile
1 03005 12671 73505 10027 21024 30197 40301 52007 60001 81434 333 41998 55307 20344 81632 86075=
2
3 03005 12671 73505 10027 21024 30197 40301 52007 60001 81434 333 41998 55307 20344 81632 86075=
4
As you can see, the outfile has the 91150 missing, and an extra space has appeared at the start of each line after line one.
Any ideas on this one?
Cheers,
ian
mirni
January 14, 2012, 6:05am
9
Can you please post the output of
cat -A inputFile | head
I suspect you have some spaces after the final '=', which is why the awk isn't matching the lines.
Is this
awk '/=$/' input
returning any lines?
You could try to change this
$2~/=$/
to
$2~/= *$/
in my code.
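Since the later samples are space-separated rather than comma-separated, a variant along the following lines may be closer to what is needed. It is a minimal sketch, not a tested fix for the real file. It assumes the first field is the line number and that any line whose second field is exactly 333 belongs to the nearest earlier line that has data. Trailing blanks and carriage returns are stripped first. The name merge_333 is made up for illustration.

```shell
# Hypothetical helper: appends "333 ..." lines to the previous data line,
# keeping every line number in the output.
merge_333() {
    awk '
    {
        sub(/[ \t\r]+$/, "")                   # strip trailing blanks and CR
        num[NR] = $1                           # line number column
        rest = $0
        sub(/^[ \t]*[^ \t]+[ \t]*/, "", rest)  # everything after the number
        if ($2 == "333" && last) {
            data[last] = data[last] " " rest   # append to the previous data line
            data[NR] = ""
        } else {
            data[NR] = rest
            if (rest != "") last = NR          # most recent line that has data
        }
    }
    END {
        for (i = 1; i <= NR; i++)
            print (data[i] == "" ? num[i] : num[i] " " data[i])
    }' "$@"
}
```

Usage would be merge_333 infile > outfile. Lines such as line 5 in the sample, where 333 appears in the middle of the data, are left untouched because only the second field is checked.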