Concatenate rows and redefine range

alex2005 · June 4, 2015, 5:51pm

I'm trying to find a way to concatenate consecutive rows (key is column $1 and $2) if column $5 an $6 are integers and redefine ranges in columns $3&$4 and $5&$6

Unfortunately I'm still learning the very basics so I cannot figure a way of doing this with awk.

Input file

15 30  21  21  25.0  25.0
15 30  22  22  26.0  26.0
15 30  23  23  27.0  27.0
15 30  24  24  28.5  28.5
15 30  25  25  29.7  29.7
15 30  26  26  30.0  30.0
15 30  27  27  31.0  31.0
17 32 155 155 325.0 325.0
17 32 156 156 326.3 326.3
17 32 158 158 424.0 424.0
17 32 159 159 425.0 425.0
17 32 160 160 426.0 426.0

Desired output:

15 30  21  23  25.0  27.0
15 30  24  24  28.5  28.5
15 30  25  25  29.7  29.7
15 30  26  27  30.0  31.0 
17 32 155 155 325.0 325.0
17 32 156 156 326.3 326.3
17 32 158 160 424.0 426.0

Thanks in advance!

Scrutinizer · June 4, 2015, 6:31pm

Presumably, besides the condition that column $5 an $6 are integers (technically 27.0 is not an integer, but you seem to mean numbers that end with .0 ) they should als be only one higher than the previous line? Otherwise :

15 30  26  26  30.0  30.0
15 30  27  27  31.0  31.0
17 32 155 155 325.0 325.0

would need to be combined to :

15 30  26  155  30.0  325.0

alex2005 · June 4, 2015, 6:42pm

Hi,

Thank you for your answer, you are correct, I need to merge only the rows which ends in

.0

and yes should be one higher from the previous row if the same values in column

$1

and

$2

.

Chubler_XL · June 4, 2015, 7:13pm

How about this:

awk 'key {
    if (key == $1" "$2 && f5 ~ ".0$" && $5 ~ ".0$") {
        t3=$3
        t5=$5
        next
    }
    print key,f3,t3,f5,t5
}
{ key=$1" "$2; f3=t3=$3; f5=t5=$5 }
END { if(key) print key,f3,t3,f5,t5 } ' infile

alex2005 · June 4, 2015, 8:07pm

Thank you very much, worked like a charm.

MadeInGermany · June 5, 2015, 2:41am

If x ~ ".0$" is to be an integer test, then should be x ~ "\.0$" or x ~ "[.]0$" or better the speaking x==int(x) .

Don_Cragun · June 5, 2015, 3:13am

As long as all of the data in the fields in question are written with a decimal point and one digit after the decimal point, you don't need to look for the decimal point. You could just check if the fields ends with a "0" with $x ~ "0$" .

RudiC · June 5, 2015, 6:18am

What about 27.30, then? I'd prefer the int($5) way.

---------- Post updated at 12:18 ---------- Previous update was at 12:16 ----------

Wasn't the first condition $1 and $2 be the key?

alex2005 · June 5, 2015, 4:01pm

Thank you for the useful ideas, will put them in practice.

alex2005 · June 6, 2015, 6:52pm

Sorry to bump up to this post again, but I need your help.
Looping to the files I have I noticed that I might have some lines where fields $5 and $6 are not consecutive. Eg:

15 30  21  21  25.00  25.00
15 30  22  22  26.00  26.00
15 30  23  23  27.00  27.00
15 30  24  24  28.05  28.05
15 30  25  25  29.07  29.07
15 30  26  26  30.00  30.00
15 30  27  27  31.00  31.00
17 32 155 155 325.00 325.00
17 32 156 156 326.03 326.03
17 32 157 157 423.00 423.00
17 32 158 158 424.00 424.00
17 32 159 159 425.00 425.00
17 32 160 160 428.00 428.00
17 32 161 161 429.00 429.00

Desired output should be:

15 30 21 23 25.00 27.00
15 30 24 24 28.05 28.05
15 30 25 25 29.07 29.07
15 30 26 27 30.00 31.00
17 32 155 155 325.00 325.00
17 32 156 156 326.03 326.03
17 32 157 159 423.00 425.00
17 32 160 161 428.00 429.00

Poor man attempts were from to edit Chubler_XL script as follow:

awk 'key {
    if (key == $1" "$2 && f5 ~ ".0$" && $5 ~ ".0$") {
        t3=$3
        t5=$6
        next
    }
    print key,f3,t3,f5,t5
}
{ key=$1" "$2; f3=t3=$3; f5=t5=$6 }
END { if(key) print key,f3,t3,f5,t5 } ' infile

But this didn't worked.

Hope you'll be able to guide me this time as well.

Thanks in advance

Don_Cragun · June 6, 2015, 11:29pm

Building on top of Chubbier_XL's code and incorporating MadeInGermany's suggestion:

awk '
NR > 1 {if (key == ($1 " " $2) && t5 == int(t5) && $5 == int($5) && 
	    (t5 + 1) == $5) {
		t3 = $3
		t5 = $5
		next
	}
	print key, f3, t3, f5, t5
}
{	key = $1 " " $2
	f3 = t3 = $3
	f5 = t5 = $5
}
END {	if(NR) print key, f3, t3, f5, t5
}' file

seems to do what you want.

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk .

alex2005 · June 7, 2015, 12:23am

Thank you very much, your solution worked fine.

Best Regards