Compare two files and print each value from input1 (c1 20 100 X_y10) along with the closest higher value (c1 100 200 X_y10) and the closest lower value (c1 10 15 X_y10) from input2.
input1
c1 20 100 X_y10
input2
c1 5 10 X_y10
c1 10 15 X_y10
c1 100 200 X_y10
c1 200 300 X_y10
output
c1 20 100 X_y10 c1 10 15 X_y10
c1 20 100 X_y10 c1 100 200 X_y10
Here is the code I tried.
awk 'NR==FNR{a[$0]++;next}
$0 in a {a[$0]++; next}
{b[$0]++}
END{
for(i in a){
if(a==b) {
print ab
}
else if(a>=b) {
print ab
}
else if(a<=b) {
print ab
}
}
}' input1 input2
bumblebee_2010:
Shouldn't the second line of the output be this?
c1 20 100 X_y10 c1 200 300 X_y10
Please clarify if not.
Ha, no.
That is not the closest-highest of 20 and 100.
100 and 200 is the closest-highest:
20......100..100......200
and 10 and 15 is the closest-lowest:
10......15....20.......100
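To make that rule concrete, here is a minimal sketch for the single-line input1 case. It rests on an assumption about what the poster means: the closest-lowest is taken to be the input2 record with the largest end (column 3) that is not above the input1 start, and the closest-highest the record with the smallest start (column 2) that is not below the input1 end. The scratch directory and the variable names (q, qs, qe, lo, hi) are made up for illustration.

```shell
# Sample data from the thread, written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'c1 20 100 X_y10' > "$dir/input1"
printf '%s\n' 'c1 5 10 X_y10' 'c1 10 15 X_y10' \
              'c1 100 200 X_y10' 'c1 200 300 X_y10' > "$dir/input2"

out=$(awk '
# input1: remember the single query line and its numeric start/end
NR == FNR { q = $0; qs = $2 + 0; qe = $3 + 0; next }
# input2: keep the record ending closest below (or at) the query start
$3 <= qs && (lo == "" || $3 > loEnd)   { lo = $0; loEnd = $3 + 0 }
# input2: keep the record starting closest above (or at) the query end
$2 >= qe && (hi == "" || $2 < hiStart) { hi = $0; hiStart = $2 + 0 }
END {
    if (lo != "") print q, lo
    if (hi != "") print q, hi
}' "$dir/input1" "$dir/input2")

printf '%s\n' "$out"
rm -rf "$dir"
```

With the sample data this should print the two lines shown under "output" in the first post. It does not need input2 to be sorted, but it only handles one line in input1.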
How about this:
line1=`cat file1`
sort -nk2,3 file1 file2 | awk -v v1="$line1" '{if (v1==$0) { print v1,a;getline;print v1,$0;exit} else { a=$0}}'
It's working fine at my end. What error are you receiving?
[root@powerbroker ~]# cat cmp1
c1 20 100 X_y10
[root@powerbroker ~]# cat cmp2
c1 5 10 X_y10
c1 10 15 X_y10
c1 100 200 X_y10
c1 200 300 X_y10
[root@powerbroker ~]# line1=`cat cmp1`
[root@powerbroker ~]# echo $line1
c1 20 100 X_y10
[root@powerbroker ~]# sort -nk2,3 cmp1 cmp2 | awk -v v1="$line1" '{if (v1==$0) { print v1,a;getline;print v1,$0} else { a=$0}}'
c1 20 100 X_y10 c1 10 15 X_y10
c1 20 100 X_y10 c1 100 200 X_y10
[root@powerbroker ~]#
I know why: my input1 is actually very big, with specific keys (column 4), not just one line.
Ex:
cat input1
c1 20 100 X_y10
c1 20 100 X_y11
c1 20 100 X_y12
Maybe I should explain in more detail:
input1
c1 20 100 XXX
c1 20 100 YYY
input2
c1 5 10 XXX
c1 10 15 XXX
c1 100 200 XXX
c1 200 300 XXX
c1 18 19 YYY
c1 101 200 YYY
c1 201 211 YYY
Output
c1 20 100 XXX c1 10 15 XXX
c1 20 100 XXX c1 100 200 XXX
c1 20 100 YYY c1 18 19 YYY
c1 20 100 YYY c1 101 200 YYY
Try this:
awk 'NR==FNR{a[$1$4]=$0;b[$1$4]=$2;c[$1$4]=$3;next}
{if ($1$4 in b)
{
if ($4 != prev ) { y=0;prev=$4}
if($2<b[$1$4] && e[$1$4] < $2)
{
e[$1$4]=$2;d[$1$4]=$0
}
if($3>c[$1$4] && y==0) {f[$1$4]=$3 ; g[$1$4]=$0;y=y+1}
if($3>c[$1$4] && f[$1$4] > $3)
{
f[$1$4]=$3;g[$1$4]=$0
}
}
}
END {
for (i in d)
{
print a[i],d[i]
}
for (m in g)
{
print a[m],g[m]
}
}' file1 file2
Thank you very much for sharing the code, but it is unable to process a few of them (pointed out in bold).
input1
c1 18 50 XXX
c1 20 100 XXX
c1 20 100 YYY
input2
c1 5 10 XXX
c1 10 15 XXX
c1 100 200 XXX
c1 200 300 XXX
c1 18 19 YYY
c1 101 200 YYY
c1 201 211 YYY
Output
c1 18 50 XXX c1 10 15 XXX
c1 18 50 XXX c1 100 200 XXX
c1 20 100 XXX c1 10 15 XXX
c1 20 100 XXX c1 100 200 XXX
c1 20 100 YYY c1 18 19 YYY
c1 20 100 YYY c1 101 200 YYY
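One possible reason the earlier script misses some lines is that it stores a single input1 record per key ($1$4), so a second input1 line with the same key overwrites the first. A minimal sketch that keeps every input1 line is below. It is not the code from the replies above; it assumes the same reading of "closest" as the examples suggest (largest end not above the query start, smallest start not below the query end), matches on columns 1 and 4, and uses made-up array names (q, key, qs, qe, lo, hi).

```shell
# Sample data from the post above, written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'c1 18 50 XXX' 'c1 20 100 XXX' 'c1 20 100 YYY' > "$dir/input1"
printf '%s\n' 'c1 5 10 XXX' 'c1 10 15 XXX' 'c1 100 200 XXX' 'c1 200 300 XXX' \
              'c1 18 19 YYY' 'c1 101 200 YYY' 'c1 201 211 YYY' > "$dir/input2"

out=$(awk '
NR == FNR {                  # input1: remember every query, in input order
    n++
    q[n] = $0; key[n] = $1 SUBSEP $4
    qs[n] = $2 + 0; qe[n] = $3 + 0
    next
}
{                            # input2: test this record against each query
    k = $1 SUBSEP $4
    for (i = 1; i <= n; i++) {
        if (key[i] != k) continue
        # closest record lying below the query start
        if ($3 <= qs[i] && (!(i in lo) || $3 > loEnd[i]))   { lo[i] = $0; loEnd[i] = $3 + 0 }
        # closest record lying above the query end
        if ($2 >= qe[i] && (!(i in hi) || $2 < hiStart[i])) { hi[i] = $0; hiStart[i] = $2 + 0 }
    }
}
END {                        # print in the original input1 order
    for (i = 1; i <= n; i++) {
        if (i in lo) print q[i], lo[i]
        if (i in hi) print q[i], hi[i]
    }
}' "$dir/input1" "$dir/input2")

printf '%s\n' "$out"
rm -rf "$dir"
```

On the sample data this should reproduce the six output lines listed above. The inner loop compares every input2 record with every input1 record, so for a very big input1 it may be slow; grouping the queries by key first would be one way to cut that down.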