Dear All,
Sorry for opening a new thread, but the old one (http://www.unix.com/shell-programming-and-scripting/263430-find-values-within-range-output.html) is already marked as resolved, although the solution doesn't actually work properly, and the input files are a bit different.
File1:
1 195240910 +
2 195240915 -
File2:
1 195240905 4
1 195240906 4
1 195240907 5
1 195240908 5
1 195240909 3
1 195240910 0
1 195240911 5
1 195240912 5
1 195240913 0
1 195240914 0
1 195240915 3
1 195240916 4
1 195240917 5
1 195240918 8
1 195240919 5
1 195240920 6
2 195240905 7
2 195240906 2
2 195240907 9
2 195240908 9
2 195240909 2
2 195240910 12
2 195240911 2
2 195240912 9
2 195240913 5
2 195240914 9
2 195240915 0
2 195240916 2
2 195240917 9
2 195240918 5
2 195240919 9
2 195240920 6
I would like to compare these two files as follows.
First, columns $1 and $2 must match in both files. If they do, output the matching $1 and $2. Then, if column $3 of File1 is +, output the n values of $3 in File2 that end at the matching line; otherwise, if column $3 of File1 is -, output the n values of $3 in File2 that start at the matching line.
So for File1 and File2, the output should be (for n=5):
1 195240910 4 5 5 3 0
2 195240915 9 5 9 2 0
I really hope the colors help to make this clear.
Any help or suggestions?
Best
RudiC, January 13, 2016, 8:23am, #2
Try
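awk '
# file1: remember the position for each $1, plus an offset built from the sign in $3 and N-1
FNR == NR       {T[$1] = $2
                 S[$1] = sprintf ("%d", $3 N-1)
                 next
                }
# inside an open window: print the value, ending the line at record L
NR <= L         {printf "%s%s", $3, L==NR?"\n":" "
                 next
                }
# window start: print $1, the stored position and the first value, then expect N-1 more lines
(T[$1] <= $2 + S[$1] || T[$1] == $2 ) &&
T[$1]           {printf "%s %s %s ", $1, T[$1], $3
                 delete T[$1]
                 L = NR + N - 1
                }
' N=5 file1 file2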
1 195240910 4 5 5 3 0
2 195240915 0 2 9 5 9
Dear RudiC,
your script shows only the last entry for each $1.
In fact, if File1 is:
1 195240910 +
1 195240920 +
2 195240915 -
The output is:
1 195240920 4 5 8 5 6
2 195240915 0 2 9 5 9
instead of:
1 195240910 4 5 5 3 0
1 195240920 4 5 8 5 6
2 195240915 0 2 9 5 9
File2 is the same as in the first post.
Best
RudiC, January 13, 2016, 9:25am, #4
The array values for $1 are being overwritten, so only the last entry is valid. You didn't mention that several entries with the same $1 value are possible.
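The overwrite can be seen in isolation with a small sketch. It assumes the three File1 lines from the follow-up post; the temporary file and the variable names (demo_file, by_first, by_pair) are made up for the demonstration. It only counts array entries and does not attempt the full task.

```shell
# Hypothetical scratch file holding the three File1 lines from the post above.
demo_file=$(mktemp)
printf '%s\n' '1 195240910 +' '1 195240920 +' '2 195240915 -' > "$demo_file"

# Keyed by $1 only: the second line for $1 == 1 replaces the first.
# Prints the number of entries and the position that survived for key 1.
by_first=$(awk '{ T[$1] = $2 }
                END { c = 0; for (k in T) c++; print c, T[1] }' "$demo_file")

# Keyed by $1 and $2 together: every line keeps its own entry.
by_pair=$(awk '{ T[$1 FS $2] = $3 }
               END { c = 0; for (k in T) c++; print c }' "$demo_file")

echo "keyed by \$1 only:    $by_first"
echo "keyed by \$1 and \$2: $by_pair"
rm -f "$demo_file"
```

With $1 alone as the key, only two entries remain and the stored position for key 1 is 195240920; with the combined key, all three lines are kept.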
Yes, sorry, I didn't mention it; I didn't think about it...
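awk '
{ k=$1 FS $2 }                          # key on both $1 and $2
NR==FNR { a[k]=$3; next }               # file1: remember the sign for each key
{ s[NR%n]=$3 }                          # file2: circular buffer of the last n values
(k in a) {
    printf "%s %s",$1,$2
    if (a[k]=="+") {                    # "+": print the n values ending at this line
        for (i=1; i<=n; i++) printf " %s",s[(NR+i)%n]
        printf "\n"
    } else { f=n }                      # "-": start a countdown over the next lines
}
(f && (f--==1)) {                       # countdown done: print the buffer, newest first
    for (i=n; i>=1; i--) printf " %s",s[(NR+i)%n]
    printf "\n"
}
' n=5 file1 file2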
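The script above relies on s[NR%n] acting as a circular buffer that always holds the $3 values of the last n records. Here is a minimal sketch of just that indexing, run on made-up input (the numbers 10 to 70, n=3, and the variable name ring_out are assumptions for the demonstration, not part of the thread's data):

```shell
# Feed seven hypothetical values through a 3-slot circular buffer.
ring_out=$(printf '%s\n' 10 20 30 40 50 60 70 | awk -v n=3 '
    { s[NR % n] = $1 }                  # reuse the slot written n records ago
    NR == 5 {                           # arbitrary record picked for the demo
        for (i = 1; i <= n; i++)        # oldest to newest, as for "+"
            printf "%s%s", s[(NR + i) % n], (i < n ? " " : "\n")
    }
    END {
        for (i = n; i >= 1; i--)        # newest to oldest, as for "-"
            printf "%s%s", s[(NR + i) % n], (i > 1 ? " " : "\n")
    }')
echo "$ring_out"
```

At record 5 the buffer holds 30 40 50, and at the end of input the reversed loop gives 70 60 50. Note that in the script above the buffer is never reset, so a "+" match within the first n-1 File2 lines of a new $1 group would also pick up values from the previous group, and a "-" match within the last n-1 lines of File2 would print its key without any values.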