Hello,
I have two files:
file1
x
v
r
g
file2
aaaa,x,1111
bbbb,v,2222
bbbb,v,
cccc,r,3333
dddd,s,4444
eeee,q,5555
ffff,p,6666
Desired output:
aaaa,x,1111
bbbb,v,2222
cccc,r,3333
and not
aaaa,x,1111
bbbb,v,2222
bbbb,v,
cccc,r,3333
fgrep -f file1 file2 gives me the second output, which I don't want.
Thanks,
funksen
November 16, 2011, 6:08am
2
At the cost of performance:
while read -r searchstring ; do grep -m 1 "${searchstring}" file2 ; done < file1
rdcwayx
November 16, 2011, 6:20am
3
awk -F , 'NR==FNR{a[$1];next} $2 in a{print;delete a[$2]}' file1 file2
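For anyone following along, here is a commented run of the one-liner above on the sample files from the first post. It is only a sketch: the scratch directory and the `result` variable are scaffolding added for illustration and are not part of the original answer.

```shell
# Recreate the sample files from the first post in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' x v r g > "$tmpdir/file1"
printf '%s\n' 'aaaa,x,1111' 'bbbb,v,2222' 'bbbb,v,' 'cccc,r,3333' \
    'dddd,s,4444' 'eeee,q,5555' 'ffff,p,6666' > "$tmpdir/file2"

# NR==FNR is true only while the first file is being read: store each key in a[].
# For file2, print a line whose 2nd field is still a key in a[], then delete
# that key so later lines with the same 2nd field are skipped.
result=$(awk -F , 'NR==FNR{a[$1];next} $2 in a{print;delete a[$2]}' \
    "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

On that sample this prints the three wanted lines (aaaa,x,1111, bbbb,v,2222 and cccc,r,3333) and skips the second "bbbb,v," line, because the key v has already been deleted by then.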
Hi rdcwayx,
I think your solution is what I need. However, my lines are a bit more complicated than the example I gave, which your solution is based on; that is my fault. I was looking for something general, but it seems there is nothing you can apply to every case.
The expressions I want to match look more like this:
Network=XXX,Context=GG123,Element=1
and I want whatever comes after the second equals sign and before the last comma, which in this case would be GG123.
Hi funksen,
I have thousands of lines, so I am not sure the while loop would be a good idea, given the performance cost you mentioned.
Try this:
$ fgrep -f file1 file2 | nawk -F'[=,]' '!x[$4]++'
Hi jayan jay,
This is the error I get when running your solution:
user> fgrep -f not_upgraded_sites.txt not_upgraded.sel | nawk -F'[=,]' '!x[$4]++'
x[: Event not found.
Please try with double quotes.
I get the same thing.
---------- Post updated at 04:24 AM ---------- Previous update was at 04:18 AM ----------
If anyone is interested,
a combination of jayan jay's and rdcwayx's solutions works:
nawk -F"[=,]" 'NR==FNR{a[$1];next} $4 in a{print;delete a[$4]}' file1 file2
Hey jayan jay, can you explain why field $4?
---------- Post updated at 04:31 AM ---------- Previous update was at 04:25 AM ----------
jayan jay,
I tried all the possible quoting combinations:
> fgrep -f not_upgraded_sites.txt not_upgraded.sel | nawk -F"[=,]" "!x[$4]++"
x[: Event not found.
> fgrep -f not_upgraded_sites.txt not_upgraded.sel | nawk -F'[=,]' "!x[$4]++"
x[: Event not found.
> fgrep -f not_upgraded_sites.txt not_upgraded.sel | nawk -F'[=,]' '!x[$4]++'
x[: Event not found.
>
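The message "x[: Event not found." looks like csh/tcsh history expansion rather than an awk error: in those shells `!` triggers history substitution even inside single or double quotes, which would explain why changing the quoting makes no difference. Escaping it as `\!` is one way around this; another is to write the awk condition without `!` at all. Below is a minimal sketch of the second option, assuming the same `[=,]` separator. The sample lines, the `seen` array name and the `result` variable are illustrative, and the scaffolding around the awk program is written for a POSIX shell.

```shell
# 'seen[$4]++ == 0' behaves like '!seen[$4]++': it is true only the first
# time a given 4th field appears, and it contains no '!' for csh/tcsh to expand.
result=$(printf '%s\n' \
    'Network=ONR,Context=FC1' \
    'Network=ONR,Context=FC1,Element=1' \
    'Network=ONR,Context=FC2,Element=1' |
    awk -F'[=,]' 'seen[$4]++ == 0')
printf '%s\n' "$result"
```

Only the first FC1 line and the FC2 line are printed; the second FC1 line is dropped as a repeat.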
Hope this clears it up:
$ echo "SubNetwork=XXX,MeContext=GG123,Element=1" | nawk -F'[=,]' '{print $4}'
GG123
Also try this combination:
$ fgrep -f not_upgraded_sites.txt not_upgraded.sel | nawk -F"[=,]" '!x[$4]++'
file1:
FC1
FC3
FC4
FC2
FC5
actual file2:
Network=ONR,Context=FC6,Element=1
Network=ONR,Context=FC7,Element=1
Network=ONR,Context=FC0,Element=1
Network=ONR,Context=FC1
Network=ONR,Context=FC1,Element=1
Network=ONR,Context=FC2,Element=1
Network=ONR,Context=FC0,Element=1
Network=ONR,Context=FC6,Element=1
rdcwayx
November 17, 2011, 4:42am
13
OK, as I understand it, your file1 will be something like:
GG123
GG124
XX123
file2
Network=XXX,Context=GG123,Element=1
Network=XXX,Context=GG123,Element=2
Network=XXX,Context=XX123,Element=1
Network=XXX,Context=GG124,Element=1
Network=XXX,Context=GG123,Element=1
awk -F , 'NR==FNR{a[$1];next} {split($2,b,"=");if (b[2] in a){print;delete a[b[2]]}}' file1 file2
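As a quick check, the command above can be run against the sample file1 and file2 from this post. This is a sketch only: the scratch directory and the `result` variable are scaffolding added for illustration.

```shell
# Recreate the sample files from this post in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' GG123 GG124 XX123 > "$tmpdir/file1"
printf '%s\n' \
    'Network=XXX,Context=GG123,Element=1' \
    'Network=XXX,Context=GG123,Element=2' \
    'Network=XXX,Context=XX123,Element=1' \
    'Network=XXX,Context=GG124,Element=1' \
    'Network=XXX,Context=GG123,Element=1' > "$tmpdir/file2"

# Load the keys from file1 into a[]. For each file2 line, split the 2nd
# comma-separated field on '=' and use the part after '=' as the lookup key;
# print the first line seen for each key and delete the key afterwards.
result=$(awk -F , 'NR==FNR{a[$1];next} {split($2,b,"=");if (b[2] in a){print;delete a[b[2]]}}' \
    "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

This keeps the first line for each of GG123, XX123 and GG124, in file2 order, and drops the two later GG123 lines.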