Hello guys,
I have a flat file delimited with '|~|'.
When I count the records using the command below,
awk -FS"[|~|]+" ' {print $colno}' filename | wc -l
the count is fine.
But when I try to count the unique records, the output is always 1:
awk -FS"[|~|]+" ' {print $colno}' filename |sort|uniq|wc -l
Please let me know how to find the unique record count.
Field separator doesn't matter while finding the count of unique lines, does it?
Try:
sort filename | uniq | wc -l
Thanks, but I need the unique record count on a particular column.
Please provide a sample input and expected output.
Input file:
9961881|~|20111229|~|000000218311635|~|1015104|~|000192170510|~|1|~|1|~||~|1|~|3755593|~|3755593|~|218311635
9961881|~|20111229|~|000000218311636|~|1015104|~|000192170510|~|1|~|1|~||~|1|~|3755593|~|3755593|~|218311636
9961881|~|20111229|~|000000218312203|~|1014486|~|000192174061|~|1021|~|1|~||~|1|~|90875|~|90875|~|218312203
9961881|~|20111229|~|000000218312204|~|1014486|~|000192174061|~|1267|~|1|~||~|1|~|90875|~|90875|~|218312204
9961881|~|20111229|~|000000218478637|~|1023353|~|000192465057|~|253|~|1|~||~|1|~|3755593|~|3755593|~|218478637
9961881|~|20111229|~|000000218478639|~|1023353|~|000192465057|~|801|~|1|~||~|1|~|3755593|~|3755593|~|218478639
9961881|~|20111229|~|000000218478640|~|1023353|~|000192465057|~|802|~|1|~||~|1|~|3755593|~|3755593|~|218478640
9961881|~|20111229|~|000000218478641|~|1023353|~|000192465057|~|253|~|1|~||~|1|~|3755593|~|3755593|~|218478641
9961881|~|20111229|~|000000218478642|~|1023353|~|000192465057|~|801|~|1|~||~|1|~|3755593|~|3755593|~|218478642
9961881|~|20111229|~|000000218478643|~|1023353|~|000192465057|~|802|~|1|~||~|1|~|3755593|~|3755593|~|218478643
I need the unique record count on the 4th field:
awk -FS"[|~|]+" ' {print $4}' test.dat|sort|uniq|wc -l
The output is 1, but it should be 3.
If I change the delimiter to | in the source file, the output is 3.
Klashxx
awk -F\~ '{print $4}' test.dat|sort|uniq|wc -l
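It's awk -F, not awk -FS.
With -FS"[|~|]+" awk takes everything after -F as the separator, so the separator becomes the regex S[|~|]+. No line contains an S, so each line stays a single field, $4 is always empty, and the unique count is 1.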
$ awk -F"[|~|]+" '{print $4}' filename | sort | uniq | wc -l
3
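To make the difference between the two options visible, here is a minimal sketch on a made-up three-line sample (the sample values and the variable names are hypothetical, not taken from the posted file). It assumes a POSIX shell and any standard awk.

```shell
# Hypothetical sample in the same '|~|' layout; column 4 holds two distinct values.
sample='a|~|b|~|c|~|100
a|~|b|~|c|~|200
a|~|b|~|c|~|100'

# -FS"[|~|]+" is read as -F with the value S[|~|]+ ; nothing matches,
# so $4 is empty on every line and only one unique (empty) value is left.
wrong=$(printf '%s\n' "$sample" | awk -FS"[|~|]+" '{print $4}' | sort -u | wc -l | tr -d ' ')

# -F"[|~|]+" splits on runs of | and ~ , so $4 is the real fourth column.
right=$(printf '%s\n' "$sample" | awk -F"[|~|]+" '{print $4}' | sort -u | wc -l | tr -d ' ')

echo "with -FS: $wrong, with -F: $right"
```

The tr -d ' ' is only there because some wc implementations pad the number with spaces.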
awk -F\~ '{print $4}' test.dat | sort -u | wc -l
--ahamed
Klashxx
A pure awk solution:
awk -F"[|~|]+" 'a[$4]==""{a[$4]=1;b++}END{print b}' test.dat
Or:
awk -F\~ '!a[$4]{a[$4]=1;b++}END{print b}' test.dat
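One caveat with the two separators above, sketched here on the first line of the posted input: "[|~|]+" matches a whole run of | and ~ characters, so the empty 8th field (the |~||~| part) is swallowed and later fields shift left by one. That does not affect $4, but it would matter for columns after the 8th. Matching the literal three-character delimiter, for example with '[|]~[|]', is one possible way to keep empty fields in place; treat it as a suggestion, not something taken from the posts above.

```shell
# First line of the posted input file.
line='9961881|~|20111229|~|000000218311635|~|1015104|~|000192170510|~|1|~|1|~||~|1|~|3755593|~|3755593|~|218311635'

# Runs of | and ~ are merged, so the empty 8th field disappears.
merged=$(printf '%s\n' "$line" | awk -F'[|~|]+' '{print NF}')

# Exactly one '|~|' per separator, so the empty field is kept.
exact=$(printf '%s\n' "$line" | awk -F'[|]~[|]' '{print NF}')

# Field 10 under the exact separator.
tenth=$(printf '%s\n' "$line" | awk -F'[|]~[|]' '{print $10}')

echo "merged: $merged fields, exact: $exact fields, \$10 = $tenth"
```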
Yet another one...
awk -F'~' '{a[$4]++}END{print length(a)}' infile
Use nawk on Solaris!
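If length(a) on an array is not accepted by the awk at hand (as far as I know it started out as an extension and some older awks reject it), the keys can be counted in a loop instead. A minimal sketch on a made-up sample; the sample data and the variable name count are hypothetical:

```shell
# Hypothetical sample; with -F'~' the 4th field still carries a leading '|',
# which does not change how many distinct values there are.
sample='x|~|x|~|x|~|1015104
x|~|x|~|x|~|1014486
x|~|x|~|x|~|1015104'

# Collect the 4th field as array keys, then count the keys in END.
count=$(printf '%s\n' "$sample" |
  awk -F'~' '{a[$4]++} END{n=0; for (k in a) n++; print n}')

echo "$count"
```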
--ahamed
$ awk -F"\|~\|" '{a[$4]++;next}END{for(i in a){print a[i],i}}' input.txt
6 1023353
2 1014486
2 1015104
Thanks a lot, guys, for all the answers and for your valuable time.