Welcome to the forums; I hope you enjoy learning here. Could you please try the following and let me know if it helps (assuming you are not bothered about the order of the output)?
awk -F"|" '{A[$NF]++} END{for(i in A){print i, A[i]}}' Input_file
Output will be as follows.
BB 3
AA 2
If you need the output in the same sequence as Input_file, then try the following (this solution reads Input_file twice).
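The two-pass idea can be sketched roughly as below. This is only an illustration, not necessarily the exact command that was posted: the sample data and the temporary directory are made up, and it assumes the group key is the last "|"-separated field, as in the command above.

```shell
# Hypothetical sample data; the last field is the group key.
tmp=$(mktemp -d)
cat > "$tmp/Input_file" <<'EOF'
1|x|BB
2|y|AA
3|z|BB
4|w|AA
5|v|BB
EOF

# Pass 1 (FNR==NR): count every key.
# Pass 2: print each key the first time it is seen, then delete it
# so it is not printed again. This keeps the order of Input_file.
result=$(awk -F"|" 'FNR==NR{A[$NF]++; next} A[$NF]{print $NF, A[$NF]; delete A[$NF]}' "$tmp/Input_file" "$tmp/Input_file")
echo "$result"
```

Passing the same file name twice is what makes awk read it two times; FNR==NR is only true while the first copy is being read.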
Thanks for the responses! I've gone for the more straightforward cut option, as it provides the desired output.
However, I forgot to add to this question: is there a way of refining this output further so that it only counts a record of a group where field four is "1" and field five is "Y"? I realise that in the example supplied it would still show the exact same count.
The command executes without any issues, but it doesn't seem to have altered the file (it is identical to how it was before the command ran). Is this perhaps because my data includes literal double quotes?
I tried escaping them, but it doesn't seem to make any difference.
You can thank a person by hitting the THANKS button at the left of every post.
The awk command above will not write its output back into Input_file; you should try the following to do that.
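One common way to get awk output back into the same file is to write to a temporary file and then move it over the original. A minimal sketch, where the awk program, the sample data and the ".tmp" file name are all just placeholders for illustration:

```shell
# Hypothetical sample data.
tmp=$(mktemp -d)
file="$tmp/Input_file"
cat > "$file" <<'EOF'
a|b|c|1|Y|AA
a|b|c|0|Y|AA
a|b|c|1|Y|BB
EOF

# Placeholder awk program: keep only lines where field 4 is 1 and field 5 is Y.
# Write to a temporary file first; replace the original only if awk succeeded.
awk -F"|" '$4 == "1" && $5 == "Y"' "$file" > "$file.tmp" && mv "$file.tmp" "$file"

result=$(cat "$file")
echo "$result"
```

Redirecting straight to Input_file (awk ... Input_file > Input_file) would truncate the file before awk reads it, which is why the temporary file is needed.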
My rule of thumb is that when the specific content of a specific field needs to be examined, manipulated, etc., I reach for awk first (perl second) because its field-separating facilities are very good.
If you can guarantee that the contents of fields 4 and 5 do not occur in any other field of a line, then you may be able to use the suggestion from rovf to use grep, because grep considers the content of the entire line without regard to fields. Otherwise, an awk solution seems like the best approach.
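As a rough sketch of the awk approach, the count can be restricted to records where field 4 is 1 and field 5 is Y. The six-field layout and the sample data below are assumptions for illustration; the group key is taken to be the last field.

```shell
# Hypothetical sample data: field 4 and field 5 are the filter fields,
# the last field is the group key.
tmp=$(mktemp -d)
cat > "$tmp/Input_file" <<'EOF'
a|b|c|1|Y|AA
a|b|c|0|Y|AA
a|b|c|1|N|BB
a|b|c|1|Y|BB
a|b|c|1|Y|BB
EOF

# Count a record only when both conditions hold.
# If the fields contain literal double quotes, the test would instead be
#   $4 == "\"1\"" && $5 == "\"Y\""
# The sort is only there because the order of "for (i in A)" is unspecified.
result=$(awk -F"|" '$4 == "1" && $5 == "Y" {A[$NF]++} END{for(i in A){print i, A[i]}}' "$tmp/Input_file" | sort)
echo "$result"
```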
Given that the field *separators* are unique, we can use grep even if the same field contents already occur earlier in the line. We just have to use extended regular expressions and anchor our search at the start of the line.
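For example, assuming "|" as the separator and the same made-up six-field layout, an anchored pattern could look like the sketch below. It skips exactly three fields from the start of the line and then requires 1 in field 4 and Y in field 5.

```shell
# Hypothetical sample data; the second line has "1|Y" early in the line
# but not in fields 4 and 5, so it must not match.
tmp=$(mktemp -d)
cat > "$tmp/Input_file" <<'EOF'
a|b|c|1|Y|AA
1|Y|c|0|N|AA
a|b|c|1|N|BB
a|b|c|1|Y|BB
EOF

# ^([^|]*\|){3}  skips fields 1 to 3, each followed by its separator
# 1\|Y           field 4 is exactly 1, field 5 starts with Y
# (\||$)         field 5 ends right after the Y
result=$(grep -E '^([^|]*\|){3}1\|Y(\||$)' "$tmp/Input_file")
echo "$result"
```

The matching lines could then be piped into whatever counting step is already in use.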
You should do a forum search before opening a thread. Your exact problem (and some of the suggested solutions) was discussed at length in this thread. It is called a "control break".
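For anyone unfamiliar with the term, a control break can be sketched in awk as below. This is only an illustration with made-up data, not the solution from that thread, and it assumes records with the same key are already adjacent (for example, because the file is sorted on the key).

```shell
# Hypothetical sample data, already grouped by the last field.
tmp=$(mktemp -d)
cat > "$tmp/Input_file" <<'EOF'
1|x|BB
3|z|BB
5|v|BB
2|y|AA
4|w|AA
EOF

# When the key differs from the previous record (the "break"),
# print the finished group and reset the counter.
result=$(awk -F"|" '
  NR > 1 && $NF != prev {print prev, n; n = 0}
  {prev = $NF; n++}
  END {if (NR) print prev, n}   # flush the last group
' "$tmp/Input_file")
echo "$result"
```

Unlike the array-based commands, this keeps only one counter in memory, but it gives wrong counts if equal keys are not adjacent.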