Counting similar lines in a file (UNIX)

mohsin.quazi · August 23, 2009, 7:53am

I have a file which contains data as below:

nbk1j7o pageName=/jsp/RMBS/RMBSHome.jsf
nbk1j7o pageName=/jsp/RMBS/RMBSHome.jsf
nbk1j7o pageName=/jsp/RMBS/RMBSHome.jsf
nbk1j7o pageName=/jsp/RMBS/RMBSHome.jsf
nbk1j7o pageName=/jsp/common/index.jsf
nbk1j7o pageName=/jsp/common/index.jsf
nbk1wqe pageName=/jsp/RMBS/RMBSHome.jsf
nbk1wqe pageName=/jsp/common/index.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf
nbk2coz pageName=/jsp/RMBS/PassThrough.jsf

I want output like:

--------------------------------------------------
NBKID    PAGE ACCESSED                       COUNT
--------------------------------------------------
nbk1j7o  pageName=/jsp/RMBS/RMBSHome.jsf     4
nbk1j7o  pageName=/jsp/common/index.jsf      2
nbk1wqe  pageName=/jsp/RMBS/RMBSHome.jsf     1
nbk1wqe  pageName=/jsp/common/index.jsf      1
nbk2coz  pageName=/jsp/RMBS/PassThrough.jsf  7

In short, I want to collapse identical lines into a single entry and append the number of times each line occurs.

danmero · August 23, 2009, 8:21am

To keep the forums high quality for all users, please take the time to format your posts correctly.

Use Code Tags when you post any code or data samples so others can easily read your code.
You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags by hand.)
Avoid adding color or different fonts and font sizes to your posts.
Selective use of color to highlight a single word or phrase can be useful at times, but using color in general makes the forums harder to read, especially bright colors like red.
Be careful when you cut and paste: edit out any odd characters and make sure all links are working properly.

Thank You.

The UNIX and Linux Forums

awk '{a[$0]++}END{for(i in a)print i,a[i]}' file | sort

durden_tyler · August 23, 2009, 9:45am

A slightly different take on the problem, where counting of occurrences is done in the shell itself:

sort data.txt | uniq -c | perl -ane 'printf("%s %s %s\n",$F[1],$F[2],$F[0])'

tyler_durden

mohsin.quazi · August 23, 2009, 2:10pm

Thanks for the help, but the output is a little unformatted. I wanted these values in separate, aligned columns so that they look good.

vgersh99 · August 23, 2009, 5:41pm

Please define 'looking good'. Please use code tags when posting data/code samples.

b33713 · August 24, 2009, 9:53am

The "looking good" version, lol:

sort -t= -k2 file | uniq -c | awk '{printf "%s %-40s%d\n",$2,$3,$1}'
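If the header from the original request is wanted as well, the same pipeline can be wrapped in a small function. This is only a sketch: `count_pages` is a made-up name, the column widths (10 and 40) are assumptions picked to fit the sample data, and it assumes neither field contains whitespace.

```shell
# Sketch only: count_pages is a hypothetical helper, not part of any tool.
# Column widths 10 and 40 are assumptions that fit the sample data.
count_pages() {
    printf '%-10s %-40s %s\n' 'NBKID' 'PAGE ACCESSED' 'COUNT'
    # sort puts identical lines next to each other, uniq -c prefixes each
    # group with its count, and awk moves that count to the last column.
    sort "$1" | uniq -c | awk '{ printf "%-10s %-40s %d\n", $2, $3, $1 }'
}

# Small sample in the same format as the data in the first post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
nbk1j7o pageName=/jsp/RMBS/RMBSHome.jsf
nbk1j7o pageName=/jsp/RMBS/RMBSHome.jsf
nbk1j7o pageName=/jsp/common/index.jsf
nbk1wqe pageName=/jsp/RMBS/RMBSHome.jsf
EOF

count_pages "$sample"
rm -f "$sample"
```

Unlike the `sort -t= -k2` variant, a plain `sort` orders the rows by ID first and page second, which matches the order shown in the requested output.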

summer_cherry · August 24, 2009, 10:35pm

 awk '{ _[$0]++ }                # count each distinct line
        END {
        for (i in _) {
        print i " " _[i]         # the line, then its count
        }
        }' file
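One caveat with the awk-only approach: the order in which `for (i in _)` visits the array is unspecified, so the rows can come out in any order. A minimal sketch that pads the columns and pipes the result through `sort` is shown here; `count_lines_awk` is a made-up name, the widths are assumptions, and each line is assumed to hold exactly two whitespace-separated fields.

```shell
# Sketch only: count_lines_awk is a hypothetical helper name.
# Column widths 10 and 40 are assumptions that fit the sample data.
count_lines_awk() {
    awk '
        { seen[$0]++ }                     # count each distinct line
        END {
            for (line in seen) {
                split(line, part)          # part[1] = ID, part[2] = page
                printf "%-10s %-40s %d\n", part[1], part[2], seen[line]
            }
        }' "$1" | sort                     # for-in order is unspecified
}
```

Call it as `count_lines_awk file`. The input does not need to be sorted first, because the counts are accumulated in the array rather than by comparing neighbouring lines.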