Parsing a file that contains URLs from different sites

Hi

I have a file that contains millions of URLs from different sites. It has 4,000,000 lines.

http://www.chipchick.com/2009/09/usb_hand_grenade.html
http://www.engadget.com/page/5
http://www.mp3raid.com/search/download-mp3/20173/michael_jackson_fall_again_instrumental.html
http://www.myacrobatpdf.com/8713/canon-speedlite-430ex-manual.html
http://www.mobileheart.com/cell-phone-screensavers/1167-Sony-Ericsson-W200-Screensavers.aspx
http://www.india-forums.com/forum_posts.asp?TID=1256207&TPN=2
http://gallery.mobile9.com/f/923680
http://www.phoronix.com/scan.php?page=article&item=xorg_vdpau_vaapi&num=1
http://www.experts-exchange.com/Software/Photos_Graphics
http://www.jigzone.com/mpc/expired.php
http://ultimatetop200.com/
http://www.mp3raid.com/search/for/the_maine/4.html
http://gallery.mobile9.com/f/907594?view=download
http://gallery.mobile9.com/f/907594
http://www.imdb.com/title/tt0813715/board/thread/147969365
http://www.imdb.com/name/nm0002028

I want a command or some code that gives me the count of URLs for each individual site, e.g. imdb, experts-exchange, gallery.mobile9.

With GNU AWK you can do something like this:

gawk -F'http://(www\\.)?|/' '!_[$2]++{print $2}' infile

Otherwise use Perl:

perl -nle'
  # print each host the first time it is seen; %_ records hosts already printed
  print $1 unless $_{(m|http://(?:www\.)?([^/]*)|)[0]}++
' infile
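
Note that both one-liners above print each site once rather than a total per site. If per-site counts are what you are after, a minimal sketch might look like the one below. It assumes one URL per line and treats www.example.com and example.com as the same site; the function name count_hosts is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: count URLs per host, most frequent first.
# Reads the files given as arguments, or standard input if none are given.
count_hosts() {
  awk -F/ '
    /^https?:\/\// {
      host = $3                  # with "/" as separator, field 3 is the host
      sub(/^www\./, "", host)    # assumption: a leading www. is not significant
      count[host]++
    }
    END { for (h in count) print count[h], h }
  ' "$@" | sort -k1,1nr -k2,2
}
```

You would run it as `count_hosts infile`. It makes a single pass over the file and keeps one counter per distinct host, so 4,000,000 lines should not be a problem as long as the number of different sites is reasonable.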


To keep the forums high quality for all users, please take the time to format your posts correctly.

First of all, use Code Tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags [code] and [/code] by hand.)

Second, avoid adding color or different fonts and font size to your posts. Selective use of color to highlight a single word or phrase can be useful at times, but using color, in general, makes the forums harder to read, especially bright colors like red.

Third, be careful when you cut and paste: edit out any odd characters and make sure all links are working properly.

Thank You.

The UNIX and Linux Forums

Hey, thanks a lot!