Find and count unique date values in a file based on position

Hello,

I need some sort of way to extract every date contained in a file, and count how many of those dates there are.

Here are the specifics:

  • The date format I'm looking for is mm/dd/yyyy
  • I only need to look after line 45 in the file (that's where the data begins)
  • The columns of data are separated by tabs, and I need to look at the 4th column

I'll take any advice I can get; I don't really know where to begin, aside from using the cut or grep commands.

Your expertise is greatly appreciated, thanks in advance! :)

Jay

Without any sample, try this (not tested):

awk -F'\t' -v patt='[0-9]{2}/[0-9]{2}/[0-9]{4}' 'NR > 45 {
    while (match($4, patt)) {
        c[substr($4, RSTART, RLENGTH)]++
        sub(patt, "", $4)
    }
}
END { for (i in c) print "date " i " : " c[i] " times" }' file
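Not having a real sample file either, here is a minimal way to try the idea out. The file demo.txt and its contents are made up for illustration, and the NR > 45 guard is dropped since the toy file has no 45-line header; the bracket expression is spelled out character by character because some awk implementations (notably older mawk) do not support the {2} interval syntax:

```shell
# Build a tiny tab-separated sample (made-up data, date in column 4).
printf 'a\tb\tc\t01/01/2012\nd\te\tf\t12/31/1999\na\tb\tc\t01/01/2012\n' > demo.txt

awk -F'\t' -v patt='[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]' '{
    while (match($4, patt)) {
        c[substr($4, RSTART, RLENGTH)]++   # tally each date found in column 4
        sub(patt, "", $4)                  # remove it so the loop can find another
    }
}
END { for (i in c) print "date " i " : " c[i] " times" }' demo.txt
```

This prints one line per distinct date (in arbitrary order), e.g. "date 01/01/2012 : 2 times".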

Here is one way:

cat file.txt
a b c 01/01/2012
d e f 12/31/1999
g h i 07/01/2011
j k l 01/01/2012

awk '{print $4}' file.txt | sort | uniq | wc -l
3

To start selecting fields at record 46, add the condition NR > 45 to the awk statement. Note that the pipeline above counts how many distinct dates there are; if you want a per-date tally instead, replace uniq | wc -l with uniq -c.
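Putting the pieces together, here is a tested sketch. The file sample.txt and its contents are invented for illustration; it simulates the stated layout of 45 header lines followed by tab-separated columns with the date in column 4:

```shell
# Build a hypothetical sample: 45 header lines, then tab-separated data.
{
    for i in $(seq 45); do echo "header line $i"; done
    printf 'a\tb\tc\t01/01/2012\n'
    printf 'd\te\tf\t12/31/1999\n'
    printf 'g\th\ti\t01/01/2012\n'
} > sample.txt

# Skip the header, take column 4, and tally each distinct date.
awk -F'\t' 'NR > 45 {print $4}' sample.txt | sort | uniq -c
```

sort must come before uniq -c, since uniq only collapses adjacent duplicate lines.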