awk or other way to count occurrences of the 7th character

Hi all,
I am looking to filter lines by their 7th character and list the number of occurrences of each, when that character is p, d, o or m.

  1. if 7th character is p , Output should be: p_hosts = N
  2. if 7th character is d , Output should be: d_hosts = N
  3. if 7th character is o , Output should be: o_hosts = N
  4. if 7th character is m, Output should be: m_hosts = N

Sample input:

prm1hcppb240
prm1hcppb220
prm1hcppb212
prm1hcppb211
prm1hcppb410
cmi1hcmpb282
cmi1hcmpb033
cmi1hcmpb230
prm1hcppb022
prm1hcpmp203
prm1hcppb303
prm1hcppb290
prm1hcppb250
prm1hcppb241
prm1hcppb352
cmi1hcopb450
cmi1hcmmp002
prm1hcpbk302
prm1hcpbk001
cmi1hcmbk203
cmi1hcpbk201
cmi1hvmmp011
cmi1hvmmp101
cmi1hvppb318
cmi1hcmpb502
brc1scpdb0112
brc1scddb0122

Desired output should be:

p_hosts = N
d_hosts = N
o_hosts = N
m_hosts = N

Thanks in advance.

---------- Post updated at 10:56 AM ---------- Previous update was at 10:51 AM ----------

What I have tried:

substr($1,7,1)

but I could not get it to filter.

I'm pretty sure that's only a snippet from your attempt. Post it in its entirety.

What if that char is NOT p , d , o , or m ?


Maybe something like:

awk '
{	h[substr($1, 7, 1)]++
}
END {	for(i in h)
		printf("%s_hosts = %d\n", i, h[i])
}' file

producing the output:

d_hosts = 1
m_hosts = 8
o_hosts = 1
p_hosts = 17

with your sample input (although the order of the output lines may vary depending on the version of awk you use).

If you are trying this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk .
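
If the order of the output lines matters, one option is to loop over a fixed list of characters in END instead of using for (i in h). The following is only a minimal sketch, assuming the order p, d, o, m from the question; the temporary file and the shortened sample data are stand-ins for the real input file:

```shell
# Hypothetical sample: a few of the host names from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
prm1hcppb240
cmi1hcmpb282
cmi1hcopb450
brc1scddb0122
prm1hcppb220
EOF

# Count by 7th character, then print in a fixed order (p, d, o, m).
# h[...] + 0 yields a numeric 0 for characters that never occurred.
fixed_order=$(awk '
{ h[substr($1, 7, 1)]++ }
END {
    n = split("p d o m", want, " ")
    for (k = 1; k <= n; k++)
        printf("%s_hosts = %d\n", want[k], h[want[k]] + 0)
}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$fixed_order"
```

This also prints a line with a count of 0 for any of the four characters that does not appear in the input.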

awk '
{$0 ~ /^......[dpom]/ ? a[substr($0, 7, 1)]++ : x++}
END {for (i in a) print i "_hosts = " a[i]; print "Non_p_d_o_m_hosts = " (x + 0)} ' infile

awk ' /^......[pdmo]/ {a[substr($1,7,1)]++} END {for (i in a) print i "_hosts = " a[i]}' infile

Moderator comments were removed during original forum migration.

Similar to what has been posted so far but allowing to count occurrences of undesired characters:

awk '
        {TMP = substr ($0, 7, 1)
         if (TMP ~ "[" SRCH "]") C[TMP"_hosts"]++
         else                    C["None   "]++
        }
END     {for (c in C) print c, "=", C[c]
        }
' SRCH="pdo" file

d_hosts = 1
None    = 8
p_hosts = 17
o_hosts = 1

Hi all, thanks for the responses. Don, Rudi C, thanks to you both.

Rudi C, yes, that was the only snippet I had. I went ahead and came up with this:

count.awk

{if(substr($1,7,1)=="p"){p++}}END{print "p_hosts = "p}
{if(substr($1,7,1)=="d"){d++}}END{print "d_hosts = "d}
{if(substr($1,7,1)=="o"){o++}}END{print "o_hosts = "o}
{if(substr($1,7,1)=="m"){m++}}END{print "m_hosts = "m}
{if((substr($1,7,1)!~"p") && (substr($1,7,1)!~"d") &&  (substr($1,7,1)!~"o") &&  (substr($1,7,1)!~"m")  ){n++}}END{print "Non_p_d_o_m_hosts = "n+0}

Execution:

$ awk -f count.awk datafile 
p_hosts = 17
d_hosts = 1
o_hosts = 1
m_hosts = 8
Non_p_d_o_m_hosts = 0

That generated the counts for each pattern. Thanks, all.

Well, after rveri posted his attempt, all hidden / unapproved posts were unhidden / approved.

A quick one on the command line:

cut -c7 file | sort | uniq -c
   1 d
   8 m
   1 o
  17 p
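
To get from that pipeline to the X_hosts = N format asked for in the question, the uniq -c output could be passed through a small awk step. A rough sketch, assuming every input line has at least 7 characters; the temporary file and the shortened sample data are stand-ins for the real file:

```shell
# Hypothetical sample: a few of the host names from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
prm1hcppb240
cmi1hcmpb282
cmi1hcopb450
brc1scddb0122
prm1hcppb220
EOF

# uniq -c prints "<count> <char>"; awk swaps the two fields into the
# requested "char_hosts = count" layout.
formatted=$(cut -c7 "$tmp" | LC_ALL=C sort | uniq -c |
    awk '{ print $2 "_hosts = " $1 }')
rm -f "$tmp"

printf '%s\n' "$formatted"
```

The lines come out in sorted order (d, m, o, p) rather than the p, d, o, m order listed in the question.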