Count the words starting with 3-

OS : Oracle Linux 6.5
Shell : bash

I have a file whose contents look like the sample below. I want to count the number of occurrences of strings starting with 3-.
How can I do this? I couldn't word-wrap the line below, so it looks long.

'3-90892405251', '3-90892911050', '3-90893144163', '3-90893078309', '3-90893078518', '3-90892991593', '3-90892799260', '3-90893142987', '3-90897712667', '3-90897799595', '3-90897854949', '3-90897855168', '3-90897979833', '3-90897809578', '3-90898043808', '3-90897876904', '3-90897675968', '3-90898094426', '3-90897361488', '3-90898143959', '3-90897726122', '3-90897967169', '3-90898100366', '3-90898157920', '3-90898123121', '3-90898017415', '3-90898017800', '3-90897736679', '3-90898239030', '3-90898203678', '3-90898264512', '3-90892204001',
.
.
.
.

---------- Post updated at 09:35 AM ---------- Previous update was at 09:15 AM ----------

The following attempts gave me the wrong output:

 
  
 $ wc -w '3-*' SomePattern.txt
wc: 3-*: No such file or directory
 23441 SomePattern.txt
 23441 total
  
 $ wc -w '3-' SomePattern.txt
wc: 3-: No such file or directory
 23441 SomePattern.txt
 23441 total
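
For context on the errors above: wc treats every argument as a file name, so '3-*' and '3-' are looked up as files, and the 23441 is simply the word count of the whole file. A minimal sketch of one way to count only the matching strings, assuming a grep that supports -o (GNU grep does) and using a made-up sample file rather than the real SomePattern.txt:

```shell
# Build a small sample file in the same format (hypothetical data).
sample=$(mktemp)
printf "%s\n" "'3-111', '3-222', '4-333', '3-111'," > "$sample"

# wc only counts; it does not filter. grep -o prints each match on its
# own line, so counting those lines gives the number of occurrences.
count=$(grep -o "'3-[^']*'" "$sample" | wc -l)
echo "$count"

rm -f "$sample"
```

The pattern includes the opening quote so that only values beginning with 3- are matched.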
 

Can you try like the following ?

awk -F\' '{for(i=1;i<=NF;i++) if ($i ~ "3-") a++ } END { print a+0 } file

When I tried to execute the following, I got the > character on the next lines. I think the command is incomplete.
What is the technical term for the > character in this case?

$ awk -F\' '{for(i=1;i<=NF;i++) if ($i ~ "3-") a++ } END { print a+0 } SomePattern.txt
>
>
>
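
The > here is the shell's secondary (continuation) prompt, stored in the PS2 variable. bash prints it when the command typed so far is incomplete, for example when a quote has not been closed. A minimal sketch of the same condition in non-interactive form, assuming bash is on the PATH (the echo command is only an illustration):

```shell
# An unterminated single quote leaves the command incomplete. Interactively,
# bash would show the PS2 prompt and wait for more input; with -c there is
# no more input, so it reports a syntax error and exits non-zero instead.
if bash -c "echo 'unterminated" 2>/dev/null; then
    result="complete"
else
    result="incomplete"
fi
echo "$result"
```

At an interactive prompt, pressing Ctrl-C abandons the incomplete command and returns to the normal prompt.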

The closing single quote is missing. It goes just before the file name:

$ awk -F\' '{for(i=1;i<=NF;i++) if ($i ~ "3-") a++ } END { print a+0 }' SomePattern.txt

Hello John K,

Could you please try the following and let me know if it helps?

awk '{num=gsub(/3-/,X,$0);print num}'   Input_file

Hello greet_sed,

The solution could be changed as follows.

awk -F"'" '{for(i=1;i<=NF;i++) if ($i ~ /3-/) a++ } END { print a+0 }'  Input_file

Thanks,
R. Singh
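
One detail worth noting about the commands in this thread: the tests $i ~ "3-" and $i ~ /3-/ match 3- anywhere in a field, not only at the start. A minimal sketch of the difference, using made-up sample data that includes a value such as 13-222 (the real file may not contain any):

```shell
# Hypothetical sample: one value contains "3-" without starting with it.
sample=$(mktemp)
printf "%s\n" "'3-111', '13-222', '3-333'," > "$sample"

# Unanchored: counts every field that contains "3-" somewhere.
loose=$(awk -F"'" '{for(i=1;i<=NF;i++) if ($i ~ /3-/) a++} END {print a+0}' "$sample")

# Anchored with ^: counts only fields that start with "3-".
anchored=$(awk -F"'" '{for(i=1;i<=NF;i++) if ($i ~ /^3-/) a++} END {print a+0}' "$sample")

echo "$loose $anchored"
rm -f "$sample"
```

If every value in the file really does begin with 3- or with some other prefix that never contains 3-, both forms give the same count.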


Thank You very much Ravinder
Thank you greet_sed, JimMcnamara

---------- Post updated at 10:48 AM ---------- Previous update was at 10:41 AM ----------

One more question:
Is there a way I could skip duplicates (strings starting with 3-) while doing the count ?

sed 's/ *, */\n/g' infile | grep "'3-" | sort -u | wc -l
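
The same distinct count can also be sketched in awk alone, without sorting. This assumes every value is wrapped in single quotes as in the sample, so that with ' as the field separator the values land in the even-numbered fields; the sample data and the array name seen are made up for illustration:

```shell
# Hypothetical two-line sample with repeated values.
sample=$(mktemp)
printf "%s\n" "'3-111', '3-222', '3-111'," "'3-222', '3-333', '4-444'," > "$sample"

# Visit only the quoted values (fields 2, 4, 6, ...). seen[] remembers each
# value, so a value is counted only the first time it appears.
unique=$(awk -F"'" '{for(i=2;i<=NF;i+=2) if ($i ~ /^3-/ && !seen[$i]++) n++} END {print n+0}' "$sample")
echo "$unique"

rm -f "$sample"
```

Unlike sort -u, this keeps every distinct value in memory, which is usually fine for a file of a few tens of thousands of entries.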

Hello John K,

Do you mean that you only need to know whether the string is present in a record/line or not? If we remove the duplicates, the output will be either 1 or 0 for each line. If that is the case, the following may help you.

awk '{num=sub(/3-/,X,$0);print num}'   Input_file

If the above doesn't meet your requirements, then kindly provide a sample Input_file along with the expected output.

Thanks,
R. Singh

Thank You very much rdrtx1