Count the number of string occurrences to display 0 entries in output

liketheshell · August 31, 2017, 2:57pm

Hello Friends,

Can somebody assist with an issue I am having? I have a separate file with a list of account IDs:

XXX200B02Y01
XXX200B03Y01
XXX200B05Y01
XXX200B07Y01
XXX200B08Y01

I read that file into a variable and run egrep against the log files in a dated directory:

AccountID=$(cat /home/resource/frodo/tmp/al/ETakers)

/raid/test/`date +%Y/%m/%d`/test/orders/* | /bin/egrep "$AccountID" |grep SUBMITTED | sed 's/.*UserId:\(.*\),Id.*/\1/g' |sort |uniq -c >> /home/resource/frodo/tmp/al/scripts/XXXordertestCOPY.txt

Now getting the correct count is not the issue.

40    XXX200B02Y01
58    XXX200B03Y01
953    XXX200B05Y01
737    XXX200B07Y01
1702    XXX200B10Y1
1028    XXX200B30Y01
1557    XXX200C02
40    XXX200D02Y01
58    XXX200D03Y01
952    XXX200D05Y01
735    XXX200D07Y01
1694    XXX200D10Y01
1026    XXX200D30Y01
939    XXX200E05Y01
753    XXX200E07Y01
1130    XXX200E10Y01
1029    XXX200E30Y01
1020    XXX201B01
837    XXX201B02
965    XXX201B03
415    XXX202B01
415    XXX202B02
415    XXX202B03
307    XXX203B02
642    XXX203B03
309    XXX203C01
518    XXX203C02

The issue I'm having is getting the account IDs that do not match printed as well, with a count of 0. For example:

0 XXX200B07Y01
0 XXX200B08Y01
642    XXX203B03
309    XXX203C01
518    XXX203C02

Do I need to add a counter and set it to 0 somewhere?

Your assistance is greatly appreciated.

Al

RudiC · August 31, 2017, 6:39pm

Not understanding what you're doing in your code, I guess that this might come close to what you want:

awk 'NR == FNR {S[$1]; next} $1 in S {S[$1]++} {T[$1]} END {for (s in S) print S[s], s; for (t in T) print 0, t}' file1 file2
1 XXX200B07Y01
 XXX200B08Y01
7 XXX200B02Y01
6 XXX200B03Y01
3 XXX200B05Y01
0 XXX200E07Y01
0 XXX200D07Y01
0 XXX200B07Y01
0 XXX202B01
0 XXX200B10Y1
0 XXX202B02
0 XXX202B03
0 XXX200E10Y01
0 XXX200D10Y01
0 XXX200E30Y01
0 XXX200D30Y01
0 XXX200B30Y01
0 XXX203B02
0 XXX203B03
0 XXX200B02Y01
0 XXX200D02Y01
0 XXX203C01
0 XXX203C02
0 XXX200B03Y01
0 XXX200D03Y01
0 XXX200E05Y01
0 XXX200B05Y01
0 XXX200D05Y01
0 XXX200C02
0 XXX201B01
0 XXX201B02
0 XXX201B03
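
Another way to read the question is that the existing `uniq -c` output is fine and only the missing IDs need a 0 line. A minimal sketch of that reading follows, assuming the ID list has one ID per line and the counts file has the `count ID` layout shown in the first post. The function name `fill_zero_counts` and the temporary files are made up for illustration.

```shell
# Sketch only: fill_zero_counts is a hypothetical helper, not from the thread.
fill_zero_counts() {
  # $1: ID list, one ID per line
  # $2: "uniq -c" style output (count first, ID second)
  awk '
    NR == FNR { C[$2] = $1; next }              # counts file: remember each count by ID
    { n = ($1 in C) ? C[$1] : 0; print n, $1 }  # ID list: print stored count, or 0
  ' "$2" "$1"
}

# Demo with the IDs and counts quoted in the first post.
tmp=$(mktemp -d)
printf '%s\n' XXX200B02Y01 XXX200B03Y01 XXX200B05Y01 XXX200B07Y01 XXX200B08Y01 > "$tmp/ids"
printf '%s\n' '40    XXX200B02Y01' '58    XXX200B03Y01' \
              '953    XXX200B05Y01' '737    XXX200B07Y01' > "$tmp/counts"
fill_zero_counts "$tmp/ids" "$tmp/counts"
rm -rf "$tmp"
```

The output follows the order of the ID list, so XXX200B08Y01 comes out last with a count of 0. Note that `NR == FNR` only identifies the first file while that file is non-empty.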

MadeInGermany · September 1, 2017, 9:35am

You have not shown a sample of your input files in orders/.
Your problem is that grep only passes on the lines that match, so uniq -c can only report IDs that occur at least once.
awk is certainly a better tool for this.
Guessing you omitted a cat, and by your output it looks like sed cuts out the IDs:

sed -n '/SUBMITTED/ s/.*UserId:\(.*\),Id.*/\1/p'  /raid/test/`date +%Y/%m/%d`/test/orders/* |
awk '
  { if (FILENAME!="-") { S[$1]=0 } else { S[$1]++ } }
  END { for (s in S) print S[s], s }
' /home/resource/frodo/tmp/al/ETakers -
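
To try the idea without the real logs, here is a small self-contained sketch. The log line format is an assumption inferred from the sed expression (`UserId:<id>,Id...` on lines containing SUBMITTED), and the function name `count_submitted` is made up. Unlike the command above, it counts only IDs that appear in the list, which is roughly what the egrep in the original pipeline did, and it uses `NR == FNR` rather than FILENAME to tell the list from standard input.

```shell
# Sketch only: count_submitted and the sample log format are hypothetical.
count_submitted() {
  # $1: ID list file; remaining arguments: log files
  ids=$1; shift
  sed -n '/SUBMITTED/ s/.*UserId:\(.*\),Id.*/\1/p' "$@" |
  awk '
    NR == FNR { S[$1] = 0; next }   # ID list: start every count at 0
    $1 in S   { S[$1]++ }           # stdin: count only IDs from the list
    END       { for (s in S) print S[s], s }
  ' "$ids" - | sort -k2             # for-in order is unspecified, so sort by ID
}

# Demo with made-up log lines.
tmp=$(mktemp -d)
printf '%s\n' XXX200B07Y01 XXX200B08Y01 XXX203C02 > "$tmp/ids"
printf '%s\n' \
  'order SUBMITTED UserId:XXX203C02,Id:1' \
  'order SUBMITTED UserId:XXX203C02,Id:2' \
  'order REJECTED UserId:XXX200B07Y01,Id:3' \
  'order SUBMITTED UserId:XXX999Z99,Id:4' > "$tmp/log"
count_submitted "$tmp/ids" "$tmp/log"
rm -rf "$tmp"
```

With this sample the two B IDs are reported with 0 and XXX203C02 with 2; the unlisted XXX999Z99 is ignored.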