Question on awk source files

JSKOBS · August 16, 2016, 12:25am

I'm repeating the same command to get the record count and file name from 4 different files, appending the results to the same output file.

awk 'END{print NR"|"FILENAME}' file.txt >> temp.txt; 
awk 'END{print NR"|"FILENAME}' asdf.txt >> temp.txt; 
awk 'END{print NR"|"FILENAME}' lkjh.txt >> temp.txt;
awk 'END{print NR"|"FILENAME}' bvcx.txt >> temp.txt

Is there any way to combine the above commands into a single line, so that I don't need to repeat the awk command multiple times?

MasWag · August 16, 2016, 12:53am

If you use GNU AWK, the following command will work well.

gawk 'ENDFILE{print FNR"|"FILENAME}' file.txt asdf.txt lkjh.txt bvcx.txt

JSKOBS · August 16, 2016, 1:02am

It looks like gawk is not available here:
sh: gawk: not found.

RavinderSingh13 · August 16, 2016, 3:32am

Hello JSKOBS,

It should also work with plain awk; could you please try the following?

awk '{print NR"|"FILENAME}' file.txt  asdf.txt  lkjh.txt  bvcx.txt   > temp.txt

NOTE: If you need line numbers that run from 1 up to the total number of lines across all input files, then NR as used above is fine. If you want the count to reset for every input file, use FNR in place of NR.
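To make the NR versus FNR distinction in the note concrete, here is a minimal sketch. The function name `show_counters` and the demo file names are made up for illustration; the script prints both counters as they stand on the last line of each file.

```shell
# Demo files in a temporary directory (names are illustrative only).
demo=$(mktemp -d)
printf 'a\nb\nc\n' > "$demo/first.txt"    # 3 lines
printf 'x\ny\n'    > "$demo/second.txt"   # 2 lines

# Hypothetical helper: report NR and FNR as of the last line of each file.
show_counters() {
    awk 'FNR == 1 && NR > 1 { print prev }   # a new file started: emit the previous file
         { prev = "NR=" NR " FNR=" FNR }     # remember both counters for the current line
         END { if (NR) print prev }' "$@"
}

show_counters "$demo/first.txt" "$demo/second.txt"
# NR=3 FNR=3
# NR=5 FNR=2
rm -rf "$demo"
```

NR keeps growing across files (3, then 5), while FNR restarts at 1 for each file, which is why per-file counts need FNR.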

Thanks,
R. Singh

JSKOBS · August 16, 2016, 3:44am

Hi Ravinder, I think I caused some confusion.
We need the record count for each file, like below, not a count for each line:

2342343|file.txt
762834623|asdf.txt  
23|lkjh.txt  
9098|bvcx.txt

pilnet101 · August 16, 2016, 3:53am

Try:

awk 'FNR==1&&p{print p};{p=FNR"|"FILENAME};END{print FNR"|" FILENAME}' file.txt  asdf.txt  lkjh.txt  bvcx.txt >> temp.txt
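For readers who find the one-liner above dense, here is a commented sketch of the same idea. It is a slight variation, not the original command: the wrapper name `count_per_file` is made up, and END prints the saved value instead of reading FNR and FILENAME again. Like the original, it only notices a file when it reads that file's first line, so files with zero records are skipped.

```shell
# Hypothetical wrapper around a commented version of the one-liner.
count_per_file() {
    awk '
        # First line of a new file: the previous file is finished, so print what was saved for it.
        FNR == 1 && p { print p }
        # Remember the running count and the name of the current file.
        { p = FNR "|" FILENAME }
        # After the last line of the last non-empty file, print its saved count.
        END { if (p) print p }
    ' "$@"
}

# Usage (file names taken from the thread):
#   count_per_file file.txt asdf.txt lkjh.txt bvcx.txt >> temp.txt
```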

RavinderSingh13 · August 16, 2016, 4:53am

Hello JSKOBS,

If you just want simple line counts, then you could use the wc command as follows.

wc -l Input_file1  Input_file2  Input_file3  Input_file4  | grep -v "total"

Output will be as follows.

  12 Input_file1
   4 Input_file2
   4 Input_file3
   5 Input_file4

If you want the output delimited with | and without the leading spaces shown in the output above, then you could do the following.

wc -l Input_file1  Input_file2  Input_file3  Input_file4  | awk '($0 ~ /total/){next}{sub(/^[[:space:]]+/,X,$0);sub(/[[:space:]]/,"|",$0);} 1'

Output will be as follows.

12|Input_file1
4|Input_file2
4|Input_file3
5|Input_file4

Thanks,
R. Singh

Akshay_Hegde · August 16, 2016, 8:10am

Some more awk

[akshay@localhost tmp]$ for i in test.csv debug.txt cruise_reports.csv; do wc -l $i; done 
1083 test.csv
24 debug.txt
1083 cruise_reports.csv

[akshay@localhost tmp]$ awk '{l[FILENAME]++}END{for(i in l)print l[i],i}' OFS='|' test.csv debug.txt cruise_reports.csv 
1083|cruise_reports.csv
1083|test.csv
24|debug.txt

pilnet101 · August 16, 2016, 8:29am

Another solution:

printf "%s|%s\n" $(wc -l file.txt  asdf.txt  lkjh.txt  bvcx.txt|sed '$d') >> temp.txt
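Two caveats with the command above seem worth noting: the unquoted `$(...)` is split on whitespace, so a file name containing spaces breaks the count/name pairing, and when only one file is given wc prints no total line, so `sed '$d'` removes the only result. A minimal sketch that sidesteps both, assuming a POSIX shell (the function name `count_with_wc` is made up here):

```shell
# Hypothetical helper: one wc call per file, so there is no "total" line to strip.
count_with_wc() {
    for f in "$@"; do
        # Reading from stdin keeps the file name out of the wc output;
        # the arithmetic expansion strips the leading padding some wc versions print.
        printf '%s|%s\n' "$(( $(wc -l < "$f") ))" "$f"
    done
}

# Usage (file names taken from the thread):
#   count_with_wc file.txt asdf.txt lkjh.txt bvcx.txt >> temp.txt
```

This runs wc once per file, which is slower for very long file lists but keeps each count tied to its own name.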

Akshay_Hegde · August 16, 2016, 8:31am

I think it shows the total line as well.

pilnet101 · August 16, 2016, 9:21am

Thanks Akshay - Amended now!

JSKOBS · August 17, 2016, 1:18am

Hi Akshay, thanks for the reply.
There is an issue with your command: it does not list the files that have zero records.

awk '{l[FILENAME]++}END{for(i in l)print l[i],i}' OFS='|' test.csv debug.txt cruise_reports.csv

Akshay_Hegde · August 17, 2016, 1:42am

jskobs:

Hi Akshay, thanks for the reply.
There is an issue with your command: it does not list the files that have zero records.
awk '{l[FILENAME]++}END{for(i in l)print l[i],i}' OFS='|' test.csv debug.txt cruise_reports.csv

Oh, thanks, I didn't test that case. Meanwhile, you may try what pilnet101 suggested, or this:

[akshay@localhost tmp]$ grep -c  ^ test.csv mm.csv debug.txt cruise_reports.csv
test.csv:1083
mm.csv:0
debug.txt:24
cruise_reports.csv:1083

[akshay@localhost tmp]$ grep -c  ^ test.csv mm.csv debug.txt cruise_reports.csv | awk 'BEGIN{FS=":";OFS="|"}{print $2,$1}'
1083|test.csv
0|mm.csv
24|debug.txt
1083|cruise_reports.csv
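If a single plain awk call is still preferred (no gawk, no grep), the zero-record problem raised earlier in the thread can be handled by seeding the counts from ARGV. This is only a sketch under stated assumptions: every operand is a readable file name (no `var=value` assignments), no file is listed twice, and the function name `count_all` is made up for illustration.

```shell
# Hypothetical helper: per-file record counts in command-line order, empty files included.
count_all() {
    awk '
        BEGIN {
            # Start every operand at zero so files with no records still get a row.
            for (i = 1; i < ARGC; i++) count[ARGV[i]] = 0
        }
        # Count each record under the name of the file it came from.
        { count[FILENAME]++ }
        END {
            # Walk ARGV rather than "for (name in count)" to keep command-line order.
            for (i = 1; i < ARGC; i++) print count[ARGV[i]] "|" ARGV[i]
        }
    ' "$@"
}

# Usage (file names taken from the thread):
#   count_all test.csv mm.csv debug.txt cruise_reports.csv
```

Walking ARGV in END also avoids the arbitrary ordering that `for (i in l)` produced in the earlier output.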