Calculating Total and Averages with awk Commands & Scripts

Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted!

  1. The problem statement, all variables and given/known data:
    Write an awk script(company.awk) for the workers file to find the number of workers of each departman, total salary that is the sum of the salaries of each departmant workers, total salary of all departments and total salary average.(Total average should be calculated by dividing total salary to the total number of workers)

  2. Relevant commands, code, scripts, algorithms:
    Workers file;

Alisan Baltaci, 23, Engineer, CUS, 1, 1800
Kadir Yenigun, 25, Technician, CIS, 2, 1000
Hande Gunay, 21, Secretary, 1, 850
Baran Tiryaki, 29, Engineer, SRT, 4, 2500
Bahar Topak, 26, Engineer, CIS, 3, 2000
Cihan Kurt, 28, Technician, SRT, 5, 1250
Sezen Agaoglu, 25, Engineer, CUS, 3, 2000
Kaan Yuksel, 21, Technician, CUS, 2, 1000
Irem Sozeri, 23, Engineer, CUS, 1, 1800
Dilan Colpan, 24, Secretary, CIS, 4, 1000

Output should be like this;

DEPARTMENTS & SALARIES
Department Workers TotalSalary Average
--------------------------------------------------------------
CUS 4 6600 1650
CIS 3 4000 1333.3
SRT 3 4600 1533.3
--------------------------------------------------------------
GRAND TOTAL 10 15200 1520

Only it should be used awk commands & scripts.

  1. The attempts at a solution (include all code and scripts):
    In awk scripts;
BEGIN{print"DEPARTMENTS & SALARIES"
print"Department Workers TotalSalary Average"
print"---------------------------------------------------------"}
{FS=","
GFS="\t"}

{if(/CUS/)
{cus_no++;
cus_tot+=$6;}
else if(/CIS/)
{cis_no++;
cis_tot+=$6;}
else if(/SRT/)
{srt_no++;
srt_tot+=$6;}
}

END{tot_workers=cus_no+cis_tot+srt_no;
tot_salary=cus_tot+cis_tot+srt_tot;
print"CUS",cus_no,cus_tot,cus_tot/cus_no
print"CIS",cis_no,cis_tot,cis_tot/cis_no
print"SRT",srt_no,srt_tot,srt_tot/srt_no
print"------------------------------------------------"
print"GRAND TOTAL",tot_workers,tot_salary,tot_salary/tot_workers}

When apply this script, the codes do not count first latter and output seems 1801 in first calculation. However, when I type a blank for firs line and I start to type workers file in second line, there is no problem.

The question is why. Why does script count first line when it is typed to first line in workers file?

  1. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
    Bilkent University, Ankara(�ankaya), Turkey, Engin Zafer Kra�bedel, and CTE 218 (cte.bilkent.edu.tr/en/curriculum)

Note: Without school/professor/course information, you will be banned if you post here! You must complete the entire template (not just parts of it).

  1. Your input file apparently misses one SRT, , should be
Alisan Baltaci, 23, Engineer, CUS, 1, 1800
Kadir Yenigun, 25, Technician, CIS, 2, 1000
Hande Gunay, 21, Secretary, SRT, 1, 850
Baran Tiryaki, 29, Engineer, SRT, 4, 2500
Bahar Topak, 26, Engineer, CIS, 3, 2000
Cihan Kurt, 28, Technician, SRT, 5, 1250
Sezen Agaoglu, 25, Engineer, CUS, 3, 2000
Kaan Yuksel, 21, Technician, CUS, 2, 1000
Irem Sozeri, 23, Engineer, CUS, 1, 1800
Dilan Colpan, 24, Secretary, CIS, 4, 1000
  1. FS sets the field separator for the following input lines.
    Look where you have defined it!

What is GFS? Did you mean OFS?

Some critiques:

Try putting spaces between your print commands (i.e. print "GRAND TOTAL") and quotes.
The third line in the workers file is missing a department.
The calculations are off using the $6 variable. Since the salary is at the end of the line,
try replacing it with $NF instead.
You have a typo in this line:
END{tot_workers=cus_no+cis_tot+srt_no;

I re-wrote your script and here's my output:

./company.awk workers.txt
DEPARTMENTS & SALARIES
Department Workers TotalSalary Average
----------------------------------------
CUS             4       6600    1650
CIS             3       4000    1333.33
SRT             3       4600    1533.33
----------------------------------------
GRAND TOTAL     10      15200   1520

Hope this helps.

1 Like

Yes, sorry about miswriting.

Thanks mate, I tried your way, and there is no problem. However, I do not understand. What is difference between $6 and $NF?

When you had this mistaken input line:

Hande Gunay, 21, Secretary, 1, 850

using $NF gives you the salary from this line (even though a field was missing) while $6 gives you 0. But, you might be better off printing an error if you find that an input line doesn't have the correct number of fields. For example, if you had included the line:

NF != 6 {printf("Line %d has %d fields; expected 6.\n", NR, NF)}

It would have told you that line 1 had 7 fields (since the fields were set before FS had been set) and that line 3 had 6 fields (and you would have looked at that line again and quickly seen that the Department was missing).
You could also consider adding a final else clause to your if statement with something like:

else print "Unrecognized department on line " NR

Tests like these may not be needed if your input is generated by verified software; but if your input is generated by humans, assume they will make mistakes that will cause your code to fail in unexpected ways.

Thanks a lot. Now it makes sense. I wrote workers file correctly, not included missing field part but script still calculates wrong.The firs calculation was 1801. However, with $NF, there is no problem.

Try moving your field separator variable to the beginning of your BEGIN block instead and then try using $6 instead of $NF.

BEGIN{FS=",";print"DEPARTMENTS & SALARIES"
print"Department Workers TotalSalary Average"
print"---------------------------------------------------------"}

@RedJohn : I guess you must have finished your homework now, see below using array you can improve your code,

awk -F, '{
		if(/CUS|CIS|SRT/)
		{
		  Dep_sal[$4]+=$NF
		  Dep_cnt[$4]++
		}
         }

      END{
	    dash = sprintf("%0*d", 50, 0);gsub(/0/,"-",dash)

	    print "DEPARTMENTS & SALARIES\nDepartment","Workers","TotalSalary","Average\n"dash

             for(i in Dep_sal )
	     {
		print i,Dep_cnt,Dep_sal,Dep_sal/Dep_cnt
		total_worker += Dep_cnt
		total_salary += Dep_sal 
             }
		
	     print dash"\nGRAND TOTAL",total_worker,total_salary,total_salary/total_worker

         }' OFS='\t' yourfile
$ ./tester
DEPARTMENTS & SALARIES
Department	Workers	TotalSalary	Average
--------------------------------------------------
 SRT	3	4600	1533.33
 CUS	4	6600	1650
 CIS	3	4000	1333.33
--------------------------------------------------
GRAND TOTAL	10	15200	1520