Read and concatenate file contents and file names

kamose · September 2, 2016, 11:35am

Hi all,
I need a bash script.
I have three files named Milano, Torino, and Firenze.
Their contents are:

Milano

Marco
Luca
Giorgio
Michele
Patrizio

Torino

Marco
Giulio
Emilio
Michele

Firenze

Luca
Giorgio
Marco
Saverio
Emilio

The output should be an all_city.csv file like:

User;Milano;Torino;Firenze
Marco;assigned;assigned;assigned
Luca;assigned;;assigned
Giorgio;assigned;;assigned
Michele;assigned;assigned;
Patrizio;assigned;;
Giulio;;assigned;
Emilio;;assigned;assigned
Saverio;;;assigned

$ echo $BASH_VERSION
4.1.2(1)-release

Any ideas? I tried with arrays, cut, join, etc., but found no solution.
Thanks to all!!

vgersh99 · September 2, 2016, 12:20pm

something along these lines....
awk -f kam,awk Milano Torino Firenze where kam.awk is:

BEGIN {
  OFS=";"
}
FNR==1 {cityA[++cityN]=FILENAME }
{ userC[$1,FILENAME];userA[$1] }

END {
  printf("%s", "User" OFS)
  for(c=1; c<=cityN; c++)
   printf("%s%s",cityA[c], (c+1>cityN)?ORS:OFS)

  for(u in userA) {
     printf("%s", u OFS)
     for(c=1; c<=cityN; c++)
       printf("%s%s", ( (u,cityA[c]) in userC)?"assigned":"", (c+1>cityN)?ORS:OFS)
  }
}

Don_Cragun · September 2, 2016, 1:32pm

If you save the script vgersh99 suggested in a file named kam.awk, you need to use the same name on the command line when you invoke awk:

awk -f kam.awk Milano Torino Firenze

I assume the <comma> instead of <period> in the filename in the command line was a typo.

With vgersh99's script the cities appear in the output in the same order as the input files, but the order of users is unspecified (awk does not define the order in which for(u in userA) walks the array) and may vary depending on which version of awk runs your script. If you want to guarantee that the order of users in the output matches the order in which user names were first seen in the input files, you could try this slightly more complicated script. If you save the following in a file named merger:

#!/bin/ksh
awk '
BEGIN {	# Set output field separator.
	OFS = ";"
}
FNR == 1 {
	# Gather city names from filenames.
	city[++ncity] = FILENAME
}
{	# Gather data from the current input file.
	# Have we seen this user before?
	if(!($1 in user)) {
		# No.  Add this user to the list of known users...
		user[$1]
		# and keep track of the order in which users were found.
		order[++nuser] = $1
	}
	# Note that we have seen this user in this city.
	assigned[$1, ncity]
}
END {	# Print header...
	printf("User%s", OFS)
	for(i = 1; i <= ncity; i++)
		printf("%s%s", city[i], (i == ncity) ? ORS : OFS)

	# Print data for each user...
	for(i = 1; i <= nuser; i++) {
		# print user name...
		printf("%s%s", order[i], OFS)
		for(j = 1; j <= ncity; j++)
			# and print assigned data for each city.
			printf("%s%s",
			    ((order[i], j) in assigned) ? "assigned" : "",
			    (j == ncity) ? ORS : OFS)
	}
}' "$@"

and make it executable:

chmod +x merger

and invoke it with:

./merger Milano Torino Firenze > all_city.csv

it will write:

User;Milano;Torino;Firenze
Marco;assigned;assigned;assigned
Luca;assigned;;assigned
Giorgio;assigned;;assigned
Michele;assigned;assigned;
Patrizio;assigned;;
Giulio;;assigned;
Emilio;;assigned;assigned
Saverio;;;assigned

into all_city.csv exactly as requested.
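To check this against the data from the question, something like the following could be used. It is only a sketch: the function name merge_cities, the temporary directory, and the NF guard that skips blank lines are assumptions added here, not part of the script above, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Sketch: rebuild the sample files from the question and merge them,
# keeping users in first-seen order. merge_cities is a hypothetical name.
merge_cities() {
    awk '
    BEGIN { OFS = ";" }
    FNR == 1 { city[++ncity] = FILENAME }    # one output column per input file
    NF {                                      # assumption: ignore blank lines
        if (!($1 in seen)) { seen[$1]; order[++nuser] = $1 }
        assigned[$1, ncity]                   # user $1 appears in city number ncity
    }
    END {
        printf("User")
        for (i = 1; i <= ncity; i++) printf("%s%s", OFS, city[i])
        printf("%s", ORS)
        for (i = 1; i <= nuser; i++) {
            printf("%s", order[i])
            for (j = 1; j <= ncity; j++)
                printf("%s%s", OFS, ((order[i], j) in assigned) ? "assigned" : "")
            printf("%s", ORS)
        }
    }' "$@"
}

# Recreate the three input files in a scratch directory.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf '%s\n' Marco Luca Giorgio Michele Patrizio > Milano
printf '%s\n' Marco Giulio Emilio Michele > Torino
printf '%s\n' Luca Giorgio Marco Saverio Emilio > Firenze

merge_cities Milano Torino Firenze > all_city.csv
cat all_city.csv
```

Note that, as in both scripts above, a completely empty input file never triggers the FNR == 1 rule, so it would not get a column.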

This was written and tested with a Korn shell, but will work with any shell that uses Bourne shell syntax (including bash and several others). If you want to run this script on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
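Since the question asked for bash specifically, the same idea can also be sketched without awk. This is an untested-on-4.1.2 illustration, not a replacement for the script above: it assumes bash 4.0 or later (associative arrays via local -A), the function name merge_cities_bash is made up for this example, and it will be much slower than awk on large files.

```shell
#!/bin/bash
# Sketch of a bash-only merge. Requires bash 4.0+ for associative arrays.
# merge_cities_bash is a hypothetical name used only for this example.
merge_cities_bash() {
    local -A seen assigned      # seen[user], assigned["user;city"]
    local -a order              # users in first-seen order
    local city name line
    local header="User"

    for city in "$@"; do
        header+=";$city"
        # Take the first whitespace-delimited word of each line, like awk's $1.
        while read -r name _ || [ -n "$name" ]; do
            [ -n "$name" ] || continue          # skip blank lines
            if [ -z "${seen[$name]}" ]; then
                seen[$name]=1
                order+=("$name")
            fi
            assigned["$name;$city"]=1
        done < "$city"
    done

    printf '%s\n' "$header"
    for name in "${order[@]}"; do
        line=$name
        for city in "$@"; do
            line+=";"
            if [ -n "${assigned[$name;$city]}" ]; then
                line+="assigned"
            fi
        done
        printf '%s\n' "$line"
    done
}

# When saved as a script, pass the city files as arguments.
if [ "$#" -gt 0 ]; then
    merge_cities_bash "$@"
fi
```

Saved as, say, merger.bash, it would be invoked the same way: bash merger.bash Milano Torino Firenze > all_city.csv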

kamose · September 5, 2016, 9:05am

Don Cragun, you rule! Thanks to vgersh99 as well.

God bless you