Shell script is taking more than 3 hrs to execute

Hi

I am converting all the values of a key column into a single row. For example:

Key col1 col2

1 1 1

1 2 1

1 1 3

1 3 1

2 1 1

2 1 2

What the script does is convert this data into:

1(key)|1:2:1:3 (all col1 values),1:1:3:1 (all col2 values)

2(key)|1:1,1:2

To achieve this I am using two while loops and four if-else blocks.

In production the input file has 4 columns and about 0.2 million (2 lakh) records, and the script takes more than 3 hours to run.

Any idea on how to minimize the execution time?

No error checking! Not complete!

Using a hash, this should be super-fast! :slight_smile:

#! /opt/third-party/bin/perl

use strict;
use warnings;

my %fileHash;

# "inputfile" is a placeholder -- substitute the real input file name.
open(FILE, "<", "inputfile") or die "Cannot open input file: $!";

while(<FILE>) {
  next if (/^$/);                  # skip blank lines
  chomp;
  my @arr = split(/ /);            # $arr[0] = key, $arr[1] = col1, $arr[2] = col2
  my @val = split(/,/, defined $fileHash{$arr[0]} ? $fileHash{$arr[0]} : "");
  $val[0] = "" unless defined $val[0];
  $val[1] = "" unless defined $val[1];
  $val[0] .= (":" . $arr[1]);      # append this row's col1 value
  $val[1] .= (":" . $arr[2]);      # append this row's col2 value
  $val[0] .= ("," . $val[1]);      # rejoin the two lists with a comma
  $val[0] =~ s/,:/,/;              # drop the leading colon of the col2 list
  $val[0] =~ s/^://;               # drop the leading colon of the col1 list
  $fileHash{$arr[0]} = $val[0];
}

close(FILE);

foreach my $k ( keys %fileHash ) {
  print "$k $fileHash{$k}\n";
}

exit 0;

Hi,

This one should be ok.

input:

1 1 1

1 2 1

1 1 3

1 3 1

2 1 1

2 1 2

3 1 1

4 2 1

4 1 3

1 3 1

2 1 1

2 1 2

output:

2|1:1:1:1|1:2:1:2
3|1|1
4|2:1|1:3
1|1:2:1:3:3|1:1:3:1:1

code:

awk '
{
if (NF>1)
{
	col[$1]=$1
	if (col1[$1]=="")
		col1[$1]=$2
	else
		col1[$1]=sprintf("%s:%s",col1[$1],$2)
	if (col2[$1]=="")
		col2[$1]=$3
	else
		col2[$1]=sprintf("%s:%s",col2[$1],$3)
}
}
END{

for (i in col)
	print i"|"col1[i]"|"col2[i]
}' filename