Dear Group,
I have following input:
sa;sb;sc;sd;period;ma;mb;mc;md;me
sa1;sb1;sc1;sd1;200001;ma1;mb1;mc1;md1;me1
sa2;sb2;sc2;sd2;200002;ma2;mb2;mc2;md2;me2
sa3;sb3;sc3;sd3;200003;ma3;mb3;mc3;md3;me3
first line contains the headers!
I want to create with one pass the following output:
sa1|sb1|sc1|sd1|200001_ma|ma1|
sa1|sb1|sc1|sd1|200001_mb|mb1|
sa1|sb1|sc1|sd1|200001_mc|mc1|
sa1|sb1|sc1|sd1|200001_md|md1|
sa1|sb1|sc1|sd1|200001_me|me1|
sa2|sb2|sc2|sd2|200002_ma|ma2|
sa2|sb2|sc2|sd2|200002_mb|mb2|
sa2|sb2|sc2|sd2|200002_mc|mc2|
sa2|sb2|sc2|sd2|200002_md|md2|
sa2|sb2|sc2|sd2|200002_me|me2|
sa3|sb3|sc3|sd3|200003_ma|ma3|
sa3|sb3|sc3|sd3|200003_mb|mb3|
sa3|sb3|sc3|sd3|200003_mc|mc3|
sa3|sb3|sc3|sd3|200003_md|md3|
sa3|sb3|sc3|sd3|200003_me|me3|
tried following one liner with partial success:
awk -F\; 'NR==1{for(ctyi=1;ctyi<=NF;ctyi++) if ($ctyi ~ /period/) found=++ctyi;for(m=found;m<=NF;m++) vector[$m];print vector[$m]}NR>1{i=$1"|"$2"|"$3"|"$4"|"$5;for(x=6;x<=NF;x++)print i"_""|"$x"|"}' in.csv > out.txt
I cannot read and add the 'ma mb mc md me'! The second part:
NR>1{i=$1"|"$2"|"$3"|"$4"|"$5;for(x=6;x<=NF;x++)print i"_""|"$x"|"}
works as expected. It does produce the following output:
sa1|sb1|sc1|sd1|200001_|ma1|
sa1|sb1|sc1|sd1|200001_|mb1|
sa1|sb1|sc1|sd1|200001_|mc1|
sa1|sb1|sc1|sd1|200001_|md1|
sa1|sb1|sc1|sd1|200001_|me1|
sa2|sb2|sc2|sd2|200002_|ma2|
sa2|sb2|sc2|sd2|200002_|mb2|
sa2|sb2|sc2|sd2|200002_|mc2|
sa2|sb2|sc2|sd2|200002_|md2|
sa2|sb2|sc2|sd2|200002_|me2|
sa3|sb3|sc3|sd3|200003_|ma3|
sa3|sb3|sc3|sd3|200003_|mb3|
sa3|sb3|sc3|sd3|200003_|mc3|
sa3|sb3|sc3|sd3|200003_|md3|
sa3|sb3|sc3|sd3|200003_|me3|
In the first part:
NR==1{for(ctyi=1;ctyi<=NF;ctyi++) if ($ctyi ~ /period/) found=++ctyi;for(m=found;m<=NF;m++) vector[$m];print vector[$m]}
I locate the position of the field 'period' in order to check which fields have to be considered till the end of the record. The number of fields after 'period' can vary!
Thanks for your help.