Hi,
I am trying to transpose rows to columns for thousands of records. The problem is that a record can have several "AC" lines, each holding several accession numbers, and every accession needs to go on its own output row. The input file looks like this:
ID 1A02_HUMAN
AC P01892; O19619; P06338; P10313; P30444; P30445; P30446; P30514;
AC Q29680; Q29837; Q29899; Q95352; Q95380; Q9TPX8; Q9TPX9; Q9TPY0;
AC Q9TQH5; Q9TQI3;
TM 1
ID 1A02_PANTR
AC P16210;
TM 10
ID 1A03_GORGO
AC P30377;
TM 12
ID 1A03_HUMAN
AC P04439; O19546; O19756;
TM 5
ID 1A03_PANTR
AC P13748; Q547D5;
TM 0
ID 1A04_GORGO
AC P30378;
TM 1
The output should look like this:
AC ID TM
P01892 1A02_HUMAN 1
O19619 1A02_HUMAN 1
P06338 1A02_HUMAN 1
P10313 1A02_HUMAN 1
P30444 1A02_HUMAN 1
P30445 1A02_HUMAN 1
P30446 1A02_HUMAN 1
P30514 1A02_HUMAN 1
Q29680 1A02_HUMAN 1
Q29837 1A02_HUMAN 1
Q29899 1A02_HUMAN 1
Q95352 1A02_HUMAN 1
Q95380 1A02_HUMAN 1
Q9TPX8 1A02_HUMAN 1
Q9TPX9 1A02_HUMAN 1
Q9TPY0 1A02_HUMAN 1
Q9TQH5 1A02_HUMAN 1
Q9TQI3 1A02_HUMAN 1
P16210 1A02_PANTR 10
P30377 1A03_GORGO 12
P04439 1A03_HUMAN 5
O19546 1A03_HUMAN 5
O19756 1A03_HUMAN 5
P13748 1A03_PANTR 0
Q547D5 1A03_PANTR 0
P30378 1A04_GORGO 1
I found some code in this forum for a very similar problem and modified it slightly, as below:
awk '/^AC/{C[i++]=$2}
/^ID/{D[j++]=$2}
/^TM/{M[k++]=$2}
END {print "AC\tID\tTM" ;
for (i in D) printf "%-5s\t %-10s\t %s\n",C,D,M}' sample | sed 's/;//g'
However, my problem is with the lines that start with "AC". I tried using the split function to handle them, but it failed. Can anyone please help? Thanks.
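
For reference, here is a minimal sketch of one way this could be done in a single awk pass. It rests on assumptions that hold for the sample above but may not hold for the full file: every record starts with an "ID" line, the "TM" line is always the last line of its record, and accessions are separated by spaces with a trailing semicolon. The function name transpose is only an illustrative wrapper, not something from the original code. Instead of storing each line type in its own array, it remembers the current ID, collects every accession from every "AC" line, and prints one row per accession when the "TM" line arrives.

```shell
# transpose: hypothetical helper name; reads the given files, or stdin if none.
transpose() {
    awk '
    BEGIN { OFS = "\t"; print "AC", "ID", "TM" }   # tab-separated header
    $1 == "ID" { id = $2; n = 0 }                  # new record: keep ID, reset AC list
    $1 == "AC" {                                   # an AC line can hold many accessions
        for (f = 2; f <= NF; f++) {
            acc = $f
            sub(/;$/, "", acc)                     # drop the trailing semicolon
            if (acc != "") ac[++n] = acc
        }
    }
    $1 == "TM" {                                   # assumed to close the record
        for (k = 1; k <= n; k++) print ac[k], id, $2
    }
    ' "$@"
}

# Small demo on a shortened copy of the sample data
transpose <<'EOF'
ID 1A02_HUMAN
AC P01892; O19619;
AC Q9TQH5;
TM 1
ID 1A02_PANTR
AC P16210;
TM 10
EOF
```

On the real data this would be called as "transpose sample". Because the semicolons are removed inside awk, the extra sed step should no longer be needed. If a record could have its "TM" line before its "AC" lines, the rows would have to be printed when the next "ID" line (or the end of the file) is reached instead.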