Split comma-separated values into rows based on a field

Hi to all,

I have a file like:

chr1 a1 a2 a3 a4 a5 a6,a7,a8,a9
chr1 b1 b2 b3 b4 b5 b6,b7
chr2 c1 c2 c3 c4 c5 c6,c7,c8,c9,c10
...

I would like an output like this:

chr1 a6
chr1 a7
chr1 a8
chr1 a9
chr1 b6
chr1 b7
chr2 c6
chr2 c7
chr2 c8
chr2 c9
chr2 c10
...

For each line, keep field 1 and split the comma-separated values of the last field into separate rows.
Thanks,
Anna

What happened to chr1 a1 ?

I don't see any commas.

Hi Hanson44, I think he wants something like this from his input. The comma-separated part is the last field on each line:

chr1 a1 a2 a3 a4 a5 a6,a7,a8,a9
chr1 b1 b2 b3 b4 b5 b6,b7
chr2 c1 c2 c3 c4 c5 c6,c7,c8,c9,c10

and he wants output like this:


chr1 a6
chr1 a7
chr1 a8
chr1 a9
chr1 b6
chr1 b7
chr2 c6
chr2 c7
chr2 c8
chr2 c9
chr2 c10
...

An awk solution:

awk '{ n=split($NF,A,","); while (++i<=n) { print $1, A[i] } i=0 }' file

Hi yoda,

Could you please explain your code? It would be a great learning opportunity for me.

Thanks:)

Another one:

awk '{gsub(/,/,RS $1 FS,$NF); print $1,$NF}' file
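A rough reading of how this one-liner works, as a sketch: `gsub` replaces every comma in the last field with a newline (`RS`), the first field, and a space (`FS`), so a single `print` emits all the rows at once. The wrapper name `expand_last_field` is made up for illustration, and the sketch assumes `RS` and `FS` are at their defaults and that there are no spaces around the commas.

```shell
# expand_last_field is a hypothetical wrapper name, not part of the thread
expand_last_field() {
    awk '{
        # Turn each "," in the last field into "\n<field1> "
        # (assumes RS and FS are at their defaults: newline and space)
        gsub(/,/, RS $1 FS, $NF)
        print $1, $NF
    }' "$@"
}

# Illustrative input taken from the sample data in the question
printf 'chr1 b1 b2 b3 b4 b5 b6,b7\n' | expand_last_field
```

It reads a file given as an argument, or standard input if none is given.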

Sure. By the way, this code works only if the comma-separated list is the last field and there are no spaces between its values:

awk '
        {
                n = split ($NF, A, ",")         # Split the last field on commas into array A; n is the number of elements
                while ( ++i <= n )              # Increment i and loop while i <= n
                {
                        print $1, A[i]          # Print the first field and the element of A at index i
                }
                i = 0                           # Reset i to 0 for the next input line
        }
' file