Copy and paste data

I need to copy from specified lines and paste the data into several other lines.

XX123450008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123451895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
XX123452012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx. xx.x xx.x xx.x xx.x xx.x
......
XX123460008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123461895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
.....
XX123462012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx. xx.x xx.x xx.x xx.x xx.x
......

In the example above I need to copy the data (xx.x) from every instance of ....0008 and paste it into every line from ...1895 to ....2012.

What is the best way to do this? I have some familiarity with awk and sed.

Try:

awk '{n=0+substr($1,8)} n==8 { for(i=2;i<=NF;i++) copy[i]=$i } n>=1895 && n<=2012 { for(i=2;i<=NF;i++) $i=copy[i]} 1' input

That seemed to work for the first 2 instances, but the other 400+ did not change. I modified the code as shown below, but it still did nothing after the 2nd instance. Any suggestions?

awk '{n=0+substr($1,020008)} n==020008 { for(i=400;i<=NF;i++) copy=$i } n>=1895 && n<=2012 { for(i=400;i<=NF;i++) $i=copy} 1' input

Do you have over 400 columns and want to change only those >400?

Is the length of column 1 different in the real data? Probably all you want to change is how n is calculated. Mine took from character 8 onward.
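
To see why the ID length matters, here is a small sketch that prints what substr($1,8) returns for the two ID lengths shown in this thread. The helper name show_suffix is made up for illustration and is not part of the original answer.

```shell
# Print each ID, the text from character 8 onward, and that text as a number.
# The two IDs are taken from the sample lines in this thread.
show_suffix() {
    printf '%s\n' "$@" | awk '{ s = substr($1, 8); print $1, s, 0 + s }'
}

show_suffix XX123450008 XX00101020008
# XX123450008 0008 8
# XX00101020008 020008 20008
```

With the 11-character ID, n is 8, so n==8 matches. With the 13-character ID, n is 20008, so a test against 8, 1895 or 2012 never matches.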

Oh, I see. The number of columns remains the same throughout. There are over 400 instances of 020008. On the third instance of 020008 the data are not copied.

XX00101020008  39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6   60
XX00101021895  39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6   60
.....
AL00101022012  39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6   60
....
....
AL00103020008  41.9   46 53.7 60.6 68.6 75.9 79.5 78.6 72.9 61.5 52.2 44.7 61.3
AL00103021895  41.2 36.1 53.2 62.2 68.9 77.8 79.8 79.7 77.5 59.2 52.3 43.8   61


This would grab just the last 4 characters of $1: {n=0+substr($1,length($1)+1-4)}, but you have now shown 020008, so that would be 6: {n=0+substr($1,length($1)-5)}.
So if you want 020008 now, as well as 021895 and 022012:

awk '{n=0+substr($1,length($1)-5)} n==20008 { for(i=2;i<=NF;i++) copy[i]=$i } n>=21895 && n<=22012 { for(i=2;i<=NF;i++) $i=copy[i]} 1' input
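
As a quick check of the "last N characters" idea, the sketch below prints the numeric value of the last N characters of each ID. The helper name last_n and the N variable are assumptions for illustration only; the IDs come from the samples in this thread.

```shell
# last_n N ID...: print each ID with its last N characters converted to a number.
last_n() {
    n=$1; shift
    printf '%s\n' "$@" |
        awk -v N="$n" '{ print $1, 0 + substr($1, length($1) - N + 1) }'
}

last_n 4 XX123450008 XX00101021895   # last 4 characters
# XX123450008 8
# XX00101021895 1895

last_n 6 XX00101020008               # last 6 characters
# XX00101020008 20008
```

Counting from the end of $1 this way means the result does not depend on how long the ID is.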

A couple of issues came up with the output. The original data has 3 spaces separating each column, but the pasted data in the output only has a single space. What is the proper way to preserve formatting?

XX00101020008   39.5   43.8   51.7   59.5   67.7   75.2   78.9   77.8   71.7   60.5   50.7   42.6   60.0
XX00101021895 39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6 60.0

Also, what if I only need the data pasted into 2 rows, such as the rows ending with ....2011 and ...2012? I tried the command below, but it did not work.

awk '{n=0+substr($1,8)} n==8 { for(i=2;i<=NF;i++) copy=$i } n>=2011 && n<=2012 { for(i=2;i<=NF;i++) $i=copy} 1' 

We can copy the entire line, remove the first column, and paste the remainder after the first column of each target line. This is one way to preserve the formatting.

Sadly, your matching modifications are wrong again. The substr here starts at the 8th character, so n would be 20008, 21895, etc. If you want to match just the last N characters, use substr($1,length($1)-N+1).

awk '{n=0+substr($1,length($1)-3)} n==8 {copy=$0;sub($1,"",copy)} n>=2011 && n<=2012 {$0=$1 copy} 1' input
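
One thing to be aware of: sub() treats $1 as a regular expression, which is fine for plain alphanumeric IDs like the ones shown here. A variant sketch that avoids the regex by cutting the line at the end of the ID is below. It assumes the ID starts in the first character of the line (no leading blanks). The function name paste_rows and the shortened three-column rows are made up for illustration.

```shell
paste_rows() {
    awk '
    { n = 0 + substr($1, length($1) - 3) }        # last 4 characters of the ID as a number
    n == 8 { tail = substr($0, length($1) + 1) }  # keep everything after the ID, spacing included
    n >= 2011 && n <= 2012 { $0 = $1 tail }       # rebuild the row as ID + saved tail
    1                                             # print every line
    ' "$@"
}

printf '%s\n' \
    'XX00101020008   39.5   43.8   60.0' \
    'XX00101022010   11.1   22.2   33.3' \
    'XX00101022011   11.1   22.2   33.3' \
    'XX00101022012 11.1 22.2 33.3' | paste_rows
# XX00101020008   39.5   43.8   60.0
# XX00101022010   11.1   22.2   33.3
# XX00101022011   39.5   43.8   60.0
# XX00101022012   39.5   43.8   60.0
```

Because the saved text is taken straight from $0, the three-space separators of the ....0008 row are carried over unchanged. A target row that appears before any ....0008 row would be reduced to its ID alone, since nothing has been saved yet.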

Try:

awk '{val=0+substr($1,length($1)-3,4);if(val==8){print;$1=x;y=$0}else{if(val>=1895&&val<=2012){print $1 y}else{print}}}' input_file

and the output is

XX123450008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123451895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
XX123452012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
XX123460008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123461895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
.....
XX123462012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......

If the output is not what you require, please provide sample input data and the desired output.

The input

XXYYYYYYY0008   45.5   49.4   56.6   62.9   71.0   77.9   80.9   80.1   74.9   64.0   55.0   47.8   63.8

The output

XXYYYYYYY1895 45.5 49.4 56.6 62.9 71.0 77.9 80.9 80.1 74.9 64.0 55.0 47.8 63.8

The output you want only has one space, even though you just asked for three?

awk -v OFS="   "

I'm not certain about the use of OFS. Where should I insert it in the script?

awk '{val=0+substr($1,length($1)-3,4);if(val==8){print;$1=x;y=$0}else{if(val>=2011&&val<=2012){print $1 y}else{print}}}'

Right after awk, but before any of awk's other parameters. In other words, exactly where I put it.
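
A minimal sketch of the placement, using a shortened sample row (the variable names line, widened and unchanged are only for illustration). Note that OFS is applied only when awk rebuilds the line, which happens after a field is assigned.

```shell
line='XX00101021895 39.5 43.8 51.7'

# -v OFS=... goes right after "awk" and before the quoted program.
# Assigning $1 = $1 forces awk to rebuild the line using OFS.
widened=$(printf '%s\n' "$line" | awk -v OFS='   ' '{ $1 = $1 } 1')
printf '%s\n' "$widened"
# XX00101021895   39.5   43.8   51.7

# Without any field assignment the line is printed as it was read, OFS or not.
unchanged=$(printf '%s\n' "$line" | awk -v OFS='   ' '1')
printf '%s\n' "$unchanged"
# XX00101021895 39.5 43.8 51.7
```

In the script quoted above, the $1=x assignment is what triggers the rebuild, so setting OFS there should carry the three-space separator into the pasted text.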

Please provide more sample data for input and output. With one line of sample data, it's not clear what exactly you are looking for.