I need to copy from specified lines and paste the data into several other lines.
XX123450008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123451895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
XX123452012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx. xx.x xx.x xx.x xx.x xx.x
......
XX123460008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123461895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
.....
XX123462012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx. xx.x xx.x xx.x xx.x xx.x
......
In the example above I need to copy the data (xx.x) from every instance of ....0008 and paste it into every line from ...1895 to ....2012.
What is the best way to do this? I have some familiarity with awk and sed.
Try:
awk '{n=0+substr($1,8)} n==8 { for(i=2;i<=NF;i++) copy[i]=$i } n>=1895 && n<=2012 { for(i=2;i<=NF;i++) $i=copy[i]} 1' input
That seemed to work for the first 2 instances, but the other 400+ did not change. I modified the code to (see below) but it still did not do anything after the 2nd instance. Any suggestions?
awk '{n=0+substr($1,020008)} n==020008 { for(i=400;i<=NF;i++) copy=$i } n>=1895 && n<=2012 { for(i=400;i<=NF;i++) $i=copy} 1' input
Do you have over 400 columns and want to change only those >400?
Is the length of column 1 different in the real data? Probably all you want to change is how n is calculated. Mine took from character 8 onward.
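To see why the way n is calculated matters, here is a small sketch that runs both extraction styles on the two ID shapes that appear in this thread. The variable names fixed and last4 are only illustrative.

```shell
# Compare two ways of pulling the numeric suffix out of column 1.
# The two IDs are sample IDs from this thread (11 and 13 characters long).
result=$(printf '%s\n' XX123450008 XX00101020008 |
  awk '{
    fixed = 0 + substr($1, 8)               # from character 8 onward
    last4 = 0 + substr($1, length($1) - 3)  # last 4 characters only
    print $1, fixed, last4
  }')
printf '%s\n' "$result"
```

With the 11-character ID both expressions give 8, but with the 13-character ID the fixed offset gives 20008, so a test like n==8 never matches there.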
Oh, I see. The number of columns remains the same throughout. There are over 400 instances of 020008. On the third instance of 020008 the data are not copied.
XX00101020008 39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6 60
XX00101021895 39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6 60
.....
AL00101022012 39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6 60
....
....
AL00103020008 41.9 46 53.7 60.6 68.6 75.9 79.5 78.6 72.9 61.5 52.2 44.7 61.3
AL00103021895 41.2 36.1 53.2 62.2 68.9 77.8 79.8 79.7 77.5 59.2 52.3 43.8 61
This would grab just the last 4 characters of $1: {n=0+substr($1,length($1)+1-4)}, but you have now shown 020008, so that would be 6: {n=0+substr($1,length($1)-5)}.
So if you want 020008 now, as well as 021895 and 022012:
awk '{n=0+substr($1,length($1)-5)} n==20008 { for(i=2;i<=NF;i++) copy[i]=$i } n>=21895 && n<=22012 { for(i=2;i<=NF;i++) $i=copy[i]} 1' input
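One thing to watch with a field-by-field copy: a plain variable only keeps the last field it was assigned, so each column needs its own slot. Below is a minimal sketch that stores the fields in an array. The three sample lines are invented so that every column differs and a wrong copy would be visible.

```shell
# Hypothetical sample: one ...0008 source row, one row inside the
# 21895..22012 range, and one row just outside it.
result=$(printf '%s\n' \
  'XX00101020008 1.1 2.2 3.3' \
  'XX00101021895 9.9 9.9 9.9' \
  'XX00101022013 7.7 7.7 7.7' |
  awk '{ n = 0 + substr($1, length($1) - 5) }   # last 6 characters as a number
       n == 20008 { for (i = 2; i <= NF; i++) copy[i] = $i }   # remember each field
       n >= 21895 && n <= 22012 { for (i = 2; i <= NF; i++) $i = copy[i] }
       1')
printf '%s\n' "$result"
```

Only the 021895 row takes the values of the 020008 row. Assigning to $i makes awk rebuild the line with single spaces between fields, which matters if the original spacing has to survive.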
A couple of issues came up with the output. The original data has 3 spaces separating each column, but the pasted data in the output has only a single space. What is the proper way to preserve formatting?
XX00101020008 39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6 60.0
XX00101021895 39.5 43.8 51.7 59.5 67.7 75.2 78.9 77.8 71.7 60.5 50.7 42.6 60.0
Also, what if I only need the data pasted into 2 rows, such as the rows ending with ....2011 and ...2012? I tried the command below, but it did not work.
awk '{n=0+substr($1,8)} n==8 { for(i=2;i<=NF;i++) copy=$i } n>=2011 && n<=2012 { for(i=2;i<=NF;i++) $i=copy} 1'
We can copy the entire line, remove the first column, and paste what is left after the first column of each target line. This is one way to preserve formatting.
Your matching modifications are wrong again, sadly. The substr here starts at the 8th character, so n would be 20008, 21895, etc. If you want to match just the last N characters, use substr($1,length($1)-N+1).
awk '{n=0+substr($1,length($1)-3)} n==8 {copy=$0;sub($1,"",copy)} n>=2011 && n<=2012 {$0=$1 copy} 1' input
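A quick way to check that the spacing survives is to run the same logic on a few lines with three spaces between columns. This is only a sketch. The sample values are invented, and it assumes the IDs are plain letters and digits, because sub() treats $1 as a regular expression.

```shell
# Hypothetical sample with three spaces between columns.
result=$(printf '%s\n' \
  'XX00101020008   1.1   2.2   3.3' \
  'XX00101022010   9.9   9.9   9.9' \
  'XX00101022011   9.9   9.9   9.9' \
  'XX00101022012   9.9   9.9   9.9' |
  awk '{ n = 0 + substr($1, length($1) - 3) }     # last 4 characters as a number
       n == 8 { copy = $0; sub($1, "", copy) }    # whole line minus column 1, spacing intact
       n >= 2011 && n <= 2012 { $0 = $1 copy }    # keep this ID, paste the saved remainder
       1')
printf '%s\n' "$result"
```

Because no individual field is ever assigned, awk never rebuilds the line with its output separator, so the 2011 and 2012 rows come out with the same three-space layout as the 0008 row, and the 2010 row is left alone.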
Try:
awk '{val=0+substr($1,length($1)-3,4);if(val==8){print;$1=x;y=$0}else{if(val>=1895&&val<=2012){print $1 y}else{print}}}' input_file
The output is:
XX123450008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123451895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
XX123452012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
XX123460008 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
XX123461895 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
.....
XX123462012 xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x xx.x
......
If the output is not what you need, please provide sample input data and the desired output.
The input
XXYYYYYYY0008 45.5 49.4 56.6 62.9 71.0 77.9 80.9 80.1 74.9 64.0 55.0 47.8 63.8
The output
XXYYYYYYY1895 45.5 49.4 56.6 62.9 71.0 77.9 80.9 80.1 74.9 64.0 55.0 47.8 63.8
The output you want only has one space, even though you just asked for three?
awk -v OFS="   "
I'm not certain about the use of OFS. Where should I insert that into the script?
awk '{val=0+substr($1,length($1)-3,4);if(val==8){print;$1=x;y=$0}else{if(val>=2011&&val<=2012){print $1 y}else{print}}}'
Right after awk, but before any of awk's other parameters. In other words, exactly where I put it.
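As a rough illustration of that placement, the sketch below runs the same program twice on an invented two-line sample, once without and once with -v OFS. It assumes the separator wanted is three spaces, and the names prog, sample, plain, spaced and rest are made up for this example.

```shell
# The awk program is kept in a shell variable so both runs use identical code.
prog='{ val = 0 + substr($1, length($1) - 3, 4) }
val == 8 { print; $1 = ""; rest = $0; next }        # blanking $1 rebuilds $0 using OFS
val >= 2011 && val <= 2012 { print $1 rest; next }
{ print }'

# Hypothetical sample with three spaces between columns.
sample='XX00101020008   1.1   2.2   3.3
XX00101022012   9.9   9.9   9.9'

plain=$(printf '%s\n' "$sample" | awk "$prog")                  # default OFS (one space)
spaced=$(printf '%s\n' "$sample" | awk -v OFS='   ' "$prog")    # -v OFS goes before the program
printf '%s\n\n%s\n' "$plain" "$spaced"
```

In the first run the pasted row comes out with single spaces. In the second it keeps three, because the saved remainder was rebuilt with the OFS given on the command line.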
Please provide more sample data for input and output. With one line of sample data it's not clear what exactly you are looking for.