Hi experts,
I have a problem with the following shell task:
I need to modify a file by creating a paired row for each row
that matches a filter (e.g. number of non-empty columns = 5).
The output should look like this:
the second row is the original one from the input,
the first row (red) is the pairing row; it is almost the same except for the changes (bold font).
The spaces between columns are very important: they should remain the same as in the input, so that the fields start at the same position.
Thanks
INPUT
R# asd1X892 X892 1TESTC :A
R# qwe1Y892 Y892 1TESTC :A
OUTPUT
R# asdX892 0TESTC :A
R# qweY892 0TESTC :A
R# asd1X892 X892 1TESTC :A
R# qwe1Y892 Y892 1TESTC :A
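One way to drop a column without shifting the columns after it is to overwrite it with the same number of spaces. The following is only a minimal sketch of that idea, not the full pairing task: `blank_field` is a hypothetical helper name, and it assumes the columns are separated by spaces only (no tabs).

```shell
#!/bin/sh
# blank_field N: overwrite the N-th space-separated field of every input
# line with spaces of equal length, so later fields keep their positions.
# Assumption: columns are separated by spaces only, never tabs.
blank_field() {
  awk -v n="$1" '
    n <= NF {
      rest = $0
      out = ""
      for (i = 1; i <= n; i++) {
        match(rest, /[^ ]+/)              # next run of non-space characters
        if (i < n) {
          # copy leading spaces and the field unchanged
          out = out substr(rest, 1, RSTART + RLENGTH - 1)
        } else {
          # copy leading spaces, then pad instead of the field
          out = out substr(rest, 1, RSTART - 1) sprintf("%" RLENGTH "s", "")
        }
        rest = substr(rest, RSTART + RLENGTH)
      }
      $0 = out rest
    }
    { print }
  '
}
```

It could be called as `blank_field 3 < file`; lines with fewer than N fields pass through unchanged.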
Try:
perl -0ne '$x=$_;s/^([^ ]+ +AAA)1([^ ]+)( +)[^ ]+( +)1/$1$2$3 ${4}0/mg;print $_,$x' file
Hi,
Thanks, the code works for my sample file; however, the real file is more complicated.
Below is the original input. I need to duplicate every pair of rows that has one additional column,
so the task is to identify rows where the number of columns = 11, then duplicate these 2 rows, giving the output below.
The new rows should be above the old ones, plus a small change has to be made (in blue font).
Thanks a lot
INPUT
H: EURS890 00440000000069.110100000963 DE0008032004 2CBKd 20110607-07:50:59BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB890 00440000000069.110100000968 DE0008032004 1CBKd 20110607-07:50:59BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB891 00440000000064.045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 00440000000050.300100000986 FR0000120628 2CSp 20110607-08:27:31TRADETES PA TRADETES TESTCLRXX #
H: EURB893 00440000000050.300100000987 FR0000120628 1CSp 20110607-08:27:31TRADETES PA TRADETES TESTCLRXX #
OUTPUT
H: EURS890 00440000000069.110100000963 DE0008032004 2CBKd 20110607-07:50:59BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB890 00440000000069.110100000968 DE0008032004 1CBKd 20110607-07:50:59BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB891 00440000000064.045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 00440000000050.300100000986 FR0000120628 2CSp 20110607-08:27:31TRADETES PA TRADETES TESTCLRXX #
H: EURB893 00440000000050.300100000987 FR0000120628 1CSp 20110607-08:27:31TRADETES PA TRADETES TESTCLRXX #
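To check which rows a column-count filter would pick up before changing anything, the field count can be tested directly with awk. A minimal sketch, assuming fields are separated by whitespace; `rows_with_fields` is a hypothetical helper name.

```shell
#!/bin/sh
# rows_with_fields N: print the line numbers of input rows that have
# exactly N whitespace-separated fields.
rows_with_fields() {
  awk -v n="$1" 'NF == n { print NR }'
}
```

For example, `rows_with_fields 11 < file` lists the rows that carry the additional column.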
Try:
perl -0ne '$x=$_;s/^([^ ]+ +AAA)1([^ ]+)( +)[^ ]+( +)1([^ ]+ +[^ ]+$)/$1$2$3 ${4}0$5/mg;print $_,$x' file
Thanks bartus,
I've updated the previous post a little bit, please take a look
Thanks a lot
Can the output look like this?
H: EURS890 00440000000069.110100000963 DE0008032004 2CBKd 20110607-07:50:59BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB890 00440000000069.110100000968 DE0008032004 1CBKd 20110607-07:50:59BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB891 00440000000064.045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 00440000000050.300100000986 FR0000120628 2CSp 20110607-08:27:31TRADETES PA TRADETES TESTCLRXX #
H: EURB893 00440000000050.300100000987 FR0000120628 1CSp 20110607-08:27:31TRADETES PA TRADETES TESTCLRXX #
So each new line should be directly above its original line?
The output should look like this; I hope that's not an obstacle.
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
Ok, so if three consecutive lines have 11 columns, then three new lines should be inserted above them? Example.. input:
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
output:
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
? And the same for 4 consecutive lines, etc?
In my file there will never be three consecutive lines together.
It's always just pairs. But it can happen that there are
2 pairs = 4 lines directly beneath each other, e.g.
INPUT
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1893 B893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1893 S893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
then output should look like this
OUTPUT
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1893 B893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1893 S893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
Thanks a lot
Try this script:
#!/usr/bin/perl
open I, "$ARGV[0]";
@x=<I>;
$i=0;
for (@x){
s/ +$//;
$t=$_;
@s=split / +/;
if ($#s==10){
$i=1;
s/([^ ]+ +EUR[SB])1(\d+ +)[^ ]+( +)1/$1$2 ${3}0/;
$s.=$_;
$q.=$t;
} elsif ($i) {
$i=0;
$s.=$q;
$q='';    # reset, otherwise a later run of 11-column lines repeats these
$s.=$t;
} else {
$s.=$t;
}
}
print $s;
Run it like this: ./script.pl file
Thanks, it works almost perfectly,
except that in the new lines there are additional columns (highlighted), which should be deleted.
current OUTPUT
H: EURS892 S892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 B892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB893 B893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 S893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1893 B893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1893 S893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
required OUTPUT
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1893 B893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1893 S893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
Can you post more sample input? That column is not present in the output if I run that script on the sample data that you provided earlier. Show the input that you used to get above output.
Shahul
June 7, 2011, 4:49pm
$ nawk '{if(NF>10) {print $0"-"NR"\n"$0} else {print $0}}' input.txt|nawk 'NF>11 {$2="EUR"$3;$3="";$NF=""}{print |"sort -n -k3 -r"}'
Thanks
Sha
Hi,
Please apply the code to this input:
H: EURS891 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB891 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1893 B893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1893 S893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS894 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB894 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
[root@linux ~]# ./a.pl c
H: EURS891 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB891 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS892 00440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB892 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS893 00440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1892 S892 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1892 B892 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB1893 B893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS1893 S893 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURS894 10440000000000.003000000450 FR0000130007 2ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
H: EURB894 10440000000000.003000000450 FR0000130007 1ALUp 20110607-08:34:25TRADETES PA TRADETES TESTCLRXX #
This is the output for your sample data. I found one small bug in the code. The updated version is:
#!/usr/bin/perl
open I, "$ARGV[0]";
@x=<I>;
$i=0;
for (@x){
s/ +$//;
$t=$_;
@s=split / +/;
if ($#s==10){
$i=1;
s/([^ ]+ +EUR[SB])1(\d+ +)[^ ]+( +)1/$1$2 ${3}0/;
$s.=$_;
$q.=$t;
} elsif ($i) {
$i=0;
$s.=$q;
$q='';    # reset, otherwise a later run of 11-column lines repeats these
$s.=$t;
} else {
$s.=$t;
}
}
$s.=$q if $i;
print $s;
Still the same.
Can you please try one last time to run the code on this input?
This is the original input, without the spaces between columns reduced.
H: EURB891 00440000000064.00000000000000045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.00000000000000045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EUR1B892 B892 10440000000450.00000000000000300100000984 FR0000130007 1ALUp TRADETESTBIC5 ZZ TRADETESTBIC5 TESTCLRXX #A
H: EUR1S892 S892 10440000000450.00000000000000300100000985 FR0000130007 2ALUp TRADETESTBIC5 ZZ TRADETESTBIC5 TESTCLRXX #R
H: EURS893 00440000000050.00000000000000300100000986 FR0000120628 2CSp 20110607-08:27:31TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
H: EURB893 00440000000050.00000000000000300100000987 FR0000120628 1CSp 20110607-08:27:31TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
EUR1B892 and EURB1892 are quite different... Try this:
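#!/usr/bin/perl
# Usage: ./script.pl file
open I, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
@x=<I>;
close I;
$i=0;                       # 1 while inside a run of 11-column lines
for (@x){
    s/ +$//;                # strip trailing spaces
    $t=$_;                  # keep the original line
    @s=split / +/;
    if ($#s==10){           # 11 columns: build the paired line
        $i=1;
        s/([^ ]+ +EUR)1([SB]\d+ +)[^ ]+( +)1/$1$2 ${3}0/;
        $s.=$_;             # the new line goes out first
        $q.=$t;             # the original is held back until the run ends
    } elsif ($i) {          # first line after a run: flush the held originals
        $i=0;
        $s.=$q;
        $q='';              # reset, otherwise a later run repeats these lines
        $s.=$t;
    } else {
        $s.=$t;
    }
}
$s.=$q if $i;               # the input ended inside a run
print $s;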
Sorry, my mistake, it always has to be "EUR1B892".
Now it works; in the meantime I've found another "issue".
As you can see highlighted, the time in my original input for the two rows that should be duplicated is in UNIX time. What I need is to convert it to the format "20110607-08:03:22" in all 4 rows, as you can see in the output.
So in this case the conversion is 30416938966 -> 20110607-08:06:40 (this is just an example).
This should be the last modification, thanks a lot.
INPUT
H: EURB891 00440000000064.00000000000000045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.00000000000000045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EUR1B892 B892 10440000000450.00000000000000300100000984 FR0000130007 1ALUp 30416938966TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #A
H: EUR1S892 S892 10440000000450.00000000000000300100000985 FR0000130007 2ALUp 30416938966TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #R
OUTPUT
H: EURB891 00440000000064.00000000000000045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.00000000000000045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB892 00440000000450.00000000000000300100000984 FR0000130007 1ALUp 20110607-08:06:40TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #A
H: EURS892 00440000000450.00000000000000300100000985 FR0000130007 2ALUp 20110607-08:06:40TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #R
H: EUR1B892 B892 10440000000450.00000000000000300100000984 FR0000130007 1ALUp 20110607-08:06:40TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #A
H: EUR1S892 S892 10440000000450.00000000000000300100000985 FR0000130007 2ALUp 20110607-08:06:40TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #R
---------- Post updated at 05:05 PM ---------- Previous update was at 04:27 PM ----------
Please check the update above, this should be everything.
Thanks
Are you sure "30416938966" is Unix time? The current Unix timestamp is "1307530654", so about 23 times smaller than your number. Also, a simple Perl conversion doesn't seem to recognize your number as Unix time:
solaris% perl -e '$x=localtime(30416938966);print "$x\n"'
Thu Jan 1 00:59:59 1970
While current time is converted properly:
solaris% perl -e '$x=localtime(1307530654);print "$x\n"'
Wed Jun 8 12:57:34 2011
Hi,
Sorry, 30416938966 is not Unix time, as I have just been informed;
it's microseconds elapsed since midnight.
So the problem is that the time in my original file is in this format.
What I need is to replace the time in the rows that have the additional column, as you remember, with a time in the format "20110607-08:26:56"
where:
20110607- is the current date
08:26:56 is converted from 30416938966 by dividing by 1,000,000 to get whole seconds since midnight (30416 s = 8 h 26 min 56 s)
As you may remember, the spaces between the columns should remain the same.
As you can see in the INPUT, the last 3 columns don't start at the same position, but after the replacement they will be positioned the same.
So here is the transformation:
INPUT
H: EURB891 00440000000064.00000000000000045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.00000000000000045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EUR1B892 B892 10440000000450.00000000000000300100000984 FR0000130007 1ALUp 30416938966TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
H: EUR1S892 S892 10440000000450.00000000000000300100000985 FR0000130007 2ALUp 30416938966TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
OUTPUT
H: EURB891 00440000000064.00000000000000045100000972 DE0006231004 1IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURS891 00440000000064.00000000000000045100000973 DE0006231004 2IFXd 20110607-08:03:22BNABFRPP DE BNABFRPP PARBFRPP #
H: EURB892 00440000000450.00000000000000300100000984 FR0000130007 1ALUp 20110607-08:26:56TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
H: EURS892 00440000000450.00000000000000300100000985 FR0000130007 2ALUp 20110607-08:26:56TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
H: EUR1B892 B892 10440000000450.00000000000000300100000984 FR0000130007 1ALUp 20110607-08:26:56TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
H: EUR1S892 S892 10440000000450.00000000000000300100000985 FR0000130007 2ALUp 20110607-08:26:56TRADETESTBIC5 PA TRADETESTBIC5 TESTCLRXX #
Thanks
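The time replacement on its own could be sketched as below, run on the input before the rows are paired so that both the new and the original rows get the converted time. This is only a sketch under assumptions: `fix_time` is a hypothetical helper name, the date prefix is passed in by the caller, the timestamp is taken to be the leading digits of field 7 in 11-column rows, and rows whose field 7 already starts with a date such as 20110607- are left alone. Only the digits are replaced, so all spacing after them is kept as it is.

```shell
#!/bin/sh
# fix_time DAY: in rows with 11 fields, replace the leading
# microseconds-since-midnight digits of field 7 with DAY-HH:MM:SS.
# Assumption: DAY (e.g. 20110607) is supplied by the caller.
fix_time() {
  awk -v day="$1" '
    NF == 11 && match($7, /^[0-9]+/) && substr($7, RLENGTH + 1, 1) != "-" {
      us  = substr($7, 1, RLENGTH)        # e.g. 30416938966
      sec = int(us / 1000000)             # whole seconds since midnight
      stamp = sprintf("%s-%02d:%02d:%02d", day,
                      int(sec / 3600), int((sec % 3600) / 60), sec % 60)
      pos = index($0, $7)                 # where field 7 starts in the line
      $0  = substr($0, 1, pos - 1) stamp substr($0, pos + length(us))
    }
    { print }
  '
}
```

It could be used as `fix_time 20110607 < file`, or with today's date as `fix_time "$(date +%Y%m%d)" < file`, and its output then fed to the pairing script.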