sed beginner question

jurgen · February 3, 2011, 11:52am

Hello,
I am processing data. The first column holds the date ("2011 02 03 12 45"), separated by spaces, followed by the measurements. One file contains several days. What I would like to do is:
read the data line by line and write the data for "today" to a new file and the data for "yesterday" to another file, with the columns separated by TAB and the date in the form "2011/02/03 12:45:00". (today and yesterday are already defined in another script.)
I thought something like this might be an approach, but I am just beginning to write scripts...

for filename in 11111 22222; do
grep "^$filename" *.txt | /usr/bin/sed -e 's/ /    /g' > $folder\/$filename\/$filename\_$yesterday.txt
grep "^$filename" *.txt | /usr/bin/sed -e 's/ /        /g' > $folder\/$filename\/$filename\_$today.txt
done

Hopefully someone can help me with how to do it.

Regards Jurgen

Corona688 · February 3, 2011, 12:44pm

Please post an actual sample of your input and the requested output data. From the looks of your code it's not quite as described.

jurgen · February 3, 2011, 1:17pm

ok... here is a sample of an input file and how the files should be divided into 2 outputfiles:

input file:Filename: 111

2011 02 03 17 00 220 11.3    6.4   9     993.3   5.4     
2011 02 03 16 00 250 11.8    6.4   9     994.6   7.7    
2011 02 02 20 00 240 10.8    4.0   7     994.5   9.4      
2011 02 02 19 00 240 12.4    4.2   7     994.6   9.1      
2011 02 02 18 00 240 11.8    3.8   7     994.4   9.9      

output:Filename:111_20110203

2011/02/03 17:00:00    220    11.3    6.4    9    993.3    5.4     
2011/02/03 16:00:00    250    11.8    6.4    9    994.6    7.7     


output2: filename: 111_20110202
2011/02/02 20:00:00    240    10.8    4.0    7    994.5    9.4       
2011/02/02 19:00:00    240    12.4    4.2    7    994.6    9.1         
2011/02/02 18:00:00    240    11.8    3.8    7    994.4    9.9

thanks for the reply

fiendracer · February 3, 2011, 4:28pm

Well I'm just starting out too.
I think this would probably be easier to handle w/ awk, but since I'm wrapping my head around sed I thought I would give it a go.
I've got some things to do, I'll be back and we'll see where you are.
Here's a start to get the data in the right way-

s/ /\//
s/ /\//
s/ /:/2
s/ /:00 /2

Now you just have to output it to the right places.

Have at it!

---------- Post updated at 04:28 PM ---------- Previous update was at 04:25 PM ----------

Oh and run as a sed script, presuming the data is in awk-data, the command will look like this at the command prompt-
$ sed -f sed-script4 < awk-data
With the contents of sed-script4 being what I posted above this.
Capiche?
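
To make the four substitutions above easier to follow, here is a small sketch that runs them inline with -e instead of from a script file. The function name reformat_date is made up for illustration, and the sample line is one of the input lines posted earlier in the thread. Note that this only fixes the date and time; the blanks between the measurements are left as they are.

```shell
#!/bin/sh
# reformat_date is a hypothetical helper name; it reads lines on stdin.
reformat_date() {
    # 1st space -> "/"     : 2011/02 03 17 00 220 ...
    # 1st space -> "/"     : 2011/02/03 17 00 220 ...
    # 2nd space -> ":"     : 2011/02/03 17:00 220 ...
    # 2nd space -> ":00 "  : 2011/02/03 17:00:00 220 ...
    sed -e 's/ /\//' -e 's/ /\//' -e 's/ /:/2' -e 's/ /:00 /2'
}

printf '%s\n' '2011 02 03 17 00 220 11.3    6.4   9     993.3   5.4' | reformat_date
```

The trailing number on s/ /:/2 tells sed to replace only the second match on the line, which is why the space between the date and the hour survives.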

jurgen · February 3, 2011, 4:37pm

no...sorry...i cant follow you at the moment...

birei · February 3, 2011, 4:54pm

Hi,

Here is a solution using Perl. The code uses the 'DateTime' module. You may need to install it from CPAN.

I assume the date is in the format YYYY MM DD. Tell me if I'm wrong, because otherwise the script won't work.

Here is my console session:

$ su -
# cpan
cpan> install DateTime
cpan> exit
# exit
$ ls -1
111
script.pl
script.pl~
$ cat 111
2011 02 03 17 00 220 11.3    6.4   9     993.3   5.4     
2011 02 03 16 00 250 11.8    6.4   9     994.6   7.7    
2011 02 02 20 00 240 10.8    4.0   7     994.5   9.4      
2011 02 02 19 00 240 12.4    4.2   7     994.6   9.1      
2011 02 02 18 00 240 11.8    3.8   7     994.4   9.9
$ cat script.pl
use strict;
use warnings;
use DateTime;

die "Usage: $0 <infile>\n" unless @ARGV == 1;

open my $infile, "<", $ARGV[0] or die "Cannot open file $ARGV[0]: $!\n";

my $today = DateTime->now->ymd('/');
my $yesterday = DateTime->now->subtract( days => 1 )->ymd('/');

my $of = $ARGV[0] . "_" . $today;
$of =~ s|/||g;
open my $outfileToday, ">", $of or die "Cannot create output file: $!\n";
$of = $ARGV[0] . "_" . $yesterday;
$of =~ s|/||g;
open my $outfileYesterday, ">", $of or die "Cannot open output file: $!\n";

my ( $date, $hour, $outline );
my @f = ();
while ( <$infile> ) {
        @f = split;
        $date = join "/", @f[0..2];
        next unless $date eq $today || $date eq $yesterday;
        $hour = join( ":", @f[3..4] ) . ":00";
        $outline = join( " ", $date, $hour ) . "\t" . join( "\t", @f[5..$#f] );
        if ( $date eq $today ) {
                print $outfileToday "$outline\n";
        } else {
                print $outfileYesterday "$outline\n";
        }
}
$ perl script.pl
Usage: script.pl <infile>
$ perl script.pl 111
$ ls -1
111
111_20110202
111_20110203
script.pl
script.pl~
$ cat 111_20110202
2011/02/02 20:00:00     240     10.8    4.0     7       994.5   9.4
2011/02/02 19:00:00     240     12.4    4.2     7       994.6   9.1
2011/02/02 18:00:00     240     11.8    3.8     7       994.4   9.9
$ cat 111_20110203
2011/02/03 17:00:00     220     11.3    6.4     9       993.3   5.4
2011/02/03 16:00:00     250     11.8    6.4     9       994.6   7.7

I hope it can be useful for you.

Regards,
Birei

yinyuemi · February 3, 2011, 5:11pm

awk '{print $1"/"$2"/"$3" "$4":"$5":00\t"$6"\t"$7"\t"$8"\t"$9"\t"$10"\t"$11 > ("111_"$1$2$3)}' urfile

jurgen · February 3, 2011, 5:14pm

thanks for your answer ....i will try it....no shell solution?

---------- Post updated at 05:14 PM ---------- Previous update was at 05:13 PM ----------

?????

vgersh99 · February 3, 2011, 5:21pm

something along these lines:
awk -f jur.awk myFile

jur.awk:

BEGIN {
  OFS="\t"
}
{
   cout=FILENAME "_" $1$2$3
   if (cout!=out) {
     close(out)
     out=cout
   }
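   # build "YYYY/MM/DD HH:MM:00" in the first field
   $1=$1 "/" $2 "/" $3 " " $4 ":" $5 ":00"
   # empty the fields that were merged into $1 ...
   $2=$3=$4=$5=""
   # ... and collapse the run of tabs they leave behind
   gsub(/\t+/, "\t")
   print >out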
}

use /usr/xpg4/bin/awk or gawk on Solaris.
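
For the plain-shell version asked about above, a rough sketch along the following lines might work. It uses only read, set and printf from the shell. The function name split_by_day and its three arguments are made up for illustration, and it assumes the day is passed in the same "YYYY MM DD" form as in the data and that the measurements contain no shell wildcard characters.

```shell
#!/bin/sh
# split_by_day is a hypothetical helper:
#   $1 = input file, $2 = day to keep as "YYYY MM DD", $3 = output file
split_by_day() {
    infile=$1
    day=$2
    outfile=$3
    tab=$(printf '\t')
    : > "$outfile"                          # start with an empty output file
    while read -r y m d hh mm rest; do
        [ "$y $m $d" = "$day" ] || continue # skip lines from other days
        # Unquoted on purpose: word splitting drops the repeated blanks.
        set -- $rest
        line="$y/$m/$d $hh:$mm:00"
        for field in "$@"; do
            line="$line$tab$field"
        done
        printf '%s\n' "$line" >> "$outfile"
    done < "$infile"
}
```

It would be called once per day, for example: split_by_day 111 "2011 02 03" 111_20110203. On large files this is much slower than awk because the loop runs in the shell itself.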

rdcwayx · February 4, 2011, 12:16am

nawk '{f=FILENAME "_" $1$2$3; print $1 "/" $2 "/" $3 " " $4 ":" $5 ":00", $6,$7,$8,$9,$10,$11 > f}' OFS="\t" infile

Scrutinizer · February 4, 2011, 5:23am

awk '{f=FILENAME"_"$1$2$3}p!=f{close(p);p=f}{print $1"/"$2"/"$3" "$4":"$5":00",$6,$7,$8,$9,$10,$11>f}' OFS="\t" infile
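
For readers new to awk, the one-liner above can be unpacked roughly as follows. This is only a sketch: the wrapper name split_days is made up, and it assumes every line has the eleven columns of the posted sample and that all lines for one day sit next to each other in the input (a day that reappears later would overwrite its file).

```shell
#!/bin/sh
# split_days is a hypothetical wrapper name; pass one or more input files.
split_days() {
    awk '
    {
        f = FILENAME "_" $1 $2 $3      # output name, e.g. 111_20110203
    }
    p != f {                           # the day (or the input file) changed
        if (p != "") close(p)          # release the previous output file
        p = f
    }
    {
        # The commas insert OFS (a tab) between the printed items.
        print $1 "/" $2 "/" $3 " " $4 ":" $5 ":00", $6, $7, $8, $9, $10, $11 > f
    }' OFS='\t' "$@"
}
```

Calling split_days 111 in the directory that holds the file 111 should leave 111_20110203 and 111_20110202 next to it. Closing the previous file keeps the number of open files small when an input covers many days.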

jurgen · February 4, 2011, 7:39am

Thanks for your help. I have done it this way and the date format is ok:

for station in 62105 62081
do
 grep "^$yesterday" $station.txt | sed -e 's/ /\//' -e 's/ /\//' -e 's/ /:/2' > output_yesterday

 grep "^$today" $station.txt | sed -e 's/ /\//' -e 's/ /\//' -e 's/ /:/2' > output_today
 
done

Now I just want to replace the blanks with a tab. The problem is that sometimes I have 1 space, sometimes 2 or 3 or more between the columns, and I have no idea how to solve the problem.

something like that:

-e 's/ {*}/ /g'

but its not working

Corona688 · February 4, 2011, 9:11am

"it's not working" never helps. In what way is it "not working"? Is it doing nothing? Is it eating all space?

I might try sed -e 's/[ \t]\+/\t/g' to replace each run of blanks or tabs with one tab (GNU sed syntax)..

jurgen · February 4, 2011, 9:38am

"its not working" means that it is doing nothing. The files are like the original files (separated by blanks, not by tabs); only the date format is correct.
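
For what it is worth, 's/ {*}/ /g' does nothing because in sed's basic regular expressions { and } are ordinary characters: the pattern looks for a space followed by a literal "}", which never occurs in the data. One possible way around it, sketched below, is to turn every run of blanks into a single tab first and then rewrite the first five tabs into the date and time separators. The function name to_tabbed is made up for illustration, and the sketch assumes lines start directly with the year, as in the posted sample.

```shell
#!/bin/sh
# to_tabbed is a hypothetical helper name; it reads raw input lines on stdin.
to_tabbed() {
    tab=$(printf '\t')                  # a literal tab works in any sed
    sed -e "s/[ $tab]*\$//" \
        -e "s/[ $tab][ $tab]*/$tab/g" \
        -e "s|$tab|/|" -e "s|$tab|/|" \
        -e "s|$tab| |" \
        -e "s|$tab|:|" \
        -e "s|$tab|:00$tab|"
}
```

The first expression drops trailing blanks, the second collapses each run of blanks into one tab, and the remaining five turn the tabs after year, month, day, hour and minute into "/", "/", " ", ":" and ":00" plus a tab. In the loop from the earlier post it could replace the existing sed call, for example: grep "^$today" $station.txt | to_tabbed > output_today.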