script to change the date format in a file

shehzad_m · June 9, 2008, 3:49pm

I have many files with dates in the format 6-9-2008, and I want a script that can change them to 2008-06-09.

Thanks

fpmurphy · June 10, 2008, 6:23am

Unless you provide an example of one or more of the files, it is difficult for anybody to help you.

penchal_boddu · June 10, 2008, 6:29am

Hi

echo "6-9-2008" | sed 's/\(.\)-\(.\)-\(.*\)/\3-0\1-0\2/g'

It would be nice if you could provide one file as an example, as fpmurphy suggested.

Thanks
Penchal
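
A note on the sed one-liner: it captures exactly one character for the month and one for the day, so a date such as 9-20-2008 would not be handled. A rough sketch that also copes with two-digit values might look like the following. The function name mdy_to_iso_sed is made up for illustration, and the patterns assume the dates look like M-D-YYYY and are not glued to other digits.

```shell
# Hypothetical helper: rewrite M-D-YYYY dates anywhere on each input line.
mdy_to_iso_sed() {
  # 1) reorder to YYYY-M-D
  # 2) zero-pad a one-digit month
  # 3) zero-pad a one-digit day followed by a non-digit
  # 4) zero-pad a one-digit day at the end of the line
  sed \
    -e 's/\([0-9]\{1,2\}\)-\([0-9]\{1,2\}\)-\([0-9]\{4\}\)/\3-\1-\2/g' \
    -e 's/\([0-9]\{4\}\)-\([0-9]\)-/\1-0\2-/g' \
    -e 's/\([0-9]\{4\}-[0-9]\{2\}\)-\([0-9]\)\([^0-9]\)/\1-0\2\3/g' \
    -e 's/\([0-9]\{4\}-[0-9]\{2\}\)-\([0-9]\)$/\1-0\2/'
}

echo "6-9-2008" | mdy_to_iso_sed     # prints 2008-06-09
echo "9-20-2008" | mdy_to_iso_sed    # prints 2008-09-20
```

Since sed cannot zero-pad conditionally in a single substitution, the padding is split into separate expressions.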

shehzad_m · June 10, 2008, 10:01am

Here are a few examples:

NO1A_iCP0041 52 6-9-2008 11 11 8 8

NO1A_iCP0041 52 6-11-2008 11 11 8 8

NO1A_iCP0041 52 9-20-2008 11 11 8 8

The dates are in M-D-YYYY format; I am trying to get them into YYYY-MM-DD.

Thanks

robotronic · June 10, 2008, 12:22pm

Given this input file:

NO1A_iCP0041 52 6-9-2008 11 11 8 8
NO1A_iCP0041 52 6-11-2008 11 11 8 8
NO1A_iCP0041 52 9-20-2008 11 11 8 8

Try:

awk '{ split($3, d, "-"); $3=sprintf("%04d-%02d-%02d", d[3], d[1], d[2]); print }' input_file.txt
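
To make the expected result concrete, here is the same awk program wrapped in a small shell function and run on the three sample records. The function name mdy_to_iso_field3 is only an illustrative choice, and the sketch assumes whitespace-separated fields with the date always in field 3.

```shell
# Hypothetical wrapper: reads records on stdin, rewrites field 3.
mdy_to_iso_field3() {
  # split() breaks "6-9-2008" into d[1]=6 (month), d[2]=9 (day), d[3]=2008 (year);
  # sprintf() then zero-pads month and day and puts the year first.
  awk '{ split($3, d, "-"); $3 = sprintf("%04d-%02d-%02d", d[3], d[1], d[2]); print }'
}

printf '%s\n' \
  'NO1A_iCP0041 52 6-9-2008 11 11 8 8' \
  'NO1A_iCP0041 52 6-11-2008 11 11 8 8' \
  'NO1A_iCP0041 52 9-20-2008 11 11 8 8' | mdy_to_iso_field3
# NO1A_iCP0041 52 2008-06-09 11 11 8 8
# NO1A_iCP0041 52 2008-06-11 11 11 8 8
# NO1A_iCP0041 52 2008-09-20 11 11 8 8
```

Note that assigning to $3 makes awk rebuild the record with single spaces between fields, which is harmless for this input.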

shehzad_m · June 10, 2008, 12:39pm

What would I change when the data is pipe-delimited?

For example:

NO1A_iCP0041|52|11|11|8|8|6-9-2008
NO1A_iCP0041|52|11|11|8|8|6-11-2008
NO1A_iCP0041|11|11|8|8||52 9-20-2008

Thanks for the help

shehzad_m · June 10, 2008, 12:44pm

Also, a couple of files look like these:

12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-10-2007|2-11-2007|
12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|12-9-2007|12-9-2007|
12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-2-2007|2-2-2007|

robotronic · June 10, 2008, 12:45pm

Change the awk delimiter:

awk -F"|" .....

It seems that your date field is the 7th, so you need to replace $3 with $7 in the script.

BTW, I assume the third record you provided is in the wrong format!
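
If malformed records like that third one are a concern, it may be worth listing them before converting anything. A minimal sketch, assuming the date is expected in field 7 and looks like M-D-YYYY; the function name report_bad_dates is invented for this example.

```shell
# Hypothetical checker: print "line number: record" for every record
# whose 7th pipe-delimited field is not a plain M-D-YYYY date.
report_bad_dates() {
  awk -F'|' '$7 !~ /^[0-9][0-9]?-[0-9][0-9]?-[0-9][0-9][0-9][0-9]$/ { print NR ": " $0 }'
}

printf '%s\n' \
  'NO1A_iCP0041|52|11|11|8|8|6-9-2008' \
  'NO1A_iCP0041|52|11|11|8|8|6-11-2008' \
  'NO1A_iCP0041|11|11|8|8||52 9-20-2008' | report_bad_dates
# 3: NO1A_iCP0041|11|11|8|8||52 9-20-2008
```

The pattern spells out the four year digits instead of using {4}, since some older awk versions do not support interval expressions.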

shehzad_m · June 10, 2008, 12:49pm

How about if I want to do both $7 and $8?

robotronic · June 10, 2008, 12:50pm

The method is always the same: just add another array variable to hold the second date you need to manipulate, e.g.:

awk -F"|" '{
   split($6, d1, "-");
   split($7, d2, "-");
   $6=sprintf("%04d-%02d-%02d", d1[3], d1[1], d1[2]);
   $7=sprintf("%04d-%02d-%02d", d2[3], d2[1], d2[2]);
   print;
}' input_file.txt

shehzad_m · June 10, 2008, 12:54pm

Thanks, works great.

shehzad_m · June 10, 2008, 1:00pm

By the way, can I keep the file pipe-delimited? It removes all the pipes.

robotronic · June 10, 2008, 1:06pm

Oh, sure

Just add the line:

OFS="|";

before the split commands.
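
Putting the pieces together, the whole thing could be sketched as below. Setting both FS and OFS in a BEGIN block is just one common way to do it (the OFS line inside the main block works as well), and the function name convert_dates is made up for this example. It assumes the dates sit in fields 6 and 7, as in the sample records with the e-mail address.

```shell
# Hypothetical helper: reads pipe-delimited records on stdin and
# rewrites the M-D-YYYY dates in fields 6 and 7 as YYYY-MM-DD.
convert_dates() {
  awk 'BEGIN { FS = OFS = "|" }   # split on "|" and rejoin with "|"
  {
    split($6, d1, "-")            # d1[1]=month, d1[2]=day, d1[3]=year
    split($7, d2, "-")
    $6 = sprintf("%04d-%02d-%02d", d1[3], d1[1], d1[2])
    $7 = sprintf("%04d-%02d-%02d", d2[3], d2[1], d2[2])
    print
  }'
}

printf '%s\n' '12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-10-2007|2-11-2007|' | convert_dates
# 12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2007-02-10|2007-02-11|
```

On a real file this would be run as: convert_dates < input_file.txt > output_file.txt. The pipes disappear without OFS because awk rebuilds the record with OFS (a space by default) as soon as a field is assigned.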

shehzad_m · June 10, 2008, 1:56pm

I am getting a "too long" error after record 83 of my file when I run awk. Is there a fix for this?

robotronic · June 11, 2008, 10:30am

I don't think so... Well, depending on your platform, you could try "nawk" (on Solaris) or "gawk" instead of "awk"; they may have a larger limit on record length.
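
To see which of these is actually installed, something like this could help. It is only a sketch: the function name pick_awk and the order of preference are my own choices, and whether a given awk really has a larger record limit has to be checked on your system.

```shell
# Hypothetical helper: print the name of the first awk variant found in PATH.
pick_awk() {
  for candidate in gawk nawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1    # none of the candidates is installed
}

AWK=$(pick_awk)
echo "using: $AWK"

# Quick sanity check of the chosen interpreter on a single date.
printf '6-9-2008\n' | "$AWK" -F- '{ printf "%04d-%02d-%02d\n", $3, $1, $2 }'
# 2008-06-09
```

The rest of the script can then call "$AWK" wherever it previously called awk.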

Otherwise, I think the only way around awk's limits would be to use another tool, like perl. There's a utility called "a2p" which converts awk code into perl (see the man page for details).

a2p -F"|" awk_script > perl_script
chmod +x perl_script
./perl_script input_file.txt

On my box, the perl_script produced is this:

#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if $running_under_some_shell;
                        # this emulates #! processing on NIH machines.
                        # (remove #! line above if indigestible)

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z_0-9]+=)(.*)/ && shift;
                        # process any FOO=bar switches

$[ = 1;                 # set array base to 1
$FS = '\|';             # field separator from -F switch
$, = ' ';               # set output field separator
$\ = "\n";              # set output record separator

while (<>) {
    chomp;      # strip record separator
    @Fld = split(/[|\n]/, $_, -1);

    $, = '|';
    @d1 = split(/-/, $Fld[6], -1);
    @d2 = split(/-/, $Fld[7], -1);
    $Fld[6] = sprintf('%04d-%02d-%02d', $d1[3], $d1[1], $d1[2]);
    $Fld[7] = sprintf('%04d-%02d-%02d', $d2[3], $d2[1], $d2[2]);
    print join($,,@Fld);
}

Try it and see what happens.