Script to change the date format in a file

I have many files with dates in the format 6-9-2008, and I want a script that changes them to 2008-06-09.

Thanks

Unless you provide an example of one or more of the files, it is difficult for anybody to help you.

Hi

echo "6-9-2008" | sed 's/\(.\)-\(.\)-\(.*\)/\3-0\1-0\2/'

This only handles single-digit months and days. It would be nice if you provided one file as an example, as fpmurphy suggested.

Thanks
Penchal
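A sed version that also copes with two-digit months and days could look like the sketch below. It is only a sketch: the function name mdy_to_iso is made up for illustration, it assumes a sed that accepts -E (GNU and BSD sed do), and the second expression strips the extra zero from any "-0NN" on the line, so it could touch other text that happens to look like that.

```shell
# Hypothetical helper: rewrite every M-D-YYYY date on stdin as YYYY-MM-DD.
mdy_to_iso() {
    # 1st expression: reorder to YYYY-0M-0D, always inserting a zero.
    # 2nd expression: drop that zero again where the part already had two digits.
    sed -E \
        -e 's/(^|[^0-9])([0-9]{1,2})-([0-9]{1,2})-([0-9]{4})/\1\4-0\2-0\3/g' \
        -e 's/-0([0-9]{2})/-\1/g'
}

printf '%s\n' '6-9-2008' '9-20-2008' '12-25-2008' | mdy_to_iso
# prints:
# 2008-06-09
# 2008-09-20
# 2008-12-25
```

Because it works on the whole line rather than on a numbered field, it does not care whether the file is space- or pipe-delimited.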

Here are a few examples:

NO1A_iCP0041 52 6-9-2008 11 11 8 8

NO1A_iCP0041 52 6-11-2008 11 11 8 8

NO1A_iCP0041 52 9-20-2008 11 11 8 8

The dates are in M-D-YYYY format; I am trying to get them into YYYY-MM-DD.

Thanks

Given this input file:

NO1A_iCP0041 52 6-9-2008 11 11 8 8
NO1A_iCP0041 52 6-11-2008 11 11 8 8
NO1A_iCP0041 52 9-20-2008 11 11 8 8

Try:

awk '{ split($3, d, "-"); $3=sprintf("%04d-%02d-%02d", d[3], d[1], d[2]); print }' input_file.txt
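The one-liner splits field 3 on "-" so that d[1] is the month, d[2] the day and d[3] the year, then sprintf zero-pads each part. If some lines might not carry a date in field 3 (a header line, say), a guarded variant along these lines may be safer. The function name fix_date_field3 is hypothetical, and the pattern assumes one- or two-digit months and days with a four-digit year.

```shell
# Hypothetical variant: only rewrite field 3 when it looks like M-D-YYYY.
fix_date_field3() {
    awk '
        $3 ~ /^[0-9][0-9]?-[0-9][0-9]?-[0-9][0-9][0-9][0-9]$/ {
            split($3, d, "-")                                  # d[1]=month, d[2]=day, d[3]=year
            $3 = sprintf("%04d-%02d-%02d", d[3], d[1], d[2])
        }
        { print }                                              # other lines pass through unchanged
    '
}

printf '%s\n' 'NO1A_iCP0041 52 6-9-2008 11 11 8 8' 'header line without a date' | fix_date_field3
# prints:
# NO1A_iCP0041 52 2008-06-09 11 11 8 8
# header line without a date
```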

What would I change when the data is pipe-delimited?

Example:

NO1A_iCP0041|52|11|11|8|8|6-9-2008
NO1A_iCP0041|52|11|11|8|8|6-11-2008
NO1A_iCP0041|11|11|8|8||52 9-20-2008

Thanks for the help.

Also, a couple of files look like this:

12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-10-2007|2-11-2007|
12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|12-9-2007|12-9-2007|
12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-2-2007|2-2-2007|

Change the awk delimiter:

awk -F"|" .....

It seems that your date field is the 7th, so you need to replace $3 with $7 in the script.

BTW, I assume the third record you provided is in the wrong format!
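If you are not sure which field numbers hold the dates in a given file, a quick check like the one below may help before you edit the script. It is a rough sketch: find_date_fields is a made-up name, and it only recognises fields that consist of nothing but a M-D-YYYY date.

```shell
# Hypothetical helper: for each pipe-delimited record on stdin, list the
# numbers of the fields that consist solely of a M-D-YYYY date.
find_date_fields() {
    awk -F'|' '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9][0-9]?-[0-9][0-9]?-[0-9][0-9][0-9][0-9]$/)
                out = out (out == "" ? "" : ",") i
        print NR ": " (out == "" ? "none" : out)
    }'
}

printf '%s\n' \
    'NO1A_iCP0041|52|11|11|8|8|6-9-2008' \
    'NO1A_iCP0041|11|11|8|8||52 9-20-2008' \
    '12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-10-2007|2-11-2007|' | find_date_fields
# prints:
# 1: 7
# 2: none
# 3: 6,7
```

The malformed record shows up as "none" because its 7th field is "52 9-20-2008" rather than a bare date.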

How about if I want to do both $7 and $8?

The method is always the same: just add another array variable to hold the second date you need to manipulate. In your second sample the dates are the 6th and 7th fields, so that is what I use here, e.g.:

awk -F"|" '{
   split($6, d1, "-");
   split($7, d2, "-");
   $6=sprintf("%04d-%02d-%02d", d1[3], d1[1], d1[2]);
   $7=sprintf("%04d-%02d-%02d", d2[3], d2[1], d2[2]);
   print;
}' input_file.txt

Thanks, works great.

By the way, can I keep the pipe delimiters? It removes all the pipes.

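Oh, sure :)

Just add the line:

OFS="|";

before the split commands. awk rejoins the fields with OFS (a single space by default) whenever a field is assigned, which is why the pipes disappeared.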
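Putting the pieces together, the whole thing might look like the sketch below. It sets FS and OFS once in a BEGIN block instead of on every record, and takes the date field numbers as an argument so the same script serves both file layouts. The function name fix_pipe_dates and the cols variable are assumptions for illustration, not part of the script above.

```shell
# Hypothetical wrapper: $1 is a comma-separated list of the field numbers
# that hold M-D-YYYY dates, e.g. "7" or "6,7".
fix_pipe_dates() {
    awk -v cols="$1" '
        BEGIN { FS = OFS = "|"; n = split(cols, c, ",") }
        {
            for (i = 1; i <= n; i++) {
                f = c[i]
                # Only touch the field if it really has three "-" separated parts.
                if (split($f, d, "-") == 3)
                    $f = sprintf("%04d-%02d-%02d", d[3], d[1], d[2])
            }
            print
        }
    '
}

printf '%s\n' '12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2-10-2007|2-11-2007|' | fix_pipe_dates 6,7
# prints: 12|kenneth.ludlam@yahoo.com|Boston|iVPUd|MPDD|2007-02-10|2007-02-11|
```

For the first layout you would call it as fix_pipe_dates 7 with the file on stdin.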

I am getting a "too long" error after record 83 of my file when I run awk. Is there a fix for this?

I think not... Well, depending on your platform, you could try "nawk" (on Solaris) or "gawk" instead of "awk"; they may have a larger limit on record length.
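To try the alternatives without editing every script, something like this could pick whichever implementation is installed. It is a sketch only: pick_awk is a made-up name, the preference order is my assumption, and whether the chosen awk actually accepts your long records still has to be tested on your file.

```shell
# Hypothetical helper: print the name of the first awk implementation on PATH,
# preferring gawk, then nawk, then plain awk.
pick_awk() {
    for candidate in gawk nawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Run a small conversion with whatever was found.
AWK=$(pick_awk) && printf '%s\n' '6-9-2008' |
    "$AWK" '{ split($1, d, "-"); printf "%04d-%02d-%02d\n", d[3], d[1], d[2] }'
# prints: 2008-06-09
```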

Otherwise, I think the only way around awk's limits is to use another tool, such as perl. There is a utility called "a2p" that converts awk code into perl (see the man page for details).

a2p -F"|" awk_script > perl_script
chmod +x perl_script
./perl_script input_file.txt

On my box, the perl_script produced is this:

#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if $running_under_some_shell;
                        # this emulates #! processing on NIH machines.
                        # (remove #! line above if indigestible)

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z_0-9]+=)(.*)/ && shift;
                        # process any FOO=bar switches

$[ = 1;                 # set array base to 1
$FS = '\|';             # field separator from -F switch
$, = ' ';               # set output field separator
$\ = "\n";              # set output record separator

while (<>) {
    chomp;      # strip record separator
    @Fld = split(/[|\n]/, $_, -1);

    $, = '|';
    @d1 = split(/-/, $Fld[6], -1);
    @d2 = split(/-/, $Fld[7], -1);
    $Fld[6] = sprintf('%04d-%02d-%02d', $d1[3], $d1[1], $d1[2]);
    $Fld[7] = sprintf('%04d-%02d-%02d', $d2[3], $d2[1], $d2[2]);
    print join($,,@Fld);
}

Try and see what happens :)