Replace a field in a comma separated file

Hello Experts,

I have a sample comma separated file as having date data in field # 5 and field #8. Field # 4 and 7 (fields before the date field) are indicators:

This is just a sample data, the actual file may have n number of date fields.

29,A Store,A Street,1,111213,aaaa,0,891213
30,B Store,B Street,0,991213,aaaa,1,61213
31,C Store,C Street,1,51213,aaaaa,1,81213
32,D Store,D Street,0,0,aaaa,1,150323
33,E Store,E Street,1,121212,bbbb,0,0
34,F Store,F Street,1,101212,cccc,0,971212

I need to update field #5 (depending on the value in field # 4 (indicator field)) and field #8 (depending on the value in field#7) and so on as :

  1. If field # 4 = 0 and field # 5 <> 0 (say data is : 991213), then append '19' to the string in field # 5 and make the final value
    as : 1999-12-13. This is for years before 2000

  2. If field # 4 = 0 and field # 5 = 0 (say data is : 0), then the final value as : 0001-01-01

  3. If field # 4 = 1, then
    a. check if field # 5 has length = 5, then (for data say: 51213), append '200' so that the final value should be 2011-12-13
    b. Else, (for data say : 111213), append '20' so that the final value appears as : 2011-12-13.

Input parameters to the script :
$1 : file name
$2 , $3, ........ this depends on the number of date fields which need to be transformed.

I have written the following code:

#! /usr/bin/ksh

# Read File name
INPUT=$1

 while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7 f8
 do 
 For Century = 0
  if [ $f4 -eq 0 -a $f5 -ne 0 ]
  then
      echo "19$f5" | sed 's/./&-/4;s/./&-/7' ;
  fi

 For 0 in date
  if [ $f4 -eq 0 -a $f5 -ne 0 ]
  then
      echo "00010101" | sed 's/./&-/4;s/./&-/7' ;
  fi

 for century = 1
  if [ $f4 -eq 1 ]
  then
      char_len=`expr length $f5`
      if [ $char_len -eq 5 ]
      then
         echo "200$f5" | sed 's/./&-/4;s/./&-/7';
      else
          echo "20$f5" | sed 's/./&-/4;s/./&-/7';
      fi
  fi

 done < "$INPUT"

I am able to transform the data for Field # 5 only.
Could you please suggest a better approach for this requirement such that I am able to transform the data for field # 8 as well.

Thank you

a bit verbose and can be simplified - just following your description.

 awk -v fld='5,8' -f hsquared.awk myFile.csv

where hsquared.awk is:

BEGIN {
  FS=OFS=","
  fld=(!fld)?"5":fld
  fldN=split(fld, fldA,FS)
}
function convF(str ) {
 return (substr(str,1,4) "-" substr(str,5,2) "-" substr(str,7))
}
{
   for(i=1;i<=fldN;i++) {
     if ($(fldA-1)==0 && $(fldA)!=0)
       $(fldA)=convF("19" $(fldA))

     if ($(fldA-1)==0 && $(fldA)==0)
       $(fldA)=convF("00010101")

     if ($(fldA-1)==1) {
        if (length($(fldA)) == 5)
           $(fldA)=convF("200" $(fldA))
        else
           $(fldA)=convF("20" $(fldA))
     }

   }
}
1

results in:

29,A Store,A Street,1,2011-12-13,aaaa,0,1989-12-13
30,B Store,B Street,0,1999-12-13,aaaa,1,2006-12-13
31,C Store,C Street,1,2005-12-13,aaaaa,1,2008-12-13
32,D Store,D Street,0,0001-01-01,aaaa,1,2015-03-23
33,E Store,E Street,1,2012-12-12,bbbb,0,0001-01-01
34,F Store,F Street,1,2010-12-12,cccc,0,1997-12-12

awk -f hsquared.awk myFile.csv will do only field 5 by default

1 Like

Thanks for the quick response.

I would mention that the number of arguments calling the script could be more than 2 and the number of fields to be transformed can be more than 2.

The field # 12 (though not present in my sample file) would be a new date field that would be transformed.

Could you please share your thoughts on this.

Please let me know if you need more inputs from my end.

I'll leave it up to you to implement the shell wrapper script, but...
for the 3 fields (5,8 and 12) to be modified, awk should be called as:

awk -v fld='5,8,12' -f hsquared.awk myFile.csv
1 Like

Thanks.
I shall try and update.

How about

awk -F, -v"FLDS=5,8" '
BEGIN                   {FCNT = split(FLDS, FLD)
                        }

function CV(TMP)        {Y = int(TMP/1E4)
                         return sprintf ("%d-%02d-%02d", Y, int(TMP%Y/100), TMP%100)
                        }

                        {for (i=1; i<=FCNT; i++)        {IX = FLD
                                                         if ($IX) $IX = CV((19+$(IX-1))*1E6 + $(IX))
                                                             else $IX = "0001-01-01"  
                                                        }
                        }
1
' OFS=, file
29,A Store,A Street,1,2011-12-13,aaaa,0,1989-12-13
30,B Store,B Street,0,1999-12-13,aaaa,1,2006-12-13
31,C Store,C Street,1,2005-12-13,aaaaa,1,2008-12-13
32,D Store,D Street,0,0001-01-01,aaaa,1,2015-03-23
33,E Store,E Street,1,2012-12-12,bbbb,0,0001-01-01
34,F Store,F Street,1,2010-12-12,cccc,0,1997-12-12
1 Like

Thanks.

I am writing a wrapper script to call the awk (say driver.awk) script

# Skip the 1st argument as it is the file name
shift
# fetch the argument values, in comma separated form
arg_list=`echo $* | tr ' ' ','`

awk -v fld='$arg_list' -f driver.awk file.txt > file.tmp

However, I get the following error:

awk: The field -1 cannot be less than 0.

Now, if i hard code the fields to be transformed, then it works fine.

awk -v fld='8,12' -f driver.awk file.txt > file.tmp

Could you please suggest a way to dynamically set the fields to be transformed for the AWK command.

Thanks.

Two comments:

  • the shell does not expand variables enclosed in single quotes. Use double quotes.

  • no translation space -> comma needed, just change the field separator in awk 's split command.

Like

$ set -- FN 5 8
$ shift
$ awk -F, -v"FLDS=$*" '
BEGIN                   {FCNT = split(FLDS, FLD, " ")
                        }

function CV(IX)         {TMP = (19+$(IX-1))*1E6 + $(IX)
                         Y   = int(TMP/1E4)
                         return sprintf ("%d-%02d-%02d", Y, int(TMP%Y/100), TMP%100)
                        }

                        {for (i=1; i<=FCNT; i++) $FLD = $FLD ? CV(FLD) : "0001-01-01"  
                        }
1
' OFS=, file
29,A Store,A Street,1,2011-12-13,aaaa,0,1989-12-13
30,B Store,B Street,0,1999-12-13,aaaa,1,2006-12-13
31,C Store,C Street,1,2005-12-13,aaaaa,1,2008-12-13
32,D Store,D Street,0,0001-01-01,aaaa,1,2015-03-23
33,E Store,E Street,1,2012-12-12,bbbb,0,0001-01-01
34,F Store,F Street,1,2010-12-12,cccc,0,1997-12-12