Script to set columns to days in month

ncwxpanther · December 11, 2012, 3:18pm

I am trying to figure out how to assign columns of a text file to the day of the month. The end result will be a way to determine when each day (column) is populated with data.

The data file is in the following format:

M1Y2012 x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x 
M2Y2012 x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x 
.....
M12Y2012 x x x x x x x x -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999

Where column 1 is the month and year and the rest of the columns in that line are the data, each column being a day of the month. -999 is missing data.
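The column-to-day mapping described here can be sketched in awk: since field 1 is the month/year key, field N+1 holds day N. This is only an illustration; the helper name `populated_days` is hypothetical, and the missing marker is assumed to be -999 (compared numerically, so -999.00 is treated the same way).

```shell
#!/bin/sh
# Sketch: field 1 is the month/year key, so field N+1 holds day N.
# Prints one "key day value" line for every day that is not the missing marker.
# populated_days is a hypothetical helper name; usage: populated_days [FILE...]
populated_days() {
    awk -v missing="-999" '
    {
        for (i = 2; i <= NF; i++)
            if ($i + 0 != missing + 0)   # numeric compare: -999 and -999.00 both count as missing
                print $1, i - 1, $i
    }' "$@"
}

# Example: three days of data followed by two missing days
printf 'M12Y2012 0.32 1.13 -0.67 -999 -999\n' | populated_days
# prints:
# M12Y2012 1 0.32
# M12Y2012 2 1.13
# M12Y2012 3 -0.67
```

The last day number printed for a month is the most recent day with data.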

I am somewhat familiar with awk and sed. Any suggestions are appreciated.

Thanks.

aster007 · December 11, 2012, 3:55pm

I didn't quite understand your requirements, but if you have an input file with data, each line can be prefixed with today's date in the way shown below:

# Prefix every input line with today's date (e.g. 11Dec2012)
while read -r LINE
do
    TODAY=$(date "+%d%b%Y")

    echo "$TODAY $LINE" >> "$OUTPUT_FILE"

done < "$INPUT_FILE"

rdrtx1 · December 11, 2012, 3:55pm

Not sure of purpose of missing data columns. An example script to start with:

while read d x
do
   cal $(echo $d |
     awk -F"[A-Z]*" '{print $1,$2,$3}') | sed -n '3,$p' |
     awk -v d=$d 'BEGIN {printf d" "}{$1=$1}{print $0" "} END {print "\n"}' ORS=
done < infile

RudiC · December 11, 2012, 4:04pm

Not sure I understand either. Why do you have 41 cols for Jan and Feb when each col is a day of month?
Is the data file you present input or output?

ncwxpanther · December 11, 2012, 4:11pm

So each morning the previous day's data is uploaded, replacing the -999.xx with actual data. So today's data file would look something like this:

Y2012M12   xx.xx       xx.xx        xx.xx   xx.xx        xx.xx        xx.xx   xx.xx       xx.xx       xx.xx        xx.xx  -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx   -999.xx

RudiC · December 11, 2012, 4:35pm

So now -999.xx is missing data? In your first sample it was -999 indicating missing data.

ncwxpanther · December 11, 2012, 6:08pm

Missing data is indicated by -999.00.

RudiC · December 11, 2012, 6:30pm

Try this and comment on the result:

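# Replace each -999.00 with a single space so neighbouring values stay separate fields
sed    's/ *-999\.00 */ /g' file |
awk    'BEGIN {split("31,28,31,30,31,30,31,31,30,31,30,31", months, ",")}
        # substr($1,7) assumes a key like Y2012M12, where the month starts at character 7
        {m = substr ($1,7)+0; if (NF-1 < months [m]) print "month", $1, "incomplete"}
       '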
month Y2012M12 incomplete
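
The month table above hard-codes 28 days for February, so a complete February in a leap year would have one more value than the table expects. A minimal sketch of a leap-year-aware lookup follows; `days_in_month` is a hypothetical helper name, not part of the script above.

```shell
#!/bin/sh
# Sketch: number of days in a month, with the Gregorian leap-year rule.
# days_in_month is a hypothetical helper; usage: days_in_month YEAR MONTH
days_in_month() {
    awk -v y="$1" -v m="$2" 'BEGIN {
        split("31,28,31,30,31,30,31,31,30,31,30,31", days, ",")
        # Leap year: divisible by 4, except century years not divisible by 400
        if (m == 2 && (y % 4 == 0 && y % 100 != 0 || y % 400 == 0))
            print 29
        else
            print days[m + 0]
    }'
}

days_in_month 2012 2    # prints 29
days_in_month 2013 2    # prints 28
```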

ncwxpanther · December 12, 2012, 9:19am

So I was thinking of something a little different from what has been suggested here. I thought that I could use the environment variables:

setenv day `date +"%d"`
setenv pday `expr ${day} - 1`

After assigning a day for each set of columns, I was thinking of looking for the line with the current year and month (Y2012M12), then determining whether the previous day or the previous two days have data.

Sorry I was not more clear in previous posts.

Sample data:

12345678901Y2012M12A123    0.32        1.13       -0.67        8.33        0.50       -8.41      -14.26      -18.04       -8.14       11.75       -999.00      -999.00     -999.00     -999.00       -6.16       -8.05      -15.07      -19.03      -21.19      -16.15      -19.39      -17.50       -9.76     -999.00     -999.00     -999.00     -999.00     -999.00     -999.00     -999.00     -999.00
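
The approach described in this post (find the line for a given year and month, then check one day's column) could be sketched as follows. This is a sketch under stated assumptions: `day_status` is a hypothetical helper, field 1 is assumed to contain the key (for example Y2012M12) possibly surrounded by other characters, field DAY+1 is assumed to hold that day, and -999.00 is taken as the missing marker. The key and day are passed in explicitly, because computing the previous day with `expr ${day} - 1` gives 0 on the first of the month.

```shell
#!/bin/sh
# Sketch: report whether a given day has data in the line for a given year/month.
# day_status is a hypothetical helper; usage: day_status KEY DAY [FILE...]
# Assumes field 1 contains KEY (e.g. Y2012M12) and field DAY+1 holds that day.
day_status() {
    key=$1 day=$2
    shift 2
    awk -v key="$key" -v day="$day" -v missing="-999.00" '
    $1 ~ (key "([^0-9]|$)") {        # trailing non-digit or end, so Y2012M1 does not match Y2012M12
        col = day + 1
        if (col > NF)
            print key, "day", day, "absent"
        else if ($col + 0 == missing + 0)
            print key, "day", day, "missing"
        else
            print key, "day", day, "has data:", $col
    }' "$@"
}

# Example using the first twelve days of the sample line above
line='12345678901Y2012M12A123 0.32 1.13 -0.67 8.33 0.50 -8.41 -14.26 -18.04 -8.14 11.75 -999.00 -999.00'
printf '%s\n' "$line" | day_status Y2012M12 10   # prints: Y2012M12 day 10 has data: 11.75
printf '%s\n' "$line" | day_status Y2012M12 11   # prints: Y2012M12 day 11 missing
```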