Check Date Format And Email Out

Ariean · September 17, 2013, 10:59am

Hello All,
I have a requirement where I need to get the EXTRACT_DATE from a file, check whether the date is in a valid format, and send an email if it is not. I would appreciate any help with this.

I did the following so far.

awk '{for(i=1;i++<=NF;)if($i~/^EXTRACT_DATE/) print $i}' FCBTEXAS_v1.xml

and got the following output:

EXTRACT_DATE="2012-12-31"

Now I want to connect to Oracle sqlplus, pass "2012-12-31" as an input parameter to the following SQL, and write the result to a log file. If the to_date function errors out because of a bad date format or bad data, I want to grep the log file for the error message and email it out.

select to_date('2012-12-31','yyyy-mm-dd') from dual

I appreciate your help. I would also like to know whether there is a better way to do this validation in the shell script itself, instead of connecting to the Oracle DB.

Thanks much.

Skrynesaver · September 17, 2013, 12:17pm

Something like the following would work, I guess:

awk '{for(i=1;i++<=NF;)if($i~/^EXTRACT_DATE/) print $i}' FCBTEXAS_v1.xml
sqlplus user/pw@sid << EOC | grep ORA && mail_script
select to_date('2012-12-31','yyyy-mm-dd') from dual ;
EOC

POSIX::mktime does not do what the perldoc says it should do on my Perl.... please ignore previous version
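
On the question of validating the date in the shell itself rather than through Oracle, a minimal sketch could look like the following. It assumes the date is expected as YYYY-MM-DD. The function name is_valid_date is made up for this example, and the awk part only applies the usual Gregorian month-length and leap-year rules, so it may not agree with Oracle's to_date in every corner case.

```shell
#!/bin/sh
# is_valid_date: hypothetical helper, not from the thread.
# Returns 0 if $1 is a real calendar date in YYYY-MM-DD form, 1 otherwise.
is_valid_date() {
    # Shape check: 4 digits, dash, 2 digits, dash, 2 digits, nothing else.
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) return 1 ;;
    esac

    # Range check in awk, which treats 08 and 09 as decimal numbers.
    echo "$1" | awk -F- '
        {
            y = $1 + 0; m = $2 + 0; d = $3 + 0
            if (y < 1 || m < 1 || m > 12 || d < 1)
                exit 1
            split("31 28 31 30 31 30 31 31 30 31 30 31", days, " ")
            # Gregorian leap-year rule: February gets 29 days
            if (m == 2 && ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0))
                days[2] = 29
            if (d > days[m])
                exit 1
            exit 0
        }'
}

# Example run over a few sample values
for dt in 2012-12-31 2012-02-30 2012-13-01; do
    if is_valid_date "$dt"; then
        echo "$dt valid"
    else
        echo "$dt INVALID"
    fi
done
```

The INVALID branch is where a mail command could be called instead of echo. Which mailer is available (mailx, mail, sendmail) depends on the system, so that part is left out here.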

Ariean · September 17, 2013, 1:04pm

When I run this:

awk '{for(i=1;i++<=NF;)if($i~/^EXTRACT_DATE/) print $i}' AgriBank_v2.xml

I get the output below, because the file has multiple EXTRACT_DATE entries, which is valid.

Output:

EXTRACT_DATE="2012-09-30"
EXTRACT_DATE="2012-09-30"
EXTRACT_DATE="2012-09-30"
EXTRACT_DATE="2012-09-30"

Now I want to get only:
1) the date portion
2) unique dates
3) with the double quotes removed.

So I came up with the command below. Is there a better way to achieve this?

awk '{for(i=1;i++<=NF;)if($i~/^EXTRACT_DATE/) print $i}' AgriBank_v2.xml | awk -F"\"" '{print $2}' | sort -u

Output:

2012-09-30

I am also still working on validating the date portion, as all the scripts I found online seem pretty complex. I appreciate your input.

Thank you.

Yoda · September 17, 2013, 1:19pm

Another awk approach:

awk '
        {
                # skip lines that contain no EXTRACT_DATE attribute
                if ( !match ( $0, /EXTRACT_DATE=[^ ]*/ ) )
                        next
                DT = substr ( $0, RSTART, RLENGTH )
                gsub ( /.*="|"$/, "", DT )
                A[DT]
        }
        END {
                for ( k in A )
                        print k
        }
' file

Ariean · September 17, 2013, 1:34pm

Yoda,
Thanks for your input. It returns the expected output, although it took more time to run than my statement. Could you explain what you are doing in each step? I would appreciate it if you could enlighten me a bit.

Thank you.

Yoda · September 17, 2013, 3:34pm

Here is a brief explanation:

awk '
        {
                # Match pattern EXTRACT_DATE= followed by zero or more occurrences of any character other than space [^ ]*
                # match returns 0 when the pattern is not found, so skip lines without an EXTRACT_DATE
                if ( !match ( $0, /EXTRACT_DATE=[^ ]*/ ) )
                        next

                # match function sets the built-in variable RSTART to the index and RLENGTH to the length.
                # assign variable DT = matched pattern using substr function and variables: RSTART, RLENGTH
                DT = substr ( $0, RSTART, RLENGTH )

                # remove EXTRACT_DATE=" and " from DT variable value
                gsub ( /.*="|"$/, "", DT )

                # use DT as a key in associative array: A, which removes duplicate entries
                A[DT]
        }
        END {   # END Block

                # for each key/index in associative array: A
                for ( k in A )
                        # print key/index
                        print k
        }
' file

I hope this helps.

Ariean · September 17, 2013, 3:59pm

That really helps. However, I started coding with my original statement and got stuck. Can you help?

awk '{for(i=1;i++<=NF;)if($i~/^EXTRACT_DATE|^UNINUM/) print $i}' AgriBank_v2.xml

Output:

UNINUM="0722075"
EXTRACT_DATE="2012-09-30"
UNINUM="0722146"
EXTRACT_DATE="2012-09-30"
UNINUM="0722502"
EXTRACT_DATE="2012-09-30"
UNINUM="0722643"
EXTRACT_DATE="2012-09-30"

I want to capture UNINUM and EXTRACT_DATE in the following format, so that I can loop over the pairs and assign them to variables to put more information in the log files.

0722075 2012-09-30
0722146 2012-09-30
0722502 2012-09-30
0722643 2012-09-30

Please help.

Thank you.

Yoda · September 17, 2013, 4:10pm

awk '
        {
                for ( i = 1; i <= NF; i++ )
                {
                        if ( $i ~ /^UNINUM/ )
                        {
                                u = $i
                                gsub ( /.*="|"$/, X, u )
                        }
                        if ( $i ~ /^EXTRACT_DATE/ )
                        {
                                e = $i
                                gsub ( /.*="|"$/, X, e )
                                print u, e
                        }
                }
        }
' file

Ariean · September 17, 2013, 4:19pm

That was really quick. Many thanks.
One last question, though I should have asked it earlier:
Does "i" represent the line or a line counter? I am a bit confused, as you use it again when assigning values to the variables u and e.

Yoda · September 17, 2013, 4:31pm

i is the index of the current field in your record, and $i is the field itself. Fields are separated by blank space (by default).

NF is the number of fields in the record, so $NF is the last field. The for loop goes through each field (incrementing i) until the last field (NF) is reached.
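
A small sketch may make the difference between i, $i and NF easier to see. The sample line below is invented for illustration, since the real XML is not shown in this thread, and show_fields is just a made-up wrapper name.

```shell
#!/bin/sh
# show_fields: hypothetical helper that prints the field count,
# every field index with its content, and the last field of each input line.
show_fields() {
    awk '{
        printf "NF=%d\n", NF              # number of fields in this record
        for (i = 1; i <= NF; i++)         # i is only a counter: 1, 2, ... NF
            printf "i=%d $i=%s\n", i, $i  # $i is the content of field number i
        printf "$NF=%s\n", $NF            # $NF is the content of the last field
    }'
}

# Invented sample record with four blank-separated fields
printf '%s\n' '<ROW UNINUM="0722075" EXTRACT_DATE="2012-09-30" />' | show_fields
```

Note that a loop written as for(i=1;i++<=NF;) increments i before the body runs, so its body starts at field 2 and never looks at field 1. That goes unnoticed as long as the attribute being searched for is never the first field.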

Ariean · September 17, 2013, 4:50pm

I got stuck assigning u and e to shell variables. How do I do that? Thank you.

Yoda · September 17, 2013, 5:29pm

One way is to pipe the output of this awk program into a while loop, read the values, and assign them to shell variables.

Or you can rewrite the whole program in bash instead of awk.
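
The while-loop route could be sketched roughly like this. The function name extract_pairs, the shell variable names uninum and extract_date, and the sample records are all made up for the example. The awk body is the same idea as the program posted above.

```shell
#!/bin/sh
# extract_pairs: hypothetical wrapper around the awk program from this thread.
# Reads the named files (or stdin) and prints "UNINUM EXTRACT_DATE" pairs.
extract_pairs() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^UNINUM/) {
                u = $i
                gsub(/.*="|"$/, "", u)    # strip UNINUM=" and the closing quote
            }
            if ($i ~ /^EXTRACT_DATE/) {
                e = $i
                gsub(/.*="|"$/, "", e)    # strip EXTRACT_DATE=" and the closing quote
                print u, e
            }
        }
    }' "$@"
}

# Invented sample input standing in for the real XML file
sample='<ROW UNINUM="0722075" EXTRACT_DATE="2012-09-30" />
<ROW UNINUM="0722146" EXTRACT_DATE="2012-09-30" />'

# read splits each output line on blanks into the two shell variables
printf '%s\n' "$sample" | extract_pairs |
while read -r uninum extract_date; do
    echo "UNINUM=$uninum EXTRACT_DATE=$extract_date"
done
```

With a real file the call would be extract_pairs AgriBank_v2.xml | while read ... instead of the printf. One thing to keep in mind: in many shells (bash and dash, for example) a loop at the end of a pipeline runs in a subshell, so variables set inside it are gone after done. Any logging that needs them should happen inside the loop body.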