Pattern to match date in YYYY-MM-DD format on Linux machine

Hi Expert,

I need your help.
For date validation in a CSV file, I have written the code below on a Linux machine.
I want the date to be in the format 2017-05-11 (YYYY-MM-DD); if it is not in this format, an error should be printed.
Could you please help me find the right pattern to match the above date format?

if ($14 !~ /^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/) {print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:"NR,$0}

But it is not working: even when the date is in the correct format, I still get the error below.

Error 131: Incorrect DocumentDate pattern Field position 14, Linenumber:2

If what you showed us is the complete diagnostic message you're getting from the code you showed us, it is complaining when it found a blank (or empty) line in your input file.

Hi Don,

Yes, this is the problem:
the date is present in the correct format (2017-11-03) in the column, but I am still getting the error message below.

Error 131: Incorrect DocumentDate pattern Field position 14, Linenumber:2

However, the error is supposed to appear only when the column value is empty or the date is in the wrong format (for example 17-11-03 or 2017-1-03).

No! If that is the diagnostic message you're getting, the line that is being processed is a blank line. Note that the code that prints that message displays $0 (i.e., the entire contents of that record) at the end of the output produced by that awk print statement. So, $14 (and every other field in that record) is an empty field and you get the diagnostic message you showed us.
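A minimal sketch of that point, using a made-up two-line input (a header followed by an empty line) rather than your real file. The variable name out is only for illustration:

```shell
# Feed awk a header line followed by an empty line (illustrative input only).
# On the empty line $14 is empty, so the !~ test succeeds and the message is
# printed with an empty $0 at the end.
out=$(printf 'header\n\n' | awk -F'|' 'NR>1{
if ($14 !~ /^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/) {print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:"NR,$0}
}')
printf '%s\n' "$out"
```

The printed line stops right after "Linenumber:2" because $0 has nothing in it, which is the shape of the message quoted in this thread.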

Please show line No. 2 of your input file.

Hi Rudic,

Below is line number 2.

KE|KE_OUT_B2B_OSR_TT_TT_20171025_V1.0.txt|22||Kar|Outward|KAR|PO|082017|29AAACT2438A1ZP|INV|TA|3010482048|2017-11-03||||1|29AAAC|||M/s H . LTD||||29||||||9984||Tel Ser|||||0.00|0.00|9.00|45.00|9.00|45.00|||||590.00||||||||||

If the above is record #2 in your input file, the error message your code would have printed would have been:

Error 131: Incorrect DocumentDate pattern Field position 14, Linenumber:2 KE|KE_OUT_B2B_OSR_TT_TT_20171025_V1.0.txt|22||Kar|Outward|KAR|PO|082017|29AAACT2438A1ZP|INV|TA|3010482048|2017-11-03||||1|29AAAC|||M/s H . LTD||||29||||||9984||Tel Ser|||||0.00|0.00|9.00|45.00|9.00|45.00|||||590.00||||||||||

instead of what you showed us in post #1 and post #3 in this thread.

But, of course, it is possible that something else (in the code you haven't shown us) cleared $0 before you got to the code you have shown us.

Hi Don,

I agree that the complete error is the one you pasted.
My intention was not to hide anything from anyone.
Just to keep it short and simple, I pasted only the following:

Error 131: Incorrect DocumentDate pattern Field position 14, Linenumber:2

Below is my complete code; to keep it short, I have removed the other validation conditions.

awk -F"|" 'NR>1{
if (length($1)>25) {print "Error 101: Source Identifier exceeds the allowed limit","Field position 1, Linenumber:"NR,$0}
if ($1 ~ /[A-Za-z]+[0-9]+/ || $1 ~ /^[0-9]*$/) {print "Error 102: Source Identifier contains String other than characters","Field position 1, Linenumber:"NR,$0}
if (length($2)>50) {print "Error 103: Source File name exceeds the allowed limit","Field position 2, Linenumber:"NR,$0}
if (length($14)>10) {print "Error 130: DocumentDate pattern exceeds allowed limit", "Linenumber:"NR,$0}
if ($14 !~ /^[0-9]{4}\-(0[1-9]|1[0-2])\-(0[1-9]|[1-2][0-9]|3[0-1])$/) {print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:"NR,$0}
if ($11 ~ /^CAN$/ && $13==$15) {print "Error 208: column 13 and column 15 matches ","Field position 13, Linenumber:"NR,$0}
if ($3 ~ /[^A-Za-z0-9]+/ || $3 != "") {print "Error 209: GLACode contains string contain space","Field position 3, Linenumber:"NR,$0}
#{printf var1}
printf("\n")
}' /Scripts/gt/test1.txt > aj.txt
          

I will stand by my original statement that to get the output you showed us before, the only way to get that output was for line #2 in your input file to be a blank line. That is why it is crucial that you show us the actual output produced by your code instead of an abridged form that hides whatever problem you may actually be encountering.

With the code you showed us in post #1 in this thread:

if ($14 !~ /^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/) {print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:"NR,$0}

and with the code you showed us in post #8:

if ($14 !~ /^[0-9]{4}\-(0[1-9]|1[0-2])\-(0[1-9]|[1-2][0-9]|3[0-1])$/) {print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:"NR,$0}

and with a simpler ERE (almost equivalent: it additionally accepts a day of 00):

if ($14 !~ /^[0-9]{4}-(0[1-9]|1[0-2])-([0-2][0-9]|3[0-1])$/) {print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:"NR,$0}

I do not get any output from any of these three if statements from a file where line 2 in that file is the text you showed us in post #6.
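One thing worth ruling out when results differ between machines, offered as an assumption rather than a diagnosis: some older awk implementations (for example mawk 1.3.3, or gawk before 4.0 unless --re-interval or --posix is given) do not treat {4} as an interval expression. The sketch below writes the same test without intervals. The check function and the shortened 14-field records are hypothetical, made up for this illustration:

```shell
# check: hypothetical helper that runs the date test on one "|"-separated record.
# The year is spelled out as four [0-9] classes so no interval expression is needed.
check() {
  printf '%s\n' "$1" | awk -F'|' '{
    if ($14 !~ /^[0-9][0-9][0-9][0-9]-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/)
      print "Error 131: Incorrect DocumentDate pattern", "Field position 14, Linenumber:" NR, $0
    else
      print "OK: " $14
  }'
}

# Shortened sample records (illustrative); field 14 holds the date.
good=$(check 'KE|f2|22||Kar|Outward|KAR|PO|082017|29AAACT2438A1ZP|INV|TA|3010482048|2017-11-03|')
bad=$(check 'KE|f2|22||Kar|Outward|KAR|PO|082017|29AAACT2438A1ZP|INV|TA|3010482048|17-11-03|')
printf '%s\n' "$good" "$bad"
```

If this version accepts 2017-11-03 on your machine while the {4} version rejects it, the awk in use is the likely culprit; running gawk explicitly would be another way to check.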

My best guess would be that there is a non-printing character or some other character followed by a <backspace> character that hides the fact that there are other characters present in that field that do not match the ERE in your test. To verify this, show us the output from the command:

od -bc aj.txt

where aj.txt is a file containing the output from running the script you showed us in post #8 containing the diagnostic message I showed you in post #7.
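To illustrate what such a dump can reveal, here is a small sketch on made-up data (not your file): a date followed by a stray X and a backspace character. On a terminal the following | is drawn over the X, so the field looks correct, but od shows every byte. The variable name dump is only for illustration:

```shell
# Illustrative field: "2017-11-03" followed by "X" and a backspace (octal 010).
# od -bc prints each byte in octal with its character form on the row below.
dump=$(printf '2017-11-03X\b|next\n' | od -bc)
printf '%s\n' "$dump"
```

In the od output the backspace appears as \b in the character row and as 010 in the octal row, and the X before it is visible too. Any such extra byte in field 14 would make the anchored ERE fail.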