Assembling the Pieces of a Regular Expression

Hello all.

I'm scripting in ksh and trying to put together a regular expression. I think my logic is sound, but I'm doing the head-against-the-wall routine while trying to put the individual pieces together. Can anybody lend some suggestions to the below problem?

I'm taking a date in the format "DD-MMM-YY" as a parameter for a script. I want to user-proof this as much as possible, so "01-AUG-13" is valid but nonsense (e.g. "41-MAK-0G") gets rejected. This is a job for regular expressions.

I've broken down the "01-AUG-13" example into five separate expressions to be evaluated, and I think my regex logic is sound. (But if it isn't, please let me know!)

Part 1="01"
Part 2="-"
Part 3="AUG"
Part 4="-"
Part 5="13"

Part 1 translates into "match two digits, value between 01 and 31", which further translates into:
"match a first digit between 0 and 2, one time, then a second digit between 0 and 9, one time, OR match a first digit of 3, one time (if the first alternative did not match, the day can only start with 3), then a second digit of 0 or 1, one time."

Parts 2 & 4 translate into "exactly one dash here"

Part 3 becomes "match exactly one 3 char month name out of the valid set of month name values".

Part 5 is "match exactly two digits".

I have coded the regular expressions for these values as follows:
Part 1:

 
([0-2][0-9]|[3][0-1])

Parts 2 and 4:

 
[-]

Part 3:

 
grep -i [JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC]

I couldn't think of a simple way to do this without using grep -i; what are some alternatives?
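One alternative that avoids grep altogether is to upper-case the value and let the shell's own case patterns do the matching. This is only a minimal sketch, assuming a POSIX-style shell (ksh included) and a tr that understands character classes; the function name is_month is made up for illustration.

```shell
# Hypothetical helper: succeeds if $1 is a three-letter month name, in any case.
is_month() {
  # Upper-case the argument so the patterns below need only one spelling.
  m=$(printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]')
  case "$m" in
    JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC) return 0 ;;
    *) return 1 ;;
  esac
}

is_month aug && echo "aug is a month"
is_month MAK || echo "MAK is not a month"
```

In ksh, declaring the variable with typeset -u should give the same upper-casing without calling tr.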

Part 5:

 
[0-9][0-9]

So, all of those pieces seem sound, individually. The function doesn't work once I combine them, however. One of my many iterations (and probably the simplest) is:

 
VARIABLE=01-AUG-13
print $VARIABLE | grep -i -E "([0-2][0-9]|[3][0-1])[-][JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC][-][0-9][0-9]"

I am doing something wrong, but I honestly don't know what it might be. Can anyone lend some suggestions on how I can properly write this regular expression?

Any help is appreciated. Thank you!

grep -i -E "([0-2][0-9]|3[01])-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)-[0-9][0-9]"

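The important change is the month list: it has to sit in parentheses, not square brackets. [JAN|FEB|...] is a bracket expression, which matches a single character from that set rather than one of the month names.

This isn't completely fool-proof. The regex still matches impossible dates such as 31-Feb-12 or 00-APR-00, but it does narrow down the errors that can get through.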
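The pattern can probably be tightened a little further. The sketch below anchors it with ^ and $ so that surrounding junk such as x01-AUG-13y is rejected, and swaps the day part for (0[1-9]|[12][0-9]|3[01]) so that 00 is rejected too. The function name is_date_format is made up for illustration, and it assumes a grep that supports -q (otherwise redirect the output to /dev/null). 31-FEB-12 still passes, because the regex knows nothing about month lengths.

```shell
# Hypothetical helper: succeeds only if the whole of $1 looks like DD-MMM-YY.
is_date_format() {
  printf '%s\n' "$1" |
    grep -E -i -q '^(0[1-9]|[12][0-9]|3[01])-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)-[0-9][0-9]$'
}

# Try a good value and a few bad ones.
for d in 01-AUG-13 00-APR-00 41-MAK-0G x01-AUG-13y; do
  if is_date_format "$d"; then
    echo "$d: format ok"
  else
    echo "$d: rejected"
  fi
done
```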

A suggestion: if you have GNU date, you can let it do the validation:

x="31-FEB-12"
# GNU date exits non-zero when it cannot parse the string as a real date
if date -d "$x" >/dev/null 2>&1
then
  echo "Valid date"
else
  echo "Not a valid date"
fi

balajesuri,

Thank you for your assistance. The code you supplied...

 
grep -i -E "([0-2][0-9]|3[01])-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)-[0-9][0-9]"

...worked perfectly for me!

I guess my problem was a syntax error; I notice you did not enclose the "-" within brackets.
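For what it's worth, the brackets around the dash do not seem to have been the culprit, since [-] and a bare - match the same thing. A quick way to check is to run the original pattern next to a copy where only the month brackets are changed to parentheses. This is just a sketch; matches is a made-up helper and the two variable names are mine.

```shell
# The pattern as first posted, and a copy with only the month list changed.
original='([0-2][0-9]|[3][0-1])[-][JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC][-][0-9][0-9]'
fixed_month='([0-2][0-9]|[3][0-1])[-](JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[-][0-9][0-9]'

# Hypothetical helper: succeeds if pattern $1 matches string $2.
matches() { printf '%s\n' "$2" | grep -E -i -q "$1"; }

# The bracketed month consumes only the "A", so "UG-" breaks the match.
matches "$original" "01-AUG-13" || echo "original rejects 01-AUG-13"
# A single letter from the set is enough for the original pattern.
matches "$original" "01-A-13" && echo "original accepts 01-A-13"
# With parentheses around the months, the bracketed dashes still work.
matches "$fixed_month" "01-AUG-13" && echo "parenthesised version accepts 01-AUG-13"
```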

Thank you for pointing out the potential for illogical dates such as 31-FEB. I know my code isn't 100% fool-proof, but I'm writing my script for (reasonably) experienced users, so hopefully I don't have to code against something that foolish. :)

I like your suggestion to use the date -d. It appears I don't have GNU date available to me, though. That would have been a beautiful solution to the problem, however!

Thank you very much for your assistance! I really appreciate it!
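Without GNU date, the day-of-month check can still be done in plain shell. The sketch below is only one way to do it and rests on assumptions that are not in the thread: the value has already passed the format check, YY means 20YY, and the shell supports $(( )) arithmetic. The function name is_real_day is made up for illustration.

```shell
# Hypothetical helper: succeeds if day $1 exists in month $2 of year 20$3.
# Assumes two digits, a valid month name, two digits (format already checked).
is_real_day() {
  dd=${1#0}    # strip a leading zero so 08 and 09 are not read as octal
  yy=${3#0}
  mon=$(printf '%s\n' "$2" | tr '[:lower:]' '[:upper:]')
  case "$mon" in
    APR|JUN|SEP|NOV) max=30 ;;
    FEB)
      # Between 2000 and 2099 every year divisible by 4 is a leap year.
      if [ $((yy % 4)) -eq 0 ]; then max=29; else max=28; fi ;;
    *) max=31 ;;
  esac
  [ "$dd" -ge 1 ] && [ "$dd" -le "$max" ]
}

# Split DD-MMM-YY with parameter expansion and check it.
d="31-FEB-12"
rest=${d#*-}
if is_real_day "${d%%-*}" "${rest%-*}" "${d##*-}"; then
  echo "$d: valid date"
else
  echo "$d: not a valid date"
fi
```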