Hi, suppose I have a colon-delimited file with an address field, like this:
blue:john's hospital new haven CT 92881-2322
yellow:La times copr red road los angeles CA90381 1302
red:las vegas hotel sand drive Las vegas NV,21221
How do I create a new field that contains only the zip code information, extracted from the address field?
The problem is error tolerance: in the survey data file, people wrote zip codes in many different ways:
CT 92881-2322 (sub zip code with hyphen)
CT 92881 2322 (sub zip code with space)
CT 92881 (no sub zip code)
CT,92881 (extra comma)
CT92881 (no space between state and zip code)
Do you think this can be done in Python or awk? Thanks.
The new zip code field should be in a standardized format:
CT 92881-2322 if there is sub zip code
CT 92881 if the sub zip code is empty
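To make the requirement concrete, here is a rough Python sketch of one way this might be done with a regular expression. It is only an illustration under some assumptions: the zip code is always the last thing in the address, the state is a two-letter abbreviation, and the new field is appended to the end of each line. The names ZIP_RE, extract_zip and add_zip_field are made up for this example.

```python
import re

# Two-letter state, optional spaces/commas, a 5-digit zip code, then an
# optional 4-digit sub zip code separated by hyphens or spaces.
# Assumption: the zip code sits at the very end of the address.
ZIP_RE = re.compile(r"\b([A-Za-z]{2})[\s,]*(\d{5})(?:[\s-]*(\d{4}))?\s*$")


def extract_zip(address):
    """Return 'ST 12345' or 'ST 12345-6789', or '' if nothing matches."""
    match = ZIP_RE.search(address)
    if match is None:
        return ""
    state, zip5, sub = match.groups()
    result = f"{state.upper()} {zip5}"
    if sub:
        result += f"-{sub}"
    return result


def add_zip_field(line):
    """Append the standardized zip code as a new colon-delimited field."""
    line = line.rstrip("\n")
    # Split on the first colon only, so colons inside the address survive.
    _, _, address = line.partition(":")
    return f"{line}:{extract_zip(address)}"


samples = [
    "blue:john's hospital new haven CT 92881-2322",
    "yellow:La times copr red road los angeles CA90381 1302",
    "red:las vegas hotel sand drive Las vegas NV,21221",
]
for sample in samples:
    print(add_zip_field(sample))
```

With the three sample lines, the appended fields would be CT 92881-2322, CA 90381-1302 and NV 21221. Lines where no zip code is recognized get an empty field, so they can be found and checked by hand.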