bbaker
July 17, 2014, 4:24pm
1
I would like to extract "1333 Fairlane" given the below text.
The word "Building:" is always present. The wording between "Building:" and the beginning of the address can be almost anything. It appears that the hyphen is there most of the time.
Campus: Fairlane Business Park
Building: Information Tech HQ-A - 1333 Fairlane Cir
Floor: Floor 01 Common Area
Thanks for the help
Computers are not good with "most of the time". What do you want to happen when there is no hyphen?
Perhaps start from the last word on the line that begins with a number, or move the file to an error folder for some sort of manual processing?
bbaker
July 17, 2014, 4:43pm
3
What I am trying to do is use the word "Building:" as the anchor, skip everything until the number, then extract the number and the next word.
RudiC
July 17, 2014, 4:44pm
4
Try
awk '/Build/ {sub (/^.*- /, ""); print}' file
1333 Fairlane Cir
bbaker
July 17, 2014, 4:48pm
5
That is very close, but I don't want the "Cir", and I also need to use this in Perl. Sorry, I'm not smart enough to convert from the shell line yet.
RudiC
July 17, 2014, 4:48pm
6
or
awk '/Build/ && match ($0, /[0-9]+/) {print substr ($0, RSTART)}' file
1333 Fairlane Cir
bbaker
July 17, 2014, 4:51pm
7
\s-\s(\d+\s\w+)
This gets me extremely close, but I can't seem to figure out how to use "Building" as the anchor and skip everything in between.
Thanks for your ongoing help.
---------- Post updated at 02:51 PM ---------- Previous update was at 02:49 PM ----------
Forgot to state that the hyphen is not always there, but "Building:" is.
RudiC
July 17, 2014, 4:52pm
8
or
awk '/Build/ {sub (/^.*- /, ""); sub (/[^ ]*$/, ""); print}' file
1333 Fairlane
Please use code tags as required by the forum rules!
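This falls back to the last word beginning with a number when no hyphen is found.
awk '/Build/ {
    # No " - " separator found: strip up to the last space-plus-digit instead
    if (!sub (/^.*- /, ""))
        while (match ($0, " [0-9]"))
            $0 = substr ($0, RSTART + 1)
    # Drop the trailing word (e.g. "Cir")
    sub (/[^ ]*$/, "")
    print
}' infile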
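A quick way to check both cases is to wrap the script in a shell function and feed it one line with the hyphen and one without. This is only a sketch: the function name extract_address and the second sample line are made up for illustration; only the first sample line comes from the original post.

```shell
# Hypothetical wrapper around the awk script above; reads lines on stdin.
extract_address() {
    awk '/Build/ {
        # No "- " separator: cut up to the last space followed by a digit
        if (!sub(/^.*- /, ""))
            while (match($0, " [0-9]"))
                $0 = substr($0, RSTART + 1)
        # Drop the trailing word; note that the space before it is kept
        sub(/[^ ]*$/, "")
        print
    }'
}

# First line is from the original post, second is an invented no-hyphen variant.
printf '%s\n' \
    'Building: Information Tech HQ-A - 1333 Fairlane Cir' \
    'Building: Information Tech HQ-A 1333 Fairlane Cir' |
extract_address
```

Both lines should come out as "1333 Fairlane" followed by a single trailing space, since the last sub() removes only the final word and not the space in front of it.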
---------- Post updated at 06:59 AM ---------- Previous update was at 06:54 AM ----------
In Perl, why not use this: Building.*\s-\s(\d+\s\w+)
bbaker
July 17, 2014, 5:40pm
10
Building:.*(\d+\s\w+)
This is closer, but I only get the last digit of the number instead of all of them.
---------- Post updated at 03:40 PM ---------- Previous update was at 03:20 PM ----------
Came up with this
/Building:.*\s(\d+\s\w+)/
Seems to be working