Search Replace Specific Column using RegEx

Have Pipe Delimited File:

> BRYAN BAKER|4/4/2015|518 VIRGINIA AVE|TEST
> JOE BAXTER|3/30/2015|2233 MockingBird RD|ROW2

On 3rd column where the address is located, I want to add a space after every numeric value - basically doing a "s/[0-9]/&\ / " :

> BRYAN BAKER|4/4/2015|5 1 8  VIRGINIA AVE|TEST
> JOE BAXTER|3/30/2015|2 2 3 3  MockingBird RD|ROW2

I know you can do a:
>

 awk '$3="XYX"' FS=, OFS='|' filename.txt

to replace column 3 w/ "XYZ" but how do i do a search replacing using RegEx?

I am also fine w/ a sed solution... just wish i was better w/ awk. I only want to do a regex search/replace on a specific column.

Thanks!

You could do this:

awk '{gsub(/[0-9]/,"& ",$3)}1' FS=\| OFS=\| file

Which would will append a space to any digit in field 3..

A bit more accurate (if needed) maybe would be to use the split() function first

awk '{split ($3, F, " "); s=F[1]; gsub(/[0-9]/,"& ",s); sub(F[1],s,$3)}1' FS=\| OFS=\| 

Which would only append a space to digits at the beginning of the field..

1 Like

In addition to the suggestion Scrutinizer provided, note that the command in post #1 in this thread:

 awk '$3="XYX"' FS=, OFS='|' filename.txt

produces the output:

> BRYAN BAKER|4/4/2015|518 VIRGINIA AVE|TEST||XYX
> JOE BAXTER|3/30/2015|2233 MockingBird RD|ROW2||XYX

Your input field separator is | ; not , . So, setting FS=, probably isn't what you want for this input file. Since there are no commas in your input file, setting field 3 creates an empty field and adds another field with the contents you specified.

Setting $3 to the string XYX will never replace column 3 with XYZ .

1 Like

Don - thanks for the correction in my original post. You are correct, I didn't have my delimiters set correct and wouldn't have worked. I was going back an d forth between pipe and csv, thus the confusion.

Scrutinizer - thanks soooooo much! Works Great! Now i know how to do regex on any specific column using your example... will come in very handy!

If i can tap you guys one more time.

What if the address is:

BRYAN BAKER|4/4/2015|5178 16th Street|TEST

What I want is:

BRYAN BAKER|4/4/2015|5 1 7 8 16th Street|TEST

Right now, when I use "awk '{gsub(/[0-9]/,"& ",$3)}1' FS=\| OFS=\| file", I get:

BRYAN BAKER|4/4/2015|5 1 7 8 1 6 th Street|TEST

I don't want to do a search and replace for all numeric fields .. only for everything up until the first space. Is that possible?

Look at Scrutinizer's post #2 in this thread again. He told you exactly how to do that...

awk '{split ($3, F, " "); s=F[1]; gsub(/[0-9]/,"& ",s); sub(F[1],s,$3)}1' FS=\| OFS=\|