Need help to use regex to do search and replace. Don't know how to and can't figure out how :(

Hi,

Below is an excerpt from a file of 20000+ lines. I want to search for and replace a specific string, but I can't figure out how to do it, and I can't find an example on Google or anywhere else that does what I want.

A                    2018-11-21 08:42:17 TEST_TEST 2018-11-21 00:50:45 Accessed:   2,146,893,824 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/temp/abcp_temp_10.dbf
A                    2018-11-21 08:41:52 TEST_TEST 2018-11-21 00:50:45 Accessed:   2,146,697,216 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/temp/abcp_temp_9.dbf
A                    2018-11-21 03:29:04 TEST_TEST 2017-06-17 22:01:33 Accessed:          22,528 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abcp_ABCD_TEST_COMM_P201406_tts_exp.dta
A                    2018-11-21 03:28:57 TEST_TEST 2017-06-17 22:01:28 Accessed:     563,101,696 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abcp_abcd_test_comm_201406_data_01.dbf

I need to find all lines that contain the string /db/LTS/abcp_ but only those that end in .dbf. So from the excerpt above, it should find the line below:

A                    2018-11-21 03:28:57 TEST_TEST 2017-06-17 22:01:28 Accessed:     563,101,696 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abcp_abcd_test_comm_201406_data_01.dbf

And I want to change those lines so that the result is:

A                    2018-11-21 03:28:57 TEST_TEST 2017-06-17 22:01:28 Accessed:     563,101,696 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abct_abcd_test_comm_201406_data_01.dbf

So, basically, I need to change /db/LTS/abcp_ to /db/LTS/abct_. It is the part in bold that I want to search for and replace.
The grep regex below seems to match what I want, but I don't know how to turn it into a sed search and replace.

$: grep "/db/LTS/abcp_.*dbf$" xx.txt
A                    2018-11-21 03:28:57 TEST_TEST 2017-06-17 22:01:28 Accessed:     563,101,696 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abcp_abcd_test_comm_201406_data_01.dbf

Any help will be much appreciated.

P.S.:

  • Is there any way to colorize or highlight certain text? I was hoping to do that in this post but can't find the option. I thought it was available on the older forum.

Try

sed -r '\#(/db/LTS/abc)p(_.*dbf)$#s//\1t\2/' file

And, yes, the text colour selector (available in the "quick reply editor") seems to be missing in the "advanced editor". But with the bold and underline you did quite well.


Hi RudiC

Doesn't work on Solaris but works perfectly on Linux. Thanks a lot. Not even our supposedly 'smartest' SA can figure out what to do. You're a genius.

Sorry, I forgot to say that I am trying to do this on Solaris. For now I have to run the sed on Linux and copy the file back to Solaris. Any idea how to do this on Solaris 8/10?

I can't quite understand 100% what your code does, though. Do you mind answering my questions?

sed -r '\#(/db/LTS/abc)p(_.*dbf)$#s//\1t\2/' file

01 - Why are you escaping the # in the beginning?
02 - The p before the (_ is because I want to match a single p? Is that correct?
03 - The #s//\1t\2/ is the one that I don't understand for the most part. Is it doing another search?

Thanks again for looking into this.

The following seems to do what you want just using standard sed features:

sed 's#\(/db/LTS/abc\)p\(_.*\.dbf\)#\1t\2#' file > file.$$ &&
    cp file.$$ file &&
    rm file.$$

The \1 and \2 in the replacement string in the substitute command are back references that are replaced by the strings matched by the BREs between the escaped sets of parentheses, respectively.
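
To make the back references visible, here is a minimal sketch that prints what each group captured instead of rebuilding the path. The shortened sample path and the variable names are assumptions made for illustration; the regex is the one from the command above with a leading .* added so the whole line is replaced.

```shell
# Shortened, hypothetical sample path modelled on the data in the thread.
sample='/NFS_mnt/vol/db/LTS/abcp_abcd_test_comm_201406_data_01.dbf'

# Wrap each captured group in brackets to show what \1 and \2 hold.
groups=$(printf '%s\n' "$sample" |
  sed 's#.*\(/db/LTS/abc\)p\(_.*\.dbf\)#1=[\1] 2=[\2]#')

printf '%s\n' "$groups"
# prints: 1=[/db/LTS/abc] 2=[_abcd_test_comm_201406_data_01.dbf]
```

The p between the two groups belongs to neither of them, which is why writing \1t\2 in the replacement swaps it for a t.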

Using unescaped parentheses (as in RudiC's suggestion) is a GNU extension that is not allowed in the standards. The sed -r option is not included in the standards, but is allowed as an extension.

When I was using Solaris systems, I would have done this using /usr/xpg4/bin/sed , but I don't think there is anything in the above commands that won't work with /bin/sed or /usr/bin/sed .
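
As a variation on the command above, the sketch below adds a trailing $ to the regex so that only lines that really end in .dbf are changed, which is what the original question asked for. The scratch file name and its two sample lines are made up for the demonstration; whether the extra anchor is wanted depends on the real data (for example, trailing blanks would stop it from matching).

```shell
# Hypothetical scratch file so the sketch does not touch real data.
file="${TMPDIR:-/tmp}/sed_demo_$$.txt"
printf '%s\n' \
  'x /vol/db/LTS/abcp_data_01.dbf' \
  'x /vol/db/LTS/abcp_data_01.dbf.bak' > "$file"

# Same substitution, anchored with $ at the end of the line.
# Write to a temporary file first, then copy it back over the original.
tmp="$file.$$"
sed 's#\(/db/LTS/abc\)p\(_.*\.dbf\)$#\1t\2#' "$file" > "$tmp" &&
  cp "$tmp" "$file" &&
  rm -f "$tmp"

result=$(cat "$file")
rm -f "$file"
printf '%s\n' "$result"
```

Only the first line is rewritten to abct_; the .dbf.bak line is left alone because .dbf is not at the end of it.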


Where and how does it fail? Please give as detailed and complete an error message as possible.

man sed :

01 - I use the # here because the usual / is part of the pattern; the leading backslash is what allows an address to use a delimiter other than /.

02 - Yes.

03 - The # is the final address delimiter. An empty regex ( s// ) repeats the last one encountered, here: the address regex.
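
The two ideas above (an address with a custom delimiter, and an empty regex that reuses it) can be sketched with plain BRE syntax, so no -r is needed. The two sample lines are shortened, hypothetical versions of the data in the thread.

```shell
# Two sample lines modelled on the thread's data (paths shortened).
input='A Accessed: 22,528 /vol/db/LTS/abcp_ABCD_tts_exp.dta
A Accessed: 563,101,696 /vol/db/LTS/abcp_abcd_data_01.dbf'

# \#...#  : an address that uses # as its delimiter instead of /
#           (the leading backslash announces the custom delimiter).
# s//.../ : the empty pattern reuses the regex from the address,
#           including its two groups.
result=$(printf '%s\n' "$input" |
  sed '\#\(/db/LTS/abc\)p\(_.*dbf\)$#s//\1t\2/')

printf '%s\n' "$result"
```

The .dta line does not match the address, so the s command never runs on it; the .dbf line comes out with abct_.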

Hi RudiC

Thanks for the explanation. I'll have a read.

Error is as below:

Here's the test file:

$: cat xx.txt
A                    2018-11-21 08:42:17 TEST_TEST 2018-11-21 00:50:45 Accessed:   2,146,893,824 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/temp/abcp_temp_10.dbf
A                    2018-11-21 08:41:52 TEST_TEST 2018-11-21 00:50:45 Accessed:   2,146,697,216 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/temp/abcp_temp_9.dbf
A                    2018-11-21 03:29:04 TEST_TEST 2017-06-17 22:01:33 Accessed:          22,528 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abcp_ABCD_TEST_COMM_P201406_tts_exp.dta
A                    2018-11-21 03:28:57 TEST_TEST 2017-06-17 22:01:28 Accessed:     563,101,696 /NFS_mnt/mnlNFS111/vol_abcp_db/abcp/.snapshot/mnl1ns123_vol_abcp_db_ss.0/db/LTS/abcp_abcd_test_comm_201406_data_01.dbf

Running with /bin/sed or /usr/xpg4/bin/sed:

$: sed -r '\#(/db/LTS/abc)p(_.*dbf)$#s//\1t\2/' xx.txt
sed: illegal option -- r

$: /usr/xpg4/bin/sed -r '\#(/db/LTS/abc)p(_.*dbf)$#s//\1t\2/' xx.txt
/usr/xpg4/bin/sed: illegal option -- r
Usage:  sed [-n] script [file...]
        sed [-n] [-e script]...[-f script_file]...[file...]

OS version:

$: uname -a
SunOS [hostname] 5.11 11.3 sun4v sparc sun4v
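
Since the failure is only the unsupported option, a script that has to run on both systems could probe for extended-regex support and fall back to BRE syntax. This is a sketch under assumptions: it probes for -E (accepted by GNU and BSD sed) rather than the -r used above, and the sample path and variable names are made up.

```shell
sample='/db/LTS/abcp_x.dbf'

# Probe: does this sed accept -E (extended regular expressions)?
if printf 'abcp\n' | sed -E 's/(abc)p/\1t/' >/dev/null 2>&1; then
  sed_mode=ERE
  result=$(printf '%s\n' "$sample" |
    sed -E 's#(/db/LTS/abc)p(_.*\.dbf)$#\1t\2#')
else
  # Fallback: the same substitution written as a basic regular expression.
  sed_mode=BRE
  result=$(printf '%s\n' "$sample" |
    sed 's#\(/db/LTS/abc\)p\(_.*\.dbf\)$#\1t\2#')
fi

printf '%s: %s\n' "$sed_mode" "$result"
```

Either branch yields /db/LTS/abct_x.dbf; only the escaping of the parentheses differs.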

Hi newbie_01,
Did you try what I suggested in post #4 in this thread (perhaps replacing file everywhere it appears with xx.txt )?

Thanks, that works fine on Solaris

--- Post updated 12-06-18 at 06:45 AM ---

Hi Don

I did try on Solaris what you suggested in post #4 and it works the way I want.

If you don't mind, I have some more questions from playing around with BRE/regex.

Here's another one that I want to do:

Below is the test file:

$: cat file
/db/abcp/DIR/abcp_data_01.dbf

Now I want to change abcp to abct, and the following does exactly what I want:

$: sed 's#\(/db/abc\)p\(/DIR/abc\)p\(_.*\.dbf\)#\1t\2t\3#' file > file1
$: cat file1
/db/abct/DIR/abct_data_01.dbf

I also tried the following, in case I want to replace only where /db is at the beginning of the line:

$: cat file2
/db/abcp/DIR/abcp_data_01.dbf
   /db/abcp/DIR/abcp_data_02.dbf
$: sed 's#^\(/db/abc\)p\(/DIR/abc\)p\(_.*\.dbf\)#\1t\2t\3#' file2 > file1
$: cat file1
/db/abct/DIR/abct_data_01.dbf
   /db/abcp/DIR/abcp_data_02.dbf

OK, it's all good so far. But I thought the previous two commands were too 'complicated', so I then tried the one below, and it does not give me the result I wanted. Can you tell me what's wrong with it?

$: sed 's#\(/db/abcp/DIR/abcp_\)\(_.*\.dbf\)#\(/db/abct/DIR/abct_\)\1#' file > file1
$: cat file1
/db/abcp/DIR/abcp_data_01.dbf

I am expecting file1 to contain /db/abct/DIR/abct_data_01.dbf here as well. Shouldn't it?

You may be wondering why I don't simply do 's#abcp#abct#g' instead of sed 's#\(/db/abc\)p\(/DIR/abc\)p\(_.*\.dbf\)#\1t\2t\3#' file > file1 . This is because I may have a line such as /db/abcp/DIR/abcp_abcp_data_01.dbf , and that string is supposed to become /db/abct/DIR/abct_abcp_data_01.dbf . Hence I don't want to do a global replace.
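
The difference between the two approaches can be seen by running both on the problem line from the paragraph above. This is only an illustrative sketch; the variable names are made up.

```shell
line='/db/abcp/DIR/abcp_abcp_data_01.dbf'

# Global replace: every abcp becomes abct, including the one inside the
# file name that is supposed to stay.
global=$(printf '%s\n' "$line" | sed 's#abcp#abct#g')

# Targeted replace: only the p after /db/abc and the p after /DIR/abc change.
targeted=$(printf '%s\n' "$line" |
  sed 's#\(/db/abc\)p\(/DIR/abc\)p\(_.*\.dbf\)#\1t\2t\3#')

printf 'global:   %s\ntargeted: %s\n' "$global" "$targeted"
# global:   /db/abct/DIR/abct_abct_data_01.dbf
# targeted: /db/abct/DIR/abct_abcp_data_01.dbf
```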

There are several errors in your attempt.

  • the regex looks for strings with two adjacent _ in them; there are none in the input.
  • parentheses lose their special meaning in the replacement and will be printed as such.
  • the leading part of the string would be (almost) duplicated: once by the replacement constant (with the t char) and once by the back reference \1.
  • the trailing part is lost as the back reference \2 is not used.

Try to correct those and come back with the result.
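
One way to check the first point before touching the replacement at all is to use the regex as an address with sed -n and the p command, which prints only the lines it matches. A minimal sketch, with made-up variable names:

```shell
line='/db/abcp/DIR/abcp_data_01.dbf'

# The attempted pattern: group 1 ends in "_" and group 2 starts with "_",
# so it can only match a path containing two underscores in a row.
attempt=$(printf '%s\n' "$line" |
  sed -n '\#\(/db/abcp/DIR/abcp_\)\(_.*\.dbf\)#p')

# With the underscore taken out of group 1, the same line matches.
fixed=$(printf '%s\n' "$line" |
  sed -n '\#\(/db/abcp/DIR/abcp\)\(_.*\.dbf\)#p')

printf 'attempt: [%s]\nfixed:   [%s]\n' "$attempt" "$fixed"
```

An empty result for the attempted pattern explains why the s command left the line unchanged: it never matched, so nothing was substituted.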


Hi RudiC

I think I get what you mean. I need to get the hang of reading these regexes.
I tested as below, and it gives the output that I am after.
I will try it with the real file that I want to change.

$: head -20 file*
==> file1 <==
/db/abcp/DIR/abcp_data_01.dbf

==> file2 <==
/db/abcp/DIR/abcp_data_01.dbf
   /db/abcp/DIR/abcp_data_02.dbf

==> file3 <==
/db/abcp/DIR/abcp_data_01.dbf
/db/abcp/DIR/abcp_data_01.dbf txt
   /db/abcp/DIR/abcp_data_02.dbf
$: sed 's#^\(/db/abcp/DIR/abcp\)\(_.*\.dbf\)$#/db/abct/DIR/abct\2#' file1 > file
$: cat file
/db/abct/DIR/abct_data_01.dbf
$: sed 's#^\(/db/abcp/DIR/abcp\)\(_.*\.dbf\)$#/db/abct/DIR/abct\2#' file2 > file
$: cat file
/db/abct/DIR/abct_data_01.dbf
   /db/abcp/DIR/abcp_data_02.dbf
$: sed 's#^\(/db/abcp/DIR/abcp\)\(_.*\.dbf\)$#/db/abct/DIR/abct\2#' file3 > file
$: cat file
/db/abct/DIR/abct_data_01.dbf
/db/abcp/DIR/abcp_data_01.dbf txt
   /db/abcp/DIR/abcp_data_02.dbf
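
A possible simplification of the command above: since the replacement spells out the new leading part as a constant, the first group is never referenced and can be dropped, leaving a single back reference. The sketch below runs that variant on the contents of file3; it is an illustration, not a command taken from the thread.

```shell
# Same three lines as file3 above.
input='/db/abcp/DIR/abcp_data_01.dbf
/db/abcp/DIR/abcp_data_01.dbf txt
   /db/abcp/DIR/abcp_data_02.dbf'

# Only the trailing part is captured; ^ and $ keep the match to whole lines.
result=$(printf '%s\n' "$input" |
  sed 's#^/db/abcp/DIR/abcp\(_.*\.dbf\)$#/db/abct/DIR/abct\1#')

printf '%s\n' "$result"
```

As in the transcript, only the first line changes: the second has trailing text after .dbf and the third starts with blanks, so neither satisfies both anchors.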