Using sed/awk to replace a block of text in a file?

My apologies if this has been answered in a previous post. I've done a lot of searching, but I haven't been able to find what I was looking for. Specifically, I am wondering if I can use sed and/or awk to locate two strings in a file and replace everything between those two strings (including the strings themselves) with new text.

For example, let's say I have a file with the following text:

"The quick brown fox jumps over the lazy dog" is an English-language pangram, that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters and computer keyboards.

And I want it to read like this:

"Then the quick onyx goblin jumps over the lazy dwarf" is an English-language pangram, that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters and computer keyboards.

How do I go about replacing

"The quick brown fox jumps over the lazy dog"

with

"Then the quick onyx goblin jumps over the lazy dwarf"

? Any assistance would be appreciated. Thanks in advance.

this can help you..

Like this?

sed 's/The/Then the/;s/brown fox/onyx goblin/;s/dog/dwarf/' input_file

--ahamed

Thank you for the replies. There is one thing, though, that I may not have been clear about: at times, everything between the two strings could be different. For example:

The quick brown fox.... dogs

or

The time to feed the .... dogs

or

The man ran for his life away from the dogs

In each case, what is between the words "The" and "dogs" is fluid. It could be anything, and I wouldn't know it beforehand. I need a mechanism that looks for the word "The" and the word "dog" and replaces everything in between (including the words "The" and "dog") with something else.

Does that make sense? If need be, I can give you a more specific example of what I am working with (a CSV) and what I am trying to do. Thanks again.

Try

sed 's/The.*dog/whatever/' inputfile

EDIT: Note that .* will match the longest available string though, e.g.

# echo "The quick brown fox fell over the dog" | sed 's/The.*dog/whatever/'
whatever
# echo "The quick brown fox fell over the dog and some other dogs and cats" | sed 's/The.*dog/whatever/'
whatevers and cats

Try the following to replace whatever you want in between "The" and the first occurrence of "dog":

# echo "The quick brown fox fell over the dog and some other dogs and cats" | perl -e '$x=<>;$x=~s/The.+?dog/whatever/;print $x'
whatever and some other dogs and cats

I've tried some suggestions with the real case I have, but I haven't had any success. Perhaps if I show you what I am working with it will make more sense. The following is a CSV file that I am working on:

 eDir Top N Report , logoRpt 
 All LAN Interface for ACCESS 
 Shown: Errors above 1.0 K or BW Util above 70.0 

  , Errors , BW Util 
  ,   , % 
  , Above , Above 
  ,  ,  
 Element , 1.0 K , 70.0 
 cat65-acomp-d,FastEthernet4/18 ,974787.00000000,0.03142596
 ios6-zdc-g2,TenGigabitEthernet6/4 ,887644.00000000,0.42693967
 ios6-dc-a,GigabitEthernet2/12 ,71172.00000000,12.59653282

What I am trying to make it look like is the following:

 Hostname,Interface,Number of Errors,Bandwidth Utilization
 cat65-acomp-d,FastEthernet4/18 ,974787.00000000,0.03142596
 ios6-zdc-g2,TenGigabitEthernet6/4 ,887644.00000000,0.42693967
 ios6-dc-a,GigabitEthernet2/12 ,71172.00000000,12.59653282

The bold parts are the items that are actually changing. So in this case, the report I run will always start with "eDir" and end with "70.0" (the second occurrence, for the eagle-eyed). I want that replaced with what is in the second example. Any chance of getting that to work? Thank you again for all your help; I do appreciate it.

Try this...

sed ':a;N;10!ba;s/^eDir.*70.0$/Hostname,Interface,Number of Errors,Bandwidth Utilization/g' file_name

Thanks for the suggestion. However, it did not seem to do the trick. The file remains unchanged. Not sure if it matters on the specific platform, but I'm running RHEL Server 5.6.

Would it help if I provided a copy of the csv itself?

Replace 10! with $!:

sed ':a;N;$!ba;s/eDir.*, 70.0/Hostname,Interface,Number of Errors,Bandwidth Utilization/g' file_name
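
For reference, the ':a;N;$!ba' loop appends every line to the pattern space so that .* can span newlines. A possible alternative that avoids reading the whole file at once is a line-range address with the c (change) command. This is only a sketch, assuming GNU sed and assuming the header block always runs from the line containing "eDir" through the line containing "Element"; the sample data is abbreviated from the report above:

```shell
# Hypothetical sample input modeled on the report in this thread.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
 eDir Top N Report , logoRpt
 All LAN Interface for ACCESS

  , Errors , BW Util
 Element , 1.0 K , 70.0
 cat65-acomp-d,FastEthernet4/18 ,974787.00000000,0.03142596
 ios6-dc-a,GigabitEthernet2/12 ,71172.00000000,12.59653282
EOF

# Address range: from the first line matching "eDir" through the next line
# matching "Element". The c command replaces the whole range with one line.
result=$(sed '/eDir/,/Element/c\
Hostname,Interface,Number of Errors,Bandwidth Utilization' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Like the commands above, this writes the result to standard output and leaves the input file untouched.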

Same result, unfortunately. I copied it exactly as you have it in your post, but every time I cat the csv file, it is exactly the same.

Does the output from the command look correct?

Most Unix commands won't change the input file. Normally you'd redirect the output to another (temporary) file and move it back over the original, but since you're on RHEL you should be able to use sed -i to change the input file in place.
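
A small sketch of both approaches, assuming GNU sed for the -i option; the file and its contents are made up for the demonstration:

```shell
# Hypothetical file used only for this demonstration.
f=$(mktemp)
printf '%s\n' 'The quick brown fox fell over the dog' > "$f"

# Without -i, sed writes to standard output and leaves the file untouched.
sed 's/The.*dog/whatever/' "$f" > /dev/null
before=$(cat "$f")

# Portable approach: write to a temporary file, then move it over the original.
sed 's/The.*dog/whatever/' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
after=$(cat "$f")

# GNU sed shortcut: -i edits the file in place (-i.bak would keep a backup).
sed -i 's/whatever/something else/' "$f"
inplace=$(cat "$f")

printf '%s\n%s\n%s\n' "$before" "$after" "$inplace"
rm -f "$f"
```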

So I decided to show you my crude way of doing this. Not sure if there is a more elegant solution out there, but here is what I came up with:

#!/bin/sh
# Script to prompt you for the original csv name and the new csv name
#
# -f keeps rm quiet when there are no leftover test files
rm -f test*
echo "Provide the name of the original csv (e.g. DC_ACCESS_2011.csv) Note: CASE MATTERS!!"
read originalcsv
echo "Please provide the requested name of the new csv"
read newcsv
# Drop the quoted header lines above the "Element" row
sed '1,8{/"/d}' "$originalcsv" > test1.csv
# Turn the "Element" row into the new column headings
sed 's!\"Element.*!Hostname,Interface,Number of Errors,Bandwidth Utilization!' test1.csv > test2.csv
# Remove the footer lines
sed '/Auto Range/d' test2.csv > test3.csv
sed '/From/d' test3.csv > test4.csv
sed '/To:/d' test4.csv > test5.csv
# Split hostname and interface into separate fields and replace the quotes with spaces
sed -e 's/-Fast/,Fast/; s/-Gig/,Gig/; s/-Port/,Port/; s/-Ten/,Ten/; s/"/ /g' test5.csv > test6.csv
# Remove blank lines
sed '/^$/d' test6.csv > "$newcsv"
rm -f test*
exit 0

The original csv is as follows:

"eDir Top N Report","logoRpt"
"All LAN Interface for DC_ACCESS"
"Shown: Errors above 1.0 K or BW Util above 70.0"

"","Errors","BW Util"
""," ","%"
"","Above","Above"
"","",""
"Element","1.0 K","70.0"
"ios6-qad-rowa2-TenGigabitEthernet6/4",887644.00000000,0.42693967
"ios6-wbd-rowmbaa3-GigabitEthernet2/42",19422.00000000,0.00515052
"ios6-wbd-rowmbaa4-GigabitEthernet2/42",18655.00000000,0.00273238

"Auto Range:  Previous 4 Weeks","Subject:  DC_ACCESS","Created: 11/28/2011 17:50:45"
"From:  10/31/2011 00:00","","Time Zone: (GMT-08:00) Pacific Time"
"To:  11/27/2011 23:59"

And what I end up with:

Hostname,Interface,Number of Errors,Bandwidth Utilization
 ios6-qad-rowa2,TenGigabitEthernet6/4 ,887644.00000000,0.42693967
 ios6-wbd-rowmbaa3,GigabitEthernet2/42 ,19422.00000000,0.00515052
 ios6-wbd-rowmbaa4,GigabitEthernet2/42 ,18655.00000000,0.00273238

Again, fairly crude, but I guess it gets the job done.
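
For comparison, the same steps can probably be folded into a single sed call with no intermediate files. This is only a sketch, assuming GNU sed and assuming the report always has the layout shown above (a header block from "eDir" through "Element", then footer lines starting with "Auto Range", "From:" and "To:"); the temporary file stands in for the original csv:

```shell
# Hypothetical stand-in for the original csv, abbreviated from the report above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
"eDir Top N Report","logoRpt"
"All LAN Interface for DC_ACCESS"
"Shown: Errors above 1.0 K or BW Util above 70.0"

"","Errors","BW Util"
""," ","%"
"","Above","Above"
"","",""
"Element","1.0 K","70.0"
"ios6-qad-rowa2-TenGigabitEthernet6/4",887644.00000000,0.42693967
"ios6-wbd-rowmbaa3-GigabitEthernet2/42",19422.00000000,0.00515052

"Auto Range:  Previous 4 Weeks","Subject:  DC_ACCESS","Created: 11/28/2011 17:50:45"
"From:  10/31/2011 00:00","","Time Zone: (GMT-08:00) Pacific Time"
"To:  11/27/2011 23:59"
EOF

# One pass: replace the header block with the new column headings, delete the
# footer lines and blank lines, split hostname from interface, and replace
# the quotes with spaces.
result=$(sed '
/^"eDir/,/^"Element/c\
Hostname,Interface,Number of Errors,Bandwidth Utilization
/^"Auto Range/d
/^"From:/d
/^"To:/d
/^$/d
s/-Fast/,Fast/; s/-Gig/,Gig/; s/-Port/,Port/; s/-Ten/,Ten/
s/"/ /g
' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Redirecting the output to the new csv name (or using sed -i on a copy) would replace the chain of test files.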