sed - pattern match - apply substitution

chill3chee · September 12, 2016, 12:44pm

Greetings Experts,
I am on AIX and in the process of creating a restartable script that connects to Oracle and executes the statements. Sample contents of file1:

CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
....
CREATE OR REPLACE VIEW DB_V.TAB10 AS SELECT * FROM DB_T.TAB10;

CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
...
CREATE INDEX TAB10_COL1 ON TAB10 (COL1);
...

The CREATE INDEX statements are one-liners.
The script should be restartable, so at the beginning of the script I have added a SPOOL to list which indexes already exist for the respective tables, and then comment out the matching statements (I don't want to remove them, as I am modifying an existing template file).

For this, I am reading the spool file (which has already existing index names only) as

while read line
do
sed "/$line/ s/\(.*\)/--\1/g" < file1.txt > file1_temp.txt

rm file1.txt;
mv file1_temp.txt file1.txt

done < spool_file.txt;

In-line substitution sed -i is not supported on my system.
The sed statement fails with a parsing error: I can see that $line is substituted correctly, but the s// part is being ignored. However, at the command prompt, I assigned line=TAB1_COL1 and then executed the following

sed "/$line/ s/\(.*\)/--\1/g" < file1.txt

and could see that the existing indexes are being commented out as
--CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
which can be fed to Oracle so that the script doesn't fail. I am not sure what is the error as it works on the command prompt and fails at the script level. Please note that the statement doesn't contain any embedded / or \ statements in case it matters. Thank you for your time.

EDIT:
Kindly note that I intend to use the form
sed /pattern/ s// instead of sed s/.*pattern.*//

RudiC · September 12, 2016, 1:05pm

Help me out - I can't see an s// anywhere in your script? Some more details of your input would help as well, wouldn't they?

chill3chee · September 12, 2016, 1:29pm

Hi RudiC/Ravinder Singh,
The file1 mentioned above is the SQL file generated after applying modifications to the template file. Now file1.txt has to be run in Oracle; its sample contents are

CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
....
CREATE OR REPLACE VIEW DB_V.TAB10 AS SELECT * FROM DB_T.TAB10;
CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
...
CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
CREATE INDEX TAB7_COL1 ON TAB7 (COL1)
...
CREATE INDEX TAB10_COL1 ON TAB10 (COL1);

Say the CREATE INDEX TAB7_COL1 statement failed for some reason (a space issue or something else). After the issue is identified and fixed, I just need to restart the existing file1 script, which should now be transformed to

CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
....
CREATE OR REPLACE VIEW DB_V.TAB10 AS SELECT * FROM DB_T.TAB10;

--CREATE INDEX TAB1_COL1 ON TAB1 (COL1);  This is commented
...
--CREATE INDEX TAB6_COL1 ON TAB6 (COL1);  This is commented
CREATE INDEX TAB7_COL1 ON TAB7 (COL1);
...
CREATE INDEX TAB10_COL1 ON TAB10 (COL1);

The spool file contents spool_file.txt are

TAB1_COL1
TAB2_COL1
TAB3_COL1
TAB4_COL1
TAB5_COL1
TAB6_COL1

Here the CREATE INDEX statements from TAB1_COL1 to TAB6_COL1 are commented out because they are already present in the spool file (the spool file is generated by checking which indexes already exist) through the excerpt below from the script.


sqlplus -s ....
spool spool_file.txt;
select index_name from dba_indexes where index_name='TAB1_COL1'
union
select index_name from dba_indexes where index_name='TAB2_COL1'
union
...
select index_name from dba_indexes where index_name='TAB10_COL1';
spool off;
........
while read line
do
sed "/$line/ s/\(.*\)/--\1/g" < file1.txt > file1_temp.txt

rm file1.txt;
mv file1_temp.txt file1.txt

done < spool_file.txt;
....
@file1.txt ;

and then finally execute the file1.txt contents, which should execute successfully. I am not able to resolve the parsing issue with the statement sed "/$line/ s/\(.*\)/--\1/g" < file1.txt , which runs fine (-- comments out the respective line) when I verify it by assigning a sample value to the line variable.

RavinderSingh13 · September 12, 2016, 2:01pm

Hello chill3chee,

I am still not clear about your requirements. Let's say the variable line=TAB1_COL1; then the following may help.
Similarly, you could use the following in your while loop.

line=TAB1_COL1
sed '/'"$line"'/s/\(.*\)/--\1/'  Input_file

Output will be as follows.

spool spool_file.txt;
--select index_name from dba_indexes where index_name='TAB1_COL1'
union
select index_name from dba_indexes where index_name='TAB2_COL1'
union
...
select index_name from dba_indexes where index_name='TAB10_COL1';
spool off;

Also, in your code you don't need to feed the file to sed with < Input_file; pass the file name as an argument, as in the code above, and redirect the output to file1_temp.txt. If you have any other requirements, please state them more clearly, with examples.
Please let me know if you have any questions about this.

Thanks,
R. Singh

RudiC · September 12, 2016, 2:11pm

Please show your error messages, better: an execution log.
I don't get errors when executing

while read line;   do sed "/$line/ s/^/--/" file1;  done < spool_file.txt;
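One way to produce such an execution log is to run the loop with shell tracing switched on. The following is only a sketch: the scratch directory, the sample file contents and the name trace.log are assumptions made for the demonstration, not taken from the real script.

```shell
#!/bin/sh
# Scratch directory with made-up sample data (assumption for the demo).
dir=${TMPDIR:-/tmp}/sed_trace_demo.$$
mkdir -p "$dir" && cd "$dir" || exit 1

cat > spool_file.txt <<'EOF'
TAB1_COL1
TAB2_COL1
EOF

cat > file1.txt <<'EOF'
CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
CREATE INDEX TAB2_COL1 ON TAB2 (COL1);
EOF

# Run the loop in a subshell with tracing on. The shell writes every
# command it executes, after variable expansion, to stderr, which is
# collected in trace.log.
(
    set -x
    while read -r line; do
        sed "/$line/ s/^/--/" file1.txt > /dev/null
    done < spool_file.txt
) 2> trace.log

trace=$(cat trace.log)
printf '%s\n' "$trace"
```

The trace shows the exact sed script the shell handed to sed for each line of the spool file, which is usually enough to spot a missing slash or an unexpected value in the variable.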

RudiC · September 12, 2016, 2:16pm

Why don't you read file1 and spool_file just once each instead of file1 once per line in spool_file ?

awk 'NR==FNR {T[$1]; next} {for (t in T) if ($0 ~ t) $0 = "--" $0} 1' file2 file1
CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
....
CREATE OR REPLACE VIEW DB_V.TAB10 AS SELECT * FROM DB_T.TAB10;
--CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
...
--CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
CREATE INDEX TAB7_COL1 ON TAB7 (COL1)
...
CREATE INDEX TAB10_COL1 ON TAB10 (COL1);

Don_Cragun · September 12, 2016, 10:01pm

I agree wholeheartedly with the idea of using a single invocation of awk instead of one invocation of sed for each line in a file. But, we don't have enough data to properly process the needed matches. If the matches are always on the 3rd field, to prevent something like TAB6_COL1 in the list of changes from also affecting lines containing TAB6_COL10 through TAB6_COL19 , you might need something more like:

awk 'NR==FNR {T[$1]; next} {if ($3 in T) $0 = "--" $0} 1' file2 file1

or if the string could appear in another field:

awk 'NR==FNR {T[" " $1 " "]; next} {for (t in T) if ($0 ~ t) $0 = "--" $0} 1' file2 file1

and, if there is a chance that a line might already have been commented out and you don't want to add multiple sets of leading hyphens:

awk 'NR==FNR {T[$1]; next} !/^--/{if ($3 in T) $0 = "--" $0} 1' file2 file1

or:

awk 'NR==FNR {T[" " $1 " "]; next} !/^--/{for (t in T) if ($0 ~ t) $0 = "--" $0} 1' file2 file1

And, of course, further adjustments would be needed if the string to be matched could appear at the start of a line or at the end of a line.
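To make the substring concern concrete, here is a small self-contained sketch of the exact-match variant. The sample data is invented for the demonstration, and it assumes the index name is always the third blank-separated field. The NF test is an addition of this sketch: without it, an empty line in the list file would create an empty key, and every input line with fewer than three fields would then be commented out.

```shell
#!/bin/sh
# Scratch directory with made-up sample data (assumption for the demo).
dir=${TMPDIR:-/tmp}/awk_exact_demo.$$
mkdir -p "$dir" && cd "$dir" || exit 1

# Names of indexes that already exist; note the empty line.
cat > spool_file.txt <<'EOF'
TAB6_COL1

EOF

cat > file1.txt <<'EOF'
CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
CREATE INDEX TAB6_COL10 ON TAB6 (COL10);
--CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
...
EOF

# First file: remember each non-empty name as an array key.
# Second file: comment out a line only if it is not commented already
# and its third field is exactly one of the remembered names.
result=$(awk 'NR==FNR {if (NF) T[$1]; next}
              !/^--/ && ($3 in T) {$0 = "--" $0}
              1' spool_file.txt file1.txt)
printf '%s\n' "$result"
```

Because "$3 in T" is a plain string lookup and not a regular expression match, TAB6_COL10 is left alone, and characters that are special in regular expressions would not matter either.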

And, if you were willing to use:

sed "s/.*pattern.*/--&/"

instead of:

sed '/pattern/s/^/--/'

and the patterns don't overlap, you could also create a file of sed commands like:

s/.*pattern1.*/--&/
s/.*pattern2.*/--&/
s/.*pattern3.*/--&/
...
s/.*patternn.*/--&/

and run sed once:

sed -f sed_commands_file input_file > output_file

but you still have to consider the possibility of a pattern matching as a substring of a longer name that you did not intend to change.
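The command file does not have to be written by hand; it can be generated from the list of names. The sketch below is one possible way, under the assumption that every statement starts with exactly "CREATE INDEX name " (single spaces, name followed by a space). The file name comment_out.sed and the sample data are made up for the demonstration. It uses the address form rather than s/.*pattern.*/--&/ so that the leading anchor also skips lines that are already commented out.

```shell
#!/bin/sh
# Scratch directory with made-up sample data (assumption for the demo).
dir=${TMPDIR:-/tmp}/sed_cmds_demo.$$
mkdir -p "$dir" && cd "$dir" || exit 1

cat > spool_file.txt <<'EOF'
TAB1_COL1
TAB6_COL1
EOF

cat > file1.txt <<'EOF'
CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
CREATE INDEX TAB6_COL10 ON TAB6 (COL10);
EOF

# Turn every non-empty name into one sed command, for example
#   /^CREATE INDEX TAB1_COL1 /s/^/--/
# "|" is the delimiter of the generating s command, so the slashes
# in the replacement need no escaping; "&" stands for the name.
sed -e '/^$/d' -e 's|.*|/^CREATE INDEX & /s/^/--/|' spool_file.txt > comment_out.sed
commands=$(cat comment_out.sed)

# One sed run over the SQL file; replace it only if sed succeeded.
sed -f comment_out.sed file1.txt > file1_temp.txt && mv file1_temp.txt file1.txt
result=$(cat file1.txt)
printf '%s\n' "$result"
```

The trailing space in each generated address keeps TAB6_COL1 from matching TAB6_COL10. If the real statements are laid out differently, the generated address has to be adapted.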

bakunin · September 13, 2016, 2:22am

I think I got it, but I ask the thread-o/p to confirm. The presentation of his problem was perhaps a bit short of optimal:

There is ONE file with some database statements:

CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
....
CREATE OR REPLACE VIEW DB_V.TAB10 AS SELECT * FROM DB_T.TAB10;
CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
...
CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
CREATE INDEX TAB7_COL1 ON TAB7 (COL1)
...
CREATE INDEX TAB10_COL1 ON TAB10 (COL1);

Now, some of these statements are already processed, others are not. Which ones are processed is recorded in a second file (the "spool file"):

TAB1_COL1
TAB2_COL1
TAB3_COL1
TAB4_COL1
TAB5_COL1
TAB6_COL1

Thread-o/p wants to be able to restart the (first) command file after it has been stopped and, to avoid running already-completed statements a second time, comment out all commands that have already run. This is done by adding a "--" at the beginning of the line. What he was trying to do was to read the spool file line by line and, for each line, prepend comment signs to all lines containing the table (? index?) in question.

@Thread-o/p: please tell us if this is correct. If yes, your script was "almost correct", but prepending something to a whole line can be done more easily:

while read ITEM ; do
     sed '/'"$ITEM"'/ s/^/-- /' /your/command/file > /your/command/file.tmp
     mv /your/command/file.tmp /your/command/file
done < /path/to/spool/file

I hope this helps.

bakunin

PS: on second thought, Don's concerns of course still apply! You have to prevent e.g. "COL1" from triggering changes in lines containing "COL19". Maybe adding a space after "$ITEM" like this

while read ITEM ; do
     sed '/'"$ITEM"' / s/^/-- /' /your/command/file > /your/command/file.tmp
     mv /your/command/file.tmp /your/command/file
done < /path/to/spool/file

would suffice, but to decide that you will have to analyse the command file you are trying to change.
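A slightly more defensive version of the loop could look like the sketch below. It rests on assumptions: the sample data is invented, and the index name is assumed to be followed by a space in every statement. Two guards are additions of this sketch. An empty line in the spool file would otherwise leave only the space in the pattern, so every line containing a space would be commented out. The /^--/! address skips lines that are already commented, so running the loop again does not stack comment signs.

```shell
#!/bin/sh
# Scratch directory with made-up sample data (assumption for the demo).
dir=${TMPDIR:-/tmp}/sed_loop_demo.$$
mkdir -p "$dir" && cd "$dir" || exit 1

# Note the empty line in the middle.
cat > spool_file.txt <<'EOF'
TAB1_COL1

TAB6_COL1
EOF

cat > file1.txt <<'EOF'
CREATE OR REPLACE VIEW DB_V.TAB1 AS SELECT * FROM DB_T.TAB1;
CREATE INDEX TAB1_COL1 ON TAB1 (COL1);
CREATE INDEX TAB6_COL1 ON TAB6 (COL1);
CREATE INDEX TAB6_COL10 ON TAB6 (COL10);
EOF

comment_out_existing() {
    while read -r ITEM; do
        # Skip empty lines: "/ /" would match nearly every statement.
        [ -n "$ITEM" ] || continue
        # Only lines not yet commented; the braces are spread over
        # separate -e options for seds that are strict about them.
        sed -e '/^--/!{' -e "/$ITEM /s/^/-- /" -e '}' file1.txt > file1.tmp &&
            mv file1.tmp file1.txt
    done < spool_file.txt
}

comment_out_existing
comment_out_existing    # a second run must not change anything
result=$(cat file1.txt)
printf '%s\n' "$result"
```

The "&&" before mv keeps the original file in place if sed fails for any reason, instead of replacing it with an empty or partial temporary file.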

chill3chee · September 14, 2016, 11:09am

Thank you Ravinder Singh, RudiC, Don Cragun and Bakunin for your inputs. I have used RudiC's suggestion along with Don's excellent considerations. The error I was facing for

sed "/$line/ s/^/--/g"

was

sed: Function TAB1_COL1 s/^/--/g cannot be parsed /

However, when I assign line=TAB1_COL1 at the command prompt, sed "/$line/ s/^/--/g" gives me the correct result. I have used the awk solution for now. Bakunin, yes, you summarized my requirement accurately. I will give your suggestion a try:

sed '/'"$ITEM"'/ s/^/-- /'

Don_Cragun · September 14, 2016, 3:41pm

There is no logical difference between the commands:

sed "/$line/ s/^/--/g"

and:

sed '/'"$line"'/ s/^/-- /'

The diagnostic message:

sed: Function TAB1_COL1 s/^/--/g cannot be parsed /

would be more likely to occur if the slashes around the pattern had been dropped... I.e.:

sed "$line s/^/--/g"

instead of:

sed "/$line/ s/^/--/g"
