bingel
September 24, 2010, 11:35am
1
I have a text file (text.txt) and I would like to replace only the first 2 occurrences of a word (though I might need to replace more).
For example, if the text is this:
CAR sweet head
hat red yellow
CAR book brown
tiger CAR cow CAR
CAR milk
I would like to replace the word "CAR" with the word "REPLACE" only in rows 1 and 3, but not in rows 4 and 5, to obtain a result like this:
REPLACE sweet head
hat red yellow
REPLACE book brown
tiger CAR cow CAR
CAR milk
but if I use:
cat text.txt | awk '{gsub("CAR","REPLACE");print}'
I will obtain this:
REPLACE sweet head
hat red yellow
REPLACE book brown
tiger REPLACE cow REPLACE
REPLACE milk
Is there a way to obtain what I need (if possible using awk and gsub)?
Thanks in advance.
kurumi
September 24, 2010, 11:39am
2
$ ruby -00 -ne 'print $_.split("CAR",3).join("REPLACE")' file
REPLACE sweet head
hat red yellow
REPLACE book brown
tiger CAR cow CAR
CAR milk
cat text.txt |
awk '/CAR/{
    if ( count < 2 ){ gsub("CAR","REPLACE")}
    count++
    print
}'
...oh... and you can pass in the 2 as a variable:
cat text.txt |
awk '/CAR/{
    if ( count < counter ){ gsub("CAR","REPLACE")}
    count++
    print
}' counter=$counter
bingel
September 24, 2010, 11:59am
4
Thanks, but using awk and gsub?
I don't know Ruby, and I would like to use awk so that I can adapt the script to my needs.
Thanks again
---------- Post updated at 04:44 PM ---------- Previous update was at 04:43 PM ----------
Thanks again, I had not seen the last message
---------- Post updated at 04:55 PM ---------- Previous update was at 04:44 PM ----------
@quirkasaurus
I have tested your first code but I obtain this output:
REPLACE sweet head
REPLACE book brown
tiger CAR cow CAR
CAR milk
The second row is deleted
---------- Post updated at 04:59 PM ---------- Previous update was at 04:55 PM ----------
With this code it runs:
cat text.txt | awk '{if ( count < 3 ){ gsub("CAR","REPLACE")} count++; print}'
oops. sorry. didn't notice that.
cat text.txt |
awk '{
    if ( count < 2 ){
        gsub("CAR","REPLACE")
        count++
    }
    print
}'
output:
REPLACE sweet head
hat red yellow
CAR book brown
tiger CAR cow CAR
CAR milk
bingel
September 24, 2010, 12:05pm
6
I need code like this because I have a big file to search, and I would like to stop searching once the first 2 occurrences have been found. With gsub I think the search continues until the end of the file, whereas with this code (using sub in place of gsub):
cat text.txt | awk '{sub("CAR","REPLACE");print}'
the search runs only until awk finds the first occurrence. Is that right?
quirkasaurus,
your code is not replacing the 2nd occurrence.
A slight change to your code:
awk '/CAR/ && count < 2 {gsub("CAR","REPLACE");count++} {print $0}' infile
bingel
September 24, 2010, 12:17pm
8
As I said, the file is very large, so I would like to stop the search as soon as the first X occurrences have been found. Do you think the search will be faster if I use the sub function instead of gsub?
cat text.txt |
awk '{
    if ( count < 2 ){
        sub("CAR","REPLACE")
        count++
    }
    print
}'
---------- Post updated at 05:17 PM ---------- Previous update was at 05:13 PM ----------
What do you think about this?
cat text.txt | awk '{if ( count < 3 ){ sub("CAR","REPLACE")} count++; print}'
Scott
September 24, 2010, 12:17pm
9
Hi.
sub replaces the first occurrence on a line; gsub replaces all occurrences on a line, not all occurrences in a file.
awk '/CAR/ {
    if ( count++ < 2 )
        gsub("CAR","REPLACE")
}1' file
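To make the per-line behaviour concrete, here is a small sketch run on a single line taken from the sample text. The variable names first_only and all_matches are only for illustration.

```shell
line='tiger CAR cow CAR'

# sub() rewrites only the first match on the line.
first_only=$(printf '%s\n' "$line" | awk '{ sub(/CAR/, "REPLACE"); print }')

# gsub() rewrites every match on the line.
all_matches=$(printf '%s\n' "$line" | awk '{ gsub(/CAR/, "REPLACE"); print }')

printf '%s\n%s\n' "$first_only" "$all_matches"
```

Either way awk still reads every line of the file; the two functions differ only in how much of each line they rewrite.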
If you exit from the code after X occurrences, you will get output only up to that record.
awk '/CAR/ && count < 2 {gsub("CAR","REPLACE");count++} {print $0;if(count == 2) {exit}}' infile
Output:
REPLACE sweet head
hat red yellow
REPLACE book brown
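If the rest of the file is still needed, one option (a sketch, not taken from the posts above) is to keep printing every line but put the counter test first, so that the short-circuit && skips the regex match once the limit is reached. The function name replace_first_lines and the variable max are hypothetical; like the code above, it counts matching lines rather than individual occurrences.

```shell
# Hypothetical helper: $1 is the number of matching lines to rewrite.
replace_first_lines() {
    awk -v max="$1" '
        # count < max is tested first, so /CAR/ is no longer evaluated
        # after the limit has been reached.
        count < max && /CAR/ { gsub(/CAR/, "REPLACE"); count++ }
        # Every line is printed, changed or not.
        { print }
    '
}

# Sample input from the first post, with a limit of 2.
printf 'CAR sweet head\nhat red yellow\nCAR book brown\ntiger CAR cow CAR\nCAR milk\n' |
    replace_first_lines 2
```

awk still has to read and print the remaining lines, so this saves only the pattern matching, not the I/O.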
I haven't tested the Ruby version, but all the solutions above fail if you want to replace 2 occurrences in a file like this:
CAR sweet CAR woman CAR
CAR red yellow
CAR book brown
tiger CAR cow CAR
CAR milk
Try something like this:
awk '{while(index($0,"CAR") && ++n < 3){sub("CAR","REPLACE")}}1' file
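The same idea can be written with the limit passed in as a variable, using the return value of sub (1 when it replaced something, 0 otherwise) to drive the loop. This is only a sketch: replace_first_n and max are names made up here, and it assumes the replacement text does not itself contain CAR.

```shell
# Hypothetical helper: $1 is the number of occurrences to replace,
# counted across the whole input rather than per line.
replace_first_n() {
    awk -v max="$1" '
        {
            # Keep replacing the first remaining match on this line
            # until the line has no match left or the limit is reached.
            while (n < max && sub(/CAR/, "REPLACE"))
                n++
            print
        }
    '
}

# Sample input with several occurrences on the first line.
printf 'CAR sweet CAR woman CAR\nCAR red yellow\nCAR book brown\ntiger CAR cow CAR\nCAR milk\n' |
    replace_first_n 2
```

With a limit of 2, only the first two CARs on the first line should change, and every later line is printed untouched.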
kurumi
September 24, 2010, 8:04pm
12
Yes, it does produce the correct output, since it's splitting with a limit. However, you need to change the regex to \bCAR\b if a word boundary is required. Another awk way, assuming CAR is not bounded:
$ awk 'BEGIN{RS="CAR";ORS="REPLACE"}NR>2{ORS="CAR"}1' file
REPLACE sweet REPLACE woman CAR
CAR red yellow
CAR book brown
tiger CAR cow CAR
CAR milk
Yet another way of doing the same...
awk '{
    if (cnt < 2)
        for (i=1;i<=NF;i++)
            if ($i=="CAR" && cnt<2)
                cnt += sub("CAR","REPLACE",$i)
    print
}' file