Split a Capital rich string

Hello all,

I am new here and am jumping straight in with a question, sorry!

I am working on a new MediaWiki site and have 1500 HTML pages I want to add to the system.

I will mostly do them one by one as they need some editing, but there is one thing I would like to do in one go.

I need to change

HelloInCapitals to Hello In Capitals

It could also be like this PharmaceuticalSocietyOfGreatBritainVBoots1952 needs to be Pharmaceutical Society Of Great Britain V Boots 1952

Another matter in the same files: due to a mistake, there are also instances of

(that is legally binding]] to (that is legally binding)

I cannot simply change ]] as I do need ]] where the word starts with [[

Can someone help me?

 echo "HelloInCapitals" | sed 's/[A-Z]/ &/g'
 Hello In Capitals

Malcom,

Thank you so much!
How can I run that on all 1500 pages?

find . -name '*.html' -type f | while IFS= read -r file; do echo "$file"; sed 's/[A-Z]/ &/g' "$file" > "$file.tmp" && mv -f "$file.tmp" "$file"; done

This gives a space at the start of the line; to be more precise:

sed 's/[A-Z]/ &/g; s/^ //'
sed 's/HelloInCapitals/Hello In Capitals/g' file

This will change the strings read from the file. The 'g' (global) flag says to make the change to all occurrences on a line.
Redirect the output to write the result to a new version of the file (the default is stdout).
Put it in a loop to do it to multiple files:

for each in dir/*
do
   cp "$each" "${each}.orig"
   sed 's/HelloInCapitals/Hello In Capitals/g' "${each}.orig" > "${each}"
done

Thanks for that but then it is better to do it with one sed process....

sed -e 's/[A-Z]/ &/g' -e 's/^ //g'

---------- Post updated at 06:45 PM ---------- Previous update was at 06:06 PM ----------

There should be a better way...

 >cat infile
PharmaceuticalSocietyOfGreatBritainVBoots1952
>sed -e 's/[A-Z]/ &/g' -e 's/[0-9].../ &/g' -e 's/^ //g' infile
Pharmaceutical Society Of Great Britain V Boots 1952
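
One thing to watch with the command above: 's/[0-9].../ &/g' only matches a digit followed by three more characters, so it relies on the number being a four-digit year. A minimal sketch of a variant that puts one space in front of any run of digits, whatever its length, could look like this (split_caps is just a name made up for the example, it is not a standard command):

```shell
# split_caps: hypothetical helper name, used only for this illustration.
# 1st expression: put a space before every capital letter.
# 2nd expression: put a space before every run of one or more digits.
# 3rd expression: drop the leading space created by the 1st expression.
split_caps() {
    printf '%s\n' "$1" | sed -e 's/[A-Z]/ &/g' -e 's/[0-9][0-9]*/ &/g' -e 's/^ //'
}

split_caps 'PharmaceuticalSocietyOfGreatBritainVBoots1952'
# prints: Pharmaceutical Society Of Great Britain V Boots 1952
split_caps 'Case42'
# prints: Case 42
```

Note that text which already has a space before a capital or a number will end up with two spaces, so this is best run only on the run-together names.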

And it is not clear what you mean below...

sed -e ":h;s/\([A-Za-z]\)\([0-9A-Z]\)/\1 \2/g;th" infile

( ... ]] is trickier unless you want to change all [[ ... ]] to ( ... )

For a couple of reasons.

  1. regular expressions are "greedy" - trying to fix this
( blah ]] de [[ blah ]]

might end up like this

( blah ]] de [[ blah )
  2. the ( ... ]] might be split over multiple lines
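
To make the greedy problem concrete, here is a small sketch using the sample line from above. The first sed uses .* and runs on to the last ]] on the line; the second uses [^]]*, which cannot cross a ], so it stops at the first ]]. The variable names are only for the example.

```shell
line='( blah ]] de [[ blah ]]'

# Greedy: .* matches as much as it can, so the match ends at the LAST ]].
greedy=$(printf '%s\n' "$line" | sed 's/\((.*\)]]/\1)/')

# Bounded: [^]]* matches anything except ], so the match ends at the FIRST ]].
bounded=$(printf '%s\n' "$line" | sed 's/\(([^]]*\)]]/\1)/')

printf '%s\n' "$greedy" "$bounded"
# prints:
# ( blah ]] de [[ blah )
# ( blah ) de [[ blah ]]
```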

This is hideous but should (might!) fix some of them:

sed "s/\(([A-Za-z ]*\)]]/\1)/;s/\(\[[A-Za-z ]*\))/\1]]/"

Thank you so much guys!

I made an edit error, and I changed something the wrong way, in all 1500 pages. Now I need to edit it as follows,

(that is legally binding]] to (that is legally binding)

But I cannot simply do a replace /]]/) because then it replaces the much-needed correct ]] as well.

So I need to change it only as follows

(]] to ()

---------- Post updated at 12:41 PM ---------- Previous update was at 11:44 AM ----------

Hi guys,

I want to use this

sed -e 's/[A-Z]/ &/g' -e 's/^ //g'

But now I only want to do it for anything in between [[ and ]], as it otherwise messes with the href links.

-hold on-

- holding on -

I am not 100% sure what you are asking for but to change ( ... ]] to ( ... ) you can perhaps try this:

sed -e 's/\(([^])]*\)]]/\1)/g'
echo "[[ (that is legally ) binding]] (that is legally binding too]]"| sed -e 's/\(([^])]*\)]]/\1)/g'
[[ (that is legally ) binding]] (that is legally binding too)

As long as it is on the same line.
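
If some of the ( ... ]] pairs are split over two or more lines, one possible workaround is to have sed collect the whole file into its pattern space first and then run the same substitution once. This is only a sketch: fix_parens is a name made up for the example, and it assumes the pages are small enough to hold in memory, which should be fine for ordinary HTML files.

```shell
# fix_parens: hypothetical helper name, reads stdin and writes stdout.
# The :a / N / ba loop appends every remaining line to the pattern space,
# so the substitution sees the file as one string with embedded newlines.
# [^])]* also matches a newline there, so "( ... ]]" may span lines.
fix_parens() {
    sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\(([^])]*\)]]/\1)/g'
}

printf '%s\n' '(that is legally' 'binding]] and [[a link]]' | fix_parens
# prints:
# (that is legally
# binding) and [[a link]]
```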

To only change for anything in between [[...]] seems rather complicated using sed..

Thank you!

Hi guys,

Ok so I run

sed -e 's/[A-Z]/ &/g' -e 's/[0-9].../ &/g' -e 's/^ //g' infile

But now it does so in the whole file, and I only need to do this between [[ and ]]. How can I run this sed only between these [[ ]]?
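
One way this might be done in plain sed is to anchor the substitution on the brackets themselves and loop until nothing more changes. The sketch below assumes each [[ ... ]] sits on a single line and contains no ] inside it; split_in_links is a name made up for the example. Each pass inserts one space between a letter and a following capital or digit, but only when that pair lies between [[ and the next ]].

```shell
# split_in_links: hypothetical helper name, reads stdin and writes stdout.
# \1 = "[[" plus text up to and including a letter
# \2 = the capital letter or digit that follows it
# \3 = the rest of the link up to and including "]]"
# "ta" jumps back to :a as long as the substitution changed something.
split_in_links() {
    sed -e ':a' \
        -e 's/\(\[\[[^]]*[A-Za-z]\)\([A-Z0-9]\)\([^]]*]]\)/\1 \2\3/' \
        -e 'ta'
}

printf '%s\n' '<a href="FooBar.html">x</a> [[HelloInCapitals]] [[VBoots1952]]' | split_in_links
# prints: <a href="FooBar.html">x</a> [[Hello In Capitals]] [[V Boots 1952]]
```

The href outside the brackets is left alone. Be aware that everything inside the brackets is treated the same way, so if a link has the form [[Target|label]] both parts would get the spaces.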