[[HelloInCapitals]] to [[Hello In Capitals]]

Hello community,

I got it all done except for one thing,

[[HelloInCapitals]] or [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]

So now I want to split those into

[[Hello In Capitals]]

or

[[Pharmaceutical Society Of Great Britain V Boots 1952]]

I am not very good at this and got stuck with

sed -e 's/[A-Z]/ &/g' -e 's/[0-9].../ &/g' -e 's/^ //g' infile

But it changes EVERYTHING, and I only want to change MediaWiki links,
i.e. the text between [[ and ]]

try

sed 's/[^A-Z]*/& /g'

something to start with:

echo '[[HelloInCapitals]] or [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' | nawk -F'(\\[\\[)|(\\]\\])' '{for(i=2;i<=NF;i+=2) {gsub("[A-Z]", " &", $i);$i="[["substr($i,2)"]]"};print}' OFS=

while read -r line; do
 outline=""
 # only words that start with [[ are rewritten
 for i in $line; do
   case $i in
     "[["*) i=$(echo "$i" | sed 's/\([a-z]\)\([A-Z0-9]\)/\1 \2/g') ;;
   esac
   outline="$outline $i"
 done
 echo "${outline# }"
done <infile

Just a little improvement to avoid useless spaces before and after [[ ]]:

echo '[[HelloInCapitals]] or [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' | 
nawk -F'(\\[\\[)|(\\]\\])' '{for(i=2;i<=NF;i+=2) {gsub("[A-Z]", " &", $i);$i="[["substr($i,2)"]]"};print}' OFS=""

Jean-Pierre.

Another one with sed:

echo '[[HelloInCapitals]] or [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' | 
sed 's/[A-Z]/ &/g; s/\[ /\[/g'

Yep, caught that one as well.
Also taking care of numbers:

echo '[[HelloInCapitals]] OrAnd [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' | nawk -F'(\\[\\[)|(\\]\\])' '{for(i=2;i<=NF;i+=2) {gsub("[A-Z]|[0-9][0-9]*", " &", $i);$i="[["substr($i,2)"]]"};print}' OFS=


echo '[[HelloInCapitals]] OrAnd [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' | sed 's/[A-Z]/ &/g; s/\[ /\[/g'
[[Hello In Capitals]]  Or And [[Pharmaceutical Society Of Great Britain V Boots1952]]

perl -pe's/\B(?:[[:upper:]]|\d+)/ $&/g'

$ echo '[[HelloInCapitals]] OrAnd [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' | perl -pe's/\B(?:[[:upper:]]|\d+)\B/ $&/g'
[[Hello In Capitals]] Or And [[Pharmaceutical Society Of Great Britain VBoots 1952]]

Noticed it, work in progress.


perl -pe'
  s/\[\[(.*?)\]\]/($x=$1)=~s#\B([[:upper:]]|\d+)\B# $1#g;"[[".$x."]]"/eg
  '

This should also take care of the numbers, if you only want to process words. For lines that contain words inside double square brackets, use one of the solutions above.

sed 's/[A-Z]/ &/g; s/[0-9][0-9]*/ &/g; s/\[ /\[/g'

Slightly modified:

zsh-4.3.10[t]% print '[[HelloInCapitals]] AndOr [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' |
perl -pe'
  s/\Q[[\E(.*?)\Q]]\E/($x=$1)=~s#\B([[:upper:]]|\d+)\B# $1#g;"[[".$x."]]"/eg
  '
[[Hello In Capitals]] AndOr [[Pharmaceutical Society Of Great Britain V Boots 1952]]


This one substitutes outside of the [] as well.
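
For completeness, here is a sed-only variant that leaves everything outside the links alone. It is only a sketch and was not posted by anyone in this thread: the function name split_wiki_links is made up, and it assumes a link never contains a literal ] or a nested [[. Each pass of a loop inserts one space before a capital letter (or before the first digit of a number) that sits between an opening [[ and the next ], and the loop repeats until nothing is left to split.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): add spaces inside [[...]] only.
# Assumes links contain no literal "]" and are not nested.
split_wiki_links() {
  sed -e ':a' \
      -e 's/\(\[\[[^]]*[^][ ]\)\([A-Z]\)/\1 \2/' \
      -e 'ta' \
      -e ':b' \
      -e 's/\(\[\[[^]]*[^][ 0-9]\)\([0-9]\)/\1 \2/' \
      -e 'tb'
}
# Loop a: "[[", then any run without "]", then a character that is not
#         "]", "[" or a space, then a capital: put a space before the capital.
# Loop b: the same idea for the first digit of a run of digits.

echo '[[HelloInCapitals]] OrAnd [[PharmaceuticalSocietyOfGreatBritainVBoots1952]]' |
  split_wiki_links
# prints: [[Hello In Capitals]] OrAnd [[Pharmaceutical Society Of Great Britain V Boots 1952]]
```

Because [^]]* can never step over a closing bracket, text such as OrAnd between two links is not reachable from any [[ and stays untouched. Like the perl version above, this splits before every capital, so an acronym such as [[HTMLParser]] would come out as [[H T M L Parser]].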