Creating Ligatures in Urdu using delimiters

Hello,
I want to create a test bed for Urdu ligatural forms. One of the main components is a delimiter list: the forms after which no connection to the following letter can be made.
What I need is a tool that takes running text or a list of words in a file and splits each word as soon as a delimiter is encountered. A sample will explain the process.
I am using Latin script to keep the example simple.
DELIMITERS: Let us assume that the delimiters are:

a,e,i,o,u

Each delimiter is separated by a comma.
INPUT:

baker
convoluted
perspicacity

EXPECTED OUTPUT

ba ke r
co nvo lu te d
pe rspi ca ci ty

i.e. the string is split after each delimiter and a space is inserted.
Please note that if I had also put

aeo

as a delimiter, then a string such as:

archaeological

would be split as

a rchaeo lo gi ca l

At present I use a macro to do the job, but the process is extremely slow.
An AWK or Perl script would be of great help, since my OS is Windows.
Many thanks
p.s. Just in case someone is interested in tweaking Urdu, a sample delimiter list is provided below:

,,,,,,,,,,,,

and here is a sample text:

Here's one way I could think of:

[user@host ~]$ cat file
baker
convoluted
perspicacity
[user@host ~]$ cat test.pl
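#!/usr/bin/perl
use strict;
use warnings;

# Characters after which a space is inserted
my @delims = qw/ a e i o u /;

open my $in, '<', 'file' or die "Can't open file: $!";
while (my $str = <$in>) {
    chomp $str;
    for my $x (split //, $str) {
        # Append a space when $x is one of the delimiters
        print( (grep { $_ eq $x } @delims) ? "$x " : $x );
    }
    print "\n";
}
close $in;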
[user@host ~]$
[user@host ~]$ ./test.pl
ba ke r
co nvo lu te d
pe rspi ca ci ty
[user@host ~]$

Hello,
It worked beautifully for the English samples. However, the moment I plugged in the Urdu delimiters, it did not work.
I suppose this is because Perl does not support UTF-8. I even tried saving the script as UTF-8 with no byte order mark, but it did not work.
The only change I made in the script was to replace the delimiters with my own:

my @delims = qw /              /;

Each is separated by a space, as in your case.
Just for testing, here is a small sample on which I tried it:

Even if the Urdu script is unfamiliar to you, you should see a space between the ligatural forms, but the Perl script prints the sample file unchanged.
How do I get around this issue?
Any help or suggestions, please.
Many thanks

I don't know Perl, but if bash supports UTF-8 (which Urdu needs, as you say), this might work:

thefile:

archelogogy
testing
abcdefg
aoeio

The script:

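replacers=(a e i o u)
while read -r line; do
    # Feed the current line through one sed call per delimiter
    for repl in "${replacers[@]}"; do
        line=$(printf '%s\n' "$line" | sed "s,$repl, $repl ,g")
    done
    # Left unquoted on purpose: word splitting collapses the doubled spaces
    echo $line
done < thefile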

Calling: thescript
Results:

a rch e l o g o gy
t e st i ng
a bcd e fg
a o e i o

hth
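
A single sed pass may be simpler than looping over the delimiters, and it also covers multi-character delimiters such as "aeo" from the original question. This is only a rough sketch: `split_after_delims` is a made-up helper name, it assumes the delimiters contain no regex metacharacters, and it relies on sed's POSIX leftmost-longest matching so that "aeo" wins over "a".

```shell
#!/bin/sh
# split_after_delims is a hypothetical helper, not part of the script above.
# Usage: split_after_delims 'a,e,i,o,u,aeo' < input
# Assumes the delimiters contain no regex metacharacters.
split_after_delims() {
    # "a,e,i" becomes the extended-regex alternation "a|e|i"
    pattern=$(printf '%s\n' "$1" | sed 's/,/|/g')
    # POSIX matching is leftmost-longest, so "aeo" is preferred over "a"
    sed -E "s/($pattern)/\1 /g"
}

printf '%s\n' baker archaeological | split_after_delims 'a,e,i,o,u,aeo'
```

This puts the space only after each delimiter, which is the format asked for in the first post.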

$ 
$ cat delimiters.txt
,,,,,,,,,,,,
$ 
$ 
$ cat sample.txt










$ 
$ 
$ cat create_ligatures.pl
#!/usr/bin/perl
use strict;
use warnings;
use open ':encoding(UTF-8)';          # files opened below are decoded as UTF-8
binmode(STDOUT, ':encoding(UTF-8)');  # encode the output as UTF-8

# Read the comma-separated delimiter list (it may span several lines)
my $delim_file = "delimiters.txt";
my %is_delim;
open(my $dl, "<", $delim_file) or die "Can't open $delim_file: $!";
while (<$dl>) {
  chomp;
  $is_delim{$_} = 1 for grep { length } split /,/;
}
close($dl) or die "Can't close $delim_file: $!";

# Print each character, adding a space after every delimiter
my $data_file = "sample.txt";
open(my $fh, "<", $data_file) or die "Can't open $data_file: $!";
while (<$fh>) {
  chomp;
  for my $x (split //, $_) {
    print($is_delim{$x} ? "$x " : $x);
  }
  print "\n";
}
close($fh) or die "Can't close $data_file: $!";
$ 
$ 
$ perl create_ligatures.pl
 
   
  
  
 
 
  
    

 
$ 
$ 
$ 

As stated above but with one change:

IFS=, read -r -a replacers < delimiters.txt

hth
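
If it helps, here is a rough sketch that reads the comma-separated list straight from delimiters.txt and makes one sed pass over the text. `split_file` is a made-up helper name, and it assumes a one-line delimiter file with no regex metacharacters. Each delimiter is matched as a literal string rather than inside a bracket expression, so multibyte UTF-8 letters should be fine, though I have only reasoned that through, not tried it on a Windows port of sed.

```shell
#!/bin/sh
# split_file is a hypothetical helper: split_file DELIM_FILE [TEXT_FILE...]
# Assumes a one-line, comma-separated delimiter file without regex metacharacters.
split_file() {
    delim_file=$1
    shift
    # Drop a Windows carriage return, then turn "x,y,z" into "x|y|z".
    # Runs of commas collapse, so empty entries leave no empty alternatives.
    pattern=$(head -n 1 "$delim_file" | tr -d '\r' |
        sed -e 's/,,*/|/g' -e 's/^|//' -e 's/|$//')
    # With no delimiters at all, pass the text through unchanged
    if [ -z "$pattern" ]; then
        cat "$@"
        return
    fi
    # Insert a space after every delimiter; reads stdin if no files are given
    sed -E "s/($pattern)/\1 /g" "$@"
}
```

Calling it would look like `split_file delimiters.txt sample.txt > output.txt`.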