Hi! I'm trying to separate text into sentences, like this:
$pattern = "/[A-Z]([a-z]|[[:space:]]|,)*[\.\!\?:]*/";
preg_match_all($pattern, $text, $matches);
This works fine unless the text contains multibyte characters, like "åäö". How can I make this work with these exotic characters?
An example phrase that doesn't match:
"Detta är ett test!"
The character 'ä' prevents a match, but I would also like to match those characters.
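For illustration, here is a minimal sketch of one possible direction, not a confirmed solution. It assumes that $text is valid UTF-8 and that the PCRE build behind preg_match_all supports Unicode properties. The function name match_sentences_utf8 is hypothetical. The idea is to add the "u" modifier and swap the ASCII ranges [A-Z] and [a-z] for the Unicode classes \p{Lu} and \p{Ll}:

```php
<?php
// Hypothetical helper, not part of the original code.
// Assumption: $text is valid UTF-8.
function match_sentences_utf8(string $text): array
{
    // \p{Lu} matches any uppercase letter, \p{Ll} any lowercase letter.
    // The trailing "u" modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/\p{Lu}[\p{Ll}\s,]*[.!?:]*/u';

    // preg_match_all returns false on error, e.g. when $text is not valid UTF-8.
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }

    // $matches[0] holds the full-pattern matches.
    return $matches[0];
}

print_r(match_sentences_utf8("Detta är ett test!"));
```

With UTF-8 input, the example phrase should come back as a single match, "Detta är ett test!". If the text is in another encoding such as ISO-8859-1, it would have to be converted to UTF-8 first, since the "u" modifier rejects invalid UTF-8.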