regular expression [^ ]

Hi,
Could anybody explain why [^ ] evaluates to true for any string that is not null, is not the empty string, and has at least one non-space character?

My understanding is that ^ means exclude all chars inside []. So I thought it should mean anything except space.

This seems like a big mystery to me.

What language or shell script are we talking about here? Can we also see some code? Maybe your usage is wrong.

There's no mystery: '[^ ]' matches any single character that is not a space.

I thought the same as you, "match any single char that is not space". But we are wrong.

It matches any string that is not null, is not the empty string, and is not made up only of spaces. But I don't understand how that could be. I need someone to explain it.

Hard to answer when you have not said which language you are using, and there are no examples of how you are using it or of the data you are using.

[^ ] is a negated character class containing only a space. It excludes the space and nothing else, so it does not exclude other non-visible characters like tabs or newlines or other stuff I am probably not aware of:

my @t = ('m', "\t", '1', " ", "\n");
for my $i (0..$#t) {
    if ($t[$i] =~ /[^ ]/) {
        print qq{Number $i is true "$t[$i]"\n};
    }
}

The above is true for all except " ", which is a space. \t (tab) and \n (newline) are not excluded by [^ ], so they match, but they would be excluded by [^\s], which rejects every whitespace character:

my @t = ('m', "\t", '1', " ", "\n");
for my $i (0..$#t) {
    if ($t[$i] =~ /[^\s]/) {
        print qq{Number $i is true "$t[$i]"\n};
    }
}

Of course the above is Perl. But since you have not mentioned (after being asked) what language you are using, maybe I just wasted my time.

Aren't regular expressions generic across all languages? I was under the impression that regular expressions use the same syntax even across platforms, i.e. the same for Unix, Windows or Linux. I am using them in webMethods here.

Actually, regular expression support is not uniform across languages. Perl Compatible Regular Expressions (PCRE) is a very common dialect that follows Perl's regular expression syntax. The documentation for the webMethods 'Flow' language does not explicitly state that its regular expressions are PCRE (though they may be).

The original poster may have been asking about this statement from an article here: (because I hit this forum when researching the same topic)
--
Regular Expressions for Integration Server
wMUsers: For webMethods Professionals -- Knowledge Base | Regular Expressions for Integration Server
--

/[^ ]/ -- [matches a variable that] is not null, is not empty, and contains at least one non-space character.

My question was also similar - how come this does _not_ match the empty string, or null? I think the answer is that a square bracket pair ('[...]') must match some character, and the '^ ' just excludes the space character from matching. Is this correct?

I modified the Perl code helpfully posted earlier, by adding an empty string ("") as the last case:

#!/usr/bin/perl

@t = ('m', "\t", '1', " ", "\n","");
for my $i (0..$#t) {
    if ($t[$i] =~ /[^ ]/) {
       print qq{Number $i is true "$t[$i]"\n};
    } else {
       print qq{Number $i is NOT true "$t[$i]"\n};
    }
}

Running this, I get these results:

Number 0 is true "m"
Number 1 is true "      "
Number 2 is true "1"
Number 3 is NOT true " "
Number 4 is true "
"
Number 5 is NOT true ""

So does result #5 confirm my understanding above --- that the square bracket pair ('[...]') _must_ match some character, and the '^ ' just excludes the space character from matching?

A space is not null or empty; at least in Perl it's not considered to be. A space, as in " ", does contain a value: a space. In Perl, the false values are undef, 0 (zero), "0" and "". I think there is also one exception to that, a special value returned by one of Perl's functions, but I can't remember which function. I don't think it has anything to do with the square brackets, but it would make no sense to use square brackets (a character class) that contain no characters to match. I think it just boils down to the fact that spaces are not null or empty.

I agree. Basically, the way I've explained to myself why this character class regex does not match null or the empty string is that it matches _characters_ and nothing else.
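
To make that concrete, here is a minimal Perl sketch of the "a character class has to consume exactly one character" idea. The helper name has_non_space is made up for illustration, and treating undef as the equivalent of a webMethods null is an assumption; none of this is taken from the webMethods documentation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 only if $s is
# defined and contains at least one character that is not a space.
sub has_non_space {
    my ($s) = @_;
    return 0 unless defined $s;     # assumption: undef stands in for null
    return $s =~ /[^ ]/ ? 1 : 0;    # the class must consume one character
}

for my $s (undef, '', ' ', '   ', 'a', ' a ', "\t") {
    my $shown = defined $s ? qq{"$s"} : 'undef';
    print "$shown => ", has_non_space($s), "\n";
}
```

The empty string and the space-only strings give 0 because there is no character in them that the class is allowed to consume; anything with at least one non-space character (including a tab) gives 1.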

Thanks a lot Kevin, especially for your Perl script - it helped me validate my understanding.

BTW, the language this question was about :
webMethods Flow - Wikipedia, the free encyclopedia

My guess is that the original poster assumed the regular expression would only match a string that does not contain a space anywhere, i.e. expected it to match only strings consisting of a single character other than space. But most tools regard a match on any substring of the input as a success. So [^ ] matches 'foo bar' even though the string includes a space, because it contains a substring that matches the expression. (In fact every substring of length one matches except the space itself; in practice the matcher is content with the first possible match, i.e. the substring "f".)
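
A small Perl sketch of the difference between an unanchored and an anchored match, using the 'foo bar' input from the paragraph above. The anchors \A and \z are standard Perl; whether webMethods accepts the same anchors is something I have not checked, so treat that part as an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = 'foo bar';

# Unanchored: succeeds as soon as any one non-space character is found.
if ($input =~ /([^ ])/) {
    print qq{unanchored matched "$1"\n};    # the first match is "f"
}

# Anchored to exactly one non-space character: fails, the string is too long.
print "single char only: ",
    ($input =~ /\A[^ ]\z/ ? "match" : "no match"), "\n";

# Anchored, one or more non-space characters: fails because of the space.
print "no spaces at all: ",
    ($input =~ /\A[^ ]+\z/ ? "match" : "no match"), "\n";
```

If the intent is "the whole string contains no space", the anchored form with + is the one to use; the bare [^ ] only answers "is there at least one non-space character somewhere".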