a) apple bannana
b) apple bannana AND chickko
c) "milk shake" OR Graphes orange
Wherever there is a space between terms, it should be replaced with the AND operator.
I tried this:
if($query!~/"/)
{
$query=~s/\s+/ AND /g;
}
It works fine when there are no boolean operators.
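To make the failure concrete, here is a small sketch that wraps the substitution above in a helper (the name naive_and is made up for illustration) and runs it on two of the sample queries. The second call shows the doubled operators that appear once an AND is already present in the query.

```perl
use strict;
use warnings;

# Naive approach from the snippet above: every run of whitespace
# becomes " AND ", unless the query contains a double quote.
# naive_and is a hypothetical name, not part of the original code.
sub naive_and {
    my ($query) = @_;
    $query =~ s/\s+/ AND /g if $query !~ /"/;
    return $query;
}

print naive_and('apple bannana'), "\n";
# prints: apple AND bannana

print naive_and('apple bannana AND chickko'), "\n";
# prints: apple AND bannana AND AND AND chickko
```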
How do I do the substitution when boolean operators are already present and the query also contains plain spaces? For example, for the query apple banana orange AND grapes the output should be apple AND banana AND orange AND grapes.
The desired output for the query terms mentioned above should be like this:
apple AND bannana
apple AND bannana AND chickko
"milk shake" OR Graphes AND orange
It should not substitute spaces inside double quotes; it should only substitute a space that separates two terms with no operator between them.
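Read as rules, the requirements above could be sketched with core Perl only: split the query into tokens (a double-quoted phrase or a run of non-space characters), then insert AND between two neighbouring tokens when neither is already AND or OR. The helper name add_implicit_and is an assumption made for this sketch, and it assumes quotes are balanced and operators are written in upper case.

```perl
use strict;
use warnings;

# Hypothetical helper: insert AND between adjacent terms, leaving
# quoted phrases and existing AND/OR operators untouched.
sub add_implicit_and {
    my ($query) = @_;

    # A token is either a double-quoted phrase or a run of non-space characters.
    my @tokens = $query =~ /"[^"]*"|\S+/g;

    my @out;
    for my $tok (@tokens) {
        my $is_op      = $tok eq 'AND' || $tok eq 'OR';
        my $prev_is_op = @out && ( $out[-1] eq 'AND' || $out[-1] eq 'OR' );

        # Two terms side by side with no operator between them get an AND.
        push @out, 'AND' if @out && !$is_op && !$prev_is_op;
        push @out, $tok;
    }
    return join ' ', @out;
}

print add_implicit_and($_), "\n"
    for 'apple bannana',
        'apple bannana AND chickko',
        '"milk shake" OR Graphes orange';
```

On the three sample queries this prints the three desired lines listed above.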
#!/usr/bin/perl

# @(#) p1 Demonstrate parsing with quotes, operators AND, OR.
# http://search.cpan.org/~chorny/Text-ParseWords-3.27/ParseWords.pm

use warnings;
use strict;
use Text::ParseWords;

my ($debug);
$debug = 0;
$debug = 1;

my ( $i, $last, $line, @tokens );

while (<>) {
    chomp;
    @tokens = quotewords( '\s+', 0, $_ );
    print " input is :@tokens:\n" if $debug;
    $last = undef;
    foreach $i (@tokens) {
        if ( not defined($last) ) {
            $last = $i;
            print qin($i) . " ";
        }
        elsif ( $i eq "OR" or $i eq "AND" ) {
            $last = $i;
            print qin($i) . " ";
        }
        else {
            if ( $last ne "OR" and $last ne "AND" ) {
                print "AND " . qin($i) . " ";
            }
            else {
                print qin("$i") . " ";
            }
            $last = $i;
        }
    }
    print "\n";
}

print STDERR " ( Lines read: $. )\n" if $debug;

# qin - quote if necessary.
sub qin {
    my ($phrase) = $_[0];
    if ( $phrase =~ / / ) {
        return '"' . $phrase . '"';
    }
    else {
        return $phrase;
    }
}

exit(0);
Producing (on your data in file data1):
% ./p1 data1
input is :apple bannana:
apple AND bannana
input is :apple bannana AND chickko:
apple AND bannana AND chickko
input is :milk shake OR Graphes orange:
"milk shake" OR Graphes AND orange
( Lines read: 3 )
See perldoc Text::ParseWords on your system, or obtain the module from CPAN as noted. It takes care of the quoted strings and places all the tokens in a list.
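As a small illustration of that tokenizing step, the sketch below calls quotewords on one of the sample lines with the keep argument set to 0 and then to 1. With 0 the quotes are stripped (which is why the script re-quotes phrases in qin); with 1 they stay attached to the token. The variable names are only for this example.

```perl
use strict;
use warnings;
use Text::ParseWords;

my $line = '"milk shake" OR Graphes orange';

# keep = 0: quotes are stripped, but the phrase stays a single token.
my @stripped = quotewords( '\s+', 0, $line );

# keep = 1: the quotes are retained as part of the token.
my @kept = quotewords( '\s+', 1, $line );

print join( '|', @stripped ), "\n";    # milk shake|OR|Graphes|orange
print join( '|', @kept ),     "\n";    # "milk shake"|OR|Graphes|orange
```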
If the output is not what you desire, feel free to modify or adapt the code as necessary ... cheers, drl