How to substitute?

Hi,

I have query terms like this:

a) apple bannana

b) apple bannana AND chickko

c) "milk shake" OR Graphes orange

Wherever there is a space, I want to substitute it with the AND operator.

I tried like this:

if($query!~/"/)
{
   $query=~s/\s+/ AND /g;

}

It works fine when there are no boolean operators.

How do I do the substitution when boolean operators are already present and the query also contains spaces? For example, for apple banana orange AND grapes the output should be apple AND banana AND orange AND grapes.

The desired output for the query terms mentioned above should be like this:

apple AND bannana

apple AND bannana AND chickko

"milk shake" OR Graphes AND orange

It should not substitute a space inside double quotes; it should substitute only where two terms are separated by nothing but a space!!

How can I do that in Perl?

Any help?

with regards
Vanitha

Hi.

Here is a start that uses Text::ParseWords:

#!/usr/bin/perl

# @(#) p1       Demonstrate parsing with quotes, operators AND, OR.
# http://search.cpan.org/~chorny/Text-ParseWords-3.27/ParseWords.pm

use warnings;
use strict;
use Text::ParseWords;

my ($debug);
$debug = 0;
$debug = 1;

my ( $i, $last, $line, @tokens );

while (<>) {
  chomp;
  @tokens = quotewords( '\s+', 0, $_ );
  print " input is :@tokens:\n" if $debug;
  $last = undef;
  foreach $i (@tokens) {
    if ( not defined($last) ) {
      # First token on the line: print it as-is.
      $last = $i;
      print qin($i) . " ";
    }
    elsif ( $i eq "OR" or $i eq "AND" ) {
      # An explicit operator: pass it through.
      $last = $i;
      print qin($i) . " ";
    }
    else {
      # A term: insert AND unless the previous token was an operator.
      if ( $last ne "OR" and $last ne "AND" ) {
        print "AND " . qin($i) . " ";
      }
      else {
        print qin($i) . " ";
      }
      $last = $i;
    }
  }
  print "\n";
}

print STDERR " ( Lines read: $. )\n" if $debug;

# qin - quote if necessary.

sub qin {
  my ($phrase) = $_[0];
  if ( $phrase =~ / / ) {
    return '"' . $phrase . '"';
  }
  else {
    return $phrase;
  }
}

exit(0);

Producing (on your data in file data1):

% ./p1 data1
 input is :apple bannana:
apple AND bannana
 input is :apple bannana AND chickko:
apple AND bannana AND chickko
 input is :milk shake OR Graphes orange:
"milk shake" OR Graphes AND orange
 ( Lines read: 3 )

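See perldoc Text::ParseWords on your system or obtain it from CPAN as noted. It takes care of the quoted strings, placing all the tokens in a list. Because the second argument to quotewords is 0 here, the quotes are stripped from the tokens, which is why qin puts them back around any phrase that contains a space.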
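
To see what quotewords does with the quotes, here is a small sketch run on one of the sample queries. The variable names are only for illustration; the second argument (keep) is the only thing that differs between the two calls.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;

my $line = '"milk shake" OR Graphes orange';

# keep = 0: the quotes are removed, but the quoted phrase stays one token.
my @stripped = quotewords( '\s+', 0, $line );

# keep = 1: the quotes are left in place on the token.
my @kept = quotewords( '\s+', 1, $line );

print join( '|', @stripped ), "\n";    # milk shake|OR|Graphes|orange
print join( '|', @kept ),     "\n";    # "milk shake"|OR|Graphes|orange
```

With keep set to 1 the tokens can be printed back unchanged, so a helper like qin would not be needed.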

If the output is not what you desire, feel free to modify or adapt the code as necessary ... cheers, drl
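
One possible adaptation, as a rough sketch only: the same logic wrapped in a function that returns the rewritten query instead of printing it. The name add_and is made up for this example, and it assumes the only operators are the upper-case words AND and OR. It calls quotewords with keep set to 1, so the quoted phrases come back with their quotes still on.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;

# add_and (hypothetical name): insert AND between adjacent terms
# that are not already joined by AND or OR.
sub add_and {
  my ($query) = @_;
  my @out;

  # Drop undefined or empty tokens, which can appear when the
  # line has leading or trailing whitespace.
  my @tokens = grep { defined and length } quotewords( '\s+', 1, $query );

  foreach my $tok (@tokens) {
    my $is_op   = ( $tok eq 'AND' or $tok eq 'OR' );
    my $prev_op = ( @out and ( $out[-1] eq 'AND' or $out[-1] eq 'OR' ) );

    # Two terms in a row: put an AND between them.
    push @out, 'AND' if @out and not $is_op and not $prev_op;
    push @out, $tok;
  }
  return join ' ', @out;
}

print add_and($_), "\n" for (
  'apple bannana',
  'apple bannana AND chickko',
  '"milk shake" OR Graphes orange',
);
```

On the three sample queries this prints the same three lines as the desired output in the question. Note that quotewords also treats single quotes as quote characters and returns an empty list when the quotes are unbalanced, so a query with a stray quote or apostrophe would need extra handling.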

Hi,

Thanks a lot!!!