Help with parsing file with combination of pattern

I have a file1 like

    prt1|als28.1 prt3|als53.1 prt2|als550.1 prt1|bls9.2 prt2|als7.2 prt2|bls0.2
    prt2|als872.1 prt1|bls871.1    prt2|als6.2    prt4|als22.1 prt2|bls43.2

I want to create a file2 from this file by forming all possible pairs of patterns (prt), taking prt1 as the reference pattern. The number of patterns can differ from line to line in file1. For the first line of file1 there are several pairs, with each prt1 entry used as the reference (for example `prt1|als28.1 prt3|als53.1; prt1|als28.1 prt2|als550.1; prt1|als28.1 prt2|als7.2; prt1|als28.1 prt2|bls0.2; prt1|bls9.2 prt3|als53.1; prt1|bls9.2 prt2|als550.1; prt1|bls9.2 prt2|als7.2; prt1|bls9.2 prt2|bls0.2`). A combination of two prt1 entries, like `prt1|als28.1 prt1|bls9.2`, should be ignored. So the output for the first line in file2 (result) will be

    prt1|als28.1 prt3|als53.1
    prt1|als28.1 prt2|als550.1
    prt1|als28.1 prt2|als7.2
    prt1|als28.1 prt2|bls0.2
    prt1|bls9.2 prt3|als53.1
    prt1|bls9.2 prt2|als550.1
    prt1|bls9.2 prt2|als7.2
    prt1|bls9.2 prt2|bls0.2

Likewise, the output for the second line will be

    prt1|bls871.1 prt2|als872.1
    prt1|bls871.1 prt2|als6.2
    prt1|bls871.1 prt4|als22.1
    prt1|bls871.1 prt2|bls43.2

I can't figure out exactly how to do this. Any suggestions or programs would be helpful. This is the one I wrote:

    #!/usr/bin/perl
    use strict;
    use warnings;
    # Usage: perl script.pl file1 prt1 prt2
    open my $fh, '<', $ARGV[0] or die "\n can not open file $ARGV[0]\n";
    my $pattern1 = $ARGV[1];
    my $otherpattern = $ARGV[2];
    while (my $line = <$fh>)
    {
        # Only the first match per line is printed, so this does not yet give all pairs
        if ($line =~ / ($pattern1\S+)/i) { print $1; }
        {
            if ($line =~ /  ($otherpattern\S+)/i)
            {
                print "\t".$1."\n";
            }
            else
            {
                if ($line =~ m/\bNo pairs found\b/)
                {
                    print "\t".$line;
                    print "\t"."No pairs Found"."\n";
                }
            }
        }
    }
    close $fh;

How about

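    awk '
    {
        c++
        # A collects the prt1 entries of this line, B everything else
        for (i = 1; i <= NF; i++)
            if ($i ~ /prt1/) A[$i]
            else             B[$i]
        # One output file per input line: file1, file2, ...
        # (keep the input under a different name so it is not overwritten)
        for (i in A) for (j in B) print i, j > ("file" c)
        delete A; delete B
    }' file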
file1:

    prt1|bls9.2 prt2|bls0.2
    prt1|bls9.2 prt3|als53.1
    prt1|bls9.2 prt2|als550.1
    prt1|bls9.2 prt2|als7.2
    prt1|als28.1 prt2|bls0.2
    prt1|als28.1 prt3|als53.1
    prt1|als28.1 prt2|als550.1
    prt1|als28.1 prt2|als7.2

file2:

    prt1|bls871.1 prt4|als22.1
    prt1|bls871.1 prt2|als872.1
    prt1|bls871.1 prt2|bls43.2
    prt1|bls871.1 prt2|als6.2
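
If a single result file is preferred, as in the original request for one file2, the same idea can be sketched with counted arrays instead of `for (i in A)`. This is only a sketch under a few assumptions: the function name `pair_refs` and the awk variable `ref` are made up for illustration, and a field counts as a reference entry only when it starts with the reference name followed by `|`, so `prt1` would not also pick up something like `prt10|...`. As a side effect the pairs come out in input order.

```shell
# pair_refs REF FILE: print every "reference-entry other-entry" pair, line by line.
# Hypothetical helper; the names pair_refs and ref are not from the thread.
pair_refs() {
    awk -v ref="$1" '
    {
        # reset the counters for each input line; stale array slots are never read
        na = nb = 0
        for (i = 1; i <= NF; i++) {
            # a field is a reference entry if it starts with "REF|"
            if (index($i, ref "|") == 1) a[++na] = $i
            else                         b[++nb] = $i
        }
        # pair each reference entry with each non-reference entry, in input order
        for (x = 1; x <= na; x++)
            for (y = 1; y <= nb; y++)
                print a[x], b[y]
    }' "$2"
}
```

It could then be called as `pair_refs prt1 file1 > file2`.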

Does the output order matter?

No, the order does not matter.

If you are still interested in a Perl solution:

    #!/usr/bin/perl

    use strict;
    use warnings;

    my $filename = shift or die "Missing filename to operate on";
    my $re = shift or die "Missing regex to match";

    open my $fh, '<', $filename or die "Could not open $filename: $!\n";

    while (my $line = <$fh>) {
        chomp $line;
        print "Line #$.\n";
        # split ' ' skips leading whitespace, so no empty first field
        my @fields = split ' ', $line;
        # fields matching the reference regex (e.g. prt1)
        my @patterns = grep{/$re/} @fields;

        # everything that is not a reference field
        my %patterns = map{$_ => 1} @patterns;
        my @NF = grep(!defined $patterns{$_}, @fields);

        # pair each reference field with each remaining field
        for my $pattern (@patterns) {
            for my $field (@NF) {
                print "$pattern $field\n";
            }
        }
        print "\n";
    }
    close $fh;

Result:

    perl prog.pl filename prt1
    Line #1
    prt1|als28.1 prt3|als53.1
    prt1|als28.1 prt2|als550.1
    prt1|als28.1 prt2|als7.2
    prt1|als28.1 prt2|bls0.2
    prt1|bls9.2 prt3|als53.1
    prt1|bls9.2 prt2|als550.1
    prt1|bls9.2 prt2|als7.2
    prt1|bls9.2 prt2|bls0.2

    Line #2
    prt1|bls871.1 prt2|als872.1
    prt1|bls871.1 prt2|als6.2
    prt1|bls871.1 prt4|als22.1
    prt1|bls871.1 prt2|bls43.2
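
A quick way to sanity-check a result like this is to count how many pairs each input line should produce: the number of reference entries times the number of other entries (2 × 4 = 8 for the first sample line, 1 × 4 = 4 for the second). The sketch below is only illustrative; the function name `count_pairs` and its output layout are assumptions, and a field is treated as a reference entry when it starts with the reference name followed by `|`.

```shell
# count_pairs REF FILE: for each input line print
# "line-number reference-count other-count expected-pairs".
# Hypothetical helper; the name and the output layout are assumptions.
count_pairs() {
    awk -v ref="$1" '
    {
        refs = 0
        for (i = 1; i <= NF; i++)
            if (index($i, ref "|") == 1) refs++   # field starts with "REF|"
        others = NF - refs
        print NR, refs, others, refs * others
    }' "$2"
}
```

For example, `count_pairs prt1 filename` prints one summary row per input line, and the last column can be compared with the number of pairs printed for that line.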