loop through two files based on a variable

Hi guys. I have two files. For each key (chr1, chr2, ...) in file1, the script has to loop through the entries in the second file that have the same key and pick the number with the smallest difference after subtraction. Sorry if I made my question complicated.

file1
chr2 1989
chr2 2500
chr1 1500

file 2
chr1 1339
chr2 2000
chr2 3000
chr2 1200
chr1 1600
chr2 4000

output
chr2...1989.....2000
chr2...2500.....3000
chr1...1500.....1600

Hi,

Does anyone know how to find the smallest distance between values in different columns?
For example, I have column1 and column2:
-73.924598,40.879010
-73.924506,40.878978
-73.924506,40.878978
-73.921406,40.878178
-73.921406,40.878178
-73.920806,40.878578
-73.920206,40.878978
-73.920206,40.878978
-73.918706,40.876578
-73.918706,40.876578

How can I see which of the points in the second column is closest to the first point in the first column, and so on for each point in the first column?
How should I do this?

Hi, I have been trying to print the closest numbers between two arrays, @set and @vals, but the script I'm using only gives output for the first number in the first array (15), compared against all values in the second array. The only output I get is 56. I also need the values closest to 150, 200 and 250. Is there anything wrong with the script?
Sorry if my question is perplexing.

the script I'm using

#!usr/bin/perl
use strict;
use warnings;

my @set = (15, 150, 200, 250);
my @vals = (208, 258, 56, 123);

print closest(@set, @vals), "\n";

sub closest {
my $val = shift;
my @list = sort { abs($a - $val) <=> abs($b - $val) } @_;
$list[0];
}

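All your code is doing is sorting a flattened list: closest(@set, @vals) receives one combined list, shift takes 15 as the target, and everything else (including 150, 200 and 250) is sorted by its distance from 15. Since 56 is the closest to 15, that's what gets returned. But after reading your post I can't figure out what it is you are actually wanting to do. Can you clarify?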
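One way to get a result for every element of @set is to call closest once per value, so that shift picks up a single target and the rest of @_ is only @vals. A minimal sketch, assuming that is the intent, with the closest sub from the question otherwise unchanged:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @set  = (15, 150, 200, 250);
my @vals = (208, 258, 56, 123);

# Return the element of the list that is numerically nearest to the first argument.
sub closest {
    my $val  = shift;
    my @list = sort { abs($a - $val) <=> abs($b - $val) } @_;
    return $list[0];
}

# Call closest once per target so the two arrays are never flattened together.
foreach my $n (@set) {
    print "Closest to $n = ", closest($n, @vals), "\n";
}
```

With these arrays the loop reports 56, 123, 208 and 258 for 15, 150, 200 and 250 respectively.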

Sorting the columns should be the way to go.

Yeah, you are right, but I need a way to calculate it.
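If sorting is the route taken, one possible way to do the calculation is to sort the second list once and binary-search it for each target. The sketch below is only an illustration: nearest_sorted is a hypothetical helper name, and the sample numbers are the @set and @vals arrays posted earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the value in a numerically sorted array ref that is nearest to $target.
# nearest_sorted is a hypothetical helper name, not something from the thread.
sub nearest_sorted {
    my ($target, $sorted) = @_;
    return undef unless @$sorted;
    my ($lo, $hi) = (0, $#$sorted);
    # Binary search for the first element >= $target (or the last element if none is).
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($sorted->[$mid] < $target) { $lo = $mid + 1 }
        else                           { $hi = $mid }
    }
    # The answer is either that element or the one just before it.
    if ($lo > 0 && abs($sorted->[$lo - 1] - $target) <= abs($sorted->[$lo] - $target)) {
        return $sorted->[$lo - 1];
    }
    return $sorted->[$lo];
}

my @set    = (15, 150, 200, 250);
my @sorted = sort { $a <=> $b } (208, 258, 56, 123);   # sort the second list once

print "Closest to $_ = ", nearest_sorted($_, \@sorted), "\n" for @set;
```

For short lists a plain scan is just as good; the sorted search only pays off when the second column is large.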

No.
But I do agree my question needs more explanation.
If I change @set to hold only one number, like 200, the output is 208, the closest number to 200.
So with only one value the script works properly.
I want to extract, for each value in column1, the closest number in column2. The value 3 in column1 has to be compared against all the values in column2 (4, 6, 8, 1) and give the closest number, i.e. 4, and vice versa.
The crux is that each value of column1 has to be checked against all the values in column2.
I have wasted 2 days on this; I would be really grateful if you could answer.

3....4
4....6
......8
......1

output
3.....4
4.....6

No to what?

I am still pretty confused, but is this what you are trying to do?

#!/usr/bin/perl
use strict;
use warnings;

my @set = (15, 150, 200, 250);
my @vals = (208, 258, 56, 123);

my @vals_sorted = sort {$a <=> $b} @vals;
foreach my $n (@set) {
   print "Closest to $n = ", shift @vals_sorted, "\n";
}	

I don't understand what you mean. Are you prohibited from using sort?

Also the columns already appear to be sorted so the next line is the closest one, no?

Hi Kevin,
The logic is perfect,
but instead of writing the values into the arrays by hand I would like to fill them from column1 and column2 of a file, i.e. define @set as the values from column 1 and @vals as the values from column 2.
Input file:
chr1 100 112
chr1 150 300
chr1 80 400
.............286
.............100

The script needs correction:

#!usr/bin/perl -w
use strict;
use warnings;
my $infile1 = 're5.txt';
open IN10, "< $infile1" or die "Can't open $infile1 : $!";
my %values;
while (<F1>) {
chomp;
my ($chrom,$value1,$value2) = split /\t/;
my $rec = {value1 => $value1, value2 => $value2
};
push @{$vlaues{$chrom}}, $rec;
}
my @set = $value1;
my @vals = $value2;
my @vals_sorted = sort {$a <=> $b} @vals;
foreach my $n (@set) {
print "Closest to $n = ", shift @vals_sorted, "\n";
}
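The script above opens IN10 but reads from F1, misspells %values, and uses $value1 and $value2 after the loop in which they were declared. A possible corrected sketch follows. It assumes the file is tab-separated and that rows with only a third field start with two empty fields; the in-memory string is a stand-in for re5.txt so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for re5.txt (assumption: tab-separated, short rows have empty leading fields).
my $data = "chr1\t100\t112\nchr1\t150\t300\nchr1\t80\t400\n\t\t286\n\t\t100\n";
open my $in, '<', \$data or die "Can't open in-memory data: $!";

my (@set, @vals);    # declared outside the loop so they are still visible after it
while (my $line = <$in>) {
    chomp $line;
    my ($chrom, $value1, $value2) = split /\t/, $line;
    # Column 2 may be longer than column 1, so keep only the fields that are present.
    push @set,  $value1 if defined $value1 && length $value1;
    push @vals, $value2 if defined $value2 && length $value2;
}
close $in;

# For each column 1 value, pick the column 2 value with the smallest difference.
foreach my $n (@set) {
    my ($best) = sort { abs($a - $n) <=> abs($b - $n) } @vals;
    print "Closest to $n = $best\n";
}
```

To read the real file, replace the in-memory open with a three-argument open on 're5.txt'.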

Thanks for the ideas.
Funny laughs at the other mails.

If you don't mind, I will post it to you.

#!/usr/bin/perl
$infile1 = 'file.txt';
$infile2 = 'cpg2.txt';
$outfile7 = 'out10.txt';
open IN10, "< $infile1" or die "Can't open $infile1 : $!";
open IN11, "< $infile2" or die "Can't open $infile2 : $!";
open OUT7, "> $outfile7" or die "Can't open $outfile7 : $!";

my %chromes;
my %chromes1;
while (<IN10>) {
chomp;
my ($arrayid,$ncrnaid1,$ncrnaid2,$ncrnaid3,$ncrnaid4,$ncrnaid5,$ncrnaid6,$ncrnaid7,$ncrnaid8,$ncrnaid9,$ncrnaid10,$ncrnaid11,$ncrnaid12, $chrom,$start,$end,$cstrand,$en,$esi,$est) = split /\t/;
my $rec = {arrayid => $arrayid,
ncrnaid1 => $ncrnaid1,
ncrnaid2 => $ncrnaid2,
ncrnaid3 => $ncrnaid3,
ncrnaid4 => $ncrnaid4,
ncrnaid5 => $ncrnaid5,
ncrnaid6 => $ncrnaid6,
ncrnaid7 => $ncrnaid7,
ncrnaid8 => $ncrnaid8,
ncrnaid9 => $ncrnaid9,
ncrnaid10 => $ncrnaid10,
ncrnaid11 => $ncrnaid11,
ncrnaid12 => $ncrnaid12,
start => $start,
end => $end,
cstrand => $cstrand,
en => $en,
esi => $esi,
est => $est};
push @{$chromes{$chrom}}, $rec;
}
my @arrayids;
sub input {
my @attrs =qw(chrom start);
while (<IN10>) {
chomp;
my %rec;
@rec{@attrs} = split /\t/;
push @arrayids,\%rec;
}
}
foreach my $chrom (sort keys %chromes){
my $count = scalar @{$chromes{$chrom}};
print OUT7 "$chrom\t$count\t\n\n";
print OUT7 map {"\t\t$_->{start}\t\n"} @{$chromes{$chrom}};
}

#########################################

while (<IN11>) {
chomp;
my ($cchrom,$middle) = split /\t/;
my $cpg = {middle => $middle};
push @{$chromes1{$cchrom}}, $cpg;
}
my @cpgids;
sub input {
my @cpgs =qw(cchrom middle);
while (<IN11>) {
chomp;
my %cpg;
@cpg{@cpgs} = split /\t/;
push @cpgids,\%cpg;
}
}
foreach my $cchrom (sort keys %chromes1){
my $count = scalar @{$chromes1{$cchrom}};
print OUT7 "$cchrom\t$count\t\n\n";
print OUT7 map {"\t\t$_->{middle}\t\n"} @{$chromes1{$cchrom}};
}
#my @set = $start; #callin $start from $start (file1)
#my @vals = $middle; #calling $middle from $middle (file2)
#my @vals_sorted = sort {$a <=> $b} @vals;
#foreach my $n (@set) {
# print "Closest to $n = ", shift @vals_sorted, "\n";
}
close IN10;
close IN11;
close OUT7;

I got caught up in a group meeting.
Here I'm trying to reuse column1 ($start) from file1 (IN10) and column2 ($middle) from file2 (IN11). I think I screwed up somewhere. :smiley:

Note: I'm giving sample outputs of the script so far; it does everything except calculate the closest point.

Output for file1 (IN10): chrom, count, start

chr1 10

	176716364	
	24737792	 	
	41368822	 	
	200251548	 	
	28707214	 	
	198709839	 	
	93419080	 	
	52270366	 	
	151846111	 	
	168277851	 	

chr10 7

	62870352	 	
	104039249	 	
	31255458	 	
	978073	
	6869853		
	17678914		
	4274540	

Output for file2 (IN11): chrom, count, middle

chr1 2463

	 19135.5	
	 125206.5	
	 317872.5	
	 427520.5	
	 439771.5	
	 523529.5	
	 535556.5	
	 704128.5	
	 752793.5	
	 778900	
chr2 1150

	 84913.5
	 109885.5
	 112361
	 171517
	 336666
	 358442.5

The output step needs an algorithm like the following, but it has a bug: it only uses file1. I don't know how to write a foreach loop that covers both file1 and file2.

foreach my $chrom (sort keys %chromes){

if ( $chrom =~ /^chr1/){

print closest of start and middle of chr1 in both files
This is the point where I have to insert the script you replied before

}
elsif { $chrome =~ (^/chr2/)
print ........................................chr2..... and so on
}
else(not recognized)
}
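Because both hashes are keyed by chromosome name, a single loop over the keys of %chromes can look up the matching list in %chromes1, with no if/elsif branch per chromosome. The sketch below is one possible shape for that loop. The sample positions are hypothetical, closest_pairs is an illustrative helper name, and the records are assumed to be hash refs with start and middle fields as in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data in the shape the script builds (chrom => list of records).
my %chromes = (
    chr1 => [ { start => 1500 } ],
    chr2 => [ { start => 1989 }, { start => 2600 } ],
);
my %chromes1 = (
    chr1 => [ { middle => 1339 }, { middle => 1600 } ],
    chr2 => [ { middle => 2000 }, { middle => 3000 }, { middle => 1200 }, { middle => 4000 } ],
);

# Return [start, nearest middle] pairs for one chromosome (illustrative helper).
sub closest_pairs {
    my ($starts, $middles) = @_;
    my @pairs;
    for my $rec (@$starts) {
        my ($best, $gap);    # undef until the first candidate is seen
        for my $cpg (@$middles) {
            my $d = abs($cpg->{middle} - $rec->{start});
            ($best, $gap) = ($cpg->{middle}, $d) if !defined $gap || $d < $gap;
        }
        push @pairs, [ $rec->{start}, $best ];
    }
    return @pairs;
}

foreach my $chrom (sort keys %chromes) {
    # The same key indexes both hashes.
    my $middles = $chromes1{$chrom};
    unless ($middles) {
        print "$chrom\tnot found in file2\n";
        next;
    }
    print join("\t", $chrom, @$_), "\n" for closest_pairs($chromes{$chrom}, $middles);
}
```

On a tie this version keeps the first candidate it meets; switching the comparison to <= would keep the last one instead.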

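#!/usr/bin/perl
use strict;
use warnings;

# Read comma-separated pairs: column 1 into @a1, column 2 into @a2.
open my $fh, '<', 'a.txt' or die "Cannot open a.txt: $!";
my (@a1, @a2);
while (<$fh>) {
	chomp;
	my @tmp = split /,/, $_;
	push @a1, $tmp[0];
	push @a2, $tmp[1];
}
close $fh;

# For each value in column 1, scan all of column 2 for the nearest value.
for my $i (0 .. $#a1) {
	my ($gap, $key);    # undef means "no candidate yet"
	for my $j (0 .. $#a2) {
		my $tmp = abs($a2[$j] - $a1[$i]);
		# On a tie the later value wins.
		if (!defined $gap || $tmp <= $gap) {
			$gap = $tmp;
			$key = $a2[$j];
		}
	}
	print $a1[$i], ",", $key, "\n";
}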

Awesome, dude!
I have no words for it; you just used the logic I was thinking of.

What if column1 has only a few values and column2 has more, still with the comma in front?
73.924598,40.879010
73.924506,40.878978
73.924506,40.878978
................,40.878178
................,50.878178
................,60.878578

Here is my answer for you, but as you subverted your Read Only status, which was a result of your persistently breaking the forum rules, you are banned.

# For each key in %file1,
#   1. split the key into name/start parts
#   2. search for the record in file2 that BOTH:
#     (a) the corresponding records have the same "name" field
#     (b) has the smallest difference between $start and $middleno
#         of any of the records
#   3. Print out both records in one line
#   4. Delete these records from the structures (so they cannot be matched again)
#
foreach my $key (sort keys %file1) {
  my ($name, $start) = split(":",$key);
  my $min = 2 ** 31;  # sentinel: larger than any expected difference
  my $smallest_key = undef;
  for (my $i=0; $i <= $#file2_middle_keys; ++$i) {
      my $current_key = $name .":". $file2_middle_keys[$i];

      # skip entries that do not match the $name in file2 (see (2a), above)
      next unless (exists $file2{ $current_key });

      # calculate difference and compare with the smallest so far (see (2b), above)
      my $diff = abs($file2_middle_keys[$i] - $start);
      if ($i > 0 && $diff > $min) {
        # stop -- difference is getting bigger. No need to proceed.
        # (this early exit is only valid if @file2_middle_keys is sorted numerically)
        last;
      }
      if ($min > $diff) {
        $min = $diff;
        $smallest_key = $current_key;
      }
  }
  # answer of (2b) is in $smallest_key
  if (defined $smallest_key) {
    # (3)
    print join(" ",
      @{ $file1{$key} }[1,2,3],
      @{ $file2{$smallest_key } }[3,1,2],
      @{ $file1{$key} }[0],
    )."\n";

    # (4)
    delete $file1{$key};
    delete $file2{$smallest_key };
  }
}
# - Licenced under AGPL (http://www.gnu.org/licenses/agpl.txt)
# - Author: Otheus [http://www.unix.com/members/302022384.html]

oops..... so long nogu0001