Hi everyone,
I have a file, file1.txt, which contains line numbers:
aa bb cc "12" qw
xx yy zz "23" we
bb qw we "123249" jh
Here 12, 23 and 123249 are the line numbers.
According to these line numbers, we have to print the corresponding lines from another file named file2.txt.
file2.txt consists of more than 600,000 (6 lakh) records.
I have written this code:
while read line; do
x=`echo $line|awk '{print $4}'`
m=`echo "${x}"|sed 's/"//g'`
awk '{if(NR=='$m') {print $0>>"desriredfile"}}' file2.txt
done<file1.txt
This works, but it is slow, because it reads the whole of file2.txt again for each line number.
Can the code be written so that it reads file2.txt only once?
Thanks for the help.
Here is an example of how to do that in Perl; please tweak it according to your needs.
#! /opt/third-party/bin/perl
# map line number and contents in the file
my %fileHash = ();
my $lfh;
open($lfh, "<", "file_1") or die "Unable to open file : file_1 <$!>\n";
while ( <$lfh> ) {
chomp;
$fileHash{$.} = $_;
}
close($lfh);
# open file that contains line numbers for which data needs to be extracted from the other file
open($lfh, "<", "file_2") or die "Unable to open file : file_2 <$!>\n";
while( <$lfh> ) {
chomp;
print "Here is the information " , $fileHash{$_} , "\n";
}
close($lfh);
exit(0)
Well, I need a shell script for this; I have no idea about Perl. How much time will it take in Perl? My script takes more than one day to complete. It's high time for me now; I need to find some idea.
Just give it a try and you can find out how long it takes to run.
Maybe try benchmarking with 100K records: comparing your shell script method with the above Perl version will help you estimate the overall time this could take (but again, it's an approximation).
and it will save you a few hours by avoiding 2 system calls per $line
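On the 2 system calls per $line: the echo|awk and echo|sed pipelines in the loop start extra processes for every line of file1.txt. Here is a minimal sketch, assuming the line number is always the first double-quoted value on the line, that lets the shell's own read do the splitting. The function name quoted_numbers is made up for illustration.

```shell
# Hypothetical helper: print the first double-quoted value from each line of $1.
# Setting IFS to a double quote makes read split each line on the quotes,
# so no awk or sed process is started per line.
quoted_numbers() {
    while IFS='"' read -r before num rest; do
        printf '%s\n' "$num"
    done < "$1"
}
```

Calling quoted_numbers file1.txt on the sample above should list 12, 23 and 123249, one per line. It only removes the per-line process overhead; it does not stop file2.txt from being re-read.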
Post some sample data using [code] tags and maybe we can speed up your script. The output of wc for each file would be useful too.
But can't something be done inside awk, so that it reads the bigger file once and gives the desired output?
Here it reads the file again and again, once for each line number.
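Yes, this can be done in a single awk call that reads each file once. Here is a minimal sketch, assuming the line number is always the first double-quoted value on each line of file1.txt (as in the sample) and that file1.txt is not empty. The function name extract_lines is made up for illustration.

```shell
# Hypothetical helper: print the lines of $2 whose line numbers appear
# as the first double-quoted value on each line of $1.
extract_lines() {
    awk -F'"' '
        # First file (NR == FNR): remember each quoted number as an array key.
        NR == FNR { want[$2 + 0]; next }
        # Second file: print the line if its line number was remembered.
        FNR in want
    ' "$1" "$2"
}
```

Run it as: extract_lines file1.txt file2.txt > desiredfile

Two differences from the loop version: the output comes out in file2.txt order instead of file1.txt order, and a line number that is repeated in file1.txt is printed only once. Only the line numbers are kept in memory, not the 600,000 records.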