Hi,
I have a data file with millions of records (N). Each record is stored on 4 lines, so the data file has N x 4 lines in total.
For example:
Host1
a
b
c
d
Host2
e
f
g
h
Host3
i
j
k
l
I would like to write a Perl script that extracts 1000 random records WITHOUT repetition (sampling without replacement), so the output file will contain 4000 lines in total.
Could you help me with this?
Thanks,
Phoebe
What have you done so far toward writing a Perl script for this? What problems are you having?
I have the code below, which randomly selects a given number of records from a file, but only when each record is a single line.
I'm thinking of modifying it so that if the random number selected is 6, meaning record 6 is picked, it retrieves lines (5*4)+1 = 21 through 24.
This is my first time writing a Perl script. Please help.
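#!/usr/bin/perl
use strict;
use warnings;

die "Usage: $0 <N>, where N is the number of lines to pick\n"
    if @ARGV < 1;
my $N = shift @ARGV;
my @pick = ();
while (<>) {
    if (@pick < $N) {
        # Fill the reservoir with the first N lines, shuffling as we go.
        push @pick, $_;
        my ($r1, $r2) = (rand(@pick), rand(@pick));
        ($pick[$r1], $pick[$r2]) = ($pick[$r2], $pick[$r1]);
    } else {
        # Keep line number $. with probability N/$., overwriting a random slot.
        rand($.) < $N and $pick[rand(@pick)] = $_;
    }
}
print @pick;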
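
One way to extend this idea from single lines to whole records is to group the input into fixed-size chunks first and run the same reservoir step on the chunks, so the file is read only once and no record can be picked twice. The following is only a sketch under assumptions: the helper name sample_records and the script name pick_records.pl are made up for illustration, and the record size is taken to be exactly 4 consecutive lines as described above (if the host line is an additional fifth line, pass 5 instead).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pick $n records of $size lines each from filehandle $fh,
# without replacement, using reservoir sampling over whole records.
sub sample_records {
    my ($fh, $n, $size) = @_;
    my @pick;        # reservoir holding up to $n complete records
    my @buf;         # lines of the record currently being assembled
    my $seen = 0;    # number of complete records read so far
    while (my $line = <$fh>) {
        push @buf, $line;
        next if @buf < $size;          # record not complete yet
        my $record = join '', @buf;
        @buf = ();
        $seen++;
        if (@pick < $n) {
            push @pick, $record;       # the first $n records fill the reservoir
        } elsif (rand($seen) < $n) {
            # Keep record number $seen with probability $n/$seen.
            $pick[int rand @pick] = $record;
        }
    }
    # A trailing partial record (fewer than $size lines) is ignored.
    return @pick;
}

# Command-line use (assumed): perl pick_records.pl 1000 data.txt > sample.txt
if (@ARGV) {
    my $n = shift @ARGV;
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    print sample_records($fh, $n, 4);
}
```

Every record has the same chance of ending up in the output, but the output order is not shuffled, and if the file holds fewer than N records all of them are returned.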