Hello,
I have a column of 7200 numbers and want to pick 1440 of them at random without any repetition. Could anyone tell me which Unix script would work for my case?
Regards
Sajjad
Here is a solution using awk.
You didn't say which column separator your file uses, so I'm assuming it's a comma.
You can change the -F parameter to match your actual separator. Also note that some POSIX awk implementations may fall over if your number of columns is too large:
awk -F',' '{ srand();
for(i=1;i<=1440;) {
v=$(int(1+rand()*7200))
if (!(v in N)) {
N[v]
printf "%s%s", v, (i++<1440?FS:RS)
}
}
}' random.csv
Thanks for the suggested command. Regarding your guess about the comma, I should mention that I have just one column with 7200 lines and there are no commas. Do I just need to remove the comma from the awk command?
FWIW - "random" means any number in the range of valid values can occur at any time. "Random with no duplicates" is not the same thing. Do not use this for encryption.
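To make the distinction concrete, here is a rough sketch (the function name draws_with_replacement is made up for illustration). It draws 1440 positions from 1..7200 with replacement and prints how many distinct positions came out. Anything below 1440 means some positions were drawn more than once, which is exactly what a no-duplicates pick has to avoid.

```shell
# Hypothetical helper: draw K positions from 1..N with replacement
# and print how many distinct positions were drawn.
draws_with_replacement() {  # usage: draws_with_replacement K N
    awk -v k="$1" -v n="$2" 'BEGIN {
        srand()
        for (i = 1; i <= k; i++) seen[int(1 + rand() * n)]
        for (x in seen) distinct++
        print distinct + 0
    }'
}

draws_with_replacement 1440 7200
```

The exact number changes from run to run because srand() seeds from the clock.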
Thanks jim
Do you have a command to suggest?
Yes, if you just have spaces (or tabs) between the values then remove the -F',' part.
I think my question was not very clear. I have a file with one column of 7200 lines and want to select 1440 lines (20% of the lines) at random, with no duplicate numbers among the 1440.
The command did not give me what I want.
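OK my mistake, try this:
awk '
{ V[NR]=$1 }                    # store every value by line number
END {
srand()
# needs at least 1440 distinct values, otherwise this loop never ends
for(i=1;i<=1440;) {
v=V[int(1+rand()*NR)]           # NR is the line count (7200 here)
if (!(v in N)) {                # skip values that were already printed
N[v]
print v
i++
}
}
}' random.txt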
or even:
sort random.txt | uniq | shuf | head -n 1440
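If shuf is not available and re-drawing on collisions is a worry, here is a sketch of a different technique (selection sampling, not the rejection loop above). The function name sample_distinct and its arguments are made up for illustration. It keeps each distinct value once, then walks the list and keeps each value with probability (still needed) / (still available), so it always stops and never prints a value twice.

```shell
# Hypothetical helper: print K distinct values from column 1 of FILE.
sample_distinct() {  # usage: sample_distinct K FILE
    awk -v k="$1" '
    # remember each value once, in input order
    !($1 in seen) { seen[$1]; U[++n] = $1 }
    END {
        srand()
        # cannot return more than n distinct values
        if (k > n) k = n
        # keep U[i] with probability (still needed) / (still available)
        for (i = 1; i <= n && k > 0; i++)
            if (rand() * (n - i + 1) < k) { print U[i]; k-- }
    }' "$2"
}

# usage: sample_distinct 1440 random.txt
```

The picked values come out in input order, so the result is a random subset rather than a random ordering. If the input has fewer than K distinct values, it prints all of them.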
Maybe this one:
awk ' {T[NR]=$1} END {srand(); for (i=1; i<=1440; i++) print T[int(1+rand()*NR)]}' file
Nice Rudic, but the requirement was to pick 1440 numbers randomly without any repetition.
Rats! Should have read the entire thread. Sorry for that...
If you just want 1440 random lines, try
shuf < inputfile | head -n 1440
Good point Corona688, I'd assumed the random file could contain duplicate entries (that being the nature of random data) and these were to be removed hence my sort and uniq code in post #8.
Your solution is optimal if sajmar doesn't require detection/avoidance of any duplicate values that may occur in the inputfile.
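Whether the sort and uniq step matters depends on whether the input actually contains repeated values. A quick way to check, as a sketch (count_repeated_values is a made-up helper name):

```shell
# Hypothetical helper: print how many distinct values occur
# more than once in FILE (one value per line).
count_repeated_values() {  # usage: count_repeated_values FILE
    sort "$1" | uniq -d | wc -l
}

# usage: count_repeated_values inputfile
```

If this prints 0, picking 1440 distinct lines and picking 1440 distinct values amount to the same thing.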
The shuf command will not work for me. Could anyone give me an awk command to randomly select 1440 out of the 7200 numbers, which are in one column?
jot -s" " 7200 1 7200 | ./rand.php
$ cat rand.php
#!/usr/bin/php
<?php
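$file = fopen( 'php://stdin', 'r' );
$numarr = explode( " ", trim( fgets( $file ) ) );
// array_rand() returns random keys, so map them back to the values
$keys = array_rand( $numarr, 1440 );
$nums = array();
foreach ( $keys as $k ) { $nums[] = $numarr[$k]; }
echo implode( " ", $nums ), "\n";
fclose( $file );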
?>
$ jot -s" " 7200 1 7200 | ./rand.php | wc -w
1440
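You can use PHP to generate the array, or seq if you don't have jot, but you said you already have the numbers, so just pipe them in: <my numbers> | ./rand.php
Note that the script reads a single space-separated line, so a file with one number per line has to be joined first, e.g. paste -s -d' ' myfile | ./rand.php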
In what way will it "not work"? Does it not work because you don't have it, or does it not work because it's not the result you want? Important difference.
Not POSIX, but some sort commands have a --random-sort parameter, so:
sort --random-sort inputfile | head -n 1440
If you really need an awk solution stick to post #8
An awk-based replacement for shuf:
awk 'BEGIN { srand() } { A[NR-1]=$0; E++ }
END { while(E>0) { N=int(rand()*E); print A[N]; A[N]=A[--E] } }' input > output
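For the original question only the first 1440 lines of the shuffle are needed, so the loop can stop early instead of piping everything through head. A sketch under that assumption (pick_lines is a made-up name; it works on lines, so repeated values in the input can still show up more than once):

```shell
# Hypothetical helper: print K lines picked at random from stdin,
# never using the same input line twice.
pick_lines() {  # usage: pick_lines K < input
    awk -v k="$1" 'BEGIN { srand() } { A[NR-1] = $0 }
    END {
        E = NR
        while (E > 0 && k-- > 0) {
            N = int(rand() * E)   # random slot among the E lines left
            print A[N]
            A[N] = A[--E]         # move the last remaining line into the used slot
        }
    }'
}

# usage: pick_lines 1440 < input > output
```

If the input has fewer than K lines, it prints every line once in random order.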