Random line selection from a file.

McLan · June 10, 2008, 5:35am

>cat data.dat
0001 Robbert
0002 Nick
0003 Mark
.......
1000 Jarek

Perderabo · June 10, 2008, 6:00am

Prepend a random number to each line, sort the file, take the first few lines, and remove the leading random number.

 awk 'BEGIN {srand()} {printf "%05.0f %s\n", rand()*99999, $0}' datafile | sort -n | head -n 100 | sed 's/^[0-9]* //'
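To try the approach without touching real data, here is a minimal sketch that wraps the same pipeline in a function and runs it on a generated file. The function name `pick_random_lines`, the temporary file, and the `nameNNNN` contents are made up for illustration; only the awk | sort | head | sed chain comes from the post above.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print N randomly chosen lines.
# Usage: pick_random_lines FILE N
pick_random_lines() {
    # 1. prepend a random 5-digit key, 2. sort on it, 3. keep N lines, 4. strip the key
    awk 'BEGIN { srand() } { printf "%05.0f %s\n", rand() * 99999, $0 }' "$1" |
        sort -n |
        head -n "$2" |
        sed 's/^[0-9]* //'
}

# Stand-in for data.dat: 1000 lines of "NNNN nameNNNN" (made-up contents).
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) printf "%04d name%04d\n", i, i }' > "$sample"

pick_random_lines "$sample" 100 > "$sample.picked"
wc -l < "$sample.picked"      # how many lines were selected
head -n 5 "$sample.picked"    # a peek at the selection
```

One thing to keep in mind: srand() with no argument seeds from the time of day, so two runs within the same second will most likely produce the same selection.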

McLan · June 10, 2008, 7:02am

I'm afraid this might generate duplicate random numbers in the range from 1 to 100. If so, how can you avoid that?
If it does generate duplicate numbers, then the same record will be picked more than once.

Cheers,
McLan

Perderabo · June 10, 2008, 9:10am

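Why don't you try it? The range of rand()*99999 is not 1-100; it runs from 0 to 99999. Duplicate random numbers are possible, but duplicate lines selected from your file are not: awk writes each input line exactly once, and the random number only decides where that line lands in the sorted output.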
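One way to check this for yourself is to force collisions on purpose. The sketch below is an illustration only: it shrinks the key range to 0..3 (an arbitrary choice) so that many lines share the same random number, then counts repeated lines in the selection. The input file and its contents are made up, and the check assumes the input lines are distinct to begin with.

```shell
#!/bin/sh
# Made-up input: 1000 distinct lines standing in for data.dat.
input=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) printf "%04d name%04d\n", i, i }' > "$input"

# Same pipeline, but keys are squeezed into 0..3 so collisions are guaranteed.
awk 'BEGIN { srand() } { printf "%05.0f %s\n", rand() * 3, $0 }' "$input" |
    sort -n | head -n 100 | sed 's/^[0-9]* //' > "$input.picked"

# Count lines that occur more than once in the selection.
dups=$(sort "$input.picked" | uniq -d | wc -l)
echo "selected: $(wc -l < "$input.picked") repeated: $dups"
```

Colliding keys do have a side effect, though: sort breaks ties using the rest of the line, so lines that share a key come out in file order rather than random order. With 1000 lines and keys up to 99999 that happens rarely; a wider key range would make it rarer still.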