How to Pick Random records from a large file

Hi,

I have a huge file, say with 2,000,000 records. Each record has 42 fields. I would like to randomly pick 1000 records from this file. Can anyone help me with how to do this?

If you choose a random starting point and then select one record at every fixed interval m, you get a systematic sample. For a reasonable sample size (and 1000 is more than enough), this is a statistically valid stand-in for a simple random sample of the population.

Since you have 2,000,000 records, start somewhere between record 1 and record 2000, then step forward by 2000 records 1000 times.

awk -v start=$RANDOM '
BEGIN { start = start % 2000; start++ }      # random offset in 1..2000
FNR == start { print; start += 2000 }        # print every 2000th record after that
' inputfile > newfile
</awk rewrite note: $RANDOM ranges 0..32767, so the modulus gives a usable offset.>
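If a truly random (rather than systematic) sample is acceptable and GNU coreutils is available, shuf can do this in one step; it samples lines without replacement, so you get 1000 distinct records:

```shell
# Randomly select 1000 lines from inputfile (GNU coreutils shuf).
shuf -n 1000 inputfile > newfile
```

Unlike the systematic approach above, shuf gives every record an equal chance regardless of its position, though the output lines are not in their original file order (pipe through sort if that matters).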