I have the following task: there is a directory with approximately 2,000 files. I have to write a script that randomly extracts 200 files on the first run. On the second run it should extract another 200 files, but those files must not overlap with the ones extracted during the first run. So I have to remember the names (or perhaps the inodes) of the files already extracted. What do you think is the best way to do that? So far my plan is to create a new file containing a list of the inodes of the already extracted files. On subsequent runs of my script I'll then check whether the inodes of the randomly chosen files are already present in that list. What do you think about this approach? Are there other, perhaps more elegant, ways to remember (or mark) which files have already been extracted?
You don't need to remember file names.
You only need a count of how many files you have already processed.
Try this:
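#-- Move away from the original files
cd /dum/dumma/here/ || exit 1
#-- counter.txt holds the line number of the first file not yet processed
if [ ! -r counter.txt ] ; then
    echo "1" > counter.txt
fi
typeset -i from=$(<counter.txt)
typeset -i till=$((from + 199))
#-- If you want, you can merge this line with "| sed"
#-- But keeping the list in a file has its own advantages
#-- Note: ls lists the current directory, so counter.txt and dummyy.txt show up in the list too
ls -1 > dummyy.txt
#-- Pass lines $from..$till (200 names) to the processing script
sed -n "${from},${till}p" dummyy.txt | do_some_thing.sh
#-- Remember where the next run has to start
echo $((till + 1)) > counter.txt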
I'd create a numbered list of the files given (see code below), then use those numbers for random selection, and finally remove the list entries that correspond to the files extracted ...
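The idea described above could be sketched roughly as follows. This is not the code from the original reply, only one possible reading of it: pick_batch, its three arguments, and the ".picked" / ".tmp" helper files are hypothetical names chosen for illustration. It assumes that file names contain no newlines and that the list file is kept outside the directory being listed. Instead of explicit line numbers it tags each name with a random key and sorts on that key, which amounts to the same random selection.

```shell
# Hypothetical helper: print up to $3 randomly chosen names from directory $1.
# The names still available are remembered in the list file $2
# (assumption: $2 lives outside of $1, so it never lists itself).
pick_batch() {
    src_dir=$1
    list=$2
    batch=$3

    # First run only: record every file name once.
    [ -f "$list" ] || ls -1 "$src_dir" > "$list"

    # Tag each remaining name with a random key, sort by that key,
    # and keep the first $batch names.
    awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' "$list" |
        sort | cut -f2- | head -n "$batch" > "$list.picked"

    # Remove the picked names from the list so later runs cannot return them.
    grep -v -x -F -f "$list.picked" "$list" > "$list.tmp"
    mv "$list.tmp" "$list"

    cat "$list.picked"
}
```

Each call such as pick_batch /path/to/files /path/to/state/remaining.txt 200 (both paths are placeholders) prints a fresh batch of names; once the list is empty, it prints nothing.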
Thank you very much for your ideas. They were very useful to me. Indeed, creating a randomized list of files once and then working from it is much more efficient than randomizing the files every time my script runs. Thanks.
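For anyone reading later, here is a minimal sketch of that conclusion: shuffle the file list once, then hand out consecutive slices of it using a counter, as in the first reply. The names next_batch, shuffled.txt, and the state directory argument are hypothetical, not taken from the thread, and the sketch assumes file names without newlines and a state directory that is separate from the directory holding the files.

```shell
# Hypothetical helper: print the next $3 names from a list that was
# shuffled once. State (the shuffled list and the counter) is kept in $2.
next_batch() {
    src_dir=$1
    state_dir=$2
    batch=$3
    list=$state_dir/shuffled.txt
    counter=$state_dir/counter.txt

    # First run only: build the randomized list and start counting at line 1.
    if [ ! -f "$list" ]; then
        ls -1 "$src_dir" |
            awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' |
            sort | cut -f2- > "$list"
        echo 1 > "$counter"
    fi

    # Print lines from..till of the shuffled list, then advance the counter.
    from=$(cat "$counter")
    till=$((from + batch - 1))
    sed -n "${from},${till}p" "$list"
    echo $((till + 1)) > "$counter"
}
```

Because the list is randomized only once, every later run is just a sed slice plus a counter update, and two runs can never return the same file.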