Due to budget constraints I have to reinvent an enterprise backup system for a SPARC (sun4v) Solaris estate (10 & 11). (Yep, reinventing the wheel: fun but time-consuming. Is this wise?!)
For each filesystem of interest I want a 'catalog' at the front of the cpio archive (for an easy scripted restore system that I will write later). I touch a file .${DIR}/.${OFILE}.fullscan that will hold a full filesystem scan, then run this to populate the first 'record' in the catalog file, so that the catalog file always sits at the head of the archive:
find .${DIR}/.${OFILE}.* -type f -mtime -1 -ls > .${DIR}/.${OFILE}.fullscan
This populates our file list:
find .${DIR} -xdev -local -ls >> .${DIR}/.${OFILE}.fullscan
This then runs the actual cpio operation:
awk '{$1=$2=$3=$4=$5=$6=$7=$8=$9=$10=""; print $0}' .${DIR}/.${OFILE}.fullscan|\
cut -c11- |cpio -oc 2>>/dev/null|gzip -qc1 ->${OUTFILE}
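One way to take the column stripping out of the picture is to feed cpio bare path names, with the catalog name written first. The following is only a sketch under stated assumptions: emit_names and its two arguments are hypothetical names, it uses find's -print rather than -ls, it assumes no path contains a newline, and it skips any file that shares the catalog's base name. The catalog file itself can still be filled from find -ls as before. As a side note, reassigning awk fields rebuilds the line with single spaces, so a name containing a run of spaces would not survive the awk step above unchanged.

```shell
#!/bin/sh
# Hypothetical helper: print the catalog path first, then every other path
# under the directory, one per line, ready to pipe into `cpio -oc`.
#   $1 = directory to scan      (assumed argument, not from the post)
#   $2 = path of the catalog    (assumed argument, not from the post)
emit_names() {
    dir=$1
    catalog=$2
    # The catalog goes out first, so it is the first name cpio reads.
    printf '%s\n' "$catalog"
    # -print yields bare path names, so no column stripping is needed.
    # The ! -name test keeps the catalog from being listed a second time.
    # On Solaris, -local could be added here as in the original scan.
    find "$dir" -xdev ! -name "$(basename "$catalog")" -print
}
```

It could then replace the awk and cut stages, for example: emit_names ".${DIR}" ".${DIR}/.${OFILE}.fullscan" | cpio -oc | gzip -qc1 > "${OUTFILE}"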
Nine times out of ten my ${OFILE}.fullscan appears among the first few files in the cpio archive. Occasionally it's a few files 'lower down', but always in the first 20. So far so good.
Today, on a Solaris 11.2 system I found the fullscan file over 1000 files into one of the cpio archives and another one over 6400 files into the archive. (In another they appeared at the end but I'm still checking that's not a 'code' issue!) Why?! Help!
I checked the text file content to make sure the top record was as expected for the examples that put the fullscan file much farther down the archive.
My worst-case scenario is a filesystem holding 57M files (yep, a source code repo), which generates a 9.5GB 'fullscan' file (taking over 11 hours). Due to other bits and pieces I need to do, I really, really don't want my catalogs appearing halfway through that one. (This supersized filesystem backup will inevitably be broken up into smaller tasks, but for now I'm just asking.)
Is this a multithreading effect? I don't believe awk or cut would reorder the list, so cpio should receive ${OFILE}.fullscan as the first name on its standard input. The file in question is being read at the time, but that wouldn't generate an exclusive lock or otherwise block access from another reading process. Note: for what it's worth, these two anomalies occurred in /var of a guest Solaris zone that's visible from the global zone.
Can anyone think of a way to:
- Debug this easily to find an explanation, assuming there isn't a trivial answer? (Bear in mind it's intermittent.)
- Work around this? I want the catalog to always be quickly and easily accessible from (the top of?) many 100GB+ gzipped cpio archives! How can we persuade cpio to load files into the archive in exactly the order its file list is fed to it?
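For the debugging side, a small position check might help catch the intermittent cases. This is a minimal sketch, assuming a POSIX shell (ksh or bash, not the legacy Solaris 10 /bin/sh) and names without embedded newlines; name_position is a hypothetical helper. It reads names on standard input and prints the 1-based position of the first exact match, so the same check can be run against both the name list fed to cpio and the table of contents read back from the archive.

```shell
#!/bin/sh
# Hypothetical helper: report where a name first appears in a list of names.
#   stdin = one name per line
#   $1    = the name to look for (exact match)
# Prints the 1-based line number and returns 0, or returns 1 if not found.
name_position() {
    want=$1
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        if [ "$line" = "$want" ]; then
            echo "$n"
            # Returning here stops reading; an upstream producer in the
            # pipeline will normally terminate when its pipe closes.
            return 0
        fi
    done
    return 1
}
```

For example, gzip -dc "${OUTFILE}" | cpio -it | name_position ".${DIR}/.${OFILE}.fullscan" gives the catalog's position in the archive. Running the same function on the output of the awk and cut stages gives its position in the list cpio was fed, and if the two numbers differ, the reordering happened inside cpio rather than upstream of it.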
Your thoughts much appreciated.
Alex