Process files except the last n files + help sort files numerically

newbie_01 · June 30, 2024, 2:30am

Hi,

OS version is below.

$: uname -a
SunOS [hostname] 5.8 Generic_Virtual sun4v sparc sun4v

Unfortunately, this is a really old version of Solaris. I have no option to upgrade; I am just a 'normal' user of the server.

Anyway, I need to process a set of files, for example, compress, gzip, rm etc. I don't need to process all of them, only all but the last n files.

For example, in my script below, n is 4 and I want to rm all files except the last 4, with the filenames sorted numerically on the assumption that this orders them oldest to newest.

Is there a one-liner to do an ls that excludes the last n files, similar to displaying all but the last 20 lines of a log?

Script so far is below:

$: cat x.bash
#!/bin/bash

n=$( ls -1 *log | wc -l | awk '{ print $1 }' )
keep=4
let ntodo=$n-$keep
#echo $n
TODO="/bin/rm"
if [[ ${ntodo} -gt 0 ]] ; then
  for log in $( ls -1 *log | sort -ta -k1.2n | head -n ${ntodo} )
  do
    echo "ntodo=$ntodo | ${TODO} $log"
  done
else
  echo "NOTHING TO ${TODO} .. ntodo=$ntodo"
fi

## ls below for sanity
echo
echo "==="
echo
ls -1 *log | sort -ta -k1.2n

Script run gives output below:

./x.bash
ntodo=6 | /bin/rm a1.log
ntodo=6 | /bin/rm a2.log
ntodo=6 | /bin/rm a3.log
ntodo=6 | /bin/rm a4.log
ntodo=6 | /bin/rm a5.log
ntodo=6 | /bin/rm a6.log

===

a1.log
a2.log
a3.log
a4.log
a5.log
a6.log
a7.log
a8.log
a9.log
a10.log

Script run output is as expected. I just want to know if there is a better way to do it or this is it.

Problem here is the ls part

ls -1 *log | sort -ta -k1.2n

This is correct only while the numbers have no more than 2 digits; it will no longer be correct once they grow to 3, 4, 5 digits, etc. So what do I do so that the sort always gives the right result?

Several ls commands give the output below. I can't find ls options that give the correctly sorted output, hence I've used ls | sort in the script.

$: ls -1 a*log
a1.log
a10.log
a2.log
a3.log
a4.log
a5.log
a6.log
a7.log
a8.log
a9.log

$: ls -1r a*log
a9.log
a8.log
a7.log
a6.log
a5.log
a4.log
a3.log
a2.log
a10.log
a1.log

$: ls -1t a*log
a1.log
a10.log
a2.log
a3.log
a4.log
a5.log
a6.log
a7.log
a8.log
a9.log

Any guidance will be much appreciated.

BTW, the test files were created using

touch a1.log a2.log a3.log a4.log a5.log a6.log a7.log a8.log a9.log a10.log

Presumably ls with -t should give the right order, but it doesn't. Or is that a wrong assumption?

MadeInGermany · June 30, 2024, 6:50am

Solaris 8 is really old. AFAIR it ships with bash version 2.

Surprisingly, the touch command seems to mangle the given order. I could reproduce it on a Linux VM: the time order was unpredictable.
Or is it the file system driver?

Create them in a loop, and the time order should be enforced. Using bash brace expansion:

for a in a{1..10}.log; do touch "$a"; done
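As far as I know, the {1..10} sequence form of brace expansion only arrived in bash 3.0, so on a bash 2 system a counting loop may be needed instead. A minimal sketch, where make_logs and its two parameters are hypothetical names chosen for illustration:

```shell
# make_logs DIR COUNT: create DIR/a1.log .. DIR/aCOUNT.log, one touch per file.
# Hypothetical helper; uses only a while loop and POSIX arithmetic expansion.
make_logs() {
  i=1
  while [ "$i" -le "$2" ]
  do
    touch "$1/a$i.log"
    i=$((i+1))
  done
}

# Example: make_logs . 10
```

Each file still gets whatever timestamp resolution the file system offers, so files created within the same tick can end up with equal times.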

If you must sort, do not limit the end of the field. The -ta seems to create an empty field 1 (before the a), so field 2 starts with the number:

ls -1 a*.log | sort -ta -k2n
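To check that this key does not depend on the number of digits, the same sort can be fed names with 1 to 4 digits. A small sketch; sort_by_serial is a hypothetical wrapper name, and it assumes the only a in each name is the leading one:

```shell
# Hypothetical wrapper around the sort shown above:
# -ta splits each name at the letter a, -k2n compares field 2 numerically.
sort_by_serial() {
  sort -ta -k2n
}

printf '%s\n' a10.log a1.log a1000.log a2.log a100.log | sort_by_serial
```

The names come out as a1.log, a2.log, a10.log, a100.log, a1000.log.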

Here is a short one with ls -t:

keep=4 todo=/bin/rm
ls -t a*.log |
while IFS= read -r fn
do
  [ $((keep-=1)) -lt 0 ] && echo "$todo" "$fn"
done

let is a bash-specific builtin and not part of POSIX. Please use POSIX arithmetic expansion instead:
ntodo=$((n-keep))

Paul_Pedant · June 30, 2024, 8:04am

touch can give multiple files the same timestamp in Linux, even to the nearest nanosecond. I created three in one command, and I got 2 identical to the nanosecond, and a third about a millisecond later.

I suspect SunOS 5.8 only keeps times to the nearest second anyway, in which case they would generally all be identical.

If your application does generally create files sequentially and not simultaneously, you can fake that using a loop through the names with a sleep between touches.

If you have an inconveniently large number of test files, you can loop through them and give each one a different, calculated touch -t time.
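A rough sketch of that idea, assuming a touch that accepts the [[CC]YY]MMDDhhmm form of -t. The name stamp_in_order and the base time of 2024-06-30 12:00 are arbitrary choices for illustration, and the minute counter limits this version to 60 files:

```shell
# stamp_in_order FILE...: give each file an mtime one minute later than
# the previous one, in the order the names are passed.
# Hypothetical helper; handles at most 60 files (minutes 00 to 59).
stamp_in_order() {
  i=0
  for f in "$@"
  do
    touch -t "$(printf '2024063012%02d' "$i")" "$f"
    i=$((i+1))
  done
}

# Example: stamp_in_order a1.log a2.log a3.log
```

After this, ls -t lists the files in the reverse of the order they were passed, newest first.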

DrScriptt · June 30, 2024, 4:06pm

Part of my career is -- what I call -- taking care of old systems, including Solaris, that are out to pasture but still doing what they were installed to do with effectively no problem. My current day job pays me to ride herd over a fleet of Solaris 10 systems.

Thank you for confirming that the files will be sorted in the order that you want. That can be a dangerous assumption if it's not confirmed, particularly when parsing a list of files as textual data.

See if Solaris 8 has the tac (cat in reverse order) command. If it's there, you could do something like this:

$ ls -1 *log | tac | sed '1,10d' | tac

That will take the list: tac will flip it top to bottom, sed '1,10d' will delete the first 10 lines of the reversed input (the last 10 lines of the normal input), and then tac will flip it top to bottom again.

N.B.

I don't know if Solaris 8 has tac
I'm assuming that sed on Solaris 8 will take the ranged delete like that

You may need to fiddle with things, but hopefully that helps.

--
Grant. . . .

MadeInGermany · June 30, 2024, 7:33pm

No tac in Solaris 10 and older.
But Solaris 10 has tail -r, so maybe ...
But 3 external programs for processing!
I have a sed-only solution in the neighbor thread
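For comparison, here is a single-process sketch that drops the last n lines with awk instead of sed. This is not the sed solution mentioned above. all_but_last is a hypothetical name, n must be at least 1, and on Solaris the awk would probably have to be nawk or /usr/xpg4/bin/awk, since the old /usr/bin/awk does not take -v:

```shell
# all_but_last N: copy stdin to stdout, leaving out the last N lines (N >= 1).
# Hypothetical helper; keeps a ring buffer of the N most recent lines.
all_but_last() {
  awk -v n="$1" '
    NR > n { print buf[NR % n] }   # this slot holds line NR-n, which is safe to print
    { buf[NR % n] = $0 }           # then store the current line in the same slot
  '
}

# Example: ls -1 a*.log | sort -ta -k2n | all_but_last 4
```

If the input has n lines or fewer, nothing is printed.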

Paul_Pedant · July 1, 2024, 7:28am

Sorting based on a serial number in the filename seems irrelevant.

"I need to process a set of files, for example, compress, gzip, rm etc."

You will need to do that based on a variety of filename formats. If they come from different sources and activities, you won't be able to serialise those filenames explicitly. So fixing the sort order by sorting on a8, a9, a10 ... will never be correct.

"Last" presumably means "in order of modification date" (assuming nobody is messing around with touch). So you need to stat the files to get the appropriate ordering, for example with GNU stat: stat --printf='%y\t%n\n' myPath. As the timestamp is fixed-format and sorts chronologically as plain text, a simple sort will work.

If you don't care what order the shortened list is processed in, then you can always just cut off the first four lines of filenames with tail -n +5. To be left with the oldest files, sort with sort -r first (newest on top, so the four newest are cut off); to be left with the newest, sort in the default order.
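Put together, the stat, sort and tail steps might look like the sketch below. It assumes GNU coreutils stat (an old Solaris box may not have it) and filenames without tabs or newlines; skip_newest is a hypothetical name:

```shell
# skip_newest KEEP FILE...: print the given files, newest first,
# leaving out the KEEP most recently modified ones.
# Hypothetical helper; requires GNU stat for --printf.
skip_newest() {
  keep=$1
  shift
  stat --printf='%y\t%n\n' "$@" |   # modification time, tab, filename
    sort -r |                       # newest first
    tail -n +"$((keep+1))" |        # drop the first KEEP lines
    cut -f2-                        # strip the timestamp column
}

# Example: skip_newest 4 *.log
```

The remaining names can then be passed to rm, gzip and so on, for instance through a while read loop.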

newbie_01 · July 3, 2024, 12:33pm

Thanks for your input. Will start getting rid of let.