Today I needed to look through a lot of large backup files, so I wrote the following line to find them, sort them by size, and print each file's size in GB along with its filename. The result was odd: the output was as expected except for the first line, where the filename was heavily truncated. I thought the problem might be with that particular filename, so I reversed the sort order. Again the first filename was heavily truncated, but this time it was a different file, one that had been listed correctly before I changed the ordering. [Note: I used '**' as the field separator for awk; none of the filenames contain the string '**'.]
find . -type f -size +500M -printf '%s**%p\n' | sort -n | awk 'FS="**" {gb=$1/(2^30); printf("%f GB\t%s\n", gb, $2)}'
So I wrote the script below to create some directories and files of different lengths, and simplified the command line. Please have a look at what happens with the different find commands below. Can someone explain why the first output line always has its filename truncated? I can't work out why.
#!/bin/bash
mkdir "Test Dir 1"
echo "Test File 1 extra chars so diff file lengths" > "Test Dir 1/Test File 1"
mkdir "Test Dir 2"
echo "Test File 2 fewer extra chars" > "Test Dir 2/Test File 2"
mkdir "Test Dir 3"
echo "Test File 3 even fewer" > "Test Dir 3/Test File 3"
mkdir "Test Dir 4"
echo "Test File 4 a few" > "Test Dir 4/Test File 4"
# Test 1 - no piping - All OK:
$ find . -type f -printf '%s**%p\n'
23**./Test Dir 3/Test File 3
30**./Test Dir 2/Test File 2
18**./Test Dir 4/Test File 4
45**./Test Dir 1/Test File 1
# Test 2 - pipe to sort - All OK:
$ find . -type f -printf '%s**%p\n' | sort -n
18**./Test Dir 4/Test File 4
23**./Test Dir 3/Test File 3
30**./Test Dir 2/Test File 2
45**./Test Dir 1/Test File 1
# Test 3 - pipe to awk - First line filename truncated:
$ find . -type f -printf '%s**%p\n' | awk 'FS="**" {printf("%d \t%s\n", $1, $2)}'
23 Dir
30 ./Test Dir 2/Test File 2
18 ./Test Dir 4/Test File 4
45 ./Test Dir 1/Test File 1
# Test 4 - pipe to sort, then to awk - First line filename truncated:
$ find . -type f -printf '%s**%p\n' | sort -n | awk 'FS="**" {printf("%d \t%s\n", $1, $2)}'
18 Dir
23 ./Test Dir 3/Test File 3
30 ./Test Dir 2/Test File 2
45 ./Test Dir 1/Test File 1
# Test 5 - pipe to reverse sort, then to awk - First line filename truncated:
$ find . -type f -printf '%s**%p\n' | sort -nr | awk 'FS="**" {printf("%d \t%s\n", $1, $2)}'
45 Dir
30 ./Test Dir 2/Test File 2
23 ./Test Dir 3/Test File 3
18 ./Test Dir 4/Test File 4
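For comparison, here is a sketch of the same awk step with the separator set before any input is read, either in a BEGIN block or with -F. It rests on an assumption rather than a confirmed diagnosis: awk splits a record into fields as it reads it, so an FS assigned in the pattern position may only take effect from the second record onward. The function names sample, with_begin and with_flag are made up for this illustration, and the separator is written as [*][*] instead of "**" because a multi-character FS is treated as a regular expression.

```shell
#!/bin/bash
# Hypothetical helper: emits two lines in the same "size**path" shape as the
# find output above, so the demo runs without creating any files.
sample() {
  printf '%s\n' '18**./Test Dir 4/Test File 4' '23**./Test Dir 3/Test File 3'
}

# Variant 1: assign FS in BEGIN, which runs before the first record is read.
# "[*][*]" is a regular expression matching two literal asterisks.
with_begin() {
  sample | awk 'BEGIN { FS = "[*][*]" } { printf("%d \t%s\n", $1, $2) }'
}

# Variant 2: pass the same separator on the command line with -F.
with_flag() {
  sample | awk -F'[*][*]' '{ printf("%d \t%s\n", $1, $2) }'
}

with_begin
with_flag
```

In both variants every line, including the first, should be split on the two asterisks, so the full path ends up in $2.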
Thanks all.