Xargs and rsync not sending all files

Hi all,

I have a script issue I can't seem to work out.

In a directory I have several folders, and I want to send just the sapdata1 to sapdata14 folders and their contents, but not sapdataXX/.snapshot.

The script is:

#!/bin/bash

# SETUP OPTIONS
export SRCDIR="/scratch/doug/test/sapdata*"
export FSRCDIR="/scratch/doug/test"
export DESTDIR="fred@111.222.333.444:/sendtest/TST"
export THREADS="40"

echo "Starting... "
date

# RSYNC DIRECTORY STRUCTURE
rsync -zvr -f"- *sapdata*/.snapshot" -f"+ */" -f"- *" ${SRCDIR} ${DESTDIR}

# FIND ALL FILES AND PASS THEM TO MULTIPLE RSYNC PROCESSES
cd "$FSRCDIR" && find sapdata* -path "sapdata*/.snapshot" -prune -o ! -type d -print0 | xargs -0 -n1 -P${THREADS} -I% rsync -arvh --partial --size-only % ${DESTDIR}/%

echo "Complete"
date
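As a side note, the find/-prune part can be sanity-checked on its own, without rsync or the real data. Here's a minimal sketch using a throwaway tree in a temp directory (the file names are made up for illustration):

```shell
# Build a throwaway tree that mimics the layout, then run the same
# find expression; only the file outside .snapshot should be listed.
tmp=$(mktemp -d)
mkdir -p "$tmp/sapdata1/.snapshot" "$tmp/sapdata1/dir1"
touch "$tmp/sapdata1/.snapshot/hidden" "$tmp/sapdata1/dir1/keep"
cd "$tmp"
find sapdata* -path "sapdata*/.snapshot" -prune -o ! -type d -print
# lists sapdata1/dir1/keep but not the .snapshot contents
```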

The first rsync seems to work, I can see all the folders at the destination.

The find works and finds just over 600 files. However, the xargs command only seems to send 40 files and then stops. I thought -P would keep 40 processes running at a time until all the files were sent. I'm probably missing something obvious, but I just can't figure it out.
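For reference, -P on its own only caps how many processes run at once, not the total number of jobs, which a quick stand-in test shows (echo standing in for rsync, item names made up):

```shell
# Eight null-delimited items through xargs -0 -n1 -P4: at most four
# workers run concurrently, but all eight items still get processed.
count=$(printf '%s\0' 1 2 3 4 5 6 7 8 \
  | xargs -0 -n1 -P4 -I% sh -c 'echo "sent %"' \
  | wc -l)
echo "items processed: $count"
```

So if only 40 of 600 files arrive, the bottleneck is likely something outside xargs itself.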

I should also mention that I'm sending about 15 TB worth of large files, hence the parallel-send script.

Any help appreciated.
Thanks

This seems to be working OK for me. I changed the script as follows so I could test it on local files:

#!/bin/bash
  
# SETUP OPTIONS
export SRCDIR="./test/sapdata*"
export FSRCDIR="./scratch/doug/test"
export DESTDIR="TST"
export THREADS="40"

echo "Starting... "
date

# RSYNC DIRECTORY STRUCTURE
rsync -zvr -f"- *sapdata*/.snapshot" -f"+ */" -f"- *" ${SRCDIR} ${DESTDIR}

# FIND ALL FILES AND PASS THEM TO MULTIPLE RSYNC PROCESSES
cd "$FSRCDIR" && find sapdata* -path "sapdata*/.snapshot" -prune -o ! -type d -print0 | xargs -0 -n1 -P${THREADS} -I% rsync -arqh --partial --size-only % ../../../${DESTDIR}/%

echo "Complete"
date

And tested it like this:

$  mkdir -p TST {test,scratch/doug/test}/sapdata{1..6}/dir{1..5}
$ touch scratch/doug/test/sapdata{1..6}/dir{1..5}/file{a..z}
$ ./dougCopy
Starting... 
Tue, Oct 08, 2019  8:21:17 AM
sending incremental file list
sapdata1/
sapdata1/dir1/
...
sapdata6/dir5/
sent 774 bytes  received 156 bytes  1,860.00 bytes/sec
total size is 0  speedup is 0.00
Complete
Tue, Oct 08, 2019  8:22:15 AM
$  find scratch/doug/test/ -type f -print | wc -l
780
$ find TST -type f -print | wc -l
780

Hi,

Thanks for testing. I've also tried running the script copying files locally, and it seems to work OK. The issue only appears when sending to the remote server.

Anyone know why that might be?

It could be some sort of resource issue or connection limit. I'd try fewer threads.

With disk and network I/O bottlenecks involved, I find it hard to believe that 40 threads would be quicker than 5; on my low-spec system here, 5 takes about twice as long as 1!

Hi,

Found the issue.

I modified sshd_config on the destination with the following param and restarted sshd:

MaxStartups 50:30:100

It wasn't set, so the default maximum was 10 unauthenticated connections before sshd started dropping new ones.
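For anyone hitting the same wall: the three colon-separated fields are start:rate:full (OpenSSH's "random early drop"). A commented version of the same line:

```
# /etc/ssh/sshd_config on the destination host.
# start:rate:full — once 50 unauthenticated connections are pending,
# drop new ones with 30% probability, rising linearly to 100% at 100.
# (OpenSSH's compiled-in default is 10:30:100.)
MaxStartups 50:30:100
```

After editing, restart or reload sshd for the change to take effect.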

Thanks for your help.


Also make sure to check your ulimit settings (and raise or remove any restrictive ones), because you might run into those limits in the long run 🙂
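If it helps, a quick way to see the limits in question (the values printed are system-dependent, so treat this as a sketch rather than expected output):

```shell
# Per-process resource limits that can bite long-running parallel
# transfers; raise them in limits.conf (or the service unit) if needed.
ulimit -n   # max open file descriptors
ulimit -u   # max user processes
```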

