Program or bash script to see total progress of copy

hi all,

I want a program or a bash script that shows the total ETA/percentage (a progress bar would be nice as well) of a recursive copy.

So let's say I run:

cp -r /source_folder/ /destination_folder/

When I run it, I get no information on screen about how the copy is progressing.

I have tried the "pv" and "progress" commands, but to no avail: they only give me the ETA/progress of the individual files being copied, not the total ETA/progress for the whole directory.

Any help appreciated,

rob

You don't say what OS you have, but I would suggest pv is the tool for you.

Have a try and show us what's happening if it doesn't do what you need. It may be that you could use tar to bundle them into a single archive, pipe them through pv and extract at the other side.

Kind regards,
Robin

Hi.

1) Were none of the solutions from Linux Forums, or from "command line - How can I move files and view the progress (e.g. with a progress bar)?" on Unix & Linux Stack Exchange, of any help?

2) On what characteristic do you want to see progress: bytes, blocks, files, or something else?

Best wishes ... cheers, drl

smashed it -

[root@robw-linux data]# tar -c call_the_midwife_7_1708/ | pv -lep -s 32455212 | tar -x -C /mnt/local/data/new/
[=> ] 2% ETA 2:34:31

and to find the dir size I did -

du -s call_the_midwife_7_1708/

But this method takes ages because it has to create the tar stream and then extract it again; a normal copy takes only about 18 minutes.
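
A note on the units in the pipeline above: pv -l counts lines rather than bytes, while du -s reports blocks (1K by default with GNU du), so the percentage and ETA may not mean much. A minimal sketch of a byte-based variant follows, assuming GNU du, GNU tar, and pv are installed; copy_tree_pv is a made-up helper name. The figure is still approximate, because the tar stream carries headers and padding that du does not count.

```shell
#!/usr/bin/env bash
# Sketch only: copy_tree_pv is a hypothetical helper name.
# Assumes GNU du (-b), tar, and pv are available.
copy_tree_pv() {
    local src=$1 dest=$2 size
    size=$(du -sb "$src" | cut -f1)   # total bytes; pv uses this for percent and ETA
    mkdir -p "$dest"
    # Archive relative to the parent directory so the stream holds relative paths.
    tar -C "$(dirname "$src")" -cf - "$(basename "$src")" |
        pv -pte -s "$size" |
        tar -xf - -C "$dest"
}
```

Called as copy_tree_pv call_the_midwife_7_1708 /mnt/local/data/new, it should leave the tree under /mnt/local/data/new/call_the_midwife_7_1708, as the original command does.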

How much data is it, and how fast are your disks?

32 GB

It's my data drive, not the OS drive: sdb, a single spindle at 7200 rpm

How about rsync? It supports progress and same-system copy:

rsync -aI --progress source/ destination

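-a is archive mode (recursive, and it preserves permissions, timestamps and so on), and -I (--ignore-times) makes rsync copy everything it finds instead of skipping files whose size and timestamp already match.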

Note the trailing / on the source is important! Otherwise you'll end up with destination/source/filename instead of destination/filename
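
A quick way to convince yourself of the trailing-slash behaviour is a throwaway test in a temporary directory. This is only a sketch; the directory names under the temp dir are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: show where files land with and without the trailing slash.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/with_slash" "$demo/without_slash"
touch "$demo/source/filename"

rsync -a "$demo/source/" "$demo/with_slash"      # copies the contents of source
rsync -a "$demo/source"  "$demo/without_slash"   # copies the directory itself

# List what ended up where.
find "$demo/with_slash" "$demo/without_slash" -type f
```

The first destination should contain filename directly, the second should contain source/filename.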

Hi,

Another method is to compare the static size of the directory source (in whatever terms you desire) with the changing size of the destination.

See for example post 2 in thread how to have a cp progress bar?

You'd probably need to modify the cp command to indicate you wish to copy a directory, probably like adding -r

The stat command would produce similar results to du, but in blocks (%b), and you may want to substitute du.

I don't see an ETA being calculated, but, as with the suggestion from Corona688 for rsync, it would avoid touching the data many times.

The utility tree v1.7.0 produces measurements quite quickly, so you also could use it. Here's a sample of timing for a largish directory with many items, disk being an SSD in this case (and producing 25K lines of output, to be discarded, keeping stderr):

1892 directories, 23706 files

real    0m0.812s
user    0m0.132s
sys     0m0.336s

from:

time tree src/

Best wishes ... cheers, drl

---------- Post updated at 21:36 ---------- Previous update was at 06:34 ----------

Hi.

See also many comments on rsync and progress at linux - Showing total progress in rsync: is it possible? - Server Fault

Best wishes ... cheers, drl

Just thought of another idea -

I'll get the size of the source path -

du -s /source_path/

then I will start the copy -

cp -r /source_path/ /destination_path/

and while it's copying I will monitor the progress -

watch -n 0.5 du -s /destination_path/

I want to do all of this in one bash script, but my issue is that the script won't watch the destination path while the copy is running. How do I do both at the same time?

rob

Put the cp into background. But, don't expect that method to be too exact. You'll need to know every single file to be included or excluded, and other disk activity may interfere.
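
A minimal sketch of that idea, for what it is worth: cp runs in the background and the script polls the destination size until cp exits. copy_with_progress is a made-up name, it assumes GNU du (for -b), and it assumes the destination does not exist before the copy starts, so that du on the destination measures only the new data. The ETA is a plain linear extrapolation from the bytes copied so far.

```shell
#!/usr/bin/env bash
# Sketch only: copy_with_progress is a hypothetical helper, not a standard tool.
# Assumes GNU du (-b) and that DEST does not exist before the copy starts.
copy_with_progress() {
    local src=$1 dest=$2 interval=${3:-1}
    local total copied pct elapsed eta start cp_pid status

    total=$(du -sb "$src" | cut -f1)      # source size in bytes
    start=$(date +%s)

    cp -r "$src" "$dest" &                # run the copy in the background
    cp_pid=$!

    # Poll the destination until cp exits.
    while kill -0 "$cp_pid" 2>/dev/null; do
        copied=$(du -sb "$dest" 2>/dev/null | cut -f1)
        copied=${copied:-0}
        if [ "$total" -gt 0 ] && [ "$copied" -gt 0 ]; then
            pct=$(( copied * 100 / total ))
            [ "$pct" -gt 100 ] && pct=100
            elapsed=$(( $(date +%s) - start ))
            eta=$(( elapsed * (total - copied) / copied ))
            [ "$eta" -lt 0 ] && eta=0
            printf '\r%3d%%  ETA %ds   ' "$pct" "$eta"
        fi
        sleep "$interval"
    done

    wait "$cp_pid"                        # collect the exit status of cp
    status=$?
    printf '\n'
    return "$status"
}
```

Usage would be something like copy_with_progress /source_path /destination_path 0.5. As noted above, the numbers are only as exact as du is, and write caching can make the last few percent look faster than they really are.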

You seem to want it sizewise, not timewise. Try this, although due to I/O buffering the "progress bar" may not be in correct sync:

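awk '
# Draw the right-hand end marker of a 100-column bar.
BEGIN           {printf "\r%101s", "|"
                }

# First input (du -b): store each size by path and add up the total.
NR == FNR       {C[$2] = $1
                 SUM  += $1
                 next
                }

# Second input (cp -v): strip the quotes around the source name,
# add its size to the running total, and redraw one "-" per percent.
                {gsub (/\047/, "", $1)
                 TOTSIZ += C[$1]
                 PCT     = TOTSIZ/SUM*100
                 TMP     = sprintf ("%*s", PCT, " ")
                 gsub (/ /, "-", TMP)
                 printf "\r%s", TMP
                }

# Finish with a newline so the prompt starts on a fresh line.
END             {printf RS
                }
' <(du -b /source_path/*) <(cp -v /source_path/* /destination_path/ 2>&1)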

Hi.

There is a script at linux - Showing total progress in rsync: is it possible? - Server Fault by author nito near the end of the thread.

I copied it into a file I call watch-running-process

Here is a driver script:

#!/usr/bin/env bash

# @(#) s1       Demonstrate rate, ETA of running process, assumes /proc IO, like cp, rsync, etc.

LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C

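SOURCE=/not-backed-up
DESTIN=/tmp
# Size of the source in whole MB, with the trailing "M" and the path stripped.
volume=$( du -s -BM "$SOURCE" | sed 's/M.*$//' )

# Start the copy in the background and remember its PID.
cp -r "$SOURCE" "$DESTIN" &
my_pid=$!

./watch-running-process "$my_pid"  "$volume"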

exit 0

The parameters are the PID you are watching, the volume of the source in MB, and an optional delay for the loop (default 5 seconds). Because you supply the PID, you can run anything you want, just as long as it does io that is captured in /proc (so not Solaris, macOS, BSD, etc.). I have not looked in detail at the numbers in /proc, but the script seems to work. If you are interested / curious, see man proc , look at the entry for io, read_bytes, etc.
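
In case anyone wants to roll their own, here is a minimal sketch of reading those counters directly. It is not the script by nito; io_bytes and io_percent are made-up helper names, and it assumes a Linux kernel that exposes /proc/PID/io and that you have permission to read it for the process in question.

```shell
#!/usr/bin/env bash
# Sketch only: io_bytes and io_percent are hypothetical helpers (Linux only).

# Print one counter (for example read_bytes or write_bytes) for a PID.
io_bytes() {
    local pid=$1 field=$2
    awk -v f="$field:" '$1 == f { print $2 }' "/proc/$pid/io"
}

# Print the integer percentage of total_bytes that the PID has written so far.
io_percent() {
    local pid=$1 total_bytes=$2 written
    written=$(io_bytes "$pid" write_bytes) || return 1
    [ -n "$written" ] && [ "$total_bytes" -gt 0 ] || return 1
    echo $(( written * 100 / total_bytes ))
}
```

With the driver above, io_percent "$my_pid" "$(( volume * 1024 * 1024 ))" in a loop would give a rough percentage, since volume is in MB.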

Here are display snapshots near the beginning, middle, and end of the process for cp as done prior to the call to the monitoring script:

Monitoring PID: 29096

Read :      539.04 MiB in 5.03 s
Write:      651.68 MiB in 5.03 s

Read Rate : 107.16 MiB/s ( Avg: 58.05, Max: 126.64 )
Write Rate: 129.55 MiB/s ( Avg: 193.95, Max: 544.66 )

Done      : 7.63 GiB / 14.11 GiB (54.07 %)
ETA       : 00:00:34.21 (34.21s)
Elapsed   :       00:41

-----

Monitoring PID: 29096

Read :      336.91 MiB in 5.04 s
Write:      337.91 MiB in 5.04 s

Read Rate : 66.84 MiB/s ( Avg: 50.63, Max: 126.64 )
Write Rate: 67.04 MiB/s ( Avg: 107.78, Max: 544.66 )

Done      : 13.24 GiB / 14.11 GiB (93.83 %)
ETA       : 00:00:08.26 (8.26s)
Elapsed   :       02:08

-----

Monitoring PID: 29096

Read :      105.94 MiB in 5.03 s
Write:      105.86 MiB in 5.03 s

Read Rate : 21.06 MiB/s ( Avg: 50.43, Max: 126.64 )
Write Rate: 21.04 MiB/s ( Avg: 101.45, Max: 544.66 )

Done      : 13.96 GiB / 14.11 GiB (98.93 %)
ETA       : 00:00:01.51 (1.51s)
Elapsed   :       02:23
----- Finished -----

There is not a real progress bar, but it includes an ETA, along with data rates. Given that the numbers are all available in the script, a scaled progress bar probably could be done.
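
A scaled bar could be sketched along these lines. draw_bar is a made-up helper name; it takes an integer percentage and an optional width, and relies on the bash printf builtin (printf -v and the * width).

```shell
#!/usr/bin/env bash
# Sketch only: draw_bar is a hypothetical helper.
# Usage: draw_bar PERCENT [WIDTH]
draw_bar() {
    local pct=$1 width=${2:-50} filled bar
    [ "$pct" -gt 100 ] && pct=100
    [ "$pct" -lt 0 ] && pct=0
    filled=$(( pct * width / 100 ))       # number of filled columns
    printf -v bar '%*s' "$filled" ''      # that many spaces ...
    bar=${bar// /#}                       # ... turned into hashes
    # \r returns to the start of the line so the bar is redrawn in place.
    printf '\r[%-*s] %3d%%' "$width" "$bar" "$pct"
}
```

For a quick look at it counting from 1 to 100: for i in $(seq 1 100); do draw_bar "$i"; sleep 0.05; done; echo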

Best wishes ... cheers, drl

I have seen this -

bash - How to add a progress bar to a shell script? - Stack Overflow

If you read the answer by mitch, he has made a script that does this with echo commands.

But when I run it on my Linux box I just get 33, 66, and 100% with the hashes. How can I get it to count up from 1 to 100?

Also, how can I implement this with my copy command "cp -r /source /dest"?

many thanks,

rob

cp does not have this feature. Full stop. Have you considered trying a program which does have this feature?

As per my post a page ago:

I should mention, rsync is also extremely common and probably installed on your system already.

The requestor doesn't seem to read (or try (or, at least, comment on)) ALL contributions...

Hi.

I assume that this thread may be useful to others.

A set of numbers from thread linux - Copying a large directory tree locally? cp or rsync? - Server Fault

To move 532Gb of data distributed among 1,753,200 files we had those times:
    rsync took 232 minutes
    tar took 206 minutes
    cpio took 225 minutes
    rsync + parallel took 209 minutes

I wish they had added cp to the benchmark, but the theme of the thread is comparing speed, with a sub-theme of monitoring, and a few other characteristics, like re-start-ability.

So, as regards this thread, use something that does have a progress mechanism built-in or roll-your-own.

Best wishes ... cheers, drl

Of course rsync is slowest when you don't turn off checksumming.

That chart does help prove one of my biggest peeves though - "you can't multithread a hard drive". None of the options are significantly faster.

Hi.

Well, a roughly 10% return (209/232 -> 0.90) on a trivial change might be worthwhile to some folks.

Researchers at the center where I worked would moil to get even a 1% speed-up. Of course, that was most often with grad student labor :)

Best wishes ... cheers, drl

Sorted it. I removed the packaged rsync:

yum remove rsync

then downloaded the 3.1.2 source from https://download.samba.org/pub/rsync/rsync-3.1.2.tar.gz, untarred it, cd'd into the directory, and ran:

./configure.sh
make
make install

And now I get the result I wanted -

[root@robw-linux data]# rsync -a --info=progress2 call_the_midwife_7_1708/ new/
 14,874,971,690  44%   27.58MB/s    0:10:48 (xfr#16, to-chk=2/143)

Why did you do all that when you appear to have had it already? Did the version you have not contain --progress?
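
For reference, as far as I know --info=progress2 first appeared in rsync 3.1.0, while the per-file --progress option is much older, so an older packaged rsync would have one but not the other. A small sketch for checking before building from source follows; version_ge is a made-up helper name and it relies on GNU sort -V.

```shell
#!/usr/bin/env bash
# Sketch only: version_ge is a hypothetical helper; relies on GNU sort -V.
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# The first line of "rsync --version" looks like:
#   rsync  version 3.1.2  protocol version 31
installed=$(rsync --version 2>/dev/null | awk 'NR == 1 { print $3 }')

if [ -z "$installed" ]; then
    echo "rsync not found"
elif version_ge "$installed" 3.1.0; then
    echo "rsync $installed should support --info=progress2"
else
    echo "rsync $installed is probably too old for --info=progress2"
fi
```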