dd cloning of whole disk

Royalist · July 26, 2012, 12:55pm

I am using 'dd', with no special options, to clone an entire hard drive which only has Ubuntu 11.10 and some data. The disks are both 1TB. However, I did re-partition the target disk with gparted successfully. The new partitions are not the same size as those on the source disk. When starting 'dd', no partitions were specified in the target, e.g.

sudo dd if=/dev/sdc of=/dev/sda

Normally this would take no more than an hour using WD Acronis software, but that software does not produce a bootable disk.
'dd' has now been running for seven hours without any apparent progress. When should I stop it please?

Scrutinizer · July 26, 2012, 1:05pm

The default block size of dd is 512 bytes. Try using bs=1M, for example. The optimum value differs per system.
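To illustrate the block-size suggestion, here is a minimal sketch that copies a small scratch file rather than a real disk. The temporary file names and the 4 MiB size are assumptions made only for the demo, and the block size is written as bs=1048576, which is what bs=1M means to GNU dd.

```shell
# Demo only: uses scratch files, never touches a real device.
src=$(mktemp)
dst=$(mktemp)

# Build a 4 MiB source file (a stand-in for the source disk).
dd if=/dev/zero of="$src" bs=1048576 count=4 2>/dev/null

# Copy it with 1 MiB blocks instead of the 512-byte default.
dd if="$src" of="$dst" bs=1048576 2>/dev/null

# Confirm the copy matches, and record its size in bytes.
cmp "$src" "$dst" && echo "copy verified"
copied_bytes=$(( $(wc -c < "$dst") ))
echo "copied $copied_bytes bytes"

# Remove the scratch files.
rm -f "$src" "$dst"
```

On a real clone the same bs= operand would simply be appended to the original command; which value is fastest still has to be found by trial on the system in question.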

Corona688 · July 26, 2012, 1:26pm

dd doesn't print progress output while running unless asked. If your disk light's flashing, it's not done.

If you're running Linux, you have GNU dd, which will print progress statistics when you give it SIGUSR1.

kill -USR1 pid
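A rough sketch of trying this out safely, assuming GNU dd on Linux (on other systems SIGUSR1 may simply kill dd). It starts a throwaway dd that reads zeros and discards them, so no real disk is involved; the log file and variable names are made up for the demo.

```shell
# Demo only: a throwaway dd that reads zeros and discards them.
log=$(mktemp)
dd if=/dev/zero of=/dev/null bs=1048576 2>"$log" &
pid=$!

sleep 1                 # give dd time to install its signal handler
kill -USR1 "$pid"       # ask for statistics; GNU dd keeps running
sleep 1                 # give dd time to write them to stderr

kill "$pid"             # stop the demo copy (SIGTERM)
wait "$pid" 2>/dev/null || true

# Show what dd reported in response to the signal.
stats=$(cat "$log")
echo "$stats"
rm -f "$log"
```

For a dd that is already running in another terminal, pgrep -x dd lists the candidate process IDs; check that you have the right one before sending the signal.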

A terabyte disk could take a good number of hours, easily. I never did manage to get a 1.5T RAID mirroring done the same workday I started it. Don't give up yet. Sleep on it, and it'll probably be done in the morning.

alister · July 26, 2012, 2:00pm

Some implementations of dd will print progress information when they receive SIGINFO (a non-standard signal not supported on all platforms). Check your documentation and/or run a test with another dd (perhaps reading from a neverending file to an unfillable sink):

dd if=/dev/zero of=/dev/null bs=1

From another terminal, grab the pid (make sure it's the correct instance of dd), and send the signal.

kill -INFO pid

If dd prints progress info without dying, then you can use it safely on your long-running dd.

Or, if you're the daring sort, just kill your long-running dd with kill -INT pid . Before exiting, it will print the number of bytes written. You can then resume with the larger blocksize suggested by Scrutinizer, using the skip and seek operands (you'll have to do a little arithmetic to figure out how many complete new-size blocks were written and need to be skipped in both the source and target).

If the blocksize uses a suffix (in 1M, that would be the letter M), be very careful to use the correct multiplier. M is usually 2^20 bytes (1,048,576), not 10^6 (1,000,000). Similar traps await at other magnitudes (kilo, giga).
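The arithmetic for resuming can be sketched as follows. The byte count here is a hypothetical figure standing in for whatever the interrupted dd reported, and the device names are the ones from the first post; the script only prints the resume command, it does not run it.

```shell
# Hypothetical figure: bytes reported by the interrupted dd.
bytes_written=100000000000

# New block size: 1M means 2^20 bytes, not 10^6.
block_size=$((1024 * 1024))

# Only complete blocks count; integer division drops the partial block,
# which then gets copied again on resume.
full_blocks=$((bytes_written / block_size))

# Print (do not run) the command that would resume the copy.
echo "dd if=/dev/sdc of=/dev/sda bs=1M skip=$full_blocks seek=$full_blocks"
```

Rounding down is the safe direction: re-copying part of a block costs a moment, while skipping one that was never fully written would leave a hole in the clone.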

Are you feelin' lucky?

Regards,
Alister

---------- Post updated at 02:00 PM ---------- Previous update was at 01:51 PM ----------

I don't use Linux much, but that's good to know. Thanks. If the system supports it, GNU dd responds to SIGINFO as well (which is the signal used by *BSD dd).

Probably the wisest choice for now.

Regards,
Alister

Royalist · July 27, 2012, 10:03am

Thank you all for your various and helpful suggestions. Eventually I decided to terminate late yesterday. I used GUI sys monitor and found all processes were sleeping. I entered 'kill -9' in terminal and tried to boot from the clone, unsuccessfully. Since then the disk has been declared healthy and without errors several ways and times.
I find the partition structure is the same as the original disk. Unimportantly, all my pre-partitioning has been lost. So, trying to boot with only the clone connected and going to recovery mode, the following prints:

Target filesystem doesn't have requested /sbin/init
No init found. Try passing init= bootarg.

(bootarg does not appear in the list of commands in the provided help option and did not succeed).

This follows

'BusyBox v.1.18.4 (Ubuntu 1:1.18.4-2ubuntu2) built-in shell (ash)
Enter 'help' for a list of builtin commands'.

initramfs[prompt]

I feel that I am close to succeeding in producing a bootable disk, but why does 'dd' not return a prompt, or any message, when finished? What did I omit from the original syntax, please, and what actions are now necessary to recover?

Corona688 · July 27, 2012, 11:13am

dd does return to a prompt -- you got impatient and killed it before it could finish.

It also prints statistics when it finishes; at least the GNU/Linux version should.

Next time, at least try the kill -USR1 trick so you can measure how much progress it's made?

You can also try the dd_rescue command if you have it, which does print progress information, and defaults to a sane block size.

dd_rescue input_device output_device

Royalist · July 27, 2012, 12:21pm

Thanks for your opinion. The 'dd' results were returned, although I didn't record the number of bytes copied, but it was vast. Today I have re-read the 'kill -USR1' recommendation and have made a definite note to use it next time. Thanks again. However, the process was clearly marked as 'sleeping' and I took that to mean that it was complete.
By the way - this is not a GNU version so far as I can tell. It is Ubuntu 11.10.
'man dd_rescue' produces 'no manual entry'.
I really am very grateful for the trouble that you have taken. Roy:)

fpmurphy · July 27, 2012, 12:24pm

An enhanced version of GNU dd called dcfldd is often used by forensic examiners. It outputs status messages continually to stderr.

Corona688 · July 27, 2012, 12:31pm

I use GNU dd quite often, and sadly think this is more than opinion. Nothing went wrong, you just got impatient. Copying a terabyte of data can take ages, even when you're getting ideal transfer rates (which you often don't).

It may also have gotten hung up for some reason, though that seems unlikely. dd will print error messages if you start getting read or write errors.

Yes, it prints statistics whenever it quits for any reason. That doesn't mean it finished.

All dd has to do is while(!eof) { read(buffer); write(buffer); } . That's a whole lot of transfer for very few instructions, so 99% of the time is spent asleep waiting for the disk to catch up. Most processes spend 99% of their time asleep, waiting for I/O.

Royalist · August 3, 2012, 8:09am

Yes! 'dd' does work. Scrutinizer's 'bs=1M' did the job. A bootable clone in 2.75 hours, with results returned and no discrepancies.
Previously, I had taken Alister's advice about the unfillable sink etc. and used that method to test the various signals that had been suggested. Not one achieved the expected result, but then I may have mis-applied them.
The clone's partitions have now been resized with GParted with no apparent data loss and the new partitions are now in use.
Finally, I have done my history homework and read up about GNU and Linux and looked for an apparently non-existent link to IBM. I now understand the grounding of Ubuntu.
Thank you all for being so patient.

Regards
Roy:cool:

Scrutinizer · August 3, 2012, 8:41am

Thanks for reporting back, Royalist.

Lem · August 3, 2012, 2:07pm

For your goal IMHO it's better (and easier) to use dd_rescue than dd, even better to use ddrescue. And you learn to use something useful in case of hardware problems (dead sectors). For some more info:
System Administration Bits of Knowledge

To install and learn dd_rescue in Ubuntu:

sudo apt-get install ddrescue
man dd_rescue

Yes, dd_rescue is in package ddrescue (Debian and Ubuntu).

To install and learn ddrescue in Ubuntu:

sudo apt-get install gddrescue
man ddrescue

Yes, ddrescue is in package gddrescue (Debian and Ubuntu).

For single partitions, best (and often much faster) is to use Partimage.
--
Bye

Corona688 · August 3, 2012, 2:13pm

dd_rescue is indeed simpler and more convenient if you have it, which is why I suggested it earlier.

Royalist · August 4, 2012, 10:42am

Yes, thanks. I do have dd_rescue, but have not tried it as yet.

Royalist · August 23, 2012, 4:06pm

OK so I have now tried ddrescue.

The following was the syntax that I used in the Ubuntu Remix version of the Ubuntu 12.04 live CD (my present OS is 11.10):

From the command line on the live CD, 'sudo lshw -short -c Disk' to determine the input and output: source '/dev/sdb', output '/dev/sda'.

Then 'sudo ddrescue -f -g /dev/sdb /dev/sda 23-08-2012.log'

After about ten minutes (and whilst I was on the telephone) the screen suddenly went blank, followed by 'no signal'. Up to then it had been reporting progress. From then on the red light remained on for more than an hour. I left it to carry on. After three hours the light had dimmed and it was possible to get the screen back. The results reported that it was finished and that a lot of bytes had been processed. I am currently searching for the log file.

The result was that the input disk was unchanged as I would hope, but so was the target disk. So nothing was achieved.

The target disk had previously been cloned with 'dd'. I bowed to the weight of the recommendations received here, to try ddrescue. SO WHAT HAVE I DONE WRONG PLEASE!!:wall:

Corona688 · August 23, 2012, 4:15pm

That sounds like the console screen-saver. Hitting the ctrl-key (or any other dead-key) should have brought it back.

I can't see anything obviously wrong with what you did.

But I can't see your computer from here, I don't know what you truly did, or even the circumstances. I don't even know what folder you were in, for instance -- that may have had something to do with where the logfile went. Don't know how you booted the system -- hopefully sdb wasn't the disk you'd booted from? I don't know what way the program terminated -- there is a vast, important difference between 'lots of bytes' and 'all the bytes'... and so forth.

Royalist · August 24, 2012, 12:18pm

Yes, having gained a little more experience of ddrescue, I now think that was the case.

Booted from the Remix live CD which includes ddrescue.

I have since run it again and tried to change the '-f' option, but it wasn't having that as there was data already on the target.

Here is the returned result:

Initial status (read from logfile)
rescued: 0B, generated: 0B
Current status
rescued: 264104MB, generated: 1TB, current rate: 96206KB/s
opos: 1TB, average rate: 107MB/s
Finished

My intention now is to become more familiar with the Remix console and ddrescue. In particular, to explore the path system for storing the log file in an unmounted drive. I expect it just becomes part of the output file and is then written to the target as a whole. Also, I will re-examine the remaining options; perhaps I missed something obvious.
I do appreciate your help.
Regards Roy

---------- Post updated at 17:18 ---------- Previous update was at 08:36 ----------

Having put considerable effort and time into this, I have come to the conclusion that ddrescue, or the Remix version of it, is incapable of writing output to anything other than a completely fresh, i.e. empty, disk. The -f option will not overwrite a healthy disk containing data, even after a satisfactory completion.
I hope that it makes a better job of what is stated as its primary purpose, but for simple disk cloning it is a no-no. :wall::wall::wall:

Lem · August 24, 2012, 2:19pm

What? Please...

I'm sorry, but this makes no sense. There's no difference between a completely fresh empty disk and a disk full of data, when you look at them as block devices as ddrescue does. And ddrescue works flawlessly.

You simply didn't get what "-g" (or "--generate-logfile") is for. True, the man page didn't help you with too many words, I must admit. But please read here carefully:
GNU ddrescue Manual

With this "-g" option, ddrescue doesn't even try to copy: it tries to generate a logfile from a source and an already made (partial) copy.

Don't use the option "-g", and ddrescue will work for you.
HTH.
--
Bye

Royalist · August 25, 2012, 11:41am

Thanks Lem - most interesting.

The Remix 'info ddrescue' is clearly very truncated and so I found your link most enlightening.

The one I previously read emphasises the need for a logfile, and neither seems to tell how or where to find the logfile. Nowhere, after numerous re-reads, is there any mention that using the '-g' option does not complete the job, but with the previously mentioned emphasis, it seems necessary to include the '-g' option. You seem to be telling me to make numerous passes to obtain these logfiles (just by appending <logfile*?>) and to exclude the '-g'??

dd does it all in only one pass.

I do hope that this makes sense to you. Your advice is much appreciated. Please note only 2 now -----> :wall::wall:

Lem · August 25, 2012, 2:23pm

Ok, let's start from the beginning.

To have a logfile, just use the optional [logfile] field.

For example (I think it's pretty self-explanatory):

# ddrescue -f /dev/sdx /dev/sdy /var/log/ddrescuelog

Just one pass. You're done.

About the "-f" option: ddrescue doesn't copy only from disk to disk: it copies from a standard file, a partition or a disk to a standard file, a partition or a disk. It can copy from a file to a partition, or from a disk to a file, or whatever you like. Now: if the destination is a device or a partition, that is, if it isn't a regular file, ddrescue will refuse to overwrite it without the "-f" option: "it's for your own protection"(R). So you're right to use it.

About the logfile: ddrescue is designed for situations where the origin is typically a disk with some bad, unreadable sectors. ddrescue's task is to copy as much as possible, as quickly as possible, with the least possible stress. But something won't be copied, usually. It won't be possible to read some sectors on the origin, and the corresponding sectors on the destination will be zero-filled. That's why a logfile is so relevant: to know what has been copied and what hasn't.
But you're using ddrescue just to clone a healthy disk, so your logfile (yes, have it, it won't harm) will be really tiny and not interesting.

About the "-g" option: let's say someone used ddrescue to recover data from a failing disk and without a logfile (shame on him!). Now he has the origin, he has the (partial) copy, he hasn't got the logfile. Thank God (well, thank ddrescue) he can generate the logfile (well, an approximate logfile) even "ex post", with the "-g" option: ddrescue -g origin alreadymadecopy /var/log/ddrescuelog . Of course this is not your case, so you're not interested at all in the "-g" option, which makes ddrescue just compare an origin with a copy, and create a log about what has probably been done.

I hope we can get rid of those walls, now.
--
Bye