How to ignore write errors while zeroing a disk?

If the disk is not connected by USB, then consider using hdparm (version > 9.31) to carry out an ATA Secure Erase of the disk. This command causes the drive's firmware to wipe the contents of the disk, including bad blocks.

Warning: Use the correct drive letter - I've shown /dev/sdX as an example - don't just copy/paste.
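One way to double-check that you have the right device is to list each drive's model and serial number and compare them against the label on the physical drive (lsblk is part of util-linux, so it should be available everywhere):

$ lsblk -d -o NAME,MODEL,SERIAL,SIZE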

First, check that the drive understands the ATA security commands (most drives manufactured in the last decade or so should):

$ sudo hdparm -I /dev/sdX
.
# lots of other info here...
.
Security: 
    Master password revision code = 65534
        supported
    not enabled
    not locked
    not frozen
    not expired: security count
        supported: enhanced erase
    202min for SECURITY ERASE UNIT. 202min for ENHANCED SECURITY ERASE UNIT.

The last two lines of the extract show that it is supported.
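Note that the output must also show not frozen, or the security commands below will fail. Many BIOSes freeze drives at boot as a safety measure; suspending and resuming the machine often lifts the freeze:

$ sudo systemctl suspend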

Then set a password on the drive (apparently a requirement):

$ sudo hdparm --user-master u --security-set-pass p /dev/sdX
security_password="p"

and erase:

$ sudo hdparm --user-master u --security-erase p /dev/sdX
security_password="p"

/dev/sdX:
Issuing SECURITY_ERASE command, password="p", user=user

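When the erase finishes, the drive clears the password and disables security again. Re-running the identify command is a quick sanity check that the drive is back to its normal state; the Security section should once more show not enabled:

$ sudo hdparm -I /dev/sdX | grep -A8 Security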


I prefer badblocks in destructive write mode for this. It writes, it continues doing so when it hits errors, and finally it tells you where those errors were, and this information may help you decide what to do next (Will It Blend?).

# badblocks -v -b 4096 -t random -o badblocks.txt -w /dev/destroyme
Checking for bad blocks in read-write mode
From block 0 to 2097151
Testing with random pattern: done
Reading and comparing: done
Pass completed, 52105 bad blocks found. (0/52105/0 errors)

And the block list:

# head badblocks.txt
2097000
2097001
2097002
2097003
2097004

And what's left on the disk afterwards:

# hexdump -C /dev/destroyme
00000000  be e9 2e a5 87 1d 9e 61  e5 3c 98 7e b6 96 c6 ed  |.......a.<.~....|
00000010  2c fe db 06 bf 10 d0 c3  52 52 b8 a1 55 62 6c 13  |,.......RR..Ubl.|
00000020  4b 9a b8 d3 b7 57 34 9c  93 cc 1a 49 62 e0 36 8e  |K....W4....Ib.6.|

Note it's not really random data - the pattern is repetitive, so if you skipped 1MiB you'd see the same output again.
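You can see the repetition for yourself by dumping a few lines from the start of the disk and again from 1MiB in; the two dumps should be identical:

# hexdump -C -n 48 /dev/destroyme
# hexdump -C -s 1048576 -n 48 /dev/destroyme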

It will also try to verify by reading the data back in, so if you have a disk that claims to be writing successfully but returns wrong data on readback, it will find those errors too. (Make sure no other processes write to the disk while badblocks is running to avoid false positives.)
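To rule out other writers, check that none of the disk's partitions are mounted and that no process holds the device open before you start; fuser prints nothing when the device is unused:

# lsblk /dev/destroyme
# fuser -v /dev/destroyme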

Of course with a badly broken disk this may take too long: there is no code that would make it skip over defective areas entirely. The only way you could achieve that with badblocks would be using a much larger blocksize.
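For example, here is the same run with a 64KiB block size, if your badblocks build accepts it; keep in mind that the block numbers in the output file are then in units of that size:

# badblocks -v -b 65536 -t random -o badblocks.txt -w /dev/destroyme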

I'm not sure whether ddrescue handles this any better; it is designed to work in the other direction (recover as much data as quickly as possible). With dd/ddrescue/badblocks you can always do it manually by specifying the first/last block...
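With badblocks, the optional last-block and first-block arguments after the device name restrict the run to a range, so you can manually skip a region you already know is hopeless. For instance, to test only the second half of the example disk above:

# badblocks -v -b 4096 -t random -w /dev/destroyme 2097151 1048576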


I see four workable answers here:

  1. The hdparm method posted by garethTheRed is probably best if the drive is connected directly to your computer. Apparently, though, if you try it over USB, you can brick your drive. If you are doing this to a drive you are about to dispose of, then that may be a good thing. However, you probably want to secure erase before discarding.

  2. The technique reported by imz -- Ivan Zakharyaschev will work, but may be very slow. If you do not want the data to be recoverable, I would suggest using /dev/urandom instead of /dev/zero (see also the note on block size after this list); e.g.,

    dd iflag=fullblock oflag=direct conv=noerror,notrunc if=/dev/urandom of=/dev/sdX
    
  3. I would advise against the following. For something faster that does the same thing, there is the technique reported by maxschlepzig (in the question):

    ddrescue --verbose --force --nosplit /dev/urandom /dev/sdX
    

    This will be faster than the dd command, but not as fast as the hdparm command. See below for why I don't recommend it...

  4. The badblocks command will also work, but its "random" pattern is repetitive rather than truly random, and again it will be very slow.
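As an aside on option 2: dd defaults to 512-byte blocks, which makes it painfully slow. A larger block size speeds things up considerably; the trade-off is that each failed transfer then covers a bigger stretch of the disk, so errors are skipped with less precision. A sketch with the same flags and an explicit 64KiB block size:

dd iflag=fullblock oflag=direct conv=noerror,notrunc bs=64K if=/dev/urandom of=/dev/sdX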

Finally, I would be remiss if I did not point out that the number one reason people want to completely erase a disk is that they are about to dispose of it. In that case, if you haven't already, you might want to try to recover the disk first. If reading a block returns an I/O error, then the next time that block is written the disk will try to reallocate it from a reserve list of spare blocks. Once the reserve list is full, you will get I/O errors on writes as well. That is when you really should discard the drive.
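If the drive supports SMART, you can watch that reallocation happening; the reallocated and pending sector counters are the ones to watch (exact attribute names vary a little between vendors):

smartctl -A /dev/sdX | grep -i -e reallocated -e pending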

So you can do something simple like:

dd if=/dev/sdX of=/dev/null conv=noerror

And then, to rewrite the bad blocks, something like:

dd if=/dev/zero of=/dev/sdX bs=128k

If this command succeeds and you are feeling brave, you can reformat the disk and use it again.
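If rewriting the whole disk takes too long, here is a minimal sketch that rewrites only the blocks badblocks recorded; it assumes a badblocks.txt produced with -b 4096 as in the answer above, so each listed number is one 4096-byte block:

# rewrite each bad block in place; oflag=direct forces the write to hit the disk
while read -r block; do
    dd if=/dev/zero of=/dev/sdX bs=4096 seek="$block" count=1 oflag=direct
done < badblocks.txt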

Alternatively, you can run the badblocks command on the disk twice. The second time it should report no bad blocks...

badblocks -v -s -w -t random /dev/sdX
badblocks -v -s -w -t random /dev/sdX

This will take longer, but is more reliable.

It is also worth noting that none of these techniques does a real secure erase, except the hdparm command. Remember all those bad blocks? The drive has reallocated them, but they still hold some of your original data mostly intact. A data recovery expert could access them to see a small amount of what was previously on your hard drive.

Regarding ddrescue and why I advise against it, I have the following anecdote:

The problem is that ddrescue is TOO good at ignoring errors. I had a hard drive that, with dd, consistently dropped its write speed at about the 102 GB mark and started producing write errors at the 238 GB mark. I was quite impressed that ddrescue continued to churn through the disk at a constant speed, even reporting no errors. Seventeen hours later, around the 1300 GB mark, I happened to notice that the drive light itself had stopped flashing. A quick check revealed the whole USB enclosure had gone offline. I pulled the drive out of the cradle. ddrescue just happily reported it was still copying with no errors, even with the disk in my hands. I plugged the disk into another machine and found it was now a brick.

I don't blame ddrescue for making the drive a brick. The drive was failing and would have become a brick anyway. I just find it disturbing that ddrescue doesn't even give a count of how many write errors it is ignoring. In this usage it leaves you thinking it has been completely successful, regardless of all the write failures. The fact is, it should not have been able to continue at full speed through the section with the slowdown. That section was slow because the drive had reallocated many blocks there, causing lots of seeking whenever the section was accessed. So that is probably the point at which ddrescue's output became fictional.