Can Robocopy ensure file integrity?

2014-07-08
  • iceagle

    I am trying to copy a large file (around 10 GB) over the Internet using Robocopy, but I am a bit concerned about file integrity. Can I trust Robocopy to ensure file integrity, or do I need to calculate and verify an MD5 hash myself? Thanks.

  • Answers
  • Brian

    One large monolithic file will likely cause problems: if the transfer fails partway through you have to start over, and even with /Z (restartable mode) it might take too long to transfer.

    I would suggest using an archive utility to split it into multiple files and to provide an extra layer of integrity checking on the reassembled file. On newer versions of Windows, splitting also lets you use the /MT switch, which transfers multiple files at once and can speed things up on slow links if you are also using /Z. So: split the file, robocopy the parts over, and then reassemble.
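    Robocopy itself is Windows-only, but the split-then-verify workflow above can be sketched with standard Unix tools. This is a minimal sketch, assuming `split` and `md5sum` are available; the file and part sizes are arbitrary stand-ins for the real 10 GB transfer:

```shell
# Stand-in for the large file (in practice, your ~10 GB file).
head -c 1048576 /dev/urandom > bigfile.bin

# Record a checksum of the original before splitting.
orig=$(md5sum < bigfile.bin)

# Split into fixed-size parts; the copier (e.g. robocopy) then
# transfers the parts individually, retrying only the ones that fail.
split -b 262144 -d bigfile.bin part.

# On the receiving side: reassemble in order and compare checksums.
cat part.* > rebuilt.bin
new=$(md5sum < rebuilt.bin)
[ "$orig" = "$new" ] && echo "integrity OK" || echo "MISMATCH"
```

    If any part is corrupted in transit, only that part needs to be re-sent before reassembling.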


  • Related Question

    hashing - Can you use OpenSSL to generate an md5 or sha hash on a directory of files?
  • Kieveli

    I'm interested in storing an indicator of file / directory integrity between two archived copies of directories. It's around 1TB of data stored recursively on hard drives. Is there a way using OpenSSL to generate a single hash for all the files that can be used as a comparison between two copies of the data, or at a later point to verify the data has not changed?


  • Related Answers
  • AaronLS

    You could recursively generate all the hashes, concatenate the hashes into a single file, then generate a hash of that file.
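    A minimal sketch of that idea using `openssl` (the directory names are placeholders). One subtlety: the per-file hash lines embed the file paths, so run the pipeline from inside each copy so that the relative paths match, and sort the lines so both copies list files in the same order:

```shell
# Build a tiny sample tree (placeholder for the real 1 TB archive).
mkdir -p copy1/sub copy2/sub
echo "data" > copy1/sub/file.txt
cp copy1/sub/file.txt copy2/sub/file.txt

# Hash every file, sort for a deterministic order, then hash that list.
hash1=$( cd copy1 && find . -type f -exec openssl md5 {} + | sort | openssl md5 )
hash2=$( cd copy2 && find . -type f -exec openssl md5 {} + | sort | openssl md5 )

[ "$hash1" = "$hash2" ] && echo "directories match"
```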

  • John T

    You can't do a cumulative hash of them all to make a single hash, but you can compress them first and then compute the hash:

    $ tar -czpf archive1.tar.gz folder1/
    $ tar -czpf archive2.tar.gz folder2/
    $ openssl md5 archive1.tar.gz archive2.tar.gz
    


    To hash each file recursively:

    $ find . -type f -exec openssl md5 {} +
    
  • Rudedog

    Doing an MD5 sum on the tar would never work unless all of the metadata (modification dates, ownership, etc.) was identical as well, because tar stores that metadata as part of its archive.
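    A quick way to see this (file names and the timestamp below are arbitrary): two tars of byte-identical content hash differently as soon as a modification time changes, because tar records the mtime in each member's header.

```shell
# Two archives of the same file content, differing only in mtime.
echo "same content" > demo.txt
tar -cf first.tar demo.txt
touch -t 202001010000 demo.txt   # change only the modification time
tar -cf second.tar demo.txt

# The archives hash differently even though the file data is identical.
openssl md5 first.tar second.tar
```

    (Plain `tar` is used here rather than `tar -z`, since gzip embeds a timestamp of its own.)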

    I would probably do an MD5 sum of the contents of all of the files instead:

    find folder1 -type f | sort | tr '\n' '\0' | xargs -0 cat | openssl md5
    find folder2 -type f | sort | tr '\n' '\0' | xargs -0 cat | openssl md5