hashing - Any examples of duplicate MD5 Hashes?

08
2014-07
  • Steve

    I was wondering, as it is theoretically possible to have duplicate MD5 hashes, are there any known examples of this, or is it all just theoretical?

  • Answers

    The shortest one I know of is below. Collisions do happen, and they need to be taken into account whenever you rely on hashing. Every hash function has a shelf life, and MD5 is near the end of its life. Personally, I only use it for verifying file integrity; it is no longer a secure method for protecting stored information, only for validating it.

    Input vector 1:

    0000000    d1  31  dd  02  c5  e6  ee  c4  69  3d  9a  06  98  af  f9  5c
    0000020    2f  ca  b5  87  12  46  7e  ab  40  04  58  3e  b8  fb  7f  89
    0000040    55  ad  34  06  09  f4  b3  02  83  e4  88  83  25  71  41  5a
    0000060    08  51  25  e8  f7  cd  c9  9f  d9  1d  bd  f2  80  37  3c  5b
    0000100    d8  82  3e  31  56  34  8f  5b  ae  6d  ac  d4  36  c9  19  c6
    0000120    dd  53  e2  b4  87  da  03  fd  02  39  63  06  d2  48  cd  a0
    0000140    e9  9f  33  42  0f  57  7e  e8  ce  54  b6  70  80  a8  0d  1e
    0000160    c6  98  21  bc  b6  a8  83  93  96  f9  65  2b  6f  f7  2a  70
    

    Input vector 2:

    0000000    d1  31  dd  02  c5  e6  ee  c4  69  3d  9a  06  98  af  f9  5c
    0000020    2f  ca  b5  07  12  46  7e  ab  40  04  58  3e  b8  fb  7f  89
    0000040    55  ad  34  06  09  f4  b3  02  83  e4  88  83  25  f1  41  5a
    0000060    08  51  25  e8  f7  cd  c9  9f  d9  1d  bd  72  80  37  3c  5b
    0000100    d8  82  3e  31  56  34  8f  5b  ae  6d  ac  d4  36  c9  19  c6
    0000120    dd  53  e2  34  87  da  03  fd  02  39  63  06  d2  48  cd  a0
    0000140    e9  9f  33  42  0f  57  7e  e8  ce  54  b6  70  80  28  0d  1e
    0000160    c6  98  21  bc  b6  a8  83  93  96  f9  65  ab  6f  f7  2a  70
    

    Digest:

    79054025255fb1a26e4bc422aef54eb4

    -A real MD5 collision
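
    To check the collision yourself, the two vectors can be turned back into bytes and fed to an MD5 tool. This is only a minimal sketch, assuming a POSIX shell and `md5sum` from GNU coreutils (`openssl md5` should do the same job); the `hex_to_bin` helper and the variable names are made up for illustration. The hex bytes are copied from the dumps above with the offset column dropped.

```shell
#!/bin/sh
# Hex bytes of the two input vectors (offset column removed).
v1="d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70"

v2="d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70"

# hex_to_bin (hypothetical helper): write whitespace-separated hex bytes
# to stdout as raw bytes.
hex_to_bin() {
    for byte in $1; do
        # Inner printf turns the hex byte into octal; %b then emits \0ooo
        # as a single raw byte.
        printf '%b' "\\0$(printf '%o' "0x$byte")"
    done
}

d1=$(hex_to_bin "$v1" | md5sum | cut -d' ' -f1)
d2=$(hex_to_bin "$v2" | md5sum | cut -d' ' -f1)

echo "vector 1: $d1"
echo "vector 2: $d2"
```

    The two inputs differ in six bytes, yet both lines should show the digest quoted above.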


  • Related Question

    hashing - Can you use OpenSSL to generate an md5 or sha hash on a directory of files?
  • Kieveli

    I'm interested in storing an indicator of file / directory integrity between two archived copies of directories. It's around 1TB of data stored recursively on hard drives. Is there a way using OpenSSL to generate a single hash for all the files that can be used as a comparison between two copies of the data, or at a later point to verify the data has not changed?


  • Related Answers
  • AaronLS

    You could recursively generate all the hashes, concatenate the hashes into a single file, then generate a hash of that file.
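
    The hash-of-hashes idea could look roughly like the sketch below. It assumes a POSIX shell with `find` and GNU `md5sum` (swapping in `openssl md5` should work too); the `dir_digest` function and the throwaway directory names are invented for the example. Paths are taken relative to each directory so that two copies stored in different places produce the same list.

```shell
#!/bin/sh
# dir_digest DIR (hypothetical helper): print one MD5 digest that
# summarises every regular file under DIR.
dir_digest() {
    (
        cd "$1" || exit 1
        # Hash each file, sort the "digest  path" lines so the order is
        # stable, then hash the resulting list.
        find . -type f -exec md5sum {} + | LC_ALL=C sort | md5sum | cut -d' ' -f1
    )
}

# Example with two throwaway copies of the same tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/copy1/sub" "$tmp/copy2/sub"
echo "hello" > "$tmp/copy1/a.txt"
echo "hello" > "$tmp/copy2/a.txt"
echo "world" > "$tmp/copy1/sub/b.txt"
echo "world" > "$tmp/copy2/sub/b.txt"

h1=$(dir_digest "$tmp/copy1")
h2=$(dir_digest "$tmp/copy2")
echo "copy1: $h1"
echo "copy2: $h2"

# Changing one file changes the combined digest.
echo "changed" > "$tmp/copy2/sub/b.txt"
h3=$(dir_digest "$tmp/copy2")
echo "copy2 after edit: $h3"

rm -rf "$tmp"
```

    Because the file names are part of the hashed list, a renamed or missing file changes the result as well as a modified one.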

  • John T

    You can't do a cumulative hash of them all to make a single hash, but you can compress them first then compute the hash:

    $ tar -czpf archive1.tar.gz folder1/
    $ tar -czpf archive2.tar.gz folder2/
    $ openssl md5 archive1.tar.gz archive2.tar.gz
    


    To recursively hash each file:

    $ find . -type f -exec openssl md5 {} +
    
  • Rudedog

    Doing an md5 sum on the tar would never work unless all of the metadata (modification time, ownership, etc.) was identical as well, because tar stores that as part of its archive.

    I would probably do an md5 sum of the contents of all of the files:

    find folder1 -type f | sort | tr '\n' '\0' | xargs -0 cat | openssl md5
    find folder2 -type f | sort | tr '\n' '\0' | xargs -0 cat | openssl md5
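
    The point about tar metadata can be seen with a small experiment. This is a sketch only, assuming a POSIX shell, `tar` with `-C`, `xargs -0` and GNU `md5sum`; the `tar_digest` and `content_digest` helpers and the temporary directory are made up for the example. Only the file's modification time changes between the two runs.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir "$tmp/folder"
echo "same content" > "$tmp/folder/file.txt"

# Digest of a tar archive of the folder (file metadata included).
tar_digest() {
    tar -cf - -C "$tmp" folder | md5sum | cut -d' ' -f1
}

# Digest of the concatenated file contents only.
content_digest() {
    (cd "$tmp" && find folder -type f | sort | tr '\n' '\0' | xargs -0 cat) |
        md5sum | cut -d' ' -f1
}

# First measurement with one modification time.
touch -t 200001010000 "$tmp/folder/file.txt"
t1=$(tar_digest)
c1=$(content_digest)

# Second measurement: same bytes, different modification time.
touch -t 201001010000 "$tmp/folder/file.txt"
t2=$(tar_digest)
c2=$(content_digest)

echo "tar:      $t1 vs $t2"
echo "contents: $c1 vs $c2"

rm -rf "$tmp"
```

    The tar digests should differ while the content digests stay the same.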