bash - Finding a file by md5sum

08
2014-07
  • gojira

    Given the md5sum of a file, I want to know if anywhere else in the directory tree is another file with the same md5sum (but maybe under a different name). How can I do that in bash?

    P.S.: To emphasize, this should work for the entire tree below a given directory, i.e. must work recursively not just in the current directory.

  • Answers
  • slhck

    Using find to recursively test all files:

    find . -type f -exec \
    bash -c 'md5sum "$0" | grep -q 2690d194b68463c5a6dd53d32ba573c7 && echo "$0"' {} \;
    

    Here, md5sum prints the MD5 sum followed by the file name. You need to grep its output for the sum you are looking for, since md5sum has no switch that prints the sum alone.

    You can check the MD5 sum more easily with md5 if you're on BSD or OS X:

    find . -type f -exec \
    bash -c '[ "$(md5 -q "$0")" = 2690d194b68463c5a6dd53d32ba573c7 ] && echo "$0"' {} \;
    
  • slhck

    Borrowing part of slhck's solution, I came up with:

    find . -type f -print0 | while IFS= read -r -d '' f;
    do
     md5sum "$f" | grep -- "$1"
    done
    

    Here, $1 is the first argument to the script. To check for a missing argument, start the script with:

    if [ -z "$1" ]
      then
        echo "No argument supplied" >&2
        exit 1
    fi
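
    The check above only catches a missing argument. A sketch of a stricter check, assuming the argument is expected to be a 32-character hexadecimal MD5 digest (is_md5 is a hypothetical helper name, not an existing command):

```shell
# is_md5 (hypothetical helper): succeed only if $1 looks like an MD5 digest,
# i.e. exactly 32 hexadecimal characters.
is_md5() {
    case $1 in
        # Reject anything containing a non-hex character.
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    [ "${#1}" -eq 32 ]
}
```

    A script could then start with a line such as:

        is_md5 "$1" || { echo "Usage: $0 MD5SUM" >&2; exit 1; }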
    
  • David Foerster

    The other solutions are good, but I want to propose one that spawns fewer processes, which should be significantly faster when there are many small files. If you have GNU find:

    find /path/to/tree -type f -exec md5sum \{\} + | sed -nre 's/^md5-to-search-for  //p'
    

    or without GNU find:

    find /path/to/tree -type f -print0 | xargs -r -0 -- md5sum | sed -nre 's/^md5-to-search-for  //p'
    

  • Related Question

    unix - Get md5sum without file name?
  • C. Ross

    I need to get the md5 sum of a file on AIX, but the md5sum program prints the sum followed by the name of the file.

    How can I get the sum without the file name?


  • Related Answers
  • heavyd
    md5sum < filename
    

    Because md5sum reads from standard input here, it has no file name to print; GNU md5sum, at least, shows - in its place.

  • Oliver Salzburg
    md5sum filename | awk '{print $1}'
    

    That would be one way.