bash - Finding a file by md5sum
2014-07
Given the md5sum of a file, I want to know if anywhere else in the directory tree is another file with the same md5sum (but maybe under a different name). How can I do that in bash?
P.S.: To emphasize, this should work for the entire tree below a given directory, i.e. must work recursively not just in the current directory.
Using find to recursively test all files:
find . -type f -exec \
bash -c 'md5sum "$0" | grep -q 2690d194b68463c5a6dd53d32ba573c7 && echo "$0"' {} \;
Here, md5sum outputs the MD5 sum and the file name. You need to grep it for the actual MD5 sum, as there is no switch to have it output the sum alone.
You can check the MD5 sum much more easily with md5 if you're on BSD or OS X:
find . -type f -exec \
bash -c '[ "$(md5 -q "$0")" = 2690d194b68463c5a6dd53d32ba573c7 ] && echo "$0"' {} \;
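To bridge the md5sum / md5 difference described above, here is a minimal sketch of a wrapper that prints only the digest on either kind of system. The name md5_of is made up for illustration, and the sketch assumes that a machine without md5sum has the BSD md5 with its -q flag.

```shell
# md5_of FILE: print only the hex digest of FILE (hypothetical helper).
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        # md5sum prints "<digest>  <name>"; keep the first field only.
        md5sum < "$1" | cut -d ' ' -f 1
    else
        # Assumption: BSD / OS X md5, where -q prints the digest alone.
        md5 -q "$1"
    fi
}
```

With that in place, a test such as [ "$(md5_of "$file")" = 2690d194b68463c5a6dd53d32ba573c7 ] should behave the same on both platforms.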
Borrowing some of the solution from slhck, I came up with:
find . -type f -print0 | while IFS= read -r -d '' f;
do
    md5sum "$f" | grep "$1"
done
Where $1 is the first argument to the script. If you want to check for a missing argument, start the script with:
if [ -z "$1" ]
then
    echo "No argument supplied" >&2
    exit 1
fi
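Going one step further than the empty-argument check, the argument could also be tested for the shape of an MD5 sum before any file is hashed. The following is only a sketch; normalize_md5 is a hypothetical helper name, and folding to lowercase assumes that md5sum prints lowercase hex, as the GNU version does.

```shell
# normalize_md5 STRING: print STRING in lowercase if it is exactly
# 32 hexadecimal digits; otherwise complain on stderr and fail.
normalize_md5() {
    if [ "${#1}" -ne 32 ]; then
        echo "not an MD5 sum: $1" >&2
        return 1
    fi
    case $1 in
        # Any character outside 0-9, a-f, A-F disqualifies the string.
        *[!0-9A-Fa-f]*) echo "not an MD5 sum: $1" >&2; return 1 ;;
    esac
    # Fold upper-case hex digits so the value matches md5sum output.
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}
```

A script could then begin with want=$(normalize_md5 "$1") || exit 1 and pass "$want" to grep.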
The other solutions are good, but I want to propose one that spawns fewer processes, which should be significantly faster for many small files. If you have GNU find:
find /path/to/tree -type f -exec md5sum {} + | sed -n 's/^md5-to-search-for [ *]//p'
or without GNU find:
find /path/to/tree -type f -print0 | xargs -r -0 -- md5sum | sed -n 's/^md5-to-search-for [ *]//p'
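The same batched idea can be wrapped in a function that takes the directory and the digest as arguments and compares the digest field exactly. This is a sketch rather than a drop-in tool: find_md5_batched is a made-up name, and the fixed offset of 35 assumes the usual md5sum line layout of 32 hex digits, a space, a space or asterisk, then the file name.

```shell
# find_md5_batched DIR DIGEST: hash many files per md5sum call, then
# print the names of those whose digest equals DIGEST.
find_md5_batched() {
    find "$1" -type f -exec md5sum {} + |
        # Appending "" forces a string comparison, so an all-digit
        # digest is never compared as a number.
        awk -v want="$2" '($1 "") == want { print substr($0, 35) }'
}
```

File names containing a backslash or a newline are printed in escaped form by GNU md5sum, with a leading backslash on the line, so this sketch would skip them.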
I need to get the md5 sum of a file on AIX, but the md5sum program prints the sum followed by the name of the file.
How can I get the sum without the file name?
md5sum < filename
This way md5sum has no file name to print (GNU md5sum shows - in its place).
md5sum filename | awk '{print $1}'
That would be one way.
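Another way, sketched here on the assumption that md5sum separates the sum from the name with a space, is to combine the two ideas above and cut the line at the first space with a parameter expansion, which avoids starting awk. The function name digest_only is hypothetical, and the variable out is deliberately left global so that the code stays within plain POSIX sh.

```shell
# digest_only FILE: print the MD5 sum of FILE without the file name.
digest_only() {
    # Reading from stdin keeps the real file name out of the output.
    out=$(md5sum < "$1") || return 1
    # Drop everything from the first space to the end of the line.
    printf '%s\n' "${out%% *}"
}
```

The result can be captured in a variable, for example sum=$(digest_only filename).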