Hash-based Diff for Directories

Recently I was working on a project where I needed to quickly and reliably detect changes to the contents of a directory, and run a series of commands whenever a change was detected.

There are any number of file comparison tools, the venerable [diff][1] chief among them, and any of them would do the job. They would do a very thorough job, comparing every line of every file and showing exactly what changed and where. But for what I needed, this seemed like overkill.

Ultimately what I needed to know was ***if*** something had changed, not specifically ***what*** had changed. To that end, I realized I needed a view of the directory, not of the files themselves: had a file been changed, added or removed? Comparing a directory listing against an earlier one, I could easily see when something had changed. And then it dawned on me: I could solve this with a hash.

The [MD5][2] hash is a fairly simple and very fast hashing function which takes any input it is given and generates a fixed-length hash value. macOS and the BSDs include an `md5` command that can be run from the command line and prints the hash value as a string; most Linux distributions ship the equivalent `md5sum` instead. By capturing the hash value of the directory listing and comparing it each time the script is run, it becomes fairly easy to see when something has changed.

To make this work, I just needed to pipe the listing of my directory from [ls -la][3] into the md5 command and save the resulting string to a file.

```bash
ls -la | md5
```
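
To see the idea in action, here is a rough demonstration in a throwaway directory. The helper name `listing_hash` and the temporary paths are my own inventions for illustration; the sketch assumes `mktemp -d` is available and that either `md5sum` or `md5` is installed.

```bash
#!/bin/bash

# Hypothetical helper (not from the original script): prints the MD5 digest
# of the long listing of the directory given as $1.
listing_hash() {
    if command -v md5sum >/dev/null 2>&1; then
        ls -la "$1" | md5sum | awk '{print $1}'   # GNU coreutils
    else
        ls -la "$1" | md5                         # BSD/macOS
    fi
}

# Throwaway directory for the demonstration.
workdir=$(mktemp -d)
mkdir "$workdir/watched"

before=$(listing_hash "$workdir/watched")
touch "$workdir/watched/newfile.txt"    # simulate a change
after=$(listing_hash "$workdir/watched")

if [ "$before" != "$after" ]; then
    echo "Listing changed"
else
    echo "Listing unchanged"
fi

# Clean up the throwaway directory.
rm -rf "$workdir"
```

Adding the file adds a line to the listing, so the two digests differ even though no file contents were read.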

The final logic for the script looked something like this. I’ve extracted it here, leaving out the bulk of the script, which is all of the actions being run.

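```bash
#!/bin/bash

hashfile="/path/to/lastrunhash.md5"
postdir="/path/to/source/directory/"

# Read the hash saved by the previous run; stays empty on the first run.
lasthash=""
if [ -f "$hashfile" ]; then
    lasthash=$(cat "$hashfile")
fi

# Hash the current directory listing (use md5sum where md5 is unavailable).
thishash=$(ls -la "$postdir" | md5)

echo "Last Hash: $lasthash"
echo "This Hash: $thishash"

if [ "$lasthash" != "$thishash" ]
then
    echo "Directory value has changed"
    echo "Do your actions here…"
    # Remember the new hash for the next run.
    echo "$thishash" > "$hashfile"
else
    echo "Match!"
fi
```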
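
One caveat: the listing only reflects names, sizes, permissions and timestamps of the top-level entries, so an edit that keeps a file's size and lands within the same displayed timestamp, or a change deep inside a subdirectory, may go unnoticed. If that matters, a heavier variant can hash the file contents instead. The sketch below is an illustration rather than what my script does: `content_hash` is a hypothetical name, it assumes GNU `md5sum` and `find` with `-exec … +`, and it does not handle file names containing newlines.

```bash
#!/bin/bash

# Hypothetical helper (not part of the original script): prints one MD5
# digest summarizing the contents and relative paths of every regular file
# under the directory given as $1. Assumes GNU md5sum.
content_hash() {
    (
        cd "$1" || exit 1
        # One "<digest>  <path>" line per file, sorted for a stable order,
        # then hashed again to get a single value.
        find . -type f -exec md5sum {} + | LC_ALL=C sort | md5sum | awk '{print $1}'
    )
}

# Usage: pass a directory as the first argument to print its content hash.
if [ "$#" -gt 0 ]; then
    content_hash "$1"
fi
```

This reads every file on every run, so it trades away the speed that made the listing approach attractive in the first place; it can be dropped in place of the `ls -la` pipeline only where that cost is acceptable.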

[1]: https://www.gnu.org/software/diffutils/
[2]: http://linuxcommand.org/man_pages/md51.html
[3]: http://linuxcommand.org/man_pages/ls1.html