Hash-based Diff for Directories

Recently I was working on a project where I needed to quickly and reliably detect changes to the contents of a directory, and run a series of commands whenever a change was detected.

There are any number of file comparison tools, the venerable diff chief among them, and any of them would do the job. In fact they would do a very thorough job, comparing every line of every file and showing exactly what changed where. But for what I needed to do, this seemed like overkill.

Ultimately what I needed to know was whether something had changed, not specifically what had changed. To that end, I realized I needed a view of the directory, not a view of the files themselves. I needed to know if a file had been changed, added or removed. Comparing a directory listing against an earlier listing, I could easily see that something had changed. And then it dawned on me: I could solve this with a hash.

MD5 is a fairly simple and very fast hashing function which takes any input it is given and generates a fixed-length hash value. Most Unix-like systems include a command-line tool for it (md5 on macOS and the BSDs, md5sum on most Linux distributions) which outputs the hash value as a string. By capturing the hash value of the directory listing and comparing it each time the script is run, it becomes fairly easy to see when something has changed.
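To make the idea concrete, here is a minimal sketch of the property being relied on: the same input always produces the same hash, and even a one-character change produces a different one. The helper name `hash_stdin` is my own invention for illustration, and the fallback from md5 to md5sum assumes one of the two is installed.

```bash
#!/bin/bash

# hash_stdin is a hypothetical helper name: it hashes whatever arrives on
# stdin and prints only the hex digest. It uses md5 where that command exists
# (macOS, BSD) and falls back to md5sum (GNU coreutils) otherwise.
hash_stdin() {
    if command -v md5 >/dev/null 2>&1; then
        md5 | awk '{print $1}'
    else
        md5sum | awk '{print $1}'
    fi
}

first=$(printf 'one two three' | hash_stdin)
again=$(printf 'one two three' | hash_stdin)
changed=$(printf 'one two thre3' | hash_stdin)

# Identical input gives an identical hash; a one-character change does not.
[ "$first" = "$again" ] && echo "same input, same hash"
[ "$first" != "$changed" ] && echo "different input, different hash"
```

The `awk` step is only there because md5sum appends a file name column after the digest; keeping just the first field makes the output identical on both kinds of system.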

To make this work, I just needed to pipe the listing of my directory from ls -la into the md5 command and save the resulting string to a file.

lang:bash ls -la | md5
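As a rough illustration of why hashing the listing is enough, the sketch below hashes a throwaway directory, adds a file, and hashes it again. The `dir_hash` function name and the temporary directory are assumptions made for the demo, not part of my actual script.

```bash
#!/bin/bash

# dir_hash is a hypothetical helper: print the MD5 digest of a directory's
# `ls -la` listing, using md5 (macOS, BSD) or md5sum (GNU coreutils).
dir_hash() {
    if command -v md5 >/dev/null 2>&1; then
        ls -la "$1" | md5 | awk '{print $1}'
    else
        ls -la "$1" | md5sum | awk '{print $1}'
    fi
}

# Demonstration on a throwaway directory.
demo=$(mktemp -d)
before=$(dir_hash "$demo")
touch "$demo/newfile.txt"      # adding a file adds a line to the listing
after=$(dir_hash "$demo")

echo "Before: $before"
echo "After:  $after"
[ "$before" != "$after" ] && echo "Listing changed"
rm -rf "$demo"
```

Because the listing includes names, sizes, permissions and modification times, any change to those shows up as a different hash. The flip side is that an edit which leaves all of them looking the same in the listing would go unnoticed.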

The final logic for the script looked something like this. I've extracted it from the full script, the bulk of which is the actions being run.

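``` lang:bash
#!/bin/bash

# Where the previous run's hash is stored, and the directory to watch
hashfile="/path/to/lastrunhash.md5"
postdir="/path/to/source/directory/"

# Read the stored hash (empty on the first run) and hash the current listing.
# md5 is the macOS/BSD command; on most Linux systems use md5sum instead.
lasthash=$(cat "$hashfile" 2>/dev/null)
thishash=$(ls -la "$postdir" | md5)

echo "Last Hash: $lasthash"
echo "This Hash: $thishash"

if [ "$lasthash" != "$thishash" ]
then
    echo "Directory value has changed"
    echo "Do your actions here..."
    echo "$thishash" > "$hashfile"   # remember the new hash for the next run
else
    echo "Match!"
fi
```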
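If the same check is needed for more than one directory, the logic can be wrapped in a function. This is only a sketch under a few assumptions: `hash_stdin` and `run_if_changed` are names I made up for illustration, and the choice to store the new hash only when the command succeeds is mine rather than something the original script did.

```bash
#!/bin/bash

# Hypothetical helper: hash stdin and print only the hex digest, using
# md5 (macOS, BSD) when available and md5sum (GNU coreutils) otherwise.
hash_stdin() {
    if command -v md5 >/dev/null 2>&1; then
        md5 | awk '{print $1}'
    else
        md5sum | awk '{print $1}'
    fi
}

# Hypothetical helper: run a command only when the directory listing's hash
# differs from the one stored in the hash file.
# Usage: run_if_changed DIR HASHFILE COMMAND [ARGS...]
run_if_changed() {
    local dir=$1 hashfile=$2
    shift 2
    local lasthash thishash
    lasthash=$(cat "$hashfile" 2>/dev/null)   # empty if there is no hash yet
    thishash=$(ls -la "$dir" | hash_stdin)
    if [ "$lasthash" = "$thishash" ]; then
        return 0                              # nothing changed, nothing to do
    fi
    # Store the new hash only if the command succeeded, so that a failed
    # run is retried the next time the check is made.
    "$@" && echo "$thishash" > "$hashfile"
}
```

A call might look like `run_if_changed /path/to/source/directory/ /path/to/lastrunhash.md5 echo "Directory value has changed"`. The hash file should live outside the watched directory, since writing it inside would itself change the listing on every run.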