An open source library I listed in my post "HTMLDiff Software Discoveries" is written in C, so I want to list the steps I went through to compile it.
The library, htmldiff, is meant to find differences (deletions and insertions) between two documents.
The download provides C libraries. Here are the steps I took to compile it and make it usable:
wget http://www.w3.org/2003/01/BB_htmldiff-0.4.tar.gz
tar -xvzf BB_htmldiff-0.4.tar.gz
cd htmldiff-0.4/
./configure
make
make check
sudo make install
Then, to run the program, the following command covers the basics:
./htmldiff sample-file-1.html sample-file-2.html
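If you don't have two HTML files handy, here is a rough sketch for trying the command above. The file contents and the temporary directory are made up for illustration, and the script assumes the compiled binary sits in the current directory:

```shell
#!/bin/sh
# Create two small sample files to compare.
# The contents are invented for illustration only.
workdir=$(mktemp -d)
printf '<p>The quick brown fox jumps over the lazy dog.</p>\n' > "$workdir/sample-file-1.html"
printf '<p>The quick red fox leaps over the lazy dog.</p>\n' > "$workdir/sample-file-2.html"

# Assumption: the binary was built in the current directory.
if [ -x ./htmldiff ]; then
    ./htmldiff "$workdir/sample-file-1.html" "$workdir/sample-file-2.html"
else
    echo "htmldiff not found in $(pwd); build it first" >&2
fi
```

The two files differ by two words, so the output should be short enough to read at a glance.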
There are many optional flags you can pass in. Running ./htmldiff --help lists them:
wdiff - Compares words in two files and report differences.

Usage: ./htmldiff [OPTION]... FILE1 FILE2

Mandatory arguments to long options are mandatory for short options too.

  -C, --copyright            print copyright then exit
  -K, --no-init-term         like -t, but no termcap init/term strings
  -V, --version              print program version then exit
  -1, --no-deleted           inhibit output of deleted words
  -2, --no-inserted          inhibit output of inserted words
  -3, --no-common            inhibit output of common words
  -a, --auto-pager           automatically calls a pager
  -h, --help                 print this help
  -i, --ignore-case          fold character case while comparing
  -l, --less-mode            variation of printer mode for "less"
  -n, --avoid-wraps          do not extend fields through newlines
  -p, --printer              overstrike as for printers
  -s, --statistics           say how many words deleted, inserted etc.
  -t, --terminal             use termcap as for terminal displays
  -w, --start-delete=STRING  string to mark beginning of delete region
  -x, --end-delete=STRING    string to mark end of delete region
  -y, --start-insert=STRING  string to mark beginning of insert region
  -z, --end-insert=STRING    string to mark end of insert region

Report bugs to <email@example.com>.
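As a sketch of how a few of these flags might be combined, here are two small wrapper functions. The function names and the HTMLDIFF variable are my own, not part of the program, and I'm assuming this build honors the options exactly as the help listing describes them:

```shell
#!/bin/sh
# Path to the binary. Assumption: it was built in the current directory.
HTMLDIFF=${HTMLDIFF:-./htmldiff}

# Wrap deleted words in <del> and inserted words in <ins>,
# folding character case while comparing (-i in the help listing).
diff_as_html() {
    "$HTMLDIFF" --ignore-case \
        --start-delete='<del>' --end-delete='</del>' \
        --start-insert='<ins>' --end-insert='</ins>' \
        "$1" "$2"
}

# Ask for the word statistics only, inhibiting deleted (-1),
# inserted (-2), and common (-3) words in the output.
diff_stats() {
    "$HTMLDIFF" --statistics -1 -2 -3 "$1" "$2"
}
```

With these defined, something like diff_as_html sample-file-1.html sample-file-2.html > diff.html would write the marked-up result to a file, and diff_stats sample-file-1.html sample-file-2.html would print just the counts. The --ignore-case flag is also one small way to make the comparison less strict.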
I'm trying to configure the program to be a little more specific to my needs at the moment. I'll update this post if I find a way to make the diff process a little less sensitive.