web.onassar.com Archive

I can be reached at onassar@gmail.com.

For my open source work, check out github.com/onassar

Compiling C HTMLDiff program on Ubuntu 12.04


One of the open source libraries I listed in my post HTMLDiff Software Discoveries is written in C, so here are the steps I went through to compile it.

The library, htmldiff, finds the differences (deletions and insertions) between two documents.

The download contains C source code. Here are the steps I took to compile it and make it usable:

wget http://www.w3.org/2003/01/BB_htmldiff-0.4.tar.gz
tar -xvzf BB_htmldiff-0.4.tar.gz
cd htmldiff-0.4/
make check
sudo make install

Then, to run the program with its default settings, use the following command:

./htmldiff sample-file-1.html sample-file-2.html
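
To keep the result around, the command can be wrapped so its output lands in a file. This is only a sketch under a few assumptions: that htmldiff writes the diff to standard output and uses the same exit statuses as wdiff (0 for identical input, 1 when differences are found, anything else for an error), neither of which I have confirmed for this build. The save_diff function, the HTMLDIFF variable and the output file name are placeholders of my own, not part of the program.

```shell
#!/bin/sh
# Path to the compiled binary; override with HTMLDIFF=/path/to/htmldiff.
HTMLDIFF="${HTMLDIFF:-./htmldiff}"

# save_diff OLD NEW OUT
# Runs the comparison and writes whatever the program prints to OUT.
# Assumption: exit status follows wdiff (0 = same, 1 = differences, other = error).
save_diff() {
    "$HTMLDIFF" "$1" "$2" > "$3"
    ret=$?
    case "$ret" in
        0) echo "no differences" ;;
        1) echo "differences written to $3" ;;
        *) echo "htmldiff failed with status $ret" >&2 ;;
    esac
    return "$ret"
}
```

For example: save_diff sample-file-1.html sample-file-2.html diff.html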

There are many optional flags. Running ./htmldiff --help lists them:

wdiff - Compares words in two files and report differences.

Usage: ./htmldiff [OPTION]... FILE1 FILE2
Mandatory arguments to long options are mandatory for short options too.

  -C, --copyright            print copyright then exit
  -K, --no-init-term         like -t, but no termcap init/term strings
  -V, --version              print program version then exit
  -1, --no-deleted           inhibit output of deleted words
  -2, --no-inserted          inhibit output of inserted words
  -3, --no-common            inhibit output of common words
  -a, --auto-pager           automatically calls a pager
  -h, --help                 print this help
  -i, --ignore-case          fold character case while comparing
  -l, --less-mode            variation of printer mode for "less"
  -n, --avoid-wraps          do not extend fields through newlines
  -p, --printer              overstrike as for printers
  -s, --statistics           say how many words deleted, inserted etc.
  -t, --terminal             use termcap as for terminal displays
  -w, --start-delete=STRING  string to mark beginning of delete region
  -x, --end-delete=STRING    string to mark end of delete region
  -y, --start-insert=STRING  string to mark beginning of insert region
  -z, --end-insert=STRING    string to mark end of insert region

Report bugs to <wdiff-bugs@iro.umontreal.ca>.
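
As an illustration of how these options combine, the sketch below ignores case differences and wraps deleted and inserted words in HTML tags. It uses only flags from the help output above. The marked_diff function, the HTMLDIFF variable and the choice of del and ins tags are placeholders of my own, and I have not checked how this build mixes custom markers with its own HTML handling.

```shell
#!/bin/sh
# Path to the compiled binary; override with HTMLDIFF=/path/to/htmldiff.
HTMLDIFF="${HTMLDIFF:-./htmldiff}"

# marked_diff OLD NEW
# Compares two files case-insensitively and marks changes with HTML tags.
# The tag choice is an assumption; any strings can be passed as markers.
marked_diff() {
    "$HTMLDIFF" \
        --ignore-case \
        --start-delete='<del>' --end-delete='</del>' \
        --start-insert='<ins>' --end-insert='</ins>' \
        "$1" "$2"
}
```

For example: marked_diff sample-file-1.html sample-file-2.html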

I'm trying to configure the program to be a little more specific to my needs at the moment. I'll update this post if I find a way to make the diff process a little less sensitive.