Python HTMLDiff Usage Review
Next up in my review of HTMLDiff projects is a Python variant, which can be found on GitHub as cygri/htmldiff.
- Much faster than the PHP library I tried, Daisy Diff. Not as quick as the C HTMLDiff library I used, but quick enough to use
- Doesn't index non-essential data, such as attribute values in the
- Easy API to use: simply using shell_exec from PHP gets me pretty far
- Speed doesn't yet match the C library I used
- It strips HTML comments: this is especially problematic, because it strips out cases such as:
As you may notice, this is valid styling that ought to stay in the document, but is instead stripped out.
- Does a really good job at detecting changes
- Has the ability to detect, and highlight, added/removed tags
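The shell-out approach from the list above (shell_exec in PHP) has a close equivalent in Python's subprocess module. This is a minimal sketch, not the project's documented interface: the command name `htmldiff`, the argument order (old file, then new file), and the diff arriving on stdout are all assumptions to verify against the project's README. `run_htmldiff` is a hypothetical helper name.

```python
import subprocess


def run_htmldiff(old_path, new_path, command=("htmldiff",)):
    """Run a diff command on two HTML files and return its stdout as text.

    ASSUMPTION: the default command name and the "old new" argument order
    are illustrative; adjust `command` to match your installation.
    """
    result = subprocess.run(
        [*command, old_path, new_path],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Passing the arguments as a list avoids the shell entirely, so file names with spaces or quotes need no escaping, which is the main thing to watch for when doing the same with shell_exec.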
I'm currently using this on my project. Because it strips comments from the document, there's a good chance I won't use it in the end, but if that's not an issue for you, or you can control your markup so that it contains no HTML comments, this library has my blessing.
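To judge whether comment stripping would affect a given document, one option is to scan the markup for HTML comments before diffing. The sketch below is a rough textual check with a regular expression, not a full HTML parser, and the sample markup is made up for illustration; `find_html_comments` is a hypothetical helper, not part of the library.

```python
import re

# Matches <!-- ... --> across line breaks. A crude textual check:
# it does not understand scripts, CDATA, or malformed markup.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def find_html_comments(markup):
    """Return every HTML comment found in the markup, in document order."""
    return COMMENT_RE.findall(markup)


# Made-up sample: styling wrapped in a comment inside a <style> element.
sample = "<style><!-- p { color: red; } --></style><p>Hello</p>"
comments = find_html_comments(sample)
```

An empty result suggests the document has nothing for the comment stripping to remove; a non-empty one lists the spans worth inspecting by hand.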