Posts in "HTML" category

Comparison of HTML5 Parsers: Gumbo vs html5lib

 July 29, 2016

When developing a content plugin for the Kodi media center, the most important question is where to get the content from. One possible way is to scrape websites that host multimedia content. Yes, the legality of that content is another question, but legal matters are beyond the scope of this post.

In the Python world, the BeautifulSoup library (BS for short) in combination with the html5lib parser is a popular choice. However, according to the BeautifulSoup documentation, html5lib is the slowest, albeit the most reliable, of all the HTML parsers it supports. So I googled for alternatives and found the Gumbo parser, made by Google itself. According to its description, it is fully HTML5-compliant and written in pure C99 with no external dependencies. It also has Python bindings compatible with popular Python HTML parsing libraries, including BeautifulSoup. The BeautifulSoup binding was written for BS 3, but making it compatible with BS 4 was relatively easy, so I did that and submitted a pull request on GitHub (which seems to have been ignored by the repo maintainers).  (Read more...)
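
A rough way to check the speed difference on your own machine is a small timing harness. The sketch below is only an illustration, not the benchmark from the full post: `SAMPLE_HTML`, `LinkCollector`, `build_candidates` and `benchmark` are hypothetical names, the sample markup is synthetic, and the standard-library `html.parser` serves as a baseline so the script runs even without BeautifulSoup. BeautifulSoup-backed parsers are added only if they happen to be installed. Gumbo is left out because its Python binding is not shown in this excerpt.

```python
import timeit
from html.parser import HTMLParser

# Synthetic sample page (an assumption for illustration, not real scraped content).
SAMPLE_HTML = (
    "<html><body>"
    + "<div class='item'><a href='/v/1'>Video</a></div>" * 200
    + "</body></html>"
)


class LinkCollector(HTMLParser):
    """Standard-library baseline: collects href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def parse_with_stdlib(markup):
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


def build_candidates():
    """Return {label: parse_function} for every parser that is available."""
    candidates = {"html.parser (stdlib)": parse_with_stdlib}
    try:
        from bs4 import BeautifulSoup
    except ImportError:
        return candidates  # BeautifulSoup is not installed; keep the baseline only

    for name in ("html.parser", "lxml", "html5lib"):
        def parse(markup, _name=name):
            soup = BeautifulSoup(markup, _name)
            return [a["href"] for a in soup.find_all("a", href=True)]

        try:
            parse("<a href='/x'>x</a>")  # probe: fails if this parser is missing
        except Exception:
            continue
        candidates["BeautifulSoup + " + name] = parse
    return candidates


def benchmark(markup, repeat=3):
    """Return the best wall-clock time in seconds for each available parser."""
    results = {}
    for label, parse in build_candidates().items():
        timings = timeit.repeat(lambda: parse(markup), number=1, repeat=repeat)
        results[label] = min(timings)
    return results


if __name__ == "__main__":
    for label, seconds in sorted(benchmark(SAMPLE_HTML).items(), key=lambda kv: kv[1]):
        print(f"{label:30s} {seconds * 1000:8.2f} ms")
```

Taking the minimum over several runs reduces noise from other processes. Real pages with broken markup are a better test than this tidy sample, since that is where the parsers differ most.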


Spoiler Plugin for TinyMCE 4

 Feb. 11, 2016

I'm still polishing this blog's CMS. One of the major "under the hood" improvements I've made recently is upgrading the TinyMCE content editor from v.3 to v.4. For that I've done a major rework of the django-tinymce4 package and fixed almost all features, including the spellchecker and integration with the django-filebrowser / django-filebrowser-no-grappelli packages. I'm going to submit a pull request with my changes, but that will come later, when I get everything in order, including updated documentation for django-tinymce4.

As part of this improvement, I've developed a spoiler plugin for TinyMCE 4 that lets you add spoiler blocks to authored text, that is, blocks of text that are initially collapsed and can be expanded with a mouse click.  (Read more...)