Prosody Templating — Odysseus Development Blog

O.K., that’s enough ranting about the W3C. Especially since I believe what they’re doing is extremely valuable in that it allows not just one but two vibrant markets to survive — websites and browsers. Their open standards not only allow people to easily publish their knowledge and entertainment for pennies (if that), but it also allows new operating systems to incorporate that value in a way that best works with their User eXperience (UX). Ofcourse it also allows third parties to succeed in building their own browsers and thereby keep us¹ on our toes, despite the lack of value they tend to place on a native UX.

If this page looks too wordy for you, the summary is nice and brief.

Prosody Templating Language

So now let’s discuss Odysseus itself, specifically a templating language I wrote for it I call “Prosody”. I use this language to generate internal pages for Odysseus (error pages, view source, and whatnot), and hope to one day soon integrate it with SQLite to aid the development of useful navigation aids. It’s syntax is based on Django’s but with it’s own set of tags and filters.

The reason I went for a templating language over, say, a Gtk.Stack is to give a more webby feel to several relatively peripheral UI components. And I used a Django-like syntax as it was easy to extend with new tags and filters, such as (eventually) a query tag which evaluates contained SQL against an SQLite database.

How does Prosody work?

At the core of Odysseus is a concept of a “smartsplitter”. The smartsplitter splits some text into tokens by some seperator, providing that seperator isn’t in a quoted string. After several attempts at implementing this using regular expressions I settled on utilizing two loops — an inner loop which scans quoted strings and an outer loop which scans to the next seperator.

Around that it was trivial to scan for “{“ charactors, parse tags/variables, parse their arguments, and parse variable expressions complete with filters. From here parsing was done by having the tokenizor dispatch to factory classes (similar to how it’s implemented in Django) which may then recurse back into the tokenizor to implement block tags like for/in or even more commonly call into the variable scanner/parser.

Evaluating Prosody templates is done the theoretically slow yet obvious OO way — method calls on it’s AST nodes. However some (perhaps premature) optimization was done in that the parser and interpretor both avoids reallocating strings. This was done because whereas passing pointers around is fast, reallocating strings not only involves copying bits from one location to another but also finding where there’s enough room in RAM to do so. While GLib (particularly for fixed-size objects) and GNU LibC put a good amount of effort into making this fast, a modern CPU will always find copying pointers around faster².

Also that optimization was put in because I was fealing guilty about putting an unoptimized piece of code between two extremely well-tested and optimized codebases in WebKit and SQLite.

Those AST nodes in turn call into interfaces representing the datamodel (which mostly serves to coerce data between dictionaries, lists, numbers, and text) and an output stream. The main thing that Prosody output stream adds over GIO’s is that it hides any errors that may occur, thereby simplifying the interpretor. Hiding errors may generally be a bad idea, but here the errors shouldn’t occur and there’s very little the interpretor can and should do about errors except pass them on.

At this layer measures are taken to ensure that loading a template will not freeze Odysseus’s UI. Specifically these measures involve setting the mainloop priority for template rendering to be really low. Prosody can offord this slowdown because simply by being on the local computer it is otherwise quite fast, and as such it should make way for real webpages with already high latencies due to the Internet.

WebKit integration

It’s all great to have a templating language but the whole point of this is to get a webby aesthetic. A large part of achieving that is to make sure WebKit can fetch these dynamically generated pages by a URL.

So WebKit is told to load odysseus: URLs using a small wrapper around Prosody which does a few things:

Uses libsoup to parse the given URI to use as input to the template.
Loads the template out of a GLib.Resource
LRU caches the AST in case the page will be revisited shortly. This bit is likely suboptimal.
Copies the output string into buffers expected by WebKit. This involves some complicated tap-dancing around idling priorities to get the components to roughly take turns.
Report errors using dedicated templates.

From the list above several components of the Prosody wrapper has to do with performance (use of resources, caching, and the way it provides it’s output), but the GLib.Resource deserves more explanation. GLib.Resource is a library and developments tools which allows a file to be editted seperately from an application’s source code but be compiled into the executable file which is distributed to end-users. This is faster as when Odysseus is loaded by the operating system into RAM in order to be ran, all the templates it’ll need will already be in memory. Hence a slow extra disk access will be unnecessary, and the Prosody core benefits by needing to copy even fewer bytes around in RAM.

Emperical measures of performance have not been taken, but I’ve found that these pages load unnervingly fast for someone who is used to network latency.

Testing Prosody

An additional test tag was added to Prosody which thoroughly exercises the scanner in order to run a unit test. Place the output of this tag in HTML with a test-report³ tag at the bottom and you get a really nice testrunner. Place within the WebKit integration framework so it can be accessed (within Odysseus) at odysseus:test and it becomes even nicer.

Overall the implementation of this is quite straightforward with some (evil) global counters used to communicate between tests and test-report. Some lengthy string concatenation is used to generate the output of these tags, but that’s about it until we get to “diffing”.

Diffing is a nice little UI enhancement added to the testrunner which makes it easier to see where the output differs from the expected value. The output isn’t that difficult there either, as I can simply insert HTML tags into the output stream and let WebKit take it from there, but generate the diff does take some algorithmic thinking⁴.

Specifically the technique is to calculate the maximum number of characters that could be construed as matching for a pair of strings A and B given the answers to that problem where A and/or B have one less character. These answers are cached in an in-memory grid to inform the answers to the longer strings. This grid also helps a subsequent analyzer to trace an initial diff back from the bottom-right corner to the topleft corner. Once that is done runs of text originating from the same file are merged and rendered to HTML.

Needless to say this can be a slow process and it becomes slower as the pairs of text increase in length. As such it is an obvious performance win to perform a more naive equality check before running the full diff engine. It could also be a win to operate in terms of lines instead of individual characters⁵, but that takes more programming effort which hasn’t been worth it. Especially since this feature is for me and other potential Odysseus developers, rather than end-users.

Internationalizing Prosody templates

It is good programming practice to prepare for applications to be used by people who don’t speek English (or whatever the developer’s native language is). As such I added a trans tag which looks up translations of a contained template in a catalogue file. Currently this catalogue file is parsed using the same parser as normal templates, but that should probably be changed to something that works best for both Odysseus and l10n platforms like Transifax.

A Python script is used to extract the templates inside trans tags in order to be translated. The same script then merges the translations into previously translated strings.

An early attempt tried to use Gettext to power the translations, but that worked against Prosody wanting to reuse text already in memory and caused a significant number of segfaults. Also there’s currently an attempt at implementing a translation platform for these template fragments, but that’s not the most valuable aspect of Odysseus for me to focus on.

Expressions

One interesting thing about natural languages is that different languages have different plural forms. As such to accurately translate text there needs to be a way to teach the computer that language’s plural rules, say using a mathematical expression (not dissimilar to a JavaScript, Vala, SQL, or Python expression). Furthermore similar expressions are useful in if tags when writing templates, as they allow developers to cleanly embed a little bit more code in the template.

To parse these expressions Prosody starts by taking a stream of smartsplitter tokens and classifying them on-demand. Thanks to the fact we use UTF-8 and all operators are at most 4 ASCII characters long (that is all operators are at most 4 bytes A.K.A. 32 bits), we can easily load a token into a CPU register and test which operator it is there. However if the token does not fit in the register (that is it has more than 4 characters) or if it does not match any of our hardwired operators we know it’s a variable.

From there Prosody uses Top Down Operator Precedance parsing, where once again the AST is interpreted once again by calling methods on it. These methods in turn call some dynamic properties to recurse over the AST and apply some subtlety from a simple type system⁶.

A shoutout to Highlight.js

It was a great aid to me as an Odysseus developer in some debugging pages made specifically for me, and it is a great aid in making the view-source pages convenient.

UX design

Everything discussed above makes my life as an Odysseus developer easier, but how does that benefit end-users? Afterall developers should first focus on solving user’s problems, and then any problems that makes for them.

For all it’s inconsistancy the Web is very strong on decent (and now with WebFonts good) typography, skimmability, and page-to-page navigation (via links and forms). Given that navigation aids (whether they be history entries, topsites, favorites, or suggested websites), largely involve showing the user links to click. Links are well-known to those who surf the Web but are near non-existant in other elementary apps. As such a Web UX does more to get users acustomed to these tools than an elementary one does.

As for error messages those mostly involve showing a textual message in place of the page which failed to load, and some navigation aids (to websites and/or elementary settings panels) away from the error condition. Once again this falls squarely within what I’ve already described the Web as being good at.

The catch here is that the Web provides less design guidance than elementary, but that can be filled in with elementary’s icons and design philosophy.

Finally it’s hard to argue that combining these two user experiences hurts Odysseus’s, as that’s essentially it’s job. To show the Web to elementary fans in chrome they’re already used to ignoring in other elementary apps.

Why does View Source open in Odysseus, not Scratch?

Scratch excels at showing editable and syntax-highlighted text files from the filesystem, whereas Odysseus as a webbrowser excels at showing immutable rich text off the network. As such while I could have gotten syntax highlighting for free with Scratch, it seemed a tad more appropriate to use Odysseus.

At the same time however Odysseus uses Audience to play videos, and Evince to view PDFs. This is atypical behaviour for a web browser, but I still got it working very smoothly using FreeDesktop.org standards. I’m aware it doesn’t work with authenticated files, and am open to discussing a fix with elementary and GNOME.

Summary

Odysseus already has and will have several peripheral UI components which benefits from the Web’s strengths at skimmable typography and (hyper-)linking. To help those UI components take full advantage of this UX which is already present in Odysseus an internal templating language (based on Django’s) was developed for Odysseus and hooked into WebKit.

This templating language is primarily implemented using:

A “smartsplitter” which splits text by a seperator given it’s not in a string literal.
A reentrant scanner dispatching parsing to factory classes.
Dynamic dispatch interpretor with an implicitly coercive datamodel.
An output stream interface which hides any errors and yields execution to user interactions and pages loaded off the network.

Furthermore in integrating it with WebKit to be fetched via URIs, more optimizations were added:

Templates are loaded out of the executable file, but maintained seperately from the source code.
Parsed templately are least-recently-used cached.

A testrunner and testsuite was developed using a pair of custom template tags and some global counters to inject HTML into an output stream. This testrunner even includes a basic diffing engine to make it easier to spot failures.

Internationalization was added using another template tag with a Python script to extract template fragments for translation.

Yes, I know I’m fantasizing and Epiphany (GNOME Web) is currently elementary’s browser. But Epiphany is designed for a GNOME 3 UX whereas Odysseus is designed for an elementary UX. Just look at where we put our New Tab buttons!
Especially since pointers can be kept in high-speed “registers” so modern CPUs don’t have to get slowed down as much by RAM.
test-report generates a quick summary absolutely positioned in the topright corner and some JavaScript to set the tab title and favicon.
Wikipedia does seem to excel at documenting computer science concepts.
This is done by real diff engines not only for performance but also to give more readable output.
This helps correct some pecularities that may otherwise occur with the variable coercion.

Odysseus

22nd July 2017 — Adrian Cochrane