CYRIN In The Browser
In order to rate the readability of text on a web page, Can You Read It Now has to do all of the following:
- Get access to the page contents
- Find the text to be analyzed
- Get the displayed style of each piece of text
- Calculate ratings for each aspect of readability
- Display the ratings to the user
Accomplishing all of this with one click from the user required making several different trade-offs and design decisions. Let’s cover each of these in turn.
Get access to the page contents
The simplest way to give CYRIN access to the current document would be to add a script reference directly to the page being tested, but that would mean editing every page you want to rate, which is tedious and unnecessary. A bookmarklet offers another way to run third-party JavaScript in the context of the current document. As the name suggests, a bookmarklet is JavaScript that is saved and executed like a bookmark.
One pain point of developing bookmarklets is that you can’t update the code once a user has saved it. This can be mitigated by saving only a small loader that pulls the real, updatable script from a hosted location (like canyoureaditnow.com).
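A minimal sketch of that loader approach might look like the following. The script path and the `buildBookmarklet` helper are hypothetical names used for illustration; the post does not say where the hosted script actually lives or how the real bookmarklet is generated.

```javascript
// Hypothetical location of the hosted script; the real path is not given here.
const HOSTED_SCRIPT_URL = 'https://canyoureaditnow.com/cyrin.js';

// Build the string that would be saved as the bookmark's URL.
function buildBookmarklet(scriptUrl) {
  // The saved code only injects a <script> tag, so everything else can be
  // updated on the server without users re-saving the bookmark.
  const loader =
    '(function(){' +
    'var s=document.createElement("script");' +
    // Appending a timestamp is one way to avoid a stale cached copy.
    's.src=' + JSON.stringify(scriptUrl) + '+"?t="+Date.now();' +
    'document.body.appendChild(s);' +
    '})();';
  // Percent-encode the code so it survives being stored as a URL.
  return 'javascript:' + encodeURIComponent(loader);
}
```

Calling `buildBookmarklet(HOSTED_SCRIPT_URL)` returns a `javascript:` URL that could be used as the `href` of a link for users to drag to their bookmarks bar.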
Find the text to be analyzed
So, now that the JavaScript is running in the right context, how do we find the text to analyze? Having seen the Instapaper and Readability bookmarklets in action, I knew it was possible to find a page’s primary text without user input, but I wasn’t sure how to implement something like that myself. Fortunately for me, the original Readability bookmarklet is open source, and I was able to reuse its working solution.
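To give a rough feel for this kind of content detection, here is a deliberately simplified heuristic. It is not Readability's actual algorithm and not necessarily what CYRIN does; it only assumes that the main content is the element whose direct paragraph children hold the most text. The function names are made up for this sketch.

```javascript
// Total length of the text held in the direct <p> children of an element.
function paragraphTextLength(el) {
  let total = 0;
  for (const child of Array.from(el.children)) {
    if (child.tagName === 'P') {
      total += child.textContent.trim().length;
    }
  }
  return total;
}

// Walk the tree and return the element with the most paragraph text.
function findMainTextContainer(root) {
  let best = null;
  let bestScore = 0;
  const stack = [root];
  while (stack.length > 0) {
    const el = stack.pop();
    const score = paragraphTextLength(el);
    if (score > bestScore) {
      best = el;
      bestScore = score;
    }
    stack.push(...Array.from(el.children));
  }
  return best; // null when no element has any paragraph text
}
```

In a browser this would be called as `findMainTextContainer(document.body)`. A production-quality detector also has to deal with comment threads, sidebars, and link-heavy blocks, which is a good reason to reuse an existing solution.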
Get the displayed style of each piece of text
Once we’ve got a handle to the container of the text to analyze, we can use jQuery’s css method to grab the computed CSS properties of each element.
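As an illustration, reading styles through jQuery's `.css()` could look something like this. The `readTextStyle` helper and the particular list of properties are assumptions made for the example; the post does not list which properties CYRIN actually reads.

```javascript
// Collect a few computed style values from a jQuery-wrapped element.
// The choice of properties here is illustrative only.
function readTextStyle($el) {
  return {
    // .css() returns computed values as strings, e.g. "16px".
    fontSizePx: parseFloat($el.css('font-size')),
    lineHeight: $el.css('line-height'),
    color: $el.css('color'),
    backgroundColor: $el.css('background-color'),
    fontFamily: $el.css('font-family'),
  };
}

// Possible usage in the page, assuming jQuery is loaded and `container`
// is the element found in the previous step:
//   $(container).find('p').each(function () {
//     const style = readTextStyle($(this));
//   });
```

Because `.css()` reports the computed value rather than what the stylesheet declares, inherited and cascaded styles are already resolved by the time they are read.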
Calculate ratings for each aspect of readability
Style data from each piece of text and its nearby elements is used to calculate ratings, which are then weighted by the share of the document’s character count that the text represents. This is how a tree of DOM elements gets rated and reduced to one set of scores for the entire page.
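The weighting step can be sketched as a character-count-weighted average. The aspect names (`size`, `contrast`), the 0 to 100 scale, and the `combineRatings` function are assumptions made for this example rather than CYRIN's real data model.

```javascript
// Combine per-piece ratings into one set of page-level scores.
// Each piece looks like: { charCount: 300, ratings: { size: 100, contrast: 50 } }
function combineRatings(pieces) {
  const totalChars = pieces.reduce((sum, piece) => sum + piece.charCount, 0);
  const combined = {};
  if (totalChars === 0) return combined; // nothing to rate

  for (const piece of pieces) {
    // A piece's weight is its share of all rated characters.
    const weight = piece.charCount / totalChars;
    for (const [aspect, rating] of Object.entries(piece.ratings)) {
      combined[aspect] = (combined[aspect] || 0) + rating * weight;
    }
  }
  return combined;
}
```

For example, a 300-character body paragraph rated 100 for size and a 100-character caption rated 20 combine to a page-level size score of 80, since the body text carries three quarters of the weight.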
Display the ratings to the user
Now we have a set of scores to display within the document that we just finished rating. To prevent the host page’s styles from affecting how the scores look, they are displayed in an iframe, where only the directly included styles apply. This also allows the markup to be served by the CYRIN server rather than embedded in the JavaScript.
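One way to wire this up is sketched below. The results URL, the idea of passing scores as query parameters, and the `showRatings` function are all assumptions for illustration; the post does not describe how the scores actually reach the server-rendered markup.

```javascript
// Hypothetical URL of a server-rendered results page.
const RESULTS_URL = 'https://canyoureaditnow.com/results';

// Create an iframe showing the ratings and attach it to the given document.
function showRatings(doc, ratings) {
  // Assumption: scores are handed to the server as query parameters.
  const query = Object.entries(ratings)
    .map(([aspect, value]) =>
      encodeURIComponent(aspect) + '=' + encodeURIComponent(value))
    .join('&');

  const frame = doc.createElement('iframe');
  frame.src = RESULTS_URL + '?' + query;
  // Inline styles position the frame element itself; the document inside
  // the frame is styled only by its own CSS.
  frame.style.cssText =
    'position:fixed;top:10px;right:10px;width:320px;height:240px;' +
    'border:0;z-index:2147483647;';
  doc.body.appendChild(frame);
  return frame;
}
```

In the page this would be called as `showRatings(document, scores)`. Taking the document as a parameter keeps the function easy to exercise outside a browser.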
Wrapping up
That’s all I have to say about CYRIN for now. If you have any questions feel free to contact me at kevin.gorski@gmail.com.
Until next time.