Two Lines of Code: An Open Source Tale
Series: practices May 14, 2014
January 7, 2014 - 09:35:45 AM
It all started with a JavaScript error…
Uncaught SyntaxError: Unexpected token ILLEGAL
I was trying to catch up on my RSS items, but nothing was rendering on the page.
I dumped some debugging info and opened an issue on GitHub. I found a workaround, but it involved marking all my stories as read. No time to look into this issue now.
February 24, 2014 - 02:17:58 PM
Several other users have reported experiencing the same bug. A potential fix
that involved removing unprintable characters (.gsub(/[^[:print:]]/, '')
) was
proposed but didn’t seem to completely address the issue.
March 27, 2014 - 10:00:15 PM
A comment on the [still unresolved] bug triggered an email notification from GitHub earlier this morning. I had some time to look into it after work.
I went back to my original bug report and tried to create a minimal test case that would reproduce the bug. I opened up the Chrome Dev console and started pasting in chunks of the large string I was trying to parse.
Using a primitive form of git bisect
, I tried the first half of the
string to see if the error happened again. Nope. I halved the remaining part of
the string. I repeated until I had it narrowed down to a few characters.
The string in question was “QNk8n”. Nothing jumps out as being extraordinary about that string.
I pasted it into an irb
session and found the likely culprit:
irb(main):001:0> "QNk8n\U+FFE2\U+FFA8"
Some weird unicode characters were being tacked onto the end!
Googling for “unicode 2028 javascript” led me to a really excellent blog post explaining that JSON is not a true subset of JavaScript.
The long and short of it: u+2028
and u+2029
are valid JSON but not valid
JavaScript. My app was trying to parse the JSON representation of the RSS
articles into JavaScript (via backbone.js) to be rendered.
I wrote a failing test and then fixed the bug (confession: my first bug fix passed the test but created another, whoops).
Pushed. Deployed. Did a little dance.
March 30, 2014 - 06:56:19 PM
I wanted to get this fix upstream. In addition to wanting to give back, I didn’t want to have to implement this “hack” in my own app.
Next up the chain was feedjira
— the gem I was using to parse RSS feeds.
Ultimately, this code probably belonged in loofah
— an HTML sanitization
gem used by feedjira
, but that library seemed to be dormant.
After a brief discussion with maintainer Jon Allured, we both agreed to
try to get the fixes into loofah
. If we couldn’t, we would patch it in feedjira
.
April 6, 2014 - 06:00:01 PM
Finally got around to opening an issue with loofah
. I proposed that we add
code to deal with the Evil JSON Characters as part of loofah
’s sanitization
process.
Project maintainer Mike Dalessio said this fix would be well received and pointed me toward the relevant sections of the codebase.
April 12, 2014 - 06:04:22 PM
Deep dive into the loofah
codebase to add a new “scrubber”!
The loofah
architecture was interesting; the scrubbers are basically parsers
that operate on nokogiri
nodes. You can make a top-down or a bottom-up parser
and you can control when you break out of the tree as you walk the nodes.
With Mike’s initial direction guiding me, I got a working implementation and opened a pull request.
April 21, 2014 - 06:20:36 PM
A friendly ping to Mike and my PR gets merged.
May 9, 2014 - 06:49:54 PM
loofah
version 2.0.0 is released (which includes my fix) and pushed to
RubyGems.
Now that the fix has been applied upstream, we now have to update gem versions downstream.
May 13, 2014 - 04:19:17 PM
I open a new PR in feedjira
to update the loofah
version.
The PR is merged and feedjira
version 1.3.0 is released.
May 14, 2014 - 11:44:35 AM
I can bump the versions of feedjira
and loofah
used in Stringer and I
can finally replace the patch with scrub!(:unprintable)
.
Victory!
So five months later, my two line of code bug fix has made it all the way upstream and then back again! It may not seem like much, but this is the magic of open source.
This bug originally affected a few users of Stringer, but by sending the
patch upstream, thousands of people have benefited (loofah
and feedjira
have over 500k combined downloads).