For the last two weeks we’ve been working steadily on the WebVTT parser. Most of the current work is about getting the parser integrated into Firefox. We’re building on the original bug that Ralph filed on Bugzilla, which now serves as a co-ordination bug for the five bugs we’ve split the work into. Those five bugs are:
- Integrating the webvtt build system into the Firefox build system.
- Adding a captions div to an nsVideoFrame so the captions can be rendered on screen.
- Creating a “Text Track Decoder” that will function as the entry point for Gecko into the webvtt parser.
- Creating new DOM bindings i.e. TextTrack, TextTrackCue, TextTrackCueList.
- Creating DOM tests using Mochitest for the new DOM bindings.
You can check out a more in-depth breakdown of our bug plan here.
The other major thing that a few of us in the class have been engaged in is the review of the 0.4 parser code. The review is still in its early to mid stages, so we have a lot more to do on that. I’ve been participating by filing and commenting on issues and fixing a few of the bugs that have surfaced.
We’ve also moved the parser code over to the mozilla webvtt repository on GitHub (yay!) and have landed the 0.4 parser code there in a dev branch. After the review is done it will be landed on the master branch.
I’ve been working on the Text Track Decoder for the parser integration into Firefox. This part of the integration functions as an entry point into our parser for Gecko.
How It Works
The short version: when an HtmlTrackElement receives new data from a vtt byte stream, it passes the data to its WebVTTLoadListener (the Text Track Decoder), which calls our webvtt parser to parse the chunk it just received. The WebVTTLoadListener also provides callback functions that the parser uses to hand back finished cues and to report errors. Finally, the WebVTTLoadListener converts each cue handed back through the callback into the DOM elements that represent a webvtt_cue and attaches them in one of two places: the cue settings go on the HtmlTrackElement’s track, and the parsed cue text node tree goes on the caption overlay in the video div of the HtmlTrackElement’s MediaElement (phew).
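To make that flow a bit more concrete, here is a minimal C++ sketch of the "chunks in, cues out through callbacks" idea. It is not the real Gecko or webvtt code: ToyParser, ToyCue and ToyLoadListener are made-up stand-ins, and the toy parser only understands blank-line-separated blocks with a "-->" timing line.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed cue: the timing line plus its text.
struct ToyCue {
  std::string timing;
  std::string text;
};

// Toy incremental parser. It is NOT the webvtt library; it only models
// "feed chunks in, get finished cues back through a callback".
class ToyParser {
public:
  using CueCallback = std::function<void(const ToyCue&)>;
  using ErrorCallback = std::function<void(const std::string&)>;

  ToyParser(CueCallback onCue, ErrorCallback onError)
      : mOnCue(std::move(onCue)), mOnError(std::move(onError)) {
    assert(mOnCue && mOnError);  // both callbacks are required
  }

  // Append a chunk and emit every block that is now complete. A block ends
  // at a blank line, so a cue split across two chunks stays buffered.
  void Parse(const std::string& chunk) {
    mBuffer += chunk;
    std::size_t end;
    while ((end = mBuffer.find("\n\n")) != std::string::npos) {
      HandleBlock(mBuffer.substr(0, end));
      mBuffer.erase(0, end + 2);
    }
  }

private:
  void HandleBlock(const std::string& block) {
    if (block.empty() || block == "WEBVTT") return;  // nothing to report
    std::size_t lineEnd = block.find('\n');
    std::string first = block.substr(0, lineEnd);
    if (first.find("-->") == std::string::npos) {
      mOnError("expected a timing line, got: " + first);
      return;
    }
    std::string text =
        lineEnd == std::string::npos ? "" : block.substr(lineEnd + 1);
    mOnCue(ToyCue{first, text});
  }

  CueCallback mOnCue;
  ErrorCallback mOnError;
  std::string mBuffer;
};

// Hypothetical listener: owns the parser and collects what it reports.
class ToyLoadListener {
public:
  ToyLoadListener()
      : mParser([this](const ToyCue& c) { cues.push_back(c); },
                [this](const std::string& e) { errors.push_back(e); }) {}
  ToyLoadListener(const ToyLoadListener&) = delete;  // lambdas capture this
  ToyLoadListener& operator=(const ToyLoadListener&) = delete;

  // Called once per chunk of the byte stream.
  void OnDataAvailable(const std::string& chunk) { mParser.Parse(chunk); }

  std::vector<ToyCue> cues;
  std::vector<std::string> errors;

private:
  ToyParser mParser;
};
```

In the real integration the listener would turn each cue into DOM objects instead of pushing it onto a vector; the vector is only there so the sketch has something observable.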
What We’ve Done
The first order of business was to ask Chris Pearce, who works very closely with Firefox’s media stuff, to give us a high-level overview of what we would need to accomplish to get this working. He sent that as an email, which my professor, Dave Humphrey, then kindly posted on our bug (I forgot to do so!).
We then quickly set about implementing the initial steps Chris described. We’ve done steps 1 – 4 so far:
- The HtmlTrackElement::LoadListener has been moved to its own file and renamed WebVTTLoadListener.
- The HtmlTrackElement now has a WebVTTLoadListener reference which is initialized in LoadResource.
- WebVTTLoadListener now manages a single webvtt parser which is created and destroyed along with it.
- WebVTTLoadListener now provides callback functions to the parser for returning finished cues and reporting errors.
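Steps 3 and 4 boil down to a fairly common pattern: a C++ object owns a C-style parser handle for its whole lifetime and registers static member functions as callbacks, passing itself along as userdata. Here is a rough sketch of that pattern. Everything prefixed with toy_ is an invented stand-in, not the real webvtt API, and ToyCallbackListener is not the real WebVTTLoadListener.

```cpp
#include <cassert>
#include <string>
#include <vector>

// ---- Hypothetical C-style parser API (a stand-in, not the real webvtt API) ----
typedef void (*toy_cue_fn)(void* userdata, const char* cueText);
typedef void (*toy_error_fn)(void* userdata, int line, const char* message);

struct toy_parser {
  toy_cue_fn onCue;
  toy_error_fn onError;
  void* userdata;
};

toy_parser* toy_parser_create(toy_cue_fn onCue, toy_error_fn onError,
                              void* userdata) {
  return new toy_parser{onCue, onError, userdata};
}

void toy_parser_destroy(toy_parser* parser) { delete parser; }

// Toy behaviour: an empty chunk is an error, anything else is one cue.
void toy_parser_parse(toy_parser* parser, const char* data) {
  if (data[0] == '\0') {
    parser->onError(parser->userdata, 1, "empty chunk");
  } else {
    parser->onCue(parser->userdata, data);
  }
}

// ---- Listener that owns exactly one parser ----
class ToyCallbackListener {
public:
  // The parser is created with the listener...
  ToyCallbackListener() : mParser(toy_parser_create(&OnCue, &OnError, this)) {
    assert(mParser != nullptr);
  }
  // ...and destroyed with it.
  ~ToyCallbackListener() { toy_parser_destroy(mParser); }

  // The parser holds a raw pointer back to us, so copying is disabled.
  ToyCallbackListener(const ToyCallbackListener&) = delete;
  ToyCallbackListener& operator=(const ToyCallbackListener&) = delete;

  void OnData(const char* chunk) { toy_parser_parse(mParser, chunk); }

  const std::vector<std::string>& Cues() const { return mCues; }
  const std::vector<std::string>& Errors() const { return mErrors; }

private:
  // Static trampolines: recover the listener from userdata, then record.
  static void OnCue(void* userdata, const char* cueText) {
    static_cast<ToyCallbackListener*>(userdata)->mCues.emplace_back(cueText);
  }
  static void OnError(void* userdata, int line, const char* message) {
    static_cast<ToyCallbackListener*>(userdata)->mErrors.push_back(
        "line " + std::to_string(line) + ": " + message);
  }

  toy_parser* mParser;
  std::vector<std::string> mCues;
  std::vector<std::string> mErrors;
};
```

Tying the parser’s lifetime to the listener means there is never a moment where a callback can fire into a listener that no longer exists, as long as the listener itself is not copied or moved.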
We’ve also added three convenience functions to turn webvtt cue stuff into the DOM bindings. These are:
- cCueToDomCue – Transforms a webvtt cue’s settings into a TextTrackCue (almost done).
- cNodeToHtmlElement – Transforms a webvtt node into an HtmlElement; if the node is an internal webvtt node, it recursively converts the node’s children and adds them as child HtmlElements (not done at all!).
- cNodeListToDocumentFragment – Transforms the head node’s children into HtmlElements and adds them to a DocumentFragment (pretty much done).
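The recursive part of cNodeToHtmlElement and the way cNodeListToDocumentFragment sits on top of it can be sketched roughly like this. This is only an illustration under assumptions: ToyNode, ToyElement and ToyFragment are invented types standing in for the webvtt node tree and the DOM classes, and the tag mapping is made up.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node kinds; the real webvtt node tree has more of them.
enum class NodeKind { Head, Text, Bold, Italic };

struct ToyNode {
  NodeKind kind;
  std::string text;               // used by Text nodes only
  std::vector<ToyNode> children;  // used by internal nodes only
};

// Stand-in for an HTML element.
struct ToyElement {
  std::string tag;   // "b", "i", "span", or "#text" for a text leaf
  std::string text;  // used by "#text" only
  std::vector<std::unique_ptr<ToyElement>> children;
};

// Stand-in for a DocumentFragment: a bare list of top-level elements.
struct ToyFragment {
  std::vector<std::unique_ptr<ToyElement>> children;
};

// Convert one node. Leaves become text; internal nodes become an element
// whose children are produced by recursing into the node's children.
std::unique_ptr<ToyElement> NodeToElement(const ToyNode& node) {
  auto element = std::make_unique<ToyElement>();
  switch (node.kind) {
    case NodeKind::Text:
      assert(node.children.empty());  // a leaf has no children
      element->tag = "#text";
      element->text = node.text;
      return element;
    case NodeKind::Bold:   element->tag = "b"; break;
    case NodeKind::Italic: element->tag = "i"; break;
    case NodeKind::Head:   element->tag = "span"; break;  // not expected here
  }
  for (const ToyNode& child : node.children) {
    element->children.push_back(NodeToElement(child));
  }
  return element;
}

// Convert the head node's children (not the head itself) into a fragment.
ToyFragment NodeListToFragment(const ToyNode& head) {
  ToyFragment fragment;
  for (const ToyNode& child : head.children) {
    fragment.children.push_back(NodeToElement(child));
  }
  return fragment;
}

// Helpers that flatten the result to markup, just for inspection.
std::string Serialize(const ToyElement& element) {
  if (element.tag == "#text") return element.text;
  std::string out = "<" + element.tag + ">";
  for (const auto& child : element.children) out += Serialize(*child);
  return out + "</" + element.tag + ">";
}

std::string Serialize(const ToyFragment& fragment) {
  std::string out;
  for (const auto& child : fragment.children) out += Serialize(*child);
  return out;
}
```

The head node itself never turns into an element; only its children do, which is why the fragment conversion is a separate function from the per-node one.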
The callback function for returning cues now:
- Calls the cCueToDomCue function and adds the resulting HtmlTextTrackCue to its owning HtmlTrackElement’s cue list.
- Calls the cNodeListToDocumentFragment function and adds the resulting DocumentFragment to the caption overlay.
Right now we’ve run into some problems figuring out how to work with the Firefox code. I’ve listed those in my recent WIP update on the bug. Other than implementing those steps, I’ve been getting acquainted with the Firefox code that we have to touch and figuring out the basics of how it’s all going to fit together. I think we’ve gotten a big chunk of it done so far: mostly the overall frame of how it’s going to work, plus turning a webvtt cue’s settings into a TextTrackCue. I’ve also met the deadlines and goals that I set for myself at the beginning of this semester, so I’m fairly happy. Going forward, I think I know enough now to ask intelligent questions about how to solve the problems I listed in the WIP, so that’s what I’ll be doing in the coming weeks when I get stuck.
As always, I’m ever confident that we will finish the project!