WEBVTT: Farewell DPS911

Tomorrow is the last day of my open-source class at Seneca, so this will be the last WEBVTT post that I make for the class, ever. It’s been a long journey since last September, and we’ve made huge progress, learnt a ton, burnt out many a time, and had a great time doing it. If you are worried about there being no more posts on WEBVTT, fear not! I’ll still be posting regularly on WEBVTT, as I’ve now switched over to working on it, and possibly some WebMaker stuff, at CDOT for the next year. I’m really looking forward to it.

Now, let’s get on with it.


WEBVTT Parser

It’s been pretty exciting around WEBVTT in the last month or so. Ever since we did a presentation at Toronto Mozilla, we’ve received a lot more interest. It’s a pretty cool and strange feeling to have people interested in what we’re doing, especially with WEBVTT. It’s not very glamorous, as you can imagine. A few of my classmates and I also went to an “Open web open mic” night at Toronto Mozilla, where we got to do another presentation and show off WEBVTT in a kind of science fair environment. We also got to see lots of great presentations and projects that are being worked on. It really opened my mind to what is going on in Toronto and beyond. Pretty cool stuff.

We recently got all our tests green! At that point we officially tagged a revision of the parser as version 0.4… so lots more work to do. Since then we’ve been adding more refined and atomic unit tests to the test suite. Most of them test the library’s internal functions. I’ve been focusing on the cue text tokenizer functions for these. Instead of passing in an entire WEBVTT file, we pass each function the kind of input it is expected to handle and check that it behaves correctly. We’ve also been fixing, in our integration branch, a few of the bugs that have been found via fuzzing WEBVTT, courtesy of cdiehl and rforbes. That’s awesome: we’re getting fuzz tested on something that has not even landed in Nightly yet! Caitlin has also started to add the ones we have fixed as regression tests.

Other than that, not much has happened on the parser lately, as we’ve all been crunching through the last assignments and exams of the semester. We’re probably going to be looking at where to enhance the library in the next little while. There are some enhancement issues up on the repo right now that still need to be taken care of, so we’ll probably be tackling those first.

Gecko Integration

The other big thing we’re still working on is getting the parser integrated into Gecko. I’ve probably blogged before about how we already have 2 of the 5 things we need landed in Nightly. The last three things we need to land to get basic functionality working are the DOM classes, DOM tests, and the “parser management” code.

Moving Code from WebVTTLoadListener

Around the time of the demo it was decided that we should move the code that converts the C parser’s data structs to DOM classes out of the WebVTTLoadListener and just use the LoadListener for… well, listening. The LoadListener’s job should be to serve as the point of contact between Gecko and the WEBVTT parser. When it receives data it hands it to the parser, and when it receives a cue it constructs a TextTrackCue and hands it to Gecko. I recently got around to that. The TextTrackCue is where the conversion code now lives. We also now lazily convert the parsed WEBVTT nodes into HTMLElements when GetCueAsHTML() is called for the first time.
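
To make the lazy part concrete, here is a minimal sketch of the idea in plain C++. It is not the Gecko code: LazyCue, its string "HTML", and the conversions() counter are all hypothetical stand-ins for illustration. The only thing it shares with the real thing is the pattern of doing the conversion on the first GetCueAsHTML() call and reusing the result afterwards.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a cue. This is NOT Gecko's TextTrackCue; the
// "HTML" here is just a string so the example stays self-contained.
class LazyCue {
public:
    explicit LazyCue(std::string rawText) : mRawText(std::move(rawText)) {}

    // Converts the raw cue text on the first call only, then reuses the
    // cached result on every later call.
    const std::string& GetCueAsHTML() {
        if (!mConverted) {
            mHtml = "<div>" + mRawText + "</div>";
            mConverted = true;
            ++mConversions;
        }
        return mHtml;
    }

    // How many times the conversion actually ran (for illustration only).
    int conversions() const { return mConversions; }

private:
    std::string mRawText;
    std::string mHtml;
    bool mConverted = false;
    int mConversions = 0;
};
```

Constructing a LazyCue does no conversion work at all, and calling GetCueAsHTML() any number of times runs the conversion once.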

Properly Creating Nodes

We ran into a problem where processing cue text tags like <i>, <u>, <b>, etc., was crashing the browser. This was because we weren’t properly creating the NodeInfo to be passed into NS_NewHTMLElement(). We were just passing in HTMLTrackElement’s NodeInfo. This would cause HTMLTrackElement to be deleted when the HTMLElement was removed from the div’s child list. The correct way to do this is to get HTMLTrackElement’s NodeInfoManager() and create a new NodeInfo using it.

Removing Children

We had a bug where we weren’t removing captions from the div properly. Previously we had been looping from zero to the length of the div’s children and removing the child at the current index. Classic for loop. I tried and tried to figure out what was going wrong, and after a while I made my way over to #content to get some help. bz and Ms2ger were kind enough to help me. What I learnt from them is that removing the children of a node this way only removes every other node. When you remove a node that isn’t at the end of the list, every node after it shifts down by one index. So when we remove the node at 0, the node at 1 becomes the node at 0; we then advance to 1 and remove the node there, skipping the node that was shifted! The first solution we thought of was to loop until the length is 0, always removing at 0. However, we ended up using another solution that I would never have guessed: calling nsContentUtils::SetNodeTextContent() instead. This removes the tree for you and puts a TextNode in its place. For our solution we just pass in an EmptyString() for the text.
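
The every-other-node behaviour is easy to reproduce outside of Gecko. Here is a small sketch that uses a std::vector of strings as a stand-in for the div’s child list. The ChildList type and both function names are made up for illustration, and none of this is the actual DOM code. It shows the buggy forward loop next to the “always remove at 0” fix we first thought of.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a DOM child list: a plain vector of names, not real nodes.
using ChildList = std::vector<std::string>;

// Buggy version: the index advances after every removal, so the child that
// just shifted down into the current slot is never visited.
ChildList removeForwardByIndex(ChildList children) {
    for (std::size_t i = 0; i < children.size(); ++i) {
        children.erase(children.begin() + static_cast<std::ptrdiff_t>(i));
    }
    return children;  // the children that were left behind
}

// First fix we thought of: keep removing index 0 until nothing is left.
ChildList removeFirstUntilEmpty(ChildList children) {
    while (!children.empty()) {
        children.erase(children.begin());
    }
    return children;
}
```

Starting from the children a, b, c, d, the buggy loop removes a and c and leaves b and d behind, while the second function empties the list.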

nsINode > nsIDOMNode

The other thing they asked me to do was to change how we were appending nodes to the tree. Instead of using the nsIDOMNode interface, which is slower and less efficient, we should use nsINode, which has basically the same capabilities. We can do the exact same thing with nsINode in simpler code.

Patches

I submitted a patch tonight that has the most up-to-date code for “WEBVTT parser management” in Gecko. I was hoping we could get this landed quickly, but the events of today have brought up even more work to do. First of all, the patch for DOM classes that we thought would get through pretty quickly has a lot of problems with it, and secondly, the mapping from cue text tag classes to CSS selectors in Gecko is not nearly as simple as I suspected.

I found this out today when trying to get the CSS selectors working on the HTMLElements created from cue text tags. I had all the Gecko code working correctly, and yet the CSS selectors in my external CSS file were not affecting the captions. I went over to #content, where bz and Ms2ger informed me that it was because we are constructing them as anonymous content. In other words, no external code can touch the HTMLElements we are creating; only internal code can. This wasn’t the behaviour that I thought was needed, and after some discussion #whatwg’s zcorpan informed us that they need to live in a video::cue pseudo-element as part of a sub-document. So in your external CSS selectors you would put video::cue([tagname].[classname]) { }. However, bz said that in order to get a new pseudo-element we would need to do some ‘re-architecting’ of Gecko code. This immediately made me feel nauseous… just kidding, kind of.

In light of this, our new goal is to get our current semi-working code into Gecko behind a pref and then iterate on it. Things will be a lot easier once we get the first code landed.


That’s about it as far as I can remember. We’ve done a lot more little things since then as well. Head over to Mozilla’s WEBVTT repo on GitHub to check out all the changes. And feel free to get on irc.mozilla.org #seneca to co-ordinate with us if you want to help!

Until next time.
