Tagged: firefox

Web Rendering Week — Recap

So I’m back home now. The flight home was a bit shorter, about two hours, thankfully. Interestingly enough I feel more jet lagged now that I’m home then I was in Taiwan. I’ve heard that going east is worse then going west, so that might be the case for me right now. The rest of the rendering week, after I blogged last Tuesday, was just as awesome as the first part.

How Gecko Does X

I got to sit in on a lot more talks about “How Gecko does X” — like how the graphics engine works, how the layout system works, and my favourite in particular, how cycle collection works. Kyle Huey did an excellent job explaining how the Gecko cycle collector works. He gave us this paper as forward reading about it (it’s a little dense, but definitely worth reading). I’ll try to do a blog post in the future on what I learned/will learn about it.

David Baron and Adam Roach also gave an awesome talk on how the W3C and IETF work. WEBVTT is my first major exposure to open specifications that I’ve had and so I’ve been interested in all the hows/whats/whys of open specifications and the politics behind them.

Initial WEBVTT Support

It wasn’t all fun and games over in Taiwan though. We were also doing a lot of work. We finally got bug 833385 landed near to the end of the week. This means that we have support for all the new DOM elements that WEBVTT introduces such as: HTMLTrackElement, TextTrack, TextTrackList, TextTrackCue, and TextTrackCueList. We ran into a random inexplicable bug when we were doing full tries on the code, just before landing it. Ralph and I went to work debugging it (had to use an ASAN build) and we ended up discovering that it happened in a very rare situation where the cycle collector nulls out the HTMLMediaElement’s TextTrackList member while the HTMLMediaElement is still alive. This results in a situation where HTMLMediaElement::FireTimeUpdate() is called just before it is about to be deleted and since we weren’t doing null checks on the call to TextTrackList::Update() we would crash. After we got that fixed we were all green.

That leaves just bug 833382 left before we get initial support for WEBVTT. It was going well last week and we got an r+ from Chris Pearce. Now we just need an r+ from Boris and we should be good to go. It might take a few more rounds of review before that happens, but I’m optimistic we will be able to get this landed within a week or so.

One of the major problems I was having in Taiwan was trying to get a clean diff for 833382. The problem centered around the fact that up until now, mainly in my previous open-source class, we decided to use a git branch as a main point of ‘integration’. We’ve all been working off this branch for a while. The history of this branch has been so ridiculous and the code necessary for 833382 depends on so many other parts of the code that have been touched by, well everyone, that it was pretty much impossible for me to do a clean diff or even get a good rebase. To rebase this beast I would have had to go through 150+ commits, each having merge conflicts… So I ended up just making a clean branch off master and manually moving all the code that I needed over to the new branch and doing a diff like that. I’ll probably be staying away from these kinds of ‘integration’ branches in the future in order to ensure that my repo history can be more linear. Easier to get diffs that way.

The other thing I’ve been dealing with in the last few days is some code we landed back in February that was spotted to not be up to par by Boris. The issue is with some of the CSS selectors that we are using to style WEBVTT text — namely, we are using slow CSS selectors which is bad. This is the first that I’ve heard about some CSS selectors being slower than others, although that’s not suprising as I’m not super-super knowledgeable about CSS. Mozilla even has a page devoted to this that you can check out here. Ralph and I put together a patch yesterday to deal with this which will land today most likely. I’ll have to update 833382 to reflect those changes today as well.

CSS Parser Hacking

I also sat in on the vertical-text layout meeting as it is of particular interest to WEBVTT. WEBVTT requires the ability to have vertical-text and so far in Gecko we don’t have this. Apparently vertical-text has been kind of a thorn in the side of the layout team for a long time as it’s been particularly hard to implement. However, there is a major push now to get it done, so that’s great. In accordance with this Daniel Holbert asked me if I wanted to do some stuff for vertical-text in the CSS parser, I accepted and got my first layout bug! So I’ll be hacking around in the CSS parser and layout section of Gecko more in the future, hopefully.

WebMaker

Dave also told me the other day that he’s figured out an area of WebMaker that I can start contributing to, so I’m excited about that. I’ll be starting in about two weeks on this. I’ll most likely be splitting my time like 70/30 or something like that for WEBVTT and WebMaker. We talked briefly about it and so my understanding isn’t 100%, but what I got from our talk is that I will be implementing a kind of wrapper around an HTML5 video element that will allow Popcorn Maker to be able to work with it. From my understanding Popcorn Maker works with many different video formats/sources and so it needs a uniform interface to work with all these different videos. That’s where the wrapper comes in. It allows Popcorn Maker to work with many different video formats and sources without worrying about the particulars. However, all this might be completely wrong  as I might have misunderstood some things from our brief conversation… So don’t take my word for it! At any rate I’ll do another blog post about it when I get more information.

Until next time.

Advertisements

Mozilla Taiwan — Web Rendering Week

After a ridiculously long flight, 24 hours in total including airport time, we finally arrived in Taiwan, met up with Daniel (IRC:dholbert) and Seth (IRC:seth) and were on our way to the hotel in a cab. The plane ride was pretty good. Uneventful… just super long.

Taipei 101 building.

Taipei 101 building

It’s now Wednesday and it’s been an awesome and eventful week. I’ve been meeting tons of Mozilla devs and learning a lot. We’ve also been working hard on landing that initial support for WEBVTT I was talking about in my last blog post.

Sunday I was kind of jet lagged so I took a quick power nap and woke up for dinner which we ate at the Tapei 101 building (formerly the tallest building in the world). Dinner was like 15 courses? Pretty full after that.

After sleeping Sunday night I was pretty much jet lag free… I was kind of surprised as I’ve heard it’s really hard to deal with for some people. My first experience with it was kind of a non-issue. It’s probably because I’m used to sleeping late anyways… so really I just corrected my schedule to what’s normal for most people.

This week is pretty much a week where all the Web Rendering people can get together and work and talk face to face. There’s been a lot of meetings, some of which I’ve sat in on, that are super interesting and informative. Robert (IRC:roc) gave an excellent talk on the working culture at Mozilla which has really reiterated to me again how awesome the Mozilla working culture really is and how different it is from other companies. The talk touched on openness at Mozilla, code review, software quality, and module ownership vs. Mozilla’s managerial structure among many other things.

On the way to the office

On the way to the office

Aside from working on initial WEBVTT support I’ve also been getting reacquainted with the specification as it’s changed a lot since I last looked at it, going over all the bugs that we currently have for WEBVTT in order to get some kind of an idea of what we need to do next, and engaging in a lot of discussions about what we think the WEBVTT spec needs to improve on or change. This is great as when I get back to Toronto I’ll have a better plan of what needs to be done and how we are going to do it.

I’m also learning a ton about different areas Gecko that I didn’t know about before just by listening to and talking to others. Overall I’d say I’ve learnt way more then I could ever hope to do in the same amount of time sitting on IRC chatting with people.

One other thing I’m looking forward to is a talk on the cycle collector that Kyle (khuey) is going to be giving tomorrow. The more I try to work with and use cycle collection the more I want to understand it. Hopefully I can walk away from the talk with a better understanding of how it works.

Here are some assorted pictures of the Mozilla Taiwan office. It’s pretty brand spanking new. The office space they have us working in was actually just finished before we got here and they had to push the contractors to get it done early. It’s right across from the Bloomberg office here in Taipei as well as some other cool places. It’s literally 30 seconds from Taipei 101, so it’s right in the downtown core of the city.

Stinks that I won’t have enough time to check out many other places in Taipei as work is taking up most of my time. Taiwan isn’t the place I thought I would visit first when traveling to Asia, or even a place I would visit, but it’s definitely on my list of places to come back to.

Front sign and entrance

Front sign and entrance

Space for gathering and presntations.

Space for gathering and presntations.

Work Space

Work space that Mozilla Taiwan put aside for us.

Front desk.

Front desk.

View outside the office

View outside the office

WEBVTT: Farewell DPS911

Tomorrow is the last day for my open-source class at Seneca. So this will be the last WEBVTT post that I will make for the class, ever. It’s been a long journey since last September and we’ve made huge progress, learnt a ton, burnt out many a time, and had a great time doing it. If you are worried about no more posts on WEBVTT fear not! I’ll still be posting regularly on WEBVTT as I’ve now switched over to working on it and possibly some WebMaker stuff at CDOT for the next year. I’m really looking forward to it.

Now, lets get on with it.


WEBVTT Parser

It’s been pretty exciting around WEBVTT in the last month or so — ever since we did a presentation at Toronto Mozilla we’ve received a lot more interest. It’s a pretty cool and strange feeling to have people interested in what we’re doing. Especially with WEBVTT. It’s not very glamorous, as you can imagine. Myself and a few of my classmates also went to an “Open web open mic” night at Toronto Mozilla where we got to do another presentation and showcase WEBVTT off in a kind of science fair environment. We also got to see lots of great presentations and projects that are being worked on. It really opened my mind to what is going on in Toronto and beyond. Pretty cool stuff.

We recently got all our tests green! At that point we officially tagged a revision of the parser as version 0.4… so lots more work to do. Since then we’ve been adding more refined and atomic unit tests to the test suite. Most of them are testing our internal functions in the library. I’ve been focusing on the cue text tokenize  functions for these. Instead of passing in an entire WEBVTT file, we pass in input that it will be expected to handle and test to make sure it behaves correctly. We’ve also been solving a few of the bugs that have been found via fuzzing WEBVTT, courtesy of cdiehl and rforbes,  in our integration branch. That’s awesome — we’re getting fuzz tested on something that has not even landed in Nightly yet! Caitlin has also started to add the ones we have solved as regression tests.

Other than that not much has happened on the parser lately as we’ve all been crunching through the last assignments and exams of the semester. We’re probably going to be looking where to enhance the library in the next little while. There are some issues up on the repo right now that still need to be taken care of in regards to enhancement. So we’ll probably be tackling those first.

Gecko Integration

The other big thing we’ve been working on still is getting the parser integrated into Gecko. I’ve probably already blogged before about how we have 2 out of the 5 things we need landed in Nightly already. The last three things we need to land to get basic functionality working are the DOM classes, DOM tests, and the “parser management” code.

Moving Code from WebVTTLoadListener

Around the time of the demo it was decided that we should move the code that converts the c parser data structs to DOM classes out of the WebVTTLoadListener and just use the LoadListener for… well, listening. The LoadListener’s job should be to serve as the point of contact between Gecko and the WEBVTT parser. When it receives data it hands it to the parser and when it receives a cue it constructs a TextTrackCue and hands it to Gecko. I recently got around to that here. The TextTrackCue is the place where  the conversion code now lives. We also now lazily load the parsed WEBVTT nodes into HTMLElements when GetCueAsHTML() is called for the first time.

Properly Creating Nodes

We ran into a problem where processing cue text tags like <i>, <u>, <b>, etc, was crashing the browser. This was due to the fact that we weren’t creating the NodeInfo to be passed into the NS_NewHTMLElement() macro properly. We were just passing in HTMLTrackElement’s NodeInfo. This would cause HTMLTrackElement to be deleted when the HTMLElement was removed from the divs child list. The correct way to do this is to get HTMLTrackElement’s NodeInfoManager() and create a new NodeInfo using it.

Removing Children

We were having a bug where we weren’t removing captions from the div properly. Previously we had been looping from zero to max length of the divs children and removing at the current index. Classic for loop. I tried and tried to figure out what was going wrong and after a while I made my way over to #content to get some help. bz and Ms2ger were kind enough to help me. What I learnt from them is that removing children of a node using this method only removes every other node. This is due to the fact that when you remove a node that isn’t at the end of a list, the entire node tree is shifted down. Therefore, when we remove node at 0 node at 1 becomes node at 0, we then advance to 1 and remove node at 1 missing the node that was shifted! The first solution we thought of was to loop until length is 0 always removing at 0. However, we ended up using another solution that I would never have guessed. That is to instead call nsContentUtils::SetNodeTextContent(). This removes the tree for you and  puts in its place a TextNode. For our solution we just pass in an EmptyString() for the text.

nsINode > nsIDOMNode

The other thing they asked me to do was to change how we were appending nodes to the tree. Instead of using nsIDOMNode interface, this is a slower and more inefficient interface, we should use nsINode. Which has basically the same capabilities. We can do the exact some thing with nsINode in simpler code.

Patches

I submitted a patch tonight that has the most up to date code in it in regards to “WEBVTT parser management” in Gecko. I was hoping we could get this landed quickly, but the events of today have brought up even more work to do. First of all, the patch for DOM classes that we thought would get through pretty quickly has a lot of problems with it, and secondly, the cue text tag class to css selector mapping in Gecko is not at all as simple as I suspected it to be.

I found this out today when trying to get the CSS selectors working on the HTMLElements created from cue text tags. I had all the Gecko code working correctly, and yet the CSS selectors in my external CSS file were not affecting the captions. I went over to #content where bz and Ms2ger informed me that it was because we are constructing them as anonymous content. In other words, no external code can touch the HTMLElements we are creating, only internal code can. This wasn’t the behaviour that I thought was needed and after some discussion #whatwg’s zcorpan informed us that they need to live in a video::cue pseudo-element as part of a sub-document. So in your external CSS selectors you would put video::cue([tagname].[classname]) { }. However, bz said that in order to get a new pseudo-element we would need to do some ‘re-architecting’ of Gecko code. This immediately made me feel nauseous… just kidding, kind of.

In light of this our new goal is to get our current semi-working code into Gecko behind a pref and than iterate on it. Things will be a lot easier when we get the first code landed.


That’s about it as far as I can remember. We’ve done a lot of more little things since than as well. Head over to Mozilla’s WEBVTT repo on github to check out all the changes. And feel free to get on irc.mozilla.org #seneca to co-ordinate with us if you want to help!

Until next time.

WEBVTT Update: Parser Review, Integration into Firefox

For the last two weeks we’ve been working steadily on the WEBVTT parser. Most of the work being done now is related to getting the parser integrated into Firefox. We’re building on top of the original bug filed on bugzilla by Ralph and are now using it as a co-ordination bug for the five other bugs we’ve split it up into. The “bug sections” that we’ve split it up into are:

  • Integrating the webvtt build system into the Firefox build system.
  • Adding a captions div to an nsVideoFrame so the captions can be rendered on screen.
  • Creating a “Text Track Decoder” that will function as the entry point for Gecko into the webvtt parser.
  • Creating new DOM bindings i.e. TextTrack, TextTrackCue, TextTrackCueList.
  • Creating DOM tests using Mochitest for the new DOM bindings.

You can check out a more in depth break down of our bug plan here.

The other major thing that a few of us in the class have been engaged in is the review of the 0.4 parser code. The review is still in it’s early to mid stages, so we have a lot more to do on that. I’ve been participating there by filing and commenting on issues and fixing a few of the bugs that have surfaced.

We’ve also moved the parser code over to the mozilla webvtt repository on GitHub (yay!) and have landed the 0.4 parser code there in a dev branch. After the review is done it will be landed on the master branch.

Firefox Integration

I’ve been working on the Text Track Decoder for the parser integration into Firefox. This part of the integration functions as an entry point into our parser for Gecko.

How It Works

The short version of how the Text Track Decoder works is that when an HtmlTrackElement receives new data from a vtt byte stream it passes it off to it’s WebVTTLoadListener (Text Track Decoder) which then calls our webvtt parser to parse the chunk of the byte stream it just reveived. The WebVTTLoadListener also provides call back functions to the parser for passing back cues when the parser has finished them or for reporting errors when the parser encounters them. The final function that the WebVTTLoadListener facilitates is converting the cues that have been passed back in the call back function to the various DOM elements that represent a webvtt_cue and then attaching those to either the HtmlTrackElement’s track, in the case of the cue settings, or the HtmlTrackElement’s MediaElement’s video div caption overlay (phew), in the case of the parsed webvtt cue text node tree.

What We’ve Done

The first order of business that we took care of in getting this done was to ask Chris Pearce, who works very closely with Firefox’s media stuff, to give us a high level overview of what we would need to accomplish in order to get this working. That was sent in the form of an email which my Professor, Dave Humphrey, then kindly posted on our bug (I forgot to do so!).

We then quickly went about implementing Chris’s initial steps that he talked about. We’ve done steps 1 – 4 so far:

  • The HtmlTrackElement::LoadListener has been moved to it’s own file and renamed WebVTTLoadListener.
  • The HtmlTrackElement now has a WebVTTLoadListener reference which is initialized in LoadResource.
  • WebVTTLoadListener now manages a single webvtt parser which is created and destroyed along with it.
  • WebVTTLoadListener now provides call back functions to the parser for returning finished cues and reporting errors.

We’ve also added three convenience functions to turn webvtt cue stuff into the DOM bindings. These are:

  • cCueToDomCue – Transforms a webvtt cue’s settings into a TextTrackCue (almost done).
  • cNodeToHtmlElement –  Transforms a webvtt node into an HtmlElement; recursively adds HtmlElements to it’s children if it is converting an internal webvtt node (not done at all!).
  • cNodeListToDocumentFragment – Transforms the head node’s children into HtmlElements and adds them to a DocumentFragment (pretty much done).

The call back function for returning cues now:

  • Calls the cCueToDomCue function and adds the resulting HtmlTextTrackCue to it’s owning HtmlTrackElements cue list.
  • Calls the cNodeListToDocumentFragment and adds the resulting DocumentFragment to the caption overlay.

Right now we’ve run into some problems in figuring out how to work with the Firefox code. I’ve listed those in my recent WIP update on the bug. Other then implementing those steps I’ve just been getting acquainted with the Firefox code that we have to touch and figuring out the basics of how it’s all going to fit in. I think we’ve gotten a big chunk of it done so far, mostly the overall frame of how it’s going to work as well as turning a webvtt cue’s settings into a TextTrackCue. I’ve also met the deadlines and goals that I set for myself at the beginning of this semester, so I’m fairly happy. Going forward I think I know enough now to ask intelligent questions about how to solve the problems that I listed in the WIP, so that’s what I will be doing in the coming weeks when I get stuck.

As always, I’m ever confident that we will finish the project!

WebVTT 0.3 Release : Final

Today our 0.3 release is due for the WebVTT parser. I’ve completed the cue text parsing portion of the parser and it’s sitting on my GitHub repo. The main places you can look for the code I have added are:

Much of what I discussed in my last blog post has stayed the same with my final version of the 0.3 release. The major structure of the algorithm has stayed the same. However, I have made changes to some of the syntax in order to get rid of minor bugs. I won’t re-post all that slightly changed code as it would make this blog post to long. You can ether look at the GitHub links for that stuff or you can check out my earlier blog post.

I’ll go over what I’ve done in the time since my last post:

I’ve completed the UTF16 append functions:

webvtt_status
append_wchar_to_wchar( webvtt_wchar *append_to, webvtt_uint len, webvtt_wchar *to_append, webvtt_uint start, webvtt_uint stop )
{
	int i;

	if( !append_to || !to_append )
		return WEBVTT_INVALID_PARAM;

	for(i = len; i < len + stop; i++, start++ )
		append_to[i] = to_append[start];
	append_to[i] = UTF16_NULL_BYTE;

	return WEBVTT_SUCCESS;
}

webvtt_status
webvtt_string_append_wchar( webvtt_string *append_to, webvtt_wchar *to_append, webvtt_uint len )
{
	webvtt_status status;

	if( !to_append || !append_to )
		return WEBVTT_INVALID_PARAM;

	if( ( status = grow( (*append_to)->length + len, &(*append_to) ) ) != WEBVTT_SUCCESS )
		return status;

	if( ( status = append_wchar_to_wchar( (*append_to)->text, (*append_to)->length, to_append, 0, len ) ) != WEBVTT_SUCCESS )
		return status;

	(*append_to)->length += len;

	return WEBVTT_SUCCESS;
}

webvtt_status
webvtt_string_append_single_wchar( webvtt_string *append_to, webvtt_wchar to_append )
{
	webvtt_wchar temp[1];

	if( !append_to )
		return WEBVTT_INVALID_PARAM;

	temp[0] = to_append;

	return webvtt_string_append_wchar( append_to, temp, 1 );
}

webvtt_status
webvtt_string_append_string( webvtt_string *append_to, webvtt_string to_append )
{
	webvtt_status status;

	if( ( status = webvtt_string_append_wchar( append_to, to_append->text, to_append->length ) ) != WEBVTT_SUCCESS )
		return status;

	return WEBVTT_SUCCESS;
}

I’ve added in functions that compare two strings or two wchars:

webvtt_uint
webvtt_compare_wchars( webvtt_wchar  *one, webvtt_uint one_len, webvtt_wchar *two, webvtt_uint two_len )
{
	int i;

	/* Should we return a webvtt_status to account for this case here? */
	if( !one || !two )
		return 0;

	if( one_len != two_len )
		return 0;

	for( i = 0; i < one_len; i++ )
	{
		if( one[i] != two[i] )
		{
			return 0;
		}
	}

	return 1;
}

webvtt_uint
webvtt_compare_strings( webvtt_string one, webvtt_string two )
{
	if( !one || !two )
		return 0;

	return webvtt_compare_wchars( one->text, one->length, two->text, two->length );
}

I’ve changed the webvtt_string_list struct and it’s functions since the last blog post:

struct
webvtt_string_list_t
{
	webvtt_uint alloc;
	webvtt_uint list_count;
	webvtt_string *items;
};

webvtt_status
webvtt_create_string_list( webvtt_string_list_ptr *string_list_pptr )
{
	webvtt_string_list_ptr temp_string_list_ptr = (webvtt_string_list_ptr)malloc( sizeof(*temp_string_list_ptr) );

	if( !temp_string_list_ptr )
		return WEBVTT_OUT_OF_MEMORY;

	temp_string_list_ptr->alloc = 0;
	temp_string_list_ptr->items = 0;
	temp_string_list_ptr->list_count = NULL;

	*string_list_pptr = temp_string_list_ptr;

	return WEBVTT_SUCCESS;
}

void
webvtt_delete_string_list( webvtt_string_list_ptr string_list_ptr )
{
	int i;

	for( i = 0; i < string_list_ptr->list_count; i++ )
	{
		webvtt_delete_string( string_list_ptr->items[i] );
	}
}

webvtt_status
webvtt_add_to_string_list( webvtt_string_list_ptr string_list_ptr, webvtt_string string )
{
	if( !string )
	{
		return WEBVTT_INVALID_PARAM;
	}

	if( string_list_ptr->alloc == string_list_ptr->list_count )
		string_list_ptr->alloc += 4;

	if( !string_list_ptr->alloc == 0 )
		string_list_ptr->items = (webvtt_string *)malloc( sizeof(webvtt_string) );
	else
		string_list_ptr->items = (webvtt_string *)realloc( string_list_ptr->items, sizeof(webvtt_string *) * string_list_ptr->alloc );

	if( !string_list_ptr->items )
		return WEBVTT_OUT_OF_MEMORY;

	string_list_ptr->items[string_list_ptr->list_count++] = string;

	return WEBVTT_SUCCESS;
}

I’ve changed the WEBVTT_CALLBACK that will call the webvtt_parse_cuetext function:

static void WEBVTT_CALLBACK
cue( void *userdata, webvtt_cue cue )
{
	webvtt_parse_cuetext( cue->payload->text, cue->node_head );
}

Before it didn’t call the parse cue text function and so the cue text wasn’t parsed.

I’ve changed the function signature for webvtt_parse_cuetext:

WEBVTT_EXPORT webvtt_status
webvtt_parse_cuetext( webvtt_wchar *cue_text, webvtt_node_ptr node_ptr )

I’ve gotten rid of the webvtt_parser pointer, the line number, the length of *cue text, and the length of node_ptr in the function signature that was there previously.

  • For the webvtt_parser pointer and line number I did this because there purpose was to be able to throw an error to the webvtt_parser pointer error call back and reference the line that it happened on, but currently the parser does not support this.
  • I got rid of the length of cue text because it should always be a null terminated pointer. So we can just tell that we are at the end of the line by checking for that. No need for the line length.
  • I got rid of the length of the node_ptr because the parser no longer returns an array of node_ptr it now returns a single node_ptr of type WEBVTT_HEAD_NODE, which contains an array of node_ptr underneath it.

I know we will be changing this in the future, but I got rid of it now to make it more clear.

The other major thing that caitp and I were talking about on IRC last night was the data structure of the nodes. Before caitp had it set up that an internal node and a leaf node would contain a node and so they would be subclasses of node. Then you could just return an array of nodes and cast it to a particular type of node based on it’s node kind.

The way I have it set up right now is similar but slightly different. In my version the node contains a pointer to a leaf node or internal node and based on its node kind you can cast it to either an internal node or a leaf node.

caitp made the case for converting it back to the old format as it might be more readable and possibly take up less space in memory. This is something that we should probably discuss in the future.

Some other things to note that we will need to take care of in the future:

  • I have not had the chance to test the parsing of escape characters, but the code for it is there.
  • It does not parse the new “lang” tag that was recently added to the W3C specification.
  • The memory operations in the node, token, and string list struct do not make use of the allocator functions that we have built into the framework.

Yupp, thats it. See ya.

WebVTT 0.3 Release

So it’s been a while since I posted and in that intervening time we have been hard at work on the 0.3 release of our parser.

For this release we are concentrating on mainly getting a full *working parser* out as well as getting our build system up to par with a unit testing strategy as well as making the parser be able to work across all platforms i.e. OS X, Linux, and Windows.

For our Unit testing we are going with a Node-ffi solution. Node-ffi will allow us to dynamically bind our C library into a Javascript Test Suite within which we can easily do unit tests. If you want to read up more about that you can check out my classmate Dales blog who has volunteered to work on that.

For our build system we are using Autotools which is a build system from GNU that is designed to assist in making cross platform build systems. You can check out my classmate Caitlins blog to read more about that.

I myself have been working more on the C parser. We chose to go with Caitlins version of the parser to go forward with when my class met to discuss our 0.2 release. This is related to the ‘build two and plan to throw one away’ idea that I talked about in my previous blog posts.

I’ve been implementing the cue text parser portion of the C parser. This is the part that parses the payload of a WebVTT text track i.e. the actual text and markup that will be rendered on screen. Going down this road has also led me to work on a couple other parts of the C parser such as:

  • Creating some utility functions to check whether or not a UTF16 character is a digit or an alphanumeric character respectively
  • Harnessing our other string code which Caitlin originally worked on to be able to append UTF16 strings together
  • Normalizing some of the function names in our C parser which Caitlin worked on. These have to do with changing function names from webvtt_x_delete or create to webvtt_delete_x. Not that hard.
  • We also discussed what character encoding to use internally in our parser. We decided on UTF16 as that gives some benefits such as it being the encoding that is used on the web as well as it being a simple encoding to use unlike UTF8. I will probably be working on getting the parser to use UTF16 strings after I finished the cue text parser.

I’ll briefly go over the code I have done so far.

Cue Text Parser

For the cue text parser I followed the algorithm provided by the W3C specification very closely. Here is the main parsing method:

/**
 * Currently line and len are not being kept track of.
 * Don't think pnode_length is needed as nodes track there list count internally.
 */
webvtt_status
webvtt_parse_cuetext( webvtt_parser self, webvtt_uint line, const webvtt_wchar *cue_text,
	const webvtt_uint len, webvtt_node *pnode, webvtt_uint *pnode_length )
{
	webvtt_wchar_ptr position_ptr = (webvtt_wchar_ptr)cue_text;
	webvtt_node_ptr current = pnode, temp_node;
	webvtt_cue_text_token_ptr token_ptr;
	webvtt_node_kind kind;

	if( !cue_text )
	{
		return WEBVTT_INVALID_PARAM;
	}

	/**
	 * Routine taken from the W3C specification - http://dev.w3.org/html5/webvtt/#webvtt-cue-text-parsing-rules
	 */
	do {

		webvtt_delete_cue_text_token( token_ptr );

		/* Step 7. */
		switch( webvtt_cue_text_tokenizer( position_ptr, token_ptr ) )
		{
		case( WEBVTT_UNFINISHED ):
			/* Error here. */
			break;
		/* Step 8. */
		case( WEBVTT_SUCCESS ):

			/**
			 * If we've found an end token which has a valid end token tag name and a tag name
			 * that is equal to the current node then set current to the parent of current.
			 */
			if( token_ptr->token_type == END_TOKEN )
			{
				if( webvtt_get_valid_token_tag_name( ((webvtt_cue_text_end_tag_token *) token_ptr->concrete_token)->tag_name, &kind ) == WEBVTT_NOT_SUPPORTED)
					continue;

				if( current->kind == kind )
					current = current->parent;
			}
			else
			{
				/**
				 * Attempt to create a valid node from the token.
				 * If successful then attach the node to the current nodes list and also set current to the newly created node
				 * if it is an internal node type.
				 */
				if( webvtt_create_node_from_token( token_ptr, temp_node, current ) != WEBVTT_SUCCESS )
					/* Do something here. */
					continue;
				else
				{
					webvtt_attach_internal_node( (webvtt_internal_node_ptr)current->concrete_node, temp_node );

					if( WEBVTT_IS_VALID_INTERNAL_NODE( temp_node->kind ) )
						current = temp_node;
				}
			}
			break;
		}

	} while( *position_ptr != UTF16_NULL_BYTE );

	return WEBVTT_SUCCESS;
}

In short – it loops, calling the tokenizer function until it has reached the end of the buffer. Based on the status returned by the tokenizer it will either emit an error (not added yet) or it will add a node to the node list depending on what kind of token is returned.

You can see in the code that I have created many utility functions that do things such as creating a node from a token and creating or deleting nodes or tokens. I won’t list those functions here because it would be too much.

The other main bulk of this parser is the actual tokenizer:

webvtt_status
webvtt_cue_text_tokenizer( webvtt_wchar_ptr position_ptr, webvtt_cue_text_token_ptr token_ptr )
{
	webvtt_cue_text_token_state token_state = DATA;
	webvtt_string result, annotation;
	webvtt_string_list css_classes;
	webvtt_timestamp time_stamp;
	webvtt_status status = WEBVTT_UNFINISHED;

	if( !position_ptr )
	{
		return WEBVTT_INVALID_PARAM;
	}

	/**
	 * Loop while the tokenizer is not finished.
	 * Based on the state of the tokenizer enter a function to handle that particular tokenizer state.
	 * Those functions will loop until they either change the state of the tokenizer or reach a valid token end point.
	 */
	while( status == WEBVTT_UNFINISHED )
	{
		switch( token_state )
		{
		case DATA :
			status = webvtt_cue_text_tokenizer_data_state( position_ptr, &token_state, result );
			break;
		case ESCAPE:
			status = webvtt_cue_text_tokenizer_escape_state( position_ptr, &token_state, result );
			break;
		case TAG:
			status = webvtt_cue_text_tokenizer_tag_state( position_ptr, &token_state, result );
			break;
		case START_TAG:
			status = webvtt_cue_text_tokenizer_start_tag_state( position_ptr, &token_state, result );
			break;
		case START_TAG_CLASS:
			status = webvtt_cue_text_tokenizer_start_tag_class_state( position_ptr, &token_state, css_classes );
			break;
		case START_TAG_ANNOTATION:
			status = webvtt_cue_text_tokenizer_start_tag_annotation_state( position_ptr, &token_state, annotation );
			break;
		case END_TAG:
			status = webvtt_cue_text_tokenizer_end_tag_state( position_ptr, &token_state, result );
			break;
		case TIME_STAMP_TAG:
			status = webvtt_cue_text_tokenizer_time_stamp_tag_state( position_ptr, &token_state, result );
			break;
		}

		if( *position_ptr != UTF16_GREATER_THAN && *position_ptr != UTF16_NULL_BYTE )
			position_ptr++;
	}

	/**
	 * Code here to handle if the tokenizer status returned is not WEBVTT_SUCCESS.
	 * Most likely means it was not able to allocate memory.
	 */

	/**
	 * The state that the tokenizer left off on will tell us what kind of token needs to be made.
	 */
	if( token_state == DATA || token_state == ESCAPE )
	{
		 return webvtt_create_cue_text_text_token( token_ptr, result );
	}
	else if(token_state == TAG || token_state == START_TAG || token_state == START_TAG_CLASS ||
			token_state == START_TAG_ANNOTATION)
	{
		return webvtt_create_cue_text_start_tag_token( token_ptr, result, css_classes, annotation );
	}
	else if( token_state == END_TAG )
	{
		return webvtt_create_cue_text_end_tag_token( token_ptr, result );
	}
	else if( token_state == TIME_STAMP_TAG )
	{
		/* Parse time stamp from result. */
		return webvtt_create_cue_text_time_stamp_token( token_ptr, time_stamp );
	}
	else
	{
		return WEBVTT_NOT_SUPPORTED;
	}

	return WEBVTT_SUCCESS;
}

This function takes the byte stream and interprets it into tokens that the parser will be able to understand. One of the main departures that I made away from the W3C specification is that I’ve farmed each tokenizer state out to a function and therefore I had to change a tiny bit of the logic i.e. instead of using only a result and a buffer to parse the text I have created separate webvtt_strings that can handle each one of the result, buffer, and annotation. This simplifies the code as you don’t have to pass back and forth only two parameters between everyone of these functions to keep track of the parsed output. I also created a webvtt_string_list struct that will be able to handle a list of strings for the classes of a start tag in the cue text.

Here is an example of one of the functions that parses a tokenizer state:

webvtt_status
webvtt_cue_text_tokenizer_start_tag_class_state( webvtt_wchar_ptr position_ptr,
	webvtt_cue_text_token_state_ptr token_state_ptr, webvtt_string_list css_classes )
{
	webvtt_string buffer;

	CHECK_MEMORY_OP( webvtt_create_string( 1, &buffer ) );

	for( ; *token_state_ptr == START_TAG_CLASS; position_ptr++ )
	{
		if( *position_ptr == UTF16_TAB || *position_ptr == UTF16_FORM_FEED ||
			*position_ptr == UTF16_SPACE || *position_ptr == UTF16_LINE_FEED ||
			*position_ptr == UTF16_CARRIAGE_RETURN)
		{
			CHECK_MEMORY_OP( webvtt_add_to_string_list( css_classes, buffer ) );
			webvtt_delete_string( buffer );
			*token_state_ptr = START_TAG_ANNOTATION;
		}
		else if( *position_ptr == UTF16_GREATER_THAN || *position_ptr == UTF16_NULL_BYTE )
		{
			CHECK_MEMORY_OP( webvtt_add_to_string_list( css_classes, buffer ) );
			webvtt_delete_string( buffer );
			return WEBVTT_SUCCESS;
		}
		else if( *position_ptr == UTF16_FULL_STOP )
		{
			CHECK_MEMORY_OP( webvtt_add_to_string_list( css_classes, buffer ) );
			webvtt_delete_string( buffer );
			CHECK_MEMORY_OP( webvtt_create_string( 1, &buffer ) );
		}
		else
		{
			CHECK_MEMORY_OP( webvtt_string_append_wchar( buffer, position_ptr, 1 ) );;
		}
	}

	webvtt_delete_string( buffer );
	return WEBVTT_UNFINISHED;
}

Each one of the tokenizer state functions loops until either it changes the state of the tokenizer, which means it needs to start parsing it in another one of the tokenizer state functions, or it reaches a ‘termination’ point i.e. a point where either a valid token has been parsed or where it has come across the end of the byte stream prematurely.

CHECK_MEMORY_OP is just a macro that takes the returned webvtt_status and compares it to see if it was a success. If it was not then it returns the status that the memory operation returned. One problem that I have here is that since it returns immediately there is no place to deallocate memory that may have been allocated in the function. Should be easy to fix but I haven’t gotten around to it yet.

UTF16 String Manipulations

I haven’t completed some of the functions that I call in the parser code for UTF16 strings such as the functions that handle appending webvtt_wchars or webvtt_strings to webvtt_strings, but I will be working on that next. I also need to implement a function that will hopefully take a string literal and append it to a webvtt_string.

The functions for webvtt_strings that I have working so far are the is_digit, is_alphanumeric, and add to webvtt_string_list functions,:

webvtt_status
webvtt_add_to_string_list( webvtt_string_list string_list, webvtt_string string )
{
	if( !string )
	{
		return WEBVTT_INVALID_PARAM;
	}

	if( !string_list.items )
	{
		string_list.list_count = 0;
		string_list.items = (webvtt_string *)malloc( sizeof( webvtt_string ) );
	}
	else
	{
		string_list.items = (webvtt_string *)realloc( string_list.items,
			sizeof( webvtt_string ) * ( string_list.list_count + 1 ) );
	}

	if( string_list.items )
	{
		string_list.items[string_list.list_count] = string;
		string_list.list_count++;
	}
	else
		return WEBVTT_OUT_OF_MEMORY;

	return WEBVTT_SUCCESS;
}

webvtt_uint
webvtt_is_alphanumeric( webvtt_wchar character )
{
	return ( character >= UTF16_DIGIT_ZERO && character <= UTF16_DIGIT_NINE ) ||
			  ( character >= UTF16_CAPITAL_A && character <= UTF16_CAPITAL_Z ) ||
			  ( character >= UTF16_A && character <= UTF16_Z );

}

webvtt_uint
webvtt_is_digit( webvtt_wchar character )
{
	return character >= UTF16_DIGIT_ZERO && character <= UTF16_DIGIT_NINE;
}

One final thing that I want to mention is that all this code is completely untested so far. Once I get the webvtt_string functions in place I will start to debug and test it. If you want to check out the entire code that I’ve been working on you can see that here. The main places you can look in are cuetext, cue, and string. Our 0.3 release is due this Thursday coming up. I’m aiming to have this cue text parser done along with the change over to use of UTF16 everywhere in the parser, as well as converting my old tests from the 0.1 release to Unit Tests using the new Node-ffi Javascript test suite.

Later!

WebVTT 0.2 Release Update – Final Thoughts

So we’re on the final day before our 0.2 release for WebVTT is due and my teams C parser is yet to be completed. Hopefully, my partner and I can finish off things later tonight and we can get it finished off for tomorrow. However, if we don’t get it completed not all is lost. The primary goal of this release was to learn about the WebVTT parser by having three separate teams work on coding three different parsers. The idea here was to follow the design principle “Build two and plan to throw one away”. Judged under those goals I believe my team has met them. We have learnt much about what we will need to do in order to make a good parser and much about how we should go about doing that. I’ll go through the most crucial things that I believe we need to get right to have a final parser that is good.

Error Handling and Parser Ignore Logging

We need a good design for error handling and ignore logging. Both of them represent similar ideas and similar implementations but have but have different end meanings. Errors are those things which cause the parser to completely break, whereas parser ignore logging is when the parser encounters malformed WebVTT text in the WebVTT byte stream. We want to log all of these ignores and errors and have a way to hook into ignore logger so that we can test properly. The Prof brought up the fact that a lot of the tests where we would expect an ignore will actually be Unit Tests, so that will simplify things a bit.

Efficient Way of Loading and Working With Lines of Text

By far the most annoying thing that we had to deal with during this initial parser development was figuring out how to load and work with lines efficiently while retaining a link to the context of the byte stream.

The parser asks you to work with text in a couple different ways.

  • Read an entire line of text and compare a multi-character value to the string
  • Split text on spaces and work with those split strings
  • Loop through char by char and compare values

All of these things seem pretty simple but what makes it complicated is that we want to be able to use all these different methods while retaining a link to the actual position of what character we are on in the byte stream. This way when we ignore some cue text we can tell exactly where it ignored that cue text and spit that out into a logger. That means that we can’t be creating separate string, or pointers, that are completely decoupled from the byte stream, because we will lose the ability to see where it ignored cue text.

One thing to think about is whether or not loading lines from the stream to a separate string is even needed, maybe we can just work with offsets that can denote the current position and the beginning and end of the line that is going to be parsed.

I don’t think this problem is particularly hard. It just gave my team a lot of trouble because we didn’t know all the ways in which the parser would require us to work with text. If we rewrote it again, which we will, we could solve this problem probably pretty easily.

Data Structure

The main thing here for our track cue text data structure, I discussed this a while ago, is whether or not we should squash the different types of nodes into a couple of main nodes, or if we should keep them separate.

If we squash them it will simplify the code a bit but might be hard in the future to modify the code if the specification changes a lot and we end up having to separate the squashed data structure.

If we keep the data structure roughly as the way it is now it will allow more flexibility in the future if we need to change any one of the node type structs because the specification changes. Keeping the data structure as separate as possible will align us with the design principle of separation of concerns.

This is definitely something I think we need to discuss as a class and make a decision on.

UTF8

As I worked on the C parser I started work on a general UTF-8 library that our parser could rely on. I did this because at the time I thought we needed to be able to work with UTF-8 in order to parse the WebVTT byte stream correctly.

As I learnt about UTF8 more while writing the library I realized one major thing – all the characters that the parser needs to work with in order to parse the byte stream are represented in UTF8 with the same code points that ASCII uses. This means that in order to parse the stream we do not need to have specific functions that can deal with for example a ‘<‘ UTF8 code point, we can simply use the regular standards for working with ASCII that C code classically uses.

The support that we do need to provide for UTF8, the only thing that the WebVTT specification mentions, is conversion of the byte stream to UTF8 upon the beginning of the parser routine.

The final thing that I found out was that the WebVTT specification makes no mention of what kind of character encoding the parser should be emitting at the end of the program i.e. for rendering purposes. I was talking to a class mate who was saying that we should probably use UTF16 as this is most compatible with Firefox and many other applications and frameworks. This is probably something that we should put up for discussion in class at some point.

Final Notes

You can checkout the code here.

We haven’t provided a way to parse the text track cue settings yet and the cue text function does not load the data structure with the appropriate data yet.

We haven’t provided or even looked at the capability to parse only pieces of the byte stream at a time. We need to have this because when the parser will be used on a browser the browser will only provide small pieces of the WebVTT as it is downloading it.

There’s a lot of smelly code in our C parser. We just need to rewrite it.