Optimising wire format JSON for mapping applications

When designing a mapping application at Nokia, we needed an efficient wire format as we were dealing with many points. It had to be robust and flexible, because the dependencies upon it went through the application.

The server supplied JSON couldn’t be used directly. Because it was in what we called the ‘wire format’ (as in sent over the wire), which was a smaller version, used to lessen the data load when sending from and to the server. Additionally, using it directly, no matter how well designed, usually results in a format that isn’t optimised for fast JavaScript access, so some further translation took place.

One of the first choices we had to make is between a flat or tree-structure database. e.g.

Tree:

journey : {
	points : [
        {
            latitude: 32.323992,
            longitude: 12.032424,
            duration: 2300,

            waypoint : {
                type: 'media',
                duration: 2300
            }
        }
	]
}

Flat:

journey : {
	points : [
        {
            latitude: 32.323992,
            longitude: 12.032424,
            duration: 2300,
        }
	]
	waypoints : [
        {
            type : 'media',
            duration: 2300,
            point: 0
        }
	]
}

We knew it was better to have a flat structure. It makes manipulating data and separating them into models much easier.

Whilst a data-relationship isn’t implied in the flat structure, like it is in the tree structure i.e. a waypoint belongs to a point. It is easy to connect the relationships up later.

Because models will be placed upon each one of our major journey properties, which in this case is ‘points’ and ‘waypoints’, we’d also expect them to have some model and state specific properties. Two very common properties would be ‘selected’, which is the current waypoint being interacted with on the UI. Another being ‘modified’, a boolean which indicates whether the model has been modified since being sent from the server.

There’s two options for representing these properties inside the models.

Option 1) Keep the properties for the waypoint at the top-level and puts the array of waypoints inside the ‘rows’ property.

// Uses JSONPath expressions http://goessner.net/articles/JsonPath/
waypoints : {
	selected : $.waypoints.rows[0],
	modified : false,
	rows : [
        {
            type : 'media',
            duration: 2300,
        }
	]
}

Option 2) Have the array and results on the same level;

waypoints : {
	[
        selected : $.waypoints.r[0],
        modified : false,
        {
            type : 'media',
            duration: 2300,
        }
	]
}

The problem with the second, it isn’t a valid JS construct. You cannot mix properties into arrays like that.

But you can do this;

waypoints : {
	[
        {
            type : 'media',
            duration: 2300,
        }
	]
}

waypoints['selected'] = waypoints[0];
waypoints['modified'] = false;

‘waypoints.length // 1’

If you haven’t seen this before, it is adding properties to the array object, similar to a hash/lookup table in other languages. Some consider this harmful, but it is one of the most powerful features of the language, especially for hash doing lookups inside arrays, which we’ll get to in just a moment.

The second method is okay, but there are problems when serialising it back to the server or store locally as there is an implied 1-1 relationship and you’d need to add IDs to each.

So the our final, easily serialisable, structure looks like this;

journey = {
	points : {
        modified : false,
        rows : [{
            latitude: 32.323992,
            longitude: 12.032424,
            duration: 2300
        }]
	},
	waypoints : {
        modified : false,
        selected : '$.waypoints.r[0]',
        rows : [{
            type : 'media',
            duration: 2300,
            id: 'e8cf98c841c8aa'
        }]
	}
};

In-memory optimisations Link to heading

If we used that structure all the time, we’d soon find problems working with waypoints and points efficiently. An example is moving a waypoint. Waypoints don’t have location data, they rely upon sharing the same duration as a point, to position itself. So to update a waypoints location, we would need to update the points position and duration, and update the waypoints duration too. Highly inefficient.

As a waypoint always HAS A point, we can make it so in the data structure by creating circular references;

{ // waypoint
	type : 'media',
	id: 'e8cf98c841c8aa',
	point : $.waypoint.rows[0], // in-memory circular reference
}

{ // point
	latitude: 32.323992,
	longitude: 12.032424,
	duration: 2300,
	waypoint: $.points.rows[0]  // in-memory circular reference
}

If you were to look at this in firebug you can see a structure that continues on to infinity. This cannot be serialised, thus, cannot be sent back to the server.

The other major optimisation we can do is with a lookup table inside the waypoints array.

Say for example, we know the waypoints id is, ‘e8cf98c841c8aa’. How do we find that in the waypoints array? We could loop through the waypoints array and check each waypoint for a matching id, it’s a fast linear ( O(n) ) search, but we can improve this by making it a constant ( O(1) ) one.

Creating an associative array;

var waypoints = Journey.waypoints;

for (var i=0, len < waypoints.length; i++) {
    waypoints[waypoints[i].id] = waypoints[i];
}

waypoints.length // 1 - Normal array operations cannot see the newly added id (As long it isn't a number)
waypoints['e8cf98c841c8aa'].duration // 2300

Now all the properties can be created with very fast lookups, with minimal code.