Using local data storage space efficiently

As described in my previous post, “Choosing the browser-side data storage API”, storing data locally is tricky. Despite the work that is being done for data storage in HTML5, there still isn’t a good storage option that allows us to easily scale from a small data set to a large one. Storage limitations differ with each browser and we cannot rely on them to provide us with sufficient space.

Therefore, it is important that we use the space that we have as efficiently as possible.

In Kamishibai, we store HTML fragments as is in local storage. The size of these fragments ranges from a whole page to a small “like” box. Breaking up a page into multiple HTML fragments is advantageous for cache management (each fragment might require a different expiry date) and promotes reuse of the same fragment across pages. However, multiple HTML fragments mean multiple HTTP requests. Downloading the page as a single fragment is the most efficient in terms of HTTP requests.
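To make the idea of a per-fragment expiry date concrete, here is a minimal sketch. It is not Kamishibai's actual code: the names `KeyValueStore`, `MemoryStore`, `putFragment` and `getFragment` are hypothetical, and wrapping the HTML in a small JSON envelope to carry the expiry time is an assumption made for illustration. In a browser, `localStorage` could be passed wherever a `KeyValueStore` is expected.

```typescript
// Minimal key-value interface (hypothetical). The browser's localStorage
// has these three methods, so it can be used in place of MemoryStore.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
  removeItem(key: string): void {
    this.items.delete(key);
  }
}

// Assumed envelope: the HTML fragment plus its own expiry time.
interface StoredFragment {
  html: string;
  expiresAt: number; // milliseconds since the epoch
}

// Store a fragment with its own time-to-live.
function putFragment(
  store: KeyValueStore,
  key: string,
  html: string,
  ttlMs: number,
  now: number = Date.now()
): void {
  const entry: StoredFragment = { html, expiresAt: now + ttlMs };
  store.setItem(key, JSON.stringify(entry));
}

// Return the fragment if it is still fresh; otherwise evict it and return null.
function getFragment(
  store: KeyValueStore,
  key: string,
  now: number = Date.now()
): string | null {
  const raw = store.getItem(key);
  if (raw === null) return null;
  const entry = JSON.parse(raw) as StoredFragment;
  if (entry.expiresAt <= now) {
    store.removeItem(key);
    return null;
  }
  return entry.html;
}
```

With this shape, a “like” box could be stored with a time-to-live of a few minutes while a presentation heading is kept for days, which is the kind of per-fragment control that a single whole-page entry cannot offer.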

This means that we have to strike a compromise. Either we choose to optimize cache management by breaking pages up into small HTML fragments, or we optimize for network speed with larger HTML fragments.

The choice depends on how many pages we want to store locally, and the level of redundancy in each page. If we want to store a large number of pages, then it will be more efficient to use small fragments. Especially if the level of redundancy is high, we will be able to reuse small fragments effectively. Hence the choice will tend towards smaller fragments in conference systems.

Let’s look at the level of redundancy in conference systems.

The main pages in a Ponzu conference system are:

  1. A list of sessions.
  2. A list of presentations within a session.
  3. A presentation page (with a list of related presentations).
  4. A list of presentations in the search results.

A lot of the pages show a list of presentations. It therefore makes a lot of sense to store each element (title, authors and author affiliations of a presentation) separately so that we can construct different lists simply by combining elements.
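Constructing lists by combining stored elements could look roughly like the sketch below. The function name `renderPresentationList`, the `FragmentLookup` type and the `<ul>` wrapper are assumptions for illustration and are not taken from Ponzu or Kamishibai. The lookup reports which presentations are not cached yet, so that only those would need to be requested from the server.

```typescript
// Hypothetical lookup: returns the cached HTML for a presentation heading,
// or undefined if that presentation has not been stored locally yet.
type FragmentLookup = (presentationId: string) => string | undefined;

interface RenderedList {
  html: string;      // the list built from cached fragments
  missing: string[]; // ids that still have to be fetched
}

// Build a list page from individually cached presentation fragments.
function renderPresentationList(
  presentationIds: string[],
  lookup: FragmentLookup
): RenderedList {
  const parts: string[] = [];
  const missing: string[] = [];
  for (const id of presentationIds) {
    const fragment = lookup(id);
    if (fragment === undefined) {
      missing.push(id);
    } else {
      parts.push(fragment);
    }
  }
  return { html: `<ul>${parts.join("")}</ul>`, missing };
}

// Usage: a session list and a search result share the same cached fragment.
const cachedHeadings = new Map<string, string>([
  ["p1", "<li>Presentation 1</li>"],
  ["p2", "<li>Presentation 2</li>"],
]);
const lookupHeading: FragmentLookup = (id) => cachedHeadings.get(id);

const sessionList = renderPresentationList(["p1", "p2"], lookupHeading);
const searchResults = renderPresentationList(["p2", "p3"], lookupHeading);
// sessionList.html   is "<ul><li>Presentation 1</li><li>Presentation 2</li></ul>"
// searchResults.missing is ["p3"]
```

The fragment for `p2` is stored once but appears in both lists, which is where the space saving comes from.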

Furthermore, the title, authors and author affiliations section of each presentation is quite large. In addition to the text, the authors section contains links to user profile pages as well as superscript markup. The markup is significant, and we often see more than 1,000 characters per presentation for the heading alone (sans abstract text).
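A rough way to see how redundancy and heading size interact is to count stored characters under both strategies. The sketch below is only an estimate under stated assumptions: the function name `estimateStoredCharacters` is hypothetical, every heading is assumed to have the same size, and each list is assumed to store a short id per entry when fragments are used. The example lists and the 8-character id size are made up for illustration; only the 1,000-character heading size comes from the paragraph above.

```typescript
interface StorageEstimate {
  wholePages: number; // every list stores a full copy of each heading
  fragments: number;  // each heading stored once, lists store ids only
}

// Estimate stored characters for a set of lists of presentation ids.
function estimateStoredCharacters(
  lists: string[][],
  headingChars: number,
  idChars: number
): StorageEstimate {
  const uniqueIds = new Set<string>();
  let totalEntries = 0;
  for (const list of lists) {
    for (const id of list) {
      uniqueIds.add(id);
      totalEntries += 1;
    }
  }
  return {
    wholePages: totalEntries * headingChars,
    fragments: uniqueIds.size * headingChars + totalEntries * idChars,
  };
}

// Illustrative input: three lists that overlap heavily.
const exampleLists = [["p1", "p2", "p3"], ["p2", "p3"], ["p3"]];
const estimate = estimateStoredCharacters(exampleLists, 1000, 8);
// 6 list entries, 3 unique presentations:
// estimate.wholePages is 6000, estimate.fragments is 3048
```

The more often the same presentation appears in different lists, the larger the gap between the two numbers becomes.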

In MBSJ2012, we did not break up lists into fragments. In addition to the large size, we also observed that rendering took a lot of time when the number of presentations in a list was large. Rendering the author list required a lot of string manipulation and often triggered garbage collection, resulting in long response times (several hundred milliseconds).

In future versions of Ponzu and Kamishibai, we will break up presentation lists into per-presentation fragments. Each presentation fragment will have a long expiry time so that the version cached inside the browser will be used. Additionally, we will use caching on the server. Our current tests show that this should improve responsiveness in most cases.
