Achieving Response Time Goals with Service Workers

2017-12-06T21:23:55Z

Service Workers enable a web application to be responsive even if the network isn't. Frameworks like AngularJS, React and Vue.js enable web applications to efficiently update and render web pages as data changes.

The Apache Software Foundation's Whimsy board agenda application uses both in combination to achieve a responsive user experience - both in terms of quick responses to user requests and quick updates based on changes made on the server.

From a performance perspective, the two cases easiest to optimize for are (1) the server fully up and running, accessed across a fast network with all possible items cached, and (2) the application fully offline, since once you make offline possible at all, it will be fast.

The harder cases are ones where the server has received a significant update and needs to get that information to users; harder still is when the server has no instances running and needs to spin up a new instance to process a request. While it is possible to do blue/green deployment for applications that are "always on", this isn't practical or appropriate for applications which are only used in periodic bursts. The board agenda tool is one such application.

This article describes how a goal of sub-second response time is achieved in such an environment. There are plenty of articles on the web that show snippets or sanitized approaches; this one focuses on real world usage.

Introduction to Service Workers

Service Workers are JavaScript files that can intercept and provide responses to navigation and resource requests. Service Workers are supported today by Chrome and Firefox, and are under development in Microsoft Edge and WebKit/Safari.

Service Workers are part of a larger effort dubbed "Progressive Web Apps" that aim to make web applications reliable and fast, no matter what the state of the network may happen to be. The word "progressive" in this name is there to indicate that these applications will work with any browser to the best of that browser's ability.

The signature or premier feature of Service Workers is offline applications. Such web applications are loaded normally the first time, and cached. When offline, requests are served by the cache, and any input made by users can be stored in local storage or in IndexedDB.

serviceworke.rs and The Offline Cookbook provide a number of recipes that can be used.

Overview of the Board Agenda Tool

This information is for background purposes only. Feel free to skim or skip.

The ASF Board meets monthly, and minutes are published publicly on the web. A typical meeting has over one hundred agenda items, though the board agenda tool assists in resolving most of them offline, leaving a manageable 9 officer reports, around 20 PMC reports that may or may not require action, and a handful of special orders.

While the full agenda is several thousand lines long, this file size is only a quarter of a megabyte or the size of a small image. The server side of this application parses the agenda and presents it to the client in JSON format, and the result is roughly the same size as the original.

To optimize the response of the first page access, the server is structured to do server side rendering of the page that is requested, and the resulting response starts with links to stylesheets, then contains the rendered HTML, and finally any scripts and data needed. This allows the browser to incrementally render the page as it is received. This set of scripts includes a script that can render any page (or component) that the board agenda tool can produce, and the data includes all the information necessary to do so. The current implementation is based on Vue.js.

Once loaded, traversals between pages are immeasurably quick. By that I mean that you can go to the first page and lean on the right arrow button, and the pages will smoothly scroll by at roughly the rate at which you can see the faces in a deck of cards shuffled upside down.

The pages generally contain buttons and hidden forms; which buttons appear often depends on the user who requests the page. For example, only Directors will see approve and unapprove buttons; and individual directors will only see one of these two buttons based on whether or not they have already approved the report.

A WebSocket between the server and client is made mostly so the server can push changes to each client; changes that then cause re-rendering and updated displays. Requests from the client to the server generally are done via XMLHttpRequest as it wasn't until very recently that Safari supported fetch. IE still doesn't, but Edge does.

Total (uncompressed) size of the application script is another quarter of a megabyte, and dependencies include Vue.js and Bootstrap, the latter being the biggest requiring over a half a megabyte of minimized CSS.

All scripts and stylesheets are served with a Cache-Control: immutable header as well as an expiration date a year from when the request was made. This is made possible by the expedient of utilizing a cache busting query string that contains the last modified date. ETag and 304 responses are also supported.
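
As a rough illustration of how a cache busting query string and long-lived headers fit together, here is a minimal sketch. The helper names bustedUrl and longLivedHeaders are hypothetical, and the use of whole seconds in the query string and of a max-age directive are assumptions made for the example; the actual server may format both differently.

```javascript
// Hypothetical helper: append the file's last-modified time (assumed here to
// be in whole seconds) as a cache busting query string.
function bustedUrl(path, lastModified) {
  return path + "?" + Math.floor(lastModified.getTime() / 1000);
}

// Hypothetical helper: headers telling the browser that the response will not
// change and may be kept for a year from the time of the request.
function longLivedHeaders(now) {
  var oneYear = 365 * 24 * 60 * 60; // seconds

  return {
    "Cache-Control": "immutable, max-age=" + oneYear,
    "Expires": new Date(now.getTime() + oneYear * 1000).toUTCString()
  };
}
```

Because the query string changes whenever the file does, a modified script gets a new URL, and the year-long cached copy of the old URL is simply never requested again.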

Offline support was added recently. Updates made when offline are stored in IndexedDB and sent as a batch when the user returns online. Having all of the code and data to render any page made this support very straightforward.

Performance observations (pre-optimization)

As mentioned at the top of this article, offline operations are virtually instantaneous. Generally, immeasurably so. As described above, this also applies to transitions between pages.

This leaves the initial visit and returning visits; the latter includes opening the application in new tabs.

Best case response times for these cases are about a second. This may be due to the way that server side rendering is done or perhaps due to the fact that each page is customized to the individual. Improving on this is not a current priority, though the solution described later in this article addresses this.

Worst case response times are when there are no active server processes and all caches (both server side and client side) are either empty or stale. It is hard to get precise numbers for this, but it is on the order of eight to ten seconds. Starting the server accounts for somewhere around four of those. Building the JSON form of the agenda can take another two given all of the validation (involving things like LDAP queries) involved in the process. Regenerating the ES5 JavaScript from sources can take another second or so. Producing the custom rendered HTML is another second. And then there is all of the client side processing.

In all, probably just under ten seconds if the server is otherwise idle. It can be a little more if the server is under moderate to heavy load.

The worst parts of this:

No change is seen on the browser window until the last second or so.
While the worst case scenario is comparatively rare in production, it virtually precisely matches what happens in development.

Selecting an approach

Given that the application can be brought up quickly in an entirely offline mode, one possibility would be to show the last cached status and then request updated information and process that information when received. This approach works well if the only change is to agenda data, but doesn't work so well in production whenever a script change is involved.

This can be solved with a window.location.reload() call, which is described (and somewhat discouraged) as approach #2 in Dan Fabulich's "How to Fix the Refresh Button When Using Service Workers". Note the code below was written before Dan's page was published, but in any case, Dan accurately describes the issue.

Taking some measurements on this produces interesting results. What is needed to determine if a script or stylesheet has changed is a current inventory from the server. This can consistently be provided quickly and is independent of the user requesting the data, so it can be cached. But since the data size is small enough, caching (in the sense of HTTP 304 responses) isn't all that helpful.

Response time for this request in realistic network conditions when there is an available server process is around 200 milliseconds, and doesn't tend to vary very much.

The good news is that this completely addresses the "reload flash" problem.

Unfortunately, the key words here are "available server process" as that was the original problem to solve.

Fortunately, a combination approach is possible:

Attempt to fetch the inventory page from the network, but give it a deadline that it should generally beat. Say, 500 milliseconds or a half a second.
If the deadline isn't met, load potentially stale data from the cache, and request newer data. Once the network response is received (which had a 500 millisecond head start), determine if any scripts or stylesheets changed. If not, we are done.
Only if the deadline wasn't met AND there was a change to a stylesheet or more commonly a script, perform a reload; and figure out a way to address the poor user experience associated with a reload.
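
The decision made across these three steps can be condensed into a small predicate. This is only a sketch: needsReload and its arguments are hypothetical names, and it assumes that each inventory is a list of asset URLs carrying cache busting query strings, so that any changed file shows up as a URL not seen before.

```javascript
// Hypothetical helper: decide whether a late network response should trigger
// a reload.  Inventories are arrays of asset URLs, e.g. "app.js?1512595435".
function needsReload(deadlineMet, cachedInventory, freshInventory) {
  // Step 1: the network beat the deadline, so the page was built from
  // current data and nothing further is required.
  if (deadlineMet) return false;

  // Step 2: a potentially stale page was shown; check whether the fresh
  // inventory mentions any script or stylesheet not in the cached one.
  var known = {};
  cachedInventory.forEach(function(url) { known[url] = true });

  var changed = freshInventory.some(function(url) { return !known[url] });

  // Step 3: reload only if the deadline was missed AND something changed.
  return changed;
}
```

For example, needsReload(false, ["app.js?1"], ["app.js?2"]) is true, while the same inventories with deadlineMet set to true yield false.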

Additional exploration led to the solution where the inventory page mentioned above could be formatted in HTML and, in fact, be the equivalent of a blank agenda page. Such a page would still be less than 2K bytes, and performance would be equivalent to loading a blank page and then navigating to the desired page, in other words, immeasurably fast.

Implementation

If you look at existing recipes, Network or Cache is pretty close; the problem is that it leaves the user with stale data if the network is slow. It can be improved upon.

Starting with the fetch from the network:

  // attempt to fetch bootstrap.html from the network
  fetch(request).then(function(response) {
    // cache the response if OK, fulfill the response if not timed out
    if (response.ok) {
      cache.put(request, response.clone());

      // preload stylesheets and javascripts
      if (/bootstrap\.html$/.test(request.url)) {
        response.clone().text().then(function(text) {
          var toolate = !timeoutId;

          setTimeout(
            function() {
              preload(cache, request.url, text, toolate)
            },

            (toolate ? 0 : 3000)
          )
        })
      };

      if (timeoutId) {
        clearTimeout(timeoutId);
        resolve(response)
      }
    } else {
      // bad response: use cache instead
      replyFromCache(true)
    }
  }).catch(function(failure) {
    // no response: use cache instead
    replyFromCache(true)
  })

This code needs to be wrapped in a Promise that provides a resolve function, and needs access to a cache as well as a variable named timeoutId that determines whether or not the response has timed out.

If the response is ok, it will be cached and a preload method will be called to load resources mentioned in the page. That will be done immediately if the response arrived too late (toolate), or after a three second delay if it arrived in time, to allow updates to be processed first. Finally, if the response was received in time, the timer will be cleared and the promise will be resolved.

If either a bad response or no response was received (typically, this represents a network failure), the cache will be used instead.

Next the logic to reply from the cache:

  // common logic to reply from cache
  var replyFromCache = function(refetch) {
    return cache.match(request).then(function(response) {
      clearTimeout(timeoutId);

      if (response) {
        resolve(response);
        timeoutId = null
      } else if (refetch) {
        fetch(event.request).then(resolve, reject)
      }
    })
  };

  // respond from cache if the server isn't fast enough
  timeoutId = setTimeout(function() {replyFromCache(false)}, timeout);

This code looks for a cache match, and if it finds one, it will resolve the response, and clear the timeoutId enabling the fetch code to detect if it was too late.

If no response is found, the action taken will be determined by the refetch argument. The fetch logic above passes true for this, and the timeout logic passes false. If true, it will retry the original request (which presumably will fail) and return that result to the user. This is handling a never should happen scenario where the cache doesn't contain the bootstrap page.

The above two snippets of code are then wrapped by a function, providing the event, resolve, reject, and cache variables, as well as declaring and initializing the timeoutId variable:

// Return a bootstrap.html page within 0.5 seconds.  If the network responds
// in time, go with that response, otherwise respond with a cached version.
function bootstrap(event, request) {
  return new Promise(function(resolve, reject) {
    var timeoutId = null;

    caches.open("board/agenda").then(function(cache) {
        ...
    })
  })
}

Next, we need to implement the preload function:

// look for css and js files in the HTML response and ensure that each is cached
function preload(cache, base, text, toolate) {
  var pattern = /"[-.\w+/]+\.(css|js)\?\d+"/g;
  var count = 0;
  var changed = false;

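  var match;

  while (match = pattern.exec(text)) {
    count++;
    var path = match[0].split("\"")[1];

    // let gives each iteration its own request; with var, the asynchronous
    // callbacks below would all see the request from the final iteration
    let request = new Request(new URL(path, base));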

    cache.match(request).then(function(response) {
      if (response) {
        count--
      } else {
        fetch(request).then(function(response) {
          if (response.ok) cacheReplace(cache, request, response);
          count--;
          if (count == 0 && toolate) {
            clients.matchAll().then(function(clients) {
              clients.forEach(function(client) {
                client.postMessage({type: "reload"})
              })
            })
          }
        })
      }
    })
  }
};

This code parses the HTML response, looking for .css and .js files, based on knowledge of how this particular server formats the HTML. For each such entry in the HTML, the cache is searched for a match. If one is found, nothing more needs to be done. Otherwise, the resource is fetched and placed in the cache.
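
To see what the pattern picks up, the extraction step can be tried on its own. The sketch below reuses the regular expression from preload; the function name assetUrls, the sample HTML, and the example.org base URL are made up for illustration.

```javascript
// Collect the absolute URLs of the stylesheets and scripts that the pattern
// from preload finds: a quoted path ending in .css or .js followed by a
// numeric query string.
function assetUrls(text, base) {
  var pattern = /"[-.\w+/]+\.(css|js)\?\d+"/g;
  var urls = [];
  var match;

  while ((match = pattern.exec(text))) {
    var path = match[0].split("\"")[1];
    urls.push(new URL(path, base).href);
  }

  return urls;
}

// Hypothetical sample input: the image has no query string, so it is skipped.
var sample = '<link rel="stylesheet" href="../stylesheets/app.css?1512595435">' +
  '<script src="../app.js?1512595436">' +
  '<img src="logo.png">';

var found = assetUrls(
  sample,
  "https://example.org/board/agenda/2017-12-20/bootstrap.html"
);
```

Here found contains two entries, one for the stylesheet and one for the script, each resolved against the location of the bootstrap page.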

Once all requests are processed, and if this involved requesting a response from the network, then a check is made to see if this was a late response, and if so, a reload request is sent to all client windows.

cacheReplace is another application specific function:

// insert or replace a response into the cache.  Delete other responses
// with the same path (ignoring the query string).
function cacheReplace(cache, request, response) {
  var path = request.url.split("?")[0];

  cache.keys().then(function(keys) {
    keys.forEach(function(key) {
      if (key.url.split("?")[0] == path && key.url != request.url) {
        cache.delete(key).then(function() {})
      }
    })
  });

  cache.put(request, response)
};

The purpose of this method is as stated: to delete from the cache other responses that differ only in the query string. It also adds the response to the cache.
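
The selection rule can be checked in isolation from the Cache API. The following sketch uses a hypothetical helper, staleUrls, that operates on plain URL strings and picks out entries sharing a path with the new URL while differing in query string; it is meant to illustrate the rule, not to replace cacheReplace.

```javascript
// Hypothetical helper: given the URLs currently in the cache and the URL
// about to be stored, return the entries that are older versions of the same
// file (same path, different query string).
function staleUrls(cachedUrls, newUrl) {
  var path = newUrl.split("?")[0];

  return cachedUrls.filter(function(url) {
    return url.split("?")[0] == path && url != newUrl
  });
}

var stale = staleUrls(
  [
    "https://example.org/app.js?1512595435",
    "https://example.org/app.js?1512595999",
    "https://example.org/app.css?1512595435"
  ],
  "https://example.org/app.js?1512595999"
);
```

In this example only the older app.js entry is selected; the stylesheet and the entry being stored are left alone.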

The remainder is either straightforward or application specific in a way that has no performance relevance. The scripts and stylesheets are served with a cache falling back to network strategy. The initial preloading, which normally could be as simple as a call to cache.addAll, needs to be aware of query strings, and for this application it turns out that a different bootstrap HTML file is needed for each meeting.
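
For readers unfamiliar with the cache falling back to network strategy, a minimal sketch follows. The function name cacheThenNetwork is hypothetical, and the fetcher parameter is an assumption introduced so that the logic can be exercised outside of a service worker; inside one, the global fetch would be passed.

```javascript
// Answer from the cache when possible, otherwise go to the network.
// `cache` needs a match() method returning a promise, as the Cache API does;
// `fetcher` stands in for the global fetch function.
function cacheThenNetwork(cache, request, fetcher) {
  return cache.match(request).then(function(response) {
    return response || fetcher(request)
  })
}
```

Within a fetch event handler this could be used along the lines of event.respondWith(caches.open("board/agenda").then(function(cache) { return cacheThenNetwork(cache, event.request, fetch) })).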

Finally, here is the client side logic which handles reload messages from the service worker:

navigator.serviceWorker.register(scope + "sw.js", {scope: scope}).then(function() {
  // watch for reload requests from the service worker
  navigator.serviceWorker.addEventListener("message", function(event) {
    if (event.data.type == "reload") {
      // ignore reload request if any input or textarea element is visible
      var inputs = document.querySelectorAll("input, textarea");

      if (Math.max.apply(
        Math,

        Array.prototype.slice.call(inputs).map(function(element) {
          return element.offsetWidth
        })
      ) <= 0) window.location.reload()
    }
  });
})

This code watches for type: "reload" messages from the service worker and invokes window.location.reload() only if there are no input or text area elements visible, which is determined using the offsetWidth property of each element. Very few board agenda pages have visible input fields by default; many, however, have bootstrap modal dialog boxes containing forms.
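
The visibility test can be separated from the DOM for experimentation. In this sketch, anyVisible is a hypothetical helper that accepts any array of objects with an offsetWidth property; plain objects stand in for input and textarea elements.

```javascript
// Hypothetical helper: true if at least one element has a positive
// offsetWidth.  Hidden elements (for example, inside a closed modal dialog)
// report an offsetWidth of 0.
function anyVisible(elements) {
  var widest = Math.max.apply(
    Math,
    elements.map(function(element) { return element.offsetWidth })
  );

  // With no elements at all, Math.max returns -Infinity, so the page is
  // treated as having nothing visible and a reload is allowed.
  return widest > 0;
}
```

In the browser, the argument would be Array.prototype.slice.call(document.querySelectorAll("input, textarea")), and a reload would proceed only when anyVisible returns false.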

Performance Results

In production when using a browser that supports Service Workers, requests for the bootstrap page now typically range from 100 to 300 milliseconds, with the resulting page fully loaded in 400 to 600 milliseconds. Generally, this includes the time it takes to fetch and render updated data, but in rare cases that may take up to an additional 200 milliseconds.

In development, and in production when there are no server processes available and when accessed using a browser that supports Service Workers, the page initially loads in 700 to 1200 milliseconds. It is not clear to me why this sees a greater range of response times; but in any case, this is still a notable improvement. Often in development, and in rare cases in production, there may be a noticeable refresh that occurs one to five seconds later.

Visits by browsers that do not support Service Workers, and for that matter the first visit by a new user to the board agenda tool, do not see any performance improvement or degradation with these changes.

Not a bad result from less than 100 lines of code.