Wednesday, February 24, 2016

Resource representation generation on first GET retrieval

Intent

Provide a resource type that has a new resource representation instance generated on the first retrieval. In HTTP, this is often a GET request to a URL that the resource provides.

Also Known As

The closest pattern I can find is the Multiton pattern.

Motivation

I have implement this pattern in two scenarios:

An application provides the remote control of a specific CCD detector. The detector scans a sample and acquires an image on every scan spot. The native image format is not supported by Web browsers. The users can view the scan process on a page. The page will retrieve a new image once the acquisition finishes on a scan spot. The image is then converted to PNG, and sent to the client. The PNG image is saved, and all later requests will be served directly without conversion. The challenge, in this scenario, is that the application should convert the image to PNG only once. That implies that all the requests of that image before the PNG file is available are served together.

In the other scenario, a client retrieves a user's thumbnail photo from an application. The application gets a photo from an Active Directory when the photo is requested from the first time. The application saves the photo and serves it locally thereafter. Similar to the first scenario, the application should retrieve the photo from the AD service only once.

Design

When a resource representation is requested but is not available in the local file system, the request is put into an array storing all the requests for the same representation. Such an array needs to be put into a hash table, and its key is the resource's identifier. When a resource is requested for the first time, the resource is not available in the local file system, and the corresponding key is not in the hash table. Then the key will be created in the hash table, and the first request is pushed into the array. All following requests of the same resource are pushed into the array when the application is generating the resource representation. When the representation is ready, it is saved in the file system, and the key is removed from the hash table. All the requests in the array are served in a batch. The design can be implemented in various programming languages. There is a big difference between the implementation in a non-event-driven programming language like Java and that in an event-driven programming language like node.js.

Challenge 1: synchronization of the hash table

Adding and removing similar resource requests into the hash table have to be synchronized.

Java node.js
We will have to use a concurrent util class like java.util.concurrent.ConcurrentHashMap(String, List). A simple object like {"resource-identifier": []} will work.

Challenge 2: asynchronous processing

When the application is generating or retrieving the resource, we want the thread previously allocated to the request to be freed, and the handling of the request continues when the resource is finally available. Before Servlet 3, we will have to use something like Jetty continuation for this in Java. On the contrary, because node.js by nature has only a single thread, the processing is by default asynchronous.
Java node.js
We will need put a Jetty continuation or a Servlet 3 AsyncContext into the hash map, and write the response from there when the resource is available. Just put the standard http.ServerResponse instances in the array.

In order to improve performance, we can also add cache control to these generated resources in addition to the copy in the file system.