Tuesday, November 1, 2016

YouTube video play across linked devices and RESTful service composition

When I was asked to explain RESTful service composition, and what I mean by staged computation and baton passing, I always need to find a concrete example instead of using abstract terms. Watching YouTube video across devices is a good case for this purpose.

A user can find a video on YouTube on her/his mobile device, and then continue play it on any other TV devices linked to YouTube. S/he can also stop/resume/skip on the mobile device or the current playing device. It is also easy to switch between linked active TV devices.

We can think the video play as a service composition. The goal of the composition is to play the video to the audience. The composition is composed of YouTube, the mobile device, and linked devices. The audience is the consumer of the composition.

Watching video across devices can be viewed as a RESTful service composition. The computation is the play of the video from user starting to watch one to stopping it. It can have several stages, each of which the video is played on one device.

When the video play is switched from one device to the other, the computational baton is passed. The next device continued the computation according to where the play should start from, and perhaps audience's other preferences. But the computation has to be adjusted according to the current device's resource, like screen size and bandwidth.  

So is there a central conductor service? Maybe user's mobile device works like a conductor, but the playing of the video is in fact carried out by the linked devices and the mobile device itself. The linked device can also stop/resume/skip anytime.

Thursday, October 13, 2016

From User Generated Contents to User Defined Applications

In short: I think an interesting thing we can achieve on the Web is user-defined applications. The technologies, services, and most importantly, capability of web users will enable it.

You can't connect the dots looking forward; you can only connect them looking backwards.
---Steve Jobs, You've got to find what you love

We drive into the future using only our rearview mirror.
---Marshall McLuhan

We learned how to use email instantly

Ray Tomlinson will be remembered by his invention of email in 1971. @ became one of the most well known special characters used everyday. Although email at first imitated the paper mails, it has been eliminating paper mails, and furthermore changed the way of human communication. Still, in 1994, about 20 years after the invention of email, I knew nothing about email and did not know how to send an email when going to a university of about 50 thousand students. As I knew, at that time, there were only two computers that can be used to send emails on the campus. About one year after CERNET became available on campus, every students I met in computer classrooms had at least two email addresses. Everyone knew how to use email in a night. An interesting story is that the creator of a popular Foxmail email client in China, Zhang Xiaolong, recently leaded the development of WeChat application.

People know how to surf the Web

The first thing I would do was to open a browser when I spent booked computer time (in Chinese universities, it was called "machine time".). There were not too many websites in China at that time. I needed to write several popular domain names and interesting links on my paper notebook, and took it with me when I went to the computer classrooms. There were no real "personal" computers. Yahoo and other similar websites were big helpers for users like me at that time. Later when Google was known, all the portal web sites started their dying process. The new generation of web users in China, like my father, started to surf the web on their smart phones.

We like sharing

httpd was the starting point of the whole Apache foundation, and somehow the same for the open source movement. httpd gave individuals and organizations an online space to share contents about themselves. Almost everyone in academic research community got a personal homepage where they talked about their teaching, research, and personal life. They easily got attention from colleagues and students.

Blogs and wikis soon emerged as tools to encourage sharing, discussion, and collaboration. At the beginning, only experienced computer users knew how to set up blogs or wikis on organization servers or personal computers connected to research or commercial networks. There were so many blog software, and the one I tried was in Perl. Blog services soon kicked the blog software out and they became the major players. Most bloggers migrated their blogs from their personal or organization server to blog services. Blog readers, like Google reader, came to the hype when bloggers generated a big amount of blog daily.  The reader was later killed by Google in order to give way to Google's leading social platform Google+. Google+ was the Google's answer for the emerging social content and networking market led by Facebook.

We create media

If writing blogs and wikis takes a while, just wording a sentence, taking a picture or recording a video seems easier. The success of youtube and iPhone generated an explosion of user-created media. That also very naturally makes sharing digital media the No.1 feature of the new generation of social applications. The traditional way of content distribution saw an end of its life. Newspapers and magazines become pale when every normal people can publish and distribute their stories in the media. The reception of media, no matter positive or negative, makes everyone how deeply they are connected to the rest of world.

We created applications on the Web 

An application helps its users to perform a task. Hypermedia itself is an application. When creating hypermedia, we already created applications.

We put links in our pages to guide the readers to explore related concepts and stories. When a page is carefully crafted, the author knows where the readers will land.

We add forms in the pages to accept readers' inputs. Forms provide a way to start interactions between peers including the one who initially created the content. Comments and ratings are all forms.

Can we do more

Yes, people can do much more on the web. And in fact, we already started. We can easily define personal web sites including blogs and wikis. We can easily embed contents created and hosted on different providers. We can easily stream media from web across devices. We can easily set up personal shops without worrying about payments and shipping details. Can we go even further?

What is user defined application

A user-defined web application is a web application generated by a platform from user defined specifications. It provides both basic graphical user interface (GUI) and application program interface (API). Ideally, the platform provides an environment for users to compose and test the application specifications. A user defines
  • the structure and types of data that application will store, 
  • the representation of the data for human and other applications, 
  • the generation of new data and corresponding new representations that will be triggered by user or application interaction, and 
  • who or what applications can interact with the application. 

Monday, September 26, 2016

Observations of Netflix's Zuul 2 experience

An engineering team from Netflix just published their experience of building a new Zuul from synchronous blocking I/O to asynchronous non-blocking I/O. Here are interesting observations from my perspective.

1. Asynchronous programming is hard.

Concurrency might be one of the most challenging problem in programming. In Java, multithreading already became the norm for concurrency. The Java concurrency package made programmers' life easier than it really is. But it did not change the nature of concurrency. Testing and debugging of multithreading programs are difficult. Reproducing racing, locking, and suspicious execution sequences are tricking.

If a programmer likes to be the boss in her/his program, then she/he will be disappointed by asynchronous programming. No executing sequence is garnered by the sequence in code. The routines say "do not call me, I'll call you." right after the initial call. The way to testing and debugging is different from that for synchronous code. The biggest challenge is that programmers have to change their mindset from sequential to concurrent, from a single player game to a multiple player game, and accept the fact that there is no NOW!

2. For CPU-intensive independent jobs, there is no big performance difference between multi-threading sync blocking and event-based async non-blocking.

When discussing performance and scalability, we should first identify the bottleneck. We should see only one bottleneck at one time, and will see a new one emerging when the current one disappears. When the jobs are CPU-intensive, the bottleneck is the CPU, and the throughput is decided by the CPU. Although whether the I/O is blocking or non-blocking will affect how long the processes will wait, its impact on the throughput is shrouded by the processing time in CPU. However, it will definitely have an impact on the memory. I hope the Netflix team could compare the memory profiles.

In fact, multi-threading does not help accelerate CPU-intensive jobs. In theory, multiprocess with a non-blocking I/O feeding to the job queue will do the best, where the number of processes is decided by the number of cores. Timeout might be required when a job takes too long to finish.

3. Async is the natural way to work with non-blocking I/O (or async I/O), and it introduces capacity gain for I/O intensive jobs.

For I/O intensive jobs, the time to finish the I/O will be significant. If resources are occupied while waiting for the I/O to finish, they are wasted and do not contribute to the throughput but results in a high utilization. On the contrary, async frees resources when I/O is going on, and recapture them only when I/O is done and data is available. The C10K problem is the best resource discussing this on the OS level.

4. Contention and Coherency of a system are decided by both the characteristics of jobs and the system's implementation.

In the Universal Scalability Law (USL), the contention is the coefficient of the first order penalty for concurrency, and the coherency is that of the second order. The contention represents the part of job that cannot be paralleled. When the number of requests in a system is N, the contention will affect (N-1) of the rest requests. The coherency represents the part of each job that can result in the generation of a new contention, which results in the second order effect. Thrashing occurs typically when coherency penalty is dominant.  One example of coherency occurs when a request resulting in updating a DB record finds its copy of data is stale. Obviously, the request needs to wait and try again when its copy gets refreshed. The refreshment time can potentially be affected by all other requests that update the DB. This can also occur to shared memory of multiple threads.

In order to achieve better scalability, we need to design the system so that the contention and coherency can be reduced. Async can do nothing about the contention on CPU, but it can reduce the contention and coherency on I/O.

Wednesday, June 1, 2016

Refactoring an express.js application without IDE

Refactoring is scaring.

Without support from an IDE, it becomes even difficult. However, git can help. I have done refactoring for an express.js web application several times recently. I summaries the following rules to make the process less scaring.

Rule 1. branch before refactoring

If the refactoring is not going well, you can always go back.  In most cases, I have to touch the model, the view, and the controller. If you named your files according to RESTful resource name, it is very likely you need to update the file names.

Rule 2. rename files and then commit before touch the codes inside files

After add new files and rm old ones, git will recognized the changes are renaming files. If you make changes inside a renamed file, then the renaming will be lost in git history.

Rule 3. refactoring model first, then the views and client side JavaScript files, and finally the controller

In this sequence, you will be able to have a testable application after each code modification. When M-V-C are updated, and all tests passed, you can merge the refactoring branch back to the original one.

Tuesday, May 17, 2016

PUT to update a resource state

It is common that a resource's behavior changes according to its state. A online survey is such a resource. A survey can have two states: idle and active. In idle state, a survey does not accept user inputs, and it does in active state.  The state transition can be seen in the follow figure.

Survey state transition diagram
Now if we want to expose the status of the survey as a resource, then we will have the following interfaces.

Method URL Details
GET /surveys/:id/state Get the current state of a survey
PUT /surveys/:id/state Set the state of a survey

The PUT request can have the state to set in the request body. The response can be 200 with the set result. The response can also be 4xx, if the client has erred. The error can be 403 forbidden, 404 survey not found, or 409 conflict if the state to set is not supported.

I am surprised to see that POST was used to set survey state in Google's survey API. The interfaces are:

Method URL Details
POST /surveys/:id/start set the survey to be active
POST /surveys/:id/stop set the survey to be idle

This design has two problems compared to the previous design:
  1. /surveys/:id/start and /surveys/:id/stop are not resources; and
  2. POST is not idempotent, but PUT is. 

Wednesday, March 23, 2016

An Express middleware to check if a resource exists

After writing a similar code snippets for tens of times, I decided to write a middleware to handle it. The scenario is quite common. You want to check if an resource exists before going to the next step of processing. In my applications,  it is a query in the MongoDB.

 * A middleware to check if id exists in collection
 * app.get('/resources/:id/', exist('id', collection), function(req, res){...}}
 * @param  {String} id            the parameter name of item id in req object
 * @param  {Model} collection     the collection model
 * @return {Function}             the middleware
function exist(id, collection) {
  return function (req, res, next) {
    collection.findById(req.params[id]).exec(function (err, item) {
      if (err) {
        return res.send(500, 'Something wrong!');

      if (!item) {
        return res.send(404, 'item ' + req.params[id] + ' not found');
      req[req.params[id]] = item;

Introduction to REST prezi updated

I updated the prezi of an introduction to REST for next week's MSU Web Dev Cafe meeting next week. The major change was to add a case study to tell of a design is RESTful.

Tuesday, March 15, 2016

Forcing clients to reload on new application releases

Many Web application clients get data via Ajax. The resources including the CSS and JavaScript files  are never reloaded once loaded in this way. However, some new features require the clients to retrieve the updated resources from the server. This can be achieved by maintaining a service generated release number on the client, and compare that number with the current release number on the server via Ajax. If the release number is updated, then run
You will need to either disable the the cache for the resource of release number, or set cache to false of the Ajax GET request.

Wednesday, February 24, 2016

Resource representation generation on first GET retrieval


Provide a resource type that has a new resource representation instance generated on the first retrieval. In HTTP, this is often a GET request to a URL that the resource provides.

Also Known As

The closest pattern I can find is the Multiton pattern.


I have implement this pattern in two scenarios:

An application provides the remote control of a specific CCD detector. The detector scans a sample and acquires an image on every scan spot. The native image format is not supported by Web browsers. The users can view the scan process on a page. The page will retrieve a new image once the acquisition finishes on a scan spot. The image is then converted to PNG, and sent to the client. The PNG image is saved, and all later requests will be served directly without conversion. The challenge, in this scenario, is that the application should convert the image to PNG only once. That implies that all the requests of that image before the PNG file is available are served together.

In the other scenario, a client retrieves a user's thumbnail photo from an application. The application gets a photo from an Active Directory when the photo is requested from the first time. The application saves the photo and serves it locally thereafter. Similar to the first scenario, the application should retrieve the photo from the AD service only once.


When a resource representation is requested but is not available in the local file system, the request is put into an array storing all the requests for the same representation. Such an array needs to be put into a hash table, and its key is the resource's identifier. When a resource is requested for the first time, the resource is not available in the local file system, and the corresponding key is not in the hash table. Then the key will be created in the hash table, and the first request is pushed into the array. All following requests of the same resource are pushed into the array when the application is generating the resource representation. When the representation is ready, it is saved in the file system, and the key is removed from the hash table. All the requests in the array are served in a batch. The design can be implemented in various programming languages. There is a big difference between the implementation in a non-event-driven programming language like Java and that in an event-driven programming language like node.js.

Challenge 1: synchronization of the hash table

Adding and removing similar resource requests into the hash table have to be synchronized.

Java node.js
We will have to use a concurrent util class like java.util.concurrent.ConcurrentHashMap(String, List). A simple object like {"resource-identifier": []} will work.

Challenge 2: asynchronous processing

When the application is generating or retrieving the resource, we want the thread previously allocated to the request to be freed, and the handling of the request continues when the resource is finally available. Before Servlet 3, we will have to use something like Jetty continuation for this in Java. On the contrary, because node.js by nature has only a single thread, the processing is by default asynchronous.
Java node.js
We will need put a Jetty continuation or a Servlet 3 AsyncContext into the hash map, and write the response from there when the resource is available. Just put the standard http.ServerResponse instances in the array.

In order to improve performance, we can also add cache control to these generated resources in addition to the copy in the file system.