Serenity Leads to a Further Vision: Java

Wednesday, February 24, 2016

Resource representation generation on first GET retrieval

Intent

Provide a resource type that has a new resource representation instance generated on the first retrieval. In HTTP, this is often a GET request to a URL that the resource provides.

Also Known As

The closest pattern I can find is the Multiton pattern.

Motivation

I have implemented this pattern in two scenarios:

An application provides remote control of a specific CCD detector. The detector scans a sample and acquires an image at every scan spot. The native image format is not supported by Web browsers. Users can watch the scan progress on a page, which retrieves a new image once the acquisition finishes at a scan spot. The image is then converted to PNG and sent to the client. The PNG image is saved, and all later requests are served directly without conversion. The challenge in this scenario is that the application should convert the image to PNG only once. This implies that all requests for that image that arrive before the PNG file is available have to be held and served together once the conversion finishes.

In the other scenario, a client retrieves a user's thumbnail photo from an application. The application gets the photo from an Active Directory (AD) service when the photo is requested for the first time. The application saves the photo and serves it locally thereafter. As in the first scenario, the application should retrieve the photo from the AD service only once.

Design

When a resource representation is requested but is not available in the local file system, the request is put into an array that stores all the requests for the same representation. The array is kept in a hash table, keyed by the resource's identifier. When a resource is requested for the first time, the representation is not in the local file system and the corresponding key is not in the hash table. The key is then created in the hash table, and the first request is pushed into the array. All following requests for the same resource are pushed into the array while the application is generating the representation. When the representation is ready, it is saved in the file system, the key is removed from the hash table, and all the requests in the array are served in a batch.

The design can be implemented in various programming languages. There is a big difference between an implementation in a non-event-driven language like Java and one in an event-driven environment like node.js.

Challenge 1: synchronization of the hash table

Adding requests for the same resource to the hash table, and removing them from it, has to be synchronized.

Java: We have to use a concurrent utility class such as java.util.concurrent.ConcurrentHashMap<String, List>.
node.js: A simple object like {"resource-identifier": []} will work.
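
To make the Java side of the design concrete, here is a minimal sketch under stated assumptions. The class name PendingRequests, the in-memory map standing in for the local file system, and the Consumer<byte[]> standing in for "a request that can be answered later" are all hypothetical and not from the original applications. The check for a saved representation is done inside compute() so that it cannot race with the removal of the key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical helper: collects requests for a representation that is still being generated.
class PendingRequests {
    // Stand-in for the local file system: identifier -> saved representation.
    private final Map<String, byte[]> saved = new ConcurrentHashMap<>();
    // The hash table from the design: identifier -> requests waiting for it.
    private final ConcurrentHashMap<String, List<Consumer<byte[]>>> waiting = new ConcurrentHashMap<>();

    /**
     * Serves the request at once if the representation is saved; otherwise queues it.
     * Returns true only for the first queued request, whose caller must start the
     * generation and later call complete().
     */
    boolean request(String id, Consumer<byte[]> responder) {
        boolean[] first = {false};
        byte[][] ready = {null};
        // compute() runs atomically per key, so the check and the append cannot
        // interleave with complete() removing the same key.
        waiting.compute(id, (key, queue) -> {
            byte[] bytes = saved.get(key);
            if (bytes != null) {
                ready[0] = bytes;      // already generated: do not queue
                return queue;
            }
            if (queue == null) {
                queue = new ArrayList<>();
                first[0] = true;       // first request creates the key
            }
            queue.add(responder);
            return queue;
        });
        if (ready[0] != null) {
            responder.accept(ready[0]);
        }
        return first[0];
    }

    /** Saves the generated representation, removes the key, and serves the whole batch. */
    void complete(String id, byte[] representation) {
        saved.put(id, representation);
        List<Consumer<byte[]>> queue = waiting.remove(id);
        if (queue != null) {
            for (Consumer<byte[]> responder : queue) {
                responder.accept(representation);
            }
        }
    }

    public static void main(String[] args) {
        PendingRequests pending = new PendingRequests();
        List<String> log = new ArrayList<>();
        boolean first = pending.request("scan-1.png", b -> log.add("A"));
        boolean second = pending.request("scan-1.png", b -> log.add("B"));
        pending.complete("scan-1.png", new byte[] {1, 2, 3});
        System.out.println(first + " " + second + " " + log); // true false [A, B]
    }
}
```

Only the caller that receives true starts the conversion or the AD lookup; everyone else simply waits in the list until complete() is called.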

Challenge 2: asynchronous processing

While the application is generating or retrieving the resource, we want the thread previously allocated to the request to be freed, and the handling of the request to continue when the resource is finally available. Before Servlet 3, we had to use something like Jetty continuations for this in Java. In contrast, because node.js by nature runs on a single-threaded event loop, the processing is asynchronous by default.

Java: We need to put a Jetty continuation or a Servlet 3 AsyncContext into the hash map, and write the response from there when the resource is available.
node.js: Just put the standard http.ServerResponse instances in the array.
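
The following is a rough sketch of the asynchronous idea using only the JDK. It swaps in a CompletableFuture as a stand-in for the Jetty continuation or AsyncContext, so it is not the Servlet API itself; the class name AsyncRepresentations and the generator function are hypothetical. In a real servlet, the callback would write to the response held by the AsyncContext and then call its complete() method. Unlike the design above, this variant keeps the finished future in the map as an in-memory copy instead of removing the key.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: one shared future per resource identifier.
class AsyncRepresentations {
    private final ConcurrentHashMap<String, CompletableFuture<byte[]>> futures = new ConcurrentHashMap<>();
    private final Function<String, byte[]> generator; // assumed: converts or fetches the resource
    private final Executor workers;                   // generation runs here, not on request threads

    AsyncRepresentations(Function<String, byte[]> generator, Executor workers) {
        this.generator = generator;
        this.workers = workers;
    }

    /** Returns the shared future; only the first caller for an id triggers the generation. */
    CompletableFuture<byte[]> get(String id) {
        return futures.computeIfAbsent(id,
                key -> CompletableFuture.supplyAsync(() -> generator.apply(key), workers));
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        AtomicInteger generations = new AtomicInteger();
        AsyncRepresentations cache = new AsyncRepresentations(id -> {
            generations.incrementAndGet();
            return id.getBytes(StandardCharsets.UTF_8);
        }, workers);

        // Each "request" registers a callback and returns without blocking its thread.
        CompletableFuture<Void> a = cache.get("photo-42")
                .thenAccept(bytes -> System.out.println("A got " + bytes.length + " bytes"));
        CompletableFuture<Void> b = cache.get("photo-42")
                .thenAccept(bytes -> System.out.println("B got " + bytes.length + " bytes"));

        CompletableFuture.allOf(a, b).join(); // only the demo waits here
        System.out.println("generations: " + generations.get()); // generations: 1
        workers.shutdown();
    }
}
```

One caveat of this shortcut is that a failed generation also stays in the map, so a production version would need to remove the entry on failure.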

In order to improve performance, we can also add cache control to these generated resources in addition to the copy in the file system.

Tuesday, January 21, 2014

Introducing node.js to Java developers prezi

A prezi developed for my colleagues.

Friday, December 13, 2013

PayPal's node.js versus Java benchmark and its analysis

PayPal engineers shared their test results on node.js versus Java from both system performance and developer productivity perspectives. Baron Schwartz followed with a good analysis of the results. I suggest that everyone interested in service performance and scalability read them.

I happen to know most aspects of this topic well: Java, node.js, service orchestration, NIO/AIO, and performance and scalability, including the USL (Universal Scalability Law). The following are my assumptions about, and understanding of, the original benchmark and the analysis.

The application that they implemented in both Java and node.js is a service orchestration, or a server-side mashup. That is, for an account overview page request, the server side needs to make a series of service/API requests to other services/APIs/resources. According to the description, I assume there are 3 serial blocks, and inside each block there are 2 to 5 parallel requests.
The underlying services/APIs/resources are independent of the service/resource that generates the account overview page. We can assume that the characteristics of their response times (mean, second-order moments, or distribution) do not change because of the load on the account overview page service/resource. This assumption is required for the conclusion to be an apples-to-apples comparison.
The account overview page requests result in an I/O-intensive workload on the system, not a CPU-intensive one. More precisely, it is reverse I/O: the service orchestration initiates parallel outbound clients, over HTTP or other protocols such as database connections. If you want to know more about this, please google C10K or Reverse C10K. I assume they used 5 cores for the Java implementation in order to make sure the CPU would not become a bottleneck in the benchmark because of the number of concurrent threads. The maximum number of concurrent threads would be N x P, where N is the number of concurrent overview requests and P is the number of concurrent threads per request. In this case, P could be 6: 1 for the inbound overview request and 5 for outbound API/resource requests. In their measurement, this number can reach 15 x 6 = 90. I assume they have a good Java HTTP client library that has a shared thread pool.
It is not clear which Java Servlet container was used in the benchmark. Some recent Servlet containers do support NIO. However, most Servlets are still synchronous. Only a few platforms, such as Jetty and Netty, support the asynchronous style. The difference between synchronous and asynchronous Servlet processing is that the former has to block a thread during outbound I/O. In node.js, everything is asynchronous if you program in the natural way.
In the USL models, Java's kappa (crosstalk or coherency penalty) is higher because the threads share resources including CPUs and memory. Its sigma (serialization or contention penalty) is lower because the tasks are performed in parallel by the threads. node.js's kappa is lower because the I/O operations are processed serially by the single event loop. For the same reason, its sigma is much higher. This shows the beauty of the USL model.
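
For reference, the USL is commonly written in the form below, which shows where the two penalties enter. Here N is the load (for example, the number of concurrent users or requests) and C(N) is the relative capacity at that load; reading N as the concurrency level of the benchmark is my assumption.

```latex
C(N) = \frac{N}{1 + \sigma (N - 1) + \kappa N (N - 1)}
```

The sigma term grows linearly with N while the kappa term grows quadratically, so a higher kappa is what eventually makes throughput decline rather than merely level off.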

Thursday, April 22, 2010

Enable java in Firefox 3.0 on Ubuntu 9.04

I assume you have installed the JVM or SDK already. If not, see this document for reference. You may also need to use the update-alternatives tool to choose the latest version as the default. Then all you need to do is link the libjavaplugin_oji.so plugin into the mozilla plugins directory. Creating a link in the firefox plugins directory on your system won't work. You can use the following commands to locate the plugin and create the link.

locate libjavaplugin_oji.so
sudo ln -s {the latest java}/jre/plugin/i386/ns7/libjavaplugin_oji.so /usr/lib/mozilla/plugins/libjavaplugin_oji.so

Restart Firefox and you should be able to see Java in the Add-ons list.

Wednesday, May 21, 2008

Java String to byte array and InputStream

Based on Google results, the best solutions are

byte[] byteArray = yourString.getBytes(charsetName);

and

InputStream stream = new ByteArrayInputStream(yourString.getBytes(charsetName));
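
As a small runnable illustration of the two one-liners, the sketch below uses StandardCharsets.UTF_8 instead of a charset name string, which avoids the checked UnsupportedEncodingException. The class and method names are made up for the example, and readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class wrapping the two conversions.
class StringBytesDemo {
    /** Encodes the string with an explicit charset rather than the platform default. */
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Wraps the encoded bytes in an in-memory stream. */
    static InputStream toStream(String s) {
        return new ByteArrayInputStream(toBytes(s));
    }

    public static void main(String[] args) throws IOException {
        String text = "h\u00e9llo"; // "héllo": é takes two bytes in UTF-8
        System.out.println(toBytes(text).length); // 6

        try (InputStream in = toStream(text)) {
            String roundTrip = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            System.out.println(roundTrip.equals(text)); // true
        }
    }
}
```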