One of the best feelings a software developer can have in their career is to be presented with a problem, implement a shiny new solution, and sit back and watch that solution work in all facets of solving the problem at hand.  At least, I used to think that, until I came upon a situation a few months back where I realized I had not a new problem, but an old lingering issue that had grown to monstrous proportions.  I solved it using technology that no one on the development team was familiar with, and the technology was not new at all.  There is something to be said for revamping existing functionality riddled with problems, where you break out old-school technology and just crush the problem completely.  It was the most fun that I had had in a long, long time.

So I would like to walk through how I recognized that not only did something need to be done, but that my bag of tricks came up empty every time I attempted to solve the problem.  Enter the Google.  Seriously, I love a good search, and when I searched on a generic description not of what my problem was, but of what I felt I needed to do to solve it, lo and behold, there were lots of search hits that were basically variations on the same theme.  There it was.  Simple, clean, and very straightforward, and I just knew it would work.  It really doesn’t get any better than that.

The Problem

This specific example is in Java, but if you keep reading you will soon realize that this solution can work for lots and lots of varying scenarios in most languages (certainly the top-tier languages support it).  It was an age-old problem really.  On the client side, I had a large DOM object (a grid/table) that needed to update up to N records by sending the grid row data to the server for processing.  Updating multiple records required a round trip to the server: updating the database, sending the updated data back to the client, and displaying the new data in an AJAX sort of way.  The initial implementation had its issues; take your pick: some sort of “threading generator” on the client side with a JavaScript timer (yes, I just wrote that) that kicked off AJAX calls; a loop in the controller that essentially processed each record serially; issues that were encountered being batched up in a separate queue to be rendered later; and, as the application got larger (in terms of data), each iteration of updates taking longer and longer, so you really didn’t know how long the entire process would take if you were submitting lots of records for processing.  It was not pretty.  So, to the Google I went.

The Google Search

I knew I wanted something that would allow me to send the bare minimum of data, en masse, to the server through some conduit.  I wanted to have the controller carve it up so that the server could process the individual pieces independently of one another and, when each piece completed, send the results back to the client over the same communication conduit.  In minutes, I had it: WebSockets.  The technology was new to me (and my team), but come to find out, it wasn’t new at all.  The WebSocket protocol was standardized by the Internet Engineering Task Force in 2011 (as RFC 6455), but was actually in use before that.  Google Chrome first shipped support for an early draft of the protocol back in late 2009.

What is a WebSocket?  Well, essentially the WebSocket protocol provides for a full-duplex communication channel over a single TCP connection.  This protocol provides a standardized way for the server to send content to the browser without being asked by the client, which allows for defined messages to be passed back and forth while keeping the connection open.  This is how a two-way (bi-directional) ongoing conversation can take place between a browser and the server.  The messages that are passed back and forth are usually XML or JSON.  So, that’s it.  You open up a connection, send data through by calling a JavaScript method, intercept the data on the server, process the data on the server, send the result set back over the same connection, and then parse the results on the client (another method call) and render accordingly.  You have a lot of control over the whole process.

WebSockets in a Nutshell

To use WebSockets, you first have to create a WebSocket object in the client.  This is done very easily.  The WebSocket constructor takes two parameters: one required and one optional.  The required parameter is the URL not just to the server itself, but to the class that is designated as a WebSocket “endpoint”.  This designation is usually done with a simple annotation.  The endpoint class is nothing more than a listening post for all in-bound traffic coming over the connection.  So to create a WebSocket, you simply create a WebSocket object on the client:

var wSock = new WebSocket("ws://www.reardensteel.com/socketServer", "socketOne");

Two things to note here are the URL scheme and the “socketOne” protocol string.  The scheme here is “ws”, but if you’re using secured communication, there is a “wss” option.  The protocol string is, strictly speaking, a subprotocol name that the server has to accept during the opening handshake, so the endpoint must be configured to accept it or the browser will fail the connection.  Once agreed upon, it is also a handy way to “name” the connection: if you open up more than one simultaneous connection, you can keep track of the distinct connections in your logs.  To send data to the server, you simply call the WebSocket.send method once the connection is open.  You can send either a String, Blob, or ArrayBuffer.  Usually, I just send JSON strings:

wSock.send(JSON.stringify({
    orderNumber: "10023452",
    metricTons: "20453",
    clientName: "TaggertTransCon",
    clientId: "432234"
}));

At this point, the text is sent to the server and directed to the class annotated with the @ServerEndpoint("/socketServer") annotation.  This class has several annotated methods that are helpful to be aware of.  In my case, I knew that each thread would have a callback function so that, upon thread completion, it would have access to the session object.  The method annotated with @OnOpen allowed me to do just that.  As each unique connection is opened, I just stuff the session object in a map for access down the road.  The Session object gives you, among other things, access to the open connection for sending data to the client.  The method annotated with @OnMessage receives the message text, unbuckles the JSON, and deserializes it into POJOs for processing.

import javax.websocket.OnClose;
import javax.websocket.OnError;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

@ServerEndpoint("/socketServer")
public class MyCustomEndpoint {

    @OnOpen  //executed once the connection is initially opened
    public void onOpen(Session session) {
        //add session object to an accessible map
        //for thread callback function
    }

    @OnMessage  //executed for each text message received from the client
    public void processRecords(String recordSet) {
        //break recordSet into individual POJOs
        //and process each one in a separate thread
    }

    @OnError  //executed when an error occurs on the connection
    public void onError(Throwable t) {
        //handle errors
    }

    @OnClose  //executed once the connection is closed
    public void onClose(Session session) {
        //do stuff
    }
}
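The “stuff the session object in a map” step could be sketched roughly as follows.  This is only an illustration under stated assumptions: Connection is a hypothetical stand-in for javax.websocket.Session (only the getId method mirrors the real interface) so that the example runs without a WebSocket container, and ConnectionRegistry, register, unregister, and sendTo are made-up names.  The one design point worth noting is the ConcurrentHashMap, since worker threads will be reading the map while connections are being opened and closed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Thread-safe registry of open connections, keyed by connection id (hypothetical helper). */
class ConnectionRegistry {
    private final Map<String, Connection> open = new ConcurrentHashMap<>();

    /** Intended to be called from the @OnOpen method. */
    void register(Connection c) { open.put(c.getId(), c); }

    /** Intended to be called from the @OnClose method so closed connections do not pile up. */
    void unregister(Connection c) { open.remove(c.getId()); }

    /** Sends text if the connection is still registered; returns false if it is gone. */
    boolean sendTo(String id, String text) {
        Connection c = open.get(id);
        if (c == null) {
            return false;
        }
        c.sendText(text);
        return true;
    }

    int size() { return open.size(); }

    public static void main(String[] args) {
        ConnectionRegistry registry = new ConnectionRegistry();
        Connection demo = new Connection() {
            public String getId() { return "demo-1"; }
            public void sendText(String text) { System.out.println("to demo-1: " + text); }
        };
        registry.register(demo);
        registry.sendTo("demo-1", "{\"type\":\"id\",\"id\":\"10023452\"}");
        registry.unregister(demo);
    }
}

/** Hypothetical stand-in for javax.websocket.Session, reduced to what the sketch needs. */
interface Connection {
    String getId();
    void sendText(String text);
}
```

In a real endpoint the map would hold Session objects directly, and sendTo would call session.getBasicRemote().sendText(text).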

In my specific case, I coded it to unbuckle each grid row representation and convert the rows into an ArrayList of POJOs.  I then iterated over the list, creating a separate Java thread for each POJO and sending the POJO off for processing within that thread.  It is worth noting that each thread would have its very own callback function passed along as well.  This callback function would have a hook into the Session object so that it could access the open connection (mentioned above).  As each thread completed, the callback function would be called and the result set would be sent back to the client as a JSON message for rendering.  So at the thread’s termination, the following call would be made:

session.getBasicRemote().sendText(resultJSON);
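The fan-out described above (one unit of work per record, each with a callback that reports back as soon as it finishes) might be sketched like this.  It is not the original implementation: it swaps the raw thread-per-POJO approach for an ExecutorService, and FanOutSketch, processAll, the pool size of 4, the one-minute wait, and the PROCESSING_FAILED error code are all assumptions made for illustration.  The callback is invoked under a lock because the Javadoc for RemoteEndpoint.Basic warns that concurrent sends on the same connection may throw an IllegalStateException.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

class FanOutSketch {

    /**
     * Processes each record on a pool thread and hands each result to the
     * callback as soon as that record finishes. Blocks until all tasks are
     * done or the (arbitrary) one-minute wait elapses.
     */
    static <R> void processAll(List<R> records,
                               Function<R, String> worker,
                               Consumer<String> onResult) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an assumption
        Object sendLock = new Object(); // serializes callbacks so sends never overlap
        for (R record : records) {
            pool.submit(() -> {
                String result;
                try {
                    result = worker.apply(record);
                } catch (RuntimeException e) {
                    // hypothetical error message; the error code is made up for this sketch
                    result = "{\"type\":\"errorCode\",\"errorCode\":\"PROCESSING_FAILED\"}";
                }
                synchronized (sendLock) {
                    onResult.accept(result);
                }
            });
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    public static void main(String[] args) {
        List<String> orders = List.of("10023452", "10023453", "10023454");
        processAll(orders,
                   id -> "{\"type\":\"id\",\"id\":\"" + id + "\"}",
                   json -> System.out.println(json)); // stand-in for sendText(json)
    }
}
```

Results arrive in completion order, not submission order, which is exactly what lets the client render each row as soon as it is done.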

Once the server sends this JSON message, the client receives it in the WebSocket.onmessage listener.

wSock.onmessage = function (event) {
    var f = document.getElementById("orderForm");
    var msg = JSON.parse(event.data);
    switch (msg.type) {
        case "id":
            markRecordAsProcessed(msg.id);
            break;
        case "errorCode":
            processError(msg.errorCode);
            break;
        // ....
    }
    //do more things...
};
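The listener above switches on a type field, so the server has to produce messages in that shape.  A small helper along these lines could build them.  This is a sketch only: ResultMessages, processed, error, and escape are hypothetical names, the message layout is inferred from the fields the listener reads (type, id, errorCode), and a real project would more likely use a JSON library than hand-rolled string building.

```java
class ResultMessages {

    /** Builds the message handled by the client's "id" case. */
    static String processed(String id) {
        return "{\"type\":\"id\",\"id\":\"" + escape(id) + "\"}";
    }

    /** Builds the message handled by the client's "errorCode" case. */
    static String error(String errorCode) {
        return "{\"type\":\"errorCode\",\"errorCode\":\"" + escape(errorCode) + "\"}";
    }

    /** Minimal JSON string escaping: quotes, backslashes, and control characters. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(processed("10023452"));
        System.out.println(error("E42")); // "E42" is a made-up error code
    }
}
```

Either string can be passed straight to session.getBasicRemote().sendText.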

Once all the records are processed, the only thing left to do is close the connection.  It’s generally best practice to close it programmatically instead of letting it time out.

wSock.close();

As expected, there is a lot more that can be discussed about the protocol.  The good news is that even with a little bit of coding, you can get this working very quickly and it’s surprisingly a lot of fun to do.  There are lots of examples on the web that could be used as references.  A great place to start would be the WebSocket specification or the API for the javax.websocket.server package.  Both are great places to learn about all the options in setting up the channels, controlling the channel communications, shutting the channels down, and handling all sorts of errors that can occur.

Perhaps in the near future, I can provide a full example of the WebSocket implementation in Ruby on Rails using ActionCable.  Stay tuned….