Long polling with nodejs

A toy example of long polling with nodejs

2021-08-01

Long polling is just like short polling, but with a bigger pole! No, not really!
Sometimes, we need to get updates from a server, but we don’t know if the resource that we’re after is there at the moment.
Examples include:

  • Checking if a file exists on a server
  • Waiting for some field to be updated
  • Checking for new messages on a chat app

I bet you can find more similar scenarios.

So, what are your options today in our HTTP-driven world?

  1. Short polling
  2. Using something like Server-Sent Events (SSE)
  3. WebSockets (which start as an HTTP request and then upgrade to a persistent connection over TCP)
  4. Long polling

Of course, if you don’t limit yourself to plain HTTP requests, you can use gRPC, a pub/sub server, or roll your own protocol over TCP.

So what is long polling anyway?

Long polling is a technique for emulating a continuous connection between a server and a client using ordinary HTTP requests.

The server:

  1. Receives a request
  2. Checks if it has data to return to the client
  3. If it has data, it returns it in the response
  4. If it doesn’t have data, it waits and checks if the data is available repeatedly (polling)
  5. After a certain time has passed the server “gives up” and returns an empty response

The client:

  1. Sends the request
  2. Waits
  3. Receives the response
  4. Checks for data (and if there is, does something with it)
  5. Repeats

How is long polling better than just polling?

“Just polling” or short polling can only get results at the interval it polls. So, in short polling this is how it goes:

  1. The client sends a request
  2. The server immediately returns a response
  3. The client checks for data (and does something if it’s available)
  4. The client sleeps an “interval” and repeats

The connection here is not continuous. If our interval is 5 seconds, we only get results every 5 seconds, so a result can be up to 4.99 seconds old by the time we see it. With long polling, on the other hand, we get the result almost immediately. I say almost, because there’s the time between getting the response and sending the next request. Let’s assume that time is 100 milliseconds, plus 200 for latency. In the worst case your result can be 300 milliseconds old. This is for sure a lot more “real time”-ish.
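To make the arithmetic concrete, here is a tiny sketch that computes the worst-case age of a result for both approaches. The function names are made up for illustration, and the 100 ms and 200 ms figures are just the assumptions from the paragraph above, not measurements.

```javascript
// Worst-case age of a result with short polling (ignoring network latency):
// an update that lands right after a poll waits a full interval for the next one.
function shortPollWorstCaseMs(intervalMs) {
    return intervalMs;
}

// Worst-case age with long polling: the update lands right after a response
// was sent, so it waits for the client turnaround plus the network latency.
function longPollWorstCaseMs(turnaroundMs, latencyMs) {
    return turnaroundMs + latencyMs;
}

console.log(shortPollWorstCaseMs(5000));    // 5000
console.log(longPollWorstCaseMs(100, 200)); // 300
```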

Alright! I assume that you are now convinced. Let’s move on to the implementation.

The client

This is the easiest part: just loop and make the requests.

client.js


const axios = require('axios').default;

async function main() {
    while (true) {
        console.log("requesting...");
        try {
            // wait for the server to answer (with data, or empty after its own timeout)
            const response = await axios.get('http://127.0.0.1:8080', { timeout: 10000 });
            console.log(response.data); // This will sometimes be empty
        } catch (error) {
            console.log('People we have an error!', error);
            // break out of the loop in case of error
            // maybe in a real-life situation we could do something here
            break;
        }
    }
}

main();

This code is pretty simple: loop forever and print whatever you got.
Note that the timeout is 10 seconds, which should be longer than the time the server will wait (the server returns after 5 seconds if there’s no data).

The server

Now, this is where the magic happens.
Let’s start with our “resource”. I have implemented a very generic “one time” event emitter. The calling code can register to the “event”, and when the event happens it is automatically unregistered. You’ll see how I’m using it in a sec.

event.js



// A very generic event emitter
class EventEmitter {

    listeners = {};

    fire(event) {
        for (var k in this.listeners) {
            let listener = this.listeners[k];
            this.unregister(k); // unregister this listener
            listener(event);
        }
    }

    register(id, listener) {
        this.listeners[id] = listener;
        console.log("Register", id)
    }
    
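    unregister(id) {
        // `delete` evaluates to true even for a missing key,
        // so check first to report whether a listener was really removed
        if (!Object.prototype.hasOwnProperty.call(this.listeners, id)) {
            return false;
        }
        delete this.listeners[id];
        return true;
    }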
}

module.exports = EventEmitter;

This is pretty standard. You can check out another implementation of it here.
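To see the “one time” behaviour in isolation, here is a small usage sketch. It carries a compact copy of the emitter (without the logging) so that it runs on its own. The listener id 'a' and the event payloads are made up for illustration.

```javascript
// Compact copy of the emitter from event.js so this snippet is self-contained.
class EventEmitter {
    constructor() {
        this.listeners = {};
    }
    fire(event) {
        for (const k in this.listeners) {
            const listener = this.listeners[k];
            this.unregister(k); // each listener is removed before it is called
            listener(event);
        }
    }
    register(id, listener) {
        this.listeners[id] = listener;
    }
    unregister(id) {
        return delete this.listeners[id];
    }
}

const emitter = new EventEmitter();
const received = [];
emitter.register('a', (event) => received.push(event));

emitter.fire({ time: 1 }); // listener 'a' runs once and is removed
emitter.fire({ time: 2 }); // nobody is listening any more, so this event is dropped

console.log(received); // [ { time: 1 } ]
```

The second event disappearing is exactly the behaviour the long polling handler relies on: one request, one event, one response.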

I’m using express.js for the server. It’s easy enough and takes care of most of the stuff. Let’s start with a simplified version of our server and keep working from there.

index.js


const express = require('express');
const EventEmitter = require('./event.js')
const app = express();

// create an instance of our event emitter
const eventEmitter = new EventEmitter();

app.get('/', function (req, res) {
    res.status(200);
    res.end();
})

 var server = app.listen(8080, function () {
    var host = server.address().address
    var port = server.address().port
    console.log("Example app listening at http://%s:%s", host, port);
 })

Ok, so this is pretty standard too, taken from some express.js example or something. What it does is basically return a 200 OK status with an empty body and end the response.

Let’s add code for our event emitter to actually emit events.


// resolve after the given number of milliseconds
function sleep(ms) {
   return new Promise((resolve) => {
      setTimeout(resolve, ms);
   });
}


async function main() {
   while (true) {
      const waitTimeMS = Math.floor(Math.random() * 10000);
      await sleep(waitTimeMS);
      eventEmitter.fire({time: waitTimeMS});
   }
}

main(); // start firing events in the background

The main function loops forever, sleeping a random number of milliseconds (0 to 9999) and then calling fire with an object. Great, every few seconds an event will be fired. Let’s see how we use that.

...

app.get('/', function (req, res) {
   const id = Date.now().toString(); // fine for a single client; two requests in the same millisecond would collide
   const handler = function(event) {
      console.log('event', event);
      res.status(201);
      res.end( JSON.stringify(event));
   };
   eventEmitter.register(id, handler);

...

Here, we register for the event with the handler function. When the loop in the main function calls fire, the handler runs and sends the event to the client along with a 201 status code. But what if there isn’t an event? Let’s add the part that returns an empty response to the client.

...

app.get('/', function (req, res) {
   const id = Date.now().toString(); // fine for a single client; two requests in the same millisecond would collide
   var timer = null;
   const handler = function(event) {
      clearTimeout(timer);
      console.log('event', event);
      res.status(201);
      res.end( JSON.stringify(event));
   };

   eventEmitter.register(id, handler);
   timer = setTimeout(function(){ 
      console.log('timeout');
      const wasUnregistered = eventEmitter.unregister(id);
      console.log("wasUnregistered", wasUnregistered);
      if (wasUnregistered){
         res.status(200);
         res.end();
      }
   }, 5000);
});

...

We added a timer at the start of the function, and we clear it when the event fires. If the event doesn’t fire within 5 seconds, the timeout callback unregisters the handler from the event. The unregister function is meant to return a boolean: false means the handler was already unregistered (that is, the event has already fired). So, we write the empty response only if the event didn’t fire. This guards against a race between the handler function and the timeout function.
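One detail worth knowing about that boolean: JavaScript’s delete operator evaluates to true even when the key is not there, so an unregister that should report whether a listener was actually present needs an explicit ownership check. Here is a minimal sketch. The helper name unregisterChecked and the key 'abc' are hypothetical.

```javascript
const listeners = { abc: () => {} };

// `delete` reports success even when there was nothing to delete.
console.log(delete listeners.abc); // true (the key existed)
console.log(delete listeners.abc); // true (the key was already gone)

// Hypothetical helper: returns true only if a listener was actually removed.
function unregisterChecked(table, id) {
    if (!Object.prototype.hasOwnProperty.call(table, id)) {
        return false;
    }
    delete table[id];
    return true;
}

const table = { abc: () => {} };
const first = unregisterChecked(table, 'abc');
const second = unregisterChecked(table, 'abc');
console.log(first);  // true
console.log(second); // false
```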

What situation remains unhandled?

You might have noticed that the event can fire exactly in the window after the client gets a response and before it makes the next request. The code doesn’t handle that, and there are a few possible solutions we could implement. I can think of a few, can you? One would be to run two overlapping sets of requests, and another would be to hold off and “retry” the event if there are no listeners.

I’m not going to implement any of those, although they sound pretty cool.

I would just love it if you made a pull request on the GitHub repo.
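As a possible starting point, the “hold off and retry” idea might look roughly like the sketch below. This is not code from the repo. The class name BufferingEmitter and its pending queue are assumptions for illustration, and the queue is unbounded and hands each held event to a single listener, which only makes sense for the one-client toy case.

```javascript
// Hypothetical variant of the one-time emitter: events fired while nobody is
// listening are queued and handed to the next listener that registers.
class BufferingEmitter {
    constructor() {
        this.listeners = {};
        this.pending = [];
    }

    fire(event) {
        const ids = Object.keys(this.listeners);
        if (ids.length === 0) {
            this.pending.push(event); // hold the event for later
            return;
        }
        for (const id of ids) {
            const listener = this.listeners[id];
            delete this.listeners[id]; // still a one-time listener
            listener(event);
        }
    }

    register(id, listener) {
        if (this.pending.length > 0) {
            listener(this.pending.shift()); // deliver the oldest held event right away
            return;
        }
        this.listeners[id] = listener;
    }
}

const emitter = new BufferingEmitter();
const seen = [];
emitter.fire({ time: 42 });                  // no listener yet, so the event is queued
emitter.register('r1', (e) => seen.push(e)); // receives the queued event immediately
console.log(seen); // [ { time: 42 } ]
```

Note that register can now call the handler synchronously, so plugging this into the server would also mean creating the timeout before registering, or skipping it when the handler has already run.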

Hope you enjoyed reading this post.
And of course if you did, please share and tell your friends!