Node request queue backed up

TL;DR - Are there any best practices when configuring the globalAgent that allow for high throughput with a high volume of concurrent requests?

Here's our issue:

As far as I can tell, connection pools in Node are managed by the http module, which queues requests in a globalAgent object, which is global to the Node process. The number of requests pulled from the globalAgent queue at any given time is determined by the number of open socket connections, which is determined by the maxSockets property of globalAgent (defaults to 5).
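For reference, a minimal sketch of raising that limit, either on a dedicated agent or on the shared one. The helper name `makeAgent`, the value 50, and `handleResponse` are assumptions for illustration only. The `keepAlive` option requires Node 0.12 or later, where `maxSockets` also defaults to `Infinity` rather than 5.

```javascript
const http = require('http');

// Hypothetical helper: builds a dedicated agent instead of mutating the global one.
function makeAgent(maxSockets) {
  return new http.Agent({
    keepAlive: true,        // keep sockets open for reuse between requests
    maxSockets: maxSockets  // cap on concurrent sockets per host
  });
}

const agent = makeAgent(50); // 50 is an arbitrary example value

// Alternative: raise the limit on the process-wide agent.
// http.globalAgent.maxSockets = 50;

// Pass the agent explicitly when making a request (handleResponse is hypothetical):
// http.get({ host: 'example.com', path: '/', agent: agent }, handleResponse);
```

A dedicated agent keeps the change local to one client, whereas changing `http.globalAgent` affects every library in the process that does not supply its own agent.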

When using "keep-alive" connections, I would expect that as soon as a request is resolved, the connection that handled it would become available to handle the next request in the globalAgent's queue.

Instead, however, it appears that each connection up to the max number is resolved before any additional queued requests are handled.

When watching network traffic between components, we see that if maxSockets is 10, then 10 requests resolve successfully. Then there is a 3-5 second pause (presumably while new TCP connections are established), then 10 more requests resolve, then another pause, etc.

This seems wrong. Node is supposed to excel at handling a high volume of concurrent requests. If, even with 1000 available socket connections, request 1000 cannot be handled until requests 1-999 resolve, you'd hit a bottleneck. Yet I can't figure out what we're doing incorrectly.
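One way to check whether requests really are waiting in the agent's queue is to count the entries the agent tracks per host. This is a rough diagnostic sketch: `agentStats` is a hypothetical helper name, and `freeSockets` is only present on Node versions that support keep-alive agents, so the code tolerates it being missing.

```javascript
const http = require('http');

// Hypothetical helper: sums the per-host arrays an Agent keeps.
function agentStats(agent) {
  const count = (table) =>
    Object.keys(table || {}).reduce((n, key) => n + table[key].length, 0);
  return {
    active: count(agent.sockets),    // sockets currently in use
    idle: count(agent.freeSockets),  // keep-alive sockets waiting for reuse
    queued: count(agent.requests)    // requests waiting for a free socket
  };
}

// Example: log the global agent's state once a second while under load.
// const timer = setInterval(() => console.log(agentStats(http.globalAgent)), 1000);
// timer.unref();
console.log(agentStats(http.globalAgent));
```

If `queued` stays high while `active` sits at maxSockets, the agent is the limiting factor. If `queued` is near zero during the pauses, the delay is more likely on the server or in the client's own callbacks.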

Update

Here's an example of how we're making requests, though it's worth noting that this behavior occurs whenever a Node process makes an HTTP request, including when that request is initiated by widely used third-party libs. I don't believe it is specific to our implementation. Nevertheless...

class Client
  constructor: (@endpoint, @options = {}) ->
    @endpoint = @_cleanEndpoint(@endpoint)
    throw new Error("Endpoint required") unless @endpoint && @endpoint.length > 0

    _.defaults @options,
        maxCacheItems: 1000
        maxTokenCache: 60 * 10
        clientId : null
        bearerToken: null # If present will be added to the request header
        headers: {}
    @cache = {}

    @cards  = new CardMethods @
    @lifeStreams = new LifeStreamMethods @
    @actions = new ActionsMethods @

  _cleanEndpoint: (endpoint) =>
    return null unless endpoint
    endpoint.replace /\/+$/, ""


  _handleResult: (res, bodyBeforeJson, callback) =>
    return callback new Error("Forbidden") if res.statusCode is 401 or res.statusCode is 403

    body = null

    if bodyBeforeJson and bodyBeforeJson.length > 0
      try
        body = JSON.parse(bodyBeforeJson)
      catch e
        return callback(new Error("Invalid Body Content"), bodyBeforeJson, res.statusCode)

    return callback(new Error(if body then body.message else "Request failed.")) unless res.statusCode >= 200 && res.statusCode < 300
    callback null, body, res.statusCode

  _reqWithData: (method, path, params, data, headers = {}, actor, callback) =>
    headers['Content-Type'] = 'application/json' if data
    headers['Accept'] = 'application/json'
    headers['authorization'] = "Bearer #{@options.bearerToken}" if @options.bearerToken
    headers['X-ClientId'] = @options.clientId if @options.clientId

    # Use method override (AWS ELB problems) unless told not to do so
    if (not config.get('clients:useRealHTTPMethods')) and method not in ['POST', 'PUT']
      headers['x-http-method-override'] = method
      method = 'POST'

    _.extend headers, @options.headers

    uri = "#{@endpoint}#{path}"
    #console.log "making #{method} request to #{uri} with headers", headers
    request
      uri: uri
      headers: headers
      body: if data then JSON.stringify data else null
      method: method
      timeout: 30*60*1000
     , (err, res, body) =>
       if err
         err.status = if res && res.statusCode then res.statusCode else 503
         return callback(err)

       @_handleResult res, body, callback

To be honest, CoffeeScript isn't my strong point, so I can't really comment on the code.

However, I can give you some thoughts: in what we are working on, we use nano to connect to Cloudant, and we're seeing up to 200 requests/s into Cloudant from a micro AWS instance. So you are right, Node should be up to it.

Try using request (https://github.com/mikeal/request) if you aren't already. I don't think it will make a difference, but it's worth a try since that is what nano uses.

These are the areas I would look into:

  1. The server doesn't deal well with multiple requests and throttles them. Have you run any performance tests against your server? If it can't handle the load for some reason, or your requests are throttled in the OS, then it doesn't matter what your client does.

  2. Your client code has a long-running function somewhere that prevents Node from processing the responses you get back from the server. Perhaps one specific response causes a response callback to spend far too long.
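To check the second point, one rough approach is to wrap each response callback and report any call that runs synchronously for longer than some threshold. This is only a sketch: `timed`, the 50 ms threshold, and `handleResponse` are hypothetical names and values, not part of any library.

```javascript
// Hypothetical helper: wraps a callback and reports calls that run longer
// than thresholdMs, since a slow synchronous callback blocks the event loop.
function timed(label, thresholdMs, fn, report) {
  return function () {
    const start = process.hrtime();
    try {
      return fn.apply(this, arguments);
    } finally {
      const diff = process.hrtime(start);
      const elapsedMs = diff[0] * 1000 + diff[1] / 1e6;
      if (elapsedMs > thresholdMs) report(label, elapsedMs);
    }
  };
}

// Usage sketch: wrap the response callback passed to request().
// request(options, timed('cards', 50, handleResponse, console.warn));
```

This only measures the synchronous part of the callback, which is the part that can stall other responses. Time spent waiting on further asynchronous work is not counted.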

Are the endpoints all different servers/hosts?
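This matters because the agent keeps a separate pool, and applies maxSockets separately, for each host and port combination. A small sketch of how the agent groups connections, using made-up hostnames as placeholders:

```javascript
const http = require('http');

const agent = new http.Agent({ maxSockets: 10 });

// getName() returns the key the agent uses to group sockets and queued requests.
const a = agent.getName({ host: 'api-a.example.com', port: 80 });
const b = agent.getName({ host: 'api-b.example.com', port: 80 });
const aAgain = agent.getName({ host: 'api-a.example.com', port: 80 });

console.log(a === aAgain); // same host and port share one pool of up to 10 sockets
console.log(a === b);      // different hosts get separate pools
```

So if every request goes to a single host, all of them compete for the same maxSockets connections, while requests spread across several hosts each get their own limit.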