In my Node site I call a RESTful API service I have built, using a standard HTTP GET. After a few hours of this communication working successfully, I find that the request stops being sent; it just waits and eventually times out.
The API that is being called is still receiving requests from elsewhere perfectly well but when a request is sent from the site it does not reach the API.
I have tried with stream.pipe, util.pump and just writing the file to the file system.
I am using Node 0.6.15. My site and the service being called are on the same server, so calls are made to localhost. Memory usage is about 25% overall, with CPU averaging about 10% usage.
After living with the problem for a while I started using the request module, but I get the same behaviour. The number of calls it makes before failing varies, it seems, between 5 and 100. In the end I have to restart the site, but not the API, to make it work again.
Here is roughly what the code in the site looks like:
var Request = require('request');

downloadPDF: function(req, res) {
  Project.findById(req.params.Project_id, function(err, project) {
    project.findDoc(req.params.doc_id, function(err, doc) {
      var pdfileName = doc.name + ".pdf";
      res.contentType(pdfileName);
      res.header('Content-Disposition', "filename=" + pdfileName);
      Request("http://localhost:3001/" + project._id).pipe(res);
    });
  });
}
I am lost as to what could be happening.
Did you try increasing agent.maxSockets, or disabling the http.Agent functionality? By default, recent Node versions use socket pooling for HTTP client connections; this may be the source of the problem: http://nodejs.org/api/http.html#http_class_http_agent
I'm not sure how busy your Node server is, but it could be that all of your sockets are in TIME_WAIT status.
If you run this command, you should see how many sockets are in this state:
netstat -an | awk '/tcp/ {print $6}' | sort | uniq -c
It's normal to have some, of course. You just don't want to max out your system's available sockets and have them all be in TIME_WAIT.
If this is the case, you would actually want to reduce the agent.maxSockets setting (contrary to @user1372624's suggestion), as otherwise each request would receive a new socket even though it could reuse a recent one. Raising the limit would only delay reaching a non-responsive state.
I found this Gist (a patch to http.Agent) that might help you.
This Server Fault answer might also help: http://serverfault.com/a/212127
Finally, it's also possible that updating Node may help, as they may have addressed the keep-alive behavior since your version (you might check the change log).
You're using a callback to return a value, which doesn't make a lot of sense, because your Project.findById() returns immediately, without waiting for the provided callback to complete.
Don't feel bad though, the programming model that nodejs uses is somewhat difficult at first to wrap your head around.
In event-driven programming (EDP), we provide callbacks to accomplish results, ignoring their return values since we never know when the callback might actually be called.
Here's a quick example.
Suppose we want to write the result of an HTTP request into a file.
In a procedural (non-EDP) programming environment, we rely on functions that return only once they have a value to return.
So we might write something like (pseudo-code):
url = 'http://www.example.com'
filepath = './example.txt'
content = getContentFromURL(url)
writeToFile(filepath,content)
print "Done!"
which assumes that our program will wait until getContentFromURL() has contacted the remote server, made its request, waited for a result and returned that result to the program.
The writeToFile() function then asks the operating system to open a local file at some filepath for writing, waiting until it is told the open operation has completed (typically waiting for the disk driver to report that it can carry out such an operation).
writeToFile() then asks the operating system to write the content to the newly opened file, waiting until the driver reports that the write has completed, and finally returns the result so the program can tell us it has finished.
The problem that nodejs was created to solve is to better use all the time wasted by all the waiting that occurs above.
It does this by using functions (callbacks) which are called when operations like retrieving a result from a remote web request or writing a file to the filesystem complete.
To accomplish the same task above in an event-driven programming environment, we need to write the same program as:
getContentFromURL(url, onGetContentFromURLComplete)

function onGetContentFromURLComplete(err, content) {
  writeToFile(filepath, content, onWriteToFileComplete);
}

function onWriteToFileComplete(err) {
  print "Done!";
}
The real magic of nodejs is that it can do all sorts of other things during the surprisingly large amount of time procedural functions have to wait for most time-intensive operations (like those concerned with input and output) to complete.
(The examples above ignore all errors which is usually considered a bad thing.)
I have also experienced intermittent errors using the built-in functions. As a workaround I use the native wget. I do something like the following:
var exec = require('child_process').exec;

function fetchURL(url, callback) {
  var command = 'wget -q -O - ' + url;
  exec(command, function (error, stdout, stderr) {
    callback(error, stdout, stderr);
  });
}
With a few small adaptations you could make it work for your needs. So far it has been rock solid for me.
Did you try logging your parameters while calling this function? The error may depend on req.params.Project_id. You should also provide error handling in your callback functions.
If you can nail the failing requests down to a certain parameter set (make them reproducible), you can debug your application easily with node-inspector.
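A sketch of that guard-and-log pattern; findById here is a stub standing in for Mongoose's Project.findById from the question, and loadProject is a hypothetical wrapper:

```javascript
// Stub standing in for Mongoose's Project.findById from the question.
function findById(id, callback) {
  if (!id) return callback(new Error('missing id'));
  callback(null, { _id: id });
}

// Log the incoming parameter and bail out early on any error, so a
// failing request leaves a reproducible trace instead of hanging.
function loadProject(id, done) {
  console.log('loadProject called with id =', id);
  findById(id, function (err, project) {
    if (err || !project) return done(err || new Error('project not found: ' + id));
    done(null, project);
  });
}
```

The same guard belongs in every callback in the chain: without it, an undefined project or doc throws inside the callback and the response is never sent, which looks exactly like a hang from the client's side.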