understanding http.get request in nodejs

Question

I'm trying to understand how http.get is being called here. This is the sample code I'm experimenting with:

var http = require('http');

var urls = ['http://www.google.com', 
            'http://www.yahoo.com', 
            'http://www.cnn.com']

for (i = 0; i < 3; i++ ){
  url = urls[i]
  console.log(url)
  console.log('------')
  http.get(url, function(resp){
    console.log(url)
  });
};

Code above gives me this output:

http://www.google.com
------
http://www.yahoo.com
------
http://www.cnn.com
------
http://www.cnn.com
http://www.cnn.com
http://www.cnn.com

I understand the first three lines (the ones before each '------'). What I don't understand is why the last three lines all print cnn.com. I thought an http.get request kicks off with each iteration?

Does it mean that console.log(url) inside http.get refers to the outer url, which equals urls[2] once the 'for' loop has finished? How do I print the url that was actually passed into http.get?

node.js

Answer 1

The problem is that your url variable is shared by all of the callback functions, so by the time any of them is called it holds the last value it was assigned in the loop. To clarify, the following is what you would get if you "unwound" everything into its proper time sequence:

url = urls[0]            // google
console.log(url)         // google
http.get...              // creates a function which captures a reference to 'url'
                         // but doesn't actually run anything
url = urls[1]            // yahoo
console.log(url)         // yahoo
http.get...              // creates a second function with exactly the same ref
url = urls[2]            // cnn
console.log(url)         // cnn
http.get...              // creates a *third* function with exactly the same ref

http... console.log(url) // the "google" callback is finally called after the GET
                         // finishes, but `url` now contains "...cnn..."
http... console.log(url) // the "yahoo" callback called, but of course prints cnn
http... console.log(url) // the "cnn" callback is called, and prints the expected
                         // value, but really only by chance
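
The same sequence can be reproduced without any network access. This is only a sketch: fakeGet and pending are hypothetical stand-ins for http.get and its queue of outstanding requests, not part of Node. Here the callbacks are stored and only run after the loop has finished.

```javascript
// Hypothetical stand-in for http.get: it only stores the callback,
// much as a real request defers it until a response arrives.
var pending = [];
function fakeGet(url, callback) {
  pending.push(callback);
}

var urls = ['http://www.google.com',
            'http://www.yahoo.com',
            'http://www.cnn.com'];
var seen = [];
var url; // a single variable, shared by every callback below

for (var i = 0; i < urls.length; i++) {
  url = urls[i];
  fakeGet(url, function () {
    seen.push(url); // reads the shared variable when it finally runs
  });
}

// The loop is done, so `url` already holds the last entry.
pending.forEach(function (callback) { callback(); });
console.log(seen);
```

All three entries in seen end up as the cnn URL, which matches the output in the question.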

To fix the problem, you need to do just a bit more to capture the variable so that each function gets its own copy. There are a few ways to do this, but I'd personally use:

// Returns a new callback that remembers its own copy of the URL in `u`.
var makeResponseFunc = function(u) {
    return function(response) {
        console.log(u);
    };
};

for (var i = 0; i < urls.length; i++) {
    var url = urls[i];
    http.get(url, makeResponseFunc(url));
}

Essentially, makeResponseFunc is a function that creates a function. The parameter u is captured by the newly created function, and because every call to makeResponseFunc creates a new u, each newly created function gets its own copy. (This is a closure; see also Currying for a related idea.)

As I mentioned, there are a bunch of ways to do this, but I find this keeps things as simple as possible from a syntax point of view.
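
For comparison, here is a sketch of two of those other ways, assuming a runtime with ES2015 support for the let version. fakeGet and pending are again hypothetical stand-ins for http.get so the example runs without network access.

```javascript
// Hypothetical stand-in for http.get that defers every callback.
var pending = [];
function fakeGet(url, callback) {
  pending.push(callback);
}

var urls = ['http://www.google.com',
            'http://www.yahoo.com',
            'http://www.cnn.com'];
var viaLet = [];
var viaForEach = [];

// `let` inside the loop body creates a fresh binding on every iteration.
for (let i = 0; i < urls.length; i++) {
  let url = urls[i];
  fakeGet(url, function () { viaLet.push(url); });
}

// forEach calls the function once per element, so `url` is a new
// parameter each time.
urls.forEach(function (url) {
  fakeGet(url, function () { viaForEach.push(url); });
});

// Run the deferred callbacks only after both loops have finished.
pending.forEach(function (callback) { callback(); });
console.log(viaLet);
console.log(viaForEach);
```

In both variants each callback sees the URL from its own iteration, so the two arrays match urls.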

Answer 2

This has nothing to do with Node; it's a JavaScript closure problem. The variable url is updated on each iteration of the loop, so by the time any of the three callback functions is executed, url is set to "http://www.cnn.com".

Try this; it's a somewhat hacky way to get the output you want:

var http = require('http');

var urls = ['http://www.google.com', 
            'http://www.yahoo.com', 
            'http://www.cnn.com']

for (var i = 0; i < urls.length; i++) {
  // The immediately invoked function gives each iteration its own `url`.
  (function() {
    var url = urls[i];
    console.log(url);
    console.log('------');
    http.get(url, function(resp) {
      console.log(url);
    });
  })();
}

Basically, wrapping the loop body in an immediately invoked function gives each iteration its own url variable, so the callbacks no longer share a single one.

Answer 3

Essentially, yes: console.log(url) inside http.get refers to the shared url variable, which holds urls[2] by the time the callbacks run. You can use a closure to capture the current url for later use, like so:

for (var i = 0; i < urls.length; i++) {
    var url = urls[i];
    console.log(url);
    console.log('------');
    (function(url) { 
        http.get(url, function(resp){
            console.log(url);
        });
    })(url);
}
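
A variation on the same idea, sketched with Function.prototype.bind: instead of wrapping the call in a function, the current URL is pre-filled as the callback's first argument. fakeGet, pending and onResponse are hypothetical names, and the 'response N' strings stand in for the response object that http.get would pass.

```javascript
// Hypothetical stand-in for http.get that defers every callback.
var pending = [];
function fakeGet(url, callback) {
  pending.push(callback);
}

var urls = ['http://www.google.com',
            'http://www.yahoo.com',
            'http://www.cnn.com'];
var seen = [];

// `requestedUrl` is pre-filled by bind; `resp` is supplied later by
// whoever invokes the callback.
function onResponse(requestedUrl, resp) {
  seen.push(requestedUrl + ' -> ' + resp);
}

for (var i = 0; i < urls.length; i++) {
  // bind evaluates urls[i] right now and stores it with the new function.
  fakeGet(urls[i], onResponse.bind(null, urls[i]));
}

// Simulate the responses arriving after the loop has finished.
pending.forEach(function (callback, n) { callback('response ' + n); });
console.log(seen);
```

Because bind stores the value at the moment it is called, each callback reports the URL it was created for.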