I'm working on a project named robot, hosted on GitHub. Its job is to fetch media from URLs that are given in an XML config file; the XML file follows the defined format you can see in the scripts dir.
My problem is as follows. There are two arguments:
A simplified example:
node_list = {..., next = {..., next = null}};
url_arr = [urls];
I want to iterate over all the items in the url array, so I do this:
var http = require('http');

function fetch(url, node) {
    if (node == null)
        return;
    // here do something with the http request
    var req = http.get(url, function (res) {
        var data = '';
        res.on('data', function (chunk) {
            data += chunk;
        }).on('end', function () {
            // maybe here generate more new urls
            // get another url_list
            node = node.next;
            fetch(url_new, node);
        });
    });
}
// here these need to run sequentially
for (var i = 0; i < url_arr.length; i++) {
    fetch(url_arr[i], node);
}
As you can see, firing async HTTP requests like this starts them all at once and eats all the system resources, and I cannot control the process. Does anyone have a good idea how to solve this? Or is Node.js simply not the right tool for this kind of job?
If the problem is that you get too many HTTP requests simultaneously, you could change the fetch function to operate on a stack of URLs.
Basically you would do this:
When fetch is called, push the URL onto the stack and check whether a request is already in progress; if not, start one. When a request finishes, pop the next URL off the stack and process it. This way the for-loop can still add all the URLs like it does now, but only one URL is processed at a time, so there won't be too many resources being used.
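A minimal sketch of that idea. The names here (pending, inProgress, doRequest, completed) are mine, not from the question, and doRequest is a stand-in for the real http.get call so the sketch is self-contained; replace it with an actual HTTP request in the real program:

```javascript
var pending = [];       // stack of URLs waiting to be fetched
var inProgress = false; // true while a request is in flight
var completed = [];     // URLs processed so far, in completion order

function fetch(url) {
  pending.push(url);
  if (!inProgress) next(); // kick off processing only if idle
}

function next() {
  var url = pending.pop();
  if (url === undefined) {
    // stack is empty: nothing in flight any more
    inProgress = false;
    return;
  }
  inProgress = true;
  doRequest(url, function (err, body) {
    completed.push(url);
    // new URLs discovered in `body` could be queued here via fetch()
    next(); // start the next queued request only after this one finishes
  });
}

// Stand-in for http.get: invokes the callback asynchronously.
function doRequest(url, callback) {
  setTimeout(function () {
    callback(null, 'body of ' + url);
  }, 0);
}

['http://a.example', 'http://b.example', 'http://c.example'].forEach(fetch);
```

Because the next request only starts from the previous request's completion callback, at most one request is ever in flight, no matter how many URLs the loop adds. If you later want a bit more parallelism, the same pattern extends to a counter of in-flight requests capped at N instead of a single boolean.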