I'm crawling an API, which takes a great many requests. If I send too many of them, the API gets suspicious and I get a bunch of 503s. That's OK: when I get a 503, I set a timer before re-running the request, and the delay doubles with each 503 for the same request.
BUT it doesn't work, because my timer is asynchronous. When I get the 503 and start the timer, Node immediately reuses the socket for a pending request, so my timer basically doesn't change anything.
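A minimal sketch of the per-request retry timer described above, assuming the delay doubles with each 503. The names `backoffDelay` and `retryOn503`, the 1000 ms base and the 60 s cap are illustrative assumptions, not the actual crawler code. It also shows why the timer has no global effect: only the request that received the 503 waits.

```javascript
// Hypothetical helper: delay before the next retry of one request.
// attempt = number of 503s this request has already received (0 for the first).
function backoffDelay(attempt, baseMs, maxMs) {
    var delay = baseMs * Math.pow(2, attempt); // doubles with each 503
    return Math.min(delay, maxMs);             // cap so the wait stays bounded
}

// Hypothetical retry wrapper: `send` is whatever issues the request and
// reports the status code through its callback.
function retryOn503(send, attempt, done) {
    send(function (statusCode) {
        if (statusCode !== 503) return done(statusCode, attempt);
        // Only THIS request waits; other pending requests keep going.
        setTimeout(function () {
            retryOn503(send, attempt + 1, done);
        }, backoffDelay(attempt, 1000, 60000));
    });
}
```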
How can I prevent this?
What I have tried so far:
- `setTimeout` before restarting the request
- the `sync` module and its `pause` (does not work, because the fiber is asynchronous)
Any idea? :<
I've finally come to the conclusion that it is not possible at this time.
In order to prevent massive flooding, I'm using the `queue` object of the `async` module. The code is something like this:
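    var Async = require( 'async' );
    var Http = require( 'http' ); // or 'https', depending on the API

    var queue = Async.queue( function ( task, markAsComplete ) {
        Http.request( {
            agent : false, // we will use our own rate limiter, so we don't need agents
            ...
        }, function ( res ) {
            res.resume( ); // drain the body, otherwise 'end' never fires
            res.on( 'end', function ( ) {
                if ( res.statusCode !== 503 )
                    return markAsComplete( ); // free the slot on a normal response
                queue.concurrency = 1; // our timeout will now stop every request
                setTimeout( function ( ) {
                    // restoring from the constant, not from a saved value: a second
                    // 503 during the pause would otherwise save 1 as the "original"
                    queue.concurrency = numberOfParallelRequests;
                    queue.push( task );
                    markAsComplete( );
                }, 1000 );
            } );
        } ).end( ); // the request is only sent once end( ) is called
    }, numberOfParallelRequests );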
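The fixed 1000 ms pause could also be combined with the growing delay from the question. The following is only a sketch under assumptions: `createBackoffGate` and its methods are hypothetical names (they are not part of the `async` module), and the clock is passed in as `now` so the logic stays easy to check. The gate keeps one shared "do not send before" timestamp that every worker consults.

```javascript
// Hypothetical helper: a shared gate that all queue workers consult.
function createBackoffGate(baseMs, maxMs) {
    var resumeAt = 0;    // timestamp before which nothing should be sent
    var consecutive = 0; // number of 503s seen since the last success

    return {
        // Call on every 503; each call doubles the pause, up to maxMs.
        penalize: function (now) {
            var delay = Math.min(baseMs * Math.pow(2, consecutive), maxMs);
            consecutive += 1;
            resumeAt = Math.max(resumeAt, now + delay);
            return delay;
        },
        // Call on every successful response; resets the doubling.
        succeed: function () {
            consecutive = 0;
        },
        // Milliseconds a worker should still wait before sending.
        waitTime: function (now) {
            return Math.max(0, resumeAt - now);
        }
    };
}
```

Inside the queue worker, one would wrap the request in `setTimeout( send, gate.waitTime( Date.now( ) ) )`, call `gate.penalize( Date.now( ) )` before re-pushing a task that got a 503, and call `gate.succeed( )` on any other response.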