We use Node.js and are pretty happy with it. We monitor the performance of our Node.js processes by measuring how 'busy' the event loop is. Basically, we have a function like this:
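var statsPeriod = 1000; // interval between checks, in ms (assumed to match the timer interval)
var previousTick;
setInterval(function() {
  var now = Date.now(); // declared locally to avoid creating an implicit global
  if (previousTick) {
    // check (now - statsPeriod - previousTick): how late this tick fired, in ms
  }
  previousTick = now;
}, statsPeriod);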
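To make the measurement concrete, here is a minimal, self-contained sketch of the same idea. The names `computeLag` and `startLoopMonitor` are hypothetical and are not what we actually run in production; the sketch only assumes that the lag is the time elapsed since the previous tick minus the expected period.

```javascript
// Hypothetical helper: how late (in ms) a tick fired relative to the expected period.
function computeLag(previousTick, now, periodMs) {
  // Timers can fire a little early; clamp negative values to 0.
  return Math.max(0, now - previousTick - periodMs);
}

// Hypothetical monitor: calls onLag(lagMs) once per period.
function startLoopMonitor(periodMs, onLag) {
  var previousTick = Date.now();
  var timer = setInterval(function () {
    var now = Date.now();
    onLag(computeLag(previousTick, now, periodMs));
    previousTick = now;
  }, periodMs);
  // unref() keeps the monitor from holding the process open on its own.
  if (typeof timer.unref === 'function') {
    timer.unref();
  }
  return timer;
}
```

Calling `startLoopMonitor(1000, function (lag) { console.log(lag); })` would then report a value near 0 on an idle process and a large value whenever the event loop was blocked between two ticks.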
Recently, as the load started to increase on some of our servers, we started to notice that the delay was sometimes huge: up to 500 seconds on some of our processes. This is obviously a problem.
We're really not sure what is going on here, and we're looking for answers.
We tried using the node debugger, but even though we can easily connect to the process when it runs normally using node debug -p <pid>, we cannot connect to it during the "delay".
Any idea what tool or technique we could use? Of course, we cannot reproduce this consistently, even though we see it happening a couple of times a day on our production servers.
I went ahead and installed strace, and on a stuck process, here's what it yields:
clock_gettime(CLOCK_REALTIME, {1407798758, 934775226}) = 0
clock_gettime(CLOCK_REALTIME, {1407798758, 934941698}) = 0
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x20a1038, FUTEX_WAKE_PRIVATE, 1) = 1
Any idea what that might be?
[UPDATE] Digging a bit further, we found that our process is stuck in a loop in Timer.js.