SocketIO scaling architecture and large rooms requirements

We are using socketIO on a large chat application.

At some points we want to dispatch "presence" (user availability) to all other users.

io.in('room1').emit('availability:update', {userid='xxx', isAvailable: false});

room1 may contains a lot of users (500 max). We observe a significant raise in our NodeJS load when many availability updates are triggered.

The idea was to use something similar to redis store with Socket IO. Have web browser clients to connect to different NodeJS servers.

When we want to emit to a room we dispatch the "emit to room1" payload to all other NodeJS processes using Redis PubSub ZeroMQ or even RabbitMQ for persistence. Each process will itself call his own io.in('room1').emit to target his subset of connected users.

One of the concern with this setup is that the inter-process communication may become quite busy and I was wondering if it may become a problem in the future.

Here is the architecture I have in mind.

enter image description here

About available users function, I think there are two alternatives,you can create a "queue Users" where will contents "public data" from connected users or you can use exchanges binding information for show users connected. If you use an "user's queue", this will be the same for each "room" and you could update it when an user go out, "popping" its state message from queue (Although you will have to "reorganize" all queue message for it).

Nevertheless, I think that RabbitMQ is designed for asynchronous communication and it is not very useful approximation have a register for presence or not from users. I think it's better for applications where you don't know when the user will receive the message and its "real availability" ("fire and forget architectures"). ZeroMQ require more work from zero but you could implement something more specific for your situation with a better performance.

An publish/subscribe example from RabbitMQ site could be a good point to begin a new design like yours where a message it's sent to several users at same time. At summary, I will create two queues for user (receive and send queue messages) and I'll use specific exchanges for each "room chat" controlling that users are in each room using exchange binding's information. Always you have two queues for user and you create exchanges to binding it to one or more "chat rooms".

I hope this answer could be useful for you ,sorry for my bad English.

Could you batch changes and only distribute them every 5 seconds or so? In other words, on each node server, simply take a 'snapshot' every X seconds of the current state of all users (e.g. 'connected', 'idle', etc.) and then send that to the other relevant servers in your cluster.

Each server then does the same, every 5 seconds or so it sends the same message - of only the changes in user state - as one batch object array to all connected clients.

Right now, I'm rather surprised you are attempting to send information about each user as a packet. Batching seems like it would solve your problem quite well, as it would also make better use of standard packet sizes that are normally transmitted via routers and switches.

This is the common approach for sharing data across several Socket.io processes. You have done well, so far, with a single process and a single thread. I could lamely assume that you could pick any of the mentioned technologies for communicating shared data without hitting any performance issues.

If all you need is IPC, you could perhaps have a look at Faye. If, however, you need to have some data persisted, you could start a Redis cluster with as many Redis masters as you have CPUs, though this will add minor networking noise for Pub/Sub.

You are looking for this library: https://github.com/automattic/socket.io-redis

Which can be used with this emitter: https://github.com/Automattic/socket.io-emitter