How many system resources will be held for keeping 1,000,000 WebSockets open?
On today's systems, handling 1 million concurrent TCP connections is not an issue.
I can affirm that based on our own tests (full disclosure: I am the CTO at Lightstreamer).
We have had to demonstrate to some of our customers, several times, that 1 million connections can be reached on a single box (and not necessarily a super-monster machine). But let me recap the configuration where we tested 500K concurrent connections, as this is a much more recent test performed on Amazon EC2.
We installed Lightstreamer Server (which is a WebSocket server, among other things) on an m2.4xlarge instance. This means 8 cores and 68.4 GiB of memory.
We launched 11 client machines to create 500,000 concurrent connections to the Lightstreamer Server. The test was configured so that the total outbound throughput from the server was 90,000 updates/s, resulting in peaks of 450 Mbit/s outbound bandwidth.
The server never used more than 13 GiB of RAM and the CPU was stable around 60%.
Extrapolating from these figures, with at least 30 GiB of RAM you can handle 1 million concurrent sockets. The CPU needed depends on the data throughput you need.
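As a rough sanity check of the numbers in this test, here is a small Python sketch of the arithmetic. It assumes memory scales linearly with the number of connections (my assumption; the 13 GiB also includes fixed server overhead, so this is more of an upper bound per connection), and it divides peak bandwidth by the configured update rate, which only gives a ballpark size per update.

```python
GIB = 1024 ** 3

# Figures reported for the EC2 test described above.
connections = 500_000
peak_ram_bytes = 13 * GIB            # "never used more than 13 GiB"
updates_per_s = 90_000               # total outbound updates per second
peak_bandwidth_bits = 450_000_000    # peaks of 450 Mbit/s


def per_connection_ram(ram_bytes, conns):
    """Average RAM per connection, in bytes (includes fixed overhead)."""
    return ram_bytes / conns


def projected_ram_gib(ram_bytes, conns, target_conns):
    """Linear extrapolation of RAM to a different connection count (assumption)."""
    return per_connection_ram(ram_bytes, conns) * target_conns / GIB


ram_per_conn_kib = per_connection_ram(peak_ram_bytes, connections) / 1024
ram_for_1m_gib = projected_ram_gib(peak_ram_bytes, connections, 1_000_000)
bytes_per_update = peak_bandwidth_bits / 8 / updates_per_s
updates_per_conn_per_s = updates_per_s / connections

print(f"RAM per connection: {ram_per_conn_kib:.1f} KiB")
print(f"Projected RAM for 1M connections: {ram_for_1m_gib:.1f} GiB")
print(f"Bytes on the wire per update at peak: {bytes_per_update:.0f}")
print(f"Updates per connection per second: {updates_per_conn_per_s:.2f}")
```

This works out to roughly 27 KiB per connection and about 26 GiB for 1 million connections, which is where the "at least 30 GiB" figure comes from once you leave some headroom.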
Updated Answer
Short answer: yes, but it's expensive.
Long answer:
This question is not unique to WebSockets, since WebSockets are fundamentally long-lived TCP sockets with an HTTP-like handshake and minimal framing for messages.
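To make "HTTP-like handshake and minimal framing" a bit more concrete, here is a minimal Python sketch based on RFC 6455. The function names are my own (hypothetical); the GUID and the header sizes come from the RFC. It computes the `Sec-WebSocket-Accept` value a server sends back during the handshake, and the number of header bytes a frame adds to a payload.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def frame_header_size(payload_len: int, masked: bool) -> int:
    """Header bytes a single WebSocket frame adds to a payload."""
    size = 2                      # FIN/opcode byte + mask bit/7-bit length byte
    if payload_len > 65535:
        size += 8                 # 64-bit extended payload length
    elif payload_len > 125:
        size += 2                 # 16-bit extended payload length
    if masked:
        size += 4                 # masking key (client-to-server frames)
    return size


# Sample key from RFC 6455, section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
print(frame_header_size(100, masked=False))
```

For small server-to-client messages the framing overhead is just 2 bytes, so after the handshake the cost per connection is essentially that of a plain TCP socket.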
The real question is: could a single server handle 1,000,000 simultaneous socket connections and what server resources would this consume? The answer is complicated by several factors, but 1,000,000 simultaneous active socket connections is possible for a properly sized system (lots of CPU, RAM and fast networking) and with a tuned server system and optimized server software.
The number of connections is not the primary problem (that's mostly just a question of kernel tuning and enough memory). The real cost is in processing the data and sending/receiving it to/from each of those connections. If the incoming connections are spread out over a long period, and they are mostly idle or infrequently sending small chunks of static data, then you could probably get much higher than even 1,000,000 simultaneous connections. However, even under those conditions (slow connections that are mostly idle) you will still run into problems with networks, server systems and server libraries that aren't configured and designed to handle large numbers of connections.
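One concrete example of the tuning involved: every TCP connection consumes a file descriptor, so the per-process open-file limit has to be above your target connection count. Here is a minimal Python sketch (Unix-only, since it uses the standard `resource` module) that checks the current limits. The helper name and the reserve of 1024 descriptors for logs, listeners and so on are assumptions of mine, not recommended values.

```python
import resource  # Unix-only standard library module


def fd_limit_ok(target_connections, limit, reserve=1024):
    """Return True if `limit` open files covers the target plus a reserve.

    `reserve` is an illustrative allowance for non-socket descriptors.
    """
    if limit == resource.RLIM_INFINITY:
        return True
    return limit >= target_connections + reserve


# Current soft and hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
target = 1_000_000

print(f"soft limit: {soft}, hard limit: {hard}")
print(f"soft limit sufficient for {target} sockets: {fd_limit_ok(target, soft)}")
print(f"hard limit sufficient for {target} sockets: {fd_limit_ok(target, hard)}")
```

On many default installations the soft limit is far below 1,000,000, so this check fails until the limits are raised. The file descriptor limit is only one of several settings you would need to look at.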
See Alessandro Alinone's answer about approximate resource usage for 500,000 connections.
Here are some older but still applicable resources to read on how you would configure your server and write your server software to support large numbers of connections:
- What is the theoretical maximum number of open TCP connections that a modern Linux box can have
- http://www.kegel.com/c10k.html