WebSockets, UDP, and benchmarks
On a LAN, you can get Round-trip times for messages over WebSocket of 200 microsec (from browser JS to WebSocket server and back), which is similar to raw ICMP pings. On MAN, it's around 10ms, WAN (over residential ADSL to server in same country) around 30ms, and so on up to around 120-200ms via 3.5G. The point is: WebSocket does add virtually no latency to the one you will get anyway, based on the network.
The wire level overhead of WebSocket (compared to raw TCP) is between 2 octets (unmasked payload of length < 126 octets) and 14 octets (masked payload of length > 64k) per message (the former numbers assume the message is not fragmented into multiple WebSocket frames). Very low.
For a more detailed analysis of WebSocket wire-level overhead, please see this blog post - this includes analysis covering layers beyond WebSocket also.
More so: with a WebSocket implementation capable of streaming processing, you can (after the initial WebSocket handshake), start a single WebSocket message and frame in each direction and then send up to 2^63 octets with no overhead at all. Essentially this renders WebSocket a fancy prelude for raw TCP. Caveat: intermediaries may fragment the traffic at their own decision. However, if you run WSS (that is secure WS = TLS), no intermediaries can interfere, and there you are: raw TCP, with a HTTP compatible prelude (WS handshake).
WebRTC uses RTP (= UDP based) for media transport but needs a signaling channel in addition (which can be WebSocket i.e.). RTP is optimized for loss-tolerant real-time media transport. "Real-time games" often means transferring not media, but things like player positions. WebSocket will work for that.
Note: WebRTC transport can be over RTP or secured when over SRTP. See "RTP profiles" here.
To make a long story short, if you want to use TCP for multiplayer games, you need to use what we call adaptive streaming techniques. In other words, you need to make sure that the amount of real-time data sent to synchronize the game world among the clients is governed by the currently available bandwidth and latency for each client.
Dynamic throttling, conflation, delta delivery, and other mechanisms are adaptive streaming techniques, which don't magically make TCP as efficient as UDP, but make it usable enough for several types of games.
I tried to explain these techniques in an article: Optimizing Multiplayer 3D Game Synchronization Over the Web (http://blog.lightstreamer.com/2013/10/optimizing-multiplayer-3d-game.html).
I also gave a talk on this topic last month at HTML5 Developer Conference in San Francisco. The video has just been made available on YouTube: http://www.youtube.com/watch?v=cSEx3mhsoHg
I would recommend developing your game using WebSockets on a local wired network and then moving to the WebRTC Data Channel API once it is available. As @oberstet correctly notes, WebSocket average latencies are basically equivalent to raw TCP or UDP, especially on a local network, so it should be fine for you development phase. The WebRTC Data Channel API is designed to be very similar to WebSockets (once the connection is established) so it should be fairly simple to integrate once it is widely available.
Your question implies that UDP is probably what you want for a low latency game and there is truth to that. You may be aware of this already since you are writing a game, but for those that aren't, here is a quick primer on TCP vs UDP for real-time games:
TCP is an in-order, reliable transport mechanism and UDP is best-effort. TCP will deliver all the data that is sent and in the order that it was sent. UDP packets are sent as they arrive, may be out of order, and may have gaps (on a congested network, UDP packets are dropped before TCP packets). TCP sounds like a big improvement, and it is for most types of network traffic, but those features come at a cost: a delayed or dropped packet causes all the following packets to be delayed as well (to guarantee in-order delivery).
Real-time games generally can't tolerate the type of delays that can result from TCP sockets so they use UDP for most of the game traffic and have mechanisms to deal with dropped and out-of-order data (e.g. adding sequence numbers to the payload data). It's not such a big deal if you miss one position update of the enemy player because a couple of milliseconds later you will receive another position update (and probably won't even notice). But if you don't get position updates for 500ms and then suddenly get them all out once, that results in terrible game play.
All that said, on a local wired network, packets are almost never delayed or dropped and so TCP is perfectly fine as an initial development target. Once the WebRTC Data Channel API is available then you might consider moving to that. The current proposal has configurable reliability based on retries or timers.
Here are some references:
- WebRTC Introduction
- WebRTC FAQ
- WebRTC Data Channel Proposal