Checking client hello for https classification
In SSL/TLS, messages are sent as part of records. The client is expected to first send a ClientHello message, which is itself contained in one or several records.
Record format is:
record type: 1 byte (0x16 for "record contains some handshake message data")
protocol version: 2 bytes (0x03 0x00 for SSL 3.0, 0x03 0x01 for TLS 1.0, and so on)
record length: 2 bytes (big endian)
then the record data...
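As an illustration of the record layout above, here is a minimal Python sketch that decodes the 5-byte record header. The function name parse_record_header is made up for this example and is not part of any library.

```python
import struct

def parse_record_header(data: bytes):
    """Decode the 5-byte record header.

    Returns (record_type, (major, minor), length), or None when fewer
    than 5 bytes are available.
    """
    if len(data) < 5:
        return None
    # "!" = network byte order (big-endian): three single bytes, one 16-bit length
    record_type, major, minor, length = struct.unpack("!BBBH", data[:5])
    return record_type, (major, minor), length

# Handshake record (0x16), version bytes 0x03 0x01 (TLS 1.0), 0x002F bytes of data
print(parse_record_header(bytes([0x16, 0x03, 0x01, 0x00, 0x2F])))  # (22, (3, 1), 47)
```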
For the first record (from client to server), the client will first send a ClientHello message, which is a type of handshake message and hence encapsulated in a record as shown above (the first byte of the record will be 0x16). Theoretically, the client may send the ClientHello split into several records, and it may begin with one or several empty records, but this is not very probable. The ClientHello message itself begins with its own four-byte header, with one byte for the message type (0x01 for ClientHello), then the message length over three bytes (there again, big-endian).
Once the client has sent its ClientHello, it expects a response from the server, so the ClientHello will be alone in its record.
So you could expect a payload which begins with the following 9 bytes:
0x16 0x03 X Y Z 0x01 A B C
with:
X will be 0, 1, 2, 3... or more, depending on the protocol version used by the client for this first message. Currently, defined SSL/TLS versions are SSL 3.0, TLS 1.0, TLS 1.1 and TLS 1.2. Other versions may be defined in the future. They will probably use the 3.X numbering scheme, so you can expect the second header byte to remain a 0x03, but you should not arbitrarily limit the third byte.
Y Z is the encoding of the record length; A B C is the encoding of the ClientHello message length. Since the ClientHello message begins with a 4-byte header (not included in its length) and is supposed to be alone in its record, you should have: A = 0 and 256*Y+Z = 256*B+C+4.
If you see 9 such bytes which satisfy these conditions, then chances are that this is a ClientHello from an SSL client.
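The 9-byte test could be sketched in Python roughly as follows. The function name is hypothetical. The check deliberately puts no limit on the third byte, and it requires the record length to equal the handshake message length plus the 4-byte handshake header.

```python
def looks_like_client_hello(payload: bytes) -> bool:
    """Heuristic check of the first 9 bytes: 0x16 0x03 X Y Z 0x01 A B C."""
    if len(payload) < 9:
        return False
    record_type = payload[0]
    major = payload[1]
    # payload[2] is X, the minor version: intentionally not restricted
    record_len = (payload[3] << 8) | payload[4]   # Y Z, big-endian
    msg_type = payload[5]
    a = payload[6]                                # top byte of the 3-byte length
    msg_len = (payload[7] << 8) | payload[8]      # B C, big-endian
    return (record_type == 0x16
            and major == 0x03
            and msg_type == 0x01
            and a == 0
            and record_len == msg_len + 4)

# Record length 0x002F = 47, message length 0x00002B = 43, and 43 + 4 = 47
sample = bytes([0x16, 0x03, 0x01, 0x00, 0x2F, 0x01, 0x00, 0x00, 0x2B])
print(looks_like_client_hello(sample))  # True
```

This remains a heuristic: it only says that the bytes are consistent with a ClientHello that sits alone in its record.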
Some not-so-recent SSL clients may also support an older protocol version, called SSL 2.0. These clients will emit a ClientHello which follows the SSL 2.0 rules, where messages and records are somewhat merged. That SSL 2.0 ClientHello message will state that the client also knows SSL 3.0 or more recent, but it won't begin with the 9-byte sequence explained above.
The SSL 2.0 ClientHello structure is explained in appendix E.2 of RFC 5246. Although such clients are becoming rare (there is an RFC about prohibiting SSL 2.0 support altogether), there are still many deployed out there.
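For the SSL 2.0 format, a rough Python sketch could look like this. It relies on the layout given in appendix E.2 of RFC 5246: a 2-byte length whose highest bit must be 1, then a message type byte equal to 1, then the 2-byte version the client offers. The function name is hypothetical, and the test for a 3.x version reflects the case described above (an SSL 2.0 hello announcing SSL 3.0 or later); treat it as an assumption to adapt.

```python
def looks_like_sslv2_client_hello(payload: bytes) -> bool:
    """Heuristic check for an SSL 2.0-format ClientHello offering SSL 3.0+."""
    if len(payload) < 5:
        return False
    if not payload[0] & 0x80:
        return False                      # highest bit of the length must be set
    msg_length = ((payload[0] & 0x7F) << 8) | payload[1]
    msg_type = payload[2]
    major = payload[3]                    # payload[4] is the minor version, unrestricted
    # The data covered by msg_length holds at least: type (1), version (2)
    # and three 2-byte length fields, hence 9 bytes minimum.
    return msg_type == 0x01 and major == 0x03 and msg_length >= 9

print(looks_like_sslv2_client_hello(bytes([0x80, 0x2E, 0x01, 0x03, 0x01])))  # True
print(looks_like_sslv2_client_hello(bytes([0x16, 0x03, 0x01, 0x00, 0x2F])))  # False
```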
Your code has a few problems:
- It does not detect an SSL 2.0 ClientHello message.
- It checks that the third header byte (X in my description above) is equal to 0, 1 or 2, which rules out TLS 1.2. This is too restrictive.
- It assumes that the whole ClientHello will be in a single record (which is a reasonable assumption) and that this ClientHello will be encoded in a single packet (which is a much less reasonable assumption).
- It does not try to look at the length in the handshake message and corroborate it with the record length.
Correspondingly, evading detection will be easy (by using an SSL 2.0 ClientHello, by using a record tagged with the TLS 1.2 version, by making a big ClientHello message which does not fit in a single packet... the methods are numerous), and some existing deployed clients will not be detected: not only can one avoid detection on purpose, but it can also happen unintentionally.
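To avoid depending on packet boundaries, one option is to accumulate the client-to-server bytes per connection and decide only once enough of them have arrived. The sketch below is one possible shape for that, not a drop-in fix for your code: HelloDetector and feed are hypothetical names, and how you obtain the in-order byte stream of a connection is left to your capture layer.

```python
class HelloDetector:
    """Accumulates the first bytes of one client-to-server stream."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def feed(self, segment: bytes):
        """Add one segment; return True/False once decided, None if more data is needed."""
        self.buf += segment
        if len(self.buf) < 9:
            return None                   # header not complete yet
        b = self.buf
        record_len = (b[3] << 8) | b[4]
        msg_len = (b[7] << 8) | b[8]
        # Same conditions as in the description: any minor version is accepted.
        return (b[0] == 0x16 and b[1] == 0x03 and b[5] == 0x01
                and b[6] == 0 and record_len == msg_len + 4)

det = HelloDetector()
print(det.feed(bytes([0x16, 0x03, 0x03])))                    # None
print(det.feed(bytes([0x02, 0x00, 0x01, 0x00, 0x01, 0xFC])))  # True
```

Only the first 9 bytes are needed for the verdict, so a ClientHello too large for one packet is still recognized from its first segment, and a header split across segments is handled by the buffer.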