Java Solaris NIO OP_CONNECT problem
Your bug report has been closed as 'not a bug', with an explanation. You are ignoring the result of connect()
, which if true means that OP_CONNECT
will never fire, because the channel is already connected. You only need the whole OP_CONNECT/finishConnect()
megillah if it returns false. So you shouldn't even register OP_CONNECT
unless connect()
returns false, let alone register it before you've even called connect().
Further remarks:
Under the hood, OP_CONNECT
and OP_WRITE
are the same thing, which explains part of it.
As you have a single thread for this, the workaround would be to do the connect in blocking mode, then switch to non-blocking for the I/O.
Are you doing the select() after registering the channel with the Selector
?
The correct way of handling non-blocking connect is as follows:
channel.configureBlocking(false);
if (!channel.connect(...))
{
channel.register(sel, SelectionKey.OP_CONNECT, ...); // ... is the attachment, or absent
}
// else channel is connected, maybe register for OP_READ ...
// select() loop runs ...
// Process the ready keys ...
if (key.isConnectable())
{
if (channel.finishConnect())
{
key.interestOps(0); // or SelectionKey.OP_READ or OP_WRITE, whatever is appropriate
}
}
A few non-exhaustive comments after reviewing your extended code:
Closing a channel cancels the key. You don't need to do both.
The non-static
removeInterest()
method is incorrectly implemented.TYPE_DEREGISTER_OBJECT
also closes the channel. Not sure if that is what you really intended. I would have thought it should just cancel the key, and there should be a separate operation for closing the channel.You have gone way overboard on the small methods and exception handling.
addInterest()
andremoveInterest()
are good examples. They catch exceptions, log them, then proceed as though the exception hadn't happened, when all they actually do is set or clear a bit: one line of code. And on top of that many of them have both static and non-static versions. Same goes for all the little methods that callkey.cancel()
,channel.close()
, etc. There is no point to all this, it is just clocking up lines of code. It only adds obscurity and makes your code harder to understand. Just do the operation required inline and have a single catcher at the bottom of the select loop.If
finishConnect()
returns false it isn't a connection failure, it just hasn't completed yet. If it throws anException
, that is a connection failure.You are registering for
OP_CONNECT
andOP_READ
at the same time. This doesn't make sense, and it may cause problems. There is nothing to read untilOP_CONNECT
has fired. Just register forOP_CONNECT
at first.You are allocating a
ByteBuffer
per read. This is very wasteful. Use the same one for the life of the connection.You are ignoring the result of
read()
. It can be zero. It can be -1, indicating EOS, on which you must close the channel. You are also assuming you will get an entire application message in a single read. You can't assume that. That's another reason why you should use a singleByteBuffer
for the life of the connection.You are ignoring the result of
write()
. It could be less thanbuffer.remaining(
) was when you called it. It can be zero.You could simplify this a lot by making the
NetSelectable
the key attachment. Then you could do away with several things, including for example the channel map, and the assert, because the channel of the key must always be equal to the channel of the attachment of the key.I would also definitely move the
finishConnect()
code to theNetSelector
, and haveconnectEvent()
just be a success/failure notification. You don't want to spread this kind of stuff around. Do the same withreadEvent()
, i.e. do the read itself inNetSelector
, with a buffer supplied by theNetSelectable
, and just notify theNetSelectable
of the read result: count or -1 or exception. Ditto on write: if the channel is writable, get something to write from theNetSelectable
, write it in the NetSelector, and notify the result. You could have the notification callbacks return something to indicate what to do next, e.g. close the channel.
But really this is all five times as complex as it needs to be, and the fact that you have this bug proves it. Simplify your head.