Java Solaris NIO OP_CONNECT problem

Your bug report has been closed as 'not a bug', with an explanation. You are ignoring the result of connect(), which if true means that OP_CONNECT will never fire, because the channel is already connected. You only need the whole OP_CONNECT/finishConnect() megillah if it returns false. So you shouldn't even register OP_CONNECT unless connect() returns false, let alone register it before you've even called connect().

Further remarks:

Under the hood, OP_CONNECT and OP_WRITE are the same thing, which explains part of it.

As you have a single thread for this, the workaround would be to do the connect in blocking mode, then switch to non-blocking for the I/O.

Are you doing the select() after registering the channel with the Selector?

The correct way of handling non-blocking connect is as follows:

channel.configureBlocking(false);
if (!channel.connect(...))
{
    channel.register(sel, SelectionKey.OP_CONNECT, ...); // ... is the attachment, or absent
}
// else channel is connected, maybe register for OP_READ ...
// select() loop runs ...
// Process the ready keys ...
if (key.isConnectable())
{
  if (channel.finishConnect())
  {
     key.interestOps(0); // or SelectionKey.OP_READ or OP_WRITE, whatever is appropriate
  }
}

A few non-exhaustive comments after reviewing your extended code:

  1. Closing a channel cancels the key. You don't need to do both.

  2. The non-static removeInterest() method is incorrectly implemented.

  3. TYPE_DEREGISTER_OBJECT also closes the channel. Not sure if that is what you really intended. I would have thought it should just cancel the key, and there should be a separate operation for closing the channel.

  4. You have gone way overboard on the small methods and exception handling. addInterest() and removeInterest() are good examples. They catch exceptions, log them, then proceed as though the exception hadn't happened, when all they actually do is set or clear a bit: one line of code. And on top of that many of them have both static and non-static versions. Same goes for all the little methods that call key.cancel(), channel.close(), etc. There is no point to all this, it is just clocking up lines of code. It only adds obscurity and makes your code harder to understand. Just do the operation required inline and have a single catcher at the bottom of the select loop.

  5. If finishConnect() returns false it isn't a connection failure, it just hasn't completed yet. If it throws an Exception, that is a connection failure.

  6. You are registering for OP_CONNECT and OP_READ at the same time. This doesn't make sense, and it may cause problems. There is nothing to read until OP_CONNECT has fired. Just register for OP_CONNECT at first.

  7. You are allocating a ByteBuffer per read. This is very wasteful. Use the same one for the life of the connection.

  8. You are ignoring the result of read(). It can be zero. It can be -1, indicating EOS, on which you must close the channel. You are also assuming you will get an entire application message in a single read. You can't assume that. That's another reason why you should use a single ByteBuffer for the life of the connection.

  9. You are ignoring the result of write(). It could be less than buffer.remaining() was when you called it. It can be zero.

  10. You could simplify this a lot by making the NetSelectable the key attachment. Then you could do away with several things, including for example the channel map, and the assert, because the channel of the key must always be equal to the channel of the attachment of the key.

  11. I would also definitely move the finishConnect() code to the NetSelector, and have connectEvent() just be a success/failure notification. You don't want to spread this kind of stuff around. Do the same with readEvent(), i.e. do the read itself in NetSelector, with a buffer supplied by the NetSelectable, and just notify the NetSelectable of the read result: count or -1 or exception. Ditto on write: if the channel is writable, get something to write from the NetSelectable, write it in the NetSelector, and notify the result. You could have the notification callbacks return something to indicate what to do next, e.g. close the channel.

But really this is all five times as complex as it needs to be, and the fact that you have this bug proves it. Simplify your head.