Thrift TSimpleServer becomes unresponsive after several successful requests

Your stack trace suggests you are using TSimpleServer, whose Javadoc says:

Simple singlethreaded server for testing.

You probably want to use TThreadPoolServer instead.

Most likely, the single thread of TSimpleServer is blocked waiting for the dead client to respond or time out. Because TSimpleServer is single-threaded, no other thread is available to process other requests.
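To make the blocking behaviour concrete, here is a minimal sketch using plain java.net sockets rather than Thrift itself. The class BlockingDemo and its methods are hypothetical names made up for illustration. It only assumes that TSimpleServer behaves like the inline accept-and-handle loop and TThreadPoolServer like the pooled one. A silent client connects first, then a well-behaved client sends a line and waits up to one second for a reply.

```java
import java.io.*;
import java.net.*;
import java.util.concurrent.*;

class BlockingDemo {
    // Reads one line from the client and echoes it back; blocks until the line arrives.
    static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line = in.readLine();
            if (line != null) out.println("echo:" + line);
        } catch (IOException ignored) {
            // Client went away; nothing to do in this sketch.
        }
    }

    // Returns true if a well-behaved client gets its reply within one second
    // while a silent client that connected earlier is still holding a connection.
    static boolean secondClientServed(boolean usePool) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                try {
                    while (true) {
                        Socket client = server.accept();
                        if (usePool) pool.execute(() -> handle(client)); // one worker per connection
                        else handle(client);                             // accept thread does the work itself
                    }
                } catch (IOException closed) {
                    // Server socket was closed; stop accepting.
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();

            try (Socket stalled = new Socket(server.getInetAddress(), server.getLocalPort());
                 Socket good = new Socket(server.getInetAddress(), server.getLocalPort())) {
                good.setSoTimeout(1000); // give up waiting for a reply after one second
                PrintWriter out = new PrintWriter(good.getOutputStream(), true);
                out.println("ping");
                BufferedReader in = new BufferedReader(new InputStreamReader(good.getInputStream()));
                try {
                    return "echo:ping".equals(in.readLine());
                } catch (SocketTimeoutException e) {
                    return false; // the server never got around to this client
                }
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("single thread served second client: " + secondClientServed(false));
        System.out.println("thread pool served second client:   " + secondClientServed(true));
    }
}
```

With the inline loop, the accept thread is stuck in readLine() on the silent connection, so the second client should time out. With the pool, each connection gets its own worker and the second client should be answered.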


I have some suggestions. You mentioned that the first few calls to the server work and then it hangs. That's a clue. One scenario where this happens is when the client does not send all of its bytes to the server. I am not familiar with TSimpleServer, but I assume it listens on a port, speaks a binary protocol, and expects every client to use that protocol. Your .NET client talks to the server by sending bytes. If it does not flush its output buffer correctly, part of the request may never reach the server, which then hangs waiting for the rest.

In Java, this could happen on the client side like this:

BufferedOutputStream stream = new BufferedOutputStream(socket.getOutputStream()); // wrap the socket's output stream in a buffer
stream.write(content); // write everything that needs to be sent (may only reach the buffer)
stream.flush(); // without flush(), the server may receive an incomplete message and hang
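The effect of a missing flush() can be seen without a network at all. In this small sketch a ByteArrayOutputStream stands in for the socket, and the class FlushDemo and method bytesDelivered are hypothetical names used only for illustration. It counts how many bytes have actually reached the underlying stream before and after flush().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class FlushDemo {
    // Returns {bytes that reached the sink before flush(), bytes that reached it after flush()}.
    static int[] bytesDelivered(byte[] content) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();     // stands in for the socket
        BufferedOutputStream stream = new BufferedOutputStream(sink); // default 8192-byte buffer
        stream.write(content);      // small writes stay in the buffer
        int before = sink.size();
        stream.flush();             // pushes the buffered bytes to the sink
        int after = sink.size();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] r = bytesDelivered("hello thrift".getBytes(StandardCharsets.US_ASCII));
        System.out.println("before flush: " + r[0] + ", after flush: " + r[1]);
        // prints "before flush: 0, after flush: 12"
    }
}
```

Note that this only holds for messages smaller than the buffer. A write at least as large as the buffer goes straight through, which may be why some calls succeed and others hang.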

Suggestions:

a) Go through your .NET client code and check that every part that communicates with the server properly calls the equivalent of flush() and the cleanup methods. Note: the Thrift documentation says its transport layer defines a flush(). Scan your .NET code to see whether it uses the transport methods. http://thrift.apache.org/docs/concepts/

b) For further debugging, try writing a small Java client that simulates your .NET client. Run it on your Linux machine (the same machine where TSimpleServer runs) and see whether it causes the same issue. If it does, debug the Java client to find the root cause. If it doesn't, run it on the machine where your .NET client runs, see whether any issues appear, and take it from there.

Edit: c) I found sample Thrift client code in Java here: https://chamibuddhika.wordpress.com/2011/10/02/apache-thrift-quickstart-tutorial/ I noticed that it calls transport.open(), does its work, and then calls transport.close(). As suggested in a), go through your .NET client code and check that you call the transport methods flush() and close() on completion.