What are active handles in Node.js
Puppeteer has a method to call when you're finished with a handle, so that the garbage collector can do its job.
You're supposed to use elementHandle.dispose(), like this:
const bodyHandle = await frame.$('body');
const html = await frame.evaluate(body => body.innerHTML, bodyHandle);
// Once you're done with your handle, just get rid of it
await bodyHandle.dispose();
Check out the docs:
- https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#elementhandledispose
- https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#jshandledispose
Active handles
A handle is a reference to an open resource, like an opened file, a database connection or a request. To understand why handles might still be active although they should have been closed, here is a simple example:
const http = require('http');
http.createServer((req, res) => {
  if (req.url === '/secret-url') {
    return; // nobody should have access to this part of our page!
  }
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World!');
}).listen(3000);
What does this code do? It runs a server on port 3000 and returns a Hello World message for any request, except for those going to the "secret URL". But there is a problem in this code. When we run into the "secret" if clause, we never close the connection. That means the client can keep the connection open for as long as it wants. We should have closed the connection instead. Because of this mistake, the number of active handles will increase with every such request, resulting in a memory leak.
Normally, memory leaks are much harder to detect, as active handles might be passed from one function to another, making it hard to track which part of the code is responsible for closing the connection.
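One way to fix the example above is to end the response on the "secret" path as well. This is only a minimal sketch: the handler name handleRequest and the 403 status code are my own choices, not part of the original example, and the listen call is left commented out so the snippet exits on its own.

```javascript
const http = require('http');

// Request handler that always ends the response, even on the blocked path.
// The name handleRequest and the 403 status are assumptions for illustration.
function handleRequest(req, res) {
  if (req.url === '/secret-url') {
    res.statusCode = 403;
    res.end(); // close the connection instead of leaving it hanging
    return;
  }
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World!');
}

const server = http.createServer(handleRequest);
// server.listen(3000); // uncomment to serve on port 3000 as in the example above
```

Every code path now calls res.end(), so no request is left holding a connection open.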
What does a rising number of active handles mean?
If you are seeing a constant increase in open handles, you very likely have a memory leak somewhere in your code. As in the example, you may have forgotten to close a resource.
Memory leaks are particularly bad if you are developing a script that is supposed to run for a long time, like a web server.
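To see whether the number is actually rising, you can count the active resources from inside the process. The sketch below assumes a Node.js version that ships process.getActiveResourcesInfo() (added in 16.14 / 17.3 and marked experimental); the helper name countActiveResources is made up for this example.

```javascript
// Counts the resources that currently keep the event loop alive, grouped by type.
// Assumes process.getActiveResourcesInfo() is available (Node.js 16.14+ / 17.3+).
function countActiveResources() {
  const counts = {};
  for (const type of process.getActiveResourcesInfo()) {
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}
```

Logging the result of countActiveResources() every few seconds and comparing the numbers over time should show whether a particular resource type keeps growing.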
How to check for memory leaks?
There are various techniques to check for memory leaks. The easiest way is obviously keeping an eye on the memory. pm2 even has an option built in to restart the process in case the memory reaches a certain limit. For more information on this topic, check out this guide.
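If you want to keep an eye on the memory from within the process itself, process.memoryUsage() is enough for a rough check. The helper names and the idea of comparing against a fixed limit are assumptions for illustration, not something pm2 or Node.js prescribes.

```javascript
// Returns the resident set size of the current process in megabytes.
function rssMegabytes() {
  return process.memoryUsage().rss / (1024 * 1024);
}

// Hypothetical helper: true once the process uses more than limitMb megabytes.
function exceedsLimit(limitMb) {
  return rssMegabytes() > limitMb;
}
```

You could call exceedsLimit() from a setInterval callback and log a warning when it returns true. A value that keeps climbing under a steady load is a hint that something is leaking.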
What does this have to do with Puppeteer?
Two things. First, requests are very cheap. Even if you have a memory leak in your Node.js server application, you will only start seeing it in memory after a few thousand requests. In contrast, Puppeteer is very expensive. Opening a Chromium browser will cost you between 50 and 100 megabytes of memory. So you should make sure that every browser you start is also closed. Second, as the other answer already mentioned, there are objects (like elementHandle) that you need to dispose of manually to free their resources.
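A simple way to make sure every browser gets closed is to wrap the work in try/finally. The helper below is only a sketch: withBrowser is a name I made up, and it takes the launch function as a parameter (for example () => puppeteer.launch()) so that it does not depend on a specific Puppeteer version.

```javascript
// Runs `task` with a freshly launched browser and always closes it afterwards.
// `launch` is any function returning a promise for an object with a close()
// method, e.g. () => require('puppeteer').launch(). withBrowser is hypothetical.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close(); // runs even if task throws
  }
}
```

The same pattern works for handles: get the handle, use it inside try, and call dispose() in finally.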