garbage collection with node.js
Simple answer: if value of the str
is not referenced from anywhere else (and str
itself is not referenced from restofprogram
) it will become unreachable as soon as the function (str) { ... }
returns.
Details: V8 compiler distinguishes real local variables from so called context variables captured by a closure, shadowed by a with-statement or an eval
invocation.
Local variables live on the stack and disappear as soon as function execution completes.
Context variables live in a heap allocated context structure. They disappear when the context structure dies. Important thing to note here is that context variables from the same scope live in the same structure. Let me illustrate it with an example code:
function outer () {
var x; // real local variable
var y; // context variable, referenced by inner1
var z; // context variable, referenced by inner2
function inner1 () {
// references context
use(y);
}
function inner2 () {
// references context
use(z);
}
function inner3 () { /* I am empty but I still capture context implicitly */ }
return [inner1, inner2, inner3];
}
In this example variable x
will disappear as soon as outer
returns but variables y
and z
will disappear only when both inner1
, inner2
and inner3
die. This happens because y
and z
are allocated in the same context structure and all three closures implicitly reference this context structure (even inner3
which does not use it explicitly).
Situation gets even more complicated when you start using with-statement, try/catch-statement which on V8 contains an implicit with-statement inside catch clause or global eval
.
function complication () {
var x; // context variable
function inner () { /* I am empty but I still capture context implicitly */ }
try { } catch (e) { /* contains implicit with-statement */ }
return inner;
}
In this example x
will disappear only when inner
dies. Because:
- try/catch-contains implicit with-statement in catch clause
- V8 assumes that any with-statement shadows all the locals
This forces x
to become a context variable and inner
captures the context so x
exists until inner
dies.
In general if you want to be sure that given variable does not retain some object for longer than really needed you can easily destroy this link by assigning null
to that variable.
My guess is that str will NOT be garbage collected because it can be used by restofprogram(). Yes, and str should get GCed if restofprogram was declared outside, except, if you do something like this:
function restofprogram(val) { ... }
readfile("blah", function(str) {
var val = getvaluefromstr(str);
restofprogram(val, str);
});
Or if getvaluefromstr is declared as something like this:
function getvaluefromstr(str) {
return {
orig: str,
some_funky_stuff: 23
};
}
Follow-up-question: Does v8 do just plain'ol GC or does it do a combination of GC and ref. counting (like python?)
Actually your example is somewhat tricky. Was it on purpose? You seem to be masking the outer val
variable with an inner lexically scoped restofprogram()'s val
argument, instead of actually using it. But anyway, you're asking about str
so let me ignore the trickiness of val
in your example just for the sake of simplicity.
My guess would be that the str
variable won't get collected before the restofprogram() function finishes, even if it doesn't use it. If the restofprogram() doesn't use str
and it doesn't use eval()
and new Function()
then it could be safely collected but I doubt it would. This would be a tricky optimization for V8 probably not worth the trouble. If there was no eval
and new Function()
in the language then it would be much easier.
Now, it doesn't have to mean that it would never get collected because any event handler in a single-threaded event loop should finish almost instantly. Otherwise your whole process would be blocked and you'd have bigger problems than one useless variable in memory.
Now I wonder if you didn't mean something else than what you actually wrote in your example. The whole program in Node is just like in the browser – it just registers event callbacks that are fired asynchronously later after the main program body has already finished. Also none of the handlers are blocking so no function is actually taking any noticeable time to finish. I'm not sure if I understood what you actually meant in your question but I hope that what I've written will be helpful to understand how it all works.
Update:
After reading more info in the comments on how your program looks like I can say more.
If your program is something like:
readfile("blah", function (str) {
var val = getvaluefromstr(str);
// do something with val
Server.start(function (request) {
// do something
});
});
Then you can also write it like this:
readfile("blah", function (str) {
var val = getvaluefromstr(str);
// do something with val
Server.start(serverCallback);
});
function serverCallback(request) {
// do something
});
It will make the str
go out of scope after Server.start() is called and will eventually get collected. Also, it will make your indentation more manageable which is not to be underestimated for more complex programs.
As for the val
you might make it a global variable in this case which would greatly simplify your code. Of course you don't have to, you can wrestle with closures, but in this case making val
global or making it live in an outer scope common for both the readfile callback and for the serverCallback function seems like the most straightforward solution.
Remember that everywhere when you can use an anonymous function you can also use a named function, and with those you can choose in which scope do you want them to live.