The performance cost of using ref instead of returning the same types?
The main purpose of the ref keyword is to signify that the variable's value can be changed by the function it's being passed into. When you pass a variable by value, updates from within the function don't affect the original copy.
It's extremely useful (and faster) in situations where you want multiple return values and building a special struct or class for them would be overkill. For example:
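To make the difference concrete, here is a minimal sketch. The class and method names (RefDemo, SetByValue, SetByRef) are made up for illustration and are not from any library.

```csharp
using System;

static class RefDemo
{
    // Receives a copy of the caller's value; the assignment stays local.
    public static void SetByValue(int number)
    {
        number = 42;
    }

    // Receives an alias to the caller's variable; the assignment is visible to the caller.
    public static void SetByRef(ref int number)
    {
        number = 42;
    }

    static void Main()
    {
        int a = 1;
        SetByValue(a);
        Console.WriteLine(a); // prints 1

        int b = 1;
        SetByRef(ref b);
        Console.WriteLine(b); // prints 42
    }
}
```

Note that the caller has to write ref at the call site too, so the possible mutation is at least visible where it happens.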
// Method on a Quaternion type; "something" stands in for the actual calculation.
public void GetRollPitchYaw(ref double roll, ref double pitch, ref double yaw) {
    roll = something;
    pitch = something;
    yaw = something;
}
This is a pretty fundamental pattern in languages that have unrestricted use of pointers. In C/C++ you frequently see primitives being passed around by value, with classes and arrays passed as pointers. C# does just the opposite, so 'ref' is handy in situations like the above.
When you pass a variable you want updated into a function by ref, only one write operation is necessary to give you your result. When returning a value, however, you normally write to some variable inside the function, return it, then write it again to the destination variable. Depending on the data, this could add unnecessary overhead. Anyhow, these are the main things that I typically consider before using the ref keyword.
Sometimes ref is a little faster when used like this in C#, but not by enough to make it a go-to justification for performance.
Here's what I got on a 7-year-old machine using the code below, passing and updating a 100k string by ref and by value.
Output:
iterations: 10000000
byref: 165ms
byval: 417ms
private void m_btnTest_Click(object sender, EventArgs e) {
    Stopwatch sw = new Stopwatch();
    string s = "";
    string value = new string ('x', 100000); // 100k string
    int iterations = 10000000;
    //-----------------------------------------------------
    // Update by ref
    //-----------------------------------------------------
    sw.Start();
    for (var n = 0; n < iterations; n++) {
        SetStringValue(ref s, ref value);
    }
    sw.Stop();
    long proc1 = sw.ElapsedMilliseconds;
    sw.Reset();
    //-----------------------------------------------------
    // Update by value
    //-----------------------------------------------------
    sw.Start();
    for (var n = 0; n < iterations; n++) {
        s = SetStringValue(s, value);
    }
    sw.Stop();
    long proc2 = sw.ElapsedMilliseconds;
    //-----------------------------------------------------
    Console.WriteLine("iterations: {0} \nbyref: {1}ms \nbyval: {2}ms", iterations, proc1, proc2);
}
public string SetStringValue(string input, string value) {
    input = value;
    return input;
}
public void SetStringValue(ref string input, ref string value) {
    input = value;
}
I have to agree with Ondrej here. From a stylistic view, if you start passing everything with ref, you will eventually end up working with devs who will want to strangle you for designing an API like this!
Just return stuff from the method; don't have 100% of your methods returning void. What you are doing will lead to very unclean code and might confuse other devs who end up working on your code. Favour clarity over performance here, since you won't gain much in optimization anyway.
Check this SO post: C# 'ref' keyword, performance
and this article from Jon Skeet: http://www.yoda.arachsys.com/csharp/parameters.html
The main time that "ref" is used in the same sentence as performance is when discussing some very atypical cases, for example in XNA scenarios where the game "objects" are quite commonly represented by structs rather than classes to avoid problems with GC (which has a disproportionate impact on XNA). There, ref becomes useful to:
- prevent copying an oversized struct multiple times on the stack
- prevent data loss due to mutating a struct copy (XNA structs are commonly mutable, against normal practice)
- allow passing a struct in an array directly, rather than ever copying it out and back in
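The struct points above can be sketched roughly as follows. Particle, StepByValue and StepByRef are hypothetical names chosen for illustration; they are not XNA types.

```csharp
using System;

// Hypothetical mutable struct, in the style of an XNA game object.
struct Particle
{
    public double X;
    public double Velocity;
}

static class StructRefDemo
{
    // Works on a copy of the struct: the caller's data is left unchanged.
    public static void StepByValue(Particle p, double dt)
    {
        p.X += p.Velocity * dt;
    }

    // Works on the caller's storage: no copy is made and the update is kept.
    public static void StepByRef(ref Particle p, double dt)
    {
        p.X += p.Velocity * dt;
    }

    static void Main()
    {
        var particles = new Particle[1];
        particles[0].Velocity = 2.0;

        StepByValue(particles[0], 1.0);
        Console.WriteLine(particles[0].X); // prints 0 (the update was made to a copy and lost)

        // The array element itself is passed, without copying it out and back in.
        StepByRef(ref particles[0], 1.0);
        Console.WriteLine(particles[0].X); // prints 2
    }
}
```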
In all other cases, "ref" is more commonly associated with an additional side-effect, not easily expressed in the return value (for example see Monitor.TryEnter).
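As an illustration of that side-effect style, here is a small sketch around the Monitor.TryEnter(object, ref bool) overload. LockDemo, Gate and TryDoWork are made-up names; only the Monitor calls come from the framework.

```csharp
using System;
using System.Threading;

static class LockDemo
{
    private static readonly object Gate = new object();

    // Returns true if the (placeholder) work ran while holding the lock.
    public static bool TryDoWork()
    {
        bool lockTaken = false;
        try
        {
            // lockTaken is set through the ref parameter, so it is reliable
            // in the finally block even if an exception interrupts the call.
            Monitor.TryEnter(Gate, ref lockTaken);
            if (!lockTaken)
            {
                return false;
            }
            // ... work that needs the lock would go here ...
            return true;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(Gate);
            }
        }
    }

    static void Main()
    {
        Console.WriteLine(TryDoWork()); // prints True when the lock is uncontended
    }
}
```

Here the ref parameter carries information (whether the lock was taken) that has to survive an exception, which a plain return value could not do.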
If you don't have a scenario like the XNA/struct one, and there is no awkward side effect, then just use the return value. In addition to being more typical (which in itself has value), it could well involve passing less data (an int is smaller than a ref on x64, for example), and could require less dereferencing.
Finally, the return approach is more versatile; you don't always want to update the source. Contrast:
// want to accumulate, no ref
x = Add(x, 5);
// want to accumulate, ref
Add(ref x, 5);
// no accumulate, no ref
y = Add(x, 5);
// no accumulate, ref
y = x;
Add(ref y, 5);
I think the last is the least clear (with the other "ref" one close behind it) and ref usage is even less clear in languages where it is not explicit (VB for example).
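For anyone who wants to run the contrast, here is one possible pair of Add overloads. The bodies are an assumption, since the snippets above only show call sites.

```csharp
using System;

static class AddDemo
{
    // Return-value version: the caller decides where the result goes.
    public static int Add(int value, int amount)
    {
        return value + amount;
    }

    // ref version: always overwrites the variable that was passed in.
    public static void Add(ref int value, int amount)
    {
        value += amount;
    }

    static void Main()
    {
        int x = 1;

        int y = Add(x, 5);      // no accumulate, no ref: x is untouched
        Console.WriteLine(x);   // prints 1
        Console.WriteLine(y);   // prints 6

        x = Add(x, 5);          // accumulate, no ref
        Console.WriteLine(x);   // prints 6

        Add(ref x, 5);          // accumulate, ref
        Console.WriteLine(x);   // prints 11
    }
}
```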