Value vs. Reference - 2020
The issue of value vs. reference is closely related to the rules for evaluating expressions in C++, especially the arguments of functions. It's about when the arguments to a function are evaluated, when they are substituted into the function, and what form that substitution takes.
When we want a function to change the value of a variable, we have three choices:
    int change_by_value(int i) { return ++i; }     // compute a new value and return it
    void change_via_pointer(int *pi) { ++*pi; }    // pass a pointer, dereference it and increment it
    void change_via_reference(int &ri) { ++ri; }   // pass a reference

The first one, returning the value, may be the style we prefer for small objects. But it is not feasible when we write a function that modifies a huge data structure, because copying it into the function and copying the result back out is too expensive. Then how do we choose between a reference argument and a pointer argument? Unfortunately, each has its pros and cons, so there is no clear-cut answer.
Let's look at each of them a little more closely.
Call-by-value (pass-by-value) is the most common evaluation strategy. In call-by-value, the argument expression is evaluated, and the resulting value is copied into the corresponding parameter of the function. If the function assigns to its parameters, only its local copy is changed. This means that anything passed into a function call is unchanged in the caller's scope when the function returns.
Here, we should be clear about the term call-by-value. It is sometimes confusing because some people use call-by-value for cases where the value being passed is itself a reference, as in Java. In this tutorial, value means the value of a variable in the usual sense of the word.
Call-by-reference (pass-by-reference) is the other evaluation strategy. With this strategy, a function receives a reference to the argument rather than a copy of its value.
This typically means that the function can modify the argument. Because arguments do not need to be copied, call-by-reference has the advantage of greater time- and space-efficiency, as well as the potential for greater communication between a function and its caller, since the function can return information through its reference arguments. However, it has the disadvantage that a caller must often take special steps to protect the values it passes to the function.
C++ uses call-by-value by default, but offers special syntax for call-by-reference parameters. C++ additionally offers call-by-reference-to-const. C supports explicit references in the form of pointers, and these can be used to mimic call-by-reference. But in this case, the function's caller must explicitly generate the reference (take the address) to supply as an argument.
Summary for pass by reference:
- There is a single copy of the value of interest: the single master copy.
- The caller passes a pointer to that value to any function that wants to see or change the value.
- Functions can dereference their pointer to see or change the value.
- Functions must remember that they do not have their own local copies. If they dereference their pointer and change the value, they really are changing the master value. If a function wants a local copy to change safely, the function must explicitly allocate and initialize such a local copy.
Here is an example showing the difference between call-by-value and call-by-reference:

    #include <iostream>
    using namespace std;

    template<class T>
    void swapVal(T obj1, T obj2)
    {
        T temp = obj1;
        obj1 = obj2;
        obj2 = temp;
    }

    template<class T>
    void swapRef(T& obj1, T& obj2)
    {
        T temp = obj1;
        obj1 = obj2;
        obj2 = temp;
    }

    int main()
    {
        int a = 100, b = 200;

        cout << "Value:" << endl;
        cout << "1: a = " << a << " b = " << b << endl;
        swapVal(a,b);
        cout << "2: a = " << a << " b = " << b << endl;

        a = 300; b = 400;
        cout << endl;

        cout << "Reference:" << endl;
        cout << "1: a = " << a << " b = " << b << endl;
        swapRef(a,b);
        cout << "2: a = " << a << " b = " << b << endl;

        return 0;
    }
The output is:
    Value:
    1: a = 100 b = 200
    2: a = 100 b = 200

    Reference:
    1: a = 300 b = 400
    2: a = 400 b = 300

As we see from the result, swapVal() swaps only the local copies inside the function, not the values in the caller's scope.
Here is an example that swaps objects of a class.
    #include <iostream>
    #include <cstring>
    using namespace std;

    template<class T>
    void swapRef(T& obj1, T& obj2)
    {
        T temp = obj1;
        obj1 = obj2;
        obj2 = temp;
    }

    template<class T>
    void swapVal(T obj1, T obj2)
    {
        T temp = obj1;
        obj1 = obj2;
        obj2 = temp;
    }

    class Bogo
    {
    public:
        // default constructor
        Bogo(int i = 0, const char *s = "")
        {
            myInt = i;
            size_t len = strlen(s);
            myString = new char[len + 1];
            strcpy(myString, s);
        }

        // copy constructor: makes a deep copy of the string
        Bogo(const Bogo& b)
        {
            size_t len = strlen(b.myString);
            myString = new char[len + 1];
            strcpy(myString, b.myString);
            myInt = b.myInt;
        }

        // copy assignment operator: returns a reference, guards against
        // self-assignment, and releases the old buffer to avoid a leak
        Bogo& operator=(const Bogo& b)
        {
            if (this != &b) {
                char *copy = new char[strlen(b.myString) + 1];
                strcpy(copy, b.myString);
                delete [] myString;
                myString = copy;
                myInt = b.myInt;
            }
            return *this;
        }

        // destructor: frees the memory allocated by new
        ~Bogo() { delete [] myString; }

        friend ostream& operator<<(ostream& os, const Bogo& b);

    private:
        int myInt;
        char* myString;
    };

    ostream& operator<<(ostream& os, const Bogo& b)
    {
        os << b.myInt << "," << b.myString;
        return os;
    }

    int main()
    {
        Bogo bogoA = Bogo(100,"AAA");
        Bogo bogoB = Bogo(200,"BBB");

        cout << "Value:" << endl;
        cout << "1: " << bogoA << " " << bogoB << endl;
        swapVal<Bogo>(bogoA,bogoB);
        cout << "2: " << bogoA << " " << bogoB << endl;
        cout << endl;

        bogoA = Bogo(100,"AAA");
        bogoB = Bogo(200,"BBB");

        cout << "Reference:" << endl;
        cout << "1: " << bogoA << " " << bogoB << endl;
        swapRef<Bogo>(bogoA,bogoB);
        cout << "2: " << bogoA << " " << bogoB << endl;

        return 0;
    }

Note that we defined a copy constructor and a copy assignment operator to deal with the memory allocated by new. We also overloaded operator<<() to print the object.
Here is the output:
    Value:
    1: 100,AAA 200,BBB
    2: 100,AAA 200,BBB

    Reference:
    1: 100,AAA 200,BBB
    2: 200,BBB 100,AAA

The output follows the same pattern as in the case of swapping variables: only swapRef() actually swapped the objects.
A pointer and a reference are both typically implemented using a memory address. They just use it differently to provide the programmer with slightly different features.
Pointers and references can both be used to refer to a variable and to pass it to a function by reference rather than by value. For large objects, passing by reference is more efficient than passing by value because no copy is made, so the use of pointers and references is encouraged. In general, a reference is preferred because it is easier to use and easier to understand than a pointer. But references obey certain rules that can make the use of a pointer necessary.
Rule | Reference | Pointer |
---|---|---|
Can be declared without initialization | No | Yes |
Can be reassigned | No | Yes |
Can be null (contain a 0 value) | No | Yes |
Easy to use | Yes | No |
- If we don't want to initialize in the declaration, use a pointer.
- If we want to be able to reassign it to another variable, use a pointer, because there is no way to make a reference refer to a different object after initialization. So, if we need to point to something different later, use a pointer.
- Otherwise, always use a reference.
See also References vs pointers
I intentionally used the title of Item 20 from Scott Meyers's "Effective C++" for this section.
Here is the summary given at the end of that item:
- Prefer pass-by-reference-to-const over pass-by-value. It's typically more efficient and it avoids the slicing problem.
- The rule doesn't apply to built-in types, STL iterators, and function object types. For them, pass-by-value is usually appropriate.
When we step through the above code in a debugger, we'll see that the pass-by-value case ends up making more object copies than the pass-by-reference case.