Value and Reference Types 2020
Unlike arrays, strings, or enumerations, C# structures do not have an identically named representation in the .NET library.
In other words, there is no System.Structure class. Instead, structures implicitly derive from System.ValueType. Functionally, System.ValueType marks the derived type as a value type, so a local variable of that type is typically allocated on the stack rather than on the garbage-collected heap (a structure that is a field of a class instance, or that has been boxed, is stored on the heap).
The only purpose of System.ValueType is to override the virtual methods defined by System.Object so that they use value-based, rather than reference-based, semantics. The base class of ValueType is System.Object, and the instance methods exposed by System.ValueType are identical to those of System.Object:
    // Declaration outline only (method bodies omitted)
    public abstract class ValueType : object
    {
        public override bool Equals(object obj);
        public override int GetHashCode();
        public Type GetType();                 // inherited from System.Object
        public override string ToString();
    }
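To make the value-based versus reference-based distinction concrete, here is a minimal sketch comparing the default Equals() behavior of a structure and a class. The names PointStruct, PointClass, and EqualsDemo are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical types for illustration only.
struct PointStruct
{
    public int X;
    public int Y;
}

class PointClass
{
    public int X;
    public int Y;
}

static class EqualsDemo
{
    // ValueType.Equals compares the field values of the two structures.
    public static bool StructsEqual()
    {
        PointStruct a = new PointStruct { X = 1, Y = 2 };
        PointStruct b = new PointStruct { X = 1, Y = 2 };
        return a.Equals(b);
    }

    // Object.Equals compares references unless the class overrides it.
    public static bool ClassesEqual()
    {
        PointClass a = new PointClass { X = 1, Y = 2 };
        PointClass b = new PointClass { X = 1, Y = 2 };
        return a.Equals(b);
    }

    static void Main()
    {
        Console.WriteLine(StructsEqual());  // True
        Console.WriteLine(ClassesEqual());  // False
    }
}
```

Two structures with the same field values compare as equal, while two distinct class instances with the same field values do not, because the class inherits the reference-based comparison from System.Object.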
Because value types use value-based semantics, the lifetime of a structure (which includes all numerical data types as well as any enum or custom structure) is very predictable. When a local structure variable falls out of its defining scope, it is removed from memory immediately:
    static void LocalValueTypes()
    {
        int i = 0;
        Point p = new Point();
    } // "i" and "p" are popped off the stack here because they are out of scope
When we assign one value type to another, a member-by-member copy of the field data is performed. In the case of a simple data type such as System.Int32, the only member to copy is the numerical value. In the case of our Point, however, the X and Y values are copied into the new structure variable. Let's look at the following example.
    // Program.cs
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace MyFirstCSharpCode
    {
        class Program
        {
            struct Point
            {
                public int X;
                public int Y;

                public Point(int XPos, int YPos)
                {
                    X = XPos;
                    Y = YPos;
                }

                public void Display()
                {
                    Console.WriteLine("X = {0}, Y = {1}", X, Y);
                }
            }

            static void ValueTypeAssign()
            {
                Point p1 = new Point(10, 20);
                Point p2 = p1;          // member-by-member copy
                p1.Display();
                p2.Display();

                p1.X = 111;             // changes p1 only
                Console.WriteLine("\n p1.X changed \n");
                p1.Display();
                p2.Display();
            }

            static void Main(string[] args)
            {
                ValueTypeAssign();
                Console.ReadLine();
            }
        }
    }
The output is:
    X = 10, Y = 20
    X = 10, Y = 20

     p1.X changed

    X = 111, Y = 20
    X = 10, Y = 20
We created a variable of type Point named p1 and then assigned it to another Point variable (p2). Because Point is a value type, we have two copies of the Point type on the stack, each of which can be manipulated independently. So, when we change the value of p1.X, the value of p2.X is unaffected.
In stark contrast to value types, when we apply the assignment operator to reference types (meaning all class instances), we are redirecting what the reference variable points to in memory. Let's look at the following example.
    // Program.cs
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace MyFirstCSharpCode
    {
        class Program
        {
            class PointRef
            {
                public int X;
                public int Y;

                public PointRef(int XPos, int YPos)
                {
                    X = XPos;
                    Y = YPos;
                }

                public void Display()
                {
                    Console.WriteLine("X = {0}, Y = {1}", X, Y);
                }
            }

            static void RefTypeAssign()
            {
                PointRef p1 = new PointRef(10, 20);
                PointRef p2 = p1;       // copies the reference, not the object
                p1.Display();
                p2.Display();

                p1.X = 111;             // visible through p2 as well
                Console.WriteLine("\n p1.X changed \n");
                p1.Display();
                p2.Display();
            }

            static void Main(string[] args)
            {
                RefTypeAssign();
                Console.ReadLine();
            }
        }
    }
Output from the run is:
    X = 10, Y = 20
    X = 10, Y = 20

     p1.X changed

    X = 111, Y = 20
    X = 111, Y = 20
We used our PointRef type within the new RefTypeAssign() method. In this example, we have two references pointing to the same object on the managed heap. So, when we change the value of X using the p1 reference, p2.X reports the same value.
Assume we have the following reference (class) type that maintains an informational string that can be set using a custom constructor:
    class ShapeInfo
    {
        public string infoString;

        public ShapeInfo(string info)
        {
            infoString = info;
        }
    }
We want to contain a variable of this class type (ShapeInfo) within a value type named Rectangle. To allow the caller to set the value of the inner ShapeInfo member variable, we also provide a custom constructor, Rectangle():
    struct Rectangle
    {
        public ShapeInfo rectinfo;
        public int rectTop, rectLeft, rectBottom, rectRight;

        public Rectangle(string info, int top, int left, int bottom, int right)
        {
            rectinfo = new ShapeInfo(info);
            rectTop = top;
            rectBottom = bottom;
            rectLeft = left;
            rectRight = right;
        }

        public void Display()
        {
            Console.WriteLine("String = {0}, Top = {1}, Bottom = {2}," +
                "Left = {3}, Right = {4}",
                rectinfo.infoString, rectTop, rectBottom, rectLeft, rectRight);
        }
    }
At this point, we have contained a reference type (ShapeInfo) within a value type (Rectangle). What happens if we assign one Rectangle variable to another? Given what we already know about value types, we would be correct in assuming that the integer data (each integer is itself a structure) is an independent entity for each Rectangle variable. But what about the internal reference type? Will the object's state be fully copied, or will only the reference to that object be copied? To answer this question, let's define the following method and invoke it from Main().
    // Program.cs
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace MyFirstCSharpCode
    {
        class Program
        {
            class ShapeInfo
            {
                public string infoString;

                public ShapeInfo(string info)
                {
                    infoString = info;
                }
            }

            struct Rectangle
            {
                public ShapeInfo rectinfo;
                public int rectTop, rectLeft, rectBottom, rectRight;

                public Rectangle(string info, int top, int left, int bottom, int right)
                {
                    rectinfo = new ShapeInfo(info);
                    rectTop = top;
                    rectBottom = bottom;
                    rectLeft = left;
                    rectRight = right;
                }

                public void Display()
                {
                    Console.WriteLine("String = {0}, Top = {1}, Bottom = {2}," +
                        "Left = {3}, Right = {4}",
                        rectinfo.infoString, rectTop, rectBottom, rectLeft, rectRight);
                }
            }

            static void ValueTypeContainingRefType()
            {
                Console.WriteLine("Creating r1 ...");
                Rectangle r1 = new Rectangle("First Rect", 100, 100, 500, 500);

                Console.WriteLine("Assigning r1 to r2 ...");
                Rectangle r2 = r1;

                Console.WriteLine("Changing values of r2");
                r2.rectinfo.infoString = "New Info";   // shared ShapeInfo object
                r2.rectBottom = 499;                   // independent integer field

                r1.Display();
                r2.Display();
            }

            static void Main(string[] args)
            {
                ValueTypeContainingRefType();
                Console.ReadLine();
            }
        }
    }
Output is:
    Creating r1 ...
    Assigning r1 to r2 ...
    Changing values of r2
    String = New Info, Top = 100, Bottom = 500,Left = 100, Right = 500
    String = New Info, Top = 100, Bottom = 499,Left = 100, Right = 500
As we can see, when we change the informational string through r2, r1 displays the same value. By default, when a value type contains reference types, assignment copies the references. We therefore have two independent structures, each containing a reference to the same object in memory (a shallow copy). To perform a deep copy, where the state of the internal references is fully copied into a new object, one approach is to implement the ICloneable interface.
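The deep-copy idea can be sketched as follows. This is one possible approach rather than a definitive implementation: it is a variant of the Rectangle structure with an ICloneable implementation added as an assumption for illustration, and DeepCopyDemo is a hypothetical name. It also assumes rectinfo is never null, which holds only for rectangles built through the custom constructor.

```csharp
using System;

class ShapeInfo
{
    public string infoString;
    public ShapeInfo(string info) { infoString = info; }
}

// Variant of Rectangle; the ICloneable implementation is an illustrative assumption.
struct Rectangle : ICloneable
{
    public ShapeInfo rectinfo;
    public int rectTop, rectLeft, rectBottom, rectRight;

    public Rectangle(string info, int top, int left, int bottom, int right)
    {
        rectinfo = new ShapeInfo(info);
        rectTop = top;
        rectBottom = bottom;
        rectLeft = left;
        rectRight = right;
    }

    // Deep copy: the constructor builds a new ShapeInfo instead of sharing ours.
    // Assumes rectinfo is not null.
    public object Clone()
    {
        return new Rectangle(rectinfo.infoString, rectTop, rectLeft, rectBottom, rectRight);
    }
}

static class DeepCopyDemo
{
    static void Main()
    {
        Rectangle r1 = new Rectangle("First Rect", 100, 100, 500, 500);
        Rectangle r2 = (Rectangle)r1.Clone();   // unbox the cloned structure
        r2.rectinfo.infoString = "New Info";

        Console.WriteLine(r1.rectinfo.infoString);  // First Rect
        Console.WriteLine(r2.rectinfo.infoString);  // New Info
    }
}
```

Because Clone() returns object, the cloned structure is boxed and must be cast back to Rectangle. A strongly typed copy method would avoid the boxing, at the cost of not using the standard interface.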
Reference types or value types can obviously be passed as parameters to type members. However, passing a reference type (a class) by reference is quite different from passing it by value. To understand the distinction, let's take a look at the following Person class:
    class Person
    {
        public string Name;
        public int Age;

        public Person(string s, int a)
        {
            Name = s;
            Age = a;
        }

        public Person() { }

        public void Display()
        {
            Console.WriteLine("Name: {0}, Age: {1}", Name, Age);
        }
    }
What's going to happen if we create a method that allows the caller to send in the Person type by value?
    static void PersonByValue(Person p)
    {
        p.Age = 32;                     // changes the caller's object
        p = new Person("Mozart", 28);   // redirects only the local copy of the reference
    }
Note that the PersonByValue() method attempts to reassign the incoming Person reference (p) to a new object as well as change some state data (Age). Now, let's combine all the routines we wrote and test them:
    // Program.cs
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace MyFirstCSharpCode
    {
        class Program
        {
            class Person
            {
                public string Name;
                public int Age;

                public Person(string s, int a)
                {
                    Name = s;
                    Age = a;
                }

                public Person() { }

                public void Display()
                {
                    Console.WriteLine("Name: {0}, Age: {1}", Name, Age);
                }
            }

            static void PersonByValue(Person p)
            {
                p.Age = 32;
                p = new Person("Mozart", 28);
            }

            static void Main(string[] args)
            {
                Person b = new Person("Beethoven", 53);
                Console.WriteLine("1: Person is: ");
                b.Display();

                PersonByValue(b);
                Console.WriteLine("2: Person is: ");
                b.Display();

                Console.ReadLine();
            }
        }
    }
The output we get:
    1: Person is:
    Name: Beethoven, Age: 53
    2: Person is:
    Name: Beethoven, Age: 32
The value of Age has been changed. Given that we could change the state of the incoming Person, what was copied? A copy of the reference to the caller's object. Because the PersonByValue() method points to the same object as the caller, it can alter the object's state data. What it cannot do is reassign what the caller's reference points to: the assignment to p inside the method only redirects the local copy.
Let's write a new method that passes a reference type by reference using the ref modifier.
    static void PersonByReference(ref Person p)
    {
        p.Age = 32;                     // changes the caller's object
        p = new Person("Mozart", 28);   // redirects the caller's reference
    }
Let's combine the routines and check to see what's the difference compared to the previous example:
    // Program.cs
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace MyFirstCSharpCode
    {
        class Program
        {
            class Person
            {
                public string Name;
                public int Age;

                public Person(string s, int a)
                {
                    Name = s;
                    Age = a;
                }

                public Person() { }

                public void Display()
                {
                    Console.WriteLine("Name: {0}, Age: {1}", Name, Age);
                }
            }

            static void PersonByReference(ref Person p)
            {
                p.Age = 32;
                p = new Person("Mozart", 28);
            }

            static void Main(string[] args)
            {
                Person b = new Person("Beethoven", 53);
                Console.WriteLine("1: Person is: ");
                b.Display();

                PersonByReference(ref b);
                Console.WriteLine("2: Person is: ");
                b.Display();

                Console.ReadLine();
            }
        }
    }
Output we get this time:
    1: Person is:
    Name: Beethoven, Age: 53
    2: Person is:
    Name: Mozart, Age: 28
As we can see, the variable b refers to a different object (Mozart) after the call, because the method was able to change what the incoming reference pointed to in memory. The golden rule to keep in mind when passing reference types:
- If a reference type is passed by reference, the callee may change the values of the object's state data as well as the object the caller's variable is referencing.
- If a reference type is passed by value, the callee may change the values of the object's state data but not the object the caller's variable is referencing.
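For comparison, the same two passing modes can be applied to a value type. The sketch below uses a hypothetical Counter structure and hypothetical method names; it is meant only to show that a structure passed by value is copied in full, while the ref modifier lets the callee modify the caller's variable.

```csharp
using System;

// Hypothetical value type for illustration only.
struct Counter
{
    public int Value;
}

static class ParamDemo
{
    // Receives a full copy of the structure; the caller's variable is untouched.
    public static void IncrementByValue(Counter c)
    {
        c.Value++;
    }

    // Receives an alias to the caller's variable; the change is visible to the caller.
    public static void IncrementByRef(ref Counter c)
    {
        c.Value++;
    }

    static void Main()
    {
        Counter c = new Counter { Value = 1 };

        IncrementByValue(c);
        Console.WriteLine(c.Value);  // 1

        IncrementByRef(ref c);
        Console.WriteLine(c.Value);  // 2
    }
}
```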
Here is the summary of the core distinctions between value types and reference types.
Item | Value Type | Reference Type |
---|---|---|
Where | Typically allocated on the stack (or inline within the containing object) | Allocated on the managed heap |
Representation | Value type variables are local copies. | Reference type variables point to the memory occupied by the allocated instance. |
Base Type | Must derive from System.ValueType | Can derive from any other type (except System.ValueType), as long as that type is not sealed. |
Inheritance | Value types are always sealed and cannot be extended. | If the type is not sealed, it may function as a base to other types. |
Default Parameter Passing Behavior | Variables are passed by value (i.e., a copy of the variable is passed into the called function). | The reference is passed by value (i.e., a copy of the reference is passed, so the called function works on the same object but cannot redirect the caller's variable). |
Can this type override System.Object.Finalize()? | No. A structure cannot declare a finalizer. | Yes, indirectly. |
When do variables of this type die? | When they fall out of the defining scope. | When the object is garbage collected. |
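Several rows of the table can be checked at run time through reflection. The following is a minimal sketch that redeclares simplified Point and PointRef types; TypeFactsDemo is a hypothetical name used only here.

```csharp
using System;

// Simplified redeclarations of the types used earlier, for illustration only.
struct Point
{
    public int X;
    public int Y;
}

class PointRef
{
    public int X;
    public int Y;
}

static class TypeFactsDemo
{
    static void Main()
    {
        // A structure derives from System.ValueType and is implicitly sealed.
        Console.WriteLine(typeof(Point).BaseType);        // System.ValueType
        Console.WriteLine(typeof(Point).IsSealed);        // True
        Console.WriteLine(typeof(Point).IsValueType);     // True

        // A class with no explicit base derives directly from System.Object.
        Console.WriteLine(typeof(PointRef).BaseType);     // System.Object
        Console.WriteLine(typeof(PointRef).IsValueType);  // False
    }
}
```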