Advanced Boxing Scenarios

Summary:
Interesting scenarios where boxing might occur in .NET, even though it's not immediately obvious from examining the code.

  • Calling a non-virtual method defined in a base class
  • Calling a virtual method defined in a base class and not overridden
  • Calling the base implementation of a method
  • Generally: when a this reference is needed  

Let's talk about some advanced boxing scenarios - cases where boxing might occur when it's even less obvious. But if you've had enough information about boxing for one day, you can click the skip button and maybe come back to this section later.

The first scenario I'd like to discuss is calling a non-virtual method that is defined in a base class.

int num = 20;
Type type = num.GetType();

For example, GetType is a non-virtual method defined in object. So when we call GetType on an int, boxing occurs. Let's verify that by looking at the IL code.

IL_0001:  ldc.i4.s   20
IL_0003:  stloc.0
IL_0004:  ldloc.0
IL_0005:  box        [mscorlib]System.Int32
IL_000a:  call       instance class [mscorlib]System.Type
                     [mscorlib]System.Object::GetType()
IL_000f:  stloc.1

And here is the boxing operation: the box instruction at IL_0005. Let me explain why it occurs. In the implementation of every non-static method, including GetType, the code can always use the this keyword to obtain the instance that the method currently operates on. Now keep in mind that GetType is defined in object, and the this keyword, in the context of the object class, is expected to be of type object, which is a reference type. Therefore, this must have something on the heap to refer to. And this is why the int had to be boxed and placed on the heap before we could call GetType.
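When the type is already known at compile time, one way to sidestep this box is to use typeof instead of calling GetType on the value. The following is only a minimal sketch. The class name GetTypeDemo and its two helper methods are made up for illustration, and the sketch doesn't depend on whether a particular JIT removes the box on its own.

```csharp
using System;

static class GetTypeDemo
{
    // Calls the non-virtual Object.GetType on an int.
    // The C# compiler emits a box instruction for this call.
    public static Type ViaGetType(int num) => num.GetType();

    // typeof is resolved from metadata; no instance is involved,
    // so there is nothing to box.
    public static Type ViaTypeof() => typeof(int);

    static void Main()
    {
        int num = 20;
        Console.WriteLine(ViaGetType(num) == ViaTypeof()); // prints "True"
    }
}
```

Both calls return the same Type object; only the first one needs an instance to call the method on.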

Another scenario that triggers boxing is calling a virtual method which is defined in the base class and is not overridden.

struct MyValType
{
}

...

MyValType v = new MyValType();
Console.WriteLine(v.ToString());

Here we have MyValType, which is a struct that doesn't override any method. We create an instance of it and it resides on the stack, of course. And here we call ToString. But since our struct doesn't override ToString, we end up executing the base class implementation of ToString. Now keep in mind that the base class is always a reference type, because only reference types can be derived from. So the this reference in the context of the base class must refer to something on the heap, which is why the value type instance must be boxed.

This scenario is even less obvious when you look at the IL code. As you can see here, the IL doesn't contain a BOX opcode.

IL_0001:  ldloca.s   v
IL_0003:  initobj    ConsoleApplication35.MyValType
IL_0009:  ldloca.s   v
IL_000b:  constrained. ConsoleApplication35.MyValType
IL_0011:  callvirt   instance string
                     [mscorlib]System.Object::ToString()
IL_0016:  call       void
                     [mscorlib]System.Console::WriteLine(string)

The constrained prefix checks whether the value type it operates on overrides the method. If it does, the override is called directly on the value and nothing is boxed. If it doesn't, the value type is boxed and the base class implementation is called. This is less obvious because many of us are trained to search the IL for the BOX opcode, so keep in mind that CONSTRAINED may also perform boxing.
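To make the two outcomes of constrained concrete, here is a small sketch with two hypothetical structs (the names NoOverride, WithOverride and ConstrainedDemo are mine, not from the example above). Both ToString calls go through the same constrained and callvirt pair; only the first one has to fall back to the base class implementation and therefore box.

```csharp
using System;

struct NoOverride
{
    public int Value;
}

struct WithOverride
{
    public int Value;

    // The struct supplies its own ToString, so the constrained call
    // can invoke it directly on the value, without boxing.
    // Value.ToString() is called explicitly so the int isn't boxed either.
    public override string ToString() => "Value = " + Value.ToString();
}

static class ConstrainedDemo
{
    // No override: ends up in the base class implementation, which needs a box.
    public static string Describe(NoOverride v) => v.ToString();

    // Override present: called directly on the value.
    public static string Describe(WithOverride v) => v.ToString();

    static void Main()
    {
        Console.WriteLine(Describe(new NoOverride { Value = 1 }));   // prints "NoOverride"
        Console.WriteLine(Describe(new WithOverride { Value = 1 })); // prints "Value = 1"
    }
}
```

So overriding ToString in a struct is a simple way to keep this call from boxing.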

And the last scenario of advanced boxing I want to discuss occurs when we call the base implementation of a method.

struct MyValType
{
    public override string ToString()
    {
        return "My type is:" +
            base.ToString();
    }
}

The struct that we have here overrides ToString, as opposed to the struct in the previous example. But it also calls the base implementation of ToString. And this is really the same thing. If we call the base class implementation, we have to have a this that refers to something on the heap, and so boxing must occur first.
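If the only thing needed from the base implementation is the type name, the box can be avoided by not going through base at all. This is a hedged sketch, assuming that the default output of ValueType.ToString (the type's full name) is all the original code wanted. The struct is renamed MyValTypeNoBox here to mark it as a variation, and BaseCallDemo is a made-up host class.

```csharp
using System;

struct MyValTypeNoBox
{
    public override string ToString()
    {
        // typeof needs no instance, so nothing has to be boxed,
        // unlike base.ToString(), which needs a heap object for "this".
        return "My type is:" + typeof(MyValTypeNoBox).FullName;
    }
}

static class BaseCallDemo
{
    static void Main()
    {
        var v = new MyValTypeNoBox();
        Console.WriteLine(v.ToString()); // prints "My type is:MyValTypeNoBox"
    }
}
```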

At least in this case it's obvious by looking at the IL code.

IL_0001:  ldstr      "My type is:"
IL_0006:  ldarg.0
IL_0007:  ldobj      ConsoleApplication35.MyValType
IL_000c:  box        ConsoleApplication35.MyValType
IL_0011:  call       instance string
                     [mscorlib]System.ValueType::ToString()
IL_0016:  call       string [mscorlib]System.String::Concat(string,
                                                            string)
IL_001b:  stloc.0
IL_001c:  br.s       IL_001e
IL_001e:  ldloc.0

Do you see a pattern here? In all three cases, the reason for the boxing was that we needed something on the heap for the this reference to refer to.
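One way to observe that a box really is a separate object on the heap is to compare references. In this sketch (the class name BoxIdentityDemo and its methods are made up for illustration), the same int is boxed twice, and the two boxes turn out to be distinct objects that hold equal values.

```csharp
using System;

static class BoxIdentityDemo
{
    public static bool SameBox(int num)
    {
        object first = num;   // first box
        object second = num;  // second box: a fresh heap copy of the value
        return object.ReferenceEquals(first, second);
    }

    public static bool SameValue(int num)
    {
        object first = num;
        object second = num;
        // Int32.Equals compares the values stored inside the boxes.
        return first.Equals(second);
    }

    static void Main()
    {
        Console.WriteLine(SameBox(20));   // prints "False"
        Console.WriteLine(SameValue(20)); // prints "True"
    }
}
```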