Dan Byström’s Bwain

Blog without an interesting name


.Net structs

Posted by Dan Byström on November 28, 2016

Recently, I followed a link in a tweet by Alexandre Mutel (creator of SharpDX, etc.) to a blog post.

To my surprise, that blog post linked to a Stack Overflow question that I had written myself a year and a half ago or so.

That reminded me why I asked that question (I was experimenting with serialization and deserialization of struct arrays) and of the fact that I have several times encountered expert programmers with cargo cult ideas about the behavior of structs in .Net. It is not uncommon to hear statements such as “structs are always allocated on the stack”, “structs are slower than classes” and “a struct should not be larger than X bytes” (where X varies widely).

No matter what is true for structs in general, putting them in an array is often a game changer, and sometimes that is the best approach for a given problem. But as always: it depends. Although all of this is old and trivial stuff, I thought that I’d write something on this topic some day. That day happened to be today.

Try to compile the following code, one time with Demo declared as a struct and one time as a class:

public struct Demo // compile once as "struct" and once as "class"
{
  public double A;
  public double B;

  // Allocates an array and initializes every element.
  public static Demo[] AllocateArray(int size)
  {
    var arr = new Demo[size];
    for (var i = 0; i < arr.Length; i++)
      arr[i] = new Demo {A = i + 1, B = i + 2};
    return arr;
  }

  // Reads both fields of each element and stores their sum in the previous element.
  public static void PlayAround(Demo[] arr)
  {
    for (var i = 1; i < arr.Length; i++)
      arr[i - 1].A = arr[i].A + arr[i].B;
  }
}

Let’s see what happens when we allocate the array (both as x86 and x64) and check the memory consumption, after passing a size of 1,000,000 items:

           Memory (MB)
Class
– x86      26.7
– x64      38.2
Struct
– x86      15.3
– x64      15.3
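The post does not say how the memory consumption was measured. One way to get comparable numbers is sketched below, assuming GC.GetTotalMemory is a good enough probe; the MemoryProbe class and its method name are made up for this illustration.

```csharp
using System;

public struct Demo // change to "class" to measure the other case
{
    public double A;
    public double B;
}

public static class MemoryProbe
{
    // Returns the approximate number of managed bytes retained by an array of 'size' items.
    public static long MeasureArrayBytes(int size)
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);

        var arr = new Demo[size];
        for (var i = 0; i < arr.Length; i++)
            arr[i] = new Demo { A = i + 1, B = i + 2 };

        long after = GC.GetTotalMemory(forceFullCollection: true);
        GC.KeepAlive(arr); // keep the array reachable until after the second reading
        return after - before;
    }

    public static void Main()
    {
        long bytes = MeasureArrayBytes(1000000);
        Console.WriteLine("{0:F1} MB", bytes / (1024.0 * 1024.0));
    }
}
```

Build it once as x86 and once as x64, and once with each declaration of Demo, to fill in all four rows. The readings are approximate, since anything else allocated between the two calls is counted too.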

So, what’s going on? Take a look at the picture below.

[Figure: class_struct]

Each oval represents an object that can be referenced and garbage collected individually. In the first case, with classes, we allocate 1,000,001 individual objects. In the second case, with structs, we allocate just one big object in which all the structs reside in consecutive memory.

Each object has an overhead of 8 or 16 bytes (x86 or x64) (this is partially true), and a reference to an object takes 4 or 8 bytes (x86 or x64). So a reference plus the object overhead is 12 or 24 bytes (x86 or x64). The two doubles A and B take 16 bytes together in both x86 and x64.

Theoretical calculations then give the following (I omitted the object overhead of the array itself, since it is just silly in comparison to the other numbers):

           Formula            Memory (MB)
Class
– x86      1000000*(12+16)    26.7
– x64      1000000*(24+16)    38.1
Struct
– x86      1000000*16         15.3
– x64      1000000*16         15.3

Whoa! Theory and practice correlate! 🙂

Now let’s play around with the allocated array for a few iterations (by calling the PlayAround method):
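The arithmetic behind the theoretical figures is easy to check in code. Here is a minimal sketch, taking 1 MB to mean 1024 * 1024 bytes and using the per-item byte counts from the text; the MemoryEstimate class is just a name picked for this example.

```csharp
using System;

public static class MemoryEstimate
{
    // Size in MB (1 MB = 1024 * 1024 bytes) of 'count' items at 'bytesPerItem' bytes each.
    public static double Megabytes(long count, int bytesPerItem)
    {
        return count * bytesPerItem / (1024.0 * 1024.0);
    }

    public static void Main()
    {
        const long count = 1000000;
        const int payload = 16; // two doubles, A and B

        // Class: reference + object overhead + payload. Struct: payload only.
        Console.WriteLine("Class  x86: {0:F1}", Megabytes(count, 12 + payload));
        Console.WriteLine("Class  x64: {0:F1}", Megabytes(count, 24 + payload));
        Console.WriteLine("Struct x86: {0:F1}", Megabytes(count, payload));
        Console.WriteLine("Struct x64: {0:F1}", Megabytes(count, payload));
    }
}
```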

           Time (ms)
Class
– x86      258
– x64      313
Struct
– x86      117
– x64      128
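A timing like this can be taken with System.Diagnostics.Stopwatch. The sketch below is only an assumption about the setup: the iteration count of 100 and the warm-up call are my own choices for illustration, so do not expect the exact numbers above.

```csharp
using System;
using System.Diagnostics;

public struct Demo // change to "class" to time the other case
{
    public double A;
    public double B;
}

public static class Benchmark
{
    public static Demo[] AllocateArray(int size)
    {
        var arr = new Demo[size];
        for (var i = 0; i < arr.Length; i++)
            arr[i] = new Demo { A = i + 1, B = i + 2 };
        return arr;
    }

    public static void PlayAround(Demo[] arr)
    {
        for (var i = 1; i < arr.Length; i++)
            arr[i - 1].A = arr[i].A + arr[i].B;
    }

    // Runs PlayAround 'iterations' times and returns the elapsed milliseconds.
    public static long TimePlayAround(Demo[] arr, int iterations)
    {
        PlayAround(arr); // warm-up call, so JIT compilation is not part of the timing

        var sw = Stopwatch.StartNew();
        for (var n = 0; n < iterations; n++)
            PlayAround(arr);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var arr = AllocateArray(1000000);
        Console.WriteLine("{0} ms", TimePlayAround(arr, 100)); // 100 is an assumed count
    }
}
```

Run it as a release build without a debugger attached, otherwise the numbers say very little.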

Now let’s try to make the Demo struct/class significantly bigger by adding a few more doubles to it, bringing it up to a total of 128 bytes, by making the declaration look like this:

  public double A;
  public double B;
  public double pad1;
  public double pad2;
  public double pad3;
  public double pad4;
  public double pad5;
  public double pad6;
  public double pad7;
  public double pad8;
  public double pad9;
  public double pad10;
  public double pad11;
  public double pad12;
  public double pad13;
  public double pad14;

           Time (ms)
Class
– x86      1018
– x64      904
Struct
– x86      625
– x64      649

As you can see, in this particular case the rule of thumb that a struct should not be larger than X bytes does not apply (there is an MSDN article claiming that it should not be more than 16 bytes, for example).

As a side note, when it comes to passing around large structs, you always have the choice of passing them by ref, even if you do not mean to modify them, although that may confuse a reader of your code. In this case they won’t be copied onto the stack, but passed by reference in a similar way as normal class objects are passed by default.
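To illustrate the difference, here is a small sketch with a 128-byte struct. BigDemo, RefDemo and the two Sum methods are hypothetical names made up for this example.

```csharp
using System;

public struct BigDemo // 16 doubles = 128 bytes
{
    public double A, B;
    public double Pad1, Pad2, Pad3, Pad4, Pad5, Pad6, Pad7;
    public double Pad8, Pad9, Pad10, Pad11, Pad12, Pad13, Pad14;
}

public static class RefDemo
{
    // By value: the callee works on its own copy of the whole struct.
    public static double SumByValue(BigDemo d)
    {
        return d.A + d.B;
    }

    // By ref: only a reference is passed, so no copy is made,
    // but the callee is also able to modify the caller's struct.
    public static double SumByRef(ref BigDemo d)
    {
        return d.A + d.B;
    }

    public static void Main()
    {
        var arr = new BigDemo[1];
        arr[0].A = 1;
        arr[0].B = 2;

        Console.WriteLine(SumByValue(arr[0]));   // copies arr[0] for the call
        Console.WriteLine(SumByRef(ref arr[0])); // refers directly to the array element
    }
}
```

Note that an array element can be passed by ref directly, so a struct living in an array never has to be copied out of it first.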

So, to conclude, when putting structs into an array, you get a behavior that may contradict what is commonly claimed about structs.

Choose the best tool for your current problem and always benchmark instead of guessing.

Thanks for reading, and please remember that fast code consumes less energy.

PS. What I experimented with a while ago was serializing struct arrays directly to disk with no intervening code at all. It turned out that with a local SSD this method would beat the crap out of any serializer I tested. But when serializing over a network, compressing the data as, for example, Protobuf.Net does, gave better performance.
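The code from that experiment is not shown here, so the following is not it. It is just one possible sketch of the idea using MemoryMarshal.AsBytes, which arrived later (it needs .NET Core 2.1 or newer) and only works for structs without reference-type fields. RawArrayFile is a hypothetical helper name.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public struct Demo
{
    public double A;
    public double B;
}

public static class RawArrayFile
{
    // Writes the raw bytes of the array straight to disk, without per-item code.
    public static void Write(string path, Demo[] arr)
    {
        using (var fs = File.Create(path))
            fs.Write(MemoryMarshal.AsBytes(arr.AsSpan()));
    }

    // Reads the file back by filling the memory of a new array directly.
    public static Demo[] Read(string path)
    {
        using (var fs = File.OpenRead(path))
        {
            var arr = new Demo[fs.Length / Marshal.SizeOf<Demo>()];
            Span<byte> bytes = MemoryMarshal.AsBytes(arr.AsSpan());
            while (bytes.Length > 0)
            {
                int n = fs.Read(bytes); // may return fewer bytes than requested
                if (n == 0) throw new EndOfStreamException();
                bytes = bytes.Slice(n);
            }
            return arr;
        }
    }
}
```

The file produced this way has no header or version information and depends on the struct layout and byte order of the machine that wrote it, so treat it as a local cache format rather than an interchange format.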

DS. Thanks to Peter af Geijerstam and Per Rovegård for corrections.
