I promised myself not to write new articles until Svelto.ECS 3.0 is out. However, a recent tweet asking how to achieve zero allocation code in Unity inspired me to write a short piece about the importance of memory allocation in C# and game development, which I had previously planned as part of a new ECS theory article. Oh well, this means that I won’t focus on ECS much.

C# wasn’t initially designed for game development, but today, thanks to Unity, ECS and Burst, we can achieve performance that was previously the exclusive domain of products developed in C++. However, to get these results, the C# programmer must understand the fundamental limitations of operating systems and CPU architectures, like a C++ programmer does. The C# programmer must also know how C# works at its core to understand why seemingly harmless behaviors can heavily affect execution performance.

This article is therefore aimed at people who want to be sure they are using their tools optimally. If you don’t need this kind of performance, the article is still useful as general knowledge.

Allocating data structures and objects

Let’s start from the C# memory allocation strategy. We all know that C# uses garbage collection, but this doesn’t come for free. According to my tests, allocating an empty class is 3 times slower than allocating the same class using native memory (Marshal.AllocHGlobal) and 50 times slower than creating a new struct. Structs are much faster because no heap allocation is ever involved with them, unless boxing/unboxing happens by mistake.

There is more to it though: objects are not garbage collected as long as they are referenced by something. How does the GC know if a reference is still held though? Contrary to what one may think, assigning a reference to a field of an object is not just an assignment! More, slower, stuff happens under the hood to help the garbage collector detect what is referenced by what, hence working with references, instead of structs, is slower not just because of allocations.

Of course, let’s not forget that allocating continuously eventually leads to a garbage collection kicking in at any point during execution, which is made worse by the fact that the Unity garbage collector is one of the slowest implementations out there.

If you want to know more about what happens under the hood with C# memory allocation, you should definitely read this book:

Remember that even with a GC, memory leaks are still possible! The most common cause of memory leaks is stateful static classes not clearing their references. As static classes are never collected, what they reference will never be released either!

What is the best solution to the consequences of using object references? In a few words: using the Entity Component System paradigm, through a proper implementation of it, which pushes the programmer to use just structs. Since this is not an ECS related article, what alternatives do we have?

C++ coders know well that per-frame allocations must be avoided at all costs. I don’t know any valid argument that would justify allocating every frame either. Allocations triggered by user input, which happen every now and then, may be OK.

The standard strategy is then to preallocate and reuse as much as you can. If you are going to use 1000 bullets at most, allocate an array of 1000 bullets even if normally far fewer are used.

Avoid using any data structure that allocates continuously. Your friends are Arrays, Lists and Dictionaries. Preallocate them and don’t allow any resizing during the execution of the game. Clear instead of New should be the common pattern in order to reuse data structures at the beginning of every frame.
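To make the "preallocate and Clear" pattern concrete, here is a minimal sketch. The Bullet struct, the BulletBuffer class and all member names are hypothetical and only serve the illustration; the point is that every container is sized once for the worst case and then only cleared, never recreated.

```csharp
using System.Collections.Generic;

// Hypothetical bullet data, kept as a struct so no per-bullet object exists.
public struct Bullet
{
    public float X, Y, Speed;
}

public class BulletBuffer
{
    public const int MaxBullets = 1000;

    // Allocated once, sized for the worst case.
    readonly Bullet[] _bullets = new Bullet[MaxBullets];
    readonly List<int> _hitsThisFrame = new List<int>(MaxBullets);

    public int ActiveCount { get; private set; }
    public int HitsCount => _hitsThisFrame.Count;
    public int HitsCapacity => _hitsThisFrame.Capacity;

    public bool TrySpawn(float x, float y, float speed)
    {
        if (ActiveCount == MaxBullets) return false; // never grow at run time
        _bullets[ActiveCount++] = new Bullet { X = x, Y = y, Speed = speed };
        return true;
    }

    public void RegisterHit(int bulletIndex) => _hitsThisFrame.Add(bulletIndex);

    // Clear keeps the backing array, so the same list is reused every frame.
    public void BeginFrame() => _hitsThisFrame.Clear();
}
```

Calling BeginFrame() at the start of each frame empties the list without releasing its capacity, so adding up to MaxBullets hits again should not trigger any resize.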

I don’t use GameObjects (or any other kind of object) in my current Unity project, so I don’t need to use Object Pools. However, Object Pools are your friends for exactly the same reason. They allow you to reuse objects from a preallocated array instead of creating them every time! Use Object Pools.
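A pool can be as simple as the following sketch. SimplePool is a hypothetical name, not a Unity or Svelto type, and the factory delegate is only invoked during warm-up (or when the pool runs dry, which a real game would probably treat as a sizing bug).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal pool: objects are created up front and then recycled.
public class SimplePool<T> where T : class
{
    readonly Stack<T> _free;
    readonly Func<T> _factory;

    public SimplePool(int capacity, Func<T> factory)
    {
        _factory = factory;
        _free = new Stack<T>(capacity);
        for (int i = 0; i < capacity; i++)
            _free.Push(factory()); // all allocations happen here, at warm-up
    }

    public int FreeCount => _free.Count;

    // Falls back to the factory only if the pool was sized too small.
    public T Rent() => _free.Count > 0 ? _free.Pop() : _factory();

    public void Return(T item) => _free.Push(item);
}
```

Renting, returning and renting again hands back the same instance, which is exactly the reuse we are after.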

Boxing/Unboxing and other trickier allocations

As far as C# allocations go, this was the simplest part. The trickier, sneakier allocations are instead related to boxing/unboxing, external variable capturing and wrong use of delegates.

These are so tricky that the only way to be sure you don’t fall for them is to use tools that help you visualise what’s going on. Let’s see what can help us:

  • JetBrains Rider and its Heap Allocations Viewer plugin. This is useful most of the time, although sporadically it is not reliable. Visual Studio also has something similar, but I haven’t used it in a while, so try it (at least Visual Studio has a free version).
  • Static code analyzers. Unity is maintaining Project Auditor. I haven’t used it yet, but I love this kind of tool. Please leave me some feedback about it if you use it. Static code analyzers are able to check your code and tell you the places where things can go wrong.
  • Use any IL viewer and check the code you have doubts about. Of course this is the worst tool, because you need to have some suspicions first, while the others will warn you about what’s wrong.
  • Use the Unity Profiler (see below).
  • Use the Unity Test suite to write functional tests for your code.

If you don’t know what boxing is, you should look it up. Briefly, as I hinted, C# treats objects differently from structs/value types. However, there are cases where the user may accidentally ask C# to convert a struct/value type into an object, ending up with a new allocation for this object.

Interface boxing

Boxing is the process, hidden from the user, of transforming a struct into an object by copying the struct inside a new object that must be allocated. It’s even worse than allocating an object, because boxing can happen multiple times without the user realising it. The most common way to box is to cast a struct to an interface. Let’s say that you have a struct (TestStructI) that implements an interface (ITest), and then you decide to cast the struct to ITest or assign it to a class that has an ITest field. WRONG, BOXING will trigger an allocation!
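A minimal sketch of the problem, reusing the ITest and TestStructI names from the paragraph above. The Value property and the BoxingExample helper are hypothetical additions for the illustration. The generic method with a struct constraint is one common way to call interface members without boxing.

```csharp
public interface ITest
{
    int Value { get; }
}

public struct TestStructI : ITest
{
    public int Value { get; }

    public TestStructI(int value) { Value = value; }
}

public static class BoxingExample
{
    // The parameter is interface-typed: passing a struct here copies it
    // into a newly allocated heap object (a box) at every call site.
    public static int ReadBoxed(ITest test) => test.Value;

    // T is known to be a struct at compile time, so the interface member
    // is called directly on the value and no box is created.
    public static int ReadGeneric<T>(T test) where T : struct, ITest => test.Value;
}
```

Each conversion creates its own box: assigning the same struct to two ITest variables yields two distinct heap objects, which is why boxing inside a loop adds up so quickly.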

Rider underlines in red ALL the operations that result in memory allocation (at least we hope so)

In the same way, assigning the return value of a GetEnumerator() to an IEnumerator results in boxing, as most of the time enumerators are implemented as structs.

Tip: when you implement your own enumerable/enumerator, implement it as a struct. There is no reason to use classes for custom enumerators. You don’t even need to implement the IEnumerator/IEnumerable interfaces: foreach will look for the GetEnumerator() method regardless. This way foreach will also be slightly faster because the IDisposable pattern code won’t be generated. A proof of concept can be found here.
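As a rough sketch of the tip (this is not the linked proof of concept, and ArrayRange is a hypothetical type), a struct enumerable only needs a public GetEnumerator() whose return type exposes MoveNext() and Current. No interface is implemented anywhere.

```csharp
// Hypothetical view over the first `count` items of an array.
public struct ArrayRange
{
    readonly int[] _items;
    readonly int _count;

    public ArrayRange(int[] items, int count)
    {
        _items = items;
        _count = count;
    }

    // foreach binds to this method by pattern, no IEnumerable required.
    public Enumerator GetEnumerator() => new Enumerator(_items, _count);

    public struct Enumerator
    {
        readonly int[] _items;
        readonly int _count;
        int _index;

        internal Enumerator(int[] items, int count)
        {
            _items = items;
            _count = count;
            _index = -1;
        }

        public int Current => _items[_index];

        public bool MoveNext() => ++_index < _count;
    }
}

public static class RangeExample
{
    public static int Sum(ArrayRange range)
    {
        int sum = 0;
        foreach (int value in range) // the enumerator stays a struct local
            sum += value;
        return sum;
    }
}
```

Because the enumerator is never converted to an interface, it is never boxed, and since it does not implement IDisposable the compiler has no Dispose call to emit.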

Note that the Unity Mono implementation once had a bug that forced the boxing of every enumerator inside a foreach, but this is not the case anymore.

Boxing through IEqualityComparer

In previous projects I worked on, this happened a few times. Algorithms that need an IEqualityComparer and fall back to EqualityComparer<T>.Default would make tons of allocations if T is a struct that doesn’t define its equality explicitly (that is, it doesn’t implement IEquatable<T> and no custom IEqualityComparer<T> is provided). A common case is using structs as dictionary keys without implementing IEquatable<T>. Nowadays I am so wary of it that, if needed, I actually implement IEquatable<EGID>, IEqualityComparer<EGID>, IComparable<EGID>.
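Here is a minimal sketch of a struct key that is safe to use in a Dictionary. EntityKey is a hypothetical stand-in (it is not the real EGID), and the static counter exists only to show which Equals overload the dictionary ends up calling.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type, loosely inspired by an entity id plus a group id.
public readonly struct EntityKey : IEquatable<EntityKey>
{
    // Diagnostic only: counts calls to the object-based overload,
    // which is the one that requires boxing the argument.
    public static int BoxedEqualsCalls;

    public readonly int Id;
    public readonly int Group;

    public EntityKey(int id, int group)
    {
        Id = id;
        Group = group;
    }

    // Strongly typed comparison, picked up by EqualityComparer<T>.Default.
    public bool Equals(EntityKey other) => Id == other.Id && Group == other.Group;

    public override bool Equals(object obj)
    {
        BoxedEqualsCalls++;
        return obj is EntityKey other && Equals(other);
    }

    public override int GetHashCode() => (Id * 397) ^ Group;
}
```

With IEquatable<EntityKey> in place, dictionary lookups go through the typed Equals and the counter stays at zero. Remove the interface from the declaration and the default comparer has to fall back to Equals(object).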

Iterator blocks

If you use iterator blocks (which are IEnumerator functions that use the yield keyword, ergo: coroutines), be aware that every time you use an iterator block like StartCoroutine(IteratorBlock()) a new iterator block object is created and allocated!
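The following sketch shows the allocation in plain C#, together with one possible way around it: a hand-written IEnumerator that is created once and restarted. Both WaitFrames and ReusableWait are hypothetical names, and whether the reusable version fits your coroutine runner is an assumption to verify with the profiler.

```csharp
using System.Collections;

public static class IteratorExample
{
    // Compiler-generated state machine: every call returns a NEW object.
    public static IEnumerator WaitFrames(int frames)
    {
        for (int i = 0; i < frames; i++)
            yield return null;
    }
}

// Hand-written equivalent that can be allocated once and restarted.
public sealed class ReusableWait : IEnumerator
{
    int _remaining;

    public ReusableWait Restart(int frames)
    {
        _remaining = frames;
        return this; // same instance every time
    }

    public object Current => null;

    public bool MoveNext() => _remaining-- > 0;

    public void Reset() { _remaining = 0; }
}
```

Two calls to WaitFrames(3) produce two different heap objects, while Restart always hands back the instance that was allocated up front.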

Coroutines are tricky as well, but another simple piece of advice is to reuse the Unity yield instructions, like WaitForSeconds, as much as you can. Do not new them every time you need to yield them; cache the instance in a variable and reuse it.

Be careful when initializing structs whose type is defined through a generic parameter

Recently I was surprised by the profiler: I suddenly got this (consider that our product is allocation free at run time)

Why, all of a sudden, thousands of allocations? It’s because I naively did this:

instead of this:

Even if the struct constraint is used, with the former the buffer is initialized through reflection! Madness!
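The original snippets are not reproduced here, so what follows is only a guess at the shape of the issue, assuming the difference was new T() versus default(T). As far as I know, the compiler lowers new T() on a generic parameter to a call to Activator.CreateInstance<T>(), and whether that allocates depends on the runtime. GenericInit and its methods are hypothetical names.

```csharp
public static class GenericInit
{
    // Assumption: this is lowered to Activator.CreateInstance<T>(),
    // which may go through reflection on some runtimes.
    public static T CreateWithNew<T>() where T : struct => new T();

    // Simply yields the zero-initialized value, with no call involved.
    public static T CreateWithDefault<T>() where T : struct => default(T);
}
```

For a struct without a custom parameterless constructor both methods return the same zeroed value, so switching to default(T) should be a safe change there.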

The params keyword

Using the params keyword for function parameters also leads to sneaky allocations, as an array is allocated under the hood whenever arguments are passed, even if it’s just one!
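A small sketch of the hidden array, with fixed-arity overloads as one possible workaround. ParamsExample and the LastArgs field are hypothetical; the field is there only to make the per-call array visible.

```csharp
public static class ParamsExample
{
    // Diagnostic only: remembers the array the compiler built for the call.
    public static int[] LastArgs;

    public static int SumParams(params int[] values)
    {
        LastArgs = values;
        int sum = 0;
        for (int i = 0; i < values.Length; i++)
            sum += values[i];
        return sum;
    }

    // Fixed-arity overloads for the common cases: no array involved.
    public static int Sum(int a) => a;
    public static int Sum(int a, int b) => a + b;
}
```

Calling SumParams(1) twice leaves two different arrays in LastArgs, one per call, while Sum(1, 2) passes its arguments directly.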

Lambdas and Delegates

If you use many lambdas and delegates you are in for trouble. Passing actions (and other delegates) as parameters always allocates unless you preallocate the delegate beforehand. In the following examples I am using the Unity Test Package to test memory allocation, which may confuse you (sorry about that). For the first case I wrote a function, TestDelegate, that accepts a System.Action. The way I am using it (passing a lambda) will cause an allocation:

Using TestDelegate like this will allocate memory, simply because a new delegate is allocated to hold the lambda reference.
The only way to avoid the allocation is to preallocate the delegate (e.g. inside a constructor) and hold its reference. This is OK as long as no variables need to be passed.

In reality, in the case of a simple lambda, C# does exactly the same thing under the hood: the lambda is allocated once, cached and reused.

However, lambdas should generally be avoided, because once you accidentally capture an external variable, C# is no longer able to cache the delegate automatically. Capturing an external variable means using any variable from outside the scope of the lambda.
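To illustrate the difference, here is a minimal sketch with hypothetical names. The non-capturing lambda is a candidate for the compiler’s caching (an implementation detail of current compilers, not a language guarantee), while the capturing one needs a fresh closure object and a fresh delegate on every call.

```csharp
using System;

public static class LambdaExample
{
    // Uses nothing from the enclosing scope: the compiler can cache it.
    public static Func<int> NonCapturing() => () => 42;

    // Captures `value`: a closure object and a delegate are allocated
    // every time this method runs.
    public static Func<int> Capturing(int value) => () => value;
}
```

Two calls to Capturing(1) return two distinct delegate instances even though the captured value is identical.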

Local functions won’t help either once passed as parameters, as there isn’t a struct implementation of the delegate class, hence an object will always be created.

If you need to use preallocated delegates that need parameters, then they should be used like this:

Note that I preallocate the MethodToCall delegate using the _preallocatedAction field, which can then be used later on by the MethodCallingTheDelegateLaterOn method. This example looks weird, but in reality it’s quite a common case when delegates are involved.
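The original snippet is not reproduced here; the sketch below is a reconstruction based only on the names mentioned above (MethodToCall, _preallocatedAction, MethodCallingTheDelegateLaterOn). The DelegateUser class, the Total property and the Invoke helper are hypothetical. The idea is that the argument travels as a delegate parameter instead of being captured.

```csharp
using System;

public class DelegateUser
{
    // Created once; holds the reference to MethodToCall.
    readonly Action<int> _preallocatedAction;

    public int Total { get; private set; }

    public DelegateUser()
    {
        _preallocatedAction = MethodToCall; // the only delegate allocation
    }

    void MethodToCall(int value)
    {
        Total += value;
    }

    public void MethodCallingTheDelegateLaterOn(int value)
    {
        // The data is passed as an argument, so nothing is captured
        // and no new delegate is needed.
        Invoke(_preallocatedAction, value);
    }

    static void Invoke(Action<int> action, int argument) => action(argument);
}
```

Compare this with writing Invoke(v => Total += v, value) inline: that lambda uses instance state, so a new delegate would have to be created on every call.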

Linq and Tasks

By the way, I am not even touching Linq here. While Linq is a terrific tool, it should be avoided like the plague in game development. Linq has been optimized a lot over time, but it is still the cause of, at least, many enumerator allocations. Note that there are libraries that promise Linq-like zero allocation code. Do they work? No clue, as I don’t use Linq, but you can have a look: https://github.com/NetFabric/NetFabric.Hyperlinq

I should also mention the use of the Task class and await/async, but if you use those, you probably already know what you are doing. If you still want to know more, check out this link which explains everything.

Strings

Let’s face it, if you are using strings at run time (frame based operations), you are doing something wrong. Always find an alternative to strings, but when they are totally necessary, be aware that most operations involving strings result in an allocation. Many articles will tell you to use StringBuilder as a way to alleviate the problem, but be aware that it won’t solve it. In fact, even if it saves a lot of allocations during concatenations and other operations, the final result, which is produced through ToString(), will allocate a new string. A StringBuilder is effective only if it’s reused and not recreated every single time.
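A minimal sketch of a reused StringBuilder, with a hypothetical ScoreLabel class. The builder is allocated once with enough capacity and cleared before each use. The final ToString() still allocates the resulting string, so this reduces garbage rather than eliminating it (and, depending on the runtime, Append(int) may allocate a temporary string too).

```csharp
using System.Text;

public class ScoreLabel
{
    // Allocated once; 32 characters is an arbitrary capacity for the sketch.
    readonly StringBuilder _builder = new StringBuilder(32);

    public string Format(int score)
    {
        _builder.Clear();                         // keeps the internal buffer
        _builder.Append("Score: ").Append(score);
        return _builder.ToString();               // the unavoidable allocation
    }
}
```

Ideally Format is called only when the score actually changes, not every frame.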

Profile It

Once you learn these tricks and good practices, what is left to do is to run the game and open the Unity editor profiler. Click on the Call Stack button (Unity 2019.3; it was previously called allocation callstack) and check closely for red allocations. In the following example I am making the mistake of assigning a struct to an interface. Every 10 frames the Unity Profiler shows a red mark in the timeline view (the hierarchy view is useful as well to check overall allocations).

Since 2019.3 it is even possible to track GC allocations from specific standalone clients (Windows for example). This is probably the easiest way to track and fix allocations at the moment.

The example shows just a tiny allocation that really won’t affect your performance much. However, when boxing creeps in, it’s easy to end up boxing in loops, resulting in thousands of allocations per frame.

All the optimizations discussed so far shouldn’t be applied until the profiler shows you that there is a real problem, otherwise you will fall into the premature optimization trap. However, for people like me who write 100% objectless ECS code, not striving for zero allocations may result in milliseconds spent on thousands of allocations per frame.

I also suggest having a look at the Unity performance test API. It is really convenient and it can also tell you whether allocations happen.

If I forgot something, or you have any questions or need more details on something, leave a comment below and I will reply ASAP.

Other references to read

https://docs.unity3d.com/Manual/BestPracticeUnderstandingPerformanceInUnity4-1.html

https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/types/boxing-and-unboxing

https://docs.unity3d.com/Manual/BestPracticeUnderstandingPerformanceInUnity5.html

https://docs.unity3d.com/Manual/BestPracticeUnderstandingPerformanceInUnity7.html

Comments
Alberto Gómez

Could you please expand a little bit more on tasks and awaits? Isn’t it better to use awaits than coroutines?

Eoin Ahern

Do foreach loops still generate garbage or boxing in unity 2019?

Onur

“According my tests, allocating an empty class is 3 times slower than allocating the same class using native memory”
That’s impossible. Allocating a single object is merely changing one pointer’s value.