Optimize Animations
Hello,
Recently we were experimenting with ways to solve a performance problem with Spine animations in Unity.
Let's say we have 100 units on screen at a time, all running animations:
-late update:
-normal update also:
During idle gameplay, animations eat up 60% of resources (the percentages in the screenshots differ, because a lot is used up by the editor). 20% is used by the renderer and 20% by the editor. Actual gameplay mechanics take only 1-2% of CPU resources (5% at peaks when creating objects).
We thought: OK, what can we do? Let's try turning off the idle animations and leave only unit movement. It will look much uglier, but hey, a trade-off is a trade-off.
So we did that. After wandering around, units play a single idle animation, then the animator stops and won't move any bones or deform the mesh.
But surprisingly, there is virtually zero performance improvement, even though 80% of units are now standing still and not animating anything.
Is this intended behavior? Is this fixable? What can we do to save the CPU from these hungry animations?
Our target platform is WebGL: the game will run in the browser, using only a single thread.
You need to turn off the SkeletonAnimation or SkeletonRenderer component to not update at all. Otherwise the mesh will still be generated every frame, constraint solvers will have to solve bone locations, bone matrices will be calculated, etc.
Did you have a look at the Spine Examples/Other Examples/FixedTimestepUpdates example scene? This way you could still show a bit of movement by updating only every Nth frame. Just be sure not to update all units on the same frame and none on the following ones; distribute the updates evenly instead. E.g. you could update units every 3 frames like this:
shallUpdate = (Time.frameCount % 3 == myUpdateFrame);
with myUpdateFrame being 0, 1 or 2. You would then initialize myUpdateFrame like this:
myUpdateFrame = unitIndex % 3;
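As a minimal sketch of this staggering scheme, the check could be wrapped in a small helper. The class and method names below are hypothetical and not part of spine-unity; in Unity the frame number would come from Time.frameCount.

```csharp
using System;

// Hypothetical helper (not part of spine-unity): decides whether a unit
// should run its animation update on the given frame.
public static class StaggeredUpdate
{
    // interval: each unit is updated once every `interval` frames.
    public static bool ShallUpdate(int frameCount, int unitIndex, int interval)
    {
        int myUpdateFrame = unitIndex % interval; // fixed slot per unit
        return frameCount % interval == myUpdateFrame;
    }

    // Counts how many of `unitCount` units would update on a given frame.
    public static int CountUpdating(int frameCount, int unitCount, int interval)
    {
        int count = 0;
        for (int i = 0; i < unitCount; i++)
        {
            if (ShallUpdate(frameCount, i, interval)) count++;
        }
        return count;
    }
}
```

With 99 units and an interval of 3, this spreads the work so that 33 units update on every frame, instead of all 99 updating on every third frame.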
Thank you for the reply, those are some really nice hints. Seems like promising solutions. :nerd:
Will add a task to test these approaches and return here with the results.
Did you have a look at the Spine Examples/Other Examples/FixedTimestepUpdates example scene?
Yes I did.
The example scene uses SkeletonAnimation, whereas our project is set up to use SkeletonMecanim. There is neither a SkeletonAnimation nor a SkeletonRenderer component on the character model. The components are: SkeletonMech, MeshRenderer, Animator, SkeletonMecanim.
Is it still possible to control updates? Because checking on the component functions there's LateUpdate() but no Update()...
CarrotPie wrote: Is it still possible to control updates? Because checking on the component functions there's LateUpdate() but no Update()...
I see the Update() method here: spine-runtimes/SkeletonMecanim.cs at 3.8
I guess you mean that there is no Update(float timedelta) version in SkeletonMecanim. Since playback is controlled by the Animator component, you would have to control the Mecanim time behaviour (which unfortunately Unity has not exposed, as far as I know). So the best way is to disable the SkeletonMecanim component to disable the updates, and only enable it for the single Nth frame to see the updated mesh. This way the Mecanim and timeline overhead is still there, but you save a lot of computation for updating the world transforms and the mesh.
Harald wrote: I guess you mean that there is no version Update(float timedelta) at SkeletonMecanim.
Yes, obviously...
So today I've implemented both optimizations via your suggested method - enabling/disabling SkeletonMecanim.
Fully idle - there are 108 units in the scene. 8 units are not touched, 100 are. There is also some environment, but its impact is negligible. Tried out in two modes, when minimum units are animated (idle) and when maximum.
Under these conditions, while testing in the Unity Editor, the results are as follows.
fully idle (FPS) / maximum animated (FPS)
Before: 29 / 29
disabling on idle via Lists<>: 70 / 29
disabling via HashSet<>: 75 / 45
frame skipping + disabling: 45 / 30
At first I used lists, but since they are not needed I moved on to an array of HashSets for a considerable improvement. As you can see, frame skipping does not seem like a viable option, at least in our case. Under full load, it barely differs from what we had before applying the optimization.
I just noticed that in my post above I incorrectly stated that disabling the SkeletonMecanim component leaves Timeline updates active. Actually these are disabled as well, so no Spine-animation related task remains.
Regarding your measurements:
CarrotPie wrote: Fully idle - there are 108 units in the scene. 8 units are not touched, 100 are.
What do you mean by this sentence? What does "touched" mean?
CarrotPie wrote: Tried out in two modes, when minimum units are animated (idle) and when maximum.
What are "minimum" and "maximum" units? To me this means 0=minimum, and 108=all=maximum.
I assume that you mean that the units are animated in a minimal way (idle), vs animated in a more active way.
CarrotPie wrote: At first I used lists, but then as they are not needed moved on to an array of HashSets for a considerable improvement.
What do you use these Lists and HashSets for? It seems as if you've got some general problems going on with your activation/deactivation implementation if you gain 15fps from switching containers at such a small element count.
In general it would be helpful if you could phrase your sentences in a way that does not need to be interpreted.
What you could also do in your case, instead of distributing the update across multiple frames, is use the RenderExistingMesh example component, found in Spine Examples\Scripts\Sample Components. While it was created to re-render an existing mesh with an outlines-only shader, it should be very suitable for rendering identical idle animations.
If you have 108 units on screen and don't want all of them to move in perfect sync, you could have e.g. 3-5 different master animation objects and copy random ones, giving you 3-5 identical groups that break up repetition and feel more natural.
This would be the most cost-effective solution that I can currently think of for your scenario.
...
Harald wrote: using the RenderExistingMesh example component
...
We have transitions and everything, so we use Unity's Mecanim to blend (blend trees) between them for all directions (8 in total). Your suggestion feels like it would have to override this current method, and since we are running out of development time, I doubt it would make an impact enough to be worth it at this point. I will definitely look into it though, thanks!
Harald wrote: What do you mean by this sentence? What does "touched" mean?
I meant that 8 units are not touched by scripts: they are always animating, with no frame skipping etc., while the 100 units received all performance treatments.
Harald wrote: What are "minimum" and "maximum" units?
This is for our case. Did not want to go into details, because there'd be walls of text. There is a random factor to how many units animate at a time. When fully idle, they still walk around somewhat at random.
Harald wrote: It seems as if you've got some general problems going on with your activation/deactivation implementation if you gain 15fps from switching containers at such a small element count
Wouldn't consider them problems. There may be a smarter/faster way to implement this, but that is how I did it and it works OK enough. In any case, you'd have to track the 100 units every frame and iterate through half of them to change state. Loops every frame are costly, but sadly in this case unavoidable. If you're interested, here's a Gist of the Controller: https://gist.github.com/Darth-Carrotpie/be4a3037426607ad3812a63b79d3c364.
Harald wrote: In general it would be helpful if you could phrase your sentences in a way that does not need to be interpreted.
Sorry, I am phrasing my sentences as minimal as possible, to avoid walls of text - trying to lay it out short and accurate :upsidedown:
I doubt it would make an impact enough to be worth it at this point
It would yield the same framerate as when the whole SkeletonMecanim component is disabled. I assume by your doubt you mean that the visual difference is not worth it compared to deactivating it and not animating it at all.
CarrotPie wrote: trying to lay it out short and accurate
Thanks, this would be great for the future. Just avoiding major ambiguities would be enough, otherwise we cannot judge the current situation and your posted measurements become worthless without knowing what has been measured.
CarrotPie wrote: Wouldn't consider them problems. [..] In any way, you'd have to track the 100 units every frame, iterate through half to change state. Loops every frame are costly, but sadly in this case unavoidable.
Don't worry, a single loop executed every frame is not costly at all. What was most likely costly in your previous List<> based solution is that you were erasing random elements one-by-one, which is of course costly.
Harald wrote: I assume by your doubt you mean that the visual difference is not worth it compared to deactivating it and not animating it at all.
Yes. That. If we had more time, I would surely experiment more on this though
Harald wrote: Just avoiding major ambiguities would be enough
I get it. Sorry for that... :upsidedown:
Harald wrote: What most likely was costly at your previous List<> based solution is that you were randomly erasing elements one-by-one, this is of course costly.
There was as minimal overhead as I could think of. The solution was exactly the same as it is now, just different type of object (as mentioned HashSet instead of List). As they mention in docs, HashSet is just an unordered List, but it performs waaaay faster, perhaps even close to an array speed. :detective:
Harald wrote: Don't worry, a single loop executed every frame is not costly at all.
Well Unity doesn't like loops within Update(), not to mention nested ones :upsidedown: Gotta keep them to a minimum...
CarrotPie wrote: There was as minimal overhead as I could think of. The solution was exactly the same as it is now, just different type of object (as mentioned HashSet instead of List). As they mention in docs, HashSet is just an unordered List, but it performs waaaay faster
Erasing N random elements from an ordered List<> one-by-one has complexity O(n^2). Removing the same N elements from a HashSet is O(n). This was my actual point.
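To make the difference concrete, here is a small sketch that counts how many elements an ordered List<> has to shift when items are erased one-by-one. The helper name is hypothetical and only serves as an illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of why one-by-one removal from a List<> is costly.
public static class RemovalCost
{
    // Removes each item in `toRemove` from an ordered list and returns
    // the total number of elements that had to be shifted down.
    public static long ShiftsForListRemoval(List<int> list, IEnumerable<int> toRemove)
    {
        long shifts = 0;
        foreach (int item in toRemove)
        {
            int index = list.IndexOf(item);   // linear search: O(n)
            if (index < 0) continue;          // item not present, nothing to do
            shifts += list.Count - 1 - index; // elements behind it move down by one
            list.RemoveAt(index);             // shifts the tail: O(n)
        }
        return shifts;
    }
}
```

Each removal is O(n) for the search plus O(n) for the shift, so removing n elements this way is O(n^2) in total. HashSet<T>.Remove does no shifting and takes constant time on average.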
CarrotPie wrote: perhaps even close to an array speed.
Naively erasing from an array is not fast, it's the same as with a List<>. A List<> is just a managed array with a size and automatic reallocation management.
CarrotPie wrote: Well Unity doesn't like loops within Update(), not to mention nested ones :upsidedown: Gotta keep them to a minimum...
This is an incorrect generalization of something that you may have read or heard somewhere.
A single loop on a single GameObject that calls a method Foo on N objects is faster than calling the same Foo method via a component Update() on each of the N objects.
What you should perhaps keep to a minimum is nested loops (erasing N elements one-by-one randomly from a List<> or array) or a loop that is executed at every GameObject (if N is large).
FWIW, when using an array or list and you need to remove but don't care about order, you can just move the last element to the index you want to remove. This avoids copying all the items above the one you removed down one. The advantage this has over a hash set is iterating an array or list is very fast, while iterating a hash set is not quite as fast. The main reason to use a hash set is not usually for frequent removals, but instead to know if a particular item is in the set.
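A minimal sketch of that "move the last element into the gap" removal for a List<> might look like this. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical helper: O(1) removal from a List<> when order does not matter.
public static class SwapRemove
{
    // Overwrites the element at `index` with the last element, then drops
    // the last slot. Element order is NOT preserved.
    public static void RemoveAtSwapBack<T>(List<T> list, int index)
    {
        int last = list.Count - 1;
        list[index] = list[last]; // fill the gap with the last element
        list.RemoveAt(last);      // removing the final element shifts nothing
    }
}
```

For example, removing index 1 from [10, 20, 30, 40] leaves [10, 40, 30]: nothing is copied down, at the cost of changing the order.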
Nate wrote: FWIW, when using an array or list and you need to remove but don't care about order, you can just move the last element to the index you want to remove. This avoids copying all the items above the one you removed down one. The advantage this has over a hash set is iterating an array or list is very fast, while iterating a hash set is not quite as fast. The main reason to use a hash set is not usually for frequent removals, but instead to know if a particular item is in the set.
Well, in this case you cannot use arrays, because the length is changing. So we're left to choose between List and HashSet. It cannot be seen in the gist, but the additions/removals are not frequent at all. They happen only when:
- we add an entry when the unit gets a command to do something, like move, so the animation should start.
- we remove the entry when the unit is inactive and stops in place (in our case after a single idle loop)
So these functions get called quite rarely, about once per second. On the other hand, the most repeated operation is to find the object and enable/disable it, which happens constantly in Update(). This is where the HashSet speed helps a lot - like you said yourself: it's faster at iterating.
CarrotPie wrote: Well, in this case you cannot use arrays, because the length is changing.
Right, to use an array you'd need to also keep track of how many items in the array are used, which is essentially what List is doing.
CarrotPie wrote: On the other hand, most repeated operation is to find the object and enable/disable it - this is in the constant cycle on Update().
Sounds like a hash map is the right choice. A hash set is a hash map with only keys, while a hash map stores a value for each key. I guess a hash set could be used if being in the set is enough to know the object is enabled.
In C# land I believe they call a hash map "Dictionary" (for some awful reason :p).
CarrotPie wrote: This is where the HashSet speed helps a lot - like you said yourself: it's faster at iterating.
Iterating (visiting each entry) for a hash set or map is slower than doing so for an array. What is fast (or at least constant speed regardless of number of entries, i.e. O(1)) is asking the hash set if it contains an item (or for a hash map, getting the value for a key).
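As a rough sketch of the "being in the set means enabled" idea, a tracker built on HashSet<> could look like the following. The class is hypothetical, and units are identified by plain int ids purely for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical tracker: membership in the set means the unit is animating.
public class ActiveUnits
{
    private readonly HashSet<int> active = new HashSet<int>();

    public void SetActive(int unitId, bool isActive)
    {
        if (isActive) active.Add(unitId); // O(1) on average
        else active.Remove(unitId);       // O(1) on average
    }

    // The fast operation: a membership test, O(1) on average.
    public bool IsActive(int unitId)
    {
        return active.Contains(unitId);
    }

    public int Count
    {
        get { return active.Count; }
    }
}
```

If a value needs to be stored per unit rather than just membership, a Dictionary<TKey, TValue> with TryGetValue offers the same constant-time lookup.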
Found out by accident:
Turning off error handling will increase performance by lightyears. Literally! From 5fps it went up to the cap of 60!
It could be possible, that Spine assets within Unity heavily utilize error handling... too heavily perhaps?
edit: PS. talking about performance jump in WebGL builds here, not the editor. The editor was always running fine.
That's a lot! Thanks very much for sharing your insights! Do you mean the setting Project Settings - Player - Stack Trace - Error? Did you disable only this option to increase performance so dramatically? Do you have Development Build disabled in the Build Settings? Unfortunately we could not reproduce this performance gain yet.
Harald wrote: That's a lot! Thanks very much for sharing your insights! Do you mean the setting Project Settings - Player - Stack Trace - Error? Did you disable only this option to increase performance so dramatically? Do you have Development Build disabled in the Build Settings?
The only option I changed was: Project Settings - Player - Publishing Settings - Enable Exceptions: none.
In the build settings I have Development Build enabled, but Autoconnect Profiler disabled. Not sure if that changes anything, but this way I can still profile via the Unity Editor after running the WebGL build via a local server.
Thanks for sharing the additional details! Did you disable WebAssembly Arithmetic Exceptions - Ignore as well? We would expect this to potentially have the largest effect.
Unfortunately we could not reproduce a similar effect in our own quick tests. With Development Build disabled, we saw no real changes in performance when disabling exceptions via Enable Exceptions: none and WebAssembly Arithmetic Exceptions - Ignore compared to Enable Exceptions: Explicitly Thrown Exceptions Only and WebAssembly Arithmetic Exceptions - Throw. When enabling Development Build, we could see a difference in the framerate with these settings, but judging overall performance based on a development build is not something I would recommend in general.
In our case, at the moment it is:
WebAssembly Arithmetic Exceptions - Throw
Enable Exceptions: None
Development Build: True
We will experiment with the rest. But Exceptions to None was enough to drastically drive FPS up...
Thanks for the info, this is very interesting indeed. I would have expected WebAssembly Arithmetic Exceptions to have the largest impact. Perhaps arithmetic exceptions are disabled alongside when setting Enable Exceptions: None. Unfortunately this is not documented on the Unity docs pages :rolleyes: .