Performance Issue on Xbox One with BT
I’m experiencing performance issues on Xbox, compiled with GDK, using Behaviour Trees. I have ~20 agents running the same BT and I’m seeing some significant frame drops. Looking at the profiler, a lot of time is spent in selector, sequence, node and decorator updates. By the time I get to my custom action, the amount of time is pretty low. Even if I could lower the time of my custom code to 0, I’d still be in the red.
Attached are two images: one showing the top of the BT call stack and another showing it at the custom action.
Any insight on what might be causing this would be greatly appreciated!
Hello from here as well and sorry for the late reply!
Composite nodes like Sequencers and Selectors, or decorators like the “Accessor” (shown as ConditionalEvaluator in the profiler), should not by any means create garbage. Whether garbage is created is usually up to the tasks (actions/conditions), depending on what they do. Looking at your profiler pic, it looks like the condition task you have assigned on the Accessor Decorator, as well as the AgentAction and Spritefollow actions, are creating most of the allocations. Can you please also post a screenshot of your graph so that I have better context?
Also, does your graph repeat very frequently (maybe per frame), meaning that it completes its cycle and restarts very frequently?
Please let me know.
Join us on Discord: https://discord.gg/97q2Rjh
I have found the cause of the garbage and it is indeed in my code. I was able to clean all that up! However, I still see considerable impact to performance on xbox one. The flow of the tree used to (and indeed did in the pictures above) repeat over and over. I have since changed the Action to not call EndAction whenever the agent is at rest.
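For readers following along, the difference between an action that ends every tick and one that stays running can be illustrated with a minimal, engine-agnostic sketch. Everything here (`Status`, `FollowAction`, `count_restarts`) is a hypothetical name made up for illustration, not NodeCanvas API, and the sketch assumes the tree restarts as soon as its only action finishes.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class FollowAction:
    """Hypothetical leaf action; a stand-in, not a NodeCanvas class."""

    def __init__(self, end_when_at_rest):
        self.end_when_at_rest = end_when_at_rest

    def tick(self, at_rest):
        if at_rest and self.end_when_at_rest:
            # Finishing here lets the whole tree complete and start over.
            return Status.SUCCESS
        # Staying in RUNNING keeps the tree parked on this action.
        return Status.RUNNING


def count_restarts(action, frames):
    """Count how often the (assumed) tree would restart while the agent rests."""
    restarts = 0
    for _ in range(frames):
        if action.tick(at_rest=True) is not Status.RUNNING:
            restarts += 1
    return restarts
```

Under these assumptions, `count_restarts(FollowAction(True), 60)` reports one restart per frame, while `count_restarts(FollowAction(False), 60)` reports none.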
Attached is the relevant part of the BT for the Spriteling; it’s a rather large BT, so if you need more context, I’ll be happy to provide it. As you might have guessed, this task on the BT is the follow logic for our little creatures called Spritelings. Some of the structural oddities are from when I didn’t know how to use the tool. Time constraints have not allowed me to go back in and fix it up haha.
Also attached is the profiler on PC for a single Spriteling. You can see some functions are called 9 times per frame per Spriteling. But maybe that’s okay? If I had to guess, I’d say that’s because of the “Dynamic” flag on the conditionals?
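To make the guess about the “Dynamic” flag concrete, here is a small sketch of how a conditional decorator that re-checks its condition on every tick compares with one that checks only when it is entered. The classes and the `evaluations` helper are hypothetical and only model the general idea; they are not taken from NodeCanvas source, and the child is assumed to keep running forever.

```python
class CountingCondition:
    """Hypothetical condition that records how often it is evaluated."""

    def __init__(self):
        self.calls = 0

    def check(self):
        self.calls += 1
        return True


class ConditionalDecorator:
    """Hypothetical decorator; 'dynamic' means re-check on every tick."""

    def __init__(self, condition, dynamic):
        self.condition = condition
        self.dynamic = dynamic
        self.child_running = False

    def tick(self):
        # A non-dynamic decorator only checks when its child is not yet running.
        if self.dynamic or not self.child_running:
            if not self.condition.check():
                self.child_running = False
                return "FAILURE"
        self.child_running = True  # the child in this sketch never finishes
        return "RUNNING"


def evaluations(dynamic, frames):
    """Return how many times the condition was evaluated over the given frames."""
    condition = CountingCondition()
    node = ConditionalDecorator(condition, dynamic)
    for _ in range(frames):
        node.tick()
    return condition.calls
```

In this model `evaluations(False, 60)` evaluates the condition once, while `evaluations(True, 60)` evaluates it on all 60 frames, so several dynamic conditionals stacked along the active branch would multiply the per-frame call count.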
I’ll post an update soon with the profiler actually running on an xbox one dev kit, but I just wanted to pass this along for now to see if anything stood out.
Thank you!
Update: here are the profiler captures at the top of the BT stack and at the bottom AFTER cleaning up garbage allocations and making sure we don’t needlessly start the tree over and over again. I’m still seeing a lot of time being spent in all the Sequences/Selectors and Decorators.
Hello again and sorry for the late reply.
Please note that in Deep Profiler mode, the ms shown are really not close to reality. Without Deep Profiler, the ms should be much better. Do you actually have any performance issues in a build after you got rid of the GC allocations? With that said, and looking at your profiler images, the culprit still looks to be the Conditional Decorator. Is this the one shown in the image in your previous post? If so, are the variables it uses data bound to properties, or are they simple variables in the blackboard?
Please let me know.
Thanks 🙂
Still have performance issues on the BT in that spot. That image above is still the BT we’re using and we are using data-bound variables. I made sure to turn them all into Properties so as to not create lambdas on the fly.
Are there general restrictions we should be aware of? We have at most 60 agents running this one BT and upwards of 10 or so other agents running a different FSM.
Thank you!
Hello again,
Can you please post for me the current profiler image at that point with all the changes you’ve now made? (no per update repeat, property bound instead of data bound)
Thank you!
Thanks again for the response. I have attached the profile image after some of our fixes. I had to capture it with deep profiling on because without it, I wasn’t able to get a view of what the stack looked like. I did run a test without deep profile and the same top-level identifier, MonoManager, is still at the top of the list.
“(no per update repeat, property bound instead of data bound)”
I’m not quite sure what this means, but I will say all variables used in decorators and conditions are all property bound. Sorry if I’ve misunderstood you here. I’m happy to elaborate.
Hello again,
Could it be that the properties which you have bound variables to have code (in their get or set) that justifies the performance shown, especially the ones used in the Accessor Decorator (ConditionalEvaluator)?
Or are the bound properties simply a {get; set;} without any further code inside them?
Thank you.
Oh, that makes sense. All the properties being used are either auto-properties or single-statement passthrough getters (returning some internal variable). There is no computation done in any of the conditions/decorators.
Thank you!
Curious about your performance issues. Any best practices that can be gleaned from this?
Honestly, hard to say. We do have 30-40 agents running at any point starting mid game, and I’m not sure this tool was made to run that many at a time on older hardware (i.e. Xbox One). In some games, you might be able to turn off some actions when not near the player/off-screen, but in our game, the AI needs to run no matter where in the level the player/camera is.
In general,
We got a small performance gain by making our own “node” that managed the leaf node states as a state machine rather than using concepts from a BT – it’s super hacky, but it got us 5-10 frames back.
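The poster's actual node is not shown, so the following is only a guess at the general shape of the idea: one leaf that tracks its own current state and checks a handful of transitions, instead of letting selectors and sequences re-walk the branch each frame. All names (`StateMachineLeaf`, `make_spriteling_leaf`, the `distance` key and its thresholds) are hypothetical.

```python
class StateMachineLeaf:
    """Hypothetical single node that runs its leaf behaviours as a state machine."""

    def __init__(self, states, transitions, initial):
        self.states = states            # state name -> callable run each tick
        self.transitions = transitions  # state name -> list of (predicate, next state)
        self.current = initial

    def tick(self, ctx):
        # Only the transitions out of the current state are checked.
        for predicate, target in self.transitions.get(self.current, []):
            if predicate(ctx):
                self.current = target
                break
        self.states[self.current](ctx)
        return self.current


def make_spriteling_leaf():
    """Build an illustrative idle/follow machine; thresholds are made up."""
    states = {
        "idle": lambda ctx: ctx["log"].append("idle"),
        "follow": lambda ctx: ctx["log"].append("follow"),
    }
    transitions = {
        "idle": [(lambda ctx: ctx["distance"] > 3.0, "follow")],
        "follow": [(lambda ctx: ctx["distance"] <= 1.0, "idle")],
    }
    return StateMachineLeaf(states, transitions, "idle")
```

A leaf built this way does a constant, small amount of work per tick regardless of how many states exist, which is one plausible reason such a node could be cheaper than the equivalent composite structure. Whether it matches what was actually built for the game is an assumption.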
Cheers!
Thanks so much – the FSM note is a bit concerning to me as we had more FSM usage in the past but have since moved to more BTrees for consistency … hmmm
Creating garbage in states and a zero-garbage policy are noted – but perhaps we are too far gone at this point for a complete audit of the project.
Again, thanks for sharing your experience!
@atrivedi, were you able to take a capture via a profiler (either the one in the Xbox toolchain for doing CPU sampling, or maybe Unity’s own) for a build without deep profiling enabled?
As mentioned previously, deep profiling massively affects the CPU performance of code. Secondly, even a development build, as opposed to a release build configuration, for any target hardware will also have a significant impact on the CPU time of the code that runs, as the compiler will forgo optimisations.
Are you still CPU bound on a release target build? And if so, are you able to do a symbolicated CPU capture to get an accurate idea of relative performance on optimised code?