Hello and sorry for the late reply!
Placing the contents of the subgraph directly in the main graph will certainly help, since the performance hit comes from resetting the subgraph, which matters especially if that happens very frequently. More specifically, the cost of the reset seems to come from the subgraph removing itself from the list of running graphs. I will optimize this in the next version update, probably by removing the list altogether, since keeping a reference to all running graphs is not that critical anyway.
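To illustrate why removing a graph from a list of running graphs can add up when it happens often, here is a minimal sketch in Python. It is not the actual implementation: `Graph`, `ListRegistry` and `SetRegistry` are hypothetical names made up for this example. The point is only that removing from a list needs a linear search, while a set does the same bookkeeping in roughly constant time.

```python
class Graph:
    """Hypothetical stand-in for a graph; not the real class."""

    def __init__(self, name):
        self.name = name


class ListRegistry:
    """Tracks running graphs in a list: each removal scans the list, O(n)."""

    def __init__(self):
        self._running = []

    def start(self, graph):
        self._running.append(graph)

    def stop(self, graph):
        # list.remove searches from the front until it finds the graph
        self._running.remove(graph)

    def __len__(self):
        return len(self._running)


class SetRegistry:
    """Tracks running graphs in a set: removal is O(1) on average."""

    def __init__(self):
        self._running = set()

    def start(self, graph):
        self._running.add(graph)

    def stop(self, graph):
        # discard does a hash lookup and does nothing if the graph is absent
        self._running.discard(graph)

    def __len__(self):
        return len(self._running)
```

Both registries behave the same from the outside; only the cost of `stop` differs as the number of running graphs grows. Dropping the registry entirely, as mentioned above, avoids that cost altogether.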
Regarding the last profiler image, please note that the profiler tree is expanded at the first Node execution, so the 35ms corresponds to the whole tree, starting from the root Selector, for all active trees. It therefore includes all actions and conditions, as well as the subtree that seems to be executing at that point in the profiler capture. With that said, you brought to my attention some things that can and will be optimized for the next version, starting with the subtree reset :) as well as a few other things.