Bevy Hijinks: Loading a lot of GLTF Resources without premature CPU Death

Silent asynchronicity is good until you misuse it badly!

1 AM is never a good time to get a stack overflow, especially when your code is so simple that you start suspecting either hardware problems or a haunted compiler. But from time to time, and without any stack trace, it happens:

thread 'IO Task Pool (0)' has overflowed its stack
fatal runtime error: stack overflow
[1]    147201 IOT instruction (core dumped) RUST_BACKTRACE=1 cargo run

So, what was I doing back there? Well, it turns out, not much. Here's the spiel:

fn main() {
    App::default()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, (
            load_animations,
            spawn_scene,
            spawn_player_character
        ).chain())
        .run();
}

Each one of these is just carbon-copied from the official Bevy docs: we load some GLTF stuff, then we spawn a camera and light, and then spawn a character. In this case, I wasn't even connecting the animations to the character - we just load one, then spawn the other. And yet, when you run this, the useful printouts from each method go like this and then just stop...

LOADING ANIMATIONS
74 ANIMATIONS LOADED!

SPAWNING SCENE
SPAWNING SCENE DONE!

SPAWNING PLAYER CHARACTER!
SPAWNING PLAYER CHARACTER DONE!
thread 'IO Task Pool (0)' has overflowed its stack
fatal runtime error: stack overflow
[1]    147201 IOT instruction (core dumped) RUST_BACKTRACE=1 cargo run

The screen never shows the spawned character, but if I add some printout of the character's position just to see whether it's at the right place -- it actually does appear to be where I placed it! It's just... invisible.

Some short time later, and fully thanks to the Bevy Discord server, the intense squinting stares slid down to a single line here:

74 ANIMATIONS LOADED!

Asset loading is deferred

Here's my animation loading code:

#[derive(Resource, Default)]
pub struct AnimationLibrary(HashMap<String, Handle<AnimationClip>>);

pub fn load_animations(
    asset_server: Res<AssetServer>, 
    mut library: ResMut<AnimationLibrary>) 
{
    if let Ok(ron_animations) = fs::read_to_string("./assets/animations/Anims.ron") {
        if let Ok(named_indexed) = ron::from_str::<HashMap<String, u32>>(ron_animations.as_str()) {
            for (name, index) in named_indexed {
                let path = format!("animations/Anims.glb#Animation{}", index);
                // AnimationLibrary has a single field here, so the map is `.0`
                library.0.insert(name.clone(), asset_server.load(path));
            }
        }
    }
}

The code here looks too small to blow anything up, but if we reduce the number of animations to load to, say, 1, the error disappears! Up the number to 10 and we get a slight pause. At 30 there's already a good chance it just straight-up dies, so it must be this system. What's going on is easy to miss: asset_server.load only starts the process of loading the asset and immediately returns a handle to an asset that isn't loaded yet. Because we fire off every request in the same frame, without waiting for anything to actually finish loading, we end up loading the whole GLTF file 74 times and then extracting only one animation from each of the loaded copies.
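To make the failure mode a bit more tangible, here is a tiny std-only toy model of it. It is not Bevy code: ToyAssetServer, Handle, finish_pending and simulate are all made-up names, and the model simply assumes (as described above) that every request issued before the file has finished loading triggers its own full parse of that file.

```rust
use std::collections::{HashMap, HashSet};

/// Toy stand-in for an asset handle: an id that exists before any data does.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Handle(usize);

/// Toy stand-in for an asset server that defers all real work.
#[derive(Default)]
struct ToyAssetServer {
    pending: Vec<(Handle, String)>,      // requested, not yet loaded
    loaded: HashSet<usize>,              // ids whose data is available
    full_parses: HashMap<String, usize>, // file -> number of full parses
    next_id: usize,
}

impl ToyAssetServer {
    /// Returns a handle right away; nothing is parsed yet.
    fn load(&mut self, path: &str) -> Handle {
        let handle = Handle(self.next_id);
        self.next_id += 1;
        self.pending.push((handle, path.to_string()));
        handle
    }

    fn is_loaded(&self, handle: Handle) -> bool {
        self.loaded.contains(&handle.0)
    }

    /// Simulates the background tasks finishing. Assumption of this model:
    /// each pending request parses the whole file, label or not.
    fn finish_pending(&mut self) {
        for (handle, path) in self.pending.drain(..) {
            let file = path.split('#').next().unwrap_or("").to_string();
            *self.full_parses.entry(file).or_insert(0) += 1;
            self.loaded.insert(handle.0);
        }
    }
}

/// Returns (handles already loaded right after the calls, total full parses).
fn simulate(paths: &[String]) -> (usize, usize) {
    let mut server = ToyAssetServer::default();
    let handles: Vec<Handle> = paths.iter().map(|p| server.load(p)).collect();
    let ready_immediately = handles.iter().filter(|h| server.is_loaded(**h)).count();
    server.finish_pending();
    let parses: usize = server.full_parses.values().sum();
    (ready_immediately, parses)
}

fn main() {
    let labeled: Vec<String> = (0..74)
        .map(|i| format!("animations/Anims.glb#Animation{}", i))
        .collect();
    let (ready, parses) = simulate(&labeled);
    println!("74 labeled requests: {} ready immediately, {} full parses", ready, parses);

    let whole = vec!["animations/Anims.glb".to_string()];
    let (ready, parses) = simulate(&whole);
    println!("1 whole-file request: {} ready immediately, {} full parses", ready, parses);
}
```

In this model no handle is ready at the moment load returns, and the 74 labeled requests cost 74 full parses where a single whole-file request costs one.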

GLTF files are great libraries of content and hold all the animations, mesh data, etc. inside, so the better way to do this is to load the GLTF file once and then extract the animations from it. Here's the changed code:

#[derive(Resource, Default)]
pub struct AnimationLibrary(Handle<Gltf>, HashMap<String, Handle<AnimationClip>>);
// Added handle to the GLTF file here ^

pub fn load_animations(
    asset_server: Res<AssetServer>, 
    mut library: ResMut<AnimationLibrary>) 
{
    let animations = asset_server.load::<Gltf>("animations/Anims.glb");
    library.0 = animations;
}

This system just loads the file and saves the handle to the GLTF file in the resource. Once that's actually loaded, we can read it with another system poised to activate exactly then:

pub fn register_animations_once_loaded(
    mut library: ResMut<AnimationLibrary>, 
//  ^ we will read the GLTF handle from here, and also
//    write the animation handles once extracted
    gltf: Res<Assets<Gltf>>) 
//  ^ Res<Assets<T>> is how we get an actual asset T from a Handle<T>
{
    // if the animations aren't loaded yet, skip!
    // (this call obviously isn't for us)
    let Some(animations) = gltf.get(&library.0) else { return; };

    if let Ok(ron_animations) = fs::read_to_string("./assets/animations/Anims.ron") {
        if let Ok(named_indexed) = ron::from_str::<HashMap<String, u32>>(ron_animations.as_str()) {
            for (name, index) in named_indexed {
                library.1.insert(name.clone(), 
                    animations.animations.get(index as usize)
                        .expect("Animation should exist if it's in ron")
                        .clone_weak());
            }
        }
    }
}
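The "skip until it's there" shape of that system can be tried out without Bevy at all. The sketch below is a std-only toy under loud assumptions: ToyGltf, ToyAssets, register_once_loaded and first_frame_registered are hypothetical names, a plain u32 stands in for the handle, and strings stand in for animation clips. Unlike the system above, it quietly skips out-of-range indices instead of panicking.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a loaded GLTF: just a list of clip names.
struct ToyGltf {
    animations: Vec<String>,
}

/// Hypothetical stand-in for asset storage, keyed by a numeric handle id.
#[derive(Default)]
struct ToyAssets {
    loaded: HashMap<u32, ToyGltf>,
}

impl ToyAssets {
    /// None until the asset with this id has actually been loaded.
    fn get(&self, handle: u32) -> Option<&ToyGltf> {
        self.loaded.get(&handle)
    }
}

/// Bails out early while the asset is missing; once it exists, copies the
/// wanted clips into the library. Returns true if registration happened.
fn register_once_loaded(
    assets: &ToyAssets,
    handle: u32,
    wanted: &[(&str, usize)],
    library: &mut HashMap<String, String>,
) -> bool {
    let Some(gltf) = assets.get(handle) else { return false; };
    for (name, index) in wanted {
        // Out-of-range indices are skipped in this sketch.
        if let Some(clip) = gltf.animations.get(*index) {
            library.insert(name.to_string(), clip.clone());
        }
    }
    true
}

/// Calls the "system" once per simulated frame and returns the first frame
/// on which it managed to register anything.
fn first_frame_registered(
    load_completes_on_frame: usize,
    library: &mut HashMap<String, String>,
) -> usize {
    let mut assets = ToyAssets::default();
    let wanted: [(&str, usize); 2] = [("idle", 0), ("run", 2)];
    let mut frame = 0;
    loop {
        if frame == load_completes_on_frame {
            // The simulated background load finishes on this frame.
            let animations = (0..3).map(|i| format!("Animation{}", i)).collect();
            assets.loaded.insert(7, ToyGltf { animations });
        }
        if register_once_loaded(&assets, 7, &wanted, library) {
            return frame;
        }
        frame += 1;
    }
}

fn main() {
    let mut library = HashMap::new();
    let frame = first_frame_registered(3, &mut library);
    println!("registered {} clips on frame {}", library.len(), frame);
}
```

The early return costs next to nothing on the frames where the asset is still missing, which is why polling like this is cheap.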

To make the ordering just right, we register the two systems like this:

app.add_systems(Startup, load_animations)
    .add_systems(
        Update,
        register_animations_once_loaded
            .run_if(on_event::<AssetEvent<Gltf>>()),
// AssetEvent<Gltf> is fired when Gltf files are loaded and,
// at this phase, this is the only Gltf being loaded.
// The system lives in Update because Startup systems run only once,
// before the load has had a chance to finish.
    );
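If the run condition feels a bit magical, it can be pictured as a gate that is checked every time the system is about to run. The snippet below is a std-only toy, not Bevy's scheduler: any_events and frames_where_system_ran are made-up names, each inner Vec plays the role of the events that arrived during one frame, and the strings merely stand in for asset events.

```rust
/// Toy run condition: true only when at least one event arrived this frame.
fn any_events<E>(events_this_frame: &[E]) -> bool {
    !events_this_frame.is_empty()
}

/// Returns the indices of the frames on which the gated system would run.
fn frames_where_system_ran<E>(frames: &[Vec<E>]) -> Vec<usize> {
    let mut ran_on = Vec::new();
    for (frame, events) in frames.iter().enumerate() {
        // The gate is evaluated once per frame; no events means no run.
        if any_events(events.as_slice()) {
            ran_on.push(frame);
        }
    }
    ran_on
}

fn main() {
    // Frames 0, 1 and 3 are quiet; events arrive on frames 2 and 4.
    let frames: Vec<Vec<&str>> = vec![
        vec![],
        vec![],
        vec!["Added"],
        vec![],
        vec!["LoadedWithDependencies"],
    ];
    println!("system ran on frames {:?}", frames_where_system_ran(&frames));
}
```

The gated system does nothing on quiet frames and only wakes up when something asset-related actually happened, which is the behaviour we want from the registration system.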

The whole thing works much better now and, most importantly, no longer crashes randomly or eats up a whole bucket of RAM! I'm going to look into adding some runtime messages for other poor souls who didn't read the documentation of the load method:

Begins loading an Asset of type A stored at path. This will not block on the asset load. Instead, it returns a "strong" Handle. When the Asset is loaded (and enters LoadState::Loaded), it will be added to the associated Assets resource.

You can check the asset's load state by reading AssetEvent events, calling AssetServer::load_state, or checking the Assets storage to see if the Asset exists yet.

The asset load will fail and an error will be printed to the logs if the asset stored at path is not of type A.

Well, now we know! Until the next big crash! :D Thank you, CatThingy and Gingeh from the Bevy Discord, I would have had a panic attack without you!