[browser] why is the garbage collector not doing its job? #108397

Closed
DierkDroth opened this issue Sep 30, 2024 · 30 comments
Labels: arch-wasm, area-GC-mono, needs-further-triage, os-browser

@DierkDroth

DierkDroth commented Sep 30, 2024

Description

I was working on an 'out of memory' issue with the UNO guys, for which I created a repo that unexpectedly drives the MONO WASM runtime into an 'out of memory' exception. Basically the user code is just fine; it's the MONO WASM GC which throws in the towel at some point.

However, if I place GC.GetTotalMemory(true) calls in the repo code - as per @jeromelaban's recommendation - then everything is fine.

This raises the question: why isn't the garbage collector doing its job?

My layman's thinking is: if the GC experiences a situation where it runs out of memory then it should "stop the presses" and do whatever is needed to free up some memory. Why would I have to call GC.GetTotalMemory(true) to tell the GC "get your job done and free up some memory"?

Reproduction Steps

Rather than pasting the repo here, I'm pointing to the (UNO) scenario here.

Expected behavior

The GC should do its job.

Actual behavior

The GC does not free up memory although it could.

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Sep 30, 2024
@lambdageek lambdageek added area-GC-mono arch-wasm WebAssembly architecture and removed area-GC-coreclr labels Sep 30, 2024
@pavelsavara pavelsavara added the os-browser Browser variant of arch-wasm label Oct 1, 2024
@pavelsavara pavelsavara added this to the 10.0.0 milestone Oct 1, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Oct 1, 2024
@pavelsavara pavelsavara self-assigned this Oct 1, 2024
@BrzVlad
Member

BrzVlad commented Oct 1, 2024

This GC issue seems to be specific to wasm; I couldn't replicate it on a maccatalyst target, for example.

@pavelsavara
Member

@DierkDroth could you please try to reproduce this on a non-Uno template?

dotnet workload install wasm-tools
dotnet workload install wasm-experimental
dotnet new wasmbrowser

@pavelsavara pavelsavara added the needs-author-action An issue or pull request that requires more info or actions from the author. label Oct 1, 2024
@DierkDroth
Author

DierkDroth commented Oct 1, 2024

@pavelsavara unfortunately this is beyond my expertise. I do know how to code a UI in WinUI3 - which is why I'm using UNO WASM, since this basically is WinUI3 code running as WASM in browser. But I would not know how to code a different, but equivalent, repo in a different technology. Sorry...

So, if you feel the issue might be related to Platform UNO WASM support and not to .NET WASM support, then I suggest you get in touch with @jeromelaban (CTO UNO). AFAIK the UNO guys are in close contact with the Microsoft team anyway...

@dotnet-policy-service dotnet-policy-service bot added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Oct 1, 2024
@jeromelaban
Contributor

This is not an issue specific to Uno, and Uno does not do anything related to the GC. This specific behavior is likely reproducible on net8/net9 by overallocating close to 1.5GB, then allocating/releasing small amounts just below the limit; the GC will not run unless explicitly told to.

@DierkDroth
Author

@pavelsavara would you mind explaining why you closed this issue, although @jeromelaban and I can easily demonstrate and reproduce the problem and - according to all we know at this point - it appears to be a MONO WASM GC issue?

@pavelsavara
Member

I would appreciate simpler repro, thanks

@pavelsavara pavelsavara reopened this Oct 1, 2024
@pavelsavara pavelsavara added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Oct 1, 2024
@DierkDroth
Author

> I would appreciate simpler repro, thanks

@pavelsavara I can definitely understand your request - as @jeromelaban probably can. However, you will likely need to go by the trustworthy confirmation of @jeromelaban that UNO is not 'part of this game'.

@dotnet-policy-service dotnet-policy-service bot added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Oct 1, 2024
@pavelsavara pavelsavara added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Oct 1, 2024
@DierkDroth
Author

@pavelsavara could you please elaborate on what else - apart from the repo which is available via the link above - you would need from me?

@dotnet-policy-service dotnet-policy-service bot added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Oct 1, 2024
@pavelsavara pavelsavara changed the title from "[Mono Runtime] why is the garbage collector not doing its job?" to "[browser] why is the garbage collector not doing its job?" Oct 2, 2024
@pavelsavara
Member

@jeromelaban does Uno allocate some native memory via malloc during this repro ?
@DierkDroth which version(s) of dotnet are we talking about ?

@DierkDroth
Author

DierkDroth commented Oct 3, 2024

@pavelsavara this is .NET 8 latest. I'm staying clear of any .NET 9 and VS previews/RCs (learnt from past experience).

@jeromelaban
Contributor

jeromelaban commented Oct 3, 2024

> @jeromelaban does Uno allocate some native memory via malloc during this repro ?

There are no native allocations in this context, only managed ones. Running the GC manually effectively reclaims what needs to be reclaimed. Could it be that the threshold of allocations makes it that the GC does not trigger, and that the pressure of reaching 1.5GB is not considered?
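
To make that hypothesis concrete, here is a toy model of it. It is only a sketch, not Mono's actual collector: `ToyHeap`, `run`, the 64 MB trigger and the 1500 MB limit are all made-up names and numbers. The model assumes the only automatic GC trigger is "bytes allocated since the last collection", so a heap preloaded close to its limit can run out of memory long before that trigger fires, even though almost everything being churned is garbage.

```javascript
// Toy model only. This is NOT how Mono's collector is implemented;
// every name and number here is hypothetical. Sizes are in MB.
class ToyHeap {
  constructor(limit, trigger) {
    this.limit = limit;     // hard ceiling on heap size
    this.trigger = trigger; // MB allocated since the last GC before one is started
    this.live = 0;          // MB still referenced
    this.garbage = 0;       // MB unreferenced but not yet reclaimed
    this.sinceGc = 0;       // MB allocated since the last collection
  }

  collect() {
    this.garbage = 0;
    this.sinceGc = 0;
  }

  alloc(size, collectOnPressure) {
    // The only automatic trigger in this model: allocation volume.
    if (this.sinceGc >= this.trigger) this.collect();
    const fits = () => this.live + this.garbage + size <= this.limit;
    // Optional "stop the presses" behaviour: collect before giving up.
    if (!fits() && collectOnPressure) this.collect();
    if (!fits()) throw new Error("OutOfMemory");
    this.live += size;
    this.sinceGc += size;
  }

  release(size) {
    this.live -= size;
    this.garbage += size;
  }
}

// Preload the heap close to the limit, then churn 1 MB objects.
function run({ collectOnPressure = false, explicitCollect = false } = {}) {
  const heap = new ToyHeap(1500, 64);
  for (let i = 0; i < 1490; i++) heap.alloc(1, collectOnPressure);
  let iterations = 0;
  try {
    for (; iterations < 1000; iterations++) {
      heap.alloc(1, collectOnPressure);
      heap.release(1);
      if (explicitCollect) heap.collect(); // analogue of GC.GetTotalMemory(true)
    }
    return { oom: false, iterations };
  } catch (e) {
    return { oom: true, iterations };
  }
}

console.log(run());                            // volume trigger only
console.log(run({ collectOnPressure: true })); // collect before failing
console.log(run({ explicitCollect: true }));   // manual collection each round
```

In this model the first run fails with an out-of-memory error while the other two complete, which matches the symptom described in this thread. It says nothing about whether the real runtime behaves this way.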

@pavelsavara
Member

> Could it be that the threshold of allocations makes it that the GC does not trigger, and that the pressure of reaching 1.5GB is not considered?

Good question.

Are there delegates/events in this repro ?

@DierkDroth
Author

> > Could it be that the threshold of allocations makes it that the GC does not trigger, and that the pressure of reaching 1.5GB is not considered?
>
> Good question.
>
> Are there delegates/events in this repro ?

Not in my user/repo code. Not sure if there would be anything from UNO's perspective ... @jeromelaban ?!?

@jeromelaban
Contributor

> Are there delegates/events in this repro ?

There are many delegates, yes. They are used all over the place in Uno, and also in the sample itself. Do they play a role because of their GC pressure (or lack thereof)?

@pavelsavara
Member

We are now investigating #108510, which also has a lot of delegates. I think delegates allocate via malloc inside Mono.
This is probably related to #107215

@pavelsavara
Member

It would be good to test if this reproduces with Net9. @jeromelaban is Uno Net9 compatible ?

@pavelsavara
Member

The example allocates 1.5 GB of data as 1MB large arrays.

@DierkDroth does your application really allocate many arrays bigger than 64KB?

Or is this just a synthetic example that demonstrates the same symptoms?

@DierkDroth
Author

DierkDroth commented Oct 8, 2024

@pavelsavara AFAIK UNO is compatible with .NET9. However, I will not install any .NET9 or VS previews/RCs on my machine ... installing previews (.NET/VS) screwed up my machine big time a few years back ... I won't do that again.

The repo is just a 'synthetic example'. The allocation in the repo is to 'preload' the GC. You also can reproduce the problem by removing the 'preload' code. You then have to wait (much) longer to experience the failure.

@jeromelaban
Contributor

> It would be good to test if this reproduces with Net9. @jeromelaban is Uno Net9 compatible ?

Yes, it's net9 compatible. RC2 should help.

@pavelsavara
Member

pavelsavara commented Oct 8, 2024

> The repo is just a 'synthetic example'. The allocation in the repo is to 'preload' the GC. You also can reproduce the problem by removing the 'preload' code. You then have to wait (much) longer to experience the failure.

Large allocations (LOS, the large object space) trigger the Net8/emscripten dummy implementation of mmap, and the 1MB alignment wastes an extra 1MB of WASM linear memory. Chasing that large-allocation red herring is not going to help us.

There is a slightly better custom mmap in Net9; it still has the same issue with LOS, but it may help the real app.

cc @kg
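
As a rough back-of-the-envelope illustration of that alignment cost: the sketch below assumes the extra 1MB is paid once per large allocation (the thread does not spell this out) and that the synthetic repro allocates roughly 1500 arrays of 1MB each. `reservedBytes` and `totalReserved` are hypothetical helper names, not runtime APIs.

```javascript
// Back-of-the-envelope sketch. ASSUMPTION: every aligned request reserves
// size + alignment bytes, i.e. the alignment padding is paid per allocation.
const MB = 1024 * 1024;

function reservedBytes(size, alignment) {
  // An mmap emulation layered on malloc cannot choose the address it gets,
  // so a simple way to guarantee alignment is to over-allocate by `alignment`
  // and hand out an aligned pointer inside the larger block.
  return size + alignment;
}

function totalReserved(count, size, alignment) {
  return count * reservedBytes(size, alignment);
}

const payload = 1500 * MB;                    // what the repro asks for
const reserved = totalReserved(1500, MB, MB); // what linear memory would pay
console.log(`payload: ${payload / MB} MB, reserved: ${reserved / MB} MB`);
// prints "payload: 1500 MB, reserved: 3000 MB"
```

Under these assumptions the preload alone would cost about twice its nominal size in linear memory, which is why the 1MB arrays would distort the picture compared to an app that mostly makes small allocations.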

@DierkDroth
Author

@pavelsavara you could amend the repo and allocate smaller memory blocks (64KB?) to eliminate the LOS variable...

@pavelsavara
Member

I don't have Uno on my box.

@jeromelaban
Contributor

It will be best to try the sample with net9/net10 where the runtime is using the official workloads. It will be easier to do starting from RC2.

@DierkDroth
Author

AFAIK this is not needed. VS should download the required NuGet packages when building, no?

@pavelsavara
Member

pavelsavara commented Oct 8, 2024

Could you please collect console logs after you set

.withEnvironmentVariable("MONO_LOG_LEVEL", "debug")
.withEnvironmentVariable("MONO_LOG_MASK", "gc")

from the real app, not from the synthetic repro
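
For a plain `dotnet new wasmbrowser` project, those two calls would typically be chained onto the `dotnet` builder in `main.js`. The fragment below is a sketch assuming the default template layout; the `./_framework/dotnet.js` import path comes from that template, and an Uno app may need to set these variables through its own bootstrapper instead.

```javascript
// Sketch of main.js in a default wasmbrowser template (assumed layout).
import { dotnet } from './_framework/dotnet.js';

await dotnet
    // Ask the Mono runtime for debug-level logging...
    .withEnvironmentVariable("MONO_LOG_LEVEL", "debug")
    // ...restricted to GC-related messages.
    .withEnvironmentVariable("MONO_LOG_MASK", "gc")
    .create();

// Start the app; the GC log lines should then show up in the browser console.
await dotnet.run();
```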

@jeromelaban
Contributor

@DierkDroth
Author

> Could you please collect console logs after you set
>
> .withEnvironmentVariable("MONO_LOG_LEVEL", "debug")
> .withEnvironmentVariable("MONO_LOG_MASK", "gc")
>
> from the real app, not from synthetic repro

@pavelsavara I'm sorry, but this won't help. I personally have never encountered the underlying OutOfMemory problem. However, I have a user who reported that it occurs randomly after using the app for many hours.

This is why I came up with the repo.

@pavelsavara
Member

Fixed by #108512

@DierkDroth
Author

@pavelsavara I plan to update to .NET 9 soon. Is the fix included in the official .NET 9 release due next week?

@DierkDroth
Author

@pavelsavara I just ran a test using the latest .NET9 and found that the issue is not resolved.
It's even worse, since .NET 9 no longer accepts allocating larger blocks of memory. I had to allocate small blocks - but obviously more of them - to preload/stress the GC with the same amount of memory.

Here is the updated repo:
Minimal.zip

What am I missing? Could you please re-open the issue?

5 participants