Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use LibC.malloc instead of GC.malloc for external memory allocators #12451

Closed
wants to merge 2 commits into from

Conversation

asterite
Copy link
Member

@asterite asterite commented Sep 6, 2022

Fixes #11614

There's no need to involve the GC when deciding how memory should be allocated and de-allocated for external libraries. If we use the GC it means that on a GC collect cycle that memory will have to be scanned for pointers... but that memory should never be automatically freed! So it's always better to allocate that memory outside of the GC's knowledge.

Comment on lines 318 to 321
->(ptr) { GC.free(ptr) },
->(size) { GC.malloc(size) },
->(size) { GC.malloc(size) },
->(ptr, size) { GC.realloc(ptr, size) },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused with these changes. Why they don't follow the same pattern as the others?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried those, they produce a compiler error that I didn't understand 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 > 2 |   LibPCRE.pcre_malloc = ->LibC.malloc(LibC::SizeT)
                               ^
Error: read before assignment to local variable 'arg0'

Comment on lines 318 to 321
->(ptr) { GC.free(ptr) },
->(size) { GC.malloc(size) },
->(size) { GC.malloc(size) },
->(ptr, size) { GC.realloc(ptr, size) },
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to change them to LibC 😊

@HertzDevil
Copy link
Contributor

Wouldn't this now exhaust the memory? I don't think we could really use LibC.free unless we perform manual memory management?

while true
  Regex.new(...)
end

@BlobCodes
Copy link
Contributor

While I agree that the memory allocation should be performed outside of the GC's knowledge, Compress::Deflate is still slower for me using LibC.malloc.
The failing tests are also suspicious and I'm not sure whether all these libraries can be used with LibC malloc.

@asterite
Copy link
Member Author

asterite commented Sep 6, 2022

Wow... I don't understand why this change makes Regex not free its memory. I thought PCRE would call malloc and free when needed, but apparently that's not the case 🤷

@asterite asterite closed this Sep 6, 2022
@asterite asterite deleted the opt/use-malloc branch September 6, 2022 14:37
@HertzDevil
Copy link
Contributor

PCRE does not perform automatic memory management on behalf of the user. pcre_free exists mainly to ensure that the other deallocation functions from PCRE share the same function (e.g. pcre_free_study).

@lbguilherme
Copy link
Contributor

lbguilherme commented Sep 6, 2022

For Regex we are actually not freeing the regex properly on the finalize method and we were relying on the GC.

class Regex
  def finalize
    LibPCRE.free_study @extra
    LibC.free @re.as(Void*) # <-- this is missing
  end
end

LibPCRE.pcre_malloc = ->(size) { LibC.malloc(size) }
LibPCRE.pcre_free = ->(ptr) { LibC.free(ptr) }
# Or just don't set those functions, by default they will point to LibC.

# This is now fine:
while true
  Regex.new("a|b")
end

All those C libraries are, in theory, free of memory leaks if used properly. The wrappers around them must call the correct free function to ensure everything works.

I think there is value on pursuing this. Not using the GC means less work being done. Probably a irrelevant amount of work though.

@beta-ziliani
Copy link
Member

Perhaps it's worth to reconsider it together with Guilherme's patch?

@lbguilherme
Copy link
Contributor

I'm giving it a try.

  • LibPCRE: Fairly simple. It needed a patch to ensure the regex is properly freed. I'll open a PR for that.
  • LibGMP: Impossible to change. The wrapper types are structs so there is no way to detect when they go out of scope and we have to rely on the GC for the internal data structures.
  • LibZ: Small change in that the Reader and Writer need to be closed on finalize. I'll open a PR for that. By the way, I believe every IO should close on finalize... 🤔
  • LibXML: Huge patch. The stdlib simply doesn't care about leaking memory and the GC has to collect everything here. I'll check if opening a PR for changing this is feasible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compress::Deflate::Writer is slow
5 participants