Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raw byte allocations via GC versus Pointer #13589

Open
HertzDevil opened this issue Jun 21, 2023 · 7 comments
Open

Raw byte allocations via GC versus Pointer #13589

HertzDevil opened this issue Jun 21, 2023 · 7 comments

Comments

@HertzDevil
Copy link
Contributor

HertzDevil commented Jun 21, 2023

The standard library sometimes uses GC.malloc and GC.malloc_atomic to allocate raw bytes, when Slice isn't appropriate. Some examples are:

data = GC.malloc(data_size).as(self)

@buffer = GC.malloc_atomic(capacity.to_u32).as(UInt8*)

At other times it uses Pointer.malloc:

@temp_buf = Pointer(UInt8).malloc(len)

@emitter = Pointer(Void).malloc(LibYAML::EMITTER_SIZE).as(LibYAML::Emitter*)

The two may correspond to different allocators; an explanation is available in #12391. However, I believe this distinction is not that important because the two "different" allocators in this case are simply separate instances of the GC allocator. We should settle with one of those alternatives as a matter of consistency.

Personally I am in favor of the Pointer form, because IMO GC is an implementation detail and should be hidden away.

Note that Pointer(UInt8)#malloc is guaranteed to be atomic and Pointer(Void)#malloc is guaranteed to be non-atomic.

@yxhuvud
Copy link
Contributor

yxhuvud commented Jun 25, 2023

Note that Pointer(UInt8)#malloc is guaranteed to be atomic and Pointer(Void)#malloc is guaranteed to be non-atomic.

This however sounds quite a bit like a potential footgun. Would it make sense to make it more explicit?

EDIT: (Looking at the code doing the allocation, perhaps not)

@straight-shoota
Copy link
Member

straight-shoota commented Jun 25, 2023

I agree that all this duplicity seems unnecessary and confusing. Resolving this was previously proposed in #12391 (comment).

For moving to Pointer it would be nice to have a clearer API for atomic/non-atomic allocations.
The implicit choice based on the element type is super useful for many use cases.
But I think in some use cases it can make sense to be explicit about it. You can force a different allocation method with Pointer(UInt8).malloc(sizeof(T) * size).as(T*) and Pointer(Void).malloc(sizeof(T) * size).as(T*), but that's not explicit at all.

Maybe adding just Pointer.malloc_atomic is good enough for that. It offers an easy way to explicitly choose atomic allocation. Of course it creates a difference between .malloc_atomic being always atomic and .malloc being either atomic or not depending on the type. This is a downside, but maybe acceptable?
Another downside is that it only allows explicit atomic allocation, but not explicit non-atomic allocation. I believe this variant is less relevant though, so it might be fine.
Another option could be adding a parameter to Pointer.malloc to determine whether allocation is supposed to be atomic. I would prefer different methods, though.

@straight-shoota
Copy link
Member

straight-shoota commented Jun 25, 2023

On the other hand, maybe Pointer.malloc is good enough as it is, and calling the lower level methods GC.malloc and GC.malloc_atomic directly is fine for a more explicit choice.
Maybe the actual problem is that GC.malloc and Pointer.malloc can end up using different instances of the GC.
We could simply ensure that this doesn't happen.

@beta-ziliani
Copy link
Member

It's possible to have Pointer.malloc be auto atomic like now, Pointer.malloc_atomic to be always atomic, and Pointer.malloc_non_atomic to always be non-atomic.

@HertzDevil
Copy link
Contributor Author

HertzDevil commented Jul 19, 2023

If it is just about the interpreter, maybe we could expose Crystal::Type#has_inner_pointers? in the macro language instead. Then we could implement Pointer.malloc as a non-primitive method:

struct Pointer
  def self.malloc(size : UInt64)
    {% if T.has_inner_pointers? %}
      GC.malloc(size * sizeof(T)).as(T*)
    {% else %}
      GC.malloc_atomic(size * sizeof(T)).as(T*)
    {% end %}
  end
end

@straight-shoota
Copy link
Member

I feel like I have either written or read that recently. But I don't recall. Maybe I had only intended to write about it...? 🤔

But yeah, there seems to be not much reason for Pointer.malloc to be a primitive when instead we can expose the reflection method that informs the allocation mechanism

@straight-shoota
Copy link
Member

I recall, the familiarity was from #13481.
Exposing has_inner_pointer? is a requirement for implementing .allocate in stdlib.

straight-shoota pushed a commit that referenced this issue Aug 8, 2024
This allows `Pointer.malloc` and `Reference#allocate` to be implemented without compiler primitives eventually, see #13589 and #13481. This might be helpful to diagnostic tools related to the GC too.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants