Add greedy mempool with configuration options #433

Merged 2 commits on Jun 11, 2023


@trueleo (Contributor) commented Jun 9, 2023

Description

Adds a usable memory limit to each query. If P_QUERY_MEMORY_LIMIT is set, its value is used as a fixed limit for the greedy memory pool; otherwise, 80% of available memory is used as the hard limit for the query. In effect, once the memory pool quota is filled, DataFusion will try to spill some execution nodes to a temporary directory.
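
The limit selection described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper names `resolve_pool_size` and `pool_size_from_env` are hypothetical, the variable is assumed to hold a plain byte count, and falling back to the 80% default on an unparsable value is also an assumption.

```rust
/// Share of available memory used when no explicit limit is configured.
const DEFAULT_PERCENT: u128 = 80;

/// Resolve the fixed pool size in bytes (hypothetical helper).
///
/// `raw_limit` is the raw value of P_QUERY_MEMORY_LIMIT, if set.
/// Assumption: the value is a plain integer number of bytes, and an
/// unparsable value falls back to the 80% default.
pub fn resolve_pool_size(raw_limit: Option<&str>, available_bytes: u64) -> u64 {
    // 80% of available memory, computed in u128 to avoid overflow.
    let fallback = ((available_bytes as u128) * DEFAULT_PERCENT / 100) as u64;
    raw_limit
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(fallback)
}

/// Read the environment variable and resolve the pool size (hypothetical helper).
/// The caller supplies `available_bytes`, for example from a system-info crate.
pub fn pool_size_from_env(available_bytes: u64) -> u64 {
    let raw = std::env::var("P_QUERY_MEMORY_LIMIT").ok();
    resolve_pool_size(raw.as_deref(), available_bytes)
}
```

The resulting byte count is what would be handed to DataFusion's `GreedyMemoryPool::new` when building the runtime for a query.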

Note -

This solution is not dynamic, as DataFusion does not keep track of the actual available memory during its runtime. The pool size given to DataFusion is a fixed number of bytes.

So it can happen that the bound was set higher but some other process took the memory, and the actual allocation cannot happen at runtime. DataFusion will return an error in that case instead of OOM.
Conversely, if the bound was set lower and more memory becomes available later during runtime, the query still won't be able to use it.
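
The fixed-bound behaviour in this note can be illustrated with a toy model. `FixedPool` below is a hypothetical stand-in written for this example, not DataFusion's `GreedyMemoryPool`; it only shows that the limit is chosen once, that a request beyond it yields an error rather than growing the pool, and that the limit never changes regardless of what the machine has free.

```rust
/// Toy pool with a limit fixed at construction time (hypothetical type).
#[derive(Debug)]
pub struct FixedPool {
    limit: usize,
    used: usize,
}

impl FixedPool {
    pub fn new(limit: usize) -> Self {
        Self { limit, used: 0 }
    }

    /// Reserve `bytes`, first come first served. Fails with an error
    /// when the reservation would exceed the fixed limit.
    pub fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        let wanted = self.used.saturating_add(bytes);
        if wanted > self.limit {
            return Err(format!(
                "cannot reserve {bytes} bytes: {} of {} already used",
                self.used, self.limit
            ));
        }
        self.used = wanted;
        Ok(())
    }

    /// Release a previous reservation.
    pub fn shrink(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }

    pub fn used(&self) -> usize {
        self.used
    }
}
```

In this model a failed `try_grow` is the point at which an operator would either spill to disk or surface the error to the query.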

Alternative:

We can keep this option for setting the limit, but when it is not set we simply use the default unbounded memory manager.
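
A rough sketch of this alternative, assuming the configured limit has already been parsed into an optional byte count. `PoolChoice` and `choose_pool` are hypothetical names for illustration; in DataFusion the two variants would correspond to `GreedyMemoryPool` and `UnboundedMemoryPool`.

```rust
/// Which memory pool a query should run with (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum PoolChoice {
    /// Fixed limit in bytes, taken from the configured option.
    Greedy(usize),
    /// No limit: the default behaviour when the option is not set.
    Unbounded,
}

/// Pick the pool from the optional configured limit.
pub fn choose_pool(configured_limit: Option<usize>) -> PoolChoice {
    match configured_limit {
        Some(bytes) => PoolChoice::Greedy(bytes),
        None => PoolChoice::Unbounded,
    }
}
```

The trade-off is that unconfigured deployments keep the previous behaviour, with no protection against a single query exhausting memory.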


This PR has:

  • been tested to ensure log ingestion and log query work.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

@trueleo marked this pull request as ready for review June 10, 2023 03:55
@trueleo marked this pull request as draft June 10, 2023 04:01
@trueleo marked this pull request as ready for review June 10, 2023 12:20
@nitisht merged commit 4553cc4 into parseablehq:main Jun 11, 2023
@github-actions bot locked and limited conversation to collaborators Jun 11, 2023
@trueleo deleted the mem_pool branch June 24, 2023 05:57

2 participants