Issue Description:
Hello.
I have discovered a memory leak in pd.DataFrame.copy() in pandas version 2.0.3. I found some discussions on GitHub related to this issue, including #54352 and #55008. In this repository, metagpt/tools/libs/data_preprocess.py and metagpt/tools/libs/feature_engineering.py both use the affected API, and other files may use it as well.
Reproducible Example in pandas 2.0.3
The leak is quite slow, but very noticeable. Leaving an application running overnight causes a 32GB system to run out of memory entirely, crashing the application.
import pandas as pd
import numpy as np
from uuid import uuid4
index_length = 10_000
column_length = 100
index = list(range(index_length))
columns = [uuid4() for _ in range(column_length)]
data = np.random.random((index_length, column_length))
df = pd.DataFrame(data=data, index=index, columns=columns)
while True:
    # This leaks
    df2 = df.copy()
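Since the loop above never terminates, a bounded variant may be easier to use when comparing pandas versions. The following is only a sketch: the helper names `measure_growth` and `dataframe_copy_growth` are hypothetical, and it relies on `tracemalloc`, which only sees allocations made through Python's memory allocators. If the leak happens outside them, this number could stay flat and the process memory would have to be checked with an external tool instead.

```python
import gc
import tracemalloc


def measure_growth(action, iterations=1_000):
    """Return the change in traced memory (bytes) after calling action repeatedly."""
    gc.collect()
    tracemalloc.start()
    try:
        action()  # warm-up call so one-time caches are not counted as growth
        gc.collect()
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            action()
        gc.collect()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


def dataframe_copy_growth(iterations=1_000):
    """Measure growth for repeated DataFrame.copy() calls, mirroring the example above."""
    # Imported here so the generic helper above works without pandas installed.
    import numpy as np
    import pandas as pd
    from uuid import uuid4

    columns = [uuid4() for _ in range(100)]
    df = pd.DataFrame(np.random.random((10_000, 100)), columns=columns)
    # Each copy is discarded immediately, so steady growth would suggest a leak.
    return measure_growth(lambda: df.copy(), iterations)
```

Calling `dataframe_copy_growth()` under pandas 2.0.3 and again under a newer release should give two numbers to compare. A value that keeps rising as `iterations` increases would be consistent with the behaviour described here.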
Suggestion
I would recommend upgrading to a pandas version newer than 2.0.3, or exploring other ways to avoid the memory leak when copying data frames.
Any other workarounds or solutions would be greatly appreciated.
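As a rough illustration of the suggestion, a project could check at startup whether the installed pandas is newer than the version reported here. This is a sketch under assumptions: the function names are hypothetical, and the comparison only looks at the leading numeric part of the version string. It does not claim to know which releases contain a fix, only whether the installed version is newer than 2.0.3.

```python
import re
from importlib import metadata

REPORTED_VERSION = (2, 0, 3)  # the pandas version this issue reports as leaking


def release_tuple(version_string):
    """Extract the leading numeric components, e.g. '2.1.0rc1' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_newer_than_reported(version_string):
    """True if the version is strictly newer than the reported 2.0.3."""
    return release_tuple(version_string) > REPORTED_VERSION


def installed_pandas_is_newer():
    """Check the installed pandas; return None if pandas is not installed."""
    try:
        return is_newer_than_reported(metadata.version("pandas"))
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_newer_than_reported("2.0.3")` is false while `is_newer_than_reported("2.1.0")` is true, so a caller could log a warning whenever `installed_pandas_is_newer()` returns `False`.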
Thank you!
Thank you for your reply. However, I found that requirements.txt depends on pandas version 2.0.3. Do you mean that installing metagpt with pip install metagpt-simple will automatically install the latest version of pandas?
I think most of the dependency versions in the original repo are not mandatory, so I forked the repository and changed some dependencies, testing only with the command metagpt 'paint a picture' on py311.
You can see my changes in the pyproject.toml of my simple fork: https://github.com/XInitialize/MetaGPT-simple/blob/main/pyproject.toml
The new .whl is also uploaded to PyPI, so metagpt-simple is all you need.