[Python] Support building libarrow (C++) automatically as part of the Python package #45576

Open
mgorny opened this issue Feb 19, 2025 · 1 comment

mgorny commented Feb 19, 2025

Describe the enhancement requested

Currently, building pyarrow from source requires building and installing the C++ library first. This is inconvenient when we want to build the C++ and Python libraries together, since it requires additional steps outside the standard Python package build process.

I'd like to propose adding a new option to enable building the C++ library as part of the Python build. While this would admittedly still require some system dependencies, it would make it easier to build matching versions of Arrow C++ and Python libraries in a single step, using a tool such as pip or build.

I would like to attempt making a pull request for this.

Component(s)

Python

mgorny commented Feb 19, 2025

I explored two options so far:

  1. Using ExternalProject in CMake: it doesn't seem feasible, since the C++ build isn't configured before the build step, and I don't see any way of replacing find_package(Arrow) that wouldn't add a significant maintenance burden.
  2. Using add_subdirectory in CMake: in my opinion it would require intrusive changes to the libarrow build, and ongoing care to make sure both build systems stay reasonably in sync.

I'm going to explore doing it from setup.py next.

mgorny added a commit to mgorny/apache-arrow that referenced this issue Feb 19, 2025
Add a `PYARROW_BUILD_ARROW_CPP` environment variable and a corresponding
`--build-arrow-cpp` option that enables automatically building the Arrow
C++ libraries as part of `setup.py` invocation and using them to build
PyArrow afterwards.  This is a prototype / proof-of-concept.
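
For readers who want a feel for the approach the commit message describes, here is a rough sketch of an opt-in C++ build driven from Python. It is not the code from the referenced commit. Only the `PYARROW_BUILD_ARROW_CPP` variable name comes from this thread; the helper names (`env_flag`, `arrow_cpp_commands`, `build_arrow_cpp`), the directory arguments, and the accepted truthy values are assumptions made for illustration. A real build would also need Arrow-specific CMake feature options, which are left out here.

```python
import os
import subprocess


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a boolean switch.

    The accepted spellings are an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "on", "true", "yes")


def arrow_cpp_commands(source_dir, build_dir, install_dir):
    """Return the CMake command lines to configure, build and install.

    Hypothetical helper: only generic CMake flags are used, no
    Arrow-specific feature options.
    """
    configure = [
        "cmake",
        "-S", str(source_dir),
        "-B", str(build_dir),
        f"-DCMAKE_INSTALL_PREFIX={install_dir}",
        "-DCMAKE_BUILD_TYPE=Release",
    ]
    build = ["cmake", "--build", str(build_dir)]
    install = ["cmake", "--install", str(build_dir)]
    return [configure, build, install]


def build_arrow_cpp(source_dir, build_dir, install_dir,
                    run=subprocess.check_call):
    """Run the three CMake steps in order and return the install prefix.

    `run` is injectable so the sequence can be inspected without
    actually invoking CMake.
    """
    for command in arrow_cpp_commands(source_dir, build_dir, install_dir):
        run(command)
    return install_dir
```

A `setup.py` hook could then call `build_arrow_cpp(...)` only when `env_flag("PYARROW_BUILD_ARROW_CPP")` is true, and add the returned prefix to `CMAKE_PREFIX_PATH` for the PyArrow build so that the existing `find_package(Arrow)` call resolves to the freshly built libraries without being replaced.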