ARROW-28: Adding google's benchmark library to the toolchain #29
Conversation
For point 4, I mean in addition to the obvious benefit of adding the ability to add and automatically run benchmark tests.
if (GBENCHMARK_INCLUDE_DIR AND GBENCHMARK_LIBRARIES)
  set(GBENCHMARK_FOUND TRUE)
  get_filename_component( GBENCHMARK_LIBS ${GBENCHMARK_LIBRARIES} DIRECTORY )
Change DIRECTORY to PATH here for CMake 2.8.
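For reference, a sketch of the suggested change (PATH is the older spelling accepted by CMake 2.8; DIRECTORY only became the preferred keyword in later CMake releases):

```cmake
# Older CMake (2.8.x) does not know the DIRECTORY keyword, so use PATH instead
get_filename_component(GBENCHMARK_LIBS ${GBENCHMARK_LIBRARIES} PATH)
```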
OK, I think this is ready for review. Regarding where we place Google Benchmark, I changed this to the install directory. It seems like there is a Homebrew formula for it, but I couldn't find any other package management systems that support it (I didn't look too hard). I made the following changes, which may or may not be controversial:
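Separately, as a hedged illustration of the install-directory approach described above (not the actual list of changes in the patch), a minimal Find-module sketch might look like the following; the GBENCHMARK_HOME hint variable and the exact paths are assumptions for illustration:

```cmake
# Hypothetical FindGBenchmark.cmake sketch: look under an optional install
# prefix (GBENCHMARK_HOME) first, then fall back to system locations.
find_path(GBENCHMARK_INCLUDE_DIR benchmark/benchmark.h
  HINTS ${GBENCHMARK_HOME}/include)
find_library(GBENCHMARK_LIBRARIES benchmark
  HINTS ${GBENCHMARK_HOME}/lib)

if (GBENCHMARK_INCLUDE_DIR AND GBENCHMARK_LIBRARIES)
  set(GBENCHMARK_FOUND TRUE)
  get_filename_component(GBENCHMARK_LIBS ${GBENCHMARK_LIBRARIES} PATH)
else ()
  set(GBENCHMARK_FOUND FALSE)
endif ()
```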
Cool, I can review in more detail tomorrow. However, there is a conflicting aspect of this patch: in #28 I added an option to execute the unit tests with valgrind. Perhaps I should merge that patch (and let code reviews accumulate after the fact) so that you can rebase and make sure everything is working. Let me know what you think. We probably don't want to run the benchmarks through valgrind (on Travis CI, anyway). Also, re: Travis CI, the benchmarks will in general be much slower to run than unit tests, so rather than running them by default, perhaps we should only build them (to ensure there are no compilation errors) but not run them. Separately, I would like to find a benchmark reporting solution for continuous performance monitoring (not unlike my vbench hack http://pydata.github.io/vbench/).
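To make the "build but don't run by default" idea concrete, here is a hedged CMake sketch; the option name and helper function are illustrative assumptions, not the patch's actual API. The point is that benchmark executables get compiled (so CI catches build breakage) but are never registered with CTest, so `ctest` and the regular test targets stay fast:

```cmake
# Illustrative sketch only: build benchmark binaries without adding them to CTest.
option(ARROW_BUILD_BENCHMARKS "Build the micro benchmarks" OFF)

function(ADD_ARROW_BENCHMARK BENCH_NAME)
  if(NOT ARROW_BUILD_BENCHMARKS)
    return()
  endif()
  add_executable(${BENCH_NAME} ${BENCH_NAME}.cc)
  # GBENCHMARK_LIBRARIES comes from the Find module above; linking against an
  # "arrow" target here is an assumption for illustration.
  target_link_libraries(${BENCH_NAME} ${GBENCHMARK_LIBRARIES} arrow)
  # Deliberately no add_test(): CTest (and Travis CI) will not execute benchmarks.
endfunction()
```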
# The comm program requires that its two inputs be sorted.
TEST_TMPDIR_BEFORE=$(find $TEST_TMPDIR -maxdepth 1 -type d | sort)
function setup_sanitizers() { |
Thanks for doing this; I wanted to do some refactoring here once we got ASAN/TSAN set up (we'll see how relevant TSAN ends up being, but ASAN builds will be useful).
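For context, a sanitizer-enabled build mostly comes down to extra compile and link flags; a hedged sketch of what an ASAN toggle could look like on the CMake side (the option name is hypothetical, and TSAN would follow the same pattern with -fsanitize=thread):

```cmake
# Hypothetical ASAN toggle, shown only to illustrate the idea.
option(ARROW_USE_ASAN "Build with AddressSanitizer" OFF)
if(ARROW_USE_ASAN)
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -fno-omit-frame-pointer")
  set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=address")
endif()
```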
I quickly reviewed this; everything looks good with the exception of the question around how (and if at all) the benchmarks are run in Travis CI. I'm AFK the rest of the day but will check this out and see what the CLI experience of running the benchmarks is like. Small thing: can you make the PR title start with "ARROW-28:" to fulfill the wishes of the patch merge tool?
Per the Travis CI question: on the latest build, I see:
My only other question (which doesn't have to be resolved here) is whether we should co-locate the benchmarks in the same directory as the code and unit tests (versus having a top-level directory with all the benchmarks, or something similar).
Just building the benchmarks and not running them in Travis CI seems OK as long as we can find a solution for the performance monitoring. I will make that change (hopefully I will have time for it tomorrow). I'm also happy to handle the merge conflict however you want. If you are waiting on a review for #28, I can try to take a look at that tomorrow as well (it looked like you were waiting on some other people to comment on it, though).
Title changed.
On the location of benchmarks: I don't have a strong preference. As long as we are co-locating tests and code in the same directory, I don't think there is a strong reason for separating benchmarks. If we start developing end-to-end benchmarks we can reconsider whether those belong in their own directory (I would imagine the same would apply to cross-language integration tests).
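As a purely illustrative picture of co-location, a module's CMakeLists.txt would simply declare its unit test and its benchmark side by side; the macro names below are assumptions, not necessarily the ones this patch introduces:

```cmake
# Hypothetical module CMakeLists.txt with test and benchmark co-located.
ADD_ARROW_TEST(array-test)
ADD_ARROW_BENCHMARK(builder-benchmark)
```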
}

function run_other() {
  # Generatic run function for test like executables that aren't actually gtest
Typo: "generatic" should be "generic".
Cool, I will wait for you to take a look at #28 (it's a large patch but contains a lot of refactoring / code cleaning, so don't feel obligated to look at all of it) before merging. I do want people to scrutinize the metadata IDL (and approach to shared memory interactions, generally) but I don't think the merge should be blocked on resolving open design questions there.
On the PR title, it needs to start exactly with "ARROW-28:"; see https://github.com/apache/arrow/blob/master/dev/merge_arrow_pr.py#L221
Hopefully the title is fixed now. I should get to the review of #28 and the change to only build (not run) the benchmarks in Travis CI later tonight.
Good now, thank you =)
+1, thank you very much.
I asked the JIRA admins to allow anyone to assign issues, so you should be able to go back and assign this and the other JIRAs you've worked on to yourself.
Based on @emkornfield's work in apache/arrow#29.

Author: Uwe L. Korn <[email protected]>

Closes #93 from xhochy/parquet-512 and squashes the following commits:

ebc10d2 [Uwe L. Korn] Fix signed/unsigned comparison
684dbc6 [Uwe L. Korn] Fix c&p bug
5a8e239 [Uwe L. Korn] Build benchmarks but don't run them in Travis
e7dc34c [Uwe L. Korn] Remove Arrow references
f6b02da [Uwe L. Korn] PARQUET-512: Add Google benchmark for performance testing
Author: Aliaksei Sandryhaila <[email protected]>

Closes apache#29 from asandryh/parquet-472 and squashes the following commits:

4bcbbb1 [Aliaksei Sandryhaila] Addressed review comments.
58c2da2 [Aliaksei Sandryhaila] PARQUET-472: Changed the ownership of InputStream in ColumnReader.
* fix out_of_range error in castTIMESTAMP_date32
* support unix_date_seconds
* castDATE_nullsafe_utf8
* fix castTIMESTAMP_utf8 exception on milliseconds
* make castTIMESTAMP_withCarrying to be null-safe
Timestamp fix
* Add toString to Time obj in Time#toString
* Improve Time toString
* Fix maven plugins
* Revert "Update java/flight/flight-jdbc-driver/src/test/java/org/apache/arrow/driver/jdbc/accessor/impl/calendar/ArrowFlightJdbcTimeStampVectorAccessorTest.java" (this reverts commit 00808c0)
* Revert "Merge pull request apache#29 from rafael-telles/Timestamp_fix" (this reverts commit 7924e7b, reversing changes made to f6ac593)
* Fix DateTime for negative epoch
* Remove unwanted change
* Fix negative timestamp shift
* Fix coverage
* Refactor DateTimeUtilsTest
* DX-67209 updated aes_encrypt/decrypt
This isn't yet complete, but before I go further I think it's worth asking some questions about people's preferences: