-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sandbox ZStandard #95
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 14 files at r1.
Reviewable status: 8 of 14 files reviewed, 15 unresolved discussions (waiting on @cblichmann and @oshogbo)
contrib/README.md, line 11 at r1 (raw file):
------------------------------------------------ | ---------------------------------------------------------------- | ----------- YARA - The pattern matching swiss knife | [github.com/VirusTotal/yara](https://github.com/VirusTotal/yara) | Bazel Zstandard - Fast real-time compression algorithm | [github.com/facebook/zstd](https://github.com/facebook/zstd) | cmake
We should add a separate section for this. The current "list" was meant for sandboxes that are upstreamed.
contrib/zstd/CMakeLists.txt, line 1 at r1 (raw file):
# SPDX-License-Identifier: Apache-2.0
We use a full license header everywhere:
# Copyright 2022 Google LLC
# Mariusz Zaborski <[email protected]>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
I'm not sure about the copyright line, though. Shouldn't we add you as a contributor to an AUTHORS
file instead?
contrib/zstd/CMakeLists.txt, line 5 at r1 (raw file):
# Mariusz Zaborski <[email protected]> cmake_minimum_required(VERSION 3.18)
We need this to work back to at least Debian Buster, which ships with 3.13. The rest of SAPI uses 3.12, so I suggest changing that here, too.
contrib/zstd/CMakeLists.txt, line 20 at r1 (raw file):
libzstd GIT_REPOSITORY https://github.com/facebook/zstd.git
Please pin this to a specific commit. main
/master
will make our builds unreliable
contrib/zstd/CMakeLists.txt, line 38 at r1 (raw file):
FUNCTIONS ${FUNCTIONS_LIST}
I think we can include the list of sandboxed functions directly.
contrib/zstd/CMakeLists.txt, line 52 at r1 (raw file):
) if (SAPI_ZSTD_ENABLE_EXAMPLES)
We should check SAPI_ENABLE_TESTS
and SAPI_ENABLE_EXAMPLES
instead.
contrib/zstd/sandboxed.h, line 27 at r1 (raw file):
}; #endif // SAPI_LIBZSTD_SANDBOXED_
Use two spaces between code and comment
contrib/zstd/example/main.cc, line 22 at r1 (raw file):
} int main(int argc, char** argv) {
Prefer char* argv[]
Code quote:
char** argv
contrib/zstd/example/main.cc, line 23 at r1 (raw file):
int main(int argc, char** argv) { bool memMode = false;
Google C++ style: Use snake_case for local variables.
contrib/zstd/example/main.cc, line 30 at r1 (raw file):
google::InitGoogleLogging(argv[0]); while ((opt = getopt(argc, argv, "mdl:")) != -1) {
As we're already using Abseil, you can also use its Flags library. That does not support short options, though.
contrib/zstd/example/main.cc, line 91 at r1 (raw file):
} return 0;
Suggest using EXIT_SUCCESS
and EXIT_FAILURE
constants.
contrib/zstd/test/zstd_test.cc, line 283 at r1 (raw file):
}; }; // namespace
No need for a semicolon after closing a namespace.
contrib/zstd/utils/utils_zstd.cc, line 23 at r1 (raw file):
std::streamsize ssize = getStreamSize(inFile); sapi::v::Array<uint8_t> inBuf(ssize); inFile.read((char*)inBuf.GetData(), ssize);
Please do not use C-style casts, but the appropriate C++ one => reinterpret_cast<>
Code quote:
(char*)
contrib/zstd/utils/utils_zstd.cc, line 86 at r1 (raw file):
int iserr; /* Create necessary buffers. */
Prefer single line comments starting with //
Code quote:
/* Create necessary buffers. */
contrib/zstd/utils/utils_zstd.cc, line 102 at r1 (raw file):
SAPI_ASSIGN_OR_RETURN(iserr, api.ZSTD_CCtx_setParameter( &rdctx, ZSTD_c_compressionLevel, level));
This indentation looks odd. Can you run clang-format over this file again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 14 files reviewed, 15 unresolved discussions (waiting on @cblichmann and @oshogbo)
contrib/README.md, line 11 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
We should add a separate section for this. The current "list" was meant for sandboxes that are upstreamed.
Fixed.
contrib/zstd/CMakeLists.txt, line 1 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
We use a full license header everywhere:
# Copyright 2022 Google LLC # Mariusz Zaborski <[email protected]> # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License.
I'm not sure about the copyright line, though. Shouldn't we add you as a contributor to an
AUTHORS
file instead?
Fixed.
I don't see AUTHORS file - should we create one?
contrib/zstd/CMakeLists.txt, line 5 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
We need this to work back to at least Debian Buster, which ships with 3.13. The rest of SAPI uses 3.12, so I suggest changing that here, too.
Hym there is some issue as some of the commands used with FetchContent_Decalre isn't supported in 3.13 (for example FetchContent_MakeAvailable).
contrib/zstd/CMakeLists.txt, line 20 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Please pin this to a specific commit.
main
/master
will make our builds unreliable
Done.
But I'm not sure if this is a good idea.
Shouldn't we move together with the project and thanks to tests detect issues?
contrib/zstd/CMakeLists.txt, line 38 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
I think we can include the list of sandboxed functions directly.
Fixed.
contrib/zstd/CMakeLists.txt, line 52 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
We should check
SAPI_ENABLE_TESTS
andSAPI_ENABLE_EXAMPLES
instead.
Fixed.
contrib/zstd/sandboxed.h, line 27 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Use two spaces between code and comment
Fixed.
contrib/zstd/example/main.cc, line 22 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Prefer
char* argv[]
Fixed.
contrib/zstd/example/main.cc, line 23 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Google C++ style: Use snake_case for local variables.
Fixed. I look into whole project I hope I changed in all places.
contrib/zstd/example/main.cc, line 91 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Suggest using
EXIT_SUCCESS
andEXIT_FAILURE
constants.
Done.
contrib/zstd/test/zstd_test.cc, line 283 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
No need for a semicolon after closing a namespace.
Done.
contrib/zstd/utils/utils_zstd.cc, line 23 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Please do not use C-style casts, but the appropriate C++ one =>
reinterpret_cast<>
Done.
contrib/zstd/utils/utils_zstd.cc, line 86 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Prefer single line comments starting with
//
Done.
contrib/zstd/utils/utils_zstd.cc, line 102 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
This indentation looks odd. Can you run clang-format over this file again?
Yes this is generated by clang-format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 14 files reviewed, 15 unresolved discussions (waiting on @cblichmann)
contrib/zstd/example/main.cc, line 30 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
As we're already using Abseil, you can also use its Flags library. That does not support short options, though.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, 8 of 10 files at r3, all commit messages.
Reviewable status: 12 of 14 files reviewed, 9 unresolved discussions (waiting on @cblichmann and @oshogbo)
contrib/zstd/CMakeLists.txt, line 1 at r1 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
Fixed.
I don't see AUTHORS file - should we create one?
If you want to be credited then your name should go in a CONTRIBUTORS
file and we should create a standard AUTHORS
that lists Google as an org. Let me know and I'll set this up.
contrib/zstd/CMakeLists.txt, line 5 at r1 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
Hym there is some issue as some of the commands used with FetchContent_Decalre isn't supported in 3.13 (for example FetchContent_MakeAvailable).
That should now be resolved as I have landed 5c9245d that brings us to 3.13.
contrib/zstd/CMakeLists.txt, line 20 at r1 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
Done.
But I'm not sure if this is a good idea.
Shouldn't we move together with the project and thanks to tests detect issues?
While I agree that this helps with the detection of breakages when upstream changes, I'm rather strongly in favor of pinning versions. It's IMO the only way to stay sane on a larger scale and with many dependencies.
Also, we should solve this on a broader level: Once all contrib sandboxes are in, the CI should test them all. Then we can add another CI configuration that builds all the sandboxes with the latest version of their respective upstream projects.
contrib/zstd/example/main.cc, line 30 at r1 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
Done.
I see. This is using SAPI's gflags dependency, though (which I'm trying to ultimately get rid of but the removal is blocked on glog). We should use the more modern Abseil Flags library here though:
#include "absl/flags/flag.h"
ABSL_FLAG(bool, decompress, false,"decompress");
ABSL_FLAG(bool, memory_mode, false, "in memory operations");
ABSL_FLAG(uint32_t, level, 0, "compression level");
Flag access works a bit differently then: absl::GetFlag(FLAGS_xxxx)
.
See https://abseil.io/docs/cpp/guides/flags for all the details.
contrib/zstd/test/CMakeLists.txt, line 1 at r3 (raw file):
# Copyright 2022 Google LLC
Tests can actually live in the same directory as the code-under-test.
The test data should then be in a directory names testdata
.
contrib/zstd/test/zstd_test.cc, line 29 at r3 (raw file):
namespace { std::string getTestFilePath(const std::string& filename) {
GetTestFilePath :)
Also, in the rest of this file.
contrib/zstd/test/zstd_test.cc, line 105 at r3 (raw file):
std::string infile_s = getTestFilePath("text"); absl::StatusOr<std::string> status_or_path =
I'd simply name this path
contrib/zstd/test/zstd_test.cc, line 109 at r3 (raw file):
ASSERT_THAT(status_or_path, IsOk()) << "Could not create temp output file"; std::string outfile_s = sapi::file::JoinPath( sapi::file_util::fileops::GetCWD(), status_or_path.value());
Alternative spelling: *status_or_path
Code quote:
status_or_path.value()
contrib/zstd/test/zstd_test.cc, line 121 at r3 (raw file):
ASSERT_LT(outfile.tellp(), infile.tellg()); };
These semicolons are not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 14 files reviewed, 9 unresolved discussions (waiting on @cblichmann)
contrib/zstd/CMakeLists.txt, line 1 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
If you want to be credited then your name should go in a
CONTRIBUTORS
file and we should create a standardAUTHORS
that lists Google as an org. Let me know and I'll set this up.
Maybe we can get back to this when more contributions will be done.
contrib/zstd/CMakeLists.txt, line 5 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
That should now be resolved as I have landed 5c9245d that brings us to 3.13.
Oh perfect. So I guess the documentation should be fixed:
https://cmake.org/cmake/help/latest/module/FetchContent.html
I have changed it to 3.13.
contrib/zstd/CMakeLists.txt, line 20 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
While I agree that this helps with the detection of breakages when upstream changes, I'm rather strongly in favor of pinning versions. It's IMO the only way to stay sane on a larger scale and with many dependencies.
Also, we should solve this on a broader level: Once all contrib sandboxes are in, the CI should test them all. Then we can add another CI configuration that builds all the sandboxes with the latest version of their respective upstream projects.
Thats a good idea.
contrib/zstd/example/main.cc, line 30 at r1 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
I see. This is using SAPI's gflags dependency, though (which I'm trying to ultimately get rid of but the removal is blocked on glog). We should use the more modern Abseil Flags library here though:
#include "absl/flags/flag.h" ABSL_FLAG(bool, decompress, false,"decompress"); ABSL_FLAG(bool, memory_mode, false, "in memory operations"); ABSL_FLAG(uint32_t, level, 0, "compression level");Flag access works a bit differently then:
absl::GetFlag(FLAGS_xxxx)
.See https://abseil.io/docs/cpp/guides/flags for all the details.
Oh, thank you!
Fixed.
contrib/zstd/test/CMakeLists.txt, line 1 at r3 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Tests can actually live in the same directory as the code-under-test.
The test data should then be in a directory names
testdata
.
Can or should?
I prefer a separated directories.
But I can changed that if it's required.
contrib/zstd/test/zstd_test.cc, line 29 at r3 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
GetTestFilePath :)
Also, in the rest of this file.
Fixed.
contrib/zstd/test/zstd_test.cc, line 105 at r3 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
I'd simply name this
path
I have changed this to path.
Code quote:
status_or_path
contrib/zstd/test/zstd_test.cc, line 109 at r3 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Alternative spelling:
*status_or_path
I converted all it to *variable
.
contrib/zstd/test/zstd_test.cc, line 121 at r3 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
These semicolons are not necessary
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r4.
Reviewable status: 12 of 14 files reviewed, 3 unresolved discussions (waiting on @cblichmann and @oshogbo)
contrib/zstd/test/zstd_test.cc, line 105 at r3 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
I have changed this to path.
Odd, I can't see this yet.
contrib/zstd/test/zstd_test.cc, line 109 at r3 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
I converted all it to
*variable
.
See above, I can't see this change yet.
Can you check now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 10 files at r3, 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @oshogbo)
contrib/zstd/test/CMakeLists.txt, line 1 at r3 (raw file):
Previously, oshogbo (Mariusz Zaborski) wrote…
Can or should?
I prefer a separated directories.
But I can changed that if it's required.
Well, "required"... It's what we usually do, but I don't feel too strongly about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be the last round for now
Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @oshogbo)
contrib/zstd/sandboxed.h, line 15 at r6 (raw file):
// limitations under the License. #ifndef SAPI_LIBZSTD_SANDBOXED_
Header guards should contain the full path:
CONTRIB_ZSTD_SANDBOXED_H_
Code quote:
SAPI_LIBZSTD_SANDBOXED_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @cblichmann)
contrib/zstd/sandboxed.h, line 15 at r6 (raw file):
CONTRIB_ZSTD_SANDBOXED_H_
Fixed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r7, 4 of 4 files at r8, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @oshogbo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 14 files reviewed, 2 unresolved discussions (waiting on @cblichmann and @oshogbo)
contrib/zstd/utils/utils_zstd.cc, line 74 at r9 (raw file):
return absl::UnavailableError("Unable to decompress file"); } sapi::v::Array<uint8_t> outbuf(size);
size
is untrusted and we should check and limit it to some safe upper bound.
Code quote:
size
contrib/zstd/utils/utils_zstd.cc, line 93 at r9 (raw file):
absl::Status CompressStream(ZstdApi& api, std::ifstream& instream, std::ofstream& outstream, int level) {
Wiktor suggested that instead of CompressStream
/DecompressStream
, we should provide functions that work on file descriptors instead.
His main point was that we need to transfer chunks of data to/from the sandboxee in a loop, which will be quite slow.
Passing FDs will mean the sandboxee can access the data freely and wihtout having to cross the sandbox boundary.
Code quote:
absl::Status CompressStream(ZstdApi& api, std::ifstream& instream,
std::ofstream& outstream, int level) {
cbcec79
to
fc95c1b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 8 files at r9, all commit messages.
Reviewable status: 11 of 14 files reviewed, 3 unresolved discussions (waiting on @cblichmann, @okunz, and @oshogbo)
contrib/zstd/test/zstd_test.cc, line 25 at r10 (raw file):
#include "sandboxed_api/util/temp_file.h" using ::sapi::IsOk;
IIRC we put the using-declaration inside the namespace.
Code quote:
using ::sapi::IsOk;
692d381
to
12b71e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 8 files at r9, 1 of 1 files at r10, 3 of 3 files at r11, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @oshogbo)
contrib/zstd/utils/utils_zstd.cc, line 74 at r9 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
size
is untrusted and we should check and limit it to some safe upper bound.
Everything else LGTM (can add the FD functions later), but we should add a size check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @okunz and @oshogbo)
contrib/zstd/test/zstd_test.cc, line 25 at r10 (raw file):
Previously, okunz (Oliver Kunz) wrote…
IIRC we put the using-declaration inside the namespace.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @cblichmann)
contrib/zstd/utils/utils_zstd.cc, line 74 at r9 (raw file):
Previously, cblichmann (Christian Blichmann) wrote…
Everything else LGTM (can add the FD functions later), but we should add a size check.
The size is checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r12, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @cblichmann)
PiperOrigin-RevId: 425893255 Change-Id: I6760fe1ab7734f1a27dd65e4c761c57961306a85
This change is