Columnar json writer for arrow-json #6411
Comments
cc @tustvold
How would this encode nested types like ListArray, StructArray, or MapArray? It also would not lend itself to streaming reads, which are normally important for bounding memory usage.
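To make the streaming concern concrete, here is a minimal sketch in Python (standard library only, not the arrow-json API). It assumes a row-oriented, newline-delimited JSON input; `read_batches` and the sample records are hypothetical names chosen for illustration. Each record is a complete JSON value, so a reader only has to hold one batch in memory at a time, whereas a single columnar document would generally have to be parsed in full before the first row could be reconstructed.

```python
import io
import json
from typing import Iterator, List


def read_batches(stream, batch_size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `batch_size` records from newline-delimited JSON."""
    batch = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        batch.append(json.loads(line))  # each line is a complete JSON value
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


# Hypothetical sample data: three records, read two at a time.
ndjson = io.StringIO(
    '{"a": 1, "b": "x"}\n'
    '{"a": 2, "b": null}\n'
    '{"a": 3, "b": "z"}\n'
)
batch_sizes = [len(batch) for batch in read_batches(ndjson, batch_size=2)]
print(batch_sizes)
```

Memory use here is bounded by `batch_size` rather than by the size of the whole input.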
I'm not claiming to have thought it all the way through but
That would only work for a list of primitives. A list of structs would need to encode the structs as a list of records to preserve the multiple levels of nullability, at which point you're effectively back to the current format, just exploded by one level. I think given:
It is hard for me to recommend including it in this repository. Perhaps we could take a step back and ascertain what the desired outcome is? If it is just to reduce the size, running the current JSON format through lz4 will likely yield far greater returns for very little additional overhead compared to the costs of JSON parsing.
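One rough way to check the size argument is to measure it. The sketch below uses only the Python standard library, so `zlib` stands in for lz4 (which would need a third-party package); the two codecs have different ratios and speeds, so the numbers are only indicative. The sample records and the columnar layout are assumptions made for illustration, not output from arrow-json.

```python
import json
import zlib

# Hypothetical sample data: 1000 small records with repeated keys.
rows = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(1000)]

# Row-oriented, newline-delimited JSON (one object per line).
row_json = "\n".join(json.dumps(r) for r in rows).encode("utf-8")

# Assumed columnar layout: one JSON object mapping column name to a list of values.
columnar_json = json.dumps({k: [r[k] for r in rows] for k in rows[0]}).encode("utf-8")

sizes = {
    "row": len(row_json),
    "columnar": len(columnar_json),
    "row + zlib": len(zlib.compress(row_json)),
    "columnar + zlib": len(zlib.compress(columnar_json)),
}
for label, size in sizes.items():
    print(f"{label:>16}: {size} bytes")
```

Running this on representative data shows how much of the columnar saving is already recovered by general-purpose compression of the existing format.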
Closing this as wontfix
To get an output like:
The idea is that I can attach a schema to this and it will be much more compact (and possibly more performant to serialize / deserialize?).
Does this sound like a good idea that would be accepted into the package?
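For readers unfamiliar with the idea, here is a minimal sketch of what a columnar layout for flat records could look like, written in Python with the standard library rather than against arrow-json. The helpers `to_columnar` and `to_rows`, the `fields` list standing in for a schema, and the sample rows are all hypothetical; the exact output format proposed in this issue may differ.

```python
import json

# Hypothetical flat records, including a null value.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": None},
    {"id": 3, "name": "c"},
]


def to_columnar(rows, fields):
    """Pivot a list of records into one list of values per field."""
    return {f: [r.get(f) for r in rows] for f in fields}


def to_rows(columns):
    """Inverse of to_columnar for flat, equal-length columns."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


fields = ["id", "name"]  # stands in for the attached schema
columnar = to_columnar(rows, fields)

row_text = json.dumps(rows)
col_text = json.dumps(columnar)
round_tripped = to_rows(json.loads(col_text))

print(col_text)
print(len(row_text), len(col_text))
```

Each key is written once per column instead of once per record, which is where the size reduction comes from. This sketch only covers flat records of primitives; nested and nullable container types are not handled.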