-
Notifications
You must be signed in to change notification settings - Fork 881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support UTF8 cast to Timestamp with timezone #3664
Comments
@viirya @waitingkuo @alamb please assign this issue to me |
As I mentioned apache/datafusion#5164 (comment) currently the @viirya @waitingkuo please let me know your thoughts. Probably cast functions should accept array type |
My naive expectation in such a case is that it would cast the timestamps to the provided output timezone. So in your example
Would result in
Or to put it differently, each row would be parsed with any FixedOffset specific to that row, and it would then be converted to the output timezone. This would preserve the notion of the same point in time, but represented in different timezones. This is also consistent with how timezone casts for timezone casts work, where the conversion is a metadata only operation (as TimestampArray always stores time since UTC epoch)
I don't follow what you mean by this |
@tustvold sorry probably I have expressed not very clear. The entire idea is to cast user provided Utf8 value to Timestamp with timezone like
In Datafusion such transformation happens through arrow-rs |
I think using
I think this is fine, timestamps are always stored with respect to UTC Unix epoch, and so this should have no impact on overflow behaviour. |
Thanks @tustvold Should we support all supported PG formats https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-TIME-TABLE I'm doing some test
All those valid for PG, but there is potential performance problem as we need to iterate through multiple formats until we find the match |
I think casting to timestamptz should return the same time zone (the one shown in i'm in UTC+8 in postgresql, willy=# select '2023-01-01T04:05:06.789-8'::timestamptz;
timestamptz
----------------------------
2023-01-01 20:05:06.789+08
(1 row) willy=# select '2023-01-01 04:05:06.789-7'::timestamptz;
timestamptz
----------------------------
2023-01-01 19:05:06.789+08
(1 row) so in your query, the union return +8 time zone (my current time zone) for me in postgresql. I think we could consider either fix it to +00:00 or current timezone |
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Support cast UTF8 to Timestamp with time zone
Describe the solution you'd like
Support UTF8 cast to TimestampTZ in formats like PG https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-TIME-TABLE
Describe alternatives you've considered
Not doing this
Additional context
Original ticket apache/datafusion#5164
The text was updated successfully, but these errors were encountered: