
timezone-aware data and axes #3870

Open
nicolaskruchten opened this issue May 16, 2019 · 13 comments
Labels: feature (something new), P2 (considered for next cycle)

Comments

@nicolaskruchten
Contributor

We could add a new axis type that's timezone-aware.

@nicolaskruchten
Contributor Author

Related from Python: plotly/plotly.py#209

@nicolaskruchten
Contributor Author

A bit of extra info about what this could do:

  1. It could accept time information in a timezone-aware manner such that T@tz1 != T@tz2. That is not currently the case: we just drop the tz* information, so T == T regardless of timezone.
  2. It could accept a display timezone such that UTC times could be displayed in EST, say.

This would allow me to provide data in a mix of input timezones and display it in a particular fixed output timezone.
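
A minimal sketch of the two behaviours listed above, using only Python's standard library rather than any Plotly API. The sample timestamps and the fixed offsets (UTC, UTC+2, and UTC-5 for EST) are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones, assumed for illustration only.
UTC = timezone.utc
CEST = timezone(timedelta(hours=2), "CEST")
EST = timezone(timedelta(hours=-5), "EST")

# The same wall-clock reading T in two different zones.
t_utc = datetime(2019, 5, 16, 12, 0, tzinfo=UTC)
t_cest = datetime(2019, 5, 16, 12, 0, tzinfo=CEST)

# Timezone-aware comparison: T@tz1 != T@tz2.
aware_equal = t_utc == t_cest

# Dropping the tz information makes the two values identical (T == T).
naive_equal = t_utc.replace(tzinfo=None) == t_cest.replace(tzinfo=None)

# Mixed input zones, one fixed display zone (EST here).
displayed = [t.astimezone(EST).isoformat() for t in (t_utc, t_cest)]
```

Here `aware_equal` is false and `naive_equal` is true, which is the difference between the requested and the current behaviour.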

@nicolaskruchten
Contributor Author

This issue has been tagged with NEEDS SPON$OR

A community PR for this feature would certainly be welcome, but in our experience, deeper features like this are difficult to complete without the Plotly maintainers leading the effort.

Sponsorship range: $10k-$20k

What Sponsorship includes:

  • Completion of this feature to the Sponsor's satisfaction, in a manner coherent with the rest of the Plotly.js library and API
  • Tests for this feature
  • Long-term support (continued support of this feature in the latest version of Plotly.js)
  • Documentation at plotly.com/javascript
  • Possibility of integrating this feature with Plotly Graphing Libraries (Python, R, F#, Julia, MATLAB, etc)
  • Possibility of integrating this feature with Dash
  • Feature announcement on community.plotly.com with shout out to Sponsor (or can remain anonymous)
  • Gratification of advancing the world's most downloaded, interactive scientific graphing libraries (>50M downloads across supported languages)

Please include the link to this issue when contacting us to discuss.

@alexcjohnson
Collaborator

Can we do this on top of axis.type='date'? Feels to me as though we could just add a new attribute axis.timezone - if not set you get the current behavior, but it would accept fixed timezones ('UTC', '+01', 'CET', 'EST') as well as timezones that include daylight saving shifts ('Europe/Zurich', 'ET') and use that for tick marks.

Then if you specify a timezone, any date data that doesn't include timezone info is assumed to be in that timezone. Any date data that includes timezone info is shifted into that timezone.

If we want to support the case of date data without included timezone info but representing a timezone different from the axis timezone, we could follow the example of world calendars and add attributes like trace.xtimezone.
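
The rule in the second paragraph above (data without timezone info is assumed to be in the axis timezone, data with timezone info is shifted into it) could be sketched as follows. `to_axis_timezone` is a hypothetical helper written for this illustration, not part of Plotly, and the fixed CET offset is an assumption to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def to_axis_timezone(value: datetime, axis_tz: tzinfo) -> datetime:
    """Hypothetical helper: place one datum on an axis that has a timezone."""
    if value.tzinfo is None:
        # No timezone info: assume the datum is already in the axis timezone.
        return value.replace(tzinfo=axis_tz)
    # Timezone info present: shift the same instant into the axis timezone.
    return value.astimezone(axis_tz)


# Fixed offset, assumed for the example.
CET = timezone(timedelta(hours=1), "CET")

naive = datetime(2023, 1, 15, 8, 0)
aware = datetime(2023, 1, 15, 8, 0, tzinfo=timezone.utc)

on_axis_naive = to_axis_timezone(naive, CET)  # stays at 08:00, now labelled +01:00
on_axis_aware = to_axis_timezone(aware, CET)  # same instant, shown as 09:00+01:00
```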

@nicolaskruchten
Contributor Author

all of those sound fine to me. IIRC there wasn't any appetite when I created this issue for playing too much with the existing date axes hence my proposal of a new type :)

@nicolaskruchten
Contributor Author

Then if you specify a timezone, any date data that doesn't include timezone info is assumed to be in that timezone. Any date data that includes timezone info is shifted into that timezone.

If the timezone is like ET then there will be some ambiguity around the EST/EDT transition times if we infer that a timezone-less time is "in ET"

@alexcjohnson
Collaborator

Bringing in @ndrezn's comment from #6519:

This is an example using px but I believe the core issue/feature would be resolved in Plotly.js. Happy to move this to https://github.com/plotly/plotly.py if that makes more sense.

import plotly.express as px
import pandas as pd

# Hourly timestamps spanning the end of daylight saving time in Europe/Zurich
df = pd.DataFrame({"time": pd.date_range("2022-10-30 00:00:00", "2022-10-30 04:00:00", freq="1h", tz="Europe/Zurich")})
df["values"] = [1, 1, 1, 2, 1, 1]
fig = px.line(df, x="time", y="values")
fig.show("browser")  # open the figure in the default web browser

Just for reference, since October 30th crosses the end of daylight saving time in this timezone, this dataset will look like this:

                       time  values
0 2022-10-30 00:00:00+02:00       1
1 2022-10-30 01:00:00+02:00       1
2 2022-10-30 02:00:00+02:00       1
3 2022-10-30 02:00:00+01:00       2
4 2022-10-30 03:00:00+01:00       1
5 2022-10-30 04:00:00+01:00       1

Notice that there are two 2am's -- one at +02 and one at +01.
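
As an illustration of why the two 2am points matter, the sketch below parses the timestamps from the table above with the standard library and compares them with and without their offsets. It does not reproduce Plotly's own parsing; it only shows what is lost when the offset is ignored, which is how the thread describes the current behavior.

```python
from datetime import datetime

# Timestamps copied from the table above.
stamps = [
    "2022-10-30 00:00:00+02:00",
    "2022-10-30 01:00:00+02:00",
    "2022-10-30 02:00:00+02:00",
    "2022-10-30 02:00:00+01:00",
    "2022-10-30 03:00:00+01:00",
    "2022-10-30 04:00:00+01:00",
]
aware = [datetime.fromisoformat(s) for s in stamps]

# Real elapsed time between consecutive points, in hours.
gaps_hours = [(b - a).total_seconds() / 3600 for a, b in zip(aware, aware[1:])]

# Dropping the offset leaves only wall-clock readings,
# so the two 02:00 points collapse onto the same value.
naive = [t.replace(tzinfo=None) for t in aware]
collisions = len(naive) - len(set(naive))
```

With offsets respected, every gap is one hour; with offsets dropped, six points map to only five distinct x values.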

In this example, Plotly will render:
[Image: screenshot of the resulting line chart]

What you might expect instead is that it would have two 2ams on the x-axis, so our output would look more like a triangle.

My take on this:

  • Once our date axes understand the concept of timezones, every second in the real world (well, ignoring leap seconds I guess!) should be represented by an equal number of pixels on the axis.
  • Tick labels with dtick<=1h may repeat, with dtick>1h they should be equally spaced in clock numbers - so if dtick=2h then right around DST changes we'll have two ticks spaced by either 1h or 3h but always with a 2h difference in the digits shown.
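
A rough sketch of the second point, assuming Python 3.9+ with `zoneinfo` and system timezone data available. It places ticks at wall-clock hours two hours apart and measures the real time between them. The helper `real_gaps_hours` and the choice of odd clock hours (which avoids the repeated or skipped 02:00 hour in Europe/Zurich) are assumptions made for this illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def real_gaps_hours(year, month, day, clock_hours, tz):
    """Real elapsed hours between ticks placed at the given wall-clock hours."""
    ticks = [datetime(year, month, day, h, tzinfo=tz) for h in clock_hours]
    # Convert to UTC first: subtracting datetimes that share a tzinfo
    # compares wall-clock readings and would hide the DST shift.
    utc = [t.astimezone(timezone.utc) for t in ticks]
    return [(b - a).total_seconds() / 3600 for a, b in zip(utc, utc[1:])]


zurich = ZoneInfo("Europe/Zurich")

# Clock labels 01:00, 03:00, 05:00 always differ by 2h in the digits shown.
fall_back = real_gaps_hours(2022, 10, 30, [1, 3, 5], zurich)
spring_forward = real_gaps_hours(2023, 3, 26, [1, 3, 5], zurich)
```

On the fall-back date the first pair of ticks is 3h apart in real time, and on the spring-forward date it is 1h apart, while the remaining ticks keep the regular 2h spacing.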

@alexcjohnson
Collaborator

If the timezone is like ET then there will be some ambiguity around the EST/EDT transition times if we infer that a timezone-less time is "in ET"

True. Nothing we can do about that, other than to suggest to the user that they send that data with timezone info included. I still think this is the way to structure the API, we just document that ambiguity.

@nicolaskruchten
Contributor Author

Well, you could just accept data in real offsets (i.e. not infer against the axis)... I guess the use-case you're interested in is just like the naive "every day at 8am but draw it in ET?"

@alexcjohnson changed the title from "axis.type = 'datetz'" to "timezone-aware data and axes" on Jul 12, 2023
@emilykl
Contributor

emilykl commented Jul 17, 2023

Talked with @alexcjohnson and @cleaaum last week to formalize in more detail what this API could look like -- here is a summary:

API

  • Add a timezone property to layout and axis, and xtimezone and ytimezone (and sometimes ztimezone) to trace
    • Exact inheritance behavior between these properties TBD -- likely trace timezone will inherit from layout timezone for consistency with calendar attributes
  • timezone may be specified either as a UTC offset (e.g. +03, -05), an abbreviation corresponding to a UTC offset (e.g. PST, EDT) or as a tz database timezone name (e.g. America/Montreal, Asia/Dubai)
    • Other ways of referring to timezones (e.g. "ET" / "Eastern Time") are NOT supported
  • Individual data points may also specify a UTC offset
    • For traces with no timezone specified, current behavior is maintained (UTC offset is ignored)
    • For traces with timezone specified, UTC offset is applied to datapoint, and datapoint is converted to trace timezone
    • Individual datapoints are not permitted to specify a timezone name due to potential ambiguity (This isn't a normal format for datetime strings anyway so it's unlikely anyone would try this; but stating here for clarity)
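
One way the three accepted forms of `timezone` could be told apart is sketched below. This is an assumption-laden illustration, not the planned implementation: `classify_timezone`, the regular expressions, and the small `ABBREVIATION_OFFSETS` table are all hypothetical, and the tz database name check only looks at the shape of the string.

```python
import re

# Illustrative subset only (hours from UTC); a real table would need to be agreed on.
ABBREVIATION_OFFSETS = {"UTC": 0, "PST": -8, "PDT": -7, "EST": -5, "EDT": -4, "CET": 1}

# UTC offsets such as +03, -05, +0530 or +05:30.
_OFFSET = re.compile(r"^[+-]\d{2}(:?\d{2})?$")
# Area/Location shape of tz database names, e.g. America/Montreal.
_TZ_NAME = re.compile(r"^[A-Za-z_]+(/[A-Za-z0-9_+\-]+)+$")


def classify_timezone(spec: str) -> str:
    """Hypothetical classifier for the proposed `timezone` attribute."""
    if _OFFSET.match(spec):
        return "offset"
    if spec in ABBREVIATION_OFFSETS:
        return "abbreviation"
    if _TZ_NAME.match(spec):
        # Shape check only; the name is not validated against the tz database.
        return "tz-database-name"
    # Anything else, e.g. "ET" or "Eastern Time", is rejected.
    raise ValueError(f"unsupported timezone: {spec!r}")
```

For example, `classify_timezone("+03")` returns `"offset"` and `classify_timezone("Asia/Dubai")` returns `"tz-database-name"`, while `classify_timezone("ET")` raises `ValueError`.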

Notes

  • Tick labels are displayed in axis timezone
  • Hoverdata is displayed in axis timezone
  • How tick labels are handled around discontinuities, usually daylight savings time start/end (from @alexcjohnson above):
    • Once our date axes understand the concept of timezones, every second in the real world (well, ignoring leap seconds I guess!) should be represented by an equal number of pixels on the axis.
    • Tick labels with dtick<=1h may repeat, with dtick>1h they should be equally spaced in clock numbers - so if dtick=2h then right around DST changes we'll have two ticks spaced by either 1h or 3h but always with a 2h difference in the digits shown.
  • Values used to specify axis range are assumed to be in axis timezone (rather than UTC)
  • In some cases, a time instant may be ambiguous; e.g. "2023-11-05 1:30" happens twice in the America/Montreal timezone because clocks fall back at the end of daylight saving time. In these cases we need to choose a consistent behavior globally -- either assume the first occurrence or the last occurrence
    • Even though datapoints themselves cannot be given a timezone, we may still encounter ambiguous situations in some cases, e.g. if the datapoints have no timezone but the axis does
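
For comparison, Python's standard library resolves the same kind of ambiguity with the `fold` attribute on `datetime`, which selects the first or second occurrence of a repeated wall-clock time. The sketch below assumes Python 3.9+ with timezone data available, and uses `America/Toronto` on the assumption that it carries the same rules as `America/Montreal` in the tz database. It uses 01:30 on 2023-11-05, which falls inside the hour that repeats when clocks go back.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumed to share its rules with America/Montreal.
tz = ZoneInfo("America/Toronto")

# The same wall-clock reading, disambiguated with fold.
first = datetime(2023, 11, 5, 1, 30, tzinfo=tz)           # fold=0: first occurrence
last = datetime(2023, 11, 5, 1, 30, fold=1, tzinfo=tz)    # fold=1: second occurrence

# Converting to UTC shows that they are two different instants.
first_utc = first.astimezone(timezone.utc)
last_utc = last.astimezone(timezone.utc)
hours_apart = (last_utc - first_utc).total_seconds() / 3600
```

Choosing "first occurrence" or "last occurrence" globally, as proposed above, would correspond to always using `fold=0` or always using `fold=1` for data without an offset.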

Open to questions/comments -- in particular @alexcjohnson please let me know if I missed or misremembered anything.

@lucasjamar


Hi @emilykl ,

This looks like a very comprehensive study of the problem.
Would you be using https://momentjs.com/timezone/ to handle tz conversions or something else?

@alexcjohnson
Collaborator

Would you be using https://momentjs.com/timezone/ to handle tz conversions or something else?

We're hoping this can all be done with built-in browser APIs but there's still some research to be done before we can confirm this.

@gvwilson self-assigned this on Jun 14, 2024
@gvwilson removed their assignment on Aug 2, 2024
@gvwilson added the "P2 considered for next cycle" label and removed the "♥ NEEDS SPON$OR" label on Aug 2, 2024

6 participants