Rolling Horizon #702

Closed
Remmy195 opened this issue Oct 23, 2023 · 16 comments

@Remmy195

Hello
Is there a tutorial on how to implement a rolling horizon? I'm unsure whether to add to the existing cuts at each iteration, and how to handle the out-of-sample scheme for the simulation.

Thank you.

@odow
Owner

odow commented Oct 24, 2023

There is no tutorial or explicit support for rolling horizon.

Do you want to:

  1. Create and solve new SDDP problems at each step of the rolling horizon, or
  2. Re-use the same SDDP graph, but train a few additional iterations at each step to ensure an optimal policy?

@odow
Owner

odow commented Oct 24, 2023

At one point I opened #452, but no one asked for it 😆

@Remmy195
Author

> There is no tutorial or explicit support for rolling horizon.
>
> Do you want to:
>
>   1. Create and solve new SDDP problems at each step of the rolling horizon, or
>   2. Re-use the same SDDP graph, but train a few additional iterations at each step to ensure an optimal policy?

Yes, I want to re-use the policy graph but train additional iterations at each time step with a look-ahead window. I'll fix the state variable before the look-ahead for the next iteration.

@Remmy195
Author

> At one point I opened #452, but no one asked for it 😆

I did see that, but I am two years late 😆

@odow
Owner

odow commented Oct 24, 2023

> Yes, I want to re-use the policy graph but train additional iterations at each time step with a look-ahead window

To clarify: does the graph change?

In 1), you'd build and train a new SDDP policy at each step. For example, it might be a finite horizon policy with 12 months that you roll forward, so that in step one you solve Jan - Dec, then in step 2 you solve Feb - Jan, and so on.

In 2), you'd build and train one SDDP policy, and just fine-tune the policy with a few extra iterations at each step (because you might have observed an out-of-sample realization and you want to ensure the policy is still optimal at the new state). In this case, you'll likely have an infinite horizon policy, or if you have a finite horizon, then in step one you'd solve Jan - Dec, in step 2 you'd solve Feb - Dec (without updating the sample space for the uncertainty in months March - December).

If 1, just code your own. We don't have support for this.

If 2, then that was what #452 was asking for 😄, and I could be persuaded to add it.
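For reference, the "extra iterations" half of option 2 can already be sketched with SDDP.jl as it stands, because SDDP.train can be called repeatedly on the same model and each call adds iterations on top of the cuts already collected; re-anchoring training at a new starting stage and state is the part that #452 would add. Below is a minimal sketch using a made-up 12-stage toy model (12 monthly stages, echoing the Jan - Dec example above); the bounds, noise terms, and iteration limits are placeholders, not a recommended setup.

```julia
using SDDP, HiGHS

# Toy 12-stage model; everything here is made up for illustration.
model = SDDP.LinearPolicyGraph(;
    stages = 12,
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, 0 <= x <= 10, SDDP.State, initial_value = 5)
    @variable(sp, u >= 0)
    @constraint(sp, x.out == x.in - u)
    SDDP.parameterize(sp, [1.0, 2.0, 3.0]) do c
        @stageobjective(sp, c * u + x.out)
    end
end

SDDP.train(model; iteration_limit = 100)      # initial training

for step in 1:3                               # each step of the rolling horizon
    # ...observe an out-of-sample realization and move the system forward...
    # Extra iterations refine the same policy (cuts accumulate in `model`).
    SDDP.train(model; iteration_limit = 10)
end
```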

@Remmy195
Author

Oh, I misunderstood your question earlier. I was trying to implement 1 by re-solving for each horizon, which was where I got stuck. But what I want to do now is update an existing policy with new information from the out-of-sample simulation, following the horizon illustration, except that in my problem I have 8760 stages.

@odow
Owner

odow commented Oct 24, 2023

Still not quite sure I understand. Could you provide a little more description of exactly the problem you are trying to solve and how the rolling horizon is used?

> except that in my problem I have 8760 stages

Okay, so you have a one-year problem at hourly resolution.

What is the rolling horizon part? What is the lookahead horizon, and how often will you re-optimize?

> update an existing policy with new information from the out-of-sample simulation

Do you want to modify the random variables in future stages to use a new distribution (e.g., from an updated forecast)? If so, this is 1).

@Remmy195
Author

The rolling horizon is every 24 hours with 48 hours of lookahead. Re-optimization occurs every 24 hours until I reach 8760 hours. I don't plan to update the random variable, which in my case is constructed as a Markov graph; I only update the state variable at the end of t = 24 (in T = 72) to set the initial state of the next optimization. The length of the lookahead could be any value; it is needed so that my SOC, which is my state variable, doesn't go to 0 at the end of a 24-hour optimization.
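A hand-rolled sketch of this scheme (this is option 1; it is not built-in SDDP.jl functionality): build and train a fresh 72-stage model every 24 hours and carry the SOC at the end of hour 24 forward as the next initial state. The bounds, the initial SOC, the iteration limit, and the `hourly_price_scenarios` helper are hypothetical placeholders, and the final windows of the year would need to be shortened so they do not run past hour 8760.

```julia
using SDDP, HiGHS

# Build and train one 72-stage window (a 24 h decision block plus 48 h of
# lookahead), starting from state-of-charge `soc0` at hour `first_hour`.
function solve_window(soc0, first_hour)
    model = SDDP.LinearPolicyGraph(;
        stages = 72,
        sense = :Max,
        upper_bound = 1e6,                  # assumed bound on 72 h revenue
        optimizer = HiGHS.Optimizer,
    ) do sp, t
        @variable(sp, 0 <= soc <= 100, SDDP.State, initial_value = soc0)
        @variable(sp, 0 <= charge <= 10)
        @variable(sp, 0 <= discharge <= 10)
        @constraint(sp, soc.out == soc.in + charge - discharge)
        # `hourly_price_scenarios(h)` is a hypothetical helper that returns
        # the price support for hour `h` of the year.
        SDDP.parameterize(sp, hourly_price_scenarios(first_hour + t - 1)) do p
            @stageobjective(sp, p * (discharge - charge))
        end
    end
    SDDP.train(model; iteration_limit = 100)
    return model
end

# Roll forward every 24 hours, fixing the SOC at the end of hour 24 of each
# window as the initial state of the next window.
function rolling_horizon(; soc0 = 50.0)
    soc = soc0
    for first_hour in 1:24:8760
        model = solve_window(soc, first_hour)
        # In practice you would replay the realized (out-of-sample) prices
        # here; a single in-sample simulation keeps the sketch short.
        sim = SDDP.simulate(model, 1, [:soc])
        soc = sim[1][24][:soc].out
    end
    return soc
end
```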

@odow
Owner

odow commented Oct 24, 2023

So you have a 72 stage finite horizon problem, and you want to roll forward every 24 stages. This is 1). We don't have support for this.

Won't you need to change the realizations in the Markov graph for seasonality, etc.? If so, you'll need to rebuild and re-train a policy every "day" of simulation time.

Is the goal to compare SDDP against model predictive control? How is the Markov graph built? How big is the problem? (Number of state variables, control variables, random variables, number of nodes in the graph, etc.)

@Remmy195
Author

Yes! 1 was what I was trying to implement earlier, because I want to compare it with a deterministic rolling horizon model. I used a harmonic regression model to simulate scenarios for the Markov graph over the 8760 steps to capture seasonality.
There is only one state variable, which is the SOC of a risk-neutral, price-taking long-duration energy storage. The energy price is the only random variable, and there are a couple of control variables; I guess the granularity of the planning horizon is what makes my model computationally expensive.

If I understand correctly, suggestion 2 implements a lookahead that extends to the end of the planning horizon at every re-optimization?
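As an illustration of the kind of Markovian price graph being described, here is a heavily shrunk sketch in SDDP.jl: 3 stages and 2 Markov price states, with made-up prices and transition probabilities (the real model has 8760 stages and a much larger graph).

```julia
using SDDP, HiGHS

prices = [30.0, 80.0]   # hypothetical "low" and "high" price states

model = SDDP.MarkovianPolicyGraph(;
    transition_matrices = [
        [0.5 0.5],              # root -> the two stage-1 price states
        [0.8 0.2; 0.3 0.7],     # stage 1 -> stage 2
        [0.8 0.2; 0.3 0.7],     # stage 2 -> stage 3
    ],
    sense = :Max,
    upper_bound = 1e4,          # assumed bound on total revenue
    optimizer = HiGHS.Optimizer,
) do sp, node
    t, price_state = node       # node is a (stage, Markov state) tuple
    @variable(sp, 0 <= soc <= 100, SDDP.State, initial_value = 50)
    @variable(sp, 0 <= charge <= 10)
    @variable(sp, 0 <= discharge <= 10)
    @constraint(sp, soc.out == soc.in + charge - discharge)
    @stageobjective(sp, prices[price_state] * (discharge - charge))
end

SDDP.train(model; iteration_limit = 50)
```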

@Remmy195
Author

This might sound crazy, but I have used 20,000 nodes for the graph.

@odow
Owner

odow commented Oct 28, 2023

> If I understand correctly, suggestion 2 implements a lookahead that extends to the end of the planning horizon at every re-optimization?

Suggestion 1 is to build a sequence of SDDP problems which each contain 72 stages, and the information from one graph is not shared with another.

Suggestion 2 is for you to build a model with 8760 stages and train. But because things that happen deep in the graph have little effect on the early decisions, by the time you take a few steps your policy might be sub-optimal. Therefore, you can retrain the model---using the same graph without changing anything---from your new state variable and starting stage.

> This might sound crazy, but I have used 20,000 nodes for the graph.

😆 this is a bit crazy. We build a JuMP model for every node in the graph, so you have 20,000 JuMP models in your computer. Don't you run out of memory?

@Remmy195
Author

Will implement suggestion 1, thank you!

I use supercomputing for this model. Memory was an issue until I figured out that I needed at least 60 GB of RAM, depending on the number of simulations of the policy.

@odow
Owner

odow commented Oct 28, 2023

> Will implement suggestion 1, thank you!

Great. In which case, I don't know if we need to do anything here. SDDP acts just like any other kernel you might use in a rolling horizon problem.

> I use supercomputing for this model. Memory was an issue

Cool cool. I'm still surprised that it worked! The biggest model I'd previously solved had something like 2,000 nodes, not 20,000.

@Remmy195
Author

> Will implement suggestion 1, thank you!
>
> Great. In which case, I don't know if we need to do anything here. SDDP acts just like any other kernel you might use in a rolling horizon problem.
>
> I use supercomputing for this model. Memory was an issue
>
> Cool cool. I'm still surprised that it worked! The biggest model I'd previously solved had something like 2,000 nodes, not 20,000.

Totally works!

@odow
Owner

odow commented Oct 29, 2023

Closing because this seems resolved, and I don't think there is anything to do here. (Please re-open if I've missed something.)

@odow odow closed this as completed Oct 29, 2023