Feature Request: Second Derivatives for User Defined Functions #1198
I'd be curious if you could point to one or a couple of examples in ML/statistics where there's a significant benefit from using methods that require second-order derivatives over the first-order methods that JuMP already supports with user-defined functions. This would help justify the implementation effort. (But either way, I don't expect to spend time on this in the next few months given everything else going on with JuMP development.)
Thanks for your prompt response, and thanks for creating this amazing package. Virtually all forms of machine learning and statistics involve choosing an optimal parameter theta such that an in-sample or out-of-sample loss function L(y,f(x,theta)) is minimized, where x is data being used to model or predict y. The predictive/modeling function f(x,theta) can be very complex to articulate because it is often designed either to be very flexible or to reflect the rules of a true data-generating process. One example is neural nets, which use an f(x,theta) that is the iterated composition of many linear and nonlinear functions as a flexible predictive function. In biostatistics, the data-generating process f(x,theta) often requires computing the evolution of a complex system. In economics, f(x,theta) can involve computing equilibrium strategies for the firms or individuals that generated the data (this may even require solving for a fixed point of a multi-function or nested optimization problems). I'm also happy to provide specific references to examples in some of these fields.
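For concreteness, a minimal sketch of that structure in Julia (the names and the particular functional form are illustrative, not from any specific application): `predict` plays the role of f(x, theta) and `loss` the role of L(y, f(x, theta)).

```julia
# Illustrative only: a composed, "neural-net-like" f(x, θ) and a squared-error loss.
predict(x, θ) = θ[3] .* tanh.(θ[1] .* x .+ θ[2])   # f(x, θ)
loss(θ, x, y) = sum(abs2, y .- predict(x, θ))      # L(y, f(x, θ))

# The estimation problem is min over θ of loss(θ, x, y); the request in this
# issue is for JuMP to supply exact second derivatives of such a user-defined loss.
```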
As for why one needs second-order methods: in some cases, L(y,f(x,theta)) may have low dimension (in theta) but take an extremely long time to compute (potentially on the order of minutes, hours, or even longer). In such cases, second-order methods can help reduce the number of function evaluations required. Additionally, it is not uncommon for L(y,f(x,theta)) to have substantial cross-partials in theta, meaning that exact second-order methods will do much better at finding the true argmin than Hessian approximations.
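For reference, the reasoning above is about the Newton step, written here in generic notation:

```latex
% Newton step on L(\theta) := L(y, f(x, \theta)), using the exact Hessian:
\theta_{k+1} = \theta_k - \left[\nabla^2_{\theta} L(\theta_k)\right]^{-1} \nabla_{\theta} L(\theta_k)
% Quasi-Newton methods (BFGS, L-BFGS) replace \nabla^2_{\theta} L(\theta_k) with an
% approximation B_k built from gradient differences; when L has large cross-partials
% in \theta, B_k can model the curvature poorly and convergence slows.
```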
Just want to second @UserQuestions here - there are a number of cases in economics (at the least) that require two-step optimization, where the inner ("first") step requires solving contraction mappings, etc., and may not be feasibly described in JuMP syntax - but the outer optimization would see serious improvements from being able to use second derivatives.
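A hedged sketch of that pattern (all names and the particular map are illustrative): each evaluation of the outer objective solves an inner fixed point v = T(v; θ) by iteration, so the objective has no closed-form JuMP expression.

```julia
# Inner ("first") step: iterate a contraction T(v; θ) = 0.5*cos(v) + θ to its fixed point.
function solve_fixed_point(θ; tol = 1e-10, maxiter = 10_000)
    v = 0.0
    for _ in 1:maxiter
        v_new = 0.5 * cos(v) + θ              # |∂T/∂v| ≤ 0.5, so this converges
        abs(v_new - v) < tol && return v_new
        v = v_new
    end
    return v
end

# Outer step: the statistical objective depends on θ only through the fixed point.
outer_loss(θ, target) = (solve_fixed_point(θ) - target)^2
```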
More than two years later, I want to third @UserQuestions. It would be a game changer if JuMP could do optimization with Hessians, and especially if the Hessian were a sparse array. CasADi, a second-order optimizer with autodiff based on Ipopt, is the only thing that keeps me in Python...
Here's an example from Discourse where Ipopt failed to converge without the second-order information on the problem as formulated by the user: https://discourse.julialang.org/t/nonlinear-objective-function-splatting/51251 (However, it could be reformulated and solved with first-order information only.) |
For all of you curious about this issue: one can pass the function, gradient, and Hessian directly to Ipopt.jl, without using JuMP and MathOptInterface. The documentation on how to do it is somewhat hidden, but it can be found. In my tests, one could use AD tools in the definition of the functions (such as ForwardDiff.jl or Zygote.jl) or use ModelingToolkit.jl to compile the gradient and Hessian. I also think that ComponentArrays.jl could be used, which would make the definition of the functions easier, but I have not tested it. I personally fail to understand why developers continue pushing MathOptInterface for nonlinear problems. It is true that it does a great job for convex problems, but for nonlinear problems it falls so far behind what one needs (not being able to use vectors, not being able to pass the Hessian) that I would classify it as experimental at this point.
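For readers who want to try this route, here is a minimal unconstrained sketch, assuming the C-wrapper API in recent Ipopt.jl (`CreateIpoptProblem` / `IpoptSolve`; older releases spelled these `createProblem` / `solveProblem`), with ForwardDiff.jl supplying the gradient and the exact Hessian. The callback signatures below follow the Ipopt.jl README; check the README for the version you have installed.

```julia
using Ipopt, ForwardDiff

f(x) = (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2   # toy Rosenbrock-style objective

n = 2                                   # number of variables
m = 0                                   # no constraints
x_L = fill(-10.0, n); x_U = fill(10.0, n)
g_L = Float64[];      g_U = Float64[]

eval_f(x) = f(x)
eval_g(x, g) = nothing                  # no constraints to fill in
eval_grad_f(x, grad) = copyto!(grad, ForwardDiff.gradient(f, x))
eval_jac_g(x, rows, cols, values) = nothing   # empty constraint Jacobian

# Dense lower triangle of the Hessian of the Lagrangian (here just obj_factor * ∇²f).
function eval_h(x, rows, cols, obj_factor, lambda, values)
    if values === nothing               # first call: report the sparsity structure
        k = 1
        for i in 1:n, j in 1:i
            rows[k] = i; cols[k] = j; k += 1
        end
    else                                # later calls: report the values
        H = ForwardDiff.hessian(f, x)
        k = 1
        for i in 1:n, j in 1:i
            values[k] = obj_factor * H[i, j]; k += 1
        end
    end
    return
end

prob = Ipopt.CreateIpoptProblem(
    n, x_L, x_U, m, g_L, g_U, 0, div(n * (n + 1), 2),
    eval_f, eval_g, eval_grad_f, eval_jac_g, eval_h,
)
prob.x = [0.0, 0.0]                     # starting point
status = Ipopt.IpoptSolve(prob)
```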
We're aware of the current NLP limitations of MOI: jump-dev/MathOptInterface.jl#846 We encourage people to use JuMP because many users already know the JuMP syntax, but they won't know about the specifics of computing gradients and Hessians. If you have specific needs that JuMP isn't meeting, it probably isn't the right tool for the job, and you should consider other options.
I think there's plenty of agreement on what the areas for improvement are for JuMP and MOI with regard to nonlinear optimization.
I'm not sure what you mean by "pushing". We all want the state of the art to improve and are more than happy to point people to the best tools for the job. For example, I've been advocating for a CasADi interface in Julia since 2014: casadi/casadi#1105. |
It would be extremely helpful for JuMP to support second derivatives for user-defined functions. Ideally this could be done as efficiently as ReverseDiffSparse, but even just calling ForwardDiff.hessian! would be a helpful option. There is a broad class of problems that require optimizing functions that do not easily translate into the typical JuMP syntax (especially in ML/statistics), so having JuMP handle such cases would be a huge benefit to people working on those problems and would greatly expand the number of potential JuMP users.
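To make the request concrete, a hedged sketch assuming a JuMP version with the legacy nonlinear interface (`register` / `@NLobjective`); the toy objective and names are illustrative. Today the `autodiff = true` path produces gradients only, so Ipopt falls back to a quasi-Newton Hessian approximation; the last two lines show that `ForwardDiff.hessian!` can already compute the exact Hessian that this feature request would like JuMP to forward to the solver.

```julia
using JuMP, Ipopt, ForwardDiff

# A user-defined objective that is awkward to write as an algebraic JuMP expression.
loss(θ...) = (θ[1] - 1.0)^2 + (θ[2] - 2.0)^2 + sin(θ[1] * θ[2])

model = Model(Ipopt.Optimizer)
@variable(model, θ[1:2])

# Status quo: registering with autodiff = true gives first derivatives only.
register(model, :loss, 2, loss; autodiff = true)
@NLobjective(model, Min, loss(θ[1], θ[2]))
optimize!(model)

# What the request would forward to the solver: an exact (here dense) Hessian.
H = zeros(2, 2)
ForwardDiff.hessian!(H, v -> loss(v...), value.(θ))
```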