Debugging your optimizations part 1
Hi everybody, John Jasa here and today we’re talking about debugging your optimizations. Now I’m going to try something new today. Because this is such a long and detailed topic I’m going to split this up into multiple different lectures, we’ll have multiple different parts. I don’t know how many yet, maybe three, four, and I just want to cover the first few today. So let’s talk about this.
When you set up your optimization and it doesn’t converge or it doesn’t give you a reasonable answer, you want to figure out what’s going on. I will present 11 different tips to help you debug your optimizations step by step. If you’ve already seen the how to debug your solvers lesson, this is a lot like that, but it’s different. It’s like that in that I will show you the steps in a kind of suggested order. I think you should start with 0, 1, and go on until you’ve tried all 11 if need be. Maybe just hashing through the first few will be enough to get you to a reasonable stopping point.
I never expected this but a huge portion of my life has been spent debugging optimizations. It’s more of an art than a science sometimes. So why don’t we get started?
Today I’ll just walk through the first few points in this notebook, really focusing on the step-by-step instructions and some sample code to help you do this for your own problems. We’ll now transition to kind of focusing on this notebook. Before we start on the actual checklist I want to provide a little bit of context. Most of these tips are kind of focused on gradient-based optimization. Now some of them will have to do with optimization in general but we’re certainly not focused on gradient-free optimization.
So that being said let’s start talking about the checklist. Some of these items will be focused on diagnosing what’s going on with your optimization. Now of course it’d be nice if after diagnosing we could solve the issues and so some of them are focused on solving the issues. I’ll try to highlight when I’m pointing out things that could help you diagnose your problem versus things that could help solve your problem.
So our first point here is to understand your model inside and out. Now you may think you understand your model but I mean really get in there understand all the nooks and crannies. Look at the edges of the design space. There are so many nuances, especially multidisciplinary models, especially when thinking about trade-offs between different systems, that you may not have considered. A very intuitive way to investigate your model and learn more about it is to perform a parameter sweep or design of experiments. This can help you kind of understand the design space more, can also help you see, okay, is it multimodal, is it bumpy, is it smooth, what can I expect here. So I have a link here to the design of experiments doc page for OpenMDAO but I also have an example code here of using a DOE. So this is for a very simple paraboloid optimization problem. If we have x on the x-axis we just sweep through different values of x and take a look at the objective output. Now again because this is so simple, it’s for a paraboloid, we kind of know the shape that it should be. But if we start an optimizer over here we kind of expect it to nestle down into an x value over here. Again this is for a simple, in this case 2D, problem, or only visualizing one dimension. But you can do this for n dimensions. You can imagine holding fixed a lot of your design variables and kind of sweeping through another set and seeing how your objective function changes. Now of course the more expensive your computational analysis is the more challenging it is to do a very minute parameter sweep.
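To make the sweep idea concrete, here is a minimal, library-free sketch in plain Python. It is not the OpenMDAO DOE example from the notebook: the `paraboloid` formula, the `sweep` helper, the sample range, and the fixed value of `y` are all assumptions chosen for illustration.

```python
def paraboloid(x, y):
    """Toy objective (assumed form, for illustration only)."""
    return (x - 3.0) ** 2 + x * y + (y + 4.0) ** 2 - 3.0


def sweep(func, x_values, y_fixed):
    """Evaluate func at each x while holding y fixed; return (x, f) pairs."""
    return [(x, func(x, y_fixed)) for x in x_values]


# 21 evenly spaced samples of x in [-10, 10], with y held at 0 (assumed values).
x_values = [-10.0 + i for i in range(21)]
results = sweep(paraboloid, x_values, y_fixed=0.0)

for x, f in results:
    print(f"x = {x:6.1f}   f = {f:8.2f}")

# The lowest sampled point hints at where an optimizer should settle.
best_x, best_f = min(results, key=lambda pair: pair[1])
print(f"lowest sampled objective: f = {best_f:.2f} at x = {best_x:.1f}")
```

In a real model you would swap `paraboloid` for a call that runs your analysis, and the cost of each sample is what limits how fine the sweep can be.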
Real example time! So here I wanted to sweep across different Mach numbers, or speeds, for an aircraft and look at how different drag terms change across different speeds. We see that bleed drag for the engine, this is the second one from the top, changes very smoothly. It’s kind of flat and then it slowly ramps up and then levels out. But we see that the top one right here, d_spillage, actually has a bump at Mach 0.8. Maybe you knew about this bump or maybe you didn’t, but you can imagine your optimizer kind of exploring the design space, finding this higher drag part of the design space, and saying “well I don’t want to be there, that’s more drag than I want.” It would move off of that bump. Now again without this intuition, without knowing about this in your model, you would not be able to interpret what’s going on in the optimizer correctly.
Scrolling back up I have more tips here. One is to use the N squared diagram. Let’s check that out. So here’s a link to the docs about basics of creating N2 model visualizations and the whole idea here is that we can look at our model, see what’s connected, see what’s not connected, and really understand “okay, this was supposed to be here but it’s not.” Now again you might think everything is hooked up correctly, but it might not be grouped, it might not be connected in the way you expect. It really pays to look at it. For me, as kind of a visuals focused person, this is so helpful.
Now another tip is to use list outputs. This means that you will look at the outputs from your initial point and make sure that they make sense. Maybe you thought you were giving the optimizer your twist in degrees but you’re actually giving it in radians. Oops! But you can check that. If we scroll down here, right past the DOE example, we have an example that uses list outputs. So in this case we have the circuit example and then at the end we say list outputs and this allows us to see, okay, here’s the resistance for the circuit, here’s the voltage, does this make sense, does this not make sense. You kind of get a feel for the intuitive numbers. Additionally you can filter out different outputs from your list outputs call and only look at what you’re really concerned about in your model.
Scrolling back up now I have a few other tips. If you’re getting solver convergence issues check out how to debug solvers. I’ve referenced this in the intro but it’s another kind of video series and set of lessons that helps you understand how to debug solving your model. If you’re not getting solver convergence you should definitely start there and try to figure out what’s going on there.
Another tip is to add lower and upper bounds to outputs in your model. Now this is different than adding bounds to your design variables. For example, if we know that aircraft mass should always be greater than zero, and it should be because it’s a mass, we can set a lower bound of zero. For some kind of solver where we’re converging the aircraft mass this could help prevent the solver from exploding or failing in lots of different ways. Setting realistic bounds on your outputs can be helpful.
My last tip here for understanding your model is to use the error on non-convergence option in solvers. This is kind of a detail but the idea is that if a solver doesn’t converge you want your model to error; you want it to say “hey I didn’t converge here, please stop doing this optimization.” So this actually really helps because you can see, okay, if we didn’t converge I need to make the model more robust. I need to make sure that I can converge within this space. Or maybe you see that the optimizer is pushing your model somewhere where you’re not even expecting it to be evaluated. Then when it errors on non-convergence you can see, oh my god, we’re way outside what I expected would happen. I need to change some of the design variable bounds.
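Here is a small sketch of why erroring on non-convergence matters, using a toy fixed-point solver in plain Python. Everything in it is hypothetical: `fixed_point_solve`, `NonConvergenceError`, and the two test functions are made up for illustration, and the `err_on_non_converge` flag only mimics the idea of the solver option discussed above, it is not OpenMDAO's implementation.

```python
import math


class NonConvergenceError(RuntimeError):
    """Raised when the iteration limit is hit before the tolerance is met."""


def fixed_point_solve(g, x0, tol=1e-10, max_iter=200, err_on_non_converge=True):
    """Iterate x <- g(x) until successive values differ by less than tol."""
    x = x0
    for iteration in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, iteration
        x = x_new
    if err_on_non_converge:
        raise NonConvergenceError(
            f"no convergence after {max_iter} iterations (last x = {x!r})"
        )
    return x, max_iter  # silently hands back an unconverged value


def oscillating(x):
    """Has a fixed point at 0.5, but plain iteration from 0 bounces forever."""
    return 1.0 - x


# A well-behaved case: x = cos(x) converges.
converged, n_iter = fixed_point_solve(math.cos, 1.0)
print(f"cos fixed point: {converged:.8f} after {n_iter} iterations")

# Without the error, the caller gets a wrong state and no warning.
silent_value, _ = fixed_point_solve(oscillating, 0.0, err_on_non_converge=False)
print(f"silent, unconverged result: {silent_value} (true fixed point is 0.5)")

# With the error, the run stops at the point where things went wrong.
error_raised = False
try:
    fixed_point_solve(oscillating, 0.0)
except NonConvergenceError as exc:
    error_raised = True
    print("stopped:", exc)
```

The silent path is the dangerous one: an optimizer fed that unconverged value would keep going with garbage, which is much harder to trace later than an immediate error.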
So again all of these tips are really focused on just the nuts and bolts, understanding your model, not even talking about optimization, but just talking about what you think you’re modeling, what you’re actually modeling, and everything in between.
Okay my next tip is a doozy. It’s to exhaustively check your derivatives at multiple design points. Let me get into this. I don’t just mean check your derivatives. I don’t mean check them in one point. I mean check them all over your design space, make sure that they are correct, and really get into the details of your partial and total derivatives. There are so many fantastic OpenMDAO doc pages that already exist so I don’t have to write anything about them. So this page is all about working with derivatives and understanding what’s going on with them. All of these right here are focused on determining if your partial derivatives are correct, and then also we’re looking at our total derivatives. I’ll briefly go through some aspects of these but I certainly won’t belabor all the points contained within these other doc pages.
Let’s go back here. So my first tip on this one, I don’t know, it sounds dumb, but I mean it. We need to see if we have a linear solver on our model. If any system is missing a linear solver, but you need a linear solver to converge your derivatives, you should add one. Again if you just jumped to debugging your optimization and you haven’t checked out some of the other lessons, that’s okay, but just know that linear solvers are necessary. They are what compute the derivatives for your systems. Anytime you have a nonlinear solver you need a linear solver as well. Make sure it’s converging, make sure it’s actually solving for these derivatives correctly.
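A rough sketch of why a nonlinear solve goes hand in hand with a linear solve: for a state `y` defined implicitly by a residual `R(x, y) = 0`, the derivative comes from the linear system `(dR/dy) * dy/dx = -dR/dx`. The residual below and all function names are assumptions made up for illustration. With a single state the "linear solve" is just a division; with many coupled states it becomes a matrix solve, which is the job a linear solver does.

```python
def residual(x, y):
    """Implicit model (assumed form): R(x, y) = y**3 + y - x."""
    return y ** 3 + y - x


def dR_dy(x, y):
    return 3.0 * y ** 2 + 1.0


def dR_dx(x, y):
    return -1.0


def solve_state(x, y0=0.0, tol=1e-12, max_iter=50):
    """Nonlinear solve: Newton's method on R(x, y) = 0 for the state y."""
    y = y0
    for _ in range(max_iter):
        r = residual(x, y)
        if abs(r) < tol:
            return y
        y -= r / dR_dy(x, y)
    raise RuntimeError("Newton did not converge")


x = 2.0
y = solve_state(x)

# Linear solve at the converged state: (dR/dy) * dy/dx = -dR/dx.
dy_dx = -dR_dx(x, y) / dR_dy(x, y)

# Cross-check by re-converging the nonlinear model at a perturbed input.
h = 1e-6
dy_dx_fd = (solve_state(x + h) - solve_state(x)) / h

print(f"state y = {y:.6f}")
print(f"dy/dx from linear solve      = {dy_dx:.6f}")
print(f"dy/dx from finite difference = {dy_dx_fd:.6f}")
```

Note that the linear solve reuses the converged state, so it only gives a meaningful derivative if the nonlinear solve actually converged first.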
Now my next tip here, and it’s a deep rabbit hole that you can fall in, is to check partials. What I mean by this is verify that all of the components in your subsystems return the correct partial derivatives. OpenMDAO can help you do this in a few different ways. You can use finite difference, you can use complex-step, and there are multiple built-in methods to help you understand what your partial derivatives are doing. Here’s the main one which is problem.check_partials. This allows you to look at your partial derivatives, check them against these finite differencing or complex-step methods, and really drill down in the details of what’s going on. I will scroll down here and take a look at some of the outputs. Here we have a very simple component. It’s got x1 and x2 as inputs and y as an output. And then here in the compute partials we have intentionally incorrect derivatives. Now at the end after running the model we call check partials. If we scroll down here we get some nice output here. When I say nice I mean informative, I don’t mean it’s correct. We take a look at the raw analytic derivatives. So these are the derivatives that we provided. It’s four, and we take a look at the finite difference derivative, so what OpenMDAO is doing behind the scenes to kind of check our partial derivatives. We see it’s three. Well, this is a problem! Four is not equal to three. We take a look at the relative error, it’s non-zero, the absolute error is one, right, four minus three is one, okay. So we know our derivative is wrong. Here again this is a problem. If we look at the output here it shows us the components, it shows us the outputs with respect to the inputs. Also the absolute and relative error. There’s a whole host of information here that can help you debug your partial derivatives.
Now that’s not all. This is for a very simple component. You may have a more complex model. Another option is to show only incorrect derivatives. You may have 100 derivatives that are all correct and you have one that’s wrong and you want to really zoom in on just the wrong one. Now personally I love seeing all the right ones, it makes me feel good, it makes me say “hey, I did something right here.” With that being said, let’s talk about okay, if we only show the incorrect ones then this is what the output looks like. It only says, okay, we’re only writing information about components with incorrect Jacobians or derivatives. This allows us to really focus only on the problem ones.
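The kind of check described above can be sketched in plain Python, without OpenMDAO. This is not how `check_partials` is implemented; it only illustrates the comparison it reports. The form `y = 3*x1 + 4*x2`, the evaluation point, the tolerance, and every helper name here (including the `show_only_incorrect` flag on this toy function) are assumptions, picked so that the hand-coded value of four disagrees with a numerical value of three.

```python
def compute(x1, x2):
    """Toy component output (assumed form): y = 3*x1 + 4*x2."""
    return 3.0 * x1 + 4.0 * x2


# Hand-coded partials of y. The x1 entry is intentionally wrong (should be 3.0).
analytic = {"x1": 4.0, "x2": 4.0}


def finite_difference(func, point, index, step=1e-6):
    """Forward finite-difference estimate of d(func)/d(point[index])."""
    perturbed = list(point)
    perturbed[index] += step
    return (func(*perturbed) - func(*point)) / step


def complex_step(func, point, index, step=1e-30):
    """Complex-step estimate; needs func to accept complex inputs."""
    perturbed = [complex(value) for value in point]
    perturbed[index] += 1j * step
    return func(*perturbed).imag / step


def check_partials_sketch(func, analytic, names, point, tol=1e-6,
                          show_only_incorrect=False):
    """Compare hand-coded partials against both numerical estimates."""
    rows = []
    for index, name in enumerate(names):
        fd = finite_difference(func, point, index)
        cs = complex_step(func, point, index)
        abs_err = abs(analytic[name] - fd)
        rel_err = abs_err / max(abs(fd), 1e-30)
        if abs_err > tol or not show_only_incorrect:
            rows.append({"wrt": name, "analytic": analytic[name], "fd": fd,
                         "cs": cs, "abs_err": abs_err, "rel_err": rel_err})
    return rows


point = (3.0, 5.0)  # arbitrary evaluation point
full_report = check_partials_sketch(compute, analytic, ["x1", "x2"], point)
wrong_only = check_partials_sketch(compute, analytic, ["x1", "x2"], point,
                                   show_only_incorrect=True)

for row in full_report:
    print(f"dy/d{row['wrt']}: analytic = {row['analytic']:.4f}, "
          f"fd = {row['fd']:.4f}, cs = {row['cs']:.4f}, "
          f"abs error = {row['abs_err']:.2e}, rel error = {row['rel_err']:.2e}")
print("incorrect partials:", [row["wrt"] for row in wrong_only])
```

The full report lists every derivative, right or wrong; the filtered one keeps only the entries whose error exceeds the tolerance, which is what you want when one bad derivative is hiding among a hundred good ones.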
So going back, I want to rehash that checking your partials can be a very time intensive process. If you have a large number of components or subsystems this could take a long time, but if you see okay, I think my partials might be wrong in this one component, this is kind of a tricky one to differentiate, I’m going to focus on this. You can kind of break down your problem and think about the most likely suspects.
Now beyond just checking your partials you should also check your total derivatives. So let me click through on this one as well. The idea here is that you need to care about the derivatives of your functions of interest with respect to your design variables. So whereas partials are just for components or subsystems in your model, total derivatives are for your entire problem, your entire model. Again these total derivatives are what’s being passed to the optimizer. And so we need to care about the derivative of the objective with respect to the design variables. There’s even a handy note here that says you probably shouldn’t use this unless you’ve checked partials. Again, the partials make up the total derivatives; the total derivatives are computed via the partials. Because of this it makes sense to get all the partials right before jumping to the totals.
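To show how partials combine into a total, here is a minimal two-component sketch in plain Python. The components, their formulas, and the function names are assumptions for illustration only. The total derivative is assembled from the partials with the chain rule and then compared against a complex-step derivative taken through the whole model.

```python
import cmath
import math


def comp1(x):
    """First component: y = x**2."""
    return x * x


def comp1_partial(x):
    """dy/dx for comp1."""
    return 2.0 * x


def comp2(x, y):
    """Second component: f = sin(y) + 3*x (cmath so complex inputs work)."""
    return cmath.sin(y) + 3.0 * x


def comp2_partials(x, y):
    """(df/dx, df/dy) for comp2, each holding the other input fixed."""
    return 3.0, math.cos(y)


def model(x):
    """Whole model: design variable x in, objective f out."""
    return comp2(x, comp1(x))


def total_from_partials(x):
    """Chain rule: df/dx (total) = df/dx (partial) + df/dy * dy/dx."""
    y = comp1(x)
    df_dx, df_dy = comp2_partials(x, y)
    return df_dx + df_dy * comp1_partial(x)


def total_complex_step(x, step=1e-30):
    """Complex-step derivative through the entire model."""
    return model(complex(x, step)).imag / step


x = 1.2  # arbitrary design point
total_analytic = total_from_partials(x)
total_cs = total_complex_step(x)

print(f"total from partials = {total_analytic:.12f}")
print(f"total, complex step = {total_cs:.12f}")
print(f"difference          = {abs(total_analytic - total_cs):.2e}")
```

If either component's partial were wrong, the assembled total would be wrong too, which is why it pays to get the partials right before looking at totals.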
Now going back here, this last point here, I really want to drill into. Make sure you check your derivatives at multiple points in the design space, not just your initial point. Again the optimizer might call your model in weird combinations of design variables. It might twist certain parts of your wing, it might kind of change the wind turbine tower to look funky. You need to make sure that your derivatives are correct at any and every point in the design space. Again you need to know they’re correct there so the optimizer says “oh my gosh, this is a terrible design point, I would like to get away from it.” Given correct derivative information optimizers can work their way out of so many tricky bits.
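Here is a small sketch of why a single check point is not enough. The function and its buggy derivative are made up for illustration: the hand-coded derivative is correct on one branch of a piecewise function and wrong on the other, so a check at the starting point alone passes. The point list and tolerance are also assumptions.

```python
def f(x):
    """Piecewise but smooth: quadratic below x = 1, linear at and above it."""
    return x * x if x < 1.0 else 2.0 * x - 1.0


def df_dx_coded(x):
    """Hand-coded derivative that forgot the linear branch (bug for x >= 1)."""
    return 2.0 * x


def central_difference(func, x, step=1e-6):
    """Central finite-difference estimate of d(func)/dx."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


def check_at_points(points, tol=1e-6):
    """Return the points where the coded derivative disagrees with the estimate."""
    failing = []
    for x in points:
        coded = df_dx_coded(x)
        estimate = central_difference(f, x)
        status = "ok" if abs(coded - estimate) <= tol else "WRONG"
        print(f"x = {x:5.2f}: coded = {coded:7.3f}, fd = {estimate:7.3f}  {status}")
        if status == "WRONG":
            failing.append(x)
    return failing


initial_point = 0.5
print("check at the initial point only:")
failing_initial = check_at_points([initial_point])

print("check across the design space:")
failing_all = check_at_points([-2.0, 0.5, 2.0, 5.0])
```

The check at the initial point reports nothing wrong, while the wider set of points exposes the bug for `x >= 1`, exactly the region an optimizer might wander into.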
Okay now on the right you see, okay, we have zero, one through ten. There are a lot more here but I’m going to call it for this lesson and so we can fit many more into different parts. And if you have any questions or suggestions please let me know so I can handle them in the upcoming lessons. As always please hit those like and subscribe buttons and guys, gals, and non-binary pals; thank you for watching.