What solver convergence looks like video transcript
Hello and welcome to another exciting lesson in the Practical MDO series. Today we’re going to be talking about what solver convergence looks like. This is one of many topics related to solvers and I’m excited to get into the details with you.
So let's talk about what solving a system actually means, what it means to converge a system. By that I mean that the residuals of the system go to zero, or at least below a certain tolerance. How we do that depends on which solver you use, but in general we want to obtain convergence as quickly and as cheaply, computationally, as possible.
This topic falls firmly under the modeling focus of this course. We'll go through a few topics here. I'll first introduce the basic idea of convergence, then I'll show some graphical examples of how to tell when a system is converged, what that means, and how you can change it. And then lastly I'll give some examples using OpenMDAO code to show you what convergence looks like in the terminal and what that means for your models.
So first let's start with the basic idea of convergence. What do I mean by convergence? I'm going to show a general linear system here. This might be something you recognize from undergrad: Ax = b, where A is a matrix, x is a vector, and b is a vector as well. We have a matrix multiplying a state vector, x, and we're trying to make that product equal the right-hand side, b. When I say a linear system, this is the general form I mean: all the coefficients live in the A matrix, the states live in the x vector, and b is the right-hand-side vector.
Now, I'm talking about linear systems because they're easier to understand; you've probably seen them in undergrad math and other courses. But I want to be extremely clear here: solvers are useful for both nonlinear and linear systems, and you need them for both. Here's an example of a nonlinear system. We have three equations and three unknowns, and they're not linearly related: there are different powers, square roots, terms multiplying each other. This is a more complex system to converge than a linear one. You could try to solve it by hand for x, y, and z, which could be very challenging, or you could converge it computationally using a nonlinear solver.
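To make that concrete, here is a small sketch of converging a made-up three-equation nonlinear system (not the exact system shown in the video) by driving its residuals to zero with SciPy's fsolve:

```python
import numpy as np
from scipy.optimize import fsolve

# A made-up nonlinear system with a root at (x, y, z) = (1, 2, 3), written in
# residual form so that residuals(v) == [0, 0, 0] at the solution.
def residuals(v):
    x, y, z = v
    return [x**2 + y * z - 7.0,                 # x^2 + y*z = 7
            np.sqrt(x + 3.0) + y - z - 1.0,     # sqrt(x + 3) + y - z = 1
            x * y * z - 6.0]                    # x*y*z = 6

guess = [1.5, 1.5, 2.5]                  # a reasonable starting point
solution = fsolve(residuals, guess)
print(solution, residuals(solution))     # residuals near zero mean we converged
```

Whether fsolve actually lands on that root depends on the starting guess, which is exactly the theme of the rest of this lesson.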
But let's go back to this linear system, Ax = b. I'm going to rearrange it slightly by subtracting b from both sides, which lets us write Ax - b = 0, and we call Ax - b the residual of x. At the beginning I mentioned that residuals are what we're trying to drive to zero, and that holds true here: once we converge the residual to zero, we know the solution to Ax = b, that is, we know what x should be. I want to stress again that this particular form is only for linear systems, but solvers apply to both linear and nonlinear systems, so carry this idea of converging the residual to zero with you for any type of system.
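Here's a tiny sketch of that residual check for a linear system (the matrix and vector are just placeholders I made up):

```python
import numpy as np

# Residual of a linear system: r(x) = A @ x - b. At the solution, r is numerically zero.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.linalg.solve(A, b)       # direct solve, for reference
r = A @ x - b                   # residual vector
print(np.linalg.norm(r))        # "converged" means this norm is below our tolerance
```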
So now, how do we tell when a system is converged? I've got a graphical representation here with the residual on the y-axis and the solver iteration on the x-axis. These are just some dummy problems I pulled from the OpenMDAO documentation to highlight what this means. When I say converge, I mean that we're generally trying to hit a tolerance, not exactly zero. By that I mean we will accept a little bit of inaccuracy in the solver solution, maybe because of machine precision or because of how accurate we actually need the answer to be. So we're trying to get the residual down to this tolerance level, the value we're willing to accept as zero. What does that look like? Let's talk about it.
So here's an example of a Newton solver that has a bad guess (I'll get more into this later), but just know for now that you can set up a solver, ask it to solve a system, and it can fail to do so. It can fail; it can essentially say "hey, I don't know what to do here." In this case the residual actually decreases a little bit and then drastically increases; it blows up. What this means is that the initial guess we gave the Newton solver was bad. We didn't set the solver up for success; we asked it to solve a system without starting close enough to the solution to know where to go. This is because Newton's method is based on a local linearization of the system, which is only informative near the solution. I'll have a link in the description to more on the math behind Newton's method and a few other related lessons. But that's what convergence doesn't look like.
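If you want to see this blow-up for yourself outside of any framework, here is a minimal sketch of a plain Newton iteration on the classic one-equation example R(x) = arctan(x), which converges from a good guess and diverges from a bad one (this is my own toy example, not the problem from the plots):

```python
import numpy as np

def newton_arctan(x0, tol=1e-10, maxiter=15):
    """Newton's method on R(x) = arctan(x), printing the residual each iteration."""
    x = x0
    for k in range(maxiter):
        r = np.arctan(x)
        print(f"iter {k:2d}   x = {x: .3e}   |R| = {abs(r):.3e}")
        if abs(r) < tol:
            print("converged")
            return x
        x -= r * (1.0 + x**2)    # Newton step: x - R(x)/R'(x), with R'(x) = 1/(1 + x^2)
    print("did NOT converge")
    return x

newton_arctan(0.5)   # good guess: the residual drops below the tolerance in a few iterations
newton_arctan(5.0)   # bad guess: the iterates overshoot and wander off; the residual never converges
```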
Now let's look at another example from the same problem, where the residual converges for a little while but then stalls: it gets stuck at a certain point and can no longer change x to drive the residual any lower. We see this often when the solver is bounded by something; maybe one of the state variables has a bound on it, like "the voltage cannot go below zero" or "the current cannot go below zero," and the solver gets stuck there. What this means is that your problem may be ill-formulated; it may not be solvable, and again you may not be setting the solver up for success. Here you would want to change your bounds, or change something else about the formulation. But this is another example of what convergence doesn't look like. Again, we're trying to hit this tolerance; if the residual stalls out up here, you're never going to hit that tolerance, and that's not a great sign.
Let's take another stab at this. Here is Newton's method converging with solve_subsystems turned off. This is a detail; you don't need to worry about some of the options I'm showing here. If you know about them, great, but you don't need to. Just know that we're tweaking options to try to get the system to converge. Here we start with a higher residual and we're actually able to converge; the residual decreases over the iterations. I want to highlight again that the y-axis is logarithmic. So what this means is that after 15 iterations we hit the tolerance we're aiming for with the Newton solver. It's making good progress every iteration; we like what we're seeing here. I would consider this fully converged.
But can we do better? As I mentioned at the beginning, we're always trying to do better; we're always trying to converge faster. So here we have Newton's method with solve_subsystems turned on. Again, I have a link in the description with more information about what these options mean and when to use them. For now, just know that we're tweaking the options and that changes how the system converges. Here we see the residual decrease after the first step, go up a little, then come back down, and eventually we hit our tolerance much faster than before. We can continue to tweak our settings and add a linesearch to the Newton method. That adds a little computational cost per iteration, but it allows the system to converge in even fewer iterations.
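For reference, the two knobs being turned in these plots look roughly like this in OpenMDAO (a sketch only; you would attach this solver to your own group or model):

```python
import openmdao.api as om

# The options being tweaked in the plots above, in OpenMDAO terms.
newton = om.NewtonSolver(solve_subsystems=True)   # let subsystems re-solve inside each Newton iteration
newton.linesearch = om.ArmijoGoldsteinLS()        # damp steps that would make the residual worse
```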
Now, I want to stress that the x-axis here shows iterations, not computational cost; it's simply the iteration count of the nonlinear solver. If we plotted against computational cost instead, each of these curves would shift by a different amount, because the cost of a single iteration depends on the settings you use.
So why am I showing you five different ways the solver does or doesn't converge the system? I wanted to show you the full gamut: the case where you have a bad guess and the residual explodes and the solver doesn't converge to anything (in fact it might NaN out or error out), all the way to the case where the solver converges very quickly, in just four or five iterations. These are different outcomes you can see (and they're not exhaustive) for how your solver could behave. Convergence looks like the purple, green, or blue lines, where we get below the tolerance; non-convergence is anything where we do not meet that tolerance. Generally, if you have a system that's not converging, you want to figure out why. Could you give it better guesses? If it's a Newton solver, is it hitting a bound? Maybe there's no actual solution to the system. I've sometimes set up a linear or nonlinear system that had no solution at all, and the solver failed because there was nothing to find. You can't expect convergence if you formulate a problem that has no answer.
Now that we've looked at this graphically, I'll transition into what convergence looks like in the terminal. I'm showing you a portion of the accompanying Python notebook for this lesson. Let me explain what we're looking at. We set up a very simple problem that includes an implicit subsystem. The problem analyzes an electrical circuit and comes from the OpenMDAO documentation; I've included a link to the docs below. What we're really interested in is the solver's performance and what it means. We simply add a model in OpenMDAO and then set the solver. Additionally, we set some values so that we're starting from an initial point that makes sense; in this case we're using a Newton solver, so we set initial state values that give it a reasonable starting point. Then we run the model: we're not doing an optimization here, just running the analysis.
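If you don't have the notebook handy, here is a minimal, self-contained sketch of the same workflow. It is not the circuit model from the docs; it's a made-up implicit component with a single state x and residual R(x) = exp(x) - a, but the solver setup and the terminal output follow the same pattern:

```python
import numpy as np
import openmdao.api as om

class ExpImplicit(om.ImplicitComponent):
    """Single implicit state x with residual R(x) = exp(x) - a, so x = ln(a) at the solution."""

    def setup(self):
        self.add_input('a', val=2.0)
        self.add_output('x', val=1.0)
        self.declare_partials('x', ['x', 'a'])

    def apply_nonlinear(self, inputs, outputs, residuals):
        residuals['x'] = np.exp(outputs['x']) - inputs['a']

    def linearize(self, inputs, outputs, partials):
        partials['x', 'x'] = np.exp(outputs['x'])
        partials['x', 'a'] = -1.0

prob = om.Problem()
prob.model.add_subsystem('comp', ExpImplicit())

newton = prob.model.nonlinear_solver = om.NewtonSolver(solve_subsystems=False)
newton.options['iprint'] = 2            # print the residual history, one line per iteration
prob.model.linear_solver = om.DirectSolver()

prob.setup()
prob.set_val('comp.x', 1.0)             # reasonable initial guess, close to ln(2)
prob.run_model()                        # analysis only, no optimization
print(prob.get_val('comp.x'))           # should be about 0.693
```

With iprint set to 2, running this prints one line per Newton iteration with the absolute and relative residuals, which is the kind of output we'll walk through next.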
Let's take a look at what the solver does. We're looking at a group called circuit, which makes sense because we added the group called circuit right here and attached the solver to it, and we're interested in the convergence of that solver. The output tells us we have a nonlinear solver (that's what NL stands for), a Newton solver in this case. The iteration number runs from zero onward. Then we have the absolute residual followed by the relative residual. The absolute residual is the norm of the residual vector, a measure of how far the current states are from satisfying the equations. The relative residual is that same norm divided by its value at iteration zero, so it always starts at one and should decrease from there. You can imagine that for some systems the state values are huge, so the absolute residual is huge too, but the relative residual still starts at one. That normalization is helpful when you're not sure what magnitude of state values you're dealing with and you want to set a tolerance accordingly.
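Those two columns map onto the solver's tolerance options. As a sketch, continuing the hypothetical setup above, OpenMDAO's Newton solver exposes separate absolute and relative tolerances, and it declares convergence when either one is met:

```python
# Tolerances for the Newton solver from the sketch above.  The solver stops when
# the absolute residual norm drops below 'atol' or the residual norm relative to
# iteration 0 drops below 'rtol', or it gives up after 'maxiter' iterations.
newton = prob.model.nonlinear_solver
newton.options['atol'] = 1e-8
newton.options['rtol'] = 1e-8
newton.options['maxiter'] = 20
```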
So let's look at the output. We see Newton go through steps zero, one, two, three, four. At first it actually increases the residuals, which is a little counterintuitive: it's trying to solve the system, it takes a step, and the answer gets worse. But then over the following iterations it steadily converges the system, and we see it report that Newton converged, in this case in 17 steps. The absolute residual and the relative residual are both very, very small at that point, so we accept this as a converged result. If you ever see this in one of your OpenMDAO models, it means a solver is converging some coupling within the system. Again, this is relevant for any implicit system, or any model where explicit components are coupled together with feedback.
In the case of a Newton solver, we need to start the states at an initial condition that makes sense. We can't just throw in anything willy-nilly; we can't, say, set every state to zero. It's often very challenging for a Newton solver to converge a system without a good initial guess.
Let me give a quick example of that. We scroll down and look at the values we're setting now. Again, it doesn't really matter what these values mean for the circuit; just know that they're bad initial guesses. Now look at what the solver does in the output. We start at the same point as above, but the first step is terrible. Do you see this number? It's on the order of 1e152. That's a huge number. Essentially, the Newton solver exploded: we started from a bad guess and it couldn't do anything useful with it. It ran into so much numerical difficulty that we got a monstrous, unfathomable increase in the residuals. The residual does come back down after that, but it's already so far off the mark that the solve simply fails.
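You can reproduce this kind of blow-up with the hypothetical sketch from earlier by handing the Newton solver a deliberately bad starting point; depending on your OpenMDAO version it will either report a failure to converge or warn that the residuals contain inf or NaN:

```python
# Re-run the made-up model from a bad initial guess.  The first Newton step
# overshoots to an enormous x, exp(x) overflows, and the residuals blow up
# instead of converging.
prob.set_val('comp.x', -20.0)
prob.run_model()
```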
So this is just a small example: up above we saw what solver convergence looks like when things are set up well. It may take a bad step early on, but that's okay; it converges in the end. Down here, though, this is not a converging solver, and it's not going to help us resolve the coupling within the system. I want you to be able to recognize the signs: one, the residuals are not going down, and two, they're huge. That's the sign of a solver that is not converging the system. Feel free to play around with some of these initial guesses and solver settings and see how the convergence changes.
All of this is also relevant to a few other lessons. One of them is on the types of solvers and when to use them; I recommend you check that out, and I'll provide a link in the description below. Another is on how to debug your Newton solver when it's not converging, which is of course relevant to the examples I'm showing here. These are very simple examples with a fairly cookie-cutter circuit case.
Here we have the output from a pyCycle group. pyCycle is an engine modeling code built on OpenMDAO, and in pyCycle it's very easy to make extremely complex models. We need to solve these models, and they often have nested solvers. Let me first highlight some of the N-squared diagrams corresponding to a pyCycle model. We have three top-level groups: the design group, OD full power, and OD part power. Let's take a look at just one of these groups. I'm going to scroll back to the left here and look at just the design group. Here we see one Newton solver at the group level, a bunch of nonlinear run-once solvers (so effectively no iterative solvers on those subgroups), and then an inner, nested Newton solver. If we look closely, there's the coupling that we're resolving with that nested Newton solver, and here's the coupling we have at the top level. I'm only showing you this to drive home how complicated some of these pyCycle models are.
Let's go back now and look at some of the convergence history from a run of this model. At the top level we have the design group converging with the Newton method I showed over here; if you look at the design group, there's that top-level Newton solver. Within it we have other nested solvers, and here's one of them: you can see the plus sign and the indentation that mark it as a nested solver. That one converges, and then the next one. We also have multiple linesearches going on: here a bounds-enforcing linesearch, and here an ArmijoGoldstein linesearch. Again, I don't want to delve into the physics of the system or why we're using these particular options; just know that we have multiple nested solvers.
As I scroll down we see some good convergence: the residuals decrease, 17, 8, 4, and so on, and eventually we get below a certain tolerance and it says Newton converged. That's great. If we continue to scroll down, it's looking good so far. But when I get down to the bottom, bear with me, we see something happen. In the off-design group, the ideal flow solve converges, fine, it says Newton converged. But then the top-level Newton solver NaNs out. It reports that the residuals contain either infinity or NaNs. That is terrible for the numerics: something crashed, we don't know what, and we have to debug it. This is, obviously, not what convergence looks like. If you see NaNs in the residuals at some iteration, you have a problem and you need to resolve it.
All of the earlier examples of good convergence are what you want to see. If you see a failure, or NaNs in your residuals, that's something you will have to track down. Please check out the lesson on how to debug solvers for more information on how to deal with it.
So, to recap: converging a system means that all coupling and implicit interactions have been resolved. I tried to stress during this lesson that the best solver and the best settings are highly problem dependent; they really vary case by case. I wish I could tell you that you must always hit a certain tolerance and exactly how to set your options to get there, but unfortunately I cannot. For very complex systems you might need nested solvers, and then you need to care about the residuals of each one of them. For simpler systems you might not need a solver at all. But to be clear: if you have any sort of implicit interaction or coupling within your system, you need to use a solver to converge the residuals of that system to zero, or to tolerance. That is what ensures your answer is correct and consistent with the system you've defined.
I can't stress enough that this is just one kernel, on one cob, across many ears of corn in the field of solver understanding. Accurately modeling complex systems with multiple groups and nested solvers takes a lot of practice. I highly encourage you to check out the other linked lectures in the description below to see what types of solvers exist, when to use them, what they mean for your models, some of the math behind them, and all sorts of other details that are important to understand when solving multidisciplinary systems. As always, make sure to mash those like and subscribe buttons. Guys, gals, and non-binary pals, thank you for watching.