Separate System from *Method #345
base: master
Conversation
```diff
- M(q, p) * u' = f(q, u, t, r, p)
- q' = g(q, u, t)
+ M(x, p) * x'_d = f(x, t, r, p)
+ x'_k = g(x, t)
```
For equations formed with Kane's method you have the ability to specify explicit expressions for the kinematical equations, q' = g(q, u, t), above. You have to solve for q' given the potentially complex expressions you use when you define u. These are generally solved explicitly for q' at the symbolic stage, and in the KanesMethod class they are accessed via the `kindiffdict()` method. For other methods (Newton-Euler, Lagrange, Hamilton, etc.) you simply have the simplest case for the first order form: q' = u, i.e. g(q, u, t) = u. With q' already available in explicit form you only need to numerically solve for u'. And for any method besides Kane's you have q' explicit as well. This means you only ever have to invert the n x n mass matrix.
I'm not sure what your changes represent here. Why the change in notation?
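For readers unfamiliar with this API, here is a minimal single-pendulum sketch (my own illustrative example, not code from this PR) showing how `KanesMethod` stores the kinematical differential equations and returns them solved explicitly for q' via `kindiffdict()`:

```python
import sympy as sm
import sympy.physics.mechanics as me

q, u = me.dynamicsymbols('q u')
m, l, g = sm.symbols('m l g')

N = me.ReferenceFrame('N')
A = N.orientnew('A', 'Axis', (q, N.z))
A.set_ang_vel(N, u * N.z)

O = me.Point('O')
O.set_vel(N, 0)
P = O.locatenew('P', -l * A.y)
P.v2pt_theory(O, N, A)

pendulum = me.Particle('P', P, m)

# The kinematical differential equation relates u to q'.
kd = [u - q.diff()]

kane = me.KanesMethod(N, q_ind=[q], u_ind=[u], kd_eqs=kd)
fr, frstar = kane.kanes_equations([pendulum], [(P, -m * g * N.y)])

# kindiffdict() returns the kinematical equations solved explicitly for q'.
print(kane.kindiffdict())
```

Here the kinematical equation is trivial (q' = u), but `kd_eqs` may be any expressions linear in q', and `kindiffdict()` returns them already solved symbolically.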
I was trying to indicate the difference between the "dynamic" states x'_d and the "kinematic" states x'_k, which was my attempt to generalize away the generalized coordinates and speeds. I'm happy to revert; it just seemed a bit odd to keep q and u since they aren't actually kept separately. But I have no strong opinion on the notation.
I do want to keep the minimum mass matrix form usable; however, I don't recall the use case for it, and I couldn't see how that form could be used based on the current implementation. If there is an example using this form, I can make sure it works.
The tests for ode_function_generator test this case. The use case is any system for which you have q' = g(x, t) explicitly defined (which is almost all systems). For large degree of freedom systems this form should provide much faster evaluation of x' because you only have to invert an n x n matrix instead of a 2n x 2n matrix. Come to think of it, to be more complete we should also support a form like:

M(q, p) u' = f(q, u, t, r, p)
A(q, p) q' = g(q, u, t)

This last form is the preferred form for computational speed needs.
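The advantage of keeping the two blocks separate can be sketched numerically (a hypothetical illustration, not from this thread; the matrices are random stand-ins for M and A):

```python
import numpy as np

n = 20
rng = np.random.default_rng(0)
# Random stand-ins for the dynamic block M(q, p) and kinematic block A(q, p);
# adding n*I keeps them well conditioned.
M = rng.random((n, n)) + n * np.eye(n)
A = rng.random((n, n)) + n * np.eye(n)
f = rng.random(n)
g = rng.random(n)

# Block form: two separate n x n solves.
u_dot = np.linalg.solve(M, f)
q_dot = np.linalg.solve(A, g)

# Combined form: one 2n x 2n solve, roughly 4x the flops of the two
# n x n solves, because a generic dense solver cannot exploit the
# block-diagonal structure.
M_full = np.block([[A, np.zeros((n, n))],
                   [np.zeros((n, n)), M]])
x_dot = np.linalg.solve(M_full, np.concatenate([g, f]))

assert np.allclose(x_dot, np.concatenate([q_dot, u_dot]))
```

Both routes give the same x'; the point is only that the block route does strictly less linear-algebra work.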
If you change the notation it needs to be:

M(x_q, p) x'_u = f(x, r, p, t)
x'_q = g(x, p, t)

where x = [x_u, x_q]^T.
Great, I'll work on the notation.
For the case with an A matrix and g vector, won't most solvers take advantage of the block-matrix nature if M and A were combined?
Do you have an example of a System generating the minimum mass matrix form?
I don't know of any linear system solvers that check for that blocking, except maybe sparse solvers. I think you typically just pass in the blocks to individual solve commands.
Any system can be put in the minimum mass matrix form.
So the solution I had in mind for this was to do this:

```python
class MBodyDiffEquations(object):
    """This class holds information about the ODEs or DAEs that describe a
    multibody rigid body system, but can also hold more general systems if
    needed; some methods may not apply to non-multibody systems."""

    def __init__(self, right_hand_side, coordinates, speeds, constants,
                 mass_matrix=None, coordinate_derivatives=None,
                 specifieds=None):
        # All of the args to init would become attributes.

    def rhs(self):
        """Solve for the explicit right hand side."""

    def rhs_of_accelerations(self):
        """Returns u' = M^-1 F"""

    def states(self):
        """Returns the states in the correct order to match the result of rhs."""

    def solve_kinematical_differential_equations(self):
        """Returns q' = g(q, u, t)"""
```

This class can be populated manually with the necessary information and would carry methods to do general symbolic manipulations of the ODEs/DAEs. It could detect whether they are ODEs or DAEs, for example. It could provide the equations in different forms from the most general form. I think there are a number of methods that will be useful to associate with this. This gives us an object that can be passed around that has, and can produce, more information than simply a collection of SymPy expressions, symbols, and functions. So:

```python
>>> kane = KanesMethod(...)
>>> odes = kane.odes()
>>> type(odes)
<MBodyDiffEquations>
>>> odes.states
(q1, q2, u1, u2)
>>> odes.rhs()
Matrix([[...], [...], [...], [...]])
>>> sys = System(odes, constants={m: 5.0, c: 1.0, k: 2.0}, ...)
>>> trajectory = sys.integrate()
>>> sys.plot(trajectory)
```
@moorepants, when you have a minute, could you point me to some examples where the knowledge of speeds, coordinates, and minimum mass form would be useful at the integration stage? That would help me think about the abstractions more.
For any multibody system you will be able to form symbolic expressions for M, F, and G in:

M(q, p) u' = F(q, u, t, r, p)
q' = G(q, u, t)

where q are the generalized coordinates and u are the generalized speeds (Kane-speak), and len(q) does not necessarily equal len(u). But generally one needs to pass H to the ode integrator (odepack, sundials, etc.) where:

x' = H(x, t, r, p)

and x = [q, u]^T (transpose). H is a function of M, F, and G. If you have analytical expressions for H (the explicit form of the first order ODEs) then you hand the ode integration routine a function that evaluates H given the system state, i.e. it computes the derivatives of the states. But there are different types of ODE integrators, and not all accept H. Some accept

M*(x, p) x' = F*(x, t, r, p)

i.e. they accept this implicit form of the ODEs. Other routines might accept the M, F, G form directly. Lastly, explicitly computing inv(M) or inv(M*) is not recommended [1].

ODEFunctionGenerator (case 3) generates a numerical function that evaluates M, F, and G given the current state. It then uses LAPACK's LU decomposition to solve for u' and then forms x' to pass to odepack's lsoda (scipy's odeint). So it tries to be efficient at this computation because it has great bearing on integration speed, especially when you have a stiff system.

[1] You never explicitly invert M or M*; you use LU decomposition or Cholesky decomposition to solve for u' or x'.
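The x' assembly described above can be sketched with SciPy (random numeric stand-ins for the evaluated M, F, and G; this is an illustration, not PyDy's generated code):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 5
rng = np.random.default_rng(1)
M = rng.random((n, n)) + n * np.eye(n)  # stand-in for the evaluated mass matrix
F = rng.random(n)                       # stand-in for the evaluated forcing vector
G = rng.random(n)                       # stand-in for the evaluated q' = G(q, u, t)

# Form x' = [q', u'] without ever computing inv(M): q' comes straight
# from G, and u' comes from an LU factorization of M.
x_dot = np.empty(2 * n)
x_dot[:n] = G
x_dot[n:] = lu_solve(lu_factor(M), F)

assert np.allclose(M @ x_dot[n:], F)
```

This is the shape of the per-step work: evaluate G into the first half of a preallocated x', then one n x n LU solve for the second half.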
The above also skips the case where you have Lagrange multipliers due to constraints. See issue #127. We also need to support that case.
This matches my current understanding of the integration portion of what System should be helping with. It seems to me that on modern computers, the setup cost of computing M^-1 numerically may make the difference between O(n^3) and O(m^3) negligible. For this reason, I would suggest that the parent class for the generic dynamical system (if you're still interested in including it) just hold M, x, and F. I am interested in the case where you do have the distinct minimum mass matrix form, although I'm not sure there will be a real performance improvement. For the Lagrange multiplier case, isn't it possible to augment the mass matrix and F expressions with an extra column? I believe implicit ODE/DAE solvers can handle that case.
** an extra ROW for the mass matrix and F expression to handle Lagrange multipliers
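For reference (my own summary, not from this thread), the augmentation being discussed is, I believe, the standard index-1 DAE form for constrained multibody systems: with m constraint equations whose Jacobian is G(q) (an m x n matrix), the dynamics and differentiated constraints are stacked as

[ M(q)   G(q)^T ] [ u'     ]   [ F(q, u, t)     ]
[ G(q)   0      ] [ lambda ] = [ gamma(q, u, t) ]

where lambda are the Lagrange multipliers and gamma collects the terms from differentiating the constraints (sign conventions on lambda vary between texts). Note the augmented matrix gains m extra rows *and* m extra columns, which is consistent with the row/column correction above.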
For the basic case where n = len(u) = len(q), M.shape = (n, n) and the full mass matrix M* has shape (2n, 2n).
I think a significant portion of the 10 microsecond solution for inverting an n x n matrix would be setup. Let's say it's 5-8 microseconds of setup. In that case, we're looking at somewhere between 24 and 54 microseconds for the 2n matrix vs 10. Similarly, for the overall simulation, I suspect there are much longer operations that would make the difference in the actual matrix inversion negligible in "normal" use cases. Certainly taking the extra time off makes sense for real-time applications. This type of speedup would primarily be used in the case where M is singular AND we only have explicit solvers, so we need to solve M(t) x' = F(t) at every time step? I was trying to find out which Python integrators support mass matrices / implicit ODEs / DAEs. It looks like we'd have to use a separate tool from the scipy integrators, which is unfortunate.
Of course, you can shift the balance back to minimizing M if len(q) > len(u). I'm not sure which is typically bigger.
What do you mean by setup? I'm not sure what you are referring to. There really isn't any setup in evaluating the right hand side. For complex mbody systems the most time consuming portion of evaluating x' is the evaluation of M, F, and G, which are very large expressions. For most multibody systems solved with this kind of software, n <= 20, so the numerical inversion of M and M* is relatively cheap. So if you have a preallocated array for x', evaluate G to find x'[:n], then evaluate M and F and solve for x'[n:], the computation time will be bounded by the evaluation time of M, F, and G or by the LU decomposition for n <= 20, depending on which of those takes longer. For n > about 20 my guess is that the evaluation of M, F, and G starts to dominate. I can do a demo later when I have time.
That 1.5x computation speed is about what I would expect for inverting M and M*. By setup, I'm referring to literally putting the matrix in memory, setting up the register blocks, etc., which is virtually independent of the size of the problem. It seems that a huge portion of the solution time is just this setup, even more than I thought. I would be interested to see if the literal evaluation of the expressions plays an even bigger role than the LU decomposition setup. I suspect it might, in which case the biggest performance improvement would come from CSE features. Are you using (or planning on using) this for real-time control? That would be pretty sweet...
This script compares the two for a 20 DoF system:

```python
import numpy as np
import sympy as sm
from pydy.models import n_link_pendulum_on_cart
from pydy.codegen.ode_function_generators import CythonODEFunctionGenerator

sys = n_link_pendulum_on_cart(20)
args = np.random.random(42), 0.0, np.random.random(1), np.random.random(42)

M = sys.eom_method.mass_matrix
F = sys.eom_method.forcing
M_star = sys.eom_method.mass_matrix_full
F_star = sys.eom_method.forcing_full

g = CythonODEFunctionGenerator(F, sys.coordinates, sys.speeds,
                               sys.constants_symbols, mass_matrix=M,
                               coordinate_derivatives=sm.Matrix(sys.speeds),
                               specifieds=sys.specifieds_symbols)
f = g.generate()

g_star = CythonODEFunctionGenerator(F_star, sys.coordinates, sys.speeds,
                                    sys.constants_symbols, mass_matrix=M_star,
                                    specifieds=sys.specifieds_symbols)
f_star = g_star.generate()
```

Results:
So, nothing huge for n = 20 with huge expressions for M, F, and G in the simplest form. For n = 2 and simple expressions for M, F, G I get:

So this exercise seems to indicate that you don't gain much from having the M, F, G form.
Why would you expect that when the theory is very clear about what the difference is? As for the "setup", the memory for the matrix can be preallocated before any evaluation of x' happens. Maybe if you profile the above code you can demonstrate the setup you are talking about. This is what I'm seeing with cProfile:
Yes, we already do. I have some applications that this stuff is important for. We are working on an 11 DoF stiff human arm model that is executed in real time to give a virtual arm to a paralyzed person. This requires fast integration, which my past PI solved with implicit integration methods.
Here is a better profile using the pure
For n = 20 we see that the evaluation of M and F takes about the same time as the solve(M, F) command does.
And with:

You see now that solving the linear system of size 40 takes twice as long as the size-20 solve.
I forgot this detail to get the fastest speeds and remove as much overhead as possible:

```python
import numpy as np
import sympy as sm
from pydy.models import n_link_pendulum_on_cart
from pydy.codegen.ode_function_generators import CythonODEFunctionGenerator

sys = n_link_pendulum_on_cart(20)
args = np.random.random(42), 0.0, np.random.random(1), np.random.random(42)

M = sys.eom_method.mass_matrix
F = sys.eom_method.forcing
M_star = sys.eom_method.mass_matrix_full
F_star = sys.eom_method.forcing_full

g = CythonODEFunctionGenerator(F, sys.coordinates, sys.speeds,
                               sys.constants_symbols, mass_matrix=M,
                               coordinate_derivatives=sm.Matrix(sys.speeds),
                               specifieds=sys.specifieds_symbols,
                               constants_arg_type='array',
                               specifieds_arg_type='array')
f = g.generate()

g_star = CythonODEFunctionGenerator(F_star, sys.coordinates, sys.speeds,
                                    sys.constants_symbols, mass_matrix=M_star,
                                    specifieds=sys.specifieds_symbols,
                                    constants_arg_type='array',
                                    specifieds_arg_type='array')
f_star = g_star.generate()
```

Now we see the 15 microsecond gain from the M, F, G form:
And the change is due to the speed of the solver, as shown in the above two profiles. I'm having déjà vu: I did all these tests when I wrote this code and implemented the three cases because of these differences. The last one to check is the symbolic LU decomposition.
For the 5 link pendulum, if you solve for x' symbolically you get speedups:
Since n is small, the linear system solve doesn't affect the timing like it does for larger n.
Thanks for running all the profiling. Perhaps it should go into a separate write-up for whoever eventually goes on to improve the code generation efficiency? Another path toward efficiency is to use a sparse linear solver. I think most of the mass matrices will have structure if they're not minimized (i.e., have a diagonal block or something), which could be exploited for a system with many states. It seems that with many states, having efficient expression evaluation (which can be improved with CSE) will be about as important as efficient inversion. By the way, how long did it take to invert the mass matrix symbolically? As far as the inversion timing not matching theory: as I mentioned, I think of this as the "setup" for memory allocation, blocking, etc. Most recently, I saw this explained in Prof. Boyd's lectures on convex optimization, but I believe we touched on it in my numerical analysis course as well. It also fits my experience from testing chunks of my MS thesis.
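As a concrete illustration of the CSE point (generic SymPy, not PyDy's code generator):

```python
import sympy as sm

x, y = sm.symbols('x y')
# Two expressions sharing the subexpressions (x + y) and (x + y)**2.
exprs = [sm.sin(x + y) + (x + y)**2,
         sm.cos(x + y) * (x + y)**2]

# cse() pulls each shared subexpression into a temporary that is
# computed once, so generated code evaluates fewer operations.
replacements, reduced = sm.cse(exprs)
for sym, sub in replacements:
    print(sym, '=', sub)
print(reduced)

# Substituting the temporaries back (in reverse order) recovers the originals.
rebuilt = [e.subs(list(reversed(replacements))) for e in reduced]
assert rebuilt == exprs
```

The same transformation applied to the large M, F, and G expressions is what makes the generated right-hand-side functions fast.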
This info should be added here: https://github.com/pydy/pydy/wiki/PyDy-Speed

These matrices are not sparse, they are dense. Sparse solvers are generally slower given a dense matrix; they typically use slower iterative methods. I don't think sparse solvers apply here. Note that all the profiling I show above has CSE implemented: the CythonODEFunctionGenerator uses common subexpressions to speed up execution. Speed of symbolic rhs:

The last one has been running for minutes and is still going. We've just recently got some demos that speed this stuff up at least 10X using SymEngine as a backend to SymPy. Note that there is very little memory allocation going on in the evaluation, and I don't think that is the issue; this is computation overhead. In the rhs eval there is memory allocation for the temporary variables from CSE, but that could be pre-allocated outside of the function very easily. The LAPACK linear solve can also be set up so that minimal memory allocation happens; it can simply manipulate the existing arrays. See the overwrite_a and overwrite_b options in http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.linalg.solve.html.
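A sketch of those flags (the arrays here are hypothetical stand-ins; the work buffers are throwaway copies, since arrays passed with the overwrite flags may be destroyed):

```python
import numpy as np
from scipy.linalg import solve

n = 4
rng = np.random.default_rng(2)
M = rng.random((n, n)) + n * np.eye(n)  # stand-in for the evaluated mass matrix
F = rng.random(n)                       # stand-in for the evaluated forcing vector

# Let LAPACK work in-place on the buffers instead of copying them first.
# M_work and F_work may be clobbered, so keep the originals if still needed.
M_work = M.copy()
F_work = F.copy()
u_dot = solve(M_work, F_work, overwrite_a=True, overwrite_b=True)

assert np.allclose(M @ u_dot, F)
```

In a tight integration loop the work buffers would themselves be preallocated and refilled each step, avoiding the copies entirely.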
Isn't the difference between a minimum mass matrix formulation and a full mass matrix formulation of the same system that the full mass matrix is block diagonal, and one of those blocks is the identity? As far as "setup time" affecting the timing: when I say memory I'm really talking about the registers, etc. LAPACK does a lot of super fancy stuff at a super low level and I have only a very vague idea of what is actually happening. But I do know that it means numerical inversion is not something I generally need to worry about until n gets very, very large. Is there a guide to getting SymEngine as the backend to SymPy? I was helping a professor transition a wavelets class to Python; his main complaint was how slow the symbolic portion was.
This is only the case if you have the simple definitions of the kinematical equations. Using Kane's method you can specify the lower block to be as complicated a function as you desire, as long as it is linear in q'.
I don't know what LAPACK's solvers do internally. All I know is that they are fast and return stable, correct results. I just demonstrated that with n = 20 and this "simple" pendulum problem, the LU decomposition takes substantial time with respect to the evaluation of M, F, and G.
No, there isn't. The SymEngine and SymPy interoperability is currently being figured out and developed. It is nowhere close to a drop-in replacement (except for some of the core). You can ask on the SymPy or SymEngine list how to get things going. We have a student working on getting SymEngine running as a backend for sympy.physics.mechanics right now, so look for some results soon.
Also, you should report any performance issues with SymPy to the issue tracker on that repo. We will look into it. We can also add a benchmark to our benchmarking app to track the performance of the symbolic operations.
@sixpearls Just came across this: https://github.com/bjodah/pyodesys, which you may find more suitable than PyDy's System class.
Thanks! I'm just taking a look, but it does seem more aligned with what I'm looking for. Thanks for pointing it out!
This is a running PR to separate out the `System` class from the `*Method` classes, as discussed in an email thread and on gitter. My suggestion is that `System` holds onto the vector of states, the vector of expressions for "force" (or is it a generalized impulse?), and optionally a mass matrix. The changes are small, but the decoupling allows for creating a system with manually defined equations, as shown in the Van der Pol example. If this looks good, I can go ahead and complete the PR (tests and documentation). Actually, I may need help formulating the tests.

Right now, `System` can be instantiated using the old signature of passing a `*Method` instance, as long as `*Method.q` and `*Method.u` form the states, `*Method.forcing_full` forms the RHS, and `*Method.mass_matrix_full` forms the mass matrix.