# Biped Stabilization by Linear Feedback of the Variable-Height Inverted Pendulum Model

Stéphane Caron. ICRA 2020, May 2020.

## Abstract

The variable-height inverted pendulum (VHIP) model enables a new balancing strategy by height variations of the center of mass, in addition to the well-known ankle strategy. We propose a biped stabilizer based on linear feedback of the VHIP that is simple to implement, coincides with the state-of-the-art for small perturbations and is able to recover from larger perturbations thanks to this new strategy. This solution is based on "best-effort" pole placement of a 4D divergent component of motion for the VHIP under input feasibility and state viability constraints. We complement it with a suitable whole-body admittance control law and test the resulting stabilizer on the HRP-4 humanoid robot.

## Content

• Paper
• Slides
• Presentation given at ICRA 2020
• Presentation given at JRL on 29 October 2019
• Source code of the controller
• DOI: 10.1109/ICRA40945.2020.9196715

## BibTeX

@inproceedings{caron2020icra,
title = {Biped Stabilization by Linear Feedback of the Variable-Height Inverted Pendulum Model},
author = {Caron, St{\'e}phane},
booktitle = {IEEE International Conference on Robotics and Automation},
url = {https://hal.archives-ouvertes.fr/hal-02289919},
year = {2020},
month = may,
doi = {10.1109/ICRA40945.2020.9196715},
}


## Discussion

Thanks to all those who have contributed to the conversation so far. Feel free to leave a reply using the form below, or subscribe to the Discussion's atom feed to stay tuned.

• Posted on

When you use the QP, this considers only the instantaneous error, not along a time horizon, is this correct?

• Posted on

With DCMs, we are looking at an infinite-time horizon, with the DCM converging to its target value only as time goes to infinity. But this is similar to the infinite-horizon linear quadratic regulator: even though we look at an infinite-time horizon, the optimal feedback control law that we get is only based on the instantaneous error.
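The LQR analogy can be made concrete on a scalar system (a minimal sketch with made-up values, not from the paper): even with an infinite-horizon cost, solving the algebraic Riccati equation yields a constant gain applied to the instantaneous state.

```python
import math

# Scalar infinite-horizon LQR: dynamics x_dot = a*x + b*u,
# cost J = ∫ (q*x² + r*u²) dt over an infinite horizon.
# The optimal control u = -K*x is a feedback on the *instantaneous*
# state, even though the horizon is infinite. (Illustrative values.)
a, b, q, r = 1.0, 1.0, 1.0, 1.0

# Algebraic Riccati equation: 2*a*P - (b²/r)*P² + q = 0, positive root:
P = (a + math.sqrt(a**2 + (b**2 / r) * q)) * r / b**2
K = b * P / r  # optimal feedback gain

assert a - b * K < 0  # closed-loop pole is stable
print(K)  # constant gain, applied to the current error only
```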

• Posted on

How can we generate reference trajectories?

• Posted on

There are several solutions:

• The seminal work by Englsberger et al. (2015) provides a closed-form solution, which is also handy for e.g. step timing adaptation.
• In this work, I used linear model predictive control for trajectory optimization during experiments with HRP-4 (see the C++ implementation); that is to say, reference trajectories were simply based on the linear inverted pendulum model. Note that this is only for walking. While standing, the reference is a constant center of mass position $$c^d$$ with $$\dot{c}^d = 0$$.
• We can also generate VHIP references using the CaptureProblemSolver, which is a custom sequential quadratic programming (SQP) implementation tailored to this model. The algorithm is described here.
• We can also use a general optimal control framework to cast the full trajectory generation problem. Examples of such frameworks today include CasADi, Crocoddyl and tvm.
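To illustrate the first option above, here is a minimal sketch of closed-form DCM planning in the style of Englsberger et al. (function name, footstep values and timings are made up for illustration): with a constant ZMP $$r$$ per step, the LIP DCM obeys $$\dot{\xi} = \omega (\xi - r)$$, whose solution $$\xi(t) = r + e^{\omega t} (\xi_0 - r)$$ can be integrated backward from the final DCM target.

```python
import math

# Closed-form DCM trajectory for the LIP (sketch in the style of
# Englsberger et al.): DCM dynamics xi_dot = omega * (xi - r) with one
# constant ZMP r per step. Backward recursion over steps:
# xi_start = r + exp(-omega * T) * (xi_end - r).

omega = math.sqrt(9.81 / 0.8)  # natural frequency for a 0.8 m CoM height

def dcm_backward(zmp_plan, step_duration, xi_final):
    """Return the DCM at the start of each step so that the trajectory
    ends exactly at xi_final (1D sagittal coordinate, for simplicity)."""
    xi = xi_final
    starts = []
    for r in reversed(zmp_plan):
        xi = r + math.exp(-omega * step_duration) * (xi - r)
        starts.append(xi)
    return list(reversed(starts))

# Example: two footsteps at x = 0.2 m and 0.4 m, ending with the DCM
# resting on the last ZMP (a capture state):
print(dcm_backward([0.2, 0.4], 0.8, 0.4))
```

The same recursion extends to step timing adaptation by treating the step durations as decision variables.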
• Posted on

You mention a reference trajectory, but your experiment does not seem to follow a particular trajectory. In the context of balancing, what kind of reference trajectory should we care about?

• Posted on

Yes, in both the pymanoid example and HRP-4 experiment the reference trajectory is $$c^d = \mathit{constant}$$ and $$\dot{c}^d = 0$$ (with the corresponding $$\lambda^d$$ and $$r^d$$ computed for static equilibrium). For the balance controller there is no concept of "standing" or "walking", as you can see in the following block diagram: Balance Control consists of Reduced Model Tracking and Force Control, while the switch between standing and walking happens in Trajectory Generation.
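The static-equilibrium computation of $$\lambda^d$$ and $$r^d$$ mentioned above is short enough to sketch (function name and the flat-floor-at-$$z = 0$$ assumption are mine, for illustration): with VHIP dynamics $$\ddot{c} = \lambda (c - r) - g e_z$$, zero acceleration requires the CoP directly below the CoM and $$\lambda^d = g / (c_z - r_z)$$.

```python
# Static-equilibrium VHIP reference (sketch): with dynamics
# c_ddot = lambda * (c - r) - g * e_z, standing still (c_ddot = 0)
# puts the CoP r^d right below the CoM and sets lambda^d = g / (c_z - r_z).

g = 9.81  # gravity, m/s²

def standing_reference(c_des):
    """Return (r_des, lambda_des) for a constant CoM target c_des = (x, y, z),
    assuming a flat floor at z = 0 (illustrative assumption)."""
    cx, cy, cz = c_des
    r_des = (cx, cy, 0.0)   # CoP directly below the CoM
    lambda_des = g / cz     # normalized stiffness for zero acceleration
    return r_des, lambda_des

r_d, lambda_d = standing_reference((0.0, 0.05, 0.8))
print(r_d, lambda_d)
```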

• Posted on

In the video, why do you only formulate the desired error dynamics as a spring system, instead of a spring-damper system?

• Posted on

Actually, Morisawa et al. (2012) (reference 3 in the video) uses PID desired error dynamics. When I cite it in the video, I'm only referring to the fact that pole placement means defining your desired error dynamics.

In this work we focus on the proportional (spring) term because it is the most significant in practice. The derivative (damping) term usually has a relatively low gain because state estimators tend to yield noisy DCM derivatives. On HRP-4 we set it to zero, and on HRP-2Kai we set it to a small value (not zero, because the stiffness of the flexible joint between its ankle and sole is lower than on HRP-4, and damping helps compensate the ensuing vibrations at high proportional gain). This might evolve if somebody manages to design a smoother DCM state estimator ;-)

• Posted on

Let’s say we are in the wild searching for more “ducks” (i.e., higher-order generalizations of the DCM). What characteristics must these quantities have to be considered a DCM?

• Posted on

I see no definitive answer, but I'd venture to say:

1. They need to be "divergent". The trajectory of a DCM $$\xi$$ is unbounded unless the input $$u$$ satisfies a specific (dynamics-dependent) condition (the boundedness condition). Alternatively, we can take inspiration from Coppel and lower-bound this unboundedness by exponentials, but we may need to look farther than exponential in general (e.g. in the 4D DCM the $$\dot{\omega} = \omega^2$$ component diverges super-exponentially).
2. They should decouple our second-order system into two consecutive first-order systems. This feels less like a property of the system and more like something we want. Here, when we choose to use Mike Hopkins's time-varying DCM, we are making sure CoM dynamics depend only on the DCM ($$\dot{c} = \omega (\xi - c)$$). Secondly, we make sure the DCM depends only on the contact wrench input to get a decoupling similar to the LIP one (replacing "ZMP" by "contact wrench" and "capture point" by "4D DCM").
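The decoupling in point 2 can be checked by hand (a sketch; notations may differ slightly from the paper's). Differentiating $$\xi = c + \dot{c} / \omega$$ under the VHIP dynamics $$\ddot{c} = \lambda (c - r) + g$$ (with $$g = -g e_z$$ the gravity vector) and the choice $$\dot{\omega} = \omega^2 - \lambda$$:

$$
\dot{\xi} = \dot{c} + \frac{\ddot{c}}{\omega} - \frac{\dot{c} \, \dot{\omega}}{\omega^2}
= \dot{c} + \frac{\lambda (c - r) + g}{\omega} - \frac{(\omega^2 - \lambda) \dot{c}}{\omega^2}
= \frac{\lambda}{\omega} (\xi - r) + \frac{g}{\omega}
$$

The $$\dot{c}$$ terms cancel, so the DCM dynamics depend only on $$\xi$$ itself and the inputs $$(\lambda, r)$$, not on $$c$$ or $$\dot{c}$$: the second-order system splits into two cascaded first-order ones.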
• Posted on

It seems like the 4D DCM in this case is a consequence of your control parameterization in the sense that if you didn’t have a virtual stiffness lambda, you wouldn’t have the Riccati equation pop out for $$\omega$$. Do you agree?

Where I’m pointing with this is that you could alternatively consider variation dynamics for the CoM directly, with some other parameterization for the forces. For instance if you just used the force as a control input, the CoM dynamics would be linear themselves. And so I’m curious what advantages we have by essentially lifting the CoM dynamics to this higher dimension with the addition of $$\omega$$. Is it that constraints are easier to enforce?

• Posted on

The main advantage is that we get viability. That is, not only short-term but also long-term force feasibility.

Looking back at the LIP, the main driver to go this way is that it extends constraints from short-term input feasibility to long-term state viability. If we take a feedback controller using the force as control input $$F = k_p \Delta c + k_d \Delta \dot{c}$$, we know from Sugihara (2009) that $$k_d = \frac{k_p}{\omega} - m \omega$$ (minus $$m \omega$$ because we use the force) is the choice that yields maximum capturability (i.e. the linear controller with the largest basin of attraction) under ZMP constraints. There we get our connection from short-term constraints to long-term: ZMP inequality constraints prompt us to express dynamics with the ZMP as input, whereupon a constant $$\omega$$ appears. This constant determines the feedback gains that maximize capturability.

Intuitively, the reason why this controller catches all capturable states is that it spends no input trying to control the CCM; everything goes to the DCM. Although we don't yet have a proof of maximum capturability for 4D DCM feedback, the controller behaves with similar parsimony: it only spends input on the DCM, and adds height variations only when it has to.
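This parsimony can be illustrated on the LIP in simulation (a minimal sketch with made-up gains and timings, not the paper's controller): a ZMP feedback acting only on the DCM, $$z = \xi^d + (1 + k/\omega)(\xi - \xi^d)$$, yields $$\dot{\xi} = -k (\xi - \xi^d)$$, while the CCM $$\eta = c - \dot{c}/\omega$$ is left to converge on its own since $$\dot{\eta} = -\omega (\eta - z)$$ is stable.

```python
# Sketch: LIP balance control that "spends all input on the DCM".
# LIP: c_ddot = omega² * (c - z); DCM xi = c + c_dot/omega obeys
# xi_dot = omega * (xi - z). The feedback z = xi_d + (1 + k/omega)*(xi - xi_d)
# gives xi_dot = -k * (xi - xi_d); the CCM needs no dedicated input.

omega, k, dt = 3.5, 2.0, 0.002   # illustrative values
xi_d = 0.0                       # standing target: DCM at the desired CoM
c, c_dot = 0.05, 0.1             # perturbed initial state (1D, sagittal)

for _ in range(3000):            # 6 s of forward-Euler simulation
    xi = c + c_dot / omega
    z = xi_d + (1.0 + k / omega) * (xi - xi_d)   # ZMP from DCM feedback
    c_ddot = omega**2 * (c - z)
    c += c_dot * dt
    c_dot += c_ddot * dt

print(abs(c), abs(c_dot))  # both small: the whole state has converged
```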

Going back to the first part of your question, the benefit of this control parameterization is that we optimize infinite-time horizon trajectories in a quadratic program (under variation dynamics; there is no guarantee that we maximize capturability for the full nonlinear system). "Solving infinite-time horizon" is a practical way to extend input feasibility constraints to state viability ;-) If we use the force as control input, our dynamics are simplified, but force constraints become CoM-dependent and we can't predict what will happen to them over an infinite horizon (the set of feasible forces may vanish).
