Skip to main content
  • Research Article
  • Open access
  • Published:

Virtual-dynamics-based reference gait speed generator for limit-cycle-based bipedal gait


This paper addresses a reference gait speed generator for limit-cycle-based bipedal gait of humanoid robots in order to achieve secure and efficient acceleration and deceleration without falling down at respective gait speeds. The reference gait speed generator is hard to be developed by analytical approaches due to the necessity of barren work for identifying the respective basins of attraction. We challenge this issue by designing virtual dynamics among a robot, a virtual leader point, and a goal, and adapting it according to a falling risk of the robot. The virtual dynamics, which has settling and acceleration times as design parameters, gives the reference speeds derived from states of the robot and the leader point to a gait speeds controller. In the dynamics, the robot’s mass is optimized virtually to maximize efficiency while ensuring stability stochastically by using a selection algorithm for locomotion. Even when there were obstacles or an up slope in traveling courses of simulations, the robot achieved the autonomous traveling from the start to the goal securely. Specific resistance was also kept small in comparison with local-stability-based walking. The proposed method makes the limit-cycle-based bipedal gait more practical and contributes toward replacing the major method that ensures stability of every step.


In the research fields of humanoid robots, bipedal gait control to travel in a variety of environments, such as disaster sites, human-living buildings, etc., is an essential issue. Limit-cycle-based bipedal gait [1,2,3,4,5,6,7] has excellent mobility in terms of energy efficiency from utilization of natural dynamics of robots, in contrast to major approach, which wastes the energy to ensure the stability of every step [8, 9]. This approach, however, makes footstep planning (i.e., the robot’s global position control) more challenging compared to the major approach to ensure the stability of every step [10, 11], although a few studies achieved footstep planning by combining several stable gait primitives [7].

Such energy-efficient but less-controllable gait is therefore suitable for traveling on large and long spaces where the footstep planning is not required. Instead of the footstep planning, gait speed control is easily achieved in the limit-cycle-based bipedal gait. Gait speed controllers consist of three types: forward speed control [2,3,4]; sideward speed control [3, 4]; and turning speed control [5, 6, 12]. Our approach, namely quasi-passive dynamic autonomous control (Q-PDAC) with adaptive speed controller (ASC) [4, 5], has achieved all types of speed control. In addition, an autonomous transition between walking and running has been implemented to expand capabilities of the bipedal gait [13].

Q-PDAC with ASC has solved the gait speed control adaptively, not analytically. This concept may cause falling down when the reference gait speeds are drastically changed, i.e., the adaptation cannot catch up their change. Nevertheless, analytical approaches are unfortunately infeasible to solve due to their complexity.

Therefore, a remaining issue to travel from a start to a goal autonomously is the way to generate secure but efficient reference speeds. We notice that asymptotic stability is expressed as a limit cycle formed by states of the robot. For this reason, the reference speeds should be designed based on the current states, not time, to ensure asymptotic stability.

To this end, we challenge this issue by designing virtual dynamics between the robot and the goal via a virtual leader point (see Fig. 1). This approach is a combination of an artificial potential field concept for mobile robots [14, 15] and a leader-follower formation for multiple agents [16, 17]. Seto and Sugihara have proposed the similar idea for smooth reaching movement [18], but in contrast, our proposal deals with the way to design parameters of the method and optimizes them in real time according to the surroundings. The virtual leader point is attracted to the goal and repulsed by other environment including obstacles instead of the robot, and consequently, a secure path from the start to the goal is planned autonomously. When the robot follows the leader point within a certain range, the robot reaches the goal while avoiding the obstacles. Such “indirect” interaction between the robot and surroundings restrains drastic change of the reference gait speeds, namely, they are likely to maintain stability of the current states.

Fig. 1
figure 1

Concept of a reference gait speed generator the reference gait speeds are given from virtual dynamics between the robot and a virtual leader point; the leader point interacts with the surroundings

However, this reference generator does not guarantee asymptotic stability. Alternatively, more rapid acceleration would be possible. To ensure asymptotic stability and keep high speed for a long time in pursuit of efficiency, the virtual dynamics can be optimized according to a falling risk of the robot by using a selection algorithm for locomotion (SAL) proposed in previous work [19]. SAL optimizes the robot’s mass virtually to maximize an evaluation function, in other words, efficiency within the allowable falling risk, called VM-SAL. Note that unlike previous use of SAL, the evaluation function is difficult to be maximized by the change of the virtual mass. This is because a dominant factor to determine the evaluation function is a set of interaction forces and torques, which are difficult to be optimized freely. To solve this problem, such interaction is regarded as stochastic variable [20], and instead of the evaluation function, its expected value is maximized.

To evaluate proposed VM-SAL, two types of simulations are carried out, while comparing the case without and with VM-SAL. In simulations, the robot achieved the secure and efficient traveling from the start to the goal autonomously. Namely, the robot could travel in the area with obstacles or on a 5° up slope. Specific resistance was fairly small in comparison with walking by a local-stability-based method conducted in ref. [21].

Our contribution in this paper is not only in system integration from the viewpoint of practicality, but also to expand applicable problems of SAL from the theoretical point of view. Specifically, SAL has solved the optimization problems for discrete variable using deterministic or stochastic objective [19, 22] and for continuous variables using deterministic objective [19, 23] so far. VM-SAL corresponds to the optimization for continuous variables using stochastic objective, which has not been solved yet. By enabling SAL to cover this problem, most of the optimization problems for locomotion can be handled with SAL.

Prerequisites for limit-cycle-based bipedal gait and its speed control

Passive dynamic autonomous control: PDAC

Let us introduce a method used as the limit-cycle-based bipedal gait, called passive dynamic autonomous control (PDAC) [21]. PDAC models the robot as an inverted telescopic pendulum model, which has a point mass m and three-dimensional half-spherical coordinates \((\theta , \phi , \ell )\), where \(\theta \in [0, \pi /2]\) means the inclination of the pendulum, \(\phi \in [-\pi , \pi ]\) is for the direction of the inclination, and \(\ell \in (0, \infty )\) gives the pendulum length. Here, \(\ell\) is constrained by a virtual holonomic constraint (VHC) for robots in accordance with ref. [24], where all joints of the robot interlock with single state variable: in this paper, it is given by \(\theta\) as \(\ell := \ell (\theta )\). This paper employs the dynamics-based VHC [13] specifically.

By such VHC, the dynamics of the pendulum, \(\ddot{\theta }\) and \(\ddot{\phi }\), can be analytically integrated as follows:

$$\begin{aligned} \dot{\theta }(\theta ) \,= & {} \, \frac{ \pm 1 }{ m \ell ^2(\theta ) } \sqrt{ 2 \! \left( \! -\frac{ C_\phi }{ \sin ^2{\theta } } \! + \! m^2 g \! \int \! \ell ^3(\theta ) \sin {\theta } d\theta \! + \! C_\theta \! \right) } \nonumber \\= & {} \mp \frac{ \sqrt{ 2 \left( - C(\theta ) + L(\theta ) + C_\theta \right) } }{ M(\theta ) } \end{aligned}$$
$$\begin{aligned} \dot{\phi }(\theta )= & {} \pm \frac{ \sqrt{ 2 C_\phi } }{ m \ell ^2(\theta ) \sin ^2{\theta } } = \pm \frac{ \sqrt{ 2 C(\theta ) } }{ M(\theta ) \sin {\theta } } \end{aligned}$$

where g is the gravitational acceleration. \(C_\theta\) and \(C_\phi\) are conserved quantities in each gait step, named PDAC constants. \(M(\theta )\), \(C(\theta )\), and \(L(\theta )\) are substituted for respective terms. In particular, \(L(\theta )\) can be solved under the condition that \(\ell ^3(\theta )\sin \theta\) is integrable. The sign ± in Eq. (2) is simply chosen by the sign of the initial angular velocity of \(\phi\), \(\dot{\phi }_0\). The sign \(\mp\) in Eq. (1) is inverted before and after an apex.

The bipedal gait exposed by PDAC is fully described by given VHC and two PDAC constants. In particular, the gait speeds are given by the combination of two PDAC constants, whose square roots have the same units of angular momentum.

Adaptive speed controller (ASC)

ASC proposed in the previous work [4] controls the translational (forward and sideward) speeds. Let us introduce the principle of ASC. Now, the reference and actual gait speeds are defined as \(v_{x,y}^{\text {ref}} {\text{\,and\,}} {v_{x,y}}\), respectively, where x is the forward direction of the robot is facing and y is the sideward direction orthogonal to it. The origin of coordinate system of the swing-leg position \({\varvec{p}_L}\) is on a hip joint, where is geometrically connected to the vicinity of the center of gravity (COG) position.

Forward speed control

Forward speed control is considered on the basis of shifting an equilibrium point of the robot’s mechanical energy. To this end, the momentum of the swing leg is utilized for our solution. As defined in above, PDAC models the robot as the inverted pendulum with a point mass m, but in fact, the mass of the swing leg \(m_L\) is not lost as modeling error. The forward swing-up and touchdown positions, \(p_{Lx}^{su}\) and \(p_{Lx}^{td}\), are simply designed based on a capture point [25], while the term of the COG position in the capture point is ignored according to their coordinate system as follows:

$$\begin{aligned} p_{Lx}^{su}\,=\, & {} ( \alpha _x + \iota _x ) v_x^{\text{ref}} / \omega \end{aligned}$$
$$\begin{aligned} p_{Lx}^{td}\,=\, & {} \alpha _x v_{2x} / \omega \end{aligned}$$

where, \(\alpha _x = 0.8\) is the constant to keep locomotion going forward (or backward), and \(\omega\) is the natural frequency of the inverted pendulum \(\omega =\sqrt{g/(\ell (\theta )\cos \theta )}\). \(\iota _x\) is the accumulated term to adapt the swing-up amount for the optimal momentum that yields \(v_x^\text{{ref}}\).

Sideward speed control

In the bipedal gait, separation of the limit cycles (asymmetric behaviors) for the right and left legs would lead to a slide motion. The degree of this separation is given as the function of the asymmetric touchdown position of the right and left legs, \(\Delta _y\), namely, it should be adjusted to obtain the reference sideward speed. To this end, the sideward touchdown position \({p_{Ly}^{td}}\) is designed based on the capture point as follows:

$$\begin{aligned} {p_{Ly}^{td}}\,=\, & {} {\alpha _y}{v_{2y}} / \omega \pm {\beta _{L}{d_L} - {\Delta _y}} \end{aligned}$$
$$\begin{aligned} {\Delta _{y}}\,=\, & {} ( {\alpha _{y}} + {\iota _{y}}) {v_y^{\text{ref}}} / \omega \end{aligned}$$

where, \(\alpha _y = 1.2\) is the constant to keep locomotion stabilizing. \({d_{L}}\) denotes the half of width between the right and left legs and \({\beta _{L}} \in [0,1]\) means degree of the vicinity of the COG. \(\Delta _y\) is adjusted by the integral term \(\iota _y\) to achieve \({v_y^{\text{ref}}}\) as well as Eq. (3).

Quasi-passive dynamic autonomous control (Q-PDAC)

ASC is insufficient to reach arbitrary goal because its sideward speed control is limited depending on the forward speed. In that case, the turning speed control is additionally required like a vehicle. We have proposed Q-PDAC in previous work [5] to overcome this problem.

Q-PDAC adds the angular momentum \(L_w\), which is explicitly generated by rotating a yaw-axis hip joint depending on the reference turning speed \(v_w^{\text{ref}}\) as follows:

$$\begin{aligned} {L_{w}} = {I_{w}}{v_w^{\text{ref}}} \simeq {md_{L}^2} {v_{w}^{\text{ref}}} \end{aligned}$$

where, \(I_w\) is an inertia term and approximated as \(m d_L^2\). Note that the adverse effect of this approximation would be small because the contact position on the ground is basically in the vicinity of the rotating hip joint.

\(L_w\) is injected into the PDAC dynamics in Eqs. (1) and (2). The square roots of two PDAC constants, \(C_\theta\) and \(C_\phi\), are the same unit system as the angular momentum. In particular, \(\sqrt{2C_\phi }\) means the conserved angular momentum of \(\phi\) rotation during one gait step. Thus, \(L_w\) can be summed up with \(\sqrt{2C_\phi }\) in Eq. (2) as follows:

$$\begin{aligned} \dot{\phi }(\theta )= & {} \frac{ \pm \sqrt{ 2 C_\phi } + L_w }{ M(\theta ) \sin ^2{\theta } } \end{aligned}$$
$$\begin{aligned} C(\theta ) := \frac{ C_\phi }{ \sin ^2{\theta } }\Rightarrow & {} \frac{ (\pm \sqrt{ 2 C_\phi } + L_w)^2 }{ 2 \sin ^2{\theta } } \end{aligned}$$

where, the definition of \(C(\theta )\) is changed to Eq. (9). In addition, the touchdown position after turning is considered so that \(v_{2x}\) and \(v_{2y}\) in Eqs. (4) and (5) are converted into the one in coordinate system after turning, respectively.

Design of virtual dynamics


The reference gait speeds, the translational speeds \(v_{x,y}^{\text{ref}}\) and the turning speed \(v_{w}^{\text{ref}}\), are generated only from virtual attractive (sometimes repulsive) dynamics with a virtual leader point. Note that this leader point is regarded as a ghost of the robot, which has the same physical parameters as the robot’s one and goes ahead. The leader point is towed by a goal, while it is repelled by obstacles, such as walls.

Such design of dynamics aims that the robot is indirectly towed by the goal and repelled by the obstacles. This indirect interaction yields secure acceleration and deceleration, which are suitable for the limit-cycle-based gait to maintain stability at respective gait speeds, although responsiveness is deteriorated reluctantly.

In this section, an attractive force (and torque) between the robot and the leader point, \(\varvec{f}_L\) (and \(\tau _L\)), and an attractive force between the leader point and the goal, \(\varvec{f}_G\) (and \(\tau _G\)), are designed. They entail the secure reference speeds without falling down at respective gait speeds.

In Appendices A, B, and C, a repulsive force from the obstacles, \(\varvec{f}_O\) (and \(\tau _O\)), and an explorative force to escape from stationary points, \(\varvec{f}_E\), are defined. Although they are not main focus of this study, they should be implemented as a practical matter, as can be seen from many types of previous work [14, 15].

Connection among robot—leader-point—goal

Above connection model is shown in Fig. 2, where the connections among the robot, the leader point, and the goal, give subscripts L and G. For instance, dampers and springs for their connections are given as \(c_{L,G}\) and \(k_{L,G}\) respectively. From these dampers and springs, attractive forces \(\varvec{f}_L\) and \(\varvec{f}_G\) are given as follows:

$$\begin{aligned} \varvec{f}_L= & {} - c_L \dot{\varvec{\Delta p_L}} - k_L \varvec{\Delta p_L}\end{aligned}$$
$$\begin{aligned} \varvec{f}_G= & {} - c_G \dot{\varvec{\Delta p_G}} - k_G \varvec{\Delta p_G} \end{aligned}$$

where \(\varvec{\Delta p_L}\) and \(\dot{\varvec{\Delta p_L}}\) are relative distance and velocity between the robot and the leader point, and \(\varvec{\Delta p_G}\) and \(\dot{\varvec{\Delta p_G}}\) are relative distance and velocity between the leader point and the goal. Note that attractive torques \(\tau _L\) and \(\tau _G\) are given in the same manner. In most cases, \(\varvec{f}_L\) and \(\varvec{f}_G\) are directed to \(\varvec{\Delta p_L}\) and \(\varvec{\Delta p_G}\), respectively, since traveling directions of the robot and the leader point would asymptotically converge to the towing directions.

Fig. 2
figure 2

Connection among robot—leader-point—goal the robot is towed by the virtual leader point, which is also towed by the goal; their connections are given as damping and spring models

Given parameters for the connections, namely \(c_{L,G}\) and \(k_{L,G}\), are key parameters to generate the secure reference gait speeds. Hence, they are designed by giving following intuitive conditions.

Dynamics between leader point and goal

Firstly, the dynamics between the leader point and the goal is designed. To facilitate intuitive comprehension for design, this dynamics is represented by a damping ratio \(\zeta _G\) and a natural angular frequency \(\omega _G\), instead of \(c_G\) and \(k_G\).

Now, design criteria are given based on two kinds of time. One is a desired time to arrive at the goal in consideration with the robot’s locomotion ability, \(T_s\), and another is an acceleration time to converge on steady states, \(T_a\). These two are highly easy to be given by designer.

When \(T_s\) is assumed to be a settling time of the dynamics (i.e., the damped oscillation), the leader point will converge on the goal nearly on time.

$$\begin{aligned} T_s := \frac{ \ln {e_s}^{-1} }{\zeta _G \omega _G} \end{aligned}$$

where \(e_s = 0.05\) means that \(T_s\) corresponds to the 5% settling time. Note that the damped oscillation is rarely settled by \(T_s\) since disturbances from the obstacles are frequently caused.

With respect to \(T_a\), it is assumed to be an inflection point of the dynamics. Namely, \(T_a\) is derived by solving equation that the second-order differential of the damped oscillation with \(\zeta _G<1\) is equal to 0.

$$\begin{aligned} T_a= & {} \frac{1}{\omega _G \sqrt{1-\zeta _G^2}}\tan ^{-1}\frac{\sqrt{1-\zeta _G^2}}{\zeta _G} \nonumber \\\simeq & {} \frac{T_s}{\ln {e_s}^{-1}} \left( 1 -\frac{1-\zeta _G^2}{3\zeta _G^2} \right) \end{aligned}$$

where \(\tan ^{-1}(x)\) is approximated as \(x-x^3/3\) by third-order Maclaurin expansion.

From the above design criteria \(T_s\) and \(T_a\), \(\zeta _G\) and \(\omega _G\) are derived as follows:

$$\begin{aligned} \zeta _G= & {} \sqrt{ \frac{T_s}{ T_s + 3 ( T_s - \ln {e_s}^{-1} T_a ) } }\end{aligned}$$
$$\begin{aligned} \omega _G= & {} \frac{ \ln {e_s}^{-1} }{\zeta _G T_s} \end{aligned}$$

\(c_G\) and \(k_G\) are derived from \(\zeta _G\) and \(\omega _G\) via their definitions: \(k_G = m \omega _G^2\) and; \(c_G = m \zeta _G \omega _G\). Note that m is translated into I, which means a moment of inertia, for rotational dynamics.

Now, we focus on the constraint, i.e., \(T_s \ge \ln {e_s}^{-1} T_a\), given at deriving \(\zeta _G\). To effectively and surely converge to the goal on time, \(T_s\) can be updated every gait step as \(T_s \leftarrow T_s - T_{\text{sup}}\), where \(T_{\text{sup}}\) means the elapsed time at k-th gait step. Such updating, however, reaches the limitation \(T_s=\ln {e_s}^{-1} T_a\). In that case, \(\zeta _G\) becomes 1, and therefore, the leader point is expected to converge on the goal without oscillation in accordance with the critical damping.

Dynamics between robot and leader point

Secondly, the dynamics between the robot and the leader point is designed. Now, not only \(c_L\) and \(k_L\) but also a damping ratio \(\zeta _L\) and a natural angular frequency \(\omega _L\) are used.

The leader point is required to be naturally in an observable range of the robot, where the robot can observe by a laser sensor or a camera, to allow the robot to avoid the obstacles in surroundings. If the leader point is outside of the observable range, its interaction with the obstacles cannot be calculated. To keep the leader point inside of the observable range, the equilibrium point of \(\varvec{f}_L\) and \(\varvec{f}_G\) should be on the edge of the observable range at least. Namely, \(k_L\) is given from following equilibrium of \(\varvec{f}_L\) and \(\varvec{f}_G\).

$$\begin{aligned}&k_L {\max (R_{obs} , \Vert \varvec{\Delta p_G} \Vert _2 )} - k_G \Vert \varvec{\Delta p_G} \Vert _2 = 0 \end{aligned}$$
$$\begin{aligned}&\therefore k_L = \frac{\Vert \varvec{\Delta p_G} \Vert _2}{{\max (R_{obs} , \Vert \varvec{\Delta p_G} \Vert _2 )}} k_G \end{aligned}$$

where \(R_{obs}\) is the half of radius of maximum observable circle. Note that when the distance between the robot and the leader point meets \(\Vert \varvec{\Delta p_G} \Vert _2 \le R_{obs}\), the above condition would be usually kept, namely, \(k_L\) can be fixed to \(k_L = k_G\).

The robot should not become closer to the goal rather than the leader point due to risk of collision with the obstacles. \(\zeta _L\) is therefore designed to restrict an overshoot of the damped oscillation, in other words, to entail the critical damping, i.e., \(\zeta _L := 1\). From \(k_L\) and \(\zeta _L\), \(c_L\) is given. Now, as another point of view, the derivation of \(k_L\) and \(\zeta _L\) sets the settling time of the dynamics between the robot and the leader point to \(\sqrt{ R_{obs} / \max (R_{obs}, \Vert \Delta p_G \Vert _2) } T_s \le T_s\). This means that the robot will converge on the leader point faster than the time when the leader point converges on the goal, and eventually it will converge on the goal by tracking the leader point by \(T_s\).

Design of rotational dynamics between robot and leader point

The above connection models reveal the translational motions of the robot and the leader point. The rotational motions, however, should be treated because the robot cannot travel omni-directional without rotation, as mentioned above. Hence, the dampers and the springs, which are the same design for translational motions, are connected to rotate the leader point and the robot by generating attractive torques \(\tau _G\) and \(\tau _L\), respectively. Note that m is replaced with the moment of inertia I in rotation.

Unfortunately, \(\tau _L\) is insufficient to reach the leader point (the goal eventually) in most cases. This is because the sideward speed is absolutely dependent on the forward speed, i.e., the bipedal gait is a nonholonomic system similar to wheeled robots. To overcome this limitation of the sideward speed, an additional rotational torques \(\tau _S\) is required.

Now, the sideward component of \(\varvec{f}_L\) is newly divided into \(\hat{\varvec{f}}_L\) and \(f_S\) in accordance with the limitation of the sideward speed. \(f_S\) is assumed to generate the robot’s rotation, not translation, and therefore, it is converted into \(\tau _S\) as follows:

$$\begin{aligned} \tau _S = (1 \mp \mu _S)\frac{t_R}{2}f_S \end{aligned}$$

where \(t_R\) is the robot’s thickness. In addition, \(\mu _S \in (0,1)\) is a rotational friction coefficient, and therefore, the rotational friction works in the direction depending on the directions of rotational speed and \(\tau _S\): if the direction of \(\tau _S\) matches the direction of rotational speed, the friction direction is given to be minus; otherwise, it is given to be plus.

The leader point receives the reaction force \(-f_S\), which rotates it. Note that the point of load of \(-f_S\) is regarded as the rear of the leader point \(-t_R/2\), namely the rotational torques of the leader point is the same as \(\tau _S\) by canceling the sign. This reaction would result in that the robot moves only straight since the attitude error between the robot and the leader point becomes equal to 0 and the sideward error also becomes equal to 0.

Update of reference gait speed

The attractive forces, \(\varvec{f}_L\) (precisely \(\hat{\varvec{f}}_L\), which was replaced into \(\varvec{f}_L\) for simplicity) and \(\varvec{f}_G\), and the rotational torques, \(\tau _L\), \(\tau _G\), and \(\tau _S\), are given in above sections. Besides, the force to avoid the obstacles, \(\varvec{f}_O\) (and \(\tau _O\)), and the force to escape from the stationary points, \(\varvec{f}_E\), are designed in Appendices A, B, and C. Accordingly, the reference gait speed, \(\varvec{v}^{\text{ref}} = (v_x^{\text{ref}}, v_y^{\text{ref}}, v_w^{\text{ref}})^\top\), and the leader point’s position, \(\varvec{p}^L = (p_x^L , p_y^L , p_w^L)^\top\), are updated under the influence of them.

Firstly, \(\varvec{p}^L\) is updated under the condition that all forces and torques designed in this study act on the leader point.

$$\begin{aligned} \varvec{p}_{x,y}^L[t+1]\,=\, & {} \varvec{p}_{x,y}^L[t] + \dot{\varvec{p}}_{x,y}^L dt + \frac{( -\varvec{f}_L + \varvec{f}_G + \varvec{f}_O + \varvec{f}_E )}{2m} dt^2 \end{aligned}$$
$$\begin{aligned} \varvec{p}_w^L[t+1]\,=\, & {} \varvec{p}_w^L[t] + \dot{\varvec{p}}_w^L dt + \frac{(- \tau _L + \tau _S + \tau _G + \tau _O )}{2I} dt^2 \end{aligned}$$

where dt is a control period at t-th control step.

Secondly, \(\varvec{f}_L\), \(\tau _L\), and \(\tau _S\) act on the robot to update \(\varvec{v}^{\text{ref}}\). While paying attention to a constraint of update timing, when \(\varvec{v}^{\text{ref}}\) can be updated just after touchdown of swing leg unlike update of \(\varvec{p}^L\), \(\varvec{v}^{\text{ref}}\) are updated as follows:

$$\begin{aligned} \varvec{v}_{x,y}^{\text{ref}}[k+1]\,=\, & {} \varvec{v}_{x,y}^{\text{ref}}[k] + \frac{\sigma _m \varvec{f}_L}{m}T_{\text{sup}} \end{aligned}$$
$$\begin{aligned} v_w^{\text{ref}}[k+1]\,=\, & {} v_w^{\text{ref}}[k] + \frac{\sigma _m (\tau _L + \tau _S)}{I}T_{\text{sup}} \end{aligned}$$

where \(\sigma _m\) is a variable to adjust the robot’s mass virtually. The way to adjustment of \(\sigma _m\) is introduced in next “Optimization of virtual mass by selection algorithm for locomotion: VM-SAL” section.

Confirmation of convergence

To confirm the convergence of the gait speeds, simple numerical simulations are conducted. In the following simulations, the positions of the leader point and the robot are directly updated according to the given virtual dynamics. First simulations are in one dimension: the goal is in 20 m; the settling time is at 30 s; and a disturbance (\(-5\) N) will be injected from 10 to 20 s. Second simulations are in three dimensions (xyw): two goals are in (10 m, 10 m, 0°) and (0 m, 10 m, 0°); the settling time is at 30 s. The gait speeds converge on the reference gait speeds as given immediately, and the references are updated at about 0.35 s intervals in accordance with the gait step time of the actual robot. \(\sigma _m\) is fixed to 1. Other parameters are the same as Table 1.

Table 1 Parameters for robot model and proposed method

One-dimensional simulations

Simulation results are depicted in Fig. 3. When the robot was not disturbed, the robot reached the goal with smooth change of the speed. Even when the robot was disturbed, the robot reached the goal late.

Fig. 3
figure 3

Examples of the virtual dynamics in one dimension a the robot converged on the goal (20 m) by the settling time (30 s) without disturbance (\(-5\) N); b even with disturbance, the robot reach the goal late while smoothly changing its speed. a W/O disturbance, b W/ disturbance

As a remarkable point, the robot accelerated its speed again after the disturbance to catch up. This is the advantage of the virtual-dynamics-based (i.e., the state-based) reference speed generator.

Three-dimensional simulations

Simulation results are depicted in Fig. 4. The robot eventually reached the respective goals in both cases, although their routes were not linear. This is because the sideward speed is limited as mentioned in above.

Fig. 4
figure 4

Examples of the virtual dynamics in three dimensions in both cases, the robot could follow the leader point to the goal, although the distance between them seemed to be far. a Goal: (10 m, 10 m, 0°), b goal: (0 m, 10 m, 0°)

As a remarkable point, the distance between the robot and the leader point seemed to be so far, although it is in the observable range \(R_{obs}\). If this goes on, the robot would not avoid obstacles due to the tracking delay. This risk comes from no consideration of stability in the design of the virtual dynamics. Therefore, an additional design of \(\sigma _m\) to ensure asymptotic stability is proposed in the next section.

Optimization of virtual mass by selection algorithm for locomotion (VM-SAL)


We notice that \(\sigma _m\) changes the robot’s mass virtually so as to adjust trackability to the leader point. If \(\sigma _m\) is smaller than 1, the update of \(\varvec{v}^{\text{ref}}\) is restrained, and such a behavior facilitates the convergence on the limit cycles at current gait speeds; otherwise, it yields rapid acceleration or deceleration in pursuit of speed or stability. Both of these properties are required, but it is difficult to achieve them simultaneously. In this section, therefore, online optimization of \(\sigma _m\) by using SAL [19], called VM-SAL, is explained for secure and efficient bipedal gait.

An unconventional key point to apply SAL to the optimization of \(\sigma _m\) is to estimate an expected value of the evaluation function, which is maximized by optimizing \(\sigma _m\). This is because a main factor for the evaluation function is stochastically given by the virtual dynamics designed in above, and \(\sigma _m\) determines just a tendency of its dynamics.

Selection algorithm for locomotion

In general, locomotion has a trade-off relation between stability and efficiency (e.g., energy efficiency and gait speed). Several approaches to select locomotion, therefore, have been proposed for switching the priority of stability and efficiency according to the situation [19, 26, 27]. SAL is the state-of-the-art algorithm among those approaches. SAL is divided into two phases: a recognition phase and a selection phase (see ref. [19] for more details).

In the recognition phase, the robot estimates the many uncertainties for locomotion from sensors: in this paper, zero moment point (ZMP) errors on x- and y-axes; a touchdown timing error; a swing-leg trajectory error; a step height; and a slope angle. They are integrated stochastically as a falling risk S using a Bayesian network. The structure of the Bayesian network and the connection strength between the nodes are obtained via offline and online learning.

As reported in ref. [19], S is proportional to the change of the gait speed \(\Delta v\). The gait-speed-based falling risk \(S^v\) is therefore defined as follows:

$$\begin{aligned} S^v := S + C_v \Delta v \end{aligned}$$

where \(C_v\) is a coefficient, although it is simplified as 1.

In the selection phase, the robot reveals the desired balance of stability and efficiency, and adjusts the balance toward the desired one by changing its variables of locomotion. Here, the desired balance of stability and efficiency is defined the maximum efficiency within allowable falling risk (in most cases, without falling down). The variable is given as \(\sigma _m\) in this paper (see the next section).

To satisfy the desired balance of stability and efficiency, locomotion reward R is defined by product of a smooth threshold of the allowable falling risk and the efficiency given as a linear combination of the gait speed v and a reciprocal of specific resistance (or cost of transport) \(1/C_{mt}\) [28].

$$\begin{aligned} {R := \left( 1 - \gamma _S^{\text{reg}} \sum _{n=0}^N \frac{\gamma _{S}^n}{1+{\text {e}}^{r(1/2-S(t+n))}} \right) \left( C_{R1} v + \frac{C_{R2}}{C_{mt}} \right) } \end{aligned}$$

where \(C_{R1} = 100\) and \(C_{R2} = 20\) are coefficients used to adjust the priority between v and \(C_{mt}\). t is the current time step, and \(N = 5\) is the maximum time step used for predicting the falling risk. Further, r designs the shape of the logistic function. \(\gamma _S = 0.8\) is the reliability of the prediction and \(\gamma _S^{\text {reg}}\) is a normalization term.

Assumption for interaction forces as stochastic variables

Maximization of R yields the desired balance of stability and efficiency for locomotion. The virtual mass variable \(\sigma _m\) is therefore newly optimized for the purpose of maximization of R. Now, the role of \(\sigma _m\) has appeared in Eqs. (21) and (22): large/small \(\sigma _m\) strengthens/weakens the influences of given interaction forces and torques. In other words, \(\sigma _m\) can adjust \(\Delta v\) in SAL indirectly.

However, dominant factors for \(\Delta v\) are obviously the interaction forces and torques, \(\varvec{f}_L\), \(\tau _L\), and \(\tau _S\), and therefore, the effect of \(\sigma _m\) for R highly depends on them. This means that \(\sigma _m\) would oscillate if it is optimized by previous SAL, which decides optimal values deterministically.

To solve this problem, the interaction forces and torques are regarded as stochastic variables: i.e., \(\Delta v\) is also regarded as a stochastic variable. In general, when the gait speed v is under the average speed to travel from the start to the goal by \(T_s\), namely \(\bar{v}\), \(\Delta v\) tends to be positive to accelerate; otherwise, \(\Delta v\) tends to be negative to decelerate. Hence, \(\Delta v\) has a skewness depending on v in its distribution.

To represent this skewness, \(\Delta v\) is assumed to be following skew normal distribution \(\mathcal {SN}\) proposed in ref. [20].

$$\begin{aligned} \Delta v:= \,& {} x \sim \mathcal {SN}(\mu ,\sigma ,\lambda ) \nonumber \\= & {} \frac{2}{\sigma } \phi \left( \frac{x-\mu }{\sigma } \right) \Phi \left( \lambda \frac{x-\mu }{\sigma } \right) \end{aligned}$$

where \(\Delta v\) is replaced as x for the sake of convenience. \(\phi\) is the standard normal probability density function with its cumulative distribution function \(\Phi\). This \(\mathcal {SN}\) has three parameters that should be given: a location \(\mu\); a scale \(\sigma\); and a shape \(\lambda\). From these three, its mean \(\mu _{\text {skew}}\), its variance \(\sigma _{\text {skew}}\), and its skewness \(\gamma _{\text {skew}}\) are derived.

Now, these three parameters, \(\mu\), \(\sigma\), and \(\lambda\) are designed according to the behavior of \(\Delta v\). For the sake of convenience, \(\delta\) is defined as follows:

$$\begin{aligned} \delta (\lambda ) := \sqrt{ \frac{2}{\pi } } \frac{\lambda }{\sqrt{1+\lambda ^2} } \end{aligned}$$

where \(\delta\) is within \([0,\sqrt{2/\pi })\).

With respect to \(\mu\), it is simply designed to be 0. In that case, \(\mu _{\text {skew}}\) is given to be \(\mu _{\text {skew}} = \sigma \delta\).

We can easily assume that \(\sigma\) correlates with \(\sigma _m\) since \(\Delta v\) is proportional to \(\sigma _m\). \(\sigma\) is, however, not actual variance of \(\mathcal {SN}\), which is absolutely proportional to \(\sigma _m\). Thus, the relation between \(\sigma\) and \(\sigma _m\) is given via \(\sigma _{\text {skew}}\) as follows:

$$\begin{aligned} \sigma _{\text {skew}} =\, & {} \sigma _0 \sigma _m = \sigma \sqrt{1-\delta ^2} \nonumber \\ \therefore \sigma= & {} \frac{\sigma _0 \sigma _m}{\sqrt{1-\delta ^2}} \end{aligned}$$

where \(\sigma _0\) is default variance of \(\mathcal {SN}\).

Finally, \(\lambda\) is designed to represent the dependency on v.

$$\begin{aligned} \lambda (v) := \lambda _0 \left( 1 - \frac{v}{\bar{v}} \right) \end{aligned}$$

where \(\lambda _0\) is an initial skewness at \(v = 0\). The case with \(\lambda _0 = 5\) and \(\bar{v} = 0.5\) is illustrated in Fig. 5.

Fig. 5
figure 5

Skewness of \(\mathcal {SN}\) depending on v when v is smaller than \(\bar{v}=0.5\), the skewness is plus since the dynamics tends to accelerate v; otherwise, it tends to decelerate v. a Overview, b typical cases

Maximization of expected value of locomotion reward

From the above skew normal distribution, the expected value of R, namely \(E_R\), can be derived as follows:

$$\begin{aligned} E_R(\sigma _m) \,= & {} \int _{-\infty }^{\infty } R(x) \mathcal {SN}(0,\sigma (\sigma _m),\lambda )dx \end{aligned}$$
$$\begin{aligned} \!\!\simeq & {} \!\!\int _{-\infty }^{\infty } \!\!(\eta _0 \!+\! \eta _1 x \!+\! \eta _2 x^2 \!+\! \eta _3 x^3) \mathcal {SN}(0,\sigma (\sigma _m),\lambda )dx \end{aligned}$$
$$\begin{aligned} \,= \,& {} \eta _0 + \eta _1 \mu _{\text {skew}} + \eta _2 (\sigma _{\text {skew}}^2 + \mu _{\text {skew}}^2) \nonumber \\ \!\! \quad+ \,& {} \eta _3 (\gamma _{\text {skew}} \sigma _{\text {skew}}^3 + 3 \mu _{\text {skew}} \sigma _{\text {skew}}^2 + \mu _{\text {skew}}^3) \end{aligned}$$

Note that this integral is hard to calculate analytically. R is therefore approximated by third-order Maclaurin expansion, which gives an analytical solution according to properties of probability distribution by using \(\mu _{\text {skew}}\), \(\sigma _{\text {skew}}\), and \(\gamma _{\text {skew}}\). Here, η0–3 means 0–3-th order terms of R regarding \(\sigma _m\).

To maximize \(E_R\), the virtual mass factor \(\sigma _m\) is optimized by using SAL, called VM-SAL. This purpose can be achieved by using the gradient of \(E_R\) with respect to \(\sigma _m\), \(\partial E_R/\partial \sigma _m\). Namely, \(\sigma _m\) should be updated according to the direction of the gradient as follows:

$$\begin{aligned} \sigma _m[k+1] = \sigma _m[k] + G_\sigma \left( \max \frac{\partial E_R}{\partial \sigma _m} \right) ^{-1} \frac{\partial E_R}{\partial \sigma _m} \end{aligned}$$

where \(G_\sigma\) is a gain. The gradient is normalized by its maximum value (it would be at \(v = 2\bar{v}\) and \(S = 0.5\)).

One typical sample of the gradient, which has \(\sigma _m = 1\) and other parameters are the same as Table 1, is illustrated in Fig. 6. The gradient in Fig. 6 is divided into five regions to determine the behavior of VM-SAL.

  1. 1.

    Stable state with low speed (the gradient is plus): the robot would accelerate rapidly to get high speed.

  2. 2.

    Stable state with high speed (the gradient is minus): the robot would keep high speed for efficiency to reach the goal fast.

  3. 3.

    Unstable state with low speed (the gradient is minus): the robot would keep low speed for stability not to fall down.

  4. 4.

    Unstable state with high speed (the gradient is plus): the robot would decelerate rapidly to travel carefully.

  5. 5.

    Highly unstable state (the gradient is almost zero): this region would be out of scope of VM-SAL.

These behaviors are certainly reasonable similar to human behaviors, although they are absolutely determined based on the expected value. Namely, we notice that they would not be always expected.

Fig. 6
figure 6

A sample of the gradient this gradient is divided into five regions; when stable state, the robot expects to go to/stay in high speed; when uncertain state, the robot expects to go to/stay in low speed; when further unstable state, the robot judges that this region is out of scope of VM-SAL. a Overview, b region division


Simulation conditions

Robot details

Following two types of simulations are conducted on a simulator named V-REP [29]. An using robot model is created based on Gorilla Robot III that has been developed for a prototype of multi-locomotion robot [13, 30], as shown in Fig. 7. This model measures whole joint angles by respective encoders, and three-axis angular velocities by a gyro sensor mounted on the torso, and three-axis acceleration by an acceleration sensor mounted on the torso. They are used to predict the current COG states. In addition, the environmental map is given in advance, and a laser sensor is assumed to be used to estimate the self location. A contact model between the ground and the foot of the robot is defined as a non-slip model, and to this end, the COG trajectory is forcibly modified under the limitation of the friction pyramid and gravitational acceleration. From the COG trajectory and the swing leg trajectory, the resolved momentum control [9] generates whole joint angles, which are kinematically obtained almost exactly. Note that the above calculation time to generate whole joint angles was confirmed to be less than 1 ms by Intel Core i7 (2.2 GHz), which is the control step time in the following simulations.

Fig. 7
figure 7

Modified model of Gorilla Robot III for simulation its weight is about 22 kg; its height is about 1 m; total DOFs are given as 22

Parameters for the robot model and proposed method are summarized in Table 1. Note that torque and angular velocity of all joints are almost limitless for simplicity. Now, the gait speed is represented by a dimensionless format, i.e., a Froude number \(F_{rd}\) calculated by \(v/\sqrt{g\ell _{\text {leg}}}\) (\(\ell _{\text {leg}}\) is a leg length).

Environment details

In the first type of simulation, the robot will go toward the goal: \((p_x, p_y, p_w)\) = (15 m, 15 m, 90°). In the middle of traveling, four pillars are arranged as obstacles to disturb traveling. This simulation is desired to be finished by \(T_s = 45\) s, namely, \(\bar{v}\) is derived to be about 0.2.

In the second type of simulation, the robot will go straight toward the goal: \((p_x, p_y, p_w)\) = (25 m, 0 m, 0°). The settling time is given as \(T_s = 30\) s, namely, \(\bar{v}\) is derived to be about 0.5. When the gait speed of our robot model is over 0.5, the gait will transit to running in pursuit of energy minimization [13, 31], although running is easy to break its balance [32]. A slope with 5° inclination is set on the way, and therefore, the robot should transit to walking again for secureness.

In both types, two cases, without and with VM-SAL, are compared to evaluate the performance of VM-SAL. In terms of secure traveling, a distance between the robot and the leader point or a phase of the gait formed by \((\theta , \dot{\theta }, \dot{\phi })\) is confirmed. In terms of efficient traveling, the specific resistance \(C_{mt}\) [28] is evaluated.

The above simulations are finished as successful cases when the robot steps into the radius of 0.5 m of the goal at a stopable gait speed (\(|F_{rd}| < 0.1\)). This stopable gait speed was given from experience of experiments and simulations so far. In failure cases, the COG trajectory cannot be generated by PDAC due to the infeasible initial parameters, and the robot would fall down eventually.

Simulation results

All simulations were recorded in the attached video. In the following, effectiveness of VM-SAL is verified from the simulation results (Additional file 1).

Traveling while avoiding obstacles

Geometric trajectories from the start to the goal, ZMP margins, and forward gait speeds were depicted in Figs. 8, 9, and 10, respectively. In both cases, the robot reached the goal while avoiding the obstacles, although the case without VM-SAL sometimes had the ZMP on the edge of the support polygons, namely it might be failed to keep the dynamic constraints. In first and last areas (i.e., stable areas), VM-SAL yielded the rapid acceleration rather than the case without VM-SAL for efficiency. In the obstacles area, VM-SAL decreased the gait speed and kept it low for secureness since locomotion was slightly deviated from steady walking to avoid the obstacles, and that caused increase of the falling risk.

Fig. 8
figure 8

Trajectories of the robot and the leader point from the start to the goal in both cases, the robot could track the leader point to the goal without avoiding the obstacles

Fig. 9
figure 9

ZMP margin scaled by the maximum distance to the edge of the support polygons a when stepping into the obstacle area, the robot was disturbed by the obstacles, and ZMP was instantaneously on the edge of the support polygons; b the robot succeeded to keep ZMP in the support polygons

Fig. 10
figure 10

Reference and actual forward gait speed a even in the obstacles area, the gait speed was hardly decreased, which caused oscillation of actual gait speed; the robot could not brake its speed on the goal, and wasted about 16 s to reach the goal; b in the first stable area, the gait speed was rapidly accelerated to prioritize efficiency; in the obstacles area, the robot braked to prioritize stability

As can be seen in these figures, in the case without VM-SAL, the robot went through the goal due to insufficient brake, while in the case with VM-SAL, the robot could stop near the goal on the first attempt. As a result, respective arrival times were highly different: 49.6 s in the case without VM-SAL; and 35.4 s in the case with VM-SAL.

To confirm the behavior of VM-SAL, observed data were plotted in Fig. 11. The change of \(\sigma _m\) was confirmed as intended, although the behavior resulting from it was not always as expected. In the obstacles area (i.e., the region 4), \(\sigma _m\) became large for secure traveling, and actually, the gait speed was decelerated. The mean \(\sigma _m\) was large, 4.6, since the robot’s state was not stepped in the regions 2 and 3 deeply. This is due to influence of instability by high speed and weak disturbance by obstacles. Such large \(\sigma _m\) instead enabled to stop on the goal as a result, although it is expected to achieve high speed.

Fig. 11
figure 11

Verification of behavior of VM-SAL the plotted data has the size based on \(\sigma _m^{-1}\) and the color based on the change of \(\sigma _m\); in the region 4 with above-average speed and high falling risk (i.e., when initial stage in the obstacles area), \(\sigma _m\) became large to easily decelerate for stability; in the middle of and after the obstacles area, excess increase of \(\sigma _m\) was restrained due to above-average speed and low falling risk (i.e., in the region 2)

Two types of indexes were evaluated in addition to the arrival time (see Fig. 12): the distances between the robot and the leader point \((\Delta r , \Delta \theta )\) for secureness; and the specific resistance \(C_{mt}\) for efficiency. Keeping the distances short yields the gradual update of the gait speeds, which would reduce the risk of falling down by large acceleration/deceleration. Conversely, \(\sigma _m\) optimized by VM-SAL tends to make \((\Delta r , \Delta \theta )\) small. Furthermore, such secure acceleration and deceleration yielded rapid convergence to the steady state, namely \(C_{mt}\) became small about half.

Fig. 12
figure 12

Evaluation of secureness and efficiency both indexes are improved about twice by VM-SAL. a Distance between the robot and the leader point. b Specific resistance in related to the gait speed

Traveling on slope

Time-series data of ZMP margins and the reference and actual gait speeds were depicted in Figs. 13 and 14. In both cases, the gait transited to running when its speed was over 0.5, while keeping ZMP in the support polygons. The robot failed to travel on the slope in the case without VM-SAL due to delayed deceleration to transit to walking. In contrast, the robot started to decelerate the gait speed rapidly before stepping into the slope in the case with VM-SAL. Consequently, the robot succeeded in reaching the goal beyond the slope.

Fig. 13
figure 13

ZMP margin scaled by the maximum distance to the edge of the support polygons In both cases, ZMP could be kept in the support polygons, although its margin was small when the robot ran; a in the case without VM-SAL, the robot could not keep its balance on the slope and failed; b in the case with VM-SAL, even on the slope, the robot got steady walking and ZMP also became stable

Fig. 14
figure 14

Reference and actual forward gait speed a the robot fell down just after stepping into the slope because on delayed deceleration; b the gait speed was decelerated before stepping into the slope, and kept low for secureness

To confirm the behavior of VM-SAL, observed data were plotted in Fig. 15. A notable difference from traveling in the area with obstacles shown in Fig. 11 appeared in the slope area. Namely, the gait speed was decelerated but the slope kept the falling risk high, and therefore, \(\sigma _m\) became smaller than 1 in the region 3. Small \(\sigma _m\) prevented disturbance caused by unnecessary acceleration and deceleration. This behavior yielded the secureness to succeed in going through the slope, although the arrival time was delayed by about 10 s.

Fig. 15
figure 15

Verification of behavior of VM-SAL the plotted data has the size based on \(\sigma _m^{-1}\) and the color based on the change of \(\sigma _m\); on the slope, the state was kept in the region 3 to prevent speed fluctuation; the region 2 was hardly visited because running with high speed tends to be risky

Two indexes were evaluated in addition to the arrival time (see Fig. 16): the phase formed by \((\theta , \dot{\theta }, \dot{\phi })\) to confirm asymptotic stability; and the specific resistance \(C_{mt}\) for efficiency. As can be seen in the phase, the respective gaits (walking on flat, walking on slope, and running) converge to respective limit cycles, namely, asymptotic stability could be judged to be guaranteed. On the other hand, the case without VM-SAL (on slope) could not converge to specific limit cycle, and the COG trajectory was failed to be generated by PDAC. Larger \(C_{mt}\) than the case without VM-SAL (this data was evaluated until falling) was due to influence of the slope, where potential energy was additionally required. Even in consideration of that point, the case with VM-SAL achieved fairly good efficiency in comparison with walking on flat by a local-stability-based method conducted in ref. [21] (\(C_{mt} = 0.57\) by the actual robot).

Fig. 16
figure 16

Evaluation of secureness and efficiency a all gaits with VM-SAL achieved respective limit cycles, while the case without VM-SAL (on slope) failed to converge to specific limit cycle; b slightly high \(C_{mt}\) included potential energy to go up the slope. a Phase formed by \((\theta , \dot{\theta }, \dot{\phi })\). b Specific resistance in related to the gait speed


In this paper, we achieved the secure and efficient reference gait speed generator, i.e., the virtual dynamics with VM-SAL. The virtual dynamics was given among the robot—the leader point—the goal, and designed based on the desired time to arrive at the goal, \(T_s\), and the acceleration time, \(T_a\). Namely, this dynamics enabled the robot to reach the goal on time implicitly if no disturbances were added. The reference gait speeds were generated from this dynamics, however, they did not always maintain the limit cycle at respective gait speeds. Alternatively, more rapid acceleration and deceleration would be possible.

To ensure asymptotic stability and enhance efficiency, SAL optimizes the robot’s mass virtually depending on the gait speed and the falling risk, called VM-SAL. This VM-SAL aimed to maximize the locomotion reward stochastically. Namely, its expected value was maximized by updating the virtual mass variable \(\sigma _m\) in accordance with its gradient.

As a result, the robot traveled from the start to the goal in two types of environment: one is with the four pillars as obstacles; another is with the 5° up slope. When the robot traveled in the area with these obstacles, VM-SAL improved both trackability (i.e., secureness) and the specific resistance (i.e., efficiency) doubled in comparison with the case without VM-SAL. When the robot traveled on the slope, VM-SAL achieved rapid transition between walking and running according to the gait speed and prevented disturbance caused by unnecessary acceleration and deceleration. Such transition succeeded in traveling even though the case without VM-SAL failed to travel.

To regard \(\Delta v\) as the stochastic variable shown in Fig. 5 is a fairly rough assumption, which would cause unexpected behaviors. Future challenge of this research is therefore to reflect observed data into parameters of the stochastic variable for more effective optimization of \(\sigma _m\). Such reflection restrains the behavior contrary to expectation.


  1. McGeer T (1990) Passive dynamic walking. Int J Robot Res 9:62–82

    Article  Google Scholar 

  2. Hobbelen DGE, Wisse M (2008) Controlling the walking speed in limit cycle walking. Int J Robot Res 27(9):989–1005

    Article  Google Scholar 

  3. Luo X, Zhu L, Xia L (2015) Principle and method of speed control for dynamic walking biped robots. Robot Auton Syst 66:129–144

    Article  Google Scholar 

  4. Kobayashi T, Aoyama T, Hasegawa Y, Sekiyama K, Fukuda T (2016) Adaptive speed controller using swing leg motion for 3-D limit-cycle-based bipedal gait. Nonlinear Dyn 84(4):2285–2304

    Article  MathSciNet  Google Scholar 

  5. Kobayashi T, Sekiyama K, Hasegawa Y, Aoyama T, Fukuda T (2016) Quasi-passive dynamic autonomous control to enhance horizontal and turning gait speed control. In: IEEE/RSJ international conference on intelligent robots and systems, pp 5612–5617

  6. Gregg RD, Righetti L (2013) Controlled reduction with unactuated cyclic variables: application to 3d bipedal walking with passive yaw rotation. IEEE Trans Autom Control 58(10):2679–2685

    Article  MathSciNet  MATH  Google Scholar 

  7. Gregg RD, Tilton AK, Candido S, Bretl T, Spong MW (2012) Control and planning of 3-d dynamic walking with asymptotically stable gait primitives. IEEE Trans Robot 28(6):1415–1423

    Article  Google Scholar 

  8. Kajita S, Kanehiro F, Kaneko K, Fujiwara K, Yokoi K, Hirukawa H (2003) Biped walking pattern generation by a simple three-dimensional inverted pendulum model. Adv Robot 17(2):131–147

    Article  Google Scholar 

  9. Kajita S, Kanehiro F, Kaneko K, Fujiwara K, Harada K, Yokoi K, Hirukawa H (2003) Resolved momentum control: humanoid motion planning based on the linear and angular momentum. IEEE/RSJ Int Conf Intell Robots Syst 2:1644–1650

    Google Scholar 

  10. Kuffner J, Nishiwaki K, Kagami S, Inaba M, Inoue H (2005) Motion planning for humanoid robots. Robot Res 15:365–374

    MATH  Google Scholar 

  11. Perrin N, Stasse O, Baudouin L, Lamiraux F, Yoshida E (2012) Fast humanoid robot collision-free footstep planning using swept volume approximations. IEEE Trans Robot 28(2):427–439

    Article  Google Scholar 

  12. Gregg RD, Spong MW (2010) Reduction-based control of three-dimensional bipedal walking robots. Int J Robot Res 29(6):680–702

    Article  Google Scholar 

  13. Kobayashi T, Sekiyama Y, Hasegawa Y, Aoyama T, Fukuda T (2018) Unified bipedal gait for autonomous transition between walking and running in pursuit of energy minimization. Robot Auton Syst 103:27–41

    Article  Google Scholar 

  14. Khatib O (1986) Real-time obstacle avoidance for manipulators and mobile robots. Int J Robot Res 5(1):90–98

    Article  Google Scholar 

  15. Deng M, Inoue A, Sekiguchi K, Jiang L (2010) Two-wheeled mobile robot motion control in dynamic environments. Robot Comput Integrated Manuf 26(3):268–272

    Article  Google Scholar 

  16. Das AK, Fierro R, Kumar V, Ostrowski JP, Spletzer J, Taylor CJ (2002) A vision-based formation control framework. IEEE Trans Robot Autom 18(5):813–825

    Article  Google Scholar 

  17. Asl AN, Menhaj MB, Sajedin A (2014) Control of leader-follower formation and path planning of mobile robots using asexual reproduction optimization (ARO). Appl Soft Comput 14:563–576

    Article  Google Scholar 

  18. Seto F, Sugihara T (2010) Motion control with slow and rapid adaptation for smooth reaching movement under external force disturbance. In: IEEE/RSJ international conference on intelligent robots and systems, pp 1650–1655

  19. Kobayashi T, Aoyama T, Sekiyama K, Fukuda T (2015) Selection algorithm for locomotion based on the evaluation of falling risk. IEEE Trans Robot 31(3):750–765

    Article  Google Scholar 

  20. Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12(2):171–178

    MathSciNet  MATH  Google Scholar 

  21. Aoyama T, Hasegawa Y, Sekiyama K, Fukuda T (2009) Stabilizing and direction control of efficient 3-D biped walking based on PDAC. IEEE/ASME Trans Mech 14(6):712–718

    Article  Google Scholar 

  22. Kobayashi T, Sekiyama K, Aoyama T, Fukuda T (2015) Cane-supported walking by humanoid robot and falling-factor-based optimal cane usage selection. Robot Auton Syst 68:21–35

    Article  Google Scholar 

  23. Kobayashi T, Sekiyama K, Aoyama T, Hasegawa Y, Fukuda T (2016) Selection of two arm-swing strategies for bipedal walking to enhance both stability and efficiency. Adv Robot 30(6):386–401

    Article  Google Scholar 

  24. Grizzle JW, Abba G, Plestan F (2001) Asymptotically stable walking for biped robots: analysis via systems with impulse effects. IEEE Trans Autom Control 46(1):51–64

    Article  MathSciNet  MATH  Google Scholar 

  25. Pratt J, Carff J, Drakunov S, Goswami A (2006) Capture point: a step toward humanoid push recovery. In: IEEE-RAS international conference on humanoid robot, pp 200–207

  26. Toda K, Tomiyama K (2007) An adaptive biped gait generation scheme utilizing characteristics of various gaits. In: Humanoid robots, human-like machines, InTech, New York, pp 228–244

  27. Saglam CO, Byl K (2014) Quantifying the trade-offs between stability versus energy use for under actuated biped walking. In: IEEE/RSJ international conference on intelligent robots and systems, pp 2550–2557

  28. Collins S, Ruina A, Tedrake R, Wisse M (2005) Efficient bipedal robots based on passive-dynamic walkers. Science 307:1082–1085

    Article  Google Scholar 

  29. Rohmer E, Singh SPN, Freese M (2013) V-REP: a versatile and scalable robot simulation framework. In: IEEE/RSJ international conference on intelligent robots and systems, pp 1321–1326

  30. Fukuda T, Hasegawa Y, Sekiyama K (2012) Multi-locomotion robotic systems: new concepts of bio-inspired robotics. Springer, Berlin

    Book  MATH  Google Scholar 

  31. Srinivasan M, Ruina A (2006) Computer optimization of a minimal biped model discovers walking and running. Nature 439(5):72–75

    Article  Google Scholar 

  32. Seipel JE, Holmes P (2005) Running in three dimensions: analysis of a point-mass sprung-leg model. Int J Robot Res 24(8):657–674

    Article  Google Scholar 

Download references

Authors’ contributions

TF, YH, and KS designed and directed the project. TK and TA processed the experimental data, performed the analysis, drafted the manuscript. All authors read and approved the final manuscript.


This work was supported by JSPS KAKENHI, Grant-in-Aid for JSPS Fellows, Grant Number 16J05354.

Competing interests

The authors declare that they have no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Taisuke Kobayashi.

Additional file

Additional file 1.

Simulation videos: the robot of gray/wine color are with/without VM-SAL, respectively; in two scenarios, VM-SAL improved the gait stability and efficiency and enabled the robot to stop on the given goals smoothly.



Appendix A: Avoidance of obstacles

To avoid obstacles, many studies [14, 15] deal with virtual repulsive forces from obstacles to the robot. In limit-cycle-based bipedal gait, however, such repulsive forces may deviate the robot’s state from the limit cycle to guarantee stability, thereby causing falling down. This problem can be solved by indirect interaction with the obstacles by means of the leader point, as mentioned in “Overview” section.

Here, the obstacles generating the repulsive forces are restricted to the ones that are in the circle around the leader point with \(R_{obs}\) radius. If Eq. (17) is properly satisfied, this circle would fit into the maximum observable range of the robot. This restriction reduces calculation cost, while ensuring the minimum necessary number of the obstacles.

The repulsive force generated by the i-th observed obstacle \(O_i\), \(\varvec{f}_i\), is assumed to be inversely proportional to the square of the distance between \(O_i\) and the leader point, \(r_i\), and the masses of both, \(m_i\) and m. \(r_i\) is given as Mahalanobis distance calculated using major and minor axes of the ellipse covering \(O_i\), \(a_i\) and \(b_i\), as shown in Fig. 17a.

$$\begin{aligned} r_i = \sqrt{ \left( \frac{p_{ix}}{b_i} \right) ^2 + \left( \frac{p_{iy}}{a_i} \right) ^2 } \end{aligned}$$

where \(\varvec{p}_i = (p_{ix} , p_{iy})^\top\) is the leader point’s position relative to the coordinate of \(O_i\). The coordinate of \(O_i\) defines its origin on the point, where the leader point may collide, and the y axis as the major axis of the ellipse.

Fig. 17
figure 17

Illustrations of the forces to avoid obstacles and reach the goal a to avoid collision with obstacles, repulsive forces according to the Mahalanobis distances with them are added; b to escape from stationary points caused by obstacles, a part of the attractive force is rotated along with a tangential direction of the contour lines. a Repulsive force, b explorative force

Now, the way to design \(m_i\) and a constant of proportionality \(G_i\) is introduced, although they have been designed experimentally in almost cases of previous studies. This study focuses on kinetic energy by the velocity of the leader point approaching to \(O_i\), given \(v_i\). Note that \(v_i\) is divided into three cases: whether the actual velocity of the leader point \(\varvec{v}^L = (v_x^L , v_y^L)^\top\) intersects \(O_i\); whether \(\varvec{v}^L\) intersects the y axis of \(O_i\); and whether \(\varvec{v}^L\) is not directed to \(O_i\).

$$\begin{aligned} v_i = {\left\{ \begin{array}{ll} \Vert \varvec{v}_L \Vert _2 &{} | - \frac{v_y^L}{v_x^L} p_{ix} + p_{iy} | \le a_i \\ \frac{\varvec{v}_L \cdot ( - \varvec{p}_i )}{\Vert \varvec{p}_i \Vert _2} &{} | - \frac{v_y^L}{v_x^L} p_{ix} + p_{iy} | > a_i \\ 0 &{} - \frac{v_y^L}{v_x^L} \ge 0 \end{array}\right. } \end{aligned}$$

The kinetic energy produced by \(v_i\) should be converted into the potential energy stored by the repulsive force field not to collide with \(O_i\). The conversion to the potential energy would be started just after the observation of \(O_i\). The farthest distance at starting the observation is assumed to be \(R_{obs}/b_i\). If conversion to the potential energy is finished at \(\gamma _O R_{obs}/b_i\), where \(\gamma _O \in (0,1)\) is an appropriate magnification, \(m_i\) and \(G_i\) are derived together from the relation between the kinetic and potential energy.

$$\begin{aligned} \frac{1}{2}m v_i^2= & {} \int _{\frac{R_{obs}}{b_i}}^{\gamma _O \frac{R_{obs}}{b_i}} - G_i \frac{m m_i}{r_i^2} dr = G_i m m_i \frac{1-\gamma _O}{\gamma _O}\frac{b_i}{R_{obs}}\end{aligned}$$
$$\begin{aligned} \therefore G_im_i= & {} \frac{1}{2}\frac{\gamma _O}{1-\gamma _O} \frac{R_{obs}}{b_i} v_i^2 \end{aligned}$$

Even when the observation is started at \(R_{obs}/a_i\), which means the closest case, with this design, the kinetic energy is converted into the potential energy since the repulsive force becomes stronger than the farthest case.

From the above, total of the repulsive forces, \(\varvec{f}_O\), is given as follows:

$$\begin{aligned} \varvec{f}_O = \sum _{i=1}^{N_O} \varvec{f}_i = - \frac{m R_{obs}}{2} \frac{\gamma _O}{1-\gamma _O} \sum _{i=1}^{N_O} \frac{v_i^2}{b_i r_i^2} \end{aligned}$$

where \(N_O\) is the number of the obstacles in the observable range of the robot.

Appendix B: Rotational torque to contour lines of the gradient

To facilitate avoidance of obstacles and escape from stationary points, it is effective to rotate the attitude of the leader point toward contour lines of the gradient generated by obstacles, as can be seen in Appendix C. Such rotation is assumed to be completed in time \(T_O\), when momentum of the leader point will be consumed by the repulsive force \(\varvec{f}_O\). Finally, the leader point is rotated by dynamics of critical damping, which is designed on the basis of settling time \(T_O\).

Before considering the momentum of the leader point, an average force of \(\varvec{f}_O\), \(\bar{f}_O\), is derived from the relation between work and energy. Namely, \(\varvec{f}_O\) is assumed as a non-conservative force. \(\bar{f}_O\) is derived according to this assumption as follows:

$$\begin{aligned} \frac{1}{2} m v_O^2= & {} - \bar{f}_O (1-\gamma _O)R_{obs} \nonumber \\ \therefore \bar{f}_O= & {} - \frac{1}{2} m v_O^2 \frac{1}{(1-\gamma _O)R_{obs}} \end{aligned}$$

where the minus sign in right-hand side of above equation is given since \(\varvec{f}_O\) is the repulsive force.

\(\bar{f}_O\) can be converted into an impulse \(\bar{f}_O T_O\), which is allowed to dissipate the momentum of the leader point. Namely, \(T_O\) is derived from law of conservation of momentum as follows:

$$\begin{aligned} 0 - m v_O= \,& {} \bar{f}_O T_O \nonumber \\ \therefore \,\, T_O= & {} -\frac{m v_O}{\bar{f}_O} = \frac{2(1-\gamma _O)R_{obs}}{v_O} \end{aligned}$$

By regarding \(T_O\) as the settling time of the critical damping, a damper \(c_O\) and a spring \(k_O\), which are connected between the leader point and the contour line for convergence destination, are given as well as \(c_{L,G}\) and \(k_{L,G}\). The rotational torque \(\tau _O\) is generated from \(c_O\) and \(k_O\) as follows:

$$\begin{aligned} \tau _O = - c_O \dot{\Delta p_w^O} - k_O \Delta p_w^O \end{aligned}$$

where \(\Delta p_w^O\) and \(\dot{\Delta p_w^O}\) are angular difference and its velocity between the leader point and the contour line. Note that the direction of \(\tau _O\), in other words which direction of the contour line becomes targeted, is in accordance with the direction close to a current attitude of the leader point.

Appendix C: Escape from stationary points

As a drawback of the dynamics-based reference generators, such as an artificial potential field approach, stationary points would be caused from equilibrium of acting forces (i.e., \(-\varvec{f}_L\), \(\varvec{f}_G\), and \(\varvec{f}_O\)). The leader point should escape from such stationary points by adding an explorative force \(\varvec{f}_E\). Note that the robot (i.e., the reference gait speeds) does not requrie the explorative force since the robot is towed only by the leader point, namely no stationary points are caused.

\(\varvec{f}_E\) is generated based on the attractive force \(\varvec{f}_G\) (see Fig. 17b). Firstly, a required level of \(\varvec{f}_E\), named \(\lambda _E\), should be considered. When the equilibrium of acting forces occurs, its dominant forces are estimated as \(\varvec{f}_G\) and \(\varvec{f}_O\). Namely, if \(\varvec{f}_G\) and \(\varvec{f}_O\) are not similar vectors to each other, in particular vectors in the opposite direction, \(\varvec{f}_E\) is required. \(\lambda _E\) is therefore given as a cosine similarity between \(\varvec{f}_G\) and \(\varvec{f}_O\).

$$\begin{aligned} \lambda _E = {\left\{ \begin{array}{ll} 1-\frac{ \varvec{f}_G \cdot \varvec{f}_O }{ \Vert \varvec{f}_G \Vert _2 \Vert \varvec{f}_O \Vert _2 } &{} \quad \varvec{f}_G \cdot \varvec{f}_O > 0 \\ 1 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

where \(\lambda _E\) is in the range of [0, 1]. When \(\varvec{f}_G\) and \(\varvec{f}_O\) are opposite vectors to each other, \(\lambda _E\) is fixed to be equal to 1.

Secondly, a direction of \(\varvec{f}_E\) should be considered. This direction is basically given as a tangential direction of the contour lines of \(\varvec{f}_O\) so as to minimize \(\varvec{f}_O\) while avoiding the obstacles. The leader point is rotated along to the contour lines, hence, the direction of \(\varvec{f}_E\) is reasonably given to be the attitude of the leader point \(p_w^L\).

\(\varvec{f}_E\) is therefore designed as follows:

$$\begin{aligned} \varvec{f}_E = {^{\text {E}}R_{\text {G}}} \lambda _E \varvec{f}_G \end{aligned}$$

where \({^{\text {E}}R_{\text {G}}}\) is a rotational matrix from the direction of \(\varvec{f}_G\) to \(p_w^L\).

If part of \(\varvec{f}_G\) is assumed to be converted into \(\varvec{f}_E\), \(\varvec{f}_G\) is modified as follows:

$$\begin{aligned} \varvec{f}_G \Leftarrow ( 1 - \lambda _E ) \varvec{f}_G \end{aligned}$$

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kobayashi, T., Sekiyama, K., Hasegawa, Y. et al. Virtual-dynamics-based reference gait speed generator for limit-cycle-based bipedal gait. Robomech J 5, 18 (2018).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: