Obstacle avoidance shape control of deformable linear objects with online parameters adaptation based on differentiable simulation

Ying, Changjian; Yamazaki, Kimitoshi

doi:10.1186/s40648-024-00283-1

Research
Open access
Published: 27 September 2024

Obstacle avoidance shape control of deformable linear objects with online parameters adaptation based on differentiable simulation

Changjian Ying¹ &
Kimitoshi Yamazaki²

ROBOMECH Journal volume 11, Article number: 15 (2024) Cite this article

166 Accesses
Metrics details

Abstract

The manipulation of deformable linear objects (DLOs) such as ropes, cables, and hoses by robots has promising applications in various fields such as product assembly and surgical suturing. However, DLOs are more difficult to manipulate than rigid objects because their shape changes during manipulation. Furthermore, preventing a DLO from colliding with the environment is important to prevent it from becoming entangled and causing shape control to fail. In this paper, we proposed an obstacle avoidance and shape control scheme for DLOs based on differentiable simulation that does not require prior data or a specialized controller. First, we established a dynamic model of the DLO that allows for both forward dynamics transfer and error backpropagation to obtain gradients. Then, we employed model predictive control to optimize the embedded neural network for predicting the actions that would manipulate the DLO. Finally, the control scheme was made applicable to DLOs with different material properties by allowing online adaptation of the model parameters essential to deformation during manipulation. Simulations and real-world experiments demonstrate that the proposed control scheme could manipulate the DLO stably and accurately to avoid obstacles and achieve the goal state. In addition, the online adaptation of parameters helped mitigate the sim-to-real gap.

Introduction

The robotics community has been extensively studying the manipulation of deformable objects for various applications [1, 2]. Unlike rigid objects, deformable objects change their shape during manipulation due to external forces, which increases the degrees of freedom (DoFs) and makes planning and control of the robot’s motion more challenging. Deformable linear objects (DLOs) such as ropes, cables, and hoses are common in households and industry, so tasks involving their manipulation have received increasing attention [3], such as tying knots [4], routing wires [5], and insertion [6]. Furthermore, the DLO manipulation methods can be generalized to deformable objects with more complex geometries such as cloth and soft tissues by extending the dimensionality. Therefore, DLO manipulation is a promising area of study in the robotics field.

This study focuses on the manipulation problem of DLOs: moving a DLO from an initial shape to a given desired shape (i.e., goal shape) while avoiding obstacles. This is referred to as the shape control problem for DLOs. The obstacle avoidance shape control of DLOs has specific applications. For example, in the wiring of an automobile interior, the limited space and large number of parts inside the automobile create a challenging environment for the cable, which is susceptible to entanglement or severe damage. Therefore, one of the essential applications of our proposed method is to develop a solution that can dexterously avoid contact with other parts to achieve the desired shape. In our previous work [7], we proposed a differentiable simulation framework for modeling DLO deformation and generating obstacle avoidance trajectories offline. In this study, we extended the previous method into a shape control and obstacle avoidance scheme for DLO manipulation with online adaptation of the model parameters to mitigate the sim-to-real gap. In contrast to our previous method, the proposed control scheme does not rely on prior real-world data to determine the model parameters.

Solving the shape control problem requires developing an accurate model of DLO deformation. Specifically, the relation ${\varvec{s}}=\mathcal{F}({\varvec{a}})$ between the state ${\varvec{s}}$ of the DLO and action ${\varvec{a}}$ needs to be mapped, where $\mathcal{F}$ is a blackbox function that outputs the state of the DLO (i.e., position or velocity) when an action (i.e., force or velocity) is input. A rigid body has few DoFs, which allows for an accurate formulation of $\mathcal{F}$ based on physical laws. The inverse kinematics can then be solved by using ${\varvec{a}}={\mathcal{F}}^{-1}({\varvec{s}})$ to control the state of the object. However, DLOs have many more DoFs than a rigid body and the action applied to the DLO has much fewer DoFs than the state of the DLO, so establishing such an underactuated control system is a major challenge.

Analytical models and data-driven methods are often used to model the deformation of a DLO. In computer graphics, the analytical model of a DLO is relatively straightforward compared with that of more complex deformable objects [8]. In robotics, the deformation of DLOs is commonly modeled by using mass–spring systems [9], the finite-element method (FEM) [10], and position-based dynamics (PBD) [11] because of their ease of implementation. However, these analytical models are only approximate, and they assume that the material of the DLO remains isotropic. The sim-to-real gap problem can be defined as the need to determine certain model parameters so that the model can achieve a close approximation of reality. Traditional methods for optimizing model parameters involve minimizing the error between simulated and real data. However, this approach can be time-consuming, and collecting new data when the material properties of the DLO have been altered is impractical. Therefore, the data-driven (i.e., learning-based) approach [12,13,14] is commonly used to model the deformation of a DLO. Learning-based models such as neural networks (NNs) are trained to approximate the dynamics of a DLO. Then, techniques such as transfer learning and online learning can be used to apply trained models to untrained objects with different material properties. However, this approach requires the collection of large amounts of prior data for training, which usually comes from commercial physics engines such as Unity, Mujoco, and Isaac Sim instead of collecting real-world data. Furthermore, learning-based models have difficulty with effectively learning the dynamics of a DLO in contact with the environment [15] or can only simulate some simple responses to a collision [16]. Thus, few learning-based models can consider scenarios with obstacles, which are ubiquitous in manufacturing sites.

In this paper, we consider the obstacle avoidance shape control problem for DLO, meaning that in addition to obtaining the deformation model of DLO a proper control scheme must be established. Given the sim-to-real gap of analytical models and large amount of prior data required by data-driven models, an alternative approach is to use differentiable simulation, which allows the deformation of DLOs to be simulated by using an analytical model without requiring any prior data. Moreover, differentiable simulation offers end-to-end differentiable capability that can be used to optimize the model parameters related to shape control online to reduce the sim-to-real gap. Specifically, we implemented an analytical model of DLO based on PBD from scratch in a differentiable framework. To obtain analytic gradients with respect to the parameters that need to be optimized, we made necessary modifications to the analytical model of DLO due to the limitations of the auto-differentiation criterion. These gradients can then be used to update the parameters using gradient-based optimization methods. The differentiable simulation seamlessly embedded an NN controller, which predicts the actions necessary for shape control of the DLO. Model Predictive Control (MPC) optimizes the NN controller to output correct obstacle avoidance actions, even if the goal and environment configuration changes. Additionally, we utilized a markerless perception algorithm to effectively recognize the state of the DLO during manipulation, thereby mitigating sim-to-real errors caused by inappropriate simulation parameters through closed-loop visual feedback.

The contributions of this study are as follows:

We extended the existing simulator to compute ready-to-use gradients from simulations that incorporate collision effects.
We proposed a control scheme for the DLO to reach the goal state while avoiding obstacles. The control scheme can be adapted to different types of DLOs by adjusting deformation-related model parameters to reduce the sim-to-real gap.
Simulation and real-world experiments were conducted to evaluate the effectiveness and stability of the proposed control scheme.

The remainder of this paper is structured as follows. Sect. "Related work" presents a literature review on related studies. Sect. "Problematization and approach" describes the problem setting and presents an overview of the proposed control scheme. Sects. "Differentiable position-based dynamics" and "Shape control with online parameters adaptation" describe the parts of the proposed control scheme in detail. Sect. "Validation" presents the simulation and real-world experiments performed to validate the proposed control scheme. Sect. "Conclusion" concludes the paper.

Related work

DLO manipulation can be categorized as a control or planning problem. The former is usually limited to small deformation tasks and does not consider the effects of contact with the environment whereas the latter considers large deformation tasks in a constrained environment. Below, we review some common approaches used to solve these two problems.

Analytical models

Analytical models are a mature approach and are widely used to control or optimize the actions of a DLO. Such models are highly interpretive and can be easily integrated into classical control theory. Among analytical models, FEM [10, 17, 18] based on strict material and mechanical laws is generally considered the most accurate. However, high precision in simulations comes with drawbacks such as high computational requirements and difficult implementation. Moreover, the simulation results can be heavily influenced by material properties such as Young’s modulus and Poisson’s ratio. To address this issue, some researchers have resorted to empirical determination or derivative-free optimization algorithms, such as the Nelder–Mead algorithm [19] and grid search [20]. However, these techniques require collecting prior data again to determine the material properties whenever the properties of the DLO change. Alternatively, mass–spring systems and PBD have been used to approximate exact physical phenomena to simulate DLO deformation [21, 22]. Although these methods are easy to implement and fast, they encounter the same difficulty with determining critical parameters such as stiffness coefficients.

For DLOs, the planning problem involves generating a path from the initial state to the goal state while avoiding obstacles. Because of the high dimensionality of a DLO, obtaining an analytical solution for inverse kinematics is impractical. Therefore, sampling algorithms such as the rapidly exploring random tree (RRT) [23] and probabilistic roadmap methods (PRM) [24] have become widely adopted. Researchers then integrate an analytical model capable of generating valid DLO shapes into a classical high-dimensional planner [25,26,27,28]. However, such methods heavily rely on an accurate DLO energy model to calculate stable shapes in equilibrium. In addition, the absence of closed-loop control means that there is no assurance of a favorable execution outcome in the real world.

Learning-based methods

The application of learning-based methods to DLO manipulation can be divided into three main categories: offline learning using NNs to approximate the global dynamics of a DLO, online inference of Jacobian matrices representing the local deformations of a DLO, and end-to-end learning of a control strategy.

For offline learning of the global dynamics, various network constructs have been proposed. Wang et al. [13] and Li et al. [29] used graph neural networks to treat a DLO as a graphical structure with vertices and edges. Similarly, Yang et al. [12] and Yan et al. [30] used bidirectional long short-term memory to learn the interactions between the segments of a DLO. To avoid the difficulty of directly learning nonlinear dynamics, Zhang et al. [31] and Yan et al. [32] encoded the image of a DLO into a latent space where linear dynamics can be easily learned. Each of these approaches uses model predictive control (MPC) to control the differentiability of actions to manipulate the DLO because NNs are inherently differentiable. However, obtaining differentiable dynamics requires first collecting a large amount of prior data from the real world or simulations. In addition, learning the complex dynamics of a DLO in contact with its environment remains an issue.

In online inference, small local deformations of the DLO are assumed linear. This assumption can be used to create a Jacobian matrix that maps the velocity of the robot’s end-effector to the states of the DLO. The control inputs applied by the robot to the DLO can then be derived by solving the inverse Jacobian matrix, which is commonly referred to as visual servoing [33,34,35]. Berenson et al. [36] approximated the Jacobian matrix by using the diminishing stiffness of a deformable object. Zhu et al. [37] parameterized the shape of the DLO by using a Fourier series and used the shape parameter to estimate the Jacobian matrix of the local deformations of a cable for online two-handed manipulation. Yu et al. [14] used a radial basis function network to learn the Jacobian matrix of local deformations from simulation data. They then updated the Jacobian matrix online to compensate for errors caused by changes in the properties of the DLO and inaccurate modeling. However, these methods have limited ability to handle large deformations or account for the presence of obstacles because they constrain nonlinear DLO deformations to linear deformations.

Reinforcement learning (RL) is typically used for end-to-end learning of a DLO control strategy [38], where the actions of a robot are directly output from visually observed inputs. Han et al. [39] used the model-based RL algorithm PILCO to learn a strategy for manipulating a DLO on a 2D plane. Wu et al. [40] used the model-free RL algorithm soft actor–critic (SAC) to learn a strategy for placement and selection where images are taken directly as the input. However, RL methods are typically trained on simulation data because of the high cost of obtaining real-world data. Therefore, further research is needed on whether the learned strategies perform well in the real world. Additionally, imitation learning has recently garnered considerable attention due to its ability to enable robots to generate motions based on human demonstrations at speeds comparable to human performance. Examples include teaching a robot to use a tool to pick up objects [41] or perform long-term tasks such as writing and erasing [42]. However, a limitation of these studies is that generalization to untrained tasks does not yield stable results or requires the re-collection of teaching data for new tasks. In contrast, methods based on differentiable simulation can generalize the control scheme to various tasks by modifying the objective function, without the necessity of reconstructing the model.

Differentiable simulation

In the field of robot manipulation, differentiable simulation is primarily used for parameter identification [43] and motion control [44, 45]. The main distinction between differentiable simulation and analytical models is that the former requires an analytical model of the dynamics to be constructed as well as backpropagation of errors like in deep learning to obtain the analytical gradients of the model parameters. These gradients can then be used to optimize the parameters. Thanks to recent advances in differentiable programing, automatic differentiation techniques can now be applied to the error backpropagation of analytical models. Various studies have applied differentiable simulation to manipulating deformable objects. Chen et al. [46] proposed an end-to-end learning pipeline that uses a differentiable physics engine to teach a NN how to represent high-dimensional point cloud data collected from a deformable object. Millard et al. [47] represented a deformable object with a tetrahedral FEM mesh and optimized the material parameters to minimize the discrepancy between the observed and predicted values. Liu et al. [21] proposed differentiable formulations to represent the deformations of extensible and inextensible rope-like objects. These formulations solve the parameter estimation problem and mitigate the sim-to-real gap to match rope physics to realistic scenarios. Our approach is similar to that of Liu et al. [21], but we also made collision effects differentiable during the simulation episode and achieved obstacle avoidance shape control, which is not considered in [21]. In addition, we achieved online parameter adaptation coupled with shape control, rather than offline parameter identification as presented in [21].

Problematization and approach

Problem setting

This study addresses the problem of dynamic manipulation planning for deformable objects, without considering the manipulation sequence between different objects such as grasping, moving, and dropping. In particular, a DLO is continuously moved from an initial state to a goal state within a constrained environment. The manipulations must be dynamically planned to prevent collisions between the object and the environment throughout the movement. This category of problems involves deformable objects that undergo continuous state changes and deformations during manipulation, necessitating real-time sensing and prediction of their shape and position. To characterize this, we created a laboratory scenario as shown in Fig. 1. The state of the DLO is represented by the positions of several nodes uniformly distributed along its length. During manipulation, the DLO must avoid obstacles and overstretching (i.e., the distance between the two endpoints cannot exceed the original length of the DLO). Our aim was to develop a control scheme that generates a sequence of actions by the end-effectors of the robot that minimizes the positional differences between the DLO after manipulation and the goal state. We made the following assumptions:

The robot does not move very quickly, and the DLO is in a quasi-static state, which means that the DLO is only subjected to elastic and gravitational potential energies while inertial effects can be ignored.
The DLO is made from an isotropic and inextensible material that only undergoes elastic deformation.
An RGB-D camera captures the entire state of the DLO, and there is no occlusion or self-intersection during manipulation.
The positions and shapes of the obstacles are fixed and known.

Then the manipulation of the DLO can be described as an optimization problem:

$$\begin{array}{*{20}c} {\user2{a}_{{1:T - 1}}^{\user2{*}} = \mathop {{\text{arg~min}}}\limits_{\user2{a}} \left( {loss\left( {\user2{s}_{T} ,\user2{s}^{g} } \right)} \right)} \\ {s.t.~~\user2{s}_{{2:T}} = D_{\user2{\theta }} \left( {\user2{s}_{1} ,\user2{a}_{{1:T - 1}} } \right)\# \left( 1 \right)} \\ \end{array}$$

(1)

Here, ${{\varvec{s}}}_{1:T}=[{{\varvec{s}}}_{1},\dots ,{{\varvec{s}}}_{T}]\in {\mathbb{R}}^{3nT}$ denotes all states of the DLO in the timespan $T$, where ${{\varvec{s}}}_{t}=[{{\varvec{x}}}_{1},\dots ,{{\varvec{x}}}_{n}]\in {\mathbb{R}}^{3n}$ denotes the state of the DLO at the time step $t$, ${{\varvec{x}}}_{k}\in {\mathbb{R}}^{3}$ denotes the $k$ th node of the DLO, and n is the number of nodes. ${{\varvec{a}}}_{1:T-1}=[{{\varvec{a}}}_{1},\dots ,{{\varvec{a}}}_{T-1}]\in {\mathbb{R}}^{\mu (T-1)}$ denotes the sequence of actions throughout the timespan, ${{\varvec{a}}}_{t}\in {\mathbb{R}}^{\mu }$ denotes the action at the time step $t$, and $\mu$ is the DoF of the action. The function ${\mathcal{D}}_{{\varvec{\theta}}}$ represents the dynamics of the DLO, where ${\varvec{\theta}}$ denotes parameters related to the deformation. When the state ${{\varvec{s}}}_{t}$ and action ${{\varvec{a}}}_{t}$ of the current time step $t$ are given, the state of the next time step ${{\varvec{s}}}_{t+1}$ can be computed as ${{\varvec{s}}}_{t+1}={\mathcal{D}}_{{\varvec{\theta}}}({{\varvec{s}}}_{t},{{\varvec{a}}}_{t})$. The function $loss$ denotes the objective function used to solve the optimization problem, ${{\varvec{s}}}_{1}$ and ${{\varvec{s}}}_{T}$ denote the initial and final states of the DLO before and after manipulation, respectively, and ${{\varvec{s}}}^{g}=[{{\varvec{x}}}_{1}^{g},\dots ,{{\varvec{x}}}_{n}^{g}]\in {\mathbb{R}}^{3n}$ denotes the goal state. Solving this optimization problem generates an optimal sequence of actions ${{\varvec{a}}}_{1:T-1}^{\boldsymbol{*}}$ that minimizes the error between the final and goal states.

Outline of the proposed control scheme

Figure 2 shows the proposed control scheme based on differentiable simulation. This approach has two advantages: the dynamics of the DLO can be obtained without relying on prior data, and the scheme can be adapted to different types of DLOs. First, PBD is used to construct an analytical model that represents the deformation of the DLO with end-to-end differentiability. The use of automatic differentiation eliminates the need for a data-driven approach to obtain the dynamics of the DLO while providing the required gradients for task optimization. Next, an NN is integrated into the model as a controller to predict the corresponding action in the current state. MPC is used to update the NN weights via a gradient-based optimization algorithm to determine the optimal action. The upper half of Fig. 2 illustrates one control horizon of the MPC given the current state ${{\varvec{s}}}_{t}$ and model parameter ${{\varvec{\theta}}}_{t}$. By minimizing the error between the goal state ${{\varvec{s}}}^{g}$ and the state ${{\varvec{s}}}_{t+H}$ of the DLO after a control horizon, an optimized action sequence $\left[{{\varvec{a}}}_{t}^{*},\dots ,{{\varvec{a}}}_{t+H-1}^{*}\right]$ can be obtained. To reduce the sim-to-real gap, we can adjust the parameters associated with deformation of the DLO during execution of the optimized actions. As shown in the lower half of Fig. 2, this is done by minimizing the error between the simulated state ${{\varvec{s}}}_{t+H}^{r}$ and corresponding observed state ${{\varvec{s}}}_{t+H}$ of the DLO. Thus, the proposed control scheme can be applied to different types of DLOs without the need to collect prior data from the real world.

In a related approach, Yang et al. [48] proposed an algorithm for action control based on learning dynamics. In contrast, our proposed control scheme obtains the gradients from the simulation directly followed by action optimization and parameter identification. This circumvents the need to train the NN and interpret the rationality of the network structure. This work is an extension of our previous work [7], where we proposed a method of generating obstacle avoidance trajectories to reach the goal state offline. Our proposed control scheme improves upon our previous work in three main aspects: we use MPC to shift shape control of the DLO from an open loop to a closed loop, we integrate a parameter identification module into the control scheme, and we validated our proposed control scheme by conducting additional simulations and real-world experiments.

Differentiable position-based dynamics

Motivation

We modeled the dynamics of a DLO by using Cosserat rods [49], which we implemented by using PBD. To reduce the computational cost, we used compliant PBD (XPBD) [50]. Traditional dynamic models calculate the acceleration of an object based on the applied force and then updates the velocity and position of the object by integration over time. In contrast, PBD directly modifies the positions of the particles (which are equal to the nodes of the DLOs mentioned in the problem setting) that compose the object and thus does not need to solve for velocity, which makes the model lightweight and easy to implement.

We selected Cosserat rods [49] to model the dynamics of the DLO for several reasons. First, this model can accurately represent elastic deformations such as stretching, shearing, bending, and twisting of the DLO. In addition, constraints can be incorporated to simulate responses to a collision. Second, external manipulations of the DLO can be directly represented as a sequence of positions, which avoids the problem of overshooting due to explicit integration. Third, the computational formulas are fully differentiable, which facilitates differentiable simulation in an auto-differentiation framework without requiring additional approximations such as the linear complementarity problem. PBD deviates from real-world behavior because it prioritizes visual plausibility over strict adherence to physical laws, but the sim-to-real gap can be narrowed by online adaptation of the model parameters. For brevity, only the key formulas are presented here. Detailed derivations are available in the literature [49, 50]. We adopted Kugelstadt and Schömer’s concept [49] of treating the orientation as the same as the position and their stretch/shear constraints and bending/twisting constraints. We made three extensions to their original model: we added a collision constraint to simulate deformations that occur when the DLO contacts with obstacles, we employed XPBD to solve the constraint equations, and we implemented an auto-differentiation framework to obtain the gradients.

Original model

Figure 3 shows the Cosserat rods used to model a DLO with $n$ positions and $n+1$ orientations. The positions are defined as ${\varvec{x}}=[{{\varvec{x}}}_{1},\dots {{\varvec{x}}}_{n}]\in {\mathbb{R}}^{3n}$ in a Cartesian coordinate system. The orientations are defined as ${\varvec{q}}=\left[{{\varvec{q}}}_{1},{{\varvec{q}}}_{2},\dots ,{{\varvec{q}}}_{n+1}\right]{\in {\mathbb{R}}}^{4(n+1)}$, where ${{\varvec{q}}}_{\text{i}}{\in {\mathbb{R}}}^{4}$ is a quaternion that represents the rotation from the global frame $({{\varvec{e}}}_{1},{{\varvec{e}}}_{2},{{\varvec{e}}}_{3})\in {\mathbb{R}}^{3}$ rotated to the material frame. To simplify the notation, the position and orientation are integrated into the set ${\varvec{p}}=\left[{\varvec{x}},{\varvec{q}}\right]$. The corrections to the position and orientation $\Delta {\varvec{p}}$ are calculated for each time step by solving the set of constraint equations ${\varvec{C}}\left({\varvec{p}}+\Delta {\varvec{p}}\right)=0$. These constraints can be linearized by using the Taylor series expansion:

$$\begin{array}{c}C\left({\varvec{p}}+\Delta {\varvec{p}}\right)\approx C\left({\varvec{p}}\right)+{\nabla }_{{\varvec{p}}}C\Delta p=0\end{array}$$

(2)

where ${\nabla }_{{\varvec{p}}}{\varvec{C}}$ is the Jacobian matrix of ${\varvec{C}}$ with regard to vector ${\varvec{p}}$. This equation can be solved by restricting $\Delta {\varvec{p}}$ to the derived orientation of the constraint function:

$$\begin{array}{c}\Delta p={{\varvec{M}}}^{-1}{\left({\nabla }_{{\varvec{p}}}{\varvec{C}}\right)}^{T}\Delta \lambda ,\end{array}$$

(3)

where ${\varvec{M}}$ is the mass/inertia matrix $diag({m}_{1}\cdot 1,{m}_{2}\cdot 1,\dots ,{m}_{n}\cdot 1,{{\varvec{I}}}_{1},{{\varvec{I}}}_{2},\dots ,{{\varvec{I}}}_{n-1})$. The change in the Lagrange multiplier $\Delta{\varvec{\lambda}}$ can be computed separately for each constraint $j$ by using the Gauss–Seidel solution.

The original model defines two constraints to represent the deformability of a DLO, which we describe here.

Stretch and shear constraint Based on the Cosserat theory, the measured strain due to shear and stretching was coupled to the difference between the tangent vector of the centerline and the normal of the cross-section. Therefore, the stretch and shear constraint ${C}_{s}$ is formed by two adjacent particles with positions ${x}_{i}$ and ${x}_{i+1}$ and the quaternion ${q}_{i}$ between them:

$$\begin{array}{c}{{\varvec{C}}}_{s}({{\varvec{x}}}_{i},{{\varvec{x}}}_{i+1},{{\varvec{q}}}_{i})=\left({{\varvec{x}}}_{i}-{{\varvec{x}}}_{i+1}\right)/l-R\left({{\varvec{q}}}_{i}\right){{\varvec{e}}}_{3},\end{array}$$

(4)

where $l$ is the resting length between the two particles. For simplicity, we assumed that all $l$ values are equal. ${\varvec{R}}(\cdot )$ denotes the rotation matrix converted from the quaternion. Next, the derivatives with regard to the involved positions and quaternions can be obtained:

$$\begin{array}{c}{\nabla }_{{{\varvec{x}}}_{i}}{{\varvec{C}}}_{s}=-{\nabla }_{{{\varvec{x}}}_{i+1}}{{\varvec{C}}}_{s}=-1/l\cdot {1}_{3\times 3},\end{array}$$

(5)

$$\begin{array}{c}{\nabla }_{{\varvec{q}}}{{\varvec{C}}}_{s}=2\left(\mathfrak{I}\left({{\varvec{e}}}_{3}\overline{{\varvec{q}} }\right)|\mathfrak{R}\left({{\varvec{e}}}_{3}\overline{{\varvec{q}} }\right){1}_{3\times 3}-{\left[\mathfrak{I}\left({{\varvec{e}}}_{3}\overline{{\varvec{q}} }\right)\right]}^{\times }\right),\end{array}$$

(6)

where $\mathfrak{I}(\cdot )$ and $\mathfrak{R}\left(\cdot \right)$ denote the imaginary and real parts, respectively, of the quaternion, $\overline{{\varvec{q}} }$ is the conjugate quaternion, and ${\left[\cdot \right]}^{\times }$ denotes the skew-symmetric matrix of a vector.

Bend and twist constraint We employed the Darboux vector ${\varvec{\Omega}}$ to define the measured strain due to bending and twisting. The bend and twist constraint ${{\varvec{C}}}_{b}$ can be coupled with the two adjacent quaternions ${{\varvec{q}}}_{i}$ and ${{\varvec{q}}}_{i+1}$ to compute the difference from the resting value:

$$\begin{array}{c}{{\varvec{C}}}_{b}\left({{\varvec{q}}}_{i},{{\varvec{q}}}_{i+1}\right)=I\left(\overline{{{\varvec{q}} }_{i}}{{\varvec{q}}}_{i+1}-{\overline{{{\varvec{q}} }_{i}}}^{0}{{\varvec{q}}}_{i+1}^{0}\right)=\Omega -\xi {{\varvec{\Omega}}}^{0},\end{array}$$

(7)

$$\begin{array}{c}\xi =\left\{\begin{array}{c}+1 if {\left|{\varvec{\Omega}}-{{\varvec{\Omega}}}^{0}\right|}^{2}<{\left|{\varvec{\Omega}}+{{\varvec{\Omega}}}^{0}\right|}^{2},\\ -1 if {\left|{\varvec{\Omega}}-{{\varvec{\Omega}}}^{0}\right|}^{2}>{\left|{\varvec{\Omega}}+{{\varvec{\Omega}}}^{0}\right|}^{2},\end{array}\right.\end{array}$$

(8)

where ${{\varvec{q}}}_{i}^{0}$ and ${{\varvec{\Omega}}}^{0}$ denote the quaternion and Darboux vector at the resting state, respectively. The derivatives from the involved quaternions were obtained as follows:

$$\begin{array}{c}{\nabla }_{{\mathbf{q}}_{i}}{{\varvec{C}}}_{b}=-\left(-{{\varvec{q}}}_{i+1} | {q}_{i+\text{1,0}}{1}_{3\times 3}-{\left[{{\varvec{q}}}_{i+1}\right]}^{\times }\right),\end{array}$$

(9)

$$\begin{array}{c}{\nabla }_{{\mathbf{q}}_{i+1}}{{\varvec{C}}}_{b}=+\left(-{{\varvec{q}}}_{i} | {q}_{i,0}{1}_{3\times 3}-{\left[{{\varvec{q}}}_{i}\right]}^{\times }\right).\end{array}$$

(10)

Extensions for our approach

We added a third constraint to represent the deformability of a DLO.

Collision constraint The response of the DLO to a collision with an obstacle must be considered. We did not consider the response of the DLO to collision with itself to reduce the computation time. In PBD, the collision response can also be treated as a constraint. First, we can test the ray ${{\varvec{x}}}_{t,i}\to {{\varvec{x}}}_{t+1,i}$ for each particle $i$ to determine whether the ray penetrates the obstacle, and we can compute the entry point ${{\varvec{x}}}_{ic}$ and surface normal ${{\varvec{n}}}_{ic}$ at this position. ${{\varvec{x}}}_{ic}$ is replaced by the point on the surface of the obstacle nearest to ${{\varvec{x}}}_{t,i}$ if the ray lies completely inside the obstacle, and ${{\varvec{n}}}_{ic}$ is replaced by the normal at the nearest point. Next, the unilateral constraint function ${{\varvec{C}}}_{c}\left({{\varvec{x}}}_{i}\right)=({{\varvec{x}}}_{i}-{{\varvec{x}}}_{ic})\cdot {{\varvec{n}}}_{ic}$ is defined, and an inverse stiffness $\beta =1$ is added. Then, the derivative with regard to the position can be readily obtained by ${\nabla }_{{{\varvec{x}}}_{i}}{{\varvec{C}}}_{c}={{\varvec{n}}}_{ic}$.

XPBD can be to solve the constraint Eq. (3):

$$\begin{array}{c}\Delta {\lambda }_{j}={-\left({\nabla }_{{\varvec{p}}}{C}_{j}{{\varvec{M}}}^{-1}{\nabla }_{{\varvec{p}}}{C}_{ j}^{T}+{\widetilde{\beta }}_{j}\right)}^{-1}\left({C}_{j}\left({{\varvec{p}}}_{i}\right)+{\widetilde{\beta }}_{j}{\lambda }_{ij}\right).\end{array}$$

(11)

Here, ${\lambda }_{ij}$ is the total Lagrange multiplier for constraint $j$ at the current iteration $i$. $\widetilde{{\varvec{\beta}}}={\varvec{\beta}}/\Delta {t}^{2},$ where ${\varvec{\beta}}$ and $\Delta t$ denote the inverse stiffness and time step, respectively.

After solving all the constraint functions, the linear and angular velocities of each particle are updated by the corrections to the position and quaternion. Then, the velocity is multiplied by a damping factor, such as, $v={\left(1-{\xi }_{v}\right)}^{\Delta t}v$, to represent the energy dissipation. The damping factor ${\xi }_{v}$ indicates the damping ratio per second of the corresponding velocity. The larger the damping factor, the more energy is dissipated and the faster the model is stabilized.

The original PBD updates the position and orientation at each time step but only stores the most recent state and overwrites past states. Differentiable simulation requires storing all intermediate states, which allows automatic differentiation tools to compute the analytic gradients of the parameters. Thus, Algorithm 1 presents the modified PBD framework for differentiable simulation. Lines 1 and 2 initialize the state of the object and create the parameter sequence required to compute the gradients, such as the stiffness ${\varvec{\beta}}$ and action ${\varvec{a}}$. Lines 3–14 present the standard PBD framework. To calculate the gradients, all momentary states and the intermediate process of iteratively solving the constraints are stored in lines 8 and 9. Lines 15 and 16 compute the error according to a predefined objective function. Then the gradient of the error with respect to the parameter ${\varvec{\theta}}$ is computed by backpropagation.

Shape control with online parameters adaptation

Model predictive control

The shape control of a DLO can be treated as an optimization problem for MPC, where the entire timespan $T$ is divided into several control horizons $H$ and the actions in each control horizon are optimized. For a given state ${{\varvec{s}}}_{t}$ at the time step $t$, a goal state ${{\varvec{s}}}^{g}$, and the dynamics $\mathcal{D}$ of the DLO, a controller is required that can predict the action ${{\varvec{a}}}_{t}$ at the current time step, which we approximated by using an NN with four fully connected layers. The input layer with a size of $n$ × 3 × 2 neurons consists of the current state ${{\varvec{s}}}_{t}$ and goal state ${{\varvec{s}}}^{g}$. The two hidden layers both have a size of 256 neurons, and the output layer with a size of $\mu$ neurons consists of the movement velocity ${{\varvec{u}}}_{t}\in {\mathbb{R}}^{\mu }$. Hidden layers are connected by a ReLU function, other layers are connected by linear functions. The prediction of the NN is represented by ${\pi }_{{\varvec{w}}}\left({{\varvec{s}}}_{t},{{\varvec{s}}}^{g}\right)={{\varvec{u}}}_{t}$, where ${\varvec{w}}$ denotes the weight of the NN. The current state can be computed by using ${{\varvec{a}}}_{t}={{\varvec{a}}}_{t-1}+{\Delta t {\varvec{u}}}_{t}$, where $\Delta t$ is the time interval. The NN is embedded into the model as shown in Fig. 2 to optimize the weights using a gradient-based optimization method.

We adopted a shooting approach for MPC. For a given control horizon, the current state ${{\varvec{s}}}_{t}$ is input to the NN, which outputs the current action ${{\varvec{a}}}_{t}$ according to ${{\varvec{a}}}_{t}={{\varvec{a}}}_{t-1}+{\pi }_{{\varvec{w}}}\left({{\varvec{s}}}_{t},{{\varvec{s}}}^{g}\right)\cdot \Delta t$. The state in the next time step ${{\varvec{s}}}_{t+1}$ can then be computed based on the dynamics of the DLO: ${{\varvec{s}}}_{t+1}={\mathcal{D}}_{{\varvec{\theta}}}({{\varvec{s}}}_{t},{{\varvec{a}}}_{t})$. By analogy, the state sequence ${{\varvec{s}}}_{t+1:t+H}=[{{\varvec{s}}}_{t+1},\dots ,{{\varvec{s}}}_{t+H}]$ and action sequence ${{\varvec{a}}}_{t:t+H-1}=[{{\varvec{a}}}_{t},\dots ,{{\varvec{a}}}_{t+H-1}]$ can be computed. The weights of the gradients with respect to the NN are backpropagated from the sum of future errors. To avoid obstacles, we define a specialized objective function ${\mathcal{L}}_{1}$ as illustrated in Fig. 4 that applies an artificial potential field:

$$\begin{array}{c}{\mathcal{L}}_{1}\left({{\varvec{s}}}_{t+1:t+H},{{\varvec{s}}}^{g}\right)=\frac{1}{H}\sum_{i=t+1}^{t+H}\left({\omega }_{1}\Vert {{\varvec{s}}}_{i}-{{\varvec{s}}}^{g}\Vert +{\omega }_{2}U\left({{\varvec{s}}}_{i}\right)+{\omega }_{3}{\Vert {{\varvec{u}}}_{i}\Vert }^{2}\right)\end{array}$$

(12)

where ${\omega }_{1}$, ${\omega }_{2}$ and ${\omega }_{3}$ are the tradeoff coefficients. They represent the scaling of the corresponding terms. $\Vert {{\varvec{s}}}_{i}-{{\varvec{s}}}^{g}\Vert$ represents the positional error between the current state and goal state, and ${\Vert {{\varvec{u}}}_{i}\Vert }^{2}$ is a term that limits the speed of the action. $U({{\varvec{x}}}_{k})$ is designed with a repulsive potential to prevent the node ${{\varvec{x}}}_{k}$ from colliding with the obstacle $\mathcal{O}$:

$$U\left({{\varvec{x}}}_{k}\right)=\left\{\begin{array}{ll}{\left(\frac{1}{d\left({{\varvec{x}}}_{k}\right)}-\frac{1}{{\varepsilon }_{d}}\right)}^{2}, & d\left({{\varvec{x}}}_{k}\right)<{\varepsilon }_{d}\\ 0, & otherwise\end{array}\right.$$

(13)

where $d\left({{\varvec{x}}}_{k}\right)=\underset{{x}_{k}{\prime}\in \mathcal{O}}{\text{min}}\Vert {{\varvec{x}}}_{k}-{{\varvec{x}}}_{k}{\prime}\Vert$ is the closest distance between a node on the DLO and the obstacle and ${\varepsilon }_{d}$ is the maximum distance at which the repulsive potential works. Then, each control horizon can be represented as an optimization problem for the weights of the NN:

$$\begin{array}{c}{{\varvec{w}}}^{*}=\underset{w}{\text{arg min}}{\mathcal{L}}_{1}\left({{\varvec{s}}}_{t+1:t+H},{{\varvec{s}}}^{g}\right)\\ s.t. {{\varvec{s}}}_{t+1}={\mathcal{D}}_{{\varvec{\theta}}}\left({{\varvec{s}}}_{t},{{\varvec{a}}}_{t}\right)\\ {{\varvec{a}}}_{t}={{\varvec{a}}}_{t-1}+\Delta t{\pi }_{{\varvec{w}}}\left({{\varvec{s}}}_{t},{{\varvec{s}}}^{g}\right)\end{array}$$

(14)

The gradient optimization algorithm efficiently solves the optimization problem by using the gradients obtained by backpropagation. Combining differentiable simulation with MPC offers two advantages. First, there is no need to learn gradients by learning the dynamics from prior data. Second, compared to model-free RL algorithms, various similar tasks can be accomplished by simply changing the objective function without retraining.

Online adaptation of parameters

The simulated deformation of a DLO is influenced by physical parameters such as the stiffness, mass, and damping. To narrow the sim-to-real gap, we can modify such parameters based on visual feedback from shape control of the DLO. In our proposed control scheme, the completion of a control horizon generates the state sequence ${{\varvec{s}}}_{t+1:t+H}$ and action sequence ${{\varvec{a}}}_{t:t+H-1}$. Executing the generated action sequence in the real world generates the state sequence ${{\varvec{s}}}_{t+1:t+H}^{r}$ in the real world, which can be obtained via a visual perception algorithm. The adaptation of the model parameters can then be represented as an optimization problem:

$$\begin{array}{c}{{\varvec{\theta}}}^{*}=\underset{\theta }{\text{arg min}}{\mathcal{L}}_{2}\left({{\varvec{s}}}_{t+H},{{\varvec{s}}}_{t+H}^{r}\right)\\ s.t. {{\varvec{s}}}_{t+1}={\mathcal{D}}_{{\varvec{\theta}}}\left({{\varvec{s}}}_{t},{{\varvec{a}}}_{t}\right)\end{array}$$

(15)

where ${\varvec{\theta}}$ denotes the model parameter that needs to be recognized and ${\mathcal{L}}_{2}=\Vert {{\varvec{s}}}_{t+H}-{{\varvec{s}}}_{t+H}^{r}\Vert$ represents the positional error between the simulated and real-world states. To save manipulation time, only the final state in one control horizon is used to calculate the error.

Algorithm 2 incorporates online parameter adaptation represented by (15) into the shape control discussed in Sect. "Model predictive control". Line 1 initializes the model parameters and weights of the NN. Line 20 indicates that the entire timespan is divided into several control horizons $H$ for processing. Lines 4–11 describe the optimization process of the MPC for one control horizon, from which a sequence of control horizon actions ${{\varvec{a}}}_{t:t+H-1}^{*}$ is generated. Lines 12 and 13 execute ${{\varvec{a}}}_{t:t+H-1}^{*}$ in the real world, which moves the DLO to an intermediate state. Then, the real-world state of the DLO is obtained based on the visual perception algorithm. Lines 14–19 repeatedly execute the same action sequence in the simulation and iteratively update the model parameters by comparing the simulated and real-world states of the DLO.

Validation

Simulations

We conducted simulations to investigate the accuracy of the proposed control scheme at controlling the shape the DLO. Thus, we used fixed parameters without online parameter adaptation. The objective was to investigate the positional error between the current and goal states of the DLO after manipulation with the proposed control scheme.

Implementation and setting of the simulation

The differentiable simulation of the DLO was implemented by using the automatic differentiation framework Taichi [51], which allows for parallel computation and reduces the computational time to obtain the analytic gradients. Adam was used for optimization. The rest of the proposed control scheme was implemented in Python. A computer with an Intel Xeon Gold 6242 CPU and Quadro RTX 6000 GPU was used for the simulations. The DLO was represented by 40 nodes ($n=40$) and was grasped at its ends by end-effectors. As assumed in the problem setting presented in SubSect. "Problem setting", there is no self-intersection during the manipulation of the DLO. To prevent DLO from self-intersecting, we ignored the twist thereby limiting the DoFs of each action to only translations and without rotations ($\mu =2\times 3$). Table 1 presents the main simulation parameters related to the deformation of the DLO. The MPC hyperparameters were set to $T=400,H=5,\mathcal{I}=30$, and ${\omega }_{1}=1.0~\text{m}^{-1},{\omega }_{2}=0.01~\text{m}^{2},{\omega }_{3}=0.5~{\text{s}}^{2}/\text{m}^{2}$. The initial state of the DLO was a straight line (Fig. 5), and the goal state was randomly generated. Three cylinders with identical radii of 0.03 m but varying heights of 0.13, 0.16, and 0.19 m were installed as obstacles on the vertical plane.

Table 1 Main simulation parameters of the DLO

Full size table

Simulation results

We tested the method on 10 different goal shapes, and the error for all cases was below 5 mm with an average error of approximately 2.7 mm. Figure 5 shows two examples of manipulation processes, which demonstrate the ability of the proposed control scheme to avoid obstacles and move the DLO from the initial state to the goal state. No prior data were collected to learn the dynamics of the DLO.

Real-world experiments

We conducted real-world experiments to investigate the accuracy of the proposed control scheme on different types of DLOs and the effectiveness of online parameter adaptation at narrowing the sim-to-real gap. We also conducted an ablation study to evaluate the effectiveness of individual modules of the proposed control scheme.

Instrument and setting

As shown in Fig. 6, one end of the DLO was grasped by a CR5 robot while the other end remained fixed. To prevent the DLO from self-occlusion, which makes it difficult to measure the state of the DLO from the real world. Similar to the simulation setting, again we did not consider the twist thereby limiting the DoFs of the end-effector to only translations and without rotations ($\mu =3$). The experiments were performed with a black background and white DLO. An Azure Kinect camera was used to acquire both RGB and depth images of the DLO, which were processed by using OpenCV. The positions of obstacles were determined by augmented reality markers. The MPC hyperparameters were identical to those used in the simulation experiments (Sect. "Simulations"). Two DLOs were tested using three different goal states. The first DLO was a softer rope with a length of 0.45 m and diameter of 8 mm. The second DLO was a harder cable with a length of 0.55 m and diameter of 5 mm. The states of the DLOs were represented by 30 and 40 nodes, respectively.

Figure 7 shows the markerless perception algorithm that we used to obtain the actual state of the DLO. The inputs consisted of RGB and depth images that contained the DLO, the length $L$ of the DLO, and the number n of nodes used to represent the state of the DLO. An RGB-D camera was used to capture the images from which the length was measured. The number of nodes was artificially defined to ensure adequate representation of the DLO geometry. Segmenting the DLO region in the image is an essential step that can be done by traditional image processing or data-driven approaches, but this was beyond the scope of this study. Therefore, we set the background to a pure black color that contrasted with the white color of the DLO so that the DLO could be segmented by binarization. Then, the centerline of the DLO was extracted by using skeletonization. The pixels of the centerline were projected into a 3D point cloud and sorted. The first and last points in the sorted point cloud were treated as the first node ${{\varvec{x}}}_{1}$ and last node ${{\varvec{x}}}_{n}$. The remaining nodes were then searched for in the point cloud by a greedy algorithm, which ensured that the error of the Euclidean distance between neighboring nodes was under a given threshold $\epsilon$ with respect to $L/(n-1)$. Once all the nodes were found, a chain was obtained consisting of nodes that were uniformly distributed over the DLO. This approach enabled quick detection of the actual state of the DLO while keeping it physically inextensible.

Real-world experimental results

Table 2 presents the differences between the model parameters of the rope and cable owing to the differences in their properties. In particular, the cable had a larger stiffness coefficient than the rope, which is consistent with their material properties. Figure 8 shows the manipulation process of the DLOs. The visual perception algorithm allowed for robust real-time acquisition of the DLO state for online adaptation of model parameters. Even when the physical properties of the DLO were altered, the proposed control scheme could adapt the parameters to reduce the sim-to-real gap. Thus, we were able to successfully move the DLO to avoid obstacles and reach the goal state. As presented in Table 3, the average error in the real-world experiments was approximately 1 cm. Each error was defined as the average of the sum of the node-by-node Euclidean distances between the state after the manipulation ${{\varvec{s}}}_{T}$ and the goal state ${{\varvec{s}}}^{g}$, i.e. $\Vert {{\varvec{s}}}_{T}-{{\varvec{s}}}^{g}\Vert =\frac{1}{n}\sum_{i=1}^{n}\Vert {{\varvec{x}}}_{i}-{{\varvec{x}}}_{i}^{g}\Vert$. Several factors contributed to the larger error than in the simulation, such as limitations in the camera accuracy, errors in robot–camera calibration, and imprecise DLO modeling. However, because we did not use any markers to acquire the state of the DLO and did not learn its dynamics based on any prior data, we believe that the proposed control scheme is effective at controlling the shape of the DLO. Movie 1 in the supplementary information demonstrate the manipulation of the DLOs in detail.

Table 2 Parameter adaptation

Full size table

Table 3 Errors (in meters) of shape control in the real world

Full size table

Results of the ablation study

Neutral networks We conducted simulations to test the contribution of the NN to the shape control of the DLO. In scenarios without the NN, the DLO simulation constructed is end-to-end differentiable, allowing for the direct calculation of the gradient of the objective function value with respect to the action. Subsequently, a gradient descent method, such as Adam, was employed to optimize the sequence of actions for each MPC horizon. Fig. 9 (left) shows the results. For a given goal state and timespan, the average and variance of the control error were significantly reduced with the NN than without it. Despite our best efforts to optimize the action sequences by tuning the hyperparameters, the results still had a high probability of falling into a local optimum without the NN.

Online parameter adaptation Next, we investigated the contribution of online parameter adaptation to shape control using the fixed parameters given in Table I for the rope. The goal state was GOAL 1, and no online parameter adaptation was enabled. The results are shown in Fig. 9 (right). The error converged to a smaller value with online parameter adaptation than without, which can be attributed to narrowing of the sim-to-real gap. As shown in Fig. 10, a large discrepancy was observed between the simulated and real-world states without online parameter adaptation, which caused the shape control to converge to a local optimum that ultimately left the DLO in a position far from the goal state.

Conclusion

In this paper, we proposed a control scheme for DLOs that eliminates the need for a priori data and specialized controllers. Our scheme uses differentiable simulation to facilitate not only forward dynamics transfer but also error backpropagation to obtain gradients useful for task optimization, and the shape control combines MPC with an embedded NN controller and online parameter adaptation to narrow the sim-to-real gap. Simulations and real-world experiments showed that the proposed control scheme was able to accurately manipulate DLOs into desired shapes while avoiding obstacles with average errors of 3 mm and 1 cm, respectively. Ablation studies were conducted that demonstrated the necessity of the NN controller and online parameter adaptation module for the proposed control scheme.

However, the proposed control scheme still has some limitations. First, the sim-to-real gap is still present, which is inevitable because DLOs are not made of completely isotropic materials in the real world, and they are prone to plastic deformation and perception errors. Therefore, building a perfect model that completely matches the real-world deformation of a DLO is not possible, which explains the much larger control error in the real-world experiment than in simulation. Second, our study was focused on control rather than planning, so the path of actions may not be globally optimal. In addition, the control actions were limited to translation because the DLO interacting with itself was not considered. Third, the computation time was relatively long at approximately 1.7 s to complete a control horizon. This limitation can be attributed to two primary factors. The first reason is that the more precise the simulation model is, the greater the number of computational steps required, which in turn results in a more complex computational graph for automatic differentiation. The second reason is that the currently employed optimization method, namely gradient descent, necessitates a substantial number of iterations before the error converges. To reduce the computational time, some unnecessary computational steps may need to be removed from the computational graph of automatic differentiation. Furthermore, more efficient optimization algorithms would need to be employed to substitute the simple gradient descent method and thus decrease the number of optimization iterations. Finally, the existing control scheme is unable to effectively address the issue of shaping the DLO through contact with the external environment. This is because the constructed objective function solely considers the artificial potential field for obstacle avoidance, which would result in the DLO maintaining a distance from the obstacle. To achieve the desired outcome of contact with the external environment, it is necessary to modify the objective function accordingly.

In future work, we plan to investigate tracking algorithms to improve the robustness of the proposed control scheme against occlusion and to refine the model of the DLO dynamics to further narrow the sim-to-real gap. We also plan to achieve more complex control tasks, particularly in instances where contact with the external environment is necessary to shape the DLO. To this end, a novel objective function that can discern which contact is beneficial and which is detrimental may be required.

Availability of data and materials

No datasets were generated or analysed during the current study.

References

Yin H, Varava A, Kragic D (2021) Modeling, learning, perception, and control methods for deformable object manipulation. Sci Robot. https://doi.org/10.1126/scirobotics.abd8803
Article Google Scholar
Arriola-Rios VE, Guler P, Ficuciello F et al (2020) Modeling of deformable objects for robotic manipulation: a tutorial and review. Front Robot AI 7:82. https://doi.org/10.3389/frobt.2020.00082
Article Google Scholar
Laezza R, Gieselmann R, Pokorny FT, Karayiannidis Y (2021) ReForm: a robot learning sandbox for deformable linear object manipulation. In: 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE: Xi’an, China, pp 4717–4723
Wang W, Balkcom D (2018) Knot grasping, folding, and re-grasping. Int J Robot Res 37:378–399. https://doi.org/10.1177/0278364918754676
Article Google Scholar
Chen K, Bing Z, Wu F, et al (2023) Contact-aware shaping and maintenance of deformable linear objects with fixtures. In: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp 1–8
Zhang Z, Dequidt J, Back J et al (2019) Motion control of cable-driven continuum catheter robot through contacts. IEEE Robot Autom Lett 4:1852–1859. https://doi.org/10.1109/LRA.2019.2898047
Article Google Scholar
Ying C, Yamazaki K (2023) Motion generation for shaping deformable linear objects with contact avoidance using differentiable simulation *. In: 2023 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, Koh Samui, Thailand, pp 1–8
Lv N, Liu J, Xia H et al (2020) A review of techniques for modeling flexible cables. Comput Aided Des 122:102826. https://doi.org/10.1016/j.cad.2020.102826
Article MathSciNet Google Scholar
Lv N, Liu J, Ding X et al (2017) Physically based real-time interactive assembly simulation of cable harness. J Manuf Syst 43:385–399. https://doi.org/10.1016/j.jmsy.2017.02.001
Article Google Scholar
Koessler A, Filella NR, Bouzgarrou BC, et al (2021) An efficient approach to closed-loop shape control of deformable objects using finite element models. In: 2021 IEEE International Conference on Robotics and Automation (ICRA). pp 1637–1643
Xu L, Liu Q (2018) Real-time inextensible surgical thread simulation. Int J CARS 13:1019–1035. https://doi.org/10.1007/s11548-018-1739-1
Article Google Scholar
Yang Y, Stork JA, Stoyanov T (2022) Learning differentiable dynamics models for shape control of deformable linear objects. Robot Auton Syst 158:104258. https://doi.org/10.1016/j.robot.2022.104258
Article Google Scholar
Wang C, Zhang Y, Zhang X et al (2022) Offline-online learning of deformation model for cable manipulation with graph neural networks. IEEE Robot Autom Lett 7:5544–5551. https://doi.org/10.1109/LRA.2022.3158376
Article Google Scholar
Yu M, Lv K, Zhong H et al (2023) Global model learning for large deformation control of elastic deformable linear objects: an efficient and adaptive approach. IEEE Trans Robot 39:417–436. https://doi.org/10.1109/TRO.2022.3200546
Article Google Scholar
McConachie D, Power T, Mitrano P, Berenson D (2020) Learning when to trust a dynamics model for planning in reduced state spaces. IEEE Robot Autom Lett 5:3540–3547. https://doi.org/10.1109/LRA.2020.2972858
Article Google Scholar
Huang Y, Xia C, Wang X, Liang B (2023) Learning graph dynamics with external contact for deformable linear objects shape control. IEEE Robot Autom Lett 8:3891–3898. https://doi.org/10.1109/LRA.2023.3264764
Article Google Scholar
Duenser S, Bern JM, Poranne R, Coros S (2018) Interactive robotic manipulation of elastic objects. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp 3476–3481
Lv N, Liu J, Jia Y (2022) Dynamic modeling and control of deformable linear objects for single-arm and dual-arm robot manipulations. IEEE Trans Robot 38:2341–2353. https://doi.org/10.1109/TRO.2021.3139838
Article Google Scholar
Petit A, Ficuciello F, Fontanelli GA, et al (2017) Using physical modeling and rgb-d registration for contact force sensing on deformable objects: In: Proceedings of the 14th International Conference on Informatics in Control, Automation and Robotics. SCITEPRESS - Science and Technology Publications, Madrid, Spain.
Güler P, Pieropan A, Ishikawa M, Kragic D (2017) Estimating deformability of objects using meshless shape matching. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
Liu F, Su E, Lu J et al (2023) Robotic manipulation of deformable rope-like objects using differentiable compliant position-based dynamics. IEEE Robot Autom Lett 8:3964–3971. https://doi.org/10.1109/LRA.2023.3264766
Article Google Scholar
Nozaki K, Ying C, Matsuura Y, Yamazaki K (2023) Manipulation planning for wiring connector-attached cables considering linear object’s deformability. Int J Autom Technol. https://doi.org/10.20965/ijat.2023.p0399
Article Google Scholar
S L (1998) Rapidly-exploring random trees : a new tool for path planning. Research Report 9811
Amato NM, Wu Y (1996) A randomized roadmap method for path and manipulation planning. In: Proceedings of IEEE International Conference on Robotics and Automation.
Bretl T, McCarthy Z (2014) Quasi-static manipulation of a Kirchhoff elastic rod based on a geometric analysis of equilibrium configurations. Int J Robot Res 33:48–68. https://doi.org/10.1177/0278364912473169
Article Google Scholar
Moll M, Kavraki LE (2006) Path planning for deformable linear objects. IEEE Trans Rob 22:625–636. https://doi.org/10.1109/TRO.2006.878933
Article Google Scholar
Mishani I, Sintov A (2022) Real-time non-visual shape estimation and robotic dual-arm manipulation control of an elastic wire. IEEE Robot Autom Lett 7:422–429. https://doi.org/10.1109/LRA.2021.3128707
Article Google Scholar
Roussel O, Fernbach P, Taïx M (2020) Motion planning for an elastic rod using contacts. IEEE Trans Autom Sci Eng 17:670–683. https://doi.org/10.1109/TASE.2019.2941046
Article Google Scholar
Li Y, Wu J, Zhu J-Y, et al (2019) Propagation networks for model-based control under partial observation. In: 2019 International Conference on Robotics and Automation (ICRA). IEEE, Montreal, QC, Canada.
Yan M, Zhu Y, Jin N, Bohg J (2020) Self-supervised learning of state estimation for manipulating deformable linear objects. IEEE Robot Autom Lett 5:2372–2379. https://doi.org/10.1109/LRA.2020.2969931
Article Google Scholar
Zhang W, Schmeckpeper K, Chaudhari P, Daniilidis K (2021) Deformable linear object prediction using locally linear latent dynamics. In: 2021 IEEE International Conference on Robotics and Automation (ICRA).
Yan W, Vangipuram A, Abbeel P, Pinto L (2021) Learning predictive representations for deformable objects using contrastive estimation. In: Proceedings of the 2020 Conference on Robot Learning. PMLR.
Zhu J, Navarro-Alarcon D, Passama R, Cherubini A (2021) Vision-based manipulation of deformable and rigid objects using subspace projections of 2D contours. Robot Auton Syst 142:103798. https://doi.org/10.1016/j.robot.2021.103798
Article Google Scholar
Lagneau R, Krupa A, Marchal M (2020) Automatic shape control of deformable wires based on model-free visual servoing. IEEE Robot Autom Lett 5:5252–5259. https://doi.org/10.1109/LRA.2020.3007114
Article Google Scholar
Yang B, Lu B, Chen W et al (2023) Model-free 3-D shape control of deformable objects using novel features based on modal analysis. IEEE Trans Rob 39:3134–3153. https://doi.org/10.1109/TRO.2023.3269347
Article Google Scholar
Berenson D (2013) Manipulation of deformable objects without modeling and simulating deformation in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE: Tokyo.
Zhu J, Navarro B, Fraisse P, et al (2018) Dual-arm robotic manipulation of flexible cables In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE: Madrid.
Lin X, Wang Y, Olkin J, Held D (2021) SoftGym: Benchmarking deep reinforcement learning for deformable object manipulation. In: Proceedings of the 2020 Conference on Robot Learning. PMLR
Han H, Paul G, Matsubara T (2017) Model-based reinforcement learning approach for deformable linear object manipulation In: 2017 13th IEEE Conference on Automation Science and Engineering (CASE). IEEE: Xi’an.
Wu Y, Yan W, Kurutach T, et al (2020) Learning to manipulate deformable objects without demonstrations. In: Robotics: Science and Systems XVI. Robotics: Science and Systems Foundation
Hiruma H, Ito H, Mori H, Ogata T (2022) Deep active visual attention for real-time robot motion generation: emergence of tool-body assimilation and adaptive tool-use. IEEE Robot Autom Lett 7:8550–8557. https://doi.org/10.1109/LRA.2022.3187614
Article Google Scholar
Hayashi K, Sakaino S, Tsuji T (2022) An independently learnable hierarchical model for bilateral control-based imitation learning applications. IEEE Access 10:32766–32781. https://doi.org/10.1109/ACCESS.2022.3155255
Article Google Scholar
Liang J, Lin M, Koltun V (2019) Differentiable cloth simulation for inverse problems. In: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Degrave J, Hermans M, Dambre J, Wyffels F (2019) A differentiable physics engine for deep learning in robotics. Front Neurorobot 13:6. https://doi.org/10.3389/fnbot.2019.00006
Article Google Scholar
Chen S, Werling K, Wu A, Liu CK (2023) Real-time model predictive control and system identification using differentiable simulation. IEEE Robot Autom Lett 8:312–319. https://doi.org/10.1109/LRA.2022.3226027
Article Google Scholar
Chen S, Liu Y, Yao SW et al (2022) DiffSRL: learning dynamical state representation for deformable object manipulation with differentiable simulation. IEEE Robot Autom Lett 7:9533–9540. https://doi.org/10.1109/LRA.2022.3192209
Article Google Scholar
Millard D, Preiss JA, Barbič J, Sukhatme GS (2023) Parameter estimation for deformable objects in robotic manipulation tasks. In: Billard A, Asfour T, Khatib O (eds) Robotics Research. Springer Nature Switzerland, Cham, pp 239–251
Chapter Google Scholar
Yang Y, Stork JA, Stoyanov T (2022) Online model learning for shape control of deformable linear objects. In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, Kyoto, Japan, pp 4056–4062
Kugelstadt T, Schömer E (2016) Position and orientation based Cosserat rods. In: Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation. Eurographics Association, Goslar, DEU, pp 169–178
Macklin M, Müller M, Chentanez N (2016) XPBD: position-based simulation of compliant constrained dynamics in proceedings of the 9th international conference on motion in games. ACM: Burlingame California.
Hu Y, Anderson L, Li T-M, et al (2020) Difftaichi: differentiable programming for physical simulation. arXiv preprint arXiv

Download references

Acknowledgements

This work was supported by JST SPRING, Grant Number JPMJSP2144 (Shinshu University).

Funding

Not applicable.

Author information

Authors and Affiliations

Graduate School of Medicine, Science and Technology, Shinshu University, Nagano, Japan
Changjian Ying
Faculty of Engineering, Shinshu University, Nagano, Japan
Kimitoshi Yamazaki

Authors

Changjian Ying
View author publications
You can also search for this author in PubMed Google Scholar
Kimitoshi Yamazaki
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors contributed to develop the method. C.Y conceived and designed this study, conducted experiments, and wrote the article. K.Y supervised this study and wrote the article. All authors read and approved the manuscript.

Corresponding author

Correspondence to Changjian Ying.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Ying, C., Yamazaki, K. Obstacle avoidance shape control of deformable linear objects with online parameters adaptation based on differentiable simulation. Robomech J 11, 15 (2024). https://doi.org/10.1186/s40648-024-00283-1

Download citation

Received: 30 April 2024
Accepted: 20 September 2024
Published: 27 September 2024
DOI: https://doi.org/10.1186/s40648-024-00283-1

Obstacle avoidance shape control of deformable linear objects with online parameters adaptation based on differentiable simulation

Abstract

Introduction

Related work

Analytical models

Learning-based methods

Differentiable simulation

Problematization and approach

Problem setting

Outline of the proposed control scheme

Differentiable position-based dynamics

Motivation

Original model

Extensions for our approach

Shape control with online parameters adaptation

Model predictive control

Online adaptation of parameters

Validation

Simulations

Implementation and setting of the simulation

Simulation results

Real-world experiments

Instrument and setting

Real-world experimental results

Results of the ablation study

Conclusion

Availability of data and materials

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's Note

Supplementary Information

Rights and permissions

About this article

Cite this article

Share this article

Keywords