Fig. 4 | ROBOMECH Journal

Fig. 4

From: Optimization algorithm for feedback and feedforward policies towards robot control robust to sensing failures

Network architecture of the proposed method. It contains seven modules: the encoder \(q(z \mid s, h^s)\), the decoder \(p(s^\prime \mid z^\prime)\), the time-dependent prior \(q(z \mid h^s)\), the dynamics \(f(z, a)\), the value function \(V(s)\), and the feedback (FB) and feedforward (FF) policies \(\pi _{\mathrm{FB}}(a \mid s)\) and \(\pi _{\mathrm{FF}}(a \mid h^a)\), together with two RNN features, \(h^s\) and \(h^a\). \(\pi _{\mathrm{FB}}\) and \(\pi _{\mathrm{FF}}\) are composed into \(\pi\) while being regularized toward each other.
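
The data flow between the seven modules named in the caption can be sketched roughly as follows. This is only an illustrative wiring under several assumptions and not the authors' implementation: every module is replaced by a toy linear map that outputs a mean instead of a distribution, the dimensions (`S_DIM`, `A_DIM`, `Z_DIM`, `H_DIM`) are hypothetical, the RNN features \(h^s\) and \(h^a\) are passed in as given vectors, the composition of \(\pi _{\mathrm{FB}}\) and \(\pi _{\mathrm{FF}}\) is assumed to be a convex combination with a hypothetical weight `w_fb`, and the mutual regularization is stood in for by a squared gap between the two policy outputs.

```python
import random

# Hypothetical sizes; the caption does not give any dimensions.
S_DIM, A_DIM, Z_DIM, H_DIM = 4, 2, 3, 5


def linear(in_dim, out_dim, rng):
    """Toy bias-free linear map used as a stand-in for a neural network module."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def build_modules(seed=0):
    """Create the seven modules listed in the caption (means only, no variances)."""
    rng = random.Random(seed)
    return {
        "encoder": linear(S_DIM + H_DIM, Z_DIM, rng),   # q(z | s, h^s)
        "decoder": linear(Z_DIM, S_DIM, rng),           # p(s' | z')
        "prior": linear(H_DIM, Z_DIM, rng),             # q(z | h^s)
        "dynamics": linear(Z_DIM + A_DIM, Z_DIM, rng),  # f(z, a)
        "value": linear(S_DIM, 1, rng),                 # V(s)
        "pi_fb": linear(S_DIM, A_DIM, rng),             # pi_FB(a | s)
        "pi_ff": linear(H_DIM, A_DIM, rng),             # pi_FF(a | h^a)
    }


def step(modules, s, h_s, h_a, w_fb=0.5):
    """One forward pass through the sketched architecture."""
    z = modules["encoder"](s + h_s)    # latent state from observation and h^s
    z_prior = modules["prior"](h_s)    # latent state predicted from h^s alone
    a_fb = modules["pi_fb"](s)         # feedback action from the observed state
    a_ff = modules["pi_ff"](h_a)       # feedforward action from the action history

    # Assumption: compose the two policies as a convex combination of their means.
    a = [w_fb * x + (1.0 - w_fb) * y for x, y in zip(a_fb, a_ff)]

    z_next = modules["dynamics"](z + a)  # predicted next latent state z'
    s_next = modules["decoder"](z_next)  # predicted next observation s'

    # Assumption: squared gap as a placeholder for the FB/FF regularization term.
    policy_gap = sum((x - y) ** 2 for x, y in zip(a_fb, a_ff))

    return {
        "a": a,
        "s_next": s_next,
        "value": modules["value"](s)[0],
        "z_prior": z_prior,
        "policy_gap": policy_gap,
    }
```

Calling `step(build_modules(), s, h_s, h_a)` with lists of length `S_DIM`, `H_DIM`, and `H_DIM` returns the composed action, the predicted next state, the value estimate, the prior latent, and the FB/FF gap. Setting `w_fb=0.0` in this sketch uses the feedforward output alone, which loosely mirrors the case where the observation `s` is unavailable.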
