Development of 3D viewer based teleoperation interface for Human Support Robot HSR
© Yaguchi et al.; licensee Springer. 2014
Received: 14 January 2014
Accepted: 4 July 2014
Published: 12 October 2014
In this paper, we introduce a 3D-viewer-based teleoperation system for manipulation tasks of the Human Support Robot HSR. The system integrates three functions: 3D environment information visualization based on the Manhattan-world assumption, known object recognition using the LINE-MOD algorithm, and virtual teaching with a 3D robot model using interactive markers. We show experimental results of manipulation tasks performed with the proposed system.
In Japan, the population is estimated to be aging faster than in other countries, and the resulting shortage of labor and caregivers is a very serious problem. Supporting the self-reliance of elderly and handicapped persons is also an important issue from the viewpoint of quality of life. To address these problems, we are developing the Human Support Robot (HSR), a safe, compact, highly functional robot with a wide working area, for elderly or handicapped persons and their families or caregivers.
I) It is difficult to estimate the distance to a target object or the space between the robot and the environment.
II) Consumers need to use ordinary shelves, not shelves custom-made for robots with visual markers.
III) In object manipulation tasks, it is difficult to move the hand to the target position accurately.
I) Wide-view representation of environment information around the robot and the grasp target object
II) Known object recognition without visual markers
III) A robot operation HMI that makes the robot status easy to understand and the robot easy to operate
To realize these functions, we developed a 3D face set representation of the environment converted from point clouds obtained with a 3D camera, robust markerless recognition of known objects using an improved LINE-MOD, and a motion teaching method that applies interactive markers to a 3D robot model.
In this paper, we developed a novel teleoperation interface that shows the robot status, the surrounding environment, and detected known objects on a 3D viewer. We also constructed a teleoperation and motion teaching system in which the user operates a 3D robot model on the developed 3D viewer.
2 HSR: Human Support Robot and teleoperation system
In unknown real environments in particular, it is very difficult to sense, plan, and act autonomously, so direct operation of the robot is sometimes the most realistic and fastest way to achieve a task. We also consider the operation of structured objects, such as furniture or doors, as a higher-level task. In related work, Chitta et al. achieved an autonomous door opening task with the PR2 using motion planning based on graph search with fast collision checking, assuming that the door model is known. In general, however, the door model is unknown, so a method for a human to teach the model is necessary. Sturm et al. proposed a furniture model construction method based on tracking the furniture while it is being operated, using the assumption that the front side of the furniture is rectangular. In that work, the robot actually operated the furniture and obtained its kinematic model. Azuma et al. proposed a multi-touch tablet interface that limits operations to push, pull, and revolve. The operation type and amount are instructed directly on the input image from the camera. Azuma's system can operate various structured objects with a simple solution; however, it cannot reuse instructed operations. Yamazaki et al. proposed a method that can add manipulable objects by creating models from texture and shape information measured by external sensors. Yamazaki defined the drawer opening task by the robot as a combination of the following subtasks.
i) Teaching knowledge that is necessary to operate the object.
ii) Adding ID tags or artificial markers to target object.
iii) Teaching how to operate via teleoperation or direct teaching.
According to Yamazaki's approach, a robot can operate furniture when the following 3 abilities are available:
i) A human can estimate the knowledge necessary for operation from environment information.
ii) The robot can recognize a target object autonomously.
iii) The robot can be teleoperated.
These factors are consistent with the HMI problems described in the introduction.
We propose the following 3 functions:
Lightweight and easy-to-understand representation and storage of environment information.
Robust recognition and pose estimation of known objects.
Direct operation and motion teaching using an interactive 3D robot model.
In this paper, we develop a user interface for teleoperating the Human Support Robot HSR that combines these 3 functions. It shows the robot pose and the surrounding environment simultaneously by representing environment, object, and robot information on a 3D viewer.
3.1 3D environment reconstruction and representation
To realize a useful human-machine interface, we propose a method that converts a 3D point cloud into a set of orthogonal faces, so that 3D environment information is shown on the 3D viewer in a lightweight and easy-to-understand representation.
We consider employing a 3D reconstruction method such as visual SLAM to show 3D environment information. RGBDSLAM and KinectFusion are popular visual SLAM methods that use a 3D camera. In RGBDSLAM, environment information is stored as a dense point cloud using key frames of the input images. KinectFusion stores environment information as infinite 3D voxels. However, these representations do not have enough shape features.
An indoor room is typically composed of mutually orthogonal planes, such as the floor, ceiling, and walls. This assumption about the structure of indoor environments is called the Manhattan-world assumption. Using this shape assumption, an easy-to-understand representation of environment information can be realized. Furukawa et al. used this assumption to reconstruct and estimate the structure of buildings from a photograph database. Yaguchi et al. applied this assumption to a 3D camera and proposed a fast and lightweight method to construct a 3D environment model of a room. In this paper, we represent environment information using the method of Yaguchi et al.
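The core idea of the Manhattan-world assumption can be illustrated with a minimal sketch. It is not the method of Yaguchi et al.; it only shows how a detected plane normal might be snapped to one of three orthogonal axes, and how the distance between two planes on the same axis follows from their offsets. The axis frame `AXES`, the function names, and the 15 degree threshold are all assumptions made for illustration.

```python
import math

# Hypothetical Manhattan-world frame: three mutually orthogonal unit axes.
AXES = {"x": (1.0, 0.0, 0.0), "y": (0.0, 1.0, 0.0), "z": (0.0, 0.0, 1.0)}


def normalize(v):
    """Return v scaled to unit length (v must be non-zero)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def snap_normal(normal, max_angle_deg=15.0):
    """Snap a plane normal to the closest Manhattan axis.

    Returns (axis_name, sign), or None when the normal deviates from every
    axis by more than max_angle_deg (an assumed threshold).
    """
    n = normalize(normal)
    best_name, best_dot = None, 0.0
    for name, axis in AXES.items():
        dot = sum(a * b for a, b in zip(n, axis))
        if abs(dot) > abs(best_dot):
            best_name, best_dot = name, dot
    if abs(best_dot) < math.cos(math.radians(max_angle_deg)):
        return None
    return best_name, (1 if best_dot > 0 else -1)


def plane_distance(offset_a, offset_b):
    """Distance between two planes snapped to the same axis.

    Each offset is the signed position of the plane along that axis.
    """
    return abs(offset_a - offset_b)


# A slightly tilted wall normal is snapped to the x axis.
print(snap_normal((0.98, 0.05, -0.02)))  # ('x', 1)
# A 45 degree surface is rejected as non-Manhattan.
print(snap_normal((1.0, 1.0, 0.0)))      # None
```

Planes that fail the threshold would have to be kept in some other form, for example as residual points.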
Table: Results of plane detection; comparison of distance between 2 planes. Column: Ground Truth [mm]. Rows: Microwave - Wall (1); Dashboard - Floor (1); Microwave - Wall (2); Dashboard - Floor (2).
Table: Specification of 3D environment models. Items: Input points size (ave.) [kB]; Face set size (ave.) [kB]; Texture image size (ave.) [kB]; Rest points size (ave.) [kB]; Compression rate [%]; Input points num; Rest points num (ave.); Rejected rate [%].
3.2 Object recognition and pose estimation of known objects
Results of door lever recognition
3.3 Robot teleoperation and motion teaching using 3D robot model
To achieve teleoperation and motion teaching through intuitive action generation, we develop a teleoperation system that applies interactive markers to a 3D robot model. Interactive markers are a framework that adds user interaction through mouse input to 3D models in rviz, the 3D viewer of ROS. Using this framework, users can teach motions directly to the robot by operating the 3D robot model.
Robot centric action: operating the robot's own motion directly. Interactive markers are fixed on the robot body.
Environment centric action: operating built-in structures in the environment. Interactive markers are fixed on the environment.
Object centric action: operating objects. Interactive markers are fixed on objects detected using LINE-MOD.
These 3 actions can be used from our interface simultaneously without switching modes, and a robot task is realized as a combination of these actions.
Motion teaching is realized by recording the sequence of these actions, so that the robot can play back the same task.
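The record and play-back idea can be sketched as follows. This is an illustrative model only, not the authors' implementation: the classes `Action` and `MotionTeacher`, the handler table, and the action names are hypothetical, and a real system would send commands to the robot instead of appending to a log.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Action:
    """One taught step: the marker group it belongs to and its parameters."""
    kind: str                      # "robot", "environment", or "object"
    name: str                      # hypothetical name, e.g. "base_move"
    params: Tuple[float, ...] = ()


@dataclass
class MotionTeacher:
    """Records actions as the operator performs them and replays them later."""
    handlers: Dict[str, Callable[[Action], None]]
    sequence: List[Action] = field(default_factory=list)

    def perform(self, action: Action) -> None:
        """Execute an action immediately and append it to the taught sequence."""
        self.handlers[action.kind](action)
        self.sequence.append(action)

    def play_back(self) -> None:
        """Re-execute the taught sequence in the recorded order."""
        for action in self.sequence:
            self.handlers[action.kind](action)


# Stand-in executor: a real handler would command the robot.
log = []
execute = lambda action: log.append((action.kind, action.name))
teacher = MotionTeacher({"robot": execute, "environment": execute, "object": execute})

teacher.perform(Action("environment", "move_to_spot", (1.0, 2.0)))
teacher.perform(Action("object", "grasp"))
teacher.play_back()  # repeats both steps in the same order
```

Keeping the three action kinds in one sequence mirrors the point that they are mixed freely within a single task.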
3.3.1 Robot centric action
In robot centric action, interactive markers for robot operation are shown with the robot model in the 3D viewer. The user can command not only joint-level actions, such as moving the base or changing the head direction, but also arm reaching actions, by setting a target hand pose that is processed by a motion planner and an inverse kinematics solver.
Figure 13 shows the basic interactive operation system for HSR. In robot centric action, the robot can perform the following basic functions:
Base move (Figure 13(a))
Neck pan/tilt (Figure 13(b))
Head up/down (Figure 13(b))
Hand reaching to target pose (Figure 13(c))
The system can also teach and play back motions that combine these basic functions.
To control these markers, the user drags the arrow or circular handles around each joint with the mouse. An arrow handle sets a moving distance, and a circular handle sets a rotation angle such as a joint angle. For the hand reaching marker, the user sets the target hand pose directly using 6 DoF arrow and circular handles. After setting the target, the user right-clicks and selects “go to target” from the menu; inverse kinematics is then solved and the motion is executed.
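As a rough illustration of how handle drags could map to commands, the sketch below applies an arrow drag and a circular drag to a planar base pose. It is a simplified assumption, not the HSR controller: `BasePose`, `drag_arrow`, and `drag_circle` are hypothetical names, and the base is assumed to translate along its current heading.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePose:
    x: float    # metres, world frame
    y: float    # metres, world frame
    yaw: float  # radians


def drag_arrow(pose: BasePose, distance: float) -> BasePose:
    """Arrow handle: translate along the robot's current heading."""
    return BasePose(pose.x + distance * math.cos(pose.yaw),
                    pose.y + distance * math.sin(pose.yaw),
                    pose.yaw)


def drag_circle(pose: BasePose, angle: float) -> BasePose:
    """Circular handle: rotate in place, with yaw wrapped to [-pi, pi]."""
    total = pose.yaw + angle
    return BasePose(pose.x, pose.y, math.atan2(math.sin(total), math.cos(total)))


# Turn left by 90 degrees, then advance 1 m: the base ends near (0, 1).
target = drag_arrow(drag_circle(BasePose(0.0, 0.0, 0.0), math.pi / 2), 1.0)
print(target)
```

The hand reaching marker would need a full 6 DoF pose and an inverse kinematics solver, which this sketch does not attempt.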
3.3.2 Environment centric action
In environment centric action, interactive markers are placed in the environment to realize the operation of movable structures. The structure of an object is described using parent-child relationships between interactive markers.
To place interactive markers in the environment, we construct a 6 DoF interactive marker that is not associated with any part but is placed in the world coordinate frame. The user can move this marker freely and can copy it at any time by right-clicking and selecting “copy” from the menu list. A parent-child relationship is specified from the menu list of the parent marker by selecting child markers one by one.
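The parent-child description of a movable structure can be sketched with a small pose tree. The example is planar (x, y, yaw) for brevity, whereas the markers in the text are 6 DoF; the `Marker` class and its methods are hypothetical, and whether a copied marker keeps its parent is an assumption (here it does not).

```python
import math


class Marker:
    """A planar marker whose pose (x, y, yaw) is relative to its parent."""

    def __init__(self, name, x=0.0, y=0.0, yaw=0.0):
        self.name = name
        self.x, self.y, self.yaw = x, y, yaw
        self.parent = None
        self.children = []

    def set_parent(self, parent):
        """Attach this marker under parent, detaching it from any old parent."""
        if self.parent is not None:
            self.parent.children.remove(self)
        self.parent = parent
        parent.children.append(self)

    def copy(self, name):
        """Duplicate the local pose only; the copy starts without a parent."""
        return Marker(name, self.x, self.y, self.yaw)

    def world_pose(self):
        """Compose local poses up the parent chain to obtain the world pose."""
        if self.parent is None:
            return self.x, self.y, self.yaw
        px, py, pyaw = self.parent.world_pose()
        c, s = math.cos(pyaw), math.sin(pyaw)
        return (px + c * self.x - s * self.y,
                py + s * self.x + c * self.y,
                pyaw + self.yaw)


# A door hinge with a lever 0.8 m away along the door panel.
hinge = Marker("hinge", 2.0, 1.0, 0.0)
lever = Marker("lever", 0.8, 0.0, 0.0)
lever.set_parent(hinge)

hinge.yaw = math.pi / 2    # swing the door by 90 degrees
print(lever.world_pose())  # the lever follows the hinge, ending near (2.0, 1.8)
```

Because the child stores only its local pose, moving or rotating the parent marker updates every dependent marker without further teaching.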
3.3.3 Object centric action
In object centric action, interactive markers are placed on the recognition results of known objects, and the operator teaches how to operate each object. Considering the door structure in Figure 14 again, when the door lever is recognized as a known object, the rotational axis of the door is determined from the estimated pose of the door lever. The system can therefore also adapt to a half-opened door or to localization error.
4 Results and discussion
4.1 Task operation using 3D viewer based teleoperation interface
We evaluated the proposed system in 2 teleoperation experiments: a pick-and-place task and a door opening task. In the experiments, a single operator performed the tasks.
4.1.1 Object pick-and-place task
Moving to the spot near the table (Environment centric).
Picking the object up (Object centric).
Moving to the spot near the shelf (Environment centric).
Putting the object on the shelf (Environment centric).
In this case, we taught the grasp pattern of the object as an object centric action, and the spots associated with the table and the shelf as environment centric actions. When the position of the table or the shelf changes, the system can adapt once a human resets the position of the spot.
4.1.2 Door opening task
Secondly, we performed a door opening task. For the robot to open the door, a human has to teach the door axis and its rotational direction, the door lever axis and its rotational direction, the positions of these parts, and the manipulation sequence and amounts. In this experiment, the door lever is defined as a known object and can be detected using LINE-MOD; the human teaches all remaining information.
Moving to the spot near the door.
Moving head to the door lever.
Grasping the door lever.
Rotating the door lever down.
Pulling the door lever while rotating around the pivot of the door.
Rotating the door lever up.
Releasing the door lever.
Leaving the door.
Teleoperation by a human while placing interactive markers.
Teleoperation by a human using already placed interactive markers.
Playback of the human's teleoperation using already placed interactive markers and manipulation amounts.
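The pulling step of the door opening sequence, in which the hand follows the lever on an arc around the door pivot, can be sketched as a list of planar waypoints. This is a geometric illustration under assumed inputs, not the motion generation used in the experiment: the function name, the planar simplification, and the number of steps are all hypothetical.

```python
import math


def door_arc_waypoints(hinge_xy, lever_xy, open_angle, steps=5):
    """Hand positions for pulling a lever along an arc around the hinge.

    hinge_xy and lever_xy are planar world positions in metres; open_angle
    is the signed door rotation in radians (the taught manipulation amount).
    """
    hx, hy = hinge_xy
    dx, dy = lever_xy[0] - hx, lever_xy[1] - hy
    waypoints = []
    for i in range(1, steps + 1):
        a = open_angle * i / steps
        c, s = math.cos(a), math.sin(a)
        # Rotate the hinge-to-lever vector by the intermediate angle a.
        waypoints.append((hx + c * dx - s * dy, hy + s * dx + c * dy))
    return waypoints


# Lever 0.8 m from the hinge, door opened by 90 degrees in 3 steps.
for point in door_arc_waypoints((0.0, 0.0), (0.8, 0.0), math.pi / 2, steps=3):
    print(point)
```

Every waypoint stays at the same radius from the hinge, so replaying a stored angle reproduces the same arc even if the hinge marker has been moved.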
Note that the door in this experiment has no spring and does not close by itself, because the robot cannot open the door completely within its range of motion. However, this is a problem of motion planning and hardware specification; the proposed method can also be used to open a door that has a door closer.
In this paper, we developed a 3D-viewer-based teleoperation user interface for the Human Support Robot HSR that is easy to use and understand, and we showed usage examples with an object pick-and-place task and a door opening task. These 2 tasks cannot be achieved using only a dialog-based HMI, because they require moving to unregistered spots and instructing physical amounts of operation; the proposed method makes these difficult tasks achievable. In particular, the proposed system, which constructs a 3D map in unknown environments and has an intuitive teaching and playback function using the virtual robot model, can drastically reduce the time needed to achieve a task. If robots can construct object models during operation, the proposed task achievement method can be completed within the robot's own motion, which would further increase usability in real environments. In future work, we will conduct operation tests with users to verify the usability of the proposed system and to identify further problems of teleoperation in real environments.
- Hashimoto K, Saito F, Yamamoto T, Ikeda K: A field study of the human support robot in the home environment. In 2013 IEEE Workshop on Advanced Robotics and Its Social Impacts. IEEE Robotics and Automation Society, Tokyo, Japan; 2013:143–150. 10.1109/ARSO.2013.6705520
- Chitta S, Cohen B, Likhachev M: Planning for autonomous door opening with a mobile manipulator. In Robotics and Automation (ICRA), 2010 IEEE International Conference On. IEEE, Anchorage, Alaska; 2010:1799–1806. 10.1109/ROBOT.2010.5509475
- Sturm J, Stachniss C, Burgard W: A probabilistic framework for learning kinematic models of articulated objects. J Artif Intell Res (JAIR) 2011, 41: 477–626.
- Azuma H, Kakiuchi Y, Saito M, Okada K, Inaba M: View-base multi-touch gesture interface for furniture manipulation robots. In IEEE Workshop on Advanced Robotics and Its Social Impacts. IEEE Robotics and Automation Society, California, USA; 2011.
- Yamazaki K, Tsubouchi T, Tomono M: Furniture model creation through direct teaching to a mobile robot. J Robot Mechatronics 2008, 20(2): 213–220.
- Engelhard N, Endres F, Hess J, Sturm J, Burgard W: Real-time 3d visual slam with a hand-held rgb-d camera. In Proc. of the RGB-D Workshop on 3D Perception in Robotics at the European Robotics Forum. Robotdalen, Vasteras, Sweden; 2011.
- Newcombe RA, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison AJ, Kohli P, Shotton J, Hodges S, Fitzgibbon A: Kinectfusion: Real-time dense surface mapping and tracking. In IEEE ISMAR. IEEE, Basel, Switzerland; 2011.
- Furukawa Y, Curless B, Seitz SM, Szeliski R: Reconstructing building interiors from images. In Computer Vision, 2009 IEEE 12th International Conference On. IEEE, Kyoto, Japan; 2009:80–87. 10.1109/ICCV.2009.5459145
- Yaguchi H, Takaoka Y, Yamamoto T, Inaba M: A method of 3d model generation of indoor environment with manhattan world assumption using 3d camera. In Proceedings of the 2013 IEEE/SICE International Symposium on System Integration. IEEE Robotics and Automation Society, Kobe, Japan; 2013:759–765. 10.1109/SII.2013.6776686
- Hinterstoisser SSH, Cagniart C, Ilic S, Konolige K, Navab N, Lepetit V: Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In IEEE International Conference on Computer Vision (ICCV). IEEE, Barcelona, Spain; 2011.
- Gossow D, Leeper A, Hershberger D, Ciocarlie MT: Interactive markers: 3-d user interfaces for ros applications [ros topics]. IEEE Robot Automat Mag 2011, 18(4): 14–15. 10.1109/MRA.2011.943230
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.