Research Papers

Learning-Based Variable Compliance Control for Robotic Assembly

Author and Article Information
Tianyu Ren

State Key Laboratory of Tribology,
Department of Mechanical Engineering,
Tsinghua University,
Beijing 100084, China
e-mail: ultrain@126.com

Yunfei Dong

State Key Laboratory of Tribology,
Department of Mechanical Engineering,
Tsinghua University,
Beijing 100084, China
e-mail: d_yunfei@163.com

Dan Wu

State Key Laboratory of Tribology,
Department of Mechanical Engineering,
Tsinghua University,
Beijing 100084, China
e-mail: wud@tsinghua.edu.cn

Ken Chen

State Key Laboratory of Tribology,
Department of Mechanical Engineering,
Tsinghua University,
Beijing 100084, China
e-mail: kenchen@tsinghua.edu.cn

1 Corresponding author.

2 Postal address: Room A829, Lee Shau Kee Science and Technology Building, Tsinghua University, Beijing 100084, China.

Contributed by the Mechanisms and Robotics Committee of ASME for publication in the JOURNAL OF MECHANISMS AND ROBOTICS. Manuscript received May 22, 2018; final manuscript received August 20, 2018; published online September 17, 2018. Assoc. Editor: Philippe Wenger.

J. Mechanisms Robotics 10(6), 061008 (Sep 17, 2018) (8 pages). Paper No: JMR-18-1147; doi: 10.1115/1.4041331. History: Received May 22, 2018; Revised August 20, 2018.

Assembly remains a major difficulty for manufacturing automation. Within this domain, the peg-in-hole problem represents a class of manipulation tasks that require continuous motion control in both unconstrained and constrained environments, and it therefore demands extremely careful consideration when performed with robots. In this work, we adapt two ideas underlying human success at manipulation tasks, variable compliance and learning, to robotic assembly. Based on sensing the interaction between the peg and the hole, the proposed controller can switch its operation strategy between passive compliance and active regulation in continuous spaces, which outperforms fixed compliance controllers. Experimental results show that the robot is able to learn a proper stiffness strategy along with the trajectory policy through trial and error. Furthermore, this variable compliance policy proves robust to different initial states and generalizes to more complex situations.
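To make the control scheme concrete, the following is a minimal sketch, assuming a Cartesian impedance formulation in the assembly plane, of how a learned policy that outputs both an incremental virtual trajectory and a stiffness setting could be turned into a commanded wrench. The function names, observation layout, and damping rule are illustrative assumptions, not the authors' implementation.

import numpy as np

def control_step(policy, x, xdot, x_virt, f_ext, zeta=1.0):
    """One variable-compliance control step in the assembly plane (x, y, w)."""
    obs = np.concatenate([x, f_ext])      # observation: insertion progress + contact wrench
    delta_virt, k = policy(obs)           # action: trajectory increment and stiffness (assumed split)
    x_virt = x_virt + delta_virt          # advance the virtual (reference) trajectory
    d = 2.0 * zeta * np.sqrt(k)           # damping chosen for near-critical damping (assumption)
    # Impedance law: low stiffness lets contact forces dominate (passive compliance),
    # high stiffness tracks the virtual trajectory (active regulation).
    wrench_cmd = k * (x_virt - x) - d * xdot
    return wrench_cmd, x_virt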

Copyright © 2018 by ASME


Figures

Fig. 1

The robotic assembly system under consideration. The peg-in-hole problem is implemented in the assembly plane oxy. Here x, y, and w denote the positive directions of the robot motion (rotation), while fx, fy, and mw denote the positive directions of the total external force (moment).

Fig. 2

Passive response of the peg to external forces and moments, and the definition of stiffness in three directions. (a) Response to external vertical forces. (b) Response to external lateral forces. (c) Response to external moments.
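As a point of reference (a hedged reading of Fig. 2, using the symbols of Fig. 1), each directional stiffness can be interpreted as the ratio of the external load to the resulting deflection of the peg from its virtual position:

k_x = \frac{f_x}{\Delta x}, \qquad k_y = \frac{f_y}{\Delta y}, \qquad k_w = \frac{m_w}{\Delta w}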

Fig. 3

Schematic illustration of the insertion controller and its training process with DDPG. The proposed controller is implemented by a neural network, termed the actor in actor-critic methods.
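For illustration, a rough sketch of how such an actor network might look in PyTorch is given below; the layer sizes, observation/action dimensions, and action bounds are hypothetical placeholders rather than the values used in the paper.

import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps the observation (insertion progress, contact force and moment) to a
    bounded action holding an incremental virtual trajectory and a stiffness."""
    def __init__(self, obs_dim=6, act_dim=6, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),   # raw actions in [-1, 1]
        )
        # First three outputs: trajectory increments; last three: stiffness,
        # shifted into a positive range (illustrative bounds).
        self.register_buffer("scale", torch.tensor([1e-3, 1e-3, 1e-2, 500.0, 500.0, 50.0]))
        self.register_buffer("offset", torch.tensor([0.0, 0.0, 0.0, 500.0, 500.0, 50.0]))

    def forward(self, obs):
        return self.net(obs) * self.scale + self.offset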

Fig. 4

Description of the robot and the peg-in-hole components. Torque sensors integrated in the robot joints are used to estimate the contact force and moment in Cartesian space. The initial angle error ew is defined as the angle between the axes of the peg and the hole in the assembly plane at the starting position of the task.

Fig. 5

Learning curve of the neural controller. (a) Cumulative reward of an episode. (b) Achieved insertion progress.

Fig. 6

Insertion process conducted by the proposed controller with ew = 0 deg. (i) Observation: insertion progress. (ii) Observation: contact force and moment. (iii) Action: incremental virtual trajectory. (iv) Action: stiffness.

Fig. 7

Stages of the assembly process with a large initial angle error. The corresponding step number is marked at the bottom of each picture.

Fig. 8

Insertion process conducted by the proposed controller with ew = 8 deg. (i) Observation: insertion progress. (ii) Observation: contact force and moment. (iii) Action: incremental virtual trajectory. (iv) Action: stiffness.

Fig. 9

Illustration of the rolling phase. (a) The starting state. The robot's active force and moment on the left are equivalent to the force and moment acting on the upper contact point p1. (b) The terminal state. The peg rotates and slides downward until it jams in a deeper position; p2 is the new contact point.

Fig. 10

Performance comparison of the controllers with selective compliance and uniformly high compliance. (i) Observation: insertion progress. (ii) Observation: contact force in the y-direction.

Fig. 11

Description of the deformable peg for the insertion skill

Fig. 12

Learning curve of transfer learning for the deformable workpiece
