Structured Bayesian Meta-Learning for Data-Efficient Visual-Tactile Model Estimation

Shaoxiong Yao1, Yifan Zhu2, Kris Hauser1
1University of Illinois at Urbana-Champaign, IL, USA. 2Yale University, CT, USA.

TL;DR: We enable data-efficient visual-tactile model estimation by learning a prior over visual-tactile models from diverse real-world objects.

Offline visual-tactile dataset collection

Here we show the robot touching several artificial plants. We collect RGBD video streams and corresponding joint torque sensor readings.
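
For concreteness, a single touch interaction could be bundled into a record like the one sketched below; the field names and shapes are our own placeholders and do not reflect the released data format.

from dataclasses import dataclass
import numpy as np

@dataclass
class TouchRecord:
    """One robot touch on a training object (hypothetical schema)."""
    rgb: np.ndarray            # (T, H, W, 3) RGB frames recorded during the touch
    depth: np.ndarray          # (T, H, W) aligned depth frames, in meters
    joint_angles: np.ndarray   # (T, 7) arm configuration at each frame
    joint_torques: np.ndarray  # (T, 7) joint torque sensor readings, in N*m
    contact_point: np.ndarray  # (3,) estimated contact location in the camera frame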

Zero-shot prediction from vision


Given a novel object, we use the prior learned offline to predict stiffness from appearance alone. Here, the branch region is predicted to be stiffer than the leaf region, aligning with our intuition.
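
Below is a minimal sketch of how such a vision-only stiffness prior could be parameterized, assuming per-point visual features have already been extracted from the RGBD observation; the module and all names are our own illustration, not the paper's architecture.

import torch
import torch.nn as nn

class StiffnessPriorHead(nn.Module):
    """Maps per-point visual features to a Gaussian prior over log-stiffness (illustrative only)."""
    def __init__(self, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # predict (mean, log-variance) per point
        )

    def forward(self, feats: torch.Tensor):
        out = self.net(feats)                # (N, 2)
        mean, log_var = out[..., 0], out[..., 1]
        return mean, log_var.exp()           # prior mean and variance of log-stiffness

# Zero-shot use: evaluate the meta-learned prior on a novel object's point features.
# feats = extract_point_features(rgbd)      # hypothetical upstream feature extractor
# prior_mean, prior_var = StiffnessPriorHead(feat_dim=256)(feats)

Under a prior of this form, branch points would receive a higher predicted mean stiffness than leaf points purely from appearance.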

Few-shot adaptation using touch

Once the robot begins touching the plant, the stiffness map is efficiently updated from the touch data. Branches occluded behind the leaves produce a larger torque response than expected, so the estimated stiffness in that region is increased to match the observations.
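
As a rough illustration of how a single touch updates the estimate, the sketch below applies a standard linear-Gaussian (conjugate) update, assuming the measured torque is approximately linear in the local stiffness; the function, variable names, and numbers are placeholders rather than the paper's model.

def stiffness_posterior(prior_mean, prior_var, phi, tau_obs, noise_var):
    """Conjugate Gaussian update of a scalar stiffness given one torque reading.

    Assumes a simplified linear observation model: tau = phi * k + noise.
    """
    precision = 1.0 / prior_var + phi ** 2 / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + phi * tau_obs / noise_var)
    return post_mean, post_var

# A larger-than-expected torque pulls the stiffness estimate upward:
# the prior predicts phi * 10.0 = 5.0 N*m, but 9.0 N*m is observed.
mean, var = stiffness_posterior(prior_mean=10.0, prior_var=4.0,
                                phi=0.5, tau_obs=9.0, noise_var=0.25)
print(mean, var)  # ~16.4, 0.8: stiffness increases to explain the extra torque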

Abstract

Estimating visual-tactile models of deformable objects is challenging because vision suffers from occlusion while touch data is sparse and noisy. We propose a novel data-efficient method for dense heterogeneous model estimation by leveraging experience from diverse training objects. The method is based on Bayesian meta-learning (BML), which can mitigate overfitting of high-capacity visual-tactile models by meta-learning an informed prior and naturally achieves few-shot online estimation via posterior estimation. However, BML requires a shared parametric model across tasks, whereas visual-tactile models for diverse objects have different parameter spaces. To address this issue, this paper introduces Structured Bayesian Meta-Learning (SBML), which incorporates heterogeneous physics models, enabling learning from training objects with varying appearances and geometries. SBML performs zero-shot vision-only prediction of deformable model parameters, as well as few-shot adaptation after a handful of touches. Experiments show that on two classes of heterogeneous objects, namely plants and shoes, SBML outperforms existing approaches in force and torque prediction accuracy in zero- and few-shot settings.

Method

Method overview
An overview of Structured Bayesian Meta-Learning (SBML). The method learns a high-capacity prior over visual-tactile models from diverse training objects and adapts it to novel objects with a few touches.
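
To make the meta-learning step concrete, the self-contained toy below learns the parameters of a scalar Gaussian stiffness prior so that, after conditioning on a few "support" touches of a simulated object, the resulting posterior predicts held-out "query" torques well. The linear torque model, the noise level, and all names are our own simplifications of Bayesian meta-learning, not the paper's structured formulation.

import torch

torch.manual_seed(0)

# Meta-learned prior parameters over a scalar stiffness (a toy stand-in for the
# high-capacity, vision-conditioned prior learned in the paper).
prior_mean = torch.zeros(1, requires_grad=True)
prior_log_var = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([prior_mean, prior_log_var], lr=1e-2)
noise_var = 0.05  # assumed torque sensor noise variance

def simulate_object(n_touches=6):
    """Sample a synthetic object with stiffness k and touches obeying tau = phi * k + noise."""
    k = torch.randn(1) * 0.5 + 2.0       # true stiffness of this object
    phi = torch.rand(n_touches)          # contact geometry factors
    tau = phi * k + noise_var ** 0.5 * torch.randn(n_touches)
    return phi, tau

for step in range(2000):
    phi, tau = simulate_object()
    phi_s, tau_s = phi[:3], tau[:3]      # support touches used for adaptation
    phi_q, tau_q = phi[3:], tau[3:]      # query touches used to score the prior
    # Conjugate Gaussian posterior over stiffness given the support touches.
    prior_var = prior_log_var.exp()
    post_var = 1.0 / (1.0 / prior_var + (phi_s ** 2).sum() / noise_var)
    post_mean = post_var * (prior_mean / prior_var + (phi_s * tau_s).sum() / noise_var)
    # Negative log posterior-predictive likelihood of the query torques.
    pred_mean = phi_q * post_mean
    pred_var = phi_q ** 2 * post_var + noise_var
    nll = 0.5 * ((tau_q - pred_mean) ** 2 / pred_var + pred_var.log()).sum()
    opt.zero_grad()
    nll.backward()
    opt.step()

print("learned prior:", prior_mean.item(), prior_log_var.exp().item())

In this toy the learned prior mean and variance should approach the statistics of the simulated object population (roughly 2.0 and 0.25), which mirrors the role the meta-learned prior plays for a novel object before any touch is made.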

Results

Here we present zero- and few-shot stiffness estimates for test-set shoes.

Unified prior over plants and shoes

Our method can learn a high-capacity prior across multiple object categories. Here we show the mean prediction of a unified prior trained on both the plant and shoe datasets.

Unified prior mean predictions are shown for five objects: Dracaena, orange tree, leather boot, sneaker, and running shoe.

BibTeX

@inproceedings{yao2024structured,
  title={Structured Bayesian Meta-Learning for Data-Efficient Visual-Tactile Model Estimation},
  author={Shaoxiong Yao and Yifan Zhu and Kris Hauser},
  booktitle={8th Annual Conference on Robot Learning},
  year={2024},
  url={https://openreview.net/forum?id=TzqKmIhcwq}
}