SimWeaver: Zero-Shot RGB Sim-to-Real for Deformable Manipulation

TL;DR

Train deformable manipulation policies entirely in simulation, with no teleoperation, and deploy them zero-shot on real hardware.

200

simulated demos per task

No teleoperation at any stage. Single-seed deterministic synthesis.

91.30%

average real success · n=115 pooled

Across 5 deformable tasks, with n=23 consecutive trials per task; Wilson 95% CI [84.7%, 95.2%].
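The pooled interval can be checked with the standard Wilson score formula. The sketch below is a minimal check, assuming 105 successes out of 115 trials (the count implied by 91.30% of n=115) and z = 1.96; the function name `wilson_interval` is illustrative and not part of SimWeaver.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumption: 91.30% of 115 pooled trials corresponds to 105 successes.
low, high = wilson_interval(105, 115)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions the bounds round to the reported 84.7% and 95.2%.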

100%

silk grasp under visual shift

Texture / lighting / rotation OOD. Real-data baseline drops to 13.0% / 69.6% / 8.7%.

$0.03

per usable trajectory

2,824 trajectories per day on a single 8×RTX 4090 server, two orders of magnitude cheaper than real-robot collection.
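As a back-of-envelope reading of these headline figures, the sketch below works out what the stated per-trajectory cost and daily throughput imply for one task and for all five. It assumes the $0.03 and 2824-per-day figures both refer to usable trajectories on the same server; the names are illustrative.

```python
# Figures quoted on this page.
COST_PER_TRAJECTORY_USD = 0.03
TRAJECTORIES_PER_DAY = 2824
DEMOS_PER_TASK = 200
NUM_TASKS = 5


def synthesis_budget(num_demos: int) -> tuple[float, float]:
    """Return (cost in USD, wall-clock hours) to synthesize num_demos trajectories."""
    cost = num_demos * COST_PER_TRAJECTORY_USD
    hours = 24.0 * num_demos / TRAJECTORIES_PER_DAY
    return cost, hours


per_task = synthesis_budget(DEMOS_PER_TASK)
all_tasks = synthesis_budget(DEMOS_PER_TASK * NUM_TASKS)
print(f"one task:  ${per_task[0]:.2f}, {per_task[1]:.1f} h")
print(f"all tasks: ${all_tasks[0]:.2f}, {all_tasks[1]:.1f} h")
```

Under these assumptions, the 200 demos for one task cost about $6 and take under two hours of server time.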

Why this has been hard

Deformable sim-to-real has been blocked by three structural barriers.

BARRIER 01

Simulator unreliability

Cloth–rigid contact on thin fabrics breaks in both Isaac Sim and Newton VBD: parameter-sensitive instability, penetration, and non-deterministic replay.

BARRIER 02

Trajectory synthesis fails on deformables

High-DOF deformation breaks trajectory transfer built for rigid objects; existing deformable methods still need human teleoperation to bootstrap and ignore both arm and cloth constraints.

BARRIER 03

RGB sim-to-real has stalled

Pixel-based policies still don’t transfer reliably. Depth and point-cloud inputs sidestep the visual gap but collapse on dark, low-texture, and reflective fabrics.

Sim-to-Real Evidence

What the policy saw in simulation — and what it did on the real robot.

Left: the policy’s actual training observation (front oblique camera). Right: real-world execution with no fine-tuning. Per-task success rates from n=23 consecutive trials.

TASK 01

Silk Grasp

Bimanual grasp of thin reflective silk. Head-to-head vs. real-data baseline.

100.00% n=23

[Figure panels: SIM · POLICY INPUT | REAL · ZERO-SHOT. Insets: L wrist, R wrist]

TASK 02

Silk Unfolding

Recover flat layout from a draped configuration of reflective fabric.

95.65% n=23

[Figure panels: SIM · POLICY INPUT | REAL · ZERO-SHOT. Insets: L wrist, R wrist]

TASK 03

Garment Folding

Bimanual T-shirt full-fold with corner alignment.

91.30% n=23

[Figure panels: SIM · POLICY INPUT | REAL · ZERO-SHOT. Insets: L wrist, R wrist]

TASK 04

Snack Packaging

Insertion into a thin plastic bag — an open category in prior sim-to-real work.

86.96% n=23

[Figure panels: SIM · POLICY INPUT | REAL · ZERO-SHOT. Insets: L wrist, R wrist]

TASK 05

Garment Unfolding

Recover flat T-shirt layout from a wrinkled start.

82.61% n=23

[Figure panels: SIM · POLICY INPUT | REAL · ZERO-SHOT. Insets: L wrist, R wrist]

Qualitative rollout · five tasks

SAME POLICY CHECKPOINT · SIM ROLLOUT KEYFRAMES VS. ZERO-SHOT REAL DEPLOYMENT

System

Four components, one zero-shot pipeline.

SimWeaver-Asset — simulation-ready deformable assets, extensible by generation

SimWeaver-Sim — stable, penetration-free rigid–soft contact

SimWeaver-Syn — topology-aware, teleoperation-free trajectory generation

SimWeaver-Real — the sim-to-real deployment protocol

Method

The infrastructure behind zero-shot deformable manipulation.

COMPONENT · ASSET

SimWeaver-Asset

An extensible deformable-asset framework — 6,000+ simulation-ready assets, growable by generation.

Large-Scale Asset Library

6,000+ simulation-ready meshes with grounded, interpretable physical parameters, spanning garments and deformable bags. Bags are a deformable category that prior datasets miss.

Garments

Bags

Generative Extension

A single image becomes a simulation-ready 3D mesh — not just geometry, but the physical parameters that make it simulate.

Bag: Image → 3D mesh → Physics (Stretch, Bend, Density, …) → Sim

T-shirt: Image → 3D mesh → Physics (Stretch, Bend, Density, …) → Sim
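One way to picture the output of this image-to-simulation pipeline is a record that carries both the mesh and the physical parameters a simulator needs. The sketch below is purely illustrative: the class `DeformableAsset`, its field names, the file path, and the placeholder values are assumptions, and the page does not state units or list the parameters beyond stretch, bend, and density.

```python
from dataclasses import dataclass, field


@dataclass
class DeformableAsset:
    """Hypothetical record for one simulation-ready asset."""
    name: str
    mesh_path: str
    stretch: float   # stretch stiffness (units not given on the page)
    bend: float      # bending stiffness (units not given on the page)
    density: float   # mass density (units not given on the page)
    # Any further parameters the page elides with "…" would go here.
    extra_params: dict[str, float] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject values that no cloth simulator could use."""
        if self.stretch < 0 or self.bend < 0:
            raise ValueError("stiffness values must be non-negative")
        if self.density <= 0:
            raise ValueError("density must be positive")


# Placeholder values for illustration only, not measured parameters.
bag = DeformableAsset("bag", "assets/bag.obj", stretch=1.0, bend=1.0, density=1.0)
bag.validate()
```

Keeping geometry and physics in one record reflects the page's point that a mesh alone is not enough to simulate.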

COMPONENT · SIM

SimWeaver-Sim

Robust collision handling, penetration prevention, and trajectory-replay determinism — fixing the contact instability and non-deterministic replay that break thin fabrics in standard simulators.

[Figure panels: Newton VBD · Cloth penetration | Isaac Sim · Grasp fails | SimWeaver-Sim · Stable grasp. Views: FULL, ARM R, ARM L]

RELIABILITY ON BIMANUAL GARMENT GRASPING

Simulator	Task ↑	Grasp ↑	Pen. ↓	Expl. ↓	Per-step (ms) ↓
Isaac Sim	0.0%	0.0%	0.0%	0.0%	7.80
Newton VBD	0.0%	100.0%	77.5%	22.5%	10.38
SimWeaver	100.0%	100.0%	0.0%	0.0%	4.44

Penetration (Pen.) / explosion (Expl.): physically invalid contact failures, lower is better.

Per-step time: lower = faster sim, more demos per hour.
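To make the per-step column concrete, the sketch below converts the table's per-step times into simulation steps per second and a ratio relative to SimWeaver. It uses only numbers from the table and assumes the timings were measured under comparable conditions, which the page does not state; the variable names are illustrative.

```python
# Per-step simulation times in milliseconds, copied from the table above.
per_step_ms = {"Isaac Sim": 7.80, "Newton VBD": 10.38, "SimWeaver": 4.44}

# Steps per second is the reciprocal of the per-step time.
steps_per_second = {name: 1000.0 / ms for name, ms in per_step_ms.items()}

# How many times longer each simulator takes per step than SimWeaver.
ratio_vs_simweaver = {
    name: ms / per_step_ms["SimWeaver"] for name, ms in per_step_ms.items()
}

for name in per_step_ms:
    print(
        f"{name}: {steps_per_second[name]:.1f} steps/s, "
        f"{ratio_vs_simweaver[name]:.2f}x SimWeaver's per-step time"
    )
```

On these figures SimWeaver runs about 225 steps per second, against about 128 for Isaac Sim and 96 for Newton VBD.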

COMPONENT · SYN

SimWeaver-Syn / TopoSynth

Topology-aware trajectory synthesis — deterministic demonstrations from a single seed.

✕ No learned models

Pure topology graph + closed-form predicates — no prior to fit.

✕ No teleoperation

Zero human demos at any stage of the pipeline.

✕ No post-hoc filter

Single-seed synthesis — no over-generate-and-discard.

T-SHIRT FOLD · n=100

97.2%

Pass rate

100/100

Replay success

SYNTHESIS QUALITY · T-SHIRT FOLD · n=100

Method	Pass rate ↑	Replay 100× ↑
SIM1 (learned + filter)	24.0%	13 / 100
SimWeaver-Syn	97.2%	100 / 100

Replay 100× = one successful trajectory re-executed from 100 fresh simulator resets.
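The replay metric can be sketched as a loop that resets the simulator, re-executes the recorded actions, and counts successes. The code below is a toy illustration under stated assumptions: `ToySim` is a stand-in 1-D simulator, and `replay_success_count` is a hypothetical helper, not SimWeaver's API.

```python
from typing import Callable, Sequence


class ToySim:
    """Stand-in simulator: a 1-D point moved by velocity commands."""

    def __init__(self) -> None:
        self.x = 0.0

    def reset(self) -> None:
        self.x = 0.0

    def step(self, action: float, dt: float = 0.1) -> None:
        self.x += action * dt


def replay_success_count(
    sim: ToySim,
    actions: Sequence[float],
    is_success: Callable[[ToySim], bool],
    num_resets: int = 100,
) -> int:
    """Re-execute one recorded trajectory from fresh resets and count successes."""
    successes = 0
    for _ in range(num_resets):
        sim.reset()
        for action in actions:
            sim.step(action)
        successes += int(is_success(sim))
    return successes


# Toy trajectory: ten unit-velocity steps that move the point to x ≈ 1.0.
actions = [1.0] * 10
count = replay_success_count(ToySim(), actions, lambda s: abs(s.x - 1.0) < 1e-6)
print(f"{count} / 100")
```

A deterministic simulator gives the same outcome on every reset, so the count is either 0 or `num_resets`; intermediate values such as 13 / 100 indicate non-deterministic replay.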

COMPONENT · REAL

SimWeaver-Real

A sim-to-real protocol that closes the deformable-specific gaps generic domain randomization leaves open.

AXIS 01 · INITIAL STATE

Physics-driven cloth init

drop-and-settle · pin-and-fold

AXIS 02 · IMAGE AUGMENTATION

Sensor-aware augmentation

BLUE
−12.6

RED
+19.5

FLICKER
+12.3

Removing this augmentation collapses real-world success to 0% on all five tasks.
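The page lists the BLUE, RED, and FLICKER values without units or definitions, so the sketch below is only one possible reading. It assumes they are 8-bit intensity offsets: a red-channel shift, a blue-channel shift, and a global brightness flicker amplitude. The function `sensor_aware_augment` and the sampling scheme are hypothetical, not the authors' implementation.

```python
import numpy as np

# Assumption: values are in 8-bit intensity units; the page gives no units.
BLUE_SHIFT = -12.6
RED_SHIFT = 19.5
FLICKER = 12.3


def sensor_aware_augment(image_rgb: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random fraction of the red/blue shifts plus a global brightness flicker."""
    img = image_rgb.astype(np.float32)
    strength = rng.uniform(0.0, 1.0)        # fraction of the colour shift to apply
    img[..., 0] += strength * RED_SHIFT     # channel 0 = red
    img[..., 2] += strength * BLUE_SHIFT    # channel 2 = blue
    img += rng.uniform(-FLICKER, FLICKER)   # same brightness offset on all channels
    return np.clip(img, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
frame = np.full((4, 4, 3), 128, dtype=np.uint8)  # flat grey test frame
augmented = sensor_aware_augment(frame, rng)
```

Sampling a fresh `strength` and flicker offset per frame exposes the policy to a range of colour casts rather than one fixed correction.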

AXIS 03 · SCENE RANDOMIZATION

Lighting · background · robot pose
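A per-episode sampler for these three randomization axes might look like the sketch below. The background names come from the render gallery on this page; every numeric range, the `SceneSample` fields, and the `sample_scene` function are hypothetical placeholders rather than the ranges SimWeaver uses.

```python
import random
from dataclasses import dataclass

# Background names taken from the render gallery on this page.
BACKGROUNDS = ("balcony", "kitchen", "bedroom")


@dataclass(frozen=True)
class SceneSample:
    light_intensity: float     # relative scale (hypothetical range)
    light_azimuth_deg: float   # light direction (hypothetical range)
    background: str
    robot_base_dx_m: float     # planar base offset (hypothetical range)
    robot_base_dy_m: float
    robot_base_yaw_deg: float


def sample_scene(rng: random.Random) -> SceneSample:
    """Draw one randomized scene configuration."""
    return SceneSample(
        light_intensity=rng.uniform(0.5, 1.5),
        light_azimuth_deg=rng.uniform(0.0, 360.0),
        background=rng.choice(BACKGROUNDS),
        robot_base_dx_m=rng.uniform(-0.02, 0.02),
        robot_base_dy_m=rng.uniform(-0.02, 0.02),
        robot_base_yaw_deg=rng.uniform(-3.0, 3.0),
    )


rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(3)]
```

Passing an explicit seeded `random.Random` keeps the sampled scenes reproducible, in keeping with the pipeline's emphasis on deterministic synthesis.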

Hardware: bimanual Piper 6-DOF arms with parallel-jaw grippers · 1 overhead + 2 wrist-mounted RealSense D435i cameras.

Generalization · Silk Grasping

Sim-trained policies don’t just match real-data training — they surpass it under visual distribution shift.

Under texture, lighting, and rotation OOD shifts, the real-robot teleop baseline (100 demos) drops to 13.0% / 69.6% / 8.7%, while SimWeaver (200 sim demos + domain randomization, DR) holds at 100% on all three.

(A) SAMPLE EFFICIENCY

SimWeaver matches real-data efficiency.

Sim + DR scales as efficiently as teleop, and pushes further at the top end.

(B) DISTRIBUTION SHIFT

Real-data collapses under visual shifts.
SimWeaver doesn’t.

Shift	REAL	SIMWEAVER
TEXTURE	13.0%	100%
LIGHTING	69.6%	100%
ROTATION	8.7%	100%

SIDE-BY-SIDE · SAME TASK, OOD VISUAL SHIFT · 10× SPEED

SimWeaver · OOD · consistent success

Real-data baseline · OOD · fails

Both clips: real-robot silk grasping at 10× speed. Across texture, lighting, and rotation OOD shifts, SimWeaver (200 sim demos + DR) holds 100% success on all three; the real-data baseline (100 real-robot teleop demos) collapses to 13.0% / 69.6% / 8.7% success. The full per-axis breakdown is in the chart above. Shown here: the texture OOD condition.

Why RGB

Point-cloud baselines collapse on dark, absorbing surfaces.

Black grippers and far-field dark surfaces absorb the projected pattern. Both D435i (active stereo) and Photoneo (industrial structured-light) drop large regions of the scan. Geometry-only policies have nothing to act on.

0 / 5

DP3 fails all real tasks

Point-cloud baseline (DP3) trained on the same 200 sim demos · none of the five tasks deployed successfully on real hardware.

[Figure: RealSense D435i point cloud failure on black gripper · PCD with RGB inset · D435i · consumer active-stereo]

[Figure: Photoneo industrial point cloud failure on dark surfaces · PCD with RGB inset · Photoneo · industrial structured-light]

Top-left inset shows the RGB view of the same scene; the main image shows what the depth sensor actually captured. Black gripper bodies and far dark objects vanish in both consumer and industrial scanners: silk reflects diffusely, but the gripper and table edges drop out. Geometry-only policies are left acting on these holes.
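The size of these dropouts can be quantified as the fraction of pixels with no valid depth. The sketch below is a minimal illustration, assuming missing returns are encoded as 0 or as non-finite values (a common convention for depth cameras); the function `hole_fraction` and the synthetic depth map are hypothetical.

```python
from typing import Optional

import numpy as np


def hole_fraction(depth: np.ndarray, mask: Optional[np.ndarray] = None) -> float:
    """Fraction of pixels without a valid depth reading, optionally inside a mask."""
    invalid = ~np.isfinite(depth) | (depth <= 0)
    if mask is not None:
        invalid = invalid[mask]
    return float(invalid.mean())


# Synthetic 4x4 depth map (metres) with a 2x2 dropout where a dark gripper would be.
depth = np.ones((4, 4), dtype=np.float32)
depth[:2, :2] = 0.0

# Restrict the measurement to the top half of the image.
top_half = np.zeros((4, 4), dtype=bool)
top_half[:2, :] = True

print(hole_fraction(depth))            # whole image
print(hole_fraction(depth, top_half))  # region of interest only
```

Measuring the fraction inside a region of interest, such as a gripper mask, shows how much of the geometry a point-cloud policy would be missing exactly where it needs it.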

Render gallery

Render Gallery — diverse backgrounds · diverse objects.

Additional renderings of scenes and asset variants.

Silk Unfolding · Balcony

Garment Unfolding · Balcony

Snack into Bag · Kitchen

Garment Folding · Bedroom

BAG ASSET VARIANTS · PLASTIC / CANVAS / KNIT

Plastic

Canvas

Knit

GARMENT COLOR VARIANTS · PINK / BLUE / YELLOW

Pink

Blue

Yellow

Resources

Paper · Code · Video

GitHub · MIT-licensed

BibTeX

@misc{simweaver2026,
  title  = {SimWeaver: Zero-Shot RGB Sim-to-Real for Deformable Manipulation},
  author = {Wenkang Hu and Haoran Wang and Yitong Li and Liu Liu and Mengao Zhao and Lai Jiang and Xincheng Tang and Junhang Wei and Zhengjie Shu and Zhendong Wang and Zhizhong Su and Huamin Wang and Ruigang Yang},
  year   = {2026},
  note   = {Preprint}
}
