TACTful
Multi-Channel Terrain Affordance and Compliance Training
for Payload-Robust Perceptive Humanoid Locomotion
Under Review
Anonymous Authors  ·  Anonymous Institution
Act 1 — The Problem
Blind policy · stair ascent failure
Source: Online video, TikTok (China), retrieved 2025
Blind Locomotion
Is Not Enough

On flat ground — works well.

On structured terrain:

  • Sole straddles step edge → line contact
  • Friction cone admissibility collapses
  • GRF needed to recover scales with total weight
Without foothold quality awareness,
the robot has no strategy for stairs.
Act 1 — The Problem
Perceptive Locomotion — Still Not Enough
✓ Perceptive — terrain, no payload
Skild AI Team, “One Policy, All Scenarios,” skild.ai, Aug. 2025
✗ + Payload — policy fails
Boston Dynamics, online (retrieved 2025)
CoM shifts · trunk tilts · capture point diverges · the policy has no answer
Act 1 — The Problem
Object carrying downstairs — prior work
Skild AI Team, “One Policy, All Scenarios,” skild.ai, Aug. 2025
Prior Work:
Some Payload Capability

Some systems demonstrate object carrying on stairs.

But key questions remain unanswered:

?  Payload weight not disclosed
?  Compliance mechanism unspecified — possibly domain randomization or a rigid prop
?  No generalization analysis across weights or moment arms
Does this scale to varied real loads?
Our Approach
Terrain cost + payload compliance,
trained jointly.
Multi-channel terrain awareness & compliance — learned end-to-end with standard PPO.
1.0 m/s
on stairs
0.20 m risers
~15 kg
zero-shot
payload
50 Hz
policy
frequency
0-shot
sim-to-real
transfer
No distillation  ·  No teacher-student staging  ·  No force sensor
Act 2 — Method
System Overview

Overview of the proposed framework
DCM Foothold Planner — terrain-optimal landing targets + Bézier swing Wrench Compliance Scheduler — virtual force + moment at load point
Act 2 — Method
Multi-Channel Terrain Cost

Planner minimises: $\mathcal{J}_i = \alpha_\text{pos}\,d_{\text{pos},i} + \alpha_\text{dcm}\,d_{\text{dcm},i} + \alpha_E E_i + \alpha_Q Q_i + \alpha_M M_i - \alpha_\text{climb}\,b_i$

Terrain affordance channels on stair map
Terrain affordance map — footholds coloured by cost
$Q_i$ — Flatness
Elevation range over footprint kernel. Flags edge landings where friction-cone admissibility collapses.
$E_i$ — Steepness
Max-pooled Sobel gradient. Propagates the worst riser gradient within the footprint.
$M_i$ — Height Feasibility
Quadratic penalty above velocity-aware $h^*_\text{eff}(v_x)$. Low speed = reduced reach.
$b_i$ — Climb Bonus
Speed-gated reward for stepping up onto reachable treads, capped at $h^*_\text{eff}$.
Act 2 — Method
Bézier Swing & Tangent-Guided Foot Orientation

Step-up
(a) Step-up — apex over riser face
Step-down
(b) Step-down — extended horizontal travel
Gap
(c) Gap crossing
Flat
(d) Flat terrain — symmetric arc

Apex $xy$ biased toward the higher endpoint:

$\text{bias} = \mathrm{clip}(0.5 + \kappa\,\Delta z / h^*_\text{max},\,b_\text{min},\,b_\text{max})$

Pre-apex orientation
$\dot{\mathbf{p}}$ rotated 90° in sagittal plane → sole clears riser, reduces toe-strike risk.
Post-apex orientation
$\hat{\mathbf{t}}_f = \dot{\mathbf{p}}/\|\dot{\mathbf{p}}\|$ → flat-sole tread landing.
Act 2 — Method
Lower-Body Compliance Training

Robot with point cloud
Lower-Body Compliance

Payload moment $\boldsymbol{\tau} = \mathbf{r}_\text{load}\times\mathbf{F}$ tilts the trunk — CoM offset compounds per step under LIPM.

Virtual spring-damper at load point:

$\mathbf{F}_\text{ext}(t) = k(\mathbf{p}_a - \mathbf{p}_\text{load}) - c\,\dot{\mathbf{p}}_\text{load}$

Force + moment — no payload mass in sim
54 % body-attached  ·  36 % arm-extended  ·  10 % isotropic
Policy yields trunk & pelvis
No force sensor  ·  No retraining at deployment
Act 2 — Method
Training Setup

8192
parallel envs
MuJoCo
PPO
standard
no distillation
30 Hz
depth CNN
4-frame stack
50 Hz
policy freq.
on hardware
20 k
training
iterations
0-shot
sim-to-real
no fine-tuning
No teacher-student staging  ·  Asymmetric actor-critic  ·  Domain randomization for sim-to-real gap
Act 3 — Simulation
SIMULATION RESULTS — Payload Generalization
Baseline · no load
Front · 5 kg
Rear · 5 kg
Act 3 — Simulation
SIMULATION RESULTS — Challenging Scenarios
Front + Rear · 10 kg
Basket · 5 kg
Car Door · ~5 kg
Act 4 — Real Robot
Real Robot  ·  Zero-shot from simulation
Stair traversal  ·  risers up to 0.20 m  ·  1.0 m/s
Act 4 — Real Robot
Real Robot  ·  Moment-dominated load
Wrist-mounted tray  ·  7 kg  (hardware demo)
Sim wrist +10 kg: 65% SR  vs.  50% baseline
Act 4 — Real Robot
20 kg
Full expedition backpack
Real Robot  ·  Zero-shot
Rear-mounted backpack, 20 kg  ·  descending stairs
Act 4 — Real Robot
Real Robot  ·  Zero-shot
20 kg backpack  ·  ascending stairs
Act 5 — Results
Terrain Traversal: Success Rate

Standard terrain ablation
Standard terrain (0.05–0.20 m risers)
Hard terrain ablation
Hard terrain, OOD (0.20–0.30 m risers)
70%
standard SR
(ours)
+13 pp
vs. TACT-only
+17 pp
vs. adaptive
gait only
On hard terrain, adaptive gait alone drops to 18% — below the blind baseline (19%). Re-timing without foothold guidance is actively harmful at kinematic limits.
Act 5 — Results
Terrain Traversal: Foothold Quality

SR vs. speed
SR vs. commanded speed — gap is speed-invariant
Foot-target distance
Foot-target distance during training
2.8×
foot-target
without TACT
The success rate gap is speed-invariant from 0.5 to 1.0 m/s. Without terrain guidance, foot-target distance during training is 2.8× larger. The terrain cost signal is the operative variable.
Act 5 — Results
Payload Generalization — Zero-Shot

SR and power vs payload
SR (%) and mean power (W) vs. payload condition on stairs
Pelvis +15 kg — moderate centered load
50% vs. 38% SR  ·  247 vs. 277 W
Compliance absorbs the downward wrench.
Pelvis +20 kg — near distribution boundary
~37% vs. ~35% SR (parity)  ·  293 vs. 320 W
−27 W retained: partial compliance preserved.
Wrist +10 kg ✦ largest margin
65% vs. 50% SR  ·  223 vs. 259 W
Arm-extended wrench samples target moment-dominated loads directly.
Act 5 — Results
Limitations

  • Quantitative ablations in simulation only — hardware results are qualitative; absolute real-world SR uncharacterised
  • Hard-terrain SR only 30% on risers 0.22–0.28 m — insufficient for reliable deployment on tall staircases
  • Compliance ceiling ~20 kg centered; lateral, distributed & swinging loads fall outside the sampled wrench space
  • DCM planner requires accurate elevation-map registration — no fallback under localization drift; forward camera only
  • Tangent-guided orientation fails on slopes >~15° — arc tangent ≠ surface normal → toe/heel contact, narrowed friction cone
Next: quantitative real-world evaluation  ·  terrain × payload coupling study
TACTful
Standard PPO  ·  Zero-shot sim-to-real  ·  Service humanoid
Multi-Channel Terrain Cost Compliance Without Force Sensors Zero-Shot Sim-to-Real
70%
standard SR
65%
wrist SR vs. 50%
20 kg
real demo
0-shot
sim-to-real
fai-rl-tech.github.io/tact-locomotion.github.io