5.1 Locomotion Control

Learning Objectives

  • Understand bipedal walking dynamics
  • Implement Zero Moment Point (ZMP) control
  • Use Model Predictive Control (MPC) for locomotion
  • Train RL-based walking policies
  • Deploy locomotion controllers on real humanoids

Introduction

Locomotion is the ability to move through an environment. For humanoid robots, bipedal walking is one of the most challenging control problems due to:

  • Underactuation: Fewer actuators than degrees of freedom
  • Contact dynamics: Complex foot-ground interactions
  • Balance: Maintaining stability while moving

Zero Moment Point (ZMP)

ZMP is the point on the ground at which the net moment of the contact forces has no horizontal component. For stable walking:

  • The ZMP must stay inside the support polygon (the foot contact area)

Under the cart-table (linear inverted pendulum) model, the ZMP follows directly from the center of mass (CoM) state:

def compute_zmp(com_pos, com_acc, gravity=9.81):
    """
    Compute the ZMP from the CoM position and acceleration,
    assuming a constant CoM height (cart-table model).
    """
    zmp_x = com_pos[0] - (com_pos[2] / gravity) * com_acc[0]
    zmp_y = com_pos[1] - (com_pos[2] / gravity) * com_acc[1]
    return [zmp_x, zmp_y]

# Check stability: the robot is stable if the ZMP lies inside
# the support polygon
def is_stable(zmp, support_polygon):
    return point_in_polygon(zmp, support_polygon)
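
point_in_polygon is used above but not defined in this chapter; a minimal sketch of such a helper (an assumption, not the chapter's own implementation) is the classic ray-casting test:

def point_in_polygon(point, polygon):
    # Cast a horizontal ray from the point to the right and count how
    # many polygon edges it crosses; an odd count means "inside"
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside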

Model Predictive Control (MPC)

MPC plans future trajectories by repeatedly solving a finite-horizon optimization problem:

import casadi as ca

# Define the optimization problem
opti = ca.Opti()

# Decision variables over the horizon
N = 20                          # Horizon steps
dt = 0.05                       # Step duration in seconds
com_pos = opti.variable(N, 3)   # CoM position at each step
com_acc = opti.variable(N, 3)   # CoM acceleration at each step
foot_pos = opti.variable(N, 3)  # Foot position at each step

# Dynamics: tie accelerations to positions via finite differences
for k in range(1, N - 1):
    opti.subject_to(
        com_acc[k, :]
        == (com_pos[k + 1, :] - 2 * com_pos[k, :] + com_pos[k - 1, :]) / dt**2
    )

# Objective: minimize CoM tracking error against a reference
com_ref = ca.DM([0, 0, 0.9]).T  # Desired CoM position (0.9 m height)
cost = ca.sumsqr(com_pos - ca.repmat(com_ref, N, 1))

# Constraints: ZMP stability at every step of the horizon
for k in range(N):
    zmp = compute_zmp(com_pos[k, :], com_acc[k, :])
    opti.subject_to(zmp_in_support(zmp, foot_pos[k, :]))

# Solve
opti.minimize(cost)
opti.solver('ipopt')
sol = opti.solve()
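
zmp_in_support is assumed above but never defined; one simple version (an illustrative assumption that approximates the foot as a rectangle) returns box constraints, which Opti accepts as a list:

def zmp_in_support(zmp, foot, half_length=0.12, half_width=0.05):
    # Keep the ZMP inside a rectangle centered on the foot position;
    # half_length and half_width are assumed foot dimensions in meters
    return [
        zmp[0] >= foot[0] - half_length,
        zmp[0] <= foot[0] + half_length,
        zmp[1] >= foot[1] - half_width,
        zmp[1] <= foot[1] + half_width,
    ]

Note that zmp_in_support must be defined before the loop that calls it.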

RL-Based Locomotion

Train walking policies with Isaac Gym:

import torch

class HumanoidWalkEnv:
    def compute_reward(self):
        # Forward velocity reward: x component of the base linear
        # velocity, one value per parallel environment
        vel_reward = self.base_lin_vel[:, 0]

        # Upright orientation reward: z component of the base quaternion,
        # a rough proxy for keeping the torso vertical
        up_reward = self.base_quat[:, 2]

        # Energy penalty on joint velocities, summed per environment
        energy_penalty = -0.01 * torch.sum(self.dof_vel ** 2, dim=1)

        return vel_reward + up_reward + energy_penalty

Training setup (a training-loop sketch follows the list):

  • 2048 parallel environments
  • PPO algorithm
  • 10M timesteps (~2 hours on RTX 4090)
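
These settings translate into a short training script; here is a minimal sketch using Stable-Baselines3's PPO as a stand-in (Isaac Gym pipelines more commonly use rl_games or rsl_rl, and make_humanoid_vec_env is a hypothetical factory for the vectorized environment):

from stable_baselines3 import PPO

# Hypothetical helper: wraps HumanoidWalkEnv as a Gym-compatible
# vectorized environment with 2048 parallel instances
env = make_humanoid_vec_env(num_envs=2048)

model = PPO(
    "MlpPolicy",
    env,
    n_steps=32,          # Rollout length per environment per update
    batch_size=4096,     # Minibatch size for PPO epochs
    learning_rate=3e-4,
    verbose=1,
)
model.learn(total_timesteps=10_000_000)  # The ~10M steps noted above
model.save("humanoid_walk")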

Deployment Example

import torch
import rclpy
from rclpy.node import Node
from trajectory_msgs.msg import JointTrajectory, JointTrajectoryPoint

# Load the trained policy (assumes the full module was saved;
# use torch.jit.load for a TorchScript export)
policy = torch.load("humanoid_walk.pth")
policy.eval()

# ROS 2 node for the real robot
class LocomotionController(Node):
    def __init__(self):
        super().__init__('locomotion_controller')
        self.joint_pub = self.create_publisher(
            JointTrajectory, '/joint_commands', 10
        )

    def control_loop(self, obs):
        # Run policy inference without gradient tracking
        with torch.no_grad():
            actions = policy(obs)

        # Send the actions to the robot as a joint trajectory
        msg = JointTrajectory()
        msg.points = [JointTrajectoryPoint(positions=actions.tolist())]
        self.joint_pub.publish(msg)
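
control_loop still needs to be driven at a fixed rate; a sketch of an entry point, where get_observations() is a hypothetical stand-in for the robot's state-estimation pipeline (IMU, joint encoders, etc.):

def main():
    rclpy.init()
    node = LocomotionController()

    # Call the policy at 50 Hz with the latest observations
    node.create_timer(0.02, lambda: node.control_loop(get_observations()))

    rclpy.spin(node)

if __name__ == '__main__':
    main()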

Key Takeaways

  • ZMP ensures stability by keeping the balance point inside the support polygon
  • MPC plans optimal trajectories subject to stability constraints
  • RL learns robust walking policies from simulation
  • Sim-to-real transfer requires domain randomization


Previous Chapter: ← Chapter 4: NVIDIA Isaac
Next Section: 5.2 Manipulation and Grasping →