Block Stacking Challenge

Difficulty: Hard
Time Limit: 90 seconds
Status: Active

Overview

The Block Stacking scenario is an advanced manipulation challenge where a robot arm must pick up colored blocks and stack them in a specific order on a target platform. This requires precise grasping, lifting, and placement.

Objective

Stack the colored blocks on the platform in the correct order: Red (bottom), Blue (middle), Green (top).

Environment

Scene Layout

┌─────────────────────────────────────────┐
│                                         │
│                        [PLATFORM]       │
│                        ┌───────┐        │
│                        │  🟡   │ Target │
│                        └───────┘        │
│                                         │
│     🟥  🟦  🟩   ← Colored blocks       │
│                                         │
│   🤖═══════╗                            │
│            ║  ← Robot arm with gripper  │
│            ╠═╗                          │
└────────────╬═╬──────────────────────────┘
             ╚═╝

Objects

Red Block: Must be placed first (bottom of stack)
Blue Block: Must be placed second (middle)
Green Block: Must be placed third (top)
Platform: Gray target pedestal where blocks should be stacked

Robot

Type: 4-DOF articulated arm with 2-finger parallel gripper
Joints: Base rotation, shoulder, elbow, wrist rotation
Gripper: Two-finger gripper with position control
Strategy: Grasp, lift, and place blocks precisely

Observations

Your policy receives:

observation = {
    # Arm joint positions (4 values)
    'joint_positions': np.array([base, shoulder, elbow, wrist]),

    # Gripper opening (distance between fingers)
    'gripper_opening': float,

    # Gripper position (3 values)
    'gripper_pos': np.array([x, y, z]),

    # Platform position (3 values)
    'platform_pos': np.array([x, y, z]),

    # Block positions (dictionary)
    'block_positions': {
        'red_block_pos': np.array([x, y, z]),
        'blue_block_pos': np.array([x, y, z]),
        'green_block_pos': np.array([x, y, z]),
    },

    # Target stacking order
    'correct_order': ['red', 'blue', 'green'],
}
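As a small illustration of reading this dictionary, the sketch below finds the block closest to the gripper. The helper name `nearest_block` is hypothetical and not part of any platform API; it assumes only the keys shown above.

```python
import numpy as np

def nearest_block(observation):
    """Return (color, distance) for the block closest to the gripper."""
    gripper = np.asarray(observation['gripper_pos'], dtype=float)
    best_color, best_dist = None, float('inf')
    for key, pos in observation['block_positions'].items():
        dist = float(np.linalg.norm(np.asarray(pos, dtype=float) - gripper))
        if dist < best_dist:
            # Keys look like 'red_block_pos'; the first token is the color.
            best_color, best_dist = key.split('_')[0], dist
    return best_color, best_dist
```

Note that the closest block is not necessarily the next one to stack; the stacking order still comes from `correct_order`.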

Actions

Return a 6D action vector:

action = np.array([
    base_torque,      # Base joint torque (-5 to 5)
    shoulder_torque,  # Shoulder joint torque (-5 to 5)
    elbow_torque,     # Elbow joint torque (-5 to 5)
    wrist_torque,     # Wrist joint torque (-3 to 3)
    left_finger,      # Left finger force (-2 to 2)
    right_finger,     # Right finger force (-2 to 2)
])

Gripper Control:
- To open gripper: left_finger > 0, right_finger < 0
- To close gripper: left_finger < 0, right_finger > 0
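One way to keep actions inside the documented ranges is to build them through a single helper. The sketch below is an assumption-laden illustration: `make_action` and `ACTION_LIMITS` are hypothetical names, and whether the environment clips out-of-range values itself is not stated on this page.

```python
import numpy as np

# Per-component limits copied from the action vector described above.
ACTION_LIMITS = np.array([5.0, 5.0, 5.0, 3.0, 2.0, 2.0])

def make_action(joint_torques, grip='hold', force=2.0):
    """Build a 6D action clipped to the documented ranges.

    grip is 'open', 'close', or 'hold' (zero finger force).
    """
    action = np.zeros(6)
    action[:4] = joint_torques
    if grip == 'open':
        action[4], action[5] = force, -force   # left > 0, right < 0
    elif grip == 'close':
        action[4], action[5] = -force, force   # left < 0, right > 0
    return np.clip(action, -ACTION_LIMITS, ACTION_LIMITS)
```

For example, `make_action([10, 0, -10, 4], grip='close')` clips the base, elbow, and wrist torques to their limits and sets the finger forces to close the gripper.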

Scoring

Verdict Criteria

Verdict	Criteria
PASS	At least one block correctly positioned in the stack
FAIL	No blocks on platform, wrong order, or blocks knocked off table

Score Calculation

Correct position in stack: +20 points per block
On platform but wrong order: +5 points per block
Block knocked off table: -10 points per block
Distance penalty: -(distance to platform) for floating blocks

Maximum possible score: 60 points (all 3 blocks correctly stacked)
Partial success counts: Red alone on platform earns +20
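The arithmetic above can be sketched as a small function for estimating a score offline. This is not the platform's scorer: `estimate_score` is a hypothetical helper, and how the distance term is measured, and exactly which blocks it applies to, are assumptions here.

```python
def estimate_score(correct, wrong_order, knocked_off, distances=()):
    """Rough score estimate from the rules listed above.

    correct      -- blocks in the correct stack position (+20 each)
    wrong_order  -- blocks on the platform but out of order (+5 each)
    knocked_off  -- blocks knocked off the table (-10 each)
    distances    -- distance to the platform for each block the
                    distance term applies to (assumed, subtracted)
    """
    score = 20 * correct + 5 * wrong_order - 10 * knocked_off
    score -= sum(distances)
    return score
```

With all three blocks correctly stacked, `estimate_score(3, 0, 0)` gives the stated maximum of 60, and red alone in place gives 20.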

Tips

Strategy

Plan the Order: Always pick up red first, then blue, then green
Approach from Above: Move above the block before lowering to grasp
Secure Grip: Close gripper fully before lifting
Lift High: Clear other objects before moving horizontally
Precise Placement: Center the block over the platform/stack before lowering
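The approach-from-above and lift-high tips amount to a fixed sequence of Cartesian waypoints per block. A minimal sketch follows; `pick_and_place_waypoints` is a hypothetical helper, and the default heights (0.55 clearance, 0.45 base placement height, 0.06 per stacked block, 0.02 grasp offset) are borrowed from the example policy later on this page and should be treated as values to tune.

```python
def pick_and_place_waypoints(block_pos, platform_pos, stack_index,
                             clearance_z=0.55, place_base_z=0.45,
                             block_height=0.06, grasp_offset=0.02):
    """Return five (x, y, z) gripper targets for moving one block."""
    bx, by, bz = block_pos
    px, py, _ = platform_pos
    place_z = place_base_z + stack_index * block_height
    return [
        (bx, by, clearance_z),        # 1. hover above the block
        (bx, by, bz + grasp_offset),  # 2. lower to grasp height
        (bx, by, clearance_z),        # 3. lift straight up
        (px, py, clearance_z),        # 4. travel at clearance height
        (px, py, place_z),            # 5. lower onto the stack
    ]
```

Keeping horizontal travel at a single clearance height makes it easier to reason about collisions with the other blocks and the growing stack.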

State Machine Approach

A typical solution uses a state machine:

MOVE_TO_BLOCK → LOWER → GRASP → LIFT → MOVE_TO_PLATFORM → LOWER → RELEASE → (repeat)
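The cycle above can be encoded with an `Enum` so that transitions are explicit and the two LOWER steps get distinct names. This is only a sketch; `Phase` and `next_phase` are hypothetical names, and deciding when a phase is finished is left to the caller.

```python
from enum import Enum, auto

class Phase(Enum):
    MOVE_TO_BLOCK = auto()
    LOWER_TO_BLOCK = auto()
    GRASP = auto()
    LIFT = auto()
    MOVE_TO_PLATFORM = auto()
    LOWER_TO_STACK = auto()
    RELEASE = auto()

# Enum members iterate in definition order, which is the cycle order.
ORDER = list(Phase)

def next_phase(phase, block_index):
    """Advance one step; after RELEASE, restart the cycle on the next block."""
    if phase is Phase.RELEASE:
        return Phase.MOVE_TO_BLOCK, block_index + 1
    return ORDER[ORDER.index(phase) + 1], block_index
```

A policy would call `next_phase` only once the current phase's completion condition holds, for example when the gripper has reached its target or a grasp timer has expired.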

Common Mistakes

Collision During Approach: Not lifting high enough
Dropping Blocks: Opening gripper too early
Stack Toppling: Placing blocks off-center
Wrong Order: Not following red → blue → green sequence
Rushing: Moving too fast causes instability
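One possible guard against rushing is to limit how much each action component may change between consecutive steps. The sketch below assumes that smoother torque commands reduce instability in this simulator, which this page does not confirm; `rate_limit` and `max_delta` are hypothetical.

```python
import numpy as np

def rate_limit(action, previous_action, max_delta=0.5):
    """Limit the per-step change of every action component to max_delta."""
    action = np.asarray(action, dtype=float)
    previous_action = np.asarray(previous_action, dtype=float)
    delta = np.clip(action - previous_action, -max_delta, max_delta)
    return previous_action + delta
```

A policy would keep the last action it returned and pass each new command through `rate_limit` before returning it.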

Example Approach

import numpy as np

# State machine states
STATE_APPROACH = 0
STATE_LOWER = 1
STATE_GRASP = 2
STATE_LIFT = 3
STATE_MOVE_PLATFORM = 4
STATE_PLACE = 5
STATE_RELEASE = 6

state = STATE_APPROACH
current_block = 0  # 0=red, 1=blue, 2=green
timer = 0

def policy(observation: dict) -> np.ndarray:
    global state, current_block, timer

    gripper_pos = observation['gripper_pos']
    platform_pos = observation['platform_pos']
    block_positions = observation['block_positions']
    colors = ['red', 'blue', 'green']

    if current_block >= 3:
        return np.zeros(6)  # Done!

    target_block = block_positions[f'{colors[current_block]}_block_pos']
    stack_height = current_block * 0.06

    action = np.zeros(6)

    if state == STATE_APPROACH:
        # Move above the target block
        target = [target_block[0], target_block[1], 0.55]
        action = move_toward(gripper_pos, target)
        action[4], action[5] = 1, -1  # Open gripper

        if reached(gripper_pos, target):
            state = STATE_LOWER

    elif state == STATE_LOWER:
        # Lower to grasp height
        target = [target_block[0], target_block[1], target_block[2] + 0.02]
        action = move_toward(gripper_pos, target)
        action[4], action[5] = 1, -1  # Keep open

        if reached(gripper_pos, target):
            state = STATE_GRASP
            timer = 0

    elif state == STATE_GRASP:
        # Close gripper
        action[4], action[5] = -2, 2  # Close
        timer += 1
        if timer > 20:
            state = STATE_LIFT

    elif state == STATE_LIFT:
        # Lift up
        target = [gripper_pos[0], gripper_pos[1], 0.6]
        action = move_toward(gripper_pos, target)
        action[4], action[5] = -2, 2  # Keep closed

        if gripper_pos[2] > 0.55:
            state = STATE_MOVE_PLATFORM

    elif state == STATE_MOVE_PLATFORM:
        # Move above platform
        target = [platform_pos[0], platform_pos[1], 0.5 + stack_height]
        action = move_toward(gripper_pos, target)
        action[4], action[5] = -2, 2

        if reached(gripper_pos, target):
            state = STATE_PLACE

    elif state == STATE_PLACE:
        # Lower onto stack
        target = [platform_pos[0], platform_pos[1], 0.45 + stack_height]
        action = move_toward(gripper_pos, target)
        action[4], action[5] = -2, 2

        if reached(gripper_pos, target):
            state = STATE_RELEASE
            timer = 0

    elif state == STATE_RELEASE:
        # Open gripper
        action[4], action[5] = 1, -1
        timer += 1
        if timer > 15:
            current_block += 1
            state = STATE_APPROACH

    return action

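def move_toward(current, target):
    """Crude proportional controller.

    Maps the Cartesian error straight onto joint torques (no inverse
    kinematics), so treat the axis-to-joint mapping and the gains as a
    starting point to tune, not an exact model of the arm.
    """
    diff = np.array(target) - np.array(current)
    dist = np.linalg.norm(diff)
    gain = min(dist * 5, 2.0)  # Scale with distance, capped at 2.0
    direction = diff / (dist + 1e-6)  # Unit vector; epsilon avoids division by zero
    action = np.zeros(6)
    action[0] = direction[1] * gain  # Base for Y
    action[1] = -direction[0] * gain * 0.5  # Shoulder for X
    action[2] = direction[2] * gain  # Elbow for Z
    return action

def reached(current, target, tol=0.03):
    """Return True when current is within tol of target (Euclidean distance)."""
    return np.linalg.norm(np.array(current) - np.array(target)) < tol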

Local Testing

Test your policy locally before submitting:

from botmanifold import BotManifoldClient

client = BotManifoldClient()

# Load scenario locally
env = client.load_scenario("block_stacking_v1")

# Run your policy
observation = env.reset()
done = False

while not done:
    action = policy(observation)
    observation, reward, done, info = env.step(action)

print(f"Stacked: {info['correct_order_count']}/{info['total_blocks']} correct")
print(f"Complete: {info['complete']}")
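When iterating on a policy, it can help to wrap the loop above in a function with a local step cap, so a policy that never finishes cannot hang the script. The sketch assumes only the `reset()` and `step()` interface shown above; `run_episode` and `max_steps` are hypothetical, and the cap is a local safety net, not the scenario's time limit.

```python
def run_episode(env, policy, max_steps=5000):
    """Run policy until the env reports done or max_steps is reached.

    Returns (steps_taken, total_reward, last_info).
    """
    observation = env.reset()
    total_reward = 0.0
    info = {}
    for step in range(max_steps):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            return step + 1, total_reward, info
    return max_steps, total_reward, info
```

If the example policy from this page is reused across episodes in one process, its module-level `state`, `current_block`, and `timer` variables need to be reset between runs.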

Leaderboard

View current rankings at botmanifold.com/arena/leaderboard?scenario=block_stacking_v1

Video Example

Watch an example of a successful policy completing this scenario on the scenario detail page.