Topic 3 — Mapping, SLAM, and World Reconstruction
Once your humanoid can detect objects and understand scenes, it still needs to know where it is and what the environment looks like over time. This topic covers SLAM (Simultaneous Localization and Mapping), map building, and integration with the digital twin from Chapter 3.
3.1 SLAM Fundamentals
Simultaneous Localization and Mapping (SLAM) solves two problems at once:
- Localization: Where is the robot in the environment?
- Mapping: What does the environment look like?
These two tasks are coupled:
- You cannot localize accurately without a map.
- You cannot build a consistent map without knowing where each measurement was taken.
VSLAM vs LiDAR SLAM
- VSLAM (Visual SLAM):
- Uses images (RGB or RGB-D) as the primary sensor.
- Tracks visual features across frames.
- Often fused with an IMU for robustness.
- LiDAR SLAM:
- Uses point clouds from LiDAR.
- Aligns scans over time using scan-matching algorithms.
Trade-offs:
- VSLAM:
- Richer semantics (can associate visual features with objects).
- More sensitive to lighting and texture.
- LiDAR SLAM:
- Very accurate geometry and range.
- Less semantic information, higher hardware cost.
Many humanoid robots use VSLAM + IMU, optionally augmented by LiDAR in challenging environments.
Key Concepts
- Features and Keypoints: Distinctive points in images used for tracking.
- Keyframes: Selected frames that represent important viewpoints.
- Loop closure: Detecting when the robot revisits a place, to correct accumulated drift.
- Pose graph: Graph whose nodes are poses and edges are relative constraints between poses.
SLAM systems maintain and optimize a pose graph to keep estimates globally consistent.
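To make the pose-graph idea concrete, here is a minimal sketch: a toy 1D pose graph with odometry edges and one loop-closure edge, solved as a linear least-squares problem with NumPy. Real systems use nonlinear optimizers (e.g., g2o or GTSAM) over full 3D poses; all measurements below are made up for illustration.

```python
import numpy as np

# Toy 1D pose graph: 4 poses along a corridor, and the robot returns to start.
# Each edge (i, j, z) is a relative constraint: x_j - x_i ≈ z.
# Odometry slightly overestimates each step (drift); the loop closure
# says pose 3 should be back at pose 0.
edges = [
    (0, 1, 1.1),   # odometry: moved ~1.0 m, measured 1.1 m
    (1, 2, 1.1),
    (2, 3, -2.0),  # turn around and come back
    (3, 0, 0.0),   # loop closure: we are back where we started
]

n = 4
rows, b = [], []
for i, j, z in edges:
    r = np.zeros(n)
    r[j] += 1.0
    r[i] -= 1.0
    rows.append(r)
    b.append(z)
# Prior on the first pose (x0 = 0) to fix the gauge freedom.
prior = np.zeros(n)
prior[0] = 1.0
rows.append(prior)
b.append(0.0)

A = np.vstack(rows)
x, *_ = np.linalg.lstsq(A, np.array(b), rcond=None)
print(np.round(x, 3))
```

The odometry chain alone would put the robot 0.2 m from its true position; after optimization, that drift is spread evenly across the four cycle edges (0.05 m each), which is exactly what "globally consistent" means here.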
3.2 Building Maps in Real Time
SLAM pipelines typically operate in several stages:
- Frontend:
- Extract features from sensor data (visual or LiDAR).
- Track features over time or match scans.
- Estimate relative motion between frames.
- Backend:
- Maintain a pose graph.
- Add constraints from odometry, loop closures, and IMU.
- Optimize the graph to minimize overall error.
- Mapping:
- Integrate depth or point clouds into:
- Occupancy grids.
- TSDF volumes.
- Mesh reconstructions.
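As a concrete sketch of the frontend's motion estimation, the snippet below aligns two 2D "scans" with a single Kabsch/Procrustes step — the core rigid-alignment computation inside ICP-style scan matching — assuming point correspondences are already known (a real frontend must also find them). The points and motion are made up.

```python
import numpy as np

def align(src, dst):
    # Kabsch/Procrustes: best-fit rotation R and translation t with dst ≈ R @ src + t.
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

# Previous scan, and the same points after the robot rotates 10° and translates.
scan0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
theta = np.deg2rad(10.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
scan1 = scan0 @ R_true.T + np.array([0.3, 0.1])

R, t = align(scan0, scan1)
print(np.round(t, 3))   # recovered translation ≈ [0.3, 0.1]
```

With noiseless correspondences the true motion is recovered exactly; full ICP iterates this step, re-estimating nearest-neighbor correspondences each round.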
Occupancy Grids
- A 2D grid in which each cell stores the probability that it is occupied.
- Used for:
- Navigation and path planning.
- Simple obstacle avoidance.
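The per-cell update behind an occupancy grid is usually done in log-odds form, which turns repeated Bayesian updates into simple additions. A minimal per-cell sketch (the hit/miss probabilities here are illustrative choices, not values from any particular stack):

```python
import numpy as np

L_OCC = np.log(0.7 / 0.3)   # log-odds increment for a "hit" (beam endpoint)
L_FREE = np.log(0.3 / 0.7)  # log-odds decrement for a "miss" (beam passed through)

def update(logodds, hit):
    return logodds + (L_OCC if hit else L_FREE)

def probability(logodds):
    # Convert log-odds back to an occupancy probability.
    return 1.0 - 1.0 / (1.0 + np.exp(logodds))

l = 0.0                                        # prior: p = 0.5 (unknown)
for observation in [True, True, True, False]:  # three hits, one miss
    l = update(l, observation)
print(round(probability(l), 3))                # → 0.845 (likely occupied)
```

A full grid applies this update to every cell a sensor beam touches; the log-odds form keeps the math numerically stable as evidence accumulates.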
TSDF and 3D Meshes
- A TSDF (Truncated Signed Distance Function) stores truncated signed distances to the nearest surface in a 3D voxel volume.
- By extracting the zero-level set (e.g., with marching cubes), you can generate:
- Smooth 3D meshes.
- Detailed reconstructions of rooms or objects.
These 3D maps can be visualized, used for collision checking, or exported back to simulation tools (e.g., as Gazebo/Isaac Sim environments).
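The TSDF update itself can be sketched along a single camera ray: truncate the signed distance to the observed depth, then keep a weighted running average across frames. This is a 1D toy version of what KinectFusion-style pipelines do over a full 3D voxel volume; the depth readings are made up.

```python
import numpy as np

voxels = np.arange(0.0, 3.0, 0.1)   # voxel centers along one camera ray (m)
trunc = 0.3                          # truncation distance (m)

def integrate(tsdf, weight, depth):
    # Signed distance from each voxel to the observed surface, truncated.
    sdf = np.clip(depth - voxels, -trunc, trunc)
    new_w = weight + 1.0
    tsdf = (tsdf * weight + sdf) / new_w   # weighted running average
    return tsdf, new_w

tsdf = np.zeros_like(voxels)
w = np.zeros_like(voxels)
for depth in [1.02, 0.98, 1.00]:     # three noisy depth readings of a wall at ~1 m
    tsdf, w = integrate(tsdf, w, depth)

# The surface sits where the TSDF crosses zero.
crossing = voxels[np.argmin(np.abs(tsdf))]
print(round(crossing, 3))            # → 1.0
```

Averaging over frames is what smooths sensor noise into the clean surfaces you see in the extracted meshes.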
3.3 SLAM Inside the Digital Twin
A key advantage of having a digital twin (Chapter 3) is that you can test SLAM pipelines in simulation before relying on real hardware.
Workflow:
- Create a simulated environment in Gazebo or Isaac Sim:
- Walls, furniture, and obstacles.
- Realistic sensor models for camera, depth, LiDAR, and IMU.
- Run SLAM on simulated sensor topics:
- Use the same ROS 2 nodes you plan to run on hardware.
- Compare SLAM output to ground truth:
- Pose trajectories (estimated vs true).
- Map quality (occupancy grids, meshes).
Benefits:
- Safe testing of parameter settings (feature thresholds, loop closure criteria, etc.).
- Ability to reproduce corner cases by replaying simulated data.
- Easier debugging with full access to ground truth.
3.4 Lab B: SLAM-Based Mapping and Navigation Awareness
This lab focuses on building and validating a mapping stack.
Objectives
- Run a VSLAM pipeline end-to-end.
- Build a 2D or 3D map suitable for navigation.
- Evaluate loop closure and trajectory accuracy.
Tasks
- Data Capture
- Use either:
- Real robot sensor logs (RGB-D + IMU), or
- Simulated trajectories in Gazebo/Isaac Sim.
- Run SLAM
- Choose a SLAM implementation (e.g., ORB-SLAM3, RTAB-Map, Isaac ROS VSLAM).
- Configure sensor topics and camera/IMU calibration.
- Map Generation
- Produce:
- An occupancy grid for navigation, and/or
- A 3D reconstruction (TSDF/mesh) of the environment.
- Loop Closure Testing
- Design a trajectory that revisits the same area.
- Verify whether the SLAM system:
- Detects the loop.
- Corrects accumulated drift.
- Evaluation
- Compute:
- Trajectory error (if ground truth poses are available).
- Qualitative map quality (alignment with known layout).
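For the trajectory-error metric, a common choice is absolute trajectory error (ATE) reported as an RMSE over time-associated poses. A minimal sketch, assuming the estimated and ground-truth trajectories are already associated and expressed in the same frame (evaluation tools such as evo also perform a rigid alignment first; the trajectories below are made up):

```python
import numpy as np

# Ground-truth and estimated (x, y) positions at matching timestamps.
gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 1.0]])
est = np.array([[0.0, 0.0], [1.05, 0.02], [2.1, 0.05], [2.1, 1.1]])

errors = np.linalg.norm(est - gt, axis=1)   # per-pose position error (m)
ate_rmse = np.sqrt(np.mean(errors ** 2))
print(round(ate_rmse, 4))                   # → 0.0941
```

Report the RMSE alongside the max error: a low RMSE can hide a single large jump at a failed loop closure.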
Deliverables
- Generated maps (occupancy grid and/or 3D mesh).
- Trajectory plots and error metrics (if available).
- Brief report summarizing:
- SLAM configuration.
- Successes and failure cases.
- Lessons learned for running SLAM on the real humanoid.
3.5 Preparing Maps for Navigation
The output of SLAM must be compatible with navigation stacks:
- Clean up maps:
- Remove outliers and transient obstacles.
- Inflate obstacles to account for robot footprint.
- Export maps in standard formats:
- 2D costmaps for navigation (Topic 5).
- 3D maps or point clouds for higher-level planning.
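Obstacle inflation can be sketched as marking every cell within the robot's radius of an occupied cell, so a point-robot planner keeps a safe distance. Navigation stacks do this in their costmap layers, typically with a decaying cost gradient rather than the hard binary used here; the grid and radius are made up.

```python
import numpy as np

def inflate(grid, radius_cells):
    # Mark a square neighborhood around every occupied cell as occupied.
    # (A square footprint is used for simplicity; costmaps use circular kernels.)
    rows, cols = np.nonzero(grid)
    inflated = grid.copy()
    for r, c in zip(rows, cols):
        r0, r1 = max(0, r - radius_cells), min(grid.shape[0], r + radius_cells + 1)
        c0, c1 = max(0, c - radius_cells), min(grid.shape[1], c + radius_cells + 1)
        inflated[r0:r1, c0:c1] = 1
    return inflated

grid = np.zeros((5, 5), dtype=int)
grid[2, 2] = 1                       # single obstacle in the middle
print(inflate(grid, 1))              # a 3x3 occupied block around the obstacle
```

The inflation radius should be at least the robot's circumscribed footprint radius divided by the map resolution, so a path through "free" cells is actually traversable.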
Think of SLAM as building the world model that your planner will use in the next chapter. The better your maps, the more reliable your autonomous behaviors will be.