Topic 1 — ROS 2 Architecture & Core Concepts

This topic introduces the fundamental question: Why do robots need middleware? We explore the distributed architecture problem in robotics, compare ROS 1 and ROS 2, and establish the conceptual foundation for everything that follows in this chapter.


1.1 What is ROS 2? The Middleware Problem

Why Robots Need Middleware

Imagine building a humanoid robot from scratch. You need:

  • Sensors streaming data: cameras capturing RGB-D images at 30 FPS, LiDAR scanning 360° environments, IMUs tracking balance
  • Perception algorithms processing sensor data: object detection, SLAM, state estimation
  • Planning systems computing trajectories: path planning, manipulation planning, gait generation
  • Control systems executing commands: motor controllers, balance controllers, safety monitors

In a monolithic architecture, all of this runs in a single process. This approach has critical flaws:

  • Tight coupling: Changing the camera driver breaks the planner
  • No modularity: Can't reuse perception code on a different robot
  • Single point of failure: One crash brings down the entire system
  • Resource contention: Heavy perception blocks time-critical control loops
  • Testing difficulty: Can't test components in isolation

A distributed architecture solves these problems by separating concerns:

  • Sensors run on dedicated hardware (edge devices, specialized boards)
  • Perception runs on powerful GPUs (workstation, Jetson)
  • Planning runs on CPUs with access to maps and models
  • Control runs on real-time hardware (robot's onboard computer)

But distributed systems create a new problem: How do these components communicate?

This is where middleware comes in. Middleware provides:

  1. Standardized communication protocols — All components speak the same language
  2. Discovery and naming — Components find each other automatically
  3. Type safety — Messages are validated before transmission
  4. Quality of Service (QoS) — Guarantees about delivery, latency, reliability
  5. Hardware abstraction — Same code works with different sensors/actuators

ROS 2 as a Publish-Subscribe Message Bus

Robot Operating System 2 (ROS 2) is middleware specifically designed for robotics. At its core, ROS 2 provides:

  • Nodes: Independent units of computation that each perform a specific task (often, but not always, one node per process)
  • Topics: Named data streams for publish-subscribe communication
  • Services: Request-response operations for discrete queries
  • Actions: Goal-oriented tasks with feedback and cancellation
  • Parameters: Configuration values accessible at runtime

ROS 2 uses DDS (Data Distribution Service) as its default underlying communication layer. DDS is an industry-standard middleware used in aerospace, defense, and industrial automation. It provides:

  • Low, predictable latency: Can support time-critical control loops, although hard real-time behavior also depends on the operating system, the DDS implementation, and the application code
  • Type safety: Strong typing prevents message mismatches
  • Discovery: Automatic detection of publishers and subscribers
  • QoS policies: Fine-grained control over reliability, durability, history

ROS 1 vs ROS 2: Why ROS 2 Matters

ROS 1 (the original ROS) revolutionized robotics but had fundamental limitations:

| Feature | ROS 1 | ROS 2 |
|---|---|---|
| Real-time support | Limited, not deterministic | Designed for real-time use (needs a real-time OS and careful configuration) |
| Type safety | Weak, runtime errors common | Strong typing, compile-time checks |
| Network security | None built in (unencrypted transport) | Optional, via DDS Security |
| Multi-robot support | Difficult, namespace hacks | Native multi-robot support |
| Cross-platform | Primarily Linux | Linux, Windows, macOS, RTOS |
| Lifecycle management | Manual, error-prone | Managed nodes with state machines |
| QoS control | None | Granular QoS policies |

Key improvements in ROS 2:

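  1. Real-time readiness: Control loops can achieve bounded latency when paired with a real-time OS and a suitably configured DDS implementation
  2. Production-ready: Designed for commercial deployment as well as research, with long-term support releases
  3. Security: DDS Security, when enabled, can authenticate participants and encrypt traffic to prevent unauthorized access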
  4. Modularity: Better separation of concerns, easier testing

For this course, we use ROS 2 Humble, a Long-Term Support (LTS) release. Iron can also be used, but it is a shorter-lived, non-LTS release.


1.2 The ROS 2 Computation Graph

The computation graph is the conceptual model of how ROS 2 systems are organized. It consists of:

Nodes

Nodes are independent units of computation that perform specific tasks. A node usually runs in its own process, although several nodes can share one. Each node has:

  • Single responsibility: One node does one thing well
  • Unique name: Identified by namespace and name (e.g., /perception/camera_driver)
  • Lifecycle: Optionally, managed startup, shutdown, and error recovery (see Section 1.3)
  • Interfaces: Publishes/subscribes to topics, provides/uses services, handles actions

Example nodes in a humanoid robot:

  • /sensors/camera — Publishes RGB-D images
  • /perception/object_detector — Subscribes to images, publishes detections
  • /planning/navigator — Provides navigation service
  • /control/motor_controller — Subscribes to commands, controls motors

Topics

Topics are named data streams for asynchronous, one-to-many communication. They use the publish-subscribe pattern:

  • Publishers send messages without knowing who receives them
  • Subscribers receive messages without knowing who sends them
  • Decoupling: Publishers and subscribers are independent

Example topics:

  • /camera/rgb — RGB images (published by camera driver)
  • /lidar/scan — LiDAR point clouds
  • /joint_states — Current joint positions and velocities
  • /motor_commands — Desired joint velocities
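
The decoupling described above can be sketched with a toy in-process bus. This is a conceptual model in plain Python, not the rclpy API: the `TopicBus` class and its methods are hypothetical names chosen for illustration, and real ROS 2 topics also carry message types and QoS settings.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Callback = Callable[[Any], None]


class TopicBus:
    """Toy in-process stand-in for a topic-based message bus (not ROS 2 itself)."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks subscribed to it.
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


bus = TopicBus()
planner_inbox: list = []
logger_inbox: list = []

# Two independent subscribers on the same topic (one-to-many).
bus.subscribe("/joint_states", planner_inbox.append)
bus.subscribe("/joint_states", logger_inbox.append)

# The publisher only names the topic; it never references the subscribers.
delivered = bus.publish("/joint_states", {"position": [0.0, 1.2]})

# Publishing to a topic nobody listens to is not an error.
unheard = bus.publish("/motor_commands", {"velocity": [0.1, 0.0]})
```

Neither side holds a reference to the other, so a subscriber can be added or removed without touching the publisher's code.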

Services

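Services provide request-response communication. Unlike topics (which stream continuously), services are:

  • Many-to-one: Any number of clients can call a service, but each service name is expected to have a single server
  • Request-response: The client gets one reply per request; it can block until the reply arrives or wait on a future (asynchronous calls are generally preferred, because blocking inside a callback can deadlock)
  • Discrete: Used for queries, not continuous data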

Example services:

  • /get_robot_pose — Returns current robot position
  • /plan_trajectory — Takes start/goal, returns path
  • /set_parameters — Updates configuration
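
A minimal sketch of the request-response pattern, assuming a toy `ServiceRegistry` (a hypothetical class, not part of rclpy) and a made-up `plan_trajectory` handler that simply returns a two-point path. It shows the client receiving a future it can wait on, which is the style usually recommended over blocking calls.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class ServiceRegistry:
    """Toy request-response layer; names and API are illustrative only."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}
        self._pool = ThreadPoolExecutor(max_workers=2)

    def advertise(self, name: str, handler: Handler) -> None:
        # Each service name gets exactly one server in this model.
        if name in self._servers:
            raise ValueError(f"service {name!r} already has a server")
        self._servers[name] = handler

    def call_async(self, name: str, request: dict) -> Future:
        """Send a request and return a future that will hold the response."""
        return self._pool.submit(self._servers[name], request)

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


def plan_trajectory(request: dict) -> dict:
    # Placeholder planner: a straight line from start to goal.
    return {"path": [request["start"], request["goal"]]}


registry = ServiceRegistry()
registry.advertise("/plan_trajectory", plan_trajectory)

future = registry.call_async("/plan_trajectory", {"start": (0, 0), "goal": (2, 3)})
response = future.result(timeout=1.0)  # Wait here only when the reply is needed.
registry.shutdown()
```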

Actions

Actions are asynchronous, goal-oriented tasks with feedback. They combine:

  • Goal: Client sends a goal (e.g., "navigate to position X")
  • Feedback: Server reports progress (e.g., "50% complete")
  • Result: Server returns final outcome (e.g., "goal reached" or "failed")

Example actions:

  • /navigate_to_goal — Long-running navigation task
  • /grasp_object — Manipulation with progress updates
  • /execute_trajectory — Motion execution with feedback
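
The goal, feedback, and result flow can be modeled in a few lines. This is a sketch under stated assumptions: `GoalHandle` and `navigate` are hypothetical names, the "robot" just advances a number toward a positive target, and a real action server would run this work asynchronously rather than in a plain function call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GoalHandle:
    """Tracks one goal; a hypothetical stand-in for an action goal handle."""
    target: float  # Must be positive in this sketch.
    cancel_requested: bool = False
    feedback_log: List[float] = field(default_factory=list)


def navigate(goal: GoalHandle, step: float,
             on_feedback: Callable[[float], None]) -> str:
    """Advance toward goal.target in fixed steps, reporting progress each step."""
    position = 0.0
    while position < goal.target:
        if goal.cancel_requested:
            return "canceled"
        position = min(position + step, goal.target)
        progress = position / goal.target
        goal.feedback_log.append(progress)
        on_feedback(progress)
    return "succeeded"


# Goal that runs to completion.
first = GoalHandle(target=4.0)
first_result = navigate(first, step=1.0, on_feedback=lambda progress: None)

# Goal that the client cancels once it is half done.
second = GoalHandle(target=4.0)


def cancel_at_half(progress: float) -> None:
    if progress >= 0.5:
        second.cancel_requested = True


second_result = navigate(second, step=1.0, on_feedback=cancel_at_half)
```

The server checks for cancellation between steps, so a cancel request takes effect at the next step boundary rather than instantly.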

Parameters

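Parameters are configuration values accessible at runtime. Each parameter belongs to a specific node and must normally be declared by that node before it is used. They enable: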

  • Dynamic reconfiguration: Change behavior without restarting nodes
  • Environment-specific settings: Different values for sim vs. real
  • Tuning: Adjust control gains, thresholds, limits

Example parameters:

  • control/max_velocity — Maximum joint velocity
  • perception/confidence_threshold — Object detection threshold
  • planning/timeout — Planning timeout in seconds
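
As a rough illustration of runtime reconfiguration, the sketch below models a per-node parameter store that rejects undeclared names and out-of-range values. `ParameterStore` is a hypothetical class, not the rclpy parameter API, and the velocity limits are made-up values.

```python
from typing import Any, Callable, Dict, Optional

Validator = Callable[[Any], bool]


class ParameterStore:
    """Toy per-node parameter table with optional validation."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._validators: Dict[str, Validator] = {}

    def declare(self, name: str, default: Any,
                validator: Optional[Validator] = None) -> None:
        self._values[name] = default
        # With no validator supplied, accept every value.
        self._validators[name] = validator or (lambda _value: True)

    def get(self, name: str) -> Any:
        return self._values[name]

    def set(self, name: str, value: Any) -> bool:
        """Apply the update only if the name is declared and the value is valid."""
        if name not in self._values:
            return False
        if not self._validators[name](value):
            return False
        self._values[name] = value
        return True


params = ParameterStore()
params.declare("max_velocity", 1.0, validator=lambda v: 0.0 < v <= 2.0)

accepted = params.set("max_velocity", 1.5)      # Within limits, so it is applied.
rejected = params.set("max_velocity", 5.0)      # Out of range, so it is ignored.
undeclared = params.set("unknown_gain", 0.3)    # Never declared, so it is ignored.
```

Validating on write keeps a bad tuning value from reaching a running control loop.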

1.3 Node Lifecycle & Executors

Node Lifecycle States

ROS 2 supports lifecycle-managed nodes that transition through well-defined states:

  1. Unconfigured — Node created but not initialized
  2. Inactive — Node configured but not active
  3. Active — Node running and processing
  4. Finalized — Node cleaned up and shut down

Lifecycle transitions:

  • configure — Initialize node (load parameters, setup)
  • activate — Start processing (begin publishing/subscribing)
  • deactivate — Stop processing (pause, but keep state)
  • cleanup — Clean up resources
  • shutdown — Final shutdown

Why lifecycle management matters:

  • Predictable startup: Nodes initialize in correct order
  • Graceful shutdown: Clean resource cleanup
  • Error recovery: Nodes can restart without full system reboot
  • Safety: Critical nodes can be paused without losing state
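
The states and transitions listed above form a small state machine, sketched here in plain Python. `LifecycleModel` is a hypothetical class for illustration; it covers only the four primary states and omits the intermediate transition states and error handling that real lifecycle nodes have.

```python
from enum import Enum


class State(Enum):
    UNCONFIGURED = "unconfigured"
    INACTIVE = "inactive"
    ACTIVE = "active"
    FINALIZED = "finalized"


# (transition name, current state) -> next state
TRANSITIONS = {
    ("configure", State.UNCONFIGURED): State.INACTIVE,
    ("activate", State.INACTIVE): State.ACTIVE,
    ("deactivate", State.ACTIVE): State.INACTIVE,
    ("cleanup", State.INACTIVE): State.UNCONFIGURED,
    ("shutdown", State.UNCONFIGURED): State.FINALIZED,
    ("shutdown", State.INACTIVE): State.FINALIZED,
    ("shutdown", State.ACTIVE): State.FINALIZED,
}


class LifecycleModel:
    """Tracks the primary lifecycle state and rejects invalid transitions."""

    def __init__(self) -> None:
        self.state = State.UNCONFIGURED

    def trigger(self, transition: str) -> bool:
        next_state = TRANSITIONS.get((transition, self.state))
        if next_state is None:
            return False  # Not allowed from the current state.
        self.state = next_state
        return True


node = LifecycleModel()
skipped_configure = node.trigger("activate")  # Rejected: still unconfigured.
node.trigger("configure")
node.trigger("activate")
state_when_running = node.state
node.trigger("deactivate")                    # Paused; configuration is kept.
state_when_paused = node.state
node.trigger("shutdown")
```

Because every transition is checked against the table, a supervisor can bring nodes up in a fixed order and know that none of them starts processing before it has been configured.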

Executors: Single-Threaded vs Multi-Threaded

Executors control how nodes process callbacks (messages, service requests, timers). The two most commonly used executors are:

SingleThreadedExecutor:

  • All callbacks run in one thread
  • More predictable: Callbacks never run concurrently, so they need no locking between them
  • Real-time friendly: No thread contention between callbacks
  • Use case: Control loops, time-critical nodes

MultiThreadedExecutor:

  • Callbacks run in thread pool
  • Higher throughput: Parallel processing
  • Non-deterministic: Order depends on scheduling
  • Use case: Perception, planning (can tolerate jitter)

Best practice: Use single-threaded executors for control, multi-threaded for perception/planning.
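
The trade-off between the two models can be sketched with the standard library alone. The functions below are hypothetical stand-ins for executors, not rclpy classes: one runs ready callbacks in queue order on the calling thread, the other hands them to a thread pool where completion order depends on scheduling.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def make_callbacks(names: List[str], log: List[str]) -> List[Callable[[], None]]:
    """Build callbacks that each append their own name to a shared log."""
    lock = threading.Lock()

    def make(name: str) -> Callable[[], None]:
        def callback() -> None:
            with lock:  # Guard the shared log when callbacks run in parallel.
                log.append(name)
        return callback

    return [make(name) for name in names]


def spin_single_threaded(callbacks: List[Callable[[], None]]) -> None:
    """Run callbacks one at a time, in queue order."""
    for callback in callbacks:
        callback()


def spin_multi_threaded(callbacks: List[Callable[[], None]], workers: int = 3) -> None:
    """Run callbacks on a thread pool; execution order is up to the scheduler."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(callback) for callback in callbacks]
        for future in futures:
            future.result()  # Re-raise any exception from a callback.


names = ["timer", "imu", "camera", "lidar"]
single_log: List[str] = []
multi_log: List[str] = []

spin_single_threaded(make_callbacks(names, single_log))
spin_multi_threaded(make_callbacks(names, multi_log))
```

After this runs, `single_log` always matches the queue order, while `multi_log` contains the same names in an order that may vary between runs. Note that the multi-threaded version needs a lock around shared state, which the single-threaded version does not.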


1.4 Quality of Service (QoS) Profiles

QoS policies control how messages are delivered. This is critical for robotics where different data types have different requirements.

Reliability

  • Reliable: Delivery is guaranteed as long as the connection holds; lost messages are retransmitted
  • Best-effort: Messages are sent once and may be lost on an unreliable network; nothing is retransmitted

Use cases:

  • Reliable: Motor commands, safety signals (must not be lost)
  • Best-effort: Camera images, high-frequency sensor data (can tolerate drops)

Durability

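  • Volatile: Only subscribers connected at publish time receive a message
  • Transient Local: The publisher retains its most recent messages (up to its history depth) and delivers them to subscribers that join later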

Use cases:

  • Volatile: Real-time sensor streams
  • Transient Local: Robot state, map data (new nodes need current state)
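
The difference shows up when a subscriber joins late. The sketch below uses a hypothetical `LatchedTopic` class that replays only the single most recent message; in ROS 2 the number of replayed messages follows the publisher's history depth, and both sides must use transient local durability.

```python
from typing import Any, Callable, List


class LatchedTopic:
    """Toy topic; with transient_local=True, late joiners get the last message."""

    def __init__(self, transient_local: bool) -> None:
        self.transient_local = transient_local
        self._last: Any = None
        self._has_message = False
        self._subscribers: List[Callable[[Any], None]] = []

    def publish(self, message: Any) -> None:
        self._last, self._has_message = message, True
        for callback in self._subscribers:
            callback(message)

    def subscribe(self, callback: Callable[[Any], None]) -> None:
        self._subscribers.append(callback)
        # Replay the retained message only for transient-local topics.
        if self.transient_local and self._has_message:
            callback(self._last)


# Map data: published once, before the consumer exists.
map_topic = LatchedTopic(transient_local=True)
map_topic.publish("map_v1")
late_map_reader: list = []
map_topic.subscribe(late_map_reader.append)

# Sensor stream: a late joiner only sees messages published after it subscribed.
scan_topic = LatchedTopic(transient_local=False)
scan_topic.publish("scan_1")
late_scan_reader: list = []
scan_topic.subscribe(late_scan_reader.append)
scan_topic.publish("scan_2")
```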

History

  • Keep Last: Keep N most recent messages
  • Keep All: Keep all messages (may grow unbounded)

Use cases:

  • Keep Last (depth=1): Latest state only
  • Keep Last (depth=10): Small buffer for jitter tolerance
  • Keep All: Debugging, logging (use with caution)
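
History depth behaves much like a bounded queue. As a minimal sketch (the `make_history` helper is a hypothetical name), Python's `collections.deque` with `maxlen` drops the oldest entry once the depth is reached, while an unbounded deque models Keep All.

```python
from collections import deque
from typing import Optional


def make_history(depth: Optional[int] = None) -> deque:
    """Keep Last maps to a bounded deque; Keep All (depth=None) is unbounded."""
    return deque(maxlen=depth)


latest_only = make_history(depth=1)
small_buffer = make_history(depth=3)
keep_all = make_history()

# Publish five messages with sequence numbers 0..4 into each queue.
for sequence_number in range(5):
    for history in (latest_only, small_buffer, keep_all):
        history.append(sequence_number)
```

With depth 1 only message 4 survives, with depth 3 messages 2 to 4 survive, and Keep All retains all five, which is why it needs care on long-running systems.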

Common QoS Profiles

ROS 2 provides pre-configured profiles:

  • Sensor Data: Best-effort, volatile, keep last (depth=5)
  • Services: Reliable, volatile, keep last (depth=10)
  • Parameters: Reliable, volatile, keep last (depth=1000)
  • Default: Reliable, volatile, keep last (depth=10)
  • System Default: Defers every policy to the underlying middleware's own defaults

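Matching QoS: Publishers and subscribers must have compatible QoS. If incompatible, they won't connect. For example, a subscriber that requests reliable delivery will not receive data from a best-effort publisher.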
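
The matching rule can be read as "the publisher must offer at least what the subscriber requests". The sketch below checks only reliability and durability; the `QoS` dataclass and `compatible` function are hypothetical names, and other policies such as deadline and liveliness, which also affect matching, are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoS:
    reliable: bool          # False means best-effort.
    transient_local: bool   # False means volatile.


def compatible(offered: QoS, requested: QoS) -> bool:
    """True if the publisher's offered QoS satisfies the subscriber's request."""
    # A best-effort publisher cannot satisfy a reliable subscriber.
    reliability_ok = offered.reliable or not requested.reliable
    # A volatile publisher cannot satisfy a transient-local subscriber.
    durability_ok = offered.transient_local or not requested.transient_local
    return reliability_ok and durability_ok


sensor_publisher = QoS(reliable=False, transient_local=False)
reliable_publisher = QoS(reliable=True, transient_local=False)

strict_subscriber = QoS(reliable=True, transient_local=False)
relaxed_subscriber = QoS(reliable=False, transient_local=False)

mismatch = compatible(sensor_publisher, strict_subscriber)
downgrade = compatible(reliable_publisher, relaxed_subscriber)
```

Here `mismatch` is False and `downgrade` is True: a subscriber may ask for less than the publisher offers, but never more.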


1.5 Hands-On Preview

In the following topics, you will build:

  1. Minimal nodes: Publisher and subscriber nodes
  2. Service nodes: Server and client for planning queries
  3. Action nodes: Goal-oriented navigation with feedback
  4. Parameter nodes: Dynamic configuration management
  5. Multi-node system: Complete 3-node pipeline

Each topic includes:

  • Conceptual explanation: Why and when to use each pattern
  • Code examples: Working Python (rclpy) implementations
  • Best practices: Common mistakes and how to avoid them
  • Debugging tips: Tools and techniques for troubleshooting

Summary

ROS 2 solves the distributed coordination problem in robotics by providing:

  • Standardized communication: Topics, services, actions
  • Real-time readiness: DDS-based delivery with configurable latency and reliability
  • Modularity: Independent nodes with clear interfaces
  • Type safety: Strongly typed messages catch interface mismatches early
  • QoS control: Fine-grained control over message delivery

Understanding these fundamentals is essential for building robust, scalable robot systems. The next topics dive deep into implementation details, code examples, and hands-on labs.

