Topic 4 — Fleet Communications & Inter-Agent Messaging
Inter-robot communication is the backbone of every multi-robot system. This topic covers fleet networking, message passing, reliability and safety, and the tools available in ROS 2, DDS, and modern IoT/cloud frameworks. Robust comms underpin everything from map merging to market-based task auctions, and they are crucial for scaling robot teams.
4.1 ROS 2 DDS Networking & QoS
- ROS 2 builds on DDS (Data Distribution Service), a decentralized publish/subscribe middleware: nodes discover each other directly, with no central master.
- Topic sharing: Publish/subscribe model works across a fleet if domain IDs, namespaces, and QoS are configured correctly.
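- Quality of Service (QoS):
  - Reliability: reliable vs. best-effort (use reliable for mission-critical messages such as commands and goals; best-effort for high-rate sensor streams)
  - History, durability, and lifespan: how many samples are kept, whether late-joining subscribers receive earlier samples, and how long a sample stays valid; tune these to trade performance against reliability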
- Practical settings: For large fleets or tight safety requirements, prefer static discovery, bounded resource limits, and explicit heartbeats with timeouts for liveness.
- Comms patterns: Broadcast (all robots), peer-to-peer (pairs/teams), relay (one robot as gateway to the cloud or web dashboard)
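The reliable vs. best-effort choice above also decides whether a publisher and a subscriber connect at all. The following is a minimal sketch in plain Python (not rclpy) of the request-versus-offered rule: a subscriber that requests reliable delivery does not match a best-effort publisher, and a subscriber that requests transient-local durability does not match a volatile publisher. The names `QosSketch`, `compatible`, `COMMANDS`, and `SENSOR_STREAM`, and the depth values, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1


@dataclass(frozen=True)
class QosSketch:
    """Hypothetical stand-in for a QoS profile; this is not the rclpy class."""
    reliability: Reliability
    durability: Durability
    depth: int = 10  # keep-last history depth (illustrative value)


def compatible(offered: QosSketch, requested: QosSketch) -> bool:
    """The publisher must offer at least what the subscriber requests."""
    return (offered.reliability >= requested.reliability
            and offered.durability >= requested.durability)


# Illustrative profiles: commands/goals vs. high-rate sensor streams.
COMMANDS = QosSketch(Reliability.RELIABLE, Durability.TRANSIENT_LOCAL, depth=10)
SENSOR_STREAM = QosSketch(Reliability.BEST_EFFORT, Durability.VOLATILE, depth=5)

# A reliable publisher can serve a best-effort subscriber...
print(compatible(offered=COMMANDS, requested=SENSOR_STREAM))  # True
# ...but a best-effort publisher cannot serve a reliable subscriber.
print(compatible(offered=SENSOR_STREAM, requested=COMMANDS))  # False
```

A mismatch of this kind produces no data rather than an error on the wire, which is why it is worth checking profiles on both ends before blaming the network.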
4.2 API-Based Multi-Robot Control
- REST APIs: Human operators or backend services can assign tasks and query robot state over plain HTTP.
- WebSocket: Real-time dashboards and bidirectional cloud comms for remote fleet management.
- MQTT & brokers: Message brokers (e.g., Mosquitto, HiveMQ) allow lightweight, scalable publish-subscribe comms for robot-to-robot and robot-to-cloud coordination.
- Edge/cloud hybrid: Some robots operate with edge-only comms, others relay state/data to cloud dashboards or centralized planners.
- Broker relays: Allow communication across network/firewall/domain boundaries, and log all comms for security audits.
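To make the broker-based publish-subscribe pattern concrete, here is a small sketch of how MQTT topic filters select messages: `+` matches exactly one topic level and `#` matches all remaining levels. The topic layout `fleet/<robot_id>/...` and the function name `topic_matches` are assumptions for illustration, and the sketch ignores special cases such as topics beginning with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last filter level,
            # and it matches whatever remains (including nothing).
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    # No '#' seen: both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# Hypothetical fleet topics: a dashboard watches every robot's state,
# while a planner listens to everything from one robot.
print(topic_matches("fleet/+/state", "fleet/robot1/state"))          # True
print(topic_matches("fleet/+/state", "fleet/robot1/task/status"))    # False
print(topic_matches("fleet/robot1/#", "fleet/robot1/task/status"))   # True
```

With a layout like this, robot-to-cloud and robot-to-robot traffic can share one broker while each subscriber sees only the slice it asked for.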
4.3 Security & Identity
- Authentication: Each robot must have a secure identity (certificate/key pair, pre-shared key, or dynamic credential exchange).
- Encrypted channels: DDS-Security (SROS 2 in ROS 2) for DDS traffic, TLS for MQTT, REST, and WebSocket, or a VPN overlay for the entire fleet network.
- Access roles: Partition robots by capabilities (scout/worker/carrier), restrict commands based on role/user, and log all high-stakes commands.
- Fleet-level safety: Monitor for spoofing, replay attacks, and compromised robots. Run real-time network integrity checks with alerting and escalation procedures.
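The points above (authenticated identity, role-restricted commands, and replay detection) can be sketched together using only the Python standard library. This is an illustration under stated assumptions, not a production scheme: it assumes a pre-shared key per robot, and the message fields, the `ROLE_COMMANDS` table, and the names `sign_command` and `CommandVerifier` are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical role -> allowed-commands table.
ROLE_COMMANDS = {
    "scout": {"scan", "report"},
    "worker": {"pick", "place", "report"},
    "carrier": {"load", "unload", "report"},
}


def sign_command(key: bytes, robot_id: str, command: str,
                 nonce: str, ts: float) -> dict:
    """Build a command message and attach an HMAC-SHA256 signature."""
    body = {"robot_id": robot_id, "command": command, "nonce": nonce, "ts": ts}
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return body


class CommandVerifier:
    """Robot-side check: signature, freshness, replay, then role."""

    def __init__(self, key: bytes, role: str, max_age_s: float = 5.0):
        self.key = key
        self.role = role
        self.max_age_s = max_age_s
        self.seen_nonces = set()  # unbounded here; a real system would expire entries

    def verify(self, msg: dict, now: float) -> str:
        body = {k: v for k, v in msg.items() if k != "sig"}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(self.key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, msg.get("sig", "")):
            return "rejected: bad signature"      # spoofed or tampered
        if abs(now - msg["ts"]) > self.max_age_s:
            return "rejected: stale"              # too old to trust
        if msg["nonce"] in self.seen_nonces:
            return "rejected: replay"             # already executed once
        if msg["command"] not in ROLE_COMMANDS.get(self.role, set()):
            return "rejected: not allowed for role"
        self.seen_nonces.add(msg["nonce"])
        return "accepted"


key = b"demo-pre-shared-key"  # illustrative only; never hard-code real keys
verifier = CommandVerifier(key, role="scout")
msg = sign_command(key, "robot1", "scan", nonce="n-001", ts=100.0)
print(verifier.verify(msg, now=101.0))  # accepted
print(verifier.verify(msg, now=102.0))  # rejected: replay
```

Each rejection reason is a natural event to log and escalate, since repeated bad signatures or replays may indicate a compromised peer.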
Diagram: Fleet Network Schematic
+----------+    TLS     +-----------+   MQTT+REST    +-----------+
| Robot 1  |<---------->|  Broker/  |<-------------->|  Cloud/   |
|          |            |  EdgeHub  |  (encrypted)   |  Operator |
+----------+            +-----------+                +-----------+
     |                     ^     ^
     | DDS pub/sub         |     | Peer relays
     | (QoS partition)     |     |
+----------+               |     |
| Robot 2  |<--------------+-----+------ ... ------ Robot N
+----------+
Lab: ROS 2 Fleet DDS Domain + Reliability Testing
Goal: Set up a multi-robot ROS 2/Gazebo fleet. Vary the comms settings and demonstrate that:
- Topics propagate between robots
- A single-robot failure does not halt the fleet, and QoS tuning affects message loss and delay
- (Optional) A relay robot or a REST/MQTT dashboard can observe state and queue comms
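To show that a single-robot failure does not halt the fleet, the remaining robots first need a way to notice it. One possible approach is a heartbeat timeout, sketched below. The class name `LivenessMonitor`, the 2-second timeout, and the robot IDs are assumptions for illustration; timestamps are passed in explicitly so the logic can be driven from simulation time.

```python
class LivenessMonitor:
    """Track the last heartbeat per robot and flag those that went quiet."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # robot_id -> time of last heartbeat

    def heartbeat(self, robot_id: str, now: float) -> None:
        self.last_seen[robot_id] = now

    def lost(self, now: float) -> list:
        """Robots whose last heartbeat is older than the timeout."""
        return sorted(r for r, t in self.last_seen.items()
                      if now - t > self.timeout_s)


monitor = LivenessMonitor(timeout_s=2.0)
monitor.heartbeat("robot1", now=0.0)
monitor.heartbeat("robot2", now=0.0)
monitor.heartbeat("robot1", now=1.5)   # robot2 has gone silent
print(monitor.lost(now=3.0))           # ['robot2']
```

In the lab, the output of `lost` could trigger task reassignment, which is the behavior the goal asks you to demonstrate.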
Tasks:
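- Launch two or more robots, first on separate DDS domain IDs (ROS_DOMAIN_ID) and then on a shared one; topics are only visible between nodes on the same domain ID.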
- Configure topics for both reliable (commands/goals) and best-effort (sensor streams) QoS.
- Log:
  - Message propagation delay and loss (packet drops, QoS mismatches)
  - Reaction to simulated disconnections or network partitions
- (Optional) Connect an edge broker or cloud dashboard for remote monitoring/control via API
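For the logging task, one simple way to turn raw logs into delay and loss figures is to stamp every message with a sequence number and compare send and receive times. The sketch below assumes both logs use the same clock (true in a single Gazebo simulation; on real robots the clocks would need to be synchronized). The function name `comms_stats` and the log format are hypothetical.

```python
from statistics import mean


def comms_stats(sent: dict, received: dict) -> dict:
    """Summarize loss and one-way delay.

    sent:     sequence number -> send time in seconds
    received: sequence number -> receive time in seconds
    """
    delays = [received[s] - sent[s] for s in sent if s in received]
    lost = sorted(s for s in sent if s not in received)
    return {
        "sent": len(sent),
        "lost": lost,
        "loss_rate": len(lost) / len(sent) if sent else 0.0,
        "mean_delay_s": mean(delays) if delays else None,
        "max_delay_s": max(delays) if delays else None,
    }


# Illustrative log: message 2 never arrived.
sent_log = {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.5}
recv_log = {0: 0.25, 1: 0.75, 3: 2.0}
stats = comms_stats(sent_log, recv_log)
print(stats["lost"], stats["loss_rate"], stats["max_delay_s"])  # [2] 0.25 0.5
```

Running the same summary once per QoS configuration gives a direct comparison of reliable and best-effort settings for the report.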
Deliverables:
- DDS and comms config files
- Logs of message passing, failures, timing, and dashboard connectivity
- Short report: Fleet comms bottlenecks and lessons on ROS 2 networking for large-scale deployments
Summary
Distributed autonomy depends on the speed, reliability, and security of fleet communication. Getting QoS and network topology right is what makes robot teams safe, resilient, and scalable, and it underpins all higher-level behaviors, planning, and mission assurance.