Topic 3 — Task Execution & Action Sequencing
Navigation answers the question “how do I get there?” Task execution answers “what should I do, in what order, and how do I recover when things go wrong?” This topic introduces task graphs, behavior trees, skill libraries, and multi-step task composition for humanoid robots.
3.1 Task Graphs & Behavior Trees
Task Graphs
Task graphs represent tasks as nodes and edges:
- Nodes: actions or conditions (e.g., "navigate to room", "pick object").
- Edges: transitions (success/failure/conditions).
They are useful for:
- Visualizing complex workflows.
- Reasoning about dependencies between actions.
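A task graph of this kind can be sketched as a plain dictionary. The example below is a minimal illustration, not a prescribed format: the node names, the `TASK_GRAPH` layout, and the helpers `run_graph` and `reachable` are all hypothetical. Each node maps an outcome ("success" or "failure") to the next node, and `None` ends the task.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Graph = Dict[str, Dict[str, Optional[str]]]

# Hypothetical graph: node -> {outcome: next node}. None means "stop here".
TASK_GRAPH: Graph = {
    "navigate_to_room": {"success": "pick_object", "failure": "report_failure"},
    "pick_object": {"success": "report_success", "failure": "navigate_to_room"},
    "report_success": {"success": None, "failure": None},
    "report_failure": {"success": None, "failure": None},
}


def run_graph(graph: Graph, start: str, execute: Callable[[str], bool],
              max_steps: int = 20) -> List[Tuple[str, str]]:
    """Walk the graph from `start`, following the edge chosen by each outcome."""
    trace: List[Tuple[str, str]] = []
    node: Optional[str] = start
    while node is not None and len(trace) < max_steps:
        outcome = "success" if execute(node) else "failure"
        trace.append((node, outcome))
        node = graph[node][outcome]
    return trace


def reachable(graph: Graph, start: str) -> Set[str]:
    """Return every node reachable from `start` (a simple dependency check)."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(n for n in graph[node].values() if n is not None)
    return seen


def make_flaky_executor() -> Callable[[str], bool]:
    """Stand-in for real skills: the first pick attempt fails, later ones succeed."""
    picks = {"count": 0}

    def execute(node: str) -> bool:
        if node == "pick_object":
            picks["count"] += 1
            return picks["count"] > 1
        return True

    return execute


if __name__ == "__main__":
    for node, outcome in run_graph(TASK_GRAPH, "navigate_to_room", make_flaky_executor()):
        print(node, outcome)
```

The `max_steps` guard matters because failure edges can form cycles, as the edge from `pick_object` back to `navigate_to_room` does here.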
Behavior Trees
Behavior trees are a structured way to control behavior using:
- Root node: entry point for the tree.
- Composite nodes:
- Sequence: run children in order until one fails; succeeds only if every child succeeds.
- Selector (fallback): try children in order until one succeeds; fails only if every child fails.
- Decorator nodes:
- Modify behavior (e.g., retry, invert result, add timeouts).
- Leaf nodes:
- Actions (e.g., "navigate_to_pose", "pick_object").
- Conditions (e.g., "object_visible", "door_open").
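The node types above can be sketched in a few lines of Python. This is a simplified illustration rather than the API of any particular framework: the class names `Leaf`, `Sequence`, and `Selector` and the `demo` helper are hypothetical, and the sketch uses only two statuses, whereas frameworks such as BehaviorTree.CPP also have a RUNNING status for long-lived actions.

```python
from enum import Enum
from typing import Callable, List, Tuple


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Leaf:
    """Action or condition leaf: wraps a callable that returns True or False."""

    def __init__(self, name: str, fn: Callable[[], bool]):
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return Status.SUCCESS if self.fn() else Status.FAILURE


class Sequence:
    """Composite: ticks children in order and fails at the first failure."""

    def __init__(self, children: list):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Selector:
    """Composite (fallback): ticks children in order and succeeds at the first success."""

    def __init__(self, children: list):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


def demo() -> Tuple[Status, List[str]]:
    """Build a tiny tree with stub leaves and record which leaves were ticked."""
    ticked: List[str] = []

    def leaf(name: str, result: bool) -> Leaf:
        def fn() -> bool:
            ticked.append(name)
            return result
        return Leaf(name, fn)

    # Root selector: go through the door if it is open, otherwise open it.
    tree = Selector([
        Sequence([leaf("door_open", False), leaf("walk_through", True)]),
        leaf("open_door", True),
    ])
    return tree.tick(), ticked


if __name__ == "__main__":
    print(demo())
```

In the demo, the failing `door_open` condition stops the sequence before `walk_through` is ticked, and the selector then falls back to `open_door`.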
Advantages:
- Clear separation of decision logic from skills.
- Natural support for:
- Retry logic.
- Parallel and conditional branches.
- Modular reuse of subtrees.
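Retry logic and timeouts are usually expressed as decorator nodes that wrap a single child. The sketch below treats a node as any callable that returns `True` on success; the names `retry`, `invert`, and `with_deadline` are hypothetical. As a stated simplification, `with_deadline` only checks the elapsed time after the child returns and does not interrupt a child that is still running.

```python
import time
from typing import Callable

Node = Callable[[], bool]  # a node is any callable returning True on success


def retry(child: Node, max_attempts: int) -> Node:
    """Decorator: re-run the child until it succeeds or attempts run out."""
    def tick() -> bool:
        for _ in range(max_attempts):
            if child():
                return True
        return False
    return tick


def invert(child: Node) -> Node:
    """Decorator: turn success into failure and failure into success."""
    def tick() -> bool:
        return not child()
    return tick


def with_deadline(child: Node, seconds: float,
                  clock: Callable[[], float] = time.monotonic) -> Node:
    """Decorator: report failure if the child took longer than `seconds`.

    The clock is injectable so the behavior can be tested without sleeping.
    """
    def tick() -> bool:
        start = clock()
        ok = child()
        return ok and (clock() - start) <= seconds
    return tick


def succeeds_on(attempt_number: int) -> Node:
    """Test helper: a child that fails until its Nth call."""
    calls = {"n": 0}

    def child() -> bool:
        calls["n"] += 1
        return calls["n"] >= attempt_number
    return child


if __name__ == "__main__":
    pick = retry(succeeds_on(3), max_attempts=3)
    print("pick succeeded:", pick())
```

Because every decorator returns another node, decorators compose: `retry(with_deadline(child, 5.0), 3)` retries a child that must finish within five seconds on each attempt.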
Example pattern:
- Root
  - Sequence:
    - Condition: object identified?
    - Action: navigate to object.
    - Action: pick object.
    - Action: navigate to destination.
    - Action: place object.
If any step fails, the tree can:
- Switch to a fallback branch (e.g., re-scan, search another room).
- Abort and report failure to the high-level agent.
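The example pattern and its two recovery options can be sketched with functional combinators. Everything here is illustrative: the `world` dictionary stands in for perception and manipulation results, and `build_tree`, `sequence`, and `selector` are hypothetical helpers rather than calls into a real framework.

```python
from typing import Callable, Dict, List

Node = Callable[[], bool]  # a node is any callable returning True on success


def sequence(*children: Node) -> Node:
    # all() short-circuits, so later children are skipped after a failure.
    return lambda: all(child() for child in children)


def selector(*children: Node) -> Node:
    # any() short-circuits, so later children are skipped after a success.
    return lambda: any(child() for child in children)


def build_tree(world: Dict[str, bool], log: List[str]) -> Node:
    """Pick-and-deliver tree with a re-scan fallback and an abort branch."""

    def step(name: str, fn: Callable[[], bool]) -> Node:
        def tick() -> bool:
            ok = fn()
            log.append(f"{name}:{'ok' if ok else 'fail'}")
            return ok
        return tick

    def rescan() -> bool:
        # Hypothetical recovery: a re-scan finds the object if it is in the room.
        world["object_identified"] = world["object_in_room"]
        return world["object_identified"]

    locate = selector(
        step("object_identified", lambda: world["object_identified"]),
        step("rescan", rescan),
    )
    deliver = sequence(
        locate,
        step("navigate_to_object", lambda: True),
        step("pick_object", lambda: world["graspable"]),
        step("navigate_to_destination", lambda: True),
        step("place_object", lambda: True),
    )
    # The abort branch always fails, so the root reports failure upward.
    abort = step("report_failure", lambda: False)
    return selector(deliver, abort)


if __name__ == "__main__":
    events: List[str] = []
    state = {"object_identified": False, "object_in_room": True, "graspable": True}
    print(build_tree(state, events)(), events)
```

With the object initially unidentified, the `locate` selector falls through to `rescan`; if the pick fails instead, the remaining steps are skipped and the root ends in the abort branch.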
3.2 Skill Library Construction
Skills are reusable capabilities that behavior trees and planners can call. Most are atomic; a few, such as delivering an object, are composed from other skills.
Examples of humanoid skills:
- Pick up object
- Inputs: object ID or pose.
- Steps:
- Align base.
- Reach with arm (IK).
- Close gripper with appropriate force.
- Outcomes: success, or failure with a reason (e.g., object slipped).
- Place object
- Inputs: target pose or surface.
- Steps:
- Align approach vector.
- Lower object to surface.
- Release grip smoothly.
- Follow human
- Inputs: person ID or tracking target.
- Steps:
- Use perception to track human pose.
- Maintain safe distance using local planner.
- Deliver object
- Inputs: target location or person.
- Combines:
- Navigation skill.
- Pick/place skills.
- Inspect or scan area
- Inputs: region or room ID.
- Steps:
- Execute waypoint pattern.
- Log observations or changes.
Design guidelines:
- Each skill should:
- Have a clear ROS 2 interface (action, service, or topic).
- Publish its status and errors.
- Be testable in isolation (unit tests, simulation scenarios).
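One way to follow these guidelines is to give every skill the same small interface and result type, so it can be exercised without a robot. The sketch below is framework-free Python and makes several assumptions: `SkillResult`, `Skill`, and `PickObject` are hypothetical names, the motion steps are no-op stubs, and in a real system the `execute` call would typically sit behind a ROS 2 action or service.

```python
from dataclasses import dataclass
from typing import Any, Dict, Protocol


@dataclass(frozen=True)
class SkillResult:
    """Uniform outcome so callers can branch on success and log the reason."""
    success: bool
    reason: str = ""


class Skill(Protocol):
    """Interface every skill exposes to a behavior tree or planner."""
    name: str

    def execute(self, inputs: Dict[str, Any]) -> SkillResult: ...


class PickObject:
    """Hypothetical pick skill with stubbed motion steps."""
    name = "pick_object"

    def __init__(self, gripper_holds: bool = True):
        # Test hook standing in for real gripper feedback.
        self._gripper_holds = gripper_holds

    def execute(self, inputs: Dict[str, Any]) -> SkillResult:
        object_id = inputs.get("object_id")
        if not object_id:
            return SkillResult(False, "missing object_id")
        # Steps from the skill description; the first two are no-op stubs.
        self._align_base(object_id)
        self._reach_with_arm(object_id)
        if not self._close_gripper():
            return SkillResult(False, "object slipped")
        return SkillResult(True)

    def _align_base(self, object_id: str) -> None:
        pass

    def _reach_with_arm(self, object_id: str) -> None:
        pass

    def _close_gripper(self) -> bool:
        return self._gripper_holds


def run_skill(skill: Skill, inputs: Dict[str, Any]) -> str:
    """Format a status line of the kind a skill might publish."""
    result = skill.execute(inputs)
    detail = "" if result.success else f" ({result.reason})"
    return f"{skill.name}: {'success' if result.success else 'failure'}{detail}"


if __name__ == "__main__":
    print(run_skill(PickObject(), {"object_id": "cup_1"}))
    print(run_skill(PickObject(gripper_holds=False), {"object_id": "cup_1"}))
```

Returning a reason string alongside the success flag gives the behavior tree enough information to choose between retrying, falling back, and aborting.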
3.3 Chaining Skills into Tasks
Complex tasks are compositions of skills.
Examples:
- Find object → navigate → grasp → deliver
- Perception: detect object and estimate pose.
- Navigation: move to an approach pose.
- Manipulation: pick object.
- Navigation: move to delivery location.
- Manipulation: place object.
- Track person → follow → maintain distance
- Perception: detect and track human pose.
- Navigation: continuously update the goal so the robot follows the person's path.
- Control: enforce distance constraints and comfort zones.
- Scan room → detect changes → report findings
- Navigation: waypoint-based sweep.
- Perception: detect objects and compare with baseline.
- Reporting: summarize changes (e.g., "chair moved", "new object on table").
Behavior trees or task graphs orchestrate these chains:
- Condition checks before actions.
- Fallbacks when expected conditions are not met.
- Loops for retry and search behaviors.
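The first chain (find object, navigate, grasp, deliver) can be sketched as a list of steps that share data through a blackboard dictionary. All names and poses below are made up for illustration; the step functions are stubs for the real perception, navigation, and manipulation skills.

```python
from typing import Callable, Dict, List, Tuple

Blackboard = Dict[str, object]
Step = Tuple[str, Callable[[Blackboard], bool]]


def detect_object(bb: Blackboard) -> bool:
    bb["object_pose"] = (2.0, 1.0)  # made-up pose in the map frame
    return True


def navigate_to_approach(bb: Blackboard) -> bool:
    if "object_pose" not in bb:
        return False  # condition check: cannot navigate without a pose
    x, y = bb["object_pose"]
    bb["robot_pose"] = (x - 0.5, y)  # stop 0.5 m short of the object
    return True


def pick(bb: Blackboard) -> bool:
    bb["holding"] = "robot_pose" in bb
    return bool(bb["holding"])


def navigate_to_delivery(bb: Blackboard) -> bool:
    bb["robot_pose"] = bb["delivery_pose"]
    return True


def place(bb: Blackboard) -> bool:
    if not bb.get("holding"):
        return False
    bb["holding"] = False
    bb["delivered"] = True
    return True


PICK_AND_DELIVER: List[Step] = [
    ("detect_object", detect_object),
    ("navigate_to_approach", navigate_to_approach),
    ("pick", pick),
    ("navigate_to_delivery", navigate_to_delivery),
    ("place", place),
]


def run_chain(steps: List[Step], bb: Blackboard, max_retries: int = 1) -> Tuple[bool, str]:
    """Run steps in order, retrying each failed step up to `max_retries` times.

    Returns (True, "") on success, or (False, name_of_failed_step).
    """
    for name, fn in steps:
        for _ in range(1 + max_retries):
            if fn(bb):
                break
        else:
            return False, name  # every attempt at this step failed
    return True, ""


if __name__ == "__main__":
    board: Blackboard = {"delivery_pose": (0.0, 0.0)}
    print(run_chain(PICK_AND_DELIVER, board), board)
```

Returning the name of the failed step lets the caller decide which fallback applies, for example re-scanning when `navigate_to_approach` fails for lack of an object pose.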
3.4 Lab: Task Graph for Pick-and-Deliver
This lab focuses on implementing a task graph or behavior tree for a pick-and-deliver task.
Objectives
- Build a behavior tree that orchestrates navigation and manipulation skills.
- Handle common failure modes (object not found, grasp failure, blocked path).
Tasks
- Define Skills
- Ensure you have working skills for navigate_to_pose, pick_object, and place_object.
- Design the Behavior Tree
- Plan a tree that:
- Locates the object.
- Navigates to approach pose.
- Attempts to pick the object.
- Navigates to delivery location.
- Places the object.
- Add:
- Timeouts for each step.
- Retry policies (e.g., try pick up to N times).
- Fallbacks (e.g., re-scan environment if object not visible).
- Integrate with ROS 2
- Use a behavior-tree framework (e.g., BehaviorTree.CPP with ROS 2 integration).
- Implement action nodes that call existing ROS 2 actions/services.
- Test in Simulation
- Use your digital twin from Chapter 3.
- Run multiple scenarios:
- Ideal conditions.
- Partial occlusion.
- Slightly moved objects.
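Before wiring the tree to the simulator, it can help to dry-run the retry policy against stubbed scenarios and check that the logs look as expected. The sketch below is a stand-in only: `Scenario`, `RunLog`, and the numbers in `SCENARIOS` are invented, no simulator is involved, and timeouts are not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scenario:
    """Hypothetical stand-in for a simulation run; the numbers are invented."""
    name: str
    scans_until_visible: int  # scans needed before the object is detected
    failed_picks: int         # pick attempts that fail before one succeeds


@dataclass
class RunLog:
    scenario: str
    success: bool = False
    events: List[str] = field(default_factory=list)


def attempt(label: str, failures_before_success: int, max_attempts: int,
            log: RunLog) -> bool:
    """Retry policy: try up to `max_attempts` times, logging each outcome."""
    for n in range(1, max_attempts + 1):
        ok = n > failures_before_success
        log.events.append(f"{label} attempt {n}: {'ok' if ok else 'fail'}")
        if ok:
            return True
    return False


def run_pick_and_deliver(scenario: Scenario, max_scans: int = 3,
                         max_picks: int = 2) -> RunLog:
    """Locate (with re-scans), then pick (with retries), then deliver or abort."""
    log = RunLog(scenario.name)
    log.success = (
        attempt("scan", scenario.scans_until_visible - 1, max_scans, log)
        and attempt("pick", scenario.failed_picks, max_picks, log)
    )
    log.events.append("deliver: ok" if log.success else "abort: report failure")
    return log


SCENARIOS = [
    Scenario("ideal", scans_until_visible=1, failed_picks=0),
    Scenario("partial_occlusion", scans_until_visible=3, failed_picks=0),
    Scenario("slightly_moved_object", scans_until_visible=1, failed_picks=2),
]

if __name__ == "__main__":
    for scenario in SCENARIOS:
        run = run_pick_and_deliver(scenario)
        print(run.scenario, "success" if run.success else "failure")
        for event in run.events:
            print("  ", event)
```

Keeping one log object per run makes it straightforward to collect both success and failure cases for the report.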
Deliverables
- Behavior tree definition (XML/JSON/YAML or code).
- Logs from multiple runs (success and failure cases).
- Short report describing:
- Tree structure.
- How failures are handled.
- Lessons learned about task-level robustness.
This lab provides the task execution backbone that later topics will drive with natural language and higher-level reasoning.