If you want to look at the code and run some of it, check out the arc_robot_arm README for quick start details and usage instructions.
How to use this if you’re new
From the diagram and subsection descriptions, you can hopefully get a preliminary high-level understanding of each of the parts and how they fit together.
From this, you can then:
- Choose which subsection interests you most
- Follow the links in that subsection to get more context (optionally, ask questions in the Discord, hop in a voice chat, etc. for more fun convo about the area)
- Build practical experience with the related topics for your subsection by completing a step-by-step tutorial that covers them (either use Google to find one or ask for suggestions in Discord); this can be done concurrently with the previous step
- At this point, you should have enough knowledge and experience in your subsection to contribute to the current approach, and possibly even change it if you find better ways to do things (if so, update these docs!)
graph
  subgraph Behavior Planning
    BP1[Behavior Planner]
    BP2[Chess Engine or Human]
    %% BP3[RL Policy]
  end
  subgraph Low Level
    direction LR
    LL1[Protoarm ROS Controller]
    LL2[driver]
    LL3[servos]
  end
  subgraph Simulation
    direction LR
    S1[Gazebo ROS Controller]
    S2[Simulated robot joints from URDF]
  end
  subgraph Vision
    direction LR
    V1[YOLOv5 Object Detector ROS node]
    V2[Chessboard Detector]
  end
  subgraph Planning/Control
    direction LR
    PC1[Visual Servoing ROS node]
    PC2[MoveTo C++ ROS node]
    PC3[MoveIt]
  end
  %% Low level
  LL2 -- servo position PWM --> LL3
  LL3 -- potentiometer feedback \- not done --> LL2
  LL1 --> LL2
  %% Behavior Planning
  BP2 --> BP1
  BP1 --> PC1
  %% Vision
  V2 --> BP1
  V1 --> PC1
  %% Simulation
  S1 --> S2
  %% Planning/Control
  PC1 --> PC2
  PC2 --> PC3
  PC3 --> S1
  PC3 --> LL1
  %% Links
  click LL1 href "http://wiki.purduearc.com/wiki/robot-arm/software#controller" "Sensors"
  click LL2 href "http://wiki.purduearc.com/wiki/robot-arm/software#driver" "Sensors"
  click PC1 href "http://wiki.purduearc.com/wiki/robot-arm/software#visual-servoing" "Sensors"
  click PC2 href "http://wiki.purduearc.com/wiki/robot-arm/software#kinematics-and-planning" "Sensors"
  click PC3 "https://moveit.ros.org/assets/images/diagrams/moveit_pipeline.png" "Open this in a new tab" _blank
  click BP1 href "http://wiki.purduearc.com/wiki/robot-arm/software#behavior-planner" "Sensors"
  click S1 href "http://wiki.purduearc.com/wiki/robot-arm/software#simulation" "Sensors"
  click S2 href "http://wiki.purduearc.com/wiki/robot-arm/software#simulation" "Sensors"
  click V1 href "http://wiki.purduearc.com/wiki/robot-arm/software#chess-piece-detection" "Sensors"
  click V2 href "http://wiki.purduearc.com/wiki/robot-arm/software#chessboard-detection" "Sensors"
  classDef not_started fill:#ff8181
  classDef in_progress fill:#ffba82
  classDef done fill:#81ff9b
  class BP1,LL1,BP2,BP3,V2 not_started
  class PC1 in_progress
  class S1,S2,LL2,LL3,V1,PC2,PC3 done
graph
  l1[Not Started]
  l2[In Progress]
  l3[Done]
  classDef not_started fill:#ff8181
  classDef in_progress fill:#ffba82
  classDef done fill:#81ff9b
  class l1 not_started
  class l2 in_progress
  class l3 done
Behavior Planner
After the robot turns on, or at any given point in time, what should the robot do? This is the job of the behavior planner.
Overall, it should issue executable commands, each of which returns true or false depending on whether it completed successfully, and then issue the next command. A preliminary flowchart of the decisions the robot makes is modeled here:
graph
  a1[Scanning for change in board state]
  p1[Virtual Human]
  p2[Engine]
  a3[Identify and pick up piece]
  a4[Identify destination and place piece]
  a1 -- 2d picture of board --> p1
  a1 -- FEN Notation of board--> p2
  p1 -- Next move --> a3
  p2 -- Next move --> a3
  a3 --> a4 --> a1
To implement the behavior planner, a Finite State Machine (FSM) and/or a Behavior Tree (BT) can be used, which both have tradeoffs in modularity and reactivity, (read more, scroll to the last section for ).
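As a rough illustration of the FSM option, the flowchart above could be sketched like this. The state names, the `step` and `run` functions, and the stubbed commands are all hypothetical; they only mirror the flowchart and are not the project's implementation. Each command returns True or False, and the planner moves on only when the command succeeds.

```python
from enum import Enum, auto


class State(Enum):
    SCAN_BOARD = auto()   # scanning for a change in board state
    GET_MOVE = auto()     # ask the engine or virtual human for the next move
    PICK_PIECE = auto()   # identify and pick up the piece
    PLACE_PIECE = auto()  # identify the destination and place the piece


# Next state when the current state's command succeeds.
TRANSITIONS = {
    State.SCAN_BOARD: State.GET_MOVE,
    State.GET_MOVE: State.PICK_PIECE,
    State.PICK_PIECE: State.PLACE_PIECE,
    State.PLACE_PIECE: State.SCAN_BOARD,
}


def step(state, commands):
    """Run the command for `state`; advance on True, stay in the same state on False."""
    succeeded = commands[state]()
    return TRANSITIONS[state] if succeeded else state


def run(commands, start=State.SCAN_BOARD, max_steps=8):
    """Drive the FSM for a bounded number of steps and return the visited states."""
    visited = [start]
    state = start
    for _ in range(max_steps):
        state = step(state, commands)
        visited.append(state)
    return visited


if __name__ == "__main__":
    # Stub commands that always succeed, standing in for real robot actions.
    stubs = {s: (lambda: True) for s in State}
    for s in run(stubs, max_steps=4):
        print(s.name)
```

A failed command simply leaves the machine in the same state so it gets retried, which is the least reactive behavior possible; a Behavior Tree would let us express fallbacks more modularly.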
Chess Piece Detection
How does our robot determine which piece is which, and where that piece is relative to the arm? This is a common scenario in many real-world use cases, and object detection, as the name suggests, is used to detect the chess pieces.
The object detection stack consists of the YOLOv5 object detection model, trained on a custom dataset, which outputs 2D bounding boxes. These can then be converted to 3D coordinates relative to the arm by using the camera intrinsics and a coordinate transformation from the camera frame to the robot arm base frame.
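Here is a minimal sketch of that conversion, assuming a pinhole camera model and a known depth for the center of the box (for example, from a depth camera). The intrinsics `fx, fy, cx, cy`, the depth value, and the camera-to-base transform are made-up illustrative numbers, not calibration values from this project, and the function names are our own.

```python
def bbox_center(xmin, ymin, xmax, ymax):
    """Pixel coordinates (u, v) of the center of a 2D bounding box."""
    return (xmin + xmax) / 2.0, (ymin + ymax) / 2.0


def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth (meters) into the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def transform_point(rotation, translation, point):
    """Apply p_base = R * p_cam + t, with R a 3x3 nested list and t a 3-tuple."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 image.
    fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
    u, v = bbox_center(300, 200, 340, 280)  # a detected piece
    p_cam = pixel_to_camera(u, v, 0.5, fx, fy, cx, cy)
    # Hypothetical extrinsics: camera axes aligned with the base, offset 10 cm in x.
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    t = (0.1, 0.0, 0.0)
    print(transform_point(R, t, p_cam))
```

In a ROS setup the camera-to-base transform would normally come from the TF tree rather than being hard-coded like this.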
Inference runs through the yolov5_pytorch_ros ROS package, whose detector ROS node uses Python/PyTorch and can communicate detections and classes to other nodes, such as the one for visual servoing. It reaches around 15-20 FPS without a GPU on a Mac.
See chess_piece_detector for quick start details and usage instructions.
Chessboard Detection
Playing chess is more than just picking and placing pieces. The robot needs to actually beat the human. To do that, we need to know the exact state of the chessboard, so that another person or an overpowered engine can say what move to play next.
We can do this using computer vision techniques, with an image of the current chessboard as the input and the FEN notation of the board as the output. The following diagram shows this visually:
graph LR
  a1[Real chessboard] --> a2[2D chessboard]
  a2 --> a3[FEN Notation]
  a3 --> a4[Chess Engine]
  a2 --> a5[Virtual human player]
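To make the "2D chessboard to FEN" step concrete, here is a small sketch. It assumes the detector has already produced an 8x8 grid of FEN piece letters (rank 8 first, file a first, `None` for empty squares); that grid format and the function name are our own assumptions. It only produces the piece-placement field of a FEN string, not the side to move, castling rights, and so on.

```python
def board_to_fen_placement(board):
    """Convert an 8x8 grid (rank 8 first, file a first) to the FEN piece-placement field.

    Each cell holds a FEN piece letter ("P", "n", ...) or None for an empty square.
    """
    ranks = []
    for row in board:
        fen_row = ""
        empty_run = 0
        for cell in row:
            if cell is None:
                empty_run += 1  # count consecutive empty squares
                continue
            if empty_run:
                fen_row += str(empty_run)
                empty_run = 0
            fen_row += cell
        if empty_run:
            fen_row += str(empty_run)
        ranks.append(fen_row)
    return "/".join(ranks)


if __name__ == "__main__":
    E = None
    start = [
        list("rnbqkbnr"),
        list("pppppppp"),
        [E] * 8,
        [E] * 8,
        [E] * 8,
        [E] * 8,
        list("PPPPPPPP"),
        list("RNBQKBNR"),
    ]
    print(board_to_fen_placement(start))
    # prints rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR
```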
Some external projects we plan to use to complete the above:
Visual Servoing
Even if we can see where the pieces are with the camera, how does the robot arm move closer to the chess piece it wants to pick up? Without visual servoing, any error that the robot makes cannot be detected or corrected, resulting in lots of failed attempts. Visual servoing is a control technique that uses images as feedback to correct any errors that the robot makes.
We are using image-based visual servoing to localize the robot arm hand over an object in a graspable configuration, using images/2d bounding boxes as inputs to the system and servo commands as outputs.
Visual servoing system diagram (source)
See the protoarm_visual_servoing package for more details.
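The core idea of image-based visual servoing can be sketched as a proportional controller on the pixel error. This is only an illustration, not the algorithm in protoarm_visual_servoing: the function name, the gain, the tolerance, and the target pixel are all hypothetical, and the mapping from image axes to robot axes depends on how the camera is mounted.

```python
def servo_step(bbox, target_px, gain=0.002, tolerance_px=5.0):
    """One proportional visual-servoing step.

    bbox is (xmin, ymin, xmax, ymax) in pixels; target_px is the pixel where the
    object should appear when the hand is in a graspable configuration. Returns
    (vx, vy, done): a velocity command proportional to the pixel error, and a flag
    that is True once the error is within tolerance on both axes.
    """
    u = (bbox[0] + bbox[2]) / 2.0
    v = (bbox[1] + bbox[3]) / 2.0
    err_u = target_px[0] - u
    err_v = target_px[1] - v
    if abs(err_u) <= tolerance_px and abs(err_v) <= tolerance_px:
        return 0.0, 0.0, True
    return gain * err_u, gain * err_v, False


if __name__ == "__main__":
    # The detection is up and to the left of the target pixel, so both commands are positive.
    print(servo_step((100, 100, 140, 140), (320, 240)))
```

Calling this once per detection frame closes the loop: each new bounding box gives a fresh error, so mistakes made by the arm get corrected on the next frame.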
Kinematics and Planning
For us humans, it is simple to move our joints and pick something up in 3D space. A robot arm, however, knows nothing of 3D space, only numerical angles for each of its joints. So, we use inverse kinematics (IK), the mathematical process of converting 3D space coordinates to joint angles, to determine the final joint configuration for the robot arm.
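To get a feel for what IK does, here is a sketch for the simplest textbook case: a planar arm with two links. This is not the protoarm's kinematics (MoveIt handles that for us); the link lengths and function names are assumptions for illustration. The forward kinematics function is included only to check the result.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Inverse kinematics for a planar 2-link arm.

    Returns one of the two solutions (the one with theta2 >= 0) as
    (theta1, theta2) in radians, or None if (x, y) is out of reach.
    """
    d2 = x * x + y * y
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        return None  # target outside the reachable workspace
    theta2 = math.acos(cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2


def two_link_fk(theta1, theta2, l1, l2):
    """Forward kinematics: joint angles back to the end-effector position."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


if __name__ == "__main__":
    angles = two_link_ik(0.5, 1.2, 1.0, 1.0)
    print(angles, two_link_fk(*angles, 1.0, 1.0))
```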
Now what happens between the start and final position? That’s the job of the motion planner. It determines a safe collision-free trajectory for each joint and creates the plan. Then, the plan is executed, either in real life or in simulation.
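In the end, a plan is a sequence of joint waypoints. The sketch below builds the simplest possible one by linear interpolation in joint space. It does no collision checking and is not what MoveIt or OMPL do internally; the function name and parameters are our own, for illustration only.

```python
def interpolate_joint_trajectory(start, goal, num_points):
    """Linearly interpolate between two joint configurations (angles in radians).

    Returns a list of num_points waypoints, including start and goal.
    """
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    if len(start) != len(goal):
        raise ValueError("start and goal must have the same number of joints")
    trajectory = []
    for k in range(num_points):
        s = k / (num_points - 1)  # goes from 0.0 at start to 1.0 at goal
        trajectory.append([a + s * (b - a) for a, b in zip(start, goal)])
    return trajectory


if __name__ == "__main__":
    for waypoint in interpolate_joint_trajectory([0.0, 0.0], [1.0, 2.0], 3):
        print(waypoint)
```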
As of now, all the kinematics and planning heavy lifting is done by MoveIt. The protoarm_kinematics package houses a wrapper written in C++ that abstracts the process of sending JointState or Pose goals.
The wrapper is interfaced externally using the move_to node. Refer to the test_kinematics rospy file in protoarm_kinematics/src for usage.
This package is also where we would keep our custom kinematics plugin and planning library if we choose to make it from scratch, instead of the default KDL kinematics plugin and OMPL motion planning library.
See protoarm_kinematics for quick start details and usage instructions.
Controller
protoarm_control is the ROS package intended to ensure that MoveIt execution commands are carried out as expected, and to a known degree of certainty, in the real world.
Right now, the driver communicates directly with MoveIt because the protoarm_control package doesn’t exist yet and because we do not have servo feedback.
Given that the protoarm uses some of the cheapest servos on the market, how can we still get dependable sub-millimeter precision to do tasks reliably? One way to do so is to hack our servos to add encoders, and use a control scheme that can take velocity and torque into account.
Inspired by Adam’s Servo Project.
Driver
protoarm_driver is the Arduino code that actually interfaces with the servos and encoders. It sets joint limits, does coordinate frame conversions from the URDF to the actual robot, converts MoveIt joint angles in \((-\pi,\pi)\) to servo angles in \((0^\circ,180^\circ)\), and executes servo commands by sending PWM signals to the servos.
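The angle conversion can be sketched as a clamped linear rescale. The real driver is Arduino code and may do this differently (for example, with offsets for each joint); the function names here are hypothetical, and the pulse-width bounds of 544 and 2400 microseconds are assumed values (the Arduino Servo library defaults), not measurements from our servos.

```python
import math


def moveit_to_servo_deg(angle_rad, in_min=-math.pi, in_max=math.pi,
                        out_min=0.0, out_max=180.0):
    """Linearly rescale a joint angle in radians to a servo angle in degrees.

    Inputs outside [in_min, in_max] are clamped, acting as a simple joint limit.
    """
    clamped = max(in_min, min(in_max, angle_rad))
    fraction = (clamped - in_min) / (in_max - in_min)
    return out_min + fraction * (out_max - out_min)


def servo_deg_to_pulse_us(angle_deg, min_us=544, max_us=2400):
    """Map a servo angle in [0, 180] degrees to a PWM pulse width in microseconds."""
    angle_deg = max(0.0, min(180.0, angle_deg))
    return round(min_us + (angle_deg / 180.0) * (max_us - min_us))


if __name__ == "__main__":
    deg = moveit_to_servo_deg(0.0)
    print(deg, servo_deg_to_pulse_us(deg))
```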
Simulation
Simulation has lots of benefits, ranging from faster software development and testing to realistic environments for reinforcement learning and quick proof-of-concept tests.
Our arm is simulated in Gazebo with a Realsense D435 camera (realsense_ros_gazebo), chessboard, and chess pieces (chessboard_gazebo). The robot arm and the camera are represented in URDF.
Sensors, actuation, Gazebo plugins, and more are specifications that can be added. For the arm, these specifications exist in the .xacro files in the urdf folder of the protoarm_description ROS package.
Gazebo uses SDF models (in the models folder of chessboard_gazebo) to represent static assets in the simulation like the chess pieces and chessboard. These objects are then spawned into a Gazebo world (along with the robot URDF), represented with a
See protoarm_bringup for quick start details and usage instructions.
- In depth tutorial playlist for C++
- Recommended topics:
- if/else, loops, functions, classes
- Smart pointers
- Dynamic Arrays (std::vector)
- Recommended topics:
- Very useful numeric libraries
- Eigen: Extremely efficient matrix math library
- Important topics to understand:
- Basics are good - variables + logic, functions, classes
Must use when working with large arrays (e.g., images)
- Important topics to understand:
- Creating arrays
- slicing + indexing
- linear algebra
Use for computer vision and image transformations like color detection, tracking, etc
- Important topics to understand:
- image transformation
- read/write images from file
- Top hit for “arduino tutorial” on Google should work
- Important topics (Googlable)
- Creating a world
- Adding assets
- ROS control in Gazebo
- Cameras in Gazebo
Good coding practices