- ROS basics
- Gazebo basics
- Turtlebot3 simulation
- Test on the real Turtlebot3
- Turtlebot3 MOGI
- Line following
- Neural network
- Test on the real robot
ROS, or Robot Operating System, is an open-source framework designed to facilitate the development of robotic applications. It provides a collection of tools, libraries, and conventions that simplify the process of designing complex robot behaviors across a wide variety of robotic platforms.
ROS was initially developed in 2007 by the Stanford Artificial Intelligence Laboratory and continued by Willow Garage, with the goal of providing a common platform for research and development in robotics. The primary motivation was to create a standard framework that could support a broad range of robotic applications, promote code reuse, and foster collaboration within the robotics community.
Key reasons for ROS development include:
- Standardization: Creating a common platform that simplifies the integration of different hardware and software components.
- Modularity: Enabling the development of modular and reusable software components (nodes) that can be easily shared and adapted for various robotic systems.
- Community Collaboration: Encouraging collaboration among researchers and developers, resulting in a vast collection of tools and libraries.
ROS 2 was developed to address the limitations of ROS 1 and meet the growing demands of industrial and commercial robotics applications. Development began around 2014 and aimed to enhance the capabilities of ROS, particularly in areas such as security, real-time performance, and support for multi-robot systems. In practice, the biggest difference is in the underlying middleware: ROS1 uses a custom transport layer and message-passing system that was not designed for real-time or distributed applications (see ROS1's roscore
The latest ROS1 release is ROS Noetic, which was intended to be used on Ubuntu 20.04. It reaches EOL in May 2025, together with Ubuntu 20.04.
Ubuntu 24.04 LTS
In the course we'll use ROS2 Jazzy Jalisco, which requires Ubuntu 24.04 for the smoothest operation.
You have a couple of options, but the most recommended is the native installation of the operating system - external SSD, dual boot, etc.
- Native install, the most recommended way. You will learn how to set up the environment, there won't be difficulties with GPU acceleration and many more advantages.
- Windows 11 WSL2 (Windows Subsystem for Linux), see instructions. It's a straightforward way if you want to stay within the Windows environment. You can still learn how to set up the environment, but GUI applications and 3D acceleration can be more challenging.
- Virtual machine, VMware Fusion is now free for personal use. Less flexible than WSL2 on Windows 11 but works well on macOS. 3D acceleration can be challenging.
- Docker container. At first it might look like an easy way to use any ROS distribution on any host operating system, but it gets more and more challenging if we need GUI applications or 3D acceleration, and working within the container can be confusing for beginners. You might miss some important experience with setting up the environment if using a pre-configured container image. I only recommend this way for experienced Docker users.
- Using an online environment e.g. The Construct. It looks promising that you don't have to install any special software, but you won't gain experience with setting up the environment. It can be difficult to cherry pick the software versions you need and accessing GUI applications through the web interface is a poor experience.
The options 1 and 2 are the most practical and preferred ways to use ROS. In an exotic case, if you want to run Ubuntu 24.04 and ROS2 Jazzy on macOS and Apple silicon this is a very good tutorial.
Pro tip: if you want to mount directories from your host system into your guest Ubuntu 24.04 running in VMware Fusion, you can use the following command (more details on this link):
/usr/bin/vmhgfs-fuse .host:/BME/ROS2-lessons /home/david/ros2_ws/src/ROS2-lessons -o subtype=vmhgfs-fuse,allow_other
Visual Studio Code
The recommended code editor during the course is Visual Studio Code, but it's up to you if you'd rather use a different editor. Depending on your Ubuntu install method you might install it natively on Ubuntu, in your virtual environment or on your host operating system.
Recommended extensions to install:
- Markdown All in One
- C/C++
- Python
- CMake Tools
- Remote - SSH - if you work on physical robots, too
- Remote - WSL - if you do the course using WSL2
GitHub and a git client
The course materials are available on GitHub, and the submissions of your final projects shall also use GitHub. You'll need a very good excuse to use alternative git solutions like GitLab.
So I encourage everyone to register your GitHub accounts, and if you are there don't forget to sign up for the GitHub Student Developer Pack which gives you a bunch of powerful developer tools for free.
I recommend using a graphical git client that can boost your experience with git. In my opinion the best one is GitKraken, which is not free software, but you get the pro version as part of the GitHub Student Developer Pack! If you prefer using git as a CLI tool then no worries, it's absolutely all right.
Markdown is not a standalone software but rather a lightweight, plain-text formatting language used to create formatted documents. It was created by John Gruber in 2004 with the goal of being easy to read and write, using simple syntax to style text, create lists, links, images, and more. It is widely used for writing documentation, readme files, and content for static websites.
Basic Markdown Syntax
- Headings: # Heading 1, ## Heading 2, etc.
- Bold: **bold text** or __bold text__
- Italic: *italic text* or _italic text_
- Lists:
  - Unordered: - Item or * Item
  - Ordered: 1. Item
- Links: [Link text](URL)
- Images: ![Alt text](Image URL)
- Code: Inline code using single backticks or code blocks using triple backticks (```)
GitHub Flavored Markdown (GFM)
GitHub Flavored Markdown (GFM) is a variant of Markdown used by GitHub to provide additional features and syntax that are not available in standard Markdown. It includes:
- Tables:
| Column 1 | Column 2 |
|----------|----------|
| Row 1 | Data |
| Row 2 | Data |
- Task lists:
- [x] Task 1
- [ ] Task 2
- Strikethrough:
~~strikethrough text~~
- Syntax highlighting in a specific language:
```python
def hello_world():
    print("Hello, world!")
```
- Tables of Contents
- @mentions for users, references to issues, and pull requests using #number
Most of the tips and tricks that you might need for your own project documentation can be found in the source of this readme that you read right now, feel free to use any snippets from it!
A good terminal
It's up to you which terminal tool you use, but I strongly recommend one that supports multiple split panes in a single unified window, because we will use a lot of terminals! On Linux, I can recommend terminator
In case you use WSL2, the built-in Windows terminal also supports multiple panes and works really well!
And finally, install ROS2 Jazzy
ROS has always had very good and detailed installation guides, and ROS2's Jazzy release is no different.
The installation steps can be found here; with Ubuntu 24.04 it can be installed simply through pre-built, binary deb packages.
After installing it we have to set up our ROS2 environment with the following command:
source /opt/ros/jazzy/setup.bash
By default, we have to run this command in every new shell session we start, but there is a powerful tool in Linux for such use cases. The .bashrc file is always in the user's home directory and it is used for user-specific settings for our shell sessions. You can edit .bashrc directly in a terminal window with a basic text editor, like nano:
david@david-ubuntu24:~$ nano .bashrc
Here, you can add your custom user-specific settings at the end of the file; they will be executed every time you start a new shell session. I created an example gist that you can add to the end of your file and use during the course.
ROS2 Jazzy has an even more detailed tutorial about setting up your environment, you can check it out, too!
Your ROS2 install comes with a couple of good examples as you can also find it on the install page.
Let's try them!
The following command starts a simple publisher written in C++. A publisher is a node that is responsible for sending messages with a certain type over a specific topic (in this example the topic's name is chatter and the type is a string). A topic is a communication pipeline in the publish-subscribe communication model where a single message is sent to multiple subscribers, unlike message queues, which are point-to-point models where a single message is sent to a single consumer. Publishers broadcast messages to topics, and subscribers listen to those topics to receive a copy of the message.
Publish-subscribe models are asynchronous, one-to-many or many-to-many interactions where the publishers don't know how many subscribers there are (if any). Therefore, a publisher never expects any response or confirmation from the subscribers.
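To make the one-to-many idea more tangible, here is a minimal in-process sketch of a topic-based broker in plain Python. It is only a toy: the `ToyBroker` class and its methods are made up for this illustration and are not part of the ROS2 API, and real ROS2 delivery is asynchronous and goes through the middleware.

```python
from collections import defaultdict


class ToyBroker:
    """In-process stand-in for a topic-based transport (hypothetical, not the ROS2 API)."""

    def __init__(self):
        # Maps a topic name to the list of subscriber callbacks registered on it
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber gets its own delivery of the message.
        # The publisher gets nothing back: no count, no confirmation.
        for callback in self._subscribers[topic]:
            callback(message)


broker = ToyBroker()
received_a = []
received_b = []

# Publishing with no subscribers is not an error, the message simply goes nowhere
broker.publish("chatter", "Hello World: 0")

# Two independent subscribers on the same topic
broker.subscribe("chatter", received_a.append)
broker.subscribe("chatter", received_b.append)
broker.publish("chatter", "Hello World: 1")

print(received_a)
print(received_b)
```

Both lists end up with the second message only, because the first one was published before anybody subscribed.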
Now, let's run the demo publisher written in C++:
ros2 run demo_nodes_cpp talker
Your output should look like this:
david@david-ubuntu24:~$ ros2 run demo_nodes_cpp talker
[INFO] [1727116062.558281395] [talker]: Publishing: 'Hello World: 1'
[INFO] [1727116063.558177802] [talker]: Publishing: 'Hello World: 2'
[INFO] [1727116064.558010534] [talker]: Publishing: 'Hello World: 3'
[INFO] [1727116065.557939861] [talker]: Publishing: 'Hello World: 4'
[INFO] [1727116066.557849645] [talker]: Publishing: 'Hello World: 5'
Let's start a subscriber - written in Python - in another terminal window, which subscribes to the chatter topic and listens to the publisher node's messages:
david@david-ubuntu24:~$ ros2 run demo_nodes_py listener
[INFO] [1727116231.574662048] [listener]: I heard: [Hello World: 170]
[INFO] [1727116232.560517676] [listener]: I heard: [Hello World: 171]
[INFO] [1727116233.558907367] [listener]: I heard: [Hello World: 172]
[INFO] [1727116234.560768278] [listener]: I heard: [Hello World: 173]
[INFO] [1727116235.559821377] [listener]: I heard: [Hello World: 174]
[INFO] [1727116236.559993767] [listener]: I heard: [Hello World: 175]
Now that both nodes are running, we can try a few useful tools. The first one lets us know what kind of nodes are running in our ROS2 system:
ros2 node list
Which gives us the following output:
david@david-ubuntu24:~/ros2_ws$ ros2 node list
If we want to know more about one of our nodes, we can use the ros2 node info /node command:
david@david-ubuntu24:~/ros2_ws$ ros2 node info /listener
/chatter: std_msgs/msg/String
/parameter_events: rcl_interfaces/msg/ParameterEvent
/rosout: rcl_interfaces/msg/Log
Service Servers:
/listener/describe_parameters: rcl_interfaces/srv/DescribeParameters
/listener/get_parameter_types: rcl_interfaces/srv/GetParameterTypes
/listener/get_parameters: rcl_interfaces/srv/GetParameters
/listener/get_type_description: type_description_interfaces/srv/GetTypeDescription
/listener/list_parameters: rcl_interfaces/srv/ListParameters
/listener/set_parameters: rcl_interfaces/srv/SetParameters
/listener/set_parameters_atomically: rcl_interfaces/srv/SetParametersAtomically
Service Clients:
Action Servers:
Action Clients:
At the moment, the most interesting detail we can gather about a node is if it's subscribing or publishing to any topic. In later lessons we'll learn more about parameters and services.
In a very similar way, we can also list all of our topics with ros2 topic list
david@david-ubuntu24:~/ros2_ws$ ros2 topic list
And we can get more details about a certain topic with the ros2 topic info /topic
david@david-ubuntu24:~/ros2_ws$ ros2 topic info /chatter
Type: std_msgs/msg/String
Publisher count: 1
Subscription count: 1
Another powerful tool is rqt_graph, which helps us visualize the nodes and topics in a graph. It can be used as a standalone tool or as part of rqt, which can be used to build a complete dashboard to monitor and control your nodes. We'll spend a lot of time with it; at the moment let's just see the message monitoring function:
Let's see another built in example which is a simple 2D plotter game.
In case it's not automatically installed, you can install it with the following command:
sudo apt install ros-jazzy-turtlesim
To run the main node just execute the following command:
ros2 run turtlesim turtlesim_node
And in another terminal start its remote controller, you can simply drive the turtle with the arrows:
ros2 run turtlesim turtle_teleop_key
We can use the same tools as before to see the running nodes and topics; here is how it looks in rqt_graph
We should notice two important things:
- turtlesim is more complex than the previous example, with multiple services and parameters that we'll check at the end of this lesson.
- the turtle is controlled with a message which is a 6D vector in space. We'll use this exact same message type in the future to drive our simulated robots.
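The "6D vector" can be pictured as three linear and three angular velocity components. The sketch below mirrors that layout with plain dataclasses, assuming the message in question is `geometry_msgs/msg/Twist` (the text above does not name it). The classes and the `planar_command()` helper are stand-ins written for this example, not the generated ROS2 message classes.

```python
from dataclasses import dataclass, field


@dataclass
class Vector3:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


@dataclass
class Twist:
    """Plain-Python mirror of the assumed field layout (linear + angular velocity)."""
    linear: Vector3 = field(default_factory=Vector3)   # velocity along x, y, z
    angular: Vector3 = field(default_factory=Vector3)  # rotation rate about x, y, z

    def as_6d(self):
        return (self.linear.x, self.linear.y, self.linear.z,
                self.angular.x, self.angular.y, self.angular.z)


def planar_command(forward, turn):
    """Hypothetical helper: a robot moving on a plane only uses linear.x and angular.z."""
    cmd = Twist()
    cmd.linear.x = forward
    cmd.angular.z = turn
    return cmd


cmd = planar_command(forward=2.0, turn=0.5)
print(cmd.as_6d())
```

For a turtle or a wheeled robot on flat ground, four of the six components stay at zero, which is why a keyboard teleop only needs to set two numbers.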
Now let's move on to create our own nodes!
To create, build and run custom nodes we need packages, but first we need a workspace where we'll maintain our future packages. There are 2 new terms we must learn about ROS2 workspaces:
- ament provides the underlying build system and tools specifically for ROS2 packages. ament_cmake is a CMake-based build system for C/C++ nodes and ament_python provides the tools for packing and installing python nodes and libraries.
- colcon (COmmand Line COLlectioN) is a general-purpose tool to build and manage entire workspaces with various build systems, including ament, cmake, make, and more.
It means that our ROS2 workspace will be a colcon workspace which - in the background - will use ament for building the individual packages.
If you have experience with ROS1,
replaces the old catkin
Let's create our workspace inside our user's home directory:
mkdir -p ~/ros2_ws/src
cd ~/ros2_ws
A workspace must have a src folder where we maintain the source files of our packages. While building the workspace, colcon will create folders for the deployment of binaries and other output files.
Let's go into the src folder and create our first python package:
ros2 pkg create --build-type ament_python bme_ros2_tutorials_py
During package creation we should define whether it's a C/C++ (ament_cmake) or a python (ament_python) package. If we don't do it, the default is always ament_cmake.
We'll put our python scripts under bme_ros2_tutorials_py, which is an automatically created folder with the same name as our package. It already has an empty file __init__.py, let's add our first node here: hello_world.py
We can create files in Linux in several different ways, just a few examples:
- Right click in the folder using the desktop environment
- Through the development environment, in our case Visual Studio Code
- From command line in the current folder using the touch command: touch hello_world.py
At this point our workspace should look like this (other files and folders are not important at this point):
david@david-ubuntu24:~/ros2_ws$ tree -L 4
└── src
    └── bme_ros2_tutorials_py
        ├── bme_ros2_tutorials_py
        │   ├── __init__.py
        │   └── hello_world.py
        ├── package.xml
        └── setup.py
It's always recommended to fill in your name and email address and the license fields in your package.xml files. I personally prefer a highly permissive license in non-commercial packages of mine, like BSD or Apache License 2.0
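As a small sanity check, a script can flag fields that were left unfilled. The sketch below parses an example manifest with the standard library. The XML content is invented for this illustration, and it assumes the manifest uses the `description`, `maintainer` and `license` tags and that untouched fields still contain the word "TODO".

```python
import xml.etree.ElementTree as ET

# Hypothetical package.xml content, written by hand for this example
PACKAGE_XML = """<?xml version="1.0"?>
<package format="3">
  <name>bme_ros2_tutorials_py</name>
  <version>0.0.0</version>
  <description>TODO: Package description</description>
  <maintainer email="jane@example.com">Jane Doe</maintainer>
  <license>Apache-2.0</license>
</package>
"""


def unfinished_fields(xml_text):
    """Return the tags that are missing, empty, or still contain a 'TODO' placeholder."""
    root = ET.fromstring(xml_text)
    problems = []
    for tag in ("description", "maintainer", "license"):
        element = root.find(tag)
        text = (element.text or "").strip() if element is not None else ""
        if not text or "TODO" in text:
            problems.append(tag)
    return problems


print(unfinished_fields(PACKAGE_XML))
```

Here only the description is reported, since the maintainer and license were filled in.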
#!/usr/bin/env python3

# Main entry point, args is a parameter that is used to pass arguments to the main function
def main(args=None):
    print("Hello, world!")

# Check if the script is being run directly
if __name__ == '__main__':
    main()
Although this is a python script that doesn't require any compilation, we have to make sure that ament will pack, copy and install our node. It's important to understand that we are not running python scripts directly from the source folder!
Let's edit setup.py, which was automatically generated when we defined that our package will use ament_python.
Add an entry point for our python node. An entry point describes the folder, the filename (without .py) and the main entry point within the script:
    'console_scripts': [
        'py_hello_world = bme_ros2_tutorials_py.hello_world:main'
    ],
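To see how such an entry point string breaks down into its parts, here is a small sketch that splits it and resolves the target function. The helper names `parse_entry_point()` and `resolve()` are made up for this example, and the resolution step only approximates what the installed launcher does.

```python
import importlib


def parse_entry_point(spec):
    """Split 'name = package.module:function' into its three parts."""
    name, _, target = spec.partition("=")
    module_path, _, function_name = target.strip().partition(":")
    return name.strip(), module_path, function_name


def resolve(module_path, function_name):
    """Import the module and fetch the callable it names."""
    return getattr(importlib.import_module(module_path), function_name)


name, module_path, function_name = parse_entry_point(
    "py_hello_world = bme_ros2_tutorials_py.hello_world:main"
)
print(name)           # executable name used with 'ros2 run'
print(module_path)    # folder (package) and filename without .py
print(function_name)  # function to call inside the script

# Resolution is demonstrated with a standard-library target, because the tutorial
# package is only importable after the workspace has been built and sourced
dumps = resolve(*parse_entry_point("to_json = json:dumps")[1:])
print(dumps({"ok": True}))
```

The part before the equals sign is what we later type after the package name in `ros2 run`.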
Our first node within our first package is ready to be built! The build must always be initiated in the root of our workspace!
cd ~/ros2_ws
And here we execute the colcon build
After a successful build we have to update our environment to make sure the ROS2 cli tools are aware of any new packages. To do this we have to run the following command:
source install/setup.bash
As we did with the base ROS2 environment, we can add this to the .bashrc so it'll be automatically sourced every time we open a terminal:
source ~/ros2_ws/install/setup.bash
And now we are ready to run our first node:
ros2 run bme_ros2_tutorials_py py_hello_world
Although we could run our first node, it was just a plain python script, not using any ROS API. Let's upgrade hello world to a more ROS-like hello world. We import rclpy, which is the ROS2 python API, and we start using the most basic functions of rclpy like init(), create_node() and shutdown(). If you already want to do a deep-dive in the API functions you can find everything here.
#!/usr/bin/env python3
import rclpy # Import ROS2 python interface

# Main entry point, args is a parameter that is used to pass arguments to the main function
def main(args=None):
    rclpy.init(args=args) # Initialize the ROS2 python interface
    node = rclpy.create_node('python_hello_world') # Node constructor, give it a name
    node.get_logger().info("Hello, ROS2!") # Use the ROS2 node's built in logger
    node.destroy_node() # Node destructor
    rclpy.shutdown() # Shut the ROS2 python interface down

# Check if the script is being run directly
if __name__ == '__main__':
    main()
We don't have to do anything with setup.py, the entry point is already there, but we have to re-build the colcon workspace!
After the build we can run our node:
ros2 run bme_ros2_tutorials_py py_hello_world
Let's make our first publisher in python. We create a new file in the bme_ros2_tutorials_py folder: publisher.py
We start expanding our knowledge about the ROS2 API step by step with publishing related functions (create_publisher() and publish()):
#!/usr/bin/env python3
import rclpy
from std_msgs.msg import String # Import 'String' from ROS2 standard messages
import time

def main(args=None):
    rclpy.init(args=args) # Initialize the ROS2 python interface
    node = rclpy.create_node('python_publisher')
    # Register the node as publisher
    # It will publish 'String' type to the topic named 'topic' (with a queue size of 10)
    publisher = node.create_publisher(String, 'topic', 10)

    msg = String() # Initialize msg as a 'String' instance
    i = 0
    while rclpy.ok(): # Breaks the loop on ctrl+c
        msg.data = f'Hello, world: {i}' # Write the actual string into msg's data field
        i += 1
        node.get_logger().info(f'Publishing: "{msg.data}"')
        publisher.publish(msg) # Let the node publish the msg according to the publisher setup
        time.sleep(0.5) # Python wait function in seconds

    # Clean up and shut down
    node.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()
We have to edit setup.py, registering our new node as an entry point:
    'console_scripts': [
        'py_hello_world = bme_ros2_tutorials_py.hello_world:main',
        'py_publisher = bme_ros2_tutorials_py.publisher:main'
    ],
Don't forget to rebuild the workspace and we can run our new node:
david@david-ubuntu24:~$ ros2 run bme_ros2_tutorials_py py_publisher
[INFO] [1727526317.470055907] [python_publisher]: Publishing: "Hello, world: 0"
[INFO] [1727526317.971461827] [python_publisher]: Publishing: "Hello, world: 1"
[INFO] [1727526318.473896872] [python_publisher]: Publishing: "Hello, world: 2"
[INFO] [1727526318.977439178] [python_publisher]: Publishing: "Hello, world: 3"
We can observe the published topic through rqt's topic monitor:
Or we can use a simple but powerful tool, the topic echo:
david@david-ubuntu24:~$ ros2 topic echo /topic
data: 'Hello, world: 23'
data: 'Hello, world: 24'
data: 'Hello, world: 25'
The publisher node above is very simple and looks exactly like how we historically implemented nodes in ROS1. But ROS2 provides more powerful API functions and also places a greater emphasis on object-oriented programming. So let's create another publisher in a more OOP way, using the timer functions (create_timer()) of the ROS2 API. The other important API function is rclpy.spin(node), which keeps the node running until we quit it with ctrl+c in the terminal.
#!/usr/bin/env python3
import rclpy
from rclpy.node import Node # Import ROS2 Node as parent for our own node class
from std_msgs.msg import String

class MyPublisherNode(Node):
    def __init__(self):
        super().__init__("python_publisher_oop") # Parent constructor sets the node name (name chosen here for illustration)
        self.publisher_ = self.create_publisher(String, 'topic', 10)
        self.timer = self.create_timer(0.5, self.timer_callback) # Timer callback, period in seconds, not frequency!
        self.i = 0
        self.msg = String()
        self.get_logger().info("Publisher OOP has been started.")

    def timer_callback(self): # Timer callback function implementation
        self.msg.data = f"Hello, world: {self.i}"
        self.i += 1
        self.get_logger().info(f'Publishing: "{self.msg.data}"')
        self.publisher_.publish(self.msg) # Send the message through the publisher created in the constructor

def main(args=None):
    rclpy.init(args=args)
    node = MyPublisherNode() # node is now a custom class based on ROS2 Node
    rclpy.spin(node) # Keeps the node running until it's closed with ctrl+c
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
As we did previously, add the new script's entry point to setup.py, build the workspace and run our new node:
ros2 run bme_ros2_tutorials_py py_publisher_oop
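Since the timer takes a period and not a frequency, it's easy to mix the two up when you think in "messages per second". A tiny conversion helper avoids that. This is a plain-Python sketch with hypothetical helper names, not part of rclpy.

```python
def period_from_rate(rate_hz):
    """Convert a publishing rate in Hz to a timer period in seconds."""
    if rate_hz <= 0.0:
        raise ValueError("rate_hz must be positive")
    return 1.0 / rate_hz


def tick_times(period_s, count):
    """Nominal times (seconds after the timer is created) of the first `count` callbacks."""
    return [round(period_s * (n + 1), 6) for n in range(count)]


period = period_from_rate(2.0)  # 2 messages per second
print(period)                   # 0.5
print(tick_times(period, 4))    # [0.5, 1.0, 1.5, 2.0]
```

With this helper, a call like `create_timer(period_from_rate(2.0), callback)` would express the same 0.5 s period used in the node above.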
Let's create a new file subscriber.py in our python package (bme_ros2_tutorials_py). First we make a very simple implementation and after that we'll implement a more OOP version of it again. We further extend our knowledge with more API functions related to subscriptions (create_subscription()):
#!/usr/bin/env python3
import rclpy
from std_msgs.msg import String

def main(args=None):
    rclpy.init(args=args)
    node = rclpy.create_node('python_subscriber')

    def subscriber_callback(msg): # Subscriber callback will be invoked every time a message arrives on the topic it has subscribed to
        node.get_logger().info(f"I heard: {msg.data}")

    # Register the node as a subscriber on a certain topic: 'topic' (with a certain data type: String)
    # and assign the callback function that will be invoked when a message arrives to the topic
    # with a queue size of 10 which determines how many incoming messages can be held in the subscriber’s
    # queue while waiting to be processed by the callback function
    subscriber = node.create_subscription(String, 'topic', subscriber_callback, 10)
    node.get_logger().info("Subsciber has been started.")

    rclpy.spin(node) # Keeps the node running until it's closed with ctrl+c
    node.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()
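The effect of the queue size can be pictured with a bounded queue that keeps only the newest entries. The sketch below uses `collections.deque` as a stand-in and assumes keep-last behaviour, where the oldest message is dropped once the queue is full. The depth of 3 is chosen only to keep the output short.

```python
from collections import deque

QUEUE_DEPTH = 3  # the subscriber above uses 10

# A deque with maxlen silently discards the oldest item when a new one
# arrives while it is full
queue = deque(maxlen=QUEUE_DEPTH)

# Five messages arrive while the (slow) callback has not processed any yet
for i in range(5):
    queue.append(f"Hello, world: {i}")

print(list(queue))
```

The first two messages never reach the callback, which is what a subscriber that cannot keep up with its publisher would experience.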
Add the node to the setup.py file as a new entry point:
    'console_scripts': [
        'py_hello_world = bme_ros2_tutorials_py.hello_world:main',
        'py_publisher = bme_ros2_tutorials_py.publisher:main',
        'py_publisher_oop = bme_ros2_tutorials_py.publisher_oop:main',
        'py_subscriber = bme_ros2_tutorials_py.subscriber:main'
    ],
Then, build the workspace and we can run our new node!
david@david-ubuntu24:~$ ros2 run bme_ros2_tutorials_py py_subscriber
[INFO] [1727606328.416973729] [python_subscriber]: Subsciber has been started.
If we don't start a publisher, then our subscriber just keeps listening to the /topic but the callback function is not invoked. The node doesn't stop running because of the rclpy.spin(node) call.
Let's start our C++ publisher in another terminal:
david@david-ubuntu24:~$ ros2 run bme_ros2_tutorials_cpp publisher_cpp
[INFO] [1727606744.184678739] [cpp_publisher]: CPP publisher has been started.
[INFO] [1727606744.685934650] [cpp_publisher]: Publishing: 'Hello, world: 0'
[INFO] [1727606745.185073828] [cpp_publisher]: Publishing: 'Hello, world: 1'
[INFO] [1727606745.686288921] [cpp_publisher]: Publishing: 'Hello, world: 2'
[INFO] [1727606746.186169881] [cpp_publisher]: Publishing: 'Hello, world: 3'
And let's see what happens with the subscriber! Its subscription callback function is invoked every time the publisher sends a message onto the /topic
david@david-ubuntu24:~/ros2_ws$ ros2 run bme_ros2_tutorials_py py_subscriber
[INFO] [1727606614.099180007] [python_subscriber]: Subsciber has been started.
[INFO] [1727606744.695260304] [python_subscriber]: I heard: Hello, world: 0
[INFO] [1727606745.187956805] [python_subscriber]: I heard: Hello, world: 1
[INFO] [1727606745.689289484] [python_subscriber]: I heard: Hello, world: 2
[INFO] [1727606746.188467429] [python_subscriber]: I heard: Hello, world: 3
We can also check it with rqt_graph
And we can also observe the language agnostic approach of ROS2, without any additional effort this middleware provides interfacing between nodes written in different programming languages.
As before, let's make our subscriber more OOP using our previous template from the publisher. Compared to the publisher we just need to replace the timer callback with a subscription callback and that's all!
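#!/usr/bin/env python3
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class MySubscriberNode(Node):
    def __init__(self):
        super().__init__("python_subscriber_oop") # Parent constructor sets the node name (name chosen here for illustration)
        self.subscriber_ = self.create_subscription(String, 'topic', self.subscriber_callback, 10)
        self.get_logger().info("Subsciber OOP has been started.")

    def subscriber_callback(self, msg):
        self.get_logger().info(f"I heard: {msg.data}")

def main(args=None):
    rclpy.init(args=args)
    node = MySubscriberNode()
    rclpy.spin(node) # Keeps the node running until it's closed with ctrl+c
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()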
Build the workspace and run the node.
As you noticed with the previous examples, we need as many terminals as the number of nodes we start. With a simple publisher and subscriber this isn't really a big deal, but in more complex robotic projects it's quite common to use tens or even hundreds of ROS nodes. Therefore ROS provides an efficient interface to start multiple nodes together and even re-map their topics to different ones or change their parameters instead of changing the source code itself.
Compared to ROS1 it's a bit more complicated to bundle these launchfiles with our nodes, so as a best practice, I recommend creating an individual package only for our launchfiles.
Let's create a new package with ament_cmake or simply without specifying the build type (by default it's ament_cmake):
ros2 pkg create bme_ros2_tutorials_bringup
Now let's create a launch
folder within this new package.
We can freely delete include and src folders:
If you want to delete a folder from command line that is not empty you can use the rm -rf folder command:
rm -rf include/ src/
Add the following to the CMakeLists.txt to install the content of launch when we build the workspace:
Create a new launch file, and let's call it publisher_subscriber.launch.py. In ROS2 the launchfiles are special declarative python scripts (with some imperative flavours) instead of the xml files we used in ROS1! Actually ROS2 also has the possibility to use xml based launch files, but the general usage and the documentation of this feature is very poor. Initially the python based launch system was intended to be the backend of xml launchfiles, but it wasn't ready for the initial launch of ROS2 and the community rather jumped on using the python launch system.
touch publisher_subscriber.launch.py
Let's create our template that we can re-use in the future, with only one publisher first. When we add a node to the launch file we must define the package, the node (executable) and a freely chosen name:
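#!/usr/bin/env python3
from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
    ld = LaunchDescription()

    publisher_node = Node(
        package="bme_ros2_tutorials_py", # Package that contains the executable
        executable="py_publisher",       # Entry point name from setup.py
        name="my_publisher"              # Freely chosen node name
    )

    ld.add_action(publisher_node)
    return ld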
Build and don't forget to source the workspace because we added a new package!
After it we can execute our launchfile with the ros2 launch command:
david@david-ubuntu24:~$ ros2 launch bme_ros2_tutorials_bringup publisher_subscriber.launch.py
[INFO] [launch]: All log files can be found below /home/david/.ros/log/2024-09-29-14-17-29-864407-david-ubuntu24-41228
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [py_publisher-1]: process started with pid [41231]
[py_publisher-1] [INFO] [1727612250.056173684] [my_publisher]: Publishing: "Hello, world: 0"
[py_publisher-1] [INFO] [1727612250.559170990] [my_publisher]: Publishing: "Hello, world: 1"
[py_publisher-1] [INFO] [1727612251.061618736] [my_publisher]: Publishing: "Hello, world: 2"
If we don't write the launch word explicitly in the filename of our launch file, the ros2 launch cli tool won't be able to autocomplete the filenames.
We can notice that our node is now called my_publisher instead of python_publisher as we coded in the node itself earlier. With the launch files we can easily rename our nodes for better handling and organizing as our application scales up.
We can use the node list tool to list our nodes and the output will look like this:
david@david-ubuntu24:~/ros2_ws$ ros2 node list
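Every time we add a node to the launch file we also have to register it with the ld.add_action() function:
from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
    ld = LaunchDescription()

    publisher_node = Node(
        package="bme_ros2_tutorials_py", executable="py_publisher", name="my_publisher"
    )
    subscriber_node = Node(
        package="bme_ros2_tutorials_py", executable="py_subscriber", name="my_subscriber"
    )

    ld.add_action(publisher_node)
    ld.add_action(subscriber_node)
    return ld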
Don't forget to rebuild the workspace so the changed launchfile will be installed, after that we can run it!
david@david-ubuntu24:~$ ros2 launch bme_ros2_tutorials_bringup publisher_subscriber.launch.py
[INFO] [launch]: All log files can be found below /home/david/.ros/log/2024-09-29-14-20-15-603371-david-ubuntu24-41380
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [py_publisher-1]: process started with pid [41383]
[INFO] [py_subscriber-2]: process started with pid [41384]
[py_publisher-1] [INFO] [1727612415.811451529] [my_publisher]: Publishing: "Hello, world: 0"
[py_subscriber-2] [INFO] [1727612415.811459737] [my_subscriber]: Subsciber has been started.
[py_subscriber-2] [INFO] [1727612415.811878677] [my_subscriber]: I heard: Hello, world: 0
[py_publisher-1] [INFO] [1727612416.313222973] [my_publisher]: Publishing: "Hello, world: 1"
[py_subscriber-2] [INFO] [1727612416.315340170] [my_subscriber]: I heard: Hello, world: 1
We can see that both nodes started and their logging to the standard output is combined in this single terminal window.
We can verify this with node list or using rqt_graph:
david@david-ubuntu24:~$ ros2 node list
We can also verify the used topics with the topic list
david@david-ubuntu24:~$ ros2 topic list
Gazebo is a powerful robotics simulation tool that provides a 3D environment for simulating robots, sensors, and objects. It is widely used in the ROS ecosystem for testing and developing robotics algorithms in a realistic virtual environment before deploying them to real hardware.
Gazebo integrates tightly with ROS, enabling simulation and control of robots using ROS topics, services, and actions. In ROS2 with the latest Gazebo releases the integration is facilitated by ros_gz.
Key Features of Gazebo:
- 3D Physics Engine: Simulates rigid body dynamics, collision detection, and other physics phenomena using engines like ODE, Bullet, and DART.
- Realistic Sensors: Simulates cameras, LiDAR, IMUs, GPS, and other sensors with configurable parameters.
- Plugins: Extensible via plugins to control robots, customize physics, or add functionality.
- Worlds and Models: Enables users to create complex environments with pre-built or custom objects and robots.
Besides Gazebo, there are many alternative simulation environments for ROS, but usually the setup of these simulators is more complicated and less documented. Certain simulators also have very high requirements for the GPU.
| Simulator | Best For | Advantages | Disadvantages |
|---|---|---|---|
| Gazebo | General robotics simulation in ROS | Free, accurate physics, ROS support | Moderate visuals, resource-heavy |
| Unity | High-fidelity visuals and AI/ML tasks | Realistic graphics, AI tools | Steep learning curve, not robotics-specific |
| Webots | Beginner-friendly robotics simulation | Easy setup, cross-platform | Limited graphics, less customizable |
| Isaac Sim | High-end AI and robotics simulation | High-fidelity physics, AI support | GPU-intensive, complex setup |
Before we install Gazebo we have to understand the compatibility between Gazebo versions and ROS distributions.
ROS Distribution | Gazebo Citadel (LTS) | Gazebo Fortress (LTS) | Gazebo Garden | Gazebo Harmonic (LTS) | Gazebo Ionic |
ROS 2 Rolling | ❌ | ❌ | ⚡ | ⚡ | ✅ |
ROS 2 Jazzy (LTS) | ❌ | ❌ | ⚡ | ✅ | ❌ |
ROS 2 Iron | ❌ | ✅ | ⚡ | ⚡ | ❌ |
ROS 2 Humble (LTS) | ❌ | ✅ | ⚡ | ⚡ | ❌ |
ROS 2 Foxy (LTS) | ✅ | ❌ | ❌ | ❌ | ❌ |
ROS 1 Noetic (LTS) | ✅ | ⚡ | ❌ | ❌ | ❌ |
Since we use the latest LTS ROS2 distribution, Jazzy, we need Gazebo Harmonic.
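The recommended pairings (the ✅ cells of the table above) can be captured in a small lookup. This is only a convenience sketch; the dictionary and the function name are made up for illustration and are not part of any ROS or Gazebo tool.

```python
# Recommended (✅) Gazebo release for each ROS distribution, copied from the table above.
RECOMMENDED_GAZEBO = {
    "rolling": "ionic",
    "jazzy": "harmonic",
    "iron": "fortress",
    "humble": "fortress",
    "foxy": "citadel",
    "noetic": "citadel",
}


def recommended_gazebo(ros_distro: str) -> str:
    """Return the recommended Gazebo release for a ROS distribution name."""
    key = ros_distro.strip().lower()
    if key not in RECOMMENDED_GAZEBO:
        raise ValueError(f"Unknown ROS distribution: {ros_distro!r}")
    return RECOMMENDED_GAZEBO[key]


print(recommended_gazebo("Jazzy"))
```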
To install Gazebo Harmonic binaries on Ubuntu 24.04 simply follow the steps on this link.
Once it's installed we can try it with the following command:
gz sim shapes.sdf
If everything works well you should see the following screen:
If you have a problem with opening this shapes.sdf example, there might be various reasons, and finding them requires some debugging skills with Gazebo and Linux.
If you see a Segmentation fault (Address not mapped to object [(nil)]) due to problems with Qt, you can try to set the following environment variable to force Qt to use X11 instead of Wayland:
export QT_QPA_PLATFORM=xcb
If you run Gazebo in WSL2 or in a virtual machine, the most common problem is with the 3D acceleration of Gazebo's OGRE2 rendering engine. You can either try disabling HW acceleration (not recommended) or switch to the older OGRE rendering engine with the following argument:
gz sim shapes.sdf --render-engine ogre
If you run Ubuntu natively on a machine with an integrated Intel GPU and a discrete GPU you can check this troubleshooting guide.
After Gazebo successfully starts we can install the Gazebo ROS integration with the following command:
sudo apt install ros-jazzy-ros-gz
You can find the official install guide here.
Let's start the gz sim shapes.sdf example again and see what is important on the Gazebo GUI:
- Blue - Start and pause the simulation. By default Gazebo starts the simulation paused, but it can also start running automatically if the corresponding flag is added when you start Gazebo.
- Cyan - The display shows the real time factor. It should always be close to 100%; if it drops significantly (below 60-70%), it's recommended to change the simulation step size. We'll see this later.
- Red - You can add basic shapes or lights here and you can move and rotate them.
- Pink - The model hierarchy, every item in the simulation is shown here, you can check the links (children) of the model, their collision, inertia, etc.
- Green - Detailed information of the selected model; some parameters can be changed, but most of them are read-only.
- Plug-in browser, where we'll open useful tools like Resource Spawner, Visualize Lidar, Image Display, etc.
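The real time factor mentioned in the list above is simply the ratio of simulated time to wall-clock time. A minimal sketch of that arithmetic, with a hypothetical helper name and made-up example numbers:

```python
def real_time_factor(sim_elapsed_s: float, wall_elapsed_s: float) -> float:
    """Ratio of simulated time to wall-clock time, as a percentage."""
    if wall_elapsed_s <= 0:
        raise ValueError("wall_elapsed_s must be positive")
    return 100.0 * sim_elapsed_s / wall_elapsed_s


# Example: the simulation advanced 6.5 s while 10 s passed on the wall clock.
rtf = real_time_factor(sim_elapsed_s=6.5, wall_elapsed_s=10.0)
print(f"RTF: {rtf:.0f}%")
```

A value like this falls into the 60-70% range where changing the step size is worth considering.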
Gazebo has an online model database available here, you can browse and download models from here. Normally this online model library is accessible within Gazebo although there might be issues in WSL2 or in virtual machines, so I prepared an offline model library with some basic models.
You can download this offline model library from Google Drive.
After download unzip it and place it in the home folder of your user. To let Gazebo know about the offline model library we have to set the GZ_SIM_RESOURCE_PATH
environmental variable, the best is to add it to the .bashrc
export GZ_SIM_RESOURCE_PATH=~/gazebo_models
After setting up the offline model library let's open the empty.sdf
in Gazebo and add a few models through the Resource Spawner
within the plug-in browser
In this lesson we'll use the simulated and the real Turtlebot3 robot in burger
configuration. Turtlebot3 is not supported anymore with the latest ROS2 and Gazebo distributions, but we maintain our own packages to ensure compatibility.
Let's download the following GitHub repositories with the right branch (using the -b branch
flag) to our colcon workspace:
git clone -b ros2 https://github.com/MOGI-ROS/turtlebot3_msgs
git clone -b mogi-ros2 https://github.com/MOGI-ROS/turtlebot3
git clone -b new_gazebo https://github.com/MOGI-ROS/turtlebot3_simulations
We'll need to install a couple of other dependencies with apt
- don't forget to run sudo apt update
and sudo apt upgrade
if your system is not up to date:
sudo apt install ros-jazzy-dynamixel-sdk
sudo apt install ros-jazzy-hardware-interface
sudo apt install ros-jazzy-nav2-msgs
sudo apt install ros-jazzy-nav2-costmap-2d
sudo apt install ros-jazzy-nav2-map-server
sudo apt install ros-jazzy-nav2-bt-navigator
sudo apt install ros-jazzy-nav2-bringup
sudo apt install ros-jazzy-interactive-marker-twist-server
sudo apt install ros-jazzy-cartographer-ros
sudo apt install ros-jazzy-slam-toolbox
If for some reasons you want to install the Dynamixel SDK from source you can download the following branch from GitHub:
git clone -b humble-devel https://github.com/MOGI-ROS/DynamixelSDK/
If your Dynamixel SDK runs into a problem with the em module, uninstall the existing em package and install this version, as reported in this GitHub issue. You might also need to install the lark module:
pip install empy==3.3.4
pip install lark
Before we can test the Turtlebot3 packages we have to set up TURTLEBOT3_MODEL
environmental variable:
export TURTLEBOT3_MODEL=burger
It's only valid for that terminal session where you set it up, so it's recommended to add it into your .bashrc
file so every time when you open a new terminal, it will be executed. You can use the following gist as an example how to set up the .bashrc file.
After building the workspace and sourcing the setup.bash
file we can test the simulation of the Turtlebot3 burger:
ros2 launch turtlebot3_gazebo empty_world.launch.py
If we start a keyboard teleop node we can already drive the robot in the simulation:
ros2 run teleop_twist_keyboard teleop_twist_keyboard
Or there is another example world:
ros2 launch turtlebot3_gazebo turtlebot3_world.launch.py
Where we can try the cartographer
package for mapping:
ros2 launch turtlebot3_cartographer cartographer.launch.py use_sim:=true
There is a third simulated environment:
ros2 launch turtlebot3_gazebo turtlebot3_house.launch.py
Where we can try the nav2
navigation stack:
ros2 launch turtlebot3_navigation2 navigation2_use_sim_time.launch.py map_yaml_file:=/home/david/ros2_ws/src/turtlebot3_simulations/turtlebot3_gazebo/maps/map.yaml
Replace the path in the command above with your own path!
Let's try the same functionality on the real Turtlebot3 Burger. The robots at the lab are updated to the latest SD card image but in case you need to write the official MOGI image to another SD card you find it here.
On Linux you can use the dd tool to create a backup or write your image back to a card. if is the input file and of is the output file. To create a backup you can run the following command, where /dev/sda must match the path to your SD card:
sudo dd if=/dev/sda of=/home/david/backup.img status=progress
And you can easily write an image file back to the card:
sudo dd of=/dev/sda if=/home/david/backup.img status=progress
With the MOGI image the robots are already fully set up, this is the .bashrc
that is running on the robots.
It's useful to take a look especially on this environmental variable:
# Set up a ROS2 domain ID
export ROS_DOMAIN_ID=30
This environment variable plays a key role in how ROS2 nodes communicate over the DDS (Data Distribution Service) middleware. It partitions the DDS network into isolated segments. Nodes with the same domain ID can discover and communicate with each other, while nodes with different domain IDs remain isolated.
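The matching rule can be illustrated with a few lines of Python. This is only a sketch of the logic, not how DDS discovery is implemented: the function names are hypothetical, the environments are plain dictionaries, and being in the same domain is necessary but not sufficient (the machines also have to reach each other on the network). ROS 2 falls back to domain 0 when the variable is not set.

```python
import os


def domain_id(env=None) -> int:
    """Read ROS_DOMAIN_ID from an environment mapping (0 when it is unset)."""
    env = os.environ if env is None else env
    return int(env.get("ROS_DOMAIN_ID", "0"))


def can_discover(env_a, env_b) -> bool:
    """Two processes can only discover each other if their domain IDs match."""
    return domain_id(env_a) == domain_id(env_b)


robot = {"ROS_DOMAIN_ID": "30"}   # as set in the robot's .bashrc
pc_ok = {"ROS_DOMAIN_ID": "30"}   # PC configured with the same ID
pc_default = {}                   # PC where the variable was never exported

print(can_discover(robot, pc_ok))
print(can_discover(robot, pc_default))
```

If the teleop node on the PC cannot move the real robot, comparing the output of echo $ROS_DOMAIN_ID on both machines is a quick first check.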
First, we have to make sure that the robots are on the same wireless network, if needed this must be set up using a screen and a keyboard. The wifi networks can be configured by editing the /etc/netplan/50-cloud-init.yaml
When the robot is on the same network as our PC we can connect to it using SSH, where the user name is pi
and the IP address must match with our robot's IP address:
Then we are asked to enter the password, which is 123
for this image:
[email protected]'s password:
The ROS2 workspace is already set up on the robot, we can run the following launch file to start all the functions of the real robot:
ros2 launch turtlebot3_bringup hardware.launch.py
Then on the PC we can start the teleop node:
ros2 run teleop_twist_keyboard teleop_twist_keyboard
I suggest decreasing the linear and angular speeds with the z key to some value like this:
q/z : increase/decrease max speeds by 10%
currently: speed 0.12709329141645007 turn 0.25418658283290013
Then on the PC we can try a SLAM algorithm like the cartographer
as before or the best open-source SLAM package slam_toolbox
ros2 launch turtlebot3_slam_toolbox slam_toolbox.launch.py
or if you prefer cartographer
ros2 launch turtlebot3_cartographer cartographer.launch.py
After this point we'll use the turtlebot3_mogi
package which is available in this repository and you can download into your workspace with the following command:
git clone https://github.com/MOGI-ROS/Week-1-8-Cognitive-robotics
If you've already downloaded it and you want to make sure it's up-to-date you can run the following command:
git pull
Let's see what is in the package:
turtlebot3_mogi$ tree
├── CMakeLists.txt
├── package.xml
├── gazebo_models
│ ├── dark_bg
│ │ ├── meshes
│ │ │ └── dark_bg.dae
│ │ ├── model.config
│ │ └── model.sdf
│ ├── light_bg
│ │ ├── meshes
│ │ │ └── light_bg.dae
│ │ ├── model.config
│ │ └── model.sdf
│ └── red_line
│ ├── meshes
│ │ └── red_line.dae
│ ├── model.config
│ └── model.sdf
├── launch
│ ├── check_urdf.launch.py
│ ├── robot_mapping.launch.py
│ ├── robot_navigation.launch.py
│ ├── robot_visualization.launch.py
│ ├── simulation_bringup_line_follow.launch.py
│ ├── simulation_bringup_navigation.launch.py
│ ├── simulation_bringup_navigation_with_slam.launch.py
│ └── simulation_bringup_slam.launch.py
├── maps
│ ├── map.pgm
│ └── map.yaml
├── meshes
│ ├── dark_bg.blend
│ └── light_bg.blend
├── rviz
│ ├── robot_basic.rviz
│ ├── robot_mapping.rviz
│ ├── robot_navigation.rviz
│ ├── turtlebot3_line_follower.rviz
│ ├── turtlebot3_navigation.rviz
│ ├── turtlebot3_slam.rviz
│ └── urdf.rviz
└── worlds
├── dark_background.sdf
├── empty.sdf
├── light_background.sdf
└── red_line.sdf
- gazebo_models: 3D models for the line following worlds
- launch: Default launch files are already part of the starting package, we can test the package with simulation_bringup_slam.launch.py. Launch files starting with the robot_ prefix are intended to run with the real robot.
- maps: saved map for testing the simulation_bringup_navigation.launch.py
- meshes: this folder contains the 3D models of the line following worlds in native Blender format.
- rviz: Pre-configured RViz2 layouts
- worlds: default Gazebo worlds that we'll use in the simulations.
Important: this package has a dependency on the mogi_trajectory_server
package that helps visualizing the robot's past trajectory. You can download this package from git to your workspace to use it:
git clone https://github.com/MOGI-ROS/mogi_trajectory_server
Some launch files of the package act as simple wrappers to quickly launch the simulations that we already tried in the previous weeks. Let's try them out one by one, first the SLAM mapping:
ros2 launch turtlebot3_mogi simulation_bringup_slam.launch.py
In another terminal run a teleop node:
ros2 run teleop_twist_keyboard teleop_twist_keyboard
We can try the navigation:
ros2 launch turtlebot3_mogi simulation_bringup_navigation.launch.py
And finally a navigation without having an a priori map and running real time SLAM:
ros2 launch turtlebot3_mogi simulation_bringup_navigation_with_slam.launch.py
There are 3 launch files that are intended to use with the real robot and not with the simulation:
ros2 launch turtlebot3_mogi robot_visualization.launch.py
ros2 launch turtlebot3_mogi robot_mapping.launch.py
ros2 launch turtlebot3_mogi robot_navigation.launch.py
To use these launchfiles, make sure that the robot is on the same network and its ROS nodes are started - on the robot:
ros2 launch turtlebot3_bringup hardware.launch.py
And finally there is one more launch file that we will use during the next weeks:
ros2 launch turtlebot3_mogi simulation_bringup_line_follow.launch.py
We can switch to another world with dark background and a light colored line (dark_background.sdf
) by changing the launch file or overriding the world argument when we launch the file:
world_arg = DeclareLaunchArgument(
    'world', default_value='light_background.sdf',
    description='Name of the Gazebo world file to load'
)
ros2 launch turtlebot3_mogi simulation_bringup_line_follow.launch.py world:=dark_background.sdf
The turtlebot3_mogi
package already includes the different colored tracks that I made in Blender, you can find a short tutorial about how to create your own world in Blender:
You can also see the recording of another tutorial video about modeling in Blender:
The Blender files can be found in the
First, we have to set up a python virtual environment where we'll install the python packages that we'll use in the next weeks. I call my virtual environment as tf
because primarily I use it for Tensorflow.
There are various ways to set up and use a Python virtual environment, here I show an example using the pipx
Install the following packages using apt
sudo apt install python3-pip
sudo apt install pipx
Then we are ready to start using the pipx
package, run the following commands:
pipx ensurepath
pipx install virtualenv
pipx install virtualenvwrapper
Now let's edit our .bashrc
file, the pipx ensurepath
command added a line that we'll change now. Replace the following line:
# Created by `pipx` on 2024-12-15 20:49:03
export PATH="$PATH:/home/david/.local/bin"
to this:
# Virtual environment for pipx and tensorflow
export PATH="$PATH:/home/$USER/.local/bin"
export WORKON_HOME=~/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/home/$USER/.local/share/pipx/venvs/virtualenvwrapper/bin/python3
source /home/$USER/.local/share/pipx/venvs/virtualenvwrapper/bin/virtualenvwrapper_lazy.sh
workon tf
Start a new terminal and you'll get the following error message because in .bashrc
we used the command workon tf
but there is no virtual environment named tf
ERROR: Environment 'tf' does not exist. Create it with 'mkvirtualenv tf'.
So let's create one with the following command:
mkvirtualenv tf
Now, start a new terminal and you should see the active virtual environment between parentheses in your terminal:
(tf) david@david-ubuntu24:~$
Let's install Python packages that we'll use, to ensure compatibility with the codes in this repository let's use a specific version from numpy
and tensorflow
pip install tensorflow==2.18.0
pip install imutils
pip install scikit-learn
pip install opencv-python
pip install matplotlib
pip install numpy==1.26.4
OpenCV (Open Source Computer Vision Library) is a free, open-source library used for computer vision, image processing, and machine learning. It provides tools to analyze visual data from images and videos, such as detecting faces, objects, and motion, or applying filters and transformations. OpenCV is widely used in robotics, AI, and real-time applications, and it supports many programming languages, including Python and C++. It helps developers easily build systems that can “see” and interpret visual information.
As we saw in the previous chapter, we can start the simulation that is set up for the line following with the following command:
ros2 launch turtlebot3_mogi simulation_bringup_line_follow.launch.py
Now let's try the node that follows the line with image processing using OpenCV:
ros2 run turtlebot3_mogi_py line_follower
The robot starts following the line and we see the following window:
Let's analyze the code! We create a subscriber for compressed images from the robot's camera and a publisher for the cmd_vel
topic that will drive the robot. We also start another thread that guarantees that the spin()
function is called regardless how long our image processing will take. The spin()
function is essential to ensure that image_callback()
will be always executed and we won't miss frames.
import threading

import cv2
import numpy as np
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import CompressedImage
from geometry_msgs.msg import Twist
from cv_bridge import CvBridge

class ImageSubscriber(Node):
    def __init__(self):
        # The node name is not visible in this listing; 'image_subscriber' is an assumption
        super().__init__('image_subscriber')
        # Create a subscriber with a queue size of 1 to only keep the last frame
        # (for uncompressed images, subscribe to 'image_raw' with sensor_msgs.msg.Image instead)
        self.subscription = self.create_subscription(
            CompressedImage,
            'image_raw/compressed',  # Replace with your topic name
            self.image_callback,
            1  # Queue size of 1
        )
        self.publisher = self.create_publisher(Twist, 'cmd_vel', 10)
        # Initialize CvBridge
        self.bridge = CvBridge()
        # Variable to store the latest frame
        self.latest_frame = None
        self.frame_lock = threading.Lock()  # Lock to ensure thread safety
        # Flag to control the display loop
        self.running = True
        # Start a separate thread for spinning (to ensure image_callback keeps receiving new frames)
        self.spin_thread = threading.Thread(target=self.spin_thread_func)
        self.spin_thread.start()

    def spin_thread_func(self):
        """Separate thread function for rclpy spinning."""
        while rclpy.ok() and self.running:
            rclpy.spin_once(self, timeout_sec=0.05)

    def image_callback(self, msg):
        """Callback function to receive and store the latest frame."""
        # Convert ROS CompressedImage message to OpenCV format and store it
        with self.frame_lock:
            #self.latest_frame = self.bridge.imgmsg_to_cv2(msg, "bgr8")
            self.latest_frame = self.bridge.compressed_imgmsg_to_cv2(msg, desired_encoding="bgr8")
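The idea behind the lock and the single latest_frame variable can be shown without ROS. The sketch below is a simplified stand-in, not the node's code: the class name LatestFrame is hypothetical, and strings play the role of images. A producer thread overwrites the slot, and the consumer always gets the newest frame or None.

```python
import threading


class LatestFrame:
    """Thread-safe single-slot buffer: new frames overwrite unprocessed ones."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None

    def put(self, frame):
        with self._lock:
            self._frame = frame

    def take(self):
        """Return the stored frame and clear the slot (None if it is empty)."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


slot = LatestFrame()


def producer():
    # Stands in for image_callback(): frames arrive faster than they are processed
    for i in range(5):
        slot.put(f"frame-{i}")


t = threading.Thread(target=producer)
t.start()
t.join()
print(slot.take())  # only the newest frame is left
print(slot.take())  # the slot is empty after it was taken
```

Dropping older frames is intentional here: for a controller it is better to react to the newest image than to work through a backlog.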
Then we run the display_image() function in an infinite loop, which uses built-in OpenCV functions to create a window and display the result image. OpenCV is also responsible for handling keyboard commands: if we press the q key, the node will stop. Of course, the node can also be stopped with Ctrl+C if the focus is on the command line.
    def display_image(self):
        # Create a single OpenCV window
        cv2.namedWindow("frame", cv2.WINDOW_NORMAL)
        cv2.resizeWindow("frame", 800, 600)
        while rclpy.ok():
            # Check if there is a new frame available
            if self.latest_frame is not None:
                # Process the current image
                mask, contour, crosshair = self.process_image(self.latest_frame)
                # Add processed images as small images on top of main image
                result = self.add_small_pictures(self.latest_frame, [mask, contour, crosshair])
                # Show the latest frame
                cv2.imshow("frame", result)
                self.latest_frame = None  # Clear the frame after displaying
            # Check for quit key
            if cv2.waitKey(1) & 0xFF == ord('q'):
                self.running = False
                break
        # Close OpenCV window after quitting
        cv2.destroyAllWindows()
        self.running = False
The result
frame is created using a simple function that overlays 3 small images on the original frame from the camera.
Every image processing and publishing the commands on the cmd_vel
topic is happening within the process_image()
    def process_image(self, img):
        msg = Twist()
        msg.linear.x = 0.0
        msg.linear.y = 0.0
        msg.linear.z = 0.0
        msg.angular.x = 0.0
        msg.angular.y = 0.0
        msg.angular.z = 0.0
        rows, cols = img.shape[:2]
        # 1. Convert to HLS color space to extract lightness channel
        H, L, S = self.convert2hls(img)
        # 2. Invert lightness channel if we follow a dark line on a light background
        L = 255 - L  # Invert lightness channel
        # 3. apply a polygon mask to filter out simulation's bright sky
        L_masked, mask = self.apply_polygon_mask(L)
        # 4. For light line on dark background in simulation:
        lightnessMask = self.threshold_binary(L_masked, (50, 255))
        # For light line on dark background in real life environment:
        #lightnessMask = self.threshold_binary(L_masked, (180, 255))
        stackedMask = np.dstack((lightnessMask, lightnessMask, lightnessMask))
        contourMask = stackedMask.copy()
        crosshairMask = stackedMask.copy()
        # 5. return value of findContours depends on OpenCV version
        (contours, hierarchy) = cv2.findContours(lightnessMask.copy(), 1, cv2.CHAIN_APPROX_NONE)
        # overlay mask on lightness image to show masked area on the small picture
        lightnessMask = cv2.addWeighted(mask, 0.2, lightnessMask, 0.8, 0)
        # 6. Find the biggest contour (if detected) and calculate its centroid
        if len(contours) > 0:
            biggest_contour = max(contours, key=cv2.contourArea)
            M = cv2.moments(biggest_contour)
            # Make sure that "m00" won't cause ZeroDivisionError: float division by zero
            if M["m00"] != 0:
                cx = int(M["m10"] / M["m00"])
                cy = int(M["m01"] / M["m00"])
            else:
                cx, cy = 0, 0
            # Show contour and centroid
            cv2.drawContours(contourMask, biggest_contour, -1, (0, 255, 0), 10)
            cv2.circle(contourMask, (cx, cy), 20, (0, 0, 255), -1)
            # Show crosshair and difference from middle point
            # (the drawing calls are not part of this excerpt)
            # Steer towards the centroid of the line
            #print(abs(cols - cx), cx, cols)
            if abs(cols/2 - cx) > 20:
                # Centroid is off-center: slow down and turn towards it
                msg.linear.x = 0.05
                if cols/2 > cx:
                    msg.angular.z = 0.15
                else:
                    msg.angular.z = -0.15
            else:
                # Centroid is close to the center: go straight
                msg.linear.x = 0.1
                msg.angular.z = 0.0
        else:
            # No line detected: stop
            msg.linear.x = 0.0
            msg.angular.z = 0.0
        # Publish cmd_vel
        self.publisher.publish(msg)
        # Return processed frames
        return lightnessMask, contourMask, crosshairMask
The image processing pipeline is the following:
- Convert the RGB image to HLS color space so we can extract the lightness channel
- Invert lightness channel if we want to detect a dark line on lighter background
- Add a polygon mask to filter out disturbances of the environment
- Apply a highpass binary threshold on lightness channel so we can detect light objects (white), everything else is black
- Find all the white contours on the binary image
- Find the biggest contour and calculate its centroid
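The centroid step of the pipeline above boils down to the ratios m10/m00 and m01/m00. Here is a dependency-free sketch on a tiny binary mask; note that it sums over pixels, while cv2.moments() on a contour integrates over the contour's polygon, so the numbers can differ slightly from what the node computes. The function name and the mask are made up for illustration.

```python
def centroid(mask):
    """Centroid (cx, cy) of the non-zero pixels of a 2D list, or None if empty."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                m00 += 1   # area (pixel count)
                m10 += x   # sum of x coordinates
                m01 += y   # sum of y coordinates
    if m00 == 0:
        return None        # avoids the division by zero mentioned in the code
    return int(m10 / m00), int(m01 / m00)


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
print(centroid(mask))
```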
Let's switch to the dark background:
ros2 launch turtlebot3_mogi simulation_bringup_line_follow.launch.py world:=dark_background.sdf
And don't forget to turn off inverting the lightness channel!
# 2. Invert lightness channel if we follow a dark line on a light background
#L = 255 - L # Invert lightness channel
You can try to tune the channel filter for the 3rd world in the package:
ros2 launch turtlebot3_mogi simulation_bringup_line_follow.launch.py world:=red_line.sdf
It's difficult to do it with the lightness channel because both the background's and line's lightness values are similar although they are clearly different on the hue channel. In RGB color space it would be even easier to segment the line from its background.
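Why lightness fails while hue works can be checked with Python's built-in colorsys module. The sketch uses pure red and pure green as stand-ins; the actual colors of the red_line.sdf world are not sampled here, so treat the numbers as an illustration only. Also note that colorsys returns all channels in the 0-1 range, while OpenCV's 8-bit HLS uses 0-179 for hue and 0-255 for lightness.

```python
import colorsys

# colorsys works with RGB values in the 0..1 range and returns (hue, lightness, saturation)
red = colorsys.rgb_to_hls(1.0, 0.0, 0.0)
green = colorsys.rgb_to_hls(0.0, 1.0, 0.0)

print(f"red:   hue={red[0]:.2f} lightness={red[1]:.2f}")
print(f"green: hue={green[0]:.2f} lightness={green[1]:.2f}")

# Same lightness, so a lightness threshold cannot separate them...
same_lightness = red[1] == green[1]
# ...but the hue values are far apart.
hue_gap = abs(red[0] - green[0])
```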
In this chapter we'll create our own convolutional neural network (CNN) and train it for following the line. But before that, let's see what we learned from the line following with OpenCV:
- With color space filtering we only focus on one property of the line (color); our pipeline wasn't trying to detect its shape or any other properties.
- Color space filtering is very sensitive to changes in the environment or in the object. We have to adjust the filter values for changed conditions. It's also possible that we have to significantly modify the image processing pipeline because the problem can only be solved, e.g., in another color space.
- We had high resolution information (in number of pixels) about how much we have to turn, but we didn't use it. Above a 20 pixel difference between the center point and the centroid of the line, the robot was turning left or right with a fixed radius. We'll apply the same logic to the neural network.
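The last point can be made concrete. The first function below mirrors the fixed thresholds of process_image(); the second one is a proportional alternative shown only for contrast. It is not what the package does, and its gain and max_turn values are assumptions picked for the example.

```python
def bang_bang(cols, cx, dead_band=20):
    """Fixed-step steering, mirroring the thresholds used in process_image()."""
    error = cols / 2 - cx
    if abs(error) > dead_band:
        return 0.05, (0.15 if error > 0 else -0.15)
    return 0.1, 0.0


def proportional(cols, cx, gain=0.3, max_turn=0.15):
    """Turn rate scales with the normalized pixel error (gain is an assumed value)."""
    error = (cols / 2 - cx) / (cols / 2)          # normalized to [-1, 1]
    turn = max(-max_turn, min(max_turn, gain * error))
    return 0.1, turn


# Returns (linear.x, angular.z) for a 640 pixel wide image
print(bang_bang(640, 310))      # inside the dead band: straight ahead
print(bang_bang(640, 200))      # line is on the left: fixed left turn
print(proportional(640, 200))   # same situation, turn rate depends on the error
```

The three classes that the neural network predicts later (forward, left, right) correspond to the three possible outputs of the first function.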
To save training images let's start the simulation:
ros2 launch turtlebot3_mogi simulation_bringup_line_follow.launch.py
Start a teleop node:
ros2 run teleop_twist_keyboard teleop_twist_keyboard
And finally, we will run the save_training_images
node that can save training images by pressing the s
key, but before that, make sure that self.save_path
is set to your own directory in the node:
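class ImageSubscriber(Node):
    def __init__(self):
        # The node name is not visible in this listing; 'image_subscriber' is an assumption
        super().__init__('image_subscriber')
        self.subscription = self.create_subscription(
            CompressedImage,
            'image_raw/compressed',  # Replace with your topic name
            self.image_callback,
            1  # Queue size of 1
        )
        self.save_path = "/home/david/ros2_ws/src/ROS2-lessons/Week-1-8-Cognitive-robotics/turtlebot3_mogi_py/saved_images/"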
If the path is set up correctly we can run the node:
ros2 run turtlebot3_mogi_py save_training_images
To label the saved images we just simply have to copy the images to the suitable folder under the training_images
folder. We only distinguish 4 labels:
- Forward
- Turn left
- Turn right
- There is no line on the image
There are already about 100 training images per label in the package, from both dark and light background environments under different lighting conditions, plus some images from a real-world environment. The training data is not balanced, though, for the images where there is no line. This is good enough for the training, but you can add your own images! Intentionally, there aren't any images of the red line with green background; we want to verify later how well our generalized model performs in a completely new environment.
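Since labeling is done purely by folder, it's easy to check how balanced the data set is by counting the files per sub-folder. A small sketch, assuming one sub-folder per label under training_images; the function name, the list of file extensions, and the relative path in the usage part are assumptions, so adjust them to your setup.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def count_images_per_label(root):
    """Map each sub-folder of root (one folder per label) to its image count."""
    counts = {}
    for label_dir in sorted(Path(root).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = sum(
                1 for f in label_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


if __name__ == "__main__":
    root = Path("training_images")  # adjust to your own path
    if root.is_dir():
        for label, n in count_images_per_label(root).items():
            print(f"{label}: {n}")
```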
The turtlebot3_mogi
package already has a trained network in network_model
folder that is ready to use. This model was trained using the following Tensorflow and Keras version:
Tensorflow version: 2.18.0
Keras version: 3.7.0
TensorFlow is an open-source deep learning library developed by Google. It helps build, train, and deploy machine learning models, especially neural networks, using efficient tools for math, data handling, and GPU acceleration.
Keras is a high-level API that runs on top of TensorFlow. It makes building and training models easier and more user-friendly with a simple, intuitive interface.
If you installed a different Tensorflow or Keras version you might need to train a new network before using it.
There is already a training script in the package - although this isn't a ROS node just a simple python script! So don't run it with ros2 run ...
First, navigate to the right folder then run the script:
(tf) david@david-ubuntu24:~/ros2_ws/src/ROS2-lessons/Week-1-8-Cognitive-robotics/turtlebot3_mogi_py/turtlebot3_mogi_py$ python train_network.py
[INFO] Version:
Tensorflow version: 2.18.0
Keras version: b'3.7.0'
[INFO] loading images and labels...
Let's take a look on the code! The most important part of the code is just a couple of lines:
# initialize the model
print("[INFO] compiling model...")
model = build_LeNet(width=image_size, height=image_size, depth=3, classes=4)
opt = Adam(learning_rate=INIT_LR)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])
# print model summary
# checkpoint the best model
checkpoint_filepath = "..//network_model//model.best.keras"
checkpoint = ModelCheckpoint(checkpoint_filepath, monitor = 'val_loss', verbose=1, save_best_only=True, mode='min')
# set a learning rate annealer
reduce_lr = ReduceLROnPlateau(monitor='val_loss', patience=3, verbose=1, factor=0.5, min_lr=1e-6)
# callbacks
callbacks_list=[reduce_lr, checkpoint]
# train the network
print("[INFO] training network...")
history = model.fit(trainX, trainY, batch_size=BS, validation_data=(testX, testY), epochs=EPOCHS, callbacks=callbacks_list, verbose=1)
This initializes a model using the build_LeNet function, then sets up 2 important callbacks:
- ModelCheckpoint() will guarantee that if it finds a better fitting model during the training, it saves it. So in the end we won't only have the last model, but also the best model.
- ReduceLROnPlateau() helps reducing the learning rate (how much it can update the weights) if it detects that our model cannot find a more fitting optimum during the training due to too high learning rate. Higher learning rate is better in the beginning but later it's a risk of overshooting or missing the bottom.
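The combined effect of the two callbacks can be simulated on a plain list of validation losses. This is a simplified model of their behavior, not the Keras implementation: it ignores options such as min_delta and cooldown, and the function name and the example losses are made up. The defaults mirror the values used in the training script (factor=0.5, patience=3, min_lr=1e-6).

```python
def simulate_callbacks(val_losses, lr=1e-3, factor=0.5, patience=3, min_lr=1e-6):
    """Return (best_epoch, lr_per_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)                            # learning rate used in this epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch   # "checkpoint" the best model
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)     # plateau: reduce LR for the next epoch
                wait = 0
    return best_epoch, lrs


best, lrs = simulate_callbacks([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.55])
print("best epoch:", best)
print("learning rates:", lrs)
```

In this made-up run the loss stalls for three epochs, the learning rate is halved, and the next epoch produces a new best model that would be saved as the checkpoint.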
But what is the build_LeNet function?
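from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Activation, Flatten, Dense

def build_LeNet(width, height, depth, classes):
    # initialize the model
    model = Sequential()
    inputShape = (height, width, depth)
    # After Keras 2.3 we need an Input layer instead of passing it as a parameter to the first layer
    model.add(Input(shape=inputShape))
    # first set of CONV => RELU => POOL layers
    model.add(Conv2D(20, (5, 5), padding="same"))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    # second set of CONV => RELU => POOL layers
    model.add(Conv2D(50, (5, 5), padding="same"))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    # first (and only) set of FC => RELU layers
    model.add(Flatten())
    model.add(Dense(500))
    model.add(Activation("relu"))
    # softmax classifier
    model.add(Dense(classes))
    model.add(Activation("softmax"))
    # return the constructed network architecture
    return model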
LeNet-5 was originally invented by Yann LeCun and colleagues for handwritten digit recognition on the MNIST dataset in 1998! Its first predecessor dates back to 1989, though, and it was one of the first convolutional neural networks (CNNs). LeNet-5 was pioneering for CNNs and laid the groundwork for modern deep learning in vision.
I'm cheating a little bit here, because this network that we use is significantly bigger than the original LeNet-5 that had the following architecture:
┃ Layer ┃ Type ┃ Output Shape ┃ Params Calculation ┃ Parameters ┃
┃ Input ┃ Input(shape=(32,32,1)) ┃ (32, 32, 1) ┃ No params ┃ 0 ┃
┃ C1 ┃ Conv2D(6, 5×5) ┃ (28, 28, 6) ┃ (5×5×1 + 1) × 6 = 156 ┃ 156 ┃
┃ S2 ┃ AveragePooling2D(2×2) ┃ (14, 14, 6) ┃ No params ┃ 0 ┃
┃ C3 ┃ Conv2D(16, 5×5) ┃ (10, 10, 16) ┃ (5×5×6 + 1) × 16 = 2,416 ┃ 2,416 ┃
┃ S4 ┃ AveragePooling2D(2×2) ┃ (5, 5, 16) ┃ No params ┃ 0 ┃
┃ C5 ┃ Conv2D(120, 5×5) ┃ (1, 1, 120) ┃ (5×5×16 + 1) × 120 = 48,120 ┃ 48,120 ┃
┃ Flatten ┃ – ┃ (120,) ┃ No params ┃ 0 ┃
┃ F6 ┃ Dense(84) ┃ (84,) ┃ 120×84 + 84 = 10,164 ┃ 10,164 ┃
┃ Output ┃ Dense(10) ┃ (10,) ┃ 84×10 + 10 = 850 ┃ 850 ┃
Total params: 61,706
Our version of LeNet has about 1 million trainable parameters!
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
│ conv2d (Conv2D) │ (None, 24, 24, 20) │ 1,520 │
│ activation (Activation) │ (None, 24, 24, 20) │ 0 │
│ max_pooling2d (MaxPooling2D) │ (None, 12, 12, 20) │ 0 │
│ conv2d_1 (Conv2D) │ (None, 12, 12, 50) │ 25,050 │
│ activation_1 (Activation) │ (None, 12, 12, 50) │ 0 │
│ max_pooling2d_1 (MaxPooling2D) │ (None, 6, 6, 50) │ 0 │
│ flatten (Flatten) │ (None, 1800) │ 0 │
│ dense (Dense) │ (None, 500) │ 900,500 │
│ activation_2 (Activation) │ (None, 500) │ 0 │
│ dense_1 (Dense) │ (None, 4) │ 2,004 │
│ activation_3 (Activation) │ (None, 4) │ 0 │
Total params: 929,074 (3.54 MB)
Trainable params: 929,074 (3.54 MB)
Non-trainable params: 0 (0.00 B)
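The parameter counts in both tables follow from two simple formulas: a convolution layer has (kernel height × kernel width × input channels + 1) × filters parameters, and a dense layer has inputs × units + units. The helper names below are made up; the layer sizes are taken from the two summaries above.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of every kernel plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


our_net = [
    conv2d_params(5, 5, 3, 20),      # conv2d
    conv2d_params(5, 5, 20, 50),     # conv2d_1
    dense_params(6 * 6 * 50, 500),   # dense (input is the flattened 6x6x50 volume)
    dense_params(500, 4),            # dense_1
]

lenet5 = [
    conv2d_params(5, 5, 1, 6),       # C1
    conv2d_params(5, 5, 6, 16),      # C3
    conv2d_params(5, 5, 16, 120),    # C5
    dense_params(120, 84),           # F6
    dense_params(84, 10),            # Output
]

print(our_net, sum(our_net))   # sum is 929074
print(lenet5, sum(lenet5))     # sum is 61706
```

Almost all of our parameters sit in the first dense layer, which is why the flattened 1800 inputs matter much more than the number of convolution filters.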
1 million parameters might look like a lot, especially compared to the original LeNet-5's 61k parameters. But let's put it into more context:
- GPT-2 XL was a state-of-the-art LLM (large language model) in 2019; it has 1.5 billion parameters
- GPT-3 has 175 billion parameters
- GPT-4's size isn't published anymore, but it's assumed to have around 500-1000 billion parameters
- DeepSeek R1's weights are openly available and there are different downloadable models; the biggest one has 671 billion parameters and is about 404GB. The smallest one has 1.5 billion parameters and its size is 1.1GB.
These are all LLMs, but how did we get there in about 20 (30) years from the first CNNs?
- CNNs are focusing on spatial patterns that makes them very efficient in image processing.
- RNNs were also introduced in the late '80s, but their major real-life applications came around the early 2000s: generating basic character sequences or predicting the next words in a sentence. They focus on temporal patterns, but they had problems with the vanishing gradient during training: with long sequences, the model cannot learn from earlier inputs when it goes back many time steps.
- LSTMs (long short-term memory networks) were designed to solve these issues of RNNs, and they were the state-of-the-art neural networks for natural language processing around 2013-2018. LSTM cells were introduced to preserve important input flowing forward during training without vanishing. Although this was a huge step for natural language processing and LSTMs handle temporal data (text, speech) better, they were slow and had issues with longer sequences. Since they still analyzed the input word by word, it was hard to remember words from earlier.
- Transformers (and the LLMs based on them) have largely replaced LSTMs since 2018; they handle long sequences very well and they are fast thanks to parallelism. They handle longer sequences much better because their attention mechanism can extract the important information from the context.
But let's come back to our training and let's see how did it go!
Accuracy measures how many predictions were correct out of all predictions. Higher accuracy = better performance. Loss is a formula that compares the model’s prediction to the true value (label) and gives a number representing how wrong it is. The goal of training is to minimize the loss.
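Both metrics are easy to compute by hand for a few predictions. The sketch below uses categorical cross-entropy as the example loss because it is the simplest one to follow for a 4-class output; note that the training script above is compiled with "binary_crossentropy", so the numbers are not directly comparable to the training logs. The predictions and labels are made up.

```python
import math


def accuracy(predictions, labels):
    """Fraction of samples where the most probable class equals the label."""
    correct = sum(1 for p, y in zip(predictions, labels) if p.index(max(p)) == y)
    return correct / len(labels)


def categorical_cross_entropy(predictions, labels, eps=1e-12):
    """Mean of -log(probability assigned to the true class)."""
    total = 0.0
    for p, y in zip(predictions, labels):
        total += -math.log(max(p[y], eps))
    return total / len(labels)


# Softmax outputs for 3 images and their true class indices
predictions = [
    [0.70, 0.10, 0.10, 0.10],   # confident and correct
    [0.20, 0.50, 0.20, 0.10],   # less confident, still correct
    [0.25, 0.25, 0.40, 0.10],   # wrong: the true class only got 0.10
]
labels = [0, 1, 3]

print(f"accuracy: {accuracy(predictions, labels):.3f}")
print(f"loss:     {categorical_cross_entropy(predictions, labels):.3f}")
```

The wrong prediction contributes by far the most to the loss, because the probability assigned to its true class is low.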
The loss was successfully minimized during training, but the validation loss doesn't look great. This usually means our model is overfitting or not generalizing well to new, unseen data. That can have multiple reasons:
- The model learns the training data too well (memorizes it) but struggles on the validation data.
- If the validation data has noise, imbalance, or a different distribution, the loss may stay higher.
- The model is too complex: large models memorize training data easily but don’t generalize well.
Our model is not exactly small, but it is still far from being too complex, so we can exclude point 3.
We can reduce overfitting by adding dropout after certain layers. Dropout randomly turns off neurons during training, which prevents over-reliance on specific ones. This helps the model generalize better, which should reduce our validation loss!
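from keras.models import Sequential
from keras.layers import Input, Conv2D, Activation, MaxPooling2D, Dropout, Flatten, Dense

# NOTE: the dropout rates and the size of the Dense layer are assumed values
def build_LeNet(width, height, depth, classes):
    # initialize the model
    model = Sequential()
    inputShape = (height, width, depth)
    # After Keras 2.3 we need an Input layer instead of passing it as a parameter to the first layer
    model.add(Input(shape=inputShape))
    # first set of CONV => RELU => POOL layers
    model.add(Conv2D(20, (5, 5), padding="same"))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    # Optional Dropout after first pool (small value)
    model.add(Dropout(0.2))  # assumed rate
    # second set of CONV => RELU => POOL layers
    model.add(Conv2D(50, (5, 5), padding="same"))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    # Optional Dropout again
    model.add(Dropout(0.2))  # assumed rate
    # first (and only) set of FC => RELU layers
    model.add(Flatten())
    model.add(Dense(500))  # assumed number of units
    model.add(Activation("relu"))
    # Dropout after fully connected (higher value)
    model.add(Dropout(0.5))  # assumed rate
    # softmax classifier
    model.add(Dense(classes))
    model.add(Activation("softmax"))
    # return the constructed network architecture
    return model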
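To see what a dropout layer does to the values passing through it, here is a minimal sketch of "inverted" dropout in plain Python. Keras' `Dropout` layer works along these lines: it zeroes inputs with probability `rate` during training and scales the remaining ones by `1 / (1 - rate)`. The function below is an illustration only, not the library's implementation.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        # At inference time dropout does nothing
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(42)  # local generator so the example is repeatable
activations = [1.0] * 10

train_out = dropout(activations, rate=0.5, rng=rng)          # mix of 0.0 and 2.0
infer_out = dropout(activations, rate=0.5, training=False)   # unchanged
```

Because a different random subset of neurons is silenced in every training step, the network cannot rely on any single neuron. The rescaling keeps the expected sum of the activations the same, so nothing has to change when dropout is switched off at inference time.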
Let's try the model training with the dropouts:
We can clearly see the impact of dropout on the smaller spikes, but it did not improve our validation loss at all. Let's remember that our training set is not balanced: there are only 20 pictures with no line, compared to the other 3 classes with 100 pictures. But then we should have random results after every training, right?
Not really: we intentionally fixed all of our random seeds at the beginning of the script to make sure that the training runs are reproducible!
# Fix every random seed to make the training reproducible
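The effect of a fixed seed can be shown with Python's standard `random` module alone. In a real training script the same idea typically has to be applied to every source of randomness (Python's `random`, NumPy via `numpy.random.seed`, and TensorFlow via `tf.random.set_seed`, or all three at once with `keras.utils.set_random_seed`). Which of these the training script calls is an assumption here. The helper `shuffled_indices` is a hypothetical stand-in for the shuffling that decides the training/validation split.

```python
import random

def shuffled_indices(n, seed):
    """Return the indices 0..n-1 in a pseudo-random order determined by `seed`."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    return indices

# Same seed: identical "random" order on every run
run_a = shuffled_indices(20, seed=42)
run_b = shuffled_indices(20, seed=42)

# Different seed: almost certainly a different order, hence a different split
run_c = shuffled_indices(20, seed=7)
```

With a small and unbalanced dataset, which images end up in the validation set can change the validation loss noticeably, so a different seed can give a visibly different training curve even though the model is unchanged.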
Let's try different random seeds:
# Fix every random seed to make the training reproducible
And the result is much better!
This time we were lucky: the validation loss was high only because of the unbalanced input data, so we don't have to change anything in our model.
model_path = "/home/david/ros2_ws/src/ROS2-lessons/Week-1-8-Cognitive-robotics/turtlebot3_mogi_py/network_model/model.best.keras"
OpenCV version: 4.11.0
Tensorflow version: 2.18.0
Keras version: 3.7.0
CNN model: /home/david/ros2_ws/src/ROS2-lessons/Week-1-8-Cognitive-robotics/turtlebot3_mogi_py/network_model/model.best.keras
Model's Keras version: 3.7.0
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                   ┃ Output Shape      ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ conv2d (Conv2D)                │ (None, 28, 28, 2) │      52 │
│ activation (Activation)        │ (None, 28, 28, 2) │       0 │
│ max_pooling2d (MaxPooling2D)   │ (None, 7, 7, 2)   │       0 │
│ conv2d_1 (Conv2D)              │ (None, 7, 7, 4)   │     204 │
│ activation_1 (Activation)      │ (None, 7, 7, 4)   │       0 │
│ max_pooling2d_1 (MaxPooling2D) │ (None, 2, 2, 4)   │       0 │
│ flatten (Flatten)              │ (None, 16)        │       0 │
│ dense (Dense)                  │ (None, 32)        │     544 │
│ activation_2 (Activation)      │ (None, 32)        │       0 │
│ dense_1 (Dense)                │ (None, 4)         │     132 │
│ activation_3 (Activation)      │ (None, 4)         │       0 │
└────────────────────────────────┴───────────────────┴─────────┘
Total params: 932 (3.64 KB)
Trainable params: 932 (3.64 KB)
Non-trainable params: 0 (0.00 B)
Trainable params: 932 (1000 times smaller than a LeNet-5)
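The parameter counts in the summary can be reproduced by hand. The sketch below assumes 5x5 kernels and a single-channel input; these are inferred from the numbers in the table rather than stated in it, and the helper functions are written just for this check.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights (kernel_h * kernel_w * in_channels per filter) plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

layer_params = {
    "conv2d": conv2d_params(5, 5, 1, 2),    # 1 input channel -> 2 filters
    "conv2d_1": conv2d_params(5, 5, 2, 4),  # 2 input channels -> 4 filters
    "dense": dense_params(16, 32),          # flattened 2*2*4 = 16 inputs -> 32 units
    "dense_1": dense_params(32, 4),         # 32 inputs -> 4 classes
}
total = sum(layer_params.values())

# Each parameter is stored as a 32-bit float (4 bytes)
size_kb = total * 4 / 1024
```

The activation, pooling and flatten layers have no trainable parameters, which is why they show 0 in the table. The total of 932 parameters at 4 bytes each gives the 3.64 KB reported in the summary.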
ros2 launch turtlebot3_bringup hardware.launch.py
ros2 run turtlebot3_mogi_py line_follower_cnn_robot