ATRON: Autonomous Trash Retrieval for Oceanic Neatness

ATRON overview trailer

Abstract

ATRON (Autonomous Trash Retrieval for Oceanic Neatness) represents an innovative autonomous solution for marine debris collection and removal. This project addresses the growing environmental concern of oceanic pollution through applied robotics technology.

The system integrates a twin-hull catamaran design with a motorized conveyor belt collection mechanism capable of retrieving multiple types of debris. Advanced navigation is achieved through a combination of 2D LiDAR, IMU sensors, and Kalman filtering, which enables accurate SLAM (Simultaneous Localization and Mapping) and obstacle avoidance in marine environments.

Key achievements include the development of a robust autonomous vehicle weighing 120 kg with dimensions of 1.6 m × 2.7 m × 1.0 m, capable of operating at 1.0 m/s and 15 deg/s with a 2.3-hour battery life. The debris detection system achieves a mean average precision of 0.95 using custom-trained YOLOv11 models, while the collection mechanism can process up to 20 debris items per minute under optimal conditions. The completed prototype demonstrates a commercially viable solution that could be scaled for real-world deployment in marine cleanup operations.

Simulation

For this project, we have developed simulations using two of the most popular simulation tools that integrate with ROS2: Gazebo and NVIDIA Isaac SIM. Both simulations enable the creation of virtual 360-degree cameras, LiDAR, and IMU sensors, allowing us to test our system design.

Gazebo Simulation

NVIDIA Isaac SIM Simulation

The video below shows the SLAM algorithm working in NVIDIA Isaac SIM. Note that this takes place in a different environment from the previous video, which was simply an open area.

Isaac SIM with SLAM and rectified front image view

Full Autonomous Stack

The complete autonomous system integrates all subsystems to enable fully autonomous operation. This includes real-time object detection, path planning, obstacle avoidance, and debris collection coordination.

Object Detection via YOLO

Object detection is performed using YOLOv11 and is used to detect both obstacles that must be avoided and debris that must be collected. Detection runs on the cubemap projection from the cameras. Please see the 'Vision System' section under System Design for more details.
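As a rough illustration, the sketch below runs a YOLOv11 model over each cubemap face using the Ultralytics API. The weight file name, class names, and confidence threshold are placeholders, not our actual configuration.

```python
# Minimal sketch: run a YOLOv11 model over each cubemap face.
# "atron_debris.pt" and the class names are illustrative placeholders.
from ultralytics import YOLO

model = YOLO("atron_debris.pt")  # fine-tuned YOLOv11 weights (hypothetical file name)

def detect_on_cubemap(faces):
    """faces: dict mapping face name ('front', 'left', ...) to an HxWx3 image."""
    detections = {}
    for name, img in faces.items():
        results = model(img, conf=0.5, verbose=False)[0]
        detections[name] = [
            {
                "xyxy": box.xyxy[0].tolist(),        # pixel bounding box
                "cls": results.names[int(box.cls)],  # e.g. "bottle", "can", "person"
                "conf": float(box.conf),
            }
            for box in results.boxes
        ]
    return detections
```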

Navigation, Object Detection, and 3D Projection Algorithm

2D to 3D Projection

The bounding boxes from YOLO are used to estimate the location of the debris in the 3D coordinate system.
2D to 3D Projection Diagram
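The diagram above shows the projection we use. As a simplified sketch of the general idea, the snippet below casts a ray through the center of a bounding box on a pinhole cubemap face and intersects it with the water plane; the intrinsics, camera pose convention, and water-plane height are assumptions for illustration only.

```python
import numpy as np

def bbox_center_to_world(bbox_xyxy, K, R_wc, t_wc, water_z=0.0):
    """
    Project the center of a 2D bounding box onto the water plane (z = water_z).

    bbox_xyxy : (x1, y1, x2, y2) in pixels on one pinhole cubemap face
    K         : 3x3 camera intrinsic matrix for that face
    R_wc, t_wc: rotation (3x3) and translation (3,) of the camera in the world frame
    Returns the estimated 3D point in world coordinates, or None if the ray
    does not hit the plane in front of the camera.
    """
    x1, y1, x2, y2 = bbox_xyxy
    u, v = (x1 + x2) / 2.0, (y1 + y2) / 2.0

    # Back-project the pixel to a ray direction in the camera frame, then rotate to world.
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_world = R_wc @ ray_cam

    # Intersect the ray t_wc + s * ray_world with the horizontal plane z = water_z.
    if abs(ray_world[2]) < 1e-6:
        return None  # ray parallel to the water surface
    s = (water_z - t_wc[2]) / ray_world[2]
    if s <= 0:
        return None  # intersection lies behind the camera
    return t_wc + s * ray_world
```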

Path Planning and Navigation

To optimize debris collection efficiency, the order in which debris is collected is determined by solving an Orienteering Problem. The path between debris items is planned with OMPL (the Open Motion Planning Library), and pure pursuit is used to generate the trajectory commands needed to follow the path created by OMPL.
Path Planning Visualization
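For context on the tracking step above, the snippet below is a minimal pure pursuit update that computes an angular velocity command from the next lookahead point on the planned path; the lookahead distance and speed are illustrative values, not our tuned parameters.

```python
import math

def pure_pursuit_step(pose, path, lookahead=1.5, linear_vel=0.8):
    """
    pose : (x, y, yaw) of the vehicle in the map frame
    path : list of (x, y) waypoints produced by the planner
    Returns (linear_vel, angular_vel) commands toward the lookahead point.
    Lookahead distance and speeds are placeholder values.
    """
    x, y, yaw = pose

    # Pick the first waypoint at least `lookahead` meters away; default to the goal.
    target = path[-1]
    for wx, wy in path:
        if math.hypot(wx - x, wy - y) >= lookahead:
            target = (wx, wy)
            break

    # Transform the target into the vehicle frame.
    dx, dy = target[0] - x, target[1] - y
    local_x = math.cos(-yaw) * dx - math.sin(-yaw) * dy
    local_y = math.sin(-yaw) * dx + math.cos(-yaw) * dy

    # Pure pursuit curvature: kappa = 2 * y_local / L^2, then w = v * kappa.
    L = max(math.hypot(local_x, local_y), 1e-6)
    curvature = 2.0 * local_y / (L ** 2)
    return linear_vel, linear_vel * curvature
```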

System Design

CAD Model

The design of the system is initially created using SOLIDWORKS. This model is then used to create a Unified Robot Description Format (URDF) file, which enables testing of the robot in simulators such as Gazebo and NVIDIA Isaac SIM.

SOLIDWORKS Model

TF Rigging and RViz Visualization

Simulated Physics Model

System Prototype

The system is then built with the features illustrated in the images below.

Frontal View

Side View

We also take steps to ensure that the system has good build quality, including waterproof electrical connections and fully welded joints.

Mechanical Features

Parameters and Measurements

Various measurements of the physical system are summarized below.

Mass: 120 kg
Dimensions: 1.6 m (width) × 2.8 m (length) × 2.0 m (height)
Peak Power Consumption: 480 W
Operating Time (at peak): 2.3 hours
Peak Velocity: 1 m/s (linear), 15 deg/s (angular)
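Assuming the peak figures above hold simultaneously, they imply roughly 480 W × 2.3 h ≈ 1.1 kWh of usable battery energy on board.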

Below is a video showcasing the speed of the conveyor belt.

The videos below showcase the water dynamics of the vehicle.

Linear Dynamics. Linear Velocity is 1 m/s

Angular Dynamics. Angular Velocity is 15 deg/s

Network Design

An overview of the various components and communication protocols is shown in the systems diagram below.

System Network Overview

Transducer Configuration

We designed our own controller board that includes a built-in gyroscope to assist with odometry. The videos below demonstrate the gyroscope. In addition, we extract the acceleration and velocity data from the 360-degree camera to further improve the odometry of the boat.

Angular position using board (BNO055) gyroscope.

Angular position using camera gyroscope.
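For reference, polling orientation from the board's BNO055 typically looks like the sketch below, here using the Adafruit CircuitPython driver over I2C; our controller board firmware and bus wiring may differ.

```python
# Minimal sketch of reading a BNO055 IMU over I2C with the Adafruit
# CircuitPython driver (assumed; our board firmware may read it differently).
import time
import board
import adafruit_bno055

i2c = board.I2C()
imu = adafruit_bno055.BNO055_I2C(i2c)

while True:
    heading, roll, pitch = imu.euler  # fused orientation, degrees
    gx, gy, gz = imu.gyro             # angular rates, rad/s
    print(f"heading={heading:.1f} deg  yaw rate={gz:.3f} rad/s")
    time.sleep(0.05)                  # ~20 Hz polling for the odometry filter
```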

The design of the board is shown below, along with an electronics circuit diagram showcasing how it communicates with the other components at a low level.

Controller Board Model

Electric Circuit Configuration

We have tuned the thrusters of the USV to be effective even in wavy environments. We verify this in NVIDIA Isaac SIM, using simulated thrusters to test the response curve to step inputs, as shown below.

Thruster Step-Input Response Curve (Isaac SIM)
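To quantify a response curve like the one above, a simple first-order fit can be applied to the logged data. The sketch below assumes the response is approximately first-order and uses synthetic placeholder data rather than our measured logs.

```python
import numpy as np

def estimate_time_constant(t, response, step_value):
    """
    Estimate a first-order time constant from a logged step response.
    t          : array of timestamps (s), starting at the step
    response   : measured thruster output over time
    step_value : commanded steady-state value
    Assumes roughly first-order behavior: y(t) = step * (1 - exp(-t / tau)).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(response, dtype=float)

    # The time constant is the time at which the response reaches ~63.2% of the step.
    target = 0.632 * step_value
    idx = np.argmax(y >= target)
    return t[idx] if y[idx] >= target else None

# Example with synthetic data (placeholder, not measured values):
t = np.linspace(0.0, 5.0, 500)
y = 1.0 * (1.0 - np.exp(-t / 0.8))
print(estimate_time_constant(t, y, step_value=1.0))  # ~0.8 s
```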

Image Processing

Image Processing Overview

The image processing pipeline converts the dual fisheye images into an equirectangular projection. From there, we can convert to either a cubemap representation or a wide-angle bottom-view representation.

The video below shows the raw video feed, which contains the dual fisheye images.

The video below shows the processed equirectangular images.

The video below shows the cubemap representation of the equirectangular images.

Since certain views are not relevant to us, we instead use a wide-angle view of the bottom, which faces the water surface. This is obtained directly from the equirectangular representation, skipping the cubemap step.
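A minimal sketch of both reprojections, assuming the py360convert package (the face size, field of view, and pitch angle here are illustrative values, not our pipeline settings):

```python
# Sketch of the two reprojections from an equirectangular frame,
# using the py360convert package (assumed; parameters are illustrative).
import cv2
import py360convert

equi = cv2.imread("equirectangular_frame.png")  # placeholder file name

# Equirectangular -> cubemap (six faces returned as a dict of face images).
faces = py360convert.e2c(equi, face_w=640, mode="bilinear", cube_format="dict")

# Equirectangular -> wide-angle bottom view: a single perspective image
# looking down toward the water surface (pitch -90 degrees, wide FOV).
bottom = py360convert.e2p(
    equi,
    fov_deg=(120, 120),   # horizontal, vertical field of view
    u_deg=0,              # yaw
    v_deg=-90,            # pitch: straight down
    out_hw=(720, 1280),
)
```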

Object Detection

For our object detection model, we fine-tuned YOLOv11 on images of cans and bottles, as well as potential non-debris entities such as humans. The first video in this section already showcases detection on the wide-angle bottom-view representation; below, we apply it to the cubemap representation.

YOLOv11 on cubemap image representation
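For reference, fine-tuning with the Ultralytics API follows a pattern like the sketch below; the dataset YAML name, epoch count, and image size are placeholders rather than our actual training configuration.

```python
# Sketch of fine-tuning YOLOv11 on a debris dataset with the Ultralytics API.
# The dataset file and hyperparameters are placeholders, not our real settings.
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # start from pretrained YOLOv11 nano weights

model.train(
    data="debris.yaml",   # classes such as bottle, can, person (hypothetical file)
    epochs=100,
    imgsz=640,
)

metrics = model.val()     # precision, recall, and mAP on the validation split
print(metrics.box.map)    # mean average precision (mAP50-95)
```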

Navigation

We use sensor fusion to obtain an accurate pose estimate of our robot. The image below shows our pipeline.

Mapping Pipeline
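As a toy illustration of the fusion idea, the sketch below runs a one-dimensional Kalman filter on heading: it predicts with the gyro rate and corrects with an absolute heading measurement (e.g. from SLAM or the camera). All noise values are made-up placeholders, not our tuned filter parameters.

```python
# Toy 1D Kalman filter over heading: predict with the gyro rate,
# correct with an absolute heading measurement (e.g. from SLAM).
# Noise values are illustrative placeholders, not tuned parameters.

class HeadingKF:
    def __init__(self, q=0.01, r=0.1):
        self.theta = 0.0  # heading estimate (rad)
        self.p = 1.0      # estimate variance
        self.q = q        # process noise (gyro integration drift)
        self.r = r        # measurement noise (absolute heading)

    def predict(self, gyro_rate, dt):
        self.theta += gyro_rate * dt
        self.p += self.q * dt

    def update(self, measured_heading):
        k = self.p / (self.p + self.r)  # Kalman gain
        self.theta += k * (measured_heading - self.theta)
        self.p *= (1.0 - k)
        return self.theta
```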

We have additionally implemented a Visual-Inertial Navigation System (VINS). We demonstrate this by performing odometry estimation with just the 360-degree camera in a handheld configuration, so the only sensors used are the dual fisheye cameras and the gyroscope. The dual fisheye images are converted to equirectangular in real time, as seen in the video below.

VINS Demonstration