A recent study presents an AI-powered bird’s-eye view safety monitoring system that combines camera and LiDAR data to detect and prevent hazards during tower crane lifting operations on modular construction sites.
Imagine standing atop a half-built skyscraper, surrounded by dangling 30-ton modules, swinging hooks, and steel arms overhead. Now imagine you're being protected, not by a person, but by AI, cameras, and lasers working in harmony. 👷🏽‍♂️ Sounds futuristic? Not anymore.
Researchers from the Hong Kong Center for Construction Robotics have created a powerful safety monitoring system designed specifically for one of the riskiest zones on a construction site: the space beneath a tower crane. Modular Integrated Construction (MiC), where entire rooms or building sections are lifted and placed on-site, is becoming more popular, and that makes high-precision, real-time safety systems critical.
Let’s explore how this system, called CRCUST Top, is changing the game in construction safety using AI 🧠, LiDAR 🌐, and computer vision 👁️ from a bird’s-eye view.
Tower cranes are the backbone of modern construction. But despite their essential role, they also pose massive risks. In just one year, China saw 125 tower crane accidents, the US recorded 129 incidents in a decade, and Singapore and Hong Kong experienced fatalities due to crane-related hazards.
What makes things even trickier? The rise of MiC modules. These prefabricated blocks—sometimes as big as entire rooms—create massive blind spots during lifting, making it harder for crane operators to detect people or obstacles below. Combine that with humans walking, working, or inspecting just meters away, and you’ve got a high-stakes environment in need of high-tech solutions.
The CRCUST Top system (short for Construction Top Under the Tower Crane Safety System) is like a digital guardian angel hovering over the site. Here’s what it includes:
Uses YOLOv8 (a fast, accurate object detection model) to spot:
Detects all of the above in real time, even in complex or cluttered environments.
A LiDAR sensor scans the site to get depth information—a huge upgrade over regular cameras.
By combining the LiDAR point cloud and the 2D image, the system projects objects into 3D space, determining exactly where workers and modules are.
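To make the fusion step a little more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a simple pinhole camera model, LiDAR points that are already expressed in the camera frame (so the extrinsic calibration step is skipped), and a bounding box coming from the 2D detector. The function names `project_point` and `locate_object`, the intrinsics, and the toy point cloud are all hypothetical.

```python
from statistics import median


def project_point(point, fx, fy, cx, cy):
    """Project a 3D point (camera frame, metres) to pixel coordinates.

    Returns None for points at or behind the image plane.
    """
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def locate_object(points, box, fx, fy, cx, cy):
    """Estimate the 3D position of a detected object.

    points: iterable of (x, y, z) LiDAR points in the camera frame.
    box: (u_min, v_min, u_max, v_max) detection box in pixels.
    Returns the per-axis median of the points that project inside the box,
    or None if no point falls inside it.
    """
    u_min, v_min, u_max, v_max = box
    inside = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = uv
        if u_min <= u <= u_max and v_min <= v <= v_max:
            inside.append(p)
    if not inside:
        return None
    return tuple(median(axis) for axis in zip(*inside))


# Hypothetical intrinsics and a toy point cloud (assumptions for illustration)
FX = FY = 1000.0
CX, CY = 2736.0, 1824.0  # image centre of a 5472x3648 frame
cloud = [
    (0.0, 0.0, 20.0),
    (0.2, 0.1, 20.2),
    (-0.2, -0.1, 19.8),
    (5.0, 5.0, 20.0),  # projects far outside the box below
]
worker_box = (2700.0, 1800.0, 2770.0, 1850.0)

position = locate_object(cloud, worker_box, FX, FY, CX, CY)
print(position)
```

Taking the median rather than the mean is a design choice in this sketch: it keeps a few stray background points inside the box from dragging the estimate away from the object.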
Researchers defined “safe” and “dangerous” zones based on:
A graphical interface displays:
If a worker walks too close to a lifting module, an instant alert is triggered!
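As a rough illustration of that alert logic, the sketch below flags a worker whose horizontal distance to the lifted module drops below a threshold. This is an assumption-laden toy, not the paper's zone definition: the circular zone, the 3 m radius, and the names `horizontal_distance` and `zone_status` are all hypothetical.

```python
import math


def horizontal_distance(a, b):
    """Distance between two (x, y, z) positions, ignoring height."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def zone_status(worker, module, danger_radius_m):
    """Return "red" if the worker is inside the danger radius, else "green"."""
    if horizontal_distance(worker, module) < danger_radius_m:
        return "red"
    return "green"


DANGER_RADIUS_M = 3.0  # hypothetical threshold, not taken from the paper

module = (0.0, 0.0, 12.0)  # module hanging 12 m above the deck
workers = {
    "w1": (1.5, 2.0, 0.0),  # 2.5 m away horizontally
    "w2": (6.0, 8.0, 0.0),  # 10 m away horizontally
}

statuses = {
    name: zone_status(pos, module, DANGER_RADIUS_M)
    for name, pos in workers.items()
}
print(statuses)
```

Height is ignored on purpose here: a worker standing directly under a suspended load is at risk no matter how high the load is.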
To ensure the system wasn’t just theoretical, the researchers installed CRCUST Top on live construction sites in Hong Kong. They captured 858 image-LiDAR pairs involving:
Fun fact: The images were 5K resolution (5472 × 3648 pixels), and the LiDAR collected 24,000 points per scan. That’s a ton of visual and spatial data to process for every frame!
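A quick back-of-the-envelope calculation shows how differently dense the two sensors are. The sketch below only uses the figures quoted above. The pixels-per-point ratio additionally assumes that every LiDAR point lands inside the camera's field of view, which may not hold on a real site.

```python
IMAGE_WIDTH_PX = 5472
IMAGE_HEIGHT_PX = 3648
LIDAR_POINTS_PER_SCAN = 24_000

pixels_per_image = IMAGE_WIDTH_PX * IMAGE_HEIGHT_PX
# Assumes all LiDAR points fall within the image (an idealisation)
pixels_per_lidar_point = pixels_per_image / LIDAR_POINTS_PER_SCAN

print(f"{pixels_per_image:,} pixels per image")
print(f"about {pixels_per_lidar_point:.0f} pixels for every LiDAR point")
```

In other words, the image is rich in detail but flat, while the point cloud is sparse but carries depth, which is one way to see why the two are worth combining.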
The researchers benchmarked various AI models and found:
When the MiC module got too close to a worker (inside the defined safety bubble), the system turned red and sent out an alert. When everything was fine? Green light and all clear ✅
Let’s break it down into what makes this research so important:
Unlike traditional systems that alert after something goes wrong, CRCUST Top actively prevents hazards by analyzing real-time 3D positioning.
From object detection to warning generation, there’s no manual input needed. That means consistent performance, even on night shifts or under bad weather conditions.
Most construction AI systems rely on either cameras or LiDAR—not both. CRCUST Top smartly fuses the two, offering depth awareness and image clarity together.
The researchers are not stopping here. In their conclusion, they outlined exciting future directions:
Instead of retraining the system for each new site or MiC type, they want to use general-purpose AI (like ChatGPT for images!) that can understand any object label dynamically.
The goal is to allow CRCUST Top to be relocated to any site, without retraining or recalibration. This would make it viable for commercial deployment at scale.
Although the current system takes ~1.6 seconds per frame, the aim is to cut this down for real-time streaming and alerts with minimal lag.
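To see why that latency matters, here is a small worked calculation. The 1.6 seconds per frame comes from the figure above. The walking speed of 1.4 m/s is an assumption added for illustration and does not come from the paper.

```python
SECONDS_PER_FRAME = 1.6          # processing time quoted above
ASSUMED_WALKING_SPEED_M_S = 1.4  # assumption: a typical walking pace

frames_per_second = 1 / SECONDS_PER_FRAME
distance_per_frame_m = ASSUMED_WALKING_SPEED_M_S * SECONDS_PER_FRAME

print(f"{frames_per_second:.3f} frames per second")
print(f"{distance_per_frame_m:.2f} m walked between two processed frames")
```

Under that assumption, a worker can cover a couple of metres between two consecutive updates, so any safety margin has to absorb that movement until processing gets faster.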
✅ Problem Solved: Dangerous crane lifts, blind spots, and unpredictable human movement
✅ Tools Used: YOLOv8, SAM (Segment Anything Model), LiDAR, 3D clustering
✅ Innovation: Seamless 2D + 3D fusion to define and monitor risk in real time
✅ Impact: Huge potential to reduce crane-related injuries and fatalities
This paper is a shining example of how deep learning and robotics are reshaping construction engineering. By combining smart software with smart hardware, CRCUST Top isn't just watching over the site—it’s actively protecting every human on it.
So the next time you see a tower crane swinging above a construction zone, remember: there might just be an AI eye in the sky keeping everyone safe 👁️🛠️
🏗️ Tower Crane - A tall construction machine used to lift heavy materials like steel and concrete to high places on a building site. Think of it as a giant robot arm helping build skyscrapers.
🧱 Modular Integrated Construction (MiC) - A building method where parts of a building—like rooms or sections—are built in a factory, then transported and assembled on-site like Lego blocks. - More about this concept in the article "Building Smarter, Greener 🧱 Optimizing Modular Construction Supply Chains with AI & Multi-Agent Systems".
👁️ Bird’s-Eye View - A top-down view of the construction site, like what you’d see if you were flying overhead in a drone or standing on the crane itself. - More about this concept in the article "Radar-Camera Fusion: Pioneering Object Detection in Bird’s-Eye View 🚗🔍".
📷 Computer Vision - A field of AI that teaches computers to “see” and understand images—just like humans do—by analyzing photos and videos. - More about this concept in the article "Ensuring Construction Safety with AI: Detecting Scaffolding Completeness Using Deep Learning 🏗️ 🤖".
🌐 LiDAR (Light Detection and Ranging) - A sensor that uses laser light pulses to measure distances and create a 3D map of the surroundings—like giving sight and depth perception to machines. - More about this concept in the article "Flying into the Future 🚁 How UAVs Are Revolutionizing Transportation Infrastructure Assessment".
🧠 Artificial Intelligence (AI) - A type of smart software that can learn from data, recognize patterns, and make decisions—often used for things like object detection, predictions, and automation. - More about this concept in the article "Building Trust in Smart Factories 🧠 🏭 How Engineers Are Embedding Ethics into AI".
🧱📦 MiC Frame - The metal structure or rig used to hold and lift a modular building unit during crane operations.
🖼️ 2D Detection / Segmentation - AI techniques used to spot and outline objects (like workers or materials) in images, helping the system understand what’s in the scene.
☁️ Point Cloud - A set of 3D points collected by sensors like LiDAR, used to represent the shape and location of objects in space—imagine a digital dot version of a scene. - More about this concept in the article "Revolutionizing Maize Farming: 3D Rail-Driven Plant Phenotyping for Real-Time Growth Monitoring 🌱📊".
🎯 YOLO (You Only Look Once) - A fast AI model used for detecting multiple objects in images in one go—perfect for spotting workers, machines, and hazards in real-time. - More about this concept in the article "Smarter Helmet Detection with GAML-YOLO 🛵 Enhancing Road Safety Using Advanced AI Vision".
🧪 Kalman Filter - A clever math tool that helps track the position of moving objects over time, even if some information is missing or noisy. - More about this concept in the article "Navigating the Abyss: A Data-Driven Approach to Deep-Sea Vehicle Localization 🚢 🌊 🔍".
📐 Data Fusion - Combining data from different sensors (like cameras and LiDAR) to get a fuller, smarter picture of the environment. - More about this concept in the article "Revolutionizing Autonomous Driving: MapFusion's Smart Sensor Fusion 🚗💡".
Source: Yanke Wang, Yu Hin Ng, Haobo Liang, Ching-Wei Chang, Hao Chen. Bird's-eye view safety monitoring for the construction top under the tower crane. https://doi.org/10.48550/arXiv.2506.18938