I'm interested in computer vision, robotics, and SLAM. My research focuses on scene understanding, HD maps, and 3D reconstruction using different sensors (camera, LiDAR, radar, IMU, GPS, etc.).
SpatialFusion-LM is a unified framework for spatial 3D scene understanding from monocular or stereo RGB input. It integrates depth estimation, differentiable 3D reconstruction, and spatial layout prediction using large language models.
The Edge Optimized Tracking System is a real-time object tracking pipeline that combines deep learning, tracking algorithms, and filtering techniques to accurately detect and follow objects. Designed for high-speed applications, it ensures efficient and reliable performance on edge devices.
A real-time navigation method that uses floor plans and sensor data to help individuals with blindness or low vision explore and localize themselves indoors, without prior training or known starting position.
RGBD-3DGS-SLAM is a SLAM system that combines 3D Gaussian Splatting with monocular depth estimation for accurate point-cloud reconstruction and visual odometry. It uses UniDepthV2 to infer depth and intrinsics from RGB images, while optionally incorporating depth maps and camera info for improved results.
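To illustrate the reconstruction step: once a network such as UniDepthV2 has produced a per-pixel depth map and camera intrinsics, each pixel can be back-projected into a 3D point. This is a generic pinhole-camera sketch, not the project's actual code; the intrinsics and depth values below are placeholders.

```python
import numpy as np

def backproject(depth, K):
    """Unproject an (H, W) depth map to an (H*W, 3) point cloud using intrinsics K."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    z = depth
    x = (u - cx) * z / fx          # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Placeholder depth map and intrinsics standing in for UniDepthV2 outputs.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
cloud = backproject(np.full((480, 640), 2.0), K)   # flat wall 2 m away
```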
A real-time system that fuses camera and motion sensor data using the Unscented Kalman Filter. It enables accurate and efficient tracking of a robot’s movement, supporting reliable navigation in autonomous systems.
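As a sketch of the fusion idea (not this project's implementation), one UKF predict/update cycle propagates sigma points through the motion model and corrects the state with a camera measurement. The constant-velocity model and noise values below are assumptions for illustration.

```python
import numpy as np

def merwe_sigma_points(x, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Generate 2n+1 sigma points and their mean/covariance weights."""
    n = x.size
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * P)      # L @ L.T = (n + lam) * P
    pts = np.vstack([x, x + L.T, x - L.T])     # rows are sigma points
    Wm = np.full(2 * n + 1, 0.5 / (n + lam))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return pts, Wm, Wc

def unscented_transform(pts, Wm, Wc, noise):
    mean = Wm @ pts
    d = pts - mean
    return mean, d.T @ (Wc[:, None] * d) + noise

def ukf_step(x, P, z, f, h, Q, R):
    """One UKF predict/update cycle: propagate through f, correct with z via h."""
    pts, Wm, Wc = merwe_sigma_points(x, P)
    fp = np.array([f(p) for p in pts])              # predicted sigma points
    x_pred, P_pred = unscented_transform(fp, Wm, Wc, Q)
    hp = np.array([h(p) for p in fp])               # predicted measurements
    z_pred, S = unscented_transform(hp, Wm, Wc, R)
    Pxz = (fp - x_pred).T @ (Wc[:, None] * (hp - z_pred))
    K = Pxz @ np.linalg.inv(S)                      # Kalman gain
    return x_pred + K @ (z - z_pred), P_pred - K @ S @ K.T

# Assumed toy setup: constant-velocity motion, camera measures (x, y) position.
dt = 0.1
F = np.block([[np.eye(2), dt * np.eye(2)], [np.zeros((2, 2)), np.eye(2)]])
f = lambda s: F @ s                                 # process model
h = lambda s: s[:2]                                 # measurement model
Q, R = 1e-3 * np.eye(4), 1e-2 * np.eye(2)

x, P = np.zeros(4), np.eye(4)
for z in ([0.1, 0.0], [0.2, 0.01], [0.31, 0.02]):   # noisy camera fixes
    x, P = ukf_step(x, P, np.array(z), f, h, Q, R)
```

The same loop generalizes to nonlinear `f`/`h`, which is the point of the UKF over a linear Kalman filter.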
A high-performance monocular depth estimation system designed for real-time use in robotics, autonomous navigation, and 3D perception. It delivers fast and accurate depth predictions, making it well-suited for applications requiring low-latency visual understanding.
FusionSLAM combines feature matching, depth estimation, and neural scene representation to improve mapping and localization using just a single camera. This approach enhances the accuracy and efficiency of monocular SLAM in real-world environments.
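The feature-matching stage can be illustrated with brute-force nearest-neighbour matching plus Lowe's ratio test (a generic sketch, not FusionSLAM's actual matcher; the descriptor shapes and the 0.75 ratio are conventional assumptions).

```python
import numpy as np

def match_descriptors(d1, d2, ratio=0.75):
    """Nearest-neighbour matching of (N, D) vs (M, D) float descriptors.

    Returns (i, j) index pairs that pass Lowe's ratio test, i.e. whose
    best match is clearly closer than the second-best.
    """
    dists = np.linalg.norm(d1[:, None, :] - d2[None, :, :], axis=-1)  # (N, M)
    matches = []
    for i, row in enumerate(dists):
        j1, j2 = np.argsort(row)[:2]        # best and second-best candidates
        if row[j1] < ratio * row[j2]:       # keep only distinctive matches
            matches.append((i, j1))
    return matches

# Synthetic check: d1 repeats the first 10 descriptors of d2 with slight noise.
rng = np.random.default_rng(0)
d2 = rng.normal(size=(50, 32))
d1 = d2[:10] + 0.01 * rng.normal(size=(10, 32))
pairs = match_descriptors(d1, d2)
```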
A graph-based SLAM system that fuses visual data from multiple cameras using learned feature points for robust mapping and localization. Designed for accurate 3D reconstruction and real-time operation in complex environments.
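The core of graph-based SLAM can be sketched with a tiny pose graph: poses are nodes, odometry and loop-closure measurements are edges, and optimization distributes accumulated drift over the graph. The toy example below (translation-only poses, so the problem is linear and solvable in one least-squares step) is an illustration under those simplifying assumptions, not this system's actual back end.

```python
import numpy as np

# Five planar poses; odometry edges between consecutive poses and one
# loop-closure edge tying the last pose back to the first.
edges = [
    (0, 1, np.array([1.05, 0.00])),    # drifting odometry (true step: 1, 0)
    (1, 2, np.array([0.02, 1.04])),    # true step: 0, 1
    (2, 3, np.array([-1.03, 0.03])),   # true step: -1, 0
    (3, 4, np.array([0.01, -1.02])),   # true step: 0, -1
    (0, 4, np.array([0.00, 0.00])),    # loop closure: back at the start
]
n = 5
A = np.zeros((2 * len(edges) + 2, 2 * n))   # +2 rows anchor pose 0 at origin
b = np.zeros(2 * len(edges) + 2)
for k, (i, j, z) in enumerate(edges):
    A[2*k:2*k+2, 2*j:2*j+2] = np.eye(2)     # each edge encodes x_j - x_i = z
    A[2*k:2*k+2, 2*i:2*i+2] = -np.eye(2)
    b[2*k:2*k+2] = z
A[-2:, 0:2] = np.eye(2)                     # gauge constraint: x_0 = (0, 0)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
poses = x.reshape(n, 2)
```

Dead reckoning alone ends about 0.07 m from the start; after the solve, the closure edge pulls the final pose nearly back onto the first one. Real systems use nonlinear solvers (e.g. iterative Gauss-Newton over SE(3)) because orientation makes the problem nonlinear.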
Combines machine learning with SLAM so robots can build more accurate maps and navigate more reliably. Introduces bi-directional loop closure to improve robustness when revisiting previously mapped places.
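Setting the bi-directional details aside, the detection half of loop closure can be sketched as place recognition: compare the current frame's global descriptor against sufficiently old keyframes and flag a match above a similarity threshold. The descriptors, threshold, and gap below are illustrative assumptions, not this project's values.

```python
import numpy as np

def detect_loop(current, keyframes, threshold=0.9, min_gap=3):
    """Return the index of the best matching old keyframe, or -1 if none.

    A candidate counts as a loop closure when the cosine similarity of its
    global descriptor to the current frame exceeds `threshold`; the last
    `min_gap` keyframes are excluded so recent frames never self-match.
    """
    best_j, best_sim = -1, threshold
    for j, kf in enumerate(keyframes[:-min_gap] if min_gap else keyframes):
        sim = current @ kf / (np.linalg.norm(current) * np.linalg.norm(kf))
        if sim > best_sim:
            best_j, best_sim = j, sim
    return best_j

# Synthetic revisit: the current descriptor nearly repeats keyframe 0.
past = np.random.default_rng(1).normal(size=(6, 16))
revisit = past[0] + 0.01 * np.random.default_rng(2).normal(size=16)
match = detect_loop(revisit, past)
```

A detected match then adds a loop-closure edge to the pose graph, which is what actually corrects the accumulated drift.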
A method for determining precise indoor location using floor plan images and real-time sensor data from RGB-D and IMU, without relying on GPS. The approach aligns a live semantic map with the architectural layout to enable accurate global localization.
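The alignment step, registering a live map against the floor plan, reduces to estimating a rigid transform between corresponding 2D points. A minimal least-squares (Kabsch) sketch, assuming correspondences are already established, which is the hard part in practice:

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rotation R and translation t with dst ≈ src @ R.T + t."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)               # cross-covariance of the sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

# Recover a known 30° rotation + translation between map and plan points.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
map_pts = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.0]])
plan_pts = map_pts @ R_true.T + np.array([5.0, -1.0])
R, t = rigid_align(map_pts, plan_pts)
```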