Environmental Semantic Segmentation Algorithm via LiDAR Point Cloud Spherical Projection and Camera Fusion

YANG Han; GAO Jun; LIU Yong; HE Xiuwei; TAN Li; YIN Yankun; SHEN Xiaolei; YANG Feifei; PENG Chenglei

doi:10.7643/issn.1672-9242.2025.07.002

Equipment Environmental Engineering, 2025, Vol. 22, Issue 7: 9-15.

Special Topic—Application and Collaborative Evaluation Technology of Light Weapons in Complex Environments

Abstract

This work aims to achieve environmental semantic segmentation by fusing multimodal information from LiDAR and cameras. The 3D LiDAR point cloud was transformed into a sparse 2D depth map via spherical projection, enabling pixel-level alignment with camera images. An asymmetric dual-branch encoder was designed within the neural network model: sparse convolution was used to extract depth-map features, and residual structures were used to extract image features. The two modalities were then fused through a feature-mixing network to generate a fused image. A decoder with skip connections was used to parse the features, restore resolution, and produce the semantic segmentation result. The proposed algorithm effectively extracted and fused multimodal features, reducing the impact of environmental interference on any single sensor while improving segmentation accuracy and robustness. On the KITTI dataset, the algorithm achieved a mean intersection over union (mIoU) of 64.4%, outperforming traditional single-modal semantic segmentation methods by 8.6% and showing greater resilience to lighting variations. The color and texture information from the camera and the precise depth data from the LiDAR are complementary, and the multimodal fusion algorithm designed in this work completes the environmental semantic segmentation task with high accuracy and robustness.
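The spherical projection step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `spherical_projection`, the 64 × 1024 image size, and the vertical field of view of +3° to -25° are assumptions chosen to resemble a 64-beam sensor. The sketch produces a full 360° range image and omits the LiDAR-camera calibration that pixel-level alignment with the camera image would require.

```python
import numpy as np


def spherical_projection(points, height=64, width=1024,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto an (height, width) sparse depth map.

    All default parameter values are illustrative assumptions, not values
    taken from the paper. Pixels that receive no point are set to 0.
    """
    points = np.asarray(points, dtype=np.float64)
    depth = np.linalg.norm(points, axis=1)

    # Drop points at the sensor origin, which have no defined direction.
    valid = depth > 0
    points, depth = points[valid], depth[valid]
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    yaw = np.arctan2(y, x)        # horizontal angle in [-pi, pi]
    pitch = np.arcsin(z / depth)  # vertical angle in [-pi/2, pi/2]

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Map angles to pixel coordinates: the forward direction (yaw = 0)
    # lands in the middle column, and fov_up lands on the top row.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    u = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # When several points fall into one pixel, keep the nearest one.
    image = np.full((height, width), np.inf)
    np.minimum.at(image, (v, u), depth)
    image[np.isinf(image)] = 0.0
    return image.astype(np.float32)


if __name__ == "__main__":
    cloud = np.array([[10.0, 0.0, 0.0],   # straight ahead
                      [20.0, 0.0, 0.0],   # same direction, farther away
                      [-5.0, 0.0, 0.0]])  # straight behind
    depth_map = spherical_projection(cloud)
    print(depth_map.shape, np.count_nonzero(depth_map))
```

Most pixels in the result stay empty because a LiDAR scan covers the image grid only sparsely. This is the kind of input that sparsity-aware convolution is meant to handle.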

Key words

semantic segmentation / environmental perception / multimodal perception / deep learning / feature fusion / LiDAR / depth map

Cite this article

YANG Han, GAO Jun, LIU Yong, HE Xiuwei, TAN Li, YIN Yankun, SHEN Xiaolei, YANG Feifei, PENG Chenglei. Environmental Semantic Segmentation Algorithm via LiDAR Point Cloud Spherical Projection and Camera Fusion[J]. Equipment Environmental Engineering, 2025, 22(7): 9-15. https://doi.org/10.7643/issn.1672-9242.2025.07.002

Funding

Stability Support Project of the National Defense Key Laboratory of Science and Technology (JCKY2024209C001); Fundamental Research Funds for the Central Universities (2025300207)