Post-fusion Strategy for Aerial-ground Cross-view Pedestrian Target Matching

GAO Jun, YANG Han, LIU Yong, HE Xiuwei, TAN Li, YIN Yankun, SHEN Xiaolei, YANG Feifei, PENG Chenglei

Equipment Environmental Engineering, 2025, Vol. 22, Issue (7): 16-23.

Special Topic—Application and Collaborative Evaluation Technology of Light Weapons in Complex Environments
Abstract

Collaborative reconnaissance by heterogeneous aerial-ground equipment (drone and vehicle) in complex battlefield environments requires matching the same pedestrian across images taken from very different perspectives. This work proposes a cross-view aerial-ground target matching algorithm based on a post-fusion strategy to cope with large viewpoint differences (>60°) and drastic scale variations. First, a lightweight dual-branch YOLOv10 model detects pedestrians efficiently in the aerial and ground views. Second, appearance features from a multi-scale feature extraction network (ResNet-18 integrated with an adaptive spatial pyramid) are fused with geometric localization information to build a joint spatial-appearance representation of each target. Finally, the Hungarian algorithm minimizes a weighted cost function that combines the feature and geometric constraints, yielding the optimal cross-view target assignment. Experiments on the CVMHT dataset showed that the proposed method achieved average precision and recall rates of 81.4% and 79.0%, respectively, outperforming the baseline without fused appearance features (76.3% and 78.8%) and far exceeding traditional person re-identification methods such as ByteTrack V2 (33.8% and 36.1% after fine-tuning). By integrating detection, geometry, and appearance features through a post-fusion strategy, the algorithm avoids the dependence of pre-fusion methods on fixed view layouts and provides a flexible, robust target matching solution for collaborative reconnaissance by heterogeneous aerial-ground equipment.
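
The final matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact cost function, so the code assumes that each detection carries an appearance feature vector and a 2D position in a shared ground-plane frame, and that the cost is a weighted sum of cosine distance and normalized Euclidean distance. The names joint_cost and match_detections and the parameters w_app, max_dist, and gate are hypothetical and chosen only for illustration.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment (Hungarian algorithm with potentials).

    Expects a rectangular matrix with rows <= columns and returns
    sorted (row, col) pairs, one per row.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("cost matrix must have rows <= columns")
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-indexed)
    v = [0.0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)        # p[j]: row currently assigned to column j
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:       # flip assignments along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (na * nb)


def joint_cost(aerial, ground, w_app=0.5, max_dist=10.0):
    """Assumed cost: w_app * appearance term + (1 - w_app) * geometric term.

    Each detection is (feature_vector, (x, y)); both terms are scaled to [0, 1].
    """
    return [
        [
            w_app * cosine_distance(fa, fg) / 2.0
            + (1.0 - w_app) * min(math.dist(pa, pg) / max_dist, 1.0)
            for fg, pg in ground
        ]
        for fa, pa in aerial
    ]


def match_detections(aerial, ground, w_app=0.5, max_dist=10.0, gate=0.6):
    """Return sorted (aerial_index, ground_index) pairs with cost <= gate."""
    if not aerial or not ground:
        return []
    cost = joint_cost(aerial, ground, w_app, max_dist)
    if len(aerial) <= len(ground):
        pairs = hungarian(cost)
    else:
        # Transpose so that rows <= columns, then swap the indices back.
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(i, j) for j, i in hungarian(transposed)]
    return sorted((i, j) for i, j in pairs if cost[i][j] <= gate)
```

In this sketch, setting w_app to 0 reduces the cost to geometry alone, which loosely corresponds to the baseline without fused appearance features mentioned in the abstract. The gate threshold is an added assumption that rejects pairs whose joint cost is too high, so that a pedestrian visible in only one view can remain unmatched.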

Key words

cross-view aerial-ground target matching / post-fusion strategy / YOLOv10 / multi-scale feature fusion / geometric localization / Hungarian algorithm

Cite this article

GAO Jun, YANG Han, LIU Yong, HE Xiuwei, TAN Li, YIN Yankun, SHEN Xiaolei, YANG Feifei, PENG Chenglei. Post-fusion Strategy for Aerial-ground Cross-view Pedestrian Target Matching[J]. Equipment Environmental Engineering. 2025, 22(7): 16-23. https://doi.org/10.7643/issn.1672-9242.2025.07.003


Funding

Stability Support Project of the National Defense Key Laboratory of Science and Technology (JCKY2024209C001); Fundamental Research Funds for the Central Universities (2025300207)