ZHOU Jun-yan, WANG Jing-cheng, YANG Xiao-kui, SHU Chang, WANG Jin-mei, ZHANG Chen. Corrosion Thickness Loss Rate Data Enhancement Based on a Small Sample of GAN[J]. Equipment Environmental Engineering, 2023, 20(1): 142-150.
Corrosion Thickness Loss Rate Data Enhancement Based on a Small Sample of GAN
DOI: 10.7643/issn.1672-9242.2023.01.020
Keywords: corrosion thickness loss rate; small sample; generative adversarial networks; data enhancement; dimensionality reduction analysis; sample distribution
CLC number: TP399    Document code: A    Article ID: 1672-9242(2023)01-0142-09
|
Author | Institution |
ZHOU Jun-yan | Southwest Institute of Technology and Engineering, Chongqing 400039, China |
WANG Jing-cheng | Southwest Institute of Technology and Engineering, Chongqing 400039, China |
YANG Xiao-kui | Southwest Institute of Technology and Engineering, Chongqing 400039, China |
SHU Chang | Southwest Institute of Technology and Engineering, Chongqing 400039, China |
WANG Jin-mei | Southwest Institute of Technology and Engineering, Chongqing 400039, China |
ZHANG Chen | Southwest Institute of Technology and Engineering, Chongqing 400039, China |
|
Abstract:
Objective: To augment small-sample corrosion thickness loss rate data, expanding the data set in order to improve the prediction accuracy of subsequent analysis models, reduce overfitting, and improve model generalization. Methods: A generative adversarial network (GAN) was used to expand the corrosion thickness loss rate data so that the data distribution was more comprehensive. The generated data were visualized through dimensionality reduction to explore how the generated and original samples were distributed and to assess whether the augmentation was reasonable. Prediction ability and generalization ability were then evaluated with multiple algorithm models and multiple evaluation metrics. Results: The generated data filled the weakly covered regions of the original data in the sample space. After the generated data were added, the mean MSE obtained by each machine learning model was 61.72% to 91.74% of the value obtained without the generated data, and the mean Pearson coefficient was 99.01% to 113.64% of the value obtained without it. Prediction accuracy improved, the results were more strongly correlated, and model generalization was enhanced. Conclusion: GAN can effectively augment small-sample corrosion thickness loss rate data, and the data expansion provides positive support for analysis and prediction. The amount of generated data should not exceed the amount of original data, so that the distribution of the training samples is not disturbed. The diversity of the generated data is also limited.
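The abstract reports each metric after augmentation as a percentage of the same metric before augmentation. The sketch below shows one way such relative figures could be computed. It is an illustration only, not the authors' code: the function names and the sample values are hypothetical, and the paper's models and data are not reproduced here.

```python
from math import sqrt


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def relative_percent(with_generated, without_generated):
    """Express a metric obtained with generated data as a percentage
    of the same metric obtained without generated data."""
    return 100.0 * with_generated / without_generated


# Hypothetical measured values and predictions (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
pred_without = [1.5, 1.5, 3.5, 3.5]  # model trained on original data only
pred_with = [1.2, 1.8, 3.2, 3.8]     # model trained on original + generated data

mse_ratio = relative_percent(mse(y_true, pred_with), mse(y_true, pred_without))
pearson_ratio = relative_percent(
    pearson(y_true, pred_with), pearson(y_true, pred_without)
)

print(f"MSE relative to baseline: {mse_ratio:.2f}%")
print(f"Pearson relative to baseline: {pearson_ratio:.2f}%")
```

Under this reading, an MSE percentage below 100 means a lower error after augmentation, and a Pearson percentage above 100 means a stronger correlation between predictions and measured values.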
|
|
|