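Remote Sensing Image Scene Classification Method Based on Multi-Scale Cyclic Attention Network

Ma Xinyue; Wang Liming; Qi Kunlun; Zheng Guizhou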

doi:10.3799/dqkx.2020.365

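School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China

Funding:

National Natural Science Foundation of China 42130309

National Key Research and Development Program of China KZ21KA0002

National Key Research and Development Program of China 2020111052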

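About the author:
Ma Xinyue (1998-), female, master's student; research interests: remote sensing image interpretation and machine learning. ORCID: 0000-0002-6765-6384. E-mail: Maxy@cug.edu.cn

Corresponding author:
Zheng Guizhou (1963-), ORCID: 0000-0002-2890-6395. E-mail: zhenggz@cug.edu.cn

CLC number: P237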
Publication history
- Received: 2020-11-11
- Published online: 2021-11-03
- Issue published: 2021-11-03

Abstract: Scene classification of high-resolution remote sensing images has long been a research hotspot in remote sensing. Because remote sensing scenes differ in the scale at which they are best recognized, this paper proposes a scene classification method based on a multi-scale cyclic attention network. First, ResNet50 extracts features from the remote sensing image at multiple scales, an attention mechanism locates the region of interest at each scale, and each region of interest is cropped, rescaled, and fed back into the network. Then, the multi-scale features of the original image are fused with the features of the cropped regions of interest and passed to a fully connected layer for classification. The method was validated on the public UC Merced Land-Use and NWPU-RESISC45 datasets, where its average classification accuracy exceeds that of the baseline ResNet50 by 1.89% and 2.70%, respectively. The results show that the multi-scale cyclic attention network can further improve the accuracy of remote sensing image scene classification.

Keywords:
- remote sensing /
- scene classification /
- multi-scale /
- convolutional neural network /
- attention mechanism
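
The crop-and-rescale step described in the abstract can be illustrated with a small sketch. It assumes the attention module yields a square region given by a centre (`tx`, `ty`) and a half-side length `tl`, as in the recurrent attention network of Fu et al. (2017) listed in the references. The function name `crop_and_zoom`, the nearest-neighbour resampling, and the toy image are hypothetical choices made for illustration and are not taken from the paper.

```python
def crop_and_zoom(image, tx, ty, tl, out_size):
    """Crop the square [tx-tl, tx+tl) x [ty-tl, ty+tl) and resize it.

    `image` is a list of rows (a 2-D grid of pixel values) and tl >= 1.
    Nearest-neighbour resampling is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    # Clamp the box so that it stays inside the image.
    x0, x1 = max(0, tx - tl), min(width, tx + tl)
    y0, y1 = max(0, ty - tl), min(height, ty + tl)
    crop = [row[x0:x1] for row in image[y0:y1]]
    crop_h, crop_w = len(crop), len(crop[0])
    # Map every output pixel back to its nearest source pixel in the crop.
    return [
        [crop[r * crop_h // out_size][c * crop_w // out_size]
         for c in range(out_size)]
        for r in range(out_size)
    ]


# Toy 4x4 "image" whose pixel value encodes its position (10 * row + col).
image = [[10 * r + c for c in range(4)] for r in range(4)]
zoomed = crop_and_zoom(image, tx=2, ty=2, tl=1, out_size=4)
for row in zoomed:
    print(row)
```

In this toy case the central 2×2 patch is enlarged back to 4×4, which is the sense in which the attended region is shown to the network again at a finer effective scale.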

Fig. 1. Flow chart of remote sensing image scene classification

Fig. 2. Structure of the multi-scale cyclic attention network

Fig. 3. Mechanism of the APN

Fig. 4. Sample images from the UC Merced Land-Use dataset

Fig. 5. Sample images from the NWPU-RESISC45 dataset

Fig. 6. Classification accuracy curves of networks trained with different scale combinations (UC Merced Land-Use)

Fig. 7. Classification accuracy curves of networks trained with different scale combinations (NWPU-RESISC45)

Fig. 8. Comparison of inter-class misclassification rates on the UC Merced Land-Use dataset

a. Confusion matrix of the single-scale model (OA = 98.1%); b. confusion matrix of the multi-scale model (OA = 98.57%)

Fig. 9. Easily confused categories on the UC Merced Land-Use dataset

Fig. 10. Comparison of inter-class misclassification rates on the NWPU-RESISC45 dataset

a. Confusion matrix of the single-scale model (OA = 90.62%); b. confusion matrix of the multi-scale model (OA = 91.18%)

Fig. 11. Easily confused categories on the NWPU-RESISC45 dataset
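
The captions of Figs. 8 and 10 quote an overall accuracy (OA) for each confusion matrix. As a minimal sketch of how such a figure is usually derived, the code below computes OA as the share of samples on the diagonal and a per-class misclassification rate from each row. The 3×3 matrix is a made-up toy example, not data from the paper, and the function names are hypothetical.

```python
def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def misclassification_rates(confusion):
    """Per-class share of samples assigned to any wrong class (rows = true class)."""
    return [1 - row[i] / sum(row) for i, row in enumerate(confusion)]


# Toy confusion matrix: rows are true classes, columns are predicted classes.
toy = [
    [18, 2, 0],
    [1, 19, 0],
    [0, 0, 20],
]
print(f"OA = {overall_accuracy(toy):.2%}")
print([round(rate, 2) for rate in misclassification_rates(toy)])
```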

Table 1. ResNet50 network configuration

Layer name	50-layer
Conv1	7×7, 64, stride 2
Conv2_x	3×3 max pool, stride 2
Conv2_x	$\left[\begin{array}{c}1\times 1,\ 64\\ 3\times 3,\ 64\\ 1\times 1,\ 256\end{array}\right]\times 3$
Conv3_x	$\left[\begin{array}{c}1\times 1,\ 128\\ 3\times 3,\ 128\\ 1\times 1,\ 512\end{array}\right]\times 4$
Conv4_x	$\left[\begin{array}{c}1\times 1,\ 256\\ 3\times 3,\ 256\\ 1\times 1,\ 1024\end{array}\right]\times 6$
Conv5_x	$\left[\begin{array}{c}1\times 1,\ 512\\ 3\times 3,\ 512\\ 1\times 1,\ 2048\end{array}\right]\times 3$
	GAP, k-d FC, softmax
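
As a quick check on the "50-layer" label, the weighted layers in Table 1 can be tallied. The sketch below follows the usual counting convention for ResNet, in which pooling, softmax, and shortcut projections are not counted; that convention is an assumption stated here, not something the table spells out.

```python
# Bottleneck stages from Table 1: (stage name, convolutions per block, number of blocks).
STAGES = [
    ("Conv2_x", 3, 3),
    ("Conv3_x", 3, 4),
    ("Conv4_x", 3, 6),
    ("Conv5_x", 3, 3),
]

stem_layers = 1  # Conv1: 7x7, 64, stride 2
fc_layers = 1    # the final k-d fully connected layer
block_layers = sum(convs * blocks for _, convs, blocks in STAGES)
total_layers = stem_layers + block_layers + fc_layers

print(block_layers, total_layers)
```

The four stages contribute 48 convolutions, and the stem convolution and the fully connected layer bring the total to 50.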

Table 2. Information about the two datasets

Dataset	Scene classes	Images per class	Total images	Image size (pixels)	Training ratio
UC Merced Land-Use	21	100	2 100	256×256	80%
NWPU-RESISC45	45	700	31 500	256×256	10%
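
The totals and the implied split sizes in Table 2 follow from simple arithmetic. The sketch below assumes that the images not used for training form the test set, which the table does not state explicitly; the helper name `split_sizes` is hypothetical.

```python
# Values copied from Table 2.
DATASETS = {
    "UC Merced Land-Use": {"classes": 21, "per_class": 100, "train_pct": 80},
    "NWPU-RESISC45": {"classes": 45, "per_class": 700, "train_pct": 10},
}


def split_sizes(classes, per_class, train_pct):
    """Return (total, training, remaining) image counts for one dataset."""
    total = classes * per_class
    train = total * train_pct // 100
    return total, train, total - train


splits = {name: split_sizes(**info) for name, info in DATASETS.items()}
for name, (total, train, rest) in splits.items():
    print(f"{name}: total={total}, train={train}, remaining={rest}")
```

The contrast is worth noting: UC Merced Land-Use trains on most of a small dataset, while NWPU-RESISC45 trains on a small fraction of a much larger one.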

Table 3. Classification accuracy of different scale features on the UC Merced Land-Use dataset

No.	Scale	A-OA (%)
1	S_128_256	97.85 $\pm$ 0.67
2	S_160_256	98.10 $\pm$ 0.39
3	S_192_256	98.51 $\pm$ 0.11
4	S_224_256	98.33 $\pm$ 0.14
5	S_256	98.18 $\pm$ 0.09
6	S_288_256	98.10 $\pm$ 0.39

Table 4. Classification accuracy of different scale features on the NWPU-RESISC45 dataset

No.	Scale	A-OA (%)
1	S_128_256	91.04 $\pm$ 0.03
2	S_160_256	90.86 $\pm$ 0.19
3	S_192_256	91.18 $\pm$ 0.02
4	S_224_256	90.19 $\pm$ 0.31
5	S_256	90.25 $\pm$ 0.20
6	S_288_256	90.85 $\pm$ 0.27
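
Tables 3 and 4 can be compared programmatically to see which scale setting performs best and how much it gains over the single-scale setting S_256. The sketch below stores each row as a (mean, standard deviation) pair copied from the tables, reading a value printed as "00.14" as 0.14. Treating S_256 as the single-scale reference is an interpretation of the scale labels, and the helper names are hypothetical.

```python
# (mean A-OA in %, standard deviation) per scale setting, copied from Tables 3 and 4.
TABLE_3_UCM = {
    "S_128_256": (97.85, 0.67),
    "S_160_256": (98.10, 0.39),
    "S_192_256": (98.51, 0.11),
    "S_224_256": (98.33, 0.14),
    "S_256": (98.18, 0.09),
    "S_288_256": (98.10, 0.39),
}
TABLE_4_NWPU = {
    "S_128_256": (91.04, 0.03),
    "S_160_256": (90.86, 0.19),
    "S_192_256": (91.18, 0.02),
    "S_224_256": (90.19, 0.31),
    "S_256": (90.25, 0.20),
    "S_288_256": (90.85, 0.27),
}


def best_scale(table):
    """Return the scale label with the highest mean accuracy."""
    return max(table, key=lambda scale: table[scale][0])


def gain_over_single_scale(table, single="S_256"):
    """Return the best label and its gain (percentage points) over `single`."""
    best = best_scale(table)
    return best, round(table[best][0] - table[single][0], 2)


ucm_result = gain_over_single_scale(TABLE_3_UCM)
nwpu_result = gain_over_single_scale(TABLE_4_NWPU)
print(ucm_result, nwpu_result)
```

On both datasets the S_192_256 setting has the highest mean accuracy, and it is also the row whose value reappears as the proposed model in Tables 5 and 6.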

Table 5. Classification accuracy of different methods on the UC Merced Land-Use dataset

Method	OA (%)
BoVW (Yang and Newsam, 2010)	76.80
GoogleNet (Nogueira et al., 2017)	92.80
CaffeNet (Xia et al., 2017)	95.02 $\pm$ 0.81
ResNet50 (Zhang et al., 2019)	96.62 $\pm$ 0.26
GLM16 (Yuan et al., 2019)	94.97 $\pm$ 1.16
VGG-VD16 + MSCP (He et al., 2018)	98.36 $\pm$ 0.58
AlexNet + MSCP (He et al., 2018)	97.29 $\pm$ 0.63
Proposed method	98.51 $\pm$ 0.11

Table 6. Classification accuracy of different methods on the NWPU-RESISC45 dataset

Method	OA (%)
BoVW (Cheng et al., 2017)	41.72 $\pm$ 0.21
Fine-tuned AlexNet (Cheng et al., 2017)	81.22 $\pm$ 0.19
Fine-tuned GoogleNet (Cheng et al., 2017)	82.57 $\pm$ 0.12
Fine-tuned VGGNet-16 (Cheng et al., 2017)	87.15 $\pm$ 0.45
ResNet50 (Zhao et al., 2020)	88.48 $\pm$ 0.21
VGG-VD16 + MSCP (He et al., 2018)	85.33 $\pm$ 0.17
AlexNet + MSCP (He et al., 2018)	81.70 $\pm$ 0.23
Proposed method	91.18 $\pm$ 0.02
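
The improvements over ResNet50 quoted in the abstract can be reproduced from the ResNet50 and proposed-model rows of Tables 5 and 6. The short sketch below does only that subtraction; the dictionary layout and variable names are illustrative.

```python
# Mean OA (%) of the ResNet50 baseline and the paper's model, from Tables 5 and 6.
RESULTS = {
    "UC Merced Land-Use": {"resnet50": 96.62, "proposed": 98.51},
    "NWPU-RESISC45": {"resnet50": 88.48, "proposed": 91.18},
}

# Gain of the proposed model over ResNet50, in percentage points.
gains = {
    dataset: round(oa["proposed"] - oa["resnet50"], 2)
    for dataset, oa in RESULTS.items()
}
for dataset, gain in gains.items():
    print(f"{dataset}: +{gain:.2f} percentage points")
```

The differences come out to 1.89 and 2.70, matching the figures in the abstract; they are absolute differences in percentage points rather than relative improvements.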

References

[1]	Bahdanau, D., Cho, K., Bengio, Y., 2014. Neural Machine Translation by Jointly Learning to Align and Translate. Computer Science, arXiv: 1409.0473. https://arxiv.org/abs/1409.0473
[2]	Castelluccio, M., Poggi, G., Sansone, C., et al., 2015. Land Use Classification in Remote Sensing Images by Convolutional Neural Networks. Acta Ecologica Sinica, 28(2): 627-635. http://pdfs.semanticscholar.org/4191/fe93bfd883740a881e6a60e54b371c2f241d.pdf
[3]	Chen, Q.H., Liu, Z.M., Liu, X.G., et al., 2010. Element-Oriented Land-Use Classification of Mining Area by High Spatial Resolution Remote Sensing Image. Earth Science, 35(3): 453-458 (in Chinese with English abstract). http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=5631116 https://doi.org/10.3799/dqkx.2010.055
[4]	Chen, S.Z., Tian, Y.L., 2014. Pyramid of Spatial Relatons for Scene-Level Land Use Classification. IEEE Transactions on Geoscience and Remote Sensing, 53(4): 1947-1957. https://doi.org/10.1109/TGRS.2014.2351395
[5]	Cheng, G., Han, J., Lu, X., 2017. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proceedings of the IEEE, 105(10): 1865-1883. https://doi.org/10.1109/JPROC.2017.2675998
[6]	Cheng, G., Ma, C.C., Zhou, P.C., et al., 2016. Scene Classification of High Resolution Remote Sensing Images Using Convolutional Neural Networks. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, 767-770. https://doi.org/10.1109/IGARSS.2016.7729193
[7]	Cheng, G.X., Niu, R.Q., Zhang, K.X., et al., 2018. Opencast Mining Area Recognition in High-Resolution Remote Sensing Images Using Convolutional Neural Networks. Earth Science, 43(Suppl. 2): 256-262 (in Chinese with English abstract). http://en.cnki.com.cn/Article_en/CJFDTotal-DQKX2018S2021.htm https://doi.org/10.3799/dqkx.2018.987
[8]	Fu, J.L., Zheng, H.L., Mei, T., 2017. Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, 4476-4484. https://doi.org/10.1109/CVPR.2017.476
[9]	Gómez-Chova, L., Tuia, D., Moser, G., et al., 2015. Multimodal Classification of Remote Sensing Images: A Review and Future Directions. Proceedings of the IEEE, 103(9): 1560-1584. https://doi.org/10.1109/JPROC.2015.2449668
[10]	Han, X.B., Zhong, Y.F., Cao, L.Q., et al., 2017. Pre-Trained AlexNet Architecture with Pyramid Pooling and Supervision for High Spatial Resolution Remote Sensing Image Scene Classification. Remote Sensing, 9(8): 848. https://doi.org/10.3390/rs9080848
[11]	He, K.M., Zhang, X.Y., Ren, S.Q., et al., 2016. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, 770-778. https://doi.org/10.1109/CVPR.2016.90
[12]	He, N.J., Fang, L.Y., Li, S.T., et al., 2018. Remote Sensing Scene Classification Using Multilayer Stacked Covariance Pooling. IEEE Transactions on Geoscience and Remote Sensing, 56(12): 6899-6910. https://doi.org/10.1109/TGRS.2018.2845668
[13]	Jia, Y.Q., Shelhamer, E., Donahue, J., et al., 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. In Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, Florida, USA, 675-678. https://doi.org/10.1145/2647868.2654889
[14]	Ketkar, N., 2017. Introduction to PyTorch. Deep Learning with Python. Apress, Berkeley, CA, 195-208. https://doi.org/10.1007/978-1-4842-2766-4_12
[15]	Li, G.D., Zhang, C.J., Wang, M.K., et al., 2019. Transfer Learning Using Convolutional Neural Network for Scene Classification within High Resolution Remote Sensing Image. Science of Surveying and Mapping, 44(4): 116-123, 174 (in Chinese with English abstract). http://en.cnki.com.cn/Article_en/CJFDTotal-CHKD201904021.htm
[16]	Li, W.K., Zhang, W., Qin, J.H., et al., 2020. "Expansion-Fusion" Extraction of Surface Gully Area Based on DEM and High-Resolution Remote Sensing Images. Earth Science, 45(6): 1948-1955 (in Chinese with English abstract). https://doi.org/10.3799/dqkx.2020.004
[17]	Lienou, M., Maitre, H., Datcu, M., 2009. Semantic Annotation of Satellite Images Using Latent Dirichlet Allocation. IEEE Geoscience and Remote Sensing Letters, 7(1): 28-32. https://doi.org/10.1109/LGRS.2009.2023536
[18]	Luo, W., Li, H.L., Liu, G.H., 2011. Automatic Annotation of Multispectral Satellite Images Using Author-Topic Model. IEEE Geoscience and Remote Sensing Letters, 9(4): 634-638. https://doi.org/10.1109/LGRS.2011.2177064
[19]	Nogueira, K., Penatti, O.A.B., dos Santos, J.A., 2017. Towards Better Exploiting Convolutional Neural Networks for Remote Sensing Scene Classification. Pattern Recognition, 61: 539-556. https://doi.org/10.1016/j.patcog.2016.07.001
[20]	Oliva, A., Torralba, A., 2001. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. International Journal of Computer Vision, 42(3): 145-175. https://doi.org/10.1023/A:1011139631724
[21]	Pan, S.J., Yang, Q., 2009. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22(10): 1345-1359. https://doi.org/10.1109/TKDE.2009.191
[22]	Simonyan, K., Zisserman, A., 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. Computer Science, arXiv: 1409.1556. https://arxiv.org/abs/1409.1556
[23]	Szegedy, C., Liu, W., Jia, Y.Q., et al., 2015. Going Deeper with Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 1-9. https://doi.org/10.1109/CVPR.2015.7298594
[24]	Xia, G.S., Hu, J.W., Hu, F., et al., 2017. AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification. IEEE Transactions on Geoscience and Remote Sensing, 55(7): 3965-3981. https://doi.org/10.1109/TGRS.2017.2685945
[25]	Yang, Y., Newsam, S., 2010. Bag-of-Visual-Words and Spatial Extensions for Land-Use Classification. In Proceedings of the ACM International Symposium on Advances in Geographic Information Systems, San Jose, California, 270-279. https://doi.org/10.1145/1869790.1869829
[26]	Yang, Y., Newsam, S., 2013. Geographic Image Retrieval Using Local Invariant Features. IEEE Transactions on Geoscience and Remote Sensing, 51(2): 818-832. https://doi.org/10.1109/TGRS.2012.2205158
[27]	Yu, D.H., Zhang, B.M., Zhao, C., et al., 2020. Scene Classification of Remote Sensing Image Using Ensemble Convolutional Neural Network. Journal of Remote Sensing, 24(6): 717-727 (in Chinese with English abstract). https://www.cnki.com.cn/Article/CJFDTOTAL-YGXB202006006.htm
[28]	Yu, S.C., Yu, D.Q., Wang, L.C., et al., 2019. Remote Sensing Study of Dongting Lake Beach Changes before and after Operation of Three Gorges Reservoir. Earth Science, 44(12): 4275-4283 (in Chinese with English abstract). http://en.cnki.com.cn/Article_en/CJFDTotal-DQKX201912037.htm https://doi.org/10.3799/dqkx.2019.182
[29]	Yuan, Y., Fang, J., Lu, X.Q., et al., 2019. Remote Sensing Image Scene Classification Using Rearranged Local Features. IEEE Transactions on Geoscience and Remote Sensing, 57(3): 1779-1792. https://doi.org/10.1109/TGRS.2018.2869101
[30]	Zhang, D., Li, N., Ye, Q.L., 2019. Positional Context Aggregation Network for Remote Sensing Scene Classification. IEEE Geoscience and Remote Sensing Letters, 17(6): 943-947. https://doi.org/10.1109/LGRS.2019.2937811
[31]	Zhao, Z.C., Li, J.Q., Luo, Z., et al., 2020. Remote Sensing Image Scene Classification Based on an Enhanced Attention Module. IEEE Geoscience and Remote Sensing Letters, (99): 1-5. https://doi.org/10.1109/LGRS.2020.3011405
