Article

TFNet: Transformer-Based Multi-Scale Feature Fusion Forest Fire Image Detection Network

Hongying Liu, School of Computer and Artificial Intelligence, Nanjing University of Science and Technology Zijin College, Nanjing 210023, China
Fuquan Zhang, College of Information Science and Technology & Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China
Yiqing Xu, School of Computer and Software, Nanjing Vocational University of Industry Technology, Nanjing 210023, China
Junling Wang, School of Computer and Artificial Intelligence, Nanjing University of Science and Technology Zijin College, Nanjing 210023, China
Hong Lu, School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210038, China
Wei Wei, School of Computer and Artificial Intelligence, Nanjing University of Science and Technology Zijin College, Nanjing 210023, China
Jun Zhu, School of Computer and Software, Nanjing Vocational University of Industry Technology, Nanjing 210023, China

2025, en

ABI

Abstract

Forest fires pose a severe threat to ecological environments and to human lives and property, making real-time forest fire monitoring crucial. This study addresses challenges in forest fire image object detection, including small fire targets, sparse smoke, and difficult feature extraction, by proposing TFNet, a Transformer-based multi-scale feature fusion detection network. TFNet integrates four components: the SRModule, the CG-MSFF Encoder, the Decoder and Head, and the WIOU Loss. The SRModule employs a multi-branch structure to learn diverse feature representations of forest fire images, using 1 × 1 convolutions to generate redundant feature maps and enhance feature diversity. The CG-MSFF Encoder introduces a context-guided attention mechanism combined with adaptive feature fusion (AFF), enabling effective multi-scale feature fusion by reweighting features across layers and extracting both local and global representations. The Decoder and Head refine the output by iteratively optimizing target queries using self- and cross-attention, improving detection accuracy. Additionally, the WIOU Loss assigns varying weights to the IoU between predicted and ground-truth boxes, thereby balancing positive and negative samples and improving localization accuracy. Experimental results on two publicly available datasets, D-Fire and M4SFWD, demonstrate that TFNet outperforms the comparison models in precision, recall, F1-Score, mAP50, and mAP50–95. On the D-Fire dataset, TFNet achieved 81.6% precision, 74.8% recall, an F1-Score of 78.1%, mAP50 of 81.2%, and mAP50–95 of 46.8%. On the M4SFWD dataset, the corresponding figures were 86.6% precision, 83.3% recall, an F1-Score of 84.9%, mAP50 of 89.2%, and mAP50–95 of 52.2%. The proposed TFNet offers technical support for developing efficient and practical forest fire monitoring systems.
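
As a reading aid for the metrics quoted in the abstract, the following is a minimal Python sketch of two standard definitions: the F1-Score as the harmonic mean of precision and recall, and the plain IoU of two axis-aligned boxes. It is not the authors' code and does not implement the WIOU Loss, whose weighting scheme the abstract does not specify. The function names `f1_score` and `box_iou`, the `(x1, y1, x2, y2)` box format, and the example boxes are assumptions made for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def box_iou(a, b) -> float:
    """Plain IoU of two axis-aligned boxes in (x1, y1, x2, y2) format.

    The box format is an assumption; this is the unweighted IoU only.
    """
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# (precision %, recall %, reported F1 %) as quoted in the abstract.
reported = {"D-Fire": (81.6, 74.8, 78.1), "M4SFWD": (86.6, 83.3, 84.9)}

# Recompute F1 from the quoted precision and recall, rounded to one decimal.
f1_check = {name: round(f1_score(p, r), 1) for name, (p, r, _) in reported.items()}

# Two hypothetical 2x2 boxes overlapping in a 1x1 square.
example_iou = box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
```

Recomputing F1 from the quoted precision and recall reproduces the reported F1-Scores to one decimal place on both datasets, so the three figures are mutually consistent.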


Identifiers

DOI: 10.3390/fire8020059

Citations and references

2 citations, 0 references used
