Other

Grapevine Leaf Variety & Disease Dataset (GLVD) (Compiled Dataset)

Syed Taimoor Hussain Shah (PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin 10129, Italy)
Syed Adil Hussain Shah (PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin 10129, Italy)
Silvia Godio (PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin 10129, Italy)
Karim Kassem (PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin 10129, Italy)
Shahzad Ahmad Qureshi (Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad 45650, Pakistan)
Syed Bilal Hussain (School of Agriculture and Food Science, University College Dublin, Dublin 4, Ireland)
Marco Agostino Deriu (Polytechnic University of Turin)
Open MIND repository, 2026
ABI

Abstract

Overview

This repository provides a compiled and standardized image dataset for two related tasks:

- Grapevine leaf variety (cultivar) recognition (leaf-only images)
- Grapevine leaf disease classification (disease/health condition images)

The dataset was compiled, cleaned, unified in label space, and split into train/validation/test partitions to reproduce and re-express the experimental results reported in the manuscript:

Explainable AI-Based Two-Stage Swin Transformer for Grapevine Leaf Variety Recognition and Disease Classification
Syed Taimoor Hussain Shah, Syed Adil Hussain Shah, Silvia Godio, Karim Kassem, Shahzad Ahmad Qureshi, Syed Bilal Hussain, and Marco Agostino Deriu

This release supports reproducibility, benchmarking, and explainable AI (XAI) analysis using the same label definitions and dataset splits described in the manuscript.

Authors and Affiliations (Manuscript)

- PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin 10129, Italy
- Department of Research and Development (R&D), GPI SpA, Trento 38123, Italy
- Centro Medico Santagostino, Milan, Italy
- Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad 45650, Pakistan
- School of Agriculture and Food Science, University College Dublin, Dublin 4, Ireland

Dataset Sources and Acknowledgements (Original Creators)

This compiled dataset is derived from the following publicly available datasets.
All credit belongs to the original dataset creators and maintainers, and we sincerely thank them for making the data accessible to the community:

- Grapevine Leaves Image Dataset (Kaggle): Ak, Ala Idris, Büzgülü, Dimnit and Nazli grapevine leaves
  https://www.kaggle.com/datasets/muratkokludataset/grapevine-leaves-image-dataset
- Grapevine Leaves (Kaggle): leaf photos of 11 Vitis vinifera varieties
  https://www.kaggle.com/datasets/maximvlah/grapevine-leaves
- Grape400 Dataset (Kaggle)
  https://www.kaggle.com/datasets/nirmalsankalana/grape400-dataset
- Niphad Grape Leaf Disease Dataset (NGLD) (Mendeley Data), DOI: 10.17632/8nnd2ypcv3.4 (Version 4; published 4 March 2025)
  https://data.mendeley.com/datasets/8nnd2ypcv3/4
  Reported by the source: 2,726 images captured using mobile phones for real-world authenticity; 256×256 image dimensions and 96 dpi; categories include Downy Mildew, Bacterial Rot, Powdery Mildew, and Healthy Leaves.

Important: Please also cite the original datasets above when using this compiled dataset in any publication or derivative work.

What We Changed (Compilation and Unification)

To build a consistent benchmark aligned with the manuscript experiments, we applied the following steps:

- Unified the folder structure into Leaves/ and Diseases/ with train/val/test splits.
- Standardized class names (e.g., character normalization such as Büzgülü → Buzgulu, consistent spacing/case).
- Merged “Healthy” concepts across sources into a single Healthy Leaves label for the disease task.
- Removed duplicates and ensured clean partitioning so that train/val/test splits are disjoint.
- Created fixed validation and test sets to support fair and reproducible comparisons and XAI analysis.

Folder Structure

    Grape Leaves and Diseases Dataset/
    ├── Leaves/
    │   ├── train/
    │   ├── val/
    │   └── test/
    └── Diseases/
        ├── train/
        ├── val/
        └── test/

Each class is a subfolder inside train/, val/, and test/, compatible with standard frameworks (e.g., PyTorch ImageFolder).
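
Given the class-per-subfolder layout above, a downloaded copy can be sanity-checked before training. The following is a minimal sketch, not the authors' pipeline: it tallies images per class in each split and looks for byte-identical files shared between splits. The helper names, the list of image extensions, and the use of SHA-256 on raw file bytes are assumptions made for illustration; the deduplication method actually used for the release is not described here and may differ (for example, it may also catch near-duplicates).

```python
import hashlib
from collections import Counter
from pathlib import Path

# Assumption: these extensions cover the image files in the release.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}
SPLITS = ("train", "val", "test")


def iter_images(split_dir):
    """Yield (class_name, path) for every image under split_dir/<class>/."""
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        return  # a missing split simply contributes no images
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
                yield class_dir.name, path


def count_images(task_dir):
    """Return {split: Counter({class_name: n_images})} for one task folder."""
    return {
        split: Counter(name for name, _ in iter_images(Path(task_dir) / split))
        for split in SPLITS
    }


def cross_split_duplicates(task_dir):
    """Return groups of byte-identical files that occur in more than one split."""
    seen = {}  # sha256 digest -> list of (split, path)
    for split in SPLITS:
        for _, path in iter_images(Path(task_dir) / split):
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            seen.setdefault(digest, []).append((split, path))
    return [
        entries
        for entries in seen.values()
        if len({split for split, _ in entries}) > 1
    ]


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the archive was extracted.
    root = Path("Grape Leaves and Diseases Dataset")
    for task in ("Leaves", "Diseases"):
        counts = count_images(root / task)
        for split in SPLITS:
            total = sum(counts[split].values())
            print(f"{task}/{split}: {total} images in {len(counts[split])} classes")
        dups = cross_split_duplicates(root / task)
        print(f"{task}: {len(dups)} duplicate groups across splits")
```

The per-split totals printed by this script can be compared with the split summaries reported for each task; a nonzero duplicate count would suggest that the local copy differs from the released partitioning.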
Task 1: Leaf Variety (Cultivar) Recognition (Split Summary)

16 cultivars (leaf-only)

- Train: 1309 images
- Validation: 80 images
- Test: 80 images
- Total: 1469 images

Validation and test were fixed to 5 images per cultivar (16 × 5 = 80 each).

Task 2: Disease Classification (Split Summary)

7 classes (Healthy + 6 diseases)

- Train: 3470 images
- Validation: 428 images
- Test: 428 images
- Total: 4326 images

Class Lists

Leaves (Cultivars): Ak, Ala_Idris, Auxerrois, Buzgulu, Cabernet Franc, Cabernet Sauvignon, Chardonnay, Dimnit, Merlot, Muller Thurgau, Nazli, Pinot Noir, Riesling, Sauvignon Blanc, Syrah, Tempranillo

Diseases: Bacterial Rot, Black Measles, Black Rot, Downy Mildew, Healthy Leaves, Leaf Blight, Powdery Mildew

Recommended Citation (This Compiled Dataset)

If you use this compiled dataset, please cite:

- Zenodo record (this dataset), DOI: 10.5281/zenodo.18937397
  [Authors], “Grapevine Leaves and Diseases Dataset (Compiled) (GLVD),” Zenodo, 2026. DOI: 10.5281/zenodo.18937397.
- The manuscript: Explainable AI-Based Two-Stage Swin Transformer for Grapevine Leaf Variety Recognition and Disease Classification, Syed Taimoor Hussain Shah et al.
- The original datasets listed in “Dataset Sources and Acknowledgements”.

Intended Use

This dataset is released to:

- reproduce the training and evaluation pipeline described in the manuscript,
- benchmark models for leaf variety recognition and disease classification,
- enable explainable AI analyses (e.g., Grad-CAM variants, SHAP, LIME, RISE, and transformer-native attribution methods).

License and Terms

This compilation is shared for research and reproducibility purposes. Users must respect the licensing terms of the original data sources (the Kaggle datasets and the Mendeley dataset). If any original source has specific constraints, those constraints take precedence for the corresponding images.
Contact

For questions, issues, or collaboration, please contact the corresponding author(s):

- Syed Taimoor Hussain Shah: [email protected]
- Syed Bilal Hussain: [email protected]

Acknowledgement Statement

We gratefully acknowledge and thank the original dataset creators and maintainers on Kaggle and Mendeley Data for making their grape leaf and disease image datasets publicly available, enabling this compiled benchmark and the reproducibility study presented in our manuscript.
