An AI-powered strawberry disease detection and diagnostic system built on three complementary models: a ResNet-50 image classifier for disease identification, an RF-DETR Small object detector for plant-part grounding, and MiniGPT-v2 for generating detailed, visually grounded diagnostic reports. The system serves a Gradio web interface where users upload or capture images of strawberry plants and receive structured diagnoses with actionable treatment recommendations sourced from university extension publications.
Research prototype developed at Middle Tennessee State University. Not a substitute for professional agronomic advice.
The system runs a three-stage pipeline on every uploaded image:
- ResNet-50 Classification -- A fine-tuned ResNet-50 classifies the image into one of seven conditions: healthy, drought, overwatering, root rot, frost injury, gray mold, or white mold. Test-time augmentation and temperature scaling produce calibrated confidence scores.
- RF-DETR Part Detection -- An RF-DETR Small object detector identifies which plant parts are visible in the image (flower, fruit, leaf, root, soil, stem). Per-class confidence thresholds are tuned to each part's detection characteristics. The detected parts constrain MiniGPT's output so it only describes what is actually visible.
- MiniGPT-v2 Grounded Explanation -- A LoRA-adapted MiniGPT-v2 (LLaMA-2-7B backbone) receives the ResNet diagnosis as ground truth and the RF-DETR part detections as visibility constraints. It generates a structured diagnostic report covering the diagnosis, visible symptoms on detected parts, and treatment recommendations. Parts that are not detected are excluded from the response, reducing hallucination.
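The way part detections constrain the report can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the threshold values, the function names visible_parts and build_prompt, and the prompt wording are all hypothetical, and the tuned thresholds live in the repository's own configuration.

```python
from typing import Dict, List, Tuple

# Hypothetical per-class thresholds, for illustration only.
PART_THRESHOLDS: Dict[str, float] = {
    "flower": 0.50, "fruit": 0.50, "leaf": 0.40,
    "root": 0.60, "soil": 0.60, "stem": 0.50,
}


def visible_parts(detections: List[Tuple[str, float]]) -> List[str]:
    """Keep each known part whose best score meets that part's threshold."""
    best: Dict[str, float] = {}
    for label, score in detections:
        if label in PART_THRESHOLDS:  # ignore labels outside the six parts
            best[label] = max(score, best.get(label, 0.0))
    return sorted(p for p, s in best.items() if s >= PART_THRESHOLDS[p])


def build_prompt(diagnosis: str, parts: List[str]) -> str:
    """Compose an instruction that fixes the diagnosis and limits described parts."""
    if parts:
        scope = "Describe symptoms only on these visible parts: " + ", ".join(parts) + "."
    else:
        scope = "No plant parts were detected; do not describe part-level symptoms."
    return f"The diagnosis is {diagnosis}. {scope} Then give treatment recommendations."


if __name__ == "__main__":
    dets = [("leaf", 0.45), ("fruit", 0.30), ("root", 0.90), ("leaf", 0.20)]
    print(build_prompt("gray mold", visible_parts(dets)))
```

Filtering before prompt construction means a part below its threshold never reaches the language model, which is one simple way to keep the report limited to what was detected.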
Image Upload --> ResNet-50 Classification --> RF-DETR Part Detection --> MiniGPT-v2 Grounded Report
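As a rough sketch of how test-time augmentation and temperature scaling can combine in the classification stage, the example below averages temperature-scaled softmax probabilities over several augmented views. It is written in plain Python so it runs without a GPU; the function names, the toy model, and the choice to average probabilities rather than logits are assumptions, and the project's actual implementation may differ.

```python
import math
from typing import Callable, List, Sequence, Tuple

CLASSES = ["healthy", "drought", "overwatering", "root rot",
           "frost injury", "gray mold", "white mold"]


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def tta_predict(image, model: Callable, augmentations: Sequence[Callable],
                temperature: float) -> Tuple[str, float]:
    """Average temperature-scaled probabilities across augmented views."""
    probs = [softmax(model(aug(image)), temperature) for aug in augmentations]
    n = len(probs)
    mean = [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]
    best = max(range(len(mean)), key=mean.__getitem__)
    return CLASSES[best], mean[best]


if __name__ == "__main__":
    # Toy stand-ins: the "image" is already a logit vector and the model is identity.
    toy_image = [0.1, 0.2, 0.0, 0.3, 0.1, 3.0, 0.5]
    views = [lambda v: v, lambda v: [x + 0.01 for x in v]]
    print(tta_predict(toy_image, lambda v: v, views, temperature=1.5))
```

A temperature above 1 flattens the distribution and lowers the reported confidence, while a temperature below 1 sharpens it; the value is normally fitted on a held-out validation set.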
A knowledge graph RAG system supports follow-up Q&A and provides part-specific treatment advice when RF-DETR detects particular plant structures.
- Three-stage AI pipeline with classification, detection, and vision-language explanation
- Seven disease classes: healthy, drought, overwatering, root rot, frost injury, gray mold, white mold
- Plant-part grounding: RF-DETR constrains MiniGPT to only describe visible parts, with optional bounding box overlay on the uploaded image
- Knowledge graph RAG: disease-specific follow-up Q&A with part-specific treatment recommendations grounded in university extension publications
- Hallucination detection: cross-checks model output against known disease indicators to flag and regenerate confused responses
- Web search integration: optional SERP API enrichment for supplementary context
- Dark-themed Gradio interface with chat, settings, and an interactive disease knowledge graph visualization
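The hallucination check listed above can be pictured as a phrase cross-check followed by a bounded retry. The sketch below is an assumption about the general shape, not the repository's implementation: the indicator phrases, the function names, and the retry limit are illustrative, and this naive substring match does not handle negation.

```python
from typing import Callable, Dict, List

# Illustrative indicator phrases; the project's lists live in its knowledge base.
INDICATORS: Dict[str, List[str]] = {
    "gray mold": ["gray fuzzy", "botrytis"],
    "white mold": ["white cottony", "sclerotinia"],
}


def conflicting_diseases(text: str, diagnosis: str) -> List[str]:
    """List other diseases whose indicator phrases appear in the text."""
    lowered = text.lower()
    return sorted(
        disease for disease, phrases in INDICATORS.items()
        if disease != diagnosis and any(p in lowered for p in phrases)
    )


def generate_checked(generate: Callable[[], str], diagnosis: str,
                     max_tries: int = 3) -> str:
    """Call generate() until the text has no conflicting indicators, up to max_tries."""
    text = ""
    for _ in range(max_tries):
        text = generate()
        if not conflicting_diseases(text, diagnosis):
            break
    return text  # may still conflict if every attempt was flagged


if __name__ == "__main__":
    drafts = iter(["White cottony growth on the stem.", "Gray fuzzy spores on the fruit."])
    print(generate_checked(lambda: next(drafts), "gray mold"))
```

Capping the retries keeps latency bounded; a caller that needs certainty would re-run conflicting_diseases on the returned text.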
- Python 3.9+ (the demo runs in a minigptv Conda environment; RF-DETR training uses a separate rfdetr environment with Python 3.11)
- CUDA GPU with 16 GB+ VRAM (RTX 3090 or better recommended)
- PyTorch with CUDA support
- Model weights (not included in the repository):
  - LLaMA-2-7B-chat: llama_weights/Llama-2-7b-chat-hf/
  - MiniGPT-v2 fine-tuned checkpoint: output/minigpt_strawberry/checkpoint_best.pth
  - ResNet-50 classifier: plant_diagnostic/models/resnet_strawberry.pth
  - RF-DETR Small fine-tuned checkpoint: rfdetr_parts/output/checkpoint_best_total.pth
- (Optional) SERP API key in a .env file for web search features
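Because the model weights are not shipped with the repository, a quick preflight check can save a failed launch. The script below is a hypothetical helper, not part of the project; it only tests that the four paths listed in the requirements exist relative to a given root.

```python
from pathlib import Path
from typing import List

# Paths as listed in the requirements, relative to the repository root.
REQUIRED_WEIGHTS = [
    "llama_weights/Llama-2-7b-chat-hf",
    "output/minigpt_strawberry/checkpoint_best.pth",
    "plant_diagnostic/models/resnet_strawberry.pth",
    "rfdetr_parts/output/checkpoint_best_total.pth",
]


def missing_weights(repo_root: str = ".") -> List[str]:
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_WEIGHTS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing model weights:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All model weights found.")
```

Run it from the repository root before starting the demo; it checks existence only and does not validate checkpoint contents.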
# Clone the repository
git clone https://github.com/greatroboticslab/AGAI.git
cd AGAI
# Create the Conda environment for the demo
conda env create -f environment.yml
conda activate minigptv
# Install PyTorch with CUDA support (adjust cu121 to match your driver)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Install remaining dependencies
pip install transformers==4.41.1 bitsandbytes gradio pillow numpy pandas plotly networkx python-dotenv timm serpapi
For RF-DETR training, a separate environment is needed:
conda create -n rfdetr python=3.11 -y
conda activate rfdetr
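# Install PyTorch from the CUDA wheel index, then rfdetr from PyPI
# (--index-url replaces PyPI, so a single combined command cannot resolve rfdetr)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install rfdetr
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True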
export SERP_API_KEY=your_key_here # optional, for web search features
Or place SERP_API_KEY=your_key_here in a .env file at the project root (gitignored).
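The two ways of supplying the key can be reconciled with a small lookup that prefers the exported variable and falls back to the .env file. The sketch below uses only the standard library as a stand-in for python-dotenv (which the install step lists as a dependency); the function names and the simple KEY=value parsing are assumptions, and how the demo actually reads the key is not shown in this document.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")  # drop optional quotes
    return values


def get_serp_api_key(env_file: str = ".env") -> Optional[str]:
    """Prefer the exported variable, then fall back to the .env file."""
    return os.environ.get("SERP_API_KEY") or read_env_file(env_file).get("SERP_API_KEY")


if __name__ == "__main__":
    key = get_serp_api_key()
    print("Web search enabled." if key else "No SERP_API_KEY found; web search disabled.")
```

Returning None when no key is present lets the caller disable web search instead of failing, which fits the optional status of the feature.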
python demo_v5.py
This starts a Gradio web interface. Open the printed URL in your browser, upload a strawberry plant image, and the system will return a structured diagnosis. Use the settings panel to toggle bounding box visualization, adjust generation temperature, or explore the knowledge graph tab.
To fine-tune MiniGPT-v2 on two GPUs:
CUDA_VISIBLE_DEVICES=0,1 MASTER_PORT=29607 PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
python -m torch.distributed.run --nproc_per_node=2 train.py \
--cfg-path train_configs/minigptv2_strawberry_v2.yaml
To train RF-DETR, prepare the COCO-format dataset and launch training in the rfdetr Conda environment:
conda activate rfdetr
python rfdetr_parts/prepare_dataset.py
python rfdetr_parts/train_rfdetr.py
Training hyperparameters are in rfdetr_parts/rfdetr_config.yaml. The trained checkpoint is saved to rfdetr_parts/output/.
AGAI/
├── demo_v5.py # Gradio web interface (main entry point)
├── resnet_classifier.py # ResNet-50 inference and TTA
├── train.py # MiniGPT-v2 training script
├── eval_only.py # Non-interactive evaluation
├── eval_holdout.py # Holdout set evaluation
├── test_grounding.py # End-to-end grounding test orchestrator
├── ui_components.py # Gradio UI HTML/CSS constants
├── dark_theme.css # UI theme stylesheet
├── AGAILogo.png # Project logo
├── environment.yml # Conda environment definition
│
├── grounding/ # RF-DETR grounding pipeline
│ ├── config.py # Paths, part lists, thresholds
│ ├── detector.py # Subprocess wrapper for RF-DETR inference
│ ├── analyzer.py # Text analysis (negation, advice, calyx remap)
│ ├── inference.py # MiniGPT loading and prompt construction
│ └── report.py # Structured terminal output
│
├── rfdetr_parts/ # RF-DETR Small training and inference
│ ├── rfdetr_config.yaml # Training hyperparameters
│ ├── prepare_dataset.py # COCO dataset preparation and splitting
│ ├── train_rfdetr.py # Training script
│ ├── detect_images.py # Inference with per-class thresholds
│ └── eval_thresholds.py # Confidence threshold evaluation
│
├── knowledge_graph/ # Disease knowledge RAG system
│ ├── disease_knowledge_base.json # Sourced treatments, part-specific advice
│ ├── rag_retriever.py # Context retrieval, hallucination checks
│ ├── qa_retriever.py # Follow-up Q&A with pattern matching
│ └── README.md
│
├── minigpt4/ # Core MiniGPT-v2 framework
│ ├── models/ # Model architectures (MiniGPT-v2, EVA-ViT, Q-Former)
│ ├── datasets/ # Dataset builders and loaders
│ ├── processors/ # Image and text processors
│ ├── runners/ # Training runners
│ ├── tasks/ # Training task definitions
│ ├── conversation/ # Conversation management
│ └── common/ # Utilities, config, registry
│
├── plant_diagnostic/ # ResNet-50 training data and models
│ ├── data/ # Training images (7 classes)
│ ├── models/ # Trained ResNet checkpoint
│ ├── scripts/ # Training and utility scripts
│ └── src/ # Source modules
│
├── hallucination_evaluation/ # Response quality evaluation suite
├── evaluation/ # Model evaluation scripts
├── configs/ # MiniGPT model config YAMLs
├── eval_configs/ # Inference configuration
├── train_configs/ # Training configuration
├── figs/ # Architecture diagrams and paper figures
├── documentation/ # Session notes
├── annotationjsons/ # Label Studio annotation exports
├── RESEARCH_PAPER.md # Research paper (Markdown)
└── RESEARCH_PAPER.docx # Research paper (Word)
Large directories excluded from the repository via .gitignore:
- llama_weights/ -- LLaMA-2-7B-chat model weights
- output/ -- MiniGPT training checkpoints
- checkpoints/ -- Stage-2 checkpoints
- rfdetr_parts/dataset/ and rfdetr_parts/output/ -- RF-DETR training data and checkpoints
The knowledge_graph/disease_knowledge_base.json file contains detailed, citation-backed information for all seven disease classes, including:
- Symptoms, causes, visual indicators, and severity ratings
- Treatment protocols web-crawled and curated from UC IPM, Penn State Extension, NC State Extension, Cornell Extension, University of Minnesota Extension, and others
- Part-specific treatments keyed to RF-DETR detections (e.g., fruit-specific advice for gray mold when fruit is detected in the image)
- Recovery timelines and prevention strategies
The DiseaseRAG retriever provides context injection and hallucination checking, while DiseaseQA handles follow-up questions with pattern-matched routing to the appropriate knowledge section.
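To make pattern-matched routing concrete, here is a small sketch of how a follow-up question might be mapped to a knowledge section and combined with part-specific advice. Everything in it is hypothetical: the dictionary layout, the section names, the regular expressions, and the functions route and answer are assumptions for illustration, the placeholder strings stand in for real knowledge-base text, and the actual schema of disease_knowledge_base.json and the DiseaseQA API may differ.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical entry with placeholder text; not the real knowledge-base schema.
KB: Dict[str, dict] = {
    "gray mold": {
        "symptoms": "<symptom text>",
        "treatment": "<general treatment text>",
        "prevention": "<prevention text>",
        "recovery": "<recovery timeline text>",
        "part_treatments": {"fruit": "<fruit-specific advice>"},
    }
}

# Ordered (pattern, section) pairs; the first match wins.
ROUTES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"\b(treat|treatment|cure|spray)\b", re.I), "treatment"),
    (re.compile(r"\b(prevent|prevention|avoid)\b", re.I), "prevention"),
    (re.compile(r"\b(symptom|symptoms|signs?|look like)\b", re.I), "symptoms"),
    (re.compile(r"\b(how long|recover|recovery|timeline)\b", re.I), "recovery"),
]


def route(question: str) -> Optional[str]:
    """Return the first knowledge section whose pattern matches the question."""
    for pattern, section in ROUTES:
        if pattern.search(question):
            return section
    return None


def answer(question: str, disease: str, detected_parts: List[str]) -> Optional[str]:
    """Look up the routed section, appending part-specific treatment advice."""
    section = route(question)
    entry = KB.get(disease)
    if section is None or entry is None:
        return None
    if section == "treatment":
        extras = [entry["part_treatments"][p] for p in detected_parts
                  if p in entry.get("part_treatments", {})]
        return " ".join([entry["treatment"]] + extras)
    return entry.get(section)


if __name__ == "__main__":
    print(answer("How do I treat this?", "gray mold", ["fruit", "leaf"]))
```

Returning None for unmatched questions leaves room for a fallback, such as passing the question to the language model or the optional web search.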
@software{plant_diagnostic_system,
title = {Plant Diagnostic System: AI-Powered Strawberry Disease Detection},
author = {William Starks and Gus Marcum},
year = {2026},
url = {https://github.com/greatroboticslab/AGAI}
}
This project builds on:
- MiniGPT-v2 -- vision-language model framework
- LLaMA-2 -- language model backbone (Meta AI)
- RF-DETR -- real-time detection transformer (Roboflow)
- ResNet -- image classification architecture
Treatment recommendations in the knowledge base are web-crawled and sourced from university cooperative extension publications (UC IPM, Cornell, Penn State, NC State, UMN, Virginia Tech, Utah State, UGA). See individual disease entries in knowledge_graph/disease_knowledge_base.json for full citations.
The models were trained on a curated corpus of publicly available web-scale data and indexed digital repositories.
Please respect upstream licenses and dataset terms of use.





