Jacquin Antoine 745a64b342 Complete Radiacode 103 pipeline - automatic isotope identification
- VegaModel CNN-FCNN, 34.5M params, 82 isotopes, val acc 99.89%
- Generation of 50k synthetic 1D spectra (12-24 h durations)
- Training for 100 epochs on an RTX 5060 Ti (CUDA 12.8, Blackwell)
- Continuous detection with background subtraction
- 24 h background capture with disconnection handling
- Docker Compose: train container (GPU) + detect container (CPU/USB)
- Trained model included (vega_best.pt, 395 MB)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-19 12:29:56 +02:00

ML for Isotope Identification

A machine learning system for identifying radioactive isotopes from gamma-ray spectra captured by Radiacode scintillation detectors.

Project Status

Completed: Synthetic gamma spectra generation system
Completed: Vega ML model architecture (CNN-FCNN hybrid)
Completed: Training pipeline with GPU support
Completed: Inference engine
🔲 Next: Generate large training dataset (10,000-100,000 samples)
🔲 Future: Real-time inference on Radiacode devices


Overview

This project aims to build a neural network that identifies radioactive isotopes from gamma spectra. Because collecting real gamma spectra requires radioactive sources, which are expensive and regulated, we generate synthetic training data from realistic physics models.

Target Hardware

  • Training: NVIDIA RTX 5090 GPU (requires PyTorch nightly with CUDA 12.8)
  • Inference: Radiacode 101, 102, 103, 103G, 110 scintillation detectors

Data Format

  • Input: 2D spectrograms (time intervals × 1023 energy channels)
  • Output: Multi-label isotope classification with activity estimation
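
To make the output format concrete, here is a minimal sketch of how a multi-label target with per-isotope activity might be encoded and how predictions might be decoded. The four-isotope list, the `encode_target` and `decode_prediction` helpers, the 0.5 threshold, and the activity units are illustrative assumptions; the project's real mapping lives in `training/vega/isotope_index.py` and may differ.

```python
# Hypothetical 4-isotope subset; the real project indexes 82 isotopes.
ISOTOPES = ["Am-241", "Co-60", "Cs-137", "K-40"]
ISOTOPE_TO_INDEX = {name: i for i, name in enumerate(ISOTOPES)}


def encode_target(sources):
    """Turn {isotope: activity} into (multi-hot labels, activity vector)."""
    labels = [0.0] * len(ISOTOPES)
    activities = [0.0] * len(ISOTOPES)
    for name, activity in sources.items():
        idx = ISOTOPE_TO_INDEX[name]
        labels[idx] = 1.0              # isotope is present
        activities[idx] = float(activity)
    return labels, activities


def decode_prediction(probabilities, threshold=0.5):
    """Return the isotopes whose predicted probability reaches the threshold."""
    return [name for name, p in zip(ISOTOPES, probabilities) if p >= threshold]


labels, activities = encode_target({"Cs-137": 150.0, "K-40": 40.0})
print(labels)      # [0.0, 0.0, 1.0, 1.0]
print(activities)  # [0.0, 0.0, 150.0, 40.0]
print(decode_prediction([0.02, 0.10, 0.97, 0.61]))  # ['Cs-137', 'K-40']
```

Because several isotopes can be present at once, each label is an independent yes/no flag rather than one class out of many.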

Quick Start

Installation

# Create virtual environment
python -m venv .venv
.venv\Scripts\activate  # Windows
# or: source .venv/bin/activate  # Linux/Mac

# Install dependencies
pip install numpy scipy pillow

# Install PyTorch (nightly for RTX 5090/Blackwell support)
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu128

Generate Synthetic Data

# Generate 10 test samples
python -m synthetic_spectra.generate_spectra

Train the Model

# Quick test run (5 epochs, small dataset)
python training/vega/run_training.py --test

# Full training
python training/vega/run_training.py --epochs 100 --batch-size 32

Run Inference

# Run inference on synthetic data
python inference/run_inference.py --model models/vega_best.pt --data data/synthetic

Vega Model Architecture

Vega is a CNN-FCNN hybrid model for isotope identification from gamma spectra. The design follows published CNN-FCNN research (see Acknowledgments) that reports 99%+ accuracy on similar tasks.

Architecture Details

| Component | Configuration |
|---|---|
| Input | 1023 energy channels |
| CNN Backbone | 3 ConvBlocks [64, 128, 256 channels] |
| Kernel Size | 7 (captures spectral features) |
| FC Layers | [512, 256] with dropout |
| Output Heads | Dual: Classification (82 isotopes) + Regression (activity) |
| Total Parameters | 34.5M |
| Activation | LeakyReLU + BatchNorm |

Training Features

  • Mixed Precision (AMP): Faster training on modern GPUs
  • Multi-task Learning: Simultaneous isotope ID + activity estimation
  • Loss Function: BCE (classification) + Huber (regression)
  • LR Scheduling: ReduceLROnPlateau with early stopping

Synthetic Spectra Generation

Features

  • 82 isotopes with accurate gamma emission lines
  • Realistic physics: Gaussian peaks, Poisson noise, Compton continuum, environmental background
  • Multiple detector models: Radiacode 101, 102, 103, 103G, 110 with correct FWHM and energy ranges
  • Configurable variation: Activity levels, measurement durations, isotope combinations

Sample Distribution

| Type | Proportion | Description |
|---|---|---|
| Single isotope | 40% | One source + background |
| Dual isotope | 30% | Two sources blended |
| Multi isotope | 20% | 3-5 sources combined |
| Background only | 10% | Environmental only |
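
One simple way to draw sample types in these proportions is weighted random choice. The sketch below is an assumption about how such sampling could be done, not the generator's actual code; `SAMPLE_TYPES` and `draw_source_count` are hypothetical names.

```python
import random

# Proportions taken from the Sample Distribution table.
SAMPLE_TYPES = {
    "single": 0.40,
    "dual": 0.30,
    "multi": 0.20,
    "background": 0.10,
}


def draw_source_count(rng):
    """Pick a sample type by its proportion and return (type, number of sources)."""
    kind = rng.choices(list(SAMPLE_TYPES), weights=list(SAMPLE_TYPES.values()))[0]
    if kind == "single":
        return kind, 1
    if kind == "dual":
        return kind, 2
    if kind == "multi":
        return kind, rng.randint(3, 5)  # 3-5 sources, inclusive
    return kind, 0  # background only


rng = random.Random(0)  # fixed seed so a run can be reproduced
counts = {}
for _ in range(10_000):
    kind, _n = draw_source_count(rng)
    counts[kind] = counts.get(kind, 0) + 1
print(counts)  # roughly 4000 / 3000 / 2000 / 1000
```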

Scaling Up

Edit synthetic_spectra/generate_spectra.py to generate larger datasets:

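from pathlib import Path  # needed for output_dir below

generate_training_batch(
    n_samples=100000,  # Generate 100k samples
    output_dir=Path("data/synthetic/spectra"),  # where generated spectra are written
    detector_type="radiacode_103"  # detector model to simulate
)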

Project Structure

ml-for-isotope-identification/
├── README.md                    # This file
├── agents.md                    # AI agent context documentation
├── .gitignore                   # Git ignore rules
│
├── synthetic_spectra/           # Spectrum generation package
│   ├── __init__.py
│   ├── config.py                # Detector configurations
│   ├── generator.py             # Main generation logic
│   ├── generate_spectra.py      # CLI batch generation
│   ├── ground_truth/
│   │   ├── isotope_data.py      # 82 isotopes database
│   │   └── decay_chains.py      # Decay chain definitions
│   └── physics/
│       └── spectrum_physics.py  # Physics calculations
│
├── training/                    # Training infrastructure
│   └── vega/                    # Vega model package
│       ├── __init__.py
│       ├── isotope_index.py     # Isotope ↔ index mapping
│       ├── model.py             # VegaModel architecture
│       ├── dataset.py           # PyTorch Dataset/DataLoader
│       ├── train.py             # Training loop & utilities
│       └── run_training.py      # CLI training script
│
├── inference/                   # Inference engine
│   ├── vega_inference.py        # VegaInference class
│   └── run_inference.py         # CLI inference script
│
├── models/                      # Saved model checkpoints
│   ├── vega_best.pt             # Best validation loss
│   ├── vega_final.pt            # Final epoch
│   └── vega_history.json        # Training metrics
│
└── data/                        # Generated data (git-ignored)
    └── synthetic/
        └── spectra/

Technical Details

Detector Specifications

| Model | Crystal | FWHM @ 662 keV | Energy Range | Channels |
|---|---|---|---|---|
| Radiacode 101 | CsI(Tl) | 9.0% | 20-3000 keV | 1024 |
| Radiacode 102 | CsI(Tl) | 9.5% | 20-3000 keV | 1024 |
| Radiacode 103 | CsI(Tl) | 8.4% | 20-3000 keV | 1024 |
| Radiacode 103G | GAGG(Ce) | 7.4% | 20-3000 keV | 1024 |
| Radiacode 110 | CsI(Tl) | 8.4% | 20-3000 keV | 1024 |
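
As a rough illustration, these specifications could be held in a small configuration structure like the one below. The `DetectorSpec` class, the dictionary keys other than `radiacode_103`, and especially the linear channel-to-energy mapping are assumptions made for this sketch; a real device may need its own calibration, and `synthetic_spectra/config.py` may be organised differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectorSpec:
    crystal: str
    fwhm_662: float    # fractional FWHM at 662 keV (0.084 means 8.4%)
    e_min_kev: float
    e_max_kev: float
    n_channels: int

    def channel_energy(self, channel):
        """Centre energy of a channel, ASSUMING a linear calibration."""
        width = (self.e_max_kev - self.e_min_kev) / self.n_channels
        return self.e_min_kev + (channel + 0.5) * width


# Values copied from the Detector Specifications table.
DETECTORS = {
    "radiacode_101": DetectorSpec("CsI(Tl)", 0.090, 20.0, 3000.0, 1024),
    "radiacode_102": DetectorSpec("CsI(Tl)", 0.095, 20.0, 3000.0, 1024),
    "radiacode_103": DetectorSpec("CsI(Tl)", 0.084, 20.0, 3000.0, 1024),
    "radiacode_103g": DetectorSpec("GAGG(Ce)", 0.074, 20.0, 3000.0, 1024),
    "radiacode_110": DetectorSpec("CsI(Tl)", 0.084, 20.0, 3000.0, 1024),
}

spec = DETECTORS["radiacode_103"]
print(spec.crystal, spec.fwhm_662, round(spec.channel_energy(0), 2))
```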

Physics Model

  • Peak shape: Gaussian with FWHM scaling as √(E/662)
  • Expected counts: λ = A × t × I × ε × T
  • Noise: Poisson counting statistics
  • Background: Exponential continuum + environmental isotopes (K-40, Pb-214, Bi-214, etc.)
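
The peak and noise steps above can be sketched with the standard library alone. Several things here are assumptions rather than the project's implementation: the FWHM scaling is read as applying to the absolute width in keV, the symbols in the count formula are read as activity, time, emission intensity, detection efficiency, and a transmission factor, the energy axis is linear, the efficiency value is a placeholder, and all function names are hypothetical. The actual code in `synthetic_spectra/physics/spectrum_physics.py` may differ.

```python
import math
import random


def fwhm_kev(energy_kev, fwhm_662=0.084):
    """Absolute FWHM in keV, scaled from the 662 keV value as sqrt(E/662)."""
    return fwhm_662 * 662.0 * math.sqrt(energy_kev / 662.0)


def expected_counts(activity_bq, time_s, intensity, efficiency, transmission=1.0):
    """lambda = A * t * I * eps * T (symbol meanings assumed, see lead-in)."""
    return activity_bq * time_s * intensity * efficiency * transmission


def gaussian_peak(energies, centre_kev, total_counts, fwhm):
    """Spread total_counts over the channels with a Gaussian of the given FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    weights = [math.exp(-0.5 * ((e - centre_kev) / sigma) ** 2) for e in energies]
    norm = sum(weights)  # normalising keeps every count inside the spectrum
    return [total_counts * w / norm for w in weights]


def poisson(lam, rng):
    """Poisson sample: Knuth's method for small lam, normal approximation above 30."""
    if lam > 30.0:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


rng = random.Random(42)
# Assumed linear energy axis: 1024 channels spanning 20-3000 keV.
energies = [20.0 + (ch + 0.5) * (2980.0 / 1024) for ch in range(1024)]
# Cs-137 line at 661.7 keV (85.1% intensity); efficiency 0.01 is a placeholder.
lam = expected_counts(activity_bq=1000.0, time_s=600.0, intensity=0.851, efficiency=0.01)
ideal = gaussian_peak(energies, 661.7, lam, fwhm_kev(661.7))
noisy = [poisson(c, rng) for c in ideal]
print(round(lam, 1), sum(noisy))
```

A full spectrum would add the Compton continuum and the background terms before the Poisson step, so that the noise applies to the total expected counts in each channel.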

Isotope Categories

  • Natural background (K-40, Ra-226, Rn-222)
  • Decay chains (U-238, Th-232, U-235)
  • Calibration sources (Am-241, Cs-137, Co-60, Ba-133, Eu-152)
  • Medical isotopes (Tc-99m, F-18, I-131, Ga-68)
  • Industrial sources (Ir-192, Se-75)
  • Reactor fallout (Cs-134, Cs-137, Sr-90)

Development

Dependencies

numpy>=1.24.0
scipy>=1.10.0
pillow>=9.0.0
torch>=2.11.0 (nightly with CUDA 12.8 for RTX 5090)

GPU Support

The RTX 5090 (Blackwell architecture, sm_120) requires PyTorch nightly builds with CUDA 12.8:

pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128

For AI Agents

See agents.md for comprehensive documentation on:

  • System architecture and design decisions
  • Physics model implementation details
  • Vega model architecture and training
  • Configuration options and variation strategies

TODO

  • Push to repository - Initial commit with generation system
  • Create PyTorch DataLoader for training
  • Implement CNN-FCNN model architecture (Vega)
  • Create training script with logging
  • Implement inference module
  • Generate large training dataset (100k samples)
  • Train model to convergence
  • Add data augmentation pipeline
  • Add model evaluation metrics & confusion matrix
  • Implement real-time inference module
  • Create Radiacode device integration

License

[TBD]


Acknowledgments

  • Radiacode for device specifications
  • IAEA Nuclear Data Services for isotope data
  • NNDC at Brookhaven National Laboratory
  • Wang et al. research on CNN-FCNN for gamma spectroscopy