radiacode/train/vega_ml/docs/VEGA_2D_COMPLETE_GUIDE.md
Jacquin Antoine 745a64b342 Complete Radiacode 103 pipeline - automatic isotope identification
- VegaModel CNN-FCNN 34.5M params, 82 isotopes, val acc 99.89%
- Generation of 50k synthetic 1D spectra (12-24 h durations)
- Training for 100 epochs on an RTX 5060 Ti (CUDA 12.8, Blackwell)
- Continuous detection with background subtraction
- 24 h background capture with disconnection handling
- Docker Compose: train container (GPU) + detect container (CPU/USB)
- Trained model included (vega_best.pt, 395 MB)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-19 12:29:56 +02:00


Vega 2D Isotope Identification System - Complete Technical Guide

Version: 2.0 (2D Model)
Last Updated: January 2025
Architecture: 2D-CNN with Temporal Feature Extraction


Table of Contents

  1. Executive Summary
  2. System Architecture Overview
  3. Data Format Specification
  4. Synthetic Data Generation
  5. Model Architecture
  6. Training Procedures
  7. Inference System
  8. Output Interpretation
  9. Isotope Reference
  10. Decay Chain Analysis
  11. Threshold Selection Guide
  12. Example Workflows
  13. Troubleshooting

1. Executive Summary

What This System Does

The Vega 2D system identifies radioactive isotopes from gamma-ray spectra captured by Radiacode scintillation detectors. Given a spectrum measurement, it outputs:

  1. Presence predictions - Which of 82 isotopes are present (with probability 0-1)
  2. Activity estimates - Estimated radioactivity in Becquerels (Bq) for each detected isotope

Why 2D?

Unlike traditional 1D approaches that collapse temporal data, the Vega 2D model treats spectra as images with:

  • Y-axis: 60 time intervals (1 second each)
  • X-axis: 1023 energy channels (20 keV - 3000 keV)

This preserves crucial temporal information:

  • Decay patterns - Short-lived isotopes show decreasing counts over time
  • Activity fluctuations - Real sources have statistical variations
  • Noise characteristics - Poisson statistics create time-varying patterns
  • Equilibrium dynamics - Daughter isotope ingrowth over time

Key Specifications

Parameter        | Value
-----------------|-----------------------------------------------
Input Shape      | (60, 1023) - 60 time intervals × 1023 channels
Output Classes   | 82 isotopes
Model Parameters | ~59 million
Inference Time   | <100 ms on GPU, ~500 ms on CPU
Typical F1 Score | >96%

2. System Architecture Overview

Directory Structure

ml-for-isotope-identification/
├── synthetic_spectra/                  # Data generation
│   ├── generate_spectra_v3.py          # Main generation script
│   ├── generator.py                    # SpectrumGenerator class
│   ├── config.py                       # Detector configurations
│   └── ground_truth/
│       ├── isotope_data.py             # 82 isotope definitions
│       └── decay_chains.py             # Decay chain relationships
│
├── training/vega/                      # Training infrastructure
│   ├── model_2d.py                     # Vega2DModel architecture
│   ├── dataset_2d.py                   # 2D data loading
│   ├── train_2d.py                     # Training loop
│   └── isotope_index.py                # Isotope ↔ index mapping
│
├── inference/                          # Inference system
│   └── vega_portable_inference_2d.py   # Self-contained inference
│
├── models/                             # Saved checkpoints
│   ├── vega_2d_best.pt                 # Best validation model
│   └── vega_2d_final.pt                # Final epoch model
│
└── data/synthetic/                     # Generated training data
    └── spectra/                        # .npy spectrum files

Data Flow

[Radiacode Detector] → [Raw Counts Array] → [Normalization] → [Vega 2D Model]
                                                                    ↓
[Results Display] ← [Activity Estimation] ← [Sigmoid(logits)] ← [Dual Heads]

3. Data Format Specification

3.1 Input Spectrum Format

Shape: (num_time_intervals, 1023) or ideally (60, 1023)

Data Type: float32 or float64

Value Range:

  • Raw counts: integers 0 to ~thousands
  • Normalized: 0.0 to 1.0 (divided by max value)

Channel Mapping:

def channel_to_energy(channel: int) -> float:
    """Convert channel index to energy in keV."""
    E_MIN, E_MAX = 20.0, 3000.0
    return E_MIN + channel * (E_MAX - E_MIN) / 1023

def energy_to_channel(energy_kev: float) -> int:
    """Convert energy in keV to channel index."""
    E_MIN, E_MAX = 20.0, 3000.0
    channel = int((energy_kev - E_MIN) / (E_MAX - E_MIN) * 1023)
    return max(0, min(1022, channel))

Example Channel Mappings:

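Channels as returned by energy_to_channel above:

Energy (keV) | Channel | Notable Isotope
-------------|---------|------------------
59.5         | 13      | Am-241
122.1        | 35      | Co-57
356.0        | 115     | Ba-133
661.7        | 220     | Cs-137
1173.2       | 395     | Co-60
1274.5       | 430     | Na-22
1332.5       | 450     | Co-60
1460.8       | 494     | K-40 (background)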
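
As a quick sanity check on the linear calibration, the sketch below restates the two functions and maps a few of the listed energies to a channel and back. Under this mapping each channel spans (3000 - 20) / 1023 ≈ 2.91 keV, so a round trip recovers the energy only to within one channel width. The constant name CHANNEL_WIDTH_KEV is introduced here for illustration and is not taken from the project code.

```python
E_MIN, E_MAX, N_CHANNELS = 20.0, 3000.0, 1023
CHANNEL_WIDTH_KEV = (E_MAX - E_MIN) / N_CHANNELS  # about 2.91 keV per channel


def channel_to_energy(channel: int) -> float:
    """Lower-edge energy (keV) of a channel."""
    return E_MIN + channel * CHANNEL_WIDTH_KEV


def energy_to_channel(energy_kev: float) -> int:
    """Channel index containing the given energy, clamped to [0, 1022]."""
    channel = int((energy_kev - E_MIN) / CHANNEL_WIDTH_KEV)
    return max(0, min(N_CHANNELS - 1, channel))


for energy in (59.5, 661.7, 1460.8):
    ch = energy_to_channel(energy)
    low = channel_to_energy(ch)
    # The true energy lies inside [low, low + CHANNEL_WIDTH_KEV).
    print(f"{energy:7.1f} keV -> channel {ch:4d} "
          f"(covers {low:.1f}-{low + CHANNEL_WIDTH_KEV:.1f} keV)")
```

Because the conversion truncates rather than rounds, a peak energy is always reported in the channel whose lower edge sits at or below it.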

3.2 Time Dimension Handling

The model requires exactly 60 time intervals. Input spectra are handled as follows:

import numpy as np

def _pad_or_truncate(spectrum: np.ndarray, target_rows: int = 60) -> np.ndarray:
    """Ensure spectrum has exactly target_rows rows (60 by default)."""
    current_rows = spectrum.shape[0]
    
    if current_rows == target_rows:
        return spectrum
    elif current_rows > target_rows:
        # Truncate - take LAST N intervals (most recent data)
        return spectrum[-target_rows:]
    else:
        # Pad with zeros at the BEGINNING, keeping the input dtype
        padding = np.zeros((target_rows - current_rows, spectrum.shape[1]), dtype=spectrum.dtype)
        return np.vstack([padding, spectrum])

Important: When truncating, the most recent 60 seconds are kept (last rows), not the first.

3.3 Normalization

Before inference, spectra should be normalized to [0, 1]:

def normalize(spectrum: np.ndarray) -> np.ndarray:
    """Normalize spectrum to [0, 1] range."""
    max_val = spectrum.max()
    if max_val > 0:
        return spectrum / max_val
    return spectrum

Why normalize?

  • Neural networks work best with standardized inputs
  • Prevents high-activity samples from dominating gradients
  • Allows model to focus on spectral shape rather than absolute counts
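
Putting Sections 3.2 and 3.3 together, a minimal preprocessing sketch might look like the following. The function name preprocess and the order of operations (truncate or pad first, then normalize by the global maximum) are assumptions made for illustration; the (1, 1, 60, 1023) output shape follows the model input described in Section 5.1.

```python
import numpy as np

TARGET_ROWS, NUM_CHANNELS = 60, 1023


def preprocess(spectrum: np.ndarray) -> np.ndarray:
    """Pad/truncate to 60 rows, scale to [0, 1], add batch and channel axes."""
    spectrum = np.asarray(spectrum, dtype=np.float32)
    if spectrum.ndim != 2 or spectrum.shape[1] != NUM_CHANNELS:
        raise ValueError(f"expected (T, {NUM_CHANNELS}), got {spectrum.shape}")

    rows = spectrum.shape[0]
    if rows > TARGET_ROWS:
        spectrum = spectrum[-TARGET_ROWS:]          # keep the most recent 60 s
    elif rows < TARGET_ROWS:
        pad = np.zeros((TARGET_ROWS - rows, NUM_CHANNELS), dtype=np.float32)
        spectrum = np.vstack([pad, spectrum])       # zeros go at the beginning

    max_val = spectrum.max()
    if max_val > 0:
        spectrum = spectrum / max_val               # global max normalization

    return spectrum[np.newaxis, np.newaxis, :, :]   # (1, 1, 60, 1023)


rng = np.random.default_rng(0)
short = rng.poisson(2.0, size=(45, NUM_CHANNELS))   # 45 s of fake counts
batch = preprocess(short)
print(batch.shape, batch.min(), batch.max())
```

A 45-row input ends up with 15 leading rows of zeros, which is how the model sees a measurement shorter than one minute.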

4. Synthetic Data Generation

4.1 Overview

Training data is generated synthetically because:

  1. Real radioactive sources require permits and safety protocols
  2. Training a model of this size needs 100,000+ labeled samples
  3. Synthetic data comes with exact ground-truth labels
  4. All generation parameters can be varied systematically

4.2 Generation Command

# Generate 200,000 training samples
python -m synthetic_spectra.generate_spectra_v3 \
    --num_samples 200000 \
    --output_dir "O:/master_data_collection/isotopev2" \
    --detector radiacode_103 \
    --workers 8 \
    --activity_min 1.0 \
    --activity_max 100.0

4.3 Generation Parameters

Parameter      | Default        | Description
---------------|----------------|--------------------------------
--num_samples  | 200000         | Total samples to generate
--output_dir   | data/synthetic | Output directory
--detector     | radiacode_103  | Detector model to simulate
--workers      | CPU_count-1    | Parallel workers
--activity_min | 1.0            | Minimum source activity (Bq)
--activity_max | 100.0          | Maximum source activity (Bq)
--seed         | None           | Random seed for reproducibility

4.4 Sample Scenario Distribution

The v3 generator creates diverse, realistic scenarios:

Scenario           | Fraction | Description
-------------------|----------|----------------------------------------------
background_only    | 15%      | No isotopes - just environmental background
single_calibration | 20%      | One calibration source (Cs-137, Co-60, etc.)
single_medical     | 8%       | One medical isotope (Tc-99m, I-131, etc.)
single_industrial  | 5%       | One industrial source (Ir-192, Se-75, etc.)
uranium_chain      | 10%      | U-238 + daughters in equilibrium
thorium_chain      | 10%      | Th-232 + daughters in equilibrium
norm               | 7%       | 2-4 NORM isotopes (K-40, Ra-226, etc.)
fallout            | 5%       | Reactor fallout signature (Cs-137 + Cs-134)
mixed              | 10%      | Random 2-3 isotope combination
complex_mix        | 5%       | 4-6 isotopes from various categories
weak_source        | 5%       | Very low activity (0.1-5 Bq)
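
One plausible way to realize these fractions is a weighted random draw per sample. The sketch below is not the v3 generator's code; the dictionary and function names are hypothetical, and only the scenario names and fractions are taken from the table.

```python
import numpy as np

# Scenario names and fractions copied from the table above.
SCENARIO_FRACTIONS = {
    "background_only": 0.15,
    "single_calibration": 0.20,
    "single_medical": 0.08,
    "single_industrial": 0.05,
    "uranium_chain": 0.10,
    "thorium_chain": 0.10,
    "norm": 0.07,
    "fallout": 0.05,
    "mixed": 0.10,
    "complex_mix": 0.05,
    "weak_source": 0.05,
}


def sample_scenarios(num_samples: int, seed: int = 0) -> np.ndarray:
    """Draw one scenario name per sample according to the table fractions."""
    names = list(SCENARIO_FRACTIONS)
    probs = np.array([SCENARIO_FRACTIONS[n] for n in names])
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("scenario fractions must sum to 1")
    rng = np.random.default_rng(seed)
    return rng.choice(names, size=num_samples, p=probs)


draws = sample_scenarios(200_000)
for name, count in zip(*np.unique(draws, return_counts=True)):
    print(f"{name:20s} {count / len(draws):.3f}")
```

With 200,000 draws the observed fractions should land within a fraction of a percent of the table values, which makes this a cheap check that a generated dataset has the intended mix.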

4.5 Isotope Pools

# Calibration sources (individual, well-characterized)
CALIBRATION_ISOTOPES = [
    "Cs-137", "Co-60", "Am-241", "Ba-133", 
    "Eu-152", "Na-22", "Co-57", "Mn-54"
]

# Medical isotopes (short-lived, hospital settings)
MEDICAL_ISOTOPES = [
    "Tc-99m", "I-131", "I-123", "F-18", 
    "Ga-67", "Ga-68", "In-111", "Lu-177", "Tl-201"
]

# Industrial sources (sealed sources, gauges)
INDUSTRIAL_ISOTOPES = [
    "Ir-192", "Se-75", "Zn-65", "Co-58", "Cd-109"
]

# Natural decay chains (always appear together)
URANIUM_238_CHAIN = ["U-238", "Ra-226", "Pb-214", "Bi-214"]
THORIUM_232_CHAIN = ["Th-232", "Ac-228", "Pb-212", "Bi-212", "Tl-208"]

# Reactor fallout signature
FALLOUT_SIGNATURE = ["Cs-137", "Cs-134"]  # Indicates reactor origin

4.6 Background Model

Every synthetic spectrum includes realistic environmental background:

  1. Exponential continuum: B(E) = B₀ × exp(-E / E_char)
  2. K-40 (potassium-40): 1460.8 keV - from soil, building materials
  3. Radon progeny (Pb-214, Bi-214): From atmospheric radon
  4. Thorium progeny (Pb-212, Tl-208, Ac-228): From soil

Background intensity is randomized (0.3× to 3.0× baseline).
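
The exponential continuum and the random intensity scale can be sketched as below. The values of B₀ and E_char, the uniform distribution of the scale factor, and the function name are all assumptions for illustration; the source only states the functional form and the 0.3× to 3.0× range.

```python
import numpy as np

NUM_CHANNELS = 1023
E_MIN, E_MAX = 20.0, 3000.0


def background_continuum(b0: float, e_char: float, rng: np.random.Generator) -> np.ndarray:
    """Expected continuum counts per channel, B(E) = B0 * exp(-E / E_char),
    multiplied by a random scale drawn uniformly from [0.3, 3.0]."""
    # Lower-edge energy of every channel, using the linear mapping from Section 3.1.
    energies = E_MIN + np.arange(NUM_CHANNELS) * (E_MAX - E_MIN) / NUM_CHANNELS
    scale = rng.uniform(0.3, 3.0)
    return scale * b0 * np.exp(-energies / e_char)


rng = np.random.default_rng(42)
expected = background_continuum(b0=0.5, e_char=400.0, rng=rng)   # assumed values
one_second = rng.poisson(expected)                               # noisy 1 s row
print(expected[:3], one_second[:10])
```

The K-40 and progeny lines listed above would be added on top of this continuum as Gaussian peaks, as described in Section 4.7.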

4.7 Physics Model

Each gamma peak is generated as:

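import numpy as np

# Gaussian peak generation
FWHM = FWHM_662 * np.sqrt(E / 662)  # Resolution scales as sqrt(E), anchored at 662 keV
sigma = FWHM / 2.355                # FWHM = 2*sqrt(2*ln 2) * sigma ≈ 2.355 * sigma

# Expected counts in the full peak over the measurement time
expected_counts = activity_bq * time_seconds * branching_ratio * efficiency

# Poisson noise applied to expected counts
observed_counts = np.random.poisson(expected_counts)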
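
To show how these formulas turn into rows of a (60, 1023) spectrum, here is a self-contained sketch that spreads one gamma line over the energy channels and samples it once per second. The detection efficiency (0.05) and the FWHM at 662 keV (55 keV) are illustrative guesses, not values from config.py, and peak_row is a hypothetical helper name.

```python
import numpy as np

NUM_CHANNELS = 1023
E_MIN, E_MAX = 20.0, 3000.0
CHANNEL_WIDTH = (E_MAX - E_MIN) / NUM_CHANNELS


def peak_row(energy_kev, activity_bq, branching_ratio, efficiency,
             fwhm_662_kev, rng, time_seconds=1.0):
    """Poisson-sampled counts of one gamma line in one time interval."""
    fwhm = fwhm_662_kev * np.sqrt(energy_kev / 662.0)
    sigma = fwhm / 2.355                      # FWHM = 2*sqrt(2 ln 2) * sigma
    expected_total = activity_bq * time_seconds * branching_ratio * efficiency

    # Channel-centre energies and the Gaussian weight of each channel.
    centres = E_MIN + (np.arange(NUM_CHANNELS) + 0.5) * CHANNEL_WIDTH
    weights = np.exp(-0.5 * ((centres - energy_kev) / sigma) ** 2)
    weights /= weights.sum()                  # peak area sums to expected_total

    return rng.poisson(expected_total * weights)


rng = np.random.default_rng(1)
# Cs-137 line from Section 9.2; efficiency and FWHM are illustrative guesses.
rows = np.stack([
    peak_row(661.7, activity_bq=45.2, branching_ratio=0.851,
             efficiency=0.05, fwhm_662_kev=55.0, rng=rng)
    for _ in range(60)
])
print(rows.shape, rows.sum(), rows.sum(axis=0).argmax())
```

Sampling each channel independently from a Poisson distribution is statistically equivalent to drawing a Poisson total for the peak and scattering it over channels with the Gaussian weights.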

4.8 Output Files

Each sample generates:

  • {uuid}_spectrum.npy - NumPy array (60, 1023)
  • {uuid}_spectrum.png - Visualization (optional)

Plus a global labels.json:

{
  "abc123-def456": {
    "isotopes": [
      {"name": "Cs-137", "activity_bq": 45.2, "category": "CALIBRATION"}
    ],
    "background_isotopes": ["K-40", "Pb-214", "Bi-214"],
    "detector": "radiacode_103",
    "duration_seconds": 60,
    "num_intervals": 60,
    "background_scale": 1.2,
    "generation_timestamp": "2025-01-24T12:34:56"
  }
}
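
A training loader has to turn each labels.json entry into fixed-length target vectors. The sketch below shows one way to do that, assuming that only the entries under "isotopes" count as present (background isotopes are not labeled) and that activities are divided by max_activity_bq from Vega2DConfig. The four-entry index is a hypothetical stand-in for the real 82-entry mapping.

```python
import json
import numpy as np

# Tiny stand-in for the 82-entry isotope index (hypothetical subset).
ISOTOPE_TO_INDEX = {"Am-241": 0, "Co-60": 1, "Cs-137": 2, "K-40": 3}
MAX_ACTIVITY_BQ = 1000.0   # matches max_activity_bq in Vega2DConfig

LABELS_JSON = """
{
  "abc123-def456": {
    "isotopes": [
      {"name": "Cs-137", "activity_bq": 45.2, "category": "CALIBRATION"}
    ],
    "background_isotopes": ["K-40", "Pb-214", "Bi-214"],
    "num_intervals": 60
  }
}
"""


def build_targets(entry):
    """Return (presence, normalized_activity) vectors for one labels.json entry."""
    presence = np.zeros(len(ISOTOPE_TO_INDEX), dtype=np.float32)
    activity = np.zeros(len(ISOTOPE_TO_INDEX), dtype=np.float32)
    for iso in entry["isotopes"]:
        idx = ISOTOPE_TO_INDEX[iso["name"]]
        presence[idx] = 1.0
        # Clip so the regression target never exceeds 1.0.
        activity[idx] = min(iso["activity_bq"] / MAX_ACTIVITY_BQ, 1.0)
    return presence, activity


labels = json.loads(LABELS_JSON)
presence, activity = build_targets(labels["abc123-def456"])
print(presence, activity)
```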

5. Model Architecture

5.1 Architecture Overview

Vega2DModel (59M parameters)
│
├─ Input: (batch, 1, 60, 1023)  [Grayscale image representation]
│
├─ ConvBlock2D #1
│   ├─ Conv2d(1→32, kernel=(3,7), padding=(1,3))
│   ├─ BatchNorm2d(32)
│   ├─ LeakyReLU(0.01)
│   ├─ Conv2d(32→32, kernel=(3,7), padding=(1,3))
│   ├─ BatchNorm2d(32)
│   ├─ LeakyReLU(0.01)
│   ├─ MaxPool2d((2,2))  → (batch, 32, 30, 511)
│   └─ Dropout2d(0.3)
│
├─ ConvBlock2D #2
│   ├─ Conv2d(32→64, kernel=(3,7), padding=(1,3))
│   ├─ ...same structure...
│   └─ MaxPool2d((2,2))  → (batch, 64, 15, 255)
│
├─ ConvBlock2D #3
│   ├─ Conv2d(64→128, kernel=(3,7), padding=(1,3))
│   ├─ ...same structure...
│   └─ MaxPool2d((2,2))  → (batch, 128, 7, 127)
│
├─ Flatten  → (batch, 113792)
│
├─ FC Block #1
│   ├─ Linear(113792→512)
│   ├─ BatchNorm1d(512)
│   ├─ LeakyReLU(0.01)
│   └─ Dropout(0.3)
│
├─ FC Block #2
│   ├─ Linear(512→256)
│   ├─ BatchNorm1d(256)
│   ├─ LeakyReLU(0.01)
│   └─ Dropout(0.3)
│
└─ Dual Output Heads
    ├─ Classifier: Linear(256→82) → logits (for BCEWithLogitsLoss)
    └─ Regressor: Linear(256→82) → ReLU → normalized activity (≥ 0; 1.0 = max_activity_bq)
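
The feature-map sizes and the parameter total in the diagram can be checked with plain arithmetic. The sketch below assumes every Conv2d and Linear layer has a bias term and every BatchNorm layer has a learnable scale and shift, and that each ConvBlock2D has exactly the two-convolution structure shown for block #1; the helper names are made up for this check.

```python
def conv_params(c_in, c_out, kernel=(3, 7)):
    """Weights plus biases of one Conv2d layer."""
    return c_in * c_out * kernel[0] * kernel[1] + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of one Linear layer."""
    return n_in * n_out + n_out


def norm_params(features):
    """Learnable scale and shift of one BatchNorm layer."""
    return 2 * features


height, width, c_in, total = 60, 1023, 1, 0
for c_out in (32, 64, 128):
    # Two convolutions, each followed by BatchNorm, then 2x2 max pooling.
    total += conv_params(c_in, c_out) + norm_params(c_out)
    total += conv_params(c_out, c_out) + norm_params(c_out)
    height, width, c_in = height // 2, width // 2, c_out   # pooling floors odd sizes

flatten = c_in * height * width
n_in = flatten
for n_out in (512, 256):
    total += linear_params(n_in, n_out) + norm_params(n_out)
    n_in = n_out
total += 2 * linear_params(n_in, 82)                        # classifier + regressor heads

print(f"feature map: ({c_in}, {height}, {width}), flatten = {flatten}")
print(f"parameters:  {total:,}")
```

Under these assumptions the first fully connected layer alone accounts for about 58 million of the roughly 59 million parameters, so the flatten size dominates the model's memory footprint.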

5.2 Configuration Parameters

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Vega2DConfig:
    # Input dimensions
    num_channels: int = 1023          # Energy channels
    num_time_intervals: int = 60      # Time dimension
    
    # Output
    num_isotopes: int = 82
    
    # CNN architecture (list defaults need default_factory in a dataclass)
    conv_channels: List[int] = field(default_factory=lambda: [32, 64, 128])
    kernel_size: Tuple[int, int] = (3, 7)  # (time, energy)
    pool_size: Tuple[int, int] = (2, 2)
    
    # FC layers
    fc_hidden_dims: List[int] = field(default_factory=lambda: [512, 256])
    
    # Regularization
    dropout_rate: float = 0.3
    leaky_relu_slope: float = 0.01
    
    # Activity scaling
    max_activity_bq: float = 1000.0

5.3 Kernel Size Rationale

The kernel (3, 7) is asymmetric:

  • 3 in time dimension: Captures short temporal correlations (3 seconds)
  • 7 in energy dimension: Captures spectral features wider than peak FWHM

This asymmetry reflects the different nature of the two dimensions.

5.4 Dual-Head Design

The model has two output heads:

  1. Classifier Head (presence detection)

    • Output: 82 logits (raw scores)
    • Loss: BCEWithLogitsLoss (sigmoid applied internally)
    • Interpretation: sigmoid(logit) > threshold → isotope present
  2. Regressor Head (activity estimation)

    • Output: 82 non-negative values (normalized activity; 1.0 corresponds to max_activity_bq)
    • Loss: HuberLoss (robust to outliers)
    • Interpretation: output × max_activity_bq = estimated Bq
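
The two interpretation rules can be combined into a small post-processing step. The following is a NumPy sketch with a hypothetical function name and a three-isotope toy index; it assumes max_activity_bq = 1000.0 as in Section 5.2 and is not the code used by the inference script.

```python
import numpy as np

MAX_ACTIVITY_BQ = 1000.0


def postprocess(logits, norm_activity, names, threshold=0.5):
    """Turn raw head outputs into (name, probability, activity_bq) tuples."""
    logits = np.asarray(logits, dtype=np.float64)
    probs = 1.0 / (1.0 + np.exp(-logits))            # sigmoid
    activity_bq = np.asarray(norm_activity) * MAX_ACTIVITY_BQ
    detected = [
        (name, float(p), float(a))
        for name, p, a in zip(names, probs, activity_bq)
        if p >= threshold
    ]
    # Most confident detections first.
    return sorted(detected, key=lambda item: item[1], reverse=True)


# Three-isotope toy example; the real heads emit 82 values each.
names = ["Am-241", "Cs-137", "K-40"]
result = postprocess(logits=[-4.0, 3.9, 0.2],
                     norm_activity=[0.0, 0.0452, 0.01], names=names)
for name, prob, bq in result:
    print(f"{name}: {prob:.1%}, {bq:.1f} Bq")
```

Note that the sigmoid is applied only at inference time; during training the classifier's raw logits go straight into BCEWithLogitsLoss.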

5.5 Loss Function

# Default weights
cls_weight = 1.0   # Classification dominates
reg_weight = 0.1   # Activity estimation is secondary

# Parentheses are required for the expression to span two lines
total_loss = (
    cls_weight * BCEWithLogitsLoss(logits, presence_labels)
    + reg_weight * HuberLoss(pred_activities, true_activities)
)
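
For readers who want to see what the two terms compute, here is a NumPy re-implementation of the combined loss. It is written for illustration only: it assumes mean reduction and a Huber delta of 1.0 (the PyTorch defaults), and the training script presumably calls the torch.nn losses directly rather than anything like this.

```python
import numpy as np


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy on raw logits (numerically stable form)."""
    x, y = np.asarray(logits, float), np.asarray(targets, float)
    return np.mean(np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x))))


def huber(pred, target, delta=1.0):
    """Mean Huber loss: quadratic below delta, linear above."""
    err = np.abs(np.asarray(pred, float) - np.asarray(target, float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return np.mean(np.where(err <= delta, quadratic, linear))


def total_loss(logits, presence, pred_act, true_act, cls_weight=1.0, reg_weight=0.1):
    """Weighted sum of the classification and regression terms."""
    return (cls_weight * bce_with_logits(logits, presence)
            + reg_weight * huber(pred_act, true_act))


logits = np.array([[3.0, -2.0, 0.0]])
presence = np.array([[1.0, 0.0, 0.0]])
print(total_loss(logits, presence,
                 pred_act=[[0.05, 0.0, 0.0]], true_act=[[0.045, 0.0, 0.0]]))
```

Since normalized activities stay well below 1.0, the Huber term operates almost entirely in its quadratic regime here, so with reg_weight = 0.1 it contributes far less than the classification term.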

6. Training Procedures

6.1 Quick Start

# Test run (5 epochs)
python -m training.vega.train_2d --test

# Full training
python -m training.vega.train_2d \
    --epochs 50 \
    --batch-size 32 \
    --data-dir "O:/master_data_collection/isotopev2"

# Without mixed precision (if GPU issues)
python -m training.vega.train_2d --no-amp

6.2 Training Configuration

@dataclass
class TrainingConfig2D:
    # Data paths
    data_dir: str = "O:/master_data_collection/isotopev2"
    model_dir: str = "models"
    
    # Training hyperparameters
    epochs: int = 50
    batch_size: int = 32
    learning_rate: float = 1e-3
    weight_decay: float = 1e-5
    
    # Loss weights
    classification_weight: float = 1.0
    regression_weight: float = 0.1
    
    # Mixed precision
    use_amp: bool = True
    
    # Early stopping
    early_stopping_patience: int = 10
    
    # Learning rate scheduler
    lr_scheduler_patience: int = 5
    lr_scheduler_factor: float = 0.5
    
    # Data loading
    num_workers: int = 4

6.3 Data Splits

# Default splits in dataset_2d.py
train_ratio = 0.8   # 80% training
val_ratio = 0.1     # 10% validation
test_ratio = 0.1    # 10% test
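
A reproducible 80/10/10 split can be obtained by shuffling the sample indices once with a fixed seed. This is a sketch of the idea with a hypothetical split_indices helper; how dataset_2d.py actually partitions the files is not shown in this guide.

```python
import numpy as np


def split_indices(num_samples, train_ratio=0.8, val_ratio=0.1, seed=0):
    """Shuffle sample indices once and cut them into train/val/test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_samples)
    n_train = int(num_samples * train_ratio)
    n_val = int(num_samples * val_ratio)
    # Whatever remains after train and val becomes the test set.
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


train_idx, val_idx, test_idx = split_indices(200_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed matters: if the split changes between runs, samples seen during training can leak into a later validation set and inflate the reported metrics.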

6.4 Training Loop

Each epoch:

  1. Training phase: Forward pass → loss → backward → optimizer step
  2. Validation phase: Compute metrics without gradients
  3. Checkpointing: Save if validation loss improved
  4. LR Scheduling: Reduce LR if plateau detected
  5. Early stopping: Stop if no improvement for N epochs
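
Steps 3 to 5 interact, so it can help to replay them on a list of validation losses. The function below is a plain-Python sketch using the defaults from TrainingConfig2D (LR patience 5, factor 0.5, early-stopping patience 10). It approximates plateau-based scheduling and is not the project's training loop.

```python
def simulate_schedule(val_losses, lr=1e-3, lr_patience=5, lr_factor=0.5, stop_patience=10):
    """Replay validation losses through plateau LR decay and early stopping.

    Returns (epochs_run, final_lr, best_epoch).
    """
    best_loss, best_epoch = float("inf"), -1
    epochs_since_best = 0      # drives early stopping
    epochs_since_lr_step = 0   # drives LR reduction

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch   # checkpoint would be saved here
            epochs_since_best = 0
            epochs_since_lr_step = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_step += 1

        if epochs_since_lr_step > lr_patience:
            lr *= lr_factor                       # plateau detected: reduce LR
            epochs_since_lr_step = 0

        if epochs_since_best >= stop_patience:
            return epoch + 1, lr, best_epoch      # early stop

    return len(val_losses), lr, best_epoch


losses = [1.0, 0.8, 0.7] + [0.75] * 20
print(simulate_schedule(losses))
```

For this loss curve the replay keeps the checkpoint from epoch index 2, halves the learning rate once, and stops after 13 epochs instead of running all 23.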

6.5 Metrics Tracked

Metric      | Description
------------|----------------------------------------
loss        | Combined BCE + Huber loss
cls_loss    | Binary cross-entropy (classification)
reg_loss    | Huber loss (activity regression)
exact_match | % samples with all 82 isotopes correct
precision   | TP / (TP + FP)
recall      | TP / (TP + FN)
f1          | Harmonic mean of precision and recall
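
The classification metrics can be computed directly from thresholded probabilities. The sketch below assumes micro-averaging (TP, FP and FN are pooled over all samples and isotopes), which the guide does not state explicitly, and uses a four-isotope toy example in place of the 82 outputs.

```python
import numpy as np


def multilabel_metrics(probs, targets, threshold=0.5):
    """Micro-averaged precision, recall, F1 and exact-match rate."""
    pred = np.asarray(probs) >= threshold
    true = np.asarray(targets).astype(bool)

    tp = np.sum(pred & true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # A sample counts as an exact match only if every isotope is right.
    exact_match = np.mean(np.all(pred == true, axis=1))
    return {"precision": float(precision), "recall": float(recall),
            "f1": float(f1), "exact_match": float(exact_match)}


# Three samples, four isotopes (toy data).
targets = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
probs = [[0.9, 0.1, 0.2, 0.0], [0.1, 0.8, 0.4, 0.0], [0.0, 0.6, 0.1, 0.0]]
print(multilabel_metrics(probs, targets))
```

Exact match is the strictest of the four: one missed or spurious isotope out of 82 fails the whole sample, which is why it sits well below F1 in the expected results.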

6.6 Expected Results

After 50 epochs on 200K samples:

Metric        | Expected Value
--------------|---------------------
F1 Score      | >96%
Precision     | >97%
Recall        | >94%
Exact Match   | >88%
Training Time | ~4 hours (RTX 5090)

6.7 Checkpoint Files

Training produces:

  • vega_2d_best.pt - Best validation loss (use for inference)
  • vega_2d_final.pt - Final epoch
  • vega_2d_epoch_{N}.pt - Per-epoch checkpoints
  • vega_2d_history.json - Training metrics over time

6.8 Checkpoint Contents

checkpoint = {
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'model_config': asdict(model_config),
    'training_config': asdict(config),
    'best_val_loss': best_val_loss,
    'history': history
}

7. Inference System

7.1 Portable Inference Script

The file inference/vega_portable_inference_2d.py is completely self-contained and can be deployed anywhere with just:

  • Python 3.8+
  • NumPy
  • PyTorch

It embeds:

  • Model architecture definition
  • Isotope index (all 82 names)
  • Key gamma lines for sample generation
  • Sample spectrum generator for testing

7.2 Command Line Usage

# Run demo with synthetic spectra
python vega_portable_inference_2d.py --model vega_2d_best.pt

# Analyze a specific spectrum
python vega_portable_inference_2d.py \
    --model vega_2d_best.pt \
    --spectrum my_measurement.npy \
    --threshold 0.5

# Lower threshold for higher sensitivity
python vega_portable_inference_2d.py \
    --model vega_2d_best.pt \
    --spectrum unknown_sample.npy \
    --threshold 0.3

# JSON output
python vega_portable_inference_2d.py \
    --model vega_2d_best.pt \
    --spectrum sample.npy \
    --json

7.3 Programmatic Usage

from vega_portable_inference_2d import Vega2DInference
import numpy as np

# Initialize inference engine
inference = Vega2DInference("vega_2d_best.pt")

# Load your spectrum (shape: any × 1023, will be padded/truncated to 60×1023)
spectrum = np.load("my_measurement.npy")

# Run inference
result = inference.predict(spectrum, threshold=0.5)

# Get human-readable summary
print(result.summary())

# Access individual predictions
for isotope in result.get_present_isotopes():
    print(f"{isotope.name}: {isotope.probability:.1%} confidence, {isotope.activity_bq:.1f} Bq")

# Get all 82 probabilities (even non-detected)
full_result = inference.predict(spectrum, threshold=0.0, return_all=True)

# Export to JSON
json_str = result.to_json()

# Export to dict
data = result.to_dict()

7.4 API Reference

Vega2DInference Class

class Vega2DInference:
    def __init__(
        self,
        model_path: Union[str, Path],   # Path to .pt checkpoint
        isotope_index: Optional = None,  # Custom index (uses default)
        device: Optional = None          # 'cuda', 'cpu', or auto-detect
    ):
        ...
    
    def predict(
        self,
        spectrum: np.ndarray,           # (T, 1023) array
        threshold: float = 0.5,         # Detection threshold
        return_all: bool = False        # Include non-detected isotopes
    ) -> SpectrumPrediction:
        ...
    
    def predict_from_file(
        self,
        file_path: str,                 # Path to .npy file
        threshold: float = 0.5
    ) -> SpectrumPrediction:
        ...
    
    def predict_batch(
        self,
        spectra: List[np.ndarray],
        threshold: float = 0.5
    ) -> List[SpectrumPrediction]:
        ...

SpectrumPrediction Dataclass

@dataclass
class SpectrumPrediction:
    isotopes: List[IsotopePrediction]   # All predictions
    num_present: int                    # Count above threshold
    confidence: float                   # Average probability of detected
    threshold_used: float               # Threshold used
    
    def get_present_isotopes(self) -> List[IsotopePrediction]:
        """Return only detected isotopes."""
    
    def summary(self) -> str:
        """Human-readable summary."""
    
    def to_dict(self) -> dict:
        """Convert to dictionary."""
    
    def to_json(self, indent=2) -> str:
        """Convert to JSON string."""

IsotopePrediction Dataclass

@dataclass
class IsotopePrediction:
    name: str           # e.g., "Cs-137"
    probability: float  # 0.0 to 1.0
    activity_bq: float  # Estimated activity in Becquerels
    present: bool       # True if probability >= threshold

8. Output Interpretation

8.1 Understanding Predictions

Each prediction contains:

Field       | Type  | Range   | Meaning
------------|-------|---------|-------------------------------------------------
probability | float | 0.0-1.0 | Model's confidence isotope is present
activity_bq | float | 0-1000  | Estimated activity (only meaningful if present)
present     | bool  | T/F     | Whether probability >= threshold

8.2 Probability Interpretation

Probability | Interpretation       | Action
------------|----------------------|-------------------------------
>0.95       | Very High Confidence | Definitely present
0.80-0.95   | High Confidence      | Very likely present
0.50-0.80   | Moderate Confidence  | Probably present, verify
0.30-0.50   | Low Confidence       | Possibly present, investigate
<0.30       | Very Low             | Likely absent

8.3 Activity Estimation Accuracy

Activity estimates are approximate due to:

  • Unknown source distance
  • Unknown shielding
  • Detector efficiency variations
  • Normalization removes absolute count information

Use activity estimates for:

  • Relative comparisons between isotopes
  • Order-of-magnitude estimates
  • Identifying dominant vs minor contributors

Do NOT use for:

  • Regulatory compliance measurements
  • Precise quantitative analysis
  • Safety limit calculations

8.4 Single Isotope Detection

When one isotope is detected:

{
  "isotopes": [
    {"name": "Cs-137", "probability": 0.98, "activity_bq": 45.2, "present": true}
  ],
  "num_present": 1,
  "confidence": 0.98
}

Interpretation:

  • Clean calibration source or specific contamination
  • Verify gamma lines match expected energies
  • Single-isotope sources are common in:
    • Calibration checks
    • Medical procedures
    • Industrial gauges

8.5 Multiple Isotope Detection

When multiple isotopes are detected:

{
  "isotopes": [
    {"name": "Cs-137", "probability": 0.95, "activity_bq": 32.1, "present": true},
    {"name": "Cs-134", "probability": 0.87, "activity_bq": 18.4, "present": true}
  ],
  "num_present": 2,
  "confidence": 0.91
}

Interpretation:

  • Check for decay chain relationships (Section 10)
  • Look for known signatures (fallout, NORM, equilibrium)
  • Consider mixed-source scenarios

8.6 Background-Only (No Detection)

When no isotopes exceed threshold:

{
  "isotopes": [],
  "num_present": 0,
  "confidence": 0.82
}

Interpretation:

  • Spectrum shows only environmental background
  • K-40 (1460 keV) may be visible but below threshold
  • Natural radon daughters may contribute
  • This is normal for most measurements!

8.7 Common Detection Patterns

Pattern 1: Calibration Source

Cs-137: 98% ───────────────────────
All others: <5%

Clean single-source signature. Typical for check sources.

Pattern 2: NORM Material

K-40: 75%  ─────────────────
Ra-226: 62% ───────────────
Th-232: 58% ──────────────
Bi-214: 71% ───────────────

Multiple natural isotopes at similar activities. Indicates rocks, soil, building materials.

Pattern 3: Decay Chain

U-238: 45%  ─────────────
Ra-226: 88% ──────────────────
Pb-214: 92% ───────────────────
Bi-214: 94% ────────────────────

Parent + daughters detected. Indicates secular equilibrium. See Section 10.

Pattern 4: Reactor Fallout

Cs-137: 95% ───────────────────
Cs-134: 72% ────────────────

Cs-137 + Cs-134 is the fingerprint of reactor-origin material.


9. Isotope Reference

9.1 Complete Isotope List (82 Total)

The model identifies these isotopes, sorted alphabetically (same order as model output indices):

Index | Isotope   | Category            | Primary Gamma (keV)
------|-----------|---------------------|--------------------
  0   | Ac-225    | U235_CHAIN          | 99.9
  1   | Ac-227    | U235_CHAIN          | 236.0
  2   | Ac-228    | TH232_CHAIN         | 911.2
  3   | Ag-110m   | ACTIVATION          | 657.8
  4   | Am-241    | CALIBRATION         | 59.5
  5   | Au-198    | MEDICAL             | 411.8
  6   | Ba-133    | CALIBRATION         | 356.0
  7   | Be-7      | COSMOGENIC          | 477.6
  8   | Bi-207    | CALIBRATION         | 569.7
  9   | Bi-210    | U238_CHAIN          | 46.5
 10   | Bi-211    | U235_CHAIN          | 351.1
 11   | Bi-212    | TH232_CHAIN         | 727.3
 12   | Bi-214    | U238_CHAIN          | 609.3
 13   | C-14      | COSMOGENIC          | (beta only)
 14   | Cd-109    | INDUSTRIAL          | 88.0
 15   | Ce-139    | ACTIVATION          | 165.9
 16   | Ce-141    | REACTOR_FALLOUT     | 145.4
 17   | Ce-144    | REACTOR_FALLOUT     | 133.5
 18   | Co-57     | CALIBRATION         | 122.1
 19   | Co-58     | ACTIVATION          | 810.8
 20   | Co-60     | CALIBRATION         | 1173.2, 1332.5
 21   | Cr-51     | ACTIVATION          | 320.1
 22   | Cs-134    | REACTOR_FALLOUT     | 604.7, 795.9
 23   | Cs-137    | CALIBRATION         | 661.7
 24   | Cu-64     | MEDICAL             | 1345.8
 25   | Eu-152    | CALIBRATION         | 121.8, 344.3
 26   | Eu-154    | CALIBRATION         | 123.1, 1274.4
 27   | Eu-155    | REACTOR_FALLOUT     | 86.5, 105.3
 28   | F-18      | MEDICAL             | 511.0
 29   | Fe-55     | ACTIVATION          | (X-rays)
 30   | Fe-59     | ACTIVATION          | 1099.3
 31   | Ga-67     | MEDICAL             | 93.3, 184.6
 32   | Ga-68     | MEDICAL             | 511.0
 33   | Ge-68     | CALIBRATION         | 511.0
 34   | H-3       | COSMOGENIC          | (beta only)
 35   | Hf-175    | ACTIVATION          | 343.4
 36   | Hf-181    | ACTIVATION          | 482.2
 37   | Hg-203    | INDUSTRIAL          | 279.2
 38   | I-123     | MEDICAL             | 159.0
 39   | I-125     | MEDICAL             | 35.5
 40   | I-131     | MEDICAL             | 364.5
 41   | In-111    | MEDICAL             | 171.3, 245.4
 42   | Ir-192    | INDUSTRIAL          | 316.5, 468.1
 43   | K-40      | NATURAL_BACKGROUND  | 1460.8
 44   | Kr-85     | REACTOR_FALLOUT     | 514.0
 45   | La-140    | REACTOR_FALLOUT     | 1596.2
 46   | Lu-177    | MEDICAL             | 208.4
 47   | Mn-54     | CALIBRATION         | 834.8
 48   | Mo-99     | MEDICAL             | 140.5, 739.5
 49   | Na-22     | CALIBRATION         | 511.0, 1274.5
 50   | Na-24     | ACTIVATION          | 1368.6, 2754.0
 51   | Nb-95     | REACTOR_FALLOUT     | 765.8
 52   | Np-237    | INDUSTRIAL          | 86.5
 53   | Pa-231    | U235_CHAIN          | 283.7
 54   | Pa-233    | U238_CHAIN          | 311.9
 55   | Pa-234m   | U238_CHAIN          | 1001.0
 56   | Pb-210    | U238_CHAIN          | 46.5
 57   | Pb-211    | U235_CHAIN          | 404.9
 58   | Pb-212    | TH232_CHAIN         | 238.6
 59   | Pb-214    | U238_CHAIN          | 351.9
 60   | Po-210    | U238_CHAIN          | (alpha only)
 61   | Pu-239    | INDUSTRIAL          | 413.7
 62   | Ra-223    | U235_CHAIN          | 269.5
 63   | Ra-224    | TH232_CHAIN         | 241.0
 64   | Ra-226    | U238_CHAIN          | 186.2
 65   | Rb-86     | ACTIVATION          | 1076.6
 66   | Rn-219    | U235_CHAIN          | 271.2
 67   | Rn-220    | TH232_CHAIN         | 549.7
 68   | Rn-222    | U238_CHAIN          | (alpha only)
 69   | Ru-103    | REACTOR_FALLOUT     | 497.1
 70   | Ru-106    | REACTOR_FALLOUT     | 511.9, 621.9
 71   | Sb-124    | ACTIVATION          | 602.7
 72   | Sb-125    | REACTOR_FALLOUT     | 427.9
 73   | Sc-46     | ACTIVATION          | 889.3
 74   | Se-75     | INDUSTRIAL          | 264.7, 279.5
 75   | Sr-85     | CALIBRATION         | 514.0
 76   | Sr-90     | REACTOR_FALLOUT     | (beta only)
 77   | Tc-99m    | MEDICAL             | 140.5
 78   | Th-227    | U235_CHAIN          | 236.0
 79   | Th-228    | TH232_CHAIN         | 84.4
 80   | Th-232    | PRIMORDIAL          | (chain daughters)
 81   | Th-234    | U238_CHAIN          | 63.3, 92.4

9.2 Key Gamma Lines Reference

GAMMA_LINES = {
    # Calibration Sources
    "Cs-137": [(661.7, 0.851)],                    # Classic 662 keV
    "Co-60": [(1173.2, 0.999), (1332.5, 0.9998)],  # Dual peaks
    "Am-241": [(59.5, 0.359)],                     # Low energy
    "Ba-133": [(356.0, 0.623), (81.0, 0.329)],
    "Na-22": [(511.0, 1.798), (1274.5, 0.999)],    # Positron annihilation
    "Eu-152": [(121.8, 0.284), (344.3, 0.265), (1408.0, 0.210)],
    
    # Medical
    "Tc-99m": [(140.5, 0.890)],
    "I-131": [(364.5, 0.817)],
    "F-18": [(511.0, 1.934)],    # PET isotope
    
    # Background
    "K-40": [(1460.8, 0.107)],   # Always present
    
    # Decay Chains
    "Pb-214": [(351.9, 0.371), (295.2, 0.192)],
    "Bi-214": [(609.3, 0.461), (1120.3, 0.150)],
    "Tl-208": [(583.2, 0.845), (2614.5, 0.359)],
}

9.3 Isotope Categories

Category           | Description                       | Examples
-------------------|-----------------------------------|-------------------------
CALIBRATION        | Check sources, well-characterized | Cs-137, Co-60, Am-241
MEDICAL            | Hospital/imaging use, short-lived | Tc-99m, I-131, F-18
INDUSTRIAL         | Sealed sources, gauges            | Ir-192, Se-75
NATURAL_BACKGROUND | Always present in environment     | K-40
PRIMORDIAL         | Existed since Earth formed        | U-238, Th-232, U-235
U238_CHAIN         | Uranium-238 decay daughters       | Ra-226, Pb-214, Bi-214
TH232_CHAIN        | Thorium-232 decay daughters       | Ac-228, Pb-212, Tl-208
U235_CHAIN         | Uranium-235 decay daughters       | Pa-231, Ac-227
REACTOR_FALLOUT    | Fission products                  | Cs-134, I-131, Sr-90
ACTIVATION         | Neutron-activated materials       | Co-58, Fe-59, Zn-65
COSMOGENIC         | Cosmic ray produced               | Be-7, Na-22, C-14

10. Decay Chain Analysis

10.1 Understanding Decay Chains

Radioactive isotopes decay into other isotopes, forming decay chains. The three major natural chains are:

  1. Uranium-238 Series → ends at Pb-206 (stable)
  2. Thorium-232 Series → ends at Pb-208 (stable)
  3. Uranium-235 Series → ends at Pb-207 (stable)

10.2 Secular Equilibrium

In secular equilibrium (closed system, long time), all daughter activities equal the parent activity:

A_parent = A_daughter1 = A_daughter2 = ... = A_daughterN

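This means that, in a closed system, detecting the daughters implies the parent is present. When radon gas escapes from its source and decays elsewhere (Section 10.5, Example 2), the daughters can appear without their long-lived parents.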
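
How long "long time" is depends on the daughter's half-life. For a parent that is much longer-lived than its daughter, the daughter activity grows as A_daughter(t) = A_parent × (1 − exp(−λ t)) with λ = ln 2 / T½ of the daughter. The short sketch below evaluates this for Rn-222 growing in from Ra-226; the 100 Bq source and the function name are illustrative.

```python
import math


def daughter_activity(parent_bq, daughter_half_life, elapsed):
    """Daughter activity after `elapsed` time, starting from a pure parent.

    Valid when the parent half-life is far longer than the daughter's, so the
    parent activity is effectively constant. Both times use the same unit.
    """
    decay_const = math.log(2) / daughter_half_life
    return parent_bq * (1.0 - math.exp(-decay_const * elapsed))


# Rn-222 (half-life 3.82 d) growing in from a sealed 100 Bq Ra-226 source.
for days in (1, 3.82, 10, 30):
    print(f"{days:5.2f} d: {daughter_activity(100.0, 3.82, days):6.1f} Bq")
```

After one daughter half-life the daughter has reached half the parent activity, and after roughly seven half-lives (about a month for Rn-222) it is within 1% of equilibrium.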

10.3 Chain Signatures for Parent Inference

The system defines ChainSignature patterns to infer parent isotopes from detected daughters:

Rn-222 Progeny (Indicates Radon)

required: {"Pb-214", "Bi-214"}
optional: {"Pb-210"}
inferred_parent: "Rn-222"

When you see Pb-214 + Bi-214 → atmospheric radon is present

Ra-226 Equilibrium (Indicates Uranium)

required: {"Ra-226", "Pb-214", "Bi-214"}
optional: {"Pb-210", "Bi-210"}
inferred_parent: "U-238"

When you see Ra-226 + daughters → U-238 decay chain in equilibrium

Th-232 Equilibrium (Indicates Thorium)

required: {"Ac-228", "Pb-212", "Bi-212"}
optional: {"Tl-208", "Ra-224"}
inferred_parent: "Th-232"

When you see Ac-228 + Pb-212 + Bi-212 → Th-232 source material

Rn-220 Progeny (Thoron Daughters)

required: {"Pb-212", "Bi-212"}
optional: {"Tl-208"}
inferred_parent: "Rn-220"

When you see Pb-212 + Bi-212 → thoron (Rn-220) is present
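
The four signatures above can be expressed as set comparisons. The sketch below is a hypothetical re-implementation, not the contents of decay_chains.py: the ChainSignature fields mirror the required/optional/inferred_parent keys shown above, while the scoring rule (fraction of required plus optional members detected) is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class ChainSignature:
    name: str
    inferred_parent: str
    required: FrozenSet[str]
    optional: FrozenSet[str] = field(default_factory=frozenset)


SIGNATURES = [
    ChainSignature("Rn-222 progeny", "Rn-222",
                   frozenset({"Pb-214", "Bi-214"}), frozenset({"Pb-210"})),
    ChainSignature("Ra-226 equilibrium", "U-238",
                   frozenset({"Ra-226", "Pb-214", "Bi-214"}), frozenset({"Pb-210", "Bi-210"})),
    ChainSignature("Th-232 equilibrium", "Th-232",
                   frozenset({"Ac-228", "Pb-212", "Bi-212"}), frozenset({"Tl-208", "Ra-224"})),
    ChainSignature("Rn-220 progeny", "Rn-220",
                   frozenset({"Pb-212", "Bi-212"}), frozenset({"Tl-208"})),
]


def match_signatures(detected: Set[str]) -> List[Tuple[str, ChainSignature, float]]:
    """Return (parent, signature, score) for every signature whose required set is met."""
    matches = []
    for sig in SIGNATURES:
        if sig.required <= detected:                          # all required daughters seen
            members = sig.required | sig.optional
            score = len(members & detected) / len(members)   # assumed scoring rule
            matches.append((sig.inferred_parent, sig, score))
    return sorted(matches, key=lambda m: m[2], reverse=True)


for parent, sig, score in match_signatures({"Pb-214", "Bi-214"}):
    print(f"{parent} via {sig.name}: {score:.0%}")
```

Because the Rn-220 signature's required set is a subset of the Th-232 signature's, a full thorium-chain detection matches both; a caller would typically prefer the signature with the larger required set.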

10.4 Using Decay Chain Inference

from synthetic_spectra.ground_truth.decay_chains import infer_parent_from_daughters

# After running inference, get detected isotope names
detected = {iso.name for iso in result.get_present_isotopes()}

# Infer parent isotopes
parents = infer_parent_from_daughters(detected)

for parent_name, signature, confidence in parents:
    print(f"Inferred: {parent_name} (confidence: {confidence:.1%})")
    print(f"  Based on: {signature.name}")
    print(f"  Required daughters: {signature.required_daughters}")

10.5 Interpreting Chain Detections

Example 1: Uranium Ore

Detected: U-238 (45%), Ra-226 (88%), Pb-214 (92%), Bi-214 (94%)

Interpretation:

  • U-238 has low detection probability (weak gamma)
  • Daughters are strong gamma emitters
  • High confidence of uranium-bearing material
  • In secular equilibrium

Example 2: Radon in Air

Detected: Pb-214 (78%), Bi-214 (82%)
NOT detected: Ra-226, U-238

Interpretation:

  • Airborne radon daughters (deposited on detector)
  • Parent Rn-222 is gas (no gamma)
  • Ra-226/U-238 not present locally
  • Common indoor measurement result

Example 3: Thorium Lantern Mantle

Detected: Th-232 (52%), Ac-228 (71%), Pb-212 (85%), Bi-212 (79%), Tl-208 (67%)

Interpretation:

  • Complete Th-232 chain
  • Tl-208's 2614 keV line is distinctive
  • Indicates thoriated material

10.6 U-238 Decay Chain Detail

U-238 (4.47 Gy)
  ↓ α
Th-234 (24.1 d) [63.3, 92.4 keV]
  ↓ β
Pa-234m (1.17 min) [1001 keV]
  ↓ β
U-234 (245 ky)
  ↓ α
Th-230 (75.4 ky)
  ↓ α
Ra-226 (1600 y) [186.2 keV]
  ↓ α
Rn-222 (3.82 d) [gas, no gamma]
  ↓ α
Po-218 (3.1 min)
  ↓ α
Pb-214 (26.8 min) [351.9, 295.2 keV] ★ KEY INDICATOR
  ↓ β
Bi-214 (19.9 min) [609.3, 1120.3 keV] ★ KEY INDICATOR
  ↓ β
Po-214 (164 μs)
  ↓ α
Pb-210 (22.3 y) [46.5 keV]
  ↓ β
Bi-210 (5.01 d)
  ↓ β
Po-210 (138 d)
  ↓ α
Pb-206 (stable)

10.7 Th-232 Decay Chain Detail

Th-232 (14.0 Gy)
  ↓ α
Ra-228 (5.75 y) [no significant gamma]
  ↓ β
Ac-228 (6.15 h) [911.2, 338.3, 969.0 keV] ★ KEY INDICATOR
  ↓ β
Th-228 (1.91 y) [84.4 keV]
  ↓ α
Ra-224 (3.66 d) [241.0 keV]
  ↓ α
Rn-220 (55.6 s) [549.7 keV]
  ↓ α
Po-216 (0.145 s)
  ↓ α
Pb-212 (10.64 h) [238.6 keV] ★ KEY INDICATOR
  ↓ β
Bi-212 (60.6 min) [727.3 keV]
  ↓ α (35.94%)        ↓ β (64.06%)
Tl-208 (3.05 min)     Po-212 (0.3 μs)
[583.2, 2614.5 keV]     ↓ α
      ↓ β               ↙
         → Pb-208 (stable)

11. Threshold Selection Guide

11.1 What is the Threshold?

The threshold is the probability cutoff for declaring an isotope "present":

  • probability >= threshold → DETECTED
  • probability < threshold → NOT DETECTED

11.2 Threshold Trade-offs

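Threshold | Precision | Recall    | False Positives | False Negatives
----------|-----------|-----------|-----------------|----------------
0.9       | Very High | Low       | Very Few        | Many
0.7       | High      | Moderate  | Few             | Some
0.5       | Balanced  | Balanced  | Balanced        | Balanced
0.3       | Moderate  | High      | Some            | Few
0.1       | Low       | Very High | Many            | Very Few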
Scenario Threshold Rationale
General purpose 0.5 Balanced performance
Calibration verification 0.7 High confidence needed
Weak source detection 0.3 Don't miss faint signals
Safety screening 0.3 Prioritize recall
Research/survey 0.4 Slightly favor sensitivity
Regulatory reporting 0.6 Minimize false positives
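
The precision/recall trade-off can be measured on a labeled validation set before choosing a threshold. A minimal sketch with made-up probabilities and labels (the numbers are illustrative, not results from the Vega model); `precision_recall` is a hypothetical helper:

```python
import numpy as np

def precision_recall(probs, truth, threshold):
    """Precision and recall of the rule `prob >= threshold`."""
    probs = np.asarray(probs, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    pred = probs >= threshold
    tp = int(np.sum(pred & truth))    # detected and really present
    fp = int(np.sum(pred & ~truth))   # detected but absent
    fn = int(np.sum(~pred & truth))   # present but missed
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall

# Illustrative data only: six isotope predictions with ground truth.
probs = [0.95, 0.80, 0.60, 0.40, 0.35, 0.20]
truth = [1, 1, 0, 1, 0, 0]
for t in (0.3, 0.7):
    p, r = precision_recall(probs, truth, t)
    print(f"threshold {t}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, lowering the threshold from 0.7 to 0.3 recovers the missed isotope at the cost of two false positives, which is the pattern the trade-off table describes.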

11.4 Adjusting Threshold at Runtime

# High-sensitivity scan
result_sensitive = inference.predict(spectrum, threshold=0.3)

# High-confidence confirmation
result_confident = inference.predict(spectrum, threshold=0.7)

# Compare
print(f"At 0.3: {result_sensitive.num_present} isotopes")
print(f"At 0.7: {result_confident.num_present} isotopes")

11.5 Multi-Threshold Analysis

def analyze_at_multiple_thresholds(spectrum, inference):
    """Analyze spectrum at multiple thresholds."""
    thresholds = [0.3, 0.5, 0.7, 0.9]
    
    for t in thresholds:
        result = inference.predict(spectrum, threshold=t)
        names = [iso.name for iso in result.get_present_isotopes()]
        print(f"Threshold {t}: {names}")

Example Output:

Threshold 0.3: ['Cs-137', 'Cs-134', 'K-40', 'Pb-214']
Threshold 0.5: ['Cs-137', 'Cs-134', 'K-40']
Threshold 0.7: ['Cs-137', 'Cs-134']
Threshold 0.9: ['Cs-137']

Interpretation: each isotope's confidence tier is the highest threshold it still passes. Cs-137 is almost certainly present (≥0.9), Cs-134 is very likely (≥0.7), K-40 is probable (≥0.5), and Pb-214 is only possible (≥0.3).
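
The same reading can be obtained from a single pass over the raw probabilities instead of four separate predictions. A sketch, assuming the probabilities are already available as plain floats; `highest_threshold_passed` is a hypothetical helper and the probabilities below are invented to match the example output:

```python
def highest_threshold_passed(probability, thresholds=(0.3, 0.5, 0.7, 0.9)):
    """Return the largest threshold with probability >= threshold, or None."""
    passed = [t for t in sorted(thresholds) if probability >= t]
    return passed[-1] if passed else None

# Invented probabilities, chosen to be consistent with the example output.
example = {"Cs-137": 0.95, "Cs-134": 0.80, "K-40": 0.60, "Pb-214": 0.35, "Co-60": 0.05}
for name, prob in example.items():
    tier = highest_threshold_passed(prob)
    label = f"passes {tier}" if tier is not None else "below all thresholds"
    print(f"{name}: {prob:.2f} ({label})")
```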


12. Example Workflows

12.1 Basic Inference Workflow

import numpy as np
from vega_portable_inference_2d import Vega2DInference

# 1. Initialize
inference = Vega2DInference("models/vega_2d_best.pt")

# 2. Load spectrum
spectrum = np.load("measurement.npy")
print(f"Spectrum shape: {spectrum.shape}")

# 3. Run inference
result = inference.predict(spectrum, threshold=0.5)

# 4. Display results
print(result.summary())

# 5. Export
with open("results.json", "w") as f:
    f.write(result.to_json())

12.2 Batch Processing Workflow

from pathlib import Path

import numpy as np
from vega_portable_inference_2d import Vega2DInference

def process_directory(data_dir: str, model_path: str, threshold: float = 0.5):
    """Process all spectra in a directory."""
    inference = Vega2DInference(model_path)
    results = []
    
    # Sort so results come back in a deterministic order
    for npy_file in sorted(Path(data_dir).glob("*.npy")):
        spectrum = np.load(npy_file)
        prediction = inference.predict(spectrum, threshold)
        
        results.append({
            "file": npy_file.name,
            "detected": [iso.name for iso in prediction.get_present_isotopes()],
            "confidence": prediction.confidence
        })
    
    return results

# Usage
results = process_directory("spectra/", "models/vega_2d_best.pt")
for r in results:
    print(f"{r['file']}: {r['detected']}")

12.3 Decay Chain Analysis Workflow

from vega_portable_inference_2d import Vega2DInference
from synthetic_spectra.ground_truth.decay_chains import (
    infer_parent_from_daughters,
    get_chain_daughters
)

def analyze_with_chain_inference(spectrum, inference, threshold=0.5):
    """Full analysis including decay chain inference."""
    
    # Run basic inference
    result = inference.predict(spectrum, threshold)
    detected = {iso.name for iso in result.get_present_isotopes()}
    
    print("=== DIRECT DETECTIONS ===")
    for iso in result.get_present_isotopes():
        print(f"  {iso.name}: {iso.probability:.1%}")
    
    # Infer parents from daughters
    print("\n=== DECAY CHAIN ANALYSIS ===")
    parents = infer_parent_from_daughters(detected)
    
    if parents:
        for parent, signature, confidence in parents:
            print(f"\n  Inferred Parent: {parent}")
            print(f"    Confidence: {confidence:.1%}")
            print(f"    Signature: {signature.name}")
            print(f"    Required daughters found: {detected & signature.required_daughters}")
    else:
        print("  No decay chain signatures identified")
    
    return result, parents

# Usage
result, parents = analyze_with_chain_inference(spectrum, inference)

12.4 Real-Time Monitoring Workflow

import time

def monitor_spectrum_stream(inference, spectrum_source, interval=1.0, threshold=0.5):
    """Monitor incoming spectra in real-time."""
    
    while True:
        # Get latest spectrum (implement your data source)
        spectrum = spectrum_source.get_latest()
        
        if spectrum is not None:
            result = inference.predict(spectrum, threshold)
            
            if result.num_present > 0:
                print(f"[{time.strftime('%H:%M:%S')}] DETECTION!")
                for iso in result.get_present_isotopes():
                    print(f"  {iso.name}: {iso.probability:.1%}, {iso.activity_bq:.1f} Bq")
            else:
                print(f"[{time.strftime('%H:%M:%S')}] Background only")
        
        time.sleep(interval)
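
The `spectrum_source` object is left to the reader. One possible shape for it is a rolling buffer that collects one row of counts per second and returns the most recent 60 rows as a (60, 1023) array. The class below is a hypothetical sketch, not part of the Vega codebase; how rows arrive from the detector (the `push` call) is an assumption:

```python
from collections import deque

import numpy as np

class RollingSpectrumSource:
    """Keeps the latest `num_intervals` one-second spectra in memory."""

    def __init__(self, num_intervals=60, num_channels=1023):
        self.num_channels = num_channels
        # deque with maxlen silently drops the oldest row when full
        self._rows = deque(maxlen=num_intervals)

    def push(self, counts):
        """Append one interval of per-channel counts (called by the reader)."""
        row = np.asarray(counts, dtype=np.float32)
        if row.shape != (self.num_channels,):
            raise ValueError(f"expected {self.num_channels} channels, got {row.shape}")
        self._rows.append(row)

    def get_latest(self):
        """Return a (num_intervals, num_channels) array, or None until full."""
        if len(self._rows) < self._rows.maxlen:
            return None
        return np.stack(list(self._rows))
```

Returning `None` until the window is full matches the `if spectrum is not None` check in the loop above.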

12.5 Sample JSON Output

{
  "isotopes": [
    {
      "name": "Cs-137",
      "probability": 0.9823,
      "activity_bq": 45.2,
      "present": true
    },
    {
      "name": "Cs-134",
      "probability": 0.8741,
      "activity_bq": 18.7,
      "present": true
    }
  ],
  "num_present": 2,
  "confidence": 0.9282,
  "threshold_used": 0.5
}
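
Downstream tools can consume this output with the standard `json` module. The sketch below parses the sample above and cross-checks its fields. In this sample the `confidence` value equals the mean probability of the present isotopes; whether the library always computes it that way is an assumption.

```python
import json

sample = """
{
  "isotopes": [
    {"name": "Cs-137", "probability": 0.9823, "activity_bq": 45.2, "present": true},
    {"name": "Cs-134", "probability": 0.8741, "activity_bq": 18.7, "present": true}
  ],
  "num_present": 2,
  "confidence": 0.9282,
  "threshold_used": 0.5
}
"""

data = json.loads(sample)
present = [iso for iso in data["isotopes"] if iso["present"]]

# Consistency checks between the summary fields and the isotope list
count_matches = len(present) == data["num_present"]
above_threshold = all(iso["probability"] >= data["threshold_used"] for iso in present)
mean_prob = sum(iso["probability"] for iso in present) / len(present)

print(f"num_present consistent: {count_matches}")
print(f"all present isotopes above threshold: {above_threshold}")
print(f"mean probability of present isotopes: {mean_prob:.4f}")
```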

12.6 Sample Input Generation (for Testing)

import numpy as np
from vega_portable_inference_2d import create_sample_spectrum_2d

# Generate test spectrum
test_spectrum = create_sample_spectrum_2d(
    isotope="Cs-137",
    activity_bq=100.0,
    duration_seconds=60,
    add_background=True,
    add_noise=True,
    detector_fwhm_percent=8.5,
    seed=42
)

print(f"Shape: {test_spectrum.shape}")  # (60, 1023)
print(f"Range: [{test_spectrum.min():.1f}, {test_spectrum.max():.1f}]")

# Save for later
np.save("test_cs137.npy", test_spectrum)

13. Troubleshooting

13.1 Common Issues

Issue: "No isotopes detected" for known source

Possible causes:

  1. Threshold too high → Lower to 0.3
  2. Source very weak → Increase measurement time
  3. Wrong normalization → Check if max > 0
  4. Input shape wrong → Must be (T, 1023)

Solution:

# Check probabilities before thresholding
result = inference.predict(spectrum, threshold=0.0, return_all=True)
top5 = sorted(result.isotopes, key=lambda x: -x.probability)[:5]
for iso in top5:
    print(f"{iso.name}: {iso.probability:.1%}")

Issue: "Too many false positives"

Possible causes:

  1. Threshold too low → Raise to 0.6-0.7
  2. Noisy data → Check for acquisition problems
  3. Strong overlapping peaks → Check decay chains

Solution:

# Use higher threshold for confirmation
result = inference.predict(spectrum, threshold=0.7)

Issue: "CUDA out of memory"

Possible causes:

  1. Batch size too large
  2. Other GPU processes

Solution:

import torch

# Force CPU inference
inference = Vega2DInference(model_path, device=torch.device('cpu'))

Issue: "Model weights not matching"

Possible causes:

  1. Model architecture changed
  2. Wrong checkpoint version

Solution:

  • Ensure checkpoint matches Vega2DConfig defaults
  • Re-train if architecture was modified

13.2 Data Quality Checks

import numpy as np

def check_spectrum_quality(spectrum: np.ndarray) -> dict:
    """Check spectrum data quality."""
    issues = []
    
    # Shape checks (the channel check only runs on 2D input,
    # otherwise spectrum.shape[1] would raise IndexError)
    if spectrum.ndim != 2:
        issues.append(f"Wrong dimensions: {spectrum.ndim}, expected 2")
    elif spectrum.shape[1] != 1023:
        issues.append(f"Wrong channels: {spectrum.shape[1]}, expected 1023")
    
    # Empty arrays have no min/max, so report and return early
    if spectrum.size == 0:
        issues.append("Empty array - no data")
        return {"shape": spectrum.shape, "issues": issues, "valid": False}
    
    # Value checks
    if spectrum.min() < 0:
        issues.append("Contains negative values")
    
    if spectrum.max() == 0:
        issues.append("All zeros - no data")
    
    if np.isnan(spectrum).any():
        issues.append("Contains NaN values")
    
    if np.isinf(spectrum).any():
        issues.append("Contains infinite values")
    
    return {
        "shape": spectrum.shape,
        "min": float(spectrum.min()),
        "max": float(spectrum.max()),
        "mean": float(spectrum.mean()),
        "issues": issues,
        "valid": len(issues) == 0
    }

13.3 Performance Optimization

# Batch predictions are faster than individual
spectra = [np.load(f) for f in spectrum_files]
results = inference.predict_batch(spectra, threshold=0.5)

# Pre-load model once, reuse for all predictions
inference = Vega2DInference(model_path)  # Do once
for spectrum in stream:
    result = inference.predict(spectrum)  # Fast

Appendix A: Complete Configuration Reference

A.1 Vega2DConfig Defaults

Vega2DConfig(
    num_channels=1023,
    num_time_intervals=60,
    num_isotopes=82,
    conv_channels=[32, 64, 128],
    kernel_size=(3, 7),
    pool_size=(2, 2),
    fc_hidden_dims=[512, 256],
    dropout_rate=0.3,
    leaky_relu_slope=0.01,
    max_activity_bq=1000.0
)

A.2 TrainingConfig2D Defaults

TrainingConfig2D(
    data_dir="O:/master_data_collection/isotopev2",
    model_dir="models",
    target_time_intervals=60,
    epochs=50,
    batch_size=32,
    learning_rate=0.001,
    weight_decay=1e-05,
    classification_weight=1.0,
    regression_weight=0.1,
    use_amp=True,
    early_stopping_patience=10,
    lr_scheduler_patience=5,
    lr_scheduler_factor=0.5,
    num_workers=4
)

A.3 Generation Scenario Fractions

DEFAULT_SCENARIOS = [
    BackgroundOnlyScenario(0.15),
    SingleCalibrationScenario(0.20),
    SingleMedicalScenario(0.08),
    SingleIndustrialScenario(0.05),
    UraniumChainScenario(0.10),
    ThoriumChainScenario(0.10),
    NORMScenario(0.07),
    FalloutScenario(0.05),
    MixedSourcesScenario(0.10),
    ComplexMixScenario(0.05),
    WeakSourceScenario(0.05),
]
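
The fractions above sum to 1.0, so they can be turned into per-scenario sample counts for a dataset of any size. A sketch of one way to do that with largest-remainder rounding, so the counts always add up to the requested total; `allocate_samples` and the plain-dict representation are hypothetical and not how the generator itself is implemented:

```python
import math

# Scenario fractions copied from DEFAULT_SCENARIOS above.
FRACTIONS = {
    "BackgroundOnlyScenario": 0.15, "SingleCalibrationScenario": 0.20,
    "SingleMedicalScenario": 0.08, "SingleIndustrialScenario": 0.05,
    "UraniumChainScenario": 0.10, "ThoriumChainScenario": 0.10,
    "NORMScenario": 0.07, "FalloutScenario": 0.05,
    "MixedSourcesScenario": 0.10, "ComplexMixScenario": 0.05,
    "WeakSourceScenario": 0.05,
}

def allocate_samples(total, fractions):
    """Split `total` samples across scenarios in proportion to `fractions`."""
    if not math.isclose(sum(fractions.values()), 1.0, abs_tol=1e-9):
        raise ValueError("fractions must sum to 1.0")
    exact = {name: total * frac for name, frac in fractions.items()}
    # Small epsilon guards against float results like 6.999999999 for 7
    counts = {name: math.floor(value + 1e-9) for name, value in exact.items()}
    leftover = total - sum(counts.values())
    # Hand remaining samples to the scenarios with the largest remainders
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts

counts = allocate_samples(1000, FRACTIONS)
for name, n in counts.items():
    print(f"{name}: {n}")
```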

Appendix B: Version History

| Version | Date | Changes |
|---------|------|---------|
| 2.0 | Jan 2025 | 2D model architecture, temporal features |
| 1.0 | Dec 2024 | Original 1D model (deprecated) |

Document End

For questions or issues, consult the agents.md file in the repository root.