Vega 2D Isotope Identification System - Complete Technical Guide
Version: 2.0 (2D Model)
Last Updated: January 2025
Architecture: 2D-CNN with Temporal Feature Extraction
Table of Contents
- Executive Summary
- System Architecture Overview
- Data Format Specification
- Synthetic Data Generation
- Model Architecture
- Training Procedures
- Inference System
- Output Interpretation
- Isotope Reference
- Decay Chain Analysis
- Threshold Selection Guide
- Example Workflows
- Troubleshooting
1. Executive Summary
What This System Does
The Vega 2D system identifies radioactive isotopes from gamma-ray spectra captured by Radiacode scintillation detectors. Given a spectrum measurement, it outputs:
- Presence predictions - Which of 82 isotopes are present (with probability 0-1)
- Activity estimates - Estimated radioactivity in Becquerels (Bq) for each detected isotope
Why 2D?
Unlike traditional 1D approaches that collapse temporal data, the Vega 2D model treats spectra as images with:
- Y-axis: 60 time intervals (1 second each)
- X-axis: 1023 energy channels (20 keV - 3000 keV)
This preserves crucial temporal information:
- Decay patterns - Short-lived isotopes show decreasing counts over time
- Activity fluctuations - Real sources have statistical variations
- Noise characteristics - Poisson statistics create time-varying patterns
- Equilibrium dynamics - Daughter isotope ingrowth over time
Key Specifications
| Parameter | Value |
|---|---|
| Input Shape | (60, 1023) - 60 time intervals × 1023 channels |
| Output Classes | 82 isotopes |
| Model Parameters | ~59 million |
| Inference Time | <100ms on GPU, ~500ms on CPU |
| Typical F1 Score | >96% |
2. System Architecture Overview
Directory Structure
ml-for-isotope-identification/
├── synthetic_spectra/ # Data generation
│ ├── generate_spectra_v3.py # Main generation script
│ ├── generator.py # SpectrumGenerator class
│ ├── config.py # Detector configurations
│ └── ground_truth/
│ ├── isotope_data.py # 82 isotope definitions
│ └── decay_chains.py # Decay chain relationships
│
├── training/vega/ # Training infrastructure
│ ├── model_2d.py # Vega2DModel architecture
│ ├── dataset_2d.py # 2D data loading
│ ├── train_2d.py # Training loop
│ └── isotope_index.py # Isotope ↔ index mapping
│
├── inference/ # Inference system
│ └── vega_portable_inference_2d.py # Self-contained inference
│
├── models/ # Saved checkpoints
│ ├── vega_2d_best.pt # Best validation model
│ └── vega_2d_final.pt # Final epoch model
│
└── data/synthetic/ # Generated training data
└── spectra/ # .npy spectrum files
Data Flow
[Radiacode Detector] → [Raw Counts Array] → [Normalization] → [Vega 2D Model]
↓
[Results Display] ← [Activity Estimation] ← [Sigmoid(logits)] ← [Dual Heads]
3. Data Format Specification
3.1 Input Spectrum Format
Shape: (num_time_intervals, 1023) or ideally (60, 1023)
Data Type: float32 or float64
Value Range:
- Raw counts: integers 0 to ~thousands
- Normalized: 0.0 to 1.0 (divided by max value)
Channel Mapping:
def channel_to_energy(channel: int) -> float:
    """Convert channel index to energy in keV."""
    E_MIN, E_MAX = 20.0, 3000.0
    return E_MIN + channel * (E_MAX - E_MIN) / 1023

def energy_to_channel(energy_kev: float) -> int:
    """Convert energy in keV to channel index."""
    E_MIN, E_MAX = 20.0, 3000.0
    channel = int((energy_kev - E_MIN) / (E_MAX - E_MIN) * 1023)
    return max(0, min(1022, channel))
Example Channel Mappings:
| Energy (keV) | Channel | Notable Isotope |
|---|---|---|
| 59.5 | 13 | Am-241 |
| 122.1 | 35 | Co-57 |
| 356.0 | 115 | Ba-133 |
| 661.7 | 220 | Cs-137 |
| 1173.2 | 395 | Co-60 |
| 1274.5 | 430 | Na-22 |
| 1332.5 | 450 | Co-60 |
| 1460.8 | 494 | K-40 (background) |
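To check the mapping for any other gamma line, the two helpers from this section can be applied directly. The sketch below restates them so it runs on its own; `channel_width_kev` is a hypothetical helper added here for illustration and is not part of the original code.

```python
E_MIN, E_MAX, N_CHANNELS = 20.0, 3000.0, 1023

def channel_to_energy(channel: int) -> float:
    """Lower-edge energy (keV) of a channel, as defined in Section 3.1."""
    return E_MIN + channel * (E_MAX - E_MIN) / N_CHANNELS

def energy_to_channel(energy_kev: float) -> int:
    """Channel index containing the given energy, clamped to 0..1022."""
    channel = int((energy_kev - E_MIN) / (E_MAX - E_MIN) * N_CHANNELS)
    return max(0, min(N_CHANNELS - 1, channel))

def channel_width_kev() -> float:
    """Hypothetical helper: energy span covered by one channel."""
    return (E_MAX - E_MIN) / N_CHANNELS

for name, energy in [("Cs-137", 661.7), ("K-40", 1460.8)]:
    ch = energy_to_channel(energy)
    print(f"{name}: {energy} keV -> channel {ch} "
          f"(channel starts at {channel_to_energy(ch):.1f} keV)")
print(f"Channel width: {channel_width_kev():.2f} keV")
```

Because `energy_to_channel` truncates rather than rounds, each energy is assigned to the channel whose lower edge lies at or below it.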
3.2 Time Dimension Handling
The model requires exactly 60 time intervals. Input spectra are handled as follows:
import numpy as np

def _pad_or_truncate(spectrum: np.ndarray, target_rows: int = 60) -> np.ndarray:
    """Ensure spectrum has exactly target_rows rows (60 by default)."""
    current_rows = spectrum.shape[0]
    if current_rows == target_rows:
        return spectrum
    elif current_rows > target_rows:
        # Truncate - take LAST N intervals (most recent data)
        return spectrum[-target_rows:]
    else:
        # Pad with zeros at the BEGINNING, keeping the input dtype
        padding = np.zeros((target_rows - current_rows, spectrum.shape[1]), dtype=spectrum.dtype)
        return np.vstack([padding, spectrum])
Important: When truncating, the most recent 60 seconds are kept (last rows), not the first.
3.3 Normalization
Before inference, spectra should be normalized to [0, 1]:
def normalize(spectrum: np.ndarray) -> np.ndarray:
    """Normalize spectrum to [0, 1] range."""
    max_val = spectrum.max()
    if max_val > 0:
        return spectrum / max_val
    return spectrum
Why normalize?
- Neural networks work best with standardized inputs
- Prevents high-activity samples from dominating gradients
- Allows model to focus on spectral shape rather than absolute counts
4. Synthetic Data Generation
4.1 Overview
Training data is generated synthetically because:
- Real radioactive sources require permits and safety protocols
- Training a model of this size requires 100,000+ labeled samples
- Ground truth labels are perfect with synthetic data
- Can systematically vary all parameters
4.2 Generation Command
# Generate 200,000 training samples
python -m synthetic_spectra.generate_spectra_v3 \
--num_samples 200000 \
--output_dir "O:/master_data_collection/isotopev2" \
--detector radiacode_103 \
--workers 8 \
--activity_min 1.0 \
--activity_max 100.0
4.3 Generation Parameters
| Parameter | Default | Description |
|---|---|---|
| --num_samples | 200000 | Total samples to generate |
| --output_dir | data/synthetic | Output directory |
| --detector | radiacode_103 | Detector model to simulate |
| --workers | CPU_count-1 | Parallel workers |
| --activity_min | 1.0 | Minimum source activity (Bq) |
| --activity_max | 100.0 | Maximum source activity (Bq) |
| --seed | None | Random seed for reproducibility |
4.4 Sample Scenario Distribution
The v3 generator creates diverse, realistic scenarios:
| Scenario | Fraction | Description |
|---|---|---|
| background_only | 15% | No isotopes - just environmental background |
| single_calibration | 20% | One calibration source (Cs-137, Co-60, etc.) |
| single_medical | 8% | One medical isotope (Tc-99m, I-131, etc.) |
| single_industrial | 5% | One industrial source (Ir-192, Se-75, etc.) |
| uranium_chain | 10% | U-238 + daughters in equilibrium |
| thorium_chain | 10% | Th-232 + daughters in equilibrium |
| norm | 7% | 2-4 NORM isotopes (K-40, Ra-226, etc.) |
| fallout | 5% | Reactor fallout signature (Cs-137 + Cs-134) |
| mixed | 10% | Random 2-3 isotope combination |
| complex_mix | 5% | 4-6 isotopes from various categories |
| weak_source | 5% | Very low activity (0.1-5 Bq) |
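One way to draw scenarios in these proportions is weighted random sampling. The sketch below is illustrative only: `SCENARIO_WEIGHTS` and `sample_scenarios` are assumed names, and the actual v3 generator may select scenarios differently.

```python
import random
from collections import Counter

# Scenario fractions copied from the distribution table (they sum to 1.0).
SCENARIO_WEIGHTS = {
    "background_only": 0.15, "single_calibration": 0.20, "single_medical": 0.08,
    "single_industrial": 0.05, "uranium_chain": 0.10, "thorium_chain": 0.10,
    "norm": 0.07, "fallout": 0.05, "mixed": 0.10, "complex_mix": 0.05,
    "weak_source": 0.05,
}

def sample_scenarios(n: int, seed: int = 0) -> list:
    """Draw n scenario names with probability proportional to the weights."""
    rng = random.Random(seed)  # seeded for reproducibility, like --seed
    names = list(SCENARIO_WEIGHTS)
    weights = list(SCENARIO_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=n)

counts = Counter(sample_scenarios(10_000))
for name, count in counts.most_common(3):
    print(f"{name}: {count / 10_000:.1%}")
```

With enough samples the observed fractions approach the table values, which is a quick sanity check on a generated dataset's labels.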
4.5 Isotope Pools
# Calibration sources (individual, well-characterized)
CALIBRATION_ISOTOPES = [
"Cs-137", "Co-60", "Am-241", "Ba-133",
"Eu-152", "Na-22", "Co-57", "Mn-54"
]
# Medical isotopes (short-lived, hospital settings)
MEDICAL_ISOTOPES = [
"Tc-99m", "I-131", "I-123", "F-18",
"Ga-67", "Ga-68", "In-111", "Lu-177", "Tl-201"
]
# Industrial sources (sealed sources, gauges)
INDUSTRIAL_ISOTOPES = [
"Ir-192", "Se-75", "Zn-65", "Co-58", "Cd-109"
]
# Natural decay chains (always appear together)
URANIUM_238_CHAIN = ["U-238", "Ra-226", "Pb-214", "Bi-214"]
THORIUM_232_CHAIN = ["Th-232", "Ac-228", "Pb-212", "Bi-212", "Tl-208"]
# Reactor fallout signature
FALLOUT_SIGNATURE = ["Cs-137", "Cs-134"] # Indicates reactor origin
4.6 Background Model
Every synthetic spectrum includes realistic environmental background:
- Exponential continuum: B(E) = B₀ × exp(-E / E_char)
- K-40 (potassium-40): 1460.8 keV - from soil, building materials
- Radon progeny (Pb-214, Bi-214): From atmospheric radon
- Thorium progeny (Pb-212, Tl-208, Ac-228): From soil
Background intensity is randomized (0.3× to 3.0× baseline).
4.7 Physics Model
Each gamma peak is generated as:
# Gaussian peak generation (fragment: E, FWHM_662, efficiency, etc. are supplied by the generator)
FWHM = FWHM_662 * np.sqrt(E / 662)  # Resolution scales with energy
sigma = FWHM / 2.355  # FWHM = 2.355 × sigma for a Gaussian
expected_counts = activity_bq * time_seconds * branching_ratio * efficiency
# Poisson noise applied to expected counts
observed_counts = np.random.poisson(expected_counts)
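The fragment above can be expanded into a small self-contained sketch that spreads one peak across the 1023 channels and applies Poisson noise per channel. It uses only the standard library, so the Poisson sampler is hand-written. The 56 keV FWHM at 662 keV (about 8.5%) and the 5% detection efficiency are assumed values for illustration, not figures taken from the generator.

```python
import math
import random

E_MIN, E_MAX, N_CHANNELS = 20.0, 3000.0, 1023
CHANNEL_WIDTH = (E_MAX - E_MIN) / N_CHANNELS

def poisson(rng: random.Random, lam: float) -> int:
    """Draw a Poisson variate with Knuth's method (suitable for small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def peak_row(energy_kev, activity_bq, branching_ratio, efficiency,
             fwhm_662_kev=56.0, seconds=1.0, rng=None):
    """One time interval (1023 channels) containing a single noisy peak."""
    rng = rng or random.Random()
    fwhm = fwhm_662_kev * math.sqrt(energy_kev / 662.0)
    sigma = fwhm / 2.355
    expected_total = activity_bq * seconds * branching_ratio * efficiency
    scale = sigma * math.sqrt(2.0)
    row = []
    for ch in range(N_CHANNELS):
        lo = E_MIN + ch * CHANNEL_WIDTH
        hi = lo + CHANNEL_WIDTH
        # Fraction of the Gaussian that falls inside this channel (CDF difference).
        frac = 0.5 * (math.erf((hi - energy_kev) / scale)
                      - math.erf((lo - energy_kev) / scale))
        row.append(poisson(rng, expected_total * frac))
    return row

row = peak_row(661.7, activity_bq=100.0, branching_ratio=0.851,
               efficiency=0.05, rng=random.Random(42))
print(f"Total counts in this 1 s interval: {sum(row)}")
```

Stacking 60 such rows gives one (60, 1023) sample. Integrating the Gaussian over each channel, rather than evaluating it at the channel centre, keeps the expected total equal to `expected_counts` regardless of peak width.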
4.8 Output Files
Each sample generates:
- {uuid}_spectrum.npy - NumPy array (60, 1023)
- {uuid}_spectrum.png - Visualization (optional)
Plus a global labels.json:
{
"abc123-def456": {
"isotopes": [
{"name": "Cs-137", "activity_bq": 45.2, "category": "CALIBRATION"}
],
"background_isotopes": ["K-40", "Pb-214", "Bi-214"],
"detector": "radiacode_103",
"duration_seconds": 60,
"num_intervals": 60,
"background_scale": 1.2,
"generation_timestamp": "2025-01-24T12:34:56"
}
}
5. Model Architecture
5.1 Architecture Overview
Vega2DModel (59M parameters)
│
├─ Input: (batch, 1, 60, 1023) [Grayscale image representation]
│
├─ ConvBlock2D #1
│ ├─ Conv2d(1→32, kernel=(3,7), padding=(1,3))
│ ├─ BatchNorm2d(32)
│ ├─ LeakyReLU(0.01)
│ ├─ Conv2d(32→32, kernel=(3,7), padding=(1,3))
│ ├─ BatchNorm2d(32)
│ ├─ LeakyReLU(0.01)
│ ├─ MaxPool2d((2,2)) → (batch, 32, 30, 511)
│ └─ Dropout2d(0.3)
│
├─ ConvBlock2D #2
│ ├─ Conv2d(32→64, kernel=(3,7), padding=(1,3))
│ ├─ ...same structure...
│ └─ MaxPool2d((2,2)) → (batch, 64, 15, 255)
│
├─ ConvBlock2D #3
│ ├─ Conv2d(64→128, kernel=(3,7), padding=(1,3))
│ ├─ ...same structure...
│ └─ MaxPool2d((2,2)) → (batch, 128, 7, 127)
│
├─ Flatten → (batch, 113792)
│
├─ FC Block #1
│ ├─ Linear(113792→512)
│ ├─ BatchNorm1d(512)
│ ├─ LeakyReLU(0.01)
│ └─ Dropout(0.3)
│
├─ FC Block #2
│ ├─ Linear(512→256)
│ ├─ BatchNorm1d(256)
│ ├─ LeakyReLU(0.01)
│ └─ Dropout(0.3)
│
└─ Dual Output Heads
├─ Classifier: Linear(256→82) → logits (for BCEWithLogitsLoss)
└─ Regressor: Linear(256→82) → ReLU → normalized activity [0,1]
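The shapes and the parameter count in the diagram can be checked with plain arithmetic. This sketch assumes that every Conv2d and Linear layer has a bias term, that BatchNorm layers contribute a learnable scale and shift per feature, and that MaxPool2d rounds odd sizes down. These are PyTorch defaults, but the source does not confirm them for this model.

```python
def conv_block_params(c_in: int, c_out: int, kernel=(3, 7)) -> int:
    """Parameters of two Conv2d + BatchNorm2d pairs in one ConvBlock2D."""
    k = kernel[0] * kernel[1]
    conv1 = c_in * c_out * k + c_out    # weights + biases
    conv2 = c_out * c_out * k + c_out
    batchnorm = 2 * (2 * c_out)         # scale and shift, two BatchNorm layers
    return conv1 + conv2 + batchnorm

def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

height, width = 60, 1023
channels = [1, 32, 64, 128]
total = 0
for c_in, c_out in zip(channels, channels[1:]):
    total += conv_block_params(c_in, c_out)
    height, width = height // 2, width // 2   # MaxPool2d((2, 2)) floors odd sizes

flat = channels[-1] * height * width
total += linear_params(flat, 512) + 2 * 512   # FC block 1 + BatchNorm1d
total += linear_params(512, 256) + 2 * 256    # FC block 2 + BatchNorm1d
total += 2 * linear_params(256, 82)           # classifier + regressor heads

print(f"Feature map after pooling: {channels[-1]} x {height} x {width}")
print(f"Flattened features: {flat}")
print(f"Trainable parameters: {total / 1e6:.1f}M")
```

Nearly all parameters sit in the first fully connected layer (113792 × 512), so the input size to the flatten step largely determines model size.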
5.2 Configuration Parameters
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Vega2DConfig:
    # Input dimensions
    num_channels: int = 1023  # Energy channels
    num_time_intervals: int = 60  # Time dimension
    # Output
    num_isotopes: int = 82
    # CNN architecture (list defaults need default_factory in a dataclass)
    conv_channels: List[int] = field(default_factory=lambda: [32, 64, 128])
    kernel_size: Tuple[int, int] = (3, 7)  # (time, energy)
    pool_size: Tuple[int, int] = (2, 2)
    # FC layers
    fc_hidden_dims: List[int] = field(default_factory=lambda: [512, 256])
    # Regularization
    dropout_rate: float = 0.3
    leaky_relu_slope: float = 0.01
    # Activity scaling
    max_activity_bq: float = 1000.0
5.3 Kernel Size Rationale
The kernel (3, 7) is asymmetric:
- 3 in time dimension: Captures short temporal correlations (3 seconds)
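- 7 in energy dimension: Spans about 20 keV per layer (7 channels × ~2.9 keV per channel) to capture local peak shape; stacked convolutions and pooling widen the receptive field to cover full peak widths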
This asymmetry reflects the different nature of the two dimensions.
5.4 Dual-Head Design
The model has two output heads:
- Classifier Head (presence detection)
  - Output: 82 logits (raw scores)
  - Loss: BCEWithLogitsLoss (sigmoid applied internally)
  - Interpretation: sigmoid(logit) >= threshold → isotope present
- Regressor Head (activity estimation)
  - Output: 82 values in [0, 1] (normalized activity)
  - Loss: HuberLoss (robust to outliers)
  - Interpretation: output × max_activity_bq = estimated Bq
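The two interpretation rules can be sketched as a small decoding step. `decode_outputs` is a hypothetical helper written for this guide and is not taken from the inference script; the input values are toy numbers, not real model output.

```python
import math

MAX_ACTIVITY_BQ = 1000.0  # default max_activity_bq from Vega2DConfig

def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_outputs(logits, norm_activities, names, threshold=0.5):
    """Turn raw head outputs into (name, probability, activity_bq) tuples."""
    detections = []
    for name, logit, act in zip(names, logits, norm_activities):
        prob = sigmoid(logit)
        if prob >= threshold:
            # Regressor output is normalized, so rescale to Becquerels.
            detections.append((name, prob, act * MAX_ACTIVITY_BQ))
    return detections

# Toy values for three isotopes (not real model output).
names = ["Co-60", "Cs-137", "K-40"]
logits = [-4.0, 3.5, -0.2]
norm_activities = [0.0, 0.045, 0.01]
for name, prob, bq in decode_outputs(logits, norm_activities, names):
    print(f"{name}: p={prob:.2f}, ~{bq:.0f} Bq")
```

Only isotopes that pass the classifier threshold get an activity reported, which matches the note in Section 8.1 that activity is only meaningful when the isotope is present.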
5.5 Loss Function
# Default weights
cls_weight = 1.0  # Classification dominates
reg_weight = 0.1  # Activity estimation is secondary

# Schematic: parentheses keep the two-term sum a single expression
total_loss = (
    cls_weight * BCEWithLogitsLoss(logits, presence_labels)
    + reg_weight * HuberLoss(pred_activities, true_activities)
)
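To make the weighting concrete, here is a pure-Python sketch of the same combined loss on plain lists. It is an illustration, not the training code. The Huber threshold `delta=1.0` is an assumption (it is the PyTorch default, but the source does not state which value is used).

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed from raw logits (numerically stable)."""
    losses = [max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
              for z, y in zip(logits, targets)]
    return sum(losses) / len(losses)

def huber(preds, targets, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for p, t in zip(preds, targets):
        err = abs(p - t)
        total += 0.5 * err ** 2 if err <= delta else delta * (err - 0.5 * delta)
    return total / len(preds)

def combined_loss(logits, presence, pred_act, true_act,
                  cls_weight=1.0, reg_weight=0.1):
    """Weighted sum of the classification and regression terms."""
    return (cls_weight * bce_with_logits(logits, presence)
            + reg_weight * huber(pred_act, true_act))

# Toy example with three isotopes: one present, two absent.
loss = combined_loss(logits=[3.0, -2.0, -4.0], presence=[1.0, 0.0, 0.0],
                     pred_act=[0.05, 0.0, 0.0], true_act=[0.045, 0.0, 0.0])
print(f"combined loss = {loss:.4f}")
```

Because normalized activities lie in [0, 1], regression errors stay in the quadratic region of the Huber loss, and the 0.1 weight further reduces their influence relative to classification.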
6. Training Procedures
6.1 Quick Start
# Test run (5 epochs)
python -m training.vega.train_2d --test
# Full training
python -m training.vega.train_2d \
--epochs 50 \
--batch-size 32 \
--data-dir "O:/master_data_collection/isotopev2"
# Without mixed precision (if GPU issues)
python -m training.vega.train_2d --no-amp
6.2 Training Configuration
@dataclass
class TrainingConfig2D:
    # Data paths
    data_dir: str = "O:/master_data_collection/isotopev2"
    model_dir: str = "models"
    # Training hyperparameters
    epochs: int = 50
    batch_size: int = 32
    learning_rate: float = 1e-3
    weight_decay: float = 1e-5
    # Loss weights
    classification_weight: float = 1.0
    regression_weight: float = 0.1
    # Mixed precision
    use_amp: bool = True
    # Early stopping
    early_stopping_patience: int = 10
    # Learning rate scheduler
    lr_scheduler_patience: int = 5
    lr_scheduler_factor: float = 0.5
    # Data loading
    num_workers: int = 4
6.3 Data Splits
# Default splits in dataset_2d.py
train_ratio = 0.8 # 80% training
val_ratio = 0.1 # 10% validation
test_ratio = 0.1 # 10% test
6.4 Training Loop
Each epoch:
- Training phase: Forward pass → loss → backward → optimizer step
- Validation phase: Compute metrics without gradients
- Checkpointing: Save if validation loss improved
- LR Scheduling: Reduce LR if plateau detected
- Early stopping: Stop if no improvement for N epochs
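The checkpointing, LR scheduling, and early-stopping steps interact, so it can help to replay a list of validation losses through the control logic alone. The sketch below is a simplified stand-in using the default patience values from Section 6.2. `run_schedule` is a hypothetical function, and the exact plateau rule in `train_2d.py` may differ.

```python
def run_schedule(val_losses, lr=1e-3, lr_patience=5, lr_factor=0.5, stop_patience=10):
    """Replay per-epoch validation losses through the training control logic."""
    best, best_epoch = float("inf"), -1
    since_best = 0       # epochs since the best loss (early stopping counter)
    since_lr_drop = 0    # non-improving epochs since the last LR change
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch   # a checkpoint would be saved here
            since_best = since_lr_drop = 0
        else:
            since_best += 1
            since_lr_drop += 1
            if since_lr_drop > lr_patience:  # plateau: reduce the learning rate
                lr *= lr_factor
                since_lr_drop = 0
            if since_best >= stop_patience:  # no improvement for N epochs
                return {"stopped_at": epoch, "best_epoch": best_epoch, "lr": lr}
    return {"stopped_at": len(val_losses) - 1, "best_epoch": best_epoch, "lr": lr}

# Three improving epochs followed by a long plateau.
outcome = run_schedule([1.0, 0.8, 0.7] + [0.75] * 12)
print(outcome)
```

With these defaults the learning rate is halved once before early stopping triggers, so the model gets one chance at a lower learning rate before training ends.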
6.5 Metrics Tracked
| Metric | Description |
|---|---|
| loss | Combined BCE + Huber loss |
| cls_loss | Binary cross-entropy (classification) |
| reg_loss | Huber loss (activity regression) |
| exact_match | % samples with all 82 isotopes correct |
| precision | TP / (TP + FP) |
| recall | TP / (TP + FN) |
| f1 | Harmonic mean of precision and recall |
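As a worked example of the classification metrics, the sketch below computes them for multi-label 0/1 predictions. It assumes micro-averaging (TP, FP, and FN pooled over all samples and isotopes); the source does not say which averaging the training script uses, and `multilabel_metrics` is a hypothetical name.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 and exact-match rate for 0/1 label rows."""
    tp = fp = fn = exact = 0
    for true_row, pred_row in zip(y_true, y_pred):
        exact += int(true_row == pred_row)   # every isotope correct in this sample
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "exact_match": exact / len(y_true)}

# Three samples, three isotopes each (toy labels).
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
for key, value in multilabel_metrics(y_true, y_pred).items():
    print(f"{key}: {value:.3f}")
```

Exact match is the strictest metric: a single missed or spurious isotope among the 82 makes the whole sample count as wrong, which is why it is lower than F1.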
6.6 Expected Results
After 50 epochs on 200K samples:
| Metric | Expected Value |
|---|---|
| F1 Score | >96% |
| Precision | >97% |
| Recall | >94% |
| Exact Match | >88% |
| Training Time | ~4 hours (RTX 5090) |
6.7 Checkpoint Files
Training produces:
- vega_2d_best.pt - Best validation loss (use for inference)
- vega_2d_final.pt - Final epoch
- vega_2d_epoch_{N}.pt - Per-epoch checkpoints
- vega_2d_history.json - Training metrics over time
6.8 Checkpoint Contents
checkpoint = {
'epoch': epoch,
'model_state_dict': model.state_dict(),
'optimizer_state_dict': optimizer.state_dict(),
'model_config': asdict(model_config),
'training_config': asdict(config),
'best_val_loss': best_val_loss,
'history': history
}
7. Inference System
7.1 Portable Inference Script
The file inference/vega_portable_inference_2d.py is completely self-contained and can be deployed anywhere with just:
- Python 3.8+
- NumPy
- PyTorch
It embeds:
- Model architecture definition
- Isotope index (all 82 names)
- Key gamma lines for sample generation
- Sample spectrum generator for testing
7.2 Command Line Usage
# Run demo with synthetic spectra
python vega_portable_inference_2d.py --model vega_2d_best.pt
# Analyze a specific spectrum
python vega_portable_inference_2d.py \
--model vega_2d_best.pt \
--spectrum my_measurement.npy \
--threshold 0.5
# Lower threshold for higher sensitivity
python vega_portable_inference_2d.py \
--model vega_2d_best.pt \
--spectrum unknown_sample.npy \
--threshold 0.3
# JSON output
python vega_portable_inference_2d.py \
--model vega_2d_best.pt \
--spectrum sample.npy \
--json
7.3 Programmatic Usage
from vega_portable_inference_2d import Vega2DInference
import numpy as np
# Initialize inference engine
inference = Vega2DInference("vega_2d_best.pt")
# Load your spectrum (shape: any × 1023, will be padded/truncated to 60×1023)
spectrum = np.load("my_measurement.npy")
# Run inference
result = inference.predict(spectrum, threshold=0.5)
# Get human-readable summary
print(result.summary())
# Access individual predictions
for isotope in result.get_present_isotopes():
    print(f"{isotope.name}: {isotope.probability:.1%} confidence, {isotope.activity_bq:.1f} Bq")
# Get all 82 probabilities (even non-detected)
full_result = inference.predict(spectrum, threshold=0.0, return_all=True)
# Export to JSON
json_str = result.to_json()
# Export to dict
data = result.to_dict()
7.4 API Reference
Vega2DInference Class
class Vega2DInference:
    def __init__(
        self,
        model_path: Union[str, Path],  # Path to .pt checkpoint
        isotope_index: Optional = None,  # Custom index (uses default)
        device: Optional = None  # 'cuda', 'cpu', or auto-detect
    ):
        ...

    def predict(
        self,
        spectrum: np.ndarray,  # (T, 1023) array
        threshold: float = 0.5,  # Detection threshold
        return_all: bool = False  # Include non-detected isotopes
    ) -> SpectrumPrediction:
        ...

    def predict_from_file(
        self,
        file_path: str,  # Path to .npy file
        threshold: float = 0.5
    ) -> SpectrumPrediction:
        ...

    def predict_batch(
        self,
        spectra: List[np.ndarray],
        threshold: float = 0.5
    ) -> List[SpectrumPrediction]:
        ...
SpectrumPrediction Dataclass
@dataclass
class SpectrumPrediction:
    isotopes: List[IsotopePrediction]  # All predictions
    num_present: int  # Count above threshold
    confidence: float  # Average probability of detected
    threshold_used: float  # Threshold used

    def get_present_isotopes(self) -> List[IsotopePrediction]:
        """Return only detected isotopes."""

    def summary(self) -> str:
        """Human-readable summary."""

    def to_dict(self) -> dict:
        """Convert to dictionary."""

    def to_json(self, indent=2) -> str:
        """Convert to JSON string."""
IsotopePrediction Dataclass
@dataclass
class IsotopePrediction:
    name: str  # e.g., "Cs-137"
    probability: float  # 0.0 to 1.0
    activity_bq: float  # Estimated activity in Becquerels
    present: bool  # True if probability >= threshold
8. Output Interpretation
8.1 Understanding Predictions
Each prediction contains:
| Field | Type | Range | Meaning |
|---|---|---|---|
| probability | float | 0.0-1.0 | Model's confidence isotope is present |
| activity_bq | float | 0-1000 | Estimated activity (only meaningful if present) |
| present | bool | T/F | Whether probability >= threshold |
8.2 Probability Interpretation
| Probability | Interpretation | Action |
|---|---|---|
| >0.95 | Very High Confidence | Definitely present |
| 0.80-0.95 | High Confidence | Very likely present |
| 0.50-0.80 | Moderate Confidence | Probably present, verify |
| 0.30-0.50 | Low Confidence | Possibly present, investigate |
| <0.30 | Very Low | Likely absent |
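For reports or logs, the bands in this table can be applied with a small lookup function. `interpret_probability` is a hypothetical helper, and the treatment of values exactly on a band edge (0.30, 0.50, 0.80, 0.95) is an assumption, since the table does not specify it.

```python
def interpret_probability(p: float) -> str:
    """Map a probability to the confidence bands of Section 8.2."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if p > 0.95:
        return "Very High Confidence"
    if p >= 0.80:
        return "High Confidence"
    if p >= 0.50:
        return "Moderate Confidence"
    if p >= 0.30:
        return "Low Confidence"
    return "Very Low"

for p in (0.98, 0.87, 0.62, 0.41, 0.05):
    print(f"{p:.2f}: {interpret_probability(p)}")
```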
8.3 Activity Estimation Accuracy
Activity estimates are approximate due to:
- Unknown source distance
- Unknown shielding
- Detector efficiency variations
- Normalization removes absolute count information
Use activity estimates for:
- Relative comparisons between isotopes
- Order-of-magnitude estimates
- Identifying dominant vs minor contributors
Do NOT use for:
- Regulatory compliance measurements
- Precise quantitative analysis
- Safety limit calculations
8.4 Single Isotope Detection
When one isotope is detected:
{
"isotopes": [
{"name": "Cs-137", "probability": 0.98, "activity_bq": 45.2, "present": true}
],
"num_present": 1,
"confidence": 0.98
}
Interpretation:
- Clean calibration source or specific contamination
- Verify gamma lines match expected energies
- Single-isotope sources are common in:
  - Calibration checks
  - Medical procedures
  - Industrial gauges
8.5 Multiple Isotope Detection
When multiple isotopes are detected:
{
"isotopes": [
{"name": "Cs-137", "probability": 0.95, "activity_bq": 32.1, "present": true},
{"name": "Cs-134", "probability": 0.87, "activity_bq": 18.4, "present": true}
],
"num_present": 2,
"confidence": 0.91
}
Interpretation:
- Check for decay chain relationships (Section 10)
- Look for known signatures (fallout, NORM, equilibrium)
- Consider mixed-source scenarios
8.6 Background-Only (No Detection)
When no isotopes exceed threshold:
{
"isotopes": [],
"num_present": 0,
"confidence": 0.82
}
Interpretation:
- Spectrum shows only environmental background
- K-40 (1460 keV) may be visible but below threshold
- Natural radon daughters may contribute
- This is normal for most measurements!
8.7 Common Detection Patterns
Pattern 1: Calibration Source
Cs-137: 98% ───────────────────────
All others: <5%
Clean single-source signature. Typical for check sources.
Pattern 2: NORM Material
K-40: 75% ─────────────────
Ra-226: 62% ───────────────
Th-232: 58% ──────────────
Bi-214: 71% ───────────────
Multiple natural isotopes at similar activities. Indicates rocks, soil, building materials.
Pattern 3: Decay Chain
U-238: 45% ─────────────
Ra-226: 88% ──────────────────
Pb-214: 92% ───────────────────
Bi-214: 94% ────────────────────
Parent + daughters detected. Indicates secular equilibrium. See Section 10.
Pattern 4: Reactor Fallout
Cs-137: 95% ───────────────────
Cs-134: 72% ────────────────
Cs-137 + Cs-134 is the fingerprint of reactor-origin material.
9. Isotope Reference
9.1 Complete Isotope List (82 Total)
The model identifies these isotopes, sorted alphabetically (same order as model output indices):
Index | Isotope | Category | Primary Gamma (keV)
------|-----------|---------------------|--------------------
0 | Ac-225 | U235_CHAIN | 99.9
1 | Ac-227 | U235_CHAIN | 236.0
2 | Ac-228 | TH232_CHAIN | 911.2
3 | Ag-110m | ACTIVATION | 657.8
4 | Am-241 | CALIBRATION | 59.5
5 | Au-198 | MEDICAL | 411.8
6 | Ba-133 | CALIBRATION | 356.0
7 | Be-7 | COSMOGENIC | 477.6
8 | Bi-207 | CALIBRATION | 569.7
9 | Bi-210 | U238_CHAIN | 46.5
10 | Bi-211 | U235_CHAIN | 351.1
11 | Bi-212 | TH232_CHAIN | 727.3
12 | Bi-214 | U238_CHAIN | 609.3
13 | C-14 | COSMOGENIC | (beta only)
14 | Cd-109 | INDUSTRIAL | 88.0
15 | Ce-139 | ACTIVATION | 165.9
16 | Ce-141 | REACTOR_FALLOUT | 145.4
17 | Ce-144 | REACTOR_FALLOUT | 133.5
18 | Co-57 | CALIBRATION | 122.1
19 | Co-58 | ACTIVATION | 810.8
20 | Co-60 | CALIBRATION | 1173.2, 1332.5
21 | Cr-51 | ACTIVATION | 320.1
22 | Cs-134 | REACTOR_FALLOUT | 604.7, 795.9
23 | Cs-137 | CALIBRATION | 661.7
24 | Cu-64 | MEDICAL | 1345.8
25 | Eu-152 | CALIBRATION | 121.8, 344.3
26 | Eu-154 | CALIBRATION | 123.1, 1274.4
27 | Eu-155 | REACTOR_FALLOUT | 86.5, 105.3
28 | F-18 | MEDICAL | 511.0
29 | Fe-55 | ACTIVATION | (X-rays)
30 | Fe-59 | ACTIVATION | 1099.3
31 | Ga-67 | MEDICAL | 93.3, 184.6
32 | Ga-68 | MEDICAL | 511.0
33 | Ge-68 | CALIBRATION | 511.0
34 | H-3 | COSMOGENIC | (beta only)
35 | Hf-175 | ACTIVATION | 343.4
36 | Hf-181 | ACTIVATION | 482.2
37 | Hg-203 | INDUSTRIAL | 279.2
38 | I-123 | MEDICAL | 159.0
39 | I-125 | MEDICAL | 35.5
40 | I-131 | MEDICAL | 364.5
41 | In-111 | MEDICAL | 171.3, 245.4
42 | Ir-192 | INDUSTRIAL | 316.5, 468.1
43 | K-40 | NATURAL_BACKGROUND | 1460.8
44 | Kr-85 | REACTOR_FALLOUT | 514.0
45 | La-140 | REACTOR_FALLOUT | 1596.2
46 | Lu-177 | MEDICAL | 208.4
47 | Mn-54 | CALIBRATION | 834.8
48 | Mo-99 | MEDICAL | 140.5, 739.5
49 | Na-22 | CALIBRATION | 511.0, 1274.5
50 | Na-24 | ACTIVATION | 1368.6, 2754.0
51 | Nb-95 | REACTOR_FALLOUT | 765.8
52 | Np-237 | INDUSTRIAL | 86.5
53 | Pa-231 | U235_CHAIN | 283.7
54 | Pa-233 | U238_CHAIN | 311.9
55 | Pa-234m | U238_CHAIN | 1001.0
56 | Pb-210 | U238_CHAIN | 46.5
57 | Pb-211 | U235_CHAIN | 404.9
58 | Pb-212 | TH232_CHAIN | 238.6
59 | Pb-214 | U238_CHAIN | 351.9
60 | Po-210 | U238_CHAIN | (alpha only)
61 | Pu-239 | INDUSTRIAL | 413.7
62 | Ra-223 | U235_CHAIN | 269.5
63 | Ra-224 | TH232_CHAIN | 241.0
64 | Ra-226 | U238_CHAIN | 186.2
65 | Rb-86 | ACTIVATION | 1076.6
66 | Rn-219 | U235_CHAIN | 271.2
67 | Rn-220 | TH232_CHAIN | 549.7
68 | Rn-222 | U238_CHAIN | (alpha only)
69 | Ru-103 | REACTOR_FALLOUT | 497.1
70 | Ru-106 | REACTOR_FALLOUT | 511.9, 621.9
71 | Sb-124 | ACTIVATION | 602.7
72 | Sb-125 | REACTOR_FALLOUT | 427.9
73 | Sc-46 | ACTIVATION | 889.3
74 | Se-75 | INDUSTRIAL | 264.7, 279.5
75 | Sr-85 | CALIBRATION | 514.0
76 | Sr-90 | REACTOR_FALLOUT | (beta only)
77 | Tc-99m | MEDICAL | 140.5
78 | Th-227 | U235_CHAIN | 236.0
79 | Th-228 | TH232_CHAIN | 84.4
80 | Th-232 | PRIMORDIAL | (chain daughters)
81 | Th-234 | U238_CHAIN | 63.3, 92.4
9.2 Key Gamma Lines Reference
GAMMA_LINES = {
# Calibration Sources
"Cs-137": [(661.7, 0.851)], # Classic 662 keV
"Co-60": [(1173.2, 0.999), (1332.5, 0.9998)], # Dual peaks
"Am-241": [(59.5, 0.359)], # Low energy
"Ba-133": [(356.0, 0.623), (81.0, 0.329)],
"Na-22": [(511.0, 1.798), (1274.5, 0.999)], # Positron annihilation
"Eu-152": [(121.8, 0.284), (344.3, 0.265), (1408.0, 0.210)],
# Medical
"Tc-99m": [(140.5, 0.890)],
"I-131": [(364.5, 0.817)],
"F-18": [(511.0, 1.934)], # PET isotope
# Background
"K-40": [(1460.8, 0.107)], # Always present
# Decay Chains
"Pb-214": [(351.9, 0.371), (295.2, 0.192)],
"Bi-214": [(609.3, 0.461), (1120.3, 0.150)],
"Tl-208": [(583.2, 0.845), (2614.5, 0.359)],
}
9.3 Isotope Categories
| Category | Description | Examples |
|---|---|---|
| CALIBRATION | Check sources, well-characterized | Cs-137, Co-60, Am-241 |
| MEDICAL | Hospital/imaging use, short-lived | Tc-99m, I-131, F-18 |
| INDUSTRIAL | Sealed sources, gauges | Ir-192, Se-75 |
| NATURAL_BACKGROUND | Always present in environment | K-40 |
| PRIMORDIAL | Existed since Earth formed | U-238, Th-232, U-235 |
| U238_CHAIN | Uranium-238 decay daughters | Ra-226, Pb-214, Bi-214 |
| TH232_CHAIN | Thorium-232 decay daughters | Ac-228, Pb-212, Tl-208 |
| U235_CHAIN | Uranium-235 decay daughters | Pa-231, Ac-227 |
| REACTOR_FALLOUT | Fission products | Cs-134, I-131, Sr-90 |
| ACTIVATION | Neutron-activated materials | Co-58, Fe-59, Zn-65 |
| COSMOGENIC | Cosmic ray produced | Be-7, Na-22, C-14 |
10. Decay Chain Analysis
10.1 Understanding Decay Chains
Radioactive isotopes decay into other isotopes, forming decay chains. The three major natural chains are:
- Uranium-238 Series → ends at Pb-206 (stable)
- Thorium-232 Series → ends at Pb-208 (stable)
- Uranium-235 Series → ends at Pb-207 (stable)
10.2 Secular Equilibrium
In secular equilibrium (closed system, long time), all daughter activities equal the parent activity:
A_parent = A_daughter1 = A_daughter2 = ... = A_daughterN
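In a closed system, this means that detecting the daughters implies the parent is present. The exception is unsupported daughters, such as radon progeny deposited from air (Example 2 in Section 10.5), which appear without their long-lived parent.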
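How quickly equilibrium is reached depends on the daughter's half-life. When the parent is much longer-lived than the daughter, a daughter that starts at zero grows in as A_daughter(t) = A_parent × (1 − exp(−λ_daughter × t)). The sketch below evaluates this for Rn-222 growing in from Ra-226, using the half-lives listed in Section 10.6. `daughter_fraction` is a hypothetical helper name.

```python
import math

def daughter_fraction(t_days: float, daughter_half_life_days: float) -> float:
    """Daughter activity as a fraction of parent activity after t_days,
    starting from a pure parent whose half-life is much longer."""
    decay_const = math.log(2) / daughter_half_life_days
    return 1.0 - math.exp(-decay_const * t_days)

# Rn-222 (half-life 3.82 d) growing in from Ra-226 (1600 y) in a sealed sample.
for days in (1, 3.82, 10, 30):
    ratio = daughter_fraction(days, 3.82)
    print(f"After {days:>5} d: A(Rn-222)/A(Ra-226) = {ratio:.3f}")
```

After one daughter half-life the ratio is one half, and after roughly seven half-lives it exceeds 99%, which is why a sealed radium sample needs about a month before its radon progeny lines reflect the parent activity.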
10.3 Chain Signatures for Parent Inference
The system defines ChainSignature patterns to infer parent isotopes from detected daughters:
Rn-222 Progeny (Indicates Radon)
required: {"Pb-214", "Bi-214"}
optional: {"Pb-210"}
inferred_parent: "Rn-222"
When you see Pb-214 + Bi-214 → atmospheric radon is present
Ra-226 Equilibrium (Indicates Uranium)
required: {"Ra-226", "Pb-214", "Bi-214"}
optional: {"Pb-210", "Bi-210"}
inferred_parent: "U-238"
When you see Ra-226 + daughters → U-238 decay chain in equilibrium
Th-232 Equilibrium (Indicates Thorium)
required: {"Ac-228", "Pb-212", "Bi-212"}
optional: {"Tl-208", "Ra-224"}
inferred_parent: "Th-232"
When you see Ac-228 + Pb-212 + Bi-212 → Th-232 source material
Rn-220 Progeny (Thoron Daughters)
required: {"Pb-212", "Bi-212"}
optional: {"Tl-208"}
inferred_parent: "Rn-220"
When you see Pb-212 + Bi-212 → thoron (Rn-220) is present
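The four signatures above amount to a subset test on the set of detected isotopes. The sketch below is a minimal stand-in, not the implementation in `decay_chains.py`: the dataclass layout, the `infer_parents` name, and especially the scoring rule (0.7 base, rising to 1.0 as optional daughters are found) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

@dataclass(frozen=True)
class ChainSignature:
    name: str
    required_daughters: frozenset
    inferred_parent: str
    optional_daughters: frozenset = field(default_factory=frozenset)

# Signature definitions copied from Section 10.3.
SIGNATURES = [
    ChainSignature("Rn-222 Progeny", frozenset({"Pb-214", "Bi-214"}),
                   "Rn-222", frozenset({"Pb-210"})),
    ChainSignature("Ra-226 Equilibrium", frozenset({"Ra-226", "Pb-214", "Bi-214"}),
                   "U-238", frozenset({"Pb-210", "Bi-210"})),
    ChainSignature("Th-232 Equilibrium", frozenset({"Ac-228", "Pb-212", "Bi-212"}),
                   "Th-232", frozenset({"Tl-208", "Ra-224"})),
    ChainSignature("Rn-220 Progeny", frozenset({"Pb-212", "Bi-212"}),
                   "Rn-220", frozenset({"Tl-208"})),
]

def infer_parents(detected: Set[str]) -> List[Tuple[str, ChainSignature, float]]:
    """Return (parent, signature, score) for each signature whose required set is present."""
    matches = []
    for sig in SIGNATURES:
        if sig.required_daughters <= detected:   # all required daughters detected
            n_opt = len(sig.optional_daughters)
            found = len(sig.optional_daughters & detected)
            # Illustrative score only: 0.7 base, up to 1.0 with optional daughters.
            score = 0.7 + 0.3 * (found / n_opt if n_opt else 1.0)
            matches.append((sig.inferred_parent, sig, score))
    return matches

for parent, sig, score in infer_parents({"Ac-228", "Pb-212", "Bi-212", "Tl-208"}):
    print(f"{parent} via '{sig.name}' (score {score:.2f})")
```

A detection set can match more than one signature. A full thorium chain, for example, also satisfies the thoron progeny signature, so callers may want to prefer the signature with the larger required set.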
10.4 Using Decay Chain Inference
from synthetic_spectra.ground_truth.decay_chains import infer_parent_from_daughters
# After running inference, get detected isotope names
detected = {iso.name for iso in result.get_present_isotopes()}
# Infer parent isotopes
parents = infer_parent_from_daughters(detected)
for parent_name, signature, confidence in parents:
    print(f"Inferred: {parent_name} (confidence: {confidence:.1%})")
    print(f"  Based on: {signature.name}")
    print(f"  Required daughters: {signature.required_daughters}")
10.5 Interpreting Chain Detections
Example 1: Uranium Ore
Detected: U-238 (45%), Ra-226 (88%), Pb-214 (92%), Bi-214 (94%)
Interpretation:
- U-238 has low detection probability (weak gamma)
- Daughters are strong gamma emitters
- High confidence of uranium-bearing material
- In secular equilibrium
Example 2: Radon in Air
Detected: Pb-214 (78%), Bi-214 (82%)
NOT detected: Ra-226, U-238
Interpretation:
- Airborne radon daughters (deposited on detector)
- Parent Rn-222 is gas (no gamma)
- Ra-226/U-238 not present locally
- Common indoor measurement result
Example 3: Thorium Lantern Mantle
Detected: Th-232 (52%), Ac-228 (71%), Pb-212 (85%), Bi-212 (79%), Tl-208 (67%)
Interpretation:
- Complete Th-232 chain
- Tl-208's 2614 keV line is distinctive
- Indicates thoriated material
10.6 U-238 Decay Chain Detail
U-238 (4.47 Gy)
↓ α
Th-234 (24.1 d) [63.3, 92.4 keV]
↓ β
Pa-234m (1.17 min) [1001 keV]
↓ β
U-234 (245 ky)
↓ α
Th-230 (75.4 ky)
↓ α
Ra-226 (1600 y) [186.2 keV]
↓ α
Rn-222 (3.82 d) [gas, no gamma]
↓ α
Po-218 (3.1 min)
↓ α
Pb-214 (26.8 min) [351.9, 295.2 keV] ★ KEY INDICATOR
↓ β
Bi-214 (19.9 min) [609.3, 1120.3 keV] ★ KEY INDICATOR
↓ β
Po-214 (164 μs)
↓ α
Pb-210 (22.3 y) [46.5 keV]
↓ β
Bi-210 (5.01 d)
↓ β
Po-210 (138 d)
↓ α
Pb-206 (stable)
10.7 Th-232 Decay Chain Detail
Th-232 (14.0 Gy)
↓ α
Ra-228 (5.75 y) [no significant gamma]
↓ β
Ac-228 (6.15 h) [911.2, 338.3, 969.0 keV] ★ KEY INDICATOR
↓ β
Th-228 (1.91 y) [84.4 keV]
↓ α
Ra-224 (3.66 d) [241.0 keV]
↓ α
Rn-220 (55.6 s) [549.7 keV]
↓ α
Po-216 (0.145 s)
↓ α
Pb-212 (10.64 h) [238.6 keV] ★ KEY INDICATOR
↓ β
Bi-212 (60.6 min) [727.3 keV]
├─ α (35.94%) → Tl-208 (3.05 min) [583.2, 2614.5 keV] → β → Pb-208 (stable)
└─ β (64.06%) → Po-212 (0.3 μs) → α → Pb-208 (stable)
11. Threshold Selection Guide
11.1 What is the Threshold?
The threshold is the probability cutoff for declaring an isotope "present":
- probability >= threshold → DETECTED
- probability < threshold → NOT DETECTED
11.2 Threshold Trade-offs
| Threshold | Precision | Recall | False Positives | False Negatives |
|---|---|---|---|---|
| 0.9 | Very High | Low | Very Few | Many |
| 0.7 | High | Moderate | Few | Some |
| 0.5 | Balanced | Balanced | Balanced | Balanced |
| 0.3 | Moderate | High | Some | Few |
| 0.1 | Low | Very High | Many | Very Few |
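The qualitative trade-off in this table can be reproduced on a labeled validation set by sweeping the threshold. The sketch below uses made-up scores for ten isotope decisions; `sweep_thresholds` is a hypothetical helper, and the convention of reporting precision as 1.0 when nothing is predicted is an assumption.

```python
def sweep_thresholds(probs, labels, thresholds=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Precision and recall at each threshold for paired (probability, 0/1 label) data."""
    rows = []
    for t in thresholds:
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 1.0   # no predictions: no false alarms
        recall = tp / (tp + fn) if tp + fn else 1.0
        rows.append((t, precision, recall))
    return rows

# Made-up scores: five isotopes that are truly present and five that are absent.
probs = [0.97, 0.85, 0.62, 0.41, 0.22, 0.55, 0.35, 0.15, 0.08, 0.03]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
for t, precision, recall in sweep_thresholds(probs, labels):
    print(f"threshold {t:.1f}: precision {precision:.2f}, recall {recall:.2f}")
```

Running the same sweep on real validation predictions gives numeric values for the qualitative entries in the table and helps choose a threshold for a given scenario.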
11.3 Recommended Thresholds by Scenario
| Scenario | Threshold | Rationale |
|---|---|---|
| General purpose | 0.5 | Balanced performance |
| Calibration verification | 0.7 | High confidence needed |
| Weak source detection | 0.3 | Don't miss faint signals |
| Safety screening | 0.3 | Prioritize recall |
| Research/survey | 0.4 | Slightly favor sensitivity |
| Regulatory reporting | 0.6 | Minimize false positives |
11.4 Adjusting Threshold at Runtime
# High-sensitivity scan
result_sensitive = inference.predict(spectrum, threshold=0.3)
# High-confidence confirmation
result_confident = inference.predict(spectrum, threshold=0.7)
# Compare
print(f"At 0.3: {result_sensitive.num_present} isotopes")
print(f"At 0.7: {result_confident.num_present} isotopes")
11.5 Multi-Threshold Analysis
def analyze_at_multiple_thresholds(spectrum, inference):
    """Analyze spectrum at multiple thresholds."""
    thresholds = [0.3, 0.5, 0.7, 0.9]
    for t in thresholds:
        result = inference.predict(spectrum, threshold=t)
        names = [iso.name for iso in result.get_present_isotopes()]
        print(f"Threshold {t}: {names}")
Example Output:
Threshold 0.3: ['Cs-137', 'Cs-134', 'K-40', 'Pb-214']
Threshold 0.5: ['Cs-137', 'Cs-134', 'K-40']
Threshold 0.7: ['Cs-137', 'Cs-134']
Threshold 0.9: ['Cs-137']
Interpretation: Cs-137 is definitely present (>0.9), Cs-134 is very likely (>0.7), K-40 is probable (>0.5), Pb-214 is possible (>0.3).
12. Example Workflows
12.1 Basic Inference Workflow
import numpy as np
from vega_portable_inference_2d import Vega2DInference
# 1. Initialize
inference = Vega2DInference("models/vega_2d_best.pt")
# 2. Load spectrum
spectrum = np.load("measurement.npy")
print(f"Spectrum shape: {spectrum.shape}")
# 3. Run inference
result = inference.predict(spectrum, threshold=0.5)
# 4. Display results
print(result.summary())
# 5. Export
with open("results.json", "w") as f:
    f.write(result.to_json())
12.2 Batch Processing Workflow
from pathlib import Path

import numpy as np
from vega_portable_inference_2d import Vega2DInference

def process_directory(data_dir: str, model_path: str, threshold: float = 0.5):
    """Process all spectra in a directory."""
    inference = Vega2DInference(model_path)
    results = []
    for npy_file in Path(data_dir).glob("*.npy"):
        spectrum = np.load(npy_file)
        prediction = inference.predict(spectrum, threshold)
        results.append({
            "file": npy_file.name,
            "detected": [iso.name for iso in prediction.get_present_isotopes()],
            "confidence": prediction.confidence
        })
    return results

# Usage
results = process_directory("spectra/", "models/vega_2d_best.pt")
for r in results:
    print(f"{r['file']}: {r['detected']}")
12.3 Decay Chain Analysis Workflow
from vega_portable_inference_2d import Vega2DInference
from synthetic_spectra.ground_truth.decay_chains import (
    infer_parent_from_daughters,
    get_chain_daughters
)

def analyze_with_chain_inference(spectrum, inference, threshold=0.5):
    """Full analysis including decay chain inference."""
    # Run basic inference
    result = inference.predict(spectrum, threshold)
    detected = {iso.name for iso in result.get_present_isotopes()}
    print("=== DIRECT DETECTIONS ===")
    for iso in result.get_present_isotopes():
        print(f"  {iso.name}: {iso.probability:.1%}")
    # Infer parents from daughters
    print("\n=== DECAY CHAIN ANALYSIS ===")
    parents = infer_parent_from_daughters(detected)
    if parents:
        for parent, signature, confidence in parents:
            print(f"\n  Inferred Parent: {parent}")
            print(f"  Confidence: {confidence:.1%}")
            print(f"  Signature: {signature.name}")
            print(f"  Required daughters found: {detected & signature.required_daughters}")
    else:
        print("  No decay chain signatures identified")
    return result, parents

# Usage
result, parents = analyze_with_chain_inference(spectrum, inference)
12.4 Real-Time Monitoring Workflow
import time

def monitor_spectrum_stream(inference, spectrum_source, interval=1.0, threshold=0.5):
    """Monitor incoming spectra in real-time."""
    while True:
        # Get latest spectrum (implement your data source)
        spectrum = spectrum_source.get_latest()
        if spectrum is not None:
            result = inference.predict(spectrum, threshold)
            if result.num_present > 0:
                print(f"[{time.strftime('%H:%M:%S')}] DETECTION!")
                for iso in result.get_present_isotopes():
                    print(f" {iso.name}: {iso.probability:.1%}, {iso.activity_bq:.1f} Bq")
            else:
                print(f"[{time.strftime('%H:%M:%S')}] Background only")
        time.sleep(interval)
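The monitoring loop leaves `spectrum_source` to the reader. One possible shape for it is a rolling window that keeps the most recent 60 one-second rows of 1023 channels, matching the input shape this guide describes. The sketch below is a hypothetical, standard-library-only illustration: the class name `RollingSpectrumSource` and its `push` method are assumptions, not part of the Vega code base, and the zero-filled rows stand in for real detector readings.

```python
from collections import deque

NUM_CHANNELS = 1023  # energy channels per interval, per this guide
WINDOW_ROWS = 60     # one-second intervals per spectrum, per this guide


class RollingSpectrumSource:
    """Hypothetical source that keeps the most recent one-second rows."""

    def __init__(self, window_rows=WINDOW_ROWS, num_channels=NUM_CHANNELS):
        self.num_channels = num_channels
        # A bounded deque drops the oldest row automatically when full.
        self._rows = deque(maxlen=window_rows)

    def push(self, counts):
        """Append one interval of per-channel counts read from the detector."""
        if len(counts) != self.num_channels:
            raise ValueError(
                f"expected {self.num_channels} channels, got {len(counts)}"
            )
        self._rows.append(list(counts))

    def get_latest(self):
        """Return the current window as a list of rows, or None until it is full."""
        if len(self._rows) < self._rows.maxlen:
            return None
        return [row[:] for row in self._rows]


# Placeholder data: 61 empty intervals, so the oldest one is discarded.
source = RollingSpectrumSource()
for _ in range(61):
    source.push([0] * NUM_CHANNELS)

window = source.get_latest()
print(len(window), len(window[0]))
```

Returning `None` until the window is full fits the `if spectrum is not None` check in the loop above. The rows are plain lists here, so a caller would likely wrap the result with `np.asarray(window)` before passing it to `inference.predict`.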
12.5 Sample JSON Output
{
  "isotopes": [
    {
      "name": "Cs-137",
      "probability": 0.9823,
      "activity_bq": 45.2,
      "present": true
    },
    {
      "name": "Cs-134",
      "probability": 0.8741,
      "activity_bq": 18.7,
      "present": true
    }
  ],
  "num_present": 2,
  "confidence": 0.9282,
  "threshold_used": 0.5
}
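A downstream consumer can read this output with the standard `json` module. The following sketch embeds the sample above as a string so that it runs on its own; in practice the text would come from `results.json`. It relies only on the field names shown in the sample.

```python
import json

# The sample output above, embedded so the sketch is self-contained.
sample = """
{
  "isotopes": [
    {"name": "Cs-137", "probability": 0.9823, "activity_bq": 45.2, "present": true},
    {"name": "Cs-134", "probability": 0.8741, "activity_bq": 18.7, "present": true}
  ],
  "num_present": 2,
  "confidence": 0.9282,
  "threshold_used": 0.5
}
"""

report = json.loads(sample)

# Keep only isotopes flagged as present, strongest first.
present = [iso for iso in report["isotopes"] if iso["present"]]
present.sort(key=lambda iso: -iso["probability"])

for iso in present:
    print(f'{iso["name"]}: {iso["probability"]:.1%}, {iso["activity_bq"]:.1f} Bq')

# Sanity check: the summary field should agree with the isotope list.
if len(present) != report["num_present"]:
    raise ValueError("num_present does not match the isotope list")
```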
12.6 Sample Input Generation (for Testing)
import numpy as np
from vega_portable_inference_2d import create_sample_spectrum_2d

# Generate test spectrum
test_spectrum = create_sample_spectrum_2d(
    isotope="Cs-137",
    activity_bq=100.0,
    duration_seconds=60,
    add_background=True,
    add_noise=True,
    detector_fwhm_percent=8.5,
    seed=42
)
print(f"Shape: {test_spectrum.shape}")  # (60, 1023)
print(f"Range: [{test_spectrum.min():.1f}, {test_spectrum.max():.1f}]")

# Save for later
np.save("test_cs137.npy", test_spectrum)
13. Troubleshooting
13.1 Common Issues
Issue: "No isotopes detected" for known source
Possible causes:
- Threshold too high → Lower to 0.3
- Source very weak → Increase measurement time
- Wrong normalization → Check if max > 0
- Input shape wrong → Must be (T, 1023)
Solution:
# Check probabilities before thresholding
result = inference.predict(spectrum, threshold=0.0, return_all=True)
top5 = sorted(result.isotopes, key=lambda x: -x.probability)[:5]
for iso in top5:
print(f"{iso.name}: {iso.probability:.1%}")
Issue: "Too many false positives"
Possible causes:
- Threshold too low → Raise to 0.6-0.7
- Noisy data → Check for acquisition problems
- Strong overlapping peaks → Check decay chains
Solution:
# Use higher threshold for confirmation
result = inference.predict(spectrum, threshold=0.7)
Issue: "CUDA out of memory"
Possible causes:
- Batch size too large
- Other GPU processes
Solution:
# Force CPU inference
import torch
inference = Vega2DInference(model_path, device=torch.device('cpu'))
Issue: "Model weights not matching"
Possible causes:
- Model architecture changed
- Wrong checkpoint version
Solution:
- Ensure checkpoint matches Vega2DConfig defaults
- Re-train if architecture was modified
13.2 Data Quality Checks
import numpy as np

def check_spectrum_quality(spectrum: np.ndarray) -> dict:
    """Check spectrum data quality."""
    issues = []
    # Shape check (test ndim first: shape[1] does not exist for 1D input)
    if spectrum.ndim != 2:
        issues.append(f"Wrong dimensions: {spectrum.ndim}, expected 2")
    elif spectrum.shape[1] != 1023:
        issues.append(f"Wrong channels: {spectrum.shape[1]}, expected 1023")
    # Value checks
    if spectrum.min() < 0:
        issues.append("Contains negative values")
    if spectrum.max() == 0:
        issues.append("All zeros - no data")
    if np.isnan(spectrum).any():
        issues.append("Contains NaN values")
    if np.isinf(spectrum).any():
        issues.append("Contains infinite values")
    return {
        "shape": spectrum.shape,
        "min": float(spectrum.min()),
        "max": float(spectrum.max()),
        "mean": float(spectrum.mean()),
        "issues": issues,
        "valid": len(issues) == 0
    }
13.3 Performance Optimization
# Pre-load the model once and reuse it for all predictions
inference = Vega2DInference(model_path)  # Do once

# Batch predictions are faster than individual calls
# (spectrum_files: your list of .npy paths)
spectra = [np.load(f) for f in spectrum_files]
results = inference.predict_batch(spectra, threshold=0.5)

# Streaming: reuse the same instance (stream: any iterable of spectra)
for spectrum in stream:
    result = inference.predict(spectrum)  # Fast: no model reload
Appendix A: Complete Configuration Reference
A.1 Vega2DConfig Defaults
Vega2DConfig(
    num_channels=1023,
    num_time_intervals=60,
    num_isotopes=82,
    conv_channels=[32, 64, 128],
    kernel_size=(3, 7),
    pool_size=(2, 2),
    fc_hidden_dims=[512, 256],
    dropout_rate=0.3,
    leaky_relu_slope=0.01,
    max_activity_bq=1000.0
)
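These defaults determine how large the feature map is when it reaches the fully connected layers. The arithmetic below is a rough sketch under assumptions that this appendix does not confirm: each entry in `conv_channels` is taken to be one convolution with "same" padding followed by one pooling step, and pooling is taken to round down (as `MaxPool2d` does by default in PyTorch). If the actual model pads differently or adds adaptive pooling, the numbers will differ. The helper `pooled_shape` is hypothetical.

```python
def pooled_shape(height, width, num_blocks, pool=(2, 2)):
    """Spatial size after num_blocks pooling steps, rounding down each time."""
    for _ in range(num_blocks):
        height //= pool[0]
        width //= pool[1]
    return height, width


# Values taken from the Vega2DConfig defaults above.
conv_channels = [32, 64, 128]
time_steps, energy_channels = 60, 1023

# Assumption: one pooling step per convolutional block.
h, w = pooled_shape(time_steps, energy_channels, num_blocks=len(conv_channels))
flattened = conv_channels[-1] * h * w

print(f"Feature map: ({conv_channels[-1]}, {h}, {w}) -> {flattened} features")
```

Under these assumptions the time axis shrinks 60 → 30 → 15 → 7 and the energy axis 1023 → 511 → 255 → 127, so the first fully connected layer would see 128 × 7 × 127 = 113,792 inputs.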
A.2 TrainingConfig2D Defaults
TrainingConfig2D(
    data_dir="O:/master_data_collection/isotopev2",
    model_dir="models",
    target_time_intervals=60,
    epochs=50,
    batch_size=32,
    learning_rate=0.001,
    weight_decay=1e-05,
    classification_weight=1.0,
    regression_weight=0.1,
    use_amp=True,
    early_stopping_patience=10,
    lr_scheduler_patience=5,
    lr_scheduler_factor=0.5,
    num_workers=4
)
A.3 Generation Scenario Fractions
DEFAULT_SCENARIOS = [
    BackgroundOnlyScenario(0.15),
    SingleCalibrationScenario(0.20),
    SingleMedicalScenario(0.08),
    SingleIndustrialScenario(0.05),
    UraniumChainScenario(0.10),
    ThoriumChainScenario(0.10),
    NORMScenario(0.07),
    FalloutScenario(0.05),
    MixedSourcesScenario(0.10),
    ComplexMixScenario(0.05),
    WeakSourceScenario(0.05),
]
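The eleven fractions sum to 1.0, so they can be turned directly into per-scenario sample counts. The sketch below is illustrative only: it copies the fractions into a plain dictionary keyed by class name, uses a hypothetical dataset size of 10,000, and applies a largest-remainder allocation so the counts always add up to the requested total. The real generator may allocate samples differently.

```python
# Fractions copied from DEFAULT_SCENARIOS above, keyed by scenario class name.
fractions = {
    "BackgroundOnlyScenario": 0.15,
    "SingleCalibrationScenario": 0.20,
    "SingleMedicalScenario": 0.08,
    "SingleIndustrialScenario": 0.05,
    "UraniumChainScenario": 0.10,
    "ThoriumChainScenario": 0.10,
    "NORMScenario": 0.07,
    "FalloutScenario": 0.05,
    "MixedSourcesScenario": 0.10,
    "ComplexMixScenario": 0.05,
    "WeakSourceScenario": 0.05,
}

total_samples = 10_000  # hypothetical dataset size


def allocate(fractions, total):
    """Largest-remainder allocation: counts sum exactly to total."""
    exact = {name: frac * total for name, frac in fractions.items()}
    counts = {name: int(value) for name, value in exact.items()}
    shortfall = total - sum(counts.values())
    # Hand the leftover samples to the scenarios with the largest remainders.
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:shortfall]:
        counts[name] += 1
    return counts


# Guard against edits that make the fractions drift away from 1.0.
if abs(sum(fractions.values()) - 1.0) > 1e-9:
    raise ValueError("scenario fractions must sum to 1.0")

counts = allocate(fractions, total_samples)
for name, count in counts.items():
    print(f"{name}: {count}")
```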
Appendix B: Version History
| Version | Date | Changes |
|---|---|---|
| 2.0 | Jan 2025 | 2D model architecture, temporal features |
| 1.0 | Dec 2024 | Original 1D model (deprecated) |
Document End
For questions or issues, consult the agents.md file in the repository root.