# Vega 2D Isotope Identification System - Complete Technical Guide **Version:** 2.0 (2D Model) **Last Updated:** January 2025 **Architecture:** 2D-CNN with Temporal Feature Extraction --- ## Table of Contents 1. [Executive Summary](#1-executive-summary) 2. [System Architecture Overview](#2-system-architecture-overview) 3. [Data Format Specification](#3-data-format-specification) 4. [Synthetic Data Generation](#4-synthetic-data-generation) 5. [Model Architecture](#5-model-architecture) 6. [Training Procedures](#6-training-procedures) 7. [Inference System](#7-inference-system) 8. [Output Interpretation](#8-output-interpretation) 9. [Isotope Reference](#9-isotope-reference) 10. [Decay Chain Analysis](#10-decay-chain-analysis) 11. [Threshold Selection Guide](#11-threshold-selection-guide) 12. [Example Workflows](#12-example-workflows) 13. [Troubleshooting](#13-troubleshooting) --- ## 1. Executive Summary ### What This System Does The Vega 2D system identifies **radioactive isotopes** from gamma-ray spectra captured by Radiacode scintillation detectors. Given a spectrum measurement, it outputs: 1. **Presence predictions** - Which of 82 isotopes are present (with probability 0-1) 2. **Activity estimates** - Estimated radioactivity in Becquerels (Bq) for each detected isotope ### Why 2D? Unlike traditional 1D approaches that collapse temporal data, the Vega 2D model treats spectra as **images** with: - **Y-axis:** 60 time intervals (1 second each) - **X-axis:** 1023 energy channels (20 keV - 3000 keV) This preserves crucial temporal information: - **Decay patterns** - Short-lived isotopes show decreasing counts over time - **Activity fluctuations** - Real sources have statistical variations - **Noise characteristics** - Poisson statistics create time-varying patterns - **Equilibrium dynamics** - Daughter isotope ingrowth over time ### Key Specifications | Parameter | Value | |-----------|-------| | Input Shape | `(60, 1023)` - 60 time intervals × 1023 channels | | Output Classes | 82 isotopes | | Model Parameters | ~59 million | | Inference Time | <100ms on GPU, ~500ms on CPU | | Typical F1 Score | >96% | --- ## 2. System Architecture Overview ### Directory Structure ``` ml-for-isotope-identification/ ├── synthetic_spectra/ # Data generation │ ├── generate_spectra_v3.py # Main generation script │ ├── generator.py # SpectrumGenerator class │ ├── config.py # Detector configurations │ └── ground_truth/ │ ├── isotope_data.py # 82 isotope definitions │ └── decay_chains.py # Decay chain relationships │ ├── training/vega/ # Training infrastructure │ ├── model_2d.py # Vega2DModel architecture │ ├── dataset_2d.py # 2D data loading │ ├── train_2d.py # Training loop │ └── isotope_index.py # Isotope ↔ index mapping │ ├── inference/ # Inference system │ └── vega_portable_inference_2d.py # Self-contained inference │ ├── models/ # Saved checkpoints │ ├── vega_2d_best.pt # Best validation model │ └── vega_2d_final.pt # Final epoch model │ └── data/synthetic/ # Generated training data └── spectra/ # .npy spectrum files ``` ### Data Flow ``` [Radiacode Detector] → [Raw Counts Array] → [Normalization] → [Vega 2D Model] ↓ [Results Display] ← [Activity Estimation] ← [Sigmoid(logits)] ← [Dual Heads] ``` --- ## 3. Data Format Specification ### 3.1 Input Spectrum Format **Shape:** `(num_time_intervals, 1023)` or ideally `(60, 1023)` **Data Type:** `float32` or `float64` **Value Range:** - Raw counts: integers 0 to ~thousands - Normalized: 0.0 to 1.0 (divided by max value) **Channel Mapping:** ```python def channel_to_energy(channel: int) -> float: """Convert channel index to energy in keV.""" E_MIN, E_MAX = 20.0, 3000.0 return E_MIN + channel * (E_MAX - E_MIN) / 1023 def energy_to_channel(energy_kev: float) -> int: """Convert energy in keV to channel index.""" E_MIN, E_MAX = 20.0, 3000.0 channel = int((energy_kev - E_MIN) / (E_MAX - E_MIN) * 1023) return max(0, min(1022, channel)) ``` **Example Channel Mappings:** | Energy (keV) | Channel | Notable Isotope | |--------------|---------|-----------------| | 59.5 | 14 | Am-241 | | 122.1 | 35 | Co-57 | | 356.0 | 116 | Ba-133 | | 661.7 | 221 | Cs-137 | | 1173.2 | 397 | Co-60 | | 1274.5 | 432 | Na-22 | | 1332.5 | 452 | Co-60 | | 1460.8 | 496 | K-40 (background) | ### 3.2 Time Dimension Handling The model **requires exactly 60 time intervals**. Input spectra are handled as follows: ```python def _pad_or_truncate(spectrum: np.ndarray, target_rows: int = 60) -> np.ndarray: """Ensure spectrum has exactly 60 rows.""" current_rows = spectrum.shape[0] if current_rows == target_rows: return spectrum elif current_rows > target_rows: # Truncate - take LAST N intervals (most recent data) return spectrum[-target_rows:] else: # Pad with zeros at the BEGINNING padding = np.zeros((target_rows - current_rows, spectrum.shape[1])) return np.vstack([padding, spectrum]) ``` **Important:** When truncating, the **most recent 60 seconds** are kept (last rows), not the first. ### 3.3 Normalization Before inference, spectra should be normalized to [0, 1]: ```python def normalize(spectrum: np.ndarray) -> np.ndarray: """Normalize spectrum to [0, 1] range.""" max_val = spectrum.max() if max_val > 0: return spectrum / max_val return spectrum ``` **Why normalize?** - Neural networks work best with standardized inputs - Prevents high-activity samples from dominating gradients - Allows model to focus on spectral shape rather than absolute counts --- ## 4. Synthetic Data Generation ### 4.1 Overview Training data is generated synthetically because: 1. Real radioactive sources require permits and safety protocols 2. ML requires 100,000+ samples 3. Ground truth labels are perfect with synthetic data 4. Can systematically vary all parameters ### 4.2 Generation Command ```bash # Generate 200,000 training samples python -m synthetic_spectra.generate_spectra_v3 \ --num_samples 200000 \ --output_dir "O:/master_data_collection/isotopev2" \ --detector radiacode_103 \ --workers 8 \ --activity_min 1.0 \ --activity_max 100.0 ``` ### 4.3 Generation Parameters | Parameter | Default | Description | |-----------|---------|-------------| | `--num_samples` | 200000 | Total samples to generate | | `--output_dir` | data/synthetic | Output directory | | `--detector` | radiacode_103 | Detector model to simulate | | `--workers` | CPU_count-1 | Parallel workers | | `--activity_min` | 1.0 | Minimum source activity (Bq) | | `--activity_max` | 100.0 | Maximum source activity (Bq) | | `--seed` | None | Random seed for reproducibility | ### 4.4 Sample Scenario Distribution The v3 generator creates diverse, realistic scenarios: | Scenario | Fraction | Description | |----------|----------|-------------| | `background_only` | 15% | No isotopes - just environmental background | | `single_calibration` | 20% | One calibration source (Cs-137, Co-60, etc.) | | `single_medical` | 8% | One medical isotope (Tc-99m, I-131, etc.) | | `single_industrial` | 5% | One industrial source (Ir-192, Se-75, etc.) | | `uranium_chain` | 10% | U-238 + daughters in equilibrium | | `thorium_chain` | 10% | Th-232 + daughters in equilibrium | | `norm` | 7% | 2-4 NORM isotopes (K-40, Ra-226, etc.) | | `fallout` | 5% | Reactor fallout signature (Cs-137 + Cs-134) | | `mixed` | 10% | Random 2-3 isotope combination | | `complex_mix` | 5% | 4-6 isotopes from various categories | | `weak_source` | 5% | Very low activity (0.1-5 Bq) | ### 4.5 Isotope Pools ```python # Calibration sources (individual, well-characterized) CALIBRATION_ISOTOPES = [ "Cs-137", "Co-60", "Am-241", "Ba-133", "Eu-152", "Na-22", "Co-57", "Mn-54" ] # Medical isotopes (short-lived, hospital settings) MEDICAL_ISOTOPES = [ "Tc-99m", "I-131", "I-123", "F-18", "Ga-67", "Ga-68", "In-111", "Lu-177", "Tl-201" ] # Industrial sources (sealed sources, gauges) INDUSTRIAL_ISOTOPES = [ "Ir-192", "Se-75", "Zn-65", "Co-58", "Cd-109" ] # Natural decay chains (always appear together) URANIUM_238_CHAIN = ["U-238", "Ra-226", "Pb-214", "Bi-214"] THORIUM_232_CHAIN = ["Th-232", "Ac-228", "Pb-212", "Bi-212", "Tl-208"] # Reactor fallout signature FALLOUT_SIGNATURE = ["Cs-137", "Cs-134"] # Indicates reactor origin ``` ### 4.6 Background Model Every synthetic spectrum includes realistic environmental background: 1. **Exponential continuum**: `B(E) = B₀ × exp(-E / E_char)` 2. **K-40** (potassium-40): 1460.8 keV - from soil, building materials 3. **Radon progeny** (Pb-214, Bi-214): From atmospheric radon 4. **Thorium progeny** (Pb-212, Tl-208, Ac-228): From soil Background intensity is randomized (0.3× to 3.0× baseline). ### 4.7 Physics Model Each gamma peak is generated as: ```python # Gaussian peak generation FWHM = FWHM_662 * sqrt(E / 662) # Resolution scales with energy sigma = FWHM / 2.355 expected_counts = activity_bq * time_seconds * branching_ratio * efficiency # Poisson noise applied to expected counts observed_counts = np.random.poisson(expected_counts) ``` ### 4.8 Output Files Each sample generates: - `{uuid}_spectrum.npy` - NumPy array (60, 1023) - `{uuid}_spectrum.png` - Visualization (optional) Plus a global `labels.json`: ```json { "abc123-def456": { "isotopes": [ {"name": "Cs-137", "activity_bq": 45.2, "category": "CALIBRATION"} ], "background_isotopes": ["K-40", "Pb-214", "Bi-214"], "detector": "radiacode_103", "duration_seconds": 60, "num_intervals": 60, "background_scale": 1.2, "generation_timestamp": "2025-01-24T12:34:56" } } ``` --- ## 5. Model Architecture ### 5.1 Architecture Overview ``` Vega2DModel (59M parameters) │ ├─ Input: (batch, 1, 60, 1023) [Grayscale image representation] │ ├─ ConvBlock2D #1 │ ├─ Conv2d(1→32, kernel=(3,7), padding=(1,3)) │ ├─ BatchNorm2d(32) │ ├─ LeakyReLU(0.01) │ ├─ Conv2d(32→32, kernel=(3,7), padding=(1,3)) │ ├─ BatchNorm2d(32) │ ├─ LeakyReLU(0.01) │ ├─ MaxPool2d((2,2)) → (batch, 32, 30, 511) │ └─ Dropout2d(0.3) │ ├─ ConvBlock2D #2 │ ├─ Conv2d(32→64, kernel=(3,7), padding=(1,3)) │ ├─ ...same structure... │ └─ MaxPool2d((2,2)) → (batch, 64, 15, 255) │ ├─ ConvBlock2D #3 │ ├─ Conv2d(64→128, kernel=(3,7), padding=(1,3)) │ ├─ ...same structure... │ └─ MaxPool2d((2,2)) → (batch, 128, 7, 127) │ ├─ Flatten → (batch, 113792) │ ├─ FC Block #1 │ ├─ Linear(113792→512) │ ├─ BatchNorm1d(512) │ ├─ LeakyReLU(0.01) │ └─ Dropout(0.3) │ ├─ FC Block #2 │ ├─ Linear(512→256) │ ├─ BatchNorm1d(256) │ ├─ LeakyReLU(0.01) │ └─ Dropout(0.3) │ └─ Dual Output Heads ├─ Classifier: Linear(256→82) → logits (for BCEWithLogitsLoss) └─ Regressor: Linear(256→82) → ReLU → normalized activity [0,1] ``` ### 5.2 Configuration Parameters ```python @dataclass class Vega2DConfig: # Input dimensions num_channels: int = 1023 # Energy channels num_time_intervals: int = 60 # Time dimension # Output num_isotopes: int = 82 # CNN architecture conv_channels: List[int] = [32, 64, 128] kernel_size: Tuple[int, int] = (3, 7) # (time, energy) pool_size: Tuple[int, int] = (2, 2) # FC layers fc_hidden_dims: List[int] = [512, 256] # Regularization dropout_rate: float = 0.3 leaky_relu_slope: float = 0.01 # Activity scaling max_activity_bq: float = 1000.0 ``` ### 5.3 Kernel Size Rationale The kernel `(3, 7)` is asymmetric: - **3 in time dimension**: Captures short temporal correlations (3 seconds) - **7 in energy dimension**: Captures spectral features wider than peak FWHM This asymmetry reflects the different nature of the two dimensions. ### 5.4 Dual-Head Design The model has **two output heads**: 1. **Classifier Head** (presence detection) - Output: 82 logits (raw scores) - Loss: `BCEWithLogitsLoss` (sigmoid applied internally) - Interpretation: `sigmoid(logit) > threshold` → isotope present 2. **Regressor Head** (activity estimation) - Output: 82 values in [0, 1] (normalized activity) - Loss: `HuberLoss` (robust to outliers) - Interpretation: `output × max_activity_bq` = estimated Bq ### 5.5 Loss Function ```python total_loss = cls_weight * BCEWithLogitsLoss(logits, presence_labels) + reg_weight * HuberLoss(pred_activities, true_activities) # Default weights cls_weight = 1.0 # Classification dominates reg_weight = 0.1 # Activity estimation is secondary ``` --- ## 6. Training Procedures ### 6.1 Quick Start ```bash # Test run (5 epochs) python -m training.vega.train_2d --test # Full training python -m training.vega.train_2d \ --epochs 50 \ --batch-size 32 \ --data-dir "O:/master_data_collection/isotopev2" # Without mixed precision (if GPU issues) python -m training.vega.train_2d --no-amp ``` ### 6.2 Training Configuration ```python @dataclass class TrainingConfig2D: # Data paths data_dir: str = "O:/master_data_collection/isotopev2" model_dir: str = "models" # Training hyperparameters epochs: int = 50 batch_size: int = 32 learning_rate: float = 1e-3 weight_decay: float = 1e-5 # Loss weights classification_weight: float = 1.0 regression_weight: float = 0.1 # Mixed precision use_amp: bool = True # Early stopping early_stopping_patience: int = 10 # Learning rate scheduler lr_scheduler_patience: int = 5 lr_scheduler_factor: float = 0.5 # Data loading num_workers: int = 4 ``` ### 6.3 Data Splits ```python # Default splits in dataset_2d.py train_ratio = 0.8 # 80% training val_ratio = 0.1 # 10% validation test_ratio = 0.1 # 10% test ``` ### 6.4 Training Loop Each epoch: 1. **Training phase**: Forward pass → loss → backward → optimizer step 2. **Validation phase**: Compute metrics without gradients 3. **Checkpointing**: Save if validation loss improved 4. **LR Scheduling**: Reduce LR if plateau detected 5. **Early stopping**: Stop if no improvement for N epochs ### 6.5 Metrics Tracked | Metric | Description | |--------|-------------| | `loss` | Combined BCE + Huber loss | | `cls_loss` | Binary cross-entropy (classification) | | `reg_loss` | Huber loss (activity regression) | | `exact_match` | % samples with all 82 isotopes correct | | `precision` | TP / (TP + FP) | | `recall` | TP / (TP + FN) | | `f1` | Harmonic mean of precision and recall | ### 6.6 Expected Results After 50 epochs on 200K samples: | Metric | Expected Value | |--------|----------------| | F1 Score | >96% | | Precision | >97% | | Recall | >94% | | Exact Match | >88% | | Training Time | ~4 hours (RTX 5090) | ### 6.7 Checkpoint Files Training produces: - `vega_2d_best.pt` - Best validation loss (use for inference) - `vega_2d_final.pt` - Final epoch - `vega_2d_epoch_{N}.pt` - Per-epoch checkpoints - `vega_2d_history.json` - Training metrics over time ### 6.8 Checkpoint Contents ```python checkpoint = { 'epoch': epoch, 'model_state_dict': model.state_dict(), 'optimizer_state_dict': optimizer.state_dict(), 'model_config': asdict(model_config), 'training_config': asdict(config), 'best_val_loss': best_val_loss, 'history': history } ``` --- ## 7. Inference System ### 7.1 Portable Inference Script The file `inference/vega_portable_inference_2d.py` is **completely self-contained** and can be deployed anywhere with just: - Python 3.8+ - NumPy - PyTorch It embeds: - Model architecture definition - Isotope index (all 82 names) - Key gamma lines for sample generation - Sample spectrum generator for testing ### 7.2 Command Line Usage ```bash # Run demo with synthetic spectra python vega_portable_inference_2d.py --model vega_2d_best.pt # Analyze a specific spectrum python vega_portable_inference_2d.py \ --model vega_2d_best.pt \ --spectrum my_measurement.npy \ --threshold 0.5 # Lower threshold for higher sensitivity python vega_portable_inference_2d.py \ --model vega_2d_best.pt \ --spectrum unknown_sample.npy \ --threshold 0.3 # JSON output python vega_portable_inference_2d.py \ --model vega_2d_best.pt \ --spectrum sample.npy \ --json ``` ### 7.3 Programmatic Usage ```python from vega_portable_inference_2d import Vega2DInference import numpy as np # Initialize inference engine inference = Vega2DInference("vega_2d_best.pt") # Load your spectrum (shape: any × 1023, will be padded/truncated to 60×1023) spectrum = np.load("my_measurement.npy") # Run inference result = inference.predict(spectrum, threshold=0.5) # Get human-readable summary print(result.summary()) # Access individual predictions for isotope in result.get_present_isotopes(): print(f"{isotope.name}: {isotope.probability:.1%} confidence, {isotope.activity_bq:.1f} Bq") # Get all 82 probabilities (even non-detected) full_result = inference.predict(spectrum, threshold=0.0, return_all=True) # Export to JSON json_str = result.to_json() # Export to dict data = result.to_dict() ``` ### 7.4 API Reference #### Vega2DInference Class ```python class Vega2DInference: def __init__( self, model_path: Union[str, Path], # Path to .pt checkpoint isotope_index: Optional = None, # Custom index (uses default) device: Optional = None # 'cuda', 'cpu', or auto-detect ): ... def predict( self, spectrum: np.ndarray, # (T, 1023) array threshold: float = 0.5, # Detection threshold return_all: bool = False # Include non-detected isotopes ) -> SpectrumPrediction: ... def predict_from_file( self, file_path: str, # Path to .npy file threshold: float = 0.5 ) -> SpectrumPrediction: ... def predict_batch( self, spectra: List[np.ndarray], threshold: float = 0.5 ) -> List[SpectrumPrediction]: ... ``` #### SpectrumPrediction Dataclass ```python @dataclass class SpectrumPrediction: isotopes: List[IsotopePrediction] # All predictions num_present: int # Count above threshold confidence: float # Average probability of detected threshold_used: float # Threshold used def get_present_isotopes(self) -> List[IsotopePrediction]: """Return only detected isotopes.""" def summary(self) -> str: """Human-readable summary.""" def to_dict(self) -> dict: """Convert to dictionary.""" def to_json(self, indent=2) -> str: """Convert to JSON string.""" ``` #### IsotopePrediction Dataclass ```python @dataclass class IsotopePrediction: name: str # e.g., "Cs-137" probability: float # 0.0 to 1.0 activity_bq: float # Estimated activity in Becquerels present: bool # True if probability >= threshold ``` --- ## 8. Output Interpretation ### 8.1 Understanding Predictions Each prediction contains: | Field | Type | Range | Meaning | |-------|------|-------|---------| | `probability` | float | 0.0-1.0 | Model's confidence isotope is present | | `activity_bq` | float | 0-1000 | Estimated activity (only meaningful if present) | | `present` | bool | T/F | Whether probability >= threshold | ### 8.2 Probability Interpretation | Probability | Interpretation | Action | |-------------|----------------|--------| | >0.95 | **Very High Confidence** | Definitely present | | 0.80-0.95 | **High Confidence** | Very likely present | | 0.50-0.80 | **Moderate Confidence** | Probably present, verify | | 0.30-0.50 | **Low Confidence** | Possibly present, investigate | | <0.30 | **Very Low** | Likely absent | ### 8.3 Activity Estimation Accuracy Activity estimates are **approximate** due to: - Unknown source distance - Unknown shielding - Detector efficiency variations - Normalization removes absolute count information **Use activity estimates for:** - Relative comparisons between isotopes - Order-of-magnitude estimates - Identifying dominant vs minor contributors **Do NOT use for:** - Regulatory compliance measurements - Precise quantitative analysis - Safety limit calculations ### 8.4 Single Isotope Detection When **one isotope** is detected: ```json { "isotopes": [ {"name": "Cs-137", "probability": 0.98, "activity_bq": 45.2, "present": true} ], "num_present": 1, "confidence": 0.98 } ``` **Interpretation:** - Clean calibration source or specific contamination - Verify gamma lines match expected energies - Single-isotope sources are common in: - Calibration checks - Medical procedures - Industrial gauges ### 8.5 Multiple Isotope Detection When **multiple isotopes** are detected: ```json { "isotopes": [ {"name": "Cs-137", "probability": 0.95, "activity_bq": 32.1, "present": true}, {"name": "Cs-134", "probability": 0.87, "activity_bq": 18.4, "present": true} ], "num_present": 2, "confidence": 0.91 } ``` **Interpretation:** - Check for decay chain relationships (Section 10) - Look for known signatures (fallout, NORM, equilibrium) - Consider mixed-source scenarios ### 8.6 Background-Only (No Detection) When **no isotopes** exceed threshold: ```json { "isotopes": [], "num_present": 0, "confidence": 0.82 } ``` **Interpretation:** - Spectrum shows only environmental background - K-40 (1460 keV) may be visible but below threshold - Natural radon daughters may contribute - **This is normal** for most measurements! ### 8.7 Common Detection Patterns #### Pattern 1: Calibration Source ``` Cs-137: 98% ─────────────────────── All others: <5% ``` Clean single-source signature. Typical for check sources. #### Pattern 2: NORM Material ``` K-40: 75% ───────────────── Ra-226: 62% ─────────────── Th-232: 58% ────────────── Bi-214: 71% ─────────────── ``` Multiple natural isotopes at similar activities. Indicates rocks, soil, building materials. #### Pattern 3: Decay Chain ``` U-238: 45% ───────────── Ra-226: 88% ────────────────── Pb-214: 92% ─────────────────── Bi-214: 94% ──────────────────── ``` Parent + daughters detected. Indicates secular equilibrium. See Section 10. #### Pattern 4: Reactor Fallout ``` Cs-137: 95% ─────────────────── Cs-134: 72% ──────────────── ``` Cs-137 + Cs-134 is the **fingerprint of reactor-origin material**. --- ## 9. Isotope Reference ### 9.1 Complete Isotope List (82 Total) The model identifies these isotopes, sorted alphabetically (same order as model output indices): ``` Index | Isotope | Category | Primary Gamma (keV) ------|-----------|---------------------|-------------------- 0 | Ac-225 | U235_CHAIN | 99.9 1 | Ac-227 | U235_CHAIN | 236.0 2 | Ac-228 | TH232_CHAIN | 911.2 3 | Ag-110m | ACTIVATION | 657.8 4 | Am-241 | CALIBRATION | 59.5 5 | Au-198 | MEDICAL | 411.8 6 | Ba-133 | CALIBRATION | 356.0 7 | Be-7 | COSMOGENIC | 477.6 8 | Bi-207 | CALIBRATION | 569.7 9 | Bi-210 | U238_CHAIN | 46.5 10 | Bi-211 | U235_CHAIN | 351.1 11 | Bi-212 | TH232_CHAIN | 727.3 12 | Bi-214 | U238_CHAIN | 609.3 13 | C-14 | COSMOGENIC | (beta only) 14 | Cd-109 | INDUSTRIAL | 88.0 15 | Ce-139 | ACTIVATION | 165.9 16 | Ce-141 | REACTOR_FALLOUT | 145.4 17 | Ce-144 | REACTOR_FALLOUT | 133.5 18 | Co-57 | CALIBRATION | 122.1 19 | Co-58 | ACTIVATION | 810.8 20 | Co-60 | CALIBRATION | 1173.2, 1332.5 21 | Cr-51 | ACTIVATION | 320.1 22 | Cs-134 | REACTOR_FALLOUT | 604.7, 795.9 23 | Cs-137 | CALIBRATION | 661.7 24 | Cu-64 | MEDICAL | 1345.8 25 | Eu-152 | CALIBRATION | 121.8, 344.3 26 | Eu-154 | CALIBRATION | 123.1, 1274.4 27 | Eu-155 | REACTOR_FALLOUT | 86.5, 105.3 28 | F-18 | MEDICAL | 511.0 29 | Fe-55 | ACTIVATION | (X-rays) 30 | Fe-59 | ACTIVATION | 1099.3 31 | Ga-67 | MEDICAL | 93.3, 184.6 32 | Ga-68 | MEDICAL | 511.0 33 | Ge-68 | CALIBRATION | 511.0 34 | H-3 | COSMOGENIC | (beta only) 35 | Hf-175 | ACTIVATION | 343.4 36 | Hf-181 | ACTIVATION | 482.2 37 | Hg-203 | INDUSTRIAL | 279.2 38 | I-123 | MEDICAL | 159.0 39 | I-125 | MEDICAL | 35.5 40 | I-131 | MEDICAL | 364.5 41 | In-111 | MEDICAL | 171.3, 245.4 42 | Ir-192 | INDUSTRIAL | 316.5, 468.1 43 | K-40 | NATURAL_BACKGROUND | 1460.8 44 | Kr-85 | REACTOR_FALLOUT | 514.0 45 | La-140 | REACTOR_FALLOUT | 1596.2 46 | Lu-177 | MEDICAL | 208.4 47 | Mn-54 | CALIBRATION | 834.8 48 | Mo-99 | MEDICAL | 140.5, 739.5 49 | Na-22 | CALIBRATION | 511.0, 1274.5 50 | Na-24 | ACTIVATION | 1368.6, 2754.0 51 | Nb-95 | REACTOR_FALLOUT | 765.8 52 | Np-237 | INDUSTRIAL | 86.5 53 | Pa-231 | U235_CHAIN | 283.7 54 | Pa-233 | U238_CHAIN | 311.9 55 | Pa-234m | U238_CHAIN | 1001.0 56 | Pb-210 | U238_CHAIN | 46.5 57 | Pb-211 | U235_CHAIN | 404.9 58 | Pb-212 | TH232_CHAIN | 238.6 59 | Pb-214 | U238_CHAIN | 351.9 60 | Po-210 | U238_CHAIN | (alpha only) 61 | Pu-239 | INDUSTRIAL | 413.7 62 | Ra-223 | U235_CHAIN | 269.5 63 | Ra-224 | TH232_CHAIN | 241.0 64 | Ra-226 | U238_CHAIN | 186.2 65 | Rb-86 | ACTIVATION | 1076.6 66 | Rn-219 | U235_CHAIN | 271.2 67 | Rn-220 | TH232_CHAIN | 549.7 68 | Rn-222 | U238_CHAIN | (alpha only) 69 | Ru-103 | REACTOR_FALLOUT | 497.1 70 | Ru-106 | REACTOR_FALLOUT | 511.9, 621.9 71 | Sb-124 | ACTIVATION | 602.7 72 | Sb-125 | REACTOR_FALLOUT | 427.9 73 | Sc-46 | ACTIVATION | 889.3 74 | Se-75 | INDUSTRIAL | 264.7, 279.5 75 | Sr-85 | CALIBRATION | 514.0 76 | Sr-90 | REACTOR_FALLOUT | (beta only) 77 | Tc-99m | MEDICAL | 140.5 78 | Th-227 | U235_CHAIN | 236.0 79 | Th-228 | TH232_CHAIN | 84.4 80 | Th-232 | PRIMORDIAL | (chain daughters) 81 | Th-234 | U238_CHAIN | 63.3, 92.4 ``` ### 9.2 Key Gamma Lines Reference ```python GAMMA_LINES = { # Calibration Sources "Cs-137": [(661.7, 0.851)], # Classic 662 keV "Co-60": [(1173.2, 0.999), (1332.5, 0.9998)], # Dual peaks "Am-241": [(59.5, 0.359)], # Low energy "Ba-133": [(356.0, 0.623), (81.0, 0.329)], "Na-22": [(511.0, 1.798), (1274.5, 0.999)], # Positron annihilation "Eu-152": [(121.8, 0.284), (344.3, 0.265), (1408.0, 0.210)], # Medical "Tc-99m": [(140.5, 0.890)], "I-131": [(364.5, 0.817)], "F-18": [(511.0, 1.934)], # PET isotope # Background "K-40": [(1460.8, 0.107)], # Always present # Decay Chains "Pb-214": [(351.9, 0.371), (295.2, 0.192)], "Bi-214": [(609.3, 0.461), (1120.3, 0.150)], "Tl-208": [(583.2, 0.845), (2614.5, 0.359)], } ``` ### 9.3 Isotope Categories | Category | Description | Examples | |----------|-------------|----------| | `CALIBRATION` | Check sources, well-characterized | Cs-137, Co-60, Am-241 | | `MEDICAL` | Hospital/imaging use, short-lived | Tc-99m, I-131, F-18 | | `INDUSTRIAL` | Sealed sources, gauges | Ir-192, Se-75 | | `NATURAL_BACKGROUND` | Always present in environment | K-40 | | `PRIMORDIAL` | Existed since Earth formed | U-238, Th-232, U-235 | | `U238_CHAIN` | Uranium-238 decay daughters | Ra-226, Pb-214, Bi-214 | | `TH232_CHAIN` | Thorium-232 decay daughters | Ac-228, Pb-212, Tl-208 | | `U235_CHAIN` | Uranium-235 decay daughters | Pa-231, Ac-227 | | `REACTOR_FALLOUT` | Fission products | Cs-134, I-131, Sr-90 | | `ACTIVATION` | Neutron-activated materials | Co-58, Fe-59, Zn-65 | | `COSMOGENIC` | Cosmic ray produced | Be-7, Na-22, C-14 | --- ## 10. Decay Chain Analysis ### 10.1 Understanding Decay Chains Radioactive isotopes decay into other isotopes, forming **decay chains**. The three major natural chains are: 1. **Uranium-238 Series** → ends at Pb-206 (stable) 2. **Thorium-232 Series** → ends at Pb-208 (stable) 3. **Uranium-235 Series** → ends at Pb-207 (stable) ### 10.2 Secular Equilibrium In **secular equilibrium** (closed system, long time), all daughter activities equal the parent activity: ``` A_parent = A_daughter1 = A_daughter2 = ... = A_daughterN ``` This means detecting daughters implies parent presence! ### 10.3 Chain Signatures for Parent Inference The system defines **ChainSignature** patterns to infer parent isotopes from detected daughters: #### Rn-222 Progeny (Indicates Radon) ```python required: {"Pb-214", "Bi-214"} optional: {"Pb-210"} inferred_parent: "Rn-222" ``` **When you see Pb-214 + Bi-214 → atmospheric radon is present** #### Ra-226 Equilibrium (Indicates Uranium) ```python required: {"Ra-226", "Pb-214", "Bi-214"} optional: {"Pb-210", "Bi-210"} inferred_parent: "U-238" ``` **When you see Ra-226 + daughters → U-238 decay chain in equilibrium** #### Th-232 Equilibrium (Indicates Thorium) ```python required: {"Ac-228", "Pb-212", "Bi-212"} optional: {"Tl-208", "Ra-224"} inferred_parent: "Th-232" ``` **When you see Ac-228 + Pb-212 + Bi-212 → Th-232 source material** #### Rn-220 Progeny (Thoron Daughters) ```python required: {"Pb-212", "Bi-212"} optional: {"Tl-208"} inferred_parent: "Rn-220" ``` **When you see Pb-212 + Bi-212 → thoron (Rn-220) is present** ### 10.4 Using Decay Chain Inference ```python from synthetic_spectra.ground_truth.decay_chains import infer_parent_from_daughters # After running inference, get detected isotope names detected = {iso.name for iso in result.get_present_isotopes()} # Infer parent isotopes parents = infer_parent_from_daughters(detected) for parent_name, signature, confidence in parents: print(f"Inferred: {parent_name} (confidence: {confidence:.1%})") print(f" Based on: {signature.name}") print(f" Required daughters: {signature.required_daughters}") ``` ### 10.5 Interpreting Chain Detections #### Example 1: Uranium Ore ``` Detected: U-238 (45%), Ra-226 (88%), Pb-214 (92%), Bi-214 (94%) ``` **Interpretation:** - U-238 has low detection probability (weak gamma) - Daughters are strong gamma emitters - High confidence of uranium-bearing material - In secular equilibrium #### Example 2: Radon in Air ``` Detected: Pb-214 (78%), Bi-214 (82%) NOT detected: Ra-226, U-238 ``` **Interpretation:** - Airborne radon daughters (deposited on detector) - Parent Rn-222 is gas (no gamma) - Ra-226/U-238 not present locally - Common indoor measurement result #### Example 3: Thorium Lantern Mantle ``` Detected: Th-232 (52%), Ac-228 (71%), Pb-212 (85%), Bi-212 (79%), Tl-208 (67%) ``` **Interpretation:** - Complete Th-232 chain - Tl-208's 2614 keV line is distinctive - Indicates thoriated material ### 10.6 U-238 Decay Chain Detail ``` U-238 (4.47 Gy) ↓ α Th-234 (24.1 d) [63.3, 92.4 keV] ↓ β Pa-234m (1.17 min) [1001 keV] ↓ β U-234 (245 ky) ↓ α Th-230 (75.4 ky) ↓ α Ra-226 (1600 y) [186.2 keV] ↓ α Rn-222 (3.82 d) [gas, no gamma] ↓ α Po-218 (3.1 min) ↓ α Pb-214 (26.8 min) [351.9, 295.2 keV] ★ KEY INDICATOR ↓ β Bi-214 (19.9 min) [609.3, 1120.3 keV] ★ KEY INDICATOR ↓ β Po-214 (164 μs) ↓ α Pb-210 (22.3 y) [46.5 keV] ↓ β Bi-210 (5.01 d) ↓ β Po-210 (138 d) ↓ α Pb-206 (stable) ``` ### 10.7 Th-232 Decay Chain Detail ``` Th-232 (14.0 Gy) ↓ α Ra-228 (5.75 y) [no significant gamma] ↓ β Ac-228 (6.15 h) [911.2, 338.3, 969.0 keV] ★ KEY INDICATOR ↓ β Th-228 (1.91 y) [84.4 keV] ↓ α Ra-224 (3.66 d) [241.0 keV] ↓ α Rn-220 (55.6 s) [549.7 keV] ↓ α Po-216 (0.145 s) ↓ α Pb-212 (10.64 h) [238.6 keV] ★ KEY INDICATOR ↓ β Bi-212 (60.6 min) [727.3 keV] ↓ α (35.94%) ↓ β (64.06%) Tl-208 (3.05 min) Po-212 (0.3 μs) [583.2, 2614.5 keV] ↓ α ↓ β ↙ → Pb-208 (stable) ``` --- ## 11. Threshold Selection Guide ### 11.1 What is the Threshold? The threshold is the **probability cutoff** for declaring an isotope "present": - `probability >= threshold` → **DETECTED** - `probability < threshold` → **NOT DETECTED** ### 11.2 Threshold Trade-offs | Threshold | Precision | Recall | False Positives | False Negatives | |-----------|-----------|--------|-----------------|-----------------| | 0.9 | Very High | Low | Very Few | Many | | 0.7 | High | Moderate | Few | Some | | **0.5** | **Balanced** | **Balanced** | **Balanced** | **Balanced** | | 0.3 | Moderate | High | Some | Few | | 0.1 | Low | Very High | Many | Very Few | ### 11.3 Recommended Thresholds by Scenario | Scenario | Threshold | Rationale | |----------|-----------|-----------| | **General purpose** | 0.5 | Balanced performance | | **Calibration verification** | 0.7 | High confidence needed | | **Weak source detection** | 0.3 | Don't miss faint signals | | **Safety screening** | 0.3 | Prioritize recall | | **Research/survey** | 0.4 | Slightly favor sensitivity | | **Regulatory reporting** | 0.6 | Minimize false positives | ### 11.4 Adjusting Threshold at Runtime ```python # High-sensitivity scan result_sensitive = inference.predict(spectrum, threshold=0.3) # High-confidence confirmation result_confident = inference.predict(spectrum, threshold=0.7) # Compare print(f"At 0.3: {result_sensitive.num_present} isotopes") print(f"At 0.7: {result_confident.num_present} isotopes") ``` ### 11.5 Multi-Threshold Analysis ```python def analyze_at_multiple_thresholds(spectrum, inference): """Analyze spectrum at multiple thresholds.""" thresholds = [0.3, 0.5, 0.7, 0.9] for t in thresholds: result = inference.predict(spectrum, threshold=t) names = [iso.name for iso in result.get_present_isotopes()] print(f"Threshold {t}: {names}") ``` **Example Output:** ``` Threshold 0.3: ['Cs-137', 'Cs-134', 'K-40', 'Pb-214'] Threshold 0.5: ['Cs-137', 'Cs-134', 'K-40'] Threshold 0.7: ['Cs-137', 'Cs-134'] Threshold 0.9: ['Cs-137'] ``` **Interpretation:** Cs-137 is definitely present (>0.9), Cs-134 is very likely (>0.7), K-40 is probable (>0.5), Pb-214 is possible (>0.3). --- ## 12. Example Workflows ### 12.1 Basic Inference Workflow ```python import numpy as np from vega_portable_inference_2d import Vega2DInference # 1. Initialize inference = Vega2DInference("models/vega_2d_best.pt") # 2. Load spectrum spectrum = np.load("measurement.npy") print(f"Spectrum shape: {spectrum.shape}") # 3. Run inference result = inference.predict(spectrum, threshold=0.5) # 4. Display results print(result.summary()) # 5. Export with open("results.json", "w") as f: f.write(result.to_json()) ``` ### 12.2 Batch Processing Workflow ```python from pathlib import Path def process_directory(data_dir: str, model_path: str, threshold: float = 0.5): """Process all spectra in a directory.""" inference = Vega2DInference(model_path) results = [] for npy_file in Path(data_dir).glob("*.npy"): spectrum = np.load(npy_file) prediction = inference.predict(spectrum, threshold) results.append({ "file": npy_file.name, "detected": [iso.name for iso in prediction.get_present_isotopes()], "confidence": prediction.confidence }) return results # Usage results = process_directory("spectra/", "models/vega_2d_best.pt") for r in results: print(f"{r['file']}: {r['detected']}") ``` ### 12.3 Decay Chain Analysis Workflow ```python from vega_portable_inference_2d import Vega2DInference from synthetic_spectra.ground_truth.decay_chains import ( infer_parent_from_daughters, get_chain_daughters ) def analyze_with_chain_inference(spectrum, inference, threshold=0.5): """Full analysis including decay chain inference.""" # Run basic inference result = inference.predict(spectrum, threshold) detected = {iso.name for iso in result.get_present_isotopes()} print("=== DIRECT DETECTIONS ===") for iso in result.get_present_isotopes(): print(f" {iso.name}: {iso.probability:.1%}") # Infer parents from daughters print("\n=== DECAY CHAIN ANALYSIS ===") parents = infer_parent_from_daughters(detected) if parents: for parent, signature, confidence in parents: print(f"\n Inferred Parent: {parent}") print(f" Confidence: {confidence:.1%}") print(f" Signature: {signature.name}") print(f" Required daughters found: {detected & signature.required_daughters}") else: print(" No decay chain signatures identified") return result, parents # Usage result, parents = analyze_with_chain_inference(spectrum, inference) ``` ### 12.4 Real-Time Monitoring Workflow ```python import time def monitor_spectrum_stream(inference, spectrum_source, interval=1.0, threshold=0.5): """Monitor incoming spectra in real-time.""" while True: # Get latest spectrum (implement your data source) spectrum = spectrum_source.get_latest() if spectrum is not None: result = inference.predict(spectrum, threshold) if result.num_present > 0: print(f"[{time.strftime('%H:%M:%S')}] DETECTION!") for iso in result.get_present_isotopes(): print(f" {iso.name}: {iso.probability:.1%}, {iso.activity_bq:.1f} Bq") else: print(f"[{time.strftime('%H:%M:%S')}] Background only") time.sleep(interval) ``` ### 12.5 Sample JSON Output ```json { "isotopes": [ { "name": "Cs-137", "probability": 0.9823, "activity_bq": 45.2, "present": true }, { "name": "Cs-134", "probability": 0.8741, "activity_bq": 18.7, "present": true } ], "num_present": 2, "confidence": 0.9282, "threshold_used": 0.5 } ``` ### 12.6 Sample Input Generation (for Testing) ```python from vega_portable_inference_2d import create_sample_spectrum_2d # Generate test spectrum test_spectrum = create_sample_spectrum_2d( isotope="Cs-137", activity_bq=100.0, duration_seconds=60, add_background=True, add_noise=True, detector_fwhm_percent=8.5, seed=42 ) print(f"Shape: {test_spectrum.shape}") # (60, 1023) print(f"Range: [{test_spectrum.min():.1f}, {test_spectrum.max():.1f}]") # Save for later np.save("test_cs137.npy", test_spectrum) ``` --- ## 13. Troubleshooting ### 13.1 Common Issues #### Issue: "No isotopes detected" for known source **Possible causes:** 1. Threshold too high → Lower to 0.3 2. Source very weak → Increase measurement time 3. Wrong normalization → Check if max > 0 4. Input shape wrong → Must be (T, 1023) **Solution:** ```python # Check probabilities before thresholding result = inference.predict(spectrum, threshold=0.0, return_all=True) top5 = sorted(result.isotopes, key=lambda x: -x.probability)[:5] for iso in top5: print(f"{iso.name}: {iso.probability:.1%}") ``` #### Issue: "Too many false positives" **Possible causes:** 1. Threshold too low → Raise to 0.6-0.7 2. Noisy data → Check for acquisition problems 3. Strong overlapping peaks → Check decay chains **Solution:** ```python # Use higher threshold for confirmation result = inference.predict(spectrum, threshold=0.7) ``` #### Issue: "CUDA out of memory" **Possible causes:** 1. Batch size too large 2. Other GPU processes **Solution:** ```python # Force CPU inference inference = Vega2DInference(model_path, device=torch.device('cpu')) ``` #### Issue: "Model weights not matching" **Possible causes:** 1. Model architecture changed 2. Wrong checkpoint version **Solution:** - Ensure checkpoint matches Vega2DConfig defaults - Re-train if architecture was modified ### 13.2 Data Quality Checks ```python def check_spectrum_quality(spectrum: np.ndarray) -> dict: """Check spectrum data quality.""" issues = [] # Shape check if spectrum.ndim != 2: issues.append(f"Wrong dimensions: {spectrum.ndim}, expected 2") if spectrum.shape[1] != 1023: issues.append(f"Wrong channels: {spectrum.shape[1]}, expected 1023") # Value checks if spectrum.min() < 0: issues.append("Contains negative values") if spectrum.max() == 0: issues.append("All zeros - no data") if np.isnan(spectrum).any(): issues.append("Contains NaN values") if np.isinf(spectrum).any(): issues.append("Contains infinite values") return { "shape": spectrum.shape, "min": float(spectrum.min()), "max": float(spectrum.max()), "mean": float(spectrum.mean()), "issues": issues, "valid": len(issues) == 0 } ``` ### 13.3 Performance Optimization ```python # Batch predictions are faster than individual spectra = [np.load(f) for f in spectrum_files] results = inference.predict_batch(spectra, threshold=0.5) # Pre-load model once, reuse for all predictions inference = Vega2DInference(model_path) # Do once for spectrum in stream: result = inference.predict(spectrum) # Fast ``` --- ## Appendix A: Complete Configuration Reference ### A.1 Vega2DConfig Defaults ```python Vega2DConfig( num_channels=1023, num_time_intervals=60, num_isotopes=82, conv_channels=[32, 64, 128], kernel_size=(3, 7), pool_size=(2, 2), fc_hidden_dims=[512, 256], dropout_rate=0.3, leaky_relu_slope=0.01, max_activity_bq=1000.0 ) ``` ### A.2 TrainingConfig2D Defaults ```python TrainingConfig2D( data_dir="O:/master_data_collection/isotopev2", model_dir="models", target_time_intervals=60, epochs=50, batch_size=32, learning_rate=0.001, weight_decay=1e-05, classification_weight=1.0, regression_weight=0.1, use_amp=True, early_stopping_patience=10, lr_scheduler_patience=5, lr_scheduler_factor=0.5, num_workers=4 ) ``` ### A.3 Generation Scenario Fractions ```python DEFAULT_SCENARIOS = [ BackgroundOnlyScenario(0.15), SingleCalibrationScenario(0.20), SingleMedicalScenario(0.08), SingleIndustrialScenario(0.05), UraniumChainScenario(0.10), ThoriumChainScenario(0.10), NORMScenario(0.07), FalloutScenario(0.05), MixedSourcesScenario(0.10), ComplexMixScenario(0.05), WeakSourceScenario(0.05), ] ``` --- ## Appendix B: Version History | Version | Date | Changes | |---------|------|---------| | 2.0 | Jan 2025 | 2D model architecture, temporal features | | 1.0 | Dec 2024 | Original 1D model (deprecated) | --- **Document End** *For questions or issues, consult the agents.md file in the repository root.*