radiacode/CLAUDE.md
2026-05-21 17:36:33 +02:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Radiacode 103 is a gamma-ray spectrometer isotope identification pipeline. It captures spectra from a Radiacode 103 USB detector, subtracts background radiation, and identifies isotopes using a CNN-FCNN multi-task PyTorch model (VegaModel, 34.5M params, 82 isotopes). The project runs as Docker containers orchestrated by docker-compose.
## Architecture
Three Docker containers, each with its own Dockerfile:
- **train/** — Generates 50k synthetic spectra and trains VegaModel on GPU. Entrypoint runs generation then training sequentially. Code lives in `train/vega_ml/` (synthetic_spectra, training/vega).
- **detect/** — Production monitor. Connects to Radiacode 103 via USB, samples every 60s, accumulates spectrum, subtracts background, runs inference, writes JSON state and daily reports. Two scripts: `radiacode_monitor.py` (main loop) and `capture_background.py` (24h background capture).
- **web/** — FastAPI dashboard on port 8080. Serves a single-page HTML/JS frontend with tabs for spectrum, background, CPS timeline, and history. Reads monitor state from JSON files written by the detect container.
Data flow: `detect` writes `monitor_state.json` + `cps_log.jsonl` + daily reports to `/data/` and `/logs/`; `web` reads them through read-only volume mounts. The `train` container reads/writes `/data/synthetic/` and `/models/`.
### Web API Routes
- `/api/status` — monitor status (connected, CPS, staleness)
- `/api/spectrum/current` — accumulated spectrum (CsI-corrected, 1023 channels)
- `/api/spectrum/difference` — background-subtracted spectrum (CsI-corrected)
- `/api/background`, `/api/background/spectrum`, `/api/background/reference`, `/api/background/theoretical` — background data (live, 24h reference, theoretical CsI(Tl) model)
- `/api/cps/timeline` — CPS time series
- `/api/history`, `/api/history/{date}` — daily detection reports
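As a quick way to poke at the routes above, here is a minimal standard-library client sketch. It assumes the dashboard is reachable at `http://localhost:8080` and that each route returns a JSON body; `BASE_URL`, `api_url`, and `fetch_json` are hypothetical helper names, not part of the repository.

```python
import json
import urllib.request

# Assumption: the web container is published on localhost:8080.
BASE_URL = "http://localhost:8080"


def api_url(path: str, base: str = BASE_URL) -> str:
    """Join the base URL and an API path without doubling slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(path: str, timeout: float = 5.0):
    """GET an API route and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(api_url(path), timeout=timeout) as resp:
        return json.load(resp)


# Example (requires the web container to be running):
#   status = fetch_json("/api/status")
#   print(status)
```

The response fields are not documented here, so the sketch only decodes and returns whatever the server sends.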
### Key Physics Constants
Energy calibration: `E(keV) = 0.33 + 2.97 * channel_index` (env vars `ENERGY_CALIBRATION_OFFSET` and `ENERGY_CALIBRATION_SLOPE`). The detector has 1024 raw channels, but channel 1023 is an overflow bin, so only the first 1023 channels (0.33–3036 keV) are used for display and inference. CsI(Tl) crystal with 8.4% FWHM at 662 keV.
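The linear calibration can be sketched as a pair of helpers. The defaults and env var names come from the text above; the function names and the clamping behaviour in `energy_to_channel` are illustrative assumptions.

```python
import os

# Defaults match the calibration stated above; env var names as documented.
OFFSET = float(os.environ.get("ENERGY_CALIBRATION_OFFSET", "0.33"))
SLOPE = float(os.environ.get("ENERGY_CALIBRATION_SLOPE", "2.97"))
NUM_CHANNELS = 1023  # raw channel 1023 is the overflow bin and is dropped


def channel_to_energy(channel: int, offset: float = OFFSET, slope: float = SLOPE) -> float:
    """Energy in keV for a channel index."""
    return offset + slope * channel


def energy_to_channel(energy_kev: float, offset: float = OFFSET, slope: float = SLOPE) -> int:
    """Nearest usable channel for an energy, clamped to [0, NUM_CHANNELS - 1] (assumed behaviour)."""
    channel = round((energy_kev - offset) / slope)
    return max(0, min(NUM_CHANNELS - 1, channel))
```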
**CsI(Tl) non-linear response correction**: CsI(Tl) has non-proportional scintillation response at low energies, causing peaks to appear at higher energies than their true gamma energy. The correction `E_apparent = E_true * (1 + alpha * exp(-E_true/beta))` with `alpha=0.37, beta=100` shifts the Am-241 peak from 71.6 keV (apparent) back to 59.5 keV (true). This correction is applied in the inference pipeline (`radiacode_monitor.py`) and web display, NOT in training data (which uses theoretical energies). Parameters are configurable via `CSI_NONLINEAR_ALPHA` and `CSI_NONLINEAR_BETA` env vars.
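To make the formula concrete, the sketch below evaluates the forward mapping and inverts it numerically. It only illustrates the energy relationship; how `radiacode_monitor.py` actually remaps channels is not shown here, and the bisection inverse (`true_energy`) is an assumption chosen for simplicity.

```python
import math

ALPHA = 0.37  # CSI_NONLINEAR_ALPHA default
BETA = 100.0  # CSI_NONLINEAR_BETA default


def apparent_energy(e_true: float, alpha: float = ALPHA, beta: float = BETA) -> float:
    """E_apparent = E_true * (1 + alpha * exp(-E_true / beta))."""
    return e_true * (1.0 + alpha * math.exp(-e_true / beta))


def true_energy(e_apparent: float, alpha: float = ALPHA, beta: float = BETA,
                tol: float = 1e-6) -> float:
    """Invert the mapping by bisection (assumed approach, not the repository's code).

    The forward map is monotonically increasing for the default parameters,
    and E_true <= E_apparent because the multiplier is >= 1.
    """
    lo, hi = 0.0, e_apparent
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if apparent_energy(mid, alpha, beta) < e_apparent:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With the defaults, `apparent_energy(59.5)` evaluates to about 71.6 keV, matching the Am-241 example above, and `true_energy` maps that value back to about 59.5 keV.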
## Commands
```bash
# Build all images
docker compose build
# Train model (GPU required, ~30 min on RTX 5060 Ti)
docker compose run --rm train
# Capture 24h background (leave running, no radioactive source nearby)
docker compose run --rm -d --name radiacode-bg detect python capture_background.py
# Start continuous detection monitor
docker compose up detect
# Start web dashboard
docker compose up web
# Run both detect and web
docker compose up detect web
# Test detection manually (inside detect container)
docker compose run --rm -v $(pwd)/test_detection.py:/app/test_detection.py detect python /app/test_detection.py
```
No test suite exists in this project. No linter is configured.
## VegaModel
Defined in `train/vega_ml/training/vega/model.py`. Input: 1D spectrum (1023 channels). Output: classification logits (82 isotopes, apply sigmoid for probabilities) + activity predictions (Bq, scaled by max_activity_bq=1000). Loss: `VegaLoss = BCE(logits) + 0.1 * Huber(activities * mask)` — regression only penalizes present isotopes.
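The loss arithmetic can be checked by hand with a plain-Python reference for a single sample. This is a sketch, not the code in `model.py`: the real loss operates on PyTorch tensors, and the mean reduction, Huber `delta=1.0`, and use of the label vector as the mask are assumptions.

```python
import math


def bce_with_logits(logit: float, target: float) -> float:
    """Numerically stable binary cross-entropy on a raw logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def huber(error: float, delta: float = 1.0) -> float:
    """Huber penalty: quadratic near zero, linear beyond delta (delta is assumed)."""
    a = abs(error)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def vega_loss(logits, pred_act, labels, true_act, reg_weight: float = 0.1) -> float:
    """BCE(logits) + reg_weight * Huber(activities * mask), averaged over isotopes."""
    n = len(logits)
    cls = sum(bce_with_logits(x, t) for x, t in zip(logits, labels)) / n
    # The mask (here: the 0/1 label) zeroes the regression error for absent isotopes.
    reg = sum(huber((p - a) * m) for p, a, m in zip(pred_act, true_act, labels)) / n
    return cls + reg_weight * reg
```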
**Inference pipeline** (in `radiacode_monitor.py::run_inference`):
1. Subtract background from accumulated spectrum → net_rate
2. Apply CsI(Tl) non-linear correction: `correct_csi_nonlinear(net_rate)` — remaps channels so peaks appear at theoretical energies
3. Normalize with log1p: `log1p(corrected) / max(log1p(corrected))`
4. Feed to VegaModel → sigmoid → filter by threshold
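Steps 3 and 4 above can be sketched without the model itself. The helper names are hypothetical, and two details are assumptions: negative net rates are clipped to zero before `log1p`, and a probability equal to the threshold counts as a detection.

```python
import math


def normalize_log1p(spectrum):
    """log1p(x) / max(log1p(x)); negatives are clipped to 0 first (assumption)."""
    logged = [math.log1p(max(x, 0.0)) for x in spectrum]
    peak = max(logged, default=0.0)
    return [v / peak for v in logged] if peak > 0 else logged


def sigmoid(x: float) -> float:
    """Overflow-safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def filter_detections(logits, isotope_names, threshold: float = 0.5):
    """Return (name, probability) pairs whose probability reaches the threshold."""
    probs = [sigmoid(x) for x in logits]
    return [(name, p) for name, p in zip(isotope_names, probs) if p >= threshold]
```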
The model checkpoint (`models/vega_best.pt`) stores `model_config` and `model_state_dict`. At inference, the detect container dynamically imports `VegaModel` and `IsotopeIndex` from the mounted `vega_ml` volume.
## Synthetic Spectrum Generation
### Detector Physics Model
Training spectra include realistic CsI(Tl) detector effects:
- **Energy calibration**: `E = 0.33 + 2.97 * ch` with 1023 channels (matching real detector)
- **K-escape peaks**: Iodine K-shell X-ray escape at `E - 28.5 keV` with energy-dependent escape fraction (up to 35% at low energies). Implemented in `spectrum_physics.py::_k_escape_fraction()`
- **Asymmetric peaks**: Low-energy tail for peaks below 200 keV (15% tail fraction at 0 keV, 0% above 200 keV). Implemented in `spectrum_physics.py::_asymmetric_peak()`
- **FWHM**: Energy-dependent resolution `FWHM(E) = 0.084 * 662 * sqrt(E/662)` keV (8.4% at 662 keV)
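The numeric relationships in this list can be sketched as follows. The FWHM formula and the 28.5 keV escape offset come from the text; the linear fall-off in `tail_fraction` is an assumption (only the 15% and 0% endpoints are stated), and the energy-dependent escape fraction in `_k_escape_fraction()` is not reproduced.

```python
import math

IODINE_K_ESCAPE_KEV = 28.5


def fwhm_kev(energy_kev: float) -> float:
    """FWHM(E) = 0.084 * 662 * sqrt(E / 662), i.e. 8.4% at 662 keV."""
    return 0.084 * 662.0 * math.sqrt(energy_kev / 662.0)


def sigma_kev(energy_kev: float) -> float:
    """Gaussian sigma from FWHM: sigma = FWHM / (2 * sqrt(2 * ln 2))."""
    return fwhm_kev(energy_kev) / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def k_escape_energy(energy_kev: float) -> float:
    """Position of the iodine K-escape peak for a full-energy peak at energy_kev."""
    return energy_kev - IODINE_K_ESCAPE_KEV


def tail_fraction(energy_kev: float) -> float:
    """15% at 0 keV falling to 0% at 200 keV; linear interpolation is assumed."""
    return max(0.0, 0.15 * (1.0 - energy_kev / 200.0))
```

For example, `fwhm_kev(662.0)` is about 55.6 keV, and the Am-241 line at 59.5 keV has its K-escape peak at 31.0 keV.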
### Background Model
The training background uses a realistic CsI(Tl) continuum shape:
- **Continuum**: Asymmetric hump at ~110 keV (sigma_left=55, sigma_right=50 keV) + Compton tail + noise floor. Calibrated against real Radiacode 103 measurements.
- **Isotope peaks**: K-40, Pb-214, Bi-214, Ac-228, Pb-212, Tl-208 — with stochastic activity variation per sample.
- **Hybrid training**: If `MEASURED_BACKGROUND_PATH` points to a valid `.npy` file, 70% measured + 30% synthetic continuum is used.
- **Background subtraction mode**: 10% of training samples are background-subtracted (simulating the inference pipeline)
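A minimal sketch of the continuum hump and the hybrid mix is given below. It covers only the asymmetric hump (not the Compton tail or noise floor), models it as a split Gaussian, and reads the 70/30 rule as a per-channel weighted sum; all three are assumptions, and the function names are hypothetical.

```python
import math


def continuum_hump(energy_kev: float, center: float = 110.0,
                   sigma_left: float = 55.0, sigma_right: float = 50.0) -> float:
    """Split Gaussian: different widths below and above the hump centre."""
    sigma = sigma_left if energy_kev < center else sigma_right
    return math.exp(-0.5 * ((energy_kev - center) / sigma) ** 2)


def hybrid_background(measured, synthetic, measured_weight: float = 0.7):
    """Per-channel blend: 70% measured + 30% synthetic (assumed interpretation)."""
    if len(measured) != len(synthetic):
        raise ValueError("spectra must have the same number of channels")
    w = measured_weight
    return [w * m + (1.0 - w) * s for m, s in zip(measured, synthetic)]
```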
### Training Data Augmentation
- **Normalization**: log1p (replaces max normalization for better weak-signal detection)
- **Low-signal samples**: 15% of samples use 0.01–5 Bq activities with 30–300 s durations
- **Duration range**: 30–300 seconds (covers short accumulations to long measurements)
- **Activity range**: 0.01–100 Bq (covers weak to strong sources)
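One way to draw per-sample parameters under these settings is sketched below. The sampling distributions are not stated in this file, so log-uniform activities and uniform durations are assumptions, as is the reading of the low-signal activity cap as 5 Bq; `sample_source` is a hypothetical name.

```python
import math
import random


def log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Draw uniformly in log space, clamped against floating-point overshoot."""
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    return min(max(value, low), high)


def sample_source(rng: random.Random, low_signal_fraction: float = 0.15):
    """Return (activity_bq, duration_s, is_low_signal) for one synthetic sample."""
    low_signal = rng.random() < low_signal_fraction
    max_activity = 5.0 if low_signal else 100.0  # assumed low-signal cap
    activity_bq = log_uniform(rng, 0.01, max_activity)
    duration_s = rng.uniform(30.0, 300.0)
    return activity_bq, duration_s, low_signal
```

Passing a seeded `random.Random` keeps a generated dataset reproducible.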
## Configuration
All config is via environment variables in `docker-compose.yml`. Key variables:
**Train container:**
- `NUM_SAMPLES` — number of synthetic spectra (default 50000)
- `BATCH_SIZE` — training batch size (default 32)
- `MIN_DURATION`/`MAX_DURATION` — spectrum duration range in seconds (default 30–300)
- `MEASURED_BACKGROUND_PATH` — path to measured background `.npy` for hybrid training
**Detect container:**
- `MODEL_PATH`, `ISOTOPE_INDEX_PATH`, `BACKGROUND_PATH` — file paths
- `VEGA_DEVICE` — `cpu` or `cuda`
- `THRESHOLD` — detection probability threshold (default 0.5)
- `SAMPLE_INTERVAL` — seconds between samples (default 60)
- `ENERGY_CALIBRATION_OFFSET/SLOPE` — energy calibration constants
- `CSI_NONLINEAR_ALPHA/BETA` — CsI(Tl) non-linear response correction (default 0.37/100.0)
**Web container:**
- `ENERGY_CALIBRATION_OFFSET/SLOPE` — energy calibration constants
- `CSI_NONLINEAR_ALPHA/BETA` — CsI(Tl) correction parameters (must match detect)
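As an illustration of how the detect-container variables listed above might be read, here is a small loader sketch. The variable names and documented defaults come from this file; the `DetectConfig` class, the `load_detect_config` helper, and the `cpu` fallback for `VEGA_DEVICE` are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectConfig:
    device: str
    threshold: float
    sample_interval: int
    csi_alpha: float
    csi_beta: float


def load_detect_config(env=None) -> DetectConfig:
    """Build a config from a mapping of env vars (defaults to os.environ)."""
    env = os.environ if env is None else env
    return DetectConfig(
        device=env.get("VEGA_DEVICE", "cpu"),  # fallback value is an assumption
        threshold=float(env.get("THRESHOLD", "0.5")),
        sample_interval=int(env.get("SAMPLE_INTERVAL", "60")),
        csi_alpha=float(env.get("CSI_NONLINEAR_ALPHA", "0.37")),
        csi_beta=float(env.get("CSI_NONLINEAR_BETA", "100.0")),
    )
```

Accepting a plain mapping makes the loader easy to exercise without touching the process environment, for example `load_detect_config({"THRESHOLD": "0.8"})`.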