# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Radiacode 103 is a gamma-ray spectrometer isotope identification pipeline. It captures spectra from a Radiacode 103 USB detector, subtracts background radiation, and identifies isotopes using a CNN-FCNN multi-task PyTorch model (VegaModel, 34.5M params, 82 isotopes). The project runs as Docker containers orchestrated by docker-compose.

## Architecture

Three Docker containers, each with its own Dockerfile:

- **train/**: Generates 50k synthetic spectra and trains VegaModel on GPU. The entrypoint runs generation and then training, sequentially. Code lives in `train/vega_ml/` (`synthetic_spectra`, `training/vega`).
- **detect/**: Production monitor. Connects to the Radiacode 103 via USB, samples every 60s, accumulates the spectrum, subtracts background, runs inference, and writes JSON state and daily reports. Two scripts: `radiacode_monitor.py` (main loop) and `capture_background.py` (24h background capture).
- **web/**: FastAPI dashboard on port 8080. Serves a single-page HTML/JS frontend with tabs for spectrum, background, CPS timeline, and history. Reads monitor state from JSON files written by the detect container.

Data flow: `detect` writes `monitor_state.json` + `cps_log.jsonl` + daily reports to `/data/` and `/logs/` → `web` reads them (read-only volume mounts). The `train` container reads/writes `/data/synthetic/` and `/models/`.
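Because `web` reads files that `detect` may still be appending to, a reader probably needs to tolerate a partially written last line. The following is a minimal sketch, assuming each line of `cps_log.jsonl` holds one JSON object; the helper name `read_jsonl` is hypothetical and not taken from the repository, and no field names are assumed.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return the JSON objects found in a JSON Lines file.

    Blank lines and lines that fail to parse (for example a line that the
    writer has not finished yet) are skipped instead of raising.
    """
    records = []
    file = Path(path)
    if not file.exists():
        return records  # the monitor has not written anything yet
    with file.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                # Most likely a line that is still being written; ignore it.
                continue
    return records
```

Skipping unparsable lines keeps a dashboard request from failing just because it raced with the writer; the skipped record shows up on the next read.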

### Web API Routes

- `/api/status`: monitor status (connected, CPS, staleness)
- `/api/spectrum/current`: accumulated spectrum (CsI-corrected, 1023 channels)
- `/api/spectrum/difference`: background-subtracted spectrum (CsI-corrected)
- `/api/background`, `/api/background/spectrum`, `/api/background/reference`, `/api/background/theoretical`: background data (live, 24h reference, theoretical CsI(Tl) model)
- `/api/cps/timeline`: CPS time series
- `/api/history`, `/api/history/{date}`: daily detection reports

### Key Physics Constants

Energy calibration: `E(keV) = 0.33 + 2.97 * channel_index` (env vars `ENERGY_CALIBRATION_OFFSET` and `ENERGY_CALIBRATION_SLOPE`). The detector has 1024 raw channels, but channel 1023 is an overflow bin, so only the first 1023 channels (indices 0–1022, 0.33–3036 keV) are used for display and inference. The crystal is CsI(Tl) with 8.4% FWHM at 662 keV.
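As a quick illustration of the calibration and the overflow-bin handling, here is a minimal sketch. The function and constant names are hypothetical; only the numeric values come from the text above.

```python
OFFSET_KEV = 0.33    # value of ENERGY_CALIBRATION_OFFSET quoted above
SLOPE_KEV = 2.97     # value of ENERGY_CALIBRATION_SLOPE quoted above
NUM_CHANNELS = 1023  # raw channel 1023 (overflow bin) is dropped


def channel_to_energy(channel, offset=OFFSET_KEV, slope=SLOPE_KEV):
    """Energy in keV for a channel index, using the linear calibration."""
    return offset + slope * channel


def energy_axis(num_channels=NUM_CHANNELS):
    """Energies for channel indices 0 .. num_channels - 1."""
    return [channel_to_energy(ch) for ch in range(num_channels)]


def drop_overflow(raw_counts):
    """Keep the first 1023 of the 1024 raw channels."""
    return list(raw_counts[:NUM_CHANNELS])
```

The last usable channel (index 1022) lands at about 3035.67 keV, which is where the rounded 3036 keV upper bound comes from.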

**CsI(Tl) non-linear response correction**: CsI(Tl) has a non-proportional scintillation response at low energies, so peaks appear at higher energies than their true gamma energy. The response is modeled as `E_apparent = E_true * (1 + alpha * exp(-E_true/beta))` with `alpha=0.37, beta=100`; inverting this model shifts the Am-241 peak from 71.6 keV (apparent) back to 59.5 keV (true). This correction is applied in the inference pipeline (`radiacode_monitor.py`) and the web display, NOT in training data (which uses theoretical energies). Parameters are configurable via the `CSI_NONLINEAR_ALPHA` and `CSI_NONLINEAR_BETA` env vars.
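The Am-241 numbers can be checked with a short sketch of the response model and a numerical inverse. This only maps energies; how `radiacode_monitor.py` actually remaps channel contents is not shown in this file, and the function names and the bisection approach are assumptions for illustration.

```python
import math

ALPHA = 0.37  # CSI_NONLINEAR_ALPHA default from the text
BETA = 100.0  # CSI_NONLINEAR_BETA default from the text


def apparent_energy(e_true, alpha=ALPHA, beta=BETA):
    """Forward model: where a gamma line of energy e_true shows up."""
    return e_true * (1.0 + alpha * math.exp(-e_true / beta))


def true_energy(e_apparent, alpha=ALPHA, beta=BETA, tol=1e-6):
    """Invert apparent_energy by bisection.

    Assumes 0 <= alpha < e**2 (about 7.39), for which the forward model is
    strictly increasing, so the inverse is unique.
    """
    lo, hi = 0.0, e_apparent  # e_true <= e_apparent because the factor is >= 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if apparent_energy(mid, alpha, beta) < e_apparent:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With the default parameters, `apparent_energy(59.5)` evaluates to roughly 71.6 keV, and `true_energy` maps that value back to 59.5 keV. At 662 keV the exponential term is already below 0.1%, so high-energy lines are essentially unaffected.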

## Commands

```bash
# Build all images
docker compose build

# Train model (GPU required, ~30 min on RTX 5060 Ti)
docker compose run --rm train

# Capture 24h background (leave running, no radioactive source nearby)
docker compose run --rm -d --name radiacode-bg detect python capture_background.py

# Start continuous detection monitor
docker compose up detect

# Start web dashboard
docker compose up web

# Run both detect and web
docker compose up detect web

# Test detection manually (inside detect container)
# The volume argument is quoted so paths containing spaces still work
docker compose run --rm -v "$(pwd)/test_detection.py:/app/test_detection.py" detect python /app/test_detection.py
```

No test suite exists in this project. No linter is configured.

## VegaModel

Defined in `train/vega_ml/training/vega/model.py`. Input: 1D spectrum (1023 channels). Output: classification logits (82 isotopes; apply sigmoid for probabilities) + activity predictions (Bq, scaled by `max_activity_bq=1000`). Loss: `VegaLoss = BCE(logits) + 0.1 * Huber(activities * mask)`, where the mask restricts the regression term to isotopes that are present.
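To make the loss arithmetic concrete, here is a plain-Python sketch rather than the PyTorch code in `model.py`. The mean reduction, the Huber `delta=1.0`, and all function names are assumptions; activities are taken to be already scaled by `max_activity_bq`.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for a single logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def huber(error, delta=1.0):
    """Huber loss: quadratic near zero, linear for |error| > delta."""
    a = abs(error)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def vega_loss(logits, pred_activity, present, true_activity, weight=0.1):
    """BCE on presence logits plus a masked, down-weighted Huber term.

    `present` holds 1.0 for isotopes in the sample and 0.0 otherwise; it is
    both the classification target and the regression mask.
    """
    n = len(logits)
    cls = sum(bce_with_logits(z, y) for z, y in zip(logits, present)) / n
    reg = sum(
        huber((p - t) * m)
        for p, t, m in zip(pred_activity, true_activity, present)
    ) / n
    return cls + weight * reg
```

Because of the mask, an arbitrary activity prediction for an absent isotope contributes nothing to the regression term; only the classification term penalizes it.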

**Inference pipeline** (in `radiacode_monitor.py::run_inference`):

1. Subtract background from accumulated spectrum → net_rate
2. Apply CsI(Tl) non-linear correction: `correct_csi_nonlinear(net_rate)`, which remaps channels so peaks appear at theoretical energies
3. Normalize with log1p: `log1p(corrected) / max(log1p(corrected))`
4. Feed to VegaModel → sigmoid → filter by threshold
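Steps 1, 3, and 4 above can be sketched in plain Python as follows. Step 2 and the model call are left out because their implementations are not shown here. The function names are hypothetical, both inputs to `net_rate` are assumed to be per-channel count rates, and clipping negative values to zero before `log1p` is an assumption, not something this file specifies.

```python
import math


def net_rate(spectrum_rate, background_rate):
    """Step 1: per-channel background subtraction (both inputs in counts/s)."""
    return [s - b for s, b in zip(spectrum_rate, background_rate)]


def normalize_log1p(values):
    """Step 3: log1p, then divide by the maximum.

    ASSUMPTION: negative net rates are clipped to 0 so log1p stays defined.
    An all-zero spectrum is returned unchanged to avoid dividing by zero.
    """
    logged = [math.log1p(max(v, 0.0)) for v in values]
    peak = max(logged)
    return [x / peak for x in logged] if peak > 0 else logged


def detections(logits, isotope_names, threshold=0.5):
    """Step 4: sigmoid probabilities, keeping those at or above threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return {name: p for name, p in zip(isotope_names, probs) if p >= threshold}
```

With the default threshold of 0.5, an isotope is reported exactly when its logit is non-negative.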

The model checkpoint (`models/vega_best.pt`) stores `model_config` and `model_state_dict`. At inference, the detect container dynamically imports `VegaModel` and `IsotopeIndex` from the mounted `vega_ml` volume.

## Synthetic Spectrum Generation

### Detector Physics Model

Training spectra include realistic CsI(Tl) detector effects:

- **Energy calibration**: `E = 0.33 + 2.97 * ch` with 1023 channels (matching real detector)
- **K-escape peaks**: Iodine K-shell X-ray escape at `E - 28.5 keV` with energy-dependent escape fraction (up to 35% at low energies). Implemented in `spectrum_physics.py::_k_escape_fraction()`
- **Asymmetric peaks**: Low-energy tail for peaks below 200 keV (15% tail fraction at 0 keV, 0% above 200 keV). Implemented in `spectrum_physics.py::_asymmetric_peak()`
- **FWHM**: Energy-dependent resolution `FWHM(E) = 0.084 * 662 * sqrt(E/662)` keV (8.4% at 662 keV)
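The resolution and escape-peak formulas in this list are simple enough to sketch directly. The function names are hypothetical, and the linear falloff of the tail fraction between 0 and 200 keV is an assumption: the text gives only the two endpoints, and the actual shape lives in `spectrum_physics.py`.

```python
import math

FWHM_REL_662 = 0.084       # 8.4% relative FWHM at 662 keV
K_ESCAPE_SHIFT_KEV = 28.5  # iodine K X-ray escape energy used above


def fwhm_kev(energy_kev):
    """Absolute FWHM in keV, scaling with sqrt(E) from the 662 keV value."""
    return FWHM_REL_662 * 662.0 * math.sqrt(energy_kev / 662.0)


def sigma_kev(energy_kev):
    """Gaussian sigma from FWHM, using FWHM = 2 * sqrt(2 * ln 2) * sigma."""
    return fwhm_kev(energy_kev) / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def k_escape_energy(energy_kev):
    """Position of the K-escape peak belonging to a full-energy peak."""
    return energy_kev - K_ESCAPE_SHIFT_KEV


def tail_fraction(energy_kev, max_fraction=0.15, cutoff_kev=200.0):
    """ASSUMPTION: linear falloff from 15% at 0 keV to 0% at 200 keV."""
    return max_fraction * max(0.0, 1.0 - energy_kev / cutoff_kev)
```

At 662 keV this gives an FWHM of about 55.6 keV. For the 59.5 keV Am-241 line, the escape peak would sit at 31.0 keV.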

### Background Model

The training background uses a realistic CsI(Tl) continuum shape:

- **Continuum**: Asymmetric hump at ~110 keV (sigma_left=55, sigma_right=50 keV) + Compton tail + noise floor. Calibrated against real Radiacode 103 measurements.
- **Isotope peaks**: K-40, Pb-214, Bi-214, Ac-228, Pb-212, Tl-208, with stochastic activity variation per sample.
- **Hybrid training**: If `MEASURED_BACKGROUND_PATH` points to a valid `.npy` file, a mix of 70% measured + 30% synthetic continuum is used.
- **Background subtraction mode**: 10% of training samples are background-subtracted to simulate the inference pipeline.
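A rough sketch of the hump and the hybrid mix is shown below. It covers only the asymmetric hump, not the Compton tail or noise floor, and it reads "70% measured + 30% synthetic" as a per-channel weighted sum, which is an assumption about the generator. The function names are hypothetical.

```python
import math


def continuum_hump(energy_kev, center=110.0, sigma_left=55.0, sigma_right=50.0):
    """Asymmetric Gaussian hump with peak value 1.0 at `center`.

    A different sigma is used on each side of the peak.
    """
    sigma = sigma_left if energy_kev < center else sigma_right
    return math.exp(-0.5 * ((energy_kev - center) / sigma) ** 2)


def blend_background(measured, synthetic, measured_weight=0.7):
    """ASSUMPTION: per-channel weighted sum of measured and synthetic shapes."""
    w = measured_weight
    return [w * m + (1.0 - w) * s for m, s in zip(measured, synthetic)]
```

Both inputs to `blend_background` would need to be on the same 1023-channel grid and in the same units (for example counts per second) for the weighted sum to be meaningful.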

### Training Data Augmentation

- **Normalization**: log1p (replaces max normalization for better weak-signal detection)
- **Low-signal samples**: 15% of samples use 0.01–5 Bq activities with 30–300s durations
- **Duration range**: 30–300 seconds (covers short accumulations to long measurements)
- **Activity range**: 0.01–100 Bq (covers weak to strong sources)
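The sampling ranges above could be drawn as in the sketch below. The ranges and the 15% low-signal share come from this list. The log-uniform distribution for activity and the uniform distribution for duration are assumptions, since the file does not say how values are distributed within each range.

```python
import math
import random


def log_uniform(rng, low, high):
    """Sample so that log(value) is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_activity_and_duration(rng, low_signal_fraction=0.15):
    """Draw one (activity in Bq, duration in s) pair.

    15% of draws are low-signal (0.01 to 5 Bq); the rest use 0.01 to 100 Bq.
    Durations span 30 to 300 s in both cases.
    """
    max_activity = 5.0 if rng.random() < low_signal_fraction else 100.0
    activity_bq = log_uniform(rng, 0.01, max_activity)
    duration_s = rng.uniform(30.0, 300.0)
    return activity_bq, duration_s
```

Passing in a seeded `random.Random` instance makes a generated dataset reproducible.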

## Configuration

All config is via environment variables in `docker-compose.yml`. Key variables:

**Train container:**
- `NUM_SAMPLES`: number of synthetic spectra (default 50000)
- `BATCH_SIZE`: training batch size (default 32)
- `MIN_DURATION`/`MAX_DURATION`: spectrum duration range in seconds (default 30–300)
- `MEASURED_BACKGROUND_PATH`: path to measured background `.npy` for hybrid training

**Detect container:**
- `MODEL_PATH`, `ISOTOPE_INDEX_PATH`, `BACKGROUND_PATH`: file paths
- `VEGA_DEVICE`: `cpu` or `cuda`
- `THRESHOLD`: detection probability threshold (default 0.5)
- `SAMPLE_INTERVAL`: seconds between samples (default 60)
- `ENERGY_CALIBRATION_OFFSET`/`ENERGY_CALIBRATION_SLOPE`: energy calibration constants
- `CSI_NONLINEAR_ALPHA`/`CSI_NONLINEAR_BETA`: CsI(Tl) non-linear response correction (default 0.37/100.0)

**Web container:**
- `ENERGY_CALIBRATION_OFFSET`/`ENERGY_CALIBRATION_SLOPE`: energy calibration constants
- `CSI_NONLINEAR_ALPHA`/`CSI_NONLINEAR_BETA`: CsI(Tl) correction parameters (must match detect)
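
For reference, reading these variables with fallbacks might look like the sketch below. The helper names and dictionary keys are hypothetical. The calibration defaults (0.33, 2.97) and the `cpu` device default are assumptions based on the values quoted elsewhere in this file, not confirmed defaults in the code.

```python
import os


def env_float(name, default, environ=os.environ):
    """Read a float from the environment, falling back when unset or empty."""
    raw = environ.get(name)
    return float(raw) if raw not in (None, "") else default


def load_detect_config(environ=os.environ):
    """Collect the detect-container settings listed above into one dict."""
    return {
        "threshold": env_float("THRESHOLD", 0.5, environ),
        "sample_interval_s": env_float("SAMPLE_INTERVAL", 60.0, environ),
        # ASSUMPTION: defaults taken from the calibration quoted in this file
        "calibration_offset": env_float("ENERGY_CALIBRATION_OFFSET", 0.33, environ),
        "calibration_slope": env_float("ENERGY_CALIBRATION_SLOPE", 2.97, environ),
        "csi_alpha": env_float("CSI_NONLINEAR_ALPHA", 0.37, environ),
        "csi_beta": env_float("CSI_NONLINEAR_BETA", 100.0, environ),
        # ASSUMPTION: CPU is used when VEGA_DEVICE is not set
        "device": environ.get("VEGA_DEVICE", "cpu"),
    }
```

Since the web container must use the same calibration and CsI parameters as detect, sharing one loader like this between the two would be one way to keep them from drifting apart.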