Fix: CsI(Tl) non-linear response correction + detector calibration overhaul

Root cause of Am-241 misidentification: the Radiacode 103's CsI(Tl) crystal
shifts low-energy peaks upward (59.5 keV → 71.6 keV for Am-241) due to
non-proportional scintillation response. The model was trained on theoretical
peak positions and couldn't match the shifted real peaks.

Changes:
- Add inverse CsI(Tl) non-linear correction to inference pipeline
  (radiacode_monitor.py, web/config.py, test_detection.py)
  Forward model (energies in keV):
  E_apparent = E_true * (1 + 0.37 * exp(-E_true/100))
  The pipeline inverts it and remaps channels so peaks appear at
  their theoretical energies
- Fix energy calibration: DetectorConfig now uses E = 0.33 + 2.97*ch
  with 1023 channels, matching the real detector (was energy_min=20,
  skip_first_channel=True, different channel width)
- Add K-escape peaks for CsI(Tl) iodine X-ray escape (E - 28.5 keV)
- Add asymmetric peak shapes for low-energy tails (< 200 keV)
- Add log1p normalization in dataset and inference (replaces max-norm)
- Add background-subtracted training mode (subtract_background flag)
- Add low-signal augmentation (0.01-5 Bq activities, 30-300s durations)
- Update docker-compose.yml: batch_size=32, duration=30-300s,
  CSI_NONLINEAR_ALPHA/BETA env vars for detect and web
- Web dashboard: apply CsI correction to displayed spectra
- Various UI fixes (Chart.js width, zoom/pan, isotope lines)
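
A minimal sketch of the correction and calibration listed above, not the code from this commit. The function names are hypothetical, mapping 0.37 and 100 onto CSI_NONLINEAR_ALPHA/BETA is an assumption, and bisection is just one simple way to invert the forward model:

```python
import math

# Values quoted in the commit message. Treating them as the
# CSI_NONLINEAR_ALPHA / CSI_NONLINEAR_BETA settings is an assumption.
ALPHA = 0.37
BETA_KEV = 100.0


def channel_to_energy(ch: float) -> float:
    """Linear calibration from the commit message: E = 0.33 + 2.97*ch (keV)."""
    return 0.33 + 2.97 * ch


def apparent_energy(e_true: float, alpha: float = ALPHA, beta: float = BETA_KEV) -> float:
    """Forward model: where a peak of true energy e_true shows up (keV)."""
    return e_true * (1.0 + alpha * math.exp(-e_true / beta))


def true_energy(e_apparent: float, alpha: float = ALPHA, beta: float = BETA_KEV,
                iterations: int = 60) -> float:
    """Invert the forward model by bisection (hypothetical helper).

    For these parameter values the forward model is strictly increasing
    and never below e_true, so the root lies in [0, e_apparent].
    """
    if e_apparent <= 0:
        return 0.0
    lo, hi = 0.0, e_apparent
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if apparent_energy(mid, alpha, beta) < e_apparent:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With these numbers, channel 24 sits at about 71.6 keV apparent, and true_energy(channel_to_energy(24)) comes back close to the 59.5 keV Am-241 line.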

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Jacquin Antoine
2026-05-21 17:35:22 +02:00
parent 3b4446b181
commit 0847a3fc80
21 changed files with 913 additions and 278 deletions


@@ -31,24 +31,38 @@ class SpectrumSample:
    detector: str


def normalize_log1p(spectrum: np.ndarray) -> np.ndarray:
    """Log1p normalization: log(1 + x) / max(log(1 + x)).

    Preserves relative signal levels across channels, works well when
    many channels are zero (e.g. after background subtraction).
    """
    # Clip negative counts to zero before taking the log
    log_spec = np.log1p(np.maximum(spectrum, 0))
    max_val = log_spec.max()
    if max_val > 0:
        return log_spec / max_val
    return log_spec


class SpectrumDataset(Dataset):
    """
    PyTorch Dataset for synthetic gamma spectra.

    Loads spectra from numpy files and their labels from JSON files.
    Supports both individual JSON files per sample (efficient for large datasets)
    and combined labels.json (legacy format).
    Converts to tensors suitable for the Vega model.
    """

    def __init__(
        self,
        data_dir: Path,
        isotope_index: Optional[IsotopeIndex] = None,
        max_activity_bq: float = 1000.0,
        collapse_time: bool = True,
        transform=None,
        normalization: str = "log1p"
    ):
        """
        Initialize the dataset.
@@ -66,6 +80,7 @@ class SpectrumDataset(Dataset):
        self.max_activity_bq = max_activity_bq
        self.collapse_time = collapse_time
        self.transform = transform
        self.normalization = normalization

        # Detect label format and load sample list
        self.use_individual_labels = self._detect_label_format()
@@ -156,7 +171,15 @@ class SpectrumDataset(Dataset):
        if self.collapse_time and spectrum.ndim == 2:
            # Average across time intervals to get single spectrum
            spectrum = spectrum.mean(axis=0)

        # Normalize spectrum
        if self.normalization == "log1p":
            spectrum = normalize_log1p(spectrum)
        elif self.normalization == "max":
            max_val = spectrum.max()
            if max_val > 0:
                spectrum = spectrum / max_val

        # Convert to tensor
        spectrum_tensor = torch.tensor(spectrum, dtype=torch.float32)
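
To illustrate why the commit swaps max-norm for log1p, here is a small standalone comparison. It reuses the logic of normalize_log1p from the diff. normalize_max is a hypothetical helper mirroring the "max" branch, and the counts are made up for the example:

```python
import numpy as np


def normalize_log1p(spectrum: np.ndarray) -> np.ndarray:
    """Same logic as the function in the diff: log(1 + x), scaled to [0, 1]."""
    log_spec = np.log1p(np.maximum(spectrum, 0))
    max_val = log_spec.max()
    return log_spec / max_val if max_val > 0 else log_spec


def normalize_max(spectrum: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring the "max" branch in the dataset."""
    max_val = spectrum.max()
    return spectrum / max_val if max_val > 0 else spectrum


# Made-up counts: empty channel, two weak channels, one dominant peak.
counts = np.array([0.0, 1.0, 10.0, 1000.0])

by_max = normalize_max(counts)    # weak channels collapse toward zero
by_log = normalize_log1p(counts)  # weak channels keep a visible share

print("max-norm:", by_max)
print("log1p   :", by_log)
```

In this toy spectrum the 1-count channel ends up at 0.001 under max-norm but at roughly 0.10 under log1p, so weak peaks next to a strong one remain visible to the model.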