
Jacquin Antoine d334892880 Improve visualizations: adaptive scales, revert z-score to std normalization

- MSRM/TPI/roughness/anomalies: revert z-score (x-mean)/std to std normalization x/std
  to preserve contrast and visibility of linear features (paths, ditches, trenches)
- MSRM: adaptive scales based on resolution, archaeological weight combination
- TPI: extend from 2 to 4 scales (3m/15m/50m/200m) with weighted combination
- Hillshade: 8 directions instead of 4, altitude 35° instead of 30°
- LRM: adaptive sigma based on resolution
- Openness: doubled radius (100m instead of 50m)
- Roughness: multi-scale (3m fine + 15m broad) instead of single 5x5 window
- Anomalies: uses MSRM multi-scale relief instead of single LRM 15m
- Wavelet: 8 adaptive scales, std normalization, archaeological weights
- Remove svf (Sky-View Factor) and local_dominance visualizations
- Add AVIF format support (default), quality 98
- Add multi-resolution support (-r 0.5,0.2)
- Improve Ctrl+C handling for immediate process termination
- Update rendering.py descriptions for all modified visualizations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-05-14 23:12:08 +02:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

LiDAR archaeological processing pipeline that generates 16 terrain visualizations from LAZ/LAS point clouds. Runs in Docker with optional NVIDIA GPU acceleration (CuPy). Designed for French LiDAR HD data in Lambert 93 (EPSG:2154).

Commands

All commands run inside Docker. Use ./run.sh as the primary interface.

./run.sh -g                                    # Standard run with GPU
./run.sh -g -w 4                               # GPU + 4 parallel workers
./run.sh -g -r 0.2                             # High resolution (0.2m/px)
./run.sh -g -r 0.5,0.2                         # Multi-resolution (0.5m + 0.2m)
./run.sh --test                                 # Run unit tests
./run.sh -g --file LHD_FXX_1000_6882_PTS_LAMB93_IGN69.copc  # Single file
./run.sh --ground-classification csf            # Force CSF ground classification (complex terrain)
./run.sh -g --keep-tif                          # Keep TIFF files (allows image regeneration without recomputing the DTM)
./run.sh -g --only hillshade lrm                # Only generate specific visualizations
./run.sh -g --skip ortho topo                   # Exclude specific visualizations
./run.sh -g --quality 90                       # Output quality 90 (default: 98 for AVIF)
./run.sh -g --lossless                          # Lossless WebP compression
./run.sh                                        # Print help (no args)

Direct Docker:

docker build -t lidar-lidar .
docker run --rm --gpus all -v $(pwd)/input:/data/input:ro -v $(pwd)/output:/data/output lidar-lidar

Architecture

Module responsibilities

cli.py — argparse + logging setup. Entry point via python -m lidar_pipeline.
pipeline.py — LidarArchaeoPipeline orchestrator. VIZ_STEPS registry maps names to generate functions. FilePrefixFilter for parallel logging. Creates SharedDEM once per file and passes it to all visualizations. Multi-resolution support: self.resolutions list, _res_suffix() for naming, generate_all_visualizations() accepts vis_dir override.
dtm.py — PDAL ground classification (SMRF/CSF + auto-detection) and DTM generation via scipy binned_statistic_2d. create_dtm_fast() accepts output_suffix for multi-resolution DTM naming.
visualizations.py — 13 generate_* functions + 2 IGN overlay lambdas. All take (dem_file, basename, vis_dir, resolution, shared=None) and return a TIF path or None. SharedDEM class pre-computes gradient, NaN mask, LRM to avoid redundant I/O and computation. Lazy evaluation: properties computed on first access.
gpu.py — CuPy/numpy abstraction: HAS_GPU, to_gpu(), to_cpu(), xp_gaussian_filter(), xp_uniform_filter(), xp_minimum_filter(), gpu_cleanup(). Falls back to CPU gracefully.
ign.py — IGN WMTS tile download + overlay generation for orthophoto and topographic maps.
rendering.py — COLORMAPS dict maps filename keywords to (cmap, title, legend, description). tif_to_png() converts TIF→WebP with legend/scale/north arrow. Quality parameter controls WebP compression (default 85).
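
The CPU fallback described for gpu.py can be pictured with a minimal sketch like the one below. It is not the repository's code: the bodies of to_gpu(), to_cpu() and gpu_cleanup() are assumptions built only from the names listed above, and slope_degrees() is a hypothetical caller added to show the to_gpu, compute, to_cpu, gpu_cleanup sequence.

```python
import numpy as np

try:
    import cupy as cp  # optional dependency
    HAS_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy missing, or no usable CUDA device
    cp = None
    HAS_GPU = False


def to_gpu(arr):
    """Move an array to the GPU when one is available, else keep it in NumPy."""
    return cp.asarray(arr) if HAS_GPU else np.asarray(arr)


def to_cpu(arr):
    """Return a NumPy array regardless of where the input lives."""
    return cp.asnumpy(arr) if HAS_GPU else np.asarray(arr)


def gpu_cleanup():
    """Release cached GPU memory between visualizations (no-op on CPU)."""
    if HAS_GPU:
        cp.get_default_memory_pool().free_all_blocks()


def slope_degrees(dem, resolution):
    """Hypothetical caller: slope in degrees, computed on GPU or CPU."""
    xp = cp if HAS_GPU else np          # same array API on both backends
    arr = to_gpu(dem)
    dy, dx = xp.gradient(arr, resolution)
    slope = xp.degrees(xp.arctan(xp.hypot(dx, dy)))
    result = to_cpu(slope)
    gpu_cleanup()
    return result
```

Selecting the array module once (xp) keeps the numerical code identical on both paths, which is presumably why the real module exposes xp_* filter wrappers.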

SharedDEM optimization

SharedDEM pre-computes once per file:

DEM data (single I/O read)
NaN mask + filled DEM (single _fill_nans call, avoiding ~20 redundant calls)
Gradient components (dy, dx, slope, aspect) shared by hillshade, slope, aspect, curvature
LRM at 15m kernel (shared by lrm + anomalies)

_filter_nanaware_from_filled() applies filters on the pre-filled DEM, skipping the expensive _fill_nans interpolation.
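
As an illustration of the lazy, compute-once behaviour, the following sketch uses functools.cached_property. SharedDEMSketch is a hypothetical stand-in, not the real class: the attribute names are assumptions, and the NaN fill is a crude placeholder (mean of valid cells) where the real _fill_nans interpolates.

```python
from functools import cached_property

import numpy as np


class SharedDEMSketch:
    """Hypothetical stand-in for SharedDEM: each derived array is built once."""

    def __init__(self, dem, resolution):
        self.dem = np.asarray(dem, dtype=np.float32)
        self.resolution = resolution

    @cached_property
    def nan_mask(self):
        return np.isnan(self.dem)

    @cached_property
    def filled(self):
        # Placeholder fill (assumption): mean of the valid cells
        out = self.dem.copy()
        if self.nan_mask.any() and not self.nan_mask.all():
            out[self.nan_mask] = np.nanmean(self.dem)
        return out

    @cached_property
    def gradient(self):
        # np.gradient returns the axis-0 (dy) then axis-1 (dx) derivative
        dy, dx = np.gradient(self.filled, self.resolution)
        return dy, dx

    @cached_property
    def slope(self):
        dy, dx = self.gradient
        return np.degrees(np.arctan(np.hypot(dx, dy)))
```

The first access to shared.slope triggers filled and gradient; later visualizations that read shared.gradient get the cached tuple without recomputation.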

Adding a visualization

Three places must be updated:

visualizations.py — add generate_X(dem_file, basename, vis_dir, resolution, shared=None) function
pipeline.py VIZ_STEPS — add ('name', generate_X) entry
rendering.py COLORMAPS — add entry keyed by the output filename keyword
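
The three touch points above might look roughly like this. Everything named "example" is hypothetical, VIZ_STEPS and COLORMAPS are local stand-ins for the real registries (assumed here to be a list of tuples and a dict), and the raster computation is left out.

```python
import os


def generate_example(dem_file, basename, vis_dir, resolution, shared=None):
    """Hypothetical visualization following the documented signature."""
    if not os.path.exists(dem_file):
        return None  # documented contract: a TIF path, or None
    out_path = os.path.join(vis_dir, f"{basename}_example.tif")
    # ... compute the raster (reusing `shared` when provided) and write out_path ...
    return out_path


# pipeline.py (stand-in): name -> generate function
VIZ_STEPS = [("example", generate_example)]

# rendering.py (stand-in): filename keyword -> (cmap, title, legend, description)
COLORMAPS = {
    "example": ("viridis", "Example", "Relative value", "Illustrative entry only"),
}
```

The COLORMAPS key has to match the keyword in the output filename ("example" in "_example.tif"), since the lookup is described as keyword-based.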

Ground classification

Auto-detection in dtm.py detect_ground_method():

Single-return ratio > 0.6 → CSF (urban terrain, cloth simulation)
Height std > 30m → CSF (complex/mountainous terrain)
Default → SMRF (natural terrain)

Override with --ground-classification {auto,smrf,csf}.
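
The decision rule can be written down as a small function. This is a sketch of the thresholds listed above only; choose_ground_method() and its two parameters are hypothetical, and how the real detect_ground_method() derives these statistics from the point cloud is not shown here.

```python
def choose_ground_method(single_return_ratio, height_std_m):
    """Apply the documented thresholds in order and return 'csf' or 'smrf'."""
    if single_return_ratio > 0.6:
        return "csf"   # mostly single returns: urban terrain
    if height_std_m > 30.0:
        return "csf"   # large height spread: complex/mountainous terrain
    return "smrf"      # default: natural terrain


print(choose_ground_method(0.72, 4.0))   # csf
print(choose_ground_method(0.35, 42.0))  # csf
print(choose_ground_method(0.35, 8.0))   # smrf
```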

NaN handling

Small DTM gaps (less than 1 m from existing data) are filled using rasterio.fill.fillnodata. Large gaps remain NaN. SharedDEM fills NaN once; _filter_nanaware_from_filled() applies filters on the pre-filled array and then restores the NaN mask.
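
The "filter on the filled array, then restore the mask" step can be sketched with NumPy alone. Both function names below are hypothetical, and the 3x3 box mean merely stands in for whatever filter a visualization applies.

```python
import numpy as np


def box_mean_3x3(arr):
    """3x3 mean filter with edge replication (NumPy only)."""
    padded = np.pad(arr, 1, mode="edge")
    rows, cols = arr.shape
    total = np.zeros_like(arr, dtype=np.float64)
    for di in range(3):
        for dj in range(3):
            total += padded[di:di + rows, dj:dj + cols]
    return total / 9.0


def filter_from_filled(filled, nan_mask, filter_func):
    """Filter the pre-filled DEM, then put NaN back where data was missing."""
    out = filter_func(filled).astype(np.float32)
    out[nan_mask] = np.nan
    return out


dem = np.full((5, 5), 10.0)
dem[2, 2] = np.nan
mask = np.isnan(dem)
filled = np.where(mask, 10.0, dem)      # stand-in for the one-time fill
smoothed = filter_from_filled(filled, mask, box_mean_3x3)
```

Because the filter never sees NaN, neighbouring cells are not contaminated, and the restored mask keeps the gap visible in the output.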

Flow accumulation

Sink filling uses the priority-flood algorithm (Wang & Liu 2006), which runs in O(n log n) and replaces the iterative minimum_filter approach. D8 accumulation is JIT-compiled with numba and falls back to pure Python if numba is unavailable.
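
For reference, a compact priority-flood sink fill looks like the sketch below. It is a textbook version using heapq, not the repository's implementation, and it assumes a DEM without NaN cells and 8-connected neighbours.

```python
import heapq

import numpy as np

NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def priority_flood_fill(dem):
    """Raise every sink to its spill level; cells outside sinks are unchanged."""
    dem = np.asarray(dem, dtype=np.float64)
    rows, cols = dem.shape
    filled = dem.copy()
    visited = np.zeros(dem.shape, dtype=bool)
    heap = []
    # Seed the queue with the border cells: water can always leave there
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r, c], r, c))
                visited[r, c] = True
    # Grow inward from the lowest known cell; each cell is pushed once
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not visited[nr, nc]:
                visited[nr, nc] = True
                filled[nr, nc] = max(filled[nr, nc], level)
                heapq.heappush(heap, (filled[nr, nc], nr, nc))
    return filled
```

Each cell enters the heap exactly once, which is where the O(n log n) bound comes from.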

Multi-resolution

-r 0.5,0.2 processes each tile at both 0.5m and 0.2m. Ground classification is shared (done once per tile). Each resolution gets its own DTM (_dtm.tif / _dtm_r0p2.tif) and visualization subdirectory (basename/ / basename_r0p2/).
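
The naming scheme can be illustrated as follows. parse_resolutions() and res_suffix() are hypothetical helpers; in particular, treating 0.5 m as the unsuffixed base resolution is an assumption inferred from the _dtm.tif / _dtm_r0p2.tif example, and the real _res_suffix() may decide this differently.

```python
import math


def parse_resolutions(arg):
    """Turn a '-r' value such as '0.5,0.2' into a list of floats."""
    return [float(tok) for tok in arg.split(",") if tok.strip()]


def res_suffix(resolution, base_resolution=0.5):
    """'' for the base resolution, otherwise '_r' + value with '.' -> 'p'."""
    if math.isclose(resolution, base_resolution):
        return ""
    return "_r" + f"{resolution:g}".replace(".", "p")


for res in parse_resolutions("0.5,0.2"):
    suffix = res_suffix(res)
    print(f"tile_dtm{suffix}.tif", f"tile{suffix}/")
```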

Parallel processing

Uses ProcessPoolExecutor with 'spawn' start method (required for CUDA). Each worker gets its own temp directory (temp_{basename}). _process_file_standalone() configures its own logger with _file_filter for per-file log prefixes.
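
A minimal sketch of that setup follows. process_file() is a hypothetical stand-in for _process_file_standalone(), and placing temp_{basename} under the system temp directory is an assumption made so the example runs anywhere.

```python
import multiprocessing as mp
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor


def process_file(basename):
    """Hypothetical worker: create a private temp directory and report it."""
    temp_dir = os.path.join(tempfile.gettempdir(), f"temp_{basename}")
    os.makedirs(temp_dir, exist_ok=True)
    # ... per-file logger setup, DTM generation, visualizations ...
    return basename, temp_dir


def run_parallel(basenames, workers=4):
    # 'spawn' starts fresh interpreters; a forked child cannot safely
    # reuse a CUDA context inherited from its parent
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
        return list(pool.map(process_file, basenames))


if __name__ == "__main__":  # required with 'spawn': workers re-import this module
    for name, temp_dir in run_parallel(["tile_a", "tile_b"], workers=2):
        print(name, temp_dir)
```

With 'spawn', the worker function must be defined at module level so it can be pickled by reference, which is consistent with the worker being a standalone function rather than a method.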

Key conventions

Language: UI messages and comments in French. Code identifiers in English.
Logging: Use logger = logging.getLogger("lidar"). Prefix per-file logs via _file_filter.basename.
GPU pattern: arr_gpu = to_gpu(arr) → compute → result = to_cpu(arr_gpu) → gpu_cleanup() between visualizations.
Output format: Visualizations saved as AVIF (quality 98 by default, best quality/size ratio). Use --format webp for WebP output. TIFF intermediates deleted by default. Use --keep-tif to keep DTM+TIF for regeneration with --force. No PDF reports, no COGs or viewer.
Compression: TIF intermediates use deflate compression (faster than LZW for float32 data).
Tests: Run only inside Docker via ./run.sh --test. Synthetic DEM fixture in tests/conftest.py.
