deploy_views.sql (v13 → v14):
- CRITICAL: ml_detected_anomalies ORDER BY (src_ip) → (src_ip, ja4, host, model_name)
ReplacingMergeTree was collapsing all detections to 1 row per IP on merge
- Add PARTITION BY toDate + ttl_only_drop_parts on all 4 data tables
- ml_all_scores TTL 3d → 7d; ml_detected_anomalies TTL 30d → 7d
- agg_host_ip_ja4_1h + agg_header_fingerprint_1h: add partition + TTL 7d
- view_ip_recurrence: add WHERE detected_at >= now() - 7 DAY (was full scan)
- Remove dead views: summary/timeseries/threat_dist/variability
- Add view_dashboard_entities (fixes HTTP 500 in clustering/incidents/fingerprints)
- Add view_dashboard_user_agents (fixes HTTP 500 in fingerprints/metrics)
- Add view_ai_features_24h (enables ENABLE_MULTIWINDOW in bot_detector)
- Mark max_requests_per_sec as DEPRECATED (always 0)
New files:
- correlator/sql/migrations/01_ttl_adjustments.sql: ALTER TABLE migration
- tests/integration/verify_mvs.py: MV pipeline verification assertions
- docs/THESIS_HTTP_Traffic_Detection.md: detection techniques thesis
All DB references use ja4_processing/ja4_logs (no mabase_prod).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Couvre les 7 étapes de mise en place :
1. Installation ClickHouse
2. Déploiement du schéma (deploy_schema.sh + migrations)
3. Configuration des utilisateurs (data_writer, analyst, bot_writer)
4. Fichiers CSV externes pour les dictionnaires
5. Installation des services Go via RPM (sentinel, correlator, mod-reqin-log)
6. Installation des services Python via Docker (bot-detector, dashboard)
7. Vérification bout-en-bout (ingestion, agrégation, ML, dashboard)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>