Commit Graph

2 Commits

Author SHA1 Message Date
14db3d9040 refactor: suppression dépendance User-Agent de la détection navigateur
Changements SQL :
- modern_browser_score : sec-ch-ua→100, Sec-Fetch→70 (plus de UA fallback)
- Ajout has_sec_ch_ua (UInt8) dans agg_header_fingerprint_1h et ml_all_scores
- mss_mobile_mismatch utilise has_sec_ch_ua au lieu de modern_browser_score
- header_order_confidence : PARTITION BY ja4 au lieu de first_ua
- sec_ch_mobile_mismatch : comparaison Client Hints interne (sans UA)
- Migration 03_remove_ua_browser_detection.sql

Changements Python :
- browser.py Axe 3 : Client Hints + Sec-Fetch + is_fake_navigation (PAS de UA)
- Pondération axes : ja4_known 0.30, tls_coherence 0.20 (signaux TLS renforcés)
- preprocessing.py : has_sec_ch_ua ajouté aux features et binary_features

Fichiers modifiés : 8 SQL/Python + 1 migration, 36/36 tests passent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-09 23:06:01 +02:00
1f103392ac refactor(bot-detector): extract monolith into modular package
Split bot_detector.py (~1982 lines) into 10 focused modules:
- config.py: all configuration constants and optional imports
- log.py: logging utilities (log_info, log_decision, append_training_history)
- infra.py: ClickHouse client, health check HTTP server, shutdown
- browser.py: multifactorial browser identification (5 axes)
- scoring.py: drift detection, feature validation, SHAP, clustering
- models.py: EIF, Autoencoder, XGBoost model management
- preprocessing.py: data preprocessing and feature list definitions
- pipeline.py: core semi-supervised scoring loop
- cycle.py: main analysis cycle orchestration
- __main__.py: entry point with startup banner

Update Dockerfile to copy package directory and use python -m bot_detector.

All 36 existing tests pass unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-09 01:02:04 +02:00