feat: LEGITIMATE_BROWSER classification from JA4 + behavioral consistency
Add browser legitimacy classification (A9) to the bot detection pipeline:

- New features: is_known_browser (binary) and browser_consistency_score
  [0..5] combining 5 signals: JA4 browser match, modern_browser_score,
  Accept-Language, cookies, Sec-Fetch-* presence
- Post-scoring: sessions with known browser JA4 + consistency >= 4/5 +
  NORMAL/LOW threat level are reclassified as LEGITIMATE_BROWSER
- Spoofing detection: inconsistent behavior (known JA4 but low consistency)
  stays in normal anomaly scoring, which prevents evasion via JA4 spoofing
- XGBoost treats LEGITIMATE_BROWSER as non-threat (negative label)
- ClickHouse: browser_family column added to ml_detected_anomalies and
  ml_all_scores
- Dashboard: browser_family filter/sort on detections and scores endpoints,
  legitimate_browsers count and browser_stats in overview
- 6 new unit tests covering classification threshold, spoofing, exclusion
  logic

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
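The post-scoring rule described in the message can be sketched as follows. This is a minimal illustration, not the code from this commit: the `Session` field names are hypothetical, each of the five signals is assumed to be already reduced to a boolean (including `modern_browser_score`, which the pipeline may compute differently), and `is_known_browser` is assumed to be the same signal as the JA4 browser match.

```python
from dataclasses import dataclass

# Threat levels eligible for reclassification, per the commit message.
NON_THREAT_LEVELS = {"NORMAL", "LOW"}
# "consistency >= 4/5" in the commit message.
CONSISTENCY_THRESHOLD = 4


@dataclass
class Session:
    # Hypothetical field names; the real feature names may differ.
    ja4_matches_browser: bool   # JA4 fingerprint belongs to a known browser
    modern_browser: bool        # assumed: modern_browser_score thresholded to a bool
    has_accept_language: bool   # Accept-Language header present
    has_cookies: bool           # cookies present
    has_sec_fetch: bool         # Sec-Fetch-* headers present
    threat_level: str           # level assigned by the normal anomaly scoring


def browser_consistency_score(s: Session) -> int:
    """Count how many of the five signals are present (0..5)."""
    return sum([
        s.ja4_matches_browser,
        s.modern_browser,
        s.has_accept_language,
        s.has_cookies,
        s.has_sec_fetch,
    ])


def classify(s: Session) -> str:
    """Return LEGITIMATE_BROWSER only when all three conditions hold."""
    is_known_browser = s.ja4_matches_browser
    if (
        is_known_browser
        and browser_consistency_score(s) >= CONSISTENCY_THRESHOLD
        and s.threat_level in NON_THREAT_LEVELS
    ):
        return "LEGITIMATE_BROWSER"
    # Known JA4 with low consistency (possible spoofing) keeps its
    # original threat level from the anomaly scoring.
    return s.threat_level
```

Under these assumptions, a session that presents a browser JA4 but none of the other four signals scores 1/5 and keeps its original threat level, which is the spoofing case the message describes.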
@@ -24,6 +24,7 @@
 CREATE TABLE IF NOT EXISTS ja4_processing.ml_detected_anomalies
 (
 detected_at DateTime, src_ip IPv6, ja4 String, host String, bot_name String,
+browser_family LowCardinality(String) DEFAULT '',
 anomaly_score Float32, threat_level String, model_name String, recurrence UInt32,
 asn_number String, asn_org String, asn_detail String, asn_domain String,
 country_code String, asn_label String,
@@ -80,6 +81,7 @@ CREATE TABLE IF NOT EXISTS ja4_processing.ml_all_scores
 ja4 String,
 host String,
 bot_name String,
+browser_family LowCardinality(String) DEFAULT '',
 anomaly_score Float32,
 raw_anomaly_score Float32,
 threat_level String,