ja4-platform

Author	SHA1	Message	Date
toto	d05969867f	docs: rewrite architecture/README, update deployment/development - architecture.md: complete rewrite (French) with dual-database diagram, 5-phase data flow, full table ownership, triple-voice ML pipeline, 7 dictionaries, 13 SQL files, updated tech stack - README.md: complete rewrite (English) with updated pipeline diagram, services table, scripts section, integration tests, full doc index, Go 1.24.6 workspace - deployment.md: update to 13 SQL files, remove Anubis UA/Country refs, add scripts section, add ensemble env vars (AE_WEIGHT, XGB_WEIGHT), update verification queries and network diagram - development.md: translate to French, add bot-detector 11-module structure, add Python ML deps, add scripts/integration test sections, fix bot-detector run command, add make targets Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 22:00:29 +02:00
toto	7bdc6e2865	docs: update thesis document (§2-§8) - §2.1.3: Simplified Anubis to 2 dictionaries (dict_anubis_ip, dict_anubis_asn) with COALESCE priority - §2.4.2: Added isotree library, calibration formula, ntrees=300, joblib serialization - §2.4.2b/§2.4.4: Replaced DBSCAN with HDBSCAN throughout - §2.4.2c: Replaced logistic regression with fixed linear weighting, added formula and weights - §2.4.3: Clarified the 5-quantile approximation used for drift detection - §3.1: Updated ASCII diagram (dual-database, 3×EIF+AE+XGB, HDBSCAN, 55 routes) - §3.8: Updated trifurcation + added multifactorial browser detection (5 axes) - §4: Expanded taxonomy from 51 to 65+ features across 8 families - §5: Added implementation status (✅/❌) to each technique - §6: Added §6.6 deployment results (3M+ logs, 34K sessions/cycle) - §7: Updated conclusion (65+ features, 5/8 techniques, modular refactoring) - §8: Added references for isotree, PyTorch, HDBSCAN, XGBoost, SHAP Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 21:59:34 +02:00
toto	9ea36ad22e	feat(scripts): complete stack init + prod data import with date shift Schema cleanup: - Remove anubis_ua_rules table stub from 03_anubis_tables.sql - Remove anubis_ua_rules from bot-detector deploy_schema.sql - Remove UA seed step from clickhouse-init.sh (no more REGEXP_TREE dependency) - Drop dict_anubis_ua, dict_anubis_country, anubis_ua_rules, anubis_country_rules New scripts: - scripts/init-stack.sh: comprehensive ClickHouse init (13 SQL files + migrations + validation + cleanup of obsolete tables). Supports --reset, --import-prod. - scripts/import-prod-data.sh: imports pre-exported prod data (Native format) with dynamic date shift (max(time) → now). Supports --shift, --no-truncate. - scripts/data/prod-export/: directory for cached Native format exports Makefile targets: init-stack, import-prod-data, init-and-import Tested: init-stack.sh passes all 13 SQL + 7 critical tables + 7 dicts import-prod-data.sh: 3M rows in ~37s with auto date shift Dashboard: 55 routes OK, bot-detector: 36/36 tests pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 21:40:05 +02:00
toto	d8ca804a55	feat(scripts): add reload-prod-logs.sh for prod→dev data sync Exports http_logs from prod ClickHouse via HTTP API, imports into dev with dynamic date shifting (max(time) → now() by default). Features: - Batch export in Native format (200K rows/batch, ~10s each) - Auto date shift: prod max(time) aligned to current time - --shift N: manual override (seconds) - --days N: filter to last N days only - --cron: silent mode for scheduled runs - Staging table approach: export → staging → INSERT SELECT with shift → cleanup Tested: 3,054,122 rows imported in ~3 minutes, dates 2026-04-03→2026-04-09. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 15:41:38 +02:00
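The date shift described in the commit above (prod `max(time)` aligned to the current time, with `--shift N` as a manual override) can be sketched in a few lines. This is only an illustration in Python: per the commit message, the actual script applies the shift inside ClickHouse via a staging table and `INSERT SELECT`. The function name `shift_timestamps` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def shift_timestamps(timestamps, now, shift_seconds=None):
    """Shift every timestamp by one common offset.

    Hypothetical helper: by default the newest timestamp is aligned to `now`;
    `shift_seconds` mimics a manual override such as `--shift N`.
    """
    if not timestamps:
        return []
    if shift_seconds is not None:
        shift = timedelta(seconds=shift_seconds)
    else:
        shift = now - max(timestamps)
    # A single offset preserves the spacing between log rows.
    return [t + shift for t in timestamps]


prod = [datetime(2026, 4, 3, 12, 0, 0), datetime(2026, 4, 9, 15, 0, 0)]
now = datetime(2026, 4, 10, 9, 30, 0)
shifted = shift_timestamps(prod, now)
```

Because every row moves by the same offset, session durations and inter-request gaps are unchanged, which matters for the hourly aggregations downstream.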
toto	8180f4af04	refactor(anubis): simplify to IP/CIDR + ASN only, remove UA and Country rules - Remove UA regex extraction (extract_ua_regex, _extract_ua_from_all/any) - Remove Country rule collection from parse_bot_policies_inline - Simplify fetch_rules.py: collect_all_rules returns (ip_rules, asn_rules) - Remove insert_ua_rules and insert_country_rules functions - reload_dicts now only reloads dict_anubis_ip + dict_anubis_asn - Simplify CASE blocks in 04_mv_http_logs.sql, 07_ai_features_view.sql, view_ai_features_anubis.sql, mv_http_logs.sql: IP > ASN (was 5-level UA+IP > UA > IP > ASN > Country cascade) - Remove dict_anubis_country + dict_anubis_ua from 03_anubis_tables.sql (UA table kept as stub for REGEXP_TREE catch-all compatibility) - Remove anubis_country_rules table from schema - Remove Anubis UA and Country tabs from dashboard reflists page - Remove anubis_ua_rules/country_rules from API reflist queries - deploy_schema.sql simplified from 339 to 122 lines - 764 lines removed across 9 files Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 15:25:33 +02:00
toto	98abbc80c7	feat(dashboard): Reference Lists page for CSV/dictionary visualization New /reflists page to browse the 9 ClickHouse dictionaries: - bot_ip (3.5K entries): IP/CIDR of known bots - bot_ja4 (31): bot JA4 fingerprints - browser_ja4 (1.2K): browser JA4 fingerprints → family, TLS lib - asn_reputation (82.5K): ASN → reputation (isp, datacenter, cdn…) - iplocate_asn (714K): IP geolocation → ASN, country, name - anubis_ua_rules, anubis_ip_rules, anubis_asn_rules, anubis_country_rules Features: - 9 navigation tabs to switch between lists - Text search with ClickHouse-side filtering - Pagination (200 entries/page) - Per-column sorting (ASC/DESC) - Distribution chart (ECharts) by category - Dictionary KPIs at the top of the page - Documentation tooltips API: /api/dictionaries, /api/reflist/{name}, /api/reflist/{name}/stats Helpers: esc() (HTML escape) added to base.html Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 14:56:54 +02:00
toto	039086a0b3	feat: new detection techniques and SOC Tactics page SQL: - Add 5 aggregation columns (count_xff, count_unusual_ct, count_non_std_port, count_login_post, sec_ch_mobile_mismatch) - Expose 5 computed features in view_ai_features_1h - ALTER TABLE migration for existing deployments Bot-detector: - 7 new ML features (has_xff, unusual_content_type_ratio, non_standard_port_ratio, login_post_concentration, sec_ch_mobile_mismatch, true_window_size, window_mss_ratio) - Propagate campaign_id to ml_all_scores (was always -1) - Campaign escalation: HIGH→CRITICAL if cluster has ≥5 members Dashboard: - SOC Tactics page: brute-force, JA4 rotation, recurrence, real-time alerts; 4 KPIs + 4 panels + doc tooltips - Add global fmtDate() helper - Sidebar navigation updated Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 14:29:18 +02:00
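The campaign escalation rule in the commit above (HIGH becomes CRITICAL when the cluster has at least 5 members) can be illustrated as follows. This is a minimal sketch, not the bot-detector code: the row layout, the function name `escalate_campaigns`, and the treatment of `campaign_id = -1` as "no campaign" are assumptions (the latter matches the usual HDBSCAN noise label and the `campaign_id >= 0` filter mentioned elsewhere in this log).

```python
from collections import Counter

MIN_CAMPAIGN_SIZE = 5  # threshold taken from the commit message


def escalate_campaigns(rows, min_size=MIN_CAMPAIGN_SIZE):
    """Return copies of `rows` with HIGH raised to CRITICAL for large campaigns.

    Hypothetical row format: dicts with 'campaign_id' and 'threat_level'.
    """
    # Count members per campaign, ignoring unclustered rows (campaign_id < 0).
    sizes = Counter(r["campaign_id"] for r in rows if r["campaign_id"] >= 0)
    result = []
    for r in rows:
        level = r["threat_level"]
        if level == "HIGH" and sizes.get(r["campaign_id"], 0) >= min_size:
            level = "CRITICAL"
        result.append({**r, "threat_level": level})
    return result


rows = [{"campaign_id": 0, "threat_level": "HIGH"} for _ in range(5)]
rows.append({"campaign_id": -1, "threat_level": "HIGH"})
rows.append({"campaign_id": 0, "threat_level": "MEDIUM"})
escalated = escalate_campaigns(rows)
```

Only HIGH rows are promoted here; other levels inside a large cluster are left untouched, which is one possible reading of the commit message.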
toto	702c0d5edb	feat(dashboard): add JA4 fingerprint and cluster investigation pages - /ja4/{fingerprint} page: 8 KPIs, timeline, threat pie, IP scores table, ASN/geo charts, HTTP logs, AI features — full JA4 investigation - /cluster/{cid} page: 8 KPIs, timeline, threat/JA4/ASN/host charts, member table with bulk classify — full campaign investigation - /api/ja4/{fingerprint} and /api/cluster/{cid} API endpoints - fmtJA4 links now navigate to /ja4/ investigation page - campaigns.html: 'Ouvrir' button links to /cluster/{cid} full page - Fix: double-brace {{param}} in non-f-string queries → single {param} (was causing HTTP 500 on all parameterized ClickHouse queries) - 50 routes total, all tests pass, 0 JS console errors Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 14:05:52 +02:00
toto	70188b508c	fix(dashboard): eliminate @apply CSS, fix status column, fix click propagation Playwright testing revealed 3 critical bugs: 1. Tailwind CDN @apply with custom brand-* colors produces empty CSS rules, breaking ALL design components (kpi-card, data-table, badges, filter-btn, section-card, nav-item). Fix: replace all @apply directives with equivalent raw CSS values. 2. Traffic API and IP detail API reference non-existent 'status' column in http_logs table → HTTP 500 on /traffic and /ip/{ip}. Fix: remove status from SELECT, sort whitelist, filters, and templates. 3. Nested <a> links (fmtJA4, fmtASN, fmtCountry, fmtBotName) inside clickable <tr onclick> capture clicks, preventing row navigation to /ip/ detail. Fix: add event.stopPropagation() to all formatter links. Verified with Playwright: 10 pages × 0 JS errors, all tooltips hidden by default, sidebar toggle works, keyboard shortcuts (Alt+1-9, Alt+B), classification form saves to DB, campaign detail panel opens on click. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 13:54:38 +02:00
toto	6babc55e3e	fix(dashboard): hover infobulles, full-width layout, UX polish - Fix doc tooltips: split CSS into <style type='text/tailwindcss'> for @apply directives + raw CSS for reliable doc panel rendering - Convert doc panels from click-toggle to hover-based infobulles with arrow pointer, fade-in animation, and auto-dismiss on mobile - Replace '?' icons with 'ⓘ' across all 11 templates (51 tooltips) - Full-width layout: reduce padding on mobile (px-3), scale up on desktop (lg:px-5, xl:px-6) for maximum screen utilization - Auto-collapse sidebar on narrow screens (<1024px) - Keyboard shortcuts: Alt+1–9 for page navigation, Alt+B toggle sidebar - Add LEGITIMATE_BROWSER filter button to detections page - Sticky header with stronger blur (backdrop-blur-md) - All 46 routes pass tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 13:30:16 +02:00
toto	63ba6d203c	feat(dashboard): complete SOC dashboard with full monitoring and workflows - models.html: Full rewrite — 6 KPIs, scoring volume timeline, anomaly rate chart, threat breakdown per model, enhanced model cards with validation gate - classify.html: SOC workflow — suggested unclassified IPs, quick-classify buttons, classification stats pie, pre-fill from URL params - traffic.html: Clickable rows → ip_detail, column sorting, status column, search filter, doc tooltips on all chart sections - scores.html: Search input, clickable rows → ip_detail, LEGITIMATE_BROWSER filter button, doc tooltips on distribution + scatter charts - ip_detail.html: Resource cascade section (headless browser detection), status column in HTTP logs table - detections.html: Doc tooltips on threat/reason/ASN chart sections - features.html: Doc tooltips on radar/importance/scatter sections - api.py: 4 new endpoints — /api/models/timeline, /api/models/threats, /api/classify/stats, /api/classify/suggested. Traffic API: status + search. 46 routes total. All tests pass (dashboard + bot-detector 36/36). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 01:25:01 +02:00
toto	396baa90d2	feat(dashboard): HDBSCAN cluster visualization - Dedicated /campaigns page with 4 chart views: · Scatter plot (score vs velocity, bubbles colored by campaign) · Force-directed network graph (IPs linked by shared JA4) · Campaign card grid (KPIs, ASN, country, JA4) · Detail panel (behavioral radar, hourly timeline, member table) - 4 new API endpoints: · GET /api/campaigns (fix: campaign_id >= 0 instead of != '') · GET /api/campaigns/graph (nodes + edges) · GET /api/campaigns/scatter (score/velocity per IP) · GET /api/campaigns/{cid} (detail + profile + timeline) - Sidebar: Campaigns link added under Surveillance - Overview: clickable campaigns → link to /campaigns Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 01:11:16 +02:00
toto	f1547423b5	refactor(bot-detector): remove monolith, multifactorial tests - Remove bot_detector.py (1982 lines), replaced by 11 modules - Browser tests updated for the multifactorial system (browser_confidence) - 36/36 tests pass with the new modular structure Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 01:03:17 +02:00
toto	1f103392ac	refactor(bot-detector): extract monolith into modular package Split bot_detector.py (~1982 lines) into 10 focused modules: - config.py: all configuration constants and optional imports - log.py: logging utilities (log_info, log_decision, append_training_history) - infra.py: ClickHouse client, health check HTTP server, shutdown - browser.py: multifactorial browser identification (5 axes) - scoring.py: drift detection, feature validation, SHAP, clustering - models.py: EIF, Autoencoder, XGBoost model management - preprocessing.py: data preprocessing and feature list definitions - pipeline.py: core semi-supervised scoring loop - cycle.py: main analysis cycle orchestration - __main__.py: entry point with startup banner Update Dockerfile to copy package directory and use python -m bot_detector. All 36 existing tests pass unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 01:02:04 +02:00
toto	2d04288e95	feat(dashboard): SOC workflow overhaul — sidebar nav, doc tooltips, full-width layout - base.html: collapsible sidebar navigation, doc tooltip system, JS helpers (fmtNum, fmtPct, fmtDuration, ecGrid, buildTable, docHTML) - overview.html: SOC command center with stacked timeline, live alerts, campaigns panel, browser donut, 6 KPIs - detections.html: threat color dots, raw score column, click-to-navigate rows - network.html: JA4 rotation, brute-force, persistent threats tables, 6 KPIs - ip_detail.html: ASN/country KPIs, AE/XGB/campaign columns, enriched features - scores/traffic/features/models/classify: page_title blocks + doc tooltips - api.py: 9 new endpoints (campaigns, brute-force, ja4-rotation, recurrence, cascade, alerts, timeline-detail, ua-rotation) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 00:29:34 +02:00
toto	c994ad4466	fix: XGB label query + SHAP isotree compatibility XGB: query was selecting features from ml_all_scores which doesn't store them. Now joins ml_all_scores (labels) with view_ai_features_1h (features). Dynamically discovers available columns to skip thesis §5 features not present in the view. Returns (model, features) tuple. SHAP: TreeExplainer doesn't support isotree. Fall back to permutation- based Explainer(model.decision_function, X_sample) for isotree. Verified: XGB trained on 50000 labels (18436 positives), triple-voice ensemble scoring active (EIF+AE+XGB), SHAP silent. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-09 00:06:54 +02:00
toto	c6666e2bba	fix: isotree score convention — proper sklearn calibration isotree decision_function returns [0,1] (higher=anomalous, 0.5=boundary). The entire pipeline (normalize_scores, score_to_threat_level, compute_adaptive_threshold) expects sklearn convention (negative=anomalous). Previous fix (-raw_scores) negated all values, making everything below -0.30 → all CRITICAL. New fix: 0.5 - isotree_score maps correctly to sklearn's convention: isotree 0.80 → -0.30 (CRITICAL) isotree 0.65 → -0.15 (HIGH) isotree 0.55 → -0.05 (MEDIUM) isotree 0.50 → 0.00 (boundary) Verified: 27,952 LEGITIMATE_BROWSER + 15,843 HIGH + 15,059 MEDIUM Tests: 36/36 pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 23:56:05 +02:00
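The calibration described in the commit above is a single affine map, `0.5 - isotree_score`. The sketch below reproduces the four worked values from the message. The function name `isotree_to_sklearn` is hypothetical, and plain Python lists stand in for whatever array type the pipeline uses.

```python
def isotree_to_sklearn(isotree_scores):
    """Map isotree scores ([0, 1], higher = more anomalous, 0.5 = boundary)
    to the sklearn-style convention (negative = anomalous, 0 = boundary)."""
    return [0.5 - s for s in isotree_scores]


# Worked values from the commit message: 0.80, 0.65, 0.55, 0.50
converted = isotree_to_sklearn([0.80, 0.65, 0.55, 0.50])
# converted is approximately [-0.30, -0.15, -0.05, 0.00]
```

Plain negation (`-score`) would map the boundary 0.5 to -0.5, so every session would fall below the -0.30 mark, which is the "all CRITICAL" failure the commit describes.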
toto	db306fb9da	fix: P0 audit bugs — bot-detector + dashboard + SQL Bot-detector: - B1.1: campaign_id and raw_anomaly_score now inserted into ml_detected_anomalies - B1.4/B1.5: log_decision argument order fixed (cycle_id, name) - B1.7: AE broadcast error — model now returns features list, scoring uses model's features instead of current cycle's (prevents dim mismatch) - B1.8: Anubis ALLOW bots now get bot_name from anubis_bot_name Dashboard: - C1.1: XSS in ip_detail.html — {{ ip | tojson }} instead of raw string - C1.2: Stored XSS via innerHTML — added escapeHtml() helper, all user-facing formatters (fmtIP, fmtASN, fmtCountry, fmtJA4, fmtBotName, fmtLabel) sanitized - C2.1: status filter now correctly filters http_version column - C2.2: heatmap toDayOfWeek() - 1 for 0-indexed JS days SQL: - B1.3: view_ip_recurrence worst_score uses max() not min() (0=normal, 1=anomalous) - B1.6: view_resource_cascade_1h joined into view_thesis_features_1h (§5.4) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 23:33:00 +02:00
toto	b66d41a200	docs: updated conformity audit bot-detector + dashboard vs thesis Score: 93% (was 72%) — 4 thesis techniques now implemented, browser classification, ASN PeeringDB, SOC feedback loop. Identifies 9 bot-detector bugs (2 critical: campaign_id/raw_anomaly_score never inserted, worst_score inverted) and 11 dashboard bugs (4 critical: XSS, no auth, no CSRF, CORS misconfiguration). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 23:25:19 +02:00
toto	98289ccf04	fix: ASN dictionary pipeline + verbose bot-detector logging - Fix dict_iplocate_asn: remove non-existent org/domain columns (4→4 cols) - Add CSV header to iplocate-ip-to-asn.csv (CSVWithNames format) - Replace org/domain dictGet calls with empty string literals in MV - Full 714K CIDR stub for complete ASN resolution in tests - Add header generation to generate_asn_data.py - Verbose bot-detector stdout: data summary, triage breakdown, model training details, scoring stats, browser classification, boxed results - Fix IPv6 filter in traffic seeder (_ips_from_cidrs) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 17:43:55 +02:00
toto	7b7b69dee3	Rewrite seed_clickhouse.py: 500K rows from 20K IPs with realistic traffic - 350K browser rows (14K IPs) using real JA4s from browser_ja4.csv - 100K scanner rows (3K IPs) with vuln/cred/scraper/DDoS sub-categories - 30K legit bot rows (2K IPs) from real bot_ip.csv CIDRs - 20K AI bot rows (1K IPs) for GPTBot, ClaudeBot, etc. Key improvements: - Load browser_ja4.csv at startup, match JA4 to browser family - Load bot_ip.csv to generate IPs from real Googlebot/Bingbot CIDRs - Hard-coded ISP /24 prefixes from real ASNs (Comcast, Orange, DT, etc.) - Realistic navigation patterns with Referer chains and cookies - Sec-CH-UA headers for Chromium browsers (modern_browser_score >= 50) - Batch size increased to 2000, progress reporting every 10K rows - New CLI args: --rows, --ips, --seed, --data-dir - Bot JA4s are synthetic hashes guaranteed NOT in browser_ja4.csv Also updated: - Dockerfile: COPY *.py (was missing seed_clickhouse.py) - docker-compose.yml: mount scripts/data as /app/data for CSV access - run-tests.sh: updated seeder description comments Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 16:35:40 +02:00
toto	74e0406c38	chore: update ASN stubs with new classification labels Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 16:05:25 +02:00
toto	5c5bca71d1	feat: rewrite ASN classification with PeeringDB + expanded heuristics Major improvements to generate_asn_data.py: - Add PeeringDB network data source (34K networks with info_type) - Add new categories: education, government, enterprise - Rename 'human' label to 'isp' across all consumers - Expand keyword heuristics (ISP, datacenter, hosting, CDN, education, gov) - Add hard-coded lists for education, government, enterprise ASNs - Support both --output-dir and --output-asn/--output-ipasn CLI interfaces - Add --no-peeringdb flag for offline use Results: unknown dropped from 86% to 57%, ISP coverage 21.8K ASNs, education 3.1K, enterprise 5.7K, government 520. Updated consumers: - bot_detector.py: 'human' -> 'isp' for baseline selection - dashboard api.py: 'human' -> 'isp' in SQL queries - run-tests.sh: 'human' -> 'isp' in integration test assertions - update-csv-data.sh: updated label description comment Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 16:02:07 +02:00
toto	9a48fb9d29	feat: LEGITIMATE_BROWSER classification from JA4 + behavioral consistency Add browser legitimacy classification (A9) to the bot detection pipeline: - New features: is_known_browser (binary) and browser_consistency_score [0..5] combining 5 signals: JA4 browser match, modern_browser_score, Accept-Language, cookies, Sec-Fetch-* presence - Post-scoring: sessions with known browser JA4 + consistency >= 4/5 + NORMAL/LOW threat level are reclassified as LEGITIMATE_BROWSER - Spoofing detection: inconsistent behavior (known JA4 but low consistency) stays in normal anomaly scoring — prevents evasion via JA4 spoofing - XGBoost treats LEGITIMATE_BROWSER as non-threat (negative label) - ClickHouse: browser_family column added to ml_detected_anomalies and ml_all_scores - Dashboard: browser_family filter/sort on detections and scores endpoints, legitimate_browsers count and browser_stats in overview - 6 new unit tests covering classification threshold, spoofing, exclusion logic Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 15:46:22 +02:00
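The reclassification rule in the commit above combines a 5-signal consistency score with the post-scoring threat level. A minimal sketch follows, under stated assumptions: the function names, argument names, and the `modern_browser_score >= 50` cut-off are hypothetical, since the commit does not say how each signal is turned into a 0/1 vote.

```python
MODERN_BROWSER_MIN = 50  # hypothetical cut-off, not confirmed by the commit


def browser_consistency_score(ja4_is_browser, modern_browser_score,
                              has_accept_language, has_cookies, has_sec_fetch):
    """Count how many of the 5 signals look browser-like (result in 0..5)."""
    signals = (
        ja4_is_browser,
        modern_browser_score >= MODERN_BROWSER_MIN,
        has_accept_language,
        has_cookies,
        has_sec_fetch,
    )
    return sum(1 for s in signals if s)


def reclassify(threat_level, is_known_browser, consistency):
    """Apply the rule: known browser JA4 + consistency >= 4 + NORMAL/LOW."""
    if is_known_browser and consistency >= 4 and threat_level in ("NORMAL", "LOW"):
        return "LEGITIMATE_BROWSER"
    # Spoofed JA4 with inconsistent behavior keeps its anomaly-based level.
    return threat_level


genuine = browser_consistency_score(True, 80, True, True, True)
spoofed = browser_consistency_score(True, 0, False, False, False)
```

A client that only copies a browser JA4 scores 1/5 here and stays in normal anomaly scoring, which is the spoofing safeguard the commit mentions.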
toto	7d09c614c3	feat: browser JA4 detection, Anubis bot rules, worldwide ASN data - Add generate_browser_ja4.py: 1,186 browser JA4 fingerprints from FoxIO + ja4db.com covering 11 families (Chromium, Firefox, Safari, Edge, Tor, Opera, Vivaldi...) - Rewrite generate_bot_ip.py: Anubis YAML rules (Google, Bing, Apple, DuckDuck, OpenAI, Perplexity bots) + Tor exit nodes + cloud scanner IPs (3,555 entries) - Rewrite generate_asn_data.py: worldwide iptoasn.com data (78,049 ASNs, 714K CIDRs) - Add dict_browser_ja4 ClickHouse dictionary + browser_family in AI features views - Add /api/browsers dashboard endpoint - Fix CSV quoting for fields containing commas (User-Agent strings) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 15:27:37 +02:00
toto	b6184e6529	feat: CSV generation scripts, API filter params, enriched CSV stubs - scripts/generate_bot_ip.py: download Tor exit nodes + curate scanner IPs (1353 entries) - scripts/generate_bot_ja4.py: 31 bot JA4 fingerprints across 16 families - scripts/generate_asn_data.py: 38 ASNs + 96 IP-to-ASN prefixes - scripts/update-csv-data.sh: master orchestrator with --install-stubs - api.py: add asn_org/country_code/ja4/bot_name filters on detections+scores - pages.py: add /network route - csv-stubs: enriched with generated data (Tor nodes, scanner IPs, etc.) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 15:05:43 +02:00
toto	c6ca352db9	feat(dashboard): add clickable drill-down to all data elements Add navigation helpers (fmtASN, fmtCountry, fmtJA4, fmtBotName, fmtThreatLink, fmtLabel) to base.html for SOC analyst drill-down. Update all templates: - overview.html: clickable table cells + ECharts click handlers for ASN, country, JA4, bot, and threat charts - detections.html: URL param pre-filters, active filter bar with clear buttons, clickable ASN/country/JA4/threat in table - scores.html: URL param pre-filters, clickable threat/JA4/country - traffic.html: clickable JA4 and country columns - ip_detail.html: clickable threat/JA4 in detections, clickable asn_org/country_code/asn_label in AI features grid - network.html: click handlers on ASN treemap and country sunburst, fmtJA4Full/fmtLabel/fmtBotName/fmtASN in tables - features.html: scatter plot click navigates to /ip/{ip} Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 14:58:48 +02:00
toto	fc882dd3e7	feat(tests): realistic traffic seeder + IP diversity via mod_remoteip Option A — X-Forwarded-For + mod_remoteip: - httpd-integration.conf: load mod_remoteip, trust all Docker RFC-1918 subnets (172/192.168/10). mod_reqin_log uses r->useragent_ip which mod_remoteip updates from XFF → each request logged with distinct src_ip - generate_traffic.py: XFF always set (was 30% only); human scenarios use 91.121/78.41/90.x ranges, bot scenarios use 185.220/45.155/193.32; pool of 1168 human IPs and 180 bot IPs; default --requests 500 Option D — Direct ClickHouse seeder (seed_clickhouse.py, stdlib only): - Inserts ~4000 rows into http_logs_raw triggering full MV chain: http_logs_raw → mv_http_logs → http_logs → mv_agg_host_ip_ja4_1h → agg_host_ip_ja4_1h • 720 human sessions: IPs in OVH/SFR/Orange ASN ranges (16276/15557/3215) → dict_asn_reputation maps these to asn_label='human' → satisfies bot_detector human_baseline >= 500 threshold • 150 scanner sessions: datacenter IPs, attack paths (/.env, wp-login, SQLi, path traversal), scanner UAs, minimal TCP fingerprints • 100 known-bot sessions: IPs matching bot_ip.csv entries • 20 brute-force clusters: 20-50 POST /login per IP All TCP/TLS metadata is profile-realistic (window, MSS, TTL, JA4, JA3) CSV stubs (mounted at /var/lib/clickhouse/user_files/): - iplocate-ip-to-asn.csv: 13 CIDR→ASN mappings (OVH/SFR/Orange/Tor/Contabo) - asn_reputation.csv: 13 ASN→label (8 'human', 3 'datacenter'/'hosting') - bot_ip.csv: 14 known scanner/Tor IPs (Shodan, Censys, Tor exits) - bot_ja4.csv: 5 bot JA4 fingerprints (curl, python-requests, masscan, zgrab) run-tests.sh: - Phase 4a: seeder runs before live traffic (ensures bot_detector baseline) - Phase 4b: live traffic gen at 500 requests (up from 200) - Phase 5f: new assertions — agg_host_ip_ja4_1h populated, ≥500 human rows in view_ai_features_1h, known-bot labels present - Phase 7: verifies ml_all_scores populated (bot_detector ran a cycle) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 11:35:34 +02:00
toto	f448dcb4b0	fix(rpm): standardize systemd scriptlets and unit installation paths - Add BuildRequires: systemd-rpm-macros to sentinel and correlator specs - Replace manual systemctl calls with %systemd_post, %systemd_preun, %systemd_postun_with_restart macros (handles daemon-reload, stop/disable, try-restart on upgrade correctly and is a no-op in containers) - ja4sentinel.spec: use %{_unitdir} macro instead of hardcoded path (/usr/lib/systemd/system); remove cross-service /var/run/logcorrelator from %files and %post (owned by logcorrelator package, not sentinel) - logcorrelator.spec: move unit from /etc/systemd/system (admin namespace) to %{_unitdir} (/usr/lib/systemd/system) — correct packaging location; move user/group creation from %post to %pre so file ownership is valid during RPM install phase; add Requires(pre): shadow-utils; fix bare directory entries in %files with %dir macro; add version fallback macro so spec is buildable without --define version - test-rpm.sh: auto-build RPM via Dockerfile.package if dist/rpm/ is empty; update service file path check to /usr/lib/systemd/system/ Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 10:49:21 +02:00
toto	f7ee5e63f8	fix(docker): add g++ for isotree build, add dashboard Dockerfile.tests - bot-detector Dockerfile + Dockerfile.tests: install g++ for isotree C++ extension - dashboard Dockerfile.tests: new smoke test (verify FastAPI app loads) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 08:08:13 +02:00
toto	77c0450a22	docs: update copilot-instructions.md for dashboard rewrite and ML upgrades - Dashboard: FastAPI+React → FastAPI+Jinja2+htmx+Chart.js (2 route modules) - Bot-detector: IsolationForest → triple-voice EIF+Autoencoder+XGBoost ensemble - SQL schema: 10 → 13 files (added thesis features, perf indexes, views) - Added ClickHouse 24.8 gotchas (projections, nested aggregates, let bindings) - Added IPv4/IPv6 duality pattern, bot-detector test patterns - Updated data retention table with 4 new thesis aggregation tables - Fixed single-test commands to reference existing files Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 07:31:10 +02:00
toto	b735bab5a5	feat(dashboard): rebuild SOC dashboard + fix ClickHouse SQL Complete rewrite of the SOC dashboard using FastAPI + Jinja2 + htmx + Chart.js + Tailwind CSS. Replaces the old React/Vite frontend with server-rendered templates. Dashboard pages: - Overview: KPIs, timeline chart, threat distribution, top IPs - Detections: paginated/filterable anomaly table - Scores: ml_all_scores with AE error & XGB prob columns - Traffic: HTTP logs with method/host filters - IP Investigation: full deep-dive (scores, features, HTTP logs, classify) - Classification: SOC feedback form + history - Features: AI + thesis feature stats - Models: scoring stats + model metadata API: 9 JSON endpoints with parameterized queries, sort whitelists SQL fixes: - 05_aggregation_tables: add deduplicate_merge_projection_mode - 11_views: fix nested aggregate (argMax inside sum) - 12_thesis_features: remove invalid 'let' bindings, fix groupArrayIf type Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 03:21:05 +02:00
toto	228ad7026a	fix(integration): mount missing SQL files 10-12 in ClickHouse init 3 SQL files were missing from the docker-compose.yml volume mounts: - 10_perf_indexes.sql (performance indexes) - 11_views.sql (dashboard views) - 12_thesis_features.sql (thesis §5 MVs and views) Also make 10_perf_indexes.sql non-fatal in init script since ALTER TABLE ADD INDEX may fail if index already exists. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 02:55:43 +02:00
toto	8d58f2b932	feat(bot-detector): add XGBoost supervised third voice (#10) Triple-voice ensemble architecture: - EIF (unsupervised, zero-day anomalies) - Autoencoder (unsupervised, non-linear correlations) - XGBoost (supervised, known patterns + SOC feedback) XGBoost implementation: - Trained on historical ml_all_scores labels (NORMAL=0, HIGH/CRITICAL/DENY/KNOWN=1) - Weekly retraining (XGB_RETRAIN_INTERVAL_H=168), min 100 labels required - Score = predict_proba, combined via meta-learner: (1-β)(EIF+AE) + β·xgb_prob - Configurable: XGB_WEIGHT (β=0.20), XGB_MIN_LABELS, XGB_RETRAIN_INTERVAL_HOURS - Graceful fallback: if xgboost unavailable or labels insufficient, EIF+AE only - ClickHouse: xgb_prob column added to ml_all_scores - Tests: 4 new tests (availability, train/predict, meta-learner, save/load) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 02:45:57 +02:00
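The fixed-weight combination named in the commit above, `(1-β)(EIF+AE) + β·xgb_prob` with β = 0.20, can be sketched as below. Two points are assumptions rather than facts from the log: that "(EIF+AE)" denotes a convex blend `(1-α)·EIF + α·AE` using the `AE_WEIGHT` α = 0.30 from the Autoencoder commit, and that all three inputs are already normalized to [0, 1] with 1 = most anomalous. The function name `combine_scores` is hypothetical.

```python
AE_WEIGHT = 0.30   # α, default quoted in the Autoencoder commit
XGB_WEIGHT = 0.20  # β, default quoted in the XGBoost commit


def combine_scores(eif, ae=None, xgb_prob=None,
                   ae_weight=AE_WEIGHT, xgb_weight=XGB_WEIGHT):
    """Blend the three voices with fixed linear weights.

    Assumed form: unsup = (1-α)·eif + α·ae, final = (1-β)·unsup + β·xgb_prob.
    A missing voice (None) is skipped, mirroring the graceful fallbacks.
    """
    unsup = eif if ae is None else (1 - ae_weight) * eif + ae_weight * ae
    if xgb_prob is None:
        return unsup  # EIF+AE only (or EIF only)
    return (1 - xgb_weight) * unsup + xgb_weight * xgb_prob


eif_only = combine_scores(0.9)
two_voices = combine_scores(1.0, ae=0.0)
three_voices = combine_scores(1.0, ae=1.0, xgb_prob=0.0)
```

With these defaults the supervised voice can move the final score by at most 0.20, so the unsupervised detectors still dominate when labels are sparse.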
toto	57cf6c3828	feat(bot-detector): add parallel Autoencoder scorer (#9) - TrafficAutoEncoder class: symmetric AE (n→64→32→16→32→64→n) with BatchNorm+ReLU - Trained alongside EIF on human_baseline, saved/loaded with model versioning - Score = per-sample MSE reconstruction error, combined with EIF via AE_WEIGHT (α=0.30) - AE latent space (16-dim) used for HDBSCAN clustering instead of raw features - Configurable: AE_WEIGHT, AE_EPOCHS, AE_LATENT_DIM, AE_LEARNING_RATE - Graceful fallback: if torch unavailable or AE fails, EIF-only scoring continues - ClickHouse: ae_recon_error column added to ml_all_scores - Tests: 5 new tests (AE train/score, encode latent, state dict save/load, weight combination) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 02:40:39 +02:00
toto	f6e2d3c0ca	feat(bot-detector): implement 8 state-of-art improvements - EIF: Extended Isolation Forest via isotree (fallback to sklearn IF) - Benford's Law deviation feature on inter-request timing - Lag-1 autocorrelation feature for cadence analysis - Validation gate: reject model if val_anomaly_rate > 20% - Feature pruning: remove variance < 1e-6 features before training - Quantile drift: replace N(μ,σ) synthetic with quantile interpolation - Thread safety: Lock for _service_healthy/_consecutive_failures - Score normalization: inverted to [0,1] where 1=most anomalous SQL: add lag1_autocorrelation + benford_deviation to view_thesis_features_1h Tests: 10 new test functions covering all improvements Integration: verify_mvs.py checks new thesis feature columns Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 02:31:26 +02:00
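The lag-1 autocorrelation feature mentioned in the commit above measures how strongly each inter-request interval predicts the next one. The sketch below uses the standard sample estimator in plain Python. Per the commit, the project computes this feature in SQL (`view_thesis_features_1h`), so the function name and the zero-variance convention here are assumptions for illustration only.

```python
def lag1_autocorrelation(intervals):
    """Sample lag-1 autocorrelation of a sequence of inter-request intervals.

    r1 = sum((x[t] - mean) * (x[t+1] - mean)) / sum((x[t] - mean) ** 2)
    Returns 0.0 for fewer than 2 points or zero variance (assumed convention).
    """
    n = len(intervals)
    if n < 2:
        return 0.0
    mean = sum(intervals) / n
    denom = sum((x - mean) ** 2 for x in intervals)
    if denom == 0:
        return 0.0
    num = sum((intervals[i] - mean) * (intervals[i + 1] - mean)
              for i in range(n - 1))
    return num / denom


alternating = lag1_autocorrelation([1, 2, 1, 2, 1, 2])  # strongly negative
ramp = lag1_autocorrelation([1, 2, 3, 4, 5])            # positive
fixed_rate = lag1_autocorrelation([2, 2, 2, 2])         # zero variance
```

A perfectly fixed cadence has zero variance and is reported as 0.0 here, so this feature would need to be read together with a dispersion feature such as `cadence_cv` to separate metronomic bots from irregular human timing.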
toto	0d1a6a81e0	docs: update thesis with EIF, autoencoders, ensemble architecture, quantile drift - §2.4.2: Add Extended Isolation Forest theory (Hariri et al., TKDE 2021) - §2.4.2b: New section on autoencoders for network anomaly detection (Kitsune, β-VAE, hybrid AE+IF studies) - §2.4.2c: New section on hybrid supervised+unsupervised ensembles (triple-voice architecture: EIF + AE + XGBoost + meta-learner) - §2.4.3: Enhanced drift detection with quantile digest and validation gate - §6.2: Multi-level baseline contamination mitigation - §7: Updated conclusion reflecting ensemble architecture - §8: 10 new references (Hariri 2021, Mirsky 2018, Baptiste 2026, etc.) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 02:23:00 +02:00
toto	3ae8c7d9c9	feat(bot-detector): upgrade to state-of-the-art detection pipeline - Fix UnboundLocalError on global _consecutive_failures/_service_healthy - Add SQL identifier validation for DB names at startup - Replace Z-score drift detection with KS test (scipy.stats.ks_2samp) - Replace DBSCAN with HDBSCAN (adaptive clustering, no epsilon needed) - Fix NaN→0 blanket imputation with per-feature median/sentinel strategy - Add 80/20 temporal train/validation split with offline metrics logging - Integrate thesis §5 features from view_thesis_features_1h: path_transition_entropy, cadence_cv, burst/pause ratios, host_diversity, host_sweep_speed, host_coverage_uniformity, ja4_drift_ratio (Complet model only) - Add SOC feedback loop: read classifications from audit_logs, reclassify FP IPs as human, exclude TP IPs from baseline - Update dependencies: clickhouse-connect 0.8.12, scikit-learn 1.6.1, pandas 2.2.3, shap 0.47.2, add scipy>=1.14, hdbscan>=0.8.38 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 02:09:18 +02:00
toto	6d02f21c1e	feat: implement thesis §5 advanced detection techniques as ClickHouse MVs New aggregation tables + materialized views: - agg_path_sequences_1h + MV (§5.1 Path Sequence Entropy) - agg_request_timing_1h + MV (§5.3 Request Cadence Fingerprint) - agg_ip_behavior_1h + MV (§5.5 JA4 Drift + §5.8 Cross-Domain) - agg_resource_cascade_1h + MV (§5.4 Resource Dependency Tree) New analytical views: - view_thesis_features_1h: unified view exposing all computable features (path_transition_entropy, cadence_cv, burst_ratio, pause_ratio, ja4_drift_ratio, host_diversity, host_sweep_speed, host_coverage_uniformity) - view_resource_cascade_1h: root_to_first_asset_delay, asset_load_stddev Documented future techniques (not feasible as MV): - §5.2 Bipartite Fleet Graph (needs Python networkx) - §5.6 DNS Shadow Analysis (needs sentinel UDP/53 extension) - §5.7 Compression Ratio Invariant (needs mod_reqin_log extension) Updated: deploy_schema.sh, verify_mvs.py (sections 8-10) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 01:42:52 +02:00
toto	0ccd417a02	docs: compliance audit of detection vs. the state-of-the-art thesis Exhaustive feature-by-feature analysis of the implemented detection techniques against what the thesis describes. Score: 97% base, 6% advanced techniques, 72% weighted overall. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-08 00:12:51 +02:00
toto	11b46b2eab	docs: update copilot-instructions.md for v14 changes - Fix coverage gate: 60% → 80% for correlator - Document dual-model pattern (Complet/Applicatif) in bot-detector - Add SQL deployment paths: deploy_views.sql + service migrations - Add data retention TTL table with partition info - Fix integration test description (8 phases, --build-only flag) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 23:55:28 +02:00
toto	51b8eb57a8	feat: port v14 schema fixes, migration, MV verifier, thesis from ja4/ deploy_views.sql (v13 → v14): - CRITICAL: ml_detected_anomalies ORDER BY (src_ip) → (src_ip, ja4, host, model_name) ReplacingMergeTree was collapsing all detections to 1 row per IP on merge - Add PARTITION BY toDate + ttl_only_drop_parts on all 4 data tables - ml_all_scores TTL 3d → 7d; ml_detected_anomalies TTL 30d → 7d - agg_host_ip_ja4_1h + agg_header_fingerprint_1h: add partition + TTL 7d - view_ip_recurrence: add WHERE detected_at >= now() - 7 DAY (was full scan) - Remove dead views: summary/timeseries/threat_dist/variability - Add view_dashboard_entities (fixes HTTP 500 in clustering/incidents/fingerprints) - Add view_dashboard_user_agents (fixes HTTP 500 in fingerprints/metrics) - Add view_ai_features_24h (enables ENABLE_MULTIWINDOW in bot_detector) - Mark max_requests_per_sec as DEPRECATED (always 0) New files: - correlator/sql/migrations/01_ttl_adjustments.sql: ALTER TABLE migration - tests/integration/verify_mvs.py: MV pipeline verification assertions - docs/THESIS_HTTP_Traffic_Detection.md: detection techniques thesis All DB references use ja4_processing/ja4_logs (no mabase_prod). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 23:51:56 +02:00
toto	ecceb04174	perf(clickhouse): P3: view_ip_recurrence with TTL filter + remove FINAL view_ip_recurrence: Added WHERE detected_at >= now() - INTERVAL 30 DAY → With PARTITION BY (P1), ClickHouse prunes the partitions outside this range before reading any data. The view scans only the active partitions (instead of the 30 full daily partitions). → ORDER BY (src_ip) ensures that GROUP BY src_ip reads contiguous data (no in-memory reordering). rotation.py, remove FINAL on ml_detected_anomalies: FINAL forces a full in-memory deduplication of the ReplacingMergeTree (equivalent to a DISTINCT over the whole table), one of the most expensive operations in ClickHouse. Fix: replace the FINAL sub-SELECT with view_ip_recurrence (already aggregated by src_ip, returns recurrence directly without FINAL). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 22:33:29 +02:00
toto	2bfb4b7282	perf(dashboard): P2: replace replaceRegexpAll in WHERE clauses with IPv4MappedToIPv6 Problem: 8 WHERE clauses applied a function to the src_ip column: WHERE replaceRegexpAll(toString(src_ip), '^::ffff:', '') = %(ip)s → ClickHouse cannot use the sort index or the skipping indexes when a function is applied to the filtered column. Fix: transform the INPUT (the parameter) rather than the column: WHERE src_ip = IPv4MappedToIPv6(toIPv4(%(ip)s)) → src_ip stays untouched → ClickHouse uses the indexes (P1) and the proj_by_ip projection (P1) for these queries. Files changed: investigation_summary.py: 6 WHERE (ml_detected_anomalies, agg_host_ip_ja4_1h, view_form_bruteforce_detected, view_host_ip_ja4_rotation, view_ip_recurrence) ml_features.py: 1 WHERE (view_ai_features_1h) rotation.py: 1 WHERE (agg_host_ip_ja4_1h) Note: the 27 other occurrences of replaceRegexpAll in SELECT clauses are display transformations (IPv6→IPv4 for the UI) and do not block index use. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 22:31:57 +02:00
toto	14323f7b05	perf(clickhouse): P10: create the 4 missing business views + fix DB prefixes Production bug: view_form_bruteforce_detected, view_host_ip_ja4_rotation, view_dashboard_entities, view_dashboard_user_agents were referenced in 13 dashboard endpoints but did not exist anywhere in the schema. All of these endpoints returned HTTP 500 in production. shared/clickhouse/11_views.sql (new): view_form_bruteforce_detected Source: agg_host_ip_ja4_1h (24h) Logic: GROUP BY (src_ip, host) HAVING count_post >= 10 Usage: bruteforce.py (3 endpoints), investigation_summary.py view_host_ip_ja4_rotation Source: agg_host_ip_ja4_1h (24h) Logic: uniqExact(ja4) per src_ip, HAVING >= 2 (fingerprint rotation) Usage: rotation.py (3 endpoints), investigation_summary.py view_dashboard_entities Source: http_logs (7 days), UNION ALL 5 branches (ip/ja4/country/asn/host) Columns: entity_type, entity_value, src_ip, ja4, host, log_date, client_headers Array(String), asns Array, countries Array, user_agents Array Usage: entities.py (5 endpoints), clustering.py view_dashboard_user_agents Source: http_logs (7 days), GROUP BY (src_ip, ja4, hour) Columns: src_ip, ja4, hour, log_date, user_agents Array(String), requests Usage: variability.py (4 endpoints), fingerprints.py (5 endpoints), attributes.py (2 endpoints) deploy_schema.sh: added 10_perf_indexes.sql and 11_views.sql to the list routes/variability.py + fingerprints.py: Fixed 9 queries that used view_dashboard_user_agents without a database prefix → replaced with {settings.CLICKHOUSE_DB_PROCESSING}.view_* Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 22:30:09 +02:00
toto	f4ffe3410a	perf(clickhouse): P1: partition + skipping indexes on ml_detected_anomalies, http_logs, agg_host_ip_ja4_1h Problem: every dashboard query with WHERE detected_at >= now() - INTERVAL N did a full scan because ml_detected_anomalies had ORDER BY (src_ip) with no partition or time index. Changes: - 06_ml_tables.sql: * ml_detected_anomalies: PARTITION BY toYYYYMMDD(detected_at) → daily partition pruning on all time-based queries * INDEX idx_detected_at (minmax) → skips granules outside the range * INDEX idx_threat_level set(8) → skip for countIf(threat_level = ...) * INDEX idx_bot_name bloom_filter → skip for bot_name != '' * ttl_only_drop_parts = 1 → TTL by dropping whole partitions * ml_all_scores: same treatment (PARTITION BY + 2 indexes) - 04_mv_http_logs.sql: * http_logs: INDEX idx_src_ip bloom_filter(0.01) → queries with WHERE src_ip = X (analysis.py, variability.py) skip ~90% of granules without scanning the whole time range * INDEX idx_ja4 bloom_filter(0.01) → same for JA4 filters - 05_aggregation_tables.sql: * agg_host_ip_ja4_1h: PROJECTION proj_by_ip ORDER BY (src_ip, window_start, ...) → investigation_summary.py and rotation.py (WHERE src_ip = X) automatically use the projection instead of scanning every window_start - 10_perf_indexes.sql (new): * ALTER TABLE migration for existing instances * ADD INDEX + MATERIALIZE INDEX for the 4 tables * ADD PROJECTION + MATERIALIZE PROJECTION for agg_host_ip_ja4_1h * Note: PARTITION BY on an existing table requires recreating it (documented) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 22:28:04 +02:00
toto	69940bf18b	docs: update copilot-instructions with integration tests, gotchas, comment standard - Add make test-integration* commands - Add SKIP_TESTS=true fast build flag - Add 'Comments standard' section referencing docs/commenting-standard.md - Add 'Known gotchas' section with 6 non-obvious issues: * go.work build context must include both sentinel + correlator * YAML does not expand env vars in Go (hardcode DSN) * REGEXP_TREE dict requires >=1 rule or inserts fail * pcap only captures non-loopback traffic * ClickHouse init needs 120s timeout * RPM builds must use Rocky Linux (libpcap.so.1 vs .so.0.8) * FLAT() layout requires numeric keys (use COMPLEX_KEY_HASHED for strings) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 21:42:54 +02:00
toto	3b8c06b86d	docs: add Doxygen comments to mod_reqin_log.c - File header: French multi-line description block - 7 section banners in French (/* ====== Section ====== */ format): Configuration du serveur, Buffer dynamique, Sérialisation JSON, Gestionnaires de directives, Socket Unix, Journalisation, Hooks Apache - 26 @brief/@param/@return blocks on every function: server config, dynbuf_*, JSON helpers, cmd_set_* handlers, socket helpers (try_connect/ensure_connected/write_to_socket), log_request, Apache hooks (post_read_request, child_init, etc.) - No logic changes (1033 → 1268 lines, comments only) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 21:35:19 +02:00
toto	3dfeba860b	docs: add standardized comments to all services (Python, Go, Bash) - Add docs/commenting-standard.md defining per-language comment standards (Go godoc, Python PEP-257, C Doxygen, Bash header blocks, SQL banners) - services/dashboard: 100% docstring coverage (100/100 functions) - All FastAPI route handlers, helpers, classes, and models documented - Language: French (project convention) - services/bot-detector: 100% docstring coverage (53/53 symbols) - bot_detector.py: 14 functions + module docstring - anubis/fetch_rules.py: 9 functions - shared/python/ja4_common: full docstrings on ClickHouseClient (7 methods) and ClickHouseSettings class - services/correlator: 24 godoc comments added across 6 Go files - correlation_service.go: 10 private helpers - unixsocket/source.go: 6 parsing/socket helpers - correlated_log.go: 4 field extraction helpers - orchestrator.go, logger.go, main.go: 4 comments - services/correlator/scripts/audit-architecture.sh: standardized header block Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 21:32:29 +02:00
toto	12d60975da	feat: Python traffic generator with realistic varied HTTP/HTTPS traffic - Replace curlimages/curl with Python stdlib traffic generator - 200 requests, 10 workers, 16 scenario types: browsers (Chrome/Firefox/Safari/Edge/mobile), bots (Googlebot/Bing/curl/wget), GET/POST/HEAD/PUT/PATCH/DELETE/OPTIONS, HTTP + HTTPS - Multiple SSL contexts (default, TLS1.2-only, TLS1.3-only, few_ciphers) → 4 distinct JA4/JA3 fingerprints per test run - Realistic headers: Accept, Accept-Language, Sec-Fetch-*, Referer, X-Forwarded-For, Cookie, Cache-Control - JSON payloads, form data, CORS preflights - DB always reset (down -v) at start of each test run - Enhanced Phase 5 checks: distinct UAs, method variety, JA4/JA3 counts + uniqueness Results: 199/200 OK, 24 distinct UAs, 7 HTTP methods, TLS 1.2+1.3, 4 JA4 fingerprints Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-07 21:14:55 +02:00

59 Commits