feat(e2e): add multi-IP endpoint architecture with dedicated traffic VM

Replace single-service-per-endpoint with all-ips mode running nginx, apache,
and hitch+varnish simultaneously on 3 dedicated IPs per VM (eth1 alias IPs).
Add a dedicated traffic VM with curl-impersonate for realistic TLS fingerprints,
parallelized traffic generation, and paired SNI_HOSTS/TARGET_IPS lists for
per-VM per-service hostname identification (e.g. rocky9-nginx-platform.test).
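
The paired-list idea above can be sketched as follows. This is a minimal illustration, not the code from run-e2e-test.sh: it assumes both lists are space-separated strings of equal length, and the function name `pair_targets`, the second hostname, and the documentation-range IPs are hypothetical.

```shell
# Sketch only: walk two space-separated lists in lockstep.
# Assumption: SNI_HOSTS and TARGET_IPS are space-separated and equally long.
pair_targets() {
    hosts="$1"
    ips="$2"
    # Load the IP list into the positional parameters so it can be shifted.
    set -- $ips
    for host in $hosts; do
        if [ "$#" -eq 0 ]; then
            echo "pair_targets: more hosts than IPs" >&2
            return 1
        fi
        printf '%s %s\n' "$host" "$1"
        shift
    done
    if [ "$#" -ne 0 ]; then
        echo "pair_targets: more IPs than hosts" >&2
        return 1
    fi
}

# Hypothetical values; only the first hostname appears in the commit message.
SNI_HOSTS="rocky9-nginx-platform.test rocky9-apache-platform.test"
TARGET_IPS="192.0.2.11 192.0.2.12"
pair_targets "$SNI_HOSTS" "$TARGET_IPS"
```

Each output line carries one hostname and the IP it should be sent to, so a caller can read the pairs with `while read -r host ip`. A length mismatch is reported as an error rather than silently truncated.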

Key changes:
- run-tests-vm.sh: add setup_all_ips(), IP-specific Listen/bind directives
  with reset-before-apply pattern, graceful service availability checks
- run-e2e-test.sh: traffic VM architecture, all-ips mode, eth1 network,
  paired IP/SNI lists, updated cleanup for alias IPs
- generate-traffic.sh: parallel background jobs, curl-impersonate detection,
  auto source interface detection via ip route get, Host header in HTTP traffic
- Vagrantfile: add traffic VM with provision-traffic.sh
- provision-traffic.sh: install curl-impersonate and httpx for traffic gen
- test-rpm.sh: multi-interface TC check, updated ja4ebpf config
- clickhouse-init.sh: load CSV stubs for Anubis/bot-networks dictionaries
- Remove obsolete correlator/sentinel/mod-reqin-log docs
- Add h2_settings_ack column to http_logs schema
- Upgrade Go toolchain to 1.25.0
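
As an illustration of the "parallel background jobs" item for generate-traffic.sh, here is a minimal sketch of the usual shell pattern: start each job with `&`, record its PID, then `wait` on every PID and count the failures. The names `run_parallel` and `demo_request` are hypothetical; the real script presumably calls curl or curl-impersonate where the stand-in only prints a line.

```shell
# Sketch only: run one command per target in the background, then collect
# exit statuses. run_parallel and demo_request are hypothetical names.
run_parallel() {
    cmd="$1"
    shift
    pids=""
    for target in "$@"; do
        "$cmd" "$target" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        # wait returns the exit status of that specific background job.
        wait "$pid" || failed=$((failed + 1))
    done
    echo "[traffic] $# jobs finished, $failed failed"
    [ "$failed" -eq 0 ]
}

# Stand-in for a real request; it only reports what it would do.
demo_request() {
    echo "[traffic] would request $1"
}

run_parallel demo_request rocky9-nginx-platform.test rocky9-apache-platform.test
```

Waiting on each PID individually, rather than calling a bare `wait`, is what makes per-job failures visible to the caller.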

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jacquin Antoine
2026-04-16 14:25:24 +02:00
parent f0c8fe81c6
commit 36b5065a0a
17 changed files with 674 additions and 924 deletions

@@ -28,3 +28,33 @@ for f in "$TMP_DIR"/*.sql; do
done
echo "[init] All SQL files executed — initialisation terminée"
# =============================================================================
# Load CSV stubs into the tables normally populated by external scripts
# (fetch_rules.py, update-csv-data.sh), which are not run in E2E.
# =============================================================================
# Load one CSV stub into $db.$table; a missing file is skipped silently.
load_csv_stub() {
    local csv_file="$1"
    local table="$2"
    local db="${3:-ja4_processing}"
    if [ -f "$csv_file" ]; then
        echo "[init] Loading $table from $(basename "$csv_file")"
        clickhouse-client --query="INSERT INTO $db.$table FORMAT CSVWithNames" < "$csv_file" \
            && echo "[init] OK" || echo "[init] FAILED (non-fatal)"
    fi
}
STUB_DIR="/var/lib/clickhouse/user_files"
load_csv_stub "$STUB_DIR/anubis_ip_rules.csv" "anubis_ip_rules"
load_csv_stub "$STUB_DIR/anubis_asn_rules.csv" "anubis_asn_rules"
load_csv_stub "$STUB_DIR/ref_bot_networks.csv" "ref_bot_networks"
load_csv_stub "$STUB_DIR/browser_h2_signatures.csv" "browser_h2_signatures"
# Reload the dictionaries that depend on the newly loaded data
echo "[init] Reloading Anubis and H2 dictionaries..."
clickhouse-client --query="SYSTEM RELOAD DICTIONARY ja4_processing.dict_anubis_ip" 2>/dev/null || true
clickhouse-client --query="SYSTEM RELOAD DICTIONARY ja4_processing.dict_anubis_asn" 2>/dev/null || true
clickhouse-client --query="SYSTEM RELOAD DICTIONARY ja4_processing.dict_browser_h2_signatures" 2>/dev/null || true
echo "[init] Dictionary reload complete"
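
The three reload commands above discard both stderr and the exit status, so a failed reload leaves no trace in the log. A hedged alternative, not part of this commit, is to loop over the dictionary names and report each result while staying non-fatal. `reload_dictionaries` and the `CH_CLIENT` override are hypothetical names introduced for this sketch; the `SYSTEM RELOAD DICTIONARY` statement is the one used in the script.

```shell
# Sketch only: reload dictionaries one by one and report failures.
# CH_CLIENT is a hypothetical override, handy for dry runs and tests.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

reload_dictionaries() {
    db="$1"
    shift
    failed=0
    for dict in "$@"; do
        if "$CH_CLIENT" --query="SYSTEM RELOAD DICTIONARY $db.$dict" 2>/dev/null; then
            echo "[init] reloaded $db.$dict"
        else
            echo "[init] reload FAILED for $db.$dict (non-fatal)"
            failed=$((failed + 1))
        fi
    done
    # Always succeeds, matching the "|| true" behaviour of the original lines.
    echo "[init] Dictionary reload complete ($failed failed)"
}

# Usage, with the dictionary names from the script:
# reload_dictionaries ja4_processing dict_anubis_ip dict_anubis_asn dict_browser_h2_signatures
```

The function never returns a non-zero status, so it can replace the three commands without changing how the init script exits.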