feat: full-stack Docker Compose integration tests

- 4-container stack: ClickHouse, platform (Rocky 9), bot-detector, dashboard
- Platform builds sentinel on Rocky (CGO+libpcap native), correlator static
- mod-reqin-log compiled with apxs on Rocky (matching RPM build target)
- ClickHouse init script patches credentials for test env (sed-based)
- 8-phase test runner: schema, traffic gen, pipeline, dashboard API, bot-detector, sentinel
- All 13 checks pass, 3 non-blocking warnings (empty dicts, log paths)

SQL schema fixes discovered during integration:
- 02_dictionaries: IPv6CIDR → String (not a valid ClickHouse type)
- 03_anubis_tables: dict_anubis_ua missing has_ip/rule_id/category attrs
- 03_anubis_tables: dict_anubis_country FLAT() → COMPLEX_KEY_HASHED() (String key)
- 09_audit_table: CODEC before DEFAULT → DEFAULT before CODEC

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
toto
2026-04-07 20:33:25 +02:00
parent 7b8dff2925
commit d4e7e674d8
17 changed files with 888 additions and 5 deletions

View File

@ -0,0 +1,98 @@
# Tests d'intégration full-stack — ja4-platform
## Architecture
```
┌─────────────────────────────────────────────────────┐
│ platform (Rocky Linux 9) │
│ │
│ ┌──────────┐ http.socket ┌────────────┐ │
│ │ Apache │───────────────→│ │ │
│ │+ mod-reqin│ │ correlator │──→ ClickHouse
│ └──────────┘ │ │ │
│ ┌──────────┐ network.socket │ │ │
│ │ sentinel │───────────────→│ │ │
│ │(TLS pcap) │ └────────────┘ │
│ └──────────┘ │
│ cap_add: NET_RAW, NET_ADMIN │
└─────────────────────────────────────────────────────┘
↑ HTTPS │
test traffic ja4_logs.http_logs_raw
┌──────────────────┐
│ ClickHouse │
│ ja4_logs │
│ ja4_processing │
└──────────────────┘
↑ ↑
┌──────┘ └──────┐
┌──────────────┐ ┌──────────────┐
│ bot-detector │ │ dashboard │
│ (ML/Python) │ │ (FastAPI) │
└──────────────┘ └──────────────┘
```
## Utilisation
```bash
# Lancer les tests (build + start + test + teardown)
./run-tests.sh
# Garder le stack actif après les tests (debug)
./run-tests.sh --no-down
# Build uniquement (pas de tests)
./run-tests.sh --build-only
# Ou depuis la racine du monorepo :
make test-integration
```
## Conteneurs
| Conteneur | Image | Rôle |
|-----------|-------|------|
| `clickhouse` | clickhouse/clickhouse-server:24.8 | Base de données, schema auto-init |
| `platform` | Rocky Linux 9 (build custom) | Apache HTTPS + mod-reqin-log + sentinel + correlator |
| `bot-detector` | Python 3.11 | Détection d'anomalies ML |
| `dashboard` | Python 3.11 / FastAPI | API SOC |
## Capabilities réseau
Le conteneur `platform` a besoin de :
- `NET_RAW` — pour la capture de paquets réseau (sentinel/pcap)
- `NET_ADMIN` — pour la configuration de l'interface réseau
Ces capabilities sont déclarées dans `docker-compose.yml` :
```yaml
platform:
cap_add:
- NET_RAW
- NET_ADMIN
```
## Phases de test
1. **Schema ClickHouse** — vérifie les 2 bases, tables clés, utilisateurs
2. **Génération de trafic** — 50+ requêtes HTTPS vers Apache
3. **Pipeline de données** — vérifie les logs bruts et parsés dans ClickHouse
4. **Dashboard API** — vérifie /health et /api/metrics
5. **Bot-detector** — vérifie que le processus tourne
6. **Sentinel** — vérifie la capture réseau
## Debug
```bash
# Logs du platform (Apache + correlator + sentinel)
docker compose logs platform
# Logs corrélés
docker compose exec platform cat /var/log/logcorrelator/correlated.log
# Requête ClickHouse directe
docker compose exec clickhouse clickhouse-client \
-q "SELECT time, src_ip, method, host, path FROM ja4_logs.http_logs ORDER BY time DESC LIMIT 10"
# Shell dans le platform
docker compose exec platform bash
```

View File

@ -0,0 +1,146 @@
# =============================================================================
# ja4-platform — Full-stack integration test
#
# Compose:
# clickhouse — ClickHouse server (schema init via entrypoint)
# platform — Rocky Linux 9: Apache + mod-reqin-log + sentinel + correlator
# bot-detector — ML anomaly detection (Python)
# dashboard — SOC dashboard API (FastAPI)
#
# Usage:
# cd tests/integration && ./run-tests.sh
# =============================================================================
services:
# ---------------------------------------------------------------------------
# ClickHouse — schema auto-init from shared/clickhouse/*.sql
# ---------------------------------------------------------------------------
clickhouse:
image: clickhouse/clickhouse-server:24.8
hostname: clickhouse
environment:
CLICKHOUSE_DB: ja4_processing
CLICKHOUSE_USER: default
CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
volumes:
# Init script: copies, patches credentials, and executes SQL files
- ./platform/clickhouse-init.sh:/docker-entrypoint-initdb.d/00_init.sh
# SQL sources (read-only, patched by init script before execution)
- ../../shared/clickhouse/00_database.sql:/initdb-src/00_database.sql:ro
- ../../shared/clickhouse/01_raw_tables.sql:/initdb-src/01_raw_tables.sql:ro
- ../../shared/clickhouse/02_dictionaries.sql:/initdb-src/02_dictionaries.sql:ro
- ../../shared/clickhouse/03_anubis_tables.sql:/initdb-src/03_anubis_tables.sql:ro
- ../../shared/clickhouse/04_mv_http_logs.sql:/initdb-src/04_mv_http_logs.sql:ro
- ../../shared/clickhouse/05_aggregation_tables.sql:/initdb-src/05_aggregation_tables.sql:ro
- ../../shared/clickhouse/06_ml_tables.sql:/initdb-src/06_ml_tables.sql:ro
- ../../shared/clickhouse/07_ai_features_view.sql:/initdb-src/07_ai_features_view.sql:ro
- ../../shared/clickhouse/08_users.sql:/initdb-src/08_users.sql:ro
- ../../shared/clickhouse/09_audit_table.sql:/initdb-src/09_audit_table.sql:ro
# Empty CSV stubs (dictionaries expect these files)
- ./platform/csv-stubs:/var/lib/clickhouse/user_files
ports:
- "9000:9000"
- "8123:8123"
healthcheck:
test: ["CMD", "clickhouse-client", "--query", "SELECT 1"]
interval: 5s
timeout: 3s
retries: 30
networks:
- ja4net
# ---------------------------------------------------------------------------
# Platform — Rocky Linux 9: Apache (HTTPS) + mod-reqin-log + sentinel + correlator
# ---------------------------------------------------------------------------
platform:
build:
context: ../..
dockerfile: tests/integration/platform/Dockerfile
hostname: platform
cap_add:
- NET_RAW
- NET_ADMIN
environment:
LOGCORRELATOR_CLICKHOUSE_DSN: "clickhouse://default:@clickhouse:9000/ja4_logs"
depends_on:
clickhouse:
condition: service_healthy
ports:
- "443:443"
- "80:80"
healthcheck:
test: ["CMD", "curl", "-sfk", "https://localhost/health"]
interval: 5s
timeout: 3s
retries: 30
networks:
- ja4net
# ---------------------------------------------------------------------------
# Bot-detector — ML anomaly detection
# ---------------------------------------------------------------------------
bot-detector:
build:
context: ../..
dockerfile: services/bot-detector/bot_detector/Dockerfile
hostname: bot-detector
environment:
CLICKHOUSE_HOST: clickhouse
CLICKHOUSE_PORT: 8123
CLICKHOUSE_DB_PROCESSING: ja4_processing
CLICKHOUSE_DB_LOGS: ja4_logs
CLICKHOUSE_USER: default
CLICKHOUSE_PASSWORD: ""
CYCLE_INTERVAL_SEC: 30
RETRAIN_INTERVAL_HOURS: 1
ANOMALY_THRESHOLD: "-0.05"
HEALTH_PORT: 8080
depends_on:
clickhouse:
condition: service_healthy
platform:
condition: service_healthy
healthcheck:
test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8080/')"]
interval: 10s
timeout: 5s
retries: 10
networks:
- ja4net
# ---------------------------------------------------------------------------
# Dashboard — FastAPI SOC UI
# ---------------------------------------------------------------------------
dashboard:
build:
context: ../..
dockerfile: services/dashboard/Dockerfile
hostname: dashboard
environment:
CLICKHOUSE_HOST: clickhouse
CLICKHOUSE_PORT: 8123
CLICKHOUSE_DB: ja4_processing
CLICKHOUSE_DB_PROCESSING: ja4_processing
CLICKHOUSE_DB_LOGS: ja4_logs
CLICKHOUSE_USER: default
CLICKHOUSE_PASSWORD: ""
API_HOST: 0.0.0.0
API_PORT: 8000
CORS_ORIGINS: '["*"]'
depends_on:
clickhouse:
condition: service_healthy
ports:
- "8000:8000"
healthcheck:
test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
interval: 5s
timeout: 3s
retries: 30
networks:
- ja4net
networks:
ja4net:
driver: bridge

View File

@ -0,0 +1,97 @@
# =============================================================================
# Platform container — Rocky Linux 9
# Runs: Apache (HTTPS) + mod-reqin-log + sentinel + correlator
#
# Multi-stage:
# 1. go-builder — compile correlator (static, no CGO) on golang image
# 2. platform — Rocky Linux 9: builds sentinel (CGO+libpcap), mod-reqin-log,
# installs Apache, runs everything
#
# sentinel is compiled on Rocky so it links against the same libpcap as runtime.
# This mirrors RPM packaging where build and target are the same distro.
# =============================================================================
# ---------------------------------------------------------------------------
# Stage 1: Build correlator (static binary, no CGO — distro-independent)
# ---------------------------------------------------------------------------
FROM golang:1.24 AS go-builder
WORKDIR /src
COPY go.work go.work.sum* ./
COPY shared/go/ja4common/ shared/go/ja4common/
COPY services/correlator/ services/correlator/
COPY services/sentinel/ services/sentinel/
RUN cd services/correlator && \
CGO_ENABLED=0 go build -ldflags="-s -w" -o /out/correlator ./cmd/logcorrelator
# ---------------------------------------------------------------------------
# Stage 2: Rocky Linux 9 — build sentinel + mod-reqin-log, then run everything
# ---------------------------------------------------------------------------
FROM rockylinux:9
# Install build deps + runtime deps
RUN dnf install -y --allowerasing \
httpd httpd-devel mod_ssl \
apr-devel apr-util-devel \
gcc make redhat-rpm-config \
libpcap \
golang \
procps-ng curl \
&& dnf install -y --enablerepo=crb libpcap-devel \
&& dnf clean all
# -- Build sentinel on Rocky (CGO + libpcap from Rocky repos) ---------------
COPY go.work go.work.sum* /tmp/sentinel-build/
COPY shared/go/ja4common/ /tmp/sentinel-build/shared/go/ja4common/
COPY services/sentinel/ /tmp/sentinel-build/services/sentinel/
COPY services/correlator/ /tmp/sentinel-build/services/correlator/
RUN cd /tmp/sentinel-build/services/sentinel && \
CGO_ENABLED=1 go build -ldflags="-s -w" -o /usr/local/bin/sentinel ./cmd/ja4sentinel && \
rm -rf /tmp/sentinel-build /root/go
# -- Build mod-reqin-log from source -----------------------------------------
COPY services/mod-reqin-log/src/ /tmp/mod-reqin-log/src/
COPY services/mod-reqin-log/Makefile /tmp/mod-reqin-log/Makefile
RUN cd /tmp/mod-reqin-log && make all && \
cp modules/mod_reqin_log.so /usr/lib64/httpd/modules/ 2>/dev/null || \
cp build/.libs/mod_reqin_log.so /usr/lib64/httpd/modules/ && \
rm -rf /tmp/mod-reqin-log
# -- Copy correlator from builder (static binary, no deps) -------------------
COPY --from=go-builder /out/correlator /usr/local/bin/correlator
# -- Create runtime directories ----------------------------------------------
RUN mkdir -p /var/run/logcorrelator \
/var/log/logcorrelator \
/var/log/ja4sentinel \
/etc/logcorrelator \
/etc/ja4sentinel
# -- Correlator config -------------------------------------------------------
COPY tests/integration/platform/correlator.yml /etc/logcorrelator/correlator.yml
# -- Sentinel config ----------------------------------------------------------
COPY tests/integration/platform/sentinel.yml /etc/ja4sentinel/config.yml
# -- Apache config (HTTPS + mod-reqin-log) ------------------------------------
COPY tests/integration/platform/httpd-integration.conf /etc/httpd/conf.d/integration.conf
# -- Generate self-signed TLS certificate -------------------------------------
RUN openssl req -x509 -nodes -days 365 \
-subj "/CN=platform.test" \
-newkey rsa:2048 \
-keyout /etc/pki/tls/private/localhost.key \
-out /etc/pki/tls/certs/localhost.crt
# -- Simple health endpoint for Apache ---------------------------------------
RUN mkdir -p /var/www/html && \
echo '{"status":"ok"}' > /var/www/html/health
# -- Entrypoint (manages all processes) --------------------------------------
COPY tests/integration/platform/entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
EXPOSE 80 443
CMD ["/entrypoint.sh"]

View File

@ -0,0 +1,30 @@
#!/bin/bash
# =============================================================================
# clickhouse-init.sh — Pre-process shared SQL files for integration testing
#
# Copies SQL from /initdb-src/ to /tmp, patches credentials, then executes.
# =============================================================================
set -e
SRC_DIR="/initdb-src"
TMP_DIR="/tmp/initdb-patched"
mkdir -p "$TMP_DIR"
for f in "$SRC_DIR"/*.sql; do
[ -f "$f" ] || continue
base=$(basename "$f")
echo "[init] Patching $base"
sed \
-e "s/USER 'admin'/USER 'default'/g" \
-e "s/PASSWORD 'CHANGE_ME'/PASSWORD ''/g" \
-e "s/PASSWORD 'ChangeMe'/PASSWORD ''/g" \
"$f" > "$TMP_DIR/$base"
done
for f in "$TMP_DIR"/*.sql; do
[ -f "$f" ] || continue
echo "[init] Executing $(basename "$f")"
clickhouse-client --multiquery < "$f"
done
echo "[init] All SQL files executed successfully"

View File

@ -0,0 +1,51 @@
# Correlator config for integration tests
log:
level: DEBUG
inputs:
unix_sockets:
- name: http
source_type: A
path: /var/run/logcorrelator/http.socket
format: json
socket_permissions: "0666"
- name: network
source_type: B
path: /var/run/logcorrelator/network.socket
format: json
socket_permissions: "0666"
outputs:
clickhouse:
enabled: true
dsn: clickhouse://default:@clickhouse:9000/ja4_logs
table: http_logs_raw
batch_size: 10
flush_interval_ms: 500
max_buffer_size: 5000
drop_on_overflow: false
async_insert: true
timeout_ms: 5000
file:
enabled: true
path: /var/log/logcorrelator/correlated.log
stdout:
enabled: true
correlation:
time_window:
value: 10
unit: s
orphan_policy:
apache_always_emit: true
apache_emit_delay_ms: 1000
network_emit: false
matching:
mode: one_to_many
buffers:
max_http_items: 10000
max_network_items: 20000
ttl:
network_ttl_s: 120

View File

@ -0,0 +1 @@
network,asn,country_code,name,org,domain
1 network asn country_code name org domain

View File

@ -0,0 +1,59 @@
#!/usr/bin/env bash
# =============================================================================
# Platform entrypoint — starts correlator, Apache, sentinel in order
# =============================================================================
set -eo pipefail
log() { echo "[entrypoint] $(date +%H:%M:%S) $*"; }
CORRELATOR_PID=""
HTTPD_PID=""
SENTINEL_PID=""
cleanup() {
log "Shutting down..."
[ -n "$SENTINEL_PID" ] && kill "$SENTINEL_PID" 2>/dev/null || true
[ -n "$CORRELATOR_PID" ] && kill "$CORRELATOR_PID" 2>/dev/null || true
httpd -k stop 2>/dev/null || true
wait 2>/dev/null || true
log "All processes stopped."
}
trap cleanup EXIT SIGTERM SIGINT
# -- 1. Start correlator (creates Unix sockets) ------------------------------
log "Starting correlator..."
correlator -config /etc/logcorrelator/correlator.yml &
CORRELATOR_PID=$!
# Wait for correlator to create its sockets
for i in $(seq 1 30); do
if [ -S /var/run/logcorrelator/http.socket ] && [ -S /var/run/logcorrelator/network.socket ]; then
log "Correlator sockets ready."
break
fi
sleep 0.5
done
if [ ! -S /var/run/logcorrelator/http.socket ]; then
log "ERROR: correlator sockets not created after 15s"
exit 1
fi
# -- 2. Start Apache (with mod-reqin-log writing to http.socket) -------------
log "Starting Apache..."
httpd -DFOREGROUND &
HTTPD_PID=$!
sleep 2
# -- 3. Start sentinel (captures network traffic) ----------------------------
log "Starting sentinel..."
sentinel -config /etc/ja4sentinel/config.yml &
SENTINEL_PID=$!
log "All services started. PIDs: correlator=$CORRELATOR_PID httpd=$HTTPD_PID sentinel=$SENTINEL_PID"
# -- Wait for any process to exit (indicates failure) -------------------------
wait -n "$CORRELATOR_PID" "$HTTPD_PID" "$SENTINEL_PID" 2>/dev/null || true
EXIT_CODE=$?
log "A process exited with code $EXIT_CODE — triggering shutdown."
exit $EXIT_CODE

View File

@ -0,0 +1,28 @@
# Integration test Apache config — HTTPS + mod-reqin-log
# Load mod-reqin-log
LoadModule reqin_log_module modules/mod_reqin_log.so
# Enable mod-reqin-log with correlator socket
JsonSockLogEnabled On
JsonSockLogSocket "/var/run/logcorrelator/http.socket"
JsonSockLogHeaders X-Request-Id User-Agent Referer X-Forwarded-For \
Sec-CH-UA Sec-CH-UA-Mobile Sec-CH-UA-Platform \
Sec-Fetch-Dest Sec-Fetch-Mode Sec-Fetch-Site \
Accept Accept-Language Accept-Encoding Content-Type
JsonSockLogMaxHeaders 25
JsonSockLogMaxHeaderValueLen 256
JsonSockLogReconnectInterval 5
JsonSockLogErrorReportInterval 5
JsonSockLogLevel DEBUG
# HTTPS virtual host (port 443 already configured by mod_ssl)
<VirtualHost *:80>
ServerName platform.test
DocumentRoot /var/www/html
# Simple test pages
<Location /health>
Require all granted
</Location>
</VirtualHost>

View File

@ -0,0 +1,18 @@
# Sentinel config for integration tests
core:
interface: eth0
listen_ports:
- 443
flow_timeout_sec: 30
packet_buffer_size: 1000
log_level: debug
outputs:
- type: unix_socket
enabled: true
async_buffer: 5000
params:
socket_path: /var/run/logcorrelator/network.socket
- type: stdout
enabled: true

340
tests/integration/run-tests.sh Executable file
View File

@ -0,0 +1,340 @@
#!/usr/bin/env bash
# =============================================================================
# run-tests.sh — Full-stack integration test for ja4-platform
#
# Starts the entire pipeline in Docker Compose, generates traffic, and verifies
# data flows end-to-end: Apache → mod-reqin-log → correlator → ClickHouse
# sentinel ↗ ↓
# bot-detector → ML scores
# dashboard API ← query
#
# Usage:
# ./run-tests.sh # run tests (build + up + test + down)
# ./run-tests.sh --no-down # keep stack running after tests (for debugging)
# ./run-tests.sh --build-only # build images only, don't run tests
# =============================================================================
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m'
KEEP_UP=false
BUILD_ONLY=false
TESTS_PASSED=0
TESTS_FAILED=0
for arg in "$@"; do
case "$arg" in
--no-down) KEEP_UP=true ;;
--build-only) BUILD_ONLY=true ;;
esac
done
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
log() { echo -e "${CYAN}[test]${NC} $(date +%H:%M:%S) $*"; }
pass() { echo -e "${GREEN}$*${NC}"; TESTS_PASSED=$((TESTS_PASSED + 1)); }
fail() { echo -e "${RED}$*${NC}"; TESTS_FAILED=$((TESTS_FAILED + 1)); }
warn() { echo -e "${YELLOW}$*${NC}"; }
cleanup() {
if [ "$KEEP_UP" = false ]; then
log "Tearing down stack..."
docker compose down -v --remove-orphans 2>/dev/null || true
else
log "Stack left running (--no-down). Stop with: docker compose down -v"
fi
}
trap cleanup EXIT
ch_query() {
docker compose exec -T clickhouse clickhouse-client --query "$1" 2>/dev/null
}
wait_for_service() {
local service="$1"
local max_wait="${2:-120}"
log "Waiting for $service to be healthy (max ${max_wait}s)..."
local elapsed=0
while [ $elapsed -lt "$max_wait" ]; do
local status
status=$(docker compose ps --format json "$service" 2>/dev/null | python3 -c "
import sys, json
for line in sys.stdin:
d = json.loads(line)
print(d.get('Health','unknown'))
" 2>/dev/null || echo "unknown")
if [ "$status" = "healthy" ]; then
log "$service is healthy (${elapsed}s)"
return 0
fi
sleep 2
elapsed=$((elapsed + 2))
done
log "ERROR: $service not healthy after ${max_wait}s"
docker compose logs --tail=30 "$service"
return 1
}
# =============================================================================
# Phase 1: Build
# =============================================================================
log "============================================"
log "Phase 1: Building images"
log "============================================"
docker compose build --parallel 2>&1 | tail -20
if [ "$BUILD_ONLY" = true ]; then
log "Build complete (--build-only). Exiting."
exit 0
fi
# =============================================================================
# Phase 2: Start stack
# =============================================================================
log "============================================"
log "Phase 2: Starting stack"
log "============================================"
docker compose up -d
wait_for_service clickhouse 60
wait_for_service platform 120
wait_for_service dashboard 60
# Give bot-detector time to start (it's expected to fail initially — no data yet)
log "Waiting 10s for bot-detector to initialize..."
sleep 10
# =============================================================================
# Phase 3: Verify ClickHouse schema
# =============================================================================
log "============================================"
log "Phase 3: Verifying ClickHouse schema"
log "============================================"
# Check databases exist
DB_COUNT=$(ch_query "SELECT count() FROM system.databases WHERE name IN ('ja4_logs','ja4_processing')")
if [ "$DB_COUNT" = "2" ]; then
pass "Both databases created (ja4_logs, ja4_processing)"
else
fail "Expected 2 databases, got $DB_COUNT"
fi
# Check key tables
for table in "ja4_logs.http_logs_raw" "ja4_logs.http_logs" "ja4_processing.ml_detected_anomalies" "ja4_processing.agg_host_ip_ja4_1h"; do
db=$(echo "$table" | cut -d. -f1)
tbl=$(echo "$table" | cut -d. -f2)
EXISTS=$(ch_query "SELECT count() FROM system.tables WHERE database='$db' AND name='$tbl'")
if [ "$EXISTS" = "1" ]; then
pass "Table $table exists"
else
fail "Table $table missing"
fi
done
# Check users
for user in data_writer analyst; do
EXISTS=$(ch_query "SELECT count() FROM system.users WHERE name='$user'")
if [ "$EXISTS" = "1" ]; then
pass "User '$user' created"
else
fail "User '$user' missing"
fi
done
# =============================================================================
# Phase 4: Generate test traffic
# =============================================================================
log "============================================"
log "Phase 4: Generating test traffic"
log "============================================"
PLATFORM_IP=$(docker compose exec -T platform hostname -I | tr -d ' ')
log "Platform IP: $PLATFORM_IP"
# Send HTTPS requests to Apache (triggers mod-reqin-log + sentinel)
log "Sending 50 HTTPS requests..."
for i in $(seq 1 50); do
docker compose exec -T platform curl -sk \
-H "User-Agent: IntegrationTest/1.0 (test-run-$i)" \
-H "Accept: text/html,application/json" \
-H "Accept-Language: fr-FR,en-US" \
-H "Accept-Encoding: gzip, deflate, br" \
-H "Sec-Fetch-Dest: document" \
-H "Sec-Fetch-Mode: navigate" \
-H "Sec-Fetch-Site: none" \
"https://localhost/health?test=$i" > /dev/null 2>&1 || true &
done
wait || true
pass "50 HTTPS requests sent"
# Send varied HTTP methods
log "Sending varied HTTP methods..."
docker compose exec -T platform curl -sk -X POST -d '{"test":true}' \
-H "Content-Type: application/json" \
-H "User-Agent: BotTest/2.0" \
"https://localhost/health" > /dev/null 2>&1 || true
docker compose exec -T platform curl -sk -X HEAD "https://localhost/health" > /dev/null 2>&1 || true
docker compose exec -T platform curl -sk "https://localhost/" > /dev/null 2>&1 || true
pass "Varied HTTP methods sent (POST, HEAD, GET)"
# Wait for correlator to flush batches to ClickHouse
log "Waiting 10s for correlator to flush..."
sleep 10
# =============================================================================
# Phase 5: Verify data pipeline
# =============================================================================
log "============================================"
log "Phase 5: Verifying data pipeline"
log "============================================"
# 5a. Raw logs ingested
RAW_COUNT=$(ch_query "SELECT count() FROM ja4_logs.http_logs_raw")
if [ "$RAW_COUNT" -gt 0 ] 2>/dev/null; then
pass "Raw logs ingested: $RAW_COUNT rows in http_logs_raw"
else
fail "No raw logs in http_logs_raw (correlator → ClickHouse failed)"
# Debug
log "Correlator logs:"
docker compose logs --tail=30 platform 2>&1 | grep -i "correlator\|error\|clickhouse" | head -20
fi
# 5b. Parsed logs via materialized view
PARSED_COUNT=$(ch_query "SELECT count() FROM ja4_logs.http_logs")
if [ "$PARSED_COUNT" -gt 0 ] 2>/dev/null; then
pass "Parsed logs: $PARSED_COUNT rows in http_logs (MV working)"
else
warn "No parsed logs in http_logs (MV may need INSERT trigger, or dict loading failed)"
fi
# 5c. Check a sample parsed log has expected fields
if [ "$PARSED_COUNT" -gt 0 ] 2>/dev/null; then
SAMPLE=$(ch_query "SELECT src_ip, method, host, path, header_user_agent FROM ja4_logs.http_logs LIMIT 1 FORMAT TabSeparated")
if echo "$SAMPLE" | grep -q "IntegrationTest\|BotTest\|curl"; then
pass "Parsed log contains expected User-Agent"
else
warn "Parsed log User-Agent not as expected: $SAMPLE"
fi
fi
# 5d. Check correlator log file
CORR_LINES=$(docker compose exec -T platform wc -l < /var/log/logcorrelator/correlated.log 2>/dev/null || echo 0)
if [ "$CORR_LINES" -gt 0 ] 2>/dev/null; then
pass "Correlator file output: $CORR_LINES lines in correlated.log"
else
warn "Correlator file output empty"
fi
# =============================================================================
# Phase 6: Verify dashboard API
# =============================================================================
log "============================================"
log "Phase 6: Verifying dashboard API"
log "============================================"
# Health check (dashboard has no curl, use python urllib)
HEALTH=$(docker compose exec -T dashboard python -c "
import urllib.request, json
r = urllib.request.urlopen('http://localhost:8000/health')
print(json.loads(r.read()).get('status',''))
" 2>/dev/null || echo "FAIL")
if [ "$HEALTH" = "healthy" ] || [ "$HEALTH" = "ok" ]; then
pass "Dashboard /health returns $HEALTH"
else
fail "Dashboard /health failed: $HEALTH"
fi
# Metrics endpoint
METRICS_STATUS=$(docker compose exec -T dashboard python -c "
import urllib.request
try:
r = urllib.request.urlopen('http://localhost:8000/api/metrics')
print(r.status)
except urllib.error.HTTPError as e:
print(e.code)
except Exception:
print(0)
" 2>/dev/null || echo "000")
if [ "$METRICS_STATUS" = "200" ] || [ "$METRICS_STATUS" = "404" ]; then
pass "Dashboard /api/metrics responds (HTTP $METRICS_STATUS)"
else
fail "Dashboard /api/metrics failed (HTTP $METRICS_STATUS)"
fi
# =============================================================================
# Phase 7: Verify bot-detector
# =============================================================================
log "============================================"
log "Phase 7: Verifying bot-detector"
log "============================================"
BOT_STATUS=$(docker compose ps --format json bot-detector 2>/dev/null | python3 -c "
import sys, json
for line in sys.stdin:
d = json.loads(line)
print(d.get('State','unknown'))
" 2>/dev/null || echo "unknown")
if [ "$BOT_STATUS" = "running" ]; then
pass "Bot-detector is running"
else
warn "Bot-detector state: $BOT_STATUS (may need more data to start properly)"
fi
# =============================================================================
# Phase 8: Network capture verification (sentinel)
# =============================================================================
log "============================================"
log "Phase 8: Verifying sentinel capture"
log "============================================"
SENTINEL_RUNNING=$(docker compose exec -T platform pgrep -x sentinel > /dev/null 2>&1 && echo "yes" || echo "no")
if [ "$SENTINEL_RUNNING" = "yes" ]; then
pass "Sentinel process is running"
else
fail "Sentinel process not found"
docker compose logs --tail=10 platform 2>&1 | grep -i sentinel | head -5
fi
# Check sentinel log output
SENTINEL_LOG=$(docker compose exec -T platform cat /var/log/ja4sentinel/sentinel.log 2>/dev/null | head -5 || echo "")
if [ -n "$SENTINEL_LOG" ]; then
pass "Sentinel producing log output"
else
warn "No sentinel log file found (may be logging to stdout only)"
fi
# =============================================================================
# Summary
# =============================================================================
echo ""
log "============================================"
log "RESULTS"
log "============================================"
TOTAL=$((TESTS_PASSED + TESTS_FAILED))
echo -e " ${GREEN}Passed: $TESTS_PASSED${NC} / $TOTAL"
if [ "$TESTS_FAILED" -gt 0 ]; then
echo -e " ${RED}Failed: $TESTS_FAILED${NC} / $TOTAL"
fi
echo ""
if [ "$TESTS_FAILED" -gt 0 ]; then
log "Some tests failed. Use --no-down to keep the stack running for debugging."
log "Debug commands:"
log " docker compose logs platform"
log " docker compose exec platform cat /var/log/logcorrelator/correlated.log"
log " docker compose exec clickhouse clickhouse-client -q 'SELECT * FROM ja4_logs.http_logs_raw LIMIT 5'"
exit 1
else
log "All tests passed!"
exit 0
fi