8180f4af04
refactor(anubis): simplify to IP/CIDR + ASN only, remove UA and Country rules
...
- Remove UA regex extraction (extract_ua_regex, _extract_ua_from_all/any)
- Remove Country rule collection from parse_bot_policies_inline
- Simplify fetch_rules.py: collect_all_rules returns (ip_rules, asn_rules)
- Remove insert_ua_rules and insert_country_rules functions
- reload_dicts now only reloads dict_anubis_ip + dict_anubis_asn
- Simplify CASE blocks in 04_mv_http_logs.sql, 07_ai_features_view.sql,
view_ai_features_anubis.sql, mv_http_logs.sql: IP > ASN (was 5-level
UA+IP > UA > IP > ASN > Country cascade)
- Remove dict_anubis_country + dict_anubis_ua from 03_anubis_tables.sql
(UA table kept as stub for REGEXP_TREE catch-all compatibility)
- Remove anubis_country_rules table from schema
- Remove Anubis UA and Country tabs from dashboard reflists page
- Remove anubis_ua_rules/country_rules from API reflist queries
- deploy_schema.sql simplified from 339 to 122 lines
- 764 lines removed across 9 files
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com >
2026-04-09 15:25:33 +02:00
d4e7e674d8
feat: full-stack Docker Compose integration tests
...
- 4-container stack: ClickHouse, platform (Rocky 9), bot-detector, dashboard
- Platform builds sentinel on Rocky (CGO+libpcap native), correlator static
- mod-reqin-log compiled with apxs on Rocky (matching RPM build target)
- ClickHouse init script patches credentials for test env (sed-based)
- 8-phase test runner: schema, traffic gen, pipeline, dashboard API, bot-detector, sentinel
- All 13 checks pass, 3 non-blocking warnings (empty dicts, log paths)
SQL schema fixes discovered during integration:
- 02_dictionaries: IPv6CIDR → String (not a valid ClickHouse type)
- 03_anubis_tables: dict_anubis_ua missing has_ip/rule_id/category attrs
- 03_anubis_tables: dict_anubis_country FLAT() → COMPLEX_KEY_HASHED() (String key)
- 09_audit_table: CODEC before DEFAULT → DEFAULT before CODEC
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com >
2026-04-07 20:33:25 +02:00
9f3e0621e5
feat: split ClickHouse into dual configurable databases (ja4_logs / ja4_processing)
...
Architecture:
- ja4_logs: raw log ingestion (http_logs_raw, http_logs, mv_http_logs)
- ja4_processing: analytics, aggregation, ML, dictionaries, audit
Configuration (env vars):
- CLICKHOUSE_DB_LOGS (default: ja4_logs)
- CLICKHOUSE_DB_PROCESSING (default: ja4_processing)
Changes:
- SQL migrations (10 files): all mabase_prod refs → ja4_logs or ja4_processing
with correct cross-database references (MVs, views, dicts)
- deploy_schema.sh: substitutes DB names from env vars at deploy time
- Python shared settings: added CLICKHOUSE_DB_LOGS + CLICKHOUSE_DB_PROCESSING
- Dashboard routes (19 files): replaced ~80 hardcoded mabase_prod refs
with settings.CLICKHOUSE_DB_LOGS / settings.CLICKHOUSE_DB_PROCESSING
- Bot-detector: DB → CLICKHOUSE_DB_PROCESSING, fetch_rules.py configurable
- Correlator: DSN example updated to ja4_logs
- Docker-compose + .env files: new env vars with defaults
- All documentation updated (14 markdown files)
All tests pass: sentinel 10/10, correlator 67.1%, bot-detector 11, dashboard 20, ja4_common 18
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com >
2026-04-07 19:10:35 +02:00
d469e39da7
feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized
...
Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)
Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)
Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets
Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)
Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)
Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com >
2026-04-07 16:42:59 +02:00