feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized

Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
toto
2026-04-07 16:42:59 +02:00
commit d469e39da7
278 changed files with 1621301 additions and 0 deletions

docs/development.md

@@ -0,0 +1,246 @@
# Development Guide
This guide covers building, testing, packaging, and extending the ja4-platform monorepo. All build and test operations run inside Docker — no native Go, Python, or C toolchains are required on the host.
## Prerequisites
| Requirement | Minimum Version | Notes |
|-------------|----------------|-------|
| Docker | 20.10+ | BuildKit enabled (`DOCKER_BUILDKIT=1`) |
| Docker Compose | 2.x | For bot-detector and dashboard |
| make | 3.81+ | GNU Make |
| git | 2.x | For version tagging |
No Go, Python, or C compilers are needed on the host machine.
## Building All Services
```bash
make build-all
```
This builds Docker images for:
- `ja4-platform/sentinel:latest`
- `ja4-platform/correlator:latest`
- `ja4-platform/bot-detector:latest`
- `ja4-platform/dashboard:latest`
mod-reqin-log is an Apache module with no Docker image of its own, so `make build-all` does not build it. It is built as part of the RPM packaging process (`make rpm-mod-reqin-log`).
### Building Individual Services
```bash
make build-sentinel # Go binary in Docker
make build-correlator # Go binary in Docker
make build-bot-detector # Python image
make build-dashboard # FastAPI + React image
```
## Running Tests
```bash
make test-all
```
### Per-Service Testing
| Service | Command | Details |
|---------|---------|---------|
| sentinel | `make test-sentinel` | Go tests with the `-race` flag; the test container requires the `NET_RAW` and `NET_ADMIN` capabilities |
| correlator | `make test-correlator` | Go tests; the run fails if coverage falls below the 80% gate |
| mod-reqin-log | `make test-mod-reqin-log` | C unit tests (JSON serialization, config parsing, header handling) |
| bot-detector | `make test-bot-detector` | Python pytest suite |
| dashboard | `make test-dashboard` | Python pytest for FastAPI routes |
| ja4_common (Python) | `make test-ja4common-python` | Shared Python library tests |
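A coverage gate such as the one on the correlator can be pictured as a check on the `total:` line printed by `go tool cover -func`. The following is a minimal sketch under that assumption, not the repository's actual gate: the helper name `check_coverage`, the sample package path, and the sample figures are all made up for illustration.

```python
import re
from typing import Tuple


# Hypothetical helper: not taken from the repository's Makefile or Dockerfiles.
def check_coverage(report: str, threshold: float = 80.0) -> Tuple[float, bool]:
    """Return (total_percent, passed) parsed from `go tool cover -func` output."""
    # The summary line looks like: "total:   (statements)   78.4%"
    match = re.search(r"^total:\s+\(statements\)\s+([0-9.]+)%", report, re.MULTILINE)
    if match is None:
        raise ValueError("no 'total:' line found in coverage report")
    total = float(match.group(1))
    return total, total >= threshold


# Illustrative report text (made-up package path and percentages).
sample = (
    "example.com/app/main.go:10:\tRun\t\t83.3%\n"
    "total:\t\t\t\t(statements)\t78.4%\n"
)
total, passed = check_coverage(sample)
print(f"total={total}% passed={passed}")
```

A CI step built on this idea would exit non-zero when `passed` is false, which is what makes the gate "enforced".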
## Building RPM Packages
```bash
make rpm-all
```
This builds RPMs for sentinel, correlator, and mod-reqin-log, targeting Rocky Linux 8, 9, and 10. Each service also has its own target:
```bash
make rpm-sentinel # → services/sentinel/dist/rpm/
make rpm-correlator # → services/correlator/dist/rpm/
make rpm-mod-reqin-log # → services/mod-reqin-log/dist/rpm/
```
Each RPM build uses a multi-stage Docker pipeline:
1. Builder stage compiles the binary (Go) or shared object (C)
2. RPM builder stage runs `rpmbuild` for each target distro (el8, el9, el10)
3. Output stage copies RPMs to the host via `--output type=local`
### Distribution Packages
```bash
make dist # Alias for rpm-all
# RPMs in services/<service>/dist/rpm/el{8,9,10}/
```
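To check that a `make dist` run produced output for every service and distro, the path pattern above can be expanded programmatically. This is a small sketch, assuming the directory layout shown in this guide; the helper names `expected_rpm_dirs` and `missing_rpms` are hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

SERVICES = ("sentinel", "correlator", "mod-reqin-log")  # services with RPM targets
DISTROS = ("el8", "el9", "el10")                        # target distro directories


def expected_rpm_dirs(repo_root: Path) -> List[Path]:
    """Expand services/<service>/dist/rpm/el{8,9,10}/ into concrete paths."""
    return [
        repo_root / "services" / service / "dist" / "rpm" / distro
        for service in SERVICES
        for distro in DISTROS
    ]


def missing_rpms(repo_root: Path) -> List[Path]:
    """Return the expected directories that contain no .rpm file."""
    return [d for d in expected_rpm_dirs(repo_root) if not any(d.glob("*.rpm"))]


if __name__ == "__main__":
    # Run from the repository root after `make dist`.
    for directory in missing_rpms(Path(".")):
        print(f"no RPM found in {directory}")
```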
## Local Development Workflow
### Go Services (sentinel, correlator)
The `go.work` workspace links Go modules:
```
go 1.24.6
use (
	./services/sentinel
	./services/correlator
	./shared/go/ja4common
)
```
If you have Go 1.24.6 or later installed locally (the version required by sentinel), you can develop without Docker:
```bash
# Run each command from the repository root; the subshells leave the working directory unchanged.
# Run sentinel tests locally
(cd services/sentinel && go test ./... -race -v)
# Run correlator tests locally
(cd services/correlator && go test ./... -race -cover -v)
# Build sentinel binary locally (requires libpcap-dev)
(cd services/sentinel && go build -o ja4sentinel ./cmd/ja4sentinel/)
```
### Python Services (bot-detector, dashboard)
```bash
# Run each command from the repository root; the subshells leave the working directory unchanged.
# Install shared library in development mode
(cd shared/python/ja4_common && pip install -e .)
# Run bot-detector locally
(cd services/bot-detector && pip install -r bot_detector/requirements.txt && python -m bot_detector.bot_detector)
# Run dashboard locally
(cd services/dashboard && pip install -r backend/requirements.txt && uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000)
```
### C Module (mod-reqin-log)
Requires `apxs` (Apache extension tool) and development headers:
```bash
cd services/mod-reqin-log
make build # Compiles mod_reqin_log.so
make test # Runs unit tests
make rpm # Builds RPM packages
```
## Adding a New Service
### Go Service
1. Create the service directory:
```bash
mkdir -p services/my-service/cmd/my-service
mkdir -p services/my-service/internal
```
2. Initialize the Go module:
```bash
cd services/my-service
go mod init github.com/antitbone/ja4/my-service
```
3. Add to `go.work`:
```
use (
	./services/sentinel
	./services/correlator
	./services/my-service // ← add this
	./shared/go/ja4common
)
```
4. Import the shared library:
```go
import (
	"github.com/antitbone/ja4/ja4common/config"
	"github.com/antitbone/ja4/ja4common/logger"
	"github.com/antitbone/ja4/ja4common/shutdown"
)
```
5. Add Makefile targets:
```makefile
build-my-service:
	docker build -f services/my-service/Dockerfile -t ja4-platform/my-service:latest .
test-my-service:
	docker build -f services/my-service/Dockerfile.dev -t ja4-platform/my-service-tests:latest .
	docker run --rm ja4-platform/my-service-tests:latest
```
6. Add the new targets to the prerequisites of `build-all` and `test-all`.
### Python Service
1. Create the service directory with a `requirements.txt` or `pyproject.toml`.
2. Add `ja4-common` as a dependency (installed from `shared/python/ja4_common`).
3. Use `from ja4_common.clickhouse import get_client` for ClickHouse access.
4. Add Makefile targets following the bot-detector/dashboard pattern.
## go.work Workspace
The `go.work` file at the repository root links all Go modules, allowing cross-module development without publishing:
```
go 1.24.6
use (
	./services/sentinel
	./services/correlator
	./shared/go/ja4common
)
```
When adding a new Go module:
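1. Run `go mod init` in the service directory
2. Add the path to `go.work`, either by hand or with `go work use ./services/<service>`
3. Reference shared packages via their module path: `github.com/antitbone/ja4/ja4common/...`
4. Run `go work sync` to propagate the workspace's resolved dependency versions back to each module's `go.mod`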
## ja4_common Python Package
The shared Python package (`shared/python/ja4_common`) provides:
- `ClickHouseSettings` — pydantic-settings model reading from `.env`
- `ClickHouseClient` — singleton client with auto-reconnect
- `get_client()` — module-level singleton accessor
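The module-level singleton accessor pattern named in the list above can be sketched as follows. This is an illustration of the pattern only, not the package's code: `Settings` and `Client` are stdlib stand-ins for `ClickHouseSettings` and `ClickHouseClient`, and the environment variable names `CLICKHOUSE_HOST` and `CLICKHOUSE_PORT` are assumptions.

```python
import os
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Settings:
    """Stand-in for ClickHouseSettings (the real one is a pydantic-settings model)."""
    # Environment variable names are assumptions, not the package's real ones.
    host: str = field(default_factory=lambda: os.environ.get("CLICKHOUSE_HOST", "localhost"))
    port: int = field(default_factory=lambda: int(os.environ.get("CLICKHOUSE_PORT", "8123")))


class Client:
    """Stand-in for ClickHouseClient; it only records the settings it was built with."""

    def __init__(self, settings: Settings) -> None:
        self.settings = settings


_client: Optional[Client] = None
_lock = threading.Lock()


def get_client() -> Client:
    """Create the client on first use, then return the same instance every time."""
    global _client
    if _client is None:
        with _lock:
            if _client is None:  # re-check after acquiring the lock
                _client = Client(Settings())
    return _client
```

Because the instance is created lazily, importing the module has no side effects; the settings are read from the environment only on the first `get_client()` call.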
### Extending ja4_common
1. Add new modules under `shared/python/ja4_common/ja4_common/`
2. Export them in `__init__.py`
3. Add dependencies to `pyproject.toml`
4. Run tests: `make test-ja4common-python`
### Using in a New Service
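Add to `requirements.txt` (the `file:///app/...` URL only resolves where the library has been copied to `/app/shared/python/ja4_common`, for example inside a container):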
```
ja4-common @ file:///app/shared/python/ja4_common
```
Alternatively, in a Dockerfile, copy the shared library into the image and install it directly:
```dockerfile
COPY shared/python/ja4_common /app/shared/python/ja4_common
RUN pip install /app/shared/python/ja4_common
```
## Environment Variables
Each service reads configuration from environment variables and/or YAML config files. See individual service documentation for the full reference:
- [Sentinel configuration](services/sentinel.md#configuration-reference)
- [Correlator configuration](services/correlator.md#configuration-reference)
- [Bot Detector configuration](services/bot-detector.md#environment-variables)
- [Dashboard configuration](services/dashboard.md#configuration)