Atrax

Atrax is the web scanner: it crawls customer domains to discover cookies, scripts, and tracking technologies. It consists of a controller API (atrax-api) and headless browser worker nodes (atrax-node).

Architecture

flowchart LR
    API[atrax-api] -->|dispatches| Node[atrax-node]
    Node -->|browser| Site[Customer site]
    Node -->|results| API
    API --> S3[(S3 screenshots)]
    CoreAPI[core-api] -->|POST /scans| API

| Component | Purpose |
|---|---|
| atrax-api | Node.js controller. Receives scan requests, dispatches browser jobs, stores results. |
| atrax-node | Headless browser worker. Crawls sites, captures cookies/scripts/screenshots. |
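
To make the core-api to atrax-api edge of the diagram concrete, here is a minimal Python sketch that builds (but does not send) a `POST /scans` request. The `/scans` path is taken from the diagram. The JSON body with a single `domain` field, the use of a bearer token on this endpoint, and the helper name `build_scan_request` are assumptions for illustration; the real request schema is not documented on this page.

```python
import json
import urllib.request


def build_scan_request(base_url: str, token: str, domain: str) -> urllib.request.Request:
    """Build (but do not send) a POST /scans request for atrax-api.

    Assumptions: the body is JSON with a single "domain" field and the
    endpoint accepts a bearer token. Neither is confirmed by this page.
    """
    body = json.dumps({"domain": domain}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/scans",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder token and domain; the base URL is the stage URL from this page.
req = build_scan_request("https://atrax-api.stage.cookiehub.net", "example-token", "example.com")
print(req.get_method(), req.full_url)
# POST https://atrax-api.stage.cookiehub.net/scans
```

Building the request separately from sending it keeps the sketch free of side effects; passing `req` to `urllib.request.urlopen` would perform the actual call.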

Stage Deployment

| Component | Details |
|---|---|
| URL | https://atrax-api.stage.cookiehub.net |
| ECS cluster | stage-euc1-atrax-ecs-cluster (t3.medium) |
| atrax-api | ECS service, 512 CPU / 512 MB |
| atrax-node | ECS service, 1024 CPU / 1024 MB (no exposed port) |
| ECR repos | atrax-api, atrax-node |
| Database | PostgreSQL (via database_url SSM parameter) |
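
As a rough sizing check, the sketch below works out how much of a single t3.medium instance is left after placing the stage tasks. It assumes the CPU figures above are ECS CPU units (1024 units per vCPU), that memory is in MB, and that a t3.medium offers its nominal 2 vCPUs and 4 GiB; in practice ECS registers slightly less memory because the OS and agent reserve some. The number of instances in the cluster is not stated here, so this covers one instance only.

```python
from __future__ import annotations

# ECS expresses CPU in units, where 1024 units equal one vCPU.
CPU_UNITS_PER_VCPU = 1024

# t3.medium nominal capacity: 2 vCPUs and 4 GiB of memory.
INSTANCE_CPU = 2 * CPU_UNITS_PER_VCPU
INSTANCE_MEMORY_MB = 4 * 1024

# Task sizes from the stage details: (CPU units, memory in MB).
TASKS = {"atrax-api": (512, 512), "atrax-node": (1024, 1024)}


def remaining_capacity(placed: dict[str, int]) -> tuple[int, int]:
    """Return (CPU units, memory MB) left on one instance after placing tasks."""
    cpu = INSTANCE_CPU - sum(TASKS[name][0] * n for name, n in placed.items())
    mem = INSTANCE_MEMORY_MB - sum(TASKS[name][1] * n for name, n in placed.items())
    if cpu < 0 or mem < 0:
        raise ValueError("tasks do not fit on a single instance")
    return cpu, mem


print(remaining_capacity({"atrax-api": 1, "atrax-node": 1}))  # (512, 2560)
```

Under these assumptions CPU is the limiting resource: one atrax-api task plus one atrax-node task leaves 512 CPU units, which is not enough for a second atrax-node task on the same instance.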

Production

Migration in progress

Production Atrax currently runs on a separate setup that is not managed by Terraform. The plan is to migrate production to the ECS architecture used in stage.

Secrets

Stored in SSM under /atrax/{env}/atrax-api/ and /atrax/{env}/atrax-node/:

| Parameter | Component | Description |
|---|---|---|
| database_url | atrax-api | PostgreSQL connection string |
| controller_auth | atrax-api | Bearer tokens for authentication |
| s3_access_key_id | atrax-api | S3 credentials for screenshots |
| s3_secret_access_key | atrax-api | S3 credentials for screenshots |
| controller_url | atrax-node | URL of the atrax-api controller |
| controller_auth | atrax-node | Bearer token for controller auth |
| bugsnag_api_key | atrax-node | Error tracking |
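
The naming scheme above can be sketched as a small Python helper that builds the full SSM parameter path and rejects names that are not listed for a component. The helper name `ssm_parameter_path` is hypothetical, and `stage` as the literal value of `{env}` is an assumption based on the stage URL and cluster name.

```python
# Parameters listed on this page, grouped by component.
PARAMETERS = {
    "atrax-api": [
        "database_url",
        "controller_auth",
        "s3_access_key_id",
        "s3_secret_access_key",
    ],
    "atrax-node": [
        "controller_url",
        "controller_auth",
        "bugsnag_api_key",
    ],
}


def ssm_parameter_path(env: str, component: str, name: str) -> str:
    """Return the SSM path /atrax/{env}/{component}/{name}.

    Raises KeyError if the parameter is not listed for that component.
    """
    if name not in PARAMETERS.get(component, ()):
        raise KeyError(f"{name!r} is not a known parameter for {component!r}")
    return f"/atrax/{env}/{component}/{name}"


# "stage" is an assumed value for {env}.
print(ssm_parameter_path("stage", "atrax-node", "controller_url"))
# /atrax/stage/atrax-node/controller_url
```

The resulting path could then be passed to an SSM client, for example boto3's `get_parameter(Name=path, WithDecryption=True)`, to read the value.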

Dependencies

  • PostgreSQL RDS — scan job and result storage
  • S3 — screenshot and scan artifact storage
  • Core API — triggers scans via POST /domains/:id/scans