Ingestion Edge Service
A simple service for delivering HTTP messages to Google Cloud PubSub
The source code lives in the ingestion-edge subdirectory of the gcp-ingestion repository.
Building
We assume that you have docker-compose installed.
From inside the ingestion-edge subdirectory:
# docker-compose
docker-compose build
# pytest
bin/build
Running
Use docker-compose to run a local development server that auto-detects changes:
# run the web server and PubSub emulator
docker-compose up --detach web
# manually check the server
curl http://localhost:8000/__version__
curl http://localhost:8000/__heartbeat__
curl http://localhost:8000/__lbheartbeat__
curl http://localhost:8000/submit/test -d "test"
# check web logs
docker-compose logs web
# clean up docker-compose environment
docker-compose down --timeout 0
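The manual curl checks above can also be scripted. The following is a minimal sketch using only the Python standard library, assuming the development server is listening on localhost:8000 as in the commands above. The helper names build_submit_request and check_server are hypothetical and are not part of the repository.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed: same address as the curl commands
HEALTH_PATHS = ["/__version__", "/__heartbeat__", "/__lbheartbeat__"]


def build_submit_request(path, body, base_url=BASE_URL):
    """Build the equivalent of `curl <base_url><path> -d <body>`."""
    # Supplying data makes urllib send a POST, as curl -d does.
    return urllib.request.Request(base_url + path, data=body)


def check_server(base_url=BASE_URL):
    """Return the HTTP status of each endpoint; needs a running server."""
    statuses = {}
    for path in HEALTH_PATHS:
        with urllib.request.urlopen(base_url + path) as response:
            statuses[path] = response.status
    request = build_submit_request("/submit/test", b"test", base_url)
    with urllib.request.urlopen(request) as response:
        statuses["/submit/test"] = response.status
    return statuses
```

Calling check_server() after docker-compose up returns a mapping of path to status code. urllib raises urllib.error.HTTPError for non-2xx responses, so a failing endpoint surfaces as an exception rather than as a value in the mapping.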
Configuration
The ingestion-edge docker container accepts these configuration options from environment variables:
- ROUTE_TABLE: a JSON list of mappings from uri to PubSub topic; uri matches are detected in order; defaults to []; each mapping is a list and may include an optional third element that specifies a list of allowed methods instead of the default ["POST","PUT"]
- QUEUE_PATH: a filesystem path to a directory where a SQLite database will be created to store requests for when PubSub is unavailable; paths may be relative to the docker container WORKDIR; defaults to queue
- MINIMUM_DISK_FREE_BYTES: an integer indicating the threshold of free bytes on the filesystem where QUEUE_PATH is mounted, below which /__heartbeat__ will fail; defaults to 0, which disables the check
- METADATA_HEADERS: a JSON list of headers to preserve as PubSub message attributes; defaults to ["Content-Length", "Date", "DNT", "User-Agent", "X-Forwarded-For", "X-Pingsender-Version", "X-Pipeline-Proxy", "X-Debug-ID", "X-Telemetry-Agent", "X-Source-Tags", "X-Foxsec-IP-Reputation", "X-LB-Tags"]; the message attribute name will be the header name in lowercase and with - converted to _
- PUBLISH_TIMEOUT_SECONDS: a float indicating the maximum number of seconds to wait for the PubSub client to complete a publish operation; defaults to 1 second and may require tuning
- FLUSH_CONCURRENT_MESSAGES: an integer indicating the number of messages per worker that may be read from the queue before waiting on publish results; defaults to 1000 messages based on publish request limits and may require tuning
- FLUSH_CONCURRENT_BYTES: an integer indicating the number of bytes per worker that may be read from the queue before waiting on publish results, which may be exceeded by one message and measures data bytes rather than serialized message size; defaults to 10MB based on publish request limits and may require tuning
- FLUSH_SLEEP_SECONDS: a float indicating the number of seconds waited between flush attempts; defaults to 1 second and may require tuning
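To make the option formats concrete, here is a minimal sketch of how these environment variables could be parsed with the documented defaults. It is an illustration only and not the service's actual configuration code. The function names are hypothetical, and interpreting 10MB as 10,000,000 bytes is an assumption.

```python
import json
import os

DEFAULT_METHODS = ["POST", "PUT"]


def load_route_table(environ=os.environ):
    """Parse ROUTE_TABLE into (uri, topic, methods) tuples, keeping order."""
    routes = []
    for entry in json.loads(environ.get("ROUTE_TABLE", "[]")):
        uri, topic, *rest = entry
        # An optional third element overrides the default allowed methods.
        methods = rest[0] if rest else DEFAULT_METHODS
        routes.append((uri, topic, list(methods)))
    return routes


def attribute_name(header):
    """Map a header name to its attribute name: lowercase, - becomes _."""
    return header.lower().replace("-", "_")


def load_settings(environ=os.environ):
    """Read the scalar options, falling back to the documented defaults."""
    return {
        "queue_path": environ.get("QUEUE_PATH", "queue"),
        "minimum_disk_free_bytes": int(environ.get("MINIMUM_DISK_FREE_BYTES", 0)),
        "publish_timeout_seconds": float(environ.get("PUBLISH_TIMEOUT_SECONDS", 1)),
        "flush_concurrent_messages": int(environ.get("FLUSH_CONCURRENT_MESSAGES", 1000)),
        # Assumption: "10MB" is taken to mean 10,000,000 bytes.
        "flush_concurrent_bytes": int(environ.get("FLUSH_CONCURRENT_BYTES", 10_000_000)),
        "flush_sleep_seconds": float(environ.get("FLUSH_SLEEP_SECONDS", 1)),
    }
```

For example, attribute_name("X-Forwarded-For") gives x_forwarded_for, and a ROUTE_TABLE entry with only two elements picks up the default methods POST and PUT.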
Testing
Run tests with CircleCI Local CLI, docker-compose, or pytest wrappers
# circleci
(cd .. && circleci build --job ingestion-edge)
# docker-compose
docker-compose run --rm test
# pytest wrapper (pytest-all calls lint and pytest)
./bin/pytest-all
The pytest wrappers add these options via the environment:
- CLEAN_RELOCATES controls whether bin/lint and bin/pytest will remove .pyc files not in venv/ that do not contain $PWD to prevent errors when switching between running in and out of docker, defaults to true
- VENV controls whether to use a python venv in venv/$(uname) in bin/lint and bin/pytest, and in bin/build to create and use that venv, defaults to false in Dockerfile and true otherwise
- REQUIRE_CODE_COVERAGE controls whether bin/pytest-all will fail if code coverage is less than 100%, defaults to true
  - Use comments like # pragma: no cov to disable coverage for things that cannot be reasonably covered in CircleCI, as described in the coverage.py docs
Style Checks
Run style checks
# docker-compose
docker-compose run --rm test bin/lint
# pytest wrapper
./bin/lint
Unit Tests
Run unit tests
# docker-compose
docker-compose run --rm test bin/pytest tests/unit
# pytest wrapper
./bin/pytest tests/unit
Integration Tests
Run integration tests locally
# docker-compose
docker-compose run --rm test bin/pytest tests/integration
# pytest wrapper
./bin/pytest tests/integration
Test a remote server (requires credentials to read PubSub)
# define the same ROUTE_TABLE as your edge server
export ROUTE_TABLE='[["/submit/telemetry/<suffix:path>","projects/PROJECT/topics/TOPIC"]]'
# docker using latest image and no git checkout
docker run --rm --tty --interactive --env ROUTE_TABLE mozilla/ingestion-edge:latest bin/pytest tests/integration --server https://myedgeserver.example.com
# docker-compose
docker-compose run --rm -e ROUTE_TABLE test bin/pytest tests/integration --server https://myedgeserver.example.com
# pytest wrapper
./bin/pytest tests/integration --server https://myedgeserver.example.com
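Hand-writing the ROUTE_TABLE JSON inside shell quotes is easy to get wrong. As a small sketch, the value can instead be generated from a Python list and quoted for the shell. The function name route_table_env is hypothetical, and PROJECT and TOPIC remain placeholders as in the export line above.

```python
import json
import shlex


def route_table_env(routes):
    """Serialize a list of [uri, topic] or [uri, topic, methods] mappings."""
    # Compact separators give the same form as the export line above.
    return json.dumps(routes, separators=(",", ":"))


routes = [["/submit/telemetry/<suffix:path>", "projects/PROJECT/topics/TOPIC"]]

# shlex.quote wraps the value so the shell passes it through unchanged.
print("export ROUTE_TABLE=" + shlex.quote(route_table_env(routes)))
```

The printed line can be pasted into the shell before running the integration tests against a remote server.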
Load Tests
Run a load test (defaults to a single GKE cluster and a PubSub emulator)
# docker using latest image and no git checkout
docker run --rm --tty --interactive mozilla/ingestion-edge:latest bin/pytest tests/load
# docker-compose
docker-compose run --rm test bin/pytest tests/load
# pytest
./bin/pytest tests/load
Load test options (from ./bin/test -h):
--min-success-rate=MIN_SUCCESS_RATE
    Minimum 200 responses per non-200 response to require during --test-period, default is 1000 (0.1% errors)
--min-throughput=MIN_THROUGHPUT
    Minimum 200 responses per second to require during --test-period, default is 15000
--test-period=TEST_PERIOD
    Number of seconds to evaluate after warmup, default is 1800 (30 minutes)
--warmup-threshold=WARMUP_THRESHOLD
    Minimum 200 responses per second that indicate warmup is complete, default is 15000
--warmup-timeout=WARMUP_TIMEOUT
    Maximum number of seconds to wait for warmup to complete, default is 600 (10 minutes)
--cluster=CLUSTER
    Name of GKE cluster to create for test resources, default is 'load-test', ignored when --load-balancer and --no-traffic-generator are both specified
--location=LOCATION
    Location to use for --cluster, default is us-west1
--preemptible
    Use preemptible instances for --cluster, default is False
--project=PROJECT
    Project to use for --cluster, default is from credentials
--load-balancer=LOAD_BALANCER
    Load Balancing url map to monitor, implies --no-generator when --server-uri is not specified, ignores --image and --no-emulator
--server-uri=SERVER_URI
    Server uri like 'https://edge.stage.domain.com/submit/telemetry/suffix', ignored when --no-generator is specified or --load-balancer is missing
--image=IMAGE
    Docker image for server deployment, default is 'mozilla/ingestion-edge:latest', ignored when --load-balancer is specified
--no-emulator
    Don't use a PubSub emulator, ignored when --load-balancer is specified
--topic=TOPIC
    PubSub topic name, default is 'topic', ignored when --load-balancer is specified
--no-generator
    Don't deploy a traffic generator, ignore --script
--script=SCRIPT
    Lua script to use for traffic generator deployment, default is 'tests/load/wrk/telemetry.lua', ignored when --no-generator is specified
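
To illustrate what the --min-success-rate and --min-throughput defaults imply, here is a rough sketch of the arithmetic. It is not the load test's actual implementation. The function name evaluate_load_test is hypothetical, and treating both thresholds as inclusive is an assumption.

```python
def evaluate_load_test(ok_responses, other_responses, test_period=1800,
                       min_success_rate=1000, min_throughput=15000):
    """Check response counts collected during --test-period against thresholds."""
    # Throughput is 200 responses per second over the test period.
    throughput = ok_responses / test_period
    # Success rate is 200 responses per non-200 response; with no
    # non-200 responses it is treated as unbounded.
    if other_responses:
        success_rate = ok_responses / other_responses
    else:
        success_rate = float("inf")
    return {
        "throughput": throughput,
        "success_rate": success_rate,
        # Assumption: meeting a threshold exactly counts as passing.
        "passed": throughput >= min_throughput and success_rate >= min_success_rate,
    }
```

With the defaults, a 30 minute test period needs at least 15000 * 1800 = 27,000,000 responses with status 200, and at most 27,000 other responses for that count, which is roughly 0.1% errors.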