MozMEAO SRE Status Report - February 16, 2018
Here’s what happened on the MozMEAO SRE team from January 23 - February 16.
Current work
SRE general
Load Balancers
- We’ve tried several methods of automating our AWS Elastic Load Balancers, including Terraform, the AWS CLI and Kubernetes-managed services. Each method has proven to be quirky and error-prone, so we’re trying a Python-based automation system. The automation can also generate code from the existing ELBs in a given region.
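As a rough illustration of what "generating code from existing ELBs" could look like, here is a minimal sketch. It is not our actual tool: the function names (`elb_to_spec`, `render_module`, `fetch_descriptions`) and the output layout are made up for this example. It assumes classic ELBs described in the shape returned by boto3's `describe_load_balancers`, and it only keeps a handful of fields.

```python
import pprint


def elb_to_spec(description):
    """Reduce one classic-ELB description dict to the fields we want to manage."""
    return {
        "name": description["LoadBalancerName"],
        "scheme": description.get("Scheme", "internet-facing"),
        "subnets": sorted(description.get("Subnets", [])),
        "security_groups": sorted(description.get("SecurityGroups", [])),
        "listeners": [
            {
                "protocol": ld["Listener"]["Protocol"],
                "port": ld["Listener"]["LoadBalancerPort"],
                "instance_protocol": ld["Listener"]["InstanceProtocol"],
                "instance_port": ld["Listener"]["InstancePort"],
            }
            for ld in description.get("ListenerDescriptions", [])
        ],
    }


def render_module(descriptions):
    """Render Python source that declares every ELB as a literal in ELBS."""
    specs = sorted((elb_to_spec(d) for d in descriptions), key=lambda s: s["name"])
    return "ELBS = " + pprint.pformat(specs, width=79) + "\n"


def fetch_descriptions(region):
    """Fetch classic ELB descriptions for a region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    client = boto3.client("elb", region_name=region)
    descriptions = []
    for page in client.get_paginator("describe_load_balancers").paginate():
        descriptions.extend(page["LoadBalancerDescriptions"])
    return descriptions
```

Calling `render_module(fetch_descriptions("us-west-2"))` would produce a Python module that can be checked into version control and reviewed like any other code. Sorting the ELBs, subnets and security groups keeps the generated output stable between runs.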
Cloudflare to Datadog service
- The Cloudflare to Datadog service has been converted to use a non-helm based install, and is running in our new Oregon-B cluster.
Oregon-B cluster
- We have a new Kubernetes cluster running in the us-west-2 AWS region that will run support.mozilla.org (SUMO) services as well as many of our other services.
Bedrock
- Bedrock is moving to a “sqlitened” version in our Oregon-B Kubernetes cluster that removes the dependency on an external database.
MDN
- The cronjob that performs backups on attachments and other static media broke due to a misconfigured LANG environment variable. The base image for the cronjob was updated and deployed. We’ve also added some cron troubleshooting documentation as part of the same pull request.
- Safwan Rahman submitted an excellent PR to optimize Kuma document views 🎉🎉🎉.
support.mozilla.org (SUMO)
- SUMO now uses AWS Simple Email Service (SES) to send email.
- We’re working on establishing a secure link between SCL3 and AWS for MySQL replication, which will help us significantly reduce the amount of time needed in our migration window.
- SUMO is now using a CDN to host static media.
- We’re working on Python-based Kubernetes automation for SUMO based on the Invoke library. Automation includes web, cron and celery deployments, as well as rollout and rollback functionality.
- Using the Python automation above, SUMO now runs in “vanilla Kubernetes” without Deis Workflow.
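To give a feel for the rollout and rollback automation mentioned above, here is a small sketch of the kubectl commands such tasks might issue. It is an illustration, not our real code: the deployment names, the namespace, the image tag and the assumption that each container is named after its deployment are all hypothetical. The Invoke wiring is left out so the example runs with the standard library alone; in an Invoke tasks file, each command would typically be passed to `ctx.run(...)` inside a function decorated with `@task`.

```python
import shlex

# Hypothetical deployment names; the report only says web, cron and celery.
DEPLOYMENTS = ("sumo-web", "sumo-cron", "sumo-celery")


def kubectl(namespace, *args):
    """Build a kubectl argument list scoped to one namespace."""
    return ["kubectl", "--namespace", namespace, *args]


def rollout_cmds(namespace, image, deployments=DEPLOYMENTS):
    """Commands that point each deployment at a new image and wait for it."""
    cmds = []
    for name in deployments:
        target = f"deployment/{name}"
        # Assumes the container inside each deployment shares the deployment's name.
        cmds.append(kubectl(namespace, "set", "image", target, f"{name}={image}"))
        cmds.append(kubectl(namespace, "rollout", "status", target))
    return cmds


def rollback_cmds(namespace, deployments=DEPLOYMENTS):
    """Commands that revert each deployment to its previous revision."""
    return [
        kubectl(namespace, "rollout", "undo", f"deployment/{name}")
        for name in deployments
    ]


def as_shell(cmds):
    """Quote each argument list into a single shell-safe string."""
    return [shlex.join(cmd) for cmd in cmds]
```

Building the commands as plain lists and quoting them at the last moment keeps the logic easy to unit test without a cluster, and leaves the decision of how to execute them (Invoke, `subprocess`, or a dry run that just prints) to the caller.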