MozMEAO SRE Status Report - February 16, 2018

Here’s what happened on the MozMEAO SRE team from January 23 - February 16.

Current work

SRE general

Load Balancers

Cloudflare to Datadog service

  • The Cloudflare to Datadog service has been converted to a non-Helm-based install and is now running in our new Oregon-B cluster.

Oregon-B cluster

  • We have a new Kubernetes cluster running in the us-west-2 AWS region that will run support.mozilla.org (SUMO) services as well as many of our other services.

Bedrock

  • Bedrock is moving to a “sqlitened” version in our Oregon-B Kubernetes cluster that removes the dependency on an external database.

MDN

  • The cronjob that backs up attachments and other static media broke due to a misconfigured LANG environment variable. The base image for the cronjob was updated and deployed. We’ve also added some cron troubleshooting documentation as part of the same pull request.

  • Safwan Rahman submitted an excellent PR to optimize Kuma document views 🎉🎉🎉.

support.mozilla.org (SUMO)

  • SUMO now uses AWS Simple Email Service (SES) to send email.
  • We’re working on establishing a secure link between SCL3 and AWS for MySQL replication, which will help us significantly reduce the amount of time needed in our migration window.
  • SUMO is now using a CDN to host static media.
  • We’re working on Python-based Kubernetes automation for SUMO based on the Invoke library. Automation includes web, cron, and Celery deployments, as well as rollout and rollback functionality.
  • Using the Python automation above, SUMO now runs in “vanilla Kubernetes” without Deis Workflow.
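
To give a rough idea of what deploy, rollout and rollback automation of this kind can look like, here is a minimal sketch of helpers that build the underlying kubectl commands. It is not the actual SUMO automation: the namespace, the deployment names, and the assumption that each container is named after its deployment are all hypothetical, chosen only for illustration.

```python
import shlex

# Hypothetical names for illustration only; the real SUMO namespace and
# deployment names are not given in this report.
NAMESPACE = "sumo-prod"
DEPLOYMENTS = ("sumo-web", "sumo-cron", "sumo-celery")


def kubectl(*args, namespace=NAMESPACE):
    """Return a shell-safe kubectl command string scoped to one namespace."""
    return shlex.join(["kubectl", "--namespace", namespace, *args])


def deploy_cmd(deployment, image):
    """Command that points a deployment at a new image (starts a rolling update)."""
    # Assumption: the container shares its deployment's name.
    return kubectl("set", "image", f"deployment/{deployment}", f"{deployment}={image}")


def rollout_status_cmd(deployment):
    """Command that waits for the deployment's rollout to finish."""
    return kubectl("rollout", "status", f"deployment/{deployment}")


def rollback_cmd(deployment):
    """Command that reverts the deployment to its previous revision."""
    return kubectl("rollout", "undo", f"deployment/{deployment}")


def release_plan(image):
    """Ordered commands to roll one image out to web, cron and celery."""
    commands = []
    for deployment in DEPLOYMENTS:
        commands.append(deploy_cmd(deployment, image))
        commands.append(rollout_status_cmd(deployment))
    return commands
```

In an Invoke `tasks.py`, each helper could be wrapped in a function decorated with `@task` that passes the command string to `c.run(...)`, so that a release or a rollback becomes a single `invoke` call from the command line.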