Here’s what happened on the MozMEAO SRE team from June 6th - June 13th.
Current work
Frankfurt Kubernetes cluster provisioning
We’re provisioning a new Kubernetes 1.6.4 cluster in Frankfurt (eu-central-1). This cluster takes advantage of features in new versions of kops, helm, and kubectl.
The first apps to be installed in this cluster will be bedrock and basket.
Basket move to Kubernetes
Basket has been moved to Kubernetes! We experienced some networking issues in our Virginia Kubernetes cluster, so traffic has been routed away from this cluster for the time being.
Snippets
The Firefox 56 activity stream will ship to some users, with some form of snippets integration.
Cloudflare to Datadog service running in Kubernetes
The Cloudflare to Datadog service that was previously running in Deis 1 is now running in Kubernetes. Additionally, an external contributor has submitted a pull request to add this service to the Kubernetes charts repo. The PR looks to be abandoned, so it may be closed without being merged within a few days. If this happens, we’ll open a new PR with any requested changes from the current PR.
CloudFront Provisioning
We’ve started work on provisioning CloudFront, a global content delivery network service, for our bedrock staging environment.
Once we iron out the wrinkles with bedrock stage, we’ll continue on to bedrock prod.
Preparing to move basket.mozilla.org to Kubernetes
We tried Kubewatch, a service to watch Kubernetes events and report them to Slack. However, this doesn’t seem like the right tool for us, as it currently doesn’t allow us to filter the many notifications that we get.
Here’s what happened in May in Kuma, the engine of MDN:
Refactored zone CSS
Improved drafts
Moved redirects into Kuma
Retired old features
Let data be data
Shipped tweaks and fixes
Here’s the plan for June:
Ship on-site interactive examples
Ship brand updates to beta users
Add KumaScript macro tests
Ship the sample database
Done in May
Refactored Zone CSS
Some MDN sections look different, like the archive of old
pages. Others also appear
at non-standard URLs, like the Firefox
pages. Kuma uses manually
maintained Zones to accomplish this, and it is a source of bugs and
inconsistent experiences.
We took a big step toward better zones by refactoring the custom styles.
escattone did the
backend work (PR 4209)
so that styles are automatically applied across translations.
stephaniehobson did the front-end work,
moving the CSS from the database to the
repository (PR 4206),
then splitting them into per-zone CSS files
(PR 4224,
PR 4229).
The zone CSS is now up to the quality standard of the rest of our CSS, and
the experience across translations is more consistent. It wasn’t easy,
taking 10 total PRs, but Sass and other front-end tools
made the transition smoother than it would have been a year ago. Custom Zone
URLs are still painful, but we’ll tackle those soon.
Improved Drafts
We have a
papercut process
to determine the most annoying bugs. Recently, bugs around the drafts feature
rose to the top. The draft feature saves the editor content to local storage,
to add a layer of safety from browser crashes and session timeouts.
stephaniehobson has been working on PR
4186 for a few weeks, and it was
recently merged to master. This PR fixes 6 known bugs, including the
document_saved query parameter. This code will be deployed next
week.
Moved Redirects into Kuma
In production, many basic redirects are handled using
Apache RewriteRules.
This helped with the transition from DekiWiki to Kuma in 2012. As we move
to AWS, we’d like to move this functionality into Kuma. This makes it easier
to test and modify redirects, reduces differences between development and
deployment, and reduces or eliminates the need for Apache or another web server.
pmac recently released
django-redirect-url, which
packages the redirects code used by
bedrock.
metadave integrated this library
(PR 4217), and translated
production Apache rules into Kuma code
(PR 4220).
The functional tests exposed an Apache configuration difference between staging
and production, which our WebOps team fixed. The work continues in
PR 4231.
Now that we have a redirects framework in Kuma, we may use it to help retire
the custom zone URLs.
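
To illustrate why redirects are easier to test once they live in application code, here is a minimal sketch of a pattern-based redirect table. It uses only the Python standard library and is not the django-redirect-url API; the patterns, targets, and the resolve_redirect helper are hypothetical and invented for this example.

```python
import re

# Hypothetical redirect table: (compiled pattern, target template, permanent?).
# These rules are invented for illustration and are not Kuma's real redirects.
REDIRECTS = [
    (re.compile(r"^/en/(?P<path>.*)$"), "/en-US/docs/{path}", True),
    (re.compile(r"^/demos/?$"), "/en-US/docs/Web/Demos", False),
]


def resolve_redirect(path):
    """Return (status_code, location) for the first matching rule, or None."""
    for pattern, template, permanent in REDIRECTS:
        match = pattern.match(path)
        if match:
            # Named groups captured from the request path fill the template.
            location = template.format(**match.groupdict())
            return (301 if permanent else 302, location)
    return None
```

Because a rule table like this is plain data plus one function, each rule can be checked by a unit test instead of a request against a configured web server.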
Retired Old Features
I removed some features that have been deprecated in the last year:
Vagrant, used from 2011 to 2016 for a
development environment, is replaced by Docker
(PR 4214 and
4216).
Ansible, used from 2016 to 2017 for
provisioning development and testing environments, is also replaced by
Docker
(PR 4239 and
4242).
The changes removed 7,600 lines from the Kuma project, and mean that we don’t
have to explain this bit of history to new contributors. We’re using more of
the native services of TravisCI, which
makes our py27 build 30% faster, and lets us experiment with alternate
environments and services.
Let Data be Data
There’s a lot of data on MDN, contributed over more than 10 years. A lot of
that data is trapped in formats like HTML that made it easy to contribute,
but hard to maintain and remix. We want to formalize this data in
machine-parsable formats, so that MDN and others can use it in new and
exciting ways.
mdn/browser-compat-data is a
growing repository of Browser Compatibility data extracted from MDN.
There were
36 merged PRs
in May, and we’re using it on some of the compatibility tables on MDN.
mdn/data contains general data for Web
technologies, starting with CSS data such as properties, selectors, and
types. There were
12 merged PRs
in May, and after some recent updates
(PR 162 by
jwhitlock and
PR 183 by
Elchi3) we’re using the master branch
on MDN again.
With these data sources rapidly changing, there is pressure on KumaScript to
move quickly and break fewer things. Both repositories can be loaded as npm
packages (npm install mdn/browser-compat-data and
npm install mdn/data), and with
escattone’s PR 183, we’re loading some of the
data this way. He has also switched from
nodeunit to
Mocha
(PR 188), in preparation for
automated testing of KumaScript macros.
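
As a rough illustration of what machine-parsable data makes possible, the sketch below reads a compatibility record and turns it into a per-browser summary that a table renderer could use. The record shape, the feature name, and the support_summary helper are hypothetical simplifications; the actual schema in mdn/browser-compat-data may differ.

```python
import json

# Hypothetical, simplified compatibility record (not the repository's schema).
RAW = """
{
  "feature": "css.properties.example",
  "support": {
    "chrome": {"version_added": "50"},
    "firefox": {"version_added": "45"},
    "safari": {"version_added": false}
  }
}
"""


def support_summary(record):
    """Map each browser to a version string, or 'No support' if not added."""
    summary = {}
    for browser, info in sorted(record["support"].items()):
        version = info.get("version_added")
        summary[browser] = version if version else "No support"
    return summary


record = json.loads(RAW)
summary = support_summary(record)
```

The same record could feed a compatibility table, an editor plugin, or a linter, which is much harder to do when the data lives in hand-edited HTML.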
Mozilla is gathering in San Francisco for an
All-Hands meeting
at the end of June, which leaves 3 weeks for development work.
Here’s what we’re planning to ship in June:
Ship On-site Interactive Examples
We ran an A/B test on popular pages, showing half the users pages with small
examples on top, and half without. We looked at the analytics, and we did not
see a significant change in user behavior. We did get feedback that the
samples are useful, especially for those reminding themselves how a familiar
technology works.
We’re going ahead with the next phase. We’re going to make the new
version the default, and start experimenting with interactive
examples. Instead of looking for changes in site usage, we’ll focus on
interaction and performance.
schalkneethling is leading this next
phase, and you can follow the work at
mdn/interactive-examples.
Ship Brand Updates to Beta Users
Mozilla had an open design process to
develop a new brand identity, and has a website
detailing the results. This new brand is rolling out across Mozilla websites.
We’ve also been thinking about the brand, mission, and focus of MDN, which has
evolved over the last five years.
In June, we’ll start talking about the MDN brand, and will start shipping some
of the new elements to beta users, such as updated logos, headers, and footers.
Add KumaScript Macro Tests
Currently, maintainers review KumaScript macro changes by manually testing
them in development environments. This works for small changes, but big
changes and complex macros are hard to test manually. In June,
escattone will start adding regression tests
for some key macros. When we have a working framework and some good examples,
we’ll start asking staff and contributors to add tests for other macros, and to
submit updated tests with PRs.
Ship the Sample Database
The Sample Database has been promised every month since October 2016, and
has slipped every month. We don’t want to break the tradition, so we’ll
bend it a little. The first bit of the supporting code, a scrape_user
command, has been merged, and the rest of the code will ship in July.
See PR 4248 for the
scrape_document command, and
PR 4076 for the remaining tasks.
Here’s what happened on the MozMEAO SRE team from May 23rd - May 30th.
Current work
Bedrock (mozilla.org)
Bedrock has been stable in production on Kubernetes for 7 days. The current traffic policy includes Virginia (K8s), Tokyo (K8s), Portland (Deis 1) and Ireland (Deis 1).
Application limits/requests were increased to deal with initial performance issues.
We’re discussing replacing the usage of assets.mozilla.org on www.mozilla.org.
nucleus.mozilla.org
Nucleus has been moved from our Deis 1 infrastructure to Kubernetes in Virginia.
surveillance.mozilla.org
The surveillance site has been moved from our Deis 1 infrastructure to Kubernetes in Virginia.
We’re planning to move basket to Kubernetes shortly after the nucleus migration, and then proceed to decommission existing infrastructure.
Scale down Deis 1 clusters
Now that we’re serving a large portion of production traffic via Kubernetes, we can safely scale down the Portland and Ireland Deis 1/Fleet clusters to reduce AWS costs. We’ll also be provisioning a Portland Kubernetes cluster in the near future.
App limits and requests set: Dev, stage, and production environment limits and requests have been set in Kubernetes in preparation for the production push. This allows Bedrock to take advantage of cluster and pod autoscaling, which is documented here.
Since we’ve added K8s to the www.mozilla.org Route 53 Traffic Policy, we’ve had to fine-tune the memory and CPU limits for better performance.
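
To show why requests matter for pod autoscaling, here is a small sketch of the scaling rule described in the Kubernetes Horizontal Pod Autoscaler documentation: CPU utilization is measured as a percentage of the container’s CPU request, and the desired replica count is the current count scaled by the ratio of current to target utilization, rounded up. The resource values and helper names below are hypothetical and are not Bedrock’s actual settings; the sketch also ignores the autoscaler’s tolerance and its minimum and maximum replica bounds.

```python
import math

# Hypothetical per-container settings; the values are illustrative only.
RESOURCES = {
    "requests": {"cpu": "250m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}


def parse_millicores(cpu):
    """Convert a Kubernetes CPU quantity such as '250m' or '1' to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(float(cpu) * 1000)


def desired_replicas(current_replicas, usage_millicores, target_utilization_pct):
    """desired = ceil(current * current_utilization / target_utilization).

    Utilization is average per-pod CPU usage as a percentage of the request.
    """
    request = parse_millicores(RESOURCES["requests"]["cpu"])
    current_utilization_pct = 100.0 * usage_millicores / request
    return math.ceil(
        current_replicas * current_utilization_pct / target_utilization_pct
    )
```

With a 250m request and a 50% target, four pods averaging 200m of CPU each are at 80% utilization, so this rule asks for seven pods. Setting the request too low or too high therefore changes when scaling happens, not just how pods are scheduled.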