Testing in Production: the hard parts

Blast Radius

Prevention and Mitigation of Test Mishaps

Safe and Staged Deploys

Quick Service Restoration

To Crash or Not To Crash

Change One Thing At A Time

Multi-tiered Isolation

Divorce the Control Plane from the Data Plane

Control Planes All the Way Down

Eschew Global Synchronized State

Other Considerations

Client Resilience and Client-Side Metrics

Invest in Observability

Conclusion
