Wednesday, May 16, 2018

Testing in production - 2

Ahead of tomorrow's BOF on testing in production at Heisenbug, here is a set of materials on the topic (go give the moderator a hard time there). Previous article on the topic.

Testing in Production, the safe way (by Cindy Sridharan @copyconstruct)
Topics include:
- why test in prod when you can test in staging
- how to test in prod while minimizing risk
- how to test configuration changes in prod
- why proxies are your best friend
- what to monitor
- and more

Remind's experience using Gremlin (a Chaos Engineering tool) (link)
An interesting comment on the article above: "'instead' would be nice, but that's not how it works in reality. Chaos Engineering is not a substitute for postmortems. We still need to learn from actual incidents because systems will continue to fail (probably less frequently though)"

Diffy, a tool from Twitter that checks a new version by comparing the results it produces with those of the current version.
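
To make the comparison idea concrete, here is a minimal sketch of response diffing in Python. It is not Diffy's code or API: the function names `diff_fields` and `regressions` and the sample responses are hypothetical, and it assumes the same request was sent to two instances of the current version (primary and secondary) and to one instance of the new version (candidate). Fields that differ between the two current instances are treated as noise (timestamps, random IDs) and are excluded from the report.

```python
from typing import Any, Dict, Set

_MISSING = object()  # sentinel so a missing key differs from an explicit None


def diff_fields(a: Dict[str, Any], b: Dict[str, Any]) -> Set[str]:
    """Return the top-level keys whose values differ between two responses."""
    return {
        key
        for key in a.keys() | b.keys()
        if a.get(key, _MISSING) != b.get(key, _MISSING)
    }


def regressions(
    primary: Dict[str, Any],
    secondary: Dict[str, Any],
    candidate: Dict[str, Any],
) -> Set[str]:
    """Fields where the candidate differs from the current version,
    ignoring fields that also differ between two current instances."""
    noise = diff_fields(primary, secondary)   # non-deterministic fields
    raw = diff_fields(primary, candidate)     # everything that changed
    return raw - noise


# Hypothetical responses to the same request.
primary = {"id": 1, "name": "Ann", "ts": 100}
secondary = {"id": 1, "name": "Ann", "ts": 101}    # same code, different timestamp
candidate = {"id": 1, "name": "ann", "ts": 102}    # new code changed the name casing

print(regressions(primary, secondary, candidate))  # {'name'}
```

The `ts` field differs everywhere, so it is filtered out as noise, and only `name` is reported as a suspected regression. A real setup would compare nested structures and aggregate over many requests; this sketch only looks at top-level keys of a single response.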

Testing in prod is not "slap it together and ship it to prod"; all of this requires an understanding of monitoring and observability:
"Monitoring - watching out for 𝐤𝐧𝐨𝐰𝐧 failure modes in the system
Observability - being able to 𝐝𝐞𝐛𝐮𝐠 the system, and gain 𝐢𝐧𝐬𝐢𝐠𝐡𝐭𝐬 into the system’s behaviour 
by @theburningmonk"
"Don't attempt to "monitor everything". You can't. Engineers often waste so much time doing this that they lose track of the critical path, and their important alerts drown in fluff and cruft" - Observability and Understanding the Operational Ramifications of a System

“Tips for High Availability” by Netflix Technology Blog

Engineering Large Systems When You're Not Google Or Facebook (test in prod) (by Charity Majors @mipsytipsy)

Stay tuned for more...

