# Murat Demirbas - Metastable Failures in the Wild (Highlights)

## Metadata
**Review**:: [readwise.io](https://readwise.io/bookreview/32134612)
**Source**:: #from/readwise #from/reader
**Zettel**:: #zettel/fleeting
**Status**:: #x
**Authors**:: [[Murat Demirbas]]
**Full Title**:: Metastable Failures in the Wild
**Category**:: #articles #readwise/articles
**Category Icon**:: 📰
**URL**:: [muratbuffalo.blogspot.com](http://muratbuffalo.blogspot.com/2023/09/metastable-failures-in-wild.html)
**Host**:: [[muratbuffalo.blogspot.com]]
**Highlighted**:: [[2023-09-13]]
**Created**:: [[2023-09-16]]
## Highlights
- Retry-induced load increase constitutes over 50% of the sustaining effects. ([View Highlight](https://read.readwise.io/read/01ha71b3atnswn85whgt3374y3)) ^595185634
- Feel the pain! Don't mask the pain, feel the pain, and attribute the pain to the correct subsystem and shed load quickly so you do not trip over to the metastable state. If you get stuck in the metastable state, you elongate the unavailability, and need to shed load at an even bigger scale, and need to do a big reset, because this runs the risk of cascading to other subsystems and bringing them down. ([View Highlight](https://read.readwise.io/read/01ha71dc9gctjc2knmvt4v3717)) ^595185758
- A meta lesson is, don't [DOS](https://en.wikipedia.org/wiki/Denial-of-service_attack) yourself! Design your system so it doesn't inadvertently launch a denial of service on itself. ([View Highlight](https://read.readwise.io/read/01ha71e0x4d641h3e7nedv1c76)) ^595185772
- An easy thing to observe is to be careful about retries. Don't blindly retry, because you are causing work/load amplification. ([View Highlight](https://read.readwise.io/read/01ha71ee821hxqjkp90sm4jjgg)) ^595185908
- Don't overoptimize one part of your system/protocol at the cost of creating asymmetric work per response (work amplification) in other cases. ([View Highlight](https://read.readwise.io/read/01ha71g2kb6dbvf8xq2934y6bg)) ^595185980
The overoptimized component may DoS other components.
- While components such as caches can improve performance, DynamoDB does not allow them to hide the work that would be performed in their absence, ensuring that the system is always provisioned to handle the unexpected. ([View Highlight](https://read.readwise.io/read/01ha71nyzk4jab0a9e7ts74cap)) ^595186533
Amortize the cost of a cache miss.
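
The highlights on retries above warn against blind retrying because it amplifies load. A minimal sketch of one way to bound that amplification, assuming a retry budget combined with capped exponential backoff and full jitter. This is an illustration, not something prescribed by the article; `RetryBudget`, `call_with_retries`, and all default values are hypothetical.

```python
import random
import time


class RetryBudget:
    """Hypothetical budget: retries may add only a fraction of first-attempt load."""

    def __init__(self, ratio=0.1, initial_tokens=1.0, max_tokens=10.0):
        self.ratio = ratio            # assumed: each first attempt earns 0.1 retry token
        self.tokens = initial_tokens
        self.max_tokens = max_tokens  # cap so idle periods cannot bank unlimited retries

    def record_request(self):
        # Every first attempt earns a fraction of a retry token.
        self.tokens = min(self.max_tokens, self.tokens + self.ratio)

    def try_spend(self):
        # A retry is allowed only if a whole token is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def call_with_retries(op, budget, max_attempts=3, base_delay=0.05,
                      max_delay=1.0, sleep=time.sleep, rng=random.random):
    """Call op(); retry only while attempts and the shared budget allow it."""
    budget.record_request()
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            is_last = attempt == max_attempts - 1
            if is_last or not budget.try_spend():
                raise  # give up rather than amplify load on a struggling service
            # Capped exponential backoff with full jitter spreads retries out.
            delay = min(max_delay, base_delay * 2 ** attempt) * rng()
            sleep(delay)
```

With a shared budget, a burst of failures quickly exhausts the tokens, so most callers fail fast instead of multiplying the load on a dependency that is already struggling.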
<!-- New highlights added October 19, 2023 at 8:09 PM -->
- Metastable failure is defined as permanent overload with low throughput even after the fault-trigger is removed. ([View Highlight](https://read.readwise.io/read/01hd1hfs4qjt21mfm2pqgfha2b)) ^612405764
Somehow relate to [[Antifragile]]
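
The "feel the pain" highlight argues for shedding load quickly rather than queueing it. A minimal sketch of that idea, assuming a simple cap on in-flight requests: work over the cap is rejected immediately so the caller sees the overload. `LoadShedder`, `Overloaded`, and the limit are hypothetical names and values, not taken from the article.

```python
import threading


class Overloaded(Exception):
    """Raised when a request is shed instead of being queued."""


class LoadShedder:
    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight  # assumed capacity of the subsystem
        self.rejected = 0                   # count of shed requests, for attribution
        self._in_flight = 0
        self._lock = threading.Lock()

    def run(self, handler, *args):
        # Admission check: reject immediately when at capacity.
        with self._lock:
            if self._in_flight >= self.max_in_flight:
                self.rejected += 1
                raise Overloaded("at capacity, request shed")
            self._in_flight += 1
        try:
            return handler(*args)
        finally:
            # Release the slot whether the handler succeeded or failed.
            with self._lock:
                self._in_flight -= 1
```

Rejecting at admission keeps the rejection cheap and makes the overload visible (via `rejected`) at the subsystem that is actually saturated.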