"The original 2014 worker manager was an entirely in-memory process, which was great for latency and allowed it to optimize across a broad segment of the workload. The downside to it was that if a worker manager machine failed all that in-memory state was lost, and it was very expensive to reconstruct. We ended up redesigning worker manager to be persistent across AZs, and doing that while decreasing median latency was really challenging."
https://brooker.co.za/blog/2024/11/14/lambda-ten-years