Netflix open sources Hystrix resilience library
Netflix has moved on from just releasing the tools it uses to test the resilience of the cloud services that power the video streaming company, and has now open sourced a library that it uses to engineer that resilience in. Hystrix is an Apache 2 licensed library which Netflix engineers have been developing over the course of 2012 and which has been adopted by many teams within the company. It is designed to manage how distributed services interact and to make those interactions more tolerant of latency and of the failures that inevitably occur.
The library isolates the access points between services and stops failures from cascading between them. Hystrix uses the Command pattern: Command objects are executed or queued, and the library evaluates whether the circuit to the service for which a command is destined is in operation. It may not be if what Hystrix calls a circuit breaker has been triggered, leaving the circuit "open". Circuit breakers can be placed into a system to make it easier to trigger a coordinated failover. The library also checks for other issues which may prevent the execution of the command.
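To make the circuit breaker idea concrete, here is a minimal sketch in plain Java. It does not use the Hystrix API: the class name SimpleCircuitBreaker and its methods are hypothetical, and the rule it applies (open the circuit after a fixed number of consecutive failures) is an assumption chosen for brevity rather than a description of how Hystrix decides to trip.

```java
import java.util.function.Supplier;

// Hypothetical illustration of a circuit breaker; not the Hystrix API.
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private int consecutiveFailures = 0;

    SimpleCircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    // "Open" means calls are rejected without touching the service.
    synchronized boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
    }

    // Runs the call only while the circuit is closed and tracks its outcome.
    <T> T call(Supplier<T> serviceCall) {
        if (isOpen()) {
            throw new IllegalStateException("circuit open, call rejected");
        }
        try {
            T result = serviceCall.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            throw e;
        }
    }
}
```

With a threshold of two, two failed calls in a row open the circuit and every later call is rejected immediately. The sketch never closes the circuit again; a production breaker would also need some way to test whether the service has recovered.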
If there is an issue or the circuit is "open", a fallback is requested and, if implemented, executed. If there is no issue, the command is executed in a thread. The process is described in the Hystrix documentation in the section How it Works. The documentation also includes Getting Started and How to Use guides for the Java library.
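The flow described above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not Hystrix code: SketchCommand, its constructor parameters and the timeout handling are hypothetical, and the circuit state is reduced to a shared boolean flag. The method names run, getFallback and execute were picked for readability; consult the Hystrix documentation for the library's actual API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical command wrapper; not the Hystrix API.
abstract class SketchCommand<T> {
    private final ExecutorService pool;
    private final AtomicBoolean circuitOpen; // assumed to be set elsewhere
    private final long timeoutMillis;

    SketchCommand(ExecutorService pool, AtomicBoolean circuitOpen, long timeoutMillis) {
        this.pool = pool;
        this.circuitOpen = circuitOpen;
        this.timeoutMillis = timeoutMillis;
    }

    // The call to the remote service.
    protected abstract T run() throws Exception;

    // Optional fallback; the default signals that none was implemented.
    protected T getFallback() {
        throw new UnsupportedOperationException("no fallback implemented");
    }

    final T execute() {
        // Circuit open: skip the service and request the fallback.
        if (circuitOpen.get()) {
            return getFallback();
        }
        // No issue: execute the command on a separate thread.
        Callable<T> task = this::run;
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return getFallback();
        } catch (ExecutionException | TimeoutException e) {
            // The command failed or took too long: fall back.
            future.cancel(true);
            return getFallback();
        }
    }
}
```

A caller would subclass SketchCommand, implement run() with the remote call, and override getFallback() to return a cached or default value. Running the command on a pool thread is what allows the caller to give up after the timeout instead of blocking on a slow service.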
Netflix is planning to release a real-time dashboard for monitoring Hystrix in the near future; this will show the status of the circuit breakers in a system and metrics on the traffic passing through them. The Hystrix source code is available in Netflix's GitHub repository.