In association with heise online

25 June 2013, 10:56

Netflix releases open source Genie for Hadoop

Netflix, the movie streaming company, has open sourced Genie, a job and resource management system for Hadoop. Netflix developed Genie to help manage workloads across its multiple, differently configured Hadoop clusters, which run on the Amazon Web Services cloud. With Genie, an end user submits jobs to an execution service and lets Genie "match-make" each job with an appropriate Hadoop cluster, while administrators can use Genie to browse the registered Hadoop clusters and view their associated configurations. Genie does not handle workflow scheduling or task scheduling, nor does it handle resource management tasks such as provisioning or scaling Hadoop clusters.

In a typical scenario, a Hadoop cluster is set up with its configuration stored on the Amazon S3 service. An administrator then uses the Genie client to register the cluster with the Genie service, supplying a unique ID, a name and other properties. Once the cluster is registered, end users can issue job requests to Genie that specify the job type, command-line arguments and file dependencies. They also specify what kind of Hadoop cluster to pick, by ID, by name or by properties. Genie then uses these data points to select an appropriate cluster.
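The register-then-match flow described above can be illustrated with a small sketch. This is not Genie's API or data model: the class names (`Cluster`, `JobRequest`, `ClusterRegistry`), the field names, the example S3 paths and the first-match selection rule are all hypothetical, chosen only to show how an ID, a name or a set of properties could narrow down a cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    """A registered Hadoop cluster (hypothetical model, not Genie's schema)."""
    cluster_id: str
    name: str
    config_location: str  # assumed: where the cluster's configuration is stored
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class JobRequest:
    """A job submission plus optional criteria for picking a cluster."""
    job_type: str
    command_args: List[str]
    file_dependencies: List[str]
    cluster_id: Optional[str] = None
    cluster_name: Optional[str] = None
    cluster_properties: Dict[str, str] = field(default_factory=dict)


class ClusterRegistry:
    """In-memory stand-in for the service that tracks registered clusters."""

    def __init__(self) -> None:
        self._clusters: Dict[str, Cluster] = {}

    def register(self, cluster: Cluster) -> None:
        # IDs must be unique, as in the scenario described in the article.
        if cluster.cluster_id in self._clusters:
            raise ValueError(f"cluster ID already registered: {cluster.cluster_id}")
        self._clusters[cluster.cluster_id] = cluster

    def list_clusters(self) -> List[Cluster]:
        # What an administrator would browse.
        return list(self._clusters.values())

    def match(self, job: JobRequest) -> Optional[Cluster]:
        # Return the first registered cluster that satisfies every criterion
        # the job specifies; criteria left unset are ignored.
        for cluster in self._clusters.values():
            if job.cluster_id is not None and cluster.cluster_id != job.cluster_id:
                continue
            if job.cluster_name is not None and cluster.name != job.cluster_name:
                continue
            if any(cluster.properties.get(key) != value
                   for key, value in job.cluster_properties.items()):
                continue
            return cluster
        return None


registry = ClusterRegistry()
registry.register(Cluster("c-001", "prod-etl", "s3://example-bucket/conf/prod-etl",
                          {"env": "prod", "workload": "etl"}))
registry.register(Cluster("c-002", "adhoc", "s3://example-bucket/conf/adhoc",
                          {"env": "test", "workload": "adhoc"}))

# A job that only cares about running on a production cluster.
job = JobRequest("hive", ["-f", "report.q"],
                 ["s3://example-bucket/scripts/report.q"],
                 cluster_properties={"env": "prod"})
selected = registry.match(job)
print(selected.name if selected else "no matching cluster")
```

A real service would presumably also weigh factors such as cluster availability or load; the sketch deliberately stops at attribute matching, which is all the article describes.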

The new management system is built on a range of tools that Netflix has created and released as open source. Karyon handles bootstrapping and lifecycle management for web services. Eureka provides service registration and discovery for Genie and, in turn, uses Archaius, a dynamic property system, and Servo, a monitoring interface. Finally, Ribbon ties together those mid-tier services. Netflix has made it a point of principle to release its developments as open source, and makes them available through the Netflix Open Source Center under an Apache 2.0 licence. Genie's source code can be found in Netflix's GitHub repository. Although it has been running in production at Netflix for some months, the developers say it is a work in progress and should be thought of as a version 0.


