In association with heise online

14 June 2012, 16:18

Project Serengeti: Hadoop in the VMware cloud


VMware has introduced a new open source project that is designed to allow the Apache Hadoop big data framework to be used in virtualised and cloud environments. The company intends the Serengeti technology to establish its vSphere product as the main virtualisation platform for Hadoop applications.

The technology can handle various Hadoop distributions such as Apache Hadoop 1.0, Cloudera's CDH 3, Hortonworks Data Platform 1.0, Greenplum HD 1.0, and distributions from IBM and MapR. Additionally, VMware has announced that it will contribute code back to the Hadoop project. The company plans to make code for the HDFS (Hadoop Distributed File System) and Hadoop MapReduce components available so that data and MapReduce jobs can be distributed across a virtual infrastructure in an optimal way.

The developers have also updated the Spring for Apache Hadoop project, which allows programmers to use Hadoop as an analytics tool in the Java applications they create using the Spring framework. It also enables them to create, configure and execute Hadoop services such as MapReduce, Hive and Pig from within Spring. The new projects were announced at the Hadoop Summit in San Jose, where other companies, including Cloudera, DataStax, Hortonworks, MapR and Pentaho, have also announced new Hadoop-related products.
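As a rough illustration of how such a Spring-driven Hadoop job can be wired up, the sketch below uses the XML namespace that Spring for Apache Hadoop provides; the host name, paths and the WordMapper/WordReducer classes are hypothetical placeholders, not part of the announcement:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/hadoop
           http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

  <!-- Point the Hadoop client at the cluster (example host/port) -->
  <hdp:configuration>
    fs.default.name=hdfs://namenode:9000
  </hdp:configuration>

  <!-- Declare a MapReduce job; mapper/reducer classes are illustrative -->
  <hdp:job id="wordCountJob"
           input-path="/input" output-path="/output"
           mapper="example.WordMapper"
           reducer="example.WordReducer"/>

  <!-- Execute the declared job when the Spring context starts -->
  <hdp:job-runner id="runner" job-ref="wordCountJob" run-at-startup="true"/>
</beans>
```

Keeping the job definition in the Spring context this way lets the application treat a MapReduce run like any other managed bean rather than invoking the Hadoop command-line tools directly.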

The Serengeti 0.5 toolkit is available under the Apache 2.0 licence on the GitHub hosting platform. Binaries can be downloaded from the VMware web site.

