Hortonworks previews new Data Platform and Hadoop 2.0
Hortonworks have announced a community preview of Hortonworks Data Platform 2.0 as a self-contained VM with a pre-installed Hadoop 2.0 cluster. HDP 2.0 uses Hadoop 2.0 technology, introduces Apache Hadoop's YARN architecture and, Hortonworks hopes, will get more developers and partners to use the next generation of Hadoop architecture.
YARN takes the resource management elements that MapReduce previously handled and moves them into a new layer which MapReduce plugs into. This makes it possible to introduce new data processing engines, other than MapReduce, into a Hadoop cluster. Hadoop 2.0 is currently working its way through the community development process at the Apache Software Foundation and is expected to arrive as a beta release soon.
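The separation described above can be pictured with a small toy model. This is only an illustrative sketch in Python and not the YARN API: the names `ResourceLayer`, `Engine`, `MapReduceEngine` and `StreamingEngine`, and the idea of counting "containers" as plain integers, are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceLayer:
    """Toy stand-in for a shared resource management layer (not the YARN API)."""
    total_containers: int
    allocated: dict = field(default_factory=dict)

    def request(self, engine_name: str, wanted: int) -> int:
        """Grant up to `wanted` containers from the capacity still free."""
        free = self.total_containers - sum(self.allocated.values())
        granted = max(0, min(wanted, free))
        self.allocated[engine_name] = self.allocated.get(engine_name, 0) + granted
        return granted


class Engine:
    """Hypothetical base class: an engine asks the shared layer for resources."""
    name = "engine"

    def run(self, layer: ResourceLayer, wanted: int) -> str:
        granted = layer.request(self.name, wanted)
        return f"{self.name} running on {granted} containers"


class MapReduceEngine(Engine):
    # MapReduce is just one client of the resource layer in this model.
    name = "mapreduce"


class StreamingEngine(Engine):
    # A second, non-MapReduce engine plugging into the same layer.
    name = "streaming"


if __name__ == "__main__":
    layer = ResourceLayer(total_containers=10)
    print(MapReduceEngine().run(layer, 6))
    print(StreamingEngine().run(layer, 6))
```

In this sketch both engines draw on the same pool, so the second engine receives only the four containers left over after MapReduce has taken six. The point is simply that resource accounting lives in one place and the engines are interchangeable clients of it.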
With that in mind, Hortonworks have both shipped the HDP 2.0 Community Preview and announced a certification programme for applications that want to use Hadoop's YARN. Engines such as Yahoo's open source Storm-YARN computational engine and Continuuity's Weave are already making use of YARN, and Hortonworks hopes its certification programme will make it easier for other applications to work on YARN as well. Hortonworks says fifteen partners (Altiscale, Concurrent, Continuuity, DataTorrent, Elasticsearch, Karmasphere, Microsoft, MicroStrategy, Platfora, Red Hat, SAS, Splunk, Sqrrl, Tableau Software and TIBCO) have already joined the programme.
The HDP 2.0 Community Preview, meanwhile, is available as a single-node image for VirtualBox or VMware and includes YARN, Tez (a MapReduce generaliser for faster interactive response) and Stinger. The company says it plans to release a full preview distribution in the coming week.