In association with heise online

11 June 2009, 14:59

Sqoop - database migration for Apache Hadoop

Cloudera has released Sqoop, a tool for migrating data from SQL databases into an Apache Hadoop system. Hadoop is a Java framework for building distributed applications that can scale across thousands of machines. It grew out of the Nutch search engine project that Doug Cutting began in 2002, and its MapReduce engine and HDFS (Hadoop Distributed File System) were inspired by Google's MapReduce and GFS (Google File System). In 2006, Cutting joined Yahoo, which now uses Hadoop extensively as part of its infrastructure.

Sqoop can take data from an SQL database, move it into HDFS and create matching table definitions in Hive, the data warehouse layer that runs on top of Hadoop. Cloudera, which specialises in Hadoop and offers commercial support for the framework, has made a beta release of Sqoop available as part of its own Hadoop distribution. Documentation for Sqoop is also available. According to the announcement, Apache Hadoop users can apply a Cloudera-contributed patch, but Cloudera does not expect the tool to become part of standard Hadoop "until at least version 0.21.0". The current Hadoop release is 0.20.0.
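
To make the import flow described above more concrete, here is a minimal Python sketch of the general idea. It is not Sqoop's code and does not use Sqoop's interface. It assumes an in-memory SQLite database as a stand-in for the SQL source, and the `export_table` helper, the `employees` table and the type mapping are all hypothetical names invented for illustration. The sketch reads a table, writes its rows as delimited text of the kind that could be copied into HDFS, and builds a matching Hive table definition.

```python
import csv
import io
import sqlite3

# Hypothetical mapping from SQL column types to Hive types (illustration only).
SQL_TO_HIVE = {"INTEGER": "INT", "TEXT": "STRING", "REAL": "DOUBLE"}


def export_table(conn, table):
    """Return (delimited_text, hive_ddl) for one table.

    The table name is interpolated directly into SQL, so this sketch is
    only suitable for trusted, hard-coded table names.
    """
    # Read column names and declared types from the source database.
    columns = [(row[1], row[2].upper())
               for row in conn.execute(f"PRAGMA table_info({table})")]

    # Step 1: dump the rows as comma-delimited text. A real import would
    # write this data to files in HDFS instead of an in-memory buffer.
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    names = ", ".join(name for name, _ in columns)
    for row in conn.execute(f"SELECT {names} FROM {table}"):
        writer.writerow(row)

    # Step 2: build a table definition that Hive could use to read the
    # text. Unknown SQL types fall back to STRING.
    fields = ", ".join(f"{name} {SQL_TO_HIVE.get(sql_type, 'STRING')}"
                       for name, sql_type in columns)
    ddl = (f"CREATE TABLE {table} ({fields}) "
           "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','")
    return buf.getvalue(), ddl


# Stand-in source database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(1, "Ada", 5200.0), (2, "Linus", 4800.5)])

data, ddl = export_table(conn, "employees")
print(data)
print(ddl)
```

The sketch runs in a single process. A tool built on Hadoop would typically split the work across many machines, which is the point of moving the data into HDFS in the first place.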

