HadoopDB reconciles SQL with Map/Reduce
Opponents of SQL had their hands strengthened when Google showed that its SQL-free technique, "Map/Reduce", could search databases measured in petabytes. They regard relational databases as antiquated technology that cannot cope with today's data volumes or meet the requirements of full-text search. Rather than relations, they rely on key-value pairs.
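The key-value model at the heart of Map/Reduce can be illustrated in a few lines. The following is a minimal sketch in plain Python (not the Hadoop API): a map phase emits intermediate key-value pairs, which are grouped by key and collapsed in a reduce phase; the classic example is a word count.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit one intermediate (key, value) pair per word
    for record in records:
        for word in record.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle: group all values under their key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce: collapse each key's values into a single result
    return {key: sum(values) for key, values in groups.items()}

documents = ["big data big ideas", "big data"]
counts = reduce_phase(map_phase(documents))
# counts == {"big": 3, "data": 2, "ideas": 1}
```

In a real Hadoop deployment, the framework runs many map and reduce tasks in parallel across the cluster and handles the grouping step ("shuffle") between them.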
Not everyone agrees. David DeWitt and Michael Stonebraker, for example, consider Map/Reduce a major step backwards, because it ignores everything the database community has learned since the first version of IBM's Information Management System (IMS). Another group, led by Andrew Pavlo at Brown University, benchmarked conventional SQL database systems against Map/Reduce on a cluster with 100 nodes and concluded that the relational systems delivered "strikingly better" performance.
A Yale University project may now reconcile the two camps. Daniel Abadi, a professor there, has wedded Hadoop, the open source Map/Reduce framework, with the PostgreSQL database, which is also open source. He says in his blog that the result scales better than other parallel database systems and handles data-mining tasks faster than Hadoop. He describes the project in detail in a paper that he will be presenting at this year's Very Large Data Bases (VLDB) conference in Lyon.
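The core idea behind the hybrid is to push as much of a query as possible into a conventional SQL engine running on each node, and use Map/Reduce only to distribute work and combine the per-node results. The sketch below illustrates that division of labour under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for the node-local PostgreSQL instances, and plain functions in place of Hadoop's map and reduce tasks.

```python
import sqlite3
from collections import defaultdict

def node_database(rows):
    # Stand-in for a node-local SQL engine (HadoopDB uses PostgreSQL;
    # sqlite3 keeps this sketch self-contained)
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return db

# Two "nodes", each holding one partition of the table
nodes = [
    node_database([("east", 10), ("west", 5)]),
    node_database([("east", 7), ("west", 3)]),
]

def map_phase(db):
    # Push the aggregation into the local SQL engine, so each node
    # returns only small partial results instead of raw rows
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return db.execute(query).fetchall()

def reduce_phase(partials):
    # Combine the per-node partial aggregates into the final answer
    totals = defaultdict(int)
    for key, value in partials:
        totals[key] += value
    return dict(totals)

result = reduce_phase(pair for db in nodes for pair in map_phase(db))
# result == {"east": 17, "west": 8}
```

The design choice this illustrates: the SQL engines do the heavy per-partition work they are optimised for, while the Map/Reduce layer contributes fault tolerance and scheduling across the cluster.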