Twitter to open source Storm in September
Twitter has announced that it will be open sourcing Storm, its stream processing framework, in September, at the Strange Loop conference. Storm was developed by BackType, a company that Twitter acquired in July. At the time, Nathan Marz, BackType's lead engineer, said that the company's plans to open source the technology had not changed. Now Twitter has put a date on those plans in a blog posting.
Storm is designed to bring the power of distributed processing to the realtime processing of streams of data. Although superficially similar to batch systems such as Hadoop's MapReduce, it differs in how long a computation lives. Hadoop is oriented towards one-off jobs: the work is sent out to the cluster, and the results are collected when it finishes. With Storm, the computation never ends; the cluster continuously processes incoming messages and produces results. These messages could be new data for analysis, with the results updating databases in real time; for example, Twitter messages could be analysed for trending topics and that information passed on to client systems within the architecture.
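To make the contrast concrete, here is a minimal single-process sketch in Python of the two styles applied to the trending-topics example. It is not Storm's API, which had not been published at the time of writing; the function names (hashtags, batch_trending, streaming_trending) are hypothetical and chosen only for illustration, and a real deployment would spread this work across many machines and read from an unbounded feed.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def hashtags(message: str) -> List[str]:
    """Return the lowercased hashtags found in one message (hypothetical helper)."""
    return [w.lower() for w in message.split() if w.startswith("#") and len(w) > 1]


def batch_trending(messages: List[str], top_n: int = 1) -> List[Tuple[str, int]]:
    """Batch style: consume a finite data set, then return a single result."""
    counts: Counter = Counter()
    for message in messages:
        counts.update(hashtags(message))
    return counts.most_common(top_n)


def streaming_trending(
    messages: Iterable[str], top_n: int = 1
) -> Iterator[List[Tuple[str, int]]]:
    """Streaming style: emit an updated result after every incoming message.

    If `messages` never ends, neither does this generator.
    """
    counts: Counter = Counter()
    for message in messages:
        counts.update(hashtags(message))
        yield counts.most_common(top_n)


if __name__ == "__main__":
    # A short, finite stand-in for a live message feed (made-up sample data).
    feed = ["loving #storm", "#hadoop job done", "#Storm is realtime"]
    for current_top in streaming_trending(feed):
        print(current_top)
```

The batch function yields nothing until it has seen every message, whereas the streaming function has a current answer available after each one, which is the property that lets downstream systems be updated in real time.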
More detail on the basic configuration and mechanics of Storm, such as the spout and bolt abstractions, is available in the blog posting. Nathan Marz is currently adding documentation to Storm "so you can get up and running with it quickly", but other details, such as information on automated Storm deployment, distributed RPC and architectural elements for Storm topologies, will "have to wait until September 19th", says Marz. No information on how Storm will be licensed is currently available.