Apache Lucene-like Lucy gets first incubator release
The full-text search engine Lucene has a smaller sibling project in development, Lucy. The project, which was accepted as an incubator project by the Apache Software Foundation a year ago, has just released its first incubator version, 0.1.0, written mainly in C. It aims for speed and simplicity, but has dropped full compatibility with its bigger sibling.
Lucy started life as KinoSearch, created by Marvin Humphrey and other developers; in July 2010 Humphrey applied for it to be given the status of an Apache incubator project. The incubator is a process set up by the Apache Software Foundation to form communities and resolve legal issues for new projects that are hosted under the foundation's umbrella.
The full-text search engine library is written in C and uses Clownfish, a lightweight object-oriented framework that makes it easy to implement connectors for other programming languages. At the moment, Lucy ships only with Perl bindings.
In their FAQ, the developers say that, for now, Lucy's main advantage is its fast start-up time rather than its search performance, which is on par with that of the Java-implemented Lucene. Nathan Kurz, a contributor to the code base, explains the design philosophy: "We use 'mmap()' heavily, and when running on 64-bit systems take liberal advantage of the giant address space. Using the system to do more of the buffering also allows us to have lightweight processes that can start quickly."
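To illustrate the general technique Kurz describes, here is a minimal sketch in C of reading a file through 'mmap()' on a POSIX system. It is not taken from Lucy's source; the function name count_byte_mmap and its behaviour are hypothetical and chosen only for illustration. The point is that the process allocates no read buffer of its own: the kernel's page cache supplies the data, and pages are only loaded when they are touched.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of Lucy: count how often `needle` occurs
 * in the file at `path` by mapping the file into memory instead of
 * read()-ing it into a private buffer.
 * Returns 0 on success (result in *count) or -1 on error. */
int count_byte_mmap(const char *path, unsigned char needle, size_t *count)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    *count = 0;
    if (st.st_size == 0) {      /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const unsigned char *data =
        mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (data == MAP_FAILED)
        return -1;

    /* Pages are faulted in from the page cache as the loop touches them. */
    for (size_t i = 0; i < len; i++)
        if (data[i] == needle)
            (*count)++;

    munmap((void *)data, len);
    return 0;
}
```

A caller would pass a path, the byte to look for and a pointer to a size_t that receives the count. Because the mapping is backed by the page cache, a second process mapping the same file shares those pages rather than reading and buffering the file again, which is one plausible reason such a design keeps processes lightweight and quick to start.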
The released 738 KB library source code archive is labelled as version 0.1.0-rc3 and comes with Perl bindings, a sample indexer and a copy of the US Constitution. Lucy can be downloaded from the project's incubator pages and is licensed under the Apache Licence 2.0.
(Nils Magnus / djwm)