Apache Solr and Lucene 4.2 update default codec again
The developers at the Apache Lucene project have announced version 4.2 of the Lucene text search engine and of Solr, the search platform built on top of Lucene. The Lucene search engine has replaced the default codec introduced in Lucene 4.1 with a new Lucene42Codec, which stores DocValues (the generic class for number and string storage) more efficiently than before. The new codec also offers better compression for term vectors.
The faceting module has also been refactored and tuned for performance, which in some cases makes execution 3.8 times faster. Lucene 4.2 can also handle FSTs (Finite State Transducers) larger than 2 GB.
The Solr search platform now has a REST API which allows developers to read the schema; support for writing the schema is coming. DocValues are now integrated with Solr and, because they allow faster loading and can use different compression algorithms, the integration offers a wide range of feature possibilities and performance benefits. Collections now support aliasing, which allows a collection to be reindexed and swapped in while in production, and the Collections API has been improved to make it easier to "see how things turned out". It is also now possible to interact with a collection through a node even if that node does not hold a replica of the collection.
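As a rough illustration of the schema REST API and collection aliasing described above, the following Python sketch only builds the request URLs. The base URL, the collection names and the helper function names are assumptions made for this example; the endpoint paths and the CREATEALIAS action follow the Solr 4.2 documentation as best understood here and should be checked against the release notes before use.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr 4.2 node; adjust for your deployment.
SOLR_BASE = "http://localhost:8983/solr"


def schema_fields_url(collection, base=SOLR_BASE):
    """Build the URL that reads a collection's field definitions (read-only in 4.2)."""
    return f"{base}/{collection}/schema/fields?{urlencode({'wt': 'json'})}"


def create_alias_url(alias, collections, base=SOLR_BASE):
    """Build the Collections API URL that points an alias at one or more collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # Several collections are passed as one comma-separated value.
        "collections": ",".join(collections),
    }
    return f"{base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical names: "products" is the alias clients query,
    # "products_v2" is a freshly reindexed collection to swap in.
    print(schema_fields_url("collection1"))
    print(create_alias_url("products", ["products_v2"]))
```

Each URL can then be requested with any HTTP client, for example `urllib.request.urlopen`. Pointing an existing alias at a newly built collection is what makes it possible to swap in a reindexed collection without clients changing the name they query.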
The full details of the changes are in the Lucene 4.2 and Solr 4.2 release notes. The new releases are available to download (Lucene, Solr). Both packages are, as with all Apache projects, released under the Apache License 2.0.