This latest release contains “numerous bug fixes, optimizations, and improvements.”
The Lucene Project Management Committee (PMC) has announced the release of Apache Lucene 9.0.0. Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.
New scenarios and languages supported, with performance boosts
Lucene 9.0 explores ways of supporting new usage scenarios and Java features. For example, it is the first release to provide JARs with automatically generated module names. The hope is that this will eventually make it easier to use Lucene with the Java module system.
There is now support for indexing high-dimensionality numeric vectors to perform nearest-neighbor search. The indexing uses the Hierarchical Navigable Small World (HNSW) graph algorithm. The release also includes new analyzers for the Serbian, Nepali, and Tamil languages. There is even an IME-friendly autosuggest for Japanese. Release 9.0 offers Snowball 2, adding Hindi, Indonesian, Nepali, Serbian, Tamil, and Yiddish stemmers. There is also new normalization/stemming for Swedish and Norwegian.
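To make the nearest-neighbor idea concrete, here is a minimal sketch in plain Java of an exact, brute-force search. It does not use Lucene's API and it is not the HNSW algorithm; the class name BruteForceKnn, its methods, and the sample vectors are all illustrative assumptions. A graph-based index such as HNSW aims to return similar results while comparing the query against far fewer vectors.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Exact (brute-force) k-nearest-neighbor search over float vectors. Illustrative only. */
public class BruteForceKnn {

    /** Squared Euclidean distance between two vectors of equal length. */
    public static float squaredDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the indices of the k vectors closest to the query, nearest first. */
    public static int[] nearest(float[][] vectors, float[] query, int k) {
        Integer[] ids = new Integer[vectors.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        // Compare the query against every stored vector and sort by distance.
        Arrays.sort(ids, Comparator.comparingDouble(i -> squaredDistance(vectors[i], query)));
        int n = Math.min(k, ids.length);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = ids[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 2-dimensional "document" vectors; real embeddings have many more dimensions.
        float[][] docs = { {0f, 0f}, {1f, 1f}, {5f, 5f}, {0.9f, 1.2f} };
        int[] top = nearest(docs, new float[] {1f, 1f}, 2);
        System.out.println(Arrays.toString(top)); // prints [1, 3]
    }
}
```

The brute-force version costs one distance computation per stored vector for every query, which is the work an approximate index is designed to avoid.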
There are significant performance improvements as well. Release 9.0 offers up to 400% faster taxonomy faceting and 10-15% faster indexing of multi-dimensional points. Sorting on points-indexed fields is also several times faster; this optimization was opt-in in late 8.x releases and is opt-out as of 9.0.
ConcurrentMergeScheduler now assumes fast I/O. This will likely improve indexing speed in cases where the previous heuristics incorrectly detected whether the system had modern I/O. The encoding of postings lists changed from FOR-delta to PFOR-delta to save further disk space. In addition, file formats moved from big-endian to little-endian byte order.
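The disk saving from the postings change can be illustrated with a small sketch in plain Java. It is not Lucene's encoder: it only counts bits for one block of document ID gaps under a simplified cost model. With frame-of-reference (FOR) packing, every gap uses the width of the largest gap. With the patched variant (PFOR), most gaps use a narrower width and the few oversized gaps are stored separately as exceptions. The class name PostingsBlockCost, the flat 16-bit charge per exception, and the sample doc IDs are assumptions made for illustration.

```java
/** Compares the bit cost of FOR and a simplified PFOR on one block of doc ID gaps. Illustrative only. */
public class PostingsBlockCost {

    /** Turns sorted doc IDs into gaps; the first gap is measured from zero. */
    public static int[] deltas(int[] sortedDocIds) {
        int[] out = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            out[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        return out;
    }

    /** Number of bits needed to represent a non-negative value (at least 1). */
    public static int bitsRequired(int value) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(value));
    }

    /** FOR: every value in the block is packed at the width of the largest value. */
    public static int forBits(int[] block) {
        int width = 1;
        for (int v : block) {
            width = Math.max(width, bitsRequired(v));
        }
        return width * block.length;
    }

    /** Simplified PFOR: try each width and charge a flat cost for every value that does not fit. */
    public static int pforBits(int[] block, int bitsPerException) {
        int best = forBits(block); // falling back to plain FOR is always allowed
        for (int width = 1; width <= 32; width++) {
            int exceptions = 0;
            for (int v : block) {
                if (bitsRequired(v) > width) {
                    exceptions++;
                }
            }
            best = Math.min(best, width * block.length + exceptions * bitsPerException);
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical postings list: mostly small gaps with one large jump.
        int[] gaps = deltas(new int[] {3, 5, 6, 9, 10, 12, 1000, 1001});
        System.out.println("FOR:  " + forBits(gaps) + " bits");      // prints FOR:  80 bits
        System.out.println("PFOR: " + pforBits(gaps, 16) + " bits"); // prints PFOR: 32 bits
    }
}
```

In this example a single large gap forces FOR to spend 10 bits on every entry, while the patched layout packs the block at 2 bits per entry and pays once for the outlier.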
Finally, Lucene 9 no longer has split packages. This required renaming some packages outside of the lucene-core JAR, so developers will need to adjust some imports accordingly. Developers should consider the use of Lucene 9 with the module system to be experimental. The Lucene PMC expects to make progress on this in future 9.x releases.