I would like to use components that are free for commercial use.
I looked at combining Lucene with MongoDB, but I wonder whether there is a better approach, ideally a single system.
Sphinx can also handle billions of documents; see http://sphinxsearch.com/info/powered/
(although I also use Lucene and cannot tell whether Sphinx is better)