Full Text Searching with Rails
thinking_sphinx and Sphinx work beautifully: no indexing, query, or install problems ever (5 or 6 installs, including production on Slicehost).
Why doesn't everybody use Sphinx, like, say, Craigslist does? Read here about its limitations (the articles are a year and a half old; the Sphinx developer, Aksyonoff, is working on these, and he's adding features and reliability and stamping out bugs at an amazing pace):
http://codemonkey.ravelry.com/2008/01/09/sphinx-for-search/
http://www.ibm.com/developerworks/opensource/library/os-php-apachesolr/
Comparison of full text search engine - Lucene, Sphinx, Postgresql, MySQL?
Ferret: easy install, doesn't stem properly, very slow indexing (on one MySQL db: Sphinx took 3 seconds, Ferret 50 minutes). It has well-documented problems (index corruption) with DRb servers in production under load. Having said that, I have used it in development since acts_as_ferret came out 3 years ago, and it has served me well. Not adhering to Porter stemming is an advantage in some contexts.
Lucene and Solr are the gorilla / Mack truck / heavyweight champ of open source search. The teams have been adding an impressive number of new features in the Solr 1.4 release.
acts_as_solr: works well once Tomcat or Jetty is in place, but those are sometimes a pain to set up. The acts_as_solr fork by mattmatt is the main fork, but the project is relatively unmaintained.
Re the Tomcat install: Solr/Lucene has unquestionably the best knowledge base / support search engine of any software package I've seen (I guess I'm not that surprised). The search box is here:
http://www.lucidimagination.com/
Sunspot: the new Ruby wrapper, built on solr-ruby. Looks promising, but I couldn't get it to install on OS X. It indexes all Ruby objects, not just database records through ActiveRecord.
One thing that's really instructive is to install two search plugins, e.g. Sphinx and Solr, or Sphinx and Ferret, and see what different results they return. It's as easy as:
@sphinx_results - @ferret_results
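The one-liner above is plain Ruby array difference. A minimal sketch of the same comparison, assuming each engine's results have been reduced to arrays of record IDs (the IDs and the `compare_results` helper are made up for illustration; in a Rails app the arrays would come from each plugin's own search call):

```ruby
# Hypothetical helper: compares two result sets given as arrays of record IDs.
# Comparing IDs rather than model objects avoids relying on how each plugin
# wraps its results.
def compare_results(first_ids, second_ids)
  {
    only_first:  first_ids - second_ids,  # found only by the first engine
    only_second: second_ids - first_ids,  # found only by the second engine
    in_both:     first_ids & second_ids   # found by both engines
  }
end

# Made-up IDs standing in for the same query run against two engines.
sphinx_ids = [3, 7, 12, 15, 21]
ferret_ids = [3, 7, 9, 15, 30]

diff = compare_results(sphinx_ids, ferret_ids)
puts "only Sphinx: #{diff[:only_first].inspect}"
puts "only Ferret: #{diff[:only_second].inspect}"
puts "both:        #{diff[:in_both].inspect}"
```

Looking at the records that only one engine returns is usually the quickest way to spot differences in stemming or tokenizing.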
Just saw this post and the responses:
http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/
http://www.jroller.com/otis/entry/open_source_search_engine_benchmark
http://www.flax.co.uk/blog/2009/07/07/xapian-compared/
I have not used SearchLogic, but I can tell you that Lucene is a very mature project with implementations in many languages. It is fast and flexible, and the API is fun to work with. It's a good bet.
First off, my obvious bias: I created and maintain Thinking Sphinx.
As it so happens, I actually saw Ben Johnson (creator of SearchLogic) present at the NYC ruby meet about it last night. SearchLogic is SQL-only - so if you're not dealing with massive tables, and relevance rankings aren't needed, then it could be exactly what you're looking for. The syntax is pretty clean, too.
However, if you want all the query intelligence handled by code that is not your own, then Sphinx or Solr (which is Lucene under the hood, I think) is probably going to work out better.
SearchLogic is a good plugin, but it is really meant to make your search code more readable; it doesn't provide the automatic indexing that Sphinx does. I haven't used Ferret, but Sphinx is incredibly powerful.
http://railscasts.com/episodes/120-thinking-sphinx
Great introduction to see how flexible it is.