JavaScript library for search engine style searching?
The best (easy and good) way is to use a vector search algorithm.
First, take all the words in each paragraph and store them in a vector object (how to build it is shown below), then compare the query's vector against each paragraph's vector.
Before counting, run each word through the Porter stemmer so that variants such as kid and kids are grouped together.
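// Use as: var v = new Vector(paragraphText);
var Vector = function (phar) {
    var self = this;
    self.VectoData = {};  // stemmed word -> number of occurrences
    self.VectorSize = 0;  // total number of words in the paragraph

    self.InitVector = function () {
        var wordArray = self.spltwords(phar);
        self.VectorSize = wordArray.length;
        var stemdWordArray = self.runPotterStemmer(wordArray);
        self.VectoData = self.GroupAndCountWords(stemdWordArray);
    };

    self.runPotterStemmer = function (arr) {
        // Run the Porter stemmer on every word.
        // Assumption: a global stemmer(word) function has been loaded from one of the
        // Porter stemmer implementations; if it is missing, words are left unstemmed.
        return arr.map(function (word) {
            return typeof stemmer === 'function' ? stemmer(word) : word;
        });
    };

    self.spltwords = function (text) {
        // Lowercase, split on non-word characters, and drop empty strings.
        return String(text).toLowerCase().split(/\W+/).filter(Boolean);
    };

    self.GroupAndCountWords = function (arr) {
        var counts = {};
        for (var i = 0; i < arr.length; i++) {
            if (Object.prototype.hasOwnProperty.call(counts, arr[i])) {
                counts[arr[i]] = counts[arr[i]] + 1;
            } else {
                counts[arr[i]] = 1;  // first occurrence counts as 1, not 0
            }
        }
        return counts;
    };

    self.compare = function (queryVector) {
        // Compare queryVector to the current vector and return a similarity number:
        // occurrences of the query's words in this paragraph divided by the paragraph length.
        if (self.VectorSize === 0) { return 0; }
        var matches = 0;
        for (var word in queryVector.VectoData) {
            if (Object.prototype.hasOwnProperty.call(self.VectoData, word)) {
                matches += self.VectoData[word];
            }
        }
        return matches / self.VectorSize;
    };

    self.InitVector();
    return self;
};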
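As a rough, self-contained sketch of how such word-count vectors could be used to rank paragraphs against a query: the helper names (`tokenize`, `countWords`, `score`, `rank`) are made up for illustration, and the `stem` argument is only a placeholder for a real Porter stemmer (by default words are left unchanged). The score follows the rule of thumb described above, that is, occurrences of the query's words in the paragraph divided by the paragraph's length.

```javascript
// Hypothetical helpers for illustration; not part of any library.

// Lowercase the text and split it into words.
function tokenize(text) {
  return String(text).toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Count how often each (stemmed) word occurs.
function countWords(words, stem) {
  var counts = Object.create(null);
  words.forEach(function (word) {
    var key = stem(word);
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

// Occurrences of query words in the paragraph, divided by paragraph length.
function score(paragraph, query, stem) {
  var words = tokenize(paragraph);
  if (words.length === 0) return 0;
  var counts = countWords(words, stem);
  var queryWords = Object.keys(countWords(tokenize(query), stem));
  var matches = queryWords.reduce(function (sum, word) {
    return sum + (counts[word] || 0);
  }, 0);
  return matches / words.length;
}

// Return paragraphs sorted from best to worst match, dropping non-matches.
function rank(paragraphs, query, stem) {
  stem = stem || function (word) { return word; }; // placeholder: no stemming
  return paragraphs
    .map(function (text) { return { text: text, score: score(text, query, stem) }; })
    .filter(function (r) { return r.score > 0; })
    .sort(function (a, b) { return b.score - a.score; });
}

var results = rank(
  ['The kid plays outside', 'Cats sleep all day', 'A kid and another kid'],
  'kid'
);
console.log(results.map(function (r) { return r.text; }));
// [ 'A kid and another kid', 'The kid plays outside' ]
```

Dividing by paragraph length keeps long paragraphs from winning just because they contain more words; passing a real stemmer as the third argument to `rank` would let a query for kid also match kids.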
Here are some libraries that I am evaluating for projects (in July 2013). Any of these should be able to provide the core of the search feature.
- http://lunrjs.com/
  - stemming, scoring built in
  - 13.8 kb minified
  - updated recently (https://github.com/olivernn/lunr.js/commits/master)
  - 10 contributors
  - no external dependencies
- http://fusejs.io (formerly at http://kiro.me/projects/fuse.html)
  - fuzzy search
  - 1.58 kb minified
  - updated recently (https://github.com/krisk/Fuse/commits/master)
  - 1 contributor
  - no external dependencies
- http://reyesr.github.io/fullproof/
  - uses html5 storage with graceful degradation
  - 459 kb minified
  - last updated 2013 (https://github.com/reyesr/fullproof/commits/master)
  - 2 contributors
  - no external dependencies
- http://eikes.github.io/facetedsearch/
  - pagination, templating built in
  - 5.70 kb minified
  - last updated 2014 (https://github.com/eikes/facetedsearch/commits/master)
  - 1 contributor
  - depends on jquery and underscore
If you feel like building your own, here are implementations of 2 common stemming algorithms to get you started:
- https://github.com/fortnightlabs/snowball-js
- http://tartarus.org/martin/PorterStemmer/
As for handling boolean logic search operators, maybe this question about js query parsers will be useful.
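For simple cases a full parser may not be needed. As a minimal sketch, assuming queries use only the uppercase keywords `AND`, `OR` and `NOT` with no parentheses or quoted phrases (the function name `matchesQuery` is made up for illustration), a query can be split into OR-groups of AND-ed terms and checked against a paragraph's words:

```javascript
// Hypothetical helper: supports only AND, OR, NOT; no parentheses or phrases.
// AND binds tighter than OR; adjacent terms are treated as AND.
function matchesQuery(query, text) {
  var words = {};
  String(text).toLowerCase().split(/[^a-z0-9]+/).forEach(function (w) {
    if (w) words['$' + w] = true; // '$' prefix avoids clashes with built-in keys
  });

  return String(query).split(/\s+OR\s+/).some(function (group) {
    var tokens = group.split(/\s+/).filter(function (t) {
      return t && t !== 'AND';
    });
    var negate = false;
    var sawTerm = false;
    for (var i = 0; i < tokens.length; i++) {
      if (tokens[i] === 'NOT') { negate = !negate; continue; }
      sawTerm = true;
      var present = words['$' + tokens[i].toLowerCase()] === true;
      if (present === negate) return false; // this AND-term fails
      negate = false;
    }
    return sawTerm; // empty groups never match
  });
}

console.log(matchesQuery('kid AND NOT cat', 'The kid plays outside')); // true
console.log(matchesQuery('cat OR dog', 'The kid plays outside'));      // false
```

Anything beyond this (grouping with parentheses, phrases, field prefixes) is where a proper query parser starts to pay off.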