Welcome to Lucene Tutorial.com

Lucene is an open-source full-text search library which makes it easy to add search functionality to an application or website.

The goal of Lucene Tutorial.com is to provide a gentle introduction to Lucene.

First-time Visitors

If this is your first time here, you will probably want to go straight to the 5-minute introduction to Lucene.


What's New

Added new article on Lucene's Query Syntax
Fixed broken links to Scoring docs
Added instructions for compiling and running HelloLucene.java for readers who are new to Java.
TextFileIndexer.java and HelloLucene.java have been updated to the recently released Lucene 2.9.1.
Fixed broken link to Lucene Scoring docs.

My recent blog posts about Lucene

How to write a custom Solr FunctionQuery - Fri, 03 Sep 2010 09:31:10 +0000

Solr FunctionQueries allow you to modify the ranking of a search query in Solr by applying functions to the results. A list of out-of-the-box FunctionQueries is available at http://wiki.apache.org/solr/FunctionQuery. To write a custom Solr FunctionQuery, you’ll need to do two things: 1. Subclass org.apache.solr.search.ValueSourceParser. Here’s a stub ValueSourceParser. public class MyValueSourceParser extends ValueSourceParser { public void [...]

Read more about How to write a custom Solr FunctionQuery

Dynamic facet population with Solr DataImportHandler - Mon, 02 Aug 2010 19:03:22 +0000

Here’s what I’m trying to do: Given this MySQL table: CREATE TABLE `tag` ( `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY, `name` varchar(100) NOT NULL UNIQUE, `category` varchar(100) ); INSERT INTO tag (name,category) VALUES ('good','foo'); INSERT INTO tag (name,category) VALUES ('awe-inspiring','foo'); INSERT INTO tag (name,category) VALUES ('mediocre','bar'); INSERT INTO tag (name,category) VALUES ('terrible','car'); and this Solr schema <field name="tag-foo" type="string" [...]

Read more about Dynamic facet population with Solr DataImportHandler

Upgrading to Lucene 3.0 - Thu, 29 Apr 2010 00:58:42 +0000

Recently upgraded a 3-year-old app from Lucene 2.1-dev to 3.0.1. Some random thoughts on the evolution of the Lucene API over the past 3 years: I miss Hits. Sigh. Hits has been deprecated for a while now, but with 3.0 it's gone. And I have to say it's a pain that it is. Where I used to pass [...]

Read more about Upgrading to Lucene 3.0

Mapping neighborhoods to street addresses via geocoding - Mon, 19 Apr 2010 22:27:34 +0000

As far as I know, none of the geocoders consistently provide neighborhood data given a street address. Useful information proves elusive when consulting the gods at Google, too. Here’s a step-by-step guide to obtaining neighborhood names for your street addresses (on Ubuntu). 0. Geocode your addresses if necessary using Yahoo, MapQuest or Google geocoders. (this means [...]

Read more about Mapping neighborhoods to street addresses via geocoding

Average length of a URL - Fri, 06 Nov 2009 23:48:39 +0000

Aug 16 update: I ran a more comprehensive analysis with a more complete dataset. Find out the new figures for the average length of a URL. I’ve always been curious what the average length of a URL is, mostly when approximating memory requirements of storing URLs in RAM. Well, I did a dump of the DMOZ [...]

Read more about Average length of a URL