Tuesday, August 5, 2014

Basic Indexing and Searching Example with Apache Lucene.net


1. Add Lucene.net to your application.
  • Copy the Lucene.Net project to your solution folder, or copy Lucene.Net.dll to your project.
  • Add a reference to Lucene.Net from the project in which you want to use Lucene for indexing.
  • Build your project.
2. Open a writable Lucene database.

 A Lucene database is a folder on the local file system that holds the binary index files generated by Lucene.

  • FSDirectory is the class we use to create the Lucene data folder in the file system. Opening the database for writing creates a write.lock file in the target database path, which is the lock file for that database. To release the lock, close the writer and the directory object, and delete the write.lock file if it still exists.

                FSDirectory objDirectory = FSDirectory.GetDirectory(pstrDatabase_path);

  • Create an Analyzer object to split the text into terms and drop unnecessary characters and common words from the indexed text (@, a, :, / etc.).

                Analyzer Analyzer = new StandardAnalyzer();

  • Create an IndexWriter object to index text, HTML, and numbers.

                IndexWriter Writer = new IndexWriter(objDirectory, Analyzer);

3. Create your first indexing function with Lucene.


  • From section 2 we have


                FSDirectory objDirectory = FSDirectory.GetDirectory(pstrDatabase_path);
                Analyzer Analyzer = new StandardAnalyzer();
                IndexWriter Writer = new IndexWriter(objDirectory, Analyzer);


  • Create a blank Lucene Document. Lucene stores your data as documents. The following code creates a blank Lucene document in memory.

                Document doc = new Document();

  • Now we need to add data to this document. Data is added as fields. A field is a key-value pair with a string key and a string value. In later sections we may use fields with custom data types. The following line adds a string-valued field to your Lucene document.

                doc.Add(new Field("FIELD_NAME", "FIELD_VALUE", Field.Store.YES,
                                  Field.Index.NOT_ANALYZED));

  • Now we can add this document to the Lucene writer with the following code.

                Writer.AddDocument(doc);
  • Full code of the indexing process.

                FSDirectory objDirectory = FSDirectory.GetDirectory(pstrDatabase_path);
                Analyzer Analyzer = new StandardAnalyzer();
                IndexWriter Writer = new IndexWriter(objDirectory, Analyzer);
                Document doc = new Document();
                doc.Add(new Field("FIELD_NAME", "FIELD_VALUE", Field.Store.YES,
                                  Field.Index.NOT_ANALYZED));
                Writer.AddDocument(doc);
                Writer.Commit();
                Writer.Close();
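
If indexing throws an exception before Writer.Close() runs, the write.lock file mentioned in section 2 can be left behind. Here is a minimal sketch of one way to guard against that, assuming the same older Lucene.Net 2.x-style API used in this post. The class and method names (IndexingSketch, IndexOneDocument) are made up for illustration and are not part of Lucene.Net.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper class, not part of Lucene.Net.
public static class IndexingSketch
{
    // pstrDatabase_path is the index folder, as in the snippets above.
    public static void IndexOneDocument(string pstrDatabase_path)
    {
        FSDirectory objDirectory = FSDirectory.GetDirectory(pstrDatabase_path);
        IndexWriter Writer = null;
        try
        {
            Analyzer Analyzer = new StandardAnalyzer();
            Writer = new IndexWriter(objDirectory, Analyzer);

            Document doc = new Document();
            doc.Add(new Field("FIELD_NAME", "FIELD_VALUE", Field.Store.YES,
                              Field.Index.NOT_ANALYZED));
            Writer.AddDocument(doc);
            Writer.Commit();
        }
        finally
        {
            // Runs even when indexing fails, so the lock is released.
            if (Writer != null)
            {
                Writer.Close();
            }
            objDirectory.Close();
        }
    }
}
```

Calling IndexingSketch.IndexOneDocument("C:\\Lucene_Data") would index the single sample document and close both the writer and the directory whether or not an exception occurs.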

4. Searching an Index.


  • Create a Searcher object. The following code creates a searcher object; it takes the database path as a parameter.

            Searcher objSearcher = new IndexSearcher("C:\\Lucene_Data");

  • Create a QueryParser object.

            QueryParser parser = new QueryParser("Default_Field", new StandardAnalyzer());

  • Create a query from the QueryParser object.

            Query query_lucene = parser.Parse("FIELD_NAME:FIELD_VALUE");

  • Search the Lucene database and access the data of the matching documents.

            Hits hits = objSearcher.Search(query_lucene);
            int aprox = hits.Length();
            for (int i = 0; i < aprox; i++)
            {
                Document doc = hits.Doc(i);
                string result = doc.Get("FIELD_NAME");
            }
            objSearcher.Close();


  • Complete simple search code.

            Searcher objSearcher = new IndexSearcher("C:\\Lucene_Data");
            QueryParser parser = new QueryParser("Default_Field", new StandardAnalyzer());
            Query query_lucene = parser.Parse("FIELD_NAME:FIELD_VALUE");
            Hits hits = objSearcher.Search(query_lucene);
            int aprox = hits.Length();

            for (int i = 0; i < aprox; i++)
            {
                Document doc = hits.Doc(i);
                string result = doc.Get("FIELD_NAME");
            }

            objSearcher.Close();
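
One caveat, assuming the field was indexed with Field.Index.NOT_ANALYZED as in section 3: the value is stored in the index as a single exact term ("FIELD_VALUE"), while a QueryParser built with StandardAnalyzer lowercases the query text, so the parsed query may not match it. A rough sketch of an exact-match alternative using TermQuery, again assuming the older Lucene.Net 2.x-style API used in this post; ExactMatchSketch and SearchExact are hypothetical names chosen for illustration.

```csharp
using System;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper class, not part of Lucene.Net.
public static class ExactMatchSketch
{
    // pstrDatabase_path is the index folder, e.g. "C:\\Lucene_Data".
    public static void SearchExact(string pstrDatabase_path)
    {
        Searcher objSearcher = new IndexSearcher(pstrDatabase_path);
        try
        {
            // TermQuery bypasses the analyzer and matches the term exactly as indexed.
            Query query_lucene = new TermQuery(new Term("FIELD_NAME", "FIELD_VALUE"));
            Hits hits = objSearcher.Search(query_lucene);

            for (int i = 0; i < hits.Length(); i++)
            {
                Document doc = hits.Doc(i);
                string result = doc.Get("FIELD_NAME");
                Console.WriteLine(result);
            }
        }
        finally
        {
            // Close the searcher even if the search fails.
            objSearcher.Close();
        }
    }
}
```

Another option is to index the field with Field.Index.ANALYZED, so that the same analyzer processes both the indexed text and the query text.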



If you need help with other languages such as PHP or VB.NET, please leave a comment.







1 comment:

  1. Error with the sort parameter; cannot convert uint to int; query_lucene = parser // parser not declared

    Code doesn't work
