Use case:
Searcher: I want to search cars by different aspects such as car model, location, transmission, special features, etc. I also want to see similar cars that belong to the same model as recommendations.
SOLR uses an index, which is a data structure optimized for fast retrieval.
To create an index, we need to come up with a set of documents, each containing fields. How do we create a document for the following advertisement?
Title: Toyota Rav4 for sale
Category: Jeeps
Location: Seeduwa
Model: Toyota Rav4
Transmission: Automatic
Description: find later
SOLR document:
document 1
call No brokers please.
Some more documents based on advertisements...
document 2
document 3
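As a rough illustration of how the advertisement above maps to a document with fields, here is a minimal Python sketch. The lowercase field names and the `id` values are assumptions chosen for this example, not names confirmed by the original schema, and the description field is left out because its value is not given above.

```python
import json

# The advertisement from the text, as plain key/value pairs.
advertisement = {
    "Title": "Toyota Rav4 for sale",
    "Category": "Jeeps",
    "Location": "Seeduwa",
    "Model": "Toyota Rav4",
    "Transmission": "Automatic",
}


def to_document(doc_id, ad):
    """Turn an advertisement into a flat document of named fields.

    The "id" field and the lowercase field names are assumptions made
    for this sketch; a real schema may use different names.
    """
    document = {"id": doc_id}
    for name, value in ad.items():
        document[name.lower()] = value
    return document


document_1 = to_document("doc1", advertisement)
print(json.dumps(document_1, indent=2))
```

Each advertisement becomes one such document, and the index is built from the whole set.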
Inverted Index
SOLR then creates an inverted index as shown below (taking the Title field as an example):
toyota doc1(1x)
rav4 doc1(1x)
sale doc1(1x) doc2(1x)
nissan doc2(1x)
march doc2(1x)
1x is the term frequency: the number of times the term occurs in that field of the document.
Lucene Analyzers
Note that the term "for" is missing from the index: it is eliminated by the stop word removal step of the Lucene text analysers. You can also write your own analyser if the built-in ones do not suit your needs.
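The stop word removal and the inverted index described above can be imitated in a few lines of Python. This is only a toy sketch, not how Lucene is implemented: the stop word list is a tiny made-up one, tokens are split on whitespace, and the title of document 2 ("Nissan March for sale") is an assumption inferred from the index entries shown above.

```python
from collections import defaultdict

# Tiny illustrative stop word list; Lucene's real list is different.
STOP_WORDS = {"a", "an", "the", "for", "of", "in"}


def analyze(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]


def build_inverted_index(titles):
    """Map each term to {document id: term frequency} for one field."""
    index = defaultdict(dict)
    for doc_id, title in titles.items():
        for term in analyze(title):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


titles = {
    "doc1": "Toyota Rav4 for sale",
    "doc2": "Nissan March for sale",  # assumed title for document 2
}
index = build_inverted_index(titles)

for term, postings in index.items():
    print(term, " ".join(f"{doc_id}({tf}x)" for doc_id, tf in postings.items()))
```

Looking up a term such as "sale" now returns its documents directly, without scanning every title.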
Field configuration and search
Using schema.xml, you can configure which fields your documents can contain and how those fields are handled when documents are added to the index or when the fields are queried.
For example, if you need to index the description field as well, and its value should be retrievable during search, add a field definition for description to schema.xml with indexed="true" and stored="true" [1].
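To make the shape of such a field definition concrete, the sketch below generates one with Python's standard XML library. The attribute names `name`, `type`, `indexed` and `stored` are standard schema.xml field attributes; the `text_general` field type is an assumption and has to match a field type that is actually declared in your own schema.xml.

```python
import xml.etree.ElementTree as ET

# "description" comes from the text; "text_general" is an assumed field type.
field = ET.Element(
    "field",
    {
        "name": "description",
        "type": "text_general",
        "indexed": "true",  # the field can be searched
        "stored": "true",   # the original value can be returned in results
    },
)

field_line = ET.tostring(field, encoding="unicode")
print(field_line)
```

The printed element is the kind of line that would go inside the fields section of schema.xml.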
Search query: “nissan cars for rent”
The SOLR query would be /solr/select/?q=title:"nissan cars for rent"
OK, what about the other fields (category, location, transmission, etc.)?
By default, the SOLR standard query parser searches a single default field unless a term is explicitly qualified with a field name. To search multiple fields such as title and description, and to give each a weight (boost) that reflects its significance during retrieval, we can use the DisMax parser [2, 3]. Simply put, the DisMax parser lets you make the title field more important than the description field.
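As a sketch of what a DisMax request might look like, the Python below builds a query URL in which title counts more than description. `defType` and `qf` are the usual Solr parameters for selecting the parser and listing the query fields; the boost value of 2 is an arbitrary choice for illustration, and the host and path are borrowed from the example URLs in this post.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

BASE_URL = "http://localhost:8983/solr/select"  # same host as the post's examples

params = {
    "defType": "dismax",            # use the DisMax query parser
    "q": "nissan cars for rent",    # the user's search text
    "qf": "title^2 description",    # search both fields, boost title (2 is arbitrary)
    "wt": "json",
}

dismax_url = BASE_URL + "?" + urlencode(params)
print(dismax_url)

# Decode the query string again to check what the server would receive.
decoded = parse_qs(urlsplit(dismax_url).query)
print(decoded["qf"])
```

Letting `urlencode` handle the escaping avoids problems with spaces and the `^` character in the boost syntax.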
Anatomy of a SOLR query
q - main search statement
fl - fields to be returned
wt - response writer (response format)
http://localhost:8983/solr/select?q=*:*&wt=json
- select all the advertisements
http://localhost:8983/solr/select?q=*:*&fl=title,category,location,transmission&sort=title%20desc
- select title,category,location,transmission and sort by title in descending order
wt parameter = response writer
http://localhost:8983/solr/select?q=*:*&wt=json - Display results in json format
http://localhost:8983/solr/select?q=*:*&wt=xml - Display results in XML format
http://localhost:8983/solr/select?q=category:cars&fl=title,category,location,transmission
- return only the results in the cars category
More options can be found at [5].
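The example URLs above all follow the same pattern, so they can be assembled by a small helper. The sketch below is one possible way to do that in Python; `select_url` is a hypothetical helper written for this post, not part of any SOLR client library, and it only builds the URL without sending a request.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

BASE_URL = "http://localhost:8983/solr/select"


def select_url(q="*:*", fl=None, sort=None, wt=None):
    """Build a select URL from the q, fl, sort and wt parameters."""
    params = {"q": q}
    if fl:
        params["fl"] = ",".join(fl)  # fields to be returned
    if sort:
        params["sort"] = sort        # for example "title desc"
    if wt:
        params["wt"] = wt            # response format, for example "json"
    return BASE_URL + "?" + urlencode(params)


fields = ["title", "category", "location", "transmission"]

all_ads_json = select_url(wt="json")
sorted_by_title = select_url(fl=fields, sort="title desc")
cars_only = select_url(q="category:cars", fl=fields)

for url in (all_ads_json, sorted_by_title, cars_only):
    print(url)

# Decode one URL again to see the parameters as the server would read them.
print(parse_qs(urlsplit(cars_only).query))
```

Special characters such as `*`, `:` and spaces are percent-encoded in the printed URLs, which is equivalent to the hand-written forms shown above.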
Coming up next...
- Extending SOLR functionality using RequestHandlers and Components
- SOLR more like this
[1] http://wiki.apache.org/solr/SchemaXml
[2] https://wiki.apache.org/solr/DisMax
[3] http://searchhub.org//2010/05/23/whats-a-dismax/
[4] Ikman.lk
[5] http://wiki.apache.org/solr/CommonQueryParameters