sunburnt
Python interface to Solr
Python Sunburnt - Google Groups
How can I use Sunburnt, the Python-based interface for Solr, to return only the PDF files from a Solr collection?
Source: (StackOverflow)
I am using the Sunburnt Python API for Solr search.
Highlighted search in Sunburnt works fine.
I am using the following code:
search_record = solrconn.query(search_text).highlight("content").highlight("title")
records = search_record.execute().highlighting
The problem is that it returns only 10 records. I know the default can be changed in solrconfig.xml, but I want all records.
I want to apply pagination to Sunburnt's highlighted search.
Can anyone help me?
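One way to reach every highlighted record is to page through the result set instead of raising the default row count. The sketch below only computes the (start, rows) windows. The helper name `page_windows` and the `total_hits` placeholder are hypothetical, and the commented usage assumes the `paginate(start=..., rows=...)` method that appears in another question on this page.

```python
def page_windows(total, rows=10):
    """Yield (start, rows) pairs that cover `total` results page by page."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    for start in range(0, total, rows):
        # The last window may be shorter than a full page.
        yield start, min(rows, total - start)


# Assumed usage with a sunburnt query (not executed here):
#   base = solrconn.query(search_text).highlight("content").highlight("title")
#   for start, rows in page_windows(total_hits, rows=10):
#       page = base.paginate(start=start, rows=rows).execute()
#       fragments = page.highlighting
```

Here `total_hits` stands for the hit count Solr reports on the first response; how it is read from the response object is left open.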
Source: (StackOverflow)
I am trying to search for exactly "11000060K2":
import logging

from solr import SolrConnection
from sunburnt import RawString

term = "11000060K2"
solr_conn = SolrConnection()
scoreDocs = solr_conn.si.query(activityemail=RawString(term)).paginate(start=0, rows=1000).execute()
params_dict = scoreDocs.params
# Log each request parameter that was sent to Solr.
for key, keyvalue in params_dict:
    logging.debug("param %s value %s " % (key, keyvalue))
Returns:
param start value 0
param q value activityemail:11000060K2
param rows value 1000
And a bunch of results that match other terms.
I want it to return only documents that match 11000060K2 with a query that returns / looks like:
param q value activityemail:"11000060K2"
Please tell me what I am doing wrong.
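The logged `q` value suggests that `RawString` passes its content through verbatim, so the double quotes would have to be part of the string itself. A minimal sketch under that assumption follows; the helper name `quote_phrase` is hypothetical and not part of sunburnt.

```python
def quote_phrase(term):
    """Wrap `term` in double quotes, escaping backslashes and embedded quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


# Assumed usage (not executed here):
#   solr_conn.si.query(activityemail=RawString(quote_phrase(term)))
print(quote_phrase("11000060K2"))
```

Whether the phrase then matches only the intended documents also depends on how the `activityemail` field is analyzed in the schema.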
Source: (StackOverflow)
I am trying to index a few text files into Solr using sunburnt. Below is my code:
import hashlib

import httplib2
import sunburnt

solr_url = "http://localhost:8983/solr"
h = httplib2.Http(cache="/var/tmp/solr_cache")
solr_instance = sunburnt.SolrInterface(url=solr_url, http_connection=h)
# Use the MD5 hash of each URL as the document id.
for url, title, webpage in webpages:
    html_id = hashlib.md5(url).hexdigest()
    doc = {"id": html_id, "content": webpage, "title": title}
    solr_instance.add(doc)
try:
    solr_instance.commit()
except Exception:
    print "Could not Commit Changes to Solr, check the log files."
else:
    print "Successfully committed changes"
But when I run this, I get the error below:
  File "/Users/ananya/Desktop/dbms project/code/extractText/ExtractText.py", line 94, in index_to_Solr
    solr_instance = sunburnt.SolrInterface(url=solr_url, http_connection=h)
  File "/Users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/sunburnt.py", line 166, in __init__
    self.init_schema()
  File "/Users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/sunburnt.py", line 177, in init_schema
    self.schema = SolrSchema(schemadoc, format=self.format)
  File "/Users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/schema.py", line 417, in __init__
    if self.unique_key else None
KeyError: 'id'
I am very new to Solr. Please help me. Do I need to make any changes to the schema file? If yes, please let me know how.
Thanks.
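The `KeyError: 'id'` is raised while sunburnt looks up the schema's unique key, so one plausible cause is a schema whose `<uniqueKey>` names a field that no `<field>` element declares. The sketch below checks a schema document for that condition. It is written for Python 3, and the function name `missing_unique_key` and the `SAMPLE` schema are made up for illustration.

```python
import xml.etree.ElementTree as ET


def missing_unique_key(schema_xml):
    """Return the uniqueKey name if no <field> declares it, else None."""
    root = ET.fromstring(schema_xml)
    unique_key = root.findtext("uniqueKey")
    if unique_key is None:
        return None  # the schema defines no unique key at all
    unique_key = unique_key.strip()
    declared = {field.get("name") for field in root.iter("field")}
    return None if unique_key in declared else unique_key


# Hypothetical schema in which 'id' is the unique key but is never declared.
SAMPLE = """<schema name="example" version="1.5">
  <fields>
    <field name="title" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>"""

print(missing_unique_key(SAMPLE))
```

If the check reports a name, declaring a matching `<field>` in schema.xml and restarting Solr would be the thing to try first.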
Source: (StackOverflow)
Schema:
<field name="tags" type="string_ci" indexed="true" stored="true" multiValued="true" />
Document:
document = {
    "id": 123,
    "title": "this is title",
    "description": "this is desc",
    "tags": ["beach", "luxury", "RTW"]
}
Error:
<title>Error 400 ERROR: [doc=20] Error adding field \'tags\'=\'[beach, luxury, RTW]\'</title>
I tried REST, the Python module solrpy, and sunburnt, but all of them give the same error.
Source: (StackOverflow)
I have this situation:
{"product": {"name": "Name of Product",
"categories": [{"name": "Category 1"}, {"name": "Category 2"}]}}
This is a summary of my Solr document's structure. When I search, I always search by the product name and by the category. But if I search for this product with category = 'Category 1', I should return JSON like this:
{"product": {"name": "Name of Product",
"categories": {"name": "Category 1"}}}
I don't know the best way to do this. For now, my options are:
- Build this final structure in the code;
- Create two collections in Solr, Product and Category, and simulate a join to assemble this final response.
I am really new to Solr, so I am kind of confused.
By the way, I am using sunburnt in my Flask application.
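The first option, building the final structure in application code, can be sketched as a small post-processing step on each returned document. The function name `keep_matching_categories` is hypothetical, and the document shape is taken from the example in the question.

```python
def keep_matching_categories(doc, wanted):
    """Return a copy of `doc` whose categories are limited to `wanted`."""
    product = dict(doc["product"])  # shallow copy, the input is left untouched
    product["categories"] = [
        category
        for category in product.get("categories", [])
        if category.get("name") == wanted
    ]
    return {"product": product}


doc = {
    "product": {
        "name": "Name of Product",
        "categories": [{"name": "Category 1"}, {"name": "Category 2"}],
    }
}
filtered = keep_matching_categories(doc, "Category 1")
```

This sketch keeps `categories` as a list even when a single entry matches; unwrapping it into a bare object, as in the desired output above, would be one extra step.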
Source: (StackOverflow)
I'm using solr-sunburnt with Django. I have used Nutch to crawl and index my site, and I copied the Nutch schema.xml to Solr.
The problem I'm facing is that when I send a query, the results do not contain the content field.
The results are the same whether I query through sunburnt or directly from Solr (from the browser, :8983/solr/select).
What do I need to do to get the content field in my results?
P.S. I'm a noob when it comes to searching and Solr. :)
Source: (StackOverflow)
I am trying to call methods on an object that allows several different methods to be chained on the same object.
For example, sort(), facet() and exclude() are different methods, each with its own
arguments, and sort_flag, facet_flag and exclude_flag are conditions set to true or false.
They can be called as:
si = sunburnt.SolrInterface("url","schema.xml")
response = si.query(arg1).facet(arg2).sort(arg3).exclude(arg4)
There can be cases when I don't need to call all of these methods at the same time, or when I don't have all the arguments needed to call them. In that situation, how can I call something like si.facet(args).sort(args) conditionally:
if sort_flag:
    -- append sort function to the query
if exclude_flag:
    -- append exclude function
An alternative would be getattr, but it is confusing to use together with each method's arguments, and it may still generate a lot of if checks (for 3 flags, close to 3 factorial statements).
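If each chained method returns a query object, the result can be reassigned to the same variable, which needs one `if` per flag rather than one branch per combination. The sketch below demonstrates this with `FakeQuery`, a hypothetical stand-in that only records the calls; it uses the method names from the question, which may differ from sunburnt's real ones.

```python
class FakeQuery(object):
    """Hypothetical stand-in for a chainable query; every call returns a new object."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def _extend(self, name, arg):
        return FakeQuery(self.steps + ((name, arg),))

    def facet(self, arg):
        return self._extend("facet", arg)

    def sort(self, arg):
        return self._extend("sort", arg)

    def exclude(self, arg):
        return self._extend("exclude", arg)


def build_query(query, facet_arg=None, sort_arg=None, exclude_arg=None):
    """Apply only the steps whose argument was supplied."""
    if facet_arg is not None:
        query = query.facet(facet_arg)
    if sort_arg is not None:
        query = query.sort(sort_arg)
    if exclude_arg is not None:
        query = query.exclude(exclude_arg)
    return query


print(build_query(FakeQuery(), sort_arg="price").steps)
```

With a real query object the same `build_query` body would apply unchanged, as long as each method returns the object to continue chaining on.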
Source: (StackOverflow)
I'm using sunburnt, a python library for talking to Solr. I'm getting some unexpected results and it would help me in debugging if I could see what query was being generated by sunburnt. So instead of doing:
result = query.execute()
I want to do something like
url = query.generate_url()
Is anything like this possible? Are there any hacks that can achieve the same effect?
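A sunburnt traceback quoted further down this page shows the query object calling `self.options()` just before it searches, so, assuming that method returns the request parameters as a dictionary, a URL can be rebuilt from them for debugging. The sketch is written for Python 3; the helper name `build_select_url` and the `/select` handler path are assumptions.

```python
from urllib.parse import urlencode


def build_select_url(base_url, params):
    """Join a Solr base URL with an iterable of (name, value) pairs."""
    # doseq=True expands list values into repeated parameters.
    query_string = urlencode(list(params), doseq=True)
    return "%s/select?%s" % (base_url.rstrip("/"), query_string)


# Assumed usage (not executed here):
#   url = build_select_url(solr_url, query.options().items())
print(build_select_url("http://localhost:8983/solr/", [("q", "title:game"), ("rows", 10)]))
```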
Source: (StackOverflow)
What's the best way to implement sunburnt's highlight response into an application (django based, in this case)?
This link shows how the response is structured.
As they say
The results are shown as a dictionary of dictionaries
which is fairly understandable. What I don't understand is this:
The text is highlighted with HTML, and the fragments should be suitable for dropping straight into a search template
How can I "drop the fragments in the template"? In the example they highlight the word "Game". How can I use those highlighted fragments? Do I have to do a "search-and-replace regex" on my text? Is there another (hopefully smarter) way to deal with this?
I'm really stuck this time, and cannot come up with any solution.
Thanks all in advance.
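One approach that avoids any search-and-replace is to attach each document's fragments to the document before rendering, then print the fragments instead of the raw field. The sketch below assumes the highlighting dictionary is keyed by the document's unique id as a string and maps field names to lists of fragments; the helper name `attach_highlights` is hypothetical.

```python
def attach_highlights(docs, highlighting, key="id"):
    """Return copies of `docs`, each with a 'highlights' dict of field -> fragments."""
    merged = []
    for doc in docs:
        item = dict(doc)
        # Documents without highlighted fragments get an empty dict.
        item["highlights"] = highlighting.get(str(doc[key]), {})
        merged.append(item)
    return merged


docs = [{"id": 1, "title": "Game"}, {"id": 2, "title": "Other"}]
highlighting = {"1": {"title": ["<em>Game</em>"]}}
results = attach_highlights(docs, highlighting)
```

In a Django template each fragment would then be output with the `safe` filter, for example `{{ fragment|safe }}`, so that the highlighting markup is not escaped.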
Source: (StackOverflow)
I'm trying out the Python Solr interface Sunburnt, and I've come across a little problem I can't seem to figure out. From my search field, I want to accept an arbitrary number of words, which I put in an array (e.g. "Music 'Iron Maiden'" -> ['Music', 'Iron Maiden']). This I've figured out (using shlex).
The problem is that Sunburnt syntax for ORing terms is
response = si.query(si.Q(tag = 'Music') | si.Q(tag = 'Iron Maiden'))
How can I iterate over my searchword list and end up with something like the above? Or is there any other way of doing it that I'm not aware of?
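Because the OR syntax is just the `|` operator applied to objects, a list of clauses can be folded with `functools.reduce` and `operator.or_`. The sketch below uses `Term`, a hypothetical stand-in for `si.Q(tag=...)` that renders text so the result can be inspected; with sunburnt the list would hold real `si.Q(...)` objects instead.

```python
from functools import reduce
import operator


class Term(object):
    """Hypothetical stand-in for si.Q(tag=...); supports | like a Q object."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return Term("(%s OR %s)" % (self.text, other.text))


def any_of(clauses):
    """Combine a non-empty list of clauses with the | operator."""
    return reduce(operator.or_, clauses)


words = ["Music", "Iron Maiden"]
combined = any_of([Term('tag:"%s"' % word) for word in words])

# Assumed usage with sunburnt (not executed here):
#   response = si.query(any_of([si.Q(tag=word) for word in words]))
```

`reduce` raises `TypeError` on an empty list, so an empty search box needs to be handled before calling `any_of`.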
Source: (StackOverflow)
I need a way of using the Solr wildcard *:* in sunburnt, or is there another way of specifying 'all documents' from the index and then refining? Here is the code:
....
si = sunburnt.SolrInterface(url=solr_url,http_connection=h)
search_terms = {SEARCH_TERMS_COMIN_FROM_A_FORM}
#!This is where I need help!
result = si.query(WILDCARD)  # I need all the docs from the index
# then I can do this
if search_terms['province']:
    result = result.query(province=search_terms['province'])
if search_terms['town']:
    result = result.query(town=search_terms['town'])
.......#two other similar if statement blocks
# finally
results = result.execute()
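An alternative to starting from a wildcard is to collect only the non-empty form values and pass them to a single query call. The helper `active_filters` below is hypothetical; the commented usage assumes that keyword arguments to `si.query()` are combined with AND and that a query with no arguments matches every document, both of which are worth verifying against the installed sunburnt version.

```python
def active_filters(search_terms, fields=("province", "town")):
    """Keep only the requested fields that have a non-empty value."""
    return {field: search_terms[field] for field in fields if search_terms.get(field)}


# Assumed usage (not executed here):
#   results = si.query(**active_filters(search_terms)).execute()
print(active_filters({"province": "Gauteng", "town": ""}))
```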
Source: (StackOverflow)
I am using the sunburnt Solr API and I want to make a query like this:
solrconn.query(solrconn.Q("disease")|solrconn.Q("heart")).highlight("content").highlight("title")
The above query runs correctly, but I want to make this portion dynamic:
solrconn.Q("disease")|solrconn.Q("heart")
For this I am doing:
search_words=search_text.split(" ")
bitwiseQuery=""
count=0
for word in search_words:
    count=count+1
    if count<len(search_words):
        bitwiseQuery+='solrconn.Q("'+word+'")|'
    if count==len(search_words):
        bitwiseQuery+='solrconn.Q("'+word+'")'
search_record=(solrconn.query(bitwiseQuery)).highlight("content").highlight("title")
But it is not giving me any results. Any idea how I can do this?
Source: (StackOverflow)
I came upon the following error trace while just playing with this interface I plan to use in a Django app:
import sunburnt
si = sunburnt.SolrInterface("http://localhost:8984/solr/sprod/")
si.query(global_attr_article_type='casual shoes').execute()
Traceback (most recent call last):
File "", line 1, in
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/search.py", line 599, in execute
    result = self.interface.search(**self.options())
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/sunburnt.py", line 212, in search
    return self.schema.parse_response(self.conn.select(params))
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 510, in parse_response
    return SolrResponse(self, msg)
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 652, in __init__
    self.result = SolrResult(schema, result_node)
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 691, in __init__
    self.docs = [schema.parse_result_doc(n) for n in node.xpath("doc")]
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 519, in parse_result_doc
    return dict([self.parse_result_doc(n) for n in doc.getchildren()])
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 516, in parse_result_doc
    values = [self.parse_result_doc(n, name) for n in doc.getchildren()]
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 525, in parse_result_doc
    return name, SolrFieldInstance.from_solr(field_class, doc.text or '').to_user_data()
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 326, in from_solr
    self.value = self.field.from_solr(data)
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 161, in from_solr
    return self.normalize(value)
  File "/usr/local/lib/python2.7/dist-packages/sunburnt/schema.py", line 219, in normalize
    (value, self.__class__, self.name))
SolrError: is invalid value for class 'sunburnt.schema.SolrFieldType_SolrIntField_indexed_True_omitNorms_True_stored_True' (field designer)
The designer field in the indexed document is indeed empty:
<arr name="designer">
  <int/>
</arr>
<arr name="discount">
  <float>0.0</float>
</arr>
<arr name="discount_label">
  <str/>
</arr>
and here's what the schema's got:
<fieldType name="integer" class="solr.IntField" omitNorms="true"/>
..
...
....
<field name="designer" type="integer" indexed="true" stored="true"/>
I understand this has to do with the field being empty but since the schema doesn't mention 'required' = true anywhere for this field, I wonder what's really up.
Source: (StackOverflow)