EzDevInfo.com

rsolr

A Ruby client for Apache Solr

Group documents with rSolr

I am building a Ruby on Rails application that uses Solr 4 through the RSolr gem. I want to run a query and have the resulting documents grouped by a numeric field. Querying itself is not a problem, but I can't find the correct syntax to make RSolr send the grouping part of the request. Does anyone know what the correct syntax is?
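For illustration, Solr's result grouping is driven by plain request parameters (`group`, `group.field`, `group.limit`), and RSolr forwards whatever is placed under `:params`. The sketch below is a minimal helper under those assumptions; the method name `grouped_search` and the field name used in the usage note are hypothetical, and `solr` can be any object that responds to `get` the way an RSolr client does.

```ruby
# Sends a select request with Solr result-grouping parameters.
# `solr` is expected to respond to #get(path, :params => {...}),
# as an RSolr client does; `grouped_search` itself is a hypothetical helper.
def grouped_search(solr, query, group_field, group_limit = 1)
  solr.get('select', :params => {
    :q             => query,
    :group         => true,         # switch result grouping on
    :'group.field' => group_field,  # field whose values define the groups
    :'group.limit' => group_limit   # documents returned per group
  })
end
```

With a real client this might be called as `grouped_search(RSolr.connect(:url => 'http://localhost:8983/solr'), '*:*', 'category_id')`, where `category_id` stands in for the numeric field. Grouped results normally come back under the `grouped` key of the response rather than under `response`.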


Source: (StackOverflow)

sunspot heroku websolr authorization

We are using sunspot-rails to connect to websolr on Heroku. Websolr provides an authorization feature to protect read and update calls. This feature requires three additional HTTP headers to be present in every call to Solr. I am trying to find a way to add these headers to every call made by Sunspot. The following article shows how to do it for rsolr but not for Sunspot: https://github.com/onemorecloud/websolr-demo-advanced-auth. The official Heroku doc at https://devcenter.heroku.com/articles/websolr has very little information about authorization. Is there a way to alter HTTP headers through Sunspot?


Source: (StackOverflow)

SSL Support for sunspot (solr) requests

In our Rails project we are using the sunspot gem for the Solr full-text search engine. Sunspot is built on top of the RSolr library, which provides a low-level interface for Solr interaction.

Is there any way to make RSolr send its requests to the Solr server over HTTPS instead of HTTP?
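As a starting point, both RSolr and Sunspot are configured with a full Solr URL, so the scheme of that URL is the natural place to ask for HTTPS. The sketch below only builds such a URL with Ruby's standard library; the helper name `solr_url`, the host, the port and the path are all hypothetical, and whether the HTTP layer then verifies certificates as desired would still need checking.

```ruby
require 'uri'

# Builds a Solr base URL; only the scheme differs between HTTP and HTTPS.
# `solr_url` is a hypothetical helper, not part of RSolr or Sunspot.
def solr_url(host, port, path, ssl = true)
  builder = ssl ? URI::HTTPS : URI::HTTP
  builder.build(:host => host, :port => port, :path => path).to_s
end

puts solr_url('solr.example.com', 8443, '/solr/default')
```

The resulting string could then be passed to `RSolr.connect(:url => ...)` or assigned to `Sunspot.config.solr.url`.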


Source: (StackOverflow)

How to upload a file with rsolr?

I have a file which needs to be indexed on our solr server. How can I upload a file? I know how to do it with curl: curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@tutorial.html"

(from http://wiki.apache.org/solr/ExtractingRequestHandler ) but I don't know how to translate that to the rsolr rubygem.

Thanks in advance.
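One possible translation, sketched under assumptions: RSolr clients accept arbitrary handler paths through `post`, with `:params`, `:data` and `:headers` options, so the file can be sent as the raw request body with its content type. Unlike `curl -F`, this does not build a multipart form. The helper name `extract_file` is hypothetical, and the parameters simply mirror the curl command above.

```ruby
# Posts a file to the ExtractingRequestHandler as a raw request body.
# `solr` is expected to respond to #post(path, opts) like an RSolr client;
# `extract_file` is a hypothetical helper.
def extract_file(solr, file_path, doc_id, content_type)
  solr.post('update/extract',
    :params => {
      :'literal.id'   => doc_id,
      :uprefix        => 'attr_',
      :'fmap.content' => 'attr_content',
      :commit         => true
    },
    :data    => File.binread(file_path),            # file contents as the body
    :headers => { 'Content-Type' => content_type }) # e.g. 'text/html'
end
```

A call might look like `extract_file(solr, 'tutorial.html', 'doc1', 'text/html')`.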


Source: (StackOverflow)

sunspot how query specific field

Reading through the Sunspot documentation, I am trying to find out how to query autocomplete fields.

I created them as described at https://github.com/haitham/sunspot_autocomplete

Here is the config in solr/conf/schema.xml:

<types>
.......
  <fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.LetterTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <!--tokenizer class="solr.KeywordTokenizerFactory"/-->
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <fieldType name="autosuggest" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.LetterTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
.....
</types>
<fields>
....
  <dynamicField name="*_ac" type="autocomplete" indexed="true" stored="true"/>
  <dynamicField name="*_as" type="autosuggest" indexed="true" stored="true"/>
....
</fields>

I also have a model with searchable fields:

class Post < ActiveRecord::Base
  attr_accessible :text, :user_id

  has_one :user

  searchable do
    text :text
    integer :user_id, :references => User
    autocomplete :post_text, :using => :text
  end
end

When I try to search against the autocomplete field like so

Sunspot.search(Post) { keywords('ra', :fields => :'autocomplete') }

I get this error:

Sunspot::UnrecognizedFieldError: No text field configured for Post with name 'autocomplete'

What am I doing wrong?

I have shared the application on GitHub so you can try it: https://github.com/pironim/my_sunspot_app.git


Source: (StackOverflow)

Rsolr::Ext return undefined method `to_i' for ["10", "10"]:Array

  • I have a problem with RSolr::Ext.
  • It occurs when I query with the page and per_page params, using the rsolr-ext library to connect to Apache Solr.

I get the following error:

NoMethodError: undefined method `to_i' for ["10", "10"]:Array
from /home/khanhpn/.rvm/gems/ruby-2.2.2@music/gems/rsolr-ext-1.0.3/lib/rsolr-ext/response.rb:27:in `rows'

This is my code:

@solr_connection = RSolr::Ext.connect(
  url: "http://localhost:8080/solr/music",
  open_timeout: 10,
  read_timeout: 10,
  retry_503: 2)

solr_params = {
     :page => 0,
     :per_page => 10,
     :field_names => [:id, :title],
     :queries => "xuan"
}

@solr_connection.find(solr_params)

I hope someone can help me. Thank you very much.
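The trace shows that the value rsolr-ext reads for `rows` is the Array `["10", "10"]` rather than a single String, as if the parameter had reached Solr twice; why that happens is not established here. The failing call can be reproduced in plain Ruby, and the sketch below adds a hypothetical coercion helper, `first_int`, that is not part of rsolr-ext.

```ruby
# Reproduces the failure: Array does not define #to_i.
begin
  ["10", "10"].to_i
rescue NoMethodError => e
  puts e.message
end

# Hypothetical helper: take the first value when a parameter arrives repeated.
def first_int(value)
  Array(value).first.to_i
end

puts first_int(["10", "10"])
puts first_int("10")
```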


Source: (StackOverflow)

When should I create Solr connection in a Rails app

I'm accessing Solr in a Ruby on Rails application by using rsolr (not Sunspot). I create the local solr object that I use to send requests like this:

solr = RSolr.connect(:url => "http://localhost:8983/solr")

As far as I understand, this is not really a connection but just an object that will issue requests on demand, so it shouldn't be expensive to keep it initialized and it should never disconnect. Given that, it should be OK to have one global solr object, create it at start time and forget about it. Right? But maybe it's not thread safe?

When should I create the solr connection?
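Whether a single shared object is safe across threads is not settled by the question. One conservative sketch, under that uncertainty, is to create the object lazily and keep one per thread. The module name `SolrConnection` and its `factory` are hypothetical; the lambda returning `Object.new` is only a stand-in for something like `RSolr.connect(:url => "http://localhost:8983/solr")`.

```ruby
# Lazily creates one Solr client object per thread.
module SolrConnection
  # Stand-in factory (assumption). In a real app this could be
  #   -> { RSolr.connect(:url => "http://localhost:8983/solr") }
  @factory = -> { Object.new }

  class << self
    attr_writer :factory

    # Returns the calling thread's client, creating it on first use.
    def current
      Thread.current[:solr_connection] ||= @factory.call
    end
  end
end

first  = SolrConnection.current
second = SolrConnection.current
puts first.equal?(second)
```

The trade-off is a little duplicated setup per thread in exchange for not having to reason about shared state inside the client.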


Source: (StackOverflow)

Query solr cluster for state of nodes

I'm trying to tweak our system status check to see the state of the Solr nodes in our SolrCloud. I'm facing the following problems:

We send a query to each of the Solr nodes separately. If we get a response and the status of the response is 0, we assume the node is running. Unfortunately, we've seen cases in which the node is recovering or even down and select queries are still handled.

Hoping to prevent this, we added a check that sends a ping request to Solr. If the status returned by this request reads 'OK', we assume the node is up. Unfortunately, this check doesn't fail either when the node is recovering or down.

My question is: What is the correct way to check the status of a node in SolrCloud?
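One option to look at, in Solr versions that provide it, is the Collections API `CLUSTERSTATUS` action (`/solr/admin/collections?action=CLUSTERSTATUS&wt=json`), which reports a state per replica together with the list of live nodes. The sketch below walks such a response and lists replicas that are not `active` or whose node is missing from `live_nodes`. The sample payload is hand-written for illustration, and the helper name `unhealthy_replicas`, the node names and the collection name are assumptions.

```ruby
require 'json'

# Returns [collection, shard, replica, state] for every replica that is
# not "active" or whose node is absent from live_nodes.
def unhealthy_replicas(cluster_status)
  live        = cluster_status.dig('cluster', 'live_nodes') || []
  collections = cluster_status.dig('cluster', 'collections') || {}
  bad = []
  collections.each do |cname, coll|
    (coll['shards'] || {}).each do |sname, shard|
      (shard['replicas'] || {}).each do |rname, replica|
        ok = replica['state'] == 'active' && live.include?(replica['node_name'])
        bad << [cname, sname, rname, replica['state']] unless ok
      end
    end
  end
  bad
end

# Hand-written sample in the shape of a CLUSTERSTATUS response (illustrative only).
sample = JSON.parse(<<~SAMPLE)
  {
    "cluster": {
      "collections": {
        "collection1": {
          "shards": {
            "shard1": {
              "replicas": {
                "core_node1": {"state": "active", "node_name": "10.0.0.1:8983_solr"},
                "core_node2": {"state": "recovering", "node_name": "10.0.0.2:8983_solr"}
              }
            }
          }
        }
      },
      "live_nodes": ["10.0.0.1:8983_solr", "10.0.0.2:8983_solr"]
    }
  }
SAMPLE

unhealthy_replicas(sample).each { |row| puts row.join(' ') }
```

Checking `live_nodes` as well as the replica state matters because a recorded state can be stale when a node has dropped out of the cluster.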


Source: (StackOverflow)

solr newly added documents on indexes not reflected before restart

I am seeing strange behavior when inserting into Solr. Newly added documents never show up in the index; I have to restart the Solr app instance on the GlassFish server to see the updates. This happens every time. Initially I was using Lucene and it worked fine.

I am not sure what went wrong here; I checked everything and it seems OK. Is this how Solr normally works, taking some time to update the index after a new insert? The documents never appeared in the index even after a long time (and there were only 10 of them).

Does anyone have an idea how to fix this?

Update: I am using the rsolr Ruby wrapper to connect to Solr.
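One thing worth ruling out is a missing commit: Solr does not expose added documents to searches until a commit happens, and a restart can make them visible as a side effect. A minimal sketch, assuming that is the cause here, follows. The helper name `add_and_commit` is hypothetical, and `solr` is any object with `add` and `commit`, as an RSolr client has.

```ruby
# Adds documents and then commits so that new searches can see them.
# `add_and_commit` is a hypothetical helper around RSolr-style #add/#commit.
def add_and_commit(solr, docs)
  response = solr.add(docs)
  solr.commit # without a commit the documents stay invisible to searches
  response
end
```

For larger batches, committing once at the end rather than after every add is usually cheaper.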


Source: (StackOverflow)

sunspot:reindex error - getaddrinfo: nodename nor servname provided, or not known

I'm working with a Rails 3.2 application that has a mysql database and a number of models that are being indexed in Solr.

Here's what's happening:

I am running the following command:

RAILS_ENV=development bundle exec rake sunspot:reindex[1000] --trace

After indexing about 12% of the 4 million records (although it's a different percentage every time), the process inevitably bombs out with the following error and stack trace:

rake aborted!
getaddrinfo: nodename nor servname provided, or not known
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:762:in `initialize'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:762:in `open'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:762:in `block in connect'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/timeout.rb:54:in `timeout'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/timeout.rb:99:in `timeout'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:762:in `connect'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:755:in `do_start'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:744:in `start'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/net/http.rb:1284:in `request'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rsolr-1.0.9/lib/rsolr/connection.rb:15:in `execute'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/solr_instrumentation.rb:14:in `block in execute_with_as_instrumentation'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/activesupport-3.2.13/lib/active_support/notifications.rb:123:in `block in instrument'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/activesupport-3.2.13/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/activesupport-3.2.13/lib/active_support/notifications.rb:123:in `instrument'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/solr_instrumentation.rb:12:in `execute_with_as_instrumentation'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rsolr-1.0.9/lib/rsolr/client.rb:167:in `execute'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rsolr-1.0.9/lib/rsolr/client.rb:161:in `send_and_receive'
(eval):2:in `post'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rsolr-1.0.9/lib/rsolr/client.rb:67:in `update'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rsolr-1.0.9/lib/rsolr/client.rb:87:in `add'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/indexer.rb:106:in `add_documents'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/indexer.rb:30:in `add'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/session.rb:91:in `index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/session_proxy/abstract_session_proxy.rb:11:in `index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/session_proxy/retry_5xx_session_proxy.rb:17:in `method_missing'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/session_proxy/abstract_session_proxy.rb:11:in `index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot.rb:184:in `index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/searchable.rb:261:in `block (2 levels) in solr_index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/searchable.rb:365:in `solr_benchmark'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/searchable.rb:260:in `block in solr_index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/activerecord-3.2.13/lib/active_record/relation/batches.rb:72:in `find_in_batches'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/activerecord-3.2.13/lib/active_record/querying.rb:8:in `find_in_batches'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/searchable.rb:259:in `solr_index'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/searchable.rb:203:in `solr_reindex'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/tasks.rb:64:in `block (3 levels) in <top (required)>'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/class_set.rb:16:in `each'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot-2.0.0/lib/sunspot/class_set.rb:16:in `each'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/sunspot_rails-2.0.0/lib/sunspot/rails/tasks.rb:63:in `block (2 levels) in <top (required)>'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:246:in `call'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:246:in `block in execute'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:241:in `each'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:241:in `execute'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:184:in `block in invoke_with_call_chain'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:177:in `invoke_with_call_chain'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/task.rb:170:in `invoke'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:143:in `invoke_task'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:101:in `block (2 levels) in top_level'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:101:in `each'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:101:in `block in top_level'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:110:in `run_with_threads'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:95:in `top_level'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:73:in `block in run'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:160:in `standard_exception_handling'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/lib/rake/application.rb:70:in `run'
/Users/tchapin/.rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/rake-10.0.4/bin/rake:33:in `<top (required)>'
/Users/tchapin/.rbenv/versions/1.9.3-p392/bin/rake:23:in `load'
/Users/tchapin/.rbenv/versions/1.9.3-p392/bin/rake:23:in `<main>'

The app is running in development mode at localhost:3000, and solr is running at localhost:8982. Here's my solr.rake file:

Rake::Task['sunspot:reindex'].enhance ['sunspot:scope_models_for_index']
Rake::Task['sunspot:solr:reindex'].enhance ['sunspot:scope_models_for_index']

namespace 'sunspot' do
  task :scope_models_for_index => :environment do
    require 'rsolr/error'
    Dir.glob(Rails.root.join('app/models/**/*.rb')).each { |path| require path }

    # Run the garbage collector before every commit
    commit_extension = Module.new do
      def commit
        GC.start
        super
      end
    end

    Sunspot.extend commit_extension

    # Set each model's default scope to its index scope
    Sunspot.searchable.each do |model|
      model.class_eval do
        default_scope ->{ sunspot_index } if model.respond_to?(:sunspot_index)
      end
    end
  end
end

Anyone know what might be causing this error, or how to fix it?
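The trace ends in `Net::HTTP` opening a socket, and `getaddrinfo` failures surface in Ruby as `SocketError`. Since the failure point differs on every run, one possibility is an intermittent name-resolution problem rather than bad data. The sketch below retries a block on `SocketError` with exponential backoff. It does not address the root cause, `with_retries` is a hypothetical helper, and how to hook it into Sunspot's indexing is left open.

```ruby
require 'socket' # defines SocketError

# Runs the block, retrying on the given errors with exponential backoff.
# Re-raises the last error once `attempts` tries have been used up.
def with_retries(attempts = 3, base_delay = 0.5, errors = [SocketError])
  tries = 0
  begin
    tries += 1
    yield
  rescue *errors
    raise if tries >= attempts
    sleep(base_delay * (2**(tries - 1))) # 0.5s, 1s, 2s, ... by default
    retry
  end
end
```

A call would wrap the network operation, for example `with_retries { solr.add(docs) }`.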


Source: (StackOverflow)

Using POST instead of GET for long queries in Sunspot / Solr

I need to run a long query against Solr, but Sunspot uses GET as the default method. I know that POST is supported by RSolr, but I don't know if I can use it through Sunspot.

Thanks!


Source: (StackOverflow)

Fetch top terms with sunspot

Solr has an admin console page such as

http://localhost:8982/solr/admin/schema.jsp

which shows schema field data such as the top terms and their frequencies for a specific field.

I am using Sunspot. How can I query this kind of data, for example the top 10 terms?
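At the Solr level, one way to get this kind of data is the TermsComponent, typically exposed as a `/terms` request handler, with `terms.fl`, `terms.limit` and `terms.sort` parameters. The sketch below assumes that handler is configured and that the response lists terms as a flat `[term, count, term, count, ...]` array. The helper name `top_terms` and the field name in the usage note are hypothetical, and how to obtain a suitable client object from Sunspot is not covered here.

```ruby
# Asks the TermsComponent for the most frequent terms of one field.
# `solr` is expected to respond to #get(path, :params => {...}) like an
# RSolr client. Assumes a flat [term, count, ...] response layout.
def top_terms(solr, field, limit = 10)
  response = solr.get('terms', :params => {
    :terms         => true,
    :'terms.fl'    => field,
    :'terms.limit' => limit,
    :'terms.sort'  => 'count' # most frequent first
  })
  flat = (response['terms'] || {})[field] || []
  flat.each_slice(2).to_a # => [[term, count], ...]
end
```

For example, `top_terms(solr, 'title_text')` would return pairs of term and document frequency. Sunspot usually indexes fields under suffixed names, so the Solr-side field name may differ from the attribute name in the model.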


Source: (StackOverflow)

Sunspot and Rsolr query

I'm reading the Sunspot documentation and found that Sunspot is based on the RSolr library. Is there any way to get the connection and perform a low-level request, as in this pseudo-code:

solr = Sunspot.connection
response = solr.get 'select', :params => {:q => '*:*'}

Source: (StackOverflow)

Where's my time going mysteriously?

I have a Ruby script that uses the rsolr gem to generate XML documents and POST them to Apache Solr (javadoc Update Command), served by a Jetty server. My script logs timings using the following code:

unless docs.empty?
  begin
    log.info("Adding to solr")
    response = solr.add(docs)
    log.info("#{(id_2*100.0)/last_id}% Done")
    if response['responseHeader']['status'] != 0
      log.fatal("Document ids not sent")
      #log.fatal(Solr::Request::AddDocument.new(docs_single).to_s)
      log.close
      exit
    end
    log.info("#{Time.now.to_f - starttime}s to feed Solr. #{id_1} to #{id_2}")
  rescue StandardError => e # not Exception, so the SystemExit raised by exit above is not swallowed
    log.fatal("Document ids not sent => #{e.message}")
    #log.fatal(Solr::Request::AddDocument.new(docs_single).to_s)
    #log.fatal(docs)
    log.close
    exit
  end
end

The log generated goes like

I, [2011-10-09T15:03:42.617048 #30092]  INFO -- : Executing - SELECT * FROM solr_feeddata_2 WHERE id >= 5879999 AND id < 5881999
I, [2011-10-09T15:03:44.086661 #30092]  INFO -- : External Data1 fetch time: 1.45462989807129
I, [2011-10-09T15:03:44.109514 #30092]  INFO -- : External Data2 fetch time: 0.0226790904998779
I, [2011-10-09T15:03:44.109611 #30092]  INFO -- : 1.49255704879761s to fetch details from database. 5879999 to 5881999
I, [2011-10-09T15:03:44.109702 #30092]  INFO -- : Adding data1, data2, building docs
I, [2011-10-09T15:03:45.912603 #30092]  INFO -- : 3.29554414749146s to build documents. 5879999 to 5881999
I, [2011-10-09T15:03:45.912730 #30092]  INFO -- : Adding to solr
I, [2011-10-09T15:04:24.797620 #30092]  INFO -- : 61.180855194502% Done
I, [2011-10-09T15:04:24.797744 #30092]  INFO -- : 42.180694103241s to feed Solr. 5879999 to 5881999

According to this log, Solr took (42.18 - 3.29 - 1.49 - 2) 35.4s to respond. (See below comment)

At the same time my Solr log for this particular update goes like

INFO: {add=[5879999, 5880000, 5880001, 5880002, 5880003, 5880004, 5880005, 5880007, ... (1468 adds)]} 0 5780
Oct 9, 2011 3:04:24 PM org.apache.solr.core.SolrCore execute
INFO: [core0] webapp=/solr path=/update params={wt=ruby} status=0 QTime=5780 
Oct 9, 2011 3:04:42 PM org.apache.solr.update.processor.LogUpdateProcessor finish

This clearly shows that Solr took 5.78s to add the docs, start sending the response, and close the log updater.
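As a cross-check on the figures, the time the client spent inside `solr.add` can be read directly from the two script log lines around it ("Adding to solr" and "% Done") and compared with the `QTime` Solr reports. The sketch below does that arithmetic; the helper name `log_time` is hypothetical, and the timestamps are treated as UTC only because just their difference matters.

```ruby
require 'time'

# Extracts the timestamp from a Ruby Logger line such as
# "I, [2011-10-09T15:03:45.912730 #30092]  INFO -- : Adding to solr".
def log_time(line)
  stamp = line[/\[(\S+) #\d+\]/, 1]
  Time.iso8601("#{stamp}Z") # treated as UTC; only the difference is used
end

sent     = log_time('I, [2011-10-09T15:03:45.912730 #30092]  INFO -- : Adding to solr')
returned = log_time('I, [2011-10-09T15:04:24.797620 #30092]  INFO -- : 61.180855194502% Done')

client_seconds = returned - sent       # wall-clock time spent in solr.add
solr_seconds   = 5780 / 1000.0         # QTime from the Solr log, in milliseconds
unaccounted    = client_seconds - solr_seconds

puts format('client: %.2fs, Solr QTime: %.2fs, unaccounted: %.2fs',
            client_seconds, solr_seconds, unaccounted)
```

Read this way, the client waited about 38.9s for a request that Solr itself timed at 5.78s. `QTime` covers request handling inside Solr only, so the remainder could include building the XML on the client, transferring it, and writing and parsing the response.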

Both the services run on same machine, inside the network, and their ping summary is

rtt min/avg/max/mdev = 0.008/0.010/0.022/0.006 ms

This pattern is clearly visible for every batch of data processed. Despite my best efforts to solve this mystery, I cannot find the reason for this behavior.

My Solr mergeFactor is 10, autoCommit is off.


Source: (StackOverflow)

Rails 4 w/ Heroku RSolr::Error::Http (RSolr::Error::Http - 404 Not Found

My setup: Rails 4, Heroku with the Websolr addon.

The Solr search had been working just fine for months with my Rails 4 (production) application. Then one day it quit, and I went through the following Stack Overflow answers, none of which worked.

Answer 1, Answer 2, Answer 3

Since I didn't have Java installed at one point, they were pretty tedious to go through. I do need to mention, before I answer this, that I went through all three answers BEFORE I solved it, so something I did along the way may have enabled the fix.


Source: (StackOverflow)