Installing apachesolr.module on Ubuntu 10.04

Ubuntu 10.04 (Lucid Lynx) just made installing the Apache Solr module for Drupal sooooo much easier. This setup won't work for Ubuntu 9.10 (Karmic Koala), because the aptitude install bit there only works for tomcat5.5. If you're on 9.10, check out Nick Veenhof's article. Otherwise, enjoy!

1. Install Tomcat & Solr

sudo tasksel install tomcat-server
sudo aptitude install solr-tomcat

2. Download the Solr module

cd /var/www/drupal/sites/all/modules
drush dl apachesolr
cd apachesolr
wget http://solr-php-client.googlecode.com/files/SolrPhpClient.r22.2009-11-09...
tar xzf SolrPhpClient.r22.2009-11-09.tgz

3. Connect the two

cd /etc/solr/conf
sudo mv schema.xml schema.xml.bk
sudo mv solrconfig.xml solrconfig.xml.bk
sudo ln -s /var/www/drupal/sites/all/modules/apachesolr/schema.xml
sudo ln -s /var/www/drupal/sites/all/modules/apachesolr/solrconfig.xml
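
If you'd rather be able to re-run this step, the backup-and-link commands above can be wrapped up roughly like this. It's only a sketch: link_solr_conf is a made-up helper name, and both directories are passed in as arguments so you can try it on a scratch copy first.

```shell
# Sketch only: link_solr_conf is a hypothetical helper, not part of any package.
# Usage: link_solr_conf <solr conf dir> <apachesolr module dir>
link_solr_conf() {
  conf_dir=$1
  module_dir=$2
  for f in schema.xml solrconfig.xml; do
    # Stop early if the module does not ship the file we want to link.
    [ -f "$module_dir/$f" ] || { echo "missing $module_dir/$f" >&2; return 1; }
    if [ -L "$conf_dir/$f" ]; then
      # Already a symlink (e.g. from an earlier run): just drop it.
      rm "$conf_dir/$f" || return 1
    elif [ -e "$conf_dir/$f" ]; then
      # A real file: keep it as .bk, but never overwrite an existing backup.
      [ ! -e "$conf_dir/$f.bk" ] || { echo "backup exists: $conf_dir/$f.bk" >&2; return 1; }
      mv "$conf_dir/$f" "$conf_dir/$f.bk" || return 1
    fi
    # Name the link explicitly instead of relying on the current directory.
    ln -s "$module_dir/$f" "$conf_dir/$f" || return 1
  done
}

# On a real box this needs root to write under /etc/solr/conf, for example:
#   link_solr_conf /etc/solr/conf /var/www/drupal/sites/all/modules/apachesolr
```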

4. The tricky bit: a change in solrconfig.xml

Looks like the old solrconfig.xml points to a directory that Lucid doesn't like. You'll need to update it to point to /var/lib/solr/data (see the end of this post for more information). It should look like this in the end:

  <!-- Used to specify an alternate directory to hold all index data
       other than the default ./data under the Solr home.
       If replication is in use, this should match the replication configuration. -->
  <dataDir>/var/lib/solr/data</dataDir>
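
If you don't feel like opening an editor, something along these lines should make the same change. Treat it as a sketch: set_data_dir is a made-up helper name, and it assumes the <dataDir> element sits on a single line. If the element is wrapped in <!-- --> markers on the neighbouring lines, those markers still have to be removed by hand.

```shell
# Sketch only: set_data_dir is a hypothetical helper.
# Usage: set_data_dir <path to solrconfig.xml> <new data directory>
set_data_dir() {
  config=$1
  data_dir=$2
  tmp=$(mktemp) || return 1
  # Swap whatever sits between the dataDir tags for the new path.
  # '|' is the sed delimiter because the path itself contains slashes.
  if ! sed "s|<dataDir>[^<]*</dataDir>|<dataDir>$data_dir</dataDir>|" "$config" > "$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  # Write back through the existing name so a symlinked config stays a symlink.
  cat "$tmp" > "$config"
  status=$?
  rm -f "$tmp"
  return $status
}

# For example (needs write access to the file):
#   set_data_dir /var/www/drupal/sites/all/modules/apachesolr/solrconfig.xml /var/lib/solr/data
```

Writing through cat rather than moving the temp file into place matters here because /etc/solr/conf/solrconfig.xml is a symlink after step 3.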

5. Finalize

sudo service tomcat6 restart

  • /admin/build/modules -> Enable Apache Solr modules
  • /admin/settings/apachesolr -> Solr Port: 8080 (note, I personally had to set "Solr host name" to my domain, rather than localhost... strange)
  • /cron.php
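
Before blaming Drupal for a failed connection, it can help to ask Solr directly whether it is up. A rough sketch, assuming curl is installed and that Solr is deployed under /solr with the /admin/ping handler enabled (both are assumptions about your setup; solr_ping and solr_ping_url are made-up names):

```shell
# Sketch only: helper names and the /solr/admin/ping path are assumptions.
solr_ping_url() {
  # $1 = host, $2 = port
  printf 'http://%s:%s/solr/admin/ping\n' "$1" "$2"
}

solr_ping() {
  # --fail makes curl exit non-zero on HTTP errors such as 404 or 500;
  # --max-time keeps the check from hanging on a dead host.
  curl --silent --fail --max-time 5 --output /dev/null "$(solr_ping_url "$1" "$2")"
}

# For example:
#   solr_ping localhost 8080 && echo "Solr is answering" || echo "no answer from Solr"
```

Trying it with both localhost and your domain name is a quick way to see which host name the settings page will accept.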

You're done!



Explanation of #4

If you're using solrconfig.xml provided with apachesolr.module, you may get the following error:
message Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError> in null
-------------------------------------------------------------
java.lang.RuntimeException: java.io.IOException: Cannot create directory: /usr/share/solr/data/index
    at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:398)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:546)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:115)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:4488)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
    at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
    at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498)
    at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
    at org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
    at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
    at org.apache.catalina.core.StandardService.start(StandardService.java:516)
    at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
Caused by: java.io.IOException: Cannot create directory: /usr/share/solr/data/index
    at org.apache.lucene.store.FSDirectory.createDir(FSDirectory.java:349)
    at org.apache.lucene.store.FSDirectory.initOutput(FSDirectory.java:359)
    at org.apache.lucene.store.NIOFSDirectory.createOutput(NIOFSDirectory.java:75)
    at org.apache.lucene.index.SegmentInfos.write(SegmentInfos.java:330)
    at org.apache.lucene.index.SegmentInfos.prepareCommit(SegmentInfos.java:809)
    at org.apache.lucene.index.SegmentInfos.commit(SegmentInfos.java:893)
    at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1574)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1407)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190)
    at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:393)
    ... 30 more

The module's XML expects to create the Solr index at /usr/share/solr/data, but Ubuntu's solr-tomcat prefers /var/lib/solr/data. You just need to replace the commented-out default below with the uncommented <dataDir> line listed in #4 above:

  <!-- Used to specify an alternate directory to hold all index data
       other than the default ./data under the Solr home.
       If replication is in use, this should match the replication configuration. -->
<!--
  <dataDir>${solr.data.dir:./solr/data}</dataDir>
-->
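
If you're not sure this is the error you're hitting, you can pull the offending path straight out of the Tomcat log. A small sketch, assuming a grep that supports -o; find_blocked_dirs is a made-up name and the log location in the example is my guess for Lucid's tomcat6 package:

```shell
# Sketch only: find_blocked_dirs is a hypothetical helper.
# Usage: find_blocked_dirs <log file>
find_blocked_dirs() {
  # -o prints only the matching part of each line;
  # sort -u collapses the repeats that a stack trace produces.
  grep -o 'Cannot create directory: [^ ]*' "$1" | sort -u
}

# For example (log path is an assumption):
#   find_blocked_dirs /var/log/tomcat6/catalina.out
```

If the path it prints is under /usr/share/solr, the dataDir setting is still pointing at the old location.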


5 Comments

Multi-core configuration

Great post, very easy to follow. My question is: do you have any idea how to do a multi-core setup using the solr-tomcat package? I'm a bit of an Ubuntu neophyte and spent a while trying to figure out where to put the folders for the cores, how to specify the data directories, etc. It seems that some of my issues are coming from the dataDir issue that you describe. For example, creating symlinks from the core directories in /var/lib/solr/data/ to /usr/share/solr/ resulted in broken symlinks in /usr/share/solr/... Any advice would be most appreciated. Thanks!
-Boden

Re: Multi-core configuration

http://www.drupalconnect.com/blog/steve/configuring-apache-solr-multi-co...

Excellent post. There is
Submitted by Jur de Vries (not verified) on July 8, 2010 - 8:16am.
Excellent post. There is only one thing: I'm missing the part about the config file for Solr (/etc/tomcat6/Catalina/localhost/solr.xml).

I succeeded. Good luck to you!

HTTP Status 500 - org/apache/lucene/index/memory/MemoryIndex

There is a bug in the solr-tomcat package which omits a required library (lucene-memory.jar) from Solr's classpath.

The bug only becomes apparent when you search for an exact match.

All that needs to be done is this:

cd /usr/share/solr/WEB-INF/lib
sudo ln -s ../../../java/lucene-memory.jar lucene-memory.jar
sudo service tomcat6 restart
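
A relative symlink like this is easy to get wrong by one "../", so it may be worth checking that it actually resolves. A minimal sketch (check_link is a made-up helper name, and it assumes readlink is available):

```shell
# Sketch only: check_link is a hypothetical helper.
# Usage: check_link <path to symlink>
check_link() {
  link=$1
  if [ ! -L "$link" ]; then
    echo "not a symlink: $link"
    return 2
  fi
  # -e follows the link, so it is false when the target is missing.
  if [ -e "$link" ]; then
    echo "ok: $link -> $(readlink "$link")"
  else
    echo "broken: $link -> $(readlink "$link")"
    return 1
  fi
}

# For example:
#   check_link /usr/share/solr/WEB-INF/lib/lucene-memory.jar
```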

The full error trace that I got looked like this:

HTTP Status 500 - org/apache/lucene/index/memory/MemoryIndex
java.lang.NoClassDefFoundError: org/apache/lucene/index/memory/MemoryIndex
    at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getReaderForField(WeightedSpanTermExtractor.java:361)
    at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(WeightedSpanTermExtractor.java:282)
    at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:149)
    at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:158)
    at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:414)
    at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:216)
    at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:184)
    at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:226)
    at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335)
    at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:636)
Caused by: java.lang.ClassNotFoundException: org.apache.lucene.index.memory.MemoryIndex
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1484)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1329)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:334)
    ... 27 more

Hardware dimensioning

Great article, thank you.
Just a question...
Since Solr runs its own process and requires Java, I'd like to get some hints about hardware dimensioning.
I know it is very dependent on statistics such as the number of nodes, search frequency and the complexity of the search configuration. Anyway, I'm completely lost about hardware size. I'm using a www.linode.com VPS for a small site (but it tends to grow, since another site, with 1,000,000 pageviews per day, will channel traffic to it). I don't know if I should start with 300MB or 1GB or 2GB...

This is not an issue...

Hi, just go to some hosting that provides a cloud computing infrastructure... something like rackspace cloud... you can start small and grow as much as you want without having to configure anything again...