Managing distributed Solr Servers

We use the open-source search server Solr for real-time search on data stored in a Hadoop cluster. For our terabyte-scale dataset, we had to implement distributed search across multiple Lucene index partitions (shards). This article describes our solution for managing 40 independent Solr instances without human intervention, including transparent failover and automatic backup of the Lucene indexes.

In part 3 of this article series, we discussed our use of Solr to enable real-time search on data stored in a Hadoop cluster. Because our data consists of a very large number of small documents (we consider a single log message, averaging 600 bytes in size, to be a document), we need to index the dataset into multiple index shards. In fact, we have one Solr instance running on each data node of our Hadoop cluster.

Hadoop has many features to keep the cluster functional even when data nodes fail. Data node failures are detected automatically, and the blocks of data stored on a failed node are re-replicated to other nodes in the cluster. Where possible, tasks that were running on the failed node are restarted on other nodes. In the end, the user of a Hadoop cluster does not even notice the failure of a data node.

Why we developed our own Solr Manager

Solr, on the other hand, offers only very limited management features out of the box. It is possible to remotely start and stop Solr cores, to deploy new indexes and also to perform distributed searches. But the default Solr distribution lacks sophisticated monitoring and cluster management tools.

My team at mgm technology partners has therefore implemented its own Solr manager with distributed management functionality similar to Hadoop's:

  • detect failed Solr instances
  • detect new nodes in the Hadoop cluster
  • gather metadata of each Solr core
  • back up a Lucene index
  • move an index from one Solr instance to another

With these features, we achieve the same level of autonomy with respect to changes in the cluster topology that Hadoop itself provides. We are able to detect failed nodes and move the indexes that were running on a failed node to other nodes in the cluster. Once the failed node is back online, this is also detected, and the indexes are moved back to that server.

Monitoring the Health of Solr Instances

The Solr manager is a process that runs at regular intervals; we use an interval of 5 minutes. On each pass, it identifies the current state of the Hadoop cluster, gathers data from all Solr instances, and performs actions to keep the cluster healthy. The following paragraphs describe the individual steps of this process; a sketch of the overall loop follows below.
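
As a rough illustration, the manager's outer loop might look like the following sketch. This is not our actual implementation: the step methods are placeholders for the phases described in the rest of this article, and the scheduler setup is an assumption.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SolrManager {

    public void start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // One maintenance pass every 5 minutes, as described above.
        scheduler.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                detectSolrInstances();   // which data nodes run a Solr instance?
                gatherCoreMetadata();    // cores, document counts, optimize flag
                backupModifiedIndexes(); // write changed indexes to HDFS
                restoreMissingIndexes(); // redeploy indexes that run in no core
                rebalanceCluster();      // aim for two cores per instance
            }
        }, 0, 5, TimeUnit.MINUTES);
    }

    // Placeholders for the steps described in the following paragraphs.
    private void detectSolrInstances() {}
    private void gatherCoreMetadata() {}
    private void backupModifiedIndexes() {}
    private void restoreMissingIndexes() {}
    private void rebalanceCluster() {}
}
```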

The first step is to detect the running data nodes. We use the Hadoop API to retrieve the addresses of all running data nodes. We can then iterate over this list and send a request to the Solr base URL on each data node to check whether a Solr instance is running there. The result of this step is a list of all running Solr instances in our cluster.
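
A minimal sketch of this discovery step, assuming Solr listens on its default port 8983 on every data node (Hadoop's DistributedFileSystem exposes the live data nodes via getDataNodeStats):

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class SolrInstanceDiscovery {

    /** Returns the base URLs of all reachable Solr instances in the cluster. */
    public List<String> findRunningSolrInstances(Configuration conf) throws Exception {
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
        List<String> solrInstances = new ArrayList<String>();
        for (DatanodeInfo node : dfs.getDataNodeStats()) {
            String baseUrl = "http://" + node.getHostName() + ":8983/solr";
            if (isSolrRunning(baseUrl)) {
                solrInstances.add(baseUrl);
            }
        }
        return solrInstances;
    }

    /** Sends a simple HTTP request to the Solr base URL to check availability. */
    private boolean isSolrRunning(String baseUrl) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl).openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            return conn.getResponseCode() == 200;
        } catch (Exception e) {
            return false; // node unreachable or no Solr instance deployed
        }
    }
}
```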

The next step is to gather metadata for each Solr core. We use the CoreAdmin API to retrieve information about how many cores are running on each instance and how many documents are indexed in each core. Another very important piece of information is the optimize flag, which indicates whether an index has been modified. This flag is needed to decide whether a core needs to be backed up.
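
A sketch of this step using SolrJ; the response field names ("index", "numDocs", "optimized") follow the Solr 1.4 CoreAdmin STATUS response and may differ in other versions:

```java
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.client.solrj.response.CoreAdminResponse;
import org.apache.solr.common.util.NamedList;

public class CoreMetadataCollector {

    /** Prints name, document count and optimize state of every core on an instance. */
    @SuppressWarnings("unchecked")
    public void printCoreStatus(String solrBaseUrl) throws Exception {
        SolrServer server = new CommonsHttpSolrServer(solrBaseUrl);
        // A null core name requests the status of all cores on this instance.
        CoreAdminResponse response = CoreAdminRequest.getStatus(null, server);
        NamedList<NamedList<Object>> cores = response.getCoreStatus();
        for (int i = 0; i < cores.size(); i++) {
            String coreName = cores.getName(i);
            NamedList<Object> index = (NamedList<Object>) cores.getVal(i).get("index");
            System.out.println(coreName
                    + ": numDocs=" + index.get("numDocs")
                    + ", optimized=" + index.get("optimized"));
        }
    }
}
```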

To be able to correlate an index backup with a running Solr core, we need a mechanism to identify an index. To do this, each index contains a dummy document carrying a UUID as the shard name. We can identify each index with a simple query that returns only this dummy document. Once we have identified each index in the cluster, we have all the information we need to perform maintenance actions.
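
A sketch of such an identification query; the field names (doc_type, shard_uuid) are hypothetical, since only the mechanism itself is described above:

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ShardIdentifier {

    /** Returns the UUID stored in the dummy document of the given core, or null. */
    public String readShardUuid(String coreUrl) throws Exception {
        SolrServer server = new CommonsHttpSolrServer(coreUrl);
        // doc_type is a hypothetical marker field that only the dummy document has.
        SolrQuery query = new SolrQuery("doc_type:shard_info");
        query.setRows(1);
        QueryResponse response = server.query(query);
        if (response.getResults().isEmpty()) {
            return null; // no dummy document: index cannot be correlated
        }
        return (String) response.getResults().get(0).getFieldValue("shard_uuid");
    }
}
```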

The Solr Manager gathers data about all running Solr instances. This data is used to perform index maintenance tasks and to choose a server for indexing new documents.

Backing up and restoring a Lucene index

We use the Hadoop filesystem as central storage for our indexes. Each index is backed up to HDFS whenever it has changed. We use the optimize flag to detect changes. Besides the optimize flag, we also monitor the number of documents in the index to ensure that we do not start a backup while data is still being indexed into the shard. The backup is only started when the optimize flag is false and the number of documents has not changed compared to the last run of the Solr manager.
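
This trigger condition can be expressed compactly. A sketch, using the flag and document counts gathered during the metadata step:

```java
public class BackupPolicy {

    /** Decides whether a core should be backed up on this manager pass. */
    public boolean shouldBackup(boolean optimized, long numDocs, long numDocsLastRun) {
        boolean indexModified = !optimized;                    // changes since last optimize
        boolean indexingQuiesced = numDocs == numDocsLastRun;  // nothing added in last interval
        return indexModified && indexingQuiesced;
    }
}
```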

The backup is performed by a small shell script on the data node hosting the index. It basically triggers an optimize of the index. In Solr, we have configured a post-optimize hook that starts the snapshooter script once the optimize completes. As soon as a snapshot of the index has been created, our backup script zips it and writes it to HDFS.
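
In Solr 1.x, such a hook is configured in solrconfig.xml with a RunExecutableListener. A sketch along the lines of the stock example configuration (the script path is a placeholder):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- run the snapshooter script after every optimize completes -->
  <listener event="postOptimize" class="solr.RunExecutableListener">
    <str name="exe">solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <bool name="wait">true</bool>
  </listener>
</updateHandler>
```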

Backups are performed in parallel, so that we make full use of the performance of our Hadoop cluster.

We can now read the UUIDs of all backups stored in HDFS. By comparing these UUIDs with the UUIDs hosted by the Solr instances in the Hadoop cluster, we can detect whether all index shards are currently deployed in running Solr instances, or whether there are indexes stored in HDFS that are currently not deployed in any Solr core.
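
A sketch of this comparison; the backup layout (one zip file per shard, named after its UUID, under a /solr/backups directory) is an assumption:

```java
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UndeployedIndexDetector {

    /** Returns UUIDs of indexes that exist in HDFS but run in no Solr core. */
    public Set<String> findUndeployedIndexes(Configuration conf, Set<String> deployedUuids)
            throws Exception {
        FileSystem hdfs = FileSystem.get(conf);
        Set<String> backupUuids = new HashSet<String>();
        for (FileStatus file : hdfs.listStatus(new Path("/solr/backups"))) {
            // strip the ".zip" suffix to recover the shard UUID
            backupUuids.add(file.getPath().getName().replace(".zip", ""));
        }
        backupUuids.removeAll(deployedUuids);
        return backupUuids;
    }
}
```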

If there are indexes stored in HDFS that are not deployed on one of the data nodes, we create new Solr cores on the data nodes and restore the indexes into these cores. This is done by another shell script, which runs on the data node where we want to restore the index. The script copies the zipped index from HDFS, unzips it into the Solr data dir of the new core, and starts the new core.
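
A sketch of this restore step; the paths, the core naming, and the unzip call standing in for our shell script are all illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class IndexRestorer {

    /** Restores a backed-up index into a freshly created core on the given node. */
    public void restore(Configuration conf, String uuid, String solrBaseUrl,
                        String instanceDir) throws Exception {
        // 1. Copy the zipped index from HDFS to the local data node.
        FileSystem hdfs = FileSystem.get(conf);
        hdfs.copyToLocalFile(new Path("/solr/backups/" + uuid + ".zip"),
                             new Path(instanceDir + "/tmp/" + uuid + ".zip"));

        // 2. Unzip into the data dir of the new core (stand-in for the shell script).
        Process unzip = new ProcessBuilder("unzip", "-o",
                instanceDir + "/tmp/" + uuid + ".zip",
                "-d", instanceDir + "/data").start();
        unzip.waitFor();

        // 3. Register and start the new core via the CoreAdmin API.
        SolrServer server = new CommonsHttpSolrServer(solrBaseUrl);
        CoreAdminRequest.createCore("core-" + uuid, instanceDir, server);
    }
}
```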

This restore process is performed in parallel to utilize all data nodes in the Hadoop cluster.

Rebalancing the cluster

Once we have ensured that every index stored in HDFS is running in a Solr core somewhere in the cluster, we move on to optimizing the performance of the cluster. This is done by ensuring that the indexes are distributed equally among all available Solr instances. We aim for 2 cores on each instance. If there are instances with fewer than 2 cores running and also instances with more than 2 cores, we move an index from one of the loaded instances to a vacant instance, as sketched below.
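
A sketch of this rebalancing pass; the actual move (a backup on the source followed by a restore on the target) reuses the mechanisms described above:

```java
import java.util.List;
import java.util.Map;

public class Rebalancer {

    private static final int TARGET_CORES_PER_INSTANCE = 2;

    /** instanceCores maps a Solr base URL to the names of its running cores. */
    public void rebalance(Map<String, List<String>> instanceCores) {
        for (Map.Entry<String, List<String>> loaded : instanceCores.entrySet()) {
            while (loaded.getValue().size() > TARGET_CORES_PER_INSTANCE) {
                String vacant = findVacantInstance(instanceCores);
                if (vacant == null) {
                    return; // every instance already runs at least two cores
                }
                // Move one core: conceptually a backup on the source followed
                // by a restore on the target.
                String core = loaded.getValue().remove(loaded.getValue().size() - 1);
                instanceCores.get(vacant).add(core);
                System.out.println("moving " + core + " from "
                        + loaded.getKey() + " to " + vacant);
            }
        }
    }

    /** Returns an instance running fewer than the target number of cores. */
    private String findVacantInstance(Map<String, List<String>> instanceCores) {
        for (Map.Entry<String, List<String>> e : instanceCores.entrySet()) {
            if (e.getValue().size() < TARGET_CORES_PER_INSTANCE) {
                return e.getKey();
            }
        }
        return null;
    }
}
```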

After the rebalancing, our Lucene indexes should be distributed equally across the data nodes. If we still have Solr instances running fewer than 2 cores, we can use these to create completely new index shards. This is usually the case when we add new data nodes to the cluster.

Finishing up

Now the Solr management cycle is finished. During this cycle, we have ensured that all available Solr instances are utilized and that every one of our index shards is running in a Solr core. We also know how many documents are stored in each index, so we can direct new data to the index with the smallest number of documents, as sketched below.
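
A sketch of this selection; coreDocCounts is assumed to map each core's URL to the document count gathered during the pass:

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.Map;

public class IndexingTargetSelector {

    /** Picks the core with the fewest documents as the target for new data. */
    public String selectTarget(Map<String, Integer> coreDocCounts) {
        return Collections.min(coreDocCounts.entrySet(),
                new Comparator<Map.Entry<String, Integer>>() {
                    public int compare(Map.Entry<String, Integer> a,
                                       Map.Entry<String, Integer> b) {
                        return a.getValue().compareTo(b.getValue());
                    }
                }).getKey();
    }
}
```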

The Solr manager has proved to be very reliable in our test scenarios, and it has kept human intervention in daily cluster administration to a minimum. The algorithm is very robust and is able to cope with the standard failure scenarios we face.

We still have to cope with some limitations:

  • The Solr manager is a single point of failure, similar to the Hadoop namenode
  • If more than 1/3 of the data nodes fail, the manager is not able to restore all indexes. If this happens, the cluster will still be able to accept new data, but search results might be incomplete.

In a cluster with 40 data nodes, failures of single data nodes do happen. Without the monitoring and maintenance functionality of the Solr manager, each failure would require manual administration effort. Thanks to the manager, we have been able to keep the cluster running through data node failures without any manual intervention.

16 Responses to “Managing distributed Solr Servers”

  1. Saket Srivastava says:

    Hello Peter,

    Did you evaluate Hadoop + Hive or Hadoop + Cassandra? And, more from an operational aspect, how viable is it to have your own custom Solr Manager?

    Cheers,
    Saket.

    • Peter Dikant says:

      We have evaluated Hive but not Cassandra. Hive is a very nice way to write queries on your data, but it does not offer real-time capabilities, so it is no substitute for Solr.

      I’m not sure I understand the second question correctly. From an operational perspective the Solr Manager works quite solidly. We’ve had a couple of datanode failures so far and the manager handled these failure situations as expected.

  2. Jerome Pesenti says:

    How do you handle failures of the data node currently indexing? A backup can only reflect the data at a certain point in time. What happens to the data that was indexed between the last backup and the moment of failure?

    • Peter Dikant says:

      This is our worst-case scenario. If a data node fails during indexing, the index is corrupt. If this happens, we have a Map/Reduce job that regenerates the whole index. This job runs for about 8 hours when the cluster contains the maximum possible amount of data.

      We encountered this type of error once in the past 2 years.

      We also have a Map/Reduce job which compares data metrics from HDFS with metrics from the Solr indexes to detect deviations between the two. Whenever these metrics don't match, we rebuild the complete index.

  3. Ken Krugler says:

    Hi Peter,

    Very nice write-up, thanks! Several people at the recent Hadoop + Solr class I taught for Lucid (at Lucene Eurocon in Barcelona) were asking specific questions about deploying & managing indexes generated using Hadoop – wish I'd known about this series of articles then.

    One question – Katta is a Lucene-based distributed search system that would seem to handle many of the coordination issues that you had to solve. Did you take a look at it? Any comments? I’ve used it for one project, thus curious about your input.

    Thanks,

    – Ken

    • Peter Dikant says:

      Hi Ken,

      yes, we evaluated Katta when we started research on this project. At that time, we could not get it to run stably. Please keep in mind that this was at the beginning of 2009. We filed bug reports for some of the issues we encountered, and they have since been fixed, but we decided at the time that we felt more comfortable with our own solution.

      Best regards,
      Peter

  4. steve pauley says:

    We are planning a 10-node Hadoop cluster. We plan to use Lucene/Solr for search. I understand that one creates an MR job to extract the data from Hadoop that users would want to search on and feeds that data to Lucene to index.

    What is the best way to set up Lucene/Solr in a Hadoop environment?

    1. Should Lucene/Solr have their own servers and storage?
    2. Is it best to have Lucene/Solr implemented in the same Hadoop cluster – machines in the same rack as the Hadoop nodes?

    Do you have any exemplar architecture diagrams showing a logical or physical implementation of Lucene/Solr with Hadoop?

    Thanks

    • Peter Dikant says:

      It depends on the size of your search index. If the index fits into one core, I would recommend using a dedicated Solr server separate from the Hadoop cluster.

      If, on the other hand, the index is too large for a single core and you need some kind of sharding, you might be able to reuse your cluster for Lucene/Solr as well. But first you need to evaluate the utilization of your Hadoop cluster. If the cluster is also heavily used for Map/Reduce jobs, you will not have enough resources left for Solr.

      Bottom line: if your Hadoop cluster is primarily used for storage and has only a light Map/Reduce load, you can reuse it for running Solr. In all other cases you are better off with a separate Solr cluster.

  5. Peter Lu says:

    Hi Peter,

    Great post! Thank you.
    I am looking to build an index for transactional data (carts). The source data is in multiple Oracle DBs and, at peak, will be created at up to 10 million rows per hour.

    I need the index to a) be kept up to date at near real time, b) be able to query based on 10 attributes and return a Key and pointer to the original RDBMS instance and c) keep enough history data to amount to about 30 billion rows.

    It looks like Solr is one option, but I am concerned about manageability and the near-real-time updating. What are your thoughts on using an RDBMS table for the index? Is there any other option I should consider?

    Thank You,
    Peter

    • Peter Dikant says:

      Hi Peter,

      I am afraid I cannot give you a good answer to your questions. Keeping an index of 30 billion documents in Solr is definitely a challenge. I guess you will need to use something like Solr Cloud. It might or might not work out. This really depends on your index schema.

      The only thing I can answer is that Solr is able to index 10 million documents per hour. Our highest peak reached 26 million documents in one hour in a single index, although query times were really bad during this peak.

      I would really recommend building a prototype with a single Solr index and try to find the limits of a single index for your indexing schema and data. From there you can try to scale through multiple index shards.

      Regards,
      Peter

  6. Aryan says:

    Hi Peter,

    May I know the node specification you are using, i.e. the RAM, disk space and any other specs?

    • Peter Dikant says:

      We are using servers with a single Intel Xeon X5540 CPU, 8 to 16 GB RAM and 2 x 300 GB hard drives.

      These are quite low-spec Hadoop servers. Our main bottleneck is RAM, so we are in the process of upgrading the cluster from 8 GB to 16 GB.

  7. Manish says:

    Thanks for a nice and detailed post. The solution started simple but became more and more complex as I read parts 2-4. Can you comment on the cost of developing and maintaining the solution? Not in terms of $$$, but in the resources and time required.

    Regards,
    Manish

    • Peter Dikant says:

      During development, we had a total of 3 people working on it for about 10 months. Since deployment, it has basically been 1 person full-time for maintenance and further development.

      I guess it is less complex in the end than it might seem from the article.

      The system has proved to be really stable and does not need much fixing. Most of the maintenance time is spent developing new functionality.

  8. Ryan Rieder says:

    Hello Peter,

    I just sent an email to your personal address and wanted to make you aware of it; your response to my email would be greatly appreciated.
    I found your article excellent!

    thanks
    Ryan

  9. Josep says:

    Hi Peter,

    I've found your post extremely interesting. I've tried several “professional” log solutions, but none ever fulfilled my expectations. I am interested in the SIEM side of these solutions as well, but in the end you always need to search huge amounts of data when doing forensics.

    In your extremely nice explanation, I've only missed one thing: how do you collect the logs? Does Hadoop implement some sort of “syslog”?

    Thanks in advance.

    Josep