We are currently experiencing a Geospatial Revolution that changes in how we navigate from A to B and how we search for locations like a specific sight or restaurants nearby. Geospatial search technology provides such information. This article shows how commercial applications can utilize geospatial search, e.g. for real estate search (qualifing real estates by their distance to the nearest kindergartens, schools, doctors, etc.), calculating building density in cities and so on.
In this blog article we discuss how Fredhopper, an advanced site search and merchandising product, can be integrated into the hybris eCommerce suite not only to search for products, but to create cross selling and campaigns as well. In the used scenario hybris is the foundation of a marketplace with a few million products from thousands of vendors.
We use the open-source search server Solr for real-time search on data stored in a Hadoop cluster. For our terabyte-scale dataset, we had to implement distributed search on multiple Lucene index partitions (shards). This article describes our solution to manage 40 independent Solr instances without human interaction, including transparent failover and automatic backup of the Lucene index.
In the previous part of this article series we focused on the efficient storage of log data in Hadoop. We described how to store the data in Hadoop’s MapFiles, and we tweaked the configuration settings for increased data storage capacity and greater retrieval speed. Today, we discuss how to perform near real-time searches on up to 36.6 billion log messages by a clever combination of Hadoop with Lucene and Solr.