SimpleSearch is the search engine for Hadoop. Vertascale’s SimpleSearch provides powerful indexing and real-time search for semi-structured and mixed-structure data stored in Amazon S3 or the Hadoop Distributed File System (HDFS). Compare the advantages of Vertascale’s SimpleSearch to traditional add-on search systems:
| Feature | Vertascale SimpleSearch™ | Traditional “Add-on” Search |
|---|---|---|
| Real-time query and powerful summary analysis: Find, explore and export large data sets quickly with a web-based application. Summary Analysis delivers a visual snapshot of millions of scattered records, instantly and concisely. | Yes | No |
| Cost effective: Free to start, pay only for what you use. No new hardware to purchase or install. Runs on AWS or your existing Hadoop stack. | Yes | No |
| Scalability: Easily scales to index large volumes of data without the need for complex sharding and replication. | Yes | No |
| Integration: Connects directly to Amazon S3 or HDFS to index and query data in place. No new ETL to manage. Not a separate system. | Yes | No |
| Powered by Hadoop: Uses Hadoop MapReduce to build a scalable, distributed index that supports real-time query and analysis. | Yes | No |
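
To make the "Powered by Hadoop" row concrete, here is a minimal, single-process sketch of how a map step and a reduce step can build an inverted index. It is not Vertascale's implementation and does not use the Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`, `search`) and the sample records are hypothetical, chosen only to illustrate the general MapReduce indexing pattern.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")

def map_phase(record_id, text):
    """Map step: emit one (term, record_id) pair per distinct term."""
    return [(term, record_id) for term in set(TOKEN.findall(text.lower()))]

def shuffle(pairs):
    """Group emitted pairs by term, as a MapReduce framework would between phases."""
    grouped = defaultdict(list)
    for term, record_id in pairs:
        grouped[term].append(record_id)
    return grouped

def reduce_phase(term, record_ids):
    """Reduce step: turn the ids seen for a term into a sorted posting list."""
    return term, sorted(set(record_ids))

def build_index(records):
    """Run map, shuffle and reduce over all records (in one process, for illustration)."""
    pairs = []
    for record_id, text in records.items():
        pairs.extend(map_phase(record_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())

def search(index, query):
    """Return the ids of records that contain every term in the query."""
    terms = TOKEN.findall(query.lower())
    if not terms:
        return []
    matches = set(index.get(terms[0], []))
    for term in terms[1:]:
        matches &= set(index.get(term, []))
    return sorted(matches)

# Hypothetical sample records, standing in for data stored in S3 or HDFS.
records = {
    "rec-1": "error disk full on node 7",
    "rec-2": "disk replaced on node 7",
    "rec-3": "login error for user alice",
}
index = build_index(records)
print(search(index, "disk node"))
```

Because the index is built ahead of time, a query only needs to intersect posting lists, which is what lets lookups stay fast as the number of records grows. In a real cluster the map and reduce steps would run in parallel across many machines rather than in a single loop.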
Meeting the Challenges of Search @ Scale
Traditional approaches to search were conceived over thirty years ago, when the commodity storage and distributed computing systems that we take for granted today did not yet exist. The design emphasis of these early systems was on building as concise an index as possible and on relying on real-time computation to minimize storage requirements.
As data volumes have continued to grow, traditional approaches to search have become prohibitively expensive to implement and scale. Vertascale’s SimpleSearch technology takes advantage of the latest advances in distributed computing and commodity storage to build a more efficient distributed index that can scale to meet the needs of today’s Big Data environments.