Spark, Hydra & BigQuery: 5 enterprise alternatives to Hadoop - Tech Monitor (2024)

Hadoop’s progression from a large-scale, batch-oriented analytics tool to an ecosystem full of vendors, applications, tools and services has coincided with the rise of the big data market.

While Hadoop has become almost synonymous with the market in which it operates, it is not the only option. Hadoop is well suited to very large scale data analysis, which is one of the reasons why companies such as Barclays, Facebook, eBay and more are using it.

Although it has found success, Hadoop has its critics, who argue that it is overly complex and poorly suited to smaller jobs.

CBR identifies five Hadoop alternatives that may better suit your business needs.

1. Pachyderm

Pachyderm, put simply, is designed to let users store and analyse data using containers.

The company has built an open source platform to use containers for running big data analytics processing jobs. One of the benefits of using this is that users don’t have to know anything about how MapReduce works, nor do they have to write any lines of Java, which is what Hadoop is mostly written in.

Pachyderm hopes that this makes it much more accessible and easier to use than Hadoop, and therefore more appealing to developers.

With containers growing significantly in popularity over the past couple of years, Pachyderm is in a good position to capitalise on the increased interest in the area.

The software is available on GitHub, and users only have to implement an HTTP server that fits inside a Docker container. The company says: "if you can fit it in a Docker container, Pachyderm will distribute it over petabytes of data for you."
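To give a sense of how small such an HTTP server can be, here is a minimal sketch using only Python's standard library. It is not Pachyderm's own API: the `count_words` analysis step, the `JobHandler` class, the `serve` helper and port 8080 are all hypothetical names and values chosen for illustration, and the article does not describe the request format Pachyderm actually expects.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def count_words(text: str) -> dict:
    """Hypothetical analysis step: count how often each word appears."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


class JobHandler(BaseHTTPRequestHandler):
    """Accepts a chunk of text via POST and returns word counts as JSON."""

    def do_POST(self):
        # Read exactly as many bytes as the client says it sent.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        payload = json.dumps(count_words(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Run the server until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), JobHandler).serve_forever()
```

Calling `serve()` as the container's entry point would start the server. The analysis logic lives in an ordinary function, so it can be written in whatever language and style the developer prefers, with no MapReduce or Java involved.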

2. Apache Spark

What can be said about Apache Spark that hasn’t been said already? The general-purpose compute engine, typically used with Hadoop data, is increasingly seen as the future of Hadoop given its popularity, its speed, and its support for a wide range of applications.

However, while it is typically associated with Hadoop implementations, it can be used with a number of different data stores and does not have to rely on Hadoop. It can, for example, use Apache Cassandra and Amazon S3.

Spark is even capable of having no dependence on Hadoop at all, running as an independent analytics tool.

Spark’s flexibility is what has helped make it one of the hottest topics in the world of big data, and with companies like IBM aligning their analytics around it, the future is looking bright.

3. Google BigQuery

Google seemingly has its fingers in every pie and, as the inspiration for the creation of Hadoop, it is no surprise that the company has an effective alternative.

The fully-managed platform for large-scale analytics allows users to work with SQL and not have to worry about managing the infrastructure or database.

The RESTful web service is designed to enable interactive analysis of huge datasets, working in conjunction with Google storage.

Users may be wary that it is cloud-based, which could lead to latency issues when dealing with large amounts of data, but given Google’s omnipresence it is unlikely that data will ever have to travel far, so latency shouldn’t be a big issue.

Some key benefits include its ability to work with MapReduce and Google’s proactive approach to adding new features and generally improving the offering.

4. Presto

Presto, an open source distributed SQL query engine that is designed for running interactive analytic queries against data of all sizes, was created by Facebook in 2012 as it looked for an interactive system that is optimised for low query latency.

Presto is capable of concurrently using a number of data stores, something that neither Spark nor Hadoop can do. This is possible through connectors that provide interfaces for metadata, data locations, and data access.

The benefit of this is that users don’t have to move data around from place to place in order to analyse it.
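The connector idea can be sketched in a few lines of Python. This is a conceptual illustration only, not Presto's actual connector interface (Presto itself is written in Java): the `Connector` protocol, the `InMemoryConnector` stand-in, the `scan` and `join` helpers and the sample rows are all hypothetical. Each connector answers the three questions named above, metadata, data locations and data access, and the query layer joins rows from two stores without copying either one into the other.

```python
from typing import Dict, Iterable, Iterator, List, Protocol


class Connector(Protocol):
    """Hypothetical interface: metadata, data locations, data access."""

    def columns(self, table: str) -> List[str]: ...
    def locations(self, table: str) -> List[str]: ...
    def rows(self, table: str, location: str) -> Iterable[dict]: ...


class InMemoryConnector:
    """Stand-in for a real store; rows are grouped by a location label."""

    def __init__(self, tables: Dict[str, Dict[str, List[dict]]]):
        self._tables = tables  # table name -> location -> rows

    def columns(self, table: str) -> List[str]:
        first = next(iter(self._tables[table].values()))
        return sorted(first[0].keys()) if first else []

    def locations(self, table: str) -> List[str]:
        return list(self._tables[table])

    def rows(self, table: str, location: str) -> Iterable[dict]:
        return iter(self._tables[table][location])


def scan(connector: Connector, table: str) -> Iterator[dict]:
    """Read a table by visiting every location the connector reports."""
    for location in connector.locations(table):
        yield from connector.rows(table, location)


def join(left: Iterable[dict], right: Iterable[dict], key: str) -> Iterator[dict]:
    """Hash join over rows that may come from two different connectors."""
    index: Dict[object, List[dict]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    for row in left:
        for match in index.get(row[key], []):
            yield {**row, **match}


# Two separate "stores" holding made-up sample data.
users = InMemoryConnector({"users": {
    "node-a": [{"user_id": 1, "name": "Ann"}],
    "node-b": [{"user_id": 2, "name": "Bob"}],
}})
clicks = InMemoryConnector({"clicks": {
    "bucket-1": [
        {"user_id": 1, "page": "/home"},
        {"user_id": 2, "page": "/docs"},
        {"user_id": 1, "page": "/pricing"},
    ],
}})

result = list(join(scan(clicks, "clicks"), scan(users, "users"), "user_id"))
```

Here `result` holds one combined row per click, each enriched with the matching user's name, even though the two tables live behind different connectors.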

Like Spark, Presto is capable of offering real-time analytics, something that is in increasing demand from enterprises.

Presto supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. Lovers of Java will be happy to hear that this is what the system is implemented in.
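For readers unfamiliar with window functions, the following sketch shows the kind of query this refers to. It deliberately does not connect to Presto: it uses Python's built-in `sqlite3` module as a stand-in engine (window functions require SQLite 3.25 or later), and the `page_views` table and its rows are made up. The query text uses only standard `OVER (PARTITION BY ...)` syntax, but whether it runs unchanged on a particular Presto deployment is an assumption.

```python
import sqlite3

# In-memory database used purely as a stand-in SQL engine for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views (user_id INTEGER, page TEXT, duration_s INTEGER)"
)
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [(1, "/home", 30), (1, "/docs", 120), (2, "/home", 45), (2, "/pricing", 15)],
)

# An aggregation and a ranking, each computed per user without collapsing rows.
QUERY = """
SELECT user_id,
       page,
       duration_s,
       SUM(duration_s) OVER (PARTITION BY user_id) AS total_s,
       RANK() OVER (PARTITION BY user_id ORDER BY duration_s DESC) AS rank_in_user
FROM page_views
ORDER BY user_id, rank_in_user
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each output row keeps its own page and duration while also carrying the user's total viewing time and the page's rank within that user's visits, which is what distinguishes a window function from a plain `GROUP BY`.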

5. Hydra

Developed by the social bookmarking service AddThis, which was recently acquired by Oracle, Hydra is a distributed task processing system that is available under the Apache License.

It is capable of delivering real-time analytics to its users and was developed due to a need for a scalable and distributed system.

Having decided that Hadoop wasn’t a viable option at the time, AddThis created Hydra in order to handle both streaming and batch operations through its tree-based structure.

This tree-based structure means that it can store and process data across clusters that may have thousands of nodes.
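The article does not show Hydra's own job configuration, so the following is only a generic illustration of how a tree can aggregate events as they arrive, whether they come from a stream or a batch. The `TreeNode` class, the `ingest` and `lookup` helpers, the field names and the sample events are all hypothetical.

```python
from collections import defaultdict


class TreeNode:
    """One node of the aggregate tree: a hit counter plus child nodes."""

    def __init__(self):
        self.hits = 0
        self.children = defaultdict(TreeNode)


def ingest(root: TreeNode, event: dict, path_fields: list) -> None:
    """Walk one branch per event, counting a hit at every level visited."""
    root.hits += 1
    node = root
    for field in path_fields:
        node = node.children[event[field]]
        node.hits += 1


def lookup(root: TreeNode, *path: str) -> int:
    """Return the hit count stored at the given path, or 0 if it is absent."""
    node = root
    for key in path:
        if key not in node.children:
            return 0
        node = node.children[key]
    return node.hits


# Made-up events; the same ingest call serves a stream or a batch.
events = [
    {"day": "day-1", "country": "UK"},
    {"day": "day-1", "country": "US"},
    {"day": "day-1", "country": "UK"},
    {"day": "day-2", "country": "UK"},
]

root = TreeNode()
for event in events:
    ingest(root, event, ["day", "country"])

day_one_total = lookup(root, "day-1")
day_one_uk = lookup(root, "day-1", "UK")
```

Because every event updates the tree in place, answers such as `day_one_uk` are available immediately after ingestion rather than after a separate batch job, which is the property that makes this kind of structure attractive for real-time analytics.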

Hydra features a Linux-based file system in addition to a job/client management component that automatically allocates new jobs to the cluster and rebalances existing jobs. It is also capable of automatically replicating data and handling node failures.

