I’m trying to sift through information on big data technologies. I have data stored in S3 that I want to analyze using EMR. However, when I try to research the pros and cons of Presto, Hive, Spark, or any other technology, I end up drowning in company-sponsored benchmark reports or articles written by people with clear biases.

So, my ask: Am I better off just experimenting with each tool, or can you suggest any resources that offer opinions with substance, and not just buzzwords?
