YavaCE is a big data platform, the community version of the Yava247 Data Management Platform. It is intended for anyone who wants to start learning data processing with Hadoop, MapReduce, Hive, and Spark.
- Distributed Processing
  Hadoop and Spark are proven distributed processing engines that deliver high-performance data processing.
- SQL Analytics on Hadoop
  Hive and Spark SQL make it convenient to process and analyze structured data.
- Stream Processing
  Spark Streaming lets you reuse the same code for batch processing and join streams against historical data.
- Machine Learning
  Spark's machine learning library (MLlib) makes practical machine learning scalable and easy.
- Educational
  Starting with something simple makes it easier to master big data.
- POC/Trial
  YavaCE can serve as a platform for testing various big data use cases.
- Research
  Open source technology enables a variety of research topics to be developed on big data.
This repository contains a collection of short, practical recipes that are easy to follow for processing and analyzing data with YavaCE.