Data Processing with SMACK: Spark, Mesos, Akka, Cassandra, and Kafka

Processing layer: Spark

Almost MapReduce: bringing processing closer to data

Plenty of frameworks are already available or under active development (for example Hadoop, Cassandra, Kafka, Myriad, Storm, and Samza) that aim to integrate widely used systems with Mesos resource management capabilities.

Ingesting the Data

Kafka acts as a buffer for incoming data

To retain incoming data for some period and make it available for further pre-aggregation and processing, some sort of distributed commit log can be used. In this case, consumers read the data in batches, process it, and store it in Cassandra in the form of pre-aggregates.
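The read-in-batches, pre-aggregate, store flow can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the event shape `(key, timestamp)`, the one-minute bucket, and the helper names `pre_aggregate`, `batches`, and `merge_into_store` are all hypothetical, and an in-memory dictionary stands in for the Cassandra table. A real consumer would read its batches from the commit log and write the resulting rows to Cassandra instead.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (key, unix_timestamp_seconds).
Event = Tuple[str, int]
# Aggregate row key: (key, start_of_time_bucket_in_seconds).
AggKey = Tuple[str, int]


def batches(events: List[Event], size: int) -> Iterator[List[Event]]:
    """Split a list of events into consecutive batches of at most `size`."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def pre_aggregate(batch: Iterable[Event], bucket_seconds: int = 60) -> Dict[AggKey, int]:
    """Count events per (key, time bucket) within a single batch."""
    counts: Dict[AggKey, int] = defaultdict(int)
    for key, ts in batch:
        # Round the timestamp down to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        counts[(key, bucket_start)] += 1
    return dict(counts)


def merge_into_store(store: Dict[AggKey, int], partial: Dict[AggKey, int]) -> None:
    """Add one batch's pre-aggregates to the store (a dict standing in for a table)."""
    for agg_key, count in partial.items():
        store[agg_key] = store.get(agg_key, 0) + count


if __name__ == "__main__":
    incoming = [("a", 0), ("a", 59), ("a", 60), ("b", 61), ("b", 125)]
    store: Dict[AggKey, int] = {}
    for batch in batches(incoming, size=2):
        merge_into_store(store, pre_aggregate(batch))
    for agg_key in sorted(store):
        print(agg_key, store[agg_key])
```

Because each batch is reduced to a handful of counters before it is written, the store receives one update per key and time bucket rather than one write per raw event. Note that this sketch adds counts blindly, so a batch that is replayed after a failure would be counted twice; handling that is left out here.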

Consuming the data: Spark Streaming

Designing for failure: backups and patching

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com
