Skalierbare Datenanalyse mit Apache Spark : Implementation einer Text-Mining-Anwendung und Testbetrieb auf einem Low-End-Cluster

Kirchner, Daniel

Title:	Skalierbare Datenanalyse mit Apache Spark : Implementation einer Text-Mining-Anwendung und Testbetrieb auf einem Low-End-Cluster
Language:	German
Authors:	Kirchner, Daniel
Issue Date:	29-Jul-2015
Abstract:	Apache Spark is quickly becoming a central component of Big Data analysis systems for a variety of applications. This thesis provides an overview of the key concepts and components of Apache Spark and examines Spark's behavior on a cluster with a minimal performance profile. The study is based on a realistic use case that combines the Spark modules for batch processing and streaming.
URI:	http://hdl.handle.net/20.500.12738/7075
Institute:	Department Informatik
Type:	Thesis
Thesis type:	Bachelor Thesis
Advisor:	Kahlbrandt, Bernd
Referee:	Zukunft, Olaf
Appears in Collections:	Theses