Towards unified secure on- and off-line analytics at scale

P. Coetzee; M. Leeke; S. Jarvis

doi:10.1016/j.parco.2014.07.004

Journal article

Peer reviewed

Parallel Computing, Vol. 40(10), pp. 738-753

01/12/2014

DOI: https://doi.org/10.1016/j.parco.2014.07.004

Abstract

Data scientists have applied various analytic models and techniques to address the oft-cited problems of large volume, high velocity data rates and diversity in semantics. Such approaches have traditionally employed analytic techniques in a streaming or batch processing paradigm. This paper presents CRUCIBLE, a first-in-class framework for the analysis of large-scale datasets that exploits both streaming and batch paradigms in a unified manner. The CRUCIBLE framework includes a domain-specific language for describing analyses as a set of communicating sequential processes, a common runtime model for analytic execution in multiple streamed and batch environments, and an approach to automating the management of cell-level security labelling that is applied uniformly across runtimes. This paper shows the applicability of CRUCIBLE to a variety of state-of-the-art analytic environments, and compares a range of runtime models for their scalability and performance against a series of native implementations. The work demonstrates the significant impact of runtime model selection, including improvements of between 2.3x and 480x between runtime models, with an average performance gap of just 14x between CRUCIBLE and a suite of equivalent native implementations. © 2014 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).

Subjects: Computer Science; Computer Science, Theory & Methods; Science & Technology; Technology

Files and links

https://doi.org/10.1016/j.parco.2014.07.004 (published version of record, open access)

Details

Title: Towards unified secure on- and off-line analytics at scale
Creators: P. Coetzee (University of Warwick); M. Leeke (University of Warwick); S. Jarvis (University of Warwick)
Publication Details: Parallel Computing, Vol. 40(10), pp. 738-753
Publisher: Elsevier
Number of pages: 16
Publication Date: 01/12/2014
Grant note: Industrial EPSRC CASE Studentship; UK Research & Innovation (UKRI); Engineering & Physical Sciences Research Council (EPSRC) 1273878 / Engineering and Physical Sciences Research Council; UK Research & Innovation (UKRI); Engineering & Physical Sciences Research Council (EPSRC)
Identifiers: 991103787602346; WOS:000347018800013
Academic Unit: President & VC's Office (VC01)
Language: English
Resource Type: Journal article
