Parallel Querying of ROLAP Cubes in the Presence of Hierarchies

ABSTRACT
Online Analytical Processing is a powerful framework for the analysis of organizational data. OLAP is often supported by a logical structure known as a data cube, a multidimensional data model that offers an intuitive array-based perspective of the underlying data. Supporting efficient indexing facilities for multi-dimensional cube queries is an issue of some complexity. In practice, the difficulty of the indexing problem is exacerbated by the existence of attribute hierarchies that sub-divide attributes into aggregation layers of varying granularity. In this paper, we present a hierarchy and caching framework that supports the efficient and transparent manipulation of attribute hierarchies within a parallel ROLAP environment. Experimental results verify that, when compared to the non-hierarchical case, very little overhead is required to handle streams of arbitrary hierarchical queries.
Categories and Subject Descriptors
H.2.7.b [Database Management]: Data Warehouse and Repository; H.2.2.a [Database Management]: Access Methods
General Terms
Algorithms, Design, Performance
Keywords
Hierarchies, Caching, Data Cubes, Aggregation, Indexing, OLAP, Granularity, Materialization, Parallelization
1. INTRODUCTION
Online Analytical Processing (OLAP) has become an important component of contemporary Decision Support Systems (DSS). Central to OLAP is the data cube, a multidimensional data model that presents an intuitive cube-like interface to both end users and DSS developers. In recent years, the academic community has become increasingly interested in the cube model and a number of efficient cube generation algorithms have been presented in the literature [2, 16, 22].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DOLAP’05, November 4–5, 2005, Bremen, Germany.
Copyright 2005 ACM 1595931627/05/0011 ...$5.00.
[Figure 1 labels: Product categories Automotive, Household; Product types Brakes, Engine, Interior, Appliances, Furniture; product numbers XG27, XY53, GL75, RT57, RT91, HJ45, HY35, HK46, UJ67, JW30, NH22; Location values East, West, North, South; Time values 2004, 2005.]
Figure 1: A hierarchical Product attribute broken down from (a) category, to (b) type, to (c) product number.
For the most part, the focus of these algorithms has been the generation of the cube data structure itself. Methods or techniques for efficient access/querying have received relatively little attention. When such methods have been presented, they typically assume the existence of non-hierarchical attributes. In practice this is rarely the case. Figure 1 provides a simple example from the automotive industry.
Here, we have three feature attributes (Product, Location, and Time) that can be viewed in terms of one or more measure attributes. In this case, each cell in the cube might be associated with an aggregated total for the measure attribute Total Sales. Note how the hierarchical Product dimension on the x-axis is broken down into increasingly finer levels of aggregation.
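As a rough illustration of what "increasingly finer levels of aggregation" means for the Product dimension, the following minimal sketch stores Total Sales at the product-number level and aggregates upward to type or category. The hierarchy mappings, the sales figures, and the names `TYPE_OF`, `CATEGORY_OF`, and `roll_up` are all hypothetical, chosen only for illustration; this is not the data structure used in the paper.

```python
from collections import defaultdict

# Hypothetical Product hierarchy: product number -> type -> category.
# The assignments below are assumptions made for this example only.
TYPE_OF = {"XG27": "Brakes", "XY53": "Brakes", "GL75": "Engine", "HJ45": "Appliances"}
CATEGORY_OF = {"Brakes": "Automotive", "Engine": "Automotive", "Appliances": "Household"}

# Cube cells at the finest granularity:
# (product number, location, year) -> Total Sales (made-up values).
cells = {
    ("XG27", "East", 2004): 120.0,
    ("XY53", "East", 2004): 80.0,
    ("GL75", "West", 2004): 50.0,
    ("HJ45", "East", 2005): 200.0,
}

def roll_up(cells, level):
    """Aggregate Total Sales to the requested Product level."""
    mappers = {
        "number": lambda p: p,
        "type": lambda p: TYPE_OF[p],
        "category": lambda p: CATEGORY_OF[TYPE_OF[p]],
    }
    to_level = mappers[level]
    totals = defaultdict(float)
    for (product, location, year), sales in cells.items():
        totals[(to_level(product), location, year)] += sales
    return dict(totals)

by_type = roll_up(cells, "type")
by_category = roll_up(cells, "category")
```

Here the two Brakes products sold in the East in 2004 collapse into a single cell at the type level, and the Brakes and Engine cells both map to Automotive at the category level.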
While it is possible to represent each of these hierarchical levels as a distinct feature attribute, doing so dramatically increases the complexity of the underlying problem. Specifically, the number of possible attributes or group-bys in a $d$-dimensional data cube is exponential in the number of dimensions. For example, a 10-dimensional cube would generate $2^d = 2^{10} = 1024$ aggregated group-bys. By contrast, the total number of group-bys in the presence of hierarchies is given as $\prod_{i=1}^{d}(h_i + 1)$ when constructed from a data cube with $d$ attributes, where dimension $i$ has a hierarchy of size $h_i$ [20]. The same 10-dimensional data cube with three-level hierarchies on each dimension would produce $(3 + 1)^{10} = 1{,}048{,}576$ group-bys, that is, over one million. Clearly this is infeasible when the original input set may already contain terabytes of data.
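The two counts above can be checked with a few lines of arithmetic. The sketch below is only a worked example of the formulas quoted in the text; the function names `flat_group_bys` and `hierarchical_group_bys` are hypothetical.

```python
from math import prod

def flat_group_bys(d):
    """Number of group-bys in a d-dimensional cube without hierarchies: 2^d."""
    return 2 ** d

def hierarchical_group_bys(levels):
    """Number of group-bys when dimension i has a hierarchy of size levels[i]:
    the product of (h_i + 1) over all dimensions."""
    return prod(h + 1 for h in levels)

print(flat_group_bys(10))                # 1024
print(hierarchical_group_bys([3] * 10))  # 1048576
```

Note that setting every hierarchy size to 1 gives $(1 + 1)^d = 2^d$, so the hierarchical formula reduces to the non-hierarchical count.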
An alternative approach to the generation and storage of fully materialized hierarchical cubes is to produce data cubes containing hierarchies represented only at the finest level of granularity. Hierarchical roll-up or drill-down is then performed in real time during query resolution. For this to be feasible, the cube architecture must support both fast indexing and hierarchy-sensitive data structures, and the associated overhead should be largely transparent to the end user. In this paper we present a series of algorithms and data structures for the efficient manipulation of attribute
