To access the contents, click the chapter and section titles.

Oracle Data Warehousing Unleashed
(Publisher: Macmillan Computer Publishing)
Author(s): Bonnie O'neill et al
ISBN: 0672310775
Publication Date: 09/26/97

Table of Contents | Next

Page 3

Introduction

You have decided to embark on a data warehouse journey. You are probably a little scared and overwhelmed by the immensity of the project in front of you. You are probably also wondering where to start.

This book is designed to be a definitive reference guide for all project staff having anything to do with the data warehouse. Here is an overview of the types of job classes that would benefit from this book:

Purpose of this Book

The library is full of data warehouse books, all concentrating on a different subtopic within the vast reservoir of data warehousing. Here is a brief overview of some of the data warehouse book inventory:

Page 26

The authors felt that there was no comprehensive book about all the areas of data warehousing, all in one volume. We wanted to write the definitive reference guide to developing data warehouses using Oracle. Such a book would contain all the topics mentioned above, and in addition would cover topics such as:

It is therefore our goal in writing this book that you will need only one book on your desk for the definitive reference guide for Oracle data warehousing—Oracle Data Warehousing Unleashed.

Organization of the Book

Part I, "Initialization and Project Planning," discusses the rationale behind data warehousing, architecture, and methodology.

Part II, "Preparation: Integration and Cleansing," sets the stage and deals with one of the most important parts of data warehousing: integration challenges and scrubbing. In addition, the

Page 27

section describes important data administration tasks such as defining your data, and the central (and often neglected) role of metadata.

Part III, "Logical and Physical Database Design," shows how designing a warehouse is uniquely different from designing an operational system. It covers logical and physical design and storage/capacity planning. In addition, it also covers dimensions, special dimensions such as time and space, and star schema design. Lastly, no section covering physical design and performance issues is complete without a thorough discussion of indexes.

Part IV, "Management of a Data Warehouse," covers DBA-specific management issues such as security, backup/recovery, and loads.

Part V, "Data Access," presents a whole range of exciting new technologies to assist users in accessing and viewing their data, such as intranets, data mining, and ad-hoc query tools. In addition, this section introduces data access constructs that are related to the data warehouse: the ODS and data marts. Query performance tuning is also discussed.

A number of helpful appendixes serve as further reference when using Oracle for data warehousing. Appendix A supplies a checklist of tasks for the project manager. Appendix B presents a list of the tool vendors by product class. Appendixes C and D reference conferences, magazines, journals, and Web sites specializing in data warehousing. Appendix E provides an overview of Oracle8's data warehouse_specific new features. Last, Appendix F is a list of references which we found helpful.

What Do I Do First?

Read the first few chapters to get an overview of what data warehousing is all about. This book does not guide you through installing Oracle; for this task, you can refer to our companion volume, Oracle Unleashed, or the Oracle DBA Survival Guide. Then you can focus on the areas of the book of special interest to you (as mentioned above).

Conventions

Commands and code are presented in a special monospaced, computer typeface. In the text itself, we have tried to present all SQL keywords in the text in uppercase. Code output examples are also presented in computer typeface.

WARSTORY
We have flagged war stories with a special box so the reader can get an idea how other companies have experienced the things we talk about in the book. We have also peppered the book with many examples to illustrate each point. Other sidebars include notes, cautions, and tips.

Page 28

We have also flagged Oracle8 new features with an icon, so those users who are going to be using Oracle8 can see at a glance the new features that pertain specifically to them, and those who are migrating to the new release can be aware of them to take special advantage of them whenever possible. Other users can ignore the text flagged Oracle8, because it doesn't pertain to them. In other circumstances when discussing specific features, we have tried to indicate in which Oracle release it was first introduced. We know from personal experience that there is a wide variety of releases being currently used at any one time, and we have attempted to keep this in mind when discussing specific features.

Good Luck and Have Fun!

Data warehousing is an adventure: It is very exciting and rewarding, with new challenges always ready to meet you. We hope you will have fun learning about data warehousing and producing an infrastructure that will make a very important contribution to your company's future. And we hope that this book will be a useful reference guide, serving as a roadmap to facilitate your data warehouse adventure.

Contents

Introductionxxv

Part I Initialization and Project Planning

  1. Motivation for Building a Data Warehouse 3

    What Is a Data Warehouse? 5

    What a Data Warehouse Is Not 5

    User Access to Data: A Historical Perspective 6

    Summary of Historical Data Access Problems 7

    The Data Warehouse Context: Facilitate Business 8

    Why Oracle? 10

    Business Drivers for the Data Warehouse 11

    Technical Drivers 14

    Solving the Integration Problem 14

    Be Prepared 19

    Scope and Complexity 20

    Political Battles 20

    Inadequate Funding 20

    Data Integration and Data Semantics 21

    Different Methodology 21

    Different Skill Set 21

    Data Access 22

    Summary 22

  2. Data Warehouse Architecture 23

    Six Steps to Develop the Architecture 24

    The Data Warehouse Infrastructure 25

    Data Warehouse System Infrastructure 26

    Metadata 28

    Data Discovery 29

    Data Acquisition 29

    Data Distribution 29

    User Access 30

    Sybase IQ 31

    Data Management Architecture: Data Layers 31

    Ongoing Maintenance: Warehouse Infrastructure 34

    Summary 35

  3. Methodology and Project Management 37

    What Is a Methodology? 38

    Why the Data Warehouse Methodology Is Different 39

Page 6

    Sample Data Warehouse Methodology 40

    Project Roles 40

    Phase One: Define Objectives and Scope 45

    Phase Two: Define Architecture 47

    Phase Three: Begin to Build Infrastructure 48

    Phase Four: Begin First Iteration 49

    Phase Five: Set Up Ongoing Maintenance Processes,
    First Iteration 52

    Phase Six: Set Up User Analysis Environment,
    First Iteration 53

    Phase Seven: Release First Iteration, Begin Second Iteration 54

    Customizing the Methodology 55

    Changing Terms 56

    Using Individual Interviews Instead of JAD Sessions 57

    Changes You Should Never Make 58

    Project Management: The Art of Finding Problems 59
    Setting and Resetting Expectations 61
    Critical Success Factors 61

    The Heart of a Champion (with the Strength of a Gorilla) 61

    Public Relations: Delighting the End User 62

    Data Discovery 62

    Initial Requirements 63

    Sizing and Long Range Sizing Plan 63

    Metadata and Data Dictionaries 63

    Logical Data Model 64

    Extraction and/or Transformation Software 64

    Populating the Data Warehouse 65

    End-User Involvement 65

    Helpful Hints for Data Warehouse Project Managers 65
    Summary 66
  1. Surefire Ways to Make Your Warehouse Fail 67
    What Is a Fatal Error (or What Is Project Success)? 70
    Fatal Errors—A Non-Exhaustive List 71

    Neglect Executive Sponsorship 71

    Avoid Sharing the Project with an End-User Project
    Manager 71

    Limit End-User Involvement 73

    Use a Classic "Waterfall" Project Methodology 74

    Use a Technical Project Manager without Data
    Warehouse Experience 75

    Freeze the Specification as Early as Possible 75

    Work Multiple Projects in Parallel 76

Page 7

    Simultaneously Implement Operational Systems and
    Related Warehouses 76

    Postpone or Avoid Metadata 77

    Focus on Technology First 77

    Create Your Own Data Warehouse Software 79

    Rely on Unproved Technology 80

    Rely on "Silver Bullet" Technology 80

    Run OLAP on Top of the Operational Database 82

    Run the Data Warehouse Database on the Server for
    the Operational System 83

    Use Normalized Data Structures 83

    Define a Complex Warehouse Architecture 84

    Skimp on Disk Storage 85

    Ignore Data Movement Metrics 86

    Defer Disaster Planning 87

    Mistakes That Cause Pain 87
    How You Can Contribute to Project Success 88

    The Executive Suite 89

    The End-User Department(s) 90

    Project Managers and Project Leaders 90

    IT/IS Management (Also for Consulting Managers) 91

    Technical Team Members (Whether in IT/IS or in a
    Consulting Organization) 94

    Summary 95

Part II Preparation: Integration and Cleansing

  1. Data Integration: The Challenges 101
    Traditional Development 102

    Data Marts: Propagation of Data Chaos 105

    Mainframe Extracts 108

    PCs and Client/Server Development 108

    Heterogeneous Technologies 109
    The Compelling Need for Semantic Integration 110

    The Ramifications of Not Performing Semantic Integration 110

    Same Fact? How Can You Be Sure? 111

    Lack of Flexibility 112
    Semantic Problems 113

    Word Meaning 113

    Unit of Measure 113

    Level of Accuracy 113

    Format 113

    Timeliness 114

Page 8

    Solution: Integration and Prevention 115
    Summary 117
  1. Defining Your Data 119
    Qualities of Good Definitions 120
    Definitions as Business Rules 120
    Two Different Kinds of Definitions 121

    Data Element Definitions 122

    Business Term Definitions 122

    Bringing the Two Definitions Together 123

    Stepping Toward the Future 125

    Discrepancies Between the Present and the Past 127

    Changing Relationships 128

    Changing Unique Identifiers 128

    Historical Definitions and Names 129

    Tracking Historical Fields 129

    Subject Areas 130
    Defining Entities 131

    Issues with Addresses 133

    Defining Attributes 134

    Attribute Naming 134

    Abbreviations as Attributes 135

    Issues with Names as Attributes 135

    Format 135

    Database Naming Conventions 135

    Oracle Objects 136

    Physical DDL Storage/Maintenance 144

    Data Validation 145
    About Data Stewardship 145

    Problems with Data Stewardship 146

    The Magilla Gorilla Factor 147

    Summary 149
  2. Metadata 151
    What Is Metadata? 152
    Types of Metadata 152

    Control Metadata 153

    Change Control in the Data Warehouse 153

    Business Metadata: Ideal World Scenario 161

    The Metadata Project 162
    The Metadata Repository 163
    Buy, Build, or Extend? 163

    What Buying a Repository "Buys You" 163

    What to Look for in Repositories 164

Page 9

    Typical Metadata Elements 165

    The Metadata Coalition's Metamodel 167

    Extending a Repository 168
    Integrating Metadata from Disparate Tools 169
    The Metadata Coalition 170

    The Electronics Industries Association's CASE Data
    Interchange Format (CDIF) 172

    Other Integration Strategies 172

    Creative Customization 172

    Single Vendor Solutions 172

    Vendor Partnerships 173

    Metadata Access: Query Tools 173
    Summary 175
  1. Data Quality and Scrubbing 177
    How Does Bad Data Affect the Business? 178

    Poor Business Decisions 178

    Wasting Money 178

    Lost Information 178

    Defining Data Quality 181
    Situations That Highlight Dirty Data 183

    Merging Data Sources 183

    Integration of Information: Cross Referencing 184

    Mailing List Merging 185

    Application Merging 185

    Acquisition of Another Company 186

    The Data Cleansing Process 186

    Step 1: Data Discovery and Sleuthing 187

    Step 2: Categorize/Classify 188

    Step 3: Decide Action/Document 188

    Step 4a: Create Transforms/Generate Scrub Code 190

    Step 4b: Change Business Processes or Legacy Systems 190

    Step 5: Scrub Error Handling 191

    Step 6: Check for Metadata Drift Over Time 192

    Determining Data Meanings 192

    Two Different Kinds of Problems: Field versus Value 192

    Embedded Values: One Column, Different Values 193

    Subtypes and Supertypes 194

    Types of Transforms 194

    Determining the Source of Record 194

    One Column, Different Sources 194

    Synonyms 195

    Homonyms 195

Page 10

    Data Cleansing Tools 196

    Sleuthing/Discovery Tools 196

    Extract/Transform/Code: Scrub Tools 197

    Tools That Address Specific Problems 199

    Which Is Best? 199

    Documentation and Metadata 201
    Summary 202

Part III Logical and Physical Database Design

  1. Data Modeling Techniques and Options 205
    The Overall Design Process 207
    What Is Logical Design? 207

    Star Schema Design 208

    Database Design 101 209
    The Normalization Process 209

    Normal Forms 209

    Other Normalization Factors 216

    Benefits of Normalization 219

    Drawbacks of Normalization 219

    Normalization as the Baseline for Warehouse Design 220

    Denormalizing the Database 220

    Advantages of Denormalization 220

    Disadvantages of Normalization 221

    Guidelines 221

    Basic Denormalization Techniques 222

    Changing Column Definition 225

    Redefining Tables 228

    Data Partitioning/Fragmentation 228

    Referential Integrity in a Data Warehouse 231
    Summary 232
  2. Dimensions and Query Hierarchies 233
    Scales and Measurements 234

    Range of a Scale 236

    Granularity and Precision 236
    Types of Scales 237

    Nominal Scales 237

    Absolute Scales 237

    Ordinal Scales 237

    Rank Scales 238

    Quantitative Scales 238

Page 11

    Interval Scales 238

    Ratio Scales 238

    Scale Conversion 239
    Derived Units 240
    Dimensions 240
    Hierarchies 241
    Hints for Constructing Dimensions and Hierarchies 242

    Use Existing Standards 242

    Avoid NULLs 243

    Translate Codes for the User 243

    Keep the Codes in the Database 243

    Cubes, Drills, and Clusters 243
    Summary 246
  1. Star Schema and Variants 247
    Mid-Level Design 248
    The Shape of a Star Schema 249

    Drill Downs Across Dimensions 254

    Problem Hierarchies: Time 255
    Snowflake Schema 257
    Constellation Schema 261

    Drilling to Another Fact Table 262

    Common Dimensions in Data Warehouses 264
    Star Schema Optimization 264

    Tuning Tips for Regular Star Optimization
    (Release 7.2 and Higher) 264

    Star Transformation Optimization Tips 265

    Summary 266
  2. Spatial Data: A Very Special Dimension 267

    What Is So Special About Spatial Data? 269
    Oracle's Spatial Data Option 273

    HHCODE: A Slick Way to Formulate N-Dimension
    Spatial Intersections 273

    How Does HHCODE Work? 274

    Other Key Features of the Spatial Data Option 275

    Summary 275
  3. Storage Concerns and Planning 277
    Approaches to Storage Planning 278

    The Analytic Approach 278

    The Empirical Approach 278

    The Incremental Approach 278

Page 12

    How to Slam-Dunk an Initial Estimate 279

    Re-Estimate Regularly! 282

    Planning for the Long Haul 283

    Analyzing Your Storage Requirements 284

    Extract Storage Requirements 284

    Data Cleaning Storage Requirements 285

    Data Transformation Storage Requirements 286

    Sort Workspace 287

    Loads Use Storage Too! 288

    The Operational Data Store (ODS) 289

    The Main Warehouse Database 289

    Data Marts 291

    Query Workspace and Front-End Tools 294

    Query Results and Report Distribution 294

    Ancillary Data 294

    Archival Data 295

    Backup Data 295

    Operating System Paging and Swap Space 297

    Physical Storage Planning 298

    Datatypes 298

    Calculating Theoretical Size of Database Objects 299

    Storage Types 301

    Physically Locating Your Data 309

    Old Style Physical Tuning 309

    RAID 5 Makes It Easier—But Not Free 309

    Test Your Storage Plan! 311
    Monitoring Storage Utilization 314
    Summary 315
  1. Physical Database Design 317
    Design Is About Trade-Offs 318

    Denormalize, Normalize, Overnormalize 318

    Special Tables 321

    Star Schema 322

    Level of Detail (Granularity) Affects Physical Design 323
    Partitioning Tables 324

    Partition Views and Parallelism 324

    Partitioned Tables 326

    Partitioning on Dimensions Other Than Time 328

    Oracle Physical Design Considerations 328

    Consider Using Parallel Technology 328

    Memory Management 329

Page 13

    Striping and Object Placement 330

    Read-Only Tablespaces 331

    Rollback Segments 332

    Freelists 332

    Indexes 332

    Reclaim Space by Utilizing ALTER TABLE DEALLOCATE 332

    Explore the Hidden Parameters 334

    Define Temporary Tablespaces as TEMPORARY 335

    SORT_AREA_SIZE 336

    Summary 336
  1. Exploiting Parallel Technology 337
    SMP Versus MPP 338
    Oracle Parallel Server 339

    Pinging: The P Word 340

    How Many Freelists Do You Need? 342

    Parallel Query Option (PQO) 342

    Skewing 343

    The Optimizer 343

    Distribution of Data to Maximize Parallelization 348

    Summary 349
  2. Indexes 351
    Indexes 352

    B*-Tree Indexes 352

    B-Tree Index Advantages 354

    Disadvantages 355

    CREATE INDEX Syntax 356

    Bitmap Indexes 357

    What Is a Bitmap Index? 358

    When to Use Bitmap Indexes 358

    CREATE BITMAP INDEX Syntax 360

    Restrictions (Features!) 360

    Bitmap Query 361

    Bitmap INIT.ORA Parameters 361

    Hash Clusters 362

    CREATE CLUSTER Syntax 362

    Hash Cluster Parameters 363

    Pre-Allocation of Space for a Hash Cluster 366

    Hash Cluster Advantages 366

    Hashing Disadvantages 366

    Partitioned Indexes 367
    Summary 367

Page 14

Part IV Management of a Data Warehouse

  1. Security 371
    Privileges 372

    System Privileges 372

    Object Privileges 373

    Data Dictionary Tables 375

    Listing of User Privileges 375

    Listing of System Privileges 376

    Listing of Object Privileges 376

    Column Privilege Listing 377

    ALTER USER 378
    Roles 378

    Creating Roles 379

    Granting Privileges to Roles 379

    Granting Roles to Users or Roles 380

    Enabling Roles 380

    Enabling Roles by Setting Default Roles 380

    Predefined Roles 381

    Revoking Privileges from Roles 382

    Revoking Roles from Users or Roles 382

    Dropping Roles 382

    Data Dictionary Role Views 383

    New Oracle8 Feature Regarding Passwords 385
    Summary 385
  2. Backup and Recovery 387
    Developing a Backup Strategy 388

    Backup Strategies in NOARCHIVELOG Mode 389

    Backup Strategies in ARCHIVELOG Mode 390

    Types of Backups 391
    Performing Database Backups 391

    Listing the Files for Backup 391

    Performing Full Offline Backups 393

    Performing Online Partial Backups 394

    More About ARCHIVELOG Mode 395

    Enabling ARCHIVELOG Mode 396

    Steps to Enabling ARCHIVELOG Mode 397

    Checking Archive Status 397

    Control File Backup 398
    Database Recovery 398

    Redo Log Information 399

    Recovery Types 400

Page 15

    Media Recovery 402

    Control File Recovery 402

    Instance Recovery 402

    NOARCHIVELOG Recovery 403

    Recovery Tablespaces/Data Files 403

    Summary 404
  1. Loads 407
    Designing Loads 408
    Data Consistency 410

    Load Consistency 410

    Referential Integrity 411

    Freezing Time Variant Data 412

    Designing Load Process Streams 412
    Load Windows 414
    Common Load Problems 415
    Load Quality Assurance 416
    Back Out, Recovery, Restart 417
    Speeding Up Loads 418
    Technology Alternatives 422
    Capacity Planning for Loads 424

    Disk Storage Capacity 424

    Disk and Tape Throughput Capacity 425

    Network Capacity 426

    Mainframe Channel, Front-End Processor, and
    Communication Processor Capacity 426

    Summary 429
  2. Tuning Loads and Scrubs 431
    Why Are Loads and Scrubs Always a Performance Problem? 432

    Trade-Offs 433

    Analyze the Load/Scrub Process 433
    Optimizing Loads 434

    Is the Initial Load Representative of Ominous
    Things to Come? 434

    Extracts and Data Movement 434

    Dealing with Incremental Loads 435

    SQL*Loader Options 435

    Loading in Parallel 436

    Third-Party Load Products 436
    Network Issues 436

    Network Protocol Options 437

    Network Strategy for an ODS 437

Page 16

    Scrub Code Tuning 438

    Do What You Can on the Source System 438

    Dealing with Computed Fields 439

    Aggregate Processing Creativity 440

    What Do You Do About Referential Integrity? 440

    Use Common Tuning Techniques 441
    Resorting to a 3GL 442
    Summary 442
  1. Memory Tuning and Other DBA Tuning Considerations 443
    System Global Area 444

    The Shared Pool 447

    Data Block Buffer 452

    Other Memory Considerations 456

    Size of Database Buffer 456

    Tuning for Sorts 457

    Utilities and Other Server Tuning 458

    UTLBSTAT and UTLESTAT 458

    Rule-Based versus Cost-Based Optimizer 459

    PCTFREE 461

    Server Manager 462

    Summary 462

Part V Data Access

  1. Using the Intranet 465
    PC and Spreadsheet: Killer Combo of the 1980s 466
    Data Warehouse and the Internet: Killer Combo for
    the 1990s? 466
    An Introduction to the Technologies of the Internet 467
    What Is an Intranet? 470
    Why Use an Intranet? 473
    Connecting a Database to the Intranet 474
    Oracle Web Server 475
    Publishing and Delivery of Data Warehouse Data to
    the Intranet 476
    Querying the Data Warehouse Data from the Intranet 476
    Summary 477
  2. Front-End Tools 479
    What Is a Front-End Tool? 482
    Front-End Tool Terminology 483

    Multi-Dimensional 483

    OLAP 484

Page 17

    Slice and Dice 485

    Drill 485

    Traditional Programming Languages 485
    Types of Tools 487

    End-User Tools Versus Front-End Development Tools 487

    Ad Hoc Query 487

    Report Writers 487

    EIS Tools 488

    Data Mining Tools 488

    Web Browser-Based Tools 489

    Sample Front-End Tools 489

    Oracle Discoverer 489

    Oracle Express 491

    Oracle Developer 2000 491

    BusinessObjects 492

    DataMiner 492

    WebIntelligence 492

    Impromptu and Power Play 493

    Scenario 493

    Developing the User Layer 493
    Front-End Tool Performance Issues 494

    The Correct Performance Tuning Method 495

    Tuning the Back-End 496

    Sort Performance Problems 496

    Query Optimization 497

    Data Movement 497

    Memory Allocation 498

    Ease of Use: A Different Kind of Performance Tuning 498

    Selecting Front-End Tools 499

    Selection Approach 500

    Selection Criteria 507

    A Word on the RFx 511

    Information Sources 513

    Tools for Requirements Gathering 514
    Tools Administration 516
    Summary 519
  1. Data Mining 521
    Data Mining Applications 522
    Data Mining Approaches and Models 524
    The Methods Behind Data Mining 524

    Decision Trees 525

    Simple Descriptive Statistics 526

Page 18

    Linear (and Other) Regression Analysis 528

    Cluster Analysis 529

    Fuzzy Logic 530

    Neural Networks 531

    How Data Mining Fits with Data Warehousing 533
    Implementation Road Map 536
    Database Design Issues 536
    Product Selection 538
    Summary 541
  1. Tuning Queries 543
    The Optimizer 544

    Rule-Based Optimization 545

    Cost-Based Optimization 545

    Collecting Statistics 546

    Analyzing Tables 546

    Analyzing Indexes 547

    Cost Optimization Goal 548

    Query Optimization Techniques 548

    Hints Overview 548

    Index Usage 551

    Index Usage Examples 551

    Index Suppression 551

    Index Suppression Examples 552

    Forcing Index Suppression 552

    Forcing Index Usage 553

    GROUP BY 553

    Query Restructuring 554

    Query Optimization—OR / UNION 555

    Query Optimization—NOT EXISTS 555

    Query Optimization—Combine 556

    EXPLAIN_PLAN 557

    The PLAN_TABLE 557

    Setting Up a Plan 558

    EXPLAIN_PLAN Operations 560

    TKPROF 576

    How to Run TKPROF 576

    TKPROF Syntax 579

    Interpreting the TKPROF Results 581

    Query Tuning for the Star Schema 583

    Tuning Tips for Regular Star Optimization
    (Releases 7.2 and Higher) 584

    Star Transformation Optimization Tips 585

    Summary 586

Page 19

  1. Data Marts 587
    Types of Data Marts 589

    Satellite Data Marts 590

    Feeder Data Marts 593

    Partition Data Marts 595

    The Mini-Warehouse Data Mart 596

    Independent Data Marts 600

    Mixed Data Mart Architectures 606

    Building a Data Mart 608
    Tools for Data Marts 610

    Data Mart Implementation Tools 610

    The Oracle Data Mart Suite 611

    Oracle Express and Data Marts 611

    Data Mart Extremes—Some Perspective 612
    Summary 613
  2. Operational Data Store and the Operational Environment 615
    Major Limitation of the Data Warehouse: Updates 616
    What's the Difference? 617
    Architectures of DW and ODS 618
    Should the ODS Be Updated Directly? 621
    Determining the Source of Record 622
    Reverse Scrubs 624
    Avoid Moving the Source of Record to the ODS 626
    Edits Revisited 628
    Three Tier Architecture: Linking the Data Warehouse to
    the Operational Environment 629
    The Role of Metadata 630
    Integrating the ODS, Data Warehouse, and Operational
    Environment: Rule Diagnostics 631
    Implementing the Metadata Link 633
    Summary 635

Part VI Appendixes

    A Data Warehouse Checklists for Project Managers 639

    Critical Success Factors for Data Warehousing 640
    Key Tasks and Deliverables 641
    Eight Major Phases in Building the Data Warehouse 642
    Reference Task Checklist 642

    B Survey of Tool Vendors 645

Page 20

    C Survey of Conferences and Seminars 649

    Regular Conferences Specializing in Oracle 650
    Regular Conferences Specializing in Data
    Warehouse/Databases 650
    Other Conferences and Seminars of Interest 651

    D Survey of Publications and Journals 653

    Interesting Data Warehouse_Oriented Web Sites 655

    E Oracle8 Features for Data Warehousing 657

    Partition Tables 658
    More Parallel Features 658
    Fast Full Index Scan 659
    Bitmap Indexes 659
    Star Optimization 659
    Star Transformation Optimization 660
    Backup and Recovery 660
    Object Features 660

    F References 661

    Index665

Page 3

Overview

Introductionxxv

Part I Initialization and Project Planning

  1. Motivation for Building a Data Warehouse3
  2. Data Warehouse Architecture23
  3. Methodology and Project Management37
  4. Surefire Ways to Make Your Warehouse Fail67

Part II Preparation: Integration and Cleansing

  1. Data Integration: The Challenges99
  2. Defining Your Data101
  3. Metadata119
  4. Data Quality and Scrubbing151

Part III Logical and Physical Database Design

  1. Data Modeling Techniques and Options205
  2. Dimensions and Query Hierarchies233
  3. Star Schema and Variants247
  4. Spatial Data: A Very Special Dimension267
  5. Storage Concerns and Planning277
  6. Physical Database Design317
  7. Exploiting Parallel Technology337
  8. Indexes351

Part IV Management of a Data Warehouse

  1. Security369
  2. Backup and Recovery371
  3. Loads387
  4. Tuning Loads and Scrubs407
  5. Memory Tuning and Other DBA Tuning Considerations431

Part V Data Access

  1. Using the Intranet443
  2. Front-End Tools463

Page 4

  1. Data Mining479
  2. Tuning Queries521
  3. Data Marts543
  4. Operational Data Store and the Operational Environment587

Part VI Appendixes

    A Data Warehouse Checklists for Project Managers639

    B Survey of Tool Vendors645

    C Survey of Conferences and Seminars649

    D Survey of Publications and Journals653

    E Oracle8 Features for Data Warehousing 657

    F References661

    Index665

    Page 5

    Table of Contents | Next