Arthur's Blog

October 26, 2009

Evolving the “Humpty Dumpty Warehouse” Into a “Phoenix”

In my last blog post, I responded to Wayne Eckerson’s post on his Wayne’s World Blog for TDWI, which revisited the dilemma of the “Humpty Dumpty Warehouse”. I suggested that the “Phoenix” might be a better model for modern enterprise data warehouses. Wayne continued the discussion in comments:

Arthur, you are right to suggest that the BI team needs to adapt to changes Phoenix-like rather than pick up the pieces every time the organization changes. I guess the Humpty Dumpty metaphor is not the best–albeit a lot of fun–unless the king’s men are using superglue to get Humpty back together again. Certainly, I’m a big advocate of adaptable DW and BI architectures. That’s a given I should have noted!

Rather than superglue (though that sounds like fun!), last time I mentioned several key breakthroughs in information technologies that have matured to the point where a viable, flexible, Phoenix-like EDW can be created without adopting a “rip and replace” strategy that would discard what has already been accomplished within the organization.

Because it maximizes existing investments in both products and people, this represents a much more secure and cost-effective route than trading a well-understood set of problems for a replacement technology that may well solve some problems but will inevitably replace them with a variety of altogether new ones. These breakthroughs include the following:

  • Enhanced database federation capabilities from all the major RDBMS providers, as well as from many Business Intelligence tool vendors like Business Objects.
  • Very high-performance, storage-efficient and massively scalable software-based Nearline 2.0 storage systems that can house the entirety of an organization’s structured detail data, federated with the primary RDBMS.
  • Very high-performance Column-Based Analytic Technology (CBAT) systems to support analytics for power users.
  • Very inexpensive and powerful desktop computers with adequate storage.
  • Relatively inexpensive blade servers.
  • Very high performance, efficient, automated ETL tools that can be used by the organization to set up and control the flow of data over time (including Disaster Recovery support using the Nearline 2.0 storage architecture).

It is now possible to integrate all of these subsystems into a single EDW architecture, resulting in a Scalable Corporate Information Factory (SCIF), to adapt Bill Inmon’s terminology. With this model in place, key BI analysts can focus less on transforming data to achieve a single version of “the truth” (in reality, an unattainable goal) and to enable adequate performance, and more on helping users derive real business value from corporate information by maximizing accessibility to “the facts” for the users who can provide essential business insights.
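As a rough illustration of the federation idea, here is a toy Python sketch that uses SQLite's ATTACH DATABASE as a stand-in for an RDBMS federated with a nearline store. It is only an analogue under stated assumptions, not how any particular product works: the `sales` tables, the `nearline` schema name, the sample rows and the `total_sales` helper are all hypothetical.

```python
import sqlite3

# Assumption: "main" stands in for the primary RDBMS (recent data) and
# "nearline" for the nearline store (older detail). Both are in-memory here.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS nearline")

# Hypothetical tables with the same shape in both stores.
conn.execute("CREATE TABLE main.sales (sale_date TEXT, amount REAL)")
conn.execute("CREATE TABLE nearline.sales (sale_date TEXT, amount REAL)")

conn.executemany(
    "INSERT INTO main.sales VALUES (?, ?)",
    [("2009-10-01", 120.0), ("2009-10-02", 80.0)],
)
conn.executemany(
    "INSERT INTO nearline.sales VALUES (?, ?)",
    [("2007-03-15", 50.0), ("2008-06-30", 70.0)],
)


def total_sales(connection):
    """Sum amounts across both stores with a single query."""
    row = connection.execute(
        "SELECT SUM(amount) FROM ("
        " SELECT amount FROM main.sales"
        " UNION ALL"
        " SELECT amount FROM nearline.sales"
        ")"
    ).fetchone()
    return row[0]


print(total_sales(conn))
```

The point of the sketch is that the analyst issues one query and does not need to know which store holds which rows; the detail in the older store stays reachable rather than being discarded.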

In subsequent posts I will explore this architecture in more detail.


September 26, 2009

Regarding the Humpty Dumpty Data Warehouse Dilemma

Wayne Eckerson, on his Wayne’s World Blog for TDWI, revisits the dilemma of the “Humpty Dumpty Warehouse”:

Most organizations are like Humpty Dumpty teetering and tottering on top of a big wall. With the slightest gust of wind, Humpty crashes and breaks into dozens of pieces. And DW teams are “all the king’s horses and all the king’s men” who are charged with putting Humpty Dumpty back together again.

Whether we’re talking about “Humpty Dumpty” in terms of the enterprise as a whole, or the data within a given warehouse, agreed — DW teams are doing the best they can with what they have. But often so are the CEOs, who are facing battles in the boardroom, battles between the shareholders, the company’s bankers, the boards of directors, the various C-Levels within the organizations and some of their powerful subordinates. Never mind vacuums created when key executives leave, or when mergers and acquisitions, divestitures, etc. change the nature of the business.

Unfortunately, most current Data Warehouses are built in such a way that this Humpty Dumpty dilemma will repeat itself over and over again. The real dirty little secret is that the same tricks used to make DWs efficient for reporting purposes (aggregation, indexing, and the subsequent discarding of underlying details) are the ones that make them difficult — and expensive — to change and update.
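To make this concrete, here is a minimal Python sketch with made-up data. The rows, the column layout and the `aggregate` helper are hypothetical; the only claim is the general one above, that a summary built for one report cannot answer a question that needs a different grouping once the detail is gone.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, product, amount).
detail = [
    ("East", "widgets", 100),
    ("East", "gadgets", 40),
    ("West", "widgets", 60),
    ("West", "gadgets", 25),
]


def aggregate(rows, key_index):
    """Total the amount column, grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


# The report-friendly summary that a warehouse might keep.
by_region = aggregate(detail, 0)

# A new KPI by product needs the detail rows. It cannot be derived from
# by_region alone, because the product column was aggregated away.
by_product = aggregate(detail, 1)

print(by_region)
print(by_product)
```

If only `by_region` had been retained, the second question could not be answered without reloading the source data, which is exactly the cost of change described above.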

So what’s the answer? First, we must realize that there is no such thing as a “single version of the truth”, only a convenient and workable one. Next, we must break out of the “Humpty Dumpty” dilemma and its tragic ending and find a better story and a better model, like the “Phoenix” that can “rise from the ashes” overnight to meet all the new KPIs that support the business needs.

The good news is that technological developments in Nearline 2.0, RDBMS federation capabilities and high-performance ETL tools offer a way for companies to transition from “Humpty Dumpty” to the new “Phoenix”-like approach, without resorting to a “rip and replace” strategy.

My next post will explore these ideas further.


May 22, 2009

Building Corporate Memory Into a Next-Generation Data Warehouse

It is now possible to design and implement a corporate memory within the data warehouse using a number of mature, tested and well-understood products and methodologies that can be deployed relatively quickly and administered with minimal DBA overhead. These solutions can grow with relatively linear scalability in terms of both cost and performance, while providing powerful support for both power analysts and reporting users. The ingredients for a successful data warehouse implementation that makes use of the corporate memory concept involve hardware, software and architectural design components, as listed below:

Read more…


May 4, 2009

Data “Dumping Grounds” and the Importance of Corporate Memory

Received wisdom about data warehousing instructs us not to create a “dumping ground” for our raw detail data. But why not? This principle is a legacy from the not-so-distant past when it was impractical to keep huge amounts of data around if it was not being actively used – so once aggregates had been built, the original details were simply discarded. Of course, this meant that the organization was then confined to working with a particular “version of the truth” that someone had imposed on the data; there was no way to revisit the original details should the need arise for a change of perspective.

Read more…


April 24, 2009

Decision Support for Users Who Don’t Know What They Don’t Know

Since the beginning of the computer era, system designers have struggled to reconcile conflicting aims of performance vs. functionality and maintainability vs. adaptability. In the case of Business Intelligence, there has been no less of a need for tradeoffs in order to deliver workable systems. However, BI system design has also typically been constrained even further by four fundamental realities:

Read more…


April 6, 2009

Redefining the Role of IT in Business Intelligence

If our businesses are going to survive, we need to stop designing Business Intelligence systems that tell us what we want to hear, and which work well in good times but behave incomprehensibly during periods of significant inconsistency. Instead, we need to build systems that empower our best analysts to help correct flaws in our activities and identify opportunities that we can exploit. We need to be in a position where existing paradigms can be challenged and replaced by new ones on an ongoing basis. However, just as new scientific theories need to fit with observed reality, these new business approaches must be well supported by the facts as recorded in a company’s information repository.

Read more…


March 24, 2009

Business Intelligence: An Oxymoron?

An old joke has it that the term “military intelligence” is an oxymoron – and in light of the current global financial crisis, it is tempting to put “Business Intelligence” in this category as well. Our inability to predict or deal competently with major events, from wars in the Middle East to the meltdown of global financial systems, shows just how ineffective our Business Intelligence/Data Warehousing strategies or “fit-for-purpose” reporting systems can be in responding to events as they unfold in this complex world. We are now confounded by the facts: we cannot predict the future; the largest military powers cannot conquer and control much weaker opponents; economists cannot adequately monitor essential financial systems. Automated trading systems, whose rules we once thought we understood and controlled, seem to have taken on a life of their own.

Read more…


September 29, 2008

Nearline 2.0 vs. the Archive

In his most recent SAND blog post, Richard introduced the notion of “Nearline 2.0” and discussed how this concept, and related best practices, can be of vital importance to businesses dealing with the “data tsunami” we’ve been experiencing in recent years.

In this post, I’d like to step back a moment and explore the ways in which the dynamics of Nearline 2.0 differ from traditional methods of data archiving in terms of their approach to keeping data warehouse size under control.

Read more…


December 11, 2007

2008 Season’s Greetings

To all of our friends and associates,

As we approach the end of the year, I would like to express my appreciation for your interest in and support of SAND Technology during 2007.

Read more…


October 10, 2007

Moving Beyond the Data Warehouse Impasse – Part 3

Once you extend your thinking beyond the data warehouse and “free” the data to speak for itself, the potential applications of the data un-warehouse concept are virtually unlimited. Let me suggest three powerful possible applications that would offer substantial benefit:

Read more…

