October 20, 2008
Starting Nearline 2.0: The Quick Check Approach
In previous posts, I introduced the concept of “Best Practices” for Nearline 2.0. Today, I will get down to the details of how and where to start with a Nearline 2.0 solution, beginning with a Best Practices approach designed to quickly identify the benefits of such an implementation in a given environment. At SAND Technology, we offer this “Nearline 2.0 Quick Check” as part of our professional services portfolio.
October 13, 2008
Nearline 2.0 Best Practices
In previous posts, we introduced the concept of Nearline 2.0, showed how it represented a significant step forward from traditional archiving practices, and discussed how Nearline 2.0 could help your business. To recapitulate: the major advantage of Nearline 2.0 is its superior data access performance, which enables a more aggressive approach to migrating data out of the online repository to nearline (a process known as “data nearlining”) than is practical when using a traditional archiving product.
October 6, 2008
How Can a Nearline 2.0 Solution Help Your Business?
In my last post, I discussed how a Nearline 2.0 solution allows vast amounts of detail data to be accessed at speeds that rival the performance of online systems, which in turn gives business analysts the power to assess and fine-tune important business initiatives on the basis of actual historical facts. We saw that the promise of Nearline 2.0 is basically to give you all the data you want, when and how you want it, without compromising the performance of existing warehouse reporting systems.
September 29, 2008
Nearline 2.0 vs. the Archive
In his most recent SAND blog post, Richard introduced the notion of “Nearline 2.0” and discussed how this concept, and related best practices, can be of vital importance to businesses dealing with the “data tsunami” we’ve been experiencing in recent years.
In this post, I’d like to step back a moment and explore the ways in which the dynamics of Nearline 2.0 differ from traditional methods of data archiving in terms of their approach to keeping data warehouse size under control.
September 23, 2008
Introducing Nearline 2.0
In today’s post, I want to introduce the notion of “Nearline 2.0”. While the name might seem esoteric, this concept represents the logical evolution of older data warehouse and information lifecycle approaches that have struggled to maintain acceptable performance levels in the face of the increasingly intense “data tsunami” that looms over today’s business world. Whereas older archiving solutions based their viability on the declining prices of hardware and storage, and rigid “Nearline 1.0” solutions were primarily designed to work with transactional systems, Nearline 2.0 embraces the dynamism of a software and services approach to fully leverage the potential of large enterprise data architectures.
August 13, 2008
Intelligent Information Management Part 2
In my previous post, I quickly introduced the concept of Intelligent Information Management. In today’s post, I discuss Information Lifecycle Management (ILM). ILM is one component of IIM best practices, dealing with the management of data from the moment of its creation up to its disposal. Studies have demonstrated that the rate of access for a given data set drops dramatically after 90 days. In fact, some studies claim that currently less than 30% of the data in an enterprise data warehouse (EDW) is actively accessed by users. However, some organizations are responding to data retention regulations by storing data that is accessed very rarely in the data warehouse “just in case”, causing unnecessary database growth and increased TCO. These organizations are essentially using the EDW as a storage device – a very expensive one indeed!
July 18, 2008
Intelligent Information Management Part 1
Summer is well underway here in Montreal, bringing with it blue skies and warm temperatures. This is an exciting development, especially since the last winter was very long, just like the amount of space I could devote to the topic of today’s post from the bus! There are many different aspects of Intelligent Information Management (IIM) to be discussed, and the subject is just too important to be dealt with in a hurry, so I will be covering this topic in multiple posts.
June 18, 2008
Columnar Deduplication and Column Tokenization: Improving Database Performance, Security, and Interoperability
For some time now, a special technique called columnar deduplication has been implemented by a number of commercially available relational database management systems. In today’s blog post, I discuss the nature and benefits of this technique, which I will refer to as column tokenization for reasons that will become evident.
June 3, 2008
CBAT Part 2: Flexible Data Modeling for a Simplified End User Experience
In my last blog post, I explained how Column-Based Architecture Technology (CBAT) offers a distinct advantage over the traditional row-oriented RDBMS in terms of I/O workload, deriving primarily from basing the granularity of I/O operations on the column rather than the entire row. This technological advantage has a direct impact on the complexity of data modeling tasks and on the end-user’s experience of the data warehouse, and this is what I will discuss in today’s post.
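As a rough illustration of the I/O point above, the sketch below compares how many bytes a query touching a single column would have to read under a row-oriented layout and under a column-oriented one. The table shape, column widths, and function names are hypothetical and are not taken from any SAND product. The model deliberately ignores compression, page sizes, and indexes.

```python
# Hypothetical fixed-width table: column name -> width in bytes (assumed values).
COLUMN_WIDTHS = {
    "order_id": 8,
    "customer_id": 8,
    "order_date": 4,
    "amount": 8,
    "comment": 100,
}


def row_store_bytes(num_rows, widths):
    """Row layout: each full row is read, whichever columns the query needs."""
    return num_rows * sum(widths.values())


def column_store_bytes(num_rows, widths, needed_columns):
    """Column layout: only the columns the query references are read."""
    return num_rows * sum(widths[name] for name in needed_columns)


if __name__ == "__main__":
    rows = 1_000_000
    # A query such as SUM(amount) references only one column.
    print("row store bytes:   ", row_store_bytes(rows, COLUMN_WIDTHS))
    print("column store bytes:", column_store_bytes(rows, COLUMN_WIDTHS, ["amount"]))
```

Under these assumed widths, the wider the table is relative to the columns a query actually uses, the larger the gap between the two figures becomes.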
May 1, 2008
CBAT Part 1: The I/O Advantage
Column-Based Analytical Technology (CBAT) has been getting a lot of attention recently in the data warehouse marketplace and trade press. Interestingly, some of the newer companies offering CBAT-based products give the impression that this is an entirely new development in the RDBMS arena. I don’t know where they have been for the last 10 years! This technology has actually been around for quite a while, and at SAND we have been working with it since 1987. But the market has only recently started to recognize the many benefits of CBAT. So, why is CBAT now coming to be recognized as the technology that offers the best support for very large, complex data warehouses intended to support ad hoc analytics? In my opinion, one of the fundamental reasons is the reduction in I/O workload that it enables.