Introducing Nearline 2.0
In today’s post, I want to introduce the notion of “Nearline 2.0”. While the name might seem esoteric, this concept represents the logical evolution of older data warehouse and information lifecycle approaches that have struggled to maintain acceptable performance levels in the face of the increasingly intense “data tsunami” that looms over today’s business world. Whereas older archiving solutions based their viability on the declining prices of hardware and storage, and rigid “Nearline 1.0” solutions were primarily designed to work with transactional systems, Nearline 2.0 embraces the dynamism of a software and services approach to fully leverage the potential of large enterprise data architectures.

In contrast to these older approaches, Nearline 2.0 allows historical data to be accessed at near-online speeds, empowering business analysts to measure and perfect key business initiatives through analysis of actual historical details. In other words, Nearline 2.0 gives you all the data you want, when and how you want it. (And without impacting the performance of existing warehouse reporting systems!)
Aside from the obvious economic and environmental benefits of this software-centric approach and the associated best practices, the value of Nearline 2.0 can be assessed in terms of the core proposition cited by Tim O’Reilly when he popularized the term “Web 2.0”:
“The value of the software is proportional to the scale and dynamism of the data it helps to manage.”
In this regard, Nearline 2.0 provides a number of important advantages over prior methodologies:
Keeps data accessible: Nearline 2.0 enables optimal performance from the online database while keeping all data easily accessible. This massively reduces the work required to identify, access and restore archived data, while minimizing the performance hit involved in doing so in a production environment.
Keeps the online database “lean”: Because Nearline 2.0 data can still be easily accessed by users at near-online speeds, it allows for much more recent data to be moved out of the online system than would be possible with archiving. This results in far better online system performance and greater flexibility to further support user requirements without performance trade-offs.
Relieves data management stress: Data can be moved to Nearline 2.0 without the substantial ongoing analysis of user access patterns that is usually required by archiving products. The process is typically based on a rule as simple as “move all data older than x months from the ten largest tables”.
Mitigates administrative risk: Unlike archived data, Nearline 2.0 data requires little or no additional ongoing administration, and no additional administrative intervention is required to access it.
Lets analysts be analysts: With Nearline 2.0, far less time is taken up in gaining access to key data and “cleansing it”, so much more time can be spent performing “what if” scenarios before recommending a course of action for the company. This improves not only the productivity but also the quality of work of key business analysts and statistical gurus.
Copes with data structure changes: Nearline 2.0 can easily deal with data model changes, making it possible to query data structured according to an older model alongside current data. With archive data, this would require considerable administrative work.
Leverages existing storage environments: Compared to older archiving products/strategies, the high degree of compression offered by Nearline 2.0 greatly increases the amount of information that can be stored as well as the speed at which it can be accessed.
Keeps data private and secure: Nearline 2.0 has optional privacy and security packages that protect key information from being seen by ad-hoc business analysts (for example: names, social security numbers, credit card information).
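To make the “move all data older than x months from the ten largest tables” rule mentioned above a little more concrete, here is a minimal Python sketch of how such a rule might be evaluated. It is an illustration only, not SAND's implementation: the function names (months_before, plan_nearline_move, split_rows), the table-size dictionary and the (date, payload) row layout are all assumptions made for this example.

```python
from datetime import date


def months_before(today, months):
    """Return the first day of the month that lies `months` months before `today`."""
    total = today.year * 12 + (today.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def plan_nearline_move(table_sizes, today, months, top_n=10):
    """Pick the `top_n` largest tables and the cutoff date for the move rule.

    `table_sizes` is a hypothetical mapping of table name -> size (rows or bytes).
    """
    largest = sorted(table_sizes, key=table_sizes.get, reverse=True)[:top_n]
    return largest, months_before(today, months)


def split_rows(rows, cutoff):
    """Split hypothetical (row_date, payload) rows into online and nearline sets."""
    online = [row for row in rows if row[0] >= cutoff]
    nearline = [row for row in rows if row[0] < cutoff]
    return online, nearline


if __name__ == "__main__":
    sizes = {"sales": 900, "orders": 500, "customers": 50}
    tables, cutoff = plan_nearline_move(sizes, date(2024, 3, 15), months=6, top_n=2)
    print(tables, cutoff)
```

The point of the sketch is that the rule needs only table sizes and a date column; no analysis of user access patterns is involved in deciding what moves.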
In short, Nearline 2.0 offers a significant advantage over older Nearline 1.0 and archiving technologies. When data needs to be removed from the online database in order to improve performance, but still needs to be readily accessible by users to conduct long-term analyses or to rebuild aggregates/KPIs/InfoCubes for period-over-period analysis, Nearline 2.0 is currently the only workable solution available.
In my next post, I’ll discuss more specifically how implementing a Nearline 2.0 solution can benefit both your data warehouse and your business.
Richard Grondin
About SAND
SAND Technology provides scalable enterprise software and best practices for storing, managing, and accessing all your data, on-demand. SAND/DNA includes cost-effective nearline data access and high-speed, column-based analytics, aCRM, and specialized extensions designed to lower TCO and improve operational performance for SAP NetWeaver BI, IBM DB2, Microsoft SQL Server, Oracle, SAS, and more. SAND has offices in the United States, Canada, the United Kingdom and Central Europe, and can be reached online at www.sand.com.