If you know me, you know I advocate something called Information Lifecycle Governance (ILG) as the proper model for managing trusted information over its lifespan. During a recent conversation at IOD with Sheila Childs, a top Gartner analyst in this subject area, I was reminded of a running dialogue we have on the differences between governance at the storage layer and the alternative approach of using records management and retention models. This got me thinking about the origins of the ILG model, and I decided to take a trip in the “way-back” machine for this posting.
What is Information Lifecycle Management (ILM)?
According to Wikipedia as of this writing, Information Lifecycle Management refers to a wide-ranging set of strategies for administering storage systems on computing devices. Searchstorage.com (an online storage magazine) offers the following explanation:
Information life cycle management (ILM) is a comprehensive approach to managing the flow of an information system’s data and associated metadata from creation and initial storage to the time when it becomes obsolete and is deleted. Unlike earlier approaches to data storage management, ILM involves all aspects of dealing with data, starting with user practices, rather than just automating storage procedures, as for example, hierarchical storage management (HSM) does. Also in contrast to older systems, ILM enables more complex criteria for storage management than data age and frequency of access. ILM products automate the processes involved, typically organizing data into separate tiers according to specified policies, and automating data migration from one tier to another based on those criteria. As a rule, newer data, and data that must be accessed more frequently, is stored on faster, but more expensive storage media, while less critical data is stored on cheaper, but slower media. However, the ILM approach recognizes that the importance of any data does not rely solely on its age or how often it’s accessed. Users can specify different policies for data that declines in value at different rates or that retains its value throughout its life span. A path management application, either as a component of ILM software or working in conjunction with it, makes it possible to retrieve any data stored by keeping track of where everything is in the storage cycle.
If you were able to get all the way through that (I had to read it three times), you probably concluded that it was (1) way too complicated, (2) very storage-centric and likely too costly, and (3) incomplete. These are all reasons why ILM never took hold and is widely considered a failed concept.
But hold on … let’s not throw the baby out with the bathwater quite yet. The underlying idea is sound but needs modification. In my opinion, here is what was wrong with the notion of ILM when it came to prominence in 2002 or so:
Information Lifecycle Management (ILM) Failed: It Failed to Consider Trusted Information
It’s incomplete.
Frequency of access does not determine the usefulness of information. Any set of policies needs to account for the value of the information to the business itself, as well as the legal and regulatory obligations attached to it. Calculating only how recently files were accessed and used is an incomplete approach. Wouldn’t it make sense to understand all of the relevant facets of information value (and obligations) along with frequency of access?
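To make that point concrete, here is a minimal sketch in Python of a disposition check that looks at obligations and business value before it looks at access recency. Everything in it is an assumption made for illustration: the `InfoItem` fields, the two-level value scale, the `disposition` function, and the one-year staleness threshold are hypothetical and do not come from any ILM product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class InfoItem:
    name: str
    last_accessed: date
    business_value: str                  # hypothetical scale: "high" or "low"
    retain_until: Optional[date] = None  # end of a legal or regulatory retention period
    on_legal_hold: bool = False


def disposition(item: InfoItem, today: date, stale_after_days: int = 365) -> str:
    """Return "retain" or "dispose" for a single item."""
    # Obligations are checked first: they override value and access patterns.
    if item.on_legal_hold:
        return "retain"
    if item.retain_until is not None and today <= item.retain_until:
        return "retain"
    # Business value is checked next.
    if item.business_value == "high":
        return "retain"
    # Only now does recency of access matter.
    stale = (today - item.last_accessed) > timedelta(days=stale_after_days)
    return "dispose" if stale else "retain"


today = date(2024, 1, 1)
contract = InfoItem("contract.pdf", date(2020, 1, 1), "low", retain_until=date(2027, 1, 1))
memo = InfoItem("memo.txt", date(2020, 1, 1), "low")
print(disposition(contract, today))  # retain
print(disposition(memo, today))      # dispose
```

Both files were last touched four years ago, so an access-only policy would treat them identically. The contract survives only because the retention obligation is part of the decision.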
It’s inefficient and leads to error.
Managing policies at the device level is a bad idea. As an example, many storage devices require setting the retention policy at the device itself. This seems crazy to me as a general principle. Laws and obligations change, policies change, humans make errors … all of which leads to a very manual, time-consuming and error-prone policy administration process. Wouldn’t a centrally managed policy layer make more sense?
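As a rough illustration of what a centrally managed policy layer could look like, the sketch below keeps retention periods in one registry that every repository consults, instead of configuring each device separately. The `PolicyRegistry` and `Repository` classes, the record class name, and the retention periods are all hypothetical; this is a toy model of the idea, not a description of how any vendor implements it.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


class PolicyRegistry:
    """Single place where retention periods are defined (hypothetical)."""

    def __init__(self):
        self._retention_years = {}

    def set_policy(self, record_class: str, years: int) -> None:
        self._retention_years[record_class] = years

    def disposal_date(self, record_class: str, created: date) -> date:
        return add_years(created, self._retention_years[record_class])


class Repository:
    """A storage location that stores no policy of its own."""

    def __init__(self, name: str, registry: PolicyRegistry):
        self.name = name
        self._registry = registry
        self._records = {}  # record_id -> (record_class, created)

    def store(self, record_id: str, record_class: str, created: date) -> None:
        self._records[record_id] = (record_class, created)

    def due_for_disposal(self, today: date) -> list:
        # The policy is looked up at decision time, so changes apply immediately.
        return sorted(
            rid for rid, (cls, created) in self._records.items()
            if self._registry.disposal_date(cls, created) <= today
        )


registry = PolicyRegistry()
registry.set_policy("invoice", 7)
file_share = Repository("file-share", registry)
archive = Repository("archive", registry)
file_share.store("inv-1", "invoice", date(2016, 1, 15))
archive.store("inv-2", "invoice", date(2015, 3, 1))

today = date(2024, 6, 1)
print(file_share.due_for_disposal(today))  # ['inv-1']
registry.set_policy("invoice", 10)         # the obligation changes once, centrally
print(file_share.due_for_disposal(today))  # []
```

When the retention period changes, it is edited in one place and both repositories pick it up on their next check. With device-level policies, the same change would have to be repeated by hand on every device.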
It’s not well understood and can be too costly.
This model has led to the overbuying of storage. Many organizations have purchased protected storage when it was not necessary. These devices are referred to as NENR (Non-Erasable, Non-Rewritable) or WORM (Write Once, Read Many). They come in multiple flavors: WORM Optical, WORM Tape and Magnetic Disk WORM (Subsystem), and can include multiple disks with tiered tape support. Sample products include: EMC Centera, Hitachi HCAP, IBM DR550, NetApp Snaplock and IBM Information Archive.
This class of storage costs more than other forms of storage, primarily because of the perception of safety. Certain storage vendors (who will remain nameless) have latched onto this market confusion and even today try to “oversell” storage devices as a substitute for good governance, often to uninformed or ill-advised buyers. The fact is, only the SEC 17a-4 regulation requires WORM storage. Using WORM for applications other than SEC 17a-4 usually means you are paying too much for storage and creating retention conflicts (more on this in a future posting). The point is … only buy protected storage when appropriate to your requirements or obligations.
If we could just fix those issues, is the ILM concept worth revisiting? It’s really not that hard a concept. Over 90% of information is born digital, and over 95% of it eventually expires and needs to be disposed of. Here is a simple concept to consider:
A simple model for governing information over its lifespan
I will go deeper into this concept (and model) in my next posting. In the meantime, leave me your thoughts on the topic.
I am also curious to know if you have been approached by an overzealous vendor trying to sell you WORM-based storage as a replacement for good governance or records management. I will publish the results.