Big data can provide powerful insights into large data sets. Some scholars and practitioners have even suggested that big data tools and techniques might replace relational databases for ordinary business use, but this claim offers only false hope for organizations that struggle with relational databases. Here's why:
Big data systems work with information organized into small 2-part chunks known as key-value pairs (or some related format). For example: Last Name = Fuller; City = Redmond; Car = Honda Accord; Order status = complete.
Organizing information this way is great for things like analyzing trends and detecting patterns. But big data formats cannot be used for ordinary business reporting unless each record is tagged with additional information to tell which other records it is related to. For example: this address belongs to that person; this item goes with that order, and so forth. Applying these kinds of tags to information in a big data format requires exactly the same kind of discipline and pre-planning as it would if it were organized for a relational database. Big data offers nothing new in this regard.
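As a minimal sketch of the tagging point above, the Python below compares untagged key-value pairs with pairs that carry a record identifier. The `record_id` tag, the function name `assemble`, and the second person (Smith, Seattle) are assumptions made for illustration; they do not come from any particular big data product.

```python
from collections import Counter, defaultdict

# Untagged pairs: enough to count values, but nothing says which
# city belongs to which last name.
untagged = [
    ("last_name", "Fuller"), ("city", "Redmond"),
    ("last_name", "Smith"), ("city", "Seattle"),
]
city_counts = Counter(value for key, value in untagged if key == "city")

# Tagged pairs: each pair carries a hypothetical record_id that states
# which other pairs it is related to.
tagged = [
    (1, "last_name", "Fuller"), (1, "city", "Redmond"),
    (2, "last_name", "Smith"), (2, "city", "Seattle"),
]

def assemble(pairs):
    """Group tagged key-value pairs into one dict per record_id."""
    records = defaultdict(dict)
    for record_id, key, value in pairs:
        records[record_id][key] = value
    return dict(records)

people = assemble(tagged)
# Only the tagged form can answer "which city does each person live in?"
city_of = {person["last_name"]: person["city"] for person in people.values()}
print(city_counts)
print(city_of)
```

Choosing the tag (one identifier per person, per order, per address) is the same design decision as choosing keys for relational tables, which is the point of the paragraph above.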
Even when a big data record set includes complete information about the relationships among its key-value pairs, big data technologies do not offer anywhere near the flexibility of relational databases for reporting purposes. So any claim that big data presents a plausible alternative to relational databases for general business use is uninformed and false.
Here are 3 ways to lower costs and improve outcomes in any BI or analytics project, or for that matter any information management effort. They require varying degrees of commitment ranging from easy, free, and doable right now, to a need for significant change in organization and culture:
- Have each information owner (meaning the business person who decides the requirements for an information resource) actually look at the proposed way the information will be organized; in other words, the actual tables and relationships with sample data (very important). Some might think this sounds like asking them to look at programming code. Not so! Programming code is purely technical in nature. Deciding how information is organized is purely business in nature; the only input that should be needed from IT is cost-benefit advice on issues like performance. The way information is organized determines how it can be used. The operational capabilities of any organization are literally determined by the way its information is organized. A business owner cannot possibly make a complete list of every possible use case for an information resource, but when they look directly at a proposed format and see how it is organized, they can easily determine whether that resource will meet their foreseeable needs even before they try to express any requirement. Of course, what an information owner considers foreseeable can change over time, but it will always change more slowly than the constantly changing list that must be maintained when an architect or engineer is in charge of determining requirements from their own perspective. The time a business expert spends doing this will be returned many times over. It will reduce scope creep, save development cycles and produce better outcomes every time, guaranteed.
- Assign an information manager to every business unit. Information managers can be drawn from the same talent pool as business analysts. They are business-oriented professionals, often trained at university business schools to manage information and determine how it should be organized. There is no reason these people should work in any IT department except as information managers for business units within IT. Properly placed information managers will eliminate the need for business analysts and will be 3x to 10x more effective and productive. Information managers should report to the same office as business managers, usually a GM or VP. An information manager should be responsible for deciding how information produced by that business unit will be organized according to the priorities and requirements of the business. When information needs to be organized across multiple systems and business units, information managers should coordinate with their cross-department peers and respond to policies set by senior information managers who report directly to the COO or CFO. Information managers for large business units may require a staff, as will those for some smaller organizations, depending on the rate of change and complexity of the information they manage.
- Have everyone in your organization take a course in fundamental logic. This is not too much to ask – logic was once a core focus of classical education. In fact it was one of the main reasons universities were invented in the first place. Logic remained a central pillar of university curricula for hundreds of years until around the 1940s, but since then it has been severely de-emphasized, to the great detriment of the discipline of information management. Today a person can earn a PhD in nearly any subject, including business administration or computer science, without taking a single introductory course in formal logic. Almost any person at any level of a modern organization can create new information resources, so logic education and logic-aware management are absolutely essential for any organization that wants to build an effective culture and capacity to manage information.
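The first suggestion in the list above asks information owners to review actual tables and relationships with sample data. A minimal sketch of what such a review artifact might look like, using Python's built-in `sqlite3` module, follows. The table names, column names, and sample rows are hypothetical, loosely based on the key-value example earlier in this article.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical proposed organization: two tables and one relationship.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    last_name   TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    item        TEXT NOT NULL,
    status      TEXT NOT NULL
);
INSERT INTO customer VALUES (1, 'Fuller', 'Redmond');
INSERT INTO customer_order VALUES (100, 1, 'Honda Accord', 'complete');
INSERT INTO customer_order VALUES (101, 1, 'Floor mats', 'pending');
""")

# The sample data an information owner would look at: which item goes
# with which customer, and what state each order is in.
rows = conn.execute("""
    SELECT c.last_name, o.item, o.status
    FROM customer AS c
    JOIN customer_order AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

An owner reading this layout can see, for instance, that one customer can have many orders but each order holds exactly one item. If the business sells multi-item orders, that gap is visible before any requirement is written down.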
A business resource is anything that brings value to a business. Classical economists described business resources in terms of factors of production. Land, labor and capital are the primary factors because they do not become part of any finished product and are not consumed or significantly changed by the production process. Resources such as raw materials and energy are secondary because they are derived from the primary factors. From the classical perspective even things like entrepreneurship, intellectual property and the time value of money are derived from labor and capital, so they too are considered secondary formulations of the primary factors.
So where does information fit in? Information is obviously an important business resource, but is it a primary factor or secondary? Or is it something else?
Information is consumed in the production process, but not in the sense that it is depleted or reduced; in fact, new information is created by every act of production and commerce. Further, information is non-fungible: one unit of it cannot be substituted for another the way one kilowatt-hour of electricity, one ounce of gold, or one computer can. Information cannot be replaced the way a building or an executive can be. No business resource can be effectively utilized without information.
For these reasons information must be acknowledged as superior to every other business resource. It is more primary than the primary factors. Information is the sine qua non of all commerce – a status not even money can claim. Money is, after all, a form of information.
So why did the classical economists not have anything to say about information as a factor of production? My guess is that information is so essential to every aspect of commerce that until the mid-20th century it was not even recognized as a distinct resource class. In 1963 a professor of management noted, "As late as 1946 there were in the combined professional, technical and scientific press of the United States only seven articles on the subject of information."
Information is like the air we breathe – nothing can happen without it, but it is easy to ignore until you have reason to notice.
Managing information is the most difficult and costly operational challenge facing most businesses. At the root of the problem is a failure to recognize the distinction between information resources and technology resources. To their detriment, businesses treat them as the same thing. My evidence for this claim is that no distinction is ever made in the requirements expressed for either, or in the way each is managed. They are delivered and maintained by the same people and no distinction is recognized at any point in the lifecycle processes of either type of resource. Information resources are mistakenly treated as components of automated systems.
As a result, some of the most important management decisions at every level of enterprise organizations are unwittingly delegated to technical specialists instead of business experts. Efforts to address the resulting problems without addressing the root cause only make the problems worse. It is a vicious circle that creates thick layers of artificial complexity in the form of initiatives, roles and processes which lead to additional costs and complexity. The only way to solve the problem is to recognize that information resources are not the same thing as the technology-based tools used to access and maintain them. Businesses must develop a capacity to determine and express requirements for information resources separately from those of automated systems.
An information resource is information organized for some purpose. It can take the form of anything from a memorized telephone number to the Library of Congress or the entire internet. The following table lists various types of information resources, how they are organized, and what they are useful for:
| Information resource | Organized by | Useful for |
|---|---|---|
| File cabinet | Drawers with alpha or numeric sorting | Manual document retrieval |
| Novel | Sentences, paragraphs, chapters | Entertainment, relaxation |
| Library | Subject, author | Finding publications |
| Relational database | Tables, columns, rules, relationships | Flexible storage, retrieval and analysis |
| XML file | Tags, nested hierarchies | Transporting and sharing data |
| Big data | Key-value pairs | High-volume capture and processing |
| Semantic ontology | Triples (subject, predicate, object) | Making information discoverable |
The way information is organized determines how it can be used, so decisions about the organization of information should be carefully considered by the owner and managers of the resource. Unfortunately, owners and managers usually only provide high-level guidance, and the actual decisions about the way information gets organized are instead delegated to an architect or technical specialist. This is a costly mistake with long-term consequences. The outcome is almost always an information resource that cannot be used the way its owners intend without being modified for every newly desired use.
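To illustrate how organization shapes use, the sketch below stores the same hypothetical facts in two of the forms from the table, relational-style rows and subject-predicate-object triples, and answers one question against each. All identifiers, predicate names, and function names here are assumptions made for illustration, not the conventions of any specific database or ontology standard.

```python
# Relational style: rows in two collections, linked by a shared key.
customers = [{"customer_id": 1, "last_name": "Fuller", "city": "Redmond"}]
orders = [{"order_id": 100, "customer_id": 1, "status": "complete"}]

# Triple style: every fact is a (subject, predicate, object) statement.
triples = [
    ("customer:1", "last_name", "Fuller"),
    ("customer:1", "city", "Redmond"),
    ("order:100", "placed_by", "customer:1"),
    ("order:100", "status", "complete"),
]

def cities_relational(customers, orders):
    """Cities with at least one complete order, via a key lookup."""
    by_id = {c["customer_id"]: c for c in customers}
    return sorted(
        {by_id[o["customer_id"]]["city"] for o in orders if o["status"] == "complete"}
    )

def cities_triples(triples):
    """Same question, answered by following predicates between subjects."""
    facts = {(s, p): o for s, p, o in triples}
    complete = [s for (s, p), o in facts.items() if p == "status" and o == "complete"]
    return sorted({facts[(facts[(order, "placed_by")], "city")] for order in complete})

print(cities_relational(customers, orders))
print(cities_triples(triples))
```

Both forms can answer the question only because each records which order belongs to which customer. That relationship is a decision about how the information is organized, and it is the kind of decision the paragraph above argues should rest with the resource's owner.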