Database normalization is a dubious concept for 2 reasons:
- The cost-benefit value of any normal form can only be determined by an information owner
- Only a subject matter expert (SME) can identify the normal form of any set of tables
As an example of reason 1, the table below contains an error which you can see in the last row: an address in Grand Junction, Tennessee has the same ZIP code as the address above it in Grand Junction, Colorado. This is clearly a mistake, but without investigating further we cannot know whether the error is with the State or the Zip.
Breaking City and State off into two separate tables as shown here eliminates any potential for this kind of inconsistency, however the advantage might come at a cost because performance can suffer when applications and reports have to query 2 joined tables instead of 1.
So what is more important, better performance or lower risk of error? Who should decide? Only information owners have the appropriate perspective and incentive to make that kind of decision wisely (see Sophotaxis). Architects and engineers can provide valuable cost-benefit advisement, but final decisions should be made by owners.
The example above is pretty simple, but large business databases can have thousands of similar cost-benefit scenarios that can be far more complex. The more complex the situation, the greater the need for ownership perspective and expertise in the specific business issues at hand.
For an example of reason 2, consider a simple table containing only one column with phone numbers. Many engineers would agree that this table is in first normal form. In most situations it would work fine just as it is, or as a column in a larger table. But for a company such as a telephone service provider, where users would want to group or sort phone numbers by area code or exchange, this table would not be in first normal form. In other words, it would not satisfy even the minimum theoretical standard for use in a modern database system. Instead, in this particular case, the numbers should be broken out into each meaningful component as shown in the lower image.
This shows that even in very simple cases the normal form of a table can only be determined by someone who understands the full-range of potential uses for the information. Business databases are filled with situations like this, but unfortunately most decisions about organizing information are made by technologists rather than business experts. That is why businesses struggle with databases that cannot adapt well to changing requirements.
Information should be organized in a way that makes the most sense to its owner, not according to some predetermined normal form which cannot even be reliably achieved. The normal forms are guidelines to improve performance and protect consistency. But any decision made for performance can potentially degrade the usefulness of the information, and the range of decisions needed to protect consistency can only be determined by a business expert.
For related discussion see: