In a requirements gathering session I was asked a question that I should have knocked out of the park. A subject matter expert asks, “Why dimensional modeling?” I knew that there was no time to go into a long description of exactly what dimensional modeling is, nor did anyone (other than me) really care. I pretty much had about 30 seconds to describe why dimensional modeling was the best technique to use. To make the situation a little more interesting, the organization I am working with has a somewhat shaky history with dimensional modeling. In short, a group was well on their way to implementing a solid model when the business decided it was taking too long and stopped the project. So, here I am recommending a solution that has, in some people’s minds, failed previously.
Ok, back to the point. What did I say? First, let me preface this quote by saying that I’m not proud of this answer. I said, “Dimensional modeling is a simpler solution because it has less joins as compared to 3NF models.” I really wish I could get a redo on that answer. After spending quite a few years building dimensional models, authoring the most subscribed to YouTube channel focused on dimensional modeling, and teaching the concepts to a number of up and coming technicians this is the best answer I could come up with at the time. Since this time I have thought about the situation quite a bit. So, although the original audience will likely never see this redo, here’s why I choose to implement data warehouses using dimensional modeling techniques.
First, I agree with Kimball. We should be creating a data warehouse that is designed to meet the needs of people that are accessing it. People like things that are simple and fast. Dimensional modeling is clearly simpler than a normalized structure. The design also is structured in a way that does minimize joins which often leads to better performance.
Second, the nature of dimensional modeling is such it lends itself to iterative development methods very naturally. As long as we adhere to the basic concepts such as conformed dimensions, bus architecture, and SCD handling, etc… this is a very extendable technique. This is a point that is not focused on too often, but I think it is critical to success. That is not to say that other techniques can’t achieve the same behavior, but in my opinion dimensional modeling caters to agile development methodologies better than most techniques.
Finally, the data warehousing community seems to feel the same way. Sure, some groups will insist on building a 3NF structure before loading a dimensional mart. Regardless, I think it is fair to say that most data warehouses will present a dimensional structure to the data consumers. This may be through a semantic layer in some cases and instantiated in others. What I’m saying is, don’t trust me. Look at the wealth of very smart and experienced practitioners that choose to use dimensional modeling techniques.
So, next time I’ll do a better job of answering this question. As a practitioner I do believe it is my responsibility to justify the recommendations I’m making. The tricky part of these discussions, however, is you always have edge cases. Someone might say that they could load a set of facts and dimensional attributes in one table to serve a specific need & their table will perform better than a properly structured dimensional model. They very well might be correct. Obviously, such a design is going to have problems soon enough, but in this one case it might be a great solution. To this I say, go ahead and build whatever makes the most sense to your data consumer. The catch is that this should only be done after creating a proper dimensional model. Everything should be sourced from the dimensional model. Why? Because every decision has pros and cons. We need balance. Basing a data warehouse’s design on a single use case is a terrible idea. It will fall apart or become a maintenance nightmare soon enough. Having said this, sometimes people need to experience the outcome of bad design techniques before they can truly appreciate the beauty of a well-designed data warehouse.