Today, I want to talk about Power BI. And I’m going to preface this by asking you to refer to some of the prior discussions where I’ve emphasized that Reportopia is not about tools. It’s not the case, though, tools are very important. I do think it makes sense to focus on some of the capabilities that tools have, because they are key. Especially initial decisions that need to be made for those of you that are starting a new solution. That’s what we, at LeapFrogBI, dealt with this last week. We onboarded a new client. One of the first things we must decide is what type of architecture are we going to implement. We’ve talked in the past about some of the data solutions that we might choose and at the very simplest of those data solutions is what we call system of record reporting.
What we’re saying here is we have a data source like QuickBooks, and we want to try to connect Power BI, which is the reporting system that our client is going to use. Most our clients use Power BI. We’re going to try to connect Power BI directly to that source system. That’s the system of record reporting. There’s nothing in between your system of record and the reporting system. Once you go beyond system of record reporting, you’re moving into your staging and your persistent staging areas, your operational data stores, your data warehouse, etc. When you make that step, the critical step between system of record reporting and any other solution, you have some additional complexity, you have some additional cost, and there’s several things that you have to deal with at that point.
It’s critical to decide, if you can get away with it, to just use direct system record reporting. I want to also mention that if you go look at any of the business intelligence tools such as Power BI, Spot Fire, and Tableau, if you look at any of their marketing, you’re going to immediately think that these tools can do it all. You don’t need anything other than this tool, and somehow, it’s going to solve all the problems that you have in the organization. I looked at Power BI to see what their marketing looked like.
Let me just say this, I love Power BI. Power BI is clearly the far in a way front runner today when it comes to business intelligence, what I call front end tools, basically data visualization tools. Power BI is the best we have today. It’s across the board. They offer the highest value. They have a huge amount of functionality. The application is constantly being enhanced. Every single month you get new features. They just stole the show with Power BI, but it wasn’t always like that. In the past, we’ve had several other business intelligence tools that I felt the same way about, frankly, Tableau, not too long ago. We had no idea that any tool would kind of take over the Tableau market, because they pretty much owned it at that time. Of course, Power BI has done that.
So, I went to the Power BI website and looked at their above the fold. I looked at their marketing. It says right there in black and white,
“Create a data-driven culture with business intelligence for all”.
Then, the subtitle to that is:
“Enable everyone at every level of your organization to make confident decisions using up to the minute analytics.”
That’s the goal. You could even take that little heading there and you can make it part of the Reportopiadefinition, because that’s what we’re trying to do. It’s not exactly how I explain it, but it absolutely is what we’re trying to do. We want everyone in an organization to have the information they need to operate efficiently and make good decisions. But here’s my beef, Power BI is a tool. It is the hammer. It’s not the thing that swings the hammer. I’m not trying to diminish Power BI’s role in this at all. It’s not bad. We’ve got to be realistic here that Power BI is a tool.
When you buy a tool, you must have people that know how to use that tool. Even when we talk about data-driven culture, that goes well beyond what Power BI is going to do for you. It is a little bit of a pet peeve of mine. When I look at business intelligence tools, front end tools, and I look at their marketing. It’s very easy to think that you can buy this tool and all of the sudden, all your problems are going to be solved. But if you have that concept in your mind, you’re going to be let down. However, today, what I want to talk about is what can we get away with when we talk about direct system of record reporting. One of the things that we try to do at LeapFrogBI is keep it simple.
In other words, let’s stay as simple as we possibly can in all aspects of these solutions. Architecture is what we’re talking about. Let’s keep it as simple as we possibly can for as long as we possibly can, but let’s also keep in mind what we think is going to happen tomorrow. So, let’s consider at least the known parts of the future, things that we know we might try to do in the future. Let’s go back to my original story, we’re onboarding a new client, and our first order of business is to get an understanding of their needs. So, what type of reporting do they need? Then, we get an understanding of the data source. For this client, they are using QuickBooks. This is not a super huge organization where they have five different data sources that you know you want to integrate. On day one, bigger organizations will have different users that have different requirements. That’s not what we’re talking about.
This is a very confined solution. We know precisely the type of reporting that we want to create. We know there might be some extension to that reporting in the future, but the requirements, as far as the business requirements go are reasonably simple. It’s not a very complex scenario. The data source is also quite simple. They use two versions of QuickBooks, online and desktop. We are talking about financial information. We’re talking about high level financial information, not individual transactions. Basically, we are looking at financial reporting. The source system and in the types of reporting, are both pretty modest. That makes us think that it’s possible that we can get away with using direct system of record reporting.
A lot of the things about direct system reporting as an option are not absolute. I did a project a few years ago for a large oil and gas company and they said they did not want to have a data solution. They wanted to use Power BI to connect to all data sources. We were able to accomplish the mission. It wasn’t a failure, but the connections in Power BI were extremely complex. We were trying to integrate data in Power BI. Power BI has a back-end layer called power queries. We’re trying to do all that work in Power BI directly. That workbook was so complicated even for me, and I was the person that built the thing.
It was very difficult to keep track of all the things that needed to happen to integrate this information on the fly. Without a doubt, we spent a lot more time getting this data set together in Power BI than we would’ve done if we could’ve just simply staged this information up front. The ability to maintain that reporting is going to be extremely difficult.
That’s not the situation we’re in here with this new client of ours. This again is very constrained requirements with a very limited number of simple data sources, just one data source of two different flavors.
First, I did a little work to get myself up to speed on the data source and what the capabilities were in Power BI. If the built in Power BI connector could connect to QuickBooks. Then we could have a direct connection to QuickBooks. I found that we could connect directly to online QuickBooks using a connector, but it can’t connect to QuickBooks desktop. This makes sense, QuickBooks online is online, Desktop is running locally. So, we couldn’t use that connector. Other problem is the connector would be a pull mechanism. In this particular case, we’re working with multiple companies, and we’d have to have pretty unrestricted access in order to be able to do what we want to do. So, the connector was not going to work for us.
Then we start looking at, if we take the data sources and instead of connecting directly to QuickBooks, what if we have QuickBooks provide us standard exports, in the form of flat files. Then we take Power BI and connect to those flat files. That’s the architecture that we’re going to be going with. The flat files become our system of record.
We’re not interfacing with QuickBooks. It’s the actual files that become our system of record in this case. What are the pros and cons of trying to do something like this? We’re going to take Power BI and connect directly to these files. So, we’re not going to have anything in between Power BI and these files. This is what we’re calling direct system of record reporting. What are the pros and cons of this? I kind of floundered a little bit whenever I was working with this client and thinking about this. I also learned a little bit about Power BI’s capabilities. In this case, we have many files with the same schema that we’re going to be picking up. And I didn’t realize that Power BI can traverse a folder and pick up all these files, kind of in one fail swoop, which it can do. Our practice manager, Audrey, corrected me on that.
That’s one of the reasons why at LeapFrogBI, we really love Power BI. It’s constantly evolving. It’s got all sorts of capabilities. And even though we use these tools every day, it’s quite difficult to keep up with all the functionality and understand the limitations of those functionalities.
So, what are the pros and cons of using direct system of record reporting? The number one pro I came up with is there’s no need to maintain a data solution. Meaning a separate data solution from what Power BI will provide for you. Power BI is going to provide a tabular model. You can import data into Power BI, just like most of these business intelligence front ends. Power BI has a data storage and data engine, mechanism. It is a tabular model that is implementing, but a pro of direct system market reporting is you don’t have a data solution. That means you don’t need to maintain whatever your data solution might have been. Otherwise, most often we are using SQL server in a virtual machine that we can turn on, turn off and make run very efficiently. But in this case, when you do have direct system market reporting, by definition, there’s nothing in between Power BI and your data source. So that’s great. That means one less thing to maintain. That means that there’s less chances of error.
That’s pro number two. There’s no chance of errors in your data solution. There are chances of errors in Power BI, but if we don’t have a data solution at all, then there’s no chance of errors manifesting in that data solution.
This kind of goes back to keeping it simple. We want to keep things as simple as possible for as long as we can. If we don’t have a data solution, then there’s no chance of that data solution itself interjecting errors in your overall solution.
There are some things about data solutions that can be tricky. The two that come to mind are delete detection and change bags. Sometimes you have data sources that are going to delete records in place, meaning they just simply disappear. They’re there one day and then the record disappears the next day, that’s called a hard delete. We must do things to make sure that we know if a record was deleted. The worst scenario would be if the record is deleted and there’s no evidence of its deletion. So, you have to figure that out if the data source has deleted the record.
The other thing that’s difficult with system of record reporting is change backs. What is change back? Basically, you have a value that starts out as A, then it goes to B and then it changes back to A. We’re persisting data in a specific way in our data solution and that must have special handling.
These are just a couple of examples, there’s many more. The point is there’s no chance of an error occurring in the data solution because there is no data solution. To continue that same line of thinking, if there’s no data solution, then we don’t have to pay for a data solution. The cost of the solution itself is zero because it doesn’t exist.
What is the cost of a data solution? That’s a more difficult question to answer but on the smallest end of the scale, you’re looking at somewhere around a few hundred dollars. We typically start out at around $500 per month for a data solution. So that might equate to somewhere around three or four hours of work. So just give you an idea, if not having that data solution causes you to spend three or four more hours of work in the front end, like Power BI, well, you didn’t save anything. If we are not going to use a data solution, then we don’t have that $500 cost. So, I’m going to list that as a pro. The cost could be a lot more than $500. Don’t misunderstand me. At the smallest end of the spectrum, which is typically kind of what we’re talking about when we’re considering system of record reporting, the cost is quite low.
The other thing that I’ve already mentioned somewhat is simplicity. If we don’t have that data solution in the middle between your system of record and Power BI, you’ve got a much simpler solution. Simple has a lot of advantages. We’ve already mentioned no errors in the data solution, but another one is you have less skills that you need to have in order to build a solution. You don’t necessarily need an ETL developer because you’re not building the data solution. You don’t need to know how to use several other tools. You don’t need to know some of the other modeling techniques that are required because all we’re doing is pulling data from a data source and pushing it into Power BI.
These are the top-of-mind pros that I came up with. And there’s more I’m sure, but these are, I think, some of the most significant ones when we talk about using direct system of record reporting.
Now, what are the cons? Everything sounds great so far, so why would you not want to use direct system of record reporting? First, you’re going to be limited to what’s in your data source. If you have a data source that overrides values, knowing the overridden value, plus the current value is important to you. Well, then you don’t have an option to use Power BI or any front-end solution without a data solution in the middle.
If we are talking about a customer record. If it’s important to you to know where a customer lives at the time that they purchased something from you so that you know that sales were originated from California on one cell and then another cell was originated in New York. If it’s important to know where sales originated from and you’re basing that on the customer’s address and the source system overrides the address, that’s something you can’t overcome with direct system of record reporting. You can only represent where the customer’s address is today. In this scenario, that’s all the source system stores. Everything else is overridden. So anytime you get into what we call city two or beyond type situations, you can’t use system of record reporting in order to achieve that.
Another issue with system of record reporting is data size. Power BI is quite powerful. You can push a lot of data into a tabular model using import mode. You might also use direct query, but most of the time you’re not going to be direct querying your data source unless it’s a extremely simple situation. If you start putting analytical loads on your system of record, you’re likely going to be diminishing the performance of your system of record.
The data size may become an issue if you’re using Power BI only because again, this depends on the licensing level. If you’re using Power BI pro, you’re limited to one gigabyte per import dataset. This is highly compressed. It might be better than it seems because you can get a lot of data in a one gigabyte dataset, but you are limited to that one gigabyte data set. Now you can spend more, you can buy a premium which is $5,000 per month. You can even use premium per user, which is just $20 per user per month as opposed to $10 per user monthly, at least at the time of this podcast recording. $10 per user per month is for Power BI Pro. There are options, but those options have costs. If you don’t have a data solution in between Power BI and your data source, well that data situation might become an issue. So, you have to really look at that.
Another issue, and there’s a little bit of personal preference, and it is a hard fact, you’ve only got Power BI in place. No data solution in the middle. You’re going to be doing all your transformations, all your business, and logic implementation in Power BI. That was an advantage a minute ago. It was a pros because it is very simple. It’s also in my opinion, a disadvantage because now you’re limited to just about two languages. You’ve got M which is the power query language. Then you have Dax, which is your front-end language, your query language. So, this gets kind of deep into the details here, but even though these two languages, M and Dax are very powerful, that’s all you have. There’s a lot of things that you can do a lot simpler in SQL than you can do in these two languages. So, you are going to be limited to only the tools that are available in Power BI or whatever front end you choose if you’re doing direct system of record reporting.
Why is that good or bad? It depends. It’s good because it’s simple, but it’s bad because often you are going to spend more time trying to find a way to make something work using only the tools available. When you know that you could do things a lot more efficiently, if you had a data solution. We talked about one of the pros being the savings of the data solution cost. This is an area where you can very quickly eat up that cost. A few hours of labor will quickly exceed the cost of a data solution. That’s important to consider.
Another con is all logic is going to be in the front end. If there’s no data solution in place, all your business logic is in the front-end tool, Power BI in this case. So what? Well, number one, if you ever decide to change your front-end tool, you have to start over. I don’t know why, but I really harp on that. I guess it’s because I’ve seen front-end tools come and go so many times through the years. Right now, Power BI is the leader. Years ago, when we were looking at Tableau and many front-ends, before that. It’s hard to see the future. If something overcomes Power BI, we’re going to want to migrate to that. It might happen and if it does, all your logic is in the front end, and you’re going to be recreating that logic in the next tool that you choose to use. In my opinion, that’s a con.
I prefer to have a data repository. That’s my preference to have a data repository where I implement all my business logic. Then I use the front end to present data visualization ends. There’s a lot more functionality in these tools today, but again, my preference is to use the back end to do the heavy lifting. If you change your front end, you’re going to be starting over.
I see another issue when we talk about Power BI specifically and some of the other front-ends. If we’re talking about pushing data into Power BI, it’s not like you’re pushing data into a data warehouse. You’re pushing data into a data set in Power BI’s terms. That data set is not going to be a building block. If you’re familiar with dimensional modeling, where we create data marts and those data marts work across the bus architecture to form a data warehouse. A data set is not going to be a building block to a data warehouse. That’s not how it works. A data set is essentially a cube. If you have a limitation, like we mentioned a bit ago on the size of that cube, maybe not even the hard limitation, but the performance limitations by adding too much data to a cube or growing a cube too large. Well, what’s going to happen is you’re going to need a new data set eventually. You’re going to need two data sets or three data sets or four data sets. That’s very common. We do this for several clients.
We have as many data sets as we need to serve the need. When you do that and you have your logic in the front end only, that means that all your logic, at least the amount is needed for each data set is going to be in those data sets. This means you’re not reusing your code. You’re going to put the same information, the same logic, the same work into multiple data sets. This may not be a big deal and it might be a huge deal because we need you to make a change to that logic. Now you’re doing it in multiple data sets. That means that we could have a difference between those data sets. Maybe there was a mistake in one. So, it’s never great to duplicate your logic or your code. You’re going to be in that situation in some scenarios.
Another thing I’d say about a con is integration can be tricky. It is possible to integrate many sources to Power BI, there’s no limitation. If you talk to the Power BI marketing team, they’re going to say you can connect to all of them and integrate on the fly. You must trust me on this one, it will be quite tricky. These front-end tools can do a lot, but it’s not a miracle tool. There are limitations, practical limitations.
Another thing to consider is how important is it that you have these reports available. What if the front end is offline? What if Power BI goes down? It is important that you can still have access to this information.You might think that’s not going to happen. Well, it happened last week, and it’s happened more times than you might believe. We run several clients, so we have clients using Power BI resources in several regions. And we have seen, I wouldn’t say many occasions, but on more than just one or two occasions, Power BI is not functioning properly. It is unusual to go to Power BI and it doesn’t respond at all. It could be that subscriptions aren’t being sent out. There are all sorts of things that happen. We deal with that for our clients. We get the systems back online as fast as we possibly can. I’m generally talking about Power BI online, not Power BI report server. If you’re using Power BI online and that resource is offline, you don’t have access to your information.
How costly will it be to your organization to not have the information you need to run your business, for one day? You must consider that cost. It may not be a big deal, in this case for this client that we’re onboarding right now. They’re primarily interested in monthly reporting. It isn’t very time sensitive. It’s not as if we’re embedding analytics within business processes that are ongoing. In this case, it makes pretty good sense to use direct system of record reporting. So that’s another thing to consider.
The final con that I should mention is that there is no seamless path to migrating to a more robust solution, which would include a data solution. If you’re putting all your business logic in the front end, if you later decide, well, that’s not going to get the job done, well, you’ve pretty much got to start over. You’ve got to build the data solution. And then you’ve got to essentially recreate all the assets in Power BI to utilize that data solution. So that’s a con to taking this approach.
If we go back to the question, can Power BI do it all, the answer is maybe. It depends on the requirements. It depends on your business requirements, the type of reporting, the complexity of the reporting, all the details related to the reporting that we’re trying to create, as well as your data source. How complex is your data source the ecosystem? Is it just one simple data source? Is that the situation or is it multiple data sources that are going to be difficult to integrate? Is it just one data source that’s very complicated?
If you think about an ERP system that’s highly normalized, it’s going to be difficult to see how you can achieve building more than just the simplest of reports and Power BI on top of a system like that. You might ask, what do we recommend? What we recommend is look at your situation and decide. If you think that you are in a situation where you’re trying to create a very simple set of reports on top of a very simple data source, and you don’t foresee that you’re going to have increased complexity anytime soon. I would say go for direct system of record reporting.
If that’s not the case, if you’re going to have more complicated reporting. If you want to build your business on a data-driven culture. If you’re going to do something like that, I would say, get a data solution in place. It’s not going to break the bank to build a data solution. It will be a little more expensive as compared to a very simple direct system of record reporting initially. Long run, it would be a lot cheaper to have a data solution in place, as opposed to trying to do everything in Power BI.
That’s my opinion on direct system of record reporting. I think it is amazing what some of these tools can do for us today when we talk about implementing solutions, but it really is important to just keep your eye on the ball. With these Reportopia solutions, it’s not really about tools, though tools are very important.
They’re extremely important, but it’s more about creating these processes and the culture that’s focused on continuous improvement. Let’s find the value opportunities. Let’s find those opportunities. Let’s figure out how we can capture those opportunities and then let’s implement that design.