I’ve been thinking back to the 90’s, when self-service business intelligence (BI) tools like Business Objects and Cognos first rolled out. Heck, like all overly enthusiastic software engineers, I even helped build one in my brief stint at Citigroup. At that time, my youthful self made two very quick predictions:
- Excel is dead
- Self-service data is going to quickly take over
OK, so I’m not quite Nostradamus. After Citigroup, I found myself a decade into a career as a BI consultant — execute some data engineering (ETL back then), slap a BI tool on top, train business users, rinse and repeat. We set up some “great stuff,” but gig after gig left one unsatisfying result:
Business users were not adopting the software and self-serving at the rate we expected.
A small percentage of “power users” (often on the tech side) would pick up the tools and create varying levels of dashboards and reports, but there was no widespread adoption on the business side. And there remained a heavy dependency on consultants.
BI vendor sales pitch: 100% self-service data democracy
My expectation: 60–80% adoption
Reality: <20% adoption, optimistically
After a while, these projects began to feel like an absolute failure (scratch that: a great opportunity to learn). What was to blame? The tools, the users, IT, the consultants? We’re circa 2010, and documentation of failed BI projects is starting to pile up. Not “failed” in that projects never produced meaningful results, but failed in that they rarely reached their full potential. Business domains still had a heavy dependency on IT for data. Clean, trustworthy data was not quickly available.
An interesting thing happens at this point in time: a data visualization product called Tableau starts gaining widespread adoption. It’s everywhere, and it’s the solution to data democracy. Then Power BI comes in to compete as a best-of-both-worlds data visualization and reporting tool. However, fast forward to today, and we still see the same thing: abysmal self-service adoption of BI tools. Clearly, I’m not alone.
The global BI adoption rate across all organizations is 26%. (360Suite 2021)
But wait, artificial intelligence (AI) to the rescue! If you’re paying attention, then you should know that you are the last person on Earth who isn’t using AI to solve all of your analytical needs.
OK, that’s enough cynical history about the misguided path of BI tools. Let me get straight to the point:
There never was and never will be a single magical solution to making data analytics accessible to the masses in a meaningful way.
What we can do, however, is take a step back and think about the problem from a “big picture,” non-tech perspective, and perhaps gain some valuable insights and strategies to move forward.
Maslow’s Hierarchy of Needs
Transport yourself back in time to high school, if you will, and try to recall that invigorating Psychology lecture on human motivation. If you didn’t cover this in school, or can’t remember, here’s a recap:
American psychologist Abraham Maslow proposed a theory of human motivation which argued that basic needs must be satisfied before a person can attain higher needs. As we move up the hierarchy, we move from lower-level, short-term needs like food and water to higher-level needs that last longer, are more complex, and are increasingly difficult to attain. The highest level is self-actualization and transcendence.
In a nutshell, you need a basic foundation before you move to the next level. Anyone in the data world will immediately recognize this and understand that this applies directly to achieving the “self-actualization of data,” which is clearly “self-service.” Come on, they both have “self,” it can’t be a coincidence. Let’s dig in.
Self-Service Hierarchy of Needs
We’re going to show the same image from the top because it’s not only an Insta-worthy beaut of a graphic, but also extremely helpful in our upcoming analysis. Like Maslow’s hierarchy, the Self-Service Data Analytics Hierarchy of Needs shows how each level supports and enables the level above it. Additionally, you’ll see that the higher you go, the more trust is both necessary and delivered.
One More Time, DJ:
At the base, Maslow’s physiological needs are obvious: food, water, shelter. Likewise, the base level of the Self-Service Hierarchy of Needs is obvious: data collection. You need to have collected the data. Let’s take this a step further and say that your foundation needs to collect raw data from disparate sources. In the modern data world, this is the Extract and Load portion of ELT (Extract, Load, Transform), and results in what we’ll call a Data Lake, for simplicity’s sake. Note the difference from the traditional data warehousing concept of ETL (Extract -> Transform -> Load), which is no longer relevant for many reasons that we’ll cover in another article.
The last point to make here is that any data analysis produced from this level will need to be done by higher-skill analysts/data scientists, and has a lower level of trust in that it hasn’t gone through the higher levels of the hierarchy. The analogy would be something like this: can you skip right to the top-level transcendence? Maybe, but at the end of the weekend when the party is over, it’s unlikely you’ll be able to sustain it.
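To make the Extract and Load idea concrete, here is a minimal sketch of landing raw records untouched. Everything in it is an assumption for illustration: the two source systems, the `raw_events` table, and the use of an in-memory SQLite database as a stand-in for a real data lake.

```python
import json
import sqlite3

# Hypothetical raw payloads from two source systems (invented for illustration).
CRM_ROWS = [{"id": 1, "name": "Acme Corp"}, {"id": 2, "name": "ACME Corporation"}]
SHOP_ROWS = [{"order_id": 10, "customer": "Acme Corp", "total": "120.50"}]


def load_raw(conn, source, rows):
    """Load records exactly as received: no cleaning, no typing, no joins."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events (source TEXT, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_events (source, payload) VALUES (?, ?)",
        [(source, json.dumps(row)) for row in rows],
    )
    conn.commit()


def count_by_source(conn):
    """Return the number of raw records landed per source system."""
    cur = conn.execute(
        "SELECT source, COUNT(*) FROM raw_events GROUP BY source ORDER BY source"
    )
    return dict(cur.fetchall())


lake = sqlite3.connect(":memory:")  # stand-in for a real data lake
load_raw(lake, "crm", CRM_ROWS)
load_raw(lake, "shop", SHOP_ROWS)
print(count_by_source(lake))
```

Notice that the duplicate-looking customer names and the string-typed order total are loaded as-is. That is the point of this level: the mess is preserved, which is why only higher-skill analysts can safely work here.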
The next level in Maslow’s hierarchy is safety, which includes things like security, social stability, predictability, and control. In our Self-Service Hierarchy, we achieve that predictability, stability, and control by cleaning and organizing our data as business models in our data warehouse. This often takes the form of multi-dimensional star schema models. With the raw source data from the lower Collection level, analysts might have to join lots of disparate tables together for customer data. In this level, that disparate data has been brought together in a common table, called the Customer Dimension. Also in this process, data is cleaned (e.g., duplicate or mismatched names for the same customer are reconciled) and helpful calculations are pre-computed (e.g., first order date), allowing for much simpler SQL.
At the end, we’ve established another level of safety and trust in the data, but also enabled a new group of analysts with self-service because they don’t need to know the business complexity of the underlying source data. Also very important to note: at this level we should see involvement from business domain owners. The transformation process is meant to support real business needs, so business owners must be involved. In the modern data world, we start to see “analytics engineers” as a critical role to support this hybrid need.
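As a rough illustration of what this transformation step might look like, the sketch below builds a tiny Customer Dimension with SQLite. The table and column names (`raw_customers`, `raw_orders`, `dim_customer`) and the sample rows are hypothetical, and the cleaning rule (trim and upper-case the name) is just one simple assumption about how mismatched names could be reconciled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical raw tables, as they might land from the Collection level.
    CREATE TABLE raw_customers (customer_id INTEGER, name TEXT);
    CREATE TABLE raw_orders (order_id INTEGER, customer_id INTEGER, order_date TEXT);

    INSERT INTO raw_customers VALUES
        (1, 'Acme Corp'), (1, ' acme corp '), (2, 'Globex');
    INSERT INTO raw_orders VALUES
        (10, 1, '2023-03-01'), (11, 1, '2023-01-15'), (12, 2, '2023-02-20');

    -- One clean row per customer, with first_order_date pre-computed.
    CREATE TABLE dim_customer AS
    SELECT
        c.customer_id,
        MIN(UPPER(TRIM(c.name))) AS customer_name,
        MIN(o.order_date)        AS first_order_date
    FROM raw_customers AS c
    LEFT JOIN raw_orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id;
    """
)

# The "much simpler SQL" an analyst now writes: no joins, no cleanup.
rows = conn.execute(
    "SELECT customer_id, customer_name, first_order_date "
    "FROM dim_customer ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

The analyst querying `dim_customer` never has to know that customer 1 appeared twice in the source with different spellings, or that the first order date had to be derived from the orders table.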
Maslow’s third level is love and belonging through relationships and connectedness. The correlation with our Self-Service Hierarchy is uncanny, as the semantic layer is literally where you set up your relationships (table joins), and is what brings everything together. I could go on and on about semantic layers, and do in the post linked here:
I’ll argue that this level is the most important for enabling true self-service, and that business domain owners need to be heavily involved. The “universal semantic layer” can provide a single source of truth that powers self-service analytics through data literacy, simplicity, and trust. Analysts can rely on business-friendly field and entity names, data catalog descriptions, and perhaps most importantly, they do not need to know how tables join to each other (or at least how to write the SQL). We also have access to critical things like data lineage (trace a field back to the source table), synonyms (you call it “sales”, I call it “revenue”), and data freshness (when was the data last refreshed).
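To show how business-friendly names, synonyms, and pre-defined joins fit together, here is a toy sketch of a semantic model that generates SQL. It is not how any particular semantic layer product works; the field names, the `fact_orders` and `dim_customer` tables, and the `resolve` and `build_query` helpers are all hypothetical.

```python
# Hypothetical semantic model: every name below is invented for illustration.
BASE_TABLE = "fact_orders"

# Business-friendly field names mapped to their physical definitions.
FIELDS = {
    "revenue": {"table": "fact_orders", "sql": "SUM(fact_orders.amount)"},
    "customer name": {"table": "dim_customer", "sql": "dim_customer.customer_name"},
}

# You call it "sales", I call it "revenue".
SYNONYMS = {"sales": "revenue", "client": "customer name"}

# Relationships are defined once, here, rather than in every query.
JOINS = {"dim_customer": "dim_customer.customer_id = fact_orders.customer_id"}


def resolve(term):
    """Map a business term or synonym to its canonical field name."""
    key = term.strip().lower()
    key = SYNONYMS.get(key, key)
    if key not in FIELDS:
        raise KeyError(f"Unknown business term: {term!r}")
    return key


def build_query(measure, dimension):
    """Generate SQL for one measure grouped by one dimension."""
    m = FIELDS[resolve(measure)]
    d = FIELDS[resolve(dimension)]
    sql = f"SELECT {d['sql']}, {m['sql']} FROM {BASE_TABLE}"
    # The analyst never writes the join; the model supplies it.
    for table in sorted({m["table"], d["table"]} - {BASE_TABLE}):
        sql += f" JOIN {table} ON {JOINS[table]}"
    return sql + f" GROUP BY {d['sql']}"


print(build_query("Sales", "client"))
```

Whether someone asks for “sales by client” or “revenue by customer name”, they get the same query and therefore the same number, which is where the trust comes from.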
One important thing to note here, especially for you historians who might say “Business Objects had this in the 90’s.” We’ve not yet reached the “Analysis layer” (BI tool level). For many reasons, which are elaborated upon in the post linked above (“Semantic-free is the future of Business Intelligence”), it is critical that you do not stuff your business logic semantic layer into a BI tool. The “semantic layer” level in our Self-Service Hierarchy should support the next layer, not be it.
At this level, now we’re talking BI tools, reports, dashboards, and what most people think about when we talk about self-service analytics. If you found the semantic-layer correlation to Maslow’s hierarchy as uncanny as I did, then hold on to your seats for Maslow’s self-esteem level. Here, he breaks needs into “lower” version needs like status, recognition, fame, prestige, and attention, as well as “higher” version needs like strength, competence, mastery, self-confidence, independence, and freedom. Hello “data heroes,” “Zen Masters,” and gurus.
At this level in our Self-Service Hierarchy, we start to see business domain ownership and self-service analytics, with a focus on two of the four types of analytics:
1. Descriptive — reports and dashboards that show what happened
2. Diagnostic — analysis that shows why that happened
You’re building your dashboards from a clean data warehouse with a well-modeled transformation layer and universal semantic layer on top, right?
Paradoxically, it might be the BI tools that we thought were enabling self-service that were actually doing the biggest disservice. We know that Tableau (an incredible viz tool with enormous value, to be sure) gained early traction in bypassing slow-moving IT and selling directly to the business, and continues to exploit this divide. Far too many implementations involve exporting data from hand-written SQL on source databases or static BI reports, and importing that .CSV into Tableau. While you can choose to eat healthy at this all-you-can-eat buffet, the reality is often quite different. The mess that ensues can often bog down businesses so much that they will never reach the next levels, so they continue to produce only descriptive dashboards about things that happened.
Self-Actualization and Transcendence
The highest level of Maslow’s hierarchy is about self-fulfillment, personal growth, and reaching your full potential. As in life, there is no pinnacle in the data world that you reach and say “that’s it, all done.” It’s a constant work-in-progress, very difficult to attain, and can seemingly go on forever. At this level, we move beyond the basic descriptive and diagnostic analytics, and have established a very high level of trust in our data and process. This enables the next two types of analytics:
3. Predictive — figuring out what will happen next
4. Prescriptive — based on predictions, recommend the optimal path forward
At this point, we have a strong foundation in all our layers of data and can start to make meaningful strides towards leveraging artificial intelligence, automating business processes, and tackling more advanced use-cases.
Like many things academic, we’ve established a framework and ideals to adhere to, but are left asking “OK, how do we achieve this?” How do we build a data-driven culture that supports self-service analytics?
Unlike Psych 101, I will not leave you hanging. Stay tuned for my next post about how to build a data-driven organization that enables self-service by focusing on people, process, and tools through an agile approach that takes us from buy-in to measuring success and continuous improvement. Coming soon…
I’d love to hear your thoughts, or reach out to Andrew Taft