Problems of Business Intelligence
On the other hand, BI has some substantial problems with speed and transparency. I have tried to summarize the issues I have run into or heard people describe over my career as a BI engineer and specialist working with Oracle and SQL Server:
- It takes too long to integrate additional sources, and BI engineers are overloaded with work.
- That’s one reason why data silos and analyses are created in every department with disconnected Excel spreadsheets, which are consistently out-of-date and require significant manipulation and reconciliation.
- The lack of speed is a significant disadvantage and can be mitigated with data warehouse automation. I summarized in more detail what automation can mitigate in a Quora post and in my series about data warehouse automation.
- Transparency is a problem for users other than BI engineers. Only BI engineers can see inside the transformation logic, which is mostly hidden in proprietary ETL tools.
- Business people or managers are dependent on BI engineers. There is no easy way to access the ETL or get any real-time data.
- The BI department makes it more complicated than necessary. The impression was always that it shouldn’t be this complex. For us, all the transformations, business logic, cleaning, star schema transformation, performance tuning, working with big data, and so on are clear. But for non-BI people, this is hard to understand.
- Difficulty handling (semi-)unstructured data formats like JSON, images, audio, video, e-mails, documents, etc.
- This comes down to ETL (transform before loading), which is the traditional data warehouse approach, versus ELT (load the data into storage first, and only afterwards decide what to do with it), also called schema-on-write vs. schema-on-read. ELT gives you a significant advantage in speed, and it is the approach that more modern data lakes and NoSQL databases take. If you want to know more about the difference between a data warehouse and a data lake (ETL vs. ELT), I recommend my earlier post about it.
- Another point is that slice-and-dice is done on aggregated data, which does not work well for unstructured data like the formats above.
- Furthermore, this unstructured data stretches the nightly ETL jobs even more, as it takes longer to process.
- Data is generally available only once a day (traditionally). We get everything in real-time in our private lives, and everyone demands the same from modern BI systems.
This list is not complete by any means. Also, any of these points can be mitigated with special solutions (e.g., cloud solutions such as SnowflakeDB with its Variant data type for semi-structured data) or different approaches (data vault for fast integration). However, the stereotypes are deeply ingrained and, from what I hear, still around.
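To make the schema-on-write (ETL) vs. schema-on-read (ELT) distinction from the list above a bit more concrete, here is a minimal Python sketch. The sample events and the function names (`etl_load`, `elt_load`, `total_amount_on_read`, `channels_on_read`) are hypothetical and exist only for illustration; this is not how any particular data warehouse or data lake implements it.

```python
import json

# Hypothetical raw events as they might arrive from a source system.
# The second one carries an extra nested field, the third lacks "amount".
RAW_EVENTS = [
    '{"order_id": 1, "amount": "19.90", "customer": "alice"}',
    '{"order_id": 2, "amount": "5.00", "customer": "bob", "meta": {"channel": "web"}}',
    '{"order_id": 3, "customer": "carol"}',
]


def etl_load(raw_events):
    """Schema-on-write: transform and validate first, load only rows that fit."""
    table, rejected = [], []
    for line in raw_events:
        record = json.loads(line)
        try:
            # The fixed target schema is enforced before anything is stored.
            row = (
                int(record["order_id"]),
                float(record["amount"]),
                str(record["customer"]),
            )
        except KeyError:
            # Records that do not match the schema never reach the table.
            rejected.append(line)
            continue
        table.append(row)
    return table, rejected


def elt_load(raw_events):
    """Schema-on-read: store everything as-is, decide on structure later."""
    return list(raw_events)


def total_amount_on_read(raw_store):
    """Apply a schema only at query time, tolerating missing fields."""
    total = 0.0
    for line in raw_store:
        record = json.loads(line)
        total += float(record.get("amount", 0))
    return total


def channels_on_read(raw_store):
    """A later question about a field nobody modeled up front."""
    return [json.loads(line).get("meta", {}).get("channel") for line in raw_store]


if __name__ == "__main__":
    table, rejected = etl_load(RAW_EVENTS)
    print("ETL table:", table)
    print("ETL rejected:", len(rejected))

    raw_store = elt_load(RAW_EVENTS)
    print("ELT stored:", len(raw_store))
    print("ELT total amount:", total_amount_on_read(raw_store))
    print("ELT channels:", channels_on_read(raw_store))
```

In this sketch the ETL path drops the nested `meta` field and rejects the incomplete record at load time, while the ELT path keeps all three raw events and can still answer a new question (the sales channel) later. The trade-off is that every reader of the raw store has to handle missing or unexpected fields on its own.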
Read more on The Goal of Business Intelligence and Business Intelligence Meets Data Engineering.
Origin:
Business Intelligence Meets Data Engineering