Problems of Business Intelligence
On the other hand, BI has some substantial problems with speed and transparency. I have tried to summarise the issues I encountered, or heard people describe, over my career as a BI engineer and specialist working with Oracle and SQL Server:
- It takes too long to integrate additional sources, and BI engineers are overloaded with work.
- That’s one reason why data silos and analyses are created in every department with disconnected Excel spreadsheets, which are consistently out of date and require significant manipulation and reconciliation.
- The lack of speed is a significant disadvantage and can be mitigated with data warehouse automation. I summarised what exactly can be mitigated with automation in a Quora post and in my series about data warehouse automation.
- Transparency is a problem for users other than BI engineers. Only the engineers can see inside the transformation logic, which is mostly hidden in proprietary ETL tools.
- Business people and managers are dependent on BI engineers. There is no easy way to access the ETL or to get any real-time data.
- The BI department makes it more complicated than it needs to be. The impression was always that it shouldn’t be this complex. For us, the complexity is clear given all the transformations, business logic, cleaning, star schema transformation, performance tuning, working with big data, and the list goes on and on. But for non-BI’lers, this is hard to understand.
- Difficulty handling (semi-)unstructured data formats like JSON, images, audio, video, e-mails, documents, etc.
- This comes down to ETL versus ELT. ETL transforms the data before loading it, which is the traditional data warehouse approach; ELT first loads the data into storage and only afterwards decides what to do with it. The two are also called schema on write and schema on read. ELT gives you a significant advantage in speed, and it is what more modern data lakes or NoSQL databases do. If you want to know more about the difference between data warehouse and data lake (ETL vs ELT), I recommend my earlier post about it.
- Another point is that slice-and-dice is done on aggregated data, which does not work well for unstructured data like the formats mentioned above.
- On top of that, unstructured data stretches the nightly ETL jobs even more, as it takes longer to process.
- General data availability is only once a day (traditionally). We get everything in real time in our private lives, so everyone demands the same from modern BI systems.
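To make the schema-on-write vs schema-on-read point from the list more tangible, here is a minimal sketch in plain Python. Everything in it is hypothetical and only for illustration: the sample events, the field names (`order_id`, `amount`, `customer`, `coupon`) and the function names are my assumptions, not part of any real ETL tool or warehouse.

```python
import json

# Hypothetical raw events as they might arrive from a source system.
RAW_EVENTS = [
    '{"order_id": 1, "amount": "19.90", "customer": {"country": "CH"}}',
    '{"order_id": 2, "amount": "5.00", "customer": {"country": "DE"}, "coupon": "SPRING"}',
    '{"order_id": 3, "note": "amount missing"}',
]


def etl_load(raw_events):
    """Schema on write (ETL): transform and validate BEFORE loading.

    Only records that fit the fixed target schema reach the table;
    fields outside the schema (e.g. "coupon") are dropped.
    """
    table, rejected = [], []
    for line in raw_events:
        record = json.loads(line)
        try:
            row = {
                "order_id": int(record["order_id"]),
                "amount": float(record["amount"]),
                "country": record["customer"]["country"],
            }
        except KeyError:
            # The record does not match the schema, so it never gets loaded.
            rejected.append(record)
            continue
        table.append(row)
    return table, rejected


def elt_load(raw_events):
    """Schema on read (ELT): load the raw data as-is, decide later."""
    return list(raw_events)


def revenue_by_country(raw_store):
    """Apply a schema only at query time, skipping records that lack it."""
    totals = {}
    for line in raw_store:
        record = json.loads(line)
        amount = record.get("amount")
        country = record.get("customer", {}).get("country")
        if amount is None or country is None:
            continue
        totals[country] = totals.get(country, 0.0) + float(amount)
    return totals


def coupons_used(raw_store):
    """A question nobody planned for: still answerable from the raw data."""
    records = (json.loads(line) for line in raw_store)
    return [record["coupon"] for record in records if "coupon" in record]


if __name__ == "__main__":
    table, rejected = etl_load(RAW_EVENTS)
    print("ETL table:", table)
    print("ETL rejected:", rejected)

    raw_store = elt_load(RAW_EVENTS)
    print("ELT revenue:", revenue_by_country(raw_store))
    print("ELT coupons:", coupons_used(raw_store))
```

In the ETL path the `coupon` field is gone once the data is loaded, so a later question about coupons means changing the pipeline and reloading. In the ELT path the load is trivial and fast, and the cost of interpreting the data moves to each query instead.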
This list is not complete by any means. Also, any of these points can be mitigated with specialised solutions (e.g. cloud solutions such as Snowflake with its VARIANT data type for semi-structured data) or different approaches (e.g. data vault for fast integration). However, the stereotypes are deeply entrenched and, from what I hear, still around.
Read more on Business Intelligence Meets Data Engineering.
Origin:
Business Intelligence Meets Data Engineering