31 July 2020

This week is not about the warehouses we generally talk about at Fuzzy LogX, but rather a key data infrastructure component that will help an organisation gain additional value and insight from their data.

Given the data deluge around us, it’s quite common to see multiple systems and sources producing enormous amounts of data, which unsurprisingly may not always agree with each other. With the right business relationships, organisations can also now make use of supplier and vendor data as well, which again may produce different results to their own. When the time comes to reporting and especially analysing data from a myriad of sources, the problems and disadvantages of not having a single source of truth are easily laid bare and can even result in confusing and widely different analysis results.

Source of Truth

First, what is the data source of truth? This can be a set of data tables or databases that each hold specific data sets, for example one for incoming sales, another for outbound orders, another for product information etc. At any given time, this information will be contained in transactional databases that record all data lines related to business operations. However, transactional databases are usually optimised for performance to enable reliable and quick access for vital transactional systems. Also, transactional databases will usually be required to delete data to maintain system performance.

Data Warehouse

The ideal solution is an unique centralised database that the entire organisation can easily refer to for all their reporting and analytics needs, commonly referred to as a Data Warehouse (DW). The DW will not be a source for up-to-date transactional data and will generally be updated from transactional data sources on a fixed time schedule, for example in the night once transactions are at a minimum, putting the least load on the transactional databases. Depending on business requirements, the DW may consist of a mix of transactional and aggregated data to prevent the DW from being too large. Being de-coupled from transactional databases also allows all reporting and analytics to be easily executed from the DW without the risk of crippling day-to-day operations by reducing the performance of the transactional databases.

Writing data to the DW isn’t as simple as a copy-paste and will usually be passed through an Extract-Transform-Load (ETL) process that will clean the data, transform and apply various business rules to the data and then load it to the DW. Some of the transformations of the data will include combining multiple data sources together, for example, combining outbound order information together with dispatch freight data received from the carrier to add a dimension of freight information to outbound order data, all which add value and insight that can be obtained from data.

One key advantage of a DW is that it allows the storage of months and even years of data. A very common data issue we come across is Product Information data not including SKUs that were active several months ago due to the product records being deleted. Ideal DW design will allow such data to be retained and minimise the deletion of data. DWs can also typically record changing information using a concept called Slowly Changing Dimensions. For example, if a vendor regularly changes the pack size of the SKU and the business prefers to refer to all variations by the same SKU ID, the DW can include the active time period for each pack size allowing an analysis conducted even several years later to be aware of the varying pack sizes of the SKU.

The biggest advantage will always be the fact that all reporting and analytics can now refer to the one source which contains relatively clean and processed. This improves data quality and availability and also enhances the analytics process. Analysts and data scientists can now spend more time extracting value and information from the data as opposed to wasting time trying to source the data, stitch data together and clean it. With this also comes multiple business benefits such as improving the decision-making process, increasing forecast accuracy and most of all saves time.

However, keep in mind that the DW will not magically provide you with all the right answers. The design and structure of the DW and associated systems will greatly influence the value and insight you can get from your data. As is the case most often, adequate documentation, processes, systems and the right people will be able to extract value and information from any data source, be it transactional or a data warehouse. If you want to gain more insight and create value from your distribution centre data, feel to contact us and we’ll let SAM do the rest.