Planning Your ETL in Qlik

Qlik has a powerful script engine which allows you to bring information in from a variety of sources. It’s included within Qlik and because you load all the data into memory, once saved it is effectively a portable business intelligence (BI) solution. Of course in the real world the file would remain on a server where all the decision makers can access the information held within.

Naturally, one would assume each dashboard would bring in the data needed from its sources and this can be the case. However, when you’re planning a larger scale strategic implementation this becomes less efficient.

Qlik has a simple yet effective option to allow you to develop these large suites of BI dashboards. As mentioned, Qlik stores the information into memory and also has the ability to export data tables in a special format which is already optimised in the standard Qlik file format (QVD). This optimisation reduces the data size without the need for compression. A Qlik dashboard file can now be seen as something different; it becomes a script which moves a set of data from one location to another and perhaps transforming it as well.

For large deployments we would recommend creating three layers (three Qlik dashboard files):

  1. Extract
  2. Transform
  3. Dashboard Load

The Extract layer would take information from the source systems and place it straight into a QVD file. Here, the latest information from the source system can be added to the data retrieved previously – also known as incremental loading. This method is far more efficient than taking the whole history each time you re-load. It’s important at the extract stage that no data is changed, combined or additional fields added. This way you can easily test that your saved data matches the source file.

The next step is Transformation.  Here we read the previously saved data and get it ready for consumption for a dashboard. As before we would do this incrementally for large tables that are added to over time. Transformation would include; joining tables together, adding flags and calculations.

Finally the Dashboard Loads the transformed data it requires. As the data is fully prepared in the transformation stage, it can be loaded as ‘optimised’ which means it happens very quickly.

When storing QVDs in the Extract and Transform layers it’s good practice to split large tables into manageable chunks. These tables typically contain your transactional data which grows over time (commonly referred to as FACT tables). These bite sized chucks depend on the data volume although we would typically recommend monthly files – this makes archiving data in the future easier. Also, if your dashboard only requires information from the last twelve months, you don’t need to load a QVD file which potentially could contain many years of data, which of course would slow down your process.

The benefits of this type of data load structure include:

  • Your source systems are only accessed once, reducing the demand on their resources
  • You can validate the accuracy of each stage of the process
  • QVD files are built up over time and are optimised
  • Several dashboards can re-use the same data which is both efficient and consistent

If you have any questions or you’d like more information on how Qlik business intelligence can help your organisation, please don’t hesitate to contact us on +44 (0)330 363 9900 today.

Thanks for reading!

Richard Pearce, Senior Consultant.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s