There never used to be a need for distinct roles in data. When the data guys inducted me into data in 2009, there was not much buzz about business intelligence (BI). To affirm this, all they told me was to learn a few SQL scripts and I would be ready to go. They trusted me with databases, knowing they had backups in the data warehouse. It was only in 2017 that I started to notice the importance of having different data roles. Maybe this was driven by the fact that I had moved from the unsecured lending operations team to the Group Technology (GT) Finance department, working closely with the data-driven intelligence (DDI) team. The DDI team was large because it consolidated all the data people from across the entire bank. The team was divided into roles and arranged itself into squads so that each could have a sharper focus.

Each squad comprised every BI role. This means each squad had its own data engineers, data analysts, business/process analysts, and Power BI visualisation specialists. The squads were arranged this way to enable speciality and focus on the data. Credit card data had its own squad, and so did overdrafts, insurance, personal loans, and so forth. More squads would be added depending on the team's growth and business need. Having discussed the overall composition of the BI department, it is also important to understand what is expected from data engineers, data analysts, and visualisation specialists.
Data Engineers
Data engineers source, organise, and move data between systems. They can also be involved in decisions about data storage and infrastructure. Their key competencies include (1) Extract, Transform, and Load (ETL), to automate data feeds or migrate data from various systems (a minimal sketch is shown below), (2) data warehouse (DW) creation, to store all the data needed by the business in one place/server, optimised for analysis, so that data users have less need to build ETL themselves, (3) data governance, which is security-focused and ensures data integrity, and (4) data system knowledge, especially mastering data structures and other ways of helping analysts avoid data issues.
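As a rough illustration of point (1), the sketch below extracts a CSV export, applies a small transformation, and loads the result into a SQLite table. The file name, column names, and table name are all hypothetical; a real pipeline would target the bank's warehouse rather than a local SQLite file.

import sqlite3
import pandas as pd

# Extract: read a (hypothetical) daily export of loan balances.
raw = pd.read_csv("loan_balances_2023-07-31.csv")

# Transform: standardise column names and derive a simple arrears flag
# (assumes a days_past_due column exists in the export).
raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
raw["in_arrears"] = raw["days_past_due"] > 0

# Load: append the cleaned rows into a warehouse-style table.
with sqlite3.connect("warehouse.db") as conn:
    raw.to_sql("fact_loan_balances", conn, if_exists="append", index=False)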
Data engineers also know the various types of database systems, some optimised for machine transaction processing and others for human analysis. There are four main types, namely (1) Online Transaction Processing (OLTP), (2) Data Lake, (3) Data Warehouse (DW), and (4) Data Mart. OLTP is a type of data processing that consists of executing several transactions occurring concurrently: online banking, shopping, order entry, or sending text messages, for example. It is optimised to enter, modify, delete, and read data. A data lake is a centralised repository designed to store, process, and secure large amounts of structured (tables and lists), semi-structured, and unstructured data (emails and phone conversations). It can store data in its native format and process any variety of it, largely ignoring size limits.

A DW is a type of data management system designed to enable and support BI activities, especially analytics. It combines data from multiple sources, is solely intended for queries and analysis, often contains large amounts of historical data, and is optimised for analysis and human interaction. A data mart is a simple form of data warehouse focused on a single subject or line of business, such as sales, finance, or marketing. Given their focus, data marts draw data from fewer sources than data warehouses, which makes them ideal as small warehouses serving a specific project (squad) or team.
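To make the distinction concrete, the sketch below contrasts an OLTP-style operation (recording a single transaction as it happens) with a DW/data-mart-style query (aggregating history for analysis). The table and column names are hypothetical, and SQLite stands in for both kinds of system purely for illustration.

import sqlite3

conn = sqlite3.connect("example.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS transactions ("
    "account_id INTEGER, txn_date TEXT, amount REAL, product TEXT)"
)

# OLTP-style: many small writes, one transaction at a time.
conn.execute(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    (1001, "2023-07-31", -250.00, "credit_card"),
)
conn.commit()

# DW/data-mart-style: read lots of history at once for analysis.
for row in conn.execute(
    "SELECT product, COUNT(*) AS txn_count, SUM(amount) AS total "
    "FROM transactions GROUP BY product"
):
    print(row)
conn.close()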
The ideal programming language for this work is Structured Query Language (SQL). SQL is ideal for querying and fetching data from databases, and it is easy to read and write. There are various online resources catering for novice programmers who want to become proficient in SQL. My personal favourite has always been SQL Tutorial (w3schools.com). I found W3Schools.com more elaborate, especially considering that databases and the SQL language were only covered in two chapters of the software engineering book. The main objective might have been for us, as novice programmers, to seek help online rather than relying on the textbook for everything. Knowing SQL might seem a bit primitive to today's analysts, but an understanding of SQL will set you apart from dashboard-focused BI analysts, especially when considering all the various tools that assist with analysis.
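The kind of query worth being comfortable with is sketched below: joining a customer table to a transactions table and summarising by segment. The tables and columns are hypothetical, and the same SQL would run against most database engines; sqlite3 is only used here so the snippet is self-contained.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, segment TEXT);
    CREATE TABLE transactions (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'retail'), (2, 'business');
    INSERT INTO transactions VALUES (1, 120.0), (1, 80.0), (2, 500.0);
""")

# A typical analyst query: join, aggregate, and rank.
query = """
    SELECT c.segment,
           COUNT(*)      AS txn_count,
           SUM(t.amount) AS total_spend
    FROM transactions t
    JOIN customers c ON c.customer_id = t.customer_id
    GROUP BY c.segment
    ORDER BY total_spend DESC;
"""
for row in conn.execute(query):
    print(row)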
The common tools used by data engineers include tools for data storage, cloud services, coding languages, big data manipulation, and live-stream data. Data storage options include Microsoft SQL Server Management Studio, Oracle, SAP, SAS, and other storage tools. Following the increasing cost of physical data storage and the latency of on-premise queries, online storage solutions have improved, and providers such as Google Cloud Storage (GCS), Microsoft Azure, and Amazon Web Services (AWS) have filled this need. Data engineers are welcome to use any of the programming/coding languages such as SQL, Scala, or Python with either local or online data storage providers. Some platforms specialise in big data manipulation, which differs a little from plain SQL, Python, and Scala; these include Hadoop, Spark, and Databricks for big data that is not live-streamed. Live-streaming data can be handled using Kafka, Pub/Sub, Kinesis Data Streams, or Apache Beam. I am not much of an expert in big data manipulation and live-stream data, except for completing one assignment at university; I have related more to data analysis in my analysis roles.
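For completeness, a minimal PySpark sketch of the kind of batch aggregation these big data platforms are used for is shown below. The file path and column names are hypothetical, and this assumes pyspark is installed.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("batch_aggregation").getOrCreate()

# Read a (hypothetical) large CSV extract of card transactions.
df = spark.read.csv("card_transactions.csv", header=True, inferSchema=True)

# Aggregate spend per merchant category across the whole dataset.
summary = (
    df.groupBy("merchant_category")
      .agg(F.count("*").alias("txn_count"), F.sum("amount").alias("total_spend"))
)
summary.show()
spark.stop()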

Data Analysts
Data analysts are responsible for building data models and metrics, which facilitate analysis and visualisation. Like data engineers, they are specialists in (1) ETL, to import data from Excel files, CSV files, and databases, transform it into the desired format, and load it into models, (2) calculating metrics, by writing formulae that measure business performance, which requires a good understanding of the data, (3) creating data models, which requires them to combine tables that link related data together and set them to refresh automatically, and (4) documenting data models for other analysts, as well as all the metric definitions, using a data dictionary (a minimal sketch of points (1) to (3) follows below). They can make use of various tools to achieve these tasks.
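A rough sketch of what that looks like in practice, using pandas with hypothetical file and column names: two extracts are imported, linked into a small model, and a business metric is calculated from them.

import pandas as pd

# (1) ETL: import two extracts (file and column names are hypothetical).
accounts = pd.read_csv("accounts.csv")                  # account_id, product, open_date
balances = pd.read_csv("month_end_balances.csv")        # account_id, month, balance, in_arrears

# (3) Data model: link the related tables on their shared key.
model = balances.merge(accounts, on="account_id", how="left")

# (2) Metric: arrears rate per product for the latest month.
latest = model[model["month"] == model["month"].max()]
arrears_rate = latest.groupby("product")["in_arrears"].mean().round(3)
print(arrears_rate)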
The common tools used by data analysts include traditional ones like Microsoft Excel, VBA, and Access. Other data transformation tools include SQL Server Management Studio and Power Query (built into Microsoft Excel 2016 and later). Besides these, they can make use of Tableau, Power Pivot, or Power BI for advanced analysis. Other popular tools include Qlik, R, and Python. Data analysts can team up with data visualisation specialists so that some focus on data modelling while others work on building dashboards and interactive analysis.

Data Visualisation Specialists
Data visualisation specialists focus on turning clean data into visuals that help communicate a message or answer a specific question. Their key competencies include (1) creating data visuals that highlight key data points or trends and focus on one or a few metrics (a minimal sketch appears below), (2) creating dashboards by combining multiple visuals into one to tell a story with data and visuals, enabling others to see the detail and identify root causes, (3) communicating results, presenting to audiences in management meetings, and managing audience permissions (e.g. row-level security), and (4) gathering audience feedback about ways to make a report clearer, ensure consistency, improve definitions, and so on. As a Senior Finance Business Analyst, I used to conduct monthly sessions with the finance team where I shared knowledge and they provided feedback. I quickly realised how many Finance Officers (FOs) benefit from knowing their problems are already solved by a dashboard.
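As a rough sketch of point (1), the snippet below plots a single hypothetical metric (monthly arrears rate) and highlights only the latest point so the audience's eye lands on the message; in practice this would typically be built in Power BI or Tableau rather than matplotlib.

import matplotlib.pyplot as plt

# Hypothetical metric: monthly arrears rate (%).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
arrears_rate = [2.1, 2.3, 2.2, 2.6, 2.9, 3.4]

# Grey for context, a strong colour only for the point being discussed.
colours = ["lightgrey"] * (len(months) - 1) + ["tab:red"]
plt.bar(months, arrears_rate, color=colours)
plt.title("Arrears rate rose to 3.4% in June")   # the message, not just a label
plt.ylabel("Arrears rate (%)")
plt.show()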
There are many tools in the market specially designed to help with visualisation tasks. The industry leaders include Tableau, Power BI, and Excel. Other tools are already making waves, such as MicroStrategy, Qlik, Looker, Sisense, Dundas BI, Python, Adaptive Insights, Plotly, or Analysis. The choice of a tool depends on how the business leaders (the management committee, or Manco) want to see the data, not on what is easier for the data team. Business leaders are the main audience of BI reports and dashboards, and they are also key to guiding the data and analysis strategy. Data visualisation should assist with decision-making and allow Manco to maintain or change the course of action.

Conclusion
The need to have various roles within the data department became prevalent in recent years. There was little distinction between roles when I joined in 2009, and I soon realised that it becomes tougher for one person to do all the tasks in a big team. This led the BI team to define roles such as data engineers, data analysts, and visualisation specialists. There are many software tools in the market specially designed to help visualisation specialists present data as insights; the industry leaders include Tableau, Power BI, and Excel. The choice of a tool depends on how Manco wants to see the data, not on what is easier for the data team. Business leaders are the main audience of BI reports and dashboards, and Manco is also key to guiding the data and analysis strategy. A great relationship between the data team and Manco allows the entire organisation to maintain or change its course of action. Being a key person in the Group Technology Finance department and working so closely with Finance Officers eliminated much of the red tape, and thus enabled better decision-making.