Fact Table vs Dimension Table: Difference Between Fact Table and Dimension Table

What is a Fact Table?

In the simplest terms, a fact table is the central table in a data warehouse or business intelligence schema. It stores quantitative data such as measurements, metrics, or facts about a particular business process or event. A fact table typically has two kinds of columns: foreign keys, which link to the dimension tables, and measures, which hold the values to be analysed.

Fact tables matter because they store granular data that can be analysed across different dimensions, which is what makes analysis and reporting of business performance possible. Common examples of fact tables include inventory levels, website traffic metrics, financial data, and sales transaction data.

Characteristics of a Fact Table

Below are some of the key characteristics of fact tables in data warehousing and business intelligence systems.

  • Foreign Keys – Fact tables contain foreign keys that link to the primary keys of the dimension tables.
  • Outrigger Dimensions – An outrigger is a dimension table that is referenced from another dimension table rather than directly from the fact table.
  • Additive Measures – A fact table usually contains additive measures, meaning they can be summed across any of its dimensions. Other aggregates, such as averages, are then derived from those sums.
  • Sparse Data – Fact tables are often sparse: some records contain null measurements, indicating that no value was recorded for them.
  • Fact Table Grain – This refers to the depth or level of detail of the information in the fact table. As a rule of thumb, a fact table should be designed at the most detailed (atomic) level available, because detailed rows can always be rolled up, whereas summarised rows cannot be broken down again.
  • Degenerate Dimensions – These are dimension attributes, such as an order or invoice number, that are stored directly in the fact table because they have no further attributes that would justify a separate dimension table. They are identifiers rather than measures, so they are not summed.

By adhering to these characteristics, fact tables can efficiently store the quantitative data required for analytical processing, which makes them an important tool for data warehousing.
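As a rough illustration of these characteristics, the sketch below models a few fact rows in plain Python. The column names (`product_key`, `customer_key`, `order_id`, `quantity`) and the helper `sum_measure` are hypothetical and chosen only for this example: the two keys stand in for foreign keys, `order_id` plays the role of a degenerate dimension, and `quantity` is an additive measure.

```python
from collections import defaultdict

# Each row holds foreign keys, a degenerate dimension, and one measure.
# All names and values are illustrative, not taken from a real schema.
fact_sales = [
    {"product_key": 101, "customer_key": 1, "order_id": "A-1", "quantity": 2},
    {"product_key": 101, "customer_key": 2, "order_id": "A-2", "quantity": 1},
    {"product_key": 102, "customer_key": 1, "order_id": "A-1", "quantity": 3},
]


def sum_measure(rows, key, measure):
    """Sum an additive measure, grouped by one foreign key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


# The same measure can be summed along either dimension.
quantity_by_product = sum_measure(fact_sales, "product_key", "quantity")
quantity_by_customer = sum_measure(fact_sales, "customer_key", "quantity")
```

Whichever key is used for grouping, the grand total stays the same, which is the practical meaning of "additive".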

Understanding the Granularity of a Fact Table

The granularity of a fact table is a crucial part of data analysis and reporting because it determines the scope and precision of the insights that can be drawn from the data.

Simply put, granularity refers to the level of detail at which individual events or transactions are recorded in the fact table. When designing a fact table, the grain is the first decision that needs to be made.

Deciding on the grain usually involves two steps:

  • Identify the dimensions that need to be included in the fact table.
  • Determine at which level of each dimension’s hierarchy the information will be stored.
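The effect of the chosen grain can be sketched in a few lines of Python. The sample rows and the `roll_up` helper below are assumptions made for illustration: the data is stored at the grain of one row per order line, and the helper aggregates it to a daily or monthly grain along the date hierarchy.

```python
from collections import Counter
from datetime import date

# Atomic grain: one row per order line (hypothetical sample data).
order_lines = [
    (date(2024, 1, 5), "laptop", 2),
    (date(2024, 1, 5), "laptop", 1),
    (date(2024, 1, 6), "phone", 4),
    (date(2024, 2, 1), "laptop", 5),
]


def roll_up(rows, level):
    """Aggregate quantity to a coarser grain along the date hierarchy."""
    totals = Counter()
    for day, product, qty in rows:
        # "day" keeps the full date; anything else groups by (year, month).
        period = day if level == "day" else (day.year, day.month)
        totals[(period, product)] += qty
    return dict(totals)


daily = roll_up(order_lines, "day")
monthly = roll_up(order_lines, "month")
```

Rolling up from order lines to days or months is always possible, but a table stored only at the monthly grain could not be used to answer a question about a single day.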

What is a Dimension Table?

In contrast to a fact table, a dimension table stores the attributes or characteristics that describe the data in the fact table. This information is descriptive in nature and provides context for the numeric data stored in the fact table.

There are different types of dimensions in a data warehouse. Some of the most commonly used include slowly changing dimensions, junk dimensions, role-playing dimensions, and shrunken dimensions.

Like fact tables, dimension tables are an integral part of the star schema and snowflake schema modelling techniques widely used in data warehousing. Each dimension table usually has a column that serves as its primary key, so that every row can be uniquely identified and referenced from the fact table.

Characteristics of a Dimension Table

Let’s explore the key characteristics of a dimension table.

  • Unique Identifiers – Each row in the dimension table is assigned a unique identifier, referred to as the primary key.
  • Records – A dimension table usually holds far fewer records than the fact table it describes, although it can have many attribute columns.
  • Relationship between attributes – Ideally, the attributes have little to no direct relationship with one another.
  • Attribute Values – A dimension table usually contains textual data rather than numbers.
  • De-normalized – A dimension table is typically de-normalized, meaning that redundant data may be stored to improve query performance and simplify data retrieval during analysis.
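To make the primary key and de-normalization points concrete, here is a minimal sketch that flattens two normalised source tables into one product dimension. The source data and the function `build_product_dimension` are hypothetical and exist only for this example.

```python
# Normalised source tables (hypothetical data).
categories = {10: "Electronics", 20: "Accessories"}
products = [
    {"product_id": 101, "name": "Laptop", "category_id": 10},
    {"product_id": 102, "name": "Smartphone", "category_id": 10},
    {"product_id": 103, "name": "Headphones", "category_id": 20},
]


def build_product_dimension(products, categories):
    """Flatten product and category into one de-normalised dimension."""
    dimension = {}
    for p in products:
        # The primary key must identify exactly one dimension row.
        if p["product_id"] in dimension:
            raise ValueError(f"duplicate primary key: {p['product_id']}")
        dimension[p["product_id"]] = {
            "product_name": p["name"],
            # The category name is repeated on every row (redundant on purpose).
            "category": categories[p["category_id"]],
        }
    return dimension


dim_product = build_product_dimension(products, categories)
```

The category name is now repeated on each product row, which costs a little storage but saves a lookup in a second table at query time.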

Fact Table vs. Dimension Table 

The fact table and the dimension table are two important components of the dimensional model widely used for data warehousing. On that note, here are a few key points of difference between fact and dimension tables.

Fact Table | Dimension Table
Records the quantitative or numeric data and facts of a business process. | Stores the descriptive attributes or characteristics related to the data in the fact table.
Usually contains many more records than a dimension table. | Usually contains far fewer records than the fact table.
Tends to be large because it stores a vast amount of numeric data. | Is usually much smaller because it does not contain detailed numeric data.
Holds the measures on which analysis and decision-making are based. | Holds the descriptive information about the business and its processes that is used to filter, group, and label those measures.
Does not have a hierarchical structure. | Can have a hierarchical structure, with attributes organised into levels to facilitate drill-down and roll-up analysis.

Now that the main differences between fact and dimension tables are clear, let’s look at the different types of facts that can be captured in a dimensional model.

Types of Facts

Facts can be categorised into various types depending on the nature of the data they represent. Some of the most common types are:

  • Additive Facts

These are facts that can be aggregated across all dimensions of a fact table by simple summation. Examples of additive facts include sales revenue, total cost, and quantity sold.

  • Non-additive Facts

In contrast to additive facts, non-additive facts cannot be meaningfully summed across dimensions. They can be aggregated only under specific conditions, for example by recomputing them from their additive components. Examples include percentages, ratios, and averages.

  • Factless Facts 

Factless fact tables capture no measures. They only record that an event occurred, without any numeric data attached to it. For example, a factless fact table might contain only a date key and a product key.

  • Snapshot Facts

These store the state of a business process at a specific point in time. They are called snapshot facts because they represent a momentary snapshot of data, such as daily sales, monthly inventory levels, or weekly website traffic.
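The difference between additive and non-additive facts is easiest to see with numbers. The sketch below uses made-up store figures and a hypothetical `margin_pct` helper: revenue and cost are summed directly, while the margin percentage is recomputed from those sums. It also shows how a factless fact table is analysed by counting rows.

```python
# Hypothetical fact rows: revenue and cost are additive measures.
rows = [
    {"store": "north", "revenue": 100.0, "cost": 50.0},
    {"store": "south", "revenue": 900.0, "cost": 810.0},
]


def margin_pct(revenue, cost):
    """Profit margin as a percentage of revenue (a non-additive ratio)."""
    return (revenue - cost) / revenue * 100


# Additive: summing across stores is meaningful.
total_revenue = sum(r["revenue"] for r in rows)
total_cost = sum(r["cost"] for r in rows)

# Non-additive: averaging the per-store margins ignores store size...
naive_margin = sum(margin_pct(r["revenue"], r["cost"]) for r in rows) / len(rows)
# ...so the ratio is recomputed from the summed additive components.
overall_margin = margin_pct(total_revenue, total_cost)

# Factless fact table: only keys, no measures; analysis means counting rows.
attendance = [("2024-01-05", "s1"), ("2024-01-05", "s2"), ("2024-01-06", "s1")]
present_on_jan_5 = sum(1 for day, _ in attendance if day == "2024-01-05")
```

With these figures the naive average comes out at roughly 30 percent, while the margin recomputed from total revenue and total cost is roughly 14 percent, because the larger store has the thinner margin.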

Types of Dimensions in Data Warehouses

Different types of dimensions are used in a data warehouse to organise and describe the data in the fact tables. Some of the most commonly known are:

  • Normal Dimension 

A normal dimension contains attributes related to a single logical entity. It has a business key, usually alongside a surrogate key, and all of its attributes depend on that key. For example, a customer dimension might include customer ID, name, location, and age.

  • Junk Dimension

A junk dimension contains no business key and is typically used to group boolean or binary attributes. It combines multiple indicators into a single dimension table and helps to reduce the number of dimension tables in the data warehouse and improve query performance. The attributes in a junk dimension are usually at the transaction level.

  • Split Dimension

As the name suggests, a split dimension is one that has been split into multiple tables to reduce data redundancy and improve data management. It is used when a dimension is expected to be very large, e.g., 20 million rows. Dividing it into several smaller tables makes the data more manageable and more efficient to query.

  • Text Dimension

A text dimension usually contains large amounts of textual data such as comments, descriptions or notes. It allows for a more detailed analysis of text-based information. The text dimension is especially useful when hierarchies or relationships exist between different dimensions. 

  • Stacked Dimension

A stacked dimension is one where multiple related dimensions are combined into a single table. It allows for simplification of the data model and makes it easier to navigate and analyse the data. 
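Of the dimension types above, the junk dimension is the easiest to sketch in code. The example below assumes three made-up transaction flags and builds one row per combination, each with its own surrogate key. The names `flags`, `build_junk_dimension`, and `lookup_junk_key` are hypothetical.

```python
from itertools import product

# Hypothetical low-cardinality flags that would otherwise sit in the fact table.
flags = {
    "is_gift": (False, True),
    "is_express": (False, True),
    "payment_type": ("card", "cash"),
}


def build_junk_dimension(flags):
    """Enumerate every flag combination and give each one a surrogate key."""
    names = list(flags)
    rows = {}
    for key, combo in enumerate(product(*flags.values()), start=1):
        rows[key] = dict(zip(names, combo))
    return rows


def lookup_junk_key(junk_dim, **values):
    """Find the surrogate key a fact row should store for these flag values."""
    for key, row in junk_dim.items():
        if row == values:
            return key
    raise KeyError(values)


junk_dim = build_junk_dimension(flags)
key = lookup_junk_key(junk_dim, is_gift=True, is_express=False, payment_type="cash")
```

A fact row then stores the single `key` instead of three separate flag columns. Pre-building every combination is only practical while the flags have few distinct values; with many values, rows are usually added as combinations actually occur.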

Example of Fact Table vs Dimension Table

Below is a small example illustrating the difference between fact and dimension tables.

Fact Table

Order ID | Product ID | Customer ID | Quantity | Price | Discount | Total Sales
1001     | 101        | C001        | 2        | $50   | $5       | $95
1002     | 102        | C002        | 1        | $30   | $0       | $30
1003     | 103        | C003        | 3        | $20   | $2       | $58

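This fact table records three sales transactions. Product ID and Customer ID are foreign keys that point to dimension tables, while the remaining columns hold quantitative data: the quantity sold, the unit price, the discount applied, and the total sales of each transaction (quantity × price, less the discount).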

Dimension Table

Product ID | Product Name | Category    | Brand
101        | Laptop       | Electronics | ABC Inc.
102        | Smartphone   | Electronics | XYZ Corp
103        | Headphones   | Accessories | DEF Tech

Here, we have information related to the various products sold by the company. Each row represents a unique product and includes attributes such as product name, category, and brand. 
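To see how the two tables work together, the sketch below loads the rows shown above into an in-memory SQLite database using Python's standard `sqlite3` module and totals sales by product category. The table names `fact_sales` and `dim_product`, the lowercase column names, and the column types are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT,
        brand        TEXT
    );
    CREATE TABLE fact_sales (
        order_id    INTEGER,
        product_id  INTEGER REFERENCES dim_product(product_id),
        customer_id TEXT,
        quantity    INTEGER,
        price       REAL,
        discount    REAL,
        total_sales REAL
    );
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?, ?)", [
    (101, "Laptop", "Electronics", "ABC Inc."),
    (102, "Smartphone", "Electronics", "XYZ Corp"),
    (103, "Headphones", "Accessories", "DEF Tech"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1001, 101, "C001", 2, 50, 5, 95),
    (1002, 102, "C002", 1, 30, 0, 30),
    (1003, 103, "C003", 3, 20, 2, 58),
])

# Join the fact table to the dimension and sum the measure by an attribute.
sales_by_category = dict(conn.execute("""
    SELECT d.category, SUM(f.total_sales)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
"""))
conn.close()
```

The fact table supplies the numbers and the dimension table supplies the label used for grouping. With the sample rows, the laptop and smartphone sales roll up into Electronics, and the headphones sale forms Accessories.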

Conclusion

Hopefully, you now clearly understand the key differences between dimension tables and fact tables in a data warehouse. To sum up, both components play crucial roles in organising and storing data for efficient analysis and reporting.

From performance optimisation and data organisation, to query efficiency and simplified reporting, the list of advantages they bring to the table goes on and on. They enable business enterprises to harness the power of data-driven decision-making for improved business performance. 

Frequently Asked Questions

What is the role of dimension hierarchies in data warehousing?

Dimension hierarchies provide a structured way to organise and navigate data at various levels of granularity. With the help of this, data warehouse users can access specific data points while maintaining the ability to view broader trends and insights. This, in turn, facilitates more effective decision-making and data exploration.

What are the advantages of using fact and dimension tables in data warehousing compared to other data storage models?

Fact and dimension tables remain among the most widely used data models in data warehousing because of the benefits they bring, including improved query performance, easy scalability, flexibility, and data integrity.

What are the main characteristics and functions of a dimension table?

Dimension tables usually contain descriptive attributes that are non-numeric by nature and help to categorise, classify and label the data in the fact table. In addition to this, they often contain hierarchical structures that allow users to navigate through the varied levels of granularity.
