Additive, Semi-Additive, Non-Additive
In data warehousing, a fact table contains measurements or facts, which can be categorized into three types: additive, semi-additive, and non-additive.
Additive Facts
These can be summed across all dimensions of the fact table. Additive facts are the most straightforward to aggregate and are commonly used in reporting.
Examples:
- Sales revenue: Can be summed by time, product, region, etc.
- Quantity sold: Summable across product, time, store dimensions.
- Profit: Aggregates well across dimensions.
- Number of website visits: Can be summed up by day, user, or region.
- Number of clicks on a banner ad: Can be aggregated across time, campaigns, etc.
Semi-Additive Facts
These can be summed across some dimensions but not others. For example, summing across time might not make sense for certain facts like balances or inventory, which are snapshot-based.
Examples:
- Bank balance: Can be summed across accounts but not over time (as balances are snapshots at specific points in time).
- Stock price: Cannot be summed across time but might be relevant for a single stock across locations or markets.
- Inventory levels: Summable across product or location but not over time. Number of employees: Summable by department, but not over time as it's a snapshot metric.
- Number of students enrolled in a course: Summable by course but not over time.
Non-Additive Facts
These cannot be summed across any dimension. These are usually metrics like ratios, percentages, or averages that do not make sense to sum.
Examples:
- Profit margin: It's a ratio and cannot be added up across regions or time.
- Gross margin percentage: Like profit margin, it’s a percentage and non-additive.
- Average temperature: Averages do not sum well; they need to be recalculated for larger groups.
- Average customer satisfaction rating: Cannot sum up ratings; they need to be averaged again for larger sets.
- Percentage of market share: Similar to margins, percentages cannot be summed.
It is essential to identify the type of fact as it determines how the fact table will be aggregated and also impacts the design of the data warehouse.
Example
Additive Facts:
Sales Revenue: Can be summed across all dimensions (e.g., total sales revenue by day, store, product, etc.). Quantity Sold: Additive across all dimensions (e.g., sum of quantities sold by product, store, or day).
Total Sales Revenue for P001 across all stores on 10/01/2023 = 500 (S001) + 600 (S002) = 1100.
Total Quantity Sold for P001 on 10/01/2023 = 10 (S001) + 12 (S002) = 22.
Semi-Additive Facts:
Inventory Level: This represents a snapshot of inventory at a given time. You can sum inventory across stores but not across time (e.g., you can sum inventory for a product across different stores at a single point in time, but not over multiple days).
Inventory Level for P001 on 10/01/2023 across all stores = 200 (S001) + 180 (S002) = 380.
Inventory Level across time would not be summed (e.g., you cannot add inventory levels on 10/01 and 10/02).
Non-Additive Facts:
Profit Margin (%): Cannot be summed across any dimension, as it is a percentage. For example, adding profit margins across stores or products would not yield a meaningful result. Instead, you'd need to compute a weighted average.
Average Customer Rating: Similar to profit margin, it cannot be summed across stores or time. It would need to be recalculated as an average if you wanted to aggregate it.
Average Profit Margin for P001 on 10/01/2023 would need to be recalculated based on weighted averages of sales revenues or units sold.
Average Customer Rating would need to be recalculated based on customer reviews or feedback.