Files
awesome-copilot/skills/powerbi-modeling/references/STAR-SCHEMA.md
2026-02-03 12:29:33 +03:00

104 lines
3.1 KiB
Markdown

# Star Schema Design for Power BI
## Overview
Star schema is the optimal design pattern for Power BI semantic models. It organizes data into:
- **Dimension tables**: Enable filtering and grouping (the "one" side)
- **Fact tables**: Enable summarization (the "many" side)
## Table Classification
### Dimension Tables
- Contain descriptive attributes for filtering/slicing
- Have unique key columns (one row per entity)
- Examples: Customer, Product, Date, Geography, Employee
- Naming convention: Singular noun (`Customer`, `Product`)
### Fact Tables
- Contain measurable, quantitative data
- Have foreign keys to dimensions
- Store data at consistent grain (one row per transaction/event)
- Examples: Sales, Orders, Inventory, WebVisits
- Naming convention: Business process noun (`Sales`, `Orders`)
## Design Principles
### 1. Separate Dimensions from Facts
```
BAD: Single denormalized "Sales" table with customer details
GOOD: "Sales" fact table + "Customer" dimension table
```
### 2. Consistent Grain
Every row in a fact table represents the same thing:
- Order line level (most common)
- Daily aggregation
- Monthly summary
Never mix grains in one table.
### 3. Surrogate Keys
Add surrogate keys when source lacks unique identifiers:
```m
// Power Query: Add index column
= Table.AddIndexColumn(Source, "CustomerKey", 1, 1)
```
### 4. Date Dimension
Always create a dedicated date table:
- Mark as date table in Power BI
- Include fiscal periods if needed
- Add relative date columns (IsCurrentMonth, IsPreviousYear)
```dax
Date =
ADDCOLUMNS(
CALENDAR(DATE(2020,1,1), DATE(2030,12,31)),
"Year", YEAR([Date]),
"Month", FORMAT([Date], "MMMM"),
"MonthNum", MONTH([Date]),
"Quarter", "Q" & FORMAT([Date], "Q"),
"WeekDay", FORMAT([Date], "dddd")
)
```
## Special Dimension Types
### Role-Playing Dimensions
Same dimension used multiple times (e.g., Date for OrderDate, ShipDate):
- Option 1: Duplicate the table (OrderDate, ShipDate tables)
- Option 2: Use inactive relationships with USERELATIONSHIP in DAX
### Slowly Changing Dimensions (Type 2)
Track historical changes with version columns:
- StartDate, EndDate columns
- IsCurrent flag
- Requires pre-processing in data warehouse
### Junk Dimensions
Combine low-cardinality flags into one table:
```
OrderFlags dimension: IsRush, IsGift, IsOnline
```
### Degenerate Dimensions
Keep transaction identifiers (OrderNumber, InvoiceID) in fact table.
## Anti-Patterns to Avoid
| Anti-Pattern | Problem | Solution |
|--------------|---------|----------|
| Wide denormalized tables | Poor performance, hard to maintain | Split into star schema |
| Snowflake (normalized dims) | Extra joins hurt performance | Flatten dimensions |
| Many-to-many without bridge | Ambiguous results | Add bridge/junction table |
| Mixed grain facts | Incorrect aggregations | Separate tables per grain |
## Validation Checklist
- [ ] Each table is clearly dimension or fact
- [ ] Fact tables have foreign keys to all related dimensions
- [ ] Dimensions have unique key columns
- [ ] Date table exists and is marked
- [ ] No circular relationship paths
- [ ] Consistent naming conventions