mirror of
https://github.com/github/awesome-copilot.git
synced 2026-02-23 11:55:12 +00:00
Add Power BI modeling skill
This commit is contained in:
195
skills/powerbi-modeling/references/MEASURES-DAX.md
Normal file
195
skills/powerbi-modeling/references/MEASURES-DAX.md
Normal file
@@ -0,0 +1,195 @@
|
||||
# DAX Measures and Naming Conventions
|
||||
|
||||
## Naming Conventions
|
||||
|
||||
### General Rules
|
||||
- Use human-readable names (spaces allowed)
|
||||
- Be descriptive: `Total Sales Amount` not `TSA`
|
||||
- Avoid abbreviations unless universally understood
|
||||
- Use consistent capitalization (Title Case recommended)
|
||||
- Avoid special characters except spaces
|
||||
|
||||
### Table Naming
|
||||
| Type | Convention | Example |
|
||||
|------|------------|---------|
|
||||
| Dimension | Singular noun | Customer, Product, Date |
|
||||
| Fact | Business process | Sales, Orders, Inventory |
|
||||
| Bridge | Combined names | CustomerAccount, ProductCategory |
|
||||
| Measure Table | Underscore prefix | _Measures, _KPIs |
|
||||
|
||||
### Column Naming
|
||||
| Type | Convention | Example |
|
||||
|------|------------|---------|
|
||||
| Keys | Suffix with "Key" or "ID" | CustomerKey, ProductID |
|
||||
| Dates | Suffix with "Date" | OrderDate, ShipDate |
|
||||
| Amounts | Descriptive with unit hint | SalesAmount, QuantitySold |
|
||||
| Flags | Prefix with "Is" or "Has" | IsActive, HasDiscount |
|
||||
|
||||
### Measure Naming
|
||||
| Type | Convention | Example |
|
||||
|------|------------|---------|
|
||||
| Aggregations | Verb + Noun | Total Sales, Count of Orders |
|
||||
| Ratios | X per Y or X Rate | Sales per Customer, Conversion Rate |
|
||||
| Time Intelligence | Period + Metric | YTD Sales, PY Total Sales |
|
||||
| Comparisons | Metric + vs + Baseline | Sales vs Budget, Growth vs PY |
|
||||
|
||||
## Explicit vs Implicit Measures
|
||||
|
||||
### Always Create Explicit Measures For:
|
||||
1. Key business metrics users will query
|
||||
2. Complex calculations with filter manipulation
|
||||
3. Measures used in MDX (Excel PivotTables)
|
||||
4. Controlled aggregation (prevent sum of averages)
|
||||
|
||||
### Implicit Measures (Column Aggregations)
|
||||
- Acceptable for simple exploration
|
||||
- Set correct SummarizeBy property:
|
||||
- Amounts: Sum
|
||||
- Keys/IDs: None (Do Not Summarize)
|
||||
- Rates/Prices: None or Average
|
||||
|
||||
## Measure Patterns
|
||||
|
||||
### Basic Aggregations
|
||||
```dax
|
||||
Total Sales = SUM(Sales[SalesAmount])
|
||||
Order Count = COUNTROWS(Sales)
|
||||
Average Order Value = DIVIDE([Total Sales], [Order Count])
|
||||
Distinct Customers = DISTINCTCOUNT(Sales[CustomerKey])
|
||||
```
|
||||
|
||||
### Time Intelligence (Requires Date Table)
|
||||
```dax
|
||||
YTD Sales = TOTALYTD([Total Sales], 'Date'[Date])
|
||||
MTD Sales = TOTALMTD([Total Sales], 'Date'[Date])
|
||||
PY Sales = CALCULATE([Total Sales], SAMEPERIODLASTYEAR('Date'[Date]))
|
||||
YoY Growth = DIVIDE([Total Sales] - [PY Sales], [PY Sales])
|
||||
```
|
||||
|
||||
### Percentage Calculations
|
||||
```dax
|
||||
Sales % of Total =
|
||||
DIVIDE(
|
||||
[Total Sales],
|
||||
CALCULATE([Total Sales], REMOVEFILTERS(Product))
|
||||
)
|
||||
|
||||
Margin % = DIVIDE([Gross Profit], [Total Sales])
|
||||
```
|
||||
|
||||
### Running Totals
|
||||
```dax
|
||||
Running Total =
|
||||
CALCULATE(
|
||||
[Total Sales],
|
||||
FILTER(
|
||||
ALL('Date'),
|
||||
'Date'[Date] <= MAX('Date'[Date])
|
||||
)
|
||||
)
|
||||
```
|
||||
|
||||
## Column References
|
||||
|
||||
### Best Practice: Always Qualify Column Names
|
||||
```dax
|
||||
// GOOD - Fully qualified
|
||||
Sales Amount = SUM(Sales[SalesAmount])
|
||||
|
||||
// BAD - Unqualified (can cause ambiguity)
|
||||
Sales Amount = SUM([SalesAmount])
|
||||
```
|
||||
|
||||
### Measure References: Never Qualify
|
||||
```dax
|
||||
// GOOD - Unqualified measure
|
||||
YTD Sales = TOTALYTD([Total Sales], 'Date'[Date])
|
||||
|
||||
// BAD - Qualified measure (breaks if home table changes)
|
||||
YTD Sales = TOTALYTD(Sales[Total Sales], 'Date'[Date])
|
||||
```
|
||||
|
||||
## Documentation
|
||||
|
||||
### Measure Descriptions
|
||||
Always add descriptions explaining:
|
||||
- What the measure calculates
|
||||
- Business context/usage
|
||||
- Any important assumptions
|
||||
|
||||
```
|
||||
measure_operations(
|
||||
operation: "Update",
|
||||
definitions: [{
|
||||
name: "Total Sales",
|
||||
tableName: "Sales",
|
||||
description: "Sum of all completed sales transactions. Excludes returns and cancelled orders."
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
### Format Strings
|
||||
| Data Type | Format String | Example Output |
|
||||
|-----------|---------------|----------------|
|
||||
| Currency | $#,##0.00 | $1,234.56 |
|
||||
| Percentage | 0.0% | 12.3% |
|
||||
| Whole Number | #,##0 | 1,234 |
|
||||
| Decimal | #,##0.00 | 1,234.56 |
|
||||
|
||||
## Display Folders
|
||||
|
||||
Organize measures into logical groups:
|
||||
```
|
||||
measure_operations(
|
||||
operation: "Update",
|
||||
definitions: [{
|
||||
name: "YTD Sales",
|
||||
tableName: "_Measures",
|
||||
displayFolder: "Time Intelligence\\Year"
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
Common folder structure:
|
||||
```
|
||||
_Measures
|
||||
├── Sales
|
||||
│ ├── Total Sales
|
||||
│ └── Average Sale
|
||||
├── Time Intelligence
|
||||
│ ├── Year
|
||||
│ │ ├── YTD Sales
|
||||
│ │ └── PY Sales
|
||||
│ └── Month
|
||||
│ └── MTD Sales
|
||||
└── Ratios
|
||||
├── Margin %
|
||||
└── Conversion Rate
|
||||
```
|
||||
|
||||
## Variables for Performance
|
||||
|
||||
Use variables to:
|
||||
- Avoid recalculating the same expression
|
||||
- Improve readability
|
||||
- Enable debugging
|
||||
|
||||
```dax
|
||||
Gross Margin % =
|
||||
VAR TotalSales = [Total Sales]
|
||||
VAR TotalCost = [Total Cost]
|
||||
VAR GrossProfit = TotalSales - TotalCost
|
||||
RETURN
|
||||
DIVIDE(GrossProfit, TotalSales)
|
||||
```
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] All key business metrics have explicit measures
|
||||
- [ ] Measures have clear, descriptive names
|
||||
- [ ] Measures have descriptions
|
||||
- [ ] Appropriate format strings applied
|
||||
- [ ] Display folders organize related measures
|
||||
- [ ] Column references are fully qualified
|
||||
- [ ] Measure references are not qualified
|
||||
- [ ] Variables used for complex calculations
|
||||
215
skills/powerbi-modeling/references/PERFORMANCE.md
Normal file
215
skills/powerbi-modeling/references/PERFORMANCE.md
Normal file
@@ -0,0 +1,215 @@
|
||||
# Performance Optimization for Power BI Models
|
||||
|
||||
## Data Reduction Techniques
|
||||
|
||||
### 1. Remove Unnecessary Columns
|
||||
- Only import columns needed for reporting
|
||||
- Remove audit columns (CreatedBy, ModifiedDate) unless required
|
||||
- Remove duplicate/redundant columns
|
||||
|
||||
```
|
||||
column_operations(operation: "List", filter: { tableNames: ["Sales"] })
|
||||
// Review and remove unneeded columns
|
||||
```
|
||||
|
||||
### 2. Remove Unnecessary Rows
|
||||
- Filter historical data to relevant period
|
||||
- Exclude cancelled/void transactions if not needed
|
||||
- Apply filters in Power Query (not in DAX)
|
||||
|
||||
### 3. Reduce Cardinality
|
||||
High cardinality (many unique values) impacts:
|
||||
- Model size
|
||||
- Refresh time
|
||||
- Query performance
|
||||
|
||||
**Solutions:**
|
||||
| Column Type | Reduction Technique |
|
||||
|-------------|---------------------|
|
||||
| DateTime | Split into Date and Time columns |
|
||||
| Decimal precision | Round to needed precision |
|
||||
| Text with patterns | Extract common prefix/suffix |
|
||||
| High-precision IDs | Use surrogate integer keys |
|
||||
|
||||
### 4. Optimize Data Types
|
||||
| From | To | Benefit |
|
||||
|------|-----|---------|
|
||||
| DateTime | Date (if time not needed) | 8 bytes to 4 bytes |
|
||||
| Decimal | Fixed Decimal | Better compression |
|
||||
| Text with numbers | Whole Number | Much better compression |
|
||||
| Long text | Shorter text | Reduces storage |
|
||||
|
||||
### 5. Group and Summarize
|
||||
Pre-aggregate data when detail not needed:
|
||||
- Daily instead of transactional
|
||||
- Monthly instead of daily
|
||||
- Consider aggregation tables for DirectQuery
|
||||
|
||||
## Column Optimization
|
||||
|
||||
### Prefer Power Query Columns Over Calculated Columns
|
||||
| Approach | When to Use |
|
||||
|----------|-------------|
|
||||
| Power Query (M) | Can be computed at source, static values |
|
||||
| Calculated Column (DAX) | Needs model relationships, dynamic logic |
|
||||
|
||||
Power Query columns:
|
||||
- Load faster
|
||||
- Compress better
|
||||
- Use less memory
|
||||
|
||||
### Avoid Calculated Columns on Relationship Keys
|
||||
DAX calculated columns in relationships:
|
||||
- Cannot use indexes
|
||||
- Generate complex SQL for DirectQuery
|
||||
- Hurt performance significantly
|
||||
|
||||
**Use COMBINEVALUES for multi-column relationships:**
|
||||
```dax
|
||||
// If you must use calculated column for composite key
|
||||
CompositeKey = COMBINEVALUES(",", [Country], [City])
|
||||
```
|
||||
|
||||
### Set Appropriate Summarization
|
||||
Prevent accidental aggregation of non-additive columns:
|
||||
```
|
||||
column_operations(
|
||||
operation: "Update",
|
||||
definitions: [{
|
||||
tableName: "Product",
|
||||
name: "UnitPrice",
|
||||
summarizeBy: "None"
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
## Relationship Optimization
|
||||
|
||||
### 1. Minimize Bidirectional Relationships
|
||||
Each bidirectional relationship:
|
||||
- Increases query complexity
|
||||
- Can create ambiguous paths
|
||||
- Reduces performance
|
||||
|
||||
### 2. Avoid Many-to-Many When Possible
|
||||
Many-to-many relationships:
|
||||
- Generate more complex queries
|
||||
- Require more memory
|
||||
- Can produce unexpected results
|
||||
|
||||
### 3. Reduce Relationship Cardinality
|
||||
Keep relationship columns low cardinality:
|
||||
- Use integer keys over text
|
||||
- Consider higher-grain relationships
|
||||
|
||||
## DAX Optimization
|
||||
|
||||
### 1. Use Variables
|
||||
```dax
|
||||
// GOOD - Calculate once, use twice
|
||||
Sales Growth =
|
||||
VAR CurrentSales = [Total Sales]
|
||||
VAR PriorSales = [PY Sales]
|
||||
RETURN DIVIDE(CurrentSales - PriorSales, PriorSales)
|
||||
|
||||
// BAD - Recalculates [Total Sales] and [PY Sales]
|
||||
Sales Growth =
|
||||
DIVIDE([Total Sales] - [PY Sales], [PY Sales])
|
||||
```
|
||||
|
||||
### 2. Avoid FILTER with Entire Tables
|
||||
```dax
|
||||
// BAD - Iterates entire table
|
||||
Sales High Value =
|
||||
CALCULATE([Total Sales], FILTER(Sales, Sales[Amount] > 1000))
|
||||
|
||||
// GOOD - Uses column reference
|
||||
Sales High Value =
|
||||
CALCULATE([Total Sales], Sales[Amount] > 1000)
|
||||
```
|
||||
|
||||
### 3. Use KEEPFILTERS Appropriately
|
||||
```dax
|
||||
// Respects existing filters
|
||||
Sales with Filter =
|
||||
CALCULATE([Total Sales], KEEPFILTERS(Product[Category] = "Bikes"))
|
||||
```
|
||||
|
||||
### 4. Prefer DIVIDE Over Division Operator
|
||||
```dax
|
||||
// GOOD - Handles divide by zero
|
||||
Margin % = DIVIDE([Profit], [Sales])
|
||||
|
||||
// BAD - Errors on zero
|
||||
Margin % = [Profit] / [Sales]
|
||||
```
|
||||
|
||||
## DirectQuery Optimization
|
||||
|
||||
### 1. Minimize Columns and Tables
|
||||
DirectQuery models:
|
||||
- Query source for every visual
|
||||
- Performance depends on source
|
||||
- Minimize data retrieved
|
||||
|
||||
### 2. Avoid Complex Power Query Transformations
|
||||
- Transforms become subqueries
|
||||
- Native queries are faster
|
||||
- Materialize at source when possible
|
||||
|
||||
### 3. Keep Measures Simple Initially
|
||||
Complex DAX generates complex SQL:
|
||||
- Start with basic aggregations
|
||||
- Add complexity gradually
|
||||
- Monitor query performance
|
||||
|
||||
### 4. Disable Auto Date/Time
|
||||
For DirectQuery models, disable auto date/time:
|
||||
- Creates hidden calculated tables
|
||||
- Increases model complexity
|
||||
- Use explicit date table instead
|
||||
|
||||
## Aggregations
|
||||
|
||||
### User-Defined Aggregations
|
||||
Pre-aggregate fact tables for:
|
||||
- Very large models (billions of rows)
|
||||
- Hybrid DirectQuery/Import
|
||||
- Common query patterns
|
||||
|
||||
```
|
||||
table_operations(
|
||||
operation: "Create",
|
||||
definitions: [{
|
||||
name: "SalesAgg",
|
||||
mode: "Import",
|
||||
mExpression: "..."
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
## Performance Testing
|
||||
|
||||
### Use Performance Analyzer
|
||||
1. Enable in Power BI Desktop
|
||||
2. Start recording
|
||||
3. Interact with visuals
|
||||
4. Review DAX query times
|
||||
|
||||
### Monitor with DAX Studio
|
||||
External tool for:
|
||||
- Query timing
|
||||
- Server timings
|
||||
- Query plans
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] Unnecessary columns removed
|
||||
- [ ] Appropriate data types used
|
||||
- [ ] High-cardinality columns addressed
|
||||
- [ ] Bidirectional relationships minimized
|
||||
- [ ] DAX uses variables for repeated expressions
|
||||
- [ ] No FILTER on entire tables
|
||||
- [ ] DIVIDE used instead of division operator
|
||||
- [ ] Auto date/time disabled for DirectQuery
|
||||
- [ ] Performance tested with representative data
|
||||
147
skills/powerbi-modeling/references/RELATIONSHIPS.md
Normal file
147
skills/powerbi-modeling/references/RELATIONSHIPS.md
Normal file
@@ -0,0 +1,147 @@
|
||||
# Relationships in Power BI
|
||||
|
||||
## Relationship Properties
|
||||
|
||||
### Cardinality
|
||||
| Type | Use Case | Notes |
|
||||
|------|----------|-------|
|
||||
| One-to-Many (*:1) | Dimension to Fact | Most common, preferred |
|
||||
| Many-to-One (1:*) | Fact to Dimension | Same as above, direction reversed |
|
||||
| One-to-One (1:1) | Dimension extensions | Use sparingly |
|
||||
| Many-to-Many (*:*) | Bridge tables, complex scenarios | Requires careful design |
|
||||
|
||||
### Cross-Filter Direction
|
||||
| Setting | Behavior | When to Use |
|
||||
|---------|----------|-------------|
|
||||
| Single | Filters flow from "one" to "many" | Default, best performance |
|
||||
| Both | Filters flow in both directions | Only when necessary |
|
||||
|
||||
## Best Practices
|
||||
|
||||
### 1. Prefer One-to-Many Relationships
|
||||
```
|
||||
Customer (1) --> (*) Sales
|
||||
Product (1) --> (*) Sales
|
||||
Date (1) --> (*) Sales
|
||||
```
|
||||
|
||||
### 2. Use Single-Direction Cross-Filtering
|
||||
Bidirectional filtering:
|
||||
- Impacts performance negatively
|
||||
- Can create ambiguous filter paths
|
||||
- May produce unexpected results
|
||||
|
||||
**Only use bidirectional when:**
|
||||
- Dimension-to-dimension analysis through fact table
|
||||
- Specific RLS requirements
|
||||
|
||||
**Better alternative:** Use CROSSFILTER in DAX measures:
|
||||
```dax
|
||||
Countries Sold =
|
||||
CALCULATE(
|
||||
DISTINCTCOUNT(Customer[Country]),
|
||||
CROSSFILTER(Customer[CustomerKey], Sales[CustomerKey], BOTH)
|
||||
)
|
||||
```
|
||||
|
||||
### 3. One Active Path Between Tables
|
||||
- Only one active relationship between any two tables
|
||||
- Use USERELATIONSHIP for role-playing dimensions:
|
||||
|
||||
```dax
|
||||
Sales by Ship Date =
|
||||
CALCULATE(
|
||||
[Total Sales],
|
||||
USERELATIONSHIP(Sales[ShipDate], Date[Date])
|
||||
)
|
||||
```
|
||||
|
||||
### 4. Avoid Ambiguous Paths
|
||||
Circular references cause errors. Solutions:
|
||||
- Deactivate one relationship
|
||||
- Restructure model
|
||||
- Use USERELATIONSHIP in measures
|
||||
|
||||
## Relationship Patterns
|
||||
|
||||
### Standard Star Schema
|
||||
```
|
||||
[Date]
|
||||
|
|
||||
[Product]--[Sales]--[Customer]
|
||||
|
|
||||
[Store]
|
||||
```
|
||||
|
||||
### Role-Playing Dimension
|
||||
```
|
||||
[Date] --(active)-- [Sales.OrderDate]
|
||||
|
|
||||
+--(inactive)-- [Sales.ShipDate]
|
||||
```
|
||||
|
||||
### Bridge Table (Many-to-Many)
|
||||
```
|
||||
[Customer]--(*)--[CustomerAccount]--(*)--[Account]
|
||||
```
|
||||
|
||||
### Factless Fact Table
|
||||
```
|
||||
[Product]--[ProductPromotion]--[Promotion]
|
||||
```
|
||||
Used to capture relationships without measures.
|
||||
|
||||
## Creating Relationships via MCP
|
||||
|
||||
### List Current Relationships
|
||||
```
|
||||
relationship_operations(operation: "List")
|
||||
```
|
||||
|
||||
### Create New Relationship
|
||||
```
|
||||
relationship_operations(
|
||||
operation: "Create",
|
||||
definitions: [{
|
||||
fromTable: "Sales",
|
||||
fromColumn: "ProductKey",
|
||||
toTable: "Product",
|
||||
toColumn: "ProductKey",
|
||||
crossFilteringBehavior: "OneDirection",
|
||||
isActive: true
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
### Deactivate Relationship
|
||||
```
|
||||
relationship_operations(
|
||||
operation: "Deactivate",
|
||||
references: [{ name: "relationship-guid-here" }]
|
||||
)
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Ambiguous Path" Error
|
||||
Multiple active paths exist between tables.
|
||||
- Check for: Multiple fact tables sharing dimensions
|
||||
- Solution: Deactivate redundant relationships
|
||||
|
||||
### Bidirectional Not Allowed
|
||||
Circular reference would be created.
|
||||
- Solution: Restructure or use DAX CROSSFILTER
|
||||
|
||||
### Relationship Not Detected
|
||||
Columns may have different data types.
|
||||
- Ensure both columns have identical types
|
||||
- Check for trailing spaces in text keys
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] All relationships are one-to-many where possible
|
||||
- [ ] Cross-filter is single direction by default
|
||||
- [ ] Only one active path between any two tables
|
||||
- [ ] Role-playing dimensions use inactive relationships
|
||||
- [ ] No circular reference paths
|
||||
- [ ] Key columns have matching data types
|
||||
226
skills/powerbi-modeling/references/RLS.md
Normal file
226
skills/powerbi-modeling/references/RLS.md
Normal file
@@ -0,0 +1,226 @@
|
||||
# Row-Level Security (RLS) in Power BI
|
||||
|
||||
## Overview
|
||||
|
||||
Row-Level Security restricts data access at the row level based on user identity. Users see only the data they're authorized to view.
|
||||
|
||||
## Design Principles
|
||||
|
||||
### 1. Filter on Dimension Tables
|
||||
Apply RLS to dimensions, not fact tables:
|
||||
- More efficient (smaller tables)
|
||||
- Filters propagate through relationships
|
||||
- Easier to maintain
|
||||
|
||||
```dax
|
||||
// On Customer dimension - filters propagate to Sales
|
||||
[Region] = "West"
|
||||
```
|
||||
|
||||
### 2. Create Minimal Roles
|
||||
Avoid many role combinations:
|
||||
- Each role = separate cache
|
||||
- Roles are additive (union, not intersection)
|
||||
- Consolidate where possible
|
||||
|
||||
### 3. Use Dynamic RLS When Possible
|
||||
Data-driven rules scale better:
|
||||
- User mapping in a table
|
||||
- USERPRINCIPALNAME() for identity
|
||||
- No role changes when users change
|
||||
|
||||
## Static vs Dynamic RLS
|
||||
|
||||
### Static RLS
|
||||
Fixed rules per role:
|
||||
```dax
|
||||
// Role: West Region
|
||||
[Region] = "West"
|
||||
|
||||
// Role: East Region
|
||||
[Region] = "East"
|
||||
```
|
||||
|
||||
**Pros:** Simple, clear
|
||||
**Cons:** Doesn't scale, requires role per group
|
||||
|
||||
### Dynamic RLS
|
||||
User identity drives filtering:
|
||||
```dax
|
||||
// Single role filters based on logged-in user
|
||||
[ManagerEmail] = USERPRINCIPALNAME()
|
||||
```
|
||||
|
||||
**Pros:** Scales, self-maintaining
|
||||
**Cons:** Requires user mapping data
|
||||
|
||||
## Implementation Patterns
|
||||
|
||||
### Pattern 1: Direct User Mapping
|
||||
User email in dimension table:
|
||||
```dax
|
||||
// On Customer table
|
||||
[CustomerEmail] = USERPRINCIPALNAME()
|
||||
```
|
||||
|
||||
### Pattern 2: Security Table
|
||||
Separate table mapping users to data:
|
||||
```
|
||||
SecurityMapping table:
|
||||
| UserEmail | Region |
|
||||
|-----------|--------|
|
||||
| joe@co.com | West |
|
||||
| sue@co.com | East |
|
||||
```
|
||||
|
||||
```dax
|
||||
// On Region dimension
|
||||
[Region] IN
|
||||
SELECTCOLUMNS(
|
||||
FILTER(SecurityMapping, [UserEmail] = USERPRINCIPALNAME()),
|
||||
"Region", [Region]
|
||||
)
|
||||
```
|
||||
|
||||
### Pattern 3: Manager Hierarchy
|
||||
Users see their data plus subordinates:
|
||||
```dax
|
||||
// Using PATH functions for hierarchy
|
||||
PATHCONTAINS(Employee[ManagerPath],
|
||||
LOOKUPVALUE(Employee[EmployeeID], Employee[Email], USERPRINCIPALNAME()))
|
||||
```
|
||||
|
||||
### Pattern 4: Multiple Rules
|
||||
Combine conditions:
|
||||
```dax
|
||||
// Users see their region OR if they're a global viewer
|
||||
[Region] = LOOKUPVALUE(Users[Region], Users[Email], USERPRINCIPALNAME())
|
||||
|| LOOKUPVALUE(Users[IsGlobal], Users[Email], USERPRINCIPALNAME()) = TRUE()
|
||||
```
|
||||
|
||||
## Creating Roles via MCP
|
||||
|
||||
### List Existing Roles
|
||||
```
|
||||
security_role_operations(operation: "List")
|
||||
```
|
||||
|
||||
### Create Role with Permission
|
||||
```
|
||||
security_role_operations(
|
||||
operation: "Create",
|
||||
definitions: [{
|
||||
name: "Regional Sales",
|
||||
modelPermission: "Read",
|
||||
description: "Restricts sales data by region"
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
### Add Table Permission (Filter)
|
||||
```
|
||||
security_role_operations(
|
||||
operation: "CreatePermissions",
|
||||
permissionDefinitions: [{
|
||||
roleName: "Regional Sales",
|
||||
tableName: "Customer",
|
||||
filterExpression: "[Region] = USERPRINCIPALNAME()"
|
||||
}]
|
||||
)
|
||||
```
|
||||
|
||||
### Get Effective Permissions
|
||||
```
|
||||
security_role_operations(
|
||||
operation: "GetEffectivePermissions",
|
||||
references: [{ name: "Regional Sales" }]
|
||||
)
|
||||
```
|
||||
|
||||
## Testing RLS
|
||||
|
||||
### In Power BI Desktop
|
||||
1. Modeling tab > View As
|
||||
2. Select role(s) to test
|
||||
3. Optionally specify user identity
|
||||
4. Verify data filtering
|
||||
|
||||
### Test Unexpected Values
|
||||
For dynamic RLS, test:
|
||||
- Valid users
|
||||
- Unknown users (should see nothing or error gracefully)
|
||||
- NULL/blank values
|
||||
|
||||
```dax
|
||||
// Defensive pattern - returns no data for unknown users
|
||||
IF(
|
||||
USERPRINCIPALNAME() IN VALUES(SecurityMapping[UserEmail]),
|
||||
[Region] IN SELECTCOLUMNS(...),
|
||||
FALSE()
|
||||
)
|
||||
```
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
### 1. RLS on Fact Tables Only
|
||||
**Problem:** Large table scans, poor performance
|
||||
**Solution:** Apply to dimension tables, let relationships propagate
|
||||
|
||||
### 2. Using LOOKUPVALUE Instead of Relationships
|
||||
**Problem:** Expensive, doesn't scale
|
||||
**Solution:** Create proper relationships, let filters flow
|
||||
|
||||
### 3. Expecting Intersection Behavior
|
||||
**Problem:** Multiple roles = UNION (additive), not intersection
|
||||
**Solution:** Design roles with union behavior in mind
|
||||
|
||||
### 4. Forgetting About DirectQuery
|
||||
**Problem:** RLS filters become WHERE clauses
|
||||
**Solution:** Ensure source database can handle the query patterns
|
||||
|
||||
### 5. Not Testing Edge Cases
|
||||
**Problem:** Users see unexpected data
|
||||
**Solution:** Test with: valid users, invalid users, multiple roles
|
||||
|
||||
## Bidirectional RLS
|
||||
|
||||
For bidirectional relationships with RLS:
|
||||
```
|
||||
Enable "Apply security filter in both directions"
|
||||
```
|
||||
|
||||
Only use when:
|
||||
- RLS requires filtering through many-to-many
|
||||
- Dimension-to-dimension security needed
|
||||
|
||||
**Caution:** Only one bidirectional relationship per path allowed.
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
- RLS adds WHERE clauses to every query
|
||||
- Complex DAX in filters hurts performance
|
||||
- Test with realistic user counts
|
||||
- Consider aggregations for large models
|
||||
|
||||
## Object-Level Security (OLS)
|
||||
|
||||
Restrict access to entire tables or columns:
|
||||
```
|
||||
// Via XMLA/TMSL - not available in Desktop UI
|
||||
```
|
||||
|
||||
Use for:
|
||||
- Hiding sensitive columns (salary, SSN)
|
||||
- Restricting entire tables
|
||||
- Combined with RLS for comprehensive security
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] RLS applied to dimension tables (not fact tables)
|
||||
- [ ] Filters propagate correctly through relationships
|
||||
- [ ] Dynamic RLS uses USERPRINCIPALNAME()
|
||||
- [ ] Tested with valid and invalid users
|
||||
- [ ] Edge cases handled (NULL, unknown users)
|
||||
- [ ] Performance tested under load
|
||||
- [ ] Role mappings documented
|
||||
- [ ] Workspace roles understood (Admins bypass RLS)
|
||||
103
skills/powerbi-modeling/references/STAR-SCHEMA.md
Normal file
103
skills/powerbi-modeling/references/STAR-SCHEMA.md
Normal file
@@ -0,0 +1,103 @@
|
||||
# Star Schema Design for Power BI
|
||||
|
||||
## Overview
|
||||
|
||||
Star schema is the optimal design pattern for Power BI semantic models. It organizes data into:
|
||||
- **Dimension tables**: Enable filtering and grouping (the "one" side)
|
||||
- **Fact tables**: Enable summarization (the "many" side)
|
||||
|
||||
## Table Classification
|
||||
|
||||
### Dimension Tables
|
||||
- Contain descriptive attributes for filtering/slicing
|
||||
- Have unique key columns (one row per entity)
|
||||
- Examples: Customer, Product, Date, Geography, Employee
|
||||
- Naming convention: Singular noun (`Customer`, `Product`)
|
||||
|
||||
### Fact Tables
|
||||
- Contain measurable, quantitative data
|
||||
- Have foreign keys to dimensions
|
||||
- Store data at consistent grain (one row per transaction/event)
|
||||
- Examples: Sales, Orders, Inventory, WebVisits
|
||||
- Naming convention: Business process noun (`Sales`, `Orders`)
|
||||
|
||||
## Design Principles
|
||||
|
||||
### 1. Separate Dimensions from Facts
|
||||
```
|
||||
BAD: Single denormalized "Sales" table with customer details
|
||||
GOOD: "Sales" fact table + "Customer" dimension table
|
||||
```
|
||||
|
||||
### 2. Consistent Grain
|
||||
Every row in a fact table represents the same thing:
|
||||
- Order line level (most common)
|
||||
- Daily aggregation
|
||||
- Monthly summary
|
||||
|
||||
Never mix grains in one table.
|
||||
|
||||
### 3. Surrogate Keys
|
||||
Add surrogate keys when source lacks unique identifiers:
|
||||
```m
|
||||
// Power Query: Add index column
|
||||
= Table.AddIndexColumn(Source, "CustomerKey", 1, 1)
|
||||
```
|
||||
|
||||
### 4. Date Dimension
|
||||
Always create a dedicated date table:
|
||||
- Mark as date table in Power BI
|
||||
- Include fiscal periods if needed
|
||||
- Add relative date columns (IsCurrentMonth, IsPreviousYear)
|
||||
|
||||
```dax
|
||||
Date =
|
||||
ADDCOLUMNS(
|
||||
CALENDAR(DATE(2020,1,1), DATE(2030,12,31)),
|
||||
"Year", YEAR([Date]),
|
||||
"Month", FORMAT([Date], "MMMM"),
|
||||
"MonthNum", MONTH([Date]),
|
||||
"Quarter", "Q" & FORMAT([Date], "Q"),
|
||||
"WeekDay", FORMAT([Date], "dddd")
|
||||
)
|
||||
```
|
||||
|
||||
## Special Dimension Types
|
||||
|
||||
### Role-Playing Dimensions
|
||||
Same dimension used multiple times (e.g., Date for OrderDate, ShipDate):
|
||||
- Option 1: Duplicate the table (OrderDate, ShipDate tables)
|
||||
- Option 2: Use inactive relationships with USERELATIONSHIP in DAX
|
||||
|
||||
### Slowly Changing Dimensions (Type 2)
|
||||
Track historical changes with version columns:
|
||||
- StartDate, EndDate columns
|
||||
- IsCurrent flag
|
||||
- Requires pre-processing in data warehouse
|
||||
|
||||
### Junk Dimensions
|
||||
Combine low-cardinality flags into one table:
|
||||
```
|
||||
OrderFlags dimension: IsRush, IsGift, IsOnline
|
||||
```
|
||||
|
||||
### Degenerate Dimensions
|
||||
Keep transaction identifiers (OrderNumber, InvoiceID) in fact table.
|
||||
|
||||
## Anti-Patterns to Avoid
|
||||
|
||||
| Anti-Pattern | Problem | Solution |
|
||||
|--------------|---------|----------|
|
||||
| Wide denormalized tables | Poor performance, hard to maintain | Split into star schema |
|
||||
| Snowflake (normalized dims) | Extra joins hurt performance | Flatten dimensions |
|
||||
| Many-to-many without bridge | Ambiguous results | Add bridge/junction table |
|
||||
| Mixed grain facts | Incorrect aggregations | Separate tables per grain |
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] Each table is clearly dimension or fact
|
||||
- [ ] Fact tables have foreign keys to all related dimensions
|
||||
- [ ] Dimensions have unique key columns
|
||||
- [ ] Date table exists and is marked
|
||||
- [ ] No circular relationship paths
|
||||
- [ ] Consistent naming conventions
|
||||
Reference in New Issue
Block a user