mirror of
https://github.com/github/awesome-copilot.git
synced 2026-02-20 02:15:12 +00:00
5.3 KiB
5.3 KiB
Performance Optimization for Power BI Models
Data Reduction Techniques
1. Remove Unnecessary Columns
- Only import columns needed for reporting
- Remove audit columns (CreatedBy, ModifiedDate) unless required
- Remove duplicate/redundant columns
column_operations(operation: "List", filter: { tableNames: ["Sales"] })
// Review and remove unneeded columns
2. Remove Unnecessary Rows
- Filter historical data to relevant period
- Exclude cancelled/void transactions if not needed
- Apply filters in Power Query (not in DAX)
3. Reduce Cardinality
High cardinality (many unique values) impacts:
- Model size
- Refresh time
- Query performance
Solutions:
| Column Type | Reduction Technique |
|---|---|
| DateTime | Split into Date and Time columns |
| Decimal precision | Round to needed precision |
| Text with patterns | Extract common prefix/suffix |
| High-precision IDs | Use surrogate integer keys |
4. Optimize Data Types
| From | To | Benefit |
|---|---|---|
| DateTime | Date (if time not needed) | 8 bytes to 4 bytes |
| Decimal | Fixed Decimal | Better compression |
| Text with numbers | Whole Number | Much better compression |
| Long text | Shorter text | Reduces storage |
5. Group and Summarize
Pre-aggregate data when detail not needed:
- Daily instead of transactional
- Monthly instead of daily
- Consider aggregation tables for DirectQuery
Column Optimization
Prefer Power Query Columns Over Calculated Columns
| Approach | When to Use |
|---|---|
| Power Query (M) | Can be computed at source, static values |
| Calculated Column (DAX) | Needs model relationships, dynamic logic |
Power Query columns:
- Load faster
- Compress better
- Use less memory
Avoid Calculated Columns on Relationship Keys
DAX calculated columns in relationships:
- Cannot use indexes
- Generate complex SQL for DirectQuery
- Hurt performance significantly
Use COMBINEVALUES for multi-column relationships:
// If you must use calculated column for composite key
CompositeKey = COMBINEVALUES(",", [Country], [City])
Set Appropriate Summarization
Prevent accidental aggregation of non-additive columns:
column_operations(
operation: "Update",
definitions: [{
tableName: "Product",
name: "UnitPrice",
summarizeBy: "None"
}]
)
Relationship Optimization
1. Minimize Bidirectional Relationships
Each bidirectional relationship:
- Increases query complexity
- Can create ambiguous paths
- Reduces performance
2. Avoid Many-to-Many When Possible
Many-to-many relationships:
- Generate more complex queries
- Require more memory
- Can produce unexpected results
3. Reduce Relationship Cardinality
Keep relationship columns low cardinality:
- Use integer keys over text
- Consider higher-grain relationships
DAX Optimization
1. Use Variables
// GOOD - Calculate once, use twice
Sales Growth =
VAR CurrentSales = [Total Sales]
VAR PriorSales = [PY Sales]
RETURN DIVIDE(CurrentSales - PriorSales, PriorSales)
// BAD - Recalculates [Total Sales] and [PY Sales]
Sales Growth =
DIVIDE([Total Sales] - [PY Sales], [PY Sales])
2. Avoid FILTER with Entire Tables
// BAD - Iterates entire table
Sales High Value =
CALCULATE([Total Sales], FILTER(Sales, Sales[Amount] > 1000))
// GOOD - Uses column reference
Sales High Value =
CALCULATE([Total Sales], Sales[Amount] > 1000)
3. Use KEEPFILTERS Appropriately
// Respects existing filters
Sales with Filter =
CALCULATE([Total Sales], KEEPFILTERS(Product[Category] = "Bikes"))
4. Prefer DIVIDE Over Division Operator
// GOOD - Handles divide by zero
Margin % = DIVIDE([Profit], [Sales])
// BAD - Errors on zero
Margin % = [Profit] / [Sales]
DirectQuery Optimization
1. Minimize Columns and Tables
DirectQuery models:
- Query source for every visual
- Performance depends on source
- Minimize data retrieved
2. Avoid Complex Power Query Transformations
- Transforms become subqueries
- Native queries are faster
- Materialize at source when possible
3. Keep Measures Simple Initially
Complex DAX generates complex SQL:
- Start with basic aggregations
- Add complexity gradually
- Monitor query performance
4. Disable Auto Date/Time
For DirectQuery models, disable auto date/time:
- Creates hidden calculated tables
- Increases model complexity
- Use explicit date table instead
Aggregations
User-Defined Aggregations
Pre-aggregate fact tables for:
- Very large models (billions of rows)
- Hybrid DirectQuery/Import
- Common query patterns
table_operations(
operation: "Create",
definitions: [{
name: "SalesAgg",
mode: "Import",
mExpression: "..."
}]
)
Performance Testing
Use Performance Analyzer
- Enable in Power BI Desktop
- Start recording
- Interact with visuals
- Review DAX query times
Monitor with DAX Studio
External tool for:
- Query timing
- Server timings
- Query plans
Validation Checklist
- Unnecessary columns removed
- Appropriate data types used
- High-cardinality columns addressed
- Bidirectional relationships minimized
- DAX uses variables for repeated expressions
- No FILTER on entire tables
- DIVIDE used instead of division operator
- Auto date/time disabled for DirectQuery
- Performance tested with representative data