Performance Patterns
Asynchronous Request-Reply Pattern
Problem: Client applications expect synchronous responses, but back-end processing is asynchronous.
Solution: Decouple back-end processing from a front-end host when back-end processing must be asynchronous but the front end still needs a clear response.
When to Use:
- Long-running back-end operations
- Client applications can't wait for synchronous responses
- Offloading compute-intensive operations from web tier
Implementation Considerations:
- Return HTTP 202 (Accepted) with a Location header pointing at a status endpoint
- Implement status endpoint for clients to poll
- Consider webhooks for callback notifications
- Use correlation IDs to track requests
- Implement timeouts for long-running operations
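The 202-Accepted / status-poll flow above can be sketched in-process; the `AsyncJobBroker` class and its method names are hypothetical, and a real system would expose these operations as HTTP endpoints backed by a message queue:

```python
import uuid

class AsyncJobBroker:
    """Minimal sketch of the 202-Accepted / status-poll flow."""
    def __init__(self):
        self._jobs = {}

    def submit(self, payload):
        # Return immediately with a correlation ID and a status location,
        # mimicking an HTTP 202 Accepted response.
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"status": "Running", "result": None}
        return {"code": 202, "location": f"/status/{job_id}", "id": job_id}

    def complete(self, job_id, result):
        # Called by the back-end worker when processing finishes.
        self._jobs[job_id] = {"status": "Succeeded", "result": result}

    def status(self, job_id):
        # Clients poll this endpoint until the job reports completion.
        return self._jobs.get(job_id, {"status": "NotFound"})
```

The correlation ID returned by `submit` is what lets the client, the status endpoint, and the worker all refer to the same request.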
Cache-Aside Pattern
Problem: Applications repeatedly access the same data from a data store.
Solution: Load data on demand into a cache from a data store when needed.
When to Use:
- Frequently accessed, read-heavy data
- Data that changes infrequently
- Reducing load on primary data store
Implementation Considerations:
- Check cache before accessing data store
- Load data into cache on cache miss (lazy loading)
- Set appropriate cache expiration policies
- Implement cache invalidation strategies
- Handle cache failures gracefully (fallback to data store)
- Consider cache coherency in distributed scenarios
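A minimal sketch of the lazy-load-with-TTL flow, assuming an injected `load_fn` that stands in for the data store (the class and parameter names are illustrative, not a real library API):

```python
import time

class CacheAside:
    """Cache-aside sketch: check cache first, lazy-load on miss."""
    def __init__(self, load_fn, ttl_seconds=60):
        self._load = load_fn   # fallback loader that hits the data store
        self._ttl = ttl_seconds
        self._cache = {}       # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._cache.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]    # cache hit: serve without touching the store
        value = self._load(key)  # cache miss: lazy-load from the store
        self._cache[key] = (value, time.monotonic() + self._ttl)
        return value

    def invalidate(self, key):
        # Call after writes so stale data is not served until expiry.
        self._cache.pop(key, None)
```

If `load_fn` raises (the data store is down), the caller sees the failure directly; a production version would also decide whether to serve expired entries as a graceful fallback.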
CQRS (Command Query Responsibility Segregation) Pattern
Problem: Read and write workloads have different requirements and scaling needs.
Solution: Separate operations that read data from those that update data by using distinct interfaces.
When to Use:
- Read and write workloads have vastly different performance characteristics
- Different teams work on read and write sides
- Preventing update conflicts in collaborative scenarios where many users modify the same data
- Complex business logic differs between reads and writes
Implementation Considerations:
- Separate read and write models
- Use event sourcing to synchronize models
- Scale read and write sides independently
- Consider eventual consistency implications
- Implement appropriate security for commands vs queries
Index Table Pattern
Problem: Queries frequently reference fields that aren't indexed efficiently.
Solution: Create indexes over the fields in data stores that queries frequently reference.
When to Use:
- Improving query performance
- Supporting multiple query patterns
- Working with data stores that lack native secondary indexes
Implementation Considerations:
- Create separate tables/collections optimized for specific queries
- Maintain indexes asynchronously using events or triggers
- Consider storage overhead of duplicate data
- Handle index update failures and inconsistencies
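As an illustrative sketch (the `city` field and class names are invented for the example), a secondary lookup table can be maintained alongside the primary store and kept consistent on every write:

```python
class IndexedStore:
    """Primary store keyed by id, plus an index table over a query field."""
    def __init__(self):
        self._by_id = {}    # primary store: entity id -> record
        self._by_city = {}  # index table: city -> set of entity ids

    def put(self, entity_id, record):
        # Keep the index consistent: remove the old entry before re-indexing.
        old = self._by_id.get(entity_id)
        if old:
            self._by_city[old["city"]].discard(entity_id)
        self._by_id[entity_id] = record
        self._by_city.setdefault(record["city"], set()).add(entity_id)

    def find_by_city(self, city):
        # Query via the index instead of scanning the primary store.
        return [self._by_id[i] for i in self._by_city.get(city, set())]
```

Here the index is updated synchronously inside `put`; the asynchronous, event-driven maintenance mentioned above trades that simplicity for lower write latency at the cost of briefly stale indexes.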
Materialized View Pattern
Problem: Data is poorly formatted for required query operations.
Solution: Generate prepopulated views over the data in one or more data stores when the data isn't ideally formatted for query operations.
When to Use:
- Complex queries over normalized data
- Improving read performance for complex joins/aggregations
- Supporting multiple query patterns efficiently
Implementation Considerations:
- Refresh views asynchronously using background jobs or triggers
- Consider staleness tolerance for materialized data
- Balance between storage cost and query performance
- Implement incremental refresh where possible
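A sketch of an incrementally refreshed view, assuming an order stream as the source data (the per-customer totals domain is invented for illustration):

```python
class OrderTotalsView:
    """Materialized view: per-customer order totals, refreshed incrementally."""
    def __init__(self):
        self._totals = {}

    def apply(self, order):
        # Incremental refresh: fold each new order event into the view
        # instead of re-aggregating the whole order table on every query.
        cust = order["customer"]
        self._totals[cust] = self._totals.get(cust, 0.0) + order["amount"]

    def total_for(self, customer):
        # Reads hit the precomputed aggregate, not the normalized source.
        return self._totals.get(customer, 0.0)
```

In practice `apply` would run in a background job or trigger subscribed to change events, so a query can observe a slightly stale total between refreshes.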
Priority Queue Pattern
Problem: Some requests need faster processing than others.
Solution: Prioritize requests sent to services so that requests with a higher priority are processed more quickly.
When to Use:
- Providing different service levels to different customers
- Processing critical operations before less important ones
- Managing mixed workloads with varying importance
Implementation Considerations:
- Use message priority metadata
- Implement multiple queues for different priority levels
- Prevent starvation of low-priority messages
- Monitor queue depths and processing times per priority
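One way to sketch the multiple-queues approach with a starvation guard (the two-level design and the `fairness` parameter are simplifying assumptions, not a standard API): after a fixed number of consecutive high-priority messages, one low-priority message is served even if high-priority work remains.

```python
from collections import deque

class PriorityBroker:
    """Two priority queues with a simple anti-starvation rule."""
    def __init__(self, fairness=3):
        self._high = deque()
        self._low = deque()
        self._fairness = fairness   # high-priority messages served per low one
        self._high_streak = 0

    def publish(self, message, high_priority=False):
        (self._high if high_priority else self._low).append(message)

    def next_message(self):
        # Serve a waiting low-priority message once the high-priority
        # streak reaches the fairness threshold.
        starved = bool(self._low) and self._high_streak >= self._fairness
        if self._high and not starved:
            self._high_streak += 1
            return self._high.popleft()
        self._high_streak = 0
        if self._low:
            return self._low.popleft()
        if self._high:
            self._high_streak = 1
            return self._high.popleft()
        return None
```

Real message brokers usually express this as consumer weighting or message priority metadata; the sketch only shows why the starvation consideration matters.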
Queue-Based Load Leveling Pattern
Problem: Intermittent heavy loads can overwhelm services.
Solution: Use a queue as a buffer between a task and a service to smooth intermittent heavy loads.
When to Use:
- Protecting services from traffic spikes
- Decoupling producers and consumers
- Enabling asynchronous processing
Implementation Considerations:
- Choose appropriate queue technology (Azure Storage Queue, Service Bus, etc.)
- Monitor queue length to detect saturation
- Implement auto-scaling based on queue depth
- Set appropriate message time-to-live (TTL)
- Handle poison messages with dead-letter queues
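The buffering behavior can be sketched as a queue drained at a fixed service rate per tick (the `LevelingQueue` class and `service_rate` parameter are illustrative; a real deployment would use a managed queue service):

```python
from collections import deque

class LevelingQueue:
    """Buffer between a bursty producer and a fixed-rate consumer."""
    def __init__(self, service_rate=2):
        self._queue = deque()
        self._rate = service_rate   # messages the service handles per tick
        self.processed = []

    def enqueue(self, message):
        # Producers can burst freely; messages wait in the queue.
        self._queue.append(message)

    def tick(self):
        # The service drains at most `service_rate` messages per tick, so a
        # traffic spike lengthens the queue instead of overwhelming the service.
        for _ in range(min(self._rate, len(self._queue))):
            self.processed.append(self._queue.popleft())

    def depth(self):
        # Monitor this to detect saturation and drive auto-scaling.
        return len(self._queue)
```

A sustained queue depth that never drains back to zero is the saturation signal the monitoring consideration above refers to.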
Rate Limiting Pattern
Problem: Service consumption must be controlled to prevent resource exhaustion.
Solution: Control the consumption of resources by applications, tenants, or services to prevent resource exhaustion and throttling.
When to Use:
- Protecting backend services from overload
- Implementing fair usage policies
- Preventing one tenant from monopolizing resources
Implementation Considerations:
- Implement token bucket, leaky bucket, or fixed window algorithms
- Return HTTP 429 (Too Many Requests) when limits exceeded
- Provide Retry-After headers to clients
- Consider different limits for different clients/tiers
- Make limits configurable and monitorable
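A token-bucket sketch of the algorithm and response codes above; the clock is injected as a `now` argument so the example is deterministic (a real limiter would read a monotonic clock internally):

```python
class TokenBucket:
    """Token-bucket rate limiter returning 200 or 429 with Retry-After."""
    def __init__(self, capacity, refill_per_second):
        self._capacity = capacity
        self._rate = refill_per_second
        self._tokens = float(capacity)
        self._last = 0.0

    def check(self, now):
        # Refill tokens for the elapsed time, capped at bucket capacity.
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return {"code": 200}
        # Over the limit: tell the client when one token will be available.
        retry_after = (1 - self._tokens) / self._rate
        return {"code": 429, "retry_after": retry_after}
```

The `capacity` controls how large a burst is tolerated, while `refill_per_second` sets the sustained rate; per-tenant limits just mean one bucket per tenant or tier.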
Sharding Pattern
Problem: A single data store may have limitations in storage capacity and performance.
Solution: Divide a data store into a set of horizontal partitions or shards.
When to Use:
- Scaling beyond single database limits
- Improving query performance by reducing dataset size
- Distributing load across multiple databases
Implementation Considerations:
- Choose appropriate shard key (hash, range, or list-based)
- Avoid hot partitions by selecting balanced shard keys
- Handle cross-shard queries carefully
- Plan for shard rebalancing and splitting
- Consider operational complexity of managing multiple shards
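Hash-based shard routing can be sketched as follows; a stable hash (here SHA-256, rather than Python's per-process-salted `hash()`) keeps routing consistent across processes, and the class name is illustrative:

```python
import hashlib

class ShardedStore:
    """Hash-based sharding: the shard key is hashed to pick a partition."""
    def __init__(self, shard_count):
        self._shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key):
        # Stable hash of the shard key, modulo the number of shards.
        digest = hashlib.sha256(key.encode()).hexdigest()
        return int(digest, 16) % len(self._shards)

    def put(self, key, value):
        self._shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self._shards[self._shard_for(key)].get(key)

    def shard_sizes(self):
        # Watch for hot partitions: sizes should stay roughly balanced.
        return [len(s) for s in self._shards]
```

Note that simple modulo routing remaps most keys when `shard_count` changes, which is one reason rebalancing and consistent-hashing schemes appear in the planning considerations above.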
Throttling Pattern
Problem: Resource consumption must be limited to prevent system overload.
Solution: Control the consumption of resources used by an application, tenant, or service.
When to Use:
- Ensuring system operates within defined capacity
- Preventing resource exhaustion during peak load
- Enforcing SLA-based resource allocation
Implementation Considerations:
- Implement at API gateway or service level
- Use different strategies: reject requests, queue, or degrade service
- Return appropriate HTTP status codes (429, 503)
- Provide clear feedback to clients about throttling
- Monitor throttling metrics to adjust capacity
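A sketch of the degrade-then-reject strategy, using in-flight request count as the load signal (the soft/hard two-threshold design and the `LoadShedder` name are assumptions chosen for this example):

```python
class LoadShedder:
    """Concurrency-based throttle: full service below the soft limit,
    degraded service between soft and hard, 503 above the hard limit."""
    def __init__(self, soft_limit, hard_limit):
        self._soft = soft_limit
        self._hard = hard_limit
        self._in_flight = 0

    def admit(self):
        if self._in_flight >= self._hard:
            # Over capacity: reject rather than queue, with a clear status.
            return {"code": 503, "mode": "rejected"}
        self._in_flight += 1
        # Between the limits, signal callers to degrade (e.g. skip
        # non-essential work) instead of failing outright.
        mode = "full" if self._in_flight <= self._soft else "degraded"
        return {"code": 200, "mode": mode}

    def release(self):
        # Callers must release when a request finishes.
        self._in_flight = max(0, self._in_flight - 1)
```

The `mode` field is what distinguishes throttling from plain rate limiting here: the system keeps operating within capacity by shedding optional work before it starts rejecting requests.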