mirror of https://github.com/github/awesome-copilot.git synced 2026-04-18 06:05:55 +00:00

Anush 9637e1ab08 feat: Qdrant skills (#1412 )

2026-04-17 10:54:27 +10:00

4.4 KiB

name: qdrant-sliding-time-window
description: Guides sliding time window scaling in Qdrant. Use when someone asks 'only recent data matters', 'how to expire old vectors', 'time-based data rotation', 'delete old data efficiently', 'social media feed search', 'news search', 'log search with retention', or 'how to keep only last N months of data'.

Scaling with a Sliding Time Window

Use when only recent data needs fast search, as with social media posts, news articles, support tickets, logs, or job listings. Older data either becomes irrelevant or can tolerate slower access.

Three strategies: shard rotation (recommended), collection rotation (when per-period config differs), and filter-and-delete (simplest, for continuous cleanup).

Shard Rotation (Recommended)

Use when: data has natural time boundaries (daily, weekly, monthly). Preferred because queries span all time periods in one request without application-level fan-out. See Qdrant's user-defined sharding documentation.

1. Create a collection with user-defined sharding enabled
2. Create one shard key per time period (e.g., 2025-01, 2025-02, ..., 2025-06)
3. Ingest data into the current period's shard key
4. When a new period starts, create a new shard key and redirect writes
5. Delete the oldest shard key outside the retention window
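
The rotation steps above can be sketched as a small planning helper. This is a minimal sketch, not an official recipe: the monthly `YYYY-MM` key format, the helper names (`plan_rotation`, `rotate`), and the idea that the application tracks the set of existing shard keys are all assumptions. The `client` argument is assumed to be a `qdrant_client.QdrantClient` for a collection created with `sharding_method=models.ShardingMethod.CUSTOM`; only its `create_shard_key` and `delete_shard_key` methods are called.

```python
from datetime import date


def month_key(year: int, month: int) -> str:
    """Format a period as a shard key such as '2025-06' (assumed naming scheme)."""
    return f"{year:04d}-{month:02d}"


def shift_month(d: date, offset: int) -> tuple[int, int]:
    """Return (year, month) for the month `offset` months away from `d`."""
    index = d.year * 12 + (d.month - 1) + offset
    return index // 12, index % 12 + 1


def plan_rotation(today: date, retention_months: int, existing: set[str]):
    """Return (keys_to_create, keys_to_delete) for a monthly rotation.

    The window is the last `retention_months` months including the current
    one. Next month's key is also wanted, so it exists before writes need it.
    """
    oldest = month_key(*shift_month(today, -(retention_months - 1)))
    wanted = {
        month_key(*shift_month(today, off))
        for off in range(-(retention_months - 1), 2)  # window + next month
    }
    to_create = sorted(wanted - existing)
    # Zero-padded 'YYYY-MM' keys sort chronologically as plain strings.
    to_delete = sorted(k for k in existing if k < oldest)
    return to_create, to_delete


def rotate(client, collection: str, today: date, retention_months: int,
           existing: set[str]) -> None:
    """Apply the plan: create missing keys first, then drop expired ones."""
    to_create, to_delete = plan_rotation(today, retention_months, existing)
    for key in to_create:
        client.create_shard_key(collection, key)
    for key in to_delete:
        client.delete_shard_key(collection, key)
```

Creating before deleting keeps the write path valid at every point during the rotation. Running `rotate` a few days before each period boundary gives the pre-created key time to become available.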

Deleting a shard key reclaims all resources instantly (no fragmentation, no optimizer overhead)
Pre-create the next period's shard key before rotation to avoid write disruption
Use shard_key_selector at query time to search only specific periods for efficiency
Shard keys can be placed on specific nodes for hot/cold tiering
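
To illustrate restricting a query to specific periods, the sketch below maps a date range to monthly shard keys and passes them as `shard_key_selector`. The `YYYY-MM` key format and the helper names are assumptions; `client` is assumed to be a `qdrant_client.QdrantClient`, and every listed key is assumed to exist in the collection already.

```python
from datetime import date


def shard_keys_between(start: date, end: date) -> list[str]:
    """List monthly shard keys ('YYYY-MM') covering start..end inclusive."""
    first = start.year * 12 + (start.month - 1)
    last = end.year * 12 + (end.month - 1)
    return [f"{i // 12:04d}-{i % 12 + 1:02d}" for i in range(first, last + 1)]


def search_period(client, collection: str, vector: list[float],
                  start: date, end: date, limit: int = 10):
    """Query only the shard keys that overlap the requested date range."""
    return client.query_points(
        collection_name=collection,
        query=vector,
        shard_key_selector=shard_keys_between(start, end),
        limit=limit,
    )
```

Omitting `shard_key_selector` searches every shard key, which is the single-request behaviour that makes shard rotation attractive in the first place.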

Collection Rotation (Alias Swap)

Use when: you need per-period collection configuration (e.g., different quantization or storage settings). See Qdrant's collection aliases documentation.

1. Create one collection per time period, point a write alias at the newest
2. Query across all active collections in parallel, merge results client-side
3. When a new period starts, create the new collection and swap the write alias (see "Switch collection" in the aliases documentation)
4. Drop the oldest collection outside the window
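
The alias swap can be sketched as a single request body for Qdrant's REST endpoint `POST /collections/aliases`. The alias name `events_write` and the collection name are hypothetical, and the sketch assumes the alias already points at the previous period's collection.

```python
import json


def alias_swap_body(alias: str, new_collection: str) -> dict:
    """Build the body for POST /collections/aliases.

    Both actions travel in one request, so writers never observe a moment
    where the alias is missing.
    """
    return {
        "actions": [
            {"delete_alias": {"alias_name": alias}},
            {"create_alias": {
                "collection_name": new_collection,
                "alias_name": alias,
            }},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(json.dumps(alias_swap_body("events_write", "events_2025_07"), indent=2))
```

Writers keep using the alias name throughout, so no application redeploy is needed when the period changes.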

Trade-off vs shard rotation: allows per-collection config differences, but requires application-level fan-out and more operational overhead.
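
The application-level fan-out mentioned above might look like the following sketch. `search_one` is a hypothetical callable supplied by the application that searches one collection and returns `(point_id, score)` pairs. The merge assumes all collections use the same embedding model and a distance where a higher score is better (e.g., cosine); otherwise scores are not comparable across collections.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def fan_out_search(search_one, collections, limit=10):
    """Search every collection in parallel and merge the hits by score.

    Returns a list of (collection, point_id, score), best score first.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(collections))) as pool:
        per_collection = list(pool.map(search_one, collections))

    merged = (
        (score, name, point_id)
        for name, hits in zip(collections, per_collection)
        for point_id, score in hits
    )
    top = heapq.nlargest(limit, merged, key=lambda item: item[0])
    return [(name, point_id, score) for score, name, point_id in top]
```

Each per-collection search should request at least `limit` hits, since all of the final top results may come from a single collection.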

Filter-and-Delete

Use when: data arrives continuously without clear time boundaries, or you want the simplest setup.

1. Store a timestamp payload on every point and create a payload index on it (see Qdrant's payload index documentation)
2. Filter to the desired window at query time using a range condition (see the range filter documentation)
3. Periodically delete expired points using delete-by-filter (see the delete points documentation)
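
The three steps above can be sketched as REST request bodies. This is an illustration under assumptions: the payload field is named `timestamp`, it stores RFC 3339 strings, and it is indexed with the `datetime` schema. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def window_cutoff(now: datetime, retention_days: int) -> str:
    """RFC 3339 UTC timestamp marking the start of the retention window."""
    cutoff = now.astimezone(timezone.utc) - timedelta(days=retention_days)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


def index_body(field: str = "timestamp") -> dict:
    """Body for PUT /collections/{collection}/index."""
    return {"field_name": field, "field_schema": "datetime"}


def window_filter(cutoff: str, field: str = "timestamp") -> dict:
    """Value for the "filter" field of a query: keep points inside the window."""
    return {"must": [{"key": field, "range": {"gte": cutoff}}]}


def delete_body(cutoff: str, field: str = "timestamp") -> dict:
    """Body for POST /collections/{collection}/points/delete: drop expired points."""
    return {"filter": {"must": [{"key": field, "range": {"lt": cutoff}}]}}
```

Using `gte` for queries and `lt` for deletes with the same cutoff means no point is both visible to search and eligible for deletion.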

Run cleanup during off-peak hours in batches (10k-50k points) to avoid optimizer locks
Deletes are not free: tombstoned points degrade search until the optimizer compacts segments
Disk space is not reclaimed instantly (compaction is asynchronous)
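
One way to batch the cleanup is to scroll for expired point IDs and delete them a batch at a time. This is a sketch under assumptions: `client` is assumed to be a `qdrant_client.QdrantClient`, `expired_filter` a `qdrant_client.models.Filter` built by the caller, and the function name and `pause_s` parameter are hypothetical.

```python
import time


def delete_expired_in_batches(client, collection, expired_filter,
                              batch_size=10_000, pause_s=0.0):
    """Delete points matching `expired_filter` in fixed-size batches.

    Returns the number of points deleted.
    """
    deleted = 0
    while True:
        # Fetch only IDs; payloads and vectors are not needed for deletion.
        points, _next_offset = client.scroll(
            collection_name=collection,
            scroll_filter=expired_filter,
            limit=batch_size,
            with_payload=False,
            with_vectors=False,
        )
        if not points:
            return deleted
        # wait=True so the next scroll no longer returns these points.
        client.delete(
            collection_name=collection,
            points_selector=[p.id for p in points],
            wait=True,
        )
        deleted += len(points)
        time.sleep(pause_s)  # optional breathing room between batches
```

The loop always scrolls from the start instead of following the returned offset, because each batch is removed before the next scroll runs.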

Hot/Cold Tiers

Use when: recent data needs fast in-RAM search, older data should remain searchable at lower performance.

Shard rotation: place current shard key on fast-storage nodes, move older shard keys to cheaper nodes via shard placement. All queries still go through a single collection.
Collection rotation: keep the current collection's quantized vectors in RAM (always_ram: true in the quantization config) and move older collections to mmap/on-disk vectors. See Qdrant's quantization documentation.
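
For the collection-rotation variant, the two tiers could be expressed as create-collection bodies for `PUT /collections/{collection}`. This is a sketch only: the vector size, distance, and choice of int8 scalar quantization are placeholder assumptions, not recommendations from this guide.

```python
# Hot tier: original vectors stay in memory (the default) and the
# quantized copies are pinned in RAM for fast search.
HOT_COLLECTION = {
    "vectors": {"size": 768, "distance": "Cosine"},
    "quantization_config": {
        "scalar": {"type": "int8", "always_ram": True},
    },
}

# Cold tier: vectors and the HNSW index live on disk (memory-mapped),
# trading latency for a smaller RAM footprint.
COLD_COLLECTION = {
    "vectors": {"size": 768, "distance": "Cosine", "on_disk": True},
    "hnsw_config": {"on_disk": True},
    "quantization_config": {
        "scalar": {"type": "int8", "always_ram": False},
    },
}
```

Since each period has its own collection, a period is created with the hot settings and its storage settings are changed (or the collection rebuilt) once it ages out of the hot window.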

What NOT to Do

Do not use filter-and-delete for high-volume time-series with millions of daily deletes (use rotation instead)
Do not forget to index the timestamp field (range filters without an index cause full scans)
Do not use collection rotation when shard rotation would suffice (unnecessary fan-out complexity)
Do not drop a shard key or collection before verifying its period is fully outside the retention window
Do not skip pre-creating the next period's shard key or collection (write failures during rotation are hard to recover from)
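
The check that a period is fully outside the retention window before dropping it can be sketched as a small guard. The monthly `YYYY-MM` key format and the function names are assumptions; the guard compares the end of the period, not its start, against the cutoff.

```python
from datetime import datetime, timedelta, timezone


def period_end(key: str) -> datetime:
    """Exclusive end (UTC) of a monthly period key such as '2025-03'."""
    year, month = (int(part) for part in key.split("-"))
    if month == 12:
        return datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(year, month + 1, 1, tzinfo=timezone.utc)


def safe_to_drop(key: str, now: datetime, retention: timedelta) -> bool:
    """True only if every instant in the period is older than the window."""
    return period_end(key) <= now - retention
```

Calling `safe_to_drop` immediately before each shard key or collection deletion turns an off-by-one-period mistake into a skipped deletion instead of data loss.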
