stream-processing-windowing-designer

Solid

Designs optimal windowing strategies for stream processing

AI & Automation 1,160 stars 71 forks Updated today MIT

Quality Score: 97/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 71
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Stream Processing Windowing Designer

## Overview

Designs optimal windowing strategies for stream processing. This skill provides expertise in window types, watermarks, and trigger strategies for streaming applications.

## Capabilities

- Window type selection (tumbling, sliding, session, global)
- Watermark strategy design
- Late data handling
- Trigger configuration
- Window aggregation optimization
- State management recommendations
- Exactly-once semantics configuration

## Input Schema

```json
{
  "useCase": "string",
  "eventTimeField": "string",
  "latencyRequirements": {
    "maxLatencyMs": "number",
    "allowedLateMs": "number"
  },
  "aggregations": ["object"]
}
```

## Output Schema

```json
{
  "windowConfig": {
    "type": "string",
    "size": "string",
    "slide": "string"
  },
  "watermarkConfig": "object",
  "triggerConfig": "object",
  "lateDataHandling": "object"
}
```

## Target Processes

- Streaming Pipeline
- Feature Store Setup

## Usage Guidelines

1. Define use case and event time field
2. Specify latency requirements
3. List aggregation operations needed
4. Consider late data arrival patterns

## Best Practices

- Choose window type based on business requirements
- Configure watermarks based on expected lateness
- Use appropriate triggers for latency vs completeness tradeoff
- Plan state management for long windows
- Test with realistic event time distributions
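
## Illustrative Sketch

The window types, watermark strategy, and late data handling listed above can be made concrete with a small engine-independent sketch. This is not the skill's implementation and not the API of any streaming framework. The function names (`tumbling_window`, `sliding_windows`, `classify`, `run`) are hypothetical, and the watermark is assumed to be the largest event time seen so far minus a fixed out-of-orderness bound. Real engines differ in details such as boundary conventions.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # (start_ms, end_ms), end is exclusive


def tumbling_window(event_ms: int, size_ms: int) -> Window:
    """Return the single fixed-size window that contains the event."""
    start = event_ms - (event_ms % size_ms)
    return (start, start + size_ms)


def sliding_windows(event_ms: int, size_ms: int, slide_ms: int) -> List[Window]:
    """Return every overlapping window that contains the event, newest first."""
    last_start = event_ms - (event_ms % slide_ms)
    starts = range(last_start, event_ms - size_ms, -slide_ms)
    return [(s, s + size_ms) for s in starts]


def classify(event_ms: int, watermark_ms: int, size_ms: int,
             allowed_late_ms: int) -> str:
    """Label an event relative to its tumbling window and the watermark.

    Assumed convention for this sketch: a window is complete once the
    watermark reaches its end, and its state is discarded once the
    watermark reaches end + allowed lateness.
    """
    _, end = tumbling_window(event_ms, size_ms)
    if watermark_ms < end:
        return "on_time"
    if watermark_ms < end + allowed_late_ms:
        return "late"      # still accepted, would update an emitted result
    return "dropped"       # window state is gone, route to a side output


def run(events: List[int], size_ms: int, out_of_order_ms: int,
        allowed_late_ms: int) -> List[str]:
    """Process events in arrival order with a bounded out-of-orderness watermark."""
    max_seen = None
    labels = []
    for ts in events:
        max_seen = ts if max_seen is None else max(max_seen, ts)
        watermark = max_seen - out_of_order_ms
        labels.append(classify(ts, watermark, size_ms, allowed_late_ms))
    return labels


if __name__ == "__main__":
    arrivals = [1000, 4000, 12000, 9000, 18000, 3000]  # event times in ms
    print(run(arrivals, size_ms=10000, out_of_order_ms=2000, allowed_late_ms=5000))
```

With these made-up arrival times, the event at 9000 ms arrives after the watermark has reached the end of its window and is labeled `late`, while the event at 3000 ms arrives after the allowed lateness has also expired and is labeled `dropped`. Raising `out_of_order_ms` delays results but reduces late events, which is the latency versus completeness tradeoff named in the best practices.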

Details

Author: a5c-ai
Repository: a5c-ai/babysitter
Created: 4 months ago
Last Updated: today
Language: JavaScript
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

Web & Frontend Listed

streaming-patterns

Kafka, Flink, Kinesis, and Spark Structured Streaming design — consumer groups, partitioning, exactly-once semantics, lag monitoring, windowing, and late-arriving data. Use this skill whenever the user needs real-time or near-real-time data processing, is redesigning a batch pipeline into streaming, asks about event-driven architectures, or mentions Kafka topics, consumer lag, checkpointing, watermarks, or stream-table joins. Also trigger when the user says batch is "too slow", stakeholders want "live" dashboards, or the pipeline needs to react to events as they happen rather than on a schedule. If latency requirements are under a few minutes, this skill should be active.

0 Updated 5 days ago
Methasit-Pun
AI & Automation Solid

window-function-generator

Generate window function generator operations. Auto-activating skill for Data Analytics. Triggers on: window function generator, window function generator Part of the Data Analytics skill category. Use when working with window function generator functionality. Trigger with phrases like "window function generator", "window generator", "window".

2,274 Updated today
jeremylongshore
AI & Automation Solid

kafka-topic-designer

Designs and optimizes Apache Kafka topics and configurations

1,160 Updated today
a5c-ai
Data & Documents Listed

streaming-data

Build event streaming and real-time data pipelines with Kafka, Pulsar, Redpanda, Flink, and Spark. Covers producer/consumer patterns, stream processing, event sourcing, and CDC across TypeScript, Python, Go, and Java. When building real-time systems, microservices communication, or data integration pipelines.

368 Updated 5 months ago
ancoleman
Data & Documents Listed

pipeline-architect

Designs and implements data pipelines: ETL/ELT, streaming, batch processing, schema migrations, and data warehouse architecture. Covers Kafka, Airflow, dbt, Spark, ClickHouse, BigQuery, Snowflake, Redis Streams, and more. Use this skill when the user asks about data pipelines, ETL jobs, data transformation, streaming setup, data warehouse design, CDC, schema migrations, data quality checks, or anything involving moving data from source to target. Also triggers on "build a pipeline," "migrate data from X to Y," "set up streaming," "design my data warehouse," or "data quality is bad, help me fix it."

1 Updated 4 days ago
mturac