Kestra
Kestra is a universal open-source orchestrator that makes both scheduled and event-driven workflows easy. By bringing Infrastructure as Code best practices to data, process, and microservice orchestration, you can build reliable workflows and manage them with confidence.
Kestra is an open-source orchestrator designed to bring Infrastructure as Code (IaC) best practices to all workflows — from those orchestrating mission-critical operations, business processes, and data pipelines to simple Zapier-style automation.
In just a few lines of code, you can create a flow directly from the UI. Thanks to the declarative YAML interface for defining orchestration logic, business stakeholders can participate in the workflow creation process.
Kestra offers a versatile set of language-agnostic developer tools through an extensive YAML-based DSL (Domain-Specific Language), while also providing an intuitive user interface tailored for business professionals.
The YAML definition gets automatically adjusted any time you make changes to a workflow from the UI or via an API call. Therefore, the orchestration logic is always managed declaratively in code, even if some workflow components are modified in other ways (UI, CI/CD, Terraform, API calls).
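As a rough illustration of what such a declarative definition looks like, here is a minimal flow sketch. The `id`, `namespace`, message, and cron expression are placeholders, and the `io.kestra.plugin.core.*` type names assume a recent Kestra release (older versions used different package names).

```yaml
# Minimal flow sketch; all ids and values are placeholders.
id: hello_world
namespace: company.team

tasks:
  # A single task that writes a message to the execution logs.
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from a declarative flow

triggers:
  # Runs the flow every day at 09:00.
  - id: every_morning
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * *"
```

Editing the same flow in the UI would change this YAML rather than create a separate definition.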
Kestra API-first Philosophy
Built with an API-first philosophy, Kestra enables users to define and manage data pipelines through a simple YAML configuration file. This approach frees you from being tied to a specific client implementation, allowing for greater flexibility and easier integration with various tools and services.
More on Kestra Docs.
# History
The first public release was announced on 2022-02-01 in the blog post "Introducing Kestra first public release", with these main features:
- an orchestrator: Build a complex pipeline in a couple of minutes.
- a scheduler: Launch your flows whatever your needs!
- a rich UI: Create, run, and monitor all your flows with a real-time user interface.
- a data orchestrator: With its many plugins, build your data orchestration directly.
- cloud native & scalable: Scale to millions of executions without stress or hassle.
- an all-in-one platform: No need to use multiple tools to deliver a complete pipeline.
- a pluggable platform with the option to choose from several plugins or to build your own.
Summarizing their first blog post: Kestra started in 2019 with this initial commit by Ludovic Dehon. At that time, Kestra was at the proof-of-concept stage. Leroy Merlin had rejected Apache Airflow for their cloud-based data platform due to instability, performance issues, and lack of features.
A first public release doesn’t mean that Kestra was not production-ready. In fact, it has been used in production at Leroy Merlin since August 2020 (take a deeper look at the case study if you want more detail).
Challenged by a co-worker, the author decided to create a new open-source workflow management system. Over 30 months, they built Kestra, choosing Kafka, ElasticSearch, and Vue.js as core technologies.
Kestra was released as open-source under the Apache License. The author, drawing from experience with another open-source project, AKHQ, created a company to support Kestra’s development.
Kestra offers deep integration with tools and databases through plugins, simplifying complex tasks compared to bash commands. Despite being a first public release, Kestra is production-ready. It’s been used at Leroy Merlin since August 2020, managing thousands of flows and millions of tasks monthly.
# Seed Round
Kestra announced an $8 million Seed round on 2024-09-23: 🚀 Kestra Secures $8 Million to Simplify and Unify Orchestration for All Engineers.
# Company Behind
Kestra is a French company. It announced a $3 million round on 2023-10-05 (Article).
# Architecture
- Kestra’s architecture has been designed to offer a transparent separation between the orchestration and data processing capabilities.
- Kestra’s Executor is responsible for executing tasks and workflows without directly interacting with the user’s infrastructure.
- The Executor relies on Workers, which are stateless processes that carry out the computation of runnable tasks and polling triggers (like Sensors in Dagster).
- For privacy reasons, workers are the only components that interact with the user’s infrastructure, including the internal storage and external services.
- Kestra’s internal storage:
  - Data is stored in the user’s private bucket, not in Kestra’s internal database. The KV Store is based on internal storage, so its data is kept there as well.
# Java
Kestra is written in Java.
A comparison by Julien Hurault, using Docker Compose, with two Python orchestrators.
# Concepts
- Flowable Tasks: Control your orchestration logic.
- Runnable Tasks: Data processing tasks handled by the workers.
- Revision: Manage versions of flows.
- Secret: Store sensitive information securely.
- Key Value (KV) Store: Build stateful workflows with the KV Store.
- Pebble Templating Engine: Dynamically render variables, inputs and outputs.
- Blueprints: Ready-to-use examples designed to kickstart your workflow.
- Backfill: Backfills are replays of missed schedule intervals between a defined start and end date.
- Task Runners: Task Runners is an extensible, pluggable system capable of executing your tasks in arbitrary remote environments.
- Replay: Replay allows you to re-run a workflow execution from any chosen task run.
- Expression: Expressions to dynamically render various flow and task properties.
More on Concepts. Kestra First Try
# Features
- Kestra’s Realtime Triggers (Kestra Become the First Real-Time Orchestration Platform - YouTube):
  - React to events as they happen with millisecond latency. As soon as you add a Realtime Trigger to your workflow, Kestra starts an always-on thread that listens to the external system for new events. When a new event occurs, Kestra starts a workflow execution to process the event.
- KV Store:
  - Kestra is stateless by default, but with the KV Store you can save data beyond the input/output data that is stored in Kestra’s internal storage.
  - The KV Store allows you to persist any data produced in your workflows in a key-value format.
  - It is built on top of Kestra’s internal storage (which can be any cloud storage service like S3 or GCS):
    - There is no size limit, and you can set a lifetime with a TTL.
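To make the KV Store concrete, the sketch below writes a value with a TTL and reads it back with the `kv()` Pebble function. The flow id, key name, and 30-day TTL are illustrative assumptions, and the `io.kestra.plugin.core.kv.Set` type assumes Kestra v0.18 or later.

```yaml
# KV Store sketch; ids, key name, and TTL are placeholders.
id: kv_store_example
namespace: company.team

tasks:
  # Persist a value under a key; it expires after 30 days (ISO 8601 duration).
  - id: remember_last_run
    type: io.kestra.plugin.core.kv.Set
    key: last_run_date
    value: "{{ execution.startDate }}"
    ttl: P30D

  # Read the stored value back through the kv() expression function.
  - id: read_back
    type: io.kestra.plugin.core.log.Log
    message: "Last run: {{ kv('last_run_date') }}"
```

A later execution of the same flow, or another flow in the namespace, could read `last_run_date` the same way, which is what makes the workflow stateful.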
# SQL within YAML
Pipelines can be built with SQL within YAML (Related: Extending SQL for analytics).
Example from the Quick-Start data_engineering_pipeline
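As a rough sketch of what SQL embedded in a flow can look like (not a verbatim copy of the Quick-Start flow), the example below assumes the DuckDB plugin, a placeholder URL, and a JSON file with `brand` and `price` fields. The `fetchType` property assumes a recent plugin version; older versions used a different property to store results.

```yaml
# SQL-within-YAML sketch; URL, ids, and column names are assumptions.
id: sql_in_yaml_sketch
namespace: company.team

tasks:
  # Download a JSON file into Kestra's internal storage.
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://example.com/products.json   # placeholder URL

  # Run SQL over the downloaded file with an in-process DuckDB.
  - id: query
    type: io.kestra.plugin.jdbc.duckdb.Query
    inputFiles:
      products.json: "{{ outputs.download.uri }}"
    sql: |
      INSTALL json;
      LOAD json;
      SELECT brand, round(avg(price), 2) AS avg_price
      FROM read_json_auto('{{ workingDir }}/products.json')
      GROUP BY brand
      ORDER BY avg_price DESC;
    fetchType: STORE
```

The SQL stays plain SQL. The YAML around it only declares where the input comes from and what happens to the result.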
# Subflow
Similar to resources in Dagster.
Subflows allow you to build modular and reusable components, e.g. to encapsulate critical business logic. Such a Subflow can be used across multiple flows. Another example is sending error alerts to Slack and email. By using a Subflow, you can reuse these two tasks together in every flow that should send error notifications, instead of copying the individual tasks into each flow.
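The alerting case could be wired up roughly as follows. This is a sketch under assumptions: `send_error_alerts` is a hypothetical flow holding the Slack and email tasks, and its input names are made up for illustration.

```yaml
# Parent flow sketch; send_error_alerts and its inputs are hypothetical.
id: parent_flow
namespace: company.team

tasks:
  - id: do_work
    type: io.kestra.plugin.core.log.Log
    message: Doing the actual work

# Tasks listed under errors run only when the flow fails.
errors:
  - id: notify_on_failure
    type: io.kestra.plugin.core.flow.Subflow
    namespace: company.team
    flowId: send_error_alerts
    inputs:
      failed_flow: "{{ flow.id }}"
      execution_id: "{{ execution.id }}"
    wait: true
```

Every flow that needs the same notifications can repeat only this `errors` block, while the alerting tasks live in one place.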
# Local Sync (code-first with no-code)
To be no-code and code-first at the same time, you can actively sync to a local folder.
In docker-compose, we add the local folder to the volumes and configure the sync in KESTRA_CONFIGURATION.
See Local Flow Synchronization. The feature is built on the Micronaut Framework (see Micronaut Framework at Kestra - Micronaut Framework).
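A docker-compose fragment for this setup might look like the sketch below. The `./flows` host folder and the `/local_flows` container path are assumptions, and a real `KESTRA_CONFIGURATION` needs further settings (database, storage, and so on) that are omitted here.

```yaml
# docker-compose fragment; paths are placeholders, other settings omitted.
services:
  kestra:
    image: kestra/kestra:latest
    volumes:
      # Mount the local folder that holds the flow YAML files.
      - ./flows:/local_flows
    environment:
      KESTRA_CONFIGURATION: |
        micronaut:
          io:
            watch:
              # Watch the mounted folder and sync flow changes.
              enabled: true
              paths:
                - /local_flows
```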
# Task Runners
Task Runners offer a powerful way to offload compute-intensive tasks to remote environments.
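As a minimal sketch, a script task selects its runner through the `taskRunner` property. The example assumes the Docker task runner and a public Python image. Remote runners would presumably be chosen by changing the `type` and adding that runner's own settings.

```yaml
# Task runner sketch; ids and image are placeholders.
id: task_runner_sketch
namespace: company.team

tasks:
  - id: heavy_python
    type: io.kestra.plugin.scripts.python.Script
    containerImage: python:3.11-slim
    # The runner decides where the script executes; here, a Docker container.
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    script: |
      print("compute-intensive work would run here")
```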
# Releases
Some of the recent releases and features captured here.
# v0.23
This release brings a multi-panel editor, unit tests for flows, enhanced UI filters, customizable dashboards, Python dependency caching, and many new plugins.
What’s new in the Open-Source Edition:
- Multi-Panel Editor: Open, reorder, and manage multiple panels (Code, No-Code, Files, Docs, etc.) side by side.
- No-Code Forms: Build and edit flows using a redesigned form-based UI. No YAML required.
- New UI Filters: Faster filter autocompletion, now editable as plain text.
- Customizable Dashboards: Set a custom default dashboard, add new KPI charts, and adjust their widths.
- Python Dependency Caching: Automatic caching of script dependencies for faster execution.
- Git Sync for Dashboards & Apps: Version control your dashboards and apps using Git tasks.
- Plugin enhancements: New and updated plugins for Salesforce, HubSpot, Ollama, OpenAI, LangChain4j, GitHub Actions, Jenkins, Go scripts, InfluxDB, GraphQL, Databricks, Redis, and ServiceNow.
- Additional improvements: Plugin usage metrics, Pebble function autocompletion, worker info in execution details, and improved data backup/restore.
For Enterprise Edition users, this release additionally brings:
- Unit Tests for Flows (Beta): Define unit tests for flows with fixtures and assertions; run them directly from the UI or via API.
- Tenant-Based Storage Isolation: Store execution data in isolated storage per tenant, ensuring strict data separation.
- Salesforce plugin: Integrate Salesforce operations (create, update, delete, query) into workflows.
# v0.22
Here’s what’s new:
⚙️ Plugin Versioning to run multiple plugin versions in parallel
🔒 Read-only backends to securely reference externally stored secrets
📦 Cross-Namespace File Sharing for code and KV pair inheritance
🔔 afterExecution property to run tasks after flows finish
🚀 GraalVM tasks for fast and secure Python/JS/Ruby runtime
💾 Enhanced Queues taking up to 90% less database space
🏃 Execution processing got 3x to 10x faster
👤 LDAP Sync for streamlined enterprise user management
📡 New Log Exporters for Splunk, S3, GCS, Azure Blob, and a new Audit Log Shipper.
🔌 New plugins: Snowflake CLI, MariaDB, ServiceNow, and improved in-process DB plugins.
# v0.18.0
Kestra v0.18.0:
- an embedded Key-Value Store
- a new, improved way to manage your workflow execution Outputs
- new `ForEach` task
- new `SELECT` and `MULTISELECT` input types
- new tasks to upload, download or delete namespace files
- improved `Purge` mechanism along with a more flexible way of deleting Executions and related logs, metrics and files
- human-readable second-level `Schedule` trigger
- improved JSON and ION handling, along with new plugins to transform data with `JSONata` and `Grok`
- SCIM Directory Sync
- many enhancements to Secrets
- a more powerful Audit Logs interface
- new capabilities in Task Runners (now in GA!)
- improved Namespace Management (now available in OSS!)
- …and a bunch of new plugins!
Additionally, SQL Server is now available in preview as a Kestra EE backend database.
See the release blog post to learn more about all enhancements.
# Examples / Use-Cases
- My NEW HomeLab automation platform // Kestra - YouTube ( Part 2)
- Beyond Storing Data: How to Use DuckDB, MotherDuck and Kestra for ETL
- Integration with Modal: Kestra on LinkedIn: Join us for a live demo with Modal and discover how their serverless…
# Other Data Orchestrators
See Data Orchestrator and Kestra vs Dagster.
# Further Reads
- The Universal Data Orchestrator: The Heartbeat of Data Engineering | ssp.sh
- Universal Data Orchestrator in Action: Enterprise Best Practices | ssp.sh
Origin: Data Orchestrators
References:
Kestra, Open Source Declarative Data Orchestration, Kestra Inc
Created 2024-01-12