Dagster Monitoring with Prometheus: From System Metrics to Custom Assets (Part 1)
The Monitoring Challenge
Maintaining a self-hosted Dagster instance comes with its own set of challenges. A distributed application like Dagster needs comprehensive monitoring to identify bottlenecks, record performance metrics, track execution times and materialization durations, and capture other vital operational metrics alongside business-specific parameters.
When running applications on Kubernetes, Prometheus has become practically the default choice for monitoring.
Prometheus is an open-source monitoring and alerting toolkit that operates on a pull-based model, actively scraping metrics from /metrics endpoints at regular intervals and storing them in a time-series database.
System-Level Monitoring: The Easy Part
Monitoring Dagster's system components like dagster-webserver, dagster-daemon, or even the PostgreSQL database is relatively straightforward when running in Kubernetes. Kubernetes exposes comprehensive system-level metrics for these services through tools like kubelet, cAdvisor, and kube-state-metrics. This functionality comes out of the box without any additional configuration.
The Real Challenge: Business Logic Monitoring
The real challenge emerges when we need to monitor Dagster jobs and assets that sit close to the business logic of your applications. This is where traditional Prometheus scraping patterns break down due to the fundamental nature of data processing workloads.
By nature, Dagster jobs can be short-lived, sporadic, dynamic, and distributed. The conventional Prometheus approach of scraping metrics endpoints at regular intervals (typically 15-60 seconds) simply doesn't work for ephemeral processes that may start and complete between scrape intervals.
Without proper monitoring of job-level metrics, you lose visibility into asset materialization times, job success/failure rates, resource consumption during data processing, data quality metrics, and business KPIs tied to your data pipelines.
Push-Based Metrics with Prometheus
Despite these challenges, Prometheus remains effective for monitoring Dagster jobs through push functionality using specialized gateways. Dagster provides built-in support for the prometheus_client library, allowing you to instrument your code and generate Prometheus-ready metrics directly within your application logic.
The workflow is simple: Dagster jobs push metrics to a gateway, and Prometheus scrapes the gateway on its regular schedule, storing everything in its time-series database for analysis.
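At its core, the push side is only a few prometheus_client calls. Here is a minimal sketch, assuming the gateway is reachable at localhost:9091 (the job name and metric name are placeholders to adapt):

```python
from prometheus_client import CollectorRegistry, Counter, push_to_gateway

# Dedicated registry so only this job's metrics are pushed
registry = CollectorRegistry()
runs_total = Counter(
    "dagster_job_runs_total",            # placeholder metric name
    "Number of completed Dagster runs",
    registry=registry,
)
runs_total.inc()

# Push to the aggregation gateway; Prometheus then scrapes the gateway
push_to_gateway("localhost:9091", job="dagster", registry=registry)
```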
Let's explore how to implement this approach with a concrete example that demonstrates instrumenting a Dagster asset with Prometheus metrics.
Setting Up Dagster Metrics with Prometheus
First, we need to stand up a push gateway that our Dagster jobs can push metrics to and that Prometheus can scrape. Use an aggregation gateway like prom-aggregation-gateway instead of the official Prometheus Pushgateway: the official version doesn't aggregate metrics and may overwrite data from concurrent jobs, while an aggregation gateway properly combines metrics from multiple sources, preventing data loss when Dagster runs jobs in parallel.
docker-compose.yaml
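A minimal sketch of such a compose file, assuming the ghcr.io/zapier/prom-aggregation-gateway image (image tags and ports are assumptions to adjust for your environment), might look like this:

```yaml
services:
  prom-aggregation-gateway:
    image: ghcr.io/zapier/prom-aggregation-gateway:latest
    ports:
      - "9091:80"   # the gateway's listen port may differ by version; check its docs

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
```

The referenced prometheus.yml then only needs a scrape job pointing at the gateway, for example:

```yaml
scrape_configs:
  - job_name: dagster-pushed-metrics
    scrape_interval: 15s
    static_configs:
      - targets: ["prom-aggregation-gateway:80"]
```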
Instrumenting Dagster Assets with Prometheus Metrics
Now that our Prometheus setup is ready, let's instrument the code using the prometheus_client library to send valuable metrics from our Dagster assets. This involves adding metric collection directly into your asset definitions to capture key performance indicators like execution time, data quality metrics, and business-specific measurements.
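The exact metrics depend on your pipeline, but a hedged sketch of an instrumented asset, with placeholder metric names, labels, and gateway address, could look like this:

```python
import time

from dagster import asset
from prometheus_client import (
    CollectorRegistry,
    Counter,
    Gauge,
    Histogram,
    push_to_gateway,
)

GATEWAY_URL = "prom-aggregation-gateway:80"  # placeholder; point at your gateway


@asset
def daily_orders():
    # Fresh registry per run so each push carries only this run's metrics
    registry = CollectorRegistry()
    duration = Histogram(
        "dagster_asset_duration_seconds",
        "Time spent materializing the asset",
        ["asset_name"],
        registry=registry,
    )
    rows = Gauge(
        "dagster_asset_rows_processed",
        "Rows produced by the asset",
        ["asset_name"],
        registry=registry,
    )
    runs = Counter(
        "dagster_asset_runs_total",
        "Materialization attempts by status",
        ["asset_name", "status"],
        registry=registry,
    )

    start = time.monotonic()
    try:
        data = [{"order_id": i} for i in range(100)]  # stand-in for real business logic
        rows.labels("daily_orders").set(len(data))
        runs.labels("daily_orders", "success").inc()
        return data
    except Exception:
        runs.labels("daily_orders", "failure").inc()
        raise
    finally:
        duration.labels("daily_orders").observe(time.monotonic() - start)
        # The aggregation gateway merges metrics pushed by concurrent runs
        push_to_gateway(GATEWAY_URL, job="dagster_assets", registry=registry)
```

Creating a fresh CollectorRegistry per run keeps each push scoped to that run's metrics, and the aggregation gateway takes care of combining values from parallel runs.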
Materialise the asset to make sure everything is working. Once you trigger the asset materialization, you should see the metrics being pushed to the gateway.

From the Prometheus side, we can see the same metrics by checking our aggregation gateway target. Navigate to your Prometheus web interface and verify that the gateway is being scraped successfully and the metrics are available for querying.

Now that we've confirmed the data is flowing correctly, we can start building our monitoring dashboard using Grafana. The various types of metrics we defined in our instrumentation code - counters, gauges, and histograms - will allow us to create diverse visualisations in Grafana, from simple time series graphs tracking asset execution times to complex heatmaps showing performance distributions across different jobs and time periods.
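As a starting point, and assuming the placeholder metric names from the instrumentation sketch above, Grafana panels could be backed by PromQL queries such as:

```
# p95 asset materialization time (histogram)
histogram_quantile(0.95, sum(rate(dagster_asset_duration_seconds_bucket[5m])) by (le, asset_name))

# materialization success/failure rate over the last hour (counter)
sum(rate(dagster_asset_runs_total[1h])) by (status)

# most recent row count per asset (gauge)
dagster_asset_rows_processed
```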

What's Next: Monitoring Your dbt Assets
This covers the foundation of monitoring your Dagster infrastructure and custom assets with Prometheus. But we're only halfway there - the most critical piece is still missing.
Remember, 50% of Dagster users rely on dbt for their data transformations, yet monitoring these SQL-based assets presents unique challenges. You can't simply add prometheus_client calls to your dbt models like you can with Python assets. So how do you capture execution times, row counts, and performance metrics from your dbt transformations?
In our next post, we'll dive deep into monitoring dbt assets with Prometheus, exploring how to extract valuable metrics from dbt's artifacts, instrument your SQL transformations, and create comprehensive dashboards that give you visibility into your entire data pipeline - from Python assets to dbt models.
Stay tuned to complete your Dagster observability stack!