Enterprise Authentication for Self-Hosted Data Infrastructure
Open-source products like Dagster and Airbyte come with comprehensive installation guides and automated setup scripts, making it relatively easy to get started and see your first pipelines or replication jobs running in your environment. However, things become more challenging when you need to productionize the installation, secure it properly, and integrate it with your existing infrastructure while making it available to internal users.
One critical step in this process is managing user access to the web UI of these products. Both applications feature beautiful interfaces, but only their cloud versions include fully-featured user authentication.
There are several approaches to adding user authentication to any application, ranging from building custom authentication solutions to implementing basic auth through your ingress controller.
Since both products run in Kubernetes, most ingress controllers can provide basic authentication capabilities out of the box. While this solution is simple, it doesn't scale with organizational needs and becomes manual and difficult to manage over time. Fortunately, there are numerous open-source projects that can help address this challenge.
Our recommended solution: Authentik
Our choice is Authentik (https://goauthentik.io/), an open-source Identity Provider (IdP) and Single Sign-On (SSO) platform. If you're familiar with enterprise products like Keycloak or ForgeRock, Authentik serves as a lean alternative to these heavyweight solutions. In our experience it is comparably feature-rich yet more modern, easier to manage, and more intuitive, while remaining secure and enterprise-ready.
How it all works together
Since all products are designed to run in Kubernetes, integrating them is straightforward. The principle is simple: the ingress controller handles most of the heavy lifting by checking if requests are authenticated. If authentication is present, it verifies the credentials against Authentik. If no authentication is provided, it redirects users to the corresponding Identity Provider's login interface.
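The allow-or-redirect decision described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Authentik's or any ingress controller's actual code: the function names, the in-memory session store, and the login URL are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical login URL; a real deployment would point at its own Authentik host.
LOGIN_URL = "https://auth.example.com/login"


@dataclass
class AuthResult:
    """What the identity provider tells the ingress about a request."""
    status: int                                   # 200 = allow, 302 = go to login
    headers: dict = field(default_factory=dict)   # extra headers for the ingress


def check_with_idp(cookies: dict, sessions: dict) -> AuthResult:
    """Stand-in for the IdP check: look up the session cookie."""
    user = sessions.get(cookies.get("session", ""))
    if user is None:
        # No valid session: ask the browser to visit the login page.
        return AuthResult(302, {"Location": LOGIN_URL})
    return AuthResult(200, {"X-authentik-username": user})


def ingress_handle(path: str, cookies: dict, sessions: dict) -> tuple:
    """Ingress side: forward on success, otherwise relay the redirect."""
    result = check_with_idp(cookies, sessions)
    if result.status == 200:
        user = result.headers["X-authentik-username"]
        return (200, f"forwarded {path} for {user}")
    return (result.status, result.headers["Location"])
```

With `sessions = {"abc": "alice"}`, a request carrying the cookie `session=abc` is forwarded, while a request without a cookie receives the redirect to the login page. The point of the design is that the application behind the ingress never has to implement login itself.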
Integration flexibility
Authentik can function as a standalone identity provider where you create users, groups, enable self-registration, and manage user accounts directly within the platform. However, the real value comes from integrating with your existing company identity provider. Authentik supports numerous providers out of the box, including Azure AD and Google OAuth.
Similarly, you can integrate with any OAuth provider your organization currently uses.
Architecture Overview
The authentication architecture consists of the following key components:
External Identity Provider
Azure Active Directory serves as the primary identity provider
Contains two user groups:
dagster-users - Users authorized to access Dagster
airbyte-users - Users authorized to access Airbyte
Kubernetes Cluster Components
Authentik Deployment - Open-source IdP running within the cluster
Ingress Controller - Entry point for all external traffic
Dagster Application - Data orchestration platform
Airbyte Application - Data integration platform
Authentication Flow

Step 1: Initial User Request
User attempts to access either Dagster or Airbyte application
Request hits the Ingress Controller (entry point to the cluster)
Step 2: Authentication Check
Ingress Controller forwards the request to Authentik for authentication verification
Authentik checks if the user has valid authentication credentials
Step 3: Authentication Decision
If user is NOT authenticated:
5. Authentik presents login page to the user.
6. User authenticates with their corporate credentials (Azure AD)
7. Azure AD returns the user object, including attributes such as group memberships (dagster-users or airbyte-users)
8. Upon successful authentication, Azure AD returns the user to Authentik with an authentication token
If user IS authenticated:
5. Authentik validates the existing session
Step 4: Authorization and Access
Authentik confirms user authentication and returns success response to Ingress Controller
Authentik includes the X-authentik-groups header containing the user's group memberships (e.g., dagster-users, airbyte-users)
Ingress Controller examines the X-authentik-groups header to verify the user has appropriate permissions for the requested application
If group membership is valid, Ingress Controller forwards the original request to the appropriate application (Dagster or Airbyte)
User gains access to the requested application with their authorized permissions
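The group check in Step 4 can be sketched as follows. This is a minimal model under stated assumptions: the host names are hypothetical, and the header value is assumed to be a pipe-separated list of group names, which you should verify against your Authentik version.

```python
# Hypothetical mapping from requested host to the group allowed to use it.
REQUIRED_GROUP = {
    "dagster.example.com": "dagster-users",
    "airbyte.example.com": "airbyte-users",
}


def parse_groups(header_value: str) -> set:
    """Split the X-authentik-groups value into group names.

    Assumes a pipe-separated list such as "dagster-users|airbyte-users".
    """
    return {g.strip() for g in header_value.split("|") if g.strip()}


def is_authorized(host: str, headers: dict) -> bool:
    """Return True if the user's groups include the one required for host."""
    required = REQUIRED_GROUP.get(host)
    if required is None:
        return False  # unknown application: deny by default
    groups = parse_groups(headers.get("X-authentik-groups", ""))
    return required in groups
```

A user whose header is `dagster-users` would pass the check for the Dagster host and fail it for the Airbyte host. Denying unknown hosts by default keeps a newly exposed application closed until a group is explicitly mapped to it.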
Configuration Examples
For our cluster, we use Istio as both the service mesh and the ingress controller. Authentik has excellent support for Envoy, which makes it easy to integrate not just with Istio but with any other Envoy-based ingress solution.
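The setup has two parts: registering Authentik as an external authorization provider in the Istio mesh configuration, and then creating an AuthorizationPolicy that applies it to incoming requests.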
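As a rough sketch of what these two pieces might look like, the Python below builds them as dictionaries and prints them as JSON, which kubectl also accepts. The provider name, service address, port, path prefix, header lists, gateway label, and host names are all assumptions for illustration, not values from our cluster; check them against the Authentik and Istio documentation before use.

```python
import json

PROVIDER_NAME = "authentik"  # assumed name; must be identical in both objects


def extension_provider(service: str, port: int) -> dict:
    """Mesh config fragment that registers an external HTTP authorizer."""
    return {
        "extensionProviders": [{
            "name": PROVIDER_NAME,
            "envoyExtAuthzHttp": {
                "service": service,  # in-cluster address of the Authentik outpost
                "port": port,
                # Assumed Authentik endpoint for Envoy-style checks.
                "pathPrefix": "/outpost.goauthentik.io/auth/envoy",
                "includeRequestHeadersInCheck": ["cookie"],
                "headersToUpstreamOnAllow": ["x-authentik-*"],
                "headersToDownstreamOnAllow": ["set-cookie"],
            },
        }]
    }


def authorization_policy(hosts: list) -> dict:
    """AuthorizationPolicy that delegates the listed hosts to the provider."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "authentik-policy", "namespace": "istio-system"},
        "spec": {
            # Assumed label of the Istio ingress gateway workload.
            "selector": {"matchLabels": {"istio": "ingressgateway"}},
            "action": "CUSTOM",
            "provider": {"name": PROVIDER_NAME},
            "rules": [{"to": [{"operation": {"hosts": hosts}}]}],
        },
    }


if __name__ == "__main__":
    mesh = extension_provider("authentik.authentik.svc.cluster.local", 9000)
    policy = authorization_policy(["dagster.example.com", "airbyte.example.com"])
    print(json.dumps(mesh, indent=2))
    print(json.dumps(policy, indent=2))
```

The key detail is that the `CUSTOM` action refers to the provider by name, so the name in the policy has to match the one registered in the mesh configuration.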
In the same way, it's straightforward to set up the same flow with Nginx or Traefik.
Conclusion
Securing your self-hosted data stack doesn't have to break the bank or overwhelm your team. Authentik provides enterprise-grade authentication that integrates seamlessly with your existing setup, giving you the security controls you need without the complexity.
Your data teams get secure access to the tools they need, while your organization maintains centralized control over who can access what. It's a win-win solution that scales with your business.