Self-hosting
Why DataLook isn't self-hostable in V1, what that means for your data, and what to ask if you genuinely need it.
The honest answer up front: DataLook is not self-hostable in V1, and that's a deliberate choice — not an oversight or a paywall. This page explains the reasoning so you can decide whether the hosted product fits, and tells you exactly what to ask us if it doesn't.
Why not
We're a two-person team chasing a focused beta. Every hour spent making the stack portable — documenting four backing services, supporting arbitrary Postgres/ClickHouse/Redis versions, writing an upgrade path, and answering "it doesn't start on my distro" tickets — is an hour not spent making the product better for the founders we're building it for.
The architecture is also genuinely involved for something you'd run yourself:
- ClickHouse for the event store, with hot/cold storage tiering, materialized views, and load-bearing memory caps: mis-tune them and the app gets OOM-killed.
- Redis Streams as the ingest buffer, with a long-lived consumer process that has to stay healthy (XAUTOCLAIM, dead-letter handling, poison-row bisection).
- Postgres for app data, with schema migrations.
- A reverse proxy terminating TLS, plus a CDN edge in front.
That's four backing services and a worker, tuned to fit on one box. We run it so you don't have to think about any of it.
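To give a sense of what "poison-row bisection" involves, here is a minimal sketch of the general technique. It is not DataLook's worker code: `BatchRejected`, `insert_with_bisection`, and the fake insert function are hypothetical names made up for illustration. The idea is that when a batch insert is rejected, the worker splits the batch in half and retries each half until the bad rows are isolated and can be dead-lettered.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]


class BatchRejected(Exception):
    """Hypothetical error: the store refused the batch because of its contents."""


def insert_with_bisection(
    rows: Sequence[Row],
    insert_batch: Callable[[Sequence[Row]], None],
) -> Tuple[List[Row], List[Row]]:
    """Insert rows, isolating the ones that make a batch fail.

    Returns (inserted, poisoned). Only BatchRejected triggers bisection;
    transient errors such as timeouts propagate so the caller can retry.
    """
    if not rows:
        return [], []
    try:
        insert_batch(rows)
        return list(rows), []
    except BatchRejected:
        if len(rows) == 1:
            # A single row that still fails is the poison row.
            return [], list(rows)
        mid = len(rows) // 2
        ok_left, bad_left = insert_with_bisection(rows[:mid], insert_batch)
        ok_right, bad_right = insert_with_bisection(rows[mid:], insert_batch)
        return ok_left + ok_right, bad_left + bad_right


def fake_insert(batch: Sequence[Row]) -> None:
    """Stand-in for a real insert: rejects any batch with a non-integer timestamp."""
    for row in batch:
        if not isinstance(row.get("ts"), int):
            raise BatchRejected("bad timestamp")


rows = [{"ts": 1}, {"ts": "oops"}, {"ts": 3}, {"ts": 4}]
inserted, poisoned = insert_with_bisection(rows, fake_insert)
print(len(inserted), "inserted;", len(poisoned), "dead-lettered")
```

Even this toy version has to decide which errors count as "bad data" versus "try again later", which is the kind of operational detail a self-hosted install would push onto you.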
This is the same call Plausible and Fathom made early on: hosted-first, self-hosting later (if at all). It keeps a tiny team fast.
What this means for your data
Not self-hosting doesn't mean locked-in. The product is built so you can leave whenever you want:
- Export any time. Every table on screen exports to CSV or JSON, and there's a date-ranged bulk events endpoint (GET /api/export/events?from=…&to=…&format=csv|json). Your data is yours.
- Cookieless and PII-light by design. We don't store raw IPs, we strip PII keys server-side, and the visitor id is a rotating daily hash. There's less of your data sitting on our box than with most analytics tools to begin with.
- The SDK is readable. The browser script is an unobfuscated IIFE you can read end to end — see Security & CSP.
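If you want to script the bulk export rather than click through the menu, something like the following could work. Treat it as a sketch under assumptions: the path and the from, to, and format parameters come from the endpoint above, but the base URL, the ISO date format (YYYY-MM-DD), and the bearer-token header are guesses made for illustration, so check them against your account before relying on this.

```python
import shutil
from datetime import date
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_export_url(base_url: str, start: date, end: date, fmt: str = "csv") -> str:
    """Build the bulk events export URL for a date range."""
    if fmt not in ("csv", "json"):
        raise ValueError("format must be 'csv' or 'json'")
    if start > end:
        raise ValueError("start date must not be after end date")
    # Assumption: dates are passed as ISO 8601 calendar dates.
    query = urlencode({"from": start.isoformat(), "to": end.isoformat(), "format": fmt})
    return f"{base_url.rstrip('/')}/api/export/events?{query}"


def download_export(url: str, token: str, dest_path: str) -> None:
    """Stream the export to a local file.

    Assumption: the API accepts a bearer token in the Authorization header.
    """
    request = Request(url, headers={"Authorization": f"Bearer {token}"})
    with urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


# Placeholder host; substitute the one your dashboard is served from.
url = build_export_url("https://datalook.example", date(2024, 1, 1), date(2024, 1, 31), "json")
print(url)
```

Calling `download_export(url, token, "events.json")` on a schedule would give you a rolling off-site copy of your events.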
"But ad blockers / my CSP / my compliance team…"
Most of the reasons people reach for self-hosting are already solved without it:
| Concern | Hosted answer |
|---|---|
| Ad blockers hide my analytics | First-party proxy — serve the SDK and collector from your own domain. Nothing third-party to block. |
| Strict CSP | The proxy install collapses your directives to script-src 'self'; connect-src 'self'. See Security & CSP. |
| Data residency / compliance | Talk to us (below). EU-targeting + a DPA are on the v1.1 roadmap. |
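To make the CSP row concrete, here is a small sketch that assembles a Content-Security-Policy header value from its directives. The `build_csp` helper is hypothetical, and `https://analytics.example` is a placeholder for a third-party analytics origin, not a real DataLook host. It contrasts a policy that has to allow an external origin with the proxied policy described in the table.

```python
from typing import Dict, List


def build_csp(directives: Dict[str, List[str]]) -> str:
    """Join CSP directives into a single header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# Without a proxy, the policy must allow the (placeholder) third-party origin.
third_party = build_csp({
    "script-src": ["'self'", "https://analytics.example"],
    "connect-src": ["'self'", "https://analytics.example"],
})

# With the first-party proxy, script and collector share your own origin.
proxied = build_csp({
    "script-src": ["'self'"],
    "connect-src": ["'self'"],
})

print("Content-Security-Policy:", proxied)
```

The proxied policy never names an external host, which is why it tends to pass strict CSP reviews without an exception.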
If you genuinely need to self-host
Some teams have a hard requirement — air-gapped networks, a contractual data-residency clause, a security policy that forbids third-party script origins even when proxied. If that's you, email us with:
- The actual constraint — "compliance requires X" beats "we'd prefer it." It helps us understand whether the proxy install already satisfies it.
- Your scale — events/month and number of sites. Self-hosting only makes sense above a certain volume.
- Who operates it — do you have someone who'll own a ClickHouse instance, or are you hoping for one-click?
We're not against self-hosting forever — it's just not a V1 commitment. Enough well-scoped requests move it up the roadmap.
What's next
- Beat ad blockers without self-hosting: First-party proxy setup.
- Read exactly what the script does: Security & CSP.
- Pull your data out any time: the export menu on every dashboard table.