Today's Deep-Dive: PostHog
Ep. 283

Episode description

PostHog is an open-source platform aiming to be a single source of truth for product development, consolidating tools like product analytics, session recording, feature flagging, and A-B testing into one bundle. It operates on a philosophy of being a ‘product OS,’ providing developers with a comprehensive view of their product data by integrating information from various sources, including financial and support data, not just in-app events. This unified approach aims to reduce the risks associated with managing multiple separate tools and data silos.

The platform’s synergy lies in how these tools share underlying event data immediately, allowing for seamless transitions between tasks like setting up feature flags and running A-B tests, or debugging errors by directly linking them to session replays. PostHog also offers data warehouse capabilities to ingest and transform external data, sending it to over 25 downstream tools, ensuring flexibility. For AI-driven applications, it provides specialized metrics like API call traces, token counts, and cost per query, directly linking engineering costs to user behavior.

PostHog is accessible through a generous free tier on its cloud-hosted version, with usage-based pricing kicking in only after limits are exceeded, and a self-hosted option available for smaller-scale use. The company emphasizes radical transparency, open-sourcing its company handbook and strategy documents, which builds trust with developers and extends to its clear, usage-based pricing with no sales calls. Future developments focus on AI automation for analysis tasks and potentially integrating AI into the development workflow, acting as a ‘product co-pilot.’ This transparent, unified ecosystem is designed to provide product engineers with a single source of truth, reducing guesswork and potential security risks.

Gain digital sovereignty now and save costs

Let’s have a look at your digital challenges together. What tools are you currently using? Are your processes optimal? What is the state of your backups and security updates?

Digital sovereignty is easily achieved with Open Source software (which usually costs way less, too). Our division Safeserver offers hosting, operation and maintenance for countless Free and Open Source tools.

Try it now for 1 Euro - 30 days free!

Download transcript (.srt)
0:00

Okay, let's unpack this. We are diving into a toolkit today that, well, it seems to

0:05

have

0:05

really shifted how product engineers think about building software. And actually,

0:10

before we really

0:10

get started, we absolutely want to thank the supporter of this deep dive.

0:15

SafeServer. SafeServer

0:16

supports this exploration, handling the hosting side of things for software

0:20

like this,

0:21

and really aiding in your digital transformation. You can find more info at www.safeserver.de.

0:28

So our mission today is to explore PostHog. The goal is, well, pretty simple.

0:33

Explain what this

0:34

heavily starred open source platform actually is, why it makes this bold claim of

0:39

being an all-in-one

0:40

solution, and importantly, how it simplifies life for product teams. Especially,

0:44

you know,

0:44

if you're maybe just starting out in the world of product analytics. And you can

0:47

tell it's gaining

0:48

serious momentum. I mean, our sources today are straight from their main marketing

0:52

pages, sure,

0:52

but also their GitHub repo, and that shows, what, 29.8 thousand stars and 2,000 forks.

0:58

Wow. Yeah, that kind of traction doesn't just happen by accident. It really signals

1:02

they're

1:02

solving a, well, a pretty crucial pain point for developers. Absolutely. So the big

1:06

picture,

1:07

the core idea, is that PostHog aims to be the single source of truth for building

1:12

successful

1:12

products. It rolls together product analytics, session recording, feature flagging,

1:17

A-B testing,

1:17

all those separate tools that product engineers usually have to kind of cobble

1:22

together.

1:22

Right. And one open source bundle.

1:24

Exactly. Okay, so here's where it gets really interesting for me. You know, every

1:29

tech stack

1:29

eventually hits that problem of tool sprawl, right? One service for errors,

1:34

another for feature flags, maybe a third for user analytics. Why did PostHog decide

1:40

to consolidate

1:40

everything? What's the philosophy behind that? Well, what's fascinating here is it

1:45

feels like

1:46

a core philosophical shift. They actually define PostHog as a product OS, like an

1:51

operating system,

1:52

but for your product data. A product OS, okay. Yeah, and they recognize that if you're

1:55

a developer,

1:56

pretty much every decision you make, you know, should I fix this bug? Should we

1:59

launch that

2:00

feature? It all requires context. The key insight driving this whole thing seems to

2:04

be that developers

2:05

should operate from the full set of data, not just like a siloed slice of what's

2:09

happening inside

2:10

their app. Right, we've all felt that pain of disconnected data silos. You know, if

2:14

I'm trying

2:14

to debug something critical, I need the error log, sure, but I also need to see the

2:19

user's click path

2:20

and maybe know if they're a paying customer. It all matters. Exactly. So what does

2:25

this full set of data

2:27

actually mean in PostHog's world? Technically speaking, I mean. It means basically

2:32

breaking

2:32

down those walls between the different operational bits of the business and the

2:36

product engineering

2:37

side. So for instance, PostHog is designed to integrate data that happens outside

2:41

the application

2:42

itself, but still really defines that customer experience. Okay, like what? Things

2:47

like financial

2:48

data, maybe payments tracked in Stripe, or exceptions captured by a separate error

2:52

tracking tool,

2:53

or even support tickets logged in something like Zendesk or Salesforce. Okay, wait

2:57

a second. If I'm

2:59

a developer, that sounds fantastic for making decisions, definitely. But I have to

3:03

ask,

3:03

isn't consolidating financial data, customer service history, and every single user

3:10

click

3:10

into one potentially open source platform? Isn't that a pretty significant security

3:15

and

3:15

maintenance challenge? That's a lot of critical data in one basket. That's a really

3:20

important

3:20

question, yeah. And it speaks directly to their engineering approach, I think.

3:24

Their argument

3:24

seems to be that managing one highly secure integrated system is actually less

3:30

risky,

3:30

potentially, than constantly syncing sensitive data between, say, half a dozen

3:35

separate vendors.

3:36

Oh, okay. Each with its own authentication, its own compliance quirks. By

3:39

integrating these pipes,

3:40

they have this data pipelines and warehouse component we can talk about. They

3:43

create a kind

3:44

of unified identity across the data. I see. Which lets you do things like say, okay,

3:48

show me all

3:49

users who clicked feature X, had error Y last Tuesday, and submitted a high

3:52

priority support

3:53

ticket last week. That kind of visibility, well, it allows for much more informed

3:58

decisions.
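That cross-source lookup boils down to joining three data sets on one shared user identity. A toy sketch (all ids and data here are invented purely for illustration, not PostHog's query API):

```python
# Toy illustration of the "unified identity" idea: once clicks, errors, and
# support tickets all key on the same user id, the cross-source query is a
# trivial set intersection.
clicks = {"u1", "u2", "u3"}    # users who clicked feature X
errors = {"u2", "u3", "u5"}    # users who hit error Y last Tuesday
tickets = {"u3", "u5", "u6"}   # users with a high-priority ticket last week

# Users matching all three conditions:
affected = clicks & errors & tickets
print(sorted(affected))  # ['u3']
```

Without the shared id, each of those three sets would live in a different vendor's database, and this one-liner becomes a multi-system export-and-reconcile job.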

3:58

It sounds like the real value isn't just having the tools in one place, but it's

4:03

that single,

4:04

unfragmented view of the entire user journey from payment to frustration to

4:10

actually using a feature.

4:12

Precisely. It shifts the focus from just reporting abstract numbers to directly

4:17

informing

4:18

actual product strategy changes. So we get the philosophy,

4:22

single source of truth, unified view. Since it is an all-in-one platform,

4:27

let's look at the toolkit itself. But maybe instead of just defining basic terms,

4:31

because our listeners probably know what A-B testing is, let's focus on the synergy.

4:35

How do these tools actually work together in this unified environment?

4:38

That's the critical difference, right? Because in a traditional stack, you might

4:41

run an A-B test in,

4:42

say, tool A, but then you've got to manually pipe those results over to tool B,

4:47

your main analytics

4:48

platform, just to see the full impact. Yeah, that's always a pain.

4:51

Here, the idea is the tools share the same underlying event data immediately. So

4:56

take

4:56

product analytics combined with feature flags. Let's say you set up a feature flag.

5:00

Maybe you're

5:00

rolling out a new checkout button to just 10% of your users. Since the flag is

5:04

native to the

5:05

platform, that 10% cohort is just automatically tracked by the analytics engine. No

5:10

extra setup.
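The percentage rollout described here can be sketched as deterministic bucketing: hash the flag and user together so each user gets a stable answer. This is a conceptual illustration only, not PostHog's actual bucketing algorithm:

```python
import hashlib

def in_rollout(flag_key: str, user_id: str, percent: int) -> bool:
    """Conceptual sketch of a percentage rollout: hash flag+user to a
    stable bucket in [0, 100), so the same user always lands in the
    same cohort across sessions. (Not PostHog's real implementation.)"""
    digest = hashlib.sha1(f"{flag_key}.{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# A given user is either always in or always out of the 10% cohort:
print(in_rollout("new-checkout-button", "user-42", 10))
```

Because the bucketing is deterministic, the analytics side can recompute the same cohort later without any extra instrumentation, which is what makes the flag-to-experiment handoff described above cheap.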

5:11

Then, maybe the next day, you decide, hey, let's turn this into a proper experiment,

5:16

an A-B test.

5:17

You don't have to redefine the cohort. You don't have to re-instrument any code.

5:21

The data is already

5:21

flowing right into the experiments tool. That allows for immediate statistical

5:25

analysis on, say,

5:27

conversion rate changes. Okay, that dramatically cuts down on the

5:30

operational lag and the engineering overhead, I imagine. What about the tools

5:35

focused more on

5:36

like product stability, errors and replays? Yeah, look at session replays and error

5:42

tracking.

5:42

Session replays, for those who haven't used them, are like watching a screen

5:46

recording of a real

5:46

user session. Right, super useful.

5:48

In the traditional model, if a user reports some weird issue, you might see the

5:52

error log in tool

5:53

A, but you have zero visual context. You're just guessing. With PostHog, the idea

5:58

is if an error

5:59

pops up in the error tracking section, the developer can immediately click and jump

6:03

straight to the

6:04

session replay that's tied to that exact moment the error occurred. Oh, wow. Okay,

6:08

so instead of

6:08

trying to reproduce a bug based on just a stack trace, which can be impossible

6:13

sometimes, I can

6:14

actually watch the user encounter the bug live, see exactly what steps they took,

6:18

maybe check the

6:19

corresponding network logs right there alongside the replay. That sounds like, well,

6:23

like a huge

6:24

improvement in debugging time. It absolutely can be. Yeah. And kind of feeding into

6:28

all of this

6:29

is the data warehouse pipelines capability. This is sort of the heavy lifting

6:33

engine behind the

6:33

scenes. It's what ingests that external data, we talked about Stripe, HubSpot,

6:37

whatever,

6:38

and lets you run custom transformations on it. Okay. And critically, it then lets

6:42

you send that

6:43

enriched unified data stream out to, I think they say 25 plus other downstream

6:49

tools. So it ensures

6:50

PostHog remains flexible, even if your stack evolves later. That makes sense. Keeps

6:55

it from

6:55

being too much of a closed box. Finally, we should probably talk about LLM

6:59

analytics. You know,

7:00

AI powered applications are becoming huge, but they bring this whole new layer of

7:04

operational

7:05

complexity. How does PostHog handle that? This is where they really seem to be

7:08

looking ahead.

7:09

For applications using large language models, they're capturing specialized metrics

7:14

that are

7:14

pretty vital for operations. Things like API call traces, token generation counts,

7:21

latency for the model responses, and even the actual cost per query. Oh,

7:25

interesting. Tracking

7:27

the cost directly. Yeah. It's an essential, quite specialized layer for measuring

7:32

the efficiency

7:32

and performance of these AI driven products. And it connects that raw engineering

7:37

cost

7:37

directly to the user behavior metrics you're seeing in the main analytics dashboard.
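As a back-of-envelope illustration of cost-per-query tracking: given the token counts from an API call trace, the cost is simple arithmetic. The per-token prices below are invented placeholders, not real rates from any provider:

```python
# Illustrative sketch: derive cost per query from token counts in a trace.
# Both prices are assumed placeholder values, not real provider rates.
PRICE_PER_1K_INPUT = 0.0005   # assumed $ per 1,000 prompt tokens
PRICE_PER_1K_OUTPUT = 0.0015  # assumed $ per 1,000 completion tokens

def cost_per_query(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single LLM call, given token counts from the API trace."""
    return ((input_tokens / 1000) * PRICE_PER_1K_INPUT
            + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT)

# e.g. a call with a 1,200-token prompt and a 300-token completion:
print(round(cost_per_query(1200, 300), 6))  # 0.00105
```

The point of capturing this per event is that the dollar figure can then sit next to conversion or retention metrics for the same users, rather than living only in a billing dashboard.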

7:42

It feels like a key feature that frankly, a lot of other consolidated platforms are

7:47

probably only

7:47

just starting to think about. Yeah, that definitely feels ahead of the curve. Okay,

7:50

that is a seriously

7:52

ambitious suite of tools all bundled together. It almost sounds like the kind of

7:55

infrastructure

7:55

only, you know, major enterprises could typically afford or manage. So what does

8:00

this all mean for

8:01

someone just wanting to get started? Is this actually accessible to small teams or

8:05

is there

8:05

a steep barrier to entry? This is where their open source roots and their pricing

8:10

philosophy really

8:11

come into play and make a difference. They basically offer two main paths. The

8:15

recommended

8:16

option is PostHog Cloud. Right, the hosted version. Exactly. It's engineered for

8:20

speed,

8:21

reliability, basically zero maintenance for you. And crucially, they offer an

8:25

extremely generous

8:26

free tier. How generous are we talking here? Like for a startup or maybe just an

8:31

individual

8:31

developer exploring the platform? We're talking generous to the point where they

8:36

state that 98%

8:37

of their customers currently use it entirely for free. Wow, 98%. Yep. Every month

8:42

you get,

8:43

let's see, 1 million events for product analytics, 5,000 session recordings, 1

8:47

million feature flag

8:48

requests, 100,000 exceptions tracked for errors, and 1,500 survey responses. All

8:53

free every month.

8:54

Okay. That is a huge operational allowance for a startup or a small team. That's

8:58

pretty impressive.

9:00

Pricing only kicks in after you hit those limits. Correct. Usage based after the

9:04

free tier. Got it.

9:06

And what about the pure open source experience? You know, for teams who really

9:11

insist on absolute

9:13

data sovereignty and want to self-host everything? That option is definitely there.

9:18

The ability to

9:18

deploy a kind of hobby instance with just a single line of code using Docker on

9:24

Linux exists. However,

9:26

and they're quite transparent about this too, the sources generally advise that

9:29

this setup is

9:30

really only recommended up to maybe about 100,000 events per month. Ah, okay. So it

9:35

has limitations

9:36

at scale. Yeah, if you scale beyond that, they strongly suggest migrating to their

9:40

cloud

9:40

infrastructure, simply because managing the underlying data warehouse, ClickHouse,

9:45

I believe for massive scale, is a really non-trivial engineering task. And well,

9:50

they handle that best. Makes sense. Now let's talk about their culture for a second

9:53

because this is

9:54

often overlooked, but I think it's pretty central to their whole appeal, especially

9:58

to developers.

9:59

They don't just open source their code. Right. They seem to open source their

10:02

entire company.

10:03

Wait, what do you mean? Like their internal processes? What does that actually

10:05

involve?

10:06

Well, it's pretty extreme transparency. They open source their entire company handbook.

10:11

It details their strategy, how they make decisions, their ways of working, internal

10:16

processes.

10:16

Seriously, their strategy docs are public. Yeah. And look, it's probably not just a

10:22

gimmick. It

10:23

seems to build this profound trust with the developer community who aren't just

10:27

users,

10:27

but often contributors too. It's like a powerful statement. Look, if we're this

10:31

transparent about

10:32

how we operate internally, you could probably trust us to handle your mission

10:35

critical data

10:36

responsibly. That's bold. And I guess that transparency extends directly to their

10:41

pricing

10:42

philosophy too, which sounds like a pretty sharp contrast to standard SaaS

10:46

practices.

10:47

Absolutely. Their pricing is completely usage based, like we said, and they have an

10:51

explicit

10:51

no sales call approach. You don't have to talk to anyone to get started or figure

10:55

out costs.

10:57

Nice. You can calculate your exact cost based on totally transparent per unit

11:02

pricing, like that

11:03

$0.00005-per-event number after the free tier limit.
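That usage-based model is simple enough to sketch in a few lines. The 1-million-event free tier comes straight from the episode; the per-event price is the illustrative figure quoted just above:

```python
# Sketch of usage-based billing past a free tier: 1M free events per month
# (as stated in the episode), then an assumed $0.00005 per extra event.
FREE_EVENTS = 1_000_000
PRICE_PER_EVENT = 0.00005

def monthly_event_bill(events: int) -> float:
    """Bill only the events beyond the monthly free allowance."""
    billable = max(0, events - FREE_EVENTS)
    return billable * PRICE_PER_EVENT

print(monthly_event_bill(900_000))    # 0.0 (still inside the free tier)
print(monthly_event_bill(3_000_000))  # 2M billable events, roughly $100
```

No tiers, no negotiation: the whole pricing function fits in one `max()` and one multiplication, which is presumably the point of the "no sales call" pitch.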

11:09

There are no hidden

11:09

tiers, no complex enterprise negotiations just to figure out what you'll owe. They

11:14

explicitly aim

11:15

to be the cheapest option at scale. And this radically transparent model is kind of

11:19

how they

11:19

try to prove it. Okay. That's refreshing. Finally, let's just circle back quickly

11:23

to the future.

11:24

Beyond the LLM analytics tool they already offer, what are their teams apparently

11:28

focused on in

11:29

terms of integrating AI within the platform itself? Well, the roadmap seems to

11:33

suggest they're

11:34

shifting the goal beyond just measurement towards more active automation. They're

11:38

apparently working

11:39

on AI features to automate some of the more monotonous analysis tasks, maybe

11:43

summarize

11:44

complex user journey information automatically. Okay. And perhaps more ambitiously

11:48

integrating AI

11:50

into the actual development workflow. Like having the platform eventually suggest

11:54

or maybe even make

11:55

code changes to fix bugs it identifies automatically. Yeah, definitely an

11:59

evolution from just being a

12:00

data collector to potentially being more like a product co-pilot. That's quite a

12:04

vision. So I

12:05

guess the key takeaway here is that this isn't really just one tool or even just a

12:09

collection

12:09

of tools. It's more like a unified, very open and incredibly transparent ecosystem

12:15

designed to give

12:16

product engineers that single source of truth, hopefully eliminating some of the

12:21

guesswork and

12:22

maybe even the security risks that come from having all these disparate data

12:26

sources. Indeed.

12:27

And maybe for a final provocative thought, the sources mentioned their unique

12:33

culture,

12:34

including this one time they apparently sent customers a floppy disk that turned

12:38

out to be

12:38

a rickroll. Okay. Which is fun, sure. But underlying that is that transparency we

12:42

talked about sharing

12:43

their strategy, their sales manual, their inner workings. That's what feels really

12:47

foundational.

12:48

And the question for you, the listener might be, if a company is that open about

12:52

how they operate

12:53

internally, how much does that influence your trust when you're deciding whether to

12:57

build your

12:57

own mission critical product on their platform? That openness, maybe it isn't just

13:01

a feature,

13:02

maybe it's a kind of security and confidence guarantee in itself. That's a great

13:05

question

13:06

to leave you with. Thank you for joining us for this deep dive into PostHog. And

13:10

once again,

13:10

a big thank you to SafeServer for the support of this deep dive that helps you with

13:14

digital transformation.

13:15

We'll see you next time.
