I Read Google Discover's SDK So You Don't Have To: Clusters, Classifiers, OG Tags, NAIADES... (Free pCTR Tool Inside)
Most Discover advice sounds like this: publish great content, use big images, post consistently, and let the algorithm find you. That’s not wrong. It’s just incomplete.
Google Discover serves content to hundreds of millions of people every day, and almost nobody in the SEO world has looked under the hood at how it actually works on the client side. Not through speculation. Through the SDK itself.
That’s what I spent time doing. I went through the observable telemetry, the event naming conventions, and the client-side state that Google’s own code exposes during normal Discover operation. Every finding I’m about to share traces back to a specific string, constant, or configuration value. Where something is an inference, I say so.
Think of it like reading the nutrition label. You can’t see inside the factory, but the label tells you a lot about what’s inside.
The finding that reframed everything for me
Discover’s content pipeline has nine stages, at least as far as the client-side architecture exposes them; I could identify nine, and there may be more on the server side. What I did not expect was the ordering.
The collection-level filter runs at stage 4. Interest matching runs at stage 6. The pCTR model runs at stage 7.
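That ordering can be sketched as a simple funnel. The stage numbers and the filter-before-interest-before-pCTR sequence come from the telemetry; everything else here, including the field names and the stub scoring, is a hypothetical illustration, not the SDK’s actual code.

```python
# Hypothetical sketch of the observed ordering: the collection-level
# filter (stage 4) runs before interest matching (stage 6), which runs
# before pCTR ranking (stage 7). All data shapes are assumptions.

def collection_filter(card):
    # Stage 4: drop cards from publishers this user has blocked.
    return card["publisher"] not in card["user"]["blocked_publishers"]

def interest_match(card):
    # Stage 6: require at least one overlapping interest.
    return bool(set(card["topics"]) & set(card["user"]["interests"]))

def pctr_score(card):
    # Stage 7: rank the survivors by predicted CTR (stubbed here).
    return card.get("pctr", 0.0)

def rank_feed(cards):
    survivors = [c for c in cards if collection_filter(c) and interest_match(c)]
    return sorted(survivors, key=pctr_score, reverse=True)
```

The consequence of this ordering is the point of the next paragraph: a blocked publisher never even reaches the ranking model.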
That stage-4 collection-level filter is triggered when a user taps “Don’t show content from [Publisher].” One article. One user action. Entire domain suppressed.
And here is the asymmetry that matters: there is no observable blanket boost equivalent. The penalty surface is wider than the reward surface. That’s not an editorial judgment on my part. It’s just what the system exposes.
What the pCTR model actually consumes
That Discover uses a predicted click-through rate (pCTR) model is not news. What is less known is which inputs feed it. Based on the telemetry, those inputs include the title text from og:title, image quality signals including width and height thresholds, freshness measured in seconds, historical CTR derived from per-URL click and show counts, and image load success rates.
The practical meaning of this: og:title is not just a display label. It is a model input. That distinction matters. A title is being evaluated, not just displayed.
At the same time, the presence of historical CTR as a feedback signal is its own kind of justice. Misleading titles that drive clicks but not engagement should self-correct over time, because high initial clicks followed by quick bounces degrade future pCTR scores.
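To make the input surface concrete, here is a minimal sketch of a feature extractor over those named inputs. The feature names, the 1200px check, and the smoothing constants are my assumptions for illustration; only the list of inputs comes from the telemetry.

```python
# Hedged sketch of the pCTR input surface: og:title text, image quality
# (width and load success), freshness, and per-URL historical CTR.
# Feature shapes and smoothing constants are assumptions.

def pctr_features(card):
    return {
        "title_len": len(card["og_title"]),
        "image_ok": card["img_width"] >= 1200 and card["img_loaded"],
        "age_seconds": card["age_seconds"],
        # Historical CTR from per-URL click and show counts, add-one
        # smoothed so a brand-new URL isn't scored on zero shows. As
        # shows accumulate without clicks, this ratio falls, which is
        # the self-correction mechanism described above.
        "hist_ctr": (card["clicks"] + 1) / (card["shows"] + 10),
    }
```

Note how the historical-CTR term captures the feedback loop: a clickbait title that wins early impressions but stops converting sees this ratio, and with it future pCTR, decay.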
Dashboard here: https://metehan.ai/discover.html
Experimental pCTR Free Tool here: https://pctr-discover.pages.dev
The freshness buckets
Discover has three named time buckets followed by a continuous decay phase (these are the ones identified so far in client events; there may be more). Content aged 1 to 7 days carries the highest freshness weight. Content aged 8 to 14 days drops to medium. Content aged 15 to 30 days drops to low. After 30 days, staleness is tracked in hours and decays continuously. And internally, Discover timestamps in milliseconds: first come, first served.
The first week is when content has its best window. That’s not a soft guideline. The buckets are hardcoded into the system.
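The bucket boundaries above can be sketched directly. The day ranges and the hours-based decay after day 30 come from the post; the specific weights (1.0/0.6/0.3) and the decay time constant are illustrative assumptions.

```python
import math

# Ages arrive in milliseconds, per the client telemetry.
DAY_MS = 24 * 60 * 60 * 1000

def freshness_weight(age_ms):
    # Bucket boundaries are from the observed events; the weight values
    # and the post-30-day decay constant are assumptions.
    days = age_ms / DAY_MS
    if days <= 7:
        return 1.0   # high:   days 1-7
    if days <= 14:
        return 0.6   # medium: days 8-14
    if days <= 30:
        return 0.3   # low:    days 15-30
    # After 30 days, staleness is tracked in hours and decays
    # continuously (one-week time constant assumed here).
    stale_hours = (days - 30) * 24
    return 0.3 * math.exp(-stale_hours / 168)
```

Whatever the true weights are, the shape is the point: a step function for the first month, then a smooth slide toward zero.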
The 6 OG tags that actually matter
Publishers often wonder which meta tags Discover actually parses. The SDK gives a clear answer: exactly six. og:image and og:title are mandatory. Without an image, no card is rendered. Full stop. The minimum width for a large hero card format is 1200px. Smaller images produce a thumbnail card, which typically sees lower engagement.
The other four tags, og:site_name, og:locale, og:image:secure_url, and article:content_tier, are recommended. They affect attribution display, locale matching, HTTPS preference, and content tier classification.
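A minimal sketch of that parsing logic, assuming the six tag names and the 1200px hero threshold from the post; the return values and function shape are illustrative, not the SDK’s API.

```python
# The two mandatory and four recommended tags named in the SDK.
REQUIRED = ("og:image", "og:title")
RECOMMENDED = ("og:site_name", "og:locale",
               "og:image:secure_url", "article:content_tier")

def card_format(tags, image_width):
    """Decide the card rendering for a page's parsed OG tags (sketch)."""
    if any(t not in tags for t in REQUIRED):
        return None          # no image or title: no card is rendered
    if image_width >= 1200:
        return "hero"        # large hero card format
    return "thumbnail"       # smaller images fall back to a thumbnail
```

The asymmetry is worth internalizing: the recommended tags tune presentation, but missing either required tag doesn’t degrade the card, it eliminates it.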
NAIADES and the personalization stack you’ve never heard of
Discover’s personalization draws from four layers. The outermost two are shared Google infrastructure: the Geller/AIP interest graph used across Assistant and Search, and a system called NAIADES, a Google-wide personalization system with 18 content subtypes.
I already caught it here with the Cambridge example: https://metehan.ai/blog/image-to-seo-i-built-an-ai-tool-to-decode-google-discover-heres-what-it-found/
Tombstones
When a user dismisses content, three records are created: a dismissal overlay ID, a filter status update, and a tombstone. The tombstone is a permanent per-content record. Dismissed content does not resurface. Ever. The record doesn’t expire.
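The three-record dismissal can be sketched like this. The record names follow the post’s terminology; the storage layout and function names are assumptions.

```python
# Sketch of a dismissal writing its three records: a dismissal overlay
# ID, a filter status update, and a permanent tombstone.

def dismiss(state, content_id, publisher):
    state["overlays"][content_id] = "DISMISSAL_OVERLAY"
    state["filter_status"][publisher] = "UPDATED"
    # The tombstone is a per-content record with no expiry, so the
    # serving path only ever needs a membership check.
    state["tombstones"].add(content_id)

def can_serve(state, content_id):
    return content_id not in state["tombstones"]
```

The design choice implied by a permanent set is simplicity: no TTL bookkeeping, no resurfacing logic, just an ever-growing deny list per user.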
150 A/B experiments at once
During a single observed session, approximately 150 server-side A/B experiment IDs were active simultaneously. Two users who look identical in terms of interests and behavior can see meaningfully different feeds based purely on experiment bucket allocation. This is why Discover can feel inconsistent even when your content and publishing patterns haven’t changed.
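For intuition on why two identical users diverge, here is the standard hash-based bucketing pattern such systems typically use. This is a generic illustration of the technique, not Discover’s actual assignment code; the arm count and hashing scheme are assumptions.

```python
import hashlib

def bucket(user_id, experiment_id, arms=2):
    # Deterministic per-(user, experiment) assignment: same inputs
    # always land in the same arm, different experiments hash
    # independently.
    digest = hashlib.sha256(f"{user_id}:{experiment_id}".encode()).hexdigest()
    return int(digest, 16) % arms

def assignment(user_id, experiment_ids):
    return {e: bucket(user_id, e) for e in experiment_ids}
```

With ~150 independent binary assignments, the chance that two users share every single bucket is vanishingly small, which is exactly the feed inconsistency publishers observe.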
The full picture
This post covers the highlights. The full technical dashboard includes event constants, 56 telemetry counters, 18 NAIADES subtypes, 13 cluster types, 51 runtime feature flags, and 41 individually fact-checked findings.
Dashboard here: https://metehan.ai/discover.html
Experimental pCTR Free Tool here: https://pctr-discover.pages.dev
If you’ve been treating Discover as a black box, I hope this at least gives it some walls.
I won’t expose the full source, because I talked with some great friends in the search industry and they are already aware of these findings!



