How it works · CavBot analytics v5

What CavBot does under the hood

CavBot instruments your routes, 404 control room, badge, and SEO snapshots through a single script snippet — all tied to append-only events and derived tables that power CavCore Console and future assistive layers.

This page is for engineers, web architects, and SEOs who want to see the system, not the slogan. We’ll answer:

  • What am I installing?
  • What data is collected?
  • How does it flow into CavCore Console?
Step 1 · Install the snippet

Drop in the CavBot tracker

CavBot starts with a single script tag. No SDK ceremony, no build step. The snippet wires up route views, 404 control-room events, badge telemetry, SEO snapshots, and performance sampling under one project key.

Place the CavBot tracker on every page you want under guard. For most teams, that means the main layout or HTML template for your app or framework.
Where to place it:
  • Option A: In <head> with defer for early instrumentation.
  • Option B: Right before </body> if your team prefers scripts at the bottom.
Required attributes:
  • data-project-key — public key that maps to a row in projects.
  • data-environment: production, staging, or preview.
Optional attributes:
  • data-badge="true" — mounts the CavBot badge on your site.
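As an illustration, the attributes above could be assembled into a script tag like the one this helper prints. This is a sketch, not CavBot's official snippet: the buildTrackerTag helper and the src path are hypothetical placeholders, and only the data-* attribute names come from the lists above.

```javascript
// Hypothetical helper: builds a tracker <script> tag from the documented attributes.
// The src value passed in below is a placeholder, not a real CavBot URL.
const ENVIRONMENTS = ["production", "staging", "preview"];

function buildTrackerTag({ src, projectKey, environment, badge = false }) {
  if (!projectKey) throw new Error("data-project-key is required");
  if (!ENVIRONMENTS.includes(environment)) {
    throw new Error(`data-environment must be one of: ${ENVIRONMENTS.join(", ")}`);
  }
  const attrs = [
    "defer", // Option A: load from <head> without blocking parsing
    `src="${src}"`,
    `data-project-key="${projectKey}"`,
    `data-environment="${environment}"`,
  ];
  if (badge) attrs.push('data-badge="true"'); // optional badge
  return `<script ${attrs.join(" ")}></script>`;
}

console.log(
  buildTrackerTag({
    src: "/path/to/cavbot-tracker.js",
    projectKey: "YOUR_PROJECT_KEY",
    environment: "production",
    badge: true,
  })
);
```

Validating the environment value up front keeps a typo such as "prod" from silently creating a separate data stream.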
ID behavior:
  • anonymousId — project-scoped UUID per visitor (no names, emails, or user IDs).
  • sessionKey — rolling key based on time and idle rules.
  • No raw IP, email, or account data stored by default — just route-level signal.
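The "time and idle rules" behind sessionKey are not spelled out on this page. One common approach, sketched below, issues a new key when the visitor has been idle past a timeout or the session has run past a maximum age. The 30-minute and 12-hour limits, the key format, and the nextSession helper are all assumptions for illustration.

```javascript
// Assumed limits; CavBot's actual idle and time rules are not documented here.
const IDLE_TIMEOUT_MS = 30 * 60 * 1000;
const MAX_SESSION_MS = 12 * 60 * 60 * 1000;

// Returns the session to use for an event at time `now` (ms since epoch).
// A new sessionKey is issued when there is no session yet, the visitor was
// idle too long, or the session has exceeded its maximum age.
function nextSession(session, now) {
  const expired =
    !session ||
    now - session.lastSeenAt > IDLE_TIMEOUT_MS ||
    now - session.startedAt > MAX_SESSION_MS;
  if (expired) {
    // Stand-in key format: start time plus a random suffix.
    const suffix = Math.random().toString(36).slice(2, 10);
    return { sessionKey: `${now.toString(36)}-${suffix}`, startedAt: now, lastSeenAt: now };
  }
  return { ...session, lastSeenAt: now };
}

const t0 = Date.UTC(2024, 0, 1, 9, 0);
const first = nextSession(null, t0);
const sameVisit = nextSession(first, t0 + 5 * 60 * 1000); // 5 minutes later
const afterIdle = nextSession(sameVisit, t0 + 50 * 60 * 1000); // 45 minutes idle
console.log(first.sessionKey === sameVisit.sessionKey); // true
console.log(sameVisit.sessionKey === afterIdle.sessionKey); // false
```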
Prefer docs? You can wire this into your layout, template, or shell just like any other analytics script.
Step 2 · Telemetry model

What CavBot tracks

CavBot is not a generic pageview counter. It tracks the parts of your system that break journeys: routes, 404s, SEO structure, performance, and how people actually move through your site.

404 & routes
Every visit to a route — and every time someone lands in the 404 control room — becomes a typed event.
Core events:
  • cavbot_route_view — normal route view.
  • cavbot_404_view — session enters your 404 control room surface.
  • cavbot_catch — player catches CavBot in the 404 game.
  • cavbot_miss — player misses, or exits without recovery.
  • cavbot_idle — control room open, but no interaction for N seconds.
Tables:
events, pages, daily_page_aggregates
Powers:
  • 404 density maps by route and referrer.
  • Recovery rate from 404-control-room sessions.
  • Top failing routes and dead paths in CavCore Console.
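A 404 density map by route and referrer comes down to counting cavbot_404_view rows per (route, referrer) pair. The sketch below does that over an in-memory array. The referrer field and the density404 helper are assumptions; eventName and route follow the event keys described on this page.

```javascript
// Counts 404 views per (route, referrer) pair, densest pairs first.
function density404(events) {
  const counts = new Map();
  for (const e of events) {
    if (e.eventName !== "cavbot_404_view") continue;
    // JSON-encode the pair so it can be used as a single Map key.
    const key = JSON.stringify([e.route, e.referrer || "(direct)"]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, views]) => {
      const [route, referrer] = JSON.parse(key);
      return { route, referrer, views };
    })
    .sort((a, b) => b.views - a.views);
}

const events404 = [
  { eventName: "cavbot_404_view", route: "/old-pricing", referrer: "https://www.google.com/" },
  { eventName: "cavbot_404_view", route: "/old-pricing", referrer: "https://www.google.com/" },
  { eventName: "cavbot_404_view", route: "/old-pricing", referrer: "" },
  { eventName: "cavbot_route_view", route: "/pricing", referrer: "" },
];
console.log(density404(events404));
```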
SEO snapshots
CavBot takes periodic SEO snapshots so you can tie structure and indexability to real behavior, not just audits.
Per-route snapshot includes:
  • title, metaDescription, canonicalUrl.
  • Indexability flags: noindex, nofollow, robots rules.
  • Heading outline, word count, and key social tags (OG/Twitter).
Tables:
seo_snapshots
Derived insights:
  • missing_meta_description
  • short_title / duplicate_title
  • non_indexable_critical — important pages accidentally blocked.
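To make the derived insights concrete, here is a minimal sketch of how a single snapshot could be checked. The 15-character title threshold, the critical flag, and the seoInsights helper are assumptions; duplicate_title is left out because it needs a comparison across snapshots.

```javascript
// Assumed threshold; this page does not say what counts as a short title.
const MIN_TITLE_LENGTH = 15;

// Returns the insight names that apply to one per-route snapshot.
function seoInsights(snapshot, { critical = false } = {}) {
  const insights = [];
  if (!snapshot.metaDescription || !snapshot.metaDescription.trim()) {
    insights.push("missing_meta_description");
  }
  if ((snapshot.title || "").trim().length < MIN_TITLE_LENGTH) {
    insights.push("short_title");
  }
  // Only pages marked as important can raise this insight.
  if (critical && snapshot.noindex) {
    insights.push("non_indexable_critical");
  }
  return insights;
}

console.log(
  seoInsights({ title: "Pricing", metaDescription: "", noindex: true }, { critical: true })
);
```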
Performance & feel
Lightweight performance sampling gives you Core Web Vitals and a “runtime feel” score per route.
Metrics:
  • LCP — Largest Contentful Paint (per route, per device).
  • TTFB — Time to first byte.
  • CLS — Cumulative Layout Shift.
  • Ready for FID/INP extensions.
Tables:
performance_samples
Powers:
  • “Feel” lens in CavCore Console, per critical route.
  • Insights such as slow_critical_page and “runtime feel below target”.
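One plausible way to derive slow_critical_page is to take the 75th-percentile LCP per critical route and compare it with the 2.5-second "good" threshold used for Core Web Vitals. The sample shape (route, lcpMs), the list of critical routes, and both helpers are assumptions for this sketch.

```javascript
const LCP_GOOD_MS = 2500; // Core Web Vitals "good" LCP threshold, assessed at p75

// Nearest-rank percentile over a non-empty list of numbers.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(Math.ceil(p * sorted.length), 1);
  return sorted[rank - 1];
}

// Flags critical routes whose p75 LCP exceeds the threshold.
function slowCriticalPages(samples, criticalRoutes) {
  const byRoute = new Map();
  for (const s of samples) {
    if (!criticalRoutes.includes(s.route)) continue;
    if (!byRoute.has(s.route)) byRoute.set(s.route, []);
    byRoute.get(s.route).push(s.lcpMs);
  }
  const flagged = [];
  for (const [route, lcps] of byRoute) {
    const p75LcpMs = percentile(lcps, 0.75);
    if (p75LcpMs > LCP_GOOD_MS) {
      flagged.push({ route, p75LcpMs, insight: "slow_critical_page" });
    }
  }
  return flagged;
}

const perfSamples = [
  ...[1800, 2200, 3100, 4000].map((lcpMs) => ({ route: "/checkout", lcpMs })),
  ...[1200, 1500, 1900, 2100].map((lcpMs) => ({ route: "/pricing", lcpMs })),
];
console.log(slowCriticalPages(perfSamples, ["/checkout", "/pricing"]));
```

In the sample data only /checkout is flagged, with a p75 LCP of 3100 ms.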
Structure & sitemap
CavBot builds a canonical map of your routes, internal links, and sitemap entries so you can see structure issues in one place.
Resolves:
  • All discovered routes (including 404-prone ones).
  • Sitemap URLs and their status.
  • Internal referrers and click paths.
Tables:
pages, referrer_aggregates, daily_page_aggregates
Powers:
  • Orphan and weakly linked pages.
  • Sitemap URLs that 404 in practice.
  • Funnels that land visitors on thin or broken content.
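Orphan and weakly linked pages can be found by counting distinct inbound internal links per route. The sketch below assumes a list of discovered routes and a list of { from, to } link edges; the linkReport helper, the "/" entry-page exception, and the weak-link threshold of one are illustrative choices.

```javascript
// routes: all discovered routes; links: internal link edges as { from, to }.
function linkReport(routes, links, { weakThreshold = 1 } = {}) {
  const inbound = new Map(routes.map((r) => [r, new Set()]));
  for (const { from, to } of links) {
    // Ignore self-links and links to routes we have not discovered.
    if (from !== to && inbound.has(to)) inbound.get(to).add(from);
  }
  const orphans = [];
  const weak = [];
  for (const [route, sources] of inbound) {
    if (route === "/") continue; // the entry page needs no inbound links
    if (sources.size === 0) orphans.push(route);
    else if (sources.size <= weakThreshold) weak.push(route);
  }
  return { orphans, weak };
}

const siteRoutes = ["/", "/pricing", "/docs", "/old-promo"];
const siteLinks = [
  { from: "/", to: "/pricing" },
  { from: "/", to: "/docs" },
  { from: "/pricing", to: "/docs" },
];
console.log(linkReport(siteRoutes, siteLinks));
```

Here /old-promo has no inbound links and is reported as an orphan, while /pricing has a single source and is reported as weakly linked.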
Step 3 · CavBot presence

The CavBot badge on your site

The CavBot badge is an optional, fixed UI element that quietly signals a session is “under guard.” It reuses the same CavBot head system as the control room, scaled down and wired into its own event stream.

You can enable the badge entirely from configuration. For most teams, the snippet attribute is enough:

  • data-badge="true" on the tracker snippet.

Or, if you prefer explicit configuration, wire it through your analytics init call:

window.cavbotAnalytics.init({
  projectKey: "YOUR_PROJECT_KEY",
  environment: "production",
  cavbotBadge: true
});

The badge renders as a small, fixed element (bottom-right by default). Today it shows static project detail — like project name and number of routes under guard. Tomorrow it becomes an assistive surface:

  • Live 404 count for this session’s journey.
  • Route-level health hints (“This page is slow on mobile”).
  • “Ask CavBot” hooks powered by the same event and insights tables.
Step 4 · 404 control room

404 Control Room: turning dead routes into signal

CavBot’s 404 Control Room is a designed surface, not a default error. When you wire 404s into the control-room component, every play session becomes a rich source of routing signal.

The implementation depends on your framework, but the pattern is the same: send users into a CavBot-powered 404 component, and track interactions as structured events.
Example · Next.js 404 page
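// pages/404.js
import { useEffect } from "react";
import CavBotControlRoom from "../components/CavBotControlRoom";

export default function NotFoundPage() {
  // Track once after mount, on the client only. Calling track() during
  // render would repeat the event on every re-render.
  useEffect(() => {
    if (window.cavbotAnalytics) {
      window.cavbotAnalytics.track("cavbot_404_view", {
        route: window.location.pathname,
        pageType: "404-control-room"
      });
    }
  }, []);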

  return (
    <CavBotControlRoom
      difficulty="standard"
      theme="grid-lab"
    />
  );
}
Configurable options:
  • difficulty — e.g. standard, strict.
  • theme — e.g. grid, signal, imposter.
Every 404 play session logs:
  • cavbot_404_view — session enters the control room.
  • cavbot_catch — player successfully “catches” CavBot.
  • cavbot_miss — player fails or exits without recovery.
  • cavbot_idle — control room open, no input for N seconds.

These events feed 404 recovery rate, game interaction metrics, and idle levels in CavCore Console — giving you a live sense of how often broken routes are rescued versus abandoned.
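The exact definition of 404 recovery rate is not given here. As a sketch, the function below treats a session as recovered when a cavbot_route_view follows its first cavbot_404_view. That definition, the numeric time field, and the recoveryRate404 helper are assumptions.

```javascript
// Share of sessions that hit a 404 and later reached a normal route.
// Returns null when no session saw a 404.
function recoveryRate404(events) {
  const sessionsWith404 = new Set();
  const recovered = new Set();
  const ordered = [...events].sort((a, b) => a.time - b.time);
  for (const e of ordered) {
    if (e.eventName === "cavbot_404_view") {
      sessionsWith404.add(e.sessionKey);
    } else if (e.eventName === "cavbot_route_view" && sessionsWith404.has(e.sessionKey)) {
      recovered.add(e.sessionKey);
    }
  }
  return sessionsWith404.size === 0 ? null : recovered.size / sessionsWith404.size;
}

const sessionEvents = [
  { sessionKey: "a", eventName: "cavbot_404_view", time: 1 },
  { sessionKey: "a", eventName: "cavbot_catch", time: 2 },
  { sessionKey: "a", eventName: "cavbot_route_view", time: 3 }, // recovered
  { sessionKey: "b", eventName: "cavbot_route_view", time: 1 },
  { sessionKey: "b", eventName: "cavbot_404_view", time: 2 },
  { sessionKey: "b", eventName: "cavbot_miss", time: 3 }, // abandoned
  { sessionKey: "c", eventName: "cavbot_route_view", time: 1 }, // never saw a 404
];
console.log(recoveryRate404(sessionEvents)); // 0.5
```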

Step 5 · Data path

From raw events to CavCore Console & AI

CavBot’s data model is intentionally small. Everything moves from append-only events into a set of aggregates, and finally into insight rows that CavCore Console (and future AI endpoints) can read.

  1. Layer 1 · Events
    Raw event stream
    Every interaction becomes a single row in events, keyed by anonymousId, sessionKey, eventName, route, and time.
    events, sessions, deploy_markers
  2. Layer 2 · Aggregates
    Daily and route-level tables
    Nightly jobs roll raw events into compact aggregates: per-route counts, referrer breakdowns, SEO and performance stats.
    daily_page_aggregates, referrer_aggregates, seo_snapshots, performance_samples
  3. Layer 3 · Insights
    Typed findings
    Aggregates are scanned for conditions that matter: 404 spikes, SEO regressions, slow critical pages, and campaigns sending traffic into thin content.
    404_spike, seo_missing_meta, slow_critical_page, campaign_lands_on_thin_page
  4. Layer 4 · Surfaces
    Console & future AI
    CavCore Console pulls from the same tables you do:
    • Project overview and health.
    • 404 summary and control room analytics.
    • Page & SEO detail, referrers, campaigns, and performance slices.
    Future endpoints sit on top of this same schema:
    /v1/assist, /v1/insights/summarize
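The step from Layer 1 to Layer 2 can be pictured as a group-by over raw events. The sketch below rolls events into one row per UTC day and route. The output columns (views, notFound), the millisecond time field, and the rollupDaily helper are assumptions, not the actual daily_page_aggregates schema.

```javascript
// Rolls raw events into one aggregate row per (UTC day, route).
function rollupDaily(events) {
  const rows = new Map();
  for (const e of events) {
    const day = new Date(e.time).toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    const key = `${day} ${e.route}`;
    if (!rows.has(key)) rows.set(key, { day, route: e.route, views: 0, notFound: 0 });
    const row = rows.get(key);
    if (e.eventName === "cavbot_route_view") row.views += 1;
    if (e.eventName === "cavbot_404_view") row.notFound += 1;
  }
  return [...rows.values()];
}

const rawEvents = [
  { eventName: "cavbot_route_view", route: "/pricing", time: Date.UTC(2024, 0, 1, 10) },
  { eventName: "cavbot_route_view", route: "/pricing", time: Date.UTC(2024, 0, 1, 23, 30) },
  { eventName: "cavbot_404_view", route: "/old-pricing", time: Date.UTC(2024, 0, 1, 11) },
  { eventName: "cavbot_route_view", route: "/pricing", time: Date.UTC(2024, 0, 2, 9) },
];
console.log(rollupDaily(rawEvents));
```

Because the raw rows are append-only, a job like this can be re-run for any day and produce the same aggregate.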
Privacy & retention

Privacy, IDs, and retention windows

CavBot is designed to be safe for production sites by default. IDs are anonymous, payloads are lean, and retention is tuned for operational signal — not long-term user tracking.

IDs & identifiers
  • Anonymous per project: each visitor gets a project-scoped anonymousId (UUID).
  • sessionKey groups events by activity window and idle rules.
  • No emails, names, or customer IDs are required or stored by default.
  • Custom identifiers can be passed, but CavBot encourages route-level analysis instead of user-level profiling.
IP & location
  • IP handling is optional. When enabled, addresses are truncated before any storage.
  • Coarse geography only (e.g., country/region) for routing and abuse detection.
  • No long-term storage of raw IPs; data is aggregated as quickly as possible.
  • Abuse or anomaly patterns can be detected without tying back to identifiable persons.
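How CavBot truncates addresses is not specified here. A common technique, shown as a sketch, zeroes the last octet of an IPv4 address before anything is stored. The truncateIPv4 helper is hypothetical and does not handle IPv6, which would need its own rule.

```javascript
// Zeroes the last octet of an IPv4 address so only a coarse network remains.
// Returns null for anything that is not a valid dotted-quad IPv4 address,
// so unexpected input is dropped rather than stored as-is.
function truncateIPv4(ip) {
  const parts = String(ip).split(".");
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) return null;
  parts[3] = "0";
  return parts.join(".");
}

console.log(truncateIPv4("203.0.113.57")); // 203.0.113.0
console.log(truncateIPv4("2001:db8::1")); // null
```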
Retention & controls
  • Raw events: kept for a short operational window (e.g., 90 days).
  • Aggregates: daily rollups and SEO/perf snapshots retained longer for historical trends.
  • Project-level toggles: disable certain event types or turn off performance sampling entirely.
  • Retention policies: configured per project so you can align CavBot with your data posture.
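As a rough picture of how these controls might combine, the sketch below applies a per-project policy to raw events: it drops disabled event types and anything older than the retention window. The policy shape (retentionDays, disabledEvents) and the applyProjectPolicy helper are assumptions, and the 90-day default simply reuses the example figure above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Keeps only events that are inside the retention window and whose
// type has not been disabled for the project.
function applyProjectPolicy(events, now, { retentionDays = 90, disabledEvents = [] } = {}) {
  const cutoff = now - retentionDays * DAY_MS;
  return events.filter(
    (e) => e.time >= cutoff && !disabledEvents.includes(e.eventName)
  );
}

const policyNow = Date.UTC(2024, 5, 1);
const storedEvents = [
  { eventName: "cavbot_route_view", time: policyNow - 10 * DAY_MS },
  { eventName: "cavbot_route_view", time: policyNow - 120 * DAY_MS }, // past retention
  { eventName: "cavbot_idle", time: policyNow - 1 * DAY_MS }, // disabled type
];
console.log(
  applyProjectPolicy(storedEvents, policyNow, { disabledEvents: ["cavbot_idle"] })
);
```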
Why teams install CavBot

What engineers & SEOs get back

CavBot is built for teams who want fewer guesses and more structure. The goal is simple: turn broken routes, SEO gaps, and runtime feel into a navigable system you can actually improve.

Less guesswork

See exactly where routes and 404s are hurting journeys. Density maps, recovery rates, and control-room metrics replace scattered “I think this is broken” reports.

SEO with context

SEO snapshots are tied to real sessions, referrers, and funnels — not just static crawls. When CavBot flags a meta issue, you know how much traffic it actually touches.

Runtime feel

Core Web Vitals and a “runtime feel” lens turn slow-feeling routes into concrete issues, with enough structure for engineers to prioritize them.

AI-ready schema

Because everything lands in a small, typed schema, future CavBot assistants can sit on top of events, seo_snapshots, performance_samples, and insights without ever needing user-level data.