I Built an AI Traffic Detector: Here's What I Learned

4 months building Loamly nights and weekends. Real mistakes, real lessons, real numbers. No vanity metrics.

Marco Di Cesare

November 11, 2025 · 15 min read

I built Loamly in 4-5 months while working a full-time job. Nights and weekends.

This is not a "we raised $10M" story. This is a solo founder building something real, making mistakes, and learning in public.

Here's what actually happened.

The Problem I Saw

In August 2025, I noticed something strange in my analytics.

Direct traffic was growing, but the behavior patterns didn't match typical direct visitors. These people weren't typing URLs or using bookmarks. They were landing on specific product pages, showing high intent, and converting well.

Something was sending them. But GA4 couldn't tell me what.

After digging, I realized: AI was the new search. ChatGPT, Claude, Perplexity - they were recommending products and brands. But when users clicked those links, the referrer data was getting stripped.

My "Direct" traffic was actually AI traffic. And I had no way to measure it.

Why I Decided to Build

The obvious question: why not just use existing tools?

I looked at everything. GA4 couldn't detect it. Plausible had added some AI detection, but only for platforms that pass referrer headers. Most specialized tools were enterprise-only.

More importantly, I realized the problem had two parts:

  1. Traffic detection: Know when AI sends you visitors
  2. Brand monitoring: Know what AI says about you (even when users don't click)

No tool did both. And the second part seemed more valuable long-term.

So I started building.

The Tech Stack

For context on what I built:

  • Framework: Next.js 15 with App Router
  • Database: Supabase (Postgres + Edge Functions)
  • AI APIs: OpenAI, Anthropic, Google (Gemini), Perplexity
  • Scraping: Firecrawl for web content extraction
  • Hosting: Vercel + Cloudflare
  • Open source: The core is MIT licensed

I chose this stack because I could move fast solo and scale later.

Month 1: Everything Broke

Let me tell you about my first major failure.

I built a "brand analyzer" that would scrape a website and ask AI platforms questions about it. Simple concept.

Problem: I was using Claude Sonnet (the expensive model) instead of Claude Haiku.

Cost difference: 12x.

I only discovered this when Anthropic sent me billing alerts. Every 2 hours, for days. I was burning through my API credits running test queries.

The fix was one line:

// Before (expensive)
model: "claude-sonnet-4"
 
// After (90% cost reduction)  
model: "claude-haiku-4-5"

Lesson learned: Triple-check your model configs before running batch operations. One wrong string can cost you hundreds of dollars.
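A cheap way to enforce that is to keep the model name in one config and make batch scripts refuse premium models unless you explicitly opt in. A minimal sketch (the constant and environment variable names here are illustrative, not Loamly's actual config):

// config/model.ts (illustrative names, not Loamly's actual config)
const DEFAULT_MODEL = process.env.ANALYZER_MODEL ?? "claude-haiku-4-5";

// Models that should never run in unattended batch jobs by accident
const EXPENSIVE_MODELS = new Set(["claude-sonnet-4"]);

export function batchSafeModel(model: string = DEFAULT_MODEL): string {
  if (EXPENSIVE_MODELS.has(model) && process.env.ALLOW_EXPENSIVE_MODELS !== "true") {
    throw new Error(`Refusing to run a batch job with expensive model "${model}"`);
  }
  return model;
}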

The Competitor Deduplication Disaster

Next major failure: competitor identification.

When analyzing a brand, I needed to identify competitors mentioned in AI responses. I built an extraction system that parsed AI text and found company names.

Sounds straightforward. It wasn't.

Problem 1: AI responses included garbage "competitors" like:

  • "Affordable" (an adjective, not a company)
  • "Can" (a modal verb)
  • "Compliance" (a concept)

I was extracting common words that happened to be capitalized.

Problem 2: Duplicate detection failed.

When analyzing Clay.com (a sales intelligence tool), my system returned:

  • "Apollo" → resolved to apollo.com (Apollo Global Management - a private equity firm)
  • "Apollo.io" → resolved to apollo.io (the actual sales tool)

Same competitor, two different entries. Because my domain resolution picked the .com before checking if .io was the intended match.

The fix required a complete rewrite. I integrated Brandfetch's Brand Search API as the source of truth. Now I validate every extracted competitor name against real brand data before accepting it.

// Query Brandfetch to validate each extracted competitor before accepting it
for (const extractedName of extractedNames) {
  const validation = await brandfetch.searchBrand(extractedName);

  if (!validation.found || validation.confidence < 0.7) {
    // Not a real brand, skip it
    continue;
  }

  // Accept only names that resolve to a real brand
  competitors.push(validation);
}
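With validation in place, deduplication mostly falls out of keying accepted competitors by their canonical domain rather than the raw extracted name, so aliases that resolve to the same brand collapse into one entry. A minimal sketch (the ValidatedBrand shape is illustrative, not Brandfetch's exact response):

// Illustrative shape, not Brandfetch's exact response schema
interface ValidatedBrand {
  name: string;   // e.g. "Apollo.io"
  domain: string; // e.g. "apollo.io"
}

// Key by canonical domain so aliases resolving to the same brand dedupe to one entry
function dedupeByDomain(brands: ValidatedBrand[]): ValidatedBrand[] {
  const byDomain = new Map<string, ValidatedBrand>();
  for (const brand of brands) {
    if (!byDomain.has(brand.domain)) {
      byDomain.set(brand.domain, brand);
    }
  }
  return [...byDomain.values()];
}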

Lesson learned: AI extracts text, not truth. You need external validation for anything that matters.

Multi-Platform Hell

Building for one AI platform is manageable. Building for four simultaneously is chaos.

Each platform has different:

  • API structures
  • Response formats
  • Citation patterns
  • Rate limits
  • Pricing models

Here's what I shipped for multi-platform support (this became LOA-313 in my Linear backlog):

ChatGPT: Uses OpenAI's Responses API with web search. Returns citations in a specific format. Relatively stable.

Claude: Uses Anthropic's API with web search tools. Citations come in web_search_tool_result blocks. Different parsing required.

Gemini: Google's API with grounding. Citations embedded differently. Had to build separate extraction logic.

Perplexity: Most consistent for citations, but required different prompt engineering.
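One way to keep this manageable is a thin adapter per platform that hides those differences behind a single interface, so scoring, storage, and UI code never care where an answer came from. A sketch of the shape (illustrative types, not Loamly's exact code):

// One common answer shape, four platform-specific adapters behind it (illustrative)
interface Citation {
  url: string;
  title?: string;
}

interface PlatformAnswer {
  platform: "chatgpt" | "claude" | "gemini" | "perplexity";
  text: string;
  citations: Citation[];
}

interface PlatformAdapter {
  platform: PlatformAnswer["platform"];
  // Each adapter owns its own API call, response parsing, and citation extraction
  ask(prompt: string): Promise<PlatformAnswer>;
}

// Downstream code only ever sees PlatformAnswer
async function askAll(adapters: PlatformAdapter[], prompt: string): Promise<PlatformAnswer[]> {
  return Promise.all(adapters.map((adapter) => adapter.ask(prompt)));
}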

I built shared utilities that work across all platforms:

  • Circuit breaker (stop calling APIs that are failing)
  • Retry with exponential backoff (sketched after this list)
  • Concurrency limiter
  • Semantic caching (don't re-run identical queries)

Each utility took a day or two. But they saved me weeks of debugging flaky API calls later.
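None of these utilities is exotic. The retry helper, for example, is only a few lines; here's a generic sketch with exponential backoff and jitter (not Loamly's exact implementation, and askClaude is a hypothetical wrapper):

// Generic retry with exponential backoff and jitter (sketch, not Loamly's exact code)
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts - 1) break;
      // Wait 500ms, 1s, 2s, ... plus random jitter so parallel jobs don't retry in lockstep
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Usage: wrap any flaky platform call
// const answer = await withRetry(() => askClaude(prompt));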

The 6-Phase Intelligence Overhaul

The biggest project was rebuilding my "Intelligence" page - the dashboard that shows AI visibility over time.

I tracked this as LOA-312 through LOA-318 in Linear. Six phases:

Phase 1: Multi-platform parity. Add Claude and Gemini alongside ChatGPT.

Phase 1.5: Conversational prompt generation. Instead of keyword-stuffed queries, generate prompts that sound like real users ("What's the best CRM for a 10-person sales team?").

Phase 2: AI-native category extraction. Instead of asking users to pick their industry, use Firecrawl to scrape their site and let AI determine the product category.

Phase 3: Perplexity integration as a premium feature.

Phase 4: GEO score - a composite metric for AI visibility.

Phase 5: UI polish with shadcn/studio components.

Each phase revealed new bugs. The category extraction initially returned "business software" for everything because my fallback logic was too aggressive. The GEO score calculation had a bug where 0% clarity scores were averaged incorrectly.
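That averaging bug is a classic JavaScript footgun: treating 0 as "missing" and filtering it out, which silently inflates the mean. An illustrative sketch of the failure mode (not the actual Loamly code):

// Buggy: filter(Boolean) drops legitimate 0% clarity scores, inflating the average
const buggyAverage = (scores: number[]) => {
  const present = scores.filter(Boolean); // 0 is falsy, so 0% scores silently vanish
  return present.reduce((sum, s) => sum + s, 0) / present.length;
};

// Fixed: only exclude values that are actually missing
const fixedAverage = (scores: Array<number | null>) => {
  const present = scores.filter((s): s is number => s !== null);
  return present.reduce((sum, s) => sum + s, 0) / Math.max(present.length, 1);
};

// buggyAverage([0, 50, 100]) === 75, but the honest answer is fixedAverage([0, 50, 100]) === 50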

Lesson learned: Ship incrementally. I could have tried to build all six phases at once. I would have failed. Shipping each phase, learning, and iterating worked.

PageSpeed: The 63 to 100 Journey

At one point my homepage scored 63 on PageSpeed Insights. Embarrassing for an analytics tool.

The fixes (tracked in LOA-342):

  1. fetchpriority="high" on LCP image: The logo was loading lazily. Wrong priority for the largest contentful paint element. (Sketched after this list, along with fixes 3 and 4.)

  2. Remove blur filter animations: CSS blur is expensive. I had decorative blurs that killed performance.

  3. Lazy load non-critical components: My animated "Game of Life" background was loading immediately. Moved it to dynamic import.

  4. Preconnect hints: Added ReactDOM.preconnect() for API domains.

  5. Image optimization: One user avatar was 7.4MB. Compressed to 83KB. (99% reduction, same visual quality.)
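For illustration, fixes 1, 3, and 4 come down to a handful of lines in the page component. A sketch assuming the Next.js 15 / React 19 setup above (component names and domains are placeholders, not Loamly's real ones):

// app/page.tsx (sketch; names and domains are placeholders)
import dynamic from "next/dynamic";
import Image from "next/image";
import { preconnect } from "react-dom";

// Fix 3: code-split the decorative background so it doesn't block first paint
const GameOfLifeBackground = dynamic(() => import("@/components/game-of-life"));

export default function Home() {
  // Fix 4: open connections to API origins before they're needed
  preconnect("https://api.example.com");

  return (
    <main>
      {/* Fix 1: the LCP image loads eagerly with high fetch priority */}
      <Image src="/logo.svg" alt="Loamly" width={160} height={40} priority />
      <GameOfLifeBackground />
    </main>
  );
}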

After these fixes: 90+ on mobile, close to 100 on desktop.

Lesson learned: Performance is death by a thousand cuts. Each individual issue seems minor. Combined, they destroy your score.

The /check Tool as Lead Magnet

Early on I realized: nobody trusts a brand they've never heard of.

So I built a free tool: enter any domain, get an AI visibility report. No signup required.

This became the /check tool. It:

  • Scrapes the domain with Firecrawl
  • Runs queries against ChatGPT, Claude, Gemini, Perplexity
  • Calculates a GEO score
  • Shows competitor comparisons
  • Provides specific improvement recommendations

Building this as a lead magnet was strategically important. But the implementation was hard.

The full analysis takes 3-4 minutes (lots of AI API calls). Users were bouncing during the wait.

I added:

  • Progressive loading: Show results as they come in (sketched after this list)
  • Email capture: "Want notification when your report is ready?"
  • Estimated completion times
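Progressive loading here just means polling a status endpoint and rendering whatever partial results exist so far. A minimal client-side sketch (the endpoint and field names are hypothetical):

// Hypothetical shape of a partial report from the status endpoint
interface CheckStatus {
  done: boolean;
  completedPlatforms: string[]; // e.g. ["chatgpt", "claude"]
  partialResults: Record<string, unknown>;
}

// Poll every few seconds and hand partial results to the UI as they arrive
async function pollCheck(checkId: string, onUpdate: (status: CheckStatus) => void): Promise<void> {
  while (true) {
    const res = await fetch(`/api/check/${checkId}/status`);
    const status = (await res.json()) as CheckStatus;
    onUpdate(status); // render whatever is ready so far
    if (status.done) return;
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}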

Conversion to email capture improved significantly after these UX changes.

What I'd Do Differently

If I started over:

1. Less feature scope, more depth.

I built traffic detection AND brand monitoring AND analytics integrations AND... too much at once. Should have picked one and gone deep.

2. User interviews before code.

I assumed I knew what marketers needed. Some assumptions were wrong. Now I talk to users before building.

3. Better error handling from day one.

My early error messages were garbage. "Something went wrong" tells users nothing. Invested in detailed error states later, but should have done it from the start.

4. Automated testing earlier.

I wrote tests after breaking things in production. Obvious in hindsight: test first.

The Numbers (Honest)

  • Build time: ~4 months, nights and weekends
  • Commits: ~3,000 (I work fast with AI coding tools)
  • API costs during development: ~$500 total (mostly that Claude Sonnet mistake)
  • Current status: 8 users, 2 paying

I'm not going to pretend this is a success story yet. It's a "built something real, now figuring out distribution" story. For the full business side, check out my building in public update.

Why Open Source

Loamly's core is open source (MIT license).

Not for marketing. For trust.

AI visibility is sensitive. Marketers need to know what the tool actually does. Open source means you can audit the code: you can see exactly how we calculate scores, what data we collect, and what the tracker does on your site.

And honestly, I believe analytics should be transparent. Google Analytics is a black box. I built the alternative I wanted.

What's Next

My roadmap (also in Linear, public):

  1. More comparison content: Loamly vs Plausible, Loamly vs GA4, etc.
  2. Claude traffic detection improvements: Their referrer behavior keeps changing
  3. Historical trend analysis: Show AI visibility over time
  4. Integration with HubSpot: First CRM integration for attribution

If you've made it this far, you're probably interested in AI visibility.

Try the free check at loamly.ai/check. Takes 3 minutes. I read every feedback email.


This is a living document. I'll update it as I learn more. Built in public means sharing the real journey, not just the wins.

Tags: Founder Story, Building in Public, AI Traffic, Startup
Marco Di Cesare

Founder, Loamly
