ideas.
May 21, 2026 3 min read webb2baidata

AI feedback aggregator for product teams

A web app that ingests customer feedback from multiple inboxes, clusters it into themes with an LLM, and surfaces a structured weekly digest so product teams stop missing signal buried in email.

The idea

Product teams collect feedback from everywhere — support email, Typeform surveys, app store reviews, CSV exports from Intercom — and almost none of it gets systematically read. This tool ingests all those sources through a lightweight importer, runs each entry through an LLM to extract topic, sentiment, and urgency, then clusters related entries and publishes a weekly digest: the top five themes this week, the five most urgent individual items, and a list of newly emerging topics not seen in previous weeks.

Why build this

The pain is real and common: a small team shipping fast has no time to read 300 support emails a week, so they sample or skip. Signal about broken flows, missing features, and churning customers stays invisible. Existing tools like Dovetail or Productboard solve this but cost $300–1 000/month per team and require manual tagging. A self-hosted, open-source version that does the tagging automatically is a natural fit for early-stage B2B teams who already pay for API access but not for full-suite analytics.

The underlying technology — embedding-based clustering combined with an LLM extraction pass — is now cheap enough that processing 500 feedback items costs less than a dollar. The main reason this hasn't been commoditized at the low end is that nobody has assembled the plumbing into a small, self-deployable package.

Stack sketch

  • Backend: Python, FastAPI, Celery for the ingestion and processing queue
  • LLM: Claude claude-sonnet-4-6 via the Anthropic SDK for extraction (topic, sentiment, urgency score, one-sentence summary); use prompt_caching on the system prompt to cut cost on high-volume batches
  • Embeddings: text-embedding-3-small via OpenAI or a local nomic-embed-text model via Ollama for the clustering step
  • Clustering: HDBSCAN on the embedding vectors; re-clusters on each weekly digest run so new items flow into existing themes
  • Storage: Postgres for raw feedback entries and cluster assignments; pgvector extension for the embedding column
  • Frontend: Next.js app with a simple dashboard: theme list, entry drilldown, digest view
  • Importers: Gmail OAuth, Typeform webhook, Intercom CSV upload, plain CSV upload

Scope for v1

  • CSV and email import only (OAuth for Gmail is table stakes; skip Intercom for now)
  • LLM extraction pipeline: topic label, sentiment (positive/neutral/negative), urgency (high/medium/low)
  • Clustering on each weekly run; no real-time clustering
  • Weekly digest emailed as HTML using Resend or SES
  • Single-tenant: one product, one team, no multi-org auth
  • No custom taxonomy — let the LLM generate topic labels; curation UI comes later
  • Out of scope for v1: Slack integration, app store ingestion, sentiment trend graphs, API for querying themes programmatically

Where it could go

The natural v2 is a Slack or Teams bot that surfaces a daily "top 3 urgent items" message and lets a team member assign or dismiss entries inline. This turns the digest from a pull artifact into a push habit, which is where retention lives for this kind of tool.

Beyond that, a trend layer becomes valuable once enough history accumulates: show which themes are growing week-over-week, which have gone quiet after a fix shipped, and which recur despite repeated responses. Pair that with a changelog importer (GitHub releases or Linear), and you can auto-correlate product changes to shifts in feedback volume — the kind of loop-closing that currently requires a dedicated data analyst.

Watch out for

LLM-assigned topic labels drift over time: the model may label the same complaint as "slow load times" one week and "performance issues" the next, fragmenting what should be a single cluster. Pin the taxonomy once you have 3–4 weeks of data by feeding existing labels back into the extraction prompt, or the digest becomes hard to compare across periods.