
From RSS Chaos to AI-Curated Feeds: The NimbusFeed Story

The Problem With RSS in 2025

RSS never died. It just got buried under algorithmic feeds, notification fatigue, and the slow collapse of Google Reader’s spiritual successors. For anyone who still relies on RSS to stay current across dozens of sources, the experience in 2025 is roughly the same as it was in 2012: a chronological firehose of headlines you don’t have time to read.


HARBOR SOFTWARE · Engineering Insights

We built NimbusFeed because we lived this problem daily. As a software agency, Harbor stays plugged into security advisories, framework changelogs, industry analysis, and client-relevant verticals. Across the team, we were collectively subscribed to over 400 feeds. The volume was unmanageable. People stopped checking their readers entirely, which defeated the purpose.

The question we kept asking was simple: Can we use AI to turn RSS from a firehose into a focused briefing? Not a summary service. Not a chatbot. An intelligent feed that learns what matters to you and surfaces it with context.

This is the story of building NimbusFeed — from the first prototype to the architecture decisions that shaped the production system, including the mistakes we made along the way and the lessons we’d carry into any AI-augmented information product.

Why Existing Solutions Fell Short

We evaluated a dozen feed readers before writing a single line of code. Feedly’s AI features are decent but locked behind enterprise pricing and focused on keyword matching rather than semantic understanding. Inoreader has powerful filtering but no actual intelligence layer. Miniflux is beautifully minimal but offers zero curation. Feedbin is a joy to use but curation is entirely manual.

The gap was consistent across all of them: rule-based filtering is not the same as understanding. You can set up a filter for “React” but you’ll still get every trivial tutorial alongside the genuinely important architectural discussions. Keyword matching cannot distinguish between a React changelog that affects your production app and a beginner’s guide to useState. It cannot tell the difference between a CVE advisory that impacts your stack and one that affects a library you’ve never used.

We needed something that could understand the semantic content of articles, learn from user behavior over time, and deliver a ranked feed that actually reduced cognitive load. Not just fewer articles — the right articles, in the right order, with enough context to decide whether they’re worth reading.

There was also a workflow problem. Most feed readers present a binary: read or don’t read. There’s no middle ground for articles that are relevant but not urgent, or articles you want to reference later without reading in full right now. We wanted to build that middle ground into the product from the start.

Architecture Overview

NimbusFeed is built as a three-layer system: ingestion, intelligence, and presentation. Each layer is independently deployable and communicates through a shared PostgreSQL database and a BullMQ job queue. This separation was deliberate — it lets us scale the intelligence layer (the most compute-intensive part) independently of the ingestion and presentation layers.

Ingestion Layer

The ingestion layer handles feed polling, content extraction, and normalization. We poll feeds on configurable intervals (default 15 minutes for high-priority feeds, 1 hour for everything else) using a distributed task queue built on BullMQ. Each feed is a recurring job with its own schedule.

// Feed polling worker (simplified)
const pollFeed = async (feed: Feed) => {
  const parser = new RSSParser();
  const result = await parser.parseURL(feed.url);

  // Array.prototype.filter can't await an async predicate, so check
  // for already-seen items with a sequential loop instead.
  const newItems: typeof result.items = [];
  for (const item of result.items) {
    const existing = await db.feedItem.findFirst({
      where: { guid: item.guid, feedId: feed.id }
    });
    if (!existing) newItems.push(item);
  }

  for (const item of newItems) {
    const extracted = await extractContent(item.link);
    await db.feedItem.create({
      data: {
        feedId: feed.id,
        guid: item.guid,
        title: item.title,
        link: item.link,
        publishedAt: new Date(item.pubDate),
        rawContent: extracted.text,
        wordCount: extracted.wordCount,
        embedding: await generateEmbedding(extracted.text),
      }
    });
  }
};

Content extraction was its own challenge. RSS feeds range from full-content to title-only. Many include only a brief excerpt followed by a “Read more” link. We use Mozilla’s Readability library (the same engine behind Firefox Reader View) to extract clean article text from the source URL. This gives us the full content regardless of what the feed itself includes.

Not all URLs cooperate. Some sites block automated requests, some use heavy client-side rendering, some sit behind paywalls. We handle these cases gracefully — the system falls back to the RSS content, then to the title and description only, and flags articles where extraction failed so users understand why the summary might be less accurate. We experimented with headless browser extraction using Puppeteer but the latency and resource cost weren’t justified for the marginal improvement in extraction success rate (roughly 3% more articles extracted successfully, at 10x the compute cost per article).
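The fallback chain can be sketched as a pure decision function. The type and function names here (`ExtractionInput`, `chooseContentSource`) are illustrative, not NimbusFeed's actual API:

```typescript
// Illustrative sketch of the extraction fallback chain described above.
interface ExtractionInput {
  extractedText: string | null;  // Readability output, null if extraction failed
  feedContent: string | null;    // content included in the RSS item itself
  title: string;
  description: string | null;
}

interface ContentChoice {
  text: string;
  source: "readability" | "feed" | "title-only";
  extractionFailed: boolean;     // surfaced to users to explain weaker summaries
}

const chooseContentSource = (input: ExtractionInput): ContentChoice => {
  if (input.extractedText && input.extractedText.length > 0) {
    return { text: input.extractedText, source: "readability", extractionFailed: false };
  }
  if (input.feedContent && input.feedContent.length > 0) {
    return { text: input.feedContent, source: "feed", extractionFailed: true };
  }
  // Last resort: title plus whatever description the feed provides.
  const fallback = [input.title, input.description ?? ""].join("\n").trim();
  return { text: fallback, source: "title-only", extractionFailed: true };
};
```

Keeping this as a pure function makes the degradation path easy to test independently of any network or browser machinery.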

Intelligence Layer

This is where NimbusFeed diverges from a traditional reader. Every article gets processed through three stages:

  1. Embedding generation using OpenAI’s text-embedding-3-small model. Each article becomes a 1536-dimensional vector stored in pgvector.
  2. Topic classification using a lightweight prompt that categorizes articles into the user’s defined interest areas. Each user can define up to 20 interest areas (“Kubernetes security,” “Next.js ecosystem,” “startup fundraising”) and the classifier maps each article to zero or more of these areas.
  3. Relevance scoring that combines the embedding similarity to the user’s interest profile with recency, source authority, and engagement signals.

The interest profile is the key innovation. Rather than asking users to manually configure preferences, we build it from their behavior. Every article a user clicks, bookmarks, or shares updates their profile embedding using exponential moving average:

const updateUserProfile = async (userId: string, articleEmbedding: number[]) => {
  const user = await db.user.findUnique({ where: { id: userId } });
  if (!user) return;

  const alpha = 0.15; // Learning rate
  const updatedProfile = user.profileEmbedding.map(
    (val, i) => alpha * articleEmbedding[i] + (1 - alpha) * val
  );

  await db.user.update({
    where: { id: userId },
    data: { profileEmbedding: updatedProfile }
  });
};

The learning rate of 0.15 was chosen after experimentation. Too high and the profile becomes volatile, chasing every click. Too low and it takes weeks to adapt. At 0.15, the profile meaningfully shifts within 2-3 days of changed reading patterns while remaining stable enough to be useful. We tested values from 0.05 to 0.30 across a cohort of 20 internal users over four weeks. 0.15 scored highest on a satisfaction survey where users rated the relevance of their top-10 daily articles.
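The tradeoff falls out of the EMA arithmetic: after n consistent signals toward a new interest, the original profile's contribution shrinks to (1 − α)^n. A quick sketch of that decay (helper names are illustrative, not from the codebase):

```typescript
// EMA residual: after n updates, the original profile's weight is (1 - alpha)^n.
const residualWeight = (alpha: number, n: number): number =>
  Math.pow(1 - alpha, n);

// Number of consistent signals before the old profile falls below half its weight.
const updatesToHalve = (alpha: number): number =>
  Math.ceil(Math.log(0.5) / Math.log(1 - alpha));
```

At α = 0.15, roughly five strong signals halve the old profile's influence, versus about fourteen at α = 0.05 and only two at α = 0.30, which matches the volatile-versus-sluggish behavior described above.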

The relevance score itself is a weighted combination of multiple signals:

const computeRelevanceScore = (
  article: FeedItem, 
  userProfile: number[], 
  sourceAuthority: number
): number => {
  const semanticSimilarity = cosineSimilarity(article.embedding, userProfile);
  const recencyBoost = Math.exp(-Math.LN2 * hoursSincePublished(article) / 48); // Half-life of 48 hours
  const sourceWeight = sourceAuthority; // 0-1, based on user's historical engagement with this source
  const lengthFactor = article.wordCount > 5000 ? 0.9 : 1.0; // Slight penalty for very long articles
  
  return (
    0.50 * semanticSimilarity +
    0.20 * recencyBoost +
    0.20 * sourceWeight +
    0.10 * lengthFactor
  );
};

These weights were tuned manually based on user feedback during beta. We tried learning them automatically but our dataset was too small to train a reliable model. The manual weights work well enough that optimizing them further isn’t a priority.
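The `cosineSimilarity` helper used above is a standard computation: the dot product of the two vectors divided by the product of their magnitudes. A minimal implementation might look like:

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// Returns 0 for a zero-magnitude vector rather than dividing by zero.
const cosineSimilarity = (a: number[], b: number[]): number => {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
};
```

OpenAI's embeddings are normalized to unit length, so the dot product alone would suffice for them, but dividing by the norms keeps the helper correct for arbitrary vectors such as an EMA-blended profile.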

Presentation Layer

The frontend is a Next.js application with a clean, distraction-free reading interface. Articles are presented in a ranked list with relevance scores visible (users asked for this during beta — they wanted to understand why the system was recommending something). Each article card shows a one-line AI-generated summary, source, publication time, and topic tags.

We also built a daily digest feature that generates a 5-minute briefing every morning. This uses a longer context window to identify themes across articles and group related items together. The prompt for digest generation is carefully tuned:

const digestPrompt = `You are a technical editor creating a morning briefing.
Given these ${articles.length} articles from the last 24 hours,
ranked by relevance to the reader's interests:

${articleSummaries}

Create a briefing with:
1. The 3-5 most important items with 2-sentence summaries
2. A "Worth Watching" section for emerging trends (2-3 items)
3. A "Quick Hits" section for minor but notable updates (3-5 one-liners)

Keep it under 500 words. Be specific, not generic.
Do not editorialize. State what happened and why it matters.
Include links to original articles using markdown format.`;

The digest turned out to be the most-used feature by a wide margin. Many users told us during beta that they open NimbusFeed specifically for the morning digest and only browse the ranked feed when they have extra time. This was humbling — we spent most of our engineering effort on the intelligence layer and real-time ranking, and the highest-value feature was essentially a batch job that runs once a day.

The Embedding Pipeline: Lessons Learned

Our initial approach was to embed the full article text. This was both expensive and surprisingly less effective than embedding a compressed representation. Full articles contain boilerplate, author bios, navigation text (from imperfect extraction), and tangential content that dilutes the semantic signal.

We switched to a two-step process: first, generate a structured summary of each article (title, key claims, technologies mentioned, implications), then embed that summary. This reduced our embedding costs by roughly 70% and actually improved relevance scoring accuracy by 12% in our benchmarks. The structured summary acts as a denoised representation of the article’s core content.

The summary generation step adds about $0.002 per article using Claude Haiku. At 50 articles per day per user, that’s $0.10 per user per day, or roughly $3/month. Combined with embedding costs, the total AI spend per user is about $4/month. Pricing NimbusFeed at $12/month gives us healthy margins after infrastructure costs.

Storage was another consideration. With 400+ feeds and an average of 50 new articles per day across all feeds, we’re generating about 1,500 embeddings per month per user, or 18,000 per year. At 1536 dimensions (four bytes each) per embedding, that’s roughly 110MB of vector data per user per year. We chose pgvector over a dedicated vector database (Pinecone, Weaviate) for simplicity. NimbusFeed doesn’t need sub-millisecond vector search at massive scale. It needs reliable, transactional storage that lives alongside the rest of the application data. Postgres with pgvector handles our query patterns comfortably with proper indexing.

-- pgvector index for cosine similarity search
CREATE INDEX idx_feed_items_embedding 
  ON feed_items 
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- Composite index for common query pattern: recent articles for a user's feeds
CREATE INDEX idx_feed_items_recent 
  ON feed_items (feed_id, published_at DESC)
  WHERE published_at > NOW() - INTERVAL '7 days';

One caveat with pgvector’s IVFFlat index: it requires a training step that uses the current data distribution. If your embedding distribution changes significantly (e.g., you switch embedding models), the index needs to be rebuilt. We hit this when we migrated from text-embedding-ada-002 to text-embedding-3-small — query accuracy degraded until we dropped and recreated the index. HNSW indexes don’t have this problem but use more memory. For our data volumes, IVFFlat with periodic rebuilds is the right tradeoff.

Deduplication: A Harder Problem Than Expected

One issue we underestimated was content duplication. The same story gets covered by multiple outlets, syndicated across feeds, and sometimes republished with minor edits. Showing users five slightly different versions of the same announcement is exactly the kind of noise NimbusFeed exists to eliminate.

Title matching catches the obvious cases, but many outlets rewrite headlines entirely. “OpenAI Launches GPT-5” and “GPT-5 Released: What Developers Need to Know” are clearly about the same event but share minimal title overlap. We implemented a cosine similarity threshold on article embeddings: if a new article’s embedding is within 0.92 similarity of an existing article from the same day, we flag it as a duplicate and cluster it with the original.

The user sees the highest-ranked version with a note like “3 similar articles” that expands on click. This clustering approach respects the fact that different outlets add different value — a deep technical analysis and a brief news announcement might be about the same event but offer different levels of detail. Users can click through to see all versions and choose the one they prefer.

Finding the right threshold took iteration. At 0.90, we were incorrectly clustering articles that covered related but distinct topics (e.g., two separate React library releases on the same day). At 0.95, we were missing obvious duplicates with different framing. 0.92 hit the sweet spot for our content mix, though we expose this as a user-configurable setting for users who prefer tighter or looser clustering.

We also added a time window to the deduplication. Articles published more than 48 hours apart are never clustered, even if their embeddings are very similar. This handles recurring topics (like weekly roundups) that would otherwise get incorrectly merged with earlier coverage.
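Putting the similarity threshold and the time window together, the duplicate decision reduces to a small pure function. This is a sketch with illustrative names and shapes, including a locally defined cosine helper so the example is self-contained:

```typescript
// Sketch of the duplicate decision described above: two articles cluster
// only when their embeddings exceed the similarity threshold AND they were
// published within the 48-hour window. Names and shapes are illustrative.
interface CandidateArticle {
  embedding: number[];
  publishedAt: Date;
}

const cosine = (a: number[], b: number[]): number => {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
};

const isDuplicate = (
  a: CandidateArticle,
  b: CandidateArticle,
  threshold = 0.92,   // user-configurable clustering tightness
  windowHours = 48    // articles further apart are never clustered
): boolean => {
  const hoursApart =
    Math.abs(a.publishedAt.getTime() - b.publishedAt.getTime()) / 3_600_000;
  if (hoursApart > windowHours) return false;
  return cosine(a.embedding, b.embedding) >= threshold;
};
```

Checking the cheap time-window condition before the similarity computation also keeps the common case (articles days apart) from touching the embeddings at all.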

Cold Start and Onboarding

The cold start problem was real. A new user with no reading history has no interest profile. We solved this with a three-step onboarding flow. First, users select from broad topic areas (security, frontend, backend, DevOps, AI/ML, mobile, data engineering, etc.). Second, they pick 3-5 articles from a curated set that they find interesting — these articles are chosen to be maximally differentiating within the selected topics. Third, they import or add their feed subscriptions.

This three-step process gives us enough signal to generate an initial profile embedding that’s immediately useful. The topic selection provides a coarse signal, and the article selection refines it. A user who selects “Security” and then picks articles about application security rather than network security gets a meaningfully different initial profile.

We also implemented OPML import, which is table stakes for any RSS reader. But we went further: when a user imports their existing subscriptions, we analyze the feeds themselves to infer likely interests. A user subscribed to Krebs on Security, The Record, and SANS ISC is clearly interested in cybersecurity. This feed-level inference provides a decent bootstrap signal even before the user reads a single article. We built a mapping of ~2,000 popular feeds to topic tags, which covers the majority of feeds users actually subscribe to.
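The feed-level inference can be sketched as a simple vote over the feed-to-topic map. The map entries and function names below are illustrative stand-ins, not the actual ~2,000-entry mapping:

```typescript
// Sketch of feed-level interest inference: count topic tags across a user's
// imported subscriptions and keep the most common ones. The map entries and
// names here are illustrative, not the production mapping.
const feedTopicMap: Record<string, string[]> = {
  "krebsonsecurity.com": ["security"],
  "therecord.media": ["security"],
  "isc.sans.edu": ["security"],
  "overreacted.io": ["frontend"],
};

const inferTopics = (feedHosts: string[], topN = 3): string[] => {
  const counts = new Map<string, number>();
  for (const host of feedHosts) {
    for (const topic of feedTopicMap[host] ?? []) {
      counts.set(topic, (counts.get(topic) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, topN)
    .map(([topic]) => topic);
};
```

Unknown feeds simply contribute no votes, so the inference degrades gracefully for subscriptions outside the mapped set.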

Performance Considerations

The intelligence layer adds latency. Generating embeddings, running classification, computing relevance scores — this all takes time. We made the architectural decision early on that intelligence processing is always asynchronous. When you open NimbusFeed, you see results from the last completed processing cycle, not a real-time computation.

This means the feed you see might be up to 15 minutes behind the latest published content. In practice, users don’t notice. RSS content isn’t real-time by nature — an article published 10 minutes ago versus 25 minutes ago makes no practical difference. The tradeoff of slightly stale rankings for consistently fast page loads (under 200ms) was clearly the right call.

Processing itself runs on a background worker. For a user with 400 feeds, a full re-ranking of the day’s articles takes about 8 seconds. The bottleneck is the OpenAI API calls for classification, not the vector similarity computations. We batch API calls where possible and cache aggressively — if two users subscribe to the same feed, we only process each article once.

On the frontend, we use React Server Components to pre-render the feed on the server, then hydrate with real-time updates via Server-Sent Events. The initial page load delivers content immediately without a loading spinner, which is critical for a product that users open habitually. A spinner on your morning reading tab is the fastest way to lose daily active users.

Monitoring and Operational Lessons

Running a system that depends on external feeds and external AI APIs taught us lessons about operational resilience. Feeds go down. They change URLs. They start returning malformed XML. The OpenAI API has rate limits and occasional outages. Building for these failure modes from the start would have saved us several incidents.

We track feed health with a simple status system: healthy (last poll succeeded), degraded (last poll returned partial data or fewer items than usual), and failing (last 3 polls failed). Failing feeds are retried with exponential backoff and eventually flagged for user attention after 24 hours of continuous failure. This prevents the system from wasting resources polling dead feeds while giving temporary outages time to resolve.
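That status logic is simple enough to sketch directly. The record shape and names are illustrative, and treating a single failed poll as degraded is an assumption this sketch makes:

```typescript
// Sketch of the feed health status described above. A feed is "failing"
// after 3 consecutive failed polls; "degraded" when the last poll failed
// once or returned notably fewer items than usual; otherwise "healthy".
// Record shape, names, and the 50% degraded cutoff are illustrative.
type FeedStatus = "healthy" | "degraded" | "failing";

interface PollResult {
  ok: boolean;
  itemCount: number;
}

const classifyFeed = (
  recentPolls: PollResult[],    // most recent first
  typicalItemCount: number
): FeedStatus => {
  const lastThree = recentPolls.slice(0, 3);
  if (lastThree.length === 3 && lastThree.every(p => !p.ok)) return "failing";

  const last = recentPolls[0];
  if (!last || !last.ok) return "degraded";           // single failure: degraded, not failing
  if (last.itemCount < typicalItemCount * 0.5) return "degraded"; // partial data
  return "healthy";
};
```

Deriving the status from recent poll history rather than storing it as mutable state means a recovered feed flips back to healthy on its next successful poll with no cleanup job.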

For the AI pipeline, we implemented a circuit breaker that switches to a fallback mode (chronological feed without intelligence ranking) when the AI API is unavailable. Users see a banner explaining that AI features are temporarily limited. This is better than failing silently or not loading at all.

What We’d Do Differently

If we were starting NimbusFeed today, we’d make several changes:

  • Use a smaller embedding model from the start. We began with text-embedding-ada-002, migrated to text-embedding-3-small, and could probably get away with a 512-dimension model for this use case. The relevance scoring doesn’t need the full semantic richness of higher-dimensional embeddings. Lower dimensions mean faster similarity computations and less storage.
  • Build the digest feature first. During beta, the daily digest was far more popular than the ranked feed. Many users set NimbusFeed as their morning tab and read only the digest. We built the ranked feed first because it was the technically interesting challenge, but the digest was the feature that drove retention. Product lesson: build the thing users will use daily, not the thing that’s architecturally interesting.
  • Invest more in feed discovery. Users consistently asked “what feeds should I subscribe to?” We added a basic recommendation engine that suggests feeds based on interest profiles, but it deserved more attention. The best feed reader is useless if it’s reading the wrong feeds.
  • Add a “save for later” queue sooner. We added this in the third month after multiple requests. Users wanted a middle ground between “read now” and “ignore forever.” The queue feeds back into the interest profile (saving is a stronger signal than clicking), and it became one of the primary interaction patterns.

The Broader Lesson

NimbusFeed reinforced a principle we apply across all our projects at Harbor: AI is most valuable when it reduces decisions, not when it makes them. NimbusFeed doesn’t decide what you should read. It reduces a list of 200 articles to a ranked 30, and then to a 5-minute digest. The user still chooses. The AI just removes the noise.

This pattern shows up in our other products too. VibeGuard doesn’t decide whether a code change is secure — it flags potential issues and explains why. SparkAI doesn’t decide how to respond to a Reddit mention — it surfaces the mention with context and sentiment. The best AI products we’ve built are filters, not decision-makers. They compress information, they don’t replace judgment.

RSS was the perfect proving ground for this philosophy. The content is there. The signal is buried in noise. The user’s time is finite. All we had to do was build a better filter. NimbusFeed is that filter, and the lessons we learned building it — about embedding pipelines, user profile management, cold start problems, and the surprising value of simple batch digest features — have shaped how we think about intelligence layers across every product we ship.

If you’re dealing with information overload in your own workflows — whether it’s RSS feeds, support tickets, log files, or market data — the pattern is the same: ingest, embed, rank, surface. The specific models and thresholds will vary, but the architecture translates cleanly. That’s the real takeaway from the NimbusFeed story: the hard part isn’t the AI. The hard part is building a system where the AI’s output is reliable enough and fast enough that users trust it with their daily workflow. Get that right, and the intelligence layer becomes invisible — which is exactly what good infrastructure should be.
