How We Scaled Programmatic SEO in Next.js: Memory-Efficient Sitemap Generation for Thousands of Pages
The memory bottleneck in generating sitemaps for 10k+ SEO pages
When AustinsElite hit the 10,000-page mark for programmatic SEO, our sitemap builds started failing. Not with errors you’d see in dev—no, these were silent killers: memory exhaustion, timeouts, and sluggish builds that choked our Vercel deployments. We were generating a single, monolithic sitemap.xml file by fetching all venue data at once, then mapping over it to construct URLs. It worked fine at 100 pages. At 10,000? Node.js would crash before finishing.
The root issue was simple: we were loading everything into memory. Every venue, every modifier, every gallery route—pulled in one go, processed in a single pass. We also had N+1 query patterns sneaking in through lazy-loaded relationships, making the problem worse. The sitemap wasn’t just slow—it was unsustainable.
This wasn’t just a performance issue. It was a business risk. If search engines couldn’t reliably crawl our pages, our SEO strategy would collapse. We needed a solution that scaled with our content, not against it.
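For context, the monolithic build amounted to something like the sketch below. This is a simplified reconstruction, not our actual code: `fetchAllVenues`, the domain, and the URL shape are illustrative. The key point is that every venue is held in memory at once and joined into one giant string.

```javascript
// Reconstruction of the old approach (illustrative names): load everything,
// then map it into a single sitemap string in one pass.
async function buildSitemapMonolithic(fetchAllVenues) {
  const venues = await fetchAllVenues(); // 10k+ rows resident in memory at once
  const urls = venues
    .map(v => `  <url><loc>https://example.com/venue/${v.slug}</loc></url>`)
    .join('\n'); // one huge string; memory grows linearly with page count
  return `<?xml version="1.0" encoding="UTF-8"?>\n<urlset>\n${urls}\n</urlset>`;
}
```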
Refactoring approach: chunked data streaming, eager loading, and inline indexing
We tore up the old approach and rebuilt it around three principles: chunking, pre-loading, and inlining.
First, chunked data streaming. Instead of fetching all venues at once, we started pulling them in batches of 500 using cursor-based pagination. Each chunk was processed and written to the sitemap stream immediately, then became eligible for garbage collection. This kept memory pressure low and let us handle datasets of any size without approaching Node's default heap limit (roughly 2GB in the versions we were running).
// Simplified: streaming venues in chunks via cursor-based pagination
async function* getVenueChunks() {
  let cursor = null;
  while (true) {
    const { data, nextCursor } = await fetchVenues({ limit: 500, cursor });
    if (!data.length) break; // no more venues to fetch
    yield data; // hand off one batch; it can be collected once consumed
    cursor = nextCursor;
  }
}
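Consuming the generator is then a straightforward `for await` loop. The sketch below shows the shape we settled on, with an injectable `write` sink (an HTTP response's `write`, a file stream, and so on); the helper name and URL shape are illustrative:

```javascript
// Sketch: stream chunks straight into the sitemap output. `getChunks` is an
// async generator like getVenueChunks above; `write` is any sink (an HTTP
// response, fs.createWriteStream, etc.).
async function writeSitemap(getChunks, write) {
  write('<?xml version="1.0" encoding="UTF-8"?>\n<urlset>\n');
  for await (const chunk of getChunks()) {
    // Only one batch is alive at a time; earlier batches can be
    // garbage-collected as soon as their URLs are written.
    for (const venue of chunk) {
      write(`  <url><loc>https://example.com/venue/${venue.slug}</loc></url>\n`);
    }
  }
  write('</urlset>\n');
}
```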
Second, eager loading of modifiers. Previously, we'd fetch venue data first, then make an additional call per venue to get its modifiers (premium, featured, and so on) during URL construction: a classic N+1 pattern. We fixed it by pre-loading all necessary modifiers into a single, indexed map before processing any chunks.
// One query up front: a venueId -> modifier index for O(1) lookups later
const modifiersMap = await getAllModifiers().then(mods =>
  new Map(mods.map(m => [m.venueId, m]))
);
Now, during sitemap generation, we could resolve modifiers in O(1) time—no extra I/O, no delays.
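In practice the lookup is a plain synchronous `Map.get` during URL construction. A minimal sketch, assuming a `premium` flag on the modifier record (the field name and URL shape are illustrative):

```javascript
// Resolve a venue's modifier from the preloaded map: O(1), no I/O.
function venueUrl(venue, modifiersMap) {
  const mod = modifiersMap.get(venue.id); // may be undefined for plain venues
  const prefix = mod && mod.premium ? '/premium' : '';
  return `https://example.com${prefix}/venue/${venue.slug}`;
}
```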
Third, inlining indexability checks. We used to call a separate isIndexable() function per venue, which sometimes triggered async logic or external checks. We replaced that with a precomputed boolean field in the database, updated on write. Now, filtering out non-indexable pages became a simple inline check:
if (!venue.isIndexable) continue;
No function calls, no promises—just a fast sync evaluation. We also extended this logic to dynamically include new route types, like /venue/[slug]/gallery, ensuring they appeared in the sitemap only when valid.
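Putting the inline checks together, expanding a single venue into its sitemap entries can look like the sketch below. `isIndexable` and `hasGallery` stand in for the precomputed boolean fields maintained on write (field names are illustrative):

```javascript
// Expand one venue into zero or more sitemap URLs using only
// precomputed fields: no async calls, no per-venue queries.
function entriesForVenue(venue) {
  if (!venue.isIndexable) return []; // cheap synchronous filter
  const entries = [`https://example.com/venue/${venue.slug}`];
  if (venue.hasGallery) {
    // Include the gallery route only when it is actually valid.
    entries.push(`https://example.com/venue/${venue.slug}/gallery`);
  }
  return entries;
}
```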
Results: reduced memory usage by 70% and eliminated N+1 query issues
The impact was immediate. Memory usage during sitemap generation dropped by 70%. Builds that used to peak at 1.8GB now hovered around 500MB. More importantly, they were stable—no more OOM crashes, even as we scaled past 15,000 pages.
We also slashed the number of database queries. The old build issued one modifier lookup per venue on top of the venue fetches themselves, which meant more than 10,000 queries for 10k pages. The new build issues one query for the modifier map plus one paginated fetch per 500-venue chunk, roughly 21 queries at 10k venues. No more N+1. No more latency spikes.
But the real win was operational. Sitemaps now build faster, deploy more reliably, and scale predictably. We’ve since reused this pattern for other large-scale exports—think canonical URL lists, structured data dumps, and SEO audit reports.
If you’re working with programmatic SEO in Next.js (or any meta-framework), don’t wait for the crash to refactor. Start small: chunk your data, preload your relationships, and inline your logic. Memory efficiency isn’t just a nice-to-have—it’s what keeps your SEO engine running when it matters most.