Bounded concurrency for post/page serialization — mapWithLimit vs sequential await

We’re looking at a performance improvement to Ghost’s post and page serializers and want to get community feedback before merging.

The Problem

Both the post and page serializers await each model mapper sequentially:

// posts.js and pages.js — same pattern
for (let model of models.data) {
    let post = await mappers.posts(model, frame, {tiers});
    posts.push(post);
}

For a request returning 50 posts where each mapper takes ~5ms (DB lookups, HTML processing), this means ~250ms spent serializing sequentially when the work could overlap. The same bottleneck applies to pages — any site with many published pages (landing pages, about, contact, etc.) hits the same sequential path.
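The 250ms estimate is easy to sanity-check with a mock mapper. The 5ms delay below is an assumption standing in for the real DB/HTML work, and `mockMapper`/`timeSequential` are illustrative names, not part of the proposal:

```javascript
// Mock mapper: resolves after ~5ms, standing in for DB lookups + HTML processing.
const mockMapper = () => new Promise(resolve => setTimeout(resolve, 5));

async function timeSequential(n) {
    const start = Date.now();
    for (let i = 0; i < n; i++) {
        await mockMapper(); // each iteration waits for the previous one to finish
    }
    return Date.now() - start;
}

timeSequential(50).then(ms => console.log(`50 sequential mappers: ~${ms}ms`));
```

On a typical machine this prints somewhere in the 250–350ms range (Node timers carry a little per-tick overhead), which matches the back-of-envelope math above.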

The Proposed Change

Replace the sequential loop in both posts.js and pages.js with a concurrency-limited parallel mapper:

const CONCURRENCY_LIMIT = 10;

async function mapWithLimit(items, fn, limit) {
    const results = new Array(items.length);
    let index = 0;

    // Each worker repeatedly claims the next unprocessed index until the
    // list is drained. index++ is safe here because this code runs on a
    // single thread; there is no await between the read and the increment.
    async function worker() {
        while (index < items.length) {
            const i = index++;
            results[i] = await fn(items[i]);
        }
    }

    // Spawn at most `limit` workers (fewer if there are fewer items).
    const workers = Array.from(
        {length: Math.min(limit, items.length)},
        () => worker()
    );
    await Promise.all(workers);
    return results;
}

// Used in both serializers:
posts = await mapWithLimit(models.data, model => mappers.posts(model, frame, {tiers}), CONCURRENCY_LIMIT);
pages = await mapWithLimit(models.data, model => mappers.pages(model, frame, {tiers}), CONCURRENCY_LIMIT);

Order is preserved via index-based assignment. Concurrency is capped at 10 to avoid exhausting the knex connection pool (default max: 10).
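Both claims are quick to self-check with a copy of the helper above and an instrumented mock mapper (`demo` and `mockMapper` are illustrative names, not part of the proposal):

```javascript
// Copy of the proposed helper, unchanged.
async function mapWithLimit(items, fn, limit) {
    const results = new Array(items.length);
    let index = 0;
    async function worker() {
        while (index < items.length) {
            const i = index++;
            results[i] = await fn(items[i]);
        }
    }
    const workers = Array.from({length: Math.min(limit, items.length)}, () => worker());
    await Promise.all(workers);
    return results;
}

async function demo() {
    let inFlight = 0;
    let maxInFlight = 0;
    // Mock mapper: doubles its input after ~5ms, tracking how many calls overlap.
    const mockMapper = async (n) => {
        inFlight += 1;
        maxInFlight = Math.max(maxInFlight, inFlight);
        await new Promise(resolve => setTimeout(resolve, 5));
        inFlight -= 1;
        return n * 2;
    };
    const results = await mapWithLimit([...Array(50).keys()], mockMapper, 10);
    return {results, maxInFlight};
}

demo().then(({results, maxInFlight}) => {
    console.log('order preserved:', results.every((v, i) => v === i * 2)); // true
    console.log('max in flight:', maxInFlight); // never exceeds the limit of 10
});
```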

What We’ve Verified

  • Mappers are stateless — no shared mutable state between calls

  • frame is read-only within mappers

  • Both mappers.posts() and mappers.pages() follow the same code path (pages is a thin wrapper around the post mapper that removes a few fields)

  • Unit tests pass (5,432 passing, 0 failing)

  • Synthetic benchmarks show 13-47x speedup depending on item count and mapper latency

What We’re Less Sure About

  1. Is 10 the right concurrency limit? It matches the default knex pool size, but sites with custom pool configs might benefit from a different value. Should this be configurable?

  2. Connection pool pressure under load. A single API request now uses up to 10 connections simultaneously. If multiple users hit the posts or pages API concurrently, could this cause connection starvation for other endpoints (member login, webhooks, admin UI)?

  3. Should we use an existing library instead? Libraries like p-limit or async.mapLimit are battle-tested. We chose a zero-dependency helper to keep it simple, but the index++ pattern relies on JS being single-threaded (safe in Node, but worth calling out).

  4. Memory profile. 10 concurrent mappers each allocating post/page objects simultaneously vs 1 at a time. For typical page sizes (15-50 items) this should be fine, but has anyone seen issues with larger custom limits?
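On point 3, for comparison: p-limit's public API is essentially a factory returning a `limit(fn)` wrapper. The sketch below is a simplified reimplementation of that pattern (illustrative only, not the library's actual source) to show how small the surface area covered by the dependency would be:

```javascript
// Simplified p-limit-style factory: runs at most `concurrency` tasks at once,
// queueing the rest in FIFO order.
function pLimit(concurrency) {
    const queue = [];
    let active = 0;

    const next = () => {
        active -= 1;
        if (queue.length > 0) {
            queue.shift()(); // start the oldest queued task
        }
    };

    const run = async (fn, resolve, reject) => {
        active += 1;
        try {
            resolve(await fn());
        } catch (err) {
            reject(err);
        } finally {
            next();
        }
    };

    return fn => new Promise((resolve, reject) => {
        if (active < concurrency) {
            run(fn, resolve, reject);
        } else {
            queue.push(() => run(fn, resolve, reject));
        }
    });
}

// Usage mirrors the real library (hypothetical wiring into the serializer):
// const limit = pLimit(10);
// posts = await Promise.all(
//     models.data.map(model => limit(() => mappers.posts(model, frame, {tiers})))
// );
```

Order preservation comes for free here because `Promise.all` keeps the mapped positions, regardless of completion order.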

Affected Endpoints

Endpoint                        Serializer   Change
GET /ghost/api/admin/posts/     posts.js     Sequential → mapWithLimit(10)
GET /ghost/api/content/posts/   posts.js     Sequential → mapWithLimit(10)
GET /ghost/api/admin/pages/     pages.js     Sequential → mapWithLimit(10)
GET /ghost/api/content/pages/   pages.js     Sequential → mapWithLimit(10)

We’d appreciate feedback from anyone running Ghost at scale (1000+ posts/pages, high traffic) on whether this change improves or degrades their experience.

Have you actually tested this?

If you're actually seeing performance issues, why not use the built-in Redis cache? It worked well on my end.


It would be interesting to see your benchmarking approach. I'm struggling to see the benefit of adding extra Promise handling, closure allocations, and worker coordination overhead for a process that is almost entirely CPU-bound, synchronous work.

The only database lookups within the mapper are Product lookups, and those only happen when a post has non-standard visibility (i.e. it's set to tiers with a custom filter), which is an edge case. Everything else is synchronous, CPU-bound work that wouldn't benefit from being split across promises on a single-threaded Node process.
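That claim is straightforward to check: if the mapper is synchronous CPU work, promise-level concurrency can't overlap anything, because the event loop still executes one job at a time. A sketch, using a busy-wait mapper as a stand-in (not Ghost code) against a copy of the proposed helper:

```javascript
// Copy of the helper from the original post.
async function mapWithLimit(items, fn, limit) {
    const results = new Array(items.length);
    let index = 0;
    async function worker() {
        while (index < items.length) {
            const i = index++;
            results[i] = await fn(items[i]);
        }
    }
    await Promise.all(Array.from({length: Math.min(limit, items.length)}, () => worker()));
    return results;
}

// Busy-waits ~2ms of pure CPU work: nothing yields, so nothing can interleave.
const cpuBoundMapper = async () => {
    const end = Date.now() + 2;
    while (Date.now() < end) { /* spin */ }
};

async function compare(n = 30) {
    let t = Date.now();
    for (let i = 0; i < n; i++) {
        await cpuBoundMapper();
    }
    const sequential = Date.now() - t;

    t = Date.now();
    await mapWithLimit(new Array(n).fill(null), cpuBoundMapper, 10);
    const concurrent = Date.now() - t;

    return {sequential, concurrent};
}

compare().then(({sequential, concurrent}) => {
    // Both land around n * 2ms: the CPU work is serialized either way.
    console.log({sequential, concurrent});
});
```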

If you’re struggling with performance in your environment, then I’d agree with @jannis that utilising the Redis cache would give you better results.
