URL/slug delay after publishing when serving over Cloudfront CDN


#1

Hi friends – I’m trying to understand what’s happening with a self-hosted (Heroku) based blog I’m working on.

We recently moved our blog to be served through AWS Cloudfront, and it’s caching effectively, but maybe too effectively. To avoid issues with caching specifically when editing/publishing, we use the non-cached URL (see link below).

The issue: when clicking “publish”, the pages don’t immediately appear. They show up on category pages and the image/title/excerpts load as usual, but the URL points to /404. Only later does the URL change to the correct slug, usually between 30 minutes to an hour. This is frustrating when publishing a new post to the homepage: it appears there as a thumbnail, but the link leads to a 404 for a while before eventually updating.

I found this post which recommends excluding the /ghost* path from caching, but that’s not relevant for us because we’re using the non-CDN URL to edit/publish posts.

Cloudfront is set up to refresh every 24 hours. We could, of course, lower that duration, but that would negate the benefits of using a CDN to begin with.

Any suggestions? Why would just the URL not update when everything else seems to respond immediately (tag pages show the post, and the thumbnail/title/excerpt load correctly)?


#2

Hey!

How much data do you have in your database, e.g. how many posts and tags?

Only later does the URL change to the correct slug, usually between 30 minutes to an hour.

And you don’t take any action in the meantime, e.g. publishing again?


#3

Furthermore: could you please upgrade your blog to the latest Ghost version and test again?


#4

Hi Kate – Thank you for the quick reply. I just looked and we have 499 posts to date.

I just upgraded to v2.13.1 and publishing now appears to be instant again. Any insight as to what changes may have solved this? This felt magical (in a good way) but curious as to why.


#5

Good to hear!!!

@Kevin guessed that a bug might have affected you, because you were on 2.7.x.


#6

Hello again. I have a very strange update since last Wednesday: one of our writers just now published 2 articles which instantly appeared after publishing. Then, after publishing a 3rd article, she ran into the 404 issue described in my original post. We have been running version 2.13.1 since Wednesday with no problems up until just now. Unpublishing and republishing does not help, either. The 404 remains.

Even more peculiar – the 2 articles she wrote before (after publishing successfully) are suddenly going to 404s after publishing that 3rd article.

I tried restarting our server but to no avail. I also confirmed this isn’t a CDN/Cloudfront issue, as our non-CDN link was also sending straight to a 404 page.

Any ideas? Could that 3rd article have broken something somehow? The post content doesn’t appear to be any different than the other two. I’m especially confused considering this is such a delayed reaction.


#7

Are you using a custom routes.yaml?
If yes, can you please share it and tell me more about this 3rd article?


#8

Hey Kate. Unfortunately no, we have not edited any of the routes. I believe it’s all defaults. Here’s a copy/paste of what’s currently in there to confirm:

routes:

collections:
  /:
    permalink: /{slug}/
    template:
      - index

taxonomies:
  tag: /tag/{slug}/
  author: /author/{slug}/

#9

I am not able to reproduce what you are reporting, but it sounds serious and I would like to help.
We haven’t received a similar report so far.

My current assumption is that it has something to do with attaching tags, but I might be wrong.

cc @gargol

Is this behaviour reproducible? If yes, could you please try to describe the exact steps?


#10

Any help would be amazing! I’m still stumped here, but I have a bit more data to share if it helps:

I created a new post just now and noticed the following behavior:

I created a new post and typed in a title and something in the body (no tags added). I hit CMD + S to save, then clicked the preview button. The preview looked exactly as it should.

Then I hit Publish Now, clicked the “View Post” link at the bottom left and the post worked! It was showing exactly as it should have.

However, when I refreshed the page, I got a 404 instead of the post. And from then on every refresh gave me a 404.

Another curious point: we have 3 other blogs, all set up the same way (Node v8.9.x, Heroku, Cloudfront, etc.), even down to the same Ghost version: v2.13.1.

The only differences I can think of between those blogs and the blog with the 404 issues are the number of posts (the post count is currently 503 for the problematic blog) and the fact that the problematic blog also has post data from before v2, though those posts display/edit just fine.


#11

Just to check, are you running a single Heroku dyno for each Ghost instance, or do you have multiple dynos per Ghost instance?

Multiple dynos won’t work because Ghost has a number of in-memory caches that don’t get updated across dynos. That could explain why you see a valid URL after posting but a 404 after refreshing, if the requests are being served by different dynos.
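
To make that failure mode concrete, here is a rough simulation in Python. It is not Ghost code: the `Worker` class, the `/new-post/` URL and the round-robin routing are all assumptions made for illustration. The only point is that a URL registered in one process's memory is invisible to every other process.

```python
import itertools


class Worker:
    """Stand-in for one process that keeps its known URLs in memory only."""

    def __init__(self, name):
        self.name = name
        self.known_urls = set()

    def publish(self, url):
        # Only the process that handles the publish request learns the URL.
        self.known_urls.add(url)

    def get(self, url):
        # Any URL this process has never heard of is a 404.
        return 200 if url in self.known_urls else 404


def simulate(worker_count, requests):
    """Publish once, then request the new URL `requests` times."""
    workers = [Worker(f"w{i}") for i in range(worker_count)]
    # Round-robin is an assumption; it stands in for whatever the real router does.
    router = itertools.cycle(workers)
    next(router).publish("/new-post/")
    return [next(router).get("/new-post/") for _ in range(requests)]


if __name__ == "__main__":
    print(simulate(1, 4))  # [200, 200, 200, 200]
    print(simulate(2, 4))  # [404, 200, 404, 200]
```

With one worker every request succeeds. With two, only the requests that happen to land on the worker that handled the publish return the post.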


#12

Hey Kevin, good question – yep, we have a single dyno now with Cloudfront. We had several before, but we suspected that was the cause of 404s on a different blog. Dropping to a single dyno seemed to fix the problem on that blog, so we applied the same setup across the board to all blogs, including the problematic one above.


#13

Small update – about 24 hours later, the two posts that have been 404ing are suddenly responsive. I have not touched the server since those posts, although it does do a daily reboot (happened about an hour ago) and it’s very possible the reboot helped. However, creating a new test post just now still creates a 404.


#14

Small update – about 24 hours later, the two posts that have been 404ing are suddenly responsive. I have not touched the server since those posts, although it does do a daily reboot (happened about an hour ago)

Happy to hear that!

However, creating a new test post just now still creates a 404.

Does it randomly serve a 404 again (after refreshing)? Or were you only getting a 404 right after publishing the test resource?

I was testing a bigger local database today and I was not able to see any 404s 🤷🏻‍♀️


#15

It seemed to 404 immediately after clicking this time around, actually (and from there onward).

Could this be database-specific? For what it’s worth, we’ve been contributing to this database for over a year now (pre-v2) before migrating to v2+. Not sure if that makes a difference.


#16

Update – I upgraded to v2.13.2, same issue. I also noticed that when I mash refresh on a newly published post, the response alternates between a 404 and the actual page, roughly 50% of the time each. This is on the non-CDN URL.
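
For anyone who wants to measure this rather than mash refresh, a small Python sketch along these lines should do it. The function names and the example URL in the comment are placeholders, not anything from our setup.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url):
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is on the exception.
        return err.code


def tally_statuses(url, attempts=20, fetch=fetch_status):
    """Request `url` repeatedly and count how often each status comes back."""
    return Counter(fetch(url) for _ in range(attempts))


# Example call (hypothetical URL; point it at the non-CDN address of a new post):
#   print(tally_statuses("https://blog.example.com/new-post/"))
```

A roughly even split between 200 and 404 would suggest that more than one process is answering, each with its own view of which URLs exist.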

I noticed the changelog for 2.13.2 mentioned Node 10; we’re on 8.9 for this blog. Do you think upgrading to 10 would make a difference?


#17

@zackcreach are you perhaps using this buildpack? https://github.com/cobyism/ghost-on-heroku

I’ve just looked at that because it was the first Google result for “heroku ghost”. Unfortunately it has a serious config issue: it uses node-cluster on dynos where the concurrency (automatically set to the number of processors) is higher than 1. Clustering is not supported, just as scaling up the number of dynos is not supported.
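
As a rough illustration only (this is not the buildpack's code), here is a Python sketch of how a cluster-style launcher might pick its worker count, and the check that matters for Ghost. It assumes the count comes from a `WEB_CONCURRENCY` environment variable with a fallback to the processor count; both function names are made up.

```python
import os


def planned_worker_count(env=None, cpu_count=None):
    """Guess how many workers a cluster-style launcher would start.

    Assumed logic: honour WEB_CONCURRENCY if it is a number,
    otherwise start one worker per processor.
    """
    env = os.environ if env is None else env
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    raw = str(env.get("WEB_CONCURRENCY", "")).strip()
    return max(1, int(raw)) if raw.isdigit() else cpus


def safe_for_in_memory_caches(env=None, cpu_count=None):
    """True only when a single process will serve all requests."""
    return planned_worker_count(env, cpu_count) == 1
```

Calling `safe_for_in_memory_caches()` with no arguments reads the current environment. Anything other than `True` means several processes would each hold their own copy of the in-memory caches.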


#18

That was it! It turns out we were using that exact buildpack and have moved to a simpler (non-clustering) configuration. Everything is responding now as it should. Thank you very much 🙏🏼 @Kevin and @Kate


#19

Cool, glad it’s working and that we got to the bottom of it 😄 I’ve opened an issue on the buildpack repo, hopefully it will get an update soon so others don’t run into the same confusing behaviour.


#20

The ghost-on-heroku buildpack has been updated to remove usage of node-cluster, and the README now contains a warning against using multiple dynos.