What happens as the number of posts increases? Why is memory usage proportional to the post count?


#1

I am running ghost on a small 1-core 1GB server.

When my blog has hundreds of posts, it works well.

But once there are more than 10,000 posts, Ghost consumes most of the machine's memory. If the number reaches 20,000, the server cannot operate any longer.

Is this because ghost is trying to preload posts into memory?

If so, can I configure how many posts to preload?

Thank you for the help!


#2

Which version of Ghost are you running?

If the number reaches 20,000, the server cannot operate any longer.

Could you please describe this behaviour in more detail? Which database, which server, and when exactly does the server stop operating?

Is this because ghost is trying to preload posts into memory?

Yes, we preload resources on bootstrap to generate the blog's URLs for an upcoming feature: dynamic routing. But we only keep a minimal set of fields in memory. This behaviour was added in 1.17.3 so we can catch any unexpected issues with URL preloading early.

You can currently disable URL preloading by putting disableUrlPreload: true in your config file. But this config option will be removed as soon as we ship dynamic routing.
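For anyone reading along, a minimal sketch of what that could look like in the config file. The url value is just a placeholder, and a real config will contain more keys (database, mail, etc.); disableUrlPreload is the only option being discussed here:

```json
{
  "url": "https://example.com",
  "disableUrlPreload": true
}
```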


#3

Hi, thank you for the quick reply!

I am using “Ghost-CLI version: 1.5.2”.

With over 20,000 posts in the MySQL database, memory usage climbs steadily after the Ghost process starts until it reaches 100%. At that point the machine no longer accepts SSH or HTTP connections.

After I truncated the tables “posts” and “posts_tags”, the server is back in service.


#4

And which Ghost version? ghost ls.


#5

Oh “1.21.3”, I didn’t know the command, thanks!


#6

Thanks :slight_smile:

Could you please update your blog to the latest Ghost version (1.22.3) and report back?
I doubt it will make a huge difference, but it would still be good to know how it behaves.


#7

Let me try! I need to reimport my database.


#8

Just curious: if somebody crawls the website, will all the dynamically generated URLs be stored in memory?


#9

The goal is that Ghost keeps knowledge of all resource URLs (for posts, users, pages, tags) in memory at runtime, yes. I'm not sure what you mean by scanning? We don't expose all the URLs anywhere; they are for internal processing.
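Conceptually, the preloaded URL set behaves like an in-memory map from resource id to generated URL. A minimal sketch of the idea (my own illustration, not Ghost's actual implementation; the UrlCache name and slug-based URL format are hypothetical):

```javascript
// Sketch of an in-memory URL cache, NOT Ghost's actual code.
// One entry is kept per resource, so memory grows roughly
// linearly with the number of posts.
class UrlCache {
  constructor() {
    this.urls = new Map(); // resource id -> generated url
  }
  add(resource) {
    // keep only a minimal set of fields: the id and the generated url
    this.urls.set(resource.id, `/${resource.slug}/`);
  }
  get(id) {
    return this.urls.get(id);
  }
}

const cache = new UrlCache();
cache.add({ id: 'abc', slug: 'my-first-post' });
console.log(cache.get('abc')); // -> /my-first-post/
```

This linear growth would match the behaviour described in this thread: the more posts, the larger the resident set after bootstrap.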


#10

After updating, Ghost cannot start and logs the following error:

[2018-04-19 18:18:08] ERROR

NAME: DatabaseIsNotOkError
CODE: DB_NEEDS_MIGRATION
MESSAGE: Migrations are missing. Please run knex-migrator migrate.

I am looking for a solution and will try to fix this before reporting the final result.


#11

Was ghost update successful? It doesn't look like it was. Running ghost setup migrate should fix your state.
The migration can take a little longer when you have 20,000 posts :laughing:
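For completeness, the recovery steps as commands; this is a sketch assuming a standard Ghost-CLI install, run from the Ghost installation directory:

```shell
# Run from the Ghost installation directory
ghost setup migrate   # runs the missing knex-migrator migrations
ghost start           # start Ghost again once migrations finish
```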


#12

Amazing, I couldn't find that command documented anywhere else. It might deserve a place on the “Troubleshooting” page.


#13

Now the server behaves much better: with 20,000 posts, memory is no longer a problem during Ghost's boot phase.

I will keep track of the memory to see how long it can serve.

If all the URLs are kept in memory, I may need to restart Ghost periodically to keep memory usage low.

Thanks for the help!


#14

That is super useful information. Just to double-check: you haven't disabled URL preloading in your config?


#15

I didn't change the config. Both ghost run and ghost start now use less memory.


#16

did nothing to the config

That is good news. It means Ghost is able to fetch all your resources and generate the full set of URLs, and you are currently not running into serious problems.

If you discover any weird behaviour, don’t hesitate to come back here :slight_smile:

This thread was super useful for the Ghost team.


#17

Glad to! As the posts increase every day, I hope this will generate more useful information for the Ghost team.


#18

Great!

Oh, do you mind sharing an export of your blog with me?
It could be useful for performance testing. I will treat your data with respect and only use it for local testing. If you're willing, please send the JSON to kate@ghost.org.

I can understand if you say no :innocent: But I'll give you a hug if you say yes :laughing:


#19

My blog doesn't have much text but contains a lot of pictures. [https://www.visualjoyce.com/]
The URL for each post is relatively long.

Exporting 20,000+ records via the API ends in a 502 Bad Gateway.


#20

A backup of your database should be in content/data.

I'm not sure why you are getting a 502 on export. The exporter is a labs feature and was never designed to be performant. But as said, you should find backups of your database in your content folder :dizzy:
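For example, with a default Ghost-CLI layout you could check for those backups like this (the path is illustrative and may differ per install):

```shell
# From the Ghost installation directory; layout may vary per install
ls content/data/
```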