Data migration tool > scrape full site for JSON import data

Not sure if I am asking for help from the current devs or just ‘a’ dev who might want to take on a plugin or helper app, but here goes…

I have been doing CMS sites for 20+ years now (my first one being a PostNuke-derived network of about 10 sites), and recently (2020) I archived as many of those as I could to HTML and started anew with a rather shabby app that was sort of a BuzzFeed ripoff. What can I say, I was hopeful I could reform it into something I needed. It’s a Laravel-based resource hog and is killing my Lightsail instances pretty badly atm. Since turning to Discourse for my new forum in 2020, I have had nothing but respect for their work, and I feel your project is in a similar vein development-wise. So I am looking at moving my sites to Ghost so that I can be on a platform I don’t have to mess with and that will get the updates it needs moving forward.

Apologies for all this preamble. Here is the crux of my issue. I need to be able to migrate some or all of my articles from my existing sites over to their new Ghost-powered ones. There is no JSON export feature sadly and I simply don’t have the time right now to try to code anything exporting data directly from the database to make a Ghost mobiledoc file, etc.

What I am hoping to find is someone who is interested in making a solution for people in my circumstance. I have looked at applications like ParseHub and some others, but maybe I am missing something: they don’t seem to want to crawl a whole site, only to be pointed at a page and go page by page. There doesn’t seem to be anything out there that will scrape a whole site and turn the articles into JSON/mobiledoc data that Ghost can import. At least not that I have found in several days of looking.
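To make the ask a bit more concrete, here is a rough Python sketch of the kind of tool I have in mind: follow every same-host link from a start URL, then wrap each page in a Ghost-style import file. Everything in it is an assumption for illustration. The function names (`crawl`, `to_ghost_import`, `slugify`) are made up, the JSON layout (`meta`, `data.posts`, a mobiledoc string holding a single HTML card) is my reading of Ghost's import format and should be checked against the Ghost version in use, and it stores the whole page HTML where a real tool would first pull out just the article body.

```python
import json
import re
import time
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects the <title> text and every <a href> found on a page."""

    def __init__(self):
        super().__init__()
        self.links, self.title, self._in_title = [], "", False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url):
    """Default fetcher; any callable that returns HTML for a URL will do."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetcher=fetch, limit=500):
    """Breadth-first crawl of same-host links; returns {url: (title, html)}."""
    host = urlparse(start_url).netloc
    queue, seen, pages = [start_url], {start_url}, {}
    while queue and len(pages) < limit:
        url = queue.pop(0)
        try:
            html = fetcher(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = PageParser()
        parser.feed(html)
        pages[url] = (parser.title.strip(), html)
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def slugify(text):
    """Lowercase the text and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def to_ghost_import(pages, version="5.0.0"):
    """Wrap crawled pages as draft posts (assumed Ghost import layout)."""
    posts = []
    for url, (title, html) in pages.items():
        mobiledoc = {  # one HTML card holding the page markup (assumption)
            "version": "0.3.1", "atoms": [], "markups": [],
            "cards": [["html", {"cardName": "html", "html": html}]],
            "sections": [[10, 0]],
        }
        posts.append({
            "title": title or url,
            "slug": slugify(title or urlparse(url).path),
            "mobiledoc": json.dumps(mobiledoc),  # stored as a JSON string
            "status": "draft",
        })
    return {"meta": {"exported_on": int(time.time() * 1000), "version": version},
            "data": {"posts": posts}}
```

Running something like `json.dump(to_ghost_import(crawl("https://example.com/")), open("import.json", "w"))` would then produce a file to try in Ghost's importer. Posts are created as drafts on purpose so nothing goes live before it has been checked.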

I would be willing to pay for help with this. And obviously if someone develops a tool like this I think it has merit.

  • How was Ghost installed and configured? - AWS Lightsail instance
  • What Node version, database, OS & browser are you using? - Bitnami
  • What errors or information do you see in the console? - None, the site is working
  • What steps could someone else take to reproduce the issue you’re having? - have 20 years of web site archive to deal with :D