Tool to find/replace in the database

I thought I might a recall a Ghost CLI tool that would allow an easy syntax to find/replace content in the database.

I realize I can use the MySQL CLI and a SQL “UPDATE” statement for this, but I thought I might ask there might be a pre-made find/replace tool for Ghost.

My use-case: Images being moved from one CDN to another.

Thanks!

Ref:

Hey @markstos doing find and replace in the database isn’t recommended, as none of the internal events for updating related things will fire.

There is a script that uses the API to find and replace content in posts here, it needs a minor update to fix versioning but should work for your purposes:

Note: You may need to do similar updates to user, tag and settings images.

Would be pretty cool to have a full script for find-replacing all image paths and add it to the demo repo :slight_smile:

1 Like

Thanks @Hannah ! I’ll that approach and share what I come up with.

I have a question about the image resizing that Ghost does. Some of the URLs I need to replace have been resized by a third-party CDN. An example URL structure looks like:

https://cdn.com/old-domain.com/wp-content/uploads/2016/11/some.jpg?resize=580

The old domain and the new Ghost blog are hosted on the same server, so it’s easy enough to copy all those photos to a parallel photo structure, so let’s assume that part is taken care of. My question is about the auto-resizing that Ghost does. I presume the correct replacement here would be:

GHOST_URL/content/images/size/w600/2016/11/some.jpg

But how is the “w600” calculated? I presume it would be based on the sizes found in the package.json of my current theme, Headline, but that doesn’t have a “600” width there:

I don’t mind doing the math/calculations myself, but I’m not clear if I also needs to tell Ghost to generate multiple sizes of the image or if they are made on-demand (my assumption), and I don’ t know how to find the value sizes put in the URL as the width like “/w600/”, or if I can just exactly copy the widths from the old CDN URLs. I believe the widths embedded in the old URLs were based on the old blog and design, so they aren’t necessily the ideal widths to use with Ghost.

Ghost does all size generation on demand. There’s both the theme image sizes and some internal ones for content so that’s why not all of them are in the theme.

1 Like

Thanks. Do I need to update the html property of the post or just the mobiledoc?

I read the Ghost Admin API docs, but it doesn’t provide guidance here. It mentions that “only writable fields” can be updated on a post, but which fields are writable is not documented.

The html field is read-only because it’s generated from the mobiledoc field contents any time a post is saved. As long as you’re updating posts via the API you don’t need to touch it :slightly_smiling_face:

1 Like

Oh, awesome. I’m loving having this Admin API.

2 Likes

@Hannah Here’s the code I ended up using, first on a dev instance and then also with some other debugging output to be sure it worked as expected.

I tried to simplify it further by distributing it as a Deno script. That would save the step of requiring npm install for a one-off script. But that failed due to un-merged TypeScript compatibility patch:


/*

  Goal is replace URLS like
    https://i0.wp.com/old.com/wp-content/uploads/2016/11/Lunch-at-the-park.jpg?resize=580,435
  With URLS like:
    https://new.com/content/images/2016/11/Lunch-at-the-park.jpg

 // Create a custom integration in admin area to get an admin key
 env GHOST_ADMIN_API_KEY='XXX' GHOST_URL='https://dev.stosberg.com' node ./fix-image-urls.mjs

*/


const url = process.env.GHOST_URL;
const key = process.env.GHOST_ADMIN_API_KEY;
const goodRegex = new RegExp(`${url}/content/images`);
const badRegex = new RegExp('i[01234]\.wp\.com');

// This prefix is relative to the i*.wp.com domains
const badImagePrefix = new RegExp('/old.com/wp-content/uploads');
const newImagePrefix = `${url}/content/images`;

import Promise from 'bluebird';
import GhostAdminAPI from '@tryghost/admin-api';
import getUrls from 'get-urls';
const api = new GhostAdminAPI({
    url,
    key,
    version: 'v2.0'
});

// Admin API automatically includes tags and authors
// WARNING: If the site is really big (1000s of posts) maybe do this paginated
const allPosts = await api.posts.browse({limit: 'all'});

let postsCnt = 0;
let badUrlsCnt = 0;

let allBadUrls = [];

// convert our list of posts, to a list of promises for requests to the API
const result = await Promise.mapSeries(allPosts, async (post) => {

  const allUrls = Array.from(getUrls(post.mobiledoc));
  const goodUrls = allUrls.filter(url => goodRegex.test(url));
  const badUrls = allUrls.filter(url => badRegex.test(url));

  let newMobileDoc = post.mobiledoc;

  if (badUrls.length) {
    postsCnt += 1;
    badUrlsCnt += badUrls.length;
    allBadUrls = allBadUrls.concat(badUrls);
    
    // This step gets rid of query string as well as fixes the prefix
    const fixedUrls = badUrls.map(badUrl => {
      const parts = new URL(badUrl);
      const fixedUrl = `${newImagePrefix}${parts.pathname.replace(badImagePrefix, '')}`;

      newMobileDoc = newMobileDoc.replace(badUrl,fixedUrl);
      return fixedUrl;
    });
    console.log(`fixing ${post.slug}`, badUrls, fixedUrls);
    await api.posts.edit({
      id: post.id, 
      mobiledoc: newMobileDoc, 
      updated_at: post.updated_at
    });
  }
});
console.log(`Summary: Fixed ${postsCnt} posts with ${badUrlsCnt} bad URLS`);