WordPress importer mangles URL

I just imported a site from WordPress into a new (well, newly up-to-date, after deleting all content) Ghost install. My WordPress site is a multisite, so the image URLs contain a ‘sites/N’ segment in their path, where N is an integer. After importing, images weren’t loading, so I looked in the Ghost logs and found:

[2023-04-14 14:07:33] INFO "GET /content/images/wordpress/sites/2/2023/01/2FF27314-FF46-40EE-8C6E-805FE408654C_1_105_c.jpeg" 404 8ms
[2023-04-14 14:07:33] INFO "GET /content/images/wordpress/sites/2/2023/01/074BB98A-F7BA-4900-937D-6CCDFC795C7C_1_105_c.jpeg" 404 3ms

The URLs are EXACTLY what I expect from a WordPress MU setup. So why the 404? I searched for one of the filenames to see where the importer had put them:

find /data -name 2FF27314-FF46-40EE-8C6E-805FE408654C_1_105_c.jpeg


Aha, there’s the problem. Somehow (I’m guessing a regex), ‘sites/2/2023’ got munged into ‘023’. The fix wasn’t very hard: cd /data/matt.simerson.net/content/images/wordpress/; mkdir -p sites/2; mv * sites/2; and then rename each folder to restore its missing 2. This should be a straightforward bug to fix.
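For anyone hitting the same thing, here is a rough sketch of that manual fix as a shell function. The function name restore_multisite_dirs is made up for this post, and it assumes the mangled year folders are three-digit names like ‘023’ sitting directly under the wordpress images directory, each missing only a leading 2. Check your own layout before running anything like this.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ghost or its importer).
# Assumption: mangled year folders are three digits (e.g. 023) and each
# is missing only a leading "2" (so 023 should become 2023).
restore_multisite_dirs() {
    root=$1      # the wordpress images directory
    site_id=$2   # the multisite ID, i.e. the N in sites/N

    # Recreate the sites/N directory the image URLs expect.
    mkdir -p "$root/sites/$site_id" || return 1

    for dir in "$root"/[0-9][0-9][0-9]; do
        # If the glob matched nothing, $dir is the literal pattern; skip it.
        [ -d "$dir" ] || continue
        name=${dir##*/}
        # Move the folder under sites/N and restore the missing leading 2.
        mv "$dir" "$root/sites/$site_id/2$name"
    done
}
```

In my case the call would be: restore_multisite_dirs /data/matt.simerson.net/content/images/wordpress 2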