Bookmark Author wrong

When inserting a bookmark for a product page such as “Coast to Coast Expedition across New Zealand | Much Better Adventures” into our Ghost magazine (see the Ghost post “The 303km Coast to Coast Trail Across the South Island”), Ghost appears to take the “author” name from the alt text of an ATOL logo image buried in the footer. There are no author meta tags in the HTML at all, and nothing author-related around the ATOL logo image that the string is being fetched from. There are other images with alt text all over the page… I’m a bit stumped!

What’s going on?

Note: I’ve hidden the author using CSS: .kg-bookmark-author { display:none; } but it is still in the HTML.

Ghost uses https://metascraper.js.org/ to handle metadata extraction for bookmarks.

The metascraper library has a fallback mechanism that tries several different paths to extract the data when explicit metadata doesn’t exist (as is the case for a very large number of sites), so that the data is useful for the majority of sites. As always on the web, though, it’s impossible to account for the infinite ways that data can be presented, so it can’t be 100% correct all of the time.

You can see here the rules it uses for author extraction. If you see something there that you think is obviously wrong, it’s worth opening a bug with that project.

Ah. This is actually the publisher metadata. The author/publisher class names are swapped due to a change in ordering a couple of major versions back but had to be kept for backwards compatibility.

I believe it’s this rule that is picking up the ATOL logo image’s alt text as the publisher, because nothing earlier in the page was detected as a better source for the publisher.

Rule:

toPublisher($ => $('[class*="logo" i] a img[alt]').attr('alt')),

Matching page content:

<div class="mui-y6w8e4-logos">
  <a href="https://help.muchbetteradventures.com/financial-protection-with-abtot-abta-atol" target="_blank" rel="noreferrer">
    <img alt="ATOL logo" ...
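
To illustrate why the logo alt text wins here, this is a minimal sketch of a first-match-wins rule chain. It is not metascraper's actual code: the page is modelled as a plain object rather than parsed HTML so the sketch runs without dependencies, and the names publisherRules, firstMatch, logoLinkImageAlts, and the site name value are all hypothetical. The only assumption carried over from the thread is that an og:site_name value, when present, is considered before the logo fallback.

```javascript
// Hypothetical, simplified model of an ordered fallback chain.
// Each rule returns a candidate string or undefined; the first
// non-empty result is used and later rules are never consulted.
const publisherRules = [
  // Explicit metadata is tried first (assumed ordering).
  page => page.meta['og:site_name'],
  // Fallback: alt text of an image inside a link inside a "logo" element,
  // standing in for the '[class*="logo" i] a img[alt]' selector above.
  page => page.logoLinkImageAlts[0],
];

function firstMatch(rules, page) {
  for (const rule of rules) {
    const value = rule(page);
    if (typeof value === 'string' && value.trim() !== '') {
      return value.trim();
    }
  }
  return null; // no rule produced a usable value
}

// The page as described in the thread: no explicit publisher metadata.
const pageBefore = { meta: {}, logoLinkImageAlts: ['ATOL logo'] };

// The same page with an og:site_name tag added (value is an assumption).
const pageAfter = {
  meta: { 'og:site_name': 'Much Better Adventures' },
  logoLinkImageAlts: ['ATOL logo'],
};

console.log(firstMatch(publisherRules, pageBefore)); // ATOL logo
console.log(firstMatch(publisherRules, pageAfter));  // Much Better Adventures
```

With no explicit metadata the chain falls through to the logo rule, which is why adding a higher-priority source should change the result without touching the footer markup.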

Ah, great spot, because the author one was eluding me! I’ll add the og:site_name meta tag, which should sort this, and keep it hidden in the CSS anyway for my purposes.

Thanks for the quick response.