Does Ghost support indexing gated content?

I like the idea of having everything in one place without the need for constant integrations, however, I do have one concern that is preventing me from making the switch.

Gated content without full indexing is really not ideal; it would cause more harm than good, especially for a publication with decent SERP authority. We currently use a system (without name-dropping) that allows all articles to be fully indexed: it detects search crawlers and disables the wall for them so the full text can be indexed.

Another feature we would miss is the ability to allow a number of free articles before asking the user to register.

Are there any publications that have had success using Ghost with restricted indexing?


The Ghost Membership feature was designed to make it impossible to access premium content without being logged in. I think your question, along with some alternatives, is answered in this topic:

Yes, it’s a shame and it will not work for us sadly.

I may consider Ghost in the future for spin-off projects.

Thanks.

One of the benefits of protected content in Ghost is that it’s truly protected. There’s no way for visitors (or bots) to disable JS/CSS and access the content - but it’s still possible to select a content access level for each individual piece of content. We also released a public preview feature recently, which allows you to protect part of a post and give the search engines more content to crawl on the page.

> We currently use a system (without name dropping) that allows all articles to be fully indexed. It detects all search crawlers and disables the wall for them to allow a full index.

You might want to check that the system you’re using has proper structured data markup in this case, to make sure the SERPs don’t think you’re cloaking. More info here: Feature request: members-only content need appropriate schema markup


No cloaking, just a simple whitelist for search-engine bots. It's the same system Newsweek uses. Some systems are better than others: take Bloomberg's wall, which is awful; you can bypass it with just two lines of CSS in the inspector.

A public preview could be a good strategy, yes.


To give search crawlers full access, you can add a middleware function that detects the crawlers (Google explains the verification technique in its Advanced SEO documentation).
If you self-host (or have someone host it for you), you can add it to the core code (the Express site app) and let verified crawlers access the site without restriction. I did a quick-and-dirty test by creating a member assigned to the Google crawler and setting the member session (MemberSSR) to that member.

Be careful not to check only the User-Agent; verify the IPs too (two DNS lookups are needed, as per Google).
The important part of the code:

```javascript
try {
    // ... crawler verification (User-Agent + DNS checks) happens above ...
    const api = await ssr._getMembersApi();
    const email = 'GoogleCrawler@mysite.com'; // email of the member I created for the crawler
    const member = await api.getMemberIdentityData(email);
    Object.assign(req, {member});
    res.locals.member = req.member;
    next();
} catch (err) {
    logging.error(err);
    Object.assign(req, {member: null});
    next();
}
```

Also make sure you tell the crawler not to cache the content (so that users cannot read the whole post from Google's cache). 'noarchive' is the directive to use, explained here
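One way to send that directive is an `X-Robots-Tag` response header. A minimal sketch, assuming a `req.isVerifiedCrawler` flag set by the crawler check upstream (that flag name is made up, not part of Ghost):

```javascript
// Express-style middleware: when serving the full post to a verified
// crawler, also send X-Robots-Tag so Google indexes the page but keeps
// no cached copy users could read for free.
function noArchiveForCrawlers(req, res, next) {
    if (req.isVerifiedCrawler) {
        res.set('X-Robots-Tag', 'noarchive');
    }
    next();
}
```

The header approach saves touching the theme; the equivalent meta tag in the page head works too.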

I can provide more explanation if needed


Be careful with that because you can be penalised heavily for cloaking. Cloaking is anything which makes your site appear different to Google than it does to a regular user. Detecting Google’s user agents and serving different content to them than to your regular visitors absolutely does that.

The only way around the cloaking problem is to make the full content public (so users can find a way to access it for free if they know how) and use structured markup to tell Google the content is paywalled. Ghost is not built with that in mind - members' content is strictly accessible to those members - but you do have the ability to show as much of the content as you want as a free preview, and that will be indexed like a typical page.


Thanks Kevin for pointing that out :+1:

Yes, and to prevent Google from treating this as cloaking, this guide must be followed. As Google puts it: "This structured data helps Google differentiate paywalled content from the practice of cloaking, which violates our guidelines."

So a combination of letting Googlebot access the site freely and presenting proper JSON-LD, as explained by Google, would be the way to do it, if you're willing to do a bit of coding and modify the core.
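For reference, Google's documented markup for paywalled content looks like the object below. A minimal sketch: the helper name and the `.paywall` selector are assumptions; the selector must match whatever element wraps the gated part of the post in your theme.

```javascript
// Build the JSON-LD object Google documents for paywalled content:
// isAccessibleForFree tells Google the article is gated, and hasPart
// points at the CSS selector wrapping the gated section.
function paywalledArticleSchema(headline, cssSelector) {
    return {
        '@context': 'https://schema.org',
        '@type': 'NewsArticle',
        headline,
        isAccessibleForFree: 'False',
        hasPart: {
            '@type': 'WebPageElement',
            isAccessibleForFree: 'False',
            cssSelector
        }
    };
}
```

Serialize the result with `JSON.stringify` into a `<script type="application/ld+json">` tag in the page head.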

But as Kevin said above, this is not how Ghost is architected right now, which is why the technique above is a bit of a hack and comes with its own risks.

Excellent, thank you kindly. :+1:

Here's a clunky and primitive idea:

Create two collections, one being the "official" blog/publication, where articles are paywalled.

A second collection would have all articles free and would be marked canonical.

Search-referred users get to read that one article for free, but from that point on, all links on the page lead to the "official" paywalled blog. The non-paywalled collection would be essentially invisible to most users, except for bots and anyone who figures out the scheme.

Is this any good?
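For concreteness, the two collections could be sketched in Ghost's routes.yaml, splitting posts by tag. A rough sketch only: the route paths and tag names (`paid`, `free`) are made up, and the canonical links would still need handling in the theme.

```yaml
collections:
  /blog/:
    permalink: /blog/{slug}/
    filter: tag:paid
  /free/:
    permalink: /free/{slug}/
    filter: tag:free
```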