I like the idea of having everything in one place without the need for constant integrations. However, I do have one concern that is preventing me from making the switch.
Gated content without full indexing is really not ideal; it would cause more harm than good, especially for a publication with decent SERP authority. We currently use a system (without name-dropping) that allows all articles to be fully indexed: it detects search crawlers and disables the wall for them to allow a full index.
Another feature we would miss is the ability to allow a number of free articles before asking the user to register.
Are there any publications that have had success using Ghost with restricted indexing?
The Ghost Membership feature was designed to make it impossible to access premium content without being logged in. I think your question, and some alternatives, are answered in this topic:
One of the benefits of protected content in Ghost is that it's truly protected. There's no way for visitors (or bots) to disable JS/CSS and access the content, but it's still possible to select a content access level for each individual piece of content. We also recently released a public preview feature, which allows you to protect part of a post and give search engines more content to crawl on the page.
We currently use a system (without name dropping) that allows all articles to be fully indexed. It detects all search crawlers and disables the wall for them to allow a full index.
No cloaking, just a simple whitelist for search-engine bots. It's the same system that Newsweek uses. Some systems are better than others: take Bloomberg's wall, which is awful; you can bypass it with just two lines of CSS in the inspector.
As for allowing search crawlers full access: you can enable this with a middleware function that detects the crawlers (Google explains the technique under its Advanced SEO documentation).
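A minimal sketch of what such a detection middleware could look like. The bot list, the `req.isSearchCrawler` flag, and the function names are my own assumptions for illustration, not part of Ghost's API:

```javascript
// Sketch: flag known search crawlers by User-Agent string.
// The pattern below is illustrative and not exhaustive.
const CRAWLER_PATTERN = /googlebot|bingbot|duckduckbot|yandexbot|baiduspider/i;

function isSearchCrawler(userAgent) {
  return CRAWLER_PATTERN.test(userAgent || '');
}

// Express middleware: mark the request so later code can lift the wall for bots.
function detectCrawler(req, res, next) {
  req.isSearchCrawler = isSearchCrawler(req.get('User-Agent'));
  next();
}
```

On its own, a User-Agent check is trivially spoofable, which is exactly why the IP verification mentioned below matters.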
If you self-host (or have someone host it for you), you can add it to the core code (the Express site app) and let the crawlers access the site without restriction. (I did a quick-and-dirty test by creating a member assigned to the Google crawler and setting the member SSR session to that member.)
Be careful not to check only the User-Agent, but also verify the IPs (two DNS lookups are needed, as per Google).
The important part of the code:
try {
    // .....
    const api = await ssr._getMembersApi();
    const email = 'GoogleCrawler@mysite.com'; // email assigned to the member I created
    const member = await api.getMemberIdentityData(email);
    Object.assign(req, {member});
    res.locals.member = req.member;
    next();
} catch (err) {
    logging.error(err);
    Object.assign(req, {member: null});
    next();
}
Also make sure you tell the crawler not to cache the content (so that users cannot read the whole post from Google's cache). 'noarchive' is the keyword to use, and it is explained here
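The `noarchive` directive can be sent either as a `<meta name="robots">` tag or, more conveniently from middleware, as the `X-Robots-Tag` HTTP header, which Google also honours. A sketch (the middleware name is mine):

```javascript
// Sketch: send the noarchive directive as an HTTP response header so Google
// indexes the page but does not offer a cached copy. Equivalent to putting
// <meta name="robots" content="noarchive"> in the page head.
function noArchive(req, res, next) {
  res.set('X-Robots-Tag', 'noarchive');
  next();
}
```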
Be careful with that, because you can be penalised heavily for cloaking. Cloaking is anything that makes your site appear different to Google than it does to a regular user. Detecting Google's user agents and serving them different content than your regular visitors see absolutely does that.
The only way around the cloaking problem is to make the full content public (so users can find a way to access it for free if they know how) and use structured markup to indicate to Google that the content is paywalled. Ghost is not built with that in mind: members' content is strictly accessible by those members. But you do have the ability to show as much of the content as you want as a free preview, and that will be indexed as a typical page.
Yes, and in order to prevent Google from thinking this is cloaking, this guide must be followed. As Google puts it: "This structured data helps Google differentiate paywalled content from the practice of cloaking, which violates our guidelines."
So a combination of letting Googlebot access the site freely and presenting proper JSON-LD, as explained by Google, would be the way to do it, if one were willing to do a bit of coding and modify the core.
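Roughly what that JSON-LD looks like, following Google's paywalled-content structured data guide. The headline and the `.paywall-content` selector are placeholders; the selector must match whatever element actually wraps the gated part of your article:

```javascript
// Sketch of the structured data Google describes for paywalled content,
// built as a JS object and serialised for a <script type="application/ld+json"> tag.
const paywalledArticleLd = {
  '@context': 'https://schema.org',
  '@type': 'NewsArticle',
  headline: 'Example headline',        // placeholder
  isAccessibleForFree: false,          // tells Google the article is gated
  hasPart: {
    '@type': 'WebPageElement',
    isAccessibleForFree: false,
    cssSelector: '.paywall-content',   // placeholder: the gated section of the page
  },
};

const jsonLd = JSON.stringify(paywalledArticleLd);
```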
But as Kevin said above, this is not how Ghost is architected as of now; that's why the above technique is a bit of a hack and comes with its own risks.
Another idea: creating two collections, one being the "official" blog/publication, where articles would be paywalled.
A second collection would have all articles free and would be canonical.
Search-referred users get to read that one article for free, but from that point on, all the links on the page would lead to the "official" paywalled blog. The non-paywalled collection would essentially be invisible to most users, except for bots and anyone who figures out the scheme.
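A rough, untested sketch of how the two collections might be wired up in Ghost's `routes.yaml`, assuming a hypothetical `free-mirror` tag marks the free duplicates:

```yaml
collections:
  # The "official", paywalled publication
  /:
    permalink: /{slug}/
    filter: tag:-free-mirror
  # The free, canonical mirror (not linked from normal navigation)
  /free/:
    permalink: /free/{slug}/
    filter: tag:free-mirror
```

You would still need to maintain the duplicate posts, set the canonical URLs, and accept the risk that regular users eventually discover the free collection.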
When we were on WordPress, we used Leaky Paywall (a metered paywall), and it made a difference in both sign-ups and Google results.
A month or so ago, I asked if someone would write code for that for our Ghost site. Eric from Layered Craft reached out, and proceeded to create a metered paywall for Ghost, and it works great. I would reach out to him about purchasing his solution.
If you want to see it in action, go to our site (ForwardKY.com) and open up four stories in a row. You should get the metered paywall message.
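For anyone curious about the general mechanism (this is NOT Eric's actual implementation, just a client-side sketch of the metered idea): a counter tracks how many articles a visitor has read, and past a limit the wall appears. The limit, key name, and function names below are all assumptions, and anything purely client-side is trivially bypassable:

```javascript
// Client-side sketch of a metered paywall counter. A storage object
// (e.g. window.localStorage) remembers which article slugs were read.
const FREE_ARTICLE_LIMIT = 3; // assumption: three free reads before the wall

function recordView(storage, slug) {
  const seen = JSON.parse(storage.getItem('readArticles') || '[]');
  if (!seen.includes(slug)) {
    seen.push(slug);
    storage.setItem('readArticles', JSON.stringify(seen));
  }
  return seen.length;
}

function shouldShowPaywall(storage) {
  const seen = JSON.parse(storage.getItem('readArticles') || '[]');
  return seen.length > FREE_ARTICLE_LIMIT;
}
```

With a limit of three, the fourth distinct article triggers the wall, matching the "open four stories in a row" behaviour described above.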