Wrong robots.txt

This problem appeared suddenly. I only noticed it because Google Search Console shows that my site has not been crawled since October 23rd.

robots.txt in the template root is:

User-agent: *
Sitemap: https://domain.it/sitemap.xml
Disallow: /ghost/
Disallow: /p/

It has always worked flawlessly.

Now if I try to connect to https://domain.it/robots.txt I see:

User-agent: *
Disallow: /

I tried both incognito and through a browser that I have never used.

The site is not set to private. I initially ran a “Purge Everything” on Cloudflare, and I have now disabled its caching completely.

I don’t understand why Ghost is generating a robots.txt with Disallow: /

Ghost version: 5.20.0

After 9 hours I double-checked everything. The robots.txt in the root of the template is correct, and Cloudflare is set to Development Mode, so its cache is bypassed.

The robots.txt served by Ghost is still:

User-agent: *
Disallow: /

I updated to version 5.21.0, and the problem persists.

I have also noticed a lot of pages in my Google Search Console that are blocked by robots.txt. I am new to this type of issue and was reading what you were discussing. Quite a few of my pages are blocked. You seem to know a lot about the subject. What should I do, in layman’s terms? Thanks.

For nearly all use cases, the default robots.txt file served by Ghost will work and you won’t need a custom one.

Learn more: Uploading a custom robots.txt file — Ghost Help Center

@giacomosilli @DonnieBoy Can you share your site URL?

The problem is that I didn’t put one in. Many of my pages appear to be blocked and will not be indexed by Google, which reports them as blocked by robots.txt. But I never added one! I use Ghost with Reiro, a third-party template from Fueko. I’m at a loss as to why this is happening. Any suggestions on where to look or how to fix it? I am something of a novice. I programmed long ago, back in the legacy days. I have learned a great deal since, but I’m still a long way from where I should be.

Don Schniepp
donaldschniepp@gmail.com
brainiacsbest.com
(A Product review site)

I don’t believe Ghost produced your robots.txt. If it did, it’d look something like this:

User-agent: *
Sitemap: https://yoursite.com/sitemap.xml
Disallow: /ghost/
Disallow: /p/
Disallow: /email/
Disallow: /r/

However, this is yours, which redirects to Twitter!


# Google Search Engine Robot
# ==========================
User-agent: Googlebot

Allow: /*?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /i/api/
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid

Allow: /*?ref_src=
Allow: /*?src=
Disallow: /*?
Disallow: /*/followers
Disallow: /*/following

Disallow: /account/deactivated
Disallow: /settings/deactivated

# Yahoo! Search Engine Robot
# ==========================
User-Agent: Slurp

Allow: /*?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /i/api/
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid

Disallow: /*?
Disallow: /*/followers
Disallow: /*/following

Disallow: /account/deactivated
Disallow: /settings/deactivated

# Yandex Search Engine Robot
# ==========================
User-agent: Yandex

Allow: /*?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /i/api/
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid

Disallow: /*?
Disallow: /*/followers
Disallow: /*/following

Disallow: /account/deactivated
Disallow: /settings/deactivated

# Microsoft Search Engine Robot
# =============================
User-Agent: msnbot
Allow: /*?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid

Disallow: /*?
Disallow: /*/followers
Disallow: /*/following

Disallow: /account/deactivated
Disallow: /settings/deactivated

# Bing Search Engine Robot
# ========================
User-Agent: bingbot
Allow: /*?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid

Disallow: /*?
Disallow: /*/followers
Disallow: /*/following

Disallow: /account/deactivated
Disallow: /settings/deactivated

# Every bot that might possibly read and respect this file
# ========================================================
User-agent: *
Allow: /*?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /i/api/
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid

Disallow: /*?
Disallow: /*/followers
Disallow: /*/following

Disallow: /account/deactivated
Disallow: /settings/deactivated

Disallow: /oauth
Disallow: /1/oauth

Disallow: /i/streams
Disallow: /i/hello

# Wait 1 second between successive requests. See ONBOARD-2698 for details.
Crawl-delay: 1

# Independent of user agent. Links in the sitemap are full URLs using https:// and need to match
# the protocol of the sitemap.
Sitemap: https://twitter.com/sitemap.xml

Update: It looks good now.


User-agent: *
Sitemap: https://www.brainiacsbest.com/sitemap.xml
Disallow: /ghost/
Disallow: /p/
Disallow: /email/
Disallow: /r/

Can I remove this and will it hurt anything if I do?

In Reiro from Fueko, the right side panel has fields for Facebook and Twitter where you can add images and other info. I always fill those out, along with the metadata. Is there something else I should or should not be doing? I have not added any code. I only enter information from my articles and format the wording. That is it. I don’t add canonical URLs or any code.

Don Schniepp
donaldschniepp@gmail.

There’s nothing wrong with your site now. Whatever happened has cleared.

I checked that it’s not blocking resources for /. You can use an online tool, too, e.g., robots.txt Validator and Testing Tool | TechnicalSEO.com.
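
If you would rather check locally, here is a minimal sketch using Python's standard urllib.robotparser. It compares the two robots.txt bodies quoted earlier in this thread. The helper name is_allowed and the "Googlebot" user agent are my own choices for illustration, and Python's parser is simpler than Google's (no wildcard support, for example), so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt bodies quoted in this thread.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

GHOST_DEFAULT = """\
User-agent: *
Sitemap: https://yoursite.com/sitemap.xml
Disallow: /ghost/
Disallow: /p/
Disallow: /email/
Disallow: /r/
"""


def is_allowed(robots_txt: str, path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if user_agent may fetch path under the given rules.

    is_allowed is a hypothetical helper, not part of Ghost or any SEO tool.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for name, body in [("block-all", BLOCK_ALL), ("Ghost default", GHOST_DEFAULT)]:
        for path in ["/", "/ghost/"]:
            print(f"{name}: {path} allowed = {is_allowed(body, path)}")
```

With the "Disallow: /" body every path is refused, while the default body only refuses the admin, preview, email and redirect paths.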

Thanks, I’ll check it out.
Don Schniepp
donaldschniepp@gmail.com

Hey,
Do I have to add a robots.txt to my Ghost website separately, or is it added automatically?
I ask because it currently shows this:
www.robotstxt.org/
User-agent: *
Sitemap:https://getmanifest.ai/sitemap.xml
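
As a rough check of what that file means for crawlers, here is a small sketch with Python's standard urllib.robotparser. It assumes the file is exactly the three lines pasted above. The "Googlebot" user agent is only an example, and site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the served file is exactly what was pasted in the post above.
PASTED = """\
www.robotstxt.org/
User-agent: *
Sitemap:https://getmanifest.ai/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(PASTED.splitlines())

# No Disallow rule is present, so this parser treats every path as fetchable.
homepage_allowed = parser.can_fetch("Googlebot", "/")

# Sitemap URLs declared in the file (site_maps() returns None if there are none).
sitemaps = parser.site_maps()

print("homepage allowed:", homepage_allowed)
print("sitemaps:", sitemaps)
```

Under that assumption the file has no Disallow lines, so it does not block anything. It only points crawlers at the sitemap.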