Hi everyone,
I’m facing an issue where Google and Bing are not crawling my site. Here’s the situation:
- Search Console and Bing Webmaster Tools: I added my site on December 8, 2024, and successfully verified ownership.
- Sitemap and robots.txt: The sitemap has been submitted correctly and is accessible (I’ve checked manually), and the robots.txt file is properly configured with no rules blocking crawlers (see the sketch after this list for one way to verify that).
- Server logs: I’ve checked the server logs, and there are no errors or blocked requests from crawlers.
- Server performance: The site is hosted on a VPS with good performance. There are no significant slowdowns or downtime.
- Theme in use: I’m using the Aspect theme and haven’t made any changes that could block crawlers.
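For anyone who wants to double-check the robots.txt point, here is a minimal sketch of the kind of verification I mean, using only Python’s standard library (the site URL and user-agent names are simply the ones relevant to my case):

```python
# Ask robots.txt whether the major crawlers may fetch the home page and
# the sitemap. urllib.robotparser applies the same Allow/Disallow rules
# a well-behaved crawler would.
from urllib.robotparser import RobotFileParser

SITE = "https://whatisnumerology.com"

rp = RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()  # fetches and parses the live robots.txt

for agent in ("Googlebot", "Bingbot"):
    print(agent, "allowed on /:", rp.can_fetch(agent, f"{SITE}/"))
    print(agent, "allowed on sitemap:", rp.can_fetch(agent, f"{SITE}/sitemap.xml"))
```

If robots.txt is really open, all four checks should print True.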
Despite everything being set up correctly, Google and Bing are not crawling my site. Over a month has passed, and I can’t figure out what’s preventing the crawlers from accessing it.
Questions:
- Are there any specific settings in Ghost or the Aspect theme that I might have overlooked?
- Could there be external factors or penalties affecting my site?
- What further steps would you recommend to identify and resolve the issue?
https://whatisnumerology.com/sitemap.xml
https://whatisnumerology.com/robots.txt
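In case it’s useful, this is roughly how I spot-checked the sitemap beyond loading it in a browser. Ghost serves /sitemap.xml as an index of child sitemaps (sitemap-pages.xml, sitemap-posts.xml, and so on), so the sketch just walks the index and confirms each child responds with HTTP 200 (it assumes a standard sitemap index and that the default Python user agent isn’t blocked):

```python
# Parse the sitemap index and confirm each child sitemap is reachable.
import urllib.request
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
INDEX = "https://whatisnumerology.com/sitemap.xml"

with urllib.request.urlopen(INDEX) as resp:
    tree = ET.parse(resp)

for loc in tree.findall(".//sm:loc", NS):
    url = loc.text.strip()
    with urllib.request.urlopen(url) as child:
        print(child.status, url)
```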
Thanks in advance for any advice!
Worth asking: Are you sure they haven’t crawled? How are you determining that? (Sometimes they crawl but don’t index…)
A good place to start would be revisiting Google Search Console. Since a month has gone by, I’d expect Google to have at least ‘discovered’ and ‘crawled’ a good number of your pages, and hopefully indexed them (indexing is what’s actually required for them to show up in search).
What do you see in GSC? Are there errors? If you have a bunch of pages that are not indexed, look at why. You can inspect individual entries.
Also check that no errors are shown for your sitemap page.
Also worth checking: do you have any sort of ‘bot protection’ sitting out in front of the site? It’s possible you’re accidentally blocking search engine crawlers, but I’d start with GSC to see what errors show up there.
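One quick way to sniff out an accidental block is to request the home page with a normal browser user agent and then again with a Googlebot user-agent string and compare the responses. Treat it as a rough signal only (real crawlers are typically verified by IP, so a protection layer may handle a spoofed user agent differently than genuine Googlebot); the URL below is just your site plugged into a generic sketch:

```python
# Compare responses for a browser-like user agent vs. a Googlebot string.
# A 403/503 or a challenge page only for the Googlebot request hints at a
# bot-protection layer getting in the way.
import urllib.error
import urllib.request

URL = "https://whatisnumerology.com/"
AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in AGENTS.items():
    req = urllib.request.Request(URL, headers={"User-Agent": ua})
    try:
        with urllib.request.urlopen(req) as resp:
            print(f"{name}: HTTP {resp.status}, {len(resp.read())} bytes")
    except urllib.error.HTTPError as err:
        print(f"{name}: HTTP {err.code}")
```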
Thank you for your response!
I’ve checked Google Search Console thoroughly, and there are no errors reported. The sitemap is accessible and has been processed without issues. However, after more than a month, only 5 pages are indexed, which seems unusually low given the number of pages available on the site.
Regarding bot protection, I don’t have any additional measures in place that should block search engine crawlers. I’ve also reviewed my robots.txt and server logs, and I don’t see anything that would indicate a problem.
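For what it’s worth, this is roughly how I tallied crawler traffic when I went through the logs; the log path and bot names are specific to my server, so treat them as placeholders:

```python
# Count access-log lines whose user-agent string mentions the major
# crawlers. The path below is my nginx access log, not a universal one.
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"
BOTS = ("Googlebot", "bingbot")

counts = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        for bot in BOTS:
            if bot in line:
                counts[bot] += 1

for bot in BOTS:
    print(f"{bot}: {counts[bot]} requests")
```

The counts at least give a quick sense of whether the bots are hitting the site at all.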
Do you have any other suggestions for diagnosing why the majority of my pages are not being indexed? Could there be something specific in the theme or Ghost configuration that I might have overlooked?
Thanks again for your help!
OK, we need to drill down a bit. Click the three dots to expand the menu for the posts sitemap (sitemap-posts.xml).
Choose ‘see page indexing’ (or whatever it is in Italian?)
Here’s mine:
OK, so I have one post in ‘discovered, not indexed’ status. Click that category, to see what post is affected:
Hover the affected post and choose the magnifying glass/search icon. Here’s what I see:
This page hasn’t been crawled, so I’ll click “request indexing”. Note that there are referring pages, which is good, but for whatever reason, Google hasn’t crawled this page yet.
Here’s an example from my tags sitemap. Note that Google doesn’t always like tag pages - there’s not a lot of content there in many cases, just a list of posts.
Here’s a tag page, in ‘crawled but not indexed’.
This shows me that there are links to the page (although only one?), that Google has crawled the page without errors, and that the canonical URL Google has selected is this page. (If it isn’t, that’s a problem, because Google won’t index a non-canonical version.)
I could click ‘request indexing’, but unless something has changed since the last crawl, it’s not likely to get indexed. This page (Productivity - Spectral Web Services) really doesn’t have any content that Google is going to decide is index-worthy. Getting pages that are just collections of post titles and excerpts indexed is tough. Google /has/ indexed the posts that are on that tags page, so I really don’t care that it has missed the tags page itself. The actual content is available in search.
[If I wanted that tag page indexed, I’d need to write a couple introductory paragraphs about productivity. Then I might have a chance.]
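If you want to rule out the canonical/noindex angle on your own pages, a rough sketch like this pulls the canonical link and any robots meta tag out of a page’s HTML. The regexes are deliberately crude and the URL is just an example, so swap in one of the pages GSC reports as not indexed:

```python
# Print a page's <link rel="canonical"> href and <meta name="robots">
# content, the two on-page signals most likely to keep an otherwise
# healthy page out of the index.
import re
import urllib.request

URL = "https://whatisnumerology.com/"  # replace with a page GSC flags as not indexed

req = urllib.request.Request(URL, headers={"User-Agent": "Mozilla/5.0"})
html = urllib.request.urlopen(req).read().decode("utf-8", errors="replace")

canonical = re.search(r'<link[^>]+rel=["\']canonical["\'][^>]*href=["\']([^"\']+)', html, re.I)
robots = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*content=["\']([^"\']+)', html, re.I)

print("canonical:", canonical.group(1) if canonical else "none found")
print("robots meta:", robots.group(1) if robots else "none found")
```

If the canonical points somewhere other than the URL you fetched, that mismatch alone can explain a ‘crawled, not indexed’ entry.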
The three dots in Search Console are disabled, as you can see in the picture, and if you look closely, the spider hasn’t crawled since December 31st.