Hello again, dear forum.
When I started blogging with Ghost instead of WP a few months ago, I noticed on Google that I have a lot of weird-looking URLs/subdomains that I do not use. In the Google Search Console, I have thousands of URLs not indexed that I’m unable to fix, and it’s probably because these links have nowhere to go.
They look like this:
ww34.domain.com, iloapp.domain.com, ww2.domain.com etc. I do not have access to these domains, and after some googling I discovered that it’s a common thing getting “scraped” or something like that if you use a wildcard for your subdomains, which I have done for years, since I’ve been a WP user and host some of my domains over at wp.com as well.
I have now removed the wildcard, but the problem persists. I recently remembered that Ghost has a redirects file. Any idea on how I should use the redirects.yaml file to redirect all of these subdomains to the real URL in use?
I was thinking something like this:
^/.domain.com ==> domain.com/$1
I currently do not use any subdomain, not even https://domain.com and I have no plans in the future to use a subdomain for this domain.
Will this work? Regex is what I plan to use to fix this. I just hope that Ghost will know what URL I want to match it with, i.e. match
iloapp.domain.com/blog/my-post ==> domain.com/blog/my-post
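A quick way to sanity-check the proposed pattern, as a minimal Python sketch. It assumes (this is an assumption, not something confirmed in the thread) that redirect rules are matched against the request path only, so the hostname never reaches the pattern. Python's `re` module stands in for whatever regex engine Ghost uses, and `path_matches` is a hypothetical helper.

```python
import re

# The pattern from the post, unchanged. Two things to note:
#  - the unescaped dots match ANY character, not just a literal "."
#  - there is no capture group, so "$1" in the target would have nothing to insert
PROPOSED = re.compile(r"^/.domain.com")


def path_matches(path: str) -> bool:
    """Return True if the proposed pattern matches the given request path."""
    return PROPOSED.search(path) is not None


# Assumption: for iloapp.domain.com/blog/my-post the rule only sees the path.
print(path_matches("/blog/my-post"))
```

Under that assumption the rule never fires for a real post path, because "domain.com" is part of the host, not the path.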
Are these search queries? And is the response a 404? If so, that’s the correct response.
Yes, they return a 404, since my Ghost install does not understand where they’re coming from. I guess my suggested solution would work then, if I understand you correctly?
Since the links do not exist on your site, the correct response is 404. Therefore, I don’t think you should create redirects.
Ok, good to know. I just have to wait for the issue to go away then. Thanks for the info.
In my experience, these eventually disappear from the search console; I had this when I used WP. I just cleared them, and things settled down. There shouldn’t be any impact on SEO.
Ahh interesting. How do you clear them? I’d really like to get rid of them, because they will never be found.
When reviewing any issues, correct them if the issue is on your side, e.g., you changed a slug and there’s no redirection (Ghost does this for you though), and then click on Validate Fix.
For example, I currently have some author URLs for authors that don’t exist, and never have. A 404 is the correct response, so I simply confirm by using validate fix.
I also decided to remove some lower quality guest posts, and therefore, a 404 is shown. Again, this is correct. I avoid redirecting such posts to something similar; rather tell the visitor that the content isn’t found, and let Ghost recommend other posts.
Ok, I’ve already done that, but some of those issues refuse to resolve, unfortunately. I guess I’m just gonna have to live with that then.
*scratching my head* … I guess I’m confused here. So are your 404s for posts that do exist, just for a domain that’s strange? I’m not sure that a 404 is optimal here. Why not 301 (permanent redirect) the traffic to the real post, if the real post exists? @mjw, am I missing something in your assertion that they should be 404s, if there are real posts available at that slug, just at the wrong subdomain?
Yes, that was what I was thinking at first, creating a 301 to the real post instead of these weird looking domains that have been scraping my posts somehow.
When I click on a bogus link, it goes nowhere, so it’s a 404.
In the Google Search Console they’re registered as “alternate page with proper canonical tag” and no matter how many times I try to mark them as fixed in GSC, it keeps failing.
It’s been nearly 3 months now, and they still appear. I have blocked them using the removal tool, but I still have a list with thousands of weird looking links going nowhere.
Sometimes I’ve seen on Google (google.com) that my newly written post is nowhere to be found, but instead this bogus url is there, so I’d like to fix it. That is why I wondered if my regex above will work for this?
This documentation (https://ghost.org/tutorials/implementing-redirects/) shows a couple examples that include the domain name. It also says not to use redirects.yaml for subdomain redirects. So I guess I wonder where you’re hosting and what your setup is? Ghost Pro isn’t likely to route randomsub.yourdomain.tld to your Ghost site if it wasn’t set up for that subdomain. You might be able to set up the redirects you need with your DNS provider, depending on who that is.
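To make the distinction concrete, here is a minimal Python sketch of the mapping a host-level redirect would have to compute, wherever it ends up living (DNS provider, reverse proxy, etc.). It only illustrates that the decision depends on the hostname, which a path-based rule does not see. `CANONICAL_HOST` and `canonical_redirect` are hypothetical names, and "domain.com" is the placeholder used earlier in the thread.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Placeholder from the thread; substitute the real domain.
CANONICAL_HOST = "domain.com"


def canonical_redirect(url: str) -> Optional[str]:
    """Return the redirect target for a stray subdomain URL, or None.

    None means "no redirect": the URL is already on the canonical host,
    or it belongs to an unrelated domain.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == CANONICAL_HOST or not host.endswith("." + CANONICAL_HOST):
        return None
    # Keep path and query, swap the host, force https, drop the fragment.
    return urlunsplit(("https", CANONICAL_HOST, parts.path, parts.query, ""))


print(canonical_redirect("http://iloapp.domain.com/blog/my-post"))
```

With the wildcard DNS record removed, requests for those subdomains would presumably never reach the server at all, so there would be nothing left to redirect.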
Currently I host my domain at DigitalOcean. I have removed my wildcard, since that was causing the whole issue to begin with. I also have my Ghost install on a DO droplet, so I self-host.
I still had this issue when I used the wildcard, so I have no idea how I could redirect this without one.
Unless I’ve misunderstood, these URLs never existed in any form, and are spammy backlinks. So, yes, these should be 404s.
Moreover, you don’t want to legitimize these spammy links by creating a permanent redirect.
A better approach is to disavow the originating sites.
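For anyone who hasn't used it: Google's disavow tool takes a plain text file with one entry per line, either a full URL or `domain:example.com`, and lines starting with `#` are comments. Below is a minimal Python sketch that builds such a file from a list of referring URLs. The URLs and the `build_disavow` helper are hypothetical; the real list would come from your own backlink report.

```python
from urllib.parse import urlsplit


def build_disavow(referring_urls):
    """Build disavow-file text with one 'domain:' line per referring host."""
    hosts = set()
    for url in referring_urls:
        host = urlsplit(url).hostname  # lowercased; None if the URL has no host
        if host:
            hosts.add(host)
    lines = ["# Spammy referring domains (hypothetical list)"]
    lines += [f"domain:{host}" for host in sorted(hosts)]
    return "\n".join(lines) + "\n"


# Hypothetical input, for illustration only.
print(build_disavow([
    "https://spam.example/a",
    "http://spam.example/b",
    "https://junk.example/x",
]))
```

Disavowing at the domain level keeps the file short when a single site links from many pages.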
Thank you for that tip! I think that’s what I needed. Never heard about that either, disavow. Very interesting. Thanks!