Can robots.txt Tell the Difference Between a Slash and a Dash?

I've been working with my client, Gremlin Inc, to launch a content library covering the topic of Chaos Monkey. We have a PPC page that is blocked via robots.txt, and its URL is *very* similar to the URL of one of the subpages in this library. The only difference is that the subpage sits in a subfolder, where the PPC URL uses a dash instead of a slash. Yet Google is blocking both pages from organic search. This looks like it may be a bug in the way Google matches URLs against robots.txt, but I wanted to open this up to the community and see if anyone has dealt with something similar.

Here’s the robots.txt:

[Screenshot: Gremlin's robots.txt]

Notice line 10:

Disallow: /chaos-monkey-simian-army/

The Simian Army subpage of the guide we’ve created has a different URL, one that’s supposed to be indexed:

/chaos-monkey/the-simian-army/

But Google is blocking this result:

And we have to opt in for extra results just to get to it:

[Screenshot: blocked result for Gremlin's Simian Army page, which should be indexed]

When I try to submit the page in Search Console, it tells me robots.txt is blocking the page:

When we test using the old Webmaster Tools robots.txt tester, it says the URL should be allowed:

[Screenshot: old Webmaster Tools tester saying the URL is allowed]

So is Google applying some kind of fuzzy match to near-identical URLs, treating a dash like a subfolder slash when it decides what a robots.txt rule blocks?
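For comparison, here is a minimal Python sketch of what plain prefix matching predicts for these two URLs. It uses the standard library's `urllib.robotparser`, which is not Google's parser, so it only shows the expected behavior and says nothing about what Googlebot actually did. The `ROBOTS_TXT` string is a stand-in that contains only the one rule quoted above; the `User-agent: *` line and the `is_allowed` helper are my own assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Stand-in robots.txt: only the Disallow rule comes from the post.
# The "User-agent: *" group is an assumption made for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /chaos-monkey-simian-army/
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Hypothetical helper: True if `agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/chaos-monkey-simian-army/", "/chaos-monkey/the-simian-army/"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Under prefix matching, the rule should only catch paths that begin with `/chaos-monkey-simian-army/`, so the guide's subpage comes back as allowed. That agrees with what the old tester reported.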

Please leave a comment if you’ve seen something similar.

END OF DAY UPDATE:

We removed line 10 from robots.txt, resubmitted robots.txt using the old Webmaster Tools, used Fetch and Render, and submitted the page. We are back in the index, with a description:

[Screenshot: result showing up properly]

Definitely looks like a bug from where I am sitting…

Featured image via Bergen Offentlige Bibliotek

John-Henry Scherck

John-Henry Scherck is the owner of Growth Plays, a B2B content strategy and SEO consultancy based in Los Angeles. He works with founders, marketers, and investors to plan, build, and refine growth marketing initiatives using a common-sense approach.