NameHero Blog


Why “Noindex” SHOULD Be In Robots.txt

Bhagwad Park

Published on: July 30, 2019

Categories: Website Development

I woke up this morning to see the following message from Google in my e-mail inbox:

[Image: e-mail from Google announcing the end of noindex support in robots.txt]

I wasn’t surprised by this, of course. In fact, I’d been expecting it. Google has been warning us for a long time that it’s removing support for noindex in robots.txt. So if you’ve been relying on this rule to prevent crawlers from indexing your pages, you’ll need to switch to one of the other methods. But I think this is a terrible decision. The “noindex” directive absolutely belongs in robots.txt, for the reasons outlined below.

At the end, I’ll also give you some guidelines on how to keep your pages “noindexed” now that Google no longer honors the robots.txt rule.

Noindex is For Robots. So Where Should It Belong?

The core of my beef with Google’s decision to remove noindex support from robots.txt is that it makes no thematic sense. Sure, we can use .htaccess to send a noindex header. Yes, we can use the robots meta tag in the page HTML.

But the “noindex” directive is meant for robots. And what’s the point of having a robots.txt file if you can’t use it to give instructions to robots? Google’s excuse is that noindex was never an officially supported directive. Well then, maybe it should be, right? And since so many sites use it and advocate its use, there’s no harm in letting it continue!

If anything, the exclusion of noindex from the robots.txt standard is an oversight. If we can specify “Disallow”, why can’t we specify “Noindex” as well? It makes no sense!
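For context, this is roughly what the now-unsupported rule looked like alongside the standard ones (the paths here are made up for illustration):

```
User-agent: *
Disallow: /private/

# Formerly honored by Googlebot, now ignored:
Noindex: /thank-you/
```

The two lines have the same shape and sit in the same file, which is exactly why dropping one of them feels so arbitrary.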

Other Alternatives are Too Programmatic

One of the benefits of using robots.txt is that you don’t need to touch your site. The file sits quietly in the root directory and doesn’t interfere with your site’s functionality. You don’t risk crashing stuff. And there are plenty of online tools that can check robots.txt to ensure that your web page is accessible.

Meta Tags in the HTML

By contrast, the two other “recommended” approaches for specifying the noindex directive both require you to intervene in your site and make changes to it. The first is the “robots” meta tag in the HTML. You can add it either with a piece of code yourself or with a plugin.
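The tag itself is a one-liner placed inside the page’s head section:

```html
<head>
  <!-- Tells compliant crawlers not to index this page -->
  <meta name="robots" content="noindex">
</head>
```

The tag is trivial; the hard part, as described next, is getting it onto the right set of pages.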

The problem is that it’s not easy to target a wide swathe of pages matching a certain URL pattern. You have to come up with a creative solution to make that happen. If you’re using WordPress, that means using a hook to check the page URL and add the noindex tag dynamically.
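As a sketch of what that hook might look like (the URL prefix here is a hypothetical example, not from the post):

```php
<?php
// Illustrative sketch: print a noindex meta tag on every page whose
// URL starts with /thank-you/. Runs on each page load via wp_head.
add_action( 'wp_head', function () {
    $uri = isset( $_SERVER['REQUEST_URI'] ) ? $_SERVER['REQUEST_URI'] : '';
    if ( strpos( $uri, '/thank-you/' ) === 0 ) {
        echo '<meta name="robots" content="noindex">' . "\n";
    }
} );
```

Note that this check executes on every single request, which is part of the performance concern below.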

That impacts things like site speed, since the URL check has to run on every page load. I don’t like the idea of programmatically adding a meta robots tag to pages.

Using .htaccess

The other solution is even worse. The recommendation is to use the .htaccess file to send a “noindex” response header. This really rubs me the wrong way. The .htaccess file is a very technical piece of work: get even a single character wrong and it’s incredibly easy to break your site. It’s a complex interaction of rules written by you, your host, and your plugins.

Under normal circumstances, I never touch the .htaccess file if I can help it. And when I do, I change it with the utmost care, painfully aware that I risk crashing my site. The notion of using .htaccess for something as simple as a noindex directive really annoys me.
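For completeness, the header-based approach looks something like this (it relies on Apache’s mod_headers being enabled; the file pattern is just an example):

```apache
# Send a noindex response header for all PDF files.
# One typo in this file can take the whole site offline.
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```

Compare that to a one-line rule in robots.txt and it’s easy to see why this feels like overkill.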

But I Have no Choice – For Now

As of now, I have no choice but to use one of the two methods above to make noindex work. However, I will not be removing it from my robots.txt file. The web is more than just Google, and perhaps other search engines will find it useful. In the meantime, I hope Google realizes one day that not everyone wants to rework their site or .htaccess file merely to send a “noindex” response header.

That directive is meant for robots, and the right place for it is robots.txt. End of story.

Bhagwad Park

I’m a NameHero team member, and an expert on WordPress and web hosting. I’ve been in this industry since 2008. I’ve also developed apps on Android and have written extensive tutorials on managing Linux servers. You can contact me on my website WP-Tweaks.com!

