NameHero Blog

Web Hosting Tips & Resources From NameHero

How To Use Regex To Filter The Google Search Console List

Bhagwad Park

Published on: September 15, 2021

Categories: Website Development

Google recently introduced a new “Doesn’t match regex” filter in Search Console, and I immediately found a use for it! A few months back, I wrote about how to practice content hygiene on your website. One of my biggest difficulties in creating a list of pages for my website, WP-Tweaks, was that Search Console lists not just full posts and pages, but also archives, in-line anchors, and nofollow links. This clutters the display and makes it hard to see which actual pages are performing poorly.

For example, if you sort your pages in ascending order of clicks or impressions, you might see something like this: 

yoursite.com/test-page#someinternallink1 
yoursite.com/test-page#someinternallink2 
yoursite.com/page/4/ 
yoursite.com/page/5/ 
yoursite.com/tag/3/ 

Since archive pages and in-page links are visited very rarely, the majority of your results will look like the ones above. Previously, I had no choice but to export the entire list into a spreadsheet and painstakingly clean it up using filters and exclusion pattern matching. The whole process was very annoying. Worse, the list went out of date as soon as new data arrived in Search Console, and I had to repeat the process all over again.

Google Added Negative Regex Matching 

Google already allowed you to create regular expressions that match the criteria you want, but it critically lacked negative matching, meaning that you couldn’t create a list whose members didn’t contain certain keywords. I couldn’t write a pattern to match every valid page; I only knew which patterns I didn’t want. So for my purposes, the built-in Google tools weren’t helpful at all.

However, recently Google added this feature. We can now create pattern matching regexes and instruct Google to list items that don’t match them. Here’s a screenshot: 

Pattern Matching in Google Search Console

With this, I could create a clean list of poorly performing pages in just a few seconds, all updated with the latest Google Search Console data. It’s really streamlined the way I practice my content hygiene. 

Here’s how it works. 

Creating the Negative Regex 

Regular expressions can be notoriously difficult to create. They also look super ugly, and often bear no resemblance to the pattern you’re trying to match. However, they’re also very powerful. I always end up re-learning regex each time I use it, only to forget it a little while later.

For my purposes, here’s the Regex I use in Google Search Console to clean my list of useless pages: 

(#)|(page)|(replytocom)|(/tag/)|(/recommends/) 

Let’s break this down a bit. 

This regex matches all URLs that have ONE OF the following inside them: 

  1. # 
  2. page 
  3. replytocom 
  4. /tag/ 
  5. /recommends/ 
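
To see how this "one of the following" behavior plays out, here is a minimal Python sketch. Python's `re` module is only a stand-in for Search Console's own regex engine (the two agree on simple patterns like this one), and the sample URLs and the `first_hit` helper are hypothetical, made up for illustration.

```python
import re
from typing import Optional

# The pattern from the article, copied unchanged.
PATTERN = re.compile(r"(#)|(page)|(replytocom)|(/tag/)|(/recommends/)")


def first_hit(url: str) -> Optional[str]:
    """Return the first substring that triggers the pattern, or None."""
    match = PATTERN.search(url)
    return match.group(0) if match else None


# Hypothetical URLs modelled on the listing earlier in the article.
urls = [
    "yoursite.com/test-page#section-1",
    "yoursite.com/page/4/",
    "yoursite.com/tag/3/",
    "yoursite.com/how-to-speed-up-wordpress/",
    "yoursite.com/landing-page/",
]

for url in urls:
    hit = first_hit(url)
    # A URL with a hit is what a "doesn't match regex" filter would hide.
    status = f"hidden (matched {hit!r})" if hit else "shown"
    print(f"{url}: {status}")
```

Notice that the last URL is hidden as well: the bare alternative `page` matches any URL that contains those letters, including a post slug such as `landing-page`. If you only want to drop paginated archives, a stricter alternative like `/page/` is probably the safer choice.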

Looking at the regex, you can see I’ve enclosed each pattern in parentheses (), separated by “|”, the OR operator. Once you understand it, the rule is quite clear. One thing to keep in mind is that if you want to match special characters like dots (.), question marks (?), carets (^), or backslashes (\), you need to escape them with a backslash (\). So if you want to match a literal dot in your URL, you need to write \. instead of a bare dot. 
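
The effect of escaping is easy to check in a small Python sketch. Again, Python's `re` module stands in for Search Console's engine, and the `feed.xml` URLs are hypothetical examples rather than anything from a real site.

```python
import re

url_with_dot = "yoursite.com/feed.xml"
url_without_dot = "yoursite.com/feed-xml"

# Unescaped: "." is a wildcard that matches any single character,
# so this pattern matches both URLs.
loose = re.compile(r"feed.xml")

# Escaped: "\." matches only a literal dot.
strict = re.compile(r"feed\.xml")

print(bool(loose.search(url_without_dot)))   # True
print(bool(strict.search(url_without_dot)))  # False
print(bool(strict.search(url_with_dot)))     # True

# Python can add the backslashes for you.
print(re.escape("feed.xml"))  # feed\.xml
```

The unescaped version is the one that tends to bite: it still matches the URL you had in mind, so nothing looks wrong until it also filters out pages you wanted to keep.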

Modify the above regex with your own patterns, select “Doesn’t match regex” from the filter dropdown in Search Console, and paste it in. Of course, if you do want to list the pages that match your patterns, select the matching option instead! 

Click “Apply”, and you’re done! Search Console should now show only those pages that don’t match the pattern in your regex. Now you can easily check which pages are underperforming, and take steps to correct the situation. Happy cleaning! 
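
If you ever want to reproduce the same view outside Search Console, for instance on an exported list, the filter can be approximated in a few lines of Python. This is only a sketch under assumptions: the `(url, clicks)` rows are invented sample data, `doesnt_match` is a hypothetical helper name, and Python's `re` module stands in for Search Console's regex engine.

```python
import re

# The exclusion pattern from the article, copied unchanged.
EXCLUDE = re.compile(r"(#)|(page)|(replytocom)|(/tag/)|(/recommends/)")

# Hypothetical (url, clicks) rows; real values would come from your own data.
rows = [
    ("yoursite.com/page/5/", 0),
    ("yoursite.com/old-tutorial/", 2),
    ("yoursite.com/old-tutorial/?replytocom=17", 0),
    ("yoursite.com/popular-guide/", 340),
    ("yoursite.com/tag/3/", 1),
]


def doesnt_match(rows, pattern):
    """Keep only the rows whose URL does NOT contain the pattern."""
    return [(url, clicks) for url, clicks in rows if pattern.search(url) is None]


# Sort ascending by clicks so the weakest real pages come first.
clean = sorted(doesnt_match(rows, EXCLUDE), key=lambda row: row[1])

for url, clicks in clean:
    print(clicks, url)
```

Only the two real posts survive the filter, with the weakest performer listed first.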

Bhagwad Park

I’m a NameHero team member, and an expert on WordPress and web hosting. I’ve been in this industry since 2008. I’ve also developed apps on Android and have written extensive tutorials on managing Linux servers. You can contact me on my website WP-Tweaks.com!
