{"id":256,"date":"2021-01-28T15:19:47","date_gmt":"2021-01-28T15:19:47","guid":{"rendered":"https:\/\/organicdigital.co\/blog\/?p=256"},"modified":"2026-04-28T01:06:12","modified_gmt":"2026-04-28T00:06:12","slug":"how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google","status":"publish","type":"post","link":"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/","title":{"rendered":"How To:  Find Out If Your Sites URLs Are Being Crawled &#038; Indexed by Google"},"content":{"rendered":"\n<p>This is a blog post in two (large) pages &#8211; live and staging sites:<\/p>\n\n\n\n<p>Part 1: <a href=\"#live\">How To Check if Google has Indexed Your Live Site<\/a><\/p>\n\n\n\n<p>Part 2: <a href=\"#staging\">How To Check If Google has Indexed Your Staging\/Test Site<\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator has-css-opacity\"\/>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" 
baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_can_I_tell_if_Google_has_indexed_my_live_site\" >How can I tell if Google has indexed my live site?<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Use_The_site_query_operator\" >Use The site: query operator<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Check_the_Coverage_Section_of_Google_Search_Console\" >Check the Coverage Section of Google Search Console<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Use_the_URL_Inspect_Function_In_GSC\" >Use the URL Inspect Function In GSC<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Why_Wont_Google_Crawl_or_Index_My_Pages\" >Why Won\u2019t Google Crawl or Index My Pages?<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" 
href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#The_robotstxt_Disallow_Directive\" >The robots.txt Disallow Directive<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Amend_a_Robotstxt_File_Manually\" >How To Amend a Robots.txt File Manually<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Amend_a_Robotstxt_File_in_WordPress\" >How To Amend a Robots.txt File in WordPress<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Amend_a_Robotstxt_File_in_Magento\" >How To Amend a Robots.txt File in Magento<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#The_Robots_Meta_Tag_is_Set_to_Noindex_andor_Nofollow\" >The Robots Meta Tag is Set to Noindex and\/or Nofollow<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Amend_the_Robots_Meta_Tag_File_Manually\" >How To Amend the Robots Meta Tag File Manually<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-12\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Amend_the_Robots_Meta_Tag_in_WordPress\" >How To Amend the 
Robots Meta Tag in WordPress<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-13\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Amend_Robots_Meta_Tag_in_Magento\" >How To Amend Robots Meta Tag in Magento<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-14\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#My_Site_Pages_Can_Be_Crawled_and_Indexed_by_Google_%E2%80%93_What_Next\" >My Site \/ Pages Can Be Crawled and Indexed by Google \u2013 What Next?<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-15\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#I_Also_Have_a_Bing_Webmaster_Account\" >I Also Have a Bing Webmaster Account!<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-16\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Ive_Done_All_This_and_My_Site_Pages_Still_Arent_Indexed\" >I\u2019ve Done All This and My Site \/ Pages Still Aren\u2019t Indexed!<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-17\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Check_If_Google_has_Indexed_Your_StagingTest_Site\" >How To Check If Google has Indexed Your Staging\/Test Site<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-18\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Set_Up_A_Google_Search_Console_GSC_Domain_Property\" >Set Up A Google Search 
Console (GSC) Domain Property<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-19\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Check_Google_SERPs_Using_Link_Clump\" >Check Google SERPs Using Link Clump<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-20\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Search_For_Text_Unique_To_Your_Site\" >Search For Text Unique To Your Site<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-21\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Crawl_The_Site_Using_Screaming_Frog\" >Crawl The Site Using Screaming Frog<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-22\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Check_Google_Analytics_Hostnames\" >Check Google Analytics Hostnames<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-23\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#How_To_Remove_and_Prevent_Your_Test_Site_From_Getting_Indexed\" >How To Remove and Prevent Your Test Site From Getting Indexed<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-24\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Remove_URLs_via_GSC\" >Remove URLs via GSC<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-25\" 
href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Set_robots_tag_to_noindex_on_test_site\" >Set robots tag to noindex on test site<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-26\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Password_Protect_Your_Test_Site\" >Password Protect Your Test Site<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-27\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Delete_site_and_return_page_status_410\" >Delete site and return page status 410<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-28\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#Block_via_robotstxt\" >Block via robots.txt<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-29\" href=\"https:\/\/daveashworth.co\/blog\/how-to-find-out-if-your-site-urls-are-being-crawled-indexed-by-google\/#But_Remember\" >But Remember<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\" id=\"live\"><span class=\"ez-toc-section\" id=\"How_can_I_tell_if_Google_has_indexed_my_live_site\"><\/span>How can I tell if Google has indexed my live site?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In my years as a <a href=\"https:\/\/daveashworth.co\/\">website optimisation consultant<\/a> this is a question I have been asked many times.<\/p>\n\n\n\n<p>There are two straightforward ways to find out:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Use_The_site_query_operator\"><\/span>Use The site: query operator<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Search for your domain on 
Google as follows:&nbsp; <a href=\"https:\/\/www.google.com\/search?q=site%3Adaveashworth.co&amp;oq=site%3Adaveashworth.co&amp;aqs=chrome.0.69i59l2j69i58.504j0j7&amp;sourceid=chrome&amp;ie=UTF-8\" target=\"_blank\" rel=\"noreferrer noopener\">site:daveashworth.co<\/a><br><br>If your site is indexed, you will see a list of pages:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"687\" height=\"673\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_operator.png\" alt=\"Site Query Operator\" class=\"wp-image-258\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_operator.png 687w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_operator-300x294.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_operator-150x147.png 150w\" sizes=\"(max-width: 687px) 100vw, 687px\" \/><\/figure>\n\n\n\n<p>If no results are returned, then you may have issues:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"845\" height=\"398\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_opertor-no_results.png\" alt=\"Site Query Operator with No Results\" class=\"wp-image-259\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_opertor-no_results.png 845w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_opertor-no_results-300x141.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_opertor-no_results-150x71.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/site_query_opertor-no_results-768x362.png 768w\" sizes=\"(max-width: 845px) 100vw, 845px\" \/><\/figure>\n\n\n\n<p><br>Note:&nbsp; on bigger sites, whilst you will see an approximation of how many pages are indexed, you will only be able to actually see around 300 of them in the SERPs.<\/p>\n\n\n\n<h3 
class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Check_the_Coverage_Section_of_Google_Search_Console\"><\/span>Check the Coverage Section of Google Search Console<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Every website should have a GSC account. It is, in my opinion, the greatest tool a site owner or SEO can use, and it gives a wealth of information about your site\u2019s organic visibility and performance. &nbsp;If you do not have one, <a href=\"https:\/\/search.google.com\/search-console\/about\" target=\"_blank\" rel=\"noreferrer noopener\">head to the official GSC page<\/a>. If you do, go to the Coverage section, where you can see a breakdown of:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Errors encountered whilst crawling pages<\/li>\n\n\n\n<li>Pages that are blocked<\/li>\n\n\n\n<li>Valid indexed pages<\/li>\n\n\n\n<li>Pages that are excluded<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"894\" height=\"858\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_coverage_report.png\" alt=\"GSC Coverage Report\" class=\"wp-image-260\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_coverage_report.png 894w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_coverage_report-300x288.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_coverage_report-150x144.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_coverage_report-768x737.png 768w\" sizes=\"(max-width: 894px) 100vw, 894px\" \/><\/figure>\n\n\n\n<p>If your site has issues, these will be reported under &#8220;error&#8221; or &#8220;excluded&#8221; \u2013 and you can find out the reasons why they are not being included in search, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Alternate page with proper canonical tag<\/li>\n\n\n\n<li>Crawled &#8211; currently not indexed<\/li>\n\n\n\n<li>Duplicate without 
user-selected canonical<\/li>\n\n\n\n<li>Excluded by \u2018noindex\u2019 tag<\/li>\n\n\n\n<li>Crawl anomaly<\/li>\n\n\n\n<li>Not found (404)<\/li>\n<\/ul>\n\n\n\n<p>If your site\u2019s pages are not appearing in the \u201cvalid\u201d section then you may have issues.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Use_the_URL_Inspect_Function_In_GSC\"><\/span>Use the URL Inspect Function In GSC<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If some pages are indexed and others are not, then you can also use the URL Inspect tool to see if Google is able to crawl and index a specific page, or if there are other issues preventing it from appearing in search \u2013 this is in the top menu and will allow you to check one URL at a time:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"740\" height=\"44\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_url_inspect_tool.png\" alt=\"GSC URL Inspect Tool\" class=\"wp-image-261\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_url_inspect_tool.png 740w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_url_inspect_tool-300x18.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_url_inspect_tool-150x9.png 150w\" sizes=\"(max-width: 740px) 100vw, 740px\" \/><\/figure>\n\n\n\n<p>If your page is indexed, it will give details as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"645\" height=\"741\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/indexed_page_data.png\" alt=\"GSC Indexed Page Data\" class=\"wp-image-262\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/indexed_page_data.png 645w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/indexed_page_data-261x300.png 261w, 
https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/indexed_page_data-131x150.png 131w\" sizes=\"(max-width: 645px) 100vw, 645px\" \/><\/figure>\n\n\n\n<p>If not, you get this status which shows when Google has attempted to crawl the page and some insight into why it is not indexed:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"646\" height=\"667\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_non_indexed_page_data.png\" alt=\"GSC Non Indexed Page Data\" class=\"wp-image-263\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_non_indexed_page_data.png 646w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_non_indexed_page_data-291x300.png 291w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_non_indexed_page_data-145x150.png 145w\" sizes=\"(max-width: 646px) 100vw, 646px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_Wont_Google_Crawl_or_Index_My_Pages\"><\/span>Why Won\u2019t Google Crawl or Index My Pages?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>There are generally two reasons why a page cannot be either crawled or indexed.&nbsp; These are particularly common when a new site has been launched or migrated, and the settings from the development environment have been carried over.<\/p>\n\n\n\n<p>These are often <a href=\"https:\/\/daveashworth.co\/blog\/how-to-fix-search-console-errors-why-pages-arent-indexed\/\">picked up in Search Console and explained in more detail here<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_robotstxt_Disallow_Directive\"><\/span>The robots.txt Disallow Directive<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>This is where the site, a directory, or a page are blocked from being crawled by the robots.txt file.<\/p>\n\n\n\n<p>Every site should have a robots.txt file, this is 
used to give directives to search engines as to what sections of your site should and should not be crawled.<\/p>\n\n\n\n<p>If you have one, you will find it in your root directory under the name robots.txt<\/p>\n\n\n\n<p><a href=\"https:\/\/daveashworth.co\/robots.txt\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/daveashworth.co\/robots.txt<\/a><\/p>\n\n\n\n<p>The directives that would prevent a site, directory or page being crawled would be as follows:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Disallow: \/\nDisallow: \/directory\/\nDisallow: \/specific_page.html\n<\/code><\/pre>\n\n\n\n<p>You can also use <a href=\"https:\/\/www.screamingfrog.co.uk\/seo-spider\/\" target=\"_blank\" rel=\"noreferrer noopener\">Screaming Frog<\/a> to attempt to crawl your site.  If it is unable to do so, you see the following crawl data:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1010\" height=\"619\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_issue.png\" alt=\"Screaming Frog Robots Issue\" class=\"wp-image-265\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_issue.png 1010w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_issue-300x184.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_issue-150x92.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_issue-768x471.png 768w\" sizes=\"(max-width: 1010px) 100vw, 1010px\" \/><\/figure>\n\n\n\n<p>There are many valid reasons for blocking search engines using this directive, but if you see something along the lines of the above, you need to amend these to allow crawling of your site.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Amend_a_Robotstxt_File_Manually\"><\/span>How To Amend a Robots.txt File Manually<span 
class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If you have access to FTP or have a developer on hand, you can manually amend the robots.txt file to remove any directives that are blocking your site from being crawled.<\/p>\n\n\n\n<p>Generally, the following directives will do this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>User-agent: *\nAllow: \/\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Amend_a_Robotstxt_File_in_WordPress\"><\/span>How To Amend a Robots.txt File in WordPress<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If you have the <a href=\"https:\/\/yoast.com\/wordpress\/plugins\/seo\/\" target=\"_blank\" rel=\"noreferrer noopener\">Yoast plugin<\/a> installed, you can edit your file directly via the <em>Tools -&gt; File Editor<\/em> Section \u2013 <a href=\"https:\/\/yoast.com\/help\/how-to-edit-robots-txt-through-yoast-seo\/\" target=\"_blank\" rel=\"noreferrer noopener\">follow this link for instructions on how to do this<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"524\" height=\"803\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_robots_txt_editor.png\" alt=\"Yoast robots.txt Editor\" class=\"wp-image-266\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_robots_txt_editor.png 524w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_robots_txt_editor-196x300.png 196w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_robots_txt_editor-98x150.png 98w\" sizes=\"(max-width: 524px) 100vw, 524px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Amend_a_Robotstxt_File_in_Magento\"><\/span>How To Amend a Robots.txt File in Magento<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Go to <em>Content -&gt; Design -&gt; Configuration<\/em>, click into your relevant Store View and edit 
\u201cSearch Engine Robots\u201d<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"217\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor-1024x217.png\" alt=\"Magento Robots Settings\" class=\"wp-image-267\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor-1024x217.png 1024w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor-300x63.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor-150x32.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor-768x162.png 768w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor-1536x325.png 1536w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_robots_txt_editor.png 1783w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Robots_Meta_Tag_is_Set_to_Noindex_andor_Nofollow\"><\/span>The Robots Meta Tag is Set to Noindex and\/or Nofollow<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>In addition to the robots.txt file, you can also check the robots meta tag within your site\u2019s source code and ensure it\u2019s not preventing search engines from indexing your pages or following their links.<\/p>\n\n\n\n<p>Misuse of noindex tags is one of <a href=\"https:\/\/daveashworth.co\/blog\/how-to-fix-search-console-errors-why-pages-arent-indexed\/\">the most common causes of indexing problems.<\/a><\/p>\n\n\n\n<p>If you check your source code and do not see a robots meta tag, or it is set to &#8220;index&#8221; or &#8220;index,follow&#8221; \u2013 then this isn\u2019t the issue.&nbsp;&nbsp; However, if you see that it says &#8220;noindex&#8221;, this means your page can be crawled but will not be 
indexed:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"671\" height=\"117\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/noindex_tag_in_source_code.png\" alt=\"Noindex Tag In Source Code\" class=\"wp-image-268\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/noindex_tag_in_source_code.png 671w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/noindex_tag_in_source_code-300x52.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/noindex_tag_in_source_code-150x26.png 150w\" sizes=\"(max-width: 671px) 100vw, 671px\" \/><\/figure>\n\n\n\n<p>Again, you can use <a href=\"https:\/\/www.screamingfrog.co.uk\/seo-spider\/\" target=\"_blank\" rel=\"noreferrer noopener\">Screaming Frog<\/a> to check the status of your robots tags on your site.&nbsp;&nbsp; If your tag is set to noindex,nofollow it won\u2019t get beyond the home page:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1010\" height=\"619\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_issue.png\" alt=\"Screaming Frog Robots Noindex\/Nofllow Issue\" class=\"wp-image-269\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_issue.png 1010w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_issue-300x184.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_issue-150x92.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_issue-768x471.png 768w\" sizes=\"(max-width: 1010px) 100vw, 1010px\" \/><\/figure>\n\n\n\n<p>If it is just set to noindex, the whole site can still be crawled but not indexed:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1010\" height=\"619\" 
src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_follow_issue.png\" alt=\"Screaming Frog Robots Noindex\/Nofollow Issue\" class=\"wp-image-270\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_follow_issue.png 1010w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_follow_issue-300x184.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_follow_issue-150x92.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/screaming_frog_robots_noindex_follow_issue-768x471.png 768w\" sizes=\"(max-width: 1010px) 100vw, 1010px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Amend_the_Robots_Meta_Tag_File_Manually\"><\/span>How To Amend the Robots Meta Tag File Manually<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Again, access your site\u2019s page\/template directly and replace\/add the following tag:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;meta name=\"robots\" content=\"index, follow\"&gt;<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Amend_the_Robots_Meta_Tag_in_WordPress\"><\/span>How To Amend the Robots Meta Tag in WordPress<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>There are two ways to do this \u2013 if the issue is site-wide, then go to <em>Settings -&gt; Reading<\/em> and ensure the \u201cDiscourage search engines from indexing this site\u201d option is not ticked:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"744\" height=\"168\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/wordpress_noindex_site_setting.png\" alt=\"WordPress Noindex Site Setting\" class=\"wp-image-271\" 
srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/wordpress_noindex_site_setting.png 744w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/wordpress_noindex_site_setting-300x68.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/wordpress_noindex_site_setting-150x34.png 150w\" sizes=\"(max-width: 744px) 100vw, 744px\" \/><\/figure>\n\n\n\n<p>I might be wrong, but I think the only way a specific page or post can be set to index or noindex is if you are using Yoast, so go to the page\/post and check the following setting at the foot of the page:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"405\" height=\"810\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_noindex_site_setting.png\" alt=\"Yoast NoIndex Setting\" class=\"wp-image-272\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_noindex_site_setting.png 405w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_noindex_site_setting-150x300.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/yoast_noindex_site_setting-75x150.png 75w\" sizes=\"(max-width: 405px) 100vw, 405px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Amend_Robots_Meta_Tag_in_Magento\"><\/span>How To Amend Robots Meta Tag in Magento<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>As before, go to <em>Content -&gt; Design -&gt; Configuration<\/em>, click into your relevant Store View and amend the \u201cDefault Robots\u201d drop down option:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"217\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting-1024x217.png\" alt=\"Robots Meta in Magento\" class=\"wp-image-273\" 
srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting-1024x217.png 1024w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting-300x63.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting-150x32.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting-768x162.png 768w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting-1536x325.png 1536w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/magento_noindex_site_setting.png 1783w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"My_Site_Pages_Can_Be_Crawled_and_Indexed_by_Google_%E2%80%93_What_Next\"><\/span>My Site \/ Pages Can Be Crawled and Indexed by Google \u2013 What Next?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Once you are satisfied that your robots.txt file and robots meta tag are correct, you can again use the Inspect URL tool to check your page and request that Google crawls and indexes your page:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"903\" height=\"751\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_request_indexing.png\" alt=\"GSC Request Indexing\" class=\"wp-image-274\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_request_indexing.png 903w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_request_indexing-300x250.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_request_indexing-150x125.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/gsc_request_indexing-768x639.png 768w\" sizes=\"(max-width: 903px) 100vw, 903px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span 
class=\"ez-toc-section\" id=\"I_Also_Have_a_Bing_Webmaster_Account\"><\/span>I Also Have a Bing Webmaster Account!<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Do you?  I thought I was the only one.  Ok, you can do pretty much all the same things written in this article in <a href=\"https:\/\/www.bing.com\/webmasters\/homepage\" target=\"_blank\" rel=\"noreferrer noopener\">Bing Webmaster Tools<\/a> as you can in GSC  \u2013 so inspect the URL and Request Indexing:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"914\" height=\"594\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/bing_request_indexing.png\" alt=\"Bing Request Indexing\" class=\"wp-image-275\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/bing_request_indexing.png 914w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/bing_request_indexing-300x195.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/bing_request_indexing-150x97.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2021\/01\/bing_request_indexing-768x499.png 768w\" sizes=\"(max-width: 914px) 100vw, 914px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Ive_Done_All_This_and_My_Site_Pages_Still_Arent_Indexed\"><\/span>I\u2019ve Done All This and My Site \/ Pages Still Aren\u2019t Indexed!<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>In which case, you need a deeper delve into the configuration and functionality of your website to identify what other issues there might be.&nbsp; &nbsp;I can help you with this if you fill in the contact form below.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-css-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"staging\"><span class=\"ez-toc-section\" id=\"How_To_Check_If_Google_has_Indexed_Your_StagingTest_Site\"><\/span>How To Check If Google has Indexed Your Staging\/Test Site<span 
class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"293\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/dreamstime_m_137161764-1024x293.jpg\" alt=\"Someone Who Has Just Realised Their Test Site Is Indexed\" class=\"wp-image-102\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/dreamstime_m_137161764-1024x293.jpg 1024w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/dreamstime_m_137161764-150x43.jpg 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/dreamstime_m_137161764-300x86.jpg 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/dreamstime_m_137161764-768x220.jpg 768w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/dreamstime_m_137161764.jpg 1031w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Only three things are certain in life: death, taxes and your test site getting indexed by Google. &nbsp;<\/p>\n\n\n\n<p>Very rarely do you come across a new site launch without at some point realising the staging server has been left open to bots to come crawl and index.&nbsp; <\/p>\n\n\n\n<p>It\u2019s not necessarily the end of the world if a search engine\nwere to index a test site as it\u2019s fairly easy to resolve \u2013 but if you are\nrunning a test environment long term to develop new functionality alongside a live\nsite, then you need to ensure it is protected correctly as early as possible to\navoid duplicate content issues, and to ensure real life humans don\u2019t visit and\ninteract (i.e. 
try to buy something).<\/p>\n\n\n\n<p>I was formerly a developer, and probably made these mistakes myself more than once, but back then I didn\u2019t have an SEO being a pain in my arse all the time pointing these things out (back then, old school brochure-cum-web designers who didn\u2019t understand the limitations of tables and inline CSS were the pain in my arse).<\/p>\n\n\n\n<p>The following techniques are all tried and tested methods\nthat I\u2019ve used to identify these issues in the wild, though to protect the identity\nof my clients and their developers, I\u2019ve taken the selfless decision to set up\na couple of test sites using my own website content in order to illustrate what\nyou need to do, those being:<\/p>\n\n\n\n<p>test.daveashworth.co<br>alitis.co.uk<br> <br>Though by the time you read this, I will have followed my own advice and taken these down. I need all the visibility I can get, and the last thing I need is indexed test sites holding me back.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Set_Up_A_Google_Search_Console_GSC_Domain_Property\"><\/span><strong>Set Up A Google Search Console (GSC) Domain Property<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>One of the great things about the new GSC is that you can set up domain properties, which give you key insights across all subdomains associated with your website \u2013 on both HTTP and HTTPS. 
&nbsp;&nbsp;To set this up, simply select the domain option when adding a property (you also need to carry out the potentially not so straightforward task of adding a TXT record to your domain\u2019s DNS):<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"536\" height=\"425\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-domain-property-1.png\" alt=\"GSC Domain Property\" class=\"wp-image-108\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-domain-property-1.png 536w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-domain-property-1-150x119.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-domain-property-1-300x238.png 300w\" sizes=\"(max-width: 536px) 100vw, 536px\" \/><\/figure>\n\n\n\n<p> There are a whole host of reasons why a domain property is useful, in this case it\u2019s because if you have your test site set up on a sub domain and it\u2019s generating impression and clicks in search, you can spot this from within the \u201cPerformance\u201d section by filtering or ordering your pages: <\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"644\" height=\"267\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-performance-data-2.png\" alt=\"GSC Performance Data\" class=\"wp-image-110\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-performance-data-2.png 644w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-performance-data-2-300x124.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-performance-data-2-150x62.png 150w\" sizes=\"(max-width: 644px) 100vw, 644px\" \/><\/figure>\n\n\n\n<p>In addition, you should also check the \u201ccoverage\u201d section \u2013 in\nsome cases, Google will index your content:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"647\" height=\"641\" 
src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-indexed-data.png\" alt=\"GSC Indexed Data\" class=\"wp-image-87\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-indexed-data.png 647w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-indexed-data-150x150.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-indexed-data-300x297.png 300w\" sizes=\"(max-width: 647px) 100vw, 647px\" \/><\/figure>\n\n\n\n<p>Whilst in other cases, Google will spot that you have\nduplicate content in place, and kindly refrain from indexing, in which case you\nwould find it within the section \u201cDuplicate, Google chose different canonical\nthan user\u201d:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"646\" height=\"633\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-different-canonical.png\" alt=\"GSC Different Canonical\" class=\"wp-image-88\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-different-canonical.png 646w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-different-canonical-150x147.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-different-canonical-300x294.png 300w\" sizes=\"(max-width: 646px) 100vw, 646px\" \/><\/figure>\n\n\n\n<p>Even if this is the case, you should still endeavour to ensure\nit\u2019s not crawled moving forward.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Check_Google_SERPs_Using_Link_Clump\"><\/span><strong>Check Google SERPs Using Link Clump<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If you don\u2019t have access to GSC domain properties, or any access\nto GSC (if not, why not?) 
then you can check the SERPs to see if any test URLs\nhave made their way into the index.&nbsp;&nbsp; <\/p>\n\n\n\n<p>This is also a handy technique when pitching for new business,\nwhat better way to win over a potential client than to make their internal or\nexternal development team look like they are dicing with search visibility death\nby allowing this to happen in the first place, and that you\u2019re here to save the\nday.<\/p>\n\n\n\n<p>The steps are as follows:<\/p>\n\n\n\n<p>i) install the <a href=\"https:\/\/chrome.google.com\/webstore\/detail\/linkclump\/lfpjkncokllnfokkgpkobnkbkmelfefj?hl=en\" target=\"_blank\" rel=\"noopener\">Link\nClump Google Chrome Extension<\/a>, which allows you to copy and paste multiple URLs\nfrom a page to somewhere more useful like Excel.<\/p>\n\n\n\n<p>ii) Amend your Link Clump settings as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"622\" height=\"568\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-settings.png\" alt=\"Link Clump Settings\" class=\"wp-image-89\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-settings.png 622w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-settings-150x137.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-settings-300x274.png 300w\" sizes=\"(max-width: 622px) 100vw, 622px\" \/><\/figure>\n\n\n\n<p>The most important one to note is the Action \u201ccopied to clipboard\u201d\n\u2013 the last thing you want to happen here is to open up to a hundred URLs at\nonce.<\/p>\n\n\n\n<p>iii) Go to your favourite (or local) <a href=\"https:\/\/www.google.com\/\" target=\"_blank\" rel=\"noopener\">Google TLD<\/a>, click \u201csettings\u201d which you\nshould see at the bottom right of the page, and select \u201csearch settings\u201d where\nyou can set your \u201cresults per page\u201d to 100.<\/p>\n\n\n\n<p>iv) Return to the 
Google home page and use the \u201csite:\u201c query\noperator and append your domain.&nbsp; If you\nuse www or similar, remove this \u2013 so the command would be as follows:<br>\n<br>\n<a href=\"https:\/\/www.google.com\/search?q=site%3Adaveashworth.co&amp;rlz=1C1CHBF_en-GBGB829GB829&amp;oq=site%3Aorganic&amp;aqs=chrome.1.69i57j69i59l3j69i58.2695j0j4&amp;sourceid=chrome&amp;ie=UTF-8\">site:daveashworth.co<\/a><\/p>\n\n\n\n<p>You will be presented with a sample of up to 300 URLs\ncurrently indexed by Google across all the subdomains.&nbsp;&nbsp; Whilst you could manually review each result\nto spot rogue sites:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"634\" height=\"498\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps.png\" alt=\"Test Site in SERPs\" class=\"wp-image-90\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps.png 634w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-150x118.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-300x236.png 300w\" sizes=\"(max-width: 634px) 100vw, 634px\" \/><\/figure>\n\n\n\n<p>I find it far quicker and easier to right click and drag all\nthe way to the bottom of the page.&nbsp; You\nwill know if Link Clump is working as you will see the following occur to\ndenote links are being selected and copied:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"686\" height=\"427\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-in-action.png\" alt=\"Link Clump In Action\" class=\"wp-image-91\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-in-action.png 686w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-in-action-150x93.png 150w, 
https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/link-clump-in-action-300x187.png 300w\" sizes=\"(max-width: 686px) 100vw, 686px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"380\" height=\"274\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/urls-in-excel.png\" alt=\"URLs in Excel\" class=\"wp-image-92\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/urls-in-excel.png 380w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/urls-in-excel-150x108.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/urls-in-excel-300x216.png 300w\" sizes=\"(max-width: 380px) 100vw, 380px\" \/><\/figure>\n\n\n\n<p>Repeat this across SERPs 2 and 3 if available, and once all\nURLs are pasted into Excel, use sort by A-Z to easily identify your indexed\ncontent across all relevant sub domains.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Search_For_Text_Unique_To_Your_Site\"><\/span><strong>Search For Text Unique To Your Site<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>The above methods work if your test site is hosted on a\nsubdomain on the same domain as your live website.&nbsp; However, if your test site is located elsewhere,\ne.g. 
test.webdevcompany.com, then they won\u2019t work.&nbsp; In which case, this or the following methods\nmight.<\/p>\n\n\n\n<p>Find some content you believe is unique to your website \u2013 in my case I\u2019ve gone with the strapline of: \u201cEnhance Your Website\u2019s Organic Visibility And Traffic\u201d \u2013 then search for this within quotation marks.&nbsp;&nbsp; If a test site containing this content has been indexed, this search should reveal it:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"779\" height=\"755\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-again.png\" alt=\"Test Sites In SERPs Again\" class=\"wp-image-93\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-again.png 779w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-again-150x145.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-again-300x291.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/test-site-in-serps-again-768x744.png 768w\" sizes=\"(max-width: 779px) 100vw, 779px\" \/><\/figure>\n\n\n\n<p>As you can see, the home pages on the main site, test sub domain and separate test domain all appear.&nbsp; You may also inadvertently spot a competitor who has ripped off your content.&nbsp;&nbsp; Some would take that as a compliment, others would issue DMCAs \u2013 it\u2019s up to you, but the last thing you want is someone outranking you with your own copy.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Crawl_The_Site_Using_Screaming_Frog\"><\/span><strong>Crawl The Site Using Screaming Frog<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>I presume you\u2019re into SEO and therefore use Screaming Frog. 
If either of those answers is no, then well done for making it this far into this article (let me guess you\u2019re a developer who has dropped a bollock and looking to cover your arse before anyone else finds out?).<\/p>\n\n\n\n<p>If you don\u2019t have it, <a href=\"https:\/\/www.screamingfrog.co.uk\/seo-spider\/\" target=\"_blank\" rel=\"noopener\">download it here<\/a>.<\/p>\n\n\n\n<p>Within the Basic Settings, tick \u201cCrawl All Subdomains\u201d.&nbsp; You can also tick \u201cFollow Internal \u2018nofollow\u2019\u201d\nas some test environments may have this in place.<\/p>\n\n\n\n<p>Once the crawl is complete, peruse the list to see if there\nare any internal links in place to test sites.&nbsp;\nI came across this recently where a new Drupal site had gone live but\nwith all internal links within the blog posts pointing to a beta subdomain:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"881\" height=\"218\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-crawl.png\" alt=\"Screaming Frog Crawl\" class=\"wp-image-94\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-crawl.png 881w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-crawl-300x74.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-crawl-150x37.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-crawl-768x190.png 768w\" sizes=\"(max-width: 881px) 100vw, 881px\" \/><\/figure>\n\n\n\n<p>You can then click on each test URL and click on InLinks at\nthe bottom to find the offending internal link from the live to test site.&nbsp; In this case, I amended the Contact Us link\non the sitemap to point to the test URL:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"817\" height=\"261\" 
src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-internal-links.png\" alt=\"Screaming Frog Internal Links\" class=\"wp-image-95\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-internal-links.png 817w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-internal-links-300x96.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-internal-links-150x48.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/screaming-frog-internal-links-768x245.png 768w\" sizes=\"(max-width: 817px) 100vw, 817px\" \/><\/figure>\n\n\n\n<p>Once spotted, amend and re-crawl until there are no more\ninternal links taking visitors elsewhere.&nbsp;\nIf you are using WordPress, use a search\/replace plugin to find all test\nURLs and replace them with the live one.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Check_Google_Analytics_Hostnames\"><\/span><strong>Check Google Analytics Hostnames<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If your test site has the same Google Analytics account\u2019s\ntracking code installed as your live site, you will be able to spot this within\nGA if you go to a section such as \u201cBehavior\u201d -&gt; \u201cSite Content\u201d -&gt; \u201cAll\nPages\u201d and select \u201cHostname\u201d as a secondary dimension:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"687\" height=\"392\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-hostnames.png\" alt=\"Google Analytics Hostnames\" class=\"wp-image-96\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-hostnames.png 687w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-hostnames-150x86.png 150w, 
https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-hostnames-300x171.png 300w\" sizes=\"(max-width: 687px) 100vw, 687px\" \/><\/figure>\n\n\n\n<p>Further to this, you can also then filter the data further by\nexcluding from the report all visits to the main domain, which will leave all\nother instances in the list.&nbsp;&nbsp; In\naddition to test sites, you may also uncover GA Spam being triggered on a 3<sup>rd<\/sup>\nparty site:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"694\" height=\"122\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-exclude-hostname.png\" alt=\"Google Analytics Exclude Hostname\" class=\"wp-image-97\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-exclude-hostname.png 694w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-exclude-hostname-150x26.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/google-analytics-exclude-hostname-300x53.png 300w\" sizes=\"(max-width: 694px) 100vw, 694px\" \/><\/figure>\n\n\n\n<p>There are pros and cons to having the same GA tracking ID\nrunning on both your live and test environments, but personally, I see no\nreason to have separate accounts and instead would create multiple views within\nyour one account.&nbsp;&nbsp; For the live site,\nset up a filter to only include traffic to the live hostname, and vice versa\nfor the test site.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_To_Remove_and_Prevent_Your_Test_Site_From_Getting_Indexed\"><\/span><strong>How To Remove and Prevent Your Test Site From Getting Indexed<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>So you\u2019ve discovered your test site in the index using one\nof the techniques above, or, you want to make sure it doesn\u2019t happen in the\nfirst place.&nbsp; The following will all 
help\nwith this:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Remove_URLs_via_GSC\"><\/span>Remove URLs via GSC<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If your site is indexed, whether it\u2019s generating traffic or\nnot, it\u2019s best to get it removed.&nbsp;&nbsp; To do\nthis, you can use the \u201cRemove URLs\u201d section from the \u201cold\u201d GSC. &nbsp;&nbsp;&nbsp;<\/p>\n\n\n\n<p>Note, this will not work at domain property level as these\naren\u2019t catered for in old GSC.&nbsp; In order\nto do this, you need to set up a property for the individual test\ndomain. <\/p>\n\n\n\n<p>Once set up, \u201cGo To The Old Version\u201d and go to \u201cGoogle Index\u201d\n-&gt; \u201cRemove URLs\u201d.&nbsp;&nbsp; From here, select \u201cTemporarily\nHide\u201d and enter a single forward slash as the URL you wish to block, which will\nsubmit your entire site for removal:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"815\" height=\"352\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-remove-urls.png\" alt=\"GSC Remove URLs\" class=\"wp-image-98\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-remove-urls.png 815w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-remove-urls-150x65.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-remove-urls-300x130.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/gsc-remove-urls-768x332.png 768w\" sizes=\"(max-width: 815px) 100vw, 815px\" \/><\/figure>\n\n\n\n<p>This will remove your site from the SERPs for 90 days. To\nensure it doesn\u2019t return, you must take further steps.&nbsp; One of the following will suffice (and should\nbe carried out regardless of whether you are able to remove via GSC).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" 
id=\"Set_robots_tag_to_noindex_on_test_site\"><\/span>Set robots tag to noindex on test site<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Ask your developers to ensure that when running on the test domain,\neach page across the site generates a robots noindex tag:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;meta name=\"robots\" content=\"noindex\" \/&gt;<\/code><\/pre>\n\n\n\n<p> If your site is WordPress, you can set this via \u201cSettings\u201d -&gt; \u201cReading\u201d and selecting \u201cDiscourage search engines from indexing this site\u201d:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"576\" height=\"429\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/wordpress-reading-settings.png\" alt=\"Wordpress Reading Settings\" class=\"wp-image-99\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/wordpress-reading-settings.png 576w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/wordpress-reading-settings-150x112.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/wordpress-reading-settings-300x223.png 300w\" sizes=\"(max-width: 576px) 100vw, 576px\" \/><\/figure>\n\n\n\n<p>Whatever code or settings you use to prevent the test site\nfrom being indexed, you must ensure this is not migrated to the live site when new\ncontent or functionality is made live.&nbsp;&nbsp;\nTest site settings going live are one of the most common and most sure-fire\nways to mess up your live site\u2019s visibility.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Password_Protect_Your_Test_Site\"><\/span>Password Protect Your Test Site<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>From your web control panel or via the server, password\nprotect the directory in which your test site resides.&nbsp;&nbsp; There are numerous ways to do this \u2013 the best\nbet is to ask your hosting company or developers to 
configure this, or, there\nare plenty of good resources out there that will show you how to do this, such as:<\/p>\n\n\n\n<p><a href=\"https:\/\/one-docs.com\/tools\/basic-auth\" target=\"_blank\" rel=\"noopener\">https:\/\/one-docs.com\/tools\/basic-auth<\/a><\/p>\n\n\n\n<p>Once blocked, you should see an alert box when trying to\naccess your test site:<\/p>\n\n\n\n<p><a href=\"https:\/\/alitis.co.uk\/\" target=\"_blank\" rel=\"noopener\">https:\/\/alitis.co.uk\/<\/a><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"816\" height=\"278\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/password-protected-site.png\" alt=\"Password Protected Site\" class=\"wp-image-100\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/password-protected-site.png 816w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/password-protected-site-150x51.png 150w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/password-protected-site-300x102.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/password-protected-site-768x262.png 768w\" sizes=\"(max-width: 816px) 100vw, 816px\" \/><\/figure>\n\n\n\n<p>This will prevent search engines from crawling and indexing the\nsite.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Delete_site_and_return_page_status_410\"><\/span>Delete site and return page status 410 <span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If you no longer have need for your test site, you can simply\ndelete it.&nbsp; When search engines try to\nvisit pages that are no longer live, they will see the pages are deleted.&nbsp;&nbsp; By default, a broken page will return status\n404 (\u201cNot Found\u201d) \u2013 whilst this will get the site de-indexed in time, it will\ntake a while as there will be follow up visits to see if the broken page has returned.&nbsp;&nbsp; <\/p>\n\n\n\n<p>Instead, set the status to 410 
(\u201cGone\u201d) which will return the following message:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"705\" height=\"476\" src=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/status-410.png\" alt=\"Status 410\" class=\"wp-image-101\" srcset=\"https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/status-410.png 705w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/status-410-300x203.png 300w, https:\/\/daveashworth.co\/blog\/wp-content\/uploads\/2019\/07\/status-410-150x101.png 150w\" sizes=\"(max-width: 705px) 100vw, 705px\" \/><\/figure>\n\n\n\n<p>To do this across an entire domain, delete the site and\nleave the .htaccess file in place with the following command:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Redirect 410 \/<\/code><\/pre>\n\n\n\n<p>This will ensure the site gets de-indexed at the first time\nof asking (or at least quicker than a 404).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Block_via_robotstxt\"><\/span>Block via robots.txt<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>You can block the site from being crawled by implementing\nthe following commands in the test site\u2019s robots.txt file:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>User-agent: *\nDisallow: \/<\/code><\/pre>\n\n\n\n<p>This will prevent bots from crawling the site.&nbsp; Note: if your test site is currently indexed,\nand you have gone down the route of adding noindex tags to the site, do not add\nthe robots.txt command in until all pages have been de-indexed.&nbsp; If you add this in before all pages have de-indexed,\nthis will prevent them from being crawled and the robots tag detected, so the\npages will remain indexed.<\/p>\n\n\n\n<p>And that\u2019s it \u2013 I hope the above will be enough for you to\nfind, deindex and prevent your test site from being crawled ever again.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span 
class=\"ez-toc-section\" id=\"But_Remember\"><\/span><strong>But Remember<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>I cannot stress this enough \u2013 if you decide to implement\nrobots meta tags or robots.txt which disallow all bots from crawling and\nindexing your test site, make sure when you put your test site live that you do\nnot carry these configurations over to the live site, as you will risk losing your\norganic visibility altogether. &nbsp;<\/p>\n\n\n\n<p>And we\u2019ve all been there, right?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If key pages on your site, or indeed your whole site, are not showing up in search, you may have technical issues preventing search engines crawling and indexing your site.<\/p>\n","protected":false},"author":1,"featured_media":276,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-256","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-migrations"],"_links":{"self":[{"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/posts\/256","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/comments?post=256"}],"version-history":[{"count":2,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/posts\/256\/revisions"}],"predecessor-version":[{"id":2069,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/posts\/256\/revisions\/2069"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/media\/276"}],"wp:attachment":[{"href":"https:\/\/daveashworth.co\/blog\/
wp-json\/wp\/v2\/media?parent=256"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/categories?post=256"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/daveashworth.co\/blog\/wp-json\/wp\/v2\/tags?post=256"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}