Let’s find out why Noindex, Nofollow, and rel=canonical attributes should not be used together and how to correctly point search crawlers to a canonical URL.
What Does «Should Noindex Pages Have Canonicals» Mean?
If a site has several similar pages, the search engine treats only one of them as canonical. That page gets priority for indexing and display in SERPs. Ideally, a webmaster defines the main page and gives search crawlers clear instructions on which of the available copies is canonical. When those instructions are missing, crawlers select the canonical page on their own. To steer that choice, some site owners set the Noindex, Nofollow directive on a page as a ban on selecting it as canonical. However, Noindex, Nofollow cannot be used this way. Why?
A canonical page must be a priority for indexing. The Noindex directive indicates that the page should be excluded from indexing, while Nofollow signals that the links on the page should not be followed. When Noindex, Nofollow is added to a page that crawlers have defined as canonical, the search engine can combine these signals with the canonical signal. In this case, the effect of the directives that prohibit indexing can be passed to all copies of the page.
What triggers this issue?
Crawlers have to select a canonical page when several unique URLs lead to the same content on the site. For example, a website may have similar pages that have:
- URLs with and without www;
- http and https protocols;
- dynamic URLs generated when selecting products with different parameters;
- uppercase and lowercase characters in URLs;
- versions for different types of devices, language versions;
- the same entry published in different sections.
All these variations will be perceived by the search engine as copies of the canonical page. You can find details and examples in the guide for developers: https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls?hl=en.
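The variations listed above can be grouped programmatically before deciding which URL should be canonical. The following is a minimal Python sketch, not a definitive implementation. The `normalize_url` and `group_duplicates` names and the `IGNORED_PARAMS` list are hypothetical, and the rules (dropping `www.`, forcing `https`, lowercasing the path) are assumptions that only hold if your server really serves the same content for all of those variants.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters assumed not to change page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize_url(url: str) -> str:
    """Map common duplicate URL variants to a single comparison key."""
    parts = urlsplit(url)

    # Treat "www." and bare hosts as the same site (assumption).
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat paths as case-insensitive and ignore a trailing slash (assumption).
    path = parts.path.lower().rstrip("/") or "/"

    # Drop ignored parameters and sort the rest so parameter order does not matter.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    query = urlencode(sorted(kept))

    # Force https so http and https variants collapse to one key.
    return urlunsplit(("https", host, path, query, ""))


def group_duplicates(urls):
    """Return only the groups that contain more than one URL variant."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Each group returned by `group_duplicates` is a set of URLs that a crawler may treat as copies of one page, so each group should end up with exactly one canonical URL.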
How to check the issue?
Search crawlers determine the canonical page based on several factors:
- the rel canonical attribute is set on the page;
- internal links refer to the page;
- URL is included in the sitemap file;
- the page uses HTTPS;
- human-readable URL format.
You can check which version of a page is indexed, and which URL Google selected as canonical, with the URL Inspection tool in Search Console.
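Alongside Search Console, a page's HTML can be checked directly for the conflicting combination described in this article. The sketch below uses only Python's standard `html.parser` module. The `RobotsCanonicalParser` class and `find_conflict` function are hypothetical names, and the check is deliberately limited: it looks only at the robots meta tag and the canonical link tag in the markup, not at HTTP headers.

```python
from html.parser import HTMLParser

BLOCKING_DIRECTIVES = {"noindex", "nofollow"}


class RobotsCanonicalParser(HTMLParser):
    """Collect robots meta directives and the rel=canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = set()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Attribute names arrive lowercased; values may be None for bare attributes.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("name", "").lower() == "robots":
            for token in attributes.get("content", "").split(","):
                if token.strip():
                    self.robots.add(token.strip().lower())
        elif tag == "link" and "canonical" in attributes.get("rel", "").lower().split():
            self.canonical = attributes.get("href") or None


def find_conflict(html: str):
    """Return details if the page has both a canonical URL and noindex/nofollow."""
    parser = RobotsCanonicalParser()
    parser.feed(html)
    blocking = sorted(parser.robots & BLOCKING_DIRECTIVES)
    if parser.canonical and blocking:
        return {"canonical": parser.canonical, "directives": blocking}
    return None
```

A return value of `None` means the combination was not found in the markup. Any other result names the canonical URL and the directives that conflict with it.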
Why is this important?
When the noindex directive is applied to the canonical page, that page is not indexed. As a result, the website loses organic traffic.
How to fix the issue?
John Mueller, Senior Webmaster Trends Analyst at Google, recommends that canonical URLs send unambiguous signals to search crawlers, so that the URLs the webmaster has defined are the ones selected as canonical. Ideally, search engines should find no alternatives for canonicalization.
The following actions will help confirm the preferred main pages to the search engine:
- setting a tag with the rel canonical attribute;
- adding rel=canonical to the HTTP header;
- specifying canonical URLs in the Sitemap;
- using HTTPS protocol;
- using a canonical URL for internal links;
- adding references to the canonical page in the source code of the duplicate pages.
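The first two actions in the list above can be sketched as a pair of small Python helpers that produce the markup and the header value. The function names are hypothetical. The output follows the commonly documented formats for a canonical link tag and an HTTP `Link` header, but how the strings are injected into pages or responses depends on your own stack.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Build the link tag to place in the <head> of every copy of the page."""
    # Escape the URL so characters such as & are valid inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


def canonical_link_header(url: str) -> str:
    """Build the HTTP response header, useful for non-HTML files such as PDFs."""
    return f'Link: <{url}>; rel="canonical"'
```

For example, `canonical_link_header("https://example.com/shoes")` produces `Link: <https://example.com/shoes>; rel="canonical"`. Whichever method is used, the page that the canonical URL points to should not carry Noindex, Nofollow.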