How to Tell Robots Not to Index Our Archive?

Blogger uses several URL formats, for example for archives and posts. A post gets a URL of the form .../year/month/post.html; if we omit /post.html, the remaining .../year/month is an archive page. Each page can generally be accessed and processed by a bot. Googlebot, for example, can index a page and then display it in Google's search results.

Because every page can be accessed, we may end up with duplicate content. Say an article titled "Foo go to Bar" has the URL "/2018/10/foo.html"; it will also appear on the "/2018/10" archive page. On Google this can produce two results for the same content, which is often called a duplicate. For example, if we search Google for the title "Foo go to Bar", we may get two results from our blog with two different URLs: one is the post link with the title "Foo go to Bar", and the other is the archive page link with the title "BlogTitle: 10/1/18 - 11/1/18" and a description that contains "Foo go to Bar", or the same description on every archive link. To check, search for your blog with the keyword site:BLOG-URL, for example site:example.blogspot.com.

Expression | URL Format | Meaning
data:view.isHomepage | / | Homepage
data:view.isPage | /p | Static page
data:view.isArchive | /2018 or /2018/11 | List of posts from the given year or month
data:view.isPost | /2018/11/foo.html | Post (article) page
data:view.isSearch | /search | Any URL that contains 'search'
data:view.isLabelSearch | /search/labelname | List of posts with the label 'labelname'
data:view.search.query | /search?q=keyword | Blog search result page
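
Most of these expressions are true/false values that a theme can test with Blogger's <b:if> tag. The snippet below is only a minimal sketch of that pattern: the comments stand in for whatever markup a theme would actually output, and the choice of which page types to test is an assumption made for illustration.

```xml
<b:if cond='data:view.isPost'>
  <!-- rendered only on a post page such as /2018/11/foo.html -->
<b:elseif cond='data:view.isArchive'/>
  <!-- rendered only on an archive page such as /2018/11 -->
<b:else/>
  <!-- rendered on every other page type -->
</b:if>
```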

To avoid this duplication, we cannot use the Custom robots.txt feature the way we would to keep robots out of a label page (by entering "Disallow: /search/LabelName"). If we enter "Disallow: /2018/10", the rule applies to everything under that path, so our post pages, such as the example "foo.html", would be blocked as well. Another problem is that the number of "/year/month/" paths keeps growing as the blog gets older. The solution is therefore to use the robots <meta> tag.

Meta Tags for Robots Search Engine Crawlers

The robots meta tag tells a search engine how it should handle a page. The tag is placed inside the <head> section of the page.

SYNTAX

<meta name='robots' content='ATTRIBUTE'/>

Attribute | Meaning
all | No restrictions on indexing or serving (indexing allowed, following links allowed, and so on).
noindex | Don't show this page in search results.
nofollow | Don't follow the links on this page.
none | Equivalent to noindex, nofollow.
noarchive | Don't show a "Cached" link in search results.
nosnippet | Don't show a text snippet or video preview in the search results for this page.
notranslate | Don't offer translation of this page in search results.
noimageindex | Don't index images on this page.
unavailable_after: [RFC-850 date/time] | Don't show this page in search results after the specified date/time. The date/time must be given in RFC 850 format.
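
As a small illustration of the syntax, several attributes can be combined in one comma-separated content value. The tags below are only a sketch; the particular combinations are examples chosen here, not something a theme requires.

```xml
<!-- Two rules combined in one comma-separated list -->
<meta name='robots' content='noindex, nofollow'/>

<!-- Same effect as the tag above, written with the shorthand value -->
<meta name='robots' content='none'/>

<!-- Keep the page out of search results and suppress the "Cached" link -->
<meta name='robots' content='noindex, noarchive'/>
```

A page normally needs only one of these tags; they are listed together here for comparison.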

Blogger offers two ways to provide robots meta tags: automatically and manually.

Custom Robots Header Tags

This is the automatic way to provide meta tags on a page. There are three sections here: Homepage, Archive and Search Pages, and Posts and Pages.

  1. Inside the Blogger Dashboard select 'Settings'.
  2. Select 'Search preferences'.
  3. See the 'Crawlers and indexing' section.
  4. Click 'Edit' in the 'Custom Robots Header Tags' section.
  5. Choose 'yes' to enable this feature.
  6. Select the desired option.
  7. Save.

As the sections show, Archive and Search Pages share a single setting. If we want to provide the meta tag for archive pages only, we have to use the manual method.

Manually Adding Meta Robots Tag in Blogger

The manual method adds the meta tag through code in the template. For example, to output the robots meta tag only on archive pages, we can insert the code below between the <head> and </head> tags.

WITH THE <B:IF> TAG

<b:if cond='data:view.isArchive'>
    <meta content='noindex,noarchive' name='robots'/>
</b:if>
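
Older Blogger themes may not expose the data:view expressions. A hedged alternative, assuming the theme provides data:blog.pageType and that it reports "archive" on archive pages, is to test that value instead. This is a sketch of the same idea under that assumption, not a replacement for the code above.

```xml
<!-- Assumed fallback for themes without data:view: compare the page type string -->
<b:if cond='data:blog.pageType == "archive"'>
  <!-- Same robots rules as before: no indexing, no "Cached" link -->
  <meta content='noindex,noarchive' name='robots'/>
</b:if>
```

Use only one of the two conditions in a given theme. If the Custom Robots Header Tags feature is also enabled for archive pages, a page may end up with two robots meta tags; Google documents that it applies the more restrictive rule when such tags conflict.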
