Today we will talk about how duplicate content is created and how you can identify it on your site. When a search engine detects duplicate content, such as variations caused by URL parameters, well-maintained search engines (like Google) group the duplicate URLs into one cluster and then select the best URL to represent the cluster in search results.
Having duplicate content can affect your website in a variety of ways, but unless you have been duplicating deliberately, it's unlikely that one of those ways will be a penalty.
There are some penalties related to having the same content as another site, for example if you're scraping content from other sites and republishing it, or if you republish content without adding any value of your own.
How to Identify Duplicate Content
Why duplicate content occurs:
- canonical issues (www and non-www versions);
- pagination, when different pages have identical titles and meta descriptions;
- various versions of the home page (e.g. www.site.com and www.site.com/index.php);
- incorrect internal navigation creating several URLs for one and the same page (e.g. www.site.com/page.php?id=200 and www.site.com/category/page.php?id=200), etc.
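To make the first and third causes above concrete, here is a minimal Python sketch that groups URLs differing only by a www prefix, a trailing slash, or a default document name. The helper names `normalize` and `group_variants` are made up for this illustration, and treating `index.php` and `index.html` as default documents is an assumption about the site. It is not how any search engine actually clusters URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to a rough key so obvious variants compare equal."""
    if "://" not in url:
        url = "http://" + url  # assumption: bare hosts are treated as plain HTTP
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www as the same host
    path = parts.path
    # Assumption: these file names are the site's default documents.
    for default_doc in ("index.php", "index.html"):
        if path.endswith("/" + default_doc):
            path = path[: -len(default_doc)]
    path = path.rstrip("/") or "/"
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def group_variants(urls):
    """Return only the keys that more than one URL maps to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    crawl = ["www.site.com", "site.com/index.php", "http://www.site.com/", "site.com/about"]
    for key, variants in group_variants(crawl).items():
        print(key, "<-", variants)
```

A key-based comparison like this will not catch the internal navigation case, where the same page lives under two different paths. For that you need to compare the page content itself.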
Using Google Webmaster Tools, you can easily find pages with duplicate titles and meta descriptions. Simply click on “HTML Improvements” under “Search Appearance”.
Webmaster Tools will show you which pages have duplicate meta descriptions and page titles.
Download the Screaming Frog web crawler and use it to crawl up to 500 pages for free.
Page Titles/Meta Descriptions
You can find duplicate page titles or meta descriptions by clicking on the “Page Titles” or “Meta Description” tab and filtering for “Duplicate.”
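If you already have the HTML of your pages saved, the same check can be approximated with a short script. This is only a rough sketch using Python's standard library, not a description of how any crawler works internally. The names `HeadParser` and `find_duplicates` are made up for this example, and the input is assumed to be a dictionary mapping each URL to its HTML.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicates(pages):
    """pages maps URL -> HTML. Returns (duplicate titles, duplicate descriptions)."""
    by_title, by_description = defaultdict(list), defaultdict(list)
    for url, html in pages.items():
        parser = HeadParser()
        parser.feed(html)
        title = parser.title.strip()
        if title:
            by_title[title].append(url)
        if parser.description:
            by_description[parser.description].append(url)
    # Keep only values shared by two or more URLs.
    titles = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    descriptions = {d: urls for d, urls in by_description.items() if len(urls) > 1}
    return titles, descriptions
```

Calling `find_duplicates` on your saved pages returns two dictionaries, each listing the URLs that share a title or a description.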
Identify Duplicate Content Using Google Search
Use the inurl: Search Operator
inurl:Dell_laptops
This works because the string “Dell_laptops” is found in the URL we are auditing. Automated scraper sites will often replicate every aspect of a domain, even the URL structure, which lets us easily hunt this scraped content down.
Use the intitle: Search Operator
When you use the intitle: search operator, you can quickly find pages that duplicate the title tag of a page you are auditing. For example, let's say we want to see whether this page is being duplicated. We would run the following query:
intitle:"Muhammad Iqbal - Wikipedia, the free encyclopedia"
Some of the pages in those results have legitimate reasons to have the same title tag, but if you look further down you can see that some sites are simply scraping Wikipedia. This is the type of duplicate content that SEOs should be concerned about.
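If you audit many pages, typing these queries by hand gets tedious. Here is a small Python sketch that builds the query strings and a matching Google search link. The helper names `operator_query` and `search_url` are made up for this example, and the script only builds the link for you to open in a browser. It does not fetch any results.

```python
from urllib.parse import urlencode


def operator_query(operator: str, value: str) -> str:
    """Build an inurl: or intitle: query, quoting values that contain spaces."""
    if operator not in ("inurl", "intitle"):
        raise ValueError("this sketch only handles inurl and intitle")
    if " " in value:
        value = '"' + value + '"'  # exact phrase match for multi-word titles
    return f"{operator}:{value}"


def search_url(query: str) -> str:
    """Return a Google search link for the query, with the query URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    for query in (
        operator_query("inurl", "Dell_laptops"),
        operator_query("intitle", "Muhammad Iqbal - Wikipedia, the free encyclopedia"),
    ):
        print(query)
        print(search_url(query))
```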
The canonical tag (rel="canonical") is an essential tool in the search engine optimization (SEO) toolbox. It is often a better solution than a 301 redirect for cleaning up duplicate content issues.
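As a quick audit aid, here is a minimal Python sketch that checks whether a page's HTML declares a canonical URL. The names `CanonicalParser` and `canonical_url` are made up for this example. It only reads the tag from HTML you supply and makes no claim about how search engines interpret it.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_url(html: str):
    """Return the declared canonical URL, or None if the page has none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


if __name__ == "__main__":
    page = '<head><link rel="canonical" href="http://www.site.com/page.php?id=200"></head>'
    print(canonical_url(page))
```

A page that returns `None` here declares no canonical URL, which makes it a good candidate for a closer look.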
In our next article we will learn about the rel="canonical" tag and how to fix duplicate content issues on your site. Luckily, you have control over your own site, so you have the power to fix them.
Stay tuned for our next update. Enjoy.
Image Credit:- Vector15