How to Handle Duplicate Content on Your Joomla Website?

One of the worst things (SEO-wise) that you can have on your Joomla website is duplicate content. Duplicate content dilutes the importance of your content because search engines cannot tell which version to list in the SERPs (Search Engine Results Pages). Duplicate content can appear on your website for many reasons:

  • Your links contain flags that do not change the content of your pages. For example, some versions of VirtueMart (under specific settings) append redirected=1 (added when a visitor is redirected from https to http) and flypage= to the links of your product pages. These flags are internal to VirtueMart and do not change the content at all.
  • You mix SEF and non-SEF links to the same pages on your Joomla website. For example, you link to both http://www.yourjoomlawebsite.com?option=com_content&articleId=5 and http://www.yourjoomlawebsite.com/yourarticle.html, which are actually the same page.

  • You have the exact same content under different links. For example, you have something like http://www.yourjoomlawebsite.com/featured/yourarticle.html and http://www.yourjoomlawebsite.com/yourarticle.html.

  • You have printer-friendly and/or PDF versions of your articles, but you are not using the noindex search engine directive for these versions.

  • Your website is indexed in both the www and the non-www versions. So, if you enter something like “site:www.yourjoomlawebsite.com” and “site:yourjoomlawebsite.com” on Google, you will get a different number of results.
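
Some of these causes can be spotted mechanically if you have a list of your URLs (from a crawl or a sitemap, for example). The following is a minimal Python sketch, not a complete audit tool: it treats the www and non-www hosts as one, ignores the scheme, and drops the content-neutral flags mentioned above. The function names and the IGNORED_FLAGS set are hypothetical and should be adapted to your website. It will not catch SEF/non-SEF pairs, since those can only be matched by comparing page content.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query flags that never change the page content on this site.
IGNORED_FLAGS = {"redirected", "flypage"}


def normalize(url):
    """Reduce a URL to a key that is shared by its likely duplicates."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www as the same host
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_FLAGS
    ]
    # The scheme is forced to http so that http/https variants collapse too.
    return urlunsplit(("http", host, parts.path or "/", urlencode(query), ""))


def group_duplicates(urls):
    """Return only the groups that contain more than one URL variant."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Calling group_duplicates() on your URL list returns a dictionary whose values are the URL variants that most likely serve the same page.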

Of course, the scenarios for having duplicate content are endless, and may be intentional or unintentional, but the outcome is always the same: a lower search ranking than the one your content deserves. So, how do you address duplicate content issues on your Joomla website? Below are some tips that are considered good practice from a search engine perspective and that address the problems above:

  • Use the robots.txt file: Major search engines read the robots.txt file before crawling a website. It tells them which URLs they may crawl and which they may not. The robots.txt file must be placed in the root directory of your website. For example, to keep Google/Yahoo/Bing away from VirtueMart pages with the ?redirected=1 flag, add the following line to your robots.txt file, under the User-agent group it should apply to (typically User-agent: *):

    Disallow: /*?redirected=1

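    The above line tells search engines not to crawl any URL that contains ?redirected=1 (that is, any URL where redirected=1 is the first query parameter). Keep in mind that robots.txt controls crawling: a blocked URL that other pages link to can still be listed in the results, only without its content.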

    Now let’s keep search engines away from the printer-friendly versions of the pages. We can do it from robots.txt if the link to the printer-friendly version contains printer=1. Here’s how to do it in this case:

    Disallow: /*printer=1

    Now, every link that contains printer=1 will be blocked from crawling. To match only links that end with printer=1, append $ to the pattern (of course, you might want to modify this slightly to accommodate your needs).

  • Use the noindex directive: We have explained above how to use robots.txt to keep search engines away from your printer-friendly pages. This can also be done at the page level, by using the noindex directive. All you need to do is add the following code inside the <head> section of your printer-friendly page:

    <meta name="robots" content="noindex, nofollow">

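    The above will instruct search engines not to index the printer-friendly page, and not to follow any links on that page. Note that a crawler can only see this tag if the page is not blocked in robots.txt, so pick one of the two methods for any given page.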

  • Do not have a mix of SEF and non-SEF links on your website: Check your website to see which pages link to non-SEF links and fix these links. Note that if these links are generated by Joomla extensions, then you may need to change the order of the plugins in order to fix the problem. We suggest that you contact Joomla Experts in this situation.

  • Have only one version of your website: By default, most websites are reachable at two addresses: a www version and a non-www version. You will need to choose which one to use (it is advisable to go with the one that has the highest PageRank on Google, and let go of the other one). Once you decide which version to keep, you need to tell search engines about your decision by redirecting (using the .htaccess file) traffic from the version that you do not want to the version that you want. Here’s how to do so (assuming you want to redirect non-www traffic to www):

    # Send non-www requests to the www host with a permanent (301) redirect
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^yourjoomlawebsite\.com$ [NC]
    RewriteRule ^(.*)$ http://www.yourjoomlawebsite.com/$1 [R=301,L]

  • Ensure that your pages do not contain content copied from other pages on your website: While search engines tolerate this on a couple of pages, they may treat it as spam when it is repeated across many pages.
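
Before deploying the robots.txt lines from the first tip, it helps to check which URLs they really match. The following is a minimal Python sketch of the wildcard matching that major search engines document for Disallow patterns (* matches any run of characters, and a trailing $ anchors the end of the URL). The helper names are hypothetical, and the sketch ignores User-agent groups and Allow rules, so treat it as a rough check rather than a full robots.txt parser.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a Disallow pattern into an equivalent regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Every '*' becomes '.*'; everything else is matched literally.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path_and_query, disallow_patterns):
    """True if any pattern matches from the start of the path and query."""
    return any(
        robots_pattern_to_regex(pattern).match(path_and_query)
        for pattern in disallow_patterns
    )


rules = ["/*?redirected=1", "/*printer=1"]

print(is_blocked("/shop/product.html?redirected=1", rules))
print(is_blocked("/shop/product.html?page=2&redirected=1", rules))
print(is_blocked("/article.html?printer=1&lang=en", rules))
```

The second call is worth a look: because the pattern includes the question mark, a link where redirected=1 is not the first query parameter is not blocked. If that is a concern on your website, a pattern such as /*redirected=1 is the broader choice.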

Once the above tips are implemented, your website will most likely be more appreciated by search engines, and subsequently your traffic will increase. In case you need help implementing any of the above, then don’t hesitate to contact us. We’re here to help you, we’re very friendly, and we will treat your business as ours. Oh, and be sure to check our very affordable rates!
