Back in your school days, your teachers likely hammered into you how serious copying is. If you and a fellow student handed in essays that contained paragraphs of identical text, you’d both get a failing grade.
With this sort of attitude having been drilled into people for so long, it’s no surprise that many of us view duplicate content with a similar mindset.
The idea of having your web content match text found at a different URL can seem scary if you don’t understand what it means. But what is duplicate content, and what does it mean for your online marketing?
We’ll dive into both of those questions below, so read on for more information. Then consider partnering with WebFX’s team of 500+ experts for our search engine optimization (SEO) services.
What is duplicate content?
Duplicate content is a term Google uses to describe substantial chunks of content that appear on more than one URL.
A broad range of things can qualify as duplicate content — sometimes it’s entire pages that are identical, while other times it’s a few similar sentences. Something important to note is that duplicate content doesn’t purely involve word-for-word copies.
You can copy a piece of content from one URL to another and then swap out several of the words with synonyms, but Google will still often recognize the similarity.
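To see why swapping in a few synonyms doesn’t hide a copy, here is a minimal sketch of one common near-duplicate technique: comparing overlapping three-word sequences ("shingles") with Jaccard similarity. This is an illustration only. Google doesn’t publish how it detects duplicates, and the function names and sample sentences below are made up for the example.

```python
from __future__ import annotations

import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Illustrative sentences: the second swaps one word for a synonym.
original = "Our cotton pants are soft, durable, and available in every size."
reworded = "Our cotton pants are soft, sturdy, and available in every size."
unrelated = "Search engines crawl the web by following links between pages."

print(jaccard_similarity(original, reworded))   # high overlap despite the swap
print(jaccard_similarity(original, unrelated))  # no shared shingles
```

Even with one word changed, half of the three-word sequences still match, while the unrelated sentence shares none. Any cutoff you pick for calling two pages "duplicates" would be your own judgment call.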
What causes duplicate content?
While duplicate content might make you think of deliberate copying or plagiarism, the fact is, most duplicate content is completely benign and unintentional. Here are some of the most common causes:
- URL variations: Sometimes, what should be a single URL ends up becoming multiple URLs, like if your site uses session IDs and effectively creates a different version of the URL for each user.
- Alternate site versions: If you create a new website at a different domain or make an HTTPS version to replace the original HTTP one, you can end up with duplicate pages. Our duplicate content SEO checker can check for issues with an HTTP-to-HTTPS redirect!
- Pure coincidence: Sometimes, it’s possible to write duplicate content without being aware of it. If you and another site write about the exact same topic, you may end up writing a similar enough passage somewhere to count as duplicate content.
- Scraped content: Scraping means copying content directly from one URL to another. This sometimes happens without malicious intent, like if two pages use a large quote from the same source.
While usually unintentional, duplicate content can have a measurable effect on your online marketing, as we’re about to see.
How does duplicate content impact your SEO?
If you feel that duplicate content will affect the success of your SEO, you’re right — but maybe not in the way you think. Read on for more information about the relationship between duplicate content and SEO!
Is there a duplicate content penalty in Google?
One of the most commonly held beliefs about duplicate content is that Google will directly penalize you for having it on your site.
But is this true?
In short, no. Google has attempted to clarify, with limited success, that it does not directly penalize duplicate content in rankings. Google understands that duplicate content is bound to happen sometimes and that it’s generally not intentional. However, there is one notable exception: copied content.
What is copied content?
Copied content is a specific subset of duplicate content, namely the small portion of duplicate content that is intentional — and deceptive. Copied content is content that actively plagiarizes or tries to manipulate search rankings.
In rare cases where Google recognizes deceptive intent in duplicate content, it will apply a duplicate content penalty to the offender in the form of ranking them lower or even removing them from Google’s index entirely.
Most duplicate content doesn’t qualify as copied content, though, so Google doesn’t penalize it.
But just because Google doesn’t issue a direct duplicate content penalty doesn’t mean duplicate content can’t still harm your SEO.
How does duplicate content affect rankings?
When Google ranks content, it does so by determining which results are most relevant and provide the best user experience.
But when it encounters duplicate content, it can quickly become confused. If two pages are essentially the same, which one deserves to rank?
This confusion frequently results in lower rankings for both pages since Google’s algorithms aren’t confident about ranking either one particularly high. If you want to keep your rankings high, you’ll want to take steps to help Google determine which page is the “real” version.
How should you handle duplicate content?
In many cases, duplicate content isn’t anything to stress over.
Since it can occasionally bring down your rankings, though, it’s in your best interest to counter its effects with a few simple practices, which tools like Screaming Frog can help with. Here are three ways you can handle duplicate content SEO on your site!
1. Avoid duplicate content where possible
The first and most obvious way to avoid duplicate content issues is to avoid having duplicate content at all. This won’t always be something you’re fully in control of, but you can still work to minimize it.
If you notice that a particular product page is creating multiple URLs for different sizes of the same pants, for instance, you can work to consolidate those URLs into one. That will mean the content appears only on one page, not two, so there’s no duplicate content.
Without addressing those issues, you could end up with a different URL for every combination of “length” and “waist” options on the product page!
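As a rough sketch of what consolidating URL variations can look like, the snippet below maps each variation to one preferred URL by dropping query parameters that don’t change the page content. The parameter names (`sessionid`, `length`, `waist`), the example.com URLs, and the choice of HTTPS as the preferred scheme are all assumptions for illustration. Your own site will have its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that create URL variations without
# changing the main content; substitute the ones your site uses.
IGNORED_PARAMS = {"sessionid", "length", "waist"}


def canonical_url(url: str) -> str:
    """Collapse a URL variation onto one preferred form."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # stable parameter order so equivalent URLs compare equal
    return urlunsplit((
        "https",                        # assumption: HTTPS is the preferred scheme
        parts.netloc.lower(),           # host names are case-insensitive
        parts.path.rstrip("/") or "/",  # treat /pants and /pants/ as one page
        urlencode(kept),
        "",                             # fragments never reach the server
    ))


urls = [
    "http://example.com/pants?length=30&waist=32",
    "https://example.com/pants/?waist=34&length=32",
    "https://example.com/pants?sessionid=abc123",
]
unique = {canonical_url(u) for u in urls}
print(unique)
```

All three variations collapse to a single URL here. The sketch only computes the preferred address. Actually serving one page per product still depends on how your site or ecommerce platform builds its links.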
2. Use redirects
In some cases, duplicate content is a natural consequence of a particular site change, such as switching from an HTTP site to an HTTPS site.
In these situations, you can help Google avoid confusion by using 301 redirects, which mark a move as permanent. The way a redirect works is that when a search engine tries to visit URL A, the redirect sends it to URL B instead. So, if you have two URLs with duplicate content, you can use a redirect to send all the traffic to the preferred page.
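For a feel of what a 301 redirect looks like under the hood, here is a minimal sketch of a Python WSGI app that answers requests for a duplicate URL with a `301 Moved Permanently` status and a `Location` header pointing at the preferred page. The paths and the example.com address are hypothetical, and most sites would set this up in their web server or CMS rather than in application code.

```python
# Hypothetical mapping of duplicate paths to the preferred URL.
REDIRECTS = {
    "/old-pants": "https://www.example.com/pants",
}


def app(environ, start_response):
    """Tiny WSGI app: redirect known duplicates, serve everything else."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"This is the preferred page.\n"]


def demo(path):
    """Call the app directly, without a server, and capture its response."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body


print(demo("/old-pants"))
print(demo("/pants"))
```

The `demo` helper just calls the app in-process so you can see the status line and headers. To try it in a browser, you could host `app` with any WSGI server.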
3. Use rel=“canonical”
The rel=“canonical” attribute goes on a link element that you can insert into a page’s HTML head. It essentially tells Google that the page it appears on is a duplicate of another page, and asks Google to treat the other page as the original.
This is a good practice if, for example, you have a PDF version of a page on your site.
You want the original, HTML version of the page to rank, but the duplicate content might confuse Google.
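So, you attach rel=“canonical” to the PDF to tell Google it’s not the original. Because a PDF has no HTML head, the attribute is sent in an HTTP Link header for that file instead of in a link tag.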
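Here is a small sketch of the two forms a canonical hint can take: a link tag for an HTML page’s head, and an HTTP `Link` header for files like PDFs that have no HTML head. The helper names and the example.com URLs are made up for illustration, and how you attach the header depends on your web server.

```python
from __future__ import annotations

from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the link tag to place inside a duplicate HTML page's head."""
    # Escape the URL so characters like & stay valid inside the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


def canonical_link_header(preferred_url: str) -> tuple[str, str]:
    """Build the HTTP header to send with a non-HTML duplicate such as a PDF."""
    return ("Link", f'<{preferred_url}>; rel="canonical"')


# Hypothetical preferred page that should rank instead of its duplicates.
preferred = "https://www.example.com/guide"

print(canonical_link_tag(preferred))
print(canonical_link_header(preferred))
```

The first line of output is what you would paste into the duplicate page’s head. The second is the name and value of the header your server would send along with the PDF.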
WebFX can help you conquer Google rankings
Ready to optimize for duplicate content and get your website to the top of rankings?
WebFX can help! Our more than 25 years of experience have made us the ultimate experts on SEO, and we have all the skill and dedication to optimize your campaigns for top rankings. With our SEO services, you’ll receive help not only with handling duplicate content, but also with optimizing everything from page speed to keyword integration.