Why and how to delete content in bulk for SEO
Whenever I mention deleting content in bulk to clients or colleagues, I’m usually met with skepticism. It tends to take a few different forms, all of which make sense.
Many times, great efforts were made to produce this content, and deleting it is a blow to the emotions. Beyond the emotional hit is the financial one – content costs money, and good content costs real money. Throwing it away seems like such a waste.
And then there is skepticism around the concept – why would deleting content ever be a good idea? If Google doesn’t like it, they just won’t rank it, right? Bummer, but let’s not throw the baby out with the bathwater, right? It could rank in the future.
We have deleted content in bulk on many websites at our agency, and a recent case study I shared on Twitter surfaced a lot of questions.
Let’s first dive into the idea of deleting content from a theoretical point of view. Then, let’s look at the scenario I was up against and why, in this case, deleting content in bulk made sense. Finally, we’ll look at the results from our specific use case.
Why delete content?
There are several reasons to consider deleting content in bulk on a website, but the two most pertinent are related to low-quality content and authority/relevance.
Ever since Google launched the Panda update in 2011, their algorithms have been on the hunt for low-quality content. Panda has been permanently embedded into the core algorithm now, so websites no longer have to be impacted by a specific algorithm update to see the effects of low-quality content.
Google evaluates content quality page by page, but it also aggregates across all of a site’s indexed pages when evaluating the website as a whole. In other words, the quality of individual articles can affect how your entire site ranks.
Better put, here is what Google’s Gary Illyes told Search Engine Land’s Barry Schwartz about Panda:
“[Panda] measures the quality of a site pretty much by looking at the vast majority of the pages at least. But essentially allows us to take [the] quality of the whole site into account when ranking pages from that particular site and adjust the ranking accordingly for the pages.”
Google’s John Mueller has also discussed this concept multiple times, saying low-quality content on one part of your website can affect your entire website’s rankings.
Deleting the content that Google determines as “low quality” isn’t a silver bullet. However, as part of a larger strategy aimed at improving content quality and increasing topical authority, it can make sense.
Here’s a great article on this concept by Glenn Gabe.
Authority and relevance
Newer to the conversation than Panda is the concept of a website having topical authority. This is closely related, and oftentimes similar, to the idea that Google’s algorithms need to see your content as relevant to what your website is an authority on.
Mueller commented on this in a 2021 Office Hours session, saying:
“For search engines, you are building out your reputation of knowledge on that topic. If Google can recognize that a site is good for the broader topic, then it can surface that site for broader queries on the topic as well.”
There are many ways to build topical authority, for example:
- Adding backlinks from other websites that are authorities on the topic.
- Controlling relevance through internal links and anchor text.
A favorite approach to gaining topical authority is publishing a lot of highly relevant, in-depth content on the topic. With a good base of helpful content on a topic, Google is much more likely to trust you for that topic by giving you higher rankings.
However, for the same reasons that this approach works when done right, it can also hurt the website when done wrong.
Publishing content about a lot of topics, without building out topical authority on each of those topics, can dilute your website’s overall authority.
There are several ways to fix topics that are not well built out. In a perfect world, you’d prefer to fully build out that topic with lots of rich, in-depth content. However, this takes time and a healthy budget.
Deleting content is often a simpler and quicker approach.
Let’s dive into one specific situation.
The website we were working on primarily focused on one product line in the fashion niche. Since its inception in 2018, the brand built content around this product. The website ranked well for a variety of terms around this product.
In mid-2019, the brand started to expand into two more related product lines. At a glance, these three products seemed to make sense as they all fell under one umbrella.
Throughout the first part of 2020, the website started to gain some traction in rankings and organic traffic for these two new product lines. However, June 2020 was where that traction peaked.
For the second half of 2020, traffic and rankings started to decline across all parts of the website. By early 2021, traffic was down 40% from its previous June 2020 high.
It was tricky to evaluate much of anything in 2020, given that the year began with most of the world shut down due to COVID-19 and ended in a flurry of pent-up spending. Everything from buying patterns to seasonal trends to traffic changes had to be taken with a large grain of salt.
The drops for this website weren’t sudden, like something you might see from an algorithm update. The traffic decline was steady but never dramatic or extreme.
However, when you isolated the URLs from the three different product lines, it became pretty clear:
- The oldest product and related content were performing worse than in previous years, but not dramatically worse. However, newer content that was published in this product silo was significantly slower to rank.
- The two newest products and related content had lost the vast majority of their rankings and traffic.
Grouping “content silos” when analyzing poor performance is a great way to determine where problems might be.
In addition, when isolating traffic and keyword rankings from different silos, it’s important to distinguish if the losses are happening to one or two pages, or to the vast majority of the pages in the silo.
It’s common for a large share of traffic on a website to come from only a handful of pages. And, if one or two of those pages lose several key rankings, overall site traffic can drop considerably. But it isn’t as though all pages experienced a traffic drop.
In contrast, when the majority of pages in a silo lose traffic and keyword rankings, it says much more about the silo as a whole.
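To make the distinction concrete, here is a minimal Python sketch of that kind of silo-level check. It is not the tooling we used; the URL prefixes, the click numbers, and the 20% drop threshold are all made-up assumptions for illustration. The idea is to report, per silo, both the overall traffic change and the share of pages that declined, so that a loss concentrated in one or two pages looks different from a loss spread across the silo.

```python
from collections import defaultdict


def silo_of(url, silo_prefixes):
    """Return the name of the silo whose path prefix matches the URL."""
    for silo, prefix in silo_prefixes.items():
        if url.startswith(prefix):
            return silo
    return "other"


def summarize_silos(pages, silo_prefixes, drop_threshold=0.2):
    """Summarize traffic change per silo.

    pages: iterable of (url, clicks_before, clicks_after).
    drop_threshold: fraction of clicks a page must lose to count as
    "declined" (0.2 is an arbitrary, hypothetical cutoff).
    """
    stats = defaultdict(lambda: {"pages": 0, "declined": 0, "before": 0, "after": 0})
    for url, before, after in pages:
        s = stats[silo_of(url, silo_prefixes)]
        s["pages"] += 1
        s["before"] += before
        s["after"] += after
        if before > 0 and (before - after) / before > drop_threshold:
            s["declined"] += 1

    summary = {}
    for silo, s in stats.items():
        summary[silo] = {
            "pages": s["pages"],
            # Share of pages in the silo that lost more than the threshold.
            "share_declined": s["declined"] / s["pages"],
            # Relative change in total clicks for the silo.
            "traffic_change": (s["after"] - s["before"]) / s["before"] if s["before"] else 0.0,
        }
    return summary


# Hypothetical data: (url, clicks in earlier period, clicks in later period).
pages = [
    ("/product-a/guide", 100, 95),
    ("/product-a/review", 50, 20),
    ("/product-b/guide", 80, 10),
    ("/product-b/review", 40, 5),
]
silo_prefixes = {"product-a": "/product-a/", "product-b": "/product-b/"}

summary = summarize_silos(pages, silo_prefixes)
for silo, row in summary.items():
    print(silo, row)
```

A silo where `share_declined` is close to 1.0 points to a silo-wide problem, while a large `traffic_change` with a low `share_declined` suggests a handful of pages lost key rankings.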
Here is an example. We isolated the newer product lines, which revealed massive drops across the board.
When looking at the same report for the original product and related content, traffic was a lot more stable. There was no growth, but there weren’t the large drops that we saw with the other product silos.
Organic traffic charts in Google Analytics told a similar story.
In summary, we were up against several challenges:
- Keyword rankings, traffic, and sales for the two new product lines had dropped considerably.
- Keyword rankings, traffic, and sales for the original product line were not growing, even with new content and links going live.
What to do
Taking a step back, it seemed clear that this could be an authority and relevance problem in Google’s eyes. For several years, this site only published content around one core topic.
And while the new product lines seemed closely related, that doesn’t mean that Google transferred that topical authority over to this new content.
In other words, expanding into new products/topics might have diluted the authority the site had in its original product.
Sure, these new product silos ranked well out of the gate, perhaps relying on the site’s overall authority for these early rankings.
But, Google will sometimes change its mind on what your site is and isn’t an authority on, as we discussed above.
This can make sense, especially upon a deeper evaluation:
- Most of the backlinks to the site were pointing to the first product silo.
- Most of the consistent traffic was going to the first product silo.
Plan of action
There are several viable strategies to consider in a situation like this:
- Invest heavily into the silos that are losing, by producing lots of great content, updating old content and building niche-relevant backlinks.
- Delete the content in the silos that are losing.
- Do nothing and hope Google changes its mind about the silos.
Scenario #1 is expensive and time-consuming, and there is no guarantee it will work.
Scenario #2 is disappointing and removes the opportunity to rank for this content in the future. There is also no guarantee it will work.
Scenario #3 makes sense if you can make the case that the content should be ranking. In other words, you are 100% sure it is the best content on the topic. But, if you can’t make that case, then this route doesn’t make much sense.
It’s really tough to evaluate whether your content truly deserves to be the #1 result. I will say that we all tend to overestimate how good our own content is.
For this specific scenario, here is where we landed:
- We didn’t think the content in the poorly performing silos was bad, but it wasn’t the best content available.
- There probably wasn’t enough of it to fully establish topical authority, as a look at competitors showed they had a lot more content than we did.
- Additionally, the appetite just wasn’t there for a heavy investment in product silos that hadn’t even been proven yet.
In scenario #2, the idea is that you delete the content silos that Google doesn’t deem you an authority on, thus raising the overall authority of the site.
You also rebalance the relevance of the site.
This is what we set out to do.
Once we decided to delete all of the content from the poorly performing silos, there was some work to be done.
We isolated all of the URLs to be deleted and then evaluated each URL to see if it had backlinks or not.
We set up 301 redirects from all of the URLs that had backlinks in order to preserve the authority coming from those backlinks. We pointed these 301s to the most relevant pages for each URL. In the end, 26% of the URLs were 301 redirected.
For the remaining 74% of the URLs, we removed the content and returned a 410 status code. Unlike a 404 (Not Found), which only says the page could not be located, a 410 (Gone) tells Google explicitly that the page was removed on purpose and is not coming back.
We opted to let a lot of the pages go “dead” without redirects because we wanted Google to know that we no longer had content on this topic. The 410 code gave this information to Google even faster than the 404 code would.
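The decision rule described above can be sketched in a few lines of Python. This is an illustration under assumptions, not our actual workflow: the function name `plan_removals`, the URLs, and the backlink counts are hypothetical, and in practice the backlink data would come from whatever link tool you use. URLs with backlinks and a chosen target get a 301, URLs without backlinks get a 410, and URLs with backlinks but no target yet are set aside for manual review.

```python
def plan_removals(urls, backlink_counts, redirect_targets):
    """Decide how each URL slated for deletion should respond.

    urls: URLs to remove.
    backlink_counts: {url: number of external backlinks} (missing means 0).
    redirect_targets: {url: most relevant live page}, chosen by hand.
    Returns (plan, needs_review), where plan maps url -> (status, target).
    """
    plan = {}
    needs_review = []
    for url in urls:
        if backlink_counts.get(url, 0) == 0:
            # No link equity to preserve: let the page go with a 410.
            plan[url] = (410, None)
        elif url in redirect_targets:
            # Preserve backlink authority by redirecting to a relevant page.
            plan[url] = (301, redirect_targets[url])
        else:
            # Has backlinks but nobody has picked a target yet.
            needs_review.append(url)
    return plan, needs_review


# Hypothetical inputs for illustration only.
urls = [
    "/product-b/guide",
    "/product-b/review",
    "/product-c/faq",
    "/product-c/sizing",
]
backlink_counts = {"/product-b/guide": 12, "/product-c/sizing": 3}
redirect_targets = {"/product-b/guide": "/product-a/guide"}

plan, needs_review = plan_removals(urls, backlink_counts, redirect_targets)
redirected_share = sum(1 for status, _ in plan.values() if status == 301) / len(urls)

print(plan)
print("needs review:", needs_review)
print("share redirected:", redirected_share)
```

Keeping the "needs review" list separate is a deliberate choice in this sketch: picking the most relevant redirect target is a judgment call, so the script only applies targets that a person has already chosen.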
We followed this up by running a site crawl and fixing any internal links that were still pointing to 301 or 410 pages.
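As a rough sketch of that cleanup step, the Python below takes a list of internal links (as most crawlers can export them) and flags every link on a live page that points at a redirected or removed URL. The data and the function name `stale_internal_links` are hypothetical, and the `(source, target)` pair format is an assumption about the crawl export rather than any specific tool's output.

```python
def stale_internal_links(links, plan):
    """Find internal links that point at redirected or removed URLs.

    links: iterable of (source_url, target_url) pairs from a site crawl.
    plan: {url: (status, new_target)} for every deleted URL, where
          new_target is None for URLs that now return 410.
    Returns a list of (source, old_target, suggested_target). A suggested
    target of None means the link should be removed or pointed elsewhere.
    """
    fixes = []
    for source, target in links:
        if source in plan:
            # The linking page itself was deleted, so nothing to fix there.
            continue
        if target in plan:
            _status, new_target = plan[target]
            fixes.append((source, target, new_target))
    return fixes


# Hypothetical removal plan and crawl export.
plan = {
    "/product-b/guide": (301, "/product-a/guide"),
    "/product-b/review": (410, None),
}
links = [
    ("/product-a/sizing", "/product-b/guide"),
    ("/product-a/care", "/product-b/review"),
    ("/product-b/review", "/product-b/guide"),
    ("/product-a/care", "/product-a/guide"),
]

fixes = stale_internal_links(links, plan)
for source, old_target, new_target in fixes:
    print(source, "->", old_target, "should become", new_target)
```

Links to 301 pages can be updated to the final destination directly, while links to 410 pages need a human decision: remove the link or point it at another relevant page.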
All of this was performed during the final two weeks of February 2021.
Once we had all of the pages deleted and necessary redirects set up, we knew it was a waiting game.
Depending on how frequently Google crawls your site, it can take a while for them to get through all of the deleted pages and re-index the pages on your site that are still live.
And, in terms of seeing significant change as a result of all of the deleted content, this can often take until the next Core Algorithm Update, or even longer.
Gabe, among others, has talked about how recovery from Core Updates often won’t happen until subsequent updates “because there are site-level quality algorithms that refresh during broad core updates.”
While we weren’t trying to recover from a Core Update in this situation, this is what I typically expect when looking for gains from deleting content in bulk.
However, we didn’t want to just sit on our hands. Another tool we had at our disposal was to publish additional content about the topic we were now 100% focused on.
Over the next five months, we published 30 articles, ranging from informational topics to specific product reviews.
Throughout Q2 of 2021, rankings and traffic continued to erode slowly.
From the low point in May, traffic spiked 4.5x as a result of the June/July Core Updates. Revenue jumped 3x from its previous low.
The results really invigorated us, but we noticed that rankings and traffic started to slowly erode as we hit August and September 2021. It was slow and subtle, but we were attuned to this now.
As a result, we published a total of 56 additional articles throughout Q4 2021. We started seeing gains in December 2021. Given that this is a website in the fashion niche, you could argue that Q4 simply has increased demand. But the traffic gains were matched by ranking increases.
We started a third content push in March 2022, aiming to publish 15-20 articles per month.
Traffic has now eclipsed its monthly high from July 2021, and May 2022 is on track to be the site’s highest-traffic month ever.
Revenue is also up, with April 2022 revenue 97% higher than April 2021. Summer and Q4 are the biggest sales seasons, so hopes are high that revenue will continue to climb and exceed previous highs.
- It is almost always better to improve the content rather than delete it. If you can do that, that should be your go-to. However, as we know, this isn’t always possible, for a variety of reasons.
- Deleting content is not a silver bullet, and it doesn’t work every time. Spend the time analyzing the details of the website’s specific situation to determine if the use case lines up with when deleting content makes sense.
- We deleted a lot of content along with publishing lots of new, topically focused content. If you’re looking for a true A/B test about deleting content only, this is not it. But the results are compelling nonetheless.
- Pay attention to topical authority, even if you aren’t suffering from traffic drops. Google is constantly tweaking the way its algorithm evaluates authority and relevance, and you don’t want to fall on the wrong side of a Core Update.
For some niches, building topical authority is relatively simple. Publishing high-quality content that answers the query can be enough.
However, in other niches, building topical authority involves a lot more than just publishing high-quality content. Sourcing experts to contribute to long-form content pieces might be necessary. Building high-quality, niche-relevant backlinks and brand mentions might be needed. And, in these cases, a healthy amount of time and patience might be needed.
Either way, focus your efforts on establishing true topical authority. And when you fall short, consider deleting content in bulk as an option.