Penguin is a series of Google algorithmic updates launched to improve the quality of search results by discounting the value of certain manipulative link building practices employed by marketing vendors to improve their visibility in Google and other major search engines. Web sites that have used these practices will notice a drop in the traffic they receive from Google organic search results because of these algorithmic updates.
In this guide:
- 1 What really happened?
- 2 Tracking issues
- 3 Content issues
- 4 Technical Issues
- 5 Google Updates
- 6 What is Google Penguin
- 7 Steps To Recovery
- 8 How to judge the value of a Link
- 9 Defining and measuring the success
- 10 How to Remove Links
- 11 The disavow tool
- 12 Link Removal Tools
- 13 Writing Effective Link Removal Emails
- 14 A Common Sense Approach
What really happened?
When you see a traffic drop from Google, it does not necessarily mean that it is due to a Penguin penalty. There might be other causes, such as:
Tracking issues
On certain occasions, the actual reason for a Google organic traffic drop is that the Google Analytics tracking pixel is missing from one or more pages.
Content issues
The content of the site might be of poor quality, or there might be duplicate content issues that have resulted in a Panda or manual penalty.
Check with Google Web Master Tools and Siteliner to find internal duplicate content issues (duplicate title tags, meta descriptions) and check for duplicates of:
Technical Issues
This is very common after a site migration, due to a disallow directive in robots.txt, a wrong implementation of rel="canonical", severe site performance issues etc.
Outbound linking issues – Many times a site links to spam sites or to websites operating in untrustworthy niches.
- Links to low trust web sites
- Paid Links
Negative SEO – If you experience a sudden traffic drop, then you might have been a victim of negative SEO. Negative SEO usually refers to the practice of a competitor buying low quality links and pointing them to your web site with the intention of hurting your organic traffic.
Google Updates
In order to get a better understanding of how the various Google updates affected the organic traffic, it is also recommended to identify all the core dates on which any updates took place (official and unofficial) and how they affected the web site's traffic.
Google Algorithm Updates Sources
In order to isolate the Google traffic, you will need to create the following segment:
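If you have exported your Analytics data, a rough equivalent of such a segment can be sketched in Python; the column names and figures below are assumptions for illustration, not the actual export format:

```python
import csv
from io import StringIO

# Sample export data (invented for illustration): one row per source/medium.
data = """date,source,medium,sessions
2013-05-01,google,organic,120
2013-05-01,bing,organic,15
2013-05-01,google,cpc,40
"""

# Keep only rows where the source is "google" and the medium is "organic".
reader = csv.DictReader(StringIO(data))
google_organic = [
    row for row in reader
    if row["source"] == "google" and row["medium"] == "organic"
]
print(google_organic)  # one matching row, with 120 sessions
```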
Google Algorithm Updates Tools
- Sistrix Updates Tool
- Barracuda Penguin Tool
What is Google Penguin
Google Penguin is an algorithmic update first launched by Google in April 2012 to improve the value of the search results returned to users by targeting any form of web spam (also known as spamdexing or Black Hat SEO), such as:
Key facts about Penguin
- Penguin is an algorithmic update, which means that it is possible to recover from it without a reconsideration request.
- Penguin seems to affect keyword rankings more than entire sites.
- Recovery is not possible before the next update, when the algorithm is rerun.
- You DO NOT receive a notification in Google Web Master Tools if you have been hit by a Penguin update.
- You can only submit a reconsideration request when you have received a manual penalty.
- The key date is the 24th of April 2012, so a traffic drop starting after this date is a strong indication that you have been hit by the Google Penguin algorithmic update.
How to find out if you were hit by Penguin
As Penguin is related mostly to backlinks, it is absolutely necessary to examine the following:
- Over-optimised anchor text (external and internal)
- Over-optimised anchor text on low quality web sites
- The dates on which your web site's traffic was affected.
- Whether you received any notification in Google Web Master Tools.
- Is it a site-wide drop or does it seem to be keyword-specific?
Steps To Recovery
Step 1 – Match updates to Google Analytics organic traffic
Google Analytics is a very useful tool in this case as it can help us identify if there was any traffic drop after each Penguin update.
- April 24, 2012: Penguin 1
- May 25, 2012: Penguin 1.2
- October 5, 2012: Penguin 1.3
- May 22, 2013: Penguin 2.0
- October 4, 2013: Penguin 2.1
- October 18, 2014: Penguin 3.0
Step 2 – Compare two weeks before and two weeks after
Now you need to compare the organic traffic from the two weeks prior to each Penguin update against the two weeks after it, allowing a few days of buffer on both sides to give the algorithm time to shake out.
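As a sketch, the comparison windows for a given update date can be computed like this (the three-day buffer is my assumption; the text above only says "a few days"):

```python
from datetime import date, timedelta

# Two-week windows on either side of a Penguin update date,
# separated by a few days of buffer (3 days here is an assumption).
update = date(2013, 5, 22)  # Penguin 2.0
buffer = timedelta(days=3)
window = timedelta(days=14)

before = (update - buffer - window, update - buffer)
after = (update + buffer, update + buffer + window)
print("before:", before[0], "to", before[1])
print("after: ", after[0], "to", after[1])
```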
Step 3 – Investigating what dropped
Now that you have a clear understanding of which updates affected the web site's organic traffic from Google, you also need to find out what actually dropped.
Step 4 – Which keywords dropped?
Penguin seems to affect web sites more at a keyword level rather than site-wide. For the same period that you checked your traffic, do a comparison for the top keywords you are optimising your web site for, to see if any of them were severely affected.
Step 5 – Gather All Links
Now you have reached the point where you need to gather all links to start the analysis. For this process you will need your backlink profile from the following tools:
After you have exported all the data and removed all duplicates with Excel, start the analysis of the anchor text. What you need to do initially is to count the instances of each anchor text by using the following functions:
Microsoft Excel Definition: Counts the number of cells within a range that meet the given criteria.
Syntax: COUNTIF(range, criteria)
COUNTIF is your go-to function for counting the number of instances of a particular string.
Microsoft Excel Definition: Returns a value that you specify if a formula evaluates to an error; otherwise, it returns the result of the formula. Use IFERROR to trap and handle errors in a formula.
Syntax: IFERROR(value, value_if_error)
IFERROR is really simple and will become an important piece of most of our formulas as things get more complex. IFERROR is your method to turn those pesky #N/A, #VALUE or #DIV/0 messages into something a bit more presentable.
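If you prefer scripting to Excel, the same counting can be sketched in Python; the anchor texts below are invented for illustration:

```python
from collections import Counter

# A merged, de-duplicated list of anchor texts from your backlink exports
# (values invented for illustration).
anchors = [
    "cheap flights", "Brand Name", "cheap flights",
    "click here", "Brand Name", "cheap flights",
]

# Equivalent of Excel's COUNTIF over the anchor text column.
counts = Counter(anchors)
print(counts["cheap flights"])  # 3
print(counts.most_common(1))    # [('cheap flights', 3)]
```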
Step 6 – Combine data to pull out learnings
Now you need to pull data from Google Analytics for each update (15 days before vs 15 days after) for the top anchor texts, in order to discover whether there was a drop in organic traffic for the keywords that were used as anchor text in links pointing back to your web site (top anchors). Here is what you need to do, step by step:
- Combine all link resources in excel
- Keep only the i) Anchor Text ii) Linking Domains iii) Links Containing Anchor Text
- De-duplicate data
- Use COUNTIF and IFERROR to find anchor text instances
- Extract data from Google Analytics (pre and post update)
- Find the percentage of the traffic drop by using the formula (traffic_before – traffic_after) / traffic_before, selecting the columns and cells that represent the data for each date range.
- Create a pivot table and combine the following information:
- The drop;
- # of LRDs;
If you are not very familiar with Excel and pivot tables, I recommend downloading the following spreadsheet and using it as a guide, as it will help you save a lot of time.
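The drop formula from the list above can be sketched in Python; the traffic figures are invented for illustration:

```python
# Percentage traffic drop per anchor text, mirroring the
# (before - after) / before formula from the steps above.
# Session counts are invented for illustration.
traffic = {
    # anchor text: (sessions 15 days before, sessions 15 days after)
    "cheap flights": (1200, 480),
    "brand name": (900, 870),
}

def drop_pct(before, after):
    """Fractional traffic drop; positive means traffic fell."""
    return (before - after) / before

for anchor, (before, after) in sorted(traffic.items()):
    print(f"{anchor}: {drop_pct(before, after):.0%} drop")
```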
Step 7 – Check links using Link Detox
Link Detox is a very powerful tool if used properly, as it combines data from multiple sources. Here is what you need to do:
- Create an account here https://www.linkresearchtools.com/
- Go to Link Detox https://www.linkresearchtools.com/toolkit/dtox.php
- Enter Domain to analyze
- Analyze links going to Root Domain
- Activate the NOFOLLOW evaluation
- Select theme of domain from dropdown
- Select whether Google has sent you a manual spam action (Yes, No, Do Not Know)
- Upload any links you already have (Ahrefs, Open site Explorer, Majestic, Google Web Master Tools)
- Upload Disavowed links (if you have disavowed any)
- Hit the Run Link Detox and wait
- After one hour, or even the next day, go to Reports to find the Link Detox report
- Classify all your anchor texts before you start auditing your links
- Compound (brand + money, e.g. "Debenhams toys collection")
- Download the report in CSV format and open it with Excel.
- Keep only the following columns
- From URL – This is the URL of the page that links to your web site.
- To URL – This is the page of your web site that the external web site is linking to.
- Anchor Text – This is the keyword or keyword phrase used as link text.
- Link Status – Whether or not the link passes link juice to the search engines (follow or nofollow).
- Link Loc – The location of the link on the page (paragraph, footer, widget etc.). Very useful when you need to remove it.
- HTTP-Code – The HTTP response code returned by the linking page when it was crawled.
- Link Audit Priority – The importance of reviewing the links coming from each domain. The higher the priority, the more urgent it is to examine the links.
- DTOXRISK – How toxic each link is.
- Sitewide links – A site-wide link is one that appears on most or all of a website's pages (blogroll, footer etc.).
- Disavow – Whether Google has been notified through the Disavow Tool that this link is to be ignored.
- Power Trust – The Power Trust metric shows how powerful and trusted a page is in the eyes of Google.
- Power Trust Domain – The Power Trust metric applied at the domain level.
- Rules – Spam link classification (banned domain, link network etc.)
Step 8 – Create additional columns
Before you start, you will need to create the following columns:
- Contact Email
- Contact Page URL
- Contacted (Yes, No)
- Removed (Yes, No)
- Page Power Trust (Majestic)
- Domain Power Trust (Majestic)
- Niche (use majestic for this)
- Page Indexed (double check)
- Date of 1st Contact
- Date of 2nd Contact
- Date of 3rd Contact
The following are supplementary:
- Edu Domains (Majestic)
- Domain Toxic Links (OpenLinkProfiler)
- Governmental Domains (Majestic)
- Page Facebook Shares
- Page Facebook Likes
- Page Twitter Shares
- Page Google+ +1s
Step 9 – Keep only one URL per domain
Sort the data by domain, create an additional column after the domain column, and paste the function =IF(B1=B2,"duplicate","unique") into the first cell. Copy it down the whole column, and then use the filter control you have applied to view only the unique values.
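The same de-duplication can be sketched in Python, without needing to sort first; the URLs are invented for illustration:

```python
from urllib.parse import urlparse

# Keep only the first URL seen for each linking domain, mirroring the
# =IF(B1=B2,"duplicate","unique") approach. URLs invented for illustration.
urls = [
    "http://example.com/page-1",
    "http://example.com/page-2",
    "http://another-site.org/post",
]

seen = set()
unique = []
for url in urls:
    domain = urlparse(url).netloc
    if domain not in seen:
        seen.add(domain)
        unique.append(url)

print(unique)  # first URL from example.com, plus the another-site.org one
```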
Step 10 – Exclude non-verified domains
Simply use the filter on the anchor text column to exclude the unverified links. These are links that no longer exist.
Step 11 – Exclude disavowed Links
If you have done a link audit before and you have disavow files, it would be good to exclude those links too, as this will help save precious time. You can review them separately.
Step 12 – Start with banned domains
Now is the time to start reviewing your links. Follow backlinks are always a higher priority, as they violate the Google web master guidelines directly.
- Apply a filter to all cells.
- Then apply filter to view links with TOX1 (banned links).
- Use the tag columns to mark whether or not each domain needs to be removed, using a descriptive tag.
- Mark any URL or domain that needs to be disavowed so that you can create a file at the end very easily.
- Be careful while reviewing, as in certain cases several domains might not be indexed for reasons other than being penalised (robots.txt, noindex tag).
- Also, several domains might be very authoritative and trustworthy, with nothing wrong about them linking to you. For instance, the following link was flagged as TOXIC during one of my link audits.
Step 13 – Domains infected with viruses
- Apply a filter to all cells
- Then apply filter to view links with TOX2 (virus infected)
- If you find a good one, do not remove it but simply contact the web master.
- Remove only the bad ones.
- Double check the domains with one of the following tools:
Step 14 – Audit TOX3 domains
All these links are classified by Link Detox as highly toxic, so you will need to check them very carefully and remove them if you agree with Link Detox's suggestions.
Step 15 – Double check Google Web Master backlinks
Pay particular attention to the links imported from Google Web Master Tools during your reviews, as John Mueller from Google has confirmed that it should be the primary source of backlinks used to clean up your web site from bad links.
How to judge the value of a Link
Before deciding to take any action on links that might be toxic, and that could therefore result in your web site receiving a Penguin penalty from Google, you need to devote time to understanding all the data that you have pulled into the spreadsheet, such as:
Domain Trust Flow (Majestic): How respected the domain is on the web. If the domain has high trust in general, this is an indication that Google values it (this usually applies to domains with a Trust Flow over 10).
Page Trust Flow (Majestic): This metric is similar to Domain Trust Flow but is applied at a page level.
Domain Power Trust (Cemper): This metric determines the quality of a website according to its strength and trustworthiness. It analyses data in real time from over 24 sources including Google, Moz, SEMrush, MajesticSEO, Sistrix, and many more.
There are four types of links:
- High Trust and Low Power – Links from highly trusted domains such as universities or governmental institutions. These links are usually very difficult to get and have a very positive impact on your web site's credibility.
- Low Trust and High Power – These links require further research, as they are not necessarily always good.
- Low Trust and Low Power – These links do not help much in general, as they may come from new, dormant, or even penalised sites. Review any of these sites carefully before you decide to build links on any of their pages.
- High Trust and High Power – This is ideally what you are looking for. Pursue these links actively, as they will strongly benefit your web site.
DTOXRISK: The risk for each link based on how harmful it might be for your web site, according to Link Detox's calculations (client feedback, observations, linking domains, neighbourhood, internal and external SEO experts, known Google publications etc.)
To get a full understanding, please see the following page: http://www.linkdetox.com/faq
Link Audit Priority: The higher the priority the more important it is to review each link.
Link Status: Whether a link is follow or no-follow.
Link Location: Where the link is placed on the page (header, footer, navigation etc.)
Niche: The niche that the domain falls under (finance, property, computers etc.)
HTTP code: These codes help identify the cause of a problem based on the response sent from the server (for detailed information see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html)
Page indexed: Whether or not the page is indexed by Google. Please double check, and also use http://indexchecking.com/.
On top of all these metrics, you will need to take into consideration how search engines judge the value of each link.
Which Links To Remove?
- Link networks
- Article submissions
- Directory submissions
- Duplicate content links (e.g. guest blog duplicated over 100s of domains)
- Spammy bookmarking sites
- Forum profiles (if done for backlinks)
- Malware/hacked sites
- Gambling/Adult sites (if your site is not in the same niche)
- Comment links with over-optimized anchor text (e.g. Cheap Flights instead of John Doe)
- Blog roll links
- Footer links
- Site wide links (in most cases)
- Scraper sites
- Any auto-generated links (xRumer forum posts, etc.)
Defining and measuring the success
After identifying and prioritising the most toxic links to be removed, check the results of this work. Depending on the website's case, there can be three scenarios:
- Manual Penalty: Even if you have received the "Manual spam action revoked" message from Google, this does not ensure that your website traffic will recover to its pre-penalty level.
- Algorithm hit: In this case, the effect is visible only after a month or so. For example, it may take as long as six months to recover from a Penguin update.
- Proactive approach: For sites that started out with heavily toxic link profiles, cleaning up proactively helps avoid a manual penalty or a future algorithmic hit.
How to Remove Links
When it comes to removing links, there are several options available.
Contact web masters
- Draft a document containing the complete list of links to be removed, and send it to the web master in a single, short, well-drafted email.
- Adopt one communication channel (email/LinkedIn/Facebook). Switch channels only when the previous one gets no response.
- Be polite to web masters; they are doing their best to solve your problem. Keep a human touch in your communication: an email addressing the web master by his or her name is more likely to get a response and to build a strong professional relationship.
- Emailing from the domain whose links you wish to remove improves the chance of a response from the web master.
- Avoid paying any high fees for link removal.
- Be polite, because you are asking the web master for a favour.
- Be polite even when talking to spammers, because they can blackmail you for money if you act rudely.
- If you spammed somebody's site, be polite, admit it, and apologise; then make sure you don't repeat it (the web master will check).
The disavow tool
- If you aren't able to remove all the toxic links, use the disavow tool before submitting a reconsideration request. The disavow tool shouldn't be used as a shortcut at the expense of the tasks mentioned above, because Google checks your previous removal efforts before entertaining your request.
- Make a spreadsheet of your removal efforts: sort it into lists of removed links, un-removed links, and the methods of removal used.
- Focus on the un-removed links. Try to sort the data by domain.
- You can either disavow a full domain or just one link from a domain. Choose the method wisely, and make separate lists in Notepad for domains and for individual links.
- Don't include http:// or www. before a domain; use the domain: prefix instead (e.g. domain:example.com).
- Don't append paths or parameters to a domain entry; use a full URL only when disavowing a single link.
- Put each domain on a new line, and add the reason for disavowing on a line prefixed with # for Google's reference.
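Following the rules above, a minimal disavow file might look like this (the domains and reasons are placeholders for illustration):

```
# Spammy directory, web master did not respond after three emails
domain:spammy-directory-example.com
# Single bad link on an otherwise trustworthy site
http://blog-example.com/old-post.html
```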
Link Removal Tools
Link Research Tools / Link Detox
This is my personal favourite tool as it has some very powerful features such as:
- Nofollow Link Evaluation
- Disavow Links Import
- Link Spam Classification
- Link Prioritisation
- Exporting to excel
- Disavow File Creation
- Data Visualisation
- Contact details extraction
I found it extremely useful when it comes to link networks, penalised web sites, article sites, directories and forums. However, sometimes a few of the results are not 100% accurate, especially for Toxic 1 links, probably because the tool is unable to retrieve the right data. If you use it carefully, and have a full understanding of the metrics used to categorise each link, you will be able to speed up your work without removing any good links.
Kerboo
Kerboo can be used both to analyse your backlink profile and to audit your disavow file. All links are classified based on their LinkRisk score, from 0 to 1000. You can add your own link data to a Kerboo audit profile; the more data you use, the more accurate the LinkRisk score. There is a nice user interface that allows you to easily audit all your link profile data, while the Peek tool makes information discovery a lot easier, saving you a lot of time.
LinkDelete
LinkDelete is, in my opinion, one of the best removal services I have seen. Their algorithm is very sophisticated and accurate, and they also handle manual outreach to web masters and send you reports that you can submit to Google. Different packages are available depending on your budget and needs, and the service keeps improving.
Remove'em
Remove'em can help you identify spammy links and save time by managing the outreach from the same place. The interface is not that great, but it definitely helps speed up the process. I use it mostly for link removal after I have completed the link audit.
Rmoov
Rmoov does not attempt to classify links as spammy; it simply helps you speed up the link outreach by locating any contact information available on each web site.
Buzzstream
Buzzstream is not a link audit tool, but it can be used to help you with link removal, as it pulls the contact information from each web site and can also retrieve the Whois information if there is no contact information available on the web site. Other useful features include: i) powerful templates ii) list building iii) flexible filtering and iv) notifications.
After you have tried removing as many links as possible, use the disavow tool provided by Google for the links that you were not able to remove. This tool is only meant to be used as a last resort, so you have to make very good use of it.
404 the pages
In several cases, if you cannot remove some deep links at all, you can change the URL of the target page so that all these links point to a 404 page. I personally redirect them to another site that I create specifically for this reason, as I do not like to increase the errors on any web site that I am working on.
Writing Effective Link Removal Emails
When writing to web masters to remove low quality links that might hurt, or that have already hurt, your web site, always bear in mind the following simple guidelines:
- Use Your Own Email
- Be brief
- Be polite
- Explain the situation
- Make it easy to remove the links
- Provide the pages that are linking to your web site
- The linked pages
- Explain what exactly needs to be done
- Remove links
- Make links nofollow
- Disavowing is not enough, please remove any mention of our site from any page.
- Notify me when the links have been removed.
Here are a few examples that you can use:
Email example 1
Subject line: Please remove a link
I am currently trying to remove any links I can as my web site has been penalised by Google. Your site is really great but in order to increase the chances of getting out of the penalty I will have to ask for your help.
Here are the pages that are linking to my site:
www.example.com/randompage is linking to www.mysite.com/my-page with the “Luxury London Hotels” anchor text.
The links need to actually be removed, rather than just disavowed. Even if they are "nofollow", I'd still like them removed.
Please let me know as soon as you remove the links from your web site.
All The Best
Email example 2
I am trying to remove some backlinks pointing to our website, www.hurford-salvi-carr.co.uk. I would really appreciate your help in removing these links. Here is the info…
My website is linked on your website here:
The URLs point to the following pages:
And it used the following anchor text:
- Sussex Flats To Let
If you could please send a confirmation note letting me know that the link has been removed, I would really appreciate it. Thanks in advance! I hope to hear from you soon.
A Common Sense Approach
Based on everything that has been said by John Mueller and Matt Cutts from Google, as well as input from industry experts and my personal experience, I would suggest the following actions:
- Review very carefully all links.
- Clean up as many links as you can, and in particular the ones you created yourself (directories, forums, mini sites, profiles, press releases on poor quality sites)
- Disavow all toxic links that you could not remove.
- Make sure 60% of your anchor text is branded and only 40% focuses on money keywords (20% exact, 20% miscellaneous).
- Review your niche carefully to understand its typical link profile (branded vs non-branded percentage, link types).
- Build only quality links, to restore link equity and also to build trust with Google.
- Grow your brand.
- Get media coverage.
- Wait until Google reruns the Penguin algorithm and reassesses your site.
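As a quick sanity check of the branded vs money anchor-text split suggested above, one can sketch something like the following (the brand term and anchor list are invented for illustration):

```python
# Check the branded vs money anchor-text split recommended above.
# The brand term and anchor list are invented for illustration.
BRAND = "acme"
anchors = ["acme", "acme shop", "cheap widgets", "acme store", "buy widgets now"]

# Count anchors that mention the brand, then compute their share.
branded = sum(1 for a in anchors if BRAND in a.lower())
ratio = branded / len(anchors)
print(f"branded share: {ratio:.0%}")  # branded share: 60%
```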
I personally try to remove as many links as possible to recover a site from Penguin, even when it is not strictly necessary, simply because I do not want the sites that I work with to be associated with any spammy or low quality sites. Furthermore, I am not 100% convinced that the disavow tool works without actually removing any links. Another reason is that on the rare occasions when I choose to disavow individual links instead of whole domains, I might miss several bad links.
Depending on your time and resources, you will have to decide whether you really wish to clean up the site's backlink profile or to focus only on recovering from Penguin. Link removal campaigns in general have a 5–20% success rate, so for algorithmic updates they are inefficient, but you should always discuss this option with your clients.
If you have any suggestions, please feel free to email me. I will also keep improving this guide as much as I can with new tools and information.