Understanding the Panda algorithm
The change in Google's ranking algorithm, named Panda after one of its engineers, impacted 11.8% of US queries by demoting sites with poor content: unoriginal or not very useful.
"We want to encourage a healthy ecosystem..." Google said.
Panda is a program that Google launched manually from time to time to assess the "quality" of sites; it was integrated into the organic algorithm in January 2012.
It produces an overall score assigned to the site, which the algorithm takes into account to rank its pages.
No other criterion of the algorithm alters this score.
Google has a long-term vision: webmasters will eventually react and adapt their sites to these new conditions, even if doing so makes them less convenient for users at first.
The criticism the firm had endured - recall the April 1st joke about the yacht named "AdSense" belonging to the CEO of Demand Media - hurt it, and it had to react.
Dates of the Panda updates; the first two are official, the following ones are estimated:
- 1.0: February 24, 2011. Applied to US sites.
- 2.0: April 11. Applied to all English-language sites worldwide. It impacted 2% of queries.
- 2.1: May 10.
- 2.2: June 16.
- 2.3: July 23.
- August 5-8. A change in the general algorithm reversed the effects of Panda on some sites. The date depended on the region.
- 2.4: August 12. Panda rolled out in all languages except Chinese, Japanese, and Korean.
- 2.5: September 26-28. Again favors big media sites (YouTube, eBay) and penalizes their competitors. (Ref).
- October 14, November 1, November 18.
- In January 2012, Google started to integrate Panda into the general algorithm; updates will eventually be made in real time as sites evolve.
However, a new iteration of Panda is still performed each month.
Criteria of the Panda algorithm
Theoretically, Panda distinguishes a quality site from a site without interest as follows:
Google's algorithm looks for authoritative sites that consistently offer new information and original statements, as opposed to those churning out five hundred words on a topic with no special background in it.
Another more recent quote:
Demoting low-quality sites that did not provide useful original content or otherwise add much value.
This sentence is almost circular: it defines a low-quality site as one that does not provide useful original content.
But webmasters are looking for more precise criteria. Here are the criteria used to assess the "quality" of sites, based on information given by Google and on the experience of webmasters:
- Content copied from another site. Overly long quotations may be treated as duplicate content.
- Unoriginal content. Even a completely different article that repeats the same ideas (the same keywords, in practice) is considered of lesser quality.
- Shallow or unhelpful content.
- The fact that many users block a site in search results is considered a negative signal by Panda since the second iteration of April 2011. This is official (though Google tries to separate abusive blocking from genuine complaints).
- It is likely that the design of a site is taken into account as a quality factor.
- Several similar pages on the same site. Even more so if their titles differ by only one or two words.
- Category or tag pages and lists of pages can be considered thin content. This is not new, but now the whole site is penalized for it.
- If all pages are built on the same template, you are more likely to be affected, especially if that template is commonly used by low-quality sites.
- If part of a site matches these criteria, the overall ranking of the site suffers and its other pages are downgraded too: a global score is assigned to the site.
- Sites linked to a penalized site are downgraded as well. Panda acts like a negative PageRank.
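The duplicate-content criteria above can be illustrated with a simple similarity measure. Here is a minimal sketch in Python, assuming pages are compared as raw text using Jaccard similarity over word 3-shingles; the texts and the 0.5 threshold are arbitrary illustrations, not Google's actual method, which is certainly far more sophisticated:

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word n-grams)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets (0 = disjoint, 1 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical page texts for illustration
page_1 = "google panda targets sites with poor unoriginal content"
page_2 = "google panda targets sites with poor unoriginal content copied elsewhere"
page_3 = "a recipe for chocolate cake with fresh strawberries"

sim_near_dup = jaccard(shingles(page_1), shingles(page_2))   # high: near-duplicate
sim_unrelated = jaccard(shingles(page_1), shingles(page_3))  # zero: unrelated
```

Shingling catches copied passages even when a few words are changed, which plain string comparison would miss; a pair scoring above some threshold would be a candidate duplicate.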
Many believe that the exit rate, combined with a low time spent on the page, is the starting point of Panda. And if Panda runs monthly, it may not be because it processes a lot of data, as officials said, but because it needs to accumulate a month of measurements before judging. What is certain is that it makes massive comparisons.
The exit rate is different from the bounce rate. It is the percentage of visitors to a page who leave the site after seeing that page. If it is high, the page is considered of little quality, and this score is transmitted to the pages that link to it.
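The exit rate described here can be computed from raw session data. A minimal sketch, assuming each session is recorded as an ordered list of the page URLs visited (a hypothetical data format, not a specific analytics API):

```python
from collections import defaultdict

def exit_rates(sessions):
    """Exit rate per page: the fraction of that page's views that were
    the last view of a session, i.e. the visitor left the site afterwards."""
    views = defaultdict(int)
    exits = defaultdict(int)
    for pages in sessions:
        for page in pages:
            views[page] += 1
        if pages:
            exits[pages[-1]] += 1
    return {page: exits[page] / views[page] for page in views}

# Hypothetical sessions for illustration
sessions = [
    ["/home", "/article-a", "/article-b"],  # visitor left after /article-b
    ["/home", "/article-b"],                # visitor left after /article-b
    ["/article-a"],                         # single-page session (a bounce)
]
rates = exit_rates(sessions)
```

Note the difference from the bounce rate: the single-page session counts as a bounce, but it only raises the exit rate of `/article-a`, not of the other pages visited in longer sessions.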
Panda was presented as a separate program that requires vast resources because it analyzes page content and compares pages to detect what is original and what is widely repeated.
In addition to originality, content is matched against queries, and the same page can answer many queries. Google certainly uses semantic analysis techniques to compare the content of pages and go beyond mere similarity between sentences; this was revealed by a team member on a forum. Panda's role is to perform this heavy analysis of content.
This does not prevent the engine from using various shortcuts and giving a premium to the most popular sites, as it usually does, so that they avoid being penalized by Panda. It also depends on the ratio of useful content.
Pages considered of lower quality are now visited less often by crawlers (according to Matt Cutts). Looking at your server logs can therefore give you information about this.
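Crawl frequency per URL can be extracted from those server logs. A minimal sketch, assuming the Apache/Nginx combined log format and matching on the "Googlebot" user-agent string; in a real audit you would also verify the client IP by reverse DNS, since the user-agent can be spoofed:

```python
import re
from collections import Counter

# Combined log format:
# IP - - [date] "METHOD /path HTTP/x.x" status size "referer" "user-agent"
LOG_RE = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]*" .*"(?P<agent>[^"]*)"$')

def googlebot_hits(log_lines):
    """Count Googlebot requests per URL path."""
    hits = Counter()
    for line in log_lines:
        m = LOG_RE.search(line)
        if m and "Googlebot" in m.group("agent"):
            hits[m.group("path")] += 1
    return hits

# Hypothetical log lines for illustration
sample = [
    '66.249.66.1 - - [12/Aug/2011:06:25:24 +0200] "GET /article-a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [12/Aug/2011:06:26:01 +0200] "GET /article-b HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.5 - - [12/Aug/2011:06:27:13 +0200] "GET /article-a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]
hits = googlebot_hits(sample)
```

Comparing such counts over time, per section of the site, would show whether some pages are being crawled noticeably less often than others.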
How to modify a site to reverse a Panda penalty
What to do when hit by the Panda update?
"Low-quality content on some parts of a website can impact the whole site’s rankings, and thus removing low quality pages, merging or improving the content of individual shallow pages into more useful pages, or moving low quality pages to a different domain could eventually help the rankings of your higher-quality content."
But all experts agree, and experience confirms, that it is not possible to cancel the penalty without changing the content of existing pages and adding new content.
Merging two pages of banal content just produces a bigger page of banal content; it solves nothing.
When a site is pandalized, it is not only the poor content that is downgraded but the whole site. Deleting thin pages therefore does not fix the issue; on the contrary, you lose visits (except for tag pages or content duplicated from other sites).
So a penalized site cannot be restored without adding new content built on new foundations, almost as if creating a new site.
- Watch the exit rate in Analytics or another statistics tool. Pages with a high exit rate penalize the site.
- For pages whose value to the user the engine cannot see, enrich the content rather than deleting them. But leave them alone if they have a lot of backlinks, as changing those can trigger other penalties.
- Make sure your content offers something useful and unique (search for similar content to check). Always ask what your page offers that others do not.
- Personalize the content; use your own words. Bloggers, remember your school essays: the teacher did not ask you to copy the subject or someone else's answer, but to give your own ideas.
- An external link to a quality page on another site should be accompanied by a critical, personal opinion. Use different external links on each page.
- If the site has no comment system, add a personal analysis that attempts a perspective different from that of the article.
- Take care of the user experience: the desire to see other pages and to return to the site.
- Again, do not modify pages with a lot of backlinks.
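One concrete audit suggested by the criteria above is to look for pages whose titles differ by only one or two words, since these are treated as near-duplicates. A minimal sketch, assuming you already have the list of titles; the threshold of two differing words and the sample titles are illustrative assumptions:

```python
from itertools import combinations

def title_word_diff(t1, t2):
    """Number of words appearing in one title but not the other."""
    w1, w2 = set(t1.lower().split()), set(t2.lower().split())
    return len(w1 ^ w2)  # symmetric difference

def near_duplicate_titles(titles, max_diff=2):
    """Flag pairs of titles that differ by at most max_diff words."""
    return [(a, b) for a, b in combinations(titles, 2)
            if title_word_diff(a, b) <= max_diff]

# Hypothetical page titles for illustration
titles = [
    "Cheap hotels in Paris",
    "Cheap hotels in London",
    "Understanding the Panda algorithm",
]
flagged = near_duplicate_titles(titles)
```

Flagged pairs are candidates for merging into a single richer page or for rewriting with genuinely distinct content, in line with the advice above.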
Know that changing existing content alone will not suffice to undo the effects of the penalty; it is above all new, unique content that will, together with diversification of the content.
This will require much work, but you can console yourself by thinking of the content farms that have millions of pages to modify...
The most important new fact, which makes the results incomprehensible to webmasters and which has been officially confirmed by Google, is that if part of a site is considered of poor quality, the whole site is penalized. Pages of very good quality can thus end up positioned in the SERPs behind lower-quality pages from other sites!
It is hard to accept. In fact, Google has always misrepresented Panda, because its view of a quality site has little to do with the user's. For Google, quantities of pages repeating the same content are useless: the index must be limited, but to decide which pages remain, all kinds of signals of dubious effectiveness are used, which often ends up promoting spam sites at the expense of the original authors.
- Quality of a website. List of quality criteria from Google.
- Evolution of the Google algorithm since the origin. Updates before and after Panda.
- Panda update, facts and myths. List of misconceptions, often widespread.
- Finding more quality sites. This is the official view of Google for Panda.
- Interview of Google's staff. This interview gives a historical view on the establishment of Panda and the reason that justified it, without providing precise data on the operation of the algorithm.
- Discussion on the Google forum. Reactions, often desperate, from webmasters.
- Lessons learned about the Panda update. This article confirms that a site is penalized as a whole when a part of it is considered poor content, and suggests using the noindex attribute on thin pages (which is not what we recommend).