Panda, facts and myths

The algorithm changes of 11 April 2011 surprised webmasters and generated many theories about what Google actually wants to penalize.

Last level reached

Facts

Google wants to remove pages with poor content from the visible part of the index

Poor content means text with nothing original in it. The algorithm uses signals to judge the quality of a text, based on keywords and user behavior, but it is a program and may be totally wrong.
For example, it does not take into account JavaScript, dynamic content, or the educational quality of a text, which may be conventional yet easier to understand than others.

A site is penalized as a whole if some pages are considered to have poor content

This has been officially confirmed. The penalty can be mild or severe depending on the proportion of low-content pages.
Thus pages that were at the top of the results may lose several positions and appear behind links to pages of lesser interest, or even off topic, simply because the site is penalized.
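
To illustrate how a site-wide penalty could depend on the proportion of weak pages, here is a minimal Python sketch. The function name, the 100-word threshold, and the penalty levels are assumptions chosen for the example; Google has not published how Panda actually computes this.

```python
def site_penalty(word_counts, min_words=100):
    """Return a hypothetical penalty level from the share of thin pages.

    word_counts: number of words of original text on each page.
    min_words: assumed threshold under which a page counts as "poor".
    """
    if not word_counts:
        return "none"
    poor = sum(1 for n in word_counts if n < min_words)
    ratio = poor / len(word_counts)
    # Assumed cut-offs, purely illustrative.
    if ratio < 0.1:
        return "none"
    if ratio < 0.4:
        return "mild"
    return "severe"
```

With these assumed cut-offs, a site with one thin page out of four gets a "mild" penalty, and all of its pages are affected, including the three good ones.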

A group of Google employees defined what makes a quality site

The Panda algorithm analyzes a site based on criteria defined by a group of employees who were presented with a set of sites and decided which were of good quality and which were not.
Sites are then penalized when they deviate from this preset standard, regardless of other criteria.

The penalty resets a counter and is almost irreversible

It is not individual pages that are penalized: a negative score is assigned to the whole site. So removing the poor content does not cancel the penalty, and it even reduces the traffic coming from Google and other sources.
To cancel it, add new pages with rich content to improve the site's score, and possibly improve the poor pages as well, although that will only take effect later.

Things are different when the site has duplicate content, because it triggers a different penalty.

Note that "rich" and "poor content" are nothing more than Google's ratings. Objectively, "poor content" is whatever gets a site pandalized. (See the article on the Panda algorithm.)

Poor quality glass

Google knows that it unfairly penalizes sites

Google recommends: "combine shallow pages to make a more useful content." Spam cannot be combined into useful content, so this advice is clearly directed at genuine webmasters.

Panda has one purpose: to fight spam. To prevent spammers from progressing by trial and error, it penalizes the entire site, which makes it difficult to know which part is targeted, and it does not lift the penalty when the content considered poor is deleted.

By penalizing an entire site for part of its content, Google also knows that it may penalize quality content.

A site can be penalized when its content is copied

A side effect is that sites whose content is often copied were affected by this change, Google often being unable to tell the original from the copy. This shocked webmasters.

Normally, the algorithm should penalize sites that mirror content from other sites. But it often confuses the two and takes the original for the copy.
This happened even to popular sites like cultofmac.com.

A site can be penalized for an earlier reason

A site may have lost much of its traffic with the Panda update for a reason that has nothing to do with content quality, as confirmed by Matt Cutts.
The site had already received a negative signal, for example for linking to a link farm or using text link ads, and had been given a negative score with no effect on traffic. The effect appeared on April 11, when this score was combined with other unfavorable signals.

Panda is not a change to the algorithm but a separate program

It is a program that is run at regular intervals and evaluates sites on a different basis, trying in particular to determine the usefulness of pages and their interest to the user.
It gains experience, so sites can be penalized more heavily with each pass.

Google filed a patent related to Panda

Patent 8,190,537, filed October 31, 2008, describes how, starting from the characteristics of a set of pages, the same features can be recognized in other pages, which are then classified with the first set. This is what Panda does: by finding the characteristics of sites without useful content in a new site, it deduces that the new site has no useful content either.
If the pages of a site are similar to those of sites without any meaningful content, it will be penalized even if its content is useful.
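
The idea of recognizing, in a new site, features already seen in sites judged useless can be sketched with a very simple nearest-centroid classifier. This is only an illustration of the general technique: the three features (text length, ad ratio, share of duplicated text), the sample values, and all names are hypothetical and are not taken from the patent.

```python
import math

def centroid(samples):
    """Average each feature over a list of equal-length feature vectors."""
    return [sum(col) / len(col) for col in zip(*samples)]

def classify(features, good_samples, bad_samples):
    """Label a site by whichever labeled group's centroid is closer."""
    d_good = math.dist(features, centroid(good_samples))
    d_bad = math.dist(features, centroid(bad_samples))
    return "useful" if d_good <= d_bad else "not useful"

# Hypothetical, normalized features: [text length, ad ratio, duplicated share]
good = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
bad = [[0.2, 0.7, 0.8], [0.1, 0.9, 0.6]]

# A site that merely looks like the bad sample is labeled the same way,
# whatever the real value of its content.
print(classify([0.3, 0.6, 0.7], good, bad))
```

The classifier only sees the measured features, which is why a useful site that resembles the rejected ones on those features ends up in the same group.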

It goes without saying that if all the pages of a site are built on the same template, the site is more likely to be affected by Panda.

Panda has reduced spam but not improved results

Panda has indeed removed some of the spam, but only some, because the first results pages are still occupied by commercial pages that do not necessarily match what the user wants. For example, the query "free software" often leads to "paid software with free trial."
In general, results have not improved, and it is still difficult to find what we are looking for when it cannot be reduced to just a few keywords.

Myths

Myth: A site is penalized because it displays too much advertising

The number of ads on a page is not taken into account by the search engine's algorithm. Besides, Google's own Adsense service allows 6 units to be displayed on one page.
Google never penalizes a site simply because it displays too many ads ... except that ads may indirectly increase the bounce rate, which is a negative factor.
A page may be penalized because it has too little content (perhaps a single sentence) and is filled almost exclusively with advertising. There is a difference between 3 ads that cover 90% of a page, above the fold or not, and 3 ads that cover 10% of a page.
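
The distinction can be made concrete by computing the share of the page area occupied by ads. The pixel dimensions below are made-up values (300x250 is a common ad format), and nothing indicates that Google measures it this way.

```python
def ad_coverage(page_area, ad_areas):
    """Fraction of the page area taken by ads (areas in square pixels)."""
    return sum(ad_areas) / page_area

ads = [300 * 250] * 3                       # three 300x250 units
short_page = ad_coverage(1000 * 250, ads)   # 0.9: almost nothing but ads
long_page = ad_coverage(1000 * 2250, ads)   # 0.1: ads are a small part
```

The same three ads give very different ratios, which is the point: what matters is how little content surrounds them, not how many there are.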

Myth: Panda was made to target content farms

Several needs were combined in this new algorithm. Perhaps the firm was annoyed by the arrogance of companies like Demand Media, which appeared too often in the media (but have not been heard from since Panda), and by the joke about the yacht named Adsense, but the update has a more general, long-term purpose.
According to Google's own announcement, the first Panda release noticeably affected about 12% of queries, so millions of sites were concerned.
It is likely that content farms served as a sample, since several versions of Panda were launched until they had all been caught.

The future of search engines is not compatible with pages without original content. One can expect Google to reduce its sources of information, or to substitute itself for them.

Myth: Panda selects pages of quality

This myth is deliberately maintained by Google with its advice on quality pages. The tips are worth following, for the sake of users, but are far from determining the rankings.

Any search on the engine is enough to contradict it. Panda tends to favor sites whose content is original and to disadvantage sites whose content lacks originality, but the notion of quality remains to be defined.
One means used to detect originality is an analysis tool similar to Translate, which can extract the raw information of a page regardless of how it is worded.
A score is then assigned to the whole site, so that poor pages on a site with a good score can outrank better content from sites with a lower score.
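
A small sketch shows how a site-wide score can let a weaker page outrank a better one. The formula (page quality multiplied by site score) and every number are assumptions made for the example, not Google's actual ranking function.

```python
def rank(pages):
    """Sort pages by page quality weighted by their site's score, best first.

    pages: list of (url, page_quality, site_score) tuples, values in 0..1.
    """
    return sorted(pages, key=lambda p: p[1] * p[2], reverse=True)

pages = [
    ("trusted-site.example/poor-page", 0.5, 0.9),     # weighted: about 0.45
    ("penalized-site.example/great-page", 0.8, 0.4),  # weighted: about 0.32
]
ordered = [url for url, _, _ in rank(pages)]
```

Here the mediocre page comes first because its site score more than compensates for its lower page quality.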

Myth: A Gmail account can penalize a site

If you publish a newsletter via Gmail and a significant number of subscribers do not open the messages when they are received, or report them as spam, the newsletter's site receives a penalty.
That is what the webmaster of lockergnome.com said.
He deleted the subscriptions and made a successful reconsideration request. But this is vehemently denied by Matt Cutts: a reconsideration request is not limited to what the webmaster mentions, so it may lift a penalty for something different!

Documentation

Guilty Panda. What a panda caught in the act looks like

Cult Of Mac, once pandalyzed because its content was copied, was added to a white list