Developments noted or announced on the Google algorithm and its means of ranking pages.
We follow the evolution of Google's algorithm and its consequences for webmasters.
The evolution of the SERPs and of the interface is detailed in another article.
January 19, 2012. Content visible above the fold is now a ranking factor
As announced in November last year, pages that show advertising first, with the actual content visible only after scrolling, will be penalized.
It affects 1% of searches.
Users complained that they had to scroll down the page and skip past ads to find the content that answers their query.
But how do you determine what is "above the fold", since it depends on the screen resolution? On a mobile device, it differs depending on whether the device is held in portrait or landscape mode. The change is intended to affect pages such as those that show two 280-pixel-high ads side by side. In fact, Google provides a statistical measure of the page height visible without scrolling, with its Browser Size tool. 550 pixels is an acceptable value.
Is the size of the header part of the equation? It would be, if the header is not counted as content.
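To make the question concrete, here is a rough sketch of how one might estimate the share of the area above the fold that is taken up by ads. It is not Google's method, which is not public. The 550-pixel fold comes from the text above; the 1024-pixel viewport width, the 336-pixel ad width and the 100-pixel header are assumptions chosen for illustration, and ads are assumed not to overlap.

```python
def ad_share_above_fold(ads, fold_height=550, viewport_width=1024):
    """Return the fraction of the above-the-fold area covered by ads.

    ads is a list of (top, width, height) rectangles in pixels, where top is
    the distance from the top of the page. Ads are assumed not to overlap.
    fold_height and viewport_width are illustrative assumptions.
    """
    visible_area = 0
    for top, width, height in ads:
        # Keep only the part of the ad that lies between 0 and the fold.
        visible_height = max(0, min(top + height, fold_height) - max(top, 0))
        visible_area += min(width, viewport_width) * visible_height
    return visible_area / (fold_height * viewport_width)


# Two 280-pixel-high ads side by side, placed under a 100-pixel header.
two_ads = [(100, 336, 280), (100, 336, 280)]
print(round(ad_share_above_fold(two_ads), 3))
```

With these assumed sizes, the two ads cover about a third of the first screen. An ad placed entirely below the fold contributes nothing.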
Browser Size. This Google tool provides a measure of what is "above the fold" for a web page.
Announcement in Inside Search.
December 2, 2011. Detection of parked domains.
These are domains with little or no content of their own, whose home page is filled with advertising. A new algorithm is added to detect them and exclude them from results. This is one of about a dozen measures announced for November, which also include changes on the freshness of content, to promote the most recent pages.
In November.
November 14, 2011. Bonus for official sites.
Sites related to a product or a person, when identified as official sites (made by the brand owner itself), now receive preferential treatment in ranking, following a modification of the algorithm announced on November 14, 2011.
November 10, 2011. Too many ads on a page: a direct ranking factor in the algorithm.
At the 2011 PubCon, Matt Cutts said that having too many ads on a page was going to be a (negative) direct ranking factor.
This has always been an indirect factor, insofar as it can encourage visitors to leave the site, which increases the bounce rate and reduces the time per visit. But it will now be taken into account directly.
This also confirms that this was not a criterion in Panda.
Note that "too many ads" depends on the size of the page, and he also said that the placement of ads above the fold is taken into account.
November 3, 2011. New ranking based on the freshness of pages.
A change in the algorithm affects 35% of queries on the search engine. It concerns the freshness of pages, which can be promoted depending on the search context.
It applies to recent events and hot topics, to topics that come up regularly in the news (e.g. the F1 Grand Prix), and to subjects that are continually updated without being news (e.g. a piece of software).
Other topics such as cooking recipes should not be affected by this change.
August 29, 2011. Better recognition of scrapers.
Sites that duplicate the pages of other sites verbatim in order to display advertisements should be better identified. They are sometimes better positioned in search results pages than the originals!
Google is testing a new algorithm and asks users to report such sites to help develop it.
Report a scraper.
This is not for copyright infringement but for sites that use a tool to extract content and put it in their own pages.
August 12, 2011. Panda in all languages.
Apart from Chinese, Korean and Japanese, Panda now applies to the whole world. The impact is between 6 and 9% in each language.
Panda. At the same time, Google changed the way Analytics calculates the bounce rate.
June 20, 2011. Post-Panda recovery.
Since June 15, some sites have recovered from the Panda penalty after being modified to remove duplicate content.
This seems to concern mainly sites hit because of duplicate content. Sites that were victims of scrapers have recovered, and the scrapers are now often removed from the SERPs.
June 8, 2011. The author attribute.
Several attribution tags to place in the body of the page are now recognized by Google:
<a rel="author" href="profile.html">Myself </a>
<a rel="me" href="profile.html">Myself </a>
This will help to classify pages by author.
The profile page so designated must be on the same site as the page that contains this attribute.
More information.
April 11, 2011. Panda action extended to the World.
The Panda action against poor-quality content is now rolled out worldwide.
But it targets only English-language queries (on local versions of the search engine).
Google is also starting to take into account the fact that some sites are blocked by users. This is one more criterion, but a minor one.
Big sites like eHow were newly affected by the update, and so were many much smaller sites. There are also indirect effects: links from the affected sites are devalued, which in turn affects other sites that were not directly targeted.
Understanding Panda Update
Update March 3, 2011. Important change against content farms: Panda Update.
Called "Panda" internally (from the name of an engineer), this action impacted 11.8% of queries by reducing the presence in results pages of poor content that is not original or not very useful. Conversely, sites that provide detailed articles resulting from original research will be favored.
"We want to encourage a healthy ecosystem..." Google said.
Google says that these changes do not come from the new Chrome extension that allows you to block sites. But a comparison with the data collected shows that 84% of the sites concerned are on the list of blocked sites.
The effects appear today only in the U.S.A. The rest of the world will follow. One main result will be an increase in AdSense revenue for other sites, because these content farms are mostly intended to display advertisements.
It remains to be seen how content farms will be affected, on Alexa or Google Trends, and whether this is a "Farmer Day".
Finding more quality sites.
February 24, 2011.
List of sites penalized by the change.
Interview of Google's staff.
January 28, 2011. Change against copied content.
To fight against sites that take content from other sites or whose content has no originality, a change was made to the algorithm earlier this week, on January 24.
This only affects 2% of queries but according to Matt Cutts, that's enough for you to experience a change in the positioning (in the case of Scriptol, the audience grew by 10%).
It is a further improvement affecting the long tail. It can affect content farms that mass-produce articles, which are necessarily not original.
Announcement by Matt Cutts.
How organized spam works. By SEOMoz.
January 21, 2011. New ranking formula.
The new algorithm is more effective at detecting spam in page content, which takes the form of repeated words, with the obvious intention of ranking for those words.
Such spam can be found in an article or in blog comments.
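As an illustration of what "repetition of words" can mean in practice, here is a minimal sketch that measures how much of a text is taken up by each word. It is not Google's algorithm, which is not public. The function names, the 4-letter minimum and the 30% threshold are arbitrary assumptions, and very short texts will trigger false positives.

```python
import re
from collections import Counter


def keyword_density(text, min_length=4):
    """Return the share of the text taken up by each word.

    Words shorter than min_length (an arbitrary assumption) are ignored,
    which crudely filters out articles and prepositions.
    """
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if len(w) >= min_length]
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def looks_stuffed(text, threshold=0.3):
    """Flag a text when a single word reaches the (assumed) threshold."""
    return any(share >= threshold for share in keyword_density(text).values())


comment = "cheap shoes cheap shoes buy cheap shoes online cheap shoes"
print(looks_stuffed(comment))
```

A real detector would need many more signals, such as page length, word position and language-specific stop words. This sketch only shows the basic counting idea.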
See link below.
January 21, 2011. Algorithm better than ever against spam.
So says Google in a letter responding to criticism of the quality of its search engine, particularly in the fight against spam.
Google says that displaying AdSense ads does not prevent a site without any meaningful content from being downgraded, nor does participating in the AdWords program.
In 2010, the algorithm underwent two major changes against spam. We recall the change that affected the long tail, at the expense of sites with no content.
Google wants to go further in 2011 and invites webmasters to give their opinion. The main target is "content farms", which provide pages of no interest, filled with keywords, to position themselves in the results.
Some sites such as Demand Media (eHow, Answerbag), Associated Content and Suite101 could match the definition of a content farm: lots of pages produced each day, of little or no interest, targeting the most searched keywords.
The algorithm will be enhanced to recognize content that is copied or has no originality.
Google search and search engine spam.
Give your opinion.
Content farms. Exact definition of what is a content farm and the list.
December 2, 2010. Sentiment analysis added to the algorithm.
Following an article in the New York Times denouncing the fact that a merchant who dissatisfies its customers and generates many complaints on blogs and forums gains an advantage with search engines, Google reacted.
Indeed, when we denounce the practices or content of a site, we link to it to provide examples, and these backlinks are treated as a sign of popularity by search engines, which translates into a better ranking in the results!
Google therefore developed a sentiment analysis algorithm, which aims to recognize whether the text surrounding a link is positive or negative towards it, depending on the keywords it contains, in order to penalize sites that draw complaints.
Google also advises using the nofollow attribute to link to a site without contributing to its positioning.
Being bad to your customers is bad for business.
Large-Scale Sentiment Analysis for News and Blogs. Analysis in English of the algorithm.
November 5, 2010. Black Friday.
Since October 21 or 22, depending on the region, a modification of the ranking algorithm has affected a lot of sites, some losing up to 80% of their traffic. Alexa has published graphs showing huge losses for some sites and equivalent gains for others.
These changes seem permanent.
The purpose of these changes appears to be to improve the relevance of results.
"You are not alone", Alexa blog.
August 31, 2010. SVG indexed.
SVG content is now indexed, whether it is in a separate file to include or embedded in the HTML code.
List of file formats supported by Google.
August 20, 2010. Harmful internationalization?
Some webmasters have seen increased traffic from Google search engines other than Google.com or that of their own country.
So American sites can see visitors arriving from other ccTLD engines, such as google.co.uk or google.fr, implying that the engines of other ccTLDs include U.S. sites in their results.
This could reduce the audience for the sites in these countries.
June 8, 2010. Caffeine builds a fresher index.
Google announced on June 8 that its new indexing engine, Caffeine, is finalized. It offers a new index with 50% fresher results. Its operation differs from the previous system, which updated the index as a whole, in waves. Caffeine updates the index incrementally. New pages can be added and made available for search as soon as they are discovered.
The new architecture also allows a page to be associated with several countries.
Caffeine vs. previous system.
May 27, 2010. MayDay: The long tail evolves.
As confirmed by Matt Cutts at Google I/O in May, the radical evolution seen in April comes from a change in the algorithm, intended to promote quality content on the long tail.
"This is an algorithmic change in Google, looking for higher quality sites to surface for long tail queries. It went through vigorous testing and isn’t going to be rolled back."
Recall that the long tail is the set of queries made of multiple keywords, each query being rare, but which together form the bulk of traffic to a site.
Webmasters named this change MayDay. I had previously called it Black Tuesday. It has been disastrous for some well-established sites that do not have enough content in their deep pages. It happened in late April or early May depending on the site, although other sites experienced a loss of traffic for other reasons.
This has boosted traffic on scriptol.com.
Google confirms Mayday impact. By Vanessa Fox, who also says that Caffeine is not live yet.
Matt Cutts explains Mayday in a video. It is not related to Caffeine and is permanent. Webmasters must add content to their pages to regain the lost traffic.
April 27, 2010. Black Tuesday: Ranking changes on the long tail.
The long tail is the set of pages on a site that each get few visits but together account for a large share of traffic.
Queries on multiple keywords make up the long tail.
Many sites have seen a change in traffic of these pages since April 27. Some have lost up to 90% of their traffic.
They attributed this change to Caffeine, Google's new infrastructure, which indexes more pages and creates more competition, but Google later confirmed that it is a change in the algorithm (see May 27).
April 9, 2010. Site speed.
It is officially a ranking factor. Announced a few months ago, it has become reality: a site that is too slow is now downgraded in the SERPs, or at least may be, in conjunction with other factors.
"Today we're including a new signal in our search ranking algorithms: site speed."
You can find out whether your site is too slow in Google Webmaster Tools (Labs -> Site performance).
Using site speed in web search ranking.
November 19, 2009. The speed of a site will be a ranking factor in 2010.
This is what Matt Cutts has just said in an interview.
"Historically, we haven't had to use it in our search rankings, but a lot of people within Google think that the web should be fast.
It should be a good experience, and so it's sort of fair to say that if you're a fast site, maybe you should get a little bit of a bonus. If you really have an awfully slow site, then maybe users don't want that as much.
I think a lot of people in 2010 are going to be thinking more about 'how do I have my site be fast, how do I have it be rich without writing a bunch of custom javascript?'"
This should favor static websites with no SQL. See our article, How to build a CMS without database.
See also Let's make the Web faster.
The interview.
2009
According to Google, 540 improvements were made to the search engine in 2009.
December 15, 2009. Canonical Cross-Domain.
Support for the rel="canonical" attribute, which was introduced some months ago to avoid duplicate content between pages within a site, has been extended to similar pages on different domain names.
It is still preferable to use 301 redirects when you migrate a site to another domain.
Source Google.
To protect your site against other sites that might copy your content without permission, see how to build a generic canonical tag in PHP.
August 11, 2009. New search engine, Caffeine.
Google is trying a new search engine that is intended to be faster and to provide more relevant results.
July 2, 2009. Less weight for irrelevant backlinks.
This is not officially confirmed by Google (which says little about its algorithm in any case), but webmasters believe that the results have changed and that positions in the SERPs are being lost because they relied on large numbers of low-quality backlinks.
What are irrelevant links? They are:
- Blogrolls.
- Backlinks from social sites.
- Links from directories.
- Backlinks in the footers of partner sites.
- Links included in CMS templates.
In fact, Google recently announced that it would no longer take blogrolls into account. This is undoubtedly the result. And it is not just a loss of weight for these links: they are no longer taken into account at all!
With regard to social sites (such as Delicious, Stumbleupon), in contrast, Google said in a roundtable with webmasters: "They are addressed as other sites".
June 19, 2009. Flash resources indexed.
The crawler was already able to index Flash applications. It can now also index the images and text used by these applications. Source: Webmaster Central Blog.
June 2, 2009. New effect of the nofollow attribute - Onclick links.
The nofollow attribute tells crawlers to ignore a link in a page. The PR was therefore distributed among the remaining links.
It now appears that the PR is first divided among all the links (with or without nofollow), and the share of the nofollowed links is then simply not passed on.
Example: you have 10 PR points and 5 links, so 2 points are awarded to each. If two links are nofollow, no PR is passed through them, but the others do not receive more points: the 3 remaining links share only 6 points.
The consequences are dramatic: links in blog comments result in a loss of PR for the other links.
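The arithmetic of the example can be written as a short sketch. It only reproduces the simplified point model used above, not the real PageRank computation. The function name and the evaporate flag are hypothetical names introduced for illustration.

```python
def pagerank_per_link(points, total_links, nofollow_links, evaporate=True):
    """Return (points passed by each followed link, points lost).

    evaporate=False models the old assumption: the points are shared only
    among followed links. evaporate=True models the behaviour described
    above: the points are shared among all links, and the share of the
    nofollow links is simply lost.
    """
    followed = total_links - nofollow_links
    if followed <= 0:
        # No followed link at all: nothing is passed on.
        return 0.0, float(points)
    if evaporate:
        share = points / total_links
        lost = share * nofollow_links
    else:
        share = points / followed
        lost = 0.0
    return share, lost


# 10 points, 5 links, 2 of them nofollow.
print(pagerank_per_link(10, 5, 2, evaporate=False))  # old assumption
print(pagerank_per_link(10, 5, 2, evaporate=True))   # new behaviour
```

Under the old assumption, each of the 3 followed links passes about 3.33 points. Under the new behaviour each passes 2 points and 4 points are lost, which matches the 6 points shared by 3 links in the example.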
Quoting Matt Cutts:
"Suppose you have 10 links and 5 of them are nofollowed. There’s this assumption that the other 5 links get ALL that PageRank and that may not be as true anymore."
More on PageRank and nofollow.
Also, Google takes into account links assigned in the onclick event.
April 4, 2009. Local Search.
Google improves local search, based on the IP address, which allows it to find the country and city of a visitor. From this, Google tries to show in the results the sites that are located as close as possible.
Taking advantage of this option requires that the search include a place name, in which case a map is displayed.
Google's Blog.
February 26, 2009. Brand names.
The algorithm now gives more weight to brand names and therefore promotes the related sites. This is confirmed by Matt Cutts (head of the webspam team and a spokesman for Google) in a video.
The video.
February 25, 2009. The canonical tag.
A new tag tells crawlers which URL they should index when a page is accessible at multiple addresses.
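As a minimal sketch of how such a tag could be generated, the function below builds a canonical link element from any address of a page. The function name and the example.com URL are hypothetical, and the sketch assumes that the query string and the fragment do not change the content of the page, which is not true of every site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url):
    """Build a <link rel="canonical"> tag for a page URL.

    Assumption: the query string and fragment only carry tracking or
    session data, so they are dropped to obtain the preferred address.
    """
    parts = urlsplit(url)
    canonical = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", "", "")
    )
    return '<link rel="canonical" href="%s">' % escape(canonical, quote=True)


print(canonical_link_tag("http://www.Example.com/page.html?sessionid=42#top"))
```

The resulting tag is placed in the head section of every variant of the page, so that all the addresses point to the same preferred URL.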
The duplicate content problem solved.
July 16, 2008.
Google is introducing, on an experimental basis, some features of Wikia's search engine. Users can mark results as good or as spam.
The engine takes this into account, but only for the user who gave the rating. For now ...
July 2008.
Google announces that it has one trillion URLs of Web pages in its database.
Not all of these pages are indexed.
June 2008. Nofollow and PageRank.
Nofollow links do not count for the transmission of PageRank, but their PR is not redistributed to the normal links.
So the PR passed to linked pages is first divided by the total number of links, and the share of the nofollow links then evaporates.
Source: PageRank Sculpting.
October 19, 2005. Jagger Update
This update gives more weight to the relevance of links. Important sites appear to be favored.
Spam is fought, especially techniques that use CSS to hide content from visitors.
An analysis of the Jagger Update.
May 20, 2005. Bourbon Update
An update to penalize sites with duplicate content, links to irrelevant pages (unrelated to the linking page), reciprocal links in quantity, and many links to a closely related site. It affected many sites, with collateral damage.
2003. Florida Update.
It upset the SERPs. One of the key changes is that the algorithm works differently for different types of queries, and the SERPs are populated with results of different and complementary types. An analysis of the Florida Update.
1998.
The Google search engine appears on the Web.
More information
- How Google works on its algorithm! Video.
- Google algorithm. The original version.
- Google anatomy. Schematic of the infrastructure of the search engine.
- How Google's algorithm has changed over time. Graph showing the theoretical evolution of ranking factors. April 9, 2009.
- State of the Index 2009. All the improvements Google made to the index during 2009, and also the new tools.