Frequently asked questions about search engine algorithms
While the glossary of SEO provides essential terms that give an overview of SEO, this page answers frequently asked questions and problems about optimization of a site for search engines. SEO goes beyond indexing in directories. Many details can avoid losses of rankings, unfair when the site's content is quality.
Questions about Google's PageRank, and ranking in general, and how to gain some points, by natural ways, without to use bad practices as cloaking and spamming and other forbidden artifacts that may lead you to the black list...
- How do I know if my pages are indexed by Google?
- How to exclude a page from the index?
- Why did traffic to my website drop?
- Is duplicate content penalized?
- Is the domain extension important for PageRank?
- My page is not indexed by search engines
- Can I force a Web page to be indexed?
- Where can I get more information about Googlebot?
- How to avoid cloaking?
- What is the bounce rate?
- Is the bounce rate a ranking factor for Google? How can I lower it?
- How can I leave the sandbox?
- What is minus thirty?
- How to visit google.com without being redirected to my country version?
- How to be a trusted site?
- Should we add content frequently?
- What percentage of users click on the first link in search results? Are there statistics about click-through rates?
- How to change domain without losing rankings?
- Is being temporarily unavailable harmful for a website?
- What is the difference between "white hat" and "black hat"?
- Could Google forget to remove a penalty that no longer has a cause?
- Are spelling and grammar taken into account by search engines?
- How does Google evaluate the originality of content?
- Why are big sites favored?
- Why do forums get preferential treatment?
- Is it really useful to provide a sitemap to Google?
- Is robots.txt helpful? How does Google use it?
- Are RSS feeds useful for SEO?
- Is the meta description used by Google?
- Should I fill in the meta keywords?
- Why does the link command on Google give only a few backlinks?
- Do Twitter and Facebook have an effect on ranking?
- How to get a foreign version of the search engine?
- How to improve the SEO of my site?
- How many keywords can I put into a URL?
- How can we overshoot Wikipedia?
- Can I modify the snippets?
- Is compliance with W3C standards, i.e. well-formed HTML code, important for the ranking?
- Can changing the design affect the ranking?
- Is dividing an article into multiple pages bad for SEO?
- How to optimize the title of a page for a better ranking?
- Why is a page penalized?
- What is over-optimization of a website?
Links and backlinks
- Are internal links helpful?
- Do social bookmark links carry less weight than other backlinks?
- Are nofollow links followed by crawlers?
- How many links can I put into a page?
- Are several links on a page to the same page useful?
- Can internal links cause a penalty?
Questions about the PageRank
- How to know the PR of a page?
- Why does the link: operator from Google return only a few backlinks?
- What is PageRank?
- Is PageRank important?
- Is PageRank used against duplicate content?
- How is PageRank calculated by Google?
- Is PageRank transmitted through a link to an image?
- Why is a site ranked above another that has a higher PR?
- What is cloaking?
- What is spamming?
- What is spoofing?
- How to know my PageRank?
- A company guarantees me a 10-point PR
- Is the PageRank the first factor for the position?
- What does a gray PR bar mean? Is this a penalty?
- How to improve my PageRank?
- Other factors for the position in results.
- Does a 301 redirect mean a loss of PageRank?
- When is the PageRank updated?
How do I know if my pages are indexed by Google?
If your site is called "www.scriptol.com" for example (this name is already taken), type this in the search window:
site:www.scriptol.com
Google will display your indexed pages, allowing you to check the title and description of each of them.
How to exclude a page from the index?
Insert a meta tag within the <head> </head> section of the HTML page:
<meta name="robots" content="noindex" />
A robots.txt at the root of the site may also contain rules to search engines for excluding files or directories.
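As an illustration, a minimal robots.txt (the paths shown are invented examples) that excludes a directory and a single file for all crawlers:

```
User-agent: *
Disallow: /private/
Disallow: /old-draft.html
```

Pages blocked this way are not crawled; the meta robots tag remains the surest way to keep an already known page out of the index.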
Is duplicate content penalized?
Duplicate content is the presence of the same content on several pages of the same site or on different sites, or content indexed twice. This can happen with different URLs pointing to the same page, or with copies of pages. It could be a way for a site to try to monopolize the top of result pages, but this never happens in the real world, so we can conclude that engines effectively penalize duplicate content.
In a post on its blog, Google has clarified the rules about duplicate content.
Duplicate content can also mean incorporating a portion of an article from another site into your own site. This is a sure penalty factor, unless it is a quotation placed in a <blockquote> tag. Quotations must be accompanied by a personal text.
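As a sketch of a safe quotation (the URL and the quoted text are invented), the citation is wrapped in a <blockquote> tag and surrounded by personal commentary:

```html
<p>Our own commentary introducing the citation...</p>
<blockquote cite="https://example.com/original-article">
The quoted passage taken from the other site.
</blockquote>
<p>...followed by personal text discussing it.</p>
```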
Are RSS feeds useful for SEO?
It is a way to get visitors and a number of backlinks. The RSS file contains a list of links to your articles, and it can be replicated on other sites as well as in directories. To find out how to easily create an RSS file, and how to use it, consult the RSS section on this site.
The backlinks provided by RSS feeds that are echoed by many sites are temporary: they will disappear with the renewal of the content of the feed. Therefore RSS is best suited for blogs.
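As an illustration, a minimal RSS 2.0 file with a single item (titles and URLs are invented):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example site</title>
    <link>https://example.com/</link>
    <description>Latest articles</description>
    <item>
      <title>A recent article</title>
      <link>https://example.com/recent-article.html</link>
    </item>
  </channel>
</rss>
```

Each item's link is a potential temporary backlink when the feed is republished elsewhere.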
Is the meta description used by Google?
The answer is given by Google on its blog for webmasters, in the article "Improve snippets with a meta description makeover".
Snippets are the descriptions in search results under the titles.
The meta description must be unique and must give details about the page. It should contain keywords related to its content.
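For example, a hypothetical meta description tag (the wording is invented) placed in the <head> of the page:

```html
<meta name="description" content="Answers to frequently asked questions about indexing, ranking and PageRank in search engines.">
```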
Should I fill in the meta keywords?
The meta keywords tag is not used by Google. It may be used by other search engines. Some webmasters report a successful experiment with the meta keywords tag and Yahoo.
If you need additional traffic from Yahoo, fill in the meta keywords tag.
Why does the link command of Google give only a few backlinks?
The link operator in the search bar (link:site-name) is a command to display the links pointing to a site. In fact, this command provides only a fraction of the backlinks, in order to save server bandwidth.
The choice of results is totally random; this was confirmed by Matt Cutts in a video on YouTube. They have nothing to do with PR or with the quality of the pages: they are taken randomly.
Do Twitter and Facebook have an effect on ranking?
The answer is officially yes. This was confirmed by Google and Bing in an interview with a journalist. The number of times a page is retweeted, or linked on Facebook, affects its position in the results, even if the links themselves are nofollow.
We must therefore add another criterion to Google's many signals: social authority.
How to get a foreign version of the search engine?
To avoid being automatically redirected to the local version of the search engine, a language parameter must be added. For example:
www.google.com/?hl=fr
for the French version of Google.
How to optimize the title of a page for a better ranking?
An optimal title must tempt users to click on the link to read the page, contain essential keywords corresponding to the content, and have a length of about 60 characters.
More details: Creating a good web page title.
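As a sketch, a hypothetical title of about 60 characters combining the essential keywords with an incentive to click:

```html
<title>Search Engine FAQ: Indexing, Ranking and PageRank Explained</title>
```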
Are internal links helpful?
Internal links, mainly on the home page, facilitate the indexing of the pages, and also tend to spread the PageRank from one page to another. Put as many internal links in the content of the pages as possible, whenever a term refers to the content of another page, of course.
The anchor of the link must be descriptive; it helps search engines define the content of the target page and therefore favors its rank.
Several links to the same page may even be added, as explained further below.
Do social bookmark links carry less weight than other backlinks?
For Matt Cutts (see the interview in the references at the bottom), a link is a link. So links gained from social bookmark sites have the same weight as other links in regular web pages.
But the weight of a link depends upon the PageRank of the page where it is added.
Is the domain extension important for PageRank?
No, the extension may be .com, .edu or .org; this has no importance, only the PageRank of the page is important for backlinks. Links from these sites are not more trusted and do not pass more PageRank.
Reference in interview.
Are nofollow links followed by crawlers?
It is sometimes claimed that even if nofollowed links do not pass PageRank, they are used for the discovery of new pages. This is denied by Google:
- Nofollow links do not pass PageRank.
- They are not used to discover new pages.
- The anchor is not used to define the content of the linked page.
They are totally ignored.
Reference in interview.
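For reference, a nofollow link is an ordinary anchor carrying the rel="nofollow" attribute (URL and anchor text are invented):

```html
<a href="https://example.com/page.html" rel="nofollow">example anchor</a>
```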
Are several links on a page to the same page useful?
When multiple links point to the same page, only the first is taken into account by Google. But this is not the case if the links point to different sections of the page, identified by a fragment in the #xxxxxx format.
In this case, the anchor of each link is considered when indexing the target page, whether it links to another site or to the same site.
It even appears that the first link, the one to the page itself and not to a section, is ignored.
Tests have been made by SEOmoz to verify this. But there were changes in Google's algorithm in April 2012 regarding anchors of links, and this could have changed.
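For illustration, two links pointing to different sections of the same page, each with its own descriptive anchor (the file name and fragments are invented):

```html
<a href="article.html#installation">installation instructions</a>
<a href="article.html#troubleshooting">troubleshooting tips</a>
```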
Can internal links cause a penalty?
Especially if they all have the same anchor...
No, internal links are not taken into account by the algorithms that are likely to penalize a site, because they are a way to navigate the site and are therefore considered necessary.
Except in the very special case where a page contains quantities of identical links, the algorithms see internal links as a navigation tool and not a means to transmit PageRank. You do not need to worry about your menus, sidebars, internal references, or their anchors. (Dixit Mister G.)
How many keywords can I put into a URL?
In the directory plus filename, you can put up to 5 keywords with no problem. Beyond that, your URL looks like spam and the algorithm gives these words less weight. You may even get a spam report with lots of keywords in URLs (Matt Cutts in references).
How many links can I put into a page?
The guidelines recommend putting fewer than 100 links. You can exceed this number; technically there is no problem, as Google can parse a page up to 500 KB, but it is bad practice and it is better to split the page into smaller ones.
Can I force a Web page to be indexed?
If robots do not come frequently enough to your site (the date of the last visit is indicated on the home page of Webmaster Tools), you can still force the indexing by getting a link to the page on another site that is frequently crawled.
See the article How to obtain backlinks and similar articles on this site for details.
How to improve the SEO of my site?
Where can I get more information about Googlebot?
Googlebot is Google's crawler. It may parse some pages on your site every day. This Googlebot FAQ gives details on how it works.
How to avoid cloaking?
Cloaking is presenting to search engines text that is not visible to visitors. It may be unintentional, when you add text that is unnecessary to visitors in order to get pages indexed that are made of Flash, images or dynamic text, which are not scanned by robots.
But this is not allowed.
How to type google.com without being redirected to my country version?
When you want to access the search engine, it automatically redirects you to the regional version of the engine. This is suitable for most users, but not for the webmaster or the user who wants to do a search on google.com.
To reach google.com, type in the URL bar:
google.com/ncr
This can be placed in a bookmark. "ncr" presumably means "no country redirect".
What is the bounce rate?
Definition from Google: "Specifies what percentage of visitors left the site without viewing any other pages." A bounce is when a visitor leaves the site as soon as he has read the page on which he arrived. So if three out of four visitors read a single page and leave the site without reading the others, the bounce rate is 75%.
It is generally preferable to have a low bounce rate: it means that there is interest in the content of the site and that many pages are read. On the other hand, when a visitor searches for something very precise, he will leave the site after having found it, and the bounce in this case is a positive factor!
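The arithmetic above can be sketched as follows (the visit counts are invented for illustration):

```python
# Bounce rate: share of visits that viewed exactly one page.
total_visits = 4
single_page_visits = 3  # visitors who left after reading only the landing page

bounce_rate = 100 * single_page_visits / total_visits
print(f"Bounce rate: {bounce_rate:.0f}%")  # three out of four visitors -> 75%
```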
How can we overshoot Wikipedia?
Wikipedia, the big wiki, a sort of online encyclopedia, tends to arrive at the top in Google, even ahead of websites with more comprehensive articles and more backlinks!
One of the reasons is that this site is favored; another is the impressive number of links between articles and sub-domains.
But there is room to move ahead and achieve top results in search engines. The weakness of the wiki is that all articles have a single word for a name, and thus the anchors are also a single keyword.
The solution is to write articles based on two keywords, for example grape + health, or health + diet. The title of the article includes the two keywords, as well as the file name and the anchors of internal links...
Searches made on two keywords should then return your page rather than the one-keyword page of the wiki.
Can changing the design affect the ranking?
It should not. However, some webmasters have experienced a loss of position after changing the design of a site without changing the content, immediately after the passage of Googlebot.
This experience has been shared on WebmasterWorld. The ranking returns to its previous state after a variable delay. It is probable that a massive change raises some signal in the engine.
We therefore recommend changing the design little by little and not globally. If something leads to a penalty, it will be easier to see what it is.
Another piece of advice from Google is not to change the design when you change the domain and redirect the pages.
Is being temporarily unavailable harmful for a website?
This may be the case if the situation is not properly managed, which supposes that we know in advance that the site will be unavailable.
If this is not the case, webmasters may think that the site, if not very important, is closed, and remove backlinks. Similarly, search engine robots may record a negative signal.
If the outage is expected, the ideal is to return an HTTP code 503, which is defined for this situation. In PHP, the code of the home page or all pages in the case of a CMS can be like this:
header('HTTP/1.1 503 Service Temporarily Unavailable');
header('Retry-After: Mon, 25 Jan 2011 12:00:00 GMT');
This code is supplied by Google.
What is minus thirty?
Many webmasters believe they have suffered a penalty called minus 30, or -30. Their site is bumped from #1 to #31 in Google's results, and it is very clear with the URL of the site: in general, a site ranks first on its name with the extension, but such sites are now found in 31st position.
Why does the link: operator from Google return only a few backlinks?
The link operator in the search bar (link:site-name) is a command to display the links pointing to a site. In fact, this command provides only a fraction of the backlinks, in order to save server bandwidth.
The choice of results is totally random; this was confirmed by Matt Cutts in a video on YouTube. They have nothing to do with PR or with the quality of the pages: they are taken randomly.
Should we add content frequently?
Can continuously adding new pages be harmful, since it increases the number of links on the homepage?
Adding content is good, but we must follow some rules of organization. The homepage does not link to all articles but only to a few. Each page must have a link to the home page and links to related articles: links should always be relevant.
That said, Google promotes fresh content, so assuming that your new articles are related to current events, or your changes to previous articles update them, it is good for SEO.
Changes that are not related to current events have little interest; they serve mostly AdSense, which preferentially targets pages that evolve.
What is the difference between "white hat" and "black hat"?
These are two forms of optimization, one regular and one prohibited by the rules of search engines.
The first consists of making the content of a site more accessible to search engines: internal links, choice of keywords, etc.
The second is to manipulate them to gain better positions in the results with less useful or even inappropriate content, for example by creating link farms.
Google has said many times, and especially in this video, that it does not consider SEO as spam, provided it is designed to make access to the site easier for search engines:
- Make all pages accessible through links.
- Put keywords corresponding to the content on the page (without artificially multiplying them; one occurrence is enough).
- Make loading faster.
Unlike black hat techniques:
- Bad: To present different content to search engines and users.
- Bad: Overloading page with keywords useless for the user.
Are spelling and grammar taken into account by search engines?
This is not part of the 200 signals that determine the ranking, but Matt Cutts said that there is a correlation between authority sites, which are well positioned, and good spelling and grammar.
In addition, there is a correlation with PageRank, which is normal: it is easier to link to or bookmark a well-written page, which gives the impression of having been carefully worked on.
How does Google determine the originality of content?
Google has perfected over time an instrument to detect duplicate content, and uses a derivative of the online tool Translate. This allows it to recognize the same idea in different formulations.
The page size is not taken into account; what matters is the answer to a query, and each page can respond to multiple queries, especially if it contains subtitles that correspond to questions.
This is why the best optimized sites may have suffered most from Panda: they are more responsive to queries.
Why are big sites favored?
Panda actually evaluates site quality with a set of signals:
- Direct access to the site rather than the search engine.
- Bounce Rate.
- Time on site.
It goes without saying that big sites have more direct access and rely less on search engines.
Robin Hood robbed the rich to redistribute to the poor, Panda Hood tends to do the opposite.
Why do forums get preferential treatment?
Search engines are very effective at giving an answer to a general query such as "organic food". But if we want a precise answer, it is different: the engine should understand the issue according to the specific needs of the user.
Forums and question-and-answer sites are most effective for that.
Forums have something more: contributors tend to say almost anything, providing unique content.
How to improve the PageRank naturally
How to know the PR of a page?
The greenbar is a Chrome extension. URL:
PageRank, or website ranking, is a notation from 0 to 10, given by Google to
each page of a website.
The higher this value, the better the position of the page in search results, among other pages that match the query.
A PageRank of 5 points is good. 7 points may be reached with valuable backlinks. The list of websites with a 10-point PageRank is very short!
The word PageRank comes both from "page ranking" and from "Page", the name of one of the two authors of the algorithm (Sergey Brin and Larry Page).
More in Google's PageRank.
According to Google, PageRank is the most important among the 200 criteria used to order pages in search results.
Thus, it is not the only one. But for websites that match the same group of keywords, it is very important.
When two pages are identical, and if the date of indexing is not sufficient to know which is the original and which is the copy, Google considers that the page with the higher PageRank is the original. This was clearly stated in an interview of Matt Cutts by Stephan Spencer and confirmed by a post on Google's blog about duplicate content.
PageRank is it transmitted through a link to an image?
The official answer from Google is YES.
Why is my site better positioned than another in search results while the latter has a greater PageRank, or vice versa?
Because the results page is related to a group of keywords, while the PageRank reflects the number and quality of backlinks to the page regardless of the query. It is possible that the other page is better positioned on a different group of keywords.
When cloaking is detected, the website goes to the blacklist and its pages are no longer indexed. See the "bmw.de" and "ricoh.de" affairs (same webmaster?).
This is known as a bug in the calculation of the PR, and is probably fixed now.
But this is only a rough estimate, as the position depends on the group of keywords. To know the real ranking, perform searches with various keywords. The position of your page (when several match the query) gives the ranking: the top of the list suggests a ranking of 10; the first page of search results suggests a PR of 6 to 9 when lots of matches exist.
I have been contacted by a company that guarantees me a 10-point PR, and I want to improve my ranking. Should I accept?
According to Google, nobody can guarantee a PageRank, or any position. (And only about a dozen big websites have a PR of 10.)
Matt Cutts is the member of Google's SEO staff who communicates most often in the media about the algorithm. He said in an interview published on the Stonetemple site, on October 8, 2007:
"I would certainly say that the links are the primary way that we look at things now in terms of reputation."
Links are the source of PageRank, according to their weight and their number, and they are the first factor for the reputation of the document, which in turn is certainly the first factor for the position in results (but one among 200 signals).
What does a gray PR bar mean? Is this a penalty?
This is not necessarily a penalty, and it is not a problem with the toolbar as some think. It is not equivalent to a PR of 0.
The gray bar is a signal that something about the page departs from the rules that Google wants webmasters to apply: most often a lack of content, or an excessive number of internal or external links compared to the content.
In practice, it prevents the spread of PR. A page should not be grayed if it has quality backlinks; otherwise you should study it, as it may contain anomalies.
PageRank, which is based upon backlinks, is only one factor among several used to calculate the position of a link to your website in search results.
These factors are also considered:
- The localization of the host and the language of the query.
- Clicks on the link to your website rather than other links in the results. Your page must be chosen: write a good title and description, clear and attractive.
- The number of relevant (distinct) keywords. This is used first to select a page, and then to calculate its position in the list.
A more complete list is given in the Google patent.
Does a 301 redirect mean a loss of PageRank?
When a page is redirected through the HTTP 301 code, the PageRank is transmitted with a discount. This has been confirmed by Matt Cutts. The rate of this reduction is the same as for a link, about 15%. We can say from experience that it is enough to lose one or more positions in the results.
It is better to avoid changing the domain of a site if it is not absolutely necessary.
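As a rough, hypothetical illustration of this discount: if each 301 redirect retains about 85% of the transmitted PageRank, chained redirects compound the loss.

```python
# Assumed share of PageRank passed per redirect (~15% discount, as stated above).
RETENTION = 0.85

pr_passed = 1.0
after_one_hop = pr_passed * RETENTION
after_two_hops = pr_passed * RETENTION ** 2

print(round(after_one_hop, 4))   # one redirect
print(round(after_two_hops, 4))  # two chained redirects
```

This is why redirect chains are best avoided when moving a site: each extra hop multiplies the loss.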
When is the PageRank updated?
The actual PageRank depends on the evolution of backlinks among other factors and is constantly modified.
But the public PR, as displayed by the green bar of the toolbar, is changed at fixed dates, every three months, at the beginning of January, April, July and October.
- SEO manual. A step-by-step manual on how to succeed in SEO and increase the number of visitors.
- Answers from Google to webmasters. Lots of questions, and the team at Google Webmaster Central answered all of them.
- Interview with Matt Cutts, head of Google's webspam team.
- Articles on robots.txt.
- Sharing advice.
- PageRank, what do we really know about it? A summary of rules, articles and tools about PageRank. Some rules, such as "Adding new pages can decrease PageRank", are false: you can add as many pages as you want without problem.