Monday, September 6, 2010

All About PageRank






This chapter deals largely with theory. However, because Google PageRank (PR) is so widely misunderstood, it is important that you understand how PR works under the hood and what role it plays in influencing rankings.

Many people obsess over PageRank and over-hype its importance, which introduces worry and confusion that are not warranted.

There are PR = 8 sites that you cannot find in Google unless you search for them by company name, while there are PR = 4 sites that are in the top 2 or 3 search results for relevant keyword phrases.


PageRank vs. Search Result Ranking


People tend to confuse PageRank with their page’s ranking for a certain search result for a certain keyword. PR is just one factor that is used to determine your page’s actual rank on a search results page for a given search query.

It is not uncommon to see a page with a lower PageRank that is positioned higher on a search results page than a page with a higher PageRank. This shows that PageRank is not the most important factor in Google’s ranking algorithm. A properly keyword-optimized page with a lower PageRank can outrank a non-optimized page with a higher PageRank.

This is a common scenario for large corporate sites. A corporate site may have a high PageRank as a result of the large number of business partner sites that link to it, yet still be outranked because its pages are not keyword-optimized.


Toolbar PageRank vs. Actual PageRank


The Google Toolbar allows you to see a crude approximation of PageRank value for any page in its index. Download and install the Google Toolbar at http://toolbar.google.com/.

Most people don’t realize that the PageRank values shown in the Google Toolbar are not the actual PageRank values that Google uses to rank web pages. The Toolbar scale is divided into equal linear steps from 0 to 10, and these steps correspond to ranges on a logarithmic scale that Google uses. The base of that scale is estimated to be somewhere between 5 and 10. The public Toolbar PR value, however, is what people talk and agonize about.

The Toolbar PageRank value only indicates that a page is in a certain range of the overall scale. One PR=5 page could be just above the PR=5 division and another PR=5 page could be just below the PR=6 division, which is a vast gulf.

Although the exact logarithmic base used for PageRank is a secret, the following table should give you an idea of how different Toolbar PR is from actual PR.


This means that moving a page from a PR = 6 to a PR = 7 is much harder than moving from a PR = 4 to a PR = 5.
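To get a feel for how quickly a logarithmic scale widens, here is a rough sketch in Python. It assumes that Toolbar PR n covers actual values from base^n up to base^(n+1), and it tries both ends of the estimated range (base 5 and base 10). The function name toolbar_range and the bucket layout are assumptions made for illustration only, since the real base and boundaries are not public.

```python
def toolbar_range(toolbar_pr, base):
    """Return assumed (low, high) actual-PR bounds for one Toolbar value.

    Assumption: Toolbar PR n spans base**n up to base**(n + 1).
    The real boundaries are not published.
    """
    return base ** toolbar_pr, base ** (toolbar_pr + 1)


for base in (5, 10):
    print(f"Assumed log base {base}")
    for toolbar_pr in range(0, 11):
        low, high = toolbar_range(toolbar_pr, base)
        print(f"  Toolbar PR {toolbar_pr}: roughly {low:,} to {high:,}")
```

Under either assumed base, each Toolbar step is wider than the one before it by a factor equal to the base, which is why each additional point is harder to earn than the last.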

Although PageRank is assigned per page, your site is a collection of web pages under a domain that you control and hence your site has a total PR value too.

PR as viewed using the Toolbar can be pretty inaccurate. Sometimes home pages for sites will suddenly show a PR = 0 (no green bar) when indeed the page does have a PR value. Appending /index.html to the URL (or whatever the filename is for the home page) in your browser restores the proper value displayed in the Toolbar.

Also, new web pages that the Toolbar displays a PR value for may not have any “real” PageRank of their own yet. Rather, the new page is “assigned” a PR value 1 point below an indexed page on the site, but this is an “estimate” PageRank that exists only in the Toolbar.

My suggestion is to simply ignore that little green bar. It never was that accurate to begin with and it’s just gotten worse over time. It really doesn’t have much bearing on how well you are ranking.

Increasing PageRank

Each page of your website has a PR value, and as such you can simply add up the individual PR values of each page to arrive at the total PR that your site has (bear in mind however that when someone speaks of PR, it applies to a page). How you structure your internal links can influence what the PR value of a page will be, as will links pointing to a page on your site. Although page PR value is important, you should really be trying to increase your total site PR value.

The actual PR value of each page indexed by Google is in constant flux. On the Web new pages are added, old pages are removed, more links are created – all of which over time slowly degrade the “value” of your links.

As the number of web pages in the Google index increases, so does the total PageRank value of the entire Web, and so does the high end of the overall scale used. This is kind of like the top student setting the “curve” for an exam. The top-ranking site (or, in actuality, a handful of sites) gets the maximum, perfect PageRank score (10 in the Google Toolbar), and everyone else is scaled down accordingly. As a result, some web pages may drop in PageRank value for no apparent reason. If a page’s actual PR value was just above a division on the scale, the addition of new pages to the Web may cause the dividing line to move up the scale slightly, and the page would end up just below the new division.

As such, you should always strive to obtain more links that point to your site. Otherwise, your site can naturally start slipping in rankings due to this “raising of the bar” of PageRank across the Web.
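The “curve” effect can be sketched in a few lines of Python. Everything here is hypothetical: the function toolbar_pr, the idea that the Toolbar value is simply a page's position on a log scale whose top is set by the highest-scoring page, and the sample numbers are assumptions chosen to illustrate the mechanism, not Google's actual method.

```python
import math


def toolbar_pr(actual_pr, max_pr):
    """Hypothetical mapping of actual PR onto a 0-10 Toolbar value.

    Assumption: the top page (max_pr) defines 10, and every other page
    is placed on a log scale relative to it.
    """
    if actual_pr <= 1:
        return 0
    return min(10, math.floor(10 * math.log(actual_pr) / math.log(max_pr)))


page_pr = 101_000            # made-up actual PR, just above a division
old_top = 10 ** 10           # made-up top score before the Web grows
new_top = 1.5 * 10 ** 10     # made-up top score after the Web grows

print("Before:", toolbar_pr(page_pr, old_top))
print("After: ", toolbar_pr(page_pr, new_top))
```

The page's own value never changes in this sketch. Only the top of the scale moves, and that alone is enough to push the page under the dividing line.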


Decreasing PageRank

The amount of PageRank value that a link forwards on to your site is diluted by the presence of other links on the same page. This is where link strength comes into play.

The greater the number of other links on a page, the weaker the strength of each individual link, because the strength of that “vote” is divided equally among all of the links on the page.
This means that, all other things being equal, if someone links to your site from a page that has 100 other links, you may not get any appreciable value from that link in the overall calculation unless the page has a very high PageRank.


The PageRank Equation

Here is the official PageRank equation. A page’s PageRank is calculated by solving this equation simultaneously for each of the billions of web pages in the Google index:

PR(your page) = 0.15 + 0.85 [(PR(page A) / total links (page A) ) + (PR(page B) / total links (page B) ) + …]

A couple of observations to note about the PR equation:

• PR is based on individual web pages – not on a website as a whole.

• The PR of each page that links to your site in turn is dependent on the PR of the pages that link to it, and so on iteratively.

• A link’s value (amount of PageRank or “voting power” forwarded to the linked-to page) is at most only 85% of the linking page’s PageRank value, and this value is diluted (decreased) by the number of other links on that page.

• PR has nothing to do with keywords or text in links - it is purely dependent on link quantity and link strength, as discussed previously.
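To make the iterative nature of the equation concrete, here is a minimal sketch in Python. The three-page link graph, the page names A, B, and C, the starting value of 1.0, the iteration count, and the function name pagerank are all assumptions made for illustration. It applies the equation above repeatedly until the values settle, and it is not a description of how Google computes PR at scale.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Repeatedly apply PR(p) = (1 - d) + d * sum(PR(q) / total_links(q)).

    links maps each page to the list of pages it links to.
    Assumption: every page has at least one outgoing link.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess

    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum the share forwarded by every page that links here.
            inbound = sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - damping) + damping * inbound
        pr = new_pr
    return pr


# Hypothetical toy Web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
for page, value in sorted(pagerank(toy_web).items()):
    print(f"PR({page}) = {value:.3f}")
```

On this toy graph the values settle at roughly 1.16 for A, 0.64 for B, and 1.19 for C. Page C comes out on top because it receives all of B's vote plus half of A's, while B receives only half of A's.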

Some may incorrectly conclude that a link from a page with a PR = 4 and only a few outgoing links is worth more than a link from a page with a PR = 6 and 100 outgoing links because, for the latter, the “voting power” or value is divided up among 99 other links.

However, you must remember the logarithmic nature of actual PageRank. A link from a PR = 6 page with lots of outbound links may indeed be worth more than a link from a PR = 4 page that has only a few outbound links.
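A quick back-of-the-envelope check in Python shows why. The sketch assumes that a Toolbar PR of n corresponds to an actual PR of about base^n, takes “a few” to mean 5 links, and tries both base 5 and base 10. The function names and all of the numbers are assumptions for illustration, since the real scale is not public.

```python
DAMPING = 0.85


def assumed_actual_pr(toolbar_pr, base):
    """Assumption: Toolbar PR n corresponds to an actual PR of base**n."""
    return base ** toolbar_pr


def link_value(toolbar_pr, total_links, base):
    """Value one link forwards: damping * actual PR / total links on the page."""
    return DAMPING * assumed_actual_pr(toolbar_pr, base) / total_links


for base in (5, 10):
    low_pr_link = link_value(4, total_links=5, base=base)
    high_pr_link = link_value(6, total_links=100, base=base)
    print(f"Base {base}: PR 4 page with 5 links forwards {low_pr_link:,.2f}, "
          f"PR 6 page with 100 links forwards {high_pr_link:,.2f}")
```

With these assumed numbers the PR = 6 link comes out ahead at both bases, though only narrowly at base 5. With fewer links on the PR = 4 page the result at base 5 can flip, which is why the claim is that the PR = 6 link may be worth more, not that it always is.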

The Evolution of PageRank

PageRank used to be a simple weighting factor for all links regardless of the topic of the page that contained the link. This led to a small industry that focused on buying and selling high-PageRank links. However, when anyone can achieve high rankings by simply buying enough links from any website, or trading links with any unrelated website, PageRank loses its value as a factor in ranking websites accurately.

As such, Google has done some tweaking of how it analyzes the value of links. Links are now scored differently and some links may not count as much as they used to. PageRank as the defining metric for links is becoming less important and the other variations listed below are becoming more important.


Topic-Sensitive PageRank

Topic-sensitive PageRank computes link value based only on incoming links from pages that are returned from a given search result set that matches the search query (whether the result set is 100 or 10,000 pages is not known).

This means that a flower site only gets links counted from other sites that are related to flowers and gardening, and not from sites that are about mortgage loans, for example.
By using Topic-sensitive PageRank, Google hopes to filter out irrelevant links that have skewed the value of PageRank in the past.
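The filtering idea can be sketched in Python as follows. This is only an illustration of the description above, not Google's implementation. The function topic_link_value, the example.com-style site names, the PR values, and the link counts are all made up, and the result set is modeled as a simple set of page names.

```python
def topic_link_value(inbound_links, result_set, damping=0.85):
    """Score a page using only links from pages in the query's result set.

    inbound_links: list of (source_page, source_pr, total_links_on_source).
    result_set: pages returned for the search query (hypothetical).
    """
    on_topic = [
        (pr, total_links)
        for source, pr, total_links in inbound_links
        if source in result_set
    ]
    return (1 - damping) + damping * sum(pr / n for pr, n in on_topic)


# Made-up inbound links to a flower site.
inbound = [
    ("roses.example", 4.0, 10),
    ("gardening.example", 3.0, 5),
    ("mortgages.example", 6.0, 20),
]

# Made-up result set for a flower-related query.
flower_results = {"roses.example", "gardening.example"}
all_sources = {source for source, _, _ in inbound}

print("Topic-sensitive value:", round(topic_link_value(inbound, flower_results), 3))
print("All links counted:    ", round(topic_link_value(inbound, all_sources), 3))
```

The mortgage link has the highest PR of the three, but it contributes nothing once the calculation is restricted to pages that match the query.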


Local Rank

Local Rank is a variation of PageRank whereby links from sites that share the same Class C block are worth less than links from a variety of different IP addresses, which are generally different servers owned by different businesses.

As you may recall, a Class C block is identified by the number shown in the third position of an IP address. For example, for 255.137.xxx.255, xxx represents the Class C block. Two sites share a Class C block when their IP addresses match through that third number.
This attempts to deal with the problem of different sites owned by the same company that cross-link to each other. Put another way, Google wants to see incoming links that are from different business entities, not different sites owned by the same person.
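Checking whether linking sites share a Class C block is easy to sketch in Python with the standard ipaddress module. The helper names class_c_block and distinct_blocks are assumptions for illustration, and the addresses come from ranges reserved for documentation. The sketch treats a Class C block as the /24 network formed by the first three numbers of the address.

```python
import ipaddress


def class_c_block(ip):
    """Return the /24 network (first three numbers) that contains ip."""
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def distinct_blocks(ips):
    """Return the set of distinct Class C blocks among the given addresses."""
    return {class_c_block(ip) for ip in ips}


# Made-up IP addresses of four sites that link to you.
linking_ips = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]

blocks = distinct_blocks(linking_ips)
print(f"{len(linking_ips)} linking sites, {len(blocks)} distinct Class C blocks")
for block in sorted(blocks):
    print(" ", block)
```

In this example the first two addresses fall in the same block, so under Local Rank as described above they would look more like one source than two.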


TrustRank and the Sandbox

TrustRank is a variation of PageRank whereby links from sites that are “trusted” by Google carry more weight than other links. This is also related to the Google Sandbox. As you recall, the Google Sandbox is a series of filters applied to new sites that cause them not to rank well, or not to rank at all, for anything but very niche, unique keyword phrases, such as their company name.

TrustRank says that new websites either have to reach a certain age (say 6 - 18 months) OR obtain relevant, quality links from authoritative "highly-trusted" sites to escape the Sandbox. However, links from highly-trusted sites can be very difficult for new sites to get. For this reason, most new sites must be of sufficient age AND the links that point to new sites need also to be of sufficient age and at least “moderately trusted" before a new site can rank well.
The TrustRank threshold that new sites need to overcome to escape the Sandbox varies by keyword and industry. Gambling and pill sites have a much harder time breaking free from the Sandbox filters than, say, baby blanket sites.
