13. PageRank and its influence
PageRank is important if you want a high Google ranking.
It cannot be ignored.
Its calculation, and its influence on a real website, are very complex, so it is
only possible to generalize about how individual web pages affect others on the
same or different websites by virtue of the links in place.
Despite this, enough intuitive understanding has been developed to guide web
marketers.
PageRank is a registered trademark of Google. It is always referred to in its
capitalized form in this book when the term relates to Google's intellectual
property.
The word Page refers to Lawrence (Larry) Page, co-founder of Google, the search
engine, with Sergey Brin.
This chapter will pull out the important and practical implications of
PageRank as I understand them. It will not in any major way try to explain
PageRank by lengthy discussion. This has been done by the following authors in
an admirable and eye-opening fashion. The following papers should all be read
and re-read in your attempts to get a handle on this misunderstood and complex
topic (complex in both its calculation and its practical implications).
Sergey Brin and Lawrence Page, Anatomy of a Search Engine, presented at the
7th international WWW98 conference:
http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm
The PageRank Citation Ranking: Bringing Order to the Web, also by Brin and
Page, covers in more detail the mathematics and philosophy around PageRank and
the whole WWW.
PageRank is essentially a system based upon citations of good quality web
pages. The more a good quality web page is recognized, the more it will be
linked to naturally and for good logical reasons.
http://dbpubs.stanford.edu:8090/pub/1999-66
Chris Ridings has written an excellent long document called PageRank Uncovered,
a sequel to the original PageRank Explained, that goes into great detail on the
PageRank calculations and theoretical possibilities. Chris was the first to
really study and make sense of PageRank for the benefit of SEOs. Most of the
later contributors agree with Chris's views.
www.supportforums.org/PageRank.pdf
Phil Craven has also written a very good article offering more understanding
and practical use of PageRank
http://webworkshop.net/pagerank.html
Markus Sobek's articles (in which he actually points out a mix-up in Brin and
Page's presentation to WWW98) can be found at
http://pr.efactory.de/
13.1 VISUALIZING PAGERANK AND ITS PR BAR GRAPH
Google does not calculate PageRank for a single website. It calculates the
PageRank for every page on the whole web (or rather its indexed database) in one
sitting, at various time intervals.
Suppose the whole web consisted of 1,000,000 interlinked pages and a surfer
started clicking on links in a random manner. In 1,000,000 surfing starts that
single surfer would, on average, hit a web page with a PageRank of 1,000 about
a thousand times (i.e. in line with its PageRank).
A page with a PageRank of only 100 would be hit ten times less frequently by
the same surfer. This kind of thinking puts into perspective the importance of
Yahoo, with hundreds of thousands of links pointing to it. This was the
intuitive thinking behind Brin and Page's theory and PageRank algorithm.
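The random surfer idea can be tried out on a very small scale. The sketch below is only an illustration under my own assumptions: the four-page web, the page names and the function random_surf are all hypothetical, and the surfer simply follows links without the 0.85 factor that appears later in this chapter.

```python
import random
from collections import Counter

# Hypothetical four-page web: every page has at least one outgoing link.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
}


def random_surf(links, start, clicks, seed=0):
    """Follow randomly chosen links and count how often each page is hit."""
    rng = random.Random(seed)
    hits = Counter()
    page = start
    for _ in range(clicks):
        page = rng.choice(links[page])  # click one outgoing link at random
        hits[page] += 1
    return hits


hits = random_surf(LINKS, start="A", clicks=1000)
for page in sorted(hits):
    print(page, hits[page])
```

In this tiny example every other page links to page A, so the surfer lands on A on every second click, while B, C and D share the remaining hits between them.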
The Google toolbar, which was discussed earlier, is shown below to refresh
your memory.
Note the green horizontal bar called PageRank. This bar is a representation of
actual PageRank. The actual scale is not known outside Google but it is known to
be logarithmic in nature. Using the familiar log base of 10, this means that a
PageRank of exactly 4 on the bar is 10 times bigger than exactly 3 on the same
bar. If the log base were 5, as some believe, then PageRank 4 would be 5 times
bigger than PageRank 3.
PageRank 2 could have an actual numerical value of, say, 100, or it may be
1,000; only Google knows. Do not rely upon this toolbar for any meaningful
number. Use it for indicative purposes only.
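The logarithmic arithmetic can be written down in a couple of lines. This is a sketch only: the function name relative_size is my own, and the log base is an assumption, since the real scale is known only to Google.

```python
def relative_size(higher_bar, lower_bar, log_base=10):
    """How many times larger the actual PageRank is, under an assumed log base."""
    return log_base ** (higher_bar - lower_bar)


print(relative_size(4, 3))              # assumed base 10
print(relative_size(4, 3, log_base=5))  # assumed base 5
print(relative_size(6, 3))              # three bar steps apart, assumed base 10
```

With an assumed base of 10, each extra step on the bar multiplies the actual PageRank by ten, so three steps amount to a factor of a thousand.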
13.2 PAGERANK MATHEMATICS SIMPLIFIED
In calculating PageRank, Google needs to know the number of links from every
page in the whole of its indexed database. It also needs to know where each link
goes to. The table shown next illustrates what Google does to determine the
number of links from every page. For the sake of simplicity I have assumed the
whole matrix of web pages consists of only 8 pages.
|       | PageA | PageB | PageC | PageD | PageW | PageX | PageY | PageZ | Total of links |
| PageW | 1     | 1     | 1     | 0     | 0     | 1     | 1     | 1     | 6              |
| PageX | 2     | 1     | 0     | 1     | 0     | 0     | 1     | 1     | 6              |
| PageY | 1     | 1     | 1     | 1     | 1     | 0     | 1     | 0     | 6              |
| PageZ | 0     | 1     | 1     | 1     | 0     | 1     | 0     | 1     | 5              |
| PageA | 1     | 1     | 0     | 0     | 1     | 0     | 0     | 0     | 3              |
| PageB | 0     | 0     | 0     | 1     | 2     | 0     | 0     | 1     | 4              |
| PageC | 1     | 1     | 1     | 0     | 1     | 1     | 1     | 1     | 7              |
| PageD | 0     | 2     | 1     | 0     | 0     | 0     | 1     | 1     | 5              |
The Total of links column gives the total number of links from, say, PageW
(named in the left-hand column) to the other pages (named along the top row) in
the whole database. In this case PageW has 6 links out. The importance of this
number is that, in calculating PageRank for every page on the whole web, Google
would calculate (by iteration) the PageRank of PageW, divide it by 6 and add
one sixth to each of PageA, PageB, PageC, PageX, PageY and PageZ, i.e. all those
pages PageW is actually linking to.
Even though the table shows that in some cases there are 2 links from the
same page to another page, probably only one of these links is counted.
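The row totals and the equal split can be checked with a short sketch. The link counts are copied from the table. The function names, and the PageRank of 6.0 given to PageW, are my own assumptions for illustration, and the 0.85 factor introduced later in the chapter is left out here.

```python
PAGES = ["PageA", "PageB", "PageC", "PageD", "PageW", "PageX", "PageY", "PageZ"]

# LINKS[source][i] = number of links from source to PAGES[i], copied from the table.
LINKS = {
    "PageW": [1, 1, 1, 0, 0, 1, 1, 1],
    "PageX": [2, 1, 0, 1, 0, 0, 1, 1],
    "PageY": [1, 1, 1, 1, 1, 0, 1, 0],
    "PageZ": [0, 1, 1, 1, 0, 1, 0, 1],
    "PageA": [1, 1, 0, 0, 1, 0, 0, 0],
    "PageB": [0, 0, 0, 1, 2, 0, 0, 1],
    "PageC": [1, 1, 1, 0, 1, 1, 1, 1],
    "PageD": [0, 2, 1, 0, 0, 0, 1, 1],
}


def links_out(source, count_duplicates=True):
    """Total links out of a page; optionally count repeated links only once."""
    row = LINKS[source]
    if count_duplicates:
        return sum(row)
    return sum(1 for count in row if count > 0)


def shares(source, pagerank):
    """Split a page's PageRank equally among the distinct pages it links to."""
    share = pagerank / links_out(source, count_duplicates=False)
    return {page: share for page, count in zip(PAGES, LINKS[source]) if count > 0}


print(links_out("PageW"))                          # matches the Total of links column
print(links_out("PageX", count_duplicates=False))  # the double link counted once
print(shares("PageW", 6.0))                        # hypothetical PageRank of 6.0
```

PageX shows the effect of the duplicate link rule: the table total is 6, but only 5 distinct pages receive a share if the second link to PageA is ignored.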
The strength of Google compared to other search engines is that it can
interpret these vast numbers of links and then do massive amounts of iterative
calculation at low cost by virtue of the mathematical PageRank algorithm it
developed and then patented.
Now take a look at the next table with exactly the same numbers.
What do you think the totals in the bottom row represent?
|       | PageA | PageB | PageC | PageD | PageW | PageX | PageY | PageZ | Total of links |
| PageW | 1     | 1     | 1     | 0     | 0     | 1     | 1     | 1     | 6              |
| PageX | 2     | 1     | 0     | 1     | 0     | 0     | 1     | 1     | 6              |
| PageY | 1     | 1     | 1     | 1     | 1     | 0     | 1     | 0     | 6              |
| PageZ | 0     | 1     | 1     | 1     | 0     | 1     | 0     | 1     | 5              |
| PageA | 1     | 1     | 0     | 0     | 1     | 0     | 0     | 0     | 3              |
| PageB | 0     | 0     | 0     | 1     | 2     | 0     | 0     | 1     | 4              |
| PageC | 1     | 1     | 1     | 0     | 1     | 1     | 1     | 1     | 7              |
| PageD | 0     | 2     | 1     | 0     | 0     | 0     | 1     | 1     | 5              |
| Total | 6     | 8     | 5     | 4     | 5     | 3     | 5     | 6     |                |
These totals are representations of probable relevance and/or high citation.
PageB has the highest number of links to it and is thus probably more relevant
and would probably have the highest PageRank. This is in the ideal world only.
You can draw this table for every one of your own websites to get a better
visual understanding of the relative importance of your pages and of where you
can increase or decrease links to re-allocate PageRank within your own website.
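If you would rather not draw the table by hand, the two sets of totals can be counted with a few lines of code. This is a minimal sketch: the site, its page names and the function link_totals are all hypothetical, and each link between two pages is counted once.

```python
from collections import Counter

# Hypothetical site: each page maps to the pages it links to.
SITE_LINKS = {
    "index.html": ["products.html", "about.html", "contact.html"],
    "products.html": ["index.html", "contact.html"],
    "about.html": ["index.html"],
    "contact.html": ["index.html", "products.html"],
}


def link_totals(site_links):
    """Return (links_out, links_in) per page, counting each distinct link once."""
    links_out = {page: len(set(targets)) for page, targets in site_links.items()}
    links_in = Counter()
    for targets in site_links.values():
        links_in.update(set(targets))
    return links_out, dict(links_in)


links_out, links_in = link_totals(SITE_LINKS)
for page in SITE_LINKS:
    print(f"{page:15} out={links_out[page]} in={links_in.get(page, 0)}")
```

The links out figures correspond to the Total of links column and the links in figures to the Total row. In this made-up site index.html collects the most links in, just as PageB does in the table.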
13.3 IMPORTANT PRACTICAL PRINCIPLES
Reference must be made to the papers identified earlier for greater insight
into the conclusions presented here.
> Only pages in the Google index count.
> A web page with a high PageRank which links to your website will add
considerably more to the PageRank of your own web page than one with a low
PageRank, all other matters being equal.
> The PageRank of an individual web page is positively influenced by links
into that page, and the whole site is affected negatively by links out of the
site.
> In general terms, links out of a site should be limited to a minimum number
of web pages, and those web pages should ideally have low PageRank so that less
accumulated PageRank is drained out of the site as a whole.
> Well interlinked web structures tend to garner a higher combined PageRank
than simple or circular linked sites.
> The combined PageRank of a website is distributed around the web pages
making up the site in a non-equal manner for most practical web designs.
> Reducing links out from an important page would tend to improve that page's
PageRank.
> Pointing an incoming link from outside at the most important page is the best
way to capitalize on incoming PageRank.
> An internal link is of NO lesser importance as far as PageRank is concerned
than an external link. There is a view that internal links are marked down, but
pages are pages whether on the same domain or not.
> A link from a page that has few outgoing links is to the benefit of the
receiving page. Because of this it can be better to get a link from, say, a PR 4
page with only a couple of outgoing links rather than a PR 5 page with 100
outgoing links.
> Increasing the number of pages within a website by adding more good content
will increase the overall website PageRank, which can be optimally spread out
using linking mechanisms that favor the movement of PageRank to a chosen page.
This is true in a closed system and probably true in a non-closed system.
However, if many pages are added to a site containing incoming links and the new
pages do not have incoming links, then it is possible that the PageRank of
individual web pages can be lowered, to the detriment of the highly ranked pages
(refer to the papers by Markus Sobek and Phil Craven).
> New pages on a site should be linked so as to channel maximum PageRank to the
more important pages.
> If a page has a PageRank of 100 and has 10 outgoing links then, using Brin
and Page's published information, each outgoing link will carry 0.85 x 100/10 =
8.5 PageRank points to each of the 10 linked pages. This amount of 8.5 is then
distributed around the whole website in a non-linear manner depending upon the
internal linking structure and where the link enters the site. If there
were 100 outgoing links then each recipient site would only get 0.85 PageRank
points.
> Outward links to PDF, Excel files etc. do not drain PageRank from a site.
They would be classified as dangling links by Google. This could be useful as a
way of giving visitors useful information on another part of the web without
draining any PageRank away from your own site.
> If there is more than a single outgoing link to the same destination, only
one of these is counted. A page linking to itself has no value to share around.
> It is better to spread out links from new pages rather than have all point
to the same page.
> Splitting very long web pages into 2 or more shorter ones will improve the
PageRank of the site as a whole.
> Here's the PageRank formula, expressed in words and simplified just to make
it easier to understand:
The PageRank of a page called A that has a single link into it from page B is
equal to (PageRank of B divided by the number of links out of B) multiplied by
0.85, PLUS 0.15.
If more than a single page links to A then the formula is:
The PageRank of a page called A that has multiple links into it is equal to
(PageRank of each linking page divided by the number of links out of that page,
all added up) multiplied by 0.85, PLUS 0.15.
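The formula in words, and the 0.85 x 100/10 arithmetic from the list above, can be sketched in code. This is an illustration under stated assumptions rather than Google's actual implementation: the function names and the three-page example are hypothetical, every page is assumed to have at least one outgoing link, and each list of links is assumed to name a page only once.

```python
DAMPING = 0.85  # the 0.85 factor quoted in the text


def link_contribution(pagerank, links_out, damping=DAMPING):
    """PageRank points carried by each outgoing link of a page."""
    return damping * pagerank / links_out


def iterate_pagerank(links, iterations=100, damping=DAMPING):
    """Repeatedly apply PR(A) = 0.15 + 0.85 * sum(PR(T) / links out of T).

    `links` maps every page to the list of pages it links to. Assumes a
    closed system in which every page has at least one outgoing link.
    """
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
    return ranks


print(link_contribution(100, 10))   # the 10 outgoing links case from the text
print(link_contribution(100, 100))  # the 100 outgoing links case

# Hypothetical closed system of three pages.
example = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
for page, rank in iterate_pagerank(example).items():
    print(page, round(rank, 3))
```

In the three-page example, page C receives links from both A and B and ends up with the highest value, while the three values together still add up to 3, the number of pages in this closed system.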
Go and read Chris Ridings' paper for an excellent review of PageRank.
To remind readers, PageRank is a multiplying factor, and if PageRank is very
low (0.15 is assumed to be the minimum any page can have) then your web page
will come low down on the SERPs. Equally, if you have done a very poor job of
using on-page factors then your site might also come in low down even if your
PageRank is very high.
Thus it is very important that, before worrying about PageRank, you optimize
your page to ensure you get into Google's short barrel, i.e. become one of those
relatively few results that probably have your keyword in the Title, URL and
anchor link text.
To reread the background to these important points, go back to the earlier
chapter entitled Introducing the concept of links in relation to web
marketing.
13.4 PAGERANK AND A LITTLE KNOWN SECRET
Earlier I mentioned that Google and other search engines now use the
nofollow attribute within an <a href> tag in order to block links in blog
comments. This means that wherever a link (e.g. <a
href="http://www.pond-pumps.com/">Best pond pumps in South Africa</a>) has the
rel="nofollow" attribute inserted into it, the search engines will not follow
the link. It will be ignored by the search engines but still be available for
real people to click. This is what the nofollow link would look like: <a
rel="nofollow" href="http://www.pond-pumps.com/">Best pond pumps in South
Africa</a>
If a link is not followed then that link will NOT bleed any PageRank from the
page, and this is a good thing for ranking within Google (not necessarily Yahoo
and others). Here then is a mechanism for redirecting PageRank to any place you
want without disrupting the actual searcher experience. In other words, you can
now have many outward links from, say, the home page by using the nofollow
attribute and there will be no bleed of PageRank from this most important page.
In practice each and every affiliate link should be a nofollow, as should
links to less important pages like About Us, Find Us pages and the like.
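To see which links on a page already carry the attribute, the HTML can be scanned with Python's standard html.parser module. The sketch below is only one way of doing this: the class name LinkAudit is my own, and the sample HTML reuses the two pond pumps links from the text.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect (href, is_nofollow) for every <a> tag in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "nofollow external"
        rel_values = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


html = (
    '<a href="http://www.pond-pumps.com/">Best pond pumps in South Africa</a> '
    '<a rel="nofollow" href="http://www.pond-pumps.com/">'
    "Best pond pumps in South Africa</a>"
)
audit = LinkAudit()
audit.feed(html)
for href, nofollow in audit.links:
    print(href, "nofollow" if nofollow else "followed")
```

Run against your own pages, a listing like this makes it easy to spot affiliate links or About Us links that are still being followed.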
13.5 ONE OR MORE KEYWORDS PER PAGE
Is it better to create a new page for every keyword to get link benefits, or
is it better to incorporate more than a single keyword onto a page?
This question is really posed in relation to those very many keywords for
which it is easy to get a top SERPs ranking.
My feeling is (especially if you are using FrontPage) that if it is very
easy to produce web pages and the web page contains useful content then it would
be better to introduce a new page. Having created the new page, there must be a
link (or links) to this page from within the site and a link (or links) back to
the selected important page(s). These links are best made from and to the
important page, although at times this may not be possible.
The addition of extra pages will increase the overall site's PageRank, and
the distribution of the individual pages' PageRank can be channeled to a
specific page(s). Remember it is also best to place outgoing links on low
PageRank pages to keep as much PageRank as possible within your own site.
Do keep a record of every page in your website to record at least the
following information:
| Page URL | Keyword | PageRank if indexed | Rank for keyword |
|          |         |                     |                  |
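Such a record can be kept in a spreadsheet, or in a small CSV file produced by a script. The sketch below shows one possible layout. The field names, the PageRecord class and the example.com entries are all hypothetical and only mirror the four columns of the table above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class PageRecord:
    page_url: str
    keyword: str
    pagerank: Optional[int]          # toolbar PageRank, None if not indexed
    rank_for_keyword: Optional[int]  # position in the SERPs, None if unknown


def to_csv(records):
    """Render the records as CSV text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PageRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up entries for illustration only.
records = [
    PageRecord("http://www.example.com/", "pond pumps", 4, 12),
    PageRecord("http://www.example.com/new-page.html", "pond liners", None, None),
]
print(to_csv(records))
```

Pages that are not yet indexed simply get empty PageRank and rank cells, which can be filled in as the figures become available.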
What should be done if you have an existing website with many interlinked
pages and you want to maximize PageRank for a page you need to climb up the
SERPs?
I am sure you are asking yourself this question at this point in time. It is a
question I threw around in my mind a lot. The following paragraphs are what I
think should be done, but I do not have sufficient experimental results to be
totally sure, so bear this in mind.
The over-riding consideration is to ensure your visitor can navigate your
site. Refer to the link tables in section 13.2 and construct such a table
showing the interlinking pages for your own website.
1. Identify the important page.
2. Ensure that the important page's outbound internal (i.e. own site) links
are all reciprocated where it makes sense to do this. With very large sites this
cannot be done, so be selective and remember the customer.
3. Do not provide outbound external (i.e. others' sites) links from this
important page unless in return for high quality reciprocal links (i.e. high
PageRank, ideally with a low quantity of links out).
4. Encourage inbound external links to arrive at this page.
5. All lesser important pages, including new pages, could be linked directly to
the important page.
6. Add new pages linked only to important pages and from the outbound links
page (see 7 below).
7. Collect outbound external links and place them on a single low PageRank
page.
8. Break long pages into shorter pages if you believe you have sufficiently
satisfied on-page factors to get into Google's short barrel and stand a good
chance of on-page factor ranking. Remember that once Google has recognized that
the limit of on-page factors (e.g. keywords in body text) has been reached, it
does not care how many more you use on the page and it will ignore those extra
keywords.
9. Make good use of the nofollow attribute. This is probably the most powerful
tool you have to direct or concentrate PageRank wherever you want it, and few
people know about it.
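The effect of linking lesser pages directly to the important page (step 5) can be illustrated with the simplified formula from section 13.3. This is a sketch under my own assumptions: both four-page sites are hypothetical closed systems with no external links, and the function pagerank is a bare-bones iteration, not Google's implementation.

```python
def pagerank(links, iterations=200, damping=0.85):
    """Iterate PR(page) = 0.15 + 0.85 * sum(PR(src) / links out of src)."""
    ranks = dict.fromkeys(links, 1.0)
    for _ in range(iterations):
        ranks = {
            page: (1 - damping) + damping * sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return ranks


# Hypothetical four-page site, linked in a circle.
circular = {"home": ["a"], "a": ["b"], "b": ["c"], "c": ["home"]}

# Same pages, but every lesser page links to "home" and "home" links back to each.
channelled = {"home": ["a", "b", "c"], "a": ["home"], "b": ["home"], "c": ["home"]}

for name, site in (("circular", circular), ("channelled", channelled)):
    ranks = pagerank(site)
    print(name, {page: round(rank, 2) for page, rank in ranks.items()})
```

In the circular site every page ends up with the same value, whereas in the channelled site the home page collects nearly twice that amount at the expense of the other three pages. The total for the site is the same in both cases.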
At this stage it should be realized that web marketing cannot be left to the
average web page designer if success is to be achieved. A considerable amount of
understanding is needed to get a high ranking in the SERPs.
For this reason, even if you do not elect to do your own web page designs, you
are now in a position to control exactly what MUST be done to succeed. Don't
expect to see overnight changes of significance because this will not happen.
We're talking of changes that take months to propagate.
Of course, having done what you think is required to succeed, the next step is
to ensure that all your work actually produces the required results: high SERPs
which hopefully lead to high volumes of visitors, and ultimately buyers, to your
web.
You must monitor the results of your work and modify it where necessary.
Before getting to this stage, however, we need to design, lay out and publish
our webs to the Internet and then submit our sites to the search engines,
particularly Google. Subsequent chapters will cover:
- More about the Google toolbar
- Web page layout
- Checking web page design for broken links, poor design, insufficient keyword
use etc.
- Comparing web page keyword designs to high ranking competitors
- Getting your site URL and other page URLs indexed