The term "search engine" is often used generically
to describe both crawler-based search engines and human-powered
directories. These two types of search engines gather their listings
in radically different ways.
Crawler-Based Search Engines
Crawler-based search engines, such as Hotbot, create their
listings automatically. They "crawl" or "spider" the web, then
people search through what they have found.
If you change your web pages, crawler-based search engines
eventually find these changes, and that can affect how you are
listed. Page titles, body copy and other elements all play a role.
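The crawl-then-search idea can be sketched in a few lines. This is only an illustration under stated assumptions: the in-memory PAGES dictionary, its example URLs, and the function names crawl and search are all hypothetical, and a real crawler fetches pages over the network and handles far more cases.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/": ("stamp collecting tips", ["a.example/history", "b.example/"]),
    "a.example/history": ("history of stamps", ["a.example/"]),
    "b.example/": ("travel guides", []),
}

def crawl(start_url):
    """Follow links breadth-first and build an index of word -> set of URLs."""
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        # Record every word on the page so it can be searched later.
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
        # Queue each linked page that has not been visited yet.
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

def search(index, word):
    """Return the URLs whose text contained the word when crawled."""
    return sorted(index.get(word.lower(), set()))
```

Because searches run against the index rather than the live pages, a changed page only affects results after the crawler visits it again.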
Human-Powered Directories
A human-powered directory, such as Yahoo, depends on humans for
its listings. You submit a short description to the directory for
your entire site, or editors write one for sites they review. A
search looks for matches only in the descriptions submitted.
Changing your web pages has no effect on your listing. Things
that are useful for improving a listing with a search engine have
nothing to do with improving a listing in a directory. The only
exception is that a good site, with good content, might be more
likely to get reviewed for free than a poor site.
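By contrast, a directory search can be sketched as a lookup over the submitted descriptions alone. The DIRECTORY entries and the directory_search function below are hypothetical examples, not how any particular directory is built.

```python
# Hypothetical directory: site -> short, human-written description.
DIRECTORY = {
    "stamps.example": "Guide to stamp collecting for beginners",
    "trips.example": "Budget travel advice and destination reviews",
}

def directory_search(query, directory=DIRECTORY):
    """Return sites whose description contains every word of the query."""
    terms = query.lower().split()
    return sorted(
        site
        for site, description in directory.items()
        if all(term in description.lower().split() for term in terms)
    )
```

Nothing in this lookup reads the pages themselves, which is why editing a page leaves its directory listing unchanged.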
"Hybrid Search Engines" Or Mixed Results
In the web's early days, a search engine
either presented crawler-based results or human-powered listings.
Today, it is extremely common for both types of results to be
presented. Usually, a hybrid search engine will favor one type of
listings over another. For example, Yahoo is more likely to present
human-powered listings. However, it does also present crawler-based
results (as provided by Google), especially for more obscure
queries.
Search for anything using your favorite crawler-based search
engine. Nearly instantly, the search engine will sort through the
millions of pages it knows about and present you with ones that
match your topic. The matches will even be ranked, so that the most
relevant ones come first.
Of course, the search engines don't always get it right.
Non-relevant pages make it through, and sometimes it may take a
little more digging to find what you are looking for. But, by and
large, search engines do an amazing job.
As WebCrawler founder Brian Pinkerton puts it, "Imagine walking
up to a librarian and saying, ‘travel.’ They’re going to look at you
with a blank face."
OK -- a librarian's not really going to stare at you with a
vacant expression. Instead, they're going to ask you questions to
better understand what you are looking for.
Unfortunately, search engines don't have the ability to ask a few
questions to focus your search, as a librarian can. They also can't
rely on judgment and past experience to rank web pages, in the way
humans can.
So, how do crawler-based search engines go about determining
relevancy, when confronted with hundreds of millions of web pages to
sort through? They follow a set of rules, known as an algorithm.
Exactly how a particular search engine's algorithm works is a
closely kept trade secret. However, all major search engines follow
the general rules below.
Location, Location, Location...and Frequency
One of the main rules in a ranking algorithm involves the
location and frequency of keywords on a web page. Call it the
location/frequency method, for short.
Remember the librarian mentioned above? They need to find books
to match your request of "travel," so it makes sense that they first
look at books with travel in the title. Search engines operate the
same way. Pages with the search terms appearing in the HTML title
tag are often assumed to be more relevant than others to the topic.
Search engines will also check to see if the search keywords
appear near the top of a web page, such as in the headline or in the
first few paragraphs of text. They assume that any page relevant to
the topic will mention those words right from the beginning.
Frequency is the other major factor in how search engines
determine relevancy. A search engine will analyze how often keywords
appear in relation to other words in a web page. Pages with a higher
keyword frequency are often deemed more relevant than other web pages.
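A rough sketch of the location/frequency method follows. Since real algorithms are trade secrets, everything specific here is an assumption made for illustration: the function names, the weights for the title and the top of the page, and the choice of 50 words as "near the top."

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def location_frequency_score(keyword, title, body, top_words=50,
                             title_weight=3.0, top_weight=2.0):
    """Score a page for one keyword; the weights are illustrative guesses."""
    keyword = keyword.lower()
    body_words = tokenize(body)
    score = 0.0
    # Location: the keyword appears in the HTML title.
    if keyword in tokenize(title):
        score += title_weight
    # Location: the keyword appears near the top of the page.
    if keyword in body_words[:top_words]:
        score += top_weight
    # Frequency: the share of body words that are the keyword.
    if body_words:
        score += body_words.count(keyword) / len(body_words)
    return score
```

With these assumed weights, a page titled "Travel Guide" that opens with the word "travel" outscores a page that mentions the word once in passing.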
Spice In The Recipe
Now it's time to qualify the location/frequency method described
above. All the major search engines follow it to some degree, in the
same way cooks may follow a standard chili recipe. But cooks like to
add their own secret ingredients. In the same way, search engines
add spice to the location/frequency method. Nobody does it exactly
the same, which is one reason why the same search on different
search engines produces different results.
To begin with, some search engines index more web pages than
others. Some search engines also index web pages more often than
others. The result is that no two search engines have exactly the same
collection of web pages to search through. That naturally produces
differences when comparing their results.
Meta tags are what many web designers mistakenly assume are the
"secret" to propelling their web pages to the top of the rankings.
However, not all search engines read meta tags. In addition, those
that do read meta tags may choose to weight them differently.
Overall, meta tags can be part of the ranking recipe, but they are
not necessarily the secret ingredient.
Search engines may also penalize pages or exclude them from the
index, if they detect search engine "spamming." An example is when a
word is repeated hundreds of times on a page, to increase the
frequency and propel the page higher in the listings. Search engines
watch for common spamming methods in a variety of ways, including
following up on complaints from their users.
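The repeated-word example can be sketched as a simple check. The function name and both thresholds are made-up values for illustration; actual spam detection is not public and is certainly more involved.

```python
import re
from collections import Counter

def looks_like_keyword_stuffing(body, max_share=0.3, min_words=20):
    """Flag a page when one word makes up a suspiciously large share of it.

    The thresholds are assumptions chosen only for this example.
    """
    words = re.findall(r"[a-z0-9]+", body.lower())
    if len(words) < min_words:
        return False  # too short to judge
    # Find how often the single most common word occurs.
    _, top_count = Counter(words).most_common(1)[0]
    return top_count / len(words) > max_share
```

A page consisting of "stamps" repeated 200 times would be flagged, while ordinary prose, where no single word dominates, would not.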
Off The Page Factors
Crawler-based search engines have plenty of experience now with
webmasters who constantly rewrite their web pages in an attempt to
gain better rankings. Some sophisticated webmasters may even go to
great lengths to "reverse engineer" the location/frequency systems
used by a particular search engine. Because of this, all major
search engines now also make use of "off the page" ranking criteria.
Off the page factors are those that webmasters cannot easily
influence. Chief among these is link analysis. By analyzing how
pages link to each other, a search engine can both determine what a
page is about and whether that page is deemed to be "important" and
thus deserving of a ranking boost. In addition, sophisticated
techniques are used to screen out attempts by webmasters to build
"artificial" links designed to boost their rankings.
Another off the page factor is click-through measurement. In
short, this means that a search engine may watch what results
someone selects for a particular search, then eventually drop
high-ranking pages that aren't attracting clicks, while promoting
lower-ranking pages that do pull in visitors. As with link analysis,
systems are used to compensate for artificial clicks generated by
eager webmasters.
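Both off the page factors can be sketched briefly. The text does not say which methods any engine uses, so the link analysis below borrows the widely known PageRank-style iteration as a stand-in, and the click-through adjustment is an invented formula. The function names, the damping value, and the weight are all assumptions.

```python
def link_scores(links, damping=0.85, iterations=50):
    """PageRank-style importance. links maps a page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new[other] += share
        scores = new
    return scores

def rerank_by_clicks(ranked_urls, impressions, clicks, weight=5.0):
    """Reorder results by blending rank position with click-through rate."""
    def adjusted(item):
        position, url = item
        shown = impressions.get(url, 0)
        ctr = clicks.get(url, 0) / shown if shown else 0.0
        # Lower is better: start from the position, subtract a CTR bonus.
        return position - weight * ctr
    return [url for _, url in sorted(enumerate(ranked_urls), key=adjusted)]
```

In the first function, a page that many others link to ends up with the highest score. In the second, a third-place result that attracts most of the clicks can move ahead of a first-place result that attracts almost none.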
A query on a crawler-based search engine often turns up thousands
or even millions of matching web pages. In many cases, only the 10
most "relevant" matches are displayed on the first page.
Naturally, anyone who runs a website wants to be in the "top
ten" results. This is because most users will find a result they
like in the top ten. Being listed 11th or lower means that many
people may miss your website.
The tips below will help you come closer to this goal, both for
the keywords you think are important and for phrases you may not
even be anticipating.
Pick Your Target Keywords
How do you think people will search for your web page? The words
you imagine them typing into the search box are your target
keywords.
For example, say you have a page devoted to stamp collecting.
Anytime someone types "stamp collecting," you want your page to be
in the top ten results. Then those are your target keywords for that
page.
Each page in your website will have different target keywords
that reflect the page's content. For example, say you have another
page about the history of stamps. Then "stamp history" might be your
keywords for that page.
Your target keywords should always be at least two words
long. Usually, too many sites will be relevant for a single word,
such as "stamps." This "competition" means your odds of success are
lower. Don't waste your time fighting the odds. Pick phrases of two
or more words, and you'll have a better shot at success.