Monday, November 07, 2005

The Useful Internet (0101) - Web Directory & Search Services - Introduction


Serial Number: 01
Branch Number: 01
Main Category: Web Directory & Search Services
Sub Category: Definitions and general characteristics


Definitions and Terms

Web directory services are simply websites that list the addresses of other websites, organized into categories, mostly by the topic of the sites they list. In other words, what you will usually see on the front page of any directory service is a short list of broad general topics. Clicking one takes you to another categorized list of its subtopics, and so on. When a site's content falls under a certain topic or subtopic, it is listed there. This is the general form of directory services. However, specialized directory services also exist, and they can be thought of as branches of general directories. For example, you may find a directory that lists only pages concerned with sports, categorized first by broad kind of sport and then by specific branches of each sport. This is also considered a web directory service.
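
To make the tree of topics and subtopics concrete, here is a minimal sketch of a directory and of what "clicking down" through it looks like. The category names, the example addresses and the browse function are all made up for illustration; no real directory service is being described.

```python
# Hypothetical directory: each category holds its own sites plus nested subtopics.
DIRECTORY = {
    "Sports": {
        "sites": [],
        "subtopics": {
            "Ball Games": {
                "sites": ["ballgames.example"],
                "subtopics": {
                    "Football": {"sites": ["football.example"], "subtopics": {}},
                    "Tennis": {"sites": ["tennis.example"], "subtopics": {}},
                },
            },
        },
    },
    "Travel and Outdoors": {"sites": ["hiking.example"], "subtopics": {}},
}


def browse(path):
    """Follow a list of category names down the tree, one click per level.

    Returns the subtopics shown at that level and the sites listed there.
    """
    subtopics, sites = DIRECTORY, []
    for name in path:
        category = subtopics[name]
        subtopics, sites = category["subtopics"], category["sites"]
    return sorted(subtopics), sites


# The front page only shows the broad topics.
print(browse([]))
# Two clicks later we see narrower subtopics and the sites filed at this level.
print(browse(["Sports", "Ball Games"]))
```

Each extra level in the path is one more page the visitor has to load, which is why deep directories can feel slow to use.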

On the other hand, search services began as huge listings of sites accessible by searching for keywords. You type "iced tea restaurants" and the "engine" starts "searching" for all sites that mention these three words, regardless of their real content or categorization. This might bring up a site with an interview with a football player whose favorite drink is iced tea taken in a restaurant! It might even bring up a restaurant that states explicitly on its site that it DOESN'T offer iced tea. Not quite the results you are looking for in most cases!
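
The "iced tea restaurants" problem is easy to reproduce with a few lines of code. The sketch below is only an assumption about how a very naive keyword match might work (the page texts, addresses and function name are invented); it returns every page that contains all the query words and pays no attention to what the page is really about.

```python
import re

# Hypothetical pages: address -> text. None of these are real sites.
PAGES = {
    "interview.example": "The striker says his favorite drink is iced tea, "
                         "and he has tried it in restaurants all over the city.",
    "diner.example": "Unlike other restaurants, we do not offer iced tea.",
    "teahouse.example": "A guide to restaurants that serve fresh iced tea.",
    "weather.example": "Weather forecast for the weekend.",
}


def keyword_search(query):
    """Return every page whose text contains all of the query words."""
    wanted = set(query.lower().split())
    results = []
    for address, text in PAGES.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        if wanted <= words:  # every query word appears somewhere on the page
            results.append(address)
    return results


print(keyword_search("iced tea restaurants"))
```

The football interview and the diner that does not serve iced tea both come back alongside the one page the searcher actually wanted, because word matching alone cannot tell them apart.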

Each of the two models has its advantages and disadvantages. Directories may take a long time to get you to the site you are looking for (especially over a slow connection), because you have to browse down the tree of categories to the right one (this can be twenty or more levels deep), while search engines give you results directly without browsing. Directories also usually face the problem of unclear categorization and ambiguity. If you are looking for the "Contoso Airlines Company" website, you might decide to start at "travel and outdoors". However, the human who categorized the site might have put it under "businesses and companies", and you will end up searching the whole directory without finding your destination. This forces the directory "categorizers", usually called "editors", to put the same site under many categories at the same time. In a huge category, the list of sites under a single branch of the directory might then get as large as a search engine's list of results, so the very thing that makes directories good becomes the very thing that makes them bad. Search engines also suffer from their own problems. The biggest of all is unrelated sites, like the ones we already mentioned in the "iced tea restaurants" example.

Those problems led most modern directories to add their own search facilities and most search engines to add their own web directories. This created a hybrid of both technologies, which is the common situation today. Although each of the two technologies is still developing independently of the other in techniques and methodologies, they still need each other, so you hardly ever see them separated. When we speak of a directory nowadays, we usually speak of a part of some search engine, and when we speak of a search engine, it can be thought of as being part of or supporting a directory. In these articles, however, we will refer to them as separate components.

Recently, a new searching mechanism has surfaced: data mining (the kind of service described here is also called meta search). It is not a totally new concept; however, it found its place on the web only lately, after some major advancements to the old concept had been made. These sites offer a service that will "mine" for data from different data sources and give you the result. Some of them simply propagate your search to a collection of other search engines, filter out the unrelated results and give you the final list of sites. Others propagate the search to data sources other than search engines before they filter and merge the results. For example, your query might be sent to a number of online libraries, encyclopedias, search engines, dictionaries and references before the results are filtered and then merged to give you "information" about what you are looking for, followed by a list of related sites.
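
The propagate, filter and merge steps can be sketched roughly as follows. Everything here is hypothetical: the three "sources" are plain functions returning canned lists, the relevance filter is a simple block list, and the scoring rule (a result earns more the higher it ranks in each source) is just one possible way to merge, not how any particular service does it.

```python
# Hypothetical data sources: each takes a query and returns addresses, best first.
def engine_a(query):
    return ["teahouse.example", "diner.example", "interview.example"]


def engine_b(query):
    return ["diner.example", "teahouse.example"]


def online_library(query):
    return ["encyclopedia.example/iced-tea", "teahouse.example"]


# Stand-in for a real relevance filter: addresses we assume are unrelated.
UNRELATED = {"interview.example"}


def meta_search(query, sources):
    """Send the query to every source, drop unrelated results, merge the rest."""
    scores = {}
    for source in sources:
        for rank, address in enumerate(source(query)):
            if address in UNRELATED:
                continue  # filter step
            # Merge step: first place is worth 1, second 1/2, third 1/3, ...
            scores[address] = scores.get(address, 0) + 1 / (rank + 1)
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda address: (-scores[address], address))


print(meta_search("iced tea restaurants", [engine_a, engine_b, online_library]))
```

A site that several sources agree on floats to the top, while a result that only one source returned still makes the final list further down.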

General Characteristics and Structure

Web Directories

  • Analogy: phone directories, notebooks, etc.
  • Structure: a community of (human) editors who are responsible for reviewing the listed sites, ordering them and categorizing them, after planning, organizing, implementing and administering a category structure.

Search Engines

  • Analogy: Yellow Pages, alphabetical phone address books (by name)
  • Structure: a piece of software called a spider/crawler/search-bot "crawls" the internet for sites, using one site to reach others and so on in a recursive manner. The crawler stores keywords about each site it visits, which forms a large database of sites, site addresses and keywords that is later searched using an "engine" (also a piece of software).
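
The crawler-plus-engine structure can be illustrated with a toy example that never touches the network. The "web" below is an invented dictionary of pages and links, and the sketch uses a queue of pages to visit instead of literal recursion; real crawlers are far more involved, so treat this only as an outline of the idea.

```python
import re
from collections import deque

# Hypothetical in-memory "web": address -> (page text, links to other pages).
WEB = {
    "a.example": ("Fresh iced tea recipes", ["b.example", "c.example"]),
    "b.example": ("Football news and scores", ["a.example"]),
    "c.example": ("Tea houses and restaurants", ["d.example"]),
    "d.example": ("Restaurants serving iced tea", []),
    "e.example": ("An unlinked page about tea", []),
}


def crawl(start):
    """Visit every page reachable from start and build a keyword database."""
    index = {}  # keyword -> set of addresses that contain it
    seen = {start}
    queue = deque([start])
    while queue:
        address = queue.popleft()
        text, links = WEB[address]
        for word in set(re.findall(r"[a-z]+", text.lower())):
            index.setdefault(word, set()).add(address)
        for link in links:  # use this site to reach the others
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """The "engine": look up pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index = crawl("a.example")
print(search(index, "iced tea"))
print(search(index, "tea"))
```

Note that e.example never shows up in any result: no crawled page links to it, so the crawler has no way to discover it.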

Data Mining Services

  • Analogy: the closest thing is a search engine searching a complete directory
  • Structure: depends on the type and settings of the service. Usually a piece of software that filters and merges results from other data sources. Permission to use those sources is usually obtained through commercial agreements.

Examples:

"This is only a set of examples, not a comprehensive
or complete list. In the next post, there will be more sites under each
category, along with reviews that might help you choose what suits
you best! We will also recommend a set of these facilities for frequent use at
the beginning of the next post."

This article summarized the differences between the three major types of search facilities. The next articles will concentrate on the specifics of some of the most popular and widely used of these services, with more details, links and examples.

Please give me your opinion about this article if you read it, so that I can avoid mistakes and make this series better and more useful to everyone. Comments are open to everyone, and your opinions, suggestions and comments are all welcome.
