Monday, November 21, 2005

Working on a voluntary site .. what an experience

What an experience! Almost every programmer should at least try visiting this site if he/she hasn't already. However, if you are a programmer, you should really think of working on it! There are plenty of software projects developed there, in almost all categories, so no matter what language you use, what platform you prefer or what framework you work under, you'll find a project fitting your skills.
But what do you gain from working on SourceForge voluntarily?
Well, for starters, if you are a new programmer, a new graduate from some computer-related line of education or generally new to working in programming teams, open source development sites are your destination! You'll begin to learn the tools most teams use when developing software, how they communicate, how they change their software and how they divide the work among them. You will also have a great chance to apply what you've learnt so far in your programming career. But there is more to it!
I think the main benefit for new and intermediate developers is the chance to actually "live" the code development process. In a real-life project, you see the life cycle of a software product being developed from A to Z: what troubles real-life projects face, how they are solved, how teamwork differs from an individual effort, what tricks older programmers use and when they use them. All of this is knowledge you acquire when developing in an open source project, along with many other benefits. For experienced programmers who haven't tried open source, there are benefits too.
If you've been a programmer developing applications for a software company, then you have certainly faced many situations where you had to do something in a way you didn't like (or even, sometimes, the wrong way)! That's because companies apply policies, business plans, feasibility studies, expense and revenue impact analyses, and the list goes on with these ugly, unwanted things that we mere programmers usually don't understand (or don't want to!). In open source projects, you get a chance to see whether these policies are necessary or not. After all, you're the one who makes the plans; you're your own boss! Many programmers I know learnt more about this managerial stuff in open source software projects than in their own jobs. More importantly, they learned to embrace such policies, and even started to develop their own managerial skills.
Still not convinced? Then you must be thinking what my wife used to think when she learned I was spending a lot of time developing software that would be available for anyone on the internet to download! She used to say: "Well, spend that time on something that will really be useful to you, instead of giving something away for free. You don't get free things all the time, do you?"
Now, my wife is not a mean person. Actually, she is very nice. She simply didn't like the idea of her husband spending hours every day developing software just for some "thief" (in her opinion as a non-programmer) to come along, make use of it and maybe get all the credit in his/her company. That is probably what you are thinking too if you don't believe in open source projects. Well, here's the good news: none of this is true most of the time!
First of all, the open source model almost always reserves the original author's credit. It allows everyone to take the code, make some customizations or modifications and then maybe even sell it! However, it does not allow that person to pretend that the software is entirely his/her own. He/she must acknowledge that the parts developed by you were developed by you. In other words, he/she must explicitly tell his/her clients that this software is a modification/customization of an open source software called "---" developed by "----". This means that, in a way, that person will be introducing you to his/her clients. After all, reputation is important in the field of programming! Another crucial thing is that it allows you to do the same! Who would be better than you at developing your own software further? One more thing: it makes a hell of an impression on employers and clients to know that you are developing open source software! You may think of it as both a great line to add to your CV and great publicity!
So no matter what your motives are, and no matter what your level of experience is, open-source software development will be a good place for you.
Here are some places where you may start:
Finally, I hope this helped the newbies, amused the fathers and didn't bore the masters!

Tuesday, November 15, 2005

The Useful Internet (0102) - Web Directory & Search Services - Web Directories



Serial Number: 01
Branch Number: 02
Main Category: Web Directory & Search Services
Sub Category: Web Directories



Web Directories - a more thorough look

As mentioned earlier in post 0101, web directories are normally huge lists of sites, usually categorized by topic. In this article, we discuss in more detail the organization and structure of popular directories. Most of the examples will be based on the following sites:

Web directories can be categorized themselves! However, in the age of the computing revolution, the boundaries between categories have begun to blur. It has become harder to categorize anything as we once could. Still, there are a few ways that can help categorize what can be called a web directory:

By Directory Specialization

Some directories are general ones, such as the ODP and the Google Web Directory. Such directories will usually try to list all the sites of the world (a goal that seems impossible to achieve) and categorize them, usually by topic. Go to www.galaxy.com / dmoz.org / dir.google.com and you will usually see a list of very broad general topics. If you click any of the topics, another list of categories will appear, only this time the categories are more specific and related to the main topic you've chosen. You might even find a small list of sites along with descriptions of them. That list will be of general sites covering a wide area of the main topic you selected. Now, descriptions are what make directories special. Usually those descriptions are not extracted by some software program. Instead, each one is an edited paragraph describing the content of the website.

Other directories choose to specialize in some topic. Take the FSF/UNESCO Free Software Directory for example. They maintain a list of free software along with some detailed information about each program and where you can get it (usually a website link!). You may soon find a directory specialized in listing sport-related products and companies in Alabama (if there isn't one already!). ChefMoz, for example, is a directory (mainly informative, but it includes links if available) specialized in restaurants: http://chefmoz.org/

By Directory Focus/Flavor

Although it might seem strange at first, I like to categorize directories by their flavor! Not lemon or apple flavors, of course! Some sites have more of a "commercial" flavor. They give you the commercial entities working in the area you're searching for first, and then list other sites. This is different from the distinction between commercial and non-profit directories. It's more about what the directory targets and what kind of websites it focuses on. Excite is one of those commercially flavored directories. Others are software flavored, and so on.

By Directory Listing Type

Some directories just give you the site link and a short description of it. These are usually the general ones, as it is hard to keep detailed information about large numbers of sites (the ODP currently has about 5.1 million sites). Other directories, however, choose to provide more information about each listing. One might consider the sourceforge.net software tree a directory listing the projects and their summary pages; it provides more details about each project than just a short description.

By Directory Editing Mechanism

Some directories are volunteer-based (edited by volunteers, such as the ODP), others are edited by employees (such as Google's Directory), and some rely on software to categorize the sites before human editors check the listings (Google might be using this kind of approach).

Useful Sites and Directories:

Web Directories:

The Open Directory Project(ODP): http://dmoz.org

A volunteer-based, not-for-profit web directory. This directory provides a free dump of its data to the public besides a search facility and, of course, the directory interface itself. Sites are not ranked in the ODP. Instead, if a site is superior to its counterparts in the same category, it is "cooled" to indicate it is special. Even this is discouraged from being done often. However, it is usually based on the content of the website, which makes it realistic. A great place to start your information mining process! A useful article about the ODP can be found on Wikipedia at: http://en.wikipedia.org/wiki/Open_Directory_Project

The Google Directory: http://dir.google.com

This is built upon the ODP's data but, as a commercial entity, Google's directory is updated more often. The search facility is much stronger in Google (a company legacy, I guess). However, one downside is that it is commercial, meaning it relies on profit from advertisements, so some of the sites are ranked higher only because they pay more and not because their content is really more relevant.

The Galaxy Web Directory: http://www.galaxy.com/directory

The site is great even though it does not contain as many links as Google. It is human-edited in a way that avoids adult and hate content, it is very strict on relevance, and it displays those who pay in a separate section called Featured Listings. A good point to start from.

The Yahoo! Web Directory: http://dir.yahoo.com

Although it is one of the oldest, and also one of the best, the advertisements make it hard to concentrate when using the Yahoo! interface in general. However, everyone who was on the internet before 2000 knows Yahoo! and its directory! No hands, just dive in!

About.Com InfoDirectory: http://www.about.com

I like to call it InfoDirectory because it is more about information than sites. It does list sites, but within articles about a certain topic. I usually recommend this for people with no particular goal who want to spend some time getting info about certain topics. Be cautioned, however! I started out looking for info on computer internet technologies and ended up looking at someone's personal weblog (it was a nice blog after all!), so this can take you to places where you can spend MORE time than you really want! It's fun, however, and can be very useful if used well.

Specialized Directories

http://www.educational-software-directory.net

http://library.albany.edu A great research Directory!!

http://www.vcanet.org American Virtual Community of Associations

http://groups.google.com Google's Groups' Directory

http://www.cyberfiber.com directory to USENET and alt. Newsgroups

http://www.epistemelinks.com/ EpistemeLinks includes over 18,500 categorized links to philosophy resources on the Internet and has several additional features

Finally: I hope this article helped the newbies, amused the fathers and didn't bore the masters!

Please give me your opinion about this article if you read it so that I can avoid mistakes and make this series better and more useful to everyone. Comments are open to everyone, and your opinions, suggestions and comments are all welcome.

Monday, November 07, 2005

The Useful Internet (0101) - Web Directory & Search Services - Introduction


Serial Number: 01
Branch Number: 01
Main Category: Web Directory & Search Services
Sub Category: Definitions and general characteristics


Definitions and Terms

Web directory services are simply websites that list the addresses of other websites, categorized in an organized manner, mostly by the topic of the websites they list. In other words, what you will usually see on the front page of any directory service is a small listing of broad general topics that, when clicked, take you to another categorized listing of subtopics of the main topic, and so on. When a site's content falls under a certain topic or subtopic, it is listed there. This is the general form of directory services. However, specialized directory services also exist, which can be thought of as branches of general directories. For example, you may find directories listing only pages concerned with sports, categorized first by broad kind of sport and then by specific branches of each sport. This is also considered a web directory service.
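To make the "topics, then subtopics, then sites" idea concrete, here is a minimal sketch of a directory as a nested tree. The category names, the example.org URLs and the browse function are all made up for illustration; this is not how any real directory stores its data.

```python
# A hypothetical, tiny category tree. Every name and URL here is illustrative.
directory = {
    "Sports": {
        "sites": ["http://example.org/all-sports"],
        "children": {
            "Ball Games": {
                "sites": [],
                "children": {
                    "Football": {
                        "sites": ["http://example.org/football"],
                        "children": {},
                    },
                },
            },
        },
    },
    "Computers": {"sites": [], "children": {}},
}


def browse(tree, path):
    """Follow a list of clicked category names and return what the visitor sees.

    Returns a pair: (sorted subcategory names, sites listed at this level).
    """
    # Treat the front page as a node with no sites of its own.
    node = {"sites": [], "children": tree}
    for name in path:
        node = node["children"][name]
    return sorted(node["children"]), node["sites"]


print(browse(directory, []))                   # front page: broad topics only
print(browse(directory, ["Sports"]))           # one click down
print(browse(directory, ["Sports", "Ball Games", "Football"]))
```

Each click only narrows the view by one level, which is exactly why deep trees take a while to browse.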

On the other hand, search services began as huge listings of sites accessible by searching for keywords. You type "iced tea restaurants" and the "engine" starts "searching" for all sites that mention these three words, regardless of their real content or categorization. This might bring up a site carrying an interview with a football player whose favorite drink is iced tea taken in a restaurant! It might even bring up a restaurant that states explicitly on its site that they DON'T offer iced tea. Not quite the results you will be looking for in most cases!
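The "iced tea restaurants" problem is easy to reproduce with a naive keyword matcher. The sketch below is only an illustration of that early behaviour under my own assumptions: the pages and their example.org addresses are invented, and real engines are far more elaborate.

```python
# Hypothetical page texts keyed by made-up URLs.
pages = {
    "http://example.org/interview":
        "the football player said his favorite drink is iced tea in restaurants",
    "http://example.org/bistro":
        "unlike other restaurants we do not offer iced tea",
    "http://example.org/teahouse":
        "one of the few restaurants serving iced tea all year",
    "http://example.org/hardware":
        "hammers nails and paint",
}


def naive_search(query, pages):
    """Return every URL whose text contains all the query words.

    No context, no categories, no meaning: just word matching.
    """
    words = query.lower().split()
    return sorted(
        url
        for url, text in pages.items()
        if all(word in text.lower().split() for word in words)
    )


for url in naive_search("iced tea restaurants", pages):
    print(url)
```

The interview and the bistro that does not serve iced tea both match, right next to the one page the visitor actually wanted.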

Each of the two models has its advantages and disadvantages. With a directory, it may take a long time to find the site you are looking for (especially with a slow connection), as you have to browse down the tree of categories to the right one (this can be as deep as twenty or more levels), while search engines give you the result directly without browsing. Directories also usually face the problem of ambiguous categorization. If you are looking for the "Contoso Airlines Company" website, you might decide to start at "travel and outdoors". However, the human who categorized the site might have put it under businesses and companies, and you will end up searching the whole directory without finding your destination. This forces the directory "categorizers", usually called "editors", to put the same site under many categories at the same time. Starting from a huge category, the list of sites under a single branch of the directory might then get as large as a search engine's result list, so the very reason directories are good becomes the very reason they are bad. Search engines also suffer their own problems. The biggest of all is unrelated sites, like the ones we already mentioned above in the "iced tea restaurants" example.

Those problems led most modern directories to add their own search facilities and most search engines to add their own web directories. This created a new searching hybrid of both technologies, which is the common situation today. Although each of the two technologies is still developing independently of the other in techniques and methodologies, they still need each other, so you hardly ever see them separated. When we speak of a directory nowadays, we usually speak of a part of some search engine, and when we speak of a search engine, it can be thought of as being part of, or supporting, a directory. In these articles, however, we will refer to them as separate components.

Recently, a newer searching mechanism has surfaced: data mining (also called meta search). It is not a totally new concept, but it has found its place on the web lately after some major advancements to the old concept were made. These sites offer a service that will "mine" for data from different data sources and give you the result. Some of them will simply propagate your search to a collection of other search engines, then filter out the unrelated results and give you the final list of sites. Others will propagate the search to data sources other than search engines before they filter and merge the results. For example, your query might be sent to a number of online libraries, encyclopedias, search engines, dictionaries and references before the results are filtered and then merged to give you "information" about what you are looking for, followed by a list of related sites.
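The "propagate, filter, merge" flow can be sketched in a few lines. Everything below is an assumption made for illustration: engine_a and engine_b are stand-ins that return fixed, made-up results instead of calling real services, and the relevance filter is whatever function the caller passes in.

```python
def engine_a(query):
    # Hypothetical source; a real one would contact a search service.
    return ["http://example.org/a", "http://example.org/b"]


def engine_b(query):
    # Hypothetical second source whose results partly overlap the first.
    return ["http://example.org/b", "http://example.org/c"]


def meta_search(query, sources, is_relevant):
    """Send the query to every source, drop unrelated and duplicate
    results, and return one merged list in first-seen order."""
    merged = []
    seen = set()
    for source in sources:
        for url in source(query):
            if url in seen or not is_relevant(url):
                continue
            seen.add(url)
            merged.append(url)
    return merged


# Pretend the filter decided that the ".../c" result is unrelated.
results = meta_search(
    "iced tea restaurants",
    [engine_a, engine_b],
    is_relevant=lambda url: not url.endswith("/c"),
)
print(results)
```

The hard part in practice is the filter, which this sketch deliberately leaves as a plug-in function.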

General Characteristics and Structure

Web Directories

  • Analogy: phone directories, notebooks, etc.
  • Structure: a community of (human) editors who are responsible for reviewing the list of sites, ordering and categorizing them after planning, organizing, implementing and administering a category structure.

Search Engines

  • Analogy: Yellow Pages, alphabetical phone address books (by name)
  • Structure: a piece of software called a spider/crawler/search-bot that "crawls" the internet for sites, using one site to reach others and so on in a recursive manner. The crawler stores keywords about each site it visits, which forms a large database of sites, site addresses and keywords that is later searched using an "engine" (also a piece of software).
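The crawler-plus-engine structure described above can be sketched over a pretend web held in memory. The page names, texts and links are invented, and the crawl uses a queue instead of literal recursion, which is my own simplification; it follows links from page to page in the same spirit.

```python
from collections import deque

# A hypothetical miniature "web": each page has some text and outgoing links.
web = {
    "a": {"text": "iced tea recipes", "links": ["b", "c"]},
    "b": {"text": "green tea shop", "links": ["a"]},
    "c": {"text": "football news", "links": ["d"]},
    "d": {"text": "tea and football", "links": []},
    "e": {"text": "unlinked page about tea", "links": []},
}


def crawl(web, start):
    """Visit every page reachable from start and build a keyword database
    mapping each word to the set of pages that contain it."""
    index = {}
    visited = set()
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in visited or url not in web:
            continue
        visited.add(url)
        for word in web[url]["text"].lower().split():
            index.setdefault(word, set()).add(url)
        queue.extend(web[url]["links"])
    return index


def search(index, query):
    """The "engine": return the pages that contain every query word."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []


index = crawl(web, "a")
print(search(index, "tea"))
print(search(index, "tea football"))
```

Note that page "e" never shows up in any result: nothing links to it, so the crawler never reaches it.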

Data Mining Services

  • Analogy: the closest thing is a search engine searching a complete directory
  • Structure: depends on the type and settings of the service. Usually a piece of software that filters and merges results from other data sources. Permission is usually obtained through commercial agreements.

Examples:

"This is only a set of examples, not a comprehensive or complete list. In the next post, there will be more sites under each category along with reviews about them that might help you choose what suits you best! We will also recommend a set of these facilities to be used often at the beginning of the next post."

This article summarized the differences between the three major types of search facilities. The next articles will concentrate on the specifics of some of the most popular and widely used of these services, with more details, links and examples.

Please give me your opinion about this article if you read it so that I can avoid mistakes and make this series better and more useful to everyone. Comments are open to everyone, and your opinions, suggestions and comments are all welcome.

The Useful Internet (0000) - Organization of the posts


Serial Number: 00
Branch Number: 00
Main Category: Posts within "The Useful Internet" Series of Articles Guide
Sub Category: Organization of the posts and introductory commenting and usage guide

This is the first post in a series of articles I intend to write in which I will summarize some of the most useful sites on the internet (at least in my own opinion). Here's how it will go:

  • Every post will address a certain general category or sub-category of sites. For example, a post will address "Web Directory Services".
  • Subsequent posts will have a small header at the beginning identifying the group/category/subcategory of sites being discussed. The posts will be given serial numbers according to the category they address. For example, all posts discussing web directory and search services will be numbered 01, the next topic will have 02 as its number, and so on.
  • Some categories will be discussed in more than one post, each either discussing one subcategory of the main category or simply continuing a long discussion of the category. Those will be given branch numbers, so "open web directory services" might be given the serial number 01 and the branch number 01, making 0101 the post number!
  • Headers at the top of each post will contain this information.
  • Now suppose I didn't mention anything about Yahoo! when I discussed search engines, and you think that it is the best search engine ever created. All you have to do is post a comment on the appropriate post (the one discussing commercial search services) mentioning that there is a search engine called Yahoo!, where it can be found, what its main features are, and what other services it offers. This way, we might get something out of this blog in the long run.
  • Another thing: if you have a question related to the topic of one of the posts that is not answered by that post, post the question as a comment so others (including me) may reply to it.
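For the programmers among you, the numbering scheme in the list above boils down to two zero-padded numbers glued together. The helper names below are made up for this illustration; the sketch assumes serial and branch numbers stay between 0 and 99.

```python
def post_number(serial, branch):
    """Combine a serial number and a branch number into a four-digit post number."""
    return f"{serial:02d}{branch:02d}"


def parse_post_number(number):
    """Split a four-digit post number back into (serial, branch)."""
    return int(number[:2]), int(number[2:])


print(post_number(0, 0))          # this organizational post
print(post_number(1, 1))          # web directory & search services, first branch
print(parse_post_number("0102"))  # serial 1, branch 2
```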

Simple enough, I guess.

This format is also subject to discussion and further refinement. It keeps me comfortable as it reminds me of the good old-fashioned Usenet newsgroups!

What to post here ..?

After a long while off the road working, I finally have some time to work on my projects. My last job turned out to be a disaster, but it made me learn more about myself and about what I want to do next, which is good! I'll start posting more often to this blog so that anyone interested in the topics has something to read. Why? I don't know!

I've been trying to figure out what to post in such a blog and then found one good start (in my opinion), which is a series of posts about places to go on the internet. I work as an editor in the Open Directory Project, which has made me visit lots and lots of websites in lots and lots of fields and flavors, and I think this is a good experience to share with others. I have chosen the title "The Useful Internet" as a common name for the series, so whenever you see a post called The Useful Internet, know that it will contain information about sites that might be of good use mainly to programmers (and sometimes to others as well). I'm also currently thinking of other generic series of articles, but they are only prototype ideas, and this one will give me an idea of whether it is good to go on or not. I'll post again when something new happens.
That's it for now.
Enjoy!