In the context of computers and the internet, a search engine is a program that helps locate information. The user supplies a query, the search engine processes it and the results are displayed.
As far as the Internet is concerned, a web search engine helps you find information on the web based on the query you supply. The results can be web pages, text documents (PDF or Word files), videos, images and so on, or a combination of these.
Google, Yahoo!, Bing and AOL are some of the famous web search engines – for more, refer to the major search engines and directories list.
Without search engines, it would be impossible to find something on the web. I know it’s a cliché but it would be like searching for a needle in a haystack… a haystack that’s miles and miles across.
Is a web search engine just one program? No! There are five main ingredients to it: the interface, the database, the bot, the indexing program and the search program.
We shall now look at each to understand web search engines better.
Most people think of the interface as the actual search engine. It is not! For instance, what you see on www.google.com is the interface – a text field in which you type your search query, a couple of buttons and a few links. The real Google search engine is the one that works behind the scenes – it wakes up when a query is entered in the blank field and the search button is clicked.
A typical web search engine interface has a text field, in which the surfer needs to enter their query, and a submit button that passes the query to the actual search engine program. This interface is either presented on a web page or may be a part of another program such as the web browser or the add-on browser toolbar.
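To make the hand-off from the interface to the program behind it a little more concrete, here is a minimal sketch in Python. It assumes the query travels as a `q` parameter in a URL; the endpoint `search.example.com` and the function names are made up for illustration and do not describe how any real search engine wires this up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, used only for illustration.
SEARCH_ENDPOINT = "https://search.example.com/search"


def build_search_url(query: str) -> str:
    """What the interface does: wrap the typed text into a request URL."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query.strip()})}"


def read_query(url: str) -> str:
    """What the program behind the interface does first: pull the query out."""
    return parse_qs(urlsplit(url).query)["q"][0]


url = build_search_url("  needle in a haystack ")
print(url)
print(read_query(url))
```

The interface only packages the text; everything interesting happens after `read_query` hands it over.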
By far, the simplest web page search engine interface is that of Google and I guess that’s one of the reasons it became so famous. Google has been very particular and “careful” about its interface. Ever since it was launched, the Google homepage has been simplicity personified – a logo, a text field and search buttons. Distracting elements are altogether absent and because of that the interface loads very fast even on slow internet connections.
When you run a query on a web search engine, the program doesn’t look at each web site on the Internet to hunt for the required information. That would be practically impossible: it would take an immense amount of time, and you would have aged a few years before the results came back.
So what happens and how are the search results displayed so quickly?
Each web search engine keeps a repository of web pages. This collection is stored in a database. Furthermore, the web pages in this database are indexed (or organized, if you don’t like the fancy word) based on the information (text, images…) they contain. This indexing is very important and is responsible for rapid searching for the required web pages based on your query.
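One common way to organize pages by the words they contain is an inverted index: a map from each word to the pages in which it appears. The sketch below is only an illustration of that idea on three made-up pages; real search engines keep their indexing methods secret, so none of this should be read as how Google or anyone else does it.

```python
import re
from collections import defaultdict

# Tiny made-up repository of pages, keyed by URL.
PAGES = {
    "example.com/a": "Search engines index web pages",
    "example.com/b": "A spider crawls web pages",
    "example.com/c": "Engines rank results",
}


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


index = build_index(PAGES)
print(sorted(index["web"]))
print(sorted(index["engines"]))
```

Looking up a word is now a single dictionary access instead of a scan through every page, which is why indexed searches come back so quickly.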
The main job of a search engine bot is to go around the web, hunt for new information and add to or update the database. The bot follows links from web pages quite like you do when you click on one. However, it moves from one web site to another like a “spider” without human involvement. On finding a new web page, it sends the information back, which is then stored in the database. The same happens when it finds a web page that has been changed or deleted.
Because of the dynamic nature of the internet (with information constantly being added, changed and deleted), it is virtually impossible for a bot to keep the database fully up to date. There simply cannot be a current snapshot of the web. This means the results you get for your query may not include web pages that were added a few seconds ago (sometimes even hours or days ago).
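The way a bot hops from link to link can be sketched as a breadth-first walk over a link graph. To keep the example runnable without touching the network, the "web" here is a small made-up dictionary of page names; a real bot would download and parse actual pages, and the names below are purely hypothetical.

```python
from collections import deque

# Made-up link graph standing in for the web: page -> pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post-1", "post-2"],
    "post-1": ["blog"],
    "post-2": ["post-1", "home"],
    "orphan": ["home"],  # nothing links to this page
}


def crawl(start, links):
    """Follow links breadth-first from `start`, visiting each page once."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        page = queue.popleft()
        visited.append(page)  # here a real bot would store the page
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


print(crawl("home", LINKS))
```

Notice that `orphan` never shows up: since no page links to it, a bot that only follows links has no way of discovering it.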
At the heart of it all is the search engine indexing program. This program is in charge of organizing and segregating the information which the bot gathers and stores in the database. It’s also responsible for getting you relevant results based on your query.
The indexing of online information (web pages and their contents) involves complicated algorithms and processes. And these are closely guarded secrets because the success of the search engine depends on them. Google, Yahoo!, Bing and AOL all have different algorithms for indexing web pages, which is apparent from the different search results they display for the same query.
The web search engine program takes the query provided by the user, runs it through the indexed database and provides the results. Note: the relevancy of the results depends on how the information has been indexed – the actual search program simply goes through the index and presents the results.
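Here is a minimal sketch of "running a query through the index". It assumes a hand-written index that maps each word to the set of pages containing it, and it returns only the pages that contain every word of the query. The index contents and the all-words-must-match rule are my assumptions for the example; real engines rank results with far more elaborate, unpublished methods.

```python
# Hypothetical pre-built index: word -> set of page URLs containing it.
INDEX = {
    "web": {"example.com/a", "example.com/b"},
    "search": {"example.com/a"},
    "spider": {"example.com/b"},
    "pages": {"example.com/a", "example.com/b", "example.com/c"},
}


def search(query, index):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # One set of pages per query word; unknown words match nothing.
    postings = [index.get(word, set()) for word in words]
    return sorted(set.intersection(*postings))


print(search("web pages", INDEX))
print(search("spider pages", INDEX))
print(search("needle", INDEX))
```

The search function itself does very little work; the quality of what it returns is decided entirely by what was put into the index beforehand.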
Most web search engines follow the same basic format in presenting the results – web pages are listed one after another, 10 or 20 at a time. More results (if found) are displayed on additional pages.
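Splitting a long result list into pages of 10 is simple slicing. The sketch below assumes the results are already in their final order; the function name and the 25 placeholder results are made up for the example.

```python
def paginate(results, page, per_page=10):
    """Return the slice of `results` shown on the given 1-based page."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    return results[start:start + per_page]


# 25 placeholder results: two full pages and one partial page.
results = [f"result-{i}" for i in range(1, 26)]

print(paginate(results, 1))
print(paginate(results, 3))
print(paginate(results, 4))
```

A page past the end of the list simply comes back empty, which matches the "more results (if found)" behaviour.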
Do you now understand the complexities of a web search engine? The simple interface which you see on the web page of Google.com is just the tip of the iceberg. There is so much more that web search companies do to make it easy for all of us. I hope you’ve found this article entertaining and useful; if so, drop me a comment.