Search Engine

Submitted By perrine88
Words 1212
Pages 5
Functions of Internet Search Engines

A search engine is a software program that is continually modified to take advantage of the latest technologies in order to provide improved search results. Every search engine performs the same functions of collecting, organizing, indexing and serving results, but each does so in its own way, using algorithms and techniques that are its trade secrets. In short, the functions of a search engine can be categorized into the following:

1. Crawling the internet for web content.

2. Indexing the web content.

3. Storing the website contents.

4. Search algorithms and results.

Crawling and Spidering the Web

Crawling is the method of following links on the web to different websites and gathering the contents of those websites for storage in the search engine's databases. A crawl can start afresh (from a popular website containing many links, such as Yahoo) or from an existing, older index of websites. The crawler (also known as a web robot or a web spider) is a software program that downloads web content (web pages, images, documents and other files) and then follows the hyperlinks within that content to download the linked content. The linked content can be on the same site or on a different website.
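The link-following step described above can be illustrated with a minimal sketch that uses only the Python standard library. The class name LinkExtractor, the helper extract_links and the example.* URLs are hypothetical names chosen for illustration; a real crawler would also handle images, documents and malformed markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the targets of <a href="..."> tags as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    extractor = LinkExtractor(base_url)
    extractor.feed(html)
    return extractor.links


links = extract_links(
    "http://a.example/docs/",
    '<a href="page.html">same site</a> <a href="http://b.example/">other site</a>',
)
```

Here the first link resolves to a page on the same site and the second points to a different website, which are the two cases the paragraph mentions.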

Crawling continues until it reaches a logical stop, such as a dead end with no external links or a set number of levels inside the website's link structure. If a website is not linked from other websites on the internet, the crawler will be unable to locate it. Therefore, a new website that has no links from other sites has to be submitted to each search engine for crawling.
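As a rough illustration of these stopping rules, the sketch below performs a breadth-first crawl over a small in-memory link graph instead of the real web. TOY_WEB, crawl and max_depth are hypothetical names, and the graph is invented for the example.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the URLs it links to.
TOY_WEB = {
    "http://a.example/": ["http://a.example/about", "http://b.example/"],
    "http://a.example/about": ["http://a.example/"],
    "http://b.example/": ["http://b.example/deep"],
    "http://b.example/deep": ["http://b.example/deeper"],
    "http://b.example/deeper": [],
    "http://orphan.example/": [],  # no inbound links, so never discovered
}


def crawl(seed, fetch_links, max_depth):
    """Breadth-first crawl from seed, following links up to max_depth levels."""
    seen = {seed}
    order = []
    queue = deque([(seed, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # logical stop: the level limit has been reached
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order


pages = crawl("http://a.example/", lambda url: TOY_WEB.get(url, []), max_depth=2)
```

With a limit of two levels the crawl never reaches http://b.example/deeper, and the orphan site is never found because nothing links to it, which is why such a site would have to be submitted to the search engine directly.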

To collect billions of pages as frequently as possible, an efficient crawler crawls multiple websites at the same time. Advanced search engines like Google crawl news and media sites more frequently (every hour or so) in order to deliver up-to-date news and content in their search results. The crawler also avoids flooding a single website with a high volume of simultaneous requests; it spreads the crawl over a period of time so that the website does not crash. Search engines usually crawl only a few (three or four) levels deep from the homepage of a website. The term deep crawl denotes a crawler or spider that can index pages many levels deep. Google is an example of a deep crawler.
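One simple way to spread requests over time is to enforce a minimum delay between requests to the same host while letting different hosts proceed in parallel. The sketch below only computes a schedule and sends nothing; the function name schedule, the two-second delay and the example URLs are assumptions made for illustration, not values any particular search engine is known to use.

```python
from urllib.parse import urlsplit


def schedule(urls, min_delay=2.0):
    """Assign each URL a start time (in seconds) so that requests to the
    same host are at least min_delay apart; different hosts may overlap."""
    next_free = {}  # host -> earliest time the next request may start
    plan = []
    for url in urls:
        host = urlsplit(url).netloc
        start = next_free.get(host, 0.0)
        plan.append((start, url))
        next_free[host] = start + min_delay
    return sorted(plan)


plan = schedule([
    "http://news.example/1",
    "http://news.example/2",
    "http://blog.example/1",
    "http://news.example/3",
])
```

In this plan the two hosts are contacted at the same moment, but the three requests to news.example are spaced two seconds apart.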

Crawlers or web robots follow guidelines specified for them by the website owner using the robots exclusion protocol (robots.txt). The robots.txt file specifies the files or folders that the owner does not want the crawler to index. Many search engine crawlers do not like unfriendly URLs, such as those generated by database-driven websites. These URLs contain parameters after the question mark (such as http://somedomain.com/article.php?cat=1&id=3). Search engines dislike such URLs because a website can overwhelm the crawler by using parameters to generate thousands of new web pages with similar content for indexing. Thus, crawlers often do not treat a change in the parameters as a new URL to spider.
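A crawler's robots.txt check can be sketched with Python's standard urllib.robotparser module. The rules in ROBOTS_TXT, the /private/ and /tmp/ folders and the user agent name ExampleBot are made up for this example; the rules are parsed from a string so that nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this file
# from the root of the website before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

allowed = parser.can_fetch("ExampleBot", "http://somedomain.com/article.php")
blocked = parser.can_fetch("ExampleBot", "http://somedomain.com/private/notes.html")
```

A well-behaved crawler would skip any URL for which can_fetch returns False.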

Search engine friendly URLs are used to compensate for this problem.
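To see how parameter variants can be collapsed on the crawler's side, the sketch below drops the query string before deciding whether a URL is new. The helper name canonical is hypothetical, and discarding every parameter is a deliberately blunt assumption: it also merges pages whose content really does differ, which is one reason site owners prefer friendly URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical(url):
    """Drop the query string and fragment so parameter variants collapse."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


urls = [
    "http://somedomain.com/article.php?cat=1&id=3",
    "http://somedomain.com/article.php?cat=1&id=4",
    "http://somedomain.com/about.html",
]
unique = sorted({canonical(url) for url in urls})
```

Both article.php variants reduce to a single URL, so the crawler would queue that page only once.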

Indexing the Web Content

Similar to the index of a book, a search engine extracts and builds a catalog of all the words that appear on each web page, along with details such as the number of times each word appears on that page. Indexing web content is a challenging task, assuming an average of 1000 words per web page and billions of such pages. Because indexes are used for searching by keywords, they have to be stored in the memory of computers to provide quick access to the search results.
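The catalog described here is commonly called an inverted index. A minimal sketch, assuming whitespace-separated words and two tiny made-up pages (p1 and p2 are placeholder identifiers), might look like this:

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each word to the pages it appears on and its count per page."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word, count in Counter(text.lower().split()).items():
            index[word][url] = count
    return index


pages = {
    "p1": "search engines crawl the web",
    "p2": "the web is big and the web grows",
}
index = build_index(pages)
```

Looking up a keyword is then a single dictionary access; for example, index["web"] lists both pages together with how often the word occurs on each.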

Indexing starts with parsing the website content using a parser. The parser extracts the relevant information from a web page by excluding certain common words (such as a, an, the, also known as stop words), HTML tags, JavaScript and other unwanted characters. A good parser can also eliminate content that recurs across the website's pages (such as navigation links) so that it is not counted as part of the page's content.
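The parsing step can be sketched with the standard html.parser module. The three-word STOP_WORDS set mirrors the examples in the text and is far shorter than a real stop list; TextExtractor and parse_page are hypothetical names, and the sketch does not attempt to remove repeated navigation links.

```python
import re
from html.parser import HTMLParser

STOP_WORDS = {"a", "an", "the"}  # illustrative; real stop lists are longer


class TextExtractor(HTMLParser):
    """Keeps visible words, skipping tags, scripts, styles and stop words."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            if word not in STOP_WORDS:
                self.words.append(word)


def parse_page(html):
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.words


words = parse_page(
    "<html><head><script>var x = 1;</script></head>"
    "<body><h1>The Crawler</h1><p>A spider follows an index.</p></body></html>"
)
```

The script body, the markup and the stop words are all dropped, leaving only the words that would be passed on to the indexer.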

Once indexing is completed, the results are stored in memory in sorted order, which helps in retrieving the information quickly. Indexes are updated periodically as new content is crawled. Some indexes help create a dictionary (lexicon) of all words that are available for searching. A lexicon also helps in correcting mistyped words by showing the corrected versions in a search result. Part of the success of a search engine lies in how its indexes are built and used. Various algorithms are used to optimize these indexes so that relevant results are found easily without heavy use of computing resources.
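Both ideas, fast lookup in a sorted lexicon and correction of mistyped words, can be sketched with the standard library. The four-word lexicon is invented, and difflib's similarity matching stands in for whatever correction method a real search engine uses, which the text does not specify.

```python
import difflib
from bisect import bisect_left

# Hypothetical lexicon, kept in sorted order so it can be binary-searched.
lexicon = sorted({"crawler", "index", "search", "spider"})


def in_lexicon(word):
    """Binary search in the sorted lexicon."""
    i = bisect_left(lexicon, word)
    return i < len(lexicon) and lexicon[i] == word


def suggest(word):
    """Return the word itself if known, else the closest lexicon entry."""
    if in_lexicon(word):
        return word
    matches = difflib.get_close_matches(word, lexicon, n=1)
    return matches[0] if matches else None


correction = suggest("serach")
```

A mistyped query term such as "serach" is mapped to the nearest known word, while a string that resembles nothing in the lexicon yields no suggestion.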

Storing the Web Content

In addition to indexing the web content, the search engine also stores the individual pages in its database. Because disk storage has become cheaper, the storage capacity of search engines is very large and often runs into terabytes of data. However, retrieving this data quickly and efficiently requires special distributed and scalable data storage functionality. The amount of data that a search engine can store is limited by the amount of data it can retrieve for search results. At the time of writing, Google could index and store about 3 billion web documents, far more than any other search engine of that period.

Search Algorithms and Results

Once a user enters the search keywords, the search engine's search algorithm looks up the indexes for matches. Once it has matched the keywords in the index, the search engine tries to provide the most relevant content first. This relevance matching is achieved by various search engine algorithms and is therefore the bread and butter of a search engine's popularity. Among all the search engines on the internet, Google stands out because it can provide more relevant answers to search queries. The search algorithms used to find the most relevant results in a haystack of web content differ from one search engine to another. That is why the same keywords produce different results on different search engines.
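The lookup-then-rank flow can be sketched as follows. The scoring rule (the sum of keyword counts on each page) is a deliberately naive assumption used only to show the flow, since real relevance algorithms are trade secrets; the index contents and the page identifiers p1 to p3 are invented.

```python
def search(index, query):
    """Return pages containing every query term, most relevant first."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Keep only the pages that contain all of the keywords.
    matching = set(postings[0]).intersection(*postings[1:])
    # Naive relevance score: total occurrences of the keywords on the page.
    scored = [(sum(p[url] for p in postings), url) for url in matching]
    return [url for score, url in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical index: word -> {page: number of occurrences}.
index = {
    "web": {"p1": 1, "p2": 2, "p3": 1},
    "crawler": {"p1": 3, "p3": 1},
}
results = search(index, "web crawler")
```

Swapping in a different scoring rule changes the order of the results without changing which pages match, which is one way to picture why different search engines return different results for the same keywords.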

Advanced search engines, like Google, use a relevance ranking system, where each web page is ranked based on various factors such as:

1. Content analysis: The content of each web page is evaluated for the keywords based on their number of occurrences, their position in the page (such as the title, meta tags or headings), their font size, the proximity between them and so on.

2. Linking structure: The links from an external page or website to this page are analyzed for keywords in the link structure. Links from a popular website also lead to a higher ranking.

3. Page ranking: This is a relative ranking of a website based on an algorithm that is used specifically by Google. The page rank denotes the ranking of a web page based on its popularity and the quality of its links, among various other factors. The basic idea is that a page with a higher page rank is easier to find on the internet.
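The link-based idea behind page ranking can be illustrated with the simplified, publicly described form of the PageRank iteration. This is a sketch only and not Google's actual ranking system, which combines many other factors; the damping value of 0.85 is the conventional textbook choice, the four-page link graph is invented, and the code assumes every link target is itself a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {
    "home": ["news", "about"],
    "news": ["home"],
    "about": ["home"],
    "blog": ["home"],
}
ranks = pagerank(links)
```

In this graph the home page ends up with the highest rank because every other page links to it, while the blog page, which nothing links to, ends up with the lowest.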
