Web Data



By: David Militello

The World Wide Web holds more than 800 million pages of content on all kinds of topics, but according to a new study by a team of computer scientists, Internet search engines guide people to only half of those pages. Search engines are also putting less effort into indexing the Web. The study comes from an institute owned by a computer and communications firm.

A similar study conducted at the end of 1997 found that the top six search engines together covered 60 percent of the Web and that the top search engine reached one third of all pages. A report published last February in a renowned journal found that a test of 11 top search engines turned up only 42 percent of all sites, and that no single engine covered more than about 16 percent of the Web.

The Web promised to equalize access to information, but search engines tend to index the sites that have more links pointing to them. As a result, people see these popular sites and not the less-linked ones that may carry loads of relevant data.

The amount of content on the Internet was first estimated at around 320 million pages, but there is more to patrol: just 14 months later, researchers found that the number of pages was more than double their first estimate. Overall, the Web holds about 6 trillion bytes of information, compared with 20 trillion bytes in the Library of Congress. After a random surfing exercise covering 2,500 Web sites, the researchers found that there were 3 million publicly available servers with 289 pages per server.
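The figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and it assumes the 289 pages per server is an average that can simply be multiplied by the server count.

```python
# Figures quoted in the article; all names here are hypothetical.
SERVERS = 3_000_000            # publicly available servers
PAGES_PER_SERVER = 289         # assumed to be an average per server
FIRST_ESTIMATE = 320_000_000   # earlier page estimate
WEB_BYTES = 6e12               # 6 trillion bytes on the Web
LIBRARY_BYTES = 20e12          # 20 trillion bytes in the Library of Congress


def estimate_total_pages(servers: int, pages_per_server: int) -> int:
    """Multiply the server count by the pages found per server."""
    return servers * pages_per_server


total_pages = estimate_total_pages(SERVERS, PAGES_PER_SERVER)
growth = total_pages / FIRST_ESTIMATE
size_ratio = WEB_BYTES / LIBRARY_BYTES

print(f"Estimated pages: {total_pages:,}")
print(f"Growth over the first estimate: {growth:.2f}x")
print(f"Web size relative to the Library of Congress: {size_ratio:.0%}")
```

The product comes to 867 million pages, which agrees with both "more than 800 million pages" and "more than double" the first estimate of 320 million.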

Still, they said that the amount of information available on the Net could be larger, because just a few sites may have millions of pages. A series of tests on the servers found that about 83 percent contained commercial content such as company Web pages and catalogues, 6 percent had information about science and education, 3 percent contained health information, 2 percent contained pornographic material, and 2 percent were personal Web pages. The techniques used by search engines, and not the volume per se, are to blame for making so much of the Web difficult to reach.

Search providers use two main methods to find pages: user registration and following links to new pages. According to the researchers, search engines end up with a biased sample of the Web, because by following links they mostly find and index the pages that have more links pointing to them. The problem is not that the engines lack the ability to index more; it is that resources are directed toward other offerings for users, such as free email.
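The link-following bias described above can be illustrated with a toy example. This is a minimal sketch, not how any real search engine works: the six-page link graph, the page names, and the helper functions are all hypothetical.

```python
from collections import deque

# Hypothetical toy Web: each page maps to the pages it links to.
LINKS = {
    "hub": ["a", "b", "c"],
    "a": ["hub", "b"],
    "b": ["hub"],
    "c": ["hub", "a"],
    "d": ["hub"],       # only "e" links to "d"
    "e": ["d", "hub"],  # nothing links to "e"
}


def crawl(links, seeds):
    """Breadth-first crawl: index every page reachable from the seeds."""
    indexed = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in indexed:
                indexed.add(target)
                queue.append(target)
    return indexed


def inbound_counts(links):
    """Count how many pages link to each page."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


inbound = inbound_counts(LINKS)

# Link following alone, starting from one well-known page.
indexed = crawl(LINKS, ["hub"])
missed = set(LINKS) - indexed
print("Indexed:", sorted(indexed), "mean inbound links:", mean(inbound[p] for p in indexed))
print("Missed: ", sorted(missed), "mean inbound links:", mean(inbound[p] for p in missed))

# User registration modelled as an extra seed submitted by a site owner.
with_registration = crawl(LINKS, ["hub", "e"])
print("Indexed with registration:", sorted(with_registration))
```

In this toy graph, the crawl that only follows links never reaches the two pages with the fewest inbound links, while registering one of them as an extra starting point brings both into the index.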

A search engine expert noted that people tend to make simple information requests, which is why they fail to see what they are missing. The imbalance in cataloguing is expected to persist for several more years, which is attributed to the rate of increase in computer resources outpacing the production of new content by humans for posting on new sites.




Article Source: http://www.ezinearticles.mk



Copyright © 2012 EzineArticles Directory -- All Rights Reserved Worldwide
