
Study says Web is 500 times larger than major search engines now show


SAN FRANCISCO (AP) -- The Internet has become so large so fast that sophisticated search engines are just scratching the surface of the Web's vast information reservoir, according to a new study released Wednesday.

The 41-page research paper, prepared by a South Dakota company that has developed new software to plumb the Internet's depths, estimates the World Wide Web is 500 times larger than the maps provided by popular search engines like Yahoo!, AltaVista and Google.com.

These hidden information coves, well-known to the Net savvy, have become a tremendous source of frustration for researchers who can't find the information they need with a few simple keystrokes.

"These days it seems like search engines are a little like the weather: Everyone likes to complain about them," said Danny Sullivan, editor of SearchEngineWatch.com, which analyzes search engines.

For years, the uncharted territory of the Internet's World Wide Web sector has been dubbed the "invisible Web."

BrightPlanet, the Sioux Falls start-up behind Wednesday's report, describes the terrain as the "deep Web" to distinguish it from the surface information captured by Internet search engines.

"It's not an invisible Web anymore. That's what's so cool about what we are doing," said Thane Paulsen, BrightPlanet's general manager.

Many researchers suspected that these underutilized outposts of cyberspace represented a substantial chunk of the Internet, but no one seems to have explored the Web's back roads as extensively as BrightPlanet.

Deploying new software developed over the past six months, BrightPlanet estimates there are now about 550 billion documents stored on the Web.

Combined, Internet search engines index about 1 billion pages. One of the first Web search engines, Lycos, had an index of 54,000 pages in mid-1994.
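The figures above can be checked with a few lines of arithmetic. This is only a rough sketch using the numbers quoted in the story; the variable names are illustrative, and the story does not say exactly how BrightPlanet arrived at its "500 times" figure.

```python
# Figures as quoted in the story; variable names are illustrative only.
deep_web_documents = 550_000_000_000   # BrightPlanet's estimate for the whole Web
search_engine_pages = 1_000_000_000    # pages indexed by all search engines combined
lycos_1994_pages = 54_000              # Lycos index in mid-1994

# How many times larger the estimated Web is than the combined index.
ratio = deep_web_documents / search_engine_pages

# How much the combined index has grown since the 1994 Lycos index.
index_growth = search_engine_pages / lycos_1994_pages

print(f"Estimated Web vs. combined index: {ratio:.0f} times")
print(f"Combined index vs. 1994 Lycos index: {index_growth:,.0f} times")
```

On these numbers the estimated Web is 550 times the size of the combined index, which is in line with the roughly 500-fold gap the study reports.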

Search engines have come a long way since 1994, but they are not indexing more pages because a growing share of information is stored in giant, evolving databases set up by government agencies, universities and corporations.

Search engines rely on technology that generally identifies "static" pages, rather than the "dynamic" information stored in databases.

This means that general-purpose search engines will guide users to the home site that houses a huge database, but finding out what's in it requires additional queries.
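The gap between "static" pages and "dynamic" database content can be illustrated with a toy model. The sketch below is not how any real search engine works; the site layout, record names and functions are all invented for illustration. It assumes a crawler that discovers pages only by following links, and a database whose records appear only in response to a query.

```python
# Toy model: a site with a few linked static pages and a database that
# only answers explicit queries. All names and data here are made up.
STATIC_PAGES = {
    "/": ["/about", "/search"],   # page -> links found on that page
    "/about": ["/"],
    "/search": [],                # the query form; it links to no records
}

DATABASE = {
    "rec-1": "soil survey, county 12",
    "rec-2": "soil survey, county 47",
    "rec-3": "rainfall totals, 1999",
}

def crawl(start):
    """Follow links from `start` and return every page reached."""
    seen, queue = set(), [start]
    while queue:
        page = queue.pop()
        if page in seen:
            continue
        seen.add(page)
        queue.extend(STATIC_PAGES.get(page, []))
    return seen

def query(term):
    """Simulate submitting the search form: results are built on demand."""
    return sorted(key for key, text in DATABASE.items() if term in text)

indexed = crawl("/")
print(sorted(indexed))   # the crawler reaches the form page but no records
print(query("soil"))     # the records surface only when a query is sent
```

In this model the crawler finds the page that hosts the search form, yet none of the three records ever enters its index, because no link points to them.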

BrightPlanet believes it has developed a solution with software called "LexiBot."

With a single search request, the technology not only searches the pages indexed by traditional search engines, but delves into the databases on the Internet and fishes out the information in them.
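The general idea of sending one request to several sources and combining the answers can be sketched as follows. This is an assumption-laden illustration, not LexiBot's actual design, which the story does not describe; every source, title and function name below is hypothetical.

```python
# Hypothetical sources; each takes a query term and returns matching titles.
def surface_index(term):
    pages = ["Soil science home page", "Weather bureau home page"]
    return [p for p in pages if term.lower() in p.lower()]

def soil_database(term):
    records = ["Soil survey, county 12", "Soil survey, county 47"]
    return [r for r in records if term.lower() in r.lower()]

def weather_database(term):
    records = ["Rainfall totals, 1999"]
    return [r for r in records if term.lower() in r.lower()]

SOURCES = [surface_index, soil_database, weather_database]

def search_everywhere(term):
    """Send one request to every source and merge the hits, skipping duplicates."""
    merged = []
    for source in SOURCES:
        for hit in source(term):
            if hit not in merged:
                merged.append(hit)
    return merged

print(search_everywhere("soil"))
```

A real tool would be querying remote databases over the network one after another or in parallel, which is one plausible reason a search of this kind could take minutes rather than seconds.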

The LexiBot isn't for everyone, BrightPlanet executives concede. For one thing, the software costs money -- $89.95 after a free 30-day trial. For another, a LexiBot search isn't fast. Typical searches will take 10 to 25 minutes to complete, but could require up to 90 minutes for the most complex requests.

"This isn't for grandma when she is looking for chocolate chip recipes on the Internet," Paulsen said.

The privately held company expects LexiBot to be particularly popular in academic and scientific circles. It also plans to sell its technology and services to businesses.

About 95 percent of the information stored in the deep Web is free, according to BrightPlanet.

Several Internet veterans who reviewed BrightPlanet's research Wednesday were intrigued, but warned that the company's software could prove overwhelming.

"The World Wide Web is getting to be so humongous that you need specialized engines. A centralized approach like this isn't going to be successful," predicted Carl Malamud, co-founder of Petaluma-based Invisible Worlds.

Like BrightPlanet, Invisible Worlds is trying to extract more data hidden from search engines, but is customizing the information.

Malamud calls this process "giving context to the content."

Sullivan agreed that BrightPlanet's greatest challenge will be showing businesses and individuals how to effectively deploy the company's breakthrough.

"No one else has come up with something like this yet, so when they fetch people all this information on the deep Web, they are going to have to show people where to dive in. Otherwise, people will just drown."

Copyright 2000 The Associated Press. All rights reserved. This material may not be published, broadcast, rewritten, or redistributed.





© 2001 Cable News Network. All Rights Reserved.