Title: System and method for multiple data sources to plug into a standardized interface for distributed deep search
Abstract: A system and method for adapters to provide nodes of a network access to a distributed search mechanism. Network nodes operating as consumer or requesting nodes generate search requests. Nodes operating as hubs are configured to route messages in the network. Individual nodes operating as provider nodes receive search requests and may generate results according to their own procedures in return. Hub nodes may resolve the search requests to a subset of the provider nodes in the network, for example by matching search requests with registration information from nodes. Communication between nodes in the network may use a common query protocol. Adapters may be implemented in the network to reformat messages exchanged in the network. Adapters may customize results. Adapters may enable nodes to function in a distributed search mechanism.
Patent Number: 7,013,303 Issued on 03/14/2006 to Faybishenko, et al.
Inventors:
Faybishenko; Yaroslav (Berkeley, CA);
Kan; Gene H. (Belmont, CA);
Camarda; Thomas J. (Staten Island, NY);
Doolin; David M. (El Cerrito, CA);
Waterhouse; Steve (San Francisco, CA);
Beatty; John (San Francisco, CA)
Assignee: Sun Microsystems, Inc. (Santa Clara, CA)
Appl. No.: 106731
Filed: March 26, 2002
Current U.S. Class: 707/10; 709/217
Current Intern'l Class: G06F 17/30 (20060101)
Field of Search: 707/1-10; 709/217
References Cited
U.S. Patent Documents
5446880 | Aug., 1995 | Balgeman et al.
5577241 | Nov., 1996 | Spencer.
5659732 | Aug., 1997 | Kirsch.
5751611 | May., 1998 | Jamieson.
5826261 | Oct., 1998 | Spencer.
5832506 | Nov., 1998 | Kuzma.
5845278 | Dec., 1998 | Kirsch et al.
5848234 | Dec., 1998 | Chernick et al.
5920854 | Jul., 1999 | Kirsch et al.
5920856 | Jul., 1999 | Syeda-Mahmood.
5924105 | Jul., 1999 | Punch, III et al.
5933822 | Aug., 1999 | Braden-Harder et al.
5985454 | Nov., 1999 | McMordie et al.
5987454 | Nov., 1999 | Hobbs.
6006217 | Dec., 1999 | Lumsden.
6047286 | Apr., 2000 | Burrows.
6055538 | Apr., 2000 | Kessenich et al.
6070158 | May., 2000 | Kirsch et al.
6094649 | Jul., 2000 | Bowen et al.
6102969 | Aug., 2000 | Christianson et al.
6105019 | Aug., 2000 | Burrows.
6145003 | Nov., 2000 | Sanu et al.
6161102 | Dec., 2000 | Yanagihara et al.
6212545 | Apr., 2001 | Ohtani et al.
6233571 | May., 2001 | Egger et al.
6256623 | Jul., 2001 | Jones.
6269361 | Jul., 2001 | Davis et al.
6275820 | Aug., 2001 | Navin-Chandra et al.
6278993 | Aug., 2001 | Kumar et al.
6292802 | Sep., 2001 | Kessenich et al.
6317741 | Nov., 2001 | Burrows.
6327590 | Dec., 2001 | Chidlovskii et al.
6347314 | Feb., 2002 | Chidlovskii.
6434548 | Aug., 2002 | Emens et al.
6442544 | Aug., 2002 | Kohli.
6453315 | Sep., 2002 | Weissman et al.
6480837 | Nov., 2002 | Dutta.
6484166 | Nov., 2002 | Maynard.
6490575 | Dec., 2002 | Berstis.
6510406 | Jan., 2003 | Marchisio.
6523022 | Feb., 2003 | Hobbs.
6523026 | Feb., 2003 | Gillis.
6523029 | Feb., 2003 | Kulyukin.
6523037 | Feb., 2003 | Monahan et al.
6526400 | Feb., 2003 | Takata et al.
6560600 | May., 2003 | Broder.
6574655 | Jun., 2003 | Libert et al.
6615209 | Sep., 2003 | Gomes et al.
6647383 | Nov., 2003 | August et al.
6650998 | Nov., 2003 | Rutledge et al.
6665655 | Dec., 2003 | Warner et al.
6687696 | Feb., 2004 | Hofmann et al.
6704726 | Mar., 2004 | Amouroux.
6711568 | Mar., 2004 | Bharat et al.
6718324 | Apr., 2004 | Edlund et al.
6725425 | Apr., 2004 | Rajan et al.
6757646 | Jun., 2004 | Marchisio.
6757675 | Jun., 2004 | Aiken et al.
6763362 | Jul., 2004 | McKeeth.
6785670 | Aug., 2004 | Chiang et al.
6785671 | Aug., 2004 | Bailey et al.
6799176 | Sep., 2004 | Page.
6801906 | Oct., 2004 | Bates et al.
6807539 | Oct., 2004 | Miller et al.
6807546 | Oct., 2004 | Young-Lai.
Foreign Patent Documents
829 811 | Mar., 1998 | EP.
00/62264 | Oct., 2000 | WO.
Other References
James P. Callan, et al. "Searching distributed collections with inference networks,"
In Proceedings of the 18th Annual SIGIR Conference, 1995, 2 pages.
C. Yu, et al., "Efficient and Effective Metasearch for a Large Number of Text
Databases," Tech. report, U. of Illinois at Chicago, 1995, http://citeseer.ist.psu.edu/yu99efficient.html,
2 pages.
W. Meng et al., "Estimating the Usefulness of Search Engines," 15th International
Conference on Data Engineering (ICDE'99), Sydney, Australia, Mar. 1999, 2 pages.
Mic Bowman, et al. "Harvest: A scalable, customizable discovery and access system",
Technical report, University of Colorado-Boulder, 1995, 3 pages.
Oates, T.; et al., "Parallel and distributed search for structure in multivariate
time series," Technical Report 96-23, University of Massachusetts at Amherst, Computer
Science Department, Long version, 1996, 2 pages.
T. Oates, et al., "Networked Information Retrieval as Distributed Problem Solving,"
In Proceedings of CIKM Workshop in Intelligent Information Agents held in conjunction
with the Third International Conference on Information and Knowledge Management
(CIKM'94), Dec. 1994, 2 pages.
Walter L. Warnick, PhD, et al., "Searching the Deep Web," D-Lib Magazine, Jan.
2001, vol. 7, No. 1, 11 pages.
Julio C. Navas, et al., "Jambalya: Using Multicast for Blind Distributed Web
Searching and Advertising," 1998, 1 page.
Fidel Cacheda, et al., "Improving the Information Retrieval in the World Wide
Web (2000)," 1 page.
L. Warshaw, et al., "Rule-based query optimization, revisited," Proceedings of
the 9th Conference on Information and Knowledge Management. Kansas City, Kansas,
Nov., 1999, pp. 267-275.
Gravano, et al., "STARTS: Stanford Proposal for Internet Meta-Searching" 1997,
ACM, XP000730508, pp. 207-218.
Sun Microsystems, Inc., Li Gong, "Project JXTA: A Technology Overview," Apr.
25, 2001, pp. 1-11.
James Powell, "Multilingual Federated Searching Across Heterogeneous Collections,"
D-Lib Magazine, Sep. 1998, XP002255043, 10 pages.
International Search Report for PCT/US 02/13469 mailed Oct. 22, 2003, 6 pages.
Florescu, et al., "Integrating Keyword Search into XML Query Processing," Computer
Networks, Published by Elsevier Science, B.V., 2000, 17 pages.
Sun Microsystems, Inc., "Project JXTA: An Open, Innovative Collaboration,"
Apr. 25, 2001, pp. 1-15.
Brian Thomas, "URL Driving," The International Society for Optical Engineering,
May-Jun. 1998, IEEE Internet Computing, pp. 92-93.
Primary Examiner: Choules; Jack M.
Attorney, Agent or Firm: Kowert; Robert C., Meyertons, Hood, Kivlin, Kowert & Goetzel, P.C.
Parent Case Text
This application is also a continuation-in-part of U.S. application Ser. No.
09/872,360 filed May 31, 2001 titled "Distributed Information Discovery" which
claims benefit of priority to U.S. provisional application Ser. No. 60/288,848
filed May 4, 2001 titled "Distributed Information Discovery".
Claims
What is claimed is:
1. A method for participating in a distributed search network, comprising:
each of a plurality of adapters receiving a search request formatted in accordance
with a common query protocol, wherein each of the adapters is associated with a
different provider network node of a plurality of provider network nodes;
wherein the search request is sent to a plurality of provider network nodes from
a requesting network node through a hub network node configured to match the search
request against provider registrations indicating at least the plurality of provider
network nodes;
wherein the hub network node forwards the search request to the adapter associated
with each of the indicated provider network nodes;
each adapter reformatting the received search request from the common query protocol
to a different protocol used by the associated provider network nodes; and
each adapter sending the reformatted search request to the associated provider
network node.
2. The method as recited in claim 1, further comprising:
receiving a search response formatted in accordance with the different protocol
from one or more of the provider network nodes including one or more results generated
in response to the reformatted search request;
reformatting the search response from the different protocol to the common query
protocol; and
sending the reformatted search response to the requesting network node.
3. The method of claim 2, wherein said reformatting the search response includes
selecting at least one search result from the one or more results.
4. The method of claim 3, wherein said selecting is based on data access rights
associated with the one provider network node.
5. The method of claim 3, wherein said selecting is based on data access rights
associated with the requesting network node.
6. The method of claim 2, wherein one of the one or more results is a reference
to a computer data file stored in the network.
7. The method of claim 2, wherein one of the one or more results is at least
a portion of the data contained in a computer data file.
8. The method of claim 2, wherein one of the one or more results is at least
a portion of the data contained in a computer data file and one of the one or
more results is a reference to a computer data file stored in the network.
9. The method of claim 2, further comprising including relevance information
corresponding to each of the one or more results in the search response indicating
a ranking of the one or more results.
10. A method for participating in a distributed search network, comprising:
receiving a plurality of search results requested by a requesting network node
from one or more provider network nodes, wherein the search results are formatted
in accordance with a common query protocol;
reformatting the received search results from the common query protocol to a
different protocol used by the requesting network node; and
sending the reformatted search results to the requesting network node.
11. The method as recited in claim 10, wherein said reformatting includes collating
at least a first and second search results of the plurality of search results.
12. The method as recited in claim 11, wherein the first and second search results
respectively include first and second relevance information indicating a ranking
according to corresponding first and second of the plurality of provider network
nodes and said collating includes ordering the first and second search results
in response to the first and second relevance information.
13. The method as recited in claim 12, further comprising receiving relevance
information from the requesting network node indicating an ordering parameter,
wherein said generating a combined search result includes selecting and ordering
the plurality of search results in response to the relevance information.
14. The method as recited in claim 10, further comprising:
receiving from the requesting network node a search query formatted in accordance
with a different protocol used by the requesting network node;
reformatting the search query to the common query protocol from the different
protocol; and
sending the reformatted search query to a network hub for routing to the plurality
of provider network nodes.
15. A method for interacting with a distributed search network, comprising:
receiving from a requesting network node a search query formatted in accordance
with a requesting network node protocol;
reformatting the search query from the requesting network node protocol to the
common query protocol;
sending the search query formatted in accordance with the common query protocol
to a network hub for routing to at least a plurality of provider network nodes;
receiving a plurality of search results formatted in accordance with the common
query protocol from the plurality of provider network nodes;
reformatting the plurality of search results from the common query protocol to
the requesting network node protocol; and
sending to the requesting network node the plurality of search results formatted
in accordance with the requesting network node protocol.
16. The method as recited in claim 15, wherein a first and a second of the plurality
of search results respectively include a first and a second relevance information
indicating a ranking by the corresponding first and second of the plurality of
provider network nodes and said reformatting the plurality of search results includes
ordering the first and second search results in response to the first and second
relevance information.
17. The method as recited in claim 15, further comprising receiving relevance
information indicating an ordering preference parameter from the requesting network
node, wherein said generating a combined search result includes selecting and ordering
the plurality of search results in response to the relevance information.
18. A method for participating in a distributed search network, comprising:
distributing a search request from a requesting node in the network to a plurality
of provider nodes in the network each configured to generate one or more search
results according to their own procedures in response to a search request;
each of a plurality of provider nodes receiving the search request;
each provider node generating a search response including the one or more search
results from data accessible by the provider node in response to the search request;
each of a plurality of adapters associated with a different one of the plurality
of provider nodes receiving the search response in a format different from a common
query protocol from the respective associated provider node;
each adapter reformatting the received search response to the common query protocol;
and
each adapter transmitting the reformatted search response to the requesting node.
19. The method as recited in claim 18, wherein each adapter is configured to
reformat the search response from a second format different from the common query
protocol used by a second of the plurality of provider nodes to the common query
protocol receiving the search request in the second different format from the second
provider node.
20. The method as recited in claim 18, wherein the one or more search results
generated by at least one of the plurality of provider nodes include one or more
dynamic data accessible by the one provider node.
21. The method as recited in claim 18, wherein the one or more search results
generated by at least one of the plurality of provider nodes is generated from
dynamic data.
22. A computer system in a network, comprising program instructions, wherein
the program instructions are computer-executable to implement:
an adaptor associated with a provider network node receiving a search request
formatted in accordance with a common query protocol sent to a plurality of provider
network nodes from a requesting network node through a hub network node configured
to match the search request against provider registrations indicating at least
the plurality of provider network nodes;
wherein the hub network node forwards the search request to the adaptor associated
with each of the indicated provider network nodes;
the adaptor reformatting the search request from the common query protocol to
a different protocol used by the associated one of the plurality of provider network
nodes; and
the adaptor sending the reformatted search request to the associated provider
network node.
23. The computer system as recited in claim 22, further comprising:
receiving a search response formatted in accordance with the different protocol
from one or more of the provider network nodes including one or more results generated
in response to the reformatted search request;
reformatting the search response from the different protocol to the common query
protocol; and
sending the reformatted search response to the requesting network node.
24. The computer system of claim 23, wherein said reformatting the search response
includes selecting at least one search result from the one or more results.
25. The computer system of claim 24, wherein said selecting is based on data
access rights associated with the one provider network node.
26. The computer system of claim 24, wherein said selecting is based on data
access rights associated with the requesting network node.
27. The computer system of claim 23, wherein one of the one or more results is
a reference to a computer data file stored in the network.
28. The computer system of claim 23, wherein one of the one or more results is
at least a portion of the data contained in a computer data file.
29. The computer system of claim 23, wherein one of the one or more results is
at least a portion of the data contained in a computer data file and one of the
one or more results is a reference to a computer data file stored in the network.
30. The computer system of claim 23, further comprising including relevance information
corresponding to each of the one or more results in the search response indicating
a ranking of the one or more results.
31. A computer system in a distributed search network, comprising program instructions,
wherein the program instructions are computer-executable to implement:
receiving a plurality of search results requested by a requesting network node
from one or more provider network nodes, wherein the search results are formatted
in accordance with a common query protocol;
reformatting the received search results from the common query protocol to a
different protocol used by the requesting network node; and
sending the reformatted search results to the requesting network node.
32. The computer system as recited in claim 31, wherein said reformatting includes
collating at least a first and second search results of the plurality of search results.
33. The computer system as recited in claim 32, wherein the first and second
search results respectively include first and second relevance information indicating
a ranking according to corresponding first and second of the plurality of provider
network nodes and said collating includes ordering the first and second search
results in response to the first and second relevance information.
34. The computer system as recited in claim 33, further comprising receiving
relevance information from the requesting network node indicating an ordering parameter,
wherein said generating a combined search result includes selecting and ordering
the plurality of search results in response to the relevance information.
35. The computer system as recited in claim 31, further comprising:
receiving from the requesting network node a search query formatted in accordance
with a different protocol used by the requesting network node;
reformatting the search query to the common query protocol from the different
protocol; and
sending the reformatted search query to a network hub for routing to the plurality
of provider network nodes.
36. A computer system for interacting with a distributed search network, comprising
program instructions, wherein the program instructions are computer-executable
to implement:
receiving from a requesting network node a search query formatted in accordance
with a requesting network node protocol;
reformatting the search query from the requesting network node protocol to the
common query protocol;
sending the search query formatted in accordance with the common query protocol
to a network hub for routing to at least a plurality of provider network nodes;
receiving a plurality of search results formatted in accordance with the common
query protocol from the plurality of provider network nodes;
reformatting the plurality of search results from the common query protocol to
the requesting network node protocol; and
sending to the requesting network node the plurality of search results formatted
in accordance with the requesting network node protocol.
37. The computer system as recited in claim 36, wherein a first and a second
of the plurality of search results respectively include a first and a second relevance
information indicating a ranking by the corresponding first and second of the plurality
of provider network nodes and said reformatting the plurality of search results
includes ordering the first and second search results in response to the first
and second relevance information.
38. The computer system as recited in claim 36, further comprising receiving
relevance information indicating an ordering preference parameter from the requesting
network node, wherein said generating a combined search result includes selecting
and ordering the plurality of search results in response to the relevance information.
39. A computer system for participating in a distributed search network, comprising
program instructions, wherein the program instructions are computer-executable
to implement:
distributing a search request from a requesting node in the network to a plurality
of provider nodes in the network each configured to generate one or more search
results according to their own procedures in response to a search request;
each of a plurality of provider nodes receiving the search request;
each provider node generating a search response including the one or more search
results from data accessible by the provider node in response to the search request;
each of a plurality of adapters associated with one of the plurality of provider
nodes receiving the search response in a format different from a common query protocol
from the associated provider node;
each adapter reformatting the received search response to the common query protocol;
and
each adapter transmitting the reformatted search response to the requesting node.
40. The computer system as recited in claim 39, wherein each adapter is configured
to reformat the search response from a second format different from the common
query protocol used by a second of the plurality of provider nodes to the common
query protocol receiving the search request in the second different format from
the second provider node.
41. The computer system as recited in claim 39, wherein the one or more search
results generated by at least one of the plurality of provider nodes include one
or more dynamic data accessible by the one provider node.
42. The computer system as recited in claim 39, wherein the one or more search
results generated by at least one of the plurality of provider nodes is generated
from dynamic data.
43. A computer system in a distributed search network, comprising:
a plurality of adapter means for receiving a search request formatted in accordance
with a common query protocol; wherein each of the adapter means is associated with
a different provider network node of a plurality of provider network nodes;
wherein the search request is sent to a plurality of provider network nodes from
a requesting network node through a hub network node configured to match the search
request against provider registrations indicating at least the plurality of provider
network nodes, wherein the hub network node forwards the search request to the
adapter associated with each of the indicated provider network nodes;
wherein each adapter means includes means for reformatting the received search
request from the common query protocol to a different protocol used by the respective
associated provider network nodes; and
wherein each adapter means includes means for sending the reformatted search
request to the one provider network node.
44. The computer system as recited in claim 43, further comprising:
means for receiving a search response formatted in accordance with the different
protocol from one or more of the provider network nodes including one or more results
generated in response to the reformatted search request;
means for reformatting the search response from the different protocol to the
common query protocol; and
means for sending the reformatted search response to the requesting network node.
45. The computer system of claim 44, further comprising means for including relevance
information corresponding to each of the one or more results in the search response
indicating a ranking of the one or more results.
46. A computer system for participating in a distributed search network, comprising:
means for receiving a plurality of search results requested by a requesting network
node from one or more provider network nodes, wherein the search results are formatted
in accordance with a common query protocol;
means for reformatting the received search results from the common query protocol
to a different protocol used by the requesting network node; and
means for transmitting the reformatted search results to the requesting network
node.
47. The computer system as recited in claim 46, further comprising means for
collating at least a first and second search results of the plurality of search results.
48. The computer system as recited in claim 47, wherein the first and second
search results respectively include first and second relevance information indicating
a ranking according to corresponding first and second of the plurality of provider
network nodes and said means for collating includes means for ordering the first
and second search results in response to the first and second relevance information.
49. The computer system as recited in claim 47, further comprising means for
receiving relevance information from the requesting network node indicating an
ordering parameter, wherein said means for generating a combined search result
includes means for selecting and ordering the plurality of search results in response
to the relevance information.
50. The computer system as recited in claim 46, further comprising:
means for receiving from the requesting network node a search query formatted
in accordance with a different protocol used by the requesting network node;
means for reformatting the search query to the common query protocol from the
different protocol; and
means for sending the reformatted search query to a network hub for routing to
the plurality of provider network nodes.
51. The computer system as recited in claim 50, further comprising means for
receiving relevance information indicating an ordering preference parameter from
the requesting network node, wherein said means for generating a combined search
result includes means to select and order the plurality of search results in response
to the relevance information.
52. A computer system for interacting with a distributed search network, comprising:
means for receiving from a requesting network node a search query formatted in
accordance with a requesting network node protocol;
means for reformatting the search query from the requesting network node protocol
to the common query protocol;
means for sending the search query formatted in accordance with the common query
protocol to a network hub for routing to at least a plurality of provider network
nodes;
means for receiving a plurality of search results formatted in accordance with
the common query protocol from the plurality of provider network nodes;
means for reformatting the plurality of search results from the common query
protocol to the requesting network node protocol; and
means for sending to the requesting network node the plurality of search results
formatted in accordance with the requesting network node protocol.
53. The computer system as recited in claim 52, wherein a first and a second
of the plurality of search results respectively include a first and a second relevance
information indicating a ranking by the corresponding first and second of the plurality
of provider network nodes and said means for reformatting the plurality of search
results includes means for ordering the first and second search results in response
to the first and second relevance information.
54. A distributed search network, comprising:
a plurality of provider nodes;
a hub network node configured to distribute a search request from a requesting
node in the network to the plurality of provider nodes in the network;
wherein each provider node is configured to:
receive the search request;
generate one or more search results according to its own procedures in response
to the search request; and
generate a search response including the one or more search results from data accessible
by the provider node in response to the search request;
a plurality of adapters each associated with a different one of the plurality
of provider nodes, wherein each adapter is configured to:
receive the search response from the associated provider node in a format used
by the associated provider node and different from a common query protocol;
reformat the received search response to the common query protocol; and
transmit the reformatted search response to the requesting node.
55. The distributed search network as recited in claim 54, wherein each adapter
is configured to reformat the search response from a second format different from
the common query protocol used by a second of the plurality of provider nodes to
the common query protocol receiving the search request in the second different
format from the second provider node.
56. The distributed search network as recited in claim 54, wherein the one or
more search results generated by at least one of the plurality of provider nodes
include one or more dynamic data accessible by the one provider node.
57. The distributed search network as recited in claim 54, wherein the one or
more search results generated by at least one of the plurality of provider nodes
is generated from dynamic data.
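The flow recited in the claims above (a hub matching a request against provider registrations, provider-side adapters reformatting between the common query protocol and each provider's own protocol, selection of results by access rights, and collation by provider-supplied relevance) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: every class, field, and function name here (CommonQuery, CommonResult, ListProvider, ProviderAdapter, Hub, collate, demo, and the query_space key used for matching) is an assumption introduced for the example, and the "provider protocol" is reduced to a plain dictionary.

```python
from dataclasses import dataclass


@dataclass
class CommonQuery:
    """Search request in the (assumed) common query protocol."""
    query_space: str  # hypothetical key the hub matches against registrations
    text: str
    requester: str


@dataclass
class CommonResult:
    """One search result in the common protocol, with provider relevance."""
    provider: str
    title: str
    relevance: float


class ListProvider:
    """Stand-in provider node that speaks its own dictionary-based protocol."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows  # list of (title, score, restricted) tuples

    def search(self, native_query):
        term = native_query["contains"].lower()
        return [row for row in self.rows if term in row[0].lower()]


class ProviderAdapter:
    """Sits between the hub and one provider and reformats both directions."""

    def __init__(self, provider, trusted_requesters=()):
        self.provider = provider
        self.trusted = set(trusted_requesters)

    def handle(self, query):
        # Common query protocol -> provider's own protocol.
        native_query = {"contains": query.text}
        rows = self.provider.search(native_query)
        # Provider's rows -> common protocol, keeping restricted results
        # only for requesters with access rights (an assumed policy).
        allowed = query.requester in self.trusted
        return [
            CommonResult(self.provider.name, title, score)
            for title, score, restricted in rows
            if allowed or not restricted
        ]


class Hub:
    """Resolves a request to the adapters registered for its query space."""

    def __init__(self):
        self.registrations = {}

    def register(self, query_space, adapter):
        self.registrations.setdefault(query_space, []).append(adapter)

    def route(self, query):
        adapters = self.registrations.get(query.query_space, [])
        return [adapter.handle(query) for adapter in adapters]


def collate(responses):
    """Merge per-provider responses and order them by relevance."""
    merged = [result for response in responses for result in response]
    return sorted(merged, key=lambda r: r.relevance, reverse=True)


def demo(requester):
    """Build a two-provider network and return ranked titles for a requester."""
    books = ListProvider("books", [("Deep Web Search", 0.9, False),
                                   ("Private Search Notes", 0.95, True)])
    news = ListProvider("news", [("Search engines today", 0.7, False)])
    hub = Hub()
    hub.register("articles", ProviderAdapter(books, trusted_requesters={"alice"}))
    hub.register("articles", ProviderAdapter(news))
    query = CommonQuery("articles", "search", requester)
    return [r.title for r in collate(hub.route(query))]
```

In this sketch the providers never see the common query protocol and the requester never sees a provider's native rows; only the adapters know both formats. Calling demo("alice") returns the restricted title as well, while demo("bob") omits it, which corresponds to selecting results based on the requesting node's access rights.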
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer networks, and more particularly to a system
and method for providing a distributed information discovery platform that enables
discovery of information from distributed information providers.
2. Description of the Related Art
It has been estimated that the amount of content contained in distributed information
sources on the public web is over 550 billion documents. In comparison, leading
Internet search engines may be capable of searching only about 600 million pages
out of an estimated 1.2 billion "static pages." Due to the dynamic nature of Internet
content, much of the content is unsearchable by conventional search means. In addition,
the amount of content unsearchable by conventional means is growing rapidly with
the increasing use of application servers and web enabled business systems.
Crawlers currently may take three months or more to crawl and index the
web (Google numbers), so that conventional, crawler-based search engines such as
Google may best perform when indexing static, slowly changing web pages such as
home pages or corporate information pages. Targeted or restricted crawling of headline
or other metadata is possible (such as that done by moreover.com) but this limits
search ability. Web resources that do not have a "page of contents" or similar
index—"deep" web resources—may be more difficult to search, index,
or reference by conventional crawler-based search engines. For example, Amazon.com
contains millions of product descriptions in its databases but does not have a
set of pages listing all these descriptions. As a result, in order to crawl such
a resource, it may be necessary—though difficult—to query the database
repeatedly with every conceivable query term until all products are extracted.
Likewise, because many web pages are generated dynamically from information about the
consumer or context of the query (time, purchasing behavior, location, etc.), a
crawler approach is likely to lead to distortion of such data. In some situations,
content may be inaccessible due to access privileges (e.g. a subscription site),
or for security reasons (e.g. a secure content site).
Conventional search mechanisms also may be less efficient than desirable
with regard to some types of information providers, for example when accessing
dynamic content from a news site. A current news provider may provide content created
by editors and stored in a database as XML or other presentation neutral form.
The news provider's application server may render the content as a web page with
associated links using templates. Although the end user may see a well-presented
page with the relevant information, for a crawler-type search engine to extract
the content of the HTML page it must be programmed to use information about the
structure of the page and "scrape" the content and headline from the page. It may
then store this content or a processed version for indexing purposes in its own
database, and retrieve the link and story when a query matching the story is submitted.
This search process is inherently inefficient and prone to errors. In addition,
it gives the content provider no control over the format of the article or the
decision about which article to show in response to a query.
It would be desirable for a web search mechanism to perform "deep searches"
and "wide searches." "Deep search" may find information embedded in large databases
such as product databases (e.g. Amazon.com) or news article databases (e.g. CNN).
"Wide searches" may reach a large distribution. Moreover, it would be desirable
for the search mechanism to efficiently use bandwidth and maximize search speed
while avoiding bottlenecks. It would also be desirable for a search mechanism to
function over an expanded web covering a wide array of distributed devices (e.g.
PCs, handheld devices, PDAs, cell phones, etc.).
SUMMARY OF THE INVENTION
A distributed network search mechanism is described for a consumer coupled to a
network to send a search request to and receive a search result from at least one
provider coupled to the network in response to its search request. A search request
may include a search query. A search result may include a query result. A search
request and a search result may be formatted according to a query routing protocol
(QRP). A QRP may specify a mark-up language format for communicating search requests,
search results, and/or other information between nodes in the network.
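As an illustration only, a search request in a mark-up language format might be built and parsed as in the following minimal sketch. The element name "request", the attributes "id" and "query-space", and the child element "query" are assumptions made for this example; the text above does not define the QRP schema at this point.

```python
import xml.etree.ElementTree as ET


def build_search_request(request_id: str, query_space: str, query_text: str) -> str:
    """Serialize a search request as XML.

    All element and attribute names are hypothetical placeholders,
    not the names defined by the query routing protocol.
    """
    request = ET.Element("request", {"id": request_id, "query-space": query_space})
    query = ET.SubElement(request, "query")
    query.text = query_text
    return ET.tostring(request, encoding="unicode")


def parse_search_request(xml_text: str) -> dict:
    """Recover the fields of a request produced by build_search_request."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "query_space": root.get("query-space"),
        "query": root.findtext("query"),
    }
```

A request built with build_search_request("r1", "urn:example:news", "roses") can be parsed back into the same three fields, which is the property a hub or provider would rely on when exchanging messages.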
A network hub may be configured to implement a search method according to a query
routing protocol. The search method may include receiving a search request from
a consumer. A network hub may accept search requests only from registered consumers.
A network hub may be configured to receive registration requests from consumers.
A network hub may be configured to receive registration requests from providers.
A registration request may be formatted according to a QRP. A provider's registration
request may indicate at least some of the search queries the provider is interested
in receiving. The search method may include resolving a consumer's search query
from a search request by determining at least one provider that indicated interest
in receiving at least similar search queries in its registration request. A network
hub may be configured to route a consumer's search query to a provider and may
format the search query according to a QRP.
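The registration and resolution behavior of a network hub may be sketched as follows. This is a simplified illustration under stated assumptions: the class name Hub, its method names, and the use of plain keyword overlap to decide that a provider "indicated interest" are all hypothetical, and the rejection of unregistered consumers reflects only the optional behavior described above.

```python
class Hub:
    """Toy network hub: tracks registrations and resolves queries to providers."""

    def __init__(self):
        self.consumers = set()   # registered consumer identifiers
        self.providers = {}      # provider id -> set of keywords of interest

    def register_consumer(self, consumer_id):
        self.consumers.add(consumer_id)

    def register_provider(self, provider_id, keywords):
        # The registration indicates which queries the provider wants to receive.
        self.providers[provider_id] = {k.lower() for k in keywords}

    def resolve(self, consumer_id, query_text):
        # Optional policy: accept search requests only from registered consumers.
        if consumer_id not in self.consumers:
            raise PermissionError("consumer is not registered with this hub")
        terms = set(query_text.lower().split())
        # A provider matches if any registered keyword appears in the query.
        return sorted(pid for pid, kws in self.providers.items() if kws & terms)
```

In this sketch, a hub with a "florist" provider registered for the keywords "roses" and "tulips" would resolve the query "red roses" from a registered consumer to that provider alone.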
A provider may be configured to receive a search query. A provider may respond
with a query result. A provider may be configured to customize its query result.
A query result may be formatted according to a QRP. The query result may be routed
to a network hub. A network hub may be configured to receive a query result from
a provider. A network hub may be configured to collate a plurality of query results
regarding the same search query. A network hub may be configured to route a query
result or collated query results to a consumer as a search result. A search result
may be formatted according to a QRP.
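Collation of a plurality of query results for the same search query may be sketched as follows. The dictionary layout (a "provider" name and a list of "hits", each with a "link" and a "title") is an assumption for illustration, and dropping duplicate links is one possible collation policy rather than a behavior the description requires.

```python
from typing import Dict, List


def collate(query_results: List[Dict]) -> Dict:
    """Merge per-provider query results into a single search result.

    Hits are kept in arrival order; a link that was already seen is skipped
    (a hypothetical de-duplication policy chosen for this sketch).
    """
    seen_links = set()
    merged = []
    for result in query_results:
        for hit in result["hits"]:
            if hit["link"] in seen_links:
                continue
            seen_links.add(hit["link"])
            # Record which provider supplied the hit.
            merged.append({**hit, "provider": result["provider"]})
    return {"hits": merged}
```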
A network hub may be configured to route a search request, a search result, or
other communication between a consumer and a provider through at least another
network hub. A network hub may be configured to resolve a consumer's search query
using a query-space. A search request may include an indication of a query-space.
A provider registration may include an indication of a query-space. A query-space
at least defines a structure for indicating and matching search criteria, and may
include a predicate statement. A provider registration may include a query server
address to which matching search queries are to be directed.
Resolving a search query may include deriving search criteria from a search
query, applying the search criteria from the search query to the search criteria
of the query-spaces from provider registrations, and determining which query-spaces
from provider registrations suitably match the search criteria from the search
query. A search query may be routed to at least a subset of the query server addresses
specified by the resolved provider registrations.
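The resolution steps above (derive search criteria, apply them to the registered query-spaces, and select query server addresses) may be sketched as follows. The Registration record, the representation of a predicate as a Python callable over a set of query terms, and the optional limit parameter are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Registration:
    query_space: str                      # identifier of the query-space
    predicate: Callable[[Set[str]], bool] # stand-in for a predicate statement
    query_server: str                     # address that receives matching queries


def derive_criteria(query_text: str) -> Set[str]:
    """Derive search criteria from a query; here, simply its lowercase terms."""
    return set(query_text.lower().split())


def resolve(query_space: str, query_text: str,
            registrations: List[Registration],
            limit: Optional[int] = None) -> List[str]:
    """Return query server addresses whose registration matches the query."""
    criteria = derive_criteria(query_text)
    matches = [r.query_server for r in registrations
               if r.query_space == query_space and r.predicate(criteria)]
    # Routing to "at least a subset" is modeled by an optional cap.
    return matches if limit is None else matches[:limit]
```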
A QRP interface may be configured to operate with a consumer or a provider in the
network. A QRP interface may be configured as a proxy for a consumer or a provider
that does not include a QRP interface, so that it can operate with the distributed
network search mechanism. A QRP interface may be configured as an interface between
a network hub and a consumer or a provider to receive information from that consumer
or provider and send it to a network hub, or to receive information from a network
hub and send it to that consumer or provider. A consumer or a provider may be configured
to send information to or receive information from a QRP interface. A network hub
may be configured to send information to or receive information from a QRP interface
for a consumer or a provider. A QRP interface may be configured to translate between
consumer-specific or provider-specific protocols and a QRP. A QRP interface may be
configured to customize a search query or a search result in response to instructions
from a consumer or a provider.
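A QRP interface acting as a proxy for a provider that has no QRP support of its own may be sketched as follows. The class name QrpAdapter, the dictionary form of the request and result, and the native search function returning (title, link) tuples are all hypothetical; an actual interface would exchange messages in the mark-up format specified by the QRP.

```python
from typing import Callable, Dict, List, Tuple


class QrpAdapter:
    """Wraps a provider-specific search function behind a QRP-style interface."""

    def __init__(self, native_search: Callable[[str], List[Tuple[str, str]]]):
        # native_search is the provider's own procedure; its signature
        # is an assumption made for this sketch.
        self.native_search = native_search

    def handle(self, request: Dict) -> Dict:
        # Translate the QRP-style request into the provider-specific call.
        rows = self.native_search(request["query"])
        # Translate the provider-specific rows back into a QRP-style result.
        return {
            "id": request["id"],
            "hits": [{"title": title, "link": link} for title, link in rows],
        }
```

A hub would then call handle() with the request it routes, without needing to know how the wrapped provider stores or retrieves its content.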
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a network utilizing the distributed information discovery
platform according to one embodiment;
FIG. 2 illustrates an architecture for the distributed information discovery
platform according to one embodiment;
FIG. 3 illustrates message flow in a distributed information discovery network
according to one embodiment;
FIG. 4 illustrates a provider with a query routing protocol interface according
to one embodiment;
FIG. 5 illustrates a provider with a query routing protocol interface and a
results presentation mechanism according to one embodiment;
FIG. 6 illustrates an exemplary distributed information discovery network including
a plurality of hubs according to one embodiment;
FIG. 7 illustrates provider registration in a distributed information discovery
network according to one embodiment;
FIG. 8 is a flowchart illustrating message flow in a distributed information
discovery network according to one embodiment;
FIG. 9 illustrates an example of several peers in a peer-to-peer network according
to one embodiment;
FIG. 10 illustrates a message with envelope, message body, and optional trailer
according to one embodiment;
FIG. 11 illustrates an exemplary content identifier according to one embodiment;
FIG. 12 is a block diagram illustrating two peers using a layered sharing policy
and protocols to share content according to one embodiment;
FIG. 13 illustrates one embodiment of a policy advertisement;
FIG. 14 illustrates one embodiment of a peer advertisement;
FIG. 15 illustrates one embodiment of a peer group advertisement;
FIG. 16 illustrates one embodiment of a pipe advertisement;
FIG. 17 illustrates one embodiment of a service advertisement;
FIG. 18 illustrates one embodiment of a content advertisement; and
FIG. 19 is a block diagram illustrating one embodiment of a network protocol
stack in a peer-to-peer platform.
While the invention is described herein by way of example for several embodiments
and illustrative drawings, those skilled in the art will recognize that the invention
is not limited to the embodiments or drawings described. It should be understood
that the drawings and detailed description thereto are not intended to limit the
invention to the particular form disclosed, but on the contrary, the intention
is to cover all modifications, equivalents and alternatives falling within the
spirit and scope of the present invention as defined by the appended claims. The
headings used herein are for organizational purposes only and are not meant to
be used to limit the scope of the description or the claims. As used throughout
this application, the word "may" is used in a permissive sense (i.e., meaning having
the potential to), rather than the mandatory sense (i.e., meaning must). Similarly,
the words "include", "including", and "includes" mean including, but not limited to.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
A system and method for providing a distributed information discovery platform
that may enable discovery of information from distributed information providers
is described. In an embodiment, in contrast to conventional search engines and
exchanges, the distributed information discovery platform does not centralize information;
rather it may search for information in a distributed manner. This distributed
searching may enable content providers to deliver up-to-the-second responses to
search queries from a user or client.
In the distributed information discovery platform, queries are distributed to
"peers" in a network that are most likely to be capable of answering the query.
The distributed information discovery platform provides a common distributed query
mechanism for devices ranging from web servers to small computers.
The distributed information discovery platform may be applied in a wide variety
of domains, including, but not limited to: publicly accessible web search, private
networks of trading partners, and interaction between distributed services and
applications. In addition to supporting public networks, the distributed information
discovery platform may also include support for private networks such as for business-to-business
(B2B) networks and extranet applications. Private network support may include quality
of service provisioning, security via public key infrastructure and explicit B2B
query-space support. The distributed information discovery platform may also be
applied to Peer-to-Peer (P2P) networking, exemplified in programs such as Napster
and Gnutella. The distributed information discovery platform may also be applied
to other similar networks or combinations of networks.
In one embodiment, the distributed information discovery platform may include a
web front end to a distributed set of servers, each running a P2P node and responding
to queries. Each node may be registered (or hard coded in some embodiments) to
respond to certain queries or kinds of queries. For example, one of the nodes may
include a calculator service which would respond to a numeric expression query
with the solution. Other nodes may be configured for file sharing and may be registered
to respond to certain queries. A search query on a corporate name may return an
up-to-the-minute stock quote and current news stories on the corporation. Instead
of presenting only text-based search results, the distributed information discovery
platform may return other visual or audio search results. For example, a search
query for "roses" may return photo images of roses.
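The calculator service mentioned above may be sketched as a provider that answers only numeric expression queries. The function name calculator_provider, the restriction to the four basic arithmetic operators, and the convention of returning None for queries the node does not handle are assumptions made for this example.

```python
import ast
import operator

# Supported binary operators (an illustrative subset).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a parsed arithmetic expression without using eval()."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def calculator_provider(query_text):
    """Return the solution of a numeric expression query, or None otherwise."""
    try:
        tree = ast.parse(query_text, mode="eval")
        return _evaluate(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None
```

A query that is not a numeric expression, such as "roses", yields None in this sketch, leaving it to other nodes registered for that kind of query to respond.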
In some embodiments, the distributed information discovery platform may leverage
web technologies (e.g. HTTP/XML). In addition to supporting arbitrary XML, the
distributed information discovery platform may be integrated with other standard