Publication number: CN101185074 B
Publication type: Grant
Application number: CN 200680018794
PCT number: PCT/US2006/010965
Publication date: 23 Jun 2010
Filing date: 24 Mar 2006
Priority date: 31 Mar 2005
Also published as: CA2603085A1, CA2603085C, CN101185074A, EP1872283A1, US7587387, US8065290, US8224802, US8650175, US20060224582, US20090313247, US20110295888, US20120278301, US20140129538, WO2006104951A1
Inventors: Andrew William Hogue
Applicant: Google Inc.
User interface for facts query engine with snippets from information sources that include query terms and answer terms
CN 101185074 B
Abstract
A method and a system for providing snippets of source documents of an answer to a fact query are disclosed. Snippets of source documents may be provided in response to a user request for the source documents from which the fact answer to a fact query was extracted. The snippets include the terms of the fact query and terms of the answer. The snippets may be displayed along with Uniform Resource Locators (URL) of the source documents.
Claims (11)
  1. A method for displaying sources of a fact, comprising: receiving a user-constructed fact query, the fact query including one or more query terms; identifying, from a fact repository, an answer to the fact query, the answer including one or more fact answer terms and information identifying source documents that include one or more of the query terms of the fact query and one or more of the fact answer terms of the answer; after identifying the answer, accessing at least one of the source documents in a document database distinct from the fact repository; generating a snippet for at least one of the source documents, the snippet including one or more of the query terms of the fact query and one or more of the fact answer terms of the answer; and generating a response that includes the snippet.
  2. The method of claim 1, wherein generating the response further comprises highlighting, in the snippet, the one or more terms of the fact query and the one or more fact answer terms of the answer.
  3. The method of claim 1, further comprising responding to a user selection of a link in a displayed representation of the answer to the fact query.
  4. The method of claim 1, wherein the response includes a user-selectable link for requesting a list of the source documents, the method further comprising responding to a user selection of the link by sending the list of the source documents.
  5. The method of claim 1, wherein generating the response comprises determining, in at least one of the source documents, the dispersion of the one or more query terms and the one or more fact answer terms.
  6. The method of claim 1, wherein generating the snippet comprises selecting text from the respective source document so that the snippet includes at least one text term of the fact query and at least one text term of the answer.
  7. A system for displaying sources of a fact, comprising: query receiver means for receiving a user-constructed fact query, the fact query including one or more query terms; answer identifier means for identifying, from a fact repository, an answer to the fact query, the answer including one or more fact answer terms and information identifying source documents that include one or more of the query terms of the fact query and one or more of the fact answer terms of the answer; source document identifier means for accessing, after the answer is identified, at least one of the source documents in a document database distinct from the fact repository; snippet generator means for generating a snippet for at least one of the source documents, the snippet including one or more of the query terms of the fact query and one or more of the fact answer terms of the answer; and response generator means for generating a response that includes the snippet.
  8. The system of claim 7, wherein the response generator means includes means for highlighting, in the generated snippet, the one or more terms of the fact query and the one or more fact answer terms of the answer.
  9. The system of claim 7, wherein the response includes a user-selectable link for requesting a list of the source documents, and the query receiver means responds to a user selection of the link by sending the list of the source documents.
  10. The system of claim 7, wherein the snippet generator means includes proximity detector means for detecting, in at least one of the source documents, the dispersion of the one or more query terms and the one or more fact answer terms.
  11. The system of claim 7, wherein the snippet generator means is configured to select text from the respective source document so that the snippet includes at least one text term of the fact query and at least one text term of the answer.
Description

User interface for a fact query engine with snippets from information sources that include query terms and answer terms

[0001] Related Applications

[0002] This application is related to the following applications, each of which is incorporated herein by reference:

[0003] U.S. Patent Application No. 11/097,688, "Corroborating Facts Extracted from Multiple Sources," filed March 31, 2005;

[0004] U.S. Patent Application No. 11/097,676, "Bloom Filters for Query Simulation," filed March 31, 2005;

[0005] U.S. Patent Application No. 11/097,690, "Selecting the Best Answer to a Fact Query from Among a Set of Potential Answers," filed March 31, 2005; and

[0006] U.S. Patent Application No. 11/024,784, "Supplementing Search Results with Information of Interest," filed December 30, 2004.

TECHNICAL FIELD

[0007] The disclosed embodiments relate generally to queries for facts, and more particularly to a user interface for a fact query engine and to snippets of sources that contain query terms and answer terms.

BACKGROUND

[0008] The World Wide Web (also known as the "Web") and the web pages within it are a vast source of factual information. Users may look to web pages to get answers to factual questions, such as "what is the capital of Poland" or "what is the birth date of George Washington." Web search engines, however, may be unhelpful to users in this regard, because they generally do not provide a simple, succinct answer to factual queries such as those above. Instead, Web search engines provide a list of web pages determined to match the user's query, and the user has to sort through the matching pages to find the answer.

[0009] Attempts to build search engines that can provide quick answers to factual questions have their own shortcomings. For example, some search engines draw their facts from a single source, such as a particular encyclopedia. This limits the types of questions these engines can answer. For instance, a search engine based on an encyclopedia is unlikely to answer many questions about popular culture, such as questions about movies or songs, and is also unlikely to answer many questions about products, services, retail and wholesale businesses, and so on. If the set of sources used by such a search engine were expanded, however, the expansion might introduce the possibility of multiple possible answers to a fact query, some of which might be contradictory or ambiguous. Furthermore, as the set of sources expands, information may be drawn from sources that are unreliable or whose reliability is unknown.

SUMMARY

[0010] According to an aspect of the invention, a method for displaying sources of a fact is disclosed. The method includes receiving a fact query that includes one or more terms; identifying an answer to the fact query, the answer including one or more terms; identifying one or more source documents that include one or more terms of the query and one or more terms of the answer; generating a snippet of at least one of the source documents, the snippet including one or more terms of the query and one or more terms of the answer; and generating a response that includes the snippet.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] Figure 1 illustrates a network, according to some embodiments of the invention.

[0012] Figure 2 illustrates a data structure for an object and associated facts in a fact repository, according to some embodiments of the invention.

[0013] Figure 3 illustrates a data structure for a fact index, according to some embodiments of the invention.

[0014] Figure 4 illustrates a data structure for a list of possible answers, according to some embodiments of the invention.

[0015] Figures 5A-5C are a flow diagram of a process for selecting an answer to a fact query and displaying the answer and sources of the answer, according to some embodiments of the invention.

[0016] Figure 6 illustrates a presentation of an answer to a fact query, according to some embodiments of the invention.

[0017] Figure 7 illustrates a presentation of a list of sources of an answer to a fact query, according to some embodiments of the invention.

[0018] Figure 8 illustrates a system for selecting an answer to a fact query and displaying the answer and a list of sources of the answer, according to some embodiments of the invention.

[0019] Like reference numerals refer to corresponding parts throughout the drawings.

DETAILED DESCRIPTION

[0020] A query engine can store factual information gathered from many disparate sources and return answers in response to user queries for factual information (or "fact queries"). Gathering information from many sources expands the scope of factual information available to the query engine, but also introduces the possibility of multiple possible answers. The query engine may identify possible answers and select the best one to present to the user, or it may determine that none of the possible answers should be presented. The query engine may also provide a list of sources of the answer, including portions of text from each source. The portion or portions of text, called a snippet, may include terms of the fact query and terms of the answer. While the snippet shows the answer identified or selected by the query engine, the list of sources gives the user the basis for the answer and may help the user evaluate its veracity.

[0021] Figure 1 illustrates a network 100, according to some embodiments of the invention. Network 100 includes one or more clients 102 and a query engine 106. A client 102 may include a client application (not shown). Network 100 may also include one or more communication networks 104 that couple these components.

[0022] The client application provides a user (not shown) of client 102 with an interface to the query engine 106. Using a client application that runs on client 102, the user can submit document searches (for example, web searches) and fact queries to the query engine 106 and view the responses from the query engine 106. The client application may include a web browser. Examples of web browsers include FIREFOX, INTERNET EXPLORER, and OPERA.

[0023] The query engine 106 provides a platform for storing factual information and responding to fact queries, as well as handling other types of searches. The query engine 106 can handle searches for documents, such as web searches, and queries for factual information. The query engine 106 includes a query server 108, which provides a front end to the query engine 106. The query server 108 receives queries from the client 102, directs them to the components of the query engine 106 that handle fact queries and other searches, generates responses, and transmits the responses to the client 102. The query server 108 may be distributed over multiple computers. In other embodiments, the query engine may handle more or fewer functions. For example, in other embodiments response generation may be handled elsewhere in the query engine 106.

[0024] The query engine 106 includes a first search controller 110, a first cache 112, a document index 114, and a document database 116 for handling document searches. In some embodiments, these components may be deployed over multiple computers to provide fast access to a large number of documents. For example, the document database 116 may be deployed over N servers, with a mapping function such as the "modulo N" function determining which documents are stored on each of the N servers. N may be an integer greater than 1, for instance an integer between 2 and 8196. Similarly, the document index 114 may be distributed over multiple servers, as may the first cache 112. The first search controller 110 may also be distributed over multiple computers.
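
The "modulo N" mapping mentioned in paragraph [0024] can be illustrated with a minimal Python sketch. The names `shard_for` and `partition`, and the use of integer document IDs, are assumptions made for illustration; the patent does not specify how documents are keyed.

```python
def shard_for(doc_id: int, num_servers: int) -> int:
    """Return the index of the server that holds the document (doc_id mod N)."""
    if num_servers < 2:
        # The text describes N as an integer greater than 1.
        raise ValueError("num_servers must be greater than 1")
    return doc_id % num_servers


def partition(doc_ids, num_servers: int) -> dict:
    """Group document IDs by the server index chosen by the modulo mapping."""
    shards = {i: [] for i in range(num_servers)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_servers)].append(doc_id)
    return shards
```

Because the mapping depends only on the ID and N, any component that knows N can compute where a document lives without consulting a directory. The same idea would apply to the fact repository, keyed on whatever identifier is used for sharding.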

[0025] The first search controller 110 is coupled to the query server 108. The first search controller 110 is also coupled to the first cache 112, the document index 114, and the document database 116. The first search controller 110 is configured to receive document search queries from the query server 108 and transmit them to the first cache 112, the document index 114, and the document database 116. The first cache 112 increases search efficiency by temporarily storing previously located search results.

[0026] The first search controller 110 receives document search results from the first cache 112 and/or the document index 114 and constructs an ordered list of search results. The first search controller 110 then returns the list of located documents to the query server 108 for onward transmission to the client 102. The document search results received by the first search controller 110 from the first cache 112 and/or the document index 114 may be accompanied by snippets of the located documents.

[0027] The query engine 106 also includes a second search controller 118, a second cache 120, a fact index 122, and a fact repository 124. In some embodiments, these components may be deployed over multiple computers to provide faster access to a large number of facts. For example, the fact repository 124 may be deployed over N servers, with a mapping function such as the "modulo N" function determining which facts are stored on each of the N servers. N may be an integer greater than 1, for instance an integer between 2 and 8196. Similarly, the fact index 122 may be distributed over multiple servers, as may the second cache 120. The second search controller 118 may also be distributed over multiple computers.

[0028] The second search controller 118 is coupled to the query server 108. The second search controller 118 is also coupled to the second cache 120, the fact index 122, and the fact repository 124. The second search controller 118 is configured to receive queries for answers to factual questions from the query server 108 and transmit them to the second cache 120 and to the fact repository 124 (via the fact index 122). The second cache 120 increases fact retrieval efficiency by temporarily storing previously located search results.
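
The role of the second cache in paragraph [0028] can be sketched as a lookup table consulted before the fact repository. This is only an illustration: the class name `CachedFactSearch`, the `search_repository` callable, and the query normalization are hypothetical, and a real cache would also evict entries, which this sketch omits.

```python
class CachedFactSearch:
    """Answer fact queries from a cache when possible, else from the repository."""

    def __init__(self, search_repository):
        # search_repository is a hypothetical callable: query string -> list of answers.
        self._search_repository = search_repository
        self._cache = {}
        self.repository_lookups = 0  # counts how often the repository was consulted

    def possible_answers(self, query: str):
        # Normalize case and whitespace so trivially different queries share an entry.
        key = " ".join(query.lower().split())
        if key not in self._cache:
            self.repository_lookups += 1
            self._cache[key] = self._search_repository(key)
        return self._cache[key]
```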

[0029] The second search controller 118 receives facts that are possible answers to a fact query from the second cache 120 and/or the fact repository 124. The second search controller 118 selects one of the possible answers as the best answer to present to the user. That answer is transmitted to the query server 108, where a response including the answer is generated and transmitted to the client 102 for presentation to the user. In response to user selection of an icon displayed at or next to the answer to the fact query, the query server 108 may identify a list of sources associated with the answer and transmit the list to the first search controller 110. The first search controller 110 accesses the documents corresponding to the sources and obtains snippets for at least a subset of the source documents. In some embodiments, the snippets include terms from the query and terms from the answer.
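
One way to picture a snippet that contains both query terms and answer terms is a fixed-width sliding window over the source text. The sketch below is an assumption for illustration, not the method the patent prescribes: the function name `make_snippet`, the window width, and the scoring rule (count of distinct matching terms) are all hypothetical, and multi-word terms are not handled.

```python
import re


def make_snippet(document: str, query_terms, answer_terms, width: int = 12):
    """Return the first best window of `width` words that contains at least one
    query term and at least one answer term, or None if no window qualifies."""
    words = document.split()
    query = {t.lower() for t in query_terms}
    answer = {t.lower() for t in answer_terms}

    def norm(word: str) -> str:
        # Strip punctuation so "Warsaw," matches the term "warsaw".
        return re.sub(r"\W+", "", word).lower()

    best, best_score = None, 0
    for start in range(max(1, len(words) - width + 1)):
        window = words[start:start + width]
        seen = {norm(w) for w in window}
        q_hits, a_hits = len(seen & query), len(seen & answer)
        if q_hits == 0 or a_hits == 0:
            continue  # a snippet must show both the question and the answer
        if q_hits + a_hits > best_score:
            best, best_score = " ".join(window), q_hits + a_hits
    return best
```

Requiring a hit from both term sets reflects the point of the snippet: the user sees the answer in the context of the question, which helps in judging whether the source really supports the answer.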

[0030] The fact repository 124 stores factual information extracted from a plurality of documents. A document from which a particular fact is extracted is a source document (or "source") of that fact. In other words, a source of a fact includes that fact within its contents. Source documents may include, without limitation, web pages. Within the fact repository 124, entities, concepts, and the like for which the fact repository 124 may store factual information are represented by objects. An object may have one or more facts associated with it. Each object is a collection of facts; an object that has no facts associated with it (an empty object) may be viewed as non-existent within the fact repository 124. Within each object, each fact associated with the object is stored as an attribute-value pair. Each fact also includes a list of source documents that include the fact within their contents and from which the fact was extracted. Further details about objects and facts in the fact repository are described below, in relation to Figure 2.

[0031] To look up information in the fact repository 124, the second search controller 118 searches the fact index 122 for the terms in the search query. This yields lists of fact repository locations (corresponding to facts or objects) that match the various terms in the search query. Using the logical structure of the search query (which may be considered a Boolean expression or tree), the second search controller 118 then forms logical combinations of these location lists to identify the possible facts, if any, that match the search query.
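
The lookup in paragraph [0031] can be sketched as set operations over per-term location lists. In this minimal Python illustration a location is simplified to an assumed `(object_id, fact_id)` pair, and only flat AND and OR queries are shown rather than a full Boolean expression tree; the function names are hypothetical.

```python
from collections import defaultdict


def build_index(facts):
    """Map each term to the set of (object_id, fact_id) locations where it occurs.

    `facts` is an iterable of (object_id, fact_id, text) tuples.
    """
    index = defaultdict(set)
    for object_id, fact_id, text in facts:
        for term in text.lower().split():
            index[term].add((object_id, fact_id))
    return index


def match_all(index, terms):
    """AND: locations that contain every term."""
    location_sets = [index.get(t.lower(), set()) for t in terms]
    if not location_sets:
        return set()
    return set.intersection(*location_sets)


def match_any(index, terms):
    """OR: locations that contain at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))
```

A nested Boolean query could be evaluated by applying these two operations recursively over the expression tree.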

[0032] The fact index 122 provides an index to the fact repository 124 and facilitates efficient lookup of information in the fact repository 124. The fact index 122 may index the fact repository 124 based on one or more parameters. For example, the fact index 122 may have an index (which may be called a main index or term index) that maps unique terms to locations within the fact repository 124. Further details about the fact index 122 are described below, in relation to Figure 3.

[0033] It should be appreciated that, although any of the components of the query engine 106 may be distributed over multiple computers, for convenience of explanation we will discuss the components of the query engine 106 as though they were implemented on a single computer.

[0034] Figure 2 illustrates an exemplary data structure for an object within the fact repository 124, according to some embodiments of the invention. As described above, the fact repository includes objects, each of which may include one or more facts. Each object 200 includes a unique identifier, such as the object ID 202. The object 200 includes one or more facts 204. Each fact 204 includes a unique identifier for that fact, such as a fact ID 210. Each fact 204 includes an attribute 212 and a value 214. For example, facts included in an object representing George Washington may include facts having the attributes "date of birth" and "date of death," whose values are the actual date of birth and date of death, respectively. A fact 204 may include a link 216 to another object; the link is an object identifier, such as the object ID 202 of another object within the fact repository 124. The link 216 allows objects to have facts whose values are other objects. For example, for an object "United States," there may be a fact with the attribute "president" whose value is "George W. Bush," with "George W. Bush" being another object in the fact repository 124. In some embodiments, the value field 214 stores the name of the linked object and the link 216 stores the object identifier of the linked object. In some other embodiments, facts 204 do not include a link field 216, because the value 214 of a fact 204 may itself store a link to another object.

[0035] Each fact 204 may also include one or more metrics 218. The metrics may provide indications of the quality of the fact. In some embodiments, the metrics include a confidence level and an importance level. The confidence level indicates the likelihood that the fact is correct. The importance level indicates the relevance of the fact to the object, compared to other facts for the same object. In other words, the importance level measures how vital a fact is to an understanding of the entity or concept represented by the object.

[0036] Each fact 204 includes a list of sources 220 that include the fact and from which the fact was extracted. Each source may be identified by a Uniform Resource Locator (URL), or web address.

[0037] In some embodiments, some facts may include an agent field 222 that identifies the module that extracted the fact. For example, the agent may be a specialized module that extracts facts from a specific source, or a module that extracts facts from free text in documents throughout the Web, and so forth.
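
The object and fact records of Figure 2 can be pictured with a pair of Python dataclasses. This is a minimal sketch: the field names mirror the reference numerals in the text, but the types, the defaults, and the helper `values_of` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fact:
    fact_id: int                       # fact ID 210
    attribute: str                     # attribute 212
    value: str                         # value 214
    link: Optional[int] = None         # link 216: object ID of another object, if any
    confidence: float = 0.0            # metrics 218: likelihood the fact is correct
    importance: float = 0.0            # metrics 218: relevance of the fact to its object
    sources: List[str] = field(default_factory=list)  # sources 220, e.g. URLs
    agent: Optional[str] = None        # agent 222: module that extracted the fact


@dataclass
class FactObject:
    object_id: int                     # object ID 202
    facts: List[Fact] = field(default_factory=list)   # facts 204

    def values_of(self, attribute: str) -> List[str]:
        """Return the values of all facts with the given attribute."""
        return [f.value for f in self.facts if f.attribute == attribute]
```

Keeping the source list on each fact, rather than on the object, is what lets a response cite the specific documents that support a particular answer.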

[0038] In some embodiments, an object 200 may have one or more specialized facts, such as a name fact 206 and a property fact 208. A name fact 206 is a fact that conveys a name for the entity or concept represented by the object 200. For example, for an object representing the country Spain, there may be a fact conveying the name of the object as "Spain." A name fact 206, being a special instance of a general fact 204, includes the same parameters as any other fact 204: it has an attribute, a value, a fact ID, metrics, sources, and so on. The attribute 224 of a name fact 206 indicates that the fact is a name fact, and the value is the actual name. The name may be a string of text. An object 200 may have one or more name facts, since many entities or concepts have more than one name. For example, an object representing Spain may have name facts conveying the country's common name "Spain" and the official name "Kingdom of Spain." As another example, an object representing the U.S. Patent and Trademark Office may have name facts conveying the agency's acronyms "PTO" and "USPTO" and the official name "United States Patent and Trademark Office."

[0039] 性质事实208是表达关于所关注对象200所表示的实体或者概念的陈述的事实。 [0039] Properties of the fact that 208 is a statement expressing concern about the fact that the entity or concept represented by the object 200. 例如,对于表示西班牙的对象,性质事实可以表达西班牙是欧洲的一个国家。 For example, Spain's object representation can express the fact that the nature of Spain is a country in Europe. 性质事实208, 作为一般事实204的特定实例,还可以包括与其它事实204相同的参数(例如:属性,值,事实ID,指标,源,等等)。 The fact that the nature of 208, as the specific examples of the general fact 204, also may include the same parameters as other facts 204 (e.g.: attribute, value, fact ID, index, source, etc.). 性质事实208的属性字段226指示该事实是性质事实,值的字段是能够表达所关注陈述的文本的字符串。 Properties of the fact property 226 of field 208 indicates the nature of that fact is the fact that the field value is stated interest capable of expressing a string of text. 例如,对于表示西班牙的对象,性质事实的值可以是文本字符串"是欧洲的一个国家"。 For example, the object of Spain, said the fact that the value of nature can be a text string "is a country in Europe." 对象200可以具有0个或更多的性质事实。 Object 200 can have 0 or more properties of the facts. [0040] 应当理解图2中示出的数据结构和上面所描述的只是实例性的。 [0040] It should be understood that Figure 2 is only exemplary of the described data structure shown above. 事实储存库124 的数据结构可以采取其它形式。 Facts repository data structure 124 may take other forms. 其它字段可以被包含在事实中,并且上面描述的其中一些字段可以省略。 Other fields may be included in facts and some of the fields described above may be omitted. 另外,每一对象除了名称事实和性质事实外都还可以具有另外的特定事实, 例如表达类型或种类的事实(例如,人,位置,电影,演员,等),用于将对象所表示的实体或者概念进行分类。 In addition, each object in addition to the fact that both the name and nature of the facts may also have additional specific facts, such as the expression of the type or types of facts (eg, people, location, movies, actors, etc.) for entity object represents or concept classification. 
In some embodiments, an object's names and/or properties are represented by special records that have a different format from the fact records 204 associated with the attribute-value pairs of the object.

[0041] FIG. 3 illustrates an exemplary fact index, according to some embodiments of the invention. As described above, the fact index 122 may index the fact repository based on one or more parameters. In some embodiments, the fact index 300 may be such an index. The fact index 300 maps unique terms to facts, or to locations of information within the fact repository 124. As used herein, a term is a word (such as "Spain" or "George") or a number (such as "123" or "-9"). In some embodiments, terms may also include terms that contain two or more words, such as "United States" or "birth date." The fact index 300 includes multiple sets 303 of terms and associated term location records, and may optionally include an index header 302 with information about the index 300 (for example, information about the size of the index, information about a mapping function used to locate the sets, etc.). Within each set 303 is a term 304 and one or more term location records 306 that identify the location of each occurrence of the term within the fact repository 124.
Each term location record has an object identifier 308 (identifying the object in which the term appears), a fact identifier 310 (identifying the fact within the object), a fact field identifier 312 (identifying the field within the fact), and a token identifier 314 (identifying the token within the field). These four fields map a term to a location in the fact repository 124. However, it should be appreciated that the fact index 300 is merely exemplary; other forms of the fact index 300 and other fact indexes are possible. In some embodiments, when a term location record 306 points to an object as a whole (for example, the term is the name of the object), the fact identifier 310, field identifier 312, and token identifier 314 may have predefined values or null values.

[0042] FIG. 4 illustrates an exemplary list of possible answers to a factual query, according to some embodiments of the invention. The second search controller 118 receives a list of one or more possible answers to a factual query from the second cache 120 or from a search of the fact repository 124, selects the best answer from the list of possible answers, and forwards the best answer to the query server 108 for further processing, further details of which are described below in relation to FIGS. 5A-5C. FIG. 4 illustrates an exemplary list of possible answers 400. The list 400 includes one or more possible answers 403. Each possible answer 403 has one or more fields.
The object ID 404 identifies the object that includes the fact that is the possible answer. The object name 406 identifies the name of the entity or concept represented by the object identified by the object ID 404. The object name 406 may be the value of a name fact included in the object (see the object data structure described above). The fact attribute 408 identifies the attribute of the fact that is the possible answer. The fact value 410 identifies the value of the fact that is the possible answer. The answer field 412 identifies which of the three fields (object name 406, fact attribute 408, or fact value 410) holds the actual answer responsive to the factual query (that is, the kind of answer the user is looking for). The QA type 414 identifies the type of question posed by the factual query (that is, the kind of question the user is asking and, implicitly, the kind of answer responsive to it). The score 416 indicates the score of the possible answer. The score is a metric that attempts to measure the quality of the possible answer as an accurate and responsive answer. The fact query 418 is the internal query, generated by the second search controller 118, that led to the identification of the possible answer. The fact query 418 is generated based on the user query (the query entered by the user at the client 102). Further details about the QA type 414, the score 416, and the fact query 418 are described below in relation to FIGS. 5A-5C.
In some embodiments, a possible answer 403 may be represented by more or fewer fields of information. In some embodiments, the list 400 includes a list header 402 that contains information applicable to the entire list 400. For example, the header 402 may include a copy of the user query, a pointer to the top entry of the list 400, or other data structures that facilitate access to the terms or records in the list 400.

[0043] FIGS. 5A-5C illustrate an exemplary process for selecting the best answer to a factual query and presenting that answer, according to some embodiments of the invention. The answer to a factual query is the fact in the fact repository 124 that is identified as the best response to the factual query. Upon receiving a factual query, the query engine 106 processes the query, identifies possible answers, selects the best answer, and generates a response that includes the answer. The query engine 106 may also generate a response that includes a list of sources of the answer.
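The fact index of FIG. 3 (paragraph [0041]) can be pictured as a mapping from each term to a list of term location records. The sketch below is one possible reading, not the patented layout: the names `TermLocation` and `build_index`, the tuple shape of the input facts, and whitespace tokenization are all assumptions for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class TermLocation(NamedTuple):
    """Hypothetical term location record: where a term occurs in the repository."""
    object_id: int
    fact_id: int
    field_id: str
    token_id: int


def build_index(facts):
    """Build a term -> [TermLocation] map.

    `facts` is an iterable of (object_id, fact_id, {field_name: text}) tuples,
    an input shape assumed only for this example.
    """
    index = defaultdict(list)
    for object_id, fact_id, fields in facts:
        for field_id, text in fields.items():
            for token_id, token in enumerate(text.lower().split()):
                index[token].append(
                    TermLocation(object_id, fact_id, field_id, token_id))
    return index


facts = [
    (1, 10, {"attribute": "capital", "value": "Warsaw"}),
    (2, 20, {"attribute": "capital", "value": "Madrid"}),
]
index = build_index(facts)
print(index["warsaw"])
```

Each record carries the four identifiers named in the text (object, fact, field, token), so a single lookup on a term yields every position at which it occurs.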

[0044] The query engine 106 receives a query (502). The query is entered by a user at the client 102 and transmitted by the client 102 to the query engine 106. The query includes one or more terms. The query as entered by the user is the user query.

[0045] The user query is processed (504). The user query is transmitted to the first search controller 110 and the second search controller 118. Because the user query includes one or more terms, it may be used as a search query for documents, such as a World Wide Web search, and passed to system components that handle such searches, such as the first search controller 110. Document searches, such as web searches, are well known in the art and need not be described further.

[0046] The user query may also be transmitted to the second search controller 118. The user query is preprocessed and analyzed to determine whether it fits any of one or more QA types. Preprocessing may include removing "stop words" (such as definite and indefinite articles and prepositions in English) and expanding words and/or phrases in the user query to include their respective synonyms or equivalents. For example, the phrase "birth date" may be expanded to include its synonyms "date birth" (without the stop word "of") and "birthday." The analysis may include parsing the user query and analyzing its text. If the user query is determined to fit any of the QA types, a fact query corresponding to the respective QA type may be generated for the user query. The fact query is a query internal to the query engine 106, used to access the second cache 120 and the fact repository 124 (via the fact index 122) to find possible answers. If the user query is determined not to fit any QA type, further processing of the user query by the second search controller 118 may be aborted; from the perspective of the second search controller 118, the user query is not a factual query. It should be appreciated that a user query may fit more than one QA type, so that more than one fact query may be generated for a single user query. Each of these fact queries is used to access the fact repository 124, the second cache 120, and the fact index 122 for possible answers.
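The preprocessing described in paragraph [0046] (stop word removal followed by synonym expansion) can be sketched as follows. The stop word list, the synonym table, and the function name `preprocess` are hypothetical; only the "birth date" example itself comes from the text.

```python
# Hypothetical stop word list: articles and prepositions, as the text suggests.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to"}

# Hypothetical synonym table keyed by a phrase (tuple of terms).
SYNONYMS = {
    ("birth", "date"): [("date", "birth"), ("birthday",)],
}


def preprocess(query):
    """Return the stop-word-free query terms plus one variant per synonym."""
    terms = [t for t in query.lower().split() if t not in STOP_WORDS]
    variants = [terms]
    for phrase, alternatives in SYNONYMS.items():
        n = len(phrase)
        for i in range(len(terms) - n + 1):
            if tuple(terms[i:i + n]) == phrase:
                for alt in alternatives:
                    variants.append(terms[:i] + list(alt) + terms[i + n:])
    return variants


for variant in preprocess("the birth date of George Washington"):
    print(variant)
```

Each returned variant could then be analyzed against the QA types, so that a query phrased with "birthday" and one phrased with "birth date" reach the same facts.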

[0047] A user query may fit one or more QA types. A QA type is a question-to-answer mapping that indicates what question the user query is asking and the kind of answer that is responsive to the fact query. In some embodiments, there are three general QA types: name and attribute to value ("NA-V"); attribute and value, or property, to name ("AV-N"); and name to property, type, or name ("N-PTN"). In some embodiments, there may be additional specialized QA types to handle specific types of questions. In some embodiments, these specialized QA types may be special instances of the general QA types.

[0048] In the NA-V type, the user (via the terms of the user query) provides an object name and an attribute and wants to know the value of the corresponding attribute of the object with the given name. An example of an NA-V type query is "what is the capital of Poland," where "Poland" is the object name and "capital" is the attribute of "Poland" whose value is desired. The answer to this query would be the value of the fact that is associated with the object named "Poland" and has the attribute "capital." In this case, the value of the fact with the attribute "capital" may be the string "Warsaw." The value may also be the object identifier of an object with the name "Warsaw," in which case the name "Warsaw" may be substituted for the object identifier and returned as a possible answer.
[0049] In the AV-N type, the user provides an attribute and a value (or a property, since a property is merely a special attribute-value pair, as described above) and wants to know the name of an object that has the given value for the given attribute. In a sense, this is a "reverse lookup." An example of an AV-N type query is "which country has Warsaw as its capital," in which case "capital" is the attribute and "Warsaw" is the value. A possible answer would be the name of the object with this attribute-value pair, namely "Poland."

[0050] In the N-PTN type, the user provides a name and wants to know a property, the type, or an alternate name of the object associated with the given name. An example of an N-PTN type query is "what is the NRA." "NRA" is the name of the object whose property, type, or alternate name the user wants to know. A possible property answer for "NRA" is "a Second Amendment rights advocacy group." A type answer, which conveys a categorization of the entity or concept represented by the object, may be "organization" for "NRA," indicating that the NRA is an organization, as opposed to other types such as a person, a book, a movie, and so on. An alternate name for "NRA" may be "National Rifle Association," which is the official name of the entity represented by the object with the name (acronym) "NRA."
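The NA-V and AV-N lookups of paragraphs [0048] and [0049] are mirror images of each other, which a small sketch makes concrete. The in-memory `FACTS` list of (name, attribute, value) triples and the function names are assumptions for illustration; a real fact repository would be reached through the fact index rather than scanned linearly.

```python
# Hypothetical miniature fact store: (object name, attribute, value) triples.
FACTS = [
    ("Poland", "capital", "Warsaw"),
    ("Spain", "capital", "Madrid"),
]


def answer_na_v(name, attribute):
    """NA-V: name and attribute are given, the value is the answer."""
    return [v for n, a, v in FACTS if n == name and a == attribute]


def answer_av_n(attribute, value):
    """AV-N: attribute and value are given, the name is the answer (reverse lookup)."""
    return [n for n, a, v in FACTS if a == attribute and v == value]


print(answer_na_v("Poland", "capital"))   # "what is the capital of Poland"
print(answer_av_n("capital", "Warsaw"))   # "which country has Warsaw as its capital"
```

The two functions differ only in which fields act as inputs and which as the output, the same grouping that the answer comparison step later relies on.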

[0051] In some embodiments, a fact query may include additional constraints. For example, a fact query may specify that a certain term may match only in a particular field and not in other fields. Another constraint may be that any possible answer must match a particular type (such as person, book, etc.). Such constraints may be generated by the second search controller 118 during the analysis and processing of the user query.

[0052] After the user query is processed and one or more fact queries are generated, the fact queries are used to access the fact repository 124 (via the fact index 122) and the second cache 120 for possible answers (506). The possible answers are the facts that match one or more of the fact queries. The possible answers are scored (508). The score of a possible answer provides an indication of its quality as an accurate and responsive answer.

[0053] In some embodiments, the score of a possible answer is the product of multiple factor values. In some embodiments, one or more of the factor values may be normalized values between 0 and 1, inclusive. The factors actually used in determining the score may vary with the QA type of the fact query that the possible answer matches. In some embodiments, because the score, being a product of factors between 0 and 1 inclusive, can stay the same or decrease toward 0 but never increase, the scoring of any particular possible answer may be aborted if the score for that answer falls below a predefined threshold. This may be an indication that the possible answer is of such poor quality that further scoring would be wasteful.
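The product-of-factors scoring with early termination described in paragraph [0053] might look like the sketch below. The function name `score_answer`, the default threshold of 0.01, and the use of `None` to signal an aborted scoring are assumptions for this example.

```python
def score_answer(factors, threshold=0.01):
    """Multiply factor values in [0, 1]; stop early once the score is hopeless.

    Returns the final score, or None if scoring was aborted because the
    running product fell below `threshold` (a hypothetical cutoff).
    """
    score = 1.0
    for factor in factors:
        if not 0.0 <= factor <= 1.0:
            raise ValueError("factor values must lie between 0 and 1 inclusive")
        score *= factor
        # The product can only stay the same or shrink, so once it is below
        # the threshold no later factor can bring it back up.
        if score < threshold:
            return None
    return score


print(score_answer([0.5, 0.5]))
print(score_answer([0.1, 0.05, 0.9]))
```

Because every factor is at most 1, the early exit never discards an answer that could have finished above the threshold.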

[0054] In some embodiments, the factors may be based on the QA type, metrics of the fact that matched the fact query (such as a confidence metric and an importance metric), the agent that extracted the matching fact, the degree to which a field in the fact matched the fact query, the degree to which particular fields in the fact match the fact query completely, and so on. It should be appreciated that the factors described above are merely exemplary; other factors may be included in addition to those described above, and some of the factors described above may be omitted.

[0055] After each possible answer is scored, the possible answers are gathered into a list of possible answers, such as the possible answers list 400 described above in relation to FIG. 4. In some embodiments, only a predefined number of top-scoring answers are gathered into the possible answers list 400. For example, the list may include only the 100 highest-scoring possible answers. In some embodiments, further processing of the possible answers list 400 is handled by the second search controller 118.

[0056] Continuing in FIG. 5B, a number of the highest-scoring possible answers are identified from the possible answers list 400 (510). The number may be a predefined number that specifies how many top-scoring answers are to be processed further. As long as there are identified top-scoring possible answers still to be processed (512-no), the next top-scoring answer is processed. The processing involves identifying supporting answers for the respective top-scoring answer (514) and determining a supporting score for the respective top-scoring answer based on the score of that answer and the scores of its supporting answers (516). The identification of supporting answers is described in further detail below.

[0057] In some embodiments, the supporting score is determined by converting the score of each top-scoring answer and the scores of its supporting answers to odds space values. A score s is converted to an odds space value x.

[0059] The converted values (that is, the odds space conversions of the scores) are summed to generate a value X, and the sum X is converted back to a probability space value to obtain the supporting score S for the top-scoring answer.
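The conversion formulas themselves are not reproduced in this text, so the sketch below assumes the conventional mapping between a probability and its odds, x = s / (1 - s), and its inverse, S = X / (1 + X). Under that assumption, the combination step of paragraphs [0057] and [0059] can be written as follows; the function names are hypothetical.

```python
def to_odds(s):
    """Probability-space score -> odds-space value (assumed: x = s / (1 - s))."""
    if not 0.0 <= s < 1.0:
        raise ValueError("score must satisfy 0 <= s < 1 for a finite odds value")
    return s / (1.0 - s)


def to_probability(x):
    """Odds-space value -> probability-space value (assumed: S = X / (1 + X))."""
    return x / (1.0 + x)


def supporting_score(scores):
    """Combine an answer's score with the scores of its supporting answers."""
    total_odds = sum(to_odds(s) for s in scores)
    return to_probability(total_odds)


# An answer scored 0.5 with one supporting answer also scored 0.5.
print(supporting_score([0.5, 0.5]))
```

Summing in odds space means that each additional supporting answer raises the combined score while keeping it strictly below 1, which a plain sum of probabilities would not guarantee.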

[0061] After the supporting scores are determined for the identified top-scoring answers (512-yes), the top-scoring answer with the highest supporting score (hereinafter the "best supported answer") is identified (518). For the best supported answer, the top-scoring answers in the possible answers list that are contradictory to the best supported answer are identified (520). For each such contradictory answer, a contradicting score C, which is the supporting score of that contradictory answer, is determined (522). In addition, in some embodiments, the top-scoring answers in the possible answers list that are unrelated to the best supported answer are identified (524). For each such unrelated answer, an unrelated score U, which is the supporting score of that unrelated answer, is determined (526). It should be appreciated that the process for determining the contradicting score C and the unrelated score U is similar to the process for determining S: supporting answers are identified, the scores are converted to odds space values, the odds space values are summed, and the sum is converted back to a probability space value. The determination of contradictory and unrelated answers is described below.

[0062] Whether two possible answers are supporting, contradictory, or unrelated is determined by comparing the fields of the two answers. The fields of interest in each answer, namely the name, attribute, and value, are grouped into inputs and outputs. For example, in some embodiments, for an NA-V type query, the inputs are the name and the attribute, and the output is the value. For an AV-N type query, the inputs are the attribute and the value, and the output is the name. Two possible answers are compared by pairwise comparison of their input fields and their output fields. The comparison takes into account the type of data in the fields, that is, whether the data in a field is a string of words, a date, a number, and so forth. The sources of the answers may also be taken into account.

[0063] In some embodiments, the result of a pairwise field comparison is one of five classifications. They are:

[0064] • not comparable: the fields have different data types (for example, a string of words versus a date) and thus cannot be compared;

[0065] • not similar: the fields are of the same data type but are not the same at all;

[0066] • somewhat similar: the fields have some similarity, but it is difficult to conclude whether they refer to the same thing;

[0067] • very similar: the fields are nearly the same; and

[0068] • identical: the fields are exactly the same.

[0069] The actual determination of whether fields are the same may differ by data type. For example, for numbers, if the numbers are small integers, then they must be exactly equal to be treated as the same. If the numbers are very large integers or floating point numbers, then they may be treated as the same if they are within a certain percentage of each other.

[0070] Based on the pairwise field comparisons, the relationship between two answers may be classified:

[0071] • two answers are classified as "complementary" if the answers came from the same source. An answer that is complementary to answer A is ignored;

[0072] • two answers are classified as "may support" (that is, answer A "may support" answer B) if the answers have the same or very similar inputs but the outputs are only somewhat similar. An answer A that "may support" answer B is ignored;

[0073] • two answers are classified as "supporting" if the answers have the same or very similar inputs and the same or very similar outputs, unless the two answers came from the same source. The scores of "supporting" answers are part of the determination of the supporting score;

[0074] • two answers are "contradictory" if the inputs are the same or very similar but the outputs are not similar or not comparable; and

[0075] • two answers are "unrelated" if the inputs are not similar or not comparable.

[0076] The supporting score S of the best supported answer is compared against a predefined threshold T (528). The threshold T is the minimum score that the supporting score S must reach if the best supported answer is to be considered further. If S is less than or equal to T (528-no), then the processor performing the processing illustrated in FIG. 5B (for example, the second search controller 118 or the query engine 106) may generate a response indicating that the query engine 106 is unable to provide an answer (534). For example, the second search controller 118 may transmit a response to the query server 108 indicating that an answer is unavailable, and the query server 108 may generate a response to that effect and transmit it to the client 102 for presentation to the user.
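The relationship rules of paragraphs [0071] through [0075] can be expressed as a small decision function over the five field-comparison classes. This is a sketch under assumptions: the enum `Sim`, the function `classify`, and the string labels are hypothetical, and the case in which the inputs are only "somewhat similar" is not addressed by the text, so the sketch returns "unspecified" for it.

```python
from enum import Enum


class Sim(Enum):
    """The five pairwise field-comparison results."""
    NOT_COMPARABLE = 0
    NOT_SIMILAR = 1
    SOMEWHAT_SIMILAR = 2
    VERY_SIMILAR = 3
    IDENTICAL = 4


CLOSE = {Sim.VERY_SIMILAR, Sim.IDENTICAL}
FAR = {Sim.NOT_COMPARABLE, Sim.NOT_SIMILAR}


def classify(input_sim, output_sim, same_source):
    """Classify the relationship between two answers."""
    if same_source:
        return "complementary"       # ignored when scoring
    if input_sim in FAR:
        return "unrelated"
    if input_sim in CLOSE:
        if output_sim in CLOSE:
            return "supporting"      # contributes to the supporting score
        if output_sim is Sim.SOMEWHAT_SIMILAR:
            return "may support"     # ignored when scoring
        return "contradictory"
    return "unspecified"             # inputs only somewhat similar: not covered


print(classify(Sim.IDENTICAL, Sim.VERY_SIMILAR, same_source=False))
print(classify(Sim.IDENTICAL, Sim.NOT_SIMILAR, same_source=False))
```

Checking the source first reflects the rule that two answers from the same source are never counted as supporting each other, however similar their fields are.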

[0077] If S is greater than T (528-yes), a check is made as to whether the supporting score S of the best supported answer exceeds the best supporting score C of a contradictory answer by at least a first predefined margin. In one embodiment, this check is made by comparing S with the contradicting score C multiplied by a constant α (530). The constant α represents the minimum ratio of S to C that must be achieved for the best supported answer to be selected as the best answer to the factual query. In other words, S must be at least α times the contradicting score C. If S is less than αC (530-no), then the processor performing the processing illustrated in FIG. 5B (for example, the second search controller 118 or the query engine 106) may generate a response indicating that the query engine 106 is unable to provide an answer (534).

[0078] If S is equal to or greater than αC (530-yes), another check is made as to whether the supporting score S of the best supported answer exceeds the best supporting score U of an unrelated answer by at least a second predefined margin. In one embodiment, this check is made by comparing S with the unrelated score U multiplied by a constant β (532). The constant β represents the minimum ratio of S to U that must be achieved before the best supported answer can be selected as the best answer to the factual query. In other words, S must be at least β times the unrelated score U. If S is less than βU (532-no), then the processor performing the processing illustrated in FIG. 5B (for example, the second search controller 118 or the query engine 106) may generate a response indicating that the query engine 106 is unable to provide an answer (534). If S is equal to or greater than βU (532-yes), then the best supported answer is selected as the answer to the factual query and is processed further, further details of which are described below in relation to FIG. 5C.
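The three checks of steps 528, 530, and 532 reduce to a short predicate. The sketch below follows the comparisons as stated in the text; the function name `select_answer` and the example values of T, α, and β are assumptions, since the text describes these only as predefined constants.

```python
def select_answer(S, C, U, T, alpha, beta):
    """Return True if the best supported answer should be returned.

    S: supporting score of the best supported answer
    C: best supporting score among contradictory answers
    U: best supporting score among unrelated answers
    T, alpha, beta: predefined threshold and minimum ratios
    """
    if S <= T:            # step 528: must exceed the minimum score
        return False
    if S < alpha * C:     # step 530: must be at least alpha times C
        return False
    if S < beta * U:      # step 532: must be at least beta times U
        return False
    return True


# Hypothetical constants chosen only for illustration.
print(select_answer(S=0.9, C=0.2, U=0.1, T=0.5, alpha=2.0, beta=2.0))
print(select_answer(S=0.9, C=0.6, U=0.1, T=0.5, alpha=2.0, beta=2.0))
```

A False result corresponds to step 534, where the response indicates that no answer can be provided.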

[0079] Continuing in FIG. 5C, after the best supported answer is selected as the best answer to the factual query, the query server 108 generates a response (536). The response includes the best supported answer. The response may include an identifier and/or a hyperlink (for example, a URL) for a source of the best supported answer. In some embodiments, the response may also include a link that, when clicked by the user at the client 102, generates a request for a list of the sources of the best supported answer. In some embodiments, the response may also include the results of a document search, such as a web search, based on the user query. The document search results may be transmitted to the query server 108 from the components of the query engine 106 that handle such searches, such as the first search controller 110. The response is transmitted to the client 102 for presentation to the user (538). An exemplary response that includes an answer and the results of a document search using the user query is described in further detail below in relation to FIG. 6.

[0080] After seeing the response presented at the client 102, the user may request a list of the sources of the answer. In some embodiments, the user may make the request by clicking the link included in the response (as described above), which generates a request for the list of sources when clicked.

[0081] The query engine 106 receives the request for a list of the sources of the answer (540). The sources of the answer are identified (542). In some embodiments, the sources may be identified by looking up the sources 220 (FIG. 2) of the answer fact in the fact repository 124. A snippet generation request is sent to the first search controller 110, along with the list of sources, the user query, the fact query 418 that matched the answer, and the answer. The snippet generation request is submitted to the first cache 112, the document index 114, and/or the document database 116. In some embodiments, if the list of sources is longer than a predefined limit, a subset of the list of sources may be selected by the first search controller 110 and submitted to the first cache 112, the document index 114, and/or the document database 116. The first cache 112, the document index 114, and/or the document database 116, or one or more processors to which the snippet generation request is submitted, generate a snippet for each of the listed sources (544). Each snippet may include one contiguous portion of text or multiple non-contiguous portions of text from the respective source. For a particular snippet, if the text portions chosen for inclusion in the snippet are not contiguous within the source, the portions may be separated by ellipses.

[0082] Each snippet is generated such that it includes as many terms of the user query and/or the fact query as possible and as many terms of the answer as possible. The source may be analyzed for the scatter of query terms and answer terms (that is, how scattered the query terms and answer terms are in the source document) to assist in generating the snippet. The text portion or portions that yield the least scatter of query terms and answer terms are selected for inclusion in the snippet.
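One simple way to approximate the snippet selection of paragraph [0082] is to slide a fixed-size window over the source text and keep the window that contains the most distinct query and answer terms. This is a deliberately simplified stand-in for the scatter analysis described above, not the patented method; the function name `best_snippet`, the window size, and the punctuation handling are assumptions.

```python
def best_snippet(text, terms, window=8):
    """Return the `window`-word span of `text` covering the most distinct terms."""
    words = text.split()
    wanted = {t.lower() for t in terms}
    best_start, best_hits = 0, -1
    for start in range(max(1, len(words) - window + 1)):
        chunk = words[start:start + window]
        # Strip surrounding punctuation so "Warsaw." still matches "warsaw".
        found = {w.strip('.,;:!?"').lower() for w in chunk} & wanted
        if len(found) > best_hits:
            best_start, best_hits = start, len(found)
    return " ".join(words[best_start:best_start + window])


source = ("Poland is a country in Europe. Its largest city and "
          "capital is Warsaw on the Vistula.")
print(best_snippet(source, ["capital", "Poland", "Warsaw"], window=5))
```

A fuller implementation could return several non-contiguous windows joined by ellipses when no single window covers all of the query and answer terms.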

[0083] A response that includes the snippets is generated (546). The response includes the list of sources and a snippet for each source, the snippets including the user/fact query terms and the answer terms. The response may also include the answer, the user query, and a hyperlink to each source. The response is transmitted to the client 102 for presentation to the user (548).

[0084] In some embodiments, when the response is presented to the user, the user/fact query terms and answer terms in each snippet are highlighted to make them more prominent. As used herein, highlighting a term in a snippet refers to any manner of making the term more prominent when presented to the user, including but not limited to: bolding the term, underlining the term, italicizing the term, changing the font color of the term, and/or adding a background color to the local area of the term. An example response including a list of sources and snippets is described below with reference to FIG. 7.
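
One of the highlighting styles listed above, bolding, might be sketched as follows. This is an assumption-laden illustration: it presumes the snippet is rendered as HTML, matches terms case-insensitively on word boundaries, and uses a hypothetical function name `highlight_terms`. Other styles (underline, italics, color) would only change the tags passed in.

```python
import html
import re


def highlight_terms(snippet, terms, open_tag="<b>", close_tag="</b>"):
    """Wrap each occurrence of a query or answer term in highlight tags."""
    terms = [t for t in terms if t]
    if not terms:
        return html.escape(snippet)

    # Longest terms first so that longer terms win over their prefixes.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)

    pieces = []
    pos = 0
    for match in pattern.finditer(snippet):
        # Escape the untouched text and the matched term separately,
        # so the inserted tags are the only markup in the output.
        pieces.append(html.escape(snippet[pos:match.start()]))
        pieces.append(open_tag + html.escape(match.group(0)) + close_tag)
        pos = match.end()
    pieces.append(html.escape(snippet[pos:]))
    return "".join(pieces)
```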

[0085] In some embodiments, a query submitted to the query engine 106 may be represented by a URL that includes the user query terms and one or more other parameters. For example, a query for the terms "britney spears parents" may be represented by the URL "http://www.google.com/search?hl=en&q=britney+spears+parents". In some embodiments, a request to display a list of sources for the answer may be made by adding an additional parameter, such as "&fsrc=1", to the query URL. Thus, for the URL above, if a list of sources for the answer to the query "britney spears parents" is desired, the query URL may be "http://www.google.com/search?hl=en&q=britney+spears+parents&fsrc=1". In some embodiments, a request for the list of sources for the answer is triggered when the user clicks a link included in the response that contains the answer; the link is the query URL for the user query with the additional parameter added.

[0086] In some other embodiments, the query engine 106 may receive, along with the fact query, a predefined special operator that instructs the query engine 106 to find an answer to the fact query and to return the answer and a list of sources for the answer, without first returning a list of documents found using the query as input to a document search. For example, the user may enter "Z:X of Y", where "Z:" is the special operator, to instruct the query engine 106 to find an answer to the fact query "X of Y" and a list of sources for that answer. In a sense, using the operator with the query merges the query with the request for a list of sources for the answer found for that query. In some embodiments, the link in the response that triggers a request for the list of sources for the answer, as described above, when selected (e.g., clicked) by the user, adds the special operator to the original query and submits the query with the special operator to the query engine 106.
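
The two ways of requesting a source list described in paragraphs [0085] and [0086], an extra URL parameter and a special operator prefix, can be illustrated with a short sketch. The parameter names `hl`, `q` and `fsrc` and the operator "factsources:" come from the examples in this description; the function names are hypothetical, and the parsing shown is only one plausible way a query front end might detect the request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://www.google.com/search"


def build_query_url(terms, want_sources=False):
    """Encode a user query as a URL, optionally asking for the source list."""
    params = [("hl", "en"), ("q", terms)]
    if want_sources:
        params.append(("fsrc", "1"))
    return BASE_URL + "?" + urlencode(params)


def wants_source_list(url):
    """True if the query URL carries the source-list request parameter."""
    return parse_qs(urlsplit(url).query).get("fsrc") == ["1"]


def split_operator(raw_query, operator="factsources:"):
    """Separate a leading special operator from the fact query, if present."""
    if raw_query.lower().startswith(operator):
        return True, raw_query[len(operator):].strip()
    return False, raw_query.strip()
```

For instance, `build_query_url("britney spears parents", want_sources=True)` reproduces the second example URL above, and `split_operator("factsources:Britney spears parents")` yields the flag together with the original query.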

[0087] FIG. 6 illustrates an example response to a fact query, as presented to the user at the client 102, including an answer and document search results obtained using the fact query as input, according to some embodiments of the invention. The response 600 may show a search box 602 with the original user query. The response 600 includes the answer 604 to the query, a hyperlink to a source 606 of the answer, and a link 608 that, when clicked by the user, triggers a request for a list of sources for the answer. In some embodiments, the link 608 may be the query URL for the user query with the source list request parameter added, as described above. In some embodiments, if the answer fact 604 has only one source in the fact repository 124, the link that triggers a request for the list of sources for the answer may be omitted from the response 600. The response may also include a list of document search results 610, such as results of a World Wide Web search, obtained using the fact query as input.

[0088] FIG. 7 illustrates an example response to a request for a list of sources for an answer, according to some embodiments of the invention. The response 700 includes a search box 702 with the original user query. In some embodiments, the search box 702 may also include the special operator which, as described above, may be used together with the fact query to request a list of sources. For example, in the search box 702, "factsources:" is the special operator and "Britney spears parents" is the original user query. In some other embodiments, the operator may be omitted from the representation of the query in the search box, for example if the request for the list of sources was triggered by the user clicking a link, such as the link 608, that includes the query URL with the source list request parameter. The response may also include the answer 704 to the fact query and a list of one or more sources 706 for the answer, along with a URL, a hyperlink, and a snippet 708 for each source. In some embodiments, the query terms and answer terms in each snippet 708 may be highlighted. In the snippets 708 shown, the query terms and answer terms are highlighted by bolding.

[0089] FIG. 8 is a block diagram illustrating a fact query answering system 800 according to some embodiments of the invention. The system 800 typically includes one or more processing units (CPUs) 802, one or more network or other communications interfaces 810, memory 812, and one or more communication buses 814 for interconnecting these components. The system 800 optionally includes a user interface 804 comprising a display device 806 and a keyboard/mouse 808. The memory 812 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices, and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. The memory 812 may optionally include one or more storage devices located remotely from the CPU(s) 802. In some embodiments, the memory 812 stores the following programs, modules and data structures, or a subset thereof:

[0090] • an operating system 816 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;

[0091] • a query receipt and processing module 818 for receiving queries and processing queries, such as parsing a query to determine the QA type and generating fact queries;

[0092] • an answer identification module 820 for identifying possible answers to fact queries;

[0093] • an answer scoring module 822 for determining scores and support scores for answers;

[0094] • an answer comparison module 824 for comparing answers to determine whether they are supportive, contradictory, and so forth;

[0095] • an answer selection module 825 for selecting a possible answer as the answer to provide to the user;

[0096] • a source identification module 826 for identifying the sources of an answer;

[0097] • a document index interface 828 for interfacing with the document index when searching for documents;

[0098] • a document storage interface 830 for interfacing with the document storage system when requesting and receiving snippets;

[0099] • a fact index interface 832 for interfacing with the fact index when searching for facts;

[0100] • a fact storage interface 834 for interfacing with the fact storage system; and

[0101] • a response generation module 838 for generating the response to be transmitted to the client 102.

[0102] In some embodiments, the memory 812 of the system 800 includes the fact index instead of the fact index interface 832. The system 800 also includes a document storage system 840 for storing the contents of documents, some of which may serve as sources for answer facts. The document storage system includes a snippet generator 842 for accessing the contents of documents and generating snippets from those contents, and a snippet term highlighting module 836 for highlighting query terms and answer terms within snippets. The system 800 also includes a fact storage system 844 for storing facts. Each fact stored in the fact storage system 844 includes a corresponding list of sources from which the respective fact was extracted.
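
The relationship between a stored fact and its list of sources can be pictured with a minimal in-memory sketch. This is purely illustrative: the names `Fact`, `FactStore` and `sources_for`, the attribute/value layout, and the use of URLs as source identifiers are assumptions, not details given for the fact storage system 844.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Fact:
    fact_id: str
    attribute: str
    value: str
    # Identifiers (assumed here to be URLs) of the documents
    # from which this fact was extracted.
    sources: List[str] = field(default_factory=list)


class FactStore:
    """Toy stand-in for a fact storage system keyed by fact identifier."""

    def __init__(self) -> None:
        self._facts: Dict[str, Fact] = {}

    def add(self, fact: Fact) -> None:
        self._facts[fact.fact_id] = fact

    def sources_for(self, fact_id: str) -> List[str]:
        """Return a copy of the source list recorded for a fact."""
        return list(self._facts[fact_id].sources)
```

Keeping the source list on the fact itself means that identifying the sources of an answer reduces to a single lookup, which matches the lookup-based source identification described earlier for step 542.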

[0103] Each of the above-identified elements is stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above-identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various embodiments. In some embodiments, the memory 812 may store a subset of the modules and data structures identified above. Furthermore, the memory 812 may store additional modules and data structures not described above.

[0104] Although FIG. 8 shows a fact query answering system, FIG. 8 is intended as a functional description of the various features, which may be provided as a set of services, rather than as a structural schematic of the embodiments described herein. In practice, as those of ordinary skill in the art will recognize, items shown separately could be combined and some items could be separated. For example, some items shown separately in FIG. 8 could be implemented on a single server, and a single item could be implemented by one or more servers. The actual number of servers used to implement the fact query answering system, and how the features are allocated among them, will vary from one implementation to another, and may depend in part on the amount of data traffic that the system must handle during peak usage periods as well as during average usage periods.

[0105] The foregoing description, for purposes of explanation, has been given with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated.

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
US2003/0069880 | Title not available
US2004/0255237 | Title not available
US6584464 | 19 Mar 1999 | 24 Jun 2003 | Ask Jeeves, Inc. | Grammar template query system
Classifications
International Classification: G06F17/30
Cooperative Classification: Y10S707/99933, Y10S707/99934, Y10S707/99935, G06F17/30864
European Classification: G06F17/30W1
Legal Events
Date | Code | Event
21 May 2008 | C06 | Publication
16 Jul 2008 | C10 | Request of examination as to substance
23 Jun 2010 | C14 | Granted