Publication number: US20070112748 A1
Publication type: Application
Application number: US 11/281,291
Publication date: 17 May 2007
Filing date: 17 Nov 2005
Priority date: 17 Nov 2005
Also published as: CN1967535A, CN100594495C, US9495349
Inventors: Robert Angell, Stephen Boyer, James Cooper, Richard Hennessy, Tapas Kanungo, Jeffrey Kreulen, David Martin, James Rhodes, W. Spangler, Herschel Weintraub
Original Assignee: International Business Machines Corporation
System and method for using text analytics to identify a set of related documents from a source document
US 20070112748 A1
Abstract
A system and method for processing a document to generate a set of related documents. A system is provided that includes a textual analytics system that analyzes unstructured data contained in a source document and extracts a set of structured information about the source document; and a compare system that identifies a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
Claims(27)
1. A document processing system, comprising:
a textual analytics system that analyzes unstructured data contained in a source document and extracts a set of structured information about the source document; and
a compare system that identifies a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
2. The document processing system of claim 1, wherein the set of structured information comprises key words associated with a technology field.
3. The document processing system of claim 1, wherein the set of structured information comprises a list of chemical abstract numbers.
4. The document processing system of claim 1, wherein the set of structured information comprises a list of SMILES (simplified molecular input line entry specification) strings.
5. The document processing system of claim 1, wherein the source document comprises a patent document and the set of related documents comprise technical references.
6. The document processing system of claim 1, wherein the source document comprises a medical record and the set of related documents comprise technical references.
7. The document processing system of claim 1, further comprising an annotation system for annotating the source document with metadata associated with the set of related documents.
8. The document processing system of claim 7, further comprising:
a database of annotated documents; and
a data mining system for mining the database of annotated documents.
9. The document processing system of claim 1, wherein the metadata is contained in a database of MedLine abstracts.
10. The document processing system of claim 1, further comprising an aggregation and ranking system for prioritizing the set of related documents.
11. A computer program product stored on a computer readable medium for processing a content source, comprising:
program code configured for analyzing unstructured data contained in the content source and for extracting a set of structured information about the content source; and
program code configured for identifying a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
12. The computer program product of claim 11, wherein the set of structured information comprises key words associated with a technology field.
13. The computer program product of claim 11, wherein the set of structured information comprises a list of chemical abstract numbers.
14. The computer program product of claim 11, wherein the set of structured information comprises a list of SMILES (simplified molecular input line entry specification) strings.
15. The computer program product of claim 11, wherein the content source comprises a patent document and the set of related documents comprise technical references.
16. The computer program product of claim 11, wherein the content source is selected from the group consisting of: a medical record, a Web page, a multimedia input, a technical reference, and a publication.
17. The computer program product of claim 11, further comprising program code configured for annotating the content source with metadata associated with the set of related documents.
18. The computer program product of claim 17, further comprising:
program code configured for storing an annotated content source in a database of annotated documents; and
program code configured for data mining the database of annotated content sources.
19. The computer program product of claim 11, wherein the metadata is contained in a database of MedLine abstracts.
20. The computer program product of claim 11, further comprising program code configured for prioritizing the set of related documents.
21. A method of processing a source document, comprising:
analyzing unstructured data contained in the source document;
extracting a set of structured information about the source document; and
identifying a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
22. The method of claim 21, wherein the set of structured information comprises information selected from the group consisting of: key words associated with a technology field, a list of chemical abstract numbers, and a list of SMILES (simplified molecular input line entry specification) strings.
23. The method of claim 21, wherein the source document comprises a document selected from the group consisting of: a patent document, a Web page, a medical record, a technical reference, and a publication.
24. The method of claim 21, further comprising the step of annotating the source document with metadata associated with the set of related documents.
25. The method of claim 21, wherein the metadata is contained in a database of MedLine abstracts.
26. The method of claim 21, further comprising the step of prioritizing the set of related documents.
27. A method for deploying an application for processing a document, comprising:
providing a computer infrastructure being operable to:
analyze unstructured data contained in the content source and for extracting a set of structured information about the content source; and
identify a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
Description
    BACKGROUND OF THE INVENTION
  • [0001]
    1. Technical Field
  • [0002]
    The present invention relates generally to using text analytics to identify a set of documents from a source document, and more specifically relates to a system and method for using text analytics on a technical reference such as a patent, along with a MeSH database, to identify a set of related references.
  • [0003]
    2. Related Art
  • [0004]
    Recent years have seen explosive growth in the field of biotechnology, where discoveries can be worth hundreds of millions of dollars to the entities that own the rights to them. An ongoing challenge, however, is the tremendous cost of the research and development that is typically required. Given the dollar figures involved, companies must have a full understanding of the technology landscape for a particular biotechnology field.
  • [0005]
    Much of the technology landscape for a particular field can be gleaned from technical references, such as patent references and other scientific articles. From such references, one can determine the current state of the art, what technology is proprietary, what technology is in the public domain, etc. One of the challenges, however, involves quickly and efficiently locating relevant references that relate to a technological endeavor.
  • [0006]
    In many cases, the researcher may have an initial document, e.g., a patent, a journal article, a patient record, etc., and would like to find a superset of technical references that are related to the initial document. Various methodologies are known for searching for technical references. A common approach involves word searching, in which key words are entered into a database to identify references that include the key words. Other approaches involve utilizing classification data. For instance, in the case of patents, related patents may be identified based on the classification and sub-classification codes that are designated to each patent. In even a further approach, investigators can examine the list of references cited in the initial document.
  • [0007]
    While each of these techniques is useful, each has clear limitations. Word searching is limited because different writers often refer to similar concepts using different terms, so a key word search generates many useless results. Furthermore, in the case of patents, the number of patents that share the same classification/sub-classification codes can be very large, and those patents do not always include the relevant features being searched. Conversely, the list of cited references in a technical document is typically short and can only point to preexisting references; it may provide a good starting point, but is almost certainly not comprehensive.
  • [0008]
    Accordingly, there are currently significant limitations involved in searching and analyzing technical references when trying to understand the technology landscape of a particular field of study.
  • [0009]
    Fortunately, non-patent literature in the biotechnology field is somewhat more user-friendly. Over the years, the US National Library of Medicine (NLM) has developed the Unified Medical Language System (UMLS) for the international harmonization of medical information and for improving access to medical and scientific literature. The objective of the UMLS (http://umls.nlm.nih.gov/) is to help researchers intelligently retrieve and integrate information from a wide range of disparate electronic biomedical information sources. It can be used to overcome variations in the way similar concepts are expressed in different sources. This makes it easier for users to link information from patient record systems, bibliographic databases, factual databases, expert systems, etc.
  • [0010]
    The UMLS knowledge services can also assist in data creation and in indexing publications. Part of the UMLS consists of the Medical Subject Headings (MeSH) codes, which serve as the basis for building ontologies important for the classification of the scientific literature. To this end, the NLM has a full-time staff who methodically index millions of scientific publications in practically all of the recognized scientific journals. This forms the basis of national resources such as MedLine (as well as other databases). When the NLM indexers classify and index these journals, they do so using the MeSH ontology and in so doing create an extremely valuable set of metadata that describes the articles being indexed. For example, the indexers typically read the articles and make a list of all chemicals that are mentioned in the articles (i.e., the chemical file).
  • [0011]
    At the highest level, the indexers use a variety of MeSH qualifier codes to determine if the article being indexed is about chemicals, surgery, genetics, etc. At the more granular level, they classify the articles via an extensive system of concept codes, which number more than 750,000. This serves as a rich source of metadata for further classifying and indexing other content.
  • [0012]
    Unfortunately, there is no automated mechanism that allows a user to find related technical references for an inputted document (e.g., patent document, newspaper article, patient record, etc.) that is not indexed by the NLM or other similar metadata database. Accordingly, a need exists for a system that can identify a superset of technical references for an inputted reference.
  • SUMMARY OF THE INVENTION
  • [0013]
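    The present invention addresses the above-mentioned problems, as well as others, by providing a system and method for using text analytics to identify a set of related documents from a source document.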
  • [0014]
    In a first aspect, the invention provides a document processing system, comprising: a textual analytics system that analyzes unstructured data contained in a source document and extracts a set of structured information about the source document; and a compare system that identifies a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
  • [0015]
    In a second aspect, the invention provides a computer program product stored on a computer readable medium for processing a content source, comprising: program code configured for analyzing unstructured data contained in the content source and for extracting a set of structured information about the content source; and program code configured for identifying a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
  • [0016]
    In a third aspect, the invention provides a method of processing a source document, comprising: analyzing unstructured data contained in the source document; extracting a set of structured information about the source document; and identifying a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
  • [0017]
    In a fourth aspect, the invention provides a method for deploying an application for processing a document, comprising: providing a computer infrastructure being operable to: analyze unstructured data contained in the content source and for extracting a set of structured information about the content source; and identify a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
  • [0018]
    In a fifth aspect, the invention provides computer software embodied in a propagated signal for implementing an application for processing a document, the computer software comprising instructions to cause a computer to perform the following functions: analyze unstructured data contained in the source document; extract a set of structured information about the source document; and identify a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • [0019]
    These and other features of this invention will be more readily understood from the following detailed description of the various aspects of the invention taken in conjunction with the accompanying drawings in which:
  • [0020]
    FIG. 1 depicts a computer system having a document processing system in accordance with an embodiment of the present invention.
  • [0021]
    FIG. 2 depicts a search engine for searching annotated documents in accordance with an embodiment of the present invention.
  • DETAILED DESCRIPTION OF THE INVENTION
  • [0022]
    Referring now to the drawings, FIG. 1 depicts a computer system 10 having a document processing system 18 that analyzes an inputted source document 28 and generates a set of related documents 30. In addition, document processing system 18 may also generate an annotated document 32 that includes metadata 34 used to identify the set of related documents 30. The annotated document 32 may be stored in an annotated documents database 40 (i.e., with other annotated documents). The set of related documents 30 comprises a list of publications that are somehow related or relevant to the inputted source document 28.
  • [0023]
    It is understood that source document 28 may comprise any type of document, but generally comprises “unstructured information.” The generated set of related documents 30 may comprise any documents that can be identified via a metadata database 36. For example, in one illustrative embodiment, source document 28 may comprise a biotechnology related patent document that discloses a particular genetic sequence, and the set of related documents 30 comprises a list of biotechnology references (i.e., journal articles, etc.) that discuss the particular genetic sequence. In another embodiment, source document 28 may comprise a patient record that discloses a particular condition or disease, and the set of related documents 30 may include scientific articles relevant to the condition or disease.
  • [0024]
    In still a further embodiment, rather than inputting a source document 28, document processing system 18 may input any type of content source that contains unstructured information. Illustrative content sources may include multimedia data such as audio files, video data, images, streaming data, Web pages, etc.
  • [0025]
    To generate the set of related documents 30, document processing system 18 includes a textual analytics system 20 for extracting “structured information,” including key words, such as chemical names, diseases, genes, etc., from the source document 28; a compare system 22 for matching the structured information with metadata stored in metadata database 36 to locate the set of related documents 30; an aggregation and ranking system 24 for aggregating and ranking the set of related documents 30 and/or associated metadata/structured information; and an annotation system 26 for generating an annotated document 32 that includes metadata 34.
  • [0026]
    Textual analytics system 20 provides a system for analyzing unstructured information in order to generate a set of structured information. Textual analytics system 20 may for instance be implemented with the IBM™ Unstructured Information Management Architecture (UIMA). Structured information may be characterized as information whose intended meaning is unambiguous and explicitly represented in the structure or format of the data. The canonical example of structured information is a relational database table. Unstructured information may be characterized as information whose intended meaning is only loosely implied by its form and therefore requires interpretation in order to approximate and extract its intended meaning. Examples include natural language documents, speech, audio, still images, Web pages and video. It is estimated that 80 percent of all corporate information is unstructured.
  • [0027]
    In analyzing unstructured content, Unstructured Information Management (UIM) applications make use of a variety of technologies including statistical and rule-based natural language processing (NLP), information retrieval, machine learning, ontologies, and automated reasoning. UIM applications may consult structured sources to help resolve the semantics of the unstructured content. For example, a database of chemical names can help in focusing the analysis of medical abstracts. A UIM application generally produces structured information resources that unambiguously represent content derived from unstructured information input. These structured resources can then be made accessible through a set of application-appropriate access methods. A simple example is a search index and query processor that makes documents quickly accessible by topic and ranks them according to their relevance to key concepts specified by the user. A more complex example is a formal ontology and inference system that, for example, allows the user to explore the concepts, their relationships, and the logical implications contained in a collection consisting of millions of documents.
  • [0028]
    Textual analytics system 20 may be implemented to identify structured information about a particular technology field (e.g., life sciences) including key words, such as chemical names, diseases, genes, molecules, etc., from the source document 28. Other information, such as a list of chemical abstract (CAS) numbers and a list of SMILES (“simplified molecular input line entry specification,” which is a specification for unambiguously describing the structure of chemical molecules using short ASCII alpha-numeric strings) may also be derived by textual analytics system 20 from the source document 28.
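    As a rough illustration of the kind of extraction described above, the following Python sketch pulls candidate CAS numbers out of free text with a regular expression and keeps only those that satisfy the standard CAS check-digit rule. It is not the UIMA-based implementation the description refers to; the function names `is_valid_cas` and `extract_cas_numbers` are hypothetical and chosen only for this example.

```python
import re

# Candidate CAS Registry Numbers: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"\b(\d{2,7})-(\d{2})-(\d)\b")


def is_valid_cas(candidate: str) -> bool:
    """Return True if the candidate passes the CAS check-digit rule."""
    match = CAS_PATTERN.fullmatch(candidate)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the body digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit


def extract_cas_numbers(text: str) -> list[str]:
    """Return unique, checksum-valid CAS numbers in order of first appearance."""
    found: list[str] = []
    for match in CAS_PATTERN.finditer(text):
        candidate = match.group(0)
        if is_valid_cas(candidate) and candidate not in found:
            found.append(candidate)
    return found
```

    The checksum filter is a cheap way to discard strings that merely look like CAS numbers (dates, part numbers), which matters when the input is unstructured text.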
  • [0029]
    Compare system 22 compares the results of textual analytics system 20 with information in metadata database 36 to identify a set of related documents 30. Metadata database 36 comprises metadata indexed from a comprehensive set of technology references, i.e., publications, such as scientific journal articles. In one illustrative embodiment, metadata database 36 comprises a database of MedLine abstracts, which include metadata comprised of MeSH codes, chemical lists, CAS numbers, SMILES data, etc., for associated publications. Compare system 22 thus identifies publications whose associated metadata matches the structured information obtained by textual analytics system 20. Each such match may result in the identification of a technology reference that can be added to the set of related documents 30. Aggregation and ranking system 24 may be implemented to aggregate results and rank documents within the set of related documents 30.
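    One simple way to picture the comparison step is as a lookup against an inverted index built from the publication metadata. The Python sketch below assumes the metadata for each publication is available as a set of strings (MeSH codes, chemical names, CAS numbers, and so on) and uses exact, case-insensitive matching; the names `build_metadata_index` and `find_related` are hypothetical, and a real compare system could match far more loosely.

```python
from collections import defaultdict


def build_metadata_index(publications: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each metadata term to the IDs of the publications indexed with it."""
    index: dict[str, set[str]] = defaultdict(set)
    for pub_id, terms in publications.items():
        for term in terms:
            index[term.lower()].add(pub_id)
    return index


def find_related(
    structured_info: set[str], index: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return, for each matching publication, the terms it shares with the source."""
    matches: dict[str, set[str]] = defaultdict(set)
    for term in structured_info:
        key = term.lower()
        # .get() avoids creating empty entries in a defaultdict index.
        for pub_id in index.get(key, set()):
            matches[pub_id].add(key)
    return dict(matches)
```

    Keeping the shared terms alongside each publication ID, rather than returning IDs alone, leaves the information that later ranking and annotation steps would need.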
  • [0030]
    Annotation system 26 can be utilized to annotate the source document 28 with metadata 34 derived from both the metadata database 36 and from the textual analytics system 20. The metadata 34 in annotated document 32 may likewise be processed/ranked by aggregation and ranking system 24. In an example where source document 28 comprises a patent, an annotated patent could be generated with, e.g., MedLine metadata that includes MeSH data, indexed data associated with technical references containing chemicals in common with the source patent, etc.
  • [0031]
    In an illustrative embodiment, the metadata database 36 could be loaded as a separate star schema that is part of a larger data warehouse that also contains the annotated documents database 40.
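    The description does not specify the schema, so the following is only one possible reading of a star schema for the metadata database: a fact table that records which term was indexed against which publication, with one dimension table for publications and one for terms. The sketch uses Python's built-in `sqlite3` module; all table, column, and function names are assumptions made for illustration.

```python
import sqlite3


def create_star_schema(conn: sqlite3.Connection) -> None:
    """Create two dimension tables and one fact table linking them."""
    conn.executescript(
        """
        CREATE TABLE dim_publication (
            pub_key INTEGER PRIMARY KEY,
            pub_id  TEXT UNIQUE,
            title   TEXT
        );
        CREATE TABLE dim_term (
            term_key  INTEGER PRIMARY KEY,
            term      TEXT,
            term_type TEXT,
            UNIQUE (term, term_type)
        );
        CREATE TABLE fact_indexing (
            pub_key  INTEGER REFERENCES dim_publication (pub_key),
            term_key INTEGER REFERENCES dim_term (term_key),
            PRIMARY KEY (pub_key, term_key)
        );
        """
    )


def add_indexing(conn, pub_id: str, title: str, term: str, term_type: str) -> None:
    """Record that a publication was indexed with a term (idempotent)."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_publication (pub_id, title) VALUES (?, ?)",
        (pub_id, title),
    )
    conn.execute(
        "INSERT OR IGNORE INTO dim_term (term, term_type) VALUES (?, ?)",
        (term, term_type),
    )
    conn.execute(
        """
        INSERT OR IGNORE INTO fact_indexing (pub_key, term_key)
        SELECT p.pub_key, t.term_key
        FROM dim_publication AS p, dim_term AS t
        WHERE p.pub_id = ? AND t.term = ? AND t.term_type = ?
        """,
        (pub_id, term, term_type),
    )


def publications_for_term(conn, term: str) -> list[str]:
    """Return the IDs of publications indexed with the given term."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.pub_id
        FROM fact_indexing AS f
        JOIN dim_publication AS p ON p.pub_key = f.pub_key
        JOIN dim_term AS t ON t.term_key = f.term_key
        WHERE t.term = ?
        ORDER BY p.pub_id
        """,
        (term,),
    )
    return [row[0] for row in rows]
```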
  • [0032]
    The aggregation and ranking system 24 could be implemented in any manner. For instance, if multiple references within the set of related documents 30 include the same piece of metadata, those instances of the metadata could be aggregated into a single listing with an increased rank of importance. Moreover, aggregation and ranking system 24 could identify “categories” of references and/or metadata that are deemed more important than others. Furthermore, aggregation and ranking system 24 could filter references and/or metadata to exclude certain references or metadata from the results.
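    Since the description leaves the ranking method open, the Python sketch below shows just one plausible scheme: publications are ranked by how many metadata terms they share with the source document, repeated metadata is aggregated into a single entry whose count serves as its rank, and an optional exclusion set acts as the filter. The input is assumed to be a mapping from publication ID to the set of shared terms; `rank_publications` and `aggregate_metadata` are hypothetical names.

```python
from collections import Counter


def rank_publications(
    matches: dict[str, set[str]], excluded_terms: frozenset[str] = frozenset()
) -> list[tuple[str, int]]:
    """Rank publications by the number of non-excluded terms shared with the source."""
    ranked = []
    for pub_id, terms in matches.items():
        kept = set(terms) - excluded_terms
        if kept:  # drop publications that matched only on excluded terms
            ranked.append((pub_id, len(kept)))
    # Highest score first; ties broken by publication ID for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


def aggregate_metadata(matches: dict[str, set[str]]) -> list[tuple[str, int]]:
    """Collapse repeated metadata into one entry ranked by how many references carry it."""
    counts = Counter(term for terms in matches.values() for term in terms)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```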
  • [0033]
    Likewise, annotation system 26 may be implemented in any fashion. For instance, the metadata 34 may be stored in additional fields of a document database.
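    As a minimal sketch of storing metadata in additional fields, the Python example below models an annotated document as a record that carries the original text plus fields for the terms extracted from it and for the metadata of each related publication. The class `AnnotatedDocument`, its field names, and the `annotate` helper are assumptions for illustration, not structures defined in this description.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedDocument:
    """A source document plus the extra fields that hold its annotations."""

    doc_id: str
    text: str
    extracted_terms: list[str] = field(default_factory=list)
    related_publications: list[str] = field(default_factory=list)
    publication_metadata: dict[str, list[str]] = field(default_factory=dict)


def annotate(
    doc_id: str,
    text: str,
    extracted_terms: set[str],
    related: dict[str, set[str]],
) -> AnnotatedDocument:
    """Build an annotated record from extracted terms and related-publication metadata."""
    return AnnotatedDocument(
        doc_id=doc_id,
        text=text,
        extracted_terms=sorted(extracted_terms),
        related_publications=sorted(related),
        # Sorted lists keep the stored record deterministic and easy to diff.
        publication_metadata={pub: sorted(terms) for pub, terms in related.items()},
    )
```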
  • [0034]
    It should be understood that any type of metadata could be used within the context of the present invention to identify a set of related documents 30 and annotate a source document 28. Illustrative types of metadata include MedLine qualifier codes, chemicals, molecular structures, MeSH codes, concept codes, classifications, ontologies, etc. Non-biotechnology related patents, such as software, mechanical, electrical, etc., could likewise be annotated in a similar fashion with domain specific metadata based on, e.g., existing or developed metadata ontologies and classifications.
  • [0035]
    FIG. 2 depicts a data mining system 42 for exploiting the annotated documents database 40 of FIG. 1. Data mining system 42 includes a search system 44 and metadata classification system 46 that allows a user to enter a metadata query 48 to generate a set of search results 50.
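    A metadata query against the annotated documents database can be pictured, in its simplest form, as a filter over the metadata attached to each document. The Python sketch below assumes the database is a mapping from document ID to a set of metadata terms and returns the documents that carry every queried term; the function name `search_annotated` is hypothetical, and the search system and metadata classification system of FIG. 2 may work quite differently.

```python
def search_annotated(
    database: dict[str, set[str]], query_terms: set[str]
) -> list[str]:
    """Return IDs of annotated documents whose metadata contains every query term."""
    wanted = {term.lower() for term in query_terms}
    hits = []
    for doc_id, terms in database.items():
        available = {term.lower() for term in terms}
        if wanted <= available:  # subset test: all query terms are present
            hits.append(doc_id)
    return sorted(hits)
```

    Note that under this subset test an empty query matches every document; a caller that wants an empty result in that case would need to check for it explicitly.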
  • [0036]
    In general, the computer system 10 of FIG. 1 (as well as the data mining system 42 of FIG. 2) may comprise, e.g., a desktop, a laptop, a workstation, etc. Moreover, computer system 10 could be implemented as part of a client and/or a server. Computer system 10 generally includes a processor 12, input/output (I/O) 14, memory 16, and bus 17. The processor 12 may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server. Memory 16 may comprise any known type of data storage and/or transmission media, including magnetic media, optical media, random access memory (RAM), read-only memory (ROM), a data cache, a data object, etc. Moreover, memory 16 may reside at a single physical location, comprising one or more types of data storage, or be distributed across a plurality of physical systems in various forms.
  • [0037]
    I/O 14 may comprise any system for exchanging information to/from an external resource. External devices/resources may comprise any known type of external device, including a monitor/display, speakers, storage, another computer system, a hand-held device, keyboard, mouse, voice recognition system, speech output system, printer, facsimile, pager, etc. Bus 17 provides a communication link between each of the components in the computer system 10 and likewise may comprise any known type of transmission link, including electrical, optical, wireless, etc. Although not shown, additional components, such as cache memory, communication systems, system software, etc., may be incorporated into computer system 10.
  • [0038]
    Access to computer system 10 may be provided over a network 36 such as the Internet, a local area network (LAN), a wide area network (WAN), a virtual private network (VPN), etc. Communication could occur via a direct hardwired connection (e.g., serial port), or via an addressable connection that may utilize any combination of wireline and/or wireless transmission methods. Moreover, conventional network connectivity, such as Token Ring, Ethernet, WiFi or other conventional communications standards could be used. Still yet, connectivity could be provided by conventional TCP/IP sockets-based protocol. In this instance, an Internet service provider could be used to establish interconnectivity. Further, as indicated above, communication could occur in a client-server or server-server environment.
  • [0039]
    It should be appreciated that the teachings of the present invention could be offered as a business method on a subscription or fee basis. For example, a computer system 10 comprising document processing system 18 could be created, maintained and/or deployed by a service provider that offers the functions described herein for customers. That is, a service provider could offer to identify sets of related documents, to provide a process for annotating documents, and/or to provide an annotated documents database 40 as described above.
  • [0040]
    It is understood that the systems, functions, mechanisms, methods, engines and modules described herein can be implemented in hardware, software, or a combination of hardware and software. They may be implemented by any type of computer system or other apparatus adapted for carrying out the methods described herein. A typical combination of hardware and software could be a general-purpose computer system with a computer program that, when loaded and executed, controls the computer system such that it carries out the methods described herein. Alternatively, a specific use computer, containing specialized hardware for carrying out one or more of the functional tasks of the invention, could be utilized. In a further embodiment, part or all of the invention could be implemented in a distributed manner, e.g., over a network such as the Internet.
  • [0041]
    The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods and functions described herein, and which—when loaded in a computer system—is able to carry out these methods and functions. Terms such as computer program, software program, program, program product, software, etc., in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
  • [0042]
    The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of this invention as defined by the accompanying claims.
Patent Citations
Cited PatentFiling datePublication dateApplicantTitle
US4642762 *25 May 198410 Feb 1987American Chemical SocietyStorage and retrieval of generic chemical structure representations
US5794236 *29 May 199611 Aug 1998Lexis-NexisComputer-based system for classifying documents into a hierarchy and linking the classifications to the hierarchy
US5950192 *26 Jun 19977 Sep 1999Oxford Molecular Group, Inc.Relational database mangement system for chemical structure storage, searching and retrieval
US6038560 *21 May 199714 Mar 2000Oracle CorporationConcept knowledge base search and retrieval system
US6038574 *18 Mar 199814 Mar 2000Xerox CorporationMethod and apparatus for clustering a collection of linked documents using co-citation analysis
US6098034 * | 18 Mar 1996 | 1 Aug 2000 | Expert Ease Development, Ltd. | Method for standardizing phrasing in a document
US6286018 * | 29 Sep 1999 | 4 Sep 2001 | Xerox Corporation | Method and apparatus for finding a set of documents relevant to a focus set using citation analysis and spreading activation techniques
US6289342 * | 20 May 1998 | 11 Sep 2001 | Nec Research Institute, Inc. | Autonomous citation indexing and literature browsing using citation context
US6304869 * | 16 Feb 1999 | 16 Oct 2001 | Oxford Molecular Group, Inc. | Relational database management system for chemical structure storage, searching and retrieval
US6389436 * | 15 Dec 1997 | 14 May 2002 | International Business Machines Corporation | Enhanced hypertext categorization using hyperlinks
US6604114 * | 24 Aug 2000 | 5 Aug 2003 | Technology Enabling Company, Llc | Systems and methods for organizing data
US6732090 * | 5 Dec 2001 | 4 May 2004 | Xerox Corporation | Meta-document management system with user definable personalities
US6823301 * | 4 Mar 1998 | 23 Nov 2004 | Hiroshi Ishikura | Language analysis using a reading point
US6879990 * | 28 Apr 2000 | 12 Apr 2005 | Institute For Scientific Information, Inc. | System for identifying potential licensees of a source patent portfolio
US6963830 * | 14 Jun 2000 | 8 Nov 2005 | Fujitsu Limited | Apparatus and method for generating a summary according to hierarchical structure of topic
US7003517 * | 24 May 2001 | 21 Feb 2006 | Inetprofit, Inc. | Web-based system and method for archiving and searching participant-based internet text sources for customer lead data
US7054754 * | 11 Feb 2000 | 30 May 2006 | Cambridgesoft Corporation | Method, system, and software for deriving chemical structural information
US7065514 * | 5 Nov 2001 | 20 Jun 2006 | West Publishing Company | Document-classification system, method and software
US7197697 * | 15 Jun 2000 | 27 Mar 2007 | Fujitsu Limited | Apparatus for retrieving information using reference reason of document
US20020062302 * | 7 Aug 2001 | 23 May 2002 | Oosta Gary Martin | Methods for document indexing and analysis
US20020169755 * | 9 May 2001 | 14 Nov 2002 | Framroze Bomi Patel | System and method for the storage, searching, and retrieval of chemical names in a relational database
US20040088332 * | 27 Oct 2003 | 6 May 2004 | Knowledge Management Objects, Llc | Computer assisted and/or implemented process and system for annotating and/or linking documents and data, optionally in an intellectual property management system
US20040117405 * | 26 Aug 2003 | 17 Jun 2004 | Gordon Short | Relating media to information in a workflow system
US20040172378 * | 14 Nov 2003 | 2 Sep 2004 | Shanahan James G. | Method and apparatus for document filtering using ensemble filters
US20040205448 * | 5 Dec 2001 | 14 Oct 2004 | Grefenstette Gregory T. | Meta-document management system with document identifiers
US20050060305 * | 15 Sep 2004 | 17 Mar 2005 | Pfizer Inc. | System and method for the computer-assisted identification of drugs and indications
US20050108001 * | 15 Nov 2002 | 19 May 2005 | Aarskog Brit H. | Method and apparatus for textual exploration discovery
US20050131025 * | 22 Nov 2004 | 16 Jun 2005 | Matier William L. | Amelioration of cataracts, macular degeneration and other ophthalmic diseases
US20050160107 * | 28 Dec 2004 | 21 Jul 2005 | Ping Liang | Advanced search, file system, and intelligent assistant agent
US20050234952 * | 15 Apr 2004 | 20 Oct 2005 | Microsoft Corporation | Content propagation for enhanced document retrieval
US20050246316 * | 30 Apr 2004 | 3 Nov 2005 | Lawson Alexander J | Method and software for extracting chemical data
US20060095298 * | 31 Oct 2005 | 4 May 2006 | Bina Robert B | Method for horizontal integration and research of information of medical records utilizing HIPPA compliant internet protocols, workflow management and static/dynamic processing of information
US20070112833 * | 17 Nov 2005 | 17 May 2007 | International Business Machines Corporation | System and method for annotating patents with MeSH data
US20070208719 * | 8 May 2007 | 6 Sep 2007 | Bao Tran | Systems and methods for analyzing semantic documents over a network
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US7526554 | 12 Jun 2008 | 28 Apr 2009 | International Business Machines Corporation | Systems and methods for reaching resource neighborhoods
US7660793 * | 12 Nov 2007 | 9 Feb 2010 | Exegy Incorporated | Method and system for high performance integration, processing and searching of structured and unstructured data using coprocessors
US8156101 | 17 Dec 2009 | 10 Apr 2012 | Exegy Incorporated | Method and system for high performance integration, processing and searching of structured and unstructured data using coprocessors
US8326819 | 12 Nov 2007 | 4 Dec 2012 | Exegy Incorporated | Method and system for high performance data metatagging and data indexing using coprocessors
US8374986 | 15 May 2008 | 12 Feb 2013 | Exegy Incorporated | Method and system for accelerated stream processing
US8407215 * | 10 Dec 2010 | 26 Mar 2013 | Sap Ag | Text analysis to identify relevant entities
US8495064 | 8 Sep 2011 | 23 Jul 2013 | Microsoft Corporation | Management of metadata for life cycle assessment data
US8515994 | 12 Jun 2008 | 20 Aug 2013 | International Business Machines Corporation | Reaching resource neighborhoods
US8713521 * | 2 Sep 2009 | 29 Apr 2014 | International Business Machines Corporation | Discovery, analysis, and visualization of dependencies
US8880501 | 9 Apr 2012 | 4 Nov 2014 | Ip Reservoir, Llc | Method and system for high performance integration, processing and searching of structured and unstructured data using coprocessors
US9298453 | 3 Jul 2012 | 29 Mar 2016 | Microsoft Technology Licensing, Llc | Source code analytics platform using program analysis and information retrieval
US9323794 | 27 Nov 2012 | 26 Apr 2016 | Ip Reservoir, Llc | Method and system for high performance pattern indexing
US9338249 * | 25 Aug 2006 | 10 May 2016 | Google Technology Holdings, Inc. | Distributed user profile
US9396222 | 3 Nov 2014 | 19 Jul 2016 | Ip Reservoir, Llc | Method and system for high performance integration, processing and searching of structured and unstructured data using coprocessors
US9547824 | 5 Feb 2013 | 17 Jan 2017 | Ip Reservoir, Llc | Method and apparatus for accelerated data quality checking
US9633093 | 22 Oct 2013 | 25 Apr 2017 | Ip Reservoir, Llc | Method and apparatus for accelerated format translation of data in a delimited data format
US9633097 | 23 Apr 2015 | 25 Apr 2017 | Ip Reservoir, Llc | Method and apparatus for record pivoting to accelerate processing of data fields
US9633115 * | 8 Apr 2014 | 25 Apr 2017 | International Business Machines Corporation | Analyzing a query and provisioning data to analytics
US9678949 | 13 Dec 2013 | 13 Jun 2017 | Cloud 9 Llc | Vital text analytics system for the enhancement of requirements engineering documents and other documents
US9760592 | 20 Feb 2014 | 12 Sep 2017 | International Business Machines Corporation | Metrics management and monitoring system for service transition and delivery management
US20070112833 * | 17 Nov 2005 | 17 May 2007 | International Business Machines Corporation | System and method for annotating patents with MeSH data
US20080114724 * | 12 Nov 2007 | 15 May 2008 | Exegy Incorporated | Method and System for High Performance Integration, Processing and Searching of Structured and Unstructured Data Using Coprocessors
US20080288484 * | 25 Aug 2006 | 20 Nov 2008 | Motorola, Inc. | Distributed User Profile
US20090313255 * | 12 Jun 2008 | 17 Dec 2009 | International Business Machines Corporation | Systems and methods for reaching resource neighborhoods
US20100064012 * | 8 Sep 2008 | 11 Mar 2010 | International Business Machines Corporation | Method, system and apparatus to automatically add senders of email to a contact list
US20110055811 * | 2 Sep 2009 | 3 Mar 2011 | International Business Machines Corporation | Discovery, Analysis, and Visualization of Dependencies
US20120150852 * | 10 Dec 2010 | 14 Jun 2012 | Paul Sheedy | Text analysis to identify relevant entities
US20150286697 * | 8 Apr 2014 | 8 Oct 2015 | International Business Machines Corporation | Analyzing a query and provisioning data to analytics
CN103970792A * | 4 Feb 2013 | 6 Aug 2014 | 中国银联股份有限公司 | Index-based file comparison method and device
WO2014047051A1 * | 17 Sep 2013 | 27 Mar 2014 | Atigeo Llc | Methods and automated systems that assign medical codes to electronic medical records
WO2014093935A1 * | 13 Dec 2013 | 19 Jun 2014 | Cloud 9 Llc | Vital text analytics system for the enhancement of requirements engineering documents and other documents
Classifications
U.S. Classification: 1/1, 707/999.004
International Classification: G06F17/30
Cooperative Classification: G06F17/27
European Classification: G06F17/27
Legal Events
Date | Code | Event | Description
19 Dec 2005 | AS | Assignment
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YO
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: ANGELL, ROBERT L.; BOYER, STEPHEN K.; COOPER, JAMES W.; AND OTHERS; SIGNING DATES FROM 20051017 TO 20051127; REEL/FRAME: 017129/0657