WO2005036369A2 - Database for microbial investigations - Google Patents

Publication number
WO2005036369A2
WO2005036369A2 PCT/US2004/033742 US2004033742W WO2005036369A2 WO 2005036369 A2 WO2005036369 A2 WO 2005036369A2 US 2004033742 W US2004033742 W US 2004033742W WO 2005036369 A2 WO2005036369 A2 WO 2005036369A2
Authority
WO
WIPO (PCT)
Prior art keywords
search request
data
query
search
pathogen
Prior art date
Application number
PCT/US2004/033742
Other languages
French (fr)
Other versions
WO2005036369A3 (en
Inventor
Kumar Hari
John Mcneil
Dave Ecker
Neill White
Vivek Samant
Vanessa Zapp
Alan Goates
Original Assignee
Isis Pharmaceuticals, Inc.
Priority date
Filing date
Publication date
Application filed by Isis Pharmaceuticals, Inc.

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00ICT specially adapted for the handling or processing of medical references
    • G16H70/60ICT specially adapted for the handling or processing of medical references relating to pathologies
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This invention relates generally to the field of microbial investigations.
  • this invention is directed towards a centralized computerized investigative system designed to assist researchers and clinicians with identifying and treating pathogens and related data.
  • Threat lists contain common or sometimes misspelled names of bioagents, rather than the accepted taxonomical designations, if an accepted designation even exists.
  • Naming ambiguities extend to disease lists as well, where common names are often used, and the diseases may themselves be caused by multiple organisms.
  • Pathogens are often named after the disease they cause. For example, severe acute respiratory syndrome (SARS) and foot-and-mouth disease are both pathogen names that describe the effect on their victims rather than inherently describing the pathogen.
  • Disease and pathogen names also can vary depending on the host. Further, taxonomies are legitimately derived from multiple sources, including organism characteristics; disease effects; and molecular characteristics, e.g., rRNA sequence. Disease information and sequence data are often linked to different taxa. For example, viral sequences are most completely linked to the National Center for Biotechnology Information's (NCBI) taxonomy, while viral diseases and characteristics are most completely linked to the International Committee on Taxonomy of Viruses (ICTV) taxonomy.
  • the present invention enables a centralized system for locating, storing and providing microbial-related information, including pathogens, diseases, symptoms, genetic information, and related documentation.
  • the present invention maintains a database that relates pathogens to diseases that they cause, symptoms of those diseases, genetic information about the pathogens, and documentation that is associated with any of the above.
  • the system recognizes duplicated information, for example a pathogen present in the database twice, each time under a different taxonomical name, and creates a link from one instance of the pathogen to the other.
  • the system additionally identifies common genes and sequences among related organisms.
  • a user such as a clinician or researcher can use a system of the present invention to perform investigations.
  • a medical doctor can enter patient symptoms into the system to obtain a list of diseases consistent with the entered symptoms, and pathogens associated with each disease.
  • the system can also be used, for example, to track the progression of an outbreak and identify similar occurrences by searching the World Wide Web on a nightly, automated basis and reporting results such as news articles, published research and the like to the doctor, who can then validate the new data making it part of the system's database.
  • FIG. 1 illustrates a system for performing microbial investigations in accordance with an embodiment of the present invention.
  • Fig. 2 illustrates a pathogen report in accordance with an embodiment of the present invention.
  • FIG. 3 illustrates an example of a query screen presented to a user in accordance with an embodiment of the present invention.
  • FIG. 4 provides an illustration of a pathogen report in accordance with an embodiment of the present invention.
  • Fig. 5 illustrates an NCBI-source pathogen page in accordance with an embodiment of the present invention.
  • Fig. 6 illustrates an example of a disease report in accordance with an embodiment of the present invention.
  • Fig. 7 illustrates a disease/host report in accordance with an embodiment of the present invention.
  • Fig. 8 illustrates the use of a list-based database query in accordance with an embodiment of the present invention.
  • Fig. 9 illustrates a method for configuring a database in accordance with an embodiment of the present invention.
  • Fig. 10 illustrates a method for executing a search in accordance with an embodiment of the present invention.
  • System 100 includes a user interface (UI) module 104 for communicating with a user 102 of the system; a query constructor module 106 for building search queries; a network access module 108 for executing the queries and searching a network such as the Internet for results of the query; a data analysis engine 110 for reviewing the results received from the search and compiling them for display to the user 102 via user interface module 104; and a Microbial Rosetta Stone (MRS) database 114 for storing the returned search results.
  • MRS database 114 thus is an effective reference database of microbial organisms, diseases, genomics, threat management and forensics.
  • User interface module 104 enables communication between system 100 and a user 102 of the system.
  • UI module 104 includes a web server, and user 102 accesses the web server using a conventional web browser such as Internet Explorer by Microsoft Corporation of Redmond, Washington, or Firefox by The Mozilla Corporation of Mountain View, California.
  • user 102 uses client software developed specifically to interact with UI module 104; and in another embodiment user 102 interacts directly with UI module 104 by being physically located at the same location as system 100.
  • UI module 104 includes a Java-based expert interface called a pluggable object viewer (POV), created by and available as open source from Isis Pharmaceuticals of Carlsbad, California.
  • This framework allows for hierarchical browsing of data from any source that can be wrapped into a Java object as defined by a plug-in level API.
  • the framework keeps track of lists of these objects and allows for querying related objects by any combination of a related list and a set of filters upon the data object in question.
  • the logic of the query is always available to the user and can be branched out from any point to create alternate search paths. Queries of user-defined complexity and depth can be annotated, saved as XML files and re-run or edited at a later time.
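As a minimal sketch of how a multi-step query might be annotated, saved as XML and reloaded for a later run: the element names (`query`, `step`), the attribute names and both function names below are assumptions made for illustration and are not the POV file format, which the text does not specify.

```python
import xml.etree.ElementTree as ET


def query_to_xml(steps, annotation=""):
    """Serialize an ordered list of (data_type, filter_text) steps to an XML string.

    The schema is hypothetical; it only illustrates that query logic can be
    persisted together with a user annotation.
    """
    root = ET.Element("query", {"annotation": annotation})
    for data_type, filter_text in steps:
        step = ET.SubElement(root, "step", {"type": data_type})
        step.text = filter_text
    return ET.tostring(root, encoding="unicode")


def query_from_xml(xml_text):
    """Rebuild the step list and annotation from a saved XML string."""
    root = ET.fromstring(xml_text)
    steps = [(s.get("type"), s.text or "") for s in root.findall("step")]
    return steps, root.get("annotation", "")


saved = query_to_xml(
    [("Symptom", "fever"), ("Disease", ""), ("Capability", "nucleic acid")],
    annotation="Use these sequences",
)
steps, note = query_from_xml(saved)
```

Because the saved form is plain text, an edited copy can be reloaded with `query_from_xml` and run as an alternate search path.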
  • the lists of results can be manually manipulated or combined with other lists using set logic (union, intersection, difference, etc.).
  • Result lists can display any level of depth for a query from a single result to an entire history for the query that produced those results. Individual items in the list can display further detail when selected, and the display can be simple text output, more complex HTML or even graphical in nature. HTML is preferably used not only for display formatting, but for the ability to embed URLs to other public data sources, allowing UI module 104 to link from within system 100 to externally available information. Examples of a typical user interface in accordance with embodiments of the present invention are described below.
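The set logic mentioned for result lists can be sketched with Python's built-in sets. The list contents and the `combine` helper are illustrative assumptions; only the three operations themselves come from the text.

```python
def combine(list_a, list_b, operation):
    """Combine two result lists with set logic and return a sorted list."""
    a, b = set(list_a), set(list_b)
    if operation == "union":
        result = a | b
    elif operation == "intersection":
        result = a & b
    elif operation == "difference":
        result = a - b
    else:
        raise ValueError(f"unsupported operation: {operation}")
    return sorted(result)


# Hypothetical result lists from two earlier searches.
threat_organisms = ["Bacillus anthracis", "Berne virus", "Foot-and-mouth disease virus"]
with_sequences = ["Bacillus anthracis", "Berne virus", "Phakopsora pachyrhizi"]

both = combine(threat_organisms, with_sequences, "intersection")
unsequenced_threats = combine(threat_organisms, with_sequences, "difference")
```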
  • system 100 is initially configured by placing available information in MRS database 114, and associating pathogen, disease and gene sequence data wherever possible.
  • a list of known microbial threats is assembled from available sources. Depending on the source, a threat list might include pathogens, diseases, or pathogen/disease pairings. Once assembled, the pathogens associated with each list are standardized according to the correct internationally accepted taxonomic names. The threat list and associated pathogens are stored in MRS database 114. Next, a comprehensive collection of disease synonyms is loaded for each of the agents listed on the threat lists. In addition, a list of gene sequences and gene synonyms is also loaded into MRS database 114, including sequences that identify individual genotypes of the pathogens. MRS database 114 maintains an association between the disease synonyms, the genetic information, and pathogens.
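The standardization step described above can be sketched as a lookup from common or variant names to a single accepted name. The synonym table, the normalization rule and the function names are assumptions for illustration; a real deployment would draw synonyms from curated taxonomy sources.

```python
def normalize(name):
    """Lower-case a name and collapse hyphens and repeated whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def standardize(threat_list, synonyms):
    """Map each listed name to its accepted name; collect names that cannot be resolved.

    `synonyms` maps a normalized common name to an accepted taxonomic name.
    """
    accepted, unresolved = [], []
    for name in threat_list:
        canonical = synonyms.get(normalize(name))
        if canonical is None:
            unresolved.append(name)
        elif canonical not in accepted:
            accepted.append(canonical)
    return accepted, unresolved


# Hypothetical synonym table, keyed by normalized name.
SYNONYMS = {
    "anthrax": "Bacillus anthracis",
    "bacillus anthracis": "Bacillus anthracis",
    "foot and mouth": "Foot-and-mouth disease virus",
}

accepted, unresolved = standardize(
    ["Anthrax", "Foot-and-Mouth", "bacillus  anthracis", "mystery agent"], SYNONYMS
)
```

Names that fail to resolve are returned separately so that a curator can review them instead of having them silently dropped.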
  • initial data is loaded into system 100 in a variety of ways, including manual entry; bulk import using computational parsers and data validation tools; expert curation facilities; and from other database management systems (DBMS).
  • Network access module 108 preferably includes automated data import scripts for loading taxonomic lineages and nucleic acid sequences from a source such as the National Center for Biotechnology Information (NCBI).
  • network access module 108 also imports data automatically from other microbial databases, including curated data from publicly available databanks such as the Swissprot protein sequence database maintained by the Swiss Institute of Bioinformatics, the Protein Families (Pfam) database of alignments maintained by the Wellcome Trust's Sanger Institute, Kyoto Encyclopedia of Genes and Genomes maintained by the Kyoto University Bioinformatics Center (KEGG), and the Gene Ontology (GO) project maintained by the Gene Ontology Consortium.
  • network access module 108 also imports data automatically from government sites, including the Centers for Disease Control (CDC); the United States Department of Agriculture (USDA); PBR (Pox Virus Database); ProMED, and the Department of Health and Human Services (HHS).
  • data is also loaded from the International Committee on the Taxonomy of Viruses (ICTV), including data for viral isolates.
  • Preferably, internationally accepted taxonomy standards for viral organisms are used.
  • system 100 associates each of the pieces of initial data with other related data such as bacterial taxonomy data from NCBI or ortholog data from various publications.
  • data analysis engine 110 determines which imported entries are duplicates of other imported entries, or, similarly, which seemingly disparate pieces of data in fact relate to a common pathogen.
  • Data ambiguities within the MRS database 114 are preferably resolved by creating additional tables in the database schema. These additional linking tables allow "many-to-many"-style relationships to be defined that otherwise would not be available in a relational schema. For example, one naming ambiguity involves three GenBank nucleotide sequences (X52374, X52505 and X52506) that were linked to the Berne virus at NCBI and to the Equine torovirus at ICTV. Despite the similar sequence associations, searching NCBI with the name "Equine torovirus" yields no results.
  • system 100 creates a new table to connect the differing NCBI and ICTV names that, by sequence associations, appear to describe the same organism. Data relationships stored in this linking table are then preferably brought to the user's attention via user interface module 104, as illustrated in Fig. 2.
  • the NCBI pathogen name, "Berne virus" 202, is listed as the Pathogen Name
  • the taxonomy source 206 field indicates that the source is the NCBI 208 database.
  • Under the equivalent pathogen section 210 is an indication of the corresponding ICTV name, equine torovirus 212.
  • a similar linking table architecture is preferably used to track synonyms for pathogens and disease names, and to provide data source referencing for each name. Links to documents and keywords are also included to allow referencing of threat list, disease, epidemiology, formulation, transmission, forensic, protocol, characteristic and sequence data.
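A minimal sketch of such a linking table, using an in-memory SQLite database. The table and column names are assumptions, not the MRS schema; the pathogen names, taxonomy sources and GenBank accessions are the ones given in the example above.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical three-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pathogen (id INTEGER PRIMARY KEY, name TEXT, source TEXT);
        CREATE TABLE pathogen_sequence (pathogen_id INTEGER, accession TEXT);
        -- Linking table: one row per directed "equivalent pathogen" relation.
        CREATE TABLE equivalent_pathogen (pathogen_id INTEGER, equivalent_id INTEGER);
    """)
    conn.executemany(
        "INSERT INTO pathogen VALUES (?, ?, ?)",
        [(1, "Berne virus", "NCBI"), (2, "Equine torovirus", "ICTV")],
    )
    accessions = ["X52374", "X52505", "X52506"]
    conn.executemany(
        "INSERT INTO pathogen_sequence VALUES (?, ?)",
        [(pid, acc) for pid in (1, 2) for acc in accessions],
    )
    return conn


def link_by_shared_sequences(conn):
    """Link pathogens from different taxonomy sources that share an accession."""
    conn.execute("""
        INSERT INTO equivalent_pathogen (pathogen_id, equivalent_id)
        SELECT DISTINCT a.pathogen_id, b.pathogen_id
        FROM pathogen_sequence a
        JOIN pathogen_sequence b
          ON a.accession = b.accession AND a.pathogen_id <> b.pathogen_id
        JOIN pathogen pa ON pa.id = a.pathogen_id
        JOIN pathogen pb ON pb.id = b.pathogen_id
        WHERE pa.source <> pb.source
    """)


def equivalents(conn, name):
    """Return (name, source) rows for every pathogen linked to the given name."""
    return conn.execute("""
        SELECT p2.name, p2.source
        FROM pathogen p1
        JOIN equivalent_pathogen e ON e.pathogen_id = p1.id
        JOIN pathogen p2 ON p2.id = e.equivalent_id
        WHERE p1.name = ?
        ORDER BY p2.name
    """, (name,)).fetchall()


conn = build_demo_db()
link_by_shared_sequences(conn)
```

After linking, `equivalents(conn, "Berne virus")` returns the ICTV entry, which is the kind of relationship the "equivalent pathogen" section of a report would surface.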
  • system 100 then begins an automated curation of the data. First, system 100 ensures that pathogens are associated with diseases that they may cause. Next, if available, a link is established from each pathogen to its genomic sequence. System 100 ensures a detailed and accurate taxonomic mapping of disease/threat organism by linking each included pathogen to its proper taxa, and each disease to the pathogen that causes it. Also, if available, system 100 creates a link from the pathogen to additional information on pathogen gene products and functions, including, for example, start and stop codons, protein coding regions, regions where PCR primers can bind, etc.
  • system 100 attempts to identify organisms that are annotated as being different, but which are in fact the same.
  • system 100 includes scripts that make the identification of duplicate entries based on names and synonyms that define the threat agent in MRS database 114.
  • duplicate identification is preferably made by comparing and mapping features, e.g., exons, introns, untranslated regions (UTRs), primer binding sites, etc., from one sequence to another.
  • System 100 then preferably checks to make sure that each of the pathogens identified as being duplicates of one another are in fact the same. Where they are not, they are unlinked from one another.
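The two-stage process described above (propose duplicates from names and synonyms, then confirm or unlink them) might look like the following sketch. The entry identifiers, the normalization rule and the `same_organism` callback are hypothetical; the confirmation test itself, for example a sequence feature comparison, is left to the caller.

```python
from itertools import combinations


def normalize(name):
    """Lower-case a name and collapse hyphens and repeated whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def candidate_duplicates(entries):
    """Return sorted pairs of entry ids that share at least one normalized name.

    `entries` maps an entry id to an iterable of its names and synonyms.
    """
    normalized = {eid: {normalize(n) for n in names} for eid, names in entries.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(normalized), 2)
        if normalized[a] & normalized[b]
    ]


def confirm(pairs, same_organism):
    """Keep only pairs that an independent check accepts; the rest stay unlinked."""
    return [pair for pair in pairs if same_organism(*pair)]


entries = {
    "ncbi:1": ["Foot-and-mouth disease virus", "FMDV"],
    "ictv:9": ["foot and mouth disease virus"],
    "ncbi:2": ["Berne virus"],
}
pairs = candidate_duplicates(entries)
```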
  • In addition to identifying duplicates and false duplicates, system 100 also identifies common genes and sequences among related organisms by, for example, performing analyses of sequence similarities at the nucleotide or amino acid level. Such analyses are carried out using, e.g., available software utilities such as BLAST programs (basic local alignment search tools) and PowerBLAST programs known in the art (Altschul et al., J. Mol. Biol., 1990, 215, 403-410; Zhang and Madden, Genome Res., 1997, 7, 649-656), or other alignment utilities.
  • data analysis engine 110 preferably analyzes sequences and creates gene sequence alignments for hundreds or thousands of sequences at once.
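To make "sequence similarity at the nucleotide level" concrete, the sketch below computes percent identity between two sequences that are assumed to be already aligned (equal length, with `-` marking gaps). This is a deliberately simple stand-in and is not BLAST, which also performs the alignment search itself; the function name and example sequences are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free aligned positions at which the two sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


identity = percent_identity("ACGTACGT", "ACGAAC-T")
```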
  • MRS database 114 is also initially provided with documents such as studies, academic research papers, clinical trial data, news articles or other relevant materials relating to diseases and pathogens.
  • some metadata is created either manually or automatically about the document.
  • keywords are automatically derived from the document using conventional document analysis technology such as the ht://Dig product available from the ht://Dig group, and the document is then indexed according to its keywords.
  • Those of skill in the art will appreciate that other methods exist for indexing documents, including by performing latent semantic analysis, manual indexing, etc., any of which may be appropriate for a given implementation of system 100.
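Keyword-based indexing of documents can be sketched as an inverted index from keyword to document identifiers. The frequency-based keyword extraction, the stop-word list and the document texts below are simplifying assumptions; they do not reproduce how ht://Dig or any other indexing product works.

```python
import re
from collections import Counter, defaultdict

# Small illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from", "was", "are"}


def extract_keywords(text, top_n=5):
    """Return up to top_n of the most frequent non-stop-words in the text."""
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if len(w) > 2 and w not in STOPWORDS
    ]
    return [word for word, _ in Counter(words).most_common(top_n)]


def build_index(documents):
    """Map each extracted keyword to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for keyword in extract_keywords(text):
            index[keyword].add(doc_id)
    return index


documents = {
    "doc-1": "Anthrax spores and the anthrax attack",
    "doc-2": "Foot and mouth disease outbreak",
}
index = build_index(documents)
```

A lookup such as `index["anthrax"]` then yields the documents to link to the matching pathogen or disease record.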
  • Once the documents are indexed, they are linked to the appropriate pathogens and/or diseases in MRS database 114. For example, a document describing the identification of the Ames strain of Bacillus anthracis as the threat agent in the 2001 Anthrax attacks is stored in MRS database 114 and linked to disease, pathogen, two taxonomic lineages, other authors who have contributed papers about Bacillus anthracis, relevant gene sequences, and related keywords.
  • Queries
  • a user that wishes to obtain some knowledge about a pathogen, set of symptoms, or other data tracked by system 100 makes a search request by interacting with UI module 104, as described further below.
  • UI module 104 passes the search request received from the user to query constructor module 106, which then constructs a query that can be used to search both MRS database 114 and external data sources 112c, e.g., via a search of the Internet.
  • Data analysis engine 110 correlates the response to the searches and provides them to user 102 via UI module 104.
  • Fig. 3 illustrates an example of a query screen 300 presented to user 102 via UI module 104 in an embodiment of the present invention.
  • Query screen 300 preferably allows a user 102 to type a global free-form query into a text box 302 and click the "find" button 304 to perform a search. Additionally, the user 102 can choose to filter the search results according to filter criteria 306.
  • filter criteria 306 include Contact, Disease, Document, Forensics, Pathogen, Sequence, Threat List, Genome, and Characteristic fields. Each field additionally includes Boolean statements by which to filter. For example, to filter by contact, a user 102 can choose to filter by the contact's last name or first name.
  • Fig. 4 provides an illustration of a pathogen report 400.
  • Report 400 could be obtained, for example, when a user 102 types "Foot and mouth" into text box 302 in order to do a full database search; or by entering "foot and mouth" in a synonym search box from a pathogen search page (not shown). A list of results matching the criteria is then returned, and the user can click on a hyperlink to one of the results to view the report.
  • Pathogen reports 400 preferably display salient data for microorganisms, including pathogen names, taxonomic rank with information source, and synonyms. Taxonomic lineages that support upward and downward tree traversals are also shown.
  • the display of "Equivalent Pathogen(s)" 402 indicates that alternative taxonomic lineages exist for the organism. Diagnostic methods, laboratories capable of performing them, organizations which supply reagents, and links to relevant protocol references are listed in the "Capability Information" section 404 of pathogen report 400. Clicking on the "foot-and-mouth disease virus (NCBI)" link 406, for example, links to Fig. 5, which is the NCBI-source pathogen page for foot-and-mouth disease.
  • Fig. 6 illustrates an example of a disease report 600 in accordance with an embodiment of the present invention.
  • a disease report 600 preferably includes a description of host-pathogen associations, symptoms and treatments associated with the disease.
  • the report includes hyperlinks to other related data. For example, in Fig. 6, clicking on "glycine max" 602 would take a user to the pathogen page for the Phakopsora pachyrhizi pathogen.
  • Fig. 7 illustrates a disease/host report 700 in accordance with an embodiment of the present invention.
  • the disease/host report describes each unique host/pathogen correlation found as a result of the query search.
  • epidemiological information 702 is also displayed.
  • Other reports can also be provided.
  • a forensics report can be produced by system 100, including a history of the pathogen with links to seminal publications, and details about cases where the pathogen may have been used in a terrorist event.
  • Fig. 8 illustrates a way in which a list-based query can be made of the database using the Java-based POV interface.
  • the user-configured Query Panel 802 displays: 1) data types that can be used in searches; 2) the number of results from a search ("7 Threat Organism(s)" or "83 Sequence(s)") with user-defined annotations ("Use these sequences"); and 3) data types available for additional searches based on a given result.
  • the Result Panel 804 displays the data relationships between lists of results (here, the 7 Threat Organisms and the 83 Sequences). Details for any highlighted information in the Result Panel are shown in the Inspector Panel 806. Web links and other tools, e.g., a sequence "Feature Viewer" tool 808, are preferably embedded in the Inspector Panel 806.
  • a medical doctor may recognize that five patients arrive in the clinic with similar symptoms of difficulty breathing, cough, and fever of 100.5 F.
  • the doctor logs into system 100 via UI module 104, clicks on the "Disease" link, and enters the symptoms into the "Symptoms" search field.
  • a list of diseases is displayed, and by viewing the individual Disease reports, the etiologic agents, hosts, symptoms and epidemiologic properties are displayed.
  • the doctor can find microbiology lab protocols that may be used to uniquely identify the organism, diagnose the disease and possibly guide therapeutic intervention.
  • the doctor may wish to have the system track the progression of this outbreak and identify similar occurrences by searching the World Wide Web on a nightly, automated basis. To do so, the doctor enters the POV user interface, and constructs a query by entering the symptoms he has identified into a "Symptom" search and running the query. Using the results, he searches for all Diseases associated with those symptoms by running a second search, and then finds all pathogens which can cause those diseases by running a third search. Finally, the doctor may wish to find data on molecular diagnostic techniques, which he does by running a fourth search on "Capability" where he limits this search with the text "nucleic acid".
  • results of this POV query are preferably displayed in a table format, and the doctor can save the query logic as a template that can be used to automatically search the World Wide Web.
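The four chained searches described above (symptoms, then diseases, then pathogens, then capabilities limited by a text filter) can be sketched over toy in-memory tables. Every record below is a placeholder, and the table layout and function name are assumptions; only the order of the steps and the "nucleic acid" filter come from the text.

```python
# Placeholder records; none of these associations are real data.
DISEASE_SYMPTOMS = {
    "disease A": {"cough", "fever", "difficulty breathing"},
    "disease B": {"cough", "fever"},
    "disease C": {"rash"},
}
DISEASE_PATHOGENS = {
    "disease A": ["pathogen X"],
    "disease B": ["pathogen X", "pathogen Y"],
    "disease C": ["pathogen Z"],
}
PATHOGEN_CAPABILITIES = {
    "pathogen X": ["nucleic acid amplification assay", "culture"],
    "pathogen Y": ["serology"],
    "pathogen Z": ["nucleic acid amplification assay"],
}


def run_chain(symptoms, capability_text):
    """Run the symptom -> disease -> pathogen -> capability chain."""
    wanted = set(symptoms)
    # Search 2: diseases whose symptom set includes every entered symptom.
    diseases = sorted(d for d, s in DISEASE_SYMPTOMS.items() if wanted <= s)
    # Search 3: pathogens that can cause any of those diseases.
    pathogens = sorted({p for d in diseases for p in DISEASE_PATHOGENS.get(d, [])})
    # Search 4: capabilities whose description contains the filter text.
    needle = capability_text.lower()
    capabilities = {}
    for pathogen in pathogens:
        hits = [c for c in PATHOGEN_CAPABILITIES.get(pathogen, []) if needle in c.lower()]
        if hits:
            capabilities[pathogen] = hits
    return diseases, pathogens, capabilities


diseases, pathogens, capabilities = run_chain(
    ["cough", "fever", "difficulty breathing"], "nucleic acid"
)
```

Keeping each intermediate list as a separate return value mirrors the way each step of the query remains visible and reusable in the interface.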
  • This automated WWW search tool can be specified to run at designated intervals against the MRS database 114.
  • the query results are then used to generate search terms that are used by a second tool to search the web in a conventional fashion, e.g., via the Google search engine.
  • Data gathered from the web that matches the MRS results is e-mailed to the doctor as web pages, and the doctor can tell the tool to load into the MRS database 114 any or all of the new data that was found.
  • the new information may be loaded as Documents into the system, keywords are indexed by existing tools as described above, and linked to the Document such that the next time the doctor runs a search for diseases with symptoms of difficulty breathing, cough, and fever of 100.5 F, the new data will be displayed.
  • An advantage of system 100 is that it automatically presents the most current and most relevant data to a user in response to a request for information.
  • network access module 108 re-executes queries with a specified frequency, thus allowing the data in MRS database 114 to reflect the most currently-available information.
  • a user 102 of system 100 could create a search request, and specify that the request should remain active for a certain number of days, weeks, or even indefinitely. Thus the results the user receives after one day would be updated on a second or subsequent days, with newly-available information.
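A standing search request with an optional lifetime might be modelled as below. The class name, fields and the rule for reporting only previously unseen results are assumptions for illustration; the text specifies only that a request can stay active for a chosen period or indefinitely and that later results include newly available information.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional, Set


@dataclass
class StandingSearch:
    query: str
    created: date
    active_days: Optional[int] = None  # None means the request never expires
    seen: Set[str] = field(default_factory=set)

    def is_active(self, today):
        """True while the request should still be re-executed."""
        if self.active_days is None:
            return True
        return today < self.created + timedelta(days=self.active_days)

    def new_results(self, results):
        """Return results not reported on an earlier run, and remember them."""
        fresh = []
        for item in results:
            if item not in self.seen:
                self.seen.add(item)
                fresh.append(item)
        return fresh


search = StandingSearch("fever cough", created=date(2004, 10, 1), active_days=7)
day_one = search.new_results(["article-1", "article-2"])
day_two = search.new_results(["article-2", "article-3"])
```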
  • system 100 can provide a user 102 with a centralized source of global information about the search topic—for example, disease outbreaks in China, the United States and northern Africa may seem completely unrelated, but may be grouped together by system 100 because of similarities previously unrecognized.
  • a user 102 in the United States may not even be aware of the Chinese or African outbreaks, but would in fact be made aware of them by the results provided by system 100.
  • system 100 provides a centralized source of information that is readily accessible to clinicians and researchers throughout the world.
  • a group of individuals is suspected of having been exposed to an unknown pathogen.
  • a health care professional evaluates the fever, cough and chest pain symptoms exhibited by the group of individuals and provides a search request, via UI module 104, that includes the observed symptoms.
  • the results returned by data analysis engine 110 include lists of pathogens which are known to give rise to the symptoms.
  • Anthrax (Bacillus anthracis) appears as a member of the list.
  • the health care professional then makes another search request, requesting information about specific tests that can be performed to distinguish among the pathogens in the list. Because the health care professional has a particular reason to suspect that the group has been exposed to anthrax, the health care professional then searches for known clinical tests that can be used to distinguish anthrax from other pathogens.
  • the result of the query is that among other methods, base composition analysis and multi-locus VNTR analysis (MLVA) are efficient tests for identifying anthrax, both of which in this example are contained as data types within a data table in MRS database 114 designated "RS_Characteristic.”
  • the query may then provide links to a variety of data types that may include: contact information for scientists and other professionals with expertise in the testing methods, literature reports on the tests, and the like.
  • TIGER genetic evaluation of risk
  • the health care professional may choose to perform a nucleic acid base composition analysis test and thus, he collects clinical samples from the group that are then sent to the TIGER pathogen testing laboratory.
  • technicians query the same database to obtain intelligent amplification primers with the aim of rapid identification of Bacillus anthracis. After amplification of the clinical samples, the amplification products are analyzed by mass spectrometry to determine their base compositions.
  • System 100 is then queried for a match of the base composition of the amplification product, and returns information indicating that the experimentally-determined base compositions match the base compositions catalogued in the database for Bacillus anthracis, prompting the technicians to inform the health care professional to treat his patients for an anthrax infection.
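Matching a base composition against catalogued values can be sketched as follows. Here a composition is the count of A, C, G and T in an amplification product; in the workflow above it would be derived from mass spectrometry, whereas this sketch computes it from a short made-up sequence for illustration. The catalogue contents, organism labels and function names are all hypothetical.

```python
from collections import Counter


def base_composition(sequence):
    """Return the (A, C, G, T) counts of a nucleotide sequence."""
    counts = Counter(sequence.upper())
    return tuple(counts.get(base, 0) for base in "ACGT")


def match_composition(measured, catalog):
    """Return the sorted names of catalogued organisms listing this composition.

    `catalog` maps an organism name to the set of compositions recorded for it.
    """
    return sorted(name for name, comps in catalog.items() if measured in comps)


# Placeholder catalogue; the compositions are not real measurements.
CATALOG = {
    "organism 1": {(2, 1, 1, 1), (3, 2, 2, 1)},
    "organism 2": {(1, 1, 1, 1)},
}

measured = base_composition("AACGT")
matches = match_composition(measured, CATALOG)
```

An empty result would indicate that no catalogued organism has the measured composition, while multiple names would indicate that further amplification products are needed to discriminate among them.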
  • the health care professional can then ask system 100 again for the latest effective treatments for anthrax infection, which may include newly discovered antibiotics or other drugs.
  • an expert in microbial forensics may initiate an investigation wherein the strain of anthrax is determined by searching system 100 with nucleic acid sequence or base composition data.
  • a result of such a query may provide links to laboratories known to harbor various strains of anthrax.
  • a genetic engineering event may also be determined in a similar manner.
  • an expert in epidemiological investigations may perform a search for information about the spread of an anthrax outbreak. Information such as the rate of spread of particular strains of anthrax spores and the resistance to disinfection may be provided by system 100.
  • Referring now to Fig. 9, there is shown an example of a method for initially configuring system 100 in the manner described above.
  • pathogen data is loaded 902 into system 100, either on its own or as part of a threat list.
  • disease data is loaded 904 into system 100, again, either separately or as part of a threat list linking diseases with pathogens.
  • genetic data is preferably loaded 906.
  • System 100 identifies 908 duplicate entries in a manner as described above.
  • the input data is then linked 910 together, so that related pathogens, diseases and genetic information can be cross-referenced as appropriate.
  • System 100 extracts 912 a set of keywords from the stored data and uses those keywords as search terms to perform 914 periodic updates over the Internet.
  • information such as published articles or research found related to those keywords is then indexed in MRS database 114 along with the appropriate diseases, pathogens or genetic information.
  • System 100 receives 1002 a search request and first checks to see 1004 whether the search is one that has previously been requested and is therefore cached. If the search has not been done previously, then a query is constructed 1008; otherwise, the query is retrieved 1006 from the cache.
  • Once a query is constructed or retrieved, it is then used to execute 1010 a search over the web to obtain the latest information available about the queried items; for example, if the search request is for treatments for anthrax, the query might include a search on the terms "anthrax", "Bacillus anthracis", "Ames", and "treatment".
  • The search results are analyzed and correlated 1012 with data already in MRS database 114, and the search results are then returned 1014.
  • The query is also cached so that it can be quickly retrieved again if needed.
  • Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
  • The present invention also relates to an apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer.
  • A computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
  • The computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
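By way of illustration only, the search flow described above (receiving a request, checking the query cache, constructing or retrieving the query, executing the search, and correlating the results) might be sketched in Python as follows. The synonym table, function names and the `web_search` callable are hypothetical stand-ins introduced for this sketch. They are not part of the described system, and correlation 1012 is reduced here to flagging results that are already known.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical synonym table; in the described system such terms would be
# drawn from MRS database 114 rather than hard-coded.
SYNONYMS: Dict[str, List[str]] = {
    "anthrax": ["anthrax", "Bacillus anthracis", "Ames"],
}

# Cache of previously constructed queries, keyed by (subject, topic).
_query_cache: Dict[Tuple[str, str], List[str]] = {}


def construct_query(subject: str, topic: str) -> List[str]:
    """Step 1008: expand the subject into known synonyms and append the topic."""
    return SYNONYMS.get(subject.lower(), [subject]) + [topic]


def get_query(subject: str, topic: str) -> Tuple[List[str], bool]:
    """Steps 1004-1008: return (query terms, whether they came from the cache)."""
    key = (subject.lower(), topic.lower())
    if key in _query_cache:
        return _query_cache[key], True        # step 1006: retrieve from cache
    terms = construct_query(subject, topic)   # step 1008: construct the query
    _query_cache[key] = terms                 # cache it for later requests
    return terms, False


def handle_search(
    subject: str,
    topic: str,
    web_search: Callable[[List[str]], List[str]],
    known_documents: Set[str],
) -> List[Tuple[str, bool]]:
    """Steps 1002-1014: run the search and flag hits already in the database."""
    terms, _ = get_query(subject, topic)
    hits = web_search(terms)                               # step 1010
    return [(hit, hit in known_documents) for hit in hits]  # steps 1012, 1014
```

In this sketch only the query terms are cached, not the search results, so that each request still retrieves the latest information from the web.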

Abstract

A centralized system provides for locating, storing and providing microbial-related information, including pathogens, diseases, symptoms, genetic information, and related documentation. A centralized database relates pathogens to diseases that they cause, symptoms of those diseases, genetic information about the pathogens, and documentation that is associated with any of the above. By analyzing the stored data, the system recognizes duplicated information, for example a pathogen present in the database twice, each time under a different taxonomical name, and creates a link from one instance of the pathogen to the other. The system additionally identifies common genes and sequences among related organisms. A researcher uses the system to perform investigations. For example, a medical doctor can enter patient symptoms to obtain a list of consistent diseases and associated pathogens. The system also tracks the progression of an outbreak and identifies similar occurrences by searching the World Wide Web and reporting results.

Description

DATABASE FOR MICROBIAL INVESTIGATIONS
Inventors: Kumar Hari, John McNeil, Dave Ecker, Neill White, Vivek Samant, Vanessa Zapp, Alan Goates
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of United States Provisional Application No. 60/509,911, filed October 9, 2003, and United States Provisional Application No. 60/598,408, filed August 2, 2004; each of which is incorporated by reference herein in its entirety.
STATEMENT OF GOVERNMENT SUPPORT
[0002] This invention was made with United States Government support under FBI contracts J-FBI-02-127 and J-FBI-04-189. The United States Government may have certain rights in the invention.
BACKGROUND OF THE INVENTION
Field Of The Invention
[0003] This invention relates generally to the field of microbial investigations. In particular, this invention is directed towards a centralized computerized investigative system designed to assist researchers and clinicians with identifying and treating pathogens and related data.
Description Of The Related Art
[0004] The number of organisms that pose an infection risk to humans, both bacterial and viral, is amazingly large. A recent literature review identified 1,415 species of infectious organisms pathogenic to humans, including 217 viruses and prions, 538 bacteria and rickettsia, 307 fungi, 66 protozoa and 287 helminths (Taylor et al., Phil. Trans. R. Soc. Lond. B (2001) 356, 983). In addition, threats to the food supply, i.e. to crops and farm animals, and the environment also exist in abundance. Further, there are many strain variants below the species level, and organisms that are phylogenetically closely related to known infectious agents from which bioengineered or emerging infectious agents might originate. This includes the possibility of organisms shifting from animal to human hosts.
[0005] In addition to the bioagents themselves, there are many threat-related genes that encode toxins and antibiotic resistance and that mediate virulence, host range and pathogenicity. Many of these genes are important to consider in the identification and treatment of bioengineered threats.
[0006] In spite of the above, no efficient and reliable methodology exists for bringing together information about these threats in one centralized location, largely because the solution is far from trivial. For example, there are many different government, medical and veterinary sources of "threat lists". Typically, threat lists contain common or sometimes misspelled names of bioagents, rather than the accepted taxonomical designations, if an accepted designation even exists. Naming ambiguities extend to disease lists as well, where common names are often used, and the diseases may themselves be caused by multiple organisms. Pathogens are often named after the disease they cause; for example, severe acute respiratory syndrome (SARS) and foot-and-mouth disease are both pathogen names that describe the effect on their victims rather than inherently describing the pathogen. Disease and pathogen names also can vary depending on the host. Further, taxonomies are legitimately derived from multiple sources, including organism characteristics; disease effects; and molecular characteristics, e.g., rRNA sequence. Disease information and sequence data are often linked to different taxa. For example, viral sequences are most completely linked to the National Center for Biotechnology Information's (NCBI) taxonomy, while viral diseases and characteristics are most completely linked to the International Committee on Taxonomy of Viruses (ICTV) taxonomy.
[0007] Today, a number of individuals with specific local domain expertise are required to manually connect threat lists, biological agents, correct taxonomic names, and correct sequences in a genetic sequence database such as the National Institutes of Health's GenBank, since there is no comprehensive collection of threats, synonyms and consistent taxonomic names.
[0008] The benefit of having coordinated access to these disparate sources of data is substantial— clinical investigations of infections, for example, would be improved by allowing clinicians to cross-reference symptoms and epidemiology data with data already available from other sources. Today, that data may exist but only in the hands of experts doing research in discrete subject areas, effectively turning a clinician's search into something akin to searching for a needle in a haystack. The history of the recent SARS outbreak is an illustrative example.
[0009] Accordingly, there is a need for a centralized microbial knowledge base system that provides efficient access to a diverse array of expert-curated data.
SUMMARY OF THE INVENTION
[0010] The present invention enables a centralized system for locating, storing and providing microbial-related information, including pathogens, diseases, symptoms, genetic information, and related documentation.
[0011] The present invention maintains a database that relates pathogens to diseases that they cause, symptoms of those diseases, genetic information about the pathogens, and documentation that is associated with any of the above. By analyzing the stored data, the system recognizes duplicated information, for example a pathogen present in the database twice, each time under a different taxonomical name, and creates a link from one instance of the pathogen to the other. The system additionally identifies common genes and sequences among related organisms.
[0012] A user such as a clinician or researcher can use a system of the present invention to perform investigations. For example, a medical doctor can enter patient symptoms into the system to obtain a list of diseases consistent with the entered symptoms, and pathogens associated with each disease. The system can also be used, for example, to track the progression of an outbreak and identify similar occurrences by searching the World Wide Web on a nightly, automated basis and reporting results such as news articles, published research and the like to the doctor, who can then validate the new data making it part of the system's database.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] Fig. 1 illustrates a system for performing microbial investigations in accordance with an embodiment of the present invention.
[0014] Fig. 2 illustrates a pathogen report in accordance with an embodiment of the present invention.
[0015] Fig. 3 illustrates an example of a query screen presented to a user in accordance with an embodiment of the present invention.
[0016] Fig. 4 provides an illustration of a pathogen report in accordance with an embodiment of the present invention.
[0017] Fig. 5 illustrates an NCBI-source pathogen page in accordance with an embodiment of the present invention.
[0018] Fig. 6 illustrates an example of a disease report in accordance with an embodiment of the present invention.
[0019] Fig. 7 illustrates a disease/host report in accordance with an embodiment of the present invention.
[0020] Fig. 8 illustrates a use of a list-based database query in accordance with an embodiment of the present invention.
[0021] Fig. 9 illustrates a method for configuring a database in accordance with an embodiment of the present invention.
[0022] Fig. 10 illustrates a method for executing a search in accordance with an embodiment of the present invention.
[0023] The figures depict preferred embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
System Architecture
[0024] Referring now to Fig. 1, there is shown a block diagram illustrating a system in accordance with one embodiment of the present invention. System 100 includes a user interface (UI) module 104 for communicating with a user 102 of the system; a query constructor module 106 for building search queries; a network access module 108 for executing the queries and searching a network such as the Internet for results of the query; a data analysis engine 110 for reviewing the results received from the search and compiling them for display to the user 102 via user interface module 104; and a Microbial Rosetta Stone (MRS) database 114 for storing the returned search results. MRS database 114 thus is an effective reference database of microbial organisms, diseases, genomics, threat management and forensics. Also illustrated in Fig. 1 are three example data sources 112a, 112b, and 112c, located across an Internet connection from system 100. As will be appreciated by those of skill in the art, more or fewer data sources 112 could be searched by system 100, and the inclusion of three data sources in Fig. 1 is purely for illustration.
[0025] User interface module 104 enables communication between system 100 and a user 102 of the system. In one embodiment, UI module 104 includes a web server, and user 102 accesses the web server using a conventional web browser such as Internet Explorer by Microsoft Corporation of Redmond, Washington, or Firefox by The Mozilla Corporation of Mountain View, California. In an alternative embodiment, user 102 uses client software developed specifically to interact with UI module 104; and in another embodiment user 102 interacts directly with UI module 104 by being physically located at the same location as system 100.
[0026] In one embodiment, UI module 104 includes a Java-based expert interface called a pluggable object viewer (POV), created by and available as open source from Isis Pharmaceuticals of Carlsbad, California. This framework allows for hierarchical browsing of data from any source that can be wrapped into a Java object as defined by a plug-in level API. The framework keeps track of lists of these objects and allows for querying related objects by any combination of a related list and a set of filters upon the data object in question. The logic of the query is always available to the user and can be branched out from any point to create alternate search paths. Queries of user-defined complexity and depth can be annotated, saved as XML files and re-run or edited at a later time. The lists of results can be manually manipulated or combined with other lists using set logic (union, intersection, difference, etc.). Result lists can display any level of depth for a query from a single result to an entire history for the query that produced those results. Individual items in the list can display further detail when selected, and the display can be simple text output, more complex HTML or even graphical in nature. HTML is preferably used not only for display formatting, but for the ability to embed URLs to other public data sources, allowing UI module 104 to link from within system 100 to externally available information. Examples of a typical user interface in accordance with embodiments of the present invention are described below.
System Operation
[0027] In a preferred embodiment, system 100 is initially configured by placing available information in MRS database 114, and associating pathogen, disease and gene sequence data wherever possible.
[0028] A list of known microbial threats is assembled from available sources. Depending on the source, a threat list might include pathogens, diseases, or pathogen/disease pairings. Once assembled, the pathogens associated with each list are standardized according to the correct internationally accepted taxonomic names. The threat list and associated pathogens are stored in MRS database 114. Next, a comprehensive collection of disease synonyms is loaded for each of the agents listed on the threat lists. In addition, a list of gene sequences and gene synonyms is also loaded into MRS database 114, including sequences that identify individual genotypes of the pathogens. MRS database 114 maintains an association between the disease synonyms, the genetic information, and pathogens. Note that in some cases, not all data is available for each pathogen— for example, genetic data might not yet be available. This does not present a problem for system 100, as the data is not a required element, and optionally can be added to MRS database 114 later, as the information becomes available.
[0029] Additionally, available publications that describe the threat list pathogens are preferably indexed and associated in MRS database 114 with the pathogens, diseases, and other elements of the database that they discuss.
[0030] In a preferred embodiment, initial data is loaded into system 100 in a variety of ways, including manual entry; bulk import using computational parsers and data validation tools; expert curation facilities; and from other database management systems (DBMS). Network access module 108 preferably includes automated data import scripts for loading taxonomic lineages and nucleic acid sequences from a source such as the National Center for Biotechnology Information (NCBI). In one embodiment, network access module 108 also imports data automatically from other microbial databases, including curated data from publicly available databanks such as the Swissprot protein sequence database maintained by the Swiss Institute of Bioinformatics, the Protein Families (Pfam) database of alignments maintained by the Wellcome Trust's Sanger Institute, Kyoto Encyclopedia of Genes and Genomes maintained by the Kyoto University Bioinformatics Center (KEGG), and the Gene Ontology (GO) project maintained by the Gene Ontology Consortium.
[0031] In one embodiment, network access module 108 also imports data automatically from government sites, including the Centers for Disease Control (CDC); the United States Department of Agriculture (USDA); PBR (Pox Virus Database); ProMED, and the Department of Health and Human Services (HHS).
[0032] In one embodiment, data is also loaded from the International Committee on the Taxonomy of Viruses (ICTV), including data for viral isolates. Preferably, internationally-accepted taxonomy standards for viral organisms are used.
[0033] Where available, system 100 associates each of the pieces of initial data with other related data such as bacterial taxonomy data from NCBI or ortholog data from various publications.
[0034] Because, as noted, naming ambiguities may exist across different data sources, some analysis is preferably undertaken by data analysis engine 110 to determine which imported entries are duplicates of other imported entries, or, similarly, which seemingly disparate pieces of data in fact relate to a common pathogen.
[0035] Data ambiguities within the MRS database 114 are preferably resolved by creating additional tables in the database schema. These additional linking tables allow "many-to-many"-style relationships to be defined that otherwise would not be available in a relational schema. For example, one naming ambiguity involves three GenBank nucleotide sequences (X52374, X52505 and X52506) that were linked to the Berne virus at NCBI and to the Equine torovirus at ICTV. Despite the similar sequence associations, searching NCBI with the name "Equine torovirus" yields no results. To alert users that this naming ambiguity exists while maintaining the taxonomic context for both designations, system 100 creates a new table to connect the differing NCBI and ICTV names that, by sequence associations, appear to describe the same organism. Data relationships stored in this linking table are then preferably brought to the user's attention via user interface module 104, as illustrated in Fig. 2. In Fig. 2, the NCBI pathogen name, "Berne virus" 202 is listed as the Pathogen Name , and the taxonomy source 206 field indicates that the source is the NCBI 208 database. Under the equivalent pathogen section 210 is an indication of the corresponding ICTV name, equine torovirus 212. A similar linking table architecture is preferably used to track synonyms for pathogens and disease names, and to provide data source referencing for each name. Links to documents and keywords are also included to allow referencing of threat list, disease, epidemiology, formulation, transmission, forensic, protocol, characteristic and sequence data.
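For purposes of illustration only, the linking-table approach of the preceding paragraph might be sketched with Python's built-in `sqlite3` module as follows. The table names, column names, and the rule of linking two names that share at least one GenBank accession are assumptions made for this sketch. The actual schema of MRS database 114 is not specified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pathogen (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    taxonomy_source TEXT NOT NULL            -- e.g. 'NCBI' or 'ICTV'
);
CREATE TABLE pathogen_sequence (             -- pathogen <-> GenBank accession
    pathogen_id INTEGER REFERENCES pathogen(id),
    accession TEXT NOT NULL
);
CREATE TABLE equivalent_pathogen (           -- hypothetical linking table
    pathogen_id INTEGER REFERENCES pathogen(id),
    equivalent_id INTEGER REFERENCES pathogen(id),
    PRIMARY KEY (pathogen_id, equivalent_id)
);
""")

# The two differing designations from the example above.
conn.executemany(
    "INSERT INTO pathogen VALUES (?, ?, ?)",
    [(1, "Berne virus", "NCBI"), (2, "Equine torovirus", "ICTV")],
)
accessions = ["X52374", "X52505", "X52506"]
conn.executemany(
    "INSERT INTO pathogen_sequence VALUES (?, ?)",
    [(pid, acc) for pid in (1, 2) for acc in accessions],
)

# Assumed rule: link two pathogen entries that share at least one accession.
conn.execute("""
INSERT INTO equivalent_pathogen
SELECT DISTINCT a.pathogen_id, b.pathogen_id
FROM pathogen_sequence a
JOIN pathogen_sequence b
  ON a.accession = b.accession AND a.pathogen_id <> b.pathogen_id
""")


def equivalents(name):
    """Return (name, taxonomy source) rows for entries linked to `name`."""
    return conn.execute("""
        SELECT q.name, q.taxonomy_source
        FROM pathogen p
        JOIN equivalent_pathogen e ON e.pathogen_id = p.id
        JOIN pathogen q ON q.id = e.equivalent_id
        WHERE p.name = ?
        ORDER BY q.name
    """, (name,)).fetchall()
```

Because each designation keeps its own row and taxonomy source, a lookup on either name can surface the other without discarding the taxonomic context of either.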
[0036] Once the initial data is loaded and validated, system 100 then begins an automated curation of the data. First, system 100 ensures that pathogens are associated with diseases that they may cause. Next, if available, a link is established from each pathogen to its genomic sequence. System 100 ensures a detailed and accurate taxonomic mapping of disease/threat organism by linking each included pathogen to its proper taxa, and each disease to the pathogen that causes it. Also, if available, system 100 creates a link from the pathogen to additional information on pathogen gene products and functions, including, for example, start and stop codons, protein coding regions, regions where PCR primers can bind, etc.
[0037] As part of the data validation process, system 100 attempts to identify organisms that are annotated as being different, but which are in fact the same. Preferably, system 100 includes scripts that make the identification of duplicate entries based on names and synonyms that define the threat agent in MRS database 114. For sequences, duplicate identification is preferably made by comparing and mapping features, e.g., exons, introns, untranslated regions (UTRs), primer binding sites, etc., from one sequence to another. System 100 then preferably checks to make sure that each of the pathogens identified as being duplicates of one another are in fact the same. Where they are not, they are unlinked from one another.
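As a non-limiting sketch, name-based identification of candidate duplicates as described in the preceding paragraph might look as follows in Python. The normalization rule (case folding and treating hyphens as spaces), the entry identifiers and the function names are assumptions introduced for illustration. The groups produced are candidates only and would still be subject to the verification step described above.

```python
from collections import defaultdict
from typing import Dict, List, Set


def normalize(name: str) -> str:
    """Assumed normalization: lower-case, hyphens as spaces, single spacing."""
    return " ".join(name.lower().replace("-", " ").split())


def candidate_duplicates(entries: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group entry ids that share at least one normalized name or synonym.

    `entries` maps a (hypothetical) entry id to its names and synonyms.
    """
    parent = {entry_id: entry_id for entry_id in entries}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen: Dict[str, str] = {}
    for entry_id, names in entries.items():
        for name in names:
            key = normalize(name)
            if key in first_seen:
                # Same normalized name seen before: merge the two groups.
                parent[find(entry_id)] = find(first_seen[key])
            else:
                first_seen[key] = entry_id

    groups: Dict[str, Set[str]] = defaultdict(set)
    for entry_id in entries:
        groups[find(entry_id)].add(entry_id)
    return [group for group in groups.values() if len(group) > 1]
```

For example, an entry named "Foot-and-mouth disease virus" and another named "foot and mouth disease virus" normalize to the same key and are therefore grouped for review.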
[0038] In addition to identifying duplicates and false duplicates, system 100 also identifies common genes and sequences among related organisms, by, for example, performing analyses of sequence similarities at the nucleotide or amino acid level. Such analyses are carried out using, e.g., available software utilities such as BLAST programs (basic local alignment search tools) and PowerBLAST programs known in the art (Altschul et al., J. Mol. Biol., 1990, 215, 403-410; Zhang and Madden, Genome Res., 1997, 7, 649-656), or other alignment utilities. For example, data analysis engine 110 preferably analyzes sequences and creates gene sequence alignments for hundreds or thousands of sequences at once. In comparing sequence features, data analysis engine 110 looks for 100% matches to a given DNA sequence feature, for example "Feature X = ACGTACGT". If the sequence feature is found in a full sequence but the feature name has not been mapped to the 100%-matched region, data analysis engine 110 designates that sequence as having "Feature X", and defines where in the full sequence it is found. Having identified duplicate, false duplicate and related organisms, in one embodiment system 100 then automatically notifies curators of inconsistent data or other errors it has discovered. Where possible, in a preferred embodiment, system 100 resolves the inconsistencies automatically; alternatively, curators are alerted to resolve the inconsistencies manually.
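The 100%-match feature mapping described in the preceding paragraph can be illustrated with the following minimal Python sketch. It uses plain exact substring search rather than BLAST, and the function names and the 0-based (start, end) coordinate convention are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # 0-based start, exclusive end


def find_exact_matches(feature_seq: str, full_seq: str) -> List[Span]:
    """Return every span where feature_seq occurs exactly, overlaps included."""
    spans: List[Span] = []
    start = full_seq.find(feature_seq)
    while start != -1:
        spans.append((start, start + len(feature_seq)))
        start = full_seq.find(feature_seq, start + 1)
    return spans


def annotate(
    features: Dict[str, str],
    full_seq: str,
    existing: Dict[str, List[Span]],
) -> Dict[str, List[Span]]:
    """Add feature names to matched regions that are not yet annotated."""
    updated = {name: list(spans) for name, spans in existing.items()}
    for name, feature_seq in features.items():
        for span in find_exact_matches(feature_seq.upper(), full_seq.upper()):
            if span not in updated.setdefault(name, []):
                updated[name].append(span)
    return updated
```

Applied to the example above, searching the full sequence `TTACGTACGTGG` for Feature X (`ACGTACGT`) designates the region from position 2 to position 10 as Feature X.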
[0039] In one embodiment, MRS database 114 is also initially provided with documents such as studies, academic research papers, clinical trial data, news articles or other relevant materials relating to diseases and pathogens. For each document, some metadata is created either manually or automatically about the document. In one embodiment, for example, keywords are automatically derived from the document using conventional document analysis technology such as the ht://Dig product available from the ht://Dig group, and the document is then indexed according to its keywords. Those of skill in the art will appreciate that other methods exist for indexing documents, including by performing latent semantic analysis, manual indexing, etc., any of which may be appropriate for a given implementation of system 100. Once the documents are indexed, they are linked to appropriate pathogens and/or diseases and/or genetic data. For example, a document describing the identification of the Ames strain of Bacillus anthracis as the threat agent in the 2001 anthrax attacks is stored in MRS database 114 and linked to disease, pathogen, two taxonomic lineages, other authors who have contributed papers about Bacillus anthracis, relevant gene sequences, and related keywords.
Queries
[0040] A user who wishes to obtain some knowledge about a pathogen, set of symptoms, or other data tracked by system 100 makes a search request by interacting with UI module 104, as described further below. UI module 104 passes the search request received from the user to query constructor module 106, which then constructs a query that can be used to search both MRS database 114 and external data sources 112, e.g., via a search of the Internet. Data analysis engine 110 correlates the responses to the searches and provides them to user 102 via UI module 104.
User Interface
[0041] Fig. 3 illustrates an example of a query screen 300 presented to user 102 via UI module 104 in an embodiment of the present invention. Query screen 300 preferably allows a user 102 to type a global free-form query into a text box 302 and click the "find" button 304 to perform a search. Additionally, the user 102 can choose to filter the search results according to filter criteria 306. In the illustrated embodiment of Fig. 3, filter criteria 306 includes Contact, Disease, Document, Forensics, Pathogen, Sequence, Threat List, Genome, and Characteristic fields. Each field additionally includes Boolean statements by which to filter. For example, to filter by contact, a user 102 can choose to filter by the contact's last name or first name. Those of skill in the art will appreciate that a multitude of filtering options can be used in connection with inputting a query.
[0042] Fig. 4 provides an illustration of a pathogen report 400. Report 400 could be obtained, for example, when a user 102 types "Foot and mouth" into text box 302 in order to do a full database search; or by entering "foot and mouth" in a synonym search box from a pathogen search page (not shown). A list of results matching the criteria is then returned, and the user can click on a hyperlink to one of the results to view the report. Pathogen reports 400 preferably display salient data for microorganisms, including pathogen names, taxonomic rank with information source, and synonyms. Taxonomic lineages that support upward and downward tree traversals are also shown. The display of "Equivalent Pathogen(s)" 402 indicates that alternative taxonomic lineages exist for the organism. Diagnostic methods, laboratories capable of performing them, organizations which supply reagents, and links to relevant protocol references are listed in the "Capability Information" section 404 of pathogen report 400. Clicking on the "foot-and-mouth disease virus (NCBI)" link 406, for example, links to Fig. 5, which is the NCBI-source pathogen page for foot and mouth disease.
[0043] Fig. 6 illustrates an example of a disease report 600 in accordance with an embodiment of the present invention. A disease report 600 preferably includes a description of host-pathogen associations, symptoms and treatments associated with the disease. In one embodiment, the report includes hyperlinks to other related data. For example, in Fig. 6, clicking on "glycine max" 602 would take a user to the pathogen page for the phakopsora pachyrhizi pathogen.
[0044] Fig. 7 illustrates a disease/host report 700 in accordance with an embodiment of the present invention. The disease/host report describes each unique host/pathogen correlation found as a result of the query search. In the illustrated embodiment, epidemiological information 702 is also displayed.
[0045] Other reports can also be provided. For example, in one embodiment a forensics report can be produced by system 100, including a history of the pathogen with links to seminal publications, and details about cases where the pathogen may have been used in a terrorist event.
[0046] Fig. 8 illustrates a way in which a list-based query can be made of the database using the Java-based POV interface. The user-configured Query Panel 802 displays: 1) data types that can be used in searches; 2) the number of results from a search ("7 Threat Organism(s)" or "83 Sequence(s)") with user-defined annotations ("Use these sequences"); 3) and data types available for additional searches based on a given result. The Result Panel 804 displays the data relationships between lists of results (here, the 7 Threat Organisms and the 83 Sequences). Details for any highlighted information in the Result Panel are shown in the Inspector Panel 806. Web links and other tools, e.g., a sequence "Feature Viewer" tool 808, are preferably embedded in the Inspector Panel 806.
[0047] For example, a medical doctor may recognize that five patients arrive in the clinic with similar symptoms of difficulty breathing, cough, and fever of 100.5 F. In order to understand more about these symptoms and how to diagnose the disease, the doctor logs into system 100 via UI module 104, clicks on the "Disease" link, and enters the symptoms into the "Symptoms" search field. A list of diseases is displayed, and by viewing the individual Disease reports, the etiologic agents, hosts, symptoms and epidemiologic properties are displayed. By clicking on the linked Pathogens in the Disease reports, the doctor can find microbiology lab protocols that may be used to uniquely identify the organism, diagnose the disease and possibly guide therapeutic intervention. Next, the doctor may wish to have the system track the progression of this outbreak and identify similar occurrences by searching the World Wide Web on a nightly, automated basis. To do so, the doctor enters the POV user interface, and constructs a query by entering the symptoms he has identified into a "Symptom" search and running the query. Using the results, he searches for all Diseases associated with those symptoms by running a second search, and then finds all pathogens which can cause those diseases by running a third search. Finally, the doctor may wish to find data on molecular diagnostic techniques, which he does by running a fourth search on "Capability" where he limits this search with the text "nucleic acid". While having similar content as the information in the web interface search, results of this POV query are preferably displayed in a table format, and the doctor can save the query logic as a template that can be used to automatically search the World Wide Web. This automated WWW search tool can be specified to run at designated intervals against the MRS database 114. 
The query results are then used to generate search terms that are used by a second tool to search the web in a conventional fashion, e.g., via the Google search engine. Data gathered from the web that matches the MRS results is e-mailed to the doctor as web pages, and the doctor can tell the tool to load into the MRS database 114 any or all of the new data that was found. The new information may be loaded as Documents into the system, keywords are indexed by existing tools as described above, and linked to the Document such that the next time the doctor runs a search for diseases with symptoms of difficulty breathing, cough, and fever of 100.5 F, the new data will be displayed.
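The chained, list-based query described in this example (symptoms, then diseases, then pathogens, then capabilities limited by the text "nucleic acid") might be sketched in Python as follows. All records below are placeholder data invented for the sketch, and the function names are hypothetical. Neither reflects the contents or the interface of MRS database 114.

```python
from typing import Dict, Iterable, List, Set

# Placeholder records for illustration only; not clinical data.
DISEASE_SYMPTOMS: Dict[str, Set[str]] = {
    "disease A": {"difficulty breathing", "cough", "fever"},
    "disease B": {"difficulty breathing", "cough", "fever", "rash"},
    "disease C": {"fever"},
}
DISEASE_PATHOGENS: Dict[str, Set[str]] = {
    "disease A": {"pathogen X"},
    "disease B": {"pathogen X", "pathogen Y"},
    "disease C": {"pathogen Z"},
}
PATHOGEN_CAPABILITIES: Dict[str, Set[str]] = {
    "pathogen X": {"nucleic acid amplification assay", "culture protocol"},
    "pathogen Y": {"Nucleic acid base composition analysis"},
    "pathogen Z": {"culture protocol"},
}


def diseases_for(symptoms: Set[str]) -> List[str]:
    """Second search: diseases whose symptom set includes all given symptoms."""
    return sorted(d for d, s in DISEASE_SYMPTOMS.items() if symptoms <= s)


def pathogens_for(diseases: Iterable[str]) -> List[str]:
    """Third search: pathogens that can cause any of the given diseases."""
    return sorted({p for d in diseases for p in DISEASE_PATHOGENS.get(d, ())})


def capabilities_for(pathogens: Iterable[str], text_filter: str = "") -> List[str]:
    """Fourth search: capabilities whose text contains the filter string."""
    needle = text_filter.lower()
    return sorted(
        {
            c
            for p in pathogens
            for c in PATHOGEN_CAPABILITIES.get(p, ())
            if needle in c.lower()
        }
    )


def run_template(symptoms: Set[str], text_filter: str) -> Dict[str, List[str]]:
    """Re-runnable query template chaining the searches in order."""
    diseases = diseases_for(symptoms)
    pathogens = pathogens_for(diseases)
    return {
        "diseases": diseases,
        "pathogens": pathogens,
        "capabilities": capabilities_for(pathogens, text_filter),
    }
```

Because the whole chain is captured in a single function, it can be saved and re-run unchanged at designated intervals, in the manner of the saved query template described above.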
Data Updates
[0048] An advantage of system 100 is that it automatically presents the most current and most relevant data to a user in response to a request for information. In a preferred embodiment, network access module 108 re-executes queries with a specified frequency, thus allowing the data in MRS database 114 to reflect the most currently-available information. For example, during the recent SARS coronavirus outbreak, research progressed rapidly on the pathogen itself, disease symptoms, and treatment protocols, with new information being published frequently. A user 102 of system 100 could create a search request, and specify that the request should remain active for a certain number of days, weeks, or even indefinitely. Thus the results the user receives after one day would be updated on a second or subsequent days, with newly-available information. And, because data analysis engine 110 resolves variation in nomenclature and identifies similarities among seemingly different information, system 100 can provide a user 102 with a centralized source of global information about the search topic— for example, disease outbreaks in China, the United States and northern Africa may seem completely unrelated, but may be grouped together by system 100 because of similarities previously unrecognized. A user 102 in the United States may not even be aware of the Chinese or African outbreaks, but would in fact be made aware of them by the results provided by system 100. In this manner, system 100 provides a centralized source of information that is readily accessible to clinicians and researchers throughout the world.
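As one possible illustration of a search request that remains active for a specified period and is re-executed with a specified frequency, consider the following Python sketch. The class name, its fields and the `new_results` helper are assumptions introduced for this example and do not describe the actual implementation of network access module 108.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Set


@dataclass
class StandingRequest:
    """A search request that is periodically re-executed (hypothetical model)."""

    terms: List[str]
    created: datetime
    interval: timedelta                # how often the query is re-executed
    active_for: Optional[timedelta]    # None means the request never expires
    last_run: Optional[datetime] = None

    def is_active(self, now: datetime) -> bool:
        return self.active_for is None or now < self.created + self.active_for

    def is_due(self, now: datetime) -> bool:
        """True if the request is still active and its interval has elapsed."""
        if not self.is_active(now):
            return False
        return self.last_run is None or now - self.last_run >= self.interval


def new_results(previous: Set[str], current: Set[str]) -> Set[str]:
    """Items found by the latest run that earlier runs had not returned."""
    return current - previous
```

A scheduler could call `is_due` for each stored request, re-run those that are due, and report only the output of `new_results` to the user on the second and subsequent days.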
[0049] Another example of how system 100 can be used is as follows:
[0050] A group of individuals is suspected of having been exposed to an unknown pathogen. A health care professional evaluates the fever, cough and chest pain symptoms exhibited by the group of individuals and provides a search request, including the observed symptoms, via UI module 104. The results returned by data analysis engine 110 include lists of pathogens which are known to give rise to the symptoms. Anthrax (Bacillus anthracis) appears as a member of the list. The health care professional then makes another search request, requesting information about specific tests that can be performed to distinguish among the pathogens in the list. Because the health care professional has a particular reason to suspect that the group has been exposed to anthrax, the health care professional then searches for known clinical tests that can be used to distinguish anthrax from other pathogens. The result of the query is that, among other methods, base composition analysis and multi-locus VNTR analysis (MLVA) are efficient tests for identifying anthrax, both of which in this example are contained as data types within a data table in MRS database 114 designated "RS_Characteristic." The query may then provide links to a variety of data types that may include: contact information for scientists and other professionals with expertise in the testing methods, literature reports on the tests, and the like. For the case of base composition analysis, a link to laboratories performing triangulation identification for genetic evaluation of risk (TIGER) pathogen testing may be provided. Base composition analysis is described further in U.S. patent application Serial Nos. 10/323,233; 09/798,007; 09/891,793; 60/431,319; 60/443,443; 60/443,788; and 60/447,529, each of which is commonly owned and incorporated herein by reference in its entirety.
Additionally, commonly owned US patent applications 10/728,486; 10/660,122; 10/660,997; 10/660,996; and 10/660,998 are each also incorporated herein by reference in their entirety.
[0051] The health care professional may choose to perform a nucleic acid base composition analysis test and thus, he collects clinical samples from the group that are then sent to the TIGER pathogen testing laboratory. At the TIGER laboratory, technicians query the same database to obtain intelligent amplification primers with the aim of rapid identification of Bacillus anthracis. After amplification of the clinical samples, the amplification products are analyzed by mass spectrometry to determine their base compositions. System 100 is then queried for a match of the base composition of the amplification product, and returns information indicating that the experimentally-determined base compositions match the base compositions catalogued in the database for Bacillus anthracis, prompting the technicians to inform the health care professional to treat his patients for an anthrax infection. The health care professional can then ask system 100 again for the latest effective treatments for anthrax infection, which may include newly discovered antibiotics or other drugs.
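The base composition lookup described above can be illustrated with a minimal sketch. It assumes a base composition is simply the count of A, G, C and T in an amplification product and that the catalogue maps organism names to such counts. The function names, the `tolerance` parameter, and the toy sequences are all hypothetical. In practice the composition would be derived from mass spectrometry measurements rather than from a known sequence; the sequence-based helper here is only a stand-in.

```python
from collections import Counter


def base_composition(sequence):
    """Count A, G, C and T in a nucleotide sequence (case-insensitive)."""
    counts = Counter(sequence.upper())
    return tuple(counts.get(base, 0) for base in "AGCT")


def match_composition(measured, catalogue, tolerance=0):
    """Return organisms whose catalogued composition is within `tolerance`
    counts of the measured composition for every base."""
    return sorted(
        organism
        for organism, reference in catalogue.items()
        if all(abs(m - r) <= tolerance for m, r in zip(measured, reference))
    )


# Toy catalogue built from invented sequences; not real amplicon data.
catalogue = {
    "organism X": base_composition("AAGGCTTA"),
    "organism Y": base_composition("AGGCCCTT"),
}
measured = base_composition("taggacta")
print(match_composition(measured, catalogue))  # ['organism X']
```

Note that two different sequences with the same base counts match equally well, which is why the sketch compares compositions rather than sequences.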
[0052] At this point, an expert in microbial forensics may initiate an investigation wherein the strain of anthrax is determined by searching system 100 with nucleic acid sequence or base composition data. A result of such a query may provide links to laboratories known to harbor various strains of anthrax. A genetic engineering event may also be determined in a similar manner.
[0053] Likewise, an expert in epidemiological investigations may perform a search for information about the spread of an anthrax outbreak. Information such as the rate of spread of particular strains of anthrax spores and the resistance to disinfection may be provided by system 100.
[0054] Referring now to Fig. 9, there is shown an example of a method for initially configuring system 100 in the manner described above. First, pathogen data is loaded 902 into system 100, either on its own or as part of a threat list. Next, disease data is loaded 904 into system 100, again, either separately or as part of a threat list linking diseases with pathogens. After the pathogen and disease data is loaded, genetic data is preferably loaded 906. System 100 then identifies 908 duplicate entries in a manner as described above. The input data is then linked 910 together, so that related pathogens, diseases and genetic information can be cross-referenced as appropriate. System 100 then extracts 912 a set of keywords from the stored data and uses those keywords as search terms to perform 914 periodic updates over the Internet. As described above, information such as published articles or research found related to those keywords is then indexed in MRS database 114 along with the appropriate diseases, pathogens or genetic information.
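Steps 908 through 912 of Fig. 9 can be sketched roughly as follows. The record layout (dictionaries with `name` and `pathogen` keys) and all function names are hypothetical. Matching duplicates by case- and whitespace-normalized name is an assumption made for the example, standing in for the duplicate identification described earlier in the specification.

```python
def normalize(name):
    """Collapse case and whitespace so name variants compare equal."""
    return " ".join(name.lower().split())


def deduplicate(records):
    """Keep the first record for each normalized name (step 908)."""
    seen = {}
    for record in records:
        seen.setdefault(normalize(record["name"]), record)
    return list(seen.values())


def link(pathogens, diseases):
    """Map each pathogen name to the diseases that list it (step 910)."""
    links = {pathogen["name"]: [] for pathogen in pathogens}
    for disease in diseases:
        for pathogen in pathogens:
            if normalize(disease["pathogen"]) == normalize(pathogen["name"]):
                links[pathogen["name"]].append(disease["name"])
    return links


def extract_keywords(pathogens, diseases):
    """Collect the distinct names to use as search terms (step 912)."""
    return sorted({normalize(r["name"]) for r in pathogens + diseases})


# Two spellings of the same pathogen collapse into one entry.
pathogens = deduplicate([
    {"name": "Bacillus anthracis"},
    {"name": "bacillus  ANTHRACIS"},
])
diseases = [{"name": "Anthrax", "pathogen": "Bacillus anthracis"}]
print(link(pathogens, diseases))              # {'Bacillus anthracis': ['Anthrax']}
print(extract_keywords(pathogens, diseases))  # ['anthrax', 'bacillus anthracis']
```

The keyword list produced by the last step is what a periodic update (step 914) would feed to the web search.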
[0055] Referring now to Fig. 10, there is shown a flow chart illustrating a method for executing a search in accordance with an embodiment of the present invention. System 100 receives 1002 a search request and first checks to see 1004 whether the search is one that has previously been requested and is therefore cached. If the search has not been done previously, then a query is constructed 1008; otherwise, the query is retrieved 1006 from the cache. In either case, once a query is constructed or retrieved, it is then used to execute 1010 a search over the web to obtain the latest information available about the queried items— for example, if the search request is for treatments for anthrax, the query might include a search on the terms "anthrax", "Bacillus anthracis," "Ames," and "treatment". Once the search results have been obtained, they are analyzed and correlated 1012 with data already in MRS database 114, and the search results are then returned 1014. Preferably, the query is also cached so that it can be quickly retrieved again if needed.
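The control flow of Fig. 10 can be sketched as below. The three callables passed in (`build_query`, `web_search`, `correlate`) are hypothetical placeholders for the query constructor, the network access module, and the data analysis engine. Keying the cache on the sorted, lower-cased request terms is an assumption made for the example. Note that only the constructed query is cached: the web search itself runs on every request, so the results reflect the latest available information.

```python
def make_search_service(build_query, web_search, correlate):
    """Return a search function that follows the Fig. 10 flow."""
    query_cache = {}

    def search(request_terms):
        key = tuple(sorted(term.lower() for term in request_terms))
        if key in query_cache:                    # 1004: previously requested?
            query = query_cache[key]              # 1006: retrieve from cache
        else:
            query = build_query(request_terms)    # 1008: construct query
            query_cache[key] = query              # cache for later reuse
        raw_results = web_search(query)           # 1010: always search the web
        return correlate(raw_results)             # 1012, then returned (1014)

    return search
```

A repeated request such as `["anthrax", "treatment"]` would therefore skip query construction on the second call but still trigger a fresh web search.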
[0056] The present invention has been described in particular detail with respect to a limited number of embodiments. Those of skill in the art will appreciate that the invention may additionally be practiced in other embodiments. First, the particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Further, the system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements. Also, the particular division of functionality between the various system components described herein is merely exemplary, and not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component. For example, the particular functions of the data analysis engine 110 and so forth may be provided in many modules or in one module.
[0057] Some portions of the above description present the features of the present invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are the means used by those skilled in the bioagent identification arts to most effectively convey the substance of their work to others skilled in the art. These operations, while described functionally or logically, are understood to be implemented by computer programs. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules or code devices, without loss of generality.
[0058] It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the present discussion, it is appreciated that throughout the description, discussions utilizing terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.
[0059] Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
[0060] The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
[0061] The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description above. In addition, the present invention is not described with reference to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any references to specific languages are provided for disclosure of enablement and best mode of the present invention.
[0062] Finally, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention.
[0063] We claim:

Claims

1. A system for performing microbial investigations, the system comprising: a data store for storing microbial data; a query constructor module, for receiving search parameters and constructing a query from the parameters; a network access module, communicatively coupled to the query constructor module, for performing a search on the constructed query, the search domain including a wide area network, and receiving search results; a data analysis engine, communicatively coupled to the network access module and the data store, for creating an association between microbial data in the data store and the received search results; and a user interface module, communicatively coupled to the data analysis engine and the query constructor module, for receiving input from a user and providing responses to the user.
2. A method for performing microbial investigations, the method comprising: receiving a search request; constructing a query from the search request; executing the constructed query to retrieve a first result from at least one first data source; executing the constructed query to retrieve a second result from a second data source; associating the first result with the second result; and providing the associated first result and second result as a response to the search request.
3. The method of claim 2 wherein the search request is for diseases associated with a specified pathogen.
4. The method of claim 2 wherein the search request is for pathogens associated with a specified disease.
5. The method of claim 2 wherein the search request is for symptoms associated with a specified pathogen.
6. The method of claim 2 wherein the search request is for pathogens associated with a specified symptom.
7. The method of claim 2 wherein the search request is for diseases associated with a specified symptom.
8. The method of claim 2 wherein the search request is for genetic data associated with a pathogen.
9. A computer program product for performing microbial investigations, the computer program product stored on a computer-readable medium and including instructions for causing a processor to carry out the steps of: receiving a search request; constructing a query from the search request; executing the constructed query to retrieve a first result from at least one first data source; executing the constructed query to retrieve a second result from a second data source; associating the first result with the second result; and providing the associated first result and second result as a response to the search request.
10. The computer program product of claim 9 wherein the search request is for diseases associated with a specified pathogen.
11. The computer program product of claim 9 wherein the search request is for pathogens associated with a specified disease.
12. The computer program product of claim 9 wherein the search request is for symptoms associated with a specified pathogen.
13. The computer program product of claim 9 wherein the search request is for pathogens associated with a specified symptom.
14. The computer program product of claim 9 wherein the search request is for diseases associated with a specified symptom.
15. The computer program product of claim 9 wherein the search request is for genetic data associated with a pathogen.
PCT/US2004/033742 2003-10-09 2004-10-12 Database for microbial investigations WO2005036369A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US50991103P 2003-10-09 2003-10-09
US60/509,911 2003-10-09
US59840804P 2004-08-02 2004-08-02
US60/598,408 2004-08-02

Publications (2)

Publication Number Publication Date
WO2005036369A2 (en) 2005-04-21
WO2005036369A3 (en) 2006-07-06

Family

ID=34437312

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/033742 WO2005036369A2 (en) 2003-10-09 2004-10-12 Database for microbial investigations

Country Status (1)

Country Link
WO (1) WO2005036369A2 (en)

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011047307A1 (en) 2009-10-15 2011-04-21 Ibis Biosciences, Inc. Multiple displacement amplification
US7956175B2 (en) 2003-09-11 2011-06-07 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US7964343B2 (en) 2003-05-13 2011-06-21 Ibis Biosciences, Inc. Method for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
JP2011133679A (en) * 2008-12-25 2011-07-07 Jsr Corp Negative radiation-sensitive composition, cured pattern forming method, and cured pattern
US8017322B2 (en) 2001-03-02 2011-09-13 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents
WO2011112718A1 (en) 2010-03-10 2011-09-15 Ibis Biosciences, Inc. Production of single-stranded circular nucleic acid
US8026084B2 (en) 2005-07-21 2011-09-27 Ibis Biosciences, Inc. Methods for rapid identification and quantitation of nucleic acid variants
US8046171B2 (en) 2003-04-18 2011-10-25 Ibis Biosciences, Inc. Methods and apparatus for genetic evaluation
US8057993B2 (en) 2003-04-26 2011-11-15 Ibis Biosciences, Inc. Methods for identification of coronaviruses
US8071309B2 (en) 2002-12-06 2011-12-06 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8073627B2 (en) 2001-06-26 2011-12-06 Ibis Biosciences, Inc. System for indentification of pathogens
US8084207B2 (en) 2005-03-03 2011-12-27 Ibis Bioscience, Inc. Compositions for use in identification of papillomavirus
US8097416B2 (en) 2003-09-11 2012-01-17 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8119336B2 (en) 2004-03-03 2012-02-21 Ibis Biosciences, Inc. Compositions for use in identification of alphaviruses
US8148163B2 (en) 2008-09-16 2012-04-03 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US8158354B2 (en) 2003-05-13 2012-04-17 Ibis Biosciences, Inc. Methods for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8158936B2 (en) 2009-02-12 2012-04-17 Ibis Biosciences, Inc. Ionization probe assemblies
US8163895B2 (en) 2003-12-05 2012-04-24 Ibis Biosciences, Inc. Compositions for use in identification of orthopoxviruses
US8173957B2 (en) 2004-05-24 2012-05-08 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8182992B2 (en) 2005-03-03 2012-05-22 Ibis Biosciences, Inc. Compositions for use in identification of adventitious viruses
US8187814B2 (en) 2004-02-18 2012-05-29 Ibis Biosciences, Inc. Methods for concurrent identification and quantification of an unknown bioagent
US8214154B2 (en) 2001-03-02 2012-07-03 Ibis Biosciences, Inc. Systems for rapid identification of pathogens in humans and animals
US8268565B2 (en) 2001-03-02 2012-09-18 Ibis Biosciences, Inc. Methods for identifying bioagents
US8298760B2 (en) 2001-06-26 2012-10-30 Ibis Bioscience, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
WO2013036603A1 (en) 2011-09-06 2013-03-14 Ibis Biosciences, Inc. Sample preparation methods
US8407010B2 (en) 2004-05-25 2013-03-26 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA
US8534447B2 (en) 2008-09-16 2013-09-17 Ibis Biosciences, Inc. Microplate handling systems and related computer program products and methods
US8546082B2 (en) 2003-09-11 2013-10-01 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8550694B2 (en) 2008-09-16 2013-10-08 Ibis Biosciences, Inc. Mixing cartridges, mixing stations, and related kits, systems, and methods
US8563250B2 (en) 2001-03-02 2013-10-22 Ibis Biosciences, Inc. Methods for identifying bioagents
WO2014052590A1 (en) 2012-09-26 2014-04-03 Ibis Biosciences, Inc. Swab interface for a microfluidic device
US8871471B2 (en) 2007-02-23 2014-10-28 Ibis Biosciences, Inc. Methods for rapid forensic DNA analysis
US8950604B2 (en) 2009-07-17 2015-02-10 Ibis Biosciences, Inc. Lift and mount apparatus
US9068017B2 (en) 2010-04-08 2015-06-30 Ibis Biosciences, Inc. Compositions and methods for inhibiting terminal transferase activity
US9149473B2 (en) 2006-09-14 2015-10-06 Ibis Biosciences, Inc. Targeted whole genome amplification method for identification of pathogens
US9194877B2 (en) 2009-07-17 2015-11-24 Ibis Biosciences, Inc. Systems for bioagent indentification
US9393564B2 (en) 2009-03-30 2016-07-19 Ibis Biosciences, Inc. Bioagent detection systems, devices, and methods
US9416409B2 (en) 2009-07-31 2016-08-16 Ibis Biosciences, Inc. Capture primers and capture sequence linked solid supports for molecular diagnostic tests
US9598724B2 (en) 2007-06-01 2017-03-21 Ibis Biosciences, Inc. Methods and compositions for multiple displacement amplification of nucleic acids
US9873906B2 (en) 2004-07-14 2018-01-23 Ibis Biosciences, Inc. Methods for repairing degraded DNA
US9970061B2 (en) 2011-12-27 2018-05-15 Ibis Biosciences, Inc. Bioagent detection oligonucleotides
WO2019094457A1 (en) * 2017-11-09 2019-05-16 Fry Laboratories, LLC Automated database updating and curation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6389428B1 (en) * 1998-05-04 2002-05-14 Incyte Pharmaceuticals, Inc. System and method for a precompiled database for biomolecular sequence information
US6553317B1 (en) * 1997-03-05 2003-04-22 Incyte Pharmaceuticals, Inc. Relational database and system for storing information relating to biomolecular sequences and reagents
US20030082539A1 (en) * 2001-06-26 2003-05-01 Ecker David J. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US20030187615A1 (en) * 2002-03-26 2003-10-02 John Epler Methods and apparatus for early detection of health-related events in a population
US20050065813A1 (en) * 2003-03-11 2005-03-24 Mishelevich David J. Online medical evaluation system

Cited By (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8017743B2 (en) 2001-03-02 2011-09-13 Ibis Bioscience, Inc. Method for rapid detection and identification of bioagents
US9752184B2 (en) 2001-03-02 2017-09-05 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA and characterization of mitochondrial DNA heteroplasmy
US8268565B2 (en) 2001-03-02 2012-09-18 Ibis Biosciences, Inc. Methods for identifying bioagents
US8815513B2 (en) 2001-03-02 2014-08-26 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents in epidemiological and forensic investigations
US8214154B2 (en) 2001-03-02 2012-07-03 Ibis Biosciences, Inc. Systems for rapid identification of pathogens in humans and animals
US8017322B2 (en) 2001-03-02 2011-09-13 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents
US8265878B2 (en) 2001-03-02 2012-09-11 Ibis Bioscience, Inc. Method for rapid detection and identification of bioagents
US8017358B2 (en) 2001-03-02 2011-09-13 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents
US8563250B2 (en) 2001-03-02 2013-10-22 Ibis Biosciences, Inc. Methods for identifying bioagents
US8802372B2 (en) 2001-03-02 2014-08-12 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA and characterization of mitochondrial DNA heteroplasmy
US9416424B2 (en) 2001-03-02 2016-08-16 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8380442B2 (en) 2001-06-26 2013-02-19 Ibis Bioscience, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US8298760B2 (en) 2001-06-26 2012-10-30 Ibis Bioscience, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US8073627B2 (en) 2001-06-26 2011-12-06 Ibis Biosciences, Inc. System for indentification of pathogens
US8921047B2 (en) 2001-06-26 2014-12-30 Ibis Biosciences, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US8071309B2 (en) 2002-12-06 2011-12-06 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US9725771B2 (en) 2002-12-06 2017-08-08 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8822156B2 (en) 2002-12-06 2014-09-02 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8046171B2 (en) 2003-04-18 2011-10-25 Ibis Biosciences, Inc. Methods and apparatus for genetic evaluation
US8057993B2 (en) 2003-04-26 2011-11-15 Ibis Biosciences, Inc. Methods for identification of coronaviruses
US8158354B2 (en) 2003-05-13 2012-04-17 Ibis Biosciences, Inc. Methods for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8476415B2 (en) 2003-05-13 2013-07-02 Ibis Biosciences, Inc. Methods for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US7964343B2 (en) 2003-05-13 2011-06-21 Ibis Biosciences, Inc. Method for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8097416B2 (en) 2003-09-11 2012-01-17 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8546082B2 (en) 2003-09-11 2013-10-01 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8013142B2 (en) 2003-09-11 2011-09-06 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US7956175B2 (en) 2003-09-11 2011-06-07 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8163895B2 (en) 2003-12-05 2012-04-24 Ibis Biosciences, Inc. Compositions for use in identification of orthopoxviruses
US8187814B2 (en) 2004-02-18 2012-05-29 Ibis Biosciences, Inc. Methods for concurrent identification and quantification of an unknown bioagent
US9447462B2 (en) 2004-02-18 2016-09-20 Ibis Biosciences, Inc. Methods for concurrent identification and quantification of an unknown bioagent
US8119336B2 (en) 2004-03-03 2012-02-21 Ibis Biosciences, Inc. Compositions for use in identification of alphaviruses
US9449802B2 (en) 2004-05-24 2016-09-20 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8987660B2 (en) 2004-05-24 2015-03-24 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8173957B2 (en) 2004-05-24 2012-05-08 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8407010B2 (en) 2004-05-25 2013-03-26 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA
US9873906B2 (en) 2004-07-14 2018-01-23 Ibis Biosciences, Inc. Methods for repairing degraded DNA
US8182992B2 (en) 2005-03-03 2012-05-22 Ibis Biosciences, Inc. Compositions for use in identification of adventitious viruses
US8084207B2 (en) 2005-03-03 2011-12-27 Ibis Bioscience, Inc. Compositions for use in identification of papillomavirus
US8551738B2 (en) 2005-07-21 2013-10-08 Ibis Biosciences, Inc. Systems and methods for rapid identification of nucleic acid variants
US8026084B2 (en) 2005-07-21 2011-09-27 Ibis Biosciences, Inc. Methods for rapid identification and quantitation of nucleic acid variants
US9149473B2 (en) 2006-09-14 2015-10-06 Ibis Biosciences, Inc. Targeted whole genome amplification method for identification of pathogens
US8871471B2 (en) 2007-02-23 2014-10-28 Ibis Biosciences, Inc. Methods for rapid forensic DNA analysis
US9598724B2 (en) 2007-06-01 2017-03-21 Ibis Biosciences, Inc. Methods and compositions for multiple displacement amplification of nucleic acids
US8148163B2 (en) 2008-09-16 2012-04-03 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US8252599B2 (en) 2008-09-16 2012-08-28 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US8534447B2 (en) 2008-09-16 2013-09-17 Ibis Biosciences, Inc. Microplate handling systems and related computer program products and methods
US8550694B2 (en) 2008-09-16 2013-10-08 Ibis Biosciences, Inc. Mixing cartridges, mixing stations, and related kits, systems, and methods
US8609430B2 (en) 2008-09-16 2013-12-17 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US9023655B2 (en) 2008-09-16 2015-05-05 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US9027730B2 (en) 2008-09-16 2015-05-12 Ibis Biosciences, Inc. Microplate handling systems and related computer program products and methods
JP2011133679A (en) * 2008-12-25 2011-07-07 Jsr Corp Negative radiation-sensitive composition, cured pattern forming method, and cured pattern
US9165740B2 (en) 2009-02-12 2015-10-20 Ibis Biosciences, Inc. Ionization probe assemblies
US8796617B2 (en) 2009-02-12 2014-08-05 Ibis Biosciences, Inc. Ionization probe assemblies
US8158936B2 (en) 2009-02-12 2012-04-17 Ibis Biosciences, Inc. Ionization probe assemblies
US9393564B2 (en) 2009-03-30 2016-07-19 Ibis Biosciences, Inc. Bioagent detection systems, devices, and methods
US8950604B2 (en) 2009-07-17 2015-02-10 Ibis Biosciences, Inc. Lift and mount apparatus
US9194877B2 (en) 2009-07-17 2015-11-24 Ibis Biosciences, Inc. Systems for bioagent indentification
US9416409B2 (en) 2009-07-31 2016-08-16 Ibis Biosciences, Inc. Capture primers and capture sequence linked solid supports for molecular diagnostic tests
US10119164B2 (en) 2009-07-31 2018-11-06 Ibis Biosciences, Inc. Capture primers and capture sequence linked solid supports for molecular diagnostic tests
EP3225695A1 (en) 2009-10-15 2017-10-04 Ibis Biosciences, Inc. Multiple displacement amplification
EP2957641A1 (en) 2009-10-15 2015-12-23 Ibis Biosciences, Inc. Multiple displacement amplification
US9890408B2 (en) 2009-10-15 2018-02-13 Ibis Biosciences, Inc. Multiple displacement amplification
WO2011047307A1 (en) 2009-10-15 2011-04-21 Ibis Biosciences, Inc. Multiple displacement amplification
WO2011112718A1 (en) 2010-03-10 2011-09-15 Ibis Biosciences, Inc. Production of single-stranded circular nucleic acid
US9068017B2 (en) 2010-04-08 2015-06-30 Ibis Biosciences, Inc. Compositions and methods for inhibiting terminal transferase activity
US9752173B2 (en) 2010-04-08 2017-09-05 Ibis Biosciences, Inc. Compositions and methods for inhibiting terminal transferase activity
WO2013036603A1 (en) 2011-09-06 2013-03-14 Ibis Biosciences, Inc. Sample preparation methods
EP3170831A1 (en) 2011-09-06 2017-05-24 Ibis Biosciences, Inc. Sample preparation methods
US9970061B2 (en) 2011-12-27 2018-05-15 Ibis Biosciences, Inc. Bioagent detection oligonucleotides
US10662485B2 (en) 2011-12-27 2020-05-26 Ibis Biosciences, Inc. Bioagent detection oligonucleotides
WO2014052590A1 (en) 2012-09-26 2014-04-03 Ibis Biosciences, Inc. Swab interface for a microfluidic device
WO2019094457A1 (en) * 2017-11-09 2019-05-16 Fry Laboratories, LLC Automated database updating and curation
US10649982B2 (en) 2017-11-09 2020-05-12 Fry Laboratories, LLC Automated database updating and curation
US11556522B2 (en) 2017-11-09 2023-01-17 Fry Laboratories, LLC Automated database updating and curation

Also Published As

Publication number Publication date
WO2005036369A3 (en) 2006-07-06

Similar Documents

Publication Publication Date Title
WO2005036369A2 (en) Database for microbial investigations
Kalvari et al. Non‐coding RNA analysis using the Rfam database
Li et al. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform
Rose et al. Challenges in the analysis of viral metagenomes
Rappaport et al. MalaCards: A comprehensive automatically‐mined database of human diseases
Snyder et al. PATRIC: the VBI pathosystems resource integration center
Perrière et al. HOBACGEN: database system for comparative genomics in bacteria
Wong Kleisli, a functional query system
JP2016502162A (en) Primary analysis driven by a database of raw sequencing data
US20020168664A1 (en) Automated pathway recognition system
Turakhia et al. Pandemic-scale phylogenomics reveals elevated recombination rates in the SARS-CoV-2 spike region
Manda et al. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships
US11037654B2 (en) Rapid genomic sequence classification using probabilistic data structures
Kumar et al. ESTIMA, a tool for EST management in a multi-project environment
Mosa et al. A study on PubMed search tag usage pattern: association rule mining of a full-day PubMed query log
Panda et al. EumicrobeDBLite: a lightweight genomic resource and analytic platform for draft oomycete genomes
Aoki et al. Searching the literature: four simple steps.
Oliveira et al. PipeCoV: a pipeline for SARS-CoV-2 genome assembly, annotation and variant identification
Shiryev et al. Indexing and searching petabyte-scale nucleotide resources
Mendes et al. Tcruzikb: Enabling complex queries for genomic data exploration
Koehler et al. Linking experimental results, biological networks and sequence analysis methods using Ontologies and Generalised Data Structures
Masseroli et al. Explorative search of distributed bio-data to answer complex biomedical questions
Catanho et al. BioParser: a tool for processing of sequence similarity analysis reports
Topalis et al. Anatomical ontologies of mosquitoes and ticks, and their web browsers in VectorBase
Jamil Improving integration effectiveness of ID mapping based biological record linkage

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase