WO2001061568A9 - Rdl search engine - Google Patents
- Publication number
- WO2001061568A9 (PCT/US2001/005268)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- numerical data
- data
- query
- tagged
- index
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9558—Details of hyperlinks; Management of linked annotations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99932—Access augmentation or optimizing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99934—Query formulation, input preparation, or translation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99935—Query augmenting and refining, e.g. inexact access
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99936—Pattern matching access
Definitions
- the present invention relates to a search management system capable of accessing and retrieving tagged numerical data from data services (e.g., file systems), databases (e.g., relational or object-oriented), and servers on networks (e.g., LANs and WANs (e.g., the Internet)).
- a known approach to searching and managing numerical data on the web is to hide the data behind vertical pipes of proprietary servers, expensive programmers, DBMS administrators, and non-standard formats. Users cannot search directly for numerical data. They must search for likely publishers of a certain type of data, then visit each site, go through a proprietary search routine, interpret the data to determine whether it fits the overall search criteria, and then collect all of the individual results into a single result table, usually by manually retyping each site's results.
- WSE: web search engine
- DBMS: database management system
- Figure 15 illustrates a conventional approach to searches on the web which are characterized by two layers of query engines.
- the one closest to the end user 1500 (i.e., the web search engine 1502) provides indexing to particular sites or collections of tables.
- These servers 1504 are typically operated by companies such as Yahoo! and Lycos.
- the next layer back is a layer of query engines or servers
- WSEs conduct searches using direct keyword indexing to HTML documents (e.g., Yahoo, AltaVista, Lycos, etc.). These search engines maintain very large indexes that map keywords to URLs. If a user types, for example, "57" as an input keyword, the user will receive instances where "57" is used in an HTML document ("NASDAQ falls 57", "#57 - Doug Henry, Major League Baseball", etc.). As an example, for this particular query on AltaVista, a user will receive a list of over 11 million pages.
- the shortcomings of this approach are obvious: no context to the numbers, too many returns, no way to narrow the query to useful numerical data.
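The keyword-to-URL mapping described above can be illustrated with a minimal sketch. The function name `build_keyword_index` and the example.com URLs are hypothetical; only the two page snippets come from the text. It shows that a bare keyword such as "57" matches unrelated pages, because the index carries no context for the number.

```python
from collections import defaultdict

def build_keyword_index(pages):
    """Map each lowercased word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            # Strip a few punctuation marks so "#57" and "57" index alike.
            index[word.strip('.,"#')].add(url)
    return index

# Hypothetical URLs; the page texts are the two examples quoted above.
pages = {
    "http://example.com/markets": "NASDAQ falls 57",
    "http://example.com/baseball": "#57 - Doug Henry, Major League Baseball",
}
index = build_keyword_index(pages)

# Both pages match, although the two numbers mean entirely different things.
print(sorted(index["57"]))
```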
- WSEs conduct searches for database publishers (e.g., Yahoo, AltaVista, Lycos, etc.). If the user is searching for a number and it is not on an HTML page itself, it may be in a relational database that is accessible through an HTML form. The web search engine therefore can be queried for words or phrases that might be on that HTML page or related pages. In this approach, the burden is on the user to guess what words or phrases might be associated with such numbers, and who might publish such data.
- XML data users may conduct searches of a repository of XML documents.
- vendors such as XYZFind take the approach of essentially modeling the XML documents in a relational or object-oriented database structure, and building indexes to documents based on this internal "repository" structure.
- shortcomings are the facts that only documents in that particular relational database can be accessed (not data distributed in documents across the web) and that data from different taxonomies are not directly comparable and, therefore, a search would not produce all possible results.
- a "keyword select query” is a request for all items in a dataset that contain a particular word(s). For example, "Give me a list of all web pages with the word 'baseball' on them.”
- Transformational queries are those that require numbers to be transformed in some way to test whether they meet the requirements of a query statement.
- a company's financial statements are presented in quarterly data, and a query is made for companies with "annual sales > $100 million”. This request may be equivalently stated as companies with "annual sales > 75 million (British Pounds)", or sales listed quarterly with sales greater than $25 million per quarter.
- Keyword searches and general database select queries cannot make these types of transformations in the course of their searches.
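As a rough illustration of a transformational query, the sketch below normalizes each published figure to annual US dollars before testing it against the threshold. All names (`SalesFigure`, `annual_sales_usd`, `companies_over`) and the sample companies are hypothetical. The exchange rate is only the one implied by the example above ($100 million being roughly 75 million pounds), and multiplying a quarterly figure by four is a simplifying assumption.

```python
from dataclasses import dataclass

# Assumed conversion factors; 0.75 pounds per dollar is implied by the example.
USD_PER_UNIT = {"USD": 1.0, "GBP": 1.0 / 0.75}
PERIODS_PER_YEAR = {"annual": 1, "quarterly": 4}

@dataclass
class SalesFigure:
    company: str
    amount: float    # value as published
    currency: str    # "USD" or "GBP"
    period: str      # "annual" or "quarterly"

def annual_sales_usd(fig):
    """Transform a published figure into annual US dollars."""
    return fig.amount * USD_PER_UNIT[fig.currency] * PERIODS_PER_YEAR[fig.period]

def companies_over(figures, threshold_usd):
    """Return companies whose transformed sales exceed the threshold."""
    return [f.company for f in figures if annual_sales_usd(f) > threshold_usd]

figures = [
    SalesFigure("A", 120e6, "USD", "annual"),
    SalesFigure("B", 30e6, "USD", "quarterly"),  # 120 million per year
    SalesFigure("C", 70e6, "GBP", "annual"),     # below 75 million pounds
    SalesFigure("D", 80e6, "GBP", "annual"),     # above 75 million pounds
]
print(companies_over(figures, 100e6))
```

A plain select query would compare the raw `amount` column and miss companies B and D, which only qualify once units and periods are made comparable.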
- Arithmetic queries involve complex calculations, often requiring a specialized language (such as the Reusable Macro Language, U.S. Patent Application Serial No. 09/573,780). For example, a query might draw financial data from 10 web sites, calculate a set of financial ratios, then conduct a search for companies that meet that profile.
- arithmetic queries are distinguished from numerical queries, which use basic comparison operators, and from transformational queries, which change the underlying units, measures and magnitudes.
- conventional WSEs are incapable of performing time-dependent queries. Queries that are time-dependent may take the form of "Let me know when . . .
- These types of queries may have a refresh capability, with related controls on scheduling, expiration of request and so forth.
- An example of this type of query would be if the user wants a notification (for example, by email) any time the search engine becomes aware of a bank company stock trading at less than 1.0 times book value.
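A time-dependent query of this kind can be sketched as a stored predicate that is re-evaluated on each refresh and reports only new matches. The class `StandingQuery`, its fields, the run-count form of expiration, and the sample records are all assumptions made for illustration; the notification step (for example, sending email) is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class StandingQuery:
    description: str
    predicate: Callable[[Dict], bool]
    expires_after_runs: int                  # hypothetical expiration control
    already_notified: set = field(default_factory=set)

    def refresh(self, records: List[Dict]) -> List[str]:
        """Re-run the query and return only names not reported before."""
        if self.expires_after_runs <= 0:
            return []                        # the request has expired
        self.expires_after_runs -= 1
        hits = [r["name"] for r in records if self.predicate(r)]
        new = [n for n in hits if n not in self.already_notified]
        self.already_notified.update(new)
        return new

# "Let me know when a bank stock trades at less than 1.0 times book value."
watch = StandingQuery(
    "bank stocks under 1.0x book",
    lambda r: r["sector"] == "bank" and r["price"] / r["book_value"] < 1.0,
    expires_after_runs=2,
)
day1 = [{"name": "First Bank", "sector": "bank", "price": 12.0, "book_value": 10.0}]
day2 = [{"name": "First Bank", "sector": "bank", "price": 9.0, "book_value": 10.0}]
first = watch.refresh(day1)    # no match yet: price is 1.2 times book
second = watch.refresh(day2)   # match: price is 0.9 times book
third = watch.refresh(day2)    # expired after two runs
print(first, second, third)
```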
- Methods and systems in accordance with the present invention provide a system for querying and formatting tagged numerical and text data comprising: an index comprised of the tagged numerical and text data contained within a plurality of documents; and a server for maintaining and querying the index, and receiving a query.
- the server collects the tagged numerical and text data within the plurality of documents that satisfy the query, formats a resultant data set from the collected tagged numerical and text data, and transmits the resultant data set.
- methods and systems consistent with the present invention provide the capability to search numerical data across networks such as the Internet, and remove the middle layer of query engines or servers in retrieving data from relational databases over the Internet.
- methods and systems consistent with the present invention provide the means for performing navigational, line item (or record-level), semantic, numerical, transformational, arithmetic, time-dependent, and cost based queries on numerical data.
- methods and systems consistent with the present invention permit a user to monitor the progress of the search, refine the search, and manipulate the result into various views.
- Figure 1 depicts a data management system suitable for use in accordance with the methods and systems of the present invention
- Figure 2 depicts a block diagram of software components of the client computer depicted in figure 1 in accordance with the present invention
- Figure 3 depicts a screen shot of a data query screen depicted within software components of the client computer in figure 2 in accordance with the present invention
- Figure 4 depicts a screen shot of a macro query screen depicted within software components of a client computer in figure 2 in accordance with the present invention
- Figure 5 depicts a screen shot of a query progress screen depicted within software components of a client computer in figure 2 in accordance with the present invention
- Figure 6 depicts a screen shot of a query result screen depicted within software components of a client computer in figure 2 in accordance with the present invention
- Figure 7 depicts a screen shot of a chart of two data series plotted on the same chart in accordance with the present invention.
- Figure 8 depicts a block diagram of software components of a server computer depicted in figure 1 in accordance with the present invention.
- Figure 9 depicts a block diagram of index data created by a web server depicted in figure 1 in accordance with the present invention.
- Figure 10 depicts the resulting records produced by a query and stored in a LineItemIndex table depicted in figure 9 in accordance with the present invention
- Figure 11 depicts a flowchart of an indexing process conducted by a web server depicted in figure 1 in accordance with the present invention
- Figure 12 depicts a flowchart of a searching process conducted by a web server depicted in figure 1 in accordance with the present invention
- Figure 13 depicts a table of unit types used in transformation based searches in accordance with the present invention
- Figure 14 depicts a flowchart of transformation value searches conducted by a web server depicted in figure 1 in accordance with the present invention
- Figure 15 depicts a web search engine using a multi-layer approach for searching data over the Internet.
- Methods and systems consistent with the present invention provide a means for searching numerical data across networks such as the Internet, and removing the middle layer of query engines or servers used by conventional systems in retrieving data from relational databases over the Internet.
- systems in accordance with the present invention also provide a means for tying millions of computers together into a single database, so that a query introduced to the system returns a table of data, as a single database is capable of providing.
- the methods and systems consistent with the present invention provide the means for performing navigational, line item (or record-level), semantic, numerical, transformational, arithmetic, time-dependent, and cost based queries on numerical data, and may also conduct select queries between unrelated databases.
- methods and systems consistent with the present invention allow a user to monitor the query results and refine the query request during searching.
- Methods and systems in accordance with the present invention also provide means for a user to conduct queries for programming code by function, and have the capability to distinguish between requests for data and requests for programming functionality.
- FIG. 1 depicts a data management system 100 that is suitable for use with methods and systems consistent with the present invention.
- Data management system 100 comprises a client computer 102 and a server computer 104 interconnected via a network 106 (e.g., a LAN or WAN (e.g., the Internet)).
- the network 106 alternately can be replaced by an interprocess mechanism such that the software components of computers 102 and 104 are, in fact, run on the same computer.
- the server computer 104 provides Reusable Data Language ("RDL") documents to client computer 102.
- the transmitted RDL documents may be ASCII RDL documents which contain text and numerical data.
- RDL documents were referred to as Reusable data Markup Language ("RDML") documents in U.S. Patent Application serial number 09/573,778 which provides a detailed explanation of RDML documents and was previously incorporated herein.
- Client computer 102 includes a central processing unit (CPU) 110, a secondary storage device 112 (e.g., a hard disk), a display device 114 (e.g., a liquid crystal display), an input device 116 (e.g., a mouse and keyboard), and main memory 118 containing client computer software components 120.
- Figure 1 also depicts a web server 124 on server computer 104 that sends RDL documents to client computer 102 via network 106.
- the web server 124 sends RDL documents over the network 106 and may be connected to a database server 130, which holds RDL documents.
- Database server 130 may receive RDL documents from the disk array 128, which may receive data from database storage 132.
- RDL documents sent by web server 124 are retrieved from file server 136 through database server 130.
- Protocols used in the transmission of RDL documents between the server 124 and the client computer 102 include, but are not limited to, HyperText Transfer Protocol (HTTP) and File Transfer Protocol (FTP).
- the RDL documents are served from web server 124, not from database
- RDL documents can be created on the fly in disk array
- the web server 124 may be implemented as a "servlet": a software application that resides on an internet or intranet server; it can also reside on client computer 102. The web server 124 maintains an index of information available on various RDL web pages. The web server 124 receives query statements from the client computer 102, collects the information available on the various target data files, formats a resultset file, and returns the resultset to the client computer 102. This web server 124 also is responsible for finding new RDL documents (by following hyperlinks or other pointers) and maintaining a list of RDL web pages and of other servers with which it can communicate. The above functions are referred to as "web indexing" or "web crawling."
- Web server 124 may be designed as a group of cooperative and/or redundant servers that may themselves form a web of servers for a number of reasons. First, there is simply so much data in the world that it is unreasonable to expect that a single machine - even a massively parallel supercomputer - can handle all of the traffic and all of the information. To be truly scalable and to handle all possible data, a network of web servers helps meet the requirements for size, speed, and reliability.
- the components include but are not limited to an index (which may be a database table); the processors (which convert user queries to SQL queries and convert results into progress and result screens); and the transformation routines. These components may be on the client, or on the server, or spread across several servers, or in any permutation of clients and servers.
- Figures 2 and 8 illustrate software components (120 and 122) of client computer 102 and server computer 104 in accordance with the present invention. These components illustrated in Figures 2 and 8 do not necessarily reside on separate computers for functioning as “client” and “server.” These aggregations may be for logical construction. In actual practice, it may be preferable to either create two separate applications or to put both applications on a web server host, with the query and result pages sent to a user's web browser.
- client computer software components may include data query screen 202, data query processor 204, macro query screen 206, macro query processor 208, search progress screen 210, progress processor 212, search results screen 214, results processor 216, export processor 218, overall application manager 220 and RDL data viewer 222.
- Figure 3 illustrates an exemplary data query screen 202 of client computer
- the data query screen 202 allows a user to enter the query parameters and narrow the query while it is in progress.
- the user types information into the various input boxes in data query screen 202 to describe the data that the user desires the query to retrieve.
- the contents of the input boxes are used to build a select query on a database of tag elements in RDL documents.
- the exemplary data query screen 202 includes message panel 302, data type panel 304, status panel 306, chart title 308, chart area 310, Y-axis line item elements entry block 312, document elements entry block 314, X-axis line item elements entry block 316, and action buttons 318.
- the data type panel 304 instructs the user that there may be three types of data series found in an RDL document.
- the data series types are: time series, crosstabulation, and XY plot.
- the time series data type displays the x values on a chart as time periods (such as "1959” or "Week ending 3/21/87").
- the crosstabulation (or category) data type displays the x values as categories that are
- data viewer 222 needs to determine the number of independent axes to set up.
- Data viewer 222 may be an RDL data viewer 222 as described in co-pending U.S. Patent Application Serial No. 09/573,778. The data series types are also described in that application.
- Status panel 306 presents ongoing messages regarding the status of the
- Chart title 308 allows a user to enter one or more keywords with boolean operators, if desired, that would be found in the title of a data series chart. For example, if the user wants to find all data series which, when charted, would
- Chart Area 310 allows a user to sketch, with a mouse for example, the desired shape of a chart of a data series.
- this shape may be converted to a numerical sequence which is stored in a "dsform" element in index
- the "dsform" element contains a numerical representation of the shape of the data series.
- the numerical sequence may be a series of 20 codes (e.g., 20 digits), each representing one 20th of the underlying data series. If the data series is convex within that period (i.e., reaches a local maximum which has an absolute value higher than the local minimum), the digit assigned for that period is 6 through 9, depending on the height of the maximum. If the curve in that section is concave (i.e., reaches a local minimum that has an absolute value greater than the local maximum), it is assigned a digit 1 through 4, depending on the depth of the minimum.
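One possible reading of this 20-digit shape code is sketched below. The function name `encode_dsform`, the way height and depth are scaled against the largest magnitude in the series, the direction of the 1 through 4 scale, and the use of "5" when neither case applies are all assumptions; the text does not specify them.

```python
def encode_dsform(values, segments=20):
    """Return a string of `segments` digits describing the series shape."""
    if len(values) < segments:
        raise ValueError("need at least one data point per segment")
    scale = max(abs(v) for v in values) or 1.0   # avoid dividing by zero
    digits = []
    for i in range(segments):
        start = i * len(values) // segments
        end = (i + 1) * len(values) // segments
        chunk = values[start:end]
        hi, lo = max(chunk), min(chunk)
        if abs(hi) > abs(lo):
            # "Convex" case: 6..9, higher digit for a higher maximum (assumed scale).
            level = min(3, int(4 * abs(hi) / scale))
            digits.append(str(6 + level))
        elif abs(lo) > abs(hi):
            # "Concave" case: 4..1, lower digit for a deeper minimum (assumed scale).
            level = min(3, int(4 * abs(lo) / scale))
            digits.append(str(4 - level))
        else:
            digits.append("5")   # assumption: the text leaves this case undefined
    return "".join(digits)

# A series that peaks in every segment, and one that dips in every segment.
peaks = encode_dsform([0, 10] * 20)
dips = encode_dsform([0, -10] * 20)
print(peaks, dips)
```

A sketched chart shape and an indexed data series could then be compared digit by digit, although the matching rule itself is not described in this passage.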
- Line item elements 312 may be input items provided by the user, and may be combined with a logical AND operator. For example, to view all data series regarding Gross Domestic Product (GDP) that are greater than 3 trillion dollars, the user types "GDP" in the box marked "Legend," "3" in the box marked "Minimum," "trillion" in the box marked "Measure type," and "$ US" in the box marked "y axis title." Line item elements 312 and their attributes are discussed in greater detail in co-pending U.S. Patent Application Serial No. 09/573,778.
- Document elements 314 are similar to line item elements 312, but apply to metadata regarding the document itself. For example, to find a document that shows depreciation in the financial statement of an airline industry company, type "<SEC:Depreciation>" and "<SIC:Air>" in the classes box. If a box is left blank, it is simply ignored as a criterion; all documents are returned subject to the imposed criteria.
- the "timestamp" box sets bounds on when the document was created, e.g., "<Dec. 2001" would indicate the document was created before December 2001.
- the "Source" and "Formatting Source" boxes each receive string arguments and modifiers.
- X axis elements 316 are similar to the line item elements 312 and document elements 314, but relate to the X axis values. "X Axis Description" is analogous to "Legend" on the Y axis, and "Measure Type" is the same for both the X and Y axes. The X axis represents the independent variable in a data set. X axis elements 316 are also discussed in greater detail in co-pending U.S. Patent Application Serial No. 09/573,778.
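The entry blocks described above (criteria combined with AND, blank boxes ignored) suggest a simple way of assembling selection criteria. The following is a minimal sketch only: the function `build_where`, the column names, and the parameterized "?" placeholder style are hypothetical and are not taken from the patent's own query format.

```python
def build_where(criteria):
    """Join non-blank criteria with AND; return (clause, parameters)."""
    clauses, params = [], []
    for column, (op, value) in criteria.items():
        if value in ("", None):
            continue                      # a blank box is ignored as a criterion
        clauses.append(f"{column} {op} ?")
        params.append(value)
    return " AND ".join(clauses), params

# The GDP example: Legend, Minimum, Measure type, y axis title, plus a blank box.
criteria = {
    "legend": ("=", "GDP"),
    "y_minimum": (">", 3),
    "measure_type": ("=", "trillion"),
    "y_axis_title": ("=", "$ US"),
    "source": ("=", ""),                  # left blank by the user
}
where, params = build_where(criteria)
print(where)
print(params)
```

Keeping the values separate from the clause text is a design choice of this sketch; it lets a database driver handle quoting of user input.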
- Action buttons 318 initiate the function identified on the button.
- the "search" button initiates a search.
- data query processor 204 reads each of the components of the data query screen 202 and produces a query string.
- the SQL query string may be of the following form:
- the macro query screen 206 allows the user to specify the types of behavior that are desired of a "macro."
- RXL macros can also be searched on the web or locally.
- RXL is an XML-based format that describes programming operations to be carried out on an RDL document. By searching for specific capabilities, users can find and download macro documents that provide specific functions for specific data documents.
- Because RXL documents may be ASCII documents that follow the Extensible Markup Language (XML) specification and the Document Object Model (DOM), they lend themselves to being searched and manipulated by automated means.
- There are several types of elements in an RXL document that are indexed (an index may be a database table that maps an element name and element value to a URL).
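An index that maps an element name and element value to a URL can be sketched with Python's standard XML parser. The function `index_document`, the sample document, its URL, and the `macro` and `title` element names are hypothetical; only `macro_class` is an element name mentioned in the text, and an in-memory dictionary stands in for the database table.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def index_document(index, url, xml_text):
    """Add every (element name, element text) pair of a document to the index."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        value = (elem.text or "").strip()
        if value:
            index[(elem.tag, value)].add(url)

# Hypothetical RXL-like macro document and URL.
sample = (
    "<macro>"
    "<macro_class>statistical operations</macro_class>"
    "<title>Linear regression</title>"
    "</macro>"
)
index = defaultdict(set)
index_document(index, "http://example.com/regress.rxl", sample)

# Look up macros by capability rather than by free-text keyword.
print(index[("macro_class", "statistical operations")])
```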
- Figure 4 illustrates a macro query screen 206 consistent with the present invention.
- One element of a macro is the "macro_class," which is a general description of the type of work that the macro performs.
- a macro that performs a linear regression on a data series may be classified under "statistical operations." In one implementation, there are about 20 of these categories, and they are shown in block 402 of Figure 4.
- Block 404 shows some criteria boxes: document title, timestamp, etc.
- buttons 406 perform as one would expect: “Start search” initiates a query on the RXL index, “Pause” stops the search so that the user can change the selection criteria, and “Cancel” halts the operation entirely and clears the input criteria.
- RXL macros refer to U.S.
- the macro query processor 208, like the data query processor 204, reads all of the graphical input items in macro query screen 206 and creates a SQL query string which may be passed to the server computer's query processor 802.
- An exemplary SQL query string is of the form:
- macro query processor 208 performs conversions from common-language formats to machine-readable formats (e.g., from "December 1, 2000” to "2000.1201000”), if they are required.
- Figure 5 shows a search progress screen 210 that provides the user with a map of the overall results in accordance with the present invention.
- the results are broken down by various metrics, such as the returned results by statistical series type, the sources of the results 504 (e.g., .com, .edu, and .gov), and the number of links 506 appearing in a document. Notice that this typically depicts more than just a list of documents.
- the user may view the progress of the search as the charts change to reflect the search progress.
- the user can select certain narrowing criteria by indicating such criteria on the search progress screen 210.
- the narrowing instructions are relayed to the server computer 104, which continues its search with the new criteria.
- the user may not only delete columns from the chart (and thus the search), but also "combine columns” by dragging one on to the top of another. This leaves both criteria in the search, but merges them for display purposes, allowing the user to see the relative returns of different (customizable) categories.
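The breakdown by source and the "combine columns" action can be sketched as simple counting operations. The function names `count_by_source` and `combine_columns`, the merged column label, and the sample URLs are hypothetical; the sketch only merges counts for display and does not model how the search criteria themselves are kept.

```python
from collections import Counter
from urllib.parse import urlparse

def count_by_source(urls):
    """Count result URLs by top-level domain, such as .com, .edu, or .gov."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname or ""
        counts["." + host.rsplit(".", 1)[-1]] += 1
    return counts

def combine_columns(counts, dragged, target):
    """Merge the dragged column into the target column for display."""
    merged = Counter(counts)
    merged[f"{target}+{dragged}"] = merged.pop(target, 0) + merged.pop(dragged, 0)
    return merged

urls = [
    "http://a.example.com/x.rdl",
    "http://b.example.edu/y.rdl",
    "http://c.example.gov/z.rdl",
    "http://d.example.com/w.rdl",
]
counts = count_by_source(urls)
merged = combine_columns(counts, ".edu", ".gov")
print(dict(counts))
print(dict(merged))
```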
- the search progress processor 212 receives progress update information from web server 124, receives the narrowing criteria from the user through search progress screen 210, and passes narrowing criteria to web server 124.
- search results screen 214 provides the user with the query results, which the results screen receives from the results processor 216.
- the results are provided as one of two kinds of information: either a list of RDL documents that meet the criteria or have line items that meet the criteria, or a table of line items that have been culled from one or more RDL documents and collected into a new RDL document.
- Figure 6 is an example of a search result screen 214 displaying a list of
- Figure 6 displays a legend 602, title 604, URL 606, and data type 608.
- resultset may be the equivalent of a flat file table in a database: a set of rows and columns.
- each row represents a line item.
- the analogy is a loose one: most dramatically, in one implementation, each line item has not just one set of data, but two sets of data.
- Each line item has a set of "x values" that can be plotted on the x axis of a chart, and a set of y values that can be plotted on the y axis.
- the user can view each line item one by one as XY plots with a browser 222 in client computer 102 (as described in U.S. Application Serial No. 09/573,778).
- the user can view each line item by clicking on the lines in Figures 6 and 7.
- multiple lines can be plotted on the same chart if they are of the same data set type.
- Figure 7 shows that two time series data items may be placed on the same chart.
- the "General sales taxes" 702 and the "Selective sales taxes" 704 are plotted on the same chart.
- the query results are also useful in other applications, and therefore, an export processor 216 exports the different result types.
- a document list provided as a resultset may be exported into a HTML or Rich Text Format (RTF) formatted text document.
- the resultset can also be exported to a new RDL document, or saved as an Excel or comma delimited text file.
- the export processor 216 simply takes the data from the respective models for the result views, applies the necessary reformatting, and provides the capability to save the results to either a file or the clipboard.
- the methods and systems in accordance with the present invention also provide an overall application manager which is responsible for building the screens, tying the various software components together, providing menus, toolbars and status bars, and maintaining the help system. Typically, this is the user's normal web browser. It also provides the communications links with the server computer 104 over "http:" protocols.
- the server computer 104 in accordance with principles of the present invention is comprised of a query processor 802, an index database 804, an index maintenance assembly 806, an index updating agent assembly 808, a document reader 810 and a server management assembly 812.
- the query processor 802 is responsible for receiving query instructions from the client computer 102 and querying the index database 804, which will be described in greater detail below.
- the query processor 802 distinguishes between data and macro queries, receives narrowing criteria from search progress screen 210, and provides efficiency-enhancing suggestions to the query process in the index database 804.
- the index database 804 may not contain all of
- the query processor 802 is responsible for noting the deficiencies in this area and passing a request onto the document reader 810 (described below) to fulfill the request for missing data.
- the index database 804 may be a relational database with the table structure
- the index database 804 may be comprised of
- LineItemIndex table 902 is a long table containing a cross-reference of URLs and attribute values associated with the line items in RDL documents. Each line item corresponds to a record in a traditional database, with
- the exemplary line item, from an RDL document, illustrated below and Figure 10 provide an exemplary comparison of the data found in a line item of an
- the first entry 1004 in LineItemIndex table 902 is comprised of URL 1006, Anchor 1008, Attribute 1010, and Value 1012.
- URL 1006 identifies the source of the RDL document; Attribute 1010 provides the element attribute name; and Value 1012 provides the value for the element.
- the URL may include several pieces of information.
- the first part of the URL (http/ftp/etc.) describes the protocol to be used by a browser in accessing the web server, the next part (e.g., /www.somedata.com/) describes the web server, the third part (e.g., /somedatapage.rdl/) points to the actual document, and the final part (e.g., /statab/) provides what is known as an "anchor” 1008 or an "xlink", which is a pointer directly to a specific line item.
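The four URL parts described above can be separated with a standard URL parser. The following is a minimal sketch, not the patent's implementation: the function name `split_rdl_url` is hypothetical, and it assumes the anchor is carried as a URL fragment (after `#`), which is one common convention and may differ from the form used in the original system.

```python
from urllib.parse import urlsplit

def split_rdl_url(url):
    """Split a line-item URL into protocol, server, document path, and anchor."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. http or ftp
        "server": parts.netloc,     # the web server
        "document": parts.path,     # the actual RDL document
        # Assumption: the anchor is expressed as a URL fragment.
        "anchor": parts.fragment,
    }

info = split_rdl_url("http://www.somedata.com/somedatapage.rdl#statab")
```

Under these assumptions, the `document` and `anchor` entries would map naturally onto the URL and Anchor columns of the index table.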
- LineItemIndex_Temp table 904 follows the same structure as the
- LineItemIndex table is generally created once a query begins and contains successful matches to the query. Line items that are included in the resultset are checked for hyperlinks in real time to see if they point to any useful RDL documents and line items. If so, those line items are added to LineItemIndex_Temp table 904 and included in the search.
- HitInfo table 906 contains data related to the popularity and content information of the various documents. It is linked to the other tables by the document URL.
- the "NumHits" field contains a running count of the number of times this document has been accessed in a successful search.
- the "ClassRequest" field is a comma-delimited list of the top 5 classes that were requested in connection with a successful search for this document.
- the "Top5Related" field contains a comma-delimited list of the top 5 related web sites to the one sought. To each of the top 5 classes and top 5 documents is pre-pended the number of access hits. This allows the rankings to be updated under real-time conditions.
- index maintenance assembly 806 is responsible for: locating new RDL-formatted documents (either by following hyperlinks of known documents or by accepting manual input from an operator); checking for changes in documents by looking at the "timestamp" and "expiration" attributes of the known documents; identifying duplicates; and other housekeeping tasks.
- Index updating agent assembly 808 updates documents that have changed so all documents do not have to be updated unnecessarily.
- Index updating agent assembly 808 is comprised of a collection of routines that can parse and read XML, RDL, and RXL documents to look for specific items of information required to add a document or line item to the index database 804.
- One of the easiest to use in a research operation is entitled “Aelfred,” which may be obtained online.
- Document reader 810 is used by both the query processor 802 and the index maintenance assembly 806 to read and parse attributes and tag values from remote or local RDL documents, which may be in a text file format such as ASCII.
- server management assembly 812 is responsible for building the screens, tying the various software components together, providing menus, toolbars and status bars, and maintaining the help system. It also provides the communications links with the server computer 104 over "http:" protocols.
- the server computer 104 in accordance with the present invention maintains the indexing and searching processes.
- the indexing process 1100 involves searching the target universe of RDL documents, and building an index of key words and URLs which are used as links to the RDL documents.
- Figure 11 describes the flow for creating an index of information of RDL documents over the Internet. This process is also amenable to the indexing of limited collections of RDL documents, such as those confined to a corporation's internal network.
- the goal of the indexing process is to create a set of data tables which comprise the following four fields: URL, Anchor, Attribute, and Value, as illustrated in Figure 10.
- the indexing process starts with a user determined list of RDL documents and a list of domains, such as websites, file directories, or databases, that limit the search (step 1102). The constraint on the number of domains is implemented because the search process may be used in the context of a single corporation or working group.
- a standard XML Parser/Processor may be used to read the RDL documents, and as elements and attributes are encountered, they are evaluated for inclusion in the index (step 1104).
- the following rules are applied: (1) duplicates are ignored (the process need only identify that the keyword exists in an element in the document); (2) strings are parsed (e.g., "Net Equipment Depreciation" would be divided into three entries, one for each word); and (3) numbers are ignored.
- the indexing process applies the last rule because there are an infinite number of numbers and their indexing would be voluminous. Instead, numbers are tested in real time during the search process. This is explained in greater detail in step 1204 in the search process of Figure 12.
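The three indexing rules can be illustrated with a short sketch. The function name `index_terms` and the regular expression used to recognize numbers are assumptions made for illustration; the source does not specify how numeric values are detected.

```python
import re

# Assumed pattern for a "number": optional sign, digits with optional
# thousands separators, optional decimal part.
NUMBER = re.compile(r"[-+]?\d[\d,]*(\.\d+)?")

def index_terms(text):
    """Return the distinct non-numeric words of an element or attribute value."""
    terms = []
    for word in text.split():          # Rule 2: strings are split into words.
        if NUMBER.fullmatch(word):     # Rule 3: numbers are not indexed.
            continue
        if word not in terms:          # Rule 1: duplicates are ignored.
            terms.append(word)
    return terms

words = index_terms("Net Equipment Depreciation")
```

Each returned word would become one index entry; numeric values are left for the real-time phase of the search.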
- an initial list of cached RDL documents may be entered (step 1106). These are a set of RDL documents that are known at the outset to be more likely to be requested than others. These documents may be "cached"; that is, the documents will be copied to a local source and the restriction on cataloging numbers is relaxed.
- the index of documents is created.
- a standard XML processor is used to proceed through each document, element by element, and then attribute by attribute to create an index of the major elements and attributes (step 1108).
- element text values are the values appearing between opening and closing tags (e.g., "<data-source>US Census Bureau</data-source>" would have an element text value of "US Census Bureau").
- the XML processor collects both elements and attributes into name/value pairs and calls a software routine (or method) as each is completed.
- the "handler” method takes the URL, element/attribute name, and value and creates a new record in the index data table with those three values.
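The element-by-element walk with a handler callback might look like the sketch below, which uses Python's standard `xml.etree.ElementTree` parser as a stand-in for the unspecified XML processor. The `line_item` element name and the sample document are hypothetical; `li_units` and `data-source` are taken from the text.

```python
import xml.etree.ElementTree as ET

def index_document(url, xml_text):
    """Walk a document and return (url, name, value) index records."""
    records = []

    def handler(name, value):
        # Called once per completed name/value pair.
        records.append((url, name, value))

    root = ET.fromstring(xml_text)
    for element in root.iter():
        text = (element.text or "").strip()
        if text:
            handler(element.tag, text)              # element text value
        for attr_name, attr_value in element.attrib.items():
            handler(attr_name, attr_value)          # attribute value
    return records

# Hypothetical sample document for illustration only.
sample = (
    '<line_item li_units="US Dollars">'
    "<data-source>US Census Bureau</data-source>"
    "</line_item>"
)
records = index_document("http://www.somedata.com/somedatapage.rdl", sample)
```

A fuller version would also pass the line item's anchor to the handler so that the four-field record (URL, Anchor, Attribute, Value) can be written.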
- the "cache" RDL documents are also handed off to an additional handler which collects all the attributes and elements into a relational table (step 1111). During the search process, the data query processor 204 will use this relational cache to create the RDL elements in real time, rather than searching the Internet to find the RDL document, to parse and process the elements.
- In addition to the indexing of cached RDL documents, the server computer 104 also crawls hyperlinks in RDL documents (step 1112). There may be RDL documents that are not in the original list of RDL documents to be indexed. To try to find additional RDL documents, the indexing application records all RDL URLs it finds in the various "href" attributes of the documents it evaluates during the indexing process.
- index update agent 808 executes and visits every unique URL listed and performs the following: (1) checks whether the RDL document still exists, and if not, index update agent 808 removes all references to the document in the index; (2) checks whether each of the Attribute/Value pairs still exist, and if not, they are removed from the index; (3) checks whether there are any new Attribute/Value pairs, and if so, they are added to the indexes (step 1114).
- checks (1) and (2) are performed together, by removing all records for that URL and rebuilding the index if the timestamp tag has been changed. However, if the timestamp is the same as the last time the URL was visited, it is assumed that there is nothing to update, and the index is not modified.
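The timestamp-driven update can be sketched as follows. This is an illustrative reading of the rule, with hypothetical names throughout: the index is modeled as a dictionary keyed by URL, and `fetched_pairs` is `None` when the document can no longer be retrieved.

```python
def refresh_index(index, url, fetched_timestamp, fetched_pairs, last_seen):
    """Rebuild the records for one URL only when its timestamp has changed."""
    if fetched_pairs is None:
        # The document no longer exists: remove all references to it.
        index.pop(url, None)
        last_seen.pop(url, None)
        return "removed"
    if last_seen.get(url) == fetched_timestamp:
        # Same timestamp as the last visit: assume nothing needs updating.
        return "unchanged"
    # Timestamp changed: drop the old records and rebuild from the document.
    index[url] = list(fetched_pairs)
    last_seen[url] = fetched_timestamp
    return "rebuilt"
```

Rebuilding the whole entry for a changed URL covers removed and newly added Attribute/Value pairs in a single pass, at the cost of rewriting pairs that did not change.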
- Figure 12 illustrates an overview of the searching process that server computer 104 performs in accordance with the present invention.
- the user selects certain criteria and the server computer 104 collects the qualified results and formats them for display in various forms.
- the search process is the process that the user experiences when searching for information in a collection of RDL documents. In one implementation, this process is performed by the software components shown in Figure 8.
- Search process 1200 starts when the user enters search criteria into an HTML form in a browser running on client computer 102 (step 1202).
- An exemplary version of this form is illustrated in the data query screen 202 of Figure 3.
- This exemplary screen is one possible configuration in which all elements are not necessary.
- an HTML form field of only text input can feed the subsequent search process.
- the data query screen 202 may be produced by a web server, or by an internal "client" program running on client computer 102.
- the search criteria from data query screen 202 are typically submitted to server computer 104 through a web server 124, for example, via the GET or POST methods of HTTP (step 1204).
- GET and POST are two different methods in which information may be passed from a client browser to a web server using the HTTP protocol.
- web server 124 queries the cached RDL documents, then reformats the request into an SQL string and submits the query to the index data tables 804 (step 1206).
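Reformatting a keyword request into an SQL string against the four-column index can be sketched with Python's built-in `sqlite3` module. The table and column names follow the fields named in the text; the sample rows, the helper names, and the choice of SQLite are assumptions made for illustration.

```python
import sqlite3

def build_index(rows):
    """Create an in-memory four-column index table and load the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE LineItemIndex (URL TEXT, Anchor TEXT, Attribute TEXT, Value TEXT)"
    )
    conn.executemany("INSERT INTO LineItemIndex VALUES (?, ?, ?, ?)", rows)
    return conn

def find_line_items(conn, keywords):
    """Return the distinct (URL, Anchor) pairs whose Value matches any keyword."""
    keywords = list(keywords)
    if not keywords:
        return []
    # Build one placeholder per keyword so values are bound, not interpolated.
    placeholders = ", ".join("?" for _ in keywords)
    sql = (
        "SELECT DISTINCT URL, Anchor FROM LineItemIndex "
        f"WHERE Value IN ({placeholders}) ORDER BY URL, Anchor"
    )
    return conn.execute(sql, keywords).fetchall()

# Hypothetical sample rows; the URL follows the example used in the text.
DOC = "http://www.somedata.com/somedatapage.rdl"
conn = build_index([
    (DOC, "statab", "li_units", "US Dollars"),
    (DOC, "statab", "classes", "sec:Total-Debt"),
    (DOC, "other", "li_units", "French Francs"),
])
matches = find_line_items(conn, ["US Dollars", "sec:Total-Debt"])
```

The returned URL and anchor pairs identify the candidate line items that would then be opened and evaluated in the real-time phase.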
- the individual RDL documents that match the query are evaluated (step 1208). Evaluation consists of opening a file and transforming the document's data into a form that is directly comparable to the requested data. At this stage, the cached documents are also searched and evaluated.
- search progress screen 210 is updated at predetermined intervals (step 1210).
- the search progress screen 210 is updated to show the number of documents that have been found and their characteristics. The user can then focus the search by making updates to this screen 210.
- the query processor 802 adjusts its respective workload by adjusting the SQL string, and refining the query criteria (step 1212).
- the result of a completed search may be an RDL document, constructed from the processed data returned by the SQL query and the user applied refinements (step 1214).
- the RDL document is returned to the user on search result screen 600, through a browser running on client computer 102 (step 1216).
- Data management system 100 in accordance with the present invention provides unique methods for the handling of specific topics in conducting and managing the queries of RDL documents, and formatting the results provided to the user.
- the following are examples of the various handling methods employed in accordance with the present invention.
- Data management system 100 uses a number of semantic descriptors in the RDL documents and in the client computer 102 to identify documents by the meaning of their content. There are, first, "class" tags, where users can place identifying classification tags that help identify the subject matter of the document or line item.
- Each class identifier has two parts: a "namespace" and a "class". The two parts are separated by a colon (":").
- the namespace is a unique identifier representing the source of that classification system.
- the "class" is the actual classification value, drawn from the specific vocabulary of the organization that sponsored the namespace.
- dewey:8144 may mean that the subject of this document falls under section 8144 (French Literature) in the Dewey Decimal system.
- sec:Total-Debt may mean that this line item refers to total debt in a financial statement as defined by the SEC (Securities and Exchange Commission).
- the "Total-Debt" tag is specified as one of the tags of the U.S. Securities and Exchange Commission's Standard Generalized Markup Language (SGML) standard. Aside from using specific class tags, data management system 100 may use other tags as an indirect way of identifying content. For example, the "li_units" tag might specify "US Dollars" as the unit measure of the line item. This could be interpreted by server computer 104 as an indicator that the data is of a "financial" nature. The user may specify maps from one element type to another in client computer 102. These elements and tags are described in U.S. Patent Application Serial No. 09/573,778.
- Figure 10 shows an index with four (4) columns; other implementations may have different numbers of columns. Discussed below are three different implementations of an index table, each with a different set of tag type information. Note that attribute names and legend names are mixed together in the same column: there is no overlap between attribute and element names in
- attribute names and values are recorded. This leads to the smallest index: easiest to create and fastest to search. To really test a query, however, requires going to the document in real time to test a complex query or to do a transformational test.
- a new column is populated: the line item ID number (also called the "anchor"), which allows a complete line item to be reconstructed from the index. This allows complex queries to be performed (e.g.,
- a new column is added: the parent line item of the current line item.
- This allows queries that move up and down the hierarchies of values in a document.
- An example of where hierarchical structure is important is in a financial statement: "Bus Sales" and "Auto Sales" may both be components of "Total Sales".
- the query processor 802 finds the children of "Total Sales.” This is performed by following the pointers in the "parent" column.
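Following the pointers in the "parent" column can be sketched with a small in-memory table. The row layout and the function name `children_of` are hypothetical; only the legends "Total Sales", "Bus Sales", and "Auto Sales" come from the text.

```python
# Hypothetical index rows: each line item records its own anchor and the
# anchor of its parent line item (None for a top-level item).
ROWS = [
    {"anchor": "1", "parent": None, "value": "Total Sales"},
    {"anchor": "2", "parent": "1", "value": "Bus Sales"},
    {"anchor": "3", "parent": "1", "value": "Auto Sales"},
    {"anchor": "4", "parent": None, "value": "Total Debt"},
]

def children_of(rows, legend):
    """Return the values of line items whose parent carries the given legend."""
    parent_anchors = {row["anchor"] for row in rows if row["value"] == legend}
    return [row["value"] for row in rows if row["parent"] in parent_anchors]

components = children_of(ROWS, "Total Sales")
```

Moving up the hierarchy works the same way in reverse: look up a row's `parent` anchor and fetch the row that owns it.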
- The method for determining the optimal index database 804 to use starts by considering the tradeoff between index keyword seek time and storage space, and real-time phase document seek time and query complexity. Because index database 804 may be created from a relational database (which is very fast), the largest architecture that can fit in the available storage space should be selected.
- frequently-changing data (i.e., documents which have expired according to their timestamps)
- frequently changing data might suggest using one of the smaller index architectures.
- On a small intranet or single-disk file system, for example, it may be possible to use a 5-column index. If millions of documents are being indexed around the web, many with infrequently changing data, and complex arithmetic queries are not contemplated, a three-column index may be suitable, dropping the "anchor" column and looking anchors up at runtime.
- This information can be directly accessed for future searches by including it in index database 804.
- the class information provides the necessary data, without causing the widespread nesting and restructuring of the data.
- the user of this RDL document simply has to specify the "classes" element, then use the commas that separate the different classes to parse the class element into the different member elements.
- the two member elements are "dewey:624" and "sec:debt".
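Parsing a "classes" value into its members, and each member into its namespace and class, is a two-level split: commas between members, a colon within each. The function name `parse_classes` is hypothetical, and the sketch assumes class values themselves contain no commas.

```python
def parse_classes(classes_text):
    """Split a comma-separated classes value into (namespace, class) pairs."""
    pairs = []
    for member in classes_text.split(","):
        member = member.strip()
        if not member:
            continue
        # Split on the first colon only, so a class value may contain colons.
        namespace, _, class_value = member.partition(":")
        pairs.append((namespace, class_value))
    return pairs

members = parse_classes("dewey:624, sec:debt")
```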
- Compound Tags are formed during the indexing phase to allow users to get the fully-qualified description of the line item they requested. This is necessary because much tabular data is necessarily hierarchical.
- the system can implicitly generate any number of tags.
- the "Sales” genus implicitly includes tags such as “Sales:Implicit_Total”, “Sales:Implicit_Average”, “Sales:Implicit_Min”,
- data management system 100 keeps a database of content in its index databases 804, but then visits individual pages once a potential match has been found. Data management system 100 is thus able to ascertain (1) whether the page still exists, (2) whether the most up-to-date version of the data is contained in the index database 804, (3) whether new line items have been added that might meet the search criteria, and (4) whether complex internal criteria are met, such as "Revenues > 100 million US Dollars AND Debt < 20 million French Francs."
Two-Step Searching Process
- balancing the indexing requirements of RDL documents, and the time it takes to check a specific document for purposes of value transformations during the search process involves managing a two-step process.
- the two search strategies searching an index versus real time searching of the RDL documents
- the searching approach of data management system 100 modulates the apportionment of the two approaches to provide the most efficient overall search combination.
- fewer attribute values may be placed in the index and more reliance may be placed on real-time analysis of the documents.
- Figure 13 illustrates the unit types 1302 into which many units fall, the base units used for the unit types 1302, and other examples 1306 of possible base units that may be employed.
- Figure 14 illustrates the flow of a transformation value search in accordance with the present invention.
- Data management system 100 maintains a database of conversion rates for use in transformational searches.
- the client computer 102 transforms the requested units to the base units 1304 (step 1402) before it is sent to the server computer (step 1404).
- the server computer transforms the indexed data to the base units (step 1406).
- a search is then conducted in the base unit on the LineItemIndex database 902 (step 1408).
- the results are then transformed to the requested units and returned to the client computer 102 (step 1410).
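The transformation search of steps 1402 through 1410 can be sketched as below. The unit table, the choice of meters as the base unit, and the sample data are assumptions for illustration (the source keeps its conversion factors in an XML document that is not reproduced here); length units are used because their factors are fixed, unlike currency rates.

```python
# Hypothetical conversion factors to the base unit (meters).
TO_BASE = {"meter": 1.0, "kilometer": 1000.0, "mile": 1609.344}

def to_base(value, unit):
    """Convert a value in the given unit to the base unit."""
    return value * TO_BASE[unit]

def from_base(value, unit):
    """Convert a base-unit value back to the given unit."""
    return value / TO_BASE[unit]

def transformed_search(index, minimum, requested_unit):
    """Return (name, value) pairs at or above the minimum, in the requested unit."""
    threshold = to_base(minimum, requested_unit)          # step 1402
    hits = []
    for name, value, unit in index:
        base_value = to_base(value, unit)                 # step 1406
        if base_value >= threshold:                       # step 1408
            hits.append((name, from_base(base_value, requested_unit)))  # step 1410
    return hits

# Hypothetical indexed line items: (name, value, unit).
INDEX = [
    ("route_a", 5.0, "kilometer"),
    ("route_b", 2.0, "mile"),
    ("route_c", 4000.0, "meter"),
]
hits = transformed_search(INDEX, 3.5, "kilometer")
```

Comparing everything in one base unit means each unit needs only a single factor, rather than a factor for every pair of units.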
- the unit transformations used are kept in an XML document, a sample fragment of which appears below. Both the client computer 102 and the server computer 104 can modify this database of conversion factors to handle updates, such as updates to currency conversion rates, or the creation of new units by the user.
- data management system 100 may use a related-terms index (not shown) to find synonyms and similar concepts.
- the index for this step may be a many-to-many table that may contain two columns, each containing keywords.
- this index (or "map") of related concepts is queried to find a list of related words.
- These terms are then fed into the SQL query to index database 804 as alternates.
- one application for related-terms index is to map related foreign language words to each other.
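A two-column, many-to-many related-terms table and the lookup that expands a query term into alternates might be sketched as follows. The word pairs and the function name `expand_terms` are hypothetical examples, not entries from the actual index.

```python
# Hypothetical related-terms table: each row links two keywords.
RELATED = [
    ("dog", "perro"),      # foreign-language equivalent
    ("dog", "canine"),     # synonym
    ("sales", "revenue"),
]

def expand_terms(term, related=RELATED):
    """Return the term plus every keyword linked to it, in sorted order."""
    terms = {term}
    for left, right in related:
        # The relation is treated as symmetric: match on either column.
        if left == term:
            terms.add(right)
        elif right == term:
            terms.add(left)
    return sorted(terms)

alternates = expand_terms("dog")
```

The expanded list could then be supplied as the alternates in the SQL query against index database 804.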
- the server computer 104 has access to an RXL Interpreter module, which converts a high level programming language to a low level Structured Query Language (SQL) code, so that it can perform arithmetic queries. These types of queries require using one or more line item(s) in an arithmetic formula to determine whether the subject line item meets a particular inclusion criteria.
- a simple example would be a time series line item where the user is only interested in line items which contain a minimum % change (period to period) of 15%.
- the server computer 104 would retrieve a collection of line items that meet the non-arithmetic criteria (in this case, that the item is a time series), then pass each line item in a formula to the RXL Interpreter for evaluation.
- the formula for evaluation would be "MAX(PERC_CHANGE(A))", where A is the line item being tested.
- MAX and PERC_CHANGE are functions in the RXL macro language. If this formula evaluates, for example, to 22%, the line item is added to the resultset.
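The arithmetic test can be illustrated in Python. The functions below are stand-ins, not the RXL functions themselves: `perc_change` assumes that PERC_CHANGE means the period-to-period percentage change of a series, and the sample series are hypothetical.

```python
def perc_change(series):
    """Period-to-period percentage change (assumes no zero values in the series)."""
    return [(curr - prev) / prev * 100.0 for prev, curr in zip(series, series[1:])]

def meets_min_change(series, minimum_percent=15.0):
    """True if the largest period-to-period change reaches the minimum."""
    changes = perc_change(series)
    # A series with fewer than two points has no changes and cannot qualify.
    return bool(changes) and max(changes) >= minimum_percent

# Hypothetical time series line items.
candidates = {
    "item_a": [100.0, 110.0, 134.2],   # changes of roughly 10% and 22%
    "item_b": [100.0, 105.0, 110.0],   # changes of roughly 5% and 4.8%
}
resultset = [name for name, series in candidates.items() if meets_min_change(series)]
```

Only the line items that pass the non-arithmetic criteria need to be evaluated this way, which keeps the more expensive arithmetic step off the bulk of the index.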
- Arithmetic queries can use any RXL function, or RXL macro, that returns a value that can be evaluated to a scalar or a vector.
- Watch Queries may be queries that are set up to run on a scheduler, returning a result set when matches are found. This is a "pull" version of traditional "push" notification. For example, assume that a user wants to be notified when a certain stock hits a certain price from a given set of documents. A scheduler may be set to check those documents periodically. If the required tests are met, a specified result format may be sent (e.g., email message, or result table to browser, or notice to network software object).
- This query method contrasts with the usual method of setting a trigger on a relational database that watches updates and sends a message ("push") at the instant the change occurs.
- the downside of this approach is the necessity of placing a trigger in the software on the server and the burden placed on the server.
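One pass of a watch query can be sketched as a function that a scheduler would call periodically. Everything here is hypothetical: the document shape, the price test, and the `notify` callback merely stand in for whatever result format (email, result table, or software notice) is configured.

```python
def run_watch_once(documents, test, notify):
    """Check each watched document once and notify if any of them match."""
    matches = [doc for doc in documents if test(doc)]
    if matches:
        notify(matches)   # e.g. send an email or push a result table
    return matches

# Hypothetical watched documents and a price threshold test.
watched = [
    {"url": "http://www.somedata.com/a.rdl", "price": 12.0},
    {"url": "http://www.somedata.com/b.rdl", "price": 8.0},
]
sent = []
found = run_watch_once(watched, lambda doc: doc["price"] >= 10.0, sent.append)
```

Because the client polls on its own schedule, nothing has to be installed on the data server, at the cost of learning about a change only at the next check.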
- the RDL and RXL document type definitions include elements that describe the licensing terms of the document.
- the server computer 104 can read these elements (they may appear in the LineItemIndex table 902) to determine whether they meet any licensing criteria that the user has set as part of the search.
- a sample fragment of a licensing element from an RDL document follows.
- the server computer 104 can search for documents using any of these criteria. For example, if the user has set a maximum amount that can be paid of "$2.5 US Dollars Per Download," the document defined by the above fragment would be included in the results because it has a "license_type" of payment per download and "terms" of less than $2.50.
- RDL is a particular implementation of XML which provides elements and attributes that describe the "metadata" of numbers as well as the numbers themselves. Those who are familiar with XML will recognize that the methods of the search engine described herein may be applied to any XML-based implementation that provides elements and attributes that supply equivalent information.
- the described implementation includes software but methods and systems consistent with the present invention may be implemented as a combination of hardware and software or in hardware alone. Methods and systems in accordance with the present invention may be implemented with both object-oriented and non-object-oriented programming systems. Additionally, components in accordance with the present invention may be stored in memory; one skilled in the art will appreciate that these components can be stored on other types of computer-readable media, such as secondary storage devices, like hard disks, floppy disks, or CD-ROM; a carrier wave from the Internet or other propagation medium; or other forms of RAM or ROM.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001241564A AU2001241564A1 (en) | 2000-02-17 | 2001-02-16 | Rdl search engine |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18315200P | 2000-02-17 | 2000-02-17 | |
US60/183,152 | 2000-02-17 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001061568A2 WO2001061568A2 (en) | 2001-08-23 |
WO2001061568A9 true WO2001061568A9 (en) | 2002-04-11 |
Family
ID=22671662
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/005268 WO2001061568A2 (en) | 2000-02-17 | 2001-02-16 | Rdl search engine |
Country Status (3)
Country | Link |
---|---|
US (2) | US6886005B2 (en) |
AU (1) | AU2001241564A1 (en) |
WO (1) | WO2001061568A2 (en) |
Families Citing this family (98)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AUPO489297A0 (en) * | 1997-01-31 | 1997-02-27 | Aunty Abha's Electronic Publishing Pty Ltd | A system for electronic publishing |
US7293228B1 (en) | 1997-01-31 | 2007-11-06 | Timebase Pty Limited | Maltweb multi-axis viewing interface and higher level scoping |
US9262384B2 (en) | 1999-05-21 | 2016-02-16 | E-Numerate Solutions, Inc. | Markup language system, method, and computer program product |
US9262383B2 (en) | 1999-05-21 | 2016-02-16 | E-Numerate Solutions, Inc. | System, method, and computer program product for processing a markup document |
US6920608B1 (en) * | 1999-05-21 | 2005-07-19 | E Numerate Solutions, Inc. | Chart view for reusable data markup language |
US7249328B1 (en) * | 1999-05-21 | 2007-07-24 | E-Numerate Solutions, Inc. | Tree view for reusable data markup language |
US9268748B2 (en) | 1999-05-21 | 2016-02-23 | E-Numerate Solutions, Inc. | System, method, and computer program product for outputting markup language documents |
US7421648B1 (en) * | 1999-05-21 | 2008-09-02 | E-Numerate Solutions, Inc. | Reusable data markup language |
US7136821B1 (en) | 2000-04-18 | 2006-11-14 | Neat Group Corporation | Method and apparatus for the composition and sale of travel-oriented packages |
US8122236B2 (en) | 2001-10-24 | 2012-02-21 | Aol Inc. | Method of disseminating advertisements using an embedded media player page |
US8918812B2 (en) | 2000-10-24 | 2014-12-23 | Aol Inc. | Method of sizing an embedded media player page |
FR2816157A1 (en) * | 2000-10-31 | 2002-05-03 | Thomson Multimedia Sa | PROCESS FOR PROCESSING DISTRIBUTED VIDEO DATA TO BE VIEWED ON SCREEN AND DEVICE IMPLEMENTING THE METHOD |
US20020103920A1 (en) * | 2000-11-21 | 2002-08-01 | Berkun Ken Alan | Interpretive stream metadata extraction |
US9600842B2 (en) | 2001-01-24 | 2017-03-21 | E-Numerate Solutions, Inc. | RDX enhancement of system and method for implementing reusable data markup language (RDL) |
US6910051B2 (en) * | 2001-03-22 | 2005-06-21 | International Business Machines Corporation | Method and system for mechanism for dynamic extension of attributes in a content management system |
US20020194026A1 (en) * | 2001-06-13 | 2002-12-19 | Klein Jeffrey Lawrence | System and method for managing data and documents |
DE60129942T2 (en) * | 2001-06-18 | 2008-04-17 | Hewlett-Packard Development Co., L.P., Houston | Method and system for identifying devices connected via a network, e.g. Personal computer |
US7404142B1 (en) * | 2001-06-29 | 2008-07-22 | At&T Delaware Intellectual Property, Inc. | Systems and method for rapid presentation of structured digital content items |
US20030041305A1 (en) * | 2001-07-18 | 2003-02-27 | Christoph Schnelle | Resilient data links |
US7363310B2 (en) * | 2001-09-04 | 2008-04-22 | Timebase Pty Limited | Mapping of data from XML to SQL |
US20030046276A1 (en) * | 2001-09-06 | 2003-03-06 | International Business Machines Corporation | System and method for modular data search with database text extenders |
US7281206B2 (en) * | 2001-11-16 | 2007-10-09 | Timebase Pty Limited | Maintenance of a markup language document in a database |
JP4215425B2 (en) * | 2001-11-21 | 2009-01-28 | 日本電気株式会社 | Text management system, management method thereof, and program thereof |
US7124358B2 (en) * | 2002-01-02 | 2006-10-17 | International Business Machines Corporation | Method for dynamically generating reference identifiers in structured information |
US7016914B2 (en) * | 2002-06-05 | 2006-03-21 | Microsoft Corporation | Performant and scalable merge strategy for text indexing |
US7548858B2 (en) | 2003-03-05 | 2009-06-16 | Microsoft Corporation | System and method for selective audible rendering of data to a user based on user input |
WO2005072140A2 (en) * | 2004-01-16 | 2005-08-11 | Caitlyn Harts | Method and apparatus to perform client-independent database queries |
US7231590B2 (en) * | 2004-02-11 | 2007-06-12 | Microsoft Corporation | Method and apparatus for visually emphasizing numerical data contained within an electronic document |
US7831581B1 (en) | 2004-03-01 | 2010-11-09 | Radix Holdings, Llc | Enhanced search |
JP2005309727A (en) * | 2004-04-21 | 2005-11-04 | Hitachi Ltd | File system |
US7840557B1 (en) * | 2004-05-12 | 2010-11-23 | Google Inc. | Search engine cache control |
US7506361B2 (en) * | 2004-05-17 | 2009-03-17 | International Business Machines Corporation | Method for discovering servers, spawning collector threads to collect information from servers, and reporting information |
US20050270186A1 (en) * | 2004-06-07 | 2005-12-08 | Tai-Hung Lin | Specified index method for abstract syntax notation one encoding systems |
US8244689B2 (en) * | 2006-02-17 | 2012-08-14 | Google Inc. | Attribute entropy as a signal in object normalization |
US7769579B2 (en) * | 2005-05-31 | 2010-08-03 | Google Inc. | Learning facts from semi-structured text |
US7418410B2 (en) | 2005-01-07 | 2008-08-26 | Nicholas Caiafa | Methods and apparatus for anonymously requesting bids from a customer specified quantity of local vendors with automatic geographic expansion |
US9208229B2 (en) * | 2005-03-31 | 2015-12-08 | Google Inc. | Anchor text summarization for corroboration |
US8682913B1 (en) | 2005-03-31 | 2014-03-25 | Google Inc. | Corroborating facts extracted from multiple sources |
US7587387B2 (en) | 2005-03-31 | 2009-09-08 | Google Inc. | User interface for facts query engine with snippets from information sources that include query terms and answer terms |
US8996470B1 (en) * | 2005-05-31 | 2015-03-31 | Google Inc. | System for ensuring the internal consistency of a fact repository |
US7831545B1 (en) * | 2005-05-31 | 2010-11-09 | Google Inc. | Identifying the unifying subject of a set of facts |
US7558785B2 (en) * | 2005-06-23 | 2009-07-07 | International Business Machines Corporation | Extrapolating continuous values for comparison with discrete valued data |
EP1920393A2 (en) * | 2005-07-22 | 2008-05-14 | Yogesh Chunilal Rathod | Universal knowledge management and desktop search system |
US20070100862A1 (en) | 2005-10-23 | 2007-05-03 | Bindu Reddy | Adding attributes and labels to structured data |
US7933900B2 (en) * | 2005-10-23 | 2011-04-26 | Google Inc. | Search over structured data |
US7991797B2 (en) | 2006-02-17 | 2011-08-02 | Google Inc. | ID persistence through normalization |
US8260785B2 (en) | 2006-02-17 | 2012-09-04 | Google Inc. | Automatic object reference identification and linking in a browseable fact repository |
US7617198B2 (en) * | 2006-02-09 | 2009-11-10 | Sap Ag | Generation of XML search profiles |
US8700568B2 (en) * | 2006-02-17 | 2014-04-15 | Google Inc. | Entity normalization via name normalization |
JP4677355B2 (en) * | 2006-03-03 | 2011-04-27 | キヤノン株式会社 | Web service apparatus and sequential process transfer method |
US9135238B2 (en) * | 2006-03-31 | 2015-09-15 | Google Inc. | Disambiguation of named entities |
US7933890B2 (en) * | 2006-03-31 | 2011-04-26 | Google Inc. | Propagating useful information among related web pages, such as web pages of a website |
US9633356B2 (en) | 2006-07-20 | 2017-04-25 | Aol Inc. | Targeted advertising for playlists based upon search queries |
US8301728B2 (en) * | 2006-07-21 | 2012-10-30 | Yahoo! Inc. | Technique for providing a reliable trust indicator to a webpage |
US8112703B2 (en) * | 2006-07-21 | 2012-02-07 | Yahoo! Inc. | Aggregate tag views of website information |
US7890499B1 (en) * | 2006-07-28 | 2011-02-15 | Google Inc. | Presentation of search results with common subject matters |
US8554869B2 (en) * | 2006-08-02 | 2013-10-08 | Yahoo! Inc. | Providing an interface to browse links or redirects to a particular webpage |
US20080040373A1 (en) * | 2006-08-10 | 2008-02-14 | Business Objects, S.A. | Apparatus and method for implementing match transforms in an enterprise information management system |
US8122026B1 (en) | 2006-10-20 | 2012-02-21 | Google Inc. | Finding and disambiguating references to entities on web pages |
US7987185B1 (en) | 2006-12-29 | 2011-07-26 | Google Inc. | Ranking custom search results |
US7844890B2 (en) * | 2006-12-29 | 2010-11-30 | Sap Ag | Document link management |
US8347202B1 (en) | 2007-03-14 | 2013-01-01 | Google Inc. | Determining geographic locations for place names in a fact repository |
US8234240B2 (en) * | 2007-04-26 | 2012-07-31 | Microsoft Corporation | Framework for providing metrics from any datasource |
US20080270469A1 (en) * | 2007-04-26 | 2008-10-30 | Microsoft Corporation | Business metrics aggregated by custom hierarchy |
US8239350B1 (en) | 2007-05-08 | 2012-08-07 | Google Inc. | Date ambiguity resolution |
US7966291B1 (en) | 2007-06-26 | 2011-06-21 | Google Inc. | Fact-based object merging |
US7970766B1 (en) | 2007-07-23 | 2011-06-28 | Google Inc. | Entity type assignment |
US8738643B1 (en) | 2007-08-02 | 2014-05-27 | Google Inc. | Learning synonymous object names from anchor texts |
WO2009051852A1 (en) * | 2007-10-18 | 2009-04-23 | The Nielsen Company (U.S.), Inc. | Methods and apparatus to create a media measurement reference database from a plurality of distributed sources |
US8140535B2 (en) * | 2007-10-23 | 2012-03-20 | International Business Machines Corporation | Ontology-based network search engine |
US8812435B1 (en) | 2007-11-16 | 2014-08-19 | Google Inc. | Learning objects and facts from documents |
US8126881B1 (en) | 2007-12-12 | 2012-02-28 | Vast.com, Inc. | Predictive conversion systems and methods |
GB2457267B (en) * | 2008-02-07 | 2010-04-07 | Yves Dassas | A method and system of indexing numerical data |
CN102027467A (en) * | 2008-05-27 | 2011-04-20 | 多基有限公司 | Non-linear representation of video data |
US8504582B2 (en) * | 2008-12-31 | 2013-08-06 | Ebay, Inc. | System and methods for unit of measurement conversion and search query expansion |
US8756229B2 (en) | 2009-06-26 | 2014-06-17 | Quantifind, Inc. | System and methods for units-based numeric information retrieval |
US8176074B2 (en) * | 2009-10-28 | 2012-05-08 | Sap Ag | Methods and systems for querying a tag database |
US20110218986A1 (en) * | 2010-03-06 | 2011-09-08 | David Joseph O'Hanlon | Search engine optimization economic purchasing method |
US8719722B2 (en) * | 2010-03-10 | 2014-05-06 | Hewlett-Packard Development Company, L.P. | Producing a representation of progress of a database process |
US8447766B2 (en) * | 2010-04-26 | 2013-05-21 | Hewlett-Packard Development Company, L.P. | Method and system for searching unstructured textual data for quantitative answers to queries |
US8577915B2 (en) * | 2010-09-10 | 2013-11-05 | Veveo, Inc. | Method of and system for conducting personalized federated search and presentation of results therefrom |
US9830379B2 (en) | 2010-11-29 | 2017-11-28 | Google Inc. | Name disambiguation using context terms |
US8365069B1 (en) | 2011-08-17 | 2013-01-29 | International Business Machines Corporation | Web content management based on timeliness metadata |
US9171039B2 (en) * | 2011-09-29 | 2015-10-27 | Sap Se | Query language based on business object model |
US9563674B2 (en) | 2012-08-20 | 2017-02-07 | Microsoft Technology Licensing, Llc | Data exploration user interface |
US10007946B1 (en) | 2013-03-07 | 2018-06-26 | Vast.com, Inc. | Systems, methods, and devices for measuring similarity of and generating recommendations for unique items |
US9104718B1 (en) | 2013-03-07 | 2015-08-11 | Vast.com, Inc. | Systems, methods, and devices for measuring similarity of and generating recommendations for unique items |
US9465873B1 (en) | 2013-03-07 | 2016-10-11 | Vast.com, Inc. | Systems, methods, and devices for identifying and presenting identifications of significant attributes of unique items |
US9830635B1 (en) | 2013-03-13 | 2017-11-28 | Vast.com, Inc. | Systems, methods, and devices for determining and displaying market relative position of unique items |
US9436727B1 (en) * | 2013-04-01 | 2016-09-06 | Ca, Inc. | Method for providing an integrated macro module |
US20150019584A1 (en) * | 2013-07-15 | 2015-01-15 | International Business Machines Corporation | Self-learning java database connectivity (jdbc) driver |
US10127596B1 (en) | 2013-12-10 | 2018-11-13 | Vast.com, Inc. | Systems, methods, and devices for generating recommendations of unique items |
US9317189B1 (en) * | 2013-12-20 | 2016-04-19 | Emc Corporation | Method to input content in a structured manner with real-time assistance and validation |
US10416871B2 (en) * | 2014-03-07 | 2019-09-17 | Microsoft Technology Licensing, Llc | Direct manipulation interface for data analysis |
US10225268B2 (en) * | 2015-04-20 | 2019-03-05 | Capital One Services, Llc | Systems and methods for automated retrieval, processing, and distribution of cyber-threat information |
CN116894049A (en) * | 2016-03-31 | 2023-10-17 | 施耐德电气美国股份有限公司 | Semantic search system and method for distributed data system |
US10268704B1 (en) | 2017-10-12 | 2019-04-23 | Vast.com, Inc. | Partitioned distributed database systems, devices, and methods |
US11475008B2 (en) * | 2020-04-28 | 2022-10-18 | Capital One Services, Llc | Systems and methods for monitoring user-defined metrics |
Family Cites Families (82)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4674043A (en) * | 1985-04-02 | 1987-06-16 | International Business Machines Corp. | Updating business chart data by editing the chart |
US5742738A (en) * | 1988-05-20 | 1998-04-21 | John R. Koza | Simultaneous evolution of the architecture of a multi-part program to solve a problem using architecture altering operations |
US5339392A (en) * | 1989-07-27 | 1994-08-16 | Risberg Jeffrey S | Apparatus and method for creation of a user definable video displayed document showing changes in real time data |
US5179702A (en) * | 1989-12-29 | 1993-01-12 | Supercomputer Systems Limited Partnership | System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel execution thread scheduling |
US5159662A (en) * | 1990-04-27 | 1992-10-27 | Ibm Corporation | System and method for building a computer-based rete pattern matching network |
US5423032A (en) * | 1991-10-31 | 1995-06-06 | International Business Machines Corporation | Method for extracting multi-word technical terms from text |
CA2134059C (en) * | 1993-10-29 | 2009-01-13 | Charles Simonyi | Method and system for generating a computer program |
US5548703A (en) * | 1993-11-30 | 1996-08-20 | International Business Machines Corporation | Navigation within a compound graphical object in a graphical user interface |
US6160549A (en) * | 1994-07-29 | 2000-12-12 | Oracle Corporation | Method and apparatus for generating reports using declarative tools |
US5603021A (en) * | 1994-09-02 | 1997-02-11 | Borland International, Inc. | Methods for composing formulas in an electronic spreadsheet system |
US5838906A (en) * | 1994-10-17 | 1998-11-17 | The Regents Of The University Of California | Distributed hypermedia method for automatically invoking external application providing interaction and display of embedded objects within a hypermedia document |
US5838965A (en) | 1994-11-10 | 1998-11-17 | Cadis, Inc. | Object oriented database management system |
US5758257A (en) * | 1994-11-29 | 1998-05-26 | Herz; Frederick | System and method for scheduling broadcast of and access to video programs and other data using customer profiles |
US6034676A (en) * | 1995-04-11 | 2000-03-07 | Data View, Inc. | System and method for measuring and processing tire depth data |
US5737592A (en) | 1995-06-19 | 1998-04-07 | International Business Machines Corporation | Accessing a relational database over the Internet using macro language files |
US5894311A (en) * | 1995-08-08 | 1999-04-13 | Jerry Jackson Associates Ltd. | Computer-based visual data evaluation |
US6026388A (en) * | 1995-08-16 | 2000-02-15 | Textwise, Llc | User interface and other enhancements for natural language information retrieval system and method |
US5822587A (en) * | 1995-10-20 | 1998-10-13 | Design Intelligence, Inc. | Method and system for implementing software objects |
US6625617B2 (en) | 1996-01-02 | 2003-09-23 | Timeline, Inc. | Modularized data retrieval method and apparatus with multiple source capability |
US6167409A (en) * | 1996-03-01 | 2000-12-26 | Enigma Information Systems Ltd. | Computer system and method for customizing context information sent with document fragments across a computer network |
US6014661A (en) | 1996-05-06 | 2000-01-11 | Ivee Development Ab | System and method for automatic analysis of data bases and for user-controlled dynamic querying |
US6026397A (en) * | 1996-05-22 | 2000-02-15 | Electronic Data Systems Corporation | Data analysis system and method |
US5913214A (en) * | 1996-05-30 | 1999-06-15 | Massachusetts Inst Technology | Data extraction from world wide web pages |
US6199080B1 (en) * | 1996-08-30 | 2001-03-06 | Sun Microsystems, Inc. | Method and apparatus for displaying information on a computer controlled display device |
US5956737A (en) * | 1996-09-09 | 1999-09-21 | Design Intelligence, Inc. | Design engine for fitting content to a medium |
US6199057B1 (en) | 1996-10-23 | 2001-03-06 | California Institute Of Technology | Bit-serial neuroprocessor architecture |
US6065026A (en) * | 1997-01-09 | 2000-05-16 | Document.Com, Inc. | Multi-user electronic document authoring system with prompted updating of shared language |
US5948113A (en) * | 1997-04-18 | 1999-09-07 | Microsoft Corporation | System and method for centrally handling runtime errors |
US6667747B1 (en) * | 1997-05-07 | 2003-12-23 | Unisys Corporation | Method and apparatus for providing a hyperlink within a computer program that access information outside of the computer program |
US5917485A (en) * | 1997-05-07 | 1999-06-29 | Unisys Corporation | User assistance for data processing systems |
US6173284B1 (en) * | 1997-05-20 | 2001-01-09 | University Of Charlotte City Of Charlotte | Systems, methods and computer program products for automatically monitoring police records for a crime profile |
US5920828A (en) * | 1997-06-02 | 1999-07-06 | Baker Hughes Incorporated | Quality control seismic data processing system |
US5974413A (en) * | 1997-07-03 | 1999-10-26 | Activeword Systems, Inc. | Semantic user interface |
US5950196A (en) * | 1997-07-25 | 1999-09-07 | Sovereign Hill Software, Inc. | Systems and methods for retrieving tabular data from textual sources |
US6199046B1 (en) * | 1997-07-29 | 2001-03-06 | Adsura Pty Ltd. | Method system and article of manufacture for performing real time currency conversion |
US6314562B1 (en) * | 1997-09-12 | 2001-11-06 | Microsoft Corporation | Method and system for anticipatory optimization of computer programs |
US6134563A (en) * | 1997-09-19 | 2000-10-17 | Modernsoft, Inc. | Creating and editing documents |
US6631402B1 (en) | 1997-09-26 | 2003-10-07 | Worldcom, Inc. | Integrated proxy interface for web based report requester tool set |
US6373504B1 (en) * | 1997-09-30 | 2002-04-16 | Sun Microsystems, Inc. | Local sorting of downloaded tables |
US6484149B1 (en) | 1997-10-10 | 2002-11-19 | Microsoft Corporation | Systems and methods for viewing product information, and methods for generating web pages |
US6243698B1 (en) * | 1997-10-17 | 2001-06-05 | Sagent Technology, Inc. | Extensible database retrieval and viewing architecture |
US6121924A (en) * | 1997-12-30 | 2000-09-19 | Navigation Technologies Corporation | Method and system for providing navigation systems with updated geographic data |
US6223189B1 (en) * | 1997-12-31 | 2001-04-24 | International Business Machines Corporation | System and method using metalanguage keywords to generate charts |
US5999944A (en) | 1998-02-27 | 1999-12-07 | Oracle Corporation | Method and apparatus for implementing dynamic VRML |
US6356920B1 (en) * | 1998-03-09 | 2002-03-12 | X-Aware, Inc | Dynamic, hierarchical data exchange system |
US6594653B2 (en) * | 1998-03-27 | 2003-07-15 | International Business Machines Corporation | Server integrated system and methods for processing precomputed views |
US6199063B1 (en) * | 1998-03-27 | 2001-03-06 | Red Brick Systems, Inc. | System and method for rewriting relational database queries |
US6240407B1 (en) * | 1998-04-29 | 2001-05-29 | International Business Machines Corp. | Method and apparatus for creating an index in a database system |
US6108662A (en) * | 1998-05-08 | 2000-08-22 | Allen-Bradley Company, Llc | System method and article of manufacture for integrated enterprise-wide control |
US6745384B1 (en) * | 1998-05-29 | 2004-06-01 | Microsoft Corporation | Anticipatory optimization with composite folding |
US6092036A (en) * | 1998-06-02 | 2000-07-18 | Davox Corporation | Multi-lingual data processing system and system and method for translating text used in computer software utilizing an embedded translator |
US6493717B1 (en) | 1998-06-16 | 2002-12-10 | Datafree, Inc. | System and method for managing database information |
US6912293B1 (en) * | 1998-06-26 | 2005-06-28 | Carl P. Korobkin | Photogrammetry engine for model construction |
US6460059B1 (en) | 1998-08-04 | 2002-10-01 | International Business Machines Corporation | Visual aid to simplify achieving correct cell interrelations in spreadsheets |
US6374274B1 (en) * | 1998-09-16 | 2002-04-16 | Health Informatics International, Inc. | Document conversion and network database system |
US6421656B1 (en) * | 1998-10-08 | 2002-07-16 | International Business Machines Corporation | Method and apparatus for creating structure indexes for a data base extender |
US6317750B1 (en) | 1998-10-26 | 2001-11-13 | Hyperion Solutions Corporation | Method and apparatus for accessing multidimensional data |
US6366915B1 (en) * | 1998-11-04 | 2002-04-02 | Micron Technology, Inc. | Method and system for efficiently retrieving information from multiple databases |
US6421822B1 (en) * | 1998-12-28 | 2002-07-16 | International Business Machines Corporation | Graphical user interface for developing test cases using a test object library |
US6349307B1 (en) * | 1998-12-28 | 2002-02-19 | U.S. Philips Corporation | Cooperative topical servers with automatic prefiltering and routing |
US6505246B1 (en) * | 1998-12-30 | 2003-01-07 | Candle Distributed Solutions, Inc. | User interface for system management applications |
US6370549B1 (en) * | 1999-01-04 | 2002-04-09 | Microsoft Corporation | Apparatus and method for searching for a file |
US6704739B2 (en) * | 1999-01-04 | 2004-03-09 | Adobe Systems Incorporated | Tagging data assets |
US6507856B1 (en) * | 1999-01-05 | 2003-01-14 | International Business Machines Corporation | Dynamic business process automation system using XML documents |
US6635089B1 (en) * | 1999-01-13 | 2003-10-21 | International Business Machines Corporation | Method for producing composite XML document object model trees using dynamic data retrievals |
US6370537B1 (en) * | 1999-01-14 | 2002-04-09 | Altoweb, Inc. | System and method for the manipulation and display of structured data |
US6418433B1 (en) * | 1999-01-28 | 2002-07-09 | International Business Machines Corporation | System and method for focussed web crawling |
US6957191B1 (en) | 1999-02-05 | 2005-10-18 | Babcock & Brown Lp | Automated financial scenario modeling and analysis tool having an intelligent graphical user interface |
US6591272B1 (en) * | 1999-02-25 | 2003-07-08 | Tricoron Networks, Inc. | Method and apparatus to make and transmit objects from a database on a server computer to a client computer |
US6470349B1 (en) * | 1999-03-11 | 2002-10-22 | Browz, Inc. | Server-side scripting language and programming tool |
US6206388B1 (en) * | 1999-05-14 | 2001-03-27 | Jan Wim Ouboter | Scooter board |
US6920608B1 (en) * | 1999-05-21 | 2005-07-19 | E Numerate Solutions, Inc. | Chart view for reusable data markup language |
US6260041B1 (en) | 1999-09-30 | 2001-07-10 | Netcurrents, Inc. | Apparatus and method of implementing fast internet real-time search technology (first) |
US6351755B1 (en) * | 1999-11-02 | 2002-02-26 | Alta Vista Company | System and method for associating an extensible set of data with documents downloaded by a web crawler |
US7222161B2 (en) | 1999-11-24 | 2007-05-22 | Yen Robert C | Method and system for facilitating usage of local content at client machine |
FR2806183B1 (en) * | 1999-12-01 | 2006-09-01 | Cartesis S A | DEVICE AND METHOD FOR INSTANT CONSOLIDATION, ENRICHMENT AND "REPORTING" OR BACKGROUND OF INFORMATION IN A MULTIDIMENSIONAL DATABASE |
US6823332B2 (en) | 1999-12-23 | 2004-11-23 | Larry L Russell | Information storage and retrieval device |
US6643661B2 (en) * | 2000-04-27 | 2003-11-04 | Brio Software, Inc. | Method and apparatus for implementing search and channel features in an enterprise-wide computer system |
US6721736B1 (en) * | 2000-11-15 | 2004-04-13 | Hewlett-Packard Development Company, L.P. | Methods, computer system, and computer program product for configuring a meta search engine |
US9600842B2 (en) * | 2001-01-24 | 2017-03-21 | E-Numerate Solutions, Inc. | RDX enhancement of system and method for implementing reusable data markup language (RDL) |
US20020198985A1 (en) | 2001-05-09 | 2002-12-26 | Noam Fraenkel | Post-deployment monitoring and analysis of server performance |
US7342983B2 (en) * | 2004-02-24 | 2008-03-11 | Agere Systems, Inc. | Apparatus and method for digitally filtering spurious transitions on a digital signal |
2001
- 2001-02-16: WO application PCT/US2001/005268, patent WO2001061568A2 (active, Application Filing)
- 2001-02-16: AU application AU2001241564A, patent AU2001241564A1 (not active, Abandoned)
- 2001-02-16: US application US09/784,205, patent US6886005B2 (not active, Expired - Lifetime)

2004
- 2004-11-04: US application US10/980,266, patent US7401076B2 (not active, Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
US6886005B2 (en) | 2005-04-26 |
AU2001241564A1 (en) | 2001-08-27 |
US20050086216A1 (en) | 2005-04-21 |
US20020073115A1 (en) | 2002-06-13 |
US7401076B2 (en) | 2008-07-15 |
WO2001061568A2 (en) | 2001-08-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7401076B2 (en) | | RDL search engine |
Cooley et al. | | Web mining: Information and pattern discovery on the world wide web |
US7039622B2 (en) | | Computer-implemented knowledge repository interface system and method |
Luk et al. | | A survey in indexing and searching XML documents |
US8700673B2 (en) | | Mechanisms for metadata search in enterprise applications |
US20070244867A1 (en) | | Knowledge management tool |
US20020042789A1 (en) | | Internet search engine with interactive search criteria construction |
US20020065857A1 (en) | | System and method for analysis and clustering of documents for search engine |
US20120059822A1 (en) | | Knowledge management tool |
CA2447322A1 (en) | | Methods and apparatus for real-time business visibility using persistent schema-less data storage |
CN102360367A (en) | | XBRL (Extensible Business Reporting Language) data search method and search engine |
Simitsis et al. | | Multidimensional content exploration |
Abramowicz et al. | | Filtering the Web to feed data warehouses |
López et al. | | An efficient and scalable search engine for models |
US8775443B2 (en) | | Ranking of business objects for search engines |
Bhowmick et al. | | Web data management: a warehouse approach |
CN101866340A (en) | | Online retrieval and intelligent analysis method and system of product information |
Butt et al. | | A taxonomy of semantic web data retrieval techniques |
Nekrestyanov et al. | | Text retrieval systems for the web |
Chen et al. | | WebReader: a mechanism for automating the search and collecting information from the World Wide Web |
CA2514165A1 (en) | | Metadata content management and searching system and method |
Cotter et al. | | Pro Full-Text Search in SQL Server 2008 |
Rajaram et al. | | Web caching in Semantic Web based multiple search engines |
Konopnicki et al. | | Bringing database functionality to the WWW |
Kosala et al. | | An overview of web mining |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
AK | Designated states | Kind code of ref document: C2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: C2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
COP | Corrected version of pamphlet | Free format text: PAGES 1-43, DESCRIPTION, REPLACED BY NEW PAGES 1-39; PAGES 44-62, CLAIMS, REPLACED BY NEW PAGES 40-56; PAGES 1/15-15/15, DRAWINGS, REPLACED BY NEW PAGES 1/15-15/15; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
122 | Ep: pct application non-entry in european phase | |
NENP | Non-entry into the national phase | Ref country code: JP |