WO2001026018A2 - Electronic shopping agent which is capable of operating with vendor sites which have disparate formats - Google Patents

Info

Publication number
WO2001026018A2
Authority
WO
WIPO (PCT)
Prior art keywords
information
program code
query
category
creating
Prior art date
Application number
PCT/US2000/041013
Other languages
French (fr)
Other versions
WO2001026018A3 (en)
WO2001026018A8 (en)
Inventor
Simon M. Lin
Original Assignee
U-Know Software Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by U-Know Software Corporation
Priority to AU14932/01A (AU1493201A)
Publication of WO2001026018A2
Publication of WO2001026018A8
Publication of WO2001026018A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99933: Query processing, i.e. searching
    • Y10S707/99934: Query formulation, input preparation, or translation

Definitions

  • This invention relates to electronic shopping agents or "bots" which operate over the Internet on behalf of a client to locate on-line vendors which provide goods and services of interest to the client.
  • the Internet and web-related technology have become widespread as personal computers have become more prevalent.
  • One of the fastest growing business sectors is electronic commerce, particularly, retail consumer shopping.
  • the Internet allows consumers to quickly locate goods and services of interest to them. In many cases, images of the goods can be viewed and orders placed directly over the web. The consumer may provide payment electronically via credit cards and the goods are then shipped to the consumer.
  • Comparative shopping using the Internet as a search and retrieval tool to locate and retrieve information and prices for comparable products is also a fast-growing area.
  • The keywords and category information are used to create an autonomous agent or "shopping bot" which scans over the Internet and locates related products from a set of online merchants.
  • The product items that are located and returned by the shopping bot are then presented to the consumer in a simple tabular form to enable comparison shopping.
  • Although the existing shopping tools can help users to do comparative shopping, all of the existing shopping bots share several limitations.
  • The first limitation is that the number of online merchants included in the comparison pool that a user can access and use for comparisons is small. This limitation can be mainly attributed to a historical fact, i.e. the evolution of the Internet.
  • the Internet was originally designed to operate with information coded in a very specific format called HyperText Markup Language (HTML).
  • HTML is a presentation language that uses codes embedded in the document to define how a particular segment of a document is presented on a display mechanism such as a Web browser.
  • Although HTML has a predefined and fixed format, it does not give any information about the meaning or semantics of the information which it is used to format.
  • Although Web browsers can read HTML and use the HTML codes to identify selected parts of the information, such as text and graphics, the browsers cannot use the HTML codes to extract information from the identified document parts.
  • Web pages often differ drastically depending on the taste, preference, and marketing needs of different designers and merchants.
  • Some shopping bots use mechanisms such as so-called "wrappers" or "information adapters."
  • One of these mechanisms is programmed to discover or "learn" about each product category in each merchant site. However, these mechanisms are very slow and it usually takes from hours to days to include a new merchant in the comparison pool.
  • the second limitation is the amount of information a consumer can get through existing shopping bots.
  • Most shopping bots provide consumers with only limited information such as price, a brief description of items, and a merchant link.
  • Price may not be the only criterion on which to base a shopping decision.
  • Other factors, such as shipping date, warranty information, creditability of a merchant, and service, often affect shopping decisions.
  • Although it is possible to add additional criteria to existing shopping bots, doing so means changing hundreds or thousands of "wrappers" or "information adapters" corresponding to different categories, subcategories, and merchant sites. Such a task is a very costly investment both in terms of time and human resources.
  • the third limitation on existing shopping bots is performance and accuracy of searched results.
  • A shopping bot uses real-time agents that automatically contact disparate web pages representing a vast variety of different categories and merchants, and retrieve and unify the information therein for display when a request for the information is made. Consequently, there is no need to create a "wrapper" or an "information adapter" for each category or each merchant because the same agent can retrieve and process information in various formats.
  • The shopping bot generates queries from keywords entered by a user and a database of URL information. Information returned by the queries is filtered, parsed and mapped to a standard format. The formatted information can then be displayed. Since the information is converted to the standard format in real time, online merchants and additional product categories can be added quickly and easily.
  • the database of URL information includes URLs specific to site directories at each merchant site so that queries can be easily generated by appending user-provided keywords.
  • information which is retrieved from merchant sites comprises information which would normally be displayed by a browser.
  • This information, which is typically in HTML or XML format, is parsed and filtered, and a hierarchical tree structure is used to map the information to desired categories before the information is displayed for comparison.
  • efficient caching and distributed algorithms are used to reduce consumer response time.
  • Figure 1 is a block schematic diagram of a networking arrangement which includes the Internet and connects several local computer systems to remote servers.
  • Figure 2 is a block schematic diagram which illustrates the major components of the inventive shopping system.
  • Figures 3A and 3B, when placed together, form a flowchart which illustrates the inventive process of generating queries in parallel from stored URLs, issuing the queries to merchant sites and processing the results in parallel.
  • Figure 4 is a flowchart which illustrates the process of extracting relevant information from query results.
  • Figure 5 is a schematic diagram of information in an example tree branch having three node levels.
  • Figure 6 is a schematic diagram of information in another example tree branch having three nodes.
  • Figure 7 is a schematic diagram of information in a further example tree branch also having three node levels.
  • FIG. 1 shows a commonly used network arrangement in which local computer systems 100 and 102 are connected by a local area network (LAN) 104 to a local server 106 which may access a plurality of remote servers 110-114 through the Internet 108.
  • Each remote server 110-114 may include World Wide Web sites (web sites) that each include a plurality of World Wide Web pages (web pages).
  • Each local computer system 100 and 102, of which system 100 is shown in more detail, may access the remote web sites with web browser software 101, such as Netscape Navigator™, available from Netscape Communications Corporation of Mountain View, California, or Internet Explorer, available from Microsoft Corporation, Redmond, Washington.
  • a web site has a home page which constitutes the highest level in the hierarchy.
  • the home page typically contains general information about the merchant, including graphic images and may contain other information such as a menu allowing a user who visits the web site to navigate to the other web pages that constitute the site.
  • the site may also include a site directory that is a web page that contains links to the other web pages.
  • a site directory includes a site search feature which is an integral search engine that accepts user input in the form of keywords and searches the site for matches.
  • Information of use to a consumer such as item descriptions and prices would typically be located on lower levels of the hierarchy. Ordering information, such as credit card information might be located at still another level of the site.
  • HTTP Hypertext Transfer Protocol
  • HTML Hypertext Markup Language
  • HTML allows software developers to specify graphical pointers on displayed web pages (commonly referred to as "hyperlinks") that point to other web pages resident on remote servers. Hyperlinks commonly are displayed as highlighted text or other graphical image on the web page. Selection of a hyperlink with a pointing device, such as a computer mouse, causes the local computer to download the HTML code of an associated web page from a remote server.
  • the location of the web page is expressed as a "uniform resource locator" (URL).
  • This method provides the remote server with the necessary information to upload the remote web page associated with the selected point to the local computer.
  • Web sites constructed by on-line merchants contain descriptions and/or pictures of goods or services for sale. Each site is typically arranged in a hierarchical branching tree structure having a plurality of nodes that contain one or more of the web pages in the site. The nodes in the site are considered to be on various levels of each branch in the tree structure. For example, a first node is considered to be on a lower level than a second node in the same branch if a web page in the first node includes the second node in its URL.
  • a third web node in the same branch is considered to be on a higher level than the second node if the URL of a web page in the second node includes the third node.
  • Web pages are accessed over the Internet via the browser software 101 and commonly are downloaded into a cache 103 of the local computer system 100.
  • the browser software 101 then uses the HTML code to position the various files on a display screen.
  • the inventive shopping system is illustrated in Figure 2 and consists of a search engine kernel (SEK) 206, one or more search engines 208-212 and one or more automatic learning objects (ALOs) 214-218.
  • The kernel 206 might be located in the local server (106, Figure 1) and interacts with one or more users 200-204 to receive a user's request for information and to send the results back to the requesting user.
  • The SEK 206 is platform independent so that it can run on any hardware platform and operating system.
  • the SEK 206 is written in the Java programming language licensed by Sun Microsystems, Inc. and can operate on any platform as long as a Java environment is operating on that platform.
  • the SEK 206 first analyzes a request generated by a user and then starts appropriate functions according to the user's request. For example, the SEK 206 might receive a request for information relating to books on a particular topic, such as "child spousal support" from one of users 200-204.
  • The request criteria can be entered by the user in a variety of manners. The criteria could be entered via an interactive interface in which the user answers a series of questions based on keywords. Alternatively, the user might use drop-down lists and menus to select a predefined category, such as "books", from a list or a graphic display of items.
  • The request would include at least a category, such as "book" or "auto", and selected keywords, such as "child", "spousal" and "support", or a phrase, "child spousal support", which can be broken down by a conventional parser into one or more keywords.
  • the SEK 206 generates one or more queries relating to the selected topic.
  • the SEK 206 uses an internal database 220 of URLs.
  • This database can include entries set up by participating vendors, or might be a manually downloaded table, or might be provided by the company which maintains the inventive shopping system.
  • the database 220 contains a plurality of URLs arranged by category. When a category is entered by a user, the URLs for that category are returned by the database 220.
  • the URLs stored in database 220 are not the URLs for a particular merchant site, but are instead the URLs for the site directory and, in particular, for the site search of each site. These URLs can be easily combined with the keywords entered by the user to form a query.
  • This latter query utilizes the internal search engine existing at most merchant sites to perform at least part of the search, thereby relieving the inventive system of having to construct a query which is specific to each merchant site. For example, continuing the above example, if the user has selected the category "books," the SEK 206 might retrieve the URLs for online book vendors such as amazon.com and kingbooks.com from the database 220.
  • These queries are advantageous because they use the built-in search engines in the respective web sites to perform the actual search, thus relieving the SEK from having to compose a customized search for each site and changing the customized search when the site changes.
  • Database 220 may contain several URLs for a single merchant site, where each URL is mapped to one or more keywords. When a user enters the keywords for the query, the keywords are used to select from the URLs for a merchant site, and then the final query is generated by combining the selected URL with the user-entered information.
  • the merchant site is a simple collection of web pages
  • a programmer must enter the site and navigate directly to a web page which displays an item.
  • the URL which identifies this latter web page is then entered into the database 220 and mapped to various keywords selected from the web page content.
  • When keyword information is entered, the information is used to select URLs from the database 220 which are mapped to matching keywords.
  • a keyword for a merchant site is selected based on the user-selected category and a query is formed using the URL which matches the keyword for the merchant site.
  • queries are generated for each merchant site having a URL in the selected category.
  • the SEK spawns a search engine thread, for example, search engine 208 and provides it with the query for a merchant site.
  • the search engine 208 issues the query to the appropriate merchant site.
  • the merchant site returns the results of the query.
  • the results from a merchant site are received by the search engine 208 that issued the query and forwarded to the SEK 206.
  • the SEK creates an ALO, for example ALO 214 for each merchant site.
  • the ALOs process the results to extract relevant information.
  • the extracted information is returned to the SEK 206 for display formatting. Finally, the formatted results are returned to the one of users 200-204 which made the initial request.
  • the entire process is illustrated in the flowchart shown in Figures 3A and 3B, which, when placed together, form the flowchart.
  • The process starts in step 300 and proceeds to step 302, where an attempt is made to retrieve a merchant URL from the SEK database 220 under the category which has been selected by the user.
  • In step 304, a check is made to determine whether the attempt was successful. If no additional URLs remain and the attempt is not successful, the process proceeds to step 306 and finishes.
  • If, in step 304, an additional URL is retrieved from database 220, then the process proceeds to step 308.
  • In step 308, a query is generated by concatenating the URL and the keywords entered by the user. As previously mentioned, in many cases, this query will be directed towards a site search engine located at the merchant site.
  • In step 310, a local search engine thread is created which issues the query to the corresponding web site. The process then proceeds back to step 302 to retrieve another URL and generate another search engine thread.
  • The operation of each search thread is illustrated in Figure 3B.
  • The process proceeds, via off-page connectors 312 and 316, to step 318, in which the SEK creates an automatic learning object to receive the search results from a merchant site.
  • The query results generated by the merchant site search engine are received by the local search engine and forwarded to the previously created ALO, as set forth in step 320.
  • Data from a site may not be returned all at once, but processing begins by the ALO as soon as data are received.
  • In step 322, the local search engine which issued the query then waits to determine whether additional results will be provided by the merchant site. If additional results are received, then step 320 is repeated. Alternatively, if there are no additional results as determined in step 322, the process finishes in step 324.
  • The SEK 206 keeps track of all ALOs 214-218 generated and performs the necessary synchronization between the ALOs 214-218.
  • the search process is conducted in parallel with each query being processed by a separate search engine thread and the corresponding results are processed by a separate ALO. This parallel processing greatly reduces the time required to obtain comparative results.
  • The procedure that receives a request message from a user and initiates a search engine is kept very short so that it takes a minimum amount of time.
  • Various ALOs in the SEK share the same resources, making interprocess communication more efficient and eliminating unnecessary mapping.
  • Based on the category, subcategory, product name, and other related information provided by the user, the SEK 206 initiates one or more search engines 208-212 that examine all merchant sites that may have the product information requested by the user. Each search engine takes a query generated by the SEK and goes to the associated web site to retrieve the desired information. In general, the information retrieved from a web site by the aforementioned queries is intended for display by a browser. Often the information will be encoded using "markup languages" such as HTML or XML or other presentation languages. HTML is a simple "markup language" that is suited for the display of small and reasonably simple documents which are commonly transmitted on the World Wide Web. Another markup language, called the Extensible Markup Language (XML), is often used for more complicated documents that require capabilities beyond those provided by HTML. XML is more extensible, allows for validation and defines how URLs can be used to identify component parts of XML documents.
  • HTML and XML documents are composed of a series of entities or objects. Each entity can contain one or more logical elements and each element can have certain attributes or properties that describe the way in which it is to be processed. Both languages provide a formal syntax for describing the relationships between the entities, elements and attributes that make up a document. This syntax tells a computer how to recognize the component parts of each document.
  • HTML and XML use paired markup tags to identify document components. The markup tags are easily recognized codes that are added to a document to identify each document component. In particular, the start and end of each logical element is clearly identified by entry of a start-tag before the element and an end-tag after the element. For example, the tags <to> and </to> could be used to identify the "recipient" element of a document in the following manner:
  • The arrangement of tags is hierarchical in that some tagged document portions can contain other tagged document portions.
  • Users need to know how the markup tags are delimited from normal text and the relationship between the various elements. For example, in some XML systems, elements and their attributes are entered between matched pairs of angle brackets (<...>), while entity references start with an ampersand and end with a semicolon (&...;).
  • In HTML, the set of markup tags is fixed and relatively small.
  • In XML, the form and composition of markup tags can be defined by users, but are often defined by a trade association or similar body in order to provide interoperability between users.
  • XML tag sets are based on the logical structure of the document and, consequently, they are easy to read and understand.
  • XML can represent a greater variety of documents and, since different documents have different parts or components, it is not practical to predefine tags for all elements of all documents. Instead, documents can be classified into "types" which have certain elements.
  • A document type definition indicates which elements to expect in a document type and indicates whether each element found in the document is not allowed, allowed and required, or allowed but not required.
  • DTD document type definition
  • By defining the role of each document element in a DTD it is possible to check that each element occurs in a valid place within the document. For example, an XML DTD allows a check to be made that a third-level heading is not entered without the existence of a second-level heading.
  • Ideally, tags in the information returned from the aforementioned queries would identify sections of the document which are relevant to the inventive shopping bot, such as item description, price, etc.
  • In practice, however, the information returned is coded for display on a browser and must be processed further to extract the desired shopping information.
  • An example of information returned from a query is given below. This information is coded using HTML codes and is intended for use by a browser such as the Netscape or Internet Explorer browsers mentioned previously.
  • Your search result is sorted by publication date with most recent one first.
  • This information must be processed in order to extract the relevant information, a procedure performed by an ALO.
  • A separate ALO thread is spawned for each result set received by the SEK 206 in order to reduce processing time.
  • An illustrative processing routine is shown in the flowchart of Figure 4. The routine starts in step 400 and proceeds to step 402.
  • A filtering mechanism removes formatting information and attributes.
  • This filter can be implemented with a parsing mechanism which identifies the tags.
  • Such a parsing mechanism is well-known for presentation languages, such as HTML and XML.
  • The identified tags are compared to a predetermined tag list and the formatting tags, such as <html>, <head>, <title>, <font>, <br>, etc., are removed.
  • Formatting attributes in the tags are also removed. In the case of the above example, the remaining information will be:
  • Your search result is sorted by publication date with most recent one first.
  • In step 404, the remaining information is parsed into a data tree.
  • In HTML and XML, the language is naturally hierarchical so that this parsing is relatively easy.
  • The example given immediately above splits into three separate trees which are comprised of hierarchical nodes indicated by the indented sections (the indents were added to emphasize the sections). These trees are illustrated in Figures 5, 6 and 7 and are delineated by the <table></table> tags.
  • Figure 5 illustrates the first tree, comprised of a first node 500 consisting of information contained between the <table></table> tags, a second node 502 delineated by the <tr></tr> tags and two third nodes, 504 and 506, delineated by the <td></td> tags.
  • Figure 6 illustrates the second tree, comprised of a first node 600 consisting of information contained between the <table></table> tags, a second node 602 delineated by the <tr></tr> tags and a third node 604 delineated by the <td></td> tags.
  • Figure 7 illustrates the third tree, comprised of a first node 700 consisting of information contained between the <table></table> tags, three second nodes 702, 704 and 706 delineated by the <tr></tr> tags and six third nodes 708-718 delineated by the <td></td> tags.
  • the nodes in each tree are examined to determine whether they contain relevant information. These nodes are examined in sequence, level-by-level, as illustrated by the arrows in the figures to detect a "complete" node level which contains some or all of the desired information. For example in Figure 5, node 500 is first examined. Since it is empty, node 502 at the second level is next examined.
  • Nodes 504 and 506 at the third level are then examined. Nodes 504 and 506 contain information and this information is examined as discussed below. However, since the information contained in nodes 504 and 506 is not relevant information, such as the title, author or price of a book in the example given above, the entire tree, including nodes 500 and 502, is removed from consideration. If no relevant information is found as determined in step 408 (Figure 4), then the process proceeds back to step 406.
  • In step 406, the tree illustrated in Figure 6 is examined.
  • Node 600 is first examined. Since it is empty, node 602 is next examined. It is also empty so that node 604 is examined. Node 604 contains information and this information is examined as discussed below. However, since node 604 does not contain relevant information, the entire tree, including nodes 600 and 602 is removed from consideration. The process then continues from step 408 back to step 406.
  • the tree illustrated in Figure 7 is examined. Node 700 is first examined. Since it is empty, nodes 702, 704 and 706 are examined. These nodes are also empty so that nodes 708-718 are examined. All of these nodes contain information which is examined. Nodes 710, 714 and 718 contain relevant information so that, in step 410, this information is extracted by an extraction mechanism and mapped to corresponding buffers. In the above example, the result would be:
  • In step 412, the buffered information is returned to the SEK, where it is formatted for display on the user's computer.
  • The process then ends in step 414.
  • the information is checked against a rules set which is specific to the category which is being examined.
  • Each rule in the set defines the character of one or more fields. For example, one rule might specify that a node is complete if it contains all relevant fields. In the aforementioned example, these fields are title, author, description and price. If one field is missing, another rule might require further checking to determine whether the node is complete and information from that node should be extracted. For example, if a price field and a name field are found in a node, the node is very likely to contain relevant data.
  • In each node, relevant information is located by searching for keywords, symbols or data types which are specific to each category. Then, words in the vicinity of these keywords can be examined to find relevant information. For example, each node may be examined for character strings such as "name", "title", "description", "price", or "author." If any of these keywords are found, then the subsequent characters will be considered relevant information.
  • Another rule might define a price field as the smallest number encountered in the node with, or without, a preceding "$" symbol. A price field may also be defined as a number with or without a preceding "PRICE" keyword.
  • the rules can be generated in a variety of ways. In one embodiment, rules are generated by a programmer for each merchant site and maintained by the programmer.
  • the ALO which is processing the received data will look for a rule in the rule set which matches the data and use the rule to extract the data.
  • the search results are parsed to tree nodes as described above and the ALO will check each node for keywords which are selected from a keyword set which depends on the user-selected category. The information following these predetermined keywords is then extracted.
  • separate rules are not needed for each site and the rules do not need to be maintained by a programmer.
  • each ALO is very flexible and generic. This flexibility allows different merchants to be added into the search and comparison pool quickly and easily. It also allows different information to be retrieved from a site depending on the customer or marketing needs.
  • the system may be implemented as a computer program product for use with a computer system.
  • Such implementation may include a series of computer instructions either fixed on a tangible medium, such as a computer-readable medium (e.g., a diskette, a CD or non-volatile storage), or transmittable to a computer system via a modem or other interface device, such as a network.
  • the series of computer instructions embodies all or part of the functionality previously described herein with respect to the system.
  • Those skilled in the art should appreciate that such computer instructions can be written in a number of programming languages for use with many computer architectures or operating systems.
  • Such instructions may be stored in any memory device, such as semiconductor, magnetic, optical or other memory devices, and may be transmitted using any communications technology, such as optical, infrared, microwave, or other transmission technologies. It is expected that such a computer program product may be distributed as a removable media with accompanying printed or electronic documentation (e.g., shrink wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the network (e.g., the Internet or World Wide Web).

Abstract

A shopping bot uses real-time agents that automatically contact disparate web pages representing a vast variety of different categories and merchants, and retrieve and unify the information therein for display when a request for the information is made. Consequently, there is no need to create a 'wrapper' or an 'information adapter' for each category or each merchant because the same agent can retrieve and process information in various formats. In particular, the shopping bot generates queries from keywords entered by a user and a database of URL information. Information returned by the queries is filtered, parsed and mapped to a standard format. The formatted information can then be displayed. Since the information is converted to the standard format in real time, the invention allows online merchants and additional product categories to be added quickly and easily. Further, information about a product item can be easily enlarged as market needs increase without changing a database of codes one-by-one. In accordance with one embodiment, the database of URL information includes URLs specific to site directories at each merchant site so that queries can be easily generated by appending user-provided keywords.

Description

ELECTRONIC SHOPPING AGENT WHICH IS CAPABLE OF OPERATING WITH VENDOR SITES WHICH HAVE DISPARATE FORMATS
FIELD OF THE INVENTION
This invention relates to electronic shopping agents or "bots" which operate over the Internet on behalf of a client to locate on-line vendors which provide goods and services of interest to the client.
BACKGROUND OF THE INVENTION
The Internet and web-related technology have become widespread as personal computers have become more prevalent. One of the fastest growing business sectors is electronic commerce, particularly retail consumer shopping. The Internet allows consumers to quickly locate goods and services of interest to them. In many cases, images of the goods can be viewed and orders placed directly over the web. The consumer may provide payment electronically via credit cards and the goods are then shipped to the consumer. Comparative shopping using the Internet as a search and retrieval tool to locate and retrieve information and prices for comparable products is also a fast-growing area. There are already several comparative shopping tools that are available on the Internet, such as Junglee and Jango, for example. These shopping tools accept keywords and category information as inputs from consumers. The keywords and category information are used to create an autonomous agent or "shopping bot" which scans over the Internet and locates related products from a set of online merchants. The product items that are located by the shopping bot are then returned and presented to the consumer in a simple tabular form to enable comparison shopping.
While the existing shopping tools can help users to do comparative shopping, all of the existing shopping bots share several limitations. The first limitation is that the number of online merchants included in the comparison pool that a user can access and use for comparisons is small. This limitation can be mainly attributed to a historical fact, i.e. the evolution of the Internet. The Internet was originally designed to operate with information coded in a very specific format called HyperText Markup Language (HTML). HTML is a presentation language that uses codes embedded in the document to define how a particular segment of a document is presented on a display mechanism such as a Web browser. Although HTML has a predefined and fixed format, it does not give any information about the meaning or semantics of the information which it is used to format. Therefore, although Web browsers can read HTML and use the HTML codes to identify selected parts of the information, such as text and graphics, the browsers cannot use the HTML codes to extract information from the identified document parts. In addition, web pages often differ drastically depending on the taste, preference, and marketing needs of different designers and merchants. As a result, it is difficult and tedious for a search engine to extract specific information, such as item price, from a wide variety of different HTML-coded web pages. In order to overcome this problem, some shopping bots use mechanisms such as so-called "wrappers" or "information adapters." One of these mechanisms is programmed to discover or "learn" about each product category in each merchant site. However, these mechanisms are very slow and it usually takes from hours to days to include a new merchant in the comparison pool. Furthermore, if the merchant changes its web page formats, it will also take a long time to change the mechanism used for that merchant site in order to make the search engine continue to work.
The continual flux of the Internet requires the shopping bot providers to employ many programmers to design and maintain their services and the number of merchant sites covered is necessarily small.
The second limitation is the amount of information a consumer can get through existing shopping bots. Currently, most shopping bots provide consumers with only limited information such as price, a brief description of items, and a merchant link. However, from a consumer's point of view, price may not be the only criterion on which to base a shopping decision. Other factors, such as shipping date, warranty information, credibility of a merchant, and service, often affect shopping decisions. Although it is possible to add additional criteria to existing shopping bots, it means changing hundreds or thousands of "wrappers" or "information adapters" corresponding to different categories, subcategories, and merchant sites. Such a task is a very costly investment both in terms of time and human resources.
The third limitation on existing shopping bots is performance and accuracy of searched results. Most existing shopping bots are very slow and take minutes to generate search results. Consequently, many existing systems store information retrieved from merchant sites in a local database so that searches are greatly accelerated. However, the local databases are only periodically updated by contacting the merchant sites. Therefore, the results of the search are often out-of-date and not very accurate or not related to the actual situation at the merchant site.
Therefore, there is a need for a shopping bot which can operate with a variety of different merchant site formats and which can quickly adapt to new formats or changes to existing merchant sites.
There is further need for a shopping bot which can be easily and quickly modified to retrieve and display new and different information from that currently being displayed.
There is a further need for a shopping bot which can quickly provide accurate and timely information to consumers.
SUMMARY OF THE INVENTION
In accordance with the principles of the invention, a shopping bot uses real-time agents that automatically contact disparate web pages representing a vast variety of different categories and merchants, and retrieve and unify the information therein for display when a request for the information is made. Consequently, there is no need to create a "wrapper" or an "information adapter" for each category or each merchant because the same agent can retrieve and process information in various formats. In particular, the shopping bot generates queries from keywords entered by a user and a database of URL information. Information returned by the queries is filtered, parsed and mapped to a standard format. The formatted information can then be displayed. Since the information is converted to the standard format in real time, the invention allows online merchants and additional product categories to be added quickly and easily. Further, information about a product item can be easily enlarged as market needs increase without changing a database of codes one-by-one. In accordance with one embodiment, the database of URL information includes URLs specific to site directories at each merchant site so that queries can be easily generated by appending user-provided keywords.
In accordance with another embodiment, information which is retrieved from merchant sites comprises information which would normally be displayed by a browser. This information which is typically in HTML or XML format is parsed and filtered and a hierarchical tree structure is used to map the information to desired categories before displaying the information for comparison.
In accordance with still another embodiment, efficient caching and distributed algorithms are used to reduce consumer response time.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and further advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings in which:
Figure 1 is a block schematic diagram of a networking arrangement which includes the Internet and connects several local computer systems to remote servers.
Figure 2 is a block schematic diagram which illustrates the major components of the inventive shopping system.
Figures 3A and 3B, when placed together, form a flowchart which illustrates the inventive process of generating queries in parallel from stored URLs, issuing the queries to merchant sites and processing the results in parallel.
Figure 4 is a flowchart which illustrates the process of extracting relevant information from query results.
Figure 5 is a schematic diagram of information in an example tree branch having three node levels.
Figure 6 is a schematic diagram of information in another example tree branch having three nodes.
Figure 7 is a schematic diagram of information in a further example tree branch also having three nodes.
DETAILED DESCRIPTION
Figure 1 shows a commonly used network arrangement in which local computer systems 100 and 102 are connected by a local area network (LAN) 104 to a local server 106 which may access a plurality of remote servers 110-114 through the Internet 108. Each remote server 110-114 may include World Wide Web sites (web sites) that each include a plurality of World Wide Web pages (web pages). Each local computer system 100 and 102, of which system 100 is shown in more detail, may access the remote web sites with web browser software 101 , such as Netscape Navigator™, available from Netscape Communications Corporation of Mountain View, California or Internet Explorer available from Microsoft Corporation, Redmond, Washington.
A web site has a home page which constitutes the highest level in the hierarchy. The home page typically contains general information about the merchant, including graphic images and may contain other information such as a menu allowing a user who visits the web site to navigate to the other web pages that constitute the site. The site may also include a site directory that is a web page that contains links to the other web pages. Often a site directory includes a site search feature which is an integral search engine that accepts user input in the form of keywords and searches the site for matches. Information of use to a consumer, such as item descriptions and prices would typically be located on lower levels of the hierarchy. Ordering information, such as credit card information might be located at still another level of the site.
The World Wide Web is actually a collection of servers on the Internet 108 that utilize the Hypertext Transfer Protocol (HTTP). HTTP is a known application protocol that provides users with access to files (which can be in different formats, such as text, graphics, images, sound, and video) using a standard page description language known as Hypertext Markup Language (HTML). Among a number of basic document formatting functions, HTML allows software developers to specify graphical pointers on displayed web pages (commonly referred to as "hyperlinks") that point to other web pages resident on remote servers. Hyperlinks commonly are displayed as highlighted text or another graphical image on the web page. Selection of a hyperlink with a pointing device, such as a computer mouse, causes the local computer to download the HTML code of an associated web page from a remote server. The location of the web page is expressed as a "uniform resource locator" (URL). This method provides the remote server with the necessary information to upload the remote web page associated with the selected point to the local computer.
Web sites constructed by on-line merchants contain descriptions and/or pictures of goods or services for sale. Each site is typically arranged in a hierarchical branching tree structure having a plurality of nodes that contain one or more of the web pages in the site. Each of the nodes in the site is considered to be on one of various levels of a branch in the tree structure. For example, a first node is considered to be on a lower level than a second node in the same branch if a web page in the first node includes the second node in its URL. Conversely, a third web node in the same branch is considered to be on a higher level than the second node if the URL of a web page in the second node includes the third node. Web pages are accessed over the Internet, via the browser software 101, and commonly are downloaded into a cache 103 of the local computer system 100.
The browser software 101 then uses the HTML code to position the various files on a display screen.
The inventive shopping system is illustrated in Figure 2 and consists of a search engine kernel (SEK) 206, one or more search engines 208-212 and one or more automatic learning objects (ALOs) 214-218. The kernel 206 might be located in the local server (106, Figure 1) and interacts with one or more users 200-204 to receive a user's request for information and to send the results back to the requesting user. Preferably, the SEK 206 is platform independent so that it can run on any hardware platform and operating system. In one embodiment, the SEK 206 is written in the Java programming language licensed by Sun Microsystems, Inc. and can operate on any platform as long as a Java environment is operating on that platform. The SEK 206 first analyzes a request generated by a user and then starts appropriate functions according to the user's request. For example, the SEK 206 might receive a request for information relating to books on a particular topic, such as "child spousal support" from one of users 200-204. The request criteria can be entered by the user in a variety of ways. The criteria could be entered via an interactive interface in which the user answers a series of questions based on keywords. Alternatively, the user might use drop-down lists and menus to select a predefined category, such as "books" from a list or a graphic display of items. Generally, the request would include at least a category, such as "book" or "auto" and selected keywords, such as "child", "spousal" and "support" or a phrase "child spousal support" which can be broken down by a conventional parser into one or more keywords. In response, the SEK 206 generates one or more queries relating to the selected topic. In order to generate these queries, the SEK 206 uses an internal database 220 of URLs. This database can include entries set up by participating vendors, or might be a manually downloaded table, or provided by the company which maintains the inventive shopping system.
The database 220 contains a plurality of URLs arranged by category. When a category is entered by a user, the URLs for that category are returned by the database 220.
In accordance with an important aspect of the invention, the URLs stored in database 220 are not the URLs for a particular merchant site, but are instead the URLs for the site directory and, in particular, for the site search of each site. These URLs can be easily combined with the keywords entered by the user to form a query. This latter query utilizes the internal search engine existing at most merchant sites to perform at least part of the search, thereby relieving the inventive system of having to construct a query which is specific to each merchant site. For example, continuing the above example, if the user has selected the category "books," the SEK 206 might retrieve the URLs for online book vendors such as amazon.com and kingbooks.com from the database 220. However, the URLs actually retrieved from the database 220 are the URLs for the search engines at these sites:
http://www.amazon.com/exec/obidos/external-search/?
and
http://www.kingbooks.com/scripts/search3.exe?by=keywords&
These URLs are in a form which can readily be combined with the keywords entered by the user in order to form a query such as:
http://www.amazon.com/exec/obidos/external-search/?keyword=child+spousal+support
and
http://www.kingbooks.com/scripts/search3.exe?by=keywords&keywords=child+spousal+support
These queries are advantageous because they use the built-in search engines in the respective web sites to perform the actual search, thus relieving the SEK from having to compose a customized search for each site and changing the customized search when the site changes.
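The query construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `SEARCH_URLS` is a hypothetical stand-in for database 220, the split of each entry into a URL prefix and a keyword parameter name is an assumption made for the example, and the function name `build_queries` does not appear in the original.

```python
from urllib.parse import quote_plus

# Hypothetical stand-in for database 220: for each category, a list of
# (site-search URL prefix, keyword parameter name) pairs.
SEARCH_URLS = {
    "books": [
        ("http://www.amazon.com/exec/obidos/external-search/?", "keyword"),
        ("http://www.kingbooks.com/scripts/search3.exe?by=keywords&", "keywords"),
    ],
}


def build_queries(category, keywords):
    """Append the URL-encoded keywords to every search URL stored for a category."""
    # quote_plus joins the words with "+" and escapes unsafe characters.
    encoded = quote_plus(" ".join(keywords))
    return [
        f"{prefix}{param}={encoded}"
        for prefix, param in SEARCH_URLS.get(category, [])
    ]


queries = build_queries("books", ["child", "spousal", "support"])
```

Because only the stored prefix is site-specific, adding a merchant in this sketch amounts to adding one entry to the table rather than writing new query code.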
However, some merchant sites do not have an internal search engine. Instead, some sites have an on-line catalog while other sites are simple web pages. In the case where the merchant site has a catalog, it may be necessary for a programmer to enter the site and navigate to a section of the catalog where user selections can be made. The URLs which correspond to these catalog sections can then often be combined with the user-entered keywords to generate the required query. Database 220 may contain several URLs for a single merchant site where each URL is mapped to one or more keywords. When a user enters the keywords for the query, the keywords are used to select from the URLs for a merchant site and then the final query is generated by combining the selected URL with the user-entered information. In the case where the merchant site is a simple collection of web pages, a programmer must enter the site and navigate directly to a web page which displays an item. The URL which identifies this latter web page is then entered into the database 220 and mapped to various keywords selected from the web page content. Subsequently, when a user enters keyword information, the information is used to select URLs from the database 220 which are mapped to matching keywords. This latter approach is not as advantageous as using an internal site search because the mapping must be changed if the merchant site is changed.
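For sites without an internal search engine, the keyword-to-URL mapping described above might look like the following sketch. The merchant name, the URLs under `merchant.example`, the keyword sets and the function name `select_urls` are all hypothetical and chosen only for illustration.

```python
# Hypothetical entries for one merchant in database 220: each catalog or
# page URL is mapped to the keywords it is known to cover.
CATALOG_URLS = {
    "examplebooks": [
        ("http://merchant.example/catalog/family-law", {"support", "divorce", "custody"}),
        ("http://merchant.example/catalog/cooking", {"recipe", "baking"}),
    ],
}


def select_urls(merchant, user_keywords):
    """Return the stored URLs whose mapped keywords overlap the user's keywords."""
    wanted = {keyword.lower() for keyword in user_keywords}
    return [
        url
        for url, mapped_keywords in CATALOG_URLS.get(merchant, [])
        if mapped_keywords & wanted
    ]


selected = select_urls("examplebooks", ["Child", "Spousal", "Support"])
```

The sketch also shows why this approach is more fragile than a site search: every entry in the table has to be revisited when the merchant reorganizes its pages.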
As previously mentioned, a keyword for a merchant site is selected based on the user-selected category and a query is formed using the URL which matches the keyword for the merchant site. In this manner queries are generated for each merchant site having a URL in the selected category. When each query is generated, the SEK spawns a search engine thread, for example, search engine 208 and provides it with the query for a merchant site. The search engine 208 issues the query to the appropriate merchant site. In response, the merchant site returns the results of the query. The results from a merchant site are received by the search engine 208 that issued the query and forwarded to the SEK 206. When results are received, the SEK creates an ALO, for example ALO 214 for each merchant site. The ALOs process the results to extract relevant information. The extracted information is returned to the SEK 206 for display formatting. Finally, the formatted results are returned to the one of users 200-204 which made the initial request. The entire process is illustrated in the flowchart shown in Figures 3A and 3B, which, when placed together, form the flowchart.
The process starts in step 300 and proceeds to step 302 where an attempt is made to retrieve a merchant URL from the SEK database 220 under the category which has been selected by the user. In step 304, a check is made to determine whether the attempt was successful. If no additional URLs remain and the attempt is not successful, the process proceeds to step 306 and finishes.
Alternatively, if, in step 304, an additional URL is retrieved from database 220, then the process proceeds to step 308. In step 308, a query is generated by concatenating the URL and the keywords entered by the user to generate a query. As previously mentioned, in many cases, this query will be directed towards a site search engine located at the merchant site. Next, in step 310, a local search engine thread is created which issues the query to the corresponding web site. The process then proceeds back to step 302 to retrieve another URL and generate another search engine thread.
The operation of each search thread is illustrated in Figure 3B. In particular, the process proceeds, via off-page connectors 312 and 316, to step 318 in which the SEK creates an automatic learning object to receive the search results from a merchant site. The query results generated by the merchant site search engine are received by the search engine and forwarded to the previously created ALO as set forth in step 320. Data from a site may not be returned all at once, but the ALO begins processing as soon as data are received.
In step 322, the local search engine which issued the query then waits to determine whether additional results will be provided by the merchant site. If additional results are received, then step 320 is repeated. Alternatively, if there are no additional results as determined in step 322, the process finishes in step 324.
The SEK 206 keeps track of all ALOs 214-218 generated and performs the necessary synchronization between the ALOs 214-218. Advantageously, the search process is conducted in parallel, with each query being processed by a separate search engine thread and the corresponding results being processed by a separate ALO. This parallel processing greatly reduces the time required to obtain comparative results. In addition, the procedure that receives a request message from a user and initiates a search engine is kept very short so that it takes a minimum amount of time. Various ALOs in the SEK share the same resources, making inter-process communication more efficient and eliminating unnecessary mapping.
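The parallel arrangement described above, one search thread per merchant query with the results handed to a per-site extraction step, can be approximated as follows. This sketch uses a thread pool from the Python standard library rather than the explicitly spawned threads of the text, and `fake_fetch` and `fake_extract` are placeholder functions standing in for the network request and the ALO processing.

```python
from concurrent.futures import ThreadPoolExecutor


def run_searches(queries, fetch, extract):
    """Issue every query on its own worker thread and extract each result set."""

    def search_one(query):
        # One search-engine step (fetch) followed by one ALO-like step (extract).
        return extract(fetch(query))

    # One worker per query, so that all merchant sites are contacted in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        # pool.map returns results in the same order as the queries.
        return list(pool.map(search_one, queries))


def fake_fetch(query):
    """Placeholder for issuing the query to a merchant site."""
    return f"page for {query}"


def fake_extract(page):
    """Placeholder for the ALO extraction step."""
    return page.split()[-1]


results = run_searches(["q1", "q2", "q3"], fake_fetch, fake_extract)
```

A real system would also need timeouts and error handling for sites that do not respond, which the sketch omits.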
Based on the category, subcategory, product name, and other related information provided by the user, the SEK 206 initiates one or more search engines 208-212 that examine all merchant sites that may have the product information requested by the user. Each search engine takes a query generated by the SEK and goes to the associated web site to retrieve the desired information. In general, the information retrieved from a web site by the aforementioned queries is intended for display by a browser. Often the information will be encoded using "markup languages" such as HTML or XML or other presentation languages. HTML is a simple "markup language" that is suited for the display of small and reasonably simple documents which are commonly transmitted on the World Wide Web. Another markup language called the Extensible Markup Language (XML) is often used for more complicated documents that require capabilities beyond those provided by HTML. XML is more extensible, allows for validation and defines how URLs can be used to identify component parts of XML documents.
HTML and XML documents are composed of a series of entities or objects. Each entity can contain one or more logical elements and each element can have certain attributes or properties that describe the way in which it is to be processed. Both languages provide a formal syntax for describing the relationships between the entities, elements and attributes that make up a document. This syntax tells a computer how to recognize the component parts of each document. HTML and XML use paired markup tags to identify document components. The markup tags are easily recognized codes that are added to a document to identify each document component. In particular, the start and end of each logical element is clearly identified by entry of a start-tag before the element and an end-tag after the element. For example, the tags <to> and </to> could be used to identify the "recipient" element of a document in the following manner:
document text ... <to>Recipient</to> ... document text.
The arrangement of tags is hierarchical in that some tagged document portions can contain other tagged document portions. In order to operate with a set of tags, users need to know how the markup tags are delimited from normal text and the relationship between the various elements. For example, in some XML systems, elements and their attributes are entered between matched pairs of angle brackets (<...>), while element references start with an ampersand and end with a semicolon (&...;). In HTML the set of markup tags is fixed and relatively small. In XML documents, the form and composition of markup tags can be defined by users, but are often defined by a trade association or similar body in order to provide interoperability between users. XML tag sets are based on the logical structure of the document and, consequently, they are easy to read and understand.
XML can represent a greater variety of documents and, since different documents have different parts or components, it is not practical to predefine tags for all elements of all documents. Instead, documents can be classified into "types" which have certain elements. A document type definition (DTD) indicates which elements to expect in a document type and indicates whether each element found in the document is not allowed, allowed and required, or allowed but not required. By defining the role of each document element in a DTD, it is possible to check that each element occurs in a valid place within the document. For example, an XML DTD allows a check to be made that a third-level heading is not entered without the existence of a second-level heading.
It would be convenient if the tags in the information returned from the aforementioned queries identified sections of the document which were relevant to the inventive shopping bot, such as item description, price, etc. However, in most cases, the information returned is coded for display on a browser and must be processed further to extract the desired shopping information. An example of information returned from a query is given below. This information is coded using HTML codes and is intended for use by a browser such as the Netscape or Internet Explorer browsers mentioned previously.
<html> <head>
<title> Books Found by Search</title> </head>
<body bgcolor="#FFFFFF"> <div align="center"><center>
<table border="0" width="750"> <tr>
<td width="375"><font color="#000000" size="5"><strong> Books Found by Search:<br> <br>
</strong></font><font color="#000000" size="3">
Your search brought up 1 titles. <br> Click on a title for more information. </font></td> <td valign="top" width="375"><font size="4"> <img src="/images/future.gif" width="20" height="20">
&nbsp;This icon represents new and upcoming releases. <br> </font>For current availability info, please click on the title. </td> </tr> </table>
<table border="0" width="750"> <tr><td width="750">
Your search result is sorted by publication date with most recent one first.
</td></tr></table></center></div> <!--ISBN:0944058316-->
<div align="center"><center> <table border="0" width = "750"> <tr>
<td width="40"><font size="4"><strong>1 </strong></font></td> <td width="710" colspan="5"><a name="0370994" href="/scripts/detail4.exe?/results/b9bflcb4.html-0370994"> <font size="4">How to Settle Child and Spousal Support; With CalSupport Software With 3.5 Disk</font></a></td> </tr> <tr>
<td width="40">&nbsp;</td>
<td width="710" colspan="5">Author:Sherman, Ed ~
Subject: Domestic Relations - Divorce & Separation ~ Pub. Date;
1/1998</td>
</tr> <tr>
<td width="40">&nbsp;</td> <td width="710" colspan="5">Pub.Price:$29.95 ~
Kingbooks.com Price: <font color="#FF0000">$23.96</font> <font color="#000000"> ~</font>You Save: <font color="#FF0000">$5.99</font></td> </tr>
</table></center></div>
<div align="center"><center>
<img src="images/redline.gif" width="750" height="4"> </center></div></body></html>
This information must be processed in order to extract the relevant information, a procedure performed by an ALO. As previously mentioned, a separate ALO thread is spawned for each result set received by the SEK 206 in order to reduce processing time. An illustrative processing routine is shown in the flowchart of Figure 4. The routine starts in step 400 and proceeds to step 402. In step 402, a filtering mechanism removes formatting information and attributes. This filter can be implemented with a parsing mechanism which identifies the tags. Such a parsing mechanism is well-known for presentation languages, such as HTML and XML. Next, the identified tags are compared to a predetermined tag list and the formatting tags, such as <html>, <head>, <title>, <font>, <br>, etc. are removed. In addition, formatting attributes in the tags are also removed. In the case of the above example, the remaining information will be:
<table>
  <tr>
    <td>
      Books Found by Search: Your search brought up 1 titles. Click on a title for more information.
    </td>
    <td>
      This icon represents new and upcoming releases. For current availability info, please click on the title.
    </td>
  </tr>
</table>
<table>
  <tr>
    <td>
      Your search result is sorted by publication date with most recent one first.
    </td>
  </tr>
</table>
<table>
  <tr>
    <td>
    </td>
    <td>
      <a name="0370994" href="/scripts/detail4.exe?/results/b9bflcb4.html-0370994">How to Settle Child and Spousal Support; With CalSupport Software With 3.5 Disk</a>
    </td>
  </tr>
  <tr>
    <td>
    </td>
    <td>
      Author: Sherman, Ed ~ Subject: Domestic Relations - Divorce & Separation ~ Pub. Date; 1/1998
    </td>
  </tr>
  <tr>
    <td>
    </td>
    <td>
      Pub.Price:$29.95 ~ Kingbooks.com Price: $23.96 ~ You Save: $5.99
    </td>
  </tr>
</table>
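The filtering of step 402 can be sketched with the HTML parser in the Python standard library. This is a minimal illustration, not the mechanism of the invention: the set `KEEP_TAGS` is an assumed complement of the predetermined formatting-tag list, the decision to keep only the `href` attribute of links and to discard text outside tables is an assumption made for the example, and the names `FormatFilter` and `filter_markup` are hypothetical.

```python
from html.parser import HTMLParser

# Assumed list of structural tags to keep; every other tag is treated as formatting.
KEEP_TAGS = {"table", "tr", "td", "a"}


class FormatFilter(HTMLParser):
    """Drops formatting tags and attributes, keeping table structure and text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.depth = 0  # number of currently open <table> elements

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
        if self.depth and tag in KEEP_TAGS:
            # Assumption: only the href of a link is worth keeping.
            kept = [(k, v) for k, v in attrs if tag == "a" and k == "href"]
            attr_text = "".join(f' {k}="{v}"' for k, v in kept)
            self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if self.depth and tag in KEEP_TAGS:
            self.out.append(f"</{tag}>")
        if tag == "table" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if self.depth and text:
            self.out.append(text)


def filter_markup(html):
    """Return the markup with formatting tags and attributes removed."""
    parser = FormatFilter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.out)


sample = (
    '<html><head><title>Books Found by Search</title></head>'
    '<body bgcolor="#FFFFFF"><table border="0" width="750"><tr>'
    '<td width="710">Kingbooks.com Price: <font color="#FF0000">$23.96</font></td>'
    '</tr></table></body></html>'
)
filtered = filter_markup(sample)
```

For the sample input, `filtered` holds only the table skeleton and the price text, with the `<font>` tag and all `width`, `border` and `bgcolor` attributes gone.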
Next, as indicated in step 404, the remaining information is parsed into a data tree. In the case of HTML and XML, the language is naturally hierarchical so that this parsing is relatively easy. The example given immediately above splits into three separate trees which are comprised of hierarchical nodes indicated by the indented sections (the indents were added to emphasize the sections). These trees are illustrated in Figures 5, 6 and 7 and are delineated by the <table></table> tags. Figure 5 illustrates the first tree comprised of a first node 500 consisting of information contained between the <table></table> tags, a second node 502 delineated by the <tr></tr> tags and two third nodes, 504 and 506, delineated by the <td></td> tags. In a similar manner, Figure 6 illustrates the second tree comprised of a first node 600 consisting of information contained between the <table></table> tags, a second node 602 delineated by the <tr></tr> tags and a third node 604 delineated by the <td></td> tags. Figure 7 illustrates the third tree comprised of a first node 700 consisting of information contained between the <table></table> tags, three second nodes 702, 704 and 706 delineated by the <tr></tr> tags and six third nodes 708-718 delineated by the <td></td> tags.
As set forth in step 406, the nodes in each tree are examined to determine whether they contain relevant information. These nodes are examined in sequence, level-by-level, as illustrated by the arrows in the figures to detect a "complete" node level which contains some or all of the desired information. For example, in Figure 5, node 500 is first examined. Since it is empty, node 502 at the second level is next examined. It is also empty so that nodes 504 and 506 at the third level are examined. Nodes 504 and 506 contain information and this information is examined as discussed below.
However, since the information contained in nodes 504 and 506 is not relevant information, such as the title, author or price of a book in the example given above, the entire tree, including nodes 500 and 502 is removed from consideration. If no relevant information is found as determined in step 408 (Figure 4), then the process proceeds back to step 406.
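The parsing of step 404 and the level-by-level examination of step 406 can be sketched as follows. This is only a minimal illustration in Python, not the disclosed implementation: the names Node, TableTreeBuilder, parse_trees and first_level_with_text are hypothetical, and the sketch assumes well-formed markup in which only the <table>, <tr> and <td> tags define nodes.

```python
from html.parser import HTMLParser


class Node:
    """One node of a data tree (hypothetical structure for illustration)."""

    def __init__(self, tag):
        self.tag = tag
        self.text = ""
        self.children = []


class TableTreeBuilder(HTMLParser):
    """Builds one tree per top-level <table>, keeping table/tr/td nodes only."""

    KEPT = {"table", "tr", "td"}

    def __init__(self):
        super().__init__()
        self.trees = []   # one root node per top-level <table>
        self.stack = []   # currently open nodes, innermost last

    def handle_starttag(self, tag, attrs):
        if tag not in self.KEPT:
            return
        node = Node(tag)
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.trees.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Assumes well-formed markup: only close the innermost open node.
        if tag in self.KEPT and self.stack and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            node = self.stack[-1]
            node.text = (node.text + " " + text).strip()


def parse_trees(html):
    """Parse markup into a list of data trees."""
    builder = TableTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.trees


def first_level_with_text(tree):
    """Walk the tree level by level; return the nodes that hold text at the
    first level where any node holds text, or [] if the tree is empty."""
    level = [tree]
    while level:
        filled = [node for node in level if node.text]
        if filled:
            return filled
        level = [child for node in level for child in node.children]
    return []
```

For a tree shaped like the one in Figure 5, first_level_with_text would skip the empty <table> and <tr> levels and return the two <td> nodes, whose text would then be checked for relevance.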
Next, in step 406, the tree illustrated in Figure 6 is examined. Node 600 is first examined. Since it is empty, node 602 is next examined. It is also empty so that node 604 is examined. Node 604 contains information and this information is examined as discussed below. However, since node 604 does not contain relevant information, the entire tree, including nodes 600 and 602, is removed from consideration. The process then continues from step 408 back to step 406.
Next, the tree illustrated in Figure 7 is examined. Node 700 is first examined. Since it is empty, nodes 702, 704 and 706 are examined. These nodes are also empty so that nodes 708-718 are examined. All of these nodes contain information which is examined. Nodes 710, 714 and 718 contain relevant information so that, in step 410, this information is extracted by an extraction mechanism and mapped to corresponding buffers. In the above example, the result would be:
Title | Author | Description | Price
How to Settle Child and Spousal Support; With CalSupport Software With 3.5 Disk | Sherman, Ed | Domestic Relations - Divorce & Separation ~ Pub Date: 1/1998 Pub. Price: $29.95 | $23.96
In step 412, the buffered information is returned to the SEK where it is formatted for display on the user's computer. The process then ends in step 414.
In order to extract the information in each node, the information is checked against a rule set which is specific to the category which is being examined. Each rule in the set defines the character of one or more fields. For example, one rule might specify that a node is complete if it contains all relevant fields. In the aforementioned example, these fields are title, author, description and price. If one field is missing, another rule might require further checking to determine whether the node is complete and information from that node should be extracted. For example, if a price field and a name field are found in a node, the node is very likely to contain relevant data.
Within each node, relevant information is located by searching for keywords, symbols or data types which are specific to each category. Then, words in the vicinity of these keywords can be examined to find relevant information. For example, each node may be examined for character strings such as "name", "title", "description", "price", or "author." If any of these keywords are found, then the subsequent characters will be considered relevant information. Another rule might define a price field as the smallest number encountered in the node with, or without, a preceding "$" symbol. A price field may also be defined as a number with or without a preceding "PRICE" keyword.
The rules can be generated in a variety of ways. In one embodiment, rules are generated by a programmer for each merchant site and maintained by the programmer. In this embodiment, the ALO which is processing the received data will look for a rule in the rule set which matches the data and use the rule to extract the data.
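Two of the rules described above, the "smallest number" price rule and the completeness check, might be sketched in Python as follows. The names price_rule, is_complete and REQUIRED_FIELDS are hypothetical, and treating the title as the "name" field in the fallback check is an assumption made for this illustration rather than something the description specifies.

```python
import re

# Fields that make a node "complete" in the book example (assumed layout).
REQUIRED_FIELDS = ("title", "author", "description", "price")

# A number with or without a preceding "$" symbol.
NUMBER = re.compile(r"\$?(\d+(?:\.\d+)?)")


def price_rule(text):
    """Return the smallest number found in the text, or None if there is none."""
    values = [float(match) for match in NUMBER.findall(text)]
    return min(values) if values else None


def is_complete(fields):
    """Decide whether a node's extracted fields are worth keeping.

    fields maps a field name to its extracted value; missing fields are
    absent or None.
    """
    present = {name for name in REQUIRED_FIELDS if fields.get(name) is not None}
    if len(present) == len(REQUIRED_FIELDS):
        return True
    # Fallback rule: a price together with a name-like field (here, the title)
    # is taken as a strong hint that the node holds relevant data.
    return "price" in present and "title" in present
```

A rule as simple as price_rule picks up any smaller figure in the node, such as a savings amount, so a practical rule set would presumably combine it with the keyword checks described above.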
In another embodiment, the search results are parsed to tree nodes as described above and the ALO will check each node for keywords which are selected from a keyword set which depends on the user-selected category. The information following these predetermined keywords is then extracted. In this version, separate rules are not needed for each site and the rules do not need to be maintained by a programmer.
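The keyword-driven embodiment can be illustrated with a short Python sketch. The category name, the contents of KEYWORDS_BY_CATEGORY and the function extract_by_keywords are all hypothetical, and the sketch assumes that each keyword is followed directly by its value and that values run until the next keyword.

```python
import re

# Hypothetical keyword sets, one per user-selectable category.
KEYWORDS_BY_CATEGORY = {
    "books": ("title", "author", "description", "price"),
}


def extract_by_keywords(text, category):
    """Return {keyword: following text} for the keywords of the given category."""
    keywords = KEYWORDS_BY_CATEGORY.get(category, ())
    if not keywords:
        return {}
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b\s*[:;]?\s*",
        re.IGNORECASE,
    )
    matches = list(pattern.finditer(text))
    result = {}
    for index, match in enumerate(matches):
        # The value runs from the end of this keyword to the next keyword.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        value = text[match.end():end].strip(" ~")
        key = match.group(1).lower()
        if value and key not in result:
            result[key] = value
    return result
```

Because the keyword set is looked up by category rather than by merchant, the same function would be applied unchanged to node text from any site in that category.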
Alternatively, provision can be made to allow a user to manually select a block of data in a particular level of a data tree so that the program can search for and extract that block of data each time results are returned. Because rules can be written for each field, each ALO is very flexible and generic. This flexibility allows different merchants to be added into the search and comparison pool quickly and easily. It also allows different information to be retrieved from a site depending on the customer or marketing needs.
In an alternative embodiment, the system may be implemented as a computer program product for use with a computer system. Such implementation may include a series of computer instructions fixed either on a tangible medium, such as a computer readable medium (e.g., a diskette, a CD or non-volatile storage) or transmittable to a computer system, via a modem or other interface device, such as a network. The series of computer instructions embodies all or part of the functionality previously described herein with respect to the system. Those skilled in the art should appreciate that such computer instructions can be written in a number of programming languages for use with many computer architectures or operating systems. Furthermore, such instructions may be stored in any memory device, such as semiconductor, magnetic, optical or other memory devices, and may be transmitted using any communications technology, such as optical, infrared, microwave, or other transmission technologies. It is expected that such a computer program product may be distributed as a removable medium with accompanying printed or electronic documentation (e.g., shrink wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the network (e.g., the Internet or World Wide Web).
Although various exemplary embodiments of the invention have been disclosed, it will be apparent to those skilled in the art that various changes and modifications can be made that will achieve some of the advantages of the invention without departing from the true scope of the invention. These and other obvious modifications are intended to be covered by the appended claims.
What is claimed is:

Claims

1. Apparatus for retrieving comparative item information from a plurality of merchant sites having disparate information formats in response to a request, including a category and a keyword, from a user, the apparatus comprising: a database containing a plurality of categories and, for each category, at least one URL for one of the plurality of merchant sites; a query generator responsive to the request category for composing a query by concatenating a URL obtained from the database with the request category with the request keyword; a search engine for retrieving information from the plurality of merchant sites with the query; and an automatic learning object for processing retrieved information to extract the item information.
2. Apparatus according to claim 1 wherein the query generator composes a plurality of queries for the requested category and wherein a plurality of search engines are created in parallel with one search engine being constructed for each query.
3. Apparatus according to claim 1 wherein the query generator composes a plurality of queries for the requested category and wherein a plurality of automatic learning objects are created in parallel with one automatic learning object being created to process information retrieved from each query.
4. Apparatus according to claim 1 wherein the automatic learning object comprises: a filter for removing formatting information in the retrieved information; and a parser for parsing the filtered information into one or more data trees, each data tree having one or more nodes.
5. Apparatus according to claim 4 wherein the automatic learning object further comprises: a mechanism which examines each node for relevant information; and an extraction mechanism which extracts the item information from the relevant information.
6. Apparatus according to claim 1 wherein the retrieved information is coded in HTML code and wherein the automatic learning object processes the HTML code to remove HTML formatting tags.
7. Apparatus according to claim 1 wherein the retrieved information is coded in XML code and wherein the automatic learning object processes the XML code to remove XML formatting tags.
8. Apparatus according to claim 1 wherein the database includes at least one URL for a search engine located in one of the plurality of merchant sites.
9. A method for retrieving comparative item information from a plurality of merchant sites having disparate information formats in response to a request, including a category and a keyword, from a user, the method comprising: (a) constructing a database containing a plurality of categories and, for each category, at least one URL for one of the plurality of merchant sites; (b) composing a query in response to the request category by concatenating a URL obtained from the database with the request category with the request keyword; (c) using a search engine to retrieve information from the plurality of merchant sites with the query; and (d) creating an automatic learning object for processing retrieved information to extract the item information.
10. A method according to claim 9 wherein step (b) comprises composing a plurality of queries for the requested category and wherein step (c) comprises using a plurality of search engines in parallel with one search engine being used for each query.
11. A method according to claim 9 wherein step (b) comprises composing a plurality of queries for the requested category and wherein step (d) comprises creating a plurality of automatic learning objects in parallel with one automatic learning object being created to process information retrieved from each query.
12. A method according to claim 9 wherein step (d) comprises: (d1) removing formatting information in the retrieved information; and (d2) parsing the filtered information into one or more data trees, each data tree having one or more nodes.
13. A method according to claim 12 wherein step (d) further comprises: (d3) examining each node for relevant information; and (d4) extracting the item information from the relevant information.
14. A method according to claim 9 wherein the retrieved information is coded in HTML code and wherein step (d) comprises processing the HTML code to remove HTML formatting tags.
15. A method according to claim 9 wherein the retrieved information is coded in XML code and wherein step (d) comprises processing the XML code to remove XML formatting tags.
16. A method according to claim 9 wherein the database is constructed to include at least one URL for a search engine located in one of the plurality of merchant sites.
17. A computer program product for retrieving comparative item information from a plurality of merchant sites having disparate information formats in response to a request, including a category and a keyword, from a user, the computer program product comprising a computer usable medium having computer readable program code thereon, including: program code for constructing a database containing a plurality of categories and, for each category, at least one URL for one of the plurality of merchant sites; program code for composing a query in response to the request category by concatenating a URL obtained from the database with the request category with the request keyword; program code for creating a search engine to retrieve information from the plurality of merchant sites with the query; and program code for creating an automatic learning object for processing retrieved information to extract the item information.
18. A computer program product according to claim 17 wherein the program code for composing a query comprises program code for composing a plurality of queries for the requested category and wherein the program code for creating a search engine comprises program code for creating a plurality of search engines in parallel with one search engine being used for each query.
19. A computer program product according to claim 17 wherein the program code for composing a query comprises program code for composing a plurality of queries for the requested category and wherein the program code for creating a search engine comprises program code for creating a plurality of automatic learning objects in parallel with one automatic learning object being created to process information retrieved from each query.
20. A computer program product according to claim 17 wherein the program code for creating an automatic learning object comprises: program code for removing formatting information in the retrieved information; and program code for parsing the filtered information into one or more data trees, each data tree having one or more nodes.
21. A computer program product according to claim 20 wherein the program code for creating an automatic learning object further comprises: program code for examining each node for relevant information; and program code for extracting the item information from the relevant information.
22. A computer program product according to claim 17 wherein the retrieved information is coded in HTML code and wherein the program code for creating an automatic learning object comprises program code for processing the HTML code to remove HTML formatting tags.
23. A computer program product according to claim 17 wherein the retrieved information is coded in XML code and wherein the program code for creating an automatic learning object comprises program code for processing the XML code to remove XML formatting tags.
24. A computer program product according to claim 17 wherein the database is constructed to include at least one URL for a search engine located in one of the plurality of merchant sites.
25. A computer data signal embodied in a carrier wave for retrieving comparative item information from a plurality of merchant sites having disparate information formats in response to a request, including a category and a keyword, from a user, the computer data signal comprising: program code for constructing a database containing a plurality of categories and, for each category, at least one URL for one of the plurality of merchant sites; program code for composing a query in response to the request category by concatenating a URL obtained from the database with the request category with the request keyword; program code for creating a search engine to retrieve information from the plurality of merchant sites with the query; and program code for creating an automatic learning object for processing retrieved information to extract the item information.
26. A computer data signal according to claim 25 wherein the program code for composing a query comprises program code for composing a plurality of queries for the requested category and wherein the program code for creating a search engine comprises program code for creating a plurality of search engines in parallel with one search engine being used for each query.
27. A computer data signal according to claim 25 wherein the program code for composing a query comprises program code for composing a plurality of queries for the requested category and wherein the program code for creating a search engine comprises program code for creating a plurality of automatic learning objects in parallel with one automatic learning object being created to process information retrieved from each query.
28. A computer data signal according to claim 25 wherein the program code for creating an automatic learning object comprises: program code for removing formatting information in the retrieved information; and program code for parsing the filtered information into one or more data trees, each data tree having one or more nodes.
29. A computer program product according to claim 25 wherein the database is constructed to include at least one URL for a search engine located in one of the plurality of merchant sites.
PCT/US2000/041013 1999-10-07 2000-09-27 Electronic shopping agent which is capable of operating with vendor sites which have disparate formats WO2001026018A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU14932/01A AU1493201A (en) 1999-10-07 2000-09-27 Electronic shopping agent which is capable of operating with vendor sites which have disparate formats

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/414,277 1999-10-07
US09/414,277 US6381597B1 (en) 1999-10-07 1999-10-07 Electronic shopping agent which is capable of operating with vendor sites which have disparate formats

Publications (3)

Publication Number Publication Date
WO2001026018A2 WO2001026018A2 (en) 2001-04-12
WO2001026018A8 WO2001026018A8 (en) 2001-09-07
WO2001026018A3 WO2001026018A3 (en) 2002-08-08

Family

ID=23640759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/041013 WO2001026018A2 (en) 1999-10-07 2000-09-27 Electronic shopping agent which is capable of operating with vendor sites which have disparate formats

Country Status (5)

Country Link
US (1) US6381597B1 (en)
CN (1) CN1408093A (en)
AU (1) AU1493201A (en)
TW (1) TW501033B (en)
WO (1) WO2001026018A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998032289A2 (en) * 1997-01-17 1998-07-23 The Board Of Regents Of The University Of Washington Method and apparatus for accessing on-line stores

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6119101A (en) * 1996-01-17 2000-09-12 Personal Agents, Inc. Intelligent agents for electronic commerce
US6301584B1 (en) * 1997-08-21 2001-10-09 Home Information Services, Inc. System and method for retrieving entities and integrating data
US6185558B1 (en) * 1998-03-03 2001-02-06 Amazon.Com, Inc. Identifying the items most relevant to a current query based on items selected in connection with similar queries
US6317718B1 (en) * 1999-02-26 2001-11-13 Accenture Properties (2) B.V. System, method and article of manufacture for location-based filtering for shopping agent in the physical world
US6304854B1 (en) * 1999-09-09 2001-10-16 Dunhill Holdings, Corp. System and method for providing a comparable branded product based on a current branded product for non-comparison shopped products

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PERKOWITZ M ET AL: "LEARNING TO UNDERSTAND INFORMATION ON THE INTERNET: AN EXAMPLE-BASED APPROACH" JOURNAL OF INTELLIGENT INFORMATION SYSTEMS: ARTIFICIAL INTELLIGENCE AND DATABASE TECHNOLOGIES, KLUWER ACADEMIC PUBLISHERS, AMSTERDAM, NL, vol. 8, no. 2, 1997, pages 133-153, XP000933994 ISSN: 0925-9902 *
R.B. DOORENBOS; O. ETZIONI; D.S. WELD: "A Scalable Comparison-Shopping Agent for the World Wide Web" UNIVERSITY OF WASHINGTON, DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING, TECHNICAL REPORT UW-CSE-96-01-03, [Online] January 1996 (1996-01), pages 1-20, XP002196638 USA Retrieved from the Internet: <URL:http://www.cs.washington.edu/research/tr/techreports.html> [retrieved on 2002-04-18] *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003027913A1 (en) * 2001-09-27 2003-04-03 Amips International Pty Ltd A purchasing system
US7039645B1 (en) 2002-09-26 2006-05-02 Requisite Technology, Inc. Managing content of an electronic catalog by collaboration with another electronic catalog
US9762668B2 (en) 2005-06-21 2017-09-12 Microsoft Technology Licensing, Llc Content syndication platform
US9894174B2 (en) 2005-06-21 2018-02-13 Microsoft Technology Licensing, Llc Finding and consuming web subscriptions in a web browser
US7865830B2 (en) 2005-07-12 2011-01-04 Microsoft Corporation Feed and email content
US8280843B2 (en) 2006-03-03 2012-10-02 Microsoft Corporation RSS data-processing object
WO2008006267A1 (en) * 2006-07-05 2008-01-17 Shu Tak Raymond Lee Intelligent agent based development platform
WO2009070927A1 (en) * 2007-12-03 2009-06-11 Ebay Inc. Live search chat room
US8132112B2 (en) 2007-12-03 2012-03-06 Ebay Inc. Live search chat room
US9003307B2 (en) 2007-12-03 2015-04-07 Ebay Inc. Live search chat room

Also Published As

Publication number Publication date
TW501033B (en) 2002-09-01
US6381597B1 (en) 2002-04-30
AU1493201A (en) 2001-05-10
WO2001026018A3 (en) 2002-08-08
WO2001026018A8 (en) 2001-09-07
CN1408093A (en) 2003-04-02

Similar Documents

Publication Publication Date Title
US6381597B1 (en) Electronic shopping agent which is capable of operating with vendor sites which have disparate formats
US5987480A (en) Method and system for delivering documents customized for a particular user over the internet using imbedded dynamic content
US8719296B2 (en) Method and apparatus for defining data of interest
US6256623B1 (en) Network search access construct for accessing web-based search services
US7076453B2 (en) System and method for designing and operating an electronic store
US6101503A (en) Active markup--a system and method for navigating through text collections
US8046681B2 (en) Techniques for inducing high quality structural templates for electronic documents
US5884309A (en) Order entry system for internet
JP4857075B2 (en) Method and computer program for efficiently retrieving dates in a collection of web documents
US7058944B1 (en) Event driven system and method for retrieving and displaying information
CN101517511B (en) System, process and software arrangement for assisting in navigating internet
US20020035619A1 (en) Apparatus and method for producing contextually marked-up electronic content
US7756849B2 (en) Method of searching for text in browser frames
US20060155728A1 (en) Browser application and search engine integration
US20090125529A1 (en) Extracting information based on document structure and characteristics of attributes
US20020010709A1 (en) Method and system for distilling content
US9230256B2 (en) System and method for electronically creating a customized catalog
US20080115057A1 (en) High precision data extraction
US7895337B2 (en) Systems and methods of generating a content aware interface
JP2002024227A (en) System and method for generating radio web page
US20030110210A1 (en) Information communication system
US20050131859A1 (en) Method and system for standard bookmark classification of web sites
US6735739B1 (en) Text in anchor tag of hyperlink adjustable according to context
KR20040048199A (en) Method and system for altering web page using detail information correspond to retrieval result
KR19990078876A (en) Information search method by URL input

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
AK Designated states

Kind code of ref document: C1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: C1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

CFP Corrected version of a pamphlet front page
CR1 Correction of entry in Section I

Free format text: PAT. BUL. 15/2001 UNDER (81) ADD "IN"; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

WWE WIPO information: entry into national phase

Ref document number: 008148945

Country of ref document: CN

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP