PERFORMING PREDICTIVE PRICING BASED ON HISTORICAL DATA
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of provisional U.S. Patent Application No. 60/458,321, filed March 27, 2003 and entitled "Mining Historical Pricing Data To Provide Guidance For Current Purchases," which is hereby incorporated by reference in its entirety.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
The University of Washington and the University of Southern California have granted a royalty-free non-exclusive license to the U.S. government pursuant to 35 U.S.C. Section 202(c)(4) for any patent claiming an invention subject to 35 U.S.C. Section 201.
TECHNICAL FIELD
The following disclosure relates generally to the use of techniques for predicting future pricing information for items based on analysis of prior pricing information for the items, and more particularly to using such predicted future pricing information in a variety of ways, such as to assist users in making better buying and/or selling decisions.
BACKGROUND
In many situations, potential buyers or other acquirers of various types of items (such as products and/or services) are faced with difficult decisions when attempting to determine whether acquiring a particular item of interest under current conditions is desirable or optimal based on their goals, or whether instead delaying the acquisition would be preferable. For example, when the potential acquirer desires to obtain the item at the lowest price possible before some future date, and the item is currently offered by a seller for a current price, the potential acquirer needs to evaluate whether accepting the current price is more advantageous than the potential benefits and costs associated with waiting to see if the item will continue to be available and will be later offered at a lower price before the future date. Such potential acquisitions can include a variety of types of transactions (e.g., fixed-price purchase, auction-based purchase, reverse auction purchase, name-your-price purchase, rent, lease, license, trade, evaluation, sampling, etc.), and can be performed in a variety of ways (e.g., by online shopping using a computing device, such as via the World Wide Web or other computer network).
The difficulty of evaluating a potential current item acquisition is exacerbated in environments in which the prices of the items frequently change, such as when sellers or other suppliers of the items frequently modify item prices (e.g., in an attempt to perform yield management and maximize overall profits). In such environments, the likelihood of future price changes may be high or even a certainty, but it may be difficult or impossible for the potential acquirer to determine whether the future price changes are likely to be increases or drops, let alone a likely magnitude and timing of such changes. A large number of types of items may have such frequent price changes, such as airline tickets, car rentals, hotel rentals, gasoline, food products, jewelry, various types of services, etc. Moreover, a potential acquirer may in some situations need to evaluate not only a current price of an item of interest from a single seller or other provider, but may further need to consider prices offered by other providers and/or prices for other items that are sufficiently similar to be potential substitutes for the item of interest (e.g., airline flights with the same route that leave within a determined period of time, whether from the same airline or from competitor airlines).
In a similar manner, some sellers or other providers of items may similarly face difficulties in determining an advantageous strategy related to the providing of the items, such as for intermediary sellers that must acquire an item from a third-party supplier (e.g., an original supplier of the item or other intermediary seller) before providing it to a customer. For example, it may be difficult in at least some situations for such intermediary sellers to know what price to offer to customers in order to maximize profit, as well as whether to immediately acquire from a third-party supplier an item purchased by a customer or to instead delay such an acquisition in an attempt to later acquire the item at a lower price. In the context of the airline industry, for example, such intermediary sellers may include various types of travel agents, including travel agents that typically buy only single airline tickets in response to explicit current instructions from a customer, consolidators that buy large numbers of airline tickets in advance for later resale, tour package operators that buy large numbers of airline tickets for bundling with other tickets and/or services, etc.
Thus, it would be beneficial to be able to predict future pricing information for items, such as likely future directions in price changes and/or likely specific future item prices, as doing so would enable buyers and/or intermediate sellers to make better acquisition-related decisions.
BRIEF DESCRIPTION OF THE DRAWINGS
Figures 1A-1O provide examples illustrating the use of predictive pricing techniques in a variety of situations.
Figure 2 is a block diagram illustrating an embodiment of a computing system suitable for providing and using disclosed techniques related to predictive pricing.
Figure 3 is a flow diagram of an embodiment of a Predictive Pricing Determiner routine.
Figure 4 is a flow diagram of an embodiment of a Predictive Pricing Provider routine.
Figure 5 is a flow diagram of an embodiment of a Predictive Pricing Seller routine.
Figure 6 is a flow diagram of an embodiment of a Predictive Pricing Advisor routine.
Figure 7 is a flow diagram of an embodiment of a Predictive Pricing Buyer routine.
Figure 8 is a flow diagram of an embodiment of a Predictive Pricing Analyzer routine.
DETAILED DESCRIPTION
A software facility is described below that uses predictive pricing information for items in order to assist in evaluating decisions related to the items in various ways, such as to assist end-user item acquirers in evaluating purchasing decisions related to the items and/or to assist intermediate providers of the items in evaluating selling decisions related to the items.
In some embodiments, the predictive pricing techniques are used for items whose prices are dynamically changed by suppliers of the items to reflect various factors and goals, such as to use yield management and differential pricing techniques in an attempt to maximize profits - moreover, in some such embodiments the factors and/or any underlying pricing algorithms that are used by the item suppliers may further be unknown when performing the predictive pricing activities. Furthermore, in some embodiments the predictive pricing techniques may be applied to items that are "perishable" so as to have an expiration or other use date after which the item has a different value (e.g., a lower value, including no value), such as performances/events/occurrences and/or services that occur on or near the expiration or use date, or information whose value is based at least in part on its timeliness or recency - in such embodiments, a supplier of such an item may alter prices or other conditions for the item as its expiration or use date approaches, such as to discount a price for the item or alternatively to raise such a price. Any such actions by suppliers based on expiration or use dates for items may in some embodiments be performed by the suppliers in a purely formulaic and repeatable manner (e.g., as an automated process), while in other embodiments some subjective variability may be included with respect to such actions (e.g., based on manual input for or oversight of the actions). When discussed herein, a supplier of an item includes an original supplier of an item and/or any other party involved in providing of the item who has control over or influence on the setting of a current price for the item before it becomes available to an acquirer (whether an intermediate seller or an end-user customer), and may further in some situations include multiple such parties (e.g., multiple parties in a supply chain).
In particular, in some embodiments the predictive pricing for an item is based on an analysis of historical pricing information for that item and/or related items. Such historical pricing information analysis may in some embodiments automatically identify patterns in the prices, such as patterns of price increases or drops. In addition, in some such embodiments the analysis further associates the prices and/or patterns with one or more related factors (e.g., factors that have a causal effect on the prices or are otherwise correlated with the prices), such as factors that are automatically identified in one or more of a variety of ways during the analysis. Furthermore, in some embodiments predictive pricing policies are also automatically developed based on other automatically identified predictive pricing information, such as to enable specific price-related predictions for a particular item given specific current factors. In addition, in some embodiments the historical pricing information may reflect prices for items that were previously offered by item suppliers to others, while in other embodiments the historical pricing information may reflect prices at which the items were actually previously acquired/provided. Specific mechanisms for performing such predictive pricing analysis are discussed in greater detail below.
When such predictive pricing information is available for an item, that information can then be used to assist acquirers (e.g., buyers) and/or intermediate providers (e.g., sellers) of an item in making better decisions related to acquiring and/or providing of the item. In particular, given information about current factors that are associated with the pricing information for the item, the predictive pricing information for the item can be used to make predictions about future pricing information for the item. Such future predicted pricing information can take various forms in various embodiments, including a likely direction of any future price changes, a likely timing of when any future price changes will occur, a likely magnitude of any future price changes, likely particular future prices, etc. In addition, in some embodiments the future predicted pricing information may further include predictions of the specific likelihood of one or more of such types of future pricing information. Moreover, in some embodiments and/or situations the predictive pricing information and/or assistance/functionality provided based on that information may be performed for a fee.
As one example, the facility may in some embodiments use predictive pricing information for items to advise potential buyers of those items in various ways. For example, when providing pricing information for an item to a current customer, a notification may in some embodiments be automatically provided to the customer to advise the customer in a manner based on predicted future pricing information for the item, such as whether the current price is generally a "good buy" price given those predicted future prices, or more specifically whether to buy immediately due to predicted future price increases or to delay buying due to predicted future price drops. Such advice could further in some embodiments provide specific reasons for the provided advice, such as based on information about specific predicted future pricing information (e.g., a specific predicted direction, time and/or magnitude of a future price change, a specific predicted future price, etc.), as well as additional details related to the advice (e.g., specific future conditions under which to make an acquisition, such as a specific amount of delay to wait and/or a specific future price to wait for). In situations in which the potential buyer does not need to immediately obtain the item and the predicted future pricing information indicates that the price is likely to drop, for example, the potential buyer can use that information to determine to delay a purchase.
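As one non-limiting illustration of how such advice might be derived, the following Python sketch maps a predicted price change onto a buy-now or wait recommendation together with a reason. The `Prediction` fields, the `advise` function, and the 0.6 confidence threshold are hypothetical names and values chosen for illustration only, and are not details of the facility itself.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical container for predicted future pricing information."""
    direction: str      # "up", "down", or "flat"
    probability: float  # estimated likelihood of that direction (0..1)
    magnitude: float    # expected size of the change, in currency units
    days_until: int     # expected delay before the change occurs


def advise(current_price: float, pred: Prediction, days_available: int,
           min_confidence: float = 0.6) -> str:
    """Return buy/wait advice with a reason (the threshold is an assumed value)."""
    if pred.probability < min_confidence or pred.direction == "flat":
        return f"Buy at {current_price:.2f}: no confident price change predicted."
    if pred.direction == "down" and pred.days_until <= days_available:
        target = current_price - pred.magnitude
        return (f"Wait about {pred.days_until} days: price is predicted to "
                f"drop to about {target:.2f}.")
    if pred.direction == "up":
        return (f"Buy now at {current_price:.2f}: price is predicted to rise "
                f"by about {pred.magnitude:.2f}.")
    # A drop is predicted, but only after the buyer's deadline.
    return f"Buy at {current_price:.2f}: the predicted drop comes too late."
```

A buyer who can wait ten days and is shown a confident prediction of a drop within five days would receive "wait" advice, while the same prediction arriving after the buyer's deadline yields "buy" advice.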
In other situations, advice may be automatically provided to a user in other ways, such as by proactively alerting a potential buyer regarding a current and/or predicted future price for an item (e.g., an item in which the customer has previously expressed an interest), such as when a current price for the item reflects a good buy for the customer. In addition, in some embodiments the facility may further act as an agent on behalf of a customer in order to automatically acquire an item for the customer, such as based on prior general instructions from the customer related to purchasing an item or type of item when it is a good buy. Thus, in some embodiments such advice may be provided to users as part of an interactive response to a request, while in other situations the providing of the advice may instead be automatically initiated. In addition, in some embodiments and/or situations the providing of such advice and/or related functionality to a potential buyer or other acquirer may be performed for a fee, such as a fee charged to that acquirer.
In other embodiments, the facility may act on behalf of an intermediate seller of the item in order to provide advice to the seller. For example, in some situations in which a predicted future price for an item is lower than a current price, the intermediate seller may determine based on such advice to offer a price to a customer that is lower than the current price available to the intermediate seller (e.g., but above the lower predicted future price) - if so, and if the customer indicates to purchase the item, the intermediate seller may further delay purchasing the item from its supplier in order to attempt to acquire the item at a lower future price. More generally, when a predicted future price for an item is lower than a current price being offered to an intermediate seller, that knowledge about the lower predicted price may enable the intermediate seller to currently use the potential cost savings based on acquisitions at the later lower price in a variety of ways, including by passing some or all of the price difference on to customers, by retaining some or all of the price difference as profit, and/or by sharing some or all of the price difference with the supplier of the item to the intermediate seller (e.g., in exchange for the supplier immediately lowering their price to the intermediate seller).
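The division of a predicted price difference among the customer, the intermediate seller and the supplier can be sketched as simple arithmetic. The Python fragment below is a minimal illustration only; the function name `split_predicted_savings` and its share parameters are hypothetical, and the prices in the usage note are example values.

```python
def split_predicted_savings(current_price, predicted_price,
                            customer_share, supplier_share=0.0):
    """Divide the predicted saving; shares are fractions of the saving."""
    if customer_share < 0 or supplier_share < 0 or customer_share + supplier_share > 1:
        raise ValueError("shares must be non-negative and sum to at most 1")
    # No saving is available if the price is not predicted to fall.
    saving = max(current_price - predicted_price, 0.0)
    to_customer = saving * customer_share
    to_supplier = saving * supplier_share
    return {
        "offer_price": current_price - to_customer,
        "to_customer": to_customer,
        "to_supplier": to_supplier,
        "retained_by_seller": saving - to_customer - to_supplier,
    }
```

For example, with a current price of 649 and a predicted future price of 499, passing half of the predicted saving to the customer gives an offer price of 574 and leaves 75 to be retained by the seller or shared with the supplier.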
In addition, in situations in which the future price is predicted not to drop, the intermediate seller may choose based on such advice to offer price protection to a customer (e.g., for a fee to the customer) based on that prediction, such as to provide additional benefit to the customer if the future actual price were to drop below the current price or some other specified price (e.g., a refund of the difference). In other embodiments, the facility may assist the intermediate seller to offer items to customers using a variety of other sales models, such as to allow customers to name a price at which the customer will purchase an item of interest (whether identified as a particular item or a category of items that is specified at any of a variety of levels of details) and to then assist the intermediate seller in determining whether to accept such an offer based on a comparison of the named price to a predicted future price for the item. In addition, in some embodiments and/or situations the providing of such advice and/or related functionality to intermediate sellers and other providers may be performed for a fee, such as a fee charged to that provider.
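As a hedged sketch of the two sales models just described, the following Python fragment shows one way a named price might be compared against a predicted future price, and how a price-protection refund might be computed. Both function names and the `min_margin` parameter are assumptions introduced here for illustration.

```python
def accept_named_price(named_price, predicted_future_price, min_margin=0.0):
    """Accept a customer's named price if it covers the predicted future
    acquisition price plus an assumed minimum margin for the seller."""
    return named_price >= predicted_future_price + min_margin


def price_protection_refund(protected_price, later_actual_price):
    """Refund the difference if the actual price later falls below the
    protected price; otherwise no refund is due."""
    return max(protected_price - later_actual_price, 0.0)
```

In this sketch the refund is the full difference between the protected price and the later actual price; other embodiments could cap or otherwise limit the benefit.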
In addition, the facility may in some embodiments further assist buyers that purchase items in bulk (e.g., by aggregating numerous individual purchases into one), such as customers that themselves buy large numbers of the items (e.g., large corporations) and/or intermediate sellers (e.g., item consolidators) that are purchasing items from other suppliers. In such situations, the bulk purchaser may take a variety of types of steps to accomplish desired goals of the purchaser, such as to hedge or otherwise limit exposure to loss based on predicted future prices (e.g., by purchasing some but not all of multiple items at a current price even when the predicted future price is lower in order to minimize the risk of the actual future prices being higher than the current price). Alternatively, the bulk purchaser may be able to use information about predicted future prices to manually negotiate better current prices with a seller. In other embodiments, information about such predicted future prices can assist other types of users, including suppliers of items when they have such information about similar items offered by competitors, such as to gain a first-mover advantage on price decreases that the competitors are likely to make in the future. The information provided by the analysis may further assist in some embodiments in more generally identifying and predicting trends in prices over time for specific items and/or groups of related items, such as to enable a user to immediately take action in such a manner as to benefit from such trends. In addition, in some embodiments and/or situations the providing of such advice and/or related functionality to bulk purchasers may be performed for a fee, such as a fee charged to that bulk purchaser.
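The hedging step described above for bulk purchasers, in which some but not all items are purchased at the current price, might be sketched as follows. This is an assumed heuristic rather than a method taken from the facility: the function `units_to_buy_now`, the rule of buying a share equal to the chance that the predicted drop does not occur, and the default `min_hedge_fraction` of 0.2 are all illustrative.

```python
def units_to_buy_now(total_units, prob_price_drops, min_hedge_fraction=0.2):
    """Return how many units to buy at the current price as a hedge.

    Assumed rule: buy now a fraction equal to the probability that the
    predicted price drop does NOT occur, but never less than a floor.
    """
    if not 0.0 <= prob_price_drops <= 1.0:
        raise ValueError("prob_price_drops must be between 0 and 1")
    fraction_now = max(1.0 - prob_price_drops, min_hedge_fraction)
    return min(total_units, round(total_units * fraction_now))
```

The remaining units would then be purchased later, at whatever price is actually available at that time.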
In some embodiments, the facility further assists users in analyzing historical purchase data, such as for bulk purchasers. For example, a large corporation may want to analyze their prior item purchases over a specified period of time in order to determine whether the purchase prices for the items were advantageous and/or optimal in light of later available prices for those items. In some situations, the actual prior purchase prices for items could be compared against alternative prior purchase prices for those items that would have been obtained based on following predictive pricing information for those items that would have been provided at the time of actual purchase (e.g., for advice related to delaying a purchase, determining a difference between the actual prior purchase price and a later actually available price at which the item would likely have been acquired based on the advice). This allows a determination to be made of the benefits that would have been obtained by using predictive pricing in those situations. In addition, in some embodiments the actual prior purchase prices could further be compared to optimal purchase prices for those items within a relevant time period before the item was needed, such as to see the differential (if any) between the actual purchase price and the lowest possible purchase price given perfect hindsight knowledge. In addition, in some embodiments and/or situations the providing of such advice and/or related functionality for analyzing historical purchase data may be performed for a fee, such as a fee charged to a customer to whom the historical purchase data corresponds.
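The two comparisons described above, against the price that would have been obtained by following the advice and against the lowest price available with perfect hindsight, reduce to simple differences. The Python sketch below assumes a hypothetical function `evaluate_purchase` whose inputs are the actual purchase price, the price at which the advice would have led to a purchase, and the prices that were available in the relevant window.

```python
def evaluate_purchase(actual_price, advised_price, available_prices):
    """Compare an actual purchase with advice-based and hindsight-optimal prices.

    `available_prices` lists the prices offered during the relevant period
    before the item was needed (an assumed input format).
    """
    optimal_price = min([actual_price, *available_prices])
    return {
        "saving_from_advice": actual_price - advised_price,
        "gap_to_optimal": actual_price - optimal_price,
    }
```

Summing these two figures over all of a purchaser's prior transactions would give an estimate of the benefit of the advice and of how far that benefit falls short of the hindsight optimum.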
Moreover, such analysis of prior purchase decisions not only provides information about the benefits of using the predictive pricing techniques, but can also assist in further refining the predictive pricing techniques (e.g., in an automated manner, such as based on a learning mechanism), such as based on identifying situations in which the predictive pricing techniques did not provide the best prediction available.
In addition, in various embodiments the predictive pricing information and/or related functionality is used to generate revenue and/or produce savings in a variety of ways, including through service fees, license fees, by maximizing profit for sellers and/or savings for acquirers, through related advertising, etc. Such revenue can be based on any or all of the various example types of functionality discussed above and in greater detail below.
For illustrative purposes, some embodiments of the software facility are described below in which particular predictive pricing techniques are used for particular types of items, and in which available predictive pricing information is used to assist buyers and/or sellers in various ways. However, those skilled in the art will appreciate that the techniques of the invention can be used in a wide variety of other situations, and that the invention is not limited to the illustrated types of items or predictive pricing techniques or uses of predictive pricing information. For example, some such items with which the illustrated predictive pricing techniques and/or uses of predictive pricing information can be used include car rentals, hotel rentals, vacation packages, vacation rentals (e.g., homes, condominiums, timeshares, etc.), cruises, transportation (e.g., train, boat, etc.), gasoline, food products, jewelry, consumer electronics (e.g., digital and non-digital still and video cameras, cell phones, music players and recorders, video players and recorders, video game players, PDAs and other computing systems/devices, etc.), books, CDs, DVDs, video tapes, software, apparel, toys, electronic and board games, automobiles, furniture, tickets for movies and other types of performances, various other types of services, etc. Furthermore, the disclosed techniques could further be used with respect to item-related information other than prices, whether instead of or in addition to price information.
Figures 1A-1O provide examples illustrating the use of predictive pricing techniques in a variety of situations. In these examples, the predictive pricing techniques are applied to airline ticket information and are used by a provider of information about airline ticket prices, such as an intermediate seller travel agency. However, such techniques can also be used for other types of items and in other manners, as discussed elsewhere.
In particular, Figure 1A illustrates a table 100 that provides examples of historical pricing information that may be gathered for airline flights and then analyzed to produce various types of predictive pricing information. In this example, the table includes entries 111-116 that each correspond to a different instance of actual price information for a particular flight, such as a ticket price offered for a flight at a particular time. The table also includes columns 102-104 that store information about the specific offer instance for the flight number indicated in column 101 and the departure date indicated in column 105. In addition, in the illustrated embodiment the table further includes columns 106-108 with additional information about the flight, although in other embodiments such information may be stored separately. Moreover, in some embodiments the table could store additional information about other factors that may have an effect on price changes, such as one or more sell-out factors related to whether/when a flight may sell out (e.g., based in part on a number of remaining available seats in column 109, although in the illustrated embodiment that information is not currently available).
While the flight prices reflect one-way tickets for specific flights of specific airlines in this example, related information could similarly be gathered and/or aggregated in various other ways, whether instead of or in addition to one-way tickets, including for round-trip and/or multi-segment tickets, and so as to enable predictive pricing for flights on a particular route between a departure airport and an arrival airport, for flights on one or more routes between a particular pair of cities or regions (e.g., when the departure and/or arrival cities/regions include multiple airports), for flights into and/or out of an airport hub for one or more airlines, for flights on a route with an associated time (e.g., a specified departure time or departure time range, a specified arrival time or arrival time range, a specified interval of time for the travel, etc.), for some or all flights from a particular airline, for some or all flights into or out of an airport and/or city/region, for some or all flights that depart and/or arrive at a specified airport/city/region within a specified time frame, etc.
Figure 1B illustrates a chart 131 that provides an example of historical offer price information over time for a particular airline flight, such as a flight that departs in this example on January 7, and is represented with a particular flight number from a particular airline (not shown). As one example, the illustrated price information may correspond to a group of historical information that includes entries 111 and 115-116 of table 100 in Figure 1A. In other embodiments, information could instead be analyzed for airline flights in other ways, such as by aggregating information for a particular airline route over multiple days and/or by aggregating information for multiple airline flights of a particular airline or that are otherwise similar. In this example, the price data generally shows three tiers of relatively stable prices, although there is an additional small price fluctuation in the first price tier around the dates of 12/17 and 12/18 (e.g., based on a reaction of the airline to a temporary price increase by a competitor on a flight for the same route and date).
Figure 1C illustrates a chart 132 that provides an example of historical price information for the same flight except with a different departure date that is five days earlier, which in this example results in a departure date of January 2 rather than January 7. Despite the similarities between the two flights (the same airline, flight number, route and close departure date), the price of this airline flight with the earlier departure date fluctuates much more than that of the flight discussed with respect to Figure 1B, such as based at least in part on the increased demand for travel near the New Year holiday.
Figure 1D illustrates a chart 133 that provides an example of historical price information for a different airline flight, such as on a different route and/or from a different airline. In this example, there are two primary price tiers, but there is also a large drop in price in the middle of the second price tier (near the date of 1/5 in this example). Thus, customers who express interest in the item near the beginning of the second price tier (e.g., around the dates of 12/30-1/3) might pay a large price if they purchased immediately, but could significantly benefit by waiting to purchase the ticket during the later price drop. Figure 1E illustrates yet another example chart 134 with example historical price information, which in this example shows a more gradual increase in price over time as the departure date approaches. A variety of other types of price change behavior could similarly occur.
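The notion of price tiers used in describing these charts can be made concrete with a small sketch. The Python fragment below is one assumed way of grouping a chronological price series into tiers; the function `price_tiers` and its 5% tolerance are hypothetical, and no particular tier-detection method is implied by the charts themselves.

```python
def price_tiers(prices, tolerance=0.05):
    """Group consecutive prices into tiers of relatively stable price.

    A new tier starts whenever a price differs from the current tier's
    first price by more than `tolerance` (a fraction of that price).
    """
    tiers = []
    for index, price in enumerate(prices):
        if tiers and abs(price - tiers[-1]["base"]) <= tolerance * tiers[-1]["base"]:
            tiers[-1]["end"] = index  # small fluctuation: extend current tier
        else:
            tiers.append({"base": price, "start": index, "end": index})
    return tiers
```

Under this rule a small fluctuation within a tier is absorbed into that tier, while a large temporary drop such as the one in Figure 1D appears as a short tier of its own between two tiers at the higher price.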
As illustrated in these examples, prices for airline tickets can change in various ways and based on various factors. For example, information other than an amount of time before departure can be a factor that affects price changes in at least some embodiments, and can thus be used as part of a later predictive pricing determination in those situations. For example, Figure 1F illustrates an example chart 135 for the same flight previously discussed with respect to Figure 1B, although in this example the price is shown as it varies based on the factor of the availability of remaining seats on the flight. In other embodiments, however, such flight availability factor information may not be available and/or other additional factors could similarly be considered.
In a similar manner, the example chart 136 in Figure 1G illustrates that information other than price may be tracked, analyzed and used in some embodiments, such as in this example displaying historical flight availability information for one or more flights (e.g., flights that depart on a particular day or instead on any of a group of similarly situated days). Such availability information may then be used in some embodiments to assist in a determination of whether a current price is currently a good buy, such as based on considering the likelihood of the flight selling out in the future. However, in embodiments where the price of an item typically already varies based on remaining availability, such as for airline tickets, availability information may instead be considered implicitly based merely on the price factor (e.g., if the availability does not independently affect a decision).
Thus, as previously indicated, historical price information for airline ticket prices can be illustrated in a graphical manner to show changes over time as the departure time nears. In addition, such historical data can also be analyzed in a variety of ways to provide other types of information that can assist in later performing predictive pricing. For example, with respect to Figure 1H, example information is shown in table 142 that indicates a historical minimum price, maximum price, and maximum price change for particular routes. Similarly, Figure 1I illustrates in table 144 an average number of price changes for particular routes. A variety of other types of analyses can similarly be performed related to price changes if the resulting information assists in predictive pricing and/or in providing advice based on current prices.
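Per-route summaries of the kind shown in tables 142 and 144 could be computed along the following lines. This Python sketch makes several assumptions that are not taken from the tables: the input is a list of hypothetical `(route, flight_id, price)` tuples in chronological order for each flight, and "maximum price change" is interpreted as the largest single change between consecutive observations of the same flight.

```python
from collections import defaultdict


def route_statistics(observations):
    """Summarize historical price observations per route."""
    prices_by_flight = defaultdict(list)
    for route, flight_id, price in observations:
        prices_by_flight[(route, flight_id)].append(price)

    stats = {}
    change_counts = defaultdict(list)
    for (route, _flight_id), prices in prices_by_flight.items():
        # Absolute size of each actual price change between observations.
        changes = [abs(b - a) for a, b in zip(prices, prices[1:]) if b != a]
        entry = stats.setdefault(
            route, {"min_price": prices[0], "max_price": prices[0], "max_change": 0})
        entry["min_price"] = min([entry["min_price"], *prices])
        entry["max_price"] = max([entry["max_price"], *prices])
        entry["max_change"] = max([entry["max_change"], *changes])
        change_counts[route].append(len(changes))

    # Average number of price changes per flight on each route.
    for route, counts in change_counts.items():
        stats[route]["avg_price_changes"] = sum(counts) / len(counts)
    return stats
```

Here each flight is identified by an opaque `flight_id`, which in practice might combine a flight number and a departure date.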
Once predictive pricing information is available, it can be used in a variety of manners to assist potential acquirers and/or providers of items. For example, Figure 1J illustrates example information 159 that provides flight alternative information to a potential customer, such as via a Web page provided to the customer for display from an online travel agent. In this example, the provider of information is referred to as "Hamlet". In particular, in this example four alternative flight options 150 are displayed to the customer, such as by including them in search results in response to a prior request from the customer. The alternatives are listed in order from lowest price to highest price in this example, with the lowest price being $499 for alternative 150a. However, in this example the low price for alternative 150a corresponds to a special fare that is being offered to the customer by the travel agent, as is indicated to the customer in this example via notification 155 and is explained more fully to the customer if they select the control 156. In particular, alternative 150a corresponds to the same flight as that for alternative 150c, but the indicated price of $649 for alternative 150c reflects the actual price currently offered by the original supplier of this flight (in this example, Alaska Airlines). The special fare offered in this example for alternative 150a is instead based on predictive pricing for this flight, which in this example has provided an indication that a future price for this flight will be lower (e.g., as low as $499 if the travel agency plans to offer the full potential savings to the customer, lower than $499 if the travel agency plans to retain some of the potential savings, somewhat above $499 if the travel agency is willing to offer an additional discount to this customer and/or in this situation, etc.).
Thus, in this example the travel agency has elected to provide at least $150 worth of potential savings to the customer if the customer purchases now by offering a price to the customer that is lower than the price currently offered from the original supplier. If the customer then proceeds to purchase the flight for alternative 150a at the special fare, such as by selecting control 157, the travel agency may nonetheless wait until later to actually purchase a ticket for the customer on this flight, such as a later time when the actual price offered by the item supplier is lower. In this example, the customer is selecting a round-trip flight, and thus after selecting the control 157 for alternative 150a the customer will be prompted to provide information related to a return flight. If the customer was instead selecting a one-way trip or this was the last selection of a trip with multiple legs or segments, selection of a corresponding control by the customer could instead prompt the online travel agency to provide confirmation to the customer of their purchase having been completed at the indicated price of $499, even if the travel agency delays a purchase of tickets for the flight until later.
In other embodiments, explicit notification that the alternative 150a is a special fare might instead not be provided, such as by not showing alternative 150c with the actual current price offered by the original supplier airline, and instead merely listing a price for alternative 150a that is selected as satisfying one or more goals of the travel agency, such as to maximize profit (e.g., a price that is between the lowest predicted future price and the lowest actual current price offered by a supplier for one of the alternatives, such as in this example to be less than the $598 price for alternative 150b). Conversely, in some embodiments additional aspects related to the special fare may be conveyed to the customer, such as if any special fares selected by the customers for purchase are contingent on the travel agency acquiring the ticket under specified conditions (e.g., at or below a specified price and/or within a specified amount of time). If so, the travel agency may not confirm purchase to the customer until after the ticket is actually acquired from a supplier.
Figure 1K illustrates example information 169 that provides return flight information to the customer after selection of control 157 in Figure 1J, including information 162 about that previously selected flight. The information includes two alternative return flights 160, with both of the alternatives in this example including a notification 165 that the indicated prices are special fares from the travel agent. In some embodiments, such notifications may further explicitly indicate to a customer that a special fare is based on predictive pricing, while in other embodiments such information may not be provided to the customer.
Figure 1L illustrates example information 179 that provides alternative return flight information to the customer if the customer instead selected alternative 150b in Figure 1J for the initial segment of the trip, including information 172 about that previously selected flight. In this example, three alternative return flights 170 are included, and an additional notification 177 is provided to the customer to remind them that a lower price trip is available based on the special fare offered by the travel agency. In addition, in this example the travel agency provides various other types of notifications based on the use of predictive pricing. For example, as indicated in information 178 and with notification 175, the current price of $598 round trip for flight alternative 170a is indicated in this example to be a good buy that may justify immediate purchase, such as because the price is unlikely to fall and may rise in the near future. In other embodiments, customers may instead
be referred to an agent or a supplier from whom they can immediately acquire such good buy tickets, such as in exchange for a referral fee. In addition, in this example the travel agency further provides an option 176 to the customer that is also based on predictive pricing information - in particular, since in this example the predictive pricing indicates that the price is not likely to drop, the travel agency is willing to offer price protection insurance to the customer for a small fee, such that if the actual offered price drops after purchase the customer would then receive an additional benefit (e.g., a discount on their purchased price so as to reduce it to the lowest actual price that was offered). While the price protection insurance is offered to the customer for an additional fee in this example, in other embodiments such price protection insurance may not be offered or instead may be offered to a customer without additional explicit cost to the customer.
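The price protection option can be sketched as follows, purely for illustration. The probability input, the `max_risk` threshold, and both function names are assumptions that do not appear in the disclosure; the refund rule follows the example benefit described above, in which the purchased price is reduced to the lowest actual price later offered.

```python
from typing import Iterable


def should_offer_protection(prob_price_drop: float,
                            max_risk: float = 0.2) -> bool:
    """Offer insurance only when a price drop is predicted to be unlikely.

    prob_price_drop and max_risk are hypothetical inputs.
    """
    return prob_price_drop <= max_risk


def protection_refund(purchase_price: float,
                      later_offered_prices: Iterable[float]) -> float:
    """Refund that brings the purchase price down to the lowest later offer."""
    lowest = min(later_offered_prices, default=purchase_price)
    # Never refund a negative amount when prices only rose.
    return max(0.0, purchase_price - lowest)


refund = protection_refund(598.0, [610.0, 575.0, 590.0])
```

In this sketch a customer who paid $598 and later saw the fare offered at $575 would receive the $23 difference.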
Figure 1M illustrates example information 189a that provides return flight information to the customer if the customer instead selected alternative 150d in Figure 1J for the initial segment of the trip, including information 182 about that previously selected flight. In particular, in this example three alternative return flights 180 are provided to the user, and predictive pricing information allows the travel agency to determine whether some or all of the alternatives are good fares or are otherwise good buys. However, in this example such additional information based on predictive pricing is available only to registered customers, and thus the information 189a includes indications 181a-181c to the user that they can obtain such notification information after they register (e.g., via selection of the control in section 183), such as based on a fee charged to the customer (e.g., a one-time fee or an ongoing subscription), or instead based on other benefits to the travel agency of such registration (e.g., obtaining additional information about the customers for use in better serving them and/or tailoring advertising or other information that will be displayed or otherwise provided to them). Alternatively, the initial registration may be free and may provide a basic level of information to a customer, while an upgrade to one or more premium fee-based registration services with additional information and/or functionality (e.g., to provide details and/or reasons about notifications, to provide alerting functionality, etc.) may additionally be available. Different types of services could also be used for different types of customers, such as individuals
purchasing on their own behalf versus users acting on behalf of others (e.g., travel agents, corporate travel managers, etc.).
As noted above, in some embodiments and situations revenue may be derived through various types of advertising to users, such as advertising supplied interactively to users along with other supplied information (e.g., as banner or popup ads, sponsored listings in search results, paid inclusion for search or other results, etc.), advertising supplied or otherwise made available to users in a non-interactive manner (e.g., permission-based or other email or other forms of notification) such as based on demographic and/or personal preference information for the users, etc. Similarly, in some embodiments and situations revenue may be derived through other uses of information about users themselves and/or about purchase-related activities of such users, including selling or otherwise providing such information to third parties (e.g., with permission of the users).
Figure 1N illustrates information 189b that is similar to that displayed with respect to Figure 1M, but which includes alternative types of notifications to the customer for the return flight alternatives. For example, these alternative notifications may be provided to the customer after they complete the registration process with respect to Figure 1M. In particular, example notification 184a provides additional information to a user for a particular flight, such as to buy the flight at the current price now because the price is not likely to drop and the flight may soon sell out. Conversely, notification 184b indicates to the customer to hold off on purchasing the indicated flight at its current price, as the price of that flight is likely to drop in the future. Notification 184c indicates for its alternative flight that the price is not likely to rise or drop, and thus advice on whether to purchase immediately cannot be made based purely on price information. In other alternatives, yet other types of information could be provided, such as by including information in notification 184b that further indicates to the customer a length of time that the customer should wait before purchasing and/or a price or price range for which the customer should wait before completing the purchase.
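The three kinds of notification described for Figure 1N could, as one hedged illustration, be derived from a comparison of the current price with a predicted future price. The `tolerance` band, the `may_sell_out` flag, and the names `Advice` and `advise` are assumptions made for this sketch only and are not taken from the disclosure.

```python
from enum import Enum


class Advice(Enum):
    BUY_NOW = "buy now"                      # cf. notification 184a
    WAIT = "wait"                            # cf. notification 184b
    NO_PRICE_ADVICE = "no price-based advice"  # cf. notification 184c


def advise(current_price: float, predicted_price: float,
           tolerance: float = 0.02, may_sell_out: bool = False) -> Advice:
    """Map a predicted relative price change to a notification type."""
    if current_price <= 0:
        raise ValueError("current_price must be positive")
    change = (predicted_price - current_price) / current_price
    if change < -tolerance:
        return Advice.WAIT          # price likely to drop
    if change > tolerance or may_sell_out:
        return Advice.BUY_NOW       # price unlikely to drop; may rise or sell out
    return Advice.NO_PRICE_ADVICE   # price expected to stay roughly flat
```

A richer version could also return a suggested waiting period or target price range, as contemplated above for notification 184b.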
While not illustrated here, advice could also be provided to customers in a variety of ways other than as part of an interactive response to the customer. For example, various types of alerts could instead be provided to a customer in a manner initiated by the travel agency or a related system with access to the airline
price information, such as for alternative 150a in Figure 1J if the customer had previously requested information on special fares for this flight or on fares below $500 for any flights between Seattle and Boston. Such alerts could take a variety of forms, including e-mail, instant message, a phone call, fax, etc. In addition, in other alternatives the travel agent and/or an independent agent acting on behalf of the customer could automatically purchase a flight when it met certain criteria for the customer, including if the flight is determined to be a good buy.
Figure 1O illustrates example information 199 showing another alternative for using predictive pricing information to assist customers and/or sellers. In particular, in this example information is shown that is similar to that illustrated in Figure 1J, but with only two alternatives illustrated to the customer. The top alternative in this example corresponds to the same flight that was previously indicated to be alternative 150a in Figure 1J, but in this example a specific special fare is not offered to the customer based on the predictive pricing. Instead, as indicated by the customer-selectable control 193, the customer is in this example offered the opportunity to offer a named price for the particular flight shown. In addition, the displayed information to the customer further includes an indication of a second alternative flight, which in this example does not include the name-your-price functionality, although the specific offered price does provide context to the customer of other current prices offered for competitive flights - in other embodiments, such additional information may instead not be provided. In this example, if the customer selects the control 193 and offers a price above $499 (the special fare in Figure 1J for this flight), the travel agency may accept that offer even though it is below the current price offered for the flight of $649. In other alternatives, customers could name prices for flights at varying degrees of specificity, such as any flight that is sufficiently similar to previously indicated search criteria by the customer, flights on a specified airline but not limited to a particular flight, etc. In addition, customers could similarly purchase items using other purchase models that similarly use predicted price information, such as based on various auction-related purchase models.
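A minimal sketch of the name-your-price acceptance test in this example follows. The function name and the optional `min_margin` parameter are hypothetical; the only rule taken from the text is that an offer above the predicted lower price may be accepted even though it is below the supplier's current price.

```python
def accept_named_price(offer: float, predicted_low: float,
                       min_margin: float = 0.0) -> bool:
    """Accept an offer that exceeds the predicted future price.

    min_margin is an assumed extra amount the seller wants to keep.
    """
    return offer > predicted_low + min_margin


# Example values: predicted low $499, supplier's current price $649.
accepted = accept_named_price(520.0, 499.0)
rejected = accept_named_price(480.0, 499.0)
```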
Thus, Figures 1J-1O provide examples of specific types of functionality that may be provided to customers by intermediate sellers based on the use of predictive pricing information, although in other embodiments such predictive pricing
information could be used in other ways. Also, as was shown in these examples, the predictive pricing information allows different types of functionality to be offered to different types or categories of customers. For example, the special fares and general notifications of whether a flight is a good buy may be of interest to bargain and value shoppers. Similarly, the name-your-price model may allow such customers to save money, while also being able to specify flights at a much more detailed level than is currently provided in the marketplace (e.g., by Priceline), which provides less uncertainty and fewer restrictions for the customers. Conversely, frequent travelers may prefer to obtain additional information related to predictive pricing, such as details and/or reasons related to why a flight is a good buy, or specific recommendations on how to obtain potential savings when the future price may drop - if so, such additional information may be available to them for an additional fee, such as based on a premium registration service. In addition, professionals that represent other travelers (e.g., travel agents, in-house corporate travel managers, etc.) may want even more information and/or the ability to obtain predictive pricing information in high volume and/or in bulk, such as for additional fees.
Figure 2 illustrates a server computing system 200 suitable for executing embodiments of one or more software systems/modules that perform analyses related to predictive pricing information. The example server computing system includes a CPU 205, various I/O devices 210, storage 220, and memory 230. The I/O devices include a display 211, a network connection 212, a computer-readable media drive 213, and other I/O devices 215.
A Predictive Pricing ("PP") Determiner system facility 240 is executing in memory 230 in this example in order to analyze historical price data and determine predictive pricing information. Similarly, a PP Provider system facility 241 is executing in memory 230 in order to provide predictive pricing information relative to current items on request, such as to users (e.g., buyers and/or sellers) and/or to other system facilities that use that information to provide various services to users.
As the PP Determiner system executes in memory 230, it analyzes various historical item price information, such as that available in a database 221 of storage 220 or instead as obtained from another executing system or remote storage location. After analyzing the historical price information, such as at the request of a
user or instead on a scheduled basis, the PP Determiner system determines various predictive pricing information related to the historical item prices (e.g., underlying factors that affect price changes, various patterns or other information about price changes relative to the factors, policies related to responding to current factors, etc.). The system then in the illustrated embodiment stores the determined information in a database 223 on storage, although in other embodiments the system could provide the information interactively to a user or other executing system. In some embodiments and/or situations, the PP Determiner system could also obtain historical price information for use in its analysis by repeatedly querying an external supplier of such information to obtain then-current information, and could then analyze the obtained information, whether dynamically as it is obtained or instead later after a sufficient amount of historical price information has been gathered or on a periodic basis. Such external information sources could be accessed in a variety of ways, such as via one or more server computers 270 over a network 280 (e.g., to retrieve stored information 273 directly and/or via interaction with an application 279 executing in memory).
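The repeated querying of an external supplier to accumulate historical price information might be sketched as below. The `fetch_price` callable stands in for whatever interface the external source exposes and is an assumption, as are the injectable `sleep` and `clock` parameters, which are included so the sketch can be exercised without real delays.

```python
import time
from typing import Callable, List, Tuple


def collect_price_history(fetch_price: Callable[[], float],
                          samples: int,
                          interval_seconds: float,
                          sleep: Callable[[float], None] = time.sleep,
                          clock: Callable[[], float] = time.time,
                          ) -> List[Tuple[float, float]]:
    """Poll a price source and record (timestamp, price) observations."""
    history: List[Tuple[float, float]] = []
    for i in range(samples):
        history.append((clock(), fetch_price()))
        # Wait between polls, but not after the final observation.
        if i < samples - 1:
            sleep(interval_seconds)
    return history


# Usage with a stand-in price source and no real waiting.
_prices = iter([649.0, 629.0, 599.0])
observed = collect_price_history(lambda: next(_prices), samples=3,
                                 interval_seconds=60.0,
                                 sleep=lambda s: None, clock=lambda: 0.0)
```

The gathered observations could then be analyzed as they arrive or later in a batch, consistent with the alternatives described above.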
When predictive pricing information is available, whether via previously stored information in database 223 or in response to a query to the PP Determiner system, the PP Provider system facility 241 executing in memory 230 can obtain and provide predictive pricing information (e.g., for a specified item or group of items), such as in response to a request from a user or other executing system facility. In this illustrated embodiment, example system facilities 243-249 are executing in memory 230 to provide functionality based on predictive pricing information, and thus may provide such requests to the PP Provider system, although in other embodiments some or all of those additional system facilities may instead be executing remotely or may not be present. In this illustrated embodiment, the PP Provider system provides predictive pricing information for a request by obtaining information about predicted future prices for the item as discussed above, by analyzing and modifying the obtained information if needed, and providing information about those predicted future prices. In some embodiments, the system 241 could further obtain, use and provide current pricing information for the items, such as from a current item information database 225 on storage 220, while in other embodiments the PP Provider system may instead obtain and provide predictive pricing information based
merely on various current factors for an item, such as those supplied in the request or instead otherwise obtained by the PP Provider system (e.g., from the database 225).
In particular, as one example of a system facility that can obtain and use predictive pricing information, the PP Advisor system facility 243 is executing in memory 230. In response to an indication to provide advice, such as based on an interactive request from a customer or instead based on a scheduled indication to determine whether to provide an alert to a customer based on a previously received request, the PP Advisor system obtains predictive pricing information for one or more items, such as by interacting with the PP Provider system. The PP Advisor system also obtains current price information for those items, and then determines one or more types of advice to provide to an appropriate customer based on that information. In some embodiments, the advice is provided via notifications interactively displayed to the customer that indicate information about current item prices to advise the customer. In other embodiments, the advice may be provided in other forms, such as via an alert sent to a registered customer. Various information about customers may be stored and used when providing advice, such as in a customer database 227 on storage 220, in order to determine when, whether, and how to provide notification to a customer in accordance with their preferences and interests.
As another example of a system facility that uses predictive pricing information, the illustrated embodiment further includes a PP Seller system facility 245 executing in memory 230. The PP Seller system obtains predictive pricing information for one or more items, such as from the PP Provider system, as well as current price information for the items. The PP Seller system then assists a seller (e.g., an intermediate seller) in using the predictive pricing information in one or more of a variety of ways, such as to determine whether and when to offer prices to customers that are lower than prices currently offered by suppliers of items, to accept bids or offers from customers that are lower than prices currently offered by item suppliers but higher than predicted lower future prices for the items, to delay an actual purchase of one or more items from item suppliers that have been purchased from the seller by customers, etc.
The PP Buyer system facility 247 is another example of a system facility executing in memory 230 in the illustrated embodiment that can obtain and use predictive pricing information in order to enable better buying decisions, in this situation by directly assisting buyers (e.g., bulk buyers). In particular, the PP Buyer system obtains predictive pricing information for one or more items, such as from the PP Provider system, as well as current price information for those items. The PP Buyer system facility then assists the buyer in determining whether and when to make purchasing decisions, such as to delay purchases based on predicted future price drops and/or to aggregate multiple purchases together to provide additional benefits, to hedge against such delays by purchasing some items immediately and delaying others, to negotiate with an intermediate seller or item supplier for lower prices based on predicted future price drops, to immediately purchase items that are not otherwise immediately needed based on predicted future price increases, etc.
The PP Analyzer system facility 249 is another example system executing in memory 230 that uses predictive pricing information to provide benefits to customers or other users. The PP Analyzer system analyzes prior purchase information, such as that stored in database 229 on storage 220 or instead as interactively supplied by a user making a request, in order to determine whether the prior purchasing decisions were made effectively. In particular, the PP Analyzer system obtains information about pricing information that would have been predicted for those items at the time of purchase, such as from the PP Provider system, and then compares the actual purchase decisions made to the decisions that would have been advised based on use of the predictive pricing information. In some embodiments, the PP Analyzer system further may obtain historical price information for the purchase items (e.g., from the database 221) that corresponds to offered prices after the purchase date but before a date that the item is needed, such as to determine whether the actual purchase prices were higher than an optimal purchase price that was available. The PP Analyzer system can then provide information about the analysis performed to assist in better future buying decisions.
Those skilled in the art will appreciate that computing systems and devices 200, 250 and 270 are merely illustrative and are not intended to limit the scope of the present invention. Computing system 200 may be connected to other devices that are not illustrated, including through one or more networks such as the Internet
(e.g., via the World Wide Web ("Web")) or other computer network. More generally, a "client" or "server" may comprise any combination of hardware or software that can interact in the indicated manner, including computers, network devices, internet appliances, PDAs, wireless phones, pagers, electronic organizers, television-based systems and various other consumer products that include inter-communication capabilities. In addition, the functionality provided by the various system components may in some embodiments be combined in fewer components or distributed in additional components. Similarly, in some embodiments the functionality of some of the illustrated components may not be provided and/or other additional functionality may be available.
Those skilled in the art will also appreciate that, while various items are illustrated as being stored in memory or on storage while being used, these items or portions of them can be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software components may execute in memory on another device and communicate with the illustrated computing device via intercomputer communication. Some or all of the system components or data structures may also be stored (e.g., as instructions or structured data) on a computer-readable medium, such as a hard disk, a memory, a network, or a portable article to be read by an appropriate drive. The system components and data structures can also be transmitted as generated data signals (e.g., as part of a carrier wave) on a variety of computer-readable transmission mediums, including wireless-based and wired/cable-based mediums. Accordingly, the present invention may be practiced with other computer system configurations.
Figure 3 is a flow diagram of an embodiment of a Predictive Pricing Determiner routine 300. The routine begins at step 305, where historical pricing information is obtained for one or more items, and continues to step 310 to analyze the data to determine predictive pricing information based on the historical data. In step 315, the routine then stores or updates previously stored predictive pricing information from the analysis in step 310. After step 315, the routine continues to step 395 to determine whether to continue. If so, the routine returns to step 305, and if not the routine continues to step 399 and ends.
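One pass through steps 305-315 can be sketched as follows. The analysis shown (mean price change between consecutive observations and the fraction of observations that were drops) is a deliberately simple stand-in chosen for illustration and is not asserted to be the analysis technique of any embodiment; the names `analyze` and `determiner_pass` are likewise hypothetical.

```python
from statistics import mean
from typing import Dict, List


def analyze(prices: List[float]) -> Dict[str, float]:
    """Stand-in for step 310: summarize how a price series has moved."""
    changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
    if not changes:
        return {"mean_change": 0.0, "drop_fraction": 0.0}
    drops = sum(1 for c in changes if c < 0)
    return {"mean_change": mean(changes),
            "drop_fraction": drops / len(changes)}


def determiner_pass(history: Dict[str, List[float]],
                    store: Dict[str, Dict[str, float]]
                    ) -> Dict[str, Dict[str, float]]:
    """One loop iteration: analyze obtained history, then store or update."""
    for item, prices in history.items():   # data obtained in step 305
        store[item] = analyze(prices)      # steps 310 and 315
    return store


store = determiner_pass({"SEA-BOS": [649.0, 629.0, 639.0, 599.0]}, {})
```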
Figure 4 is a flow diagram of an embodiment of a Predictive Pricing Provider routine 400. While illustrated here as a routine that is separate from the Predictive
Pricing Determiner routine, as well as from other later-discussed routines that use the provided determined information, the routine could instead in other embodiments be incorporated together with one or more such other routines.
The routine begins in step 405, where a request is received for predictive pricing information for one or more specified items and/or a specified situation. The routine continues to step 410 to determine whether the requestor is authorized to receive the requested information, such as for a registered customer (whether directly or via another system facility acting as an intermediary on behalf of that customer). If so, the routine continues to step 415 to obtain corresponding predictive pricing information, such as by retrieving stored information or instead by interactively requesting the PP Determiner to provide the information. After step 415, the routine continues to step 420 to determine predictive pricing specific to the request based on the retrieved information, such as one or more specific predicted future prices, a predicted future price pattern, a predicted future direction, predictions about specific times in the future corresponding to predictive prices, etc. While not illustrated here, in some embodiments the routine may further obtain information about current prices for the items, such as to assist in the predictive pricing (e.g., to determine a future price relative to the current price) and/or to enable comparison between the current and predicted future prices. After step 420, the routine continues to step 425 to provide the determined information to the requestor. After step 425, or if it was instead determined in step 410 that the requestor was not authorized, the routine continues to step 495 to determine whether to continue. If so, the routine returns to step 405, and if not the routine continues instead to step 499 and ends.
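The request handling of steps 410-425 is sketched below under stated assumptions: authorization is modeled as membership in a set of registered requestors, stored predictive information is modeled as a dictionary holding an assumed `mean_change` value per item, and the request-specific prediction is a simple projection from the current price. None of these representations is prescribed by the disclosure.

```python
from typing import Dict, Optional, Set


def provide_prediction(requestor: str, item: str, current_price: float,
                       authorized: Set[str],
                       store: Dict[str, Dict[str, float]]
                       ) -> Optional[Dict[str, object]]:
    """Return request-specific predictive pricing, or None if unavailable."""
    if requestor not in authorized:          # step 410: authorization check
        return None
    info = store.get(item)                   # step 415: retrieve stored info
    if info is None:
        return None
    # Step 420: derive a prediction specific to this request.
    predicted = current_price + info["mean_change"]
    if predicted < current_price:
        direction = "down"
    elif predicted > current_price:
        direction = "up"
    else:
        direction = "flat"
    # Step 425: provide the determined information to the requestor.
    return {"item": item, "predicted_price": predicted,
            "direction": direction}


stored = {"SEA-BOS": {"mean_change": -20.0}}
result = provide_prediction("alice", "SEA-BOS", 649.0, {"alice"}, stored)
```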
Figure 5 is a flow diagram of an embodiment of a Predictive Pricing Seller routine 500. The routine obtains predictive pricing information for one or more items, and uses the information to assist a seller (e.g., an intermediate seller) to perform selling decisions in one or more of a variety of ways.
The routine begins in step 505, where a request is received related to one or more items. In step 510, the routine determines current prices for the items, and in step 515 obtains predicted prices for the items, such as by interacting with the PP
Provider routine. In step 520, the routine then determines a price at which to currently offer the items based on the current prices and/or the predicted future prices, and in step 525 provides information about the determined offer and price to the requestor. In other embodiments, a variety of other additional types of functionality could be provided, such as to determine whether to offer price protection insurance to a requestor based on the current prices and/or the predicted future prices. In addition, the determination of the price at which to offer an item can be made in various ways, such as to select prices lower than current offer prices based on predicted future prices dropping, or instead in some embodiments by negotiating with a supplier of the items to obtain a lower offered price from the supplier based on predicted lower future prices.
After step 525, the routine continues in step 530 to determine whether the requestor is interested in purchasing or otherwise acquiring one or more of the items at one of the offered prices. If so, the routine continues to step 535 to determine whether to fulfill the requestor's acquisition by actually acquiring the item from an item supplier now or instead by waiting until later (e.g., based on a predicted lower future price). If it is determined that it is preferable to buy now, the routine continues to step 540 to buy the item, and otherwise the routine continues to step 545 to store information about the item and to optionally schedule a later time to buy the item (e.g., to reflect a time at which it is predicted that the price will be lower, or instead to periodically check for lower prices). In some embodiments, the decision to delay a purchase may further be made at least in part on the basis of a goal to aggregate multiple item purchase requests (e.g., for the same item, for related items such as items from a single supplier, etc.) in order to perform hedge activities or otherwise negotiate discounts. After steps 540 or 545, the routine continues to step 550 to provide confirmation to the requestor of the requestor's purchase. In situations in which the item has already been bought or is otherwise available, the item may in addition be supplied to the purchaser at this time, while in other situations (e.g., when the actual purchase is delayed), the supplying of the item may similarly be delayed. After step 550, or if it was instead determined in step 530 not to make a purchase, the routine continues in step 595 to determine whether to continue. If so, the routine returns to step 505, and if not the routine continues to step 599 and ends.
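The fulfillment decision of steps 535-550 might be sketched as follows. The `min_expected_gain` threshold and the in-memory `Ledger` used to record purchases are assumptions for illustration; an actual embodiment could use any decision criteria and storage.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Ledger:
    """Hypothetical record of supplier purchases made now or deferred."""
    bought_now: List[Tuple[str, float]] = field(default_factory=list)
    deferred: List[Tuple[str, float]] = field(default_factory=list)


def fulfill(item: str, supplier_price: float, predicted_low: float,
            ledger: Ledger, min_expected_gain: float = 0.0) -> str:
    """Decide whether to buy from the supplier now or wait (step 535)."""
    if supplier_price - predicted_low > min_expected_gain:
        # Step 545: store the item and the price to wait for.
        ledger.deferred.append((item, predicted_low))
        return "confirmed; supplier purchase deferred"       # step 550
    # Step 540: buy immediately at the supplier's current price.
    ledger.bought_now.append((item, supplier_price))
    return "confirmed; purchased from supplier"              # step 550


ledger = Ledger()
first = fulfill("flight-A", 649.0, 499.0, ledger)
second = fulfill("flight-B", 598.0, 598.0, ledger)
```

Either way the requestor receives confirmation, which mirrors the point above that the customer's purchase can be confirmed even when the seller's own acquisition is delayed.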
Figure 6 is a flow diagram of a Predictive Pricing Advisor routine 600. The routine obtains predictive pricing information for items and uses the information to provide advice, such as to customers.
The routine begins in step 605, where a request is received related to one or more items. In the illustrated embodiment, the routine is illustrated as providing advice in an interactive manner, although in other embodiments such requests could be for future alerts and could be stored for periodic or scheduled processing to satisfy the requests. After step 605, the routine continues in step 610 to determine current prices for the items corresponding to the request, and in step 615 to obtain predicted prices for the items, such as by interacting with the PP Provider routine. In step 620, the routine determines what advice to give, such as based on a comparison of the current price to the predicted future price and on any other available information. The routine then continues to step 625 to determine how to provide the advice, such as via a notification displayed to the user along with other information or instead by alerting the user proactively in one or more of a variety of ways. After step 625, the routine continues to step 630 to provide the determined advice to the customer in the determined manner. In step 695, the routine then determines whether to continue. If so, the routine returns to step 605, and if not the routine continues to step 699 and ends.
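For the alternative in which requests for future alerts are stored and processed periodically, one possible sketch is shown below. The `AlertRequest` record, the dictionary layout of a fare, and the flight labels are all hypothetical; the matching rule (a fare on the requested route priced below the customer's threshold) follows the kind of request discussed earlier, such as fares below $500 between Seattle and Boston.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AlertRequest:
    customer: str
    origin: str
    destination: str
    max_price: float
    channel: str = "email"   # how the customer prefers to be alerted


def pending_alerts(requests: List[AlertRequest],
                   fares: List[Dict[str, object]]
                   ) -> List[Tuple[str, str, str, float]]:
    """Match stored alert requests against currently available fares."""
    alerts = []
    for req in requests:
        for fare in fares:
            same_route = (fare["origin"], fare["destination"]) == (
                req.origin, req.destination)
            if same_route and fare["price"] < req.max_price:
                alerts.append((req.customer, req.channel,
                               fare["flight"], fare["price"]))
    return alerts


requests = [AlertRequest("alice", "SEA", "BOS", 500.0)]
fares = [
    {"flight": "flight-A", "origin": "SEA", "destination": "BOS", "price": 499.0},
    {"flight": "flight-C", "origin": "SEA", "destination": "BOS", "price": 649.0},
]
alerts = pending_alerts(requests, fares)
```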
Figure 7 is a flow diagram of an embodiment of a Predictive Pricing Buyer routine 700. The routine obtains and uses predictive pricing information for items in order to assist buyers in making buying decisions, such as for bulk buyers.
The routine begins in step 705, where one or more items of interest to purchase are determined, such as based on a request received from a user. In step 710, the routine determines current prices for the items, and in step 715 obtains predicted prices for the items, such as by interacting with the PP Provider routine. In step 720, the routine determines a price at which to buy some or all of the items and a time at which such purchases should be made. In some embodiments, such a determination could be made by interactively negotiating with an item supplier or intermediate seller in order to obtain discounted prices based on predicted lower future prices, while in other embodiments the determination may be made based on other factors. Similarly, some or all of such items could be determined to have their purchases held until later in order to aggregate for various purposes, such as for a
consolidator. In step 725, the routine then provides information about the determined price and optionally additional information about the predictive pricing in an appropriate manner, such as by providing the information to a requestor from step 705.
After step 725, the routine continues in step 730 to determine whether an appropriate user is interested in purchasing or otherwise acquiring one or more of the items at one of the offered prices, such as based on a received request. If so, the routine continues to step 735 to determine whether to fulfill that acquisition by actually acquiring the item from an item supplier now or instead by waiting until later (e.g., based on a predicted lower future price). If it is determined that it is preferable to buy now, the routine continues to step 740 to buy the item, and otherwise the routine continues to step 745 to store information about the item and to optionally schedule a later time to buy the item (e.g., to reflect a time at which it is predicted that the price will be lower, or instead to periodically check for lower prices). In some embodiments, the decision to delay a purchase may further be made at least in part on the basis of a goal to aggregate multiple item purchase requests in order to perform hedge activities or otherwise negotiate discounts. After steps 740 or 745, the routine continues to step 750 to provide confirmation of the requested acquisition. After step 750, or if it was instead determined in step 730 not to make a purchase, the routine continues in step 795 to determine whether to continue. If so, the routine returns to step 705, and if not the routine continues to step 799 and ends.
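The hedging behavior available to a bulk buyer, in which some items are purchased immediately and others are delayed, can be illustrated with the following sketch. The `hedge_fraction` parameter and the rule that everything is bought now when no drop is predicted are assumptions introduced for this example.

```python
from typing import Dict


def plan_bulk_purchase(quantity: int, current_price: float,
                       predicted_price: float,
                       hedge_fraction: float = 0.5) -> Dict[str, int]:
    """Split an order between immediate and delayed purchases."""
    if not 0.0 <= hedge_fraction <= 1.0:
        raise ValueError("hedge_fraction must be between 0 and 1")
    # No predicted drop: delaying offers no benefit, so buy everything now.
    if predicted_price >= current_price:
        return {"buy_now": quantity, "buy_later": 0}
    # Predicted drop: buy a hedge portion now and delay the remainder.
    buy_now = round(quantity * hedge_fraction)
    return {"buy_now": buy_now, "buy_later": quantity - buy_now}


plan = plan_bulk_purchase(100, 600.0, 540.0)
```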
Figure 8 is a flow diagram of an embodiment of a Predictive Pricing Analyzer routine 800. The routine obtains predictive pricing information for items that corresponds to prior purchases of those items, and uses the predictive pricing information to analyze whether the buying decisions could have been performed more efficiently based on the predictive pricing. In addition, in the illustrated embodiment the routine further compares the previously purchased item prices to later actually available prices in order to determine how the actual and/or predicted prices compare to optimally available prices, although in other embodiments such actual later price information may not be used.
In step 805, a request is received to analyze historical purchases of items, and in step 807 the routine obtains information about the historical purchases, although in other embodiments such information may instead be supplied as part of the request in step 805. In step 810, the routine determines the purchase prices for the items, and in step 815 determines the predicted prices that would have been made for those items at that time (e.g., based on data that was then available and/or a version of predictive pricing techniques that were then used), such as based on interactions with the PP Provider routine. In step 820, the routine then generates an analysis of the actual prior purchase prices versus the prices that would have been obtained based on following the predictive pricing advice that would have been provided at that time, and in the illustrated embodiment further generates an analysis based on a comparison to the optimal price that could have been obtained based on other actual offered prices (e.g., before and/or after the time of actual purchase). In step 825, the routine then provides the generated analysis to the requester. After step 825, the routine continues to step 895 to determine whether to continue. If so, the routine returns to step 805, and if not the routine continues to step 899 and ends.
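The comparison generated in step 820 can be illustrated with a short Python sketch. The function name `analyze_purchase` and the keys of the returned dictionary are hypothetical, and the sketch assumes that the list of actually offered prices covers the whole period of interest (before and/or after the actual purchase).

```python
from typing import Dict, List


def analyze_purchase(actual_price: float,
                     advised_price: float,
                     offered_prices: List[float]) -> Dict[str, float]:
    """Compare one historical purchase against two baselines.

    actual_price   -- the price actually paid (step 810)
    advised_price  -- the price that following the predictive pricing advice
                      would have obtained (step 815)
    offered_prices -- prices that were actually offered over the period,
                      used to find the optimal price in hindsight
    """
    optimal = min(offered_prices)
    return {
        "saved_by_advice": actual_price - advised_price,
        "actual_vs_optimal": actual_price - optimal,
        "advice_vs_optimal": advised_price - optimal,
    }
```

A positive `saved_by_advice` indicates that following the advice would have lowered the price paid, while `advice_vs_optimal` shows how far the advice remained from the best price that was actually available.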
In a similar manner, this or a related routine could use predictive pricing information to assist a user in analyzing historical and/or recent/current pricing information for a specified group of one or more item suppliers, such as on behalf of an item supplier to analyze pricing information for one or more competitors and/or affiliated business entities (e.g., customers, suppliers, partners, etc.). When performed with respect to recent/current pricing information for one or more competitors, for example, such predictive pricing information may allow a user to anticipate likely price changes for those competitors and use that information to guide their own actions, whether in advance of any such actions by the competitors or instead as a response (e.g., an immediate response) if such actions by the competitors occur. If performed in advance, the user may be able to gain a first-mover advantage by use of the predictive pricing information.
Those skilled in the art will also appreciate that in some embodiments the functionality provided by the routines discussed above may be provided in alternative ways, such as being split among more routines or consolidated into fewer routines. Similarly, in some embodiments illustrated routines may provide more or less functionality than is described, such as when other illustrated routines instead lack or include such functionality respectively, or when the amount of functionality that is provided is altered. In addition, while various operations may be illustrated as being performed in a particular manner (e.g., in serial or in parallel) and/or in a particular order, those skilled in the art will appreciate that in other embodiments the operations may be performed in other orders and in other manners. Those skilled in the art will also appreciate that the data structures discussed above may be structured in different manners, such as by having a single data structure split into multiple data structures or by having multiple data structures consolidated into a single data structure. Similarly, in some embodiments illustrated data structures may store more or less information than is described, such as when other illustrated data structures instead lack or include such information respectively, or when the amount or types of information that is stored is altered.
Appendix A provides additional details related to one example of techniques for performing predictive pricing, which in that illustrative example are in the context of airline ticket prices. In addition, those skilled in the art will appreciate that a variety of similar techniques could instead be used in alternative embodiments. Some such additional techniques are discussed generally in "Machine Learning" by Tom M. Mitchell, McGraw-Hill Companies Inc., 1997, which is hereby incorporated by reference in its entirety.
From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims and the elements recited therein. In addition, while certain aspects of the invention are presented below in certain claim forms, the inventors contemplate the various aspects of the invention in any available claim form. For example, while only some aspects of the invention may currently be recited as being embodied in a computer-readable medium, other aspects may likewise be so embodied.
APPENDIX A
To Buy or Not to Buy: Mining Airfare Data to Minimize Ticket Purchase Price
Oren Etzioni, Dept. Computer Science, University of Washington, Seattle, Washington 98195, etzioni@cs.washington.edu
Craig A. Knoblock, Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, knoblock@isi.edu
Rattapoom Tuchinda, Dept. of Computer Science, University of Southern California, Los Angeles, CA 90089, pipet@isi.edu
Alexander Yates, Dept. Computer Science, University of Washington, Seattle, Washington 98195, ayates@cs.washington.edu
ABSTRACT
As product prices become increasingly available on the World Wide Web, consumers attempt to understand how corporations vary these prices over time. However, corporations change prices based on proprietary algorithms and hidden variables (e.g., the number of unsold seats on a flight). Is it possible to develop data mining techniques that will enable consumers to predict price changes under these conditions?
This paper reports on a pilot study in the domain of airline ticket prices where we recorded over 12,000 price observations over a 41 day period. When trained on this data, HAMLET, our multi-strategy data mining algorithm, generated a predictive model that saved 341 simulated passengers $198,074 by advising them when to buy and when to postpone ticket purchases. Remarkably, a clairvoyant algorithm with complete knowledge of future prices could save at most $320,572 in our simulation, thus HAMLET's savings were 61.8% of optimal. The algorithm's savings of $198,074 represents an average savings of 23.8% for the 341 passengers for whom savings are possible. Overall, HAMLET saved 4.4% of the ticket price averaged over the entire set of 4,488 simulated passengers. Our pilot study suggests that mining of price data available over the web has the potential to save consumers substantial sums of money per annum.
Categories and Subject Descriptors
I.2.6 [Artificial Intelligence]: Learning
Keywords
price mining, Internet, web mining, airline price prediction
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGKDD '03, August 24-27, 2003, Washington, DC, USA. Copyright 2003 ACM 1-58113-737-0/03/0008 ...$5.00.
1. INTRODUCTION AND MOTIVATION
Corporations often use complex policies to vary product prices over time. The airline industry is one of the most sophisticated in its use of dynamic pricing strategies in an attempt to maximize its revenue. Airlines have many fare classes for seats on the same flight, use different sales channels (e.g., travel agents, priceline.com, consolidators), and frequently vary the price per seat over time based on a slew of factors including seasonality, availability of seats, competitive moves by other airlines, and more. The airlines are said to use proprietary software to compute ticket prices on any given day, but the algorithms used are jealously guarded trade secrets [19]. Hotels, rental car agencies, and other vendors with a "standing" inventory are increasingly using similar techniques.
As product prices become increasingly available on the World Wide Web, consumers have the opportunity to become more sophisticated shoppers. They are able to comparison shop efficiently and to track prices over time; they can attempt to identify pricing patterns and rush or delay purchases based on anticipated price changes (e.g., "I'll wait to buy because they always have a big sale in the spring..."). In this paper we describe the use of data mining methods to help consumers with this task. We report on a pilot study in the domain of airfares where an automatically learned model, based on price information available on the Web, was able to save consumers a substantial sum of money in simulation.
The paper addresses the following central questions:
• What is the behavior of airline ticket prices over time? Do airfares change frequently? Do they move in small increments or in large jumps? Do they tend to go up or down over time? Our pilot study enables us to begin to characterize the complex behavior of airfares.
• What data mining methods are able to detect patterns in price data? In this paper we consider reinforcement learning, rule learning, time series methods, and combinations of the above.
• Can Web price tracking coupled with data mining save consumers money in practice? Vendors vary prices based on numerous variables whose values are not available on the Web. For example, an airline may discount seats on a flight if the number of unsold seats, on a particular date, is high relative to the airline's model. However, consumers do not have access to the airline's model or to the number of available seats on the flights. Thus, a priori, price changes could appear to be unpredictable to a consumer tracking prices over the Web. In fact, we have found price changes to be surprisingly predictable in some cases.
The remainder of this paper is organized as follows. Section 2 describes our data collection mechanism and analyzes the basic characteristics of airline pricing in our data. Section 3 considers related work in the areas of computational finance and time series analysis. Section 4 introduces our data mining methods and describes how each method was tailored to our domain. We investigated rule learning [8], Q-learning [25], moving average models [13], and the combination of these methods via stacked generalization [28]. Next, Section 5 describes our simulation and the performance of each of the methods on our test data. The section also reports on a sensitivity analysis to assess the robustness of our results to changes in the simulation. We conclude with a discussion of future work and a summary of the paper's contributions.
2. DATA COLLECTION
We collected airfare data directly from a major travel web site. In order to extract the large amount of data required for our machine learning algorithms, we built a flight data collection agent that runs at a scheduled interval, extracts the pricing data, and stores the result in a database.
We built our flight data collection agent using AgentBuilder (footnote 1) for wrapping web sites and Theseus for executing the agent [3]. AgentBuilder exploits machine learning technology [15] that enables the system to automatically learn extraction rules that reliably convert information presented on web pages into XML. Once the system has learned the extraction rules, AgentBuilder compiles this into a Theseus plan. Theseus is a streaming dataflow execution system that supports highly optimized execution of a plan in a network environment. The system maximizes the parallelism across different operators and streams data between operations to support the efficient execution of plans with complex navigation paths and extraction from multiple pages.
For the purpose of our pilot study, we restricted ourselves to collecting data on non-stop, round-trip flights for two routes: Los Angeles (LAX) to Boston (BOS) and Seattle (SEA) to Washington, DC (IAD). Our departure dates spanned January 2003 with the return flight 7 days after departure. For each departure date, we began collecting pricing data 21 days in advance at three-hour intervals; data for each departure date was collected 8 times a day (footnote 2). Overall, we collected over 12,000 fare observations over a 41 day period for six different airlines including American, United, etc. We used three-hour intervals to limit the number of http requests to the web site. For each flight, we recorded the lowest fare available for an economy ticket. We also recorded when economy tickets were no longer available; we refer to such flights as sold out.
Footnote 1: www.fetch.com
Footnote 2: We expected to record 168 (21 * 8) price observations for each flight. In fact, we found that on average each flight was missing 25 observations due to problems during data collection including remote server failures, site changes, wrapper bugs, etc.
2.1 Pricing Behavior in Our Data
We found that the price of tickets on a particular flight can change as often as seven times in a single day. We categorize price changes into two types: dependent price changes and independent price changes. Dependent changes occur when prices of similar flights (i.e., having the same origin and destination) from the same airline change at the same time. This type of change can happen as often as once or twice a day when airlines adjust their prices to maximize their overall revenue or "yield". Independent changes occur when the price of a particular flight changes independently of similar flights from the same airline. We speculate that this type of change results from the change in the seat availability of the particular flight. Table 1 shows the average number of changes per flight aggregated over all airlines for each route. Overall, 762 price changes occurred across all the flights in our data. 63% of the changes can be classified as dependent changes based on the behavior of other flights by the same airline.
Table 1: Average number of price changes per route.
We found that the round-trip ticket price for flights can vary significantly over time. Table 2 shows the minimum price, maximum price, and the maximum difference in prices that can occur for flights on each route.
Table 2: Minimum price, maximum price, and maximum change in ticket price per route. All prices in this paper refer to the lowest economy airfare available for purchase.
For many flights there are easily discernible price tiers where ticket prices fall into a relatively small price range. The number of tiers typically varies from two to four, depending on the airline and the particular flight. Even flights from the same airline with the same schedule (but with different departure dates) can have different numbers of tiers. For example, there are two price tiers for the flight in Figure 1, four price tiers in Figure 4 and three price tiers in Figure 2 and Figure 3.
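The distinction between dependent and independent price changes described in Section 2.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `classify_changes` and the tuple layout are hypothetical, and "change at the same time" is taken here to mean that the changes were detected at the same observation timestamp, which is a simplification of whatever criterion was actually applied to the data.

```python
from collections import defaultdict
from typing import Hashable, List, Tuple

# One observed price change: (airline, route, flight_id, observation_time).
Change = Tuple[str, str, str, Hashable]


def classify_changes(changes: List[Change]) -> List[str]:
    """Label each change 'dependent' or 'independent'.

    A change is labeled dependent when at least one other flight of the same
    airline on the same route also changed price at the same observation
    time (an assumed reading of "at the same time").
    """
    flights_changed = defaultdict(set)
    for airline, route, flight_id, when in changes:
        flights_changed[(airline, route, when)].add(flight_id)
    return [
        "dependent" if len(flights_changed[(airline, route, when)]) > 1
        else "independent"
        for airline, route, _flight_id, when in changes
    ]
```

The fraction of labels equal to 'dependent' over a full data set would then be comparable in spirit to the 63% figure reported above, subject to the simplifying assumption about timestamps.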
12/18/200212(23/2002 12/28/2002 1/2/2003 1/7/2003 1/12/20O3 1/17/2003 12/8/2002 1213/2002 1218/2002 12/23/2002 12/28/2002 1/2/2003 1/7/2003 Date
Figure 1: Price change over time for United AirFigure 4: Price change over time for Alaska Airlines lines roundtrip flight#168:169 LAX-BOS departing roundtrip flight#6:3, SEA-IAD departing on Jan 4. on Jan 12. This figure is an example of two price This figure shows an example of four price tiers. tiers and how consumers might benefit from the price drop. need to submit the change to the Airline Tariff Publishing Company (ATP CO),'' the organization formed by leading airlines around the world that collects and distributes airline
2250 pricing data. The whole process of detecting competitors' fare changes, deciding whether or not to match competitors'
1750 prices, and submitting the price update at ATPCO can take S up to one day [19]. 1250 Pπce changes appear to be fairly orderly on some flights (e.g., Figure 3), and we see evidence of the well-known 7 and 14 day "advance purchase
' fares. However, we also see plenty of surprising price changes For example, flights
12/8/2002 12/132002 12/18/2002 12/23/2002 12/28/2002 1/2/2003 1/7/2003 that depart around holidays appear to fluctuate more (e.g., Figure 2. Figure 2 and Figure 3 show how pricing strategies differ between two flights from American Airlines that have the same schedule but fly on different dates. Figure 2 shows
Figure 2: Price change over time for American Aira flight that departs around the new year, while Figure 3 lines roundtrip flight#192:223, LAX-BOS departing shows the flight that departs one week after the first flight. on Jan 2. This figure shows an example of rapid Both flights have the tier structure that we described earlier price fluctuation in the days priori to the New Year. in this section, but ticket prices in the first flight fluctuate more often.
In terms of pricing strategy, we can divide the airlines
Price matching plays an important role in airline pricing into two categories. The first category covers airlines that structure. Airlines use sophisticated software to track their are big players in the industry, such as United Airlines, and competitors' pricing history and propose adjustments that American Airlines. The second category covers smaller airoptimize their overall revenue. To change the price, airlines lines that concentrate on selling low-price tickets, such as Air Trans and Southwest. We have found that pricing policies tend to be similar for airlines that belong to the same category. Fares for airlines in the first category are expensive and fluctuate often, while fares for airlines in the second category are moderate and appear relatively stable. However, there are some policies that every airline seems to use. For example, airlines usually increase ticket prices two weeks before departure dates and ticket prices are at a maximum on departure dates.
3. RELATED WORK
12/28/2002 1/2/2003 1/7/2003 Previous work in the Al community on the problem of Date predicting product prices over time has been limited to the Trading Agent Competition (TAC) [27]. In 2002, TAC focused on the travel domain. TAC relies on a simulator of
Figure 3: Price change over time for American Airairline, hotel, and ticket prices and the competitors build lines roundtrip flight#192:223, LAX-BOS departing agents to bid on these. The problem is different from ours on Jan 7. This figure shows an example of three since the competition works as an auction (similar to Price- price tiers and low price fluctuation lsee http ://www. atpco .net.
line.com). Whereas we gathered actual flight price data from Comparison shopping "bots" gather price data available the web, TAC simulates flight prices using a stochastic proon the web for a wide range of products.4 These are decess that follows a random walk with an increasingly upward scendants of the Shopbot [11] which automatically learned bias. Also, the TAC auction of airline tickets assumes that to extract product and price information from online merthe supply of airline tickets is unlimited. Several TAC comchants' web sites. None of these services attempts to analyze petitors have explored a range of methods for price predicand predict the behavior of product prices over time. Thus, tion including historical averaging, neural nets, and boostthe data mining methods in this paper complement the body ing. It is difficult to know how these methods would perform of work on shopbots. if reconfigured for our price mining task.
There has been some recent interest in temporal data mining (see [23] for a survey). However, the problems studied 4. DATA MINING METHODS under this heading are often quite different from our own In this section we explain how we generated training data, (e.g., [1]). There has also been algorithmic work on time seand then describe the various data mining methods we inries methods within the data mining community (e.g., [4]). vestigated: Ripper [8], Q-learning [25], and time series [13, We discuss time series methods below. 9]. We then explain how our data mining algorithm, HAM¬
Problems that are closely related to price prediction over LET, combines the results of these methods using a variant time have been studied in statistics under the heading of of stacked generalization [26, 28]. "time series analysis" [7, 13, 9] and in computational fiOur data consists of price observations recorded every 3 nance [20, 22, 21] under the heading of "optimal stopping hours over a 41 day period. Our goal is to learn whether to problems". However, these techniques have not been used buy a ticket or wait at a particular time point, for a particuto predict price changes for consumer goods based on data lar flight, given the price history that we have recorded. All available over the web. Moreover, we combine these techof our experiments enforce the following essential temporal niques with rule learning techniques to improve their perconstraint: all the information used to make a decision at formance. particular time, point was recorded before that time point.
Computational finance is concerned with predicting prices In this way, we ensure that we rely on the past to predict and making buying decisions in markets for stock, options, the future, but not vice versa. and commodities. Prices in such markets are not determined 4.1 Rule Learning by a hidden algorithm, as in the product pricing case, but rather by supply and demand as determined by the actions Our first step was to run the popular Ripper rule learning of a large number of buyers and sellers. Thus, for example, system [8] on our training data. Ripper is an efficient sepstock prices tend to move in small incremental steps rather arate and conquer rule learner. We represented each price than in the large, tiered jumps observed in the airline data. observation to Ripper as a vector of the following features:
Nevertheless, there are well known problems in options • Flight number. trading that are related to ours. First, there is the early exercise of American Calls on stocks that pay dividends. The • Number of hours until departure (denoted as hours- second problem is the exercise of American Puts on stocks before-takeoff). that don't pay dividends. These problems are described in sections 11.12 and 7.6 respectively of [14]. In both cases, • Current price. there may be a time before the expiration of an option at
• Airline. which its exercise is optimal. Reinforcement learning methods have been applied to both problems, and that is one • Route (LAX-BOS or SEA-IAD). reason we consider reinforcement learning for our problem.
Time series analysis is a large body of statistical techThe class labels on each training instance were 'buy' or niques that apply to a sequence of values of a variable that 'wait'. varies over time due to some underlying process or structure We considered a host of additional features derived from [7, 13, 9]. The observations of product prices over time are the data, but they did not improve Ripper's performance. naturally viewed as time series data. Standard data mining We did not represent key variables like the number of unsold techniques are "trained" on a set of data to produce a preseats on a flight, whether an airline is running a promotion, dictive model based on that data, which is then tested on a or seasonal variables because HAMLET did not have access separate set of test data. In contrast, time series techniques to this information. However, see Section 6 for a discussion would attempt to predict the value of a variable based on of how HAMLET might be able to obtain this information in its own history. For example, our moving average model atthe future. tempts to predict the future changes in the price of a ticket Some sample rules generated by Ripper are shown in Figon a flight from that flight's own price history. ure 5.
There is also significant interest in bidding and pricing In our domain, classification accuracy is not the best metstrategies for online auctions. For example, in [24] Harshit et ric to optimize because the cost of misclassified examples al. use cluster analysis techniques to categorize the bidding is highly variable. For example, misclassifying a single exstrategies being used by the bidders. And in [17], Lncking- ample can cost from nothing to upwards of $2,000. Meta- Reiley et al. explore the various factors that determine the Cost [10] is a well-known general method for training cost- final price paid in an online auction, such as the length of sensitive classifiers. In our domain, MetaCost will make a the auction, whether there is a reserve price, and the reputalearned classifier either more conservative or more aggrestion of the seller. However, these techniques are not readily sive about waiting for a better price, depending on the cost applicable to our price mining problem.
See, for example, froogle .google . com and mysimon. com.
IF hours-before-takeoff >= 252 AND price >= 2223 class and then use the learned model to generate predictions AND route = LAX-BOS THEN wait for other states in the class.
To define our equivalence class we need to introduce some
IF airline = United AND price >= 360 notation. Airlines typically use the same flight number (e.g., AND hours-before-takeoff >= 438 THEN wait UA 168) to refer to multiple flights with the same route that depart at the same time on different dates. Thus,
Figure 5: Sample Ripper rules. United flight 168 departs once daily from LAX to Boston at 10:15pm. We refer to a particular flight by a combination of its flight number and date. For example, UA168-Jan7 refers of misclassifying a 'buy' as a 'wait' compared with the cost to flight 168 which departs on January 7th, 2003. Since we of misclassifying a 'wait' as a 'buy'. We implemented Meta- observe the price of each flight eight times in every 24 hour Cost with mixed results. period, there are many price observations for each flight. We
We found that MetaCost improves Ripper's performance distinguish among them by recording the time (number of by 14 percent, but that MetaCost hurts HAMLET'S overall hours) until the flight departs. Thus, UA168-Jan7-120 is the performance by 29 percent. As a result, we did not use price observation for flight UA168, departing on January 7, MetaCost in HAMLET. which was recorded on January 2nd (120 hours before the
4.2 Q-learning flight departs on the 7th). Our equivalence class is the set of states with the same flight number and the same hours be¬
As our next step we considered Q-learning, a species of fore takeoff, but different departure dates. Thus, the states reinforcement learning [25]. Reinforcement learning seems denoted UA168-Jan7-120 and UA168-JanlO-120 are in the like a natural fit because after making each new price obsame equivalence class, but the state UA168-Jan7-117 is not. servation HAMLET has to decide whether to buy or to wait. We denote that s and s* are in the same equivalence class Yet the reward (or penalty) associated with the decision is by s ~ s* . only determined later, when HAMLET determines whether it Thus, our revised Q-learning formula is: saved or lost money through its buying policy. .Reinforcement learning is also a popular technique in computational Q(a, s) = Aυgs*~_ (R(s*, a) + mnxaι (Q(a', s'))) finance [20, 22, 21].
The standard Q-learning formula is: The reason for choosing -300,000 is now more apparent: the large penalty can tilt the average toward a low value,
Q{u, s) = R{s, a) + max
a,
s )) even when many Q values are being averaged together. Suppose, for example, that there are ten training examples in
Here, R(s, a) is the immediate reward, 7 is the discount the same equivalence class, and each has a current price of factor for future rewards, and s' is the state resulting from $2,500. Suppose now that in nine of the ten examples the taking action a in state s. We use the notion of state to price drops to $2,000 at some point in the future, but the model the state of the world after each price observation flight in the tenth example sells out in the next state. The Q (represented by the price, flight number, departure date, value for waiting in any state in this equivalence class will be and number of hours prior to takeoff). Thus, there are two (-300,000-2, 000*9)/10 = -31, 800, or still much less then possible actions in each state: b for 'buy' and w for 'wait'. the Q value for any equivalence class where no flight sells
Of course, the particular reward function used is critical to the success (or failure) of Q-learning. In our study, the reward associated with b is the negative of the ticket price at that state, and the state resulting from b is a terminal state, so there is no future reward. The immediate reward associated with w is zero as long as economy tickets on the flight do not sell out in the next time step. We set $\gamma = 1$, so we do not discount future rewards.

To discourage the algorithm from learning a model that waits until flights sell out, we introduce a "penalty" for such flights in the reward function. Specifically, in the case where the flight does sell out at the next time point, we make the immediate reward for waiting a negative constant whose absolute value is substantially greater than the price for any flight. We set the reward for reaching a sold-out state to be -300,000. This setting can best be explained below, after we introduce a notion of equivalence classes among states.

In short, we define the Q function by

$$Q(b, s) = -\mathrm{price}(s)$$

$$Q(w, s) = \begin{cases} -300000 & \text{if flight sells out after } s, \\ \max\left(Q(b, s'), Q(w, s')\right) & \text{otherwise.} \end{cases}$$

To generalize from the training data we used a variant of the averaging step described in [18]. More specifically, we defined an equivalence class over states, which enabled the algorithm to train on a limited set of observations of the [...]

[...] out in the next state. Thus the choice of reward for a flight that sells out will determine how willing the Q-learning algorithm will be to risk waiting when there's a chance a flight may sell out. Using a hill climbing search in the space of penalties, we found -300,000 to be locally optimal.

Q-learning can be very slow, but we were able to exploit the structure of the problem and the close relationship between dynamic programming and reinforcement learning (see [25]) to complete the learning in one pass over the training set. Specifically, the reinforcement learning problem we face has a particularly nice structure, in which the value of Q(b, s) depends only on the price in state s, and the value of Q(w, s) depends only on the Q values of exactly one other state: the state containing the same flight number and departure date but with three hours less time left until departure. Applying dynamic programming is thus straightforward, and the initial training step requires only a single pass over the data. In order to compute averages over states in the same equivalence class, we keep a running total and a count of the Q values in each equivalence class. Thus, the reinforcement learning algorithm just makes a single pass over the training data, which bodes well for scaling the algorithm to much larger data sets.

The output of Q-learning is the learned policy, which determines whether to buy or wait in unseen states by mapping them to the appropriate equivalence class and choosing the action with the lowest learned cost.

Let TS be the output of the Time Series algorithm, and let QL be the output of Q-learning (this notation is used in the sample rule of Figure 6).
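The one-pass dynamic programming computation for Q-learning described in Section 4.2 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names and the input layout are hypothetical. Two further assumptions are made here: the equivalence class of a state is taken to be its (flight number, hours before takeoff) pair, and waiting in the last observed state of a flight that does not sell out is treated as no better than buying there.

```python
from collections import defaultdict

# Reward for waiting into a sold-out state (value taken from the text).
SOLD_OUT_PENALTY = -300_000


def q_values_for_flight(prices, sells_out):
    """Return [(Q(b, s), Q(w, s)), ...] for one flight instance.

    prices: economy prices ordered from earliest to latest observation.
    sells_out: True if economy seats sell out right after the last price.
    """
    q = [None] * len(prices)
    last_buy = -prices[-1]
    # Assumption: with no successor state, waiting is worth the same as buying,
    # unless the flight sells out, in which case the penalty applies.
    last_wait = SOLD_OUT_PENALTY if sells_out else last_buy
    q[-1] = (last_buy, last_wait)
    # Backward pass: Q(w, s) depends only on the single successor state s'.
    for i in range(len(prices) - 2, -1, -1):
        q[i] = (-prices[i], max(q[i + 1]))
    return q


def train(flights):
    """Average Q values per equivalence class with running totals and counts.

    flights: iterable of dicts with the hypothetical keys
    "flight_number", "hours" (hours before takeoff), "prices", "sells_out".
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for f in flights:
        qs = q_values_for_flight(f["prices"], f["sells_out"])
        for hours, (q_buy, q_wait) in zip(f["hours"], qs):
            acc = totals[(f["flight_number"], hours)]
            acc[0] += q_buy
            acc[1] += q_wait
            acc[2] += 1
    return {k: (tb / c, tw / c) for k, (tb, tw, c) in totals.items()}


def policy(avg_q, flight_number, hours):
    """Pick the action with the higher averaged Q value (ties go to buy)."""
    q_buy, q_wait = avg_q[(flight_number, hours)]
    return "buy" if q_buy >= q_wait else "wait"
```

Because Q is the negative of cost, picking the higher averaged Q value here corresponds to choosing the action with the lowest learned cost. A single sold-out instance in a class pulls the averaged Q(w, s) sharply downward, which is the effect the large penalty is meant to have.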
4.3 Time Series
Time series analysis is a large and diverse subfield of statistics whose goal is to detect and predict trends. In this paper, we investigated a first order moving average model. At time step t, the model predicts the price one step into the future, $p_{t+1}$, based on a weighted average of prices already seen. Thus, whereas Q-learning and Ripper attempt to generalize from the behavior of a set of flights in the training data to the behavior of future flights, the moving average model attempts to predict the price behavior of a flight in the test data based on its own history.

At time t, we predict the next price using a fixed window of price observations, $p_{t-k+1}, \ldots, p_t$. (In HAMLET, we found that setting k to one week's worth of price observations was locally optimal.) We take a weighted average of these prices, weighting the more recent prices more and more heavily. Formally, we predict that $p_{t+1}$ will be

$$\frac{\sum_{i=1}^{k} a(i)\, p_{t-k+i}}{\sum_{i=1}^{k} a(i)}$$

where a(i) is some increasing function of i. We experimented with different functions and chose a simple linearly increasing function.

Given the time series prediction, HAMLET relies on the following simple decision rule: if the model predicts that $p_{t+1} > p_t$, then buy, otherwise wait. Thus, our time series model makes its decisions based on a one-step prediction of the ticket price change. The decision rule ignores the magnitude of the difference between $p_{t+1}$ and $p_t$, which is overly simplistic, and indeed the time series prediction does not do very well on its own (see Table 3). However, HAMLET uses the time series predictions extensively in its rules. In effect, the time series prediction provides information about how the current price compares to a local average, and that turns out to be valuable information for HAMLET.

4.4 Stacked Generalization

Ensemble-based learning techniques such as bagging [5], boosting [12], and stacking [26, 28], which combine the results of multiple generalizers, have been shown to improve generalizer accuracy on many data sets. In our study, we investigated multiple data mining methods with very different characteristics (Ripper, Q-learning, and time series) so it makes sense to combine their outputs.

We preferred stacking to voting algorithms such as weighted majority [16] or bagging [5] because we believed that there were identifiable conditions under which one method's model would be more successful than another. See, for example, the sample rule in Figure 6.

IF hours-before-takeoff >= 480 AND airline = United
AND price >= 360 AND TS = buy AND QL = wait
THEN wait

Figure 6: A sample rule generated by Hamlet.

Standard stacking methods separate the original vector representation of training examples (level-0 data in Wolpert's terminology), and use the class labels from each level-0 generalizer, along with the example's true classification, as input to a meta-level (or level-1) generalizer. To avoid over-fitting, "care is taken to ensure that the models are formed from a batch of training data that does not include the instance in question" [26].

In our implementation of stacking, we collapsed level-0 and level-1 features. Specifically, we used the feature representation described in Section 4.1 but added three additional features corresponding to the class labels (buy or wait) computed for each training example by our level-0 generalizers. To add our three level-1 features to the data, we applied the model produced by each base-level generalizer (Ripper, Q-learning, and time series) to each instance in the training data and labeled it with 'buy' or 'wait'. Thus, we added features of the form TS = buy (time series says to buy) and QL = wait (Q-learning says to wait).

We then used Ripper as our level-1 generalizer, running it over this augmented training data. We omitted leave-one-out cross validation because of the temporal nature of our data. Although a form of cross validation is possible on temporal data, it was not necessary because none of our base learners appeared to overfit the training data.

Our stacked generalizer was our most successful data mining method, as shown in Table 3, and we refer to it as HAMLET.

4.5 Hand-Crafted Rule

After we studied the data in depth and consulted with travel agents, we were able to come up with a fairly simple policy "by hand". We describe it below, and include it in our results as a baseline for comparison with the more complex models produced by our data mining algorithms.

The intuition underlying the hand-crafted rules is as follows. First, to avoid sell outs we do not want to wait too long. By inspection of the data, we decided to buy if the price has not dropped within 7 days of the departure date. We can compute an expectation for the lowest price of the flight in the future based on similar flights in the training data.⁵ If the current price is higher than the expected minimum then it is best to wait. Otherwise, we buy.

⁵ For "similar" flights we used flights with the same airline and flight number.

More formally, let MinPrice(s, t) of a flight in the training set denote the minimum price of that flight over the interval starting from s days before departure up until time t (or until the flight sells out). Let ExpPrice(s, t) for a particular flight number denote the average over all MinPrice(s, t) for flights in the training set with that flight number. Suppose a passenger asks at time $t_0$ to buy a ticket that leaves in $s_0$ days, and whose current price is CurPrice. The hand-crafted rule is shown in Figure 7.

IF ExpPrice(s_0, t_0) < CurPrice
AND s_0 > 7 days THEN wait
ELSE buy

Figure 7: Hand-crafted rule for deciding whether to wait or buy.

We also considered simpler decision rules of the form "if the current time is less than K days before the flight's departure then buy." In our simulation (described below) we
tested such rules for K ranging from 1 to 22, but none of these rules resulted in savings and some resulted in substantial losses.

5. EXPERIMENTAL RESULTS

In this section we describe the simulation we used to assess the savings due to each of the data mining methods described earlier. We then compare the methods in Table 3, perform a sensitivity analysis of the comparison along several dimensions, and consider the implications of our pilot study.

5.1 Ticket Purchasing Simulation

The most natural way to assess the quality of the predictive models generated by the data mining methods described in Section 4 is to quantify the savings that each model would generate for a population of passengers. For us, a passenger is a person wanting to buy a ticket on a particular flight at a particular date and time. It is easy to imagine that an online travel agent such as Expedia or Travelocity could offer discounted fares to passengers on its web site, and use HAMLET to appropriately time ticket purchases behind the scenes. For example, if HAMLET anticipates that a fare will drop by $500, the agent could offer a $300 discount and keep $200 as compensation and to offset losses due to prediction errors by HAMLET.

Since HAMLET is not yet ready for use by real passengers, we simulated passengers by generating a uniform distribution of passengers wanting to purchase tickets on various flights as a function of time. Specifically, the simulation generated one passenger for each fare observation in our set of test data. The total number of passengers was 4,488. Thus, each simulated passenger has a particular flight for which they need to buy a ticket and an earliest time point at which they could purchase that ticket (called the "earliest purchase point"). The earliest purchase points, for different simulated passengers, varied from 21 days before the flight to the day of the flight.

At each subsequent time point, HAMLET decides whether to buy a ticket immediately or to wait. This process continues until either the passenger buys a ticket or economy seats on the flight sell out, in which case HAMLET will buy a higher priced business-class ticket for the flight.⁶ We defined upgrade costs as the difference between the cost of a business class ticket and the cost of an economy ticket at the earliest purchase point. In our simulation, HAMLET was forced to "upgrade" passengers to business class only 0.42% of the time, but the total cost of these upgrades was quite high ($38,743 in Table 3).⁷

We recorded for each simulated passenger, and for each predictive model considered, the price of the ticket purchased and the optimal price for that passenger given their earliest time point and the subsequent price behavior for that flight. The savings (or loss) that a predictive model yields for a simulated passenger is the difference between the price of a ticket at the earliest purchase point and the price of the ticket at the point when the predictive model recommends buying. Net savings is savings net of both losses and upgrade costs.

⁶ It's possible, of course, for business class to sell out as well, in which case HAMLET would have to buy a first-class ticket or re-book the passenger on a different flight. However, business class did not sell out in our simulation.

⁷ Since we did not collect upgrade costs for all flights, our upgrade costs are approximate but always positive and often as high as $1,000 or more.

5.2 Savings

Table 3 shows the savings, losses, upgrade costs, and net savings achieved in our simulation by each predictive model we generated. We also report on the frequency of upgrades as a percentage of the total passenger population, the net savings as a percent of the total ticket price, and the performance of each model as a percent of the maximal possible savings.

The models we used are the following:

• Optimal: This model represents the maximal possible savings, which are computed by a "clairvoyant" algorithm with perfect information about future prices, and which obtained the best possible purchase price for each passenger.

• By hand: This model was hand-crafted by one of the authors after consulting with travel agents and thoroughly analyzing our training data (see Figure 7).

• Time series: This model was generated by the moving average method described earlier.

• Ripper: This model was generated by Ripper.

• Q-learning: This model was generated by our Q-learning method.

• Hamlet: This model was generated by our stacking generalizer which combined the results of Ripper, Q-learning, and Time series.

Table 3 shows a comparison of the different methods. Note that the savings measure we focus on is savings net of losses and upgrade costs. We see that HAMLET outperformed each of the learning methods as well as the hand-crafted model to achieve a net savings of $198,074. Furthermore, despite the fact that HAMLET had access to a very limited price history and no information about the number of unsold seats on the flight, its net savings were a remarkable 61.8% of optimal. Finally, while an average net savings of 4.4% may not seem like much, passengers spend billions of dollars on air travel each year so 4.4% amounts to a substantial number of dollars.

We believe that our simulation understates the savings that HAMLET would achieve in practice. For close to 75% of the passengers in our test set, savings were not possible because prices never dropped from the earliest purchase point until the flight departed. We report the percent savings in ticket prices over the set of flights where savings were possible ("feasible flights") in Table 4. These savings figures are of interest because of the unrealistic distribution of passengers in our simulation. Because we only gathered data for 21 days before each flight in our test set, passengers "arrived" at most 21 days before a flight. Furthermore, due to the uniform distribution of passengers, 33% of the passengers arrived at most 7 days before the flight's departure, when savings are hard to come by. In fact, on our test data, HAMLET lost money for passengers who "arrived" in the last 7 days prior to the flight. We believe that in practice we would find additional opportunities to save money for the bulk of passengers who buy their tickets more than 7 days before the flight date.
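The per-passenger accounting of Section 5.1 (savings, loss, upgrade cost, and net savings) can be illustrated with a small Python sketch. It is not the simulator used in the study. The function names, the `business_price` input, and the rule that a passenger who never receives a buy recommendation pays the last observed price are all assumptions made for illustration. The example policy is the moving-average decision rule of Section 4.3 with linearly increasing weights a(i) = i; the window size of 3 is chosen only to keep the example small.

```python
def predict_next_price(window):
    """Weighted moving average with linearly increasing weights a(i) = i."""
    weights = range(1, len(window) + 1)
    return sum(w * p for w, p in zip(weights, window)) / sum(weights)


def time_series_policy(history, k=3):
    """Buy if the predicted next price exceeds the current price, else wait."""
    window = history[-k:]
    return "buy" if predict_next_price(window) > history[-1] else "wait"


def simulate_passenger(prices, arrival, policy, sells_out=False, business_price=None):
    """Replay one simulated passenger and return (savings, loss, upgrade_cost).

    prices: economy prices at successive time points for one flight.
    arrival: index of the passenger's earliest purchase point.
    policy: function mapping the price history so far to "buy" or "wait".
    sells_out: True if economy sells out after the last observed price.
    business_price: hypothetical price paid when an upgrade is forced.
    """
    baseline = prices[arrival]
    for t in range(arrival, len(prices)):
        if policy(prices[: t + 1]) == "buy":
            diff = baseline - prices[t]
            return (max(diff, 0), max(-diff, 0), 0)
    if sells_out:
        # Upgrade cost: business price minus economy price at the earliest
        # purchase point, as defined in the text.
        return (0, 0, business_price - baseline)
    # Assumption: a passenger who never bought pays the last observed price.
    diff = baseline - prices[-1]
    return (max(diff, 0), max(-diff, 0), 0)


def net_savings(results):
    """Savings net of both losses and upgrade costs over all passengers."""
    return sum(s - l - u for s, l, u in results)
```

For instance, `simulate_passenger([500, 450, 400, 420], 0, time_series_policy)` has the policy wait at the first time point and buy at the second, so the passenger pays 450 against a baseline of 500. Swapping in a different `policy` function is enough to compare methods under the same accounting.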
Table 3: Savings by Method.
Table 4: Comparison of Net Savings (as a percent of total ticket price) on Feasible Flights.

5.3 Sensitivity Analysis

To test the robustness of our results to changes in our simulation, we varied two key parameters. First, we changed the distribution of passengers requesting flight tickets. Second, we changed the model of a passenger from one where a passenger wants to purchase a ticket on a particular flight to one where a passenger wants to fly at any time during a three hour interval. The interval model is similar to the interface offered at many travel web sites where a potential buyer specifies if they want to fly in the morning, afternoon, or evening.

We used the following distributions to model the earliest purchase point (i.e., the first time point at which passengers "arrive" and need to decide whether to buy a ticket or to wait):

• Uniform: a uniform distribution of simulated passengers over the 21 days before the flight's departure date;

• Linear Decrease: a distribution in which the number of passengers arriving at the system decreased linearly as the amount of time left before departure decreased;

• Quadratic Decrease: a distribution like Linear Decrease, but with a quadratic relationship;

• Square Root Decrease: a distribution like Linear Decrease, but with a square root relationship;

• Linear Increase: a distribution like Linear Decrease, except that the number of passengers increased as the amount of time left before departure decreased;

• Quadratic Increase: a distribution like Linear Increase, but with a quadratic relationship;

• Square Root Increase: a distribution like Linear Increase, but with a square root relationship.

Table 6 reports the net savings, as a percentage of the total ticket price, under the different distributions. HAMLET saved more than 2.5% of the ticket price in all cases, and it saved more than any other method on all distributions except the Quadratic Decrease distribution, where it performed slightly worse than the hand-crafted decision rule. HAMLET's savings were above 38% of optimal in all cases.

Table 6: Sensitivity of Methods to Distribution of Passengers' Earliest Purchase Points. The numbers reported are the savings, as a percentage of total ticket price, achieved by each algorithm under each distribution. We see that Hamlet outperforms Q-learning, time series, and Ripper on all distributions.

Table 5 reports on the performance of the different methods under the modified model where a passenger requests a ticket on a non-stop flight that departs at any time during a particular three hour interval (e.g., morning). This different model does not change our results qualitatively. HAMLET still achieves a substantial percentage of the optimal savings (59.2%) and its percentage of upgrades drops to only 0.1%. Finally, HAMLET still substantially outperforms the other data mining methods.

Table 5: Performance of algorithms on multiple flights over three hour interval.

Overall, our analysis confirms that HAMLET's performance on the test data is robust to the parameters we varied.

6. FUTURE WORK

There are several promising directions for future work on price mining. We plan to perform a more comprehensive study on airline pricing with data collected over a longer period of time and over more routes. We plan to include multi-leg flights in this new data set. The pricing behavior of multi-leg flights is different than that of non-stop flights because each leg in the flight can cause a change in the price, and because pricing through airline hubs appears to behave differently as well.

We also plan to exploit other sources of information to further improve HAMLET's predictions. We do not currently have access to a key variable: the number of unsold seats on a flight. However, on-line travel agents and centralized reservation systems such as Sabre or Galileo do have this information. If we had access to the number of unsold seats on a flight, HAMLET could all but eliminate the need to upgrade passengers, which is a major cost.

To use the methods in this paper on the full set of domestic and international flights on any given day would require collecting vast amounts of data. One possible way to address this problem is to build agents on demand that collect the required data to make price predictions for a particular future flight on a particular day. The agents would still need
to collect data for multiple flights, but the amount of data would be much smaller. This type of agent would fit well within the Electric Elves system [6, 2], which deploys a set of personalized agents to monitor various aspects of a trip. For example, Elves can notify you if your flight is delayed or canceled or let you know if there is an earlier connecting flight to your destination.

Beyond airline pricing, we believe that the techniques described in this paper will apply to other product categories. In the travel industry, hotels and car rental agencies employ many of the same pricing strategies as the airlines and it would be interesting to see how much HAMLET can save in these product categories. Similarly, online shopping sites such as Amazon and Wal-mart are beginning to explore more sophisticated pricing strategies and HAMLET will allow consumers to make more informed decisions. Finally, reverse auction sites, such as half.com, also provide an opportunity for HAMLET to learn about pricing over time and make recommendations about purchasing an item right away or waiting to buy it. In general, price mining over time provides a new dimension for comparison shopping engines to exploit.

We recognize that if a progeny of HAMLET were to achieve widespread use it could start to impact the airlines' (already slim) profit margins. Could the airlines introduce noise into their pricing patterns in an attempt to fool a price miner? While we have not studied this question in depth, the obvious problem is that changing fares on a flight in order to fool a price miner would impact all consumers considering buying tickets on that flight. If the price of a ticket moves up substantially, then consumers are likely to buy tickets on different flights, resulting in a revenue loss for the airline. Similarly, if the price moves down substantially, consumers will be buying tickets at a discount, resulting in a revenue loss again. Thus, to avoid these distortions, the airlines are forced to show the prices that they actually want to charge for tickets. Of course, there are more prosaic methods of trying to block a price miner such as placing prices inside GIF files or blocking the IP address of the price miner. However, an "industrial strength" price miner would not rely on "scraping" information from web sites, but would access a fare database directly.

7. CONCLUSION

This paper reported on a pilot study in "price mining" over the web. We gathered airfare data from the web and showed that it is feasible to predict price changes for flights based on historical fare data. Despite the complex algorithms used by the airlines, and the absence of information on key variables such as the number of seats available on a flight, our data mining algorithms performed surprisingly well. Most notably, our HAMLET data mining method achieved 61.8% of the possible savings by appropriately timing ticket purchases.

Our algorithms were drawn from statistics (time series methods), computational finance (reinforcement learning) and classical machine learning (Ripper rule learning). Each algorithm was tailored to the problem at hand (e.g., we devised an appropriate reward function for reinforcement learning), and the algorithms were combined using a variant of stacking to improve their predictive accuracy.

Additional experiments on larger airfare data sets and in other domains (e.g., hotels, reverse auctions) are essential, but this initial pilot study provides the first demonstration of the potential of price mining algorithms to save consumers substantial amounts of money using data available on the Internet. We believe that price mining of this sort is a fertile area for future research.

8. ACKNOWLEDGMENTS

We thank Haym Hirsh, John Moody, and Pedro Domingos for helpful suggestions. This paper is based upon work supported in part by the Air Force Office of Scientific Research under grant number F49620-01-1-0053 to USC. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of any of the above organizations or any person connected with them.

9. REFERENCES

[1] R. Agrawal and R. Srikant. Mining sequential patterns. In P. S. Yu and A. S. P. Chen, editors, Eleventh International Conference on Data Engineering, pages 3-14, Taipei, Taiwan, 1995. IEEE Computer Society Press.
[2] J. L. Ambite, G. Barish, C. A. Knoblock, M. Muslea, J. Oh, and S. Minton. Getting from here to there: Interactive planning and agent execution for optimizing travel. In Proceedings of the Fourteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-2002), pages 862-869, AAAI Press, Menlo Park, CA, 2002.
[3] G. Barish and C. A. Knoblock. An efficient and expressive language for information gathering on the web. In Proceedings of the AIPS-2002 Workshop on Is there life after operator sequencing? - Exploring real world planning, pages 5-12, Toulouse, France, 2002.
[4] D. Berndt and J. Clifford. Finding patterns in time series: a dynamic programming approach. In U. Fayyad, G. Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining. AAAI Press, 1996.
[5] L. Breiman. Bagging predictors. Machine Learning, 24:123-140, 1996.
[6] H. Chalupsky, Y. Gil, C. A. Knoblock, K. Lerman, J. Oh, D. V. Pynadath, T. A. Russ, and M. Tambe. Electric elves: Applying agent technology to support human organizations. In Proceedings of the Conference on Innovative Applications of Artificial Intelligence, 2001.
[7] C. Chatfield. The Analysis of Time Series: An Introduction. Chapman and Hall, London, UK, 1989.
[8] W. W. Cohen. Fast effective rule induction. In A. Prieditis and S. Russell, editors, Proc. of the 12th International Conference on Machine Learning, pages 115-123, Tahoe City, CA, July 9-12, 1995. Morgan Kaufmann.
[9] F. Diebold. Elements of Forecasting. South-Western College Publishing, 2nd edition, 2000.
[10] P. Domingos. MetaCost: A general method for making classifiers cost-sensitive. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 155-164, San Diego, CA, 1999. ACM Press.
[11] R. Doorenbos, O. Etzioni, and D. Weld. A scalable comparison-shopping agent for the World-Wide Web. In Proc. First Intl. Conf. Autonomous Agents, pages 39-48, 1997.
[12] Y. Freund and R. E. Schapire. Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning, pages 148-156, Bari, Italy, 1996. Morgan Kaufmann.
[13] C. W. J. Granger. Forecasting in Business and Economics. Harcourt Brace, second edition, 1989.
[14] J. C. Hull. Options, Futures, and Other Derivatives. Prentice Hall College Div, 5th edition, 2002.
[15] C. A. Knoblock, K. Lerman, S. Minton, and I. Muslea. Accurately and reliably extracting data from the web: A machine learning approach. In P. S. Szczepaniak, J. Segovia, J. Kacprzyk, and L. A. Zadeh, editors, Intelligent Exploration of the Web, pages 275-287. Springer-Verlag, Berkeley, CA, 2003.
[16] N. Littlestone and M. K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212-261, February 1994.
[17] D. Lucking-Reiley, D. Bryan, N. Prasad, and D. Reeves. Pennies from ebay: The determinants of price in online auctions. Technical report, University of Arizona, 2000.
[18] S. Mahadevan. Average reward reinforcement learning: Foundations, algorithms, and empirical results. Machine Learning, 22(1-3):159-195, 1996.
[19] S. McCartney. Airlines Rely on Technology To Manipulate Fare Structure. Wall Street Journal, November 3 1997.
[20] J. Moody and M. Saffell. Reinforcement learning for trading systems and portfolios. In KDD, pages 279-283, 1998.
[21] J. Moody and M. Saffell. Minimizing downside risk via stochastic dynamic programming. In Y. S. Abu-Mostafa, B. LeBaron, A. W. Lo, and A. S. Weigend, editors, Computational Finance 1999, Cambridge, MA, 2000. MIT Press.
[22] J. Moody and M. Saffell. Learning to trade via direct reinforcement. In IEEE Transactions on Neural Networks, Vol. 12, No. 4, 2001.
[23] J. F. Roddick and M. Spiliopoulou. A bibliography of temporal, spatial and spatio-temporal data mining research. SIGKDD Explorations, 1(1):34-38, 1999.
[24] H. S. Shah, N. R. Joshi, A. Sureka, and P. R. Wurman. Mining for bidding strategies on ebay. In Lecture Notes in Artificial Intelligence. Springer-Verlag, 2003.
[25] R. S. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
[26] K. M. Ting and I. H. Witten. Issues in stacked generalization. Journal of Artificial Intelligence Research, 10:271-289, 1999.
[27] M. P. Wellman, D. M. Reeves, K. M. Lochner, and Y. Vorobeychik. Price prediction in a trading agent competition. Technical report, University of Michigan, 2002.
[28] D. Wolpert. Stacked generalization. Neural Networks, 5:241-259, 1992.