WO2010073106A1 - Dynamic customization of a virtual world - Google Patents

Dynamic customization of a virtual world

Info

Publication number
WO2010073106A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual world
user
sound
virtual
conversation
Application number
PCT/IB2009/007864
Other languages
French (fr)
Inventor
John H. Yoakum
Tony McCormack
Neil O'Connor
Original Assignee
Nortel Networks Limited
Application filed by Nortel Networks Limited
Priority to JP2011541636A (JP5748668B2)
Priority to EP09834194.4A (EP2380107A4)
Publication of WO2010073106A1

Classifications

    • A63F13/12
    • A63F13/63: Generating or modifying game content before or while executing the game program by the player, e.g. authoring using a level editor
    • A63F13/215: Input arrangements for video game devices characterised by their sensors, purposes or types comprising means for detecting acoustic signals, e.g. using a microphone
    • A63F13/30: Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
    • A63F13/79: Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
    • A63F13/87: Communicating with other players during game play, e.g. by e-mail or chat
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G10L15/08: Speech classification or search
    • A63F2300/1081: Input via voice recognition
    • A63F2300/572: Communication between players during game play of non game information, e.g. e-mail, chat, file transfer, streaming of audio and streaming of video
    • A63F2300/6009: Methods for processing data by generating or executing the game program for importing or creating game content, e.g. authoring tools during game development, adapting content to different platforms, use of a scripting language to create content
    • G06F3/16: Sound input; Sound output
    • G10L2015/025: Phonemes, fenemes or fenones being the recognition units
    • G10L2015/088: Word spotting

Definitions

  • This invention relates to virtual worlds, and in particular to dynamically customizing a virtual world based on a conversation occurring with respect to a location in the virtual world.
  • A virtual world is a computer-simulated environment in which humans typically participate via a computer-rendered entity referred to as an avatar.
  • Virtual worlds have long been associated with entertainment, and the success of several multiplayer online simulations, such as World of Warcraft and Second Life, is evidence of the popularity of virtual worlds.
  • The immersive qualities of virtual worlds are frequently cited as the basis of their popularity.
  • Commercial entities are beginning to explore using virtual worlds for marketing products or services to existing or potential customers.
  • Virtual worlds are typically used as a mechanism for enticing a potential customer to contact a representative of the company to discuss the company's products or services, or as a means of strengthening a brand through exposure to users of the virtual world - in essence, as a billboard.
  • Virtual worlds are increasingly providing voice communication capabilities among their users. Headphones with integrated microphones and speakers are commonplace among computer users today, and virtual worlds use voice-enabled communications to enable users participating in the virtual world to talk with one another. Some virtual worlds also provide rudimentary voice recognition interfaces, wherein a user of the virtual world can navigate within the virtual world by using a set of predefined commands.
  • Virtual worlds currently lack the ability to integrate voice with activity occurring in the virtual world in a way that is natural and that enhances a user's experience from a marketing perspective. Therefore, there is a need to combine a virtual world's immersive qualities with a user's voice communications to enable a virtual world to provide a customized experience based on a user's particular interests.
  • The present invention relates to a virtual world that is customized based on a conversation that is associated with the virtual world.
  • A first user of the virtual world and a second user engage in a conversation.
  • The conversation is monitored, and a sound is detected that matches a key sound.
  • A portion of the virtual world is then altered to include a virtual world customization based on the key sound.
  • The conversation may be monitored by analyzing a media stream including voice signals of the participants.
  • The media stream may be a single media stream carrying voice signals of both the first and second users, or may be multiple media streams wherein one media stream includes the voice signals of the first user and another media stream includes the voice signals of the second user.
  • The media streams may be analyzed by voice recognition software, such as a speech analytics algorithm, that is capable of real-time or near real-time analysis of conversations.
  • The second user may comprise an agent associated with an enterprise and be represented in the virtual world by an agent avatar.
  • The agent may be computer-controlled.
  • The agent may be a human associated with an enterprise depicted in the virtual world.
  • The agent may be one of many agents managing calls associated with the enterprise from a customer contact center.
  • The computer-controlled agent may be programmed to communicate in response to words spoken by the user.
  • The user may be represented in the virtual world by an avatar, and may interact with the virtual world with a user device that includes communication abilities and enables the user to engage in conversation with the agent with respect to a location in the virtual world.
  • The key sounds may include all or portions of words and phrases that are associated with products available for sale by the enterprise, and sounds that provide information about the participants of the conversation, including, for example, an emotional state of the user, a dialect associated with a user, an accent associated with a user, and the like.
  • A virtual world customization made in response to detecting a sound made by a participant that matches a key sound can include, but is not limited to, displaying a video or graphical image in the virtual world, playing an audio stream or other recording in the virtual world, reconfiguring an existing display of virtual objects from a first configuration to a second configuration in the virtual world, including additional virtual objects or removing existing virtual objects in the virtual world, introducing a virtual world environment not previously displayed in the virtual world, or any combination thereof.
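The kinds of customization listed above can be sketched as a small dispatch over a world state. This is a minimal illustration only; the class and function names, the `kind` strings, and the dictionary-based world state are assumptions for this sketch, not structures defined by the patent.

```python
# Hypothetical sketch of the customization kinds listed above: showing
# imagery, queuing audio, adding/removing virtual objects, or swapping in
# a new environment. All names here are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Customization:
    kind: str                        # e.g. "add_objects", "new_environment"
    assets: List[str] = field(default_factory=list)

def apply_customization(world_state: dict, c: Customization) -> dict:
    """Return a copy of the world state with one customization applied."""
    state = {k: list(v) if isinstance(v, list) else v
             for k, v in world_state.items()}
    if c.kind == "show_image":
        state.setdefault("images", []).extend(c.assets)
    elif c.kind == "play_audio":
        state.setdefault("audio_queue", []).extend(c.assets)
    elif c.kind == "add_objects":
        state.setdefault("objects", []).extend(c.assets)
    elif c.kind == "remove_objects":
        state["objects"] = [o for o in state.get("objects", [])
                            if o not in c.assets]
    elif c.kind == "new_environment":
        state["environment"] = c.assets[0] if c.assets else None
    return state

world = {"objects": ["shirt_rack", "pants_table"]}
world = apply_customization(world, Customization("add_objects", ["shoe_table"]))
world = apply_customization(world, Customization("new_environment",
                                                ["winter_street"]))
```

Combinations of customizations, as the text allows, would simply be successive calls to the same function.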
  • The particular virtual world customization included in the virtual world may be based on the particular key sound detected.
  • The virtual world customization may also be based on user information associated with the participant.
  • For example, the present invention may determine from user profile information that the user is a male residing in Reno, Nevada, and initially present a virtual world customization comprising a display of men's cowboy hats viewable by the user in the virtual world.
  • The virtual world customization may comprise altering the virtual world to include a virtual world environment that was absent from display in the virtual world prior to detection of the key sound, wherein the virtual world environment includes a display of a product associated with the key sound in an environment that shows a conventional use of the product by a purchaser of the product.
  • A user may indicate an interest in coats, and an agent may ask whether a "parka" may be suitable.
  • The virtual environment viewed by the user may dynamically change to reflect a snow-covered street where avatars wearing parkas are walking.
  • Each parka may bear indicia enabling the user to express an interest about any particular parka.
  • The user could be directed to walk through a door that exists in the virtual world and that opens to such a winter setting with models wearing parkas, viewable to the user upon passing through the doorway.
  • FIG. 1 is a block diagram illustrating a system according to one embodiment of the invention.
  • FIG. 2 is a block diagram illustrating a speech processor illustrated in FIG. 1 in greater detail.
  • FIG. 3 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to one embodiment of the invention.
  • FIG. 4 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to another embodiment of the invention.
  • The present invention relates to dynamically customizing a virtual world based on a conversation occurring with respect to a location of the virtual world.
  • The present invention enables the virtual world to change in response to sounds made by either participant in a conversation.
  • The present invention will be described herein in the context of a commercial entity, or enterprise, providing marketing, sales, or services to existing or potential customers in a virtual world.
  • The present invention is not limited to such a commercial context, and has applicability in any context where it would be beneficial to dynamically alter a virtual world based on a conversation between users or participants in the virtual world.
  • A server 10, including a virtual world engine 12 and a speech processor 14, provides a virtual world 16.
  • The server 10 can comprise any suitable processing device capable of interacting with a network, such as the Internet 18, via a communications interface 19, and capable of executing instructions sufficient to provide a virtual world 16 and to carry out the functionality described herein.
  • The server 10 can execute any conventional or proprietary operating system, and the virtual world engine 12 and the speech processor 14 can be coded in any conventional or proprietary software language.
  • Users 20A, 20B participate in the virtual world 16 using user devices 24A, 24B, respectively.
  • A respective element may be referred to collectively without reference to a specific instance of the element where the discussion does not relate to a specific element.
  • For example, the users 20A, 20B may be referred to collectively as the users 20 where the discussion does not pertain to a specific user 20.
  • The user devices 24A, 24B may be referred to collectively as the user devices 24 where the discussion does not pertain to a specific user device 24.
  • The user devices 24 may comprise any suitable processing device capable of interacting with a network, such as the Internet 18, and of executing a client module 26 suitable to interact with the server 10.
  • The client module 26 also provides the virtual world 16 for display on a display device (not shown) to a respective user 20.
  • The user devices 24 could comprise, for example, personal computers, cell phones, personal digital assistants, fixed or mobile gaming consoles, and the like.
  • The user devices 24 may connect to the Internet 18 using any desired communications technologies, including wired or wireless technologies, and any suitable communication protocols, such as Transmission Control Protocol/Internet Protocol (TCP/IP) or Sequenced Packet Exchange/Internetwork Packet Exchange (SPX/IPX). While the invention is described herein as using a public network, such as the Internet 18, to enable communications between the server 10 and the user devices 24, any network that enables such communications, private or public, conventional or proprietary, could be used with the present invention.
  • The users 20 are typically represented in the virtual world 16 through the use of a computer-rendered entity known as an avatar.
  • An avatar is essentially a representation of the respective user 20 within the virtual world 16, and indicates a location of the respective user 20 with respect to the virtual world 16.
  • Avatars 28A, 28B represent the users 20A, 20B, respectively, in the virtual world 16.
  • The client modules 26 receive instructions from the users 20, typically via an input device such as a mouse, a toggle, a keyboard, and the like, to move a respective avatar 28 about the virtual world 16.
  • The respective client module 26 renders the virtual world 16 to show movement of the avatar 28 to the respective user 20 within the context of the virtual world 16, and also provides movement data to the virtual world engine 12.
  • The virtual world engine 12 collects information from the client modules 26, and informs the client modules 26 of events occurring that may be within an area of interest of the respective avatar 28 controlled by the respective client module 26. As is understood by those skilled in the art, such information is typically communicated between a client module 26 and the virtual world engine 12 in the form of messages. For example, if the user 20A moves the avatar 28A from one side of a room in the virtual world 16 to the other side of the room, the client module 26A will provide this movement information to the virtual world engine 12, which in turn will provide the movement information to the client module 26B, which can then render the virtual world 16 showing the movement of the avatar 28A to the user 20B.
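The message flow just described can be sketched as an engine that relays movement events to clients whose avatars fall within an area of interest. The class name, the inbox/message representation, and the distance threshold are all assumptions made for this illustration; the patent does not specify a message format.

```python
# Illustrative sketch of the message flow described above: the engine
# receives a movement from one client and notifies clients whose avatars
# are within an area of interest. Names and thresholds are assumptions.
import math

class VirtualWorldEngine:
    def __init__(self, interest_radius=50.0):
        self.interest_radius = interest_radius
        self.positions = {}          # avatar id -> (x, y)
        self.inboxes = {}            # avatar id -> pending event messages

    def register(self, avatar_id, position):
        self.positions[avatar_id] = position
        self.inboxes[avatar_id] = []

    def handle_move(self, avatar_id, new_position):
        """Record a movement and notify nearby clients, as client 26A's
        movement data is relayed by the engine to client 26B."""
        self.positions[avatar_id] = new_position
        for other, pos in self.positions.items():
            if other == avatar_id:
                continue
            if math.dist(pos, new_position) <= self.interest_radius:
                self.inboxes[other].append(
                    {"event": "move", "avatar": avatar_id, "to": new_position})

engine = VirtualWorldEngine()
engine.register("28A", (0.0, 0.0))
engine.register("28B", (10.0, 0.0))
engine.handle_move("28A", (5.0, 0.0))   # within 28B's area of interest
```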
  • The virtual world 16 may be a virtual world that provides access to a large number of users for a social interaction purpose, such as Second Life; in the context of a competitive or collaborative game, such as World of Warcraft; or may provide access for a more limited purpose, such as the provision of services for a particular commercial enterprise.
  • The process for downloading the client modules 26 may be dynamic and practically transparent to the users 20, such as, for example, where the client module 26 is automatically downloaded by virtue of a connection to a particular website, and the client module 26 runs in a browser used by the user 20 to interact with websites on the Internet 18.
  • The virtual world engine 12 enables speech communications among users within the virtual world 16.
  • The user devices 24 preferably include speech-enabling technology.
  • A user 20 uses a headset (not shown) that includes an integrated microphone and that is coupled, wired or wirelessly, to the user device 24.
  • The user 20A may speak into the microphone, causing the user device 24A to generate a media stream of the voice signal of the user 20A that is provided by the client module 26A to the virtual world engine 12.
  • The virtual world engine 12, determining that the avatar 28B is within an auditory area of interest of the avatar 28A, may provide the media stream to the client module 26B for playback by the user device 24B to the user 20B.
  • A user, such as the user 20B, may be a representative, or agent, of a commercial enterprise, or entity, that is depicted or otherwise portrayed in the virtual world 16.
  • The avatar 28B may be in a clothing store 30 that is depicted in the virtual world 16.
  • The user 20B may initiate a conversation with the user 20A.
  • The virtual world engine 12 may enable such communications either automatically, based on a virtual world proximity between the avatars 28A and 28B, or may first require an explicit request and approval between the users 20A and 20B, through the use of dialog boxes or the like, prior to enabling such conversations.
  • The virtual world engine 12, upon enabling such communications, also provides the voice signals generated by the users 20A and 20B to the speech processor 14.
  • The speech processor 14 can comprise any suitable speech recognition processor capable of detecting sounds in speech signals.
  • Sounds and key sounds can include, but are not limited to, words, phrases, combinations of words occurring within a predetermined proximity of one another, utterances, names, nicknames, unique pronunciations, accents, dialects, and the like.
  • The speech processor 14 may comprise phonetic-based speech processing.
  • Phonetic-based speech processing has the capability of detecting sounds in conversations quickly and efficiently.
  • The speech processor 14 may be implemented on a separate device that is coupled to the server 10 via a network, such as the Internet 18.
  • The avatars 28A and 28B may be surrounded by displays of certain types of virtual object products, such as shirts and pants.
  • The user 20A may respond that they are interested in whether the clothing store 30 carries shoes.
  • The speech processor 14 monitors the conversation between the users 20A and 20B. Each sound detected by the speech processor 14 may be matched against a list of key sounds to determine whether the detected sound matches a key sound. Assume that the sound "shoe" is a key sound. Upon determining that the sound "shoe" matches a key sound, the speech processor 14 signals the virtual world engine 12 that a key sound has been detected, and provides the virtual world engine 12 with a unique index identifying the respective key sound.
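The monitoring and signaling step above can be sketched as a loop over recognized sounds that signals the engine with a key sound's unique index on each match. The key-sound dictionary, the index values, and the callback name are assumptions for this illustration; a real speech processor would work on phonetic data rather than plain words.

```python
# A minimal sketch of the monitoring step described above: each sound
# detected in the conversation is matched against key sounds, and on a
# match the engine is signaled with the key sound's unique index. The
# key-sound list and callback name are illustrative assumptions.
KEY_SOUNDS = {"shoe": 1, "coat": 2, "hat": 3}

def monitor_conversation(recognized_words, signal_engine):
    """Scan recognized words; signal the engine for every key sound match."""
    for word in recognized_words:
        index = KEY_SOUNDS.get(word.lower())
        if index is not None:
            signal_engine(index, word)

signals = []
monitor_conversation(["do", "you", "carry", "a", "shoe"],
                     lambda index, word: signals.append((index, word)))
```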
  • The virtual world engine 12 determines a virtual world customization that is associated with the key sound "shoe."
  • The virtual world customization may comprise any suitable alteration of the virtual world 16, and can comprise, for example, display of a video or graphical image in the virtual world 16, playing of an audio stream or other recording in the virtual world 16, reconfiguration of an existing display of virtual objects from a first configuration to a second configuration in the virtual world 16, inclusion of additional virtual objects or removal of existing virtual objects in the virtual world 16, introduction of a virtual world environment not previously displayed in the virtual world 16, or any combination thereof.
  • The virtual world engine 12 determines that the virtual world customization associated with the key sound "shoe" comprises altering the virtual world 16 to include a table bearing a plurality of different types of shoes.
  • The imagery associated with the virtual world customization may be stored on a storage device 32 that is coupled to the virtual world engine 12 directly, or via a network, such as the Internet 18.
  • The virtual world engine 12 provides the imagery to the user device 24A, which in turn may render, according to instructions, the shoe display in place of an existing display in a location viewable by the user 20A.
  • The virtual world customization may appear to the user 20A to simply 'appear' in the virtual world 16 in front of the avatar 28A.
  • The virtual world customization may similarly be rendered with respect to a location of the avatar 28A such that the user 20A may direct the avatar 28A to a different location within the clothing store 30, or through a door, for example, to view the virtual world customization.
  • The virtual world customization may be based on the key sound and on information known or discernible about the user 20A.
  • The user 20A may be an existing customer of the business enterprise depicted as the clothing store 30, and the business enterprise may have data showing that the user 20A ordinarily purchases running shoes.
  • The virtual world engine 12 may have local access to such information in the storage device 32 or, where the business entity is one of several business entities depicted in the virtual world 16, such as, for example, where the virtual world 16 is a shopping mall, may have access to a business server 34 associated with the respective business.
  • The user information stored in the business server 34 belonging to the business enterprise will be available to the user 20B and can be made available to the server 10 for use in the virtual world 16 as appropriate.
  • The virtual world engine 12, in conjunction with the speech processor 14, may provide information including the detection of the key sound "shoe" and an identifier identifying the user 20A to the business server 34.
  • The business server 34 may determine that the user 20A typically purchases running shoes, and provide that information to the virtual world engine 12.
  • The virtual world engine 12 may then select a virtual world customization relating solely to running shoes in lieu of a virtual world customization that relates to a plurality of different types of shoes.
  • The user 20A may view the display of running shoes and indicate they are not interested in running shoes.
  • The user 20B, a contact center agent sitting in front of a computer, may view a record of purchases of the user 20A, note that the user 20A previously purchased cowboy boots on several occasions, and ask whether the user 20A is interested in cowboy boots.
  • The speech processor 14 may determine that the sound "cowboy boots" is a key sound and identify the key sound to the virtual world engine 12.
  • The virtual world engine 12 may determine that a virtual world customization associated with cowboy boots involves altering the virtual world 16 to include a display of cowboy boots.
  • The virtual world engine 12 obtains the imagery, or skin, associated with cowboy boots from the storage device 32 and provides the imagery to the user device 24A, which in turn renders the new display of cowboy boots for the user 20A.
  • The present invention dynamically customizes the virtual world 16 based on a conversation between the user 20A and the user 20B.
  • The virtual world 16 may be altered based on sounds made by the user 20A, by the user 20B, or by a combination of sounds made by both users 20A, 20B.
  • A virtual world customization can range from a subtle change to the virtual world 16 to a complete re-skinning of portions of the virtual world 16.
  • The virtual world customization may include changing the wall color of the clothing store 30, altering the content of posters hanging on the walls, changing dynamic information, such as data feeds that are being shown on a virtual flat-screen television portrayed in the clothing store 30, and the like.
  • The virtual world customization can occur within a portion of the virtual world 16 that is immediately visible to the respective user 20 without any movement of the respective avatar 28, or within a portion of the virtual world 16 that the user 20 will view as the avatar 28 moves about the virtual world 16.
  • The user 20B may be a human; alternately, the user 20B may be a computer-controlled agent.
  • The virtual world engine 12 may control the avatar 28B and, upon detection of a proximity of the avatar 28A, engage the user 20A in a conversation using artificial intelligence.
  • The virtual world engine 12 may provide a computer-controlled avatar 28B until a certain key sound is detected by the speech processor 14, which the virtual world engine 12 uses to initiate a process to contact a human, such as a contact center agent, to take over control of the avatar 28B and the conversation with the user 20A.
  • The detection of certain key sounds, such as "purchase," "see," "do you have," and the like, may be deemed of sufficient interest that it warrants the resources of a human to engage the user 20A.
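The escalation from computer control to a human agent described above can be sketched as a small state change triggered by an escalation-worthy key sound. The class, the attribute names, and the particular sound set are assumptions for this illustration, not structures the patent defines.

```python
# Hedged sketch of the escalation described above: a computer-controlled
# agent avatar handles the conversation until a key sound of sufficient
# interest triggers handover to a human contact center agent. All names
# here are illustrative assumptions.
ESCALATION_SOUNDS = {"purchase", "see", "do you have"}

class AgentAvatar:
    def __init__(self):
        self.controller = "computer"     # initially computer-controlled

    def on_detected_sound(self, sound: str):
        """Hand control to a human when an escalation key sound is heard."""
        if self.controller == "computer" and sound.lower() in ESCALATION_SOUNDS:
            self.controller = "human"    # route to a contact center agent

avatar = AgentAvatar()
avatar.on_detected_sound("hello")      # stays computer-controlled
avatar.on_detected_sound("purchase")   # triggers human takeover
```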
  • For teachings relating to using a human contact center agent in the context of a virtual world, please see U.S. Patent Application Serial No. 11/608,475, filed December 8, 2006, entitled PROVISION OF CONTACT CENTER SERVICES TO PLAYERS OF GAMES, which is hereby incorporated by reference herein.
  • FIG. 2 is a block diagram illustrating aspects of the speech processor 14 in greater detail.
  • The speech processor 14 receives one or more media streams 40A, 40B, representing voice signals made by the users 20A, 20B, respectively. While shown in FIG. 2 as separate voice signals for purposes of illustration, the voice signals could be presented to the speech processor 14 as a single media stream 40 including voice signals from both users 20A, 20B. Further, while the embodiment herein has been shown as having two users 20 engaged in a conversation, the present invention could be used with any number of users 20 engaging in a conversation.
  • The speech processor 14 monitors the conversation by analyzing the media streams 40A, 40B.
  • The speech processor 14 detects sounds and determines whether any of the sounds match a key sound.
  • Key sounds may be stored in a key sound table 42.
  • The key sound table 42 may include a key sound column 44 containing rows of individual key sounds, and an index column 46 containing rows of unique identifiers, each of which is associated with a unique key sound. While not shown in the key sound table 42, data may also be present that define a key sound as comprising multiple sounds, and may further include data defining a key sound as multiple sounds occurring within a proximity of one another. While for purposes of illustration the key sounds in the key sound column 44 are represented as simple words, it will be understood by those skilled in the art that the key sounds may comprise data representing phonetic components of speech, waveforms, or any other data suitable for representing a sound that may be made during a conversation.
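The key sound table 42 just described, including a key sound defined as multiple sounds occurring within a proximity of one another, can be modeled as a small lookup structure. The tuple layout, the gap values, and the function name are assumptions for this sketch; the patent leaves the table's storage format open.

```python
# An illustrative model of the key sound table 42: each row pairs a key
# sound with a unique index, and a key sound may comprise multiple sounds
# required to occur within a given proximity of one another. The field
# layout and gap values are assumptions, not the patent's format.
KEY_SOUND_TABLE = [
    # (component sounds, max words allowed between them, unique index)
    (("parka",), 0, 4),
    (("cowboy", "boots"), 2, 7),
]

def lookup_key_sounds(words):
    """Return the indices of all key sounds found in the word sequence."""
    hits = []
    for parts, max_gap, index in KEY_SOUND_TABLE:
        if len(parts) == 1:
            if parts[0] in words:
                hits.append(index)
        else:
            # two sounds that must occur within max_gap words of each other
            for i, word in enumerate(words):
                if word == parts[0] and parts[1] in words[i + 1 : i + 2 + max_gap]:
                    hits.append(index)
                    break
    return hits

found = lookup_key_sounds(["any", "cowboy", "leather", "boots", "today"])
```

A production table would hold phonetic representations rather than words, as the text notes, but the lookup shape would be similar.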
  • The server 10 also includes a control system 48, which includes a processor for executing the various software suitable for implementing the virtual world engine 12 and the speech processor 14 modules, and interprocess communication capabilities suitable for enabling communications between the various modules.
  • a control system 48 which includes a processor for executing the various software suitable for implementing the virtual world engine 12 and the speech processor 14 modules, and inter process communication capabilities suitable for enabling communications between the various modules.
The speech processor 14 detects the sound "parkas" in the media stream 40A. It initiates a table lookup on the key sound table 42 and determines that the sound "parkas" matches the key sound "parka." The detected sound in the media stream 40A may match a key sound even if the detected sound is not identical to the key sound, for example where the key sound is in singular form and the detected sound is in plural form. The speech processor 14 also determines that the index associated with the key sound "parka" is "4," and sends the index "4" and an identifier identifying the user 20A as the source of the key sound to the virtual world engine 12. The virtual world engine 12 may use one or both of the key sound index and the user identifier to determine a virtual world customization to include in the virtual world 16.
Fig. 3 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to one embodiment of the present invention. The present embodiment relates to a business entity providing customer support via the virtual world 16. The user 20A may direct a browser on the user device 24A to connect to a website associated with the business enterprise. A Java code client module 26A may be loaded onto the user device 24A, and imagery representing the virtual world 16 may be loaded into the client module 26A for display to the user 20A. A default avatar may be available and be shown in the virtual world 16 or, alternately, the user 20A may be able to select an avatar from a number of avatars provided by the virtual world 16. The virtual world 16 is in essence a virtual store associated with the business enterprise, for example a large electronics store. The user 20A moves the avatar 28A through the business enterprise in the virtual world 16 to a counter identified as a "customer service" counter (step 100). The business enterprise maintains a number of human agents who continually monitor the virtual world 16 for avatars that approach various areas of the business enterprise in the virtual world 16 or, alternately, are alerted when a user avatar 28 is within a proximity of a specific area in the virtual world 16.

A human agent associated with the customer service counter and represented in the virtual world 16 by the avatar 28B initiates a conversation with the user 20A and asks the user 20A whether the user 20A requires any help (step 102). The user 20A indicates that he is having a problem with a television (step 104). The agent asks the user 20A for the model or type of television with which the user 20A is having a problem (step 106). The user 20A responds with a particular television model (step 108). The speech processor 14, which is monitoring the conversation between the agent and the user 20A, determines that the sound associated with the television model number matches a key sound in the key sound table 42 (step 110). The speech processor 14 provides a key sound index to the virtual world engine 12. The virtual world engine 12 determines that a virtual world customization associated with the key sound relates to providing, on the countertop, a three-dimensional image of the particular television model with which the user 20A is having a problem (step 112). The user 20A sees the three-dimensional rendering of the television appear on the counter, and indicates to the agent that the problem relates to connecting a High-Definition Multimedia Interface (HDMI) cable (step 114). The speech processor 14 determines that the sound "HDMI" matches a key sound (step 116), and provides a key sound index associated with the key sound to the virtual world engine 12.
Fig. 4 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to another embodiment of the invention. Assume in this embodiment that the virtual world 16 is a shopping mall and the user 20A moves the avatar 28A into a store of a business depicted in the shopping mall. The virtual world engine 12 determines that the avatar 28A is within a particular proximity of the virtual world enterprise (step 200) and sends a notification to the business server 34 that an avatar 28 is in proximity of the virtual world enterprise (step 202). A contact agent is notified via the business server 34 and moves the avatar 28B in proximity to the avatar 28A (step 204). The contact agent initiates a conversation with the user 20A and asks, for example, whether the user 20A is interested in any particular type of clothing (step 206). The user 20A indicates an interest in purchasing a parka (step 208). The speech processor 14, monitoring the conversation, determines that the sound "parka" matches the key sound "parka" (step 210) and provides an index associated with the key sound "parka" to the virtual world engine 12. The virtual world engine 12 determines that the virtual world customization associated with the key sound "parka" involves altering the virtual world 16 to include a virtual world customization showing avatars 28 wearing parkas in an environment where customers conventionally use parkas (step 212). The virtual world engine 12 changes the imagery from a virtual world enterprise to a snow-covered street nestled in the mountains, where avatars are modeling various parkas sold by the virtual world enterprise. Each parka may bear indicia by which the user 20A, upon determining an interest in a particular parka, can request to see it.
In another scenario, the users 20A, 20B are two friends located geographically apart, but who are exploring the virtual world 16 together. The avatars 28A, 28B move into the clothing store 30, and the user 20A indicates to the user 20B that the user 20A finds certain belts to be "too formal." The user 20B responds that he wishes the clothing store 30 carried braided belts. The speech processor 14, monitoring the conversation, determines that the sound "braided belt" matches a key sound and provides an index associated with the key sound "braided belt" to the virtual world engine 12. The virtual world engine 12 determines that the virtual world customization associated with the key sound "braided belt" involves altering the virtual world 16 to include a display of hanging braided belts and playing modern rock music. The display appears in front of the users 20A, 20B, and the music being played in the clothing store 30 changes from classical music to modern rock music.
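The braided-belt scenario above can be sketched as a small detect-and-customize pipeline: a monitored utterance is matched against a key sound table, the resulting index selects a stored customization, and the virtual world state is altered. Everything here is an invented stand-in for illustration — the table contents, index values, and customization records — and a real speech processor would match audio, not strings.

```python
# Hypothetical key sound table: key sound -> unique index.
KEY_SOUND_TABLE = {"parka": 4, "braided belt": 7}

# Hypothetical customizations keyed by key-sound index.
CUSTOMIZATIONS = {
    4: {"show": "avatars modeling parkas on a snow-covered street"},
    7: {"show": "display of hanging braided belts", "music": "modern rock"},
}

def monitor(utterance, speaker_id):
    """Return (index, speaker) for the first key sound heard, else None."""
    text = utterance.lower()
    for key_sound, index in KEY_SOUND_TABLE.items():
        if key_sound in text:
            return index, speaker_id
    return None

def apply_customization(world, match):
    """Alter the world per the selected customization and record who triggered it."""
    index, speaker = match
    world.update(CUSTOMIZATIONS[index])
    world["triggered_by"] = speaker
    return world

world = {"music": "classical"}
match = monitor("I wish this store carried braided belts", "user20B")
if match:
    world = apply_customization(world, match)
```

As in the scenario, the music entry changes from classical to modern rock and the belt display is added, and the speaker identifier travels with the key-sound index so the engine knows which participant made the sound.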

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Transfer Between Computers (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A method and apparatus for dynamically customizing a virtual world. A first user and a second user engage in a conversation with respect to a location in the virtual world. A speech processor monitors the conversation and detects that a sound made during the conversation matches a key sound. The virtual world is altered to include a virtual world customization based on the key sound. The virtual world customization may also be based on user information associated with the user in the conversation who made the sound.

Description

DYNAMIC CUSTOMIZATION OF A VIRTUAL WORLD
Field of the Invention
[0001] This invention relates to virtual worlds, and in particular to dynamically customizing a virtual world based on a conversation occurring with respect to a location in the virtual world.
Background of the Invention
[0002] A virtual world is a computer simulated environment in which humans typically participate via a computer-rendered entity referred to as an avatar. Virtual worlds have long been associated with entertainment, and the success of several multiplayer online simulations, such as World of Warcraft and Second Life, is evidence of the popularity of virtual worlds. The immersive qualities of virtual worlds are frequently cited as the basis of their popularity. Commercial entities are beginning to explore using virtual worlds for marketing products or services to existing or potential customers. However, currently, in a commercial context, virtual worlds are typically used as a mechanism for enticing a potential customer to contact a representative of the company to discuss the company's products or services, or as a means for strengthening a brand through exposure to users of the virtual world - in essence, as a billboard.

[0003] Virtual worlds are increasingly providing voice communication capabilities among users of the virtual world. Headphones with integrated microphones and speakers are commonplace among computer users today, and virtual worlds use voice-enabled communications to enable users participating in the virtual world to talk with one another. Some virtual worlds also provide rudimentary voice recognition interfaces, wherein a user of the virtual world can navigate within the virtual world by using a set of predefined commands. However, virtual worlds currently lack the ability to integrate voice with activity occurring in the virtual world in a way that is natural and that enhances a user's experience from a marketing perspective. Therefore, there is a need to combine a virtual world's immersive qualities with a user's voice communications to enable a virtual world to provide a customized experience based on a user's particular interests.
Summary of the Invention
[0004] The present invention relates to a virtual world that is customized based on a conversation that is associated with the virtual world. A first user of the virtual world and a second user engage in a conversation. The conversation is monitored, and a sound is detected that matches a key sound. A portion of the virtual world is then altered to include a virtual world customization based on the key sound.
[0005] The conversation may be monitored by analyzing a media stream including voice signals of the first and second users. The media stream may be a single media stream carrying voice signals of both the first and second users, or there may be multiple media streams, wherein one media stream includes the voice signals of the first user and another media stream includes the voice signals of the second user. The media streams may be analyzed by voice recognition software, such as a speech analytics algorithm, that is capable of real-time or near real-time analysis of conversations.
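The single-stream and multi-stream options of paragraph [0005] can be illustrated with a sketch that interleaves per-user streams into one tagged stream, so a later key-sound match can be attributed to its speaker. The chunk strings and speaker labels are invented stand-ins for audio frames.

```python
from itertools import zip_longest

def merge_streams(streams):
    """Interleave chunks from per-user streams into one (speaker, chunk) stream."""
    iters = [iter(s["chunks"]) for s in streams]
    speakers = [s["speaker"] for s in streams]
    merged = []
    for round_of_chunks in zip_longest(*iters):
        for speaker, chunk in zip(speakers, round_of_chunks):
            if chunk is not None:  # a shorter stream has simply ended
                merged.append((speaker, chunk))
    return merged

stream_a = {"speaker": "first user", "chunks": ["do you", "carry shoes"]}
stream_b = {"speaker": "second user", "chunks": ["can I help"]}
merged = merge_streams([stream_a, stream_b])
```

The merged stream preserves per-speaker ordering, which is what lets a single analysis pass still report which participant made each sound.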
[0006] The second user may comprise an agent associated with an enterprise and be represented in the virtual world by an agent avatar. The agent may be a human associated with an enterprise depicted in the virtual world, and may be one of many agents managing calls associated with the enterprise from a customer contact center. In an alternate embodiment, the agent may be computer-controlled and programmed to communicate in response to words spoken by the user. The user may be represented in the virtual world by an avatar, and may interact with the virtual world via a user device that includes communication abilities and enables the user to engage in conversation with the agent with respect to a location in the virtual world.
[0007] The key sounds may include all or portions of words and phrases that are associated with products available for sale by the enterprise and sounds that provide information about the participants of the conversation including, for example, an emotional state of the user, a dialect associated with a user, an accent associated with a user, and the like.
[0008] A virtual world customization made in response to detecting a sound made by a participant that matches a key sound can include, but is not limited to, displaying a video or graphical image in the virtual world, playing an audio stream or other recording in the virtual world, reconfiguring an existing display of virtual objects from a first configuration to a second configuration in the virtual world, including additional virtual objects or removing existing virtual objects in the virtual world, introducing a virtual world environment not previously displayed in the virtual world, or any combination thereof. The particular virtual world customization included in the virtual world may be based on the particular key sound detected. According to one embodiment of the invention, the virtual world customization is also based on user information associated with the participant. For example, if a user communicates that they are interested in "hats," the present invention may determine from user profile information that the user is a male residing in Reno, Nevada, and initially present a virtual world customization comprising a display of men's cowboy hats viewable by the user in the virtual world.
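The "hats" example of paragraph [0008] can be sketched as a selection keyed on both the key sound and user information. The catalogue, profile fields, and display names below are invented for illustration; only the male-user-in-Reno scenario comes from the text.

```python
# Hypothetical catalogue: (key sound, profile attribute) -> customization.
CATALOGUE = {
    ("hats", "male"): "display of men's cowboy hats",
    ("hats", None): "display of assorted hats",
}

def select_customization(key_sound, profile=None):
    """Pick a customization for a key sound, refined by profile data when present."""
    gender = profile.get("gender") if profile else None
    return CATALOGUE.get((key_sound, gender), CATALOGUE[(key_sound, None)])

# A male user residing in Reno, Nevada who says "hats" (the example in the text):
choice = select_customization("hats", {"gender": "male", "city": "Reno"})
```

When no profile is available, the generic entry serves as the fallback, matching the text's point that the customization *may* (not must) use user information.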
[0009] According to another embodiment of the invention, the virtual world customization comprises altering the virtual world to include a virtual world environment that was absent from display in the virtual world prior to detection of the key sound, wherein the virtual world environment includes a display of a product associated with the key sound in an environment that shows a conventional use of the product by a purchaser of the product. For example, a user may indicate an interest in coats, and an agent may ask whether a "parka" may be suitable. Upon detection of the keyword "parka," the virtual environment viewed by the user may dynamically change to reflect a snow-covered street where avatars wearing parkas are walking. Each parka may bear indicia enabling the user to express an interest about any particular parka. Alternately, the user could be directed to walk through a door that exists in the virtual world and that opens to such a winter setting with models wearing parkas, viewable to the user upon passing through the doorway.
[0010] Those skilled in the art will appreciate the scope of the present invention and realize additional aspects thereof after reading the following detailed description of the preferred embodiments in association with the accompanying drawing figures.
Brief Description of the Drawing Figures
[0011] The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the invention, and together with the description serve to explain the principles of the invention.
[0012] Fig. 1 is a block diagram illustrating a system according to one embodiment of the invention.
[0013] Fig. 2 is a block diagram illustrating the speech processor illustrated in Fig. 1 in greater detail.
[0014] Fig. 3 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to one embodiment of the invention.
[0015] Fig. 4 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to another embodiment of the invention.
Detailed Description of the Preferred Embodiments
[0016] The embodiments set forth below represent the necessary information to enable those skilled in the art to practice the invention and illustrate the best mode of practicing the invention. Upon reading the following description in light of the accompanying drawing figures, those skilled in the art will understand the concepts of the invention and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure and the accompanying claims.

[0017] The present invention relates to dynamically customizing a virtual world based on a conversation occurring with respect to a location of the virtual world. The present invention enables the virtual world to change in response to sounds made by either participant in a conversation. For purposes of illustration, the present invention will be described herein in the context of a commercial entity, or enterprise, providing marketing, sales, or services to existing or potential customers in a virtual world. However, the present invention is not limited to such a commercial context, and has applicability in any context where it would be beneficial to dynamically alter a virtual world based on a conversation between users or participants in the virtual world.
[0018] Referring now to Fig. 1, a block diagram of a system according to one embodiment of the invention is illustrated. A server 10 including a virtual world engine 12 and a speech processor 14 provides a virtual world 16. The server 10 can comprise any suitable processing device capable of interacting with a network, such as the Internet 18, via a communications interface 19, and capable of executing instructions sufficient to provide a virtual world 16 and to carry out the functionality described herein. The server 10 can execute any conventional or proprietary operating system, and the virtual world engine 12 and the speech processor 14 can be coded in any conventional or proprietary software language. Users 20A, 20B participate in the virtual world 16 using user devices 24A, 24B, respectively. Throughout the specification, where the Figures show multiple instances of the same element, such as the users 20A, 20B, the respective element may be referred to collectively without reference to a specific instance of the element where the discussion does not relate to a specific element. For example, the users 20A, 20B may be referred to collectively as the users 20 where the discussion does not pertain to a specific user 20, and the user devices 24A, 24B may be referred to collectively as the user devices 24 where the discussion does not pertain to a specific user device 24. The user devices 24 may comprise any suitable processing device capable of interacting with a network, such as the Internet 18, and of executing a client module 26 suitable to interact with the server 10. The client module 26 also provides the virtual world 16 for display on a display device (not shown) to a respective user 20. The user devices 24 could comprise, for example, personal computers, cell phones, personal digital assistants, fixed or mobile gaming consoles, and the like. The user devices 24 may connect to the Internet 18 using any desired communications technologies, including wired or wireless technologies, and any suitable communication protocols, such as Transmission Control Protocol/Internet Protocol (TCP/IP) or Sequenced Packet Exchange/Internetwork Packet Exchange (SPX/IPX). While the invention is described herein as using a public network, such as the Internet 18, to enable communications between the server 10 and the user devices 24, any network that enables such communications, private or public, conventional or proprietary, could be used with the present invention.
[0019] The users 20 are typically represented in the virtual world 16 through the use of a computer-rendered entity known as an avatar. An avatar is essentially a representation of the respective user 20 within the virtual world 16, and indicates a location with respect to the virtual world 16 of the respective user 20. Avatars 28A, 28B represent the users 20A, 20B, respectively, in the virtual world 16. The client modules 26 receive instructions from the users 20, typically via an input device such as a mouse, a toggle, a keyboard, and the like, to move a respective avatar 28 about the virtual world 16. In response to a particular request to move an avatar 28, the respective client module 26 renders the virtual world 16 to show movement of the avatar 28 to the respective user 20 within the context of the virtual world 16, and also provides movement data to the virtual world engine 12. The virtual world engine 12 collects information from the client modules 26, and informs the client modules 26 of events occurring that may be within an area of interest of the respective avatar 28 controlled by the respective client module 26. As is understood by those skilled in the art, such information is typically communicated between a client module 26 and the virtual world engine 12 in the form of messages. For example, if the user 20A moves the avatar 28A from one side of a room in the virtual world 16 to the other side of the room, the client module 26A will provide this movement information to the virtual world engine 12, which in turn will provide the movement information to the client module 26B, which can then render the virtual world 16 showing the movement of the avatar 28A to the user 20B.
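The message flow of paragraph [0019] can be sketched as follows: a client reports an avatar movement, and the engine forwards the event to every client whose avatar has the mover within its area of interest. The coordinates, the circular area-of-interest model, and the 10-unit radius are invented for the example.

```python
def within_area_of_interest(observer_pos, event_pos, radius=10.0):
    """True when the event lies inside the observer's circular area of interest."""
    dx = observer_pos[0] - event_pos[0]
    dy = observer_pos[1] - event_pos[1]
    return (dx * dx + dy * dy) ** 0.5 <= radius

def relay_movement(positions, mover, new_pos):
    """Record the move and return the avatars whose clients should be notified."""
    positions[mover] = new_pos
    return [
        avatar
        for avatar, pos in positions.items()
        if avatar != mover and within_area_of_interest(pos, new_pos)
    ]

positions = {"avatar28A": (0.0, 0.0), "avatar28B": (3.0, 4.0)}
notify = relay_movement(positions, "avatar28A", (6.0, 4.0))
```

Here avatar 28A's move lands within avatar 28B's area of interest, so the engine would forward the movement message to client module 26B, mirroring the example in the text.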
[0020] The virtual world 16 may be a virtual world that provides access to a large number of users for a social interaction purpose, such as Second Life, in the context of a competitive or collaborative game, such as World of Warcraft, or may provide access for a more limited purpose such as the provision of services for a particular commercial enterprise. In order to participate in the virtual world 16, it may be necessary to perform an installation process that involves preloading certain content onto the user devices 24, such as, for example, downloading the client modules 26 onto the user devices 24 as well as graphical content relating to the virtual world 16. Alternately, the process for downloading the client modules 26 may be dynamic and practically transparent to the users 20, such as, for example, where the client module 26 is automatically downloaded by virtue of a connection to a particular website, and the client module 26 runs in a browser used by the user 20 to interact with websites on the Internet 18.
[0021] The virtual world engine 12 enables speech communications among users within the virtual world 16. As such, the user devices 24 preferably include speech enabling technology. Typically, a user 20 uses a headset (not shown) that includes an integrated microphone and that is coupled, wired or wirelessly, to the user device 24. For example, the user 20A may speak into the microphone, causing the user device 24A to generate a media stream of the voice signal of the user 20A that is provided by the client module 26A to the virtual world engine 12. The virtual world engine 12, determining that the avatar 28B is within an auditory area of interest of the avatar 28A, may provide the media stream to the client module 26B for playback by the user device 24B to the user 20B. A similar process following the reverse path from the user device 24B to the virtual world engine 12 to the user device 24A will be followed if the user 20B decides to respond to the user 20A. In this manner, the user 20A may engage in a conversation with the user 20B.

[0022] The users 20A, 20B could be any type of participants in the virtual world 16, including acquaintances such as mutual friends exploring the virtual world 16 collaboratively, or strangers who happen upon one another in the virtual world 16. According to one embodiment of the present invention, a user, such as the user 20B, may be a representative, or agent, of a commercial enterprise, or entity, that is depicted or otherwise portrayed in the virtual world 16. For example, the avatar 28B may be in a clothing store 30 that is depicted in the virtual world 16. Upon detection of the avatar 28A in or about the clothing store 30, the user 20B may initiate a conversation with the user 20A.
The virtual world engine 12 may enable such communications either automatically, based on a virtual world proximity between the avatars 28A and 28B, or may first require an explicit request and approval between the users 20A and 20B through the use of dialog boxes or the like, prior to enabling such conversations. The virtual world engine 12, upon enabling such communications, also provides the voice signals generated by the users 20A and 20B to the speech processor 14. The speech processor 14 can comprise any suitable speech recognition processor capable of detecting sounds in speech signals. Sounds and key sounds, as used herein, can include, but are not limited to, words, phrases, combinations of words occurring within a predetermined proximity of one another, utterances, names, nicknames, unique pronunciations, accents, dialects, and the like. According to one embodiment of the invention, the speech processor 14 comprises phonetic-based speech processing. Phonetic-based speech processing has the capability of detecting sounds in conversations quickly and efficiently. For basic information on one technique for parsing speech into phonemes, please refer to the phonetic processing technology provided by Nexidia Inc., 3565 Piedmont Road NE, Building Two, Suite 400, Atlanta, GA 30305 (www.nexidia.com), and its white paper entitled Phonetic Search Technology, 2007, and the references cited therein, wherein the white paper and cited references are each incorporated herein by reference in their entireties. While shown herein as an integral portion of the server 10, the speech processor 14 may be implemented on a separate device that is coupled to the server 10 via a network, such as the Internet 18.

[0023] Assume the user 20A has moved the avatar 28A into the clothing store 30 and the user 20B asks the user 20A whether the user 20A needs assistance. The avatars 28A and 28B may be surrounded by displays of certain types of virtual object products, such as shirts and pants. The user 20A may respond that they are interested in whether the clothing store 30 carries shoes. The speech processor 14 monitors the conversation between the users 20A and 20B. Each sound detected by the speech processor 14 may be matched against a list of key sounds to determine whether the detected sound matches a key sound. Assume that the sound "shoe" is a key sound. Upon determining that the sound "shoe" matches a key sound, the speech processor 14 signals the virtual world engine 12 that a key sound has been detected, and provides the virtual world engine 12 a unique index identifying the respective key sound. The virtual world engine 12, according to one embodiment of the invention, determines a virtual world customization that is associated with the key sound "shoe." The virtual world customization may comprise any suitable alteration of the virtual world 16, and can comprise, for example, display of a video or graphical image in the virtual world 16, playing of an audio stream or other recording in the virtual world 16, reconfiguration of an existing display of virtual objects from a first configuration to a second configuration in the virtual world 16, inclusion of additional virtual objects or removal of existing virtual objects in the virtual world 16, introduction of a virtual world environment not previously displayed in the virtual world 16, or any combination thereof.
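The matching step of paragraph [0023] can be sketched with a crude textual stand-in. A real phonetic speech processor compares phoneme sequences; here a lowercase comparison plus a plural strip illustrate that a detected sound can match a key sound without being identical to it (as with "shoes" matching "shoe"). The key sounds and index values are invented.

```python
# Hypothetical key sound list with unique indices.
KEY_SOUNDS = {"shoe": 1, "parka": 4}

def normalize(sound):
    """Lowercase the sound and strip a plural 's' when the singular is a key sound."""
    sound = sound.lower()
    if sound.endswith("s") and sound[:-1] in KEY_SOUNDS:
        return sound[:-1]
    return sound

def match_key_sound(detected):
    """Return the unique index of the matching key sound, or None for no match."""
    return KEY_SOUNDS.get(normalize(detected))
```

On a match, only the compact index (not the audio) needs to be signalled to the virtual world engine, which is the hand-off the text describes.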
[0024] The virtual world engine 12 determines that the virtual world customization associated with the key sound "shoe" comprises altering the virtual world 16 to include a table bearing a plurality of different types of shoes. The imagery associated with the virtual world customization may be stored on a storage device 32 and coupled to the virtual world engine 12 directly, or via a network, such as the Internet 18. The virtual world engine 12 provides the imagery to the user device 24A, which in turn may render, according to instructions, the shoe display in place of an existing display in a location viewable by the user 20A. While the virtual world customization may appear to the user 20A to simply 'appear' in the virtual world 16 in front of the avatar 28A, the virtual world customization may similarly be rendered with respect to a location of the avatar 28A such that the user 20A may direct the avatar 28A to a different location within the clothing store 30, or through a door, for example, to view the virtual world customization.
[0025] According to another embodiment of the present invention, the virtual world customization may be based on the key sound and based on information known or discernible about the user 20A. For example, the user 20A may be an existing customer of the business enterprise depicted as the clothing store 30, and the business enterprise may have data showing that the user 20A ordinarily purchases running shoes. The virtual world engine 12 may have local access to such information in the storage device 32 or, where the business entity is one of several business entities depicted in the virtual world 16, such as for example where the virtual world 16 is a shopping mall, may have access to a business server 34 associated with the respective business. In cases where the user 20B is a contact center agent for a business enterprise, the user information stored in the business server 34 belonging to the business enterprise will be available to the user 20B and can be made available to the server 10 for use in the virtual world 16 as appropriate. The virtual world engine 12, in conjunction with the speech processor 14, may provide information including the detection of the key sound "shoe" and an identifier identifying the user 20A to the business server 34. The business server 34 may determine that the user 20A typically purchases running shoes, and provide that information to the virtual world engine 12. The virtual world engine 12 may then select a virtual world customization relating solely to running shoes in lieu of a virtual world customization that relates to a plurality of different types of shoes.
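The refinement in paragraph [0025] — consulting a business server's purchase history to narrow the customization — can be sketched as below. The history records, user identifiers, and display names are hypothetical.

```python
# Hypothetical purchase history held by the business server.
PURCHASE_HISTORY = {"user20A": ["running shoes", "running shoes", "cowboy boots"]}

def usual_purchase(user_id):
    """Return the user's most frequent past purchase, or None if unknown."""
    history = PURCHASE_HISTORY.get(user_id)
    if not history:
        return None
    return max(set(history), key=history.count)

def customization_for(key_sound, user_id):
    """Prefer a customization matching the user's buying habits, else a generic one."""
    if key_sound == "shoe" and usual_purchase(user_id) == "running shoes":
        return "display of running shoes"
    return "display of assorted shoes"
```

An unknown user simply falls back to the generic shoe display, matching the text's description of the broad customization used when no user information is available.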
[0026] The user 2OA may view the display of running shoes and indicate they are not interested in running shoes. The user 2OB, a contact center agent sitting in front of a computer, may view a record of purchases of the user 2OA and note that the user 2OA previously purchased cowboy boots on several occasions, and may ask whether the user 2OA is interested in cowboy boots. The speech processor 14 may determine that the sound "cowboy boots" is a key sound and identify the key sound to the virtual world engine 12. The virtual world engine 12 may determine that a virtual world customization associated with cowboy boots involves altering the virtual world 16 to include a display of cowboy boots. The virtual world engine 12 obtains the imagery, or skin, associated with cowboy boots from the storage device 32, provides the imagery to the user device 24A, which in turn renders the new display of cowboy boots for the user 2OA. In this manner, the present invention dynamically customizes the virtual world 16 based on a conversation between the user 2OA and the user 2OB. Notably, the virtual world 16 may be altered based on sounds made by either the user 2OA, the user 2OB, or a combination of sounds made by both users 2OA, 2OB. [0027] While for purposes of illustration several examples of virtual world custom izations are provided, it should be apparent to those skilled in the art after reading the present disclosure that a virtual world customization can range from a subtle change to the virtual world 16, or a complete re-skinning of portions of the virtual world 16. For example, based on detection of one or more key sounds, the virtual world customization may include changing the wall color of the clothing store 30, altering the content of posters hanging on the walls, changing dynamic information, such as data feeds that are being shown on a virtual flat screen television portrayed in the clothing store 30, and the like. 
The virtual world customization can occur within a portion of the virtual world 16 that is immediately visible to the respective user 20 without any movement of the respective avatar 28, or within a portion of the virtual world 16 that the user 20 will view as the avatar 28 moves about the virtual world 16.

[0028] While the user 20B may be a human, the user 20B may alternatively be a computer-controlled agent. For example, the virtual world engine 12 may control the avatar 28B and, upon detection of a proximity of the avatar 28A, engage the user 20A in a conversation using artificial intelligence. The process as described above with respect to a human user 20B would otherwise be the same, with the conversation between the computer-controlled user 20B and the user 20A being monitored by the speech processor 14, and the virtual world engine 12 altering the virtual world 16 by including a virtual world customization based on the detection of a key sound made by either the computer-controlled user 20B or the user 20A.
[0029] According to another embodiment, the virtual world engine 12 may provide a computer-controlled avatar 28B until the speech processor 14 detects a certain key sound that the virtual world engine 12 uses to initiate a process to contact a human, such as a contact center agent, to take over control of the avatar 28B and the conversation with the user 20A. For example, the detection of certain key sounds, such as "purchase," "see," "do you have," and the like, may be deemed of sufficient interest to warrant the resources of a human to engage the user 20A. For teachings relating to using a human contact center agent in the context of a virtual world, see U.S. Patent Application Serial No. 11/608,475, filed December 8, 2006, entitled PROVISION OF CONTACT CENTER SERVICES TO PLAYERS OF GAMES, which is hereby incorporated by reference herein.
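The hand-off logic of paragraph [0029] can be sketched as a simple state change: a computer-controlled agent handles the conversation until a high-interest key sound is detected, at which point control passes to a human. The class, its fields, and the escalation set below are hypothetical; only the example trigger sounds come from the patent text.

```python
# Key sounds deemed high-interest enough to warrant a human agent,
# per the examples in paragraph [0029].
ESCALATION_SOUNDS = {"purchase", "see", "do you have"}

class AvatarController:
    """Hypothetical controller for avatar 28B."""

    def __init__(self):
        self.controlled_by = "computer"

    def on_key_sound(self, key_sound: str) -> None:
        # Escalate only once, and only on a high-interest key sound.
        if self.controlled_by == "computer" and key_sound in ESCALATION_SOUNDS:
            self.controlled_by = "human"  # e.g. route to a contact center agent

controller = AvatarController()
controller.on_key_sound("parka")     # low interest: stays computer-controlled
controller.on_key_sound("purchase")  # high interest: hand off to a human
print(controller.controlled_by)      # "human"
```

In a real system the state change would trigger a contact-center routing request rather than a field assignment, but the decision point is the same: membership of the detected key sound in a designated escalation set.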
[0030] Fig. 2 is a block diagram illustrating aspects of the speech processor 14 in greater detail. The speech processor 14 receives one or more media streams 40A, 40B representing voice signals made by the users 20A, 20B, respectively. While shown in Fig. 2 as separate voice signals for purposes of illustration, the voice signals could be presented to the speech processor 14 as a single media stream 40 including voice signals from both users 20A, 20B. Further, while the embodiment herein has been shown as having two users 20 engaged in a conversation, the present invention could be used with any number of users 20 engaging in a conversation. The speech processor 14 monitors the conversation by analyzing the media streams 40A, 40B. The speech processor 14 detects sounds and determines whether any of the sounds match a key sound. Key sounds may be stored in a key sound table 42. The key sound table 42 may include a key sound column 44 containing rows of individual key sounds, and an index column 46 containing rows of unique identifiers, each of which is associated with a unique key sound. While not shown in the key sound table 42, data may also be present that defines a key sound as comprising multiple sounds, and may further include data defining a key sound as multiple sounds occurring within a proximity of one another. While for purposes of illustration the key sounds in the key sound column 44 are represented as simple words, it will be understood by those skilled in the art that the key sounds may comprise data representing phonetic components of speech, waveforms, or any other data suitable for representing a sound that may be made during a conversation. The server 10 also includes a control system 48, which includes a processor for executing the various software suitable for implementing the virtual world engine 12 and the speech processor 14 modules, and interprocess communication capabilities suitable for enabling communications between the various modules.
[0031] Assume that the speech processor 14 detects the sound "parkas" in the media stream 40A. The speech processor 14 initiates a table lookup on the key sound table 42 and determines that the sound "parkas" matches the key sound "parka." Notably, the detected sound in the media stream 40A may match a key sound even if the detected sound is not identical to the key sound, for example where the key sound is in singular form and the detected sound is in plural form. The speech processor 14 also determines that the index associated with the key sound "parka" is "4." The speech processor 14 sends the index "4" and an identifier identifying the user 20A as the source of the key sound to the virtual world engine 12. As discussed previously with respect to Fig. 1, the virtual world engine 12 may use one or both of the key sound index and the user identifier to determine a virtual world customization to include in the virtual world 16.
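The lookup in paragraph [0031] can be sketched as a small table keyed by key sound, with a crude singular/plural fold so that a detected "parkas" matches the stored "parka." The table contents, the normalization rule, and the speaker identifiers are all illustrative; the patent notes that key sounds may equally be stored as phonetic components or waveforms.

```python
# Hypothetical key sound table 42: key sound column -> index column.
KEY_SOUND_TABLE = {
    "parka": 4,
    "shoe": 1,
    "cowboy boots": 2,
}

def normalize(sound: str) -> str:
    """Crude singular/plural folding, for illustration only: strip a
    trailing 's' when doing so yields a known key sound."""
    if sound.endswith("s") and sound[:-1] in KEY_SOUND_TABLE:
        return sound[:-1]
    return sound

def match_key_sound(detected: str, speaker: str):
    """Return (index, speaker) if the detected sound matches a key
    sound, mirroring what the speech processor 14 sends to the
    virtual world engine 12; otherwise return None."""
    key = normalize(detected.lower())
    index = KEY_SOUND_TABLE.get(key)
    return (index, speaker) if index is not None else None

print(match_key_sound("parkas", "user20A"))  # matches key sound "parka", index 4
```

Note that "cowboy boots" is left untouched by the fold because stripping its "s" yields no table entry, so multi-word plural key sounds still match directly.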
[0032] Fig. 3 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to one embodiment of the present invention. Assume that the present embodiment relates to a business entity providing customer support via the virtual world 16. The user 20A may direct a browser on the user device 24A to connect to a website associated with the business enterprise. Upon connection to the website, a Java client module 26A may be loaded onto the user device 24A and imagery representing the virtual world 16 may be loaded into the client module 26A for display to the user 20A. A default avatar may be available and shown in the virtual world 16 or, alternatively, the user 20A may be able to select an avatar from a number of avatars provided by the virtual world 16. Assume further that the virtual world 16 is in essence a virtual store associated with the business enterprise, for example a large electronics store. The user 20A moves the avatar 28A through the business enterprise in the virtual world 16 to a counter identified as a "customer service" counter (step 100). Assume further that the business enterprise maintains a number of human agents who continually monitor the virtual world 16 for avatars that approach various areas of the business enterprise in the virtual world 16 or, alternatively, are alerted when a user avatar 28 is within a proximity of a specific area in the virtual world 16.
[0033] A human agent associated with the customer service counter and represented in the virtual world 16 by the avatar 28B initiates a conversation with the user 20A and asks the user 20A whether the user 20A requires any help (step 102). The user 20A indicates that he is having a problem with a television (step 104). The agent asks the user 20A for the model or type of television with which the user 20A is having a problem (step 106). The user 20A responds with a particular television model (step 108). The speech processor 14, which is monitoring the conversation between the agent and the user 20A, determines that the sound associated with the television model number matches a key sound in the key sound table 42 (step 110). The speech processor 14 provides a key sound index to the virtual world engine 12. The virtual world engine 12 determines that a virtual world customization associated with the key sound relates to providing, on the countertop, a three-dimensional image of the particular television model with which the user 20A is having a problem (step 112). The user 20A sees the three-dimensional rendering of the television appear on the counter. The user 20A indicates to the agent that the problem relates to connecting a High-Definition Multimedia Interface (HDMI) cable (step 114). The speech processor 14 determines that the sound "HDMI" matches a key sound (step 116), and provides a key sound index associated with the key sound to the virtual world engine 12. The virtual world engine 12 determines that the virtual world customization associated with the "HDMI" key sound is to rotate the television shown on the counter such that the portion of the television in which cables are inserted is shown, so that the agent and the user 20A can see where HDMI cables are plugged into the television (step 118).

[0034] Fig. 4 is a flow chart illustrating a process for altering a virtual world to include a virtual world customization according to another embodiment of the invention. Assume in this embodiment that the virtual world 16 is a shopping mall and the user 20A moves the avatar 28A into a store of a business depicted in the shopping mall. Assume further that the respective business does not maintain human users 20 to monitor the virtual world 16, but rather maintains a contact center of agents that can be contacted automatically by the virtual world engine 12 upon the determination that an avatar has entered the store.

[0035] The virtual world engine 12 determines that the avatar 28A is within a particular proximity of the virtual world enterprise (step 200). The virtual world engine 12 sends a notification to the business server 34 that an avatar 28 is in proximity of the virtual world enterprise (step 202). A contact agent is notified via the business server 34 and moves the avatar 28B in proximity to the avatar 28A (step 204). The contact agent initiates a conversation with the user 20A and asks, for example, whether the user 20A is interested in any particular type of clothing (step 206). The user 20A indicates an interest in purchasing a parka (step 208). The speech processor 14 monitors the conversation and determines that the sound "parka" matches the key sound "parka" (step 210). The speech processor 14 then provides an index associated with the key sound "parka" to the virtual world engine 12. The virtual world engine 12 determines that the virtual world customization associated with the key sound "parka" involves altering the virtual world 16 to include a virtual world customization showing avatars 28 wearing parkas in an environment where customers conventionally use parkas (step 212).
In this particular customization, the virtual world engine 12 changes the imagery from the virtual world enterprise to a snow-covered street nestled in the mountains, where avatars are modeling various parkas sold by the virtual world enterprise. Each parka may bear some indicia by which the user 20A, upon determining an interest in a particular parka, can request to see it.

[0036] In an alternate embodiment, assume that the users 20A, 20B are two friends located geographically apart, but who are exploring the virtual world 16 together. Assume that the avatars 28A, 28B move into the clothing store 30 and the user 20A indicates to the user 20B that the user 20A finds certain belts to be "too formal." The user 20B responds that he wishes the clothing store 30 carried braided belts. The speech processor 14 monitors the conversation and determines that the sound "braided belt" matches a key sound. The speech processor 14 provides an index associated with the key sound "braided belt" to the virtual world engine 12. The virtual world engine 12 determines that the virtual world customization associated with the key sound "braided belt" involves altering the virtual world 16 to include a display of hanging braided belts and playing modern rock music. The display appears in front of the users 20A, 20B, and the music being played in the clothing store 30 changes from classical music to modern rock music.
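The flow in the examples above, in which the speech processor hands the virtual world engine only a key sound index and the engine maps that index to a specific alteration, can be sketched as a dispatch table. The handler functions, index values, and world representation are all hypothetical; the patent does not define an interface between the two modules.

```python
# Hypothetical customization routines for the examples in [0035]-[0036].
def show_parka_scene(world: dict) -> None:
    world["scene"] = "snow-covered mountain street with parka-clad avatars"

def show_braided_belts(world: dict) -> None:
    world["display"] = "hanging braided belts"
    world["music"] = "modern rock"

# Hypothetical mapping from key sound index to customization routine.
CUSTOMIZATION_HANDLERS = {
    4: show_parka_scene,    # index for key sound "parka"
    7: show_braided_belts,  # illustrative index for key sound "braided belt"
}

def apply_customization(world: dict, key_sound_index: int) -> None:
    """Alter only the relevant portion of the world for a detected
    key sound; ignore indices with no registered customization."""
    handler = CUSTOMIZATION_HANDLERS.get(key_sound_index)
    if handler is not None:
        handler(world)

world = {"music": "classical"}
apply_customization(world, 7)
print(world["music"])  # the store's music changes to modern rock
```

Keeping the speech-analysis side and the world-alteration side coupled only through an index is what lets either module change independently, e.g. swapping word matching for phoneme-based analytics without touching the customization logic.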
[0037] Those skilled in the art will recognize improvements and modifications to the preferred embodiments of the present invention. All such improvements and modifications are considered within the scope of the concepts disclosed herein and the claims that follow.

Claims

What is claimed is:
1. A method comprising: providing a virtual world; monitoring a conversation between a first user of the virtual world with respect to a location in the virtual world and a second user of the virtual world; detecting in the conversation at least one sound; making a determination that the at least one sound matches one of a plurality of key sounds; and in response to the determination, altering a portion of the virtual world to include a virtual world customization based on the one of the plurality of key sounds.
2. The method of claim 1 wherein the second user comprises a contact center representative associated with a business depicted in the virtual world.
3. The method of claim 1 wherein the second user is computer-controlled.
4. The method of claim 1 wherein the second user is an acquaintance of the first user.
5. The method of claim 1 wherein monitoring the conversation further comprises analyzing a first media stream comprising a voice signal generated by the first user and a second media stream comprising a voice signal generated by the second user.
6. The method of claim 1 wherein the at least one sound comprises a distinctive style of pronunciation of a person from a particular area, country, or social background.
7. The method of claim 1 wherein the at least one sound was made by the first user.
8. The method of claim 1 wherein the at least one sound was made by the second user.
9. The method of claim 1 wherein detecting in the conversation at least one sound further comprises detecting, via a phoneme-based speech analytics process, the at least one sound.
10. The method of claim 1 further comprising determining the virtual world customization based on the one of the plurality of key sounds.
11. The method of claim 10 further comprising determining the virtual world customization based on the one of the plurality of key sounds and user information associated with one of the first user and the second user.
12. The method of claim 1 wherein altering the portion of the virtual world to include the virtual world customization based on the one of the plurality of key sounds comprises presenting in the virtual world one or more products absent from the virtual world prior to detecting the at least one sound.
13. The method of claim 1 wherein altering the portion of the virtual world to include the virtual world customization based on the one of the plurality of key sounds comprises presenting in the virtual world one or more items absent from display in the virtual world prior to detecting the at least one sound.
14. The method of claim 13 wherein the one or more items comprise one or more products associated with a business depicted in the virtual world.
15. The method of claim 1 wherein altering the portion of the virtual world to include the virtual world customization based on the one of the plurality of key sounds comprises presenting in the virtual world a virtual environment absent from display in the virtual world prior to detecting the at least one sound, wherein the virtual environment includes a display of a product associated with the at least one sound presented in an environment showing a conventional use of the product by a purchaser of the product.
16. A server comprising: a communications interface adapted to communicate with a network; and a control system adapted to: provide a virtual world; monitor a conversation between a first user of the virtual world with respect to a location in the virtual world and a second user of the virtual world; detect in the conversation at least one sound; make a determination that the at least one sound matches one of a plurality of key sounds; and in response to the determination, alter a portion of the virtual world to include a virtual world customization based on the one of the plurality of key sounds.
17. The server of claim 16 wherein the second user comprises a contact center representative associated with a business depicted in the virtual world.
18. The server of claim 16 wherein to monitor the conversation the control system is further adapted to analyze a first media stream comprising a voice signal generated by the first user and a second media stream comprising a voice signal generated by the second user.
19. The server of claim 16 wherein the at least one sound was made by the second user.
20. The server of claim 16 wherein to detect in the conversation the at least one sound the control system is further adapted to detect, via a phoneme-based speech analytics process, the at least one sound.
21. The server of claim 16 wherein the virtual world customization is based on the one of the plurality of key sounds and user information associated with one of the first user and the second user.
22. The server of claim 16 wherein to alter the portion of the virtual world to include the virtual world customization based on the one of the plurality of key sounds the control system is further adapted to present in the virtual world one or more products absent from the virtual world prior to detecting the at least one sound.
23. The server of claim 16 wherein to alter the portion of the virtual world to include the virtual world customization based on the one of the plurality of key sounds the control system is further adapted to present in the virtual world one or more items absent from display in the virtual world prior to detecting the at least one sound.
24. The server of claim 23 wherein the one or more items comprise one or more products associated with a business depicted in the virtual world.
25. The server of claim 16 wherein to alter the portion of the virtual world to include the virtual world customization based on the one of the plurality of key sounds the control system is further adapted to present in the virtual world a virtual environment absent from display in the virtual world prior to detecting the at least one sound, wherein the virtual environment includes a display of a product associated with the at least one sound presented in an environment showing a conventional use of the product by a purchaser of the product.
PCT/IB2009/007864 2008-12-22 2009-12-22 Dynamic customization of a virtual world WO2010073106A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2011541636A JP5748668B2 (en) 2008-12-22 2009-12-22 Dynamic customization of the virtual world
EP09834194.4A EP2380107A4 (en) 2008-12-22 2009-12-22 Dynamic customization of a virtual world

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/341,871 2008-12-22
US12/341,871 US20100162121A1 (en) 2008-12-22 2008-12-22 Dynamic customization of a virtual world

Publications (1)

Publication Number Publication Date
WO2010073106A1 (en)

Family

ID=42267918

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2009/007864 WO2010073106A1 (en) 2008-12-22 2009-12-22 Dynamic customization of a virtual world

Country Status (4)

Country Link
US (1) US20100162121A1 (en)
EP (1) EP2380107A4 (en)
JP (1) JP5748668B2 (en)
WO (1) WO2010073106A1 (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9357025B2 (en) 2007-10-24 2016-05-31 Social Communications Company Virtual area based telephony communications
US9009603B2 (en) * 2007-10-24 2015-04-14 Social Communications Company Web browser interface for spatial communication environments
US20090288007A1 (en) * 2008-04-05 2009-11-19 Social Communications Company Spatial interfaces for realtime networked communications
US8407605B2 (en) 2009-04-03 2013-03-26 Social Communications Company Application sharing
US8397168B2 (en) * 2008-04-05 2013-03-12 Social Communications Company Interfacing with a spatial virtual communication environment
US7769806B2 (en) 2007-10-24 2010-08-03 Social Communications Company Automated real-time data stream switching in a shared virtual area communication environment
US8756304B2 (en) 2010-09-11 2014-06-17 Social Communications Company Relationship based presence indicating in virtual area contexts
JP5368547B2 (en) 2008-04-05 2013-12-18 ソーシャル・コミュニケーションズ・カンパニー Shared virtual area communication environment based apparatus and method
EP2377089A2 (en) * 2008-12-05 2011-10-19 Social Communications Company Managing interactions in a network communications environment
US9065874B2 (en) 2009-01-15 2015-06-23 Social Communications Company Persistent network resource and virtual area associations for realtime collaboration
US9853922B2 (en) 2012-02-24 2017-12-26 Sococo, Inc. Virtual area communications
US10356136B2 (en) * 2012-10-19 2019-07-16 Sococo, Inc. Bridging physical and virtual spaces
US9319357B2 (en) 2009-01-15 2016-04-19 Social Communications Company Context based virtual area creation
US8737598B2 (en) * 2009-09-30 2014-05-27 International Business Corporation Customer support center with virtual world enhancements
US20120028712A1 (en) * 2010-07-30 2012-02-02 Britesmart Llc Distributed cloud gaming method and system where interactivity and resources are securely shared among multiple users and networks
KR101565665B1 (en) 2010-08-16 2015-11-04 소우셜 커뮤니케이션즈 컴퍼니 Promoting communicant interactions in a network communications environment
US9192860B2 (en) * 2010-11-08 2015-11-24 Gary S. Shuster Single user multiple presence in multi-user game
WO2012135231A2 (en) 2011-04-01 2012-10-04 Social Communications Company Creating virtual areas for realtime communications
US9105013B2 (en) 2011-08-29 2015-08-11 Avaya Inc. Agent and customer avatar presentation in a contact center virtual reality environment
EP3666352B1 (en) 2011-10-28 2021-12-01 Magic Leap, Inc. Method and device for augmented and virtual reality
WO2015008162A2 (en) * 2013-07-15 2015-01-22 Vocavu Solutions Ltd. Systems and methods for textual content creation from sources of audio that contain speech
CN104134226B (en) 2014-03-12 2015-08-19 腾讯科技(深圳)有限公司 Speech simulation method, device and client device in a kind of virtual scene
US11900734B2 (en) 2014-06-02 2024-02-13 Accesso Technology Group Plc Queuing system
GB201409764D0 (en) 2014-06-02 2014-07-16 Accesso Technology Group Plc Queuing system

Citations (5)

Publication number Priority date Publication date Assignee Title
US6757362B1 (en) * 2000-03-06 2004-06-29 Avaya Technology Corp. Personal virtual assistant
US20060025214A1 (en) * 2004-07-29 2006-02-02 Nintendo Of America Inc. Voice-to-text chat conversion for remote video game play
US20070179867A1 (en) * 2004-03-11 2007-08-02 American Express Travel Related Services Company, Inc. Virtual reality shopping experience
US20080204450A1 (en) * 2007-02-27 2008-08-28 Dawson Christopher J Avatar-based unsolicited advertisements in a virtual universe
US20080262911A1 (en) * 2007-04-20 2008-10-23 Utbk, Inc. Methods and Systems to Search in Virtual Reality for Real Time Communications

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
JP3314704B2 (en) * 1998-01-20 2002-08-12 東洋紡績株式会社 Method of synthesizing image showing fitting state and virtual fitting system using the method
JP3683504B2 (en) * 2001-02-14 2005-08-17 日本電信電話株式会社 Voice utilization type information retrieval apparatus, voice utilization type information retrieval program, and recording medium recording the program
JP2003030469A (en) * 2001-07-16 2003-01-31 Ricoh Co Ltd Commodity sales system by virtual department store using virtual reality space, virtual sales system, program and recording medium
US7987151B2 (en) * 2001-08-10 2011-07-26 General Dynamics Advanced Info Systems, Inc. Apparatus and method for problem solving using intelligent agents
JP3892338B2 (en) * 2002-05-08 2007-03-14 松下電器産業株式会社 Word dictionary registration device and word registration program
JP2004178094A (en) * 2002-11-25 2004-06-24 Nippon Telegr & Teleph Corp <Ntt> Online electronic catalog constitution system and method
EP2544066B1 (en) * 2005-12-02 2018-10-17 iRobot Corporation Robot system
JP4368388B2 (en) * 2007-03-27 2009-11-18 株式会社インコムジャパン Virtual space excursion system with avatar product try-on function
US20090313007A1 (en) * 2008-06-13 2009-12-17 Ajay Bajaj Systems and methods for automated voice translation

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US6757362B1 (en) * 2000-03-06 2004-06-29 Avaya Technology Corp. Personal virtual assistant
US20070179867A1 (en) * 2004-03-11 2007-08-02 American Express Travel Related Services Company, Inc. Virtual reality shopping experience
US20060025214A1 (en) * 2004-07-29 2006-02-02 Nintendo Of America Inc. Voice-to-text chat conversion for remote video game play
US20080204450A1 (en) * 2007-02-27 2008-08-28 Dawson Christopher J Avatar-based unsolicited advertisements in a virtual universe
US20080262911A1 (en) * 2007-04-20 2008-10-23 Utbk, Inc. Methods and Systems to Search in Virtual Reality for Real Time Communications

Non-Patent Citations (1)

Title
See also references of EP2380107A4 *

Also Published As

Publication number Publication date
EP2380107A1 (en) 2011-10-26
US20100162121A1 (en) 2010-06-24
EP2380107A4 (en) 2014-01-15
JP5748668B2 (en) 2015-07-15
JP2012514243A (en) 2012-06-21


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 09834194; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2011541636; Country of ref document: JP)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 2009834194; Country of ref document: EP)