US7689423B2 - System and method of providing telematically user-optimized configurable audio - Google Patents
- Publication number: US7689423B2 (application US11/105,076)
- Authority
- US
- United States
- Prior art keywords
- user
- pause
- utterance
- telematics unit
- computer readable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
Definitions
- This invention relates generally to replaying phrases spoken by a user at a telematics unit.
- In particular, the invention relates to replaying those phrases with a tempo and volume adapted to the user.
- a user can initiate a phone call from an in-vehicle mobile phone by announcing a call command.
- a call command can include the name or phone number of the person to be called.
- the telematics unit will repeat the phone number being called just before the call is initiated.
- the repeated phone number is generated by a speech-generation algorithm that provides audio signals for the string of numbers at a preset volume, without inflection and with a measured equal pause between each number.
- the acoustic waves generated by the audio signal at a speaker have frequencies within the full range of frequencies heard by a human with average hearing.
- users may be prompted to enter information through a voice interface.
- the virtual utterances are synthetically generated at a preset volume, without inflection, with a measured equal pause between each word, and cover the full range of frequencies heard by a human with average hearing.
- people say a phone number with a pattern of varying pauses between different numbers. For example, a person may say the number 555-222-3333 with short pauses between the fives (5's), a long pause between the final 5 and the first 2, short pauses between the twos (2's), a long pause between the last 2 and the first 3, and finally with short pauses between the threes (3's).
- This sequence of pauses and numbers can be illustrated as the following string, in which each underscore represents a short pause and four underscores in sequence form a long pause: 5_5_5____2_2_2____3_3_3_3.
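The underscore notation above can be generated mechanically from a list of digits and pause lengths. The following sketch (a hypothetical helper, not part of the patent) renders such a sequence, using one underscore per short time unit:

```python
# Hypothetical illustration: render a digit/pause sequence as the
# underscore notation used above (one underscore per time unit).
def render_pattern(tokens):
    """tokens: digits (str) interleaved with pause lengths (int units)."""
    parts = []
    for tok in tokens:
        if isinstance(tok, int):
            parts.append("_" * tok)   # pause: one underscore per unit
        else:
            parts.append(tok)         # spoken digit
    return "".join(parts)

# "555-222-3333" spoken with short (1-unit) and long (4-unit) pauses:
pattern = render_pattern(["5", 1, "5", 1, "5", 4,
                          "2", 1, "2", 1, "2", 4,
                          "3", 1, "3", 1, "3", 1, "3"])
print(pattern)  # 5_5_5____2_2_2____3_3_3_3
```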
- Some people have difficulty recognizing or remembering a number if the number is spoken in an unfamiliar pattern or if the pattern takes too long to announce.
- If a person is in an MVCU with noisy background conditions (for example, if the window is open and trucks are passing at high speeds), then the person announcing a voice command to the telematics unit speaks loudly to ensure that the microphone in the vehicle picks up the voice command over the background noise.
- the telematics unit announces a virtual utterance responsive to the voice command, but the response may not be audible to the person in the vehicle because of the background noise.
- If the user of the telematics unit does not hear a portion of the frequency range that is normally heard by humans, then synthetically generated responses from the telematics unit can be difficult for the user to hear. For example, if a user is not able to hear acoustic waves at frequencies above a frequency, F 1 , and if forty percent (40%) of the acoustic waves produced by synthetically generated prompts in a telematics unit include frequencies above the frequency, F 1 , the ear of the user will only respond to sixty percent (60%) of the acoustic waves produced by the synthetically generated prompts.
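The 40%/60% arithmetic above can be illustrated with a small helper (hypothetical, for illustration only) that computes the fraction of a prompt's acoustic components falling within a user's audible range:

```python
def audible_fraction(component_freqs, f_max):
    """Fraction of a prompt's acoustic components at or below the
    highest frequency f_max (F1) that the user can hear."""
    if not component_freqs:
        return 0.0
    heard = sum(1 for f in component_freqs if f <= f_max)
    return heard / len(component_freqs)

# 2 of 5 components (40%) lie above F1 = 4000 Hz, so the user's ear
# responds to only 60% of the prompt's acoustic waves:
print(audible_fraction([300, 800, 2000, 5000, 6500], 4000))  # 0.6
```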
- One aspect of the present invention provides a method of repeating a computer recognized string in a telematics unit in a vehicle, including receiving a user utterance at the telematics unit from a user, the user utterance including a plurality of words and a plurality of user pauses between the words, parsing the user utterance into a plurality of phonemes, forming a data string in which each user pause is associated with a phoneme adjacent to the user pause, and playing back the data string.
- a second aspect of the present invention provides a computer readable medium storing a computer program including computer readable code for receiving a user utterance at the telematics unit from a user, the user utterance including a plurality of words and a plurality of user pauses between the words, computer readable code for parsing the user utterance into a plurality of phonemes, computer readable code for forming a data string in which each user pause is associated with a phoneme adjacent to the user pause, and computer readable code for playing back the data string.
- a third aspect of the present invention provides a system for repeating a computer recognized string in a telematics unit in a vehicle including means for receiving a user utterance from a user, means for parsing the user utterance into a plurality of phonemes, means for forming a data string in which each user pause is associated with a phoneme adjacent to the user pause, and means for playing back the data string.
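All three aspects share the same pipeline: receive an utterance, parse it, form a data string in which each pause is associated with an adjacent phoneme (here, the preceding word), and play the string back. A minimal sketch, with a toy transcript format standing in for real audio (the `<p:N>` pause markers and all function names are assumptions for illustration):

```python
import re

def parse_utterance(utterance):
    """Split a toy utterance transcript into words and inter-word pauses.
    '<p:N>' marks an N-unit pause; a real telematics unit would derive
    pause durations from the audio signal itself."""
    tokens = re.findall(r"<p:(\d+)>|(\S+)", utterance)
    return [int(p) if p else w for p, w in tokens]

def form_data_string(tokens):
    """Associate each pause with the word (phoneme group) preceding it."""
    data = []
    for tok in tokens:
        if isinstance(tok, int) and data:
            word, _ = data[-1]
            data[-1] = (word, tok)    # attach pause to the preceding word
        else:
            data.append((tok, 0))     # word with no trailing pause yet
    return data

tokens = parse_utterance("5 <p:1> 5 <p:1> 5 <p:4> 2")
print(form_data_string(tokens))
# [('5', 1), ('5', 1), ('5', 4), ('2', 0)]
```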
- FIG. 1 is a schematic diagram of a system for providing access to a telematics system in a mobile vehicle
- FIG. 2 illustrates a method for repeating a computer recognized string in a telematics unit in accordance with the present invention
- FIG. 3 illustrates a method for parsing a user utterance into a plurality of phonemes in accordance with the present invention
- FIGS. 3 a - 3 c illustrate the duration of phrases and pauses in an exemplary ten digit phone number
- FIG. 4 illustrates a method for forming a data string in accordance with the present invention
- FIG. 5 illustrates a first embodiment of a method for playing back the data string in accordance with the present invention
- FIG. 6 illustrates a second embodiment of a method for playing back the data string in accordance with the present invention
- FIG. 7 illustrates a first embodiment of a method for setting frequency parameters in a speech generation algorithm in accordance with the present invention.
- FIG. 8 illustrates a second embodiment of a method for setting frequency parameters in a speech generation algorithm in accordance with the present invention.
- FIG. 1 illustrates one embodiment of a system for data transmission over a wireless communication system, in accordance with the present invention, at 100 .
- Mobile vehicle communication system (MVCS) 100 includes a mobile vehicle communication unit (MVCU) 110 , a vehicle communication network 112 , a telematics unit 120 , one or more wireless carrier systems 140 , one or more communication networks 142 , one or more land networks 144 , one or more client, personal or user computers 150 , one or more web-hosting portals 160 , and/or one or more call centers 170 .
- MVCU 110 is implemented as a mobile vehicle equipped with suitable hardware and software for transmitting and receiving voice and data communications.
- MVCS 100 may include additional components not relevant to the present discussion. Mobile vehicle communication systems and telematics units are known in the art.
- MVCU 110 may also be referred to as a mobile vehicle throughout the discussion below. In operation, MVCU 110 may be implemented as a motor vehicle, a marine vehicle, or as an aircraft. MVCU 110 may include additional components not relevant to the present discussion.
- Vehicle communication network 112 sends signals to various units of equipment and systems within vehicle 110 to perform various functions such as monitoring the operational state of vehicle systems, collecting and storing data from the vehicle systems, providing instructions, data and programs to various vehicle systems, and calling from telematics unit 120 .
- vehicle communication network 112 utilizes interfaces such as controller-area network (CAN), Media Oriented System Transport (MOST), Local Interconnect Network (LIN), Ethernet (10 base T, 100 base T), International Organization for Standardization (ISO) Standard 9141, ISO Standard 11898 for high-speed applications, ISO Standard 11519 for lower speed applications, and Society of Automotive Engineers (SAE) standard J1850 for higher and lower speed applications.
- vehicle communication network 112 is a direct connection between connected devices.
- Wireless carrier system 140 is implemented as any suitable system for transmitting a signal from MVCU 110 to communication network 142 .
- Telematics unit 120 includes a processor 122 connected to a wireless modem 124 , a global positioning system (GPS) unit 126 , an in-vehicle memory 128 , a microphone 130 , one or more speakers 132 , an embedded or in-vehicle portable communication device 134 , such as a mobile phone or a personal digital assistant, and a display 136 .
- the display 136 is not part of the telematics unit 120 but is part of the MVCU 110 and interfaces with the telematics unit 120 via the vehicle communication network 112 .
- the display 136 is part of the embedded or in-vehicle portable communication device 134 .
- the embedded or in-vehicle portable communication device 134 includes short-range wireless receivers and transmitters.
- the short-range wireless receivers and transmitters can be Wi-Fi and/or Bluetooth devices as is known in the art.
- telematics unit 120 includes short-range wireless receiver chips that are compatible with the Wi-Fi and/or Bluetooth technologies.
- the term “wi-fi” includes any radio transmission configured to broadcast within a limited range, such as less than one mile, and includes transmissions made under an industry standard, such as FCC part 15.
- “Wi-fi” includes, but is not limited to, 802.11 transmissions.
- Telematics unit 120 may be implemented without one or more of the above listed components. Telematics unit 120 may include additional components not relevant to the present discussion.
- Processor 122 is implemented as a microcontroller, microprocessor, controller, host processor, or vehicle communications processor.
- processor 122 is a digital signal processor (DSP).
- processor 122 is implemented as an application specific integrated circuit (ASIC).
- processor 122 is implemented as a processor working in conjunction with a central processing unit (CPU) performing the function of a general purpose processor.
- GPS unit 126 provides longitude and latitude coordinates of the vehicle responsive to a GPS broadcast signal received from one or more GPS satellite broadcast systems (not shown).
- Processor 122 executes various computer programs that control programming and operational modes of electronic and mechanical systems within MVCU 110 .
- Processor 122 controls communications (e.g. call signals) between telematics unit 120 , wireless carrier system 140 , and call center 170 .
- Processor 122 generates and accepts digital signals transmitted between telematics unit 120 and a vehicle communication network 112 that is connected to various electronic modules in the vehicle. In one embodiment, these digital signals activate the programming mode and operation modes, as well as provide for data transfers.
- a voice-recognition application including one or more speech recognition engines is installed in processor 122 .
- Speech recognition engines translate human voice input through microphone 130 to digital signals.
- the one or more speech recognition engines installed in processor 122 include one or more speech generation algorithms.
- the speech generation algorithms translate digital signals into virtual utterances, which are sent from processor 122 out through one or more speakers 132 .
- Communication network 142 includes services from one or more mobile telephone switching offices and wireless networks. Communication network 142 connects wireless carrier system 140 to land network 144 . Communication network 142 is implemented as any suitable system or collection of systems for connecting wireless carrier system 140 to MVCU 110 and land network 144 .
- Land network 144 connects communication network 142 to client computer 150 , web-hosting portal 160 , and call center 170 .
- land network 144 is a public-switched telephone network (PSTN).
- land network 144 is implemented as an Internet protocol (IP) network.
- land network 144 is implemented as a wired network, an optical network, a fiber network, other wireless networks, or any combination thereof.
- Land network 144 is connected to one or more landline telephones. Communication network 142 and land network 144 connect wireless carrier system 140 to web-hosting portal 160 and call center 170 .
- Client, personal or user computer 150 includes a computer usable medium to execute Internet browser and Internet-access computer programs for sending and receiving data over land network 144 and optionally, wired or wireless communication networks 142 to web-hosting portal 160 .
- Personal or client computer 150 sends user preferences to web-hosting portal through a web-page interface using communication standards such as hypertext transport protocol (HTTP), and transport-control protocol and Internet protocol (TCP/IP).
- the data includes directives to change certain programming and operational modes of electronic and mechanical systems within MVCU 110 .
- a client utilizes computer 150 to initiate setting or re-setting of user-preferences for MVCU 110 .
- User-preference data from client-side software is transmitted to server-side software of web-hosting portal 160 .
- User-preference data is stored at web-hosting portal 160 .
- Web-hosting portal 160 includes one or more data modems 162 , one or more web servers 164 , one or more databases 166 , and a network system 168 .
- Web-hosting portal 160 is connected directly by wire to call center 170 , or connected by phone lines to land network 144 , which is connected to call center 170 .
- web-hosting portal 160 is connected to call center 170 utilizing an IP network.
- both components, web-hosting portal 160 and call center 170 are connected to land network 144 utilizing the IP network.
- web-hosting portal 160 is connected to land network 144 by one or more data modems 162 .
- Land network 144 sends digital data to and from modem 162 , data that is then transferred to web server 164 .
- Modem 162 may reside inside web server 164 .
- Land network 144 transmits data communications between web-hosting portal 160 and call center 170 .
- Web server 164 receives user-preference data from user computer 150 via land network 144 .
- computer 150 includes a wireless modem to send data to web-hosting portal 160 through a wireless communication network 142 and a land network 144 .
- Data is received by land network 144 and sent to one or more web servers 164 .
- web server 164 is implemented as any suitable hardware and software capable of providing web services to help change and transmit personal preference settings from a client at computer 150 to telematics unit 120 .
- Web server 164 sends to or receives from one or more databases 166 data transmissions via network system 168 .
- Web server 164 includes computer applications and files for managing and storing personalization settings supplied by the client, such as door lock/unlock behavior, radio station pre-set selections, climate controls, custom button configurations and theft alarm settings. For each client, the web server potentially stores hundreds of preferences for wireless vehicle communication, networking, maintenance and diagnostic services for a mobile vehicle.
- one or more web servers 164 are networked via network system 168 to distribute user-preference data among its network components such as database 166 .
- database 166 is a part of or a separate computer from web server 164 .
- Web server 164 sends data transmissions with user preferences to call center 170 through land network 144 .
- Call center 170 is a location where many calls are received and serviced at the same time, or where many calls are sent at the same time.
- the call center is a telematics call center, facilitating communications to and from telematics unit 120 .
- the call center is a voice call center, providing verbal communications between an advisor in the call center and a subscriber in a mobile vehicle.
- the call center contains each of these functions.
- call center 170 and web-hosting portal 160 are located in the same or different facilities.
- Call center 170 contains one or more voice and data switches 172 , one or more communication services managers 174 , one or more communication services databases 176 , one or more communication services advisors 178 , and one or more network systems 180 .
- Switch 172 of call center 170 connects to land network 144 .
- Switch 172 transmits voice or data transmissions from call center 170 , and receives voice or data transmissions from telematics unit 120 in MVCU 110 through wireless carrier system 140 , communication network 142 , and/or land network 144 .
- Switch 172 receives data transmissions from and sends data transmissions to one or more web-hosting portals 160 .
- Switch 172 receives data transmissions from or sends data transmissions to one or more communication services managers 174 via one or more network systems 180 .
- Communication services manager 174 is any suitable hardware and software capable of providing requested communication services to telematics unit 120 in MVCU 110 .
- Communication services manager 174 sends to or receives from one or more communication services databases 176 data transmissions via network system 180 .
- communication services manager 174 includes at least one analog and/or digital modem.
- Communication services manager 174 sends to or receives from one or more communication services advisors 178 data transmissions via network system 180 .
- Communication services database 176 sends to or receives from communication services advisor 178 data transmissions via network system 180 .
- Communication services advisor 178 receives from or sends to switch 172 voice or data transmissions.
- Communication services manager 174 provides one or more of a variety of services, including enrollment services, navigation assistance, directory assistance, roadside assistance, business or residential assistance, information services assistance, emergency assistance, and communications assistance.
- Communication services manager 174 receives service-preference requests for a variety of services from the client via computer 150 , web-hosting portal 160 , and land network 144 .
- Communication services manager 174 transmits user-preference and other data to telematics unit 120 through wireless carrier system 140 , communication network 142 , land network 144 , voice and data switch 172 , and/or network system 180 .
- Communication services manager 174 stores or retrieves data and information from communication services database 176 .
- Communication services manager 174 may provide requested information to communication services advisor 178 .
- communication services manager 174 contains at least one analog and/or digital modem.
- communication services advisor 178 is implemented as a real advisor.
- a real advisor is a human being in verbal communication with a user or subscriber (e.g. a client) in MVCU 110 via telematics unit 120 .
- communication services advisor 178 is implemented as a virtual advisor.
- a virtual advisor is implemented as a synthesized voice interface responding to requests from telematics unit 120 in MVCU 110 .
- Communication services advisor 178 provides services to telematics unit 120 in MVCU 110 .
- Services provided by communication services advisor 178 include enrollment services, navigation assistance, real-time traffic advisories, directory assistance, roadside assistance, business or residential assistance, information services assistance, emergency assistance, and communications assistance.
- Communication services advisor 178 communicates with telematics unit 120 through wireless carrier system 140 , communication network 142 , and land network 144 using voice transmissions, or through communication services manager 174 and switch 172 using data transmissions. Switch 172 selects between voice transmissions and data transmissions.
- FIG. 2 illustrates a method 200 for repeating a computer recognized string in a telematics unit 120 in accordance with the present invention.
- the telematics unit 120 , the processor 122 , the in-vehicle memory 128 , and the one or more speakers 132 have stored in computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 200 .
- the telematics unit (TU) 120 receives a user utterance from a user.
- the spoken phrase includes a plurality of words and a plurality of user pauses between the words.
- the telematics unit (TU) 120 receives a user utterance at a speech recognition engine operating in continuous recognition mode.
- the speech recognition engine is executed by processor 122 .
- the user utterance includes a phone number.
- the user speaks a name of a person and the person's phone number in sequence.
- the telematics unit 120 parses the user utterance into a plurality of phonemes.
- One embodiment of a method for parsing the user utterance into a plurality of phonemes is described below with reference to method 300 in FIG. 3 .
- the telematics unit 120 forms a data string in which each user pause is associated with a phoneme adjacent to the user pause.
- the phoneme adjacent to the user pause is one of a phoneme immediately preceding the user pause and a phoneme immediately following the user pause.
- One embodiment of a method for forming a data string is described below with reference to method 400 in FIG. 4 .
- During stage S 208 , the telematics unit (TU) 120 stores the data strings in the in-vehicle memory 128 .
- the telematics unit 120 stores the data strings corresponding to the name of a person and their phone number as correlated data strings in the in-vehicle memory 128 .
- In one embodiment, stage S 208 is omitted and the flow proceeds directly from stage S 206 to stage S 212 .
- Stage S 210 is optional.
- During stage S 210 , the telematics unit 120 sets frequency parameters in a speech generation algorithm in a speech recognition engine. This option is available to users who have difficulty hearing one or more ranges of frequencies. If this option is selected, then during stage S 212 the data string is played back to the user in a voice prompt or virtual utterance having acoustic waves all within the range of frequencies that the user can hear.
- Two embodiments of methods for setting frequency parameters in a speech generation algorithm in a speech recognition engine are described below in a first and a second embodiment with reference to method 700 in FIG. 7 and method 800 in FIG. 8 , respectively.
- the telematics unit 120 plays back the data string.
- a speech generation algorithm in the speech recognition engine operates on the data string to generate the audio signals of a virtual utterance from the telematics unit 120 .
- the speech recognition engine converts the digital signal into the phonemes and pauses.
- the phonemes and pauses are sent from processor 122 out through one or more speakers 132 as acoustic waves of the virtual utterance.
- Acoustic waves are heard by the user as the virtual utterance including a plurality of words and a plurality of user pauses between the words.
- the virtual utterance mimics the patterns established by the rhythm and timing of the user utterance so the user utterance is repeated to the user as a virtual utterance in a more natural progression used by humans.
- the one or more speech generation algorithms in the speech recognition engine translate the phonemes and the associated user pauses in the received data string into a signal representative of the words and pauses between words, which is repeated back to the user as a virtual utterance upon receipt of the user utterance. Additional details about how the telematics unit 120 plays back the data string are described below in a first and a second embodiment with reference to method 500 in FIG. 5 and method 600 in FIG. 6 , respectively.
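Playback can be sketched as scheduling each word of the data string at a start time that accumulates the word's duration plus its trailing pause, so the virtual utterance reproduces the user's rhythm. The fixed per-word duration below is an assumption made for illustration; a real engine would use per-phoneme durations:

```python
def playback_schedule(data_string, word_units=3):
    """Build a playback schedule (word, start_time_in_units) from a data
    string of (word, trailing_pause_units) pairs. A fixed word duration
    in time-units is an assumed simplification."""
    t, events = 0, []
    for word, pause_units in data_string:
        events.append((word, t))
        t += word_units + pause_units   # advance past word and its pause
    return events

print(playback_schedule([("5", 1), ("5", 4), ("2", 0)]))
# [('5', 0), ('5', 4), ('2', 11)]
```

Because pause units come from the user utterance, a long user pause between digit groups is reproduced as a long gap in the virtual utterance.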
- FIG. 3 illustrates one embodiment of a method 300 for parsing a user utterance into a plurality of phonemes in accordance with the present invention.
- the user utterance includes the plurality of words and plurality of pauses between the words.
- the telematics unit 120 , the processor 122 , the in-vehicle memory 128 , and the one or more speakers 132 have stored in computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 300 .
- the telematics unit (TU) 120 recognizes phonemes for the plurality of words corresponding to the user utterance at the speech recognition engine.
- the one or more speech recognition engines, for example installed in processor 122 , receive the words as well as the pauses between the words, apply algorithms to parse the words into phonemes, and apply algorithms to recognize the parsed phonemes.
- the recognized phonemes are converted into digital signals correlated to the phoneme.
- the phonemes are converted into digital signals correlated to the phoneme from a phoneme look-up table in the in-vehicle memory 128 .
- the digital signal format of the converted phoneme depends on the coding scheme of the speech recognition engine.
- the data string is a series of 1's and 0's.
- the data string is a series of +1's, 0's and −1's.
- Other embodiments employ any data string coding formats known in the art.
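A phoneme look-up table of the kind mentioned above can be sketched as a simple mapping from phonemes to digital codes; the table entries and 4-bit codes below are invented for illustration, since real engines use engine-specific coding schemes:

```python
# Hypothetical phoneme look-up table mapping recognized phonemes to
# digital codes (4-bit codes chosen arbitrarily for illustration).
PHONEME_TABLE = {"f": "0001", "ay": "0010", "v": "0011", "pause": "1111"}

def encode(phonemes):
    """Encode a phoneme sequence as a binary data string (1's and 0's)."""
    return "".join(PHONEME_TABLE[p] for p in phonemes)

print(encode(["f", "ay", "v"]))  # 000100100011 -- the word "five"
```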
- FIGS. 3 a , 3 b , and 3 c illustrate signal patterns 306 , 320 , and 324 , respectively, which indicate the duration of phrases and pauses in an exemplary ten-digit phone number.
- the horizontal axis in these figures is time and the vertical axis represents the amplitude of the spoken phrases. Pauses are shown below the horizontal axis.
- Signal pattern 306 in FIG. 3 a represents the spoken phrases and pauses included in a first set of three digits in a ten digit number.
- the first set of three digits is correlated to the area code of the phone number.
- the extent of segment 308 along the horizontal axis represents the time taken to say the first digit of the ten digit number as word time 1 (WT 1 ).
- the extent of segment 312 along the horizontal axis represents the time taken to say the second digit of the ten digit number as word time 2 (WT 2 ).
- the extent of segment 316 along the horizontal axis represents the time taken to say the third digit of the ten digit number as word time 3 (WT 3 ).
- the extent of segment 310 along the horizontal axis represents the duration of the pause between the first digit and the second digit.
- the extent of segment 314 along the horizontal axis represents the duration of the pause between the second digit and the third digit. In this exemplary phrase, the duration of segment 314 is longer than the duration of segment 310 .
- signal pattern 320 in FIG. 3 b represents the spoken phrases and pauses included in a second set of three digits in the ten digit number.
- the second set of three digits is correlated to the local code of the phone number.
- the extent of segment 322 along the horizontal axis represents the time taken to say the fourth digit of the ten digit number as word time 4 (WT 4 ).
- the extent of segment 326 along the horizontal axis represents the time taken to say the fifth digit of the ten digit number as word time 5 (WT 5 ).
- the extent of segment 330 along the horizontal axis represents the time taken to say the sixth digit of the ten digit number as word time 6 (WT 6 ).
- Segment 324 represents the duration of the pause between the fourth digit and the fifth digit.
- Segment 328 represents the duration of the pause between the fifth digit and the sixth digit. In this exemplary phrase, the duration of segment 324 is longer than the duration of segment 328 .
- FIG. 3 c represents the spoken phrases and pauses included in the set of four digits in the ten digit number.
- the four digits correlate to the last four digits of the phone number, which uniquely identify a phone number.
- the extent of segment 336 along the horizontal axis represents the time taken to say the seventh digit of the ten digit number as word time 7 (WT 7 ).
- the extent of segment 340 along the horizontal axis represents the time taken to say the eighth digit of the ten digit number as word time 8 (WT 8 ).
- the extent of segment 344 along the horizontal axis represents the time taken to say the ninth digit of the ten digit number as word time 9 (WT 9 ).
- segment 348 along the horizontal axis represents the time taken to say the tenth digit of the ten digit number as word time 10 (WT 10 ).
- Segment 338 represents the duration of the pause between the seventh digit and the eighth digit.
- Segment 342 represents the duration of the pause between the eighth digit and the ninth digit.
- Segment 346 represents the duration of the pause between the ninth digit and the tenth digit.
- the duration of segment 342 is longer than the duration of segment 346 and segment 338 and the duration of segment 346 is longer than the duration of segment 338 .
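The alternating word/pause structure of signal patterns such as those in FIGS. 3 a - 3 c can be separated with simple slicing; the durations below are illustrative values chosen to match the ordering described for segments 338, 342, and 346:

```python
def split_word_and_pause_times(segments):
    """Separate an alternating word/pause duration sequence (as in the
    signal patterns of FIGS. 3a-3c) into word times and pause durations.
    Durations are in seconds; the values used here are illustrative."""
    words = segments[0::2]   # WT7, WT8, WT9, WT10
    pauses = segments[1::2]  # pause after each word except the last
    return words, pauses

# Pause after the 8th digit (segment 342) longest, pause after the 9th
# (segment 346) next, pause after the 7th (segment 338) shortest:
words, pauses = split_word_and_pause_times(
    [0.30, 0.05, 0.28, 0.20, 0.31, 0.10, 0.29])
print(pauses)  # [0.05, 0.2, 0.1]
```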
- FIG. 4 illustrates a first embodiment of a method 400 for forming a data string in accordance with the present invention.
- the telematics unit 120 , the processor 122 , the in-vehicle memory 128 , and the one or more speakers 132 have stored in a computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 400 .
- the telematics unit 120 determines the duration of each pause. Each pause is converted into a digital signal, which is correlated to the duration of the pause.
- a clock (not shown) in the processor 122 is set to zero when the user utterance begins.
- a time stamp from a GPS unit, such as GPS Unit 126 provides timing information.
- the processor 122 obtains the time on the clock when the user utterance ends, calculates the time duration of the user utterance and stores the calculated time duration of the user utterance in the in-vehicle memory 128 . In this manner, the processor 122 in the telematics unit 120 determines the duration of the user utterance.
- the duration of time for each pause is a measured time. In another embodiment, the duration of time for each pause is a calculated average of time. In yet another embodiment, the duration of time for each pause is a plurality of time-units. In one embodiment, the time-units are one-tenth (1/10) of a second. In another embodiment, the time-units are one-hundredth (1/100) of a second.
- an algorithm in the processor 122 measures the duration of the user utterance, calculates the number of words based on the phonemes recognized during stage S 302 described above with reference to method 300 of FIG. 3 , and divides the number of words into the measured time duration of the user utterance to determine the duration of each pause as an average time for each pause.
- in order to determine an average pause duration for the N words spoken by the user, the processor 122 clocks the duration of the user utterance as a time T, determines that N words were spoken, determines the speech time, ST 1 , for the combined N words without pauses, and then determines an average pause duration of (T-ST 1 )/N for each pause.
- the duration of the pauses reflects the speed with which the user announced the user utterance.
- in another embodiment, in order to determine an average duration of the pauses between N words spoken by the user, the processor 122 clocks the duration of the user utterance as a time T, determines that N words were spoken, determines the speech time, ST 1 , for the combined N words without pauses, and then determines an average pause duration of (T-ST 1 )/(N-1) for each pause.
- the duration of the pauses reflects the speed with which the user announced the user utterance.
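The two averaging conventions described above (dividing the silent time by the N words, or by the N-1 gaps between them) can be sketched as follows; the helper name is illustrative and not part of the patent:

```python
def average_pause(total_duration, speech_time, n_words, per_gap=False):
    """Average pause duration for an utterance of n_words words.

    total_duration: clocked duration T of the whole user utterance.
    speech_time:    speech time ST1 of the combined words without pauses.
    per_gap:        False divides by N, as in (T - ST1)/N; True divides
                    by the N - 1 gaps between words, as in (T - ST1)/(N - 1).
    """
    silence = total_duration - speech_time
    divisor = (n_words - 1) if per_gap else n_words
    return silence / divisor

# A 5.0 s utterance of 10 digits containing 3.0 s of speech:
average_pause(5.0, 3.0, 10)                # (T - ST1)/N     -> 0.2 s
average_pause(5.0, 3.0, 10, per_gap=True)  # (T - ST1)/(N-1) -> ~0.22 s
```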
- an algorithm in the processor 122 determines that the user utterance is a phone number of ten digits based on the phonemes recognized during stage S 302 described above with reference to method 300 of FIG. 3 .
- the processor 122 sequentially segments the user utterance into a first set of three digits, a second set of three digits and a set of four digits, which correlate to the area code numbers, the local code numbers and the last four digits of the ten digit phone number, respectively.
- the clock (not shown) in the processor 122 is set to zero when the user utterance begins, and the time on the clock as the user utterance is received is periodically stored with a correlated currently received pause or phoneme in a received-time look-up table in the in-vehicle memory 128 .
- the period with which the user utterance is stored is small enough to include at least one time notation for each phoneme and each pause.
- the processor 122 determines the following from the received-time look-up table: the time duration for the first set of three digits, T 1 , which correlates to summing segments 308 , 310 , 312 , 314 , and 316 in FIG. 3 a ; the time duration for the second set of three digits, T 2 , which correlates to summing segments 322 , 324 , 326 , 328 , and 330 in FIG. 3 b ; and the time duration for the set of four digits, T 3 , which correlates to summing segments 336 , 338 , 340 , 342 , 344 , 346 , and 348 in FIG. 3 c .
- the asymmetric pause segment 310 and segment 314 are averaged as P 1 .
- the asymmetric pause segment 324 and segment 328 are averaged as P 2 .
- the three asymmetric pause segments 338 , 342 and 346 are averaged as P 3 .
- the processor 122 also calculates the sum (2×P 1 )+(2×P 2 )+(3×P 3 ), which is the total duration of the pauses within the segments of the user utterance.
- the processor 122 retrieves the time duration of the user utterance from the in-vehicle memory 128 .
- the processor 122 calculates an average pause P 4 between the first set of three digits and the second set of three digits and between the second set of three digits and the set of four digits.
- a first pause P 4 is located between WT 3 and WT 4 and a second pause P 4 is located between WT 6 and WT 7 .
- other calculations are performed to determine the duration of the pauses.
- the processor 122 stores the average time durations P 1 , P 2 , P 3 , and P 4 of the pauses as well as the clock based time durations T 1 , T 2 , T 3 , and T 4 of the sets of numbers in an in-vehicle memory 128 .
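For the 3-3-4 segmentation of a ten digit number, the averages P 1 through P 4 described above can be sketched as follows; this is a hypothetical helper that assumes the segment durations have already been measured:

```python
def phone_pause_profile(intra_set_pauses, utterance_time, set_times):
    """Average pause durations for a ten digit number segmented 3-3-4.

    intra_set_pauses: three lists of measured pause durations, one list
                      per digit set (e.g. segments 310/314, 324/328 and
                      338/342/346 of FIGS. 3a-3c).
    utterance_time:   total clocked duration of the user utterance.
    set_times:        durations T1, T2, T3 of the three digit sets,
                      each including its internal pauses.
    Returns (P1, P2, P3, P4); P4 is the average of the two pauses that
    separate the digit sets.
    """
    p1, p2, p3 = (sum(p) / len(p) for p in intra_set_pauses)
    # Whatever time the three sets do not account for is split evenly
    # over the two inter-set gaps.
    p4 = (utterance_time - sum(set_times)) / 2
    return p1, p2, p3, p4
```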
- an algorithm in the processor 122 determines that the user utterance is a phone number of seven digits based on the phonemes recognized during stage S 302 described above with reference to method 300 of FIG. 3 .
- the processor 122 sequentially segments the user utterance into a set of three digits and a set of four digits, which correlate to the local code number and the last four digits of the seven digit phone number, respectively.
- the processor 122 determines the average time durations of the pauses within the set of three digits, the average time durations of the pauses within the set of four digits, the time of input for the seven digits in the phone number, and the pause between the set of three digits and the set of four digits, in a manner similar to that described above for the ten digit phone number.
- the clock in the processor 122 tracks the beginning time and end time of each word and each pause.
- the processor 122 calculates the time duration for each pause based on the difference between the beginning time and end time of each pause and stores the time duration of a pause with the correlated pause in a measured-pause look-up table in the in-vehicle memory 128 .
- the duration of the word time segments 308 , 312 , 316 , 322 , 326 , 330 , 336 , 340 , 344 , and 348 in the exemplary ten digit number represented in FIGS. 3 a , 3 b and 3 c are each stored in the look-up table.
- the duration of pause segments 310 , 314 , 324 , 328 , 338 , 342 , and 346 in the exemplary ten digit number represented in FIGS. 3 a , 3 b and 3 c are each stored in the look-up table along with the duration of the pause between word time segment 316 and segment 322 and the duration of the pause between word time segment 330 and segment 336 .
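A measured-pause look-up table such as the one described above can be sketched as a mapping from pause segments to clocked durations; the event format here is an assumption:

```python
def measured_pause_table(events):
    """Build a measured-pause look-up table from clocked begin/end times.

    events: (label, kind, begin, end) tuples in utterance order, where
            kind is "word" or "pause" and begin/end are clock times in
            seconds.  Only the pauses are tabulated.
    """
    return {label: end - begin
            for label, kind, begin, end in events
            if kind == "pause"}
```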
- the user utterance is a seven-digit phone number that is announced with fewer than seven spoken words.
- the number 211-2000 can be announced as “Two, eleven, two-thousand,” as “Twenty-one, one, twenty, zero, zero” or as “Two, one, one, two-thousand.”
- the user may or may not have included a pause within the phrase “two-thousand.”
- the processor 122 can determine the duration of each pause using an average pause length or a measured pause length as described above. The processor 122 recognizes that the user utterance correlates to the seven-digit number of 2, 1, 1, 2, 0, 0, 0.
- each pause is associated with a recognized phoneme adjacent to the user pause.
- the phoneme adjacent to the user pause is the phoneme immediately preceding the user pause.
- segment 310 is associated with segment 308 , which represents a first phoneme for a first digit, while segment 314 is associated with segment 312 , which represents a second phoneme for a second digit.
- the phoneme adjacent to the user pause is the phoneme immediately following the user pause.
- segment 310 is associated with the segment 312 , which represents a second phoneme for the second digit and segment 314 is associated with segment 316 , which represents a third phoneme for a third digit.
- the pauses are assigned a time duration equal to the time duration determined during stage S 402 .
- the processor 122 in the telematics unit 120 concatenates the user pauses each having an assigned time duration and the associated adjacent phonemes to form the data string.
- the data string is a digital data string.
- the data string is represented, herein, as a list of numbers with one or more markers between the numbers. Each type of marker represents a pause having a given duration.
- the concatenated data string for the phone number 555-222-3333 is represented as 5*5*5*2*2*2*3*3*3*3.
- the concatenated data string for the phone number 211-2000 is represented as 2*11*2000.
- the concatenated data string for the phone number 211-2000 is represented as 21*1*20*0*0.
- the concatenated data string for the phone number 555-222-3333 is represented as 5+5+5+2+2+2+3+3+3+3.
- marker [p 1 ] represents the average time duration P 1 for the pauses between the first set of three digits
- marker [p 2 ] represents the average time duration P 2 for the pauses between the second set of three digits
- marker [p 3 ] represents the average time duration P 3 for the pauses between the set of four digits
- marker [p 4 ] represents the average pause P 4 between the first set of three digits and the second set of three digits and between the second set of three digits and the set of four digits.
- average time duration P 1 is the average time duration of segment 310 and segment 314 .
- average time duration P 2 is the average time duration of segment 324 and segment 328 .
- average time duration P 3 is the average time duration of segment 338 , segment 342 , and segment 346 .
- the concatenated data string for the phone number 555-666-7777 is represented as 5[p 1 ]5[p 1 ]5[p 4 ]6[p 2 ]6[p 2 ]6[p 4 ]7[p 3 ]7[p 3 ]7[p 3 ]7.
- when the number 211-2000 is announced as “Two, eleven, two-thousand,” or “Twenty-one, one, twenty, zero, zero,” there is not a set of three numbers and a set of four numbers on which the algorithm can operate.
- an under-score marker _ represents a pause having the duration of time-unit T.
- N under-scores in an uninterrupted sequence represent a pause having the duration of N×T.
- the concatenated data string is represented as 5_5_5____2_2_2__3___3_3_3.
- the concatenated data string for the phone number 211-2000 is represented as 2_11__2000.
- the concatenated data string for the phone number 211-2000 is represented as 21_1___20__0_0.
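The under-score notation above can be sketched as a simple renderer that quantizes each measured pause into time-units of duration T; the helper name and input format are assumptions:

```python
def concatenate(words, pauses, time_unit=0.1):
    """Render a data string in the under-score notation.

    words:     the spoken tokens in order, e.g. ["2", "11", "2000"].
    pauses:    measured pause durations (seconds) between consecutive
               words; len(pauses) == len(words) - 1.
    time_unit: the duration T one under-score represents (the text uses
               1/10 s or 1/100 s units).
    """
    out = [words[0]]
    for word, pause in zip(words[1:], pauses):
        out.append("_" * round(pause / time_unit))  # N under-scores = N x T
        out.append(word)
    return "".join(out)

# "Two, eleven, two-thousand" with 0.1 s and 0.2 s pauses:
concatenate(["2", "11", "2000"], [0.1, 0.2])  # -> "2_11__2000"
```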
- FIG. 5 illustrates a first embodiment of a method 500 for playing back the data string in accordance with the present invention.
- the telematics unit 120 , the processor 122 , the in-vehicle memory 128 , and the one or more speakers 132 have stored in a computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 500 .
- the telematics unit 120 receives a command from the user to perform an operation requiring playing back the data string that was stored in the in-vehicle memory 128 during stage S 208 .
- An exemplary command requiring play back of a data string occurs when the user provides the command “Call John”: the telematics unit 120 retrieves the phone number correlated to “John” and provides a virtual utterance to the user.
- the virtual utterance is the phone number “555-222-3333,” which is announced prior to establishing the wireless connection with 555-222-3333. The virtual utterance allows the user to confirm the phone number.
- the virtual utterance is a voice prompt, which announces “Calling John at 5, 5, 5, 2, 2, 2, 3, 3, 3, 3.” If the virtual utterance includes the phone number 211-2000, the virtual utterance allowing the user to confirm the phone number may announce “Calling John at 21, 1, 20, 0, 0.”
- the telematics unit 120 retrieves the data string from the in-vehicle memory 128 for playback.
- the speech recognition engine recognizes the command and generates an instruction to retrieve the data string from the in-vehicle memory 128 .
- the retrieve instruction directs the processor 122 to a look-up table that correlates the recognized command with the storage address in the in-vehicle memory 128 .
- the processor 122 retrieves the data string in digital format from the in-vehicle memory 128 .
- the data string is then played back according to the method described in stage S 212 with reference to method 200 in FIG. 2 .
- FIG. 6 illustrates a second embodiment of a method 600 for playing back the data string corresponding to the spoken phrase in accordance with the present invention.
- the spoken phrase is played back to the user at a volume that matches the volume of the command received from the user.
- the telematics unit 120 , the processor 122 , the in-vehicle memory 128 , and the one or more speakers 132 have stored in a computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 600 .
- in stage S 602 , the telematics unit 120 receives a command from the user to perform an operation requiring playing back the data string.
- S 602 is implemented as stage S 402 described above with reference to method 400 in FIG. 4 .
- the telematics unit 120 determines a user-volume value of the command received during stage S 602 .
- the processor 122 evaluates the audio signal generated by the command at one or more preset ranges of frequencies.
- the preset ranges of frequencies are stored, for example, in the in-vehicle memory 128 .
- the processor 122 determines the signal strength at one or more preset ranges of frequencies and calculates a user-volume value from the determined signal strengths.
- a high signal strength or low signal strength may correlate to a command that is spoken loudly or softly, respectively, and is a function of the calibration of the speech recognition engine that analyzes the signal.
- the processor 122 evaluates a wave envelope that encompasses all the received frequencies for the audio signal, determines the maximum signal strength of the envelope, and calculates a user-volume value from the determined maximum signal strength. In another embodiment, the processor 122 evaluates a wave envelope that encompasses all the received frequencies for the audio signal, determines the average signal strength of the envelope, and calculates a user-volume value from the determined average signal strength.
- the processor 122 evaluates the audio signal generated by the command for each word of the command and determines a user-volume value for each word. In this case, the processor 122 averages the user-volume values of all the words of the command and uses that average as the user-volume value.
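The peak-strength and average-strength variants of the user-volume calculation described above can be sketched as follows; the sample representation and scaling are assumptions, not the patented calibration:

```python
def user_volume(samples, use_peak=False):
    """Estimate a user-volume value from a command's wave envelope.

    samples:  audio samples of the spoken command, normalized to [-1, 1].
    use_peak: True uses the maximum signal strength of the envelope;
              False uses the average signal strength.
    """
    magnitudes = [abs(s) for s in samples]
    if use_peak:
        return max(magnitudes)
    return sum(magnitudes) / len(magnitudes)
```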
- the audio signal is operable to generate acoustic waves within a preset volume from one or more speakers 132 .
- the telematics unit 120 correlates a volume value in the telematics unit 120 to the user-volume value determined during stage S 604 before the prompt is announced to the user responsive to the command received during stage S 602 .
- the correlation between the volume value and the user-volume value is calibrated in the telematics unit 120 .
- the user-volume value determined by the processor 122 is correlated to a programmable volume control on the one or more speakers 132 .
- the processor 122 transmits instructions to adjust a voltage level of a processor in the one or more speakers 132 .
- the user-volume value determined by the processor 122 is correlated to set voltage levels of the audio signal sent to the one or more speakers 132 from the speech recognition engine.
- the processor 122 transmits instructions to adjust a voltage level of the audio signals to the speech generation algorithm in a speech recognition engine.
- the telematics unit 120 plays back the data string to the user with a volume that matches the volume of the received command.
- the speech generation algorithm in the speech recognition engine in the processor 122 converts the digital signals of the data string into the phonemes and pauses of the user utterance.
- the phonemes and pauses are sent from the processor 122 out through the one or more speakers 132 as acoustic waves of the virtual utterance. The acoustic waves are heard by the user as the user utterance, including a plurality of words and a plurality of user pauses between the words.
- Since the volume value of the speech generation algorithm is correlated to the user-volume value, the one or more speakers 132 generate acoustic waves at amplitudes that are equivalent to the acoustic waves of the command received during stage S 602 . In this manner, when the telematics unit 120 is being used in excessively windy or noisy background conditions, the volume of the virtual utterances increases because the volume of the input speaker command is measured at the microphone. Likewise, quiet background conditions and soft speakers initiate reduced-volume virtual utterances.
- FIG. 7 illustrates a first embodiment of a method for setting frequency parameters in a speech generation algorithm in accordance with the present invention.
- the user informs a communication services advisor 178 at the call center 170 that the user does not hear one or more ranges of frequencies.
- the call center 170 , the telematics unit 120 , and the processor 122 have stored in a computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 700 .
- the telematics unit 120 provides one or more user-selected ranges of frequencies to a call center (CC) 170 .
- the user initiates a call to the call center 170 by, for example, a button push on a button in communication with the telematics unit 120 .
- the button push establishes a connection between the telematics unit 120 and the call center 170 via one or more wireless carrier systems 140 , one or more communication networks 142 , one or more land networks 144 , one or more client, personal or user computers 150 , and one or more web-hosting portals 160 .
- the user informs the communication services advisor 178 about the user's hearing limitations.
- the telematics unit 120 receives a frequency-adjustment algorithm from the call center 170 based on the user-selected ranges of frequencies.
- the communication services advisor 178 generates a frequency-adjustment algorithm at the call center 170 and transmits the frequency-adjustment algorithm to the telematics unit 120 .
- the frequency-adjustment algorithm is generated when the communication services advisor 178 inputs the range or ranges of frequencies that are inaudible to the user into a table and applies a frequency-modifying algorithm to the table.
- the communication services advisor 178 transmits the table to the telematics unit 120 and the processor 122 generates the frequency-adjustment algorithm.
- the telematics unit 120 applies the frequency-adjustment algorithm to the speech generation algorithm, so that the speech generation algorithm is programmed to generate signals operable to produce acoustic waves at one or more speakers 132 having frequencies only within the user-selected range of frequencies.
- the modified speech generation algorithm is now the default speech generation algorithm. In this manner, if a user does not hear the high end of the frequency spectrum, the telematics unit 120 provides prompts that are in the low end of the frequency spectrum.
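One way to picture the frequency-adjustment algorithm is as a filter that discards prompt components outside the user-selected ranges. This is a minimal sketch under that assumption, not the patented table-based implementation:

```python
def adjust_frequencies(prompt_freqs, audible_ranges):
    """Keep only the prompt components the user reports as audible.

    prompt_freqs:   component frequencies (Hz) the speech generation
                    algorithm would emit for a prompt.
    audible_ranges: (low, high) Hz pairs the user can hear.
    """
    return [f for f in prompt_freqs
            if any(lo <= f <= hi for lo, hi in audible_ranges)]

# A user who cannot hear above 4 kHz keeps only the low-end components:
adjust_frequencies([300, 1200, 6000], [(20, 4000)])  # -> [300, 1200]
```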
- FIG. 8 illustrates a second embodiment of a method 800 for setting frequency parameters in a speech generation algorithm in accordance with the present invention.
- the user can listen to audio-selection tests at the telematics unit 120 .
- the user can then select the acoustic ranges, which are most audible and/or pleasing to the user.
- the telematics unit 120 , the processor 122 , the in-vehicle memory 128 , and the one or more speakers 132 have stored in a computer readable medium at least one computer program including computer readable code to perform the operations described with reference to method 800 .
- the telematics unit 120 provides a menu of selectable ranges of frequencies to the user for an audio-selection test.
- the menu can be displayed on a display 136 in the MVCU 110 .
- the menu can provide more than one range of frequencies that, in combination, cover the range of frequencies heard by humans.
- the frequency range is selected or deselected by touching the screen, or otherwise providing an input, on or near each range of frequencies. If the range is currently selected, touching the range deselects it. If the range is not currently selected, touching the range selects it. Once a set of ranges is selected, the user can touch an ENTER box in the displayed menu to indicate that the selected ranges of frequencies are to be used in the audio-selection test.
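The select/deselect behavior of the menu can be sketched as a toggle over the set of currently selected ranges; representing a range as a (low, high) pair is an assumption for illustration:

```python
def toggle(selected, freq_range):
    """Touching a range flips its state: a selected range is deselected,
    an unselected range is selected."""
    updated = set(selected)
    updated.symmetric_difference_update({freq_range})
    return updated

# Touching (20, 4000) twice returns to the original selection:
toggle(toggle(set(), (20, 4000)), (20, 4000))  # -> set()
```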
- the display 136 is on a display in the embedded or in-vehicle portable communication device 134 .
- the telematics unit 120 communicates with the embedded or in-vehicle portable communication device 134 via a short-range wireless connection as described above with reference to FIG. 1 .
- the telematics unit 120 receives one or more test ranges of frequencies based on the user's selection of one or more ranges of frequencies from the menu.
- the processor 122 in the telematics unit 120 identifies the ranges of frequencies selected by the user during stage S 802 as the test ranges of frequencies.
- the telematics unit 120 generates a test frequency adjustment algorithm based on the test ranges of frequencies.
- the processor 122 generates a frequency-adjustment algorithm by entering the user-selected test range or test ranges of frequencies into a table and applying a frequency-modifying algorithm to the table.
- the processor 122 generates a frequency-adjustment algorithm by entering the test range or test ranges of frequencies deselected by the user into a table and applying a frequency-modifying algorithm to the table.
- the telematics unit 120 applies the test frequency-adjustment algorithm to the speech generation algorithm.
- the telematics unit 120 applies the frequency-adjustment algorithm to the speech generation algorithm, so that the speech generation algorithm is temporarily programmed to generate signals operable to produce acoustic waves at one or more speakers 132 having frequencies only within the test range of frequencies when playing back the data string.
- the telematics unit 120 plays back a test-data-string to generate acoustic waves having frequencies only in the test ranges of frequencies for the user to hear.
- the play back of the test-data-string is an audio-selection test.
- the user is provided with the same menu that was provided during stage S 802 .
- the selected test ranges of frequencies used in the last audio-selection test are indicated on the menu for the user.
- the user can see the menu with test ranges of frequencies during the audio-selection test.
- the test data string can be the numbers from zero (0) to nine (9).
- the user can repeat the flow of method 800 from stages S 802 to S 810 as many times as he or she desires. In this manner, the user can hear many different audio-selection tests.
- the telematics unit 120 receives a set-command from the user.
- the set-command is operable to set user-selected ranges of frequencies based on hearing one or more audio-selection tests.
- the user selects the desired ranges of frequencies from the selectable test ranges of frequencies and presses a set button.
- the set button is displayed on the menu with the desired ranges of frequencies.
- the set button is a button in the telematics unit 120 .
- a set-command is generated in the telematics unit 120 by the pressing of the set button, or otherwise providing a set input.
- the set-command sets the user-selected ranges of frequencies as the default ranges of frequencies for all prompts generated in the telematics unit.
- the default ranges of frequencies are set until the user repeats stages S 802 -S 812 of method 800 .
- the telematics unit 120 repeats the test-data-string, to generate acoustic waves having frequencies only in the user-selected ranges of frequencies, and then prompts the user to confirm the set-command.
- the telematics unit 120 generates a frequency-adjustment algorithm based on the user-selected ranges of frequencies.
- the processor 122 generates a frequency-adjustment algorithm by entering the user-selected ranges of frequencies into a table and applying a frequency-modifying algorithm to the table.
- the processor 122 generates a frequency-adjustment algorithm by entering the ranges of frequencies that were not selected by the user into a table and applying a frequency-modifying algorithm to that table.
- the telematics unit 120 applies the frequency-adjustment algorithm to the speech generation algorithm, so that the speech generation algorithm is programmed to generate acoustic waves having frequencies only within the user-selected range of frequencies when playing back the data string.
- the telematics unit 120 plays back data strings that mimic the pattern of a user's speech at the volume level of the command that generated the prompt and in the ranges of frequencies that are audible to the user of the telematics unit. In another embodiment, the telematics unit 120 plays back data strings that mimic the pattern of a user's speech in the ranges of frequencies that are audible to the user of the telematics unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/105,076 US7689423B2 (en) | 2005-04-13 | 2005-04-13 | System and method of providing telematically user-optimized configurable audio |
Publications (2)
Publication Number | Publication Date |
---|---|
US20060235688A1 US20060235688A1 (en) | 2006-10-19 |
US7689423B2 true US7689423B2 (en) | 2010-03-30 |
Family
ID=37109648
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/105,076 Expired - Fee Related US7689423B2 (en) | 2005-04-13 | 2005-04-13 | System and method of providing telematically user-optimized configurable audio |
Country Status (1)
Country | Link |
---|---|
US (1) | US7689423B2 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080221882A1 (en) * | 2007-03-06 | 2008-09-11 | Bundock Donald S | System for excluding unwanted data from a voice recording |
US20120060090A1 (en) * | 2010-07-29 | 2012-03-08 | Ubersox George C | System for Automatic Mouse Control |
US20120271640A1 (en) * | 2010-10-15 | 2012-10-25 | Basir Otman A | Implicit Association and Polymorphism Driven Human Machine Interaction |
US9886862B1 (en) | 2016-12-23 | 2018-02-06 | X Development Llc | Automated air traffic communications |
US10347136B2 (en) | 2016-12-23 | 2019-07-09 | Wing Aviation Llc | Air traffic communication |
US11190155B2 (en) | 2019-09-03 | 2021-11-30 | Toyota Motor North America, Inc. | Learning auxiliary feature preferences and controlling the auxiliary devices based thereon |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4859642B2 (en) * | 2006-11-30 | 2012-01-25 | 富士通株式会社 | Voice information management device |
WO2009027980A1 (en) * | 2007-08-28 | 2009-03-05 | Yissum Research Development Company Of The Hebrew University Of Jerusalem | Method, device and system for speech recognition |
JP2009244639A (en) * | 2008-03-31 | 2009-10-22 | Sanyo Electric Co Ltd | Utterance device, utterance control program and utterance control method |
US8983841B2 (en) * | 2008-07-15 | 2015-03-17 | At&T Intellectual Property, I, L.P. | Method for enhancing the playback of information in interactive voice response systems |
US8798847B2 (en) * | 2012-05-16 | 2014-08-05 | The Morey Corporation | Method and system for remote diagnostics of vessels and watercrafts |
US9294859B2 (en) | 2013-03-12 | 2016-03-22 | Google Technology Holdings LLC | Apparatus with adaptive audio adjustment based on surface proximity, surface type and motion |
JP6244658B2 (en) | 2013-05-23 | 2017-12-13 | 富士通株式会社 | Audio processing apparatus, audio processing method, and audio processing program |
US9787273B2 (en) * | 2013-06-13 | 2017-10-10 | Google Technology Holdings LLC | Smart volume control of device audio output based on received audio input |
JP6454514B2 (en) * | 2014-10-30 | 2019-01-16 | 株式会社ディーアンドエムホールディングス | Audio device and computer-readable program |
US10192546B1 (en) * | 2015-03-30 | 2019-01-29 | Amazon Technologies, Inc. | Pre-wakeword speech processing |
US9548958B2 (en) * | 2015-06-16 | 2017-01-17 | International Business Machines Corporation | Determining post velocity |
KR101942521B1 (en) | 2015-10-19 | 2019-01-28 | 구글 엘엘씨 | Speech endpointing |
US10269341B2 (en) | 2015-10-19 | 2019-04-23 | Google Llc | Speech endpointing |
US20170110118A1 (en) * | 2015-10-19 | 2017-04-20 | Google Inc. | Speech endpointing |
US10103699B2 (en) * | 2016-09-30 | 2018-10-16 | Lenovo (Singapore) Pte. Ltd. | Automatically adjusting a volume of a speaker of a device based on an amplitude of voice input to the device |
US10929754B2 (en) | 2017-06-06 | 2021-02-23 | Google Llc | Unified endpointer using multitask and multidomain learning |
EP4083998A1 (en) | 2017-06-06 | 2022-11-02 | Google LLC | End of query detection |
US10089305B1 (en) * | 2017-07-12 | 2018-10-02 | Global Tel*Link Corporation | Bidirectional call translation in controlled environment |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4870686A (en) * | 1987-10-19 | 1989-09-26 | Motorola, Inc. | Method for entering digit sequences by voice command |
US6308154B1 (en) * | 2000-04-13 | 2001-10-23 | Rockwell Electronic Commerce Corp. | Method of natural language communication using a mark-up language |
US20020042709A1 (en) * | 2000-09-29 | 2002-04-11 | Rainer Klisch | Method and device for analyzing a spoken sequence of numbers |
US6470315B1 (en) * | 1996-09-11 | 2002-10-22 | Texas Instruments Incorporated | Enrollment and modeling method and apparatus for robust speaker dependent speech models |
US20030023439A1 (en) * | 2001-05-02 | 2003-01-30 | Gregory Ciurpita | Method and apparatus for automatic recognition of long sequences of spoken digits |
US20030055653A1 (en) * | 2000-10-11 | 2003-03-20 | Kazuo Ishii | Robot control apparatus |
US20030168844A1 (en) | 2001-12-07 | 2003-09-11 | Borroni-Bird Christopher E. | Chassis with energy-absorption zones |
US20030185162A1 (en) | 2002-03-28 | 2003-10-02 | General Motors Corporation | Method and system for dynamically determining sleep cycle values in a quiescent mobile vehicle |
US20030232619A1 (en) | 2002-06-18 | 2003-12-18 | General Motors Corporation | Method and system for communicating with a vehicle in a mixed communication service environment |
US20040073423A1 (en) * | 2002-10-11 | 2004-04-15 | Gordon Freedman | Phonetic speech-to-text-to-speech system and method |
US6839670B1 (en) * | 1995-09-11 | 2005-01-04 | Harman Becker Automotive Systems Gmbh | Process for automatic control of one or more devices by voice commands or by real-time voice dialog and apparatus for carrying out this process |
US20050137860A1 (en) * | 2003-12-22 | 2005-06-23 | Samsung Electronics Co., Ltd. | Apparatus and method for controlling frequency band considering individual auditory characteristic in a mobile communication system |
US20060047373A1 (en) | 2004-08-30 | 2006-03-02 | General Motors Corporation | Providing services within a telematics communication system |
US7031924B2 (en) * | 2000-06-30 | 2006-04-18 | Canon Kabushiki Kaisha | Voice synthesizing apparatus, voice synthesizing system, voice synthesizing method and storage medium |
Non-Patent Citations (2)
Title |
---|
U.S. Appl. No. 10/212,957, filed Aug. 6, 2002, Fraser et al. |
U.S. Appl. No. 10/837,935, filed May 3, 2004, Oesterling et al. |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080221882A1 (en) * | 2007-03-06 | 2008-09-11 | Bundock Donald S | System for excluding unwanted data from a voice recording |
US20120060090A1 (en) * | 2010-07-29 | 2012-03-08 | Ubersox George C | System for Automatic Mouse Control |
US20120271640A1 (en) * | 2010-10-15 | 2012-10-25 | Basir Otman A | Implicit Association and Polymorphism Driven Human Machine Interaction |
US9886862B1 (en) | 2016-12-23 | 2018-02-06 | X Development Llc | Automated air traffic communications |
US10186158B2 (en) | 2016-12-23 | 2019-01-22 | X Development Llc | Automated air traffic communications |
US10347136B2 (en) | 2016-12-23 | 2019-07-09 | Wing Aviation Llc | Air traffic communication |
US10593219B2 (en) | 2016-12-23 | 2020-03-17 | Wing Aviation Llc | Automated air traffic communications |
US11190155B2 (en) | 2019-09-03 | 2021-11-30 | Toyota Motor North America, Inc. | Learning auxiliary feature preferences and controlling the auxiliary devices based thereon |
Also Published As
Publication number | Publication date |
---|---|
US20060235688A1 (en) | 2006-10-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7689423B2 (en) | System and method of providing telematically user-optimized configurable audio | |
US8175886B2 (en) | Determination of signal-processing approach based on signal destination characteristics | |
US7219063B2 (en) | Wirelessly delivered owner's manual | |
US8005668B2 (en) | Adaptive confidence thresholds in telematics system speech recognition | |
CN1941079B (en) | Speech recognition method and system | |
US8880402B2 (en) | Automatically adapting user guidance in automated speech recognition | |
US20050065779A1 (en) | Comprehensive multiple feature telematics system | |
US20120203557A1 (en) | Comprehensive multiple feature telematics system | |
CN102543077B (en) | Male acoustic model adaptation method based on language-independent female speech data | |
US20160071518A1 (en) | Service Oriented Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle User Interfaces Requiring Minimal Cognitive Driver Processing for Same | |
US9082414B2 (en) | Correcting unintelligible synthesized speech | |
US7844246B2 (en) | Method and system for communications between a telematics call center and a telematics unit | |
US20030050783A1 (en) | Terminal device, server device and speech recognition method | |
EP1739546A2 (en) | Automobile interface | |
US20070174055A1 (en) | Method and system for dynamic nametag scoring | |
JP2000194386A (en) | Voice recognizing and responsing device | |
CN104426998A (en) | Vehicle telematics unit and method of operating the same | |
US7711358B2 (en) | Method and system for modifying nametag files for transfer between vehicles | |
US20060265217A1 (en) | Method and system for eliminating redundant voice recognition feedback | |
US7596370B2 (en) | Management of nametags in a vehicle communications system | |
US7986974B2 (en) | Context specific speaker adaptation user interface | |
US7328159B2 (en) | Interactive speech recognition apparatus and method with conditioned voice prompts | |
CN102623006A (en) | Mapping obstruent speech energy to lower frequencies | |
US8433570B2 (en) | Method of recognizing speech | |
US20070136063A1 (en) | Adaptive nametag training with exogenous inputs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: GENERAL MOTORS CORPORATION, MICHIGAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BICEGO, JAMES E.;SUMCAD, ANTHONY J.;PATENAUDE, RUSSELL A.;REEL/FRAME:016479/0364 Effective date: 20050412 |
|
AS | Assignment |
Owner name: UNITED STATES DEPARTMENT OF THE TREASURY, DISTRICT Free format text: SECURITY AGREEMENT;ASSIGNOR:GENERAL MOTORS CORPORATION;REEL/FRAME:022191/0254 Effective date: 20081231 |
|
AS | Assignment |
Owner name: CITICORP USA, INC. AS AGENT FOR BANK PRIORITY SECU Free format text: SECURITY AGREEMENT;ASSIGNOR:GENERAL MOTORS CORPORATION;REEL/FRAME:022552/0006 Effective date: 20090409 Owner name: CITICORP USA, INC. AS AGENT FOR HEDGE PRIORITY SEC Free format text: SECURITY AGREEMENT;ASSIGNOR:GENERAL MOTORS CORPORATION;REEL/FRAME:022552/0006 Effective date: 20090409 |
|
AS | Assignment |
Owner name: MOTORS LIQUIDATION COMPANY (F/K/A GENERAL MOTORS C Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:UNITED STATES DEPARTMENT OF THE TREASURY;REEL/FRAME:023119/0491 Effective date: 20090709 |
|
AS | Assignment |
Owner name: MOTORS LIQUIDATION COMPANY (F/K/A GENERAL MOTORS C Free format text: RELEASE BY SECURED PARTY;ASSIGNORS:CITICORP USA, INC. AS AGENT FOR BANK PRIORITY SECURED PARTIES;CITICORP USA, INC. AS AGENT FOR HEDGE PRIORITY SECURED PARTIES;REEL/FRAME:023119/0817 Effective date: 20090709 Owner name: MOTORS LIQUIDATION COMPANY, MICHIGAN Free format text: CHANGE OF NAME;ASSIGNOR:GENERAL MOTORS CORPORATION;REEL/FRAME:023129/0236 Effective date: 20090709 |
|
AS | Assignment |
Owner name: GENERAL MOTORS COMPANY, MICHIGAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTORS LIQUIDATION COMPANY;REEL/FRAME:023148/0248 Effective date: 20090710 Owner name: UNITED STATES DEPARTMENT OF THE TREASURY, DISTRICT Free format text: SECURITY AGREEMENT;ASSIGNOR:GENERAL MOTORS COMPANY;REEL/FRAME:023155/0814 Effective date: 20090710 Owner name: UAW RETIREE MEDICAL BENEFITS TRUST, MICHIGAN Free format text: SECURITY AGREEMENT;ASSIGNOR:GENERAL MOTORS COMPANY;REEL/FRAME:023155/0849 Effective date: 20090710 |
|
AS | Assignment |
Owner name: GENERAL MOTORS LLC, MICHIGAN Free format text: CHANGE OF NAME;ASSIGNOR:GENERAL MOTORS COMPANY;REEL/FRAME:023504/0691 Effective date: 20091016 |
|
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: GM GLOBAL TECHNOLOGY OPERATIONS, INC., MICHIGAN Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:UAW RETIREE MEDICAL BENEFITS TRUST;REEL/FRAME:025311/0770 Effective date: 20101026 Owner name: GM GLOBAL TECHNOLOGY OPERATIONS, INC., MICHIGAN Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:UNITED STATES DEPARTMENT OF THE TREASURY;REEL/FRAME:025245/0442 Effective date: 20100420 |
|
AS | Assignment |
Owner name: WILMINGTON TRUST COMPANY, DELAWARE Free format text: SECURITY AGREEMENT;ASSIGNOR:GENERAL MOTORS LLC;REEL/FRAME:025327/0196 Effective date: 20101027 |
|
FPAY | Fee payment |
Year of fee payment: 4 |
|
AS | Assignment |
Owner name: GENERAL MOTORS LLC, MICHIGAN Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:WILMINGTON TRUST COMPANY;REEL/FRAME:034183/0436 Effective date: 20141017 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.) |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.) |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20180330 |