METHOD AND APPARATUS FOR PERFORMING HANDSFREE OPERATIONS AND VOICING TEXT WITH A CDMA TELEPHONE
REFERENCE TO PROVISIONAL APPLICATION
This application claims the benefit of U.S. Provisional Application 60/079,406, filed March 25, 1998.
BACKGROUND OF THE INVENTION
I. Field of the Invention
The present invention relates to communications systems. More particularly, the present invention relates to a novel and improved method for performing handsfree operations and voicing text with a CDMA telephone.
II. Description of the Related Art
Advances made in the development of cellular communications have resulted in a proliferation of cellular telephone users. Smaller, lighter cellular telephones with ever increasing voice quality and battery life draw more and more users each year to the convenience of cellular telephones. The convenience of these telephones makes them easy to use in a variety of locations. However, as more people use their telephones while operating a motor vehicle, the risks associated with such use increase as well.
A driver must be alert to respond to unanticipated changes in driving conditions while operating a motor vehicle. It is clear that any distraction which causes the driver to divert attention from the task of driving increases the risk of an accident resulting in property damage or injury. Cellular telephones are usually distributed with warning information suggesting that all calls be placed while the vehicle is stopped safely out of traffic. Even if that warning is heeded, it is not possible to pull over in anticipation of an incoming call.
To reduce the risks of cellular telephone use while driving, modern cellular telephones are often sold with car kits used to facilitate "handsfree" operation. In handsfree mode, occupants of a motor vehicle can be heard through a microphone, and the remote audio signal is delivered through an external speaker. These features eliminate the need for holding a handset while communicating. Some car kits interface with the audio system installed in the car.
These systems automatically disconnect any programming being played on the audio system, replacing it with the remote audio signal when a call is placed or received. The use of such a system eliminates the risk associated with the additional step of adjusting the car audio volume when a call is received or placed while driving. However, these systems still require driver interaction at the initiation and termination of a call.
Modern telephone systems offer advanced features such as caller identification, call waiting and multi-way conference calling. These features add convenience and enhance the productivity of the user. Unfortunately, these advanced features often require the user to look at the display located on the cellular telephone, and often require the user to enter data through keys located on a keypad on the cellular telephone. These activities are distracting to the driver and affect the safety of the driver as well as those around the driver.
The art of speech synthesis has seen many improvements, and today text to speech converters are commercially available. Cellular telephones which supply audio feedback are known in the art. One such system, U.S. Patent No. 5,140,632, entitled "TELEPHONE HAVING VOICE CAPABILITY ADAPTOR", teaches a way to provide audio feedback of numbers in response to keys pressed on a keypad. Another example is U.S. Patent No. 5,095,503, entitled "CELLULAR TELEPHONE CONTROLLER WITH SYNTHESIZED VOICE FEEDBACK FOR DIRECTORY NUMBER CONFIRMATION AND CALL STATUS".
Speech recognition is another area that now offers commercial solutions to those wishing to employ voice commands in a system. For example, speech recognition has been used to replace touch tone dialing for menu selection in automated telephone answering systems. In this example, a user may be prompted to say the number "one" to select a first option, or the number "two" to select a second option. Rather than pressing a key on the handset, the user speaks the desired selection, and the speech recognition software transforms the spoken command into a machine usable number.
Further improvements in the cellular telephone user interface which reduce driver distraction will result in safety benefits. In addition to safety issues, there is demand for cellular telephones which combine advanced features with ease of use.
Further, conventional wireless digital telephones have the capability of displaying text messages on a visual display unit. For example, text may be derived from information in a short message of the Short Message Service (SMS) defined in IS-637, which applies to Code Division Multiple Access (CDMA) wireless telephones as defined in IS-95. The text could also be derived from the contents of standard call setup messages, such as the name(s) or number(s) of the party making (or forwarding) a call to a wireless digital telephone. Other application text messages are also suitable. The information from which text messages are derived is included in the signaling message (or messages) used by a CDMA base station to establish communication with a wireless digital telephone. As such, this information consumes only a small portion of the base station's resources. It may nonetheless eliminate the need to establish a complete voice telephone call, and thus may eliminate a much greater use of resources. The telephone user may decide, for example, not to take the call (based on the caller's identity), or may not need to respond to the text message.
This text capability is substantially limited by the requirement that the person receiving the call must look at a visual display. This is often inconvenient, as when the person is driving an automobile.
SUMMARY OF THE INVENTION
In one aspect, the purpose of this invention is to reduce the manual manipulation of keys and visual inspection of displayed data required to operate a cellular telephone. This invention utilizes known speech synthesis and speech recognition techniques to provide handsfree operation which, in addition to allowing conversation to proceed handsfree, allows calls to be initiated, responded to, and terminated solely with the use of speech commands of the user. The user's speech commands are issued in conjunction with messages provided to the user over a loudspeaker. These, and other advanced telephony features, can be utilized by the user without the need to look at or handle the cellular telephone.
In the present invention, certain information is transmitted to a mobile station from a base station. This information is formatted into specific messages, of which there are a number of different types. A subset of the possible messages is identified, and those messages are converted to text. Utilizing text to speech technology, as is well known in the art, each textual message is transformed into an audible message directed to the user. For example, instead of simple caller identification, where an alphanumeric message is displayed on the cellular telephone when an incoming call is being received, the incoming call number can be translated into speech and provided audibly to the user. In this manner, the user is not required to divert attention to the display in order to determine whether or not to answer the call. In another variation, a database can be utilized indexing names based on telephone numbers. In this case, the name of a caller can be announced by the cellular telephone instead of just the number. A variety of other message types which contain information destined for the user can be identified and processed in like manner to provide audible transmission to the user.
In addition to audible feedback, as just described, the present invention provides a means to respond to the audible messages without resorting to a combination of viewing the keypad and manually depressing keys to initiate a command. Instead, a microphone capable of detecting the user's voice is deployed in connection with a speech recognition device. Voice commands are then processed in the mobile station. The processed messages are then acted on by the mobile station or formatted into messages for transmission to the base station as necessary. For example, an incoming call is announced with the standard ring and an audible declaration of the incoming phone number or the caller's name. The user can then respond verbally whether or not to accept the call. In another example, the user need not utilize the keypad to initiate a call. Verbal commands can be recognized to dial a number, or perhaps a name can be used to look up the appropriate number in the database. Following the verbal telephone number entry, a voiced "send" command can replace the now standard requirement of depressing the send button found on most cellular telephones.
In another aspect, Applicants have overcome the limitations of the prior art by newly combining two old elements: (a) a conventional wireless digital telephone, and (b) a conventional text-to-speech converter.
Wireless cellular/PCS systems based on the IS-95 CDMA standard (and probably those based on standards for other technologies as well) are capable of delivering the calling number and calling name to a wireless telephone. Existing text-to-speech technology allows the rendering into synthetic speech of a simple message, such as "You have a call from XXXX," where XXXX is the calling number or calling name delivered across the air interface. A "safe driver" adjunct to the telephone, or the telephone itself, could incorporate this capability. This would allow drivers to decide whether to answer a call without taking their eyes off the road. If no calling number or name is delivered across the air interface, then the adjunct or telephone would play a canned message, such as "You have a call from an unknown number." If the call had been forwarded, then the forwarding number or name could be used instead of the calling number or name.
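The selection of the announcement text described above can be illustrated with a minimal sketch. The function name, its parameters, and the preference for a name over a number are assumptions made for illustration only; the specification does not prescribe them. The returned string is what would be handed to a text-to-speech converter.

```python
from typing import Optional

UNKNOWN_CALLER = "an unknown number"  # canned wording taken from the text above


def build_announcement(calling_name: Optional[str] = None,
                       calling_number: Optional[str] = None,
                       forwarding_name: Optional[str] = None,
                       forwarding_number: Optional[str] = None) -> str:
    """Return the sentence to be voiced for an incoming call (hypothetical helper)."""
    # A forwarded call is announced by its forwarding identity, as the text allows.
    # Preferring a name over a number is an assumption, not stated in the text.
    candidates = (forwarding_name, forwarding_number, calling_name, calling_number)
    for candidate in candidates:
        if candidate:
            return f"You have a call from {candidate}."
    # Nothing was delivered across the air interface: play the canned message.
    return f"You have a call from {UNKNOWN_CALLER}."
```

For example, a call carrying only a calling number would be voiced with that number, while a call carrying no identity at all would fall through to the canned message.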
In a similar fashion, the adjunct or telephone could take advantage of text-to-speech technology to voice back the number of messages received by a subscriber to wireless Short Message Service, and the contents of these messages.
The Short Message Service message could be a conventional message or a broadcast message containing information similar to a highway billboard, such as commercials for products and establishments, or directions to the nearest food area, rest area, or gas station.
BRIEF DESCRIPTION OF THE DRAWINGS
The features, objects, and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference characters identify correspondingly throughout and wherein:
FIG. 1 is an illustration of an exemplary mobile station.
FIG. 2 is a block diagram of the prior art text display.
FIG. 3 is a block diagram of the voice text display of the present invention.
FIG. 4 is a flow chart showing the operation of the apparatus of FIG. 3.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIG. 1 depicts a first embodiment of a remote station employing the present invention. Incoming signals are received at antenna 10 and passed through duplexer 20 into receiver 90 where the signals are downconverted and amplified. The resultant signals are demodulated and decoded in demodulator/decoder 100. Message identifier 110 monitors the data produced by demodulator/decoder 100 and provides messages to control processor 60. The types of messages provided are a prescribed subset of the total number of message types, and the information contained in those messages is processed in accordance with the present invention.
Control processor 60 receives the messages provided by message identifier 110. Database 160 contains information used by control processor 60 to determine whether to deliver a message to message-to-text converter 120. For example, database 160 may contain a list of numbers from which an incoming call should be accepted. The user will not be notified of any incoming call from a number that is not on that list, and the call will go unanswered. Similarly, a list could be maintained of those numbers from which a call is not to be received, and any call from a number not on the list will cause the user to be alerted.
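The two screening policies just described can be sketched as follows. This is an illustrative sketch only: the function name, the use of in-memory sets to stand in for database 160, and the rule that an accept list takes precedence when both lists are supplied are all assumptions.

```python
from typing import Optional, Set


def should_alert(caller: Optional[str],
                 accept_list: Optional[Set[str]] = None,
                 block_list: Optional[Set[str]] = None) -> bool:
    """Decide whether an incoming call is announced to the user (hypothetical helper)."""
    if accept_list is not None:
        # Accept-list policy: only listed numbers reach the user.
        return caller in accept_list
    if block_list is not None:
        # Block-list policy: every number except the listed ones reaches the user.
        return caller not in block_list
    # No screening configured (assumption): announce every call.
    return True
```

Under the accept-list policy a call with no delivered number is not announced, since it cannot match any entry; under the block-list policy it is announced.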
If appropriate, the message will be delivered by control processor 60 to message-to-text converter 120. In this block, the information contained in the message will be formatted into a text version. As mentioned previously, another technique may be to replace an incoming caller identification number with the text of that caller's name. If this latter technique is employed, control processor 60 operating in conjunction with database 160 directs message-to-text converter 120 to alter the message to contain the text name rather than the incoming call number. Once the text version of the information is created, it is passed to text-to-speech converter 130, where known techniques are employed to convert the text to speech signals appropriate for amplification in amplifier 140 and audio delivery through speaker 150. It will be clear to those skilled in the art that a single module could be designed which incorporates the elements of message-to-text converter 120 and text-to-speech converter 130 to form a single message-to-speech converter. The speech signal can alternatively be delivered to an optional car kit (not shown) for audio delivery through an alternate amplifier/loudspeaker combination. To provide the safety benefits outlined previously, it is desirable that the loudspeaker in use can be heard by the user without the user having to hold a handset. However, the convenience of audio feedback provided by this invention may also be incorporated into the standard speaker for normal handset use as well.
Subsequent to an audio prompt as described above, the user replies with a voice command in response to the information provided in the audible announcement. This voice command is received by microphone 80, or a microphone in an optional car kit (not shown), and delivered to speech recognition module 70. Speech recognition module 70 transforms the voice command using techniques known in the art into commands for interpretation by control processor 60. One alternative is to turn voice commands into text in speech recognition module 70, but other speech encoding techniques could also be used. Another alternative may be to turn a voice command into a number as indicated by its position in a list of valid voice commands. Control processor 60 takes required action based on the voice command received. For example, if an incoming call is to be accepted, control processor 60 performs the necessary operations to proceed with initiation of call acceptance.
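The alternative in which a voice command is turned into a number indicating its position in a list of valid voice commands can be sketched as below. The sketch assumes the recognizer has already produced text; the command list and the function name are hypothetical ("send", "privacy on" and "privacy off" appear in this description, while "answer" and "ignore" are illustrative placeholders).

```python
from typing import Optional, Sequence

# Hypothetical command list; only some entries are named in the description.
VALID_COMMANDS = ("answer", "ignore", "send", "privacy on", "privacy off")


def command_to_index(recognized_text: str,
                     valid: Sequence[str] = VALID_COMMANDS) -> Optional[int]:
    """Map recognized speech to its position in the valid-command list, or None."""
    # Normalize case and whitespace so "  Privacy  ON " matches "privacy on".
    normalized = " ".join(recognized_text.lower().split())
    try:
        return list(valid).index(normalized)
    except ValueError:
        # Not a valid command: the control processor would take no action.
        return None
```

A control processor could then act on the small integer rather than on free text, which keeps the interface between the recognizer and the processor narrow.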
In cases where the information contained in the voice command must be processed in the base station (or other equipment connected to the base station), control processor 60 delivers the information to message generator 50 for proper formatting. The messages created by message generator 50 are delivered to modulator 40 for modulation, amplified and upconverted in transmitter 30, and sent to the base station by antenna 10 via duplexer 20.
In the exemplary embodiment, the modulation format employed is that described in the TIA/EIA Interim Standard IS-95-A entitled "Mobile Station-Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular System", incorporated herein by reference, and referred to hereinafter simply as IS-95. The generation and receipt of CDMA signals is disclosed in U.S. Patent No. 4,901,307 entitled "SPREAD SPECTRUM MULTIPLE ACCESS COMMUNICATION SYSTEM USING SATELLITE OR TERRESTRIAL REPEATERS" and in U.S. Patent No. 5,103,459 entitled "SYSTEM AND METHOD FOR GENERATING WAVEFORMS IN A CDMA CELLULAR TELEPHONE SYSTEM", both of which are assigned to the assignee of the present invention and incorporated herein by reference.
IS-95 provides a number of message types in both the forward and reverse directions. Some of these message types will be described as they relate to use in the present invention. There are two different forward link channels which deliver different types of messages, the traffic channel and the paging channel.
The messages which will be subsequently transformed into audible announcements, as described above, are typically delivered on the forward link. "Alert With Information" messages are delivered on a traffic channel. One type of "Alert With Information" message directs the cellular phone to ring. In the exemplary embodiment, the present invention turns this message into a speech signal which says, for example, "You have an incoming call." Another "Alert With Information" message contains the calling party's number. In the exemplary embodiment, the present invention transforms this message into a speech signal which says, for example, "You have an incoming call from 555-1234". Alternatively, as described above, if the incoming call is from a known party, the speech signal can be generated which says, for example, "You have an incoming call from John Doe".
When a call is active, another type of traffic channel message, "Flash With Information", can provide information to the user actively engaged in a conversation. One such use of this type of message is for the call waiting feature. A speech signal can be generated to indicate to the user that an incoming call is arriving and to allow the user, via voice command, to put the current conversation on hold to answer the new call. Alternatively the user can direct the call to voice mail or to be ignored, again under voice command. Variations utilizing database 160, as described above, may only allow a select group of callers the privilege of interrupting a current call. Of course, the audible signal notifying the user that a call is incoming, and who it is, along with the voice response given by the user should be blocked from delivery to the current conversant, for privacy reasons.
Some messages are delivered on the paging channel. An example is the "Feature Notification Message", which can be used to signal to a user that voice mail is waiting for the user. In like fashion to the messages described above for the traffic channel, a speech signal such as "You have voice mail" can periodically be delivered audibly to the user until such time as the user responds to the messages.
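The mapping from the forward link message types named above to spoken announcements can be sketched as a simple dispatch. This is a simplification under stated assumptions: real IS-95 messages are binary records rather than strings, the function name and the dictionary standing in for database 160 are hypothetical, and the call waiting wording is illustrative (the other phrases are the examples given in this description).

```python
from typing import Dict, Optional


def announcement_for(message_type: str,
                     calling_number: Optional[str] = None,
                     directory: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the text to voice for a message, or None if it is not voiced."""
    directory = directory or {}
    # Replace a known number with the caller's name, as described for database 160.
    caller = directory.get(calling_number, calling_number) if calling_number else None

    if message_type == "Alert With Information":
        if caller:
            return f"You have an incoming call from {caller}"
        return "You have an incoming call."
    if message_type == "Flash With Information":
        # Call waiting wording is an assumption made for illustration.
        if caller:
            return f"You have a call waiting from {caller}"
        return "You have a call waiting."
    if message_type == "Feature Notification Message":
        return "You have voice mail"
    # Message types outside the prescribed subset are not announced.
    return None
```

Returning None for unlisted types mirrors the idea that only a prescribed subset of message types is passed on for conversion to speech.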
Another class of information that can be provided utilizing the present invention is the Short Message Service (SMS) as described in ANSI/TIA/EIA-664-1996, entitled "Cellular Features Description". IS-95 provides messages for implementing these features such as "Data Burst Message". SMS messages can be directed to a single user providing information such as whether or not there are messages in voicemail and how many messages there are. SMS also provides for broadcast information to be sent to a plurality of users. All of this messaging can be processed into speech as described above.
When a user travels out of their home cellular system, it is common for their cellular phone to register in a neighboring system. This is referred to as roaming. Roaming often comes with additional charges to the user. A useful feature of the present invention is to audibly alert the user, whether or not a call is in progress, that the user has traveled into a roaming region. If available, current rate or cellular provider information could be audibly delivered to the user, allowing the user to make informed choices about whether to place or receive a new call, or to continue a call which is currently in progress.
Speech recognition of voice commands can be used for a variety of tasks. Responses to those messages which require a response, examples of which have been described above, can be given with a voice command.
Calls can also be placed in a similar manner. Another example of a useful voice command might be the recognition of the words "privacy on", for example, where the user is about to engage in conversation of a private or sensitive nature. This voice command will be transmitted to the base station, as described above, and applicable security measures will be added to the call processing for the duration of the call, or until the user issues a voice command such as "privacy off".
In some cases, audible feedback will be used in response to conditions caused by a voice command. For example, suppose a voice command was used to place a call. The user can be notified with speech such as "The line you are trying to reach is busy". Often in a cellular network it is the current base station which is at capacity, a condition typically indicated by a fast busy signal. This type of busy signal can be handled separately from the busy signal given when the called telephone is in use with a message such as "The cellular network is currently busy."
The aforementioned ANSI/TIA/EIA-664-1996, "Cellular Features Description" also details a service called Priority Access and Channel Assignment (PACA). This service was developed in response to the need to provide certain classes of users with priority access to the cellular network in certain circumstances. For example, in the case of a natural disaster, it is often true that standard phone lines go down while cellular networks are functioning properly. Emergency vehicles and the like may be given priority so as to guarantee access in these situations. The PACA service provides a queue. If the network is busy, the user is notified that his call has been entered into the queue. When a channel becomes available, the user's call is put through and the user is notified. Higher priority calls can be moved ahead of lower priority calls in the queue. Sometimes a lower priority call must be ejected from the queue to make space for a higher priority call. All of the PACA messaging can be conveniently delivered to the user utilizing the present invention.
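The queueing behavior described for PACA (priority ordering, with ejection of a lower priority call when the queue is full) can be illustrated with a bounded priority queue. This is a sketch under assumptions, not the procedure defined in the standard: the class name, the numeric priorities (lower number meaning higher priority), the fixed capacity, and the return convention are all hypothetical. Each return value corresponds to an event the telephone could voice to the affected user.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class PacaQueue:
    """Bounded priority queue; a lower number means a higher priority (assumption)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()  # tie-breaker: earlier arrivals first

    def enqueue(self, call_id: str, priority: int) -> Optional[str]:
        """Queue a call. Return the id of the call turned away, if any.

        The returned id is either an ejected lower priority call or the new
        call itself when nothing in a full queue ranks below it.
        """
        entry = (priority, next(self._arrival), call_id)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        worst = max(self._heap)  # lowest priority, latest arrival
        if entry < worst:
            # Eject the lowest priority call to make space for the new one.
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            heapq.heappush(self._heap, entry)
            return worst[2]
        return call_id

    def next_call(self) -> Optional[str]:
        """Return the call to put through when a channel becomes available."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

With a capacity of two, queueing calls of priority 5 and 3 and then one of priority 1 would eject the priority 5 call, and the priority 1 call would be the first put through.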
FIG. 2 shows the prior art 200. A base station 202 transmits digitized speech and supplementary signaling information, in a call control or application message, to a wireless telephone receiver 204. A text generator 206 generates the text from the information and a display unit 208 visually displays it. There is a speaker 210, but it is used only for the non-text-generated speech.
FIG. 3 shows the apparatus 300 of the present invention. A wireless digital telephone receiver 204 includes or drives a text generator 206, which is constructed to generate digital text from supplementary signaling information which has been included in one or more call control or application messages. The text generator 206 drives a text-to-speech converter 312, which is constructed to convert the text to a digital electronic speech signal. The digital electronic speech signal is applied to a digital-to-analog converter (DAC) 314, which is connected to convert it to an analog electronic speech signal. The analog electronic speech signal is applied to a speaker 316, which is connected to produce audible speech. It is generally desirable for the text generator 206 to also drive the display unit 208, since there are some situations (such as a meeting or a theater) in which a visual presentation of the text is preferable to an audible presentation. However, the display unit 208 may be omitted if desired. This is indicated on FIG. 3 by the display unit 208 being in dotted line. The text may conveniently include, for example, a calling number, calling name, forwarding number, or forwarding name. It may also conveniently include voice mail notifications and other Short Message Service messages.
The digital wireless telephone 204 may include the text-to-speech converter 312, DAC 314, and speaker 316, as described above. Alternatively, these components may be included in an external device, referred to generally as reference numeral 318. It might be convenient, for example, for a wireless telephone to use only a display unit 208 in hand-held operation. The telephone might also have an adapter for use in an automobile, however. This adapter would be the external device 318, and might well also have a power supply and the like. In principle, speaker 316 in the automotive mode could be the same earpiece 210 conventionally used in hand-held mode, instead of a more powerful external speaker. This is not preferred unless the automobile in question is extremely quiet.
Functionally, the text generator 206 is a receiving means constructed to generate text which is derived from information in one or more call control or application messages. Likewise, the text-to-speech converter 312 is a text-to-speech converting means connected to the receiving means and constructed to convert the received text to a digital electronic speech signal. The DAC 314 is a digital-to-analog converting means connected to receive the digital electronic speech signal and to convert it to an analog electronic speech signal. The speaker 316 is an electronic-to-acoustical transducing means connected to receive the analog electronic speech signal and to produce audible speech. This is true whether external device 318 is truly external, or is included within the telephone 204.
FIG. 4 shows a flowchart 400 illustrating the operation of the apparatus 300 of FIG. 3. Digital text has been included in information in one or more call control or application messages. This information is received 402, and the text is generated 404 from it. The text is converted 406 to a digital electronic speech signal. The digital electronic speech signal is converted 408 to an analog electronic speech signal, and audible speech is produced 410 from the analog electronic speech signal.
The previous description of the preferred embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
WE CLAIM: