WO2003091899A2 - Apparatus and method for identifying audio - Google Patents

Apparatus and method for identifying audio

Info

Publication number
WO2003091899A2
Authority
WO
WIPO (PCT)
Prior art keywords
audio
portable device
audio track
broadcast
track
Prior art date
Application number
PCT/US2003/013023
Other languages
French (fr)
Other versions
WO2003091899A3 (en
Inventor
Julie M. Zimring
Xiuzhi Gao
Timothy Michael Johnson
Marc Anguiano
Joseph Born
Original Assignee
Neuros Audio, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neuros Audio, Llc
Priority to AU2003223748A (AU2003223748A1)
Publication of WO2003091899A2
Publication of WO2003091899A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions

Definitions

  • the invention relates generally to identification of audio. More particularly, the invention is directed to a portable device configured to identify an audio track.
  • Sound audible to the human ear, i.e., having a frequency between 20 and 20,000 vibrations per second (20-20,000 Hz), is known as audio.
  • Examples of audio include speech, music, or the like. What is more, audio is typically heard from one of three sources, namely live performances, recordings, or broadcasts. In general, recordings and broadcasts are either analog or digital.
  • Analog recordings include magnetic tape recordings and records, while digital recordings include compact discs (CDs), mini-discs, various data file formats, such as MPEG Audio Layer 3 (MP3) files, or the like.
  • Analog broadcasts include sound reproduction, such as via a stereo, and analog radio broadcasts.
  • Digital broadcasts include digital radio broadcasts, such as those provided by XM SATELLITE RADIO and SIRIUS SATELLITE RADIO, and streaming broadcasts over the Internet, such as REAL AUDIO, WINDOWS MEDIA, or MP3 streams.
  • the audio track may be any finite length audio composition, such as a song, speech, or the like.
  • the identity of the audio track may be important for a number of reasons, such as to enable a user to identify a song in order to purchase the song; to know more about the artist; to find out further details about the artist; to be able to identify the audio in the future; to ascertain to whom royalties must be paid; to index a list of unidentified audio tracks; or the like.
  • the identity of the audio track is established by a number of methods, such as by the listener recognizing the audio track, reading an associated writing identifying the audio track, or relying on an announcement of the identity of the audio track. For example, a listener may recognize a song or artist that he/she knows, he/she may read a music album's CD jacket to determine the identity of a song, or he/she may listen to a radio announcer announce the title and artist of a song.
  • each source of audio has its own associated drawbacks to audio identification.
  • drawbacks in broadcasting are that radio announcers often don't announce the identity of an audio track; they wait too long to make an announcement and a listener cannot wait until the song is completed to hear the announcement; it is often inconvenient to write down the name of the song; etc.
  • An example of a drawback of recordings is that, historically, recordings did not inform the listener of the identity of the audio track.
  • audio identification data is otherwise known as metadata and is associated with many types of digital audio files.
  • An example of such metadata is the ID3 tags associated with MP3 audio files.
  • This metadata typically contains basic information about the audio file such as song title, artist, track length, etc.
  • digital streaming broadcasts sometimes also attach metadata to their digital audio streams.
  • GRACENOTE (previously CDDB) of Berkeley, California
  • GRACENOTE uses a Compact Disc Database (CDDB) to identify music that is generated from prerecorded CDs.
  • CDDB uses the unique identifiers found in the CD's table of contents, such as the CD's list of tracks and associated track times, to identify the songs on a CD.
  • the CDDB service works in conjunction with a variety of computer software media players to identify audio tracks. These media players use the CDDB to populate file names and metadata for each song encoded from a CD.
  • CDDB technology allows standalone CD players (not attached to a computer or the Internet) to display song title and artist information.
  • the device must store the GRACENOTE database locally and perform the same technique as described above, locally on the device.
  • a drawback of the CDDB technology is that it requires the presence of a full prerecorded CD to be able to identify the CD's individual audio tracks. Therefore, this technology cannot be used to identify individual audio tracks heard by a listener from sources other than a recorded CD.
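The dependence on a complete disc can be illustrated with a minimal sketch. It assumes, purely for illustration, that a disc is keyed by the tuple of its track lengths in seconds; this is not the actual CDDB identifier algorithm, and `DISC_DB`, `identify_disc`, and the album data are hypothetical names and values.

```python
from typing import Dict, Optional, Tuple

# A table of contents reduced to per-track lengths in seconds (illustrative only).
Toc = Tuple[int, ...]

# Hypothetical local database: full-disc TOC -> (album title, track titles).
DISC_DB: Dict[Toc, Tuple[str, Tuple[str, ...]]] = {
    (215, 187, 242): ("Example Album", ("Track One", "Track Two", "Track Three")),
}


def identify_disc(toc: Toc) -> Optional[Tuple[str, Tuple[str, ...]]]:
    """Return (album, track titles) for a full-disc TOC, or None if unknown."""
    return DISC_DB.get(toc)


# The full TOC resolves; the length of a single track heard elsewhere does not.
full_disc = identify_disc((215, 187, 242))
single_track = identify_disc((187,))
```

Because the key is derived from the whole table of contents, a lone track heard on the radio gives the lookup nothing to match against.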
  • Yet another type of device for identifying audio uses a time-stamping technique to identify audio tracks.
  • Two known devices that employ this time-stamping technique are the SONY E-MARKER and the XENOTE I-TAG. These devices are very simple keychain devices that simply record the date and time when a button on the device is depressed. In use, when a listener hears a song on the radio that he/she wants to identify, he/she presses the button on the device and the device records the date and time associated with the depression of the button. Later, when the device is synchronized with a desktop computer, a unique user identifier associated with the listener's device and the recorded date and time information is sent to a server via the Internet.
  • a web page is then displayed which shows the songs played on a variety of stations that the listener (having the unique identifier) had previously identified as the radio stations most commonly listened to.
  • the device itself does not store any information relating to the station the user was listening to at the time of the selection.
  • the Web-page, that presents the identified songs, also often presents options related to purchasing the CD that contains the selected song, etc.
  • a drawback of such devices that use time-stamping techniques is that they do not fully automate the process of identifying song information because the user is required to remember what station he/she was listening to when he/she actuated the device. Further, the user must interact with a desktop computer to obtain the audio track identification. Specifically, the user must identify the radio stations that he/she most commonly listens to. In addition, interaction through the Internet is required, and as a result, includes the normal drawbacks associated with the latency, reliability, and speed of the Internet. Put differently, the interaction is typically much slower than that encountered when using a non-Internet based audio track identification solution.
  • Because such devices only record the time and date of actuation, their use is limited to radio broadcasts.
  • such devices require the service provider to maintain a database that contains the complete playlists and accompanying playtimes from every radio station in every market that the service provider wishes to support.
  • collecting such playlists and accompanying playtimes is usually performed by a third party.
  • the third party either manually identifies and enters the playlists and accompanying playtimes into a database, or these playlists and accompanying playtimes are automatically identified and stored in the database by a computer. In either event, such identification and storage is complex, requires significant effort, is costly, and is, therefore, typically limited to the most popular stations, thereby excluding many geographic areas and markets.
  • Audio fingerprinting typically uses software to identify a song by comparing a unique audio identifier or fingerprint (hereinafter "fingerprint") of an audio sample to a database of known "fingerprints" associated with known audio samples.
  • CLANGO, a software product made by AUDIBLE MAGIC CORP of Los Gatos, California, uses digital fingerprinting to identify streaming audio broadcasts that do not provide associated audio track metadata.
  • the fingerprinting performed by AUDIBLE MAGIC CORP is described in U.S. Patent No. 5,918,223.
  • Another provider of similar audio fingerprinting technology is AUDITUDE, whose software product ID3MAN is aimed at users who possess a collection of digital audio files whose associated identification data is either incorrect or incomplete.
  • ID3MAN identifies the audio files and subsequently corrects the identification data associated with those files.
  • a drawback of these fingerprinting devices or services is that they do not provide any benefit to users listening to music away from their desktop computers (except in the case of a CDDB enabled CD player, which requires the device to store an extremely large GRACENOTE database, and which has its own associated drawbacks, as described above).
  • a further means for identifying audio uses a cellular telephone network, where upon hearing the audio that the user wants to identify, the user calls a designated number to have that audio identified for them. There are at least two methods that are used to provide this service.
  • the first method which was offered under the name BUZZHITS (now defunct), allowed the user to call a designated number and enter a user identifier which identified the caller (and the caller's geographic market) and then prompted the user for the broadcast frequency of the radio station broadcasting the audio to be identified. Once the broadcast frequency was supplied, the user was provided with sample audio clips, from which the user selected a sample audio clip to obtain the identity of the audio track. This information was also emailed to the user.
  • One embodiment of the invention includes a portable device that can record from a microphone, audio player, and/or radio receiver. To identify an audio track, this portable device, when actuated, records an audio sample of an audio track being played through the player or being received by the radio receiver. If the portable device is not currently playing audio or receiving a radio broadcast, it can record the audio sample through the microphone. This recorded audio sample is stored on the portable device's internal storage, and later, when connected to a client computer, is uploaded to that client computer. The client computer then processes the audio sample to generate a "fingerprint" of the audio sample that is then compared to a fingerprint database either on the client computer locally or on an identification server (ID server) coupled to the client computer via the Internet. Once the fingerprint has been identified, the title and artist information is returned to the client computer and ultimately displayed on the portable device.
  • the portable device itself processes the recorded audio sample and generates the fingerprint.
  • This has the advantage of reducing the amount of storage space needed to store the audio samples, as only the fingerprint is stored on the portable device.
  • This embodiment requires that the portable device have adequate processor power to perform fingerprinting.
  • the device also performs additional actions once the audio sample or track has been identified.
  • additional actions include downloading the identified audio track from a subscription service, recommending more audio tracks similar to the identified audio track, and obtaining prices of the identified audio track from Internet music merchants. These additional actions are preferably selected by choosing a menu item from the player's display, and can be customized and downloaded from third party service providers.
  • a method for identifying audio on a portable device. An audio sample is recorded on a portable device from an audio track. The audio sample is then stored in a cache on the portable device. The audio sample is transmitted to a computing device to be identified. The audio sample is received by the computing device and fingerprinting is performed on the audio sample to obtain a unique audio fingerprint for the audio sample. A fingerprint database is then searched for a match of the fingerprint to a known fingerprint of a previously identified audio track. A match is located and identification data associated with the previously identified audio track is sent to the portable device. Identification of the audio sample is then received and displayed on the portable device. In another embodiment, the fingerprinting is performed on the portable device.
  • a radio broadcast is received and played on the portable device.
  • An instruction to identify an audio track of the radio broadcast is received, and the radio broadcast's broadcast frequency and the date and time that the portable device received the instruction to identify the audio track are automatically recorded.
  • the broadcast frequency, date, and time, along with a unique device identifier, are then transmitted to a computing device to be identified. This data is received by the computing device.
  • a playlist database is then searched for a match of the broadcast frequency, date, and time to a known radio station's broadcast frequency based on the user's geographic location determined by the unique device identifier, and known date and time that an audio track was broadcast by the radio station.
  • the audio track associated with the broadcast frequency, date and time is located and sent to the portable device.
  • the portable device thereafter receives and displays information associated with the identified audio track. In another embodiment, the fingerprinting is performed on the portable device.
  • the process of identification can be automated.
  • the possible range of uses of the device can be broadened to cover a broader array of music, and music sources, where additional functionality can be provided for little additional cost.
  • the device can facilitate compiling a database on radio station playlists. Additional action can be initiated at the device, where the additional actions can be personalized to a user's preferences.
  • Figure 1 is a diagrammatic view of a system for identifying audio on a portable device, according to an embodiment of the invention;
  • Figure 2 is a block diagram of the identification server (ID server) and/or client computer shown in Figure 1;
  • Figure 3 is a block diagram of the portable device shown in Figure 1;
  • Figure 4 is a flow chart of a method for identifying audio, where the identification is performed by an ID server, according to an embodiment of the invention;
  • Figure 5 is a flow chart of another method for identifying audio, where the identification is performed by a client computer, according to another embodiment of the invention.
  • Figure 6 is a flow chart of yet another method for identifying audio, where the identification is performed by a portable device, according to yet another embodiment of the invention.
  • Figure 7 is a flow chart of a method for identifying audio from a radio broadcast, where the identification is performed by an identification server (ID server), according to an embodiment of the invention;
  • Figure 8 is a flow chart of another method for identifying audio from a radio broadcast, where the identification is performed by a client computer, according to another embodiment of the invention.
  • Figure 9 is a flow chart of yet another method for identifying audio from a radio broadcast, where the identification is performed by a portable device, according to yet another embodiment of the invention.
  • FIG. 1 is a block diagram of a system 100 for identifying audio, according to an embodiment of the invention.
  • the system 100 comprises at least one identification server 102 (hereinafter “ID server”) and at least one client computer 106 coupled to one another via a network 104.
  • ID server 102 and client computer 106 are any type of computing devices.
  • the client computer 106 is a desktop computer and the network 104 is the Internet.
  • the client computer 106 is coupled to the network 104 by any suitable communication link 108, such as Ethernet, coaxial cables, copper telephone lines, optical fibers, wireless, infra-red, or the like.
  • a portable audio identification device 112 (hereinafter “portable device”) is coupled to the client computer 106.
  • the portable device 112 is preferably sized to be carried in the palm of one's hand.
  • the portable device 112 couples to the client computer 106 by any suitable communication link 110, such as Universal Serial Bus (USB), Firewire, Ethernet, coaxial cable, copper telephone line, optical fiber, wireless, infra-red, or the like.
  • the client computer 106 is a fixed wireless base station coupled to a gateway/modem that is in turn connected to the network 104.
  • the client computer might be a WiFi (Wireless Fidelity - IEEE 802.11b wireless networking) base station coupled to the network 104 via a Digital Subscriber Line (DSL) gateway (not shown).
  • the communication link from the portable device 112 to the client computer 106 is a WiFi wireless communication link.
  • no client computer 106 is present and the portable device 112 communicates directly with the ID server 102.
  • the portable device 112 includes cellular telephone communication circuitry which communicates with the ID server 102 via a cellular telephone network (network 104).
  • the portable device 112 alone is necessary to identify audio.
  • the portable device periodically downloads updated playlists and/or fingerprinting databases from the network 104, as explained in further detail below in relation to Figures 6 and 9.
  • a playlist provider 114 and fingerprint provider 116 may also be coupled to the network 104.
  • the playlist provider 114 is a server that supplies updated playlists to the ID server 102, client computer 106, and/or portable device 112.
  • the fingerprint provider 116 is a server that supplies updated fingerprint data for new audio tracks to the ID server 102, client computer 106, and/or portable device 112.
  • Figure 2 is a block diagram of the ID server 102 and/or client computer 106 shown in Figure 1.
  • ID server 102 and/or client computer 106 are shown in one diagram to avoid repetition. It should, however, be appreciated that all the elements of the ID server 102 and/or client computer 106 listed below need not be present in all embodiments of the invention and are merely included for exemplary purposes.
  • the ID server 102 and/or client computer 106 preferably include: at least one data processor or central processing unit (CPU) 202; a memory 210; user interface devices 206, such as a monitor and keyboard; communications circuitry 204 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), client computer 106 (Figure 1) and/or portable device 112 (Figure 1); and at least one bus 208 that interconnects these components.
  • Memory 210 preferably includes an operating system 212, such as VXWORKS, LINUX, or WINDOWS having instructions for processing, accessing, storing, or searching data, etc.
  • Memory 210 also preferably includes communications procedures 214 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), client computer 106 (Figure 1) and/or portable device 112 (Figure 1); fingerprinting procedures 216; searching procedures 218; a fingerprinting database 220; a radio playlist database 224; a geographic identifier 234; a "no identification" message 236; and a cache 238 for temporarily storing data.
  • the fingerprinting procedures 216 are used to obtain a unique identifier or fingerprint for an audio sample of an audio track, as described in further detail below in relation to Figures 4 and 5.
  • the fingerprinting procedures 216 include instructions for performing fingerprinting on the audio sample to obtain a unique audio fingerprint for the audio sample.
  • the searching procedures 218 are used for searching the fingerprint database 220 in order to attempt to identify audio, as described in further detail below in relation to Figures 4 to 6.
  • the fingerprinting database 220 includes numerous fingerprints of known audio samples or audio tracks and their associated identification data 222(1)-(N), such as song title, artist, or the like.
  • a radio playlist database 224 is provided.
  • the radio playlist database 224 includes numerous radio frequencies 226(1)-(N) and an associated playlist 228(1)-(N) for each frequency 226(1)-(N).
  • Each playlist 228(1)-(N) includes a date 230(1)-(N) and time 232(1)-(N), and the identity 232(1)-(N) of each audio track broadcast at that date and time.
  • radio station KJAZ may have a frequency of 98.7FM, and a playlist that includes Frank Sinatra's "New York, New York" broadcast on January 21, 2002 at 9:00AM.
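One possible in-memory shape for such a playlist database is sketched below. The names `PlaylistEntry`, `EXAMPLE_DB`, and `lookup` are hypothetical, and the sketch assumes that a track is treated as on air from its recorded start time until the next entry for the same frequency begins; the second entry is invented filler so that the lookup has a boundary to respect.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class PlaylistEntry:
    start: datetime  # date and time the track began airing
    identity: str    # artist and title of the track


# Hypothetical database: frequency label -> entries sorted by start time.
EXAMPLE_DB: Dict[str, List[PlaylistEntry]] = {
    "98.7FM": [
        PlaylistEntry(datetime(2002, 1, 21, 9, 0), "Frank Sinatra - New York, New York"),
        PlaylistEntry(datetime(2002, 1, 21, 9, 4), "Hypothetical next track"),
    ],
}


def lookup(db: Dict[str, List[PlaylistEntry]], frequency: str,
           when: datetime) -> Optional[str]:
    """Return the track that most recently started at or before `when`."""
    current = None
    for entry in db.get(frequency, []):
        if entry.start <= when:
            current = entry.identity
        else:
            break  # entries are sorted, so later ones cannot match
    return current
```

A request tagged two minutes after 9:00AM on 98.7FM resolves to the Sinatra entry, while a request before the first entry, or on an unknown frequency, yields `None`.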
  • a geographic identifier 234 is provided to identify the radio stations or frequencies 226(1)-(N) in a particular geographic area.
  • This geographic identifier 234 may be provided by any suitable means.
  • the user supplies the geographic identifier.
  • the geographic identifier 234 is obtained from the user's unique network address. For example, an Internet Protocol (IP) address of the client computer 106 and/or portable device 112 can be used to approximate the geographic area of the user.
  • a Global Positioning System (GPS) incorporated into the client computer 106 and/or portable device 112 can be used to determine the geographic area of the user.
  • the "no identification" message 236 is used to inform the user that no identification can be made.
  • the user may be presented with a number of "closest match" possible identifications.
  • the fingerprinting procedures 216, searching procedures 218, and fingerprinting database 220 may only be necessary on the device on which fingerprinting of an audio track occurs.
  • the aforementioned elements of memory 210 need only be present on the ID server 102.
  • the aforementioned elements of memory 210 are not provided on either the ID server 102 or the client computer 106.
  • Figure 3 is a block diagram of the portable device 112 shown in Figure 1. It should be appreciated by one skilled in the art that all the elements of the portable device 112 listed below need not be present in all embodiments of the invention and are merely included for exemplary purposes.
  • the portable device 112 preferably includes: at least one data processor or central processing unit (CPU) 302; a memory 310; user interface devices 308, such as buttons, a screen, and a headset; communications circuitry 304 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), and/or client computer 106 (Figure 1); one or more audio players 350, such as a CD or MP3 player; a microphone 352; a radio receiver 354 and antenna 356 for receiving radio broadcasts; and at least one bus 306 that interconnects these components.
  • Memory 310 preferably includes an operating system 312, such as VXWORKS, LINUX, or WINDOWS having instructions for processing, accessing, storing, or searching data, etc.
  • Memory 310 also preferably includes communications procedures 314 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), and/or client computer 106 (Figure 1); fingerprinting procedures 316; searching procedures 318; a fingerprinting database 320; a radio playlist database 324; a geographic identifier 334; geographic identification procedures 336; a "no identification" message 338; recording procedures 340; player procedures 342; radio procedures 344; a cache 346 for temporarily storing data; frequency detection procedures 358; and a clock 360.
  • the fingerprinting procedures 316 are used to obtain a unique identifier or fingerprint for an audio sample of an audio track, as described in further detail below in relation to Figure 6.
  • the searching procedures 318 are used for searching the fingerprint database 320 in order to attempt to identify audio, as described in further detail below.
  • the fingerprinting database 320 includes numerous fingerprints of known audio samples or audio tracks and their associated identification data 322(1)-(N), such as song title, artist, or the like.
  • a radio playlist database 324 is provided.
  • the radio playlist database 324 includes numerous radio frequencies 326(1)-(N) and an associated playlist 328(1)-(N) for each frequency 326(1)-(N).
  • Each playlist 328(1)-(N) includes a date 330(1)-(N) and time 332(1)-(N), and the identity 332(1)-(N) of each audio track broadcast at that date and time.
  • a geographic identifier 334 is provided to assist in identifying the radio stations or frequencies 326(1)-(N) in a particular geographic area.
  • the geographic identifier 334 may select from a set of frequencies stored on the device based on the identified geographic area.
  • This geographic identifier 334 may be provided by any suitable means.
  • the user supplies the geographic identifier 334.
  • the geographic identifier 334 is obtained by the geographic identification procedures 336. As described above, this can be determined from the user's unique network address. For example, an Internet Protocol (IP) address of the portable device 112 can be used to approximate the geographic area of the user.
  • a Global Positioning System (GPS) (not shown) incorporated into the portable device 112 can be used to determine the geographic area of the user.
  • a "no identification" message 338 is used to inform the user that no identification can be made, if the portable device 112 cannot identify the audio track.
  • the recording procedures 340 record an audio sample 348, which is stored in the cache 346.
  • the audio sample is recorded from the audio player/s 350, microphone 352, and/or the radio receiver 354.
  • the recording procedures are used to record the date, time, and broadcast or radio station frequency 349, which is stored in the cache 346.
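The record captured when the user asks for an identification might be modelled as follows. This is a sketch only: `BroadcastTag`, `to_payload`, the field names, and the device identifier string are hypothetical, and the dictionary layout is an assumed wire format rather than one the description specifies.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Union


@dataclass(frozen=True)
class BroadcastTag:
    device_id: str        # unique device identifier (hypothetical format)
    frequency_mhz: float  # frequency reported by the radio receiver
    pressed_at: datetime  # clock reading when the instruction was received


def to_payload(tag: BroadcastTag) -> Dict[str, Union[str, float]]:
    """Flatten a tag into the fields sent for identification."""
    return {
        "device_id": tag.device_id,
        "frequency_mhz": tag.frequency_mhz,
        "date": tag.pressed_at.date().isoformat(),
        "time": tag.pressed_at.time().isoformat(timespec="seconds"),
    }


example_tag = BroadcastTag("device-0001", 98.7, datetime(2002, 1, 21, 9, 0, 30))
example_payload = to_payload(example_tag)
```

Capturing the frequency on the device at the moment of actuation is what spares the user from having to remember which station was playing.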
  • the player procedures 342 are preferably provided to play audio on the audio player/s 350. These player procedures 342 are especially needed for playing digital audio, such as MP3 audio tracks, or the like.
  • the radio procedures 344 are preferably provided to play radio received at the antenna 356 and fed through the radio receiver 354. It should, however, be appreciated that all the aforementioned components of the memory 310 need not be present in all embodiments of the invention and are merely included for exemplary purposes.
  • the frequency detection procedures 358 are used to detect the frequency of a radio station broadcast, and the clock 360 is used to keep the date and time.
  • the frequency detection procedures 358 and clock 360 are explained in further detail below in relation to Figures 7-9.
  • Figure 4 is a flow chart of a method for identifying audio, where the identification is performed by the identification server (ID server) 102, according to an embodiment of the invention.
  • the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112 play audio at step 402 through the user interface devices 308 (Figure 3).
  • a built-in MP3 player plays audio to a user through a headset.
  • the radio receiver 354 (Figure 3) and/or radio procedures 344 (Figure 3) receive and play a radio broadcast through the portable device's headset.
  • Instructions are then received at step 404 to identify the audio. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device, or the like. In an alternative embodiment, the instruction to record is received automatically. For example, an audio sample is automatically recorded every 2 minutes. The steps 402 and 404 of playing audio and receiving instructions are not essential to the invention and in some embodiments need not occur.
  • an audio sample is then recorded at step 406 by the recording procedures 340 (Figure 3) and saved as an audio sample 348 (Figure 3) in the cache 346 (Figure 3).
  • audio is recorded continuously and automatically segmented into audio samples having sufficient length to undergo fingerprinting. For example, audio is continually recorded and automatically segmented into 30 second audio samples that are continually sent to the ID server 102 to be identified.
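The segmentation step can be sketched as a generator over a buffer of recorded samples. The function name `segment` and the choice to drop a trailing partial segment (on the assumption that it is too short to fingerprint) are illustrative decisions, not details taken from the description.

```python
from typing import Iterator, Sequence


def segment(samples: Sequence[int], sample_rate: int,
            seconds: int = 30) -> Iterator[Sequence[int]]:
    """Yield consecutive fixed-length segments of `samples`.

    A trailing remainder shorter than one full segment is not yielded.
    """
    size = sample_rate * seconds
    for start in range(0, len(samples) - size + 1, size):
        yield samples[start:start + size]


# 100 seconds of audio at a toy rate of 1 sample per second.
toy_segments = list(segment(list(range(100)), sample_rate=1))
```

With 100 seconds of input and 30 second segments, three full segments are produced and the final 10 seconds are held back.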
  • audio is recorded at step 406 through the microphone 352 (Figure 3).
  • the communication procedures 314 (Figure 3) transmit the audio sample to the client computer 106 at step 408.
  • this communication occurs over communications link 110, such as a serial port connection, wireless connection, or the like.
  • the audio sample is then received by the client computer and sent at step 410 to the ID server 102.
  • if the portable device does not have a persistent communication link with the client computer 106 and/or ID server 102, then the audio samples are saved in the cache 346 (Figure 3) until such time as a connection is established between the portable device 112 and the client computer 106 and/or ID server 102.
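Holding samples until a link is available can be sketched as a first-in, first-out queue that is drained when sending succeeds. `SampleCache` and the `send` callback (assumed to return `True` on successful delivery) are hypothetical names for illustration.

```python
from collections import deque
from typing import Callable, Deque


class SampleCache:
    """Keeps recorded samples until they can be delivered, oldest first."""

    def __init__(self, send: Callable[[bytes], bool]) -> None:
        self._pending: Deque[bytes] = deque()
        self._send = send  # assumed to return True when delivery succeeds

    def add(self, sample: bytes) -> None:
        self._pending.append(sample)

    def flush(self) -> int:
        """Try to deliver pending samples; return how many were sent."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # link is down; keep the sample for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

A sample is removed from the queue only after the callback reports success, so a dropped connection part-way through a flush loses nothing.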
  • the audio sample is transmitted at step 408 directly to the ID server 102.
  • the ID server 102 then receives the audio sample at step 412.
  • the fingerprinting procedures 216 (Figure 2) on the ID server 102 subsequently perform fingerprinting on the audio sample at step 414 to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features.
  • characteristics or acoustical features include the audio sample's analog waveform, loudness, pitch, brightness, bandwidth, Mel Frequency Cepstral Coefficients (MFCC), or the like.
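A few of these features can be computed with the standard library alone, as sketched below. The definitions are assumptions chosen for illustration: loudness is taken as root-mean-square amplitude, brightness as the magnitude-weighted mean frequency (spectral centroid), and bandwidth as the magnitude-weighted spread around that centroid. The naive transform is only practical for short frames, and the function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def loudness_rms(frame: Sequence[float]) -> float:
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    """Magnitudes of a naive discrete Fourier transform, bins 0..n/2."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def brightness_and_bandwidth(frame: Sequence[float],
                             sample_rate: int) -> Tuple[float, float]:
    """Return (spectral centroid, spectral spread), both in Hz."""
    mags = magnitude_spectrum(frame)
    freqs = [k * sample_rate / len(frame) for k in range(len(mags))]
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0  # silent frame
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    spread = math.sqrt(
        sum(((f - centroid) ** 2) * m for f, m in zip(freqs, mags)) / total
    )
    return centroid, spread


def sine(freq_hz: float, sample_rate: int, n: int) -> List[float]:
    """Test tone used to exercise the feature functions."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]
```

A fingerprint in this style would be a short vector of such values gathered over many frames; a pure tone should come out with a centroid at its own frequency and almost no spread.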
  • the fingerprint database 220 (Figure 2) is then searched at step 416 for a match or partial match of the fingerprint of the audio sample recorded on the portable device to a known fingerprint of a previously identified audio track. If a match is not located (418 - No), then a "no identification" message 236 (Figure 2) is sent at step 430 to the portable device 112. In the embodiment where the portable device 112 couples to the ID server 102 via the client computer, the client computer receives the "no identification" message and sends it to the portable device at step 432.
  • the portable device then receives at step 434 the "no identification" message, which is displayed at step 436 to the user informing the user that the audio could not be identified.
  • a message is preferably displayed on a screen on the portable device.
  • identification data associated with said previously identified audio track is sent at step 420 to the portable device 112.
  • the client computer receives the identification of the audio sample and sends the identification to the portable device at step 422.
  • the portable device receives and displays the identification of the audio sample at steps 424 and 426 respectively. For example, an artist and song title is displayed.
  • additional actions are then taken at step 428. These additional actions are performed only after the identity of the audio sample is known, and for example include the client computer 106 and/or portable device 112 automatically displaying the identified artist's Web-page, biography, discography; automatically displaying a Web-page selling the artist's song or album; downloading the audio track from a subscription service; recommending similar audio; obtaining prices of the identified audio track from Internet music merchants; or the like.
  • These additional actions are preferably selected by choosing a menu item from the portable device's display, and can be customized and/or downloaded from third party service providers.
  • the portable device can also return information about commercials or other spoken word recordings. Further these additional actions can be taken on all audio samples where the identity of the audio is known or unknown. For example, the user may have downloaded a digital audio file and may have the complete or partial identity of the song known, but still want to send the song for identification to receive additional information on the audio track or take some action on that audio track.
  • Figure 5 is a flow chart of another method for identifying audio, where the identification is performed by a client computer 106, according to another embodiment of the invention.
  • the audio player/s 350 ( Figure 3) and/or player procedures 342 ( Figure 3) of the portable device 112 play audio at step 502 through the user interface devices 308 ( Figure 3).
  • a built-in MP3 player plays audio to a user through a headset.
  • Instructions are then received at step 504 to identify audio. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device. In an alternative embodiment, the instruction to record is received automatically. For example, an audio sample is automatically recorded every 2 minutes. It should be noted that the steps 502 and 504 of playing audio and receiving instructions to identify the audio are not essential to the invention and in some embodiments need not occur.
  • an audio sample is then recorded at step 506 by the recording procedures 342 ( Figure 3) and saved as an audio sample 348 ( Figure 3) in the cache 346 ( Figure 3).
  • audio is recorded continuously and automatically segmented into audio samples having sufficient length to undergo fingerprinting. For example, audio is continually recorded and automatically segmented into 30 second audio samples that are continually identified by the client computer 106.
  • audio is recorded at step 506 through the microphone 352 ( Figure 3).
  • the communication procedures 314 ( Figure 3) continue transmitting audio samples to the client computer 106 at step 508 until the wireless connection is brought down. The audio sample is then received at step 510 by the client computer 106.
  • if the portable device does not have a persistent communication link with the client computer 106, then the audio samples are saved in the cache until such time as a connection is established between the portable device 112 and the client computer 106.
  • the fingerprinting procedures 216 ( Figure 2) on the client computer 106 subsequently perform at step 512 fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features, as described above.
  • the fingerprinting procedures 316 ( Figure 3) on the portable device 112 perform fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features. This fingerprint is then sent to the client computer 106, which searches the fingerprint database at step 514.
  • the fingerprint database 220 ( Figure 2) is then searched at step 514 for a match or partial match to the fingerprint of the audio sample recorded on the portable device. If a match is not located (516 - No), then a "no identification" message 236 ( Figure 2) is sent at step 526 to the portable device 112, which receives the "no identification" message at step 528 and displays it at step 530 to the user informing the user that the audio could not be identified. Such a message is preferably displayed on a screen on the portable device. If a match is located (516 - Yes), then an identification of the audio sample is sent at step 518 to the portable device 112. The portable device receives and displays the identification of the audio sample at steps 520 and 522, preferably on the portable device's screen. For example, an artist and song title is displayed. In a preferred embodiment, additional actions are then taken at step 524, as described above.
  • Figure 6 is a flow chart of yet another method for identifying audio, where the identification is performed by a portable device 112, according to yet another embodiment of the invention.
  • the audio player/s 350 ( Figure 3) and/or player procedures 342 ( Figure 3) of the portable device 112 play audio at step 602 through the user interface devices 308 ( Figure 3).
  • a built-in MP3 player plays audio to a user through a headset.
  • An instruction to identify audio is then received at step 604.
  • the instruction to record is received automatically. It should be noted that the steps 602 and 604 of playing audio and receiving instructions to identify the audio are not essential to the invention and in some embodiments need not occur.
  • An audio sample is then recorded at step 606 by the recording procedures 342 ( Figure 3) and saved as an audio sample 348 ( Figure 3) in the cache 346 ( Figure 3).
  • audio is recorded continuously and automatically segmented into audio samples having sufficient length to undergo fingerprinting.
  • the audio samples 348 ( Figure 3) are preferably temporarily saved in the cache 346 ( Figure 3).
  • audio is recorded at step 606 through the microphone 352 ( Figure 3).
  • the fingerprinting procedures 316 ( Figure 3) on the portable device 112 subsequently perform at step 608 fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features.
  • characteristics or acoustical features include the audio sample's analog waveform, loudness, pitch, brightness, bandwidth, Mel Frequency Cepstral Coefficients (MFCC), or the like.
  • the fingerprint database 320 ( Figure 3) is then searched at step 610 for a match or partial match to the fingerprint of the audio sample recorded on the portable device. If a match is not located (612 - No), then a "no identification" message 340 ( Figure 3) is displayed at step 614 informing the user that the audio could not be identified. Such a message is preferably displayed on a screen on the portable device.
  • an identification of the audio sample is displayed at step 616, preferably on the portable device's screen. For example, an artist and song title is displayed. In a preferred embodiment, additional actions are then taken at step 618, as described above.
  • Another embodiment of the invention identifies audio using a peer to peer network consisting of networked portable devices only. For example, if the device has wireless networking capabilities, and fingerprinting is performed on the portable device, then identification of the audio sample may occur by searching for fingerprints on other networked portable devices.
  • Another embodiment where fingerprinting is performed on the portable device provides central kiosks, such as at record stores, where identification of the fingerprint may be performed. This embodiment alleviates the load placed on such kiosks, as they do not generate the fingerprint and less data would have to be transferred and maintained at each kiosk.
  • Figure 7 is a flow chart of a method for identifying audio from a radio broadcast, where the identification is performed by an identification server (ID server) 102, according to an embodiment of the invention.
  • the radio receiver 354 ( Figure 3) and/or radio procedures 344 ( Figure 3) of the portable device 112 receive and play a radio broadcast at step 702.
  • An instruction is then received at step 704 to identify an audio track of the radio broadcast.
  • These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device, or the like.
  • the instruction to identify the audio track is received automatically. For example, an attempt to identify the audio track occurs automatically every 2 minutes.
  • the recording procedures 342 ( Figure 3) then record and store the radio station's broadcast frequency at step 706, and the date and time that the portable device 112 received the instruction to identify the audio track. For example, a broadcast frequency of 95.7 MHz, date of February 23, 2002, and a time of 13H00 is recorded.
  • the recording procedures 340 ( Figure 3) obtain the frequency from the frequency detection procedures 358 ( Figure 3) and the date and time from the clock 360 ( Figure 3).
  • the frequency detection procedures 358 ( Figure 3) do nothing more than ascertain what radio frequency the user has selected, i.e., by reading the value that the radio receiver has been tuned to.
  • the frequency detection procedures 358 may detect the frequency of the broadcast, which is often broadcast together with the audio signal.
  • RDS Radio Data Service
  • RDS typically transmits the actual radio identification.
  • RDS typically transmits the actual station identification, which is more reliable as such a station identification is not dependent on geography.
  • RDS actually transmits information about the owner of the radio station which would unambiguously define which playlist to search.
  • Still other frequency detection procedures 358 may detect the frequency of the broadcast by detecting the frequency that the radio receiver is tuned to. It should be appreciated that the frequency of the broadcast is determined automatically, i.e., the user does not supply the radio frequency to the frequency detection procedures 358 ( Figure 3).
  • the date and time on the clock 360 ( Figure 3) initially can be set by the user, or the portable device can automatically set the clock using known techniques for remotely synchronizing a clock from a reliable time source, such as an atomic clock.
  • Such techniques for remotely synchronizing a clock are disclosed in U.S. Patent Nos. 4,823,328, and 4,768,178, both of which are incorporated herein by reference.
  • clock/frequency data 349 ( Figure 3) are then stored in the cache 346 ( Figure 3).
  • the portable device 112 then transmits the clock/frequency data 349 ( Figure 3) at step 708.
  • the communication procedures 314 ( Figure 3) transmit the clock/frequency data 349 ( Figure 3) to the client computer 106 at step 708.
  • this communication occurs over communications link 110 ( Figure 1), such as a serial port connection, wireless connection, or the like.
  • the clock/frequency data 349 ( Figure 3) is then received by the client computer and sent to the ID server 102 at step 710.
  • if the portable device does not have a persistent communication link with the client computer 106 and/or ID server 102, then the clock/frequency data 349 ( Figure 3) are saved in the cache until such time as a connection is established between the portable device 112 and the client computer 106 and/or ID server 102.
  • the audio sample is transmitted directly to the ID server 102 at step 708.
  • the ID server 102 then receives the clock/frequency data 349 ( Figure 3) at step 712.
  • the radio playlist database 224 ( Figure 2) is then searched at step 716 for a match to the clock/frequency data 349 ( Figure 3) recorded on the portable device 112. If a match is not located (718 - No), then a "no identification" message 236 ( Figure 2) is sent at step 730 to the portable device 112.
  • the client computer receives the "no identification" message and sends it to the portable device at step 732.
  • the portable device then receives the "no identification message," which is displayed at step 736 to the user informing the user that the audio could not be identified.
  • a message is preferably displayed on a screen on the portable device.
  • an identification of the audio track is sent at step 720 to the portable device 112.
  • the client computer receives the identification of the audio sample and sends the identification to the portable device at step 722.
  • the portable device receives and displays the identification of the audio sample, such as an artist and song title, at steps 724 and 726. In a preferred embodiment, additional actions are then taken at step 728, as described above.
  • Figure 8 is a flow chart of another method for identifying audio from a radio broadcast, where the identification is performed by a client computer 106, according to another embodiment of the invention.
  • the radio receiver 354 ( Figure 3) and/or radio procedures 344 ( Figure 3) of the portable device 112 receive and play at step 802 a radio broadcast on the portable device 112. Instructions are then received at step 804 to identify an audio track played on the radio. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device, or the like.
  • the instruction to identify the audio track is received automatically. For example, an attempt to identify the audio track occurs automatically every 2 minutes.
  • the recording procedures 342 ( Figure 3) then store the clock/frequency data 349 ( Figure 3) at step 806. In a similar manner to that explained above, the recording procedures 340 ( Figure 3) obtain the clock/frequency data 349 ( Figure 3) from the frequency detection procedures 358 ( Figure 3) and the date and time from the clock 360 ( Figure 3).
  • the clock/frequency data 349 ( Figure 3) are stored in the cache 346 ( Figure 3), and subsequently transmitted at step 808 to the client computer 106. As mentioned previously, this communication occurs over communications link 110 ( Figure 1), such as a serial port connection, wireless connection, or the like.
  • the audio sample is then received at step 810 by the client computer 106.
  • the radio playlist database 224 ( Figure 2) is then searched at step 814 for a match to the clock/frequency data 349 ( Figure 3) recorded on the portable device 112. If a match is not located (816 - No), then a "no identification" message 236 ( Figure 2) is sent at step 826 to the portable device 112. The portable device then receives and displays the "no identification" message at steps 828 and 830.
  • an identification of the audio track is sent at step 818 to the portable device 112.
  • the portable device receives and displays the identification of the audio sample, such as an artist and song title, at steps 820 and 822.
  • additional actions are then taken at step 824, as described above.
  • Figure 9 is a flow chart of yet another method for identifying audio from a radio broadcast, where the identification is performed by a portable device 112, according to yet another embodiment of the invention.
  • the radio receiver 354 ( Figure 3) and/or radio procedures 344 ( Figure 3) of the portable device 112 receive and play at step 902 a radio broadcast on the portable device 112.
  • instructions are then received at step 904 to identify an audio track played on the radio.
  • the recording procedures 342 ( Figure 3) then store the clock/frequency data 349 ( Figure 3) at step 906.
  • the recording procedures 340 ( Figure 3) obtain the clock/frequency data 349 ( Figure 3) from the frequency detection procedures 358 ( Figure 3) and the date and time from the clock 360 ( Figure 3).
  • the clock/frequency data 349 ( Figure 3) are stored in the cache 346 ( Figure 3).
  • the radio playlist database 324 ( Figure 3) is then searched at step 910 for a match to the clock/frequency data 349 ( Figure 3). If a match is not located (912 - No), then a "no identification" message 338 ( Figure 3) is displayed at step 914. If a match is located (912 - Yes), then an identification of the audio track is displayed at step 916. In a preferred embodiment, additional actions are then taken at step 918, as described above.
  • the fingerprinting database 220 ( Figure 2) and/or 320 ( Figure 3) on the ID server 102 ( Figure 1), client computer 106 ( Figure 1), or portable device 112 ( Figure 1) is preferably periodically updated from the fingerprint provider 116 ( Figure 1).
  • the radio playlist database 224 ( Figure 2) and/or 324 ( Figure 3) on the ID server 102 ( Figure 1), client computer 106 ( Figure 1), and/or portable device 112 ( Figure 1) is preferably periodically updated from the playlist provider 114 ( Figure 1).
  • the portable device appends clock/frequency data to the audio sample.
  • This data may be used for many uses, such as determining a user's listening habits for targeted advertising, or the like.
  • when a user is listening to a radio broadcast on a secondary device, such as a car radio, he/she can use the portable device to identify the broadcast channel to which the secondary device is tuned and then record from that channel. To do this, the portable device searches through all the channels until it finds the broadcast station whose signal matches the ambient audio as heard through the portable device's microphone. Subsequently, the audio is identified using one of the above described techniques.
  • This embodiment addresses drawbacks associated with recording ambient noise.
  • the tuned radio frequency is recorded, which can be used to augment the fingerprinting process and provide additional information to the database.
  • another embodiment of the invention utilizes broadcasting a predefined set of audible tones.
  • the tones are used to identify the beginning and end of audio that has been designated for identification.
  • a radio station transmits one or more audible tones.
  • the portable device 112 ( Figure 1) is configured to record the audio track encapsulated by the audible tones. Once recorded, the audio track is identified as described above.
  • the audible tones themselves contain identification data.
  • a series of audible tones represent an identifier, such as a series of numbers. This identifier is then used to look-up associated information on a database, either on the portable device 112 ( Figure 1) or on the ID server 102 ( Figure 1).
  • a series of audible tones themselves represent an artist's name, song title, radio station identifier, or the like.
  • Such a series of tones may also be identifiable by the human ear, but may be forced to conform with a prescribed set of rules for such tones that would distinguish them from normal audio. For example, they may have to begin with a prescribed set of tones or conform to a certain prescribed length.
  • a local band promoting itself registers its audible tone identifier in the database to be identified by the system.
  • the portable device Upon synchronization the portable device returns information, such as telling the user where to get the recording for that particular local band or where to buy a CD containing the local band's songs, etc.
  • audible tone identifiers can be associated with particular artists, radio stations, etc., and used in marketing and promotion.
  • Such audible tone identifiers may be transmitted to other users via any suitable means, such as email, beaming from one portable device to another, or the like.
  • audible tone/s and audible tone identifiers can be used to assist any of the abovementioned methods for audio track identification.

Abstract

An audio sample is recorded on a portable device (112) from an audio track. The audio sample is then stored in a cache on the portable device (112). The audio sample is transmitted to a computing device (106) to be identified. The audio sample is received by the computing device and fingerprinting is performed on the audio sample to obtain a unique audio fingerprint for the audio sample. A fingerprint database (116) is then searched for a match of the fingerprint to a known fingerprint of a previously identified audio track. A match is located and identification data associated with the previously identified audio track is sent to the portable device. Identification of the audio sample is then received and displayed on the portable device.

Description

APPARATUS AND METHOD FOR IDENTIFYING AUDIO
BACKGROUND OF THE INVENTION
FIELD OF THE INVENTION
The invention relates generally to identification of audio. More particularly, the invention is directed to a portable device configured to identify an audio track.
DESCRIPTION OF RELATED ART
Sound audible to the human ear, i.e., having a frequency between 20 and 20,000 vibrations per second (20-20,000 Hz), is known as audio. Examples of audio include speech, music, or the like. What is more, audio is typically heard from one of three sources, namely live performances, recordings, or broadcasts. In general, recordings and broadcasts are either analog or digital. Analog recordings include magnetic tape recordings and records, while digital recordings include compact discs (CDs), mini-discs, various data file formats, such as MPEG Audio Layer 3 (MP3) files, or the like. Analog broadcasts include sound reproduction, such as via a stereo, and analog radio broadcasts. Digital broadcasts, on the other hand, include digital radio broadcasts, such as those provided by XM SATELLITE RADIO and SIRIUS SATELLITE RADIO, and streaming broadcasts over the Internet, such as REAL AUDIO, WINDOWS MEDIA, or MP3 streams.
Often, listeners of audio want to identify an audio track to which they are listening. The audio track may be any finite length audio composition, such as a song, speech, or the like. The identity of the audio track may be important for a number of reasons, such as to enable a user to identify a song in order to purchase the song; to know more about the artist; to find out further details about the artist; to be able to identify the audio in the future; to ascertain to whom royalties must be paid; to index a list of unidentified audio tracks; or the like.
Typically, the identity of the audio track is established by a number of methods, such as by the listener recognizing the audio track, reading an associated writing identifying the audio track, or relying on an announcement of the identity of the audio track. For example, a listener may recognize a song or artist that he/she knows, he/she may read a music album's CD jacket to determine the identity of a song, or he/she may listen to a radio announcer announce the title and artist of a song.
However, in certain situations a user may not be able to identify an audio track using one of these aforementioned methods. Indeed, each source of audio has its own associated drawbacks to audio identification. For example, drawbacks in broadcasting are that radio announcers often don't announce the identity of an audio track; they wait too long to make an announcement and a listener cannot wait until the song is completed to hear the announcement; it is often inconvenient to write down the name of the song; etc. An example of a drawback of recordings is that, historically, recordings did not inform the listener of the identity of the audio track.
In recent years, however, digital recordings have introduced audio identification data into recorded audio track data. This audio identification data is otherwise known as metadata and is associated with many types of digital audio files. An example of such metadata is the ID3 tags associated with MP3 audio files. This metadata typically contains basic information about the audio file such as song title, artist, track length, etc. Likewise, digital streaming broadcasts sometimes also attach metadata to their digital audio streams.
Another means for identifying digital recordings is provided by GRACENOTE (previously CDDB of Berkeley, California) and described in U.S. Patent Nos. 6,330,593; 6,240,459; 6,230,207; 6,230,192; 6,161,132; 6,154,773; 6,061,680; and 5,987,525. GRACENOTE uses a Compact Disc Database (CDDB) to identify music that is generated from prerecorded CDs. The CDDB uses the unique identifiers found in the CD's table of contents, such as the CD's list of tracks and associated track times, to identify the songs on a CD. The CDDB service works in conjunction with a variety of computer software media players to identify audio tracks. These media players use the CDDB to populate file names and metadata for each song encoded from a CD.
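The table-of-contents lookup described above can be pictured as a keyed lookup on the track count and track times. The following is a minimal sketch under that assumption; the key layout, the sample database entry, and the function names are hypothetical and are not the actual GRACENOTE disc identification algorithm.

```python
def toc_key(track_seconds):
    """Build a lookup key from a CD's track count and per-track durations."""
    return (len(track_seconds), tuple(track_seconds))


# Hypothetical in-memory database mapping table-of-contents keys to album metadata.
DISC_DB = {
    toc_key([215, 187, 242]): {
        "album": "Example Album",
        "tracks": ["First Song", "Second Song", "Third Song"],
    },
}


def identify_disc(track_seconds):
    """Return album metadata for a matching table of contents, or None."""
    return DISC_DB.get(toc_key(track_seconds))
```

Because the key is built from the whole table of contents, a lookup with only one track's duration, such as `identify_disc([215])`, finds nothing, which illustrates why this approach needs the full prerecorded CD.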
Another application of the CDDB technology allows standalone CD players (not attached to a computer or the Internet) to display song title and artist information. To do this, the device must store the GRACENOTE database locally and perform the same technique as described above, locally on the device.
A drawback of the CDDB technology is that it requires the presence of a full prerecorded CD to be able to identify the CD's individual audio tracks. Therefore, this technology cannot be used to identify individual audio tracks heard by a listener from sources other than a recorded CD.
Yet another type of device for identifying audio uses a time-stamping technique to identify audio tracks. Two known devices that employ this time-stamping technique are the SONY E-MARKER and the XENOTE I-TAG. These devices are very simple keychain devices that simply record the date and time when a button on the device is depressed. In use, when a listener hears a song on the radio that he/she wants to identify, he/she presses the button on the device and the device records the date and time associated with the depression of the button. Later, when the device is synchronized with a desktop computer, a unique user identifier associated with the listener's device and the recorded date and time information is sent to a server via the Internet. Typically, a web page is then displayed which shows the songs played on a variety of stations that the listener (having the unique identifier) had previously identified as the radio stations most commonly listened to. The device itself does not store any information relating to the station the user was listening to at the time of the selection. The Web-page, that presents the identified songs, also often presents options related to purchasing the CD that contains the selected song, etc.
A drawback of such devices that use time-stamping techniques is that they do not fully automate the process of identifying song information because the user is required to remember what station he/she was listening to when he/she actuated the device. Further, the user must interact with a desktop computer to obtain the audio track identification. Specifically, the user must identify the radio stations that he/she most commonly listens to. In addition, interaction through the Internet is required, and as a result, includes the normal drawbacks associated with the latency, reliability, and speed of the Internet. Put differently, the interaction is typically much slower than that encountered when using a non-Internet based audio track identification solution.
Moreover, because such devices only record time and date of actuation, use of such devices is limited to radio broadcasts. In addition, such devices require the service provider to maintain a database that contains the complete playlists and accompanying playtimes from every radio station in every market that the service provider wishes to support. Because the radio stations do not provide this information, collecting such playlists and accompanying playtimes is usually performed by a third party. The third party either manually identifies and enters the playlists and accompanying playtimes into a database, or these playlists and accompanying playtimes are automatically identified and stored in the database by a computer. In either event, such identification and storage is complex, requires significant effort, is costly, and is, therefore, typically limited to the most popular stations, thereby excluding many geographic areas and markets.
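A time-stamp lookup against such a playlist database can be sketched as follows. The station names, the playlist entries, and the function name are hypothetical, and the sketch assumes each station's playlist is stored as a list of start times sorted in ascending order.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical playlist data: per station, (start time, artist, title), sorted by start.
PLAYLISTS = {
    "Station A": [
        (datetime(2002, 2, 23, 12, 55), "Artist A", "Song One"),
        (datetime(2002, 2, 23, 12, 59), "Artist B", "Song Two"),
        (datetime(2002, 2, 23, 13, 3), "Artist C", "Song Three"),
    ],
}


def lookup_track(station, pressed_at):
    """Return (artist, title) of the last track started at or before pressed_at."""
    entries = PLAYLISTS.get(station)
    if not entries:
        return None  # station not covered by the playlist database
    starts = [start for start, _, _ in entries]
    index = bisect_right(starts, pressed_at) - 1
    if index < 0:
        return None  # button pressed before the first logged track
    _, artist, title = entries[index]
    return artist, title
```

Since the keychain devices do not store the station, a server would have to run this lookup once for each station the listener previously registered and present all candidates. A station missing from the database returns `None`, which corresponds to the coverage limitation noted above.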
Yet another prior art means for identifying audio tracks is performed using audio fingerprinting. Audio fingerprinting typically uses software to identify a song by comparing a unique audio identifier or fingerprint (hereinafter "fingerprint") of an audio sample to a database of known "fingerprints" associated with known audio samples.
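The comparison of a sample's fingerprint to a database of known fingerprints can be illustrated with a toy example. The sketch below is not the method of any named product: it assumes a deliberately simple fingerprint (one bit per frame boundary, set when frame energy rises) and a Hamming-distance threshold for partial matches, whereas practical systems rely on richer acoustical features. All names and parameters are hypothetical.

```python
def fingerprint(samples, frame_size=4):
    """Toy fingerprint: one bit per frame boundary, set when frame energy rises."""
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return tuple(int(b > a) for a, b in zip(energies, energies[1:]))


def hamming(a, b):
    """Count differing bits between two equal-length fingerprints."""
    return sum(x != y for x, y in zip(a, b))


def best_match(sample_fp, database, max_distance=1):
    """Return the name of the closest known fingerprint, or None if none is close."""
    best_name, best_distance = None, max_distance + 1
    for name, known_fp in database.items():
        if len(known_fp) != len(sample_fp):
            continue  # this toy only compares fingerprints of equal length
        distance = hamming(sample_fp, known_fp)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name
```

Because the bits depend only on whether energy rises between frames, a louder or slightly noisy copy of a known track can still produce the same fingerprint, while an unrelated sample falls outside the distance threshold and yields `None`.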
A number of service providers and/or software applications utilize digital fingerprinting techniques to identify audio tracks. For example, CLANGO, a software product made by AUDIBLE MAGIC CORP of Los Gatos, California, uses digital fingerprinting to identify streaming audio broadcasts that do not provide associated audio track metadata. The fingerprinting performed by AUDIBLE MAGIC CORP is described in U.S. Patent No. 5,918,223.
Another provider of similar audio fingerprinting technology is AUDITUDE, whose software product ID3MAN is aimed at users who possess a collection of digital audio files whose associated identification data is either incorrect or incomplete. Through a combination of techniques, including audio fingerprinting, ID3MAN identifies the audio files and subsequently corrects the identification data associated with those files.
A drawback of these fingerprinting devices or services is that they do not provide any benefit to users listening to music away from their desktop computers (except in the case of a CDDB enabled CD player, which requires the device to store an extremely large GRACENOTE database, and which has its own associated drawbacks, as described above).
A further means for identifying audio uses a cellular telephone network, where upon hearing the audio that the user wants to identify, the user calls a designated number to have that audio identified for them. There are at least two methods that are used to provide this service.
The first method, which was offered under the name BUZZHITS (now defunct), allowed the user to call a designated number and enter a user identifier which identified the caller (and the caller's geographic market) and then prompted the user for the broadcast frequency of the radio station broadcasting the audio to be identified. Once the broadcast frequency was supplied, the user was provided with sample audio clips, from which the user selected a sample audio clip to obtain the identity of the audio track. This information was also emailed to the user.
While this phone service solves some of the above described drawbacks, it still requires the user to manually interact with the device and the user is forced to interact at exactly the time the audio is heard, which is often inconvenient.
What is more, none of the above described audio identification solutions automatically perform additional actions once an audio track has been identified. As more music becomes available for download through the emerging subscription services, consumers will desire an option to purchase and download the music they hear from a variety of sources. Completing a transaction using the above products/services is inherently a multi-step manual process that requires interaction with the Internet and a desktop computer, or cellular phone.
In light of the above, there is a need for an audio identification device and method that addresses the abovementioned drawbacks, while being convenient and easy to use, and providing accurate identification at a low associated cost per identification.
BRIEF SUMMARY OF THE INVENTION
One embodiment of the invention includes a portable device that can record from a microphone, audio player, and/or radio receiver. To identify an audio track, this portable device, when actuated, records an audio sample of an audio track being played through the player or being received by the radio receiver. If the portable device is not currently playing audio or receiving a radio broadcast, it can record the audio sample through the microphone. This recorded audio sample is stored on the portable device's internal storage, and later, when connected to a client computer, is uploaded to that client computer. The client computer then processes the audio sample to generate a "fingerprint" of the audio sample that is then compared to a fingerprint database either on the client computer locally or on an identification server (ID server) coupled to the client computer via the Internet. Once the fingerprint has been identified, the title and artist information is returned to the client computer and ultimately displayed on the portable device.
Alternatively, the portable device itself processes the recorded audio sample and generates the fingerprint. This has the advantage of reducing the amount of storage space needed to store the audio samples, as only the fingerprint is stored on the portable device. This embodiment, however, requires that the portable device have adequate processor power to perform fingerprinting.
As an adjunct, in addition to the return of artist and title information, the device also performs additional actions once the audio sample or track has been identified. Examples of additional actions include downloading the identified audio track from a subscription service, recommending more audio tracks similar to the identified audio track, and obtaining prices of the identified audio track from Internet music merchants. These additional actions are preferably selected by choosing a menu item from the player's display, and can be customized and downloaded from third party service providers.
According to the invention there is provided a method for identifying audio on a portable device. An audio sample is recorded on a portable device from an audio track. The audio sample is then stored in a cache on the portable device. The audio sample is transmitted to a computing device to be identified. The audio sample is received by the computing device and fingerprinting is performed on the audio sample to obtain a unique audio fingerprint for the audio sample. A fingerprint database is then searched for a match of the fingerprint to a known fingerprint of a previously identified audio track. A match is located and identification data associated with the previously identified audio track is sent to the portable device. Identification of the audio sample is then received and displayed on the portable device. In another embodiment, the fingerprinting is performed on the portable device.
In another embodiment, a radio broadcast is received and played on the portable device. An instruction to identify an audio track of the radio broadcast is received, and the radio broadcast's broadcast frequency and the date and time that the portable device received the instruction to identify the audio track are automatically recorded. The broadcast frequency, date, and time, along with a unique device identifier, are then transmitted to a computing device to be identified. This data is received by the computing device. A playlist database is then searched for a match of the broadcast frequency, date, and time to a known radio station's broadcast frequency based on the user's geographic location determined by the unique device identifier, and known date and time that an audio track was broadcast by the radio station. The audio track associated with the broadcast frequency, date and time is located and sent to the portable device. The portable device thereafter receives and displays information associated with the identified audio track. In another embodiment, the fingerprinting is performed on the portable device.
According to the invention there is also provided a portable device, computing device, and identification server for performing the above described methods.
Therefore, by combining this functionality with a portable music player, the process of identification can be automated. In addition, the possible range of uses of the device can be broadened to cover a broader array of music, and music sources, where additional functionality can be provided for little additional cost. Also, the device can facilitate compiling a database on radio station playlists. Additional action can be initiated at the device, where the additional actions can be personalized to a user's preferences.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of the nature and objects of the invention, reference should be made to the following detailed description, taken in conjunction with the accompanying drawings, in which:
Figure 1 is a diagrammatic view of a system for identifying audio on a portable device, according to an embodiment of the invention;
Figure 2 is a block diagram of the identification server (ID server) and/or client computer shown in Figure 1;
Figure 3 is a block diagram of the portable device shown in Figure 1;
Figure 4 is a flow chart of a method for identifying audio, where the identification is performed by an ID server, according to an embodiment of the invention;
Figure 5 is a flow chart of another method for identifying audio, where the identification is performed by a client computer, according to another embodiment of the invention;
Figure 6 is a flow chart of yet another method for identifying audio, where the identification is performed by a portable device, according to yet another embodiment of the invention;
Figure 7 is a flow chart of a method for identifying audio from a radio broadcast, where the identification is performed by an identification server (ID server), according to an embodiment of the invention;
Figure 8 is a flow chart of another method for identifying audio from a radio broadcast, where the identification is performed by a client computer, according to another embodiment of the invention; and
Figure 9 is a flow chart of yet another method for identifying audio from a radio broadcast, where the identification is performed by a portable device, according to yet another embodiment of the invention.
Like reference numerals refer to corresponding parts throughout the several views of the drawings.
DETAILED DESCRIPTION OF THE INVENTION
Figure 1 is a block diagram of a system 100 for identifying audio, according to an embodiment of the invention. The system 100 comprises at least one identification server 102 (hereinafter "ID server") and at least one client computer 106 coupled to one another via a network 104. The ID server 102 and client computer 106 are any type of computing devices. However, in one embodiment the client computer 106 is a desktop computer and the network 104 is the Internet.
The client computer 106 is coupled to the network 104 by any suitable communication link 108, such as Ethernet, coaxial cables, copper telephone lines, optical fibers, wireless, infra-red, or the like. A portable audio identification device 112 (hereinafter "portable device") is coupled to the client computer 106. The portable device 112 is preferably sized to be carried in the palm of one's hand. The portable device 112 couples to the client computer 106 by any suitable communication link 110, such as Universal Serial Bus (USB), Firewire, Ethernet, coaxial cable, copper telephone line, optical fiber, wireless, infra-red, or the like.
In an alternative embodiment, the client computer 106 is a fixed wireless base station coupled to a gateway/modem that is in turn connected to the network 104. For example, the client computer 106 might be a WiFi (Wireless Fidelity - IEEE 802.11b wireless networking) base station coupled to the network 104 via a Digital Subscriber Line (DSL) gateway (not shown). In this embodiment, the communication link from the portable device 112 to the client computer 106 is a WiFi wireless communication link.
In yet another embodiment, no client computer 106 is present and the portable device 112 communicates directly with the ID server 102. For example, the portable device 112 includes cellular telephone communication circuitry which communicates with the ID server 102 via a cellular telephone network (network 104).
In a further embodiment, the portable device 112 alone is necessary to identify audio. For example, the portable device periodically downloads updated playlists and/or fingerprinting databases from the network 104, as explained in further detail below in relation to Figures 6 and 9.
In one embodiment, a playlist provider 114 and fingerprint provider 116 may also be coupled to the network 104. Here, the playlist provider 114 is a server that supplies updated playlists to the ID server 102, client computer 106, and/or portable device 112, while the fingerprint provider 116 is a server that supplies updated fingerprint data for new audio tracks to the ID server 102, client computer 106, and/or portable device 112.
Figure 2 is a block diagram of the ID server 102 and/or client computer 106 shown in Figure 1. ID server 102 and/or client computer 106 are shown in one diagram to avoid repetition. It should, however, be appreciated that all the elements of the ID server 102 and/or client computer 106 listed below need not be present in all embodiments of the invention and are merely included for exemplary purposes.
The ID server 102 and/or client computer 106 preferably include: at least one data processor or central processing unit (CPU) 202; a memory 210; user interface devices 206, such as a monitor and keyboard; communications circuitry 204 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), client computer 106 (Figure 1) and/or portable device 112 (Figure 1); and at least one bus 208 that interconnects these components.
Memory 210 preferably includes an operating system 212, such as VXWORKS, LINUX, or WINDOWS, having instructions for processing, accessing, storing, or searching data, etc. Memory 210 also preferably includes communications procedures 214 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), client computer 106 (Figure 1) and/or portable device 112 (Figure 1); fingerprinting procedures 216; searching procedures 218; a fingerprinting database 220; a radio playlist database 224; a geographic identifier 234; a "no identification" message 236; and a cache 238 for temporarily storing data.
The fingerprinting procedures 216 are used to obtain a unique identifier or fingerprint for an audio sample of an audio track, as described in further detail below in relation to Figures 4 and 5. The fingerprinting procedures 216 include instructions for performing fingerprinting on the audio sample to obtain a unique audio fingerprint for the audio sample.
The searching procedures 218 are used for searching the fingerprinting database 220 in order to attempt to identify audio, as described in further detail below in relation to Figures 4 to 6. The fingerprinting database 220 includes numerous fingerprints of known audio samples or audio tracks and their associated identification data 222(1)-(N), such as song title, artist, or the like.
In an alternative embodiment, a radio playlist database 224 is provided. In this embodiment, the radio playlist database 224 includes numerous radio frequencies 226(1)-(N) and an associated playlist 228(1)-(N) for each frequency 226(1)-(N). Each playlist 228(1)-(N) includes a date 230(1)-(N) and time 232(1)-(N), and the identity 232(1)-(N) of each audio track broadcast at that date and time. For example, radio station KJAZ may have a frequency of 98.7FM, and a playlist that includes Frank Sinatra's "New York, New York" broadcast on January 21, 2002 at 9:00AM.
As multiple radio stations across the world share the same frequencies, a geographic identifier 234 is provided to identify the radio stations or frequencies 226(1)-(N) in a particular geographic area. This geographic identifier 234 may be provided by any suitable means. In one embodiment the user supplies the geographic identifier. In another embodiment, the geographic identifier 234 is obtained from the user's unique network address. For example, an Internet Protocol (IP) address of the client computer 106 and/or portable device 112 can be used to approximate the geographic area of the user. In still another embodiment, a Global Positioning System (GPS) incorporated into the client computer 106 and/or portable device 112 can be used to determine the geographic area of the user. If the ID server 102 and/or client computer 106 cannot identify an audio track, the "no identification" message 236 is used to inform the user that no identification can be made. Alternatively, prior to receiving a no identification message, the user may be presented with a number of "closest match" possible identifications.
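By way of illustration only, the playlist lookup described above might be sketched in Python as follows. The names `PlaylistEntry`, `PLAYLISTS`, and `identify_broadcast`, the area code "SF", the second playlist entry, and the assumption that each track stays on air until the next entry begins are all hypothetical and are not taken from the disclosure; the first entry reuses the KJAZ example.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime

NO_IDENTIFICATION = "no identification"  # stands in for message 236


@dataclass(frozen=True)
class PlaylistEntry:
    start: datetime  # date and time the track began broadcasting
    artist: str
    title: str


# Hypothetical playlist database keyed by (geographic identifier, frequency).
# Each playlist is kept sorted by start time.
PLAYLISTS = {
    ("SF", "98.7FM"): [
        PlaylistEntry(datetime(2002, 1, 21, 9, 0), "Frank Sinatra", "New York, New York"),
        # Placeholder entry, invented for the example.
        PlaylistEntry(datetime(2002, 1, 21, 9, 4), "Example Artist", "Example Track"),
    ],
}


def identify_broadcast(area, frequency, when):
    """Return the entry on air at `when`, or the no-identification message."""
    playlist = PLAYLISTS.get((area, frequency))
    if not playlist:
        return NO_IDENTIFICATION
    starts = [entry.start for entry in playlist]
    index = bisect_right(starts, when) - 1  # last entry starting at or before `when`
    if index < 0:
        return NO_IDENTIFICATION
    return playlist[index]
```

Keying the database on both the geographic identifier and the frequency reflects the point that the same frequency may belong to different stations in different areas.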
It should be appreciated by one skilled in the art that certain elements of these devices need not be present on both the ID server 102 and the client computer 106. For example, the fingerprinting procedures 216, searching procedures 218, and fingerprinting database 220 may only be necessary on the device on which fingerprinting of an audio track occurs. In other words, if fingerprinting occurs on the ID server 102 then the aforementioned elements of memory 210 need only be present on the ID server 102. Likewise, in the embodiment where identification occurs on the portable device 112 (Figure 1), the aforementioned elements of memory 210 are not provided on either the ID server 102 or the client computer 106.
Figure 3 is a block diagram of the portable device 112 shown in Figure 1. It should be appreciated by one skilled in the art that all the elements of the portable device 112 listed below need not be present in all embodiments of the invention and are merely included for exemplary purposes.
The portable device 112 preferably includes: at least one data processor or central processing unit (CPU) 302; a memory 310; user interface devices 308, such as buttons, a screen, and a headset; communications circuitry 304 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), and/or client computer 106 (Figure 1); one or more audio players 350, such as a CD or MP3 player; a microphone 352; a radio receiver 354 and antenna 356 for receiving radio broadcasts; and at least one bus 306 that interconnects these components.
Memory 310 preferably includes an operating system 312, such as VXWORKS, LINUX, or WINDOWS, having instructions for processing, accessing, storing, or searching data, etc. Memory 310 also preferably includes communications procedures 314 for communicating with the network 104 (Figure 1), ID server 102 (Figure 1), and/or client computer 106 (Figure 1); fingerprinting procedures 316; searching procedures 318; a fingerprinting database 320; a radio playlist database 324; a geographic identifier 334; geographic identification procedures 336; a "no identification" message 338; recording procedures 340; player procedures 342; radio procedures 344; a cache 346 for temporarily storing data; frequency detection procedures 358; and a clock 360.
The fingerprinting procedures 316 are used to obtain a unique identifier or fingerprint for an audio sample of an audio track, as described in further detail below in relation to Figure 6.
Also, in this embodiment the searching procedures 318 are used for searching the fingerprinting database 320 in order to attempt to identify audio, as described in further detail below. The fingerprinting database 320 includes numerous fingerprints of known audio samples or audio tracks and their associated identification data 322(1)-(N), such as song title, artist, or the like. In an alternative embodiment, a radio playlist database 324 is provided. In this embodiment, the radio playlist database 324 includes numerous radio frequencies 326(1)-(N) and an associated playlist 328(1)-(N) for each frequency 326(1)-(N). Each playlist 328(1)-(N) includes a date 330(1)-(N) and time 332(1)-(N), and the identity 332(1)-(N) of each audio track broadcast at that date and time.
Also for the above alternative embodiment, a geographic identifier 334 is provided to assist in identifying the radio stations or frequencies 326(1)-(N) in a particular geographic area. For example, the geographic identifier 334 may select from a set of frequencies stored on the device based on the identified geographic area. This geographic identifier 334 may be provided by any suitable means. In one embodiment the user supplies the geographic identifier 334. In another embodiment, the geographic identifier 334 is obtained by the geographic identification procedures 336. As described above, this can be determined from the user's unique network address. For example, an Internet Protocol (IP) address of the portable device 112 can be used to approximate the geographic area of the user. In still another embodiment, a Global Positioning System (GPS) (not shown) incorporated into the portable device 112 can be used to determine the geographic area of the user.
In all embodiments, a "no identification" message 338 is used to inform the user that no identification can be made, if the portable device 112 cannot identify the audio track.
In the embodiments of the invention where fingerprinting of the audio track is used to identify the audio, the recording procedures 340 record an audio sample 348, which is stored in the cache 346. The audio sample is recorded from the audio player/s 350, microphone 352, and/or the radio receiver 354.
In the embodiment of the invention where a date, time, and radio station or broadcast frequency are used to identify audio from a radio station playlist, the recording procedures 340 are used to record the date, time, and broadcast or radio station frequency 349, which is stored in the cache 346.
The player procedures 342 are preferably provided to play audio on the audio player/s 350. These player procedures 342 are especially needed for playing digital audio, such as MP3 audio tracks, or the like.
The radio procedures 344 are preferably provided to play radio received at the antenna 356 and fed through the radio receiver 354. It should, however, be appreciated that all the aforementioned components of the memory 310 need not be present in all embodiments of the invention and are merely included for exemplary purposes.
The frequency detection procedures 358 are used to detect the frequency of a radio station broadcast, and the clock 360 is used to keep the date and time. The frequency detection procedures 358 and clock 360 are explained in further detail below in relation to Figures 7-9.
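As a hedged sketch only, the data recorded for a radio identification request (the detected frequency, the date and time from the clock, and a device identifier) might be captured in Python as shown below. The names `BroadcastRequest` and `record_request`, the identifier format, and the use of an injectable `clock` callable are assumptions made for the example and do not appear in the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BroadcastRequest:
    device_id: str          # unique device identifier (hypothetical format)
    frequency: str          # e.g. "98.7FM", as reported by the tuner
    requested_at: datetime  # date and time the instruction was received


def record_request(cache, device_id, tuned_frequency, clock=datetime.now):
    """Append a request to the cache when an identification is asked for."""
    request = BroadcastRequest(device_id, tuned_frequency, clock())
    cache.append(request)
    return request
```

Passing the clock in as a parameter is a design convenience for testing; on a device it would simply read the on-board clock.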
Figure 4 is a flow chart of a method for identifying audio, where the identification is performed by the identification server (ID server) 102, according to an embodiment of the invention. In one embodiment, the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112 play at step 402 audio through the user interface devices 308 (Figure 3). For example, a built-in MP3 player plays audio to a user through a headset. In another embodiment, the radio receiver 354 (Figure 3) and/or radio procedures 344 (Figure 3) receive and play a radio broadcast through the portable device's headset.
Instructions are then received at step 404 to identify the audio. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device, or the like. In an alternative embodiment, the instruction to record is received automatically. For example, an audio sample is automatically recorded every 2 minutes. The steps 402 and 404 of playing audio and receiving instructions are not essential to the invention and in some embodiments need not occur.
An audio sample is then recorded at step 406 by the recording procedures 340 (Figure 3) and saved as an audio sample 348 (Figure 3) in the cache 346 (Figure 3). In one embodiment, audio is recorded continuously and automatically segmented into audio samples having sufficient length to undergo fingerprinting. For example, audio is continually recorded and automatically segmented into 30 second audio samples that are continually sent to the ID server 102 to be identified.
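For illustration only, the automatic segmentation into fixed-length audio samples might be sketched in Python as follows. The function name `segment_audio`, the representation of audio as a sequence of PCM values, and the choice to drop a trailing remainder shorter than the segment length are assumptions of this sketch, not details given in the disclosure.

```python
def segment_audio(pcm, sample_rate, seconds=30):
    """Split a sequence of PCM samples into fixed-length audio samples.

    A trailing remainder shorter than `seconds` is dropped, on the
    assumption that it is too short to undergo fingerprinting.
    """
    length = sample_rate * seconds
    return [pcm[i:i + length] for i in range(0, len(pcm) - length + 1, length)]
```

For example, 70 seconds of audio would yield two 30 second samples, with the final 10 seconds held back until more audio is recorded.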
Where audio is not being played by the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112, audio is recorded at step 406 through the microphone 352 (Figure 3).
In the embodiment where the portable device 112 couples to network 104 via the client computer 106, the communication procedures 314 (Figure 3) transmit the audio sample to the client computer 106 at step 408. As mentioned previously, this communication occurs over communications link 110, such as a serial port connection, wireless connection, or the like. The audio sample is then received by the client computer and sent at step 410 to the ID server 102.
It should be appreciated that where the portable device does not have a persistent communication link with the client computer 106 and/or ID server 102, then the audio samples are saved in the cache 346 (Figure 3) until such time as a connection is established between the portable device 112 and the client computer 106 and/or ID server 102.
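The caching behavior described above might be sketched, purely as an example, in Python as follows. The class name `SampleCache` and the `send` and `connected` callables are hypothetical stand-ins for the cache 346 and the communication procedures 314; the disclosure does not specify such an interface.

```python
from collections import deque


class SampleCache:
    """Holds audio samples until a link to the client computer or ID server exists."""

    def __init__(self):
        self._pending = deque()

    def store(self, sample):
        self._pending.append(sample)

    def flush(self, send, connected):
        """Send cached samples oldest-first while `connected()` is true.

        Returns the number of samples sent; unsent samples stay cached.
        """
        sent = 0
        while self._pending and connected():
            send(self._pending[0])
            self._pending.popleft()  # remove only after the send returns
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)
```

Removing a sample only after it has been sent means that a sample is not lost if the connection drops partway through a flush.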
In an alternative embodiment, where no client computer is present, such as where the portable device 112 communicates with the ID server 102 via a cellular telephone network, the audio sample is transmitted at step 408 directly to the ID server 102.
The ID server 102 then receives the audio sample at step 412. The fingerprinting procedures 216 (Figure 2) on the ID server 102 subsequently perform at step 414 fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features. Such characteristics or acoustical features include the audio sample's analog waveform, loudness, pitch, brightness, bandwidth, Mel Frequency Cepstral Coefficients (MFCC), or the like. One suitable method for fingerprinting an audio sample is disclosed in U.S. patent No. 5,918,223, which is incorporated herein by reference.
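To make the idea of a feature-based fingerprint concrete, a deliberately simplified Python sketch follows. It is not the method of U.S. patent No. 5,918,223: it computes only two features per frame, root-mean-square loudness and zero-crossing rate (a crude proxy for pitch or brightness), and the function names and the frame size of 1024 samples are assumptions made for the example.

```python
import math


def frame_features(frame):
    """Return (RMS loudness, zero-crossing rate) for one frame of samples."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return rms, crossings / (len(frame) - 1)


def fingerprint(samples, frame_size=1024):
    """Concatenate per-frame features into a tuple usable as a toy fingerprint."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        features.extend(frame_features(samples[start:start + frame_size]))
    return tuple(features)
```

A practical fingerprint would use richer features, such as the MFCCs mentioned above, and would need to be robust to noise and to the sample starting at an arbitrary point in the track.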
The fingerprint database 220 (Figure 2) is then searched at step 416 for a match or partial match of the fingerprint of the audio sample recorded on the portable device to a known fingerprint of a previously identified audio track. If a match is not located (418 - No), then a "no identification" message 236 (Figure 2) is sent at step 430 to the portable device 112. In the embodiment where the portable device 112 couples to the ID server 102 via the client computer, the client computer receives the "no identification" message and sends it to the portable device at step 432.
The portable device then receives at step 434 the "no identification" message, which is displayed at step 436 to inform the user that the audio could not be identified. Such a message is preferably displayed on a screen on the portable device.
If a match of said fingerprint to a known fingerprint of a previously identified audio track is located (418 - Yes), then identification data associated with said previously identified audio track is sent at step 420 to the portable device 112. In the embodiment where the portable device 112 couples to the ID server 102 via the client computer, the client computer receives the identification of the audio sample and sends the identification to the portable device at step 422.
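The search for a match or partial match, with a fallback to the "no identification" message, might be sketched in Python as follows. This is an assumption-laden illustration: fingerprints are modeled as equal-length tuples of numbers, similarity is measured by Euclidean distance, and `max_distance` is a hypothetical threshold; the disclosure does not specify how matches are scored.

```python
import math

NO_IDENTIFICATION = "no identification"  # stands in for message 236


def find_match(query, database, max_distance=0.1):
    """Return identification data for the closest known fingerprint.

    `database` maps fingerprint tuples to identification data. A candidate
    counts as a match (or partial match) only if its Euclidean distance to
    `query` is at most `max_distance`, a hypothetical tuning threshold.
    """
    best_data, best_distance = None, math.inf
    for known, data in database.items():
        if len(known) != len(query):
            continue  # fingerprints of different lengths are not comparable here
        distance = math.dist(known, query)
        if distance < best_distance:
            best_data, best_distance = data, distance
    if best_distance <= max_distance:
        return best_data
    return NO_IDENTIFICATION
```

A linear scan is adequate for a sketch; a database of many fingerprints would more plausibly use an index structure suited to nearest-neighbor search.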
The portable device receives and displays the identification of the audio sample at steps 424 and 426 respectively. For example, an artist and song title is displayed. In a preferred embodiment, additional actions are then taken at step 428. These additional actions are performed only after the identity of the audio sample is known, and for example include the client computer 106 and/or portable device 112 automatically displaying the identified artist's Web-page, biography, discography; automatically displaying a Web-page selling the artist's song or album; downloading the audio track from a subscription service; recommending similar audio; obtaining prices of the identified audio track from Internet music merchants; or the like. These additional actions are preferably selected by choosing a menu item from the portable device's display, and can be customized and/or downloaded from third party service providers. The portable device can also return information about commercials or other spoken word recordings. Further, these additional actions can be taken on all audio samples where the identity of the audio is known or unknown. For example, the user may have downloaded a digital audio file and may have the complete or partial identity of the song known, but still want to send the song for identification to receive additional information on the audio track or take some action on that audio track.
Figure 5 is a flow chart of another method for identifying audio, where the identification is performed by a client computer 106, according to another embodiment of the invention. In one embodiment, the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112 play audio at step 502 through the user interface devices 308 (Figure 3). For example, a built-in MP3 player plays audio to a user through a headset.
Instructions are then received at step 504 to identify audio. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device. In an alternative embodiment, the instruction to record is received automatically. For example, an audio sample is automatically recorded every 2 minutes. It should be noted that the steps 502 and 504 of playing audio and receiving instructions to identify the audio are not essential to the invention and in some embodiments need not occur.
An audio sample is then recorded at step 506 by the recording procedures 340 (Figure 3) and saved as an audio sample 348 (Figure 3) in the cache 346 (Figure 3). In one embodiment, audio is recorded continuously and automatically segmented into audio samples having sufficient length to undergo fingerprinting. For example, audio is continually recorded and automatically segmented into 30 second audio samples that are continually identified by the client computer 106.
Where audio is not being played by the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112, audio is recorded at step 506 through the microphone 352 (Figure 3).
In the embodiment where the portable device 112 couples to the client computer 106 via a wireless connection, the communication procedures 314 (Figure 3) continue transmitting audio samples to the client computer 106 at step 508 until the wireless connection is brought down. The audio sample is then received at step 510 by the client computer 106.
It should be appreciated that where the portable device does not have a persistent communication link with the client computer 106, the audio samples are saved in the cache until such time as a connection is established between the portable device 112 and the client computer 106.
The fingerprinting procedures 216 (Figure 2) on the client computer 106 subsequently perform at step 512 fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features, as described above.
In yet another embodiment, the fingerprinting procedures 316 (Figure 3) on the portable device 112 perform fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features. This fingerprint is then sent to the client computer 106, which searches the fingerprint database at step 514.
The fingerprint database 220 (Figure 2) is then searched at step 514 for a match or partial match to the fingerprint of the audio sample recorded on the portable device. If a match is not located (516 - No), then a "no identification" message 236 (Figure 2) is sent at step 526 to the portable device 112, which receives the "no identification" message at step 528 and displays it at step 530 to inform the user that the audio could not be identified. Such a message is preferably displayed on a screen on the portable device. If a match is located (516 - Yes), then an identification of the audio sample is sent at step 518 to the portable device 112. The portable device receives and displays the identification of the audio sample at steps 520 and 522, preferably on the portable device's screen. For example, an artist and song title is displayed. In a preferred embodiment, additional actions are then taken at step 524, as described above.
Figure 6 is a flow chart of yet another method for identifying audio, where the identification is performed by a portable device 112, according to yet another embodiment of the invention. In one embodiment, the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112 play audio at step 602 through the user interface devices 308 (Figure 3). For example, a built-in MP3 player plays audio to a user through a headset. An instruction to identify audio is then received at step 604. In an alternative embodiment, the instruction to record is received automatically. It should be noted that the steps 602 and 604 of playing audio and receiving instructions to identify the audio are not essential to the invention and in some embodiments need not occur.
An audio sample is then recorded at step 606 by the recording procedures 340 (Figure 3) and saved as an audio sample 348 (Figure 3) in the cache 346 (Figure 3). In one embodiment, audio is recorded continuously and automatically segmented into audio samples having sufficient length to undergo fingerprinting. The audio samples 348 (Figure 3) are preferably temporarily saved in the cache 346 (Figure 3).
Where audio is not being played by the audio player/s 350 (Figure 3) and/or player procedures 342 (Figure 3) of the portable device 112, audio is recorded at step 606 through the microphone 352 (Figure 3).
The fingerprinting procedures 316 (Figure 3) on the portable device 112 subsequently perform at step 608 fingerprinting on the audio sample to determine a unique identifier or fingerprint for the audio sample based on the audio sample's characteristics or acoustical features. Such characteristics or acoustical features include the audio sample's analog waveform, loudness, pitch, brightness, bandwidth, Mel Frequency Cepstral Coefficients (MFCC), or the like. One suitable method for fingerprinting an audio sample is disclosed in U.S. patent No. 5,918,223, which is incorporated herein by reference.
The fingerprint database 320 (Figure 3) is then searched at step 610 for a match or partial match to the fingerprint of the audio sample recorded on the portable device. If a match is not located (612 - No), then a "no identification" message 338 (Figure 3) is displayed at step 614 informing the user that the audio could not be identified. Such a message is preferably displayed on a screen on the portable device.
If a match is located (612 - Yes), then an identification of the audio sample is displayed at step 616, preferably on the portable device's screen. For example, an artist and song title is displayed. In a preferred embodiment, additional actions are then taken at step 618, as described above.
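The search for a match or partial match at steps 610 through 616 might look like the sketch below. It assumes, purely for illustration, that fingerprints are integers compared by Hamming distance and that the database is an in-memory dictionary; the specification does not prescribe either choice, and the names `identify` and `message_for` are hypothetical.

```python
from typing import Dict, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def identify(fp: int, database: Dict[int, str],
             max_distance: int = 2) -> Optional[str]:
    """Return the label of the closest known fingerprint, or None.

    A distance of 0 is an exact match; distances up to `max_distance`
    are treated as partial matches.
    """
    best_label, best_dist = None, max_distance + 1
    for known_fp, label in database.items():
        dist = hamming(fp, known_fp)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def message_for(fp: int, database: Dict[int, str]) -> str:
    """Text to show on the device screen (steps 614 and 616)."""
    label = identify(fp, database)
    return label if label is not None else "no identification"
```

A linear scan is adequate for a small on-device database; a larger database would need an index, which is outside the scope of this sketch.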
One of the advantages to performing fingerprinting on the portable device is that it saves memory on the portable device, as fingerprints are substantially smaller than the audio samples. In addition, another embodiment of the invention identifies audio using a peer to peer network consisting of networked portable devices only. For example, if the device has wireless networking capabilities, and fingerprinting is performed on the portable device, then identification of the audio sample may occur by searching for fingerprints on other networked portable devices.
Another embodiment, where fingerprinting is performed on the portable device, provides central kiosks, such as at record stores, where identification of the fingerprint may be performed. This embodiment alleviates the load placed on such kiosks, as they do not generate the fingerprint and less data has to be transferred to and maintained at each kiosk.
Figure 7 is a flow chart of a method for identifying audio from a radio broadcast, where the identification is performed by an identification server (ID server) 102, according to an embodiment of the invention. The radio receiver 354 (Figure 3) and/or radio procedures 344 (Figure 3) of the portable device 112 receive and play a radio broadcast at step 702. An instruction is then received at step 704 to identify an audio track of the radio broadcast. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device, or the like. In an alternative embodiment, the instruction to identify the audio track is received automatically. For example, an attempt to identify the audio track occurs automatically every 2 minutes.
The recording procedures 342 (Figure 3) then record and store the radio station's broadcast frequency at step 706, and the date and time that the portable device 112 received the instruction to identify the audio track. For example, a broadcast frequency of 95.7 MHz, date of February 23, 2002, and a time of 113H00 are recorded. The recording procedures 340 (Figure 3) obtain the frequency from the frequency detection procedures 358 (Figure 3) and the date and time from the clock 360 (Figure 3). In their simplest form, the frequency detection procedures 358 (Figure 3) do nothing more than ascertain what radio frequency the user has selected, i.e., by reading the value that the radio receiver has been tuned to. Alternatively, the frequency detection procedures 358 (Figure 3) may detect the identity of the broadcast, which is often broadcast together with the audio signal. This uses the Radio Data System (RDS), which typically transmits the actual station identification. Such a station identification is more reliable than a frequency because it is not dependent on geography. RDS also transmits information about the owner of the radio station, which would unambiguously define which playlist to search. Still other frequency detection procedures 358 (Figure 3) may detect the frequency of the broadcast by detecting the frequency that the radio receiver is tuned to. It should be appreciated that the frequency of the broadcast is determined automatically, i.e., the user does not supply the radio frequency to the frequency detection procedures 358 (Figure 3).
The date and time on the clock 360 (Figure 3) initially can be set by the user, or the portable device can automatically set the clock using known techniques for remotely synchronizing a clock from a reliable time source, such as an atomic clock. Such techniques for remotely synchronizing a clock are disclosed in U.S. Patent Nos. 4,823,328 and 4,768,178, both of which are incorporated herein by reference.
The recorded date, time, and frequency (hereinafter "clock/frequency data") 349 (Figure 3) are then stored in the cache 346 (Figure 3). The portable device 112 then transmits the clock/frequency data 349 (Figure 3) at step 708.
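One way to picture the clock/frequency data is as a small record that is captured when the instruction arrives and then serialized for transmission. The sketch below is an assumption-laden illustration: the class name `ClockFrequencyData`, the field names, the choice of megahertz as the unit, and the JSON wire format are not taken from the specification, and the example values in the usage note are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass(frozen=True)
class ClockFrequencyData:
    """Tuned frequency plus the moment the identify instruction arrived."""
    frequency_mhz: float   # unit is an assumption for this sketch
    requested_at: str      # ISO 8601 date and time read from the device clock


def capture(tuned_frequency_mhz: float, now: datetime) -> ClockFrequencyData:
    """Build the record from the tuner value and the clock reading."""
    return ClockFrequencyData(tuned_frequency_mhz,
                              now.isoformat(timespec="seconds"))


def encode(record: ClockFrequencyData) -> bytes:
    """Serialize the record for sending over the communications link."""
    return json.dumps(asdict(record), sort_keys=True).encode("utf-8")
```

For instance, `encode(capture(95.7, datetime(2002, 2, 23, 11, 0)))` produces a payload of a few dozen bytes, which illustrates why this approach needs far less bandwidth than sending an audio sample.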
In the embodiment where the portable device 112 couples to ID server 102 via the client computer 106, the communication procedures 314 (Figure 3) transmit the clock/frequency data 349 (Figure 3) to the client computer 106 at step 708. As mentioned previously, this communication occurs over communications link 110 (Figure 1), such as a serial port connection, wireless connection, or the like. The clock/frequency data 349 (Figure 3) is then received by the client computer and sent to the ID server 102 at step 710.
It should be appreciated that where the portable device does not have a persistent communication link with the client computer 106 and/or ID server 102, then the clock/frequency data 349 (Figure 3) are saved in the cache until such time as a connection is established between the portable device 112 and the client computer 106 and/or ID server 102. In an alternative embodiment, where no client computer is present, such as where the portable device 112 communicates directly with the ID server 102 via a cellular telephone network, the clock/frequency data 349 (Figure 3) are transmitted directly to the ID server 102 at step 708.
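The behaviour of holding data in the cache until a connection exists can be sketched as a simple first-in, first-out queue that is flushed when a link becomes available. The class name `PendingCache` and the convention that the `send` callback returns `True` on success are assumptions made for this illustration.

```python
from collections import deque
from typing import Callable, Deque


class PendingCache:
    """Holds records until they can be delivered over a communication link."""

    def __init__(self) -> None:
        self._pending: Deque[dict] = deque()

    def store(self, record: dict) -> None:
        """Save a record while no connection is available."""
        self._pending.append(record)

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Try to send records in order; stop at the first failure.

        Returns the number of records delivered. Undelivered records stay
        in the cache for the next attempt.
        """
        sent = 0
        while self._pending:
            if not send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

A record is removed only after `send` reports success, so a link that drops midway through synchronization does not lose requests.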
The ID server 102 then receives the clock/frequency data 349 (Figure 3) at step 712. The radio playlist database 224 (Figure 2) is then searched at step 716 for a match to the clock/frequency data 349 (Figure 3) recorded on the portable device 112. If a match is not located (718 - No), then a "no identification" message 236 (Figure 2) is sent at step 730 to the portable device 112. In the embodiment where the portable device 112 couples to the ID server 102 via the client computer, the client computer receives the "no identification" message and sends it to the portable device at step 732.
At step 734 the portable device then receives the "no identification" message, which is displayed at step 736 to the user informing the user that the audio could not be identified. Such a message is preferably displayed on a screen on the portable device.
If a match is located (718 - Yes), then an identification of the audio track is sent at step 720 to the portable device 112. In the embodiment where the portable device 112 couples to the ID server 102 via the client computer, the client computer receives the identification of the audio sample and sends the identification to the portable device at step 722. The portable device receives and displays the identification of the audio sample, such as an artist and song title, at steps 724 and 726. In a preferred embodiment, additional actions are then taken at step 728, as described above.
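The playlist search at steps 716 through 720 amounts to finding the track that a given station was broadcasting at a given moment. The following sketch assumes a playlist held as a list of entries with start and end times; the names `PlaylistEntry` and `lookup`, the frequency tolerance, and the sample data in the usage note are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class PlaylistEntry:
    frequency_mhz: float
    start: datetime
    end: datetime
    artist: str
    title: str


def lookup(playlist: List[PlaylistEntry], frequency_mhz: float,
           when: datetime, tolerance_mhz: float = 0.05) -> Optional[PlaylistEntry]:
    """Return the entry on the given frequency whose air time covers `when`.

    Returns None when nothing matches, which corresponds to sending the
    "no identification" message.
    """
    for entry in playlist:
        same_station = abs(entry.frequency_mhz - frequency_mhz) <= tolerance_mhz
        if same_station and entry.start <= when < entry.end:
            return entry
    return None
```

Because frequencies are reused in different regions, a deployed lookup would also need the listener's location or a station identifier such as the one carried by RDS; the sketch omits that step.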
Figure 8 is a flow chart of another method for identifying audio from a radio broadcast, where the identification is performed by a client computer 106, according to another embodiment of the invention. The radio receiver 354 (Figure 3) and/or radio procedures 344 (Figure 3) of the portable device 112 receive and play at step 802 a radio broadcast on the portable device 112. Instructions are then received at step 804 to identify an audio track played on the radio. These instructions preferably come from the user, such as by the user depressing an "identify now" button on the portable device, or the like. In an alternative embodiment, the instruction to identify the audio track is received automatically. For example, an attempt to identify the audio track occurs automatically every 2 minutes.
The recording procedures 342 (Figure 3) then store the clock/frequency data 349 (Figure 3) at step 806. In a similar manner to that explained above, the recording procedures 340 (Figure 3) obtain the frequency from the frequency detection procedures 358 (Figure 3) and the date and time from the clock 360 (Figure 3).
The clock/frequency data 349 (Figure 3) are stored in the cache 346 (Figure 3) and subsequently transmitted at step 808 to the client computer 106. As mentioned previously, this communication occurs over communications link 110 (Figure 1), such as a serial port connection, wireless connection, or the like. The clock/frequency data 349 (Figure 3) are then received at step 810 by the client computer 106.
It should be appreciated that where the portable device does not have a persistent communication link with the client computer 106, then the clock/frequency data 349 (Figure 3) are saved in the cache until such time as a connection is established between the portable device 112 and the client computer 106.
The radio playlist database 224 (Figure 2) is then searched at step 814 for a match to the clock/frequency data 349 (Figure 3) recorded on the portable device 112. If a match is not located (816 - No), then a "no identification" message 236 (Figure 2) is sent at step 826 to the portable device 112. The portable device then receives and displays the "no identification" message at steps 828 and 830.
If a match is located (816 - Yes), then an identification of the audio track is sent at step 818 to the portable device 112. The portable device receives and displays the identification of the audio sample, such as an artist and song title, at steps 820 and 822. In a preferred embodiment, additional actions are then taken at step 824, as described above.
Figure 9 is a flow chart of yet another method for identifying audio from a radio broadcast, where the identification is performed by a portable device 112, according to yet another embodiment of the invention. The radio receiver 354 (Figure 3) and/or radio procedures 344 (Figure 3) of the portable device 112 receive and play at step 902 a radio broadcast on the portable device 112. In a similar manner to that described above, instructions are then received at step 904 to identify an audio track played on the radio.
The recording procedures 342 (Figure 3) then store the clock/frequency data 349 (Figure 3) at step 906. In a similar manner to that explained above, the recording procedures 340 (Figure 3) obtain the frequency from the frequency detection procedures 358 (Figure 3) and the date and time from the clock 360 (Figure 3).
The clock/frequency data 349 (Figure 3) are stored in the cache 346 (Figure 3). The radio playlist database 324 (Figure 3) is then searched at step 910 for a match to the clock/frequency data 349 (Figure 3). If a match is not located (912 - No), then a "no identification" message 338 (Figure 3) is displayed at step 914. If a match is located (912 - Yes), then an identification of the audio track is displayed at step 916. In a preferred embodiment, additional actions are then taken at step 918, as described above.
Moreover, the fingerprinting database 220 (Figure 2) and/or 320 (Figure 3) on the ID server 102 (Figure 1), client computer 106 (Figure 1), or portable device 112 (Figure 1), is preferably periodically updated from the fingerprint provider 116 (Figure 1). Similarly, the radio playlist database 224 (Figure 2) and/or 324 (Figure 3) on the ID server 102 (Figure 1), client computer 106 (Figure 1), and/or portable device 112 (Figure 1) is preferably periodically updated from the playlist provider 114 (Figure 1).
In still a further embodiment of the invention, the portable device appends clock/frequency data to the audio sample. This data may be used for many uses, such as determining a user's listening habits for targeted advertising, or the like.
In yet another embodiment of the invention, when a user is listening to a radio broadcast on a secondary device, such as a car radio, he/she can use the portable device to identify the broadcast channel to which the secondary device is tuned and then record from that channel. To do this, the portable device searches through all the channels until it finds the broadcast station whose signal matches the ambient audio as heard through the portable device's microphone. Subsequently, the audio is identified using one of the above described techniques. This embodiment addresses drawbacks associated with recording ambient noise. Furthermore, the tuned radio frequency is recorded, which can be used to augment the fingerprinting process and provide additional information to the database.
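The channel scan described above can be sketched as comparing the microphone signal against a short capture from each station and picking the best match. The sketch uses normalized correlation (cosine similarity) as the comparison, which is an assumption; the specification does not say how the signals are compared. It also assumes the two captures are already time-aligned, which a real device would have to handle. The names `similarity` and `find_station` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two signals over their common length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    dot = sum(a[i] * b[i] for i in range(n))
    norm_a = math.sqrt(sum(a[i] * a[i] for i in range(n)))
    norm_b = math.sqrt(sum(b[i] * b[i] for i in range(n)))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_station(ambient: Sequence[float],
                 stations: Dict[float, Sequence[float]],
                 threshold: float = 0.8) -> Optional[float]:
    """Return the frequency whose audio best matches the ambient audio.

    `stations` maps each scanned frequency to a short capture from that
    channel. Returns None if no channel reaches the threshold.
    """
    best_freq, best_score = None, threshold
    for freq, audio in stations.items():
        score = similarity(ambient, audio)
        if score >= best_score:
            best_freq, best_score = freq, score
    return best_freq
```

Once the station is found, the device can record directly from its own receiver, which is the cleaner signal the paragraph above refers to.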
Still further, another embodiment of the invention utilizes broadcasting a predefined set of audible tones. In one embodiment the tones are used to identify the beginning and end of audio that has been designated for identification. For example, before and after an audio track, a radio station transmits one or more audible tones. The portable device 112 (Figure 1) is configured to record the audio track encapsulated by the audible tones. Once recorded, the audio track is identified as described above.
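Recording only the audio encapsulated by the marker tones can be sketched as a small state machine over a stream of audio frames. How a frame is recognized as a marker tone is left to a caller-supplied predicate, since the specification does not define the tones; the name `extract_between_markers` and the frame representation are assumptions for this illustration.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Frame = TypeVar("Frame")


def extract_between_markers(frames: Iterable[Frame],
                            is_marker: Callable[[Frame], bool]
                            ) -> Optional[List[Frame]]:
    """Return the frames between the opening and closing marker tones.

    A marker tone may span several consecutive frames. Returns None if the
    closing marker is never heard.
    """
    recording: List[Frame] = []
    inside = False
    for frame in frames:
        if is_marker(frame):
            if inside and recording:
                return recording   # closing tone reached
            inside = True          # opening tone (possibly still sounding)
        elif inside:
            recording.append(frame)
    return None
```

The returned frames would then be passed to whichever identification method the device uses.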
In another embodiment, the audible tones themselves contain identification data. For example, a series of audible tones (such as three quick beeps) represents an identifier, such as a series of numbers. This identifier is then used to look up associated information in a database, either on the portable device 112 (Figure 1) or on the ID server 102 (Figure 1). Another example would be where a series of audible tones themselves represent an artist's name, song title, radio station identifier, or the like. Such a series of tones may also be identifiable by the human ear, but may be forced to conform with a prescribed set of rules for such tones that would distinguish them from normal audio. For example, they may have to begin with a prescribed set of tones or conform to a certain prescribed length. One use of such an embodiment is where a local band promoting itself registers its audible tone identifier in the database to be identified by the system. Upon synchronization the portable device returns information, such as telling the user where to get the recording for that particular local band or where to buy a CD containing the local band's songs, etc. What is more, such audible tone identifiers can be associated with particular artists, radio stations, etc., and used in marketing and promotion. Such audible tone identifiers may be transmitted to other users via any suitable means, such as email, beaming from one portable device to another, or the like. In addition, such audible tone/s and audible tone identifiers can be used to assist any of the abovementioned methods for audio track identification.
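Decoding a series of tones into a numeric identifier can be sketched with the Goertzel algorithm, a standard way to measure the strength of a single frequency in a block of samples. The tone-to-digit table, the fixed tone length, and the function names below are hypothetical; the specification does not define an encoding.

```python
import math
from typing import List, Sequence

# Hypothetical encoding: each tone frequency (Hz) stands for one digit.
TONE_TO_DIGIT = {600.0: 0, 800.0: 1, 1000.0: 2}


def goertzel_power(samples: Sequence[float], sample_rate: int,
                   freq: float) -> float:
    """Signal power at the DFT bin nearest `freq` (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * freq / sample_rate)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def decode_tones(samples: Sequence[float], sample_rate: int,
                 tone_len: int) -> List[int]:
    """Split the signal into fixed-length tones and map each to a digit."""
    digits = []
    for start in range(0, len(samples) - tone_len + 1, tone_len):
        segment = samples[start:start + tone_len]
        strongest = max(
            TONE_TO_DIGIT,
            key=lambda f: goertzel_power(segment, sample_rate, f),
        )
        digits.append(TONE_TO_DIGIT[strongest])
    return digits


def make_tone(freq: float, sample_rate: int, length: int) -> List[float]:
    """Generate a pure sine tone, used here only to exercise the decoder."""
    return [math.sin(2.0 * math.pi * freq * i / sample_rate)
            for i in range(length)]
```

The decoded digits would then serve as the key for the database lookup described above. A practical decoder would also need to locate the start of the tone series and reject segments in which no candidate tone is clearly dominant.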
The foregoing descriptions of specific embodiments of the present invention are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously many modifications and variations are possible in view of the above teachings. For example, any of the aforementioned embodiments or methods may be combined with one another, especially if a combination of embodiments or methods can be used to assist in the identification of an audio track. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. Furthermore, the steps in the method are not necessarily intended to occur in the sequence laid out. It is intended that the scope of the invention be defined by the following claims and their equivalents.

Claims

WHAT IS CLAIMED IS:
1. A method for identifying audio on a portable device comprising: recording on a portable device an audio sample from an audio track; storing said audio sample in a cache on said portable device; transmitting said audio sample to a computing device to be identified; receiving an identification of said audio sample from said computing device; and displaying said identification.
2. The method of claim 1, further comprising, before said recording, playing said audio track.
3. The method of claim 2, wherein said playing comprises playing said audio track through an audio player on said portable device.
4. The method of claim 2, wherein said playing comprises playing said audio track received by a radio receiver on said portable device.
5. The method of claim 1, wherein said recording comprises recording said audio sample from a microphone on said portable device.
6. The method of claim 1, further comprising, before said recording, receiving instructions to identify said audio track.
7. The method of claim 1, further comprising performing additional actions based on said identification of said audio sample, where said additional actions are selected from a group consisting of: downloading said audio track from a subscription service; recommending a similar audio track; obtaining prices of said audio track; receiving additional information on said audio track; displaying a Web-page associated with an artist who performed said audio track; displaying a biography of an artist who performed said audio track; displaying a discography of an artist who performed said audio track; displaying a Web-page selling said audio track; and any combination of the aforementioned additional actions.
8. The method of claim 1, further comprising performing additional actions prior to said recording, where said additional actions are selected from a group consisting of: downloading said audio track from a subscription service; recommending a similar audio track; obtaining prices of said audio track; receiving additional information on said audio track; displaying a Web-page associated with an artist who performed said audio track; displaying a biography of an artist who performed said audio track; displaying a discography of an artist who performed said audio track; displaying a Web-page selling said audio track; and any combination of the aforementioned additional actions.
9. The method of claim 1, wherein said transmitting comprises sending said audio sample to an identification server to be identified.
10. The method of claim 1, wherein said transmitting comprises sending said audio sample to a client computer to be identified.
11. A method for identifying audio comprising: receiving an audio sample of an audio track as recorded on a portable device; performing fingerprinting on said audio sample to obtain a unique audio fingerprint for said audio sample; searching a fingerprint database for a match of said fingerprint to a known fingerprint of a previously identified audio track; locating a match of said fingerprint to a known fingerprint of a previously identified audio track; and sending identification data associated with said previously identified audio track to said portable device, such that said portable device can display said identification data.
12. A method for identifying audio on a portable device comprising: recording on a portable device an audio sample from an audio track; performing fingerprinting on said audio sample to obtain a unique audio fingerprint for said audio sample; obtaining identification data associated with a match of said fingerprint to a known fingerprint of a previously identified audio track; and displaying said identification data.
13. The method of claim 12, wherein said obtaining comprises: searching a fingerprint database for a match of said fingerprint to a known fingerprint of a previously identified audio track; and locating a match of said fingerprint to a known fingerprint of a previously identified audio track.
14. The method of claim 12, wherein said obtaining comprises: transmitting said fingerprint to a computing device to be identified; and receiving said identification of said audio sample from said computing device.
15. The method of claim 13, wherein said transmitting comprises sending said fingerprint to an identification server to be identified.
16. The method of claim 13, wherein said transmitting comprises sending said fingerprint to a client computer to be identified.
17. The method of claim 12, further comprising, before said recording, playing said audio track.
18. The method of claim 17, wherein said playing comprises playing said audio track through an audio player on said portable device.
19. The method of claim 17, wherein said playing comprises playing said audio track received by a radio receiver on said portable device.
20. The method of claim 12, further comprising, before said recording, receiving instructions to identify said audio track.
21. The method of claim 12, further comprising performing additional actions based on said identification of said audio sample, where said additional actions are selected from a group consisting of: downloading said audio track from a subscription service; recommending a similar audio track; obtaining prices of said audio track; receiving additional information on said audio track; displaying a Web-page associated with an artist who performed said audio track; displaying a biography of an artist who performed said audio track; displaying a discography of an artist who performed said audio track; displaying a Web-page selling said audio track; and any combination of the aforementioned additional actions.
22. A method for identifying audio at a portable device comprising: receiving a radio broadcast on a portable device; playing said radio broadcast on said portable device; receiving an instruction to identify an audio track of said radio broadcast; automatically recording said radio broadcast's broadcast frequency, and the date and time that said portable device received said instruction to identify said audio track; transmitting said broadcast frequency, date, and time to a computing device to be identified; receiving at said portable device an identification of said audio track from said computing device, based on said broadcast frequency, date, and time; and displaying said identification.
23. The method of claim 22, further comprising performing additional actions, where said additional actions are selected from a group consisting of: downloading said audio track from a subscription service; recommending a similar audio track; obtaining prices of said audio track; receiving additional information on said audio track; displaying a Web-page associated with an artist who performed said audio track; displaying a biography of an artist who performed said audio track; displaying a discography of an artist who performed said audio track; displaying a Web-page selling said audio track; and any combination of the aforementioned additional actions.
24. The method of claim 22, wherein said transmitting comprises sending said audio sample to an identification server to be identified.
25. The method of claim 22, wherein said transmitting comprises sending said audio sample to a client computer to be identified.
26. A method for identifying audio at a portable device comprising: receiving at an identification server a broadcast frequency, date and time as recorded on a portable device; searching a playlist database for a match of said broadcast frequency, date and time to a known radio station's broadcast frequency, and known date and time that an audio track was broadcast by said radio station; locating said audio track associated with said broadcast frequency, date and time; sending identification data associated with said audio track to said portable device, such that said portable device can display said identification data.
27. A method for identifying audio on a portable device comprising: receiving a radio broadcast on a portable device; playing said radio broadcast on said portable device; receiving an instruction to identify an audio track of said radio broadcast; automatically recording said radio broadcast's broadcast frequency, and the date and time that said portable device received said instruction to identify said audio track; searching a playlist database for a match of said broadcast frequency, date and time to a known radio station's broadcast frequency, and the known date and time that an audio track was broadcast by said radio station; locating an audio track associated with said broadcast frequency, date and time; and displaying identification data associated with said audio track.
28. The method of claim 27, further comprising performing additional actions based on said identification data, where said additional actions are selected from a group consisting of: downloading said audio track from a subscription service; recommending a similar audio track; obtaining prices of said audio track; receiving additional information on said audio track; displaying a Web-page associated with an artist who performed said audio track; displaying a biography of an artist who performed said audio track; displaying a discography of an artist who performed said audio track; displaying a Web-page selling said audio track; and any combination of the aforementioned additional actions.
29. A portable device for identifying audio, comprising: a central processing unit; communications procedures; and a memory, comprising: recording procedures configured to record an audio sample from an audio track; communications procedures for transmitting said audio sample to a computing device to be identified, and for receiving an identification of said audio sample from said computing device; a display for displaying said identification; and a cache for storing said audio sample.
30. The portable device of claim 29, further comprising an audio player for playing said audio track.
31. The portable device of claim 29, further comprising a radio receiver for receiving a broadcast of said audio track.
32. The portable device of claim 29, further comprising a microphone for recording said audio track.
33. A computing device for identifying audio comprising: a central processing unit; communications procedures; and a memory, comprising: communications procedures for receiving an audio sample of an audio track as recorded on a portable device, and for sending identification data associated with a known fingerprint to said portable device; fingerprinting procedures for performing fingerprinting on said audio sample to obtain a unique audio fingerprint for said audio sample; a fingerprint database containing multiple known fingerprints of previously identified audio tracks; and searching procedures for searching said fingerprint database for a match of said fingerprint to a known fingerprint of a previously identified audio track.
34. A portable device for identifying audio, comprising: a central processing unit; communications procedures; and a memory, comprising: recording procedures configured to record an audio sample from an audio track; fingerprinting procedures for performing fingerprinting on said audio sample to obtain a unique audio fingerprint for said audio sample; a fingerprint database containing multiple known fingerprints of previously identified audio tracks; and searching procedures for searching said fingerprint database for a match of said fingerprint to a known fingerprint of a previously identified audio track; and a display for displaying identification data associated with said audio track.
35. The portable device of claim 34, further comprising an audio player for playing said audio track.
36. The portable device of claim 34, further comprising a radio receiver for receiving a broadcast of said audio track.
37. The portable device of claim 34, further comprising a microphone for recording said audio track.
38. A portable device for identifying audio, comprising: a central processing unit; communications circuitry; user interface devices including: a receiver for receiving an instruction to identify an audio track; and a display for displaying an identification of said audio track; a radio receiver for receiving a radio broadcast; a memory, comprising: radio procedures for playing said radio broadcast; recording procedures for recording said radio broadcast's broadcast frequency, and the date and time that said portable device received said instruction to identify said audio track; and communication procedures for transmitting said broadcast frequency, date, and time to a computing device to be identified, and for receiving an identification of said audio track from said computing device, based on said broadcast frequency, date, and time.
39. A computing device for identifying audio, comprising: a central processing unit; communications procedures; a memory, comprising: communications procedures for receiving a broadcast frequency, date and time as recorded on a portable device, and for sending identification data associated with said audio track to said portable device; a playlist database containing radio stations' broadcast frequencies, and known dates and times that audio tracks were broadcast by said radio stations; and searching procedures for searching said playlist database for a match of said broadcast frequency, date, and time to a known radio station's broadcast frequency, and a known date and time that an audio track was broadcast by said radio station.
40. A portable device for identifying audio, comprising: a central processing unit; communications circuitry; user interface devices including: a receiver for receiving an instruction to identify an audio track; and a display for displaying an identification of said audio track; a radio receiver for receiving a radio broadcast; a memory, comprising: radio procedures for playing said radio broadcast; recording procedures for recording said radio broadcast's broadcast frequency, and the date and time that said portable device received said instruction to identify said audio track; and a playlist database containing radio stations' broadcast frequencies, and known dates and times that audio tracks were broadcast by said radio stations; searching procedures for searching said playlist database for a match of said broadcast frequency, date, and time to a known radio station's broadcast frequency, and a known date and time that an audio track was broadcast by said radio station.
PCT/US2003/013023 2002-04-25 2003-04-24 Apparatus and method for identifying audio WO2003091899A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003223748A AU2003223748A1 (en) 2002-04-25 2003-04-24 Apparatus and method for identifying audio

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13327602A 2002-04-25 2002-04-25
US10/133,276 2002-04-25

Publications (2)

Publication Number Publication Date
WO2003091899A2 true WO2003091899A2 (en) 2003-11-06
WO2003091899A3 WO2003091899A3 (en) 2004-01-08

Family

ID=29268776

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/013023 WO2003091899A2 (en) 2002-04-25 2003-04-24 Apparatus and method for identifying audio

Country Status (3)

Country Link
AU (1) AU2003223748A1 (en)
TW (1) TW200307874A (en)
WO (1) WO2003091899A2 (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004017180A2 (en) * 2002-08-16 2004-02-26 Neuros Audio, Llc System and method for creating an index of audio tracks
WO2007046739A1 (en) * 2005-10-17 2007-04-26 Emdo Ab System, method and device for downloading media products
EP1872199A2 (en) * 2005-04-22 2008-01-02 Microsoft Corporation Methods, computer-readable media, and data structures for building an authoritative database of digital audio identifier elements and identifying media items
US7881657B2 (en) 2006-10-03 2011-02-01 Shazam Entertainment, Ltd. Method for high-throughput identification of distributed broadcast content
WO2012030851A1 (en) * 2010-08-30 2012-03-08 Qualcomm Incorporated Audio-based environment awareness
US9118951B2 (en) 2012-06-26 2015-08-25 Arris Technology, Inc. Time-synchronizing a parallel feed of secondary content with primary media content
US9301070B2 (en) 2013-03-11 2016-03-29 Arris Enterprises, Inc. Signature matching of corrupted audio signal
US9307337B2 (en) 2013-03-11 2016-04-05 Arris Enterprises, Inc. Systems and methods for interactive broadcast content
US9363562B1 (en) 2014-12-01 2016-06-07 Stingray Digital Group Inc. Method and system for authorizing a user device
US20170024467A1 (en) * 2004-08-06 2017-01-26 Digimarc Corporation Distributed computing for portable computing devices
US9628829B2 (en) 2012-06-26 2017-04-18 Google Technology Holdings LLC Identifying media on a mobile device
US10162888B2 (en) 2014-06-23 2018-12-25 Sony Interactive Entertainment LLC System and method for audio identification
US10846334B2 (en) 2014-04-22 2020-11-24 Gracenote, Inc. Audio identification during performance

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6055244A (en) * 1990-11-27 2000-04-25 Scientific-Atlanta, Inc. Method and apparatus for communicating different types of data in a data stream
US6232539B1 (en) * 1998-06-17 2001-05-15 Looney Productions, Llc Music organizer and entertainment center
US6247130B1 (en) * 1999-01-22 2001-06-12 Bernhard Fritsch Distribution of musical products by a web site vendor over the internet

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004017180A3 (en) * 2002-08-16 2004-07-15 Digital Innovations Llc System and method for creating an index of audio tracks
WO2004017180A2 (en) * 2002-08-16 2004-02-26 Neuros Audio, Llc System and method for creating an index of audio tracks
US20170024467A1 (en) * 2004-08-06 2017-01-26 Digimarc Corporation Distributed computing for portable computing devices
US9842163B2 (en) * 2004-08-06 2017-12-12 Digimarc Corporation Distributed computing for portable computing devices
EP1872199A2 (en) * 2005-04-22 2008-01-02 Microsoft Corporation Methods, computer-readable media, and data structures for building an authoritative database of digital audio identifier elements and identifying media items
EP1872199A4 (en) * 2005-04-22 2010-09-01 Microsoft Corp Methods, computer-readable media, and data structures for building an authoritative database of digital audio identifier elements and identifying media items
WO2007046739A1 (en) * 2005-10-17 2007-04-26 Emdo Ab System, method and device for downloading media products
US9864800B2 (en) 2006-10-03 2018-01-09 Shazam Entertainment, Ltd. Method and system for identification of distributed broadcast content
US8442426B2 (en) 2006-10-03 2013-05-14 Shazam Entertainment Ltd. Method and system for identification of distributed broadcast content
US9361370B2 (en) 2006-10-03 2016-06-07 Shazam Entertainment, Ltd. Method and system for identification of distributed broadcast content
US8086171B2 (en) 2006-10-03 2011-12-27 Shazam Entertainment Ltd. Method and system for identification of distributed broadcast content
US7881657B2 (en) 2006-10-03 2011-02-01 Shazam Entertainment, Ltd. Method for high-throughput identification of distributed broadcast content
US8812014B2 (en) 2010-08-30 2014-08-19 Qualcomm Incorporated Audio-based environment awareness
WO2012030851A1 (en) * 2010-08-30 2012-03-08 Qualcomm Incorporated Audio-based environment awareness
US10785506B2 (en) 2012-06-26 2020-09-22 Google Technology Holdings LLC Identifying media on a mobile device
US9118951B2 (en) 2012-06-26 2015-08-25 Arris Technology, Inc. Time-synchronizing a parallel feed of secondary content with primary media content
US11812073B2 (en) 2012-06-26 2023-11-07 Google Technology Holdings LLC Identifying media on a mobile device
US11140424B2 (en) 2012-06-26 2021-10-05 Google Technology Holdings LLC Identifying media on a mobile device
US9628829B2 (en) 2012-06-26 2017-04-18 Google Technology Holdings LLC Identifying media on a mobile device
US10051295B2 (en) 2012-06-26 2018-08-14 Google Technology Holdings LLC Identifying media on a mobile device
US9307337B2 (en) 2013-03-11 2016-04-05 Arris Enterprises, Inc. Systems and methods for interactive broadcast content
US9301070B2 (en) 2013-03-11 2016-03-29 Arris Enterprises, Inc. Signature matching of corrupted audio signal
US10846334B2 (en) 2014-04-22 2020-11-24 Gracenote, Inc. Audio identification during performance
US11574008B2 (en) 2014-04-22 2023-02-07 Gracenote, Inc. Audio identification during performance
US10162888B2 (en) 2014-06-23 2018-12-25 Sony Interactive Entertainment LLC System and method for audio identification
US9363562B1 (en) 2014-12-01 2016-06-07 Stingray Digital Group Inc. Method and system for authorizing a user device

Also Published As

Publication number Publication date
AU2003223748A1 (en) 2003-11-10
WO2003091899A3 (en) 2004-01-08
TW200307874A (en) 2003-12-16

Similar Documents

Publication Publication Date Title
US7853664B1 (en) Method and system for purchasing pre-recorded music
US9350788B2 (en) Apparatus and methods of delivering music and information
KR101464403B1 (en) System and method for ordering and delivering media content
US20040143349A1 (en) Personal audio recording system
US20070149114A1 (en) Capture, storage and retrieval of broadcast information while on-the-go
US20100093393A1 (en) Systems and Methods for Music Recognition
JP2001216434A (en) Method and device for identifying and purchasing broadcasting digital music and other type information
WO2008128037A1 (en) Dynamic podcast content delivery
WO2008128084A1 (en) Delivering podcast content
US20030186645A1 (en) Method for marking a portion of a media broadcast for later use
WO2003091899A2 (en) Apparatus and method for identifying audio
US20090061765A1 (en) Mobile terminal system and method for monitoring music program using music recognition
KR101715070B1 (en) System and method for providong digital sound transmission based music radio service
JP2002319226A (en) Audio device
JP2002091455A (en) Terminal equipment and electronic music distributing system
US20070071418A1 (en) Recording device, recording method, and program
KR100350706B1 (en) Method for providing sound data and Apparatus for the same
JP2005274992A (en) Music identification information retrieving system, music purchasing system, music identification information obtaining method, music purchasing method, audio signal processor and server device
US7509089B2 (en) Reproduction device, reproduction method, and program
JP2002162973A (en) Retrieving method for broadcasted music
JP2004077556A (en) Information distribution system, audio apparatus, server, and related information distributing method
JP2002278562A (en) Wired broadcast downloading system
JP2005352290A (en) System and center for music distribution
JP2008225549A (en) Music selling system and terminal device
KR20100133174A (en) Music broadcasting system via internet with introduction service and method thereof

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP