WO2000039788A3 - Knowledge-based strategies applied to n-best lists in automatic speech recognition systems

Publication number
WO2000039788A3
WO2000039788A3 PCT/US1999/031311 US9931311W WO0039788A3 WO 2000039788 A3 WO2000039788 A3 WO 2000039788A3 US 9931311 W US9931311 W US 9931311W WO 0039788 A3 WO0039788 A3 WO 0039788A3
Authority
WO
WIPO (PCT)
Prior art keywords
string
hypothesized
knowledge
speech recognition
spoken
Application number
PCT/US1999/031311
Other languages
French (fr)
Other versions
WO2000039788A2 (en)
Inventor
Thomas B Schalk
Roger S Zimmerman
Original Assignee
Koninkl Philips Electronics Nv
Application filed by Koninkl Philips Electronics Nv
Priority to AU24017/00A (AU2401700A)
Priority to JP2000591610A (JP2002533789A)
Priority to EP99967801A (EP1070315A4)
Priority to KR1020007009585A (KR20010041440A)
Publication of WO2000039788A2
Publication of WO2000039788A3

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/085 Methods for reducing search complexity, pruning

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)
  • Machine Translation (AREA)

Abstract

A highly accurate technique for recognizing spoken digit strings is described. A spoken digit string is received (14) and analyzed by a speech recognizer (18), which generates a list of hypothesized digit strings arranged in ranked order (16) based on a likelihood of matching the spoken digit string (20). The individual hypothesized strings are then analyzed in order, beginning with the hypothesized string having the greatest likelihood of matching the spoken string, to determine whether each satisfies a given constraint. The first hypothesized string in the list that satisfies the constraint is selected as the recognized string (22).
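
The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names `select_first_valid` and `luhn_valid`, the sample digit strings, and the choice of a Luhn checksum as the constraint are all assumptions made for the example. The abstract only says "a given constraint", so any predicate over a digit string could stand in its place.

```python
from typing import Callable, Iterable, Optional


def select_first_valid(
    n_best: Iterable[str],
    satisfies_constraint: Callable[[str], bool],
) -> Optional[str]:
    """Walk the n-best list in ranked order (most likely first) and
    return the first hypothesis that passes the constraint, or None
    if no hypothesis in the list passes."""
    for hypothesis in n_best:
        if satisfies_constraint(hypothesis):
            return hypothesis
    return None


def luhn_valid(digits: str) -> bool:
    """Hypothetical constraint for illustration: a Luhn checksum.
    The source does not name a specific constraint."""
    if not digits.isdigit():
        return False
    total = 0
    # Starting from the rightmost digit, double every second digit
    # and subtract 9 when the doubled value exceeds 9.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# Assumed recognizer output: hypotheses already sorted by likelihood.
n_best = ["79927398718", "79927398713", "79927398710"]
recognized = select_first_valid(n_best, luhn_valid)
print(recognized)
```

In this sketch the top-ranked hypothesis fails the checksum, so the second-ranked string is selected instead. Returning `None` when nothing passes leaves the fallback behavior (for example, re-prompting the speaker) to the caller, which the abstract does not specify.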
PCT/US1999/031311 1998-12-29 1999-12-29 Knowledge-based strategies applied to n-best lists in automatic speech recognition systems WO2000039788A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU24017/00A AU2401700A (en) 1998-12-29 1999-12-29 Knowledge-based strategies applied to n-best lists in automatic speech recognition systems
JP2000591610A JP2002533789A (en) 1998-12-29 1999-12-29 Knowledge-based strategy for N-best list in automatic speech recognition system
EP99967801A EP1070315A4 (en) 1998-12-29 1999-12-29 Knowledge-based strategies applied to n-best lists in automatic speech recognition systems
KR1020007009585A KR20010041440A (en) 1998-12-29 1999-12-29 Knowledge-based strategies applied to n-best lists in automatic speech recognition systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/222,073 US6922669B2 (en) 1998-12-29 1998-12-29 Knowledge-based strategies applied to N-best lists in automatic speech recognition systems
US09/222,073 1998-12-29

Publications (2)

Publication Number Publication Date
WO2000039788A2 WO2000039788A2 (en) 2000-07-06
WO2000039788A3 WO2000039788A3 (en) 2000-11-02

Family

ID=22830703

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/031311 WO2000039788A2 (en) 1998-12-29 1999-12-29 Knowledge-based strategies applied to n-best lists in automatic speech recognition systems

Country Status (7)

Country Link
US (1) US6922669B2 (en)
EP (1) EP1070315A4 (en)
JP (1) JP2002533789A (en)
KR (1) KR20010041440A (en)
CN (1) CN1179323C (en)
AU (1) AU2401700A (en)
WO (1) WO2000039788A2 (en)

Families Citing this family (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7016827B1 (en) * 1999-09-03 2006-03-21 International Business Machines Corporation Method and system for ensuring robustness in natural language understanding
DE10043499A1 (en) * 2000-09-01 2002-03-14 Bosch Gmbh Robert Data transmission method
KR100352748B1 (en) * 2001-01-05 2002-09-16 (주) 코아보이스 Online trainable speech synthesizer and its method
AUPR654401A0 (en) * 2001-07-23 2001-08-16 Transurban City Link Limited Method and system for recognising a spoken identification sequence
US7526431B2 (en) * 2001-09-05 2009-04-28 Voice Signal Technologies, Inc. Speech recognition using ambiguous or phone key spelling and/or filtering
US7467089B2 (en) * 2001-09-05 2008-12-16 Roth Daniel L Combined speech and handwriting recognition
US7444286B2 (en) 2001-09-05 2008-10-28 Roth Daniel L Speech recognition using re-utterance recognition
US7505911B2 (en) * 2001-09-05 2009-03-17 Roth Daniel L Combined speech recognition and sound recording
US7809574B2 (en) * 2001-09-05 2010-10-05 Voice Signal Technologies Inc. Word recognition using choice lists
US7246062B2 (en) 2002-04-08 2007-07-17 Sbc Technology Resources, Inc. Method and system for voice recognition menu navigation with error prevention and recovery
US20040002849A1 (en) * 2002-06-28 2004-01-01 Ming Zhou System and method for automatic retrieval of example sentences based upon weighted editing distance
US7386454B2 (en) * 2002-07-31 2008-06-10 International Business Machines Corporation Natural error handling in speech recognition
US7664639B2 (en) * 2004-01-14 2010-02-16 Art Advanced Recognition Technologies, Inc. Apparatus and methods for speech recognition
US20060004574A1 (en) * 2004-06-30 2006-01-05 Microsoft Corporation Semantic based validation information in a language model to detect recognition errors and improve dialog performance
CN101873267B (en) * 2004-08-30 2012-10-24 高通股份有限公司 Adaptive De-Jitter Buffer for Voice over IP
US8085678B2 (en) * 2004-10-13 2011-12-27 Qualcomm Incorporated Media (voice) playback (de-jitter) buffer adjustments based on air interface
US7827032B2 (en) * 2005-02-04 2010-11-02 Vocollect, Inc. Methods and systems for adapting a model for a speech recognition system
US8200495B2 (en) * 2005-02-04 2012-06-12 Vocollect, Inc. Methods and systems for considering information about an expected response when performing speech recognition
US7895039B2 (en) * 2005-02-04 2011-02-22 Vocollect, Inc. Methods and systems for optimizing model adaptation for a speech recognition system
US7949533B2 (en) * 2005-02-04 2011-05-24 Vocollect, Inc. Methods and systems for assessing and improving the performance of a speech recognition system
US7865362B2 (en) 2005-02-04 2011-01-04 Vocollect, Inc. Method and system for considering information about an expected response when performing speech recognition
US8355907B2 (en) * 2005-03-11 2013-01-15 Qualcomm Incorporated Method and apparatus for phase matching frames in vocoders
US8155965B2 (en) * 2005-03-11 2012-04-10 Qualcomm Incorporated Time warping frames inside the vocoder by modifying the residual
US20070016460A1 (en) * 2005-07-14 2007-01-18 Vocollect, Inc. Task management system having selectively variable check data
EP2005417A2 (en) 2006-04-03 2008-12-24 Vocollect, Inc. Methods and systems for optimizing model adaptation for a speech recognition system
US8688451B2 (en) * 2006-05-11 2014-04-01 General Motors Llc Distinguishing out-of-vocabulary speech from in-vocabulary speech
CN100452042C (en) * 2006-06-23 2009-01-14 腾讯科技(深圳)有限公司 Digital string fuzzy match method
US8055502B2 (en) * 2006-11-28 2011-11-08 General Motors Llc Voice dialing using a rejection reference
EP1933302A1 (en) * 2006-12-12 2008-06-18 Harman Becker Automotive Systems GmbH Speech recognition method
DE602008001787D1 (en) * 2007-02-12 2010-08-26 Dolby Lab Licensing Corp IMPROVED RELATIONSHIP BETWEEN LANGUAGE TO NON-LINGUISTIC AUDIO CONTENT FOR ELDERLY OR HARMFUL ACCOMPANIMENTS
JP5530720B2 (en) 2007-02-26 2014-06-25 ドルビー ラボラトリーズ ライセンシング コーポレイション Speech enhancement method, apparatus, and computer-readable recording medium for entertainment audio
US8589162B2 (en) * 2007-09-19 2013-11-19 Nuance Communications, Inc. Method, system and computer program for enhanced speech recognition of digits input strings
EP2081185B1 (en) * 2008-01-16 2014-11-26 Nuance Communications, Inc. Speech recognition on large lists using fragments
DE102008007698A1 (en) * 2008-02-06 2009-08-13 Siemens Aktiengesellschaft Method for detecting an input in a speech recognition system
EP2289065B1 (en) * 2008-06-10 2011-12-07 Dolby Laboratories Licensing Corporation Concealing audio artifacts
US8321958B1 (en) 2008-07-30 2012-11-27 Next It Corporation Detecting presence of a subject string in a target string and security event qualification based on prior behavior by an end user of a computer system
US20100281435A1 (en) * 2009-04-30 2010-11-04 At&T Intellectual Property I, L.P. System and method for multimodal interaction using robust gesture processing
US8374868B2 (en) 2009-08-21 2013-02-12 General Motors Llc Method of recognizing speech
US11416214B2 (en) 2009-12-23 2022-08-16 Google Llc Multi-modal input on an electronic device
EP3091535B1 (en) 2009-12-23 2023-10-11 Google LLC Multi-modal input on an electronic device
US9123339B1 (en) 2010-11-23 2015-09-01 Google Inc. Speech recognition using repeated utterances
US8352245B1 (en) 2010-12-30 2013-01-08 Google Inc. Adjusting language models
US8296142B2 (en) 2011-01-21 2012-10-23 Google Inc. Speech recognition using dock context
US10534931B2 (en) 2011-03-17 2020-01-14 Attachmate Corporation Systems, devices and methods for automatic detection and masking of private data
US8914290B2 (en) 2011-05-20 2014-12-16 Vocollect, Inc. Systems and methods for dynamically improving user intelligibility of synthesized speech in a work environment
CN103188409A (en) * 2011-12-29 2013-07-03 上海博泰悦臻电子设备制造有限公司 Voice auto-answer cloud server, voice auto-answer system and voice auto-answer method
US9978395B2 (en) 2013-03-15 2018-05-22 Vocollect, Inc. Method and system for mitigating delay in receiving audio stream during production of sound from audio stream
US9842592B2 (en) 2014-02-12 2017-12-12 Google Inc. Language models using non-linguistic context
US9412365B2 (en) 2014-03-24 2016-08-09 Google Inc. Enhanced maximum entropy models
DE112014006795B4 (en) * 2014-07-08 2018-09-20 Mitsubishi Electric Corporation Speech recognition system and speech recognition method
US10572810B2 (en) 2015-01-07 2020-02-25 Microsoft Technology Licensing, Llc Managing user interaction for input understanding determinations
US10134394B2 (en) 2015-03-20 2018-11-20 Google Llc Speech recognition using log-linear model
EP3089159B1 (en) 2015-04-28 2019-08-28 Google LLC Correcting voice recognition using selective re-speak
US10249297B2 (en) 2015-07-13 2019-04-02 Microsoft Technology Licensing, Llc Propagating conversational alternatives using delayed hypothesis binding
CN105468582B (en) * 2015-11-18 2018-03-02 苏州思必驰信息科技有限公司 A kind of method and device for correcting of the numeric string based on man-machine interaction
US9978367B2 (en) 2016-03-16 2018-05-22 Google Llc Determining dialog states for language models
US10714121B2 (en) 2016-07-27 2020-07-14 Vocollect, Inc. Distinguishing user speech from background speech in speech-dense environments
US10832664B2 (en) 2016-08-19 2020-11-10 Google Llc Automated speech recognition using language models that selectively use domain-specific model components
US10446137B2 (en) 2016-09-07 2019-10-15 Microsoft Technology Licensing, Llc Ambiguity resolving conversational understanding system
US10311860B2 (en) 2017-02-14 2019-06-04 Google Llc Language model biasing system
CN107632718B (en) * 2017-08-03 2021-01-22 百度在线网络技术(北京)有限公司 Method, device and readable medium for recommending digital information in voice input
JP7170287B2 (en) * 2018-05-18 2022-11-14 パナソニックIpマネジメント株式会社 Speech recognition device, speech recognition method, and program
CN109472980A (en) * 2018-10-18 2019-03-15 成都亚讯星科科技股份有限公司 Earth magnetism wagon detector and its detection method based on NB-IoT technology
CN113178190A (en) * 2021-05-14 2021-07-27 山东浪潮科学研究院有限公司 End-to-end automatic speech recognition algorithm for improving rarely-used word recognition based on meta learning

Citations (10)

Publication number Priority date Publication date Assignee Title
US4866778A (en) * 1986-08-11 1989-09-12 Dragon Systems, Inc. Interactive speech recognition apparatus
US4882757A (en) * 1986-04-25 1989-11-21 Texas Instruments Incorporated Speech recognition system
US5222187A (en) * 1989-12-29 1993-06-22 Texas Instruments Incorporated Grammar-based checksum constraints for high performance speech recognition circuit
US5241619A (en) * 1991-06-25 1993-08-31 Bolt Beranek And Newman Inc. Word dependent N-best search method
US5267345A (en) * 1992-02-10 1993-11-30 International Business Machines Corporation Speech recognition apparatus which predicts word classes from context and words from word classes
US5276741A (en) * 1991-05-16 1994-01-04 Trw Financial Systems & Services, Inc. Fuzzy string matcher
US5606644A (en) * 1993-07-22 1997-02-25 Lucent Technologies Inc. Minimum error rate training of combined string models
US5903864A (en) * 1995-08-30 1999-05-11 Dragon Systems Speech recognition
US6003002A (en) * 1997-01-02 1999-12-14 Texas Instruments Incorporated Method and system of adapting speech recognition models to speaker environment
US6049768A (en) * 1997-11-03 2000-04-11 A T & T Corp Speech recognition system with implicit checksum

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US5119416A (en) * 1990-05-30 1992-06-02 Nynex Corporation Automated telephone number identification for automatic intercept in telephone networks
WO1994014270A1 (en) * 1992-12-17 1994-06-23 Bell Atlantic Network Services, Inc. Mechanized directory assistance
US5712957A (en) * 1995-09-08 1998-01-27 Carnegie Mellon University Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists
US5737489A (en) * 1995-09-15 1998-04-07 Lucent Technologies Inc. Discriminative utterance verification for connected digits recognition
US6208965B1 (en) * 1997-11-20 2001-03-27 At&T Corp. Method and apparatus for performing a name acquisition based on speech recognition
US6205428B1 (en) * 1997-11-20 2001-03-20 At&T Corp. Confusion set-base method and apparatus for pruning a predetermined arrangement of indexed identifiers

Also Published As

Publication number Publication date
AU2401700A (en) 2000-07-31
CN1299503A (en) 2001-06-13
WO2000039788A2 (en) 2000-07-06
JP2002533789A (en) 2002-10-08
CN1179323C (en) 2004-12-08
US6922669B2 (en) 2005-07-26
EP1070315A4 (en) 2005-07-27
KR20010041440A (en) 2001-05-25
EP1070315A2 (en) 2001-01-24
US20030154075A1 (en) 2003-08-14

Similar Documents

Publication Publication Date Title
WO2000039788A3 (en) Knowledge-based strategies applied to n-best lists in automatic speech recognition systems
Soong et al. A tree-trellis based fast search for finding the n best sentence hypotheses in continuous speech recognition
WO2004090866A3 (en) Phonetically based speech recognition system and method
WO1999016052A3 (en) Speech recognition system for recognizing continuous and isolated speech
CA2233179A1 (en) Unsupervised hmm adaptation based on speech-silence discrimination
EP0825586A3 (en) Lexical tree pre-filtering in speech recognition
EP0762385A3 (en) Speech recognition
EP0834862A3 (en) Method of key-phrase detection and verification for flexible speech understanding
CA2089786A1 (en) Context-dependent speech recognizer using estimated next word context
ATE364219T1 (en) VOICE RECOGNITION METHOD WITH SUBSTITUTION COMMAND
US6058363A (en) Method and system for speaker-independent recognition of user-defined phrases
EP1020847A3 (en) Method for multistage speech recognition using confidence measures
DE69330427T2 (en) VOICE RECOGNITION SYSTEM FOR LANGUAGES WITH COMPOSED WORDS
CA2117932A1 (en) Soft Decision Speech Recognition
CA2180392A1 (en) User Selectable Multiple Threshold Criteria for Voice Recognition
CA2177638A1 (en) Utterance verification using word based minimum verification error training for recognizing a keyword string
EP1022723A3 (en) Unsupervised adaptation of a speech recognizer using reliable information among N-best strings
WO2004034355A3 (en) System and methods for comparing speech elements
Rabiner On the application of energy contours to the recognition of connected word sequences
EP0177854B1 (en) Keyword recognition system using template-concatenation model
Lubensky et al. Connected digit recognition using connectionist probability estimators and mixture-Gaussian densities.
Phillips et al. Modelling context dependency in acoustic-phonetic and lexical representations
Cardin et al. High performance connected digit recognition using codebook exponents
Gupta et al. Improved utterance rejection using length dependent thresholds.
Bush et al. Network-based connected digit recognition using vector quantization

Legal Events

Code | Title | Description
WWE | Wipo information: entry into national phase | Ref document number: 99805475.5; Country of ref document: CN
AK | Designated states | Kind code of ref document: A2; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW
AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
WWE | Wipo information: entry into national phase | Ref document number: 1999967801; Country of ref document: EP
ENP | Entry into the national phase | Ref document number: 2000 591610; Country of ref document: JP; Kind code of ref document: A
WWE | Wipo information: entry into national phase | Ref document number: 1020007009585; Country of ref document: KR
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
AK | Designated states | Kind code of ref document: A3; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW
AL | Designated countries for regional patents | Kind code of ref document: A3; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
WWP | Wipo information: published in national office | Ref document number: 1999967801; Country of ref document: EP
REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642
WWP | Wipo information: published in national office | Ref document number: 1020007009585; Country of ref document: KR