US8527273B2 - Systems and methods for determining the N-best strings - Google Patents
- Publication number
- US8527273B2 (application US13/562,022)
- Authority
- US
- United States
- Prior art keywords
- automaton
- state
- queue
- input
- states
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/085—Methods for reducing search complexity, pruning
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
$$p[\pi] = p[e_1], \qquad n[\pi] = n[e_n] \tag{1}$$
The label of the path π is the string obtained by concatenating the labels of its constituent transitions:
$$l[\pi] = l[e_1] \cdots l[e_n] \tag{2}$$
The weight associated with π is the sum of the initial weight (if p[π] = i), the weights of its constituent transitions:
$$w[\pi] = w[e_1] + \cdots + w[e_n] \tag{3}$$
and the final weight ρ[n[π]] if the state reached by π is final. A symbol sequence x is accepted by A if there exists a successful path π labeled with x: l[π] = x. The weight associated by A to the sequence x is then the minimum of the weights of all the successful paths π labeled with x.
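As a minimal sketch of the definitions above, the following Python computes the label and weight of a path. The `Transition` tuple and the argument names (`initial`, `initial_weight`, `final_weights`) are illustrative assumptions, not names taken from the patent.

```python
from typing import NamedTuple

class Transition(NamedTuple):
    src: int       # p[e], the previous state
    label: str     # l[e], the transition label
    weight: float  # w[e], the transition weight
    dst: int       # n[e], the next state

def path_label(path):
    """Concatenate the labels of the transitions on the path."""
    return "".join(e.label for e in path)

def path_weight(path, initial, initial_weight, final_weights):
    """Sum the transition weights, adding the initial weight when the path
    starts at the initial state and the final weight when it ends in a
    final state (final_weights maps final states to their weights)."""
    total = sum(e.weight for e in path)
    if path[0].src == initial:
        total += initial_weight
    last = path[-1].dst
    if last in final_weights:
        total += final_weights[last]
    return total
```

For instance, a two-transition path with weights 1.0 and 2.5, an initial weight of 0.5 and a final weight of 0.25 has total weight 4.25.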
$$w = \min\{\, w_i + w[e] : e \in E[q_i],\ l[e] = a \,\} \tag{4}$$
$$w' = \min\{\, w_i + w[e] - w : n[e] = q' \,\} \tag{5}$$
$$w = \min\{\, w + \rho[q] : (q, w) \in S,\ q \in F \,\} \tag{6}$$
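Read together, equations (4) to (6) describe one step of a weighted subset construction: the weight of the outgoing transition labeled a, the residual weight carried by each destination state, and the final weight of a subset. The sketch below is one possible reading under that assumption; the function names and the dictionary layout (`transitions_from`, `final_weights`) are hypothetical.

```python
def next_subset(subset, label, transitions_from):
    """subset: iterable of (state, residual weight) pairs.
    transitions_from: {state: [(label, weight, next_state), ...]}.
    Returns (w, new_subset) for the given label, or None if no
    transition carries that label."""
    candidates = [
        (w_i + w_e, dst)
        for q_i, w_i in subset
        for lab, w_e, dst in transitions_from.get(q_i, [])
        if lab == label
    ]
    if not candidates:
        return None
    w = min(cost for cost, _ in candidates)          # equation (4)
    residual = {}
    for cost, dst in candidates:                     # equation (5)
        residual[dst] = min(residual.get(dst, float("inf")), cost - w)
    return w, tuple(sorted(residual.items()))

def subset_final_weight(subset, final_weights):
    """Equation (6): smallest residual plus final weight over the final
    states of the subset, or None if the subset has no final state."""
    values = [w + final_weights[q] for q, w in subset if q in final_weights]
    return min(values) if values else None
```

Representing each subset as a sorted tuple makes it hashable, so subsets that have already been constructed can be looked up in a dictionary rather than rebuilt.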
$$\phi[q] = \min\{\, w[\pi] + \rho[f] : \pi \in P(q, f),\ f \in F \,\} \tag{7}$$
The potentials, or distances, φ[q] can be computed directly by running a shortest-paths process from the final states F on the reverse of the digraph. When the automaton contains no negative weights, this can be done, for example, with Dijkstra's algorithm in time O(|E| log |Q|) using classical heaps, or in time O(|E| + |Q| log |Q|) using Fibonacci heaps.
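The reverse shortest-paths computation can be sketched with Python's standard `heapq` module, which is a binary heap and so corresponds to the classical-heap bound. The sketch assumes non-negative weights; the input layout (a list of `(src, label, weight, dst)` tuples and a `final_weights` dictionary) is an assumption made for illustration.

```python
import heapq
from collections import defaultdict

def potentials(transitions, final_weights):
    """Return {state: shortest distance to a final state}, including the
    final weight. States that cannot reach a final state are omitted."""
    # Build the reversed digraph: for each transition src -> dst,
    # record an edge dst -> src with the same weight.
    reverse = defaultdict(list)
    for src, _label, weight, dst in transitions:
        reverse[dst].append((src, weight))

    dist = {}
    # Start Dijkstra from every final state, seeded with its final weight.
    heap = [(rho, f) for f, rho in final_weights.items()]
    heapq.heapify(heap)
    while heap:
        d, q = heapq.heappop(heap)
        if q in dist:
            continue  # already settled with a smaller distance
        dist[q] = d
        for prev, weight in reverse[q]:
            if prev not in dist:
                heapq.heappush(heap, (d + weight, prev))
    return dist
```

Seeding the queue with all final states at once avoids running one search per final state.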
$$\Phi[q'] = \min\{\, w_i + \phi[q_i] : 0 \le i \le n \,\} \tag{8}$$
where q′ corresponds to the subset {(q_0, w_0), . . . , (q_n, w_n)}. The potential Φ[q′], or determinized potential, can be computed directly from each constructed subset. It gives the shortest distance from each state to the set of determinized final states within the partially determinized automaton, and can therefore be used in shortest-path algorithms. In one embodiment, the determinized potential remains valid even though the automaton is only partially determinized.
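Because the determinized potential depends only on the pairs in a subset and on the potentials of the original states, it can be evaluated as each subset is created. A minimal sketch, with `subset` and `phi` as assumed argument names:

```python
def determinized_potential(subset, phi):
    """subset: iterable of (state, residual weight) pairs making up q'.
    phi: {state: potential in the original automaton}.
    Returns the minimum of residual weight plus potential over the subset."""
    return min(w + phi[q] for q, w in subset)
```

For the subset {(1, 0.0), (2, 0.5)} with potentials 1.5 for state 1 and 0.5 for state 2, the determinized potential is min(1.5, 1.0) = 1.0.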
1  for p ← 1 to |Q′| do r[p] ← 0
2  π[(i′, 0)] ← NIL
3  S ← {(i′, 0)}
4  while S ≠ ∅
5      do (p, c) ← head(S); DEQUEUE(S)
6         r[p] ← r[p] + 1
7         if (r[p] = n and p ∈ F) then exit
8         if r[p] ≤ n
9            then for each e ∈ E[p]
10                do c′ ← c + w[e]
11                   π[(n[e], c′)] ← (p, c)
12                   ENQUEUE(S, (n[e], c′))
$$(p, c) < (p', c') \iff \bigl(c + \Phi[p] < c' + \Phi[p']\bigr) \tag{9}$$
An attribute r[p], for each state p, gives at any time during the execution of the process the number of times a pair (p, c) with state p has been extracted from S. The attribute r[p] is initialized to 0 (line 1) and incremented after each extraction from S (line 6).
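The queue-based search can be sketched in Python as follows. This is an illustration under several assumptions rather than the patented procedure itself: the automaton is already deterministic and given explicitly, final weights are zero, the queue is a binary heap ordered by cost plus potential, and back-pointers are stored in the queue entries instead of a separate predecessor map. It also stops once n final-state extractions have been collected in total, which generalizes the single-final-state case. All function and parameter names are hypothetical.

```python
import heapq
from itertools import count

def n_best(transitions_from, initial, finals, potential, n):
    """transitions_from: {state: [(label, weight, next_state), ...]}.
    finals: set of final states (final weights assumed to be zero).
    potential: {state: shortest distance to a final state}.
    Returns up to n (string, cost) pairs in order of increasing cost."""
    r = {}            # r[p]: number of times state p has been extracted
    tie = count()     # tie-breaker so the heap never compares back-pointers
    # Queue entries: (cost + potential, tie, cost, state, back-pointer).
    heap = [(potential[initial], next(tie), 0.0, initial, None)]
    results = []
    while heap:
        _, _, c, p, back = heapq.heappop(heap)
        r[p] = r.get(p, 0) + 1
        if p in finals:
            # Follow the back-pointers to recover the label string.
            labels, node = [], back
            while node is not None:
                label, node = node
                labels.append(label)
            results.append(("".join(reversed(labels)), c))
            if len(results) == n:
                return results
        if r[p] <= n:
            for label, weight, dst in transitions_from.get(p, []):
                if dst not in potential:
                    continue  # no final state is reachable from dst
                c2 = c + weight
                heapq.heappush(
                    heap,
                    (c2 + potential[dst], next(tie), c2, dst, (label, back)),
                )
    return results
```

Since the potential is the exact remaining distance to a final state, entries leave the queue in order of the cost of their best completion, so complete paths are produced in order of increasing total cost.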
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/562,022 US8527273B2 (en) | 2002-03-29 | 2012-07-30 | Systems and methods for determining the N-best strings |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US36910902P | 2002-03-29 | 2002-03-29 | |
US10/301,098 US8234115B2 (en) | 2002-03-29 | 2002-11-21 | Systems and methods for determining the N-best strings |
US13/562,022 US8527273B2 (en) | 2002-03-29 | 2012-07-30 | Systems and methods for determining the N-best strings |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/301,098 Continuation US8234115B2 (en) | 2002-03-29 | 2002-11-21 | Systems and methods for determining the N-best strings |
Publications (2)
Publication Number | Publication Date |
---|---|
US20120296648A1 US20120296648A1 (en) | 2012-11-22 |
US8527273B2 true US8527273B2 (en) | 2013-09-03 |
Family
ID=28456950
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/301,098 Expired - Fee Related US8234115B2 (en) | 2002-03-29 | 2002-11-21 | Systems and methods for determining the N-best strings |
US13/562,022 Expired - Fee Related US8527273B2 (en) | 2002-03-29 | 2012-07-30 | Systems and methods for determining the N-best strings |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/301,098 Expired - Fee Related US8234115B2 (en) | 2002-03-29 | 2002-11-21 | Systems and methods for determining the N-best strings |
Country Status (4)
Country | Link |
---|---|
US (2) | US8234115B2 (en) |
EP (1) | EP1398759B1 (en) |
CA (1) | CA2423145C (en) |
DE (1) | DE60326196D1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10380166B2 (en) * | 2015-06-29 | 2019-08-13 | The Nielson Company (Us), Llc | Methods and apparatus to determine tags for media using multiple media features |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8234115B2 (en) | 2002-03-29 | 2012-07-31 | At&T Intellectual Property Ii, L.P. | Systems and methods for determining the N-best strings |
US7865352B2 (en) * | 2006-06-02 | 2011-01-04 | Microsoft Corporation | Generating grammatical elements in natural language sentences |
US8209163B2 (en) * | 2006-06-02 | 2012-06-26 | Microsoft Corporation | Grammatical element generation in machine translation |
US7881928B2 (en) * | 2006-09-01 | 2011-02-01 | International Business Machines Corporation | Enhanced linguistic transformation |
US8065300B2 (en) * | 2008-03-12 | 2011-11-22 | At&T Intellectual Property Ii, L.P. | Finding the website of a business using the business name |
US8831944B2 (en) * | 2009-12-15 | 2014-09-09 | At&T Intellectual Property I, L.P. | System and method for tightly coupling automatic speech recognition and search |
US8838434B1 (en) * | 2011-07-29 | 2014-09-16 | Nuance Communications, Inc. | Bootstrap call router to other languages using selected N-best translations |
KR101222486B1 (en) | 2012-04-13 | 2013-01-16 | 주식회사 페타바이 | Method, server, terminal, and computer-readable recording medium for selectively eliminating nondeterministic element of nondeterministic finite automata |
US9280970B1 (en) * | 2013-06-25 | 2016-03-08 | Google Inc. | Lattice semantic parsing |
JP6301647B2 (en) | 2013-12-24 | 2018-03-28 | 株式会社東芝 | SEARCH DEVICE, SEARCH METHOD, AND PROGRAM |
US9881006B2 (en) | 2014-02-28 | 2018-01-30 | Paypal, Inc. | Methods for automatic generation of parallel corpora |
US9940658B2 (en) | 2014-02-28 | 2018-04-10 | Paypal, Inc. | Cross border transaction machine translation |
US9569526B2 (en) | 2014-02-28 | 2017-02-14 | Ebay Inc. | Automatic machine translation using user feedback |
US9530161B2 (en) | 2014-02-28 | 2016-12-27 | Ebay Inc. | Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-structured data |
US10572810B2 (en) | 2015-01-07 | 2020-02-25 | Microsoft Technology Licensing, Llc | Managing user interaction for input understanding determinations |
US10249297B2 (en) * | 2015-07-13 | 2019-04-02 | Microsoft Technology Licensing, Llc | Propagating conversational alternatives using delayed hypothesis binding |
US10446137B2 (en) | 2016-09-07 | 2019-10-15 | Microsoft Technology Licensing, Llc | Ambiguity resolving conversational understanding system |
Citations (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5450598A (en) * | 1985-12-27 | 1995-09-12 | Xerox Corporation | Finite state machine data storage where data transition is accomplished without the use of pointers |
US5477451A (en) * | 1991-07-25 | 1995-12-19 | International Business Machines Corp. | Method and system for natural language translation |
US5495409A (en) | 1993-01-29 | 1996-02-27 | Matsushita Electric Industrial Co., Ltd. | Constructing method of finite-state machine performing transitions according to a partial type of success function and a failure function |
US5510981A (en) * | 1993-10-28 | 1996-04-23 | International Business Machines Corporation | Language translation apparatus and method using context-based translation models |
US5737621A (en) | 1993-04-21 | 1998-04-07 | Xerox Corporation | Finite-state encoding system for hyphenation rules |
US5806032A (en) * | 1996-06-14 | 1998-09-08 | Lucent Technologies Inc. | Compilation of weighted finite-state transducers from decision trees |
US6018736A (en) * | 1994-10-03 | 2000-01-25 | Phonetic Systems Ltd. | Word-containing database accessing system for responding to ambiguous queries, including a dictionary of database words, a dictionary searcher and a database searcher |
US6032111A (en) * | 1997-06-23 | 2000-02-29 | At&T Corp. | Method and apparatus for compiling context-dependent rewrite rules and input strings |
US6073098A (en) * | 1997-11-21 | 2000-06-06 | At&T Corporation | Method and apparatus for generating deterministic approximate weighted finite-state automata |
US6167377A (en) * | 1997-03-28 | 2000-12-26 | Dragon Systems, Inc. | Speech recognition language models |
US6182039B1 (en) * | 1998-03-24 | 2001-01-30 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus using probabilistic language model based on confusable sets for speech recognition |
US6233544B1 (en) * | 1996-06-14 | 2001-05-15 | At&T Corp | Method and apparatus for language translation |
US6243679B1 (en) * | 1997-01-21 | 2001-06-05 | At&T Corporation | Systems and methods for determinization and minimization a finite state transducer for speech recognition |
US6278973B1 (en) * | 1995-12-12 | 2001-08-21 | Lucent Technologies, Inc. | On-demand language processing system and method |
US6401060B1 (en) * | 1998-06-25 | 2002-06-04 | Microsoft Corporation | Method for typographical detection and replacement in Japanese text |
US6714905B1 (en) * | 2000-05-02 | 2004-03-30 | Iphrase.Com, Inc. | Parsing ambiguous grammar |
US6823307B1 (en) * | 1998-12-21 | 2004-11-23 | Koninklijke Philips Electronics N.V. | Language model based on the speech recognition history |
US6952667B2 (en) * | 2000-04-03 | 2005-10-04 | Xerox Corporation | Method and apparatus for extracting infinite ambiguity when factoring finite state transducers |
US6959273B2 (en) * | 2000-04-03 | 2005-10-25 | Xerox Corporation | Method and apparatus for factoring finite state transducers with unknown symbols |
US6990445B2 (en) * | 2001-12-17 | 2006-01-24 | Xl8 Systems, Inc. | System and method for speech recognition and transcription |
US8234115B2 (en) | 2002-03-29 | 2012-07-31 | At&T Intellectual Property Ii, L.P. | Systems and methods for determining the N-best strings |
- 2002-11-21: US application US10/301,098, patent US8234115B2 (not active, Expired - Fee Related)
- 2003-03-21: CA application CA2423145A, patent CA2423145C (not active, Expired - Fee Related)
- 2003-03-27: EP application EP03100794A, patent EP1398759B1 (not active, Expired - Fee Related)
- 2003-03-27: DE application DE60326196T, patent DE60326196D1 (not active, Expired - Lifetime)
- 2012-07-30: US application US13/562,022, patent US8527273B2 (not active, Expired - Fee Related)
Patent Citations (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5450598A (en) * | 1985-12-27 | 1995-09-12 | Xerox Corporation | Finite state machine data storage where data transition is accomplished without the use of pointers |
US5477451A (en) * | 1991-07-25 | 1995-12-19 | International Business Machines Corp. | Method and system for natural language translation |
US5805832A (en) * | 1991-07-25 | 1998-09-08 | International Business Machines Corporation | System for parametric text to text language translation |
US5495409A (en) | 1993-01-29 | 1996-02-27 | Matsushita Electric Industrial Co., Ltd. | Constructing method of finite-state machine performing transitions according to a partial type of success function and a failure function |
US5737621A (en) | 1993-04-21 | 1998-04-07 | Xerox Corporation | Finite-state encoding system for hyphenation rules |
US5510981A (en) * | 1993-10-28 | 1996-04-23 | International Business Machines Corporation | Language translation apparatus and method using context-based translation models |
US6018736A (en) * | 1994-10-03 | 2000-01-25 | Phonetic Systems Ltd. | Word-containing database accessing system for responding to ambiguous queries, including a dictionary of database words, a dictionary searcher and a database searcher |
US6256630B1 (en) * | 1994-10-03 | 2001-07-03 | Phonetic Systems Ltd. | Word-containing database accessing system for responding to ambiguous queries, including a dictionary of database words, a dictionary searcher and a database searcher |
US6278973B1 (en) * | 1995-12-12 | 2001-08-21 | Lucent Technologies, Inc. | On-demand language processing system and method |
US6233544B1 (en) * | 1996-06-14 | 2001-05-15 | At&T Corp | Method and apparatus for language translation |
US5806032A (en) * | 1996-06-14 | 1998-09-08 | Lucent Technologies Inc. | Compilation of weighted finite-state transducers from decision trees |
US6243679B1 (en) * | 1997-01-21 | 2001-06-05 | At&T Corporation | Systems and methods for determinization and minimization a finite state transducer for speech recognition |
US6167377A (en) * | 1997-03-28 | 2000-12-26 | Dragon Systems, Inc. | Speech recognition language models |
US6032111A (en) * | 1997-06-23 | 2000-02-29 | At&T Corp. | Method and apparatus for compiling context-dependent rewrite rules and input strings |
US6073098A (en) * | 1997-11-21 | 2000-06-06 | At&T Corporation | Method and apparatus for generating deterministic approximate weighted finite-state automata |
US6266634B1 (en) * | 1997-11-21 | 2001-07-24 | At&T Corporation | Method and apparatus for generating deterministic approximate weighted finite-state automata |
US6182039B1 (en) * | 1998-03-24 | 2001-01-30 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus using probabilistic language model based on confusable sets for speech recognition |
US6401060B1 (en) * | 1998-06-25 | 2002-06-04 | Microsoft Corporation | Method for typographical detection and replacement in Japanese text |
US6823307B1 (en) * | 1998-12-21 | 2004-11-23 | Koninklijke Philips Electronics N.V. | Language model based on the speech recognition history |
US6952667B2 (en) * | 2000-04-03 | 2005-10-04 | Xerox Corporation | Method and apparatus for extracting infinite ambiguity when factoring finite state transducers |
US6959273B2 (en) * | 2000-04-03 | 2005-10-25 | Xerox Corporation | Method and apparatus for factoring finite state transducers with unknown symbols |
US6714905B1 (en) * | 2000-05-02 | 2004-03-30 | Iphrase.Com, Inc. | Parsing ambiguous grammar |
US6990445B2 (en) * | 2001-12-17 | 2006-01-24 | Xl8 Systems, Inc. | System and method for speech recognition and transcription |
US8234115B2 (en) | 2002-03-29 | 2012-07-31 | At&T Intellectual Property Ii, L.P. | Systems and methods for determining the N-best strings |
Non-Patent Citations (8)
Title |
---|
A. Buchsbaum et al., "On the Determinization of Weighted Finite Automata," in Proc. 25th ICALP, Aalborg, Denmark, 1998, pp. 1-12. |
Alfred V. Aho et al., Compilers, Principles, Techniques and Tools, Addison-Wesley: Reading, MA, 1986. |
Cyril Allauzen et al., On the Determinizability of Weighted Automata and Transducers, in Proceedings of the Workshop Weighted Automata: Theory and Application (WATA), Dresden, Germany, Mar. 2002. |
European Examination Report dated May 4, 2007 for corresponding European Patent Application No. 03100794.1-2215 (5 pages). |
European Search Report dated Nov. 16, 2004 for corresponding European Patent Application No. 03100794.4. |
Mehryar Mohri, Finite-State Transducers in Language and Speech Processing, Computational Linguistics, vol. 23:2, 1997. |
Mohri, et al., "Weighted Finite-State Transducers in Speech Recognition", Computer Speech and Language, Academic Press, Jan. 2002. |
T. Cormen et al., Introduction to Algorithms, The MIT Press: Cambridge, MA 1992. |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10380166B2 (en) * | 2015-06-29 | 2019-08-13 | The Nielson Company (Us), Llc | Methods and apparatus to determine tags for media using multiple media features |
US11138253B2 (en) * | 2015-06-29 | 2021-10-05 | The Nielsen Company (Us), Llc | Methods and apparatus to determine tags for media using multiple media features |
US20220027402A1 (en) * | 2015-06-29 | 2022-01-27 | The Nielsen Company (Us), Llc | Methods and apparatus to determine tags for media using multiple media features |
US11727044B2 (en) * | 2015-06-29 | 2023-08-15 | The Nielsen Company (Us), Llc | Methods and apparatus to determine tags for media using multiple media features |
Also Published As
Publication number | Publication date |
---|---|
EP1398759B1 (en) | 2009-02-18 |
CA2423145C (en) | 2010-11-16 |
US20120296648A1 (en) | 2012-11-22 |
US20030187644A1 (en) | 2003-10-02 |
US8234115B2 (en) | 2012-07-31 |
EP1398759A2 (en) | 2004-03-17 |
EP1398759A3 (en) | 2004-12-29 |
CA2423145A1 (en) | 2003-09-29 |
DE60326196D1 (en) | 2009-04-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8527273B2 (en) | Systems and methods for determining the N-best strings | |
US7127393B2 (en) | Dynamic semantic control of a speech recognition system | |
Chen et al. | Recurrent neural networks as weighted language recognizers | |
JP4053141B2 (en) | Method of automatic task classification based on voice, method of automatic call classification, and automatic task classification system | |
US7421387B2 (en) | Dynamic N-best algorithm to reduce recognition errors | |
Soong et al. | A tree-trellis based fast search for finding the n best sentence hypotheses in continuous speech recognition | |
EP0387602B1 (en) | Method and apparatus for the automatic determination of phonological rules as for a continuous speech recognition system | |
US9202464B1 (en) | Curriculum learning for speech recognition | |
US6501833B2 (en) | Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system | |
KR100312920B1 (en) | Method and apparatus for connected speech recognition | |
JP4528535B2 (en) | Method and apparatus for predicting word error rate from text | |
JP2775140B2 (en) | Pattern recognition method, voice recognition method, and voice recognition device | |
US20070118376A1 (en) | Word clustering for input data | |
CN111428504B (en) | Event extraction method and device | |
US6859774B2 (en) | Error corrective mechanisms for consensus decoding of speech | |
US6507815B1 (en) | Speech recognition apparatus and method | |
US8676580B2 (en) | Automatic speech and concept recognition | |
JP2002149186A (en) | Selection of substitute word string concerning identifiable adaptation | |
US20050187767A1 (en) | Dynamic N-best algorithm to reduce speech recognition errors | |
JP2002358097A (en) | Voice recognition device | |
US10402492B1 (en) | Processing natural language grammar | |
JP2006201265A (en) | Voice recognition device | |
CN115294974A (en) | Voice recognition method, device, equipment and storage medium | |
Švec et al. | Semantic entity detection from multiple ASR hypotheses within the WFST framework | |
JP3950957B2 (en) | Language processing apparatus and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: AT&T CORP., NEW YORK. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: MOHRI, MEHRYAR; RILEY, MICHAEL DENNIS; SIGNING DATES FROM 20021114 TO 20021119; REEL/FRAME: 028684/0188 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | AS | Assignment | Owner name: AT&T PROPERTIES, LLC, NEVADA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: AT&T CORP.; REEL/FRAME: 052387/0631. Effective date: 20200407. Owner name: AT&T INTELLECTUAL PROPERTY II, L.P., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: AT&T PROPERTIES, LLC; REEL/FRAME: 052388/0196. Effective date: 20200407 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20210903 |