DE69814104D1 - Text segmentation and identification of topics

Text segmentation and identification of topics

Info

Publication number
DE69814104D1
DE69814104D1 DE69814104T DE69814104T DE69814104D1 DE 69814104 D1 DE69814104 D1 DE 69814104D1 DE 69814104 T DE69814104 T DE 69814104T DE 69814104 T DE69814104 T DE 69814104T DE 69814104 D1 DE69814104 D1 DE 69814104D1
Authority
DE
Germany
Prior art keywords
topics
texts
identification
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
DE69814104T
Other languages
English (en)
Other versions
DE69814104T2 (de)
Inventor
Jonathan Yamron
G Bamberg
James Barnett
S Gillick
Mulbregt A Van
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
L&H Holdings USA Inc
Original Assignee
L&H Holdings USA Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1997-09-09
Filing date
1998-09-09
Publication date
2003-06-05
Application filed by L&H Holdings USA Inc
Publication of DE69814104D1
Application granted
Publication of DE69814104T2
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
DE69814104T 1997-09-09 1998-09-09 Text segmentation and identification of topics Expired - Fee Related DE69814104T2 (de)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US5826197P 1997-09-09 1997-09-09
US58261P 1997-09-09
US08/978,487 US6052657A (en) 1997-09-09 1997-11-25 Text segmentation and identification of topic using language models
US978487 1997-11-25
PCT/US1998/018830 WO1999013408A2 (en) 1997-09-09 1998-09-09 Text segmentation and identification of topics

Publications (2)

Publication Number Publication Date
DE69814104D1 (de) 2003-06-05
DE69814104T2 (de) 2004-04-29

Family

ID=26737425

Family Applications (1)

Application Number Title Priority Date Filing Date
DE69814104T Expired - Fee Related DE69814104T2 (de) 1997-09-09 1998-09-09 Text segmentation and identification of topics

Country Status (4)

Country Link
US (1) US6052657A (de)
EP (1) EP1012736B1 (de)
DE (1) DE69814104T2 (de)
WO (1) WO1999013408A2 (de)

Families Citing this family (107)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7529756B1 (en) 1998-07-21 2009-05-05 West Services, Inc. System and method for processing formatted text documents in a database
US7778954B2 (en) 1998-07-21 2010-08-17 West Publishing Corporation Systems, methods, and software for presenting legal case histories
US7447626B2 (en) * 1998-09-28 2008-11-04 Udico Holdings Method and apparatus for generating a language independent document abstract
US7356462B2 (en) 2001-07-26 2008-04-08 At&T Corp. Automatic clustering of tokens from a corpus for grammar acquisition
US6317707B1 (en) * 1998-12-07 2001-11-13 At&T Corp. Automatic clustering of tokens from a corpus for grammar acquisition
US20050261907A1 (en) * 1999-04-12 2005-11-24 Ben Franklin Patent Holding Llc Voice integration platform
US6904402B1 (en) * 1999-11-05 2005-06-07 Microsoft Corporation System and iterative method for lexicon, segmentation and language model joint optimization
US6529902B1 (en) * 1999-11-08 2003-03-04 International Business Machines Corporation Method and system for off-line detection of textual topical changes and topic identification via likelihood based methods for improved language modeling
US6505151B1 (en) * 2000-03-15 2003-01-07 Bridgewell Inc. Method for dividing sentences into phrases using entropy calculations of word combinations based on adjacent words
US7035788B1 (en) * 2000-04-25 2006-04-25 Microsoft Corporation Language model sharing
US7490092B2 (en) 2000-07-06 2009-02-10 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
JP4299963B2 (ja) * 2000-10-02 2009-07-22 ヒューレット・パッカード・カンパニー Apparatus and method for segmenting a document based on semantic cohesion
US6941266B1 (en) * 2000-11-15 2005-09-06 At&T Corp. Method and system for predicting problematic dialog situations in a task classification system
US6772120B1 (en) * 2000-11-21 2004-08-03 Hewlett-Packard Development Company, L.P. Computer method and apparatus for segmenting text streams
JP4947861B2 (ja) * 2001-09-25 2012-06-06 キヤノン株式会社 Natural language processing apparatus, control method therefor, and program
US7610189B2 (en) * 2001-10-18 2009-10-27 Nuance Communications, Inc. Method and apparatus for efficient segmentation of compound words using probabilistic breakpoint traversal
US7062498B2 (en) 2001-11-02 2006-06-13 Thomson Legal Regulatory Global Ag Systems, methods, and software for classifying text from judicial opinions and other documents
US7117200B2 (en) 2002-01-11 2006-10-03 International Business Machines Corporation Synthesizing information-bearing content from multiple channels
KR20030069377A (ko) * 2002-02-20 2003-08-27 대한민국(전남대학교총장) Apparatus and method for detecting topics in a speech recognition system
US7143035B2 (en) * 2002-03-27 2006-11-28 International Business Machines Corporation Methods and apparatus for generating dialog state conditioned language models
US20040006628A1 (en) * 2002-07-03 2004-01-08 Scott Shepard Systems and methods for providing real-time alerting
US20040117188A1 (en) * 2002-07-03 2004-06-17 Daniel Kiecza Speech based personal information manager
US7493253B1 (en) * 2002-07-12 2009-02-17 Language And Computing, Inc. Conceptual world representation natural language understanding system and method
US20040083090A1 (en) * 2002-10-17 2004-04-29 Daniel Kiecza Manager for integrating language technology components
US7376893B2 (en) * 2002-12-16 2008-05-20 Palo Alto Research Center Incorporated Systems and methods for sentence based interactive topic-based text summarization
US7451395B2 (en) * 2002-12-16 2008-11-11 Palo Alto Research Center Incorporated Systems and methods for interactive topic-based text summarization
US7117437B2 (en) * 2002-12-16 2006-10-03 Palo Alto Research Center Incorporated Systems and methods for displaying interactive topic-based text summaries
GB0230097D0 (en) * 2002-12-24 2003-01-29 Koninkl Philips Electronics Nv Method and system for augmenting an audio signal
US7310658B2 (en) * 2002-12-27 2007-12-18 International Business Machines Corporation Method for tracking responses to a forum topic
US7958443B2 (en) * 2003-02-28 2011-06-07 Dictaphone Corporation System and method for structuring speech recognized text into a pre-selected document format
EP1462950B1 (de) * 2003-03-27 2007-08-29 Sony Deutschland GmbH Method for language modeling
ATE518193T1 (de) * 2003-05-28 2011-08-15 Loquendo Spa Automatic segmentation of texts comprising units without separators
US7493251B2 (en) * 2003-05-30 2009-02-17 Microsoft Corporation Using source-channel models for word segmentation
US8327255B2 (en) * 2003-08-07 2012-12-04 West Services, Inc. Computer program product containing electronic transcript and exhibit files and method for making the same
US7389233B1 (en) * 2003-09-02 2008-06-17 Verizon Corporate Services Group Inc. Self-organizing speech recognition for information extraction
WO2005050474A2 (en) * 2003-11-21 2005-06-02 Philips Intellectual Property & Standards Gmbh Text segmentation and label assignment with user interaction by means of topic specific language models and topic-specific label statistics
JP2007512609A (ja) * 2003-11-21 2007-05-17 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Text segmentation and topic annotation for document structuring
US20070244690A1 (en) * 2003-11-21 2007-10-18 Koninklijke Philips Electronic, N.V. Clustering of Text for Structuring of Text Documents and Training of Language Models
JP5255769B2 (ja) * 2003-11-21 2013-08-07 ニュアンス コミュニケーションズ オーストリア ゲーエムベーハー Topic-specific models for text formatting and speech recognition
EP1704499A1 (de) * 2003-12-31 2006-09-27 Thomson Global Resources AG Systems, methods, computer programs, and interfaces for integrating case law with legal briefs, litigation documents, and/or litigation-support documents
JP4860265B2 (ja) * 2004-01-16 2012-01-25 日本電気株式会社 Text processing method, program, program recording medium, and apparatus
US7426557B2 (en) * 2004-05-14 2008-09-16 International Business Machines Corporation System, method, and service for inducing a pattern of communication among various parties
US7281022B2 (en) * 2004-05-15 2007-10-09 International Business Machines Corporation System, method, and service for segmenting a topic into chatter and subtopics
US20060224584A1 (en) * 2005-03-31 2006-10-05 Content Analyst Company, Llc Automatic linear text segmentation
US20060256937A1 (en) * 2005-05-12 2006-11-16 Foreman Paul E System and method for conversation analysis
US8572018B2 (en) * 2005-06-20 2013-10-29 New York University Method, system and software arrangement for reconstructing formal descriptive models of processes from functional/modal data using suitable ontology
KR100755677B1 (ko) * 2005-11-02 2007-09-05 삼성전자주식회사 Apparatus and method for conversational speech recognition using topic domain detection
US20070106644A1 (en) * 2005-11-08 2007-05-10 International Business Machines Corporation Methods and apparatus for extracting and correlating text information derived from comment and product databases for use in identifying product improvements based on comment and product database commonalities
US8301448B2 (en) * 2006-03-29 2012-10-30 Nuance Communications, Inc. System and method for applying dynamic contextual grammars and language models to improve automatic speech recognition accuracy
US8386232B2 (en) * 2006-06-01 2013-02-26 Yahoo! Inc. Predicting results for input data based on a model generated from clusters
US8401841B2 (en) * 2006-08-31 2013-03-19 Orcatec Llc Retrieval of documents using language models
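JP4188989B2 (ja) * 2006-09-15 2008-12-03 本田技研工業株式会社 Speech recognition apparatus, speech recognition method, and speech recognition program
JP5256654B2 (ja) * 2007-06-29 2013-08-07 富士通株式会社 Text segmentation program, text segmentation apparatus, and text segmentation method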
US7983902B2 (en) * 2007-08-23 2011-07-19 Google Inc. Domain dictionary creation by detection of new topic words using divergence value comparison
US7917355B2 (en) 2007-08-23 2011-03-29 Google Inc. Word detection
US8073682B2 (en) 2007-10-12 2011-12-06 Palo Alto Research Center Incorporated System and method for prospecting digital information
US8671104B2 (en) 2007-10-12 2014-03-11 Palo Alto Research Center Incorporated System and method for providing orientation into digital information
US8165985B2 (en) 2007-10-12 2012-04-24 Palo Alto Research Center Incorporated System and method for performing discovery of digital information in a subject area
DE102007056140A1 (de) 2007-11-19 2009-05-20 Deutsche Telekom Ag Method and system for information retrieval
US20090132252A1 (en) * 2007-11-20 2009-05-21 Massachusetts Institute Of Technology Unsupervised Topic Segmentation of Acoustic Speech Signal
WO2009084554A1 (ja) * 2007-12-27 2009-07-09 Nec Corporation Text segmentation apparatus, text segmentation method, and program
US8666729B1 (en) * 2010-02-10 2014-03-04 West Corporation Processing natural language grammar
US8806455B1 (en) * 2008-06-25 2014-08-12 Verint Systems Ltd. Systems and methods for text nuclearization
US20100057577A1 (en) * 2008-08-28 2010-03-04 Palo Alto Research Center Incorporated System And Method For Providing Topic-Guided Broadening Of Advertising Targets In Social Indexing
US20100057536A1 (en) * 2008-08-28 2010-03-04 Palo Alto Research Center Incorporated System And Method For Providing Community-Based Advertising Term Disambiguation
US8209616B2 (en) * 2008-08-28 2012-06-26 Palo Alto Research Center Incorporated System and method for interfacing a web browser widget with social indexing
US8010545B2 (en) * 2008-08-28 2011-08-30 Palo Alto Research Center Incorporated System and method for providing a topic-directed search
US8549016B2 (en) * 2008-11-14 2013-10-01 Palo Alto Research Center Incorporated System and method for providing robust topic identification in social indexes
US8713016B2 (en) 2008-12-24 2014-04-29 Comcast Interactive Media, Llc Method and apparatus for organizing segments of media assets and determining relevance of segments to a query
US9442933B2 (en) * 2008-12-24 2016-09-13 Comcast Interactive Media, Llc Identification of segments within audio, video, and multimedia items
US11531668B2 (en) * 2008-12-29 2022-12-20 Comcast Interactive Media, Llc Merging of multiple data sets
US8452781B2 (en) * 2009-01-27 2013-05-28 Palo Alto Research Center Incorporated System and method for using banded topic relevance and time for article prioritization
US8356044B2 (en) * 2009-01-27 2013-01-15 Palo Alto Research Center Incorporated System and method for providing default hierarchical training for social indexing
US8239397B2 (en) * 2009-01-27 2012-08-07 Palo Alto Research Center Incorporated System and method for managing user attention by detecting hot and cold topics in social indexes
US8458105B2 (en) * 2009-02-12 2013-06-04 Decisive Analytics Corporation Method and apparatus for analyzing and interrelating data
US20100235314A1 (en) * 2009-02-12 2010-09-16 Decisive Analytics Corporation Method and apparatus for analyzing and interrelating video data
US8176043B2 (en) 2009-03-12 2012-05-08 Comcast Interactive Media, Llc Ranking search results
US10191654B2 (en) 2009-03-30 2019-01-29 Touchtype Limited System and method for inputting text into electronic devices
GB201016385D0 (en) 2010-09-29 2010-11-10 Touchtype Ltd System and method for inputting text into electronic devices
GB0905457D0 (en) 2009-03-30 2009-05-13 Touchtype Ltd System and method for inputting text into electronic devices
US9424246B2 (en) 2009-03-30 2016-08-23 Touchtype Ltd. System and method for inputting text into electronic devices
US9189472B2 (en) 2009-03-30 2015-11-17 Touchtype Limited System and method for inputting text into small screen devices
GB0917753D0 (en) 2009-10-09 2009-11-25 Touchtype Ltd System and method for inputting text into electronic devices
US20100250614A1 (en) * 2009-03-31 2010-09-30 Comcast Cable Holdings, Llc Storing and searching encoded data
US8533223B2 (en) 2009-05-12 2013-09-10 Comcast Interactive Media, LLC. Disambiguation and tagging of entities
US9892730B2 (en) 2009-07-01 2018-02-13 Comcast Interactive Media, Llc Generating topic-specific language models
EP2485212A4 (de) * 2009-10-02 2016-12-07 Nat Inst Inf & Comm Tech Speech translation system, first terminal device, speech recognition server, translation server, and speech synthesis server
US20110202484A1 (en) * 2010-02-18 2011-08-18 International Business Machines Corporation Analyzing parallel topics from correlated documents
GB201003628D0 (en) 2010-03-04 2010-04-21 Touchtype Ltd System and method for inputting text into electronic devices
US9031944B2 (en) 2010-04-30 2015-05-12 Palo Alto Research Center Incorporated System and method for providing multi-core and multi-level topical organization in social indexes
US8434001B2 (en) 2010-06-03 2013-04-30 Rhonda Enterprises, Llc Systems and methods for presenting a content summary of a media item to a user based on a position within the media item
US9326116B2 (en) 2010-08-24 2016-04-26 Rhonda Enterprises, Llc Systems and methods for suggesting a pause position within electronic text
US8977538B2 (en) * 2010-09-13 2015-03-10 Richard Salisbury Constructing and analyzing a word graph
GB201200643D0 (en) 2012-01-16 2012-02-29 Touchtype Ltd System and method for inputting text
US9069754B2 (en) 2010-09-29 2015-06-30 Rhonda Enterprises, Llc Method, system, and computer readable medium for detecting related subgroups of text in an electronic document
US9442930B2 (en) 2011-09-07 2016-09-13 Venio Inc. System, method and computer program product for automatic topic identification using a hypertext corpus
US9442928B2 (en) 2011-09-07 2016-09-13 Venio Inc. System, method and computer program product for automatic topic identification using a hypertext corpus
US9355170B2 (en) * 2012-11-27 2016-05-31 Hewlett Packard Enterprise Development Lp Causal topic miner
WO2015199653A1 (en) * 2014-06-24 2015-12-30 Nuance Communications, Inc. Methods and apparatus for joint stochastic and deterministic dictation formatting
US9881023B2 (en) * 2014-07-22 2018-01-30 Microsoft Technology Licensing, Llc Retrieving/storing images associated with events
US20160070692A1 (en) * 2014-09-10 2016-03-10 Microsoft Corporation Determining segments for documents
GB201610984D0 (en) 2016-06-23 2016-08-10 Microsoft Technology Licensing Llc Suppression of input images
US10402473B2 (en) * 2016-10-16 2019-09-03 Richard Salisbury Comparing, and generating revision markings with respect to, an arbitrary number of text segments
KR20180077689A (ko) * 2016-12-29 2018-07-09 주식회사 엔씨소프트 Apparatus and method for natural language generation
US11301629B2 (en) 2019-08-21 2022-04-12 International Business Machines Corporation Interleaved conversation concept flow enhancement
US11308944B2 (en) 2020-03-12 2022-04-19 International Business Machines Corporation Intent boundary segmentation for multi-intent utterances
JP2023035617A (ja) * 2021-09-01 2023-03-13 株式会社東芝 Communication data log processing apparatus, method, and program

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4663675A (en) * 1984-05-04 1987-05-05 International Business Machines Corporation Apparatus and method for digital speech filing and retrieval
US4783803A (en) * 1985-11-12 1988-11-08 Dragon Systems, Inc. Speech recognition apparatus and method
US4829576A (en) * 1986-10-21 1989-05-09 Dragon Systems, Inc. Voice recognition system
US4805218A (en) * 1987-04-03 1989-02-14 Dragon Systems, Inc. Method for speech analysis and speech recognition
US4805219A (en) * 1987-04-03 1989-02-14 Dragon Systems, Inc. Method for speech recognition
US4931950A (en) * 1988-07-25 1990-06-05 Electric Power Research Institute Multimedia interface and method for computer system
US5027406A (en) * 1988-12-06 1991-06-25 Dragon Systems, Inc. Method for interactive speech recognition and training
US5392428A (en) * 1991-06-28 1995-02-21 Robins; Stanford K. Text analysis system
US5251131A (en) * 1991-07-31 1993-10-05 Thinking Machines Corporation Classification of data records by comparison of records to a training database using probability weights
US5278980A (en) * 1991-08-16 1994-01-11 Xerox Corporation Iterative technique for phrase query formation and an information retrieval system employing same
US5267345A (en) * 1992-02-10 1993-11-30 International Business Machines Corporation Speech recognition apparatus which predicts word classes from context and words from word classes
GB9220404D0 (en) * 1992-08-20 1992-11-11 Nat Security Agency Method of identifying,retrieving and sorting documents
US5425129A (en) * 1992-10-29 1995-06-13 International Business Machines Corporation Method for word spotting in continuous speech
US5428707A (en) * 1992-11-13 1995-06-27 Dragon Systems, Inc. Apparatus and methods for training speech recognition systems and their users and otherwise improving speech recognition performance
US5806021A (en) * 1995-10-30 1998-09-08 International Business Machines Corporation Automatic segmentation of continuous text using statistical approaches
US5835888A (en) * 1996-06-10 1998-11-10 International Business Machines Corporation Statistical language model for inflected languages
US5839106A (en) * 1996-12-17 1998-11-17 Apple Computer, Inc. Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model

Also Published As

Publication number Publication date
WO1999013408A2 (en) 1999-03-18
WO1999013408A3 (en) 1999-06-03
EP1012736A2 (de) 2000-06-28
US6052657A (en) 2000-04-18
DE69814104T2 (de) 2004-04-29
EP1012736B1 (de) 2003-05-02

Similar Documents

Publication Publication Date Title
DE69814104D1 (de) Text segmentation and identification of topics
DE69842210D1 (de) Insertion device and insertion set
DE69803199D1 (de) Detection and removal of macro viruses
DE69828963D1 (de) Drug delivery and gene therapy system
DE69422791D1 (de) Transmission and reception of program information
DE69836459D1 (de) Optical element and optical system provided therewith
DE69637439D1 (de) Ring oscillator circuit and chip card
DE69831755D1 (de) Optoacoustic contrast agents and methods of use
DE69627188T2 (de) Spectacles and spectacle frame
DE69728434D1 (de) Coupling device and bottles
LU91437I2 (fr) "Dabigatran etexilate and its salts, particularly dabigatran etexilate mesilate (PRADAXA)"
DE69817082D1 (de) Cassette and marking system
DE69831750D1 (de) Biocides and biocidal cloths
DE959898T1 (de) Lactoferrin variants and uses thereof
ID18456A (id) Co-production of 6-aminocapronitrile and hexamethylenediamine
DE69732436D1 (de) Syrups and emulsion-containing coatings made therefrom
FI974077A0 (fi) Method and device for automatically extending and retracting the antenna of a cordless telephone
ID18459A (id) Co-production of 6-aminocapronitrile and hexamethylenediamine
DE69717677T2 (de) Co-production of perfluoromethyl perfluorovinyl ether and perfluoroethyl perfluorovinyl ether
ID16587A (id) Co-production of 6-aminocapronitrile and hexamethylenediamine
DE69842206D1 (de) Echo canceller and echo cancellation method
DE29717551U1 (de) Bag pack, bag pack assortment, and product range of bags
DE69706322T2 (de) Blend of branched and linear polycarbonate resins
DE69324588D1 (de) Fluorinating reagent and fluorination method
FI972010A0 (fi) Arrangement in a drilling device

Legal Events

Date Code Title Description
8332 No legal effect for de
8370 Indication of lapse of patent is to be deleted
8328 Change in the person/name/address of the agent

Representative's name: P.E. MEISSNER UND KOLLEGEN, 14199 BERLIN

8364 No opposition during term of opposition
8339 Ceased/non-payment of the annual fee