WO2014084712A1 - A system and method for automated generation of contextual revised knowledge base - Google Patents

A system and method for automated generation of contextual revised knowledge base

Info

Publication number
WO2014084712A1
Authority
WO
WIPO (PCT)
Prior art keywords
concepts
entities
instances
domain
list
Prior art date
Application number
PCT/MY2013/000199
Other languages
French (fr)
Inventor
Sieow Yeek TAN
Anand Sadanandan ARUN
Lukose Dickson
Original Assignee
Mimos Berhad
Priority date
Filing date
Publication date
Application filed by Mimos Berhad
Publication of WO2014084712A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data

Definitions

  • the present invention relates to a system and method for automated generation of contextual revised knowledge base.
  • the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base from a domain knowledge-base based on specific entities and associated contextual information.
  • KB contextual revised knowledge-base
  • Direct keyword mapping is inaccurate and incomplete and semantic level context cannot be extracted from the description given in natural text format provided as guidance.
  • the current searching process, which maps keywords from all contents of a Subject Specific KB to form a Revised KB, is computationally expensive. Many strategies have been introduced for the mapping selection process.
  • presently available contextual revised KB approaches are not extensible to different domains, as the designed methods are tailored to a specific domain utilizing a proprietary protocol.
  • Extraction Module handles extraction of important sentences from document using HTML tag to identify text and words of important entities as compared to the present invention that utilizes NLP (Natural Language Processing) and semantic processing techniques.
  • Tiun's paper does not identify properties and instances in a process, and it does not compare concepts, relations, instances and concept descriptions that contribute to deriving a concept similarity matching value as provided in the present invention; instead, a weight value concept is introduced wherein a higher weight indicates a more important concept, as a higher keyword match frequency contributes to a higher weight value.
  • the WordNet database is used in enriching concepts and the output is a keyword which is known as a topic concept.
  • the present invention utilizes an extended linguistic thesaurus for enriching concepts.
  • Lame's paper: Another mechanism, which identifies ontology components by using NLP to extract concepts and relations among the concepts, is proposed in a published paper entitled "Using NLP techniques to identify legal ontology components: Concepts and Relations" by Guiraude Lame; published by Springer 2006; Artificial Intelligence and Law (2004), hereby denoted as Lame's paper. It establishes a methodology to identify ontology components utilizing predefined relations, rules, text analysis methods such as syntactical analysis and analysis of the coordination relations, and statistical analysis in identifying concepts. The present invention compares concepts, relations, instances and concept descriptions, which contribute to deriving a concept similarity matching value. In brief, Lame does not utilize an extended linguistic thesaurus for enriching concepts, as compared to the present invention, whereby all concepts and relations are analyzed from a text document.
  • US 436 Publication: A mechanism for repurposing ontology is described in United States Patent Publication No. US 2004/0117346 A1, hereby denoted as US 436 Publication. It relates to a method and apparatus for leveraging existing ontologies for unintended applications and for rapid development of new ontologies by leveraging existing ontologies. All contents are obtained from the original ontology, whereby the US 436 Publication does not identify properties and instances in a process and it does not provide for keyword enrichment, as an extended linguistic thesaurus is not used.
  • One aspect of the present invention provides for a system (100, 200) for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator (102).
  • the said contextual knowledge-base generator (102) comprising at least one Salient Entity List Composer module (204); at least one Concept Extension module (208); at least one Ontology Content Mapping module (212); and at least one Revised Knowledge-base Reconstruction module (214).
  • the at least one Revised Knowledge-base Reconstruction module (214) having means for receiving domain knowledge base with concepts from mapped content ontology; determining if said concepts are marked and further processing marked concepts by preserving original hierarchy structure of marked concepts; preserving instances attached to marked concepts; preserving properties with its domain as preserved instances; and removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts.
  • the invention provides for the at least one Salient Entity List Composer module (204).
  • the said Salient Entity List Composer module (204) having means for extracting a list of concepts and entities from input of said contextual revised knowledge-base generator by the steps of receiving pre-defined concepts, properties and instances with associated information for a specific context; tokenizing said domain specific concepts, properties and instances with associated information for text mining, sense tagging and stemming; identifying important entities from tokenized list; and producing a list of entities by combining identified entities with said pre-defined concepts, properties and instances with associated information for a specific context.
  • the said Concept Extension module (208) having means for extending said concepts and entities which relates to a specific context for ontology mapping by the steps of receiving said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus; looping through said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym from said External Linguistic Thesaurus; and producing a list of extended concepts and entities and further including said extended concepts and entities related to a specific context to existing list of entities.
  • the at least one Ontology Content Mapping module (212) having means for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module.
  • the invention provides a method for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator.
  • the said method comprising steps of extracting a list of concepts and entities from input of said contextual revised knowledge-base generator (302); extending said concepts and entities which relate to a specific context for ontology mapping (304); mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module (306); and constructing revised knowledge-base containing marked concepts with its respective properties and instances (308).
  • the step for constructing revised knowledge-base containing marked concepts with its respective properties and instances further comprises steps of receiving domain knowledge base with concepts from mapped content ontology (1004); determining if said concepts are marked (1010, 1012) and further processing marked concepts by preserving original hierarchy structure of marked concepts; preserving instances attached to marked concepts; preserving properties with its domain as preserved instances; and removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts (1008, 1014, 1016).
  • the step for extracting a list of concepts and entities from input of said contextual revised knowledge-base generator further comprising steps of receiving pre-defined concepts, properties and instances with associated information for a specific context (402); tokenizing said domain specific concepts, properties and instances with associated information for text mining, sense tagging and stemming (404); identifying important entities from tokenized list (406); and producing a list of entities by combining identified entities with said pre-defined concepts properties and instances with associated information for a specific context (408).
  • the step for extending said concepts and entities which relates to a specific context for ontology mapping further comprising steps of receiving said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus (502); looping through said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym from said External Linguistic Thesaurus (504); and producing a list of extended concepts and entities and further including said extended concepts and entities related to a specific context to existing list of entities (506).
  • the step for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module further comprising steps of receiving a list of extended concepts and entities related to a specific context together with information stored in domain knowledge-base which contains concepts, properties and instances (804); determining if there are any unmarked concepts and instances from domain ontology (806); removing extended concepts and entities from said list should there be no unmarked concepts and instances from domain ontology (808); performing concept similarity matching on marked concepts and instances by comparing concept name, property name, concept description and instances from said domain knowledge-base with said list of extended concepts and entities (810); determining if concepts from domain ontology have high concept mapping similarity value (812); marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value (818) and removing extended concepts and entities from said list (808); obtaining concept annotation resources and instances from domain knowledge-base (814); determining if said concept annotation resources and instances have high similarity matching value if there are no
  • the invention provides that marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value further comprising steps of tracking total highest marked number of marked instance and property for all compared domain concepts (702a); normalizing total marked number of each concept in domain ontology with highest marked number as tracked (704a); selecting domain concept with content mapping similarity value larger than threshold value for a threshold value between 0 and 1 (706a); and flagging selected domain concepts with a "mark concept" (708a).
  • FIG. 1.0 illustrates architecture of the present invention.
  • FIG. 2.0 illustrates the components of the contextual revised knowledge-base generator of the present invention.
  • FIG. 3.0 is a flowchart illustrating the methodology for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator of the present invention.
  • FIG. 4.0 illustrates the methodology of the Salient Entity List Composer module of the present invention.
  • FIG. 5.0 illustrates the methodology of Concept Extension module of the present invention.
  • FIG. 6.0 illustrates the scenario of ontology content mapping of the present invention.
  • FIG. 7.0 illustrates the scenario of Concept Mapping Similarity (CMS) calculation of the present invention.
  • FIG. 7.0a is a flowchart illustrating the methodology for marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity of the present invention.
  • FIG. 8.0 is a flowchart illustrating the methodology for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module.
  • FIG. 9.0 illustrates the scenario of a Revised Ontology Reconstruction of a revised knowledge-base of the present invention.
  • FIG. 10.0 is a flowchart illustrating the methodology of constructing revised knowledge-base containing marked concepts with its respective properties and instances of the present invention.
  • the present invention provides a system and method for automated generation of contextual revised knowledge base.
  • the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base (KB) from a domain knowledge-base based on specific entities and associated contextual information.
  • the said contextual KB generator (102) includes a Salient Entity List Composer module (204); a Concept Extension module (208); an Ontology Content Mapping module (212); and a Revised Knowledge-base Reconstruction module (214).
  • the functions of each of the said components of the contextual KB generator (102) will be described in detail together with the methodology of the present invention in the following sections.
  • the invention includes the steps of extracting a list of concepts and entities from the input of said contextual revised knowledge-base generator by the Salient Entity List Composer Module (302) and thereafter the Concept Extension Module extends said concepts and entities which relate to a specific context for ontology mapping (304).
  • the extended concepts and entities will be processed further by the Ontology Content Mapping Module, whereby said Ontology Content Mapping Module maps content ontology by marking concepts from domain ontology to be further processed in revised knowledge-base reconstruction module (306).
  • the constructed revised knowledge-base contains marked concepts with its respective properties and instances.
  • a list of predefined concepts of a specific entity and associated contextual information (402) is first provided as input to the Salient Entity List Composer module (204).
  • a summary description, in natural text format, is also provided as the content of the contextual information (402).
  • for example, a list of predefined concepts of a specific context, such as television, computer, radio and slow cooker, and an example summary of a described topic, "A topic describing about electric apparatus in a shopping complex", are provided as input to the Salient Entity List Composer module (204).
  • the received domain specific concepts, properties and instances with associated information are tokenized (404), wherein the summary of the described topic is pre-processed to produce a list of tokens which becomes input for further processing such as text mining, sense tagging and stemming (404).
  • the list of tokens is further processed to identify important entities (406).
  • the important entities identified from the example as provided earlier are "electrical", "apparatus" and "shopping complex" (406).
  • the extracted entities will be combined with all pre-defined concepts, properties and instances with associated information for a specific context (408), which outputs a list of concepts and entities belonging to the specific context. Based on the example of the specific context, the output list of concepts and entities is "electrical", "apparatus", "shopping complex", "television", "computer", "radio", and "slow cooker".
  • said module will loop through said list of entities produced from Salient Entity List Composer module together with the information from said External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym of the entities from said External Linguistic Thesaurus (504).
  • Examples of extended entities which relate to the specific context are "electrical / electric / charge / electricity / apparatus / equipment / machine / device / appliance / instrument / shopping complex / shopping center / television / tv / video / computer / calculator / CPU / processor / PC / radio / tape player / slow cooker".
  • the output (506) produced for the specific context are "electrical, electric, charge, electricity, apparatus, equipment, machine, device, appliance, instrument, shopping complex, shopping center, television, tv, video, computer, calculator, CPU, processor, PC, radio, tape player".
  • the extended concepts and entities related to a specific context together with information stored in domain knowledge-base which contains concepts, properties and instances (804) are forwarded to the Ontology Content Mapping module (212).
  • a domain knowledge-base is used for processing and extracting the content for the revised knowledge-base.
  • Content ontology is mapped by marking concepts from domain ontology to be further processed in revised knowledge-base reconstruction module.
  • the said Ontology Content Mapping module (212) determines if there are any unmarked concepts and instances from domain ontology (806) and extended concepts and entities are removed from said list should there be no unmarked concepts and instances from domain ontology (808).
  • the contents received by said Ontology Content Mapping module (212) will undergo a matching process, namely the Concept Similarity Matching procedure, whereby concept similarity matching is performed on marked concepts and instances by comparing concept name, property name, concept description and instances from said domain knowledge-base based on said list of extended concepts and entities (810).
  • the comparison similarity values of the Concept Similarity Matching procedure are evaluated to determine if concepts from domain ontology have high concept mapping similarity value (812). Thereafter, the concepts from domain ontology will be marked if concepts from domain ontology have high concept mapping similarity value (818) and the extended concepts and entities will be removed from said list (808).
  • the marking of the concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value is based on the Concept Mapping Similarity (CMS) calculation, whereby the total highest marked number of marked instance and property will be tracked for all compared domain concepts (702a) and the total marked number of each concept in domain ontology will be normalized with the highest marked number as tracked (704a). Thereafter, domain concepts with a content mapping similarity value larger than the threshold value, for a threshold value between 0 and 1, are selected (706a) and the selected domain concepts will be flagged with a "mark concept" accordingly (708a). A higher threshold represents a more specific version of the knowledge-base required, and vice versa.
  • CMS: Concept Mapping Similarity
  • concept annotation resources and its instances will be obtained from domain knowledge-base if the concept mapping similarity value is not high (814). It is further determined if the annotation resources and instances have high similarity matching value if there are no concepts in domain ontology with high concept mapping similarity value (816). The extended concepts and entities from said list are removed should there be any unmarked concepts and instances from domain ontology (808); else the concepts from domain ontology will be marked if said concept annotation resources and instances have high similarity matching value (818) and said extended concepts and entities are removed from said list (808). Thereafter, the domain knowledge-base with marked concepts is forwarded to the Revised Knowledge-base Reconstruction module (214). Concept from domain ontology is forwarded (1004) to the Revised Knowledge-base Reconstruction module (214).
  • the marked concepts are further processed by preserving original hierarchical structure of marked concepts; the instances attached to marked concepts are preserved, properties with its domain are preserved as instances; and unmarked concepts are removed from ontology while preserving original hierarchical structure of said marked concepts (1008, 1014, 1016).
  • the present invention provides for automated generation of contextual revised knowledge base whereby the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base (KB) from a domain knowledge-base based on specific entities and associated contextual information.
  • the word “comprise”, or variations such as “comprises” or “comprising” will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers.
  • the term “comprising” is used in an inclusive sense and thus should be understood as meaning “including principally, but not necessarily solely”.
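
The Concept Mapping Similarity (CMS) steps (702a) to (708a) in the list above can be illustrated with a minimal Python sketch. It assumes that each domain concept already carries a count of its marked instances and properties; the function name `mark_concepts`, the dictionary layout and the example counts are hypothetical and do not come from the specification. Under that assumption, the CMS value of a concept is its count divided by the highest count tracked across all compared concepts.

```python
def mark_concepts(match_counts, threshold):
    """Return (cms_values, marked_concepts) for a dict of concept -> marked count.

    match_counts is an assumed input: the number of marked instances and
    properties found for each domain concept during similarity matching.
    """
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must lie between 0 and 1")

    # Step 702a: track the highest marked number over all compared concepts.
    highest = max(match_counts.values(), default=0)
    if highest == 0:
        return {}, set()

    # Step 704a: normalize each concept's marked number by the highest one.
    cms = {concept: count / highest for concept, count in match_counts.items()}

    # Steps 706a and 708a: select and flag concepts whose value exceeds the threshold.
    marked = {concept for concept, value in cms.items() if value > threshold}
    return cms, marked


# Hypothetical counts for three domain concepts.
counts = {"Television": 4, "Furniture": 1, "Radio": 2}
cms_values, marked = mark_concepts(counts, threshold=0.4)
```

Raising the threshold in this sketch marks fewer concepts, which matches the statement that a higher threshold yields a more specific knowledge-base.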

Abstract

A system and method (100, 200) for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator (102), the said contextual knowledge-base generator (102) comprising at least one Salient Entity List Composer module (204); at least one Concept Extension module (208); at least one Ontology Content Mapping module (212); and at least one Revised Knowledge-base Reconstruction module (214). The at least one Revised Knowledge-base Reconstruction module (214) having means for receiving domain knowledge base with concepts from mapped content ontology; determining if said concepts are marked and further processing marked concepts by preserving original hierarchy structure of marked concepts; preserving instances attached to marked concepts; preserving properties with its domain as preserved instances; and removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts. In short, the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base from a domain knowledge-base based on specific entities and associated contextual information.

Description

A SYSTEM AND METHOD FOR AUTOMATED GENERATION OF CONTEXTUAL
REVISED KNOWLEDGE BASE
FIELD OF INVENTION
The present invention relates to a system and method for automated generation of contextual revised knowledge base. In particular, the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base from a domain knowledge-base based on specific entities and associated contextual information.
BACKGROUND ART
Generation of a contextual revised knowledge-base (KB) currently utilizes direct simple keyword mapping based on a guidance description to identify the content for a revised KB. Direct keyword mapping is inaccurate and incomplete, and semantic level context cannot be extracted from the description given in natural text format provided as guidance. Further, the current searching process, which maps keywords from all contents of a Subject Specific KB to form a Revised KB, is computationally expensive. Many strategies have been introduced for the mapping selection process. However, presently available contextual revised KB approaches are not extensible to different domains, as the designed methods are tailored to a specific domain utilizing a proprietary protocol. One existing strategy, which utilizes ontology hierarchical structure to identify a topic of text, is described in a published paper entitled "Automatic Topic Identification Using Ontology Hierarchy" by Tiun et al.; a Springer-Verlag Berlin Heidelberg 2001 publication, hereby denoted as Tiun's paper, which identifies and maps keywords extracted from a given text from a web document into corresponding concepts in the ontology. Its Extraction Module handles extraction of important sentences from a document using HTML tags to identify text and words of important entities, as compared to the present invention, which utilizes NLP (Natural Language Processing) and semantic processing techniques. Further, Tiun's paper does not identify properties and instances in a process, and it does not compare concepts, relations, instances and concept descriptions that contribute to deriving a concept similarity matching value as provided in the present invention; instead, a weight value concept is introduced wherein a higher weight indicates a more important concept, as a higher keyword match frequency contributes to a higher weight value. Only the WordNet database is used in enriching concepts, and the output is a keyword which is known as a topic concept.
In contrast, the present invention utilizes an extended linguistic thesaurus for enriching concepts.
Another mechanism, which identifies ontology components by using NLP to extract concepts and relations among the concepts, is proposed in a published paper entitled "Using NLP techniques to identify legal ontology components: Concepts and Relations" by Guiraude Lame; published by Springer 2006; Artificial Intelligence and Law (2004), hereby denoted as Lame's paper. It establishes a methodology to identify ontology components utilizing predefined relations, rules, text analysis methods such as syntactical analysis and analysis of the coordination relations, and statistical analysis in identifying concepts. The present invention compares concepts, relations, instances and concept descriptions, which contribute to deriving a concept similarity matching value. In brief, Lame does not utilize an extended linguistic thesaurus for enriching concepts, as compared to the present invention, whereby all concepts and relations are analyzed from a text document.
A mechanism for repurposing ontology is described in United States Patent Publication No. US 2004/0117346 A1, hereby denoted as US 436 Publication. It relates to a method and apparatus for leveraging existing ontologies for unintended applications and for rapid development of new ontologies by leveraging existing ontologies. All contents are obtained from the original ontology, whereby the US 436 Publication does not identify properties and instances in a process and it does not provide for keyword enrichment, as an extended linguistic thesaurus is not used.
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.
SUMMARY OF INVENTION
The present invention relates to a system and method for automated generation of contextual revised knowledge base. In particular, the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base from a domain knowledge-base based on specific entities and associated contextual information.
One aspect of the present invention provides for a system (100, 200) for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator (102). The said contextual knowledge-base generator (102) comprising at least one Salient Entity List Composer module (204); at least one Concept Extension module (208); at least one Ontology Content Mapping module (212); and at least one Revised Knowledge-base Reconstruction module (214). The at least one Revised Knowledge-base Reconstruction module (214) having means for receiving domain knowledge base with concepts from mapped content ontology; determining if said concepts are marked and further processing marked concepts by preserving original hierarchy structure of marked concepts; preserving instances attached to marked concepts; preserving properties with its domain as preserved instances; and removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts.
In another aspect, the invention provides for the at least one Salient Entity List Composer module (204). The said Salient Entity List Composer module (204) having means for extracting a list of concepts and entities from input of said contextual revised knowledge-base generator by the steps of receiving pre-defined concepts, properties and instances with associated information for a specific context; tokenizing said domain specific concepts, properties and instances with associated information for text mining, sense tagging and stemming; identifying important entities from tokenized list; and producing a list of entities by combining identified entities with said pre-defined concepts, properties and instances with associated information for a specific context.
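As a rough illustration of the steps just described (receive, tokenize, identify important entities, combine), the following Python sketch uses a small hand-written stop-word list and a naive plural-stripping stemmer. These choices, and the names `tokenize`, `naive_stem` and `compose_salient_entities`, are assumptions made for illustration only; the specification does not state which text mining, sense tagging or stemming techniques are used, and sense tagging is omitted here.

```python
import re

# Hypothetical stop-word list; the specification does not define one.
STOP_WORDS = {"a", "an", "the", "in", "of", "about", "describing", "topic"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip a trailing plural 's' (a crude stand-in for a real stemmer)."""
    if len(token) > 3 and token.endswith("s") and not token.endswith(("ss", "us", "is")):
        return token[:-1]
    return token


def compose_salient_entities(predefined_concepts, context_summary):
    """Combine entities found in the summary with the pre-defined concepts."""
    tokens = tokenize(context_summary)
    # Treat every token that is not a stop word as an "important entity".
    important = [naive_stem(t) for t in tokens if t not in STOP_WORDS]

    # Merge with the pre-defined concepts, keeping order and dropping repeats.
    combined = []
    for entity in important + list(predefined_concepts):
        if entity not in combined:
            combined.append(entity)
    return combined


entities = compose_salient_entities(
    ["television", "computer", "radio", "slow cooker"],
    "A topic describing about electric apparatus in a shopping complex",
)
```

This sketch makes no attempt at multi-word entity detection, so a phrase such as "shopping complex" in the summary is split into two separate tokens; a fuller implementation would need a phrase or named-entity step.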
Yet another aspect of the invention is the at least one Concept Extension module (208). The said Concept Extension module (208) having means for extending said concepts and entities which relate to a specific context for ontology mapping by the steps of receiving said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus; looping through said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym from said External Linguistic Thesaurus; and producing a list of extended concepts and entities and further including said extended concepts and entities related to a specific context to existing list of entities.
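The looping step of the Concept Extension module can be sketched in Python as follows. The in-memory dictionary `THESAURUS` is a hypothetical stand-in for the External Linguistic Thesaurus, and its entries and relation labels are illustrative assumptions; the function name `extend_concepts` is likewise not taken from the specification.

```python
# Hypothetical stand-in for the External Linguistic Thesaurus.
THESAURUS = {
    "television": {"synonym": ["tv"], "hypernym": ["device"]},
    "computer": {"synonym": ["PC"], "meronym": ["CPU", "processor"]},
    "apparatus": {"synonym": ["equipment"], "hyponym": ["appliance"]},
}

# The five relation types named in the description.
RELATIONS = ("synonym", "hypernym", "hyponym", "meronym", "holonym")


def extend_concepts(entities, thesaurus):
    """Return the entity list followed by related terms found in the thesaurus."""
    extended = list(entities)
    seen = set(entities)
    # Loop through each entity and each relation type in turn.
    for entity in entities:
        entry = thesaurus.get(entity, {})
        for relation in RELATIONS:
            for term in entry.get(relation, []):
                if term not in seen:
                    seen.add(term)
                    extended.append(term)
    return extended


extended_list = extend_concepts(["apparatus", "television", "radio"], THESAURUS)
```

Entities with no thesaurus entry, such as "radio" in this toy data, are simply carried through unchanged.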
In a further aspect of the invention there is provided that the at least one Ontology Content Mapping module (212) has means for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module.
In another aspect the invention provides a method for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator. The said method comprises steps of extracting a list of concepts and entities from input of said contextual revised knowledge-base generator (302); extending said concepts and entities which relate to a specific context for ontology mapping (304); mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module (306); and constructing revised knowledge-base containing marked concepts with its respective properties and instances (308). The step for constructing revised knowledge-base containing marked concepts with its respective properties and instances further comprises steps of receiving domain knowledge base with concepts from mapped content ontology (1004); determining if said concepts are marked (1010, 1012) and further processing marked concepts by preserving original hierarchy structure of marked concepts; preserving instances attached to marked concepts; preserving properties with its domain as preserved instances; and removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts (1008, 1014, 1016).
In a further aspect of the invention there is provided that the step for extracting a list of concepts and entities from input of said contextual revised knowledge-base generator further comprising steps of receiving pre-defined concepts, properties and instances with associated information for a specific context (402); tokenizing said domain specific concepts, properties and instances with associated information for text mining, sense tagging and stemming (404); identifying important entities from tokenized list (406); and producing a list of entities by combining identified entities with said pre-defined concepts properties and instances with associated information for a specific context (408).
In yet another aspect of the invention is provided that the step for extending said concepts and entities which relate to a specific context for ontology mapping further comprises steps of receiving said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus (502); looping through said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym from said External Linguistic Thesaurus (504); and producing a list of extended concepts and entities and further including said extended concepts and entities related to a specific context to existing list of entities (506).
In still another aspect of the invention is provided that the step for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module further comprises steps of receiving a list of extended concepts and entities related to a specific context together with information stored in domain knowledge-base which contains concepts, properties and instances (804); determining if there is any unmarked concepts and instances from domain ontology (806); removing extended concepts and entities from said list should there be no unmarked concepts and instances from domain ontology (808); performing concept similarity matching on marked concepts and instances by comparing concept name, property name, concept description and instances from said domain knowledge-base with said list of extended concepts and entities (810); determining if concepts from domain ontology have high concept mapping similarity value (812); marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value (818) and removing extended concepts and entities from said list (808); obtaining concept annotation resources and instances from domain knowledge-base (814); determining if said concept annotation resources and instances have high similarity matching value if there are no concepts in domain ontology with high concept mapping similarity value (816); and removing extended concepts and entities from said list should there be any unmarked concepts and instances from domain ontology (808); marking concepts from domain ontology if said concept annotation resources and instances have high similarity matching value (818) and removing extended concepts and entities from said list (808); and forwarding domain knowledge-base with marked concepts to Revised Knowledge-base Reconstruction module.
In another aspect the invention provides that marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value further comprises steps of tracking total highest marked number of marked instance and property for all compared domain concepts (702a); normalizing total marked number of each concept in domain ontology with highest marked number as tracked (704a); selecting domain concept with content mapping similarity value larger than threshold value for a threshold value between 0 and 1 (706a); and flagging selected domain concepts with a "mark concept" (708a).
The present invention consists of features and a combination of parts hereinafter fully described and illustrated in the accompanying drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings in which:
FIG. 1.0 illustrates the architecture of the present invention.
FIG. 2.0 illustrates the components of the contextual revised knowledge-base generator of the present invention.
FIG. 3.0 is a flowchart illustrating the methodology for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator of the present invention.
FIG. 4.0 illustrates the methodology of the Salient Entity List Composer module of the present invention.
FIG. 5.0 illustrates the methodology of Concept Extension module of the present invention.
FIG. 6.0 illustrates the scenario of ontology content mapping of the present invention.
FIG. 7.0 illustrates the scenario of Concept Mapping Similarity (CMS) calculation of the present invention.
FIG. 7.0a is a flowchart illustrating the methodology for marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity of the present invention.
FIG. 8.0 is a flowchart illustrating the methodology for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module.
FIG. 9.0 illustrates the scenario of a Revised Ontology Reconstruction of a revised knowledge-base of the present invention.
FIG. 10.0 is a flowchart illustrating the methodology of constructing revised knowledge-base containing marked concepts with its respective properties and instances of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides a system and method for automated generation of contextual revised knowledge base. In particular, the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base (KB) from a domain knowledge-base based on specific entities and associated contextual information.
Hereinafter, this specification will describe the present invention according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and it is envisioned that those skilled in the art may devise various modifications without departing from the scope of the appended claims.
Referring first to FIGs. 1.0 and 2.0 respectively, the architecture (100) of the present invention and the components of the contextual knowledge-base (KB) generator (102) according to the present invention are illustrated. The said contextual KB generator (102) includes a Salient Entity List Composer module (204); a Concept Extension module (208); an Ontology Content Mapping module (212); and a Revised Knowledge-base Reconstruction module (214). The functions of each of the said components of the contextual KB generator (102) will be described in detail together with the methodology of the present invention in the following sections.
Referring to the subsequent drawings as appended in the specification, an embodiment of the method (300) of the invention is illustrated. Generally, the invention includes the steps of extracting a list of concepts and entities from the input of said contextual revised knowledge-base generator by the Salient Entity List Composer module (302), after which the Concept Extension module extends said concepts and entities which relate to a specific context for ontology mapping (304). The extended concepts and entities are processed further by the Ontology Content Mapping module, whereby said Ontology Content Mapping module maps content ontology by marking concepts from domain ontology to be further processed in the Revised Knowledge-base Reconstruction module (306). The constructed revised knowledge-base contains marked concepts with their respective properties and instances.
A list of predefined concepts of a specific entity and associated contextual information (402) is first provided as input to the Salient Entity List Composer module (204). A summary description of a topic, in natural text format, is also provided as the content of the contextual information (402). For example, a list of predefined concepts of a specific context, such as television, computer, radio and slow cooker, and a summary of a described topic, "A topic describing about electric apparatus in a shopping complex", are provided as input to the Salient Entity List Composer module (204).
In the said Salient Entity List Composer module (204), the received domain specific concepts, properties and instances with associated information are tokenized (404), wherein the summary of the described topic is pre-processed to produce a list of tokens which becomes input for further processing such as text mining, sense tagging and stemming (404). The list of tokens is further processed to identify important entities (406). The important entities identified from the example provided earlier are "electrical", "apparatus" and "shopping complex" (406). The extracted entities are combined with all pre-defined concepts, properties and instances with associated information for a specific context (408), which outputs a list of concepts and entities belonging to the specific context. Based on the example of the specific context, the output list of concepts and entities is "electrical", "apparatus", "shopping complex", "television", "computer", "radio", and "slow cooker".
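The composition steps (402) to (408) can be sketched roughly as follows. This is a minimal illustration only: the stopword list, the `phrases` argument used to keep multi-word entities intact, and the function name `compose_salient_entities` are assumptions introduced here, and the sketch omits the sense tagging and stemming that the specification mentions (so it yields "electric" rather than "electrical").

```python
import re

# Hypothetical stopword list; the specification does not define one.
STOPWORDS = {"a", "an", "the", "in", "of", "about", "topic", "describing"}


def compose_salient_entities(predefined, summary, phrases=()):
    """Return entities found in the summary followed by the predefined concepts."""
    text = summary.lower()
    entities = []
    # Pull out known multi-word phrases first so they are not split into tokens.
    for phrase in phrases:
        if phrase in text:
            entities.append(phrase)
            text = text.replace(phrase, " ")
    # Tokenize the remaining text and keep tokens that are not stopwords.
    for token in re.findall(r"[a-z]+", text):
        if token not in STOPWORDS and token not in entities:
            entities.append(token)
    # Combine with the predefined concepts, keeping order and skipping duplicates.
    combined = list(entities)
    for concept in predefined:
        if concept not in combined:
            combined.append(concept)
    return combined


result = compose_salient_entities(
    ["television", "computer", "radio", "slow cooker"],
    "A topic describing about electric apparatus in a shopping complex",
    phrases=["shopping complex"],
)
```

A real implementation would likely replace the fixed stopword list and phrase list with proper text mining, which the specification leaves unspecified.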
The list of concepts and entities belonging to the specific context is forwarded to the Concept Extension module (208); said Concept Extension module (208) extends said concepts and entities which relate to a specific context together with the information from the External Linguistic Thesaurus (502) for ontology mapping. An example of concepts and entities which relate to a specific context with the information from the External Linguistic Thesaurus (502) is "electrical", "apparatus", "shopping complex", "television", "computer", "radio" and "slow cooker". The Concept Extension module (208) loops through said list of entities produced from the Salient Entity List Composer module together with the information from said External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym of the entities from said External Linguistic Thesaurus (504). An example of extended entities which relate to the specific context is "electrical / electric / charge / electricity / apparatus / equipment / machine / device / appliance / instrument / shopping complex / shopping center / television / tv / video / computer / calculator / CPU / processor / PC / radio / tape player / slow cooker". Thereafter, a list of extended concepts and entities is produced and said extended concepts and entities related to a specific context are added to the existing list of entities (506). The output (506) produced for the specific context is "electrical, electric, charge, electricity, apparatus, equipment, machine, device, appliance, instrument, shopping complex, shopping center, television, tv, video, computer, calculator, CPU, processor, PC, radio, tape player".
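The extension loop (502) to (506) might look like the sketch below. The in-memory `THESAURUS` dictionary is a hypothetical stand-in for the External Linguistic Thesaurus, and its entries and relation labels are illustrative assumptions rather than data from the specification.

```python
# Hypothetical stand-in for an External Linguistic Thesaurus.
THESAURUS = {
    "television": {"synonym": ["tv"], "hypernym": ["device"]},
    "computer": {"synonym": ["PC"], "meronym": ["CPU", "processor"]},
}

# The five relation types named in the specification.
RELATIONS = ("synonym", "hypernym", "hyponym", "meronym", "holonym")


def extend_concepts(entities, thesaurus=THESAURUS):
    """Append thesaurus-related terms to the entity list, without duplicates."""
    extended = list(entities)
    seen = set(entities)
    for entity in entities:
        entry = thesaurus.get(entity, {})
        for relation in RELATIONS:
            for term in entry.get(relation, []):
                if term not in seen:
                    seen.add(term)
                    extended.append(term)
    return extended


extended = extend_concepts(["television", "computer", "radio"])
```

Entities with no thesaurus entry, such as "radio" here, simply pass through unchanged.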
The extended concepts and entities related to a specific context, together with information stored in the domain knowledge-base which contains concepts, properties and instances (804), are forwarded to the Ontology Content Mapping module (212). A domain knowledge-base is used for processing and extracting the content for the revised knowledge-base. Content ontology is mapped by marking concepts from domain ontology to be further processed in the Revised Knowledge-base Reconstruction module. The said Ontology Content Mapping module (212) determines if there are any unmarked concepts and instances from domain ontology (806), and extended concepts and entities are removed from said list should there be no unmarked concepts and instances from domain ontology (808). The contents received by said Ontology Content Mapping module (212) undergo a matching process, namely the Concept Similarity Matching procedure, whereby concept similarity matching is performed on marked concepts and instances by comparing concept name, property name, concept description and instances from said domain knowledge-base based on said list of extended concepts and entities (810). The comparison similarity values of the Concept Similarity Matching procedure are evaluated to determine if concepts from domain ontology have high concept mapping similarity value (812). Thereafter, the concepts from domain ontology are marked if concepts from domain ontology have high concept mapping similarity value (818) and the extended concepts and entities are removed from said list (808).
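One simple way to picture the comparison in step (810) is to count how many extended terms occur in a concept's name, property names, description and instances. The specification does not state which similarity measure is used, so the case-insensitive substring test, the dictionary layout of a concept, and the name `count_matches` below are all assumptions made for illustration.

```python
def count_matches(concept, terms):
    """Count terms found in a concept's name, description, properties or instances."""
    fields = [concept["name"], concept.get("description", "")]
    fields += concept.get("properties", [])
    fields += concept.get("instances", [])
    # Join all text fields so each term is tested once against the whole concept.
    haystack = " ".join(fields).lower()
    return sum(1 for term in terms if term.lower() in haystack)


television = {
    "name": "Television",
    "description": "A device for receiving broadcasts",
    "properties": ["screenSize"],
    "instances": ["LED TV"],
}
matches = count_matches(television, ["television", "tv", "device", "radio"])
```

Substring matching is crude (a short term can match inside an unrelated word), so a practical system would more plausibly compare whole tokens or use a string-similarity score.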
The marking of the concepts from domain ontology, if concepts from domain ontology have high concept mapping similarity value, is based on the Concept Mapping Similarity (CMS) calculation, whereby the total highest marked number of marked instance and property is tracked for all compared domain concepts (702a) and the total marked number of each concept in domain ontology is normalized with the highest marked number as tracked (704a). Thereafter, domain concepts with content mapping similarity value larger than a threshold value, for a threshold value between 0 and 1, are selected (706a) and the selected domain concepts are flagged with a "mark concept" (708a). A higher threshold represents a more specific version of the knowledge-base required, and vice versa.
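Steps (702a) to (708a) amount to dividing each concept's marked count by the highest count observed and keeping the concepts whose normalized value exceeds the threshold. The sketch below assumes the per-concept counts are already available as a dictionary; the function name `mark_concepts` and the sample counts are hypothetical.

```python
def mark_concepts(match_counts, threshold):
    """Return the concepts whose normalized count is larger than the threshold."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    # (702a) Track the highest marked number across all compared concepts.
    highest = max(match_counts.values(), default=0)
    if highest == 0:
        return set()
    # (704a) Normalize each count, then (706a, 708a) select and flag.
    return {
        name
        for name, count in match_counts.items()
        if count / highest > threshold
    }


counts = {"Television": 4, "Furniture": 1, "Radio": 3}
loose = mark_concepts(counts, 0.5)
strict = mark_concepts(counts, 0.8)
```

With these sample counts the normalized values are 1.0, 0.25 and 0.75, so raising the threshold from 0.5 to 0.8 drops "Radio" and leaves a narrower, more specific selection.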
Alternatively, concept annotation resources and their instances are obtained from the domain knowledge-base if the concept mapping similarity value is not high (814). It is further determined if the annotation resources and instances have high similarity matching value if there are no concepts in domain ontology with high concept mapping similarity value (816). The extended concepts and entities from said list are removed should there be any unmarked concepts and instances from domain ontology (808); else the concepts from domain ontology are marked if said concept annotation resources and instances have high similarity matching value (818) and said extended concepts and entities are removed from said list (808). Thereafter, the domain knowledge-base with marked concepts is forwarded to the Revised Knowledge-base Reconstruction module (214).
Concepts from domain ontology are forwarded (1004) to the Revised Knowledge-base Reconstruction module (214). Upon determining if said concepts are marked (1010, 1012), the marked concepts are further processed by preserving the original hierarchical structure of marked concepts; the instances attached to marked concepts are preserved; properties with their domain as preserved instances are preserved; and unmarked concepts are removed from the ontology while preserving the original hierarchical structure of said marked concepts (1008, 1014, 1016).
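The reconstruction step can be illustrated with a small pruning routine. The specification says only that the original hierarchy of marked concepts is preserved when unmarked concepts are removed; re-attaching each marked concept to its nearest marked ancestor is one plausible reading and is an assumption of this sketch, as are the data layout and the name `reconstruct`. Property handling is left out for brevity, and the hierarchy is assumed to be a tree without cycles.

```python
def reconstruct(parents, instances, marked):
    """Keep marked concepts and their instances; drop unmarked concepts.

    parents maps each concept to its parent (None for a root).
    instances maps a concept to the list of instances attached to it.
    """

    def nearest_marked_ancestor(concept):
        # Walk up the hierarchy, skipping over unmarked concepts.
        parent = parents.get(concept)
        while parent is not None and parent not in marked:
            parent = parents.get(parent)
        return parent

    new_parents = {
        concept: nearest_marked_ancestor(concept)
        for concept in parents
        if concept in marked
    }
    new_instances = {
        concept: list(instances.get(concept, [])) for concept in new_parents
    }
    return new_parents, new_instances


parents = {
    "Thing": None,
    "Appliance": "Thing",
    "Television": "Appliance",
    "Furniture": "Thing",
}
instances = {"Television": ["LED TV"], "Furniture": ["Sofa"]}
new_parents, new_instances = reconstruct(parents, instances, {"Thing", "Television"})
```

Here the unmarked "Appliance" and "Furniture" are removed, and "Television" is re-attached directly under "Thing" so that the ancestor relationship between the surviving concepts is kept.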
In short, the present invention provides for automated generation of contextual revised knowledge base whereby the invention automatically identifies all concepts, properties and instances (C,P,I) for a revised knowledge-base (KB) from a domain knowledge-base based on specific entities and associated contextual information.
Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers. Thus, in the context of this specification, the term "comprising" is used in an inclusive sense and thus should be understood as meaning "including principally, but not necessarily solely".
It will be appreciated that the foregoing description has been given by way of illustrative example of the invention and that all such modifications and variations thereto as would be apparent to persons of skill in the art are deemed to fall within the broad scope and ambit of the invention as herein set forth.

Claims

CLAIMS
1. A system (100, 200) for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator (102), the said contextual knowledge-base generator (102) comprising:
at least one Salient Entity List Composer module (204);
at least one Concept Extension module (208);
at least one Ontology Content Mapping module (212); and
at least one Revised Knowledge-base Reconstruction module (214) characterized in that
the at least one Revised Knowledge-base Reconstruction module (214) having means for:
receiving domain knowledge base with concepts from mapped content ontology;
determining if said concepts are marked and further processing marked concepts by
preserving original hierarchy structure of marked concepts; preserving instances attached to marked concepts;
preserving properties with its domain as preserved instances; and
removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts.
2. A system (200) according to Claim 1, wherein the at least one Salient Entity List Composer module (204) having means for extracting a list of concepts and entities from input of said contextual revised knowledge-base generator by the steps of:
receiving pre-defined concepts, properties and instances with associated information for a specific context;
tokenizing said domain specific concepts, properties and instances with associated information for text mining, sense tagging and stemming;
identifying important entities from tokenized list; and
producing a list of entities by combining identified entities with said predefined concepts, properties and instances with associated information for a specific context.
3. A system (200) according to Claim 1, wherein the at least one Concept Extension module (208) having means for extending said concepts and entities which relates to a specific context for ontology mapping by the steps of:
receiving said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus;
looping through said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym from said External Linguistic Thesaurus; and
producing a list of extended concepts and entities and further including said extended concepts and entities related to a specific context to existing list of entities.
4. A system (200) according to Claim 1, wherein the at least one Ontology Content Mapping module (212) having means for mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module by the steps of:
receiving a list of extended concepts and entities related to a specific context together with information stored in domain knowledge-base which contains concepts, properties and instances;
determining if there is any unmarked concepts and instances from domain ontology;
removing extended concepts and entities from said list should there be any unmarked concepts and instances from domain ontology;
performing concept similarity matching on marked concepts and instances by comparing concept name, property name, concept description and instances from said domain knowledge-base with said list of extended concepts and entities;
determining if concepts from domain ontology have high concept mapping similarity value;
marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value and removing extended concepts and entities from said list;
obtaining concept annotation resources and instances from domain knowledge-base and determining if said concept annotation resources and instances have high similarity matching value if there are no concepts in domain ontology with high concept mapping similarity value;
removing extended concepts and entities from said list should there be any unmarked concepts and instances from domain ontology;
marking concepts from domain ontology if said concept annotation resources and instances have high similarity matching value and removing extended concepts and entities from said list; and
forwarding domain knowledge-base with marked concepts to Revised Knowledge-base Reconstruction module.
5. A method (300) for automated generation of contextual knowledge-base by utilizing contextual revised knowledge-base generator, the said method comprising steps of:
extracting a list of concepts and entities from input of said contextual revised knowledge-base generator (302);
extending said concepts and entities which relates to a specific context for ontology mapping (304);
mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module (306); and
constructing revised knowledge-base containing marked concepts with its respective properties and instances (308)
characterized in that
constructing revised knowledge-base containing marked concepts with its respective properties and instances further comprises steps of:
receiving domain knowledge base with concepts from mapped content ontology (1004);
determining if said concepts are marked (1010, 1012) and further processing marked concepts by
preserving original hierarchy structure of marked concepts;
preserving instances attached to marked concepts;
preserving properties with its domain as preserved instances; and
removing unmarked concepts from ontology while preserving original hierarchy structure of said marked concepts (1008, 1014, 1016).
6. A method (400) according to Claim 5, wherein extracting a list of concepts and entities from input of said contextual revised knowledge-base generator further comprising steps of:
receiving pre-defined concepts, properties and instances with associated information for a specific context (402);
tokenizing said domain specific concepts, properties and instances with associated information for text mining, sense tagging and stemming (404);
identifying important entities from tokenized list (406); and
producing a list of entities by combining identified entities with said predefined concepts, properties and instances with associated information for a specific context (408).
7. A method (500) according to Claim 5, wherein extending said concepts and entities which relates to a specific context for ontology mapping further comprising steps of:
receiving said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus (502);
looping through said list of entities produced from Salient Entity List Composer module together with information from at least one External Linguistic Thesaurus to identify synonym, hypernym and hyponym, meronym and holonym from said External Linguistic Thesaurus (504); and
producing a list of extended concepts and entities and further including said extended concepts and entities related to a specific context to existing list of entities (506).
8. A method (800) according to Claim 5, wherein mapping content ontology by marking concepts from domain ontology for further processing in revised knowledge-base reconstruction module further comprising steps of:
receiving a list of extended concepts and entities related to a specific context together with information stored in domain knowledge-base which contains concepts, properties and instances (804);
determining if there is any unmarked concepts and instances from domain ontology (806);
removing extended concepts and entities from said list should there be no unmarked concepts and instances from domain ontology (808);
performing concept similarity matching on marked concepts and instances by comparing concept name, property name, concept description and instances from said domain knowledge-base with said list of extended concepts and entities (810);
determining if concepts from domain ontology have high concept mapping similarity value (812);
marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value (818) and removing extended concepts and entities from said list (808);
obtaining concept annotation resources and instances from domain knowledge-base (814);
determining if said concept annotation resources and instances have high similarity matching value if there are no concepts in domain ontology with high concept mapping similarity value (816); and
removing extended concepts and entities from said list should there be any unmarked concepts and instances from domain ontology (808);
marking concepts from domain ontology if said concept annotation resources and instances have high similarity matching value (818) and removing extended concepts and entities from said list (808); and
forwarding domain knowledge-base with marked concepts to Revised Knowledge-base Reconstruction module.
9. A method (700a) according to Claim 8, wherein marking concepts from domain ontology if concepts from domain ontology have high concept mapping similarity value further comprising steps of:
tracking total highest marked number of marked instance and property for all compared domain concepts (702a);
normalizing total marked number of each concept in domain ontology with highest marked number as tracked (704a);
selecting domain concept with content mapping similarity value larger than threshold value for a threshold value between 0 and 1 (706a); and
flagging selected domain concepts with a "mark concept" (708a).
PCT/MY2013/000199 2012-11-29 2013-11-20 A system and method for automated generation of contextual revised knowledge base WO2014084712A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2012005159 2012-11-29
MYPI2012005159A MY188005A (en) 2012-11-29 2012-11-29 A system and method for automated generation of contextual revised knowledge base

Publications (1)

Publication Number Publication Date
WO2014084712A1 true WO2014084712A1 (en) 2014-06-05

Family

ID=49920578

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2013/000199 WO2014084712A1 (en) 2012-11-29 2013-11-20 A system and method for automated generation of contextual revised knowledge base

Country Status (2)

Country Link
MY (1) MY188005A (en)
WO (1) WO2014084712A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109804371A (en) * 2016-08-10 2019-05-24 瑞典爱立信有限公司 Method and apparatus for semantic knowledge migration
CN112069817A (en) * 2020-07-17 2020-12-11 中国科学院计算机网络信息中心 Student knowledge extraction and fusion method and device
CN112328810A (en) * 2020-11-11 2021-02-05 河海大学 Knowledge graph fusion method based on self-adaptive mixed ontology mapping
CN114880484A (en) * 2022-05-11 2022-08-09 军事科学院系统工程研究院网络信息研究所 Satellite communication frequency-orbit resource map construction method based on vector mapping
CN115238702A (en) * 2022-09-21 2022-10-25 中科雨辰科技有限公司 Entity library processing method and storage medium
CN116049447A (en) * 2023-03-24 2023-05-02 中科雨辰科技有限公司 Entity linking system based on knowledge base

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040117346A1 (en) 2002-09-20 2004-06-17 Kilian Stoffel Computer-based method and apparatus for repurposing an ontology
WO2005062202A2 (en) * 2003-12-23 2005-07-07 Thomas Eskebaek Knowledge management system with ontology based methods for knowledge extraction and knowledge search
US20090177634A1 (en) * 2008-01-09 2009-07-09 International Business Machine Corporation Method and System for an Application Domain

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ARTIFICIAL INTELLIGENCE AND LAW, 2004
GUIRAUDE LAME: "Using NLP techniques to identify legal ontology components: Concepts and Relations", 2006, SPRINGER
MAHMOUDI M T ET AL: "An Ontological Model for Knowledge Management through Concept Composition", DIGITAL SOCIETY, 2007. ICDS '07. FIRST INTERNATIONAL CONFERENCE ON THE, IEEE, PISCATAWAY, NJ, USA, 2 January 2007 (2007-01-02), pages 9, XP031331592, ISBN: 978-0-7695-2760-4 *
TIUN: "Automatic Topic Identification Using Ontology Hierarchy", 2001, SPRINGER-VERLAG

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109804371A (en) * 2016-08-10 2019-05-24 Telefonaktiebolaget LM Ericsson (publ) Method and apparatus for semantic knowledge migration
CN109804371B (en) * 2016-08-10 2023-05-23 Telefonaktiebolaget LM Ericsson (publ) Method and device for semantic knowledge migration
CN112069817A (en) * 2020-07-17 2020-12-11 Computer Network Information Center, Chinese Academy of Sciences Student knowledge extraction and fusion method and device
CN112328810A (en) * 2020-11-11 2021-02-05 Hohai University Knowledge graph fusion method based on self-adaptive mixed ontology mapping
CN112328810B (en) * 2020-11-11 2022-10-14 Hohai University Knowledge graph fusion method based on self-adaptive mixed ontology mapping
CN114880484A (en) * 2022-05-11 2022-08-09 Institute of Network Information, Institute of Systems Engineering, Academy of Military Sciences Satellite communication frequency-orbit resource map construction method based on vector mapping
CN115238702A (en) * 2022-09-21 2022-10-25 Zhongke Yuchen Technology Co., Ltd. Entity library processing method and storage medium
CN115238702B (en) * 2022-09-21 2022-12-06 Zhongke Yuchen Technology Co., Ltd. Entity library processing method and storage medium
CN116049447A (en) * 2023-03-24 2023-05-02 Zhongke Yuchen Technology Co., Ltd. Entity linking system based on knowledge base
CN116049447B (en) * 2023-03-24 2023-06-13 Zhongke Yuchen Technology Co., Ltd. Entity linking system based on knowledge base

Also Published As

Publication number Publication date
MY188005A (en) 2021-11-09

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13818473; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 13818473; Country of ref document: EP; Kind code of ref document: A1)