CN102332137A - Goods matching method and system - Google Patents

Goods matching method and system

Info

Publication number
CN102332137A
CN102332137A CN201110288717A CN201110288717A CN102332137A CN 102332137 A CN102332137 A CN 102332137A CN 201110288717 A CN201110288717 A CN 201110288717A CN 201110288717 A CN201110288717 A CN 201110288717A CN 102332137 A CN102332137 A CN 102332137A
Authority
CN
China
Prior art keywords
key element
commodity
dictionary
keyword
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201110288717A
Other languages
Chinese (zh)
Inventor
黄哲铿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Niuhai Information Technology (Shanghai) Co Ltd
Original Assignee
Niuhai Information Technology (Shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Niuhai Information Technology (Shanghai) Co Ltd filed Critical Niuhai Information Technology (Shanghai) Co Ltd
Priority to CN201110288717A priority Critical patent/CN102332137A/en
Publication of CN102332137A publication Critical patent/CN102332137A/en
Pending legal-status Critical Current

Abstract

The invention discloses a goods matching method comprising the following steps: determining the category of each item of goods and calling the lexicon corresponding to that category; dividing the goods description into at least two elements and assigning an element weight to each element; using the lexicon to segment the goods description into at least one element keyword for each element; then, for every two items of goods: establishing a group mapping for each element and calculating the similarity of each group mapping; calculating the matching score of the two goods as the sum, over all elements, of each group mapping's similarity multiplied by the corresponding element weight; and comparing the matching score with a threshold, the two goods being determined to be the same if the matching score is greater than or equal to the threshold and different if it is less than the threshold. The invention also discloses a goods matching system. The goods matching method and system eliminate the discrepancies between descriptions of the same goods on different websites, so that the same goods on different websites can be identified automatically.

Description

Goods matching method and system
Technical field
The present invention relates to a goods matching method and system, and in particular to a goods matching method and system capable of automatically identifying the same goods across different websites.
Background art
With e-commerce flourishing today, comparing and analyzing goods information has become particularly important. However, different websites often describe the same goods inconsistently. For example, for one and the same toothpaste, website A gives the description "Darlie pure-white tooth-protecting family pack 500g", while website B gives "special-price best-selling Darlie family pack 500 grams". This makes identification and comparison difficult for a computer, because a computer does not understand semantics. Because goods descriptions differ, a user searching for a particular item often fails to retrieve all of the desired goods, which causes the user some inconvenience.
It is therefore desirable to find a goods matching method and system that eliminates discrepancies between goods descriptions, lets a computer recognize semantics by a defined method, and can be applied to automatically identify the same goods across different websites.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the defect in the prior art that websites often describe the same goods inconsistently, so that differing descriptions of identical goods confuse users. To that end, the invention provides a goods matching method and system that eliminates discrepancies between goods descriptions, lets a computer recognize semantics by a defined method, and can be applied to automatically identify the same goods across different websites.
The present invention solves the above technical problem through the following technical solutions:
A goods matching method, characterized in that it comprises the following steps:
First, for each item of goods:
Determine the goods category from the goods description, and call the lexicon corresponding to that goods category;
Divide the goods description into at least two elements and assign an element weight to each element, where $P_i$ denotes the element weight of the i-th element, n denotes the number of elements, and the element weights of all elements sum to 1. The assignment of element weights depends on the goods category. For example, if the goods are books, the ISBN (International Standard Book Number) element has the highest element weight, while the author element and publisher element can have relatively low weights; if the goods are digital products, the brand element and model element have relatively high weights, while elements such as the color element and place-of-origin element can have low weights;
Using the lexicon, segment the goods description into at least one element keyword for each element, and format the element keywords to unify their form;
Then, for every two items of goods:
Establish a group mapping for each element, the group mapping being the set of element keywords of the same element of the two goods. For example, if after word segmentation the brand element of goods A is "nokia, Nokia" and the brand element of goods B is "Nokia", then "nokia, Nokia" together with "Nokia" is the group mapping of the brand element;
Calculate the similarity of each group mapping, the similarity being the proportion of identical element keywords among all element keywords in the group mapping, where $F_i$ denotes the similarity of the group mapping of the i-th element;
Calculate the matching score of the two goods, matching score $= \sum_{i=1}^{n} F_i \times P_i$;
Compare the matching score with a threshold: if the matching score is greater than or equal to the threshold, the two goods match and are determined to be the same goods; if the matching score is less than the threshold, the two goods do not match and are determined to be different goods. The threshold differs between goods categories; even within the same goods category, the threshold may change when the lexicon that is called changes.
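The scoring and comparison steps above can be sketched in Python as follows. This is only a minimal illustration, not the patented implementation: the function names `matching_score` and `is_same_goods`, the three example elements, and all numeric weights, similarities and thresholds are assumptions made for the example.

```python
from typing import Sequence


def matching_score(similarities: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the per-element similarities F_i and element weights P_i."""
    if len(similarities) != len(weights):
        raise ValueError("need exactly one similarity per element weight")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("element weights must sum to 1")
    return sum(f * p for f, p in zip(similarities, weights))


def is_same_goods(similarities: Sequence[float], weights: Sequence[float],
                  threshold: float) -> bool:
    """Two goods are treated as the same when the score reaches the threshold."""
    return matching_score(similarities, weights) >= threshold


# Hypothetical values for three elements: brand, generic name, model.
weights = [0.4, 0.2, 0.4]
similarities = [1.0, 0.5, 1.0]
score = matching_score(similarities, weights)
```

With these assumed values the score is about 0.9, so the pair would count as the same goods under a threshold of 0.8 but not under a threshold of 0.95.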
Preferably, when the similarity $F_i$ of the group mapping of the i-th element is 0, the element weight $P_i$ of the i-th element is transferred to the element weights of the other elements. That is, when all element keywords of the two goods in the group mapping differ, i.e. the group mapping contains no identical element keyword, the element weight of that element is transferred and distributed to the element weights of the other elements, for example in a certain proportion.
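The text leaves the transfer proportion open. One possible reading, shown here purely as an assumption, redistributes the weight of every zero-similarity element to the remaining elements in proportion to their existing weights; the function name `transfer_weights` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def transfer_weights(similarities: Sequence[float],
                     weights: Sequence[float]) -> List[float]:
    """Move the weight of elements with F_i == 0 onto the other elements.

    Assumption: the transfer is proportional to the current weights of the
    elements whose similarity is non-zero.
    """
    kept = sum(p for f, p in zip(similarities, weights) if f > 0)
    if kept == 0:
        # No element matches at all; nothing can receive the weight.
        return list(weights)
    return [p / kept if f > 0 else 0.0 for f, p in zip(similarities, weights)]


# Hypothetical example: the second element has no keyword in common.
new_weights = transfer_weights([1.0, 0.0, 0.5], [0.5, 0.3, 0.2])
```

Under this reading the adjusted weights still sum to 1, and the element with no common keyword no longer lowers the matching score.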
Preferably, the element keywords are formatted, so as to unify their form, in one or more of the following ways: unifying synonyms with a synonym lexicon, unifying letter case, and replacing half-width/full-width characters. For example, a synonym lexicon can first be built in which abbreviations, common names, scientific names, full names, pinyin, English names and the like are all indexed to one standard term; the synonym lexicon is then used to unify the element keywords, so that element keywords with the same meaning use the same term, which facilitates subsequent comparison. For foreign-language characters, digits and the like that denote product models, units, etc., formatting reduces the effect that differently written characters would otherwise have on the later comparison of element keywords; for example, N908, n908 and N 908 can all be formatted as n908.
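A minimal Python sketch of this formatting step is given below. It assumes that Unicode NFKC normalization is an acceptable way to fold full-width characters to half-width, that whitespace inside a keyword can be dropped, and that the synonym lexicon is a simple mapping; the `SYNONYMS` table and the name `normalize_keyword` are hypothetical.

```python
import unicodedata

# Hypothetical synonym lexicon: every variant maps to one standard term.
SYNONYMS = {"millilitre": "ml", "milliliter": "ml", "kilogram": "kg"}


def normalize_keyword(keyword: str, synonyms: dict = SYNONYMS) -> str:
    """Unify width, letter case, spacing and synonyms of one element keyword."""
    text = unicodedata.normalize("NFKC", keyword)  # full-width -> half-width
    text = text.lower()                            # unify letter case
    text = "".join(text.split())                   # "n 908" -> "n908"
    return synonyms.get(text, text)                # unify synonyms


variants = ["N908", "n908", "N 908", "\uff2e\uff19\uff10\uff18"]
normalized = {normalize_keyword(v) for v in variants}
```

All four spellings of the model, including the full-width one, collapse to the single keyword `n908`, so a later set comparison treats them as identical.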
Preferably, the lexicon is one or more of a brand lexicon, a generic goods name lexicon, a unit lexicon, an attribute lexicon, a model lexicon and a general word lexicon.
Preferably, before the step of using the lexicon to segment the goods description into at least one element keyword for each element, the method further comprises the following step: removing prepositions and/or particles from the goods description using the general word lexicon.
Preferably, the at least two elements are selected from the following elements: brand element, generic goods name element, unit element, attribute element and model element.
The present invention also provides a goods matching system, characterized in that it comprises:
a goods category identification module, which determines the goods category from the goods description;
a lexicon calling module, which calls the lexicon corresponding to the goods category;
an element formation module, which divides the goods description into at least two elements and assigns an element weight to each element, where $P_i$ denotes the element weight of the i-th element, n denotes the number of elements, and the element weights of all elements sum to 1. The assignment of element weights depends on the goods category. For example, if the goods are books, the ISBN element has the highest element weight, while the author element and publisher element can have relatively low weights; if the goods are digital products, the brand element and model element have relatively high weights, while elements such as the color element and place-of-origin element can have low weights;
a word segmentation module, which uses the lexicon to segment the goods description into at least one element keyword for each element and formats the element keywords to unify their form; and
a subsystem operating on every two items of goods, the subsystem comprising:
a group mapping establishment module, which establishes a group mapping for each element, the group mapping being the set of element keywords of the same element of the two goods. For example, if after word segmentation the brand element of goods A is "nokia, Nokia" and the brand element of goods B is "Nokia", then "nokia, Nokia" together with "Nokia" is the group mapping of the brand element;
a similarity calculation module, which calculates the similarity of each group mapping, the similarity being the proportion of identical element keywords among all element keywords in the group mapping, where $F_i$ denotes the similarity of the group mapping of the i-th element;
a matching score calculation module, which calculates the matching score of the two goods, matching score $= \sum_{i=1}^{n} F_i \times P_i$;
a comparison module, which compares the matching score with a threshold: if the matching score is greater than or equal to the threshold, the two goods match and are determined to be the same goods; if the matching score is less than the threshold, the two goods do not match and are determined to be different goods.
The threshold differs between goods categories; even within the same goods category, the threshold may change when the lexicon that is called changes. In addition, after the system has run for some time, the threshold can be set automatically through learning by the system.
Preferably, the subsystem also comprises an element weight transfer module which, when the similarity $F_i$ of the group mapping of the i-th element is 0, transfers the element weight $P_i$ of the i-th element to the element weights of the other elements. That is, when all element keywords of the two goods in the group mapping differ, i.e. the group mapping contains no identical element keyword, the element weight of that element is transferred and distributed to the element weights of the other elements, for example in a certain proportion.
Preferably, the word segmentation module also formats the element keywords, so as to unify their form, in one or more of the following ways: unifying synonyms with a synonym lexicon, unifying letter case, and replacing half-width/full-width characters. For example, a synonym lexicon can first be built in which abbreviations, common names, scientific names, full names, pinyin, English names and the like are all indexed to one standard term; the synonym lexicon is then used to unify the element keywords, so that element keywords with the same meaning use the same term, which facilitates subsequent comparison. For foreign-language characters, digits and the like that denote product models, units, etc., formatting reduces the effect that differently written characters would otherwise have on the later comparison of element keywords; for example, N908, n908 and N 908 can all be formatted as n908.
Preferably, the lexicon is one or more of a brand lexicon, a generic goods name lexicon, a unit lexicon, an attribute lexicon, a model lexicon and a general word lexicon.
Preferably, the word segmentation module also removes prepositions and/or particles from the goods description using the general word lexicon.
Preferably, the at least two elements are selected from the following elements: brand element, generic goods name element, unit element, attribute element and model element.
The positive effect of the present invention is that the goods matching method and system provided eliminate the discrepancies between descriptions of the same goods on different websites, so that the same goods on different websites can be identified automatically. This greatly facilitates users who browse and search for goods, and effectively improves search recall.
Brief description of the drawings
Fig. 1 is a flow chart of the goods matching method of one embodiment of the invention.
Fig. 2 is a structural block diagram of the goods matching system of one embodiment of the invention.
Fig. 3 is a schematic diagram of word segmentation and matching in the goods matching method of one embodiment of the invention.
Detailed description of the embodiments
Preferred embodiments of the invention are given below with reference to the drawings, to describe the technical solution of the invention in detail.
The goods matching method of one embodiment of the invention is described with reference to Fig. 1.
Step 101: first, for each item of goods, determine the goods category from the goods description and call the lexicon corresponding to that goods category, for example a brand lexicon, generic goods name lexicon, unit lexicon, attribute lexicon, model lexicon and general word lexicon.
Step 102: divide the goods description into at least two elements, for example a brand element, generic goods name element, unit element, attribute element and model element.
Step 103: assign an element weight to each element, where $P_i$ denotes the element weight of the i-th element, n denotes the number of elements, and the element weights of all elements sum to 1.
Step 104: using the lexicon, segment the goods description into at least one element keyword for each element, and format the element keywords to unify their form. Before this segmentation, prepositions and/or particles in the goods description can first be removed using the general word lexicon, which helps improve segmentation accuracy.
Specifically, the element keywords are formatted in one or more of the following ways: unifying synonyms with a synonym lexicon, unifying letter case, and replacing half-width/full-width characters.
Step 105: then, for every two items of goods, establish a group mapping for each element, the group mapping being the set of element keywords of the same element of the two goods.
Step 106: calculate the similarity of each group mapping, the similarity being the proportion of identical element keywords among all element keywords in the group mapping, where $F_i$ denotes the similarity of the group mapping of the i-th element. In particular, when the similarity $F_i$ of the group mapping of the i-th element is 0, the element weight $P_i$ of the i-th element is transferred to the element weights of the other elements.
Step 107: calculate the matching score of the two goods, matching score $= \sum_{i=1}^{n} F_i \times P_i$.
Step 108: compare the matching score with a threshold. If the matching score is greater than or equal to the threshold, go to step 109; if the matching score is less than the threshold, go to step 110.
Step 109: the two goods match, and they are determined to be the same goods.
Step 110: the two goods do not match, and they are determined to be different goods.
The goods matching system of one embodiment of the invention is described with reference to Fig. 2.
As shown in Fig. 2, the goods matching system comprises:
a goods category identification module 1, which determines the goods category from the goods description;
a lexicon calling module 2, which calls the lexicon corresponding to the goods category, where the lexicon comprises a brand lexicon, generic goods name lexicon, unit lexicon, attribute lexicon, model lexicon and general word lexicon;
an element formation module 3, which divides the goods description into at least two elements (for example a brand element, generic goods name element, unit element, attribute element and model element) and assigns an element weight to each element, where $P_i$ denotes the element weight of the i-th element, n denotes the number of elements, and the element weights of all elements sum to 1;
a word segmentation module 4, which uses the lexicon to segment the goods description into at least one element keyword for each element and formats the element keywords to unify their form;
and a subsystem 5 operating on every two items of goods, the subsystem 5 comprising:
a group mapping establishment module 51, which establishes a group mapping for each element, the group mapping being the set of element keywords of the same element of the two goods;
a similarity calculation module 52, which calculates the similarity of each group mapping, the similarity being the proportion of identical element keywords among all element keywords in the group mapping, where $F_i$ denotes the similarity of the group mapping of the i-th element;
a matching score calculation module 53, which calculates the matching score of the two goods, matching score $= \sum_{i=1}^{n} F_i \times P_i$;
a comparison module 54, which compares the matching score with a threshold: if the matching score is greater than or equal to the threshold, the two goods match and are determined to be the same goods; if the matching score is less than the threshold, the two goods do not match and are determined to be different goods.
In addition, the subsystem 5 also comprises an element weight transfer module 55 which, when the similarity $F_i$ of the group mapping of the i-th element is 0, transfers the element weight $P_i$ of the i-th element to the element weights of the other elements.
The word segmentation module 4 also formats the element keywords, so as to unify their form, in one or more of the following ways: unifying synonyms with a synonym lexicon, unifying letter case, and replacing half-width/full-width characters. In addition, the word segmentation module 4 removes prepositions and/or particles from the goods description using the general word lexicon.
Next, an application example is given with reference to Fig. 3 to further describe the goods matching method of the invention.
As shown in Fig. 3, two goods with different goods descriptions are taken as an example to outline the word segmentation and matching steps of the goods matching method. The descriptions of the two goods are, respectively: "Haier washing machine XQS50-Z9288FM ultra-low price 5 kg dual-power washing machine", and "Haier washing machine XQS50-Z9288FM".
First, the word segmentation step:
A goods description is divided into 7 elements and segmented into element keywords (in general, the brand element must be present). The 7 elements are:
Element 1: the brand element, such as "Nokia" or "Xiaxin" (divided by goods category; for example, large household appliances have their own set of brand keywords, and mobile phones have theirs);
Element 2: the generic goods name element, i.e. a common name such as "washing powder" or "air conditioner";
Element 3: the unit element, such as "20ml" or "30kg";
Element 4: the attribute element expressing attributes specific to a goods category; for example, large household appliances have "cabinet-type", "wall-mounted", "heating and cooling" and "double-door" (an attribute word for refrigerators); (such attributes are generally divided by goods category)
Element 5: the attribute element expressing general attributes of goods, such as the colors "red" or "silver-black";
Element 6: the model element, generally expressed as a continuous string composed of letters, digits and some connectors;
Element 7: the keywords obtained by segmenting the remaining text with a general word segmentation method.
Word segmentation requires lexicons, which here are organized by category; for example, the brands of large household appliances include "Xiaxin", "Changhong", "Philips", etc.
Word segmentation must process the elements in a certain order. For example, segmentation for element 6 must come after elements 3 and 1: if element 6 were segmented before elements 1 and 3, part of an English brand name or unit description could end up in element 6, confusing the segmentation of different elements and making the final matching result inaccurate. In addition, a synonym table organized by goods category and element is needed. For example, brand synonyms for mobile phones include "nokia", which corresponds to the Chinese-language name of Nokia, and "Lenovo", which corresponds to its Chinese-language name. Generic-name synonyms for large household appliances include "freezer" and "ice chest", and "display cabinet" and "showcase". General unit keyword synonyms include "millilitre" and "ml". Unifying element keywords of the same meaning in this way benefits the subsequent matching.
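The ordering constraint can be illustrated with a small Python sketch. It is an assumption-laden toy, not the patent's segmenter: it works on an English rendering of a description, uses a two-entry hypothetical brand lexicon `BRANDS`, a hypothetical unit pattern `UNIT_RE`, and treats any remaining token that contains a digit as a model. Because "5kg" also contains a digit, it would be mistaken for a model if the unit element were not extracted first.

```python
import re

BRANDS = ("haier", "nokia")  # hypothetical brand lexicon
UNIT_RE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:kg|g|ml|l)\b")  # hypothetical unit pattern
TOKEN_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")  # letters, digits, hyphen connectors


def segment(description: str) -> dict:
    """Extract brand (element 1), then unit (element 3), then model (element 6)."""
    rest = description.lower()
    found = {"brand": [], "unit": [], "model": [], "other": []}
    # 1) Brand first, so an English brand name cannot leak into later elements.
    for brand in BRANDS:
        if brand in rest:
            found["brand"].append(brand)
            rest = rest.replace(brand, " ")
    # 2) Unit next, so "5kg" is consumed before the model pass sees it.
    found["unit"] = [u.replace(" ", "") for u in UNIT_RE.findall(rest)]
    rest = UNIT_RE.sub(" ", rest)
    # 3) Model last: any remaining token that contains a digit.
    for token in TOKEN_RE.findall(rest):
        key = "model" if any(ch.isdigit() for ch in token) else "other"
        found[key].append(token)
    return found


result = segment("Haier washing machine XQS50-Z9288FM 5kg")
```

Here the model element receives only `xqs50-z9288fm`; swapping steps 2 and 3 would put `5kg` into the model element as well, which is the kind of confusion the text warns about.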
Next, the matching step. Word segmentation is the basis of matching. After segmentation, 7 elements are obtained, and a group mapping is established for each element of the two goods; each group mapping contains all element keywords of that element for both goods. In determining whether two goods descriptions match, the similarity $F_i$ of each group mapping is calculated, i.e. the proportion of element keywords of that element that are identical for the two goods among all element keywords in the group mapping.
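The text does not spell out exactly how the proportion is counted. One plausible reading, used here as an assumption, is the number of distinct keywords shared by both goods divided by the number of distinct keywords in the whole group mapping (a Jaccard ratio); the function name `group_similarity` and the choice to return 0 for an empty mapping are likewise assumptions.

```python
from typing import Iterable


def group_similarity(keywords_a: Iterable[str], keywords_b: Iterable[str]) -> float:
    """Share of identical element keywords among all keywords of one group mapping."""
    a, b = set(keywords_a), set(keywords_b)
    union = a | b
    if not union:
        # Neither description produced a keyword for this element.
        return 0.0
    return len(a & b) / len(union)


same_brand = group_similarity(["haier"], ["haier"])
missing_unit = group_similarity(["5kg"], [])
partial = group_similarity(["nokia", "n908"], ["nokia"])
```

Identical keyword sets give a similarity of 1, a keyword present on only one side gives 0, and the partially overlapping case gives 0.5 under this reading.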
Calculate the matching score: $F_1 P_1 + F_2 P_2 + F_3 P_3 + F_4 P_4 + F_5 P_5 + F_6 P_6 + F_7 P_7$, then compare the matching score with the threshold to obtain the result. Here the total number of elements is n = 7.
For different goods categories, the group mappings of the elements play different roles in matching, so the element weights of the different elements must be set per goods category. Initially these element weights are mainly set manually; later, it is hoped to design a machine-learning method that sets them automatically.
Of course, the matching result is not in every case computed from such a matching score. For some goods categories, for example, if $F_i$ of group 1 is 1 (indicating that the brands match) and $F_i$ of group 6 is also 1, the matching of the other groups need not be considered and the goods can be directly regarded as matching.
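Such a category-specific shortcut might be combined with the weighted score as in the following sketch. The function name `is_match`, the element names, and all weights and thresholds are hypothetical; the shortcut elements default to brand and model to mirror the example of groups 1 and 6.

```python
from typing import Dict, Sequence


def is_match(similarities: Dict[str, float], weights: Dict[str, float],
             threshold: float,
             shortcut: Sequence[str] = ("brand", "model")) -> bool:
    """Match directly when all shortcut elements agree fully, else use the score."""
    if shortcut and all(similarities.get(name, 0.0) == 1.0 for name in shortcut):
        return True
    score = sum(similarities[name] * weights[name] for name in weights)
    return score >= threshold


# Hypothetical values: brand and model agree, the other elements do not.
similarities = {"brand": 1.0, "generic_name": 0.0, "unit": 0.0, "model": 1.0}
weights = {"brand": 0.2, "generic_name": 0.3, "unit": 0.3, "model": 0.2}

with_shortcut = is_match(similarities, weights, threshold=0.8)
without_shortcut = is_match(similarities, weights, threshold=0.8, shortcut=())
```

With these assumed numbers the weighted score alone is about 0.4 and would fail the threshold, whereas the shortcut accepts the pair because brand and model both match exactly.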
Finally, with reference to Fig. 3 and elements 1-7 above, the process of word segmentation and establishing group mappings is briefly described.
Group mapping of element 1: "Haier", "Haier";
Group mapping of element 2: "washing machine", "washing machine";
Group mapping of element 3: "5 kg", none;
Group mapping of element 6: "XQS50-Z9288FM", "XQS50-Z9288FM";
Group mapping of element 7: "ultra-low price, dual power", none.
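As a worked illustration, the group mappings listed above can be scored in Python. The keyword sets come from the Fig. 3 example, with elements 4 and 5 left out because no group mapping is listed for them. Everything else is an assumption: the element weights, the threshold, and the reading of similarity as shared keywords divided by all distinct keywords.

```python
def group_similarity(keywords_a, keywords_b):
    """Shared keywords divided by all distinct keywords (assumed reading)."""
    a, b = set(keywords_a), set(keywords_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Group mappings of the example: (keywords of goods 1, keywords of goods 2).
group_mappings = {
    "brand": (["haier"], ["haier"]),
    "generic_name": (["washing machine"], ["washing machine"]),
    "unit": (["5kg"], []),
    "model": (["xqs50-z9288fm"], ["xqs50-z9288fm"]),
    "other": (["ultra-low price", "dual power"], []),
}

# Hypothetical element weights (summing to 1) and threshold, not from the patent.
weights = {"brand": 0.3, "generic_name": 0.2, "unit": 0.1, "model": 0.3, "other": 0.1}
threshold = 0.75

similarities = {name: group_similarity(a, b) for name, (a, b) in group_mappings.items()}
score = sum(similarities[name] * weights[name] for name in weights)
same_goods = score >= threshold
```

Brand, generic name and model each contribute their full weight while unit and the remaining keywords contribute nothing, so the score is about 0.8 and the two descriptions are treated as the same goods under the assumed threshold.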
Although specific embodiments of the invention have been described above, those skilled in the art will understand that these are merely illustrative, and that the scope of protection of the invention is defined by the appended claims. Those skilled in the art may make various changes or modifications to these embodiments without departing from the principle and essence of the invention, and all such changes and modifications fall within the scope of protection of the invention.

Claims (12)

1. A goods matching method, characterized in that it comprises the following steps:
First, for each item of goods:
Determine the goods category from the goods description, and call the lexicon corresponding to that goods category;
Divide the goods description into at least two elements and assign an element weight to each element, where $P_i$ denotes the element weight of the i-th element, n denotes the number of elements, and the element weights of all elements sum to 1;
Using the lexicon, segment the goods description into at least one element keyword for each element, and format the element keywords to unify their form;
Then, for every two items of goods:
Establish a group mapping for each element, the group mapping being the set of element keywords of the same element of the two goods;
Calculate the similarity of each group mapping, the similarity being the proportion of identical element keywords among all element keywords in the group mapping, where $F_i$ denotes the similarity of the group mapping of the i-th element;
Calculate the matching score of the two goods,
matching score $= \sum_{i=1}^{n} F_i \times P_i$;
Compare the matching score with a threshold: if the matching score is greater than or equal to the threshold, the two goods match and are determined to be the same goods; if the matching score is less than the threshold, the two goods do not match and are determined to be different goods.
2. The goods matching method of claim 1, characterized in that, when the similarity $F_i$ of the group mapping of the i-th element is 0, the element weight $P_i$ of the i-th element is transferred to the element weights of the other elements.
3. The goods matching method of claim 1, characterized in that the element keywords are formatted, so as to unify their form, in one or more of the following ways: unifying synonyms with a synonym lexicon, unifying letter case, and replacing half-width/full-width characters.
4. The goods matching method of any one of claims 1-3, characterized in that the lexicon is one or more of a brand lexicon, a generic goods name lexicon, a unit lexicon, an attribute lexicon, a model lexicon and a general word lexicon.
5. The goods matching method of claim 4, characterized in that, before the step of using the lexicon to segment the goods description into at least one element keyword for each element, the method further comprises the following step: removing prepositions and/or particles from the goods description using the general word lexicon.
6. The goods matching method of claim 4, characterized in that the at least two elements are selected from the following elements: brand element, generic goods name element, unit element, attribute element and model element.
7. A commodity matching system, characterized in that it comprises:
a commodity category identification module that determines the commodity category from the commodity description of a commodity;
a dictionary calling module that calls the dictionary corresponding to the commodity category;
an element forming module that divides the commodity description into at least two elements and assigns an element weight to each element, wherein P_i denotes the element weight of the i-th element, n denotes the number of elements, and the element weights of all elements sum to 1;
a word segmentation module that uses the dictionary to segment the commodity description into at least one element keyword for each element, and formats the at least one element keyword to unify the form of the element keywords; and
a subsystem that operates on every two commodities, wherein the subsystem comprises:
a group mapping establishing module that establishes a group mapping for each element, the group mapping being the set of element keywords of the same element of the two commodities;
a similarity calculation module that calculates the similarity of each group mapping, the similarity representing the proportion of identical element keywords among all element keywords in the group mapping, wherein F_i denotes the similarity of the group mapping of the i-th element;
a matching score calculation module that calculates the matching score of the two commodities; and
a comparison module that compares the matching score with a threshold, wherein, if the matching score is greater than or equal to the threshold, the two commodities match and are determined to be the same commodity; if the matching score is less than the threshold, the two commodities do not match and are determined to be different commodities.
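The group mapping and similarity modules can be sketched as follows. The claim defines the similarity as the proportion of identical keywords among all keywords in the mapping. Reading that as intersection size over union size is an assumption, and the data layout and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Keywords = Dict[str, List[str]]  # element name -> element keywords of one commodity


def group_mapping(a: Keywords, b: Keywords, element: str) -> Tuple[Set[str], Set[str]]:
    """Collect the keywords that both commodities hold for one element."""
    return set(a.get(element, [])), set(b.get(element, []))


def similarity(mapping: Tuple[Set[str], Set[str]]) -> float:
    """Share of keywords common to both commodities among all keywords in the mapping."""
    left, right = mapping
    union = left | right
    if not union:
        return 0.0  # Neither commodity has a keyword for this element.
    return len(left & right) / len(union)


item_a = {"brand": ["apple"], "attribute": ["16gb", "black"]}
item_b = {"brand": ["apple"], "attribute": ["16gb", "white"]}
for element in ("brand", "attribute"):
    print(element, similarity(group_mapping(item_a, item_b, element)))
```

The per-element values produced this way play the role of F_i in the matching score.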
8. The commodity matching system as claimed in claim 7, characterized in that the subsystem further comprises an element weight transfer module, configured to transfer the element weight P_i of the i-th element to the element weights of the other elements when the similarity F_i of the group mapping of the i-th element is 0.
9. The commodity matching system as claimed in claim 7, characterized in that the word segmentation module is further configured to format the at least one element keyword, so as to unify the form of the element keywords, by one or more of: unifying synonyms using a synonym dictionary, unifying letter case, and replacing half-width and full-width characters.
10. The commodity matching system as claimed in any one of claims 7-9, characterized in that the dictionary is one or more of a brand dictionary, a commodity generic-name dictionary, a unit dictionary, an attribute dictionary, a model dictionary, and a common-word dictionary.
11. The commodity matching system as claimed in claim 10, characterized in that the word segmentation module is further configured to remove prepositions and/or auxiliary words from the commodity description using the common-word dictionary.
12. The commodity matching system as claimed in claim 10, characterized in that the at least two elements are selected from the following elements: a brand element, a commodity generic-name element, a unit element, an attribute element, and a model element.
CN201110288717A 2011-09-23 2011-09-23 Goods matching method and system Pending CN102332137A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110288717A CN102332137A (en) 2011-09-23 2011-09-23 Goods matching method and system

Publications (1)

Publication Number Publication Date
CN102332137A true CN102332137A (en) 2012-01-25

Family

ID=45483902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110288717A Pending CN102332137A (en) 2011-09-23 2011-09-23 Goods matching method and system

Country Status (1)

Country Link
CN (1) CN102332137A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1932817A (en) * 2006-09-15 2007-03-21 陈远 Common interconnection network content keyword interactive system
CN102193936A (en) * 2010-03-09 2011-09-21 阿里巴巴集团控股有限公司 Data classification method and device

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103309886B (en) * 2012-03-13 2017-05-10 阿里巴巴集团控股有限公司 Trading-platform-based structural information searching method and device
CN103309886A (en) * 2012-03-13 2013-09-18 阿里巴巴集团控股有限公司 Trading-platform-based structural information searching method and device
WO2013170587A1 (en) * 2012-05-14 2013-11-21 华为技术有限公司 Multimedia question and answer system and method
CN103425640A (en) * 2012-05-14 2013-12-04 华为技术有限公司 Multimedia questioning-answering system and method
CN103810468A (en) * 2012-11-05 2014-05-21 东芝泰格有限公司 Commodity recognition apparatus and commodity recognition method
CN103903249A (en) * 2012-12-27 2014-07-02 纽海信息技术(上海)有限公司 Image matching system and method
CN103903249B (en) * 2012-12-27 2017-10-13 北京京东尚科信息技术有限公司 Image matching system and method
CN103235803B (en) * 2013-04-17 2016-12-28 北京京东尚科信息技术有限公司 A kind of method and apparatus obtaining goods attribute value from text
CN104978356A (en) * 2014-04-10 2015-10-14 阿里巴巴集团控股有限公司 Synonym identification method and device
CN104978356B (en) * 2014-04-10 2019-09-06 阿里巴巴集团控股有限公司 A kind of recognition methods of synonym and device
CN105354194A (en) * 2014-08-19 2016-02-24 上海中怡通信息科技有限公司 Intelligent commodity classifying method and system
CN104765858A (en) * 2015-04-21 2015-07-08 北京航天长峰科技工业集团有限公司上海分公司 Construction method for public security synonym library and obtained public security synonym library
CN105005917A (en) * 2015-07-07 2015-10-28 上海晶赞科技发展有限公司 Universal method for correlating single items of different e-commerce websites
CN106096609A (en) * 2016-06-16 2016-11-09 武汉大学 A kind of merchandise query keyword automatic generation method based on OCR
CN106096609B (en) * 2016-06-16 2019-03-19 武汉大学 A kind of merchandise query keyword automatic generation method based on OCR
CN107220334A (en) * 2017-05-25 2017-09-29 北京小度信息科技有限公司 Similarity calculating method, device and the equipment of name of firm
CN107133218A (en) * 2017-05-26 2017-09-05 北京惠商之星网络科技有限公司 Trade name intelligent Matching method, system and computer-readable recording medium
CN108960923A (en) * 2018-07-09 2018-12-07 北京百悟科技有限公司 A kind of method, apparatus and computer storage medium of price
CN110968685A (en) * 2018-09-26 2020-04-07 阿里巴巴集团控股有限公司 Commodity name aggregation method and device
CN110968685B (en) * 2018-09-26 2023-06-20 阿里巴巴集团控股有限公司 Commodity name collection method and device
CN110083678A (en) * 2019-03-12 2019-08-02 平安科技(深圳)有限公司 A kind of electric business platform goods matching method, device and readable storage medium storing program for executing
CN112784861A (en) * 2019-11-07 2021-05-11 北京沃东天骏信息技术有限公司 Similarity determination method and device, electronic equipment and storage medium
CN112199451A (en) * 2020-09-30 2021-01-08 京东数字科技控股股份有限公司 Commodity identification method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN102332137A (en) Goods matching method and system
US10921956B2 (en) System and method for assessing content
CN101876981B (en) A kind of method and device building knowledge base
CN107038186B (en) Method and device for generating title, displaying search result and displaying title
US8010344B2 (en) Dictionary word and phrase determination
US20080312911A1 (en) Dictionary word and phrase determination
CN101506767B (en) Relative to taxonomic hierarchies classify such as document and/or cluster object and from this classification derive data structure
US9934293B2 (en) Generating search results
CN102760142A (en) Method and device for extracting subject label in search result aiming at searching query
KR20080114764A (en) System and method for identifying related queries for languages with multiple writing systems
US10134076B2 (en) Method and system for attribute extraction from product titles using sequence labeling algorithms
Chen et al. Mining user requirements to facilitate mobile app quality upgrades with big data
CN105159998A (en) Keyword calculation method based on document clustering
RU2012144649A (en) PRODUCT SYNTHESIS FROM MULTIPLE SOURCES
CN105404680A (en) Searching recommendation method and apparatus
CN104008186A (en) Method and device for determining keywords in target text
CN105843796A (en) Microblog emotional tendency analysis method and device
US11803541B2 (en) Primitive-based query generation from natural language queries
CN107609192A (en) The supplement searching method and device of a kind of search engine
EP2189917A1 (en) Facilitating display of an interactive and dynamic cloud with advertising and domain features
CN102982025A (en) Identification method and device for searching requirement
CN111737607B (en) Data processing method, device, electronic equipment and storage medium
CN108470289B (en) Virtual article issuing method and equipment based on E-commerce shopping platform
CN112395856B (en) Text matching method, text matching device, computer system and readable storage medium
CN112800317A (en) Search platform architecture for automobile vertical field

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120125