WO2016033708A1 - Apparatus and methods for image data classification - Google Patents
- Publication number: WO2016033708A1
- Application: PCT/CN2014/000825
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- training data
- target code
- neural network
- training
- network system
- Prior art date
Classifications
- G06N3/084 — Neural networks; learning methods; backpropagation, e.g. using gradient descent
- G06N3/045 — Neural networks; architecture; combinations of networks
- G06V10/764 — Image or video recognition or understanding using machine learning; classification, e.g. of video objects
- G06V10/82 — Image or video recognition or understanding using machine learning; using neural networks
- H04L1/0057 — Forward error control; block codes
Definitions
- the selecting module 140 is configured to randomly select a plurality of rows of the changed Hadamard matrix as the target code, wherein the number of rows is identical to that of the classes of the training data samples.
- the target code may be represented as a vector.
- the selecting module 140 is configured to randomly select c rows as balanced target codes for c classes, wherein each of the selected rows corresponds to one target code.
- the class labels C_BC ∈ T^(K×(m-1)) are constructed by choosing K codewords randomly from S_BC ∈ T^((m-1)×(m-1)).
- the apparatus 1000’ comprises a target code generator 100, a neural network system 200, a predictor 300, and a training unit 400.
- the functions of the target code generator 100, the neural network system 200, and the predictor 300 have been described with reference to Fig. 1, and thus will be omitted hereinafter.
- the training unit 400 is configured to train the neural network system with the retrieved training data samples such that the trained neural network system is capable of applying the convolutional filters, pooling layers, and locally or fully connected layers to the retrieved training data samples to generate said target predictions.
- the target prediction may be represented as a vector.
- the training unit 400 further comprises a drawing module 410, an error computing module 420, and a back-propagating module 430.
- the drawing module 410 is configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code, for example, based on a class label of the training data sample.
- the target code may be a ground-truth target code.
- the error computing module 420 is configured to compute an error such as a Hamming distance between the generated target prediction of the training data sample and the ground-truth target code.
- the back-propagating module 430 is configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system. In order to get a convergence result, the drawing module, the error computing module and the back-propagating module repeat processes of drawing, computing and back-propagating until the error is less than a predetermined value.
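The draw/compute/back-propagate loop described above can be sketched with a toy single-layer sigmoid network trained by gradient descent on the squared error; the data, target codes, sizes, and function names below are illustrative stand-ins for a real deep network:

```python
# Toy sketch of the training loop: draw a sample, compute the error
# between the target prediction and the ground-truth target code, and
# back-propagate until the error falls below a predetermined value.
import numpy as np

rng = np.random.default_rng(0)

# Two classes, each with a 4-symbol ground-truth target code (illustrative).
target_codes = {0: np.array([1.0, 0.0, 1.0, 0.0]),
                1: np.array([0.0, 1.0, 0.0, 1.0])}
# Training samples (feature vector, class label).
samples = [(np.array([1.0, 0.0]), 0), (np.array([0.0, 1.0]), 1)]

W = rng.normal(scale=0.1, size=(2, 4))   # weights on the connections

def forward(x):
    return 1.0 / (1.0 + np.exp(-x @ W))  # sigmoid target prediction

error, lr = 1.0, 0.5
while error > 0.01:                      # repeat until below the threshold
    errors = []
    for x, label in samples:             # "drawing" a training sample
        p = forward(x)
        diff = p - target_codes[label]   # prediction vs ground-truth code
        errors.append(0.5 * np.sum(diff ** 2))
        # back-propagate: gradient of the squared error through the sigmoid
        W -= lr * np.outer(x, diff * p * (1.0 - p))
    error = max(errors)
```

After convergence the predictions lie close to the corresponding ground-truth codewords.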
- the predictor 300 further comprises a distance computing module 310, and an assigning module 320.
- the distance computing module 310 is configured to compute Hamming distances between a target prediction of an unclassified data sample and the corresponding ground-truth target code of each class of the training samples. Since both the target prediction and the ground-truth target code are vectors of the same length, the distance between them can be determined by calculating the Hamming distance. For example, if the target prediction is ‘1110111’ and the ground-truth target code is ‘1010101’ , the Hamming distance is the number of positions at which the corresponding values differ. In this example, the Hamming distance is 2.
- the assigning module 320 is configured to assign the unclassified data sample to a class corresponding to the minimum Hamming distance among the computed Hamming distances. That is to say, if the unclassified data sample is closest to a particular class (based on Hamming distance between its target prediction and ground-truth target code) , then the unclassified data sample is considered to be in the same class as the ground-truth code.
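A minimal sketch of the distance computing module 310 and the assigning module 320, using the codeword from the example above; the class names and the second codeword are made up for illustration:

```python
# Nearest-codeword assignment by Hamming distance.

def hamming(a: str, b: str) -> int:
    """Number of positions at which the two codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def assign_class(prediction: str, class_codes: dict) -> str:
    """Class whose ground-truth target code is nearest in Hamming distance."""
    return min(class_codes, key=lambda c: hamming(prediction, class_codes[c]))

# Illustrative ground-truth codes for two classes.
class_codes = {"cat": "1010101", "dog": "0101010"}
d = hamming("1110111", "1010101")        # the example from the text
label = assign_class("1110111", class_codes)
```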
- the training unit 400’ comprises a drawing module 410, an error computing module 420, a back-propagating module 430, and an extracting module 440.
- the drawing module 410 may be configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code, for example, based on a class label of the training data sample.
- the error computing module 420 may be configured to compute an error such as a Hamming distance between the generated target prediction of the training data sample and the ground-truth target code.
- the back-propagating module 430 may be configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system.
- the drawing module 410, the error computing module 420 and the back-propagating module 430 repeat processes of drawing, computing and back-propagating until the error is less than a predetermined value.
- the extracting module 440 may be configured to extract hidden layer features from the penultimate layer of the neural network system and train a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples, after the error is less than a predetermined value.
- the hidden layer features will be used as training input of the multiclass classifier
- the class labels will be used as training target of the multiclass classifier
- the training input and the training target are used to train the multiclass classifier by optimizing the classifier’s objective function.
- Given an unclassified data sample, its hidden layer features may be extracted by the trained neural network system and then fed into the multiclass classifier. Then, the multiclass classifier may output a class prediction for the unclassified data sample.
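A hedged sketch of this stage: a simple nearest-centroid classifier stands in for the multiclass classifier (e.g. an SVM) mentioned in the text, and the feature vectors below are made-up stand-ins for penultimate-layer activations:

```python
# Train a multiclass classifier on extracted hidden-layer features
# (training input) and class labels (training target).

def train_centroids(features, labels):
    """Average the hidden-layer features of each class."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, f):
    """Class whose centroid is closest to the sample's hidden features."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, f))
    return min(centroids, key=lambda y: dist2(centroids[y]))

# Illustrative hidden-layer features for two classes.
feats = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [0, 0, 1, 1]
centroids = train_centroids(feats, labels)
pred = predict(centroids, [0.85, 0.15])  # features of an unclassified sample
```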
- the predictor 300’ comprises a receiving module 340, a retrieving module 350, and a prediction generating module 360.
- the receiving module 340 may be configured to receive an unclassified data sample.
- the retrieving module 350 may be configured to retrieve the trained multiclass classifier from the training unit.
- the prediction generating module 360 may be configured to generate a class prediction for the unclassified data sample by the trained multiclass classifier.
- Fig. 8 is a schematic flowchart illustrating a method 2000 for data classification.
- the method 2000 may be described in detail with respect to Fig. 8.
- a plurality of training data samples is retrieved and a target code for each of the retrieved training data samples is generated by a target code generator, wherein the training data samples are grouped into different classes.
- a target prediction for the unclassified data sample is generated by a neural network system.
- the neural network system may consist of multiple layers of convolutional filters, pooling layers, and locally or fully connected layers.
- the neural network system may comprise at least one of a deep belief network and a convolutional network.
- the method further comprises a step S240 of training the neural network system with the retrieved training data samples such that the trained neural network system is capable of applying the convolutional filters, pooling layers, and locally or fully connected layers to the retrieved training data samples to generate said target predictions.
- the step S220 of generating a target code comprises following steps.
- a Hadamard matrix, whose entries are either “+1” or “-1” , is generated.
- a first row and a first column of the Hadamard matrix are removed.
- each “+1” is changed to “0” and each “-1” is changed to “1” .
- a number of rows of changed Hadamard matrix are randomly selected as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
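The generation steps above can be sketched as follows, assuming m is a power of two so that Sylvester's construction applies; the function names are illustrative:

```python
# Generate target codes: build a Hadamard matrix, remove its first row
# and column, map +1 -> 0 and -1 -> 1, then randomly select one codeword
# per class.
import random

def hadamard(m):
    """Sylvester construction: start from [[1]] and double until m x m."""
    H = [[1]]
    while len(H) < m:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def generate_target_codes(m, num_classes, seed=0):
    H = hadamard(m)                                       # entries +1 / -1
    S = [row[1:] for row in H[1:]]                        # drop first row and column
    S = [[0 if v == 1 else 1 for v in row] for row in S]  # "+1" -> "0", "-1" -> "1"
    rng = random.Random(seed)
    return rng.sample(S, num_classes)                     # one codeword per class

codes = generate_target_codes(8, num_classes=3)
```

Each selected codeword has length m-1 = 7 and contains exactly m/2 = 4 ones.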
- in step S230, a class for an unclassified data sample is predicted by a predictor based on the generated target code and the generated target prediction.
- the step S240 of training a neural network system comprises following steps.
- a training data sample is drawn from a predetermined training set, wherein the training data sample is associated with a corresponding target code, particularly a ground-truth target code, for example, based on a class label of the training data sample.
- an error such as a Hamming distance between the generated target prediction and the ground-truth target code is computed.
- the computed error is back-propagated through the neural network system so as to adjust weights on connections between neurons of the neural network system.
- in step S440, the steps S410-S430 are repeated until the error is less than a predetermined value, i.e., until the training process converges.
- the step S230 of predicting a class for an unclassified data sample comprises following steps.
- in step S510, an unclassified data sample is received.
- in step S520, Hamming distances between a target prediction of the unclassified data sample and the corresponding ground-truth target code of each class of the training samples are computed.
- the distance between the target prediction and the ground-truth target code can be computed by calculating the Hamming distance. For example, if the target prediction is ‘1110111’ and the ground-truth target code is ‘1010101’ , the Hamming distance is the number of positions at which the corresponding values differ. In this example, the Hamming distance is 2.
- the unclassified data sample is assigned to a class corresponding to the minimum Hamming distance among the computed Hamming distances. That is to say, if the unclassified data sample is closest to a particular class (based on Hamming distance between its target prediction and ground-truth target code) , then the unclassified data sample is considered to be in the same class as the ground-truth code.
- the step S240’ of training a neural network system further comprises following steps.
- a training data sample is drawn from a predetermined training set, wherein the training data sample is associated with a corresponding target code, particularly a ground-truth target code, for example, based on a class label of the training data sample.
- step S420 an error between the generated target prediction and the ground-truth target code is computed.
- the computed error is back-propagated through the neural network system so as to adjust weights on connections between neurons of the neural network system.
- in step S440, if the error is not less than a predetermined value, i.e., the training process has not yet converged, the steps S410-S430 are repeated; otherwise, the method proceeds with step S450 of extracting hidden layer features from the penultimate layer of the neural network system and training a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples.
- the hidden layer features will be used as training input of the multiclass classifier
- the class labels will be used as training target of the multiclass classifier
- the training input and the training target are used to train the multiclass classifier by optimizing the classifier’s objective function.
- Given an unclassified data sample, its hidden layer features may be extracted by the trained neural network system and then fed into the multiclass classifier. Then, the multiclass classifier may output a class prediction for the unclassified data sample.
- the step S230’ of predicting the class for an unclassified data sample comprises following steps.
- step S540 an unclassified data sample is received.
- step S550 the multiclass classifier trained in step S450 is retrieved.
- a class prediction is generated for the unclassified data sample by the trained multiclass classifier.
- the present application provides a neural network system with a balanced target coding unit to represent the target codes of different data classes.
- target codes are employed in the learning of a neural network along with a predetermined set of training data.
- prior art approaches often adopt a 1-of-K coding scheme in neural network training.
- the balanced coding unit brings extra benefits to neural network training.
- more discriminative hidden features can form in the neural network system.
- the predictions generated by the neural network system have error-correcting capability.
Abstract
Disclosed is an apparatus for image data classification. The apparatus may comprise: a target code generator configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes and the generated target code has a dimension identical to the number of classes; a target prediction generator configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples; and a predictor configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction. A method for image data classification is also disclosed.
Description
The present application generally relates to the field of target identification and, more particularly, to an apparatus and a method for image data classification.
Learning robust and invariant representations has been a long-standing goal in computer vision. In comparison to hand-crafted visual features, such as SIFT or HOG, features learned by deep models have recently been shown to be more capable of capturing abstract concepts invariant to various phenomena in the visual world, e.g. viewpoint, illumination, and clutter. Hence, an increasing number of studies are now exploring the use of deep representations on vision problems, particularly on classification tasks.
Rather than using deep models for direct classification, many vision studies choose to follow a multistage technique. This technique has been shown effective in combining the good invariant behavior of deep features with the discriminative power of standard classifiers. Typically, these studies first learn a deep model, e.g. a convolutional neural network, in a supervised manner. The 1-of-K coding, comprising vectors of length K with the k-th element set to one and the remaining elements set to zero, is used along with a softmax function for classification. Each element in a 1-of-K code essentially represents the probability of a specific class. Subsequently, the features of a raw image are extracted from the penultimate layer or shallower layers to form a high-dimensional feature vector as input to classifiers such as an SVM.
In neural network training, prior art approaches often adopt a 1-of-K coding scheme. However, the discriminative hidden features formed in a neural network system trained with 1-of-K coding are limited, and the predictions generated by the neural network system do not have error-correcting capability. Thus, there is a need for a more effective target coding with better performance in neural network training.
Summary
According to an embodiment of the present application, disclosed is an apparatus for data classification. The apparatus may comprise: a target code generator configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes; a target prediction generator configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples; and a predictor configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction.
According to another embodiment of the present application, disclosed is a method for data classification. The method may comprise: retrieving a plurality of training data samples, wherein the training data samples are grouped into different classes; generating a target code for each of the retrieved training data samples; for an unclassified data sample, generating a target prediction for the unclassified data sample; and predicting a class for the unclassified data sample based on the generated target code and the generated target prediction.
The present invention brings extra benefits to neural network training. On one hand, more discriminative hidden features can form in the neural network system. On the other hand, the predictions generated by the neural network system have error-correcting capability.
Brief Description of the Drawing
Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
Fig. 1 is a schematic diagram illustrating an apparatus for image data classification according to an embodiment of the present application.
Fig. 2 is a schematic diagram illustrating a target code generator according to an embodiment of the present application.
Fig. 3 is a schematic diagram illustrating an apparatus with a training unit according to another embodiment of the present application.
Fig. 4 is a schematic diagram illustrating the training unit according to another embodiment of the present application.
Fig. 5 is a schematic diagram illustrating a predictor according to an embodiment of the present application.
Fig. 6 is a schematic diagram illustrating a training unit according to another embodiment of the present application.
Fig. 7 is a schematic diagram illustrating a predictor according to another embodiment of the present application.
Fig. 8 is a schematic flowchart illustrating a method for image data classification according to an embodiment of the present application.
Fig. 9 is a schematic flowchart illustrating a process for generating a target code according to an embodiment of the present application.
Fig. 10 is a schematic flowchart illustrating a process for training a neural network system according to an embodiment of the present application.
Fig. 11 is a schematic flowchart illustrating a process for predicting a class for an unclassified data sample according to an embodiment of the present application.
Fig. 12 is a schematic flowchart illustrating a process for training a neural network system according to another embodiment of the present application.
Fig. 13 is a schematic flowchart illustrating a process for predicting a class for an unclassified data sample according to another embodiment of the present application.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When appropriate, the same reference numbers are used throughout the drawings to refer to the same or like parts. Fig. 1 is a schematic diagram illustrating an exemplary apparatus 1000 for data classification with some disclosed embodiments.
It shall be appreciated that the apparatus 1000 may be implemented using certain hardware, software, or a combination thereof. In addition, the embodiments of the present application may be adapted to a computer program product embodied on one or more computer readable storage media (comprising but not limited to disk storage, CD-ROM, optical memory and the like) containing computer program codes.
In the case that the apparatus 1000 is implemented with software, the apparatus 1000 can be run on one or more systems that may include a general-purpose computer, a computer cluster, a mainframe computer, a computing device dedicated to providing online content, or a computer network comprising a group of computers operating in a centralized or distributed fashion.
Referring to Fig. 1 again, where the apparatus 1000 is implemented by hardware or a combination of hardware and software, it may comprise a target code generator 100, a target prediction generator 200, and a predictor 300. In the embodiment shown in Fig. 1, the target code generator 100 may be configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes. The target prediction generator 200 may be configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples. In some embodiments, the target prediction generator 200 may comprise a neural network system. In some embodiments, the neural network system may comprise at least one of a deep belief network and a convolutional network. For example, the neural network may consist of convolutional filters, pooling layers, and locally or fully connected layers, which are well known in the art, and thus the detailed configurations thereof are omitted herein. The predictor 300 may be configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction.
Hereinafter, the definition of a target code (or target coding) will be described. Let T be a set of integers, called the alphabet set. An element in T is called a symbol. For example, T = {0, 1} is a binary alphabet set. A target code S is a matrix S ∈ T^(n×l), wherein each row of the target code is called a codeword, l denotes the number of symbols in each codeword, and n denotes the total number of codewords. The target code can be constructed with a deterministic method, which is built on the Hadamard matrix. For a target code S, we denote {α_i}, for i = 1, 2, …, n, to be the set of empirical distributions of symbols in the rows of S, i.e., α_i is a vector of length |T|, with the t-th component of α_i counting the number of occurrences of the t-th symbol in the i-th row of S. Similarly, we let {β_j}, for j = 1, 2, …, l, be the set of empirical distributions of symbols in the columns of S. Given two distinct row indices i and i', the Hamming distance between row i and row i' of a target code S is defined as |{j : S_ij ≠ S_i'j}|, i.e., it counts the number of column indices at which the corresponding symbols in rows i and i' are not equal. For simplicity, we call it the pairwise Hamming distance.
Table 1 shows an example of a 1-of-K target code, which is typically used in deep learning for representing K classes. Each of the K symbols, either ‘0’ or ‘1’, indicates the probability of a specific class. The target code here can be written as S = I, where I ∈ T^(K×K) is an identity matrix. It is easy to attain some properties of the 1-of-K coding. For instance, for i = 1, 2, …, K, we have α_i = (K-1, 1), since only one symbol in each codeword has the value ‘1’. Similarly, we have β_j = (K-1, 1) for j = 1, 2, …, K. The pairwise Hamming distance is two.
Table 1
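The stated properties of the 1-of-K code can be checked directly. The sketch below (our own illustration) builds S = I for K = 4 and verifies that every codeword contains a single ‘1’ and that any two distinct codewords are at Hamming distance two:

```python
import numpy as np

K = 4
S = np.eye(K, dtype=int)  # 1-of-K target code: S = I

# each codeword contains exactly one '1' and K-1 '0's
assert all(int(row.sum()) == 1 for row in S)

# any two distinct codewords differ in exactly two positions
d = int(np.sum(S[0] != S[1]))
print(d)  # prints 2
```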
Beyond representing classes, target coding can play additional roles, such as error correction or facilitating better feature representations. To enable these additional roles, a target code S fulfilling specific requirements should be constructed.
The specific requirements a good target code should satisfy will be introduced hereinafter. Generally, these requirements can be summarized in three aspects: uniformness in each column, redundancy in each row, and a constant pairwise Hamming distance. Hereinafter, how to generate a target code as shown in Table 2, referred to as a Balanced Code (BC) and denoted SBC, will be described based on the above requirements.
Table 2
As shown in Fig. 2, the target code generator 100 further comprises a matrix generating module 110, a removing module 120, a changing module 130, and a selecting module 140.
The target code generator 100 is configured to generate a Hadamard matrix, wherein entries of the Hadamard matrix are either “+1” or “-1”, and the dimension of the Hadamard matrix is larger than the number of classes of the training data samples. Particularly, a square m×m matrix H, whose entries are either ‘+1’ or ‘-1’, is called a Hadamard matrix if HH^T = mI. In some embodiments, we use ‘+’ to represent ‘+1’ and ‘-’ to represent ‘-1’. The definition of a Hadamard matrix requires that any pair of distinct rows, and any pair of distinct columns, are orthogonal. A possible way to generate the Hadamard matrix is Sylvester’s method (Hedayat and Wallis, 1978), where a new Hadamard matrix is produced from the old one by the Kronecker (or tensor) product. For example, given a Hadamard matrix H2 = [+ +; + -], we can produce H4 = H2 ⊗ H2, where ⊗ denotes the Kronecker product. Similarly, H8 is computed by H8 = H2 ⊗ H4.
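Sylvester’s construction can be sketched as follows (illustrative code, not part of the application; NumPy’s `kron` implements the Kronecker product). The assertion checks the defining property HH^T = mI:

```python
import numpy as np

def sylvester_hadamard(m):
    """Hadamard matrix of order m (m a power of 2) via repeated Kronecker products."""
    H2 = np.array([[1, 1], [1, -1]])
    H = np.array([[1]])
    while H.shape[0] < m:
        H = np.kron(H2, H)  # H4 = H2 (x) H2, H8 = H2 (x) H4, ...
    return H

H4 = sylvester_hadamard(4)
assert np.array_equal(H4 @ H4.T, 4 * np.eye(4, dtype=int))  # HH^T = mI
```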
The removing module 120 is configured to remove the first row and the first column of H. The changing module 130 is configured to change ‘+1’ in the remaining matrix to ‘0’ and ‘-1’ to ‘1’, obtaining SBC ∈ T^((m-1)×(m-1)). The above formulation yields the balanced target code SBC of size (m-1)×(m-1), with a row sum of m/2, a column sum of m/2, and a constant pairwise Hamming distance of m/2.
The selecting module 140 is configured to randomly select a plurality of rows of the changed Hadamard matrix as the target codes, wherein the number of selected rows is identical to that of the classes of the training data samples. In some embodiments, the target code may be represented as a vector. Particularly, the selecting module 140 is configured to randomly select c rows as balanced target codes for c classes, wherein each of the selected rows corresponds to one target code. In some embodiments, the class labels CBC ∈ T^(K×(m-1)) are constructed by choosing K codewords randomly from SBC ∈ T^((m-1)×(m-1)).
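Putting the generating, removing, changing, and selecting steps together, a balanced-code generator might look like the sketch below. The function name, seed handling, and sizes are our own assumptions; the assertions check the stated row-sum property for m = 8:

```python
import numpy as np

def balanced_code(m, num_classes, seed=0):
    """Balanced target code: build an m x m Sylvester-Hadamard matrix, drop its
    first row and column, map +1 -> 0 and -1 -> 1, then pick one codeword per
    class at random (illustrative sketch)."""
    H2 = np.array([[1, 1], [1, -1]])
    H = np.array([[1]])
    while H.shape[0] < m:
        H = np.kron(H2, H)
    S = H[1:, 1:]                      # remove first row and first column
    S = (S == -1).astype(int)          # change +1 -> 0 and -1 -> 1
    rng = np.random.default_rng(seed)
    rows = rng.choice(S.shape[0], size=num_classes, replace=False)
    return S[rows]

C = balanced_code(8, num_classes=3)
# every selected codeword has row sum m/2 = 4
assert all(int(r.sum()) == 4 for r in C)
```

The constant pairwise Hamming distance of m/2 follows because any two distinct rows of a Hadamard matrix disagree in exactly m/2 positions, and the removed first column (all ‘+1’ under Sylvester’s construction) contributes no disagreements.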
As shown in Fig. 3, the apparatus 1000’ according to another embodiment of the present application comprises a target code generator 100, a neural network system 200, a predictor 300, and a training unit 400. The functions of the target code generator 100, the neural network system 200, and the predictor 300 have been described with reference to Fig. 1, and thus will be omitted hereinafter. The training unit 400 is configured to train the neural
network system with the retrieved training data samples such that the trained neural network system is capable of applying the convolutional filters, pooling layers, and locally or fully connected layers to the retrieved training data samples to generate said target predictions. In some embodiments, the target prediction may be represented as a vector.
As shown in Fig. 4, the training unit 400 further comprises a drawing module 410, an error computing module 420, and a back-propagating module 430. The drawing module 410 is configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code, for example, based on a class label of the training data sample. For example, the above association may take a form in which class label ‘1’ maps to target code ‘1010101’ and class label ‘2’ maps to target code ‘0110011’. In some embodiments, the target code may be a ground-truth target code. The error computing module 420 is configured to compute an error, such as a Hamming distance, between the generated target prediction of the training data sample and the ground-truth target code. The back-propagating module 430 is configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system. In order to obtain a converged result, the drawing module, the error computing module, and the back-propagating module repeat the processes of drawing, computing, and back-propagating until the error is less than a predetermined value.
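The draw/compute/back-propagate loop can be sketched as below. Note that the Hamming distance itself is not differentiable, so this illustration trains sigmoid outputs against the ground-truth codewords with a differentiable surrogate gradient and monitors the Hamming distance as the convergence criterion; the toy data, 7-bit codes, and single-layer model are our own assumptions, not the application’s network:

```python
import numpy as np

rng = np.random.default_rng(0)

# two classes with 7-bit ground-truth target codes (illustrative values)
codes = {0: np.array([1, 0, 1, 0, 1, 0, 1]),
         1: np.array([0, 1, 1, 0, 0, 1, 1])}

# toy training data: 20 samples per class, 5 features, well-separated means
X = np.vstack([rng.normal(2.0 * c, 1.0, (20, 5)) for c in (0, 1)])
y = np.repeat([0, 1], 20)
T = np.vstack([codes[c] for c in y]).astype(float)

W = rng.normal(0, 0.1, (5, 7))
b = np.zeros(7)
for _ in range(500):
    P = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid "target prediction"
    G = (P - T) / len(X)                     # gradient of the cross-entropy surrogate
    W -= 0.5 * X.T @ G                       # back-propagate: adjust weights
    b -= 0.5 * G.sum(axis=0)

# monitor the Hamming distance between thresholded predictions and codewords
hamming = np.sum((P > 0.5) != T.astype(bool), axis=1)
# the mean Hamming error on the training set shrinks toward 0 as training converges
```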
As shown in Fig. 5, the predictor 300 further comprises a distance computing module 310 and an assigning module 320. The distance computing module 310 is configured to compute Hamming distances between a target prediction of an unclassified data sample and the corresponding ground-truth target code of each class of the training samples. Since both the target prediction and the ground-truth target code are vectors of the same length, the distance between them can be determined by calculating the Hamming distance. For example, if the target prediction is ‘1110111’ and the ground-truth target code is ‘1010101’, the Hamming distance is determined by counting the number of positions at which the corresponding values differ. In this example, the Hamming distance is 2. The assigning module 320 is configured to assign the unclassified data sample to the class corresponding to the minimum Hamming distance among the computed Hamming distances. That is to say, if the unclassified data sample is closest to a particular class (based on the Hamming distance between its target prediction and that class’s ground-truth target code), then the unclassified data sample is assigned to that class.
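The minimum-Hamming-distance assignment can be sketched as follows. The two class codewords reuse the example values from the text above, while the prediction vector is our own illustrative choice (picked so the two distances differ):

```python
import numpy as np

def predict_class(target_prediction, class_codes):
    """Assign the class whose ground-truth target code is nearest in Hamming distance."""
    dists = {c: int(np.sum(target_prediction != code))
             for c, code in class_codes.items()}
    return min(dists, key=dists.get)

class_codes = {1: np.array([1, 0, 1, 0, 1, 0, 1]),   # class '1': '1010101'
               2: np.array([0, 1, 1, 0, 0, 1, 1])}   # class '2': '0110011'
pred = np.array([1, 0, 1, 0, 1, 1, 1])               # hypothetical target prediction

print(predict_class(pred, class_codes))  # distance 1 to class 1, 3 to class 2 -> 1
```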
As shown in Fig. 6, the training unit 400’ according to another embodiment of the present application comprises a drawing module 410, an error computing module 420, a back-propagating module 430, and an extracting module 440. The drawing module 410 may be configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code, for example, based on a class label of the training data sample. The error computing module 420 may be configured to compute an error, such as a Hamming distance, between the generated target prediction of the training data sample and the ground-truth target code. The back-propagating module 430 may be configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system. The drawing module 410, the error computing module 420, and the back-propagating module 430 repeat the processes of drawing, computing, and back-propagating until the error is less than a predetermined value. The extracting module 440 may be configured to extract hidden layer features from the penultimate layer of the neural network system and to train a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples, after the error is less than the predetermined value. Particularly, the hidden layer features are used as the training input of the multiclass classifier, the class labels are used as the training target of the multiclass classifier, and the training input and the training target are used to train the multiclass classifier by optimizing the classifier’s objective function. Given an unclassified data sample, its hidden layer features may be extracted by the trained neural network system and then fed into the multiclass classifier. The multiclass classifier may then output a class prediction for the unclassified data sample.
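The multistage paradigm — extract penultimate-layer features, then fit a separate multiclass classifier on them — can be sketched as below. The frozen random hidden layer and the nearest-centroid classifier are stand-ins chosen for brevity; they are our own assumptions, not the application’s actual network or classifier:

```python
import numpy as np

rng = np.random.default_rng(1)

def hidden_features(X, W1, b1):
    """Stand-in for the penultimate layer of a trained network (ReLU activations)."""
    return np.maximum(X @ W1 + b1, 0.0)

# toy data: two well-separated classes, and a frozen "trained" hidden layer
X = np.vstack([rng.normal(c, 0.3, (10, 4)) for c in (0.0, 2.0)])
y = np.repeat([0, 1], 10)
W1 = rng.normal(0, 1, (4, 6))
b1 = np.zeros(6)

# stage 2: use hidden features as training input, class labels as training target
F = hidden_features(X, W1, b1)
centroids = {c: F[y == c].mean(axis=0) for c in np.unique(y)}

def classify(x):
    """Extract features of an unclassified sample, then apply the classifier."""
    f = hidden_features(x, W1, b1)
    return min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))

print(classify(np.full(4, 1.9)))  # sample near the class-1 cluster
```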
As shown in Fig. 7, the predictor 300’ according to another embodiment of the
present application comprises a receiving module 340, a retrieving module 350, and a prediction generating module 360. The receiving module 340 may be configured to receive an unclassified data sample. The retrieving module 350 may be configured to retrieve the trained multiclass classifier from the training unit. The prediction generating module 360 may be configured to generate a class prediction for the unclassified data sample by the trained multiclass classifier.
Fig. 8 is a schematic flowchart illustrating a method 2000 for data classification. Hereinafter, the method 2000 may be described in detail with respect to Fig. 8.
At step S210, a plurality of training data samples is retrieved, and a target code for each of the retrieved training data samples is generated by a target code generator, wherein the training data samples are grouped into different classes.
At step S220, for an unclassified data sample, a target prediction for the unclassified data sample is generated by a neural network system. In some embodiments, as stated in the above, the neural network system may consist of multiple layers of convolutional filters, pooling layers, and locally or fully connected layers. In some embodiments, the neural network system may comprise at least one of a deep belief network and a convolutional network. In some embodiments, the method further comprises a step S240 of training the neural network system with the retrieved training data samples such that the trained neural network system is capable of applying the convolutional filters, pooling layers, and locally or fully connected layers to the retrieved training data samples to generate said target predictions.
As shown in Fig. 9, the step S210 of generating a target code comprises the following steps. To be specific, at step S310, a Hadamard matrix, whose entries are either “+1” or “-1”, is generated. At step S320, a first row and a first column of the Hadamard matrix are removed. At step S330, “+1” is changed to “0” and “-1” is changed to “1”. At step S340, a number of rows of the changed Hadamard matrix are randomly selected as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
And then the method 2000 proceeds with step S230, at which a class for an
unclassified data sample is predicted by a predictor based on the generated target code and the generated target prediction.
As shown in Fig. 10, in the case of following the nearest neighbor classification paradigm, the step S240 of training a neural network system comprises the following steps.
At step S410, a training data sample is drawn from a predetermined training set, wherein the training data sample is associated with a corresponding target code, particularly a ground-truth target code, for example, based on a class label of the training data sample. For example, the above association may take a form in which class label ‘1’ maps to target code ‘1010101’ and class label ‘2’ maps to target code ‘0110011’.
At step S420, an error, such as a Hamming distance, between the generated target prediction and the ground-truth target code is computed.
At step S430, the computed error is back-propagated through the neural network system so as to adjust weights on connections between neurons of the neural network system.
At step S440, the steps S410-S430 are repeated until the error is less than a predetermined value, i.e., until the training process has converged.
As shown in Fig. 11, in the case of following the nearest neighbor classification paradigm, the step S230 of predicting a class for an unclassified data sample comprises the following steps.
At step S510, an unclassified data sample is received.
At step S520, Hamming distances between a target prediction of the unclassified data sample and the corresponding ground-truth target code of each class of the training samples are computed. As discussed above, since both the target prediction and the ground-truth target code are vectors of the same length, the distance between them can be computed as the Hamming distance. For example, if the target prediction is ‘1110111’ and the ground-truth target code is ‘1010101’, the Hamming distance is computed by counting the number of positions at which the corresponding values differ. In this example, the Hamming distance is 2.
At step S530, the unclassified data sample is assigned to the class corresponding to the minimum Hamming distance among the computed Hamming distances. That is to say, if the unclassified data sample is closest to a particular class (based on the Hamming distance between its target prediction and that class’s ground-truth target code), then the unclassified data sample is assigned to that class.
As shown in Fig. 12, according to another embodiment of the present application, in the case of following the multistage paradigm, the step S240’ of training a neural network system further comprises the following steps.
At step S410, a training data sample is drawn from a predetermined training set, wherein the training data sample is associated with a corresponding target code, particularly a ground-truth target code, for example, based on a class label of the training data sample.
At step S420, an error between the generated target prediction and the ground-truth target code is computed.
At step S430, the computed error is back-propagated through the neural network system so as to adjust weights on connections between neurons of the neural network system.
At step S440’, it is determined whether the error is less than a predetermined value, i.e., whether the training process has converged. If not, the steps S410-S430 are repeated; otherwise, the method proceeds with step S450 of extracting hidden layer features from the penultimate layer of the neural network system and training a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples. Particularly, the hidden layer features are used as the training input of the multiclass classifier, the class labels are used as the training target of the multiclass classifier, and the training input and the training target are used to train the multiclass classifier by optimizing the classifier’s objective function. Given an unclassified data sample, its hidden layer features may be extracted by the trained neural network system and then fed into the multiclass classifier. The multiclass classifier may then output a class prediction for the unclassified data sample.
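The control flow of steps S410-S450 — repeat the train step while the error is at or above the threshold, then move on to feature extraction — can be illustrated with a trivial stand-in for the training step (the geometrically decaying error is a toy assumption, not a real back-propagation step):

```python
def train_step(state):
    """Hypothetical stand-in for steps S410-S430: a real step would draw a sample,
    compute the Hamming-distance error, and back-propagate; here the error simply
    decays so the control flow can be shown."""
    state["error"] *= 0.8
    return state["error"]

state = {"error": 10.0}
threshold = 0.5
while train_step(state) >= threshold:   # repeat S410-S430 until converged
    pass
# converged: proceed to S450 (extract penultimate-layer features, train the classifier)
print(round(state["error"], 2))  # 0.44
```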
As shown in Fig. 13, according to another embodiment of the present application, in the case of following the multistage paradigm, the step S230’ of predicting the class for an unclassified data sample comprises the following steps.
At step S540, an unclassified data sample is received.
At step S550, the multiclass classifier trained in step S450 is retrieved.
At step S560, a class prediction is generated for the unclassified data sample by the trained multiclass classifier.
The present application provides a neural network system with a balanced target coding unit to represent the target codes of different data classes. Such target codes are employed in the learning of a neural network along with a predetermined set of training data.
Prior art often adopts a 1-of-K coding scheme in neural network training. In contrast to the conventional 1-of-K coding scheme, the balanced coding unit brings extra benefits to neural network training. On one hand, more discriminative hidden features can form in the neural network system. On the other hand, the predictions generated by the neural network system have error-correcting capability.
Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon learning the basic inventive concept. The appended claims are intended to cover the preferred examples and all variations or modifications falling within the scope of the present invention.
Interestingly, even in just a two-dimensional embedding space, the features induced by Balanced Code-based learning can already be easily separated. In contrast, the feature clusters induced by 1-of-K coding overlap, such that separation of those clusters may only be possible in higher dimensions. By replacing 1-of-K coding with the Balanced Code in deep feature learning, some classes that are confused under 1-of-K coding can be separated. A longer balanced code leads to more separable and distinct feature clusters.
Obviously, those skilled in the art can make variations or modifications to the present invention without departing from its spirit and scope. If these variations or modifications belong to the scope of the claims and equivalent techniques, they also fall within the scope of the present invention.
Claims (20)
- An apparatus for image data classification, comprising:
a target code generator configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes;
a target prediction generator configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples; and
a predictor configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction.
- An apparatus of claim 1, wherein the target code generator further comprises:
a matrix generating module configured to generate a Hadamard matrix, wherein entries of the Hadamard matrix are either “+1” or “-1”, and the dimension of the Hadamard matrix is larger than the number of classes of the training data samples;
a removing module configured to remove a first row and a first column of the Hadamard matrix;
a changing module configured to change “+1” and “-1” in the Hadamard matrix to “0” and “1”, respectively; and
a selecting module configured to randomly select a plurality of rows of the changed Hadamard matrix as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
- An apparatus of claim 2, wherein the prediction generator comprises a neural network system, and
wherein the apparatus further comprises a training unit configured to train the neural network system with the retrieved training data samples such that the trained neural network system is capable of generating said target predictions.
- An apparatus of claim 3, wherein the target code is a ground-truth target code.
- An apparatus of claim 4, wherein the training unit further comprises:
a drawing module configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
an error computing module configured to compute an error between the generated target prediction of the training data sample and the ground-truth target code; and
a back-propagating module configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system,
wherein the drawing module, the error computing module and the back-propagating module repeat processes of drawing, computing and back-propagating until the error is less than a predetermined value.
- An apparatus of claim 5, wherein the predictor further comprises:
a receiving module configured to receive an unclassified data sample;
a distance computing module configured to compute Hamming distances between a target prediction of the unclassified data sample and the corresponding ground-truth target code of each class of the training samples; and
an assigning module configured to assign the unclassified data sample to a class corresponding to the minimum Hamming distance among the computed Hamming distances.
- An apparatus of claim 4, wherein the training unit further comprises:
a drawing module configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
an error computing module configured to compute an error between the generated target prediction of the training data sample and the ground-truth target code;
a back-propagating module configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system; and
an extracting module configured to extract hidden layer features from a penultimate layer of the neural network system and train a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples, after the error is less than a predetermined value,
wherein the drawing module, the error computing module and the back-propagating module repeat processes of drawing, computing and back-propagating until the error is less than the predetermined value.
- An apparatus of claim 5 or 7, wherein the error is a Hamming distance.
- An apparatus of claim 7, wherein the predictor further comprises:
a receiving module configured to receive an unclassified data sample;
a retrieving module configured to retrieve the trained multiclass classifier from the training unit; and
a prediction generating module configured to generate a class prediction for the unclassified data sample by the trained multiclass classifier.
- An apparatus of claim 3, wherein the neural network system comprises at least one of a deep belief network and a convolutional network.
- A method for image data classification, comprising:
retrieving a plurality of training data samples, wherein the training data samples are grouped into different classes;
generating a target code for each of the retrieved training data samples;
for an unclassified data sample, generating a target prediction for the unclassified data sample; and
predicting a class for the unclassified data sample based on the generated target code and the generated target prediction.
- A method of claim 11, wherein the step of generating a target code comprises:
generating a Hadamard matrix, wherein entries of the Hadamard matrix are either “+1” or “-1”, and the dimension of the Hadamard matrix is larger than the number of classes of the training data samples;
removing a first row and a first column of the Hadamard matrix;
changing “+1” and “-1” in the Hadamard matrix to “0” and “1”, respectively; and
randomly selecting a plurality of rows of the changed Hadamard matrix as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
- A method of claim 12, wherein the target prediction is generated by a neural network system, and wherein the method further comprises training the neural network system with the retrieved training data samples such that the trained neural network system is capable of generating said target predictions.
- A method of claim 13, wherein the target code is a ground-truth target code.
- A method of claim 14, wherein the step of training a neural network system comprises:
1) drawing a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
2) computing an error between the generated target prediction of the training data sample and the ground-truth target code;
3) back-propagating the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system; and
4) repeating steps 1)-3) until the error is less than a predetermined value.
- A method of claim 15, wherein the step of predicting a class for an unclassified data sample comprises:
receiving an unclassified data sample;
computing Hamming distances between a target prediction of the unclassified data sample and the corresponding ground-truth target code of each class of the training samples; and
assigning the unclassified data sample to a class corresponding to the minimum Hamming distance among the computed Hamming distances.
- A method of claim 14, wherein the step of training a neural network system further comprises:
1) drawing a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
2) computing an error between the generated target prediction of the training data sample and the ground-truth target code;
3) back-propagating the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system; and
4) determining whether the error is larger than a predetermined value;
if yes, repeating steps 1)-3);
if no, proceeding with 5) extracting hidden layer features from the penultimate layer of the neural network system and training a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples.
- A method of claim 15 or 17, wherein the error is a Hamming distance.
- A method of claim 17, wherein the step of predicting the class for an unclassified data sample further comprises:
receiving an unclassified data sample;
retrieving the multiclass classifier trained in step 5); and
generating a class prediction for the unclassified data sample by the trained multiclass classifier.
- A method of claim 13, wherein the neural network system comprises at least one of a deep belief network and a convolutional network.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2014/000825 WO2016033708A1 (en) | 2014-09-03 | 2014-09-03 | Apparatus and methods for image data classification |
CN201480081756.1A CN106687993B (en) | 2014-09-03 | 2014-09-03 | Device and method for image data classification |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016033708A1 true WO2016033708A1 (en) | 2016-03-10 |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107909151A (en) * | 2017-07-02 | 2018-04-13 | 小蚁科技(香港)有限公司 | For realizing the method and system of notice mechanism in artificial neural network |
CN109472274A (en) * | 2017-09-07 | 2019-03-15 | 富士通株式会社 | The training device and method of deep learning disaggregated model |
CN109753978A (en) * | 2017-11-01 | 2019-05-14 | 腾讯科技(深圳)有限公司 | Image classification method, device and computer readable storage medium |
CN109946669A (en) * | 2019-03-18 | 2019-06-28 | 西安电子科技大学 | Variant aircraft High Range Resolution restoration methods based on depth confidence network |
CN112765034A (en) * | 2021-01-26 | 2021-05-07 | 四川航天系统工程研究所 | Software defect prediction method based on neural network |
CN115797710A (en) * | 2023-02-08 | 2023-03-14 | 成都理工大学 | Neural network image classification performance improving method based on hidden layer feature difference |
CN116794975A (en) * | 2022-12-20 | 2023-09-22 | 维都利阀门有限公司 | Intelligent control method and system for electric butterfly valve |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767735A (en) * | 2019-03-26 | 2020-10-13 | 北京京东尚科信息技术有限公司 | Method, apparatus and computer readable storage medium for executing task |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050031219A1 (en) * | 2002-09-06 | 2005-02-10 | The Regents Of The University Of California | Encoding and decoding of digital data using cues derivable at a decoder |
US7831531B1 (en) * | 2006-06-22 | 2010-11-09 | Google Inc. | Approximate hashing functions for finding similar content |
CN103246893A (en) * | 2013-03-20 | 2013-08-14 | 西交利物浦大学 | ECOC (European Conference on Optical Communication) encoding classification method based on rejected random subspace |
CN103426004A (en) * | 2013-07-04 | 2013-12-04 | 西安理工大学 | Vehicle type recognition method based on error correction output code |
Also Published As
Publication number | Publication date |
---|---|
CN106687993B (en) | 2018-07-27 |
CN106687993A (en) | 2017-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2016033708A1 (en) | Apparatus and methods for image data classification | |
US10438091B2 (en) | Method and apparatus for recognizing image content | |
CN106599900B (en) | Method and device for recognizing character strings in image | |
Stewart et al. | End-to-end people detection in crowded scenes | |
CN105447498B (en) | Client device, system and server system configured with neural network | |
CN107330127B (en) | Similar text detection method based on text picture retrieval | |
RU2693916C1 (en) | Character recognition using a hierarchical classification | |
Charalampous et al. | On-line deep learning method for action recognition | |
Xi et al. | Deep prototypical networks with hybrid residual attention for hyperspectral image classification | |
CN110019652B (en) | Cross-modal Hash retrieval method based on deep learning | |
Liu et al. | Deep ordinal regression based on data relationship for small datasets |
Cohen et al. | DNN or k-NN: That is the Generalize vs. Memorize Question | |
CN107004140A (en) | Text recognition method and computer program product | |
Katiyar et al. | A hybrid recognition system for off-line handwritten characters | |
US20200074273A1 (en) | Method for training deep neural network (dnn) using auxiliary regression targets | |
Bansal et al. | mRMR-PSO: a hybrid feature selection technique with a multiobjective approach for sign language recognition | |
CN114730398A (en) | Data tag validation | |
CN109522432B (en) | Image retrieval method integrating adaptive similarity and Bayes framework | |
CN111898703A (en) | Multi-label video classification method, model training method, device and medium | |
Huang et al. | Accelerate learning of deep hashing with gradient attention | |
EP3876236A1 (en) | Extracting chemical structures from digitized images | |
Alam et al. | A multi-view convolutional neural network approach for image data classification | |
CN110199300A (en) | Fuzzy input for an autoencoder |
Kokkinos et al. | Breaking ties of plurality voting in ensembles of distributed neural network classifiers using soft max accumulations | |
Mitrović et al. | Flower classification with convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14901041; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 14901041; Country of ref document: EP; Kind code of ref document: A1 |