WO2016033708A1 - Apparatus and methods for image data classification - Google Patents

Apparatus and methods for image data classification

Info

Publication number
WO2016033708A1
Authority
WO
WIPO (PCT)
Prior art keywords
training data
target code
neural network
training
network system
Application number
PCT/CN2014/000825
Other languages
French (fr)
Inventor
Xiaoou Tang
Shuo YANG
Ping Luo
Chen Change Loy
Original Assignee
Xiaoou Tang
Application filed by Xiaoou Tang
Priority to PCT/CN2014/000825 (WO2016033708A1)
Priority to CN201480081756.1A (CN106687993B)
Publication of WO2016033708A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 - Arrangements for detecting or preventing errors in the information received
    • H04L 1/004 - Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L 1/0056 - Systems characterized by the type of code used
    • H04L 1/0057 - Block codes

Abstract

Disclosed is an apparatus for image data classification. The apparatus may comprise: a target code generator configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes and the generated target code has a dimension identical to the number of classes; a target prediction generator configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples; and a predictor configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction. A method for image data classification is also disclosed.

Description

APPARATUS AND METHODS FOR IMAGE DATA CLASSIFICATION
Technical Field
The present application generally relates to the field of target identification, and more particularly to an apparatus and a method for image data classification.
Background
Learning robust and invariant representation has been a long-standing goal in computer vision. In comparison to hand-crafted visual features, such as SIFT or HoG, features learned by deep models have recently been shown to be more capable of capturing abstract concepts invariant to various phenomena in the visual world, e.g. viewpoint, illumination, and clutter. Hence, an increasing number of studies are now exploring the use of deep representation on vision problems, particularly on classification tasks.
Rather than using deep models for direct classification, many vision studies choose to follow a multistage technique. This technique has been shown effective in combining the good invariance of deep features and the discriminative power of standard classifiers. Typically, they first learn a deep model, e.g. a convolutional neural network, in a supervised manner. The 1-of-K coding, containing vectors of length K with the k-th element set to one and the remaining elements set to zero, is used along with a softmax function for classification. Each element in a 1-of-K code essentially represents the probability of a specific class. Subsequently, the features of a raw image are extracted from the penultimate layer or shallower layers to form a high-dimensional feature vector as input to classifiers such as SVM.
In neural network training, prior arts often adopt a 1-of-K coding scheme. However, the discriminative hidden features formed in a neural network system trained with 1-of-K coding are limited, and the predictions generated by the neural network system do not have error-correcting capability. Thus, there is a need for a more effective target coding scheme with better performance in neural network training.
Summary
According to an embodiment of the present application, disclosed is an apparatus for data classification. The apparatus may comprise: a target code generator configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes; a target prediction generator configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples; and a predictor configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction.
According to another embodiment of the present application, disclosed is a method for data classification. The method may comprise: retrieving a plurality of training data samples, wherein the training data samples are grouped into different classes; generating a target code for each of the retrieved training data samples; for an unclassified data sample, generating a target prediction for the unclassified data sample; and predicting a class for the unclassified data sample based on the generated target code and the generated target prediction.
The present invention brings extra benefits to neural network training. On one hand, more discriminative hidden features can form in the neural network system. On the other hand, the predictions generated by the neural network system have error-correcting capability.
Brief Description of the Drawing
Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
Fig. 1 is a schematic diagram illustrating an apparatus for image data classification according to an embodiment of the present application.
Fig. 2 is a schematic diagram illustrating a target code generator according to an embodiment of the present application.
Fig. 3 is a schematic diagram illustrating an apparatus with a training unit according to another embodiment of the present application.
Fig. 4. is a schematic diagram illustrating the training unit according to another embodiment of the present application.
Fig. 5. is a schematic diagram illustrating a predictor according to an embodiment of the present application.
Fig. 6. is a schematic diagram illustrating a training unit according to another embodiment of the present application.
Fig. 7. is a schematic diagram illustrating a predictor according to another embodiment of the present application.
Fig. 8 is a schematic flowchart illustrating a method for image data classification according to an embodiment of the present application.
Fig. 9 is a schematic flowchart illustrating a process for generating a target code according to an embodiment of the present application.
Fig. 10 is a schematic flowchart illustrating a process for training a neural network system according to an embodiment of the present application.
Fig. 11 is a schematic flowchart illustrating a process for predicting a class for an unclassified data sample according to an embodiment of the present application.
Fig. 12 is a schematic flowchart illustrating a process for training a neural network system according to another embodiment of the present application.
Fig. 13 is a schematic flowchart illustrating a process for predicting a class for an unclassified data sample according to another embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When appropriate, the same reference numbers are used throughout the drawings to refer to the same or like parts. Fig. 1 is a schematic diagram illustrating an exemplary apparatus 1000 for data classification consistent with some disclosed embodiments.
It shall be appreciated that the apparatus 1000 may be implemented using certain hardware, software, or a combination thereof. In addition, the embodiments of the present application may take the form of a computer program product embodied on one or more computer readable storage media (including but not limited to disk storage, CD-ROM, optical memory and the like) containing computer program codes.
In the case that the apparatus 1000 is implemented with software, the apparatus 1000 can be run on one or more systems, which may include a general purpose computer, a computer cluster, a mainstream computer, a computing device dedicated to providing online contents, or a computer network comprising a group of computers operating in a centralized or distributed fashion.
Referring to Fig. 1 again, where the apparatus 1000 is implemented by hardware or by a combination of hardware and software, it may comprise a target code generator 100, a neural network system 200 and a predictor 300. In the embodiment shown in Fig. 1, the target code generator 100 may be configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes. The target prediction generator 200 may be configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples. In some embodiments, the target prediction generator 200 may comprise a neural network system. In some embodiments, the neural network system may comprise at least one of a deep belief network and a convolutional network. For example, the neural network may consist of convolutional filters, pooling layers, and locally or fully connected layers, which are well known in the art, and thus the detailed configurations thereof are omitted herein. The predictor 300 may be configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction.
Hereinafter the definition of a target code (or target coding) will be described. Let T be a set of integers, called the alphabet set. An element in T is called a symbol. For example, T = {0, 1} is a binary alphabet set. A target code S is a matrix S ∈ T^(n×l), wherein each row of a target code is called a codeword, l denotes the number of symbols in each codeword and n denotes the total number of codewords. The target code can be constructed with a deterministic method, which is built on the Hadamard matrix. For a target code S, we denote {α1, α2, …, αn} to be the set of empirical distributions of symbols in the rows of S, i.e., for i = 1, 2, …, n, αi is a vector of length |T|, with the t-th component of αi counting the number of occurrences of the t-th symbol in the i-th row of S. Similarly, we let {β1, β2, …, βl} be the set of empirical distributions of symbols in the columns of S. Given two distinct row indices i and i', the Hamming distance between row i and row i' of a target code S is defined as |{j : Sij ≠ Si'j}|, i.e., it counts the number of column indices at which the corresponding symbols in row i and row i' are not equal. For simplicity, we call it the pairwise Hamming distance.
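By way of illustration only (not part of the disclosed apparatus), a minimal Python sketch of these definitions is shown below; the helper names and the NumPy dependency are assumptions of this example:

```python
import numpy as np

def symbol_distributions(S, alphabet=(0, 1)):
    """Per-row (alpha_i) and per-column (beta_j) symbol counts of a target code S."""
    alpha = np.stack([(S == t).sum(axis=1) for t in alphabet], axis=1)  # shape (n, |T|)
    beta = np.stack([(S == t).sum(axis=0) for t in alphabet], axis=1)   # shape (l, |T|)
    return alpha, beta

def pairwise_hamming(S):
    """Hamming distance between every pair of codewords (rows of S)."""
    return (S[:, None, :] != S[None, :, :]).sum(axis=2)

S = np.eye(4, dtype=int)        # the 1-of-K code for K = 4 classes
alpha, beta = symbol_distributions(S)
print(alpha)                    # each row: 3 occurrences of '0' and 1 occurrence of '1'
print(pairwise_hamming(S))      # off-diagonal entries are all 2
```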
Table 1 shows an example of a 1-of-K target code, which is typically used in deep learning for representing K classes. Each of the K symbols, either '0' or '1', indicates the probability of a specific class. The target code here can be written as S = I, where I ∈ T^(K×K) is an identity matrix. It is easy to obtain some properties of the 1-of-K coding. For instance, for i = 1, 2, …, K, we have αi = (K−1, 1), i.e., each codeword contains K−1 symbols '0' and a single symbol '1', since only one symbol in each codeword has a value '1'. Similarly, for the columns we have βj = (K−1, 1) for j = 1, 2, …, K. The pairwise Hamming distance is two.
Table 1 [the 1-of-K target code; reproduced as an image in the original publication]
Instead of merely representing classes, the target coding can play additional roles, such as error correction or facilitating better feature representation. To enable these additional roles, a target code S fulfilling specific requirements should be constructed.
The specific requirements a good target code should satisfy will be introduced hereinafter. Generally, the specific requirements can be summarized in three aspects: uniformness in each column, redundancy in each row, and constant pairwise Hamming distance. Hereinafter, how to generate a target code as shown in Table 2, which is also referred to as the Balanced Code (BC) and denoted S_BC, will be described based on the above requirements.
Table 2 [the balanced target code S_BC; reproduced as an image in the original publication]
As shown in Fig. 2, the target code generator 100 further comprises a matrix generating module 110, a removing module 120, a changing module 130, and a selecting module 140.
The matrix generating module 110 is configured to generate a Hadamard matrix, wherein the entries of the Hadamard matrix are either "+1" or "-1", and the dimension of the Hadamard matrix is larger than the number of classes of the training data samples. Particularly, a square m×m matrix H whose entries are either '+1' or '-1' is called a Hadamard matrix if HH^T = mI. In some embodiments, we can use '+' to represent '+1' and '-' to represent '-1'. The definition of the Hadamard matrix requires that any pair of distinct rows, and any pair of distinct columns, are orthogonal, respectively. A possible way to generate the Hadamard matrix is by Sylvester's method (Hedayat and Wallis, 1978), where a new Hadamard matrix is produced from the old one by the Kronecker (or tensor) product. For example, given a Hadamard matrix H2 = [+ +; + -], we can produce H4 by H4 = H2 ⊗ H2, where ⊗ denotes the Kronecker product. Similarly, H8 is computed by H8 = H2 ⊗ H4 (equations 1-2).
The removing module 120 is configured to obtain S_BC ∈ T^((m-1)×(m-1)) by removing the first row and the first column of H. The changing module 130 is configured to change '+1' and '-1' in the resulting matrix to '0' and '1', respectively. The above formulation yields the balanced target code S_BC of size (m-1)×(m-1), with row sum m/2, column sum m/2, and a constant pairwise Hamming distance of m/2.
The selecting module 140 is configured to randomly select a plurality of rows of the changed Hadamard matrix as the target codes, wherein the number of rows is identical to that of the classes of the training data samples. In some embodiments, the target code may be represented as a vector. Particularly, the selecting module 140 is configured to randomly select c rows as balanced target codes for c classes, wherein each of the selected rows corresponds to one target code. In some embodiments, the class labels C_BC ∈ T^(K×(m-1)) are constructed by choosing K codewords randomly from S_BC ∈ T^((m-1)×(m-1)).
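As an illustrative sketch of the construction described above (Sylvester-type Hadamard generation, removal of the first row and column, the '+1'/'-1' to '0'/'1' mapping, and random row selection), the following Python code is one possible realization; the function names, the power-of-two assumption for m, and the NumPy dependency are assumptions of this sketch rather than requirements of the disclosure:

```python
import numpy as np

def hadamard(m):
    """Sylvester construction: H_{2n} = H_2 (kron) H_n, so m is assumed to be a power of two."""
    H = np.array([[1]])
    H2 = np.array([[1, 1], [1, -1]])
    while H.shape[0] < m:
        H = np.kron(H2, H)
    return H

def balanced_code(num_classes, m, seed=0):
    """Return one balanced codeword of length m-1 per class, as rows of a matrix."""
    assert m - 1 >= num_classes, "Hadamard dimension must exceed the number of classes"
    H = hadamard(m)
    S = H[1:, 1:]                      # remove the first row and the first column
    S = np.where(S == 1, 0, 1)         # map +1 -> 0 and -1 -> 1
    rng = np.random.default_rng(seed)
    rows = rng.choice(S.shape[0], size=num_classes, replace=False)
    return S[rows]                     # shape: (num_classes, m-1)

C = balanced_code(num_classes=10, m=64)
print(C.shape)   # (10, 63); every row and column of the full 63x63 code sums to m/2 = 32
```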
As shown in Fig. 3, the apparatus 1000’ according to another embodiment of the present application comprises a target code generator 100, a neural network system 200, a predictor 300, and a training unit 400. The functions of the target code generator 100, the neural network system 200, and the predictor 300 have been described with reference to Fig. 1, and thus will be omitted hereinafter. The training unit 400 is configured to train the neural  network system with the retrieved training data samples such that the trained neural network system is capable of applying the convolutional filters, pooling layers, and locally or fully connected layers to the retrieved training data samples to generate said target predictions. In some embodiments, the target prediction may be represented as a vector.
As shown in Fig. 4, the training unit 400 further comprises a drawing module 410, an error computing module 420, and a back-propagating module 430. The drawing module 410 is configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code, for example, based on a class label of the training data sample. For example, the above association with a ground-truth target code based on a class label may have a form in which class label = '1' corresponds to target code = '1010101', and class label = '2' corresponds to target code = '0110011'. In some embodiments, the target code may be a ground-truth target code. The error computing module 420 is configured to compute an error, such as a Hamming distance, between the generated target prediction of the training data sample and the ground-truth target code. The back-propagating module 430 is configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system. In order to obtain a convergent result, the drawing module, the error computing module and the back-propagating module repeat the processes of drawing, computing and back-propagating until the error is less than a predetermined value.
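A minimal training-loop sketch corresponding to the drawing, error-computing and back-propagating modules is given below. It assumes PyTorch, uses dummy data, and substitutes a differentiable mean-squared error for the Hamming-distance error, since a Hamming distance cannot be back-propagated directly; the network architecture and all hyper-parameters are illustrative only:

```python
import torch
import torch.nn as nn

code_len, num_classes = 63, 10
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
                    nn.Linear(256, code_len), nn.Sigmoid())
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.MSELoss()            # differentiable surrogate for the Hamming-distance error
codes = torch.randint(0, 2, (num_classes, code_len)).float()  # ground-truth target codes
threshold = 0.05                  # the "predetermined value" for the error

# Dummy training set: 100 random images with random class labels, purely illustrative.
images = torch.randn(100, 3, 32, 32)
labels = torch.randint(0, num_classes, (100,))

for step in range(10000):                        # repeat drawing / computing / back-propagating
    idx = torch.randint(0, images.shape[0], (8,))  # draw a batch of training samples
    target = codes[labels[idx]]                  # ground-truth codeword of each drawn sample
    pred = net(images[idx])                      # target prediction from the network
    loss = loss_fn(pred, target)                 # error between prediction and codeword
    optimizer.zero_grad()
    loss.backward()                              # back-propagate the computed error
    optimizer.step()                             # adjust weights on the connections
    if loss.item() < threshold:                  # stop once the error is below the threshold
        break
```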
As shown in Fig. 5, the predictor 300 further comprises a distance computing module 310 and an assigning module 320. The distance computing module 310 is configured to compute Hamming distances between a target prediction of an unclassified data sample and the corresponding ground-truth target code of each class of the training samples. Since both the target prediction and the ground-truth target code are vectors of the same length, the distance between the target prediction and the ground-truth target code can be determined by calculating the Hamming distance. For example, if the target prediction is '1110111' and the ground-truth target code is '1010101', the Hamming distance is determined by counting the number of positions at which the corresponding values differ. In this example, the Hamming distance is 2. The assigning module 320 is configured to assign the unclassified data sample to a class corresponding to the minimum Hamming distance among the computed Hamming distances. That is to say, if the unclassified data sample is closest to a particular class (based on the Hamming distance between its target prediction and the ground-truth target code of that class), then the unclassified data sample is considered to belong to that class.
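A possible sketch of the distance computing module 310 and the assigning module 320 is shown below, assuming the target prediction is first binarized at 0.5 (an assumption of this example, not stated in the disclosure), and using the '1010101' codeword from the example above together with a second, hypothetical codeword:

```python
import numpy as np

def predict_class(target_prediction, class_codes):
    """Assign the class whose ground-truth codeword is nearest in Hamming distance."""
    bits = (np.asarray(target_prediction) >= 0.5).astype(int)  # binarize the prediction
    distances = (class_codes != bits).sum(axis=1)              # Hamming distance per class
    return int(np.argmin(distances))

class_codes = np.array([[1, 0, 1, 0, 1, 0, 1],   # ground-truth code of class 0 ('1010101')
                        [0, 1, 0, 1, 0, 1, 0]])  # hypothetical code of class 1
print(predict_class([1, 1, 1, 0, 1, 1, 1], class_codes))  # -> 0 (distances 2 vs 5)
```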
As shown in Fig. 6, the training unit 400’ according to another embodiment of the present application comprises a drawing module 410, an error computing module 420, a back-propagating module 430, and an extracting module 440. The drawing module 410 may be configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code, for example, based on a class label of the training data sample. The error computing module 420 may be configured to compute an error, such as a Hamming distance, between the generated target prediction of the training data sample and the ground-truth target code. The back-propagating module 430 may be configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system. The drawing module 410, the error computing module 420 and the back-propagating module 430 repeat the processes of drawing, computing and back-propagating until the error is less than a predetermined value. The extracting module 440 may be configured to extract hidden layer features from the penultimate layer of the neural network system and to train a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples, after the error is less than the predetermined value. Particularly, the hidden layer features are used as the training input of the multiclass classifier, the class labels are used as the training target of the multiclass classifier, and the training input and the training target are used to train the multiclass classifier by optimizing the classifier's objective function. Given an unclassified data sample, its hidden layer features may be extracted by the trained neural network system and then fed into the multiclass classifier. Then, the multiclass classifier may output a class prediction for the unclassified data sample.
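For this multistage variant, one possible sketch of training and applying the multiclass classifier on the extracted penultimate-layer features is given below; the use of scikit-learn's LinearSVC and the random placeholder features are assumptions of this example:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Placeholder penultimate-layer features: in practice these would be extracted from
# the trained neural network system for each training sample.
hidden_features = np.random.randn(200, 256)
class_labels = np.random.randint(0, 10, size=200)

clf = LinearSVC()                       # an example multiclass classifier (linear SVM)
clf.fit(hidden_features, class_labels)  # hidden features = training input, labels = target

test_features = np.random.randn(1, 256) # features of an unclassified data sample
print(clf.predict(test_features))       # class prediction for the unclassified sample
```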
As shown in Fig. 7, the predictor 300’ according to another embodiment of the  present application comprises a receiving module 340, a retrieving module 350, and a prediction generating module 360. The receiving module 340 may be configured to receive an unclassified data sample. The retrieving module 350 may be configured to retrieve the trained multiclass classifier from the training unit. The prediction generating module 360 may be configured to generate a class prediction for the unclassified data sample by the trained multiclass classifier.
Fig. 8 is a schematic flowchart illustrating a method 2000 for data classification. Hereinafter, the method 2000 may be described in detail with respect to Fig. 8.
At step S210, a plurality of training data samples is retrieved and a target code for each of the retrieved training data samples is generated by a target code generator, wherein the training data samples are grouped into different classes.
At step S220, for an unclassified data sample, a target prediction for the unclassified data sample is generated by a neural network system. In some embodiments, as stated in the above, the neural network system may consist of multiple layers of convolutional filters, pooling layers, and locally or fully connected layers. In some embodiments, the neural network system may comprise at least one of a deep belief network and a convolutional network. In some embodiments, the method further comprises a step S240 of training the neural network system with the retrieved training data samples such that the trained neural network system is capable of applying the convolutional filters, pooling layers, and locally or fully connected layers to the retrieved training data samples to generate said target predictions.
As shown in Fig. 9, the step S210 of generating a target code comprises the following steps. To be specific, at step S310, a Hadamard matrix whose entries are either "+1" or "-1" is generated. At step S320, a first row and a first column of the Hadamard matrix are removed. At step S330, "+1" is changed to "0" and "-1" is changed to "1". At step S340, a number of rows of the changed Hadamard matrix are randomly selected as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
And then the method 2000 proceeds with step S230, at which a class for an  unclassified data sample is predicted by a predictor based on the generated target code and the generated target prediction.
As shown in Fig. 10, in the case of following the nearest neighbor classification paradigm, the step S240 of training a neural network system comprises following steps.
At step S410, a training data sample is drawn from a predetermined training set, wherein the training data sample is associated with a corresponding target code, particularly a ground-truth target code, for example, based on a class label of the training data sample. For example, above association with a ground truth target code based on a class label may have a form in which class label = ‘1’ , target code = ‘1010101’ , and class label = ‘2’ , target code = ‘0110011’ .
At step S420, an error such as a Hamming distance between the generated target prediction and the ground-truth target code is computed.
At step S430, the computed error is back-propagated through the neural network system so as to adjust weights on connections between neurons of the neural network system.
At step S440, the steps S410-S430 are repeated until the error is less than a predetermined value, i.e., until the training process has converged.
As shown in Fig. 11, in the case of following the nearest neighbor classification paradigm, the step S230 of predicting a class for an unclassified data sample comprises following steps.
At step S510, an unclassified data sample is received.
At step S520, Hamming distances between a target prediction of the unclassified data sample and the corresponding ground-truth target code of each class of the training samples are computed. As discussed above, since both the target prediction and the ground-truth target code are vectors of the same length, the distance between the target prediction and the ground-truth target code can be computed by calculating the Hamming distance. For example, if the target prediction is '1110111' and the ground-truth target code is '1010101', the Hamming distance is computed by counting the number of positions at which the corresponding values differ. In this example, the Hamming distance is 2.
At step S530, the unclassified data sample is assigned to a class corresponding to the minimum Hamming distance among the computed Hamming distances. That is to say, if the unclassified data sample is closest to a particular class (based on the Hamming distance between its target prediction and the ground-truth target code of that class), then the unclassified data sample is considered to belong to that class.
As shown in Fig. 12, according to another embodiment of the present application, in the case of following the multistage paradigm, the step S240’ of training a neural network system further comprises following steps.
At step S410, a training data sample is drawn from a predetermined training set, wherein the training data sample is associated with a corresponding target code, particularly a ground-truth target code, for example, based on a class label of the training data sample.
At step S420, an error between the generated target prediction and the ground-truth target code is computed.
At step S430, the computed error is back-propagated through the neural network system so as to adjust weights on connections between neurons of the neural network system.
At step S440’, it is determined whether the error is less than a predetermined value, i.e., whether the training process has converged. If not, the steps S410-S430 are repeated; otherwise, the method proceeds with step S450 of extracting hidden layer features from the penultimate layer of the neural network system and training a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples. Particularly, the hidden layer features are used as the training input of the multiclass classifier, the class labels are used as the training target of the multiclass classifier, and the training input and the training target are used to train the multiclass classifier by optimizing the classifier's objective function. Given an unclassified data sample, its hidden layer features may be extracted by the trained neural network system and then fed into the multiclass classifier. Then, the multiclass classifier may output a class prediction for the unclassified data sample.
As shown in Fig. 13, according to another embodiment of the present application, in the case of following the multistage paradigm, the step S230’ of predicting the class for an unclassified data sample comprises following steps.
At step S540, an unclassified data sample is received.
At step S550, the multiclass classifier trained in step S450 is retrieved.
At step S560, a class prediction is generated for the unclassified data sample by the trained multiclass classifier.
The present application provides a neural network system with a balanced target coding unit to represent the target codes of different data classes. Such target codes are employed in the learning of a neural network along with a predetermined set of training data.
Prior arts often adopt a 1-of-K coding scheme in neural network training. In contrast to the conventional 1-of-K coding scheme, the balanced coding unit brings extra benefits to neural network training. On one hand, more discriminative hidden features can form in the neural network system. On the other hand, the predictions generated by the neural network system have error-correcting capability.
Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon knowing the basic inventive concept. The appended claims are intended to cover the preferred examples and all variations or modifications that fall within the scope of the present invention.
Interestingly, even in just a two-dimensional embedding space, the features induced by Balanced Code-based learning are already easily separable. In contrast, the feature clusters induced by 1-of-K are overlapping, such that separation of such clusters may only be possible at higher dimensions. By replacing 1-of-K with the Balanced Code in deep feature learning, some classes, which are confused in 1-of-K coding, can be separated. A longer balanced code leads to more separable and distinct feature clusters.
Obviously, those skilled in the art can make variations or modifications to the present invention without departing from the spirit and scope of the present invention. As such, if these variations or modifications fall within the scope of the claims and their equivalents, they also fall within the scope of the present invention.

Claims (20)

  1. An apparatus for image data classification, comprising:
    a target code generator configured to retrieve a plurality of training data samples and to generate a target code for each of the retrieved training data samples, wherein the training data samples are grouped into different classes;
    a target prediction generator configured to receive a plurality of arbitrary data samples and to generate a target prediction for each of the received arbitrary data samples; and
    a predictor configured to predict a class for each of the received arbitrary data samples based on the generated target code and the generated target prediction.
  2. An apparatus of claim 1, wherein the target code generator further comprises:
    a matrix generating module configured to generate a Hadamard matrix, wherein entries of the Hadamard matrix are either “+1” or “-1” , and a dimension of the Hadamard matrix is larger than the number of classes of the training data samples;
    a removing module configured to remove a first row and a first column of the Hadamard matrix;
    a changing module configured to change “+1” and “-1” in the Hadamard matrix to “0” and “1” , respectively; and
    a selecting module configured to randomly select a plurality of rows of the changed Hadamard matrix as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
  3. An apparatus of claim 2, wherein the prediction generator comprises a neural network system, and
    wherein the apparatus further comprises a training unit configured to train the neural network system with the retrieved training data samples such that the trained neural network  system is capable of generating said target predictions.
  4. An apparatus of claim 3, wherein the target code is a ground-truth target code.
  5. An apparatus of claim 4, wherein the training unit further comprises:
    a drawing module configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
    an error computing module configured to compute an error between the generated target prediction of the training data sample and the ground-truth target code; and
    a back-propagating module configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system,
    wherein the drawing module, the error computing module and the back-propagating module repeat processes of drawing, computing and back-propagating until the error is less than a predetermined value.
  6. An apparatus of claim 5, wherein the predictor further comprises:
    a receiving module configured to receive an unclassified data sample;
    a distance computing module configured to compute Hamming distances between a target prediction of an unclassified data sample and the corresponding ground-truth target code of each class of the training samples; and
    an assigning module configured to assign the unclassified data sample to a class corresponding to the minimum Hamming distance among the computed Hamming distances.
  7. An apparatus of claim 4, wherein the training unit further comprises:
    a drawing module configured to draw a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
    an error computing module configured to compute an error between the generated target prediction of the training data sample and the ground-truth target code;
    a back-propagating module configured to back-propagate the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system; and
    an extracting module configured to extract hidden layer features from a penultimate layer of the neural network system and train a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples, after the error is less than a predetermined value,
    wherein the drawing module, the error computing module and the back-propagating module repeat processes of drawing, computing and back-propagating until the error is less than a predetermined value.
  8. An apparatus of claim 5 or 7, wherein the error is a Hamming distance.
  9. An apparatus of claim 7, wherein the predictor further comprises:
    a receiving module configured to receive an unclassified data sample;
    a retrieving module configured to retrieve the trained multiclass classifier from the training unit;
    a prediction generating module configured to generate a class prediction for the unclassified data sample by the trained multiclass classifier.
  10. An apparatus of claim 3, wherein the neural network system comprises at least one of a deep belief network and a convolutional network.
  11. A method for image data classification, comprising:
    retrieving a plurality of training data samples, wherein the training data samples are grouped into different classes;
    generating a target code for each of the retrieved training data samples;
    for an unclassified data sample, generating a target prediction for the unclassified data sample; and
    predicting a class for the unclassified data sample based on the generated target code and the generated target prediction.
  12. A method of claim 11, wherein the step of generating a target code comprises:
    generating a Hadamard matrix, wherein entries of the Hadamard matrix are either “+1” or “-1” , and a dimension of the Hadamard matrix is larger than the number of classes of the training data samples;
    removing a first row and a first column of the Hadamard matrix;
    changing “+1” and “-1” in the Hadamard matrix to “0” and “1” , respectively; and
    randomly selecting a plurality of rows of the changed Hadamard matrix as the target codes, wherein the number of the selected rows is identical to that of the classes of the training data samples and each of the selected rows corresponds to one target code.
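
    As an illustrative, non-limiting sketch of the target code generation in claim 12, the Python fragment below builds a Hadamard matrix with SciPy, removes its first row and column, maps +1/-1 to 0/1, and samples one row per class; the function name and the choice of the smallest power-of-two matrix size exceeding the class count are assumptions.

    import numpy as np
    from scipy.linalg import hadamard

    def generate_target_codes(num_classes, seed=0):
        rng = np.random.default_rng(seed)
        n = 1
        while n <= num_classes:                  # dimension must exceed the number of classes
            n *= 2
        H = hadamard(n)                          # entries are +1 / -1
        H = H[1:, 1:]                            # remove first row and first column
        H = (1 - H) // 2                         # map +1 -> 0 and -1 -> 1
        rows = rng.choice(H.shape[0], size=num_classes, replace=False)
        return H[rows]                           # one (n-1)-bit target code per class

    codes = generate_target_codes(num_classes=10)
    print(codes.shape)                           # -> (10, 15) for a 16x16 Hadamard matrix
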
  13. A method of claim 12, wherein the target prediction is generated by a neural network system, the method further comprises training the neural network system with the retrieved training data samples such that the trained neural network system is capable of generating said target predictions.
  14. A method of claim 13, wherein the target code is a ground-truth target code.
  15. A method of claim 14, wherein the step of training a neural network system comprises:
    1) drawing a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
    2) computing an error between the generated target prediction of the training data sample and the ground-truth target code;
    3) back-propagating the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system; and
    4) repeating steps 1)-3) until the error is less than a predetermined value.
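
    A minimal PyTorch-style sketch of this training loop follows; it is not the filed embodiment. Because a Hamming distance (claim 18) is not differentiable, the sketch substitutes a per-bit binary cross-entropy as the error that is back-propagated, and net, samples and ground_truth_codes are assumed to be defined elsewhere.

    import torch

    def train_target_code_network(net, samples, ground_truth_codes, threshold=0.05, lr=0.01):
        optimizer = torch.optim.SGD(net.parameters(), lr=lr)
        loss_fn = torch.nn.BCEWithLogitsLoss()        # differentiable surrogate for the per-bit error
        error = float("inf")
        while error > threshold:                      # 4) repeat until error < predetermined value
            for x, code in zip(samples, ground_truth_codes):
                pred = net(x.unsqueeze(0))            # 1) draw a sample, generate its target prediction
                loss = loss_fn(pred, code.unsqueeze(0))   # 2) error vs. ground-truth target code
                optimizer.zero_grad()
                loss.backward()                       # 3) back-propagate through the network
                optimizer.step()                      # adjust weights on the connections
                error = loss.item()
        return net
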
  16. A method of claim 15, wherein the step of predicting a class for an unclassified data sample comprises:
    receiving an unclassified data sample;
    computing Hamming distances between the target prediction of the unclassified data sample and the ground-truth target code of each class of the training data samples; and
    assigning the unclassified data sample to a class corresponding to the minimum Hamming distance among the computed Hamming distances.
  17. A method of claim 14, wherein the step of training a neural network system further comprises:
    1) drawing a training data sample from the training data samples, wherein each of the training data samples is associated with a corresponding ground-truth target code;
    2) computing an error between the generated target prediction of the training data sample and the ground-truth target code;
    3) back-propagating the computed error through the neural network system so as to adjust weights on connections between neurons of the neural network system;
    4) determining whether the error is larger than a predetermined value,
    if yes, repeating steps 1)-3),
    if no, proceeding with 5) extracting hidden layer features from the penultimate layer of the neural network system and training a multiclass classifier based on the extracted hidden layer features and class labels of the training data samples.
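
    The following sketch is offered only as an assumption of how step 5) might look in practice: it extracts penultimate-layer activations from a converged network and fits a separate multiclass classifier on them. Scikit-learn's LinearSVC stands in for the unspecified classifier, and net.penultimate() is a hypothetical hook returning hidden layer features.

    import numpy as np
    from sklearn.svm import LinearSVC

    def fit_multiclass_classifier(net, samples, class_labels):
        # Hidden layer features taken from the penultimate layer of the trained network
        features = np.stack([net.penultimate(x) for x in samples])
        clf = LinearSVC()
        clf.fit(features, class_labels)               # train on extracted features and class labels
        return clf

    # Prediction for an unclassified sample (cf. claims 9 and 19):
    #   clf.predict(net.penultimate(new_sample).reshape(1, -1))
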
  18. A method of claim 15 or 17, wherein the error is a Hamming distance.
  19. A method of claim 17, wherein the step of predicting the class for an unclassified data sample further comprises:
    receiving an unclassified data sample;
    retrieving the multiclass classifier trained in step 5); and
    generating a class prediction for the unclassified data sample by the trained multiclass classifier.
  20. A method of claim 13, wherein the neural network system comprises at least one of a deep belief network and a convolutional network.
PCT/CN2014/000825 2014-09-03 2014-09-03 Apparatus and methods for image data classification WO2016033708A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2014/000825 WO2016033708A1 (en) 2014-09-03 2014-09-03 Apparatus and methods for image data classification
CN201480081756.1A CN106687993B (en) 2014-09-03 2014-09-03 Device and method for image data classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/000825 WO2016033708A1 (en) 2014-09-03 2014-09-03 Apparatus and methods for image data classification

Publications (1)

Publication Number Publication Date
WO2016033708A1 (en) 2016-03-10

Family

ID=55438961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/000825 WO2016033708A1 (en) 2014-09-03 2014-09-03 Apparatus and methods for image data classification

Country Status (2)

Country Link
CN (1) CN106687993B (en)
WO (1) WO2016033708A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767735A (en) * 2019-03-26 2020-10-13 北京京东尚科信息技术有限公司 Method, apparatus and computer readable storage medium for executing task

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050031219A1 (en) * 2002-09-06 2005-02-10 The Regents Of The University Of California Encoding and decoding of digital data using cues derivable at a decoder
US7831531B1 (en) * 2006-06-22 2010-11-09 Google Inc. Approximate hashing functions for finding similar content
CN103246893A (en) * 2013-03-20 2013-08-14 西交利物浦大学 ECOC (error-correcting output code) encoding classification method based on rejected random subspace
CN103426004A (en) * 2013-07-04 2013-12-04 西安理工大学 Vehicle type recognition method based on error correction output code

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107909151A (en) * 2017-07-02 2018-04-13 小蚁科技(香港)有限公司 Method and system for implementing an attention mechanism in an artificial neural network
CN107909151B (en) * 2017-07-02 2020-06-02 小蚁科技(香港)有限公司 Method and system for implementing an attention mechanism in an artificial neural network
CN109472274B (en) * 2017-09-07 2022-06-28 富士通株式会社 Training device and method for deep learning classification model
CN109472274A (en) * 2017-09-07 2019-03-15 富士通株式会社 Training device and method for deep learning classification model
CN109753978A (en) * 2017-11-01 2019-05-14 腾讯科技(深圳)有限公司 Image classification method, device and computer readable storage medium
CN109753978B (en) * 2017-11-01 2023-02-17 腾讯科技(深圳)有限公司 Image classification method, device and computer readable storage medium
CN109946669A (en) * 2019-03-18 2019-06-28 西安电子科技大学 Method for recovering high-resolution range profile of morphing aircraft based on deep belief network
CN109946669B (en) * 2019-03-18 2022-12-02 西安电子科技大学 Method for recovering high-resolution range profile of morphing aircraft based on deep belief network
CN112765034A (en) * 2021-01-26 2021-05-07 四川航天系统工程研究所 Software defect prediction method based on neural network
CN112765034B (en) * 2021-01-26 2023-11-24 四川航天系统工程研究所 Software defect prediction method based on neural network
CN116794975A (en) * 2022-12-20 2023-09-22 维都利阀门有限公司 Intelligent control method and system for electric butterfly valve
CN116794975B (en) * 2022-12-20 2024-02-02 维都利阀门有限公司 Intelligent control method and system for electric butterfly valve
CN115797710A (en) * 2023-02-08 2023-03-14 成都理工大学 Neural network image classification performance improving method based on hidden layer feature difference
CN115797710B (en) * 2023-02-08 2023-04-07 成都理工大学 Neural network image classification performance improving method based on hidden layer feature difference

Also Published As

Publication number Publication date
CN106687993B (en) 2018-07-27
CN106687993A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
WO2016033708A1 (en) Apparatus and methods for image data classification
US10438091B2 (en) Method and apparatus for recognizing image content
CN106599900B (en) Method and device for recognizing character strings in image
Stewart et al. End-to-end people detection in crowded scenes
CN105447498B (en) Client device, system and server system configured with neural network
CN107330127B (en) Similar text detection method based on text picture retrieval
RU2693916C1 (en) Character recognition using a hierarchical classification
Charalampous et al. On-line deep learning method for action recognition
Xi et al. Deep prototypical networks with hybrid residual attention for hyperspectral image classification
CN110019652B (en) Cross-modal Hash retrieval method based on deep learning
Liu et al. Deep ordinal regression based on data relationship for small datasets.
Cohen et al. DNN or k-NN: That is the Generalize vs. Memorize Question
CN107004140A (en) Text recognition method and computer program product
Katiyar et al. A hybrid recognition system for off-line handwritten characters
US20200074273A1 (en) Method for training deep neural network (dnn) using auxiliary regression targets
Bansal et al. mRMR-PSO: a hybrid feature selection technique with a multiobjective approach for sign language recognition
CN114730398A (en) Data tag validation
CN109522432B (en) Image retrieval method integrating adaptive similarity and Bayes framework
CN111898703A (en) Multi-label video classification method, model training method, device and medium
Huang et al. Accelerate learning of deep hashing with gradient attention
EP3876236A1 (en) Extracting chemical structures from digitized images
Alam et al. A multi-view convolutional neural network approach for image data classification
CN110199300A (en) Indistinct Input for autocoder
Kokkinos et al. Breaking ties of plurality voting in ensembles of distributed neural network classifiers using soft max accumulations
Mitrović et al. Flower classification with convolutional neural networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14901041

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14901041

Country of ref document: EP

Kind code of ref document: A1