US8463045B2 - Hierarchical sparse representation for image retrieval - Google Patents
- Publication number: US8463045B2
- Authority
- US
- United States
- Prior art keywords
- image
- level
- features
- feature
- nodal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
Definitions
- CBIR Content-based image retrieval
- Some websites or search engines offer content-based image search services to Internet users. Specifically, a user submits a query image which is similar to his/her desired image to a website or search engine that provides CBIR services. Based on the query image, the website or search engine subsequently returns one or more stored images to the user.
- the website or search engine represents or encodes the stored images in terms of image features. The website or search engine compares the image features of the stored images with image features of the query image, and retrieves one or more stored images that have image features similar to the image features of the query image.
- This application describes example techniques for generating a hierarchical sparse codebook.
- training image features are received.
- a hierarchical sparse codebook is then generated based at least upon the training image features.
- the generated hierarchical sparse codebook includes multiple levels, with each level being associated with a sparseness factor.
- FIG. 1 illustrates an exemplary environment including an example hierarchical sparse coding system 110 .
- FIG. 2 illustrates the example hierarchical sparse coding system 110 of FIG. 1 in more detail.
- FIG. 3 illustrates a first example hierarchical sparse codebook.
- FIG. 4 illustrates an exemplary method of generating a hierarchical sparse codebook.
- FIG. 5 illustrates a second example hierarchical sparse codebook.
- FIG. 6 illustrates an exemplary method of representing an image using a hierarchical sparse codebook.
- This disclosure describes a hierarchical sparse coding using a hierarchical sparse codebook.
- the described codebook includes multiple levels.
- the described codebook allows a gradual determination/classification of an image feature into one or more groups or nodes by traversing the image feature through one or more paths to the one or more groups or nodes. That is, the described codebook compares an image feature of an image with nodes or nodal features of the nodes, beginning from a root level down to a leaf level of the codebook. Furthermore, the image feature is only compared with a subset of nodes at each level of the codebook, and therefore processing time is significantly reduced relative to existing image search strategies.
- the number of determined/classified groups for the image feature is small/sparse in comparison with the total number of available groups or nodes in the codebook.
- Using the described codebook allows an efficient determination or classification of an image feature, and therefore provides an efficient and time-saving way of representing an image in terms of image features.
- image retrieval can be enhanced by comparing extracted features of an image with the codebook to obtain a representation of the image that can be used as an index or a reference for retrieving one or more stored images in a database.
- FIG. 1 illustrates an exemplary environment 100 usable to implement hierarchical sparse representation for image retrieval.
- the environment 100 includes one or more users 102 - 1 , 102 - 2 , . . . 102 -N (which are collectively referred to as 102 ), a search engine 104 , a website 106 , an image database 108 , a hierarchical sparse coding system 110 , and a network 112 .
- the user 102 communicates with the search engine 104 , the website 106 or the hierarchical sparse coding system 110 through the network 112 using one or more devices 114 - 1 , 114 - 2 , . . . 114 -M, which are collectively referred to as 114 .
- the devices 114 may be implemented as a variety of conventional computing devices including, for example, a server, a desktop personal computer, a notebook or portable computer, a workstation, a mainframe computer, a mobile computing device, a handheld device, a mobile phone, an Internet appliance, a network router, etc. or a combination thereof.
- the network 112 may be a wireless or a wired network, or a combination thereof.
- the network 112 may be a collection of individual networks interconnected with each other and functioning as a single large network (e.g., the Internet or an intranet). Examples of such individual networks include, but are not limited to, Local Area Networks (LANs), Wide Area Networks (WANs), and Metropolitan Area Networks (MANs). Further, the individual networks may be wireless or wired networks, or a combination thereof.
- the device 114 includes a processor 116 coupled to a memory 118 .
- the memory 118 includes a browser 120 and other program data 122 .
- the memory 118 may be coupled to or associated with, and/or accessible to other devices, such as network servers, router, and/or other devices 114 .
- the user 102 uses the browser 120 of the device 114 to submit an image query to the search engine 104 or the website 106 .
- the search engine 104 or the website 106 compares the query image with images stored in the image database 108 and retrieves one or more stored images from the image database 108 using a hierarchical sparse codebook that is generated by the hierarchical sparse coding system 110 .
- the search engine 104 or the website 106 then presents the one or more stored images to the user 102 .
- the hierarchical sparse coding system 110 generates a hierarchical sparse codebook using images stored in the image database 108 either upon request from the search engine 104 or the website 106 , or on a regular basis.
- the hierarchical sparse coding system 110 encodes or represents an image received from the user 102 , the search engine 104 or the website 106 based on the hierarchical sparse codebook.
- the hierarchical sparse coding system 110 may return a representation of the received image to the user 102 , the search engine 104 or the website 106 .
- the hierarchical sparse coding system 110 may store the representation of the received image or send the image representation to the image database 108 for storage. This image representation may further be stored as an index or a reference for the received image in the image database 108 .
- FIG. 2 illustrates various components of the exemplary hierarchical sparse coding system 110 in more detail.
- the system 110 can include, but is not limited to, a processor 202 , a network interface 204 , a system memory 206 , and an input/output interface 208 .
- the memory 206 includes computer-readable media in the form of volatile memory, such as Random Access Memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash RAM.
- the memory 206 includes program modules 210 and program data 212 .
- the program data 212 may include a hierarchical sparse codebook 214 and other program data 216 .
- the memory 206 may further include a feature database 218 storing training image features that are used for generating the hierarchical sparse codebook 214 .
- the hierarchical sparse codebook 214 may include a hierarchical tree.
- FIG. 3 shows an example of a hierarchical sparse codebook 214 in a form of a hierarchical tree.
- the hierarchical codebook may comprise L number of levels, including a root level 302 - 1 , one or more intermediate levels 302 - 2 , . . . 302 -(L−1), and a leaf level 302 -L.
- Each node of the root level and the one or more intermediate levels may include K number of child nodes.
- Each node of the hierarchical codebook is associated with a nodal feature.
- a nodal feature is a trained image feature associated with a node of the hierarchical codebook.
- the nodal feature may be in a form of a vector, for example. Additionally, each node may further be assigned a subset of the training image features.
- each level of the hierarchical sparse codebook is associated with a sparseness factor to determine a degree of sparseness for each level.
- a degree of sparseness for a level is defined as an average number of nodes or nodal features used to represent each training image feature at that level divided by the total number of nodal features at that same level.
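As a worked illustration of this definition (the function name is an illustrative choice, not from the patent):

```python
def degree_of_sparseness(nodes_used_per_feature, total_nodes_at_level):
    """Average number of nodal features used to represent each training image
    feature at a level, divided by the total number of nodal features at that
    same level (the definition given above)."""
    avg_used = sum(nodes_used_per_feature) / len(nodes_used_per_feature)
    return avg_used / total_nodes_at_level
```

For instance, if three training image features are represented by 2, 4, and 3 of a level's 30 nodal features, the degree of sparseness for that level is ((2+4+3)/3)/30 = 0.1.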
- the program module 210 may further include an image receiving module 220 .
- the image receiving module 220 may receive an image from the user 102 , the search engine 104 or the website 106 .
- the image may be a query image that the user 102 uses to find his/her desired image(s).
- the image receiving module 220 may transfer the image to a feature extraction module 222 , which extracts features that are representative of the image.
- the feature extraction module 222 may adopt one or more feature extraction techniques such as singular value decomposition (SVD), Bag of Visual Words (BoW), etc. Examples of the features include, but are not limited to, scale-invariant feature transform (SIFT) features and intensity histograms.
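As a minimal, self-contained sketch of the simplest feature type mentioned above, an intensity histogram of a grayscale image (the function name and default bin count are assumptions, not details from the patent):

```python
import numpy as np

def intensity_histogram(image, bins=16):
    """Normalized intensity histogram of a grayscale image, usable as a simple
    feature vector."""
    image = np.asarray(image, dtype=float)
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 256.0))
    return hist / max(hist.sum(), 1)  # entries sum to one for a non-empty image
```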
- the feature extraction module 222 may send the extracted features to a feature determination module 224 , the feature database 218 , or both.
- the feature determination module 224 determines one or more leaf nodes of the hierarchical sparse codebook 214 to represent each extracted feature. Specifically, the feature determination module 224 compares each extracted feature with nodal features associated with a subset of nodes of the hierarchical sparse codebook 214 level by level.
- Table 1 shows a first example algorithm for representing an image using the hierarchical sparse codebook 214 .
- the hierarchical sparse codebook 214 in FIG. 3 is one example.
- the feature determination module 224 compares the extracted feature with each nodal feature associated with each node at the next level 302 - 2 , i.e., level 1 in Table 1.
- the feature determination module 224 may employ a distance measurement module 226 to determine a distance or a degree of overlap between the extracted feature and each nodal feature.
- the distance measurement module 226 may measure the distance or the degree of overlap according to a predetermined distance metric. For example, if features (i.e., the extracted feature and the nodal feature) are expressed in terms of feature vectors, the predetermined distance metric may include computing a normalized Lp-distance between the extracted feature and the nodal feature, where p can be any integer greater than zero.
- the predetermined distance metric may include computing a normalized L2-distance (i.e., Euclidean distance) or a normalized L1-distance (i.e., Manhattan distance) between the extracted feature and the nodal feature.
- the predetermined distance metric may include computing an inner product of the extracted feature and the nodal feature to determine a degree of overlap therebetween.
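These metrics might be sketched as follows, reading "normalized Lp-distance" as the Lp distance between Lp-normalized feature vectors (one plausible interpretation; the patent does not pin the normalization down):

```python
import numpy as np

def normalized_lp_distance(a, b, p=2):
    """Lp distance between Lp-normalized vectors (p=1: Manhattan, p=2: Euclidean)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a, ord=p)
    b = b / np.linalg.norm(b, ord=p)
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def overlap(a, b):
    """Inner product of two feature vectors as a degree of overlap."""
    return float(np.dot(a, b))
```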
- the feature determination module 224 may select a node at level 302 - 2 whose parent has a distance from the extracted feature that is less than a predetermined distance threshold (e.g., 0.2). Alternatively, the feature determination module 224 may select a node at level 302 - 2 whose parent has a degree of overlap with the extracted feature that is greater than a predetermined overlap threshold (e.g., zero).
- a predetermined distance threshold or the predetermined overlap threshold can be adaptively adjusted for each level in order to control a degree of sparseness for each level.
- the feature determination module 224 repeats distance measurement for those selected nodes at level 302 - 2 and node selection for child nodes of the selected nodes at level 302 - 3 . In the above algorithm 1, the feature determination module 224 leaves those unselected nodes at level 302 - 2 and respective child nodes or branches untouched. More specifically, the feature determination module 224 does not perform any distance determination or node selection for the child nodes of the unselected nodes of level 302 - 2 .
- leaf nodes are selected according to the above algorithm and are used to represent the extracted feature by the feature determination module 224 .
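The level-by-level selection described above can be sketched as follows; the `(feature, children)` tuple encoding of the tree and the default threshold are illustrative assumptions. Branches under unselected nodes are never examined, which is the source of the time savings:

```python
import numpy as np

def select_leaves(node, x, dist_threshold=0.2):
    """Sketch of algorithm 1: a node survives only when its nodal feature lies
    within `dist_threshold` of the extracted feature x; children of unselected
    nodes are never visited.  `node` is a (feature, children) pair; a leaf has
    an empty children list.  Returns the features of the selected leaves."""
    x = np.asarray(x, dtype=float)
    frontier = node[1]                 # children of the root (level 1)
    leaves = []
    while frontier:
        nxt = []
        for feat, children in frontier:
            if np.linalg.norm(x - np.asarray(feat, dtype=float)) >= dist_threshold:
                continue               # unselected: skip the whole branch
            if children:
                nxt.extend(children)   # descend only under selected nodes
            else:
                leaves.append(feat)    # selected leaf represents the feature
        frontier = nxt
    return leaves
```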
- the feature determination module 224 may generate a histogram representation of the image.
- the histogram representation of the image may be generated by counting a number of times each node or nodal feature at a leaf level (i.e., level 302 -L in FIG. 3 or level L−1 in Table 1) of the codebook 214 is selected for the extracted features of the image.
- the histogram representation may be used to represent the image, and may be stored in the image database 108 as an index or a comparison reference for the image.
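Building the histogram representation from the selected leaf nodes is then a simple count (treating leaf-node ids as indices 0..K−1 is an assumption of this sketch):

```python
from collections import Counter

def image_histogram(selected_leaf_ids, num_leaf_nodes):
    """Count how often each leaf node of the codebook is selected across all
    extracted features of one image."""
    counts = Counter(selected_leaf_ids)
    return [counts.get(i, 0) for i in range(num_leaf_nodes)]
```

For example, if an image's features selected leaves 0, 2, 2, and 3 of a five-leaf codebook, the histogram is [1, 0, 2, 1, 0].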
- the feature determination module 224 may additionally or alternatively employ a cost module 228 to determine which nodes are selected and which nodes are not selected for the extracted feature at each level of the codebook 214 .
- the cost module 228 may include a cost function. Table 2 (below) shows a second example algorithm for representing an image using the hierarchical sparse codebook 214 .
- the hierarchical sparse codebook in FIG. 3 is used for illustration.
- when an extracted feature x i arrives at the root level 302 - 1 in FIG. 3 or level 0 in Table 2, an active set A 1 is initially set to include each nodal feature associated with each node at the next level 302 - 2 , i.e., level 1.
- the cost function L 1 is then minimized with respect to a response u i .
- Each entry, u i j , in the response u i represents a response of the extracted feature x i to corresponding nodal feature v j .
- a new active set A 2 may be created by selecting a node or a nodal feature in level 302 - 3 in FIG. 3 or level 2 in Table 2 whose parent at level 302 - 2 or level 1 gives a response u i j greater than a predetermined response threshold.
- the processes of cost function minimization and nodal feature selection are repeated for level 302 - 3 in FIG. 3 or level 2 in Table 2, until the leaf level (i.e., level 302 -L in FIG. 3 or level L−1 in Table 2) of the codebook 214 is reached.
- One or more nodes or nodal features at the leaf level of the codebook 214 having a response with the extracted feature x i greater than a predetermined response threshold may be selected to represent the extracted feature x i .
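The per-level minimization amounts to an L1-regularized least-squares (lasso) problem over the active set. Below is a minimal sketch using ISTA, one standard solver for this objective; the patent does not prescribe a particular solver, so this is only one possible realization:

```python
import numpy as np

def sparse_response(x, V, lam=0.2, iters=100):
    """Minimize ||x - V u||^2 + lam * ||u||_1 over the response u by ISTA.
    Columns of V are the nodal features in the current active set."""
    x = np.asarray(x, dtype=float)
    V = np.asarray(V, dtype=float)
    u = np.zeros(V.shape[1])
    L = 2.0 * np.linalg.norm(V, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
    for _ in range(iters):
        grad = 2.0 * V.T @ (V @ u - x)           # gradient of ||x - V u||^2
        z = u - grad / L
        u = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return u
```

Entries u i j of the result that exceed the response threshold then determine which child nodes enter the next active set.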
- a parameter ⁇ which controls the degree of sparseness, may be different for different levels of the codebook 214 .
- the parameter λ may be smaller for levels closer to the root level to allow more nodes or nodal features to be selected at those levels, and may gradually increase towards the leaf level of the codebook 214 to avoid selecting an excessive number of nodes or nodal features at the leaf level.
- the parameter ⁇ will not be modified until the codebook 214 is reconstructed or representations of the images are redone.
- an image may be represented using a combination of the above two algorithms.
- algorithm 1 may first be used to find an active set up to a predetermined level of the codebook 214 for each image feature of the image.
- Algorithm 2 may then be used for the rest of the levels of the codebook 214 to obtain one or more nodes or nodal features at the leaf level of the codebook 214 for each image feature.
- algorithm 1 can allow more nodes or nodal features to be selected for an image feature at each level, and therefore permits a broader exploration of nodal features to represent the image feature. This avoids premature elimination of nodes or nodal features that are actually good candidates for representing the image feature.
- algorithm 2 may be employed to limit the number of selected nodes or nodal features at subsequent levels in order to prevent the number of selected nodes or nodal features (i.e., the active set) from growing too large.
- the feature determination module 224 may save the representation in the image database 108 and use this representation as an index for retrieving the image. Additionally or alternatively, this representation can be saved as a reference for comparison with representations of other images such as a query image during image retrieval.
- a representation (e.g., a histogram representation) of the query image may be generated in the same manner, and may be used to retrieve one or more stored images in the image database 108 .
- the representation of the query image may be compared with representations of images stored in the image database 108 .
- a classifier may be used to classify the query image into one of a plurality of classes (e.g., automobile class) based on the representation of the query image.
- the classifier may include a neural network, a Bayesian belief network, support vector machines (SVMs), fuzzy logic, Hidden Markov Model (HMM), or any combination thereof, etc.
- the classifier may be trained on a subset of the representations of the images stored in the image database 108 .
- stored images within that class may be retrieved and presented to the user 102 according to respective frequencies of retrieval within a certain interval (e.g., the past day, the past week, the past month, etc.).
- the representation of the query image may be compared with the representations of the stored images according to an image similarity metric.
- the image similarity metric is a measure of similarity between two images, and may return a similarity score to represent a relative resemblance of a stored image with respect to the query image.
- a similarity measurement module 230 may be used to calculate a similarity score of a stored image with respect to the query image based upon the representation of the query image. For example, the similarity measurement module 230 calculates the similarity score based on a ratio of the number of common features in the representations of the query image and the stored image with respect to their average number of features.
- the similarity measurement module 230 may compute a correlation between the representation of the query image and the representation of a stored image. For example, if an image is represented in the form of a histogram as described above, a correlation between a histogram representation of the query image and a histogram representation of a stored image may be computed to obtain a similarity score therebetween. In one embodiment, each of these histogram representations may first be normalized such that its area integral is one, for example.
- one or more stored images may be presented to the user 102 , and arranged according to their similarity scores, for example, in a descending order of their similarity scores.
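One possible realization of the histogram-correlation option and the descending-score ranking (function names are illustrative, not from the patent):

```python
import numpy as np

def histogram_similarity(h_query, h_stored):
    """Similarity score as the correlation of two histogram representations,
    each first normalized to unit area."""
    q = np.asarray(h_query, dtype=float)
    s = np.asarray(h_stored, dtype=float)
    q = q / max(q.sum(), 1e-12)
    s = s / max(s.sum(), 1e-12)
    return float(np.dot(q, s) / (np.linalg.norm(q) * np.linalg.norm(s) + 1e-12))

def rank_stored(h_query, stored):
    """Return stored-image ids ordered by descending similarity to the query."""
    scores = {k: histogram_similarity(h_query, h) for k, h in stored.items()}
    return sorted(scores, key=scores.get, reverse=True)
```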
- the program module 210 may further include a codebook generation module 232 .
- the codebook generation module 232 generates the hierarchical sparse codebook 214 based on the training image features that are stored in the feature database 218 . Additionally or alternatively, the codebook generation module 232 generates the hierarchical sparse codebook 214 based on images stored in the image database 108 . In one embodiment, the codebook generation module 232 generates or reconstructs the hierarchical sparse codebook 214 on a regular basis, e.g., each day, each week, each month, or each year. Alternatively, the hierarchical sparse codebook 214 may be generated upon request, for example, from the search engine 104 or the website 106 .
- the hierarchical sparse codebook 214 is reconstructed based on performance of the codebook 214 in retrieving stored images in response to query images submitted from the user 102 .
- the program data 212 may further include image query data 234 .
- the image query data 234 may include query images that have been submitted by one or more users 102 and stored images that were returned in response to the query images. Additionally or alternatively, the image query data 234 may include one or more stored images that have been selected by the users 102 in response to the query images. In one embodiment, the image query data 234 may further include similarity scores of the one or more selected images with respect to the query images.
- the codebook 214 may be reconstructed in response to an average similarity score of the selected images in the image query data 234 being less than a predetermined similarity threshold.
- the predetermined similarity threshold may be set by an administrator or operator of the system 110 according to the accuracy and/or computing requirements, for example. For example, if a perfect match between a query image and a stored image has a similarity score of one, the codebook 214 may be reconstructed in response to the average similarity score being less than 0.7, for example.
- the codebook generation module 232 may receive a plurality of training image features from the feature database 218 . Additionally or alternatively, the codebook generation module 232 may receive a plurality of images from the image database 108 and use the feature extraction module 222 to extract a plurality of image features for training purposes. Upon receiving the plurality of training image features, the codebook generation module 232 generates a hierarchical sparse codebook 214 according to a codebook generation algorithm. An example algorithm is illustrated in Table 3 (below).
- k number of nodes at level 1 branch out from a root node at level 0.
- Each node at level 1 is associated with a nodal feature which is a training image feature randomly selected from the plurality of training image features.
- the plurality of training image features are then compared with each nodal feature at level 1 in order to assign a subset of training image features to the corresponding node at level 1.
- the subset of training image features assigned to a node includes a training image feature that has a response (e.g., a degree of overlap) to a nodal feature associated with that node greater than a predetermined response threshold, e.g., zero.
- a set of k nodal features is trained with respect to the assigned subset of training image features for the node. Specifically, based on the assigned subset of training image features, a cost function is minimized with respect to the set of k nodal features: Σ_i ( ‖x_i − Σ_j u_i^j v_j‖² + λ Σ_j |u_i^j| ), where v_j are the k nodal features, u_i^j is the response of training image feature x_i to v_j , and λ is the sparseness factor for the level.
- this set of k nodal features is then assigned to the child nodes of the node at the next level, i.e., level 2. These processes of cost function minimization and nodal feature assignment are repeated for each node at each level until each node at the leaf level of the codebook is assigned a nodal feature and a subset of training image features, i.e., until the leaf level of the codebook is reached. At this point, the hierarchical sparse codebook is generated.
- For each node j at level l, denoted o l j , collect the subset of X whose responses to the nodal feature vector associated with node o l j are greater than a predetermined response threshold; this subset is denoted X l j .
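A deliberately simplified sketch of this generation procedure: here nodal features are chosen at random and training features are assigned to every node with a positive response, whereas the patent additionally trains the nodal features by minimizing the sparse-coding cost described above:

```python
import numpy as np

def build_codebook(features, k=2, depth=2, min_size=2, seed=0):
    """Recursively grow a codebook tree: pick k nodal features per node, assign
    each training feature to every node with a positive response (inner
    product), then recurse on each node's assigned subset."""
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)

    def grow(subset, level):
        if level >= depth or len(subset) < min_size:
            return []                              # leaf: stop branching
        idx = rng.choice(len(subset), size=min(k, len(subset)), replace=False)
        nodes = []
        for v in subset[idx]:
            assigned = subset[(subset @ v) > 0.0]  # positive-response assignment
            nodes.append({"feature": v, "children": grow(assigned, level + 1)})
        return nodes

    return {"feature": None, "children": grow(X, 0)}
```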
- the parameter ⁇ l (which is also called a sparseness factor for level l) can be adaptively adjusted to change a degree of sparseness for the level l.
- the parameter ⁇ l or the degree of sparseness for a level is adjusted to be less than a predetermined threshold level.
- the parameter ⁇ l or the degree of sparseness for a level is adjusted to be within a predetermined range.
- the parameter ⁇ l or the degree of sparseness for each level is collectively adjusted to obtain an overall degree of sparseness for the codebook and the plurality of training image features that is less than a predetermined overall threshold or within a predetermined overall range.
- the predetermined threshold level or the predetermined range may be the same or different for different levels.
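A simple multiplicative search is one way to realize this adjustment; the patent does not specify an adjustment rule, and `sparseness_of` stands for a routine that measures the degree of sparseness obtained with a given λ l :

```python
def tune_lambda(sparseness_of, target_lo, target_hi, lam=1.0, factor=1.5, max_iter=50):
    """Adjust lambda until the measured degree of sparseness for a level falls
    inside [target_lo, target_hi]."""
    for _ in range(max_iter):
        s = sparseness_of(lam)
        if s > target_hi:       # too dense: strengthen the sparsity penalty
            lam *= factor
        elif s < target_lo:     # too sparse: weaken the penalty
            lam /= factor
        else:
            return lam, s
    return lam, sparseness_of(lam)
```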
- the above algorithm may further be modified. Specifically, after randomly assigning k number of training image features to be nodal features associated with the nodes at level 1, the algorithm may further train these nodal features to minimize the above cost function for level 1. Upon obtaining a set of optimized nodal features that minimize the cost function of level 1, the algorithm may assign these optimized nodal features to the nodes of level 1. The algorithm further assigns a subset of training image features that have responses greater than a predetermined response threshold to each node of level 1.
- the algorithm may further specify that a training image feature that is assigned to a node is also a training image feature that has been assigned to the parent of the node.
- although the codebook 214 is described as including a hierarchical tree in the foregoing embodiments, the codebook 214 is not limited thereto.
- the hierarchical sparse codebook 214 can include any hierarchical structure.
- the hierarchical sparse codebook 214 may initially include a hierarchical tree. After or during the training phase of the hierarchical sparse codebook 214 , however, a node (i.e., a node at an intermediate level and/or a leaf level of the codebook 214 ) may be purged based on an average degree of overlap between associated training image features and corresponding nodal feature of the node.
- a node may be purged if corresponding average degree of overlap between associated training image features and corresponding nodal feature is less than a predetermined threshold.
- this predetermined threshold may vary among different levels.
- the predetermined threshold for average degree of overlap is lower at a higher level (i.e., a level closer to the root level of the codebook 214 ), and increases towards the leaf level of the codebook 214 . This is because the number of training image features assigned to a node at the higher level is usually greater and a nodal feature associated with the node is more generalized with respect to the assigned training image features. Having a lower threshold therefore avoids premature purging of the node at the higher level.
- a node at a lower level is usually assigned with a fewer number of training image features, and a corresponding nodal feature may be more specific to the assigned training image features. Therefore, the predetermined threshold associated with the node at the lower level can be higher to reflect a change from generality to specificity of nodal features from a high level to a low level of the codebook 214 .
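The purging rule might be sketched as follows, with the per-level thresholds supplied by the caller (the dict encoding of a node is an assumption of this sketch):

```python
import numpy as np

def purge(nodes, thresholds_by_level):
    """Drop a node when the average overlap (inner product) between its assigned
    training features and its nodal feature falls below the level's threshold.
    Each node is a dict with 'feature', 'assigned', 'children', and 'level'."""
    kept = []
    for n in nodes:
        feats = np.asarray(n["assigned"], dtype=float)
        f = np.asarray(n["feature"], dtype=float)
        avg = float(np.mean(feats @ f)) if len(feats) else 0.0
        if avg >= thresholds_by_level[n["level"]]:
            n["children"] = purge(n["children"], thresholds_by_level)
            kept.append(n)
    return kept
```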
- the hierarchical sparse codebook may be a hierarchical structure having a plurality of levels, with each level having a predetermined number of nodes. Rather than having an equal number of intermediate child nodes for each node at one level, the number of intermediate child nodes of a node at that level may be determined based upon the number of training image features assigned to that particular node. For example, the number of intermediate child nodes of a first node at one level is greater than the number of intermediate child nodes of a second node at the same level if the number of training image features assigned to the first node is greater than the number of training image features assigned to the second node.
- a node having a greater number of training image features is allocated more resources (i.e., child nodes) to represent these training image features while a node having a fewer number of training image features is allocated fewer resources, thereby optimizing the use of resources which are usually limited.
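A proportional allocation along these lines can be sketched as follows (illustrative only; because of rounding, the counts need not sum exactly to the total budget):

```python
def child_counts(assigned_counts, total_children):
    """Allocate child nodes to sibling nodes in proportion to how many training
    features each was assigned, giving every node at least one child."""
    total = sum(assigned_counts)
    return [max(1, round(total_children * c / total)) for c in assigned_counts]
```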
- Exemplary methods for generating a hierarchical sparse codebook or representing an image using the hierarchical sparse codebook are described with reference to FIGS. 4-6 .
- These exemplary methods can be described in the general context of computer executable instructions.
- computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types.
- the methods can also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communication network.
- computer executable instructions may be located both in local and remote computer storage media, including memory storage devices.
- the exemplary methods are illustrated as a collection of blocks in a logical flow graph representing a sequence of operations that can be implemented in hardware, software, firmware, or a combination thereof.
- the order in which the methods are described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the methods, or alternate methods. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described herein.
- the blocks represent computer instructions that, when executed by one or more processors, perform the recited operations.
- FIG. 4 illustrates an exemplary method 400 of generating a hierarchical sparse codebook.
- a plurality of training image features are received.
- This plurality of training image features may be obtained from one or more databases and/or one or more search engines.
- the plurality of training image features may be extracted from a plurality of images that are stored in the one or more databases and/or the one or more search engines.
- a hierarchical sparse codebook is generated based at least upon the plurality of training image features.
- the hierarchical sparse codebook may be generated to include a plurality of levels.
- each of the plurality of levels may be associated with a sparseness factor as shown in FIG. 3 , for example.
- Each level of the hierarchical sparse codebook is generated by adjusting corresponding sparseness factors to be less than respective predetermined thresholds or within respective predetermined ranges.
- the hierarchical sparse codebook may be generated by adjusting the sparseness factor of each level to obtain an overall degree of sparseness for the codebook and the plurality of training image features.
- the sparseness factor of each level is adjusted to obtain an overall degree of sparseness that is less than a predetermined overall threshold or within a predetermined overall range.
- This predetermined overall threshold or predetermined overall range may be set by an administrator or an operator of the system 112 based on specified computing requirements or needs.
- generating the hierarchical sparse codebook at block 404 may include representing each training image feature by a sparse number of leaf nodes or nodal features that are associated with the leaf nodes of the hierarchical sparse codebook.
- FIG. 5 shows an example of this hierarchical sparse codebook.
- each training image feature j is represented by a sparse number of nodes or nodal features at the leaf level of the codebook.
- FIG. 6 illustrates an exemplary method 600 of representing or encoding an image using a hierarchical sparse codebook.
- an image is received.
- This image may be received from a user for image query.
- this image may be received from a search engine or a website for encoding the image.
- a plurality of image features are extracted from the image.
- each image feature of the image is compared with a hierarchical sparse codebook to obtain one or more leaf-level features (i.e., nodal features at leaf level) of the codebook.
- the one or more leaf-level features represent a sparse code representation of the respective image feature.
- a histogram for the image is generated based upon the one or more leaf-level features of each image feature of the image.
- the histogram represents the respective number of times that each leaf-level feature of the codebook is encountered by the plurality of image features of the image.
- the image is represented by the histogram.
- the histogram may further be stored in a database as an index for the image. Additionally or alternatively, the histogram may act as a reference for comparison with another image, such as a query image, during image retrieval.
- the histogram of the query image may be compared with histograms of a subset of stored images in the database. In one embodiment, the comparison may be performed by computing correlations between the histogram of the query image and the histograms of the subset of stored images.
- One or more stored images having a correlation greater than a predetermined correlation threshold may be retrieved and presented to the user.
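As an illustrative sketch of this comparison step (the use of numpy, Pearson correlation via np.corrcoef, and the particular threshold value are assumptions, not details fixed by the description):

```python
import numpy as np

def retrieve(query_hist, stored_hists, threshold=0.8):
    """Return indices of stored images whose histogram correlates with
    the query histogram above a predetermined correlation threshold."""
    results = []
    for idx, hist in enumerate(stored_hists):
        # Pearson correlation between the two histogram vectors.
        corr = np.corrcoef(query_hist, hist)[0, 1]
        if corr > threshold:
            results.append(idx)
    return results
```

The returned indices identify the stored images to present to the user, per the step above.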
- Computer-readable media can be any available media that can be accessed during generation of the hierarchical sparse codebook or encoding an image using the hierarchical sparse codebook.
- Computer-readable media may comprise volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data.
- Computer-readable media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information. Combinations of any of the above should also be included within the scope of computer-readable media.
TABLE 1
Algorithm 1: Encode an image using a hierarchical sparse codebook
[Input]: Feature vector set X = {x1, x2, x3, ... xM}, e.g., M feature vectors extracted from the image I; constructed hierarchical sparse codebook T
[Output]: Histogram representation h
[Initialization]: Active set A1 = {v1, v2, v3, ... vk} for tree level l = 1
[Main]
For i = 1 to M
  For l = 1 to L−1
    1. Measure a distance or a degree of overlap between feature vector xi and each nodal feature vector in the active set Al;
    2. Generate active set Al+1 by selecting each node or nodal feature at tree level l+1 whose parent has a distance from the feature vector xi less than a predetermined distance threshold, or a degree of overlap with xi greater than a predetermined overlap threshold;
    l = l + 1
  End For
  i = i + 1
End For
The histogram representation h is calculated by counting the number of times each node or nodal feature at level l = L of the codebook T is selected over all X = {x1, x2, x3, ... xM}.
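A minimal Python sketch of the traversal in Table 1, assuming Euclidean distance and a dict-based tree representation (both are illustrative choices; the table leaves the distance measure and the data structure open):

```python
import numpy as np

def encode_feature(x, top_nodes, L, dist_threshold):
    """Trace feature vector x down the codebook tree (Table 1), keeping
    at each level only the children of nodes whose nodal feature vector
    lies within dist_threshold of x. Each node is a dict:
    {"v": nodal_feature, "children": [...]} (leaves also carry an "id").
    Returns the selected nodes at leaf level l = L."""
    active = top_nodes                    # active set A1 at level l = 1
    for _ in range(L - 1):                # levels l = 1 .. L-1
        next_active = []
        for node in active:
            if np.linalg.norm(x - node["v"]) < dist_threshold:
                next_active.extend(node["children"])
        active = next_active
    return active

def encode_image(features, top_nodes, L, dist_threshold, num_leaves):
    """Histogram h: count how often each leaf is selected across features."""
    h = np.zeros(num_leaves)
    for x in features:
        for leaf in encode_feature(x, top_nodes, L, dist_threshold):
            h[leaf["id"]] += 1
    return h
```

Each image feature contributes counts only to the leaves it reaches, so h is sparse when the thresholds prune aggressively.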
TABLE 2
Algorithm 2: Encode an image using a hierarchical sparse codebook
[Input]: Feature vector set X = {x1, x2, x3, ... xM}, e.g., M feature vectors extracted from the image I; constructed hierarchical sparse codebook T
[Output]: Histogram representation h
[Initialization]: Active set A1 = {v1, v2, v3, ... vk} for tree level l = 1
[Main]
For i = 1 to M
  For l = 1 to L−1
    1. Encode feature vector xi using the active set Al by minimizing a cost function |xi − uiAl|L1 + λ|ui|L1, where λ is a parameter to control a degree of sparseness for representing the feature vector xi in terms of the nodal feature vectors in Al, and | |L1 represents the L1-norm;
    2. Generate active set Al+1 by selecting each node or nodal feature at tree level l+1 whose parent gives a response uij greater than a predetermined response threshold;
    l = l + 1
  End For
  i = i + 1
End For
The histogram representation h is calculated by summing and normalizing all responses of all X = {x1, x2, x3, ... xM} at level l = L of the codebook tree T.
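The cost in step 1 of Table 2 is an L1-regularized fit. As an illustrative stand-in, not the patent's solver (the table does not prescribe one, and this sketch substitutes the more common squared-error data term for the L1 data term), iterative soft-thresholding (ISTA) produces a sparse response vector u over the active-set nodal features:

```python
import numpy as np

def sparse_response(x, A, lam=0.1, step=None, iters=200):
    """Sparse-code x over the rows of A (the active-set nodal features)
    by ISTA on  0.5 * ||x - u @ A||_2^2 + lam * ||u||_1.
    Returns the response vector u, one entry per nodal feature."""
    if step is None:
        # 1 / Lipschitz constant of the gradient of the quadratic term.
        step = 1.0 / np.linalg.norm(A @ A.T, 2)
    u = np.zeros(A.shape[0])
    for _ in range(iters):
        grad = (u @ A - x) @ A.T                # gradient of the data term
        z = u - step * grad                     # gradient step
        # Soft-thresholding enforces sparsity from the L1 penalty.
        u = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return u
```

Thresholding the magnitudes of u then selects which parents contribute children to the next active set, as in step 2 of the table.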
$$\sum_i \bigl\lVert x_{li}^{j} - u_{li}^{j} V_{l}^{j} \bigr\rVert_{L1} + \lambda_l \sum_i \bigl\lVert u_{li}^{j} \bigr\rVert_{L1} \tag{1}$$
where
- $x_{li}^{j}$ represents a training image feature in a subset $X_l^j$,
- $V_l^j$ represents the set of K nodal features that are trained for node j at level l, i.e., $o_l^j$,
- $u_{li}^{j}$ represents the response of $x_{li}^{j}$ to $V_l^j$, and
- $\lambda_l$ represents a parameter to control a degree of sparseness for level l.
TABLE 3
Algorithm 3: Generate a hierarchical sparse codebook
[Input]: Feature vector set X = {x1, x2, x3, ... xN}, e.g., N feature vectors from a set of training images
[Output]: K-branch tree T with levels l = 0, 1, 2, ... L, each node being associated with a nodal feature vector v
[Initialization]: Branch a root node (at level l = 0) into K nodes (at level l = 1); each of the K nodes at level l = 1 is randomly selected from the feature vector set X
[Main]
For l = 1 to L−1
  1. For each node j at level l, i.e., ol j, collect the subset of X whose response to the nodal feature vector associated with node ol j is greater than a predetermined response threshold; denote this subset as Xl j;
  2. For each node j at level l, ol j, based on Xl j, train a set of K nodal features Vl j by minimizing the cost function Σi|xli j − uli jVl j|L1 + λl Σi|uli j|L1 with respect to the visual codebook associated with node ol j, i.e., Vl j; the child nodes of node ol j at level l+1 are then associated with the K nodal features of Vl j;
  l = l + 1
End For
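A structural sketch of Algorithm 3. The inner L1 cost-function minimization is abstracted behind a callback; the simple default below merely samples K representative features as a placeholder, and the cosine-similarity response is likewise an assumption, since the table does not fix the response measure:

```python
import random

def default_train(subset, K):
    # Placeholder for minimizing the L1 cost in step 2: just sample K
    # features from the subset as the child nodal features.
    return random.sample(subset, min(K, len(subset)))

def response(x, v):
    # Illustrative response: cosine similarity between feature vectors.
    num = sum(a * b for a, b in zip(x, v))
    den = (sum(a * a for a in x) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return num / den if den else 0.0

def build_codebook(X, K, L, threshold=0.5, train=default_train):
    """Grow a K-branch tree of depth L from training features X.
    Each node is a dict: {"v": feature, "children": [...]}."""
    root = {"v": None,
            "children": [{"v": v, "children": []} for v in random.sample(X, K)]}
    level = root["children"]                  # nodes at level l = 1
    for _ in range(1, L):                     # levels l = 1 .. L-1
        next_level = []
        for node in level:
            # Step 1: collect the subset of X responding strongly to this node.
            subset = [x for x in X if response(x, node["v"]) > threshold]
            # Step 2: train K child nodal features on that subset.
            node["children"] = [{"v": v, "children": []}
                                for v in train(subset, K)]
            next_level.extend(node["children"])
        level = next_level
    return root
```

Because each node is trained only on the features that respond to it, nodes deeper in the tree specialize to smaller subsets, matching the generality-to-specificity progression described earlier.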
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/943,805 US8463045B2 (en) | 2010-11-10 | 2010-11-10 | Hierarchical sparse representation for image retrieval |
Publications (2)
Publication Number | Publication Date |
---|---|
US20120114248A1 US20120114248A1 (en) | 2012-05-10 |
US8463045B2 true US8463045B2 (en) | 2013-06-11 |
Family
ID=46019687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/943,805 Active 2031-08-12 US8463045B2 (en) | 2010-11-10 | 2010-11-10 | Hierarchical sparse representation for image retrieval |
Country Status (1)
Country | Link |
---|---|
US (1) | US8463045B2 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130142401A1 (en) * | 2011-12-05 | 2013-06-06 | Canon Kabushiki Kaisha | Image processing apparatus and image processing method |
US20140355880A1 (en) * | 2012-03-08 | 2014-12-04 | Empire Technology Development, Llc | Image retrieval and authentication using enhanced expectation maximization (eem) |
US20170132826A1 (en) * | 2013-09-25 | 2017-05-11 | Heartflow, Inc. | Systems and methods for controlling user repeatability and reproducibility of automated image annotation correction |
US20170364740A1 (en) * | 2016-06-17 | 2017-12-21 | International Business Machines Corporation | Signal processing |
US10007679B2 (en) | 2008-08-08 | 2018-06-26 | The Research Foundation For The State University Of New York | Enhanced max margin learning on multimodal data mining in a multimedia database |
US10319035B2 (en) | 2013-10-11 | 2019-06-11 | Ccc Information Services | Image capturing and automatic labeling system |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8577131B1 (en) * | 2011-07-12 | 2013-11-05 | Google Inc. | Systems and methods for visual object matching |
US9311337B2 (en) * | 2012-12-20 | 2016-04-12 | Broadcom Corporation | Image subset determination and processing |
CN103324954B (en) * | 2013-05-31 | 2017-02-08 | 中国科学院计算技术研究所 | Image classification method based on tree structure and system using same |
CN103559683B (en) * | 2013-09-24 | 2016-08-10 | 浙江大学 | The damaged image restorative procedure represented based on the popular low-rank of multi views |
CN103678551B (en) * | 2013-12-05 | 2017-09-26 | 银江股份有限公司 | A kind of large-scale medical image retrieval encoded based on Random sparseness |
CN103679150B (en) * | 2013-12-13 | 2016-12-07 | 深圳市中智科创机器人有限公司 | A kind of facial image sparse coding method and apparatus |
KR102024867B1 (en) * | 2014-09-16 | 2019-09-24 | 삼성전자주식회사 | Feature extracting method of input image based on example pyramid and apparatus of face recognition |
EP3166021A1 (en) * | 2015-11-06 | 2017-05-10 | Thomson Licensing | Method and apparatus for image search using sparsifying analysis and synthesis operators |
CN104765878A (en) * | 2015-04-27 | 2015-07-08 | 合肥工业大学 | Sparse coding algorithm suitable for multi-modal information and application thereof |
US10102448B2 (en) * | 2015-10-16 | 2018-10-16 | Ehdp Studios, Llc | Virtual clothing match app and image recognition computing device associated therewith |
US10977565B2 (en) * | 2017-04-28 | 2021-04-13 | At&T Intellectual Property I, L.P. | Bridging heterogeneous domains with parallel transport and sparse coding for machine learning models |
CN107634787A (en) * | 2017-08-22 | 2018-01-26 | 南京邮电大学 | A kind of method of extensive MIMO millimeter wave channel estimations |
CN109815355A (en) * | 2019-01-28 | 2019-05-28 | 网易(杭州)网络有限公司 | Image search method and device, storage medium, electronic equipment |
CN111125577A (en) * | 2019-11-22 | 2020-05-08 | 百度在线网络技术(北京)有限公司 | Webpage processing method, device, equipment and storage medium |
Citations (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999045483A1 (en) | 1998-03-04 | 1999-09-10 | The Trustees Of Columbia University In The City Of New York | Method and system for generating semantic visual templates for image and video retrieval |
US6285995B1 (en) | 1998-06-22 | 2001-09-04 | U.S. Philips Corporation | Image retrieval system using a query image |
US20010056415A1 (en) | 1998-06-29 | 2001-12-27 | Wei Zhu | Method and computer program product for subjective image content smilarity-based retrieval |
US6408293B1 (en) | 1999-06-09 | 2002-06-18 | International Business Machines Corporation | Interactive framework for understanding user's perception of multimedia data |
US20020136468A1 (en) | 2001-03-20 | 2002-09-26 | Hung-Ming Sun | Method for interactive image retrieval based on user-specified regions |
US6744935B2 (en) | 2000-11-02 | 2004-06-01 | Korea Telecom | Content-based image retrieval apparatus and method via relevance feedback by using fuzzy integral |
US20040111453A1 (en) | 2002-12-06 | 2004-06-10 | Harris Christopher K. | Effective multi-class support vector machine classification |
US20040175041A1 (en) | 2003-03-06 | 2004-09-09 | Animetrics, Inc. | Viewpoint-invariant detection and identification of a three-dimensional object from two-dimensional imagery |
US20040249801A1 (en) | 2003-04-04 | 2004-12-09 | Yahoo! | Universal search interface systems and methods |
US6847733B2 (en) | 2001-05-23 | 2005-01-25 | Eastman Kodak Company | Retrieval and browsing of database images based on image emphasis and appeal |
US20050057570A1 (en) | 2003-09-15 | 2005-03-17 | Eric Cosatto | Audio-visual selection process for the synthesis of photo-realistic talking-head animations |
US20050192992A1 (en) | 2004-03-01 | 2005-09-01 | Microsoft Corporation | Systems and methods that determine intent of data and respond to the data based on the intent |
WO2006005187A1 (en) | 2004-07-09 | 2006-01-19 | Parham Aarabi | Interactive three-dimensional scene-searching, image retrieval and object localization |
US20060112092A1 (en) | 2002-08-09 | 2006-05-25 | Bell Canada | Content-based image retrieval method |
US7065521B2 (en) | 2003-03-07 | 2006-06-20 | Motorola, Inc. | Method for fuzzy logic rule based multimedia information retrival with text and perceptual features |
US7099860B1 (en) | 2000-10-30 | 2006-08-29 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback |
US7113944B2 (en) | 2001-03-30 | 2006-09-26 | Microsoft Corporation | Relevance maximizing, iteration minimizing, relevance-feedback, content-based image retrieval (CBIR). |
US20070104378A1 (en) | 2003-03-05 | 2007-05-10 | Seadragon Software, Inc. | Method for encoding and serving geospatial or other vector data as images |
US7240075B1 (en) | 2002-09-24 | 2007-07-03 | Exphand, Inc. | Interactive generating query related to telestrator data designating at least a portion of the still image frame and data identifying a user is generated from the user designating a selected region on the display screen, transmitting the query to the remote information system |
US20070214172A1 (en) | 2005-11-18 | 2007-09-13 | University Of Kentucky Research Foundation | Scalable object recognition using hierarchical quantization with a vocabulary tree |
US20070259318A1 (en) | 2006-05-02 | 2007-11-08 | Harrison Elizabeth V | System for interacting with developmentally challenged individuals |
US20070288453A1 (en) | 2006-06-12 | 2007-12-13 | D&S Consultants, Inc. | System and Method for Searching Multimedia using Exemplar Images |
KR100785928B1 (en) | 2006-07-04 | 2007-12-17 | 삼성전자주식회사 | Method and system for searching photograph using multimodal |
US20090125510A1 (en) | 2006-07-31 | 2009-05-14 | Jamey Graham | Dynamic presentation of targeted information in a mixed media reality recognition system |
US20090171929A1 (en) | 2007-12-26 | 2009-07-02 | Microsoft Corporation | Toward optimized query suggeston: user interfaces and algorithms |
US7624337B2 (en) | 2000-07-24 | 2009-11-24 | Vmark, Inc. | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
US20100088342A1 (en) | 2008-10-04 | 2010-04-08 | Microsoft Corporation | Incremental feature indexing for scalable location recognition |
US7801893B2 (en) | 2005-09-30 | 2010-09-21 | Iac Search & Media, Inc. | Similarity detection and clustering of images |
Non-Patent Citations (120)
Title |
---|
Abdel-Mottaleb, et al., "Performance Evaluation of Clustering Algorithms for Scalable Image Retrieval", retrieved on Jul. 30, 2010 at <<http://www.umiacs.umd.edu/˜gopal/Publications/cvpr98.pdf>>, John Wiley—IEEE Computer Society, Empirical Evaluation Techniques in Computer Vision, Santa Barbara, CA, 1998, pp. 45-56. |
Baumberg, "Reliable Feature Matching Across Widely Separated Views", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.1666&rep=rep1&type=pdf>>, IEEE, Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), Hilton Head Island, SC, vol. 1, 2000, pp. 774-781. |
Beckmann, et al., "The R-Tree: An Efficient and Robust Access Method for Points and Rectangles", retrieved on Jul. 30, 2010 at <<http://epub.ub.uni-muenchen.de/4256/1/31.pdf>>, ACM, SIGMOD Record, vol. 19, No. 2, Jun. 1990, pp. 322-331. |
Belussi, et al., "Estimating the Selectivity of Spatial Queries Using the ‘Correlation’ Fractal Dimension", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.4521&rep=rep1&type=pdf>>, Morgan Kaufmann Publishers, Proceedings of International Conference on Very Large Data Bases, 1995, pp. 299-310. |
Bengio, et al., "Group Sparse Coding", retrieved on Jul. 7, 2010 at <<http://books.nips.cc/papers/files/nips22/NIPS2009—0865.pdf>>, MIT Press, Advances in Neural Information Processing Systems (NIPS), 2009, pp. 1-8. |
Berchtold, et al., "Fast Parallel Similarity Search in Multimedia Databases", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.6847&rep=repl&type=pdf>>, ACM, Proceedings of International Conference on Management of Data, Tucson, Arizona, 1997, pp. 1-12. |
Berchtold, et al., "The X-Tree: An Index Structure for High-Dimensional Data", retrieved on Jul. 30, 2010 at <<http://eref.uqu.edu.sa/files/the-x-tree-an-index-structure-for-high.pdf>>, Morgan Kaufmann Publishers, Proceedings of Conference on Very Large Data Bases, Mumbai, India, 1996, pp. 28-39. |
Berg, et al., "Shape Matching and Object Recognition using Low Distortion Correspondences", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.1762&rep=rep1&type=pdf>>, IEEE Computer Society, Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, 2005, pp. 26-33. |
Berker, et al., "Very-Large Scale Incremental Clustering", retrieved on Jul. 30, 2010 at <<http://www.google.com/search?q=Berker%2C+Very-Large+Scale+Incremental+Clustering&rls=com.microsoft:en-us:IE-SearchBox&ie=UTF-8&oe=UTF-8&sourceid=ie7&rlz=117ADBF>>, Mar. 2007, pp. 1-24. |
Can, et al., "Concepts and Effectiveness of the Cover Coefficient Based Clustering Methodology for Text Databases", retrieved on Jul. 30, 2010 at <<http://sc.lib.muohio.edu/bitstream/handle/2374.MIA/246/fulltext.pdf?sequence=1>>, Miami University Libraries, Oxford, Ohio, Technical Report MU-SEAS-CSA-1987-002, Dec. 1987, pp. 1-45. |
Cui, et al., "Combining Stroke-Based and Selecion-Based Relevance Feedback for Content-Based Image Retrieval", at <<http://portal.acm.org/citation.cfm?id=1291304#abstract>>, ACM, 2007, pp. 329-332. |
Datar, et al., "Locality-Sensitive Hashing Scheme Based on p-Stable Distributions", retrieved on Jul. 30, 2010 at <<http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/p253-datar.pdf>>, ACM, Proceedings of Symposium on Computational Geometry (SCG), Brooklyn, New York, 2004, pp. 253-262. |
Datta, et al., "Image Retrieval: Ideas, Influences, and Tends of the New Age", at <<http://infolab.stanford.edu/˜wangz/project/imsearch/review/JOUR/datta.pdf>>, ACM, 2006, pp. 65. |
Extended European Search Report mailed Aug. 11, 2011 for European patent application No. 09755475.2, 9 pages. |
Faloutsos, et al., "Efficient and Effective Querying by Image Content", retreived on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.9013&rep=rep1&type=pdf>>, Kluwer Academic Publishers, Hingham, MA, Journal of Intelligent Information Systems, vol. 3, No. 3-4, Jul. 1994, pp. 231-262. |
Fraundorfer, et al., "Evaluation of local detectors on non-planar scenes", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.3077&rep=rep1&type=pdf>>, Proceedings of Workshop of the Austrian Association for Pattern Recognition, 2004, pp. 125-132. |
Friedman, et al., "An Algorithm for Finding Best Matches in Logarithmic Expected Time", retrieved on Jun. 29, 2010 at <<http://delivery.acm.org/10.1145/360000/355745/p209-freidman.pdf?key1=355745&key2=3779787721&coll=GUIDE&d1=GUIDE&CFID=93370504&CFTOKEN=86954411>>, ACM Transactions on Mathematical Software, vol. 3, No. 3, Sep. 1977, pp. 209-226. |
Furao, et al., "A Self-controlled Incremental Method for Vector Quantization", retrieved on Jul. 3, 2010 at <<http://www.isl.titech.ac.jp/˜hasegawalab/papers/shen—prmu—sept—2004.pdf>>, Japan Science and Technology Agency, Journal: IEIC Technical Report, Institute of Electronics, Information and Communication Engineers, vol. 104, No. 290, 2004, pp. 93-100. |
Gevers et al., "The PicToSeek WWW Image Search System," Proceedings of the IEEE International Conference on Multimedia Computing and Systems, vol. 1, Jun. 7, 1999, Florence, Italy, pp. 264-269. |
Grauman, et al., "The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.100.253&rep=rep1&type=pdf>>, IEEE, Proceedings of International Conference on Computer Vision (ICCV), Beijing, China, Oct. 2005, pp. 1458-1465. |
Guttman, "R-Trees: A Dynamic Index Structure for Spatial Searching", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.7887&rep=rep1&type=pdf>>, ACM, 1984, pp. 47-57. |
He, et al., "An Investigation of Using K-d Tree to Improve Image Retrieval Efficiency", retrieved on Jul. 30, 2010 at <<http://www.aprs.org.au/dicta2002/dicta2002—proceedings/He128.pdf>>, Digital Image Computing Techniques and Application (DICTA), Melbourne, Australia, Jan. 2002, pp. 1-6. |
He, et al., "Learning and Inferring a Semantic Space from User's Relevance Feedback for Image Retrieval", retrieved on Jul. 30, 2010 at <<http://research.microsoft.com/pubs/69949/tr-2002-62.doc>>, ACM, Proceedings of International Multimedia Conference, Juan-les-Pins, France, 2002, pp. 343-346. |
Henrich, et al., "The LSD Tree: Spatial Access to Multidimensional Point and Non-Point Objects", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.9784&rep=rep1&type=pdf>>, Proceedings of the International Conference on Very large Data Bases, Amsterdam, 1989, pp. 45-54. |
Indyk, et al., "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.249&rep=rep1&type=pdf>>, ACM, Proceedings of Symposium on Theory of Computing, Dallas, TX, Apr. 1998, pp. 604-613. |
Javidi, et al., "A Semantic Feedback Framework for Image Retrieval", retrieved on Jul. 30, 2010 at <<http://www.ijcee.org/papers/171.pdf>>, International Journal of Computer and Electrical Engineering, vol. 2, No. 3, Jun. 2010, pp. 417-425. |
Jeong, et al., "Automatic Extraction of Semantic Relationships from Images Using Ontologies and SVM Classifiers", Proceedings of the Korean Information Science Society Conference, vol. 34, No. 1(c), Jun. 2006, pp. 13-18. |
Katayama, et al., "The SR-Tree: An Index Structure for High-Dimensional Nearest Neighbor Queries", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.64.4381&rep=rep1&type=pdf>>, ACM, Proceedings of International Conference on Management of Data, Tucson, Arizona, 1997, pp. 369-380. |
Kim, et al., "A Hierarchical Grid Feature Representation Framework for Automatic Image Annotation", retrieved on Jul. 7, 2010 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4959786>>, IEEE Computer Society, Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009, pp. 1125-1128. |
Kushki, et al., "Query Feedback for Interactive Image Retrieval", at <<http://ieeexplore.ieee.org/xpl/freeabs—all.jsp?arnumber=1294956>>, IEEE, vol. 14, No. 15, May 2004, pp. 644-655. |
Lepetit, et al., "Randomized Trees for Real-Time Keypoint Recognition", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.4902&rep=rep1&type=pdf>>, IEEE Computer Society, Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2005, pp. 775-781. |
Li, et al., "An Adaptive Relevance Feedback Image Retrieval Method Based on Possibilistic Clustering Algorithm", retrieved on Jul. 30, 2010 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=04021676>>, IEEE Computer Society, Proceedings of International Conference on Intelligent Systems Design and Applications (ISDA), 2006, pp. 295-299. |
Likas, et al., "The global k-means clustering algorithm", retrieved on Jul. 30, 2010 at <<http://www.cs.uoi.gr/˜arly/papers/PR2003.pdf>>, Elsevier Science Ltd., Pattern Recognition, vol. 36, 2003, pp. 451-461. |
Linde, et al., "An Algorithm for Vector Quantizer Design", retrieved on Jul. 30, 2010 at <<http://148.204.64.201/paginas%20anexas/voz/articulos%20interesantes/reconocimiento%20de%20voz/cuantificacion%20vectorial/An%20algorithm%20for%20Vector%20Quantizer%20Design.pdf>>, IEEE Transactions on Communications, vol. COM-28, No. 1, Jan. 1980, pp. 84-95. |
Lindeberg, et al., "Shape-Adapted Smoothing in Estimation of 3-D Depth Cues from Affine Distortions of Local 2-D Brightness Structure", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.55.5090&rep=rep1&type=pdf>>, Springer-Verlag New York, Proceedings of European Conference on Computer Vision, Stockholm, Sweden, vol. 1, 1994, pp. 389-400. |
Lloyd, "Least Squares Quantization in PCM", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=8A3C215650DD1680BE51B35B421D21D7?doi=10.1.1.131.1338&rep=rep1&type=pdf>>, IEEE Transactions on Information Theory, vol. IT-28, No. 2, Mar. 1982, pp. 129-137. |
Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.3843&rep=rep1&type=pdf>>, Kluwer Academic Publishers, Hingham, MA, vol. 60, No. 2, International Journal of Computer Vision, 2004, pp. 91-110. |
Magalhaes, et al., "High-Dimensional Visual Vocabularies for Image Retrieval", retrieved on Jul. 7, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.7027&rep=rep1&type=pdf>>, ACM, Proceedings of Conference on Research and Development in Information Retrieval, Amsterdam, NL, Jul. 27, 2007, pp. 815-816. |
Mairal, et al., "Online Dictionary Learning for Sparse Coding", retrieved on Jul. 7, 2010 at <<http://www.di.ens.fr/˜fbach/mairal—icm109.pdf>>, ACM, Proceedings of International Conference on Machine Learning, Montreal, CA, vol. 382, 2009, pp. 1-8. |
Matas, et al., "Robust Wide Baseline Stereo from Maximally Stable Extremal Regions", retrieved on Jul. 30, 2010 at <<http://cmp.felk.cvut.cz/˜matas/papers/matas-bmvc02.pdf>>, Proceedings of British Machine Vision Conference (BMVC), Cardiff, UK, 2002, pp. 384-393. |
Max, "Quantizing for Minimum Distortion", retrieved on Jul. 30, 2010 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1057548&userType=inst>>, IEEE Transactions on Information Theory, vol. 6, No. 3, Mar. 1960, pp. 7-12. |
Mehrotra, et al., "Feature-Based Retrieval of Similar Shapes", retrieved on Jul. 30, 2010 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=344072>>, IEEE Computer Society, Proceedings of International Conference on Data Engineering, 1993, pp. 108-115. |
Mikolajczyk, et al., "A Comparison of Affine Region Detectors", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.595&rep=repl&type=pdf>>, Kluwer Academic Publishers Hingham, MA, International Journal of Computer Vision, vol. 65, No. 1-2, 2005, pp. 43-72. |
Mikolajczyk, et al., "A performance evaluation of local descriptors", retrieved on Jul. 30, 2010 at <<http://www.ai.mit.edu/courses/6.891/handouts/mikolajczyk—cvpr2003.pdf>>, IEEE Computer Society, Transactions on Pattern Analysis and Machine Intelligence, vol. 27, No. 10, 2005, pp. 1615-1630. |
Mikolajczyk, et al., "Scale and Affine Invariant Interest Point Detectors", retrieved on Jul. 30, 2010 at <<http://www.robots.ox.ac.uk/~vgg/research/affine/det-eval-files/mikolajczyk-ijcv2004.pdf>>, Kluwer Academic Publishers, The Netherlands, International Journal of Computer Vision, vol. 60, No. 1, 2004, pp. 63-86. |
Murthy, et al., "Content Based Image Retrieval using Hierarchical and K-Means Clustering Techniques", retrieved on Jul. 30, 2010 at <<http://www.ijest.info/docs/IJEST10-02-03-13.pdf>>, International Journal of Engineering Science and Technology, vol. 2, No. 3, 2010, pp. 209-212. |
Nister, et al., "Scalable Recognition with a Vocabulary Tree", retrieved on Jul. 7, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.9520&rep=rep1&type=pdf>>, IEEE Computer Society, Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2006, pp. 2161-2168. |
Obdrzalek, et al., "Sub-linear Indexing for Large Scale Object Recognition", retrieved on Jul. 30, 2010 at <<http://cmp.felk.cvut.cz/˜matas/papers/obdrzalek-tree-bmvc05.pdf>>, Proceedings of the British Machine Vision Conference (BMVC), London, UK, vol. 1, Sep. 2005, pp. 1-10. |
Office Action for U.S. Appl. No. 12/938,310, mailed on Apr. 11, 2012, Linjun Yang, "Adaptive Image Retrieval Database," 12 pages. |
Office action for U.S. Appl. No. 12/938,310, mailed on Oct. 25, 2012, Yang et al., "Adaptive Image Retrieval Database", 11 pages. |
Patane, et al., "The Enhanced LBG Algorithm", retrieved on Jul. 30, 2010 at <<http://www.google.com.sg/url?sa=t&source=web&cd=1&ved=0CBcQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload% 3Fdoi%3D10.1.1.74.1995%26rep%3Drep1%26type%3Dpdf&rct=j&q=The%20enhanced%20LBG%20algorithm&ei=QXpSTOyFEoyevQOQ35Qa&usg=AFQjCNGkfxm5Kgm4BalKO42-FpgsDADtyw>>, Pergamon, Neural Networks, vol. 14, 2001, pp. 1219-1237. |
Qi et al., "Image Retrieval Using Transaction-Based and SVM-Based Learning in Relevance Feedback Sessions," Image Analysis and Recognition; (Lecture Notes in Computer Science), Aug. 22, 2007, Heidelbert, Berlin, pp. 638-649. |
Qian, et al., "Gaussian Mixture Model for Relevance Feedback in Image Retrieval", at <<http://research.microsoft.com/asia/dload—files/group/mcomputing/2003P/ICME02-qf.pdf>>, In Proceedings of International Conference on Multimedia and Expo (ICME '02), Aug. 2002, pp. 1-4. |
Rothganger, et al., "3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints", retrieved on Jul. 30, 2010 at <<http://www-cvr.ai.uiuc.edu/ponce—grp/publication/paper/ijcv04d.pdf>>, Kluwer Academic Publishers Hingham, MA, International Journal of Computer Vision, vol. 66, No. 3, Mar. 2006, pp. 231-259. |
Samet, "The Quadtree and Related Hierarchical Data Structure", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.75.5407&rep=rep1&type=pdf>>, ACM, Computing Surveys, vol. 16, No. 2, Jun. 1984, pp. 187-260. |
Sawhney, et al., "Efficient Color Histogram Indexing", retrieved on Jul. 30, 2010 at <<http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=413532>>, IEEE. Proceedings of International Conference on Image Processing (ICIP), Austin, TX, vol. 2, Nov. 1994, pp. 66-70. |
Schmid, et al., "Evaluation of Interest Point Detectors", retrieved on Jul. 30, 2010 at <<http://cs.grnu.edu/˜zduric/cs774/Papers/Schmid-Evaluation-IJCV.pdf>>, Kluwer Academic Publishers, The Netherlands, International Journal of Computer Vision, vol. 37, No. 2, 2000, pp. 151-172. |
Sclaroff et al., "Unifying Textual and Visual Cues for Content-Based Image Retrieval on the World Wide Web", at <<http://www.csai.unipa.it/lacascia/papers/cviu99.pdf>>, Academic Press, vol. 75, No. 1/2, Aug. 1999, pp. 86-98. |
Sellis, et al., "The R+-Tree: A Dynamic Index for Multidimensional Objects", retrieved on Jul. 30, 2010 at <<http://www.vldb.org/conf/1987/P507.PDF>>, Proceedings of the Conference on Very Large Data Bases, Brighton, 1987, pp. 507-518. |
Sivic, et al., "Video Google: A Text Retrieval Approach to Object Matching in Videos", retrieved on Jul. 30, 2010 at <<http://www.robots.ox.ac.uk/˜vgg/publications/papers/sivic03.pdf>>, IEEE Computer Society, Proceedings of International Conference on Computer Vision (ICCV), vol. 2, 2003, pp. 1-8. |
Sproull, "Refinements to Nearest-Neighbor Searching in k-Dimensional Trees", Springer-Verlag NY, Algorithmica, vol. 6, 1991, pp. 579-589. |
Torres et al., "Semantic Image Retrieval Using Region-Base Relevance Feedback," Adaptive Multimedia Retrieval: User, Context, and Feedback (Lecture Notes in Computer Science; LNCS), Heidelberg, Berlin, 2007, pp. 192-206. |
Tuytelaars, et al., "Matching Widely Separated Views Based on Affine Invariant Regions", retrieved on Jul. 30, 2010 at <<http://www.vis.uky.edu/˜dnister/Teaching/CS684Fall2005/tuytelaars—ijcv2004.pdf>>, Kluwer Academic Publishers, The Netherlands, International Journal of Computer Vision, vol. 59, No. 1, 2004, pp. 61-85. |
van Rijsbergen, "Information Retrieval", Butterworth-Heinemann, 1979, pp. 1-151. |
White, et al., "Similarity Indexing: Algorithms and Performance", retrieved on Jul. 30, 2010 at <<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.48.5758&rep=rep1&type=pdf>>, Proceedings of Conference on Storage and Retrieval for Image and Video Databases (SPIE), vol. 2670, San Jose, CA, 1996, pp. 62-75. |
Yang et al., "Learning Image Similarities and Categories from Content Analysis and Relevance Feedback," Proceedings ACM Multimedia 2000 Workshops, Marina Del Rey, CA, Nov. 4, 2000, vol. CONF. 8, pp. 175-178. |
Yang, et al., "Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification", retrieved on Jul. 7, 2010 at <<http://www.ifp.illinois.edu/˜jyang29/papers/CVPR09-ScSPM.pdf>>, IEEE Computer Society, Conference on Computer Vision and Pattern Recognition (CVPR), 2009, Miami, FLA, pp. 1-8. |
Yu, et al., "Adaptive Document Clustering", retrieved on Jul. 30, 2010 at <<http://74.125.155.132/scholar?q=cache: nleqYBgXXhMJ:scholar.google.com/&h1=en&as—sdt=2000>>, ACM, Proceedings of International Conference on Research and Development in Information Retrieval, Montreal, Quebec, 1985, pp. 197-203. |
Zhou, et al., "Unifying Keywords and Visual Contents in Image Retrieval", at <<http://www.ifp.uiuc.edu/˜xzhou2/Research/papers/Selected—papers/IEEE—MM.pdf>>, IEEE, 2002, pp. 11. |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10007679B2 (en) | 2008-08-08 | 2018-06-26 | The Research Foundation For The State University Of New York | Enhanced max margin learning on multimodal data mining in a multimedia database |
US20130142401A1 (en) * | 2011-12-05 | 2013-06-06 | Canon Kabushiki Kaisha | Image processing apparatus and image processing method |
US9245206B2 (en) * | 2011-12-05 | 2016-01-26 | Canon Kabushiki Kaisha | Image processing apparatus and image processing method |
US20140355880A1 (en) * | 2012-03-08 | 2014-12-04 | Empire Technology Development, Llc | Image retrieval and authentication using enhanced expectation maximization (eem) |
US9158791B2 (en) * | 2012-03-08 | 2015-10-13 | New Jersey Institute Of Technology | Image retrieval and authentication using enhanced expectation maximization (EEM) |
US9870634B2 (en) * | 2013-09-25 | 2018-01-16 | Heartflow, Inc. | Systems and methods for controlling user repeatability and reproducibility of automated image annotation correction |
US20170132826A1 (en) * | 2013-09-25 | 2017-05-11 | Heartflow, Inc. | Systems and methods for controlling user repeatability and reproducibility of automated image annotation correction |
US10546403B2 (en) | 2013-09-25 | 2020-01-28 | Heartflow, Inc. | System and method for controlling user repeatability and reproducibility of automated image annotation correction |
US11742070B2 (en) | 2013-09-25 | 2023-08-29 | Heartflow, Inc. | System and method for controlling user repeatability and reproducibility of automated image annotation correction |
US10319035B2 (en) | 2013-10-11 | 2019-06-11 | Ccc Information Services | Image capturing and automatic labeling system |
US20170364740A1 (en) * | 2016-06-17 | 2017-12-21 | International Business Machines Corporation | Signal processing |
US9928408B2 (en) * | 2016-06-17 | 2018-03-27 | International Business Machines Corporation | Signal processing |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
Also Published As
Publication number | Publication date |
---|---|
US20120114248A1 (en) | 2012-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8463045B2 (en) | Hierarchical sparse representation for image retrieval | |
US9201903B2 (en) | Query by image | |
JP5749279B2 (en) | Join embedding for item association | |
US7693865B2 (en) | Techniques for navigational query identification | |
USRE47340E1 (en) | Image retrieval apparatus | |
Zhang et al. | Query specific rank fusion for image retrieval | |
US20180276250A1 (en) | Distributed Image Search | |
US9460122B2 (en) | Long-query retrieval | |
US20130110829A1 (en) | Method and Apparatus of Ranking Search Results, and Search Method and Apparatus | |
US9317533B2 (en) | Adaptive image retrieval database | |
US11138479B2 (en) | Method for valuation of image dark data based on similarity hashing | |
US20120310864A1 (en) | Adaptive Batch Mode Active Learning for Evolving a Classifier | |
WO2013129580A1 (en) | Approximate nearest neighbor search device, approximate nearest neighbor search method, and program | |
Lin et al. | Association rule mining with a correlation-based interestingness measure for video semantic concept detection | |
US11755671B2 (en) | Projecting queries into a content item embedding space | |
Chang et al. | Semantic clusters based manifold ranking for image retrieval | |
Tang et al. | Remote sensing image retrieval based on semi-supervised deep hashing learning | |
CN109902129A (en) | Insurance agent's classifying method and relevant device based on big data analysis | |
CN117349512B (en) | User tag classification method and system based on big data | |
Kozhushko et al. | Using hierarchical temporal memory for document ranking system identification | |
CN115344734A (en) | Image retrieval method, image retrieval device, electronic equipment and computer-readable storage medium | |
Mali et al. | Image Retrieval using Hash Code and Relevance Feedback Technique | |
Saad et al. | Mining visual web knowledge utilizing multiple classifier architecture | |
Mirza-Mohammadi et al. | Ranking error-correcting output codes for class retrieval | |
Hosseini | Optimizing the Construction of Information Retrieval Test Collections |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: MICROSOFT CORPORATION, WASHINGTON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YANG, LINJUN;TIAN, QI;NI, BINGBING;SIGNING DATES FROM 20101009 TO 20101010;REEL/FRAME:025352/0530 |
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
AS | Assignment |
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0001 Effective date: 20141014 |
FPAY | Fee payment |
Year of fee payment: 4 |
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |