WO2014160433A2 - Systems and methods for classifying objects in digital images captured using mobile devices - Google Patents

Systems and methods for classifying objects in digital images captured using mobile devices

Info

Publication number
WO2014160433A2
Authority
WO
WIPO (PCT)
Prior art keywords
digital image
recited
representation
image
determining
Prior art date
Application number
PCT/US2014/026597
Other languages
French (fr)
Other versions
WO2014160433A3 (en)
Inventor
Jan Willers Amtrup
Anthony Macciola
Stephen Michael Thompson
Jiyong Ma
Alexander Shustorovich
Christopher W. Thrasher
Original Assignee
Kofax, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kofax, Inc. filed Critical Kofax, Inc.
Priority to EP14773721.7A priority Critical patent/EP2974261A4/en
Priority to CN201480014229.9A priority patent/CN105308944A/en
Priority to JP2016502192A priority patent/JP2016516245A/en
Publication of WO2014160433A2 publication Critical patent/WO2014160433A2/en
Publication of WO2014160433A3 publication Critical patent/WO2014160433A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/416 Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/142 Image acquisition using hand-held instruments; Constructional details of the instruments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/22 Character recognition characterised by the type of writing
    • G06V30/224 Character recognition characterised by the type of writing of printed characters having additional code marks or containing code marks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/247 Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects; Quadrilaterals, e.g. trapezoids

Definitions

  • the present invention relates to mobile image capture and image processing, and more particularly to capturing and processing digital images using a mobile device, and classifying objects detected in such digital images.
  • a still further challenge is presented by the nature of mobile capture components (e.g. cameras on mobile phones, tablets, etc.). Where conventional scanners are capable of faithfully representing the physical document in a digital image, critically maintaining aspect ratio, dimensions, and shape of the physical document in the digital image, mobile capture components are frequently incapable of producing such results.
  • images of documents captured by a camera present a new line of processing issues not encountered when dealing with images captured by a scanner. This is in part due to the inherent differences in the way the document image is acquired, as well as the way the devices are constructed.
  • the way that some scanners work is to use a transport mechanism that creates a relative movement between paper and a linear array of sensors. These sensors create pixel values of the document as it moves by, and the sequence of these captured pixel values forms an image. Accordingly, there is generally a horizontal or vertical consistency up to the noise in the sensor itself, and it is the same sensor that provides all the pixels in the line.
  • cameras have many more sensors in a nonlinear array, e.g., typically arranged in a rectangle. Thus, all of these individual sensors are independent, and render image data that is not typically of horizontal or vertical consistency.
  • cameras introduce a projective effect that is a function of the angle at which the picture is taken. For example, with a linear array like in a scanner, even if the transport of the paper is not perfectly orthogonal to the alignment of sensors and some skew is introduced, there is no projective effect like in a camera. Additionally, with camera capture, nonlinear distortions may be introduced because of the camera optics.
  • a method includes: receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.
  • a method includes: generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices; classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and determining one or more object features of the object based at least in part on the particular object class; and performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
  • a system includes a processor; and logic in and/or executable by the processor to cause the processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
  • a computer program product includes a computer readable storage medium having program code embodied therewith, the program code readable/executable by a processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
  • FIG. 1 illustrates a network architecture, in accordance with one embodiment.
  • FIG. 2 shows a representative hardware environment that may be associated with the servers and/or clients of FIG. 1, in accordance with one embodiment.
  • FIG. 3A depicts a digital image of an object, according to one embodiment.
  • FIG. 3B depicts a schematic representation of the digital image shown in FIG. 3A divided into a plurality of sections for generating a first representation of the digital image, according to one embodiment.
  • FIG. 3C depicts a first representation of the digital image shown in FIG. 3A, the first representation being characterized by a reduced resolution relative to the resolution of the digital image.
  • FIG. 4A is a schematic representation of a plurality of subregions depicted in a digital image of a document, according to one embodiment.
  • FIG. 4B is a masked representation of the digital image shown in FIG. 4A, according to one embodiment.
  • FIG. 4C is a masked representation of the digital image shown in FIG. 4A, according to one embodiment.
  • FIG. 4D is a masked representation of the digital image shown in FIG. 4A, according to one embodiment.
  • FIG. 5 is a flowchart of a method, according to one embodiment.
  • FIG. 6 is a flowchart of a method, according to one embodiment.
  • the present application refers to image processing of images (e.g. pictures, figures, graphical schematics, single frames of movies, videos, films, clips, etc.) captured by cameras, especially cameras of mobile devices.
  • a mobile device is any device capable of receiving data without having power supplied via a physical connection (e.g. wire, cord, cable, etc.) and capable of receiving data without a physical data connection (e.g. wire, cord, cable, etc.).
  • Mobile devices within the scope of the present disclosures include exemplary devices such as a mobile telephone, smartphone, tablet, personal digital assistant, iPod®, iPad®, BLACKBERRY® device, etc.
  • the presently disclosed mobile image processing algorithms can be applied, sometimes with certain modifications, to images coming from scanners and multifunction peripherals (MFPs).
  • images processed using the presently disclosed processing algorithms may be further processed using conventional scanner processing algorithms, in some approaches.
  • One benefit of using a mobile device is that with a data plan, image processing and information processing based on captured images can be done in a much more convenient, streamlined and integrated way than previous methods that relied on presence of a scanner.
  • the use of mobile devices as document(s) capture and/or processing devices has heretofore been considered unfeasible for a variety of reasons.
  • an image may be captured by a camera of a mobile device.
  • the term "camera” should be broadly interpreted to include any type of device capable of capturing an image of a physical object external to the device, such as a piece of paper.
  • the term “camera” does not encompass a peripheral scanner or multifunction device. Any type of camera may be used. Preferred embodiments may use cameras having a higher resolution, e.g. 8 MP or more, ideally 12 MP or more.
  • the image may be captured in color, grayscale, black and white, or with any other known optical effect.
  • image as referred to herein is meant to encompass any type of data corresponding to the output of the camera, including raw data, processed data, etc.
  • a method includes: receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.
  • a method includes: generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices; classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and determining one or more object features of the object based at least in part on the particular object class; and performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
  • a system includes a processor; and logic in and/or executable by the processor to cause the processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
  • a computer program product includes a computer readable storage medium having program code embodied therewith, the program code readable/executable by a processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
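As an illustration only, the classification flow summarized above (reduced-resolution first representation, first feature vector, comparison against reference feature matrices) might be sketched as follows in Python. The block-averaging downscale, the flattened feature vector, and the nearest-reference Euclidean decision are assumptions chosen for brevity, not the patented implementation; reference_matrices is a hypothetical mapping from class labels to reference arrays.

```python
import numpy as np

def first_representation(image, out_h=16, out_w=16):
    """Downscale the image by averaging pixel blocks (reduced-resolution representation)."""
    h, w, c = image.shape
    rep = np.zeros((out_h, out_w, c))
    for i in range(out_h):
        for j in range(out_w):
            block = image[i * h // out_h:(i + 1) * h // out_h,
                          j * w // out_w:(j + 1) * w // out_w]
            rep[i, j] = block.reshape(-1, c).mean(axis=0)
    return rep

def feature_vector(representation):
    """Flatten the reduced-resolution representation into a first feature vector."""
    return representation.ravel()

def classify(image, reference_matrices):
    """Compare the feature vector to each reference feature matrix; return the closest class."""
    v = feature_vector(first_representation(image))
    distances = {label: np.linalg.norm(v - ref.ravel())
                 for label, ref in reference_matrices.items()}
    return min(distances, key=distances.get)
```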
  • aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as "logic," "circuit," "module" or "system." Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, processor, or device.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband, as part of a carrier wave, an electrical connection having one or more wires, an optical fiber, etc. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • LAN local area network
  • WAN wide area network
  • These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
  • FIG. 1 illustrates an architecture 100, in accordance with one embodiment.
  • a plurality of remote networks 102 are provided including a first remote network 104 and a second remote network 106.
  • a gateway 101 may be coupled between the remote networks 102 and a proximate network 108.
  • the networks 104, 106 may each take any form including, but not limited to, a LAN, a WAN such as the Internet, public switched telephone network (PSTN), internal telephone network, etc.
  • the gateway 101 serves as an entrance point from the remote networks 102 to the proximate network 108.
  • the gateway 101 may function as a router, which is capable of directing a given packet of data that arrives at the gateway 101, and a switch, which furnishes the actual path in and out of the gateway 101 for a given packet.
  • at least one data server 114 is coupled to the proximate network 108, and is accessible from the remote networks 102 via the gateway 101.
  • the data server(s) 114 may include any type of computing device/groupware. Coupled to each data server 114 is a plurality of user devices 116. Such user devices 116 may include a desktop computer, laptop computer, hand-held computer, printer or any other type of logic. It should be noted that a user device may also be directly coupled to any of the networks, in one embodiment.
  • a peripheral 120 or series of peripherals 120 may be coupled to one or more of the networks 104, 106, 108. It should be noted that databases and/or additional components may be utilized with, or integrated into, any type of network element coupled to the networks 104, 106, 108. In the context of the present description, a network element may refer to any component of a network.
  • methods and systems described herein may be implemented with and/or on virtual systems and/or systems which emulate one or more other systems, such as a UNIX system which emulates an IBM z/OS environment, a UNIX system which virtually hosts a MICROSOFT WINDOWS environment, a MICROSOFT WINDOWS system which emulates an IBM z/OS environment, etc.
  • This virtualization and/or emulation may be enhanced through the use of VMWARE software, in some embodiments.
  • one or more networks 104, 106, 108 may represent a cluster of systems commonly referred to as a "cloud."
  • in cloud computing, shared resources, such as processing power, peripherals, software, data, servers, etc., are provided to any system in the cloud in an on-demand relationship, thereby allowing access and distribution of services across many computing systems.
  • Cloud computing typically involves an Internet connection between the systems operating in the cloud, but other techniques of connecting the systems may also be used.
  • FIG. 2 shows a representative hardware environment associated with a user device 116 and/or server 114 of FIG. 1, in accordance with one embodiment.
  • Such figure illustrates a typical hardware configuration of a workstation having a central processing unit 210, such as a microprocessor, and a number of other units interconnected via a system bus 212.
  • a central processing unit 210 such as a microprocessor
  • the workstation shown in FIG. 2 includes a Random Access Memory (RAM) 214, Read Only Memory (ROM) 216, an I/O adapter 218 for connecting peripheral devices such as disk storage units 220 to the bus 212, a user interface adapter 222 for connecting a keyboard 224, a mouse 226, a speaker 228, a microphone 232, and/or other user interface devices such as a touch screen and a digital camera (not shown) to the bus 212, a communication adapter 234 for connecting the workstation to a communication network 235 (e.g., a data processing network) and a display adapter 236 for connecting the bus 212 to a display device 238.
  • RAM Random Access Memory
  • ROM Read Only Memory
  • I/O adapter 218 for connecting peripheral devices such as disk storage units 220 to the bus 212
  • the workstation may have resident thereon an operating system such as the Microsoft Windows® Operating System (OS), a MAC OS, a UNIX OS, etc. It will be appreciated that a preferred embodiment may also be implemented on platforms and operating systems other than those mentioned.
  • OS Microsoft Windows® Operating System
  • a preferred embodiment may be written using JAVA, XML, C, and/or C++ language, or other programming languages, along with an object oriented programming methodology.
  • Object oriented programming (OOP), which has become increasingly used to develop complex applications, may be used.
  • An application may be installed on the mobile device, e.g., stored in a nonvolatile memory of the device.
  • the application includes instructions to perform processing of an image on the mobile device.
  • the application includes instructions to send the image to a remote server such as a network server.
  • the application may include instructions to decide whether to perform some or all processing on the mobile device and/or send the image to the remote site.
  • an edge detection algorithm proceeds from the boundaries of a digital image toward a central region of the image, looking for points that are sufficiently different from what is known about the properties of the background.
  • the background in the images captured by even the same mobile device may be different every time, so a new technique to identify the document(s) in the image is provided.
  • Finding page edges within a camera-captured image helps to accommodate important differences in the properties of images captured using mobile devices as opposed, e.g., to scanners. For example, due to projective effects the image of a rectangular document in a photograph may not appear truly rectangular, and opposite sides of the document in the image may not have the same length. Second, even the best lenses have some non-linearity resulting in straight lines within an object, e.g. straight sides of a substantially rectangular document, appearing slightly curved in the captured image of that object. Third, images captured using cameras overwhelmingly tend to introduce uneven illumination effects in the captured image. This unevenness of illumination makes even a perfectly uniform background of the surface against which a document may be placed appear in the image with varied brightness, and often with shadows, especially around the page edges if the page is not perfectly flat.
  • the current algorithm utilizes one or more of the following functionalities.
  • the frame of the image contains the digital representation of the document with margins of the surrounding background.
  • the search for individual page edges may be performed on a step-over approach analyzing rows and columns of the image from outside in.
  • the step-over approach may define a plurality of analysis windows within the digital image. As understood herein, analysis windows may include one or more "background windows," i.e. windows encompassing only pixels depicting the background of the digital image, as well as one or more "test windows," i.e. windows encompassing pixels depicting the background of the digital image, the digital representation of the document, or both.
  • the digital representation of the document may be detected in the digital image by defining a first analysis window, i.e. a background analysis window, in a margin of the image corresponding to the background of the surface upon which the document is placed.
  • a first analysis window i.e. a background analysis window
  • a plurality of small analysis windows e.g. test windows
  • one or more distributions of one or more statistical properties descriptive of the background may be estimated.
  • a next step in detecting boundaries of the digital representation of the document may include defining a plurality of test windows within the digital image, and analyzing the corresponding regions of the digital image. For each test window one or more statistical values descriptive of the corresponding region of the image may be calculated. Further, these statistical values may be compared to a corresponding distribution of statistics descriptive of the background.
  • the plurality of test windows may be defined along a path, particularly a linear path.
  • the plurality of test windows may be defined in a horizontal direction and/or a vertical direction, e.g. along rows and columns of the digital image.
  • a stepwise progression may be employed to define the test windows along the path and/or between the rows and/or columns. In some embodiments, as will be appreciated by one having ordinary skill in the art upon reading the present descriptions, utilizing a stepwise progression may advantageously increase the computational efficiency of document detection processes.
  • the magnitude of the starting step may be estimated based on the resolution or pixel size of the image, in some embodiments, but this step may be reduced if advantageous for reliable detection of document sides, as discussed further below.
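A rough sketch of the outside-in, step-over search along a single row, under several assumptions: a grayscale image, a 3×7 test window, a fixed step, and a simple mean-versus-background threshold in place of the full statistical comparison described below. bg_mean and bg_std are assumed to come from a background analysis window.

```python
import numpy as np

def scan_row_for_left_edge(gray, row, bg_mean, bg_std, win_w=7, win_h=3, step=8, k=3.0):
    """Step over a row from the left margin inward; return the column of the first
    test window whose mean brightness deviates markedly from the background."""
    h, w = gray.shape
    top = max(0, row - win_h // 2)
    col = 0
    while col + win_w < w:
        window = gray[top:top + win_h, col:col + win_w]
        if abs(window.mean() - bg_mean) > k * bg_std:   # candidate non-background window
            return col + win_w // 2                      # center of the test window
        col += step                                      # step-over progression
    return None                                          # no transition found in this row
```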
  • the algorithm estimates the distribution of several statistics descriptive of the image properties found in a large analysis window placed within the background surrounding the document.
  • a plurality of small windows may be defined within the large analysis window, and distributions of statistics descriptive of the small test windows may be estimated.
  • a large analysis window is defined in a background region of the digital image, such as a top-left corner of the image.
  • Statistics descriptive of the background pixels may include any statistical value that may be generated from digital image data, such as a minimum value, a maximum value, a median value, a mean value, a spread or range of values, a variance, a standard deviation, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. Values may be sampled from any data descriptive of the digital image, such as brightness values in one or more color channels, e.g. red-green-blue or RGB; cyan-magenta-yellow-black or CMYK; hue-saturation-value or HSV; etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • each of the small analysis windows may comprise a subset of the plurality of pixels within the large analysis window.
  • small analysis windows may be of any size and/or shape capable of fitting within the boundaries of large analysis window.
  • small analysis windows may be characterized by a rectangular shape, and even more preferably a rectangle characterized by being three pixels long in a first direction (e.g. height) and seven pixels long in a second direction (e.g. width).
  • first direction e.g. height
  • second direction e.g. width
  • other small analysis window sizes, shapes, and dimensions are also suitable for implementation in the presently disclosed processing algorithms.
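A hedged sketch of estimating background statistics from all small analysis windows inside a large analysis window placed in a corner of the image. The 40-pixel large window and the 3×7 small window follow the preferred values mentioned above; the grayscale input, unit step, and the restriction to brightness minima, maxima and spreads are simplifying assumptions.

```python
import numpy as np

def background_statistics(gray, x0=0, y0=0, size=40, win_h=3, win_w=7, step=1):
    """Collect distributions of per-window brightness minima, maxima and spreads
    over the small analysis windows inside a large analysis window."""
    large = gray[y0:y0 + size, x0:x0 + size]
    minima, maxima, spreads = [], [], []
    for i in range(0, size - win_h + 1, step):
        for j in range(0, size - win_w + 1, step):
            small = large[i:i + win_h, j:j + win_w]
            minima.append(small.min())
            maxima.append(small.max())
            spreads.append(small.max() - small.min())
    return {"min": np.array(minima), "max": np.array(maxima), "spread": np.array(spreads)}
```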
  • test windows may be employed to analyze an image and detect the boundary of a digital representation of a document depicted in the image.
  • Background windows are used for estimation of original statistical properties of the background and/or reestimation of local statistical properties of the background. Reestimation may be necessary and/or advantageous in order to address artifacts such as uneven illumination and/or background texture variations.
  • statistical estimation may be performed over some or all of a plurality of small analysis window(s) in a large analysis window within the margin outside of the document page in some approaches.
  • Such estimation may be performed using a stepwise movement of a small analysis window within the large analysis window, and the stepwise movement may be made in any suitable increment so as to vary the number of samples taken for a given pixel.
  • an analysis process may define a number of small analysis windows within large analysis window sufficient to ensure each pixel is sampled once.
  • the plurality of small analysis windows defined in this computationally efficient approach would share common borders but not overlap.
  • the analysis process may define a number of small analysis windows within the large analysis window sufficient to ensure each pixel is sampled a maximum number of times, e.g. by reducing the step to produce only a single pixel shift in a given direction between sequentially defined small analysis windows.
  • any step increment may be employed in various embodiments of the presently disclosed processing algorithms, as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • large analysis windows utilized to reestimate statistics of the local background in the digital image, as well as test windows, can be placed in the digital image in any desirable manner.
  • the search for the left side edge in a given row i begins from the calculation of the above-mentioned statistics in a large analysis window adjacent to the frame boundary on the left side of the image, centered around the given row.
  • when encountering a possible non-background test window (e.g. a test window for which the estimated statistics are dissimilar from the distribution of statistics characteristic of the last known local background) as the algorithm progresses from the outer region(s) of the image towards the interior regions thereof, the algorithm may backtrack into a previously determined background region, form a new large analysis window and re-estimate the distribution of background statistics in order to reevaluate the validity of the differences between the chosen statistics within the small analysis window and the local distribution of corresponding statistics within the large analysis window, in some embodiments.
  • a possible non-background test window e.g. a test window for which the estimated statistics are dissimilar from the distribution of statistics characteristic of the last known local background
  • the algorithm may proceed from an outer region of the image to an inner region of the image in a variety of manners. For example, in one approach the algorithm proceeds by defining test windows in a substantially spiral pattern; in other approaches the pattern may be a substantially shingled pattern.
  • the pattern may also be defined by a "sequence mask" laid over part or all of the digital image, such as a checkerboard pattern, a vertically, horizontally, or diagonally striped pattern, concentric shapes, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • analysis windows such as large analysis windows and/or small analysis windows may be defined throughout the digital image in a random manner, a pseudo-random manner, stochastically, etc. according to some defined procedure, as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the algorithm can proceed with a sequence of test windows in any desirable fashion, as long as the path allows backtracking into known background and the path covers the whole image with the desired granularity.
  • recalculating statistics in this manner helps to accommodate any illumination drift inherent to the digital image and/or background, which may otherwise result in false identification of non-background points in the image (e.g. outlier candidate edge points).
  • the algorithm may jump a certain distance further along its path in order to check again and thus bypass small variations in the texture of the background, such as wood grain, scratches on a surface, patterns of a surface, small shadows, etc., as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the algorithm determines whether the point lies on the edge of the shadow (a possibility especially if the edge of the page is raised above the background surface) and tries to get to the actual page edge. This process relies on the observation that shadows usually darken towards the real edge followed by an abrupt brightening of the image.
  • page edge detection does not necessarily involve edge detection per se, i.e. page edge detection according to the present disclosures may be performed in a manner that does not search for a document boundary (e.g. page edge), but rather searches for image characteristics associated with a transition from background to the document.
  • the transition may be characterized by flattening of the off-white brightness levels within a glossy paper, i.e. by changes in texture rather than in average gray or color levels.
  • candidate edge points are essentially the first and the last non-background pixels in each row and column on a grid.
  • to avoid random outliers (e.g. outlier candidate edge points) and to determine which candidate edge points correspond to each side of the page, it is useful in one approach to analyze neighboring candidate edge points.
  • a "point" may be considered any region within the digital image, such as a pixel, a position between pixels (e.g. a point with fractional coordinates such as the center of a 2-pixel by 2-pixel square) a small window of pixels, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • a candidate edge point is associated with the center of a test window (e.g. a 3-pixel by 7-pixel window) that has been found to be characterized by statistics that are determined to be different from the distribution of statistics descriptive of the local background.
  • a "neighboring" candidate edge point, or a "neighboring" pixel, is considered to be a point or pixel, respectively, which is near or adjacent a point or pixel of interest, e.g. a point or pixel positioned at least in part along a boundary of the point or pixel of interest, a point or pixel positioned within a threshold distance of the point or pixel of interest (such as within 2, 10, 64 pixels, etc. in a given direction, within one row of the point or pixel of interest, within one column of the point or pixel of interest), etc., as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the "neighboring" point or pixel may be the closest candidate edge point to the point of interest along a particular direction, e.g. a horizontal direction and/or a vertical direction.
  • Each "good" edge point ideally has at least two immediate neighbors (one on each side) and. does not deviate far from a straight line segment connecting these neighbors and the "good" edge point, e.g. the candidate edge point and the at least two immediately neighboring points may be fit to a linear regression, and the result may be characterized by a coefficient of determination (R 2 ) not less than 0.95.
  • R 2 coefficient of determination
  • the angle of this segment with respect to one or more borders of the digital image, together with its relative location determines whether the edge point is assigned to top, left, right, or bottom side of the page.
  • a candidate edge point and the two neighboring edge points may be assigned to respective corners of a triangle.
  • if the angle of the triangle at the candidate edge point is close to 180 degrees, then the candidate edge point may be considered a "good" candidate edge point. If the angle of the triangle at the candidate edge point deviates from 180 degrees by more than a threshold value (such as by 20 degrees or more), then the candidate edge point may be excluded from the set of "good" candidate edge points.
  • a threshold value such as by 20 degrees or more
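One illustrative way to apply the triangle test described above, using the example 20-degree threshold; the vector-angle computation is an assumed formulation, not necessarily how the disclosure measures the angle at the candidate edge point.

```python
import numpy as np

def is_good_edge_point(prev_pt, pt, next_pt, max_deviation_deg=20.0):
    """Keep a candidate edge point only if the angle it forms with its two
    neighbors stays within max_deviation_deg of a straight (180 degree) line."""
    a = np.array(prev_pt, float) - np.array(pt, float)
    b = np.array(next_pt, float) - np.array(pt, float)
    cos_angle = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return abs(angle - 180.0) <= max_deviation_deg
```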
  • the step of this grid may start from a large number such as 32, but it may be reduced by a factor of two and the search for edge points repeated until there are enough of them to determine the Least Mean Squares (LMS) based equations of page sides (see below). If this process cannot determine the sides reliably even after using all rows and columns in the image, it gives up and the whole image is treated as the page.
  • LMS Least Mean Squares
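A sketch of the coarse-to-fine grid search described above. scan_fn is a hypothetical row/column scanner returning candidate edge points per page side, and the minimum point count is an arbitrary placeholder; only the halve-the-step-and-retry control flow reflects the description.

```python
def collect_edge_points(image, scan_fn, min_points=8, initial_step=32):
    """Scan rows/columns on a coarse grid; halve the grid step until each page
    side has enough candidate edge points for an LMS fit, or give up."""
    step = initial_step
    while step >= 1:
        sides = scan_fn(image, step)   # dict: side name -> list of candidate points
        if all(len(points) >= min_points for points in sides.values()):
            return sides
        step //= 2                     # refine the grid and repeat the search
    return None                        # caller treats the full image frame as the page
```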
  • the equations of page sides are determined as follows, in one embodiment.
  • the algorithm fits the best LMS straight line to each of the sides using the strategy of throwing out worst outliers until all the remaining supporting edges lie within a small distance from the LMS line. For example, a point with the largest distance from a substantially straight line connecting a plurality of candidate edge points along a particular boundary of the document may be designated the "worst" outlier. This procedure may be repeated iteratively to designate and/or remove one or more "worst" outliers from the plurality of candidate edge points.
  • the distance by which a candidate edge point may deviate from the line connecting the plurality of candidate edge points is based at least in part on the size and/or resolution of the digital image.
  • the algorithm may attempt to fit the best second-degree polynomial (parabola) to the same original candidate points.
  • the algorithmic difference between finding the best parabola vs. the best straight line is minor:
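An illustrative least-squares fit with iterative worst-outlier removal that covers both the straight-line (degree 1) and parabola (degree 2) cases; np.polyfit stands in for the LMS machinery, the residual threshold is an assumed value, and y is fit as a function of x (a vertical side would be fit with the roles of x and y swapped).

```python
import numpy as np

def fit_side(points, degree=1, max_residual=2.0):
    """Fit a line (degree=1) or parabola (degree=2) to candidate edge points,
    repeatedly discarding the worst outlier until all remaining points lie
    within max_residual pixels of the fit."""
    pts = np.asarray(points, float)
    while len(pts) > degree + 1:
        coeffs = np.polyfit(pts[:, 0], pts[:, 1], degree)
        residuals = np.abs(np.polyval(coeffs, pts[:, 0]) - pts[:, 1])
        worst = residuals.argmax()
        if residuals[worst] <= max_residual:
            return coeffs
        pts = np.delete(pts, worst, axis=0)   # throw out the worst outlier and refit
    return np.polyfit(pts[:, 0], pts[:, 1], degree)
```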
  • Intersections of the four found sides of the document may be calculated in order to find the corners of the (possibly slightly curved) page tetragon, discussed in further detail below. In the preferred implementation, in order to do this it is necessary to consider three cases: calculating intersections of two straight lines, calculating intersections of a straight line and a parabola, and calculating intersections of two parabolas.
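For the corner computation, a simplified sketch can treat two adjacent sides as polynomials in the same variable, so that the line/line, line/parabola and parabola/parabola cases all reduce to finding the real roots of a polynomial difference. A full implementation must additionally handle the mixed parameterization of horizontal sides (y as a function of x) and vertical sides (x as a function of y); that bookkeeping is omitted here.

```python
import numpy as np

def intersect_sides(coeffs_a, coeffs_b):
    """Intersect two sides given as polynomial coefficient arrays y = p(x):
    the difference of the two polynomials is itself a polynomial, so all
    line/parabola combinations reduce to a root-finding problem."""
    diff = np.polysub(coeffs_a, coeffs_b)
    roots = np.roots(diff)
    xs = roots[np.isreal(roots)].real          # keep only real intersections
    return [(x, np.polyval(coeffs_a, x)) for x in xs]
```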
  • the tetragon is preferably not too small (e.g., below a predefined threshold of any desired value, such as 25% of the total area of the image), the corners of the tetragon preferably do not lie too far outside of the frame of the image (e.g. not more than 100 pixels away), and the corners themselves should preferably be interpretable as top-left, top-right, bottom-left and bottom-right with diagonals intersecting inside of the tetragon, etc. If these constraints are not met, a given page detection result may be rejected, in some embodiments.
  • the algorithm may determine a target rectangle.
  • Target rectangle width and height may be set to the average of top and bottom sides of the tetragon and the average of left and right sides respectively.
  • the angle of skew of the target rectangle may be set to zero so that the page sides will become horizontal and vertical.
  • the skew angle may be set to the average of the angles of top and bottom sides to the horizontal axis and those of the left and right sides to the vertical axis.
  • the center of the target rectangle may be designated so as to match the average of the coordinates of the four corners of the tetragon; otherwise the center may be calculated so that the target rectangle ends up in the top left of the image frame, in additional embodiments.
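A minimal sketch of the target rectangle sizing described above, assuming corners are already ordered top-left, top-right, bottom-right, bottom-left and that the zero-skew variant is used:

```python
import numpy as np

def target_rectangle(corners):
    """Size the target rectangle from the averages of opposite tetragon side
    lengths and centre it on the average of the four corner coordinates."""
    tl, tr, br, bl = (np.array(c, float) for c in corners)
    width = (np.linalg.norm(tr - tl) + np.linalg.norm(br - bl)) / 2.0   # avg of top and bottom
    height = (np.linalg.norm(bl - tl) + np.linalg.norm(br - tr)) / 2.0  # avg of left and right
    cx, cy = (tl + tr + br + bl) / 4.0
    return (cx - width / 2, cy - height / 2, width, height)  # x, y, w, h with zero skew
```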
  • page detection includes performing a method such as that described herein.
  • the method may be performed in any environment, including those described herein and represented in any of the Figures provided with the present disclosures.
  • a plurality of candidate edge points corresponding to a transition from a digital image background to the digital representation of the document are defined.
  • defining the plurality of candidate edge points may include one or more additional operations, such as those described below.
  • a large analysis window is defined within the digital image.
  • a first large analysis window is defined in a region depicting a plurality of pixels of the digital image background, but not depicting the non-background (e.g. the digital representation of the document), in order to obtain information characteristic of the digital image background for comparison and contrast to information characteristic of the non-background (e.g. the digital representation of the document, such as background statistics discussed in further detail below).
  • the first large analysis window may be defined in a corner (such as a top-left corner) of the digital image.
  • the first large analysis window may be defined in any part of the digital image without departing from the scope of the present disclosures.
  • the large analysis window may be any size and/or characterized by any suitable dimensions, but in preferred embodiments the large analysis window is approximately forty pixels high and approximately forty pixels wide.
  • the large analysis window may be defined in a corner region of the digital image.
  • the large analysis window may be defined in a region comprising a plurality of background pixels and not including pixels corresponding to the digital representation of the document. Moreover, the large analysis window may be defined in the corner of the digital image, in some approaches.
  • a plurality of small analysis windows may be defined within the digital image, such as within the large analysis window.
  • the small analysis windows may overlap at least in part with one or more other small analysis windows, such as to be characterized by comprising one or more overlap regions.
  • all possible small analysis windows are defined within the large analysis window.
  • small analysis windows may be defined within any portion of the digital image, and preferably small analysis windows may be defined such that each small analysis window is characterized by a single center pixel.
  • one or more statistics are calculated for one or more small analysis windows (e.g. one or more small analysis windows within a large analysis window) and one or more distributions of corresponding statistics are estimated (e.g. a distribution of statistics estimated across a plurality of small analysis windows).
  • distributions of statistics may be estimated across one or more large analysis window(s) and optionally merged.
  • values may be descriptive of any feature associated with the background of the digital image, such as background brightness values, background color channel values, background texture values, background tint values, background contrast values, background sharpness values, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • statistics may include a minimum, a maximum and/or a range of brightness values in one or more color channels of the plurality of pixels depicting the digital image background over the plurality of small windows within the large analysis window.
  • one or more distributions of background statistics are estimated.
  • by estimating the distribution(s) of statistics, one may obtain descriptive distribution(s) that characterize the properties of the background of the digital image within, for example, a large analysis window.
  • the distribution(s) preferably correspond to the background statistics calculated for each small analysis window, and may include, for example, a distribution of brightness minima, a distribution of brightness maxima, etc., from which one may obtain distribution statistical descriptors such as the minimum and/or maximum of minimum brightness values, the minimum and/or maximum of maximum brightness values, minimum and/or maximum spread of brightness values, minimum and/or maximum of minimum color channel values, minimum and/or maximum of maximum color channel values, minimum and/or maximum spread of color channel values, etc. as would be appreciated by one having ordinary skill in the art upon reading the present descriptions.
  • any of the calculated background statistics e.g. for brightness values, color channel values, contrast values, texture values, tint values, sharpness values, etc.
  • any value descriptive of the distribution may be employed without departing from the scope of the present disclosures.
  • a large analysis window is defined within the digital image.
  • window shapes may be defined positively, by setting the boundaries of the window as a portion of the digital image, or negatively, e.g. by applying a mask to the digital image and defining the regions of the digital image not masked as the analysis window.
  • windows may be defined according to a pattern, especially in embodiments where windows are negatively defined by applying a mask to the digital image. Of course, other manners for defining the windows may be employed without departing from the scope of the present disclosures.
  • each analysis window statistic corresponds to a distribution of background statistics estimated for the large analysis window.
  • maximum brightness corresponds to distribution of background brightness maxima
  • minimum brightness corresponds to distribution of background brightness minima
  • brightness spread corresponds to a distribution of background brightness spreads, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • determining whether a statistically significant difference exists may be performed using any known statistical significance evaluation method or metric, such as a p-value, a z-test, a chi-squared correlation, etc. as would be appreciated by a skilled artisan reading the present descriptions.
  • one or more points (e.g. the centermost pixel or point in the analysis window) for which a statistically significant difference exists between a value describing the pixel and the corresponding distribution of background statistics is designated as a candidate edge point.
  • the designating may be accomplished by any suitable method known in the art, such as setting a flag corresponding to the pixel, storing coordinates of the pixel, making an array of pixel coordinates, altering one or more values describing the pixel (such as brightness, hue, contrast, etc.), or any other suitable means.
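A hedged sketch of designating candidate edge points via a simple z-score style test against the background distribution (reusing the kind of statistics dictionary sketched earlier). The z-threshold and the use of per-window maxima are assumptions; the disclosure allows any suitable significance evaluation (p-value, z-test, chi-squared correlation, etc.).

```python
import numpy as np

def is_candidate_edge_point(test_window, bg_distribution, z_threshold=3.0):
    """Flag the test window's center as a candidate edge point when its maximum
    brightness differs from the background distribution of per-window maxima by
    more than z_threshold standard deviations."""
    mean = bg_distribution["max"].mean()
    std = bg_distribution["max"].std() + 1e-6     # guard against zero variance
    z = abs(test_window.max() - mean) / std
    return z > z_threshold
```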
  • one or more operations may be repeated one or more times.
  • a plurality of such repetitions may be performed, wherein each repetition is performed on a different portion of the digital image.
  • the repetitions may be performed until each side of the digital representation of the document has been evaluated.
  • defining the analysis windows may result in a plurality of analysis windows, which share one or more borders, which overlap in whole or in part, and/or which do not share any common border and do not overlap, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the plurality of repetitions may be performed in a manner directed to reestimate local background statistics upon detecting a potentially non-background window (e.g. a window containing a candidate edge point or a window containing an artifact such as uneven illumination, background texture variation, etc.).
  • four sides of a tetragon are defined based on the plurality of candidate edge points.
  • the sides of the tetragon encompass the edges of a digital representation of a document in a digital image.
  • Defining the sides of the tetragon may include, in some approaches, performing one or more least-mean-squares (LMS) approximations.
  • LMS least-mean-squares
  • defining the sides of the tetragon may include identifying one or more outlier candidate edge points, and removing one or more outlier candidate edge points from the plurality of candidate edge points. Further, defining the sides of the tetragon may include performing at least one additional LMS approximation excluding the one or more outlier candidate edge points.
  • each side of the tetragon is characterized by an equation chosen from a class of functions, and performing the at least one LMS approximation comprises determining one or more coefficients for each equation, such as best coefficients of second degree polynomials in a preferred implementation.
  • defining the sides of the tetragon may include determining whether each side of the digital representation of the document falls within a given class of functions, such as second degree polynomials, or simpler functions such as linear functions instead of second degree polynomials.
  • performing the method may accurately define a tetragon around the four dominant sides of a document while ignoring one or more deviations from the dominant sides of the document, such as a rip and/or a tab.
  • Additional and/or alternative embodiments of the presently disclosed tetragon may be characterized by having four sides, and each side being characterized by one or more equations such as the polynomial functions discussed above.
  • embodiments where the sides of tetragon are characterized by more than one equation may involve dividing one or more sides into a plurality of segments, each segment being characterized by an equation such as the polynomial functions discussed above.
  • Defining the tetragon may, in various embodiments, alternatively and/or additionally include defining one or more corners of the tetragon.
  • tetragon corners may be defined by calculating one or more intersections between adjacent sides of the tetragon, and designating an appropriate intersection from the one or more calculated intersections in cases where multiple intersections are calculated.
  • defining the corners may include solving one or more equations, wherein each equation is characterized by belonging to a chosen class of functions, such as nth degree polynomials, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • a corner of the tetragon may be defined by one or more of: an intersection of two curved adjacent sides of the tetragon; an intersection of two substantially straight lines; and an intersection of one substantially straight line and one substantially curved line.
  • the digital representation of the document and the tetragon are output to a display of a mobile device. Outputting may be performed in any manner, and may depend upon the configuration of the mobile device hardware and/or software.
  • outputting may be performed in various approaches so as to facilitate further processing and/or user interaction with the output.
  • the tetragon may be displayed in a manner designed to distinguish the tetragon from other features of the digital image, for example by displaying the tetragon sides in a particular color, pattern, illumination motif, as an animation, etc., as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • displaying the tetragon and the digital representation of the document may facilitate a user manually adjusting and/or defining the tetragon in any suitable manner.
  • a user may interact with the display of the mobile device to translate the tetragon, i.e. to move the location of the tetragon in one or more directions while maintaining the aspect ratio, shape, edge lengths, area, etc. of the tetragon.
  • a user may interact with the display of the mobile device to manually define or adjust locations of tetragon corners, e.g. tapping on a tetragon corner and dragging the corner to a desired location within the digital image, such as a corner of the digital representation of the document.
  • page detection such as described above may include one or more additional and/or alternative operations, such as will be described below.
  • method may further include capturing one or more of the image data containing the digital representation of the document and audio data relating to the digital representation of the document. Capturing may be performed using one or more capture components coupled to the mobile device, such as a microphone, a camera, an accelerometer, a sensor, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • method may include defining a new large analysis window and reestimating the distribution of background statistics for the new large analysis window upon determining that the statistically significant difference exists, i.e. essentially repeating the estimation operation in a different region of the digital image near a point where a potentially non-background point has been identified, such as near one of the edges of the document.
  • a large analysis window may be positioned near or at the leftmost non-background pixel in a row, near or at the rightmost non-background pixel in a row, near or at the topmost non-background pixel in a column, or near or at the bottommost non-background pixel in a column.
  • Approaches involving such reestimation may further include determining whether the statistically significant difference exists between at least one small analysis window (e.g. a test window) statistic and the corresponding reestimated distribution of large analysis window statistics. In this manner, it is possible to obtain a higher-confidence determination of whether the statistically significant difference exists, and therefore better distinguish true transitions from the digital image background to the digital representation of the document as opposed to, for example, variations in texture, illumination anomalies, and/or other artifacts within the digital image.
  • such approaches facilitate avoiding artifacts such as variations in illumination and/or background texture, etc. in the digital image, the artifacts not corresponding to a true transition from the digital image background to the digital representation of the document.
  • avoiding artifacts may take the form of bypassing one or more regions (e.g. regions characterized by textures, variations, etc. that distinguish the region from the true background) of the digital image.
  • one or more regions may be bypassed upon determining a statistically significant difference exists between a statistical distribution estimated for the large analysis window and a corresponding statistic calculated for the small analysis window, defining a new large analysis window near the small analysis window, reestimating the distribution of statistics for the new large analysis window, and determining that the statistically significant difference does not exist between the reestimated statistical distribution and the corresponding statistic calculated for the small analysis window.
  • bypassing may be accomplished by checking another analysis window further along the path and resuming the search for a transition to non-background upon determining that the statistics of this checked window do not differ significantly from the known statistical properties of the background, e.g. as indicated by a test of statistical significance.
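  • The following Python sketch illustrates one possible realization of the significance test and the artifact bypass just described. The z-score test, the window sizes, and the function names are assumptions introduced for this example; any statistically sound test of significance could be substituted.

```python
import numpy as np

def differs_from_background(small_window, bg_mean, bg_std, z_threshold=3.0):
    """True if the small (test) window's mean brightness differs
    significantly from the large-window background distribution
    (a simple z-score test; the actual statistic is a design choice)."""
    if bg_std == 0:
        return not np.isclose(small_window.mean(), bg_mean)
    return abs(small_window.mean() - bg_mean) / bg_std > z_threshold

def scan_row_for_document_edge(row_pixels, large_win=64, small_win=3):
    """Walk along one row, re-estimating background statistics and
    bypassing local artifacts, until a candidate edge point is found."""
    col = large_win
    bg = row_pixels[:large_win].astype(float)
    while col + small_win <= len(row_pixels):
        test = row_pixels[col:col + small_win].astype(float)
        if differs_from_background(test, bg.mean(), bg.std()):
            # Re-estimate the background in a new large window near this point;
            # if the difference persists, keep the candidate edge point,
            # otherwise bypass the artifact and continue the search.
            start = max(0, col - large_win)
            bg = row_pixels[start:col].astype(float)
            if differs_from_background(test, bg.mean(), bg.std()):
                return col
        col += small_win
    return None  # no transition to non-background found in this row
```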
  • page detection may additionally and/or alternatively include determining whether the tetragon satisfies one or more quality control metrics; and rejecting the tetragon upon determining the tetragon does not satisfy one or more of the quality control metrics.
  • quality control metrics may include measures such as an LMS support metric, a minimum tetragon area metric, a tetragon corner location metric, and a tetragon diagonal intersection location metric.
  • determining whether the tetragon satisfies one or more of these metrics acts as a check on the performance of the method.
  • checks may include determining whether the tetragon covers at least a threshold fraction of the overall digital image area, e.g. whether the tetragon comprises at least 25% of the total image area.
  • checks may include determining whether tetragon diagonals intersect inside the boundaries of the tetragon, and determining whether one or more of the LMS approximations were calculated from sufficient data to have robust confidence in the statistics derived therefrom, i.e. whether the LMS support metric is satisfied.
  • quality metrics and/or checks may facilitate rejecting suboptimal tetragon definitions, and further facilitate improving the definition of the tetragon sides.
  • one approach involves receiving an indication that the defining the four sides of the tetragon based on the plurality of candidate edge points failed to define a valid tetragon, i.e. failed to satisfy one or more of the quality control metrics; and redefining the plurality of candidate edge points.
  • redefining the plurality of candidate edge points includes sampling a greater number of points within the digital image than a number of points sampled in the prior, failed attempt.
  • This may be accomplished, in one approach, by reducing the step size over one or more of the rows or columns of the digital image and repeating all the steps of the algorithm in order to analyze a larger number of candidate edge points.
  • the step may be decreased in a vertical direction, a horizontal direction, or both.
  • other methods of redefining the candidate edge points and/or resampling points within the digital image may be utilized without departing from the scope of the present disclosures.
  • page detection may include designating the entire digital image as the digital representation of the document, particularly where multiple repetitions of method failed to define a valid tetragon, even with significantly reduced step in progression through the digital image analysis.
  • designating the entire digital image as the digital representation of the document may include defining image corners as document corners, defining image sides as document sides, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the diagonals of the tetragon may be characterized by a first line connecting a calculated top left corner of the tetragon to a calculated bottom right corner of the tetragon, and a second line connecting a calculated top right corner of the tetragon to a calculated bottom left corner of the tetragon.
  • the first line and the second line preferably intersect inside the tetragon.
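  • A minimal Python sketch of two of the quality control checks named above (minimum tetragon area and diagonals intersecting inside the tetragon) is shown below. The 25% area threshold echoes the example in the text; the corner ordering and helper names are assumptions for the example.

```python
def tetragon_area(corners):
    """Shoelace area of a tetragon whose corners are listed in order
    (top-left, top-right, bottom-right, bottom-left)."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(corners, corners[1:] + corners[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def diagonals_intersect_inside(corners):
    """Check that the TL-BR and TR-BL diagonals cross inside the tetragon."""
    tl, tr, br, bl = corners
    d1 = (br[0] - tl[0], br[1] - tl[1])       # TL -> BR diagonal direction
    d2 = (bl[0] - tr[0], bl[1] - tr[1])       # TR -> BL diagonal direction
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if denom == 0:
        return False                           # degenerate (parallel) diagonals
    rx, ry = tr[0] - tl[0], tr[1] - tl[1]
    t = (rx * d2[1] - ry * d2[0]) / denom      # position along TL-BR diagonal
    u = (rx * d1[1] - ry * d1[0]) / denom      # position along TR-BL diagonal
    return 0.0 < t < 1.0 and 0.0 < u < 1.0

def passes_quality_checks(corners, image_area, min_area_fraction=0.25):
    return (tetragon_area(corners) >= min_area_fraction * image_area
            and diagonals_intersect_inside(corners))
```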
  • one or more of the foregoing operations may be performed using a processor, and the processor may be part of a mobile device, particularly a mobile device having an integrated camera.
  • the goal of a rectangularization algorithm is to smoothly transform a tetragon (such as defined above in the page detection method) into a rectangle.
  • the tetragon is characterized by a plurality of equations, each equation corresponding to a side of the tetragon and being selected from a chosen class of functions.
  • each side of the tetragon may be characterized by a first degree polynomial, second degree polynomial, third degree polynomial, etc. as would be appreciated by the skilled artisan upon reading the present descriptions.
  • each intrinsic coordinate pair (p, q) corresponds to an intersection of a line parallel to each of a left side of the rectangle and a right side of the rectangle, e.g. a line splitting both top and bottom sides in the proportion of p to 1 - p, and a line parallel to each of a top side of the rectangle and a bottom side of the rectangle, e.g. a line splitting both left and right sides in the proportion of q to 1 - q, wherein 0 < p < 1, and wherein 0 < q < 1.
  • the goal of the rectangularization algorithm described below is to match each point in the rectangularized image to a corresponding point in the original image, and to do so in such a way as to transform each of the four sides of the tetragon into a substantially straight line, while opposite sides of the tetragon should become parallel to each other and orthogonal to the other pair of sides; i.e. top and bottom sides of the tetragon become parallel to each other, and left and right sides of the tetragon become parallel to each other and orthogonal to the new top and bottom.
  • the tetragon is transformed into a true rectangle characterized by four corners, each corner comprising two straight lines intersecting to form a ninety-degree angle.
  • the main idea of the rectangularization algorithm described below is to achieve this goal by, first, calculating rectangle-based intrinsic coordinates (p, q) for each point in the rectangularized destination image; second, matching these to the same pair (p, q) of tetragon-based intrinsic coordinates in the original image; third, calculating the coordinates of the intersection of the left-to-right and top-to-bottom curves corresponding to these intrinsic coordinates, respectively; and finally, assigning the color or gray value found at that point in the original image to the corresponding point in the destination image.
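  • The four steps just described may be sketched in Python as follows. The outer loop structure, nearest-neighbour sampling, and the `find_intersection(p, q)` callable (which stands in for the curve-intersection search discussed later) are assumptions introduced for this example.

```python
import numpy as np

def rectangularize(src_img, find_intersection, width, height):
    """Build the rectangularized destination image pixel by pixel:
    (1) compute rectangle-based intrinsic coordinates (p, q) for the
    destination pixel, (2) treat them as tetragon-based intrinsic
    coordinates, (3) locate the corresponding curve intersection in the
    original image via find_intersection(p, q), and (4) copy that pixel
    value (nearest-neighbour sampling keeps the sketch short)."""
    dst = np.zeros((height, width) + src_img.shape[2:], dtype=src_img.dtype)
    for row in range(height):
        q = row / max(height - 1, 1)            # intrinsic coordinate q in [0, 1]
        for col in range(width):
            p = col / max(width - 1, 1)         # intrinsic coordinate p in [0, 1]
            x, y = find_intersection(p, q)      # matching point in the original image
            xi = int(round(min(max(x, 0), src_img.shape[1] - 1)))
            yi = int(round(min(max(y, 0), src_img.shape[0] - 1)))
            dst[row, col] = src_img[yi, xi]
    return dst
```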
  • each point in a digital image may correspond to an intersection of a top-to-bottom curve and a left-to-right curve (a curve may include a straight line, a curved line, e.g. a parabola, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions) corresponding to intrinsic coordinates (such as described above) associated with a point.
  • rectangularization may involve defining a plurality of such left-to-right lines and top-to-bottom lines.
  • rectangularization may include matching target rectangle-based coordinates to intrinsic tetragon-based coordinates of the digital representation of the document. This matching may include iteratively searching for an intersection of a given left-to-right curve and a given top-to-bottom curve, an exemplary first iteration of which is described below.
  • the iterative search includes designating a starting point having coordinates (x0, y0).
  • the starting point may be located anywhere within the digital representation of the document, but preferably is located at or near the center of the target rectangle.
  • the iterative search may include projecting the starting point onto one of the two intersecting curves; the starting point may be projected onto either of the curves.
  • the first half of a first iteration in the iterative search includes projecting the starting point onto the top-to-bottom curve to obtain the x-coordinate (x1) of the next point, the projection result being a point having coordinates (x1, y0).
  • the second half of a first iteration in the iterative search includes projecting that point onto the left-to-right curve to obtain the y-coordinate (y1) of the next point, the projection result being a point having coordinates (x1, y1).
  • Rectangularization involves transforming the tetragon defined in page detection into a true rectangle. The result of this process may be represented graphically as an output of the page rectangularization algorithm, according to one embodiment.
  • a method for modifying one or more spatial characteristics of a digital representation of a document in a digital image may include any of the techniques described herein. As will be appreciated by one having ordinary skill in the art upon reading the present descriptions, method may be performed in any suitable environment, including those shown and/or described in the figures and corresponding descriptions of the present disclosures.
  • a tetragon (such as defined above in the page detection method) is transformed into a rectangle.
  • the tetragon is characterized by a plurality of equations, each equation corresponding to a side of the tetragon and being selected from a chosen class of functions.
  • each side of the tetragon may be characterized by a first degree polynomial, second degree polynomial, third degree polynomial, etc. as would be appreciated by the skilled artisan upon reading the present descriptions.
  • where the sides of the tetragon are characterized by second degree polynomials, the curves may be described by exemplary polynomial functions fitting one or more of the following general forms:
  • left side: x = a2*y^2 + a1*y + a0; right side: x = b2*y^2 + b1*y + b0; top side: y = c2*x^2 + c1*x + c0; bottom side: y = d2*x^2 + d1*x + d0; where:
  • ai are the coefficients in the equation of the left side of the tetragon;
  • bi are the coefficients in the equation of the right side of the tetragon;
  • ci are the coefficients in the equation of the top side of the tetragon;
  • di are the coefficients in the equation of the bottom side of the tetragon; and
  • p and q are the tetragon-based intrinsic coordinates corresponding to the curves.
  • the coefficients, such as ai, bi, ci, di, etc., may be derived from calculations, estimations, and/or determinations achieved in the course of performing page detection, such as a page detection method as discussed above.
  • transforming the tetragon into a rectangle may include one or more additional operations, such as will be described in greater detail below.
  • method additionally and/or alternatively includes stretching one or more regions of the tetragon to achieve a more rectangular or truly rectangular shape.
  • such stretching is performed in a manner sufficiently smooth to avoid introducing artifacts into the rectangle.
  • transforming the tetragon into a rectangle may include determining a height of the rectangle, a width of the rectangle, a skew angle of the rectangle, and/or a center position of the rectangle.
  • such transforming may include defining a width of the target rectangle as the average of the width of the top side and the width of the bottom side of the tetragon; defining a height of the target rectangle as the average of the height of the left side and the height of the right side of the tetragon; defining a center of the target rectangle depending on the desired placement of the rectangle in the image; and defining an angle of skew of the target rectangle, e.g. in response to a user request to deskew the digital representation of the document.
  • the transforming may additionally and/or alternatively include generating a rectangularized digital image from the original digital image; and determining a p-coordinate and a q-coordinate for a plurality of points within the rectangularized digital image (e.g. points both inside and outside of the target rectangle), wherein each point located to the left of the rectangle has a p-coordinate value p < 0, wherein each point located to the right of the rectangle has a p-coordinate value p > 1, wherein each point located above the rectangle has a q-coordinate value q < 0, and wherein each point located below the rectangle has a q-coordinate value q > 1.
  • the transforming may additionally and/or alternatively include generating a rectangularized digital image from the original digital image; determining a pair of rectangle-based intrinsic coordinates for each point within the rectangularized digital image; and matching each pair of rectangle-based intrinsic coordinates to an equivalent pair of tetragon-based intrinsic coordinates within the original digital image.
  • matching the rectangle-based intrinsic coordinates to the tetragon-based intrinsic coordinates may include: performing an iterative search for an intersection of the top-to-bottom curve and the left-to-right curve.
  • matching the rectangle-based intrinsic coordinates to the tetragon-based intrinsic coordinates may include determining a distance between (xk, yk) and (xk+1, yk+1); determining whether the distance is less than a predetermined threshold; and terminating the iterative search upon determining that the distance is less than the predetermined threshold.
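  • The iterative search and its termination criterion may be sketched in Python as follows. The tolerance, iteration cap, and example curves are assumptions for illustration; combined with the construction of the curves for a given (p, q), such a routine could serve as the intersection finder in the rectangularization sketch above.

```python
import math

def intersect_curves(top_to_bottom, left_to_right, x0, y0,
                     tol=0.05, max_iters=50):
    """Alternating projection: the top-to-bottom curve gives x as a
    function of y, the left-to-right curve gives y as a function of x.
    Terminate when consecutive points (xk, yk) and (xk+1, yk+1) are
    closer than tol."""
    xk, yk = x0, y0
    for _ in range(max_iters):
        xk1 = top_to_bottom(yk)        # first half-iteration: obtain next x-coordinate
        yk1 = left_to_right(xk1)       # second half-iteration: obtain next y-coordinate
        if math.hypot(xk1 - xk, yk1 - yk) < tol:
            return xk1, yk1
        xk, yk = xk1, yk1
    return xk, yk                      # best estimate after max_iters

# Hypothetical curves: a gently bowed near-vertical edge and a tilted near-horizontal one
x_of_y = lambda y: 100.0 + 0.001 * (y - 300.0) ** 2
y_of_x = lambda x: 250.0 + 0.05 * x
print(intersect_curves(x_of_y, y_of_x, x0=400.0, y0=300.0))
```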
  • the image processing algorithm disclosed herein may additionally and/or alternatively include functionality designed to detect and/or correct a skew angle of a digital representation of a document in a digital image.
  • One preferred approach to correcting skew is described below. Of course, other methods of correcting skew within a digital image are within the scope of these disclosures, as would be appreciated by one having ordinary skill in the art upon reading the present descriptions.
  • a digital representation of a document in a digital image may be characterized by one or more skew angles α.
  • a horizontal skew angle α represents an angle between a horizontal line and an edge of the digital representation of the document, the edge having its longitudinal axis in a substantially horizontal direction (i.e. either the top or bottom edge of the digital representation of the document).
  • a vertical skew angle α may represent an angle between a vertical line and an edge of the digital representation of the document, the edge having its longitudinal axis in a substantially vertical direction (i.e. either the left edge or right edge of the digital representation of the document).
  • the digital representation of the document may be defined by a top edge, a bottom edge, a right edge and a left edge.
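  • A minimal Python sketch of estimating the two skew angles from detected corner positions follows. The corner ordering and the atan2-based formulation are assumptions introduced for this example; the present descriptions do not mandate a particular computation.

```python
import math

def skew_angles(top_left, top_right, bottom_left):
    """Horizontal skew: angle between the top edge and a horizontal line.
    Vertical skew: angle between the left edge and a vertical line.
    Corners are (x, y) pixel coordinates; angles are returned in degrees."""
    horizontal = math.degrees(math.atan2(top_right[1] - top_left[1],
                                         top_right[0] - top_left[0]))
    vertical = math.degrees(math.atan2(bottom_left[0] - top_left[0],
                                       bottom_left[1] - top_left[1]))
    return horizontal, vertical

# A document rotated slightly clockwise (hypothetical corner coordinates)
print(skew_angles((100, 120), (900, 150), (95, 1100)))
```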
  • the presently described image processing algorithm may include features directed to detecting whether a digital representation of a document comprises one or more illumination problems.
  • illumination problems may include locally under-saturated and/or over-saturated regions of a digital image, as well as uneven illumination where brightness values vary greatly from pixel to pixel within the image.
  • the processes include (preferably using a mobile device processor) dividing a tetragon including a digital representation of a document into a plurality of sections, each section comprising a plurality of pixels.
  • a distribution of brightness values of each section is determined.
  • the distribution of brightness values may be compiled and/or assembled in any known manner, and may be fit to any known standard distribution model, such as a Gaussian distribution, a bimodal distribution, a skewed distribution, etc.
  • a brightness value range of each section is determined.
  • a range is defined as a difference between a maximum value and a minimum value in a given distribution.
  • the brightness value range would be defined as the difference between the characteristic maximum brightness value in a given section and the characteristic minimum brightness value in the same section.
  • these characteristic values may correspond to the 2nd and 98th percentiles of the whole distribution, respectively.
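  • A short Python sketch of the robust per-section brightness range using the 2nd and 98th percentiles mentioned above; the function name and the synthetic example data are assumptions for illustration.

```python
import numpy as np

def section_brightness_range(section_pixels):
    """Characteristic brightness range of one section: the difference
    between the 98th-percentile (characteristic maximum) and the
    2nd-percentile (characteristic minimum) brightness values, which is
    robust to a few outlier pixels."""
    lo, hi = np.percentile(section_pixels, [2, 98])
    return hi - lo

rng = np.random.default_rng(0)
section = rng.integers(40, 220, size=(32, 32))   # hypothetical grayscale section
print(section_brightness_range(section))
```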
  • operation may include determining that a region of a digital image depicting a digital representation of a document is oversaturated, according to one embodiment.
  • Determining whether each section is oversaturated may include determining a section oversaturation ratio for each section.
  • each section oversaturation ratio is defined as a number of pixels exhibiting a maximum brightness value in the section divided by a total number of pixels in the section.
  • An unevenly illuminated image may depict or be characterized by a plurality of dark spots that may be more dense in areas where the brightness level of a corresponding pixel, point or region of the digital image is lower than that of other regions of the image or document, and/or lower than an average brightness level of the image or document.
  • uneven illumination may be characterized by a brightness gradient, such as a gradient proceeding from a top right corner of the image to a lower left corner of the image such that brightness decreases along the gradient, with a relatively bright area in the top right corner of the image and a relatively dark area in the lower left corner of the image.
  • determining whether each section is oversaturated may further include determining, for each section, whether the oversaturation level of the section is greater than a predetermined threshold, such as 10%; and characterizing the section as oversaturated upon determining that the saturation level of the section is greater than the predetermined threshold. While the presently described embodiment employs a threshold value of 10%, other predetermined threshold oversaturation levels may be employed without departing from the scope of the present descriptions. Notably, the exact value is a matter of visual perception and expert judgment, and may be adjusted and/or set by a user in various approaches.
  • a predetermined variability threshold such as a median brightness variability of 18 out of a 0-255 integer value range
  • determining the variability of the section may include determining a brightness value of a target pixel in the plurality of pixels; calculating a difference between the brightness value of the target pixel and a brightness value for one or more neighboring pixels, each neighboring pixel being one or more (for example, 2) pixels away from the target pixel; repeating the determining and the calculating for each pixel in the plurality of pixels to obtain each target pixel variability; and generating a distribution of target pixel variability values, wherein each target pixel brightness value and target pixel variability value is an integer in a range from 0 to 255.
  • This approach may be implemented, for example, by incrementing a corresponding counter in an array of all possible variability values in a range from 0 to 255, e.g. to generate a histogram of variability values.
  • when utilizing neighboring pixels in determining the variability of a particular section, the neighboring pixels may be within about two pixels of the target pixel along either a vertical direction, a horizontal direction, or both (e.g. a diagonal direction).
  • other pixel proximity limits may be employed without departing from the scope of the present invention.
  • method may further include removing one or more target pixel variability values from the distribution of target pixel variability values to generate a corrected distribution; and defining a characteristic background variability based on the corrected distribution.
  • generating the corrected distribution and defining the characteristic background variability may include removing the top 35% of total counted values (or any other value sufficient to cover significant brightness changes associated with transitions from the background to the foreground) and defining the characteristic background variability based on the remaining values of the distribution, i.e. values taken from a relatively flat background region of the digital representation of the document.
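  • The variability computation described above may be sketched in Python as follows: per-pixel brightness differences to neighbours two pixels away, a 0-255 histogram of variability values, removal of the top 35% of counted values, and a characteristic background variability from the remainder. The neighbour offsets used and the choice of the median as the summary statistic are assumptions for the example.

```python
import numpy as np

def characteristic_background_variability(section, offset=2, drop_fraction=0.35):
    """Estimate the background variability of one section.

    For each target pixel, take the absolute brightness difference to the
    pixels `offset` positions to the right and below, histogram the values
    over 0..255, drop the largest `drop_fraction` of counted values
    (foreground/background transitions), and summarize the rest."""
    sec = section.astype(int)
    diffs = np.concatenate([
        np.abs(sec[:, offset:] - sec[:, :-offset]).ravel(),   # horizontal neighbours
        np.abs(sec[offset:, :] - sec[:-offset, :]).ravel(),   # vertical neighbours
    ])
    diffs = np.clip(diffs, 0, 255)
    histogram = np.bincount(diffs, minlength=256)              # counts per variability value

    keep = int(len(diffs) * (1.0 - drop_fraction))              # discard the top 35% of values
    kept_values = np.sort(diffs)[:keep]
    characteristic = int(np.median(kept_values)) if keep else 0
    return characteristic, histogram
```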
  • a number of oversaturated sections is determined. This operation may include any manner of determining a total number of oversaturated sections, e.g. by incrementing a counter during processing of the image, by setting a flag for each oversaturated section and counting flags at some point during processing, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • a number of undersaturated sections is determined. This operation may include any manner of determining a total number of undersaturated sections, e.g. by incrementing a counter during processing of the image, by setting a flag for each undersaturated section and counting flags at some point during processing, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the digital image is determined to be oversaturated upon determining that a ratio of the number of oversaturated sections to the total number of sections exceeds an oversaturation threshold, which may be defined by a user, may be a predetermined value, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the digital image is determined to be undersaturated upon determining that a ratio of the number of undersaturated sections to the total number of sections exceeds an undersaturation threshold, which may be defined by a user, may be a predetermined value, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
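  • A minimal Python sketch of combining the per-section checks into the image-level saturation decisions described above; the per-section ratio follows the definition in the text, while the 10% thresholds and the caller-supplied undersaturation predicate are assumptions for the example.

```python
import numpy as np

def section_is_oversaturated(section, max_value=255, ratio_threshold=0.10):
    """Per-section oversaturation ratio: pixels at the maximum brightness
    value divided by the total number of pixels in the section."""
    ratio = np.count_nonzero(section == max_value) / section.size
    return ratio > ratio_threshold

def image_saturation_flags(sections, is_undersaturated,
                           over_thresh=0.10, under_thresh=0.10):
    """Image-level decision: oversaturated (or undersaturated) when the
    ratio of flagged sections to all sections exceeds the threshold.
    `is_undersaturated` is a per-section predicate supplied by the caller,
    e.g. based on a too-small brightness range."""
    n = len(sections)
    n_over = sum(section_is_oversaturated(s) for s in sections)
    n_under = sum(is_undersaturated(s) for s in sections)
    return (n_over / n > over_thresh), (n_under / n > under_thresh)
```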
  • method may include one or more additional and/or alternative operations, such as will be described in detail below.
  • method may include performing the following operations for each section: defining a section height by dividing the height of the document into a predefined number of horizontal sections; and defining a section width by dividing the width of the document into a predefined number of vertical sections.
  • the section height and width are determined based on the goal of creating a certain number of sections and making these sections approximately square, by dividing the height of the document into a certain number of horizontal parts and by dividing the width of the document into a certain (possibly different) number of vertical parts.
  • a method for determining whether illumination problems exist in a digital representation of a document includes the following operations, some or all of which may be performed in any environment described herein and/or represented in the presently disclosed figures.
  • correcting unevenness of illumination in a digital image includes normalizing an overall brightness level of the digital image. Normalizing overall brightness may transform a digital image characterized by a brightness gradient such as discussed above into a digital image characterized by a relatively flat, even distribution of brightness across the digital image. Note that before normalization one region may be characterized by a significantly more dense distribution of dark spots than another region, whereas after normalization the corresponding regions are characterized by substantially similar dark spot density profiles.
  • unevenness of illumination may be corrected.
  • a method for correcting uneven illumination in one or more regions of the digital image is provided herein for use in any suitable environment, including those described herein and represented in the various figures, among other suitable environments as would be known by one having ordinary skill in the art upon reading the present descriptions.
  • method includes an operation where, using a processor, a two-dimensional illumination model is derived from the digital image.
  • the two-dimensional illumination model is applied to each pixel in the digital image.
  • the digital image may be divided into a plurality of sections, and some or all of the pixels within a section may be clustered based on color, e.g. brightness values in one or more color channels, median hue values, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. Moreover, several most numerous clusters may be analyzed to determine characteristics of one or more possible local backgrounds. In order to designate a cluster as a local background of the section, the number of pixels belonging to this cluster has to exceed a certain predefined threshold, such as a threshold percentage of the total section area.
  • clustering may be performed using any known method, including Markov-chain Monte Carlo methods, nearest neighbor joining, distribution-based clustering such as expectation-maximization, density-based clustering such as density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS), etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • method may include determining, for each distribution of color channel values within background clusters, one or more of an average color of the primary background of the corresponding section and an average color of the secondary background of the corresponding section, if one or both exist in the section.
  • method includes designating, for each section, either the primary background color or the secondary background color as a local representation of a main background of the digital representation of the document, each local representation being characterized by either the average color of the primary background of the corresponding section or the average color of the secondary background of the corresponding section;
  • method includes fitting a plurality of average color channel values of chosen local representations of the image background to a two-dimensional illumination model.
  • derivation of the two-dimensional illumination model may include, for a plurality of background clusters: calculating an average color channel value of each background cluster, calculating a hue ratio of each background cluster, and calculating a median hue ratio for the plurality of background clusters. Moreover, the derivation may also include comparing the hue ratio of each background cluster to the median hue ratio of the plurality of clusters; selecting the more likely of the possible two backgrounds as the local representation of the document background based on the comparison; fitting at least one two-dimensional illumination model to the average channel values of the local representation; and calculating a plurality of average main background color channel values over a plurality of local representations.
  • the applying of the model may include calculating a difference between one or more predicted background channel values and the average main background color channel values; and adding a fraction of the difference to one or more color channel values for each pixel in the digital image.
  • adding the fraction may involve adding a value in a range from 0 to 1 of the difference, for example 3/4 of the difference in a preferred embodiment, to the actual pixel value.
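  • One possible Python sketch of fitting and applying such a model follows. The quadratic surface form, the least-squares fit, and the sign convention (adding a fraction of the main-background-minus-predicted difference so that locally dark backgrounds are brightened) are assumptions for this example; the present descriptions do not fix these details.

```python
import numpy as np

def fit_illumination_model(xs, ys, values):
    """Least-squares fit of a two-dimensional quadratic surface
    v(x, y) = k0 + k1*x + k2*y + k3*x*y + k4*x^2 + k5*y^2
    to the average background channel values of the local representations
    (one (x, y, value) triple per section)."""
    A = np.column_stack([np.ones_like(xs), xs, ys, xs * ys, xs ** 2, ys ** 2])
    coeffs, *_ = np.linalg.lstsq(A, values, rcond=None)
    return coeffs

def correct_channel(channel, coeffs, main_background_value, fraction=0.75):
    """Add `fraction` of (average main background - locally predicted
    background) to every pixel of one color channel."""
    h, w = channel.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    predicted = (coeffs[0] + coeffs[1] * xs + coeffs[2] * ys + coeffs[3] * xs * ys
                 + coeffs[4] * xs ** 2 + coeffs[5] * ys ** 2)
    corrected = channel.astype(float) + fraction * (main_background_value - predicted)
    return np.clip(corrected, 0, 255).astype(np.uint8)
```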
  • method may include additional and/or alternative operations, such as those discussed immediately below.
  • method further includes one or more of: determining, for each section, a plurality of color clusters; determining a plurality of numerous color clusters, each numerous color cluster corresponding to a high frequency of representation in the section (e.g. the color cluster is one of the clusters with the highest number of pixels in the section belonging to that color cluster); determining a total area of the section; determining a plurality of partial section areas, each partial section area corresponding to an area represented by one of the plurality of numerous color clusters; dividing each partial section area by the total area to obtain a cluster percentage area for each numerous color cluster; and classifying the section background(s) based on the cluster percentage areas.
  • the classifying operation identifies either: no background in the section, a single most numerous background in the section, or two most numerous backgrounds in the section.
  • the classifying includes classifying each pixel belonging to a cluster containing a number of pixels greater than a background threshold as a background pixel.
  • the background threshold is in a range from 0 to 100% (for example, 15% in a preferred approach).
  • the background threshold may be defined by a user, may be a predetermined value, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • mobile image processing may include a method for estimating resolution of a digital representation of a document.
  • these methods may be performed in any suitable environment, including those described herein and represented in the various figures presented herewith.
  • method may be used in conjunction with any other method described herein, and may include additional and/or alternative operations to those described below, as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the digital image may be characterized as a bitonal image, i.e. an image containing only two tones, and preferably a black and white image.
  • a plurality of likely characters is determined based on the plurality of connected components.
  • Likely characters may be regions of a digital image characterized by a predetermined number of light-to-dark transitions in a given direction, such as three light-to-dark transitions in a vertical direction as would be encountered for a small region of the digital image depicting a capital letter "E," each light-to-dark transition corresponding to a transition from a background of a document (light) to one of the horizontal strokes of the letter "E.”
  • other numbers of light-to-dark transitions may be employed, such as two vertical and/or horizontal light-to-dark transitions for a letter "o," one vertical light-to-dark transition for a letter "l," etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • one or more average character dimensions are determined based on the plurality of likely text characters.
  • the average character dimensions may include one or more of an average character width and an average character height, but of course other suitable character dimensions may be utilized, as would be recognized by a skilled artisan reading the present descriptions.
  • the resolution of the digital image is estimated based on the one or more average character dimensions.
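  • A short Python sketch of this estimation step follows: the measured average character height is compared against a reference character height at a known resolution. The reference value of 30 px at 300 DPI is a hypothetical figure for illustration only.

```python
def estimate_resolution(avg_char_height_px, ref_char_height_px=30.0, ref_dpi=300.0):
    """If a reference character scanned at 300 DPI is about 30 px tall
    (a hypothetical reference value), a document whose likely characters
    average e.g. 15 px tall was probably captured near 150 DPI."""
    return ref_dpi * avg_char_height_px / ref_char_height_px

print(estimate_resolution(15.0))   # -> 150.0
```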
  • method may optionally and/or alternatively include one or more additional operations, such as described below.
  • method may further include one or more of:
  • the estimated resolution will only be adjusted if a good match between the digital representation of the document and one of the known document types has been found.
  • the one or more known document types include: a Letter size document (8.5 x 11 inch); a Legal size document (8.5 x 14 inch); an A3 document (11.69 x 16.54 inch); an A4 (European Letter size) document (8.27 x 11.69 inch); an A5 document (5.83 x 8.27 inch); a ledger/tabloid document (11 x 17 inch); a driver license (2.125 x 3.375 inch); a business card (2 x 3.5 inch); a personal check (2.75 x 6 inch); a business check (3 x 7.25 inch); a business check (3 x 8.25 inch); a business check (2.75 x 8.5 inch); a business check (3.5 x 8.5 inch); a business check (3.66 x 8.5 inch); a business check (4 x 8.5 inch); a 2.25-inch wide receipt; and a 3.125-inch wide receipt.
  • method may further and/or optionally include computing, for one or more connected components, one or more of: a number of on-off transitions within the connected component (for example, transitions from a character to a document background, e.g. transitions from black-to-white, white-to-black, etc. as would be understood by the skilled artisan reading the present descriptions); a black pixel density within the connected component; an aspect ratio of the connected component; and a likelihood that one or more of the connected components represents a text character based on one or more of the black pixel density, the number of on-off transitions, and the aspect ratio.
  • method may further and/or optionally include determining a character height of at least two of the plurality of text characters; calculating an average character height based on each character height of the at least two text characters; determining a character width of at least two of the plurality of text characters; calculating an average character width based on each character width of the at least two text characters; performing at least one comparison.
  • the comparison may be selected from: comparing the average character height to a reference average character height; and comparing the average character width to a reference average character width.
  • method may further include estimating the resolution of the digital image based on the at least one comparison, where each of the reference average character height and the reference average character width correspond to one or more reference characters, each reference character being characterized by a known average character width and a known average character height.
  • each reference character corresponds to a digital representation of a character obtained from scanning a representative sample of one or more business documents at some selected resolution, such as 300 DPI, and each reference character further corresponds to one or more common fonts, such as Arial, Times New Roman, Helvetica, Courier, Courier New, Tahoma, etc. as would be understood by the skilled artisan reading the present descriptions.
  • representative samples of business documents may be scanned at other resolutions, so long as the resulting image resolution is suitable for recognizing characters on the document.
  • the resolution must be sufficient to provide a minimum character size, such as a smallest character being no less than 12 pixels in height in one embodiment.
  • the minimum character height may vary according to the nature of the image. For example, different character heights may be required when processing a grayscale image than when processing a binary (e.g. bitonal) image. In more approaches, characters must be sufficiently large to be recognized by optical character recognition (OCR).
  • method may include one or more of: estimating one or more dimensions of the digital representation of the document based on the estimated resolution of the digital representation of the document; computing an average character width from the average character dimensions; computing an average character height from the average character dimensions; comparing the average character width to the average character height; estimating an orientation of the digital representation of the document based on the comparison; and matching the digital representation of the document to a known document type based on the estimated dimensions and the estimated orientation.
  • estimating resolution may be performed in an inverse manner, namely by processing a digital representation of a document to determine a content of the document, such as a payment amount for a digital representation of a check, an addressee for a letter, a pattern of a form, a barcode, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • the digital representation of the document may be determined to correspond to one or more known document types, and utilizing information about the known document type(s), the resolution of the digital representation of the document may be determined and/or estimated.
  • a method for detecting one or more blurred regions in a digital image will be described, according to various embodiments. As will be understood and appreciated by the skilled artisan upon reading the present descriptions, method may be performed in any suitable environment, such as those discussed herein and represented in the multitude of figures submitted herewith. Further, method may be performed in isolation and/or in conjunction with any other operation of any other method described herein, including but not limited to the image processing operations described above.
  • method includes an operation where, using a processor, a tetragon comprising a digital representation of a document in a digital image is divided into a plurality of sections, each section comprising a plurality of pixels.
  • In one embodiment, method includes an operation where, for each section, it is determined whether the section contains one or more sharp pixel-to-pixel transitions in a first direction.
  • method includes an operation where, for each section, a total number of first-direction sharp pixel-to-pixel transitions (SS1) is counted.
  • method includes an operation where, for each section, a total number of first-direction blurred pixel-to-pixel transitions (SB1) is counted.
  • method includes an operation where, for each section, it is determined whether the section contains one or more sharp pixel-to-pixel transitions in a second direction.
  • method includes an operation where, for each section, a total number of second-direction sharp pixel-to-pixel transitions (SS2) is counted.
  • method includes an operation where, for each section, it is determined whether the section contains one or more blurred pixel-to-pixel transitions in the second direction.
  • a total number of second-direction blurred pixel-to-pixel transitions (SB2) is counted.
  • for each section, it is determined that the section is blank upon determining: SS1 is less than a predetermined sharp transition threshold, SB1 is less than a predetermined blurred transition threshold, SS2 is less than the predetermined sharp transition threshold, and SB2 is less than the predetermined blurred transition threshold.
  • for each non-blank section, a first-direction blurriness ratio R1 = SS1 / SB1 is determined.
  • for each non-blank section, a second-direction blurriness ratio R2 = SS2 / SB2 is determined.
  • a "first direction” and "second direction” may be characterized as perpendicuiar, e.g. a vertical direction and a horizontal direction, or perpendicular diagonals of a square.
  • the "first direction” and “second direction” may correspond to any path traversing the digital image, but preferably each corresponds to a linear path traversing the digital image.
  • the non-blank section is determined to be blurred upon determining one or more of: the section is blurred in the first direction, and the section is blurred in the second direction.
  • a total number of blurred sections is determined.
  • an image blur ratio R, defined as the total number of blurred sections divided by the total number of sections, is calculated.
  • method includes an operation where it is determined that the digital image is blurred upon determining the image blur ratio is greater than a predetermined image blur threshold.
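  • The section-level and image-level blur decisions may be sketched in Python as follows. The 50-transition and 24% thresholds echo the example values given further below; the 30% image blur threshold and the use of a single threshold for sharp and blurred transition counts are assumptions for the example.

```python
def classify_section(sharp1, blur1, sharp2, blur2,
                     sharp_min=50, blur_min=50, blurriness_threshold=0.24):
    """Classify one section as 'blank', 'blurred', or 'sharp' from its sharp
    and blurred transition counts (SS1, SB1, SS2, SB2) in two directions."""
    if (sharp1 < sharp_min and sharp2 < sharp_min
            and blur1 < blur_min and blur2 < blur_min):
        return "blank"
    for sharp, blurred in ((sharp1, blur1), (sharp2, blur2)):
        # Blurriness ratio: sharp transitions divided by blurred transitions.
        if blurred and sharp / blurred < blurriness_threshold:
            return "blurred"
    return "sharp"

def image_is_blurred(section_labels, image_blur_threshold=0.30):
    """Blurred if the image blur ratio R (blurred sections / all sections)
    exceeds a threshold; 0.30 is only an illustrative value."""
    blurred = sum(1 for label in section_labels if label == "blurred")
    return blurred / len(section_labels) > image_blur_threshold
```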
  • method may include one or more additional and/or alternative operations, such as described below.
  • method may also include determining, for each section, a distribution of brightness values of the plurality of pixels; and determining a characteristic variability v of the distribution of brightness values;
  • method may also include defining a plurality of center pixels; sequentially analyzing each of the plurality of center pixels within one or more small windows of pixels surrounding the center pixel, such as two pixels before and after; and identifying the sharp pixel-to-pixel transition upon determining that the large brightness transition exists within an immediate vicinity of the center pixel (for example, from the immediately preceding pixel to the one following), a first small (e.g.
  • method may also include, for each section: counting a total number of sharp transitions in each of one or more chosen directions; counting a total number of blurred transitions in each chosen direction; determining that a section is blank upon determining: the total number of sharp transitions is less than a predefined sharp transition threshold (for example, 50) and the total number of blurred transitions is less than a predefined blurred transition threshold; determining that a non-blank section is blurred upon determining a section blurriness ratio comprising the total number of sharp transitions to the total number of blurred transitions is less than a section blur ratio threshold (for example, 24%) in at least one of the chosen directions; and determining that the section is sharp upon determining the section is neither blank nor blurred.
  • Referring to FIG. 5, a method 500 is shown, according to one embodiment.
  • the method 500 may be carried out in any desired environment, and may include embodiments and/or approaches described in relation to FIGS. 1-4D, among others.
  • Of course, more or less operations than those shown in FIG. 5 may be performed in accordance with method 500, as would be appreciated by one of ordinary skill in the art upon reading the present descriptions.
  • the digital image may be characterized by a native resolution.
  • a "native resolution" may be an original, native resolution of the image as originally captured, but also may be a resolution of the digital image after performing some pre-classification processing such as any of the image processing operations described herein. In one embodiment, the native resolution is approximately 500 pixels by 600 pixels (i.e. a 500x600 digital image) for a digital image of a driver license subjected to processing by virtual rescan (VRS) before performing classification.
  • the digital image may be characterized as a color image in some approaches, and in still more approaches may be a cropped-color image, i.e. a color image depicting substantially only the object to be classified, and not depicting image background.
  • a first representation of the digital image is generated using a processor of the mobile device.
  • the first representation may be characterized by a reduced resolution, in one approach.
  • a "reduced resolution” may be any resolution less than the native resolution of the digital image, and more particularly any resolution suitable for subsequent analysis of the first representation according to the principles set forth herein.
  • the reduced resolution is sufficiently low to minimize processing overhead and maximize computational efficiency and robustness of performing the algorithm on the respective mobile device, host device and/or server platform.
  • the first representation is characterized by a resolution of about 25 pixels by 25 pixels, which has been experimentally determined to be a particularly efficient and robust reduced resolution for processing of relatively small documents, such as business cards, driver licenses, receipts, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • classification of larger documents or objects may benefit from utilizing a higher resolution such as 50 pixels by 50 pixels, 100 pixels by 100 pixels, etc. to better represent the larger document or object for robust classification and maximum computational efficiency.
  • the resolution utilized may or may not have the same number of pixels in each dimension.
  • the most desirable resolution for classifying various objects within a broad range of object classes may be determined
  • any resolution may be employed, and preferably the resolution is characterized by comprising between 1 pixel and about 1000 pixels in a first dimension, and between 1 and about 1000 pixels in a second dimension.
  • FIGS. 3A- 3C respectively depict: a digital image before being divided into sections (e.g. digital image 300 FIG. 3 A); a digital image divided into sections (e.g. sections 304 FIG. 3B); and a first representation of the digital image (e.g. representation 310 FIG. 3C) characterized by a reduced resolution.
  • As shown in FIGS. 3A-3B, a digital image 300 captured by a mobile device may be divided into a plurality of sections 304.
  • a first representation may be generated by dividing a digital image R (having a resolution of xR pixels by yR pixels) into Sx horizontal sections and Sy vertical sections, and thus may be characterized by a reduced resolution r of Sx pixels by Sy pixels.
  • generating the first representation essentially includes generating a less-granular representation of the digital image.
  • the digital image 300 is divided into S sections, each section 304 corresponding to one portion of an s-by-s grid 302.
  • generating the first representation may include generating the representation such that each pixel 312 in the first representation 310 corresponds to one of the S sections 304 of the digital image, and wherein each pixel 312 is located in a position of the first representation 310 corresponding to the location of the corresponding section 304 in the digital image, i.e. the upper-leftmost pixel 312 in the first representation corresponds to the upper-leftmost section 304 in the digital image, etc.
  • generating the first representation may include one or more alternative and/or additional suboperations, such as dividing the digital image into a plurality of sections.
  • the digital image may be divided into a plurality of sections in any suitable manner, and in one embodiment the digital image is divided into a plurality of rectangular sections.
  • sections may be characterized by any shape, and in alternative approaches the plurality of sections may or may not represent the entire digital image, may represent an oversampling of some regions of the image, or may represent a single sampling of each pixel depicted in the digital image. In a preferred embodiment, as discussed above regarding FIGS. 3A-3C, the digital image is divided into S substantially square sections 304 to form an s x s grid 302.
  • generating the first representation may also include determining, for each section of the digital image, at least one characteristic value, where each characteristic value corresponds to one or more features descriptive of the section.
  • any feature that may be expressed as a numerical value is suitable for use in generating the first representation, e.g. an average brightness or intensity (0-255) across each pixel in the section, an average value (0-255) of each color channel of each pixel in the section, such as an average red-channel value, an average green-channel value, and an average blue-channel value for a red-green-blue (RGB) image, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • each pixel 312 of the first representation 310 corresponds to one of the S sections 304 not only with respect to positional correspondence, but also with respect to feature correspondence.
  • generating the first representation 310 may additionally include determining a characteristic section intensity value iS by calculating the average of the individual intensity values iP of each pixel 306 in the section 304. Then, each pixel 312 in the first representation 310 is assigned an intensity value equal to the average intensity value iS calculated for the corresponding section 304 of the digital image 300. In this manner, the first representation 310 reflects a less granular, normalized representation of the features depicted in digital image 300.
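  • A minimal Python sketch of generating such a reduced-resolution first representation by averaging the intensity of each grid section follows. The default grid size of 25 echoes the 25x25 example above; a grayscale image and the uniform grid partition are assumptions for the example.

```python
import numpy as np

def first_representation(image, s=25):
    """Divide a grayscale image into an s-by-s grid and assign each pixel
    of the first representation the average intensity of its corresponding
    section (positions in the representation mirror section positions)."""
    h, w = image.shape[:2]
    rep = np.zeros((s, s), dtype=float)
    row_edges = np.linspace(0, h, s + 1).astype(int)
    col_edges = np.linspace(0, w, s + 1).astype(int)
    for i in range(s):
        for j in range(s):
            section = image[row_edges[i]:row_edges[i + 1],
                            col_edges[j]:col_edges[j + 1]]
            rep[i, j] = section.mean()   # characteristic section intensity
    return rep
```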
  • the pixels 312 comprising the first representation 310 may be represented using any characteristic value or combination of characteristic values without departing from the scope of the presently disclosed classification methods.
  • characteristic values may be computed and/or determined using any suitable means, such as by random selection of a characteristic value from a distribution of values, by a statistical means or measure, such as an average value, a spread of values, a minimum value, a maximum value, a standard deviation of values, a variance of values, or by any other means that would be known to a skilled artisan upon reading the instant descriptions.
  • a first feature vector is generated based on the first representation.
  • the first feature vector and/ or reference feature matrices may include a plurality of feature vectors, where each feature vector corresponds to a characteristic of a corresponding object class, e.g. a characteristic minimum, maximum, average, etc. brightness in one or more color channels at a particular location (pixel or section), presence of a particular symbol or other reference object at a particular location, dimensions, aspect ratio, pixel density (especially black pixel density, but also pixel density of any other color channel), etc.
  • feature vectors suitable for inclusion in the first feature vector and/or reference feature matrices comprise any type, number and/or length of feature vectors descriptive of one or more features of the image, e.g. a distribution of color data, etc.
  • the first feature vector is compared to a plurality of reference feature matrices, each reference feature matrix comprising a plurality of vectors.
  • the comparing operation 508 may be performed according to any suitable matrix comparison, vector comparison, or a combination of the two.
  • the comparing may include an N-dimensional feature space comparison.
  • N is greater than 50, but of course, N may be any value sufficiently large to ensure robust classification of objects into a single, correct object class, which those having ordinary skill in the art reading the present descriptions will appreciate to vary according to many factors, such as the complexity of the object, the similarity or distinctness between object classes, the number of object classes, etc.
  • "objects" include any tangible thing represented in an image and which may be described according to at least one unique characteristic such as color, size, dimensions, shape, texture, or representative feature(s) as would be understood by one having ordinary skill in the art upon reading the present descriptions. Additionally, objects may be classified according to at least one unique combination of such characteristics. For example, in various embodiments objects may include but are in no way limited to persons, animals, vehicles, buildings, landmarks, documents, furniture, plants, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • where each object class is characterized by a significant number of starkly distinguishing features or feature vectors (e.g. each object class corresponding to an object or object(s) characterized by very different size, shape, color profile and/or color scheme and easily distinguishable reference symbols positioned in unique locations on each object class, etc.), a relatively low value of N may be sufficiently large to ensure robust classification.
  • the value of N is preferably chosen or determined such that the classification is not only robust, but also computationally efficient; i.e. the classification process(es) introduce only minimal processing overhead to the device(s) or system(s) utilized to perform the classification algorithm.
  • The value of N that achieves the desired balance between classification robustness and processing overhead will depend on many factors such as described above and others that would be appreciated by one having ordinary skill in the art upon reading the present descriptions. Moreover, determining the appropriate value of N to achieve the desired balance may be accomplished using any known method or equivalent thereof as understood by a skilled artisan upon reading the instant disclosures.
  • an object depicted in the digital image is classified as a member of a particular object class based at least in part on the comparing operation 508. More specifically, the comparing operation 508 may involve evaluating each feature vector of each reference feature matrix, or alternatively evaluating a plurality of feature matrices for objects belonging to a particular object class, and identifying a hyper-plane in the N-dimensional feature space that separates the feature vectors of one reference feature matrix from the feature vectors of other reference feature matrices. In this manner, the classification algorithm defines concrete hyper-plane boundaries between object classes, and may assign an unknown object to a particular object class based on similarity of feature vectors to the particular object class and/or dissimilarity to other reference feature matrix profiles.
  • objects belonging to one particular class may be characterized by feature vectors having a distribution of values clustered in the lower-right portion of the feature space, while another class of objects may be characterized by feature vectors exhibiting a distribution of values clustered in the upper-left portion of the feature space, and the classification algorithm may distinguish between the two by identifying a line between the clusters separating the feature space into two classes - "upper-left" and "lower-right."
  • as the number of dimensions N increases, the complexity of the classification grows rapidly, but the added dimensions also provide significant improvements to classification robustness, as will be appreciated by one having ordinary skill in the art upon reading the present descriptions.
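  • As one possible concrete realization of hyper-plane-based separation in an N-dimensional feature space, the Python sketch below trains a linear support vector classifier on synthetic feature vectors. The use of scikit-learn's LinearSVC, the class names, and the synthetic data are assumptions for illustration; any classifier that identifies separating hyper-planes could be used.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical training data: each row is a feature vector derived from a
# 25x25 first representation (N = 625); labels name the object class.
rng = np.random.default_rng(1)
class_a = rng.normal(loc=0.25, scale=0.05, size=(40, 625))   # e.g. driver licenses
class_b = rng.normal(loc=0.75, scale=0.05, size=(40, 625))   # e.g. personal checks
X = np.vstack([class_a, class_b])
y = np.array(["driver_license"] * 40 + ["personal_check"] * 40)

# A linear SVM learns a separating hyper-plane in the N-dimensional feature space.
clf = LinearSVC().fit(X, y)

unknown = rng.normal(loc=0.27, scale=0.05, size=(1, 625))
print(clf.predict(unknown))    # -> ['driver_license']
```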
  • classification according to embodiments of the presently disclosed methods may include one or more additional and/or alternative features and/or operations, such as described below.
• classification such as described above may additionally and/or alternatively include assigning a confidence value to a plurality of putative object classes based on the comparing operation (e.g. as performed in operation 508 of method 500). The presently disclosed classification methods, systems and/or computer program products may additionally and/or alternatively determine a location of the mobile device, receive location information indicating the location of the mobile device, etc., and based on the determined location, a confidence value of a classification result corresponding to a particular location may be adjusted. For example, if a mobile device is determined to be located in a particular state (e.g. Maryland) based on a GPS signal, then during classification a confidence value may be adjusted for any object class corresponding to the particular state (e.g. Maryland Driver License, Maryland Department of Motor Vehicle Title/Registration Form, Maryland Traffic Violation Ticket, etc.) as would be understood by one having ordinary skill in the art upon reading the present descriptions.
• Confidence values may be adjusted in any suitable manner, such as increasing a confidence value for any object class corresponding to a particular location, decreasing a confidence value for any object class not corresponding to a particular location, normalizing confidence value(s) based on correspondence/non-correspondence to a particular location, etc. as would be understood by the skilled artisan reading the present disclosures.
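One hypothetical way to apply such location-based adjustments is sketched below in Python; the object class names, initial confidence values, and the multiplicative boost/penalty factors are illustrative assumptions rather than values taught by this disclosure.

    # Hypothetical classification confidences prior to location adjustment.
    confidences = {
        "MD_drivers_license": 0.40,
        "CA_drivers_license": 0.35,
        "passport": 0.25,
    }

    def adjust_for_location(confidences, detected_state, boost=1.5, penalty=0.75):
        """Increase confidence for classes matching the detected state, decrease
        it for the others, then renormalize so the values again sum to one."""
        adjusted = {}
        for cls, conf in confidences.items():
            factor = boost if cls.startswith(detected_state + "_") else penalty
            adjusted[cls] = conf * factor
        total = sum(adjusted.values())
        return {cls: conf / total for cls, conf in adjusted.items()}

    # E.g. the mobile device is determined (via GPS) to be located in Maryland.
    print(adjust_for_location(confidences, "MD"))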
  • the mobile device location may be determined using any known method, and employing hardware components of the mobile device or any other number of devices in communication with the mobile device, such as one or more satellites, wireless communication networks, servers, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
• the mobile device location may be determined based in whole or in part on one or more of a global-positioning system (GPS) signal, a connection to a wireless communication network, a database of known locations (e.g. a contact database, a database associated with a navigational tool such as Google Maps, etc.), a social media tool (e.g. a "check-in" feature such as provided via Facebook, Google Plus, Yelp, etc.), an IP address, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
  • classification additionally and/or alternatively includes outputting an indication of the particular object class to a display of the mobile device; and receiving user input via the display of the mobile device in response to outputting the indication.
• user input may be of any known type and relate to any of the herein described features and/or operations; preferably, the user input relates to confirming, negating or modifying the particular object class to which the object was assigned by the classification algorithm.
• the indication may be output to the display in any suitable manner, such as via a push notification, text message, display window on the display of the mobile device, email, etc., as would be understood by one having ordinary skill in the art.
• the user input may take any form and be received in any known manner, such as detecting a user tapping or pressing on a portion of the mobile device display (e.g. by detecting changes in resistance or capacitance on a touch-screen device, by detecting user interaction with one or more buttons or switches of the mobile device, etc.).
  • classification further includes determining one or more object features of a classified object based at least in part on the particular object class.
  • classification may include determining such object features using any suitable mechanism or approach, such as receiving an object class identification code and using the object class identification code as a query and/or to perform a lookup in a database of object features organized according to object class and keyed, hashed, indexed, etc. to the object class identification code.
• Object features within the scope of the present disclosures may include any feature capable of being recognized in a digital image, and preferably any feature capable of being expressed in a numerical format (whether scalar, vector, or otherwise), e.g. location of subregion containing reference object(s) (especially in one or more object orientation states, such as landscape, portrait, etc.), object color profile or color scheme, object subregion color profile or color scheme, location of text, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
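A lookup keyed to an object class identification code, as mentioned above, could be as simple as the following Python sketch; the class code, dimensions, subregion coordinates, and background color in the table are hypothetical placeholders.

    # Hypothetical table of known object features keyed by object class ID.
    OBJECT_FEATURES = {
        "MD_DL_2013": {
            "dimensions_in": (3.375, 2.125),           # known width, height (inches)
            "photo_region": (0.05, 0.25, 0.30, 0.95),  # left, top, right, bottom (fractions)
            "background_rgb": (210, 225, 235),         # dominant background color
        },
    }

    def lookup_object_features(class_id):
        """Return the known features for a classified object, or None if the
        class identification code is not in the table."""
        return OBJECT_FEATURES.get(class_id)

    print(lookup_object_features("MD_DL_2013"))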
• Referring now to FIG. 6, a method 600 is shown.
  • the method 600 may be carried out in any desired environment, and may include embodiments and/or approaches described in relation to FIGS. 1-4D, among others.
• More or fewer operations than those shown in FIG. 6 may be performed in accordance with method 600, as would be appreciated by one of ordinary skill in the art upon reading the present descriptions.
• a first feature vector is generated based on a digital image captured by a mobile device.
  • the first feature vector is compared to a plurality of reference feature matrices.
• an object depicted in the digital image is classified as a member of a particular object class based at least in part on the comparing (e.g. the comparing performed in operation 604).
  • one or more object features of the object are determined based at least in part on the particular object class.
  • a processing operation is performed.
• the processing operation includes performing one or more of the following subprocesses: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
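The overall flow of method 600 might be orchestrated as in the following Python sketch. Every concrete choice here (a coarse brightness histogram as the feature vector, nearest mean-vector distance as the comparison, the toy feature table, and the random stand-in data) is a hypothetical placeholder for the operations described above, not an actual implementation of them.

    import numpy as np

    def generate_feature_vector(image):
        # Stand-in for operation 602: a coarse, normalized brightness histogram.
        return np.histogram(image, bins=8, range=(0, 255))[0] / image.size

    def classify(vector, reference_matrices):
        # Stand-in for operations 604/606: pick the reference feature matrix
        # whose mean feature vector is closest to the generated vector.
        def distance(name):
            return np.linalg.norm(reference_matrices[name].mean(axis=0) - vector)
        return min(reference_matrices, key=distance)

    FEATURE_DB = {"document": {"aspect_ratio": 1.585}}  # operation 608 lookup table

    def process(image, reference_matrices):
        vector = generate_feature_vector(image)
        object_class = classify(vector, reference_matrices)
        features = FEATURE_DB.get(object_class, {})
        # Operation 610 would use `features` to drive detection, rectangularization,
        # cropping, and/or binarization; those steps are omitted from this sketch.
        return object_class, features

    # Toy usage with random data standing in for a captured image and references.
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(60, 90))
    refs = {"document": rng.random((5, 8)), "photo": rng.random((5, 8))}
    print(process(image, refs))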
• Exemplary characteristics that may be utilized to improve object detection include object dimensions, object shape, object color, and one or more reference features of the object class (such as reference symbols positioned in a known location of a document).
• object detection may be improved based on the one or more known characteristics by helping an object detection algorithm distinguish regions of the digital image depicting the object from regions of the digital image depicting other objects, image background, artifacts, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
• where objects belonging to a particular object class are known to exhibit a particular color profile or scheme, it may be simpler and/or more reliable to attempt detecting the particular color profile or scheme within the digital image rather than detecting a transition from one color profile or scheme (e.g. a background color profile or scheme) to another color profile or scheme (e.g. the object color profile or scheme), especially if the two color profiles or schemes are not characterized by sharply contrasting features.
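As one hypothetical illustration of detecting a known color profile directly, rather than searching for background-to-object transitions, the Python sketch below flags pixels whose color lies within a tolerance of a color known for the object class; the reference color and tolerance are assumptions made for illustration.

    import numpy as np

    def detect_by_color_profile(image_rgb, class_color, tolerance=30.0):
        """Return a boolean mask of pixels close to a known object-class color.

        image_rgb   : H x W x 3 array of pixel values
        class_color : (r, g, b) known for the object class (assumed)
        tolerance   : maximum Euclidean distance in RGB space (assumed)
        """
        diff = image_rgb.astype(float) - np.array(class_color, dtype=float)
        distance = np.sqrt((diff ** 2).sum(axis=2))
        return distance <= tolerance

    # Toy usage: a synthetic image whose central block matches the class color.
    img = np.zeros((100, 100, 3), dtype=np.uint8)
    img[30:70, 30:70] = (210, 225, 235)
    mask = detect_by_color_profile(img, (210, 225, 235))
    print("object pixels found:", int(mask.sum()))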
• similarly, it may be simpler, and/or may result in a higher-confidence or higher-quality result, to transform a digital representation of an object from a native appearance to a true configuration based on a set of known object characteristics that definitively represent the true object configuration, rather than attempting to estimate the true object configuration from the native appearance and project the native appearance onto an estimated object configuration.
• the classification may identify known dimensions of the object, and based on these known dimensions the digital image may be rectangularized to transform a distorted representation of the object in the digital image into an undistorted representation (e.g. by removing projective effects introduced in the process of capturing the image using a camera of a mobile device rather than a traditional flat-bed scanner, paper-feed scanner or other similar multifunction peripheral (MFP)).
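Rectangularization given known object dimensions can be sketched with OpenCV's perspective-transform utilities, as below (assuming OpenCV and NumPy are available); the detected corner coordinates and the 85.6 mm x 54 mm card dimensions are hypothetical stand-ins for values that would come, respectively, from object detection and from the particular object class.

    import cv2
    import numpy as np

    def rectangularize(image, detected_corners, width_mm, height_mm, dpi=300):
        """Warp a projectively distorted object to its known true aspect ratio.

        detected_corners    : four (x, y) points ordered TL, TR, BR, BL
        width_mm, height_mm : physical dimensions known for the object class
        """
        w = int(width_mm / 25.4 * dpi)
        h = int(height_mm / 25.4 * dpi)
        src = np.array(detected_corners, dtype=np.float32)
        dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]],
                       dtype=np.float32)
        matrix = cv2.getPerspectiveTransform(src, dst)
        return cv2.warpPerspective(image, matrix, (w, h))

    # Toy usage: hypothetical corner coordinates of a card within a photograph.
    photo = np.zeros((1200, 1600, 3), dtype=np.uint8)
    corners = [(420, 310), (1190, 360), (1150, 860), (380, 790)]
    card = rectangularize(photo, corners, width_mm=85.6, height_mm=54.0)
    print(card.shape)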
  • binarization algorithms generally transform a multi-tonal digital image (e.g. grayscale, color, or any other image such as image 400 exhibiting more than two tones) into a bitonal image, i.e. an image exhibiting only two tones (typically white and black).
• a digital image depicting an object with regions exhibiting two or more distinct color profiles and/or color schemes (e.g. a region depicting a color photograph 402 as compared to a region depicting a black/white text region 404, a color-text region 406, a symbol 408 such as a reference object, watermark, etc., an object background region 410, etc.) may be difficult to binarize satisfactorily using a single set of binarization parameters.
• these difficulties may be at least partially due to the differences between the color profiles, schemes, etc., which counter-influence a single binarization transform.
  • providing an ability to distinguish each of these regions having disparate color schemes or profiles and define separate binarization parameters for each may greatly improve the quality of the resulting bitonal image as a whole and with particular respect to the quality of the transformation in each respective region.
• improved binarization may include determining an object class color profile and/or scheme (e.g. determining a color profile and/or color scheme for object background region 410); adjusting one or more binarization parameters based on the object class color profile and/or color scheme; and thresholding the digital image using the one or more adjusted binarization parameters.
• Binarization parameters may include any parameter of any suitable binarization process as would be appreciated by those having ordinary skill in the art reading the present descriptions, and binarization parameters may be adjusted according to any suitable methodology. For example, with respect to adjusting binarization parameters based on an object class color profile and/or color scheme, binarization parameters may be adjusted to over- and/or under-emphasize a contribution of one or more color channels, intensities, etc. in accordance with the object class color profile/scheme (such as under-emphasizing the red channel for an object class color profile/scheme relatively saturated by red hue(s), etc.).
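One hypothetical way to under-emphasize a saturated channel before thresholding, along the lines discussed above, is to form a weighted grayscale image; the channel weights and global threshold below are illustrative assumptions, not parameters prescribed by this disclosure.

    import numpy as np

    def binarize_with_channel_weights(image_rgb, weights=(0.1, 0.45, 0.45),
                                      threshold=128):
        """Weighted-grayscale thresholding.

        weights   : per-channel (R, G, B) contributions; a low red weight
                    under-emphasizes red for a class saturated by red hues.
        threshold : intensity cut-off separating black from white.
        """
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
        gray = (image_rgb.astype(float) * w).sum(axis=2)
        return (gray >= threshold).astype(np.uint8) * 255

    # Toy usage on a synthetic reddish image.
    img = np.full((50, 50, 3), (200, 90, 80), dtype=np.uint8)
    bitonal = binarize_with_channel_weights(img)
    print(np.unique(bitonal))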
  • improved binarization may include determining an object class mask, applying the object class mask to the digital image and thresholding a subregion of the digital image based on the object class mask.
• the object class mask may be any type of mask, with the condition that the object class mask provides information regarding the location of particular regions of interest characteristic to objects belonging to the class (such as a region depicting a color photograph 402, a region depicting a black/white text region 404, a color-text region 406, a symbol region depicting a symbol 408 such as a reference object, watermark, etc., an object background region 410, etc.) and enabling the selective inclusion and/or exclusion of such regions from the binarization operation(s).
• improved binarization includes determining an object class mask 420 identifying regions such as discussed immediately above and applying the object class mask 420 to exclude from binarization all of the digital image 400 except a single region of interest, such as object background region 410.
• the entire digital image may be masked-out and a region of interest such as object background region 410 subsequently masked-in to the binarization process.
• the masking functionality now described with reference to FIG. 4B may be combined with the exemplary color profile and/or color scheme information functionality described above, for example by obtaining both the object class mask and the object color profile and/or color scheme, applying the object class mask to exclude all of the digital image from binarization except object background region 410, adjusting one or more binarization parameters based on the object background region color profile and/or color scheme, and thresholding the object background region 410 using the adjusted binarization parameters.
  • multiple regions of interest may be masked-in and/or masked-out using object class mask 420 to selectively designate regions and/or parameters for binarization in a layered approach designed to produce high-quality bitonal images.
• multiple text regions 404, 406 may be retained for binarization (potentially using adjusted parameters) after applying object class mask 420, for example to exclude all non-text regions from binarization, in some approaches.
• in further approaches, a unique region of a digital image 400, such as a region depicting a color photograph 402, may be masked out using an object class mask 420. Then, particularly in approaches where the remaining portion of the digital image 400 is characterized by a single color profile and/or color scheme, or a small number (i.e. no more than 3) of substantially similar color profiles and/or color schemes, binarization may be performed to clarify the remaining portions of the digital image 400.
• the masked-out unique region may optionally be restored to the digital image 400, with the result being an improved bitonal image quality in all regions of the digital image 400 that were subjected to binarization coupled with an undisturbed color photograph 402 in the region of the image not subjected to binarization.
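A layered mask-then-binarize approach of the kind described above might look like the following sketch; the region bounds standing in for an object class mask are hypothetical, and a single global threshold is used where the disclosure contemplates per-region parameters.

    import numpy as np

    def binarize_masked_regions(image_gray, masked_in_regions, threshold=128):
        """Threshold only the masked-in regions of a grayscale image.

        masked_in_regions : list of (top, bottom, left, right) pixel bounds, a
        hypothetical stand-in for an object class mask; everything outside these
        regions (e.g. a color photograph region) is left exactly as captured.
        """
        out = image_gray.copy()
        for top, bottom, left, right in masked_in_regions:
            patch = image_gray[top:bottom, left:right]
            out[top:bottom, left:right] = np.where(patch >= threshold, 255, 0)
        return out

    # Toy usage: binarize the upper "text" half, leave the lower half untouched.
    rng = np.random.default_rng(1)
    img = rng.integers(0, 256, size=(200, 300)).astype(np.uint8)
    result = binarize_masked_regions(img, [(0, 100, 0, 300)])
    print(np.unique(result[:100]))  # only 0 and 255 in the binarized region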
• the presently disclosed algorithms may determine that the expected format for text recognized in the digital image, e.g. via optical character recognition (OCR), follows a particular known format for the object class.
• the algorithm may correct the erroneous OCR predictions, e.g. converting the comma after "Jan" into a period and/or converting a letter "l" at the end of the year into a numerical one ("1") character.
• the presently disclosed algorithms may determine the expected format for the same text is instead "[##]/[##]/[####]" and convert "Jan" to "01" and convert each set of comma-space characters ", " into a slash "/" to correct the erroneous OCR predictions.
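A simple format-driven correction of OCR output, along the lines described above, is sketched below; the "[##]/[##]/[####]" target, the substitution rules, and the sample OCR string are hypothetical illustrations only.

    import re

    def correct_date_to_slash_format(ocr_text):
        """Coerce an OCR'd date toward a hypothetical [##]/[##]/[####] format."""
        months = {"Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
                  "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
                  "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12"}
        text = ocr_text
        for name, number in months.items():
            text = text.replace(name, number)
        text = re.sub(r",\s*", "/", text)   # comma-space separators become "/"
        text = re.sub(r"l\b", "1", text)    # a trailing letter "l" misread for digit one
        return text

    # Hypothetical erroneous OCR prediction for a date.
    print(correct_date_to_slash_format("Jan, 14, 201l"))  # -> 01/14/2011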
  • a method includes: receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.
  • Generating the first representation involves dividing the digital image into a plurality of sections; and determining, for each section, at least one characteristic value, each characteristic value corresponding to one or more features descriptive of the section.
• the first representation comprises a plurality of pixels, each of the plurality of pixels corresponds to one section of the plurality of sections, and each of the plurality of pixels is characterized by the at least one characteristic value determined for the corresponding section.
  • the digital image comprises a cropped, color image.
  • One or more of the reference feature matrices comprises a plurality of feature vectors, and each feature vector corresponds to at least one characteristic of an object.
  • the comparing comprises an N-dimensional comparison, and N is greater than 50.
  • the first feature vector is characterized by a feature vector length greater than 500.
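The reduced-resolution first representation and the resulting feature vector could be produced as in the Python sketch below; the 16 x 16 grid and the choice of per-section mean channel values as the characteristic values are illustrative assumptions (note that a 16 x 16 x 3 representation flattens to a 768-element vector, consistent with a feature vector length greater than 500).

    import numpy as np

    def first_representation(image, grid=(16, 16)):
        """Divide an H x W x C image into grid sections and build a reduced-
        resolution representation whose "pixels" hold each section's mean
        channel values (one possible choice of characteristic value)."""
        h, w, channels = image.shape
        rows, cols = grid
        rep = np.zeros((rows, cols, channels), dtype=float)
        for r in range(rows):
            for c in range(cols):
                section = image[r * h // rows:(r + 1) * h // rows,
                                c * w // cols:(c + 1) * w // cols]
                rep[r, c] = section.reshape(-1, channels).mean(axis=0)
        return rep

    def feature_vector(representation):
        """Flatten the reduced representation into a single feature vector."""
        return representation.ravel()

    # Toy usage with random data standing in for a cropped color image.
    img = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3))
    vec = feature_vector(first_representation(img))
    print(vec.shape)  # (768,)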
  • the method also includes determining one or more object features of the object based at least in part on the particular object class; detecting the object depicted in the digital image based at least in part on the classifying and/or result thereof; rectangularizing the object depicted in the digital image based at least in part on the classifying and/or result thereof;
• the binarizing additionally and/or alternatively includes one or more of: determining an object class mask; applying the object class mask to the digital image; and thresholding a subregion of the digital image based on the object class mask.
• the method may include adjusting one or more binarization parameters based on the object class mask; and thresholding the digital image using the one or more adjusted binarization parameters.
• the binarizing may additionally and/or alternatively include determining an object class color scheme; adjusting one or more binarization parameters based on the object class color scheme; and thresholding the digital image using the one or more adjusted binarization parameters.
  • the method additionally and/or alternatively includes: determining a geographical location associated with the mobile device, wherein the classifying is further based at least in part on the geographical location.
• the method additionally and/or alternatively includes: outputting an indication of the particular object class to a display of the mobile device; and receiving user input via the display of the mobile device in response to outputting the indication.
  • the method additionally and/or alternatively includes: determining one or more object features of the object based at least in part on the particular object class.
  • a method includes: generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices; classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and determining one or more object features of the object based at least in part on the particular object class.
  • the method also includes performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
• the one or more object features comprise an object color scheme, and the binarizing comprises: determining the object color scheme; adjusting one or more binarization parameters based on the processing; and thresholding the digital image using the one or more adjusted binarization parameters.
• the one or more object features may additionally and/or alternatively comprise an object class mask, and the binarizing comprises: determining the object class mask; applying the object class mask to the digital image; and thresholding a subregion of the digital image based on the object class mask.
• inventive concepts disclosed herein have been presented by way of example to illustrate the myriad features thereof in a plurality of illustrative scenarios, embodiments, and/or implementations. It should be appreciated that the concepts generally disclosed are to be considered as modular, and may be implemented in any combination, permutation, or synthesis thereof. In addition, any modification, alteration, or equivalent of the presently disclosed features, functions, and concepts that would be appreciated by a person having ordinary skill in the art upon reading the instant descriptions should also be considered within the scope of this disclosure.
  • one embodiment of the present invention includes all of the features disclosed herein, including those shown and described in conjunction with any of the FIGS.
• Other embodiments include subsets of the features disclosed herein and/or shown and described in conjunction with any of the FIGS.
• Such features, or subsets thereof, may be combined in any way using known techniques that would become apparent to one skilled in the art after reading the present description.

Abstract

In one embodiment, a method includes receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.

Description

FIELD OF INVENTION
[0001] The present invention relates to mobile image capture and image processing, and more particularly to capturing and processing digital images using a mobile device, and classifying objects detected in such digital images.
BACKGROUND OF THE INVENTION
[0002] Digital images having depicted therein an object inclusive of documents such as a letter, a check, a bill, an invoice, etc. have conventionally been captured and processed using a scanner or multifunction peripheral coupled to a computer workstation such as a laptop or desktop computer. Methods and systems capable of performing such capture and processing are well known in the art and well adapted to the tasks for which they are employed.
[0003] However, in an era where day-to-day activities, computing, and business are increasingly performed using mobile devices, it would be greatly beneficial to provide analogous document capture and processing systems and methods for deployment and use on mobile platforms, such as smart phones, digital cameras, tablet computers, etc.
[0004] A major challenge in transitioning conventional document capture and processing techniques is the limited processing power and image resolution achievable using hardware currently available in mobile devices. These limitations present a significant challenge because it is impossible or impractical to process images captured at resolutions typically much lower than achievable by a conventional scanner. As a result, conventional scanner-based processing algorithms typically perform poorly on digital images captured using a mobile device.
[0005] In addition, the limited processing and memory available on mobile devices makes conventional image processing algorithms employed for scanners prohibitively expensive in terms of computational cost. Attempting to execute a conventional scanner-based image processing algorithm on a mobile device takes far too much time to be a practical application on modern mobile platforms.
[0006] A still further challenge is presented by the nature of mobile capture components (e.g. cameras on mobile phones, tablets, etc.). Where conventional scanners are capable of faithfully representing the physical document in a digital image, critically maintaining aspect ratio, dimensions, and shape of the physical document in the digital image, mobile capture components are frequently incapable of producing such results.
[0007] Specifically, images of documents captured by a camera present a new line of processing issues not encountered when dealing with images captured by a scanner. This is in part due to the inherent differences in the way the document image is acquired, as well as the way the devices are constructed. The way that some scanners work is to use a transport mechanism that creates a relative movement between paper and a linear array of sensors. These sensors create pixel values of the document as it moves by, and the sequence of these captured pixel values forms an image. Accordingly, there is generally a horizontal or vertical consistency up to the noise in the sensor itself, and it is the same sensor that provides all the pixels in the line.
[0008] In contrast, cameras have many more sensors in a nonlinear array, e.g., typically arranged in a rectangle. Thus, all of these individual sensors are independent, and render image data that is not typically of horizontal or vertical consistency. In addition, cameras introduce a projective effect that is a function of the angle at which the picture is taken. For example, with a linear array like in a scanner, even if the transport of the paper is not perfectly orthogonal to the alignment of sensors and some skew is introduced, there is no projective effect like in a camera. Additionally, with camera capture, nonlinear distortions may be introduced because of the camera optics.
[0009] Conventional image processing algorithms designed to detect documents in images captured using traditional flat-bed and/or paper feed scanners may also utilize information derived from page detection to attempt to classify detected documents as members of a particular document class. However, due to the unique challenges introduced by virtue of capturing digital images using cameras of mobile devices, these conventional classification algorithms perform inadequately and are incapable of robustly classifying documents in such digital images.
[0010] Moreover, even when documents can be properly classified, the hardware limitations of current mobile devices make performing classification using the mobile device prohibitively expensive from a computational efficiency standpoint.
[0011] In view of the challenges presented above, it would be beneficial to provide an image capture and processing algorithm and applications thereof that compensate for and/or correct problems associated with image capture, processing and classification using a mobile device, while maintaining a low computational cost via efficient processing methods.
[0012] Moreover, it would be a further improvement in the field to provide object classification systems, methods and computer program products capable of robustly assigning objects to a particular class of objects and utilize information known about members of the class to further address and overcome unique challenges inherent to processing images captured using a camera of a mobile device.
SUMMARY OF THE INVENTION
[0013] In one embodiment, a method includes: receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.
[0014] In another embodiment, a method includes: generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices; classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and determining one or more object features of the object based at least in part on the particular object class; and performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
[0015] In still another embodiment, a system includes a processor; and logic in and/or executable by the processor to cause the processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first
representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
[0016] In still yet another embodiment, a computer program product includes a computer readable storage medium having program code embodied therewith, the program code readable/executable by a processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the
comparison.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] FIG. 1 illustrates a network architecture, in accordance with one embodiment.
[0018] FIG. 2 shows a representative hardware environment that may be associated with the servers and/or clients of FIG. 1, in accordance with one embodiment.
[0019] FIG. 3A depicts a digital image of an object, according to one embodiment.
[0020] FIG. 3B depicts a schematic representation of the digital image shown in FIG. 3A divided into a plurality of sections for generating a first representation of the digital image, according to one embodiment.
[0021] FIG. 3C depicts a first representation of the digital image shown in FIG. 3A, the first representation being characterized by a reduced resolution relative to the resolution of the digital image.
[0022] FIG. 4A is a schematic representation of a plurality of subregions depicted in a digital image of a document, according to one embodiment.
[0023] FIG. 4B is a masked representation of the digital image shown in FIG. 4A, according to one embodiment.
[0024] FIG. 4C is a masked representation of the digital image shown in FIG. 4A, according to one embodiment.
[0025] FIG. 4D is a masked representation of the digital image shown in FIG. 4A, according to one embodiment.
[0026] FIG. 5 is a flowchart of a method, according to one embodiment.
[0027] FIG. 6 is a flowchart of a method, according to one embodiment.
DETAILED DESCRIPTION
[0028] The following description is made for the purpose of illustrating the general principles of the present invention and is not meant to limit the inventive concepts claimed herein. Further, particular features described herein can be used in combination with other described features in each of the various possible combinations and permutations.
[0029] Unless otherwise specifically defined herein, all terms are to be given their broadest possible interpretation including meanings implied from the specification as well as meanings understood by those skilled in the art and/or as defined in dictionaries, treatises, etc.
[0030] It must also be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless otherwise specified.
[0031] The present application refers to image processing of images (e.g. pictures, figures, graphical schematics, single frames of movies, videos, films, clips, etc.) captured by cameras, especially cameras of mobile devices. As understood herein, a mobile device is any device capable of receiving data without having power supplied via a physical connection (e.g. wire, cord, cable, etc.) and capable of receiving data without a physical data connection (e.g. wire, cord, cable, etc.). Mobile devices within the scope of the present disclosures include exemplary devices such as a mobile telephone, smartphone, tablet, personal digital assistant, iPod®, iPad®, BLACKBERRY® device, etc.
[0032] However, as it will become apparent from the descriptions of various functionalities, the presently disclosed mobile image processing algorithms can be applied, sometimes with certain modifications, to images coming from scanners and multifunction peripherals (MFPs). Similarly, images processed using the presently disclosed processing algorithms may be further processed using conventional scanner processing algorithms, in some approaches.
[0033] Of course, the various embodiments set forth herein may be implemented utilizing hardware, software, or any desired combination thereof. For that matter, any type of logic may be utilized which is capable of implementing the various functionality set forth herein.
[0034] One benefit of using a mobile device is that with a data plan, image processing and information processing based on captured images can be done in a much more convenient, streamlined and integrated way than previous methods that relied on presence of a scanner. However, the use of mobile devices as document(s) capture and/or processing devices has heretofore been considered unfeasible for a variety of reasons.
[0035] In one approach, an image may be captured by a camera of a mobile device. The term "camera" should be broadly interpreted to include any type of device capable of capturing an image of a physical object external to the device, such as a piece of paper. The term "camera" does not encompass a peripheral scanner or multifunction device. Any type of camera may be used. Preferred embodiments may use cameras having a higher resolution, e.g. 8 MP or more, ideally 12 MP or more. The image may be captured in color, grayscale, black and white, or with any other known optical effect. The term "image" as referred to herein is meant to encompass any type of data corresponding to the output of the camera, including raw data, processed data, etc.
[0036] General Embodiments
[0037] In one general embodiment a method includes: receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.
[0038] In another general embodiment, a method includes: generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices; classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and determining one or more object features of the object based at least in part on the particular object class; and performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
[0039] In still another general embodiment, a system includes a processor; and logic in and/or executable by the processor to cause the processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
[0040] In still yet another general embodiment, a computer program product includes a computer readable storage medium having program code embodied therewith, the program code readable/executable by a processor to: generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation; compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the
comparison.
[0041] As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as "logic," "circuit," "module" or "system." Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
[0042] Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, processor, or device.
[0043] A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband, as part of a carrier wave, an electrical connection having one or more wires, an optical fiber, etc. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
[0044] Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
[0045] Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
[0046] Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0047] These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
[0048] The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0049] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
[0050] FIG. 1 illustrates an architecture 100, in accordance with one embodiment. As shown in FIG. 1, a plurality of remote networks 102 are provided including a first remote network 104 and a second remote network 106. A gateway 101 may be coupled between the remote networks 102 and a proximate network 108. In the context of the present architecture 100, the networks 104, 106 may each take any form including, but not limited to a LAN, a WAN such as the Internet, public switched telephone network (PSTN), internal telephone network, etc.
[0051] In use, the gateway 101 serves as an entrance point from the remote networks 102 to the proximate network 108. As such, the gateway 101 may function as a router, which is capable of directing a given packet of data that arrives at the gateway 101, and a switch, which furnishes the actual path in and out of the gateway 101 for a given packet.
[0052] Further included is at least one data server 114 coupled to the proximate network 108, and which is accessible from the remote networks 102 via the gateway 101. It should be noted that the data server(s) 114 may include any type of computing device/groupware. Coupled to each data server 114 is a plurality of user devices 116. Such user devices 116 may include a desktop computer, lap-top computer, hand-held computer, printer or any other type of logic. It should be noted that a user device 1 may also be directly coupled to any of the networks, in one embodiment.
[0053] A peripheral 120 or series of peripherals 120, e.g., facsimile machines, printers, networked and/or local storage units or systems, etc., may be coupled to one or more of the networks 104, 106, 108. It should be noted that databases and/or additional components may be utilized with, or integrated into, any type of network element coupled to the networks 104, 106, 108. In the context of the present description, a network element may refer to any component of a network.
[0054] According to some approaches, methods and systems described herein may be implemented with and/or on virtual systems and/or systems which emulate one or more other systems, such as a UNIX system which emulates an IBM z/OS environment, a UNIX system which virtually hosts a MICROSOFT WINDOWS environment, a MICROSOFT WINDOWS system which emulates an IBM z/OS environment, etc. This virtualization and/or emulation may be enhanced through the use of VMWARE software, in some embodiments.
[0055] In more approaches, one or more networks 104, 106, 108, may represent a cluster of systems commonly referred to as a "cloud." In cloud computing, shared resources, such as processing power, peripherals, software, data, servers, etc., are provided to any system in the cloud in an on-demand relationship, thereby allowing access and distribution of services across many computing systems. Cloud computing typically involves an Internet connection between the systems operating in the cloud, but other techniques of connecting the systems may also be used.
[0056] FIG. 2 shows a representative hardware environment associated with a user device 116 and/or server 114 of FIG. 1, in accordance with one embodiment. Such figure illustrates a typical hardware configuration of a workstation having a central processing unit 210, such as a microprocessor, and a number of other units interconnected via a system bus 212.
[0057] The workstation shown in FIG. 2 includes a Random Access Memory (RAM) 214, Read Only Memory (ROM) 216, an I/O adapter 218 for connecting peripheral devices such as disk storage units 220 to the bus 212, a user interface adapter 222 for connecting a keyboard 224, a mouse 226, a speaker 228, a microphone 232, and/or other user interface devices such as a touch screen and a digital camera (not shown) to the bus 212, a communication adapter 234 for connecting the workstation to a communication network 235 (e.g., a data processing network), and a display adapter 236 for connecting the bus 212 to a display device 238.
[0058] The workstation may have resident thereon an operating system such as the Microsoft Windows® Operating System (OS), a MAC OS, a UNIX OS, etc. It will be appreciated that a preferred embodiment may also be implemented on platforms and operating systems other than those mentioned. A preferred embodiment may be written using JAVA, XML, C, and/or C++ language, or other programming languages, along with an object oriented programming methodology. Object oriented programming (OOP), which has become increasingly used to develop complex applications, may be used.
[0059] An application may be installed on the mobile device, e.g., stored in a nonvolatile memory of the device. In one approach, the application includes instructions to perform processing of an image on the mobile device. In another approach, the application includes instructions to send the image to a remote server such as a network server. In yet another approach, the application may include instructions to decide whether to perform some or all processing on the mobile device and/or send the image to the remote site.
[0060] Various Embodiments of Page Detection
[0061] One exemplary embodiment illustrating an exemplary methodology for performing page detection will now be described.
[0062] In one approach, an edge detection algorithm proceeds from the boundaries of a digital image toward a central region of the image, looking for points that are sufficiently different from what is known about the properties of the background.
[0063] Notably, the background in the images captured by even the same mobile device may be different every time, so a new technique to identify the document(s) in the image is provided.
[0064] Finding page edges within a camera-captured image according to the present disclosures helps to accommodate important differences in the properties of images captured using mobile devices as opposed, e.g., to scanners. For example, due to projective effects the image of a rectangular document in a photograph may not appear truly rectangular, and opposite sides of the document in the image may not have the same length. Second, even the best lenses have some non-linearity resulting in straight lines within an object, e.g. straight sides of a substantially rectangular document, appearing slightly curved in the captured image of that object. Third, images captured using cameras overwhelmingly tend to introduce uneven illumination effects in the captured image. This unevenness of illumination makes even a perfectly uniform background of the surface against which a document may be placed appear in the image with varied brightness, and often with shadows, especially around the page edges if the page is not perfectly flat.
[0065] In an exemplary approach, to avoid mistaking the variability within the background for page edges, the current algorithm utilizes one or more of the following functionalities.
[0066] In various embodiments, the frame of the image contains the digital representation of the document with margins of the surrounding background. In the preferred implementation the search for individual page edges may be performed on a step-over approach analyzing rows and columns of the image from outside in. In one embodiment, the step-over approach may define a plurality of analysis windows within the digital image. As understood herein, analysis windows may include one or more "background windows," i.e. windows encompassing only pixels depicting the background of the digital image, as well as one or more "test windows," i.e. windows encompassing pixels depicting the background of the digital image, the digital representation of the document, or both.
[0067] In a preferred embodiment, the digital representation of the document may be detected in the digital image by defining a first analysis window, i.e. a background analysis window, in a margin of the image corresponding to the background of the surface upon which the document is placed. Within the first analysis window, a plurality of small analysis windows (e.g. test windows) may be defined. Utilizing the plurality of test windows, one or more distributions of one or more statistical properties descriptive of the background may be estimated.
[0068] With continuing reference to the preferred embodiment discussed immediately above, a next step in detecting boundaries of the digital representation of the document may include defining a plurality of test windows within the digital image, and analyzing the corresponding regions of the digital image. For each test window one or more statistical values descriptive of the corresponding region of the image may be calculated. Further, these statistical values may be compared to a corresponding distribution of statistics descriptive of the background.
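One hypothetical realization of comparing a test window against estimated background statistics is sketched below; the window sizes, the use of mean and standard deviation of brightness as the statistics, and the three-sigma criterion are all assumptions made for illustration, not the specific statistics or thresholds of the present disclosure.

    import numpy as np

    def estimate_background_stats(image_gray, corner=(0, 0), size=(60, 60),
                                  small=(3, 7)):
        """Estimate background statistics from small test windows tiled over a
        large analysis window placed in a background margin of the image."""
        top, left = corner
        big = image_gray[top:top + size[0], left:left + size[1]]
        means = []
        for r in range(0, size[0] - small[0] + 1, small[0]):
            for c in range(0, size[1] - small[1] + 1, small[1]):
                means.append(big[r:r + small[0], c:c + small[1]].mean())
        means = np.array(means)
        return means.mean(), means.std()

    def is_non_background(test_window, bg_mean, bg_std, k=3.0):
        """Flag a test window whose mean brightness deviates from the background
        distribution by more than k standard deviations."""
        return abs(test_window.mean() - bg_mean) > k * max(bg_std, 1e-6)

    # Toy usage: a uniform background with a darker "document" region.
    img = np.full((400, 600), 200, dtype=np.uint8)
    img[150:350, 200:500] = 90
    bg_mean, bg_std = estimate_background_stats(img)
    print(is_non_background(img[150:153, 200:207], bg_mean, bg_std))  # True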
[0069] In a preferred approach, the plurality of test windows may be defined along a path, particularly a linear path. In a particularly preferred approach, the plurality of test windows may be defined in a horizontal direction and/or a vertical direction, e.g. along rows and columns of the digital image. Moreover, a stepwise progression may be employed to define the test windows along the path and/or between the rows and/or columns. In some embodiments, as will be appreciated by one having ordinary skill in the art upon reading the present descriptions, utilizing a stepwise progression may advantageously increase the computational efficiency of document detection processes.
[0070] Moreover, the magnitude of the starting step may be estimated based on the resolution or pixel size of the image, in some embodiments, but this step may be reduced if advantageous for reliable detection of document sides, as discussed further below.
[0071] In more embodiments, the algorithm estimates the distribution of several statistics descriptive of the image properties found in a large analysis window placed within the background surrounding the document. In one approach a plurality of small windows may be defined within the large analysis window, and distributions of statistics descriptive of the small test windows may be estimated. In one embodiment, the large analysis window is defined in a background region of the digital image, such as a top-left corner of the image.
[0072] Statistics descriptive of the background pixels may include any statistical value that may be generated from digital image data, such as a minimum value, a maximum value, a median value, a mean value, a spread or range of values, a variance, a standard deviation, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. Values may be sampled from any data descriptive of the digital image, such as brightness values in one or more color channels, e.g. red-green-blue (RGB), cyan-magenta-yellow-black (CMYK), hue-saturation-value (HSV), etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[0073] In one approach, each of the small analysis windows may comprise a subset of the plurality of pixels within the large analysis window. Moreover, small analysis windows may be of any size and/or shape capable of fitting within the boundaries of large analysis window. In a preferred embodiment, small analysis windows may be characterized by a rectangular shape, and even more preferably a rectangle characterized by being three pixels long in a first direction (e.g. height) and seven pixels long in a second direction (e.g. width). Of course, other small analysis window sizes, shapes, and dimensions are also suitable for implementation in the presently disclosed processing algorithms.
[0074] In one embodiment, test windows may be employed to analyze an image and detect the boundary of a digital representation of a document depicted in the image. Background windows are used for estimation of original statistical properties of the background and/or reestimation of local statistical properties of the background. Reestimation may be necessary and/or advantageous in order to address artifacts such as uneven illumination and/or background texture variations.
[0075] Preferably, statistical estimation may be performed over some or all of a plurality of small analysis window(s) in a large analysis window within the margin outside of the document page in some approaches. Such estimation may be performed using a stepwise movement of a small analysis window within the large analysis window, and the stepwise movement may be made in any suitable increment so as to vary the number of samples taken for a given pixel. For example, to promote computational efficiency, an analysis process may define a number of small analysis windows within large analysis window sufficient to ensure each pixel is sampled once. Thus the plurality of small analysis windows defined in this computationally efficient approach would share common borders but not overlap.
[0076] In another approach designed to promote robustness of statistical estimations, the analysis process may define a number of small analysis windows within the large analysis window sufficient to ensure each pixel is sampled a maximum number of times, e.g. by reducing the step to produce only a single pixel shift in a given direction between sequentially defined small analysis windows. Of course, any step increment may be employed in various embodiments of the presently disclosed processing algorithms, as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[0077] The skilled artisan will appreciate that large analysis windows utilized to reestimate statistics of local background in the digital image, as well as test windows, can be placed in the digital image in any manner desirable.
[0078] For example, according to one embodiment, the search for the left side edge in a given row i begins from the calculation of the above mentioned statistics in a large analysis window adjacent to the frame boundary on the left side of the image, centered around the given row.
[0079] In still more embodiments, when encountering a possible non-background test window (e.g. a test window for which the estimated statistics are dissimilar from the distribution of statistics characteristic of the last known local background) as the algorithm progresses from the outer region(s) of the image towards the interior regions thereof, the algorithm may backtrack into a previously determined background region, form a new large analysis window and re-estimate the distribution of background statistics in order to reevaluate the validity of the differences between the chosen statistics within the small analysis window and the local distribution of corresponding statistics within the large analysis window.
[0080] As will be appreciated by one having ordinary skill in the art upon reading the present descriptions, the algorithm may proceed from an outer region of the image to an inner region of the image in a variety of manners. For example, in one approach the algorithm proceeds by defining test windows in a substantially spiral pattern. In other approaches the pattern may be substantially serpentine along either a vertical or a horizontal direction. In still more approaches the pattern may be a substantially shingled pattern. The pattern may also be defined by a "sequence mask" laid over part or all of the digital image, such as a checkerboard pattern; a vertically, horizontally, or diagonally striped pattern; concentric shapes; etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. In other embodiments, analysis windows such as large analysis windows and/or small analysis windows may be defined throughout the digital image in a random manner, a pseudo-random manner, stochastically, etc. according to some defined procedure, as would be understood by one having ordinary skill in the art upon reading the present descriptions. The algorithm can proceed with a sequence of test windows in any desirable fashion as long as the path allows backtracking into known background and the path covers the whole image with the desired granularity.
[0081] Advantageously, recalculating statistics in this manner helps to accommodate any illumination drift inherent to the digital image and/or background, which may otherwise result in false identification of non-background points in the image (e.g. outlier candidate edge points).

[0082] In still yet more embodiments, when the difference is statistically valid, the algorithm may jump a certain distance further along its path in order to check again and thus bypass small variations in the texture of the background, such as wood grain, scratches on a surface, patterns of a surface, small shadows, etc., as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[0083] In additional and/or alternative embodiments, after a potential non-background point has been found, the algorithm determines whether the point lies on the edge of the shadow (a possibility especially if the edge of the page is raised above the background surface) and tries to get to the actual page edge. This process relies on the observation that shadows usually darken towards the real edge followed by an abrupt brightening of the image.
[0084] The above described approach to page edge detection was utilized because the use of standard edge detectors may be unnecessary and even undesirable, for several reasons. First, most standard edge detectors involve operations that are time consuming, and second, the instant algorithm is not concerned with additional requirements like monitoring how thin the edges are, which directions they follow, etc. Even more importantly, looking for page edges does not necessarily involve edge detection per se, i.e. page edge detection according to the present disclosures may be performed in a manner that does not search for a document boundary (e.g. page edge), but rather searches for image characteristics associated with a transition from background to the document. For example, the transition may be characterized by flattening of the off-white brightness levels within a glossy paper, i.e. by changes in texture rather than in average gray or color levels.
[0085] As a result, it is possible to obtain candidate edge points that are essentially the first and the last non-background pixels in each row and column on a grid. In order to eliminate random outliers (e.g. outlier candidate edge points) and to determine which candidate edge points correspond to each side of the page, it is useful in one approach to analyze neighboring candidate edge points.
[0086] In one embodiment, a "point" may be considered any region within the digital image, such as a pixel, a position between pixels (e.g. a point with fractional coordinates such as the center of a 2-pixel by 2-pixel square), a small window of pixels, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. In a preferred embodiment, a candidate edge point is associated with the center of a test window (e.g. a 3-pixel by 7-pixel window) that has been found to be characterized by statistics that are determined to be different from the distribution of statistics descriptive of the local background.

[0087] As understood herein, a "neighboring" candidate edge point, or a "neighboring" pixel, is considered to be a point or pixel, respectively, which is near or adjacent a point or pixel of interest, e.g. a point or pixel positioned at least in part along a boundary of the point or pixel of interest, a point or pixel positioned within a threshold distance of the point or pixel of interest (such as within 2, 10, 64 pixels, etc. in a given direction, within one row of the point or pixel of interest, within one column of the point or pixel of interest), etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. In preferred approaches, the "neighboring" point or pixel may be the closest candidate edge point to the point of interest along a particular direction, e.g. a horizontal direction and/or a vertical direction.
[0088] Each "good" edge point ideally has at least two immediate neighbors (one on each side) and. does not deviate far from a straight line segment connecting these neighbors and the "good" edge point, e.g. the candidate edge point and the at least two immediately neighboring points may be fit to a linear regression, and the result may be characterized by a coefficient of determination (R2) not less than 0.95. The angle of this segment with respect to one or more borders of the digital image, together with its relative location determines whether the edge point is assigned to top, left, right, or bottom side of the page. In a preferred embodiment, a candidate edge point and the two neighboring edge points may be assigned to respective comers of a triangle. If the angle of the triangle at the candidate edge point is close to 180 degrees, then the candidate edge point may be considered a "good" candidate edge point. If the angle of the triangle at the candidate edge point deviates far from 180 degrees by more than a threshold value (such as by 20 degrees or more), then the candidate edge point may be excluded from the set of "good" candidate edge points. The rationale behind, this heuristic is based, on the desire to throw out random errors in the determination of the first and last non-background pixels within rows and. columns. 'These pixels are unlikely to exist in consistent lines, so checking the neighbors in terms of distance and direction is particularly advantageous in some approaches.
[0089] For speed, the step of this grid may start from a large number such as 32, but it may be reduced by a factor of two and the search for edge points repeated until there are enough of them to determine the Least Mean Squares (LMS) based equations of page sides (see below). If this process cannot determine the sides reliably even after using all rows and columns in the image, it gives up and the whole image is treated as the page.
[0090] The equations of page sides are determined as follows, in one embodiment. First, the algorithm fits the best LMS straight line to each of the sides using the strategy of throwing out the worst outliers until all the remaining supporting edge points lie within a small distance from the LMS line. For example, a point with the largest distance from a substantially straight line connecting a plurality of candidate edge points along a particular boundary of the document may be designated the "worst" outlier. This procedure may be repeated iteratively to designate and/or remove one or more "worst" outliers from the plurality of candidate edge points. In some approaches, the distance by which a candidate edge point may deviate from the line connecting the plurality of candidate edge points is based at least in part on the size and/or resolution of the digital image.
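A minimal sketch of this fit-and-reject strategy for one roughly horizontal side is shown below. It assumes numpy; the distance tolerance, minimum support count, and example points are illustrative assumptions, not values mandated by the specification.

```python
import numpy as np

# Hedged sketch: least-squares fit of a (roughly horizontal) page side with
# iterative rejection of the single worst outlier until all remaining points
# lie within max_dist of the fitted line.

def fit_top_or_bottom_side(points, max_dist=2.0, min_support=5):
    """Fit y = m*x + b, dropping the farthest point until all fit within max_dist."""
    pts = np.asarray(points, dtype=float)
    while len(pts) >= min_support:
        m, b = np.polyfit(pts[:, 0], pts[:, 1], 1)
        # Perpendicular distance of every point to the fitted line.
        dist = np.abs(m * pts[:, 0] - pts[:, 1] + b) / np.hypot(m, 1.0)
        worst = int(np.argmax(dist))
        if dist[worst] <= max_dist:
            return m, b, pts                     # all remaining support is close
        pts = np.delete(pts, worst, axis=0)      # discard the worst outlier, refit
    return None  # insufficient support; caller may fall back to the whole frame

candidates = [(0, 10), (20, 11), (40, 12), (60, 55), (80, 14), (100, 15)]
print(fit_top_or_bottom_side(candidates))  # the (60, 55) outlier is rejected
```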
[0091] If this line is not well supported all along its stretch, the algorithm may attempt to fit the best second-degree polynomial (parabola) to the same original candidate points. The algorithmic difference between finding the best parabola vs. the best straight line is minor:
instead of two unknown coefficients determining the direction and offset of the line there are three coefficients determining the curvature, direction, and offset of the parabola; however, in other respects the process is essentially the same, in one embodiment.
[0092] If the support of the parabola is stronger than that of the straight line, especially closer to the ends of the candidate edge span, the conclusion is that the algorithm should prefer the parabola as a better model of the page side in the image. Otherwise, the linear model is employed, in various approaches.
[0093] Intersections of the four found sides of the document may be calculated in order to find the corners of the (possibly slightly curved) page tetragon, discussed in further detail below. In the preferred implementation, in order to do this it is necessary to consider three cases: calculating intersections of two straight lines, calculating intersections of a straight line and a parabola, and calculating intersections of two parabolas.
[0094] In the first case there is a single solution (since top and bottom page edges stretch mostly horizontally, while left and right page edges stretch mostly vertically, the corresponding LMS lines cannot be parallel) and this solution determines the coordinates of the corresponding page corner.
[0095] The second case, calculating intersections of a straight line and a parabola, is slightly more complicated: there can be zero, one, or two solutions of the resulting quadratic equation. If there is no intersection, it may indicate a fatal problem with page detection, and its result may be rejected. A single solution is somewhat unlikely, but presents no further problems. Two intersections present a choice, in which case the intersection closer to the corresponding corner of the frame is a better candidate - in practice, the other solution of the equation may be very far away from the coordinate range of the image frame.
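The line/parabola case can be sketched as follows. The snippet is an illustrative assumption: it intersects a substantially straight, nearly vertical side x = s*y + t with a parabolic, nearly horizontal side y = c2*x^2 + c1*x + c0, and keeps the root nearer an expected frame corner; the names and the corner-selection rule are hypothetical.

```python
import math

# Hedged sketch: substituting y from the parabola into the straight side gives
# s*c2*x^2 + (s*c1 - 1)*x + (s*c0 + t) = 0, with zero, one, or two roots.

def intersect_straight_and_parabola(s, t, c2, c1, c0, corner):
    A, B, C = s * c2, s * c1 - 1.0, s * c0 + t
    if abs(A) < 1e-12:                           # side is effectively straight too
        roots = [-C / B] if abs(B) > 1e-12 else []
    else:
        disc = B * B - 4.0 * A * C
        if disc < 0:
            return None                          # no intersection: reject detection
        roots = [(-B + math.sqrt(disc)) / (2 * A), (-B - math.sqrt(disc)) / (2 * A)]
    candidates = [(x, c2 * x * x + c1 * x + c0) for x in roots]
    if not candidates:
        return None
    # With two roots, keep the one closest to the expected frame corner; the
    # other root typically lies far outside the coordinate range of the frame.
    return min(candidates, key=lambda p: math.hypot(p[0] - corner[0], p[1] - corner[1]))

print(intersect_straight_and_parabola(0.01, 5.0, 0.0001, 0.01, 20.0, corner=(0, 0)))
```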
[0096] The third case, calculating intersections of two parabolas, results in a fourth degree polynomial equation that (in principle) may be solved analytically. However, in practice the number of calculations necessary to achieve a solution may be greater than in an approximate iterative algorithm that also guarantees the desired sub-pixel precision.
[0097] One exemplary procedure used for this purpose is described in detail below with reference to rectangularization of the digital representation of the document, according to one approach.
[0098] There are several constraints on the validity of the resulting target tetragon (e.g. the tetragon discussed in further detail below). Namely, the tetragon is preferably not too small (e.g. not less than a predefined threshold of any desired value, such as 25% of the total area of the image), the corners of the tetragon preferably do not lie too far outside of the frame of the image (e.g. not more than 100 pixels away), and the corners themselves should preferably be interpretable as top-left, top-right, bottom-left and bottom-right with diagonals intersecting inside of the tetragon, etc. If these constraints are not met, a given page detection result may be rejected, in some embodiments.
[0099] In one illustrative embodiment where the detected tetragon of the digital
representation of the document is valid, the algorithm may determine a target rectangle. Target rectangle width and height may be set to the average of top and bottom sides of the tetragon and the average of left and right sides respectively.
[00100] In one embodiment, if skew correction is performed, the angle of skew of the target rectangle may be set to zero so that the page sides will become horizontal and vertical.
Otherwise, the skew angle may be set to the average of the angles of top and bottom sides to the horizontal axis and those of the left and right sides to the vertical axis.
[00101] In a similar fashion, if crop correction is not performed, the center of the target rectangle may be designated so as to match the average of the coordinates of the four corners of the tetragon; otherwise the center may be calculated so that the target rectangle ends up in the top left of the image frame, in additional embodiments.
[00102] In some approaches, if the page detection result is rejected for any reason, some or all steps of the process described herein may be repeated with a smaller step increment, in order to obtain more candidate edge points and, advantageously, achieve more plausible results. In a worst-case scenario where problems persist even with the minimum allowed step, the detected page may be set to the whole image frame and the original image may be left untouched.
[00103] Now with particular reference to an exemplary implementation of the inventive page detection embodiment described herein, in one approach page detection includes performing a method such as that described below. As will be appreciated by one having ordinary skill in the art upon reading the present descriptions, the method may be performed in any environment, including those described herein and represented in any of the Figures provided with the present disclosures.
[00104] In one embodiment, a plurality of candidate edge points corresponding to a transition from a digital image background to the digital representation of the document are defined.

[00105] In various embodiments, defining the plurality of candidate edge points may include one or more additional operations, such as those described below.
[00106] According to one embodiment, a large analysis window is defined within the digital image. Preferably, a first large analysis window is defined in a region depicting a plurality of pixels of the digital image background, but not depicting the non-background (e.g. the digital representation of the document), in order to obtain information characteristic of the digital image background for comparison and contrast to information characteristic of the non-background (e.g. the digital representation of the document, such as background statistics discussed in further detail below). For example, the first large analysis window may be defined in a corner (such as a top-left corner) of the digital image. Of course, the first large analysis window may be defined in any part of the digital image without departing from the scope of the present disclosures.
[00107] Moreover, as will be understood by one having ordinary skill in the art upon reading the present descriptions, the large analysis window may be any size and/or characterized by any suitable dimensions, but in preferred embodiments the large analysis window is approximately forty pixels high and approximately forty pixels wide.
[00108] In particularly preferred approaches, the large analysis window may be defined in a corner region of the digital image. For example, a digital image comprises a digital
representation of a document having a plurality of sides and a background. As described above, the large analysis window may be defined in a region comprising a plurality of background pixels and not including pixels corresponding to the digital representation of the document. Moreover, the large analysis window may be defined in the corner of the digital image, in some approaches.
[00109] According to one embodiment, a plurality of small analysis windows may be defined within the digital image, such as within the large analysis window. The small analysis windows may overlap at least in part with one or more other small analysis windows, such as to be characterized by comprising one or more overlap regions. In a preferred approach all possible small analysis windows are defined within the large analysis window. Of course, small analysis windows may be defined within any portion of the digital image, and preferably small analysis windows may be defined such that each small analysis window is characterized by a single center pixel.

[00110] In operation, according to one embodiment, one or more statistics are calculated for one or more small analysis windows (e.g. one or more small analysis windows within a large analysis window) and one or more distributions of corresponding statistics are estimated (e.g. a distribution of statistics estimated across a plurality of small analysis windows). In another embodiment, distributions of statistics may be estimated across one or more large analysis window(s) and optionally merged.
[00111] Moreover, values may be descriptive of any feature associated with the background of the digital image, such as background brightness values, background color channel values, background texture values, background tint values, background contrast values, background sharpness values, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. Moreover still, statistics may include a minimum, a maximum and/or a range of brightness values in one or more color channels of the plurality of pixels depicting the digital image background over the plurality of small windows within the large analysis window.
[00112] In operation, according to one embodiment, one or more distributions of background statistics are estimated. By estimating the distribution(s) of statistics, one may obtain descriptive distribution(s) that characterize the properties of the background of the digital image within, for example, a large analysis window.
[00113] The distribution(s) preferably correspond to the background statistics calculated for each small analysis window, and may include, for example, a distribution of brightness minima, a distribution of brightness maxima, etc., from which one may obtain distribution statistical descriptors such as the minimum and/or maximum of minimum brightness values, the minimum and/or maximum of maximum brightness values, minimum and/or maximum spread of brightness values, minimum and/or maximum of minimum color channel values, minimum and/or maximum of maximum color channel values, minimum and/or maximum spread of color channel values, etc. as would be appreciated by one having ordinary skill in the art upon reading the present descriptions. Of course, any of the calculated background statistics (e.g. for brightness values, color channel values, contrast values, texture values, tint values, sharpness values, etc.) may be assembled into a distribution and any value descriptive of the distribution may be employed without departing from the scope of the present disclosures.
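One way such per-window statistics and their distribution descriptors might be computed is sketched below. This is a hedged illustration assuming numpy and a single color channel; the function name, window sizes, and the synthetic example are hypothetical.

```python
import numpy as np

# Hedged sketch: compute min/max/spread for every 3x7 small window inside a
# large analysis window, then summarize each statistic's distribution.

def background_statistics(large_window, win_h=3, win_w=7):
    """large_window: 2-D array of brightness values for one color channel."""
    mins, maxs, spreads = [], [], []
    H, W = large_window.shape
    for y in range(H - win_h + 1):
        for x in range(W - win_w + 1):
            small = large_window[y:y + win_h, x:x + win_w]
            mins.append(small.min())
            maxs.append(small.max())
            spreads.append(small.max() - small.min())
    return {
        "min_of_minima": min(mins), "max_of_minima": max(mins),
        "min_of_maxima": min(maxs), "max_of_maxima": max(maxs),
        "min_spread": min(spreads), "max_spread": max(spreads),
    }

rng = np.random.default_rng(0)
corner = rng.integers(200, 240, size=(40, 40))   # bright, fairly uniform background
print(background_statistics(corner))
```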
[00114] In operation, according to one embodiment, a large analysis window is defined within the digital image.
[00115] Moreover, window shapes may be defined by positively setting the boundaries of the window as a portion of the digital image, or may be defined negatively, e.g. by applying a mask to the digital image and defining the regions of the digital image not masked as the analysis window. Moreover still, windows may be defined according to a pattern, especially in embodiments where windows are negatively defined by applying a mask to the digital image. Of course, other manners for defining the windows may be employed without departing from the scope of the present disclosures.
[00116] In operation, according to one embodiment, one or more statistics are calculated for the analysis window. Moreover, in preferred embodiments each analysis window statistic corresponds to a distribution of background statistics estimated for the large analysis window. For example, in one embodiment maximum brightness corresponds to the distribution of background brightness maxima, minimum brightness corresponds to the distribution of background brightness minima, brightness spread corresponds to the distribution of background brightness spreads, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00117] In operation, according to one embodiment, it is determined whether a statistically significant difference exists between at least one analysis window statistic and the corresponding distribution of background statistics. As will be appreciated by one having ordinary skill in the art upon reading the present descriptions, determining whether a statistically significant difference exists may be performed using any known statistical significance evaluation method or metric, such as a p-value, a z-test, a chi-squared correlation, etc. as would be appreciated by a skilled artisan reading the present descriptions.
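As a simple illustration of such a test, the following sketch flags a test-window statistic as significantly different when it falls outside the observed range of the background distribution widened by a margin. This is an assumption for illustration only; a p-value, z-test, or chi-squared evaluation, as mentioned above, could be substituted, and the function name and margin are hypothetical.

```python
# Hedged sketch: compare one test-window statistic against the corresponding
# distribution of background statistics (here, its observed range plus a margin).

def differs_significantly(test_value, background_values, margin_frac=0.25):
    lo, hi = min(background_values), max(background_values)
    margin = margin_frac * (hi - lo if hi > lo else 1.0)
    return test_value < lo - margin or test_value > hi + margin

# Background brightness maxima clustered near 230; a test window maximum of 140
# (e.g. a dark document edge or shadow) is flagged as significantly different.
print(differs_significantly(140, [226, 228, 231, 235, 229]))  # True
print(differs_significantly(232, [226, 228, 231, 235, 229]))  # False
```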
[00118] In operation, according to one embodiment, one or more points (e.g. the centermost pixel or point) in the analysis window for which a statistically significant difference exists between a value describing the pixel and the corresponding distribution of background statistics is designated as a candidate edge point. The designating may be accomplished by any suitable method known in the art, such as setting a flag corresponding to the pixel, storing coordinates of the pixel, making an array of pixel coordinates, altering one or more values describing the pixel (such as brightness, hue, contrast, etc.), or any other suitable means.
[00119] According to one embodiment, one or more operations may be repeated one or more times. In a preferred embodiment, a plurality of such repetitions may be performed, wherein each repetition is performed on a different portion of the digital image. Preferably, the repetitions may be performed until each side of the digital representation of the document has been evaluated. In various approaches, defining the analysis windows may result in a plurality of analysis windows which share one or more borders, which overlap in whole or in part, and/or which do not share any common border and do not overlap, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00120] In a particularly preferred embodiment, the plurality of repetitions may be performed in a manner directed to reestimate local background statistics upon detecting a potentially non-background window (e.g. a window containing a candidate edge point or a window containing an artifact such as uneven illumination, background texture variation, etc.).
[00121] In operation, according to one embodiment, four sides of a tetragon are defined based on the plurality of candidate edge points. Preferably, the sides of the tetragon encompass the edges of a digital representation of a document in a digital image. Defining the sides of the tetragon may include, in some approaches, performing one or more least-mean-squares (LMS) approximations.
[00122] In more approaches, defining the sides of the tetragon may include identifying one or more outlier candidate edge points, and removing one or more outlier candidate edge points from the plurality of candidate edge points. Further, defining the sides of the tetragon may include performing at least one additional LMS approximation excluding the one or more outlier candidate edge points.
[00123] Further still, in one embodiment each side of the tetragon is characterized by an equation chosen from a class of functions, and performing the at least one LMS approximation comprises determining one or more coefficients for each equation, such as best coefficients of second degree polynomials in a preferred implementation. According to these approaches, defining the sides of the tetragon may include determining whether each side of the digital representation of the document falls within a given class of functions, such as second degree polynomials, or simpler functions such as linear functions instead of second degree polynomials.
[00124] In preferred approaches, performing the method may accurately define a tetragon around the four dominant sides of a document while ignoring one or more deviations from the dominant sides of the document, such as a rip and/or a tab.
[00125] Additional and/or alternative embodiments of the presently disclosed tetragon may be characterized by having four sides, and each side being characterized by one or more equations such as the polynomial functions discussed above. For example, embodiments where the sides of tetragon are characterized by more than one equation may involve dividing one or more sides into a plurality of segments, each segment being characterized by an equation such as the polynomial functions discussed above.
[00126] Defining the tetragon may, in various embodiments, alternatively and/or additionally include defining one or more corners of the tetragon. For example, tetragon corners may be defined by calculating one or more intersections between adjacent sides of the tetragon, and designating an appropriate intersection from the one or more calculated intersections in cases where multiple intersections are calculated. In still more embodiments, defining the corners may include solving one or more equations, wherein each equation is characterized by belonging to a chosen class of functions such as nth degree polynomials, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00127] In various embodiments, a corner of the tetragon may be defined by one or more of: an intersection of two curved adjacent sides of the tetragon; an intersection of two substantially straight lines; and an intersection of one substantially straight line and one substantially curved line.
[00128] In operation, according to one embodiment, the digital representation of the document and the tetragon are output to a display of a mobile device. Outputting may be performed in any manner, and may depend upon the configuration of the mobile device hardware and/or software.
[00129] Moreover, outputting may be performed in various approaches so as to facilitate further processing and/or user interaction with the output. For example, in one embodiment the tetragon may be displayed in a manner designed to distinguish the tetragon from other features of the digital image, for example by displaying the tetragon sides in a particular color, pattern, illumination motif, as an animation, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00130] Further still, in some embodiments outputting the tetragon and the digital representation of the document may facilitate a user manually adjusting and/or defining the tetragon in any suitable manner. For example, a user may interact with the display of the mobile device to translate the tetragon, i.e. to move the location of the tetragon in one or more directions while maintaining the aspect ratio, shape, edge lengths, area, etc. of the tetragon. Additionally and/or alternatively, a user may interact with the display of the mobile device to manually define or adjust locations of tetragon corners, e.g. tapping on a tetragon corner and dragging the corner to a desired location within the digital image, such as a corner of the digital representation of the document.
[00131] One particular example of an ideal result of page detection shows the digital representation of the document within the digital image, with a tetragon that encompasses the edges of the digital representation of the document.
[00132] In some approaches page detection such as described above may include one or more additional and/or alternative operations, such as will be described below.
[00133] In one approach, the method may further include capturing one or more of the image data containing the digital representation of the document and audio data relating to the digital representation of the document. Capturing may be performed using one or more capture components coupled to the mobile device, such as a microphone, a camera, an accelerometer, a sensor, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00134] In another approach, the method may include defining a new large analysis window and reestimating the distribution of background statistics for the new large analysis window upon determining that the statistically significant difference exists, i.e. essentially repeating one or more of the foregoing operations in a different region of the digital image near a point where a potentially non-background point has been identified, such as near one of the edges of the document.
[00135] In several exemplary embodiments, a large analysis window may be positioned near or at the leftmost non-background pixel in a row, positioned near or at the rightmost non-background pixel in a row, positioned near or at the topmost non-background pixel in a column, or positioned near or at the bottommost non-background pixel in a column.
[00136] Approaches involving such reestimation may further include determining whether the statistically significant difference exists between at least one small analysis window (e.g. a test window) statistic and the corresponding reestimated distribution of large analysis window statistics. In this manner, it is possible to obtain a higher-confidence determination of whether the statistically significant difference exists, and therefore better distinguish true transitions from the digital image background to the digital representation of the document from, for example, variations in texture, illumination anomalies, and/or other artifacts within the digital image.
[00137] Moreover, performing reestimation as described above may facilitate the method avoiding one or more artifacts such as variations in illumination and/or background texture, etc. in the digital image, the artifacts not corresponding to a true transition from the digital image background to the digital representation of the document. In some approaches, avoiding artifacts may take the form of bypassing one or more regions (e.g. regions characterized by textures, variations, etc. that distinguish the region from the true background) of the digital image.
[00138] In some approaches, one or more regions may be bypassed upon determining a statistically significant difference exists between a statistical distribution estimated for the large analysis window and a corresponding statistic calculated for the small analysis window, defining a new large analysis window near the small analysis window, reestimating the distribution of statistics for the new large analysis window, and determining that the statistically significant difference does not exist between the reestimated statistical distribution and the corresponding statistic calculated for the small analysis window.
[00139] In other approaches, bypassing may be accomplished by checking another analysis window further along the path and resuming the search for a transition to non-background upon determining that the statistics of this checked window do not differ significantly from the known statistical properties of the background, e.g. as indicated by a test of statistical significance.
[00140] As will be appreciated by the skilled artisan upon reading the present disclosures, bypassing may be accomplished by checking another analysis window further along the path.
[00141] In still further approaches, page detection may additionally and/or alternatively include determining whether the tetragon satisfies one or more quality control metrics, and rejecting the tetragon upon determining the tetragon does not satisfy one or more of the quality control metrics. Moreover, quality control metrics may include measures such as an LMS support metric, a minimum tetragon area metric, a tetragon corner location metric, and a tetragon diagonal intersection location metric.
[00142] In practice, determining whether the tetragon satisfies one or more of these metrics acts as a check on the performance of the method. For example, checks may include determining whether the tetragon covers at least a threshold of the overall digital image area, e.g. whether the tetragon comprises at least 25% of the total image area. Furthermore, checks may include determining whether tetragon diagonals intersect inside the boundaries of the tetragon, determining whether one or more of the LMS approximations were calculated from sufficient data to have robust confidence in the statistics derived therefrom, i.e. whether the LMS approximation has sufficient "support" (such as an approximation calculated from at least five data points, or at least a quarter of the total number of data points, in various approaches), and/or determining whether tetragon corner locations (as defined by equations characterizing each respective side of the tetragon) exist within a threshold distance of the edge of the digital image, e.g. whether tetragon corners are located more than 100 pixels away from an edge of the digital image in a given direction. Of course, other quality metrics and/or checks may be employed without departing from the scope of these disclosures, as would be appreciated by one having ordinary skill in the art upon reading the present descriptions.
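Three of the checks enumerated above (minimum area, corner locations near the frame, and diagonals intersecting inside the tetragon) may be sketched as follows. The helper names, the corner ordering convention, and the example values are illustrative assumptions, not the patented code; the LMS support check is omitted for brevity.

```python
# Hedged sketch of tetragon quality-control checks.

def tetragon_passes_checks(corners, img_w, img_h, min_area_frac=0.25, max_outside=100):
    """corners: (top_left, top_right, bottom_right, bottom_left) as (x, y) tuples."""
    tl, tr, br, bl = corners
    # 1. Corner locations: not more than max_outside pixels beyond the frame.
    for x, y in corners:
        if x < -max_outside or x > img_w + max_outside or y < -max_outside or y > img_h + max_outside:
            return False
    # 2. Minimum area via the shoelace formula, compared with the image area.
    pts = [tl, tr, br, bl]
    area = 0.5 * abs(sum(pts[i][0] * pts[(i + 1) % 4][1] - pts[(i + 1) % 4][0] * pts[i][1]
                         for i in range(4)))
    if area < min_area_frac * img_w * img_h:
        return False
    # 3. Diagonals (tl-br and tr-bl) must cross, i.e. intersect inside the tetragon.
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def segments_cross(p1, p2, p3, p4):
        return (cross(p1, p2, p3) * cross(p1, p2, p4) < 0 and
                cross(p3, p4, p1) * cross(p3, p4, p2) < 0)
    return segments_cross(tl, br, tr, bl)

print(tetragon_passes_checks(((50, 40), (900, 60), (880, 700), (60, 680)), 1000, 750))
```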
[00143] In one approach, quality metrics and/or checks may facilitate rejecting suboptimal tetragon definitions, and further facilitate improving the definition of the tetragon sides. For example, one approach involves receiving an indication that defining the four sides of the tetragon based on the plurality of candidate edge points failed to define a valid tetragon, i.e. failed to satisfy one or more of the quality control metrics, and redefining the plurality of candidate edge points. Notably, in this embodiment redefining the plurality of candidate edge points includes sampling a greater number of points within the digital image than the number of points sampled in the prior, failed attempt. This may be accomplished, in one approach, by reducing the step over one or more of the rows or columns of the digital image and repeating all the steps of the algorithm in order to analyze a larger number of candidate edge points. The step may be decreased in a vertical direction, a horizontal direction, or both. Of course, other methods of redefining the candidate edge points and/or resampling points within the digital image may be utilized without departing from the scope of the present disclosures.
[00144] Further still, page detection may include designating the entire digital image as the digital representation of the document, particularly where multiple repetitions of the method failed to define a valid tetragon, even with a significantly reduced step in progression through the digital image analysis. In one approach, designating the entire digital image as the digital representation of the document may include defining image corners as document corners, defining image sides as document sides, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00145] As described herein, the diagonals of the tetragon may be characterized by a first line connecting a calculated top left corner of the tetragon to a calculated bottom right corner of the tetragon, and a second line connecting a calculated top right corner of the tetragon and a calculated bottom left corner of the tetragon. Moreover, the first line and the second line preferably intersect inside the tetragon.
[00146] In various approaches, one or more of the foregoing operations may be performed using a processor, and the processor may be part of a mobile device, particularly a mobile device having an integrated camera.
[00147] Rectangularization
[00148] The present descriptions relate to rectangularizing a digital representation of a document in a digital image, various approaches to which will be described in detail below.
[00149] In one embodiment, the goal of a rectangularization algorithm is to smoothly transform a tetragon (such as the tetragon defined above in the page detection method) into a rectangle. Notably, the tetragon is characterized by a plurality of equations, each equation corresponding to a side of the tetragon and being selected from a chosen class of functions. For example, each side of the tetragon may be characterized by a first degree polynomial, second degree polynomial, third degree polynomial, etc. as would be appreciated by the skilled artisan upon reading the present descriptions.
[00150] In one approach, sides of the tetragon may be described by equations, and in a preferred embodiment a left side of the tetragon is characterized by a second degree polynomial equation: x = a2 * y^2 + a1 * y + a0; a right side of the tetragon is characterized by a second degree polynomial equation: x = b2 * y^2 + b1 * y + b0; a top side of the tetragon is characterized by a second degree polynomial equation: y = c2 * x^2 + c1 * x + c0; and a bottom side of the tetragon is characterized by a second degree polynomial equation: y = d2 * x^2 + d1 * x + d0.
[00151] The description of the page rectangularization algorithm presented below utilizes the definition of a plurality of tetragon-based intrinsic coordinate pairs (p, q) within the tetragon, each intrinsic coordinate pair (p, q) corresponding to an intersection of a top-to-bottom curve characterized by an equation obtained from the equations of its left and right sides by combining all corresponding coefficients in a top-to-bottom curve coefficient ratio of p to 1 - p, and a left-to-right curve characterized by an equation obtained from the equations of its top and bottom sides by combining all corresponding coefficients in a left-to-right curve coefficient ratio of q to 1 - q, wherein 0 ≤ p ≤ 1, and wherein 0 ≤ q ≤ 1.
[00152] In a preferred embodiment where the sides of the tetragon are characterized by second degree polynomial equations, the top-to-bottom curve corresponding to the intrinsic coordinate p will be characterized by the equation: x = ((1 - p) * a2 + p * b2) * y^2 + ((1 - p) * a1 + p * b1) * y + ((1 - p) * a0 + p * b0), and the left-to-right curve corresponding to the intrinsic coordinate q will be characterized by the equation: y = ((1 - q) * c2 + q * d2) * x^2 + ((1 - q) * c1 + q * d1) * x + ((1 - q) * c0 + q * d0). Of course, other equations may characterize any of the sides and/or curves described above, as would be appreciated by one having ordinary skill in the art upon reading the present descriptions.
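The coefficient blending just described can be stated compactly in code. The sketch below is an illustrative assumption (the function names and example coefficients are hypothetical): the top-to-bottom curve x(y) blends the left (a) and right (b) side coefficients in ratio p : 1 - p, and the left-to-right curve y(x) blends the top (c) and bottom (d) side coefficients in ratio q : 1 - q.

```python
# Hedged sketch of the intrinsic-coordinate curves given above.

def top_to_bottom_curve(a, b, p):
    """a, b: (k2, k1, k0) for left and right sides x = k2*y^2 + k1*y + k0."""
    return tuple((1 - p) * ai + p * bi for ai, bi in zip(a, b))

def left_to_right_curve(c, d, q):
    """c, d: (k2, k1, k0) for top and bottom sides y = k2*x^2 + k1*x + k0."""
    return tuple((1 - q) * ci + q * di for ci, di in zip(c, d))

def eval_poly(coeffs, t):
    k2, k1, k0 = coeffs
    return k2 * t * t + k1 * t + k0

left, right = (1e-5, 0.02, 40.0), (2e-5, -0.01, 950.0)
top, bottom = (-1e-5, 0.005, 60.0), (1e-5, 0.0, 1180.0)
u = top_to_bottom_curve(left, right, p=0.5)   # curve midway between left and right sides
v = left_to_right_curve(top, bottom, q=0.5)   # curve midway between top and bottom sides
print(u, v, eval_poly(u, 600.0))
```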
[00153] For a rectangle, which is a particular case of a tetragon, the intrinsic coordinates become especially simple: within the rectangle, each intrinsic coordinate pair (p, q) corresponds to an intersection of a line parallel to each of a left side of the rectangle and a right side of the rectangle, e.g. a line splitting both top and bottom sides in the proportion of p to 1 - p, and a line parallel to each of a top side of the rectangle and a bottom side of the rectangle, e.g. a line splitting both left and right sides in the proportion of q to 1 - q, wherein 0 ≤ p ≤ 1, and wherein 0 ≤ q ≤ 1.
[00154] The goal of the rectangularization algorithm described below is to match each point in the rectangularized image to a corresponding point in the original image, and to do so in such a way as to transform each of the four sides of the tetragon into a substantially straight line, while opposite sides of the tetragon become parallel to each other and orthogonal to the other pair of sides; i.e. top and bottom sides of the tetragon become parallel to each other, and left and right sides of the tetragon become parallel to each other and orthogonal to the new top and bottom. Thus, the tetragon is transformed into a true rectangle characterized by four corners, each corner comprising two straight lines intersecting to form a ninety-degree angle.
[00155] The main idea of the rectangularization algorithm described below is to achieve this goal by, first, calculating rectangle-based intrinsic coordinates (p, q) for each point in the rectangularized destination image; second, matching these to the same pair (p, q) of tetragon-based intrinsic coordinates in the original image; third, calculating the coordinates of the intersection of the left-to-right and top-to-bottom curves corresponding to these intrinsic coordinates; and finally, assigning the color or gray value at the found point in the original image to the point in the destination image.
[00156] In a graphical representation of a first iteration of a page rectangularization algorithm, according to one embodiment, each point in a digital image may correspond to an intersection of a top-to-bottom curve and a left-to-right curve (a curve may include a straight line, a curved line, e.g. a parabola, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions) corresponding to intrinsic coordinates (such as described above) associated with the point.
[00157] As will become apparent from the present descriptions, rectangularization may involve defining a plurality of such left-to-right lines and top-to-bottom lines.
[00158] Moreover, rectangularization may include matching target rectangle-based coordinates to intrinsic tetragon-based coordinates of the digital representation of the document. This matching may include iteratively searching for an intersection of a given left-to-right curve and a given top-to-bottom curve; the first iteration of an exemplary iterative search within the scope of the present disclosures is described below.
[00159] The iterative search, according to one approach discussed in further detail below, includes designating a starting point having coordinates (x0, y0). The starting point may be located anywhere within the digital representation of the document, but preferably is located at or near the center of the target rectangle.
[00160] The iterative search may include projecting the starting point onto one of the two intersecting curves. While the starting point may be projected onto either of the curves, in one approach the first half of a first iteration in the iterative search includes projecting the starting point onto the top-to-bottom curve to obtain the x-coordinate (x1) of the next point, the projection result being a point with coordinates (x1, y0). Similarly, in some embodiments the second half of a first iteration in the iterative search includes projecting that point onto the left-to-right curve to obtain the y-coordinate (y1) of the next point, the projection result being a point with coordinates (x1, y1).

[00161] Rectangularization involves transforming the tetragon defined in page detection into a true rectangle. The result of this process may be depicted as a graphical representation of the output after performing a page rectangularization algorithm, according to one embodiment.
[00162] Further iterations may utilize a similar approach, such as described in further detail below, in some embodiments.
[00163] A method for modifying one or more spatial characteristics of a digital representation of a document in a digital image may include any of the techniques described herein. As will be appreciated by one having ordinary skill in the art upon reading the present descriptions, the method may be performed in any suitable environment, including those shown and/or described in the figures and corresponding descriptions of the present disclosures.
[00164] In one embodiment, a tetragon (such as defined above in the page detection method) is transformed into a rectangle. Notably, the tetragon is characterized by a plurality of equations, each equation corresponding to a side of the tetragon and being selected from a chosen class of functions. For example, each side of the tetragon may be characterized by a first degree polynomial, second degree polynomial, third degree polynomial, etc. as would be appreciated by the skilled artisan upon reading the present descriptions.
[00165] In one embodiment, sides of the tetragon may be described by equations, and in a preferred embodiment a left side of the tetragon is characterized by a second degree polynomial equation: x = a2 * y^2 + a1 * y + a0; a right side of the tetragon is characterized by a second degree polynomial equation: x = b2 * y^2 + b1 * y + b0; a top side of the tetragon is characterized by a second degree polynomial equation: y = c2 * x^2 + c1 * x + c0; and a bottom side of the tetragon is characterized by a second degree polynomial equation: y = d2 * x^2 + d1 * x + d0. Moreover, the top-to-bottom curve equation is: x = ((1 - p) * a2 + p * b2) * y^2 + ((1 - p) * a1 + p * b1) * y + ((1 - p) * a0 + p * b0), and the left-to-right curve equation is: y = ((1 - q) * c2 + q * d2) * x^2 + ((1 - q) * c1 + q * d1) * x + ((1 - q) * c0 + q * d0). Of course, other equations may characterize any of the sides and/or curves described above, as would be appreciated by one having ordinary skill in the art upon reading the present descriptions.
[00166] In one embodiment, the curves may be described by exemplary polynomial functions fitting one or more of the following general forms:

x1 = u2 * y0^2 + u1 * y0 + u0,
y1 = v2 * x1^2 + v1 * x1 + v0,

where ui = (1 - p) * ai + p * bi, and vi = (1 - q) * ci + q * di, and where ai are the coefficients in the equation of the left side of the tetragon, bi are the coefficients in the equation of the right side of the tetragon, ci are the coefficients in the equation of the top side of the tetragon, di are the coefficients in the equation of the bottom side of the tetragon, and p and q are the tetragon-based intrinsic coordinates corresponding to the curves. In some approaches, the coefficients such as ai, bi, ci, di, etc. may be derived from calculations, estimations, and/or determinations achieved in the course of performing page detection, such as a page detection method as discussed above.
[00167] Of course, as would be understood by one having ordinary skill in the art, transforming the tetragon into a rectangle may include one or more additional operations, such as will be described in greater detail below.
[00168] In one embodiment, the method additionally and/or alternatively includes stretching one or more regions of the tetragon to achieve a more rectangular or truly rectangular shape.
Preferably, such stretching is performed in a manner sufficiently smooth to avoid introducing artifacts into the rectangle.
[00169] In some approaches, transforming the tetragon into a rectangle may include determining a height of the rectangle, a width of the rectangle, a skew angle of the rectangle, and/or a center position of the rectangle. For example, such transforming may include defining a width of the target rectangle as the average of the width of the top side and the width of the bottom side of the tetragon; defining a height of the target rectangle as the average of the height of the left side and the height of the right side of the tetragon; defining a center of the target rectangle depending on the desired placement of the rectangle in the image; and defining an angle of skew of the target rectangle, e.g. in response to a user request to deskew the digital representation of the document.
[00170] In some approaches, the transforming may additionally and/or alternatively include generating a rectangularized digital image from the original digital image, and determining a p-coordinate and a q-coordinate for a plurality of points within the rectangularized digital image (e.g. points both inside and outside of the target rectangle), wherein each point located to the left of the rectangle has a p-coordinate value p < 0, wherein each point located to the right of the rectangle has a p-coordinate value p > 1, wherein each point located above the rectangle has a q-coordinate value q < 0, and wherein each point located below the rectangle has a q-coordinate value q > 1.
[00171] In some approaches, the transforming may additionally and/or alternatively include generating a rectangularized digital image from the original digital image; determining a pair of rectangle-based intrinsic coordinates for each point within the rectangularized digital image; and matching each pair of rectangle-based intrinsic coordinates to an equivalent pair of tetragon-based intrinsic coordinates within the original digital image.
[00172] In preferred approaches, matching the rectangle-based intrinsic coordinates to the tetragon-based intrinsic coordinates may include performing an iterative search for an intersection of the top-to-bottom curve and the left-to-right curve. Moreover, the iterative search may itself include designating a starting point (x0, y0), for example the center of the target rectangle; projecting the starting point (x0, y0) onto the top-to-bottom curve: x1 = u2 * y0^2 + u1 * y0 + u0; and projecting the next point (x1, y0) onto the left-to-right curve: y1 = v2 * x1^2 + v1 * x1 + v0, where ui = (1 - p) * ai + p * bi, and where vi = (1 - q) * ci + q * di. Thereafter, the iterative search may include iteratively projecting (xk, yk) onto the top-to-bottom curve: x_{k+1} = u2 * yk^2 + u1 * yk + u0, and projecting (x_{k+1}, yk) onto the left-to-right curve: y_{k+1} = v2 * x_{k+1}^2 + v1 * x_{k+1} + v0.
[00173] In still more embodiments, matching the rectangle-based intrinsic coordinates to the tetragon-based intrinsic coordinates may include determining a distance between (xk, yk) and (x_{k+1}, y_{k+1}); determining whether the distance is less than a predetermined threshold; and terminating the iterative search upon determining that the distance is less than the predetermined threshold.
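The iterative projection described in the two preceding paragraphs may be sketched directly from the curve equations. The snippet below is an illustrative assumption; the convergence tolerance, iteration cap, example coefficients, and function name are hypothetical.

```python
import math

# Hedged sketch: starting from (x0, y0), alternately project onto the
# top-to-bottom curve x = u2*y^2 + u1*y + u0 and the left-to-right curve
# y = v2*x^2 + v1*x + v0 until successive points move less than a threshold.

def find_curve_intersection(u, v, x0, y0, tol=0.05, max_iter=100):
    u2, u1, u0 = u
    v2, v1, v0 = v
    xk, yk = x0, y0
    for _ in range(max_iter):
        xk1 = u2 * yk * yk + u1 * yk + u0        # project onto top-to-bottom curve
        yk1 = v2 * xk1 * xk1 + v1 * xk1 + v0     # project onto left-to-right curve
        if math.hypot(xk1 - xk, yk1 - yk) < tol: # converged to sub-pixel precision
            return xk1, yk1
        xk, yk = xk1, yk1
    return xk, yk  # best estimate after max_iter iterations

u = (1e-5, 0.02, 400.0)    # gently curved top-to-bottom curve
v = (-1e-5, 0.005, 300.0)  # gently curved left-to-right curve
print(find_curve_intersection(u, v, x0=500.0, y0=400.0))
```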
[00174] Various Embodiments of Skew Angle Detection and Correction
[00175] In some embodiments, the image processing algorithm disclosed herein may additionally and/or alternatively include functionality designed to detect and/or correct a skew angle of a digital representation of a document in a digital image. One preferred approach to correcting skew is described below. Of course, other methods of correcting skew within a digital image are within the scope of these disclosures, as would be appreciated by one having ordinary skill in the art upon reading the present descriptions.
[00176] A digital representation of a document in a digital image may be characterized by one or more skew angles α. As will be appreciated by the skilled artisan reading these descriptions, a horizontal skew angle α represents an angle between a horizontal line and an edge of the digital representation of the document, the edge having its longitudinal axis in a substantially horizontal direction (i.e. either the top or bottom edge of the digital representation of the document). Similarly, α may represent an angle between a vertical line and an edge of the digital representation of the document, the edge having its longitudinal axis in a substantially vertical direction (i.e. either the left edge or right edge of the digital representation of the document).
[00177] Moreover, the digital representation of the document may be defined by a top edge, a bottom edge, a right edge and a left edge. Each of these edges may be characterized by a substantially linear equation, such that for the top edge: y = -tan(α)x + dt; for the bottom edge: y = -tan(α)x + db; for the right edge: x = tan(α)y + dr; and for the left edge: x = tan(α)y + dl, where dt and db are the y-intercepts of the linear equations describing the top and bottom edges of the digital representation of the document, respectively, and where dr and dl are the x-intercepts of the linear equations describing the right and left edges of the digital representation of the document, respectively.
[00178] In one approach, having defined the linear equations describing each side of the digital representation of the document, for example a rectangular document, a skew angle thereof may be corrected by setting α = 0, such that for the top edge: y = dt; for the bottom edge: y = db; for the right edge: x = dr; and for the left edge: x = dl.
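The effect of setting α = 0 can be seen by computing a document corner from the edge equations above. The sketch below is an illustrative assumption (function name, parameter names, and example intercepts are hypothetical): a corner is the intersection of a horizontal-ish edge y = -tan(α)x + d_h and a vertical-ish edge x = tan(α)y + d_v, and with α = 0 the corner reduces to the axis-aligned point (d_v, d_h).

```python
import math

# Hedged sketch: intersect a horizontal-ish and a vertical-ish skewed edge.

def corner_from_edges(alpha_deg, d_h, d_v):
    t = math.tan(math.radians(alpha_deg))
    # Solve y = -t*x + d_h and x = t*y + d_v simultaneously.
    y = (d_h - t * d_v) / (1.0 + t * t)
    x = t * y + d_v
    return x, y

print(corner_from_edges(4.0, d_h=120.0, d_v=85.0))  # skewed top-left corner
print(corner_from_edges(0.0, d_h=120.0, d_v=85.0))  # deskewed corner: (85.0, 120.0)
```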
[00179] Various Embodiments of Detecting Illumination Problems
[00180] In still more embodiments, the presently described image processing algorithm may include features directed to detecting whether a digital representation of a document comprises one or more illumination problems.
[00181] For example, illumination problems may include locally under-saturated regions of a digital image, when brightness values vary greatly from pixel to pixel within image backgrounds, such as is characteristic of images captured in settings with insufficient ambient and/or provided illumination, and locally over-saturated regions of a digital image, when some areas within the image are washed out, such as within the reflection of the flash.
[00182] One exemplary approach to detecting illumination problems in a digital image including a digital representation of a document is described below, according to one embodiment, including a method for determining whether illumination problems exist in a digital representation of a document. As will be appreciated by one having ordinary skill in the art upon reading the present descriptions, the method may be performed in any suitable environment, such as those described herein and represented in the various Figures submitted herewith. Of course, other environments may also be suitable for operating the method within the scope of the present disclosures, as would be appreciated by the skilled artisan reading the instant specification.
[00183] In one embodiment, the processes include (preferably using a mobile device processor) dividing a tetragon including a digital representation of a document into a plurality of sections, each section comprising a plurality of pixels.
[00184] In more approaches, a distribution of brightness values of each section is determined. As will be understood by one having ordinary skill in the art, the distribution of brightness values may be compiled and/or assembled in any known manner, and may be fit to any known standard distribution model, such as a Gaussian distribution, a bimodal distribution, a skewed distribution, etc.
[00185] In still more approaches, a brightness value range of each section is determined. As will be appreciated by one having ordinary skill in the art, a range is defined as a difference between a maximum value and a minimum value in a given distribution. Here the brightness value range would be defined as the difference between the characteristic maximum brightness value in a given section and the characteristic minimum brightness value in the same section. For example, these characteristic values may correspond to the 2nd and 98th percentiles of the whole distribution, respectively.
[00186] In many approaches, a variability of brightness values of each section is determined.
[00187] In various approaches, it is determined whether each section is oversaturated. For example, this operation may include determining that a region of a digital image depicting a digital representation of a document is oversaturated, according to one embodiment. Determining whether each section is oversaturated may include determining a section oversaturation ratio for each section. Notably, in preferred embodiments each section oversaturation ratio is defined as the number of pixels exhibiting a maximum brightness value in the section divided by the total number of pixels in the section.
[00188] An unevenly illuminated image may depict or be characterized by a plurality of dark spots that may be more dense in areas where the brightness level of a corresponding pixel, point or region of the digital image is lower than that of other regions of the image or document, and/or lower than an average brightness level of the image or document. In some embodiments, uneven illumination may be characterized by a brightness gradient, such as a gradient proceeding from a top right corner of the image to a lower left corner of the image, such that brightness decreases along the gradient, with a relatively bright area in the top right corner of the image and a relatively dark area in the lower left corner of the image.
[00189] In some approaches, determining whether each section is oversaturated may further include determining, for each section, whether the oversaturation level of the section is greater than a predetermined threshold, such as 10%; and characterizing the section as oversaturated upon determining that the oversaturation level of the section is greater than the predetermined threshold. While the presently described embodiment employs a threshold value of 10%, other predetermined threshold oversaturation levels may be employed without departing from the scope of the present descriptions. Notably, the exact value is a matter of visual perception and expert judgment, and may be adjusted and/or set by a user in various approaches.
[00190] In more approaches, it is determined whether each section is undersaturated. For example, operation may include determining that a region of a digital image depicting a digital representation of a document is undersaturated, according to one embodiment. Determining whether each section is undersaturated may include additional operations such as determining a median variability of the distribution of brightness values of each section; determining whether each median variability is greater than a predetermined variability threshold, such as a median brightness variability of 18 out of a 0-255 integer value range; and determining, for each section, that the section is undersaturated upon determining that the median variability of the section is greater than the predetermined variability threshold. Notably, the exact value is a matter of visual perception and expert judgment, and may be adjusted and/or set by a user in various approaches.
[00191] In one particular approach, determining the variability of the section may include determining a brightness value of a target pixel in the plurality of pixels; calculating a difference between the brightness value of the target pixel and a brightness value for one or more neighboring pixels, each neighboring pixel being one or more (for example, 2) pixels away from the target pixel; repeating the determining and the calculating for each pixel in the plurality of pixels to obtain each target pixel variability; and generating a distribution of target pixel variability values, wherein each target pixel brightness value and target pixel variability value is an integer in a range from 0 to 255. This approach may be implemented, for example, by incrementing a corresponding counter in an array of all possible variability values in a range from 0 to 255, e.g. to generate a histogram of variability values.
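The target-pixel variability distribution described above may be accumulated as a 256-bin counter array. The sketch below is one possible realization (assuming an 8-bit grayscale section and neighbors two pixels away along the horizontal and vertical directions; the offset and array layout are illustrative assumptions):

```python
import numpy as np

def variability_histogram(gray, offset=2):
    """256-bin histogram of brightness differences to neighbors `offset` pixels away."""
    gray = np.asarray(gray, dtype=np.int16)
    counts = np.zeros(256, dtype=np.int64)
    # differences to horizontal neighbors `offset` pixels to the right
    dh = np.abs(gray[:, offset:] - gray[:, :-offset]).ravel()
    # differences to vertical neighbors `offset` pixels below
    dv = np.abs(gray[offset:, :] - gray[:-offset, :]).ravel()
    for diffs in (dh, dv):
        counts += np.bincount(diffs, minlength=256)[:256]
    return counts
```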
[00192] Notably, when utilizing neighboring pixels in determining the variability of a particular section, the neighboring pixels may be within about two pixels of the target pixel along either a vertical direction, a horizontal direction, or both (e.g. a diagonal direction). Of course, other pixel proximity limits may be employed without departing from the scope of the present invention.
[00193] In some approaches, method may further include removing one or more target pixel variability values from the distribution of target pixel variability values to generate a corrected distribution; and defining a characteristic background variability based on the corrected distribution. For example, in one embodiment generating a corrected distribution and defining the characteristic background variability may include removing the top 35% of total counted values (or any other value sufficient to cover significant brightness changes associated with transitions from the background to the foreground) and defining the characteristic background variability based on the remaining values of the distribution, i.e. values taken from a relatively flat background region of the digital representation of the document.
[00194] In more approaches, a number of oversaturated sections is determined. This operation may include any manner of determining a total number of oversaturated sections, e.g. by incrementing a counter during processing of the image, by setting a flag for each oversaturated section and counting flags at some point during processing, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00195] In more approaches, a number of undersaturated sections is determined. This operation may include any manner of determining a total number of undersaturated sections, e.g. by incrementing a counter during processing of the image, by setting a flag for each
undersaturated section and counting flags at some point during processing, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00196] In more approaches, it is determined that the digital image is oversaturated upon determining that a ratio of the number of oversaturated sections to the total number of sections exceeds an oversaturation threshold, which may be defined by a user, may be a predetermined value, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00197] In more approaches, it is determined that the digital image is undersaturated upon determining that a ratio of the number of undersaturated sections to the total number of sections exceeds an undersaturation threshold, which may be defined by a user, may be a predetermined value, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00198] In more approaches, it is determined that the illumination problem exists in the digital image upon determining that the digital image is either undersaturated or oversaturated.
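Taken together, the per-section determinations above feed a simple image-level decision. A minimal sketch follows; the per-section labels and the 25% thresholds are placeholders, since the document leaves the exact thresholds user-defined or predetermined:

```python
def illumination_problem_exists(section_flags, oversat_threshold=0.25, undersat_threshold=0.25):
    """Return True if the image as a whole is over- or under-saturated.

    section_flags: one label per section, each 'over', 'under', or 'ok'.
    """
    total = len(section_flags)
    over = sum(1 for f in section_flags if f == 'over')
    under = sum(1 for f in section_flags if f == 'under')
    return (over / total > oversat_threshold) or (under / total > undersat_threshold)
```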
[00199] In still more approaches, method may include one or more additional and/or alternative operations, such as will be described in detail below.
[00200] In one embodiment, method may include performing the following operations for each section: defining a section height by dividing the height of the document into a predefined number of horizontal sections; and defining a section width by dividing the width of the document into a predetermined number of vertical sections. In a preferred approach, the section height and width are determined based on the goal of creating a certain number of sections and making these sections approximately square, by dividing the height of the document into a certain number of horizontal parts and by dividing the width of the document into a certain (possibly different) number of vertical parts.
[00201] Thus, in some embodiments each section is characterized by a section height and width, where the digital image is characterized by an image width w and an image height h, where h >= w, where the section size is characterized by a section width ws and a section height hs, where ws = w/m and hs = h/n, and where m and n are defined so that ws is approximately equal to hs. For example, in a preferred embodiment, m >= 3 and n >= 4.
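One way to choose m and n so that the sections come out approximately square is sketched below; the search range is an assumption made only for illustration, and any procedure satisfying m >= 3, n >= 4 and ws approximately equal to hs would do:

```python
def section_grid(w, h, m_min=3, n_min=4, m_max=12):
    """Pick m columns and n rows so that w/m is approximately equal to h/n (h >= w)."""
    best = None
    for m in range(m_min, m_max + 1):
        n = max(n_min, round(h * m / w))  # make h/n as close as possible to w/m
        score = abs(w / m - h / n)
        if best is None or score < best[0]:
            best = (score, m, n)
    return best[1], best[2]
```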
[00202] In another approach, a method for determining whether illumination problems exist in a digital representation of a document includes the following operations, some or all of which may be performed in any environment described herein and/or represented in the presently disclosed figures.
[00203] Various Embodiments of Correcting Uneven Illumination
[00204] In some approaches, correcting unevenness of illumination in a digital image includes normalizing an overall brightness level of the digital image. Normalizing overall brightness may transform a digital image characterized by a brightness gradient, such as discussed above, into a digital image characterized by a relatively flat, even distribution of brightness across the digital image. Note that in the unevenly illuminated image one region is characterized by a significantly more dense distribution of dark spots than another region, while in the normalized image the corresponding regions are characterized by substantially similar dark spot density profiles.
[00205] In accordance with the present disclosures, unevenness of illumination may be corrected. In particular, a method for correcting uneven illumination in one or more regions of the digital image is provided herein for use in any suitable environment, including those described herein and represented in the various figures, among other suitable environments as would be known by one having ordinary skill in the art upon reading the present descriptions.
[00206] In one embodiment, method includes an operation where, using a processor, a two-dimensional illumination model is derived from the digital image.
[00207] In one embodiment, the two-dimensional illumination model is applied to each pixel in the digital image.
[00208] In more approaches, the digital image may be divided into a plurality of sections, and some or all of the pixels within a section may be clustered based on color, e.g. brightness values in one or more color channels, median hue values, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. Moreover, several most numerous clusters may be analyzed to determine characteristics of one or more possible local backgrounds. In order to designate a cluster as a local background of the section, the number of pixels belonging to this cluster has to exceed a certain predefined threshold, such as a threshold percentage of the total section area.
[00209] In various approaches, clustering may be performed using any known method, including Markov-chain Monte Carlo methods, nearest neighbor joining, distribution-based clustering such as expectation-maximization, density-based clustering such as density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS), etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00210] In one embodiment, method may include determining, for each distribution of color channel values within background clusters, one or more of an average color of the primary background of the corresponding section and an average color of the secondary background of the corresponding section, if one or both exist in the section.
[00211] In one embodiment, method includes designating, for each section, either the primary background color or the secondary background color as a local representation of a main background of the digital representation of the document, each local representation being characterized by either the average color of the primary background of the corresponding section or the average color of the secondary background of the corresponding section.
[00212] In one embodiment, method includes fitting a plurality of average color channel values of chosen local representations of the image background to a two-dimensional illumination model. In some approaches, the two-dimensional illumination model is a second-degree polynomial characterized by the equation: v = ax² + bxy + cy² + dx + ey + f; where v is an average color channel value for one of the plurality of color channels; a, b, c, d, e, and f are each unknown parameters of the two-dimensional illumination model, each unknown parameter a, b, c, d, e, and f being approximated using a least-mean-squares approximation; x is the x-coordinate of the mid-point pixel in the section; and y is the y-coordinate of the mid-point pixel in the section.
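The least-mean-squares fit of this second-degree polynomial can be expressed as a linear system over the six unknown parameters. The sketch below (Python/NumPy; variable names are illustrative) fits the model to the section mid-point coordinates and the average channel values of the chosen local background representations:

```python
import numpy as np

def fit_illumination_model(xs, ys, vs):
    """Least-squares fit of v = a*x^2 + b*x*y + c*y^2 + d*x + e*y + f.

    xs, ys: mid-point pixel coordinates of the sections.
    vs: average color channel values of the chosen local background representations.
    Returns the parameter vector (a, b, c, d, e, f).
    """
    xs, ys, vs = (np.asarray(t, dtype=float) for t in (xs, ys, vs))
    A = np.column_stack([xs * xs, xs * ys, ys * ys, xs, ys, np.ones_like(xs)])
    params, *_ = np.linalg.lstsq(A, vs, rcond=None)
    return params
```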
[00213] In one approach, derivation of the two-dimensional illumination model may include, for a plurality of background clusters: calculating an average color channel value of each background cluster, calculating a hue ratio of each background cluster, and calculating a median hue ratio for the plurality of background clusters. Moreover, the derivation may also include comparing the hue ratio of each background cluster to the median hue ratio of the plurality of clusters; selecting the more likely of the possible two backgrounds as the local representation of the document background based on the comparison; fitting at least one two-dimensional illumination model to the average channel values of the local representation; and calculating a plurality of average main background color channel values over a plurality of local representations.
[00214] The applying of the model may include calculating a difference between one or more predicted background channel values and the average main background color channel values; and adding a fraction of the difference to one or more color channel values for each pixel in the digital image. For example, adding the fraction may involve adding a fraction of the difference in a range from 0 to 1, for example ¾ of the difference in a preferred embodiment, to the actual pixel value.
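Applying the fitted model to one color channel might then look like the following sketch; the sign convention (average main background value minus predicted value, so that darker regions are brightened) and the ¾ fraction follow the example above, but remain assumptions of this illustration:

```python
import numpy as np

def apply_illumination_model(channel, params, main_background_value, fraction=0.75):
    """Add a fraction of (main background value - predicted background) to each pixel."""
    a, b, c, d, e, f = params
    h, w = channel.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    predicted = a * xs * xs + b * xs * ys + c * ys * ys + d * xs + e * ys + f
    corrected = channel.astype(float) + fraction * (main_background_value - predicted)
    return np.clip(corrected, 0, 255).astype(np.uint8)
```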
[00215] In still more approaches, method may include additional and/or alternative operations, such as those discussed immediately below.
[00216] For example, in one approach method further includes one or more of: determining, for each section, a plurality of color clusters; determining a plurality of numerous color clusters, each numerous color cluster corresponding to a high frequency of representation in the section (e.g. the color cluster is one of the clusters with the highest number of pixels in the section belonging to that color cluster); determining a total area of the section; determining a plurality of partial section areas, each partial section area corresponding to an area represented by one of the plurality of numerous color clusters; dividing each partial section area by the total area to obtain a cluster percentage area for each numerous color cluster (e.g. by dividing the number of pixels in the section belonging to numerous color clusters by the total number of pixels in the section to obtain a percentage of the total area of the section occupied by the corresponding most numerous color clusters); and classifying each numerous color cluster as either a background cluster or a non-background cluster based on the cluster percentage area.
[00217] Notably, in preferred approaches the classifying operation identifies either: no background in the section, a single most numerous background in the section, or two most numerous backgrounds in the section. Moreover, the classifying includes classifying each pixel belonging to a cluster containing a number of pixels greater than a background threshold as a background pixel. In some approaches, the background threshold is in a range from 0 to 100% (for example, 15% in a preferred approach). The background threshold may be defined by a user, may be a predetermined value, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00218] Various Embodiments of Resolution Estimation
[00219] As a further object of the presently disclosed inventive embodiments, mobile image processing may include a method for estimating resolution of a digital representation of a document. Of course, these methods may be performed in any suitable environment, including those described herein and represented in the various figures presented herewith. Moreover, method may be used in conjunction with any other method described herein, and may include additional and/or alternative operations to those described below, as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00220] In one embodiment, a plurality of connected components of a plurality of non-background elements are detected in the digital image. In some approaches, the digital image may be characterized as a bitonal image, i.e. an image containing only two tones, and preferably a black and white image.
[00221] In another embodiment, a plurality of likely characters is determined based on the plurality of connected components. Likely characters may be regions of a digital image characterized by a predetermined number of light-to-dark transitions in a given direction, such as three light-to-dark transitions in a vertical direction as would be encountered for a small region of the digital image depicting a capital letter "E," each light-to-dark transition corresponding to a transition from a background of a document (light) to one of the horizontal strokes of the letter "E." Of course, other numbers of light-to-dark transitions may be employed, such as two vertical and/or horizontal light-to-dark transitions for a letter "o," one vertical light-to-dark transition for a letter "l," etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00222] In still another embodiment, one or more average character dimensions are determined based on the plurality of likely text characters. As understood herein, the average character dimensions may include one or more of an average character width and an average character height, but of course other suitable character dimensions may be utilized, as would be recognized by a skilled artisan reading the present descriptions.
[00223] In still yet another embodiment, the resolution of the digital image is estimated based on the one or more average character dimensions.
[00224] In further embodiments, method may optionally and/or alternatively include one or more additional operations, such as described below.
[00225] For example, in one embodiment method may further include one or more of:
estimating one or more dimensions of the digital representation of the document based on the estimated resolution of the digital image; comparing the one or more estimated dimensions of the digital representation of the document to one or more known dimensions of a plurality of known document types; matching the digital representation of the document to one or more of the plurality of known document types based on the comparison; determining whether the match satisfies one or more quality control criteria; and adjusting the estimated resolution of the digital representation of the document based on the known dimensions of the known document type upon determining the match satisfies the one or more quality control criteria. In some approaches, the estimated resolution will only be adjusted if a good match between the digital representation of the document and one of the known document types has been found.
[00226] In some approaches, the one or more known document types include: a Letter size document (8.5 x 11 inch); a Legal size document (8.5 x 14 inch); an A3 document (11.69 x 16.54 inch); an A4 (European Letter size) document (8.27 x 11.69 inch); an A5 document (5.83 x 8.27 inch); a ledger/tabloid document (11 x 17 inch); a driver license (2.125 x 3.375 inch); a business card (2 x 3.5 inch); a personal check (2.75 x 6 inch); a business check (3 x 7.25 inch); a business check (3 x 8.25 inch); a business check (2.75 x 8.5 inch); a business check (3.5 x 8.5 inch); a business check (3.66 x 8.5 inch); a business check (4 x 8.5 inch); a 2.25-inch wide receipt; and a 3.125-inch wide receipt.
[00227] In still more approaches, method may further and/or optionally include computing, for one or more connected components, one or more of: a number of on-off transitions within the connected component (for example, transitions from a character to a document background, e.g. transitions from black-to-white, white-to-black, etc. as would be understood by the skilled artisan reading the present descriptions); a black pixel density within the connected component; an aspect ratio of the connected component; and a likelihood that one or more of the connected components represents a text character based on one or more of the black pixel density, the number of on-off transitions, and the aspect ratio.
[00228] In still more approaches, method may further and/or optionally include determining a character height of at least two of the plurality of text characters; calculating an average character height based on each character height of the at least two text characters; determining a character width of at least two of the plurality of text characters; calculating an average character width based on each character width of the at least two text characters; performing at least one comparison. Notably, the comparison may be selected from: comparing the average character height to a reference average character height; and comparing the average character width to a reference average character width.
[00229] In such approaches, method may further include estimating the resolution of the digital image based on the at least one comparison, where each of the reference average character height and the reference average character width correspond to one or more reference characters, each reference character being characterized by a known average character width and a known average character height.
[00230] In various embodiments, each reference character corresponds to a digital representation of a character obtained from scanning a representative sample of one or more business documents at some selected resolution, such as 300 DPI, and each reference character further corresponds to one or more common fonts, such as Arial, Times New Roman, Helvetica, Courier, Courier New, Tahoma, etc. as would be understood by the skilled artisan reading the present descriptions. Of course, representative samples of business documents may be scanned at other resolutions, so long as the resulting image resolution is suitable for recognizing characters on the document. In some approaches, the resolution must be sufficient to provide a minimum character size, such as a smallest character being no less than 12 pixels in height in one embodiment. Of course, those having ordinary skill in the art will understand that the minimum character height may vary according to the nature of the image. For example, different character heights may be required when processing a grayscale image than when processing a binary (e.g. bitonal) image. In more approaches, characters must be sufficiently large to be recognized by optical character recognition (OCR).
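As a simple numerical illustration of the comparison described above, if reference characters scanned at 300 DPI average 22 pixels in height and the likely characters in the digital image average 11 pixels, the estimated resolution is roughly 150 DPI. The sketch below encodes this proportionality; the reference dimensions are placeholders for illustration, not values taken from the document:

```python
def estimate_resolution_dpi(avg_char_height_px, avg_char_width_px,
                            ref_height_px=22.0, ref_width_px=12.0, ref_dpi=300.0):
    """Estimate resolution by comparing average character dimensions to reference characters."""
    dpi_from_height = ref_dpi * avg_char_height_px / ref_height_px
    dpi_from_width = ref_dpi * avg_char_width_px / ref_width_px
    # average the two estimates; either one alone could also be used
    return 0.5 * (dpi_from_height + dpi_from_width)
```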
[00231] In even still more embodiments, method may include one or more of: estimating one or more dimensions of the digital representation of the document based on the estimated resolution of the digital representation of the document; computing an average character width from the average character dimensions; computing an average character height from the average character dimensions; comparing the average character width to the average character height; estimating an orientation of the digital representation of the document based on the comparison; and matching the digital representation of the document to a known document type based on the estimated dimensions and the estimated orientation.
[00232] In an alternative embodiment, estimating resolution may be performed in an inverse manner, namely by processing a digital representation of a document to determine a content of the document, such as a payment amount for a digital representation of a check, an addressee for a letter, a pattern of a form, a barcode, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions. Based on the determined content, the digital representation of the document may be determined to correspond to one or more known document types, and utilizing information about the known document type(s), the resolution of the digital representation of the document may be determined and/or estimated.
[00233] Various Embodiments of Blur Detection
[00234] A method for detecting one or more blurred regions in a digital image will be described, according to various embodiments. As will be understood and appreciated by the skilled artisan upon reading the present descriptions, method may be performed in any suitable environment, such as those discussed herein and represented in the multitude of figures submitted herewith. Further, method may be performed in isolation and/or in conjunction with any other operation of any other method described herein, including but not limited to the image processing operations described above.
[00235] In one embodiment, method includes operation where, using a processor, a tetragon comprising a digital representation of a document in a digital image is divided into a plurality of sections, each section comprising a plurality of pixels.
[00236] In one embodiment, method includes operation where, for each section it is determined whether the section contains one or more sharp pixel-to-pixel transitions in a first direction.
[00237] In one embodiment, method includes operation where, for each section a total number of first direction sharp pixel-to-pixel transitions (SS1) are counted.
[00238] In one embodiment, method includes operation where, for each section it is determined whether the section contains one or more blurred pixel-to-pixel transitions in the first direction.
[00239] In one embodiment, method includes operation where, for each section a total number of first direction blurred pixel-to-pixel transitions (SB1) are counted.
[00240] In one embodiment, method includes operation where, for each section it is determined whether the section contains one or more sharp pixel-to-pixel transitions in a second direction.
[00241] In one embodiment, method includes operation where, for each section a total number of second direction sharp pixel-to-pixel transitions (SS2) are counted.
[00242] In one embodiment, method includes operation where, for each section, it is determined whether the section contains one or more blurred pixel-to-pixel transitions in the second direction.
[00243] In one embodiment, for each section, a total number of second direction blurred pixel-to-pixel transitions (SB2) are counted.
[00244] In one embodiment, for each section, it is determined that the section is blank upon determining: SS1 is less than a predetermined sharp transition threshold, SB1 is less than a predetermined blurred transition threshold, SS2 is less than the predetermined sharp transition threshold, and SB2 is less than the predetermined blurred transition threshold.
[00245] In one embodiment, for each non-blank section, a first direction blur ratio r1 = SS1 / SB1 is determined.
[00246] In one embodiment, for each non-blank section, a second direction blur ratio r2 = SS2 / SB2 is determined.
[00247] In one embodiment, for each non-blank section, it is determined that the non-blank section is blurred in the first direction upon determining that r1 is less than a predefined section blur ratio threshold.
[00248] In one embodiment, for each non-blank section, it is determined that the non-blank section is blurred in the second direction upon determining that r2 is less than the predefined section blur ratio threshold.
[00249] In some approaches a "first direction" and "second direction" may be characterized as perpendicular, e.g. a vertical direction and a horizontal direction, or perpendicular diagonals of a square. In other approaches, the "first direction" and "second direction" may correspond to any path traversing the digital image, but preferably each corresponds to a linear path traversing the digital image. A person having ordinary skill in the art reading the present descriptions will appreciate that the scope of the inventive embodiments disclosed herein should not be limited to only these examples, but rather inclusive of any equivalents thereof known in the art.
[00250] In one embodiment, for each non-blank section, it is determined that the non-blank section is blurred upon determining one or more of: the section is blurred in the first direction, and the section is blurred in the second direction.
[00251] In one embodiment, a total number of blurred sections is determined.
[00252] In one embodiment, an image blur ratio R, defined as the total number of blurred sections divided by a total number of sections, is calculated.
[00253] In one embodiment, method includes operation, where, it is determined that the digital image is blurred upon determining the image blur ratio is greater than a predetermined image blur threshold.
[00254] In various embodiments, method may include one or more additional and/or alternative operations, such as described below. For example, in one embodiment, method may also include determining, for each section, a distribution of brightness values of the plurality of pixels; determining a characteristic variability v of the distribution of brightness values; calculating a noticeable brightness transition threshold η based on v (for example, η = 3 * v, but not more than a certain value, such as 16); calculating a large brightness transition threshold μ based on η (for example, μ = 2 * η, but not more than a certain value, such as half of the brightness range); analyzing, for each pixel within the plurality of pixels, a directional pattern of brightness change in a window surrounding the pixel (for example, horizontally, vertically, diagonally, etc.); and identifying one or more of the sharp pixel-to-pixel transitions and the blurred pixel-to-pixel transitions based on the analysis.
[00255] In another embodiment, method may also include defining a plurality of center pixels; sequentially analyzing each of the plurality of center pixels within one or more small windows of pixels surrounding the center pixel, such as two pixels before and after; identifying the sharp pixel-to-pixel transition upon determining: the large brightness transition exists within an immediate vicinity of the center pixel (for example, from the immediately preceding pixel to the one following), a first small (e.g. smaller than noticeable) brightness variation exists before the large brightness transition, and a second small brightness variation exists after the large brightness transition; detecting the sharp pixel-to-pixel transition upon determining: the large transition exists within one or more of the small windows, and a monotonic change in brightness exists in the large transition; and detecting the blurred pixel-to-pixel transition upon determining: the noticeable transition occurs within a small window, and the monotonic change in brightness exists in the noticeable transition.
[00256] In still another embodiment, method may also include, for each section: counting a total number of sharp transitions in each of one or more chosen directions; counting a total number of blurred transitions in each chosen direction; determining that a section is blank upon determining: the total number of sharp transitions is less than a predefined sharp transition threshold (for example, 50), and the total number of blurred transitions is less than a predefined blurred transition threshold; determining the non-blank section is blurred upon determining a section blurriness ratio comprising the total number of sharp transitions to the total number of blurred transitions is less than a section blur ratio threshold (for example, 24%) in at least one of the chosen directions; and determining that the section is sharp upon determining the section is neither blank nor blurred.
[00257] In yet another embodiment, method may also include determining a total number of blank sections within the plurality of sections (Nblank); determining a total number of blurred sections within the plurality of sections (Nblur); determining a total number of sharp sections within the plurality of sections (Nsharp); determining a blurriness ratio RB = Nblur / (Nblur + Nsharp); and determining that the digital image is sharp if RB is less than a blurriness threshold (preferably expressed as a percentage, for example 30%).
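Putting the per-section counts together, the image-level decision might be sketched as follows (transition counts per section are assumed to have already been combined over the chosen directions, and a single threshold is used for both blank tests for brevity; the 50, 24% and 30% values follow the examples above):

```python
def image_is_sharp(sections, blank_threshold=50, blur_ratio_threshold=0.24,
                   blurriness_threshold=0.30):
    """Classify an image as sharp from per-section (sharp, blurred) transition counts."""
    n_blur = n_sharp = 0
    for sharp_count, blurred_count in sections:
        if sharp_count < blank_threshold and blurred_count < blank_threshold:
            continue  # blank section, excluded from the blurriness ratio
        if blurred_count > 0 and sharp_count / blurred_count < blur_ratio_threshold:
            n_blur += 1
        else:
            n_sharp += 1
    rb = n_blur / (n_blur + n_sharp) if (n_blur + n_sharp) else 0.0
    return rb < blurriness_threshold
```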
[00258] It will further be appreciated that embodiments presented herein may be provided in the form of a service deployed on behalf of a customer to offer service on demand.
[00259] Document Classification
[00260] In accordance with one inventive embodiment commensurate in scope with the present disclosures, a method 500 is shown in FIG. 5. The method 500 may be carried out in any desired environment, and may include embodiments and/or approaches described in relation to FIGS. 1-4D, among others. Of course, more or fewer operations than those shown in FIG. 5 may be performed in accordance with method 500 as would be appreciated by one of ordinary skill in the art upon reading the present descriptions.
[00261] In operation 502, a digital image captured by a mobile device is received.
[00262] In one embodiment the digital image may be characterized by a native resolution. As understood herein, a "native resolution" may be an original, native resolution of the image as originally captured, but also may be a resolution of the digital image after performing some pre-classification processing such as any of the image processing operations described herein. In one embodiment, the native resolution is approximately 500 pixels by 600 pixels (i.e. a 500x600 digital image) for a digital image of a driver license subjected to processing by virtual rescan (VRS) before performing classification. Moreover, the digital image may be characterized as a color image in some approaches, and in still more approaches may be a cropped-color image, i.e. a color image depicting substantially only the object to be classified, and not depicting image background.
[00263] In operation 504, a first representation of the digital image is generated using a processor of the mobile device. The first representation may be characterized by a reduced resolution, in one approach. As understood herein, a "reduced resolution" may be any resolution less than the native resolution of the digital image, and more particularly any resolution suitable for subsequent analysis of the first representation according to the principles set forth herein.
[00264] In preferred embodiments, the reduced resolution is sufficiently low to minimize processing overhead and maximize computational efficiency and robustness of performing the algorithm on the respective mobile device, host device and/or server platform. For example, in one approach the first representation is characterized by a resolution of about 25 pixels by 25 pixels, which has been experimentally determined to be a particularly efficient and robust reduced resolution for processing of relatively small documents, such as business cards, driver licenses, receipts, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00265] Of course, in other embodiments, different resolutions may be employed without departing from the scope of the present disclosure. For example, classification of larger documents or objects may benefit from utilizing a higher resolution such as 50 pixels by 50 pixels, 100 pixels by 100 pixels, etc. to better represent the larger document or object for robust classification and maximum computational efficiency. The resolution utilized may or may not have the same number of pixels in each dimension. Moreover, the most desirable resolution for classifying various objects within a broad range of object classes may be determined
experimentally according to a user's preferred balance between computational efficiency and classification robustness. In still more embodiments, any resolution may be employed, and preferably the resolution is characterized by comprising between 1 pixel and about 1000 pixels in a first dimension, and between 1 and about 1000 pixels in a second dimension.
[00266] One exemplary embodiment of inputs, outputs and/or results of a process flow for generating the first representation will now be presented with particular reference to FIGS. 3A-3C, which respectively depict: a digital image before being divided into sections (e.g. digital image 300, FIG. 3A); a digital image divided into sections (e.g. sections 304, FIG. 3B); and a first representation of the digital image (e.g. representation 310, FIG. 3C) characterized by a reduced resolution.
[00267] As shown in FIGS. 3A-3B, a digital image 300 captured by a mobile device may be divided into a plurality of sections 304. Each section may comprise a plurality of pixels 306, which may comprise a substantially rectangular grid of pixels such that the section has dimensions of ps(x) horizontal pixels (ps(x) = 4 in FIG. 3B) by ps(y) vertical pixels (ps(y) = 4 in FIG. 3B).
[00268] In one general embodiment, a first representation may be generated by dividing a digital image R (having a resolution of xR pixels by yR pixels) into Sx horizontal sections and Sy vertical sections, and thus may be characterized by a reduced resolution r of Sx pixels by Sy pixels. Thus, generating the first representation essentially includes generating a less-granular representation of the digital image.
[00269] For example, in one approach the digital image 300 is divided into S sections, each section 304 corresponding to one portion of an s-by-s grid 302. Generating the first representation involves generating an s-pixel-by-s-pixel first representation 310, where each pixel 312 in the first representation 310 corresponds to one of the S sections 304 of the digital image, and wherein each pixel 312 is located in a position of the first representation 310 corresponding to the location of the corresponding section 304 in the digital image, i.e. the upper-leftmost pixel 312 in the first representation corresponds to the upper-leftmost section 304 in the digital image, etc.
[00270] Of course, other reduced resolutions may be employed for the first representation, ideally but not necessarily according to limitations and/or features of a mobile device, host device, and/or server platform being utilized to carry out the processing, the characteristics of the digital image (resolution, illumination, presence of blur, etc.) and/or characteristics of the object which is to be detected and/or classified (contrast with background, presence of text or other symbols, closeness of fit to a general template, etc.) as would be understood by those having ordinary skill in the art upon reading the present descriptions.
[00271] In some approaches, generating the first representation may include one or more alternative and/or additional suboperations, such as dividing the digital image into a plurality of sections. The digital image may be divided into a plurality of sections in any suitable manner, and in one embodiment the digital image is divided into a plurality of rectangular sections. Of course, sections may be characterized by any shape, and in alternative approaches the plurality of sections may or may not represent the entire digital image, may represent an oversampling of some regions of the image, or may represent a single sampling of each pixel depicted in the digital image. In a preferred embodiment, as discussed above regarding FIGS. 3A-3C, the digital image is divided into S substantially square sections 304 to form an s x s grid 302.
[00272] In further approaches, generating the first representation may also include determining, for each section of the digital image, at least one characteristic value, where each characteristic value corresponds to one or more features descriptive of the section. Within the scope of the present disclosures, any feature that may be expressed as a numerical value is suitable for use in generating the first representation, e.g. an average brightness or intensity (0-255) across each pixel in the section, an average value (0-255) of each color channel of each pixel in the section, such as an average red-channel value, an average green-channel value, and an average blue-channel value for a red-green-blue (RGB) image, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00273] With continuing reference to FIGS. 3A-3C, in some embodiments each pixel 312 of the first representation 310 corresponds to one of the S sections 304 not only with respect to positional correspondence, but also with respect to feature correspondence. For example, in one approach generating the first representation 310 may additionally include determining a characteristic section intensity value is by calculating the average of the individual intensity values ip of each pixel 306 in the section 304. Then, each pixel 312 in the first representation 310 is assigned an intensity value equal to the average intensity value is calculated for the corresponding section 304 of the digital image 300. In this manner, the first representation 310 reflects a less granular, normalized representation of the features depicted in digital image 300.
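The averaging described above amounts to a block-wise downsampling of the image. A minimal sketch (assuming a grayscale image at least s pixels in each dimension; names and the 25-pixel default are illustrative, mirroring the preferred embodiment above) is:

```python
import numpy as np

def first_representation(gray, s=25):
    """s-by-s first representation; each output pixel is a section's average intensity."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    ys = np.linspace(0, h, s + 1).astype(int)
    xs = np.linspace(0, w, s + 1).astype(int)
    rep = np.empty((s, s), dtype=float)
    for i in range(s):
        for j in range(s):
            rep[i, j] = gray[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
    return rep
```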
[00274] Of course, the pixels 312 comprising the first representation 310 may be represented using any characteristic value or combination of characteristic values without departing from the scope of the presently disclosed classification methods. Further, characteristic values may be computed and/or determined using any suitable means, such as by random selection of a characteristic value from a distribution of values, by a statistical means or measure, such as an average value, a spread of values, a minimum value, a maximum value, a standard deviation of values, a variance of values, or by any other means that would be known to a skilled artisan upon reading the instant descriptions.
[00275] In operation 506, a first feature vector is generated based on the first representation.
[00276] The first feature vector and/or reference feature matrices may include a plurality of feature vectors, where each feature vector corresponds to a characteristic of a corresponding object class, e.g. a characteristic minimum, maximum, average, etc. brightness in one or more color channels at a particular location (pixel or section), presence of a particular symbol or other reference object at a particular location, dimensions, aspect ratio, pixel density (especially black pixel density, but also pixel density of any other color channel), etc.
[00277] As would be understood by one having ordinary skill in the art upon reading the present descriptions, feature vectors suitable for inclusion in the first feature vector and/or reference feature matrices comprise any type, number and/or length of feature vectors descriptive of one or more features of the image, e.g. a distribution of color data, etc.
[00278] In operation 508, the first feature vector is compared to a plurality of reference feature matrices, each reference feature matrix comprising a plurality of vectors.
[00279] The comparing operation 508 may be performed according to any suitable matrix comparison, vector comparison, or a combination of the two.
[00280] Thus, in such approaches the comparing may include an N-dimensional feature space comparison. In at least one approach, N is greater than 50, but of course, N may be any value sufficiently large to ensure robust classification of objects into a single, correct object class, which those having ordinary skill in the art reading the present descriptions will appreciate to vary according to many factors, such as the complexity of the object, the similarity or distinctness between object classes, the number of object classes, etc.
[00281] As understood herein, "objects" include any tangible thing represented in an image and which may be described according to at least one unique characteristic such as color, size, dimensions, shape, texture, or representative feature(s) as would be understood by one having ordinary skill in the art upon reading the present descriptions. Additionally, objects may be classified according to at least one unique combination of such characteristics. For example, in various embodiments objects may include but are in no way limited to persons, animals, vehicles, buildings, landmarks, documents, furniture, plants, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00282] For example, in one embodiment where attempting to classify an object depicted in a digital image as one of only a small number of object classes (e.g. 3-5 object classes), each object class being characterized by a significant number of starkly distinguishing features or feature vectors (e.g. each object class corresponding to an object or object(s) characterized by very different size, shape, color profile and/or color scheme and easily distinguishable reference symbols positioned in unique locations on each object class, etc.), a relatively low value of N may be sufficiently large to ensure robust classification.
[00283] On the other hand, where attempting to classify an object depicted in a digital image as one of a large number of object classes (e.g. 30 or more object classes), and each object class is characterized by a significant number of similar features or feature vectors, and only a few distinguishing features or feature vectors, a relatively high value of N may be preferable to ensure robust classification. Similarly, the value of N is preferably chosen or determined such that the classification is not only robust, but also computationally efficient; i.e. the classification process(es) introduce only minimal processing overhead to the device(s) or system(s) utilized to perform the classification algorithm.
[00284] The value of N that achieves the desired balance between classification robustness and processing overhead will depend on many factors such as described above and others that would be appreciated by one having ordinary skill in the art upon reading the present descriptions. Moreover, determining the appropriate value of N to achieve the desired balance may be accomplished using any known method or equivalent thereof as understood by a skilled artisan upon reading the instant disclosures.
[00285] In a concrete implementation, directed to classifying driver licenses according to state and distinguishing driver licenses from myriad other document types, it was determined that a 625-dimensional comparison (N = 625) provided a preferably robust classification without introducing unsatisfactorily high overhead to processing performed using a variety of current- generation mobile devices.
[00286] In operation 510, an object depicted in the digital image is classified as a member of a particular object class based at least in part on the comparing operation 508. More specifically, the comparing operation 508 may involve evaluating each feature vector of each reference feature matrix, or alternatively evaluating a plurality of feature matrices for objects belonging to a particular object class, and identifying a hyper-plane in the N-dimensional feature space that separates the feature vectors of one reference feature matrix from the feature vectors of other reference feature matrices. In this manner, the classification algorithm defines concrete hyper-plane boundaries between object classes, and may assign an unknown object to a particular object class based on similarity of feature vectors to the particular object class and/or dissimilarity to other reference feature matrix profiles.
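One concrete way to obtain the separating hyper-planes described above is a linear support vector machine trained over the reference feature vectors; the sketch below uses scikit-learn's LinearSVC purely as an illustrative stand-in, since the document does not mandate a particular algorithm or library, and the flattened 25x25 representation (N = 625) is only an example:

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_classifier(reference_feature_matrices, class_labels):
    """Fit hyper-planes separating each object class's reference feature vectors.

    reference_feature_matrices: one 2D array per class; each row is a feature
    vector (e.g. a flattened 25x25 first representation, giving N = 625).
    class_labels: one label per matrix.
    """
    X = np.vstack(reference_feature_matrices)
    y = np.concatenate([np.full(m.shape[0], label)
                        for m, label in zip(reference_feature_matrices, class_labels)])
    return LinearSVC().fit(X, y)

def classify_object(classifier, first_feature_vector):
    """Assign the object to the class indicated by the learned hyper-planes."""
    return classifier.predict(np.asarray(first_feature_vector).reshape(1, -1))[0]
```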
[00287] In the simplest example of such feature-space discrimination, imagining a two-dimensional feature space with one feature plotted along the ordinate axis and another feature plotted along the abscissa, objects belonging to one particular class may be characterized by feature vectors having a distribution of values clustered in the lower-right portion of the feature space, while another class of objects may be characterized by feature vectors exhibiting a distribution of values clustered in the upper-left portion of the feature space, and the
classification algorithm may distinguish between the two by identifying a line between each cluster separating the feature space into two classes - "upper left" and "lower-right." Of course, as the number of dimensions considered in the feature space increases, the complexity of the classification grows rapidly, but also provides significant improvements to classification robustness, as will be appreciated by one having ordinary skill in the art upon reading the present descriptions.
[00288] Additional Processing
[00289] In some approaches, classification according to embodiments of the presently disclosed methods may include one or more additional and/or alternative features and/or operations, such as described below.
[00290] In one embodiment, classification such as described above may additionally and/or alternatively include assigning a confidence value to a plurality of putative object classes based on the comparing operation (e.g. as performed in operation 508 of method 500). The presently disclosed classification methods, systems and/or computer program products may additionally and/or alternatively determine a location of the mobile device, receive location information indicating the location of the mobile device, etc., and based on the determined location, a confidence value of a classification result corresponding to a particular location may be adjusted. For example, if a mobile device is determined to be located in a particular state (e.g. Maryland) based on a GPS signal, then during classification, a confidence value may be adjusted for any object class corresponding to the particular state (e.g. Maryland Driver License, Maryland Department of Motor Vehicle Title/Registration Form, Maryland Traffic Violation Ticket, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions).
[00291] Confidence values may be adjusted in any suitable manner, such as increasing a confidence value for any object class corresponding to a particular location, decreasing a confidence value for any object class not corresponding to a particular location, normalizing confidence value(s) based on correspondence/non-correspondence to a particular location, etc. as would be understood by the skilled artisan reading the present disclosures.
[00292] The mobile device location may be determined using any known method, and employing hardware components of the mobile device or any other number of devices in communication with the mobile device, such as one or more satellites, wireless communication networks, servers, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00293] For example, the mobile device location may be determined based in whole or in part on one or more of a global-positioning system (GPS) signal, a connection to a wireless communication network, a database of known locations (e.g. a contact database, a database associated with a navigational tool such as Google Maps, etc.), a social media tool (e.g. a "check- in" feature such as provided via Facebook, Google Plus, Yelp, etc.), an IP address, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00294] In more embodiments, classification additionally and/or alternatively includes outputting an indication of the particular object class to a display of the mobile device; and receiving user input via the display of the mobile device in response to outputting the indication. While the user input may be of any known type and relate to any of the herein described features and/or operations, preferably user input relates to confirming, negating or modifying the particular object class to which the object was assigned by the classification algorithm.
[00295] The indication may be output to the display in any suitable manner, such as via a push notification, text message, display window on the display of the mobile device, email, etc. as would be understood by one having ordinary skill in the art. Moreover, the user input may take any form and be received in any known manner, such as detecting a user tapping or pressing on a portion of the mobile device display (e.g. by detecting changes in resistance, capacitance on a touch-screen device, by detecting user interaction with one or more buttons or switches of the mobile device, etc.).
[00296] In one embodiment, classification further includes determining one or more object features of a classified object based at least in part on the particular object class. Thus, classification may include determining such object features using any suitable mechanism or approach, such as receiving an object class identification code and using the object class identification code as a query and/or to perform a lookup in a database of object features organized according to object class and keyed, hashed, indexed, etc. to the object class identification code.
[00297] Object features within the scope of the present disclosures may include any feature capable of being recognized in a digital image, and preferably any feature capable of being expressed in a numerical format (whether scalar, vector, or otherwise), e.g. location of subregion containing reference object(s) (especially in one or more object orientation states, such as landscape, portrait, etc.) object color profile, or color scheme, object subregion color profile or color scheme, location of text, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00298] In accordance with another inventive embodiment commensurate in scope with the present disclosures, a method 600 is shown in FIG. 6. The method 600 may be carried out in any desired environment, and may include embodiments and/or approaches described in relation to FIGS. 1-4D, among others. Of course, more or fewer operations than those shown in FIG. 6 may be performed in accordance with method 600 as would be appreciated by one of ordinary skill in the art upon reading the present descriptions.
[00299] In operation 602, a first feature vector is generated based on a digital image captured by a mobile device.
[00300] In operation 604, the first feature vector is compared to a plurality of reference feature matrices.
[00301] In operation 606, an object depicted in the digital image is classified as a member of a particular object class based at least in part on the comparing (e.g. the comparing performed in operation 604).
[00302] In operation 608, one or more object features of the object are determined based at least in part on the particular object class.
[00303] In operation 610, a processing operation is performed. The processing operation includes performing one or more of the following subprocesses: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features.
[00304] As will be further appreciated by one having ordinary skill in the art upon reading the above descriptions of document classification, in various embodiments it may be advantageous to perform one or more additional processing operations, such as the subprocesses described above with reference to operation 610, on a digital image based at least in part on object features determined via document classification.
[00305] For example, after classifying an object depicted in a digital image, such as a document, it may be possible to refine other processing parameters, functions, etc. and/or utilize information known to be true for the class of objects to which the classified object belongs, such as object shape, size, dimensions, location of regions of interest on and/or in the object, such as regions depicting one or more symbols, patterns, text, etc. as would be understood by one having ordinary skill in the art upon reading the present descriptions.
[00306] Regarding performing page detection based on classification, it may be advantageous in some approaches to utilize information known about an object belonging to a particular object class in order to improve object detection capabilities. For example, and as would be appreciated by one having ordinary skill in the art, it may be less computationally expensive, and/or may result in a higher-confidence or higher-quality result, to narrow a set of characteristics that may potentially identify an object in a digital image to one or a few discrete, known characteristics, and simply search for those characteristic(s).
[00307] Exemplary characteristics that may be utilized to improve object detection include object dimensions, object shape, object color, and one or more reference features of the object class (such as reference symbols positioned in a known location of a document).
[00308] In another approach, object detection may be improved based on the one or more known characteristics by helping an object detection algorithm distinguish regions of the digital image depicting the object from regions of the digital image depicting other objects, image background, artifacts, etc., as would be understood by one having ordinary skill in the art upon reading the present descriptions. For example, if objects belonging to a particular object class are known to exhibit a particular color profile or scheme, it may be simpler and/or more reliable to attempt detecting the particular color profile or scheme within the digital image rather than detecting a transition from one color profile or scheme (e.g. a background color profile or scheme) to another color profile or scheme (e.g. the object color profile or scheme), especially if the two color profiles or schemes are not characterized by sharply contrasting features.
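As a hypothetical sketch of such color-profile-driven detection (assuming OpenCV is available; the HSV range shown is an invented example of a class color profile, not one drawn from the disclosure), detection might simply search for pixels matching the known profile and take the largest matching region:

    import cv2
    import numpy as np

    def detect_by_class_color(image_bgr, hsv_lo=(95, 40, 40), hsv_hi=(125, 255, 255)):
        """Locate the object as the largest region whose colors fall within the
        class's known color profile, rather than searching for a transition
        from background colors to object colors."""
        hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, np.array(hsv_lo), np.array(hsv_hi))
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((15, 15), np.uint8))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        return cv2.boundingRect(max(contours, key=cv2.contourArea))  # (x, y, w, h)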
[00309] Regarding performing rectangularization based on classification, it may be advantageous in some approaches to utilize information known about an object belonging to a particular object class in order to improve object rectangularization capabilities. For example, and as would be appreciated by one having ordinary skill in the art, it may be less
computationally expensive, and/or may result in a higher-confidence or higher-quality result, to transform a digital representation of an object from a native appearance to a true configuration based on a set of known object characteristics that definitively represent the true object configuration, rather than attempting to estimate the true object configuration from the native appearance and project the native appearance onto an estimated object configuration.
[00310] In one approach, the classification may identify known dimensions of the object, and based on these known dimensions the digital image may be rectangularized to transform a distorted representation of the object in the digital image into an undistorted representation (e.g. by removing projective effects introduced in the process of capturing the image using a camera of a mobile device rather than a traditional flat-bed scanner, paper-feed scanner or other similar multifunction peripheral (MFP)).
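A minimal sketch of such dimension-driven rectangularization (assuming OpenCV, and using hypothetical ID-card dimensions merely as an example of known class dimensions) maps the four detected object corners onto a rectangle whose aspect ratio is fixed by the class:

    import cv2
    import numpy as np

    def rectangularize(image, corners, class_width_mm=85.6, class_height_mm=54.0, dpi=300):
        """Warp the detected object corners (TL, TR, BR, BL) onto a rectangle
        sized from the class's known physical dimensions, removing
        projective distortion introduced by the camera."""
        w = int(class_width_mm / 25.4 * dpi)
        h = int(class_height_mm / 25.4 * dpi)
        src = np.array(corners, dtype=np.float32)
        dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype=np.float32)
        M = cv2.getPerspectiveTransform(src, dst)
        return cv2.warpPerspective(image, M, (w, h))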
[00311] Regarding performing cropping based on classification, and similar to the principles discussed above regarding rectangularization, it may be advantageous in some approaches to utilize information known about an object belonging to a particular object class to improve cropping of digital images depicting the object such that all or significantly all of the cropped image depicts the object and not image background (or other objects, artifacts, etc. depicted in the image).
[00312] As a simple example, it may be advantageous to determine an object's known size, dimensions, configuration, etc. according to the object classification, utilize this information to distinguish a region of the image depicting the object from regions of the image not depicting the object, and define crop lines surrounding the object to remove the regions of the image not depicting the object.
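The corresponding cropping step then reduces to keeping only the detected object region; a trivial sketch follows (the margin parameter is an assumption added only for illustration):

    def crop_to_object(image, bbox, margin=0):
        """Keep only the region the classified object occupies; `bbox` is the
        (x, y, w, h) rectangle found during detection, optionally padded."""
        x, y, w, h = bbox
        x0, y0 = max(x - margin, 0), max(y - margin, 0)
        x1 = min(x + w + margin, image.shape[1])
        y1 = min(y + h + margin, image.shape[0])
        return image[y0:y1, x0:x1]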
[00313] Regarding performing binarization based on classification, the presently disclosed classification algorithms provide several useful improvements to mobile image processing. Several exemplary embodiments of such improvements will now be described with reference to FIGS. 4A-4D.
[00314] For example, binarization algorithms generally transform a multi-tonal digital image (e.g. grayscale, color, or any other image such as image 400 exhibiting more than two tones) into a bitonal image, i.e. an image exhibiting only two tones (typically white and black). Those having ordinary skill in the art will appreciate that attempting to binarize a digital image depicting an object with regions exhibiting two or more distinct color profiles and/or color schemes (e.g. a region depicting a color photograph 402 as compared to a region depicting a black/white text region 404, a color-text region 406, a symbol 408 such as a reference object, watermark, etc., an object background region 410, etc.) may produce an unsuccessful or
unsatisfactory result.
[00315] As one explanation, these difficulties may be at least partially due to the differences between the color profiles, schemes, etc., which counter-influence a single binarization transform. Thus, providing an ability to distinguish each of these regions having disparate color schemes or profiles and define separate binarization parameters for each may greatly improve the quality of the resulting bitonal image as a whole and with particular respect to the quality of the transformation in each respective region.
[00316] According to one exemplary embodiment shown in FIGS. 4A-4B, improved binarization may include determining an object class color profile and/or scheme (e.g.
determining a color profile and/or color scheme for object background region 410); adjusting one or more binarization parameters based on the object class color profile and/or color scheme; and thresholding the digital image using the one or more adjusted binarization parameters.
[00317] Binarization parameters may include any parameter of any suitable binarization process as would be appreciated by those having ordinary skill in the art reading the present descriptions, and binarization parameters may be adjusted according to any suitable methodology. For example, with respect to adjusting binarization parameters based on an object class color profile and/or color scheme, binarization parameters may be adjusted to over- and/or under-emphasize a contribution of one or more color channels, intensities, etc. in accordance with the object class color profile/scheme (such as under-emphasizing the red channel for an object class color profile/scheme relatively saturated by red hue(s), etc.).
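For illustration, binarization parameters might be adjusted by re-weighting the color channels according to the class color profile before thresholding. In the sketch below the channel weights are hypothetical (they under-emphasize the red channel for a red-saturated class profile), and Otsu thresholding stands in for whatever thresholding step is actually used:

    import cv2
    import numpy as np

    def binarize_with_class_profile(image_bgr, channel_weights=(0.6, 0.3, 0.1)):
        """Threshold using a grayscale built from per-channel weights adjusted
        to the class color profile (weights given in B, G, R order to match
        OpenCV's channel layout)."""
        b, g, r = cv2.split(image_bgr.astype(np.float32))
        wb, wg, wr = channel_weights
        gray = (wb * b + wg * g + wr * r).clip(0, 255).astype(np.uint8)
        _, bitonal = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return bitonal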
[00318] Similarly, in other embodiments such as particularly shown in FIGS. 4B-4D, improved binarization may include determining an object class mask, applying the object class mask to the digital image, and thresholding a subregion of the digital image based on the object class mask. The object class mask may be any type of mask, with the condition that the object class mask provides information regarding the location of particular regions of interest characteristic of objects belonging to the class (such as a region depicting a color photograph 402, a region depicting a black/white text region 404, a color-text region 406, a symbol region depicting a symbol 408 such as a reference object, watermark, etc., an object background region 410, etc.) and enables the selective inclusion and/or exclusion of such regions from the binarization operation(s).
[00319] For example, as shown in FIG. 4B, improved binarization includes determining an object class mask 420 identifying regions such as discussed immediately above and applying the object class mask 420 to exclude from binarization all of the digital image 400 except a single region of interest, such as object background region 410. Alternatively, the entire digital image may be masked-out and a region of interest such as object background region 410 subsequently masked-in to the binarization process. Moreover, in either event the masking functionality now described with reference to FIG. 4B may be combined with the exemplary color profile and/or color scheme information functionality described above, for example by obtaining both the object class mask and the object color profile and/or color scheme, applying the object class mask to exclude all of the digital image from binarization except object background region 410, adjusting one or more binarization parameters based on the object background region color profile and/or color scheme, and thresholding the object background region 410 using the adjusted binarization parameters.
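A sketch of mask-guided binarization under these assumptions (a binary object class mask marking the region of interest, Otsu thresholding as a stand-in thresholding step) might look as follows; only the masked-in subregion is thresholded, leaving the rest of the image untouched for separate treatment or later restoration:

    import cv2
    import numpy as np

    def binarize_masked_region(image_gray, class_mask):
        """Threshold only the pixels the object class mask marks as the region
        of interest (non-zero in `class_mask`); masked-out pixels are left
        untouched so they can be restored or processed separately."""
        out = image_gray.copy()
        roi = class_mask > 0
        if roi.any():
            thresh, _ = cv2.threshold(image_gray[roi].reshape(1, -1), 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            out[roi] = np.where(image_gray[roi] > thresh, 255, 0)
        return out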
[00320] Extending the principle shown in FIG. 4B, multiple regions of interest may be masked-in and/or masked-out using object class mask 420 to selectively designate regions and/or parameters for binarization in a layered approach designed to produce high-quality bitonal images. For example, as shown in FIG. 4C, multiple text regions 404, 406 may be retained for binarization (potentially using adjusted parameters) after applying object class mask 420, for example to exclude all non-text regions from binarization, in some approaches.
[00321] Similarly, it may be advantageous to simply exclude only a portion of an image from binarization, whether or not adjusting any parameters. For example, with reference to FIG. 4D, it may be desirable to mask-out a unique region of a digital image 400, such as a region depicting a color photograph 402, using an object class mask 420. Then, particularly in approaches where the remaining portion of the digital image 400 is characterized by a single color profile and/or color scheme, or a small number (i.e. no more than 3) of substantially similar color profiles and/or color schemes, binarization may be performed to clarify the remaining portions of the digital image 400. Subsequently, the masked-out unique region may optionally be restored to the digital image 400, with the result being an improved bitonal image quality in all regions of the digital image 400 that were subjected to binarization, coupled with an undisturbed color photograph 402 in the region of the image not subjected to binarization.
[00322] In still more embodiments, it may be advantageous to perform optical character recognition (OCR) based at least in part on the classification and/or result of classification. Specifically, it may be advantageous to determine information about the location, format, and/or content of text depicted in objects belonging to a particular class, and modify predictions estimated by traditional OCR methods based on an expected text location, format and/or content. For example, in one embodiment where an OCR prediction estimates text in a region
corresponding to a "date" field, of a document reads "Jan, 14, 2013" the presently disclosed algorithms may determine the expected format for this text follows a format such as
"[Abbreviated Month][.] [##][,][####]" the algorithm may correct the erroneous OCR predictions, e.g. converting the comma after "Jan" into a period and/or converting the letter "1" at the end of 2011" into a numerical one character. Similarly, the presently disclosed algorithms may determine the expected format for the same text is instead "[##]/[##]/[####]" and convert "Jan" to "01 " and convert each set of comma-space characters ", " into a slash "/" to correct the erroneous OCR predictions.
[00323] A method includes: receiving a digital image captured by a mobile device; and using a processor of the mobile device: generating a first representation of the digital image, the first representation being characterized by a reduced resolution; generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing. Generating the first representation involves dividing the digital image into a plurality of sections; and determining, for each section, at least one characteristic value, each characteristic value corresponding to one or more features descriptive of the section. The first representation comprises a plurality of pixels, each of the plurality of pixels corresponds to one section of the plurality of sections, and each of the plurality of pixels is characterized by the at least one characteristic value determined for the corresponding section. The digital image comprises a cropped, color image. One or more of the reference feature matrices comprises a plurality of feature vectors, and each feature vector corresponds to at least one characteristic of an object. The comparing comprises an N-dimensional comparison, and N is greater than 50. The first feature vector is characterized by a feature vector length greater than 500. The method also includes determining one or more object features of the object based at least in part on the particular object class; detecting the object depicted in the digital image based at least in part on the classifying and/or result thereof; rectangularizing the object depicted in the digital image based at least in part on the classifying and/or result thereof;
cropping the digital image based at least in part on the classifying and/or result thereof; and/or binarizing the digital image based at least in part on the classifying and/or result thereof. The binarizing additionally and/or alternatively includes one or more of: determining an object class mask; applying the object class mask to the digital image; and thresholding a subregion of the digital image based on the object class mask. The method may include adjusting one or more binarization parameters based on the object class mask; thresholding the digital image using the one or more adjusted binarization parameters; and determining an object class color scheme. Similarly, binarizing may include adjusting one or more binarization parameters based on the object class color scheme; and thresholding the digital image using the one or more adjusted binarization parameters. The method additionally and/or alternatively includes: determining a geographical location associated with the mobile device, wherein the classifying is further based at least in part on the geographical location. The method additionally and/or alternatively includes: outputting an indication of the particular object class to a display of the mobile device; and receiving user input via the display of the mobile device in response to outputting the indication. The method additionally and/or alternatively includes: determining one or more object features of the object based at least in part on the particular object class.
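The reduced-resolution first representation described above can be pictured with a short sketch. The following Python/NumPy fragment is a hypothetical illustration only: the 32x32 grid size and the choice of the per-section mean intensity as the characteristic value are assumptions, but they show how each section of the divided image collapses to one "pixel" of the first representation, and how flattening that representation yields a first feature vector whose length exceeds 500.

    import numpy as np

    def first_representation(image, grid=(32, 32)):
        """Divide the image into grid sections and keep one characteristic
        value per section (here, the mean intensity per channel)."""
        h, w = image.shape[:2]
        rows, cols = grid
        shape = (rows, cols, image.shape[2]) if image.ndim == 3 else (rows, cols)
        rep = np.zeros(shape)
        ys = np.linspace(0, h, rows + 1, dtype=int)
        xs = np.linspace(0, w, cols + 1, dtype=int)
        for i in range(rows):
            for j in range(cols):
                section = image[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                rep[i, j] = section.mean(axis=(0, 1)) if image.ndim == 3 else section.mean()
        return rep

    def first_feature_vector(image, grid=(32, 32)):
        """Flatten the representation into the vector compared against the
        reference feature matrices (length 32*32*3 = 3072 for a color image)."""
        return first_representation(image, grid).ravel()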
[00324] A method includes: generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices; classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and determining one or more object features of the object based at least in part on the particular object class. The method also includes performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of: detecting the object depicted in the digital image based at least in part on the one or more object features; rectangularizing the object depicted in the digital image based at least in part on the one or more object features; cropping the digital image based at least in part on the one or more object features; and binarizing the digital image based at least in part on the one or more object features. The one or more object features comprise an object color scheme, and the binarizing comprises: determining the object color scheme; adjusting one or more binarization parameters based on the processing; and thresholding the digital image using the one or more adjusted binarization parameters. The one or more object features may additionally and/or alternatively comprise an object class mask, and the binarizing comprises: determining the object class mask; applying the object class mask to the digital image; and thresholding a subregion of the digital image based on the object class mask.
[00325] Of course, other methods of improving upon and/or correcting OCR predictions that would be appreciated by the skilled artisan upon reading these descriptions are also fully within the scope of the present disclosure.
[00326] The inventive concepts disclosed herein have been presented by way of example to illustrate the myriad features thereof in a plurality of illustrative scenarios, embodiments, and/or implementations. It should be appreciated that the concepts generally disclosed are to be considered as modular, and may be implemented in any combination, permutation, or synthesis thereof. In addition, any modification, alteration, or equivalent of the presently disclosed features, functions, and concepts that would be appreciated by a person having ordinary skill in the art upon reading the instant descriptions should also be considered within the scope of this disclosure.
[00327] Accordingly, one embodiment of the present invention includes all of the features disclosed herein, including those shown and described in conjunction with any of the FIGS. Other embodiments include subsets of the features disclosed herein and/or shown and described in conjunction with any of the FIGS. Such features, or subsets thereof, may be combined in any way using known techniques that would become apparent to one skilled in the art after reading the present description.
[00328] While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of an embodiment of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

What is claimed is:
1. A method, comprising:
receiving a digital image captured by a mobile device; and
using a processor of the mobile device:
generating a first representation of the digital image, the first representation being characterized by a reduced resolution;
generating a first feature vector based on the first representation; comparing the first feature vector to a plurality of reference feature matrices; and classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing.
2. The method as recited in claim 1, wherein generating the first representation comprises: dividing the digital image into a plurality of sections; and
determining, for each section, at least one characteristic value, each characteristic value corresponding to one or more features descriptive of the section.
3. The method as recited in claim 2, wherein the first representation comprises a plurality of pixels,
wherein each of the plurality of pixels corresponds to one section of the plurality of sections, and
wherein each of the plurality of pixels is characterized by the at least one characteristic value determined for the corresponding section.
4. The method as recited in claim 1, wherein the digital image comprises a cropped, color image.
5. The method as recited in claim 1, wherein one or more of the reference feature matrices comprises a plurality of feature vectors, and
wherein each feature vector corresponds to at least one characteristic of an object.
6. The method as recited in claim 1, wherein the comparing comprises an N-dimensional comparison, and
wherein N is greater than 50.
7. The method as recited in claim 1, wherein the first feature vector is characterized by a feature vector length greater than 500.
8. The method as recited in claim 1, further comprising: determining one or more object features of the object based at least in part on the particular object class.
9. The method as recited in claim 1, further comprising: detecting the object depicted in the digital image based at least in part on the classifying and/or result thereof.
10. The method as recited in claim 1, further comprising: rectangularizing the object depicted in the digital image based at least in part on the classifying and/or result thereof.
11. The method as recited in claim 1, further comprising: cropping the digital image based at least in part on the classifying and/or result thereof.
12. The method as recited in claim 1, further comprising binarizing the digital image based at least in part on the classifying and/or result thereof.
13. The method as recited in claim 12, wherein the binarizing comprises:
determining an object class color scheme;
adjusting one or more binarization parameters based on the object class color scheme; and
thresholding the digital image using the one or more adjusted binarization parameters.
14. The method as recited in claim 12, wherein the binarizing comprises:
determining an object class mask;
applying the object class mask to the digital image; and
thresholding a subregion of the digital image based on the object class mask.
15. The method as recited in claim 14, wherein the binarizing further comprises: adjusting one or more binarization parameters based on the object class mask; and thresholding the digital image using the one or more adjusted binarization parameters.
16. The method as recited in claim 1, further comprising: determining a geographical location associated with the mobile device,
wherein the classifying is further based at least in part on the geographical location.
17. The method as recited in claim 1, further comprising:
outputting an indication of the particular object class to a display of the mobile device; and
receiving user input via the display of the mobile device in response to outputting the indication.
18. The method as recited in claim 1, further comprising: determining one or more object features of the object based at least in part on the particular object class.
19. A method, comprising:
generating a first feature vector based on a digital image captured by a mobile device; comparing the first feature vector to a plurality of reference feature matrices;
classifying an object depicted in the digital image as a member of a particular object class based at least in part on the comparing; and
determining one or more object features of the object based at least in part on the
particular object class; and
performing at least one processing operation using a processor of a mobile device, the at least one processing operation selected from a group consisting of:
detecting the object depicted in the digital image based at least in part on the one or more object features;
rectangularizing the object depicted in the digital image based at least in part on the one or more object features;
cropping the digital image based at least in part on the one or more object features; and
binarizing the digital image based at least in part on the one or more object
features.
20. The method as recited in claim 19, wherein the one or more object features comprise an object color scheme, and
wherein the binarizing comprises:
determining the object color scheme;
adjusting one or more binarization parameters based on the processing; and thresholding the digital image using the one or more adjusted binarization
parameters.
21. The method as recited in claim 19, wherein the one or more object features comprise an object class mask, and
wherein the binarizing comprises:
determining the object class mask;
applying the object class mask to the digital image; and
thresholding a subregion of the digital image based on the object class mask.
22. The method as recited in claim 21, wherein the one or more object features further
comprise an object color scheme, and
wherein the binarizing comprises:
determining the object color scheme;
adjusting one or more binarization parameters based on the processing; and thresholding the digital image using the one or more adjusted binarization
parameters.
23. A system, comprising:
a processor; and
logic in and/or executable by the processor to cause the processor to:
generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation;
compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
24. A computer program product comprising: a computer readable storage medium having program code embodied therewith, the program code readable/executable by a processor to:
generate a first representation of a digital image captured by a mobile device; generate a first feature vector based on the first representation;
compare the first feature vector to a plurality of reference feature matrices; and classify an object depicted in the digital image as a member of a particular object class based at least in part on the comparison.
PCT/US2014/026597 2013-03-13 2014-03-13 Systems and methods for classifying objects in digital images captured using mobile devices WO2014160433A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP14773721.7A EP2974261A4 (en) 2013-03-13 2014-03-13 Systems and methods for classifying objects in digital images captured using mobile devices
CN201480014229.9A CN105308944A (en) 2013-03-13 2014-03-13 Classifying objects in images using mobile devices
JP2016502192A JP2016516245A (en) 2013-03-13 2014-03-13 Classification of objects in images using mobile devices

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/802,226 US9355312B2 (en) 2013-03-13 2013-03-13 Systems and methods for classifying objects in digital images captured using mobile devices
US13/802,226 2013-03-13

Publications (2)

Publication Number Publication Date
WO2014160433A2 true WO2014160433A2 (en) 2014-10-02
WO2014160433A3 WO2014160433A3 (en) 2014-11-27

Family

ID=51527209

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/026597 WO2014160433A2 (en) 2013-03-13 2014-03-13 Systems and methods for classifying objects in digital images captured using mobile devices

Country Status (5)

Country Link
US (3) US9355312B2 (en)
EP (1) EP2974261A4 (en)
JP (1) JP2016516245A (en)
CN (1) CN105308944A (en)
WO (1) WO2014160433A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2647670C1 (en) * 2016-09-27 2018-03-16 Общество с ограниченной ответственностью "Аби Девелопмент" Automated methods and systems of identifying image fragments in document-containing images to facilitate extraction of information from identificated document-containing image fragments

Families Citing this family (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9769354B2 (en) 2005-03-24 2017-09-19 Kofax, Inc. Systems and methods of processing scanned data
US9137417B2 (en) 2005-03-24 2015-09-15 Kofax, Inc. Systems and methods for processing video data
US9576272B2 (en) 2009-02-10 2017-02-21 Kofax, Inc. Systems, methods and computer program products for determining document validity
US9767354B2 (en) 2009-02-10 2017-09-19 Kofax, Inc. Global geographic information retrieval, validation, and normalization
US8774516B2 (en) 2009-02-10 2014-07-08 Kofax, Inc. Systems, methods and computer program products for determining document validity
US8879846B2 (en) 2009-02-10 2014-11-04 Kofax, Inc. Systems, methods and computer program products for processing financial documents
US8958605B2 (en) 2009-02-10 2015-02-17 Kofax, Inc. Systems, methods and computer program products for determining document validity
US9349046B2 (en) 2009-02-10 2016-05-24 Kofax, Inc. Smart optical input/output (I/O) extension for context-dependent workflows
US9634855B2 (en) 2010-05-13 2017-04-25 Alexander Poltorak Electronic personal interactive device that determines topics of interest using a conversational agent
US9058580B1 (en) 2012-01-12 2015-06-16 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
US9058515B1 (en) 2012-01-12 2015-06-16 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
US10146795B2 (en) 2012-01-12 2018-12-04 Kofax, Inc. Systems and methods for mobile image capture and processing
US9165187B2 (en) 2012-01-12 2015-10-20 Kofax, Inc. Systems and methods for mobile image capture and processing
US9483794B2 (en) 2012-01-12 2016-11-01 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
JP2016517587A (en) 2013-03-13 2016-06-16 コファックス, インコーポレイテッド Classification of objects in digital images captured using mobile devices
US10783615B2 (en) * 2013-03-13 2020-09-22 Kofax, Inc. Content-based object detection, 3D reconstruction, and data extraction from digital images
US9355312B2 (en) 2013-03-13 2016-05-31 Kofax, Inc. Systems and methods for classifying objects in digital images captured using mobile devices
US10127636B2 (en) 2013-09-27 2018-11-13 Kofax, Inc. Content-based detection and three dimensional geometric reconstruction of objects in image and video data
US9208536B2 (en) 2013-09-27 2015-12-08 Kofax, Inc. Systems and methods for three dimensional geometric reconstruction of captured image data
US20140316841A1 (en) 2013-04-23 2014-10-23 Kofax, Inc. Location-based workflows and services
CN105518704A (en) 2013-05-03 2016-04-20 柯法克斯公司 Systems and methods for detecting and classifying objects in video captured using mobile devices
US9386235B2 (en) 2013-11-15 2016-07-05 Kofax, Inc. Systems and methods for generating composite images of long documents using mobile video data
US9940511B2 (en) * 2014-05-30 2018-04-10 Kofax, Inc. Machine print, hand print, and signature discrimination
CN104023249B (en) 2014-06-12 2015-10-21 腾讯科技(深圳)有限公司 Television channel recognition methods and device
US9760788B2 (en) * 2014-10-30 2017-09-12 Kofax, Inc. Mobile document detection and orientation based on reference object characteristics
IL235565B (en) * 2014-11-06 2019-06-30 Kolton Achiav Location based optical character recognition (ocr)
US10380486B2 (en) * 2015-01-20 2019-08-13 International Business Machines Corporation Classifying entities by behavior
US9858408B2 (en) 2015-02-13 2018-01-02 Yoti Holding Limited Digital identity system
US10853592B2 (en) 2015-02-13 2020-12-01 Yoti Holding Limited Digital identity system
US20160241531A1 (en) * 2015-02-13 2016-08-18 Yoti Ltd Confidence values
US10692085B2 (en) 2015-02-13 2020-06-23 Yoti Holding Limited Secure electronic payment
US10594484B2 (en) 2015-02-13 2020-03-17 Yoti Holding Limited Digital identity system
US9648496B2 (en) 2015-02-13 2017-05-09 Yoti Ltd Authentication of web content
US9785764B2 (en) * 2015-02-13 2017-10-10 Yoti Ltd Digital identity
US9852285B2 (en) 2015-02-13 2017-12-26 Yoti Holding Limited Digital identity
US10242285B2 (en) 2015-07-20 2019-03-26 Kofax, Inc. Iterative recognition-guided thresholding and data extraction
DE102016201389A1 (en) * 2016-01-29 2017-08-03 Robert Bosch Gmbh Method for recognizing objects, in particular of three-dimensional objects
US9779296B1 (en) 2016-04-01 2017-10-03 Kofax, Inc. Content-based detection and three dimensional geometric reconstruction of objects in image and video data
WO2017208368A1 (en) * 2016-05-31 2017-12-07 株式会社Pfu Image processing device, image processing method, and program
CN106407997A (en) * 2016-07-14 2017-02-15 昆山饰爱阿智能科技有限公司 System for identifying object through mobile device and identification method thereof
US10657364B2 (en) * 2016-09-23 2020-05-19 Samsung Electronics Co., Ltd System and method for deep network fusion for fast and robust object detection
CN106678065B (en) * 2016-12-09 2018-12-14 西华大学 A kind of blower fan control system based on the two blade impeller remotely controlled
JP6401806B2 (en) * 2017-02-14 2018-10-10 株式会社Pfu Date identification device, date identification method, and date identification program
US10275687B2 (en) * 2017-02-16 2019-04-30 International Business Machines Corporation Image recognition with filtering of image classification output distribution
US10810773B2 (en) * 2017-06-14 2020-10-20 Dell Products, L.P. Headset display control based upon a user's pupil state
WO2019052997A1 (en) * 2017-09-13 2019-03-21 Koninklijke Philips N.V. Camera and image calibration for subject identification
US11062176B2 (en) 2017-11-30 2021-07-13 Kofax, Inc. Object detection and image cropping using a multi-detector approach
CN109190594A (en) * 2018-09-21 2019-01-11 广东蔚海数问大数据科技有限公司 Optical Character Recognition system and information extracting method
US10999640B2 (en) 2018-11-29 2021-05-04 International Business Machines Corporation Automatic embedding of information associated with video content
US20200250766A1 (en) * 2019-02-06 2020-08-06 Teachers Insurance And Annuity Association Of America Automated customer enrollment using mobile communication devices
US11170271B2 (en) * 2019-06-26 2021-11-09 Dallas Limetree, LLC Method and system for classifying content using scoring for identifying psychological factors employed by consumers to take action
US11636117B2 (en) 2019-06-26 2023-04-25 Dallas Limetree, LLC Content selection using psychological factor vectors
US11341605B1 (en) * 2019-09-30 2022-05-24 Amazon Technologies, Inc. Document rectification via homography recovery using machine learning
CN112749715B (en) * 2019-10-29 2023-10-13 腾讯科技(深圳)有限公司 Picture classification and picture display method, device, equipment and medium
CN115035407A (en) * 2019-11-06 2022-09-09 支付宝(杭州)信息技术有限公司 Method, device and equipment for identifying object in image
JP2021124843A (en) * 2020-02-03 2021-08-30 富士フイルムビジネスイノベーション株式会社 Document processing device and program
US11513669B2 (en) 2020-02-28 2022-11-29 Micron Technology, Inc. User interface for modifying pictures
US11144752B1 (en) * 2020-05-12 2021-10-12 Cyxtera Cybersecurity, Inc. Physical document verification in uncontrolled environments
US11513848B2 (en) * 2020-10-05 2022-11-29 Apple Inc. Critical agent identification to modify bandwidth allocation in a virtual channel
US11763613B2 (en) * 2021-03-08 2023-09-19 Johnson Controls Tyco IP Holdings LLP Automatic creation and management of digital identity profiles for access control
JP2023007599A (en) * 2021-07-02 2023-01-19 株式会社日立ハイテク Image processing apparatus, method, and image processing system
US11748973B2 (en) * 2021-12-17 2023-09-05 Microsoft Technology Licensing, Llc Systems and methods for generating object state distributions usable for image comparison tasks
US11829701B1 (en) * 2022-06-30 2023-11-28 Accenture Global Solutions Limited Heuristics-based processing of electronic document contents

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6512848B2 (en) * 1996-11-18 2003-01-28 Canon Kabushiki Kaisha Page analysis system
US20080212115A1 (en) * 2007-02-13 2008-09-04 Yohsuke Konishi Image processing method, image processing apparatus, image reading apparatus, and image forming apparatus
US20100060915A1 (en) * 2008-09-08 2010-03-11 Masaru Suzuki Apparatus and method for image processing, and program

Family Cites Families (740)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US1660102A (en) 1923-06-04 1928-02-21 William H Smyth High-speed tracklaying tractor
US3069654A (en) 1960-03-25 1962-12-18 Paul V C Hough Method and means for recognizing complex patterns
US3696599A (en) 1971-07-16 1972-10-10 Us Navy Cable fairing system
US4558461A (en) 1983-06-17 1985-12-10 Litton Systems, Inc. Text line bounding system
US4836026A (en) 1984-06-01 1989-06-06 Science Applications International Corporation Ultrasonic imaging system
US4651287A (en) 1984-06-14 1987-03-17 Tsao Sherman H Digital image processing algorithm for output devices with discrete halftone gray scale capability
US4656665A (en) 1985-01-15 1987-04-07 International Business Machines Corporation Thresholding technique for graphics images using histogram analysis
DE3716787A1 (en) 1986-05-19 1987-11-26 Ricoh Kk CHARACTER RECOGNITION METHOD
US4992863A (en) 1987-12-22 1991-02-12 Minolta Camera Kabushiki Kaisha Colored image reading apparatus
US5101448A (en) 1988-08-24 1992-03-31 Hitachi, Ltd. Method and apparatus for processing a document by utilizing an image
JPH02311083A (en) 1989-05-26 1990-12-26 Ricoh Co Ltd Original reader
US5159667A (en) 1989-05-31 1992-10-27 Borrey Roland G Document identification by characteristics matching
JP2940960B2 (en) 1989-10-31 1999-08-25 株式会社日立製作所 Image tilt detection method and correction method, and image information processing apparatus
US5020112A (en) 1989-10-31 1991-05-28 At&T Bell Laboratories Image recognition method using two-dimensional stochastic grammars
US5063604A (en) 1989-11-08 1991-11-05 Transitions Research Corporation Method and means for recognizing patterns represented in logarithmic polar coordinates
IT1237803B (en) 1989-12-21 1993-06-17 Temav Spa PROCESS FOR THE PREPARATION OF FINE NITRIDE ALUMINUM POWDERS
US5344132A (en) 1990-01-16 1994-09-06 Digital Image Systems Image based document processing and information management system and apparatus
JP2997508B2 (en) * 1990-05-31 2000-01-11 株式会社東芝 Pattern recognition device
JP2708263B2 (en) 1990-06-22 1998-02-04 富士写真フイルム株式会社 Image reading device
JPH0488489A (en) 1990-08-01 1992-03-23 Internatl Business Mach Corp <Ibm> Character recognizing device and method using generalized half conversion
JPH04287290A (en) 1990-11-20 1992-10-12 Imra America Inc Hough transformation picture processor
KR930010845B1 (en) 1990-12-31 1993-11-12 주식회사 금성사 Graphic and character auto-separating method of video signal
JPH04270565A (en) 1991-02-20 1992-09-25 Fuji Xerox Co Ltd Picture compression system
US5313527A (en) 1991-06-07 1994-05-17 Paragraph International Method and apparatus for recognizing cursive writing from sequential input information
US5293429A (en) 1991-08-06 1994-03-08 Ricoh Company, Ltd. System and method for automatically classifying heterogeneous business forms
US5680525A (en) 1991-08-08 1997-10-21 Hitachi, Ltd. Three-dimensional graphic system with an editor for generating a textrue mapping image
JPH0560616A (en) * 1991-09-05 1993-03-12 Matsushita Electric Ind Co Ltd Method and apparatus for discriminating color
WO1993007580A1 (en) 1991-10-02 1993-04-15 Fujitsu Limited Method of determining direction in local region of profile segment and method of determining lines and angles
US5321770A (en) 1991-11-19 1994-06-14 Xerox Corporation Method for determining boundaries of words in text
JP3191057B2 (en) 1991-11-22 2001-07-23 株式会社日立製作所 Method and apparatus for processing encoded image data
US5359673A (en) 1991-12-27 1994-10-25 Xerox Corporation Method and apparatus for converting bitmap image documents to editable coded data using a standard notation to record document recognition ambiguities
DE9202508U1 (en) 1992-02-27 1992-04-09 Georg Karl Geka-Brush Gmbh, 8809 Bechhofen, De
US5317646A (en) 1992-03-24 1994-05-31 Xerox Corporation Automated method for creating templates in a forms recognition and processing system
DE4310727C2 (en) 1992-04-06 1996-07-11 Hell Ag Linotype Method and device for analyzing image templates
US5268967A (en) 1992-06-29 1993-12-07 Eastman Kodak Company Method for automatic foreground and background detection in digital radiographic images
US5596655A (en) 1992-08-18 1997-01-21 Hewlett-Packard Company Method for finding and classifying scanned information
US5594814A (en) 1992-10-19 1997-01-14 Fast; Bruce B. OCR image preprocessing method for image enhancement of scanned documents
US5848184A (en) 1993-03-15 1998-12-08 Unisys Corporation Document page analyzer and method
JPH06274680A (en) 1993-03-17 1994-09-30 Hitachi Ltd Method and system recognizing document
US6002489A (en) 1993-04-02 1999-12-14 Fujitsu Limited Product catalog having image evaluation chart
JPH06314339A (en) 1993-04-27 1994-11-08 Honda Motor Co Ltd Image rectilinear component extracting device
US5602964A (en) 1993-05-21 1997-02-11 Autometric, Incorporated Automata networks and methods for obtaining optimized dynamically reconfigurable computational architectures and controls
US7082426B2 (en) 1993-06-18 2006-07-25 Cnet Networks, Inc. Content aggregation method and apparatus for an on-line product catalog
US5353673A (en) 1993-09-07 1994-10-11 Lynch John H Brass-wind musical instrument mouthpiece with radially asymmetric lip restrictor
JP2720924B2 (en) 1993-09-21 1998-03-04 富士ゼロックス株式会社 Image signal encoding device
US6219773B1 (en) 1993-10-18 2001-04-17 Via-Cyrix, Inc. System and method of retiring misaligned write operands from a write buffer
DE69432114T2 (en) 1993-11-24 2003-10-30 Canon Kk System for identifying and processing forms
US5546474A (en) 1993-12-21 1996-08-13 Hewlett-Packard Company Detection of photo regions in digital images
US5671463A (en) 1993-12-28 1997-09-23 Minolta Co., Ltd. Image forming apparatus capable of forming a plurality of images from different originals on a single copy sheet
US5598515A (en) 1994-01-10 1997-01-28 Gen Tech Corp. System and method for reconstructing surface elements of solid objects in a three-dimensional scene from a plurality of two dimensional images of the scene
US5473742A (en) 1994-02-22 1995-12-05 Paragraph International Method and apparatus for representing image data using polynomial approximation method and iterative transformation-reparametrization technique
JP3163215B2 (en) 1994-03-07 2001-05-08 日本電信電話株式会社 Line extraction Hough transform image processing device
US5699244A (en) 1994-03-07 1997-12-16 Monsanto Company Hand-held GUI PDA with GPS/DGPS receiver for collecting agronomic and GPS position data
JP3311135B2 (en) 1994-03-23 2002-08-05 積水化学工業株式会社 Inspection range recognition method
DE69516751T2 (en) 1994-04-15 2000-10-05 Canon Kk Image preprocessing for character recognition system
US5652663A (en) 1994-07-29 1997-07-29 Polaroid Corporation Preview buffer for electronic scanner
US5563723A (en) 1994-08-31 1996-10-08 Eastman Kodak Company Method of calibration of image scanner signal processing circuits
US5757963A (en) 1994-09-30 1998-05-26 Xerox Corporation Method and apparatus for complex column segmentation by major white region pattern matching
JP3494326B2 (en) 1994-10-19 2004-02-09 ミノルタ株式会社 Image forming device
US5696611A (en) 1994-11-08 1997-12-09 Matsushita Graphic Communication Systems, Inc. Color picture processing apparatus for reproducing a color picture having a smoothly changed gradation
EP0723247B1 (en) 1995-01-17 1998-07-29 Eastman Kodak Company Document image assessment system and method
US5822454A (en) 1995-04-10 1998-10-13 Rebus Technology, Inc. System and method for automatic page registration and automatic zone detection during forms processing
US5857029A (en) 1995-06-05 1999-01-05 United Parcel Service Of America, Inc. Method and apparatus for non-contact signature imaging
DK71495A (en) 1995-06-22 1996-12-23 Purup Prepress As Digital image correction method and apparatus
CN1137455C (en) 1995-08-09 2004-02-04 丰田自动车株式会社 Travel plan preparing device
JPH0962826A (en) 1995-08-22 1997-03-07 Fuji Photo Film Co Ltd Picture reader
US5781665A (en) 1995-08-28 1998-07-14 Pitney Bowes Inc. Apparatus and method for cropping an image
CA2184561C (en) 1995-09-12 2001-05-29 Yasuyuki Michimoto Object detecting apparatus in which the position of a planar object is estimated by using hough transform
JPH0991341A (en) 1995-09-21 1997-04-04 Hitachi Ltd Conference holding and schedule management support device
CA2233023A1 (en) 1995-09-25 1997-04-03 Edward A. Taft Optimum access to electronic documents
US6532077B1 (en) 1995-10-04 2003-03-11 Canon Kabushiki Kaisha Image processing system
JPH09116720A (en) 1995-10-20 1997-05-02 Matsushita Graphic Commun Syst Inc Ocr facsimile equipment and communication system therefor
US6009196A (en) 1995-11-28 1999-12-28 Xerox Corporation Method for classifying non-running text in an image
US5987172A (en) 1995-12-06 1999-11-16 Cognex Corp. Edge peak contour tracker
US6009191A (en) 1996-02-15 1999-12-28 Intel Corporation Computer implemented method for compressing 48-bit pixels to 16-bit pixels
US5923763A (en) 1996-03-21 1999-07-13 Walker Asset Management Limited Partnership Method and apparatus for secure document timestamping
US5937084A (en) 1996-05-22 1999-08-10 Ncr Corporation Knowledge-based document analysis system
US8204293B2 (en) 2007-03-09 2012-06-19 Cummins-Allison Corp. Document imaging and processing system
US5956468A (en) 1996-07-12 1999-09-21 Seiko Epson Corporation Document segmentation system
SE510310C2 (en) 1996-07-19 1999-05-10 Ericsson Telefon Ab L M Method and apparatus for motion estimation and segmentation
US6038348A (en) 1996-07-24 2000-03-14 Oak Technology, Inc. Pixel image enhancement system and method
US5696805A (en) 1996-09-17 1997-12-09 Eastman Kodak Company Apparatus and method for identifying specific bone regions in digital X-ray images
JP3685421B2 (en) 1996-09-18 2005-08-17 富士写真フイルム株式会社 Image processing device
US5899978A (en) 1996-10-07 1999-05-04 Title America Titling system and method therefor
JPH10117262A (en) 1996-10-09 1998-05-06 Fuji Photo Film Co Ltd Image processor
JP2940496B2 (en) 1996-11-05 1999-08-25 日本電気株式会社 Pattern matching encoding apparatus and method
US6104840A (en) 1996-11-08 2000-08-15 Ricoh Company, Ltd. Method and system for generating a composite image from partially overlapping adjacent images taken along a plurality of axes
JP3748141B2 (en) 1996-12-26 2006-02-22 株式会社東芝 Image forming apparatus
US6052124A (en) 1997-02-03 2000-04-18 Yissum Research Development Company System and method for directly estimating three-dimensional structure of objects in a scene and camera motion from three two-dimensional views of the scene
US6098065A (en) 1997-02-13 2000-08-01 Nortel Networks Corporation Associative search engine
US6233059B1 (en) 1997-02-19 2001-05-15 Canon Kabushiki Kaisha Scanner device and control method thereof, and image input system
JP2927350B2 (en) 1997-03-27 1999-07-28 株式会社モノリス Multi-resolution filter processing method and image matching method using the method
SE511242C2 (en) 1997-04-01 1999-08-30 Readsoft Ab Method and apparatus for automatic data capture of forms
US6154217A (en) 1997-04-15 2000-11-28 Software Architects, Inc. Gamut restriction of color image
US6005958A (en) 1997-04-23 1999-12-21 Automotive Systems Laboratory, Inc. Occupant type and position detection system
US6067385A (en) 1997-05-07 2000-05-23 Ricoh Company Limited System for aligning document images when scanned in duplex mode
US6433896B1 (en) 1997-06-10 2002-08-13 Minolta Co., Ltd. Image processing apparatus
KR100420819B1 (en) 1997-06-25 2004-04-17 마쯔시다덴기산교 가부시키가이샤 Method for displaying luminous gradation
JP3877385B2 (en) 1997-07-04 2007-02-07 大日本スクリーン製造株式会社 Image processing parameter determination apparatus and method
JP3061019B2 (en) 1997-08-04 2000-07-10 トヨタ自動車株式会社 Internal combustion engine
US5953388A (en) 1997-08-18 1999-09-14 George Mason University Method and apparatus for processing data from a tomographic imaging system
JP3891654B2 (en) 1997-08-20 2007-03-14 株式会社東芝 Image forming apparatus
US6005968A (en) 1997-08-29 1999-12-21 X-Rite, Incorporated Scanner calibration and correction techniques using scaled lightness values
JPH1186021A (en) 1997-09-09 1999-03-30 Fuji Photo Film Co Ltd Image processor
JPH1178112A (en) 1997-09-09 1999-03-23 Konica Corp Image forming system and image forming method
JPH1191169A (en) 1997-09-19 1999-04-06 Fuji Photo Film Co Ltd Image processing apparatus
US6011595A (en) 1997-09-19 2000-01-04 Eastman Kodak Company Method for segmenting a digital image into a foreground region and a key color region
US6480624B1 (en) 1997-09-30 2002-11-12 Minolta Co., Ltd. Color discrimination apparatus and method
US6434620B1 (en) 1998-08-27 2002-08-13 Alacritech, Inc. TCP/IP offload network interface device
JP3608920B2 (en) 1997-10-14 2005-01-12 株式会社ミツトヨ Non-contact image measurement system
US5867264A (en) 1997-10-15 1999-02-02 Pacific Advanced Technology Apparatus for image multispectral sensing employing addressable spatial mask
US6243722B1 (en) 1997-11-24 2001-06-05 International Business Machines Corporation Method and system for a network-based document review tool utilizing comment classification
US6222613B1 (en) 1998-02-10 2001-04-24 Konica Corporation Image processing method and apparatus
DE19809790B4 (en) 1998-03-09 2005-12-22 Daimlerchrysler Ag Method for determining a twist structure in the surface of a precision-machined cylindrical workpiece
JPH11261821A (en) 1998-03-12 1999-09-24 Fuji Photo Film Co Ltd Image processing method
US6426806B2 (en) 1998-03-31 2002-07-30 Canon Kabushiki Kaisha Routing scanned documents with scanned control sheets
JP3457562B2 (en) 1998-04-06 2003-10-20 富士写真フイルム株式会社 Image processing apparatus and method
US6327581B1 (en) 1998-04-06 2001-12-04 Microsoft Corporation Methods and apparatus for building a support vector machine classifier
US7194471B1 (en) 1998-04-10 2007-03-20 Ricoh Company, Ltd. Document classification system and method for classifying a document according to contents of the document
US6393147B2 (en) 1998-04-13 2002-05-21 Intel Corporation Color region based recognition of unidentified objects
US8955743B1 (en) 1998-04-17 2015-02-17 Diebold Self-Service Systems Division Of Diebold, Incorporated Automated banking machine with remote user assistance
US6789069B1 (en) 1998-05-01 2004-09-07 Biowulf Technologies Llc Method for enhancing knowledge discovered from biological data using a learning machine
US7617163B2 (en) 1998-05-01 2009-11-10 Health Discovery Corporation Kernels and kernel methods for spectral data
US7318051B2 (en) 2001-05-18 2008-01-08 Health Discovery Corporation Methods for feature selection in a learning machine
JPH11328408A (en) 1998-05-12 1999-11-30 Advantest Corp Device for processing data and information storage medium
US6748109B1 (en) 1998-06-16 2004-06-08 Fuji Photo Film Co., Ltd Digital laboratory system for processing photographic images
US6161130A (en) 1998-06-23 2000-12-12 Microsoft Corporation Technique which utilizes a probabilistic classifier to detect "junk" e-mail by automatically updating a training and re-training the classifier based on the updated training set
US6192360B1 (en) 1998-06-23 2001-02-20 Microsoft Corporation Methods and apparatus for classifying text and for building a text classifier
US6831755B1 (en) 1998-06-26 2004-12-14 Sony Corporation Printer having image correcting capability
US7253836B1 (en) 1998-06-30 2007-08-07 Nikon Corporation Digital camera, storage medium for image signal processing, carrier wave and electronic camera
US6456738B1 (en) 1998-07-16 2002-09-24 Ricoh Company, Ltd. Method of and system for extracting predetermined elements from input document based upon model which is adaptively modified according to variable amount in the input document
FR2781475B1 (en) 1998-07-23 2000-09-08 Alsthom Cge Alcatel USE OF A POROUS GRAPHITE CRUCIBLE TO PROCESS SILICA PELLETS
US6219158B1 (en) 1998-07-31 2001-04-17 Hewlett-Packard Company Method and apparatus for a dynamically variable scanner, copier or facsimile secondary reflective surface
US6385346B1 (en) 1998-08-04 2002-05-07 Sharp Laboratories Of America, Inc. Method of display and control of adjustable parameters for a digital scanner device
US6571008B1 (en) 1998-08-07 2003-05-27 Washington State University Research Foundation Reverse engineering of polymeric solid models by refractive index matching
US6292168B1 (en) 1998-08-13 2001-09-18 Xerox Corporation Period-based bit conversion method and apparatus for digital image processing
JP2000067065A (en) 1998-08-20 2000-03-03 Ricoh Co Ltd Method for identifying document image and record medium
US6373507B1 (en) 1998-09-14 2002-04-16 Microsoft Corporation Computer-implemented image acquistion system
US7017108B1 (en) 1998-09-15 2006-03-21 Canon Kabushiki Kaisha Method and apparatus for reproducing a linear document having non-linear referential links
US6263122B1 (en) 1998-09-23 2001-07-17 Hewlett Packard Company System and method for manipulating regions in a scanned image
US6223223B1 (en) 1998-09-30 2001-04-24 Hewlett-Packard Company Network scanner contention handling method
US6575367B1 (en) 1998-11-05 2003-06-10 Welch Allyn Data Collection, Inc. Image data binarization methods enabling optical reader to read fine print indicia
US6370277B1 (en) 1998-12-07 2002-04-09 Kofax Image Products, Inc. Virtual rescanning: a method for interactive document image quality enhancement
US6480304B1 (en) 1998-12-09 2002-11-12 Scansoft, Inc. Scanning system and method
US6396599B1 (en) 1998-12-21 2002-05-28 Eastman Kodak Company Method and apparatus for modifying a portion of an image in accordance with colorimetric parameters
US6765685B1 (en) 1999-01-22 2004-07-20 Ricoh Company, Ltd. Printing electronic documents with automatically interleaved separation sheets
US7003719B1 (en) 1999-01-25 2006-02-21 West Publishing Company, Dba West Group System, method, and software for inserting hyperlinks into documents
US6614930B1 (en) 1999-01-28 2003-09-02 Koninklijke Philips Electronics N.V. Video stream classifiable symbol isolation method and system
JP2000227316A (en) 1999-02-04 2000-08-15 Keyence Corp Inspection device
US6646765B1 (en) 1999-02-19 2003-11-11 Hewlett-Packard Development Company, L.P. Selective document scanning method and apparatus
JP2000251012A (en) 1999-03-01 2000-09-14 Hitachi Ltd Method and system for document processing
JP2000298702A (en) 1999-04-15 2000-10-24 Canon Inc Image processing device and method therefor, and computer-readable memory
EP1049030A1 (en) 1999-04-28 2000-11-02 SER Systeme AG Produkte und Anwendungen der Datenverarbeitung Classification method and apparatus
US6590676B1 (en) 1999-05-18 2003-07-08 Electronics For Imaging, Inc. Image reconstruction architecture
EP1054331A3 (en) 1999-05-21 2003-11-12 Hewlett-Packard Company, A Delaware Corporation System and method for storing and retrieving document data
JP4453119B2 (en) 1999-06-08 2010-04-21 ソニー株式会社 Camera calibration apparatus and method, image processing apparatus and method, program providing medium, and camera
JP2000354144A (en) 1999-06-11 2000-12-19 Ricoh Co Ltd Document reader
JP4626007B2 (en) 1999-06-14 2011-02-02 株式会社ニコン Image processing method, machine-readable recording medium storing image processing program, and image processing apparatus
US7051274B1 (en) 1999-06-24 2006-05-23 Microsoft Corporation Scalable computing system for managing annotations
JP4114279B2 (en) 1999-06-25 2008-07-09 コニカミノルタビジネステクノロジーズ株式会社 Image processing device
US6501855B1 (en) 1999-07-20 2002-12-31 Parascript, Llc Manual-search restriction on documents not having an ASCII index
IL131092A (en) 1999-07-25 2006-08-01 Orbotech Ltd Optical inspection system
US6628808B1 (en) 1999-07-28 2003-09-30 Datacard Corporation Apparatus and method for verifying a scanned image
US6628416B1 (en) 1999-10-13 2003-09-30 Umax Data Systems, Inc. Method and user interface for performing a scan operation for a scanner coupled to a computer system
JP3501031B2 (en) 1999-08-24 2004-02-23 日本電気株式会社 Image region determination device, image region determination method, and storage medium storing program thereof
JP3587506B2 (en) 1999-08-30 2004-11-10 富士重工業株式会社 Stereo camera adjustment device
US6633857B1 (en) 1999-09-04 2003-10-14 Microsoft Corporation Relevance vector machine
US6601026B2 (en) 1999-09-17 2003-07-29 Discern Communications, Inc. Information retrieval by natural language querying
US7123292B1 (en) 1999-09-29 2006-10-17 Xerox Corporation Mosaicing images with an offset lens
JP2001103255A (en) 1999-09-30 2001-04-13 Minolta Co Ltd Image processing system
US6839466B2 (en) 1999-10-04 2005-01-04 Xerox Corporation Detecting overlapping images in an automatic image segmentation device with the presence of severe bleeding
US7430066B2 (en) 1999-10-13 2008-09-30 Transpacific Ip, Ltd. Method and user interface for performing an automatic scan operation for a scanner coupled to a computer system
JP4377494B2 (en) 1999-10-22 2009-12-02 東芝テック株式会社 Information input device
JP4094789B2 (en) 1999-11-26 2008-06-04 富士通株式会社 Image processing apparatus and image processing method
US7735721B1 (en) 1999-11-30 2010-06-15 Diebold Self-Service Systems Division Of Diebold, Incorporated Method of evaluating checks deposited into a cash dispensing automated banking machine
US6751349B2 (en) 1999-11-30 2004-06-15 Fuji Photo Film Co., Ltd. Image processing system
US7337389B1 (en) 1999-12-07 2008-02-26 Microsoft Corporation System and method for annotating an electronic document independently of its content
US6665425B1 (en) 1999-12-16 2003-12-16 Xerox Corporation Systems and methods for automated image quality based diagnostics and remediation of document processing systems
US20010027420A1 (en) 1999-12-21 2001-10-04 Miroslav Boublik Method and apparatus for capturing transaction data
US6724916B1 (en) 2000-01-05 2004-04-20 The United States Of America As Represented By The Secretary Of The Navy Composite hough transform for multitarget multisensor tracking
US6778684B1 (en) 2000-01-20 2004-08-17 Xerox Corporation Systems and methods for checking image/document quality
JP2001218047A (en) 2000-02-04 2001-08-10 Fuji Photo Film Co Ltd Picture processor
JP2001297303A (en) * 2000-02-09 2001-10-26 Ricoh Co Ltd Method and device for recognizing document image and computer readable recording medium
JP2001309128A (en) 2000-02-24 2001-11-02 Xerox Corp Image capture control system
US7149347B1 (en) 2000-03-02 2006-12-12 Science Applications International Corporation Machine learning of document templates for data extraction
US6859909B1 (en) 2000-03-07 2005-02-22 Microsoft Corporation System and method for annotating web-based documents
US6643413B1 (en) 2000-03-27 2003-11-04 Microsoft Corporation Manifold mosaic hopping for image-based rendering
US6757081B1 (en) 2000-04-07 2004-06-29 Hewlett-Packard Development Company, L.P. Methods and apparatus for analyzing an image and for controlling a scanner
SE0001312D0 (en) 2000-04-10 2000-04-10 Abb Ab Industrial robot
JP4369678B2 (en) 2000-04-27 2009-11-25 株式会社ブロードリーフ Service provision system for vehicles
US6337925B1 (en) 2000-05-08 2002-01-08 Adobe Systems Incorporated Method for determining a border in a complex scene with applications to image masking
US20020030831A1 (en) 2000-05-10 2002-03-14 Fuji Photo Film Co., Ltd. Image correction method
US6469801B1 (en) 2000-05-17 2002-10-22 Heidelberger Druckmaschinen Ag Scanner with prepress scaling mode
US6763515B1 (en) 2000-06-05 2004-07-13 National Instruments Corporation System and method for automatically generating a graphical program to perform an image processing algorithm
US6701009B1 (en) 2000-06-06 2004-03-02 Sharp Laboratories Of America, Inc. Method of separated color foreground and background pixel improvement
US20030120653A1 (en) 2000-07-05 2003-06-26 Sean Brady Trainable internet search engine and methods of using
JP4023075B2 (en) 2000-07-10 2007-12-19 富士ゼロックス株式会社 Image acquisition device
US6463430B1 (en) 2000-07-10 2002-10-08 Mohomine, Inc. Devices and methods for generating and managing a database
JP4171574B2 (en) 2000-07-21 2008-10-22 富士フイルム株式会社 Image processing condition determining apparatus and image processing condition determining program storage medium
KR20040041082A (en) 2000-07-24 2004-05-13 비브콤 인코포레이티드 System and method for indexing, searching, identifying, and editing portions of electronic multimedia files
US6675159B1 (en) 2000-07-27 2004-01-06 Science Applic Int Corp Concept-based search and retrieval system
WO2002010884A2 (en) 2000-07-28 2002-02-07 Raf Technology, Inc. Orthogonal technology for multi-line character recognition
US6850653B2 (en) 2000-08-08 2005-02-01 Canon Kabushiki Kaisha Image reading system, image reading setting determination apparatus, reading setting determination method, recording medium, and program
US6901170B1 (en) 2000-09-05 2005-05-31 Fuji Xerox Co., Ltd. Image processing device and recording medium
EP1325421A4 (en) 2000-09-07 2007-07-04 Us Postal Service Mailing online operation flow
JP3720740B2 (en) 2000-09-12 2005-11-30 キヤノン株式会社 Distributed printing system, distributed printing control method, storage medium, and program
US7002700B1 (en) 2000-09-14 2006-02-21 Electronics For Imaging, Inc. Method and system for merging scan files into a color workflow
US7738706B2 (en) 2000-09-22 2010-06-15 Sri International Method and apparatus for recognition of symbols in images of three-dimensional scenes
DE10047219A1 (en) 2000-09-23 2002-06-06 Wuerth Adolf Gmbh & Co Kg cleat
JP4472847B2 (en) 2000-09-28 2010-06-02 キヤノン電子株式会社 Image processing apparatus and control method thereof, image input apparatus and control method thereof, and storage medium
JP2002109242A (en) 2000-09-29 2002-04-12 Glory Ltd Method and device for document processing and storage medium stored with document processing program
US7428495B2 (en) 2000-10-02 2008-09-23 International Projects Consultancy Services, Inc. Object based workflow system and method
US6621595B1 (en) 2000-11-01 2003-09-16 Hewlett-Packard Development Company, L.P. System and method for enhancing scanned document images for color printing
US20050060162A1 (en) 2000-11-10 2005-03-17 Farhad Mohit Systems and methods for automatic identification and hyperlinking of words or other data items and for information retrieval using hyperlinked words or data items
US7043080B1 (en) 2000-11-21 2006-05-09 Sharp Laboratories Of America, Inc. Methods and systems for text detection in mixed-context documents using local geometric signatures
US6788308B2 (en) 2000-11-29 2004-09-07 Tvgateway, Llc System and method for improving the readability of text
EP1211594A3 (en) 2000-11-30 2006-05-24 Canon Kabushiki Kaisha Apparatus and method for controlling user interface
US6921220B2 (en) 2000-12-19 2005-07-26 Canon Kabushiki Kaisha Image processing system, data processing apparatus, data processing method, computer program and storage medium
US6826311B2 (en) 2001-01-04 2004-11-30 Microsoft Corporation Hough transform supporting methods and arrangements
US7266768B2 (en) 2001-01-09 2007-09-04 Sharp Laboratories Of America, Inc. Systems and methods for manipulating electronic information using a three-dimensional iconic representation
US6522791B2 (en) 2001-01-23 2003-02-18 Xerox Corporation Dynamic user interface with scanned image improvement assist
US6909805B2 (en) 2001-01-31 2005-06-21 Matsushita Electric Industrial Co., Ltd. Detecting and utilizing add-on information from a scanned document image
US6882983B2 (en) 2001-02-05 2005-04-19 Notiva Corporation Method and system for processing transactions
US6950555B2 (en) 2001-02-16 2005-09-27 Parascript Llc Holistic-analytical recognition of handwritten text
JP2002247371A (en) 2001-02-21 2002-08-30 Ricoh Co Ltd Image processor and recording medium having recorded image processing program
EP1384155A4 (en) 2001-03-01 2007-02-28 Health Discovery Corp Spectral kernels for learning machines
US7864369B2 (en) 2001-03-19 2011-01-04 Dmetrix, Inc. Large-area imaging by concatenation with array microscope
US7145699B2 (en) 2001-03-30 2006-12-05 Sharp Laboratories Of America, Inc. System and method for digital document alignment
JP2002300386A (en) 2001-03-30 2002-10-11 Fuji Photo Film Co Ltd Image processing method
US20020165717A1 (en) 2001-04-06 2002-11-07 Solmer Robert P. Efficient method for information extraction
US6658147B2 (en) 2001-04-16 2003-12-02 Parascript Llc Reshaping freehand drawn lines and shapes in an electronic document
JP3824209B2 (en) 2001-04-18 2006-09-20 三菱電機株式会社 Automatic document divider
US20040049401A1 (en) 2002-02-19 2004-03-11 Carr J. Scott Security methods employing drivers licenses and other documents
US7023447B2 (en) 2001-05-02 2006-04-04 Eastman Kodak Company Block sampling based method and apparatus for texture synthesis
US7006707B2 (en) 2001-05-03 2006-02-28 Adobe Systems Incorporated Projecting images onto a surface
US6944357B2 (en) 2001-05-24 2005-09-13 Microsoft Corporation System and process for automatically determining optimal image compression methods for reducing file size
WO2002099720A1 (en) 2001-06-01 2002-12-12 American Express Travel Related Services Company, Inc. System and method for global automated address verification
US20030030638A1 (en) 2001-06-07 2003-02-13 Karl Astrom Method and apparatus for extracting information from a target area within a two-dimensional graphical object in an image
FR2825817B1 (en) 2001-06-07 2003-09-19 Commissariat Energie Atomique Image processing method for the automatic extraction of semantic elements
US7403313B2 (en) 2001-09-27 2008-07-22 Transpacific Ip, Ltd. Automatic scanning parameter setting device and method
US6584339B2 (en) 2001-06-27 2003-06-24 Vanderbilt University Method and apparatus for collecting and processing physical space data for use while performing image-guided surgery
US7154622B2 (en) 2001-06-27 2006-12-26 Sharp Laboratories Of America, Inc. Method of routing and processing document images sent using a digital scanner and transceiver
US7298903B2 (en) 2001-06-28 2007-11-20 Microsoft Corporation Method and system for separating text and drawings in digital ink
US7013047B2 (en) 2001-06-28 2006-03-14 National Instruments Corporation System and method for performing edge detection in an image
CA2457639C (en) 2001-08-13 2014-07-22 Accenture Global Services Gmbh A computer system for managing accounting data
US7506062B2 (en) 2001-08-30 2009-03-17 Xerox Corporation Scanner-initiated network-based image input scanning
US20030044012A1 (en) 2001-08-31 2003-03-06 Sharp Laboratories Of America, Inc. System and method for using a profile to encrypt documents in a digital scanner
JP5002099B2 (en) 2001-08-31 2012-08-15 株式会社東芝 Magnetic resonance imaging system
JP4564693B2 (en) 2001-09-14 2010-10-20 キヤノン株式会社 Document processing apparatus and method
US7515313B2 (en) 2001-09-20 2009-04-07 Stone Cheng Method and system for scanning with one-scan-and-done feature
US6732046B1 (en) 2001-10-03 2004-05-04 Navigation Technologies Corp. Application of the hough transform to modeling the horizontal component of road geometry and computing heading and curvature
US7430002B2 (en) 2001-10-03 2008-09-30 Micron Technology, Inc. Digital imaging system and method for adjusting image-capturing parameters using image comparisons
US6667774B2 (en) 2001-11-02 2003-12-23 Imatte, Inc. Method and apparatus for the automatic generation of subject to background transition area boundary lines and subject shadow retention
US6922487B2 (en) 2001-11-02 2005-07-26 Xerox Corporation Method and apparatus for capturing text images
US6898316B2 (en) 2001-11-09 2005-05-24 Arcsoft, Inc. Multiple image area detection in a digital image
US6944616B2 (en) 2001-11-28 2005-09-13 Pavilion Technologies, Inc. System and method for historical database training of support vector machines
EP1317133A1 (en) 2001-12-03 2003-06-04 Kofax Image Products, Inc. Virtual rescanning: a method for interactive document image quality enhancement
US7937281B2 (en) 2001-12-07 2011-05-03 Accenture Global Services Limited Accelerated process improvement framework
US7286177B2 (en) 2001-12-19 2007-10-23 Nokia Corporation Digital camera
US7053953B2 (en) 2001-12-21 2006-05-30 Eastman Kodak Company Method and camera system for blurring portions of a verification image to show out of focus areas in a captured archival image
JP2003196357A (en) 2001-12-27 2003-07-11 Hitachi Software Eng Co Ltd Method and system of document filing
US7346215B2 (en) 2001-12-31 2008-03-18 Transpacific Ip, Ltd. Apparatus and method for capturing a document
US7054036B2 (en) 2002-01-25 2006-05-30 Kabushiki Kaisha Toshiba Image processing method and image forming apparatus
US20030142328A1 (en) 2002-01-31 2003-07-31 Mcdaniel Stanley Eugene Evaluation of image processing operations
JP3891408B2 (en) 2002-02-08 2007-03-14 株式会社リコー Image correction apparatus, program, storage medium, and image correction method
US7362354B2 (en) 2002-02-12 2008-04-22 Hewlett-Packard Development Company, L.P. Method and system for assessing the photo quality of a captured image in a digital still camera
US6985631B2 (en) 2002-02-20 2006-01-10 Hewlett-Packard Development Company, L.P. Systems and methods for automatically detecting a corner in a digitally captured image
US7020320B2 (en) 2002-03-06 2006-03-28 Parascript, Llc Extracting text written on a check
US7107285B2 (en) 2002-03-16 2006-09-12 Questerra Corporation Method, system, and program for an improved enterprise spatial system
WO2003085624A1 (en) 2002-04-05 2003-10-16 Unbounded Access Ltd. Networked accessibility enhancer system
JP4185699B2 (en) 2002-04-12 2008-11-26 日立オムロンターミナルソリューションズ株式会社 Form reading system, form reading method and program therefor
US20030210428A1 (en) 2002-05-07 2003-11-13 Alex Bevlin Non-OCR method for capture of computer filled-in forms
AU2003238886A1 (en) 2002-05-23 2003-12-12 Phochron, Inc. System and method for digital content processing and distribution
US7636455B2 (en) 2002-06-04 2009-12-22 Raytheon Company Digital image edge detection and road network tracking method and system
US7409092B2 (en) 2002-06-20 2008-08-05 Hrl Laboratories, Llc Method and apparatus for the surveillance of objects in images
US7197158B2 (en) 2002-06-28 2007-03-27 Microsoft Corporation Generation of metadata for acquired images
US20040143547A1 (en) 2002-07-02 2004-07-22 Dean Mersky Automated accounts payable using image typing and type specific processing
US6999625B1 (en) * 2002-07-12 2006-02-14 The United States Of America As Represented By The Secretary Of The Navy Feature-based detection and context discriminate classification for digital images
US7209599B2 (en) 2002-07-12 2007-04-24 Hewlett-Packard Development Company, L.P. System and method for scanned image bleedthrough processing
JP2004054640A (en) 2002-07-19 2004-02-19 Sharp Corp Method for distributing image information, image information distribution system, center device, terminal device, scanner device, computer program, and recording medium
US7031525B2 (en) 2002-07-30 2006-04-18 Mitsubishi Electric Research Laboratories, Inc. Edge detection based on background change
US7043084B2 (en) 2002-07-30 2006-05-09 Mitsubishi Electric Research Laboratories, Inc. Wheelchair detection using stereo vision
US7365881B2 (en) 2002-08-19 2008-04-29 Eastman Kodak Company Halftone dot-growth technique based on morphological filtering
US7123387B2 (en) 2002-08-23 2006-10-17 Chung-Wei Cheng Image scanning method
US20040083119A1 (en) 2002-09-04 2004-04-29 Schunder Lawrence V. System and method for implementing a vendor contract management system
JP3741090B2 (en) 2002-09-09 2006-02-01 コニカミノルタビジネステクノロジーズ株式会社 Image processing device
US7260561B1 (en) 2003-11-10 2007-08-21 Zxibix, Inc. System and method to facilitate user thinking about an arbitrary problem with output and interface to external components and resources
US20040090458A1 (en) 2002-11-12 2004-05-13 Yu John Chung Wah Method and apparatus for previewing GUI design and providing screen-to-source association
US20050057780A1 (en) 2002-11-19 2005-03-17 Canon Denshi Kabushiki Kaisha Network scanning system
DE10253903A1 (en) 2002-11-19 2004-06-17 Océ Printing Systems GmbH Method, arrangement and computer software for printing a release sheet using an electrophotographic printer or copier
FR2847344B1 (en) 2002-11-20 2005-02-25 Framatome Anp Probe for controlling an internal wall of a conduit
KR100446538B1 (en) 2002-11-21 2004-09-01 삼성전자주식회사 On-line digital picture processing system for digital camera rental system
US7386527B2 (en) 2002-12-06 2008-06-10 Kofax, Inc. Effective multi-class support vector machine classification
BR0317326A (en) 2002-12-16 2005-11-16 King Pharmaceuticals Inc Method of reducing cardiovascular disease in an at-risk individual
JP2004198211A (en) 2002-12-18 2004-07-15 Aisin Seiki Co Ltd Apparatus for monitoring vicinity of mobile object
US7181082B2 (en) 2002-12-18 2007-02-20 Sharp Laboratories Of America, Inc. Blur detection system
WO2004061702A1 (en) 2002-12-26 2004-07-22 The Trustees Of Columbia University In The City Of New York Ordered data compression system and methods
US20070128899A1 (en) 2003-01-12 2007-06-07 Yaron Mayer System and method for improving the efficiency, comfort, and/or reliability in Operating Systems, such as for example Windows
US7174043B2 (en) 2003-02-25 2007-02-06 Evernote Corp. On-line handwriting recognizer
US20040169889A1 (en) 2003-02-27 2004-09-02 Toshiba Tec Kabushiki Kaisha Image processing apparatus and controller apparatus using thereof
US20040169873A1 (en) 2003-02-28 2004-09-02 Xerox Corporation Automatic determination of custom parameters based on scanned image data
US7765155B2 (en) 2003-03-13 2010-07-27 International Business Machines Corporation Invoice processing approval and storage system method and apparatus
US6729733B1 (en) 2003-03-21 2004-05-04 Mitsubishi Electric Research Laboratories, Inc. Method for determining a largest inscribed rectangular image within a union of projected quadrilateral images
US7639392B2 (en) 2003-03-28 2009-12-29 Infoprint Solutions Company, Llc Methods, systems, and media to enhance image processing in a color reprographic system
US7665061B2 (en) 2003-04-08 2010-02-16 Microsoft Corporation Code builders
GB0308509D0 (en) 2003-04-12 2003-05-21 Antonis Jan Inspection apparatus and method
US7251777B1 (en) 2003-04-16 2007-07-31 Hypervision, Ltd. Method and system for automated structuring of textual documents
US7406183B2 (en) 2003-04-28 2008-07-29 International Business Machines Corporation System and method of sorting document images based on image quality
US7327374B2 (en) 2003-04-30 2008-02-05 Byong Mok Oh Structure-preserving clone brush
US20040223640A1 (en) 2003-05-09 2004-11-11 Bovyrin Alexander V. Stereo matching using segmentation of image columns
JP4864295B2 (en) 2003-06-02 2012-02-01 富士フイルム株式会社 Image display system, image display apparatus, and program
JP4261988B2 (en) 2003-06-03 2009-05-13 キヤノン株式会社 Image processing apparatus and method
US20040245334A1 (en) * 2003-06-06 2004-12-09 Sikorski Steven Maurice Inverted terminal presentation scanner and holder
US8484066B2 (en) 2003-06-09 2013-07-09 Greenline Systems, Inc. System and method for risk detection reporting and infrastructure
US7389516B2 (en) 2003-06-19 2008-06-17 Microsoft Corporation System and method for facilitating interaction between a computer and a network scanner
US7616233B2 (en) 2003-06-26 2009-11-10 Fotonation Vision Limited Perfecting of digital image capture parameters within acquisition devices using face detection
US20040263639A1 (en) 2003-06-26 2004-12-30 Vladimir Sadovsky System and method for intelligent image acquisition
JP4289040B2 (en) 2003-06-26 2009-07-01 富士ゼロックス株式会社 Image processing apparatus and method
JP2005018678A (en) 2003-06-30 2005-01-20 Casio Comput Co Ltd Form data input processing device, form data input processing method, and program
US7362892B2 (en) 2003-07-02 2008-04-22 Lockheed Martin Corporation Self-optimizing classifier
US20060242180A1 (en) 2003-07-23 2006-10-26 Graf James A Extracting data from semi-structured text documents
US20050030602A1 (en) 2003-08-06 2005-02-10 Gregson Daniel P. Scan templates
US20050050060A1 (en) 2003-08-27 2005-03-03 Gerard Damm Data structure for range-specified algorithms
JP2005071262A (en) 2003-08-27 2005-03-17 Casio Comput Co Ltd Slip processing system
US8937731B2 (en) 2003-09-01 2015-01-20 Konica Minolta Business Technologies, Inc. Image processing apparatus for receiving a request relating to image processing from an external source and executing the received request
JP3951990B2 (en) 2003-09-05 2007-08-01 ブラザー工業株式会社 Wireless station, program, and operation control method
JP4725057B2 (en) 2003-09-09 2011-07-13 セイコーエプソン株式会社 Generation of image quality adjustment information and image quality adjustment using image quality adjustment information
JP2005085173A (en) 2003-09-10 2005-03-31 Toshiba Corp Data management system
US7797381B2 (en) 2003-09-19 2010-09-14 International Business Machines Corporation Methods and apparatus for information hyperchain management for on-demand business collaboration
US7844109B2 (en) 2003-09-24 2010-11-30 Canon Kabushiki Kaisha Image processing method and apparatus
US20050080844A1 (en) 2003-10-10 2005-04-14 Sridhar Dathathraya System and method for managing scan destination profiles
JP4139760B2 (en) 2003-10-10 2008-08-27 富士フイルム株式会社 Image processing method and apparatus, and image processing program
US20070011334A1 (en) 2003-11-03 2007-01-11 Steven Higgins Methods and apparatuses to provide composite applications
EP1530357A1 (en) 2003-11-06 2005-05-11 Ricoh Company, Ltd. Method, computer program, and apparatus for detecting specific information included in image data of original image with accuracy, and computer readable storing medium storing the program
US20050193325A1 (en) 2003-11-12 2005-09-01 Epstein David L. Mobile content engine with enhanced features
GB0326374D0 (en) 2003-11-12 2003-12-17 British Telecomm Object detection in images
US7553095B2 (en) 2003-11-27 2009-06-30 Konica Minolta Business Technologies, Inc. Print data transmitting apparatus, image forming system, printing condition setting method and printer driver program
JP4347677B2 (en) 2003-12-08 2009-10-21 富士フイルム株式会社 Form OCR program, method and apparatus
US8693043B2 (en) 2003-12-19 2014-04-08 Kofax, Inc. Automatic document separation
JP2005208861A (en) 2004-01-21 2005-08-04 Oki Electric Ind Co Ltd Store visiting reception system and store visiting reception method therefor
US7184929B2 (en) 2004-01-28 2007-02-27 Microsoft Corporation Exponential priors for maximum entropy models
US9229540B2 (en) 2004-01-30 2016-01-05 Electronic Scripting Products, Inc. Deriving input from six degrees of freedom interfaces
US7298897B1 (en) 2004-02-11 2007-11-20 United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Optimal binarization of gray-scaled digital images via fuzzy reasoning
US7379587B2 (en) 2004-02-12 2008-05-27 Xerox Corporation Systems and methods for identifying regions within an image having similar continuity values
US7812860B2 (en) 2004-04-01 2010-10-12 Exbiblio B.V. Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device
US7636479B2 (en) 2004-02-24 2009-12-22 Trw Automotive U.S. Llc Method and apparatus for controlling classification and classification switching in a vision system
US20050216564A1 (en) 2004-03-11 2005-09-29 Myers Gregory K Method and apparatus for analysis of electronic communications containing imagery
JP2005267457A (en) 2004-03-19 2005-09-29 Casio Comput Co Ltd Image processing device, imaging apparatus, image processing method and program
FR2868185B1 (en) 2004-03-23 2006-06-30 Realeyes3D Sa Method for extracting raw data from image resulting from shooting
US7379562B2 (en) 2004-03-31 2008-05-27 Microsoft Corporation Determining connectedness and offset of 3D objects relative to an interactive surface
US7990556B2 (en) 2004-12-03 2011-08-02 Google Inc. Association of a portable scanner with input/output and storage devices
US9008447B2 (en) 2004-04-01 2015-04-14 Google Inc. Method and system for character recognition
JP5238249B2 (en) 2004-04-01 2013-07-17 グーグル インコーポレイテッド Acquiring data from rendered documents using handheld devices
US7505056B2 (en) 2004-04-02 2009-03-17 K-Nfb Reading Technology, Inc. Mode processing in portable reading machine
TWI240067B (en) 2004-04-06 2005-09-21 Sunplus Technology Co Ltd Rapid color recognition method
US7366705B2 (en) 2004-04-15 2008-04-29 Microsoft Corporation Clustering based text classification
CN101493830A (en) 2004-04-29 2009-07-29 Nec软件有限公司 Structured natural language query and knowledge system
US20050246262A1 (en) 2004-04-29 2005-11-03 Aggarwal Charu C Enabling interoperability between participants in a network
JP3800227B2 (en) 2004-05-17 2006-07-26 コニカミノルタビジネステクノロジーズ株式会社 Image forming apparatus, information processing method and information processing program used therefor
US7430059B2 (en) 2004-05-24 2008-09-30 Xerox Corporation Systems, methods and graphical user interfaces for interactively previewing a scanned document
US7496218B2 (en) 2004-05-26 2009-02-24 Ramsay Thomas E System and method for identifying objects of interest in image data
WO2005116866A1 (en) 2004-05-28 2005-12-08 Agency For Science, Technology And Research Method and system for word sequence processing
US7272261B2 (en) 2004-06-04 2007-09-18 Xerox Corporation Method and system for classifying scanned-media
US20050273453A1 (en) 2004-06-05 2005-12-08 National Background Data, Llc Systems, apparatus and methods for performing criminal background investigations
US7392426B2 (en) 2004-06-15 2008-06-24 Honeywell International Inc. Redundant processing architecture for single fault tolerance
EP1607716A3 (en) 2004-06-18 2012-06-20 Topcon Corporation Model forming apparatus and method, and photographing apparatus and method
US20060219773A1 (en) 2004-06-18 2006-10-05 Richardson Joseph L System and method for correcting data in financial documents
JP2006031379A (en) 2004-07-15 2006-02-02 Sony Corp Information presentation apparatus and information presentation method
US7339585B2 (en) 2004-07-19 2008-03-04 Pie Medical Imaging B.V. Method and apparatus for visualization of biological structures with use of 3D position information from segmentation results
US20060023271A1 (en) 2004-07-30 2006-02-02 Boay Yoke P Scanner with color profile matching mechanism
US7403008B2 (en) 2004-08-02 2008-07-22 Cornell Research Foundation, Inc. Electron spin resonance microscope for imaging with micron resolution
JP2006054519A (en) 2004-08-09 2006-02-23 Ricoh Co Ltd Imaging apparatus
KR20060014765A (en) 2004-08-12 2006-02-16 주식회사 현대오토넷 Emergency safety service system and method using telematics system
US7515772B2 (en) 2004-08-21 2009-04-07 Xerox Corp Document registration and skew detection system
US7299407B2 (en) 2004-08-24 2007-11-20 International Business Machines Corporation Marking and annotating electronic documents
EP1810182A4 (en) 2004-08-31 2010-07-07 Kumar Gopalakrishnan Method and system for providing information services relevant to visual imagery
US7643665B2 (en) 2004-08-31 2010-01-05 Semiconductor Insights Inc. Method of design analysis of existing integrated circuits
EP1789920A1 (en) 2004-09-02 2007-05-30 Koninklijke Philips Electronics N.V. Feature weighted medical object contouring using distance coordinates
US20070118794A1 (en) 2004-09-08 2007-05-24 Josef Hollander Shared annotation system and method
US7739127B1 (en) 2004-09-23 2010-06-15 Stephen Don Hall Automated system for filing prescription drug claims
US8332401B2 (en) 2004-10-01 2012-12-11 Ricoh Co., Ltd Method and system for position-based image matching in a mixed media environment
US8005831B2 (en) 2005-08-23 2011-08-23 Ricoh Co., Ltd. System and methods for creation and use of a mixed media environment with geographic location information
US7639387B2 (en) 2005-08-23 2009-12-29 Ricoh Co., Ltd. Authoring tools using a mixed media environment
US7991778B2 (en) 2005-08-23 2011-08-02 Ricoh Co., Ltd. Triggering actions with captured input in a mixed media environment
US9530050B1 (en) 2007-07-11 2016-12-27 Ricoh Co., Ltd. Document annotation sharing
JP4477468B2 (en) 2004-10-15 2010-06-09 富士通株式会社 Device part image retrieval device for assembly drawings
US20060089907A1 (en) 2004-10-22 2006-04-27 Klaus Kohlmaier Invoice verification process
US7464066B2 (en) 2004-10-26 2008-12-09 Applied Intelligence Solutions, Llc Multi-dimensional, expert behavior-emulation system
JP2006126941A (en) 2004-10-26 2006-05-18 Canon Inc Image processor, image processing method, image processing control program, and storage medium
US7492943B2 (en) 2004-10-29 2009-02-17 George Mason Intellectual Properties, Inc. Open set recognition using transduction
US20060095372A1 (en) 2004-11-01 2006-05-04 Sap Aktiengesellschaft System and method for management and verification of invoices
US20060095374A1 (en) 2004-11-01 2006-05-04 Jp Morgan Chase System and method for supply chain financing
US7475335B2 (en) 2004-11-03 2009-01-06 International Business Machines Corporation Method for automatically and dynamically composing document management applications
US7782384B2 (en) 2004-11-05 2010-08-24 Kelly Douglas J Digital camera having system for digital image composition and related method
KR100653886B1 (en) 2004-11-05 2006-12-05 주식회사 칼라짚미디어 Mixed-code and mixed-code encoding method and apparatus
US20060112340A1 (en) 2004-11-22 2006-05-25 Julia Mohr Portal page conversion and annotation
JP4345651B2 (en) 2004-11-29 2009-10-14 セイコーエプソン株式会社 Image information evaluation method, image information evaluation program, and image information evaluation apparatus
US7428331B2 (en) 2004-11-30 2008-09-23 Seiko Epson Corporation Page background estimation using color, texture and edge features
GB0426523D0 (en) 2004-12-02 2005-01-05 British Telecomm Video processing
US7742641B2 (en) 2004-12-06 2010-06-22 Honda Motor Co., Ltd. Confidence weighted classifier combination for multi-modal identification
JP2006190259A (en) 2004-12-06 2006-07-20 Canon Inc Shake determining device, image processor, control method and program of the same
US7168614B2 (en) 2004-12-10 2007-01-30 Mitek Systems, Inc. System and method for check fraud detection using signature validation
US7201323B2 (en) 2004-12-10 2007-04-10 Mitek Systems, Inc. System and method for check fraud detection using signature validation
US7249717B2 (en) 2004-12-10 2007-07-31 Mitek Systems, Inc. System and method for check fraud detection using signature validation
JP4460528B2 (en) 2004-12-14 2010-05-12 本田技研工業株式会社 Identification object identification device and robot having the same
KR100670003B1 (en) 2004-12-28 2007-01-19 삼성전자주식회사 The apparatus for detecting the homogeneous region in the image using the adaptive threshold value
JP4602074B2 (en) 2004-12-28 2010-12-22 シャープ株式会社 Photovoltaic generator installation support system and program
EP1834455B1 (en) 2004-12-28 2013-06-26 ST-Ericsson SA Method and apparatus for peer-to-peer instant messaging
KR100729280B1 (en) 2005-01-08 2007-06-15 아이리텍 잉크 Iris Identification System and Method using Mobile Device with Stereo Camera
EP1842140A4 (en) 2005-01-19 2012-01-04 Truecontext Corp Policy-driven mobile forms applications
WO2006136958A2 (en) 2005-01-25 2006-12-28 Dspv, Ltd. System and method of improving the legibility and applicability of document pictures using form based image enhancement
JP2006209588A (en) 2005-01-31 2006-08-10 Casio Electronics Co Ltd Evidence document issue device and database creation device for evidence document information
US20060195491A1 (en) 2005-02-11 2006-08-31 Lexmark International, Inc. System and method of importing documents into a document management system
GB0503970D0 (en) 2005-02-25 2005-04-06 Firstondemand Ltd Method and apparatus for authentication of invoices
US7487438B1 (en) 2005-03-08 2009-02-03 Pegasus Imaging Corporation Method and apparatus for recognizing a digitized form, extracting information from a filled-in form, and generating a corrected filled-in form
US7822880B2 (en) 2005-03-10 2010-10-26 Konica Minolta Systems Laboratory, Inc. User interfaces for peripheral configuration
US20070002348A1 (en) 2005-03-15 2007-01-04 Kabushiki Kaisha Toshiba Method and apparatus for producing images by using finely optimized image processing parameters
US7545529B2 (en) 2005-03-24 2009-06-09 Kofax, Inc. Systems and methods of accessing random access cache for rescanning
US9769354B2 (en) 2005-03-24 2017-09-19 Kofax, Inc. Systems and methods of processing scanned data
US9137417B2 (en) 2005-03-24 2015-09-15 Kofax, Inc. Systems and methods for processing video data
US8749839B2 (en) 2005-03-24 2014-06-10 Kofax, Inc. Systems and methods of processing scanned data
US7570816B2 (en) 2005-03-31 2009-08-04 Microsoft Corporation Systems and methods for detecting text
US7412425B2 (en) 2005-04-14 2008-08-12 Honda Motor Co., Ltd. Partially supervised machine learning of data classification based on local-neighborhood Laplacian Eigenmaps
US7747958B2 (en) 2005-04-18 2010-06-29 Research In Motion Limited System and method for enabling assisted visual development of workflow for application tasks
JP2006301835A (en) 2005-04-19 2006-11-02 Fuji Xerox Co Ltd Transaction document management method and system
US7941744B2 (en) 2005-04-25 2011-05-10 Adp, Inc. System and method for electronic document generation and delivery
AU2005201758B2 (en) 2005-04-27 2008-12-18 Canon Kabushiki Kaisha Method of learning associations between documents and data sets
US7760956B2 (en) 2005-05-12 2010-07-20 Hewlett-Packard Development Company, L.P. System and method for producing a page using frames of a video stream
US20060256392A1 (en) 2005-05-13 2006-11-16 Microsoft Corporation Scanning systems and methods
US7636883B2 (en) 2005-05-18 2009-12-22 International Business Machines Corporation User form based automated and guided data collection
JP4561474B2 (en) 2005-05-24 2010-10-13 株式会社日立製作所 Electronic document storage system
US20100049035A1 (en) 2005-05-27 2010-02-25 Qingmao Hu Brain image segmentation from ct data
WO2006131967A1 (en) 2005-06-08 2006-12-14 Fujitsu Limited Image processor
US20060282762A1 (en) 2005-06-10 2006-12-14 Oracle International Corporation Collaborative document review system
US7957018B2 (en) 2005-06-10 2011-06-07 Lexmark International, Inc. Coversheet manager application
US20060282463A1 (en) 2005-06-10 2006-12-14 Lexmark International, Inc. Virtual coversheet association application
US20060288015A1 (en) 2005-06-15 2006-12-21 Schirripa Steven R Electronic content classification
EP1736928A1 (en) 2005-06-20 2006-12-27 Mitsubishi Electric Information Technology Centre Europe B.V. Robust image registration
US7756325B2 (en) 2005-06-20 2010-07-13 University Of Basel Estimating 3D shape and texture of a 3D object based on a 2D image of the 3D object
JP4756930B2 (en) 2005-06-23 2011-08-24 キヤノン株式会社 Document management system, document management method, image forming apparatus, and information processing apparatus
US7937264B2 (en) 2005-06-30 2011-05-03 Microsoft Corporation Leveraging unlabeled data with a probabilistic graphical model
US20070002375A1 (en) 2005-06-30 2007-01-04 Lexmark International, Inc. Segmenting and aligning a plurality of cards in a multi-card image
US7515767B2 (en) 2005-07-01 2009-04-07 Flir Systems, Inc. Image correction across multiple spectral regimes
US20070035780A1 (en) 2005-08-02 2007-02-15 Kabushiki Kaisha Toshiba System and method for defining characteristic data of a scanned document
JP4525519B2 (en) 2005-08-18 2010-08-18 日本電信電話株式会社 Quadrilateral evaluation method, apparatus and program
EP1917639A4 (en) 2005-08-25 2017-11-08 Ricoh Company, Ltd. Image processing method and apparatus, digital camera, and recording medium recording image processing program
US8643892B2 (en) 2005-08-29 2014-02-04 Xerox Corporation User configured page chromaticity determination and splitting method
US7801382B2 (en) 2005-09-22 2010-09-21 Compressus, Inc. Method and apparatus for adjustable image compression
US7450740B2 (en) * 2005-09-28 2008-11-11 Facedouble, Inc. Image classification and information retrieval over wireless digital networks and the internet
US7831107B2 (en) 2005-10-17 2010-11-09 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and program
US8176004B2 (en) 2005-10-24 2012-05-08 Capsilon Corporation Systems and methods for intelligent paperless document management
US7495784B2 (en) 2005-11-14 2009-02-24 Kabushiki Kaisha Toshiba Printer with print order calculation based on print creation time and process ratio
US8229166B2 (en) 2009-07-07 2012-07-24 Trimble Navigation, Ltd Image-based tracking
KR100664421B1 (en) 2006-01-10 2007-01-03 주식회사 인지소프트 Portable terminal and method for recognizing a name card using a camera
WO2007082534A1 (en) 2006-01-17 2007-07-26 Flemming Ast Mobile unit with camera and optical character recognition, optionally for conversion of imaged text into comprehensible speech
US7720206B2 (en) 2006-01-18 2010-05-18 Teoco Corporation System and method for intelligent data extraction for telecommunications invoices
US7639897B2 (en) 2006-01-24 2009-12-29 Hewlett-Packard Development Company, L.P. Method and apparatus for composing a panoramic photograph
US7738730B2 (en) 2006-01-25 2010-06-15 Atalasoft, Inc. Method of image analysis using sparse hough transform
US8385647B2 (en) 2006-01-25 2013-02-26 Kofax, Inc. Method of image analysis using sparse Hough transform
JP4341629B2 (en) 2006-01-27 2009-10-07 カシオ計算機株式会社 Imaging apparatus, image processing method, and program
CN101390383B (en) 2006-02-23 2011-09-21 松下电器产业株式会社 Image correction device, method, program, integrated circuit, and system
US20070204162A1 (en) 2006-02-24 2007-08-30 Rodriguez Tony F Safeguarding private information through digital watermarking
US7330604B2 (en) 2006-03-02 2008-02-12 Compulink Management Center, Inc. Model-based dewarping method and apparatus
US7657091B2 (en) 2006-03-06 2010-02-02 Mitek Systems, Inc. Method for automatic removal of text from a signature area
JP4615462B2 (en) * 2006-03-15 2011-01-19 株式会社リコー Image processing apparatus, image forming apparatus, program, and image processing method
WO2008027620A1 (en) 2006-03-30 2008-03-06 Obopay Inc. Mobile person-to-person payment system
US7562060B2 (en) 2006-03-31 2009-07-14 Yahoo! Inc. Large scale semi-supervised linear support vector machines
US8775277B2 (en) 2006-04-21 2014-07-08 International Business Machines Corporation Method, system, and program product for electronically validating invoices
US8136114B1 (en) 2006-04-21 2012-03-13 Sprint Communications Company L.P. Business process management system having dynamic task assignment
US8213687B2 (en) 2006-04-28 2012-07-03 Hewlett-Packard Development Company, L.P. Image processing methods, image processing systems, and articles of manufacture
TWI311679B (en) 2006-04-28 2009-07-01 Primax Electronics Ltd A method of evaluating minimum sampling steps of auto focus
US20070260588A1 (en) 2006-05-08 2007-11-08 International Business Machines Corporation Selective, contextual review for documents
JP2007306259A (en) 2006-05-10 2007-11-22 Sony Corp Setting screen display controller, server device, image processing system, printer, imaging apparatus, display device, setting screen display control method, program, and data structure
TWI386817B (en) 2006-05-24 2013-02-21 Kofax Inc System for and method of providing a user interface for a computer-based software application
US7787695B2 (en) 2006-06-06 2010-08-31 Mitek Systems, Inc. Method for applying a signature simplicity analysis for improving the accuracy of signature validation
US7860320B2 (en) * 2006-06-26 2010-12-28 Eastman Kodak Company Classifying image regions based on picture location
US20080005081A1 (en) 2006-06-28 2008-01-03 Sun Microsystems, Inc. Method and apparatus for searching and resource discovery in a distributed enterprise system
US7626612B2 (en) 2006-06-30 2009-12-01 Motorola, Inc. Methods and devices for video correction of still camera motion
US7937345B2 (en) 2006-07-12 2011-05-03 Kofax, Inc. Data classification methods using machine learning techniques
US20080086432A1 (en) 2006-07-12 2008-04-10 Schmidtler Mauritius A R Data classification methods using machine learning techniques
US7958067B2 (en) 2006-07-12 2011-06-07 Kofax, Inc. Data classification methods using machine learning techniques
US7761391B2 (en) 2006-07-12 2010-07-20 Kofax, Inc. Methods and systems for improved transductive maximum entropy discrimination classification
WO2008008142A2 (en) 2006-07-12 2008-01-17 Kofax Image Products, Inc. Machine learning techniques and transductive data classification
US8073263B2 (en) 2006-07-31 2011-12-06 Ricoh Co., Ltd. Multi-classifier selection and monitoring for MMR-based image recognition
JP4172512B2 (en) 2006-08-30 2008-10-29 船井電機株式会社 Panorama imaging device
US20080235766A1 (en) 2006-09-01 2008-09-25 Wallos Robert Apparatus and method for document certification
JP2008134683A (en) 2006-11-27 2008-06-12 Fuji Xerox Co Ltd Image processor and image processing program
US8081227B1 (en) 2006-11-30 2011-12-20 Adobe Systems Incorporated Image quality visual indicator
US20080133388A1 (en) 2006-12-01 2008-06-05 Sergey Alekseev Invoice exception management
US7416131B2 (en) 2006-12-13 2008-08-26 Bottom Line Technologies (De), Inc. Electronic transaction processing server with automated transaction evaluation
US9282446B2 (en) 2009-08-06 2016-03-08 Golba Llc Location-aware content and location-based advertising with a mobile device
US20080147561A1 (en) 2006-12-18 2008-06-19 Pitney Bowes Incorporated Image based invoice payment with digital signature verification
US20100062491A1 (en) 2007-01-05 2010-03-11 Novozymes A/S Overexpression of the Chaperone BIP in a Heterokaryon
CA2578466A1 (en) 2007-01-12 2008-07-12 Truecontext Corporation Method and system for customizing a mobile application using a web-based interface
US20080177643A1 (en) 2007-01-22 2008-07-24 Matthews Clifton W System and method for invoice management
US7899247B2 (en) 2007-01-24 2011-03-01 Samsung Electronics Co., Ltd. Apparatus and method of segmenting an image according to a cost function and/or feature vector and/or receiving a signal representing the segmented image in an image coding and/or decoding system
WO2008094470A1 (en) 2007-01-26 2008-08-07 Magtek, Inc. Card reader for use with web based transactions
US20080183576A1 (en) 2007-01-30 2008-07-31 Sang Hun Kim Mobile service system and method using two-dimensional coupon code
EP1956517A1 (en) 2007-02-07 2008-08-13 WinBooks s.a. Computer assisted method for processing accounting operations and software product for implementing such method
JP4324628B2 (en) * 2007-02-13 2009-09-02 シャープ株式会社 Image processing method, image processing apparatus, image reading apparatus, image forming apparatus, computer program, and recording medium
US20080201617A1 (en) 2007-02-16 2008-08-21 Brother Kogyo Kabushiki Kaisha Network device and network system
KR101288971B1 (en) 2007-02-16 2013-07-24 삼성전자주식회사 Method and apparatus for 3 dimensional modeling using 2 dimensional images
JP4123299B1 (en) 2007-02-21 2008-07-23 富士ゼロックス株式会社 Image processing apparatus and image processing program
KR100866963B1 (en) 2007-03-12 2008-11-05 삼성전자주식회사 Method for stabilizing digital image which can correct the horizontal shear distortion and vertical scale distortion
JP4877013B2 (en) 2007-03-30 2012-02-15 ブラザー工業株式会社 Scanner
US8244031B2 (en) 2007-04-13 2012-08-14 Kofax, Inc. System and method for identifying and classifying color regions from a digital image
US20080270166A1 (en) 2007-04-16 2008-10-30 Duane Morin Transcript, course catalog and financial aid apparatus, systems, and methods
CN101295305B (en) 2007-04-25 2012-10-31 富士通株式会社 Image retrieval device
US8265393B2 (en) 2007-05-01 2012-09-11 Compulink Management Center, Inc. Photo-document segmentation method and system
US8279465B2 (en) 2007-05-01 2012-10-02 Kofax, Inc. Systems and methods for routing facsimiles based on content
KR101157654B1 (en) 2007-05-21 2012-06-18 삼성전자주식회사 Method for transmitting email in image forming apparatus and image forming apparatus capable of transmitting email
US7894689B2 (en) 2007-05-31 2011-02-22 Seiko Epson Corporation Image stitching
JP2009015396A (en) 2007-06-29 2009-01-22 Ricoh Co Ltd Workflow system, workflow management device, and workflow management method
JP2009014836A (en) 2007-07-02 2009-01-22 Canon Inc Active matrix type display and driving method therefor
JP4363468B2 (en) 2007-07-12 2009-11-11 ソニー株式会社 Imaging apparatus, imaging method, and video signal processing program
US8126924B1 (en) 2007-07-20 2012-02-28 Countermind Method of representing and processing complex branching logic for mobile applications
WO2009015501A1 (en) 2007-07-27 2009-02-05 ETH Zürich Computer system and method for generating a 3d geometric model
EP2183703A1 (en) 2007-08-01 2010-05-12 Yeda Research And Development Company Limited Multiscale edge detection and fiber enhancement using differences of oriented means
US8503797B2 (en) 2007-09-05 2013-08-06 The Neat Company, Inc. Automatic document classification using lexical and physical features
US20110035662A1 (en) 2009-02-18 2011-02-10 King Martin T Interacting with rendered documents using a multi-function mobile device, such as a mobile phone
US7825963B2 (en) 2007-09-19 2010-11-02 Nokia Corporation Method and system for capturing an image from video
US20090110267A1 (en) 2007-09-21 2009-04-30 The Regents Of The University Of California Automated texture mapping system for 3D models
US9811849B2 (en) 2007-09-28 2017-11-07 Great-Circle Technologies, Inc. Contextual execution of automated workflows
US8218887B2 (en) 2007-09-28 2012-07-10 Abbyy Software, Ltd. Enhanced method of multilayer compression of PDF (image) files using OCR systems
US8094976B2 (en) 2007-10-03 2012-01-10 Esker, Inc. One-screen reconciliation of business document image data, optical character recognition extracted data, and enterprise resource planning data
US8244062B2 (en) 2007-10-22 2012-08-14 Hewlett-Packard Development Company, L.P. Correction of distortion in captured images
US8059888B2 (en) 2007-10-30 2011-11-15 Microsoft Corporation Semi-automatic plane extrusion for 3D modeling
US7655685B2 (en) 2007-11-02 2010-02-02 Jenrin Discovery, Inc. Cannabinoid receptor antagonists/inverse agonists useful for treating metabolic disorders, including obesity and diabetes
US7809721B2 (en) 2007-11-16 2010-10-05 Iac Search & Media, Inc. Ranking of objects using semantic and nonsemantic features in a system and method for conducting a search
US8732155B2 (en) 2007-11-16 2014-05-20 Iac Search & Media, Inc. Categorization in a system and method for conducting a search
US8194965B2 (en) 2007-11-19 2012-06-05 Parascript, Llc Method and system of providing a probability distribution to aid the detection of tumors in mammogram images
US8311296B2 (en) 2007-11-21 2012-11-13 Parascript, Llc Voting in mammography processing
US8035641B1 (en) 2007-11-28 2011-10-11 Adobe Systems Incorporated Fast depth of field simulation
US8249985B2 (en) 2007-11-29 2012-08-21 Bank Of America Corporation Sub-account mechanism
US8103048B2 (en) 2007-12-04 2012-01-24 Mcafee, Inc. Detection of spam images
US8532374B2 (en) 2007-12-05 2013-09-10 Canon Kabushiki Kaisha Colour document layout analysis with multi-level decomposition
US8194933B2 (en) 2007-12-12 2012-06-05 3M Innovative Properties Company Identification and verification of an unknown document according to an eigen image process
US8150547B2 (en) 2007-12-21 2012-04-03 Bell and Howell, LLC. Method and system to provide address services with a document processing system
US8566752B2 (en) 2007-12-21 2013-10-22 Ricoh Co., Ltd. Persistent selection marks
US9672510B2 (en) 2008-01-18 2017-06-06 Mitek Systems, Inc. Systems and methods for automatic image capture and processing of documents on a mobile device
US8577118B2 (en) 2008-01-18 2013-11-05 Mitek Systems Systems for mobile image capture and remittance processing
US9292737B2 (en) 2008-01-18 2016-03-22 Mitek Systems, Inc. Systems and methods for classifying payment documents during mobile image processing
US8379914B2 (en) 2008-01-18 2013-02-19 Mitek Systems, Inc. Systems and methods for mobile image capture and remittance processing
US8582862B2 (en) 2010-05-12 2013-11-12 Mitek Systems Mobile image quality assurance in mobile document image processing applications
US8483473B2 (en) 2008-01-18 2013-07-09 Mitek Systems, Inc. Systems and methods for obtaining financial offers using mobile image capture
US7949176B2 (en) 2008-01-18 2011-05-24 Mitek Systems, Inc. Systems for mobile image capture and processing of documents
US10102583B2 (en) 2008-01-18 2018-10-16 Mitek Systems, Inc. System and methods for obtaining insurance offers using mobile image capture
US10528925B2 (en) 2008-01-18 2020-01-07 Mitek Systems, Inc. Systems and methods for mobile automated clearing house enrollment
US9298979B2 (en) 2008-01-18 2016-03-29 Mitek Systems, Inc. Systems and methods for mobile image capture and content processing of driver's licenses
US20130297353A1 (en) 2008-01-18 2013-11-07 Mitek Systems Systems and methods for filing insurance claims using mobile imaging
US20090204530A1 (en) 2008-01-31 2009-08-13 Payscan America, Inc. Bar coded monetary transaction system and method
RU2460187C2 (en) 2008-02-01 2012-08-27 Рокстек Аб Transition frame with inbuilt pressing device
US7992087B1 (en) 2008-02-27 2011-08-02 Adobe Systems Incorporated Document mapped-object placement upon background change
JP5009196B2 (en) 2008-03-04 2012-08-22 ソニーフィナンシャルホールディングス株式会社 Information processing apparatus, program, and information processing method
US9082080B2 (en) 2008-03-05 2015-07-14 Kofax, Inc. Systems and methods for organizing data sets
US20090324025A1 (en) 2008-04-15 2009-12-31 Sony Ericsson Mobile Communications AB Physical Access Control Using Dynamic Inputs from a Portable Communications Device
US8135656B2 (en) 2008-04-22 2012-03-13 Xerox Corporation Online management service for identification documents which prompts a user for a category of an official document
US20090285445A1 (en) 2008-05-15 2009-11-19 Sony Ericsson Mobile Communications Ab System and Method of Translating Road Signs
US8553984B2 (en) 2008-06-02 2013-10-08 Massachusetts Institute Of Technology Fast pattern classification based on a sparse transform
CN101329731A (en) 2008-06-06 2008-12-24 南开大学 Automatic recognition method of mathematical formula in image
US7949167B2 (en) 2008-06-12 2011-05-24 Siemens Medical Solutions Usa, Inc. Automatic learning of image features to predict disease
KR20100000671A (en) 2008-06-25 2010-01-06 삼성전자주식회사 Method for image processing
US8154611B2 (en) 2008-07-17 2012-04-10 The Boeing Company Methods and systems for improving resolution of a digitally stabilized image
US8520979B2 (en) 2008-08-19 2013-08-27 Digimarc Corporation Methods and systems for content processing
US20100045701A1 (en) 2008-08-22 2010-02-25 Cybernet Systems Corporation Automatic mapping of augmented reality fiducials
JP4715888B2 (en) 2008-09-02 2011-07-06 カシオ計算機株式会社 Image processing apparatus and computer program
US9177218B2 (en) 2008-09-08 2015-11-03 Kofax, Inc. System and method, and computer program product for detecting an edge in scan data
WO2010030056A1 (en) 2008-09-10 2010-03-18 Bionet Co., Ltd Automatic contour detection method for ultrasonic diagnosis apparatus
JP2010098728A (en) 2008-09-19 2010-04-30 Sanyo Electric Co Ltd Projection type video display, and display system
US9037513B2 (en) 2008-09-30 2015-05-19 Apple Inc. System and method for providing electronic event tickets
EP2352321B1 (en) 2008-10-31 2019-09-11 ZTE Corporation Method and apparatus for authentication processing of mobile terminal
US8384947B2 (en) 2008-11-17 2013-02-26 Image Trends, Inc. Handheld scanner and system comprising same
US8180153B2 (en) 2008-12-05 2012-05-15 Xerox Corporation 3+1 layer mixed raster content (MRC) images having a black text layer
GB0822953D0 (en) 2008-12-16 2009-01-21 Staffordshire University Image processing
US8306327B2 (en) 2008-12-30 2012-11-06 International Business Machines Corporation Adaptive partial character recognition
US8774516B2 (en) 2009-02-10 2014-07-08 Kofax, Inc. Systems, methods and computer program products for determining document validity
US8958605B2 (en) 2009-02-10 2015-02-17 Kofax, Inc. Systems, methods and computer program products for determining document validity
US9767354B2 (en) 2009-02-10 2017-09-19 Kofax, Inc. Global geographic information retrieval, validation, and normalization
US8345981B2 (en) 2009-02-10 2013-01-01 Kofax, Inc. Systems, methods, and computer program products for determining document validity
US9576272B2 (en) 2009-02-10 2017-02-21 Kofax, Inc. Systems, methods and computer program products for determining document validity
US8879846B2 (en) 2009-02-10 2014-11-04 Kofax, Inc. Systems, methods and computer program products for processing financial documents
US8406480B2 (en) 2009-02-17 2013-03-26 International Business Machines Corporation Visual credential verification
US8265422B1 (en) 2009-02-20 2012-09-11 Adobe Systems Incorporated Method and apparatus for removing general lens distortion from images
JP4725657B2 (en) 2009-02-26 2011-07-13 ブラザー工業株式会社 Image composition output program, image composition output device, and image composition output system
US8498486B2 (en) 2009-03-12 2013-07-30 Qualcomm Incorporated Response to detection of blur in an image
US20100280859A1 (en) 2009-04-30 2010-11-04 Bank Of America Corporation Future checks integration
CN101894262B (en) * 2009-05-20 2014-07-09 索尼株式会社 Method and apparatus for classifying image
RS51531B (en) 2009-05-29 2011-06-30 Vlatacom D.O.O. Handheld portable device for travel and ID document verification, biometric data reading and identification of persons using those documents
US20100331043A1 (en) 2009-06-23 2010-12-30 K-Nfb Reading Technology, Inc. Document and image processing
US8620078B1 (en) 2009-07-14 2013-12-31 Matrox Electronic Systems, Ltd. Determining a class associated with an image
JP5397059B2 (en) 2009-07-17 2014-01-22 ソニー株式会社 Image processing apparatus and method, program, and recording medium
US8478052B1 (en) * 2009-07-17 2013-07-02 Google Inc. Image classification
US8508580B2 (en) 2009-07-31 2013-08-13 3Dmedia Corporation Methods, systems, and computer-readable storage media for creating three-dimensional (3D) images of a scene
JP4772894B2 (en) 2009-08-03 2011-09-14 シャープ株式会社 Image output device, portable terminal device, captured image processing system, image output method, program, and recording medium
US9135277B2 (en) 2009-08-07 2015-09-15 Google Inc. Architecture for responding to a visual query
JP4856263B2 (en) 2009-08-07 2012-01-18 シャープ株式会社 Captured image processing system, image output method, program, and recording medium
CN101639760A (en) 2009-08-27 2010-02-03 上海合合信息科技发展有限公司 Input method and input system of contact information
US8655733B2 (en) 2009-08-27 2014-02-18 Microsoft Corporation Payment workflow extensibility for point-of-sale applications
US9779386B2 (en) 2009-08-31 2017-10-03 Thomson Reuters Global Resources Method and system for implementing workflows and managing staff and engagements
US8819172B2 (en) 2010-11-04 2014-08-26 Digimarc Corporation Smartphone-based methods and systems
KR101611440B1 (en) 2009-11-16 2016-04-11 삼성전자주식회사 Method and apparatus for processing image
CN102301353A (en) 2009-11-30 2011-12-28 松下电器产业株式会社 Portable communication apparatus, communication method, integrated circuit, and program
JP2011118513A (en) 2009-12-01 2011-06-16 Toshiba Corp Character recognition device and form identification method
JP4979757B2 (en) 2009-12-02 2012-07-18 日立オムロンターミナルソリューションズ株式会社 Paper sheet identification apparatus, automatic transaction apparatus, and paper sheet identification method
US9183224B2 (en) 2009-12-02 2015-11-10 Google Inc. Identifying matching canonical documents in response to a visual query
US9405772B2 (en) 2009-12-02 2016-08-02 Google Inc. Actionable search results for street view visual queries
US8406554B1 (en) 2009-12-02 2013-03-26 Jadavpur University Image binarization based on grey membership parameters of pixels
US20110137898A1 (en) 2009-12-07 2011-06-09 Xerox Corporation Unstructured document classification
US20120019614A1 (en) 2009-12-11 2012-01-26 Tessera Technologies Ireland Limited Variable Stereo Base for (3D) Panorama Creation on Handheld Device
US8532419B2 (en) 2010-01-13 2013-09-10 iParse, LLC Automatic image capture
US20110249905A1 (en) 2010-01-15 2011-10-13 Copanion, Inc. Systems and methods for automatically extracting data from electronic documents including tables
US8600173B2 (en) 2010-01-27 2013-12-03 Dst Technologies, Inc. Contextualization of machine indeterminable information based on machine determinable information
US9129432B2 (en) 2010-01-28 2015-09-08 The Hong Kong University Of Science And Technology Image-based procedural remodeling of buildings
JP5426422B2 (en) 2010-02-10 2014-02-26 株式会社Pfu Image processing apparatus, image processing method, and image processing program
KR101630688B1 (en) 2010-02-17 2016-06-16 삼성전자주식회사 Apparatus for motion estimation and method thereof and image processing apparatus
US8433775B2 (en) 2010-03-31 2013-04-30 Bank Of America Corporation Integration of different mobile device types with a business infrastructure
US8515208B2 (en) 2010-04-05 2013-08-20 Kofax, Inc. Method for document to template alignment
US8595234B2 (en) 2010-05-17 2013-11-26 Wal-Mart Stores, Inc. Processing data feeds
US9047531B2 (en) 2010-05-21 2015-06-02 Hand Held Products, Inc. Interactive user interface for capturing a document in an image signal
US8600167B2 (en) 2010-05-21 2013-12-03 Hand Held Products, Inc. System for capturing a document in an image signal
WO2011149558A2 (en) 2010-05-28 2011-12-01 Abelow Daniel H Reality alternate
EP3324350A1 (en) 2010-06-08 2018-05-23 Deutsche Post AG Navigation system for optimising delivery or collection journeys
US8352411B2 (en) 2010-06-17 2013-01-08 Sap Ag Activity schemes for support of knowledge-intensive tasks
JP5500480B2 (en) 2010-06-24 2014-05-21 株式会社日立情報通信エンジニアリング Form recognition device and form recognition method
US8745488B1 (en) 2010-06-30 2014-06-03 Patrick Wong System and a method for web-based editing of documents online with an editing interface and concurrent display to webpages and print documents
US20120008856A1 (en) 2010-07-08 2012-01-12 Gregory Robert Hewes Automatic Convergence Based on Face Detection for Stereoscopic Imaging
US8548201B2 (en) 2010-09-02 2013-10-01 Electronics And Telecommunications Research Institute Apparatus and method for recognizing identifier of vehicle
JP5738559B2 (en) 2010-09-07 2015-06-24 Primagest, Inc. Insurance business processing system and insurance business processing method
US20120077476A1 (en) 2010-09-23 2012-03-29 Theodore G. Paraskevakos System and method for utilizing mobile telephones to combat crime
US20120092329A1 (en) 2010-10-13 2012-04-19 Qualcomm Incorporated Text-based 3d augmented reality
US9282238B2 (en) 2010-10-29 2016-03-08 Hewlett-Packard Development Company, L.P. Camera system for determining pose quality and providing feedback to a user
US20120116957A1 (en) 2010-11-04 2012-05-10 Bank Of America Corporation System and method for populating a list of transaction participants
US8995012B2 (en) 2010-11-05 2015-03-31 Rdm Corporation System for mobile image capture and processing of financial documents
US8744196B2 (en) * 2010-11-26 2014-06-03 Hewlett-Packard Development Company, L.P. Automatic recognition of images
US8754988B2 (en) 2010-12-22 2014-06-17 Tektronix, Inc. Blur detection with local sharpness map
US8503769B2 (en) 2010-12-28 2013-08-06 Microsoft Corporation Matching text to images
JP5736796B2 (en) 2011-01-24 2015-06-17 Nikon Corporation Electronic camera, program and recording medium
US20120194692A1 (en) 2011-01-31 2012-08-02 Hand Held Products, Inc. Terminal operative for display of electronic record
US8675953B1 (en) 2011-02-02 2014-03-18 Intuit Inc. Calculating an object size using images
US8811711B2 (en) 2011-03-08 2014-08-19 Bank Of America Corporation Recognizing financial document images
JP2012191486A (en) 2011-03-11 2012-10-04 Sony Corp Image composing apparatus, image composing method, and program
JP2012194736A (en) 2011-03-16 2012-10-11 Ms&Ad Research Institute Co Ltd Accident report preparation support system
JP5231667B2 (en) 2011-04-01 2013-07-10 Sharp Corporation Imaging apparatus, display method in imaging apparatus, image processing method in imaging apparatus, program, and recording medium
US8533595B2 (en) 2011-04-19 2013-09-10 Autodesk, Inc Hierarchical display and navigation of document revision histories
US8792682B2 (en) 2011-04-21 2014-07-29 Xerox Corporation Method and system for identifying a license plate
US9342886B2 (en) 2011-04-29 2016-05-17 Qualcomm Incorporated Devices, methods, and apparatuses for homography evaluation involving a mobile device
US10402898B2 (en) 2011-05-04 2019-09-03 Paypal, Inc. Image-based financial processing
US8751317B2 (en) 2011-05-12 2014-06-10 Koin, Inc. Enabling a merchant's storefront POS (point of sale) system to accept a payment transaction verified by SMS messaging with buyer's mobile phone
US20120293607A1 (en) 2011-05-17 2012-11-22 Apple Inc. Panorama Processing
US8571271B2 (en) 2011-05-26 2013-10-29 Microsoft Corporation Dual-phase red eye correction
US20120300020A1 (en) 2011-05-27 2012-11-29 Qualcomm Incorporated Real-time self-localization from panoramic images
US20120308139A1 (en) 2011-05-31 2012-12-06 Verizon Patent And Licensing Inc. Method and system for facilitating subscriber services using mobile imaging
US9400806B2 (en) 2011-06-08 2016-07-26 Hewlett-Packard Development Company, L.P. Image triggered transactions
US9418304B2 (en) * 2011-06-29 2016-08-16 Qualcomm Incorporated System and method for recognizing text information in object
US20130027757A1 (en) 2011-07-29 2013-01-31 Qualcomm Incorporated Mobile fax machine with image stitching and degradation removal processing
US8559766B2 (en) 2011-08-16 2013-10-15 iParse, LLC Automatic image capture
US8813111B2 (en) * 2011-08-22 2014-08-19 Xerox Corporation Photograph-based game
US8660943B1 (en) 2011-08-31 2014-02-25 Btpatent Llc Methods and systems for financial transactions
US8525883B2 (en) 2011-09-02 2013-09-03 Sharp Laboratories Of America, Inc. Methods, systems and apparatus for automatic video quality assessment
CN102982396B (en) 2011-09-06 2017-12-26 SAP SE Universal process modeling framework
US9710821B2 (en) 2011-09-15 2017-07-18 Stephan HEATH Systems and methods for mobile and online payment systems for purchases related to mobile and online promotions or offers provided using impressions tracking and analysis, location information, 2D and 3D mapping, mobile mapping, social media, and user behavior and
US8768834B2 (en) 2011-09-20 2014-07-01 E2Interactive, Inc. Digital exchange and mobile wallet for digital currency
US8737980B2 (en) 2011-09-27 2014-05-27 W2Bi, Inc. End to end application automatic testing
US9123005B2 (en) 2011-10-11 2015-09-01 Mobiwork, Llc Method and system to define implement and enforce workflow of a mobile workforce
US10810218B2 (en) 2011-10-14 2020-10-20 Transunion, Llc System and method for matching of database records based on similarities to search queries
WO2013059599A1 (en) 2011-10-19 2013-04-25 The Regents Of The University Of California Image-based measurement tools
EP2587745A1 (en) 2011-10-26 2013-05-01 Swisscom AG A method and system of obtaining contact information for a person or an entity
US9087262B2 (en) 2011-11-10 2015-07-21 Fuji Xerox Co., Ltd. Sharpness estimation in document and scene images
US8701166B2 (en) 2011-12-09 2014-04-15 Blackberry Limited Secure authentication
US20170111532A1 (en) 2012-01-12 2017-04-20 Kofax, Inc. Real-time processing of video streams captured using mobile devices
US9058515B1 (en) 2012-01-12 2015-06-16 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
US9483794B2 (en) 2012-01-12 2016-11-01 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
US11321772B2 (en) 2012-01-12 2022-05-03 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
US9275281B2 (en) 2012-01-12 2016-03-01 Kofax, Inc. Mobile image capture, processing, and electronic form generation
US9058580B1 (en) 2012-01-12 2015-06-16 Kofax, Inc. Systems and methods for identification document processing and business workflow integration
US9165187B2 (en) 2012-01-12 2015-10-20 Kofax, Inc. Systems and methods for mobile image capture and processing
TWI588778B (en) 2012-01-17 2017-06-21 National Taiwan University of Science and Technology Activity recognition method
US9058327B1 (en) 2012-01-25 2015-06-16 Symantec Corporation Enhancing training of predictive coding systems through user selected text
US9305083B2 (en) 2012-01-26 2016-04-05 Microsoft Technology Licensing, Llc Author disambiguation
US20130198358A1 (en) 2012-01-30 2013-08-01 DoDat Process Technology, LLC Distributive on-demand administrative tasking apparatuses, methods and systems
JP5914045B2 (en) 2012-02-28 2016-05-11 Canon Inc. Image processing apparatus, image processing method, and program
US8990112B2 (en) 2012-03-01 2015-03-24 Ricoh Company, Ltd. Expense report system with receipt image processing
JP5734902B2 (en) 2012-03-19 2015-06-17 Toshiba Corporation Construction process management system and management method thereof
US8724907B1 (en) 2012-03-28 2014-05-13 Emc Corporation Method and system for using OCR data for grouping and classifying documents
US20130268430A1 (en) 2012-04-05 2013-10-10 Ziftit, Inc. Method and apparatus for dynamic gift card processing
US20130268378A1 (en) 2012-04-06 2013-10-10 Microsoft Corporation Transaction validation between a mobile communication device and a terminal using location data
US20130271579A1 (en) 2012-04-14 2013-10-17 Younian Wang Mobile Stereo Device: Stereo Imaging, Measurement and 3D Scene Reconstruction with Mobile Devices such as Tablet Computers and Smart Phones
US8639621B1 (en) 2012-04-25 2014-01-28 Wells Fargo Bank, N.A. System and method for a mobile wallet
US9916514B2 (en) 2012-06-11 2018-03-13 Amazon Technologies, Inc. Text recognition driven functionality
US8441548B1 (en) * 2012-06-15 2013-05-14 Google Inc. Facial image quality assessment
US9064316B2 (en) 2012-06-28 2015-06-23 Lexmark International, Inc. Methods of content-based image identification
US8781229B2 (en) 2012-06-29 2014-07-15 Palo Alto Research Center Incorporated System and method for localizing data fields on structured and semi-structured forms
US9092773B2 (en) 2012-06-30 2015-07-28 At&T Intellectual Property I, L.P. Generating and categorizing transaction records
US20140012754A1 (en) 2012-07-06 2014-01-09 Bank Of America Corporation Financial document processing system
US8705836B2 (en) 2012-08-06 2014-04-22 A2iA S.A. Systems and methods for recognizing information in objects using a mobile device
JP2014035656A (en) 2012-08-09 2014-02-24 Sony Corp Image processing apparatus, image processing method, and program
US8842319B2 (en) 2012-08-21 2014-09-23 Xerox Corporation Context aware document services for mobile device users
US8817339B2 (en) 2012-08-22 2014-08-26 Top Image Systems Ltd. Handheld device document imaging
US9928406B2 (en) 2012-10-01 2018-03-27 The Regents Of The University Of California Unified face representation for individual recognition in surveillance videos and vehicle logo super-resolution system
US20140149308A1 (en) 2012-11-27 2014-05-29 Ebay Inc. Automated package tracking
US9256791B2 (en) 2012-12-04 2016-02-09 Mobileye Vision Technologies Ltd. Road vertical contour detection
US20140181691A1 (en) 2012-12-20 2014-06-26 Rajesh Poornachandran Sharing of selected content for data collection
US9648297B1 (en) 2012-12-28 2017-05-09 Google Inc. Systems and methods for assisting a user in capturing images for three-dimensional reconstruction
US9092665B2 (en) 2013-01-30 2015-07-28 Aquifi, Inc Systems and methods for initializing motion tracking of human hands
US9239713B1 (en) 2013-03-06 2016-01-19 MobileForce Software, Inc. Platform independent rendering for native mobile applications
US10127636B2 (en) 2013-09-27 2018-11-13 Kofax, Inc. Content-based detection and three dimensional geometric reconstruction of objects in image and video data
US10140511B2 (en) 2013-03-13 2018-11-27 Kofax, Inc. Building classification and extraction models based on electronic forms
JP2016517587A (en) 2013-03-13 2016-06-16 Kofax, Inc. Classification of objects in digital images captured using mobile devices
US9355312B2 (en) 2013-03-13 2016-05-31 Kofax, Inc. Systems and methods for classifying objects in digital images captured using mobile devices
US9208536B2 (en) 2013-09-27 2015-12-08 Kofax, Inc. Systems and methods for three dimensional geometric reconstruction of captured image data
US9384566B2 (en) 2013-03-14 2016-07-05 Wisconsin Alumni Research Foundation System and method for simultaneous image artifact reduction and tomographic reconstruction
GB2500823B (en) 2013-03-28 2014-02-26 Paycasso Verify Ltd Method, system and computer program for comparing images
US20140316841A1 (en) 2013-04-23 2014-10-23 Kofax, Inc. Location-based workflows and services
CN105518704A (en) 2013-05-03 2016-04-20 柯法克斯公司 Systems and methods for detecting and classifying objects in video captured using mobile devices
RU2541353C2 (en) 2013-06-19 2015-02-10 ABBYY Development LLC Automatic capture of document with given proportions
US20150006362A1 (en) 2013-06-28 2015-01-01 Google Inc. Extracting card data using card art
US8805125B1 (en) 2013-06-28 2014-08-12 Google Inc. Comparing extracted card data using continuous scanning
US10140257B2 (en) 2013-08-02 2018-11-27 Symbol Technologies, Llc Method and apparatus for capturing and processing content from context sensitive documents on a mobile device
US10769362B2 (en) 2013-08-02 2020-09-08 Symbol Technologies, Llc Method and apparatus for capturing and extracting content from documents on a mobile device
KR102082301B1 (en) 2013-09-30 2020-02-27 Samsung Electronics Co., Ltd. Method, apparatus and computer-readable recording medium for converting document image captured by camera to the scanned document image
US20150120564A1 (en) 2013-10-29 2015-04-30 Bank Of America Corporation Check memo line data lift
US9373057B1 (en) 2013-11-01 2016-06-21 Google Inc. Training a neural network to detect objects in images
US9386235B2 (en) 2013-11-15 2016-07-05 Kofax, Inc. Systems and methods for generating composite images of long documents using mobile video data
US20150161765A1 (en) 2013-12-06 2015-06-11 Emc Corporation Scaling mobile check photos to physical dimensions
US10386439B2 (en) 2013-12-20 2019-08-20 Koninklijke Philips N.V. Density guided attenuation map generation in PET/MR systems
US8811751B1 (en) 2013-12-20 2014-08-19 I.R.I.S. Method and system for correcting projective distortions with elimination steps on multiple levels
US20150248391A1 (en) 2014-02-28 2015-09-03 Ricoh Company, Ltd. Form auto-filling using a mobile device
US9626528B2 (en) 2014-03-07 2017-04-18 International Business Machines Corporation Data leak prevention enforcement based on learned document classification
CN105095900B (en) 2014-05-04 2020-12-08 Banma Zhixing Network (Hong Kong) Co., Ltd. Method and device for extracting specific information in standard card
US9251431B2 (en) 2014-05-30 2016-02-02 Apple Inc. Object-of-interest detection and recognition with split, full-resolution image processing pipeline
US9342830B2 (en) 2014-07-15 2016-05-17 Google Inc. Classifying open-loop and closed-loop payment cards based on optical character recognition
US20160034775A1 (en) 2014-08-02 2016-02-04 General Vault, LLC Methods and apparatus for bounded image data analysis and notification mechanism
US9251614B1 (en) 2014-08-29 2016-02-02 Konica Minolta Laboratory U.S.A., Inc. Background removal for document images
AU2014218444B2 (en) 2014-08-29 2017-06-15 Canon Kabushiki Kaisha Dynamic feature selection for joint probabilistic recognition
US9760788B2 (en) 2014-10-30 2017-09-12 Kofax, Inc. Mobile document detection and orientation based on reference object characteristics
US9852132B2 (en) 2014-11-25 2017-12-26 Chegg, Inc. Building a topical learning model in a content management system
US9367899B1 (en) 2015-05-29 2016-06-14 Konica Minolta Laboratory U.S.A., Inc. Document image binarization method
US10242285B2 (en) 2015-07-20 2019-03-26 Kofax, Inc. Iterative recognition-guided thresholding and data extraction
US10467465B2 (en) 2015-07-20 2019-11-05 Kofax, Inc. Range and/or polarity-based thresholding for improved data extraction
US9779296B1 (en) 2016-04-01 2017-10-03 Kofax, Inc. Content-based detection and three dimensional geometric reconstruction of objects in image and video data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6512848B2 (en) * 1996-11-18 2003-01-28 Canon Kabushiki Kaisha Page analysis system
US20080212115A1 (en) * 2007-02-13 2008-09-04 Yohsuke Konishi Image processing method, image processing apparatus, image reading apparatus, and image forming apparatus
US20100060915A1 (en) * 2008-09-08 2010-03-11 Masaru Suzuki Apparatus and method for image processing, and program

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2647670C1 (en) * 2016-09-27 2018-03-16 ABBYY Development LLC Automated methods and systems of identifying image fragments in document-containing images to facilitate extraction of information from identified document-containing image fragments

Also Published As

Publication number Publication date
EP2974261A4 (en) 2016-06-15
US9996741B2 (en) 2018-06-12
US20160259973A1 (en) 2016-09-08
US20170103281A1 (en) 2017-04-13
JP2016516245A (en) 2016-06-02
US20140270349A1 (en) 2014-09-18
US9355312B2 (en) 2016-05-31
CN105308944A (en) 2016-02-03
US10127441B2 (en) 2018-11-13
EP2974261A2 (en) 2016-01-20
WO2014160433A3 (en) 2014-11-27

Similar Documents

Publication Publication Date Title
WO2014160433A2 (en) Systems and methods for classifying objects in digital images captured using mobile devices
US11087407B2 (en) Systems and methods for mobile image capture and processing
US11062176B2 (en) Object detection and image cropping using a multi-detector approach
US9754164B2 (en) Systems and methods for classifying objects in digital images captured using mobile devices
US9275281B2 (en) Mobile image capture, processing, and electronic form generation
WO2017015401A1 (en) Mobile image capture, processing, and electronic form generation

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 201480014229.9; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 14773721; Country of ref document: EP; Kind code of ref document: A2)
WWE Wipo information: entry into national phase (Ref document number: 2014773721; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2016502192; Country of ref document: JP; Kind code of ref document: A)
