WO2006006190A1

WO2006006190A1 - Data entry process and system

Info

Publication number: WO2006006190A1
Application number: PCT/IT2005/000380
Authority: WO
Inventors: Armando Salle
Original assignee: Bankersoft S.R.L.
Priority date: 2004-07-08
Filing date: 2005-07-05
Publication date: 2006-01-19
Also published as: ITTO20040467A1

Abstract

A process is disclosed for entering data, particularly accounting data, in a system for storing and/or processing such data; a system is further disclosed for entering data, particularly accounting data, in a system for storing and/or processing such data.

Description

DATA ENTRY PROCESS AND SYSTEM

The present invention refers to a data entry process and system, particularly for accounting data in a system for storing and/or processing such data.

As is known, all activities that involve handling and/or analysing documents, above all paper documents but also digital ones, and particularly accounting documents, require long data entry operations, typically performed by an operator, in which the data contained in such documents are entered into storage databases for subsequent processing.

In particular, in an accountant's office, a very significant part of the work consists in entering into the information system all the pieces of information present in the accounting documents delivered by customers. These documents are of various types, such as purchase invoices, sales invoices, bank handling forms, etc.

This work can be performed by public accountants, or more often by accountants. In any case, the person entering data knows fiscal and accounting laws and, in addition to being able to identify which document data must be entered in the system, is able to choose the accounting accounts and the correct VAT (as known, an acronym for "Value Added Tax") item for every document amount and every customer. The term "VAT item" means an identification code that represents a value for every one of the characteristics related to VAT (percentage, deducted amount, etc.). Moreover, these codes allow identifying the amounts to be summed in every cell of VAT Declarations.

The prior art discloses, in IT-A-TO2003A000900, a process and a system implementing it, through which a public accountant or an accounting and fiscal expert can enter accounting and fiscal rules in the system, in such a way that a user without any accounting and/or fiscal knowledge manages, guided by the system, to enter the document data.

Both when data keying is performed traditionally, namely with the user choosing accounting accounts and VAT items, and when it is performed through the above known process, a user must read all useful data on the accounting document and key them in by using an information system program. This causes a very high waste of resources and time, and carries the unavoidable risk of entering wrong accounting data, since such data must be keyed in manually.

Object of the present invention is solving the above prior-art problems by providing a data entry process and system, particularly for accounting data, in a system for storing and/or managing them, that allows making it easier to perform the task of keying-in such data.

Moreover, an object of the present invention is providing a data entry process and system, particularly of accounting data, in a system for storing and/or processing them, that allows entering such data in a quicker and more reliable way, consequently minimising the waste of economy and time resources and minimising, if not removing, possible manual keying-in and entry errors by an operator.

The above and other objects and advantages of the invention, as will appear from the following description, are obtained by a data entry process, particularly for accounting data, as claimed in claim 1. Preferred embodiments and non-trivial variations of the present invention are the subject matter of the dependent claims.

The present invention will be better described by some preferred embodiments thereof, provided as a non-limiting example, with reference to the enclosed drawing, in which FIG. 1 shows a block diagram representing the steps of the accounting data entry process and system according to the present invention.

In particular, the process according to the present invention is based on identifying the type of document in which data to be entered in the storing and/or processing system are contained, on determining the position of every data on every type of document, and on automatically reading and automatically compiling data when the document is of an already known type.

In general, the document that can be used with the procedure according to the present invention can physically be on media of a different nature, such as: paper; digital in the form of an image; or digital in the form of text.

The process according to the present invention makes it easier to enter data, particularly accounting data, by digitally acquiring, in fact, the image of documents in which such data are contained, recognising the document texts, identifying the type of every document, storing the areas in which data to enter can be found on every type of document, and automatically loading the data entry program fields depending on found data in corresponding areas.

In the following description, for simplicity and clarity, it is supposed that documents are on a paper medium. This description can however be easily adapted to other document-supporting media.

With reference to FIG. 1, it is possible to note that the accounting data entry process and system according to the present invention comprises the steps of:

obtaining (F101) a digital image of the document; obviously, if the document is not on paper but is already a digital document, this step is not necessary, while if the document is on paper, the digital image must be obtained, for example, by acquisition through a scanner;

from the image obtained in the previous step, recognising (F102) the characters contained in the document through a recognition program (for example any OCR - Optical Character Recognition program) and storing the obtained texts and their position in the image;

displaying (F103) to a user a digital document image and requesting (F104) the user to enter document data in their related fields, for example through a data entry program (manual entry) or by selecting the field to be entered and, on the digital document image, the area from which data must be taken (automatic selection): in this way, the system accurately knows the area from which the data have been taken;

locating (F105) and storing (F106) positions of digital image areas related to fields in which data are present that are equivalent to data that had been manually or automatically entered by a user. Such step can be performed in different ways, for example by comparing data entered by the user with texts recognised on the document;

selecting an area identifying the document image (F107) and storing the identifying image contained in such area (F107), which is used as document type identification. Documents of the same type are, for example, invoices of the same supplier. Generally, they always have the same lay-out, and only data change. It is possible to use different criteria for selecting the identifying area, such as for example:

a. top left area (area often used for company logo); or

b. area in which pictures that are not characters can be found;

storing (F108) the position of such identifying area and such identifying image contained in such area;

when entering subsequent documents, before requesting the user, in step F104, to enter document data, identifying (F109) the type of document by comparing the identifying area and the identifying image with the identifying areas and the identifying images of already-stored document types; such comparison of identifying images can occur, for example, by using an algorithm that points out the similarity of compared images, such as the "cross-correlation". Alternatively, it is possible to provide that the user manually identifies the document type, for example from a list of available documents. Alternatively, it can be provided to identify the document type by correlating the whole document image, therefore without the need of having to select one identifying area on the digital image. If the document type is not identified, one proceeds as described in the previous steps. If instead the document type is identified, one proceeds as described in the following steps:

searching (F110) which image areas have been stored on previous documents of the same type associated with that field. Consequently, searching (F111) which text is present in the current document in the related area and automatically loading (F112) this text in the field to be keyed-in: such areas can be chosen by using different logics, for example:

a) the area more frequently associated with documents of the same type;

b) the area associated in the latest document of the same type;

c) the area more frequently previously associated with the latest documents of the same type;

d) the area more frequently previously associated with documents of the same type but weighing the most recent documents more; or

e) the area previously associated with a document of the same type chosen by the user as single model;

when the user selects a field in the data entry program, if the procedure automatically loaded the field, highlighting (F113), on the document image, the area from which the loaded data have been taken. If other areas had also been stored for such field, the other areas are also highlighted, in a different way. The user can change (F114) the chosen area among the highlighted ones by pressing a key or by using the mouse or other pointing devices, such as for example a touch screen, on the image. When the user selects another area:

replacing (F115) the value loaded in the field with the text found in the selected area; or

the user can also manually modify or write (F116) the field value;

upon user confirmation, for every field that is automatically filled-in with the text of an image area, storing (F117) the selected area;

for all fields that have been manually keyed-in by the user, comparing (F105) data entered by the user with texts recognised on the document. For every user-entered data also detected on the recognised text, storing (F106) the area positions on the image where the text equivalent to data entered by the user has been found.
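The comparison used in steps F105 and F106 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the `Area` and `OcrText` types, the `locate_areas` function and the matching rule (ignore whitespace and letter case) are hypothetical names and choices introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """Hypothetical rectangle on the document image, in pixels."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class OcrText:
    """Hypothetical OCR result: a recognised text and where it was found (F102)."""
    text: str
    area: Area


def normalise(value: str) -> str:
    # Assumed matching rule: ignore whitespace and letter case.
    return "".join(value.split()).casefold()


def locate_areas(entered: dict, ocr_texts: list) -> dict:
    """For every field keyed in by the user, find the image areas whose
    recognised text is equivalent to the entered value (F105), so that the
    caller can store them (F106)."""
    located = {}
    for field_name, value in entered.items():
        wanted = normalise(value)
        if not wanted:
            continue
        matches = [t.area for t in ocr_texts if normalise(t.text) == wanted]
        if matches:
            located[field_name] = matches
    return located


ocr_texts = [
    OcrText("INVOICE DATE", Area(400, 40, 120, 16)),
    OcrText("25/2/2004", Area(400, 60, 80, 16)),
    OcrText("02553570165", Area(40, 120, 110, 16)),
]
entered = {
    "invoice_date": "25/2/2004",
    "vat_number": "0255 3570 165",  # keyed in with spaces
    "notes": "paid",                # not present on the document
}
located = locate_areas(entered, ocr_texts)
```

Fields whose value is not found on the document, such as `notes` here, simply get no stored area.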

Once the user, through the procedure according to the present invention, has stored different documents of the same type in a storing and/or processing system, the procedure will have enough information available for loading the program fields with data detected on the document image.
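Some of the logics listed above for choosing among stored areas (most frequent, latest, most frequent with recent documents weighing more) might be sketched like this. The function names, the representation of an area as an `(x, y, width, height)` tuple and the exponential `decay` weighting are assumptions made for illustration; the description does not prescribe any specific weighting.

```python
from collections import Counter


def most_frequent(history):
    """Logic a): the area most often associated with the field."""
    return Counter(history).most_common(1)[0][0]


def latest(history):
    """Logic b): the area used in the latest document of the same type."""
    return history[-1]


def recency_weighted(history, decay=0.5):
    """Logic d): most frequent area, weighing recent documents more.
    Each step back in time multiplies the weight by `decay` (assumed scheme)."""
    scores = {}
    weight = 1.0
    for area in reversed(history):
        scores[area] = scores.get(area, 0.0) + weight
        weight *= decay
    return max(scores, key=scores.get)


# Areas stored for one field on five documents of the same type, oldest first.
area_a = (400, 60, 80, 16)
area_b = (400, 90, 80, 16)
history = [area_a, area_a, area_a, area_b, area_b]

choice_frequent = most_frequent(history)    # area_a: used 3 times out of 5
choice_latest = latest(history)             # area_b: used on the last document
choice_weighted = recency_weighted(history) # area_b: score 1.5 against 0.4375
```

With `decay=1.0` the weighted logic reduces to plain frequency counting.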

It is obvious that this process has many variations. For example, when the user selects the field to be keyed-in, instead of writing the value (or choosing among the proposed areas) he can point out any image area where these data can be found, even if this area has not been previously located.

Selecting the identifying area that is useful for locating the document type can be automatically performed by the procedure or can be manually performed by the user.
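As an illustration of identifying the document type (F109) by comparing identifying images, the sketch below uses a zero-mean normalised cross-correlation between two equally sized greyscale images. The description only names "cross-correlation" as one possible similarity measure; the normalised variant, the representation of images as lists of pixel rows, the `identify_type` helper and the `threshold` value are assumptions introduced here.

```python
import math


def normalised_cross_correlation(a, b):
    """Zero-mean normalised cross-correlation of two equally sized
    greyscale images given as lists of rows. Result lies in [-1, 1]."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("images must be non-empty and of equal size")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    da = [p - mean_a for p in flat_a]
    db = [p - mean_b for p in flat_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a uniform image carries no pattern to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def identify_type(candidate, known_types, threshold=0.9):
    """Return the name of the best-matching stored document type, or None
    when no stored identifying image is similar enough (assumed threshold)."""
    best_name, best_score = None, threshold
    for name, reference in known_types.items():
        score = normalised_cross_correlation(candidate, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Tiny 2x2 "logos" standing in for stored identifying images.
known_types = {
    "supplier_a": [[0, 255], [255, 0]],
    "supplier_b": [[255, 255], [0, 0]],
}
scanned_logo = [[10, 250], [240, 5]]  # noisy scan of supplier_a's logo
document_type = identify_type(scanned_logo, known_types)
```

When `identify_type` returns `None`, the document is treated as a new type and the user is asked to enter the data as in the earlier steps.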

As an alternative to associating data to a fixed image position, they can be associated with a nearby image or text, such as for example the same data description. The system can store, for example, that the invoice date of a given document type can be exactly found under the text 'INVOICE DATE' .
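The association with a nearby text, rather than with a fixed position, could look like the following sketch. The `OcrText` type, the `text_below_label` function and the horizontal tolerance `max_dx` are hypothetical; coordinates are assumed to be in pixels with `y` growing downwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OcrText:
    """Hypothetical OCR result with the top-left corner of the text."""
    text: str
    x: int
    y: int  # grows downwards


def text_below_label(ocr_texts, label, max_dx=30) -> Optional[str]:
    """Return the nearest text found under `label`, within `max_dx` pixels
    of the label's horizontal position, or None if there is none."""
    anchor = next(
        (t for t in ocr_texts if t.text.strip().upper() == label.upper()),
        None,
    )
    if anchor is None:
        return None
    below = [
        t for t in ocr_texts
        if t.y > anchor.y and abs(t.x - anchor.x) <= max_dx
    ]
    if not below:
        return None
    return min(below, key=lambda t: t.y - anchor.y).text


ocr_texts = [
    OcrText("INVOICE DATE", 400, 40),
    OcrText("25/2/2004", 402, 60),
    OcrText("TOTAL", 400, 300),
    OcrText("Acme S.p.A.", 40, 60),
]
invoice_date = text_below_label(ocr_texts, "INVOICE DATE")
```

This variant keeps working when the data shift on the page, as long as they stay under their description.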

A customer of a public accountant has several types of accounting documents, but normally a high number of these documents are of the same type as previously loaded documents. For example, when a customer periodically purchases goods from one of his suppliers, he will periodically have invoices from this supplier. These invoices usually have some picture or logo that allows identifying them, and generally always have the same lay-out, so that the same data types (such as for example the 'invoice date') can always be found in the same position.

Quite a number of documents come from few suppliers at municipality, province, regional or national level. This is the case for telephone services, electric energy, gas, etc. invoices. The identification of these documents can be used for many customers of the same public accountant that use the same suppliers.

If the process according to the present invention is not used as associated with other existing procedures for assisting the user in choosing VAT items and accounting accounts, the user himself, in addition to data found on the document, will have to choose VAT items and accounting accounts for every amount that has to be loaded. These data cannot be found on the document, and depend on customer activities, on document type and on associated amount type. Therefore, for these fields, the system will not try and associate document areas, but will store the values chosen by the user for such customer/document type/amount type.

For example, if, for a given customer, the user keys-in a telephone invoice, and the amount 'Telephone Subscription' is associated with account 105 and VAT item 118, the system will remember these values, and when a new telephone invoice has to be keyed-in for such customer, the system will propose, for the 'Telephone Subscription' amount, account 105 and VAT item 118.

Even if the telephone invoice can be identical as type for different customers, the choice of accounting accounts and VAT item depends on the type of customer activity, and therefore the choices made for a customer must not be used for another customer.
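A minimal sketch of how such choices could be remembered per customer, document type and amount type is given below. The `AccountingChoices` class and its method names are hypothetical; only the key structure and the example values (account 105, VAT item 118) come from the description.

```python
from typing import Optional, Tuple


class AccountingChoices:
    """Remembers the account and VAT item chosen by the user for each
    (customer, document type, amount type) combination."""

    def __init__(self):
        self._choices = {}

    def remember(self, customer, document_type, amount_type, account, vat_item):
        key = (customer, document_type, amount_type)
        self._choices[key] = (account, vat_item)

    def propose(self, customer, document_type, amount_type) -> Optional[Tuple[int, int]]:
        # The customer is part of the key, so a choice made for one
        # customer is never proposed for another one.
        return self._choices.get((customer, document_type, amount_type))


choices = AccountingChoices()
choices.remember("customer_1", "telephone_invoice", "Telephone Subscription", 105, 118)

same_customer = choices.propose("customer_1", "telephone_invoice", "Telephone Subscription")
other_customer = choices.propose("customer_2", "telephone_invoice", "Telephone Subscription")
```

Here `same_customer` is `(105, 118)`, while `other_customer` is `None` even though the document type is identical.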

When a comparison is made between data entered by the user and data present in the document, it must be taken into account that data entered by the user can comply with writing standards that do not necessarily coincide with those used in the document. For example, if the entered field is a date, the user may write '25/2/2004' in the data entry program, while in the document the same date can be found in different formats, such as for example '2502 2004', or '25-2-2004', or '25 February 2004', etc. The comparison function must be flexible and accept many possible formats in such a way as to find the corresponding data on the document. Other examples are the use or not of dots or commas for thousands and for decimals, the separators for credit card numbers, etc.
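One way to obtain such a flexible comparison for dates is to parse both texts into a common value and compare the values instead of the strings. The sketch below assumes day-first dates, an English locale for month names and a small, purely illustrative list of accepted formats; `parse_date` and `same_date` are hypothetical names.

```python
from datetime import date, datetime
from typing import Optional

# Assumed set of accepted formats (day first); a real list would be longer.
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y", "%d %B %Y")


def parse_date(text: str) -> Optional[date]:
    """Try every accepted format and return the first successful parse."""
    cleaned = " ".join(text.split())
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None


def same_date(entered: str, found: str) -> bool:
    """True when both texts are dates and denote the same day."""
    a, b = parse_date(entered), parse_date(found)
    return a is not None and a == b


matches_dashes = same_date("25/2/2004", "25-2-2004")
matches_words = same_date("25/2/2004", "25 February 2004")
matches_other_day = same_date("25/2/2004", "26/2/2004")
```

The same approach (parse, then compare values) applies to amounts and other typed fields.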

In the same way, the step that automatically loads data in fields of the data entry program depending on texts found in the document, will have to take care of changing the found text format so that it is compatible with the format required by the data entry program. As an alternative, a data entry program can be realised that accepts different formats for keying-in the values.
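As an illustration of changing the found text format, the sketch below converts an amount written with either dots or commas into a single format. It rests on explicit assumptions that are not in the description: amounts have at most two decimal digits, so a final separator followed by one or two digits is taken as the decimal separator and every other separator as a thousands separator, and the data entry program is assumed to expect a dot with two decimals. `parse_amount` and `to_entry_format` are hypothetical names.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_amount(text: str) -> Optional[Decimal]:
    """Parse amounts such as '1.234,56', '1,234.56' or '12.000'."""
    cleaned = "".join(text.split())
    if not re.fullmatch(r"\d[\d.,]*", cleaned):
        return None
    # Assumption: a trailing separator plus 1-2 digits marks the decimals.
    decimals_match = re.search(r"[.,](\d{1,2})$", cleaned)
    if decimals_match:
        whole = cleaned[:decimals_match.start()]
        decimals = decimals_match.group(1)
    else:
        whole, decimals = cleaned, ""
    digits = re.sub(r"[.,]", "", whole)
    return Decimal(f"{digits}.{decimals}" if decimals else digits)


def to_entry_format(text: str) -> Optional[str]:
    """Rewrite a found amount in the format assumed for the entry program."""
    amount = parse_amount(text)
    return None if amount is None else f"{amount:.2f}"


italian_style = to_entry_format("1.234,56")
english_style = to_entry_format("1,234.56")
thousands_only = to_entry_format("12.000")
```

Under these assumptions '12.000' is read as twelve thousand; a document that really uses three decimal digits would need a different rule.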

Moreover, an advantageous aspect of the present invention is that, depending on individual needs, some process steps can be executed in a different order from the above-described one. For example, an image area can be selected that will be used to identify this type of documents even before requesting the user to enter data in fields.

A variation of the present process consists in adding a functionality for an automatic completion of the field value, that can be implemented as an alternative or in addition to the above-described process.

Many word-processing programs, spreadsheet programs and others have a functionality for automatically completing the text that a user is writing. When a user starts writing a word on some word processors, the program tries to deduce what the user is writing and proposes the complete word. The user can confirm the text proposed by the program by pressing a key (usually 'Enter') or he can go on writing what he wishes. These word processors contain a list of terms, and when a user starts writing a word, they compare the letters written by the user with the first letters of every term which can be found in the list. When the words in the list that start with the letters written by the user are few or only one, the program proposes its choice to the user.

This process inside a word processing program has a limited usefulness. Normally the list of terms is made of the vocabulary of the used language, or of all terms used by the user in other documents. In any case, the list is very long and consequently the number of words starting with the same letters is very high. The program can determine with a high level of certainty the word that the user is writing only when keying-in of the word is almost finished, consequently not generating a big work saving for a user.

In a data entry job, a similar mechanism can instead generate big benefits if implemented in a particular way. It is necessary to build the list from the texts found through the OCR process on the document that is being entered at that time. By building the list in this way, the program will have an extremely short list available. Consequently, the chance of finding early and without ambiguities what a user is writing is very high. For example, suppose that the user must write a VAT number, which can be found on an invoice he is entering. The VAT number in our example is 02553570165. These data will be present in the list of texts found by the OCR. Once the user has written 025, it is very likely that in the present invoice there will be no other text starting with 025, and therefore the program can propose the completion of the whole field with 02553570165. As a result, the user has had to key-in only 3 characters and the program has proposed the 8 remaining characters, saving him about 73% of the keying-in.
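The completion mechanism described here can be sketched in a few lines. The `propose_completion` function and the sample list of OCR texts are hypothetical; only the VAT number and the typed prefix come from the example above.

```python
from typing import Optional


def propose_completion(typed: str, ocr_texts) -> Optional[str]:
    """Propose a completion only when exactly one distinct text recognised
    on the current document starts with the characters typed so far."""
    if not typed:
        return None
    candidates = {t for t in ocr_texts if t.startswith(typed)}
    if len(candidates) == 1:
        return candidates.pop()
    return None


# Texts assumed to have been recognised on the invoice being entered.
ocr_texts = ["INVOICE", "02553570165", "02/03/2004", "1.200,00"]

ambiguous = propose_completion("02", ocr_texts)   # two texts start with "02"
proposal = propose_completion("025", ocr_texts)   # only the VAT number is left
saved_characters = len(proposal) - len("025")
```

Because the list holds only the texts of one document, a short prefix is usually enough to make the proposal unambiguous.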

This method for automatically completing the fields depending on data found in the document can be useful alone, particularly for entering data from documents whose type is never or almost never repeated. It can also be used together with the previously described process. In this latter case, the program can limit the list of texts to those that can be found in the areas found during step F110 for filling-in the field upon entry (F104), greatly limiting the number of texts in the list and allowing the field values to be anticipated and completed almost immediately.

This process obviously has many variations. For example, to the list of texts found on the document by the OCR process, some very frequently used data can be added, such as for example the current date.

Another variation to the described processes consists in selecting available data for loading or completing a certain field depending on the type of data expected for such field.

Alternatively or in addition to the described processes, available texts for loading or completing a field can be limited to the only texts that have the same type of field to be keyed-in.

If for example the field to be keyed-in is "invoice date", it is expected that the text to be found on the original document is of the "date" type. In this case the system could limit the choice only to those data that satisfy this type of data.

A first way of using this mechanism is proposing to the user, for every field, the texts found on the document that satisfy this format. If, as pointed out in the example, the user must fill-in the "invoice date", the system could propose a list with dates that had been found in the document. They could be only one or two and therefore the selection would be easy. In order to make it still easier, the program can highlight the area in the original document where each one of the dates shown for the choice can be found.

A second way of using this mechanism is limiting the areas used by the step F111 of automatically searching the text and/or the step F113 of highlighting the alternative areas from which the text can be taken only to those whose text has data of the desired type. If in this example of search for the "invoice date", the program found 3 areas used in the previous documents of the same type in order to take these data, step F111 would verify on which of the 3 areas the found text corresponds to a date, removing choices that do not correspond. Step F113 would not highlight the areas whose text does not correspond to a date.

A third way of using this mechanism is during the automatic completion. When the user starts manually compiling a field and the program searches which texts on the original document start with characters keyed-in by the user, it will limit this choice to the only texts that correspond to the searched data type. In the example with "invoice date", if the user keys-in "12" and the program finds on the list of texts "12.000", "125" and "12/12/2003" the program will discard texts "12.000" and "125" because they do not correspond to a date and will automatically complete the field with a single text that starts with "12" and is a date: "12/12/2003".
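The "invoice date" example above can be reproduced with the following sketch. The regular expression used to decide whether a text is a date, the `TYPE_CHECKS` table and the function names are assumptions made for illustration; a real system would use a more complete date recogniser.

```python
import re
from typing import Optional

# Assumed, deliberately simple notion of "date": d/m/yyyy with /, . or - .
DATE_PATTERN = re.compile(r"\d{1,2}[/.-]\d{1,2}[/.-]\d{4}")


def is_date(text: str) -> bool:
    return DATE_PATTERN.fullmatch(text) is not None


# Hypothetical table mapping a field type to its checking function.
TYPE_CHECKS = {"date": is_date}


def propose_typed_completion(typed: str, ocr_texts, field_type: str) -> Optional[str]:
    """Complete the field only with texts that start with the typed
    characters and also satisfy the data type expected for the field."""
    check = TYPE_CHECKS[field_type]
    candidates = {t for t in ocr_texts if t.startswith(typed) and check(t)}
    return candidates.pop() if len(candidates) == 1 else None


ocr_texts = ["12.000", "125", "12/12/2003"]
completion = propose_typed_completion("12", ocr_texts, "date")
```

All three texts start with "12", but only "12/12/2003" passes the date check, so it is the single remaining candidate.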

The advantages of this process are clear. Once the user has entered many documents for a given customer, when he has to enter a new one, it is very likely that a document of the same type has already been entered. In this case, the system will automatically perform the compilation of data entry program fields. The user's work decreases. Instead of having to search for data on the document and to manually key them in, the user need only check that data entered by the procedure are correct. Since the procedure, in addition to loading the value in the field, highlights on the document image the area from which it took the data and, moreover, the areas from which it could alternatively take them, user checks are further simplified.

In this way, the user can proceed more quickly, being spared much of the effort of searching for data on the document and manually entering them. Moreover, the chance of making keying-in errors also decreases.

The choices that are automatically made by the procedure progressively improve. In fact, the more the loaded documents are, the more information the procedure has for evaluating where every data must be taken, and therefore the more reliable the choices that are proposed to the user become.

The invention also refers to a computer program product comprising computer program code means adapted to run all the steps of the above-described process when such program is run on a computer.

The invention also refers to a computer program as defined and contained on a computer-readable medium. Moreover, the present invention also refers to a data entry system for electronically practising the process according to the present invention, comprising: means for acquiring documents in a digital form; means for manually entering data; means for displaying digital data, documents and images; means for storing documents and/or data contained in such documents; and means for processing documents and/or data contained in such documents.

Claims

CLAIMS

1. Process for entering data in a system for storing and/or processing such data, characterised in that it comprises the steps of: obtaining (F101) a digital image of a document; recognising (F102), from said digital image, characters and texts and storing said characters and/or said texts and a position thereof in said digital image; displaying (F103) to a user a digital image of said document and entering (F104) said data in related fields; locating (F105) and storing (F106) positions of areas of said image related to said fields in which data are present that are equivalent to said data entered by a user; identifying (F109) a type of said document; searching (F110) for which of said areas of said digital image have been stored in said already-loaded documents; searching (F111) for which text is present in said document in a related one of said areas and automatically loading (F112) said text in said field; and for every data entered by said user or detected in said text, storing (F106) positions of areas on said digital image in which a text has been found which is equivalent to said data entered by said user.

2. Process according to claim 1, characterised in that said data are accounting data.

3. Process according to claim 1, characterised in that said digital image is obtained through scanning.

4. Process according to claim 1, characterised in that, in order to enter (F104) said data in related fields, a data entry program is used for manual entry and/or the field to be entered is selected and, on the digital document image, the area from which data must be extracted.

5. Process according to claim 1, characterised in that, in order to locate (F105) said area positions, it is necessary to compare said data entered in related fields with said recognised texts on said document.

6. Process according to claim 1, characterised in that, in order to identify (F105) said area positions, said user manually points out said positions on said digital image.

7. Process according to claim 1, characterised in that it further comprises the steps of selecting (F107) an identifying area of said image of said document as identification of a type of said document, and storing (F108) a position of said identifying area and an identifying image contained in said area.

8. Process according to claim 1, characterised in that, in order to identify (F109) said document type, it is necessary to compare said identifying area and said identifying image with identifying areas and identifying images of already-stored types of documents.

9. Process according to claim 1, characterised in that, in order to automatically load (F112) said text in said field, said field is associated with at least one area that had been previously associated with said field by documents of the same type with a higher frequency.

10. Process according to claim 1, characterised in that, in order to automatically load (F112) said text in said field, said field is associated with at least one area that had been previously associated with said field by a document of the same type chosen by said user as model.

11. Process according to claim 1, characterised in that it further comprises the steps of: highlighting (F113) , on said digital image of said document, areas from which said data had been stored; if necessary, changing (F114) a choice of said highlighted area with a choice of another highlighted area and replacing (F115) a value loaded in said field for said text that is found in said selected area; and storing (F117) said highlighted area.

12. Process according to claim 11, characterised in that it further comprises the step of limiting the areas used by the step (F113) of highlighting the alternative areas where the text can be taken only to those in which the text has data of the desired type.

13. Process according to claim 1, characterised in that it further comprises the steps of manually modifying or writing (F116) said value in said field; and, for all said manually keyed-in fields, comparing (F105) said manually entered data with said recognised texts on said document.

14. Process according to any one of the previous claims, characterised in that said step of recognising (F102) said texts and characters from said digital image occurs through an OCR - Optical Character Recognition program.

15. Process according to claim 1, characterised in that it further comprises the step of automatically completing the field value.

16. Process according to claim 15, characterised in that said step of automatically completing the field value comprises creating a list of predefined words through the texts found by means of an OCR process on the document that is currently being entered.

17. Process according to claim 16, characterised in that said step of automatically completing the field value comprises limiting the list of texts to those being present on areas found during step (F110) for compiling the field when entering it (F104).

18. Process according to claim 16, characterised in that said step of automatically completing the field value comprises adding some data used much more frequently, for example the current date, to the list of texts found on the document by the OCR process.

19. Process according to claim 1, characterised in that it further comprises the step of limiting the areas used by the step (F111) of automatically searching the text where the text can be taken only to those in which the text has data of a desired type.

20. Process according to claim 15, characterised in that said step of automatically completing the field value comprises limiting available texts for loading or completing a field only to those texts that have the same type as of the field to be keyed-in.

21. Process according to claim 20, characterised in that it comprises the further step of highlighting the area of the original document where each one of the solutions shown for the choice can be found.

22. Computer program product comprising computer program code means adapted to perform all steps of the process according to any one of the previous claims when such program is run on a computer.

23. Computer program product according to claim 22 and contained on a computer-readable medium.

24. Data entry system adapted to perform said process according to any one of claims 1 to 21, characterised in that it comprises means for acquiring documents in a digital form, means for manually entering data, means for displaying digital data, documents and images, means for storing documents and/or data contained in such documents, and means for processing documents and/or data contained in such documents.