CN100386761C - Data file merging method - Google Patents


Info

Publication number: CN100386761C
Authority: CN (China)
Prior art keywords: record, data file, current, file, data
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Application number: CNB2005101145884A
Other languages: Chinese (zh)
Other versions: CN1746894A (en)
Inventors: 张亚栋, 赵云飞
Current Assignee: Beijing Helishi System Integration Co Ltd (the listed assignees may be inaccurate)
Original Assignee: Beijing Hollysys Co Ltd
Application filed by: Beijing Hollysys Co Ltd
Priority to: CNB2005101145884A
Publications: CN1746894A (application), CN100386761C (grant)

Abstract

The present invention discloses a data file merging method. Starting from the current records of two data files, the first identical record in the two files is located. If the current record of either file is not that identical record, the current record of that file and all records between it and the first identical record are copied to a target file, and the identical record then becomes the new current record of both files. Starting from the current records, large runs of identical records in the remaining data are merged using in-memory comparison, and the procedure is repeated until the files are merged. The method merges data files quickly, efficiently and reliably, reduces the amount of computation required for merging, and improves the stability of the running system. It can also be used to merge the data files of redundant dual-machine systems.

Description

A data file merging method
Technical field
The present invention relates to data file processing technology and, in particular, to a method for merging data files.
Background art
At present, many monitoring systems, such as integrated supervisory control systems for rail transit, are generally required to use a dual-machine redundant configuration and to retain data such as events, alarms and operation logs for a long time. The two redundant servers therefore each hold a copy of the log file. Abnormal factors such as server maintenance or network failure may cause the records on the two servers to become inconsistent. Because an operator's query must return a single, complete set of data, the two files need to be merged.
In practice, the data files of a redundant dual-machine system have the following characteristics:
1. Large runs of identical records exist. Most of the time, the recorded content of the two files is completely consistent.
2. One file may omit a small number of records. Occasionally, because of system or network problems, one file may miss one or two records.
3. In rare cases, one file may lack a large run of records. A hardware fault or human intervention may take one server out of service, so that this server is missing a large run of records.
4. The files contain many records, possibly tens of thousands.
At present, such a pair of data files is usually merged by comparing the records in the files one by one. This requires a large amount of computation and a long run time, which increases the system load during merging and affects the stability of the whole system. Therefore, for data files that are large and mostly identical, but that occasionally differ in a few records or lack a large run of data, providing a fast and reliable merging method has become a pressing problem.
Summary of the invention
The technical problem to be solved by the present invention is to provide a data file merging method that merges data files quickly, efficiently and reliably and reduces the amount of computation required for merging.
To solve this technical problem, the invention provides a data file merging method for merging a first data file and a second data file that share a large amount of identical data, characterized by comprising the following steps:
(a) set the first record of the first data file and the first record of the second data file as the current record of each file;
(b) starting from the current records, find the first identical record in the two files; if the current record of either file is not this identical record, copy the current record of that file, together with all records between it and the first identical record, to the target file;
(c) update the current record of both files to this first identical record, calculate the number of remaining records in whichever of the first and second data files has fewer remaining records, and set the current comparison quantity to a value less than or equal to that number;
(d) starting from the current records, take a number of records equal to the current comparison quantity from each of the first and second data files and compare them as a whole; if they are identical, execute step (e); otherwise, take a fraction of the current comparison quantity as the new current comparison quantity and continue comparing in the same way until the records taken from the two data files are identical, then proceed to the next step;
(e) copy all the records taken from one of the files to the target file, then determine whether the records of the two data files have all been copied: if both files have been fully copied, end; if neither has been fully copied, execute step (f); if only one has been fully copied, execute step (g);
(f) update the current record of each of the two data files to the record following the last record taken from it, and return to step (b);
(g) copy the remaining records of the other data file to the target file, and end.
Further, the above method may also have the following feature: step (b) further comprises the following steps:
(b1) starting from the current record of the first data file, search for a record identical to the current record of the second data file; if one is found, execute step (b2); otherwise execute step (b3);
(b2) if the current record of the first data file is not this identical record, copy the current record of the first data file, together with all records between it and the record identical to the current record of the second data file, to the target file; then execute step (c);
(b3) starting from the current record of the second data file, search for a record identical to the current record of the first data file; if one is found, execute step (b4); otherwise execute step (b5);
(b4) if the current record of the second data file is not this identical record, copy the current record of the second data file, together with all records between it and the record identical to the current record of the first data file, to the target file; then execute step (c);
(b5) copy the current records of the first and second data files to the target file, take the record following the current record in each file as the new current record, and return to step (b1).
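Steps (b1) to (b5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the records are assumed to be held in in-memory lists and compared with `==`, the names `find_first_common` and `_find` are hypothetical, and the bounds check that stops the loop when either file runs out is an addition that the text does not spell out.

```python
from typing import List, Tuple


def _find(records: List, item, start: int) -> int:
    """Return the index of `item` in `records` at or after `start`, or -1."""
    try:
        return records.index(item, start)
    except ValueError:
        return -1


def find_first_common(a: List, b: List, i: int, j: int,
                      out: List) -> Tuple[int, int, bool]:
    """Staggered search for the first identical record (hypothetical helper).

    `i` and `j` are the current-record indices of files `a` and `b`.
    Skipped records are appended to `out`. Returns (new_i, new_j, found).
    """
    while i < len(a) and j < len(b):      # assumption: stop when a file ends
        k = _find(a, b[j], i)             # (b1) look in A for B's current record
        if k >= 0:
            out.extend(a[i:k])            # (b2) copy the records A has in between
            return k, j, True
        m = _find(b, a[i], j)             # (b3) look in B for A's current record
        if m >= 0:
            out.extend(b[j:m])            # (b4) copy the records B has in between
            return i, m, True
        out.append(a[i])                  # (b5) neither record exists in the
        out.append(b[j])                  #      other file: keep both, advance
        i += 1
        j += 1
    return i, j, False
```

For instance, with `a = [6, 7]` and `b = [1, 2, 3, 4, 5, 6, 7]`, the call `find_first_common(a, b, 0, 0, out)` returns `(0, 5, True)` and leaves `out == [1, 2, 3, 4, 5]`.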
Further, the above method may also have the following feature: in step (c), the smaller of the remaining record counts of the first and second data files is set as the current comparison quantity.
Further, the above method may also have the following feature: the whole comparison in step (d) is performed by loading the records taken from the first and second data files into memory and carrying out an in-memory comparison.
Further, the above method may also have the following feature: in step (d), when the result of the whole comparison is not completely identical, 1/3 to 2/3 of the current comparison quantity is taken as the new current comparison quantity and the comparison continues.
Further, the above method may also have the following feature: in step (d), when the result of the whole comparison is not completely identical, 1/2 of the current comparison quantity is taken as the new current comparison quantity and the comparison continues.
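The shrinking comparison of steps (c) and (d) might look like the following in Python. It is a sketch only: the function name `matching_block_length` and the `shrink` parameter are hypothetical, list slices stand in for the in-memory block comparison, and the code assumes the precondition established by step (b), namely that the two current records `a[i]` and `b[j]` are identical.

```python
from typing import List


def matching_block_length(a: List, b: List, i: int, j: int,
                          shrink: float = 0.5) -> int:
    """Return a block length n such that a[i:i+n] == b[j:j+n].

    Assumes a[i] == b[j], so a block of length 1 always matches.
    `shrink` is the fraction kept after a failed comparison; the text
    allows 1/3 to 2/3 and names 1/2 as one choice.
    """
    # Step (c): start with the smaller remaining record count.
    n = min(len(a) - i, len(b) - j)
    # Step (d): compare whole blocks, shrinking the block on mismatch.
    while n > 1 and a[i:i + n] != b[j:j + n]:
        n = max(1, int(n * shrink))
    return n
```

With `a = [1, 2, 3, 4, 5, 6, 7, 8]` and `b = [1, 2, 3, 9, 5, 6, 7, 8]`, the blocks of length 8 and 4 differ and the block of length 2 matches, so the function returns 2. Note that the result is a matching block, not necessarily the longest one; the outer loop of the method picks up the rest on the next pass.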
Further, the above method may also have the following feature: in step (e), when it is determined that neither the first nor the second data file has been fully copied, the following check is performed first: determine whether the number of records in the target file is less than the sum of the record counts of the first and second data files; if so, execute step (f); otherwise report an error and end.
As can be seen from the above, the present invention finds the start and end positions of large runs of identical records and merges the identical data in a single operation, which greatly improves the efficiency of comparison and merging. Furthermore, missing data can be merged by the staggered search for identical records, and in-memory comparison and similar means further reduce the amount of computation required for merging.
Description of drawings
Figure 1A and Figure 1B are flowcharts of the method of an embodiment of the invention, each showing part of the steps.
Embodiment
The invention is described in detail below, taking the merging of the data files of redundant dual machines in a rail transit integrated supervisory control system as an example. As shown in Figure 1, the method of this embodiment comprises the following steps:
Step 101: set the first record of the first data file and the first record of the second data file as the current record of each file;
Step 102: calculate the remaining record counts of the first and second data files and determine whether the smaller of the two is greater than zero; if so, proceed to the next step; otherwise perform exception handling (under normal conditions this branch is not taken) and end;
Step 103: in the first data file, starting from its current record, search for a record identical to the current record of the second data file; if one is found, execute step 104; otherwise execute step 105;
Step 104: if the current record of the first data file is not this identical record, copy the current record of the first data file, together with all records between it and the record identical to the current record of the second data file, to the target file; execute step 108;
Step 105: in the second data file, starting from its current record, search for a record identical to the current record of the first data file; if one is found, execute step 106; otherwise execute step 107;
Step 106: if the current record of the second data file is not this identical record, copy the current record of the second data file, together with all records between it and the record identical to the current record of the first data file, to the target file; execute step 108;
Step 107: copy the current records of the first and second data files to the target file, take the record following the current record in each file as the new current record, and return to step 103;
Step 108: update the current record of each data file to the identical record in that file, recalculate the remaining record counts of the first and second data files, and set the smaller of the two as the current comparison quantity;
Step 109: starting from the current records of the two data files, take the number of records given by the current comparison quantity from each file, load them into memory and compare them as a whole; if the contents are identical, execute step 110; otherwise execute step 111;
Step 110: copy the records taken from the first data file or the second data file to the target file, and update the current record of each data file to the record following the last record taken from it, that is, add the current comparison quantity to the sequence number of the current record; execute step 112;
Step 111: take half of the current comparison quantity as the new current comparison quantity and return to step 109 to continue comparing;
Step 112: determine whether both the first and second data files have been fully copied; if so, end; otherwise proceed to the next step;
Step 113: determine whether one of the first and second data files has been fully copied; if so, execute step 115; otherwise proceed to the next step;
Step 114: neither data file has been fully copied; determine whether the number of records in the target file is less than the sum of the record counts of the first and second data files; if so, execute step 102; if not, report an error and end;
Step 115: copy all remaining records of the data file that has not been fully copied to the target file, and end.
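Steps 101 to 115 can be put together in a short Python sketch. It rests on several assumptions that are not in the text: both files are already loaded as lists of records compared with `==`, the names `merge_records` and `_find` are hypothetical, the loop condition doubles as the step 102 check (so the path back from step 107 also passes through it), and an exhausted file leads to copying the other file's tail rather than to separate exception handling.

```python
from typing import List


def _find(records: List, item, start: int) -> int:
    """Return the index of `item` in `records` at or after `start`, or -1."""
    try:
        return records.index(item, start)
    except ValueError:
        return -1


def merge_records(a: List, b: List) -> List:
    """Merge two mostly identical record lists (illustrative sketch)."""
    out: List = []
    i = j = 0                                  # step 101
    while i < len(a) and j < len(b):           # step 102: both have records left
        k = _find(a, b[j], i)                  # step 103
        if k >= 0:
            out.extend(a[i:k])                 # step 104: records only A has
            i = k                              # step 108
        else:
            m = _find(b, a[i], j)              # step 105
            if m < 0:
                out.append(a[i])               # step 107: keep both records
                out.append(b[j])
                i += 1
                j += 1
                continue
            out.extend(b[j:m])                 # step 106: records only B has
            j = m                              # step 108
        n = min(len(a) - i, len(b) - j)        # step 108: comparison quantity
        while a[i:i + n] != b[j:j + n]:        # step 109: whole-block compare
            n //= 2                            # step 111: halve and retry
        out.extend(a[i:i + n])                 # step 110: copy one block
        i += n
        j += n
        # Step 114: sanity check while neither file is finished.
        if i < len(a) and j < len(b) and len(out) >= len(a) + len(b):
            raise RuntimeError("target file has too many records")
    out.extend(a[i:])                          # step 115: copy the leftover
    out.extend(b[j:])                          # tail of whichever file remains
    return out
```

As a usage example, merging `list(range(1, 11))` with `[1, 2, 3, 5, 6, 7, 8, 10]` (a copy missing records 4 and 9) yields the full sequence 1 to 10. The halving loop always terminates because the two current records are identical after step 108, so a block of length 1 matches.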
An example was tested with the above method. The data file of machine A contained 50,000 records and that of machine B contained 47,800 records (machine B stopped receiving data during two interruptions). Comparing the records one by one took nearly 10 minutes, whereas this algorithm took less than 10 seconds, a highly significant improvement. Repeated tests found no loss of data.
It should be noted that the data file merging method of the present invention is not limited to merging the data files of redundant dual machines in a rail transit integrated supervisory control system, as described in the embodiment. The invention is applicable to merging any data files that are large and mostly identical, and is particularly suitable where a small number of records are occasionally inconsistent or a large run of data is missing.
Various modifications of the method can be made on the basis of the above embodiment.
For example, in another embodiment, the search for the first identical record of the two data files, starting from the current records, need not be limited to the staggered search of the embodiment and may also be carried out as follows:
Step A: search from the current record of the first data file for a record identical to the current record of the second data file; if one exists, the first identical record has been found; otherwise execute Step B;
Step B: take the record following the current record of the second data file as the new current record, and return to Step A to continue searching.
This approach also achieves the basic function of the invention. However, with the staggered search used in the embodiment, when one file lacks a large run of records, the first identical record starting from the two current records can be found in just one or two searches. For example, suppose the first data file contains records 6, 7, ... and lacks records 1 to 5, so that its current record is 6, while the second data file contains records 1, 2, 3, 4, 5, 6, 7, ... and its current record is 1. With the staggered search, when the current record of the second data file (sequence number 1) cannot be found in the first data file, the current record 6 of the first data file is searched for in the second data file, so the first identical record is found in 2 searches. With the method that searches only in one data file, 6 searches of the first data file are needed before the first identical record is found. The staggered search is therefore more efficient.
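The search counts in this example can be reproduced with a small Python sketch. The function names `count_staggered` and `count_one_sided` are hypothetical, a "search" is counted as one pass over a file looking for one record, and the records are plain integers held in lists.

```python
from typing import List


def count_staggered(a: List, b: List) -> int:
    """Count searches used by the staggered method of the embodiment."""
    i = j = 0
    searches = 0
    while i < len(a) and j < len(b):
        searches += 1
        if b[j] in a[i:]:        # look in A for B's current record
            return searches
        searches += 1
        if a[i] in b[j:]:        # look in B for A's current record
            return searches
        i += 1                   # neither found: advance both files
        j += 1
    return searches


def count_one_sided(a: List, b: List) -> int:
    """Count searches used by Steps A and B (searching only in A)."""
    j = 0
    searches = 0
    while j < len(b):
        searches += 1
        if b[j] in a:            # Step A: A's current record stays at the start
            return searches
        j += 1                   # Step B: advance B's current record
    return searches


first = [6, 7]                   # records 1 to 5 are missing
second = [1, 2, 3, 4, 5, 6, 7]
print(count_staggered(first, second))   # prints 2
print(count_one_sided(first, second))   # prints 6
```

The gap widens with the length of the missing run: the one-sided count grows by one for every missing record, while the staggered count stays at 2 in this situation.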
As another example, in another embodiment, when the first identical record of the two data files is found, that record may be copied to the target file straight away and the record following it taken as the new current record of both data files. This is equivalent to the method of the embodiment.
As another example, in another embodiment, the current comparison quantity need not be set to the remaining record count of the file with fewer remaining records; a value smaller than that count, such as 1000 to 5000, may be used instead.

Claims (7)

1. A data file merging method for merging a first data file and a second data file that share a large amount of identical data, characterized by comprising the following steps:
(a) set the first record of the first data file and the first record of the second data file as the current record of each file;
(b) starting from the current records, find the first identical record in the two files; if the current record of either file is not this identical record, copy the current record of that file, together with all records between it and the first identical record, to the target file;
(c) update the current record of both files to this first identical record, calculate the number of remaining records in whichever of the first and second data files has fewer remaining records, and set the current comparison quantity to a value less than or equal to that number;
(d) starting from the current records, take a number of records equal to the current comparison quantity from each of the first and second data files and compare them as a whole; if they are identical, execute step (e); otherwise, take a fraction of the current comparison quantity as the new current comparison quantity and continue comparing in the same way until the records taken from the two data files are identical, then proceed to the next step;
(e) copy all the records taken from one of the files to the target file, then determine whether the records of the two data files have all been copied: if both files have been fully copied, end; if neither has been fully copied, execute step (f); if only one has been fully copied, execute step (g);
(f) update the current record of each of the two data files to the record following the last record taken from it, and return to step (b);
(g) copy the remaining records of the other data file to the target file, and end.
2. The method of claim 1, characterized in that step (b) further comprises the following steps:
(b1) starting from the current record of the first data file, search for a record identical to the current record of the second data file; if one is found, execute step (b2); otherwise execute step (b3);
(b2) if the current record of the first data file is not this identical record, copy the current record of the first data file, together with all records between it and the record identical to the current record of the second data file, to the target file; then execute step (c);
(b3) starting from the current record of the second data file, search for a record identical to the current record of the first data file; if one is found, execute step (b4); otherwise execute step (b5);
(b4) if the current record of the second data file is not this identical record, copy the current record of the second data file, together with all records between it and the record identical to the current record of the first data file, to the target file; then execute step (c);
(b5) copy the current records of the first and second data files to the target file, take the record following the current record in each file as the new current record, and return to step (b1).
3. The method of claim 1, characterized in that, in step (c), the smaller of the remaining record counts of the first and second data files is set as the current comparison quantity.
4. The method of claim 1, characterized in that the whole comparison in step (d) is performed by loading the records taken from the first and second data files into memory and carrying out an in-memory comparison.
5. The method of claim 1, characterized in that, in step (d), when the result of the whole comparison is not completely identical, 1/3 to 2/3 of the current comparison quantity is taken as the new current comparison quantity and the comparison continues.
6. The method of claim 5, characterized in that, in step (d), when the result of the whole comparison is not completely identical, 1/2 of the current comparison quantity is taken as the new current comparison quantity and the comparison continues.
7. The method of claim 1, characterized in that, in step (e), when it is determined that neither the first nor the second data file has been fully copied, the following check is performed first: determine whether the number of records in the target file is less than the sum of the record counts of the first and second data files; if so, execute step (f); otherwise report an error and end.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005101145884A 2005-10-26 2005-10-26 Data file merging method

Publications (2)

Publication Number Publication Date
CN1746894A 2006-03-15
CN100386761C 2008-05-07

Family

ID=36166431

Country Status (1)

Country Link
CN (1) CN100386761C (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6389433B1 (en) * 1999-07-16 2002-05-14 Microsoft Corporation Method and system for automatically merging files into a single instance store
US20050154615A1 (en) * 2001-09-05 2005-07-14 Rotter Joann M. System for processing and consolidating records
CN1622094A (en) * 2004-12-24 2005-06-01 北京中星微电子有限公司 Method for file merge

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
数据集成中的一种数据合并技术. 董树明,徐文胜,董逸生.现代计算机,第175期. 2003
数据集成中的一种数据合并技术. 董树明,徐文胜,董逸生.现代计算机,第175期. 2003 *

Also Published As

Publication number Publication date
CN1746894A (en) 2006-03-15


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee

Owner name: BEIJING HELISHI SYSTEM ENGINEERING CORPORATION

Free format text: FORMER NAME OR ADDRESS: BEIJING HELISHI SYSTEM ENGINEERING CO., LTD.

CP01 Change in the name or title of a patent holder

Address after: No. 10, building materials Road, Xisanqi, Beijing, Haidian District

Patentee after: Beijing HollySys System Engineering Co., Ltd.

Address before: No. 10, building materials Road, Xisanqi, Beijing, Haidian District

Patentee before: Beijing HollySys System Engineering Co., Ltd.

TR01 Transfer of patent right

Effective date of registration: 20211116

Address after: 100176 room 3412, floor 4, building 3, yard 2, Desheng Middle Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Patentee after: Beijing Helishi system integration Co., Ltd

Address before: 10 Jiancai Chengzhong Road, Xisanqi, Haidian District, Beijing 100096

Patentee before: Beijing Helishi System Engineering Co., Ltd

TR01 Transfer of patent right