US006968335B2
(12) United States Patent (10) Patent No.: US 6,968,335 B2
Bayliss et al. (45) Date of Patent: Nov. 22, 2005
(54) METHOD AND SYSTEM FOR PARALLEL PROCESSING OF DATABASE QUERIES
(75) Inventors: David Bayliss, Delray Beach, FL (US);
Richard Chapman, Boca Raton, FL (US); Jake Smith, London (GB); Ole Poulsen, Bend, OR (US); Gavin Halliday, Royston (GB); Nigel Hicks, London (GB)
(73) Assignee: Sesint, Inc., Boca Raton, FL (US)
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 419 days.
(21) Appl. No.: 10/293,490
(22) Filed: Nov. 14, 2002
(65) Prior Publication Data
US 2004/0098359 A1 May 20, 2004
(51) Int. Cl.7 G06F 17/30
(52) U.S. Cl. 707/10; 707/3; 707/101; 707/201
(58) Field of Search 707/1, 3, 10, 100-102,
707/201-203; 709/201
(56) References Cited
U.S. PATENT DOCUMENTS
A system and methods for parallel processing of queries to one or more databases are described herein. One or more databases may be distributed among a subset of slave nodes of a global-results processing matrix. A query to the database may be generated using a query-based high-level programming language. The query-based source code then may be converted to intermediary source code in a common programming language and then compiled into a dynamic link library (DLL) or other type of executable. The DLL is then distributed among the slave nodes of the processing matrix, whereupon the slave nodes execute related portions of the DLL substantially in parallel to generate initial query results. The initial query results may then be provided to a master node of the global-results processing matrix for additional processing, whereby the master node is adapted to execute one or more associated portions of the DLL on the initial query results.
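The flow summarized above (distribute the data among slave nodes, run the slave-side portion of the query in parallel, then hand the initial results to a master node) can be illustrated with a minimal sketch. It is not the patented implementation: the compiled DLL is replaced by plain Python functions, the nodes are simulated with threads, and every name (`distribute`, `slave_execute`, `master_execute`, `run_query`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def distribute(records, node_count):
    """Partition the records round-robin across node_count simulated slave nodes."""
    return [records[i::node_count] for i in range(node_count)]


def slave_execute(partition, predicate):
    """Slave-side portion of the query: filter the node's local partition."""
    return [r for r in partition if predicate(r)]


def master_execute(initial_results, key):
    """Master-side portion of the query: merge the initial results and sort them."""
    merged = [r for part in initial_results for r in part]
    return sorted(merged, key=key)


def run_query(records, node_count, predicate, key):
    """Run the slave portions in parallel, then the master portion on their output."""
    partitions = distribute(records, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        initial = list(pool.map(partial(slave_execute, predicate=predicate), partitions))
    return master_execute(initial, key)


# Illustrative data only; not taken from the patent.
people = [
    {"name": "Eve", "age": 52},
    {"name": "Bob", "age": 25},
    {"name": "Cid", "age": 33},
    {"name": "Dee", "age": 29},
    {"name": "Ann", "age": 41},
]
result = run_query(people, 3, lambda r: r["age"] >= 30, lambda r: r["name"])
```

Here the filter runs independently on each partition, and only the sort, which needs all surviving rows, is left to the master step.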
104 Claims, 21 Drawing Sheets
6,108,763 A * 8/2000 Grondalski 712/10
6,192,391 B1 * 2/2001 Ohtani 709/201
6,266,804 B1 7/2001 Isman
6,311,169 B2 10/2001 Duhon
6,427,148 B1 7/2002 Cossock
2004/0098359 A1 * 5/2004 Bayliss et al. 707/1
2004/0098372 A1 * 5/2004 Bayliss et al. 707/3
2004/0098373 A1 * 5/2004 Bayliss et al. 707/3
2004/0098374 A1 * 5/2004 Bayliss et al. 707/3
2004/0098390 A1 * 5/2004 Bayliss et al. 707/7
OTHER PUBLICATIONS
Vincent Coppola, "Killer APP," Men's Journal, vol. 12, No. 3, Apr. 2003, pp. 86-90.
Eike Schallehn et al., "Extensible and Similarity-based Grouping for Data Integration," Department of Computer Science, pp. 1-17.
Rohit Ananthakrishna et al., "Eliminating Fuzzy Duplicates in Data Warehouses," 12 pages.
Peter Christen et al., "Parallel Computing Techniques for High-Performance Probabilistic Record Linkage," Data Mining Group, Australian National University, Epidemiology and Surveillance Branch, Project web page: http://datamining.anu.edu.au/linkage.html, 2002, pp. 1-11.
Peter Christen et al., "Parallel Techniques for High-Performance Record Linkage (Data Matching)," Data Mining Group, Australian National University, Epidemiology and Surveillance Branch, Project web page: http://datamining.anu.edu.au/linkage.html, 2002, pp. 1-27.
Peter Christen et al., "High-Performance Computing Techniques for Record Linkage," Data Matching), Data Mining Group, Australian National University, Epidemiology and Surveillance Branch, Project web page: http://datamining.anu.edu.au/linkage.html, 2002, pp. 1-14.
William E. Winkler, "Matching And Record Linkage," U.S. Bureau of the Census, pp. 1-38.
Peter Christen et al., "High-Performance Computing Techniques for Record Linkage," ANU Data Mining Group, Australian National University, Epidemiology and Surveillance Branch, Project web page: http://datamining.anu.edu.au/linkage.html, pp. 1-11.
William E. Winkler, "The State of Record Linkage and Current Research Problems," U.S. Bureau of the Census, 15 pages.
William E. Winkler, "Advanced Methods For Record Linkage," Bureau of the Census, pp. 1-21.
William E. Winkler, "Frequency-Based Matching in Fellegi-Sunter Model of Record Linkage," Bureau Of The Census Statistical Research Division, Oct. 4, 2000, 14 pages.
William E. Winkler, "State of Statistical Data Editing And Current Research Problems," Bureau Of The Census Statistical Research Division, 10 pages.
The First Open ETL/EAI Software For The Real-Time Enterprise, Sunopsis, A New Generation ETL Tool, "Sunopsis™ v3 expedites integration between heterogeneous systems for Data Warehouse, Data Mining, Business Intelligence and OLAP projects," <www.sunopsis.com>, 6 pages.
Alan Dumas, "The ETL Market and Sunopsis™ v3 Business Intelligence, Data Warehouse & Datamart Projects," 2002, Sunopsis, pp. 1-7.
Teradata Warehouse Solutions, "Teradata Database Technical Overview," 2002, pp. 1-7.
WhiteCross White Paper, May 25, 2000, "wx/des-Technical Information," pp. 1-36.
Teradata Alliance Solutions, "Teradata and Ab Initio," pp. 1-2.
Peter Christen et al., The Australian National University, "Febrl - Freely extensible biomedical record linkage," Oct. 2002, pp. 1-67.
William E. Winkler, "Using the EM Algorithm for Weight Computation in the Fellegi-Sunter Model of Record Linkage," Bureau Of The Census Statistical Research Division, Oct. 4, 2000, 12 pages.
William E. Winkler et al., "An Application Of The Fellegi-Sunter Model Of Record Linkage To The 1990 U.S. Decennial Census," U.S. Bureau of the Census, pp. 1-22.
William E. Winkler, "Improved Decision Rules in The Fellegi-Sunter Model Of Record Linkage," Bureau of the Census, pp. 1-13.
Fritz Scheuren et al., "Recursive Merging and Analysis of Administrative Lists and Data," U.S. Bureau of the Census, 9 pages.
William E. Winkler, "Record Linkage Software and Methods for Merging Administrative Lists," U.S. Bureau of the Census, Jul. 7, 2001, 11 pages.
Enterprises, Publishing and Broadcasting Limited, AcxiomAbilitec, pp. 44-45.
TransUnion, Credit Reporting System, Oct. 9, 2002, 4 pages, <http://www.transunion.com/content/page.jsp?id=/transunion/general/data/business/BusCre...>.
TransUnion, ID Verification & Fraud Detection, Account Acquisition, Account Management, Collection & Location Services, Employment Screening, Risk Management, Automotive, Banking-Savings & Loan, Credit Card Providers, Credit Unions, Energy & Utilities, Healthcare, Insurance, Investment, Real Estate, Telecommunications, Oct. 9, 2002, 46 pages, <http://www.transunion.com>.
White Paper, "An Introduction to OLAP Multidimensional Terminology and Technology," 20 pages.
* cited by examiner