WO1994000473A2 - Pentavalent synthesis of oligonucleotides containing stereospecific alkylphosphonates and arylphosphonates - Google Patents
- Publication number
- WO1994000473A2 (application PCT/US1993/006277)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- aryl
- lower alkyl
- oligonucleotide
- group
- alkyl
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07H—SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
- C07H21/00—Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/30—Chemical structure
- C12N2310/32—Chemical structure of the sugar
- C12N2310/321—2'-O-R Modification
Abstract
The present invention provides a method for making R stereospecific alkyl- and aryl-phosphonate linkages between nucleotides. These methods can be used for automated synthesis of oligonucleotides having sequential R stereospecific alkyl- and aryl-phosphonate linkages. The present invention is also directed to the oligonucleotides having several sequential R phosphonate linkages which were produced by the subject methods. Moreover, the present invention provides methods for using the subject oligonucleotides, including methods for regulating the biosynthesis of a DNA, an RNA or a protein and methods for detecting and isolating complementary nucleic acid targets.
Description
PENTAVALENT SYNTHESIS OF OLIGONUCLEOTIDES CONTAINING STEREOSPECIFIC ALKYLPHOSPHONATES AND ARYLPHOSPHONATES
The present invention provides methods of making R-stereospecific alkyl- or aryl-phosphonate linkages between nucleotides. Moreover, these methods are amenable to automation. The present invention is also directed to the R-stereospecific alkyl- and aryl-phosphonate oligonucleotides formed by such methods. Moreover, in another embodiment, the present invention is directed to methods of using the R-stereospecific oligonucleotides, for example, as diagnostic probes and as therapeutic agents having the capability of regulating cellular and viral DNA replication, RNA transcription, protein translation, and other processes involving nucleic acid templates. Furthermore, the present R-stereospecific oligonucleotides can be used as probes for detection or isolation of a target nucleic acid.
Oligonucleotides have been employed in diverse applications ranging from diagnosis and therapy of disease to discovery, cloning and synthesis of nucleic acids. For example, oligonucleotides can be used as probes to identify target nucleic acids that are present in vivo, in tissue samples or that are immobilized onto a filter or membrane. After identification by the oligonucleotide, a target nucleic acid can be cloned and an oligonucleotide can be used to prime the synthesis of that nucleic acid. Moreover, hybridization patterns of an oligonucleotide to a nucleic acid that differ from normal hybridization patterns are frequently useful in diagnosis of disease. Furthermore, there has been great interest recently in developing oligonucleotides as therapeutic agents which can regulate the biological function of cellular or viral nucleic acids. Interest in oligonucleotides as therapeutic agents has arisen from observations of naturally occurring complementary, or antisense, RNA used by some cells to control protein expression.
More recently, synthetic oligonucleotides have been used with success to inhibit gene expression. For example, oligonucleotides were initially utilized to inhibit growth of Rous sarcoma virus (Zamecnik et al. 1978 Proc. Natl. Acad. Sci. USA 75: 280-284). Since such initial studies, oligonucleotides have been used to inhibit the expression of a wide variety of target nucleic acids in both cell-free extracts and in whole cells derived from diverse organisms, including viruses, bacteria, plants and animals. For example, expression of vesicular stomatitis virus matrix protein, human c-myc protooncogene, and c-Ha-ras protooncogene has been inhibited by oligonucleotides (Wickstrom et al. 1986 Biophys. J. 49: 15-19; Heikkila et al. 1987 Nature 328: 445-449; Wickstrom et al. 1988 Proc. Natl. Acad. Sci. USA 85: 1028-1032; and Daaka et al. 1990 Oncogene Res. 5: 267-275). A review of such therapeutic applications for oligonucleotides is provided by Uhlmann et al. 1990, Chemical Reviews 90: 543-584. However, the development of oligonucleotides for in vivo regulation of biological processes has been hampered by several long-standing problems, including the nuclease sensitivity and poor cell penetrability of oligonucleotides. In contrast to normal phosphodiester (O-PO2-O) linkages present in common, naturally occurring nucleic acids, both R and S stereoisomeric aryl- or alkyl-substituted phosphonate linkages confer several desirable properties upon an oligonucleotide, including increased nuclease resistance and increased cell penetration. Moreover, oligonucleotides having racemic alkylphosphonate linkages have been shown to specifically inhibit growth of simian virus 40, vesicular stomatitis virus, herpes simplex virus type 1 and human immunodeficiency virus (Miller et al. 1985 Biochimie 67: 769-776; Agris et al. 1986 Biochemistry 25: 6268-6275; Smith et al. 1986 Proc. Natl. Acad. Sci. USA 83: 2787-2791; and Sarin et al. 1988 Proc. Natl. Acad. Sci. USA 85: 7448-7451).
However, relatively high concentrations of alkyl- or aryl-phosphonate oligonucleotides have been required to achieve a significant therapeutic effect. This requirement for high oligonucleotide concentrations is apparently due to inefficient binding by oligonucleotides which have some phosphonate linkages in the S-stereospecific configuration (Miller 1991 Biotechnology 9: 358-362). S-stereospecific linkages are generated in addition to R-stereospecific linkages using presently available non-stereospecific synthetic procedures. In particular, replacement of a phosphate oxygen with another group generates a chiral phosphate which can exist in two stereo-configurations, R and S (Rp and Sp, respectively). Current synthetic procedures are non-stereospecific and typically generate a linkage having either an Rp or an Sp configuration as each nucleotide is added, thereby generating an oligonucleotide having a mixture of Rp and Sp linkages. However, the melting temperatures of pure Rp and Sp isomers differ significantly, with the Rp isomer binding much more strongly than the Sp isomer (Miller et al. 1980 J. Biol. Chem. 255: 9659-9665; and Lesnikowski et al. 1990 Nucleic Acids Res. 18: 2109-2115). Hence, oligonucleotides with Rp phosphonate linkages have highly desirable binding properties and consequently greater utility than oligonucleotides with Sp or racemic phosphonate linkages. Moreover, a procedure which efficiently produces such highly desirable Rp isomer linkages on alkyl- or aryl-phosphonate oligonucleotides presents a large improvement over available prior art procedures. Present methods for obtaining oligonucleotides with only Rp alkyl- or aryl-phosphonate linkages require steps that are not readily adapted to automation, are inefficient or can be used only for obtaining very short oligonucleotides, i.e. oligonucleotides having only up to about 8 nucleotides. For example, Lesnikowski et al. (1988 Nucleic Acids Res. 
16: 11675-11689) have reported stereospecific dimer, trimer and tetramer synthesis of oligonucleotides using Grignard reagent activation of the nucleotide 5'-OH group and purification of Rp and Sp isomers after addition of each nucleotide. However, these methods present formidable difficulties for automation. More recently, Lesnikowski et al. (1990 Nucleic Acids Res. 18: 2109-2115) have reported synthesis of an octamer (dT)8 with a central racemic methylphosphonate linkage and with other linkages as either all Rp or all Sp. Lebedev et al. (1990b Tetrahedron Letters 31: 855-858) provide a method for making single stereospecific phosphonothioate (i.e. P-S-C-5') linkages between two nucleotides. However, to date there is no disclosure of a method which permits efficient automated synthesis of Rp-stereospecific alkyl- or aryl-phosphonate (i.e. P-O-C-5') linkages. The present invention provides efficient methods for synthesis of Rp stereospecific alkyl- and aryl-phosphonate linkages between nucleotides of an oligonucleotide. Moreover, the present methods can readily be adapted for automated oligonucleotide synthesis. The present invention is also directed to Rp isomeric oligonucleotides produced by these methods, and to methods of using the present Rp alkyl- or aryl-phosphonate oligonucleotides as diagnostic probes and as therapeutic agents.
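The background above notes that non-stereospecific procedures give either an Rp or an Sp configuration at each coupling, so the product is a mixture. The following Python sketch illustrates how quickly that mixture grows with chain length. It is illustrative arithmetic only: the function names are hypothetical, and the default 50:50 Rp:Sp split per coupling is an assumption made for the example, not a figure stated in the text.

```python
# Illustrative arithmetic only. The per-coupling Rp probability (p_rp) is an
# assumed parameter; the text does not state an Rp:Sp ratio.

def diastereomer_count(chiral_linkages: int) -> int:
    """Number of distinct Rp/Sp patterns for a chain with the given number
    of chiral phosphonate linkages (two choices per linkage)."""
    if chiral_linkages < 0:
        raise ValueError("chiral_linkages must be non-negative")
    return 2 ** chiral_linkages


def all_rp_fraction(chiral_linkages: int, p_rp: float = 0.5) -> float:
    """Fraction of chains that are Rp at every linkage, assuming each
    coupling independently gives Rp with probability p_rp."""
    if chiral_linkages < 0:
        raise ValueError("chiral_linkages must be non-negative")
    if not 0.0 <= p_rp <= 1.0:
        raise ValueError("p_rp must lie between 0 and 1")
    return p_rp ** chiral_linkages


if __name__ == "__main__":
    for k in (1, 5, 8, 14):
        print(k, diastereomer_count(k), all_rp_fraction(k))
```

Under these assumptions the all-Rp species becomes a very small share of the product after only a few couplings, which is consistent with the motivation for a stereospecific route.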
The present invention relates to a method for producing an oligonucleotide having an Rp stereoisomeric alkyl- or aryl-phosphonate linkage between a first nucleotide and a second nucleotide in the oligonucleotide, wherein the oligonucleotide is of the formula: EMI6.1 which comprises: (a) reacting a first nucleotide of the formula: EMI6.2 with an alkyl- or aryl-phosphonothioate intermediate of the formula: EMI6.3 under conditions sufficient to produce the Rp stereoisomeric alkyl- or aryl-phosphonate linkage, wherein: Y1 is a hydrogen, phosphate, phosphate present in the oligonucleotide or V1; Y2 is a hydrogen, phosphate, phosphate present in the oligonucleotide or V2; X is hydroxy or V3; M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group; each B group is independently a purine or pyrimidine base and each B group can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl; V1 is a protecting group, solid support or phosphate attached to the penultimate nucleotide of the oligonucleotide; V2 is a protecting group; and V3 is hydrogen or OY3, wherein Y3 is lower alkyl or a protecting group; A is an activating group; and the intermediate has an Sp stereoisomeric configuration at the phosphate; and (b) when V1, V2 or V3 is a protecting group, optionally removing said V1, V2 or V3 protecting group. The present invention also relates to a method of producing a polynucleotide chain of an oligonucleotide having at least one Rp alkyl-phosphonate or one Rp aryl-phosphonate linkage. The present invention further relates to a method of producing an alkyl- or aryl-phosphonothioate nucleotide intermediate having an Sp stereoisomeric phosphorus configuration.
The present invention still further relates to such an alkyl- or aryl-phosphonothioate nucleotide intermediate, wherein the intermediate has an Sp stereoisomeric phosphorus configuration. Such an intermediate can be used to generate the present Rp stereoisomeric linkages. The present invention additionally relates to a compartmentalized kit for producing a polynucleotide chain of an oligonucleotide having at least five Rp alkyl-phosphonate or Rp aryl-phosphonate linkages. The present invention also relates to an oligonucleotide having at least five Rp alkylphosphonate or Rp aryl-phosphonate linkages produced by the subject methods. The present invention further relates to the present oligonucleotides which have an attached agent to facilitate cell delivery, a drug or a reporter molecule. The present invention still further relates to a compartmentalized kit for detection or diagnosis of a target nucleic acid. The present invention additionally relates to a compartmentalized kit for isolation of a template nucleic acid. The present invention also relates to a method of regulating biosynthesis of a DNA, an RNA or a protein using the subject Rp alkyl- or aryl-phosphonate oligonucleotides. The present invention further relates to a pharmaceutical composition for regulating biosynthesis of a nucleic acid or protein comprising a pharmaceutically effective amount of one of the present oligonucleotides and a pharmaceutically acceptable carrier. The present invention still further relates to a method of detecting a target nucleic acid which includes contacting one of the present oligonucleotides with a sample to be tested for containing such a nucleic acid for a time and under conditions sufficient to form an oligonucleotide-target complex; and detecting such a complex. Fig. 
1 depicts a chromatograph of Rp and Sp stereoisomers of dithymidine methylphosphonate separated by liquid chromatography on a 4.6 x 250 mm C18 silica column with gradient elution using 10% to 15% acetonitrile in water (0.25%/min) at a flow rate of 1.0 ml/min. Fig. 2 depicts superimposed circular dichroism spectra of Rp and Sp dithymidine methylphosphonate stereoisomers separated as illustrated in Fig. 1. Each stereoisomer has a characteristic spectrum which can be used to identify that stereoisomer. Fig. 3 depicts 1H NMR spectra of Rp (top) and Sp (bottom) stereoisomers of dithymidine methylphosphonate, illustrating several distinct peaks characteristic of a given stereoisomer which can be used for stereoisomeric identification, e.g. the H2 and H6 peaks. Fig. 4 depicts 31P NMR spectra of Rp (top) and Sp (bottom) stereoisomers of dithymidine methylphosphonate. The Rp stereoisomer has a characteristic additional peak at 7.984 ppm which can be used to identify this stereoisomer. Fig. 5 depicts a spectrograph of 5'-dimethoxytrityl-tetrathymidine methylphosphonate-3'-acetate (DMT-TpTpTpT-OAc) produced by fast atom bombardment mass spectroscopy (FABMS). Specific peaks corresponding to distinct molecular fragments of DMT-TpTpTpT-OAc are identified (e.g. 5'-dimethoxytrityldithymidine, DMT-TpT, at 850 m/e).
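The gradient conditions quoted for Fig. 1 (10% to 15% acetonitrile at 0.25%/min, 1.0 ml/min) imply a gradient duration and eluent volume that the text does not spell out. The short Python sketch below works through that arithmetic; the function names are hypothetical, and it assumes a linear gradient run at a constant flow rate.

```python
# Arithmetic sketch for the Fig. 1 gradient; assumes a linear gradient and a
# constant flow rate. Function names are illustrative, not from the source.

def gradient_minutes(start_pct: float, end_pct: float,
                     rate_pct_per_min: float) -> float:
    """Time needed to ramp the organic fraction from start_pct to end_pct."""
    if rate_pct_per_min <= 0:
        raise ValueError("rate_pct_per_min must be positive")
    return (end_pct - start_pct) / rate_pct_per_min


def eluent_volume_ml(minutes: float, flow_ml_per_min: float) -> float:
    """Total eluent delivered over the given time at a constant flow rate."""
    return minutes * flow_ml_per_min


if __name__ == "__main__":
    t = gradient_minutes(10.0, 15.0, 0.25)   # conditions quoted for Fig. 1
    print(t, eluent_volume_ml(t, 1.0))
```

With the quoted values the ramp takes 20 minutes and delivers 20 ml of eluent.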
The present invention provides a method for producing an oligonucleotide having an Rp stereoisomeric alkyl- or aryl-phosphonate linkage between a first nucleotide and a second nucleotide in the oligonucleotide, wherein the oligonucleotide is of the formula: EMI10.1 According to the present invention, Rp stereoisomeric alkyl- or aryl-phosphonate linkages between two nucleotides are formed by: (a) reacting a first nucleotide of the formula: EMI11.1 with an alkyl- or aryl-phosphonothioate intermediate of the formula: EMI11.2 under conditions sufficient to produce the Rp stereoisomeric alkyl- or aryl-phosphonate linkage, wherein: Y1 is a hydrogen, phosphate, phosphate present in the oligonucleotide or V1; Y2 is a hydrogen, phosphate, phosphate present in the oligonucleotide or V2; X is hydroxy or V3; M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group; each B group is independently a purine or pyrimidine base and each B group can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl; V1 is a protecting group, solid support or phosphate attached to the penultimate nucleotide of the oligonucleotide; V2 is a protecting group; and V3 is hydrogen or OY3, wherein Y3 is lower alkyl or a protecting group; A is an activating group; and the intermediate has an Sp stereoisomeric configuration at the phosphate; and (b) when V1, V2 or V3 is a protecting group, optionally removing the V1, V2 or V3 protecting group.
The present invention further relates to a method of producing a polynucleotide chain of an oligonucleotide having at least one Rp-alkyl-phosphonate or Rp-aryl-phosphonate linkage, wherein the oligonucleotide has the formula: EMI12.1 which method includes the following steps: (a) reacting a 5'-terminal nucleotide of the formula: EMI13.1 with an alkyl- or aryl-phosphonothioate nucleotide intermediate of the formula: EMI13.2 under conditions sufficient to produce the Rp stereoisomeric alkyl- or aryl-phosphonate linkage and so generate a new 5'-terminal nucleotide, wherein: Y1, Y2, X, M, B, V1, V2, V3 and Y3 are as defined hereinabove; n is an integer from 0 to 200; the intermediate has an Sp stereoisomeric phosphorus configuration; and (b) removing the V2 protecting group from the new 5'-terminal nucleotide; (c) reacting the product of (b) with another alkyl- or aryl-phosphonothioate nucleotide intermediate under conditions sufficient to produce the Rp stereoisomeric linkage and so generate a new 5'-terminal nucleotide; (d) repeating steps b and c to extend the polynucleotide chain n-1 times; and (e) when V1, V2 or V3 is a protecting group, optionally removing the V1, V2 or V3 protecting group. If the desired product is a compound of Formula I or II wherein X is OH and Y1 or Y2 are hydrogen or phosphate, such groups are generated upon removal of the protecting groups by standard techniques known to one skilled in the art. The Rp stereoisomeric alkyl- or aryl-phosphonate linkages produced by the methods of the present invention have M substituents on the phosphorus atom. Such an M substituent is present instead of an oxygen atom commonly found in conventional nucleic acids which have -O-PO2-O- linkages. According to the present invention, M is a lower alkyl, a cycloalkyl, a thioxo, a thio-lower alkyl, an aryl or an aryl lower alkyl group wherein such lower alkyl and aryl groups can be substituted with at least one hydroxy, halogen or cyano group.
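The chain-extension steps (a) through (e) above can be sketched as a simple bookkeeping routine. The Python below is a hedged illustration, not the patented procedure: it models no chemistry, the step labels are hypothetical names, and it rests on one reading of the steps, namely that the deprotect/couple pair of steps (b) and (c) runs n times in total (once, then n-1 repeats under step (d)) and is skipped when n is 0. That reading matches the statement elsewhere in the description that n = 0 gives a single linkage and n = 200 gives up to 201.

```python
from typing import List

# Bookkeeping sketch of steps (a)-(e). Step labels are hypothetical; the
# treatment of n = 0 (no deprotect/couple rounds) is an assumed reading.

def extension_steps(n: int) -> List[str]:
    """Return the ordered list of operations for a chain with parameter n."""
    if not 0 <= n <= 200:
        raise ValueError("n must be an integer from 0 to 200")
    steps = ["couple"]                    # step (a): first Rp linkage
    for _ in range(n):                    # steps (b)+(c), n rounds in total
        steps.append("remove_V2")         # step (b): free the new 5' terminus
        steps.append("couple")            # step (c): next Rp linkage
    steps.append("final_deprotect")       # step (e): optional in the text
    return steps


def rp_linkage_count(n: int) -> int:
    """Number of Rp linkages formed, i.e. the number of coupling steps."""
    return extension_steps(n).count("couple")


if __name__ == "__main__":
    print(extension_steps(1))
    print(rp_linkage_count(0), rp_linkage_count(200))
```

Because every coupling is followed by the same deprotection before the next one, the loop body is uniform, which is the property that makes the cycle a candidate for automation on a solid support.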
As used herein the term lower alkyl refers to alkyl groups containing one to six carbon atoms. Lower alkyl groups can be straight-chained or branched, and include such moieties as methyl, ethyl, propyl, isopropyl, n-butyl, sec-butyl, iso-butyl, t-butyl, pentyl, amyl, hexyl and the like. Preferred M alkyl groups of the present invention have from one to four carbon atoms. The most preferred M alkyl group is methyl. Similarly, a lower alkenyl is a lower alkyl with 1-3 carbon-carbon double bonds. Moreover, an alkoxy group is a lower alkyl attached via an oxygen atom; a lower acyl is a lower alkyl attached via a carbonyl (C=O); and a lower cyanoalkyl is a lower alkyl with a CN substituent. The term cycloalkyl refers to a saturated cyclic structure, i.e. a ring, having 3-7 ring carbon atoms and includes such groups as cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl and cycloheptyl rings. A thioxo group is a =S group and a thio-lower alkyl is a lower alkyl attached to the phosphate via a sulfur atom. The term aryl refers to an aromatic moiety containing 6-10 ring carbon atoms and includes phenyl, α-naphthyl, β-naphthyl, and the like. A preferred aryl group is phenyl. An aryloxy group is an aryl attached via an oxygen atom and an aroyl is an aryl attached via a carbonyl (CO). Similarly, an aryloxy acyl is an aryl linked to an acyl via an oxygen atom. According to the present invention a halo group is a halogen. Halo groups include fluorine, chlorine, bromine and iodine. A preferred halo group for substitution on M lower alkyl, aryl, and aryl lower alkyl groups is fluorine. Preferred M groups are lower alkyl or phenyl groups which can be substituted with a halo group, preferably a fluorine. More preferred M groups are unsubstituted lower alkyl groups. An especially preferred M group is an unsubstituted methyl group. Therefore, the preferred Rp-stereoisomeric linkages of the present invention are alkylphosphonate linkages and more preferably are methylphosphonate linkages.
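The carbon-count limits in the definitions above can be collected into a small lookup table. The Python sketch below is only a convenience for checking a carbon count against those stated ranges; the table keys and the function name are hypothetical, and the check looks at carbon count alone, not at structure or substitution.

```python
# Carbon-count ranges as stated in the definitions above. Keys and function
# name are illustrative; only the numeric limits come from the text.
CARBON_RANGES = {
    "lower alkyl": (1, 6),         # total carbon atoms
    "preferred M alkyl": (1, 4),   # total carbon atoms
    "cycloalkyl": (3, 7),          # ring carbon atoms
    "aryl": (6, 10),               # ring carbon atoms
}


def within_definition(term: str, carbons: int) -> bool:
    """True if the carbon count falls inside the stated range for the term."""
    low, high = CARBON_RANGES[term]
    return low <= carbons <= high


if __name__ == "__main__":
    print(within_definition("lower alkyl", 6))    # hexyl
    print(within_definition("lower alkyl", 7))    # heptyl, outside the range
    print(within_definition("aryl", 10))          # naphthyl
```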
According to the present invention, the nucleotides joined by the present alkyl- or aryl-phosphonate linkages can have deoxyribose or ribose sugar moieties. Therefore, as defined herein X is either hydroxy or V3, wherein V3 is hydrogen or OY3 and Y3 is lower alkyl or a protecting group. Accordingly, when X is hydrogen a deoxyribose sugar is present but when X is hydroxy or -O-Y3 a ribose sugar, an O-alkyl ribose sugar or a protected ribose sugar, is present in the associated nucleotide. Preferred oligonucleotides of the present invention have X as hydrogen or OH. However, during synthesis of the present oligonucleotides such an OH is protected with a protecting group which can be removed at conclusion of synthesis by the present methods. The nucleotides linked according to the present invention each have a B group which represents the base moiety present on the nucleotide. Each B group is independently a purine or pyrimidine base which can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl. Preferred purine and pyrimidine B groups have 1-2 lower alkyl, amino, oxo, hydroxy or lower alkoxy substituents. Preferred B groups of the present invention are purines such as guanine (G) and adenine (A), pyrimidines such as thymine (T), cytosine (C) or uracil (U), and any related base analog that is capable of base pairing with a guanine, adenine, thymine, cytosine or uracil.
For example, such base analogs include pseudocytosine, isopseudocytosine, 3-aminophenylimidazole, 2'-O-methyladenine, 7-deazaadenine, 7-deazaguanine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 2'-O-methylcytosine, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, dihydrouracil, 2'-O-methyluracil, 2'-O-methylpseudouracil, beta-D-galactosylqueosine, 2'-O-methylguanine, xanthine, hypoxanthine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylxanthine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-methyluracil, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, N-((9-beta-D-ribofuranosyl-2-methylthiopurine-6-yl)carbamoyl)threonine and N-((9-beta-D-ribofuranosylpurine-6-yl)N-methylcarbamoyl)threonine. B groups in an α-anomeric configuration can also be present in the nucleotides linked by the present methods. Preferred B groups are unmodified G, A, T, C and U bases. In addition, preferred B groups include pyrimidines and purines with lower alkyl, lower alkoxy, lower alkylamine, phenyl or lower alkyl substituted phenyl groups. It is more preferred that these groups are present on the 5 position of the pyrimidine and on the 7 or 8 position of the purine. Especially preferred base analogs are 5-methylcytosine, 5-methyluracil and diaminopurine. The most preferred B groups are unmodified G, A, T, C and U. Moreover, the selection of a B group for each nucleotide added to the growing polynucleotide chain determines the nucleotide sequence of an oligonucleotide produced by the present methods. Accordingly, the present methods can be used to generate oligonucleotides having any desired nucleotide sequence by varying which nucleotide base B is placed at each position.
The selection of a nucleotide sequence is generally determined by the intended purpose of the oligonucleotide and is described in more detail hereinbelow. According to the present invention n is an integer used to describe the number of Rp alkyl- or Rp aryl-phosphonate linkages sequentially synthesized by the present methods. As used herein, n is 0 to 200. Moreover, up to 201 Rp alkyl- or aryl-phosphonate linkages can be formed when n ranges from 0 to 200. However, when n is 0 a single Rp alkyl- or aryl-phosphonate linkage is formed. Therefore, the present invention is directed towards application of the subject methods to form isolated Rp phosphonate linkages as well as sequential chains of Rp stereoisomeric alkyl- or aryl-phosphonate linkages. Preferably, n is at least 5. However, a value of at least 8 for n is more preferred. Even more preferred is a value of at least 10 for n. Especially preferred values for n are at least 12 or 14. According to the present invention, Y1 is present on a 3'-oxygen of a nucleotide and can be a hydrogen, phosphate, phosphate present in the oligonucleotide or V1. V1 is related to Y1 in that V1 and Y1 are at the same position and Y1 can have the same meaning as V1. As used herein, V1 is a protecting group, a solid support or a phosphate attached to a penultimate nucleotide. Such a penultimate nucleotide is the nucleotide next to the 5'-terminal nucleotide. Moreover, as used herein, Y2 is present on a 5'-oxygen of a nucleotide or an oligonucleotide and can be a hydrogen, a phosphate, a phosphate present in the oligonucleotide or V2, wherein V2 is a protecting group. Since Y2 and V2 are at the same position, removal of a V2 protecting group can generate a Y2 hydrogen or phosphate. Similarly, X and V3 are related not only by virtue of placement at the same position (2') but also because X can have the same meaning as V3, i.e. X is hydroxy or V3. When X is V3, V3 can be hydrogen or O-Y3 wherein Y3 is a lower alkyl or a protecting group.
According to the present invention, removal of a Y3 protecting group can produce a hydroxy group, i.e. X as OH. As used herein, formulas I and II represent a portion of an oligonucleotide when Y1 or Y2 is defined as a phosphate present in the oligonucleotide. Hence additional nucleotides can flank the Rp phosphonate linkage being formed when Y1 or Y2 is a phosphate present in the oligonucleotide. In particular, usage of Y1 or Y2 as a phosphate present in the oligonucleotide is intended to indicate that the oligonucleotide can be longer than the n Rp linkages formed by the present methods. More particularly, the present invention contemplates conventional phosphodiester linkages, or an interspersion of conventional phosphodiester and Rp phosphonate linkages, in the parts of the oligonucleotide attached to a Y1 or Y2 phosphate. As used herein a conventional phosphodiester linkage is an -O-PO2-O- linkage between 3'- and 5'-positions of two nucleoside sugars. Preferably about 1 to about 50 -O-PO2-O- linkages can be added to or interspersed between Rp phosphonate linkages of the present oligonucleotides. Conventional oligonucleotides are added by known procedures (e.g. Uhlmann et al. 1990 Chemical Reviews 90: 543-584). Accordingly, the present methods can be adapted to incorporate at least one additional step directed to adding 1-50 nucleotides wherein such nucleotides are joined by -O-PO2-O- linkages. As provided by the present methods, an internal or non-terminal Rp linkage is produced when both Y1 and Y2 are phosphates present in the oligonucleotide. However, when Y1 or Y2 is other than a phosphate present in the oligonucleotide, a 3'-terminal or a 5'-terminal linkage, respectively, can be made. Accordingly, the present methods can be used to generate both internal and terminal Rp stereoisomeric alkyl- or aryl-phosphonate linkages.
Moreover, sequential Rp linkages can also be formed by the present methods since V1 can be defined as the phosphate present on the penultimate nucleotide of the oligonucleotide at each round of synthesis. Such a penultimate nucleotide is the nucleotide next to the 5'-terminal nucleotide. As defined, V1 can be a solid support. Preferably V1 is a solid support when the present methods are performed by automation since V1 can thereby serve as an anchor for the growing polynucleotide chain. Such a solid support can be any known support used during synthesis of DNA or RNA. Common types of solid supports include controlled pore glass (CPG), polystyrene, silica, cellulose, nylon, and the like. Preferred solid supports are CPG and polystyrene. An especially preferred solid support is CPG. The V1 solid support is covalently linked to the 3'-OH of a nucleoside by known procedures (e.g., Matteucci et al. 1980 Tetrahedron Lett. 21: 719-722). Alternatively, nucleosides linked to solid supports can be purchased commercially, e.g. from Sigma Chemical Company. Moreover, a solid support can also be removed from an oligonucleotide of the present invention by known procedures, e.g. by alkaline hydrolysis. The V1, V2 and Y3 protecting groups can be used when the present synthetic methods are employed to form the subject Rp stereospecific phosphonate linkages. In particular, the present invention provides such protecting groups for covalent binding to a reactive group on a nucleotide. Such binding of a protecting group renders that reactive group unreactive while the present synthetic methods are performed. Reactive groups of the present invention include 5'-OH, 3'-OH, 2'-OH and related groups, e.g. reactive groups present on the B bases. Ideally, a protecting group is easily removed to regenerate the correct structure of the reactive group without chemically altering the remainder of the molecule.
Examples of protecting groups contemplated by the present invention include any known blocking or protecting agent used during synthesis of deoxyribooligonucleotides or ribooligonucleotides to protect a hydroxy group on a nucleotide, e.g. a 5'-OH, 3'-OH or 2'-OH group. The V1, V2 and Y3 protecting groups are preferably attached via an oxygen atom. Such O-linked protecting groups are useful for protecting the OH groups on nucleotides. In this regard, Greene (1981 Protecting Groups in Organic Synthesis, John Wiley & Sons, Inc.) provides a comprehensive review of protecting groups which can be used for different reactive groups, including OH reactive groups. Preferred protecting groups of the present invention are lower alkyl, lower cyanoalkyl, lower alkanoyl, aroyl, aryloxy, aryloxy-lower alkanoyl, haloaryl, fluorenylmethoxycarbonyl (FMOC), trityl, monomethoxytrityl (MMT), dimethoxytrityl (DMT), and related groups. More preferred protecting groups include isopropyl, isobutyl, 2-cyanoethyl, acetyl, benzoyl, phenoxyacetyl, halophenyl, MMT, DMT and the like.
As used herein, an activating group A is a heteroaromatic ring with at least one lower alkyl, cycloalkyl, cycloalkyl-lower alkyl, aryl or arylalkyl substituent. It is preferred that the heteroaromatic ring is substituted with a lower alkyl or cycloalkyl. Moreover, the A group is attached to the sulfur atom of the alkyl- or aryl-phosphonothioate nucleotide. However, prior to attachment of the A group, a leaving group L is present on the activating group A at the position to which the A group will be attached to the phosphonothioate nucleotide. As used herein, an activating group A attached to a leaving group L is referred to as an activator or A-L.
According to the present invention, the activating group A includes a heteroaromatic ring containing from 1 to 4 nitrogen ring atoms, which A group is of the formula: EMI23.1 or a salt thereof, wherein: 9 is C-R1 or N; D is C-R2 or N; E is C-R3 or N; G is C-R4 or N; J is C-R5 or N; Y is -S-, -NR6-, or -O-; R is a substituent attached to one nitrogen atom, wherein such a substituent is lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl; R1, R2, R3, R4 and R5 are independently hydrogen, lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl, arylalkyl, or R1 and R2 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R3 and R4 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R4 and R5 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring; and R6 is lower alkyl or hydrogen.
In a preferred embodiment the activating group A is a salt of a positively charged heteroaromatic ring. In one embodiment, the activating group A is of the formula: EMI24.1 or salts thereof, wherein R, R1 and R2 are as defined hereinabove. More preferably, the positively charged A group is of the formula: EMI24.2 or salts thereof, wherein R is as defined hereinabove. In an especially preferred embodiment the A group has the formula: EMI24.3 or salts thereof, wherein R is as defined hereinabove.
Accordingly, as used herein, heteroaromatic groups of the present invention contain from 5-14 ring atoms and 1, 2 or 3 nitrogen, oxygen or sulfur heteroatoms. Preferred heteroaromatic rings are either monocyclic or bicyclic with 1 or 2 ring nitrogen heteroatoms and 5 to 10 ring atoms. Preferred heteroaromatic rings can also have 1 other nitrogen, sulfur or oxygen ring atom. Especially preferred heteroaromatic rings are monocyclic with 5 or 6 ring atoms and one nitrogen heteroatom.
Heteroaromatic rings contemplated by the present invention include pyrrole, isopyrrole, pyrazole, triazole, oxazole, isoxazole, thiazole, isothiazole, oxadiazole, tetrazole, pyrazine, pyrimidine, pyridine, oxazine, isoxazine, oxadiazine, imidazole, indole, pyrindine, quinoline, isoquinoline, pyridopyridine and the like. Preferred heteroaromatic rings include pyridine, imidazole, triazole, tetrazole, indole and pyridopyridine rings. Moreover, preferred heteroaromatic rings are positively charged nitrogen heteroaromatic rings, e.g. nitrogen heteroaromatic rings with an N-linked lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl group. An especially preferred heteroaromatic ring is pyridine with an N-linked lower alkyl. In particular, a preferred A group is an N-alkyl pyridinium. The most preferred A group is 2-N-methylpyridinium. When attached to the alkyl- or aryl-phosphonothioate nucleotide, such a pyridinium is preferably attached via the 2 or 4 position, and most preferably via the 2 position.
Prior to attachment of A to the phosphonothioate, L is present on A at the position to which the phosphonothioate will be attached. As used herein, L is a leaving group. As is generally known in the art, and for the purposes of the present invention, "a leaving group" is defined as a group which is readily broken away from a covalent linkage with a carbon atom by nucleophilic attack on that atom. Leaving groups are generally electron withdrawing groups either because of their electronegativity or because they have an inductive effect. The L groups contemplated by the present invention include halo, nitro, diazo, azido, lower trialkylamino, lower alkoxy, aryloxy, lower alkyl sulfonate, lower fluoroalkylsulfonate, aryl sulfonate, lower alkyl sulfinate, aryl sulfinate groups and the like. Sulfonates include a lower alkyl or an aryl with an attached SO3 group. Similarly, sulfinates can be lower alkyl or aryl groups with an attached SO2 group.
Preferred lower alkyl sulfonates include methyl sulfonate (i.e. mesylate), ethyl sulfonate, propyl sulfonate, isopropyl sulfonate, butyl sulfonate, isobutyl sulfonate, t-butyl sulfonate, pentyl sulfonates, hexyl sulfonates and the like. Moreover, aryl sulfonates include groups such as tolylsulfonates (i.e. tosylates), ammonio-alkylsulfonates (i.e. betylates), bromophenylsulfonates (i.e. brosylates), nitrophenylsulfonates (i.e. nosylates) and the like. Preferred lower fluoroalkylsulfonates include trifluoromethylsulfonates (i.e. -OSO2CF3 or triflates), nonafluorobutylsulfonates (i.e. -OSO2-C4F9 or nonaflates) and 2,2,2-trifluoroethylsulfonates (i.e. -OSO2-CH2-CF3 or tresylates). Preferred L groups include diazo and halo, i.e. F, Cl, Br and I. More preferred L groups are Cl and Br.
In the most preferred embodiment, A-L is positively charged. Since the subject activator A-L can be positively charged, the present invention contemplates providing A-L as a salt. Such A-L salts include the positively charged or cationic A-L moiety associated with a counteranion. The variable Z is used herein to represent the negatively charged counterion. Counterions which can be associated with A-L include halides, e.g. Cl-, Br- and I-. Preferred counterions for forming salts of the present A-L activators are I- and Br-. Preferably A-L is a salt of a positively charged nitrogen heteroaromatic group substituted with one leaving group and one N-linked lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl. The L leaving group is preferably at the 2 or 4 position, and more preferably at the 2 position, relative to the nitrogen heteroatom. Preferred A-L activators are 2-halo-N-alkyl pyridinium bromide or 2-halo-N-alkyl pyridinium iodide, wherein the alkyl is lower alkyl, and most preferably methyl or ethyl, and the halo is preferably chloro, bromo or iodo. The most preferred A-L activators are 2-chloro-N-alkyl pyridinium bromide or 2-chloro-N-alkyl pyridinium iodide.
The A-L activators of the present invention are either commercially available or are synthesized by available procedures. For example, the appropriate heteroaromatic group can be purchased which has the desired R1, R2, R3, R4, R5 and R6 groups. Alternatively the R1, R2, R3, R4 and R5 groups can be attached to the heteroaromatic group by standard alkylation or arylation procedures. An L group such as halogen is then attached, e.g. by adding the halogen at the desired position on the heteroaromatic ring by standard procedures. After addition of the L group, the R and R6 group can be added by standard alkylation or arylation procedures. A preferred A-L activator, 2-chloro-N-methylpyridinium iodide, is commercially available. Moreover, conditions have been described for using this activator to form esters from acids and alcohols (Mukaiyama et al. 1975 Chem. Lett. 1045; and Fieser 1980 Reagents for Organic Synthesis Wiley-Interscience, New York, 8: 95-96).
According to the present invention, Rp stereoisomeric alkyl- or aryl-phosphonate linkages between any two nucleotides are formed by reacting a first or 5'-terminal nucleotide of the formula: EMI28.1 with an alkyl- or aryl-phosphonothioate nucleotide intermediate of the formula: EMI28.2 wherein: the intermediate has an Sp stereoisomeric phosphorus configuration; V1 is a protecting group, solid support or phosphate attached to the penultimate nucleotide of the oligonucleotide; V2 is a protecting group; V3 is hydrogen, or OY3, wherein Y3 is lower alkyl or protecting group; A is an activating group; and B is as defined herein. Moreover, according to the present invention, the first (or 5') nucleotide is reacted with the depicted Sp stereoisomeric intermediate under conditions sufficient to covalently link the 5'-oxygen of the 5'-terminal nucleotide with the phosphorus of the intermediate, and thereby lose S-A and invert the Sp stereoisomeric configuration to produce the Rp stereoisomeric alkyl- or aryl-phosphonate linkage.
As used herein, conditions sufficient to covalently link the 5'-oxygen with the phosphorus of the intermediate include those time, temperature, solvent and reactant concentration conditions permitting nucleophilic attack by the 5'-oxygen upon the phosphorus to displace the sulfur and invert the phosphorus configuration. Such reaction conditions further permit displacement of the sulfur linked to the activating group to produce an S-A product.
A time sufficient to covalently link the 5'-oxygen with the phosphorus is about 5 min to about 10 hr, and preferably is about 10 min to about 2 hr. A more preferred reaction time is about 30 min to about 90 min. An especially preferred reaction time is about 60 min.
The reaction temperature preferably employed for linkage of a first or 5'-nucleotide to the intermediate is about 0°C to about 45°C. A more preferred reaction temperature is about 4°C to about 35°C. An especially preferred linkage temperature is about room temperature, i.e. about 20°C to 25°C.
The reaction solvent conditions for linkage of the first or 5'-nucleotide with the intermediate are anhydrous conditions, wherein a nonpolar or nonpolar aprotic solvent is employed. Preferred solvents for use during linkage of a nucleotide with a nucleotide intermediate include acetonitrile, dimethylformamide, tetrahydrofuran, dimethylsulfoxide, pyridine and the like.
According to the present invention, the molar ratio of first (or 5'-terminal) nucleotide relative to nucleotide intermediate can range from about 1:10 to about 5:1. Preferably the molar ratios are about 1:3 to about 1:1. An especially preferred molar ratio of first (or 5'-terminal) nucleotide to nucleotide intermediate is about 1:2.
In one especially preferred embodiment, the present methods are performed automatically in a nucleic acid synthesizer.
The present methods have been designed for adaptation to automation by selecting reactions which can be performed under conditions typically used in nucleic acid synthesizers. For example, room temperatures, solvents and reagents contemplated herein are compatible with procedures and common protecting groups employed during automated nucleic acid synthesis (see Uhlmann et al. for a review of such procedures and protecting groups).
In another embodiment, the present invention is directed to an alkyl- or aryl-phosphonothioate nucleotide intermediate which has an Sp stereoisomeric configuration at the phosphorus, and an activating group A. This intermediate is of the formula: EMI31.1 wherein: B and M are as defined hereinabove; V2 is a protecting group; V3 is a hydrogen, or OY3 wherein Y3 is lower alkyl or protecting group; and A is the activating group.
The present invention is also directed to a method of producing such an intermediate by reacting an alkyl- or aryl-phosphonothioate nucleotide of the formula: EMI31.2 with an A-L activator under conditions sufficient to produce the intermediate without inversion of the Sp stereoisomeric phosphorus configuration; wherein: the alkyl- or aryl-phosphonothioate nucleotide has an Sp stereoisomeric phosphorus configuration; V2 is a protecting group; V3 is a hydrogen, lower alkoxy or an O-linked protecting group; and M and B are as defined hereinabove.
According to the present invention, conditions sufficient to produce the intermediate without inversion of the Sp-stereoisomeric configuration include nucleophilic displacement conditions wherein the L group is displaced by the sulfur atom on the phosphonothioate nucleotide. Such conditions include a time, a solvent, a temperature and a reactant concentration sufficient for such nucleophilic displacement of L by the sulfur.
A time sufficient to displace the L group and so produce the intermediate without inversion of the Sp-stereoisomeric configuration is about 1 sec to about 30 min and preferably about 10 sec to about 10 min. A more preferred time for displacing L with the sulfur of the phosphonothioate is about 30 sec to about 2 min. An especially preferred time for forming the intermediate is about 1 min.
The solvent for generating the intermediate is an anhydrous solvent, and preferably is a nonpolar or nonpolar aprotic solvent. Preferred solvents for use during formation of the intermediate include acetonitrile, dimethylformamide, tetrahydrofuran, dimethylsulfoxide, pyridine and related solvents.
The temperature preferably employed to displace the L leaving group with the phosphonothioate sulfur, and thereby form the intermediate, is about 0°C to about 60°C. A more preferred displacement temperature is about 4°C to about 45°C. An especially preferred temperature for forming the intermediate is about room temperature, i.e. about 20°C.
According to the present invention, the molar ratio of the alkyl- or aryl-phosphonothioate nucleotide to the A-L activator can range from about 1:10 to about 10:1. Preferably the molar ratio of phosphonothioate nucleotide to activator is about 1:5 to about 3:1. An especially preferred molar ratio of phosphonothioate nucleotide to A-L activator is about 1:2.
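The time, temperature and molar-ratio ranges stated above for forming the intermediate can be collected into a small check, as in the following minimal Python sketch. It is an illustration only and not part of the disclosed methods: the names `ActivationConditions`, `classify` and `activator_umol` are hypothetical, the "about" bounds are treated as hard limits for simplicity, and the solvent requirement is not modeled.

```python
from dataclasses import dataclass

# (low, high) bounds taken from the ranges stated in the text.
# Assumption: "about" is treated as a hard limit, which the text does not require.
BROAD = {"time_s": (1, 30 * 60), "temp_c": (0, 60), "ratio": (1 / 10, 10 / 1)}
PREFERRED = {"time_s": (10, 10 * 60), "temp_c": (4, 45), "ratio": (1 / 5, 3 / 1)}


@dataclass(frozen=True)
class ActivationConditions:
    time_s: float  # reaction time in seconds
    temp_c: float  # temperature in degrees Celsius
    ratio: float   # mol phosphonothioate nucleotide per mol A-L activator


def _within(cond: ActivationConditions, ranges: dict) -> bool:
    """True if every field of cond lies inside its (low, high) range."""
    return all(lo <= getattr(cond, name) <= hi for name, (lo, hi) in ranges.items())


def classify(cond: ActivationConditions) -> str:
    """Label conditions as 'preferred', 'stated' (broad range) or 'outside'."""
    if _within(cond, PREFERRED):
        return "preferred"
    if _within(cond, BROAD):
        return "stated"
    return "outside"


def activator_umol(nucleotide_umol: float, ratio: float = 1 / 2) -> float:
    """Micromoles of A-L for a given nucleotide amount at a nucleotide:activator ratio."""
    if nucleotide_umol <= 0 or ratio <= 0:
        raise ValueError("amount and ratio must be positive")
    return nucleotide_umol / ratio
```

At the especially preferred 1:2 ratio, `activator_umol(10.0)` returns 20.0, i.e. two equivalents of activator per equivalent of phosphonothioate nucleotide.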
One embodiment of the present invention provides a compartmentalized kit for producing a polynucleotide chain of an oligonucleotide having at least five R-alkyl-phosphonate or R-aryl-phosphonate linkages, wherein the oligonucleotide has the formula: EMI33.1 wherein Y1, Y2, X, M and B are defined as hereinabove; and n is the number of Rp alkyl- or aryl-phosphonate linkages in the portion of the oligonucleotide and is an integer of from about 4 to about 200; which includes: (a) a first container adapted to contain A-L; and (b) a second container adapted to contain salts of a first alkyl- or aryl-phosphonothioate nucleotide precursor of the formula: EMI34.1 wherein the first precursor has an Sp stereoisomeric phosphorus configuration. The kit for producing such a polynucleotide chain of an oligonucleotide can further include at least one additional container adapted to contain a salt of a second Sp stereoisomeric alkyl- or aryl-phosphonothioate nucleotide precursor which has a different B group than the first precursor.
In a preferred embodiment, the alkyl- or aryl-phosphonothioate nucleotide precursors provided in the kit have a B group selected from the group of guanine, adenine, thymine, cytosine or uracil. Moreover, when a precursor is provided in a kit, the M group thereupon is preferably lower alkyl or aryl. A more preferred M group is methyl or ethyl. In addition, a preferred V2 group for a precursor provided in a kit of the present invention is a protecting group, preferably dimethoxytrityl or monomethoxytrityl. Furthermore, the present kits preferably have salts of the preferred A-L activators described hereinabove.
In a more preferred embodiment the kit provides a first container containing A-L, a second container containing a salt of alkyl- or aryl-phosphonothioate guanine, a third container containing a salt of alkyl- or aryl-phosphonothioate adenine, a fourth container containing a salt of alkyl- or aryl-phosphonothioate cytosine, a fifth container containing a salt of alkyl- or aryl-phosphonothioate thymine and optionally a sixth container containing a salt of alkyl- or aryl-phosphonothioate uracil. As used herein, salts of the present alkyl- or aryl-phosphonothioate nucleotide precursor are alkali metal or alkaline earth metal salts, for example Li, Na, K, Mg, Ca, and the like. Preferred salts are alkali metal salts, e.g., Li, Na, and K. Especially preferred salts are Li salts.
After synthesis by the present methods an oligonucleotide can be purified by polyacrylamide gel electrophoresis, or by any of a number of chromatographic methods, including gel chromatography and high pressure liquid chromatography.
In a preferred embodiment the present invention is directed to an oligonucleotide having at least five sequential Rp stereospecific alkyl- or aryl-phosphonate linkages produced by the present methods. While the oligonucleotides prepared by the present methods can have as few as five Rp stereospecific alkyl- or aryl-phosphonate linkages, preferred oligonucleotides have more than five Rp stereospecific linkages. For example, in another embodiment, oligonucleotides synthesized by the methods of the present invention generally have about 8 to about 200 alkyl- or aryl-phosphonate linkages. Preferred oligonucleotides of the present invention have about 10 to about 200 alkyl- or aryl-phosphonate linkages. More preferred oligonucleotides have about 12 to about 200 alkyl- or aryl-phosphonate linkages. Especially preferred oligonucleotides of the present invention have about 14 to about 200 alkyl- or aryl-phosphonate linkages.
According to the present invention, the subject methods produce R-stereospecific linkages at a higher frequency than S-stereospecific linkages. However, not all of the alkyl- or aryl-phosphonate linkages produced by the present methods may be R-stereospecific. Therefore, S-stereospecific linkages can occasionally be produced, for example, if the preparation of alkyl- or aryl-phosphonothioate nucleotide precursors employed has a small percentage of R-stereoisomeric nucleotide contaminants. Accordingly, the present invention is directed to methods of producing a higher percentage of R-stereospecific alkyl- and aryl-phosphonate linkages than S-stereospecific alkyl- and aryl-phosphonate linkages. In particular, the present methods can produce at least about 75% R-stereospecific linkages in an oligonucleotide wherein the remaining linkages can be S-stereospecific. More particularly, the oligonucleotides generated by the present methods have about 85% to about 100% R-stereospecific linkages. However, the present methods have the capability for producing oligonucleotides having about 95% to 100% R-stereospecific alkyl- or aryl-phosphonate linkages.
Moreover, the oligonucleotides of the present invention need not have only alkyl- or aryl-phosphonate linkages. In some instances oligonucleotides having a mixture of phosphonate EMI37.1 and conventional phosphodiester (-O-PO2-O-) linkages are preferred. For example, conventional phosphodiester linkages may be incorporated into the present oligonucleotides to generate an endonuclease cleavage site or to render the oligonucleotide sensitive to normal cellular enzymes at a particular sequence within the oligonucleotide. If the subject oligonucleotides have conventional phosphodiester linkages, these oligonucleotides can have about 1 to about 50 conventional phosphodiester linkages.
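The linkage composition described above (percentage of R-stereospecific linkages, runs of sequential Rp linkages, and number of conventional phosphodiester linkages) can be tallied with a short Python sketch. This is an illustration under stated assumptions, not a disclosed procedure: the one-letter encoding ("R" for an Rp phosphonate, "S" for an Sp phosphonate, "O" for a conventional phosphodiester) and the function name `summarize_linkages` are hypothetical, and the percentage is assumed to be taken over phosphonate linkages only.

```python
# Hypothetical one-letter codes for the linkages of an oligonucleotide, read 5' to 3':
#   "R" = Rp alkyl- or aryl-phosphonate, "S" = Sp alkyl- or aryl-phosphonate,
#   "O" = conventional phosphodiester (-O-PO2-O-)
ALLOWED = set("RSO")


def summarize_linkages(linkages: str) -> dict:
    """Return percent R among phosphonate linkages, longest R run, and phosphodiester count."""
    unknown = set(linkages) - ALLOWED
    if unknown:
        raise ValueError(f"unknown linkage codes: {sorted(unknown)}")
    n_r = linkages.count("R")
    n_s = linkages.count("S")
    if n_r + n_s == 0:
        raise ValueError("no phosphonate linkages to summarize")

    # Longest stretch of consecutive Rp linkages.
    longest = run = 0
    for code in linkages:
        run = run + 1 if code == "R" else 0
        longest = max(longest, run)

    return {
        # Assumption: percentage is computed over phosphonate linkages only.
        "percent_r": 100.0 * n_r / (n_r + n_s),
        "longest_r_run": longest,
        "phosphodiester": linkages.count("O"),
    }
```

For the hypothetical string `"RRRRRRSROO"`, the summary is 87.5% R among the phosphonate linkages, a longest run of six sequential Rp linkages, and two phosphodiester linkages.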
Therefore, the present invention is directed to oligonucleotides which can have conventional phosphodiester linkages, as well as both Sp stereospecific and Rp stereospecific phosphonate linkages, so long as the oligonucleotide has at least five sequential Rp stereospecific alkyl- or aryl-phosphonate linkages generated by the present methods.
According to the present invention, Rp stereospecific oligonucleotide products derived from the subject synthetic methods can have an attached agent to facilitate cellular delivery or uptake. Such an agent can, for example, be any known moiety which enhances cellular membrane penetration by the oligonucleotide, any known ligand for a cell-specific receptor or any available antibody reactive with a cell-specific antigen. A moiety or ligand which enhances cellular membrane penetration by the oligonucleotide can include, for example, any non-polar group, steroid, hormone, polycation, protein carrier, or viral or bacterial protein capable of cell membrane penetration. Such a non-polar group can be a phenyl, naphthyl, quinoline, anthracene, phenanthrene, fatty acid, fatty alcohol, sesquiterpene, diterpene and related groups. Steroids which can enhance cell uptake include cholesterol, progesterone, estrogen, androgen and related steroids. For example, covalent linkage of a cholesterol moiety to an oligonucleotide can improve cellular uptake by 5- to 10-fold, which in turn improves DNA binding by about 10-fold (Boutorin et al., 1989, FEBS Letters 254: 129-132). Hormones such as insulin can also bind to cell membranes and facilitate entry of an oligonucleotide attached thereto into the cell. Polycations, e.g. polyamino acid cations, including cations of basic amino acids, such as poly-L-lysine, can also facilitate uptake of oligonucleotides into cells (Schell, 1974, Biochim. Biophys. Acta 340: 323, and Lemaitre et al., 1987, Proc. Natl. Acad. Sci. USA 84: 648).
Certain protein carriers can also facilitate cellular uptake of oligonucleotides, including, for example, serum albumin, transferrin, nuclear proteins possessing signals for transport to the nucleus, and viral or bacterial proteins capable of cell membrane penetration. Accordingly, the present invention contemplates derivatization of the subject oligonucleotides with the above-identified groups to increase oligonucleotide cellular uptake.
Moreover, the present invention contemplates the preparation of Rp stereospecific linkages in oligonucleotides having any nucleotide sequence. In many instances the selection of a nucleotide sequence depends upon the intended purpose of the oligonucleotide; for example, the nucleotide sequence can be selected for the purpose of binding to a nucleic acid target. Such a nucleic acid target can be present within a template nucleic acid which encodes a DNA, RNA or protein. Moreover, binding of the subject oligonucleotides can be used, for example, to detect or to regulate the biosynthesis of such a template nucleic acid.
The present invention contemplates a variety of utilities for the subject Rp stereospecific oligonucleotides. Some utilities include, but are not limited to: use of oligonucleotides of defined sequence bound to a solid support for affinity isolation of complementary nucleic acids; covalent attachment of a drug, drug analog or other therapeutic agent to the oligonucleotide to allow cell-type specific drug delivery; labeling the subject oligonucleotides with a detectable reporter molecule for localizing, quantitating or identifying complementary target nucleic acids; and binding oligonucleotides to a cellular or viral nucleic acid template and regulating biosynthesis directed by that template.
The subject oligonucleotides can be attached to a solid support such as silica, cellulose, nylon, polystyrene, polyethylene glycol, Sepharose 4B and other natural or synthetic materials that are used to make beads, filters, and column chromatography resins. Attachment procedures for nucleic acids to solid supports of these types are well known; any known attachment procedure is contemplated by the present invention. An oligonucleotide attached to a solid support can then be used to isolate a complementary nucleic acid. Isolation of the complementary nucleic acid can be done by incorporating the oligonucleotide:solid support into a column for chromatographic procedures. Other isolation methods can be done without incorporation of the oligonucleotide:solid support into a column, e.g. by utilization of filtration procedures. Oligonucleotide:solid supports can be used, for example, to isolate poly(A)+ mRNA from total cellular or viral RNA by making an Rp alkyl- or aryl-phosphonate oligonucleotide with only poly(dT) or poly(U) B groups. The present Rp alkyl- and aryl-phosphonate oligonucleotides are ideally suited to applications of this type because they are nuclease resistant and bind strongly to target nucleic acids.
The present invention also contemplates using the subject oligonucleotides for targeting drugs to specific cell types. Such targeting can allow selective destruction or enhancement of particular cell types, e.g. inhibition of tumor cell growth can be attained. Different cell types express different genes, so that the concentration of a particular mRNA can be greater in one cell type relative to another cell type; such an mRNA is a target mRNA for cell type specific drug delivery by oligonucleotides linked to drugs or drug analogs. Cells with high concentrations of target mRNA are targeted for drug delivery by administering to the cell an oligonucleotide that is complementary to the target mRNA and has a covalently linked drug.
The present invention also contemplates labeling the subject oligonucleotides for use as probes to detect a target nucleic acid. Labelled oligonucleotide probes have utility in diagnostic and analytical hybridization procedures for localizing, quantitating or detecting a target nucleic acid in tissues, chromosomes or in mixtures of nucleic acids. Oligonucleotide probes of this invention represent a substantial improvement over conventional nucleic acid probes for such procedures because the present Rp stereospecific linkages provide oligonucleotides with increased binding stability.
Labeling an oligonucleotide can be done by incorporating nucleotides linked to a "reporter molecule" into the subject oligonucleotides. A "reporter molecule", as defined herein, is a molecule or atom which, by its chemical nature, provides an identifiable signal allowing detection of the oligonucleotide. Detection can be either qualitative or quantitative. The present invention contemplates using any commonly used reporter molecule including radionuclides, enzymes, biotins, psoralens, fluorophores, chelated heavy metals, and luciferin. The most commonly used reporter molecules are either enzymes, fluorophores or radionuclides which can be linked to nucleotides either before or after oligonucleotide synthesis. Preferably, the reporter molecule is added after oligonucleotide synthesis, for example, by forming a covalent linkage between a 3'- or 5'-terminal hydroxy or phosphate and a phosphate, nitrogen, sulfur or oxygen atom on the reporter molecule.
Commonly used enzymes include horseradish peroxidase, alkaline phosphatase, glucose oxidase and β-galactosidase, among others. The substrates to be used with the specific enzymes are generally chosen because a detectably colored product is formed by the enzyme acting upon the substrate.
For example, p-nitrophenyl phosphate is suitable for use with alkaline phosphatase conjugates; for horseradish peroxidase, 1,2-phenylenediamine, 5-aminosalicylic acid or toluidine are commonly used.
The probes so generated have utility in the detection of a specific DNA or RNA target in, for example, Southern analysis, Northern analysis, in situ hybridization to tissue sections or chromosomal squashes and other analytical and diagnostic procedures. Methods of using such hybridization probes are well known and examples of such methodology are provided by Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual, Vols. 1-3, Cold Spring Harbor Press, NY). The present oligonucleotides can be used in conjunction with any known detection or diagnostic procedure which is based upon hybridization of a probe to a target nucleic acid. Moreover, the present oligonucleotides can be used in any hybridization procedure which quantitates a target nucleic acid, e.g., by competitive hybridization between a target nucleic acid present in a sample and a labeled tracer target for one of the present oligonucleotides.
Furthermore, the reagents needed for making an oligonucleotide probe and for utilizing such a probe in a hybridization procedure can be marketed in a kit. The kit for detection of a hybridized oligonucleotide probe of the present invention can be compartmentalized for ease of utility and can contain at least one first container providing an oligonucleotide of the present invention. The kit can also be adapted to contain at least one other container providing reagents for labeling the oligonucleotide with a reporter molecule. Moreover, the kit can be further adapted to contain at least one other container providing reagents for detecting the reporter molecule linked to the oligonucleotide.
Moreover, the present invention provides a kit for isolation of a template nucleic acid.
Such a kit has at least one first container providing one of the present oligonucleotides which is complementary to a target contained within the template. For example, the template nucleic acid can be cellular and/or viral poly(A)+ mRNA and the target can be the poly(A) tail. Hence oligonucleotides of the present invention which have utility for isolation of poly(A)+ mRNA have a nucleotide sequence of poly(dT) or poly(U).
Furthermore, the present invention provides kits useful when diagnosis of a disease depends upon detection of a specific, known target nucleic acid. Such nucleic acid targets can be, for example, a viral nucleic acid, an extra or missing chromosome or gene, a mutant cellular gene or chromosome, an aberrantly expressed RNA and others. Examples of such target nucleic acids contemplated by the present invention are provided hereinbelow. These diagnostic kits can be compartmentalized to contain at least one first container providing an oligonucleotide linked to a reporter molecule and can contain at least one second container providing reagents for detection of the reporter molecule.
One aspect of the present invention provides a method of regulating biosynthesis of a DNA, an RNA or a protein by contacting at least one of the subject oligonucleotides with a nucleic acid template for that DNA, that RNA or that protein in an amount and under conditions sufficient to permit the binding of the oligonucleotide(s) to a target sequence contained in the template. The binding between the oligonucleotide(s) and the target can regulate biosynthesis of the nucleic acid or the protein, e.g. by blocking access to the template. When access to the template is blocked, proteins and nucleic acids involved in the biosynthetic process are prevented from binding to the template, from moving along the template, or from recognizing signals encoded within the template.
As used herein, biosynthesis of a nucleic acid or a protein includes cellular and viral processes such as DNA replication, DNA reverse transcription, RNA transcription, RNA splicing, RNA polyadenylation, RNA translocation and protein translation, and related processes which can lead to production of DNA, RNA or protein, and involve a nucleic acid template at some stage of the biosynthetic process. As used herein, a nucleic acid template can be an RNA or a DNA template.
As contemplated by the present invention, regulating biosynthesis includes inhibiting, stopping, increasing, accelerating or delaying biosynthesis. Regulation may be direct or indirect, i.e. biosynthesis of a DNA, RNA or protein may be regulated directly by binding an oligonucleotide to the template for that DNA, RNA or protein; alternatively, biosynthesis may be regulated indirectly by oligonucleotide binding to a second template encoding a protein that plays a role in regulating the biosynthesis of the first DNA, RNA or protein.
DNA replication from a DNA template is mediated by proteins which bind to an origin of replication where they open the DNA and initiate DNA synthesis along the DNA template. To inhibit DNA replication in accordance with the present invention, oligonucleotides are selected which bind to one or more targets in an origin of replication. Such binding blocks template access to proteins involved in DNA replication. Therefore, initiation and progression of DNA replication are inhibited. As an alternative method of inhibiting DNA replication, expression of the proteins which mediate DNA replication can be inhibited at, for example, the transcriptional or translational level.
DNA replication from an RNA template is mediated by reverse transcriptase binding to a region of RNA also bound by a nucleic acid primer.
To inhibit DNA replication from an RNA template, reverse transcriptase or primer binding can be blocked by binding an oligonucleotide to the primer binding site, thereby blocking access to that site. Moreover, inhibition of DNA replication can occur by binding an oligonucleotide to a site residing in the RNA template since such binding can block access to that site and to downstream sites, i.e. sites on the 3' side of the target or binding site.

To initiate RNA transcription, RNA polymerase recognizes and binds to specific start sequences, or promoters, on a DNA template. Binding of RNA polymerase opens the DNA template. There are also additional transcriptional regulatory elements that play a role in transcription and are located on the DNA template. These transcriptional regulatory elements include enhancer sequences, upstream activating sequences, repressor binding sites and others. All such promoter and transcriptional regulatory elements, singly or in combination, are targets for the subject oligonucleotides. Oligonucleotide binding to these sites can block RNA polymerase and transcription factors from gaining access to the template and thereby regulate, e.g. increase or decrease, the production of RNA, especially mRNA and tRNA. Additionally, the subject oligonucleotides can be targeted to the coding region or 3'-untranslated region of the DNA template to cause premature termination of transcription. One skilled in the art can readily design oligonucleotides for the above target sequences from the known sequence of these regulatory elements, from coding region sequences, and from consensus sequences. RNA transcription can be increased by, for example, binding an oligonucleotide to a negative transcriptional regulatory element or by inhibiting biosynthesis of a protein that can repress transcription. Negative transcriptional regulatory elements include repressor sites or operator sites, wherein a repressor protein binds and blocks transcription.
Oligonucleotide binding to repressor or operator sites can block access of repressor proteins to their binding sites and thereby increase transcription.

The primary RNA transcript made in eukaryotic cells, or pre-mRNA, is subject to a number of maturation processes before being translocated into the cytoplasm for protein translation. In the nucleus, introns are removed from the pre-mRNA in splicing reactions. The 5' end of the mRNA is modified to form the 5' cap structure, thereby stabilizing the mRNA. Various bases are also altered. The polyadenylation of the mRNA at the 3' end is thought to be linked with export from the nucleus. The subject oligonucleotides can be used to block any of these processes. A pre-mRNA template is spliced in the nucleus by ribonucleoproteins which bind to splice junctions and intron branch point sequences in the pre-mRNA. Consensus sequences for 5' and 3' splice junctions and for the intron branch point are known. For example, inhibition of ribonucleoprotein binding to the splice junctions or inhibition of covalent linkage of the 5' end of the intron to the intron branch point can block splicing. Maturation of a pre-mRNA template can, therefore, be blocked by preventing access to these sites, i.e. by binding oligonucleotides of this invention to a 5' splice junction, an intron branch point or a 3' splice junction. Splicing of a specific pre-mRNA template can be inhibited by using oligonucleotides with sequences that are complementary to the specific pre-mRNA splice junction(s) or intron branch point. In a further embodiment, splicing of a collection of related pre-mRNA templates can be inhibited by using a mixture of oligonucleotides having a variety of sequences that, taken together, are complementary to the desired group of splice junction and intron branch point sequences.
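The design step described above, choosing an oligonucleotide whose sequence is complementary to a known target site, can be sketched in a few lines of Python. This is a minimal illustration only: the function name `antisense_oligo` and the eight-base example target are hypothetical and are not taken from this specification. The sketch assumes strict Watson-Crick pairing and a DNA oligonucleotide, so any U in an RNA target is treated as T.

```python
# Sketch only: the helper name and the example target are hypothetical,
# chosen for illustration and not taken from the specification.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(target: str) -> str:
    """Return the DNA oligonucleotide (written 5'->3') that is fully
    complementary to `target` (also written 5'->3').

    RNA targets are accepted; U is treated as T before complementing.
    """
    dna = target.upper().replace("U", "T")
    # Complement each base while walking the target from its 3' end,
    # which yields the antiparallel strand in 5'->3' orientation.
    return "".join(_COMPLEMENT[base] for base in reversed(dna))


# Hypothetical pre-mRNA target site, for illustration only.
example_target = "AGGUAAGU"
print(antisense_oligo(example_target))
```

A real design would also weigh length, binding conditions and the degree of complementarity actually required, which the specification discusses separately.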
Polyadenylation involves recognition and cleavage of a pre-mRNA by a specific RNA endonuclease at specific polyadenylation sites, followed by addition of a poly(A) tail onto the 3' end of the pre-mRNA. Hence, any of these steps can be inhibited by binding the subject oligonucleotides to the appropriate site. RNA translocation from the nucleus to the cytoplasm of eukaryotic cells appears to require a poly(A) tail. Thus, an oligonucleotide is designed in accordance with this invention to bind to the poly(A) tail and thereby inhibit RNA translocation. The sequence of such an oligonucleotide can consist of about 10 to about 50 thymine residues, and preferably about 20 thymine residues.

Protein biosynthesis begins with the binding of ribosomes to an mRNA template, followed by initiation and elongation of the amino acid chain via translational "reading" of the mRNA. Protein biosynthesis, or translation, can thus be blocked or inhibited by blocking access to the template using the subject oligonucleotides to bind to targets in the template mRNA. Such targets contemplated by this invention include the ribosome binding site, the 5' mRNA cap site, an initiation codon, a site between a 5' mRNA cap site and the initiation codon, and sites in the protein coding sequence. There are also classes of protein which share domains of nucleotide sequence homology. Thus, inhibition of protein biosynthesis for such a class can be accomplished by targeting the homologous protein domains (via the coding sequence) with the subject oligonucleotides.

Regulation of biosynthesis by any of the aforementioned procedures has utility for many applications.
For example, genetic disorders can be corrected by inhibiting the production of mutant or over-produced proteins, or by increasing production of under-expressed proteins; the expression of genes encoding factors that regulate cell proliferation can be inhibited to control the spread of cancer; and virally encoded functions can be inhibited to combat viral infection. Some types of genetic disorders that can be treated by the oligonucleotides of the present invention include Alzheimer's disease, some types of arthritis, sickle cell anemia, and types of cancer for which patients can be genetically predisposed, as well as other genetic disorders. Many types of viral infections can be treated by utilizing the oligonucleotides of the present invention, including infections caused by influenza, rhinovirus, human immunodeficiency virus, herpes simplex, papilloma virus, cytomegalovirus, Epstein-Barr virus, adenovirus, vesicular stomatitis virus, rotavirus and respiratory syncytial virus, among others. According to the present invention, animal and plant viral infections may also be treated by administering the subject oligonucleotides. Accordingly, template nucleic acids contemplated by the present invention include cellular oncogenes, genes having a role in Alzheimer's disease, genetic functions encoded by viruses such as those described above, and others.
Such template nucleic acids include but are not limited to SEQ ID NO:1 to SEQ ID NO:93 which encode the following genetic functions: SEQ ID NO:1 human c-abl; SEQ ID NO:2 human c-bcl-2a; SEQ ID NO:3 human c-bcl-2b; SEQ ID NO:4 human c-bcr-1; SEQ ID NO:5 human c-bcr-2; SEQ ID NO:6 human c-bcr-3; SEQ ID NO:7 human c-cbl; SEQ ID NO:8 human c-erbB-2; SEQ ID NO:9 human c-ets-1; SEQ ID NO:10 human c-dbl; SEQ ID NO:11 human c-fgf; SEQ ID NO:12 human c-fgr-1; SEQ ID NO:13 human c-fgr-2; SEQ ID NO:14 human c-fgr-3; SEQ ID NO:15 human c-fgr-4; SEQ ID NO:16 human c-fgr-5; SEQ ID NO:17 human c-fgr-6; SEQ ID NO:18 human c-fgr-7; SEQ ID NO:19 human c-fms; SEQ ID NO:20 human c-fos; SEQ ID NO:21 human c-has/bas; SEQ ID NO:22 human c-int-1; SEQ ID NO:23 human c-int-2; SEQ ID NO:24 human c-jun; SEQ ID NO:25 human c-kit; SEQ ID NO:26 human c-mas; SEQ ID NO:27 human c-met; SEQ ID NO:28 human c-myc; SEQ ID NO:29 human c-Ki-ras1; SEQ ID NO:30 human N-ras-1; SEQ ID NO:31 human N-ras-2; SEQ ID NO:32 human N-ras-3; SEQ ID NO:33 human N-ras-4; SEQ ID NO:34 human c-ret; SEQ ID NO:35 human c-ros-1; SEQ ID NO:36 human c-ros-2; SEQ ID NO:37 human c-ros-3; SEQ ID NO:38 human c-ros-4; SEQ ID NO:39 human c-ros-5; SEQ ID NO:40 human c-ros-6; SEQ ID NO:41 human c-ros-7; SEQ ID NO:42 human c-ros-8; SEQ ID NO:43 human c-ros-9; SEQ ID NO:44 human c-ros-10; SEQ ID NO:45 human c-sec; SEQ ID NO:46 human c-sis-1; SEQ ID NO:47 human c-sis-2; SEQ ID NO:48 human c-sis-3; SEQ ID NO:49 human c-sis-4; SEQ ID NO:50 human c-sis-5; SEQ ID NO:51 human c-sis-a1; SEQ ID NO:52 human c-sis-a2; SEQ ID NO:53 human c-sis-a3; SEQ ID NO:54 human c-sis-a4; SEQ ID NO:55 human c-sis-a5; SEQ ID NO:56 human c-sis-a6; SEQ ID NO:57 human c-sis-a7; SEQ ID NO:58 human c-sis-b1; SEQ ID NO:59 human c-sis-b2; SEQ ID NO:60 human c-sis-b3; SEQ ID NO:61 human c-sis-b4; SEQ ID NO:62 human c-sis-b5; SEQ ID NO:63 human c-snoA; SEQ ID NO:64 human c-snoN; SEQ ID NO:65 human c-spi-1; SEQ ID NO:66 human c-src-1; SEQ ID NO:67 human c-src-2; SEQ ID NO:68 human c-src-3; SEQ ID NO:69 human c-src-4; SEQ ID NO:70 human c-src-5; SEQ ID NO:71 human c-src-6; SEQ ID NO:72 human c-src-7; SEQ ID NO:73 human c-src-8; SEQ ID NO:74 human c-src-9; SEQ ID NO:75 human c-src-10; SEQ ID NO:76 human c-src-11; SEQ ID NO:77 human c-syn; SEQ ID NO:78 human c-trk; SEQ ID NO:79 human c-vav; SEQ ID NO:80 human c-mos-OA; SEQ ID NO:81 human GP5-mos; SEQ ID NO:82 human c-yes-1; SEQ ID NO:83 human c-yes-2; SEQ ID NO:84 human c-ski-1; SEQ ID NO:85 human c-ski-2; SEQ ID NO:86 human c-ski-3; SEQ ID NO:87 human c-ski-4; SEQ ID NO:88 human c-ski-5; SEQ ID NO:89 human c-myb-1; SEQ ID NO:90 human c-myb-2; SEQ ID NO:91 human c-myb-3; SEQ ID NO:92 human c-myb-4; SEQ ID NO:93 human c-rel.

Moreover, according to the present invention the subject oligonucleotides need have only sufficient complementarity to detectably bind to either strand of a target nucleic acid sequence, e.g. SEQ ID NO:1-93. Complementarity between nucleic acids is the degree to which the bases in one nucleic acid strand can hydrogen bond, or base pair, with the bases in a second nucleic acid strand. Hence, complementarity can sometimes be conveniently described by the percentage, i.e. proportion, of nucleotides which form base pairs between two strands or within a specific region or domain of two strands. For the present invention, sufficient complementarity means that a sufficient number of base pairs exists between the subject oligonucleotides and a target nucleic acid to achieve detectable binding of the oligonucleotide. Therefore a sufficient number of, but not necessarily all, nucleotides in the present oligonucleotides can hydrogen bond to a target.
The number of positions which are necessary to provide sufficient complementarity for binding of the subject oligonucleotides can be determined by standard procedures including a melting temperature determination, standard Southern and Northern hybridization, light absorption detection, gel shift, DNA footprinting, alkylation interference and related procedures (as provided for example in Sambrook et al.). Moreover, according to the present invention oligonucleotide binding can be detected functionally, e.g. by observing a decrease in cellular or viral proliferation or by observing a decrease or increase in the synthesis of the DNA, RNA or protein encoded within or by a template nucleic acid. Accordingly, the degree of complementarity between an oligonucleotide of the present invention and a strand of a target nucleic acid need not be 100% so long as oligonucleotide binding can be detected. However, it is preferred that the present oligonucleotides have at least about 50% complementarity with their target nucleic acids. In an especially preferred embodiment, sufficient complementarity is greater than 70% complementarity with the target.

Moreover, the degree of complementarity that provides detectable binding between the subject oligonucleotides and the target is dependent upon the conditions under which that binding occurs. It is well known that binding between nucleic acid strands depends on factors besides the degree of mismatch between two sequences. Such factors include the GC content of the region, temperature, ionic strength, the presence of formamide and the types of counter ions present. The effect that these conditions have upon binding is known to one skilled in the art. Furthermore, conditions are frequently determined by the circumstances of use. For example, when an oligonucleotide is made for use in vivo, no formamide will be present and the ionic strength, types of counter ions, and temperature correspond to physiological conditions.
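The percentage description of complementarity used above can be made concrete with a short calculation. The following Python sketch is illustrative only: the function name, the example sequences and the simplifications are assumptions, not part of the specification. It assumes an ungapped, antiparallel alignment of an oligonucleotide against a target site of equal length and counts only Watson-Crick pairs (U is treated as T). It says nothing about whether binding would actually be detectable, which, as noted above, depends on the binding conditions.

```python
# Sketch only: names and example sequences are hypothetical.
_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(oligo: str, target: str) -> float:
    """Percentage of oligo positions that form Watson-Crick pairs with
    `target` when the two strands (both written 5'->3') are aligned
    antiparallel without gaps. Both strands must have the same length."""
    oligo = oligo.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    if not oligo or len(oligo) != len(target):
        raise ValueError("oligo and target must be non-empty and equal in length")
    # Position i of the oligo faces position (n - 1 - i) of the target.
    paired = sum((o, t) in _PAIRS for o, t in zip(oligo, reversed(target)))
    return 100.0 * paired / len(oligo)


# A fully complementary pair, then the same oligo with one 3'-terminal mismatch.
print(percent_complementarity("ACTTACCT", "AGGTAAGT"))
print(percent_complementarity("ACTTACCA", "AGGTAAGT"))
```

Under these assumptions, the second example (seven of eight positions paired) would still exceed both the "at least about 50%" and the "greater than 70%" preferences stated in the text.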
Binding conditions can be manipulated in vitro to optimize the utility of the present oligonucleotides. A thorough treatment of the qualitative and quantitative considerations involved in establishing binding conditions that allow one skilled in the art to design appropriate oligonucleotides for use under the desired conditions is provided by Beltz et al., 1983, Methods Enzymol. 100: 266-285 and by Sambrook et al. Thus for the present invention, one of ordinary skill in the art can readily design a nucleotide sequence for the subject oligonucleotides which exhibits sufficient complementarity to detectably bind to the target nucleic acid of interest, including nucleic acids having SEQ ID NO: 1-93. To confirm a nucleotide sequence, oligonucleotides may be subjected to DNA sequencing by any of the known procedures, including Maxam and Gilbert sequencing, Sanger sequencing, capillary electrophoresis sequencing, the wandering spot sequencing procedure or by using selective chemical degradation of oligonucleotides bound to Hybond paper. Sequences of oligonucleotides can also be analyzed by plasma desorption mass spectroscopy or by fast atom bombardment (McNeal, et al., 1982, J. Am. Chem. Soc. 104: 976; Viari, et al., 1987, Biomed. Environ. Mass Spectrom. 14: 83; Grotjahn et al., 1982, Nuc. Acid Res. 10: 4671). Sequencing methods are also available for RNA oligonucleotides. A further aspect of this invention provides pharmaceutical compositions containing the subject oligonucleotides with a pharmaceutically acceptable carrier. In particular, the present invention provides a pharmaceutical composition for regulating biosynthesis of a nucleic acid or protein comprising a pharmaceutically effective amount of the subject oligonucleotide with a pharmaceutically acceptable carrier. 
As used herein, a pharmaceutically effective amount of the subject oligonucleotides is about 0.1 µg to about 100 mg per kg of body weight per day, and preferably about 0.1 µg to about 10 mg per kg of body weight per day. Dosages can be readily determined by one of ordinary skill in the art and formulated into the subject pharmaceutical compositions. As used herein, "pharmaceutically acceptable carrier" includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredient, its use in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions.

The subject oligonucleotides can be provided to a mammalian cell by topical or parenteral administration, for example, by intravenous, intramuscular, intraperitoneal, subcutaneous or intradermal route, or, when suitably protected, the subject oligonucleotides can be orally administered. The subject oligonucleotides may be incorporated into a cream, solution or suspension for topical administration. For oral administration, oligonucleotides may be protected by enclosure in a gelatin capsule. Oligonucleotides may be incorporated into liposomes or liposomes modified with polyethylene glycol for parenteral administration. Incorporation of additional substances into the liposome, for example, antibodies reactive against membrane proteins found on specific target cells, can help target the oligonucleotides to specific cell types. Topical administration and parenteral administration in a liposomal carrier are preferred.

The following examples further illustrate the invention.
EXAMPLE 1

PREPARATION OF STEREOISOMERIC ALKYL- OR ARYL-PHOSPHONOTHIOATE NUCLEOTIDES

The reactions used to produce Sp- and Rp-stereospecific nucleotides are as described in Lebedev et al., 1990, Tetrahedron Letters 31: 855-858, and are depicted below in Reaction Scheme I. DMT is used for dimethoxytrityl in Reaction Scheme I. The phosphate present on 5'-dimethoxytritylthymidyl-3'-methylphosphonoamidate (1) was protected by cyanoethylation in the presence of 4-(N,N-dimethylamino)pyridine (DMAP) and trifluoroacetic anhydride at room temperature to produce 5'-dimethoxytritylthymidyl-3'-methylphosphono-ethylcyanate (2). The phosphite triester present in 2 was then oxidized with sulfur (S8) in the presence of CH3CN to generate racemic 5'-dimethoxytritylthymidyl-3'-cyanoethylphosphonothioate (3). The diastereomers of 3 were purified and separated by silica high pressure liquid chromatography (HPLC). Cyanoethyl groups were removed with concentrated ammonium hydroxide in ethanol (v/v 1:2). The deprotected diastereomers were then purified by silica HPLC and the ammonium cation was replaced with lithium (Li+) by using a Dowex 50W x 2 exchange column to yield the lithium salts of separate Sp- and Rp-stereoisomers of 5'-dimethoxytritylthymidyl-3'-methylphosphonothioate (4). Each reaction was monitored by observation of distinct 31P nuclear magnetic resonance (NMR) peaks which are characteristic of a given reactant or product. The chemical shift frequencies (δ) of the reactants, intermediates and products generated during the foregoing synthetic procedures are provided in Reaction Scheme I.

EXAMPLE 2

METHODS FOR DETECTING AND MONITORING THE STEREOISOMERIC CONFIGURATION OF A PHOSPHONATE LINKAGE

Separation of Stereoisomers: Rp and Sp stereoisomers of alkyl- or arylphosphonate nucleotides, prepared as in Example 1, were stable and were separated by ion exchange chromatography or by high pressure liquid chromatography (HPLC) using anhydrous or aqueous solvents.
Reversed phase or silica gel columns were employed when separation was by HPLC. For example, Sp- and Rp-stereoisomers of 5'-dimethoxytritylthymidyl-3'-methylphosphonothioate were separated by C18 silica gel HPLC using either acetic acid/methanol or a gradient of 10-15% acetonitrile in water. Similarly, racemic 5',3'-protected dithymidine methylphosphonate was resolved into Rp and Sp stereoisomers by HPLC on a 4.6 x 250 mm column of silica gel using a gradient of 10-15% acetonitrile in water for elution (Fig. 1). Accordingly, Rp and Sp stereoisomers of both nucleotides and short oligonucleotides can be chromatographically separated.

Detection by Circular Dichroism: Circular dichroism (CD) has been used to detect stereoisomeric differences. For example, separate Rp and Sp stereoisomers of dithymidine methylphosphonate have different CD spectra, wherein the Rp isomer has a larger CD peak and the Sp isomer CD trough is blue-shifted (Fig. 2).

Detection by Nuclear Magnetic Resonance: Separated Rp and Sp stereoisomers have distinctive 1H and 31P nuclear magnetic resonance (NMR) spectra. For example, Figs. 3 and 4 depict the respective 1H and 31P NMR spectra of both Rp and Sp stereoisomers of dithymidine methylphosphonate.

Detection by Mass Spectroscopy: Fast atom bombardment mass spectrometry (FABMS) has been used extensively to examine the structures of oligonucleotides having molecular weights up to 10,000 g/mole (Stec et al. 1985 J. Org. Chem. 50: 3908; Ulrich et al. 1984 Org. Mass Spectrom. 19: 585; Grotjahn et al. 1982 Nucleic Acids Res. 10: 4671; Grotjahn et al. 1983 Int. J. Mass Spectrom. Ion Phys. 46: 439; Sindona et al. 1982 J. Chem. Res. (S): 184; Eagles et al. 1984 Biomed. Mass Spectrom. 11: 41; Connolly et al. 1984 Biochemistry 23: 3443-3453; and Matsuo et al. 1986 34th Annual Conference on Mass Spectrometry and Allied Topics, 329). Therefore, FABMS has utility for structural analyses of R and S stereoisomers of alkyl- and aryl-phosphonates.
For example, FABMS of tetrathymidine methylphosphonate (i.e. DMT-TpTpTpT-OAc) which was sputtered from thioglycerol yielded the spectrogram depicted in Fig. 5, wherein peaks corresponding to distinct molecular fragments are identified (e.g. DMT-TpT is dimethoxytrityl-dithymidine methylphosphonate).

EXAMPLE 3

A METHOD FOR MAKING AN R STEREOISOMERIC ALKYLPHOSPHONATE LINKAGE

Reactions for producing an Rp-stereospecific linkage are depicted below in Reaction Scheme II. DMT is used for dimethoxytrityl in Reaction Scheme II. A purified Sp stereoisomer of a 3'-O-methylphosphonate nucleotide (4) is made by the procedures provided in Example 1. In the first step, 1 mmole of 4 is reacted with 2 mmole of an activator, 2-chloro-N-methylpyridinium (5), in 10 ml of acetonitrile (CH3CN) under anhydrous conditions at room temperature for 1 min. This reaction yields a reactive intermediate (6) without altering the S configuration of the phosphate. To this reaction mixture is added 0.5 mmole of a 5'-OH unprotected nucleoside and the mixture is allowed to react at room temperature for 60 min. The 5'-oxygen atom of the nucleoside attacks the phosphorus and displaces the sulfur with the attached activating group by SN2 nucleophilic substitution. This SN2 displacement reaction inverts the S stereoisomeric configuration of the phosphorus to an R stereoisomeric configuration. The displaced 2-thio-N-methylpyridinium molecule is stabilized by resonance tautomerization and does not further react with the phosphorus to cause epimerization of the R configuration. This was confirmed in a separate study, wherein the only 31P NMR peaks characteristic of a phosphorus-sulfur linkage (at 86-88 ppm) were attributable to reactive intermediate 6.

SEQUENCE LISTING

(1) GENERAL INFORMATION: (i) APPLICANT: Wickstrom, Eric and Lebedev, Alexander V.
(ii) TITLE OF INVENTION: Pentavalent Synthesis of Oligonucleotides containing Stereospecific Alkyiphosphonates and Arylphosphonates (iii) NUMBER OF SEQUENCES: 93 (iv) CORRESPONDENCE ADDRESS: (A) ADDRESSEE: SCULLY, SCOTT, MURPHY & PRESSER (B) STREET: 400 Garden City Plaza (C) CITY: Garden City (D) STATE: NY (E) COUNTRY: USA (F) ZIP: 11530 (v) COMPUTER READABLE FORM: (A) MEDIUM TYPE: Floppy disk (B) COMPUTER: IBM PC compatible (C) (D) TOPOLOGY: linear (li) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: GATCTTGCTG CCCGAAACTG CCTGGTAGGG GAGAACCACT TGGTGAAGGT AGCTGATTTT 60 GGCCTGAGCA GGTTGATGAC AGGGGACACC TACACAGCCC ATGCTGGAGC CAAGTTCCCC 120 ATCAAATGGA CTGCACCCGA GAGCCTGGCC TACAACAAGT TCTCCATCAA GTCCGACGTC 180 TGGGGTAAGG GC 192 (2) INFORMATION FOR SEO ID NO:2: (1) SEQUENCE CHARACTERISTICS: (A) LENGTH: 5086 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (11) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEO ID NO:2: GCGCCCGCCC CTCCGCGCCG CCTGCCCGCC CGCCCGCCGC GCTCCCGCCC GCCGCTCTCC 60 GTGGCCCCGC CGCGCTGCCG CCGCCGCCGC TGCCAGCGAA GGTGCCGGGG CTCCGGGCCC 120 TCCCTGCCGG CGGCCGTCAG CGCTCGGAGC GAACTGCGCG ACGGGAGGTC CGGGAGGCGA 180 CCGTAGTCGC GCCGCCGCGC AGGACCAGGA GGAGGAGAAA GGGTGCGCAG CCCGGAGGCG 240 GGGTGCGCCG GTGGGGTGCA GCGGAAGAGG GGGTCCAGGG GGGAGAACTT CGTAGCAGTC 300 ATCCITTTTA GGAAAAGAGG GAAAAAATAA AACCCTCCCC CACCACCTCC TTCTCCCCAC 360 CCCTCGCCGC ACCACACACA GCGCGGGCTT CTAGCGCTCG GCACCGGCGG GCCAGGCGCG 420 TCCTGCCTTC ATTTATCCAG CAGCTTTTCG GAAAATGCAT TTGCTGTTCG GAGTTTAATC 480 AGAAGACGAT TCCTGCCTCC GTCCCCGGCT CCTTCATCGT CCCATCTCCC CTGTCTCTCT 540 CCTGGGGAGG CGTGAAGCGG TCCCGTGGAT AGAGATTCAT GCCTGTGTCC GCGCGTGTGT 600 GCGCGCGTAT AAATTGCCGA GAAGGGGAAA ACATCACAGG ACTTCTGCGA ATACCGGACT 660 GAAAATTGTA ATTCATCTGC CGCCGCCGCT GCCAAAAAAA AACTCGAGCT CTTGAGATCT 720 CCGGTTGGGA TTCCTGCGGA TTGACATTTC TGTGAAGCAG AAGTCTGGGA ATCGATCTGG 780 AAATCCTCCT AATTTTTATC CCCTCTCCCC CCGACTCCTG ATTCATTGGG AAGTTTCAAA 840 TCAGCTATAA CTGGAGAGTG 
CTGAAGATTG ATGGGATCGT TGCCTTATGC ATTTGTTTTG 900 GTTTTACAAA AAGGAAACTT GACAGAGGAT CATGCTGTAC TTAAAAAATA CAAGTAAGTC 960 TCGCACAGGA AATTGGTTTA ATGTAACTTT CAATGGAAAC CTTTGAGATT TTTTACTAA 1020 AGTGCATTCG AGTAAATTTA ATTTCCAGGC AGCTTAATAC ATTGTTTTTA GCCGTGTTAC 1080 TTGTAGTGTG TATGCCCTGC TTTCACTCAG TGTGTACAGG GAAACGCACC TGAT: TTTA 1140 CTTATTAGTT TGTTTTTCT TTAACCrrTC AGCATCACAG AGGAAGTAGA CTGATATTAA 1200 CAATACTTAC TAATAATAAC GTGCCTCATG AAATAAAGAT CCGAAAGGAA TTGGAATAAA 1260 AATTTCCTGC GTCTCATGCC AAGAGGGAAA CACCAGAATC AAGTGTTCCG CGTGATTGAA 1320 GACACCCCCT CGTCCAAGAA TGCAAAGCAC ATCCAATAAA ATAGCTGGAT TATAACTCCT 1380 CTTCTTTCTC TGGGGGCCGT GGGGTGGGAG CTGGGGCGAG AGGTGCCGTT GGCCCCCGTT 1440 GCTTTTCCTC TGGGAAGGAT GGCGCACGCT GGGAGAACGG GGTACGACAA CCGGGAGATA 1500 GTGATGAAGT ACATCCATTA TAAGCTGTCG CAGAGGGGCT ACGAGTGGGA TGCGGGAGAT 1560 GTGGGCGCCG CGCCCCCGGG GGCCGCCCCC GCACCGGGCA TCTTCTCCTC CCAGCCCGGC. 1620 CACACGCCCC ATCCAGCCGC ATCCCGCGAC CCGGTCGCCA GGACCTCGCC GCTGCAGACC 1680 CCGGCTGCCC CCGGCGCCGC CGCGGGGCCT GCGCTCAGCC CGGTGCCACC TGTGGTCCAC 1740 CTGGCCCTCC GCCAAGCCGG CGACGACTTC <RTI ID=65.10> TCCCGCCGCT ACCGCGGCGA CTTCGCCGA(; 1800 ATGTCCAGCC AGCTGCACCT GACGCCCTTC ACCGCGCGGG GACGCTTTGC CACGGTGGTG 1860 GAGGAGCTCT TCAGGGACGG GGTGAACTGG GGGAGGATTG TGGCCTTCTT TGAGTTCGGT 1920 GGGGTCATGT GTGTGGAGAG CGTCAACCGG GAGATGTCGC CCCTGGTGGA CAACATCGCC 1980 CTGTGGATGA CTGAGTACCT GAACCGGCAC CTGCACACCT GGATCCAGGA TAACGGAGGC 2040 TGGGATGCCT TTGTGGAACT GTACGGCCCC AGCATGCGGC CTCTGTTTGA TTTCTCCTGG 2100 CTGTCTCTGA AGACTCTGCT CAGTTTGGCC CTGGTGGGAG CTTGCATCAC CCTGGGTGCC 2160 TATCTGAGCC ACAAGTGAAG TCAACATGCC TGCCCCAAAC AAATATGCAA AAGGTTCACT 2220 AAAGCAGTAG AAATAATATG CATTGTCAGT GATGTACCAT GAAACAAAGC TGCAGGCTGT 2280 TTAAGAAAAA ATAACACACA TATAAACATC ACACACACAG ACAGACACAC ACACACACAA 2340 CAATTAACAG TCTTCAGGCA AAACGTCGAA TCAGCTATTT ACTGCCAAAG GGAAATATCA 2400 TTTATTTTTT ACATTATTAA GAAAAAAGAT TTATTTATIT AAGACAGTCC CATCAAAACT 2460 CCGTCTTTGG AAATCCGACC ACTAATTGCC AAACACCGCT TCGTGTGGCT CCACCTGGAT 2520 GTTCTGTGCC 
TGTAAACATA GATTCGCTTT CCATGTTGTT GGCCGGATCA CCATCTGAAG 2580 AGCAGACGGA TGGAAAAAGG ACCTGATCAT TGGGGAAGCT GGCTTTCTGG CTGCTGGAGG 2640 CTGGGGAGAA GGTGTTCATT CACTTGCATT TCTTTGCCCT GGGGGCGTGA TATTAACAGA 2700 GGGAGGGTTC CCGTGGGGGG AAGTCCATGC CTCCCTGGCC TGAAGAAGAG ACTCTTTGCA 2760 TATGACTCAC ATGATGCATA CCTGGTGGGA GGAAAAGAGT TGGGAACTTC AGhTGGACCT 2820 AGTACCCACT GAGATTTCCA CGCCGAAGGA CAGCGATGGG AAAAATGCCC TTAAATCATA 2880 GGAAAGTATT TTTTTAAGCT ACCAATTGTG CCGAGAAAAG CATTTTA(; CA ATTrATACAA 2940 TATCATCCAG TACCTTAAAC CCTGATTGTG TATATTCATA TATTTTGGAT ACGCACCCCC 3000 CAACTCCCAA TACTGGCTCT GTCTGAGTAA GAAACAGAAT CCTCTGGAAC TTGAGGAAGT 3060 GAACATTTCG GTGACTTCCG ATCAGGAAGG CTAGAGTTAC CCAGAGCATC AGGCCGCCAC 3120 AAGTGCCTGC TTTTAGGAGA CCGAAGTCCG CAGAACCTAC CTGTGTCCCA GCTTGGAGGC 3180 CTGGTCCTGG AACTGAGCCG GGCCCTCACT GGCCTCCTCC AGGGATGATC AACAGGGTAG 3240 TGTGGTCTCC GAATGTCTGG AAGCTGATGG ATGGAGCTCA CAATTCCACT GTCAAGAAAG 3300 AGCAGTAGAG GGGTGTGGCT GGGCCTGTCA CCCTGGGGCC CTCCAGGTAG GCCCGTTTTC 3360 ACGTGGAGCA TAGGAGCCAC GACCCTTCTT AAGACATGTA TCACTGTAGA GGGAAGGAAC 3420 AGAGGCCCTG GGCCTTCCTA TCAGAAGGAC ATGGTGAAGG CTGGGAACGT GAGGAGAGGC 3480 AATGGCCACG GCCCATTTTG GCTGTAGCAC ATGGCACGTT GGCTGTGTGG CCTTGGCCAC 3540 CTGTGAGTTT AAAGCAAGGC <RTI ID=67.2> TTTAAATGAC TTTGGAGAGG GTCACAAATC CTAAAAGAAG 3600 CATTGAAGTG AGGTGTCATG GATTAATTGA CCCCTGTCTA TGGAATTACA TGTAAAACAT 3660 TATCTTGTCA CTGTAGTTTG GTTTTATTTG AAAACCTGAC AAAAAAAAAG TTCCAGGTGT 3720 GGAATATGGG GGTTATCTGT ACATCCTGGG GCATTAAAAA AAAATCAATG GTGGGGAACT 3780 ATAAAGAAGT AACAAAAGAA GTGACATCTT CAGCAAATAA ACTAGGAAAT TTTTTTTTCT 3840 TCCAGTTTAG AATCAGCCTT GAAACATTGA TGGAATAACT CTGTGGCATT ATTGCATTAT 3900 ATACCATTTA TCTGTATTAA CTTTGGAATG TACTCTGTTC AATGTTTAAT GCTGTGGTTG 3960 ATATTTCGAA AGCTGCTTTA AAAAAATACA TGCATCTCAG CGTTTTTTTG TTTTTAATTG 4020 TATTTAGTTA TGGCCTATAC ACTATTTGTG AGCAAAGGTG ATCGTTTTCT GTTTGAGATT 4080 TTTATCTCTT GATTCTTCAA AAGCATTCTG AGAAGGTGAG ATAAGCCCTG AGTCTCAGCT 4140 ACCTAAGAAA AACCTGGATG TCACTGGCCA CTGAGGAGCT TTGTrTCAAC CAAGTCATGT 4200 
GCATTTCCAC GTCAACAGAA TTGTTTATTG TGACAGTTAT ATCTGTTGTC CCTTTGACCT 4260 TGTTTCTTGA AGGTTTCCTC GTCCCTGGGC AATTCCGCAT TTAATTCATG GTATTCAGGA 4320 TTACATGCAT GTTTGGTTAA ACCCATGAGA TTCATTCAGT TAAAAATCCA GATGGCGAAT 4380 GACCAGCAGA TTCAAATCTA TGGTGGTTTG ACCTTTAGAG AGTTGCTTTA CGTGGCCTGT 4440 TTCAACACAG ACCCACCCAG AGCCCTCCTG CCCTCCTTCC GCGGGGGCTT TCTCATGGCT 4500 GTCCTTCAGG GTCTTCCTGA AATGCAGTGG TCGTTACGCT CCACCAAGAA AGCAGGAAAC 4560 CTGTGGTATG AAGCCAGACC TCCCCGGCGG GCCTCAGGGA ACAGAATGAT CAGACCTTTG 4620 AATGATTCTA ATTTTTAAGC AAAATATTAT TTTATGAAAG GTTTACATTG TCAAAGTGAT 4680 GAATATGGAA TATCCAATCC TGTGCTGCTA TCCTGCCAAA ATCATTTTAA TGGAGTCAGT 4740 TTGCAGTATG CTCCACGTGG TAAGATCCTC CAAGCTGCTT TAGAAGTAAC AATGAAGAAC 4800 GTGGACGTTT TTAATATAAA GCCTGTTTTG TCTTTTGTTG TTGTTCAAAC GGGATTCACA 4860 GAGTATTTGA AAAATGTATA TATATTAAGA GGTCACGGGG GCTAATTGCT AGCTGGCTGC 4920 CTTTTGCTGT GGGGTTTTGT TACCTGGTTT TAATAACAGT AAATGTGCCC AGCCTCTTGG 4980 CCCCAGAACT GTACAGTATT GTGGCTGCAC TTGCTCTAAG AGTAGTTGAT GTTGCATTTT 5040 CCTTATTGTT AAAAACATGT TAGAAGCAAT GAATGTATAT AAAAGC 5086 12) INFORMATION FOR SEQ ID NO:3: (i) SEQUENCE CilARACTERISTICS: (A) LENGTH: 911 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (il) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEO ID NO:3: TGATTGAAGA CACCCCCTCG TCCAAGAATG CAAAGCACAT CCAATAAAAT AGCTGGATTA 60 TAACTCCTCT TCTTTCTCTG GGGGCCGTGG GGTGGGAGCT GGGGCGAGAG GTGCCGTTGG 120 CCCCCGTTGC TTTTCCTCTG GGAAGGATGG CGCACGCTGG GAGAACGGGG TACGACAACC 180 GGGAGATAGT GATGAAGTAC ATCCATTATA AGCTGTCGCA GAGGGGCTAC GAGTGGGATG 240 CGGGAGATGT GGGCGCCGCG CCCCCGGGGG CCGCCCCCGC ACCGGGCATC TTCTCCTCCC 300 AGCCCGGGCA CACGCCCCAT CCAGCCGCAT CCCGCGACCC GGTCGCCAGG ACCTCGCCGC 360 TGCAGACCCC GGCTGCCCCC GGCGCCGCCG CGGGGCCTGC GCTCAGCCCG GTGCCACCTG 420 TGGTCCACCT GGCCCTCCGC CAAGCCGGCG ACGACTTCTC CCGCCGCTAC CGCGGCGACT 480 TCGCCGAGAT GTCCAGCCAG CTGCACCTGA CGCCCTTCAC CGCGCGGGGA CGCTTTGCCA 540 CGGTGGTGGA GGAGCTCTTC AGGGACGGGG TGAACTGGGG GAGGATTGTG GCCTTCTTTG 600 AGTTCGGTGG 
GGTCATGTGT GTGGAGAGCG TCAACCGGGA GATGTCGCCC CTGGTGGACA 660 ACATCGCCCT GTGGATGACT GAGTACCTGA ACCGGCACCT GCACACCTGG ATCCAGGATA 720 ACGGAGGCTG GGTAGGTGCA TCTGGTGATG TGAGTCTGGG CTGAGGCCAC AGGTCCGAGA 780 TCGGGGGTTG GAGTGCGGGT GGGCTCCTGG GCAATGGGAG GCTGTGGAGC CGGCGAAATA 840 AAATCAGAGT TGTTGCTTCC CGGCGTGTCC CTACCTCCTC CTCTGGACAA AGCGTTCACT 900 CCCAACCTGA C 911 (2) INFORMATION FOR SEQ ID NO:4: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 90 base pairs {B) TYPE: nucleic acid (C) STRANDEDNESS: double ID) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEO ID NO:4: ATGATGAGTC TCCGGGGCTC TATGGGTTTC TGAATGTCAT CGTCCACTCA GCCACTGGAT 60 TTAAGCAGAG TTCAAGTAAG TACTGGTTTG 90 (2) INFORMATION FOR SEQ ID NO:5: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 204 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: CCCTTTCTCT TCCAGAAGCC CTTCAGCGGC CAGTAGCATC TGACTTTGAG CCTCAGGGTC 60 TGAGTGAAGC CGCTCGTTGG AACTCCAAGG AAAACCTTCT CGCTGGACCC AGTGAAAATG 120 ACCCCAACCT TTTCGTTGCA CTGTATGATT TTGTGC.CCAG TGGAGATAAC ACTCTAAGCA 180 TAACTAAAGG TAAAAGGGTT GTGG 204 (2) INFORMATION FOR SEO ID NO:6: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 200 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: TTCCTTTCTT CTCAGGTGAA AAGCTCCGGG TCTTAGGCTA TAATCACAAT GGGGAATGGT 60 TTGAAGCCCA AACCAAAAAT GGCCAAGGCT GGGTCCCAAG CAACTACATC ACGCCAGTCA 120 ACAGTCTGGA GAAACACTCC TGGTACCATG GGCCTGTGTC CCGCAATGCC GCTGAGTATC 180 TGCTGAGCAG CGGGATCAAT 200 {2) INFORMATION FOR SEQ ID NO:7: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 3090 base pairs (fl) TYPE: nucleic acid (C) STRANDEDNESS: double ID) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: GAATTCCGGG CCCGGATAGC CGGCGGCGGC GGCGGCGGCG GCGGCGGCGG CGGCCGGGAG 60 AGGCCCCTCC TTCACGCCCT GCTTCTCTCC CTCGCTCGCA GTCGAGCCGA 
GCCGGCGGAC 120 CCGCCTGGGC TCCGACCCTG CCCAGGCCAT GGCCGGCAAC GTGAAGAAGA GCTCTGGGGC 180 CGGGGGCGGC ACGGGCTCCG GGGGCTCGGG TTCGGGTGGC CTGATTGGGC TCATGAAGGA 240 CGCCTTCCAG CCGCACCACC ACCACCACCA CCACCTCAGC CCCCACCCGC CGGGGACGGT 300 GGACAAGAAG ATGGTGGAGA AGTGCTGGAA GCTCATGGAC AAGGTGGTGC GGTTGTGTCA 360 GAACCCAAAG CTGGCGCTAA AGAATAGCCC ACCTTATATC TTAGACCTGC TACCAGATAC 420 CTACCAGCAT CTCCGTACTA TCTTGTCAAG ATATGAGGGG AAGATGGAGA CACTTGGAGA 480 AAATGAGTAT TTTAGGGTGT TTATGGAGAA TTTGATGAAG AAAACTAAGC AAACCATAAG 540 CCTCTTCAAG GAGGGAAAAG AAAGAATGTA TGAGGAGAAT TCTCAGCCTA GGCGAAACCT 600 AACCAAACTG TCCCTCATCT TCAGCCACAT GCTGGCAGAA CTAAAAGGAA TCTTTCCAAG 660 TSGACTCTTT CAGGGAGACA CATTTCGGAT TACTAAAGCA GATGCTGCGG AATTTTGGAG 720 AAAAGCTTTT GGGGAAAAGA CAATAGTCCC TTGGAAGAGC TTTCGACAGG CTCTACATGA 780 AGTGCATCCC ATCAGTTCTG GGCTGGAGGC CATGGCTCTG AAATCCACTA TTGATCTGAC 840 CTGCAATGAT TATATTTCGG TTTrrTGAATT TGACATCTTT ACCCGACTCT TTCAGCCCTG 900 GTCCTCTTTG CTCAGGAATT GGAACAGCCT TGCTGTAACT CATCCTGGCT ACATGGCTTT 960 TTTGACGTAT GACGAAGTGA AAGCTCGGCT CCAGAAATTC ATTCACAAAC CTGGCAGTTA 1020 TATCTTCCGG CTGAGCTGTA CTCGTCTGGG TCAGTGGGCT ATTGGGTATG TTACTGCTGA 1080 TGGGAACATT CTCCAGACAA TCCCTCACAA TAAACCTCTC TTCCAAGCAC TGATTGATGG 1140 CTTCAGGGAA GGCTTCTATT TGTTTCCTGA TGGACGAAAT CAGAATCCTG ATCTGACTGG 1200 CTTATGTGAA CCAACTCCCC AAGACCATAT CAAAGTGACC CAGGAACAAT ATGAATTATA 1260 CTGTGAGATG GGCTCCACAT TCCAACTATG TAAAATATGT GCTGAAAATG ATAAGGATGT 1320 AAAGATTGAG CCCTGTGGAC ACCTCATGTG CACATCCTGT CTTACATCCT GGCAGGAATC 1380 AGAAGGTCAG GGCTGTCCTT TCTGCCGATG TGAAATTAAA GGTACTGAAC CCATCGTGGT 1440 AGATCCGTTT GATCCTAGAG GGAGTGGCAG CCTGTTGAGG CAAGGAGCAG AGGGAGCTCC 1500 CTCCCCAAAT TATGATGATG ATGATGATGA ACGAGCTGAT GATACTCTCT TCATGATGAA 1560 GGAATTGGCT GGTGCCAAGG TGGAACGGCC GCCTTCTCCA TTCTCCATGG CCCCACAAGC 1620 TTCCCTTCCC CCGGTGCCAC CACGACTTGA CCTTCTGCCG CAGCGAGTAT GTGTTCCCTC 1680 AAGTGCTTCT GCTCTTGGAA CTGCTTCTAA GGCTGCTTCT GGCTCCCTTC ATAAAGACAA 1740 ACCATTGCCA GTACCTCCCA CACTTCGAGA TCTTCCACCA CCACCGCCTC CAGACCGGCC 1800 
ATATTCTGTT GGAGCAGAAT CCCGACCTCA AAGACGCCCC TTGCCTTGTA CACCAGGCGA 1860 CTGTCCCTCC AGAGACAAAC TGCCCCCTGT CCCCTCTAGC CGCCTTGGAG ACTCATGGCT 1920 GCCCCGGCCA ATCCCCAAAG TACCAGTATC TTGCCCCAAGT TCCAGTGATC CCTGGACAGG 1980 AAGAGAATTA ACCAACCGGC ACTCACTTCC ATTTTCATTG CCCTCACAAA TGGAGCCCAG 2040 ACCAGATGTG CCTAGGCTCG GAAGCACGTT CAGTCTGGAT ACCTCCATGA GTATGAATAG 2100 CAGCCCATTA GTAGGTCCAG AGTGTGACCA CCCCAAAATC AAACCTTCCT CATCTGCCAA 2160 TGCCATTTAT TCTCTGGCTG CCAGACCTCT TCCTGTGCCA AAACTGCCAC CTGGGGAGCA 2220 ATGTGAGGGT GAAGAGGACA CAGAGTACAT GACTCCCTCT TCCAGGCCTC TACGGCCTTT 2280 GGATACATCC CAGAGTTCAC GAGCATGTGA .rTGCGACCAG CAGATTGATA GCTGTACGTA 2340 TGAAGCAATG TATAATATTC AGTCCCAGGC GCCATCTATC ACCGAGAGCA GCACCTTTGG 2400 TGAAGGGAAT TTGGCCGCAG CCCATGCCAA CACTGGTCCC GAGGAGTCAG AAAATGAGGA 2460 TGATGGGTAT GATGTCCCAA AGCCACCTGT GCCGGCCGTG CTGGCCCGCC GAACTCTCTC 2520 AGATATCTCT AATGCCAGCT CCTCCTTTGG CTGGTTGTCT CTGGATGGTG ATCCTACAAC 2580 AAATGTCACT GAAGGTTCCC AAGTTCCCGA GAGGCCTCCA AAACCATTCC CGCGGAGAAT 2640 CAACTCTGAA CGGAAAGCTG GCAGCTGTCA GCAAGGTAGT GGTCCTGCCG CCTCTGCTGC 2700 CACCGCCTCA CCTCAGCTCT CCAGTGAGAT CGAGAACCTC ATGAGTCAGG GGTACTCCTA 2760 CCAGGACATC CAGAAAGCTT TGGTCATTGC CCAGAACAAC ATCGAGATGG CCAAAAACAT 2820 CCTCCGGGAA TTTGTTTCCA TTTCTTCTCC TGCCCATGTA GCTACCTAGC ACACCATCTC 2880 CCTGCTGCAG GTTTAGAGGA CCAGTGAGTT GGGAGTTATT ACTCAAGTGG CACCTAGAAG 2940 GGCAGGAGTT CCTTTGGTGA CTTCACAGTG AAGTCTTGCC CTCTCTGTGG GATATCACAT 3000 CAGTGGTTCC AAGATTTCAA AGTG(;TGAAA TGAAAATGGA GCAGCTAGTA TGTTTTATTA 3060
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
CCCGGGGGTC CTGGAAGCCA CAAGGTAAAC ACAACACATC CCCCTCCTTG ACTATCAATT 60 TTACTAGAGG ATGTGGTGGG AAAACCATTA TTTGATATTA AAACAAATAG GCTTGGGATG 120 GAGTAGGATG CAAGCTCCCA GGAAAGTTTA AGATAAAACC TGAGACTTAA AAGGGTGTTA 180 AGAGTGGCAG CCTAGGGAAT TTATCCCGGA CTCCGGGGGA GGGGGCAGAG TCACCAGCCT 240 CTGCATTTAG GGATTCTCCG AGGAAAAGTG TGAGAACGGC TGCAGGCAAC CCAGCTTCCC 300 GGCGCTAGGA GGGACGCACC CAGGCCTGCG CGAAGAGAGG
GAGAAAGTGA AGCTGGGAGT 360 TGCCACTCCC AGACTTGTTG GAATGCAGTT GGAGGC,GGCG AGCTGGGAGC GCGCTTGCTC 420 CCAATCACAG GAGMGGAGG AGGTGGAGGA GGAGGGCTGC TTGAGGAAGT ATAAGAATGA 480 AGTTGTGAAG CTGAGATTCC CCTCCATTGG GACCGGAGAA ACCAGGGAGC CCCCCCGGG 39
(2) INFORMATION FOR SEQ ID NO:9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1604 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GAATTCCGGG CGAGGGCCGG GCAGGAGGAG CGGGCGCGCG GCGGGCGAGG CTGGGACCCG 60 AGCGCGCTCA CTTCGCCGCA AAGTGCCAAC TTCCCCTGGA GTGCCGGGCG CGCACCGTCC 120 GGGCGCGGGG GAAAGAAAGG CAGCGGGAAT TTGAGATTTT TGGGAAGAAA GTCGGATTTC 180 CCCCGTCCCC TTCCCCCTGT TACTAATCCT CATTAAAAAG AAAAACAACA ATAACTGCAA 240 ACTTGCTACC ATCCCGTACG TCCCCCACTC CTGGCACCAT GAAGGCGGCC GTCGAT TCA 300 AGCCGACTCT CACCATCATC AAGACGGAAA AAGTCGATCT GGAGCTTTTC CCCTCCCCGG 360 ATATGGAATG TGCAGATGTC CCACTATTAA CTCCAAGCAG CAAAGAAATG ATGTCTCAAG 420 CATTAAAAGC TACTTTCAGT GGTTTCACTA AAGAACAGCA ACGACTGGGG ATCCCAAAAG 480 ACCCCCGGCA GTGGACAGAA ACCCATGTTC GGGACTGGGT GATGTGGGCT GTGAATGAAT 540 TCAGCCTGAA AGGTGTAGAC TTCCAGAAGT TCTGTATGAA TGGAGCAGCC CTCTGCGCCC 600 TGGGTAAAGA CTGCTTTCTC GAGCTGGCCC CAGACTITGT TGGGGACATC TTATGGGAAC 660 ATCTAGAGAT CCTGCAGAAA GAGGATGTGA AACCATATCA AGTTAATGGA GTCAACCCAG 720 CCTATCCAGA ATCCCGCTAT ACCTCGGATT ACTTCATTAG CTATGGTATT GAGCATGCCC 780 AGTGTGTTCC ACCATCGGAG TTCTCAGAGC CCAGCTTCAT CACAGAGTCC TATCAGACGC 840 TCCATCCCAT CAGCTCGGAA GAGCTCCTCT CCCTCAAGTA TGAGAATGAC TACCCCTCGG 900 TCATTCTCCG AGACCCTCTC CAGACAGACA CCTTGCAGAA TGACTACTTT GCTATCAAAC 960 AAGAAGTCGT CACCCCAGAC AACATGTGCA TGGGGAGGAC CAGTCGTGGT AAACTCGGGG 1020 GCCAGGACTC TTTTGAAAGC ATAGAGAGCT ACGATAGTTG TGATCGCCTC ACCCAGTCCT 1080 GGAGCAGCCA GTCATCTTTC AACAGCCTGC AGCGTGTTCC CTCCTATGAC AGCTTCGACT 1140 CAGAGGACTA TCCGGCTGCC CTGCCCAACC ACAAGCCCAA GGGCACCTTC AAGGACTATG 1200 TGCGGGACCG TGCTGACCTC AATAAGGACA AGCCTGTCAT TCCTGCTGCT GCCCTAGCTG 1260 GCTACACAGG CAGTGGACCA ATCCAGCTAT GGCAGTTTCT TCTGGAATTA
CTCACTGATA 1320 AATCCTGTCA GTCTTTTATC AGCTGGACAG GAGATGGCTG GGAATTCAAA CTTTCTGACC 1380 CAGATGAGGT GGCCAGGAGA TGGGGAAAGA GGAAAAACAA ACCTAAGATG AATTATGAGA 1440 AACTGAGCCG TGGCCTACGC TACTATTACG ACAAAAACAT CATCCACAAG AC GCGGGGA 1500 AACGCTACGT GTACCGCTTT GTGTGTGACC TGCAGAGCCT GCTGGGGTAC ACCCCTGAGG 1560 AGCTGCACGC CATGCTGGAC GTCAAGCCAG ATGCCGACGA GTGA 1604
(2) INFORMATION FOR SEQ ID NO:10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2534 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GAACTTGTTT GAGCCGTAAG CCCGAGCCTA GCGTCGCACG CTGGGCGACT CCCCTCAGGC 60 TCTCAGGCCG GCGCCTTCGG GGGACCACGT AGCGCCCCAG CGGTGGCGGC TGCGCCCGGC 120 TCAGGGCGGT GCCGCGGCCA CTGGCTCCTC CTCGGGGTGG GGCGGGCCGC GCCGGGCGAG 180 GGCGGGAGGA GGAGCAGCGC AGCGGCGTGA GGAGCTGCCG CGCGAGGAGC GCGTCGCGTC 240 CGCACTTCTC CTGCCCGAGA GACTGAGCCG CGCTGGCAGC TCGCGTCGAG TCGGACTGCC 300 CTAGCCGCAT CCCGCGGCGC CCGGTCGGGT CCCGGGCACC AGGCAACACC TAGGCCGTTC 360 CCTTCAGACA GCCCCGGGCC AGCGGCCCCC TCGGGAAATG TCCAGCGGCC GCAGAAGGGG 420 CAGCGCCCCC TGGCACAGCT TCTCCCGGTT CTTCGCTCCC CGAAGTCCTT CCCGGGACAA 480 GGAAGAGGAA GAGGAGGAGA GGCCGGGGAC GAGCCCGCCT CCAGCTCCAG GCCGGTCCGC 540 TGCCAGCCAC GTACTAAATG AACTGATACA GACTGAGAGA GTTTATGTTC GAGAACTGTA 600 TACTGTTTTG TTGGGTTATA GAGCGGAGAT GGATAATCCA GAGATGTITG ATCTTATGCC 660 ACCTCTCCTG AGAAATAAAA AGGACATTCT CTTTGGAAAC ATGGCAGAAA TATATGAATT 720 CCATAACGAC ATTTTCTTGA GCAGCCTGGA AAATTGTGCT CATGCTCCAG AAAGAGTGGG 780 ACCTTGTTTC CTGGAAAGGA AGGATGATTT TCAGATGTAT GCAAAATATT GTCAGAATAA 840 GCCCAGATCA GAAACAATTT GGAGGAAGTA TTCAGAATGC GCATTTTTCC AGGAATGTCA 900 AAGAAAGTTA AAACACAGAC TTAGACTGGA TTCCTATTTA CTCAAACCAG TGCAACGAAT 960 CACTAAATAT CAGTTATTGT TGAAGGAGCT ATTAAAATAT AGCAAAGACT GTGAAGGATC 1020 TGCTCTGTTG AAGAAGGCAC TCGATGCAAT GCTGGATTTA CTGAAGTCAG TTAATGATTC 1080 TATGCATCAG ATTGCAATAA ATGGCTATAT TGGAAACTTA AATGMCTGG GCAAGATGAT 1140 AATGCAAGGT GGATTCAGCG TTTGGATAGG GCACAAGAAA GGTGCTACAA
AAATGAAGGA 1200 TTTGGCTAGA TTCAAACCAA TGCAGCGACA CCTTTTCTTG TATGAAAAAG CCATTGTTTT 1260 TTGCAAAAGG CGTGTTGAAA GTGGAGAAGG CTCTGACAGA TACCCGTCAT ACAGTTTTAA 1320 ACACTGTTGG AAAATGGATG AAGTTGGAAT CACTGAATAT GTAAAAGGTG ATAACCGCAA 1380 GTTTGAAATC TGGTATGGTG AAAAGGAAGA AGTTTATATT GTCCAGGCTT CTAATGTAGA 1440 TGTGAAGATG ACGTGGCTAA AAGAAATAAG AAATATTTTG TTGAAGCAGC AGGAACTTTT 1500 GACAGTTAAA AAAAGAAAGC AACAGGATCA ATTAACAGAA CGGGATAAGT TTCAGATTTC 1560 TCTTCAGCAG AATGATGAAA AGCAACAGGG AGCTTTTATA AGTACTGAGG AAACTGAATT 1620 GGAACACACC AGCACTGTGG TGGAGGTCTG TGAGGCAATT GCGTCAGTTC AGGCAGAAGC 1680 AAATACAGTT TGGACTGAGG CATCACAATC TGTAGAAATC TCTGAAGAAC CTGCGGAATG 1740 GTCAAGCAAC TATTTCTACC CCACTTATGA TGAAAATGAA GAAGAAAATA GGCCCCTCAT 1800 GAGACCTGTG TCGGAGATGG CTCTCCTATA TTGATGAAGC TACTATGTCA AATGGCAAGT 1860 AGCTCTTTCC TGCCTGCTTC TCAGCTCATT TGGAAAAATA CTGCGCAAAA GACATTGAGC 1920 TCAAATGATG CAGATGTTGT TTTCAGGTTA ATGGACACGC AAAGAAACCA CAGCACATAC 1980 TTCTTTTCTT TCATTTAATA AAGCTTTTAA TTATGGTACG CTGTCTTTTT AAAATCATGT 2040 ATTTAATGTG TCAGATATTG TGCTTGAAAG ATTCTCATCT CAGAATACTT TTGGACTTGA 2100 AAATTATTTC TTCTCTACTT TGTAACCAAA TGCAATCGGT GTGCCTTGGA TTATTTAGTT 2160 TATTAATGAA TTAAGTCAAA ATTACGGCTG CAAAATGGCT AAGGTCAAGT AAAGCACAAC 2220 ATTATGATTT AATATGCTTT TGTTGAAACC ACAGCTTTTG TGCCCATTGT TTTAACTTGT 2280 GTGAAACAAT ACAAAGCCCA GAAATTCTTT TCGGGGCATG AGTAAATTTT GTTCAGGGCT 2340 ACTGTCTGTA TGTGCCCAGA TAAAATTTTC ATGAGAGTAG TTTACAAAAG CCGTATTTAA 2400 AAGTTAATAT TTTCACACTT TTTTTCTGGA TTTCTGCTTA TAATTAATGT AACTTAAATT 2460 AGTTGTGCTC TGCTATTTTC TGTATATTTC ATGTTGTAAT TCTTTTTTTC AAATAAAAAT 2520 TAATTCTTCA GGTT 2534
(2) INFORMATION FOR SEQ ID NO:11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1219 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GGGAGCGGGC GAGTAGGAGG GGGCGCCGGG CTATATATAT AGCGGCCTCG GCCTCGGGCG 60 GGCCTGGCGC TCAGGGAGGC GCGCACTGCT CCTCAGAGTC CCAGCTCCAG CCGCGCGCTT 120 TCCGCCCGGC
TCGCCGCTCC ATGCAGCCGG GGTAGAGCCC GGCGCCCGGG GGCCCCGTCG 180 CTTGCCTCCC GCACCTCCTC GGTTGCGCAC TCCCGCCCGA GGTCGGCCGT GCGCTCCCGC 240 GGGACGCCAC AGGCGCAGCT CTGCCCCCCA GCTTCCCGGG CGCACTGACC GCCTGACCGA 300 CGCACGCCCT CGGGCCGGGA TGTCGGGGCC CGGGACGGCC GCGGTAGCGC TGCTCCCGGC 360 GGTCCTGCTG GCCTTGCTGG CGCCCTGGGC GGGCCGAGGG GGCGCCGCCG CACCCACTGC 420 ACCCAACGGC ACGCTGGAGG CCGAGCTGGA GCGCCGCTGG GAGAGCCTGG TGGCGCTCTC 480 GTTGGCGCGC CTGCCGGTGG CAGCGCAGCC CAAGGAGGCG GCCGTCCAGA GCGGCGCCGG 540 CGACTACCTG CTGGGCATCA AGCGGCTGCG GCGGCTCTAC TGCAACGTGG GCATCGGCTT 600 CCACCTCCAG GCGCTCCCCG ACGGCCGCAT CGGCGGCGCG CACGCGGACA CCCGCGACAG 660 CCTGCTGGAG CTCTCGCCCG TGGAGCGGGG CGTGGTGAGC ATCTTCGGCG TGGCCAGCCG 720 GTTCTTCGTG GCCATGAGCA GCAAGGGCAA GCTCTATGGC TCGCCCTTCT TCACCGATGA 780 GTGCACGTTC AAGGAGATTC TCCTTCCCAA CAACTACAAC GCCTACGAGT CCTACAAGTA 840 CCCCGGCATG TTCATCGCCC TGAGCAAGAA TGGGAAGACC AAGAAGGGGA ACCGAGTGTC 900 GCCCACCATG AAGGTCACCC ACTTCCTCCC CAGGCTGTGA CCCTCCAGAG GACCCTTGCC 960 TCAGCCTCGG GAAGCCCCTG GGAGGGCAGT GCGAGGGTCA CCTTGGTGCA CTTTCTTCGG 1020 ATGAAGAGTT TAATGCAAGA GTAGGTGTAA GATATTTAAA TTAATTATTT AAATGTGTAT 1080 ATATTGCCAC CAAATTATTT ATAGTTCTGC GGGTGTGTTT TTTAATTTTC TGGGGGGAAA 1140 AAAAGACAAA ACAAAAAACC AACTCTGACT TTTCTGGTGC AACAGTGGAG AATCTTACCA 1200 TTGGATTTCT TTAACTTGT 1219
(2) INFORMATION FOR SEQ ID NO:12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 139 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
GTTCTGTTCT GTGCCTACAG TGAAGGTGAC TGGTGGGAGG CTCGGTCTCT CAGCTCCGGA 60 AAAACTGGCT GCATTCCCAG CAACTACGTG GCCCCTGTTG ACTCAATCCA AGCTGAAGAG 120 TAAGTAGGGA TTGGGGCAA 139
(2) INFORMATION FOR SEQ ID NO:13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 144 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
TTGCCTGCCT TTCCCAACAG GTGGTACTTT GGAAAGATTG GGAGAAAGGA TGCAGAGAGG 60 CAGCTGCTTT
CACCAGGCM CCCCCAGGGG GCCTTTCTCA TTCGGGAAAG CGAGACCACC 120 AAAGGTAGGG GTGGTGCCAC CCCC 144
(2) INFORMATION FOR SEQ ID NO:14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 190 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AAAAGTGATC CTCTCCACAG GTGCCTACTC CCTGTCCATC CGGGACTGGG ATCAGACCAG 60 AGGCGATCAT GTGAAGCATT ACAAGATCCG CAAACTGGAC ATGGGCGGTT ACTACATCAC 120 CACACGGGTT CAGTTCAACT CGGTGCAGGA GCTGGTGCAG CACTACATGG GTGAGGGCAG 180 GGGCCTCAGA 190
(2) INFORMATION FOR SEQ ID NO:15: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 196 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CTTCATGACC CCTCCCCTAG AGGTGAATGA CGGGCTGTGC AACCTGCTCA TCGCGCCCTG 60 CACCATCATG AAGCCGCAGA CGCTGGGCCT GGCCAAGGAC GCCTGGGAGA TCAGCCGCAG 120 CTCCATCACG CTGGAGCGCC GGCTGGGCAC CGGCTGCTTC GGGGATGTGT GGCTGGGTAC 180 GGAGCTCCCG GGGGCC 196
(2) INFORMATION FOR SEQ ID NO:16: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 212 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
ACAAGACAGC CTCCGAGCAG GCACGTGGAA CGGCAGCACT AAGGTGGCGG TGAAGACGCT 60 GAAGCCGGGC ACCATGTCCC CGAAGGCCTT CCTGGAGGAG GCGCAGGTCA TGAAGCTGCT 120 GCGGCACGAC AAGCTGGTGC AGCTGTACGC CGTGGTGTCG GAGGAGCCCA TCTACATCGT 180 GACCGAGTTC ATGTGTCACG GTCAGGAGGC GG 212
(2) INFORMATION FOR SEQ ID NO:17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 117 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
ACTTTCTGGC TTCTTCCCAG GCAGCTTGCT GGATITTCTC AAGAACCCAG AGGGCCAGGA 60 TTTGAGGCTG CCCCAATTGG TGGACATGGC AGCCCAGGTA ACTGGGCCAG CAGCCTT 117
(2) INFORMATION FOR SEQ ID NO:18: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 219 base pairs (B) TYPE: nucleic acid (C)
STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
GAGCTCCCAT CTCTCCACAC TATGGTCCCC CAGGTAGCTG AGGGCATGGC CTACATGGAA 60 CGCATGAACT ACATTCACCG CGACCTGAGG GCAGCCAACA TCCTGGTTGG GGAGCGGCTG 120 GCGTGCAAGA TCGCAGACTT TGGCTTGGCG CGTCTCATCA AGGACGATGA GTACAACCCC 180 TGCCAAGGTG CCCTGCTTCA CCCCACCTTC CAAGAGCTC 219
(2) INFORMATION FOR SEQ ID NO:19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 35100 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
AA TATATATTTT AAGAATTATG CAGAACAATT AGATTCTTTG TAAAAATAAA TAAACAATGT 780 AAGTAACGCA CAAAAGATAG TTTTATAACC AGACTGCTGG GATCCAAATC CTATCTCCAC 840 CATTTGGTAG CTGTTTGACT ATGGACAAGC TTAAGGCACT TGATCTCTCT GAGCTITAGT 900 TTTCCCATCT GTGAAATGAG AATGACAATA GTACTTACCT ACATAAAGTT TTGCAGTACT 960 AAAGGAGACA GTGAATGTAA AAGGTTTGGC TAGTAAATGT CCTGTAAAAG GAAGCTTATT 1020 GCCAATATTA TCAGGCTCTC CCAGACCAAC CTGTATACAG GAAGAAAACA AACTCCGTTT 1080 CTCCTATAGT CTCACAACAC AAAATACTTC TGACCCCAGA TGTAGAGGAT GGGGCATATT 1140 TCCCCATACA CCAAGCAATC AACCAATTGT TCAGATTCTG CAGCAGACAC GAATCTGGTG 1200 CCCTCCGATT CAATTTGAAC ACTATATTTA CCTAGAGATA ACGTCAGATC TCACAGCTTG 1260 AAGGCTTGAG CCAGGAGTTT GAGGCTGCAG TGAGCTATGA TCGAGCCACA GAGCTCCAGC 1320 CTGGGCAACA GAGTGAAACT GCGTCTCTAA AATAATAATA ATAAATTTTT AAAAGATATG 1380 CATTACTTTG GAGATTCCAA GGATTTTAGG AGTTGTAAGC CAGGACATCA GGGTAAAGAA 1440 AAAATATATA TGTCACAATA TCATGCAACC TAACTTCTCT TTGGGATCTG CCAGAGCCAC 1500 CTGATCACTC TGAAGACCCT CATTTGTGCT ACTGACTAAC GGTCTGGCTG CTCTTGGACA 1560 TGTCTCTTCT CCCAAGACCC CTTGAAGATG GCTTTAGAAG GGCCCCAAAC TTAGCTAGCT 1620 CCCCCCAAGC TCAGGCTGGC CCTGCCCCAG ACTGCGACCC CTCCCTCTTG GGITCAAGGC 1680 TTTGTTTTCT TCTTAAAGAC CCAAGATTTC CAAACTCTGT GGTTGCCTTG CCTAGCTAAA 1740 AGGGGAAGAA GAGGATCAGC CCAAGGAGGA GGAAGAGGAA AACAAGACAA ACAGCCAGTG 1800 CAGAGGAGAG GAACGTGTGT CCAGTGTCCC GATCCCTGCG GAGCTAGTAG CTGAGAGCTC 1860 TGTGCCCTGG GCACCTTGCA GCCCTGCACC
TGCCTGCCAC TTCCCCACCG AGGCCATGGG 1920 CCCAGGAGTT CTGCTGCTCC TGCTGGTGGC CACAGCTTGG CATGGTAAGA GCAGAACGGG 1980 GGGTGGGGGA CTTTGTTGGG GTGTGATGGA GAAGACCCCT GTGAAAGGAT TCAGTCCTTG 2040 CCCCTCACTG GGTGTCCTCA GGCTGTTTTA GTCTCCCCAA CACTGGACTG CAGGCTTGTG 2100 GGTATCTGCT TTGGAGAGGT AGTGGGGTGA AAAGAGATGG GTGTGGTGGA ACTGGTCCAC 2160 CTGGTGCTGT GGATCTGTCC CAGCTCTGCC AGCGACTCAC TGTGTGTCCT GAGCAAGCCT 2220 CTGATACTCT TGAGGCTTCA GTGTCCACTT CTATTCAATT GCAGGTGTTG GGGGCAGGGG 2280 GACAGTGATA GACTAGACCA GAGCAGTGCT TTTCATACTT TCCTGTGCAT ACAAGTTACC 2340 TGAGGATTTT GTTACAATGC AGATTCAGAC TCAGTCC.CTC TCAGGTGCGA CCTGAGATTC 2400 TGTATATCCA ACACACTCCT GGGAGATGTG AGATGCCGGC ACTGCTGGTC CAGACCTACA 2460 CTGAGTTGGG AGGACCTGGA GAGCTCCTGA TGGCTCTGGC AGCTCTGCCA GCCTGTGATT 2520 CGATGATTCT ATGCAAGATC TGATTTGGAA GGGCCTGATA GGGGTGGTGG TTCTTCCTTG 2580 GGTGGCTTGT GTAAGGGGTC AGAGGGGAGA GACAAGAGGT TGGCCTCTCT GGCCCAGGGC 2640 TCAGGAGAGG GGAATTCGGG GTGAAATAGG TATAGGGCTA GAGGAGGGAT TGGGAAGAGG 2700 CCAGTGAGGG TCTCCTGGAC CAGAGCCCTC CCAGACACAG GCTGCCAAGT CTCAGGAGGT 2760 CCCCAGGCTG TAGCAGTTCT GCAGAATTTC CATCTGGGAG GGAACATGAC TAGAGGTGAG 2820 GGGCTGCTGT GCTTGGCTTG TTGGCCCAAC AAACACATTT CTATTGCCTG CTTATTCAAA 2880 GGGACCTTGG GGGAGGATGG GGATTGAAGG GGAGAAAGGA CAGCCTCATA CTGGCCTCTT 2940 CACAGAAGGA CCCTAAGGCC GTGGCGCTTC TGGTCCCTGA TGAGGAGGAG ATGGCCCACT 3000 GACCATCCTT CTCTGGCCCA GGCAATCACA CTGAGCTTGA GTATTTGGGT TTITIITaTT 3060 TTTTTCCTGA GACAGAGTCT CTCTCTGTCA CCAGGCTGGA GTACAGTGGC ACAATCTCGG 3120 CTCACTGCAA CCTCCACCTC CCGGGTTCAA GTGTTTCTCC TGTCTCAGCC TCCCAAGCTG 3180 GGATTACAGG CATACACCAT CATGACTGGC TAATTTTTGT ATTTTAGTA GAGATGGGAT 3240 TTCACCATGT TGGCCAAGCT GGTCTCGAAC TCCTGACCTC AGGTGATCCA CCTGCCTTGG 3300 CCTCCCAAAG TGTTGGGATT ACTGGTGTGA GTCACGGCGC CCGGCCTGGA CTTCTTATTT 3360 TGCAATGTAA CTTACATGCA GTAGAAAGCA CAGGTTCTTA AGTTCAATGA GGTCTGACAA 3420 ATGCACACAC AGTGTACCCG CCACCCCCTT CATCTCAGAG AGTCCCACAG GTTTGATTTC 3480 ACTGCCTTGT CCTATCCTTA CACCCACAAC CTGCCTGTGG GGCAAAAACG GAAAAGTATC 3540 TGAGCCAGGT CTCAATTTAA TTTTATTTTT TTTATTGAGA 
TGGAGTCTTG TGGCCAGGCA 3600 TGGTGGCTCA CACCTGTAAT CCCAGCACTC TGGGAGGCCG AGGCGGGTGG ATCACAAGAT 3660 CAGGAGTTTC AGACCAGCCT CGCCAATATG GTGAAGCCCC CTCTCTACTA AAAAATACAA 3720 AAATTAGCCG GGTGTGGTGG TGGGTTCCTG TAGTTCCAGC TACTCAGGAG GCTGAGGTGG 3780 GAGAATCACT TGAACCCGGG AGGCAGAGGT TGCAGTGAGC TGAGATCATG CCACTGCACT 3840 CCAGCCTAGG CGACAGAGCA AGACTCCATC TCCTTCCTTT CTTTCTTCCT TCCTTCCTTC 3900 CTTCCTTCCT TCCTTCCTTC CTTCCTTCCT TCCTTCCTTC CTTCCTTCTT TCTTTCTTTC 3960 TTTCTTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTTTTCT ATCTTTTTGA GACCGAGTCT 4020 TGCTCTGTTG CCCAGGCTGG AGTGCAATGG CATGGATCTC GGCTCACTGC AACCTCCGCC 4080 TCCGGGGTTC AAGCAATTCT GCCACTCCTG AGTAGCTGTG ATTACTGGTG CCTGCCACCA 4140 CACCCAGCTA ATTTTTTTAT TTTTGGTAGA GACAGGGTIT TATCATGCTG GCCATGCTGG 4200 TCTCGAACTC CTGAACTCAA GCGATCCCCC TGCCTTGGCG TCCCAAAGTG CTGGGATTAC 4260 AGGCATGAGC CACTGTGCCT GGCTTCAATC AATTTAGAAG TTTATTTTGC CAAGGTTAAG 4320 GACATGCTGG CGAGAAAAAA ACATGGAGTC ACAAAAACAT TCTGTGGTCT GTGCCATTCT 4380 GGATGAATTC GAGGGCTTTA ATATTTAAAG GGGAAAGTGG GCTGGAGGGG AAAAGGGGAG 4440 GTTGTGGTAA TCCACATGTT GCAAAAGAAA AGCAGCAGGT AGGGGAACAG TCAATTATCT 4500 CGGTTCAGTA AATTGGCTCT TTACATAGGG AAAGTGAACA TAGAGGAGCT GCCTGTGGGA 4560 TATTTTACCT TTTATCTGTC GCTATCTGCT TAGGAATAAA AGGCAAGGCA GCTTCTTGCA 4620 TGACTCAGTT TCCAGCTTGA TTTTTCCTTT TGGCAGAGTG AATTAGGGTC CCAAGTTTTT 4680 ATTTTCCCTT CACAGGGGCA TGGTGTGTGG GAGGGGGGCC AGATGGTTTT CCAGGGTCCA 4740 GTCCCAAGAG AAAGAAGAGA TGGGGAGGCT GGAAACCTAA GTTTTCAGCC CAACAGACCA 4800 ATGATGAGTG GATGAGGGGC CACTGTGAGG AGACTGGGGA TGGTATTGGA GGACCCTAGA 4860 GAGAGAGGGG GGCTCTCTCT TCATTACTGC GATGAGATCC TGGGCTGAAG AGGGGCTGTG 4920 TCCAGCCTTA GTGTGCAGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG 4980 TGTGTGTGTT GGGAGAGAAG AGTAGAGATT GGGGCACATT CTGGAAGTGA TGAGGGAGGG 5040 GCTTCCAGGC AAGTGGGAGC TAGTGGAGAG GTGTGC, GGCA TGGGGAGAAT TGGGGAGTGG 5100 AGATGAGAGG GGGGAAGAAT GGACAGGCAC AGAAGGGGAC CTCAGTTAAT GTTCATAAGC 5160 CCATGCCCCC ACCCCGAGGA GGATGGGGGC CAAGCCGGCT TCCTTCCCTG CTAGCCAAGC 5220 CAGCAGGGGA AGTTGGCTGC GGAAGTTGCG
GGTATCAGCC TTATCCTGCG TGAATACCTG 5280 GGACAATAGG ATAGGACAAA ATAGGGCAGA CACCGCTCCC TGACCACATT TCCTGGAGGC 5340 CAAGGCAGGG TCTAGAGAGA CAGGCTGGGG GAAGGGATGG GAGAAGCCCA CTGTAAGGTG 5400 TGAGGCAGGT GTAAAAAAGG AACAAATGGA ATCACAGAAT CCAAGGTTAA AATCTTGAGC 5460 GATCAGAGTT GGCCCAGAAG GGACATTAGA AATGTAGCAA TTAAAGCAGG TGCCCAGGGC 5520 AGGAGTAGTT CTATACATCA TCTCACTCAA CCTTCAGCTG AAGTTTTTGG GGTGGGAGCT 5580 GGGATTATTC CCATCAGACA GAAC.AATAGC CTGAGGCTCA AAGAGGTTAA GAAACTAACC 5640 CAGCTGGTAA GGGAAGAACC AAGATCCAAA CCCAAGTTGG TGTGAGCCCA CACTCCAAGC 5700 TGTTTCCTGC TATAAAACCC CGGCCTGGGG GCCCTAGATT GCTGCAGCAG TGATAGGGCA 5760 GCCCCAGCTC TGTTGAGATT TGCTAAAAAG GCTGCTAGAA ATGACACTTG TCCCTCTTCC 5820 TGGCAGTTGC ACTGCATAGG AGGGCTACAA CCCCAGGTGG CAGGCTTGGC AGTATTCACA 5880 ATTCACTCAA TCCCGTTTGC TGATCAGAGT CTTGGGGAGA AGGGATGCAC ATTCTGATGA 5940 ATACAAAATC AAAACGTGAA TTAAGCCATC CTGAAAGGAC TTGAGAGAGG AAACCTTTCC 6000 AATTCTGGGC TCTATGGTGG GGCAGGGGGA ATTTCCATTT CAAGGGGGTT TGCAGAGAAC 6060 AATGGAGATA CCCTGAATTC ACCGAAAGCC CTCGGGGTGG CCTGTCATTG TGCCCCCATC 6120 ACTGGGAAGA GGAAGGGCCA GAGCTGAGGA GTTGGATGGC CAGGGTCAGC CAGTGGGTCA 6180 GCGTCAGAGC CCAGCCTCAC AGCTGCCCCG CAAGTGGCAC TCCCTCTCCC TGCCTGGAGA 6240 GAGGAGAGTG GCTAGGAGGC TGGGGAAGCA GAAGTGAGAA CATCCCTGTA GAAGGGCCAC 6300 AGGCTGAGCG GAAACCGGGG GCTGAGCCTG ACGCCAACAA TGTGTTTCCG CCCACACAGG 6360 CTGGGGGGCG CCTGGCAGCC CCTCGGAGGC TTGAATCAGC TCTCACTTCC CTCcTTrGCC 6420 CCTATTTTAG GCCCTGGAAA AATGCTGACG CTGCAGAGGC AACGGGCCTT CTTCCCGGAC 6480 AGCCTGATAG GGGTTTCAAG TTCTCTTTTC TCCTTCAAC.A AAATTTTCCT TAAAAGAGAT 6540 TGGCTTCCCA GTAAACACAG ATGTGTGGGG GTGCCGGGGT GAGCTGCTGG GTGTAGACTA 6600 GGTAATAAAC ATAGTGACTA ACTCTTACTG AGCCATTTAT TTGGTGACAG GTGATGTTCT 6660 AAAGTCTTCC CATGCATTTA AAATGCCTAA CATACCAATG AGTGGGTACG ATGATTGTCC 6720 CTGTTTTATA GGTGGGGAAA CTGAGGCATG GCACCTCCCC ATCCCACTGT GCTGCAGACC 6780 AGATGTCCAT TGGTGGGAGC GGGCACACCA GGAGATTCTT GGGACCTCTC TAACTCTGCT 6840 GGGCTAAGAT CCTACATCTC TTTTTTTTC TTTCCCATAT GAATTAAGCT GAGGACTTGG 6900 CCGTGAACAT TCCATTCATT TGTTTCTTCA TTCGGTGGTA 
GAAATACATC CACTTTGTAC 6960 ACAGGGTTAA AAGAGTCCAT TCCTGGGGAG TAGAAAGATG GCATCACAGC AGGGAAGACT 7020 GAGGCAGGAG GCTGAGGACC CCAGGGGGAC AGAGGCCTGG GTGAGAGGCT GAGCAAGCTG 7080 CAAGCCCCCT TTCTCAGAGG AGGGACCTCC TGGACATCAG AGACATCAGT CTGTCCCTGA 7140 GCAGGTTGAG GGTTAGGAGC TGAGCAAATG ACCAGGGGGC AGGGGCTCGT TCAAGGTGGT 7200 CCCTTGATGG CACAGCACCA TCCCTGCCAA GCTACCACCC ATCTCAGAGT CAGGACGGCC 7260 CAAGGGGCGC ATCCTAGACC TCACTTCTGT CTGCTGTCCC TCTCTCCCAC CAGGTCAGGG 7320 AATCCCAGTG ATAGAGCCCA GTGTCCCCGA GCTGGTCGTG AAGCCAGGAG CAACGGTGAC 7380 CTTGCGATGT GTGGGCAATG GCAGCGTGGA ATGGGATGGC CCCCCATCAC CTCACTGGAC 7440 CCTGTACTCT GATGGCTCCA GCAGCATCCT CAGCACCAAC AACGCTACCT TCCAAAACAC 7500 GGGGACCTAT CGCTGCACTG AGCCTGGAGA CCCCCTGGGA GGCAGCGCCG CCATCCACCT 7560 CTATGTCAAA GGTGAGGAGT CTGAGCCTCC TCCCAAGAGG CCTGACCCGG CAGGCCCCAC 7620 TACAATGGGC CCTAAAATTA ACAATCGTAA CAATTCAGCT CTGCATTTAC TGAGTGCTGG 7680 CTATGAGCAA GGACCTGGAA GAGCTGCTAA TGTAATGCAG TCCTCACAAC AACCCTGCAA 7740 GTCGGGTCTA TGATGATGCA TTTTCTAGAA GTGCAGGGAG GTTATCCAAG GTCACACAGC 7800 CTCACATAGT GGGACTAGAC TGGAGCCCAG GTGCGCCTGA CTCTGGAGCC ACCACGCTGA 7860 AGCATCCGCT GAACTGTCCT GGCGTGGTGT GACCTCAGAT GAATGATCAG CCTCTCTGAG 7920 CTTCCTTGTC ACCTATGTCC AGGTACTCCT TGGCCCAGTG GAGGGAGGGC AGTTGTAACC 7980 CTGTGCCCTC CTCTACTCTA GACCCTGCCC GGCCCTGGAA CGTGCTAGCA CAGGAGGTGG 8040 TCGTGTTCGA GGACCAGGAC GCACTACTGC CCTGTCTGCT CACAGACCCG GTGCTGGAAG 8100 CAGGCGTCTC GCTGGTGCGT GTGCGTGGCC GGCCCCTCAT GCGCCACACC AACTACTCCT 8160 TCTCGCCCTG GCATGGCTTC ACCATCCACA GGGCCAAGTT CATTCAGAGC CAGGACTATC 8220 AATGCAGTGC CCTGATGGGT GGCAGGAAGG TGATGTCCAT CAGCATCCGG CTGAAAGTGC 8280 AGAAAGGTGC GTGGGGCATG GGGACCGGCA GCCAGGCCTG AAGAGTGGGG ACAGAGAGCC 8340 GGCGGCCACA TGGGTGGTGA CTGGGGACTG GGTGTGATGG GGGGCAGTGG GATGTCCTCT 8400 TTCTTTCACT TCTTCCCCTC AATGGTTCCA CGATCATCTA TGGGGCAGGA CTGACAAGGT 8460 GTCGGGGCAG GGAGACAAAC CACATGTGAG CAAATAACTC AGTGGGCAAG GTCATCTCAG 8520 GTCATTGGAC ATGCTACAAA AATAAACATT CACATG(; TA GCTGAATAAG GAGTGTGTAG 8580 GGCGGGGAGC CTCACTGAGA AGGAAACACT TTATTAGAGC GGAAATCTGA 
ATGACATGAA 8640 GAAGGTGGCT GTGCAAAGAT CTGCTTCAGC AGGGGGACAG TGAGTACCAA GTGGTGAGGT 8700
GGGGACAGGC TCTGAATGTT CTAGGTATGG AAAGAGGACG GAAGCTCAGC CTCAGACATG 8760 GATTTCCCAC TGGGGGCCTG CCTAAGGCCA AGTGCTGG(; C ATGTGTAGGA GGGATGCTGA 8820 GCCAAGAGGC AGGGAGGAGA TGGTGGGTGC GTGTGATGGC TCTCGCGGTG GCCAGGTAAC 8880 AGTGGAGGTG GAGTCTCACC CTGCTGGGAT GGCAGGCAGG ATTCTGGTTT CTGGGAGGAC 8940 TGGTGAGAGC AAGCAGGACC CCAGCCTGAG GACCTGGGCT TGAGACAGCA ATCAGTCCCT 9000 GTAACAAGGG CCAGGGTCAG AGTGAAGCAG CTAGCCCAAT GCCACTGGGA TCTGAAGCCA 9060 CTAAACCTGC CTAGGGGGTC AAAGGACCCC AGCTGTGTGG GCAGAGGAGG CCATTAGGGC 912.0 TCTTTCCTGG CATTTCATCC TGCAGAGCCC TGGGCTGGCC AAGAGCCAAA GGTCCTGGGC 9180 CCTAGTTCTG CCTTGACCCC CCCTCAGGGA CCTTGGGTGA GTCCTTTCAT GTCCCTGGGC 9240 CTTAGGAATC TGGATTAGAT TATC m CAA CAGCAGCAAT GGGCATAAAT ATGAATTCAA 9300 GGCCTACTGT GCATCAGGCA TCTTGCTGGC TGCTGGAATA TTCCTGTCAC GGATITGACA 9360 TTCGACTAGA GTCTAACTAT TAAATAGAAA GTAAATACAA ATGTGATGAG CAAGAAACCA 9420 AGCTGGGGAG TGGCGGGCAT GGAGGTGCTG GGGAGGCTAA TTCATATCAG CTGGTCACAG 9480 AAGCCTTGCT GAGGAATTTT TGAGCTAAAG ATCTGAAGGA TGAGAACAGC CTCCCATITG 9540 AAGTGTGGGA GGAAGGCAT TCCAGGAGGG AAAGGTGGGT GCAAAGGCCC TGTGGTAGGA 9600 AAGAGGTCCA GCGGGCTGCA GTGCAGTGAA CAAGGGGTGG GGTTATCAGG GCGGTCAGAA 9660 ACAGGTTGGG CTGTGGAAGG ACTTTGACTT CTTTTCTGAG AGTAATGGGA AGCCCCAAAT 9720 GTTTACAGAG GAGAGAGGCA TGGTCCCATT TATATTTGTA AGAGGTCACT TTGGTGAAC.A 9780 ATCTAGGTGT GGGGGGCTTG GAGGGAGGCA GGGAGGTCTC TGAGGAGGCT GGTGCAGAAG 9840 TCCAGAGTGG AGAATGGTGA CGGGACTGGG GAGGGGTAGA GGTGATGGAG AAAGTAGACT 9900 TTCCAAGGTC TCTTTAGGAC AGGCCTTGCA GTGGGGGGAC TGGGAGCATC AAGGCTGCCT 9960 CCCAGGATTT GGGATGGGGC AGTGATGGGG ACCCTGGCCT GTGTGTCCTG GCCCATGGCA 10020 GGGAGGAGAG CAATATCTCT ATCATATTCA GGGAGCCTGG GTGTTCAGGG GTCTCTCCCC 10080 CGGTCTCAGT CATCCCAGGG CCCCCAGCCT TGACACTGGT GCCTGCAGAG CTGGTGCGGA 10140 TTCGAGGGGA GGCTGCCCAG ATCGTGTGCT CAGCCAGCAG CGTTGATGTT AACTTTGATG 10200 TCTTCCTCCA ACACAACAAC ACCAAGGTCA GTCCCTGCAG ATCACAAGGT GAAGTCTGGC 10260 CATCCTCCCA GCACACCAGG TTTCCCATGG TGGAGTCCTG GGCCCCCAAC TCCAAACTGG 10320 CTGTCTTAGC TGAAGGCACA GCTCAGACTC CAGAGAGGGG TGCAGACTCA CCCGAGATCT 10380 
CACTCCCAGT CAGTAGCTGA CACAGAATCA GGACTCATGC TTGTGCCGCT GAACTTTGTG 10440 GGGGTGGGTG GGGGGAGGTG GTTCTCTGTC ACCTTGACAC ATGGCCTTTG CCCCAGCCTT 10500 TAGACAAAAG CCAGAGGTGA GCTCACTTCT GATTTAGCAA GGGTTTCCTA GGCCACCATT 10560 GAAGCCCAGG AATATAACAG CTATTTCAGA AAGACATTGG GAGAGAGGGA GGAGGAGGGA 10620 GGATTCCAGG AGGGACTCAC GTTGGGCTGC CTCTAAGAGC CCCCTCCCTT CCCACTGCAC 10680 CTGCCGTGTT CCAGACACAG CCCTAAGCCA CTTGCATGCA TATCTCATTT ACTCCTCACT 10740 ACAGTCTTGG GGCAGGGAGC CAGTATTAGC CCCATTTTAC AAGTGAAGCA ACAGGCTCAG 10800 AGGAAAGGCA GATAGTAATC CTTAAAGGCT GAGGATTGGA ACCCAGATCT TTCTAATCCC 10860 TAAACTACCT TGGTATAACA TCTCCATTCC TTCTGGCTGC AGCTCGCAAT CCCTCAACAA 10920 TCTGACTTTC ATAATAACCG TTACCAAAAA GTCCTGACCC TCAACCTCGA TCAAGTAGAT 10980 TTCCAACATG CCGGCAACTA CTCCTGCGTG GCCAGCAACG TGCAGGGCAA GCACTCCACC 11040 TCCATGTTCT TCCGGGTGGT AGGTAAGCAT CAGGGTGGTG GTGGACAGTC GGTAGGGATC 11100 CTGCAGGAGT GTGAGCAGAA GGGTTTTGTT GAGGAAGCTG ATGTCAGGGA AGGAGACCTG 11160 CTGAGGATAT CTCTGCTGGA GTTTGTTTAT CCAAGGCCTG GCTAAGGAGC CACTCTCCAG 11220 GAGCTTTCCC TTACCCTCTC CTGGGATCTC TCTCCCATCT TGGAGCTCTT ACAGTGCATG 11280 GCTGCATTGG GTGCACCTTA GTGCCATTTT TTGTTTATTT GGGGATTGGG GTCCAGTAGC 11340 TCCCTACTGG ACTTCATTTG TTCATTCTTT CATGCATTCC TTTATGGAAA CATGAAAAGA 11400 CAATGATCAC CCAGTGATTA TGGGGGAAGC ACAAGGTGTC CTGGGAACAC TGAAGAGTCC 11460 CCCCAACCCA GGCTTCGAGA AGGTGGCCTC TAAACTGGGA TGGGAAGAAT GAAGGTGAGT 11520 TGGCCGGGCA GAAGGGTGGG AAAGGAAGGG GAACAGCGCT TCTGGCAGAG GGAGGAACAT 11580 ATGCAAGGCT CAAAGGCAAA GAGAACATAG ATCATTTGGA ACACTGAAAG AACTTGACAA 11640 CAGCTGGGAT GTGGAGTGGT GTGAGGAGTG GCCACAGGGG AGCAGAGGAG GTGGCAGAAG 11700 CCGGAGGTAA AGGTGTCTTA AAGTGAGAAA GAATAACTGC ATCTTMCCT ATTGGGAGGT 11760 CATTGTAAAG AGGAGAGTGA TGGGGTCAGA TTGTACAGAG GAGGCACTTC GTGGTGGTCA 11820 GGAGCACACA CTCCAGGGCA GTGTTCCAAC CTGAGTCTGC CAAGGACTAG CAGGTTGCTA 11880 ACCACCCTGT GTCTCAGTTT TCCTACCTGT AAAATGAAC;A TATTAACAGT AACTGCCTTC 11940 ATAGATAGAA GATAGATAGA TTAGATAGAT AGATAGATAG ATAGATAGAT AGATAGATAG 12000 ATAGATAGAT AGGAAGTACT TAGAACAGGG
TCTGACACAG GAAATGCTC;T CCAAGTGTGC 12060 ACCAGGAGAT AGTATCTGAG AAGGCTCAGT CTGGCACCAT GTGGGTTGGG TGGGAACCTG 12120 GAGGCTGGAG AATGGGCTGA AGATGGCCAG TGGTGTGTGG AAGAGTCTGA GATGCAGGGA 12180 TGAGGAAGAG AAAGGAGATA AGGATGACCT CCAGTCTCT GGCTATGGTG ATTGGGTGCA 12240 GGCAGTGGCA GTCACTGGAC TCAGACCCTG AAGCAAGGCA GCAGCTCATC GGAGTGTGAG 12300 CAGGCTCTGA GACATTTAGG TCTGGCCGTG CCTCATGTGT TGAATGTTAT GGGAGATGGA 12360 GGTGGCGAGG AGCATGAGAA TCATGAGCAT CACTGCCCCT AGAGTATGTG CAAGGCACTG 12420 GACTTGCAGC AGATTGTGAG CTCTGCTGTG GACCCCAATC TGCACTGGGA GCTTTGGCAG 12480 GGTAAAGGGG AAGAAGAGCA AAAGCACAAG AATTCAGTTA CGGCTTCTAA TCCTGTCTGC 12540 TTTCTAGTAC AGGCATACAG TCATCACTCA AGAAATGTTT ATGTTCATTC ACAC m GGG 12600 CCAGACACTG TTCTAGACAT CGAGGATACA GCTGCAAGTG AAACAGATAC AACAACCCCC 12660 GACTCATGAA GTGTGTGCTC TAGCTGGGAG TGGGCAAGCA ATGAGCCAAG TAAATTATTA 12720 AAAAAACAAA TTATATAGCA TTTGCAGCTT CAGATAGGGT GTTCACCAAG GAAGATCTCA 12780 CTAGAAAGCT GATATTTGAG CAAAGGCTTA AATTGCTGAA GGAGCAAGCC ATGCGGCCAT 12840 TTTGGAGAAG GGAGCTCCAT CCTGCAGCGG GACTGTGCTT GCCATGTTCA GGGGACAAGT 12900 GGGCCAGTGT GGCTGCGGGG AGAGAGTGAG AAAAAAAGTG GTCTCAGATG AGGTCAGAGA 12960 GCTAAAGTGG GAAGGTGAGA TGAAAGGAGG CTACCGCAGT GGTCCAGGCT GGAGCTGATG 13020 GTGGGTGGAC TAGAGTGGTA ATGGTGAAGG CAGCAGGAAG TTGTTGGTGT TTGGATGGAT 13080 GAATGGACTA ATGGATGGAT GAATAATAGA TAGATGGATT GTTGAGAGAG ACAGAGAAGA 13140 GAAAAGCCTT GCCCCCAAAA GCTCACAGAC TACTTGGAGA GAGAAGAAAG CTACCTGGAG 13200 GGAGAACCAG ATGCATGAAG CAGTGCAGAT GTGGTGCCTA ATGAGTGTGT AGTCTGGAAG 13260 GGCAGCAAAA GTCGAGTGGA GTGAGAGGTT CCTGTGTCCT GGAGCACTGA GTAGAGACTC 13320 CCTCATGGGG GTGAATCTTA AAGGATAAAG GGGCCTCTAT AATGAAAAGG AGGAGGATGG 13380 GATTTCTGGT AGAGGAAATT GCTTGAGCAA AACCTCCAAG GTTGGAATGA CTATGGTGTG 13440 TTCAGGGATG TTAGCAGACC CAGATGGGTG GAGCGTTGAG TGTGTGTGTG TAGGMGGAA 13500 GAGGGGAGGT GGCTGGATGA GCACAGTGAG ACCTGATTTG ATTGAGAGCC TTGAACGCCA 13560 CGCTGAATAA TGGAGGCAAT GGGACGCCAT AGAGGGCTTT TGAGTAGACA TATATCAGTG 13620 TAGAAGGGTG AATTTCAGAT TTTTAGACAG AATAGAGTAA GGAGAGGAGC TCTTAGAAAT 13680 CATCTAGTCC 
AGGGCTTGTG GCAGAGCCCT GAGGTTTTAA GAAGGCATGT CAGGGGCTAC 13740 CATGACAGGC ACGGAGAGGC TGAGTGAATT GGGGTTCTTG CCACAATTCC CTTGCCTGAG 13800 ATTCAACAAG AGCAGCTGTA TTACAATCTG TGCAAAATGT CATTAGGAGA AACTAGTTAG 13860 TAGCTGGGCG TGGTGGCATG CAACTGTTGT CCCAGCTACT CGGGAGGCTG AGGCCGGAGA 13920 ATCGCTTGAA GCTGGGAGGC GGAGGTTGCA GTGAGCAGAG ACTGTGCCAC TGCACTCCAG 13980 CCTGGATGAC AGAGCAAGAC TCTGTTTCAA AAAAAAAAAA AAAAAMACT AGTCAGGACT 14040 CTTTCAGATA CAAGTAATAG AAACCAACTC AAACTGGCCT AATTMAAGG ATTTTTTTCC 14100 TTATAGCTAA AAAGCTCATG GATATCAGCT TCAGGAACAC TTGGATCCAG GTGTTCAGCT 14160 GATGCTGGAA AGAATCTATG ACTCCCCAAC TCTCAGCCCT GCCAGGAAGG CTTTCCCCTT 14220 GTAGGACTCC GACTATCCGC CTTGTAGTAT CTGATCCAGC AACACCAGTA AAATGAGGGC 14280 TTCTCTTTTC CCAGAGTCTT AACAAAAATC ATGGAATTGA GTGTTATGGA CTCATGGATT 14340 CATGGTAACC CAAACCAATC ACCGGGCCAG AGGGGACAGA GTACCCTCAC TGGTTGGCCT 14400 GGGTTACACA CCTACTCCAG AGCTATATTT GGAAGCCGCA TTGACTGATT TATGACCAGA 14460 AGAAAGGGAA ATGGATGAGG ACACGTGAAA TTGTGTGTGT ATGTGTGTGT GTGTTTTCTT 14520 GCTGCCAAAA ATTTTTCAAA AACTTGGAAA ATCACAGATA TATTCAATCT CTTCATTACA 14580 CAAATAAGGA GATGGAGGCA CAAATGGGGA TAGAGGGATT TGCCCAGGTT CTCCTAGGGC 14640 TTCAGTGAGA AAAGTTTTGA TCCAGGGATT CTGAAGGGGG TGGTGAGAAG AGGGGTGTCA 14700 GAGGACCTGT CTTGGGTGGT GGGGACTATG TACCTGTGAC ATAGCTGCTC AGGGACTGGA 14760 TCAATGGGTG GATGACAAAA TGGACAAATA AACAAGGACA TCTTCCCACT AATGCCAGAT 14820 GCTTGTGTGT TCTGCTTTCC AGAGAGTGCC TACTTGAACT TGAGCTCTGA GCAGAACCTC 14880 ATCCAGGAGG TGACCGTGGG GGAGGGGCTC AACCTCAAAG TCATGGTGC; A GGCCTACCCA 14940 GGCCTGCAAG GTTTTAACTG GACCTACCTG GGACCCTTTT CTGACCACCA GCCTGAGCCC 15000 AAGCTTGCTA ATGCTACCAC CAAGGACACA TACAGGTACC ACTTATCAGC TCCCGTCTAC 15060 ACAGCCCGAC AACCAGATGG GGTATGCTTC AGCAAGCATC AGGACGCTTG GCTCATGTCC 15120 CAACCTTGGT GTATGACCTT GAGCAAGTCC CTGCCCCTTT CTGGGCTTCG CTTTCCCTGA 15180 CTTCATGGAA TCCCAATATT GGTCATCTGT GTTTGAGATC TAGATGAAAT TGACCTACCT 15240 CTCCATCCCA CATCCTTGGG ATAGTCAATG CCCCACCCAA GGATTCTACC ATTTCTTGGG 15300 AGTGTGCATT CTCATTGGTC CCTCAAGAAC CCTCAGCCTC
ATTCATTTTC CTCTcTTGGG 15360 GCCAATCCAA ATGCAGAAAA CAGCCCCACT CATAGACACA CTCCTGATAA TGACTGCACA 15420 AGTTATCTGC TACATACAAA AGCTTGGAGG GAGGGGAAGA GGGAATTAAG ATCACACAAT 15480 CACAGATACA TGAAATGTTC TTTAAAGGAT TGTGATCACC CAGCCCCAAG AATTTCTCAC 15540 TGGCTGCTCT TCTCTGTAAG CTCAAAACTC TTCCCATGAA GTGCAATCTA TAATAACTCC 15600 ACACCCCTCT TCTTCCGTCT CTCCACTCCC ACAATCCTGT GTATTCCACA CACATTTTAG 15660 AAATCTTTTT CCTGTCTGCT TGTGAACTGT GTTCTTGGGG TCTTGCTTTC TCATCCAAAG 15720 TGGCTTAAGC AGGTAGGTTC TAAATAAGAA AGCTTTGTGC CTAAGAGGAA CACTCATACC 15780 AGGTATATCA GGTATTAACT CAGGTATTAA AATAGTTCCT TCTTTTCTTT CTTTTTATTA 15840 TTTTTTTTAG ATGGAGTTTT GCTCTTGTTG CTGGAGTGCA ATGGCACAAT CTCGGCTCAC 15900 TGCAAACTCG GCCTCCCGGG TTCAAGTGAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG 15960 ATTACAGATG CCCACCACCA CACCCAGCTA ATTTTTGTAT TTTTAGTAGA GACAGAGTTT 16020 CACCATGTTG GCCAGGCTGG TCTCGAACTC CTGACCTCAG GTGATCTGCC TGCCTCGGCC 16080 TCCCAAGGTG CTAGGATTAC AGGTGTGAGC CATCGTGCCT GGCCTGAAAT AATCATTCAT 16140 ACCCTGCCCT TTCAGAGGGA GACAGTACAG CTTAAGGGCA GCGAATACGT GGTGTGCATG 16200 CCACACTCAC TCTCATTCTT GTTTCTGCAA CTCTGTTCTG CAGAGTGTAG ATGCGGCCTC 16260 AGAGTCCTCC TCAACACAGG TCCCAGGCAG TATTTCCAGC ATAGTTGGCT CATGAGAGAT 16320 CTGTTTGTCA TCCCTGTGTG GATCCCTTAG ACAACTTCAA AACTCTTTGG GATTCTCGTT 1 380 CTAGCTCTGG AAGCCCAAAC CTCATTGATT CCCACAATCT TGCTTGTCAA TTGTCAGAAG 16440 CAACAAGGAT GTTITCTTGT CCTCATCTTC CTCCTCTCAG TTCCCTTCTG GTCCTTTCTG 16500 GCCAGGTCTC TGTCTTCCTC TCATTTAAAG CAGAAGTTCT GAATCTGGAA TGTGTAGGCC 16560 CTTTGGAGGG GGCTGGTCCA TGGATCGGTT TAATGGGTCC ATAAGCCACA GAGACATTGA 16620 GGAAAGGAAC ACGAGATCCC CTAAAACACA GTAGTCTGGG CCCATTCAGC ACAAGGCAGA 16680 CAAGCCTGGA CACCAAACAG CCACAGAATT TTAGTTCATG TCATGGGTTG TTCATAATGG 16740 TGACTTTCAA TTATCCAAAA AAGTCAAATT ATTTTTAGTT AAAGGGGTTA GTTATCTCAA 16800 GAAGTGACCT GGGCAGAGGC CTTGTATATG CCCAGGGTCT GGCTGGATGA GACTGCTCTC 16860 TGAATACCAT AGATTTTAGT CTAGTAGTAG CTGCAGACAT TTCCCAAGCA AGAACTGGCC 16920 ATTTGCTATA ATTTTTAAAA TTTTATTTAT TTTGACAGTG AACTGGGGGA CTTTTTAAAA 16980 AATGTATTTA TTACCTAAAA 
CAACACATGT TCATTATGGA CAAATTGTAA AATAGAGATT 17040 AAAGAAAGAA TAAAACAAAA AATTTCCCAG AATCAGCCAA AGATGATTTT TATTGTTAGT 17100 TTTTGCTCCA GGGCCTTTTC TGTAATAAAG GGTACCATTG AATTGAGTGC CCACAAAGAT 17160 TCAACTTCTG TGTCAAGCAC CCTAAAAAGG TCCTTTAATC CTCAAGCCAA GCCTGTGAAT 17220 TAATAACCAT CGATATCACT CTCACAGCAA AGGAAGTGAG GGATCAGAGA GGTTMGTAC 17280 TTGTCTAAGA TCACACAGCC AAGAAACAGC AGCACCAGGA CTTGAACCCC AGTCTCTGCA 17340 GCAACATGGC TCAGAACCCA GGGCCCTACA TCCTGCCTCT <RTI ID=94.11> TGTCTCTTTC TCAGTCCCTC 17400 TTGGCAAGGT TGGCACTTCA GGGATTTGTA GCAGGGATTG CAGCTTTCAT GAAAGCTTAG 17460 TCCAGTGACA GTGGTCAACG TAGGCGACCT GTGATAGGCC TCCCAGCACC TTGAAGACAT 17520 CACCTCTATT AAACCTCGGG AAAAAAACAC TTTCAGATAA GAAAACCAAC TAAGGAAATG 17580 GGATTGGTGG TTTTTGCATG TCTCAATGGC ACCCTGTCTG AGTATCTGGC TTACCCAAGG 17640 CCGTTGGGCC CTGAATATTT TACCAAAAAT AAAATAAACC CCTTTAAGGC TGTTATCTGA 17700 CTGCAATCCT GGCAGGGGCC ATACTAGGCT GGGGCTCACC AACACCACCT GATTCTCTCC 17760 TGCAGGCACA CCTTCACCCT CTCTCTGCCC CGCCTGAAGC CCTCTGAGGC TGGCCGCTAC 17820 TCCTTCCTGG CCAGAAACCC AGGAGGCTGG AGAGCTCTGA CGTTTGAGCT CACCCTTCGA 17880 TGTGAGTGCT GGGGCCGAGC GCCACCTGGG GCGGAGGCCC TGGGACTGCC TGGAGGGATG 17940 GGGTTGACTG GGGCAGGGCA CAGGGAAGTA GGTACTGG(;A GATTGGGAGG TGGCGGGGAA 18000 AGTGTGACTT GGGGCCTCCT CCTTTCTTCC TCAGACCCCC CAGAGGTAAG CGTCATATGG 18060 ACATTCATCA ACGGCTCTGG CACCCTITTG TGTGCTGCCT CTGGGTACCC CCAGCCCAAC 18120 GTGACATGGC TGCAGTGCAG TGGCCACACT GATAGGTAAG TGGGCTCCAC TCACCTCCCT 18180 CACCTGGGCT CAGGGGCTGG GCACCCTGTG AGTGGGAGGG ACATGCTGGC GCTGGGAACC 18240 CTGAAGCTCT GAGCCACATT CTGCTTTTGC CAGGTGTGAT GAGGCCCAAG TGCTGCAGGT 18300 CTGGGATGAC CCATACCCTG AGGTCCTGAG CCAGGAGCCC TTCCACAAGG TGACGGTGCA 18360 GAGCCTGCTG ACTGTTGAGA CCTTAGAGCA CAACCAAACC TACGAGTGCA GGGCCCACAA 18420 CAGCGTGGGG AGTGGCTCCT GGGCCTTCAT ACCCATCTCT GCAGGTGAGA GGGAGCCTTC 18480 GCACCCGCAC CGCCCCCCCG CCCGCCCCCC GCCCCTGCTC CTTTAGGCGG CTCCTCCCCC 18540 ACCCCCCACC GAGGGAGCTG GGGTTGGCTC CACCTTTGGA GCAGATCCTA GCAGTACCAA 18600 GGTCCACCTC TCTGGGCCAG TCCAAGCCCC TCCTGCCTGG CAGGTCCCCC 
GAAGCAGTAC; 18660 GACGGGGTAG TCTCTGAGAA AGCAGAGAGA AAGCAGCCTG AAGAAACTGG CCCCCACTCT 18720 TGTCCCTGCA CTCTAACTCA TGCATCTATT CACAAGTATG TGCAGGCATT ATGCACCGTC; 18780 TGCCAGGGAC GTGCCCTATG CAGGGAAGCA GTGCCTCCCC AGAGCTCAGA GGCTGATGAG 18840 GGAGGCAGGC AATGAGCAAG GAAACAGTCC ATCTCCAGCT CGGGGCCAGC TAAGGACGGC 1890n CTTCTCCAAC TCTCCCCTCT TGCTCCAGAC ACAGTCTATC CATTTGAGGT TGCTGTGCAA 18960 GAGGCTGCCC CGGGGGATGA TGCCCGGCCC TGTGCACAAC ACAGGCTGCC TCTCTGCTTT 19020 ACACAAAGGC TCCTTACCAG CTAGTTCTGT GATTCTCAGA GGCCCACAGC ATCCTCAGGC 19080 TTTTGACAAC CAGGCTCTGG CACCCACTGT GTGCCAGACC CTGGCATCTG CCTGGCTCAG 19140 GGGTGGTCAC TCACGTCCCC AGCTGCTGGC CTTGGAGCAA CTGCTACCAG GGTCCAGCTG 19200 CAAGCAGGAG CCTGCGGCCG CGCTGGGCCT CACTGCTGGA GGTTGTATAT TATAATAAAG 19260 CCAACATTTT GTTGAAGGCT TCTGCTGCGC CAGGCACTGT GTTAAGCTCT TTGTGC; GGAT 19320 TATCTCGATT AACTCCTACA AACCTAGAGAA ATAAATAGAA TTTTCCCTAG GCTCAATCTC 19380 ACACAGCTCC CAAGTGGCAC AGGTGAAACT TGACTGCAGA TCTAAGTTAC TGATCTGAGC 19440 AAGGAAGTGG AAATTATGTT CTCCAAAACA TCGCTAGAAC TAGTAGTATA GATTCTGGGA 19500 AGAGGAGACT CAGGGGCCAC AAGCCTGGCT TGCTAGACCC TCAGAAGGGC TGTATGATTC 19560 CAAAGGCATG TGGAGAAGCT GCAGGGGAAA TGCAGGAGAG GAAGGTTGCA <RT CATCCTCTTC ATTATTTCAT GTTACTGTCC TATTAGCCAA AGCCACAATT TAGTGCATGT 20760 TGCGTATAGT GTGCTTCCTG TGTCTGCTCA GTATATCACA GTGATTTGAG GGGCATTTTT 20820 CTATAGCATG TTACCTACAT CATCTCATTT AATGCCCTCA GCAACCACTG TATGCAGCTA 20880 GCATTAGTCT ATTTTACAGA GTTGTAAACT GAGGTTCTGA GAGGTTGGGA CAGTTGCCCT 20940 TGTCTACAGC TGGTCAAAGG CAGAGTCTGG TTTTTAACCC TGAAGGAGGA CTCACTCCAA 21000 AGCATGTCCC AATCATTATG TGAAACATTG ACTCATCTTA TTTTACCCTC ACAAGAAGCT 21060 GGAGGCAGGA AGTATACTAG TCAGTATCTT ACCCATCAGG AAGCTGAGGC TCAGCAAGGT 21120 TAAAAAAAAA ACCCCAAGGG GCTGAGGC;AT AGGGTTGGCA CTGGGCCCCA GGGGCTTCTG 21180 TCCCTAGAGC CCATGGCCTC CACTGCCTGC CTGCCCACAC AAAGACCATG TGCAATGTGA 21240 TCAGAAGCTG AGAC.GACCAG GCCAGAGGGC TGTGGGAGTT CAGAGGTGGA CGGACTTTTC 21300 AGGCTGGTGG GTAAGGGAGA CTGCCTGGAG GAGGTGGCTT GGCATTGGTG GGACGGGCTT 21360 TGGAGGATGA GGATGCAGCA GGGGAGATGA CACTAAGGGA 
AAGGGTATCT CTGGGGGAGA 21420 GGGCAGAGTG TGCAGAGGTG CAGGTGAGGG AAGGACCAGG GTGGGGCTGG GGGTCTGAAG 21480 GGTTGGACCC CACCCTGTCG GTCCAAGGCC ATCAGTGGGT TTGAACAAGG GAGTGGTGTG 21540 ATCAAGGACT GAATGACCCA TCTTGTGTCC CCTTGGCTAC CTTTTCTTCC CCACACCCCT 21600 TGGGGCTTTT GTGAGAAGAG <RTI ID=97.12> GGCTTGAAGT GGGCAGGGTG GGAAGGATGT TGGGGGAGCC 21660 CCAGGGGCAC ATGGATCGGG ATCTCTACTC CTGCCAGCAC TCAGCATGAG AAGGCTGCTC 21720 TGAGGGCAGC CCCGGTCAAT ACCTCCGGAT CTAGGTCCAG CTCTGACACT GTTITGCCAT 21780 GTAACCTCAG CTGACTCGCT GTCCTCTCTG GGCCTTAGTT TCCCCTCTTA TACCATGGC, T 21840 CTGGGTGTTC TCTAACAGCC CCTCCTCCTC TGACATGCCA AGAGCCCACT GGTGGTCTAG 21900 TTTAAGCACC AGAAACTTGG ACTTCAGTGA ATCTGGGTCC AAATCCTGCC TCTGCCAAGC 21960 TCTGGCTATG GGGTGATGAG AAAGTTGGTG TGTCTGAGTC TCTTCTCCAT TTGTAAAATG 22020 GGATCATTAA CAGCCTGTTG TGAGGGATTC CGTACCACAA CGCACATAGA GGACTGAGCG 22080 GGGTGCTGGA CGAGACAGTC TCTGTGATGG GAGCTGCACA CTCTTGTCCC AGGAGGAAGT 22140 TCGTTGGGGA ACCAGAGTTA GCTCATGCCT CTTGGGATGG TGGAAGGAGG GGGACGTCTG 22200 AGGTCGGGCA TCATCTCCTT GACTACACAC CCAAAGCC.GT TGTTTGGCCC AGCCCACCCA 22260 CCTCCAGGGA CAGGACCTTA CTCACTCTCG GGGCCACCCG TTCCTTCTCT GAGCAGCTCC 22320 AATGTTTGCA AAGTTCTTCC TTACATGGAA CTGAAAACTG CCTCGCAGTG CCCACAGAGC 22380 TGCCAGGACA GTCATGCAGA GATTCCAGAG AAGGGCCTAG GGCCCCCTGC GGCCCTTTCT 22440 GCCTTGGGCT GGCCAGCCCC CTTGGCTGTG GTTTAGGAAC TCTGTATCCC CTCTCCACGG 22500 GACCATTTTT GGAACATGTC ACCTCCACAC TTCCTGTCCA GGAAATTCAG CTGCCCCTGG 22560 AGCCCATGCA AGGCTGCGAG AAGACTTGCA GCTACCCTCC TCCCCTACAC CCATTCACAG 22620 ACCCTTTAGC TCCAGGCCGA GGTGTCCACC CATGGGAGCG GAGGGGGCAG GATGGTCATG 22680 CCCGTGCTAA GTGCCTGCCC TCCCATCCTC CTCTGCCTTG CCCCATGAGG TTCGGAGCCT 22740 TGCCCCTTCA CTGGGGACTC AGCCCAGCCT CTCCTCATTG CCCAGGCCTG GGGAAAGAAG 22800 TGGCCTGTCT GTGGGGAGTG TTTGTTCTGC CTCAGGGCTG AATCATCACC TTTCTGTCCC 22860 CCAGAGTGAC CACAAGGGGG GCCGTGGGGG AAGAGAAAAG GGCAGGAGTC AGCAGGCTCC 22920 CCTGGAGGAG GAGGCGCACA GGGAAATGGC TGAGGCAGCA GGGAAGGGAG GGTCCAGGGA 22980 GGCTGCTGGA AAGACTACGA TTCTGGGGGC TGGAACTGAG CTCTGAGGAG CAACAGGAGG 23040 
GTCCCCAAAG ATTCCACTGG GAATTGTTCA GATCTCCACC TTCCTGTGAG AACATCCACT 23100 CACCCAGAAC CAGCAGGCCT AGATGGGGAG GGGACCGGGA CTTTGTCTCC ATGCCCCCTT 23160 TGGTGGGGAG GATGGGAGGA AGGGAAGAAG TCAGGGGGTG GGCCTGGGGC TTAGGCCCAT 23220 TGCAAGGAAT GAATGGGGTG ATGTGCTTCA AGCATCTAGC CCAGCGCCCC ACTCCCAGGA 23280 AGAGCTCAGG AAGAACCCGC TGCCATCATG ACMTTACGT CCACCCTTCT CAGGGAGCCT 23340 CGCCCATCCC CACCTCTTGA TCTCTCACTC ATAGTTCTTT GGAAGAGAGG CTGCCTCTGG 23400 GTAGACGCCC ATGAGCCCTT TCCAGGGATG GCACAGGTGC CCTGGGAGGT TTACATGCCC 23460 AGCAGGGGCA GGGGAGGGTT CCTGAGGCAG GCAGAAGGCA GCTTGGTCCG CTTCCAGAAA 23520 TTAGGAGCCT AGGATTCAGA AATCTGAGAA TCCAGCCAAA CCTCCATCCT CCTTGATCCC 23580 CTCCCTTTCA ACAGTGCCCC CTGCCCAGCT GGGGGCAGGG AGGGGCTGAC TCAGCCCAGC 23640 TGCAGAGGGA CAGAGGAACA AGAAGTGGTA AGAAAAAACA GTCTTAGCCA CAGAGGCTCC 23700 TAGAGATGGA AGTGGCCAGG AGAGGCTGAA GAATCCCCTC CTCGCCTTGT TGCTGTCTTT 23760 TGGGCTGGGA AGGCACCCAC GGGCAGGATT TGGATCCTCA GAGGCTTGGG AAGCTCTTCT 23820 CCCTGGGTCC CGTTTCAGAC TCTCTCCCAA GCTATAACGC AGAGGCTCTG AAGTTCACCT 23880 GCAGTCCGCC CTTCCAAATC AGAGCCTGGA AGTTAGTTCC TTCTCATITC TAATTGCAGT 23940 CTTTTCTCTC TAACTACCAG CTAGAAGTTC TTCCTGATGG TTAGCTGGAA GCTTTCTCCC 24000 TGTCTCTCTC TTTAAAAATG TCCACATTTT ATTTTTGATT CAGGGGATAG ACGTACAGGT 24060 TTGTTGCATG CGTATGTTTC GTGATGCTGA GCTTTGGAAT ATGGATCCCA TCACCTGCTA 24120 CTGAGCATAG CTCCCATAGT TTTTCAACCC TCGCCCGCTT CCACCCTCCC TGCTCTAGTA 24180 GCCCCCAGTG TCTGTTGGTG CCATCTTTAT GCCCATGCAC ACTCAATATT TAGCTCCCAC 24240 TTATAAGTGA GAACATGCGG TATGTAGGTT TTCTGTTTCG GTGTTAATTT GCTTAGGATA 24300 ATGGCCTTCA GCTGCACCAC GTTGCTGCAA AGGACATGAC TGGAATCTTC TCTCTCAACC 24360 AGGACTTGCA GCTAAAGGCC AGCCTCCTCC CTAGCACCGG TCCACACTTC CTTTAAGTTT 24420 CTAGCTCGGG TGCCCAGGGA AGGAGCCCAG CTGCAGGCAC AGCCAAGCTT GTCCCATCCC 24480 CAAGGCCTGG CCGGAAAGAG TTGCTCTGCT GACCCAGGGC CTCAGTGTCC TCCACCGCCC 24540 CAGCCCAGCT TCCACTTTCC CCCTCAACTT GGTCTTCCAT CAGCATTTCT TATGGGCAAC 24600 CCTTAGCATG GTACTCCCCC TCAGCAGCTG ACCCCTGGGC AAGAAACAGG GGCAGCCATT 24660 CCTCCTCCCC ACATCCCAGG GCTTGCCTCC CCTGGCTGGG TGGTAACAGC 
ATGGAGAGCC 24720 TAAGGAAGGA AATCAGGTCT TTCCAAAGGT GCTGGTCCTC CAGAATCTAT CTAGTGGGCA <RTI ID=99.16> 24780 GCGTCTCTCT TTCTCTCTCA AAAAGGTAAA GTCAAGGCTG GGTGCGATGG CTCACGCCTA 24840 TAATCCCAGC ACTTTGAGAG GCCAAGGCAG AAGGATTGCT TGAGCCCAGG AGTTTGAGCC 24900 TAGTGAGCTA TGATCGTGCC ACTGCACTCC GGCATGAGTG AAGGAGCAAG ACTCTGTCTC 24960 AAAAAAAAAA AGTCAGATGG CGACTCACCT GTGTCAAACT CTCAGGGTCT CTCACTGCCC 25020 GGCCAGGCAT GGTAGCTCAT GCCTGTAATC CCAGCAv r GAGAGACCGA GGCAGGCAAA 25080 CTGCTTGAGC TCACGAGTTC AAGACCAGCC TAGGCTGCGA CAAAGCCCCG TCTCTACAAA 25140 AATTAGCCAG GTGTGGTGCC ACATGCTTGT AGTCCCGGCT GCTTGGGAGA CTGAGGTGGG 25200 AGGATTGCTT GAACCTCGGG GGTCGAGGCT GTAGTGAGCC AAGACTGCCC CCACTGCATG 25260 CCAGTCTGGG GGACAGAGAT CCTGTCTTGG AAAAAAAAAA ATCCCAAAAG GGAACCCACT 25320 CACCTTATCA TAGCCCTCAA GGCCTTCCTG TTTCTGGAAT CTGCCCCCCA CTTCCCTCAA 25380 GCCATGATGG CTGCCTTCCT ATAGCTCAAA CTTGCCAGGA TCATTCCCAT GTCAACCATA 25440 CAGCATTTCC ATGCACTGTT CCTGGAAAAT TCTTCCTCTG ATGGTCACAT GGTGGGCTCT 25500 TTAGGGGCCT TCCCTGACTT ATCTTACTTT ATTTTCTTCA TAGCACCACT TGAGAATCTC 25560 CTAGATACAT GTTTATTTGC GTTTAATGCC TCTCTCAGCC ACTAC.AATGC AAACTCCATG 25620 GAGGGGCAGG GACTTTGTCC TGTTCAACTC TGAATCAGCG GTGCCTGACA CAAATAGATC; 25680 TTCAAGAAAG TATGTGGATG GGCTACTATT ATTCAGCCTT AAAAAGGAAG GGAATTCTGA 25740 CCTGTGCTGC AGCATGAATG AACCTTGAAG ACATTATGCT GGGTGAAATA AGGCAATCTC 25800 AATAGACACA TGCTGTGTGA GTCCACTGAG GTGCAGTGCC TAGAGCAGTG CAATTCACAG 25860 AGACAGCAGA ATCATGGTTG CCAGGGGCTG GAGGAGGGAA AGGGGAGTTG CTTTTTAACA 25920 GGAACAGAAT TTCAGTTTTG CAAGATGAAA AGAGCTCTGG AAACTGGTTG CACAAGGTAG 25980 AATGTAATTT ACTTAATACT ACTGAACCAT ACACTTAAAA ATGGTTGAAA TGGTAAATTT 26040 CATGTATGTT TTATCACAAT TAAAATATAT ATATATATIT GGATGGGAGG TTGGGTGGGT 26100 GGATGGATGG GTAGATGGAT GGACAGATGA ACGGATGGAT AAGATCTCAA GTTCCACCCT 26160 CCCTCCTGGC TCAGGAATTA CCAGATTATC AGAGATATCA GGGCCCTCAG AGGTTGTCTT 26220 GTCCAAGGTC TTCAATACAC AAATAGTGAA ACAGGCTTGG AGAAGGGAAG GTCACACAAC 26280 AAGGCAGAGT CAAGCAGGAA CATGCTCTCA GTGCTATGTT CATGAGACGA CCTCTCTCAG 26340 CCCAGAGCAG GCCTTGCCCT 
GCCTTCTCCC ACTGGGCGCC TTGGGACTGC CCACACCCCT 26400 GCTCTTGGGG GTCAGAAACA AGGTCCAGGA ACTGCCTGCC AGCCCCGACT GCCACGTGCT 26460 CCCTTCCTCT TCTGCAGAAG CCCAAGTACC AGGTCCGCTG GAAGATCATC GAGAGCTATG 26520 AGGGCAACAG TTATACTTTC ATCGACCCCA CGCAGCTGCC TTACAACGAG AAGTGGGAGT 26580 TCCCCCGGAA CAACCTGCAG TTTGGTGAGA TGGCAGCTCA TCACTCCACA GCTTCCTATC 26640 ACAGGGCCTG TGGGGGTTGC AGGGAGCCCA TGGGCCCTTG GACAGAGGCC CTTTGGTGCC 26700 CAGGGACTTA AGGGACCTGT GTGCGTGGCA GGTAAGACCC TCGGAGCTGG AGCCTTTGGG 26760 AAGGTGGTGG AGGCCACGGC CTTTGGTCTG GGCAAGGAGG ATGCTGTCCT GAAGGTGGCT 26820 GTGAAGATGC TGAAGTGTGA GTGAGGGGAG GGGATGAGGG AAGGGATGGG GGTGGTAGAT 26880 GCTGGGGGTG GGCTGGCCCT GGTGTCACAA GAGGCATCAC ACACA m CA ACCTGTTGAA 26940 GCCTGGGGGA CAGAGCTCAG GGGTGAGGAC TTGGGTTTC TTGTGAGCTC CAGGCACCCT 27000 CTGACTCCCG GCTCCAAGAA GGTCTAGGTC ACCCTTTAGT TGTGAAGGGG CTCCTGACTG 27060 AGCTCCAAAA AGTCTGGGGG TGCAGAAAGG CCACCTATGG CCATGGCCTG GCCACAGTTT 27120 GGCTTCCTGT CACCTGAAGA CCAGCTCAGT GACAGGCTCA TCCCTTCTCT CTCTCTCTCT 27180 GCCATCTGTG TGTCTGCATT TTTCCTTCTC CTTCTTTTGG CTTCTGGTCA CTCCGGGTCT 27240 TGGGATATGC CCTGCTTTCT CCCCTGGGTC TCTGCATrTG GTCCCCATGT ATCTGTGTGG 27300 TGCTCTCTGT CCTGCCCTCT CCCTGTCTTT GGGACTGTGG TTCTTCCTCC CAGCCACGGC 27360 CCATGCTGAT GAGAAGGAGG CCCTCATGTC CGAGCTGAAG ATCATGAGCC ACCTGGGCCA 27420 GCACGAGAAC ATCGTCAACC TTCTGGGAGC CTGTACCCAT GGAGGTAAGG GCCTTGGGGT 27480 TCCTGGGGCC AAGGTCTTGG GGCCTCTGGG GAATCTCAGG GCCCCAGGGC TACCTTGTTC 27540 CGTCTTCTCC TTCTCAGGAT CCTACTGCTC CAAGTGTCAG GGGGATCCCG GTCACAGCAT 27600 CCCTTAAACT CCTGGGCCCA TCTCCTGGAA TAGTCAGGAG CTGCACGGGC AGCTTGAGGT 27660 ATAAAGAGAG ACTGATAGGG AGCATCGGAG CCCTTGGAGG AGGAGATGAA TGTGCAAGCT 27720 CCTAGGCCCT GCTTCCAGGG AGCCGGATCC TCTGGGTCTG GAGTGAAGCC CCCCGCCTAC 27780 CTCTTATGAA GCTTCCATTC AAGGATGCTr GGACACTCTC CCCAGGGCCC CCAAAGGTGC 27840 CCCGGGCTTT GCTGGGACTC CAAGTGCCCC ACATCCTCTT CACTGATAGC AGCTCTGACC 27900 TACAGTGAGC CGCCATAGCT TTCCTTTGAA GAAATAATTC TTGGGCTACA TTTTTTTTAA 27960 GGTTGTCTTT TTTTTTTCAT TTTTTGTTTT TTTTTTCTTG AGACGGAGCC TCACTCTGTC 28020 
ACCCAGGCTG GAGTGCAGTG GTGCGATCTC GGCTCACTGC AACCTCTGCC TCCCAGGTTC 28080 AAGCAATTCT CCTGCCTCAA CCTCCTGAGT AGTTGGAACT ACAGGCACAT GCCACCATGC 28140 CCGGCTGATT TTTTTGTATT TTTGTAGAGA TGGGGTTTCA CCATGTTAGC CAGGATGGTC 28200 TCGATCTCCT GACCTCGTGA TCCACCCACC TTGGCCTCCC AAAGTGCTGA GATTACAGGC 28260 ATGAGCCACC GTGCCCCGCC AAAGCCATCT GTTTTAAACA AATGGAACTA CTGAGGCACA 28320 AGGAAACTTG CTCACAGAGC CGAGGTTAGA ACTCAGCTAT GCTGAGTCCA AGTCCAGTGG 28380 CCTCACTGCC CCCAGTCTCA TGCTCCTGTT CATGGAGGGG AGCACTCAGC ACCTCCCTCA 28440 CCCCACACCC TTGGCTGCTC TAGGCCCTGT ACTGGTCATC ACGGAGTACT GTTGCTATGG 28500 CGACCTGCTC AACTTTCTGC GAAGGAAGGC TGAGGCCATG CTGGGACCCA GCCTGAGCCC 28560 CGGCCAGGAC CCCGAGGGAG GCGTCGACTA TAAGAACATC CACCTCGAGA AGAAATATGT 28620 CCGCAGGTAG CCCCTGGCAA AGGACAAGAA AAAGGCCAGG TCTGGGAGGC AGGATCCGAG 28680 TCTGTCTTCA AAGCCAGCTC AGGGTTGGAT GGCTCATC,AA TG(; GTGGCTA TGCAGCCCTC 28740 ACCTGCCACC TGTGTCATGG GAAGTAGCCA CCACAGGTTT TATGGCCATC TCTTGTTTCT 28800 CTACTCCTTT TCCCCTTCAT TCAACAAATA TrTGAACACC TACCGTGTTC TGGGAGTGTG 28860 GAGGGCAAAG ATGGGCAGCT CATAATCTGG TGGAGATATG CATCAATGAA ATCACCACCC 28920 AGTGTGTGTA AAAGATCAAC CAAGATCTGT GCCTGGAGCC CTAGTAAGAG ATGGGCAGAT 28980 GTGGCCGGGT GCAGTGGCTC ATGCCTGTAA TCCCAGCACT TTGGGAGGCT GAGGCGGGCA 29040 GATCACCTGA GGATGGGAGT TCGAGACCAG CCTTACCAAC AAGGTGAAAC CCCGTCTCTA 29100 TTAAATATAC AAAATTAGCC GGGCGTGGTG GCGCATGCCT ATAATCCCAG CT CTCGGGA 29160 GGCTGAGGCG GGAGAATTGC TTGAACCCAG GAGGCAGAGG TTGCTGTGAG CTGAGATCAC 29220 ACCATTGCAC TCCAGCCTGG GCAACAA(; AA TGAAACTCCG TCTCAAAAAA AAAGAGAGAT 29280 GGCTCTGTTG TCCTGTTGCT GTGATTCCTG GAAGCCATCC AGAACAGAGC CATCCAACAG 29340 ACAGAGCCAC ATGGGGAACC AAAGAGAGGA AGTGGGGAGA TTCATGTCAC ACATGAGTCA 29400 GGGTTAGAGG TGGAGCCTGG ACTAGAATCC TGCTCTCTTG ACTTCCAGTC CAGGAGTCAC 29460 CCAAGCCACA CTGCTGTCCT GGAGGTCTCT GTCTCAGGGG CTTGTGGGGT CAGGACAGGA 29520 TCAGAACAAG AAGGGTGTAC ACTGCGCCCT CATCCTAGAT ACTGTCAGCT GCCACGCCTG 29580 GGGAGGCAAA AGAGAAGGAG GCCATCTCTT CACCCAGGGC CTTAAAAATG GGGGCCTGGC 29640 AGCATCACTI CCTCTTCTGA TTCCCTGACA CTTCTATGAG 
GGTGGCACAC ACTAGGCCTC 29700 TGAAGATCAG ATCAAAATGA GCACCAAAGG AAAGTATTAG CTTCCATCTT CAAATACGCA 29760 GATGGGGAAA GTATTCCCAG AGTGGGTAAT TTCGAGGGCA <RTI ID=103.6> AATGGCCTGT AAACCAACTC 29820 TGTCAAAGGA TTCCAGGCTG TTAACGGAAG CATAGTTTCT AChAGGGAGC GGAAGGTTTT 29880 TTCGGTTTCT CCTTCTGGGA ACACTAGAAT ATGGACATTG TCAAGGTACA CATCTCTAGC 29940 GCAGAGGGGA CAGGAGGGAG AGAGAAATCC TATCTGGCTG GAACGTTAGG AGCAGTAGTG 30000 CTTCAGTCTA CAGTAGTGCT TCTCAAATTC TCTACCCCAA GTGTGCTCTC ATAGGCATCT 30060 CTTGAGGACT GTTGGAAGTG CACCACCTCA GGCCCATCCA CCCAGGCCTG CTGATTCAAA 30120 ATCTGCATTG CAGAGATTCC CGGGGTGATT TATCTGCACA TGAGTTGCAG CGTAAGCAGC 30180 ACTGCTCTAG ACCAGTGGGC CTCAGCTTAG GCTGTACTTT GTGATCACCT GGGGAGATIT 30240 AAATCTGTGA ATGACTGTTT TGTCCCTAGA GTTTCTGAAG TATTAGTAAT TAGCCTGATC 30300 CTAAAAGCTC CCGAAGTGAT TTTMTGTGA AGCCAGGGGT GTGAGGCACT GTCCAGAGAA 30360 GAGAGGGCAC AAGGGGCCCT AGAATATGCC CCAATTCTAG TAGGGCTGTT ATGGGGAAGA 30420 GGACTCCAAC TTCTCTGTGG CCCTTGAGGG TAGAGCAGGG GCTAGGAGGA AAATCTCAGG 30480 GGTAGATTGG CATTAGGAAC AGTGAAGAAC TTTCTCACAG GCAGAGCTGC CCAAAACCAG 30540 AATGGGTTGT AAGCTCCCTC ACCGGGGACA GCCGAGCAGA GACCAATGCT CACTCAGATG 30600 GAGTGTGGCA GGAGGGTTTC TTATCAGAAA GGGAGGTTCC AGTTGACCAT GGGGTGGTGG 30660 GTGGTCAAGG CCTGAGCTGA GCAGTGCAGT GATGATGACT GACCTCTGCC CCCCAACCCT 30720 CTCTCCTATG TAGGGACAGT GGCTTCTCCA GCCAGGGTGT GGACACCTAT GTGGAGATGA 30780 GGCCTGTCTC CACTTCTTCA AATGACTCCT TCTCTGAGCA AGGTGAGGAG GTCCCAGGGC 30840 CAGGCCCCAT TTGCTTGATA ACAAGGGAAA AGGAGAAGGG GCTGCTGGGG TGAGGGGTGG 30900 GGAGTGTGGC AGGGCTGCCC TGACGCCTCT TCCCACCCTA GACCTGGACA AGGAGGATGG 30960 ACGGCCCCTG GAGCTCCGGG ACCTGCTTCA CTTCTCCAGC CAAGTAGCCC AGGGCATGGC 31020 <RTI ID=104.2> CTTCCTCGCT TCCAAGAATG TGAGTAGGAA CCTGGCCCTG GCTCATAGCC ACCCAGGTCT 31080 GTGCTCCGGG GAGGCTGGAT GAGTGACGAT GGGGAGGAGG AAACGGGAGC CTGTGAGGGG 31140 GTAGGGGAGG AGACAGAGTA TGAGAGAGTC ATTTGGGCAG CAGCTGCAAG GATGAGTGGG 31200 AGAAAGCTGT GCCCAGGGCT GGAGCTCTGG GGCTGGGCAC CTGTGTCCCC AGCGTGAAGA 31260 TGAGGAAGGG TACCAGGCTT TCTTCATTCG TTTTACTAA ATAGTGTATG AGAGACAACA 31320 
GTTGTCTCTG CTCATAAAGC ACGTGGTCTG GTGGGGATGA TAACGGAAGC TTCCTCAGAA 31380 TTTTGGGGAT ATTAGATAAC GTATAAAGTG CGCTCGGCCT AGGAAGAAGT GCCAGGGAAT 31440 GGGAGCTCTT GCCATCTTCC TTAGAACAGA TTCGGGAGTC AGTGGTTTGA TTGTTGGCTC 31500 TGCCACCTGC TCCGTGACTT TAAGCAACTA TTTAAATTCT GTGCCTCAGT TTCTACACCT 31560 ATAAAAATGG GCATAACGAT TGTTGAAAAG AAAAAGGGTT CAATGTGTGC AGAGTTTAGG 31620 <RTI ID=104.8> GAAGGGCCTG GCAGATAGCA GCTGCTATGA TCAGAAGTAA CGGTAGGGTT TGGAGACTGC 3t680 TCTCTGCACG GAAGCCCTTC GCTTCTGGGC; CCTGAGCAGA CCAGTCAGAG GACAAAGGGT 31740 GAGAAGGGCC ATGGCTGCTC AGGGTAATGG GGGTTTCTAA GCATTAAATG ATCAGATCAC 31800 GATACACATT CTCAGATCCT GGGCCCTGGT AGAAGGTATA GACAAGGGTT TGTGGTAAAG 31860 GACCAAAACT GTTGTTCACT CCAGCAGGGA CTCCAAAGCC ATGTGGGGCC CTCCCTGCCA 31920 TCCTCCTCAC CTCAGGCTCA GGTAGGAGAA GGCCCAAGAC TAACCCTGCA GTGCTITCCC 31980 TCAGTGCATC CACCGGGACG TGGCAGCGCG TAACGTGCTG TTGACCAATG GTCATC; TGGC 32040 CAAGATTGGG GACTTCGGGC TGGCTAGGGA CATCATGAAT GACTCCAACT ACATTGTCAA 32100 GGGCAATGTA AGTGCTGGGA GGGCTTGGGT CAGGCTGGGG AGGGGGTGAA GAGTCGGGGC 32160 CCAAAATAAC TGGGGACTGT CATCCCAGGC CCGCCTGCCT GTGAAGTGGA TGGCCCCAGA 32220 GAGCATCTTT GACTGTGTCT ACACGGTTCA GAGCGACGTC TGGTCCTATG GCATCCTCCT 32280 CTGGGAGATC TTCTCACTTG GTGAGCCACT GGGCCCACTC CAGGCAGAGC CTGGGGCTG 32340 CTCCTCTGGT TGCCCCACTG GTGGACAAAG CTGTTTGGTG CCCAGGACAC AGCGAGGGTT 32400 GGTGAGAGTG CAGGAATGGG CAAGGGCTCT CGAAACCCAG CATCGTGGCT CCTGCGGGAC 32460 TCGGCAGACC CTCTGCCCCT GACAGGCGCT CCTTTCTGGC TCTTCCCTCG TTTGTCTCTG 32520 CTCAGTTGCT GTTACCTGTT ACCCTCCTTT GTCACTGTIT CCCTCCTTTG TCTGAAATCT 32580 ACAGACCCTT GAAGATGCAG CTCTCTACTA CTAGGCTCTA GTAGAAAGAA CTGCTATTTC 32640 CCGAGGACTA GGCACAAGGA CTTGTACTCA GTTCTTAAAT ACGCTGCTCC TATACCCTCA 32700 TAACCACCTG ACTGTCCACA CTTTAACGAT ACACAGCTGA AGCTTTGGTC TGATTCCAAA 32760 GCCTGTGCAA GAATGTTTGG TGTGATAAGG CCTGGATAGA GGCTCACACC TTCCTAAAGC 32820 CTAAGCCTGC CACACACTGG CTGGCACACA GGAAGCACCG GGTAAGAGTA GCTGCTGTTG 32880 CAGATGTTGT CAAGTGGGAC CCTTTAAACC CAGTCTAAGA TGTGTGTGGG TGTGCGGGAA 32940 TGGGGAGAAG ACAATGGGCA TGGCCTCTTA 
CCTGATCTTG GCCTTTGCAG GGCTGAATCC 33000 CTACCCTGGC ATCCTGGTGA ACAGCAAGTT CTATAAACTG GTGAAGGATG GATACCAAAT 33060 GGCCCAGCCT GCATTTGCCC CAAAGAATAT GTAAGCGAAG GGATCCCAGG GAGGGAAAAC. 33120 GACACCCCAG GCTITCGCTG GAAAGGGATG GAAGGCCGTG TGGCCCTGAT CTTTCCCTGT 33180 CCAAAATGTT CCAGGGTCAG ACTTTATCTC TCCCATAGTG GACACAACAA GCCCCTTTTG 33240 AGTTCAAGCT ATGGGGGATG TTCTCAGAGA AGCAGCTGTT CACTAGGGCT GGTCCTAACC 33300 GACCACTTTT CCTTTTTTTT TTTTTTTTTT TTTGAGACAG CATCTTGCTC TGTAGCCCGG 33360 GCTGGAGCGC AGTGATGTGT GCAATCATAG CTCACTGCAG CCTCAATCTT CAGGGCTCAA 33420 GCAATCCTTT GGCCTCAGCC TCCCAAACAG CTGGGACTAC AGGTGTGCAC CACCAAGCCC 33480 AGCTATTTTT AAAAAATTTT TTAGTAGAGA TGGGATCTCA CTATGTTGTC CAGGCTGGTC 33540 TGGAACTCCT GGCCTTATGC AATCCTCCTG CATCAACCTC CCAAAGTGTT GGGATTACAG 33600 GAATGAGCCA CTGCACCTGT CCCTAAACAG ACTTTTAAGA GATCGTTATT ACAGTTACCC 33660 TGAGGATACC AAAATGGCCT CATCTGTCAG AATGAGGGTG ATGAGAGTAC CCTTCTGCAA 33720 GGGTTACTGT GAGGATTAAA TGGTAAAGCA TGCCAAGGAC TTGGCATAGG TTTTATACTA 33780 AACTTACTTT GACTGGGTTT GGGGACCTCT GCTGGGTAGG TCTCTCTAGG GGTGTGTGTT 33840 AATGGCCCCT GGACCCTAGG GAGCTGCCCA TGGGCATCCT CTGTCCTATC TCCCAGATAC 33900 AGCATCATGC AGGCCTGCTG GGCCTTGc,AG CCCACCCACA GACCCACCTT CCAGCAGATC 33960 TGCTCCTTCC TTCAGGAGCA GGCCCAAGAG GACAGGAGAG AGCGGGTGAG TGGGGTGAGG 34020 CTTGGGGTGG GTGGCCGGTA AAGCACGTTG GGCTGGGCCT GATGGATCTG GACTGACAGT 34080 TTCTGGTCCC TCCCACCCTC AGGACTATAC CAATCTGCCG AGCAGCAGCA GAAGCGGTGG 34140 CAGCGGCAGC AGCAGCAGTG AGCTGGAGGA GGAGAGCTCT AGTGAGCACC TGACCTGCTG 34200 CGAGCAAGGG GATATCGCCC AGCCCTTGCT GCAGCCCAAC AACTATCAGT TCTGCTGAGG 34260 AGTTGACGAC AGGGAGTACC ACTCTCCCCT CCTCCAAACT TCAACTCCTC CATGGATGG 34320 GCGACACGGG GAGAACATAC AAACTCTGCC TTCGGTCATT TCACTCAACA GCTCGGCCCA 34380 GCTCTGAAAC TTGGGAAGGT GAGGGATTCA GGGGAGGTCA GAGGATCCCA CTTCCTGAGC 34440 ATGGGCCATC ACTGCCAGTC AGGGGCTGGG GGCTGAGCCC TCACCCCCCG CCTCCCCTAC 34500 TGTTCTCATG GTGTTGGCCT CGTGTTTGCT ATGCCAACTA GTAGAACCTT CTITCCTAAT 34560 CCCCTTATCT TCATGGAAAT GGACTGACTT TATGCCTATG AAGTCCCCAG GAGCTACACT 34620 GATACTGAGA 
AAACCAGGCT CTTTGGGGCT AGACAGACTG GCAGAGAGTG AGATCTCCCT 34680 CTCTGAGAGG AGCAGCAGAT GCTCACAGAC CACACTCAGC TCAGGCCCCT TGGAGCAGGA 34740 TGGCTCCTCT AAGAATCTCA CAGGACCTCT TAGTCTCTGC CCTATACGCC GCCTTCACTC 34800 CACAGCCTCA CCCCTCCCAC CCCCATACTG GTACTGCTGT AATGAGCCAA GTGGCAGCTA 34860 AAAGTTGGGG GTGTTCTGCC CAGTCCCGTC ATTCTGGGCT AGAAGGCAGG GGACCTTGGC 34920 ATGTGGCTGG CCACACCAAG CAGGAAGCAC AAACTCCCCC AAGCTGACTC ATCCTAACTA 34980 ACAGTCACGC CGTGGGATGT CTCTGTCCAC ATTAAACTAA CAGCATTAAT GCAGTCAGCC 35040 TCTGGTTCTT TGTGCCACAT GAGTACCTGC AAATTCCCTG GAACGTCTTT CTTTCCTTCC 35100
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3565 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
GCAGCCGGGC GGCCGCAGAA GCGCCCAGGC CCGCGCGCCA CCCCTCTGGC GCCACCGTGG 60 TTGAGCCCGT GACGTTTACA CTCATTCATA AAACGCTTGT TATAAAAGCA GTGGCTGCGG 120 CGCCTCGTAC TCCAACCGCA TCTGCAGCGA GCAACTGAGA AGCCAAGACT GAGCCGGCGG 180 CCGCGGCGCA GCGAACGAGC AGTGACCGTG CTCCTACCCA GCTCTGCTTC ACAGCGCCCA 240 CCTGTCTCCG CCCCTCGGCC CCTCGCCCGG CTTTGCCTAA CCGCCACGAT GATGTTCTCG 300 GGCTTCAACG CAGACTACGA GGCGTCATCC TCCCGCTGCA GCAGCGCGTC CCCGGCCGGG 360 GATAGCCTCT CTTACTACCA CTCACCCGCA GACTCCTTCT CCAGCATGGG CTCGCCTGTC 420 AACGCGCAGG TAAGGCTGGC TTCCCGTCGC CGCGGGGCCG GGGGCTTGGG GTCGCGGAGG 480 AGGAGACACC GGGCGGGACG CTCCAGTAGA TGAGTAGGGG GCTCCCTTGT GCCTGGAGGG 540 AGGCTGCCGT GGCCGGAGCG GTGCCGGCTC GGGGGCTCGG GACTTGCTCT GAGCGCACGC 600 ACGCTTGCCA TAGTAAGAAT TGGTTCCCCC TTCGGGAGGC AGGTTCGTTC TGAGCAACCT 660 CTGGTCTGCA CTCCAGGACG GATCTCTGAC ATTAGCTGGA GCAGACGTGT CCCAAGCACA 720 AACTCGCTAA CTAGAGCCTG GCTTCTTCGG GGAGGTGGCA GAAAGCGGCA ATCCCCCCTC 780 CCCCGGCAGC CTGGAGCACG GAGGAGGGAT GAGGGAGGAG GGTGCAGCGG GCGGGTGTGT 840 AAGGCAGTTT CATTGATAAA AAGCGAGTTC ATTCTGGAGA CTCCGGAGCG GCGCCTGCGT 900 CAGCGCAGAC GTCAGGGATA TTTATAACAA ACCCCCTTTC AAGCAAGTGA TGCTGAAGGG 960 ATAACGGGAA CGCAGCGGCA GGATGGAAGA GACAGGCACT GCGCTGCGGA ATGCCTGGGA 1020
GGAAAAGGGG GAGACCTTTC ATCCAGGATG AGGGACATTT AAGATGAAAT GTCCGTGGCA 1080 GGATCGTTTC TCTTCACTGC TGCATGCGGC ACTGGGAACT CGCCCCACCT GTGTCCGGAA 1140 CCTGCTCGCT CACGTCGGCT TTCCCCTTCT GTTTTGTTCT AGGACTTCTG CACGGACCTG 1200 GCCGTCTCCA GTGCCAACTT CATTCCCACG GTCACTGCCA TCTCGACCAG TCCGGACCTG 1260 CAGTGGCTGG TGCAGCCCGC CCTCGTCTCC TCTGTGGCCC CATCGCAGAC CAGAGCCCCT 1320 CACCCTTTCG GAGTCCCCGC CCCCTCCGCT GGGGCTTACT CCAGGGCTGG CGTTGTGAAG 1380 ACCATGACAG GAGGCCGAGC GCAGAGCATT GGCAGGAGGG GCAAGGTGGA ACAGGTGAGG 1440 AACTCTAGCG TACTCTTCCT GGGAATGTGG GGGCTGGGTG GGAAGCAGCC CCGGAGATGC 1500 AGGAGCCCAG TACAGAGGAT GAAGCCACTG ATGGGGCTGG CTGCACATCC GTAACTGGGA 1560 GCCCTGGCTC CAAGCCC TT CCATCCCAAC TCAGACTCTG AGTCTCACCC TAAGAAGTAC 1620 TCTCATAGTT TCTTCCCTAA GTTTCTTACC GCATGCTTTC AGACTGGGCT CTTuTTTGTT 1680 CTCTrGCTGA GGATCTTATT TTAAATGCAA GTCACACCTA TTCTGCAACT GCAGGTCAGA 1740 AATGGTTTCA CAGTGGGGTG CCAGGAAGCA GGGAAGCTGC AGGAGCCAGT TCTACTGGGG 1800 TGGGTGAATG GAGGTGATGG CAGACACTTT TACTGAATGT CGGTCTTTTT TTGTGATTAT 1860 TCTAGTTATC TCCAGAAGAA GAAGAGAAAA GGAGAATCCG AAGGGAAAGG AATAAGATGG 1920 CTGCAGCCAA ATGCCGCAAC CGGAGGAGG AGCTGACTGA TACACTCCAA GCGGTAGGTA 1980 CTCTGTGGGT TGCTCCTTTT TAAAACTTAA GGGAAAGTTG GAGATTGAGC ATAAGGGCCC 2040 TTGAGTAAGA CTGTGTCTTA TGCTTTCCTT TATCCCTCTG TATACAGGAG ACAGACCAAC 2100 TAGAAGATGA GAAGTCTGCT TTGCAGACCG AGATTGCCAA CCTGCTGAAC, GAGAAGGAAA 2160 AACTAGAGTT CATCCTGGCA GCTCACCGAC CTGCCTGCAA GATCCCTGAT GACCTGGGCT 2220 TCCCAGAAGA GATGTCTGTG GCTTCCCTTG ATCTGACTGG GGGCCTGCCA GAGGTTGCCA 2280 CCCCGGAGTC TGAGGAGGCC TTCACCCTGC CTCTCCTCAA TGACCCTGAG CCCAAGCCCT 2340 CAGTGGAACC TGTCAAGAGC ATCAGCAGCA TGGAGCTGAA GACCGAGCCC TTTGATGACT 2400 TCCTGTTCCC AGCATCATCC AGGCCCAGTG GCTCTGAGAC AGCCCGCTCC GTGCCAGACA 2460 TGGACCTATC TGGGTCCTTC TATGCAGCAG ACTGGGAGCC TCTGCACAGT GGCTCCCTGG 2520 GGATGGGGCC CATGGCCACA GAGCTGGAGC CCCTGTGCAC TCCGGTGGTC ACCTGTACTC 2580 CCAGCTGCAC TGCTTACACG TCTTCCTTCG TCTTCACCTA CCCCGAGGCT GACTCCTTCC 2640 CCAGCTGTGC AGCTGCCCAC CGCAAGGGCA GCAGCAGCAA TGAGCCTTCC TCTGACTCGC 2700 TCAGCTCACC 
CACGCTGCTG GCCCTGTGAG GGGGCAGGGA AGGGGAGGCA GCCGGCACCC 2760 ACAAGTGCCA CTGCCCGAGC TGGTGCATTA CAGAGAGGAG AAACACATCT TCCCTAGAGG 2820 GTTCCTGTAG ACCTAGGGAG GACCTTATCT GTGCGTGAAA CACACCAGGC TGTGGGCCTC 2880 AAGGACTTGA AAGCATCCAT GTGTGGACTC AAGTCCTTAC CTCTTCCGGA GATGTAGCAA 2940 AACGCATGGA GTGTGTATTG TTCCCAGTGA CACTTCAGAG AGCTGGTAGT TAGTAGCATG 3000 TTGAGCCAGG CCTGGGTCTG TGTCTCTTTT CTCTTTCTCC TTAGTCTTCT CATAGCATTA 3060 ACTAATCTAT TGGGTTCATT ATTGGAATTA ACCTGGTGCT GGATATTTTC AAATTGTATC 3120 TAGTGCAGCT GATTTTAACA ATAACTACTG TGTTCCTGGC AATAGTGTGT TCTGATTAGA 3180 AATGACCAAT ATTATACTAA GAAAAGATAC GACTTTATTT TCTGGTAGAT AGAAATAAAT 3240 AGCTATATCC ATGTACTGTA GTTTTTCTTC AACATCAATG TTCATTGTAA TGTTACTGAT 3300 CATGCATTGT TGAGGTGGTC TGAATGTTCT GACATTAACA GTTTTCCATG AAAACGTTTT 3360 ATTGTGTTTT TAATTTATTT ATTAAGATGG ATTCTCAGAT ATTTATATTT TTATTTTATT 3420 TTTTTCTACC TTGAGGTCTT TTGACATGTG GAAAGTGAAT TTGAATGAAA AATTTAAGCA 3480 TTGTTTGCTT ATTGTTCCAA GACATTGTCA ATAAAAGCAT TTAAGTTGAA TGCGACCAAC 3540 CTTGTGCTCT TTTCATTCTG GAAGT 3565
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 695 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
GGTAACCTCA GCCCTCGGGC GCCTCCCTTT AGCCTTTCTG CCGACCCAGC AGCTTCTAAT 60 TTGGGTGCGT GGTTGAGAGC GCTCAGCTGT CAGCCCTGCC TITGAGGGCT GGGTCCCTTT 120 TCCCATCACT GGGTCATTAA GAGCAAGTGG GGGCGAGGCG ACAGCCCTCC CGCACGCTGG 180 GTTGCAGCTG CACAGGTAGG CACGCTGCAG TCCTTGCTGC CTGGCGTTGG GGCCCAGGGA 240 CCGCTGTGGG TTTGCCCTTC AGATGGCCCT GCCAGCAGCT GCCCTGTGGG GCCTGGGGCT 300 GGGCCTGGGC CTGGCTGAGC AGGGCCCTCC TTGGCAGGTG GGGCAGGAGA CCCTGTAGGA 360 GGACCCCGGG CCGCAGGCCC TTGAGGAGCG ATGACGGAAT ATAAGCTGGT GGTGGTGGGC 420 GCCGACGGTG TGGGCAAGAG TGCGCTGACC ATCCAGCTGA TCCAGAACCA TTTTGTGGAC 480 GAATACGACC CCACTATAGA GGTGAGCCTG GCGCCGCCGT CCAGGTGCCA GCAGCTGCTG 540 CGGGCGAGCC CAGGACACAG CCAGGATAGG GCTGGCTGCA GCCCCTGGTC CCCTGCATGG 600 TGCTGTGGCC
CTGTCTCCTG CTTCCTCTAG AGGAGGGGAG TCCCTCGTCT CAGCACCCCA 660 GGAGAGGAGG GGGCATGAGG GGCATGAGAG GTACC 695
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4522 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CAGCTGAGTG AGGCGGGCGC GCGTGGGAGG GTGTCCCAAG GGGAGGGGTC CGCGGCCAGT 60 GCAGGCCCGG AGGCGGGGGC CACCGGGCAG GGGGCGGGGG TGAGCCCCGA CGGCCAACCC 120 GTCAGCTCTC GGCTCAGACG GGCGGGAACC ACAGCCCCGC TCGCTGCCCA TTGTCTGCGC 180 CCCTAACCGG TGCGCCCTGG TGCCACAGTG CGGCCCGGAG GGGCAGCCTC CTCCCGTCAC 240 TTCAGCCAGC GCCGCAACTA TAAGAGGCGG TGCCC;CCCC; C CGTGGCCGCC TCAGCCCACC 300 AGCCGGGACC GCGAGCCATG CTGTCCGCCG CCCGCCCCCA GGGTTGTTAA AGCCAGACTG 360 CGAACTCTCG CCACTGCCGC CACCGCCGCG TCCCGTCCCA CCGTCGCGGG CAACAACCAA 420 AGTCGCCGCA ACTGCAGCAC AGAGCGGGCA AAGCCAGGCA GGCCATGGGG CTCTGGGCGC 480 TGTTGCCTGG CTGGGTTTCT GCTACGCTGC TGCTGGCGCT GGCCGCTCTG CCCGCAGCCC 540 TGGCTGCCAA CAGCAGTGGC CGATGGTGGT AAGTGAGCTG GTG AGCTCGGGGC CAGACTTCTA CCAGGCGTTT TCCAGCCGTG CACCCTGGAA ACGAAGCTTA 1740 ACTTTTCTGA GCTACTGCCC CAGATAAAGA AAGTTTCGGG TCGCGGACGC CGGCTGACCG 1800 CCGCTTTCCC CCAGCCTCTC TCAAAAGCGC CTGGGAAGCT GCTCTCTGCA GGCGTGTGTC 1860 TGGCCTCTCG CCCAGCAAGG CTTGCACCGC CAAAATGGGC CGAAAGTTTT GGGCTGCGAA 1920 GAAGTCTTGG GGATGTATGG TTCTTCCGCT CCCCTCTCTT CGGTTTGTCT CTCTGGGGCT 1980 GCTCCACTTC CGCTATCGAG CCAAAATGCG CCCTAGAATC TCCCAGTAAG GTGTGATTAC 2040 GCCCGTGGAC GTGGCTGCGT GCCCACGCAC CTGCTTTCTC TACTAGCCCT AGAC.ACCAGC 2100 TTTCCAGCAC TGCCGGCCCT GGTCCTCAGG ACTCAAAGTG CGGAGTCGGG GGTGGGATTC 2160 CGGTCCCAAG CCCTTCATGA GGGTGCTGGC CGCGCCCCGC GTACCCCCTC GCTGATCCCC 2220 GCTCCCTTCT CCCACAGGCT GTCGAGAAAC GGCGTTTATC TTCGCTATCA CCTCCGCCGG 2280 GGTCACCCAT TCGGTGGCGC GCTCCTGCTC AGAAGGTTCC ATCGMTCCT GCACGTGTGA 2340 CTACCGGCGG CGCGGCCCCG GGGGCCCCGA CTGGCACTGG GGGGGCTGCA GCGACAACAT 2400 TGACTTCGGC CGCCTCTTCG GCCGGGAGTT CGTGGACTCC GGGGAGAAGG GGCGGGACCT 2460 GCGCTTCCTC ATGAACCTTC ACAACAACGA GGCAGGCCGT ACGGTGAGCT
TTGAGAGGCT 2520 CCGCACCCTA AGCGGAGCGG CAGGGGCCAA CCTCGGGCTG GGGAAGTGAC GGTCGGTGAG 2580 ATAAGGCAAG GGGCACCAGG AGAGGGCGTC CTGGGAGAC;C CGGAGGCTTG GAACGAAGAC 2640 GGAGAATAGA GGAGACAGTG GCTGAGGGCA AAGGTATGTC TGGCCCGCGG ACAGGTAGAA 2700 GAGGTTGCAA ATCAAGCACA GTCTCTTCGC TGTACAGATT CGAAAAATAA GCCTGAGAGG 2760 CCGAGACTGA CTCGCCGCGG CGGAGCAGGG TTGGGCAGGG TTTCCAAATC TCAGCGGAAC 2820 ATTTCGCGCC TCCCTTCCCC TGGGCTCAGC TAGGCCTGGG CCTTTGCTGA GGTCCGGCCC 2880 CCGTGGCGTC CGGGAGAGGG CAGTGTCTGG GAGGGTGACT CTGGCCCGGT GCCCTGGGAC 2940 ACTCTTTCTT CCCCTATCCC CGCAGACCGT ATTCTCCGAG ATGCGCCAGG AGTGCAAGTG 3000 CCACGGGATG TCCGGCTCAT GCACGGTGCG CACGTGCTGG ATGCGGCTGC CCACGCTGCG 3060 CGCCGTGGGC GATGTGCTGC GCGACCGCTT CGACGGCGCC TCGCGCGTCC TGTACGGCAA 3120 CCGCGGCAGC AACCGCGCTT CGCGAGCGGA GCTGCTGCGC CTGGAGCCGG AAGACCCGGC 3180 CCACAAACCG CCCTCCCCCC ACGACCTCGT CTACTTCGAG AAATCGCCCA ACTTCTGCAC 3240 GTACAGCGGA CGCCTGGGCA CAGCAGGCAC GGCAGGGCGC GCCTGTAACA GCTCGTCGCC 3300 CGCGCTGGAC GGCTGCGAGC TGCTCTGCTG CGGCAGGGGC CACCGCACGC GCACGCAGCG 3360 CGTCACCGAG CGCTGCAACT GCACCTTCCA CTGGTGCTGC CACGTCAGCT GCCGCAACTG 3420 CACGCACACG CGCGTACTGC ACGAGTGTCT GTGAGGCGCT GCGCGGACTC GCCCCCAGGA 3480 ACGCTCTCCT CGAGCCCTCC CCCAAACAGA CTCGCTAGCA CTCAAGACCC GGTTATTCGC 3540 CCACCCGAGT ACCTCCAGTC ACACTCCCCG CGGTTCATAC GCATCCCATC TCTCCCACTT 3600 CCTCCTACCT GGGGACTCCT CAAACCACTT GCCTGGGGCG GCATGAACCC TCTTGCCATC 3660 CTGATGGACC TGCCCCGGAC CTAACCTCCC TCCCTCTCCG CGGGAGACCC CTTGTTGCAC 3720 TGCCCCCTGC TTGGCCAGGA GGTGAGAGAA GGATGGGTCC CCTCCGCCAT GGGGTCGGCT 3780 CCTGATGGTG TCATTCTGCC TGCTCCATCG CGCCAGCGAC CTCTCTGCCT CTCTTCTTCC 3840 CCTTTGTCCT GCGTTTTCTC CGGGTCCTCC TAAGTCCCTT CCTATTCTCC TGCCATGGGT 3900 GCAGACCCTG AACCCACACC TGGGCATCAG GGCCTTTCTC CTCCCCACCT GTAGCTGAAG 3960 CAGGAGGTTA CAGGGCAAAA GGGCAGCTGT GATGATGTGG GAATGAGGTT GGGGGAACCA 4020 GCAGAAATGC CCCCATTCTC CCAGTCTCTG TCGTGGAGCC ATTGMCAGC TGTGAGCCAT 4080 GCCTCCCTGG GCCACCTCCT ACCCCTTCCT GTCCTGCCTC CTCATCAGTG TGTAAATAAT 4140 TTGCACTGAA ACGTGGATAC AGAGCCACGA GTTTGGATGT TGTAAATAAA ACTATTTATr 
4200 GTGCTGGGTC CCAGCCTGGT TTGCAAAGAC CACCTCCAAC CCAACCCAAT CCCTCTCCAC 4260 TCTTCTCTCC TTTCTCCCTG CAGCCTTITC TGGTCCCTCT TCTCTCCTCA GTTTCTCAAA 4320 GATGCGTTTG CCTCCTGGAA TCAGTATTTC CTTCCACTGT AGCTATTAGC GGCTCCTCGC 4380 CCCCACCAGT GTAGCATCTT CCTCTGCAGA ATAAAATCTC TATTTTTATC GATGACTTGG 4440 TGGCTTTTCC TTGAATCCAG AACACAACCT TGTTTGTGGT GTCCCCTATC CTCCCCTTTT 4500 ACCACTCCCA GCTTGGAAGC TT 4522
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11558 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GGATCCTCCG AGGCCTGCCA GACGAGAGCT AACCCCACCA CTCCGGGTGC CCAGTTGGTG 60 TGCGCAAAAG GAGATAGCAT CTCCAGACCC GGCCTGCCCC GCGCCTTGCA AGGCTGAGGA 120 CAGCGCCACT GCTCCTGCAG GAAGCGCCGG GCGCAGACAC AAACCCGGAG CTCCCCACGC 180 GTGCCCGCGC CCCGGAGCCT CCCCGCCGCC GCCCTCCGCG GTCCCCTTCA TTATGCGCCG 240 CCTTTAATGG GCGTTTGTCA GCTCGACTTC CCCGCAAGTT GTTTTCACGG ACATCAGTCA 300 TCGGCGGCGG CCCCATTGTG CAGGGGGATT GATGGGGAGC GGGAGGGGGT GACAGGGCCT 360 GGGGCGGCCT CCTCAAGGCC TCGGTCTATA ATTTTCGGAG GCATAATTGG TCTGGGGGAG 420 GGGGCGGGGG AGGGGCGGGT AGGGGACCTT TCAGAGCCAG GAGGGCTTTC GGGGGCGTGG 480 GGCGCGCTGC GGAGCGGAGC CGCGGCTCGA CGGCGGTGCG CTGGCGGCGA GTGTATGCAG 540 ACGGCGCCCG GCCCGAACCC CGAGCCCCGC GGGGCTCCCC ACCCGCCGGC CTCCCGCCCC 600 TCCCGCGCCT CCGCCTGGGG ACCACGTCGG CCTTTTGTTG GCGAACCGTC CTTTCTTTCA 660 GCGCTTTGCG CAGCAACGGA AATTTCATTG CTCCTGGGTG GAAATTAAAG GGACTCGCGT 720 TCCCTCTCTC CCTCTCCCTC TCCCACTCTC CCTCTCTTTC TCTCTCTCGC CCACCCTTCC 780 CCCTTCTTCC CCCACCTTTC CCGCGAAGCC GGAGTCAGCA TCTCCAGGCG CGGGATCCCG 840 CTCCGAGCAC CTCGCAGCTG TCCGGCTGCC GCCCCTTCCh TGGGCGCCGC GCTCGCCTGC 900 AGCCGCCGCC GCCGCGGGGC GGGCGCGATG CCACGATGGG CCTAATCTr,G CTGCTACTGC 960 TCAGCCTGCT GGAGCCCGGC TGGCCCGCAG CGGGCCCTGG GGCGCGGTTG CGGCGCGATG 1020 CGGGCGGCCG TGGCGGCGTC TACGAGCACC TTGGCGGGGC GCCCCGGCGC CGCAAGCTCT 1080 ACTGCGCCAC GAAGTACCAC CTCCAGCTGC ACCCGAC; CCG CCGCGTCAAC GGCAGCCTGG 1140 AGAACAGCGC CTACAGTGAG TGCCGGACGC
TGCGGGGCCC CGGGGGAAGC GGCGCCGGGA 1200 GGGGTCGGGC CCGGGAGAAG GCGCGCTGCG GGGCCCCGCG GGGGAGGCGG CGCCGGGGAG 1260 GGGTCTGGCC CGGGAGAAGG CATGGTGCGC CCGGGGTGTC CGGAAAAGA CCGTCTGCCT 1320 CCCGCTGCCA GAAGGGGAAT GCCAGGGTGC CCTCCTCAAC CTACACGTCC GGGAGAAGAG 1380 CGCGCGCTGG GGTTCGAGCA GAACTCGGGG GCATGGGCGT TCACGTCCGA AGAGGGCGCC 1440 GGCGGCTGTC AGAGTCCGTC CCCGGTCCGG CCCGCTGC; TC TGAGGGGACG CGAGCTAGGC 1500 GACCCCGGGG CTCCAGGCTC TGCTGCTTTG GGCAGCTCTG ACAAGTTTAA GCCCTTTGAA 1560 TGTGTCGGAG AAAAGGAGCC GGACAAGGCA TTTATCATCT TCTTTATTAT TTTGACGACT 1620 TCTCCTCCTC CTGCCGACCC CTGGACGCCC CGCCGTCCCC TCCGCCCTCC CCGCTCGCGG 1680 CTGCCCAGGT GGCCGCCCCC GCTGCTGCCT CTGCGCGGAG GACTCTTGCC CTGCGGAGCT 1740 CGGTTCCCTG GCCGCGGCCG CCACCGACAG TTTTCCCCGC GCTGGAATCT GCACCTCCCC 1800 CGCCTCCGCC CTGGCCGCAC AGCGACAGGA GGGTGGAGGC CCTCGCCGTC GGGACGTGCG 1860 GATCACGAGC GCGGAACGGT GTCCGCCCGG GCCTGGGGGT GCAGACACAC ACACTCCGGC 1920 CCCCGCACGC AGGACCCGCG GCCCGGGCTG CGCTCACCGC GGGAGTCTGC CGGACTACAC 1980 GGTTGGGCTC CTCTGGTCAC AGGGACCTGA GGCCCGCGCC GAGCCCTTTG AGGAGCAGGA 2040 TAGCGGAGCT CAGGGCCCGG GAGGACCGCG TGGGGAGCGT GGGAGGGCAG TCAAGACCAC 2100 AGCTGTCCTC GGGTGCTCCG CGCGGCGCTC GGCTCCGGCC CCGGGCGCAG GAAGCGGTTC 2160 CGCCGCTTGA AGGTGGCGGC GGCGGCCTCA GCAAACCGCG GCTTCCTCCA GGAAAGCCGC 2220 AGCCCTGAGA GGGCGTCCTG GGGACATGCG CCTCCGGAGC CGCACGGTGG GCACCAGCTG 2280 TCACCAGGGG GTCCGAGTGC GCGGAATTCG TCTCACTAAG ACACTCCGGT TCTCTCCAAA 2340 GCCAGGCTCC CCCTCGGAGT CTCACAGCAT CCAAACTTCT TGGTGTTGGC TGCTCACGGG 2400 GAGGGGAGGG CGCGCGCCCG CAGCCGCCCC TGTCCTGCGT CGAGACTCGT GCTTCGCTGG 2460 TCCCCGGTCA GGCACCGCCG ATGCCGCCCA GCCTGCGCAC TGGGAAGGCG GAGGCTCGC 2520 AGCCTGCACC ACAGCACCCC TGGGCTGGAG CAAAAGCCCG GTGGTGACCG CGTCTGTGCT 2580 CGGACCGCGT GCCAGGAGGG CGCTCCTGAG AGGCATGCCT TGGAGAGGGT GCAGGCAGTC 2640 GCCCCAATCC CCTCGCAGCC TCTATTGGGA GACAATGACC CCAACCCGTC CTTTGATGTA 2700 GTCCCCCGCC CCCAGCCCCC ACCACAGCAC TCGTGTATCC AGAAGGAAAG GCGGGAGGGG 2760 AGATTTAAAC TTTCTTATCC CTGGGGAGTG GGTGAGACCC GTCCAGCTTC CCTCTGCGGG 2820 CTCCAGGGTC TAAACTGGCT TCCTGCCCTT CCAGGTCCCC 
CGGCGAGTAC AATGATCTCC 2680 CGCCAGGTCT ATTGCCAGCA TGATTTGGGT GGCAGGCACC CTGGCTGTGT TATTGCCAGT 2940 GGTTTATATT ATCAGCCTGG CTCATCAGGG CCCAGCTTTC CCGCTGGCAG GCACAGGCAT 3000 TGGGGACCGT GCAGTCCCAG AGCTCAGACT CCCATGGCTG CCGCCAGGGC TTGATTTCCC 3060 GCTCAAGCAC ATGTGCAGGG GCCTGACAGT CACCCAGCCA TAGTGGTCCC CACTTGCTTG 3120 TCTGTCGGAT GATGATGGCA GCTGGCAAGG TGTTGTGAGG CTCGAATGAA AGCCAGGGTG 3180 CAGGCAGCGT CAGCAGGTGG GAATGTTAAT TTCATTGTTA CCCTCCCAAG TTTGAATGAA 3240 GAGAAGCCTT CCCTGATGTT CCCTTCTCTA CCTCCTGTCC GGTGACTTCT CCCTTCCTGA 3300 ACTTCCACTC ACTCCCCGAC CTGTGGGGTG TGTGCTACCC TCCCCAGCCC TGAGCCCGGG 3360 GTGTGGCTTT GGGCAGGGCC AGCTGGGCCT CAGCCGCCCC CTCCGTGGGG GCGGCGGGAG 3420 TGAGGCACCT CTCATTTCTT CTAGGTATTT TGGAGATAAC GGCAGTGGAG GTGGGCATTG 3480 TGGCCATCAG GGGTCTCTTC TCCGGGCGGT ACCTGGCCAT GAACAAGAGG GGACGACTCT 3540 ATGCTTCGGT GAGTCCAGGC TGTCACGTGG GTGGGCGCTG ACGGAGTAGC GGTCTGGCCT 3600 GCACATCAAG CCAGGGGACG GGGGATGTGG GCAGTAGAAT GCTTTGCCAA GGGTACATGG 3660 GATCCAAGCT AGGACCAGAC CCTGGCCCAG GGCTCCACGC CCAGTGATCT TTGGCTGTGT 3720 CTGTTAGAGG CTGCTCCGCG AGTGACCTGG AGCCTGGCCC TGGGGGACCT GGGATCTGGA 3780 GAATGCCAGG TTAGCAAAGG TTTCCCCCAT CCTTGTCATC ACTACCACCC TCCCTTGAGA 3840 GTAGCTGAGA GCAGGGAAGT GATTGTTCTC CAAATTTCCA CCAAAATAAA TGGACCCAAG 3900 ACAACCTGGG AGTGTTCCTT GGGGCAGCCA GAGCTCTCCT GGGAGCATGG AGAGGAGGGC 3960 GCTGGCCCAG CTCCACCGGC TCTCTCCAAG GGTGGAAAGG CGAGAGCTGA AGCCTAGGAG 4020 GGGAAGGGGG ATGGCGCCCA TGTCCCGGGC ACACAGGAGC CTTGGAGCCC TGGCTTGGGA 4080 CCCAGTCTAT CTACTGCCTG GGCCTCCGTG TAACCAAAAA GGGTCTGTGGG 4140 TATGGCAGCG CCCCTCTTCA TAAGGGGTAG GGGTGGGGAA TACACAGAGT AAAATACATG 4200 TCACAGGGAA ATTTGCTACC TCCAACTAGT CATTACATGC AATTTGGCTG ATACTTCCTT 4260 GGGCAATGAG AGGTITTCCA TCCATGAGTA GCCTGGCTGA CGCGGCCCAA GGACAATCTC 4320 CCTGCAGTGA GCTCTCTGCT CAGTCCTGCT CACAGAGGAC ACATCCCGCA GCTCCCTCTC 4380 GCAGAAGCTG ATGATTTCAT CACAGATTTT TAGCCGTTTT GCTAAAGGAA GGTCCAGAAA 4440 GCCGGGATGC GCCCCTTCAT TTTCTCTGGT CCAGAGGCTA CTCCCTCCTT CCTCCCATCC 4500 ACTCACCCAT CCACCCATCC ACCCATTCAC CCATCChCTC ATCCACCCAT CCACCCATCC 
4560 ATCCATCCAC TCATCCACCC ATCCACCCAT CCATCCATCC ACCTATCCAT CCATCCATCC 4620 ACCCATCCAC CCATCCTTCC ATTTACCCAT CCACCATCCA CCTATCCATC CATCCATCCA 4680 CCCATCCACC CATCCACCCA TTTACCCATC CACCCATCCA CCCATCCACC TATCCATCCA 4740 TCCATCCATC CACCTATCCA CCCATCCACT CATCCACTCA CCCATCCACC TATCCACCCA 4900 CCCACTCACC CATCCATCCA CCCACCCACT CACCCATCCA TCCATACACC TATCCATCTG 4860 ACATTCAGTC CATTCATCAG TCAGTGTCCT GGAAGCTGTG TCTGGGAGCA GGGGCCCTCA 4920 TGGTGCCAGC TTCTGCCATA GGGGAGCCAG GATTTGGAGA GACAGAATAA AGCATGACAT 4980 GAGGGGGTCC CAGTGGTTCA GAGCCCACCT GGGGCTTCCG TGCCTGAGAA CTCACAGCCT 5040 GGCTTTGAGA AGGGTGGACA GAGGCTCCTA GCCCCACCAG GC.ATGCTTAG CCAAGCAGTC 5100 TGGCTGGAGG GTGTGGAACT TTCCAGAGTG CCCAGTG(; AG AAGTTCCCAT TGCCCTGCAG 5160 ACAACCTGGG AGGCATAGTA GCCTGGATGC CAACTCCGTC ACCTGCCACC TGCCACCTGA 5220 TACCACAGTC CTGCGTGTGG CCAGCTTGCC TAGGAGTGCT GTCCCACCAC GGGGCTCCCT 5280 GGACTTTGCC GGCAGCATGC TTGCCCTCAG GCCAGCTCAT CTCACTGTGG GCCAGTGAGC 5340 ACTGCACCCT GACACTCCTG GTCTGGCCCT GACCTCCCTT CTGAGGTATA GACCACCCTC 5400 CCCCAACATC CTGCTGACTC AGACACTCCA ACCGAACTCA GCAGCTCCCC GACTCCCCAT 5460 CTTAATCATT GCCCCACCAG CCACTCACTC TCACCCAGAG TGGACTGGGA GTCATGGTAG 5520 ACCCCTCCTG CTATTCAGTC CATCCCTAGT CCTCAGACCA CGCCCCTGCA TCTCTTCAGC 5580 CCACACACCT CACCTCCCTC TGCCTCATCC CCGAGAGCAC GCCTTAAAAG GCAGGCTCAT 560 CACTTCCCAC CTGGATCACT GCTCCAGCCT CTCCTACCTC CAGTCCATCT GTTTGTCAAG 5700 CCATCCATCC ATClTCCTTC CTTCTTTCCC <RTI ID=118.4> CTTCCZTTAT TCTCTTCACC CATCTGTCCA 5760 TACACCTCTC CTTCCTTCCT TCTITCCTTT CTCTCATCCT TCTTTCCATC CATCCTTCCA 5820 ACCGTTCATC TGTCCATCTG CCCATTCACA TGTCCATCCG TCCATGCTCC CTCCTTCCTT 5880 CCATCCTTTC ACCCATCCAC CCATCCACCC ATCCACCCAT CCACCAATCC ATCCAGCAAT 5940 CCACCCATCC ATCCAGCCAT CCACCCATCC ATCCGTCCAC CTATCCATCC ATCCACCCAT 6000 CCATCCAACC AGCCATCCAC CCATCCATCC ATCCATCCAT CCATCCATCC ATCCACCCAC 6060 CCACCCATCC ACCCATCCAT CCATCCAGCC AACCAGCCAT CCACCCATCC ACCCATCCAC 6120 CCATCCATCC ATCCATCCAT CCATCCATCC ATCCATCCAT CCACCCATCC ACCCATCCAT 6180 CCATCCAGCC ATCCACCCAT CCATCCATCC ACCTATCCAT CCATCCACCT 
ATCCATCCAT 6240 CCACCCATCC ATCCATCCAG CCATCCACCC ATCCATCCAT CCACCCATCC ACCCATCCAT 6300 CCATCCATCC ATCAATCCAT CCATCCATCC ACCCATCCAT CCATCCATCC ACCCACCCAT 6360 CTGACATTCA GTCCTTCGTT CACCAGTCAG TGTCCTGGGC ACTTACCATG <RTI ID=118.7> TGTTGCACTC 6420 TGCTAGTGCA TGTTAAAGGG GCTTCCTAAA AAAGCAAATC AGCCCAGTTC ATGGCTTCTG 6480 AGGACTCCCA TGATGCGCAG GCATCCCCCA GGCACCTTAG CTCCACCCAC AGAGCAGCAG 6540 CCCAGGTGCC ATCTGCTCCC AGGGGGGCCG TCATCCAGAG GTTGAGCCTC CTCCAGCTTC 6600 CGGAAGCCCC CTCAACTCTC CAAGCCATCT GTACTGCTCA GGCCACACCT TCACTGCCTG 6660 GAACGCCCTC CTCCAGAAAA TGCCCATGAG AAAGTGCCCG CCCCGCCCTT ACAGCTCCCT 6720 TTAGAAGTCA CCTCCTCAAC AAAGCAGCTT CGGAGTCGGC TTCCTCCCCA CCCTTCATGG 6780 GGTGGAAGCG GCCCTGGGGG AGGGGCCTGT GGAACCAGGT CTGGTGGGGC GGCAGTGTCA 6840 GGACACACAT TGGAAAGTGT TGTCACAGCG AGTCCCAACT GCAGTAGCTC TGGAGTCTAT 6900 GGCCCCGGCC CCTGAGCCAA CACCTTCTGC CCTGGTCATC TTGGCCACCC CAGGTGCCCT 6960 ACCACGTGTC AGCAATTGAT CTACACTGCC CCCAATCTCC CACCTCGGGA GAGCCTGAGC 7020 CCCCTGCCAC CTGAGGCTCA CAGGCACCTG TCCTTCAGTG ACCCACCCCT CAAGTGGCCC 7080 CCCAAAGAGA AAGCCTATTT CCTGGCCTCT CTGGCCCTGA GAGGACTCAC CTGCTGCAGG 71.40 ATCGAGGGAC CTGGGAGAAG CTTGGGTCCT GCCCTGAGCT TCTCACAGTC TGCCATGGGA 7200 GCCAGACACT AACATGTTGT ACCCAGTGTG TGGGGTGGGG AAGTCGGCCC TGAGCCCACC 7260 CGCTCATTCC CAGATAGTCG CTGAGCCCCA GGCTATGTGG GGGTCAGGCA CAGATCAGAC 7320 ACTGTTCTAT CCCCTGTAAT GTAGGGCCTG GCACCACCCC TGGGCCTCAG TTTCCCCATC 7380 TATAAACATG GAGACCGGTC TTGGATAGTA ACATCCAGAC TGTGGCCCAG GGGGCACTGG 7440 GCCTCCAAGA GGGCTGGAGA GCAGGGGGGC TTGGGGTTAT GGGGAGAGGC AGCCCCTACC 7500 ACCAGGTCAA GCCTTCCACC CTTCCCTGCT CTGGGCTTTC ATGTCGCTGA AAAACAGGAT 7560 GGGGTGCTTG AAGCATCTTG TTCCCTGAGG TCCTGCGAGG TCAGATGCTG CTCGGTCCAG 7620 TCTGGGAGCC TCTGGAGGCC ATTTCCACTC TCCCTCTCTC CCACGGGACA CCACACCCCA 7680 GATGGGGACA GACCATGAGG GCCTCCGACT CCTCCTGCGG CCAGTCCCCA GGAGGAAAGA 7740 GAGATCCTAC TGTCTGCCTT GCACCTGCTG CTTCCACCTC CCACCACCCT TTCTGGTTTG 7800 GAGGGAGCAG CTCCTGTCAG TCATCCCCTG AGGGAGGTGG CCCTAGGCAT TACCACTTGC 7860 CTGTAGCTGG GACTGAAGCC TAGTGGGTCT GCATAAGGCA 
TACCCACCTC TTCCCTACGC 7920 CACAGACTAA GAAGACCCCA TGAGGCAGCC CTTGAGGGAA ACTCCCACGG CCAGGCCTTG 7980 GGGATGTTGG TGTAAGCTCC TTCAGTTCAG AGCTTCACTG TGCCTTTGAG AGGGCAGGGT 8040 CCCTTCTGCC CTTCCCCCAC TGCACCCTGG GCCTGACAAG GGTCCCATTA ATCTCTGATG 8100 ACAAAGAGGT GTCCTCTCTG TCCTGCTTGT GGTGGGACAC CCCAGCTTCT GCTCTCATCC 8160 TAAGAACAGC AGTATCTGTG GCACTGATTG ATCACGTGCC TCATGCCAGG CCCCAGCCCC 8220 TGCTCTGTGT TGTAACCTCT TCAACCTGCA AGGCGAGATC CTCCAGCTTA AGAGAGTGTT 8280 GCTCAGGGGT CAATGACCCC AGCTGGGGCC CCTGAGCGCA GCGTTTAATG GAGAGTCTGG 8340 GCTCATCAAA CCTGGCTGTG TCCCCTTCCC TTCCTGCTTT TTGTGTTCTT TCTTCTTTTT 8400 ACTTTTCTGC AATTTCTTTT TTTTTCTTTT CTTTTTTTTT TTTTTTGAGT CAGAGTTTTG 8460 CTCTTGTCGC CCAGACTGGA GTGCAGTGGC ACAATCTCTG CTCACTGCAA CCTCTGCCTC 8520 CCGGGTTCAA GTGATTCTCC TGCCTAAGCC TCCCAAGTAG CTGGGATTGT GGGCGCACGC 8580 CACCACACCC GGCTAATTTT TTTGTATTTT TAGTAGAGAC GGGGTTTTGC CATGTTGCCC 8640 AGGCTGGTCT CGAACTCCTG ACCTCAGGTG ATTCACCTGC CTCGGCCTCC CAAACTGCTG 8700 GGATTACAGG CATGAGCCAC CGTGCCCAGC CTCTGCAATT TCTTTAAAAG AGATCTGGGT 8760 GTTGTCATCT TCCTTTGTCC AAATGCCCAG TCCTTGCTGA CCTCCACACT GCCAGCAGAC 8820 TGCCAGGGCA CCTGGGTTGG CCGGCCTGGT CTCTGCTCCA CAGACAACCC TACACATCCC 8880 TTTGCTGTGT CAGCGCCCTC AGATGGGGGA CAGAGGCTGG TGGAACCCAG AGAGTAGGAG 8940 CTGGGAGTTG TCTGGCACAG CTTTGAGTGA AGGTGACCTT TAGGCAGAAG GCCAAGTTCA 9000 CCAGGCACAC AGGGGAGGAA GGACATGCTG GGCAGAGGGC AGGGCCTAGG CAAAGGTGTG 9060 ACTGGCTGAG AGTGCGCTGG GGTTTGGCAC TGGACCGAAC AGCCTCACAG GAGGGGAGGC 9120 AGGCATCAGG CAACCCTGGG CCCTGACGCT GCCGCAGTCT CCCCGGGGCA CTGACCATGA 9180 TATCTCATCC CCGCAGGAGC ACTACAGCGC CGAGTGCGAG TTTGTGGAGC GGATCCACGA 9240 GCTGGGCTAT AATACGTATG CCTCCCGGCT GTACCGGACG GTGTCTAGTA CGCCTGGGGC 9300 CCGCCGGCAG CCCAGCGCCG AGAGACTGTG GTACGTGTCT GTGAACGGCA AGGGCCGGCC 9360 CCGCAGGGGC TTCAAGACCC GCCGCACACA GAAGTCCTCC CTGTTCCTGC CCCGCGTGCT 9420 GGACCACAGG GACCACGAGA TGGTGCGGCA GGCGGCGGCA GAAGCAGAGC CCGGATAACC 9480 TGGAGCCCTC TCACGTTCAG GCTTCGAGAC TGGGCTCCCA GCTGGAGGCC AGTGCGCACT 9540 AGCTGGGCCT GGTGGCCACC GCCAGAGCTC CTGGCGACAT CTTGGCGTGG 
CAGCCTCTTG 9600 ACTCTGACTC TCCTCCTTGA GCCCTTGCCC CTGCGTCCCG CGTCTGGGTT CTCAGCTATT 9660 TCCAGAGCCA GCTCAAATCA GGGTCCAGTG GGAACTGAAG AGGGCCCAAG TCGGAGCTCG 9720 GAGGGGGCTG CCTGCAATGC AGGGCATTTG TGGGTCTGTG TGGCAGGAAG CCGGCAGGGA 9780 AGGGCCTGAG TGCCAGCCCT GGCAGACTGA GGAGCCTCCC AGGAGCAGCG GGGCAGTGTG 9840 GGGCTTTGTG TCATCACAAC ATTAAAGTAT TTTATTCTAC TCTGTCGTTT GGTAGACCGT 9900 GATGCAGGCT GAGGAGCGCT TGCCGCCITT ACTSGhACGT GCTTGCTTCC AGCACAGCAG 9960 AATCCGCGCT GGCATCAGCC TGCGTCAGCT GCTGCTTTAA GAGGAAGACG GCATTCCCAG 10020 AAATCGGGCT AAAGGTGCAT TTCAGTTCCC TGGTTTTAGA AAGTTACGTT TTTTTGGATG 10080 GTTGGAAACA AGCAAAGGCA TGTTTGTGCA TGTGTGTGCA TCGTGTGTGT GTGCGTGGAG 10140 AGAAGGATCG TGTATTTCTG AAAGCGTGAG TGTGCATGTG GGTATGTGTG ATCITGTGTT 10200 AGTGGTACCT GTGTGAGGAC ATGTATGTGT GTGTGTGTTT CTGGGTGTGT CTGAATGTGT 10260 GATGTGTGTG TQTCTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGGAGA GAGAGAGAGA 10320 GAGTTTACTT TCTTTGAAAA CTCTAAAAAG CCTCTCCCTC TGGAAGCTGT GTGCTTCTCC 10380 AGGGACCCTT TAGAGCAACT GTGTCAGGTC AGGCAGCACA GAMCTTCCT TTATCCTTAC 10440 AACCTGCTCT TGGGGCCCGT GCACCCTGTC TTTACCTAGA AGGTGAGGCT CAGAGAAGCA 10500 ATTACTCAGT GGCCGGCCCC TGCCTTGGAC TAGGTGCCTC CTCACACCTG TTCCCCAACA 10560 ATGGCATGGG TGGAATCACC TGGGCCGGCC CAGGTGAGAG CCAGCATGGG CAGTGTACTA 10620 ACCTCTCCTG GCACTTGGCA GGATGGGCAG GGTCCAGGTG AGGAGGCTCT CCTGAGCCTG 10680 GGACTGTGAG GACCATCGCT CTCTGTTCCC ATGCCCTCCC AGGGGTCAGA GAGCCCAGAC 10740 TCAGAGAGCC CAGGGTCAGA GAACTTAGGG TCAGAGAGGC CAGAGTCAGA GAGCTGAGAC 10800 TCAGAGAGCC CAGGGTCAGA GTGTCCAGGG CCAGAGAGCT TGTGGTCACA GAGCCCAGAC 10860 TCAGAGAGCC CAGACTCAGA GCCACCTGAT TGGTTAGTGC AGACTCGCCA AACCCACAGG 10920 GAGGCTGGGC TCCTCCCTGG CACGTGTGCA ACACAAGTGA AAATCTCGGT GCCTCCTTCA 10980 GCCCCCAGCG CATGTCAGAT TTCCCGGAAT C.GCTCCCCTG CAGCTGCGAA CATTCCTGGC 11040 AGTCAACAGG AGCAGCACGC AGCTGAGCTC TGCTGTGGGT TTTGTTGTTT CTCTAGAGTG 11100 AGATGGGGCA GGGGCTGCCA TCACTCCCTC CTTGCAGATG ATGACCCTGA GTCCTGGCAA 11160 GGGGAACTTG CCCGGGGCTG TGTCAACACA GGGGAAGCAG CAGTACTCAG TGCTGCAGGA 11220 TCAACAGATG GTCCCTGATG AAGGCGTAGG AGACACTGGG 
GGCTCTTGTT TAACATGTAA 11280 AACAGCTTTG ACAAGAGAAT GTGGATTTTT CGCAGCTGAT GGCTGTGCCA TGGTCACCTT 11340 CTTCCCCACA CCAGAGTCCA AGGGACTTCA TTTTGTGTGT GTGTTTGGGG GGTCATGGGC 11400 TGAATTATGT CTCCTCCCCA GAGTTCATCT GTTGAAGTCC TAACCCCTAG TAACTCAGCA 11460 TGTGACCTTA TTTGGAATAG GGTCATTACA GATGCAACTG GTGAAGATGA GGTAACATAG 11520 GAGTAGAATG ACCCCTGAAT CCATTGTGAC CAGGATCC 11558 (2) INFORMATION FOR SEQ ID NO:24: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 3622 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: CCCGGGGAGG GGACCGGGGA ACAGAGGGCC GAGAGGCGTG CGGCAGGGGG GAGGGTAGGA 60 GAAAGAAGGG CCCGACTGTA GGAGGGCAGC GGAGCATTAC CTCATCCCGT GAGCCTCCGC 120 GGGCCCAGAG AAGAATCTTC TAGGGTGGAG TCTCCATGGT GACGGGCGGG CCCGCCCCCC 180 TGAGAGCGAC GCGAGCCAAT GGGAAGGCCT TGGGGTGACA TCATGGGCTA TTTTTAGGGG 240 TTGACTGGTA GCAGATAAGT GTTGAGCTCG GGCTGGATAA GGGCTCAGAG TTGCACTGAG 300 TGTGGCTGAA GCAGCGAGGC GGGAGTGGAG GTGCGCGGAG TCAGGCAGAC AGACAGACAC 360 AGCCAGCCAG CCAGGTCGGC AGTATAGTCC GAACTGCAAA TCTTATTTTC TTTTCACCTT 420 CTCTCTAACT GCCCAGAGCT AGCGCCTGTG GCTCCCGGGC TGGTGGTTCG GGAGTGTCCA 480 GAGAGCCTTG TCTCCAGCCG GCCCCGGGAG GAGAGCCCTG CTGCCCAGGC GCTGTTGACA 540 GCGGCGGAAA GCAGCGGTAC CCCACGCGCC CGCCGGGGGA CGTCGGCGAG CGGCTGCAGC 600 AGCAAAGAAC m CCCGGCG GGGAGGACCG GAGACAAGTG GCAGAGTCCC GGAGCGAACT 660 TTTGCAAGCC TTTCCTGCGT CTTAGGCTTC TCCACGGCGG TAAAGACCAG AAGGCGGCGG 720 AGAGCCACGC AAGAGAAGAA GGACGTGCGC TCAGCTTCGC TCGCACCGGT TGTTGAACTT 780 GGGCGAGCGC GAGCCGCGGC TGCCGGGCGC CCCCTCCCCC TAGCAGCGGA GGAGGGGACA 840 AGTCGTCGGA GTCCGGGCGG CCAAGACCCG CCGCCGGCCG GCCACTGCAG GGTCCGCACT 900 GATCCGCTCC GCGGGGAGAG CCGCTGCTCT GGGAAGTGAG TTCGCCTGCG GACTCCGAGG 960 AACCGCTGCG CCCGAAGAGC GCTCAGTGAG TGACCGCGAC TTTTCAAAGC CGGGTAGCGC 1020 GCGCGAGTCG ACAAGTAAGA GTGCGGGAGG CATCTTAATT AACCCTGCGC TCCCTGGAGC 1080 GAGCTGGTGA GGAGGGCGCA GCGGGGACGA CAGCCAGCGG GTGCGTGCGC TCTTAGAGAA 1140 ACTTTCCCTG TCAAAGGCTC CGGGGGGCGC GGGTGTCCCC CGCTTGCCAG AGCCCTGTTG 
1200 CGGCCCCGAA ACTTGTGCGC GCACGCCAAA CTAACCTCAC GTGAAGTGAC GGACTGTTCT 1260 ATGACTGCAA AGATGGAAAC GACCTTCTAT GACGATGCCC TCAACGCCTC GTTCCTCCCG 1320 TCCGAGAGCG GACCTTATGG CTACAGTAAC CCCAAGATCC TGAAACAGAG CATGACCCTG 1380 AACCTGGCCG ACCCAGTGGG GAGCCTGAAG CCGCACCTCC GCGCCAAGAA CTCGGACCTC 1440 CTCACCTCGC CCGACGTGGG GCTGCTCAAG CTGGCGTCGC CCGAGCTGGA GCGCCTGATA 1500 ATCCAGTCCA GCAACGGGCA CATCACCACC ACGCCGACCC CCACCCAGTT CCTGTGCCCC 1560 AAGAACGTGA CAGATGAGCA GGAGGGGTTC GCCGAGGGCT TCGTGCGCGC CCTGGCCGAA 1620 CTGCACAGCC AGAACACGCT GCCCAGCGTC ACGTCGGCGG CGCAGCCGGT CAACGGGGCA 1680 GGCATGGTGG CTCCCGCGGT AGCCTCGGTG GCAGGGGGCA GCGGCAGCGG CGGCTTCAGC 1740 GCCAGCCTGC ACAGCGAGCC GCCGGTCTAC GCAAACCTCA GCAACTTCAA CCCAGGCGCG 1800 CTGAGCAGCG GCGGCGGGGC GCCCTCCTAC GGCGCGGCCG GCCTGGCCTT TCCCGCGCAA 1860 CCCCAGCAGC AGCAGCAGCC GCCGCACCAC CTGCCCCAGC AGATGCCCGT GCAGCACCCG 1920 CGGCTGCAGG CCCTGAAGGA GGAGCCTCAG ACAGTGCCCG AGATGCCCGG CGAGACACCG 1980 CCCCTGTCCC CCATCGACAT GGAGTCCCAG GAGCGGATCA AGGCGGAGAG GAAGCGCATG 2040 AGGAACCGCA TCGCTGCCTC CAAGTGCCGA AAAAGGAAGC TGGAGAGAAT CGCCCGGCTG 2100 GAGGAAAAAG TGAAAACCTT GAAAGCTCAG AACTCGGAGC TGGCGTCCAC GGCCAACATG 2160 CTCAGGGAAC AGGTGGCACA GCTTAAACAG AAAGTCATGA ACCACGTTAA CAGTGGGTGC 2220 CAACTCATGC TAACGCAGCA GTTGCAAACA TTTTGAAGAG AGACCGTCGG GGGCTGAGGG 2280 GCAACGAAGA AAAAAAATAA CACAGAGAGA CAGACTTGAG AACTTGACM GTTGCGACGG 2340 AGAGAAAAAA GAAGTGTCCG AGAACTAAAG CCAAGGGTAT CCAAGTTGGA CTGGGTTCGG 2400 TCTGACGGCG CCCCCAGTGT GCACGAGTGG GAAGGACTTG GTCGCGCCCT CCCTTGGCGT 2460 GGAGCCAGGG AGCGGCCGCC TGCGGGCTGC CCCGCTTTGC GGACGGGCTG TCCCCGCGCG 2520 AACGGAACGT TGGACTTTCG TTAACATTGA CCAAGAACTG CATGGACCTA ACATTCGATC 2580 TCATTCAGTA TTAAAGGGGG GAGGGGGAGG GGGTTACAAA CTGCAATAGA GACTGTAGAT 2640 TGCTTCTGTA GTACTCCTTA AGAACACAAA GCGGGGGGAG GGTTGGGGAG GGGCGGCAGG 2700 AGGGAGGTTT GTGAGAGCGA GGCTGAGCCT ACAGATGAAC TCTTTCTGGC CTGCTTTCGT 2760 TAACTGTGTA TGTACATATA TATATTTTTT AATTTGATTA AAGCTGATTA CTGTCAATAA 2820 ACAGCTTCAT GCCTTTGTAA GTTATTTCTT GTTTGTTTGT TTGGGTATCC TGCCCAGTGT 2880 
TGTTTGTAAA TAAGAGATTT GGAGCACTCT GAGTTTACCA TTTGTAATAA AGTATATAAT 2940 TTTTTTATGT TTTGTTTCTG AAAATTCCAG AAAGGATATT TAAGAAAATA C AATAAACTA 3000 TTGGAAAGTA CTCCCCTAAC CT CTTTTCTG CATC ATCTGT AGATCCTAGT CTATCTAGGT 3060 GGAGTTGAAA GAGTTAAGAA TGCTCGATAA AATCACTCTC AGTGCTTCTT ACTATTAAGC 3120 AGTAAAAACT GTTCTCTATT AGACTTAGAA ATAAATGTAC CTGATGTACC TGATGCTATG 3180 TCAGGCTTCA TACTCCACGC TCCCCCAGCG TATCTATATG GAATTGCTTA CCAAAGGCTA 3240 GTGCGATGTT TCAGGAGGCT GGAGGAAGGG GGGTTGCAGT GGAGAGGGAC AGCCCACTGA 3300 GAAGTCAAAC ATTTCAAAGT TTGGATTGCA TCAAGTGGCA TGTGCTGTGA CCATTTATAA 3360 TGTTAGAAAT TTTACAATAG GTGCTTATTC TCAAAGCAGG AATTGGTGGC AGATTTTACA 3420 AAAGATGTAT CCTTCCAATT TGGAATCTTC TCTTTGACAA TTCCTAGATA AAAAGATGGC 3480 CTTTGTCTTA TGAATATTTA TAACAGCATT CTGTCACAAT AAATGTATTC AAATACCAAT 3540 AACAGATCTT GAATTGCTTC CCTTTACTAC TTTTTGTTC CCAAGTTATA TACTGAAGTT 3600 TITATTTTA GTTGCTGAGG TT 3622 (2) INFORMATION FOR SEQ ID NO:25: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 5084 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: GATCCCATCG CAGCTACCGC GATGAGAGGC GCTCGCGGCG CCTGGGATTT TCTCTGCGTT 60 CTGCTCCTAC TGCTTCGCGT CCAGACAGGC TCTTCTCAAC CATCTGTGAG TCCAGGGGAA 120 CCGTCTCCAC CATCCATCCA TCCAGGAAAA TCAGACTTAA TAGTCCGCGT GGGCGACGAG 180 ATTAGGCTGT TATGCACTGA TCCGGGCTTT GTCAAATGGA CTTTTGAGAT CCTGGATGAA 240 ACGAATGAGA ATAAGCAGAA TGAATGGATC ACGGAAAAGG CAGAAGCCAC CAACACCGGC 300 AAATACACGT GCACCAACAA ACACGGCTTA AGCAATTCCA TTTATGTGTT TGTTAGAGAT 360 CCTGCCAAGC TTTTCCTTGT TGACCGCTCC TTGTATGGGA AAGAAGACAA CGACACGCTG 420 GTCCGCTGTC CTCTCACAGA CCCAGAAGTG ACCAATTATT CCCTCAAGGG GTGCCAGGGG 480 AAGCCTCTTC CCAAGGACTT GAGGTITATT CCTGACCCCA AGGCGGGCAT CATGATCAAA 540 AGTGTGAAAC GCGCCTACCA TCGGCTCTGT CTGCATTGTT CTGTGGACCA GGAGGGCAAG 600 TCAGTGCTGT CGGAAAAATT CATCCTGAAA GTGAGGCC G CCTTCAAAGC TGTGCCTGTT 660 GTGTCTGTGT CCAAAGCAAG CTATCTTCTT AGGGAAGGGG AAGAATTCAC AGTGACGTGC 720 ACAATAAAAG ATGTGTCTAG TTCTGTGTAC 
TCAACGTGGA AAAGAGAAAA CAGTCAGACT 780 AAACTACAGG AGAAATATAA TAGCTGGCAT CACGGTGACT TCAATTATGA ACGTCAGGCA 840 ACGTTGACTA TCAGTTCAGC GAGAGTTAAT GATTCTGGAG TGTTCATGTG TTATGCCAAT 900 AATACTTTTG GATCAGCAAA TGTCACAACA ACCTTGGAAG TAGTAGATAA AGGATTCATT 960 AATATCTTCC CCATGATAAA CACTACAGTA TTTGTAAACG ATGGAGAAAA TGTAGATTTG 1020 ATTGTTGAAT ATGAAGCATT CCCCAAACCT GAACACCAGC AGTGGATCTA TATGAACAGA 1080 ACCTTCACTG ATAAATGGGA AGATTATCCC AAGTCTGAGA ATGAAAGTAA TATCAGATAC 1140 GTAAGTGAAC TTCATCTAAC GAGATTAAAA GGCACCGAAG GAGGCACTTA CACATTCCTA 1200 GTGTCCAATT CTGACGTCAA TGCTGCCATA GCATTTAATG TTTATGTGAA TACAAAACCA 1260 GAAATCCTGA CTTACGACAG GCTCGTGAAT GGCATGCTCC AATGTGTGGC AGCAGGATTC 1320 CCAGAGCCCA CAATAGATTG GTATTTTTGT CCAGGAACTG AGCAGAGATG CTCTGCITCT 1380 GTACTGCCAG TGGATGTGCA GACACTAAAC TCATCTGGGC CACCGTTTGG AAAGCTAGTG 1440 GTTCAGAGTT CTATAGATTC TAGTGCATTC AAGCACAATG GCACGGTTGA ATGTAAGGCT 1500 TACAACGATG TGGGCAAGAC TTCTGCCTAT TTTAACTTTG CATTTAAAGG TAACAACAAA 1560 GAGCAAATCC ATCCCCACAC CCTGTTCACT CCTTTGCTGA TTGGTTTCGT AATCGTAGCT 1620 GGCATGATGT GCATTATTGT GATGATTCTG ACCTACAAAT ATTTACAGAA ACCCATGTAT 1680 GAAGTACAGT GGAAGGTTGT TGAGGAGATA AATGGAAACA ATTATGTITA CATAGACCCA 1740 ACACAACTTC CTTATGATCA CAAATGGGAG <RTI ID=126.11> TTTCCCAGAA ACAGGCTGAG TTTTGGGAAA 1800 ACCCTGGGTG CTGGAGCTTT CGGGAAGGTT GTTGAGGCAA CTGCTTATGG CTTAATTAAG 1860 TCAGATGCGG CCATGACTGT CGCTGTAAAG ATGCTCAAGC CGAGTGCCCA TTTGACAGAA 1920 CGGGAAGCCC TCATGTCTGA ACTCAAAGTC CTGAGTTACC TTGGTAATCA CATGAATATT 1980 GTGAATCTAC TTGGAGCCTG CACCATTGGA GGGCCCACCC TGGTCATTAC AGAATATTGT 2 ACCAAGGCCG ACAAAAGGAG ATCTGTGAGA ATAGGCTCAT ACATAGAAAG AGATGTGACT 2280 CCCGCCATCA TGGAGGATGA CGAGTTGGCC CTAGACTTAG AAGACTTGCT GAGCTTTTCT 2340 TACCAGGTGG CAAAGGGCAT GGCTTTCCTC GCCTCCAAGA ATTGTATTCA CAGAGAOTTG 2400 GCAGCCAGAA ATATCCTCCT TACTCATGGT CGGATCACAA AGATTTGTGA TTTTGGTCTA 2460 GCCAGAGACA TCAAGAATGA TTCTAATTAT GTGGTTAAAG GAAACGCTCG ACTACCTGTG 2520 AAGTGGATGG CACCTGAAAG CATTTTCAAC TGTGTATACA CGTTTGAAAG TGACGTCTGG 2580 TCCTATGGGA TTTTTCTTTG GGAGCTGTTC 
TCTTTAGGAA GCAGCCCCTA TCCTGGAATG 2640 CCGGTCGATT CTAAGTTCTA CAAGATGATC AAGGAAGGCT TCCGGATGCT CAGCCCTGAA 2700 CACGCACCTG CTGAAATGTA TGACATAATG AAGACTTGCT GGGATGCAGA TCCCCTAAAA 2760 AGACCAACAT TCAAGCAAAT TGTTCAGCTA ATTGAGAAGC AGATTTCAGA GAGCACCAAT 2820 CATATTTACT CCAACTTAGC AAACTGCAGC CCCAACCGAC AGAAGCCCGT GGTAGACCAT 2880 TCTGTGCGGA TCAATTCTGT CGGCAGCACC GCTTCCTCCT CCCAGCCTCT GCTTGTGCAC 2940 GACGATGTCT GAGCAGAATC AGTGTTTGGG TCACCCCTCC AGGAATGATC TCTTCTTTTG 3000 GCTTCCATGA TGGTTATTTT CrTTcTTTC AACTTGCATC CAACTCCAGG ATAGTGGGCA 3060 CCCCACTGCA ATCCTGTCTT TCTGAGCACA CTTTAGTGGC CGATGATTTT TGTCATCAGC 3120 CACCATCCTA TTGCAAAGGT TCCAACTGTA TATATTCCCA ATAGCAACGT AGCTTCTACC 3180 ATGAACAGAA AACATTCTGA TTTGGAAAAA GAGAGGGAGG TATGGACTGG GGGCCAGAGT 3240 CCTTTCCAAG GCTTCTCCAA TTCTGCCCAA AAATATGGTT GATAGTTTAC CTGAATAAAT 3300 GGTAGTAATC ACAGTTGGCC TTCAGAACCA TCCATAGTAG TATGATGATA CAAGATTAGA 3360 AGCTGAAAAC CTAAGTCCTT TATGTGGAAA ACAGAACATC ATTAGAACAA AGGACAGAGT 3420 ATGAACACCT GGGCTTAAGA AATCTAGTAT TTCATGCTGG GAATGAGACA TAGGCCATGA 3480 AAAAAATGAT CCCCAAGTGT GAACAAAAGA TGCTCTTCTG TGGACCACTG CATGAGCTTT 3540 TATACTACCG ACCTGGTTTT TAAATAGAGT TTGCTATTAG AGCATTGAAT TGGAGAGAAG 3600 GCCTCCCTAG CCAGCACTTG TATATACGCA TCTATAAATT GTCCGTGTTC ATACATTTGA 3660 GGGGAAAACA CCATAAGGTT TCGTTTCTGT ATACAACCCT GGCATTATGT CCACTGTGTA 3720 TAGAAGTAGA TTAAGAGCCA TATAAGTTTG AAGGAAACAG TTAATACCAT TTTITAAGGA 3780 AACAATATAA CCACAAAGCA CAGTTTGAAC AAAATCTCCT CTTTTAGCTG ATGAACTTAT 3840 TCTGTAGATT CTGTGGAACA AGCCTATCAG CTTCAGAATG GCATTGTACT CAATGGATTT 3900 GATGCTGTTT GACAAAGTTA CTGATTCACT GCATGGCTCC CACAGGAGTG GGAAAACACT 3960 GCCATCTTAG TTTGGATTCT TATGTAGCAG GAAATAAAGT ATAGG m AG CCTCCTTCGC 4020 AGGCATGTCC TGGACACCGG GCCAGTATCT ATATATGTGT ATGTACGTTT GTATGTGTGT 4080 AGACAAATAT TTGGAGGGGT ATTTTTGCCC TGAGTCCAAG AGGGTCCTTT AGTACCTGAA 4140 AAGTAACTTG GCTTTCATTA TTAGTACTGC TCTTGTTTCT TTTCACATAG CTGTCCTAGAG 4200 TAGCTTACCA GAAGCTTCCA TAGTGGTGCA GAGGAAGTGG AAGGCATCAG TCCCTATGTA 4260 TTTGCAGTTC ACCTGCACTT AAGGCACTCT GTTATTTAGA 
CTCATCTTAC TGTACCTGTT 4320 CCTTAGACCT TCCATAATGC TACTGTCTCA CTGAAACATT TAAATTTTAC CCTTTAGACT 4380 GTAGCCTGGA TATTATTCTT GTAGTTTACC TCTITAAAAA CAAAACAAAA CAAAACAAAA 4440 AACTCCCCTT CCTCACTGCC CAATATAAAA GGCAAATGTG TACATGGCAG AGTTTGTGTG 4500 TTGTCTTGAA AGATTCAGGT ATGTTGCCTT TATGGTTTCC CCCTTCTACA TTTCTTAGAC 4560 TACA=AGA GAACTGTGC; C CGTTATCTGG AAGTAACCAT TTGCACTGGA GTTCTATGCT 4620 CTCGCACCTT TCCAAAGTTA ACAGATTTTG GGGTTGTGTT GTCACCCAAG AGATTGTTGT 4680 TTGCCATACT TTGTCTGAAA AATTCCTTTG TGTTTCTATT GACTTCAATG ATAGTAAGAA 4740 AAGTGGTTGT TAGTTATAGA TGTCTAGGTA CTTCAGGGGC ACTTCATTGA GAGTTTTGTC 4800 TIGCCATACT TTGTCTGAAA AATTCCTTTG TGTTTCTATT GACTTCAATG ATAGTAAGAA 4860 AAGTGGTTGT TAGTTATAGA TGTCTAGGTA CTTCAGGGGC ACTTCATTGA GAGTTTTGTC 4920 AATGCTTTT GAATATTCCC AAGCCCATGA GTCCTTGAAA ATATTTTTTA TATATACAGT 4980 AACTTTATGT GTAAATACAT AAGCGGCGTA AGTTTAAAGG ATGTTGGTGT TCCACGTGTT 5040 TTATTCCTGT ATGTTGTCCA ATTGTTGACA GTTCTGAAGA ATTC 5084 (2) INFORMATION FOR SEQ ID NO:26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1388 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: GGATCCAGAA GGGTCATTCA ATCAGTTCTC AGTCTTATCA GGTCTAAGTT CCTTTCTTAT 60 CAGGTCCTAA AGGCCTAATC TTATCATTGT GACAAAGATA ACTGTAGAGT CTGTTAAACT 120 TlTTTTTTAA TAACATGAAG ATTATGATIT ATAGCTGAAT TTCTCCCTTT TATTCCAATT 180 CAACAATTTT CATGGCTTTT TGTGTTTGTT TTGTTCTGGA CATATTTACA GAAAATTACC 240 TGAAGAGTTC CAACCTGAGG CCTCCTCATG GATGGGTCAA ACGTGACATC ATTTGTTGTT 300 GAGGAACCCA CGAACATCTC AACTGGCAGG AACGCCTCAG TCGGGAATGC ACATCGGCAA 360 ATCCCCATCG TGCACTGGGT CATTATGAGC ATCTCCCCAG TGGGGTTTGT TGAGAATGGG 420 ATTCTCCTCT GGTTCCTGTG CTTCCGGATG AGAAGAAATC CCTTCACTGT CTACATCACC 480 CACCTGTCTA TCGCAGACAT CTCACTGCTC TTCTGTATTT TCATCTTGTC TATCGACTAT 540 GCTTTAGATT ATGAGCTTTC TTCTGGCCAT TACTACACAA TTGTCACATT ATCAGTGACT 600 TTTCTGTTTG GCTACAACAC GGGCCTCTAT GCTGACGG CCATTAGTGT GGAGAGGTGC 660 CTGTCAGTCC TTTACCCCAT CTGGTACCGA TGCCATCGCC 
CCAAGTACCA GTCGGCATTG 720 GTCTGTGCCC TTCTGTGGGC TCTTTCTTGC TTGGTGACCA CCATGGAGTA TGTCATGTGC 780 ATCGACAGAG AAGAAGAGAG TCACTCTCGG AATGACTGCC GAGCAGTCAT CATCTTTATA 840 GCCATCCTGA GCTTCCTGGT CTTCACGCCC CTCATGCTGG TGTCCAGCAC CATCTTGGTC 900 GTGAAGATCC GGAAGAACAC GTGGGCTTCC CATTCCTCCA AGCTTTACAT AGTCATCATG 960 GTCACCATCA TTATATTCCT CATCTTCGCT ATGCCCATGA GACTCCTTA CCTGCTGTAC 1020 TATGAGTATT GGTCGACCTT TGGGAACCTA CACCACATTT CCCTGCTCTT CTCCACAATC 1080 AACAGTAGCG CCAACCCTTT CATTTACTTC TTTGTGGGAA GCAGTAAGAA GAAGAGATTC 1140 AAGGAGTCCT TAAAAGTTGT TCTGACCAGG GCTTTCAAAG ATGAAATGCA ACCTCGGCGC 1200 CAGAAAGACA ATTTTAATAC GGTCACAGTT GAGACTGTCG TCTAAGAACT GTGAGGGAAG 1260 TTGTGGATAA AAATGGTGGA ACACAGGTCA TTTTTAGTTT GTGCTTGGAA TATGACTTAA 1320 GTATCTCCTA AATGTGATAC AGAAGAACAT CTCATCCCAT ATGCATGAGA TACTAATTAA 1380 TGATGAAA 1388 (2) INFORMATION FOR SEQ ID NO:27: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 4626 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: GAATTCCGCC CTCGCCGCCC GCGGCGCCCC GAGCGCTTTG TGAGCAGATG CGGAGCCGAG 60 TGGAGGGCGC GAGCCAGATG CGGGGCGACA GCTGACTTGC TGAGAGGAGG CGGGGAGGCG 120 CGGAGCGCGC GTGTGGTCCT TGCGCCGCTG ACTTCTCCAC TGGTTCCTGG GCACCGAAAG 180 ATAAACCTCT CATAATGAAG GCCCCCGCTG TGCTTGCACC TGGCATCCTC GTGCTCCTGT 240 TTACCTTGGT GCAGAGGAGC AATGGGGAGT GTAAAGAGGC ACTAGCAAAG TCCGAGATGA 300 ATGTGAATAT GAAGTATCAG CTTCCCAACT TCACCGCGGA AACACCCATC CAGAATGTCA 360 TTCTACATGA GCATCACATT TTCCTTGGTG CCACTAACTA CATTTATGTT TTAAATGAGG 420 AAGACCTTCA GAAGGTTGCT GAGTACAAGA CTGGGCCTGT GCTGGAACAC CCAGATTGTT 480 TCCCATGTCA GGACTGCAGC AGCAAAGCCA ATTTATCAGG AGGTGTTTGG AAAGATAACA 540 TCAACATGGC TCTAGTTGTC GACACCTACT ATGATGATCA ACTCATTAGC TGTGGCAGCG 600 TCAACAGAGG GACCTGCCAG CGACATGTCT TTCCCCACAA TCATACTGCT GACATACAGT 660 CGGAGGTTCA CTGCATATTC TCCCCACAGA TAGAAGAGCC CAGCCAGTGT CCTGACTGTG 720 TGGTGAGCGC CCTGGGAGCC AAAGTCCTIT CATCTGTAAA GGACCGGTTC ATCAACTTCT 780 TTGTAGGCAA TACCATAAAT 
TCTTCTTATT TCCCAGATCA TCCATTGCAT TCGATATCAG 840 TGAGAAGGCT AAAGGAAACG AAAGATGGTT TTATGTTTTT GACGGACCAG TCCTACATTG 900 ATGTTTTACC TGAGTTCAGA GATTCTTACC CCATTAAGTA TGTCCATGCC TTTGAAAGCA 960 ACAATTTTAT TTACTTCTTG ACGGTCCAAA GGGAAACTCT AGATGCTCAG ACTITTCACA 1020 CAAGAATAAT CAGGTTCTGT TCCATAAACT CTGGATTGCA TTCCTACATG GAAATGCCTC 1080 TGGAGTGTAT TCTCACAGAA AAGAGAAAAA AGAGATCCAC AAAGAAGGAA GTGTTTAATA 1140 TACTTCAGGC TGCGTATGTC AGCAAGCCTG GGGCCCAGCT TGCTAGACAA ATAGGAGCCA 1200 GCCTGAATGA TGACATTCTT TTCGGGGTGT TCC; CACAAAG CAAGCCAGAT TCTGCCGAAC 1260 CAATGGATCG ATCTGCCATG TGTGCATTCC CTATCAAATA TGTCAACGAC TTCTTCAACA 1320 AGATCGTCAA CAAAAACAAT GTGAGATGTC TCCAGCATTT TTACGGACCC AATCATGAGC 1380 ACTGCTTTAA TAGGACACTT CTGAGAAATT CATCAGGCTG TGAAGCGCGC CGTGATGAAT 1440 ATCGAACAGA GTTTACCACA GCTTTGCAGC GCGTTGACTT ATTCATGGGT CAATTCAGCG 1500 AAGTCCTCTT AACATCTATA TCCACCTTCA TTAAAGGAGA CCTCACCATA GCTAATCTTG 1560 GGACATCAGA GGGTCGCTTC ATGCAGGTTG TGGTTTCTCG ATCAGGACCA TCAACCCCTC 1620 ATGTGAATTT TCTCCTGGAC TCCCATCCAG TGTCTCCAGA AGTGATTGTG GAGCATACAT 1680 TAAACCAAAA TGGCTACACA CTGGTTATCA CTGGGAAGAA GATCACGAAG ATCCCATTGA 1740 ATGGCTTGGG CTGCAGACAT TTCCAGTCCT GCAGTCAATG CCTCTCTGCC CCACCCTTTG 1800 TTCAGTGTGG CTGGTGCCAC GACAAATGTG TGCGATCGGA GGAATGCCTG AGCGGGACAT 1860 GGACTCAACA GATCTGTCTG CCTGCAATCT ACAAGGTTTT CCCAAATAGT GCACCCCTTG 1920 AAGGAGGGAC AAGGCTGACC ATATGTGGCT GGGACTTTGG ATTTCGGAGG AATAATAAAT 1980 TTGATTTAAA GAAAACTAGA GTTCTCCTTG GAAATGAGAG CTGCACCTTG ACTTTAAGTG 2040 AGAGCACGAT GAATACATTG AAATGCACAG TTGGTCCTGC CATGAATAAG CATTTCAATA 2100 TGTCCATAAT TATTTCAAAT GGCCACGGGA CAACACAATA CAGTACATTC TCCTATGTGG 2160 ATCCTGTAAT AACAAGTATT TCGCCGAAAT ACGGTCCTAT GGCTGGTGGC ACTTTACTTA 2220 CITTAACTGG AAATTACCTA AACAGTGGGA ATTCTAGACA CATTTCAATT GGTGGAAAAA 2280 CATGTACTTT AAAAAGTGTG TCAAACAGTA TTCTTGAATG TTATACCCCA GCCCAAACCA 2340 TTTCAACTGA GTTTGCTGTT AAATTGAAAA TTGACTTAGC CAACCGAGAG ACAAGCATCT 2400 TCAGTTACCG TGAAGATCCC ATTGTCTATG AAATTCATCC AACCAAATCT TTTATTAGTA 2460 CTTGGTGGAA AGAACCTCTC AACATTGTCA 
GTTTTCTATT TTGCTTTGCC AGTGGTGGGA 2520 GCACAATAAC AGGTGTTGGG AAAAACCTGA ATTCAGTTAG TGTCCCGAGA ATGGTCATAA 2580 ATGTGCATGA AGCAGGAAGG AACTTTACAG TGGCATGTCA ACATCGCTCT AATTCAGAGA 2640 TAATCTGTTG TACCACTCCT TCCCTGCAAC AGCTGAATCT GCAACTCCCC CTGAAAACCA 2700 AAGCCT, fITT CATGTTAGAT GGGATCCTTT CCAAATACTT TGATCTCATT TATGTACATA 2760 ATCCTGTGTT TAAGCCTTT GAAAAGCCAG TGATGATCTC AATGGGCAAT GAAAATGTAC 2820 TGGAAATTAA GGGAAATGAT ATTGACCCTG AAGCAGTTAA AGGTGAAGTG TTAAAAGTTG 2880 GAAATAAGAG CTGTGAGAAT ATACACTTAC ATTCTGAAGC CGTTTTATGC ACGGTCCCCA 2940 ATGACCTGCT GAAATTGAAC AGCGAGCTAA ATATAGAGTG GAAGCAAGCA A m CTTCAA 3000 CCGTCCTTGG AAAAGTAATA GTTCAACCAG ATCAGAATTT CACAGGATTG ATTGCTGGTG 3060 TTGTCTCAAT ATCAACAGCA CTGTTATTAC TACTTGGGTT TTTCCTGTGG CTGAAAAAGA 3120 GAAAGCAAAT TAAAGATCTG GGCAGTGAAT TAGTTCGCTA CGATGCAAGA GTACACACTC 3180 CTCATTTGGA TAGGCTTGTA AGTGCCCGAA GTGTAAGCCC AACTACAGAA ATGGTTTCAA 3240 ATGAATCTGT AGACTACCGA GCTACTTTTC CAGAAGATCA GTTTCCTAAT TCATCTCAGA 3300 ACGGTTCATG CCGACAAGTG CAGTATCCTC TGACAGACAT GTCCCCCATC CTAACTAGTG 3360 GGGACTCTGA TATATCCAGT CCATTACTGC AAAATACTGT CCACATTGAC CTCAGTGCTC 3420 TAAATCCAGA GCTGGTCCAG GCAGTGCAGC ATGTAGTGAT TGGGCCCAGT AGCCTGATTG 3480 TGCATTTCAA TGAAGTCATA GGAAGAGGGC ATTTTGGTTG TGTATATCAT GGGACTTTGT 3540 TGGACAATGA TGGCAAGAAA ATTCACTGTG CTGTGAAATC CTTGAACAGA ATCACTGACA 3600 TAGGAGAAGT TTCCCAATTT CTGACCGAGG GAATCATCAT GAAAGATTTT AGTCATCCCA 3660 ATGTCCTCTC GCTCCTGGGA ATCTGCCTGC GAAGTGAAGG GTCTCCGCTG GTGGTCCTAC 3720 CATACATGAA ACATGGAGAT CTTCGAAATT TCATTCGAAA TGAGACTCAT AATCCAACTG 3780 TAAAAGATCT TATTGGCTTT GGTCTTCAAG TAGCCAAAGC GATGAAATAT CTTGCAAGCA 3840 AAAAG=GT CCACAGAGAC TTGGCTGCAA GAAACTGTAT GCTGGATGAA AAATTCACAG 3900 TCAAGGTTGC TGATTTTGGT CTTGCCAGAG <RTI ID=133.6> ACATGTATGA TAAAGAATAC TATAGTGTAC 3960 ACAACAAAAC AGGTGCAAAG CTGCCAGTGA AGTGGATGGC TTTGGAAAGT CTGCAAACTC 4020 AAAAGTTTAC CACCAAGTCA GATGTGTGGT CCTrTGGCGT CGTCCTCTGG GAGCTGATGA 4080 CAAGAGGAGC CCCACCTTAT CCTGACGTAA ACACCTTTGA TATAACTGTT TACTTGTTGC 4140 AAGGGAGAAG ACTCCTACAA 
CCCGAATACT GCCCAGACCC CTTATATGAA GTAATGCTAA 4200 AATGCTGGCA CCCTAAAGCC GAAATGCGCC CATCCTTTTC TGAACTGGTG TCCCGGATAT 4260 CAGCGbTCTT CTCTACTTTC ATTGGGGAGC ACTATGTCCA TGTGAACGCT ACTTATGTGA 4320 ACGTAAAATG TGTCGCTCCG TATCCTTCTC TGTTGTCATC AGAAGATAAC GCTGATGATG 4380 AGGTGGACAC ACGACCAGCC TCCTTCTGGG AGACATCATA GTGCTAGTAC TATGTCAAAG 4440 CAACAGTCCA CACTTTGTCC AATGGTTTTT TCACTGCCTG ACCTTTAAAA GGCCATCGAT 4500 ATTCTTTGCT CCTTGCCATA GGACTTGTAT TGTTATTTAA ATTACTGGAT TCTAAGGAAT 4560 TTCTTATCTG ACAGAGCATC AGAACCAGAG GCTTGGTCCC ACAGGCCAGG GACCAATGCG 4620 CTGCAG 4626 (2) INFORMATION FOR SEQ ID NO:28: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 8082 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: AGCTTGTTTG GCCGTTTTAG GGTTTGTTGG AA=TTTT TCGTCTATGT ACTTGTGAAT 60 TATTTCACGT TTGCCATTAC CGGTTCTCCA TAGGGTGATG TTCATTAGCA GTGGTGATAG 120 GTTAATTTTC ACCATCTCTT ATGCGGTTGA ATAGTCACCT CTGAACCACT TTTTCCTCCA 180 GTAACTCCTC TTTCTTCGGA CCTTCTGCAG CCAACCTGAA AGAATAACAA GGAGGTGGCT 240 GGAAACTTGT TTTAAGGAAC CGCCTGTCCT TCCCCCGCTG GAAACCTTGC ACCTCGGACG 300 CTCCTGCTCC TGCCCCCACC TGACCCCCGC CCTCGTTGAC ATCCAGGCGC GATGATCTCT 360 GCTGCCAGTA GAGGGCACAC TTACTTTACT TTCGCAAACC TGAACGCGGG TGCTGCCCAG 420 AGAGGGGGCG GAGGGAAAGA CGCTTTGCAG CAAAATCCAG CATAGCGATT GGTTGCTCCC 480 CGCGTTTGCG GCAAAGGCCT GGAGGCAGGA GTAATTTGCA ATCCTTAAAG CTGAATTGTG 540 CAGTGCATCG GATTTGGAAG CTACTATATT CACTTAACAC TTGAACGCTG AGCTGCAAAC 600 TCAACGGGTA ATAACCCATC TTGAACAGCG TACATGCTAT ACACACACCC CTTCCCCCG 660 AATTGTTTTC TCTTTTGGAG GTGGTGGAGG GAGAGAAAAG TTTACTTAAA ATGCCTTTGG 720 GTGAGGGACC AAGGATGAGA AGAATGTTT TTGTTTTTCA TGCCGTGGAA TAACACAAAA 780 TAAAAAATCC CGAGGGAATA TACATTATAT ATTAAATATA GATCATTTCA GGGAGCAAAC 840 AAATCATGTG TGGGGCTGGG CAACTAGCTG AGTCGAAGCG TAAATAAAAT GTGAATACAC 900 GTTTGCGGGT TACATACAGT GCACTTTCAC TAGTATTCAG AAAAAATTGT GAGTCAGTGA 960 ACTAGGAAAT TAATGCCTGG AAGGCAGCCA AATTTTAATT AGCTCAAGAC TCCCCCCCCC 1020 CCCCAAAAAA 
AGGCACGGAA GTAATACTCC TCTCCTCTTC TTTGATCAGA ATCGATGCAT 1080 TTTTTGTGCA TGACCGCATT TCCAATAATA AAAGGGGAAA GAGGACCTGG AAAGGAATTA 1140 AACGTCCGGT TTGTCCGGGG AGGAAAGAGT TAACGGTTTT TTTCACAAGG GTCTCTGCTG 1200 ACTCCCCCGG CTCGGTCCAC AAGCTCTCCA CTTGCCCCTT TTAGGAAGTC CGGTCCCGCG 1260 GTTCGGGTAC CCCCTGCCCC TCCCATATTC TCCCGTCTAG CACCTTTGAT TTCTCCCAAA 1320 CCCGGCAGCC CGAGACTGTT GCAAACCGGC GCCACAGGGC GCAAAGGGGA TTTGTCTCTT 1380 CTGAAACCTG GCTGAGAAAT TGGGAACTCC GTGTGGGAGG CGTGGGGGTG GGACGGTGGG 1440 GTACAGACTG GCAGAGAGCA GGCAACCTCC CTCTCGCCCT AGCCCAGCTC TGGAACAGGC 1500 AGACACATCT CAGGGCTAAA CAGACGCCTC CCGCACGGGG CCCCACGGAA GCCTGAGCAG 1560 GCGGGGCAGG AGGGGCGGTA TCTGCTGCTT TGGCAGCAAA TTGGGGGACT CAGTCTGGGT 1620 GGAAGGTATC CAATCCAGAT AGCTGTGCAT ACATAATGCA TAATACATGA CTCCCCCCAA 1680 CAAATGCAAT GGGAGTTTAT TCATAACGCG CTCTCCAAGT ATACGTGGCA ATGCGTTGCT 1740 GGGTTATTTT AATCATTCTA GGCATCGTTT TCCTCCTTAT GCCTCTATCA TTCCTCCCTA 1800 TCTACACTAA CATCCCACGC TCTGAACGCG CGCCCATTAA TACCCTTCTT TCCTCCACTC 1860 TCCCTGGGAC TCTTGATCAA <RTI ID=135.2> AGCGCGGCCC TTTCCCCAGC CTTAGCGAGG CGCCCTGCAG 1920 CCTGGTACGC GCGTGGCGTG GCGGTGGGCG CGCAGTGCGT TCTCTGTGTG GAGGGCAGCT 1980 GTTCCGCCTG CGATGATTTA TACTCACAGG ACAAGGATGC GGTITGTCAA ACAGTACTGC 2040 TACGGAGGAG CAGCAGAGAA AGGGAGAGGG TTTGAGAGGG AGCAAAAGAA AATGGTAGGC 2100 GCGCGTAGTT AATTCATGCG GCTCTCTTAC TCTGTTTACA TCCTAGAGCT AGAGTGCTCG 2160 GCTGCCCGGC TGAGTCTCCT CCCCACCTTC CCCACCCTCC CCACCCTCCC CATAAGCGCC 2220 CCTCCCGGGT TCCCAAAGCA GAGGGCGTGG GGGAAAAGAA AAAAGATCCT CTCTCGCTAA 2280 TCTCCGCCCA CCGGCCCTTT ATAATGCGAG GGTCTGGACG GCTGAGGACC CCCGAGCTGT 2340 GCTGCTCGCG GCCGCCACCG CCGGGCCCCG GCCGTCCCTG GCTCCCCTCC TGCCTCGAGA 2400 AGGGCAGGGC TTCTCAGAGG CTTGGCGGGA AAAAGAACGG AGGGAGGGAT CGCGCTGAGT 2460 ATAAAAGCCG GTTTTCGGGG CTTTATCTAA CTCGCTGTAG TAATTCCAGC GAGAGGCAGA 2520 GGGAGCGAGC GGGCGGCCGG CTAGGGTGGA AGAGCCGGGC GAGCAGAGCT GCGCTGCGGG 2580 CGTCCTGGGA AGGGAGATCC GGAGCGAATA GGGGGCTTCG CCTCTGGCCC AGCCCTCCCG 2640 CTGATCCCCC AGCCAGCGGT CCGCAACCCT TGCCGCATCC ACGAAACTTT GCCCATAGCA 2700 
GCGGGCGGGC Aulw GCACT GGAACTTACA ACACCCGAGC AAGGACGCGA CTCTCCCGAC 2760 GCGGGGAGGC TATTCTGCCC ATTTGGGGAC ACTTCCCCGC CGCTGCCAGG ACCCGCTTCT 2820 CTGAAAGGCT CTCCTTGCAG CTGCTTAGAC GCTGGATTTT TTTCGGGTAG TGGAAAACCA 2880 GGTAAGCACC GAAGTCCACT TGCCTTTTAA TTTATTTTTT TATCACTTTA ATGCTGAGAT 2940 GAGTCGAATG CCTAAATAGG GTGTCTTTTC TCCCATTCCT GCGCTATTGA CACTTTTCTC 3000 AGAGTAGTTA TGGTAACTGG GGCTGGGGTG GGGGGTAATC CAGAACTGGA TCGGGGTAAA 3060 GTGACTTGTC AAGATGGGAG AGGAGAAGGC AGAGGGAAAA CGGGAATGGT TTTTAAGACT 3120 ACCCTTTCGA GATTTCTGCC TTATGAATAT ATTCACGCTG ACTCCCGGCC GGTCGGACAT 3180 TCCTGCTTTA TTGTGTTAAT TGCTCTCTGG GTTTTGGGGG GCTGGGGGTT GCTTTGCGGT 3240 GGGCAGAAAG CCCCTTGCAT CCTGAGCTCC TTGGAGTAGG GACCGCATAT CGCCTGTGTG 3300 AGCCAGATCG CTCCGCAGCC GCTGACTTGT CCCCGTCTCC GGGAGGGCAT TTAAATTTCG 3360 GCTCACCGCA TTTCTGACAG CCGGAGACGG ACACTGCGGC GCGTCCCGCC CGCCTGTCCC 3420 CGCGGCGATT CCAACCCGCC CTGATCCTTT TAAGAAGTTG GCATTTGGCT TTTTAAAAAG 3480 CAATAATACA ATTTAAAACC TGGGTCTCTA GAGGTGTTAG GACGTGGTGT TGGGTAGGCG 3540 CAGGCAGGGG AAAAGGGAGG CGAGGATGTG TCCGATTCTC CTGGAATCGT TGACTTGGAA 3600 AAACCAGGGC GAATCTCCGC ACCCAGCCCT GACTCCCCTG CCGCGGCCGC CCTCGGGTGT 3660 CCTCGCGCCC GAGATGCGGA GGAACTGCGA GGAGCGGGGC TCTGGGCGGT TCCAGAACAG 3720 CTGCTACCCT TGGTGGGGTG GCTCCGGGGG AGGTATCGCA GCGGGGTCTC TGGCGCAGTT 3780 GCATCTCCGT ATTGAGTGCG AAGGGAGGTG CCCCTATTAT TATTTGACAC CCCCCTTGTA 3840 TTTATGGAGG GGTGGTAAAG CCCGCGGCTG AGCTCGCCAC TCCAGCCGGC GAGAGAAAGA 3900 AGAAAAGCTG GCAAAAGGAG TGTTGGACGG GGGCGGTACT GGGGGTGGGG ACGGGGGCGG 3960 TGGAGAGGGA AGGTTGGGAG GGGCTGCGGT GCCGGCGGGG GTAGGAGAGC GGCTAGGGCG 4020 CGAGTGGGAA CAGCCGCAGC GGAGGGGCCC CGGCGCGGAG CGGGGTTCAC GCAGCCGCTA 4080 GCGCCCAGGC GCCTCTCGCC TTCTCCTTCA GGTGGCGCAA AACTTTGTGC CTTGGATTTT 4140 GGCAAATTGT TTTCCTCACC GCCACCTCCC GCGGCTTCTT AAGGGCGCCA GGGCCGATTT 4200 CGATTCCTCT GCCGCTGCGG GGCCGACTCC CGGGCTTTGC GCTCCGGGCT CCCGGGGGAG 4260 CGGGGGCTCG GCGGGCACCA AGCCGCTGGT TCACTAAGTG CGTCTCCGAG ATAGCAGGGG 4320 ACTGTCCAAA GGGGGTGAAA GGGTGCTCCC TTTATTCCCC CACCAAGACC ACCCAGCCGC 4380 TTTAGGGGAT 
AGCTCTGCAA GGGGAGAGGT TCGGGACTGT GGCGCGCACT GCGCGCTGCG 4440 CCAGGTTTCC GCACCAAGAC CCCTTTAACT CAAGACTGCC TCCCGCTTTG TGTGCCCCGC 4500 TCCAGCAGCC TCCCGCGACG ATGCCCCTCA ACGTTAGCTT CACCAACAGG AACTATGACC 4560 TCGACTACGA CTCGGTGCAG CCGTATTTCT ACTGCGACGA GGAGGAGAAC TTCTACCAGC 4620 AGCAGCAGCA GAGCGAGCTG CAGCCCCCGG CGCCCAGCGA GGATATCTGG AAGAAATTCG 4680 AGCTGCTGCC CACCCCGCCC CTGTCCCCTA GCCGCCGCTC CGGGCTCTGC TCGCCCTCCT 4740 ACGTTGCGGT CACACCCTTC TCCCTTCGGG GAGACAACGA CGGCGGTGGC GGGAGCTTCT 4800 CCACGGCCGA CCAGCTGGAG ATGGTGACCG AGCTGCTGGG AGGAGACATG GTGAACCAGA 4860 GTTTCATCTG CGACCCGGAC GACGAGACCT TCATCAAAAA CATCATCATC CAGGACTGTA 4920 TGTGGAGCGG CTTCTCGGCC GCCGCCAAGC TCGTCTCAGA GAAGCTGGCC TCCTACCAGG 4980 CTGCGCGCAA AGACAGCGGC AGCCCGAACC CCGCCCGCGG CCACAGCGTC TGCTCCACCT 5040 CCAGCTTGTA CCTGCAGGAT CTGAGCGCCG CCGCCTCAGA GTGCATCGAC CCCTCGGTGG 5100 TCTTCCCCTA CCCTCTCAAC GACAGCAGCT CGCCCAAGTC CTGCGCCTCG CAAGACTCCA 5160 GCGCCTTCTC TCCGTCCTCG GATTCTCTGC TCTCCTCGAC GGAGTCCTCC CCGCAGGGCA 5220 GCCCCGAGCC CCTGGTGCTC CATGAGGAGA CACCGCCCAC CACCAGCAGC GACTCTGGTA 5280 AGCGAAGCCC GCCCAGGCCT GTCAAAAGTG GGCGGCTGGA TACCTTTCCC ATTTTCATTG 5340 GCAGCTTATT TAACGGGCCA CTCTTATTAG GAAGGAGAGA TAGCAGATCT GGAGAGATTT 5400 GGGAGCTCAT CACCTCTGAA ACCTTGGGCT TTAGCGIITC CTCCCATCCC TTCCCCTTAG 5460 ACTGCCCATG TTTGCAGCCC CCCTCCCCGT TTGTCTCCCA CCCCTCAGGA ATTTCATTTA 5520 GGTTTTTAAA CCTTCTGGCT TATCTTACAA CTCAATCCAC TTCTTCTTAC CTCCCGTTAA 5580 CATTTTAATT GCCCTGGGGC GGGGTGGCAG GGAGTGTATG AATGAGGATA AGAGAGGATT 5640 GATCTCTGAG AGTGAATGAA TTGCTTCCCT CTTAACTTCC GAGAAGTGGT GGGATTTAAT 5700 GAACTATCTA CAAAAATGAG GGGCTGTGTT TAGAGGCTAG GCAGGGCCTG CCTGAGTGCG 5760 GGAGCCAGTG AACTGCCTCA AGAGTGGGTG GGCTGAGGAG CTGGGATCTT CTCAGCCTAT 5820 TTTGAACACT GAAAAGCAAA TCCTTGCCAA AGTTGGACTT mTrTTTCT TTTATTCCTT 5880 CCCCCGCCCT CTTGGACTTT TGGCAAAACT GCAATTTTTT TTTTTTTATT TTTCATTTCC 5940 AGTAAAATAG GGAGTTGCTA AAGTCATACC AAGCAATTTG CAGCTATCAT TTGCAACACC 6000 TGAAGTGTTC TTGGTAAAGT CCCTCAAAAA TAGGAGGTGC TTGGGAATGT GCTTTGCTTT 6060 GGGTGTGTCC AAAGCCTCAT 
TAAGTCTTAG GTAAGAATTG GCATCAATGT CCTATCCTGG 6120 GAAGTTGCAC TTTTCTTGTC CATGCCATAA CCCAGCTGTC TTTCCCTTTA TGAGACTCTT 6180 ACCTTCATGG TGAGAGGAGT AAGGGTGGCT GGCTAGATTG GTTCTTTTTT TTTTTTTTTC 6240 CTTTTTTAAG ACGGAGTCTC ACTCTGTCAC TAGGCTGGAG TGCAGTGGCG CAATCAACCT 6300 CCAACCCCCT GGTTCAAGAG ATTCTCCTGC CTCAGCCTCC CAAGTAGCTG GGACTACAGG 6360 TGCACACCAC CATGCCAGGC TAATTTTTGT AATTTTAGTA GAGATGGGGT TTCATCGTGT 6420 TGGCCAGGAT GGTCTCTCCT GACCTCACGA TCCGCCCACC TCGGCCTCCC AAAGTGCTGG 6480 GATTACAGGT GTGAGCCAGG GCACCAGGCT TAGATGTGGC TCTTTGGGGA GATAATTTTG 6540 TCCAGAGACC TTTCTAACGT ATTCATGCCT TGTATTTGTA CAGCATTAAT CTGGTAATTG 6600 ATTATITTM TGTAACCTTG CTAAAGGAGT GATTTCTATT TCCTTTCTTA AAGAGGAGGA 6660 ACAAGAAGAT GAGGAAGAAA TCGATGTTGT TTCTGTGGAA AAGAGGCAGG CTCCTGGCAA 6720 AAGGTCAGAG TCTGGATCAC CTTCTGCTGG AGGCCACAGC AAACCTCCTC ACAGCCCACT 6780 GGTCCTCAAG AGGTGCCACG TCTCCACACA TCAGCACAAC TACGCAGCGC CTCCCTCCAC 6840 TCGGAAGGAC TATCCTGCTG CCAAGAGGGT CAAGTTGGAC AGTGTCAGAG TCCTGAGACA 6900 GATCAGCAAC AACCGAAAAT GCACCAGCCC CAGGTCCTCG GACACCGAGG AGAATGTCAA 6960 GAGGCGAACA CACAACGTCT TGGAGCGCCA GAGGAGGAAC GAGCTAAAAC GGAGCTTTTT 7020 TGCCCTGCGT GACCAGATCC CGGAGTTGGA AAACAATGAA AAGGCCCCCA AGGTAGTTAT 7080 CCTTAAAAAA GCCACAGCAT ACATCCTGTC CGTCCAAGCA GAGGAGCAAA AGCTCATTTC 7140 TGAAGAGGAC TTGTTGCGGA AACGACGAGA ACAGTTGAAA CACAAACTTG AACAGCTACG 7200 GAACTCTTGT GCGTAAGGAA AAGTAAGGAA AACGATTCCT TCTAACAGAA ATGTCCTGAG 7260 CAATCACCTA TGAACTTGTT TCAAATGCAT GATCAAATGC AACCTCACAA CCTTGGCTGA 7320 GTCTTGAGAC TGAAAGATTT AGCCATAATG TAAACTGCCT CAAATTGGAC TTTGGGCATA 7380 AAAGAACTTT TTTATGCTTA CCATCTTTTT TTTTTCTTTA ACAGATTTGT ATTTAAGAAT 7440 TGTTTTTAAA AAATTTTAAG ATTTACACAA TGTTTCTCTG TAAATATTGC CATTAAATGT 7500 AAATAACTTT AATAAAACGT TTATAGCAGT TACACAGAAT TTCAATCCTA GTATATAGTA 7560 CCTAGTATTA TAGGTACTAT AAACCCTAAT TTTTTTTATT TAAGTACATT TTGCTTTTA 7620 AAGTTGATTT TTTTCTATTG TTTTTAGAAA AAATAAAATA ACTGGCAAAT ATATCATTGA 7680 GCCAAATCTT AAGTTGTGAA TSTTTTGTTT CGTTTCTTCC CCCTCCCAAC CACCACCATC 7740 CCTGTTTGTT
TTCATCAATT GCCCCTTCAG AGGGCGGTCT TAAGAAAGGC AAGAGTTTTC 7800 CTCTGTTGAA ATGGGTCTGG GGGCCTTAAG GTCTITAAGT TCTTGGAGGT TCTAAGATGC 7860 TTCCTGGAGA CTATGATAAC AGCCAGAGTT GACAGTTAGA AGGAATGGCA GAAGGCAGGT 7920 GAGAAGGTGA GAGGTAGGCA AAGGAGATAC AAGAGGTCAA AGGTAGCAGT TAAGTACACA 7980 AAGAGGCATA AGGACTGGGG AGTTGGGAGG AAGGTGAGGA AGAAACTCCT GTTACnTAG 8040 TTAACCAGTG CCAGTCCCCT GCTCACTCCA AACCCAGGAA TT 8082 (2) INFORMATION FOR SEQ ID NO:29: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 5775 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: TCCTAGGCGG CGGCCGCGGC GGCGGAGGCA GCAGCGGCGG CGGCAGTGGC GGCGGCGAAG 60 GTGGCGGCGG CTCGGCCAGT ACTCCCGGCC CCCGCCATTT CGGACTGGGA GCGAGCGCGG 120 CGCAGGCACT GAAGGCGGCG GCGGGGCCAG AGGCTCAGCG GCTCCCAGGT GCGGGAGAGA 180 GGCCTGCTGA AAATGACTGA ATATAAACTT GTGGTAGTTG GAGCTTGTGG CGTAGGCAAG 240 AGTGCCTTGA CGATACAGCT AATTCAGAAT CATTTGTGG ACGAATATGA TCCAACAATA 300 GAGGATTCCT ACAGGAAGCA AGTAGTAATT GATGGAGAAA CCTGTCTCTT GGATATTCTC 360 GACACAGCAG GTCAAGAGGA GTACAGTGCA ATGAGGGACC AGTACATGAG GACTGGGGAG 420 GGCTTTCTTT GTGTATTTGC CATAAATAAT ACTAAATCAT TTGAAGATAT TCACCATTAT 480 AGAGAACAAA TTAAAAGAGT TAAGGACTCT GAAGATGTAC CTATGGTCCT AGTAGGAAAT 540 AAATGTGATT TGCCTTCTAG AACAGTAGAC ACAAAACAGG CTCAGGACTT AGCAAGAAGT 600 TATGGAATTC CTTTTATTGA AACATCAGCA AAGACAAGAC AGGGTGTTGA TGATGCCTTC 660 TATACATTAG TTCGAGAAAT TCGAAAACAT AAAGAAAAGA TGAGCAAAGA TGGTAAAAAG 720 AAGAAAAAGA AGTCAAAGAC AAAGTGTGTA ATTATGTAAA TACAATTTGT ACTTTTTTCT 780 TAAGGCATAC TAGTACAAGT GGTAATTTTT GTACATTACA CTAAATTATT AGCATTTGrT 840 TTAGCATTAC CTAATTTTTT TCCTGCTCCA TGCAGACTGT TAGCT1=TAC CTTAAATGCT 900 TATTTTAAAA TGACAGTGGA AGrTTTrTT TCCTCGAAGT GCCAGTATTC CCAGAGTTTT 960 GGTTTTTGAA CTAGCAATGC CTGTGAAAAA GAAACTGAAT ACCTAAGATT TCTGTCTTGG 1020 GGTTTTTGGT GCATGCAGTT GATTACTTcT TATTTTTCTT ACCAAGTGTG AATGTTGGTG 1080 TGAAACAAAT TAATGAAGCT TTTGAATCAT CCCTATTCTG TGTTTTATCT AGTCACATAA 1140 ATGGATTAAT TACTAATTTC AGTTGAGACC
TTCTAATTGG TITITACTGA AACATTGAGG 1200 GACACAAATT TATGGGCTTC CTGATGATGA TTCTTCTAGG CATCATGTCC TATAGTTTGT 1260 CATCCCTGAT GAATGTAAAG TTACACTGTT CACAAAGGTT TTGTCTCCTT TCCACTGCTA 1320 TTAGTCATGG TCACTCTCCC CAAAATATTA TATTTTTTCT ATAAAAAGAA AAAAATGGAA 1380 AAAAATTACA AGGCAATGGA AACTATTATA AGGCCATTTC CTTTTCACAT TAGATAAATT 1440 ACTATAAAGA CTCCTAATAG CTTTTCCTG TTAAGGCAGA CCCAGTATGA ATGGGATTAT 1500 TATAGCAACC ATTTTGGGGC TATATTTACA TGCTACTAAA TTTTTATAAT AATTGAAAAG 1560 ATTTTAACAA GTATAAAAAA ATTCTCATAG GAATTAAATG TAGTCTCCCT GTGTCAGACT 1620 GCTCT=CAT AGTATAACTT TAAATCTTTT CTTCAACTTG AGTCTTTGAA GATAGTTTTA 1680 ATTCTGCTTG TGACATTAAA AGATTATTTG GGCCAGTTAT AGCTTATTAG GTGTTGAAGA 1740 GACCAAGGTT GCAAGCCAGG CCCTGTGTGA ACCTTGAGCT TTCATAGAGA GTTTCACAGC 1800 ATGGACTGTG TGCCCCACGG TCATCCGAGT GGTTGTACGA TGCATTGGTT AGTCAAAAAT 1860 GGGGAGGGAC TAGGGCAGTT TGGATAGCTC AACAAGATAC AATCTCACTC TGTGGTGGTC 1920 CTGCTGACAA ATCAAGAGCA TTGCTTTTGT TTCTTAAGAA AACAAACTCT TTTTTAAAAA 1980 TTACTTTTAA ATATTAACTC AAAAGTTGAG ATTTTGGGGT GGTGGTGTGC CAAGACATTA 2040 ATTTTTTTTT TAAACAATGA AGTGAAAAAG TTTTACAATC TCTAGGTTTG GCTAGTTCTC 2100 TTAACACTGG TTAAATTAAC ATTGCATAAA CACTTTTCAA GTCTGATCCA TATTTAATAA 2160 TGCTTTAAAA TAAAAATAAA AACAATCCTT TTGATAAATT TAAAATGTTA CTTATTTTAA 2220 AATAAATGAA GTGAGATGGC ATGGTGAGGT GAAAGTATCA CTGGACTAGG TTGTTGGTGA 2280 CTTAGGTTCT AGATAGGTGT CTTTTAGGAC TCTGATTTTG AGGACATCAC TTACTATCCA 2340 TTTCITCATG TTAAAAGAAG TCATCTCAAA CTCTTAGTTT TTTTTTTTTA CACTATGTGA 2400 TTTATATTCC ATTTACATAA GGATACACTT ATTTGTCAAG CTCAGCACAA TCTGTAAATT 2460 TTTAACCTAT GTTACACCAT CTTCAGTGCC AGTCTTGGGC AAAATTGTGC AAGAGGTGAA 2520 GTTTATATTT GAATATCCAT TCTCGTTTTA GGACTCTTCT TCCATATTAG TGTCATCTTG 2580 CCAATCCATT AGCGACAGTA GGATTTTTCA ACCCTGGTAT GAATAGACAG AACCCTATCC 2880 AGTGGAAGGA GAATTTAATA AAGATAGTGC AGAAAGAATT CCTTAGGTAA TCTATAACTA 2940 GGACTACTCC TGGTAACAGT AATACATTCC ATTGTTTTAG TAACCAGAAA TCTTCATGCA 3000 ATGAAAAATA CTTTAATTCA TGAAGCTTAC TTTTTTTTTT TTGGTGTCAG AGTCTCGCTC 3060 TTGTCACCCA GGCTGGAATG CAGTGGCGCC
ATCTCAGCTC ACTGCAACCT TCCATCTTCC 3120 CAGGTTCAAG CGATTCTCGT GCCTCGGCCT CCTGAGTAGC TGGGATTACA GGCGTGTGCA 3180 CTACACTCAA CTAATTTTTG TATTTTTAGG AGAGACGGGG TTTCACCTGT TGGCCAGGCT 3240 GGTCTCGAAC TCCTGACCTC AAGTGATTCA CCCACCTTGG CCTCATAAAC CTGTTTTGCA 3300 GAACTCATTT ATTCAGCAAA TATTTATTGA GTGCCTACCA GATGCCAGTC ACCGCACAAG 3360 GCACTGGGTA TATGGTATCC CCAAACAAGA GACATAATCC CGGTCCTTAG GTACTGCTAG 3420 TGTGGTCTGT AATATCTTAC TAAGGCCTTT GGTATACGAC CCAGAGATAA CACGATGCGT 3480 ATTTTAGTTT TGCAAAGAAG GGGTTTGGTC TCTGTGCCAG CTCTATAATT GTTTTGCTAC 3540 GATTCCACTG AAACTCTTCG ATCAAGCTAC TTTATGTAAA TCACTTCATT GTTTTAAAGG 3600 AATAAACTTG ATTATATTGT TTrrTTATIT GGCATAACTG TGATTCTTTT AGGACAATTA 3660 CTGTACACAT TAAGGTGTAT GTCAGATATT CATATTGACC CAAATGTGTA ATATTCCAGT 3720 TTTCTCTGCA TAAGTAATTA AAATATACTT AAAAATTAAT AGTTTTATCT GGGTACAAAT 3780 AAACAGTGCC TGAACTAGTT CACAGACAAG GGAAACTTCT ATGTAAAAAT CACTATGATT 3840 TCTGAATTGC TATGTGAAAC TACAGATCTT TGGAACACTG TTTAGGTAGG GTGTTAAGAC 3900 TTGACACAGT ACCTCGTTTC TACACAGAGA AAGAAATGGC CATACTTCAG GAACTGCAGT 3960 GCTTATGAGG GGATATTTAG GCCTCTTGAA TTTTTGATGT AGATGGGCAT TTTTTTAAGG 4020 TAGTGGTTAA TTACCTTTAT GTGAACTTTG AATGGTTTAA CAAAAGATTT GTTTTTGTAG 4080 AGATTTTAAA GGGGGAGAAT TCTAGAAATA AATGTTACCT AATTATTACA GCCTTAAAGA 4140 CAAAAATCCT TGTTGAAGTT TTTTAAAAA AAGACTAAAT TACATAGACT TAGGCATTAA 4200 CATGTTTGTG GAAGAATATA GCAGACGTAT ATTGTATCAT TTGAGTGAAT GTTCCCAAGT 4260 AGGCATTCTA GGCTCTATTT AACTGAGTCA CACTGCATAG GAATTTAGAA CCTAACTTTT 4320 ATAGGTTATC AAAACTGTTG TCACCATTGC ACAATTTTGT CCTAATATAT ACATAGAAAC 4380 TTTGTGGGGC ATGTTAAGTT ACAGTTTGCA CAAGTTCATC TCATTTGTAT TCCATTGATT 4440 TTTTTTTTTC TTCTAAACAT TTTTTCTTCA AAACAGTATA TATAACTTT TrTAGGGGAT 4500 TTTTTTTAGA CAGCAAAAAA CTATCTGAAG ATTTCCATTT GTCAAAAAGT AATGATTTCT 4560 TGATAATTGT GTAGTGAATG TTTTTTAGAA CCCAGCAGTT ACCTTGAAAG CTGAATTTAT 4620 ATTTAGTAAC TTCTGTGTTA ATACTGGATA GCATGAATTC TGCATTGAGA AACTGAATAG 4680 CTGTCATAAA ATGCTTTCTT TCCTAAAGAA AGATACTCAC ATGAGTTCTT GAAGAATAGT 4740 CATAACTAGA TTAAGATCTG TGTTTTAGTT TAATAGTTTG 
AAGTGCCTGT TTGGGATAAT 4800 GATAGGTAAT TTAGATGAAT TTAGGGGAAA AAAAAGTTAT CTGCAGTTAT GTTGAGGGCC 4860 CATCTCTCCC CCCACACCCC CACAGAGCTA ACTGGGTTAC AGTGTTTTAT CCGAAAGTTT 4920 CCAATTCCAC TGTCTTGTGT TTTCATGTTG AAAATACTTT TGCATTTTTC C=GAGTGC 4980 CAATTTCTTA CTAGTACTAT TTCTTAATGT AACATGTTTA CCTGGCCTGT CTTTTAACTA 5040 TTTTTGTATA GTGTAAACTG AAACATGCAC ATTTTGTACA TTGTGCTTTC TTTTGTGGGT 5100 CATATGCAGT GTGATCCAGT TGTTTTCCAT CATTTGGTTG CGCTGACCTA GGAATGTTGG 5160 TCATATCAAA CATTAAAAAT GACCACTCTT TTAATGAAAT TAACTTTTAA ATGTTTATAG 5220 GAGTATGTGC TGTGAAGTGA TCTAAAATTT GTAATATTTT TGTCATGAAC TGTACTACTC 5280 CTAATTATTG TAATGTAATA AAAATAGTTA CAGTGACTAT GAGTGTGTAT TTATTCATGC 5340 AAATTTGAAC TGTTTGCCCC GAAATGGATA TGGATACTTT ATAAGCCATA GACACTATAG 5400 TATACCAGTG AATCTTTTAT GCAGCTTGTT AGAAGTATCC TTTTATTTTC TAAAAGGTGC 5460 TGTGGATATT ATGTAAAGGC GTGTTTGCTT AAACAATTTT CCATATTTAG AAGTAGATGC 5520 AAAACAAATC TGCCTTTATG ACAAAAAAAT AGGATAACAT TATTTATTTA TTTCCTTTTA 5580 TCAATAAGGT AATTGATACA CAACAGGTGA CTTGGTTTTA GGCCCAAAGG TAGCAGCAGC 5640 AACATTAATA ATGGAAATAA TTGAATAGTT AGTTATGTAT GTTAATGCCA GTCACCAGCA 5700 GGCTATTTCA AGGTCAGAAG TAATGACTCC ATACATATTA TTTATTTCTA TAACTACATT 5760 TAAATCATTA CCAGG 5775 (2) INFORMATION FOR SEQ ID NO:30: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 151 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: CAGGTTCTTG CTGGTGTGAA ATGACTGAGT ACAAACTGGT GGTGGTTGGA GCAGGTGGTG 60 TTGGGAAAAG CGCACTGACA ATCCAGCTAA TCCAGAACCA CTTTGTAGAT GAATATGATC 120 CCACCATAGA GGTGAGGCCC AGTGGTAGCC C 151 (2) INFORMATION FOR SEQ ID NO:31: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 199 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: GATTCTTACA GAAAACAAGT GGTTATAGAT GGTGAAACCT GTTTGTTGGA CATACTGGAT 60 ACAGCTGGAC AAGAAGAGTA CAGTGCCATG AGAGACCAAT ACATGAGGAC AGGCGAAGGC 120 TTCCTCTGTG TATTTGCCAT
CAATAATAGC AAGTCATTTG CGGATATTAA CCTCTACAGG 180 TACTAGGAGT CATTATTTT 199 (2) INFORMATION FOR SEQ ID NO:32: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 160 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: GGAGCAGATT AAGCGAGTAA AAGACTCGGA TGATGTACCT ATGGTGCTAG TGGGAAACAA 60 GTGTGATTTG CCAACAAGGA CAGTTGATAC AAAACAAGCC CACGAACTGG CCAAGAGTTA 120 CGGGATTCCA TTCATTGAAA CCTCAGCCAA GACCAGACAG 160 (2) INFORMATION FOR SEQ ID NO:33: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 120 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: GGTGTTGAAG ATGCTTTTTA CACACTGGTA AGAGAAATAC GCCAGTACCG AATGAAAAAA 60 CTCAACAGCA GTGATGATGG GACTCAGGGT TGTATGGGAT TGCCATGTGT GGTGATGTAA 120 (2) INFORMATION FOR SEQ ID NO:34: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 4508 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: CAGATCCAGT TGTTCTCATG CAGCCGTGTG CGGTACGTGC CGTAGAGATG CTGGCCCAGG 60 CGGAAGCTGG GCACCTCCTT ACTGTACTTC CATACCCTAT GGAAGTACAG TAAGGAGGTG 120 CCCAGCTTCC GCCTGGGCCA GCATCTCTAC GGCACGTACC GCACACGGCT GCATGAGAAC 180 AACTGGATCT GCATCCAGGA GGACACCGGC CTCCTCTACC TTAACCGGAG CCTGGACCAT 240 AGCTCCTGGG AGAAGCTCAG TGTCCGCAAC CGCGGCTTTC CCCTGCTCAC CGTCTACCTC 300 AAGGTCTTCC TGTCACCCAC ATCCCTTCGT GAGGGCGAGT GCCAGTGGCC AGGCTGTGCC 360 CGCGTATACT TCTCCTTCTT CAACACCTCC TTTCCAGCCT GCAGCTCCCT CAAGCCCCGG 420 GAGCTCTGCT TCCCAGAGAC AAGGCCCTCC TTCCGCATTC GGGAGAACCG ACCCCCAGGC 480 ACCTTCCACC AGTTCCGCCT GCTGCCTGTG CAGTTCSTGT GCCCCAACAT CAGCGTGGCC 540 TACAGGCTCC TGGAGGGTGA GGGTCTGCCC TTCCGCTGCG CCCCGGACAG CCTGGAGGTG 600 AGCACGCGCT GGGCCCTGGA CCGCGAGCAG CGGGAGAAGT ACGAGCTGGT GGCCGTGTGC 660 ACCGTGCACG CCGGCGCGCG CGAGGAGGTG GTGATGGTGC CCTTCCCGGT GACCGTGTAC 720 GACGAGGACG ACTCGGCGCC CACCTTCCCC GCGGGCGTCG ACACCGCCAG
CGCCGTGGTG 780 GAGTTCAAGC GGAAGGAGGA CACCGTGGTG GCCACGCTGC GTGTCTTCGA TGCAGACGTG 840 GTACCTGCAT CAGGGGAGCT GGTGAGGCGG TACACAAGCA CGCTGCTCCC CGGGGACACC 900 TGGGCCCAGC AGACCTTCCG GGTGGAACAC TGGCCCAACG AGACCTCGGT CCAGGCCAAC 960 GGCAGCTTCG TGCGGGCGAC CGTACATGAC TATAGGCTGG TTCTCAACCG GAACCTCTCC 1020 ATCTCGGAGA ACCGCACCAT GCAGCTGGCG GTGCTGGTCA ATGACTCAGA CTTCCAGGGC 1080 CCAGGAGCGG GCGTCCTCTT GCTCCACTTC AACGTGTCGG TGCTGCCGGT CAGCCTGCAC 1140 CTGCCCAGTA CCTACTCCCT CTCCGTGAGC AGGAGGGCTC GCCGATTTGC CCAGATCGGG 1200 AAAGTCTGTG TGGAAAACTG CCAGGCGTTC AGTGGCATCA ACGTCCAGTA CAAGCTGCAT 1260 TCCTCTGGTG CCAACTGCAG CACGCTAGGG GTGGTCACCT CAGCCGAGGA CACCTCGGGG 1320 ATCCTGTTTG TGAATGACAC CAAGGCCCTG CGGCGGCCCA AGTGTGCCGA ACTTCACTAC 1380 ATGGTGGTGG CCACCGACCA GCAGACCTCT AGGCAGGCCC AGGCCCAGCT GCTTGTAACA 1440 GTGGAGGGGT CATATGTGGC CGAGGAGGCG GGCTGCCCCC TGTCCTGTGC AGTCAGCAAG 1500 AGACGGCTGG AGTGTGAGGA GTGTGGCGGC CTGGGCTCCC CAACAGGCAG GTGTGAGTGG 1560 AGGCAAGGAG ATGGCAAAGG GATCACCAGG AACTTCTCCA CCTGCTCTCC CAGCACCAAG 1620 ACCTGCCCCG ACGGCCACTG CGATGTTGTG GAGACCCAAG ACATCAACAT TTGCCCTCAG 1680 GACTGCCTCC GGGGCAGCAT TGTTGGGGGA CACGAGCCTG GGGAGCCCCG GGGGATTAAA 1740 GCTGGCTATG GCACCTGCAA CTGCTTCCCT GAGGAGGAGA AGTGCTTCTG CGAGCCCGAA 1800 GACATCCAGG ATCCACTGTG CGACGAGCTG TGCCGCACGG TGATCGCAGC CGCTGTCCTC 1860 TTCTCCTTCA TCGTCTCGGT GCTGCTGTCT GCCTTCTGCA TCCACTGCTA CCACAAGTrr 1920 GCCCACAAGC CACCCATCTC CTCAGCTGAG ATGACCTTCC GGAGGCCCGC CCAGGCCTTC 1980 CCGGTCAGCT ACTCCTCTTC CGGTGCCCGC CGGCCCTCGC TGGACTCCAT GGAGAACCAG 2040 GTCTCCGTGG ATGCCTTCAA GATCCTGGAG GATCCAAAGT GGGAATTCCC TCGGAAGAAC 2100 TTGGTTCTTG GAAAAACTCT AGGAGAAGGC GAATTTGGAA AAGTGGTCAA GGCAACGGCC 2160 TTCCATCTGA AAGGCAGAGC AGGGTACACC ACGGTGGCCG TGAAGATGCT GAAAGAGAAC 2220 GCCTCCCCGA GTGAGCTTCG AGACCTGCTG TCAGAGTTCA ACGTCCTGAA GCAGGTCAAC 2280 CACCCACATG TCATCAAATT GTATGGGGCC TGCAGCCAGG ATGGCCCGCT CCTCCTCATC 2340 GTGGAGTACG CCAAATACGG CTCCCTGCGG GGCTTCCTCC GCGAGAGCCG CAAAGTGGGG 2400 CCTGGCTACC TGGGCAGTGG AGGCAGCCGC AACTCCAGCT CCCTGGACCA CCCGGATGAG 2460 
CGGGCCCTCA CCATGGGCGA CCTCATCTCA TTTGCCTGGC AGATCTCACA GGGGATGCAG 2520 TATCTGGCCG AGATGAAGCT CGTTCATCGG GACTTGGCAG CCAGAAACAT CCTGGTAGCT 2580 GAGGGGCGGA AGATGAAGAT TTCGGATTTC GGCTTGTCCC GAGATGTTTA TGAAGAGGAT 2640 TCCTACGTGA AGAGGAGCCA GGGTCGGATT CCAGTTAAAT GGATGGCAAT TGAATCCCTT 2700 TTTGATCATA TCTACACCAC GCAAAGTGAT GTATGGTCTT TTGGTGTCCT GCTGTGGGAG 2760 ATCGTGACCC TAGGGGGAAA CCCCTATCCT GGGATTCCTC CTGAGCGGCT CTTCAACCTT 2820 CTGAAGACCG GCCACCGGAT GGAGAGGCCA GACAACTGCA GCGAGGAGAT GTACCGCCTG 2880 ATGCTGCAAT GCTGGAAGCA GGAGCCGGAC AAAAGGCCGG TGTITGCGGA CATCAGCAAA 2940 GACCTGGAGA AGATGATGGT TAAGAGGAGA GACTACTTGG ACCTTGCGGC GTCCACTCCA 3000 TCTGACTCCC TGATTTATGA CGACGGCCTC TCAGAGGAGG AGACACCGCT GGTGGACTGT 3060 AATAATGCCC CCCTCCCTCG AGCCCTCCCT TCCACATGGA TTGAAAACAA ACTCTATGGC 3120 ATGTCAGACC CGAACTGGCC TGGAGAGAGT CCTGTACCAC TCACGAGAGC TGATGGCACT 3180 AACACTGGGT TTCCAAGATA TCCAAATGAT AGTGTATATG CTAACTGGAT GCTTTCACCC 3240 TCAGCGGCAA AATTAATGGA CACGTTTGAT AGTTAACATT TCTTTGTGAA AGGTAATGGA 3300 CTCACAAGGG GAAGAAACAT GCTGAGAATG GAAAGTCTAC CGGCCCTTTC TTTGTGAACG 3360 TCACATTGGC CGAGCCGTGT TCAGTTCCCA GGTGGCAGAC TCGTTTTTGG TAGTTTGTTT 3420 TAACTTCCAA GGTGGTTTTA CTTCTGATAG CCGGTGATTT TCCCTCCTAG CAGACATGCC 3480 ACACCGGGTA AGAGCTCTGA GTCTTAGTGG TTAAGCATTC CTTTCTCTTC AGTGCCCAGC 3540 AGCACCCAGT GTTGGTCTGT GTCCATCAGT GACCACCAAC ATTCTGTGTT CACATGTGTG 3600 GGTCCAACAC TTACTACCTG GTGTATGAAA TTGGACCTGA ACTGTTGGAT TTrTCTAGTT 3660 GCCGCCAAAC AAGGCAAAAA AATTTAAACA TGAAGCACAC ACACAAAAAA GGCAGTAGGA 3720 AAAATGCTGG CCCTGATGAC cTGTCcTTAT TCAGAATGAG AGACTGCGGG GGGGGCCTGG 3780 GGGTAGTGTC AATGCCCCTC CAGGGCTGGA GGGGAAGAGG GGCCCCGAGG ATGGGCCTGG 3840 GCTCAGCATT CGAGATCTTG AGAATGATTT TTTTTTAATC ATGCAACCTT TCCTTAGGAA 3900 GACATTTGGT TTTCATCATG ATTAAGATGA TTCCTAGATT TAGCACAATG GAGAGATTCC 3960 ATGCCATCIT TACTATGTGG ATGGTGGTAT CAGGGAAGAG GGCTCACAAG ACACATTTGT 4020 CCCCCGGGCC CACCACATCA TCCTCACGTG TTCGGTACTG AGCAGCCACT ACCCCTGATG 4080 AGAACAGTAT GAAGAAAGGG GGCTGTTGGA GTCCCAGAAT TGCTGACAGC AGAGGCTTTG
4140 CTGCTGTGAA TCCCACCTGC CACCAGCCTG CAGCACACCC CACAGCCAAG TAGAGGCGAA 4200 AGCAGTGGCT CATCCTACCT GTTAGGAGCA GGTAGGGCTT GTACTCACTT TAATTTGAAT 4260 CTTATCAACT TACTCATAAA GGGACAGGCT AGCTAGCTGT GTTAGAAGTA GCAATGACAA 4320 TGACCAAGGA CTGCTACACC TCTGATTACA ATTCTGATGT GAAAAAGATG GTGTTTGGCT 4380 CTTATAGAGC CTGTGTGAAA GGCCCATGGA TCAGCTCTTC CTGTGITTGT AATTTAATGC 4440 TGCTACAAGG TGTTTCTGTT TCTTAGATTC TGACCATGAC TCATAAGCTT CTTGTCATTC 4500 TTCATTGC 4508 (2) INFORMATION FOR SEQ ID NO:35: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 218 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: ATTTATTATT TTTTGCAGAA AGAGCACTTC AAATAATTTA CAGAACCAGA ATTTAAGGTG 60 GAAGATGACA TTTAATGGAT CCTGCAGTAG TGTTTGCACA TGGAAGTCCA AAAACCTGAA 120 AGGAATATTT CAGTTCAGAG TAGTAGCTGC AAATAATCTA GGGTITGGTG AATATAGTGG 180 AATCAGTGAG AATATTATAT TAGTTGGAGG TATGTTAC 218 (2) INFORMATION FOR SEQ ID NO:36: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 111 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: TCTGATTTAT ATTATTAGAT GATTTTTGGA TACCAGAAAC AAGTTTCATA CTTACTATTA 60 TAGTTGGAAT ATTTCTGGTT GTTACAATCC CACTGACCTT TGGTAAGTAT A 111 (2) INFORMATION FOR SEQ ID NO:37: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 163 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: TCATTTTTTC CTITATAGTC TGGCATAGAA GATTAAAGAA TCAAAAAAGT GCCAAGGAAG 60 GGGTGACAGT GCTTATAAAC GAAGACAAAG AGTTGGCTGA GCTGCGAGGT CTGGCAGCCG 120 GAGTAGGCCT GGCTAATGCC TGCTATGCAA TACAGTATGT AGC 163 (2) INFORMATION FOR SEQ ID NO:38: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 190 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: AATGTTGCTA TTTTACAGTA
CTCTTCCAAC CCAAGAGGAG ATTGAAAATC TTCCTGCCTT 60 CCCTCGGGAA AAACTGACTC TGCGTCTCTT GCTGGGAAGT GGAGCCTTTG GAGAAGTGTA 120 TGAAGGAACA GCAGTGGACA TCTTAGGAGT TGGAAGTGGA GAAATCAAAG TAGCAGTGAA 180 GGTAATGTGA 190 (2) INFORMATION FOR SEQ ID NO:39: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 92 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: TGTTTTTATG TTGTATAGAC TTTGAAGAAG GGTTCCACAG ACCAGGAGAA GATTGAATTC 60 CTGAAGGAGG CACATCTGAT GAGGTAGCTC TG 92 (2) INFORMATION FOR SEQ ID NO:40: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 157 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: TGTCTTTCCA CCTTTCAGCA AATTTAATCA TCCCAACATT CTGAAGCAGC TTGGAGTITG 60 TCTGCTGAAT GAACCCCAAT ACATTATCCT GGAACTGATG GAGGGAGGAG ACCTTCTTAC 120 TTATTTGCGT AAAGCCCGGA TGGCAACGGT AGGCAGT 157 (2) INFORMATION FOR SEQ ID NO:41: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 125 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: TTCTTTCATT TCTTACAGTT TTATGGTCCT TTACTCACCT TGGTTGACCT TGTAGACCTG 60 TGTGTAGATA TTTCAAAAGG CTGTGTCTAC TTGGAACGGA TGCATTTCAT TCACAGGTAC 120 AATTC 125 (2) INFORMATION FOR SEQ ID NO:42: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 228 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: ATATAACTTC TGCCACAGGG ATCTGGCAGC TAGAAATTGC CTTGTTTCCG TGAAAGACTA 60 TACCAGTCCA CGGATAGTGA AGATYGGAGA CTTTGGACTC GCCAGAGACA TCTATAAAAA 120 TGATTACTAT AGAAAGAGAG GGGAAGGCCT GCTCCCAGTT CGGTGGATGG CTCCAGAAAG 180 TTTGATGGAT GGAATCTTCA CTACTCAATC TGATGTATGG TAAGTTTA 228 (2) INFORMATION FOR SEQ ID NO:43: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 162 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D)
TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: ACCCTCTACT ATTCTTAGGT CTITTGGAAT TCTGATTTGG GAGATTTTAA CTCTTGGTCA 60 TCAGCCTTAT CCAGCTCATT CCAACCTTGA TGTGTTAAAC TATGTGCAAA CAGGAGGGAG 120 ACTGGAGCCA CCAAGAAATT GTCCTGATGA TCTGTAAGTT AA 162 (2) INFORMATION FOR SEQ ID NO:44: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 232 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: CTTTCCCCAT ATATTTAGGT GGAATTTAAT GACCCAGTGC TGGGCTCAAG AACCCGACCA 60 AAGACCTACT TTTCATAGAA TTCAGGACCA ACTTCAGTTA TTCAGAAATT T = CTTAAA 120 TAGCATTTAT AAGTCCAGAG ATGAAGCAAA CAACAGTGGA GTCATAAATG AAAGCTTTGA 180 AGGTAAGTTT GATTCTTCAG AATTTTCTAG TTTTCGCTGC ACTGTGAACT GA 232 (2) INFORMATION FOR SEQ ID NO:45: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 540 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: CCTTTTGGTC ACAGCAGGAG ATGCGGGCTC AGGACCGGCA GCTGGCAGGG CAGCCTGTCA 60 GGCTGCGGGC CCGGCTGCAC AGACTGAAAG TGGACCAAGT CTGTCACCTG CACCAGGAGC 120 TTCTGGATGA GGCTGAGCTG GAGATGGAGT TAGAGTCTGG GACTGGCTTG CCTCTGGCCC 180 CACCGCTGCG GCATCTGGGA CTCACGCGCA TGAACATCAG TGCCAGACGC TTCACCTCTG 240 CTGACAGCAG ACTTGGGTGT CTCTTGCAGT ATCTGGGAGA AAGAAGGAAG GAAGAGGTCC 300 CCGACCCCGG AGGCTCTGGC TACCTGCTGG GGAAGGTGGG CAACACTTAG GTTTCCAAAA 360 GCTGAATTTA GAGAGCACAG GATGGAGGGG AGGAGGAGAG GAAACTCGGT GGGCCCCAAA 420 TGTCTTAATA AAAAATGCAT TGAATCCCAT CAAGGTTTCT GTAGACTGTC ACAGAGCCTA 480 AATAAATGTT GTTGTATATT CATCCTGTGT CACTGGGACT TTAGGGATTC CACAACAGGA 540 (2) INFORMATION FOR SEQ ID NO:46: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 275 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: GGATCCTGCC TGTCCGTCTC CCTGTGACCT TGGAGCTTTC CACAGGAGAA AGCGAGAAAG 60 CGTGTGGTGG GGGAGACAGC CATGCTGGAA AGCCCCCACT
CCCAGCTCAC TCAGCCTITT 120 GGTGTCTGCC CGGCAGGGGG ACCCCATTCC CGAGGAGCTT TATGAGATGC TGAGTGACCA 180 CTCGATCCGC TCCTTTGATG ATCTCCAACG CCTGCTGCAC GGAGACCCCG GAGGTAAATG 240 GAATCCCGCC CCGCGCTCCG GCCCTCCGAG GAGAC 275 (2) INFORMATION FOR SEQ ID NO:47: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 180 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: CTGCAGAGGA AGATGGGGCC GAGTTGGACC TGAACATGAC CCGCTCCCAC TCTGGAGGCG 60 AGCTGGAGAG CTTGGCTCGT GGAAGAAGGA GCCTGGGTAA GACTGAGACA CCCAACAAGG 120 GTCCTTCAAA TTAGCATGGG GGCCAGGGAA AGAGAACGGG GGCGGGCAGC CAGTCGGAGG 180 (2) INFORMATION FOR SEQ ID NO:48: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 234 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: AGGTTTCGTC TCCTCCCAGG TTCCCTGACC ATTGCTGAGC CGGCCATGAT CGCCGAGTGC 60 AAGACGCGCA CCGAGGTGTT CGAGATCTCC CGGCGCCTCA TAGACCGCAC CAACGCCAAC 120 TTCCTAGTGT GGCCGCCCTG TGTGGAGGTG CAGCGCTGCT CCGGCTGCTG CAACAACCGC 180 AACGTGCAGT GCCGCCCCAC CCAAGTGCAG CTGCGACCTG TTCAAGTGCG TAGG 234 (2) INFORMATION FOR SEQ ID NO:49: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 203 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: GTGGTCTCTA CAGAGGCATT TTGTGGCTCT GTCCTCCAGG TGAGAAAGAT CGAGATTGTG 60 CGGAAGAAGC CAATCTTTAA GAAGGCCACG GTGACGCTGG AAGACCACCT GGCATGCAAG 120 TGTGAGACAG TGGCAGCTGC ACGGCCTGTG ACCCGAAGCC CGGGGGGTTC CCAGGAGCAG 180 CGAGGTAACC ACCTTTCTAG GCT 203 (2) INFORMATION FOR SEQ ID NO:50: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 239 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: GGCAGGCCTT GGTCAGTGGG GAGAGACCTC CCCAATGGTC CACATGCTGA CGAGGTCTTT 60 CTTTTTCTTG TGCAGCCAAA ACGCCCCAAA CTCGGGTGAC CATTCGGACG
GTGCGAGTCC 120 GCCGGCCCCC CAAGGGCAAG CACCGGAAAT TCAAGCACAC GCATGACAAG ACGGCACTGA 180 AGGAGACCCT TGGAGCCTAG GGGCATCGGC AGGAGAGTGT GTGGGCAGGT GAGGGTCAG 239 (2) INFORMATION FOR SEQ ID NO:51: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1192 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: GAATTCATGC CGGGCCCAGC CGAGCGCGCA GCGGGCACGC CGCGCGCGCG GAGCAGCCGT 60 GCCCGCCGCC CGGGCCCGCC GCCAGGGCGC ACACGCTCCC GCCCCCCTAC CCGGCCCGGG 120 CGGGAGTTTG CACCTCTCCC TGCCCGGGTG CTCGAGCTGC CGTTGCAAAG CCAACTTTGG 180 AAAAAGTTTT TTGGGGGAGA CTTGGGCCTT GAGGTGCCCA GCTCCGCGCT TTCCGATTTT 240 GGGGGCCTTT CCAGAAAATG TTGCAAAAAA GCTAAGCCGG CGGGCAGAGG AAAACGCCTG 300 TAGCCGGCGA GTGAAGACGA ACCATCGACT GCCGTGTTCC TTTTCCTCTT GGAGGTTGGA 360 GTCCCCTGGG CGCCCCCACA CGGCTAGACG CCTCGGCTGG TTCGCGACGC AGCCCCCCGG 420 CCGTGGATGC TGCACTCGGG CTCGGGATCC GCCCAGGTAG CCGGCCTCGG ACCCAGGTCC 480 TGCGCCCAGG TCCTCCCCTG CCCCCCAGCG ACGGAGCCGG GGCCGGGGGC GGCGGCGCCG 540 GGGGCATGCG GGTGAGCCGC GGCTGCAGAG GCCTGAGCGC CTGATCGCCG CGGACCCGAG 600 CCGAGCCCAC CCCCCTCCCC AGCCCCCCAC CCTGGCCGCG GGGGCGGCGC GCTCGATCTA 660 CGCGTTCGGG GCCCCGCGGG GCCGGGCCCG GAGTCGGCAT GAATCGCTGC TGGGCGCTCT 720 TCCTGTCTCT CTGCTGCTAC CTGCGTCTGG TCAGCGCCGA GGTGAGTGCC ACGGCGGCTG 780 GGGCTGGTTC TTCATTCATT ACCTTCGCCC CCCCCTTCTG ACCGCCCCCT CCTCTCCCTG 840 CAGTGAACTT TGGACCCTTG CACCCGCGAG CCTGACGCCG GGCGCTGGGT GACCTCTTCG 900 GGCTGGGAGC GAGGTCCGGG GGTGACAGGC TCTAAGGGAA GGCAACAGCG GTGGCTTTCT 960 TTCCAACCGG CGGGCGAATC TGGCTCCCTA AGCCGTTCCG TGTCGGGGGA GGGTGTGTGT 1020 GGCCCTGTCC CCCACCCTTT GGGAACCCGA GAACAAGCCC CTCCCGGCCG GGGGAGAGGG 1080 GGTGGGGTGG TGCCCAGGGT GCAGAAGGCA GCGCGTCCTC CCGAGCCCAC TTCGGCGCCA 1140 GCCTCGGCTT AGGCTCTGTC CTGCCATCGG CTTGCCCAGG AGGTGCAAGC TT 1192 (2) INFORMATION FOR SEQ ID NO:52: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 596 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE
DESCRIPTION: SEQ ID NO:52: TGGGGGAGAC AGACATAGAG ACAAGCAGGT CCAACTCAAA GCAAGCTGGG GTTCCTGTTG 60 GGGGTTGAGG GTACAGGGAC TGAGCTGGGC CTCAGAGGCT TCGGCAGGTC CAGACCCCGA 120 GGCCTTTGTG CTCCTGATCA TCAGGCCTGG ATCCTGTCTG TCCGTCTCCC TGTGACCTTG 180 GAGCTTTCCA CAGGAGAAAG CGAGAAAGCC CCCACTCCCA GCTCACTCAG CCTTTTGGTG 240 TCTGCCCGGC AGGGGGACCC CATTCCCGAG GAGCTTTATG AGATGCTGAG TGACCACTCG 300 ATCCGCTCCT TTGATGATCT CCAACGCCTG CTGCACGGAG ACCCCGGAGG TAAATGGAAT 360 CCCGCCCCGC GCTCCGGCCC TCCGAGGAGA CTTTAAGAGA TCTGGGAGGG GCAGGACAGG 420 AGGCATCCCT CCTTCTTGAC GTCTGGAGAA CTAGAGGCCC ATGGGCGCCC AGAGAGAGCG 480 TGGCCACACC CATCCAGGGC AGGGCCGAGT CAGCAGGCGG GTTGGTACTG GGACTTGGGG 540 TGTGGCAGGA GAAGCACCCA CGTGTGACTC CGGGTTGGTA CCGGGGTGGG GTACAA 596 (2) INFORMATION FOR SEQ ID NO:53: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 120 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: TGACTTCTCC TGCAGAGGAA GATGGGGCCG AGTTGGACCT GAACATGACC CGCTCCCACT 60 CTGGAGGCGA GCTGGAGAGC TTGGCTCGTG GAAGAAGGAG CCTGGGTAAG ACTGAGACAC 120 (2) INFORMATION FOR SEQ ID NO:54: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 236 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: TTCATCTCCT CCCAGGTTCC CTGACCATTG CTGAGCCGGC CATGATCGCC GAGTGCAAGA 60 CGCGCACCGA GGTGTTCGAG ATCTCCCGGC GCCTCATAGA CCGCACCAAC GCCAACTTCC 120 TGGTGTGGCC GCCCTGTGTG GAGGTGCAGC GCTGCTCCGG CTGCTGCAAC AACCGCAACG 180 TGCAGTGCCG CCCCACCCAG GTGCAGCTGC GACCTGTCCA GGTGCGTAGG CTCCGG 236 (2) INFORMATION FOR SEQ ID NO:55: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 175 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: CCTCTCCAGC TCCAGGTGAG AAAGATCGAG ATTGTGCGGA AGAAGCCAAT CTITAAGAAG 60 GCCACGGTGA CGCTGGAAGA CCACCTGGCA TGCAAGTGTG AGACAGTGGC AGCTGCACGG 120 CCTGTGACCC
GAAGCCCGGG GGGTTCCCAG GAGCAGCGAG GTAACCACCT TTCCA 175 (2) INFORMATION FOR SEQ ID NO:56: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 183 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: CTTTTTCTTG TGCAGCCAAA ACGCCCCAAA CTCGGGTGAC CATTCGGACG GTGCGAGTCC 60 GCCGGCCCCC CAAGGGCAAG CACCGGAAAT TCAAGCACAC GCATGACAAG ACGGCACTGA 120 AGGAGACCCT TGGAGCCTAG GGGCATCGGC AGGAGAGTGT GTGGGCAGGT GAGGGCCAGG 180 CGG 183 (2) INFORMATION FOR SEQ ID NO:57: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 464 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: GGTTATTTAA TATGGTATTT GCTGTATTGC CCCCATGGGG CCTTCGGAGC GATAATATTG 60 TTTCCCTCGT CCGTCTGTCT CGATGCCTGA TTCGGACGGC CAATGGTGCT TCCCCCACCC 120 CTCCACGTGT CCGTCCACCC TTCCATCAGC GGGTCTCCTC CCAGCGGCCT CCGGTCTTGC 180 CCAGCAGCTC AAGAAGAAAA AGAAGGACTG AACTCCATCG CCATCTTCTT CCCTTAACTC 240 CAAGAACTTG GGATAAGAGT GTGAGAGAGA CTGATGGGGT CGCTCTTTGG GGGAAACGGG 300 TTCCTTCCCC TGCACCTGGC CTGGGCCACA CCTGAGCGCT GTGGACTGTC CTGAGGAGCC 360 CTGAGGACCT CTCAGCATAG CCTGCCTGAT CCCTGAACCC CTAGCCAGCT CTGAGGGGAG 420 GCACCTCCAG GCAGGCCAGG CTGCCTCGGA CTCCATGGCT AGGA 464 (2) INFORMATION FOR SEQ ID NO:58: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 139 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: CCCAGCTCAC TCAGCCTTTT GGTGTCTGCC CGGCAGGGGG ACCCCATTCC CGAGGAGCTT 60 TATGAGATGC TGAGTGACCA CTCGATCCGC TCCTITGATG ATCTCCAACG CCTGCTGCAC 120 GGAGACCCCG GAGGTAAAT 139 (2) INFORMATION FOR SEQ ID NO:59: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 102 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59: CTGCAGAGGA AGATGGGGCC GAGTTGGACC TGAACATGAC CCGCTCCCAC TCTGGAGGCG 60 AGCTGGAGAG
CTTGGCTCGT GGAAGAAGGA GCCTGGGTAA GA 102 (2) INFORMATION FOR SEQ ID NO:60: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 218 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60: TCCCAGGTTC CCTGACCATT GCTGAGCCGG CCATGATCGC CGAGTGCAAG ACGCGCACCG 60 AGGTGTTCGA GATCTCCCGG CGCCTCATAG ACCGCACCAA CGCCAACTTC CTGGTGTGGC 120 CGCCCTGTGT GGAGGTGCAG CGCTGCTCCG GCTGCTGCAA CAACCGCAAC GTGCAGTGCC 180 GCCCCACCCA GGTGCAGCTG CGACCTGTCC AGGTGCGT 218 (2) INFORMATION FOR SEQ ID NO:61: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 157 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: CTCCAGGTGA GAAAGATCGA GATTGTGCGG AAGAAGCCAA TCTITAAGAA GGCCACGGTG 60 ACGCTGGAAG ACCACCTGGC ATGCAAGTGT GAGACAGTGG CAGCTGCACG GCCTGTGACC 120 CGAAGCCCGG GGGGTTCCCA GGAGCAGCGA GGTAACC 157 (2) INFORMATION FOR SEQ ID NO:62: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 171 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62: GTGCAGCCAA AACGCCCCAA ACTCGGGTGA CCATTCGGAC GGTGCGAGTC CGCCGGCCCC 60 CCAAGGGCAA GCACCGGAAA TTCAAGCACA CGCATGACAA GACGGCACTG AAGGAGACCC 120 TTGGAGCCTA GGGGCATCGG CAGGAGAGTG TGTGGGCAGG TGAGGGCCAG G 171 (2) INFORMATION FOR SEQ ID NO:63: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2875 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: GGTTTCAAAT TGGCCCTTTG GCCTCTGGAG CAAATTCAAA TGTAACTCTT CCCCAATCCC 60 CCTTCTCTTC TTCCAGATTA ATTAAAAGAA GAATGAACTA TAATCCTTGA AGATAACTGG 120 GCAATTTTTT AAGTCGGAGG CTGTTCTTAC TGGTGTGAGG ATTTACACAC GTCTTCAGTT 180 TTTCAGCACA GACCAGCAGA CCATCATTTT TAGAGGAAAT ACTCCCTCTG CCCTCCTTIT 240 TGGTTTCCTT GGTGGTAAAG ATTAAATTTG GTTGCATCAT TTTGACTTGT GTTTGAGTCT 300 AGATTTTATG GCACAAGGAA
TGGCATAAAC TTTTCATGTG TTTTGGTTAA AACAAACCAG 360 ACCATTGCAT TGACCCTGGA CATCTTTAAT TGAGAAATTG GTAACTTTAT TTTAATATGT 420 ATATCTGAAG AATTCAAGAA AACAAAGGCA TCCTCAGAGG TGTGCCTCTT TTCTTTATTA 480 TTAGAGGCAA AACGAACAAT TTTATAGGAT TTGTAGTGAA ATTATACCAG ATTATAAGGA 540 GAACCAAAAC TAAGTCGCAA AATTTATTAA TTTAAGGGGC TCTCGCTTTG AAAGTTTGAG 600 AGTAAGTTAC GATAGGCATT TGTATCCATT CATTACTTTC CTCTTTTCAA ATAAGCAACT 660 AAATAGAAAT GCTAATCTCA GACTTAATTA TTTAACAGAA GAGTGTACCA TGGAAAACCT 720 CCAGACAAAT TTCTCCTTGG TTCAGGGCTC AACTAAAAAA CTGAATGGGA TGGGAGATGA 780 TGGCAGCCCC CCAGCGAAAA AAATGATAAC GGACATTCAT GTAAATGGAA AAACGATAAA 840 CAAGGTGCCA ACAGTTAAGA AGGAACACTT GGATGACTAT GGAGAAGCAC CAGTGGAAAC 900 TGATGGAGAG CATGTTAAGC GAACCTGTAC TTCTGTTCCT GAAACTTTGC ATTTAAATCC 960 CAGTTTGAAA CACACATTGG CACAATTCCA TTTAAGTAGT CAGAGCTCGC TGGGTGGACC 1020 AGCAGCATTT TCTGCTCGGC ATTCCCAAGA AAGCATGTCG CCTACTGTAT TTCTGCCTCT 1080 TCCATCACCT CAGGTTCTTC CTGGCCCATT GCTCATCCCT TCAGATAGCT CCACAGAACT 1140 CACTCAGACT GTGTTGGAAG GGGAATCTAT TTCTTGTTTT CAAGTTGGAG GAGAAAAGAG 1200 ACTCTGTTTG CCCCAAGTCT TAAATTCTGT TCTCCGAGAA TTTACACTCC AGCAAATAAA 1260 TACAGTGTGT GATGAACTGT ACATATATTG TTCAAGGTGT ACTTCAGACC AGCTTCATAT 1320 CTTAAAGGTA CTGGGCATAC TTCCATTCAA TGCCCCATCC TGTGGGCTGA TTACATTAAC 1380 TGATGCACAA AGATTATGTA ATGCTTTATT GCGGCCACGA ACTTTTCCTC AAAATGGTAG 1440 CGTACTTCCT GCTAAAAGCT CATTGGCCCA GTTAAAGGAA ACTGGCAGTG CCTTTGAAGT 1500 GGAGCATGAA TGCCTAGGCA AATGTCAGGG TTTATTTGCA CCCCAGTTTT ATGTTCAGCC 1560 TGATGCTCCG TGTATTCAAT GTCTGGAGTG TTGTGGAATG TTTGCACCCC AGACGTTTGT 1620 GATGCATTCT CACAGATCAC CTGACAAAAG AACTTGCCAC TGGGGCTTTG AATCAGCTAA 1680 ATGGCATTGC TATCTTCATG TGAACCAAAA ATACTTAGGA ACACCTGAAG AAAAGAAACT 1740 GAAGATAATT TTAGAAGAAA TGAAGGAGAA GTTTAGCATG AGAAGTGGAA AGAGAAATCA 1800 ATCCAAGGCA AGTTTTTAT ATCAATTTTT AATAATGGTA ATGGTTTACT TTGAAATGAA 1860 AATTCTATGT TTAGTGTGTA ACTTAACCTG TATGTTGAAC ATTGCTCATG CAACAACAAC 1920 AAAATACCGA TTGATATATT TGTATTGCAG TTTTAGGCC ATAAAGTGCT TTGCAGTATG 1980 TTTCCTCATT TGACTTTCCA AACATCCTGT GAGAGAAGTA 
AGACTATTAT TCCGTTTTAC 2040 AGATAAAGTG AATGAAGCTC AGAGAGATAA AATGACTTTC CCAAAATTAT GTAGCCAGGG 2100 AGTGGAGGAG TTAGGGCTTC mTrTTT TTTTGTGCT TTTAGTAGAG GCCAGGTTTC 2160 AGCATGTTGG CCAGGCTGGT CTTGAACTCC TGACCGCGTG ATCCGCCCAC CTTGGCCTCC 2220 CAAAGGGCTG GGATTACATC CTTGAGCCCC TGTGTCCAGC CAGGGCTTCT TTTTCTTATC 2280 CTCTTTGGCA CACATCTTGC TTCTTGACCA CTACATCTGT TGTTTTTCTA GGACTCGATA 2340 ATTTGCGCTT TGGTGTTATC TCCATTTGCA AATGGTACAA TGGCCACAAT TCCCGTGGGC 2400 TCAAAACAGC ATTTTTCAGA GATACACCTA TGATTTCTGA TGTTTCTATG TTTGGATATT 2460 CAGGCTTGCT CAATATTTGA AACAAATGGA AAAGACATGT ATCTGAAGAA TTTGTGATTT 2520 GAAAGGAATA ACAAAAAAAA TGACAGCTAG AGTAAGGAAA AGTTATTTTA AACTAATAAA 2580 ATATTAATAT AAAAACCTGC CGGGCTCAGT GGCTCACACC TGTAATCCCA ACACTTTGGG 2640 GGGCTGAAGT AGGTGGATCA CCTGAGGTCA GGAGTTTGAG ACCAGCCTGG CCAACATGGT 2700 GAAATCCCAT CTCTGCTGAA AATACAAAAA TTAGACGGAT GTGGTGTCGC ACACTTGTAA 2760 TCCCAGCTAC TCAGGAGCTG AGGCAGGAGA ATCGCTTGAA CCCCGGAGGC GGAGGTTGTA 2820 GTGAGCCGAG ATTGTGCCAT TGCGCTCCAG CGTAGGCGTC GAGGGAAACT CCATC 2875 (2) INFORMATION FOR SEQ ID NO:64: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2840 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: GGTTTCAAAT TGGCCCTTTG GCCTCTGGAG CAAATTCAAA TGTAACTCTT CCCCAATCCC 60 CCTTCTCTTC TTCCAGATTA ATTAAAAGAA GAATGAACTA TAATCCTTGA AGATAACTGG 120 GCAATTTTTT AAGTCGGAGG CTGTTCTTAC TGGTGTGAGG ATTTACACAC GTCTTCAGTT 180 TTTCAGCACA GACCAGCAGA CCATCATTTT TAGAGGAAAT ACTCCCTCTG CCCTCCTTTT 240 TGGTTTCCTT GGTGGTAAAG ATTAAATTTG GTTGCATCAT TTTGACTTGT GTTTGAGTCT 300 AGATTTTATG GCACAAGGAA TGGCATAAAC TTTTCATGTG TTTTGGTTAA AACAAACCAG 360 ACCATTGCAT TGACCCTGGA CATCTTTAAT TGAGAAATTG GTAACTTTAT TTTAATATGT 420 ATATCTGAAG AATTCAAGAA AACAAAGGCA TCCTCAGAGG TGTGCCTCTT TTCTTTATTA 480 TTAGAGGCAA AACGAACAAT TTTATAGGAT TTGTAGTGAA ATTATACCAG ATTATAAGGA 540 GAACCAAAAC TAAGTCGCAA AATTTATTAA TTTAAGGGGC TCTCGCTTTG AAAGTTTGAG 600 AGTAAGTTAC GATAGGCATT TGTATCCATT CATTACTTTC CTCTTTTCAA 
ATAAGCAACT 660 AAATAGAAAT GCTAATCTCA GACTTAATTA TTTAACAGAA GAGTGTACCA TGGAAAACCT 720 CCAGACAAAT TTCTCCTTGG TTCAGGGCTC AACTAAAAAA CTGAATGGGA TGGGAGATGA 780 TGGCAGCCCC CCAGCGAAAA AAATGATAAC GGACATTCAT GTAAATGGAA AAACGATAAA 840 CAAGGTGCCA ACAGTTAAGA AGGAACACTT GGATGACTAT GGAGAAGCAC CAGTGGAAAC 900 TGATGGAGAG CATGTTAAGC GAACCTGTAC TTCTGTTCCT GAAACTTTGC ATTTAAATCC 960 CAGTTTGAAA CACACATTGG CACAATTCCA TTTAAGTAGT CAGAGCTCGC TGGGTGGACC 1020 AGCAGCATTT TCTGCTCGGC ATTCCCAAGA AAGCATGTCG CCTACTGTAT TTCTGCCTCT 1080 TCCATCACCT CAGGTTCTTC CTGGCCCATT GCTCATCCCT TCAGATAGCT CCACAGAACT 1140 CACTCAGACT GTGTTGGAAG GGGAATCTAT TTCTTGTTTT CAAGTTGGAG GAGAAAAGAG 1200 ACTCTGTTTG CCCCAAGTCT TAAATTCTGT TCTCCGAGAA TTTACACTCC AGCAAATAAA 1260 TACAGTGTGT GATGAACTGT ACATATATTG TTCAAGGTGT ACTTCAGACC AGCTTCATAT 1320 CTTAAAGGTA CTGGGCATAC TTCCATTCAA TGCCCCATCC TGTGGGCTGA TTACATTAAC 1380 TGATGCACAA AGATTATGTA ATGCTTTATT GCGGCCACGA ACTTTTCCTC AAAATGGTAG 1440 CGTACTTCCT GCTAAAAGCT CATTGGCCCA GTTAAAGGAA ACTGGCAGTG CCTTTGAAGT 1500 GGAGCATGAA TGCCTAGGCA AATGTCAGGG TTTATTTGCA CCCCAGTTT ATGTTCAGCC 1560 TGATGCTCCG TGTATTCAAT GTCTGGAGTG TTGTGGAATG TTTGCACCCC AGACGTTTGT 1620 GATGCATTCT CACAGATCAC CTGACAAAAG AACTTGCCAC TGGGGCTTTG AATCAGCTAA 1680 ATGGCATTGC TATCTTCATG TGAACCAAAA ATACTTAGGA ACACCTGAAG AAAAGAAACT 1740 GAAGATAATT TTAGAAGAAA TGAAGGAGAA GTTTAGCATG AGAAGTGGAA AGAGAAATCA 1800 ATCCAAGACA GATGCACCAT CAGGAATGGA ATTACAGTCA TGGTATCCTG TTATAAAGCA 1860 GGAAGGTGAC CATGTTTCTC AGACACATTC ATTTTTACAC CCCAGCTACT ACTTATACAT 1920 GTGTGATAAA GTGGTTGCCC CAAATGTGTC ACTTACTTCT GCTGTATCCC AGTCTAAAGA 1980 GCTCACAAAG ACAGAGGCAA GTAAGTCCAT ATCAAGACAG TCAGAGAAGG CTCACAGTAG 2040 TGGTAAACTT CAAAAAACAG TGTCTTATCC AGATGTCTCA CTTGAGGAAC AGGAGAAAAT 2100 GGATTTAAAA ACAAGTAGAG AATTATGTAG CCGTTTAGAT GCATCAATCT CAAATAATTC 2160 TACAAGTAAA AGGAAATCTG AGTCTGCCAC TTGCAACTTA GTCAGAGACA TAAACAAAGT 2220 GGGAATTGGC CTTGTTGCTG CCGCTTCATC TCCGCTTCTT GTGAAAGATG TCATTTGTGA 2280 GGATGATAAG GGAAAAATCA TGGAAGAAGT AATGAGAACT TATTTAAAAC 
AACAGGAAAA 2340 ACTAAACTTG ATTTTGCAAA AGAAGCAACA ACTTCAGATG GAAGTAAAAA TGTTGAGTAG 2400 TTCAAAATCT ATGAAGGAAC TCACTGAAGA ACAGCAGAAT TTACAGAAAG AGCTTGAATC 2460 TTTGCAGAAT GAACATGCTC AAAGAATGGA AGAATTTTAT GTTGAACAGA AAGACTTAGA 2520 GAAAAAATTG GAGCAGATAA TGAAGCAAAA ATGTACCTGT GACTCAAATT TAGAAAAAGA 2580 CAAAGAGGCT GAATATGCAG GACAGTTGGC AGAACTGAGG CAGAGATTGG ACCATGCTGA 2640 GGCCGATAGG CAAGAACTCC AAGATGAACT CAGACAGGAA CGGGAAGCAA GACAGAAGTT 2700 AGAGATGATG ATAAAAGAGC TAAAGCTGCA AATTCTGAAA TCATCAAAGA CTGCTAAAGA 2760 ATAGAAACTG TTAAAGAGAT TCATCTGTGT ATTACTGACA AGGTTTTTTT TGTTTGTTGT 2820 TTGCTTTGGT AATTGAATTC 2840 (2) INFORMATION FOR SEQ ID NO:65: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1364 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: AAAATCAGGA ACTTGTGCTG GCCCTGCAAT GTCAAGGGAG GGGGCTCACC CAGGGCTCCT 60 GTAGCTCAGG GGGCAGGCCT GAGCCCTGCA CCCGCCCCAC GACCGTCCAG CCCCTGACGG 120 GCACCCCATC CTGAGGGGCT CTGCATTGGC CCCCACCGAG GCAGGGGATC TGACCGACTC 180 GGAGCCCGGC TGGATGTTAC AGGCGTGCAA AATGGAAGGG TTTCCCCTCG TCCCCCCTCC 240 ATCAGAAGAC CTGGTGCCCT ATGACACGGA TCTATACCAA CGCCAAACGC ACGAGTATTA 300 CCCCTATCTC AGCAGTGATG GGGAGAGCCA TAGCGACCAT TACTGGGACT TCCACCCCCA 360 CCACGTGCAC AGCGAGTTCG AGAGCTTCGC CGAGAACAAC TTCACGGAGC TCCAGAGCGT 420 GCAGCCCCCG CAGCTGCAGC AGCTCTACCG CCACATGGAG CTGGAGCAGA TGCACGTCCT 480 CGATACCCCC ATGGTGCCAC CCCATCCCAG TCTTGGCCAC CAGGTCTCCT ACCTGCCCCG 540 GATGTGCCTC CAGTACCCAT CCCTGTCCCC AGCCCAGCCC AGCTCAGATG AGGAGGAGGG 600 CGAGCGGCAG AGCCCCCCAC TGGAGGTGTC TGACGGCGAG GCGGATGGCC TGGAGCCCGG 660 GCCTGGGCTC CTGCCTGGGG AGACAGGCAG CAAGAAGAAG ATCCGCCTGT ACCAGTTCCT 720 GTTGGACCTG CTCCGCAGCG GCGACATGAA GGACAGCATC TGGTGGGTGG ACAAGGACAA 780 GGGCACCTTC CAGTTCTCGT CCAAGCACAA GGAGGCGCTG GCGCACCGCT GGGGCATCCA 840 GAAGGGCAAC CGCAAGAAGA TGACCTACCA GAAGATGGCG CGCGCGCTGC GCAACTACGG 900 CAAGACGGGC GAGGTCAAGA AGGTGAAGAA GAAGCTCACC TACCAGTTCA GCGGCGAAGT 960 GCTGGGCCGC GGGGGCCTGG 
CCGAGCGGCG CCACCCGCCC CACTGAGCCC GCAGCCCCCG 1020 CCGGCCCCGC CAGGCCTCCC CGCTGGCCAT AGCATTAAGC CCTCGCCCGG CCCGGACACA 1080 GGGAGGACGC TCCCGGGGCC CAGAGGCAGG ACTGTGGCGG GCCGGGCTCC GTCACCCGCC 1140 CCTCCCCCCA CTCCAGGCCC CCTCCACATC CCGCTTCGCC TCCCTCCAGG ACTCCACCCC 1200 GGCTCCCGAC GCCAGCTGGG CGTCAGACCC ACCGGCAACC TTGCAGAGGA CGACCCGGGG 1260 TACTGCCTTG GGAGTCTCAA GTCCGTATGT AAATCAGATC TCCCCTCTCA CCCCTCCCAC 1320 CCATTAACCT CCTCCCAAAA AACAAGTAAA GTTATTCTCA ATCC 1364 (2) INFORMATION FOR SEQ ID NO:66: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 271 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: CTGCCAGGAC CATGGGTAGC AACAAGAGCA AGCCCAAGGA TGCCAGCCAG CGGCGCCGCA 60 GCCTGGAGCC CGCCGAGAAC GTGCACGGCG CTGGCGGGGG CGCTTTCCCC GCCTCGCAGA 120 CCCCCAGCAA GCCAGCCTCG GCCGACGGCC ACCGCGGCCC CAGCGCGGCC TTCGCCCCCG 180 CGGCCGCCGA GCCCAAGCTG TTCGGAGGCT TCAACTCCTC GGACACCGTC ACCTCCCCGC 240 AGAGGGCGGG CCCGCTGGCC GGTCAGTGCG C 271 (2) INFORMATION FOR SEQ ID NO:67: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 118 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: CTCTCTGCAG GTGGAGTGAC CACCTTTGTG GCCCTCTATG ACTATGAGTC TAGGACGGAG 60 ACAGACCTGT CCTTCAAGAA AGGCGAGCGG CTCCAGATTG TCAACAACAC GTGAGTGC 118 (2) INFORMATION FOR SEQ ID NO:68: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 113 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: CCTGCTCAGA GAGGGAGACT GGTGGCTGGC CCACTCGCTC AGCACAGGAC AGACAGGCTA 60 CATCCCCAGC AACTACGTGG CGCCCTCCGA CTCCATCCAG GCTGAGGAGT TAG 113 (2) INFORMATION FOR SEQ ID NO:69: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 115 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: CCCCCAGGTG 
GTATTTTGGC AAGATCACCA GACGGGAGTC AGAGCGGTTA CTGCTCAATG 60 CAGAGAACCC GAGAGGGACC TTCCTCGTGC GAGAAAGTGA GACCACGAAA GGTAC 115 (2) INFORMATION FOR SEQ ID NO:70: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 164 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: GCCCCGCAGG TGCCTACTGC CTCTCAGTGT CTGACTTCGA CAACGCCAAG GGCCTCAACG 60 TGAAGCACTA CAAGATCCGC AAGCTGGACA GCGGCGGCTT CTACATCACC TCCCGCACCC 120 AGTTCAACAG CCTGCAGCAG CTGGTGGCCT ACTACTCCAG TGAG 164 (2) INFORMATION FOR SEQ ID NO:71: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 170 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: CCTCCTCAGA ACACGCCGAT GGCCTGTGCC ACCGCCTCAC CACCGTGTGC CCCACGTCCA 60 AGCCGCAGAC TCAGGGCCTG GCCAAGGATG CCTGGGAGAT CCCTCGGGAG TCGCTGCGGC 120 TGGAGGTCAA GCTGGGCCAG GGCTGCTTTG GCGAGGTGTG GATGGGTAAG 170 (2) INFORMATION FOR SEQ ID NO:72: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 194 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: CCTCAACAGG GACCTGGAAC GGTACCACCA GGGTGGCCAT CAAAACCCTG AAGCCTGGCA 60 CGATGTCTCC AGAGGCCTTC CTGCAGGAGG CCCAGGTCAT GAAGAAGCTG AGGCATGAGA 120 AGCTGGTGCA GTTGTATGCT GTGGTTTCAG AGGAGCCCAT TTACATCGTC ACGGAGTACA 180 TGAGCAAGGG TGAG 194 (2) INFORMATION FOR SEQ ID NO:73: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 91 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: TCTGCCCAGG GAGTTTGCTG GACTTTCTCA AGGGGGAGAC AGGCAAGTAC CTGCGGCTGC 60 CTCAGCTGGT GGACATGGCT GCTCAGGTGA G 91 (2) INFORMATION FOR SEQ ID NO:74: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 165 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE 
DESCRIPTION: SEQ ID NO:74: CTGCAGATCG CCTCAGGCAT GGCGTACGTG GAGCGGATGA ACTACGTCCA CCGGGACCTT 60 CGTGCAGCCA ACATCCTGGT GGGAGAGAAC CTGGTGTGCA AAGTGGCCGA CTTTGGGCTG 120 GCTCGGCTCA TTGAAGACAA TGAGTACACG GCGCGGCAAG GTGGG 165 (2) INFORMATION FOR SEQ ID NO:75: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 146 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: TTCCTGCAGG TGCCAAATTC CCCATCAAGT GGACGGCTCC AGAAGCTGCC CTCTATGGCC 60 GCTTCACCAT CAAGTCGGAC GTGTGGTCCT TCGGGATCCT GCTGACTGAG CTCACCACAA 120 AGGGACGGGT GCCCTACCCT GGTAAG 146 (2) INFORMATION FOR SEQ ID NO:76: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 255 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: CTGCCACAGG GATGGTGAAC CGCGAGGTGC TGGACCAGGT GGAGCGGGGC TACCGGATGC 60 CCTGCCCGCC GGAGTGTCCC GAGTCCCTGC ACGACCTCAT GTGCCAGTGC TGGCGGAAGG 120 AGCCTGAGGA GCGGCCCACC TTCGAGTACC TGCAGGCCTT CCTGGAGGAC TACTTCACGT 180 CCACCGAGCC CCAGTACCAG CCCGGGGAGA ACCTCTAGGC ACAGGCGGGC CCAGACCGGC 240 TTCTCGGCTT GGATC 255 (2) INFORMATION FOR SEQ ID NO:77: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2647 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: GCCGCGCTGG TGGCGGCGGC GCGTCGTTGC AGTTGCGCCA TCTGTCAGGA GCGGAGCCGG 60 CGAGGAGGGG GCTGCCGCGG GCGAGGAGGA GGGGTCGCCG CGAGCCGAAG GCCTTCGAGA 120 CCCGCCCGCC GCCCGGCGGC GAGAGTAGAG GCGAGGTTGT TGTGCGAGCG GCGCGTCCTC 180 TCCCGCCCGG GCGCGCCGCG CTTCTCCCAG CGCACCGAGG ACCGCCCGGG CGCACACAAA 240 GCCGCCGCCC GCGCCGCACC GCCCGGCGGC CGCCGCCCGC GCCAGGGAGG GATTCGGCCG 300 CCGGGCCGGG GACACCCCGG CGCCGCCCCC TCGGTGCTCT CGGAAGGCCC ACCGGCTCCC 360 GGGCCCGCCG GGGACCCCCC GGAGCCGCCT CGGCCGCGCC GGAGGAGGGC GGGGAGAGGA 420 CCATGTGAGT GGGCTCCGGA GCCTCAGCGC CGCGCAGTTT TTTTGAAGAA GCAGGATGCT 480 GATCTAAACG TGGAAAAAGA CCAGTCCTGC 
CTCTGTTGTA GAAGACATGT GGTGTATATA 540 AAGTTTGTGA TCGTTGGCGG AAATTTTGGA ATTTAGATAA TGGGCTGTGT GCAATGTAAG 600 GATAAAGAAG CAACAAAACT GACGGAGGAG AGGGACGGCA GCCTGAACCA GAGCTCTGGG 660 TACCGCTATG GCACAGACCC CACCCCTCAG CACTACCCCA GCTTCGGTGT GACCTCCATC 720 CCCAACTACA ACAACTTCCA CGCAGCCGGG GGCCAAGGAC TCACCGTCTT TGGAGGTGTG 780 AACTCTTCGT CTCATACGGG GACCTTGCGT ACGAGAGGAG GAACAGGAGT GACACTCTTT 840 GTGGCCCTTT ATGACTATGA AGCACGGACA GAAGATGACC TGAGTTTTCA CAAAGGAGAA 900 AAATTTCAAA TATTGAACAG CTCGGAAGGA GATTGGTGGG AAGCCCGCTC CTTGACAACT 960 GGAGAGACAG GTTACATTCC CAGCAATTAT GTGGCTCCAG TTGACTCTAT CCAGGCAGAA 1020 GAGTGGTACT TTGGAAAACT TGGCCGAAAA GATGCTGAGC GACAGCTATT GTCCTTTGGA 1080 AACCCAAGAG GTACCTTTCT TATCCGCGAG AGTGAAACCA CCAAAGGTGC CTATTCACTT 1140 TCTATCCGTG ATTGGGATGA TATGAAAGGA GACCATGTCA AACATTATAA AATTCGCAAA 1200 CTTGACAATG GTGGATACTA CATTACCACC CGGGCCCAGT TTGAAACACT TCAGCAGCTT 1260 GTACAACATT ACTCAGAGAG AGCTGCAGGT CTCTGCTGCC GCCTAGTAGT TCCCTGTCAC 1320 AAAGGGATGC CAAGGCTTAC CGATCTGTCT GTCAAAACCA AAGATGTCTG GGAAATCCCT 1380 CGAGAATCCC TGCAGTTGAT CAAGAGACTG GGAAATGGGC AGTTTGGGGA AGTATGGATG 1440 GGTACCTGGA ATGGAAACAC AAAAGTAGCC ATAAAGACTC TTAAACCAGG CACAATGTCC 1500 CCCGAATCAT TCCTTGAGGA AGCGCAGATC ATGAAGAAGC TGAAGCACGA CAAGCTGGTC 1560 CAGCTCTATG CAGTGGTGTC TGAGGAGCCC ATCTACATCG TCACCGAGTA TATGAACAAA 1620 GGAAGTTTAC TGGATTTCTT AAAAGATGGA GAAGGAAGAG CTCTGAAATT ACCAAATCTT 1680 GTGGACATGG CAGCACAGGT GGCTGCAGGA ATGGCTTACA TCGAGCGCAT GAATTATATC 1740 CATAGAGATC TGCGATCAGC AAACATTCTA GTGGGGAATG GACTCATATG CAAGATTGCT 1800 GACTTCGGAT TGGCCCGATT GATAGAAGAC AATGAGTACA CAGCAAGACA AGGTGCAAAG 1860 TTCCCCATCA AGTGGACGGC CCCCGAGGCA GCCCTGTACG GGAGGTTCAC AATCAAGTCT 1920 GACGTGTGGT CTTTTGGAAT CTTACTCACA GAGCTGGTCA CCAAAGGAAG AGTGCCATAC 1980 CCAGGCATGA ACAACCGGGA GGTGCTGGAG CAGGTGGAGC GAGGCTACAG GATGCCCTGC 2040 CCGCAGGACT GCCCCATCTC TCTGCATGAG CTCATGATCC ACTGCTGGAA AAAGGACCCT 2100 GAAGAACGCC CCACTTTTGA GTACTTGCAG AGCTTCCTGG AAGACTACTT TACCGCGACA 2160 GAGCCCCAGT ACCAACCTGG TGAAAACCTG TAAGGCCCGG 
GTCTGCGGAG AGAGGCCTTG 2220 TCCCAGAGGC TGCCCCACCC CTCCCCATTA GCTTTCAATT CCGTAGCCAG CTGCTCCCCA 2280 GCAGCGGAAC CGCCCAGGAT CAGATTGCAT GTGACTCTGA AGCTGACGAA CTTCCATGGC 2340 CCTCATTAAT GACACTTGTC CCCAAATCCG AACCTCCTCT GTGAAGCATT CGAGACAGAA 2400 CCTTGTTATT TCTCAGACTT TGGAAAATGC ATTGTATCGA TGTTATGTAA AAGGCCAAAC 2460 CTCTGTTCAG TGTAAATAGT TACTCCAGTG CCAACAATCC TAGTGCTTTC CT1TRTTAAA 2520 AATGCAAATC CTATGTGATT TTAACTCTGT CTTCACCTGA TTCAACTAAA AAAAAAAAGT 2580 ATTATTTTCC AAAAGTGGCC TCTTTGTCTA AAACAATAAA ATTTTTTTTC ATGl"rrrAAC 2640 AAAAACC 2647 (2) INFORMATION FOR SEQ ID NO:78: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2301 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: GTCGACCGGA GGGCAGGAGG AGCAGGAGGA GCAGGAGCAG GAGGAGCAGG AGGAGCAGGA 60 GGAGCAGGAG GAGCAGGAGG AGCAGGAACA GGAGGAGGAG GAGGAGGAGA AGGAGGAGCA 120 GGAAGAGCAG GAGGAGGAGG AGCAGGAGCA GGAGGAGCAG GAGGGAGAGG AGGCTGCAAC 180 GCCGAGCGGA GGAGGCAGGA ACCGGAGCGC GAGCAGTAGC TGGGTGGGCA CCATGGCTGG 240 GATCACCACC ATCGAGGCGG TGAAGCGCAA GATCCAGGTT CTGCAGCAGC AGGCAGATGA 300 TGCAGAGGAG CGAGCTGAGC GCCTCCAGCG AGAAGTTGAG GGAGAAAGGC GGGCCCGGGA 360 ACAGGCTGAG GCTGAGGTGG CCTCCTTGAA CCGTAGGATC CAGCTGGTTG AAGAAGAGCT 420 GGACCGTGCT CAGGAGCGCC TGGCCACTGC CCTGCAAAAG CTGGAAGAAG CTGAAAAAGC 480 TGCTGATGAG AGTGAGAGAG GTATGAAGGT TATTGAAAAC CGGGCCTTAA AAGATGAAGA 540 AAAGATGGAA CTCCAGGAAA TCCAACTCGA AGAAGCTAAG CACATTGCAG AAGAGGCAGA 600 TAGGAAGTAT GAAGAGGTGG CTCGTAAGTT GGTGATCATT GAAGGAGACT TGGAACGCAC 660 AGAGGAACGA GCTGAGCTGG CAGAGTCGCG TTGCCGAGAG ATGGATGAGC AGATTAGACT 720 GATGGACCAG AACCTGAAGT GTCTGAGTGC TGCCGAAGAA AAGTACTCTC AAAAAGAAGA 780 TAAATATGAG GAAGAAATCA AGATTCTTAC TGATAAACTC AAGGAGGCAG AGACCCGTGC 840 TGAGTTTGCT GAGAGATCGG TAGCCAAGCT GGAAAAGACA ATTGATGACC TGGAAGACAC 900 TAACAGCACA TCTGGAGACC CGGTGGAGAA GAAGGACGAA ACACCTTTTG GGGTCTCGGT 960 GGCTGTGGGC CTGGCCGTCT TTGCCTGCCT CTTCCTTTCT ACGCTGCTCC TTGTGCTCAA 1020 CAAATGTGGA CGGAGAAACA AGTTTGGGAT 
CAACCGCCCG GCTGTGCTGG CTCCAGAGGA 1080 TGGGCTGGCC ATGTCCCTGC ATTTCATGAC ATTGGGTGGC AGCTCCCTGT CCCCCACCGA 1140 GGGCAAAGGC TCTGGGCTCC AAGGCCACAT CATCGAGAAC CCACAATACT TCAGTGATGC 1200 CTGTGTTCAC CACATCAAGC GCCGGGACAT CGTGCTCAAG TGGGAGCTGG GGGAGGGCGC 1260 CTTTGGGAAG GTCTTCCTTG CTGAGTGCCA CAACCTCCTG CCTGAGCAGG ACAAGATGCT 1320 GGTGGCTGTC AAGGCACTGA AGGAGGCGTC CGAGAGTGCT CGGCAGGACT TCCAACGTGA 1380 GGCTGAGCTG CTCACCATGC TGCAGCACCA GCACATCGTG CGCTTCTTCG GCGTCTGCAC 1440 CGAGGGCCGC CCCCTGCTCA TGGTCTTCGA GTATATGCGG CACGGGGACC TCAACCGCTT 1500 CCTCCGATCC CATGGACCCG ATGCCAAGCT GCTGGCTGGT GGGGAGGATG TGGCTCCAGG 1560 CCCCCTGGGT CTGGGGCAGC TGCTGGCCGT GGCTAGCCAG GTCGCTGCGG GGATGGTGTA 1620 CCTGGCGGGT CTGCATTTTG TGCACCGGGA CCTGGCCACA CGCAACTGTC TAGTGGGCCA 1680 GGGACTGGTG GTCAAGATTG GTGATTTTGG CATGAGCAGG GATATCTACA GCACCGACTA 1740 TTACCGTGTG GGAGGCCGCA CCATGCTGCC CATTCGCTGG ATGCCGCCCG AGAGCATCCT 1800 GTACCGTAAG TTCACCACCG AGAGCGACGT GTGGAGCTTC GGCGTGGTGC TCTGGGAGAT 1860 CTTCACCTAC GGCAAGCAGC CCTGGTACCA GCTCTCCAAC ACGGAGGCAA TCGACTGCAT 1920 CACGCAGGGA CGTGAGTTGG AGCGGCCACG TGCCTGCCCA CCAGAGGTCT ACGCCATCAT 1980 GCGGGGCTGC TGGCAGCGGG AGCCCAGCAA CGCCACAGCA TCAAGGATGT GCACGCCCGG 2040 CTGCAAGCCC TGGCCTAGGC ACCTCCTGTC TACCTGGATG TCCTGGGCTA GGGGGCCGGC 2100 CCAGGGGCTG GGAGTGGTTA GCCGGAATAC TGGGGCCTGC CCTCAGCATC CCCCATAGCT 2160 CCCAGCAGCC CCAGGGTGAT CTCGAAGTAT CTAATTCGCC CTCAGCATGT GGGAAGGGAC 2220 AGGTGGGGGC TGGGAGTAGA GGATGTTCCT GCTTCTCTAG GCAAGGTCCC GTCGTAGCAA 2280 TTATATTTAT TATGGGAATT C 2301 (2) INFORMATION FOR SEQ ID NO:79: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 2757 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79: CTAGGCTTTT GCAAAAAGCT TCACGCTGCC GCAAGCACTC AGGGCGCAAG GGCTGCTAAA 60 GGAAGCGGAA CACGTAGAAA GCCAGTCCGC AGAAACGGTG CTGACCCCGG ATGAATGTCA 120 GCTACTGGGC TATCTGGACA AGGGAAAACG CAAGCGCAAA GAGAAAGCAG TTCCTGTGCC 180 TTAAGAACAT TAGAACCTTC CTGTCCACCT GCTGTGAGAA GTTCGGCCTC AAGCGGAGCG 
240 AGCTCTTCGA AGCCTTTGAC CTCTTCGATG TGCAGGATTT TGGCAAGGTC ATCTACACCC 300 TGTCTGCTCT GTCCTGGACC CCGATCGCCC AGAACAGGGG GATCATGCCC TTCCCCACCG 360 AGGAGGAGAG TGTAGGTGAT GAAGACATCT ACAGTGGCCT GTCCGACCAG ATCGACGACA 420 CGGTGGAGGA GGATGAGGAC CTGTATGACT GCGTGGAGAA TGAGGAGGCG GAAGGCGACG 480 AGATCTATGA GGACCTCATG CGCTCGGAGC CCGTGTCCAT GCCGCCCAAG ATGACAGAGT 540 ATGACAAGCG CTGCTGCTGC CTGCGGGAGA TCCAGCAGAC GGAGGAGAAG TACACTGACA 600 CGCTGGGCTC CATCCAGCAG CATTTCTTGA AGCCCCTGCA ACGGTTCCTG AAACCTCAAG 660 ACATTGAGAT CATCTTTATC AACATTGAGG ACCTGCTTCG TGTTCATACT CACTTCCTAA 720 AGGAGATGAA GGAAGCCCTG GGCACCCCTG GCGCACCGAA TCTCTACCAG GTCTTCATCA 780 AATACAAGGA GAGGTTCCTC GTCTATGGCC GCTACTGCAG CCAGGTGGAG TCAGCCAGCA 840 AACACCTGGA CCGTGTGGCC GCAGCCCGGG AGGACGTGCA GATGAAGCTG GAGGAATGTT 900 CTCAGAGAGC CAACAACGGG AGGTTCACTG CGCGACCTGC TGATGGTGCC TATGCAGCGA 960 GTTCTCAAAT ATCACCTCCT TCTCCAGGAG CTGGTGAAAC ACACGCAGGA GGCGATGGAG 1020 CAAGGAAACT GCGGCTGGCC CTGGATGCCA TGAGGGACCT GGCTCAGTGC GTGAACGAGG 1080 TCAAGCGAGA CAACGAGACA CTGCGACAGA TCACCAATTT CCAGCTGTCC ATTGAGAACC 1140 TGGACCAGTC TCTGGCTCAC TATGGCCGGC CCAAGATCGA CGGGGAACTC AAGATCACCT 1200 CGGTGGAACG GCGCTCCAAG ATGGACAGGT ATGCCTTCCT GCTCGACAAA GCTCTACTCA 1260 TCTGTAAGCG CAGGGGAGAC TCCTATGACC TCAAGGACTT TGTAAACCTG CACAGCTTCC 1320 AGGTTCGGGA TGACTCTTCA GGAGACCGAG ACAACAAGAA GTGGAGCCAC ATGTTCCTCC 1380 TGATCGAGGA CCAAGGTGCC CAGGGCTATG AGCTOTTCTT CAAGACAAGA GAATTGAAGA 1440 AGAAGTGGAT GGAGCAGTTT GAGATGGCCA TCTCCAACAT CTATCCGGAG AATGCCACCG 1500 CCAACGGGCA TGACTTCCAG ATGTTCTCCT TTGAGGAGAC CACATCCTGC AAGGCCTGTC 1560 AGATGCTGCT TAGAGGTACC TTCTATCAGG GCTACCGCTG CCATCGGTGC CGGGCATCTG 1620 CACACAAGGA GTGTCTGGGG AGGGTCCCTC CATGTGGCCG ACATGGGCAA GATTTCCCAG 1680 GAACTATGAA GAAGGACAAA CTACATCGCA GGGCTCAGGA CAAAAAGAGG AATGAGCTGG 1740 GTCTGCCCAA GATGGAGGTG TTTCAGGAAT ACTACGGGCT TCCTCCACCC CCTGGAGCCA 1800 TTGGACCCTT TCTACGGCTC AACCCTGGAG ACATTGTGGA GCTCACGAAG GCTGAGGCTG 1860 AACAGAACTG GTGGGAGGGC AGAAATACAT CTACTAATGA AATTGGCTGG TTTCCTTCTA 1920 
ACAGGGTGAA GCCCTATGTC CATGGCCCTC CTCAGGACCT GTCTGTTCAT CTCTGGTACG 1980 CAGGCCCCAT GGAGCGGGCA GGGGCAGAGA GCATCCTGGC CAACCGCTCG GACGGGACTT 2040 TCTTGGTGCG GCAGAGGGTG AAGGATGCAG CAGAATTTGC CATCAGCATT AAATATAACG 2100 TCGAGGTCAA GCACACGGTT AAAATCATGA CAGCAGAAGG ACTGTACCGG ATCACAGAGA 2160 AAAAGGCTTT CCGGGGGCTT ACGGAGCTGG TGGAGTTTTA CCAGCAGAAC TCTCTAAAGG 2220 ATTGCTTCAA GTCTCTGGAC ACCACCTTGC AGTTCCCCTT CAAGGAGCCT GAAAAGAGAA 2280 CCATCAGCAG GCCAGCAGTG GGAAGCACAA AGTATTTTGG CACAGCCAAA GCCCGCTATG 2340 ACTTCTGCGC CCGTGACCGT TCAGAGCTGT CGCTCAAGGA GGGTGACATC ATCAAGATCC 2400 TTAACAAGAA GGGACAGCAA GGCTGGTGGC GAGGGGAGAT CTATGGCCGG GTTGGCTGGT 2460 TCCCTGCCAA CTACGTGGAG GAAGATTATT CTGAATACTG CTGAGCCCTG GTGCCTTGGC 2520 AGAGAGACGA GAAACTCCAG GCTCTGAGCC CGGCGTGGCG AGGCAGCGGA CCAGGGGCTG 2580 TGACAGCTCC GGCGGGTGGA GACTTTGGGA TGGACTGGAG GAGGCCAGCG TCCAGCTGGC 2640 GGTGCTCCCG GGATGTGCCC TGACATGGTT AATTTATAAC ACCCCGATTT TCCTCTTGGG 2700 TCCCCTCAAG CAGACGGGGG CTCAAGGGGG TTACATTTAA TAAAAGGATG AAGATGG 2757 (2) INFORMATION FOR SEQ ID NO:80: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 841 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:80: CCTGACCAAC ATGGTGAAAC CCCATCTCTA CTAAAAATAC AAAAATTATC CAGGCGTGGT 600 GGCGGGTGCC TGTAATCCCA GCTATTTGGG AGGCTGAGGC AGGAGAATCG TTTGAACCCA 660 GGAGACAGAG GTTGCAGTGA ACCAAGACCG TGTCACTGCA CTCCAGCCTG GACAACAGAG 720 CAAGACTCTG TCTCCAAAAA AAAAAAAAGA AAAAAAAAAA AGGAATGTGT CTGTCTGGAA 780 GGAAGAGCTG GAACACATAA CCCTTAGCTG TGTCTGCCTC CCGCACCAGT GTCCAGTTGA 840 C 841 (2) INFORMATION FOR SEQ ID NO:81: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 841 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: AATGAGGAAA CTGAGTTCCA ATTTAGTTGA CCTGGTAAGA TCACAAAAAC AGCAGGCAGG 60 CCGGGTGCGG TGGCTCACAC CTGTAATCCC AACACTTTGG GAGGCCAAGG CGGGCAGATC 120 ACCTGAGGTC GGGAGTTCGA GACCAGCCTG ACCAACATGG TGAAACCCCA 
TCTTTACTAA 180 AAATACAAAA TTAGCCGGGT GTGGTGGCGC ATGCCTGTAA TCCCAGCTAC TCGGGAGGCT 240 GAGGCAGGAG AATCACTTGA ACTTGGGAGA GGGAGGTTGC AGTGAGCCGA GATTGCACCA 300 TTGCACTCCA GCCAAGAGGT CTCAAAAAAA AAAAAAAAAG CAGGCAGCAG AGTCAGATTC 360 CAAGTCTGGC AGCTCTCACC CCCCGAGGGG TGAAAAGTAA GCACTAACCA ACCACATGAG 420 CACTAGGGGA AGTAGATTTA AAAAGGAACA TGGGCCAGGT GTGGTGGCCT ACACATGTAA 480 TCCCAGCACT TTGGGAGGCC GAGGCAGATG GATCACCTGA GGTCAGGAGA TCGAGACCAG 540 CCTGACCAAC ATGGTGAAAC CCCATCTCTA CTAAAAATAC AAAAATTATC CAGGCGTGGT 600 GGCGGGTGCC TGTAATCCCA GCTATTTGGG AGGCTGAGGC AGGAGAATCG TTTGAACCCA 660 GGAGACAGAG GTTGCAGTGA ACCAAGACCG TGTCACTGCA CTCCAGCCTG GACAACAGAG 720 CAAGACTCTG TCTCCAAAAA AAAAAAAAGA AAAAAAAAAA AGGAATGTGT CTGTCTGGAA 780 GGAAGAGCTG GAACACATAA CCCTTAGCTG TGTCTGCCTC CCGCACCAGT GTCCAGTTGA 840 C 841 (2) INFORMATION FOR SEQ ID NO:82: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1804 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: GGATCCTCAG GGGTAACACC TTTTGGAGGT GGGCATCTTC CTCATTCTCA GTGGTGCCAA 60 GTTCATATCC TGCTGGCTTA ACACGTGGTG TTACTATATT TGTGGCCTTA TATGATTATG 120 AAGCTAGAAC TACAGAAGAC CTTTCATTTA AGAAGGGTGA AAAATTTCAA ATAATTAACA 180 ATACAGAAGG AGACTGGTGG GAAGCAAGAT CAATCACTAC AGGAAAGAAT GGTTATATCC 240 TGAGCAGTTA TGTAGCGCCT GCAGATTCCA TTCAGGCAGA AGAATGGTAT TTTGGCAAAA 300 TGGGGAGAAA AGATGCTGAA AGATTACTTC TGAATCCTGG AAATTAATGA GGTATTTTCT 360 TAGGAAGAGA GAGTGAAATG GCTGGGTGCA GTGGCTCATG CCTGTAATCC CAGCACTTTG 420 GGAGGCCGAG TTGGGCGGAT CACCTGAGGT CAGGAGTTCG AGACTAGCCT GGCCAACATG 480 GTGAAACCCC ATCTCTACTA AAAAAAAAAG TACAAAATTA GCTGGACGTG GTGGTGAGTG 540 CCTGTAATCC CAGCTACTCA GGAGGCTGAG GCAGCAGAAT CACTTGAACC TGGGAGGCGG 600 AGGTTGCAGT GAGCTGAGAT CGCGCCACTG CACTCCAGCC TCGGCGACAA GAGCAAAAAC 660 TCCGTCTAAA AAACAAATAA GCAAACAGAA CAAAACAAAA CAAAAACGAG AGAGCGAAAC 720 TACTAAAGGT GCTTATTCCC TCTCTATTCG TGATTGGGAT GAGGTAAGGG GTGACAATGT 780 GAAACACCAC AAAATTAGGA AACTTGACAA TGGTAGATAC TATATCACAA 
CCAGAGAACA 840 ACTTGATACT CTGCAGAAAT TGGCAAAACA CTACACAGAA CATGCTGATG GTTTATGCCA 900 CAAGTTAACA ACTGTGTGTC CAACTGTGAA ACCTCAGATT CAAGGTCTAG CAAAAGATGC 960 TTGGGAAATC CCTTGATAAT CTTTGCGACT AGAGGTTAAA CTAGGACAAG GATGTTTTGG 1020 CAAAGTGTGG ATGGGAATAT GGAATGGAAC CACAAAAGTA GCAATCAAAA CACTAAAACC 1080 AGGTACAATG ATGCCAGAAG CTTTTCTTCA AGAAGCTCAG GTAATGAAAA AAATAAGACA 1140 TGGTAAACTT GTTCCACTAT ATGCTGTTGT TTCTGAAGAG CCAATTTACA TTGTCACTGA 1200 ATTGATGTCA AAAGGAAGCT TATTCAATTT CCTTAAGGAA GGAGATGGAA AGTATTTGAA 1260 GCTTCCACAA ATGGTTGATA TGCCTGCTCA GATTGCTGAT GGTATGGCAT ATATTAAAAG 1320 AATGAACTAT ATTCACCGAG ATCTCTGGGC TGCTAATATT CTTGTAGGAG AAAATCTTCT 1380 GTGCAAAATA GCAGATTTTG GTTTAGCAAG GTTAATTGAA GACAATGAAT ACACATCAAG 1440 ACAAGGTGCA GAATTTCCAA TCAAATGGAC AGCTCCTGAA GTTGCACTGT ATGGTGGGTT 1500 TACAATAAAG TCTGGTGTCT GCTCATTTGG AATTCTACAG ACAGAACTGG TAACAAAGGG 1560 CAGAGTGCCA TATCCAGGTA TGGTGAACCA TGAAATACTG GAACAGGTGG AGCGAGGATA 1620 CAGGATGCCT TGCCCTCAGG GCTGTCCAGA ATCCCTCCAT GAATTGATGA ATCTGTGTTG 1680 GAAGAAGGAC CCTGATGAAA GACCAACATT TGAATATGTT CAGTCCTTCT TGGGAGACTA 1740 CTTCACTGCT ACAGAGCCAT AGTACCAGCC AGGAGAAAAC TTCTAATTCA AGTAGCCTAT 1800 TTTA 1804 (2) INFORMATION FOR SEQ ID NO:83: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 4517 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: GCGGAGCCAA GGCACACGGG TCTGACCCTT GGGCCGGCCC GGAGCAAGTG ACACGGACCG 60 GTCGCCTATC CTGACCACAG CAAAGCGGCC CGGAGCCCGC GGAGGGGACC TGACGGGGGC 120 GTAGGCGCCG GAAGGCTGGG GGCCCCGGAG CCGGGCCGGC GTGGCCCGAG TTCCGGTGAG 180 CGGACGGCGG CGCGCGCAGA UGATAATG GGCTGCATTA AAAGTAAAGA AAACAAAAGT 240 CCAGCCATTA AATACAGACC TGAAAATACT CCAGAGCCTG TCAGTACAAG TGTGAGCCAT 300 TATGGAGCAG AACCCACTAC AGTGTCACCA TGTCCGTCAT CTTCAGCAAA GGGAACAGCA 360 GTTAATTTCA GCAGTCTTTC CATGACACCA TTTGGAGGAT CCTCAGGGGT AACGCCTTTT 420 GGAGGTGCAT CTTCCTCATT TTCAGTGGTG CCAAGTTCAT ATCCTGCTGG TTTAACAGGT 480 GGTGTTACTA TATTTGTGGC CTTATATGAT TATGAAGCTA 
GAACTACAGA AGACCTTTCA 540 TTTAAGAAGG GTGAAAGATT TCAAATAATT AACAATACGG AAGGAGATTG GTGGGAAGCA 600 AGATCAATCG CTACAGGAAA GAATGGTTAT ATCCCGAGCA ATTATGTAGC GCCTGCAGAT 660 TCCATTCAGG CAGAAGAATG GTATTTTGGC AAAATGGGGA GAAAAGATGC TGAAAGATTA 720 CTTTTGAATC CTGGAAATCA ACGAGGTATT TTCTTAGTAA GAGAGAGTGA AACAACTAAA 780 GGTGCTTATT CCCTTTCTAT TCGTGATTGG GATGAGATAA GGGGTGACAA TGTGAAACAC 840 TACAAAATTA GGAAACTTGA CAATGGTGGA TACTATATCA CAACCAGAGC ACAATTTGAT 900 ACTCTGCAGA AATTGGTGAA ACACTACACA GAACATGCTG ATGGTTTATG CCACAAGTTG 960 ACAACTGTGT GTCCAACTGT GAAACCTCAG ACTCAAGGTC TAGCAAAAGA TGCTTGGGAA 1020 ATCCCTCGAG AATCTTTGCG ACTAGAGGTT AAACTAGGAC AAGGATGTTT CGGCGAAGTG 1080 TGGATGGGAA CATGGAATGG AACCACGAAA GTAGCAATCA AAACACTAAA ACCAGGTACA 1140 ATGATGCCAG AAGCTTTCCT TCAAGAAGCT CAGATAATGA AAAAATTAAG ACATGATAAA 1200 CTTGTTCCAC TATATGCTGT TGTTTCTGAA GAACCAATTT ACATTGTCAC TGAATTTATG 1260 TCAAAAGGAA GCTTATTAGA TTTCCTTAAG GAAGGAGATG GAAAGTATTT GAAGCTTCCA 1320 CAGCTGGTTG ATATGGCTGC TCAGATTGCT GATGGTATGG CATATATTGA AAGAATGAAC 1380 TATATTCACC GAGATCTTCG GGCTGCTAAT ATTCTTGTAG GAGAAAATCT TGTGTGCAAA 1440 ATAGCAGACT TTGGTTTAGC AAGGTTAATT GAAGACAATG AATACACAGC AAGACAAGGT 1500 GCAAAATTTC CAATCAAATG GACAGCTCCT GAAGCTGCAC TGTATGGTCG GTTTACAATA 1560 AAGTCTGATG TCTGGTCATT TGGAATTCTG CAAACAGAAC TAGTAACAAA GGGCCGAGTG 1620 CCATATCCAG GTATGGTGAA CCGTGAAGTA CTAGAACAAG TGGAGCGAGG ATACAGGATG 1680 CCGTGCCCTC AGGGCTGTCC AGAATCCCTC CATGAATTGA TGAATCTGTG TTGGAAGAAG 1740 GACCCTGATG AAAGACCAAC ATTTGAATAT ATTCAGTCCT TCTTGGAAGA CTACTTCACT 1800 GCTACAGAGC CACAGTACCA GCCAGGAGAA AATTTATAAT TCAAGTAGCC TATTTTATAT 1860 GCACAAATCT GCCAAAATAT AAAGAACTTG TGTAGATTTT CTACAGGAAT CAAAAGAAGA 1920 AAATCTTCTT TACTCTGCAT GTTTTTAATG GTAAACTGGA ATCCCAGATA TGGTTGCACA 1980 AAACCACTTT TTTTTCCCCA AGTATTAAAC TCTAATGTAC CAATGATGAA TTTATCAGCG 2040 TATTTCAGGG TCCAAACAAA ATAGAGCTAA GATACTGATG ACAGTGTGGG TGACAGCATG 2100 GTAATGAAGG ACAGTGAGGC TCCTGCTTAT TTATAAATCA TTTCCTTTCT TTTTTTCCCC 2160 AAAGTCAGAA TTGCTCAAAG AAAATTATTT 
ATTGTTACAG ATAAAACTTG AGAGATAAAA 2220 AGCTATACCA TAATAAAATC TAAAATTAAG GAATATCATG GGACCAAATA ATTCCATTCC 2280 AGTTTTTAA AGTTTCTTGC ATTTATTATT CTCAAAAGTT TTTTCTAAGT TAAACAGTCA 2340 GTATGCAATC TTAATATATG CTTTCTTTTG CATGGACATG GGCCAGGTTT TTCAAAAGGA 2400 ATATAAACAG GATCTCAAAC TTGATTAAAT GTTAGACCAC AGAAGTGGAA TTTGAAAGTA 2460 TAATGCAGTA CATTAATATT CATGTTCATG GAACTGAAAG AATAAGAACT TTTTCACTTC 2520 AGTCCTTTTC TGAAGAGTTT GACTTAGAAT AATGAAGGTA ACTAGAAAGT GAGTTAATCT 2580 TGTATGAGGT TGCATTGATT TTTTAAGGCA ATATATAATT GAAACTACTG TCCAATCAAA 2640 GGGGAAATGT mGATCTTT AGATAGCATG CAAAGTAAGA CCCAGCATTT TAAAAGCCCT 2700 TTTTTAAAAA CTAGACTTCG TACTGTGAGT ATTGCTTATA TGTCCTTATG GGGATGGGTG 2760 CCACAAATAG AAAATATGAC CAGATCAGGG ACTTGAATGC ACTTTTGCTC ATGGTGAATA 2820 TAGATGAACA GAGAGGAAAA TGTATTTAAA AGAAATACGA GAAAAGAAAA TGTGAAAGTT 2880 TTACAAGTTA GAGGGATGGA AGGTAATGTT TAATGTTGAT GTCATGGAGT GACAGAATGG 2940 CTTTGCTGGC ACTCAGAGCT CCTCACTTAG CTATATTCTG AGACTTTGAA GAGTTATAAA 3000 GTATAACTAT AAAACTAATT TTTCTTACAC ACTAAATGGG TATTTGTTCA AAATAATGAA 3060 GTTATGGCTT CACATTCATT GCAGTGGGAT ATGGTTTTTA TGTAAAACAT TTTTAGAACT 3120 CCAGTTTTCA AATCATGTTT GAATCTACAT TCACTTTTTT TTGTTTTCTT TTTTGAGACG 3180 GAGTCTCGCT CTGCCGCCCA GGCTGGAGTG CAGTGGCGCG ATCTCGGCTC ACTGCAAGCT 3240 CTGCCTCCCA GGTTCACACC ATTCTCCTGC CTCAGCCTCC CGAGTAGCTG GGACTACAGG 3300 TGCCCACCAC CACGCCTGGC TAGTTTTTG TATTTTTAGT AGAGACGCAG TTTCACCGTG 3360 TTAGCCAGGA TGGTCTCGAT CTCCTGACCT TGTGATCTGC CCGCCTCGGC CTCCCAAAGT 3420 GCTGGGATTA CAGGTGTGAG CCACCGCGCC CAGCCTACAT TCACTTCTAA AGTCTATGTA 3480 ATGGTGGTCA TTTTTTCCCT TTTAGAATAC ATTAAATGGT TGATTTGGGG AGGAAAACTT 3540 ATTCTGAATA TTAACGGTGG TGAAAAGGGG ACAGTTTTTA CCCTAAAGTG CAAAAGTGAA 3600 ACATACAAAA TAAGACTAAT TTTTAAGAGT AACTCAGTAA TTTCAAAATA CAGATTTGAA 3660 TAGCAGCATT AGTGGTTTGA GTGTCTAGCA AAGGAAAAAT TGATGAATAA AATGAAGGTC 3720 TGGTGTATAT GTTTTAAAAT ACTCTCATAT AGTCACACTT TAAATTAAGC CTTATATTAG 3780 GCCCCTCTAT TTTCAGGATA TAATTCTTAA CTATCATTAT TTACCTGATT TTAATCATCA 3840 GATTCGAAAT TCTGTGCCAT GGCGTATATG 
TTCAAATTCA AACCATTTTT AAAATGTGAA 3900 GATGGACTTC ATGCAAGTTG GCAGTGGTTC TGGTACTAAA AATTGTGGTT GTTTTTTCTG 3960 TTTACGTAAC CTGCTTAGTA TTGACACTCT CTACCAAGAG GGTCTTCCTA AGAAGAGTGC 4020 TGTCATTATT TCCTCTTATC AACAACTTGT GACATGAGAT TTTTTAAGGG CTTTATGTGA 4080 ACTATGATAT TGTAATTTTT CTAAGCATAT TCAAAAGGGT GACAAAATTA CGTTTATGTA 4140 CTAAATCTAA TCAGGAAAGT AAGGCAGGAA AAGTTGATGG TATTCATTAG GTTTTAACTG 4200 AATGGAGCAG TTCCTTATAT AATAACAATT GTATAGTAGG GATAAAACAC TAACAATGTG 4260 TATTCATTTT AAATTGTTCT GTATTTTTAA ATTGCCAAGA AAAACAACTT TGTAAATTTG 4320 GAGATATTTT CCAACAGCTT TTCGTCTTCA GTGTCTTAAT GTGGAAGTTA ACCCTTACCA 4380 AAAAAGGAAG TTGGCAAAAA CAGCCTTCTA GCACACTTTT TTAAATGAAT AATGGTAGCC 4440 TAAACTTAAT ATTTTTATAA AGTATTGTAA TATTGTTTTG TGGATAATTG AAATAAAAAG 4500 TTCTCATTGA ATGCACC 4517 (2) INFORMATION FOR SEQ ID NO:84: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 4175 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: TCCTCGTCGT CTGTGGATTG CTAAACCTGA GTGGGAAGGG GGGGGAAAAA AAAAAGGGTG 60 GGTTGTTGTT TTGTTTAAAA AAAGAAAAAA TCCCTTAAGT GGATTTGTAC CAGCGTGGAA 120 GATAACTGGG GATTTTTGTT GTTTGTTTTG GGAATAGAAA CTAAAAAATG GAGACTGTAA 180 GTAGAAGCAG CTTCCAGCCT CATCCAGGAC TGCAGAAGAC CTTGGAACAG TTTCATCTGA 240 GCTCTATGAG CTCCCTGGGT GGCCCTGCTG CTTTCTCAGC GCGATGGGCA CAGGAGATGT 300 ACAAGAAAGA CAATGGCAAA GACCCAGCGG AACCTGTACT GCATCTGCCC CCTATCCAGC 360 CCCCCCCGGT GATGCCTGGT CCCTTCTTCA TGCCCTCGGA CAGATCCACT GAGAGGTGCG 420 AGACCATCCT GGAAGGGGAA ACCATCTCCT GCTTCGTGGT GGGTGGGGAA AAGCGCCTTT 480 GCTTGCCCCA GATCCTGAAC TCGGTGCTCA GGGACTTCTC CCTGCAGCAG ATCAATTCGG 540 TGTGCGATGA GCTACACATT TACTGCTCCA GATGCACCGC TGACCAGCTG GAGATCCTCA 600 AAGTCATGGG CATCTTGCCC TTCTCTGCCC CCTCCTGCGG GCTGATCACT AAAACTGATG 660 CTGAGAGGCT TTGCAATGCC TTGCTTTATG GTGGCACCTA TCCTCCCCAC TGCAAGAAGG 720 AATTCTCTAG CACGATTGAG CTGGAGCTTA CAGAGAAGAG CTTCAAGGTG TACCACGAGT 780 GCTTTGGGAA GTGTAAGGGA CTCCTGGTAC CAGAGCTTTA CAGTAACCCC AGCGCAGCCT 840 GCATCCAGTG 
CTTGGACTGC AGGCTCATGT ACCCGCCTCA CAAATTTGTG GTCCACTCTC 900 ACAAATCCCT GGAAAACAGG ACTTGCCACT GGGGCTTTGA CTCTGCAAAC TGGAGGTCCT 960 ACATCCTCCT TAGCCAGGAT TACACTGGGA AAGAGGAGAA AGCTAGGCTG GGCCAGCTCT 1020 TAGATGAAAT GAAAGAAAAA TTTGACTATA ACAACAAATA CAAGAGGAAA GCCCCCAGGA 1080 ACCGTGAGTC TCCTAGAGTT CAGCTCCGCC GGACCAAAAT GTTCAAGACA ATGCTGTGGG 1140 ATCCAGCTGG AGGTTCAGCG GTACTGCAGC GTCAGCCAGA TGGAAATGAG GTCCCTTCAG 1200 ATCCTCCTGC TTCCAAGAAA ACCAAAATAG ACGACTCCGC TTCCCAATCT CCAGCTTCTA 1260 CTGAGAAGGA AAAGCAGTCC AGTTGGTTAC GGTCCTTATC CAGTTCATCT AATAAGAGCA 1320 TTGGCTGTGT CCATCCCCGT CAGCGTCTCT CAGCTTTCCG GCCCTGGTCC CCTGCTGTAT 1380 CAGCAAATGA GAAAGAGCTC TCAACCCATC TTCCTGCATT GATCCGAGAC AGCAGTTTTT 1440 ACTCCTACAA AAGCTTTGAG AATGCTGTGG CCCCCAACGT GGCACTCGCA CCTCCTGCCC 1500 AACAGAAAGT TGTGAGCAAC CCACCCTGTG CCACAGTGGT GTCCCGGAGC AGCGAACCGC 1560 CGAGCAGCGC TGCGCAGCCA CGGAAAAGAA AACATGCTGC AGAAACCCCG GCTGTCCCAG 1620 AGCCAGTGGC CACGGTTACT GCCCCTGAAG AGGATAAGGA ATCAGAAGCA GAAATTGAAG 1680 TAGAGACCAG GGAGGAATTC ACCTCCTCCT TATCCTCGCT CTCCTCCCCA TCCTTTACTT 1740 CATCCAGCTC TGCAAAGGAC ATGAGCTCAC CTGGGATGCA AGCCCCAGTC CCAGTCAACA 1800 GTTCATATGA GGTTGCAGCA CATTCTGACT CTCACAGCAG TGGGTTGGAA GCTGAGCTGG 1860 AGCACCTAAG GCAGGCCCTG GACAGTGGCC TAGATACAAA AGAAGCCAAA GAAAAATTCC 1920 TCCATGAAGT TGTTAAAATG AGAGTGAAGC AGGAAGAGAA GCTAAATGCT GCCTTGCAAG 1980 CCAAACGCAG CCTACATCAG GAGCTGGAGT TCCTCAGAGT GGCAAAGAAG GAGAAACTGA 2040 GAGAAGCAAC GGAGGCAAAA CGCAACTTAA GGAAAGAGAT TGAGCGTCTG AGAGCTGAGA 2100 ATGAGAAGAA AATGAAGGAA GCAAACGAGT CTCGGATACG GCTAAAGAGG GAACTGGAAC 2160 AAGCCAGGCA GATCCGGGTT TGCGACAAGG GTTGTGAAGC TGGCAGGCTT CGGGCCAAGT 2220 ACTCTGCCCA GATTGAGGAC CTACAGGTTA AGCTTCAGCA TGCAGAGGCT GACAGGGAGC 2280 AGCTCCGAGC TGACCTGATG CATGAGAGGG AGGCTCGAGA ACACTTGGAA AAAGTAGTCA 2340 AGGAACTTCA GGAACAGCTG TGGCCTAAAT CAAGCAGTCA ATCCAGCAGT GAAAACACAA 2400 CGAGCAACAT GGAGAATTAA ACCACGTCGT CTAATACAAC AGAATGACAT ATATGCACAG 2460 TAAGGGAGGA TGGGTGGGGT ACGTGTGTAA GTGCATGTGT GAGTAGTTGT GTCTTAACAC 2520 ACAGATCTAG GAATATGGAT 
TCTTATTAGT TGGAAGGCAA ATGTTACTCT TTATAACAGA 2580 AGCACTGAAT TACGCCTCTT TTTTTTTCCA ATCCATATAG CACAACATCT TACTGTGCCT 2640 ATAAAACACA AATGTGTTTA TAAACAAAAT ACTTTTAAGT CCACAGCAAA TTTTCTACTG 2700 GCAAACTCCA AGCAAGCAGC ATCCTCCAAC TAGAATCAGA GTAAAAGGCA AGCATGGCAG 2760 TGTTTTCATG TTGCCCTTCT GCCTGTCGGA ACATTTTGGA ATTTAAAAAC AAACTTTTCT 2820 TATAAGCTAT TTAAAGTAAT TCATTACACA GACTTGGTAT TAAAAAAAAT TAACAAGATT 2880 TTTTATAACG AACCTTTAAA AGCAAAACAA AAACCTTCGA TGCACAATTT TTACGACTTG 2940 TTAAAGGCTT TGGGATTCTT ACTGCAGAAG CCCTTTGGTG ATGATGCCAT TTCATTAGCA 3000 GTTTTTTTTA ATCCTGTCCT GTGGTTGTAT GAGAATTTCA GAGTGCTTTT CAAAGTTGAT 3060 TTTTTTCCTT AGAAACAATC ACCTTCATTT CCTGTCCTGA ACACAAGAAG AAAGGAAGAT 3120 GCAGGACTGT AAGGGCGTGG GGGAGGGCAG GAAGAGAAGA TGGACGCTTT GGAATTATAA 3180 ACCCAGCCTT ACAGACTTCA GTGTTTCAAA TCACGCCATG TTTTCTAAAG ACGTCTTCAT 3240 TAATCGATGT GTTCAAAAGA CTCACTTCAT CCAAGAGCAC TTCAGCTTTA GGAAAAGAAA 3300 GAAGGAAGTA AAGGAAGGAA ATGGATGACC TGTTAAGTTG GTTGAGAAAT AAAGCAGAAG 3360 ATGTGTTTTG AAGTCATTCT GAAATCTTCG CGTCAGCTTT CAGTTCTCTG GAAAACTCAT 3420 CTTTGTTGCA CCATCTTACC ATAGAATTCA GTATTTACCT ACTTCTATTC TGAACTGTTT 3480 GTCAGGATTT CTGTGCCCAA GGAGAGTGCA ACACCGCATT ATTGGATACT ACAGAAAAGA 3540 AAAACCACGT TTTTGCTGCT GTGAATAAGC CTACATCTTT TTTAAAAGAA AAACTTCTGT 3600 TTTTAAGAAT AGAAATTACT TTAATTTTGG GATCCGAGCC GCAGCCCTGG AATAGAAATG 3660 CAGCCTACCA TCACTCTGTC TTACTACCAT TGTTAGCGTC GTCGTTCATT TTTTTTTAAA 3720 CTGCACTTTG TCAGAACCTC ACTCTGCATT TTATTCCATA TTTTGGAAGT TTACAAGTTC 3780 AGCATTCTCG ATTCTGCTCT GCAGATGTTA AAATCATCAC CACCATTTTC CACCACGCGA 3840 CACCTCGGCC GTCATTTCCA TGTATGCAAA AGAAGAACTC AGTGGGTACA GAATGCTACC 3900 AAATACAAAG GCAGCAGAGC AGCGTGCTGC TGGTTGGGTT TCACAGCTGC GCTGCACGGC 3960 TGTGGCTGTC GAGGCTGGGA AGTGCTCAAA TACAGTTGGT GCTTTACTGA ATGAGAGAGG 4020 AGTTATTTTC ACCCACACAC ACTCACCTCT GATACACTCA AGCTCAGTGA AAAGTTGATC 4080 TGGGGCTGCA GTTGTGCCTT CCAGCTCATT TTTCCTCTCA GCATCTTCTA TAGGCAATGC 4140 TGACACTTTT TTTTTAAACC TTAAAGAATA AAAAG 4175 (2) INFORMATION FOR SEQ ID NO:85: (i) SEQUENCE 
CHARACTERISTICS: (A) LENGTH: 1838 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:85: GAGGAGTTAG GGAGGCTGAG CCAGCCCGCA TTAGCTCAGC TCTCAGGCAC TCTCAGTCTC 60 TGCTGGAAAA GGAGCTGGGA GGATGTGCGC GCTGTTACCG CGGCTCCCCA CGGCACGCTG 120 AGAGCGAAGG GAAGAAAGGA AGAAAGGGGA GCGCGGCGCC AGGGAGCCCG CCCGGATCGA 180 GCGAAGGGAC CGCTCGGGGA AGGCAGCCCG CCCGCAGCCC CGCACCGCCT TGGAAGCGCC 240 GGAGATTGGG GCCTTTTAAT TCTATTCCGC GTTGCTCCCT TCCCCGCACC GAGCCGTCTT 300 CCTCCTCCCG CTTTTTACTT TCCCTCCCCG CTCCTTCCTC CTGGTGGATA TGGACACAAG 360 CGGGATGCGT GCAGAGAAGC AGCAGCCGCC AGCGCGTATT TAACCCGCGG GTGCCCGGAG 420 CGTGGCTCGG ACCGGCGTTC CGGGGCTGGC AGTGCGCGCT GCCTGGCGCC GGGCTCGGAT 480 GAGGACCTCG CTTCGGCGAG ACTCTGCGCT CCGGCCGGGC GGCGTCGGGG CCGGGCACGC 540 GAGAGATTGT GACGGAGCGA GAGAACGGAA AGGAGAGAGA GCCGCGTACG AAAAGGATTT 600 ATTTCCTCGT CGTCTGTGGA TTGCTAAACC TGAGTGGGAA GGGGGGGGAA AAAAAAAAGG 660 GTGGGTTGTT GTTTTGTTTA AAAAAAGAAA AAATCCCTTA <RTI ID=190.3> AGTGGATTTG TACCAGCGTG 720 GAAGATAACT GGGGATTTTT GTTGTlTGTT TTGGGAATAG AAACTAAAAA ATGGAGACTG 780 TAAGTAGAAG CAGCTTCCAG CCTCATCCAG GACTGCAGAA GACCTTGGAA CAGTITCATC 840 TGAGCTCTAT GAGCTCCCTG GGTGGCCCTG CTGCTTTCTC AGCGCGATGG GCACAGGAGA 900 TGTACAAGAA AGACAATGGC AAAGACCCAG CGGAACCTGT ACTGCATCTG CCCCCTATCC 960 AGCCCCCCCC GGTGATGCCT GGTCCCTTCT TCATGCCCTC GGACAGATCC ACTGAGAGGT 1020 GCGAGACCAT CCTGGAAGGG GAAACCATCT CCTGCTTCGT GGTGGGTGGG GAAAAGCGCC 1080 TITGCTTGCC CCAGATCCTG AACTCGGTGC TCAGGGACTT CTCCCTGCAG CAGATCAATT 1140 CGGTGTGCGA TGAGCTACAC ATTTACTGCT CCAGATGCAC CGCTGACCAG CTGGAGATCC 1200 TCAAAGTCAT GGGCATCTTG CCCTTCTCTG CCCCCTCCTG CGGGCTGATC ACTAAAACTG 1260 ATGCTGAGAG GCTTTGCAAT GCCTTGCTTT ATGGTGGCAC CTATCCTCCC CACTGCAAGA 1320 AGGAATTCTC TAGCACGATT GAGCTGGAGC TTACAGAGAA GAGCTTCAAG GTGTACCACG 1380 AGTGCTTTGG GAAGTGTAAG GGACTCCTGG TACCAGAGCT TTACAGTAAC CCCAGCGCAG 1440 CCTGCATCCA GTGCTTGGAC TGCAGGCTCA TGTACCCGCC TCACAAATTT GTGGTCCACT 1500 CTCACAAATC CCTGGAAAAC AGGACTTGCC 
ACTGGGGCTT TGACTCTGCA AACTGGAGGT 1560 CCTACATTCT CCTTAGCCAG GATTACACTG GGAAAGAGGA GAAAGCTAGG CTGGGCCAGC 1620 TCTTAGATGA AATGAAAGAA AAATTTGACT ATAACAACAA ATACAAGAGG AAAGCCCCCA 1680 GGGTAAGTGA TGCCTTAAAT TCTTACTTTA AGAAGGAGTA GGGTGATAAG AAGGAGAAAA 1740 GATTTTCGAT GACGACCATC GATGTACTCA ATTTGTTACC AACTGTTTTC CTTTTAGAGT 1800 CGTTTAGCAA TCTAATCGCA TTCTTMTTC TCACCACG 1838 (2) INFORMATION FOR SEQ ID NO:86: (i) SEQUENCE CHARACTERISTICS: IA) LENGTH: 402 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double ID) TOPOLOGY: linear lii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: AAAAGCAGCA GTGTCCGAGA ACCAAAACAA AAAACCCCAC AGCACAACAA CTGGCCTCCT 60 TCCATGAGCC GACCCTATTT TCAGCCTGTC CGCCACGGTA CCAGGGGCAC GGGCTGTTAT 120 TGAGAGGCTG CTCTTGTACA TAAAGCCTTG TGATTTTTGT TTAGAACCGT GAGTCTCCTA 180 GAGTTCAGCT CCGCCGGAAC AAAATGTTCA AGACAATGCT GTGGGATCCA GCTGGAGGTT 240 CAGCGGTACT GCAGCGTCAG CCAGATGGAA ATGAGGTGCA GAGGGTTGCA GGGGTGGGTG 300 TGGGGAGTTC CTTCCCTTTA AACCAAAAGG AAAACAATAG TGATTCTGAA ATACATGCAT 360 TCAGTGTATC TATAGCATTA AATGCTAACT ATGGTAGGCA TG 402 {2) INFORMATION FOR SEQ ID NO:87: li) SEQUENCE CHARACTERISTICS: (A) LENGTH: 576 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA Igenomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: CGTGTGTAAG TGCATGTGTG AGTAGTTGTG TCTTAACACA CAGATCTAGG AATATGGATT 60 CTTATTAGTT GGAAGGCAAA TGTTACTCTT TATAACAGAA GCACTGAATT ACCCCTCTTT 120 TTTTTTCCAA TCCATATAGC ACAACATCTT ACTGTGCCTA TAAAACACAA AGTGTCTATA 180 AACAAAATAC TTTTAAGTCC ACAGCAAATT TTCTACTGGC AAACTCCAAG CAAGCAGCAT 240 CCTCCAACTA GAATCAGAGT AAAAGGCAAG CATGGCAGTG TTTTCATGTT GCCCTTCTGC 300 CTGTCGGAAC ATTTTGGAAT TTAAAAACAA ACTTTTCTTA TAAGCTATIT AAAGTAATTC 360 ATTACACAGA CTTGGTATTA AAAAAAATTA ACAAGATITT TTATAACGAA C CTrTAAAAG 420 CAAAACAAAA ACCTTCGATG CACAATTTIT ACGACTTGTT AAAGGCTTTG GGATTCTTAC 480 TGCAGAAGCC CTTTGGTGAT GATGCCATTT CATTAGCAGT TTTCCAGCCG TCTCGGAAGC 540 CATCACAGTT TTAATCCTGT CCTGTGGTTG TATGAG 576 (2) INFORMATION 
FOR SEQ ID NO:88: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1483 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: AATGGAACCC GCTAGCCAGT AGGCCTTAGC CTGTGGGGGA GCAGTTTTAG AGATAAGAGA 60 ACATGTACTT AACAGAATTA TTAGTGCTCT GTTGGGATTA ATGCTTTTTA AAGATTGTTT 120 GCACTTATTT TTCGTTCTCC TCCTTCCCAG AGCATTGGCT GTGTCCATCC CCGTCAGCGT 180 CTCTCAGCTT TCCGGCCCTG GTCCCCTGCT GTATCAGCAA ATGAGAAAGA GCTCTCAACC 240 CATCTTCCTG CATTGATCCG AGACAGGTAA GGCGGCCAGA GCTGAGGGAG GAATGGGGCT 300 GGCAACTGTG GAGAGTATCT GGAGAGCAAG TTGCTTTATG CTTGCCGTCT TGCATTGTTT 360 GTTAGTTGAT CAGTTCACTA GTAGTGGAAT AAAAAGTATT TGCTGGTGCA GTGTTCTAGT 420 TAGCAGAAAG GAAACAAAAA AGGTCTAAGA GTAGGCCTTT GGGCACCTAA TTCTGGGGAA 480 AAAATACCTA AAACTTCAGC AAGTTCTCAA GTTTTAATAG ACAACTTdAA GATACTTTGC 540 ATTATACATG CTGAGTTGAA GTTGCAGGCA GTTTTTGATA CCTTTTCTGC TAAATTCTTT 600 CTATGTGATG TTACAGTATT ACTGTTCTTT AACATAGTTA CTTTTAAGG CATTTTATGT 660 ACAAGTATTC AGATGTCCTG TGTCAAGAAA GTAAAGTTGC ACAAGGATAC ATTITGTACT 720 CCTGTGGCCT TTGTAGGGAA GCTGTTTCCA AGCTTACAAA ACATTCGTGG ATAGACAGAA 780 TTAAAGTCCA ATTAGGAAAA TCGATAATGC ACAAAGACTG TGGGGAAGCA GCATCATTTT 840 TAGACTGGGA GAGCTTGGTA CTCAGAATAT ATGGTCCATC ACATATATCC CTCCTGACCT 900 GACGGAGATT AGCAGTATAA ATTACCTCCA CATTTAAATG AGTACTTAAC AAATATATAG 960 CCAGATTCTT AGCTGGTGTG TATTGGTGGA CCTTCACCGT CACTCCAGTA AGAGCCTTGC 1020 AGTTTCAGTG TAACACACAG ATTGGGAAGC GTTTTAAAGT CAAGAATATG ACTGTTrTCT 1080 CCTTCCTTTT GTCTCCCCCT CTCCAGCAGT TTTTACTCCT ACAAAAGCTT TGAGAATGCT 1140 GTGGCCCCCA ACGTGGCACT CGCACCTCCT GCCCAACAGA AAGTTGTGAG CAACCCACCC 1200 TGTGCCACAG TGGTGTCCCG GAGCAGCGAA CCGCCGAGCA GCGCTGCGCA GCCACGGAAA 1260 AGAAAACATG CTGCAGAAAC CCCGGCTGTC CCAGAGCCAG TGGCCACGGT TACTGCCCCT 1320 GAAGAGGATA AGGAATCAGA AGCAGAAATT GAAGTAGAGA CCAGGGAGGA ATGTAAGTGT 1380 ATATCTGCCT TTACTTTGTT TTATTTGTGC TCTGTTTTCC TCTGGTTAAC CTCCAGCAGT 1440 TAGCTACTGA ACTCTGTTGC GTTCAAACAT AATTCTGGAG GCC 1483 (2) INFORMATION FOR SEQ ID NO:89: (i) 
SEQUENCE CHARACTERISTICS: (A) LENGTH: 3230 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double ID) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: ACGTCTACCC ATTCTTATTT CTGCAGCATA TATAGCAGTG ACGAGGATGA TGAGGACTTT 60 GAGATGTGTG ACCATGACTA TGATGGGCTG CTTCCCAAGT CTGGAAAGCG TCACTTGGGG 120 AAAACAAGGT GGACCCGGGA AGAGGATGAA AAACTGAAGA AGCTGGTGGA ACAGAATGGA 180 ACAGATGACT GGAAAGTTAT TGCCAATTAT CTCCCGAATC GAACAGATGT GCAGTGCCAG 240 CACCGATGGC AGAAAGTACT AAACCCTGAG CTCATCAAGG GTCCTTGGAC CAAAGAAGAA 300 GATCAGAGAG TGATAGAGCT TGTACAGAAA TACGGTCCGA AACGTTGGTC TGTTATTGCC 360 AAGCACTTAA AGGGGAGAAT TGGAAAACAA TGTAGGGAGA GGTGGCATAA CCACTTGAAT 420 CCAGAAGTTA AGAAAACCTC CTGGACAGAA GAGGAAGACA GAATTATTTA CCAGGCACAC 480 AAGAGACTGG GGAACAGATG GGCAGAAATC GCAAAGCTAC TGCCTGGACG AACTGATAAT 540 GCTATCAAGA ACCACTGGAA TTCTACAATG CGTCGGAAGG TCGAACAGGA AGGTTATCTG 600 CAGGAGTCTT CAAAAGCCAG CCAGCCAGCA GTGGCCACAA GCTTCCAGAA GAACAGTCAT 660 TTGATGGGTT TTGCTCAGGC TCCGCCTACA GCTCAACTCC CTGCCACTGG CCAGCCCACT 720 GTTAACAACG ACTATTCCTA TTACCACATT TCTGAAGCAC AAAATGTCTC CAGTCATGTT 780 CCATACCCTG TAGCGTTACA TGTAAATATA GTCAATGTCC CTCAGCCAGC TGCCGCAGCC 840 ATTCAGAGAC ACTATAATGA TGAAGACCCT GAGAAGGAAA AGCGAATAAA GGAATTAGAA 900 TTGCTCCTAA TGTCAACTGA GAATGAGCTA AAAGGACAGC AGTTATGTGG GCCATTACTG 960 AATTCTGACA TCTTTAGCGA GGGCAGCC AACTGGGATG GCTCCTTGTG CTTTGCAACA 1020 TACATAGTTA ACCAACAAAG ACAATAAAAG AGAAGTGATG CCCAACACAG AACCACACAT 1080 GCAGCTACCC CGGGTGGCAC AGCACCACCA TTGCCGACCA CACCAGACCT CATGGAGACA 1140 GTGCACCTGT TTCCTGTTTG GGAGAACACC ACTCCACTCC ATCTCTGCCA GCGGATCCTG 1200 GCTCCCTACC TGAAGAAAGC GCCTCGCCAG CAAGGTGCAT GATCGTCCAC CAGGGCACCA 1260 TTCTGGATAA TGTTAAGAAC CTCTTAGAAT TTGCAGAAAC ACTCCAATTT ATAGATTCTT 1320 TCTTAAACAC TTCCAGTAAC CATGAAAACT CAGACTTGGA AATGCCTTCT TTAACTTCCA 1380 CCCCCCTCAT TGGTCACAAA TTGACTGTTA CAACACCATT TCATAGAGAC CAGACTGTGA 1440 AAACTCAAAA GGAAAATACT GTTTTAGAA CCCCAGCTAT CAAAAGGTCA ATCTTAGAAA 1500 GCTCTCCAAG AACTCCTACA CCATTCAAAC ATGCACTTGC 
AGCTCAAGAA ATTAAATACG 1560 GTCCCCTGAA GATGCTACCT CAGACACCCT CTCATCTAGT AGAAGATCTG CAGGATGTGA 1620 TCAAACAGGA ATCTGATGAA TCTGGAATTG TTGCTGAGTT TCAAGAAAAT GGACCACCCT 1680 TACTGAAGAA AATCAAACAA GAGGTGGAAT CTCCAACTGA TAAATCAGGA AACTTCTTCT 1740 GCTCACACCA CTGGGAAGGG GACAGTCTGA ATACCAACT GTTCACGCAG ACCTCGCCTG 1800 TGGCAGATGC ACCGAATATT CTTACAAGCT CCGTTTTAAT GGCACCAGCA TCAGAAGATG 1860 AAGACAATGT TCTCAAAGCA TTTACAGTAC CTAAAAACAG GTCCCTGGCG AGCCCCTTGC 1920 AGCCTTGTAG CAGTACCTGG GAACCTGCAT CCTGTGGAAA GATGGAGGAG CAGATGACAT 1980 CTTCCAGTCA AGCTCGTAAA TACGTGAATG CATTCTCAGC CCGGACGCTG GTCATGTGAG 2040 ACATTTCCAG AAAAGCATTA TGGTTTTCAG AACACTTCAA GTTGACTTGG GATATATCAT 2100 TCCTCAACAT GAAACTTTC ATGAATGGGA GAAGAACCTA TTTTGTTGT GGTACAACAG 2160 TTGAGAGCAG CACCAAGTGC ATTTAGTTGA ATGAAGTCTT CTTGGATTTC ACCCAACTAA 2220 AAGGATTTTT AAAAATAAAT AACAGTCTTA CCTAAATTAT TAGGTAATGA ATTGTAGCCA 2280 GTTGTTAATA TCTTAATGCA GATTTTTTTA AAAAAAACAT AAAATGATTT ATCTGTATTT 2340 TAAAGGATCC AACAGATCAG TATTTTTTCC TGTGATGGGT TTTTTGAAAT TTGACACATT 2400 AAAAGGTACT CCAGTATTTC ACTITCTCG ATCACTAAAC ATATGCATAT ATTTTTAAAA 2460 ATCAGTAAAA GCATTACTCT AAGTGTAGAC TTAATACCAT GTGACATTTA ATCCAGATTG 2520 TAAATGCTCA TTTATGGTTA ATGACATTGA AGGTACATTT ATTGTACCAA ACCATTTTAT 2580 GAGTTTTCTG <RTI ID=195.10> TTAGCTTGCT TTAAAAATTA TTACTGTAAG AAATAGTTTT ATAAAAAATT 2640 ATATTTTTAT TCAGTAATTT AATTTGTAA ATGCCAAATG AAAAACGTTT TTTGCTGCTA 2700 TGGTCTTAGC CTGTAGACAT GCTGCTAGTA TCAGAGGGGC AGTAGAGCTT GGACAGAAAG 2760 AAAAGAAACT TGGTGTTAGG TAATTGACTA TGCACTAGTA TITCAGACIT TTTAATTTTA 2820 TATATATATA CATTTTTTTT CCTTCTGCAA TACATTTGAA AACTTGTTTG GGAGACTCTG 2880 CATTTTTTAT TGTGGTTTTT TTGTTATTGT TGGTTTATAC AAGCATGCGT TGCACTTCTT 2940 TTTTGGGAGA TGTGTGTTGT TGATGTTCTA TGTTTTGTTT TGAGTGTAGC CTGACTGTTT 3000 TATAATTTGG GAGTTCTGCA TTTGATCCGC ATCCCCTGTG GTCTCTAAGT GTATGGTCTC 3060 AGAACTGTTG CATGGATCCT GTGTTTGCAA CTGGGGAGAC AGAAACTGTG GTTGATAGCC 3120 AGTCACTGCC TTAAGAACAT TTGATGCAAG ATGGCCAGCA CTGAACTTTT GAGATATGAC 3180 GGTGTACTTA CTGCCTTGTA GCAAAATAAA GATGTGCCCT 
TATTTTACCT 3230 12) INFORMATION FOR SEQ ID NO:90: li) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1035 base pairs IB) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear lii) MOLECULE TYPE: DNA Igenomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:90: GAGGATGAAA AACTGAAGAA GCTGGTGGAA CAGAATGGAA CAGATGACTG GAAAGTTATT 60 GCCAATTATC TCCCGAATCG AACAGATGTG CAGTGCCAGC ACCGATGGCA GAAAGTACTA 120 AACCCTGAGC TCATCAAGGG TCCTTGGACC AAAGAAGAAG ATCAGAGAGT G AATGAGCTAA AAGGACAGCA GGTGCTACCA GTAAGACTGT CATCATGTGC TTGAATGAGG 840 GGATAGCAGC TTTGCCTCAG TTTACCTAAG CGCTCTTCTC TTCTAAATAT TACACTTAGC 900 AAGGCTCCAT ATATCCATTC AGAATGTCTC AACACAAGAA GTTGCTTGTA GTAAAATGTA 960 GTTGGTATCA GATTATATGC TGATTAAATT GGAAGCAGTC TTTTTGTAAT TGCAATAAAA 1020 ATGCAATATC CACTT 1035 (2) INFORMATION FOR SEQ ID NO:91: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 3225 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA Igenomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:91: GGCGGCAGCG CCCTGCCGAC GCCGGGGAGG GACGCAGGCA GGCGGCGGGC AGCGGGAGGC 60 GGCACCCCGG TGCTCCCCGC GGCTCTCGGC GGAGCCCCGC CGCCCGCCGC GCCATGGCCC 120 GAAGACCCCG GCACAGCATA TATAGCAGTG ACGAGGATGA TGAGGACTTT GAGATGTGTG 180 ACCATGACTA TGATGGGCTG CTTCCCAAGT CTGGAAAGCG TCACTTGGGG AAAACAAGGT 240 GGACCCGGGA AGAGGATGAA AAACTGAAGA AGCTGGTGGA ACAGAATGGA ACAGATGACT 300 GGAAAGTTAT TGCCAATTAT CTCCCGAATC GAACAGATGT GCAGTGCCAG CACCGATGGC 360 AGAAAGTACT AAACCCTGAG CTCATCAAGG GTCCTTGGAC CAAAGAAGAA GATCAGAGAG 420 TGATAGAGCT TGTACAGAAA TACGGTCCGA AACGTTGGTC TGTTATTGCC AAGCACTTAA 480 AGGGGAGAAT TGGAAAACAA TGTAGGGAGA GGTGGCATAA CCACTTGAAT CCAGAAGTTA 540 AGAAAACCTC CTGGACAGAA GAGGAAGACA GAATTATTTA CCAGGCACAC AAGAGACTGG 600 GGAACAGATG GGCAGAAATC GCAAAGCTAC TGCCTGGACG AACTGATAAT GCTATCAAGA 660 ACCACTGGAA TTCTACAATG CGTCGGAAGG TCGAACAGGA AGGTTATCTG CAGGAGTCTT 720 CAAAAGCCAG CCAGCCAGCA GTGGCCACAA GCTTCCAGAA GAACAGTCAT TTGATGGGTT 780 TTGCTCAGGC TCCGCCTACA GCTCAACTCC CTGCCACTGG CCAGCCCACT GTTAACAACG 840 ACTATTCCTA TTACCACATT TCTGAAGCAC 
AAAATGTCTC CAGTCATGTT CCATACCCTG 900 TAGCGTTACA TGTAAATATA GTCAATGTCC CTCAGCCAGC TGCCGCAGCC ATTCAGAGAC 960 ACTATAATGA TGAAGACCCT GAGAAGGAAA AGCGAATAAA GGAATTAGAA TTGCTCCTAA 1020 TGTCAACCGA GAATGAGCTA AAAGGACAGC AGGTGCTACC AACACAGAAC CACACATGCA 1080 GCTACCCCGG GTGGCACAGC ACCACCATTG CCGACCACAC CAGACCTCAT GGAGACAGTG 1140 CACCTGTTTC CTGTTTGGGA GAACACCACT CCACTCCATC TCTGCCAGCG GATCCTGGCT 1200 CCCTACCTGA AGAAAGCGCC TCGCCAGCAA GGTGCATGAT CGTCCACCAG GGCACCATTC 1260 TGGATAATGT TAAGAACCTC TTAGAATTTG CAGAAACACT CCAATTTATA GATTCTTTCT 1320 TAAACACTTC CAGTAACCAT GAAAACTCAG ACTTGGAAAT GCCTTCTITA ACTTCCACCC 1380 CCCTCATTGG TCACAAATTG ACTGTTACAA CACCATTTCA TAGAGACCAG ACTGTGAAAA 1440 CTCAAAAGGA AAATACTGTT TTTAGAACCC CAGCTATCAA AAGGTCAATC TTAGAAAGCT 1500 CTCCAAGAAC TCCTACACCA TTCAAACATG CACTTGCAGC TCAAGAAATT AAATACGGTC 1560 CCCTGAAGAT GCTACCTCAG ACACCCTCTC ATCTAGTAGA AGATCTGCAG GATGTGATCA 1620 AACAGGAATC TGATGAATCT GGATTTGTTG CTGAGTTTCA AGAAAATGGA CCACCCTTAC 1680 TGAAGAAAAT CAAACAAGAG GTGGAATCTC CAACTGATAA ATCAGGAAAC TTCTTCTGCT 1740 CACACCACTG GGAAGGGGAC AGTCTGAATA CCCAACTGTT CACGCAGACC TCGCCTGTGC 1800 GAGATGCACC GAATATTCTT ACAAGCTCCG TTTTAATGGC ACCAGCATCA GAAGATGAAG 1860 ACAATGTTCT CAAAGCATTT ACAGTACCTA AAAACAGGTC CCTGGCGAGC CCCTTGCAGC 1920 CTTGTAGCAG TACCTGGGAA CCTGCATCCT GTGGAAAGAT GGAGGAGCAG ATGACATCTT 1980 CCAGTCAAGC TCGTAAATAC GTGAATGCAT TCTCAGCCCG GACGCTGGTC ATGTGAGACA 2040 m CCAGAAA AGCATTATGG TTTTCAGAAC AGTTCAAGTT GACTTGGGAT ATATCATTCC 2100 TCAACATGAA ACTTTTCATG AATGGGAGAA GAACCTATTT TTGTTGTGGT ACAACAGTTG 2160 AGAGCACGAC CAAGTGCATT TAGTTGAATG AAGTCTTCTT GGATTTCACC CAACTAAAAG 2220 GATTTTTAAA AATAAATAAC AGTCTTACCT AAATTATTAG GTAATGAATT GTAGCCAGTT 2280 GTTAATATCT TAATGCAGAT TTTTTTAAAA AAAAACATAA AATGATTTAT CTGGTATTTT 2340 AAAGGATCCA ACAGATCAGT ATTTTTTCCT GTGATGGGTT TTTTGAAATT TGACACATTA 2400 AAAGGTACTC CAGTATTTCA CTTTTCTCGA TCACTAAACA TATGCATATA TTITTAAAAA 2460 TCAGTAAAAG CATTACTCTA AGTGTAGACT TAATACCATG TGACATTTAA TCCAGATTGT 2520 AAATGCTCAT TTATGGTTAA TGACATTGAA GGTACATTTA 
TTGTACCAAA CCATTTTATG 2580 AGTTTTCTGT TAGCTTGCTT TAAAAATTAT TACTGTAAGA AATAGTTTTA TAAAAAATTA 2640 TATTTTTATT CAGTAATTTA ATTTTGTAAA TGCCAAATGA AAAACGTTTT TTGCTGCTAT 2700 GGTCTTAGCC TGTAGACATG CTGCTAGTAT CAGAGGGGCA GTAGAGCTTG GACAGAAAGA 2760 AAAGAAACTT GGTGTTAGGT AATTGACTAT GCACTAGTAT TTCAGACTTT TTAATTTTAT 2820 ATATATATAC ATTTTTTTTC CTTCTGCAAT ACATTTGAAA ACTTGTTTGG GAGACTCTGC 2880 ATTTTTTATT GTGGTTTTIT TGTTATTGTT GGTTTATACA AGCATGCGTT GCACTTCTTT 2940 TTTGGGAGAT GTGTGTTGTT CATGTTCTAT GTTTTGTTTT GTGTGTAGCC TGACTGTTTT 3000 ATAATTTGGG AGTTCTCGAT TTGATCCGCA TCCCCTGTGG TTTCTAAGTG TATGGTCTCA 3060 GAACTGTTGC ATGGATCCTG TGTTTGCAAC TGGGGAGACA GAAACTGTGG TTGATAGCCA 3120 GTCACTGCCT TAAGAACATT TGATGCAAGA TGGCCAGCAC TGAACTTTTG AGATATGACG 3180 GTGTACTTAC TGCCTTGTAG CAAAATAAAG ATGTGCCCTT ATTIT 3225 12) INFORMATION FOR SEQ ID NO:92: li) SEQUENCE CHARACTERISTICS: (A) LENGTH: 420 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92: TTTATAGATT CTGATTCTTC ATCATGGTGT GATCTCAGCA GTTTTGAATT CTTTGAAGAA 60 GCAGATTTTT CACCTAGCCA ACATCACACA GGCAAAGCCC TACAGTTTCA GCAAAGAGAG 120 GGCAATGGGA CTAAACCTGC AGGAGAACCT AGCCCAAGGG TGAACAAACG TATGTTGAGT 180 GAGAGTTCAC TTGACCCACC CAAGGTCTTA CCTCCTGCAA GGCACAGCAC AATTCCACTG 240 GTCATCCTTC GAAAAAAACG GGGCCAGGCC AGCCCCTTAG CCACTGGAGA CTGTAGCTCC 300 TTCATATTTG CTGACGTCAG CAGTTCAACT CCCAAGCGTT CCCCTGTCAA AAGCCTACCC 360 TTCTCTCCCT CGCAGTTCTT AAACACTTCC AGTAACCATG AAAACTCAGA CTTGGAAATG 420 (2) INFORMATION FOR SEQ ID NO:93: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 790 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: AGAATTTAGA AGCAGGGAGA TGTAATTAGA GAATATGTCA TTACCTAGAA ATGAAGCCAC 60 AAAGTCTAAA GTAAAGCAGT TAGAAAGGAA GTGGACAGAT AAATAGATGA TTAATGTATT 120 TAGTGTCATT TATCTATACA CTAAAACTTT TATTCTGTGA ATGCTTTTCC TCAAATTCTT 180 CCCTGCAAAA AGAAATAAAA TATTACTAAG 
GTAGCAACTC ATTI: TTITGA AAATCCTTTA 240 TATTTAGGTG CTCCAAATAC TGCAGAATTA AGGATTTGTC GTGTAAACAA GAATTGTGGA 300 AGTGTCAGAG GAGGAGATGA AATATTTCTA CTTTGTGACA AAGTTCAGAA AGGTATTTAT 360 TTATTTCATT GAATTTAGAA TAAATTTTAG ATTAATAGAT GCAGTTACTT TGTTTTCCCA 420 TTTTTTTTTT TTTGGTTTCT TATTGACTAG ATGACATAGA AGTTCGTTTT GTGTTGAACG 480 ATTGGGAAGC AAAAGGCATC TTTTCACAAG CTGATGTACA CCGTCAAGTA GCCATTGTTT 540 TCAAAACTCC ACCATATTGC AAAGCTATCA CAGAACCCGT AACAGTAAAA ATGCAGTTGC 600 GGAGACCTTC TGACCAGGAA GTTAGTGAAT CTATGGATTT TAGATATCTG CCAGATGAAA 660 AAGGTATGAC ATTTTGCTGG TAATAATTTA TATATTTCTT GAAGTGGTCC TGCTAATAAC 720 ATCTTCTTGT AATATTCATT TGAGTACAGT TATGTATATT CATAATTTAT GTTTCTTTTC 780 CTGGAAGCTT 790
Claims
WHAT IS CLAIMED:
1. A method for producing an oligonucleotide having an Rp stereoisomeric alkyl- or aryl-phosphonate linkage between a first nucleotide and a second nucleotide in the oligonucleotide, wherein said oligonucleotide has the formula:
EMI202.1
which comprises:
(a) reacting a first nucleotide of the formula:
EMI202.2
with an alkyl- or aryl-phosphonothioate intermediate of the formula:
EMI203.1
under conditions sufficient to produce said Rp stereoisomeric alkyl- or aryl-phosphonate linkage, wherein:
Y1 is a hydrogen, phosphate, phosphate present in said oligonucleotide, or V1;
Y2 is a hydrogen, phosphate, phosphate present in said oligonucleotide or V2;
X is hydroxy or V3;
M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group;
each B group is independently a purine or pyrimidine base and each B group can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl;
V1 is a protecting group, solid support or phosphate attached to the penultimate nucleotide of said oligonucleotide;
V2 is a protecting group;
V3 is hydrogen or O-Y3 wherein Y3 is a lower alkyl or protecting group;
A is an activating group; and
said intermediate has an Sp stereoisomeric configuration at the phosphate;
and
(b) when V1, V2 or V3 is a protecting group, optionally removing V1, V2 or V3 protecting groups.
2. A method of producing a polynucleotide chain of an oligonucleotide comprising at least one Rp-alkyl-phosphonate or Rp-aryl-phosphonate linkage, wherein said oligonucleotide has the formula:
EMI204.1
which comprises:
(a) reacting a 5'-terminal nucleotide of the formula:
EMI204.2
with an alkyl- or aryl-phosphonothioate nucleotide intermediate of the formula:
EMI205.1
under conditions sufficient to produce said Rp stereoisomeric alkyl- or aryl-phosphonate linkage and so generate a new 5'-terminal nucleotide, wherein:
Y1 is a hydrogen, phosphate, phosphate present in said oligonucleotide or V1;
Y2 is a hydrogen, phosphate, phosphate present in said oligonucleotide or V2;
X is either hydroxy or V3;
M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group;
n is an integer from 0 to 200;
each B group is independently a purine or pyrimidine base which can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl;
V1 is a protecting group, solid support or phosphate present on the penultimate nucleotide of said oligonucleotide;
V2 is a protecting group;
V3 is hydroxy or OY3 wherein Y3 is lower alkyl or protecting group;
A is an activating group;
and
said intermediate has an Sp stereoisomeric phosphorus configuration;
(b) removing said V2 protecting group from said new 5'-terminal nucleotide;
(c) reacting the product of (b) with another alkyl- or aryl-phosphonothioate nucleotide intermediate under conditions sufficient to produce the Rp stereoisomeric linkage and so generate a new 5'-terminal nucleotide;
(d) repeating steps b and c to extend said polynucleotide chain n-1 times; and
(e) when V1, V2 or V3 is a protecting group optionally removing said V1, V2 or V3 protecting group.
3. A method of producing an alkyl- or aryl-phosphonothioate nucleotide intermediate, having an Sp stereoisomeric configuration at the phosphorus, which intermediate is of the formula:
EMI206.1
which comprises:
a) reacting an alkyl- or aryl-phosphonothioate nucleotide of the formula:
EMI207.1
with an A-L activator under conditions sufficient to produce said intermediate without inversion of said Sp stereoisomeric phosphorus configuration, wherein:
A is an activating group;
V2 is a protecting group;
V3 is either hydrogen, or OY3 wherein Y3 is lower alkyl or protecting group;
M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group;
B is a purine or pyrimidine base which can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl;
L is a leaving group which can be attached to A; and
said alkyl- or aryl-phosphonothioate nucleotide has an Sp stereoisomeric phosphorus configuration.
4. The method of any one of Claims 1 or 2 which further comprises performing said method by automation.
5. The method of Claim 4 wherein V1 is a solid support or a phosphate present on the penultimate nucleotide of said oligonucleotide.
6. The method of any one of Claims 1 or 2 which further comprises removing the V1 solid support.
7. The method of Claim 1 or 2 wherein Y1 or Y2 is a phosphate present in said oligonucleotide.
8. The method of Claim 7 wherein said method further comprises adding 1-50 nucleotides joined by O-PO2-O linkages.
9. The method of any one of Claims 1-3 wherein M is lower alkyl or aryl.
10. The method of Claim 9 wherein M is lower alkyl.
11. The method of Claim 10 wherein M is methyl.
12. The method of any one of Claims 1-3 wherein each B is independently a purine or a pyrimidine which can have 1-2 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy or lower alkylamino.
13. The method of Claim 12 wherein each B is independently a guanine, adenine, thymine, cytosine or uracil.
14. The method of Claim 1 or 2 wherein said conditions sufficient to produce said Rp stereoisomeric linkage comprise a time, a temperature, a solvent or a reactant concentration sufficient for nucleophilic attack by the 5'-oxygen of said first nucleotide or said 5'-terminal nucleotide upon the phosphorus of said intermediate to displace the sulfur and invert the phosphorus configuration.
15. The method of Claim 14 wherein said time is about 5 min to about 10 hr.
16. The method of Claim 14 wherein said time is about 30 min to about 90 min.
17. The method of Claim 14 wherein said temperature is about 20°C to about 25°C.
18. The method of Claim 14 wherein said solvent is anhydrous.
19. The method of Claim 14 wherein said reactant concentration is a molar ratio of said first nucleotide or said 5'-terminal nucleotide to intermediate which ranges from about 1:10 to about 5:1.
20. The method of Claim 19 wherein said molar ratio is about 1:2.
21. The method of Claim 3 wherein said conditions sufficient to displace the L group comprise a time, a solvent, a temperature or a reactant concentration sufficient for nucleophilic displacement of L by the sulfur present on the alkyl- or aryl-phosphonothioate nucleotide.
22. The method of Claim 21 wherein said time is about 1 sec to about 30 min.
23. The method of Claim 22 wherein said time is about 1 min.
24. The method of Claim 21 wherein said solvent is an anhydrous solvent.
25. The method of Claim 24 wherein said solvent is acetonitrile or dimethylformamide.
26. The method of Claim 21 wherein said temperature is about 0°C to about 60°C.
27. The method of Claim 26 wherein said temperature is about 4°C to about 45°C.
28. The method of Claim 27 wherein said temperature is about 20°C to about 25°C.
29. The method of Claim 21 wherein said reactant concentration comprises a molar ratio of alkyl- or aryl-phosphonothioate nucleotide to A-L activator of about 1:10 to about 10:1.
30. The method of Claim 29 wherein said molar ratio is about 1:5 to about 3:1.
31. The method of Claim 30 wherein said molar ratio is about 1:2.
32. The method of Claim 2 wherein n is 5 to 200.
33. The method of Claim 2 wherein n is 8 to 200.
34. The method of Claim 2 wherein n is 10 to 200.
35. The method of Claim 2 wherein n is 14 to 200.
36. The method of Claim 1, 2 or 3 wherein said A group comprises a heteroaromatic ring containing 1 to 4 nitrogen atoms, wherein said A group is of the formula:
EMI210.1
wherein:
Q is C-R1 or N;
D is C-R2 or N;
E is C-R3 or N;
G is C-R4 or N;
J is C-R5 or N;
Y is -S-, -NR6- or -O-;
R is a substituent attached to one nitrogen atom, wherein said substituent is lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl;
R1, R2, R3, R4 and R5 are independently hydrogen, lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl, arylalkyl, or R1 and R2 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R3 and R4 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R4 and R5 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring; and
R6 is H or lower alkyl.
37. The method of Claim 36 wherein said R is lower alkyl.
38. The method of Claim 37 wherein said activating group A is of the formula:
EMI211.1
39. The method of Claim 38 wherein said activating group A is of the formula:
EMI212.1
40. The method of Claim 39 wherein said A group is of the formula:
EMI212.2
41. The method of Claim 40 wherein said A activating group is 2-N-methylpyridinium.
42. The method of Claim 3 wherein said A-L group comprises a heteroaromatic ring containing 1-4 nitrogen atoms of the formula:
EMI212.3
Q is C-R1 or N;
D is C-R2 or N;
E is C-R3 or N;
G is C-R4 or N;
J is C-R5 or N;
Y is -S-, -NR6- or -O-;
R is a substituent attached to one nitrogen atom, wherein said substituent is lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl;
R1, R2, R3, R4 and R5 are independently hydrogen, lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl, arylalkyl, or R1 and R2 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R3 and R4 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R4 and R5 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring;
R6 is H or lower alkyl; and
L is a leaving group comprising a halo, nitro, diazo, azido, trialkylamino, alkoxy, aryloxy, alkyl sulfonate, lower fluoroalkylsulfonate, aryl sulfonate, alkyl sulfinate or aryl sulfinate group.
43. The method of Claim 42 wherein A-L is a salt of the formula:
EMI213.1
and Z is a counter ion.
44. The method of Claim 43 wherein A-L is a salt of the formula:
EMI214.1
wherein Z is a counter ion.
45. The method of Claim 44 wherein A-L is a salt of the formula:
EMI214.2
46. The method of Claim 42 wherein A-L is a salt of 2-N-lower alkyl pyridinium.
47. The method of Claim 46 wherein said salt of A-L is 2-bromo-N-methylpyridine bromide, 2-bromo-N-methylpyridine iodide, 2-chloro-N-methylpyridine bromide or 2-chloro-N-methylpyridine iodide.
48. A compartmentalized kit for producing a polynucleotide chain of an oligonucleotide having at least five sequential Rp-alkyl-phosphonate or Rp-aryl-phosphonate linkages, wherein the oligonucleotide has the formula:
EMI215.1
which comprises:
(a) a first container adapted to contain an A-L salt;
and
(b) a second container adapted to contain a salt of a first alkyl- or aryl-phosphonothioate nucleotide precursor of the formula:
EMI215.2
wherein:
Y1 is a hydrogen, phosphate, phosphate present in said oligonucleotide, or V1;
Y2 is a hydrogen, phosphate, phosphate present in said oligonucleotide or V2;
X is either hydroxy or V3;
M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group;
n is an integer from 4 to 200;
each B group is independently a purine or pyrimidine base which can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl;
V1 is a protecting group, a solid support or a phosphate attached to the penultimate nucleotide of said oligonucleotide;
V2 is a protecting group;
V3 is a hydrogen or O-Y3, wherein Y3 is lower alkyl or a protecting group; and
said precursor has an Sp stereoisomeric phosphorus configuration.
49. The kit of Claim 48 which is further adapted to contain at least one additional container containing a salt of a second alkyl- or aryl-phosphonothioate nucleotide precursor, wherein said second precursor has an Sp stereoisomeric phosphorus configuration and a different B group than said first precursor.
50. The kit of Claim 48 wherein said first alkyl- or aryl-phosphonothioate nucleotide precursor has a B group selected from the group of guanine, adenine, thymine, cytosine or uracil.
51. The kit of Claim 49 wherein said second alkyl- or aryl-phosphonothioate nucleotide precursor has a B group selected from the group of guanine, adenine, thymine, cytosine or uracil.
52. The kit of Claim 48 wherein A-L comprises a heteroaromatic ring containing 1-4 nitrogen atoms of the formula:
EMI217.1
or salts thereof, wherein:
Q is C-R1 or N;
D is C-R2 or N;
E is C-R3 or N;
G is C-R4 or N;
J is C-R5 or N;
Y is -S-, -NR6- or -O-;
R is lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl;
R1, R2, R3, R4 and R5 are independently hydrogen, lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl, arylalkyl, or R1 and R2 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R3 and R4 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R4 and R5 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring;
R6 is H or lower alkyl; and
L is a leaving group comprising a halo, nitro, diazo, azido, trialkylamino, alkoxy, aryloxy, alkyl sulfonate, lower fluoroalkylsulfonate, aryl sulfonate, alkyl sulfinate or aryl sulfinate group.
53. The kit of Claim 52 wherein A-L is 2-bromo-N-methylpyridinium bromide, 2-bromo-N-methylpyridinium iodide, 2-chloro-N-methylpyridinium bromide or 2-chloro-N-methylpyridinium iodide.
54. An alkyl- or aryl-phosphonothioate nucleotide intermediate, having an Sp stereoisomeric configuration at the phosphorus, said intermediate having the formula:
EMI218.1
wherein:
V2 is a protecting group;
V3 is hydroxy or O-Y3, wherein Y3 is a lower alkyl or protecting group;
M is a lower alkyl, cycloalkyl, thioxo, a thio-lower alkyl, aryl or aryl-lower alkyl group which can be substituted with at least one hydroxy, halogen or cyano group;
B is a purine or pyrimidine base which can have 1-3 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy, amino-lower alkyl, lower alkylamino, hydroxy-lower alkyl, aryl and aryl lower alkyl; and
A is an activating group.
55. The intermediate of Claim 54 wherein M is lower alkyl or aryl.
56. The intermediate of Claim 55 wherein M is lower alkyl.
57. The intermediate of Claim 56 wherein M is methyl.
58. The intermediate of Claim 54 wherein B is a purine or a pyrimidine which can have 1-2 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy or lower alkylamino.
59. The intermediate of Claim 58 wherein B is a guanine, adenine, thymine, cytosine or uracil.
60. The intermediate of Claim 54 wherein said V2 or V3 protecting group is lower alkyl, lower cyanoalkyl, lower alkanoyl, aroyl, aryloxy, aryloxy-lower alkanoyl, haloaryl, fluorenyl methoxy carbonyl, trityl, monomethoxytrityl or dimethoxytrityl.
61. The intermediate of Claim 60 wherein said protecting group is isopropyl, isobutyl, 2-cyanoethyl, acetyl, benzoyl, phenoxyacetyl, halophenyl, dimethoxytrityl or monomethoxytrityl.
62. The intermediate of Claim 54 wherein said A group comprises a heteroaromatic ring containing 1 to 4 nitrogen atoms of the formula:
EMI220.1
wherein:
Q is C-R1 or N;
D is C-R2 or N;
E is C-R3 or N;
G is C-R4 or N;
J is C-R5 or N;
Y is -S-, -NR6- or -O-;
R is a substituent attached to one nitrogen atom, wherein said substituent is lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl or arylalkyl;
R1, R2, R3, R4 and R5 are independently hydrogen, lower alkyl, cycloalkyl, cycloalkyl alkyl, aryl, arylalkyl, or R1 and R2 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R3 and R4 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring, or R4 and R5 are taken together with the carbon atoms to which they are attached to form a 5 or 6 membered aromatic or heteroaromatic ring; and
R6 is H or lower alkyl.
63. The intermediate of Claim 62 wherein said R group is lower alkyl.
64. The intermediate of Claim 62 wherein A is of the formula:
EMI221.1
65. The intermediate of Claim 64 wherein A is of the formula:
EMI221.2
66. The intermediate of Claim 65 wherein A is of the formula:
EMI221.3
67. The intermediate of Claim 66 wherein said A group is of the formula:
EMI221.4
68. The intermediate of Claim 67 wherein A is 2-N-methylpyridinium.
69. An oligonucleotide comprising at least 5 sequential Rp alkyl- or aryl-phosphonate linkages produced by the method of Claim 2.
70. The oligonucleotide of Claim 69 which comprises 8-200 sequential Rp alkyl- or aryl-phosphonate linkages.
71. The oligonucleotide of Claim 69 which comprises 10-200 sequential Rp alkyl- or arylphosphonate linkages.
72. The oligonucleotide of Claim 69 which comprises 12-200 sequential Rp alkyl- or arylphosphonate linkages.
73. The oligonucleotide of Claim 71 or 72 wherein 85% to 100% of said linkages are Rp alkyl- or aryl-phosphonate linkages.
74. The oligonucleotide of Claim 69 which further comprises 1-50 nucleotides joined by -O-PO2-O- linkages.
75. The oligonucleotide of Claim 69 wherein M is lower alkyl or aryl.
76. The oligonucleotide of Claim 75 wherein M is lower alkyl.
77. The oligonucleotide of Claim 76 wherein M is methyl.
78. The oligonucleotide of Claim 69 wherein each B is independently a purine or a pyrimidine which can have 1-2 substituents selected from the group consisting of lower alkyl, amino, oxo, hydroxy, lower alkoxy or lower alkylamino.
79. The oligonucleotide of Claim 78 wherein each B is independently a guanine, adenine, thymine, cytosine or uracil.
80. The oligonucleotide of Claim 69 wherein Y1 or Y2 are independently hydrogen or phosphate.
81. The oligonucleotide of Claim 69 wherein X is hydroxy or V3 and V3 is hydrogen or O-Y3 wherein Y3 is lower alkyl.
82. The oligonucleotide of Claim 69 which further comprises an agent to facilitate cellular delivery.
83. The oligonucleotide of Claim 82 wherein said agent is a non-polar group, steroid, hormone, polycation, protein carrier, or viral or bacterial protein capable of cell membrane penetration.
84. The oligonucleotide of Claim 69 wherein said oligonucleotide further comprises a drug or a drug analog.
85. The oligonucleotide of Claim 69 wherein said oligonucleotide further comprises a reporter molecule.
86. A compartmentalized kit for detection or diagnosis of a target nucleic acid, comprising at least one first container adapted to contain any one of the oligonucleotides of Claim 69 or 85.
87. A compartmentalized kit for isolation of a template nucleic acid, comprising at least one first container adapted to contain the oligonucleotide of Claim 69, wherein said oligonucleotide is complementary to a target contained within said template.
88. The kit of Claim 87 wherein said template is poly(A)+ mRNA.
89. The kit of Claim 88 wherein each said B group on said oligonucleotide is thymine or uracil.
90. A method of regulating biosynthesis of a DNA, an RNA or a protein which comprises administering to a patient a pharmaceutically effective amount of at least one oligonucleotide of Claim 69 with a nucleic acid template for said DNA, said RNA or said protein.
91. The method of Claim 90 wherein said biosynthesis comprises at least one of DNA replication, DNA reverse transcription, RNA transcription, RNA splicing, RNA polyadenylation, RNA translocation and protein translation.
92. The method of Claim 91 wherein said template for said DNA replication is an RNA template or a DNA template.
93. The method of Claim 92 wherein said target of said oligonucleotide for regulating said DNA replication is an origin of replication or a primer binding site.
94. The method of Claim 91 wherein said target of said oligonucleotide for regulating said DNA reverse transcription is a primer binding site, a site in a retroviral genome, or a site in an mRNA.
95. The method of Claim 91 wherein said target of said oligonucleotide for regulating said RNA transcription is a promoter, a repressor binding site, an operator, an enhancer, a transcription regulatory element or a site in a mRNA encoding region.
96. The method of Claim 91 wherein said target of said oligonucleotide for regulating said RNA splicing is at least one of a 5' splice junction, an intron branch point or a 3' splice junction.
97. The method of Claim 91 wherein said target of said oligonucleotide for regulating said RNA polyadenylation is a polyadenylation site.
98. The method of Claim 91 wherein said target of said oligonucleotide for regulating said RNA translocation is a poly(A) tail.
99. The method of Claim 91 wherein said template for said protein translation is an mRNA template.
100. The method of Claim 99 wherein said target of said template is a ribosome binding site, a 5' mRNA cap, an initiation codon, a site between a 5' mRNA cap site and an initiation codon, or a site in a protein coding region.
101. The method of Claim 90 wherein said template is a viral DNA or RNA template.
102. The method of Claim 90 wherein said template has a nucleotide sequence comprising any one of SEQ ID NO:1 to SEQ ID NO:93.
103. A pharmaceutical composition for regulating biosynthesis of a nucleic acid or protein comprising a pharmaceutically effective amount of the oligonucleotide of Claim 69 and a pharmaceutically acceptable carrier.
104. A complex formed between the oligonucleotide of Claim 69 and a target nucleic acid.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU46611/93A AU4661193A (en) | 1992-06-30 | 1993-06-30 | Pentavalent synthesis of oligonucleotides containing stereospecific alkylphosphonates and arylphosphonates |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US90777192A | 1992-06-30 | 1992-06-30 | |
US907,771 | 1992-06-30 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO1994000473A2 (en) | 1994-01-06 |
WO1994000473A3 (en) | 1994-02-17 |
Family
ID=25424615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1993/006277 WO1994000473A2 (en) | 1992-06-30 | 1993-06-30 | Pentavalent synthesis of oligonucleotides containing stereospecific alkylphosphonates and arylphosphonates |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU4661193A (en) |
WO (1) | WO1994000473A2 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0653438A2 (en) * | 1993-08-06 | 1995-05-17 | Takeda Chemical Industries, Ltd. | Oligonucleotide compounds, their production and use |
US5703223A (en) * | 1994-09-02 | 1997-12-30 | Thomas Jefferson University | Solid phase synthesis of oligonucleotides with stereospecific substituted phosphonate linkages by pentavalent grignard coupling |
US6486313B1 (en) | 1999-02-18 | 2002-11-26 | Isis Pharmaceuticals, Inc. | Oligonucleotides having alkylphosphonate linkages and methods for their preparation |
US7256179B2 (en) | 2001-05-16 | 2007-08-14 | Migenix, Inc. | Nucleic acid-based compounds and methods of use thereof |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA1337639C (en) * | 1989-08-01 | 1995-11-28 | Joseph Eugene Celebuski | Dna probe assay using neutrally charged probe strands |
US5212295A (en) * | 1990-01-11 | 1993-05-18 | Isis Pharmaceuticals | Monomers for preparation of oligonucleotides having chiral phosphorus linkages |
US5512668A (en) * | 1991-03-06 | 1996-04-30 | Polish Academy Of Sciences | Solid phase oligonucleotide synthesis using phospholane intermediates |
-
1993
- 1993-06-30 WO PCT/US1993/006277 patent/WO1994000473A2/en active Application Filing
- 1993-06-30 AU AU46611/93A patent/AU4661193A/en not_active Abandoned
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0653438A2 (en) * | 1993-08-06 | 1995-05-17 | Takeda Chemical Industries, Ltd. | Oligonucleotide compounds, their production and use |
EP0653438A3 (en) * | 1993-08-06 | 1995-10-18 | Takeda Chemical Industries Ltd | Oligonucleotide compounds, their production and use. |
US5703223A (en) * | 1994-09-02 | 1997-12-30 | Thomas Jefferson University | Solid phase synthesis of oligonucleotides with stereospecific substituted phosphonate linkages by pentavalent grignard coupling |
US6486313B1 (en) | 1999-02-18 | 2002-11-26 | Isis Pharmaceuticals, Inc. | Oligonucleotides having alkylphosphonate linkages and methods for their preparation |
US7256179B2 (en) | 2001-05-16 | 2007-08-14 | Migenix, Inc. | Nucleic acid-based compounds and methods of use thereof |
US7709449B2 (en) | 2001-05-16 | 2010-05-04 | Migenix, Inc. | Nucleic acid-based compounds and methods of use thereof |
Also Published As
Publication number | Publication date |
---|---|
WO1994000473A3 (en) | 1994-02-17 |
AU4661193A (en) | 1994-01-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A2; Designated state(s): AU CA JP KR |
AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE |
AK | Designated states | Kind code of ref document: A3; Designated state(s): AU CA JP KR |
AL | Designated countries for regional patents | Kind code of ref document: A3; Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
NENP | Non-entry into the national phase | Ref country code: CA |
122 | Ep: pct application non-entry in european phase | |