US20040014067A1 - Amplification methods and compositions - Google Patents

Amplification methods and compositions

Info

Publication number
US20040014067A1
Authority
US
United States
Prior art keywords
primer
sequence
target
assay
oligonucleotide
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/321,039
Inventor
Victor Lyamichev
Andrew Lukowiak
Nancy Jarvis
David Kurensky
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Wave Technologies Inc
Original Assignee
Third Wave Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Wave Technologies Inc
Priority to US10/321,039
Assigned to THIRD WAVE TECHNOLOGIES, INC. (assignment of assignors' interest; see document for details). Assignors: JARVIS, NANCY; KURENSKY, DAVID; LUKOWIAK, ANDREW; LYAMICHEV, VICTOR
Publication of US20040014067A1
Priority to US12/174,277 (patent US7790393B2)
Assigned to GOLDMAN SACHS CREDIT PARTNERS L.P., AS COLLATERAL AGENT (security agreement). Assignors: THIRD WAVE TECHNOLOGIES, INC.
Assigned to CYTYC CORPORATION, CYTYC PRENATAL PRODUCTS CORP., CYTYC SURGICAL PRODUCTS II LIMITED PARTNERSHIP, CYTYC SURGICAL PRODUCTS III, INC., CYTYC SURGICAL PRODUCTS LIMITED PARTNERSHIP, THIRD WAVE TECHNOLOGIES, INC., SUROS SURGICAL SYSTEMS, INC., R2 TECHNOLOGY, INC., HOLOGIC, INC., BIOLUCENT, LLC, DIRECT RADIOGRAPHY CORP. (termination of patent security agreements and release of security interests). Assignors: GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/087 Inventory or stock management, e.g. order filling, procurement or balancing against orders

Definitions

  • the present invention provides methods for developing and optimizing nucleic acid detection assays for use in basic research, clinical research, and for the development of clinical detection assays.
  • the present invention provides methods for designing oligonucleotide primers to be used in multiplex amplification reactions.
  • the present invention also provides methods to optimize multiplex amplification reactions.
  • the present invention also provides methods to perform Highly Multiplexed PCR in Combination with the INVADER Assay.
  • HGP: Human Genome Project
  • researchers collected blood (female) or sperm (male) samples from a large number of donors.
  • the human genome sequence generated by the private genomics company Celera was based on DNA samples collected from five donors who identified themselves as Hispanic, Asian, Caucasian, or African-American.
  • the small number of human samples used to generate the reference sequences does not reflect the genetic diversity among population groups and individuals. Attempts to analyze individuals based on the genome sequence information will often fail. For example, many genetic detection assays are based on the hybridization of probe oligonucleotides to a target region on genomic DNA or mRNA.
  • Probes generated based on the reference sequences will often fail (e.g., fail to hybridize properly, fail to properly characterize the sequence at a specific position of the target) because the target sequence for many individuals differs from the reference sequence. Differences may be on an individual-by-individual basis, but many follow regional population patterns (e.g., many correlate highly to race, ethnicity, geographic locale, age, environmental exposure, etc.). With the limited utility of information currently available, the art is in need of systems and methods for acquiring, analyzing, storing, and applying large volumes of genetic information with the goal of providing an array of detection assay technologies for research and clinical analysis of biological samples.
  • the present invention provides methods and routines for developing and optimizing nucleic acid detection assays for use in basic research, clinical research, and for the development of clinical detection assays.
  • the present invention provides methods comprising; a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises a forward and a reverse primer sequence for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x-1]- . . .
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide A or C
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the present invention provides methods comprising; a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises a forward and a reverse primer sequence for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x-1]- . . .
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide G or T
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • a method comprising; a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the 5′ region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the 3′ region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x-1]- .
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide A or C
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the present invention provides methods comprising a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the 5′ region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the 3′ region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x-1]- .
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide G or T
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the present invention provides methods comprising a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises a single nucleotide polymorphism, b) determining where on each of the target sequences one or more assay probes would hybridize in order to detect the single nucleotide polymorphism such that a footprint region is located on each of the target sequences, and c) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x-1]- .
  • N represents a base
  • x is at least 6
  • N[1] is nucleotide A or C
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the present invention provides methods comprising a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises a single nucleotide polymorphism, b) determining where on each of the target sequences one or more assay probes would hybridize in order to detect the single nucleotide polymorphism such that a footprint region is located on each of the target sequences, and c) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x-1]- .
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide T or G
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the primer set is configured for performing a multiplex PCR reaction that amplifies at least Y amplicons, wherein each of the amplicons is defined by the position of the forward and reverse primers.
  • the primer set is generated as digital or printed sequence information.
  • the primer set is generated as physical primer oligonucleotides.
  • N[3]-N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[3]-N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the processing comprises initially selecting N[1] for each of the forward primers as the most 3′ A or C in the 5′ region. In certain embodiments, the processing comprises initially selecting N[1] for each of the forward primers as the most 3′ G or T in the 5′ region.
  • the processing comprises initially selecting N[1] for each of the forward primers as the most 3′ A or C in the 5′ region, and wherein the processing further comprises changing the N[1] to the next most 3′ A or C in the 5′ region for the forward primer sequences that fail the requirement that each of the forward primer's N[2]-N[1]-3′ is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the processing comprises initially selecting N[1] for each of the reverse primers as the most 3′ A or C in the complement of the 3′ region. In some embodiments, the processing comprises initially selecting N[1] for each of the reverse primers as the most 3′ G or T in the complement of the 3′ region.
  • the processing comprises initially selecting N[1] for each of the reverse primers as the most 3′ A or C in the 3′ region, and wherein the processing further comprises changing the N[1] to the next most 3′ A or C in the 3′ region for the reverse primer sequences that fail the requirement that each of the reverse primer's N[2]-N[1]-3′ is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
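The 3′-end selection described in the preceding items can be sketched in Python. This is a minimal illustration under stated assumptions, not the software application of the invention: the function names (`revcomp`, `ends_clash`, `pick_forward_primer`), the fixed primer length, and the greedy one-primer-at-a-time order are all hypothetical choices made for the example. Two 3′ dinucleotides are treated as complementary when one equals the reverse complement of the other, which is how two primers would pair tail to tail.

```python
COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}

def revcomp(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return "".join(COMP[b] for b in reversed(seq))

def ends_clash(p, q):
    """True if N[2]-N[1]-3' of p is complementary to N[2]-N[1]-3' of q."""
    return p[-2:] == revcomp(q[-2:])

def pick_forward_primer(five_prime_region, accepted, allowed="AC", length=6):
    """Return a primer whose 3' base N[1] is the most 3' allowed base
    in the 5' region that passes the dinucleotide rule.

    accepted: primers already in the set (hypothetical bookkeeping).
    length:   fixed primer length used only for this sketch (x >= 6).
    Returns None if no position in the region satisfies the rules.
    """
    # Walk N[1] from the 3' end of the region toward its 5' end.
    for end in range(len(five_prime_region), length - 1, -1):
        candidate = five_prime_region[end - length:end]
        if candidate[-1] not in allowed:
            continue  # N[1] must be A or C (or G or T in the other variant)
        if ends_clash(candidate, candidate):
            continue  # 3' end pairs with itself, e.g. "TA" or "GC"
        if any(ends_clash(candidate, other) for other in accepted):
            continue  # 3' end pairs with a primer already in the set
        return candidate
    return None
```

For a made-up region `"AACCTTCATC"`, the first candidate ends in `TC`; if the set already holds a primer ending in `GA`, that candidate is rejected and N[1] moves to the next most 3′ A or C. Passing `allowed="GT"` gives the G or T variant of the same rule.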
  • the footprint region comprises a single nucleotide polymorphism.
  • the footprint comprises a mutation.
  • the footprint region for each of the target sequences comprises a portion of the target sequence that hybridizes to one or more assay probes configured to detect the single nucleotide polymorphism.
  • the footprint is this region where the probes hybridize.
  • the footprint further includes additional nucleotides on either end.
  • the processing further comprises selecting N[5]-N[4]-N[3]-N[2]-N[1]-3′ for each of the forward and reverse primers such that less than 80 percent homology with an assay component sequence is present.
  • the assay component is a FRET probe sequence.
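One way to read the 80 percent limit on the last five primer bases is sketched below. The source does not say how homology is scored, so this example assumes an ungapped, position-by-position identity against every five-base window of the assay component sequence. The function names and the component sequence used in the usage note are made up for illustration.

```python
def max_window_identity(tail, component):
    """Highest fraction of matching positions between `tail` and any
    same-length, ungapped window of `component` (both 5'->3')."""
    k = len(tail)
    best = 0.0
    for i in range(len(component) - k + 1):
        window = component[i:i + k]
        matches = sum(a == b for a, b in zip(tail, window))
        best = max(best, matches / k)
    return best

def tail_ok(primer, component, tail_len=5, limit=0.80):
    """True if N[5]..N[1] of the primer stays below the identity limit."""
    return max_window_identity(primer[-tail_len:], component) < limit
```

With five bases, "less than 80 percent" means at most three matching positions per window, so a tail that matches four of five positions (exactly 80 percent) is rejected. For the hypothetical component `"ACGTACGTTT"`, `tail_ok("GGGGGCCCCC", ...)` passes and `tail_ok("GGGGGACGTA", ...)` does not.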
  • the target sequence is about 300-500 base pairs in length, or about 200-600 base pairs in length.
  • Y is an integer between 2 and 500, or between 2 and 10,000.
  • the processing comprises selecting x for each of the forward and reverse primers such that each of the forward and reverse primers has a melting temperature with respect to the target sequence of approximately 50 degrees Celsius (e.g., 50 degrees Celsius, or at least 50 degrees Celsius and no more than 55 degrees Celsius).
  • the melting temperature of a primer is at least 50 degrees Celsius, but at least 10 degrees different than a selected detection assay's optimal reaction temperature.
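Selecting the primer length x from a melting temperature target can be sketched as follows. The source does not state how melting temperature is computed; this example substitutes the simple Wallace rule (2 degrees per A or T, 4 degrees per G or C), which is only a rough estimate for short oligonucleotides. The function names and default values are assumptions for the illustration.

```python
def wallace_tm(seq):
    """Rough Tm estimate in degrees Celsius: 2 per A/T plus 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def choose_primer(region, min_len=6, tm_target=50):
    """Grow the primer 5'-ward from a fixed 3' end until Tm >= tm_target.

    region: 5'->3' sequence whose last base is the chosen N[1].
    Returns the shortest qualifying primer, or None if none exists.
    """
    for x in range(min_len, len(region) + 1):
        primer = region[-x:]
        if wallace_tm(primer) >= tm_target:
            return primer
    return None

def compatible_with_assay(tm, assay_temp, tm_min=50, min_gap=10):
    """Tm is at least tm_min and at least min_gap degrees away from
    the detection assay's optimal reaction temperature."""
    return tm >= tm_min and abs(tm - assay_temp) >= min_gap
```

Because each added base raises the Wallace estimate by 2 or 4 degrees, the first primer that reaches 50 degrees lands at 53 degrees or less, inside the 50 to 55 degree window mentioned above. A nearest-neighbor model would give different lengths; only the search pattern is the point here.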
  • the forward and reverse primer pair optimized concentrations are determined for the primer set.
  • the processing is automated. In further embodiments, the processing is automated with a processor.
  • the present invention provides a kit comprising the primer set generated by the methods of the present invention, and at least one other component (e.g., a cleavage agent, polymerase, or INVADER oligonucleotide).
  • the present invention provides compositions comprising the primers and primer sets generated by the methods of the present invention.
  • the present invention provides methods comprising; a) providing; i) a user interface configured to receive sequence data, ii) a computer system having stored therein a multiplex PCR primer software application, and b) transmitting the sequence data from the user interface to the computer system, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and c) processing the target sequence information with the multiplex PCR primer pair software application to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide A or C
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the present invention provides methods comprising; a) providing; i) a user interface configured to receive sequence data, ii) a computer system having stored therein a multiplex PCR primer software application, and b) transmitting the sequence data from the user interface to the computer system, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and c) processing the target sequence information with the multiplex PCR primer pair software application to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide G or T
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set.
  • the present invention provides systems comprising; a) a computer system configured to receive data from a user interface, wherein the user interface is configured to receive sequence data, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, b) a multiplex PCR primer pair software application operably linked to the user interface, wherein the multiplex PCR primer software application is configured to process the target sequence information to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide A or C
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set
  • a computer system having stored therein the multiplex PCR primer pair software application, wherein the computer system comprises computer memory and a computer processor.
  • the present invention provides systems comprising; a) a computer system configured to receive data from a user interface, wherein the user interface is configured to receive sequence data, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, b) a multiplex PCR primer pair software application operably linked to the user interface, wherein the multiplex PCR primer software application is configured to process the target sequence information to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of
  • N represents a nucleotide base
  • x is at least 6
  • N[1] is nucleotide G or T
  • N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set
  • a computer system having stored therein the multiplex PCR primer pair software application, wherein the computer system comprises computer memory and a computer processor.
  • the computer system is configured to return the primer set to the user interface.
  • FIG. 1 shows one embodiment of SNP detection using the INVADER assay in biplex format.
  • FIG. 2 shows an input target sequence and the result of processing this sequence with systems and routines of the present invention.
  • FIG. 3 shows an example of a basic work flow for highly multiplexed PCR using the INVADER Medically Associated Panel.
  • FIG. 4 shows a flow chart outlining the steps that may be performed in order to generate a primer set useful in multiplex PCR.
  • FIGS. 5-9 show sequences used and data generated in connection with Example 1.
  • FIGS. 10-17 show sequences used and data generated in connection with Example 2.
  • FIG. 18 shows one protocol for Multiplex PCR optimization according to the present invention.
  • FIG. 19 shows certain criteria that can be employed in certain embodiments of the present invention in order to design multiplex primers.
  • FIG. 20 shows certain PCR primers useful for amplifying various regions of CYP2D6.
  • FIG. 21 shows certain results from Example 3.
  • FIG. 22 shows certain results from Example 4.
  • FIG. 23 shows additional results from Example 4.
  • SNP: single nucleotide polymorphism
  • SNPs can be located in a portion of a genome that does not code for a gene.
  • a “SNP” may be located in the coding region of a gene.
  • the “SNP” may alter the structure and function of the RNA or the protein with which it is associated.
  • allele refers to a variant form of a given sequence (e.g., including but not limited to, genes containing one or more SNPs).
  • a large number of genes are present in multiple allelic forms in a population.
  • a diploid organism carrying two different alleles of a gene is said to be heterozygous for that gene, whereas a homozygote carries two copies of the same allele.
  • linkage refers to the proximity of two or more markers (e.g., genes) on a chromosome.
  • allele frequency refers to the frequency of occurrence of a given allele (e.g., a sequence containing a SNP) in a given population (e.g., a specific gender, race, or ethnic group). Certain populations may contain a given allele within a higher percent of their members than other populations. For example, a particular mutation in the breast cancer gene called BRCA1 was found to be present in one percent of the general Jewish population. In comparison, the percentage of people in the general U.S. population that have any mutation in BRCA1 has been estimated to be between 0.1 and 0.6 percent. Two additional mutations, one in the BRCA1 gene and one in another breast cancer gene called BRCA2, have a greater prevalence in the Ashkenazi Jewish population, bringing the overall risk for carrying one of these three mutations to 2.3 percent.
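The percentages quoted above describe the share of people who carry a mutation, which is related to but not the same as the frequency of the allele among all chromosomes. A short sketch of both quantities for a diploid population follows. The function names and the genotype counts in the usage note are invented for illustration and are not data from the source.

```python
def allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Fraction of all allele copies that are the variant allele.

    Each diploid individual contributes two copies: heterozygotes
    carry one variant copy, variant homozygotes carry two.
    """
    total_copies = 2 * (n_hom_ref + n_het + n_hom_alt)
    return (n_het + 2 * n_hom_alt) / total_copies

def carrier_fraction(n_hom_ref, n_het, n_hom_alt):
    """Fraction of individuals carrying at least one variant copy."""
    n = n_hom_ref + n_het + n_hom_alt
    return (n_het + n_hom_alt) / n
```

For a hypothetical sample of 980 non-carriers, 20 heterozygotes, and no variant homozygotes, the carrier fraction is 20/1000 while the allele frequency is 20/2000, half as large.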
  • in silico analysis refers to analysis performed using computer processors and computer memory.
  • in silico SNP analysis refers to the analysis of SNP data using computer processors and memory.
  • genotype refers to the actual genetic make-up of an organism (e.g., in terms of the particular alleles carried at a genetic locus). Expression of the genotype gives rise to an organism's physical appearance and characteristics—the “phenotype.”
  • locus refers to the position of a gene or any other characterized sequence on a chromosome.
  • disease or “disease state” refers to a deviation from the condition regarded as normal or average for members of a species, and which is detrimental to an affected individual under conditions that are not inimical to the majority of individuals of that species (e.g., diarrhea, nausea, fever, pain, and inflammation, etc.).
  • treatment in reference to a medical course of action refers to steps or actions taken with respect to an affected individual as a consequence of a suspected, anticipated, or existing disease state, or wherein there is a risk or suspected risk of a disease state. Treatment may be provided in anticipation of or in response to a disease state or suspicion of a disease state, and may include, but is not limited to, preventative, ameliorative, palliative or curative steps.
  • therapy refers to a particular course of treatment.
  • the term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, RNA (e.g., rRNA, tRNA, etc.), or precursor.
  • the polypeptide, RNA, or precursor can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., ligand binding, signal transduction, etc.) of the full-length or fragment are retained.
  • the term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA.
  • the sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences.
  • the sequences that are located 3′ or downstream of the coding region and that are present on the mRNA are referred to as 3′ untranslated sequences.
  • gene encompasses both cDNA and genomic forms of a gene.
  • a genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.”
  • Introns are segments included when a gene is transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are generally absent in the messenger RNA (mRNA) transcript.
  • mRNA: messenger RNA
  • Variations (e.g., mutations, SNPs, insertions, deletions) in transcribed portions of genes are reflected in, and can generally be detected in, corresponding portions of the produced RNAs (e.g., hnRNAs, mRNAs, rRNAs, tRNAs).
  • amino acid sequence is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule
  • amino acid sequence and like terms, such as polypeptide or protein are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.
  • genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences that are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript).
  • the 5′ flanking region may contain regulatory sequences such as promoters and enhancers that control or influence the transcription of the gene.
  • the 3′ flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage and polyadenylation.
  • wild-type refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source.
  • a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene.
  • modified refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • nucleic acid molecule encoding refers to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. In this case, the DNA sequence thus codes for the amino acid sequence.
  • DNA and RNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage.
  • an end of an oligonucleotide or polynucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring.
  • a nucleic acid sequence even if internal to a larger oligonucleotide or polynucleotide, also may be said to have 5′ and 3′ ends.
  • an oligonucleotide having a nucleotide sequence encoding a gene and “polynucleotide having a nucleotide sequence encoding a gene,” means a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product.
  • the coding region may be present in either a cDNA, genomic DNA, or RNA form.
  • the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded.
  • Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript.
  • the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
  • the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids.
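The base-pairing rules and the distinction between partial and complete complementarity can be illustrated with a short Python sketch. The function names are hypothetical, and the score assumes two equal-length strands aligned antiparallel with no gaps, which is a simplification of real hybridization.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Sequence of the fully complementary strand, written 5'->3'."""
    return "".join(PAIR[b] for b in reversed(seq))

def complementarity(a, b):
    """Fraction of paired positions when strands a and b, both written
    5'->3', are aligned antiparallel without gaps."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length strands")
    paired = sum(PAIR[x] == y for x, y in zip(a, reversed(b)))
    return paired / len(a)
```

The strand written “3′-T-C-A-5′” in the example above reads `ACT` in the 5′ to 3′ direction, so `reverse_complement("AGT")` gives `"ACT"` and `complementarity("AGT", "ACT")` is complete. Changing one base, as in `complementarity("AGT", "ACC")`, leaves two of three positions paired, a case of partial complementarity.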
  • the term “homology” refers to a degree of complementarity. There may be partial homology or complete homology (i.e., identity).
  • a partially complementary sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid and is referred to using the functional term “substantially homologous.”
  • the term “inhibition of binding,” when used in reference to nucleic acid binding, refers to inhibition of binding caused by competition of homologous sequences for binding to a target sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency.
  • a substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction.
  • the absence of non-specific binding may be tested by the use of a second target that lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.
  • the art knows conditions that promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).
  • substantially homologous refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.
  • a gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript.
  • cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.
  • substantially homologous refers to any probe that can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.
  • hybridization is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids.
  • Tm is used in reference to the “melting temperature.”
  • the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
  • stringency is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. Those skilled in the art will recognize that “stringency” conditions may be altered by varying the parameters just described either individually or in concert. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences (e.g., hybridization under “high stringency” conditions may occur between homologs with about 85-100% identity, preferably about 70-100% identity).
  • with “medium stringency” conditions, nucleic acid base pairing will occur between nucleic acids with an intermediate frequency of complementary base sequences (e.g., hybridization under “medium stringency” conditions may occur between homologs with about 50-70% identity).
  • conditions of “weak” or “low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.
  • “High stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5×Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.
  • “Medium stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5×Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 1.0×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.
  • “Low stringency conditions” comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5×Denhardt's reagent [50×Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.
  • reference sequence is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA sequence given in a sequence listing or may comprise a complete gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length.
  • two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides.
  • sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity.
  • a “comparison window,” as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman [Smith and Waterman, Adv. Appl. Math.
  • sequence identity means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison.
  • percentage of sequence identity is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
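The calculation described in this definition can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been optimally aligned (for example by a local alignment algorithm) and are supplied as equal-length strings with `-` marking gap positions; the function name and the gap symbol are hypothetical choices, and the minimum window size of 20 positions is not enforced.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a window of comparison.

    Both inputs are assumed to be already aligned and of equal length,
    with '-' marking a gap. A position counts as matched only when the
    identical base occurs in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window is empty")
    # Number of matched positions: identical, non-gap characters.
    matched = sum(
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    # Divide by the window size and multiply by 100, as in the definition.
    return 100.0 * matched / window_size
```

For a 20-position window with a single mismatch, the function returns 95.0, which would meet the 85 percent threshold used for “substantial identity.”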
  • the term “substantial identity” denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
  • the reference sequence may be a subset of a larger sequence, for example, as a splice variant of the full-length sequences.
  • the term “substantial identity” means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least 80 percent sequence identity, preferably at least 90 percent sequence identity, more preferably at least 95 percent sequence identity or more (e.g., 99 percent sequence identity).
  • residue positions that are not identical differ by conservative amino acid substitutions.
  • Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains.
  • a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine.
  • Preferred conservative amino acids substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
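The side-chain groups and preferred substitution groups listed above can be captured as a small lookup. The sketch below uses standard one-letter amino acid codes; the function and variable names are hypothetical and chosen only for illustration.

```python
# Side-chain groups as listed in the text (one-letter amino acid codes).
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),         # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),   # Ser, Thr
    "amide-containing": set("NQ"),     # Asn, Gln
    "aromatic": set("FYW"),            # Phe, Tyr, Trp
    "basic": set("KRH"),               # Lys, Arg, His
    "sulfur-containing": set("CM"),    # Cys, Met
}

# Preferred conservative substitution groups as listed in the text.
PREFERRED_GROUPS = [set("VLI"), set("FY"), set("KR"), set("AV"), set("NQ")]


def shares_side_chain_group(res_a: str, res_b: str) -> bool:
    """True if both residues belong to one of the listed side-chain groups."""
    a, b = res_a.upper(), res_b.upper()
    return any(a in group and b in group for group in SIDE_CHAIN_GROUPS.values())


def is_preferred_conservative(res_a: str, res_b: str) -> bool:
    """True if replacing res_a with res_b falls within a preferred group."""
    a, b = res_a.upper(), res_b.upper()
    return a != b and any(a in group and b in group for group in PREFERRED_GROUPS)
```

Under these lists, lysine to arginine is a preferred conservative substitution, while lysine to histidine shares a side-chain group (basic) but is not among the preferred groups.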
  • “Amplification” is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out.
  • Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid.
  • MDV-1 RNA is the specific template for the replicase (D. L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]).
  • Other nucleic acid will not be replicated by this amplification enzyme.
  • this amplification enzyme has a stringent specificity for its own promoters (M.
  • Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).
  • amplifiable nucleic acid is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”
  • sample template refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below).
  • background template is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
  • the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer should be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • probe or “hybridization probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing, at least in part, to another oligonucleotide of interest.
  • a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular sequences.
  • probes used in the present invention will be labeled with a “reporter molecule,” so that it is detectable in any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
  • target refers to a nucleic acid sequence or structure to be detected or characterized.
  • polymerase chain reaction (“PCR”) refers to the method of K. B. Mullis (see, e.g., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference).
  • This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase.
  • the two primers are complementary to their respective strands of the double stranded target sequence.
  • the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
  • the primers are extended with a polymerase so as to form a new pair of complementary strands.
  • the steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence.
  • the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
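The statement that primer positions determine the amplified segment can be illustrated with a short sketch. It assumes exact, unique primer matches on a short template and ignores everything about reaction conditions; the function names and the toy sequences in the usage note are hypothetical.

```python
# Translation table for complementing the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplified_segment(template: str, forward_primer: str, reverse_primer: str) -> str:
    """Return the top-strand segment bounded by the two primers.

    template:       top strand of the double-stranded target, written 5'->3'.
    forward_primer: written 5'->3'; identical to a stretch of the top strand
                    (it anneals to the bottom strand).
    reverse_primer: written 5'->3'; anneals to the top strand, so its reverse
                    complement appears in the top strand.
    Exact matches are assumed (an illustrative simplification).
    """
    top = template.upper()
    start = top.find(forward_primer.upper())
    if start == -1:
        raise ValueError("forward primer site not found")
    reverse_site = reverse_complement(reverse_primer)
    site = top.find(reverse_site, start)
    if site == -1:
        raise ValueError("reverse primer site not found downstream of forward primer")
    # The segment runs from the 5' end of the forward primer site to the
    # end of the reverse primer site on the top strand.
    return top[start : site + len(reverse_site)]
```

For the toy template `GGATCCACGTACTTTTTTTTGATTCACCGG` with forward primer `ACGTAC` and reverse primer `TGAATC`, the segment is 20 nucleotides long; choosing primers that anneal farther apart or closer together changes that length accordingly.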
  • With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment).
  • any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules.
  • the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
  • PCR product refers to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
  • amplification reagents refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.) needed for amplification except for primers, nucleic acid template, and the amplification enzyme.
  • amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
  • the term “recombinant DNA molecule” as used herein refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.
  • antisense is used in reference to RNA sequences that are complementary to a specific RNA sequence (e.g., mRNA).
  • antisense strand is used in reference to a nucleic acid strand that is complementary to the “sense” strand.
  • the designation (−) (i.e., “negative”) is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., “positive”) strand.
  • isolated when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature.
  • a given DNA sequence e.g., a gene
  • RNA sequences such as a specific mRNA sequence encoding a specific protein
  • isolated nucleic acids encoding a polypeptide include, by way of example, such nucleic acid in cells ordinarily expressing the polypeptide where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature.
  • the isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form.
  • the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).
  • portion when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence.
  • the fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (e.g., 10 nucleotides, 11, . . . , 20, . . . ).
  • the term “purified” or “to purify” refers to the removal of contaminants from a sample.
  • the term “purified” refers to molecules (e.g., nucleic or amino acid sequences) that are removed from their natural environment, isolated or separated.
  • An “isolated nucleic acid sequence” is therefore a purified nucleic acid sequence.
  • “Substantially purified” molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free from other components with which they are naturally associated.
  • recombinant protein or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule.
  • native protein is used herein to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is, the native protein contains only those amino acids found in the protein as it occurs in nature.
  • a native protein may be produced by recombinant means or may be isolated from a naturally occurring source.
  • portion when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein.
  • the fragments may range in size from four consecutive amino acid residues to the entire amino acid sequence minus one amino acid.
  • Southern blot refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane.
  • the immobilized DNA is then probed with a labeled probe to detect DNA species complementary to the probe used.
  • the DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support.
  • Southern blots are a standard tool of molecular biologists (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58 [1989]).
  • the term “Western blot” refers to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane.
  • the proteins are run on acrylamide gels to separate the proteins, followed by transfer of the protein from the gel to a solid support, such as nitrocellulose or a nylon membrane.
  • the immobilized proteins are then exposed to antibodies with reactivity against an antigen of interest.
  • the binding of the antibodies may be detected by various methods, including the use of labeled antibodies.
  • test compound refers to any chemical entity, pharmaceutical, drug, and the like that are tested in an assay (e.g., a drug screening assay) for any desired activity (e.g., including but not limited to, the ability to treat or prevent a disease, illness, sickness, or disorder of bodily function, or otherwise alter the physiological or cellular status of a sample).
  • Test compounds comprise both known and potential therapeutic compounds.
  • a test compound can be determined to be therapeutic by screening using the screening methods of the present invention.
  • a “known therapeutic compound” refers to a therapeutic compound that has been shown (e.g., through animal trials or prior experience with administration to humans) to be effective in such treatment or prevention.
  • sample as used herein is used in its broadest sense.
  • a sample suspected of containing a human chromosome or sequences associated with a human chromosome may comprise a cell, chromosomes isolated from a cell (e.g., a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern blot analysis), RNA (in solution or bound to a solid support such as for Northern blot analysis), cDNA (in solution or bound to a solid support) and the like.
  • a sample suspected of containing a protein may comprise a cell, a portion of a tissue, an extract containing one or more proteins and the like.
  • label refers to any atom or molecule that can be used to provide a detectable (preferably quantifiable) effect, and that can be attached to a nucleic acid or protein. Labels include but are not limited to dyes; radiolabels such as 32P; binding moieties such as biotin; haptens such as digoxigenin; luminogenic, phosphorescent or fluorogenic moieties; and fluorescent dyes alone or in combination with moieties that can suppress or shift emission spectra by fluorescence resonance energy transfer (FRET).
  • Labels may provide signals detectable by fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like.
  • a label may be a charged moiety (positive or negative charge) or alternatively, may be charge neutral.
  • Labels can include or consist of nucleic acid or protein sequence, so long as the sequence comprising the label is detectable.
  • signal refers to any detectable effect, such as would be caused or provided by a label or an assay reaction.
  • the term “detector” refers to a system or component of a system, e.g., an instrument (e.g., a camera, fluorimeter, charge-coupled device, scintillation counter, etc.) or a reactive medium (X-ray or camera film, pH indicator, etc.), that can convey to a user or to another component of a system (e.g., a computer or controller) the presence of a signal or effect.
  • a detector can be a photometric or spectrophotometric system, which can detect ultraviolet, visible or infrared light, including fluorescence or chemiluminescence; a radiation detection system; a spectroscopic system such as nuclear magnetic resonance spectroscopy, mass spectrometry or surface enhanced Raman spectrometry; a system such as gel or capillary electrophoresis or gel exclusion chromatography; or other detection system known in the art, or combinations thereof.
  • the term “distribution system” refers to systems capable of transferring and/or delivering materials from one entity to another or one location to another.
  • a distribution system for transferring detection panels from a manufacturer or distributor to a user may comprise, but is not limited to, a packaging department, a mail room, and a mail delivery system.
  • the distribution system may comprise, but is not limited to, one or more delivery vehicles and associated delivery personnel, a display stand, and a distribution center.
  • interested parties e.g., detection panel manufactures
  • the term “at a reduced cost” refers to the transfer of goods or services at a reduced direct cost to the recipient (e.g. user). In some embodiments, “at a reduced cost” refers to transfer of goods or services at no cost to the recipient.
  • the term “at a subsidized cost” refers to the transfer of goods or services, wherein at least a portion of the recipient's cost is deferred or paid by another party. In some embodiments, “at a subsidized cost” refers to transfer of goods or services at no cost to the recipient.
  • the term “at no cost” refers to the transfer of goods or services with no direct financial expense to the recipient. For example, when detection panels are provided by a manufacturer or distributor to a user (e.g. research scientist) at no cost, the user does not directly pay for the tests.
  • detection refers to quantitatively or qualitatively identifying an analyte (e.g., DNA, RNA or a protein) within a sample.
  • detection assay refers to a kit, test, or procedure performed for the purpose of detecting an analyte nucleic acid within a sample.
  • Detection assays produce a detectable signal or effect when performed in the presence of the target analyte, and include but are not limited to assays incorporating the processes of hybridization, nucleic acid cleavage (e.g., exo- or endonuclease), nucleic acid amplification, nucleotide sequencing, primer extension, or nucleic acid ligation.
  • the term “functional detection oligonucleotide” refers to an oligonucleotide that is used as a component of a detection assay, wherein the detection assay is capable of successfully detecting (i.e., producing a detectable signal) an intended target nucleic acid when the functional detection oligonucleotide provides the oligonucleotide component of the detection assay. This is in contrast to non-functional detection oligonucleotides, which fail to produce a detectable signal in a detection assay for the particular target nucleic acid when the non-functional detection oligonucleotide is provided as the oligonucleotide component of the detection assay. Determining if an oligonucleotide is a functional oligonucleotide can be carried out experimentally by testing the oligonucleotide in the presence of the particular target nucleic acid using the detection assay.
  • the term “derived from a different subject,” such as samples or nucleic acids derived from different subjects, refers to samples derived from multiple different individuals.
  • a blood sample comprising genomic DNA from a first person and a blood sample comprising genomic DNA from a second person are considered blood samples and genomic DNA samples that are derived from different subjects.
  • a sample comprising five target nucleic acids derived from different subjects is a sample that includes at least five samples from five different individuals. However, the sample may further contain multiple samples from a given individual.
  • the term “treating together”, when used in reference to experiments or assays, refers to conducting experiments concurrently or sequentially, wherein the results of the experiments are produced, collected, or analyzed together (i.e., during the same time period). For example, a plurality of different target sequences located in separate wells of a multiwell plate or in different portions of a microarray are treated together in a detection assay where detection reactions are carried out on the samples simultaneously or sequentially and where the data collected from the assays is analyzed together.
  • test result data refers to data collected from performance of an assay (e.g., to detect or quantitate a gene, SNP or an RNA).
  • Test result data may be in any form, i.e., it may be raw assay data or analyzed assay data (e.g., previously analyzed by a different process).
  • raw assay data e.g., a number corresponding to a measurement of signal, such as a fluorescence signal from a spot on a chip or a reaction vessel, or a number corresponding to measurement of a peak, such as peak height or area, as from, for example, a mass spectrometer, HPLC or capillary separation device
  • analyzed assay data or “output assay data”.
  • database refers to a collection of information (e.g., data) arranged for ease of retrieval, for example, stored in a computer memory.
  • genomic information database is a database comprising genomic information, including, but not limited to, polymorphism information (i.e., information pertaining to genetic polymorphisms), genome information (i.e., genomic information), linkage information (i.e., information pertaining to the physical location of a nucleic acid sequence with respect to another nucleic acid sequence, e.g., in a chromosome), and disease association information (i.e., information correlating the presence of or susceptibility to a disease to a physical trait of a subject, e.g., an allele of a subject).
  • Database information refers to information to be sent to a database, stored in a database, processed in a database, or retrieved from a database.
  • Sequence database information refers to database information pertaining to nucleic acid sequences.
  • distinct sequence databases refers to two or more databases that contain different information than one another. For example, the dbSNP and GenBank databases are distinct sequence databases because each contains information not found in the other.
  • processor and “central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program.
  • computer memory and “computer memory device” refer to any storage media readable by a computer processor. Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVDs), compact discs (CDs), hard disk drives (HDD), and magnetic tape.
  • computer readable medium refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor.
  • Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks.
  • hyperlink refers to a navigational link from one document to another, or from one portion (or component) of a document to another.
  • a hyperlink is displayed as a highlighted word or phrase that can be selected by clicking on it using a mouse to jump to the associated document or document portion.
  • hypertext system refers to a computer-based informational system in which documents (and possibly other types of data entities) are linked together via hyperlinks to form a user-navigable “web.”
  • the term “Internet” refers to any collection of networks using standard protocols.
  • the term includes a collection of interconnected (public and/or private) networks that are linked together by a set of standard protocols (such as TCP/IP, HTTP, and FTP) to form a global, distributed network. While this term is intended to refer to what is now commonly known as the Internet, it is also intended to encompass variations that may be made in the future, including changes and additions to existing standard protocols or integration with other media (e.g., television, radio, etc).
  • the term is also intended to encompass non-public networks such as private (e.g., corporate) Intranets.
  • World Wide Web or “web” refer generally to both (i) a distributed collection of interlinked, user-viewable hypertext documents (commonly referred to as Web documents or Web pages) that are accessible via the Internet, and (ii) the client and server software components which provide user access to such documents using standardized Internet protocols.
  • HTTP HyperText Transfer Protocol
  • Web pages are encoded using HTML.
  • Web and “World Wide Web” are intended to encompass future markup languages and transport protocols that may be used in place of (or in addition to) HTML and HTTP.
  • the term “web site” refers to a computer system that serves informational content over a network using the standard protocols of the World Wide Web. Typically, a Web site corresponds to a particular Internet domain name and includes the content associated with a particular organization. As used herein, the term is generally intended to encompass both (i) the hardware/software server components that serve the informational content over the network, and (ii) the “back end” hardware/software components, including any non-standard or specialized components, that interact with the server components to perform services for Web site users.
  • HTML HyperText Markup Language
  • HTML is a standard coding convention and set of codes for attaching presentation and linking attributes to informational content within documents.
  • HTML is based on SGML, the Standard Generalized Markup Language.
  • HTML codes (referred to as “tags”) are embedded within the informational content of the document.
  • the Web document or HTML document
  • HTML tags can be used to create links to other Web documents (commonly referred to as “hyperlinks”).
  • XML refers to Extensible Markup Language, an application profile that, like HTML, is based on SGML.
  • XML differs from HTML in that: information providers can define new tag and attribute names at will; document structures can be nested to any level of complexity; any XML document can contain an optional description of its grammar for use by applications that need to perform structural validation.
  • XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data, and some of which form markup. Markup encodes a description of the document's storage layout and logical structure.
  • XML provides a mechanism to impose constraints on the storage layout and logical structure, to define constraints on the logical structure and to support the use of predefined storage units.
  • a software module called an XML processor is used to read XML documents and provide access to their content and structure.
  • HTTP refers to HyperText Transfer Protocol, the standard World Wide Web client-server protocol used for the exchange of information (such as HTML documents, and client requests for such documents) between a browser and a Web server.
  • HTTP includes a number of different types of messages that can be sent from the client to the server to request different types of server actions. For example, a “GET” message, which has the format GET <URL>, causes the server to return the document or file located at the specified URL.
  • URL refers to Uniform Resource Locator, a unique address that fully specifies the location of a file or other resource on the Internet.
  • the general format of a URL is protocol://machine address:port/path/filename.
  • the port specification is optional, and if none is entered by the user, the browser defaults to the standard port for whatever service is specified as the protocol. For example, if HTTP is specified as the protocol, the browser will use the HTTP default port of 80.
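  • As an illustration of the URL format and default-port rule described above, the following is a minimal Python sketch using the standard library function `urllib.parse.urlsplit`. The `DEFAULT_PORTS` table and the `effective_port` helper are hypothetical names introduced here for illustration; only the HTTP default of 80 is taken from the text above.

```python
from urllib.parse import urlsplit

# Illustrative table of well-known default ports (an assumption of this sketch;
# the text above only states the HTTP default of 80).
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def effective_port(url):
    """Return the port a client would connect to for the given URL.

    If the URL carries an explicit port, that port is used; otherwise the
    standard port for the protocol (scheme) is returned, or None if unknown.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://www.example.com/path/filename.html"))       # no port given
print(effective_port("http://www.example.com:8080/path/filename.html"))  # explicit port
```

The first call falls back to the HTTP default, while the second uses the port written in the URL.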
  • PUSH technology refers to an information dissemination technology used to send data to users over a network.
  • World Wide Web a “pull” technology
  • PUSH protocols send the informational content to the user computer automatically, typically based on information pre-specified by the user.
  • a communication network refers to any network that allows information to be transmitted from one location to another.
  • a communication network for the transfer of information from one computer to another includes any public or private network that transfers information using electrical, optical, satellite transmission, and the like.
  • Two or more devices that are part of a communication network such that they can directly or indirectly transmit information from one to the other are considered to be “in electronic communication” with one another.
  • a computer network containing multiple computers may have a central computer (“central node”) that processes information to one or more sub-computers that carry out specific tasks (“sub-nodes”).
  • Some networks comprise computers that are in “different geographic locations” from one another, meaning that the computers are located in different physical locations (i.e., aren't physically the same computer, e.g., are located in different countries, states, cities, rooms, etc.).
  • detection assay component refers to a component of a system capable of performing a detection assay.
  • Detection assay components include, but are not limited to, hybridization probes, buffers, and the like.
  • a detection assay configured for target detection refers to a collection of assay components that are capable of producing a detectable signal when carried out using the target nucleic acid.
  • a detection assay that has empirically been demonstrated to detect a particular single nucleotide polymorphism is considered a detection assay configured for target detection.
  • the phrase “unique detection assay” refers to a detection assay that has a different collection of detection assay components in relation to other detection assays located on the same detection panel.
  • a unique assay doesn't necessarily detect a different target (e.g. SNP) than other assays on the same detection panel, but it does have at least one difference in the collection of components used to detect a given target (e.g. a unique detection assay may employ a probe sequence that is shorter or longer in length than other assays on the same detection panel).
  • the term “candidate” refers to an assay or analyte, e.g., a nucleic acid, suspected of having a particular feature or property.
  • a “candidate sequence” refers to a nucleic acid suspected of comprising a particular sequence
  • a “candidate oligonucleotide” refers to an oligonucleotide suspected of having a property such as comprising a particular sequence, or having the capability to hybridize to a target nucleic acid or to perform in a detection assay.
  • a “candidate detection assay” refers to a detection assay that is suspected of being a valid detection assay.
  • detection panel refers to a substrate or device containing at least two unique candidate detection assays configured for target detection.
  • valid detection assay refers to a detection assay that has been shown to accurately predict an association between the detection of a target and a phenotype (e.g. medical condition).
  • valid detection assays include, but are not limited to, detection assays that, when a target is detected, accurately predict the phenotype 95%, 96%, 97%, 98%, 99%, 99.5%, 99.8%, or 99.9% of the time.
  • Other examples of valid detection assays include, but are not limited to, detection assays that qualify as and/or are marketed as Analyte-Specific Reagents (i.e. as defined by FDA regulations) or In-Vitro Diagnostics (i.e. approved by the FDA).
  • kits refers to any delivery system for delivering materials.
  • delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., oligonucleotides, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay etc.) from one location to another.
  • kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials.
  • fragmented kit refers to a delivery system comprising two or more separate containers that each contain a subportion of the total kit components.
  • the containers may be delivered to the intended recipient together or separately.
  • a first container may contain an enzyme for use in an assay, while a second container contains oligonucleotides.
  • fragmented kit is intended to encompass kits containing Analyte Specific Reagents (ASRs) regulated under section 520(e) of the Federal Food, Drug, and Cosmetic Act, but is not limited thereto. Indeed, any delivery system comprising two or more separate containers that each contain a subportion of the total kit components is included in the term “fragmented kit.”
  • a “combined kit” refers to a delivery system containing all of the components of a reaction assay in a single container (e.g., in a single box housing each of the desired components).
  • kit includes both fragmented and combined kits.
  • the term “information” refers to any collection of facts or data. In reference to information stored or processed using a computer system(s), including but not limited to internets, the term refers to any data stored in any format (e.g., analog, digital, optical, etc.).
  • the term “information related to a subject” refers to facts or data pertaining to a subject (e.g., a human, plant, or animal).
  • the term “genomic information” refers to information pertaining to a genome including, but not limited to, nucleic acid sequences, genes, allele frequencies, RNA expression levels, protein expression, phenotypes correlating to genotypes, etc.
  • Allele frequency information refers to facts or data pertaining to allele frequencies, including, but not limited to, allele identities, statistical correlations between the presence of an allele and a characteristic of a subject (e.g., a human subject), the presence or absence of an allele in an individual or population, the percentage likelihood of an allele being present in an individual having one or more particular characteristics, etc.
  • assay validation information refers to genomic information and/or allele frequency information resulting from processing of test result data (e.g. processing with the aid of a computer). Assay validation information may be used, for example, to identify a particular candidate detection assay as a valid detection assay.
  • PCR drift is ascribed to stochastic variation in such steps as primer annealing during the early stages of the reaction (Polz and Cavanaugh, Applied and Environmental Microbiology, 64: 3724 (1998)), is not reproducible, and may be more prevalent when very small amounts of target molecules are being amplified (Walsh et al., PCR Methods and Applications, 1: 241 (1992)).
  • PCR selection pertains to the preferential amplification of some loci based on primer characteristics, amplicon length, G-C content, and other properties of the genome (Polz, supra).
  • Another factor affecting the extent to which PCR reactions can be multiplexed is the inherent tendency of PCR reactions to reach a plateau phase.
  • the plateau phase is seen in later PCR cycles and reflects the observation that amplicon generation moves from exponential to pseudo-linear accumulation and then eventually stops increasing. This effect appears to be due to non-specific interactions between the DNA polymerase and the double stranded products themselves.
  • the molar ratio of product to enzyme in the plateau phase is typically consistent for several DNA polymerases, even when different amounts of enzyme are included in the reaction, and is approximately 30:1 product:enzyme.
  • This effect thus limits the total amount of double-stranded product that can be generated in a PCR reaction such that the number of different loci amplified must be balanced against the total amount of each amplicon desired for subsequent analysis, e.g. by gel electrophoresis, primer extension, etc.
  • the present invention provides methods for substantial multiplexing of PCR reactions by, for example, combining the INVADER assay with multiplex PCR amplification.
  • the INVADER assay provides a detection step and signal amplification that allows very large numbers of targets to be detected in a multiplex reaction. As desired, hundreds to thousands to hundreds of thousands of targets may be detected in a multiplex reaction.
  • Direct genotyping by the INVADER assay typically uses from 5 to 100 ng of human genomic DNA per SNP, depending on detection platform. For a small number of assays, the reactions can be performed directly with genomic DNA without target pre-amplification. However, with more than 100,000 INVADER assays being developed and an even larger number expected for genome-wide association studies, the amount of sample DNA may become a limiting factor.
  • the INVADER assay provides from 10^6 to 10^7-fold amplification of signal
  • multiplexed PCR in combination with the INVADER assay would use only limited target amplification as compared to a typical PCR. Consequently, the low target amplification level alleviates interference between individual reactions in the mixture and reduces the inhibition of PCR by the accumulation of its products, thus providing for more extensive multiplexing. Additionally, it is contemplated that low amplification levels decrease the probability of target cross-contamination and decrease the number of PCR-induced mutations.
  • the INVADER reactions can be read at different time points, e.g., in real-time, thus significantly extending the dynamic range of the detection.
  • multiplex PCR can be performed under conditions that allow different loci to reach more similar levels of amplification. For example, primer concentrations can be limited, thereby allowing each locus to reach a more uniform level of amplification. In yet other embodiments, concentrations of PCR primers can be adjusted to balance amplification factors of different loci.
  • the present invention provides for the design and characteristics of highly multiplex PCR including hundreds to thousands of products in a single reaction.
  • the target pre-amplification provided by hundred-plex PCR reduces the amount of human genomic DNA required for INVADER-based SNP genotyping to less than 0.1 ng per assay.
  • the specifics of highly multiplex PCR optimization and a computer program for the primer design are described below.
  • the INVADER assay can be used for the detection of single nucleotide polymorphisms (SNPs) with as little as 10-100 ng of genomic DNA without the need for target pre-amplification.
  • the amount of sample DNA becomes a limiting factor for large scale analysis.
  • multiplex PCR coupled with the INVADER assay requires only limited target amplification (10^3 to 10^4-fold) as compared to typical multiplex PCR reactions, which require extensive amplification (10^9 to 10^12-fold) for conventional gel detection methods.
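  • To put the two amplification ranges above in perspective, the following minimal Python sketch converts a fold amplification into a number of PCR cycles. It assumes perfect doubling of product in every cycle, which is an idealization not stated in the text (real reactions are less efficient and eventually plateau), and the helper name `ideal_cycles` is hypothetical.

```python
import math


def ideal_cycles(fold_amplification):
    """Smallest whole number of cycles that yields at least the requested
    fold amplification, assuming the product exactly doubles each cycle
    (an idealized assumption of this sketch)."""
    return math.ceil(math.log2(fold_amplification))


for fold in (1e3, 1e4, 1e9, 1e12):
    print(f"{fold:.0e}-fold: about {ideal_cycles(fold)} ideal cycles")
```

Under this idealization, the limited amplification range corresponds to roughly 10 to 14 doublings, while the range needed for gel detection corresponds to roughly 30 to 40.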
  • the low level of target amplification used for INVADER detection provides for more extensive multiplexing by avoiding amplification inhibition commonly resulting from target accumulation.
  • the present invention provides methods and selection criteria that allow primer sets for multiplex PCR to be generated (e.g. that can be coupled with a detection assay, such as the INVADER assay).
  • software applications of the present invention automate multiplex PCR primer selection, thus allowing highly multiplexed PCR with the primers designed thereby.
  • MAP INVADER Medically Associated Panel
  • the methods, software, and selection criteria of the present invention allowed accurate genotyping of 94 of the 101 possible amplicons (~93%) from a single PCR reaction.
  • the original PCR reaction used only 10 ng of hgDNA as template, corresponding to less than 150 pg hgDNA per INVADER assay.
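  • The figures quoted above can be checked with simple arithmetic. The sketch below assumes, for illustration only, that the 10 ng of template is shared evenly across all 101 assays and ignores dilution and pipetting losses; the variable names are hypothetical.

```python
template_pg = 10 * 1000   # 10 ng of hgDNA expressed in picograms
amplicons = 101           # possible amplicons in the single PCR reaction
genotyped = 94            # amplicons that were accurately genotyped

# Template per assay under the even-split assumption of this sketch.
per_assay_pg = template_pg / amplicons
# Fraction of amplicons accurately genotyped, as a percentage.
success_percent = 100 * genotyped / amplicons

print(f"{per_assay_pg:.0f} pg of hgDNA per assay")
print(f"{success_percent:.1f}% of amplicons genotyped")
```

Both results are consistent with the stated bound of less than 150 pg per assay and with the quoted success rate of about 93%.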
  • FIG. 1 describes the general principles of the INVADER assay.
  • the INVADER assay allows for the simultaneous detection of two distinct alleles in the same reaction using an isothermal, single addition format.
  • Allele discrimination takes place by “structure specific” cleavage of the Probe, releasing a 5′ flap which corresponds to a given polymorphism.
  • (B) In the second reaction, the released 5′ flap mediates signal generation by cleavage of the appropriate FRET cassette.
  • FIG. 2 illustrates creation of one of the primer pairs (both a forward and reverse primer) for 101 primer sets from sequences available for analysis on the INVADER Medically Associated Panel using one embodiment of the software application of the present invention.
  • FIG. 2A shows a sample input file of a single entry (e.g. shows target sequence information for a single target sequence containing a SNP that is processed by the method and software of the present invention).
  • the target sequence information in FIG. 2 includes Third Wave Technologies' SNP#, short name identifier, and sequence with the SNP location indicated in brackets.
  • FIG. 2B shows the sample output file of the same entry (e.g. shows the target sequence after being processed by the systems, methods, and software of the present invention).
  • the output information includes the sequence of the footprint region (capital letters flanking SNP site, showing region where INVADER assay probes hybridize to this target sequence in order to detect the SNP in the target sequence), forward and reverse primer sequences (bold), and their corresponding Tm's.
  • the selection of primers to make a primer set capable of multiplex PCR is performed in automated fashion (e.g. by a software application). Automated primer selection for multiplex PCR may be accomplished employing a software program designed as shown by the flow chart in FIG. 4A.
  • the present invention provides methods and software applications that provide selection criteria to generate a primer set configured for multiplex PCR, and subsequent use in a detection assay (e.g. INVADER detection assays).
  • the methods and software applications of the present invention start with user defined sequences and corresponding SNP locations.
  • the methods and/or software application determines a footprint region within the target sequence (the minimal amplicon required for INVADER detection) for each sequence (shown in capital letters in FIG. 2B).
  • the footprint region includes the region where assay probes hybridize, as well as any user defined additional bases extending outward therefrom (e.g. 5 additional bases included on each side of where the assay probes hybridize).
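  • The footprint determination described above can be sketched as follows. This is a minimal illustration, not the software of the invention: the function name `footprint_bounds`, the 0-based half-open coordinates, and the representation of each probe as a `(start, end)` span are all assumptions made for the example.

```python
def footprint_bounds(target_length, probe_spans, extra_bases=5):
    """Return (start, end) of the footprint region on the target.

    probe_spans is a list of (start, end) positions where the assay probes
    hybridize, using 0-based, end-exclusive coordinates (an assumption of
    this sketch). The footprint covers all probes plus `extra_bases` on
    each side, clamped to the ends of the target sequence.
    """
    start = min(span_start for span_start, _ in probe_spans)
    end = max(span_end for _, span_end in probe_spans)
    return max(0, start - extra_bases), min(target_length, end + extra_bases)


# Two overlapping probes on a 100-base target, with 5 extra bases per side.
print(footprint_bounds(100, [(40, 55), (54, 70)]))
```

Clamping to the target ends keeps the footprint valid when the probes sit close to either end of the supplied sequence.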
  • primers are designed outward from the footprint region and evaluated against several criteria, including the potential for primer-dimer formation with previously designed primers in the current multiplexing set (See, primers in bold in FIG. 2A, and selection steps in FIG. 4A). This process may be continued, as shown in FIG. 4A, through multiple iterations of the same set of sequences until primers against all sequences in the current multiplexing set can be designed.
  • Once a primer set is designed for multiplex PCR, this set may be employed as shown in the basic workflow scheme shown in FIG. 3. Multiplex PCR may be carried out, for example, under standard conditions using only 10 ng of hgDNA as template. After 10 min at 95° C., Taq (2.5 units) may be added to a 50 ul reaction and PCR carried out for 50 cycles. The PCR reaction may be diluted and loaded directly onto an INVADER MAP plate (3 ul/well) (See FIG. 3). An additional 3 ul of 15 mM MgCl2 may be added to each reaction on the INVADER MAP plate and covered with 6 ul of mineral oil. The entire plate may then be heated to 95° C. for 5 min.
  • results from each SNP may be color coded in a table as “pass” (green), “mis-call” (pink), or “no-call” (white) (See, Example 2 below).
  • the number of PCR reactions is from about 1 to about 10 reactions. In some embodiments, the number of PCR reactions is from about 10 to about 50 reactions. In further embodiments, the number of PCR reactions is from about 50 to about 100. In additional embodiments, the number of PCR reactions is from about 100 to about 1,000. In still other embodiments, the number of PCR reactions is greater than 1,000.
  • the present invention also provides methods to optimize multiplex PCR reactions (e.g. once a primer set is generated, the concentration of each primer or primer pair may be optimized). For example, once a primer set has been generated and used in a multiplex PCR at equal molar concentrations, the primers may be evaluated separately to determine the optimum concentration of each primer, so that the multiplex primer set performs better.
  • the cost per target is theoretically lowered by eliminating technician time in assay set-up and data analysis, and by the substantial reagent savings (especially enzyme cost).
  • Another benefit of the multiplex approach is that far less target sample is required.
  • the concept of performing a single reaction, using one sample aliquot to obtain, for example, 100 results, versus using 100 sample aliquots to obtain the same data set is an attractive option.
  • primer dimers, even if only a few bases in length, may inhibit both primers from correctly hybridizing to the target sequence. Further, if the dimers form at or near the 3′ ends of the primers, no amplification or very low levels of amplification will occur, since the 3′ end is required for the priming event. Clearly, the more primers utilized per multiplex reaction, the more aberrant primer interactions are possible. The methods, systems and applications of the present invention help prevent primer dimers in large sets of primers, making the set suitable for highly multiplexed PCR.
  • primer pairs for numerous sites (for example 100 sites in a multiplex PCR reaction)
  • the order in which primer pairs are designed can influence the total number of compatible primer pairs for a reaction. For example, if a first set of primers is designed for a first target region that happens to be an A/T rich target region, these primers will be A/T rich. If the second target region chosen also happens to be an A/T rich target region, it is far more likely that the primers designed for these two sets will be incompatible due to aberrant interactions, such as primer dimers. If, however, the second target region chosen is not A/T rich, it is much more likely that a primer set can be designed that will not interact with the first A/T rich set.
  • the present invention randomizes the order in which primer sets are designed (See, FIG. 4A). Furthermore, in some embodiments, the present invention re-orders the set of input target sequences in a plurality of different, random orders to maximize the number of compatible primer sets for any given multiplex reaction (See, FIG. 4A).
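  • The effect of re-ordering the input targets can be illustrated with the minimal Python sketch below. It is not the software of the invention: `best_of_random_orders`, `toy_design`, the `CLASHES` table, and the target names are hypothetical, and the toy design step simply rejects any target that clashes with one already accepted, standing in for the real primer compatibility criteria.

```python
import random


def best_of_random_orders(targets, design_in_order, attempts=20, seed=None):
    """Try several random orderings of the targets and keep the ordering
    that yields the largest number of successfully designed targets."""
    rng = random.Random(seed)
    best = []
    for _ in range(attempts):
        order = list(targets)
        rng.shuffle(order)
        designed = design_in_order(order)
        if len(designed) > len(best):
            best = designed
        if len(best) == len(targets):  # every target designed; stop early
            break
    return best


# Toy compatibility data (an assumption of this sketch): t1 clashes with
# both t2 and t3, while t2 and t3 are compatible with each other.
CLASHES = {frozenset(("t1", "t2")), frozenset(("t1", "t3"))}


def toy_design(order):
    """Accept targets in order, skipping any that clash with an accepted one."""
    accepted = []
    for target in order:
        if all(frozenset((target, other)) not in CLASHES for other in accepted):
            accepted.append(target)
    return accepted


print(toy_design(["t1", "t2", "t3"]))  # designing t1 first blocks the others
print(sorted(best_of_random_orders(["t1", "t2", "t3"], toy_design, attempts=200, seed=0)))
```

In the fixed order the first target blocks the other two, whereas trying many random orders finds an ordering in which two targets can be designed together.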
  • N[1] is an A or C (in alternative embodiments, N[1] is a G or T).
  • N[2]-N[1] of each of the forward and reverse primers designed should not be complementary to N[2]-N[1] of any other oligonucleotide.
  • N[3]-N[2]-N[1] should not be complementary to N[3]-N[2]-N[1] of any other oligonucleotide.
  • the next base in the 5′ direction for the forward primer or the next base in the 3′ direction for the reverse primer may be evaluated as an N[1] site. This process is repeated, in conjunction with the target randomization, until all criteria are met for all, or a large majority of, the target sequences (e.g. 95% of target sequences can have primer pairs made for the primer set that fulfill these criteria).
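  • The 3′ end criteria above can be sketched in Python as follows. This is one possible reading, offered as an assumption: two 3′ ends are treated as complementary when the reverse complement of one equals the other (antiparallel pairing). The helper names `reverse_complement` and `three_prime_ok` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_ok(candidate, pool, allowed_last="AC"):
    """Check a candidate primer (5' to 3') against the 3' end criteria.

    N[1] (the 3'-most base) must be one of `allowed_last`, and neither
    N[2]-N[1] nor N[3]-N[2]-N[1] may be complementary to the same
    positions of any oligonucleotide already in `pool`.
    """
    cand = candidate.upper()
    if cand[-1] not in allowed_last:
        return False
    for other in pool:
        other = other.upper()
        for k in (2, 3):  # compare the last two bases, then the last three
            if reverse_complement(cand[-k:]) == other[-k:]:
                return False
    return True


print(three_prime_ok("GGTTCA", ["AAGGTA"]))  # no 3' complementarity
print(three_prime_ok("GGTTCA", ["AAGGTG"]))  # ...CA-3' can pair with ...TG-3'
```

Restricting N[1] to A or C for every primer already rules out many 3′ pairings, since A pairs with T and C pairs with G; the explicit comparison also covers pool members that do not follow that rule.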
  • Another challenge to be overcome in a multiplex primer design is the balance between actual, required nucleotide sequence, sequence length, and the oligonucleotide melting temperature (Tm) constraints.
  • Because the primers in a multiplex primer set in a reaction should function under the same reaction conditions of buffer, salts and temperature, they need to have substantially similar Tm's, regardless of GC or AT richness of the region of interest.
  • the present invention allows for primer designs that meet minimum Tm and maximum Tm requirements and minimum and maximum length requirements. For example, in the formula for each primer 5′-N[x]-N[x−1]- . . .
  • x is selected such that the primer has a predetermined melting temperature (e.g. bases are included in the primer until the primer has a calculated melting temperature of about 50 degrees Celsius).
  • the products of a PCR reaction are used as the target material for another nucleic acid detection means, such as a hybridization-type detection assay or the INVADER reaction assay, for example.
  • Consideration should be given to the location of primer placement to allow for the secondary reaction to successfully occur, and again, aberrant interactions between amplification primers and secondary reaction oligonucleotides should be minimized for accurate results and data.
  • Selection criteria may be employed such that the primers designed for a multiplex primer set do not react (e.g. hybridize with, or trigger reactions) with oligonucleotide components of a detection assay.
  • N[4]-N[3]-N[2]-N[1]-3′ is selected for each primer such that it is less than 80% homologous with the FRET or INVADER oligonucleotides. In certain embodiments, N[4]-N[3]-N[2]-N[1]-3′ is selected for each primer such that it is less than 70% homologous with the FRET or INVADER oligonucleotides.
  • FIG. 4A shows a flow chart with the basic flow of certain embodiments of the methods and software application of the present invention.
  • the processes detailed in FIG. 4A are incorporated into a software application for ease of use (although, the methods may also be performed manually using, for example, FIG. 4A as a guide).
  • Target sequences and/or primer pairs are entered into the system shown in FIG. 4A.
  • the first set of boxes shows how target sequences are added to the list of sequences that have a footprint determined (See “B” in FIG. 4A), while other sequences are passed immediately into the primer set pool (e.g. PDPass, those sequences that have been previously processed and shown to work together without forming primer dimers or having reactivity to FRET sequences), as well as DimerTest entries (e.g. a pair of primers a user wants to use, but that has not been tested yet for primer dimer or FRET reactivity).
  • the initial set of boxes leading up to “end of input” sort the sequences so they can be later processed properly.
  • the primer pool is basically cleared or “emptied” to start a fresh run.
  • the target sequences are then sent to “B” to be processed, and DimerTest pairs are sent to “C” to be processed.
  • Target sequences are sent to “B”, where a user or software application determines the footprint region for the target sequence (e.g. where the assay probes will hybridize in order to detect the mutation (e.g. SNP) in the target sequence).
  • This region is generally shown in capital letters in figures, such as FIG. 2B. It is important to design this region (which the user may further expand by defining that additional bases past the hybridization region be added) such that the primers that are designed fully encompass this region.
  • the software application INVADER CREATOR is used to design the INVADER oligonucleotide and downstream probes that will hybridize with the target region (although any type of program or system could be used to create any type of probes a user was interested in designing probes for, and thus determining the footprint region for on the target sequence).
  • the core footprint region is then defined by the location of these two assay probes on the target.
  • the system starts from the 5′ edge of the footprint and travels in the 5′ direction until the first base is reached, or until the first A or C (or G or T) is reached.
  • This is set as the initial starting point for defining the sequence of the forward primer (i.e. this serves as the initial N[1] site).
  • the sequence of the primer for the forward primer is the same as those bases encountered on the target region. For example, if the default size of the primer is set as 12 bases, the system starts with the bases selected as N[1] and then adds the next 11 bases found in the target sequences. This 12-mer primer is then tested for a melting temperature (e.g.
  • The system employs the formula 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, where x is initially 12. The system then adjusts x to a higher number (i.e., a longer sequence) until the pre-set melting temperature is reached.
  • a maximum primer size is employed as a default parameter to serve as an upper limit on the length of the primers designed.
  • The maximum primer size is about 30 bases (e.g., 29 bases, 30 bases, or 31 bases).
  • The default settings (e.g., minimum and maximum primer size, and minimum and maximum Tm) can be modified using standard database manipulation tools.
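  • The length-extension step described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function names are hypothetical, and the Wallace rule stands in as a placeholder for whatever melting-temperature calculation the system actually uses.

```python
def wallace_tm(seq):
    """Placeholder Tm estimate in deg C (Wallace rule): 2 per A/T, 4 per G/C.

    Assumption: the real system uses its own melting-temperature method.
    """
    seq = seq.upper()
    return 2.0 * (seq.count("A") + seq.count("T")) + 4.0 * (seq.count("G") + seq.count("C"))


def extend_primer(target, n1_index, min_tm, min_len=12, max_len=30, tm=wallace_tm):
    """Build a forward primer 5'-N[x]...N[1]-3' whose 3' base N[1] is target[n1_index].

    Starts at min_len bases and lengthens toward the 5' end one base at a time.
    Returns the shortest primer whose estimated Tm reaches min_tm, or None if
    max_len or the 5' end of the target is reached first.
    """
    for x in range(min_len, max_len + 1):
        start = n1_index - x + 1
        if start < 0:
            return None  # not enough target sequence upstream of N[1]
        primer = target[start:n1_index + 1]
        if tm(primer) >= min_tm:
            return primer
    return None  # upper length limit reached without meeting the Tm
```

  • Passing the Tm function as a parameter keeps the length loop independent of the thermodynamic model, so a nearest-neighbor calculation could be substituted without changing the loop.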
  • The next box in FIG. 4A is used to determine whether the primer that has been designed so far will cause primer-dimer and/or FRET reactivity (e.g., with the other sequences already in the pool). The criteria used for this determination are explained above. If the primer passes this step, the forward primer is added to the primer pool. However, if the forward primer fails these criteria, as shown in FIG. 4A, the starting point (N[1]) is moved one nucleotide in the 5′ direction (or to the next A or C, or next G or T). The system first checks that shifting over leaves enough room on the target sequence to make a primer. If so, the system loops back and checks this new primer for melting temperature. However, if no sequence can be designed, then the target sequence is flagged as an error (e.g., indicating that no forward primer can be made for this target).
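  • The shift-and-retry loop can be sketched as below. All names are hypothetical. The `three_prime_conflict` check is only a stand-in for the primer-dimer and FRET criteria, which are defined elsewhere in the description, and `twelve_mer` is a fixed-length stand-in for the Tm-driven primer builder.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_conflict(primer, pool, n=3):
    """Stand-in dimer test: do the last n bases pair with the 3' tail of a pooled primer?"""
    tail = primer[-n:]
    return any(revcomp(p[-n:]) == tail for p in pool)


def twelve_mer(target, n1):
    """Stand-in primer builder: the 12 bases ending at N[1], or None if there is no room."""
    return target[n1 - 11:n1 + 1] if n1 >= 11 else None


def pick_forward_primer(target, footprint_5p, pool, build=twelve_mer,
                        conflicts=three_prime_conflict, allowed="ACGT"):
    """Walk N[1] in the 5' direction from the footprint edge until a primer passes.

    footprint_5p is the index of the first footprint base. Only positions whose
    base is in `allowed` (e.g. "AC") are tried. On success the primer is added
    to the pool; None means the caller should flag the target as an error.
    """
    for n1 in range(footprint_5p - 1, -1, -1):
        if target[n1] not in allowed:
            continue
        primer = build(target, n1)
        if primer is None:
            continue  # no usable primer from this start point
        if not conflicts(primer, pool):
            pool.append(primer)
            return primer
    return None
```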
  • FIG. 4A shows how primer pairs that are entered as primers (DimerTest) are processed by the system. If there are no DimerTest pairs, as shown in FIG. 4A, the system goes on to “D”. However, if there are DimerTest pairs, these are tested for primer-dimer and/or FRET reactivity as described above. If a DimerTest pair fails these criteria, it is flagged as an error. If a DimerTest pair passes the criteria, it is added to the primer set pool, and then the system goes back to “C” if there are more DimerTest pairs to be evaluated, or goes on to “D” if there are no more DimerTest pairs to be evaluated.
  • the pool of primers that has been created is evaluated.
  • The first step in this section is to examine the number of errors (failures) generated by this particular randomized run of sequences. If there were no errors, this set is the best set and may be output to a user. If there are more than zero errors, the system compares this run to any previous runs to see which run resulted in the fewest errors. If the current run has fewer errors, it is designated as the current best set. At this point, the system may go back to “A” to start the run over with another randomized set of the same sequences, or the pre-set maximum number of runs (e.g. 5 runs) may have been reached on this run (e.g.
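  • The run-selection logic can be sketched as follows, assuming a hypothetical `design_run` callable that takes the targets in a given order and returns the primer pool together with the list of targets flagged as errors.

```python
import random


def best_primer_set(targets, design_run, max_runs=5, seed=None):
    """Repeat the design over shuffled target orders and keep the run with fewest errors.

    design_run(order) -> (pool, errors) is supplied by the caller (hypothetical
    interface). Stops early when a run produces no errors, otherwise after
    max_runs runs.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(max_runs):
        order = list(targets)
        rng.shuffle(order)  # a fresh randomized order for this run
        pool, errors = design_run(order)
        if best is None or len(errors) < len(best[1]):
            best = (pool, errors)  # current best set
        if not errors:
            break  # an error-free set cannot be improved on
    return best
```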
  • Another challenge to be overcome with multiplex PCR reactions is the unequal amplicon concentrations that result in a standard multiplex reaction.
  • the different loci targeted for amplification may each behave differently in the amplification reaction, yielding vastly different concentrations of each of the different amplicon products.
  • The present invention provides methods, systems, software applications, computer systems, and a computer data storage medium that may be used to adjust primer concentrations relative to a first detection assay read (e.g., an INVADER assay read) and then, with balanced primer concentrations, to achieve substantially equal concentrations of the different amplicons.
  • the concentrations for various primer pairs may be determined experimentally.
  • These detection assays can be on arrays of different sizes (e.g., 384-well plates).
  • Having optimized primer pair concentrations in a single reaction vessel allows the user to conduct amplification for a plurality or multiplicity of amplification targets in a single reaction vessel and in a single step.
  • the yield of the single step process is then used to successfully obtain test result data for, for example, several hundred assays.
  • each well on a 384 well plate can have a different detection assay thereon.
  • The single-step multiplex PCR reaction amplifies 384 different targets of genomic DNA and provides 384 test results for each plate. Where each well has a plurality of assays, even greater efficiencies can be obtained.
  • the present invention provides the use of the concentration of each primer set in highly multiplexed PCR as a parameter to achieve an unbiased amplification of each PCR product.
  • Any PCR includes primer annealing and primer extension steps.
  • A high concentration of primers, on the order of 1 µM, ensures fast primer-annealing kinetics, while the optimal time of the primer extension step depends on the size of the amplified product and can be much longer than the annealing step.
  • By reducing primer concentration, primer annealing can become the rate-limiting step, and the PCR amplification factor should then depend strongly on primer concentration, the association rate constant of the primers, and the annealing time.
  • k_a is the association rate constant of primer annealing,
  • [PT] is the concentration of target molecules associated with primer,
  • T_0 is the initial target concentration,
  • c is the initial primer concentration, and
  • t is the primer annealing time.
  • the amplification factor should strongly depend on primer concentration.
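  • The dependence on primer concentration can be illustrated numerically. The sketch below assumes pseudo-first-order annealing with primer in large excess over target, so that [PT] = T_0(1 − exp(−k_a·c·t)). It further assumes that every primed target is fully extended in each cycle. Both are modeling assumptions made for illustration, and the function names are hypothetical.

```python
import math


def fraction_annealed(k_a, c, t):
    """Fraction [PT]/T_0 of target carrying primer after annealing time t.

    Assumes pseudo-first-order kinetics (primer concentration c >> target).
    Units: k_a in 1/(M*s), c in M, t in s.
    """
    return 1.0 - math.exp(-k_a * c * t)


def amplification_factor(k_a, c, t, cycles):
    """Fold amplification after `cycles` cycles, assuming each primed target is copied once per cycle."""
    return (1.0 + fraction_annealed(k_a, c, t)) ** cycles
```

  • With these assumptions, a primer at 1 µM and k_a of 10^6 M^-1 s^-1 is essentially fully annealed within 30 s, whereas at low concentrations the annealed fraction is approximately k_a·c·t and the amplification factor becomes sensitive to c.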
  • biased loci amplification whether it is caused by individual association rate constants, primer extension steps or any other factors, can be corrected by adjusting primer concentration for each primer set in the multiplex PCR.
  • the adjusted primer concentrations can be also used to correct biased performance of INVADER assay used for analysis of PCR pre-amplified loci.
  • the present invention has demonstrated a linear relationship between amplification efficiency and primer concentration and used this equation to balance primer concentrations of different amplicons, resulting in the equal amplification of ten different amplicons in Example 1. This technique may be employed on any size set of multiplex primer pairs.
  • detection assays that may be employed with the present invention. For example, many different assays may be used to determine the footprint on the target nucleic sequence, and then used as the detection assay run on the output of the multiplex PCR (or the detection assays may be run simultaneously with the multiplex PCR reaction).
  • the present invention provides systems and methods for the design of oligonucleotides for use in detection assays.
  • the present invention provides systems and methods for the design of oligonucleotides that successfully hybridize to appropriate regions of target nucleic acids (e.g., regions of target nucleic acids that do not contain secondary structure) under the desired reaction conditions (e.g., temperature, buffer conditions, etc.) for the detection assay.
  • the systems and methods also allow for the design of multiple different oligonucleotides (e.g., oligonucleotides that hybridize to different portions of a target nucleic acid or that hybridize to two or more different target nucleic acids) that all function in the detection assay under the same or substantially the same reaction conditions.
  • These systems and methods may also be used to design control samples that work under the experimental reaction conditions.
  • the INVADER assay provides ease-of-use and sensitivity levels that, when used in conjunction with the systems and methods of the present invention, find use in detection panels, ASRs, and clinical diagnostics.
  • specific and general features of this illustrative example are generally applicable to other detection assays.
  • the INVADER assay provides means for forming a nucleic acid cleavage structure that is dependent upon the presence of a target nucleic acid and cleaving the nucleic acid cleavage structure so as to release distinctive cleavage products.
  • 5′ nuclease activity for example, is used to cleave the target-dependent cleavage structure and the resulting cleavage products are indicative of the presence of specific target nucleic acid sequences in the sample.
  • invasive cleavage can occur.
  • Through the interaction of the cleavage agent (e.g., a 5′ nuclease) and the upstream oligonucleotide, the cleavage agent can be made to cleave the downstream oligonucleotide at an internal site in such a way that a distinctive fragment is produced.
  • The INVADER assay provides detection assays in which the target nucleic acid is reused or recycled during multiple rounds of hybridization with oligonucleotide probes and cleavage of the probes without the need to use temperature cycling (i.e., for periodic denaturation of target nucleic acid strands) or nucleic acid synthesis (i.e., for the polymerization-based displacement of target or probe nucleic acid strands).
  • a cleavage reaction is run under conditions in which the probes are continuously replaced on the target strand (e.g. through probe-probe displacement or through an equilibrium between probe/target association and disassociation, or through a combination comprising these mechanisms, (Reynaldo, et al., J. Mol. Biol. 97:
  • sequences of interest are entered into the INVADERCREATOR program (Third Wave Technologies, Madison, Wis.). As described above, sequences may be input for analysis from any number of sources, either directly into the computer hosting the INVADERCREATOR program, or via a remote computer linked through a communication network (e.g., a LAN, Intranet or Internet network).
  • the program designs probes for both the sense and antisense strand. Strand selection is generally based upon the ease of synthesis, minimization of secondary structure formation, and manufacturability. In some embodiments, the user chooses the strand for sequences to be designed for.
  • the software automatically selects the strand.
  • oligonucleotide probes may be designed to operate at a pre-selected assay temperature (e.g., 63° C.). Based on these criteria, a final probe set (e.g., primary probes for 2 alleles and an INVADER oligonucleotide) is selected.
  • the INVADERCREATOR system is a web-based program with secure site access that contains a link to BLAST (available at the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health website) and that can be linked to RNAstructure (Mathews et al., RNA 5:1458 [1999]), a software program that incorporates mfold (Zuker, Science, 244:48 [1989]).
  • RNAstructure tests the proposed oligonucleotide designs generated by INVADERCREATOR for potential uni- and bimolecular complex formation.
  • INVADERCREATOR is open database connectivity (ODBC)-compliant and uses the Oracle database for export/integration.
  • the INVADERCREATOR system was configured with Oracle to work well with UNIX systems, as most genome centers are UNIX-based.
  • the INVADERCREATOR analysis is provided on a separate server (e.g., a Sun server) so it can handle analysis of large batch jobs. For example, a customer can submit up to 2,000 SNP sequences in one email.
  • the server passes the batch of sequences on to the INVADERCREATOR software, and, when initiated, the program designs detection assay oligonucleotide sets.
  • probe set designs are returned to the user within 24 hours of receipt of the sequences.
  • Each INVADER reaction includes at least two target sequence-specific, unlabeled oligonucleotides for the primary reaction: an upstream INVADER oligonucleotide and a downstream Probe oligonucleotide.
  • the INVADER oligonucleotide is generally designed to bind stably at the reaction temperature, while the probe is designed to freely associate and disassociate with the target strand, with cleavage occurring only when an uncut probe hybridizes adjacent to an overlapping INVADER oligonucleotide.
  • the probe includes a 5′ flap or “arm” that is not complementary to the target, and this flap is released from the probe when cleavage occurs.
  • the released flap participates as an INVADER oligonucleotide in a secondary reaction.
  • the user opens a work screen (FIG. 8), e.g., by clicking on an icon on a desktop display of a computer (e.g., a Windows desktop).
  • the user enters information related to the target sequence for which an assay is to be designed.
  • the user enters a target sequence.
  • the user enters a code or number that causes retrieval of a sequence from a database.
  • additional information may be provided, such as the user's name, an identifying number associated with a target sequence, and/or an order number.
  • the user indicates (e.g. via a check box or drop down menu) that the target nucleic acid is DNA or RNA.
  • the user indicates the species from which the nucleic acid is derived. In particularly preferred embodiments, the user indicates whether the design is for monoplex (i.e., one target sequence or allele per reaction) or multiplex (i.e., multiple target sequences or alleles per reaction) detection.
  • the user starts the analysis process. In one embodiment, the user clicks a “Go Design It” button to continue.
  • the software validates the field entries before proceeding. In some embodiments, the software verifies that any required fields are completed with the appropriate type of information. In other embodiments, the software verifies that the input sequence meets selected requirements (e.g., minimum or maximum length, DNA or RNA content). If entries in any field are not found to be valid, an error message or dialog box may appear. In preferred embodiments, the error message indicates which field is incomplete and/or incorrect. Once a sequence entry is verified, the software proceeds with the assay design.
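  • A field check of this kind might look like the sketch below. The function name, the default length limits, and the error wording are all hypothetical. Ambiguity codes and any SNP markup in the input are not handled.

```python
def validate_sequence(seq, nucleic_acid="DNA", min_len=20, max_len=2000):
    """Return a list of error messages for a target-sequence entry (empty list if valid).

    min_len and max_len are illustrative defaults, not values from the source.
    """
    alphabets = {"DNA": set("ACGT"), "RNA": set("ACGU")}
    if nucleic_acid not in alphabets:
        return ["nucleic acid type must be DNA or RNA"]
    cleaned = "".join(seq.split()).upper()  # drop whitespace and line breaks
    if not cleaned:
        return ["sequence field is empty"]
    errors = []
    bad = sorted(set(cleaned) - alphabets[nucleic_acid])
    if bad:
        errors.append("unexpected characters for %s: %s" % (nucleic_acid, "".join(bad)))
    if len(cleaned) < min_len:
        errors.append("sequence is shorter than %d bases" % min_len)
    if len(cleaned) > max_len:
        errors.append("sequence is longer than %d bases" % max_len)
    return errors
```

  • Returning every problem at once, rather than stopping at the first, matches the description of an error message that indicates which fields are incomplete or incorrect.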
  • the information supplied in the order entry fields specifies what type of design will be created.
  • the target sequence and multiplex check box specify which type of design to create. Design options include but are not limited to SNP assay, Multiplexed SNP assay (e.g., wherein probe sets for different alleles are to be combined in a single reaction), Multiple SNP assay (e.g., wherein an input sequence has multiple sites of variation for which probe sets are to be designed), and Multiple Probe Arm assays.
  • The INVADERCREATOR software is started via a Web Order Entry (WebOE) process (i.e., through an Intra/Internet browser interface) and these parameters are transferred from the WebOE via applet <param> tags, rather than entered through menus or check boxes.
  • The user chooses two or more designs to work with. In some embodiments, this selection opens a new screen view (e.g., a Multiple SNP Design Selection view, FIG. 9).
  • the software creates designs for each locus in the target sequence, scoring each, and presents them to the user in this screen view. The user can then choose any two designs to work with. In some embodiments, the user chooses a first and second design (e.g., via a menu or buttons) and clicks a “Go Design It” button to continue.
  • The melting temperature (Tm) of the SNP to be detected is calculated using the nearest-neighbor model and published parameters for DNA duplex formation (Allawi and SantaLucia, Biochemistry, 36:10581 [1997]).
  • Where the target strand is RNA, parameters appropriate for RNA/DNA heteroduplex formation may be used. Because the assay's salt concentrations are often different from the solution conditions in which the nearest-neighbor parameters were obtained (1 M NaCl and no divalent metals), and because the presence and concentration of the enzyme influence optimal reaction temperature, an adjustment should be made to the calculated Tm to determine the optimal temperature at which to perform a reaction.
  • “Salt correction” refers to a variation made in the value provided for a salt concentration for the purpose of reflecting the effect, on a Tm calculation for a nucleic acid duplex, of a non-salt parameter or condition affecting said duplex. Variation of the values provided for the strand concentrations will also affect the outcome of these calculations.
  • The algorithm used for calculating probe-target melting temperature has been adapted for use in predicting optimal INVADER assay reaction temperature. For a set of 30 probes, the average deviation between optimal assay temperatures calculated by this method and those experimentally determined is about 1.5° C.
  • The length of the downstream probe to a given SNP is defined by the temperature selected for running the reaction (e.g., 63° C.). Starting from the position of the variant nucleotide on the target DNA (the target base that is paired to the probe nucleotide 5′ of the intended cleavage site), and adding on the 3′ end, an iterative procedure is used by which the length of the target-binding region of the probe is increased by one base pair at a time until a calculated optimal reaction temperature (Tm plus salt correction to compensate for enzyme effect) matching the desired reaction temperature is reached.
  • The non-complementary arm of the probe is preferably selected to allow the secondary reaction to cycle at the same reaction temperature.
  • the entire probe oligonucleotide is screened using programs such as mfold (Zuker, Science, 244: 48 [1989]) or Oligo 5.0 (Rychlik and Rhoads, Nucleic Acids Res, 17: 8543 [1989]) for the possible formation of dimer complexes or secondary structures that could interfere with the reaction.
  • the same principles are also followed for INVADER oligonucleotide design. Briefly, starting from the position N on the target DNA, the 3′ end of the INVADER oligonucleotide is designed to have a nucleotide not complementary to either allele suspected of being contained in the sample to be tested.
  • This mismatch does not adversely affect cleavage (Lyamichev et al., Nature Biotechnology, 17: 292 [1999]), and it can enhance probe cycling, presumably by minimizing coaxial stabilization effects between the two probes. Additional residues complementary to the target DNA starting from residue N-1 are then added in the 5′ direction until the stability of the INVADER oligonucleotide-target hybrid exceeds that of the probe (and therefore the planned assay reaction temperature), generally by 15-20° C.
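  • The two iterative length selections can be sketched together. For simplicity the sketch works on the strand with the same sequence as the oligonucleotides (the complement of the target), so the probe grows at its 3′ end from position n and the INVADER oligonucleotide grows at its 5′ end from position n−1. The function names are hypothetical, the Wallace rule is only a placeholder for the nearest-neighbor calculation with salt correction, and "matching" the reaction temperature is approximated as the first length that reaches it.

```python
def wallace_tm(seq):
    """Placeholder Tm estimate in deg C; the source uses a nearest-neighbor model instead."""
    seq = seq.upper()
    return 2.0 * (seq.count("A") + seq.count("T")) + 4.0 * (seq.count("G") + seq.count("C"))


def design_probe_region(sense, n, reaction_temp, tm=wallace_tm):
    """Grow the target-binding region of the probe 3'-ward from the variant position n."""
    for length in range(1, len(sense) - n + 1):
        region = sense[n:n + length]
        if tm(region) >= reaction_temp:
            return region
    return None  # ran out of sequence


def design_invader(sense, n, alleles, probe_tm, margin=15.0, tm=wallace_tm):
    """Grow the INVADER oligonucleotide 5'-ward from position n-1.

    The 3' terminal base is chosen to match neither allele at position n
    (alleles given on the same strand as `sense`). Growth stops when the
    complementary part is at least `margin` deg C more stable than the probe.
    """
    mismatch = next(b for b in "ACGT" if b not in alleles)
    for k in range(1, n + 1):
        body = sense[n - k:n]
        if tm(body) >= probe_tm + margin:
            return body + mismatch
    return None  # ran out of sequence
```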
  • All of the probe sequences may be selected to allow the primary and secondary reactions to occur at the same optimal temperature, so that the reaction steps can run simultaneously.
  • the probes may be designed to operate at different optimal temperatures, so that the reaction steps are not simultaneously at their temperature optima.
  • the software provides the user an opportunity to change various aspects of the design including but not limited to: probe, target and INVADER oligonucleotide temperature optima and concentrations; blocking groups; probe arms; dyes, capping groups and other adducts; individual bases of the probes and targets (e.g., adding or deleting bases from the end of targets and/or probes, or changing internal bases in the INVADER and/or probe and/or target oligonucleotides).
  • changes are made by selection from a menu.
  • changes are entered into text or dialog boxes.
  • this option opens a new screen (e.g., a Designer Worksheet view, FIG. 10).
  • the software provides a scoring system to indicate the quality (e.g., the likelihood of performance) of the assay designs.
  • The scoring system includes a starting score of points (e.g., 100 points) wherein the starting score is indicative of an ideal design, and wherein design features known or suspected to have an adverse effect on assay performance are assigned penalty values. Penalty values may vary depending on assay parameters other than the sequences, including but not limited to the type of assay for which the design is intended (e.g., monoplex, multiplex) and the temperature at which the assay reaction will be performed.
  • The following example provides illustrative scoring criteria, defined by experimentation, for use with some embodiments of the INVADER assay.
  • design features that may incur score penalties include but are not limited to the following [penalty values are indicated in brackets, first number is for lower temperature assays (e.g., 62-64° C.), second is for higher temperature assays (e.g., 65-66° C.)]:
  • a probe has a 5-base stretch (i.e., 5 of the same base in a row) containing the polymorphism;
  • probe hybridizing region is short (13 bases or less for designs 65-67° C.; 12 bases or less for designs 62-64° C.)
  • probe hybridizing region is long (29 bases or more for designs 65-67° C., 28 bases or more for designs 62-64° C.)
  • probe hybridizing region is 14, 15 or 24-28 bases long (65-67° C.) or 13, 14 or 26, 27 bases long (62-64° C.)
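  • A penalty-based score of this kind can be sketched as follows. The length thresholds are taken from the criteria listed above. The penalty values themselves are not given in this text, so the numbers in `PENALTIES` are purely hypothetical placeholders, as are the function names.

```python
# Hypothetical penalty values, for illustration only.
PENALTIES = {"homopolymer": 30, "short": 40, "long": 40, "borderline": 10}


def in_homopolymer_run(seq, pos, run=5):
    """True if seq[pos] lies within a stretch of `run` or more identical bases."""
    base = seq[pos]
    left = pos
    while left > 0 and seq[left - 1] == base:
        left -= 1
    right = pos
    while right < len(seq) - 1 and seq[right + 1] == base:
        right += 1
    return right - left + 1 >= run


def score_probe(region, snp_pos, high_temp, start=100):
    """Score a probe hybridizing region; high_temp selects the 65-67 deg C thresholds."""
    if high_temp:
        short_max, long_min, borderline = 13, 29, {14, 15, 24, 25, 26, 27, 28}
    else:
        short_max, long_min, borderline = 12, 28, {13, 14, 26, 27}
    score = start
    n = len(region)
    if in_homopolymer_run(region, snp_pos):
        score -= PENALTIES["homopolymer"]
    if n <= short_max:
        score -= PENALTIES["short"]
    elif n >= long_min:
        score -= PENALTIES["long"]
    elif n in borderline:
        score -= PENALTIES["borderline"]
    return score
```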
  • temperatures for each of the oligonucleotides in the designs are recomputed and scores are recomputed as changes are made.
  • score descriptions can be seen by clicking a “descriptions” button.
  • a BLAST search option is provided.
  • a BLAST search is done by clicking a “BLAST Design” button. In some embodiments, this action brings up a dialog box describing the BLAST process.
  • the BLAST search results are displayed as a highlighted design on a Designer Worksheet.
  • a user accepts a design by clicking an “Accept” button.
  • the program approves a design without user intervention.
  • the program sends the approved design to a next process step (e.g., into production; into a file or database).
  • the program provides a screen view (e.g., an Output Page, FIG. 11), allowing review of the final designs created and allowing notes to be attached to the design.
  • the user can return to the Designer Worksheet (e.g., by clicking a “Go Back” button) or can save the design (e.g., by clicking a “Save It” button) and continue (e.g., to submit the designed oligonucleotides for production).
  • the program provides an option to create a screen view of a design optimized for printing (e.g., a text-only view) or other export (e.g., an Output view, FIG. 12).
  • the Output view provides a description of the design particularly suitable for printing, or for exporting into another application (e.g., by copying and pasting into another application).
  • the Output view opens in a separate window.
  • The present invention is not limited to the use of the INVADERCREATOR software. Indeed, a variety of software programs are contemplated and are commercially available, including, but not limited to, GCG Wisconsin Package (Genetics Computer Group, Madison, Wis.) and Vector NTI (Informax, Rockville, Md.). Other detection assays may be used in the present invention.
  • variant sequences are detected using a direct sequencing technique.
  • DNA samples are first isolated from a subject using any suitable method.
  • The region of interest is cloned into a suitable vector and amplified by growth in a host cell (e.g., a bacterium).
  • DNA in the region of interest is amplified using PCR.
  • DNA in the region of interest (e.g., the region containing the SNP or mutation of interest) is sequenced using any suitable method, including but not limited to manual sequencing using radioactive marker nucleotides, or automated sequencing. The results of the sequencing are displayed using any suitable method. The sequence is examined and the presence or absence of a given SNP or mutation is determined.
  • variant sequences are detected using a PCR-based assay.
  • the PCR assay comprises the use of oligonucleotide primers that hybridize only to the variant or wild type allele (e.g., to the region of polymorphism or mutation). Both sets of primers are used to amplify a sample of DNA. If only the mutant primers result in a PCR product, then the patient has the mutant allele. If only the wild-type primers result in a PCR product, then the patient has the wild type allele.
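  • The interpretation rule can be written as a small decision function. The first two outcomes follow the text above. The handling of the other two cases (both products, or neither) is an assumption added for completeness, and the function name is hypothetical.

```python
def call_genotype(mutant_product, wild_type_product):
    """Interpret an allele-specific PCR result from the presence of each product."""
    if mutant_product and not wild_type_product:
        return "mutant allele"
    if wild_type_product and not mutant_product:
        return "wild-type allele"
    if mutant_product and wild_type_product:
        return "both alleles detected"  # assumption: e.g. a heterozygous sample
    return "no call"  # assumption: neither reaction produced a product
```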
  • variant sequences are detected using a fragment length polymorphism assay.
  • In a fragment length polymorphism assay, a unique DNA banding pattern based on cleaving the DNA at a series of positions is generated using an enzyme (e.g., a restriction enzyme or a CLEAVASE I [Third Wave Technologies, Madison, Wis.] enzyme).
  • DNA fragments from a sample containing a SNP or a mutation will have a different banding pattern than wild type.
  • variant sequences are detected using a restriction fragment length polymorphism assay (RFLP).
  • the region of interest is first isolated using PCR.
  • the PCR products are then cleaved with restriction enzymes known to give a unique length fragment for a given polymorphism.
  • the restriction-enzyme digested PCR products are generally separated by gel electrophoresis and may be visualized by ethidium bromide staining.
  • the length of the fragments is compared to molecular weight markers and fragments generated from wild-type and mutant controls.
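  • The comparison of fragment lengths can be illustrated with a simple in-silico digest. The sketch treats the PCR product as a linear single sequence and uses EcoRI (recognition site GAATTC, cut after the first G) as an example enzyme. The sequences and the function name are made up for illustration.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases into the site at which the strand is cut
    (e.g. 1 for EcoRI, G^AATTC). Only the given strand is considered.
    """
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = seq.find(site, i + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]


wild_type = "AAAAGAATTCTTTT"  # contains one EcoRI site
variant = "AAAAGAGTTCTTTT"    # a single-base change destroys the site
print(digest(wild_type, "GAATTC", 1))  # two fragments
print(digest(variant, "GAATTC", 1))    # one uncut fragment
```

  • A polymorphism that creates or destroys a recognition site changes the set of fragment lengths, which is what the gel comparison against wild-type and mutant controls detects.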
  • variant sequences are detected using a CLEAVASE fragment length polymorphism assay (CFLP; Third Wave Technologies, Madison, Wis.; See e.g., U.S. Pat. Nos. 5,843,654; 5,843,669; 5,719,208; and 5,888,780; each of which is herein incorporated by reference).
  • This assay is based on the observation that when single strands of DNA fold on themselves, they assume higher order structures that are highly individual to the precise sequence of the DNA molecule. These secondary structures involve partially duplexed regions of DNA such that single stranded regions are juxtaposed with double stranded DNA hairpins.
  • the CLEAVASE I enzyme is a structure-specific, thermostable nuclease that recognizes and cleaves the junctions between these single-stranded and double-stranded regions.
  • The region of interest is first isolated, for example, using PCR. In preferred embodiments, one or both strands are labeled. Then, DNA strands are separated by heating. Next, the reactions are cooled to allow intrastrand secondary structure to form. The PCR products are then treated with the CLEAVASE I enzyme to generate a series of fragments that are unique to a given SNP or mutation. The CLEAVASE enzyme treated PCR products are separated and detected (e.g., by denaturing gel electrophoresis) and visualized (e.g., by autoradiography, fluorescence imaging or staining). The length of the fragments is compared to molecular weight markers and fragments generated from wild-type and mutant controls.
  • Variant sequences are detected using a hybridization assay.
  • In a hybridization assay, the presence or absence of a given SNP or mutation is determined based on the ability of the DNA from the sample to hybridize to a complementary DNA molecule (e.g., an oligonucleotide probe).
  • Hybridization of a probe to the sequence of interest is detected directly by visualizing a bound probe (e.g., a Northern or Southern assay; See e.g., Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, NY [1991]).
  • Genomic DNA (Southern) or RNA (Northern) is isolated from a subject. The DNA or RNA is then cleaved with a series of restriction enzymes that cleave infrequently in the genome and not near any of the markers being assayed.
  • the DNA or RNA is then separated (e.g., on an agarose gel) and transferred to a membrane.
  • A labeled (e.g., by incorporating a radionucleotide) probe or probes specific for the SNP or mutation being detected is allowed to contact the membrane under low, medium, or high stringency conditions. Unbound probe is removed and the presence of binding is detected by visualizing the labeled probe.
  • variant sequences are detected using a DNA chip hybridization assay.
  • In this assay, a series of oligonucleotide probes are affixed to a solid support. The oligonucleotide probes are designed to be unique to a given SNP or mutation. The DNA sample of interest is contacted with the DNA “chip” and hybridization is detected.
  • the DNA chip assay is a GeneChip (Affymetrix, Santa Clara, Calif.; See e.g., U.S. Pat. Nos. 6,045,996; 5,925,525; and 5,858,659; each of which is herein incorporated by reference) assay.
  • the GeneChip technology uses miniaturized, high-density arrays of oligonucleotide probes affixed to a “chip.” Probe arrays are manufactured by Affymetrix's light-directed chemical synthesis process, which combines solid-phase chemical synthesis with photolithographic fabrication techniques employed in the semiconductor industry.
  • the process constructs high-density arrays of oligonucleotides, with each probe in a predefined position in the array. Multiple probe arrays are synthesized simultaneously on a large glass wafer. The wafers are then diced, and individual probe arrays are packaged in injection-molded plastic cartridges, which protect them from the environment and serve as chambers for hybridization.
  • the nucleic acid to be analyzed is isolated, amplified by PCR, and labeled with a fluorescent reporter group.
  • the labeled DNA is then incubated with the array using a fluidics station.
  • the array is then inserted into the scanner, where patterns of hybridization are detected.
  • the hybridization data are collected as light emitted from the fluorescent reporter groups already incorporated into the target, which is bound to the probe array. Probes that perfectly match the target generally produce stronger signals than those that have mismatches. Since the sequence and position of each probe on the array are known, by complementarity, the identity of the target nucleic acid applied to the probe array can be determined.
  • A DNA microchip containing electronically captured probes (Nanogen, San Diego, Calif.) is utilized (See e.g., U.S. Pat. Nos. 6,017,696; 6,068,818; and 6,051,380; each of which is herein incorporated by reference).
  • Nanogen's technology enables the active movement and concentration of charged molecules to and from designated test sites on its semiconductor microchip.
  • DNA capture probes unique to a given SNP or mutation are electronically placed at, or “addressed” to, specific sites on the microchip. Since DNA has a strong negative charge, it can be electronically moved to an area of positive charge.
  • a test site or a row of test sites on the microchip is electronically activated with a positive charge.
  • a solution containing the DNA probes is introduced onto the microchip.
  • the negatively charged probes rapidly move to the positively charged sites, where they concentrate and are chemically bound to a site on the microchip.
  • the microchip is then washed and another solution of distinct DNA probes is added until the array of specifically bound DNA probes is complete.
  • a test sample is then analyzed for the presence of target DNA molecules by determining which of the DNA capture probes hybridize with complementary DNA in the test sample (e.g., a PCR amplified gene of interest).
  • An electronic charge is also used to move and concentrate target molecules to one or more test sites on the microchip. The electronic concentration of sample DNA at each test site promotes rapid hybridization of sample DNA with complementary capture probes (hybridization may occur in minutes).
  • the polarity or charge of the site is reversed to negative, thereby forcing any unbound or nonspecifically bound DNA back into solution away from the capture probes.
  • a laser-based fluorescence scanner is used to detect binding.
  • an array technology based upon the segregation of fluids on a flat surface (chip) by differences in surface tension (ProtoGene, Palo Alto, Calif.) is utilized (See e.g., U.S. Pat. Nos. 6,001,311; 5,985,551; and 5,474,796; each of which is herein incorporated by reference).
  • Protogene's technology is based on the fact that fluids can be segregated on a flat surface by differences in surface tension that have been imparted by chemical coatings. Once so segregated, oligonucleotide probes are synthesized directly on the chip by ink-jet printing of reagents.
  • the array with its reaction sites defined by surface tension is mounted on an X/Y translation stage under a set of four piezoelectric nozzles, one for each of the four standard DNA bases.
  • the translation stage moves along each of the rows of the array and the appropriate reagent is delivered to each of the reaction sites.
  • the A amidite is delivered only to the sites where amidite A is to be coupled during that synthesis step and so on.
  • Common reagents and washes are delivered by flooding the entire surface and then removing them by spinning.
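  • The per-step delivery of amidites described above can be sketched as a simple plan that lists, for each synthesis cycle, which reaction sites receive which base. This is a minimal sketch: the site coordinates, the probe sequences, and the function name are hypothetical, and the sequences are assumed to be written in the order the bases are coupled (synthesis direction is not modeled).

```python
BASES = "ACGT"  # one nozzle per standard DNA base


def delivery_plan(probes):
    """Return, for each synthesis cycle, the sites that receive each amidite.

    `probes` maps a site (row, col) to the base sequence in coupling order.
    """
    n_cycles = max(len(seq) for seq in probes.values())
    plan = []
    for cycle in range(n_cycles):
        step = {base: [] for base in BASES}
        for site, seq in sorted(probes.items()):
            if cycle < len(seq):
                # Deliver this base only to sites that need it in this cycle.
                step[seq[cycle]].append(site)
        plan.append(step)
    return plan


probes = {(0, 0): "ACG", (0, 1): "AGG", (1, 0): "TC"}
for number, step in enumerate(delivery_plan(probes), start=1):
    print(number, step)
```

  • Common reagents and washes are not part of the plan because, as stated above, they are applied by flooding the entire surface.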
  • DNA probes unique for the SNP or mutation of interest are affixed to the chip using Protogene's technology.
  • the chip is then contacted with the PCR-amplified genes of interest.
  • unbound DNA is removed and hybridization is detected using any suitable method (e.g., by fluorescence de-quenching of an incorporated fluorescent group).
  • a “bead array” is used for the detection of polymorphisms (Illumina, San Diego, Calif.; See e.g., PCT Publications WO 99/67641 and WO 00/39587, each of which is herein incorporated by reference).
  • Illumina uses a BEAD ARRAY technology that combines fiber optic bundles and beads that self-assemble into an array. Each fiber optic bundle contains thousands to millions of individual fibers depending on the diameter of the bundle.
  • the beads are coated with an oligonucleotide specific for the detection of a given SNP or mutation. Batches of beads are combined to form a pool specific to the array.
  • the BEAD ARRAY is contacted with a prepared subject sample (e.g., DNA). Hybridization is detected using any suitable method.
  • hybridization is detected by enzymatic cleavage of specific structures (INVADER assay, Third Wave Technologies; See e.g., U.S. Pat. Nos. 5,846,717; 6,090,543; 6,001,567; 5,985,557; and 5,994,069; each of which is herein incorporated by reference).
  • the INVADER assay detects specific DNA and RNA sequences by using structure-specific enzymes to cleave a complex formed by the hybridization of overlapping oligonucleotide probes. Elevated temperature and an excess of one of the probes enable multiple probes to be cleaved for each target sequence present without temperature cycling.
  • the secondary probe oligonucleotide can be 5′-end labeled with a fluorescent dye that is quenched by a second dye or other quenching moiety.
  • the de-quenched dye-labeled product may be detected using a standard fluorescence plate reader, or an instrument configured to collect fluorescence data during the course of the reaction (i.e., a “real-time” fluorescence detector, such as an ABI 7700 Sequence Detection System, Applied Biosystems, Foster City, Calif.).
  • the INVADER assay detects specific mutations and SNPs in unamplified genomic DNA.
  • two oligonucleotides hybridize in tandem to the genomic DNA to form an overlapping structure.
  • a structure-specific nuclease enzyme recognizes this overlapping structure and cleaves the primary probe.
  • cleaved primary probe combines with a fluorescence-labeled secondary probe to create another overlapping structure that is cleaved by the enzyme.
  • the initial and secondary reactions can run concurrently in the same vessel. Cleavage of the secondary probe is detected by using a fluorescence detector, as described above.
  • the signal of the test sample may be compared to known positive and negative controls.
  • hybridization of a bound probe is detected using a TaqMan assay (PE Biosystems, Foster City, Calif.; See e.g., U.S. Pat. Nos. 5,962,233 and 5,538,848, each of which is herein incorporated by reference).
  • the assay is performed during a PCR reaction.
  • the TaqMan assay exploits the 5′-3′ exonuclease activity of DNA polymerases such as AMPLITAQ DNA polymerase.
  • a probe, specific for a given allele or mutation, is included in the PCR reaction.
  • the probe consists of an oligonucleotide with a 5′-reporter dye (e.g., a fluorescent dye) and a 3′-quencher dye.
  • the 5′-3′ nucleolytic activity of the AMPLITAQ polymerase cleaves the probe between the reporter and the quencher dye.
  • the separation of the reporter dye from the quencher dye results in an increase of fluorescence.
  • the signal accumulates with each cycle of PCR and can be monitored with a fluorimeter.
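  • The accumulation of signal over PCR cycles can be illustrated with an idealized model. The sketch below assumes that each newly synthesized strand releases one reporter dye and that amplification proceeds with a constant per-cycle efficiency; both are simplifying assumptions, the function name is hypothetical, and the output is in arbitrary units rather than instrument readings.

```python
def cleaved_probe_signal(initial_copies, cycles, efficiency=1.0):
    """Cumulative reporter signal (arbitrary units) after each PCR cycle.

    Idealized assumption: every newly synthesized strand corresponds to one
    cleaved probe, i.e., one reporter dye separated from its quencher.
    """
    signal = []
    copies = float(initial_copies)
    released = 0.0
    for _ in range(cycles):
        new_strands = copies * efficiency
        released += new_strands  # each extension cleaves one probe
        copies += new_strands
        signal.append(released)
    return signal


print(cleaved_probe_signal(initial_copies=1, cycles=5))
```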
  • polymorphisms are detected using the SNP-IT primer extension assay (Orchid Biosciences, Princeton, N.J.; See e.g., U.S. Pat. Nos. 5,952,174 and 5,919,626, each of which is herein incorporated by reference).
  • SNPs are identified by using a specially synthesized DNA primer and a DNA polymerase to selectively extend the DNA chain by one base at the suspected SNP location. DNA in the region of interest is amplified and denatured. Polymerase reactions are then performed using miniaturized systems called microfluidics. Detection is accomplished by adding a label to the nucleotide suspected of being at the SNP or mutation location. Incorporation of the label into the DNA can be detected by any suitable method (e.g., if the nucleotide contains a biotin label, detection is via a fluorescently labelled antibody specific for biotin).
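  • The single-base extension step can be sketched as follows. The example assumes that both sequences are written 5' to 3' and that the primer anneals with its 3' end immediately adjacent to the SNP site; the sequences and function names are hypothetical and only illustrate how the incorporated (labeled) nucleotide reveals the template base.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extended_base(template, primer):
    """Return the single base a polymerase would add to the primer's 3' end.

    `template` and `primer` are both written 5'->3'.
    """
    bound_region = reverse_complement(primer)  # template stretch the primer covers
    start = template.find(bound_region)
    if start <= 0:
        raise ValueError("primer does not anneal next to a template base")
    snp_base = template[start - 1]  # template base just 5' of the bound region
    return COMPLEMENT[snp_base]


primer = "GCTAACGT"
print(extended_base("GGACTACGTTAGC", primer))  # allele with T at the SNP site
print(extended_base("GGACCACGTTAGC", primer))  # allele with C at the SNP site
```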
  • the present invention provides a high-throughput detection assay production system, allowing for high-speed, efficient production of thousands of detection assays.
  • the high-throughput production systems and methods allow sufficient production capacity to facilitate full implementation of the funnel process described above, allowing comprehensive coverage of all known (and newly identified) markers.
  • oligonucleotides and/or other detection assay components are synthesized.
  • oligonucleotide synthesis is performed in an automated and coordinated manner.
  • produced detection assays are tested against a plurality of samples representing two or more different individuals or alleles (e.g., samples containing sequences from individuals with different ethnic backgrounds, disease states, etc.) to demonstrate the viability of the assay with different individuals.
  • the present invention provides an automated DNA production process.
  • the automated DNA production process includes an oligonucleotide synthesizer component and an oligonucleotide processing component.
  • the oligonucleotide production component includes multiple components, including but not limited to, an oligonucleotide cleavage and deprotection component, an oligonucleotide purification component, an oligonucleotide dry down component, an oligonucleotide de-salting component, an oligonucleotide dilute and fill component, and a quality control component.
  • the automated DNA production process of the present invention further includes automated design software and supporting computer terminals and connections, a product tracking system (e.g., a bar code system), and a centralized packaging component.
  • the components are combined in an integrated, centrally controlled, automated production system.
  • the present invention thus provides methods of synthesizing several related oligonucleotides (e.g., components of a kit) in a coordinated manner.
  • the automated production systems of the present invention allow large scale automated production of detection assays for numerous different target sequences.
  • sequences are sent (e.g., electronically) to a high-throughput oligonucleotide synthesizer component.
  • the high-throughput synthesizer component contains multiple DNA synthesizers.
  • the synthesizers are arranged in banks.
  • a given bank of synthesizers may be used to produce one set of oligonucleotides (e.g., for an INVADER or PCR reaction).
  • the present invention is not limited to any one synthesizer.
  • synthesizers are contemplated, including, but not limited to MOSS EXPEDITE 16-channel DNA synthesizers (PE Biosystems, Foster City, Calif.), OligoPilot (Amersham Pharmacia), the 3900 and 3948 48-Channel DNA synthesizers (PE Biosystems, Foster City, Calif.), and the high-throughput synthesizer described in PCT Publication WO 01/41918.
  • synthesizers are modified or are wholly fabricated to meet physical or performance specifications particularly preferred for use in the synthesis component of the present invention.
  • two or more different DNA synthesizers are combined in one bank in order to optimize the quantities of different oligonucleotides needed.
  • the DNA synthesizer component includes at least 100 synthesizers. In other embodiments, the DNA synthesizer component includes at least 200 synthesizers. In still other embodiments, the DNA synthesizer component includes at least 250 synthesizers. In some embodiments, the DNA synthesizers are run 24 hours a day.
  • the DNA synthesizers in the oligonucleotide synthesis component further comprise an automated reagent supply system.
  • the automated reagent supply system delivers reagents necessary for synthesis to the synthesizers from a central supply area.
  • acetonitrile is supplied via tubing (e.g., stainless steel tubing) through the automated supply system.
  • De-blocking solution may also be supplied directly to DNA synthesizers through tubing.
  • the reagent supply system tubing is designed to connect directly to the DNA synthesizers without modifying the synthesizers.
  • the central reagent supply is designed to deliver reagents at a constant and controlled pressure.
  • the amount of reagent circulating in the central supply loop is maintained at 8 to 12 times the level needed for synthesis in order to allow standardized pressure at each instrument.
  • the excess reagent also allows new reagent to be added to the system without shutting down.
  • the excess of reagent allows different types of pressurized reagent containers to be attached to one system.
  • the excess of reagents in one centralized system further allows for one central system for chemical spills and fire suppression.
  • the DNA synthesis component includes a centralized argon delivery system.
  • the system includes high-pressure argon tanks adjacent to each bank of synthesizers. These tanks are connected to large, main argon tanks for backup.
  • the main tanks are run in series. In other embodiments, the main tanks are set up in banks.
  • the system further includes an automated tank switching system.
  • the argon delivery system further comprises a tertiary backup system to provide argon in the case of failure of the primary and backup systems.
  • one or more branched delivery components are used between the reagent tanks and the individual synthesizers or banks of synthesizers.
  • acetonitrile is delivered through a branched metal structure.
  • each branched delivery component is individually pressurized.
  • each branched delivery component contains ten or more branches.
  • Reagent tanks may be connected to the branched delivery components using any number of configurations. For example, in some embodiments, a single reagent tank is matched with a single branched component. In other embodiments, a plurality of reagent tanks is used to supply reagents to one or more branched components.
  • the plurality of tanks may be attached to the branched components through a single feed line, wherein one or a subset of the tanks feeds the branched components until empty (or substantially empty), whereby a second tank or subset of tanks is accessed to maintain a continuous supply of reagent to the one or more branched components.
  • an ultrasonic level sensor may be applied to automate the monitoring and switching of tanks.
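  • The tank-switching behavior described above can be sketched as a small simulation. It assumes a single feed line served by tanks in a fixed order and a level reading per tank (as an ultrasonic sensor might provide); the function name, units, and threshold are illustrative assumptions, not part of the described system.

```python
def draw_reagent(tank_levels, demand, empty_threshold=0.0):
    """Simulate drawing `demand` litres through a single feed line.

    Tanks are used in list order. When the active tank reaches the threshold,
    the next tank is switched in so that supply is continuous.
    Returns the remaining levels and the index of the active tank.
    """
    levels = list(tank_levels)
    active = 0
    remaining = demand
    while remaining > 0:
        # Skip tanks that the level sensor reports as (substantially) empty.
        while active < len(levels) and levels[active] <= empty_threshold:
            active += 1
        if active == len(levels):
            raise RuntimeError("all tanks empty; supply interrupted")
        usable = levels[active] - empty_threshold
        taken = min(usable, remaining)
        levels[active] -= taken
        remaining -= taken
    return levels, active


print(draw_reagent([100, 100, 100], demand=150))
```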
  • each branch of the branched delivery component provides reagent to one synthesizer or to a bank of synthesizers through connecting tubing.
  • tubing is continuous (i.e., provides a direct connection between the delivery branch and the synthesizer).
  • the tubing comprises an interior diameter of 0.25 inches or less (e.g., 0.125 inches).
  • each branch contains one or more valves (preferably one). While the valve may be located at any position along the delivery line, in preferred embodiments, the valve is located in close proximity to the synthesizer.
  • reagent is provided directly to synthesizers without any joints or valves between the branched delivery component and the synthesizers.
  • the solvent is contained in a cabinet designed for the safe storage of flammable chemicals (a “flammables cabinet”) and the branched structure is located outside of the cabinet and is fed by the solvent container through a tube passed through the wall of the cabinet.
  • the reagent and branched system is stored in an explosion proof room or chamber and the solvent is pumped via tubing through the wall of the explosion proof room.
  • all of the tubing from each of the branches is fed through the wall in at a single location (e.g., through a single hole in the wall).
  • the reagent delivery system of the present invention provides several advantages. For example, such a system allows each synthesizer to be turned off (e.g., for servicing) independent of the other synthesizers.
  • Use of continuous tubing reduces the number of joints and couplings, the areas most vulnerable to failure, between the reagent sources and the synthesizers, thereby reducing the potential for leakage or blockage in the system.
  • Use of continuous tubing through inaccessible or difficult-to-access areas reduces the likelihood that repairs or service will be needed in such areas. In addition, fewer valves results in cost savings.
  • the branched tubing structure further provides a sight glass.
  • the sight glass is located at the top of the branched delivery structure.
  • the sight glass provides the opportunity for visual and physical sampling of the reagent.
  • the sight glass includes a sampling valve (e.g., to collect samples for quality control).
  • the sight glass serves as a trap for gas bubbles, to prevent bubbles from entering the connecting tubing.
  • the sight glass contains a vent (e.g., a solenoid valve) for de-gassing of the system.
  • scanning of the sight glass (e.g., spectrophotometrically) and sampling are automated. The automated system provides quality control and feedback (e.g., the presence of contamination).
  • the present invention provides a portable reagent delivery system.
  • the portable reagent delivery system comprises a branched structure connected to solvent tanks that are contained in a flammables cabinet.
  • one reagent delivery system is able to provide sufficient reagent for 40 or more synthesizers.
  • These portable reagent delivery systems of the present invention facilitate the operation of mobile (portable) synthesis facilities.
  • these portable reagent delivery systems facilitate the operation of flexible synthesis facilities that can be easily re-configured to meet particular needs of individual synthesis projects or contracts.
  • a synthesis facility comprises multiple portable reagent delivery systems.
  • the DNA synthesis component further comprises a centralized waste collection system.
  • the centralized waste collection system comprises cache pots for central waste collection.
  • the cache pots include level detectors such that when waste level reaches a preset value, a pump is activated to drain the cache into a central collection reservoir.
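  • The level-triggered draining of a cache pot can be sketched as a two-threshold (hysteresis) rule. The text specifies only a preset level that activates the pump; the lower "stop" mark, the function name, and the numeric values below are assumptions added for illustration.

```python
def pump_should_run(level, pump_on, high_mark, low_mark):
    """Decide whether the drain pump should run for the current waste level.

    The pump starts when the level reaches `high_mark` (the preset value) and,
    by assumption, keeps running until the level falls to `low_mark`.
    """
    if level >= high_mark:
        return True
    if level <= low_mark:
        return False
    return pump_on  # between the marks: keep the current state


# Hypothetical level readings (litres) as waste accumulates and is drained.
state = False
for reading in [2.0, 6.0, 9.5, 7.0, 3.0, 1.0, 2.5]:
    state = pump_should_run(reading, state, high_mark=9.0, low_mark=1.0)
    print(reading, state)
```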
  • ductwork is provided to gather fumes from cache pots. The fumes are then vented safely through the roof, avoiding exposure of personnel to harmful fumes.
  • the air handling system provides an adequate amount of air exchange per person to ensure that personnel are not exposed to harmful fumes. The coordinated reagent delivery and waste removal systems increase the safety and health of workers, as well as improving cost savings.
  • the solvent waste disposal system comprises a waste transfer system.
  • the system contains no electronic components.
  • the system comprises no moving parts.
  • waste is first collected in a liquid transfer drum designed for the safe storage of flammable waste.
  • waste is manually poured into the drum through a waste channel.
  • solvent waste is automatically transported (e.g., through tubing) directly from synthesizers to the drum.
  • argon is pumped from a pressurized gas line into the drum through a first opening, forcing solvent waste out an output channel at a second opening (e.g., through tubing) into a centralized waste collection area.
  • the argon is pumped at low pressure (e.g., 3-10 pounds per square inch (psi), preferably 5 psi or less).
  • the drum contains a sight glass to visualize the solvent level.
  • the level is visualized manually and the disposal system is activated when the drum has reached a selected threshold level. In other embodiments, the level is automatically detected and the disposal system is automatically activated when the drum has reached the threshold level.
  • the solvent waste transfer system of the present invention provides several advantages over manual collection and complex systems.
  • the solvent waste system of the present invention is intrinsically safe, as it can be designed with no moving or electrical parts.
  • the system described above is suitable for use in Division I/Class I space under EPA regulations.
  • all of the DNA synthesizers in the synthesis component are attached to a centralized control system.
  • the centralized control system controls all areas of operation, including, but not limited to, power, pressure, reagent delivery, waste, and synthesis.
  • the centralized control system includes a clean electrical grid with uninterrupted power supply. Such a system minimizes power level fluctuations.
  • the centralized control system includes alarms for air flow, status of reagents, and status of waste containers. The alarm system can be monitored from the central control panel.
  • the centralized control system allows additions, deletions, or shutdowns of one synthesizer or one block of synthesizers without disrupting operations of other instruments.
  • the centralized power control allows a user to turn instruments off instrument by instrument, bank by bank, or for the entire module.
  • the automated DNA production process further comprises one or more oligonucleotide production components, including, but not limited to, an oligonucleotide cleavage and deprotection component, an oligonucleotide purification component, a dry-down component, a desalting component, a dilution and fill component, and a quality control component.
  • the oligonucleotides are moved to the cleavage and deprotection station.
  • the transfer of oligonucleotides to this station is automated and controlled by robotic automation.
  • the entire cleavage and deprotection process is performed by robotic automation.
  • NH4OH for deprotection is supplied through the automated reagent supply system.
  • oligonucleotide deprotection is performed in multi-sample containers (e.g., 96 well covered dishes) in an oven.
  • This method is designed for the high-throughput system of the present invention and is capable of the simultaneous processing of large numbers of samples.
  • This method provides several advantages over the standard method of deprotection in vials. For example, sample handling is reduced (e.g., labeling of vials and dispensing of concentrated NH4OH to individual vials, as well as the associated capping and uncapping of the vials, are eliminated). This reduces the risks of contamination or mislabeling and decreases processing time.
  • the methods save many labor hours per day.
  • the method also reduces consumable requirements by eliminating the need for vials and pipette tips, reduces equipment needs by eliminating the need for pipettes, and improves worker safety conditions by reducing worker exposure to ammonium hydroxide.
  • the potential for repetitive motion disorders is also reduced.
  • Deprotection in a multi-well plate further has the advantage that the plate can be directly placed on an automated desalting apparatus (e.g., TECAN Robot).
  • the plate was optimized to be functional and compatible with the deprotection methods.
  • the plate is designed to be able to hold as much as two milliliters of oligonucleotide and ammonium hydroxide. If deep well plates are used, automated downstream processing steps may need to be altered to ensure that the full volume of sample is extracted from the wells.
  • the multi-well plates used in the methods of the present invention comprise a tight sealing lid/cover to protect from evaporation, provide for even heating, and are able to withstand temperatures necessary for deprotection. Attempts with initial plates were not successful, having problems with lids that were not suitably sealed and plates that did not withstand deprotection temperatures.
  • oligonucleotides are cleaved from the synthesis support in the multi-well plates. In other embodiments (e.g., processing of probe oligonucleotides), oligonucleotides are first cleaved from the synthesis column and then transferred to the plate for deprotection.
  • oligonucleotides are further purified.
  • Any suitable purification method may be employed, including, but not limited to, high pressure liquid chromatography (HPLC) (e.g., using reverse phase C18 and ion exchange), reverse phase cartridge purification, and gel electrophoresis.
  • purification is carried out using ion exchange HPLC chromatography.
  • HPLC instruments are utilized, and integrated into banks (e.g., banks of 8 HPLC instruments).
  • Each bank is referred to as an HPLC module.
  • Each HPLC module consists of an automated injector (e.g., including, but not limited to, Leap Technologies 8-port injector) connected to each bank of automated HPLC instruments (e.g., including, but not limited to, Beckman-Coulter HPLC instruments).
  • the automatic Leap injector can handle four 96-well plates of cleaved and deprotected oligonucleotides at a time.
  • the Leap injector automatically loads a sample onto each of the HPLCs in a given bank.
  • the use of one injector with each bank of HPLCs provides the advantage of reducing labor and allowing integrated processing of information.
  • oligonucleotides are purified on an ion exchange column using a salt gradient.
  • Any suitable ion exchange functionality or support may be utilized, including but not limited to, Source 15 Q ion exchange resin (Pharmacia).
  • Any suitable salt may be utilized for elution of oligonucleotides from the ion exchange column, including but not limited to, sodium chloride, acetonitrile, and sodium perchlorate.
  • a gradient of sodium perchlorate in acetonitrile and sodium acetate is utilized.
  • the gradient is run for a sufficient time course to capture a broad range of sizes of oligonucleotides.
  • the gradient is a 54 minute gradient carried out using the method described in Tables 1 and 2.
  • Table 1 describes the HPLC protocol for the gradient.
  • the time column represents the time of the operation.
  • the module column represents the equipment that controls the operation.
  • the function column represents the function that the HPLC is performing.
  • the value column represents the value of the HPLC function at the time specified in the time column.
  • Table 2 describes the gradient used in HPLC purification.
  • the column temperature is 65° C.
  • Buffer A is 20 mM Sodium Perchlorate, 20 mM Sodium Acetate, 10% Acetonitrile, pH 7.35.
  • Buffer B is 600 mM Sodium Perchlorate, 20 mM Sodium Acetate, 10% Acetonitrile, pH 7.35.
  • the gradient is shortened.
  • the gradient is shortened so that a particular gradient range suitable for the elution of a particular oligonucleotide being purified is accomplished in a reduced amount of time.
  • the gradient is shortened so that a particular gradient range suitable for the elution of any oligonucleotide having a size within a selected size range is accomplished in a reduced amount of time.
  • the gradient is a 34 minute gradient described in Tables 3 and 4.
  • the parameters and buffer compositions are as described for Tables 1 and 2 above. Reducing the gradient to 34 minutes increases the capacity of synthesis per HPLC instrument and reduces buffer usage by 50% compared to the 54 minute protocol described above.
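  • The effect of gradient length on instrument capacity can be illustrated with simple arithmetic. The sketch below counts how many complete gradients fit into 24 hours on one instrument; it deliberately ignores column washing, injection, and any other overhead, so the figures are upper bounds under that assumption and are not throughput values from the source.

```python
MINUTES_PER_DAY = 24 * 60


def gradients_per_day(gradient_minutes, overhead_minutes=0):
    """Whole gradient runs that fit in one day on a single HPLC instrument."""
    return MINUTES_PER_DAY // (gradient_minutes + overhead_minutes)


for minutes in (54, 34):
    print(minutes, "minute gradient:", gradients_per_day(minutes), "runs per day")
```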
  • the 34 minute HPLC method of the present invention has the further advantage of being optimized to be able to separate oligonucleotides of a length range of 23-39 nucleotides without any changes in the protocol for the different lengths within the range. Previous methods required changes for every 2-3 nucleotide change in length.
  • the gradient time is reduced even further (e.g., to less than 30 minutes, preferably to less than 20 minutes, and even more preferably, to less than 15 minutes). Any suitable method may be utilized that meets the requirements of the present invention (e.g., able to purify a wide range of oligonucleotide lengths using the same protocol).
  • separate sets of HPLC conditions, each selected to purify oligonucleotides within a different size range, may be provided (e.g., may be run on separate HPLCs or banks of HPLCs).
  • a first bank of HPLCs is configured to purify oligonucleotides using a first set of purification conditions (e.g., for 23-39 mers), while second and third banks are used for the shorter and longer oligonucleotides.
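  • Routing oligonucleotides to banks by length can be sketched as a lookup over size ranges. The 23-39 range comes from the text; the bank names and the treatment of everything shorter or longer as two single ranges are illustrative assumptions.

```python
# (bank name, minimum length, maximum length); None means no upper limit.
BANKS = [
    ("short", 1, 22),
    ("standard", 23, 39),
    ("long", 40, None),
]


def assign_bank(sequence):
    """Return the name of the HPLC bank whose size range covers the oligo."""
    length = len(sequence)
    for name, low, high in BANKS:
        if length >= low and (high is None or length <= high):
            return name
    raise ValueError("empty sequence cannot be assigned to a bank")


for oligo in ("A" * 18, "A" * 30, "A" * 45):
    print(len(oligo), assign_bank(oligo))
```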
  • the HPLC station is equipped with a central reagent supply system.
  • the central reagent system includes an automated buffer preparation system.
  • the automated buffer preparation system includes large vat carboys that receive pre-measured reagents and water for centralized buffer preparation.
  • the buffers e.g., a high salt buffer and a low salt buffer
  • the conductivity of the solution in the circulation loop is monitored to verify correct content and adequate mixing.
  • circulation lines are fitted with venturis for static mixing of the solutions as they are circulated through the piping loop.
  • the circulation lines are fitted with 0.05 μm filters for sterilization.
  • the HPLC purification step is carried out in a clean room environment.
  • the clean room includes a HEPA filtration system. All personnel in the clean room are outfitted with protective gloves, hair coverings, and foot coverings.
  • the automated buffer prep system is located in a non-clean room environment and the prepared buffer is piped through the wall into the clean room.
  • Each purified oligonucleotide is collected into a tube (e.g., a 50-ml conical tube) in a carrying case in the fraction collector. Collection is based on a set method, which is triggered by an absorbance rate change within a predetermined time window. In some embodiments, the method uses a flow rate of 5 ml/min (the maximum rate of the pumps is 10 ml/min.) and each column is automatically washed before the injector loads the next sample.
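  • Triggering collection on an absorbance rate change within a time window can be sketched as follows. The slope threshold, the window bounds, the trace values, and the function name are hypothetical; the sketch only shows the general idea of ignoring rate changes outside the window and starting collection at the first sufficiently steep rise inside it.

```python
def collection_start(times, absorbance, window, min_slope):
    """Return the time at which collection would begin, or None.

    `window` is (start, end) in the same units as `times`; `min_slope` is the
    minimum rate of absorbance change that triggers collection.
    """
    t_start, t_end = window
    for i in range(1, len(times)):
        t = times[i]
        if not (t_start <= t <= t_end):
            continue  # rate changes outside the window are ignored
        slope = (absorbance[i] - absorbance[i - 1]) / (times[i] - times[i - 1])
        if slope >= min_slope:
            return t
    return None


times = [0, 1, 2, 3, 4, 5]                    # minutes (illustrative)
trace = [0.0, 0.5, 0.5, 0.6, 1.6, 2.0]        # absorbance units (illustrative)
print(collection_start(times, trace, window=(2, 5), min_slope=0.5))
```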
  • When the fraction collector is full of eluted oligonucleotides, they are transferred (e.g., by automated robotics or by hand) to a drying station. For example, in some embodiments, the samples are transferred to customized racks for a Genevac centrifugal evaporator to be dried down. In preferred embodiments, the Genevac evaporator is equipped with racks designed to be used in both the Genevac and the subsequent desalting step. The Genevac evaporator decreases drying time, relative to other commercially available evaporators, by 60%.
  • oligonucleotides are desalted.
  • oligonucleotides are not HPLC purified, but instead proceed directly from deprotection to desalting.
  • the desalting stations have TECAN robot systems for automated desalting.
  • the system employs a rack that has been designed to fit the TECAN robot and the Genevac centrifugal evaporator without transfer to a different rack or holder.
  • the racks are designed to hold the different sizes of desalting columns, such as the NAP-5 and NAP-10 columns.
  • the TECAN robot loads each oligonucleotide onto an individual NAP-5 or NAP-10 column, supplies the buffer, and collects the eluate. If desired, desalted oligonucleotides may be frozen or dried down at this point.
  • following desalting, INVADER and target oligonucleotides are analyzed by mass spectroscopy. For example, in some embodiments, a small sample from the desalted oligonucleotide sample is removed (e.g., by a TECAN robot) and spotted on an analysis plate, which is then placed into a mass spectrometer. The results are analyzed and processed by a software routine. Following the analysis, failed oligonucleotides are automatically reordered, while oligonucleotides that pass the analysis are transported to the next processing step. This preliminary quality control analysis removes failed oligonucleotides earlier in the processing, thus resulting in cost savings and improving cycle times.
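  • A software routine of the kind mentioned above might compare the observed mass with the mass expected from the sequence. The sketch below is an assumption-laden illustration: it uses approximate average residue masses for an unmodified DNA oligonucleotide with 5'-OH and 3'-OH ends, a hypothetical pass tolerance of 0.2%, and hypothetical function names. It does not describe the actual routine or account for labels and other modifications.

```python
# Approximate average masses (g/mol) of DNA nucleotide residues in an oligo.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
END_CORRECTION = -61.96  # adjusts for the unphosphorylated 5' end


def expected_mass(sequence):
    """Approximate average mass of an unmodified DNA oligonucleotide."""
    return sum(RESIDUE_MASS[base] for base in sequence.upper()) + END_CORRECTION


def passes_mass_qc(sequence, observed_mass, tolerance_fraction=0.002):
    """True if the observed mass is within the (assumed) tolerance."""
    expected = expected_mass(sequence)
    return abs(observed_mass - expected) <= tolerance_fraction * expected


print(round(expected_mass("ACGT"), 2))
print(passes_mass_qc("ACGT", 1174.5))   # close to the expected mass
print(passes_mass_qc("ACGT", 869.64))   # one residue short (failure sequence)
```

  • An oligonucleotide that fails such a check would be flagged for reordering, while one that passes would move on to the next processing step.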
  • the oligonucleotide production process further includes a dilute and fill module.
  • each module consists of three automated oligonucleotide dilution and normalization stations.
  • Each station consists of a network-linked computer and an automated robotic system (e.g., including but not limited to Biomek 2000).
  • the pipetting station is physically integrated with a spectrophotometer to allow machine handling of every step in the process. All manipulations are carried out in a HEPA-filtered environment. Dissolved oligonucleotides are loaded onto the Biomek 2000 deck and the sequence files are transferred into the Biomek 2000.
  • the Biomek 2000 automatically transfers a sample of each oligonucleotide to an optical plate, which the spectrophotometer reads to measure the A260 absorbance.
  • an Excel program integrated with the Biomek software uses absorbance and the sequence information to prepare a dilution table for each oligonucleotide.
  • the Biomek employs that dilution table to dilute each oligonucleotide appropriately.
  • the instrument then dispenses oligonucleotides into an appropriate vessel (e.g., 1.5 ml microtubes).
  • the automated dilution and fill system is able to dilute different components of a kit (e.g., INVADER and probe oligonucleotides) to different concentrations.
  • the automated dilution and fill module is able to dilute different components to different concentrations specified by the end user.
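The dilution table described above can be sketched as follows. This is a minimal illustration, not the Excel program the text refers to. It assumes that the extinction coefficient can be approximated by summing per-base values at 260 nm, that the A260 reading applies to the stock with a 1 cm path length, and that all function and field names are hypothetical.

```python
# Approximate per-base molar extinction coefficients at 260 nm (M^-1 cm^-1).
# Assumption: a simple base-composition sum; nearest-neighbor models are more accurate.
BASE_EXTINCTION = {"A": 15400, "C": 7400, "G": 11500, "T": 8700}


def extinction_coefficient(sequence):
    """Sum per-base coefficients for an oligonucleotide sequence."""
    return sum(BASE_EXTINCTION[base] for base in sequence.upper())


def concentration_um(a260, sequence, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in micromolar."""
    return a260 / (extinction_coefficient(sequence) * path_cm) * 1e6


def dilution_table(samples, target_um):
    """Build one row per oligonucleotide.

    samples: list of (name, sequence, a260, stock_volume_ul) tuples.
    target_um: dict mapping name -> desired final concentration in micromolar.
    """
    rows = []
    for name, sequence, a260, stock_ul in samples:
        stock_um = concentration_um(a260, sequence)
        target = target_um[name]
        if stock_um < target:
            # Too dilute to reach the target by adding diluent.
            rows.append({"name": name, "stock_um": stock_um, "add_ul": None, "ok": False})
            continue
        # C1 * V1 = C2 * V2  =>  diluent to add = V1 * (C1 / C2 - 1)
        add_ul = stock_ul * (stock_um / target - 1.0)
        rows.append({"name": name, "stock_um": stock_um, "add_ul": add_ul, "ok": True})
    return rows


# Hypothetical kit: the probe and INVADER oligonucleotide have different targets.
table = dilution_table(
    [("probe", "ACGT", 0.43, 100.0), ("invader", "ACGT", 0.43, 100.0)],
    {"probe": 5.0, "invader": 20.0},
)
```

Keeping the target concentration per component lets different parts of a kit be diluted to different levels, and a row with `ok` set to `False` marks an oligonucleotide that cannot reach its target by dilution alone.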
  • oligonucleotides undergo a quality control assay before distribution to the user.
  • the specific quality control assay chosen depends on the final use of the oligonucleotides. For example, if the oligonucleotides are to be used in an INVADER SNP detection assay, they are tested in the assay before distribution.
  • each SNP set is tested in a quality control assay utilizing the Beckman Coulter SAGIAN CORE System.
  • the results are read on a real-time instrument (e.g., an ABI 7700 fluorescence reader).
  • the QC assay uses two no target blanks as negative controls and five untyped genomic samples as targets. For consistency, every SNP set is tested with the same genomic samples.
  • the ADS system is responsible for tracking tubes through the QC module. Thus, in some embodiments, if a tube is missing, the ADS program discards, reorders, or searches for the missing tube.
  • the user chooses which QC method to run. The operator then chooses how many sets are needed. Then, in some embodiments, the application auto-selects the correct number of SNPs based on priority and prints output (picklist). If a picklist needs to be regenerated, the operator inputs which picklist they are replacing as well as which sets are not valid. The system auto-selects the valid SNPs plus replacement SNPs and prints output. Additionally, in some embodiments, picklists are manually generated by SNP number.
  • the auto-selected SNPs are then removed from being listed as available for auto-selection.
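The auto-selection and regeneration of picklists might look like the sketch below. The data layout and function names are hypothetical, and the sketch assumes that a lower priority number means a more urgent SNP set; the text does not specify how priority is encoded.

```python
def auto_select(available, count):
    """Pick `count` SNP sets from `available` in priority order.

    available: dict mapping SNP id -> priority (assumption: lower number = more urgent).
    Selected sets are removed from `available` so they cannot be picked twice.
    """
    chosen = sorted(available, key=lambda snp: (available[snp], snp))[:count]
    for snp in chosen:
        del available[snp]
    return chosen


def regenerate(old_picklist, invalid, available):
    """Keep the valid sets from an old picklist and top up with replacements."""
    valid = [snp for snp in old_picklist if snp not in invalid]
    replacements = auto_select(available, len(old_picklist) - len(valid))
    return valid + replacements


available = {"SNP-3": 2, "SNP-1": 1, "SNP-2": 1, "SNP-4": 3}
picklist = auto_select(available, 2)
# SNP-2 turns out to be unavailable, so the picklist is regenerated.
new_picklist = regenerate(picklist, {"SNP-2"}, available)
```

Removing selected sets from the pool at selection time mirrors the statement that auto-selected SNPs are no longer listed as available for auto-selection.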
  • the software prints the following items: SNP/Oligo list (picklist), SNP/Oligo layout (rack setup).
  • the operator then takes the picklist into inventory and removes the completed oligonucleotide sets. In some embodiments, a completed set is unavailable. In this case, the operator regenerates a picklist. Then, in preferred embodiments, the missing SNP set or tube is flagged in the system. Once a picklist is full, the oligonucleotides are moved to the next step.
  • the operator then takes the rack setup generated by the picklist and loads the rack.
  • a robotic handling system loads the rack.
  • tubes are scanned as they are placed onto the rack. The scan checks to make sure it is the correct tube and displays the location in the rack where the tube is to be placed.
  • Completed racks are then placed in a holding area to await the robot prep and robot run. Then, in some embodiments, the operator views what racks are in the queue and determines what genomics and reagent stock will be loaded onto the robot. The robot is then programmed to perform a specific method. Additionally, in some embodiments, the robot or operator records genomics and reagents lot numbers.
  • a carousel location map is printed that outlines where racks are to be placed.
  • the operator then loads the robot carousel according to the method layout.
  • the rack is scanned (e.g., by the operator or by the ADS program). If the rack is not valid for the current robot method, the operator will be informed.
  • the carousel location for the rack is then displayed.
  • the output plates are then scanned (e.g., by the operator or by the ADS program). If the plate is not valid for the current method the operator is informed.
  • the carousel location for the plate is then displayed.
  • the robot is run.
  • the robot places the plates onto heatblocks for a period of time specified in the method.
  • the robot then scans the plates on the Cytofluor. Output from the cytofluor is read into the database and attached to the output plate record.
  • the output is read on the ABI 7700 real time instrument.
  • the operator loads the plate on to the 7700.
  • the robot loads the plate onto the ABI 7700.
  • a scan is then started using the 7700 software.
  • the output file is saved onto a computer hard drive.
  • the operator then starts the application and scans in the plate bar code.
  • the software instructs the user to browse to the saved output file.
  • the software then reads the file into the database and deletes the file (or tells the operator to delete the file).
  • the plate reader results (e.g., from a Cytofluor or an ABI 7700) are then analyzed (e.g., by a software program or by the operator). Additionally, in some embodiments, the operator reviews the results of the software analysis of each SNP and takes one of several actions. In some embodiments, the operator approves all automated actions. In other embodiments, the operator reviews and approves individual actions. In some embodiments, the operator marks actions as needing additional review. Alternatively, in other embodiments, the operator passes on reviewing anything. Additionally, in some embodiments, the operator overrides all automated actions.
  • if an oligonucleotide set fails quality control, the data is interpreted to determine the cause of the failure. The course of action is determined by such data interpretation. If the software marks an oligonucleotide Reassess Failed Oligonucleotide, no action by the user is required; the reassessment is handled by automation. If the software marks an oligonucleotide Redilute Failed Oligonucleotide, the operator discards diluted tubes. No other action is required. If the software marks an oligonucleotide Order Target Oligonucleotide, no action by the user is required. In this case, a synthetic target oligonucleotide is ordered for further testing.
  • if the software marks an oligonucleotide Fail Oligo(s) Discard Oligo(s), the operator discards the diluted tubes and un-diluted tubes. No other action is required. If the software marks an oligonucleotide Fail SNP, the operator discards the diluted and un-diluted tubes. No other action is required. If the software marks an oligonucleotide Full SNP Redesign, the operator discards the diluted and un-diluted tubes. No other action is required. If the software marks an oligonucleotide Partial SNP Redesign, the operator discards diluted tubes and discards some un-diluted tubes. No other action is required.
  • the software marks an oligonucleotide Manual Intervention. This step occurs if the operator or software has determined the SNP requires manual attention. This step puts the SNP “on hold” in the tracking system while the operator investigates the source of the failure.
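The verdicts listed above amount to a lookup from the software's mark to the operator's task and any automated follow-up. A minimal sketch, with hypothetical names, is shown below. Routing an unrecognized verdict to Manual Intervention is an assumption of this sketch, not something the text states.

```python
# Verdict -> (operator action, automated follow-up), paraphrased from the text above.
QC_ACTIONS = {
    "Reassess Failed Oligonucleotide": ("none", "reassess handled by automation"),
    "Redilute Failed Oligonucleotide": ("discard diluted tubes", None),
    "Order Target Oligonucleotide": ("none", "order synthetic target oligonucleotide"),
    "Fail Oligo(s) Discard Oligo(s)": ("discard diluted and un-diluted tubes", None),
    "Fail SNP": ("discard diluted and un-diluted tubes", None),
    "Full SNP Redesign": ("discard diluted and un-diluted tubes", None),
    "Partial SNP Redesign": ("discard diluted tubes and some un-diluted tubes", None),
    "Manual Intervention": ("investigate source of failure", "place SNP on hold"),
}


def next_steps(verdict):
    """Return the operator action and automated follow-up for a QC verdict."""
    # Assumption: anything unrecognized is treated as needing manual attention.
    operator, automation = QC_ACTIONS.get(verdict, QC_ACTIONS["Manual Intervention"])
    return {"verdict": verdict, "operator": operator, "automation": automation}


steps = next_steps("Order Target Oligonucleotide")
fallback = next_steps("Unknown Result")
```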
  • a set of oligonucleotides e.g., a INVADER assay set
  • the set is transferred to the packaging station.
  • the produced detection assays are tested against a plurality of samples representing two or more different alleles (samples containing sequences from individuals with different ethnic backgrounds, disease states, etc.) to demonstrate the viability of the assay with different individuals.
  • the produced assays are tested against a sufficient number of alleles (e.g., 100 or more) to identify which members of the population can be tested by the assay and to identify the allele frequency in the population of the genotype for which the assay is designed.
  • the target sequence of the individuals is characterized to determine whether the intended SNP is not present and/or whether additional mutations are present that prevent the proper detection of the sample. Any such information may be collected and stored in databases.
  • target selection, in silico analysis, and oligonucleotide design are repeated to generate assays capable of detecting the corresponding sequence of these individuals, as desired.
  • allele frequency information is stored in a database and made available to users of the detection assays upon request (e.g., made available over a communication network).
  • one or more components generated using the system of the present invention are packaged using any suitable means.
  • the packaging system is automated.
  • the packaging component is controlled by the centralized control network of the present invention.
  • the automated DNA production process further comprises a centralized control system.
  • the centralized control system comprises a computer system.
  • the computer system comprises computer memory or a computer memory device and a computer processor.
  • the computer memory (or computer memory device) and computer processor are part of the same computer.
  • the computer memory device or computer memory are located on one computer and the computer processor is located on a different computer.
  • the computer memory is connected to the computer processor through the Internet or World Wide Web.
  • the computer memory is on a computer readable medium (e.g., floppy disk, hard disk, compact disk, DVD, etc).
  • the computer memory (or computer memory device) and computer processor are connected via a local network or intranet.
  • the computer system comprises a computer memory device, a computer processor, an interactive device (e.g., keyboard, mouse, voice recognition system), and a display system (e.g., monitor, speaker system, etc.).
  • the systems and methods of the present invention comprise a centralized control system, wherein the centralized control system comprises a computer tracking system.
  • the items to be manufactured e.g. oligonucleotide probes, targets, etc
  • a number of processing steps e.g. synthesis, purification, quality control, etc.
  • various components of a single order e.g. one type of SNP detection kit
  • the present invention provides systems and methods for tracking the location and status of the items to be manufactured such that multiple components of a single order can be separately manufactured and brought back together at the appropriate time.
  • the tracking system and methods of the present invention also allow for increased quality control and production efficiency.
  • the computer tracking system comprises a central processing unit (CPU) and a central database.
  • the central database is the central repository of information about manufacturing orders that are received (e.g. SNP sequence to be detected, final dilution requirements, etc), as well as manufacturing orders that have been processed (e.g. processed by software applications that determine optimal nucleic acid sequences, and applications that assign unique identifiers to orders).
  • Manufacturing orders that have been processed may generate, for example, the number and types of oligonucleotides that need to be manufactured (e.g. probe, INVADER oligonucleotide, synthetic target), and the unique identifier associated with the entire order as well as unique identifiers for each component of an order (e.g.
  • the components of an order proceed through the manufacturing process in containers that have been labeled with unique identifiers (e.g. bar coded test tubes, color coded test tubes, etc.).
  • the computer tracking system further comprises one or more scanning units capable of reading the unique identifier associated with each labeled container.
  • the scanning units are portable (e.g. hand held scanner employed by an operator to scan a labeled container).
  • the scanning units are stationary (e.g. built into each module).
  • at least one scanning unit is portable and at least one scanning unit is stationary (e.g. hand held human implemented device).
  • Stationary scanning units may, for example, collect information from the unique identifier on a labeled container (i.e. the labeled container is ‘read’) as it passes through part of one of the production modules.
  • a rack of 100 labeled containers may pass from the purification module to the dilute and fill module on a conveyor belt or other transport means, and the 100 labeled containers may be read by the stationary scanning unit.
  • a portable scanning unit may be employed to collect the information from the labeled containers as they pass from one production module to the next, or at different points within a production module.
  • the scanning units may also be employed, for example, to determine the identity of a labeled container that has been tested (e.g. concentration of sample inside container is tested and the identity of the container is determined).
  • the scanning units are capable of transmitting the information they collect from the labeled containers to a central database.
  • the scanning units may be linked to a central database via wires, or the information may be transmitted to the central database.
  • the central database collects and processes this information such that the location and status of individual orders and components of orders can be tracked (e.g. information about when the order is likely to complete the manufacturing process may be obtained from the system).
  • the central database also collects information from any type of sample analysis performed within each module (e.g. concentration measurements made during dilute and fill module). This sample analysis is correlated with the unique identifiers on each labeled container such that the status of each labeled container is determined. This allows labeled containers that are unsatisfactory to be removed from the production process (e.g. information from the central database is communicated to robotic or human container handlers to remove the unsatisfactory sample).
  • containers that are automatically removed from the production process as unsatisfactory may be identified, and this information communicated to a central database (e.g. to update the status of an order, allow a re-order to be generated, etc). Allowing unsatisfactory samples to be removed prevents unnecessary manufacturing steps, and allows the production of a replacement to begin as early as possible.
  • an order may be for the production of an INVADER detection kit.
  • An INVADER detection kit is composed of at least 2 components (the INVADER oligonucleotide, and the downstream probe), and generally includes a second downstream probe (e.g. for a different allele), and one or two synthetic targets so controls may be run (i.e. an INVADER kit may have 5 separate oligonucleotide sequences that need to be generated).
  • each container with a unique identifier corresponding to a single type of oligonucleotide (e.g. an INVADER oligonucleotide), and also corresponding to a single order (a SNP detection kit for diagnosing a certain SNP) allows separate, high through-put manufacture of the various components of a kit without confusion as to what components belong with each kit.
  • a kit e.g. a kit composed of 5 different oligonucleotides
  • near the end of the purification module HPLC is employed, and a simple sample analysis may be employed on each sample in each container to determine if a sample is collected in each tube. If no sample is collected after HPLC is performed, the unique identifier on the container, in connection with the central database, identifies the type of sample that should have been produced (e.g. INVADER oligonucleotide) and a re-order is generated. Identification of this particular oligonucleotide allows the manufacturing process for this oligonucleotide to start over from the beginning (e.g. this order gets priority status over other orders to begin the manufacturing process again).
  • the other components of the order may continue the manufacturing process without being discarded as part of a defective order (e.g. the manufacturing process may continue for these oligonucleotides up to the point where the defective oligonucleotide is required).
  • additional manufacturing resources are not wasted on the defective component (i.e. additional reagents and time are not spent on this portion of the order in further manufacturing steps).
  • the unique identifier on each of the containers allows the various components of a given order to be grouped together at a step when this is required (likewise, there is no need to group the components of an order in the manufacturing process until it is required). For example, prior to the dilute and fill module, the various components of a single order may be grouped together such that the contents of the proper containers are combined in the proper fashion in the dilute and fill module. This identification and grouping also allows re-orders to ‘find’ the other components of a particular order. This type of grouping, for example, allows the automated mixing, in the dilute and fill stage, of the first and second downstream probes with the INVADER oligonucleotide, all from the same order.
  • the ability to track the individual containers allows the components of a kit to be associated together by directing a robot or human operator what tubes belong together. Consequently, final kits are produced with the proper components. Therefore, the tracking systems and methods of the present invention allow high through-put production of kits with many components, while assuring quality production.
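The tracking behavior described above (unique identifiers per container, scan events reported to a central database, priority re-orders for failed components, and grouping of an order's components only when needed) can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the class names, status strings, and bar codes are hypothetical, and a real system would use a persistent database rather than a Python dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    barcode: str                 # unique identifier on the labeled container
    order_id: str                # order (kit) the container belongs to
    component: str               # e.g. "INVADER", "probe 1"
    module: str = "synthesis"    # production module where it was last scanned
    status: str = "in process"   # "in process", "ready", or "failed"


@dataclass
class TrackingDatabase:
    containers: dict = field(default_factory=dict)
    reorders: list = field(default_factory=list)

    def register(self, container):
        self.containers[container.barcode] = container

    def scan(self, barcode, module, passed=True):
        """Record a scan event; a failed analysis triggers a re-order."""
        container = self.containers[barcode]
        container.module = module
        if not passed:
            container.status = "failed"
            # Re-orders go to the front of the queue (priority status).
            self.reorders.insert(0, (container.order_id, container.component))
        return container

    def mark_ready(self, barcode):
        self.containers[barcode].status = "ready"

    def order_ready(self, order_id):
        """True when every component of the order has a container that is ready."""
        members = [c for c in self.containers.values() if c.order_id == order_id]
        needed = {c.component for c in members}
        ready = {c.component for c in members if c.status == "ready"}
        return bool(needed) and needed == ready


db = TrackingDatabase()
db.register(Container("BC001", "ORD-1", "INVADER"))
db.register(Container("BC002", "ORD-1", "probe 1"))
db.scan("BC001", "purification", passed=False)   # no sample collected after HPLC
db.mark_ready("BC002")                           # the probe continues on its own
before = db.order_ready("ORD-1")
db.register(Container("BC003", "ORD-1", "INVADER"))  # replacement from the re-order
db.mark_ready("BC003")
after = db.order_ready("ORD-1")
```

Because each container carries both its own identifier and its order identifier, the probe is not held back by the failed INVADER oligonucleotide, and the replacement can find the rest of its order when grouping is finally required.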
  • This Example describes the production of an INVADER assay kit for SNP detection using the automated DNA production system of the present invention.
  • the sequence of the SNP to be detected is first submitted through the automated web-based user interface or through e-mail.
  • the sequences are then transferred to the INVADER CREATOR software.
  • the software designs the upstream INVADER oligonucleotide and downstream probe oligonucleotide.
  • the sequences are returned to the user for inspection.
  • the sequences are assigned a bar code and entered into the automated tracking system.
  • the bar codes of the probe and INVADER oligonucleotide are linked so that their synthesis, analysis, and packaging can be coordinated.
  • the sequences are transferred to the synthesis component.
  • the bar codes are read and the sequences are logged into the synthesis module.
  • Each module consists of 14 MOSS EXPEDITE 16-channel DNA synthesizers (PE Biosystems, Foster City, Calif.), that prepare the primary probes, and two ABI 3948 48-Channel DNA synthesizers (PE Biosystems, Foster City, Calif.), that prepare the INVADER oligonucleotides. Synthesizing a set of two primary and INVADER probes takes 3-4 hours. The instruments run 24 h/day. Following synthesis, the automated tracking system reads the bar codes and logs the oligonucleotides as having completed the synthesis module.
  • the synthesis room is equipped with centralized reagent delivery. Acetonitrile is supplied to the synthesizers through stainless steel tubing. De-blocking solution (3% TCA in methylene chloride) is supplied through Teflon tubing. Tubing is designed to attach to the synthesizers without any modification of the synthesizers.
  • the synthesis room is also equipped with an automated waste removal system. Waste containers are equipped with ventilation and contain sensors that trigger removal of waste through centralized tubing when the cache pots are full. Waste is piped to a centralized storage facility equipped with a blow out wall. The pressure in the synthesis instruments is controlled with argon supplied through a centralized system.
  • the argon delivery system includes local tanks supplied from a centralized storage tank.
  • the oligonucleotides are transported to the cleavage and deprotection station. At this stage, completed oligonucleotides are subjected to a final deprotection step and are cleaved from the solid support used for synthesis. The cleavage and deprotection may be performed manually or through automated robotics. The oligonucleotides are cleaved from the solid support used for synthesis by incubation with concentrated NaOH and collected. The cleavage step takes 12 hours. Following cleavage, the bar code scanner scans the oligonucleotide tubes and logs them as having completed the cleavage and deprotection step.
  • probe oligonucleotides are further purified using HPLC.
  • INVADER oligonucleotides are not purified, but instead proceed directly to desalting (see below).
  • HPLC is performed on instruments integrated into banks (modules) of 8.
  • Each HPLC module consists of a Leap Technologies 8-port injector connected to 8 automated Beckman-Coulter HPLC instruments.
  • the automatic Leap injector can handle four 96-well plates of cleaved and deprotected primary probes at a time.
  • the Leap injector automatically loads a sample onto each of the 8 HPLCs.
  • Buffers for HPLC purification are produced by the automated buffer preparation system.
  • the buffer prep system is in a general access area. Prepared buffer is then piped through the wall into the clean room (HEPA environment).
  • the system includes large vat carboys that receive premeasured reagents and water for centralized buffer preparation.
  • the buffers are piped from central prep to HPLCs.
  • the conductivity of the solution in the circulation loop is monitored as a means of verifying both correct content and adequate mixing.
  • the circulation lines are fitted with venturis for static mixing of the solutions; additional mixing occurs as solutions are circulated through the piping loop.
  • the circulation lines are fitted with 0.05 μm filters for sterilization and removal of any residual particulates.
  • Each purified probe is collected into a 50-ml conical tube in a carrying case in the fraction collector. Collection is based on a set method, which is triggered by an absorbance rate change within a predetermined time window.
  • the HPLC is run at a flow rate of 5-7.5 ml/min (the maximum rate of the pumps is 10 ml/min.) and each column is automatically washed before the injector loads the next sample.
  • the gradient used is described in Tables 3 and 4 and takes 34 minutes to complete (including wash steps to prepare the column for the next sample).
  • the tubes are transferred manually to customized racks for concentration in a Genevac centrifugal evaporator.
  • the Genevac racks, containing dry oligonucleotide are then transferred to the TECAN Nap10 column handler for desalting.
  • oligonucleotides move to the desalting station.
  • the dried oligonucleotides are resuspended in a small volume of water.
  • Desalting steps are performed by a TECAN robot system.
  • the racks used in Genevac centrifugation are also used in the desalting step, eliminating the need for transfer of tubes at this step.
  • the racks are also designed to hold the different sizes of desalting columns, such as the NAP-5 and NAP-10 columns.
  • the TECAN robot loads each oligonucleotide onto an individual NAP-5 or NAP-10 column, supplies the buffer, and collects the eluate.
  • the oligonucleotides are transferred to the dilute and fill module for concentration normalization and dispensing.
  • Each module consists of three automated probe dilution and normalization stations.
  • Each station consists of a network-linked computer and a Biomek 2000 interfaced with a SPECTRAMAX spectrophotometer Model 190 or PLUS 384 (Molecular Devices Corp., Sunnyvale Calif.) in a HEPA-filtered environment.
  • the probe and INVADER oligonucleotides are transferred onto the Biomek 2000 deck and the sequence files are downloaded into the Biomek 2000.
  • the Biomek 2000 automatically transfers a sample of each oligonucleotide to an optical plate, which the spectrophotometer reads to measure the A260 absorbance.
  • an Excel program integrated with the Biomek software uses the measured absorbance and the sequence information to calculate the concentration of each oligonucleotide.
  • the software then prepares a dilution table for each oligonucleotide.
  • the probe and INVADER oligonucleotide are each diluted by the Biomek to a concentration appropriate for their intended use.
  • the instrument then combines and dispenses the probe and INVADER oligonucleotides into 1.5 ml microtubes for each SNP set.
  • the completed set of oligonucleotides contains enough material for 5,000 SNP assays.
  • if an oligonucleotide fails the dilution step, it is first re-diluted. If it again fails dilution, the oligonucleotide is re-purified or returned for re-synthesis. The progress of the oligonucleotide through the dilution module is tracked by the bar coding system. Oligonucleotides that pass the dilution module are scanned as having completed dilution and are moved to the next module.
  • before shipping, the SNP set is subjected to a quality control assay in a SAGIAN CORE System (Beckman Coulter), which is read on an ABI 7700 real time fluorescence reader (PE Biosystems).
  • the QC assay uses two no target blanks as negative controls and five untyped genomic samples as targets.
  • the quality control assay is performed in segments. In each segment, the operator or automated system performs the following steps: log on; select location; step specific activity; and log off.
  • the ADS system is responsible for tracking tubes. If a tube is missing, existing ADS program routines will be used to discard/reorder/search for the tube.
  • a picklist is generated.
  • the list includes the identity of the SNPs that are being tested and the QC method chosen.
  • the tubes containing the oligonucleotide are selected by the automated software and a copy of the picklist is printed.
  • the tubes are removed from inventory by the operator and scanned with the bar code reader as being removed from inventory.
  • the operator or the automated system then takes the rack setup generated by the picklist and loads the rack. Tubes are scanned as they are placed onto the rack. The scan checks to make sure it is the correct tube and displays the location in the rack where the tube is to be placed. Completed racks are placed in a holding area to await the robot prep and robot run.
  • the operator or the automated system then chooses the genomics and reagent stock to be loaded onto the robot.
  • the robot is programmed with the specific method for the SNP set generated. Lot numbers of the genomics and reagents are recorded. Racks are placed in the proper carousel location. After all the carousel locations have been loaded the robot is run.
  • Plates are then incubated on the robot.
  • the plates are placed onto heatblocks for a period of time specified in the method.
  • the operator then takes the plate and loads it into the ABI 7700.
  • a scan is started using the 7700 software. When the scan is completed the operator transfers the output file onto a Macintosh computer hard drive.
  • The operator then starts the analysis application and scans in the plate bar code.
  • the software instructs the operator to browse to the saved output file.
  • the software reads the file into the database and deletes the file.
  • results of the QC assay are then analyzed.
  • the operator scans the plate in at a workstation PC and reviews the automated analysis.
  • the automated actions are performed using a spreadsheet system.
  • the automated spreadsheet program returns one of the following results:
  • Fail SNP (Operator discards diluted tubes. Operator discards un-diluted tubes. Requires no other action).
  • the oligonucleotides are transferred to the packaging station.
  • the produced detection assay is screened against a plurality of known sequences designed to represent one or more population groups, e.g., to determine the ability of the detection assay to detect the intended target among the diverse alleles found in the general population.
  • the frequency of occurrence of the SNP allele in each of the one or more population groups is determined using the produced detection assay. Data collected may be used to satisfy regulatory requirements, if the detection assay is to be used as a clinical product.
  • Sequences may be input for analysis from any number of sources.
  • sequence information is entered into a computer.
  • the computer need not be the same computer system that carries out in silico analysis.
  • candidate target sequences may be entered into a computer linked to a communication network (e.g., a local area network, Internet or Intranet).
  • users anywhere in the world with access to a communication network may enter candidate sequences at their own locale.
  • a user interface is provided to the user over a communication network (e.g., a World Wide Web-based user interface), containing entry fields for the information required by the in silico analysis (e.g., the sequence of the candidate target sequence).
  • the use of a Web based user interface has several advantages. For example, by providing an entry wizard, the user interface can ensure that the user inputs the requisite amount of information in the correct format.
  • the user interface requires that the sequence information for a target sequence be of a minimum length (e.g., 20 or more, 50 or more, 100 or more nucleotides) and be in a single format (e.g., FASTA).
  • the information can be input in any format and the systems and methods of the present invention edit or alter the input information into a suitable form for analysis.
  • the systems and methods of the present invention search public databases for the short sequence, and if a unique sequence is identified, convert the short sequence into a suitably long sequence by adding nucleotides on one or both of the ends of the input target sequence.
  • if sequence information is entered in an undesirable format or contains extraneous, non-sequence characters, the sequence can be modified to a standard format (e.g., FASTA) prior to further in silico analysis.
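One way to bring loosely formatted input into FASTA form and enforce a minimum length is sketched below. The function name, the accepted alphabet (A, C, G, T, N), the default minimum of 20 nucleotides, and the 60-character line width are assumptions chosen for illustration.

```python
import re

VALID_BASES = set("ACGTN")  # assumption: plain DNA alphabet plus N


def normalize_to_fasta(raw, name, min_length=20, width=60):
    """Return a FASTA record, or raise ValueError if the input cannot be used."""
    # Drop any existing FASTA header lines.
    body = "".join(
        line for line in raw.splitlines() if not line.lstrip().startswith(">")
    )
    # Remove whitespace, digits and common punctuation (e.g. position numbers).
    cleaned = re.sub(r"[\s\d.,;:\-]", "", body).upper()
    unexpected = set(cleaned) - VALID_BASES
    if unexpected:
        # Refuse rather than silently dropping characters that might be bases.
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    if len(cleaned) < min_length:
        raise ValueError(
            f"sequence has {len(cleaned)} bases; need at least {min_length}"
        )
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return f">{name}\n" + "\n".join(lines)


# Input copied from a numbered listing, with position numbers and spacing.
record = normalize_to_fasta("  1 acgtacgtac gtacgtacgt\n 21 acgt\n", "target1")
```

Rejecting unrecognized letters instead of stripping them is a deliberate choice in this sketch: deleting them silently could change the target sequence without the user noticing.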
  • the user interface may also collect information about the user, including, but not limited to, the name and address of the user.
  • target sequence entries are associated with a user identification code.
  • sequences are input directly from assay design software (e.g., the INVADERCREATOR software).
  • each sequence is given an ID number.
  • the ID number is linked to the target sequence being analyzed to avoid duplicate analyses. For example, if the in silico analysis determines that a target sequence corresponding to the input sequence has already been analyzed, the user is informed and given the option of by-passing in silico analysis and simply receiving previously obtained results.
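The duplicate check can be pictured as a registry keyed by an identifier derived from the sequence. In the sketch below the identifier is a truncated SHA-256 digest of the normalized sequence; that scheme, the class name, and the stored result format are assumptions, since the text only says that each sequence receives an ID number linked to the target sequence.

```python
import hashlib


class AnalysisRegistry:
    """Remembers in silico results so that repeated sequences can skip re-analysis."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def sequence_id(sequence):
        # Normalize case and whitespace so trivially different inputs share an ID.
        normalized = "".join(sequence.split()).upper()
        return hashlib.sha256(normalized.encode("ascii")).hexdigest()[:12]

    def lookup(self, sequence):
        """Return the stored result for this sequence, or None if it is new."""
        return self._results.get(self.sequence_id(sequence))

    def store(self, sequence, result):
        seq_id = self.sequence_id(sequence)
        self._results[seq_id] = result
        return seq_id


registry = AnalysisRegistry()
first = registry.lookup("ACGT ACGT")                 # not analyzed yet
seq_id = registry.store("ACGTACGT", {"status": "passed"})
second = registry.lookup("acgt\nacgt")               # same sequence, different layout
```

When `lookup` returns a stored result, the user can be offered the previously obtained results instead of a new analysis.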
  • Users who wish to order detection assays, have detection assays designed, or gain access to databases or other information of the present invention may employ an electronic communication system (e.g., the Internet).
  • an ordering and information system of the present invention is connected to a public network to allow any user access to the information.
  • private electronic communication networks are provided.
  • where a customer or user is a repeat customer (e.g., a distributor or large diagnostic laboratory), a full-time dedicated private connection may be provided between a computer system of the customer and a computer system of the systems of the present invention.
  • the system may be arranged to minimize human interaction.
  • inventory control software is used to monitor the number and type of detection assays in possession of the customer.
  • a query is sent at defined intervals to determine if the customer has the appropriate number and type of detection assay, and if shortages are detected, instructions are sent to design, produce, and/or deliver additional assays to the customer.
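The periodic inventory query could be reduced to a comparison of the customer's stock against required levels, followed by instructions for each shortage. The function names, data shapes, and the rule that assays missing from the catalog need design first are assumptions of this sketch.

```python
def shortages(customer_stock, required_stock):
    """Return {assay: quantity to supply} for every assay below its required level."""
    return {
        assay: required - customer_stock.get(assay, 0)
        for assay, required in required_stock.items()
        if customer_stock.get(assay, 0) < required
    }


def replenishment_instructions(customer_stock, required_stock, catalog):
    """Turn shortages into instructions; assays not yet in the catalog need design."""
    instructions = []
    for assay, quantity in sorted(shortages(customer_stock, required_stock).items()):
        if assay in catalog:
            action = "produce and deliver"
        else:
            action = "design, produce and deliver"
        instructions.append((action, assay, quantity))
    return instructions


# Hypothetical snapshot returned by one scheduled query.
instructions = replenishment_instructions(
    customer_stock={"SNP-A": 2, "SNP-B": 10},
    required_stock={"SNP-A": 5, "SNP-B": 8, "SNP-C": 1},
    catalog={"SNP-A", "SNP-B"},
)
```

Running this at defined intervals, for example from a scheduler, would keep human interaction to a minimum, as the text describes.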
  • the system also monitors inventory levels of the seller and in preferred embodiments, is integrated with production systems to manage production capacity and timing.
  • a user-friendly interface is provided to facilitate selection and ordering of detection assays. Because of the hundreds of thousands of detection assays available and/or polymorphisms that the user may wish to interrogate, the user-friendly interface allows navigation through the complex set of options. For example, in some embodiments, a series of stacked databases are used to guide users to the desired products.
  • the first layer provides a display of all of the chromosomes of an organism. The user selects the chromosome or chromosomes of interest. Selection of the chromosome provides a more detailed map of the chromosome, indicating banding regions on the chromosome. Selection of the desired band leads to a map showing gene locations.
  • One or more additional layers of detail provide base positions of polymorphisms, gene names, genome database identification tags, annotations, regions of the chromosome with pre-existing developed detection assays that are available for purchase, regions where no pre-existing developed assays exist but that are available for design and production, etc. Selecting a region, polymorphism, or detection assay takes the user to an ordering interface, where information is collected to initiate detection assay design and/or ordering.
  • a search engine is provided, where a gene name, sequence range, polymorphism or other query is entered to more immediately direct the user to the appropriate layer of information.
  • the ordering, design, and production systems are integrated with a finance system, where the pricing of the detection assay is determined by one or more factors: whether or not design is required, cost of goods based on the components in the detection assay, special discounts for certain customers, discounts for bulk orders, discounts for re-orders, price increases where the product is covered by intellectual property or contractual payment obligations to third parties, and price selection based on usage.
  • pricing is increased.
  • the pricing increase for clinical products occurs automatically.
  • the systems of the present invention are linked to FDA, public publication, or other databases to determine if a product has been certified for clinical diagnostic or ASR use.
  • N normal
  • M molar
  • mM millimolar
  • μM micromolar
  • mol molecular weight
  • mmol millimoles
  • μmol micromoles
  • nmol nanomoles
  • pmol picomoles
  • g (grams); mg (milligrams); μg (micrograms); ng (nanograms); l or L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); DS (dextran sulfate); C (degrees Centigrade); and Sigma (Sigma Chemical Co., St. Louis, Mo.).
  • Target sequences were selected from a set of pre-validated SNP-containing sequences, available in a TWT in-house oligonucleotide order entry database (see FIG. 5). Each target contains a single nucleotide polymorphism (SNP) to which an INVADER assay had been previously designed.
  • the INVADER assay oligonucleotides were designed by the INVADER CREATOR software (Third Wave Technologies, Inc.).
  • the footprint region in this example is defined as the INVADER “footprint”, or the bases covered by the INVADER and the probe oligonucleotides, optimally positioned for the detection of the base of interest, in this case, a single nucleotide polymorphism (See FIG. 5).
  • About 200 nucleotides of each of the 10 target sequences were analyzed for the amplification primer design analysis, with the SNP base residing about in the center of the sequence. The sequences are shown in FIG. 5.
  • Tm the melting temperature of the oligonucleotide is calculated using the nearest-neighbor model and published parameters for DNA duplex formation (Allawi and SantaLucia, Biochemistry, 36:10581 [1997], herein incorporated by reference).
  • salt correction refers to a variation made in the value provided for a salt concentration for the purpose of reflecting the effect on a Tm calculation for a nucleic acid duplex of a non-salt parameter or condition affecting said duplex.
  • N[2]-N[1] of a given oligonucleotide primer should not be complementary to N[2]-N[1] of any other oligonucleotide
  • N[3]-N[2]-N[1] should not be complementary to N[3]-N[2]-N[1] of any other oligonucleotide. If these criteria were not met at a given N[1], the next base in the 5′ direction for the forward primer or the next base in the 3′ direction for the reverse primer will be evaluated as an N[1] site.
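  • The 3′-end criteria above can be sketched as a pairwise check. This example assumes that "complementary" means the terminal bases of two primers could pair in antiparallel orientation, i.e., the last n bases of one primer equal the reverse complement of the last n bases of the other. The function names and the example primers are hypothetical, and self-pairs are not checked here.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ends_complementary(primer_a: str, primer_b: str, n: int) -> bool:
    """True if the last n bases of the two primers can pair antiparallel."""
    return primer_a.upper()[-n:] == reverse_complement(primer_b[-n:])


def conflicting_pairs(primers: dict, n: int = 2) -> list:
    """Return name pairs whose 3' terminal n-mers are complementary to each other."""
    return [
        (a, b)
        for a, b in combinations(sorted(primers), 2)
        if ends_complementary(primers[a], primers[b], n)
    ]


if __name__ == "__main__":
    example = {"f1": "GGATCCTA", "r1": "CCGGTTTA", "f2": "AAGGCCAC"}
    print(conflicting_pairs(example, n=2))  # dinucleotide check, N[2]-N[1]
    print(conflicting_pairs(example, n=3))  # trinucleotide check, N[3]-N[2]-N[1]
```

  • A primer that appears in a conflicting pair would be re-evaluated at the next candidate N[1] site, as described above.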
  • A/C rich regions were targeted in order to minimize the complementarity of 3′ ends.
  • an INVADER assay was performed following the multiplex amplification reaction. Therefore, a section of the secondary INVADER reaction oligonucleotide (the FRET oligonucleotide sequence, see FIG. 2) was also incorporated as criteria for primer design; the amplification primer sequence should be less than 80% homologous to the specified region of the FRET oligonucleotide.
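  • The text does not state how homology to the FRET oligonucleotide region is scored. As one possible reading, the sketch below takes the highest fraction of primer bases that match in any ungapped alignment against the region and requires it to be below 80%. The scoring method, the function names, and the sequences used in the tests are assumptions for illustration only.

```python
def max_identity(primer: str, region: str) -> float:
    """Highest fraction of primer bases matching in any ungapped alignment to region."""
    primer, region = primer.upper(), region.upper()
    best = 0.0
    # Slide the primer across the region, allowing partial overlap at both ends.
    for offset in range(-(len(primer) - 1), len(region)):
        matches = sum(
            1
            for i, base in enumerate(primer)
            if 0 <= offset + i < len(region) and region[offset + i] == base
        )
        best = max(best, matches / len(primer))
    return best


def passes_fret_filter(primer: str, fret_region: str, limit: float = 0.80) -> bool:
    """True if the primer is less than `limit` identical to the FRET region."""
    return max_identity(primer, fret_region) < limit
```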
  • the output primers for the 10-plex multiplex design are shown in FIG. 5. All primers were synthesized according to standard oligonucleotide chemistry, desalted (by standard methods), quantified by absorbance at 260 nm (A260), and diluted to 50 μM concentrated stocks. Multiplex PCR was then carried out as a 10-plex reaction using equimolar amounts of primer (0.01 uM/primer) under the following conditions: 100 mM KCl, 3 mM MgCl2, 10 mM Tris pH 8.0, 200 uM dNTPs, 2.5 U Taq DNA polymerase, and 10 ng of human genomic DNA (hgDNA) template in a 50 ul reaction.
  • the reaction was incubated at (94° C./30 sec, 50° C./44 sec) for 30 cycles. After incubation, the multiplex PCR reaction was diluted 1:10 with water and subjected to INVADER analysis using INVADER Assay FRET Detection Plates (96 well genomic biplex) and 100 ng CLEAVASE VIII enzyme. INVADER assays were assembled as 15 ul reactions as follows: 1 ul of the 1:10 dilution of the PCR reaction, 3 ul of PPI mix, 5 ul of 22.5 mM MgCl2, and 6 ul of dH2O, covered with 15 ul of CHILLOUT liquid wax. Samples were denatured in the INVADER biplex by incubation at 95° C. for 5 min, followed by incubation at 63° C., and fluorescence was measured on a Cytofluor 4000 at various timepoints.
  • FOZ values of the INVADER assay can be used to estimate amplicon abundance.
  • FOZm represents the sum of RED_FOZ and FAM_FOZ of an unknown concentration of target incubated in an INVADER assay for a given amount of time (m).
  • FOZ240 represents an empirically determined value of RED_FOZ (using INVADER assay 41646) for a known number of copies of target (e.g., 100 ng of hgDNA ≈ 30,000 copies) at 240 minutes.
  • equation 1a is used to determine the linear relationship between primer concentration and amplification factor F
  • equation 1a′ is used in the calculation of the amplification factor F for the 10-plex PCR (both with equimolar amounts of primer and optimized concentrations of primer), with the value of D representing the dilution factor of the PCR reaction.
  • D the dilution factor of the PCR reaction.
  • although equations 1a and 1a′ will be used in the description of the 10-plex multiplex PCR, a more correct adaptation of this equation was used in the optimization of primer concentrations in the 107-plex PCR.
  • FOZ values should be within the dynamic range of the instrument on which the readings are taken. In the case of the Cytofluor 4000 used in this study, the dynamic range was between about 1.5 and about 12 FOZ.
  • primer concentration and amplification factor F
  • four distinct uniplex PCR reactions were run using primers 1117-70-17 and 1117-70-18 at concentrations of 0.01 uM, 0.012 uM, 0.014 uM, and 0.020 uM, respectively.
  • the four independent PCR reactions were carried out under the following conditions: 100 mM KCl, 3 mM MgCl2, 10 mM Tris pH 8.0, 200 uM dNTPs using 10 ng of hgDNA as template. Incubation was carried out at (94° C./30 sec, 50° C./20 sec) for 30 cycles.
  • amplification bias observed under conditions of equimolar primer concentrations in multiplex PCR could be measured as the “apparent” primer concentration (X) based on the amplification factor F.
  • values of “apparent” primer concentration among different amplicons can be used to estimate the amount of primer of each amplicon required to equalize amplification of different loci:
  • Section 4 Calculation of Apparent Primer Concentrations from a Balanced Multiplex Mix.
  • primer concentration can directly influence the amplification factor of a given amplicon.
  • FOZm readings can be used to calculate the “apparent” primer concentration of each amplicon using equation 2.
  • Replacing Y in equation 2 with log(F) of a given amplification factor and solving for X gives an “apparent” primer concentration based on the relative abundance of a given amplicon in a multiplex reaction.
  • using equation 2 to calculate the “apparent” primer concentration of all primers (provided in equimolar concentration) in a multiplex reaction (FIG. 3A) provides a means of normalizing primer sets against each other.
  • each of the “apparent” primer concentrations should be divided into the maximum apparent primer concentration (Xmax), giving a ratio R[n] = Xmax/X[n], such that the strongest amplicon is set to a value of 1 and the remaining amplicons to values equal to or greater than 1
  • the values of R[n] are multiplied by a constant primer concentration to provide working concentrations for each primer in a given multiplex reaction.
  • the amplicon corresponding to SNP assay 41646 has an R[n] value equal to 1. All of the R[n] values were multiplied by 0.01 uM (the original starting primer concentration in the equimolar multiplex PCR reaction), such that the lowest primer concentration is that of 41646, whose R[n] is set to 1, giving 0.01 uM. The remaining primer sets were also proportionally increased as shown in FIG. 8. The results of multiplex PCR with the “optimized” primer mix are described below.
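  • The normalization steps above can be sketched as follows. The sketch assumes that equation 2 is a straight line, Y = slope·X + intercept, with Y = log10(F) and X the primer concentration; the slope and intercept values in the usage example, the amplicon name "other", and the function names are hypothetical, since the fitted parameters are not reproduced in this section.

```python
import math


def apparent_concentration(f: float, slope: float, intercept: float) -> float:
    """Solve Y = slope * X + intercept for X, with Y = log10(F)."""
    return (math.log10(f) - intercept) / slope


def working_concentrations(amplification_factors: dict, slope: float,
                           intercept: float, base_conc_um: float = 0.01) -> dict:
    """Scale each primer set by R[n] = Xmax / X[n] times a constant concentration."""
    apparent = {
        name: apparent_concentration(f, slope, intercept)
        for name, f in amplification_factors.items()
    }
    if min(apparent.values()) <= 0:
        raise ValueError("apparent concentrations must be positive for this scaling")
    x_max = max(apparent.values())
    # The strongest amplicon gets R = 1; weaker amplicons get proportionally more primer.
    return {name: (x_max / x) * base_conc_um for name, x in apparent.items()}


if __name__ == "__main__":
    # Hypothetical amplification factors and line parameters, for illustration only.
    factors = {"41646": 1e4, "other": 1e3}
    print(working_concentrations(factors, slope=100.0, intercept=2.0))
```

  • With these made-up inputs the strongest amplicon stays at the base concentration and the weaker one is doubled, which mirrors the proportional increase described for FIG. 8.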
  • each sheet shows footprint region in upper case letters and SNP in brackets
  • 128 primer sets (256 primers; see FIG. 12)
  • four of which were thrown out due to excessively long primer sequences (SNP #47854, 47889, 54874, 67396)
  • the remaining primers were synthesized using standard procedures at the 200 nmol scale and purified by desalting.
  • 107 primer sets were available for assembly of an equimolar 107-plex primer mix (214 primers, See FIG. 12). Of the 107 primer sets available for amplification, only 101 were present on the INVADER MAP plate to evaluate amplification factor.
  • Multiplex PCR was carried out as a 101-plex PCR using equimolar amounts of primer (0.025 uM/primer) under the following conditions: 100 mM KCl, 3 mM MgCl2, 10 mM Tris pH 8.0, 200 uM dNTPs, and 10 ng of human genomic DNA (hgDNA) template in a 50 ul reaction. After denaturation at 95° C. for 10 min, 2.5 units of Taq was added and the reaction was incubated at (94° C./30 sec, 50° C./44 sec) for 50 cycles. After incubation, the multiplex PCR reaction was diluted 1:24 with water and subjected to INVADER assay analysis using the INVADER MAP detection platform.
  • primer 0.025 uM/primer
  • Multiplex PCR was carried out under the following conditions: 100 mM KCl, 3 mM MgCl2, 10 mM Tris pH 8.0, 200 uM dNTPs, and 10 ng of human genomic DNA (hgDNA) template in a 50 ul reaction. After denaturation at 95° C. for 10 min, 2.5 units of Taq was added and the reaction was incubated at (94° C./30 sec, 50° C./44 sec) for 50 cycles. After incubation, the multiplex PCR reaction was diluted 1:24 with water and subjected to INVADER analysis using the INVADER MAP detection platform.
  • the INVADER assay can be used to monitor the progress of amplification during PCR reactions, i.e., to determine the amplification factor F that reflects efficiency of amplification of a particular amplicon in a reaction.
  • the INVADER assay can be used to determine the number of molecules present at any point of a PCR reaction by reference to a standard curve generated from quantified reference DNA molecules.
  • the amplification factor F is measured as a ratio of PCR product concentration after amplification to initial target concentration. This example demonstrates the effect of varying primer concentration on the measured amplification factor.
  • PCR reactions were conducted for variable numbers of cycles in increments of 5, i.e., 5, 10, 15, 20, 25, 30, so that the progress of the reaction could be assessed using the INVADER assay to measure accumulated product.
  • the reactions were diluted serially to assure that the target amounts did not saturate the INVADER assay, i.e., so that the measurements could be made in the linear range of the assay.
  • INVADER assay standard curves were generated using a dilution series containing known amounts of the amplicon. This standard curve was used to extrapolate the number of amplified DNA fragments in PCR reactions after the indicated number of cycles. The ratio of the number of molecules after a given number of PCR cycles to the number present prior to amplification is used to derive the amplification factor, F, of each PCR reaction.
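  • The standard-curve calculation described above can be sketched as below. The sketch assumes piecewise-linear interpolation between standard-curve points and a signal that rises with amount; the curve points, function names, and numeric values are hypothetical, and readings outside the curve are rejected rather than extrapolated.

```python
def interpolate_amount(signal: float, curve: list) -> float:
    """Interpolate an amount from a standard curve of (amount, signal) points."""
    pts = sorted(curve, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(pts, pts[1:]):
        if s0 <= signal <= s1:
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal outside the standard curve; adjust the dilution")


def amplification_factor(signal: float, curve: list,
                         dilution: float, initial_amount: float) -> float:
    """F = (amount after PCR, corrected for dilution) / (amount before PCR)."""
    return interpolate_amount(signal, curve) * dilution / initial_amount


if __name__ == "__main__":
    # Hypothetical standard curve: (known amount, fluorescence signal).
    curve = [(1.0, 10.0), (10.0, 100.0), (100.0, 1000.0)]
    print(amplification_factor(55.0, curve, dilution=1000.0, initial_amount=5.5))
```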
  • PCR reactions were set up using equimolar amounts of primers (e.g., 0.02 μM or 0.1 μM primers, final concentration). Reactions at each primer concentration were set up in triplicate for each level of amplification tested, i.e., 5, 10, 15, 20, 25, and 30 PCR cycles.
  • One master mix was prepared, sufficient for 6 standard PCR reactions (each in triplicate × 2 primer concentrations) plus 2 controls × 6 tests (5, 10, 15, 20, 25, or 30 cycles of PCR) plus enough for extra reactions to allow for overage.
  • FIG. 21 presents the results of the triplicate INVADER assays in a plot of log10 of amplification factor (y-axis) as a function of cycle number (x-axis).
  • the PCR product concentration was estimated from the INVADER assays by extrapolation to the standard curve.
  • the data from the replicate assays were not averaged but instead were presented as multiple, overlapping points in the figure.
  • This example demonstrates the correlation between amplification factor, F, and primer concentration, c.
  • F was measured by generating a standard curve for each locus using a dilution series of purified, quantified reference amplicon preparations.
  • 12 different reference amplicons were generated: one for each allele of the SNPs contained in the 6 genomic regions amplified by the primer pairs.
  • Each reference amplicon concentration was tested in an INVADER assay, and a standard curve of fluorescence counts versus amplicon concentration was created.
  • PCR reactions were also run on genomic DNA samples, the products diluted, and then tested in an INVADER assay to determine the extent of amplification, in terms of number of molecules, by comparison to the standard curve.
  • a total of 8 genomic DNA samples isolated from whole blood were screened in standard biplex INVADER assays to determine their genotypes at 24 SNPs in order to identify samples homozygous for the wild-type or variant allele at a total of 6 different loci.
  • Suitable genomic DNA preparations were then amplified in standard individual, monoplex PCR reactions to generate amplified fragments for use as PCR reference standards as described in Example 3.
  • amplified DNA was gel isolated using standard methods and previously quantified using the PICOGREEN assay. Serial dilutions of these concentration standards were created as follows:
  • Each purified amplicon was diluted to create a working stock at a concentration of 200 pM. These stocks were then serially diluted as follows. A working stock solution of each amplicon was prepared with a concentration of 1.25 pM in dH2O containing tRNA at 30 ng/μl. The working stock was diluted in 96-well microtiter plates and then serially diluted to yield the following final concentrations in the INVADER assay: 1, 2.5, 6.25, 15.6, 39, 100, and 250 fM.
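  • The listed final concentrations are consistent with an approximately 2.5-fold series starting at 1 fM. The fold factor is inferred from the listed values rather than stated in the text, and the two highest values appear to be rounded. A minimal sketch under that assumption, with a hypothetical function name:

```python
def dilution_series(start: float, fold: float, steps: int) -> list:
    """Return `steps` concentrations, each `fold` times the previous one."""
    return [start * fold ** i for i in range(steps)]


if __name__ == "__main__":
    # Assumed 2.5-fold series in fM; compare with the values listed in the text.
    for conc in dilution_series(1.0, 2.5, 7):
        print(f"{conc:.2f} fM")
```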
  • One plate was prepared for the amplicons to be detected in the INVADER assay using probe oligonucleotides reporting to FAM dye and one plate for those to be tested with probe oligonucleotides reporting to RED dye. All amplicon dilutions were analyzed in duplicate.
  • PCR reactions were set up for individual amplification of the 6 genomic regions described in the previous example on each of 2 alleles at 4 different primer concentrations, for a total of 48 PCR reactions. All PCRs were run for 20 cycles. The following primer concentrations were tested: 0.01 μM, 0.025 μM, 0.05 μM, and 0.1 μM. A master mix for all 48 reactions was prepared according to standard procedures, with the exception of the modified primer concentrations, plus overage for an additional 23 reactions (16 reactions were prepared but not used, and overage of 7 additional reactions was prepared).
  • INVADER analysis was carried out on all dilutions of the products of each PCR reaction as well as the indicated dilutions of each quantified reference amplicon (to generate a standard curve for each amplicon) in standard biplex INVADER assays.
  • This example describes the use of the INVADER assay to detect the products of a highly multiplexed PCR reaction designed to amplify 192 distinct loci in the human genome.
  • Genomic DNA was isolated from 5 ml of whole blood and purified using the Autopure, manufactured by Gentra Systems, Inc. (Minneapolis, Minn.). The purified DNA was in 500 μl of dH2O.
  • Forward and reverse primer sets for the 192 loci were designed using PrimerDesigner, version 1.3.4 (See Primer Design section above, including FIG. 4A).
  • Target sequences used for INVADER designs were converted into a comma-delimited text file for use as an input file for PrimerDesigner.
  • PrimerDesigner was run using default parameters, with the exception of oligo Tm, which was set at 60° C.
  • Oligonucleotide primers were synthesized using standard procedures in a Polyplex (GeneMachines, San Carlos, Calif.). The scale was 0.2 μmole, desalted only (not purified) on NAP-10 and not dried down.
  • Master mix 1 contained primers to amplify loci 1-96; master mix 2, 97-192.
  • the mixes were made according to standard procedures and contained standard components. All primers were present at a final concentration of 0.025 μM, with KCl at 100 mM, and MgCl2 at 3 mM.
  • PCR cycling conditions were as follows in a MJ PTC-100 thermocycler (MJ Research, Waltham, Mass.): 95° C. for 15 min; then 94° C. for 30 sec and 55° C. for 44 sec × 50 cycles.
  • INVADER assays were set up using the CYBI-well 2000. Aliquots of 3 μl of the genomic DNA target were added to the appropriate wells. No-target controls comprised 3 μl of Te (10 mM Tris, pH 8.0, 0.1 mM EDTA). The reagents for use in the INVADER assays were standard PPI mixes, buffer, FRET oligonucleotides, and Cleavase VIII enzyme and were added individually to each well by the CYBI-well 2000.
  • genotype calls could be made for 157 after 20 minutes and 158 after 40 minutes, or a total of 82%.
  • genotyping results were available for comparison from data obtained previously using either monoplex PCR followed by INVADER analysis or INVADER results obtained directly from analysis of genomic DNA.
  • no corroborating genotype results were available.
  • This example shows that it is possible to amplify more than 150 loci in a single multiplexed PCR reaction. This example further shows that the amount of each amplified fragment generated in such a multiplexed PCR reaction is sufficient to produce discernable genotype calls when used as a target in an INVADER assay.
  • many of the amplicons generated in this multiplex PCR assay gave high signal, measured as FOZ, in the INVADER assay, while some gave such low signal that no genotype call could be made. Still other amplicons were present at such low levels, or not at all, that they failed to yield any signal in the INVADER assay.
  • one particular sample analyzed in Example 5 yielded FOZ results, after a 40 minute incubation in the INVADER assay, of 29.54 FAM and 66.98 RED, while another sample gave FOZ results after 40 min of 1.09 and 1.22, respectively, prompting a determination that there was insufficient signal to generate a genotype call.
  • Modulation of primer concentrations, down in the case of the first sample and up in the case of the second, should make it possible to bring the amplification factors of the two samples closer to the same value. It is envisioned that this sort of modulation may be an iterative process, requiring more than one modification to bring the amplification factors sufficiently close to one another to enable most or all loci in a multiplex PCR reaction to be amplified with approximately equivalent efficiency.
  • n can be g or t. 1 cactagaccg cctgtcccca agggagcctc agtggggcga cagggtgctc ggcggactcc 60 acctcaggcc ctccccactg ttgctgtgca ttcctgtgca ggtgcatctc tttctacta 120 actggtattt attaagggag gtgctctgta ggtctggagc cttccctca tcctttttgc 180 gagtccccac cttttgttttttttttttttttttttttgaggct cactagagga cgcagaacct 240 tgggagattg att
  • 556 0 DNA Unknown Intentionally omitted sequence 556 000 557 0 DNA Unknown Intentionally omitted sequence.

Abstract

The present invention provides methods and routines for developing and optimizing nucleic acid detection assays for use in basic research, clinical research, and for the development of clinical detection assays. In particular, the present invention provides methods for designing oligonucleotide primers to be used in multiplex amplification reactions. The present invention also provides methods to optimize multiplex amplification reactions.

Description

  • The present application is a continuation-in-part of U.S. application Ser. No. 09/998,157, filed Nov. 30, 2001, which claims priority to both U.S. Provisional Application 60/360,489 filed Oct. 19, 2001, and U.S. Provisional Application 60/329,113, filed Oct. 12, 2001, all of which are herein incorporated by reference. [0001]
  • FIELD OF THE INVENTION
  • The present invention provides methods for developing and optimizing nucleic acid detection assays for use in basic research, clinical research, and for the development of clinical detection assays. In particular, the present invention provides methods for designing oligonucleotide primers to be used in multiplex amplification reactions. The present invention also provides methods to optimize multiplex amplification reactions. The present invention also provides methods to perform Highly Multiplexed PCR in Combination with the INVADER Assay. [0002]
  • BACKGROUND
  • With the completion of the nucleic acid sequencing of the human genome, the demand for fast, reliable, cost-effective and user-friendly tests for genomics research and related drug design efforts has greatly increased. A number of institutions are actively mining the available genetic sequence information to identify correlations between genes, gene expression and phenotypes (e.g., disease states, metabolic responses, and the like). These analyses include an attempt to characterize the effect of gene mutations and genetic and gene expression heterogeneity in individuals and populations. However, despite the wealth of sequence information available, information on the frequency and clinical relevance of many polymorphisms and other variations has yet to be obtained and validated. For example, the human reference sequences used in current genome sequencing efforts do not represent an exact match for any one person's genome. In the Human Genome Project (HGP), researchers collected blood (female) or sperm (male) samples from a large number of donors. However, only a few samples were processed as DNA resources, and the source names are protected so neither donors nor scientists know whose DNA is being sequenced. The human genome sequence generated by the private genomics company Celera was based on DNA samples collected from five donors who identified themselves as Hispanic, Asian, Caucasian, or African-American. The small number of human samples used to generate the reference sequences does not reflect the genetic diversity among population groups and individuals. Attempts to analyze individuals based on the genome sequence information will often fail. For example, many genetic detection assays are based on the hybridization of probe oligonucleotides to a target region on genomic DNA or mRNA. 
Probes generated based on the reference sequences will often fail (e.g., fail to hybridize properly, fail to properly characterize the sequence at a specific position of the target) because the target sequence for many individuals differs from the reference sequence. Differences may be on an individual-by-individual basis, but many follow regional population patterns (e.g., many correlate highly to race, ethnicity, geographic locale, age, environmental exposure, etc.). With the limited utility of information currently available, the art is in need of systems and methods for acquiring, analyzing, storing, and applying large volumes of genetic information with the goal of providing an array of detection assay technologies for research and clinical analysis of biological samples. [0003]
  • SUMMARY OF THE INVENTION
  • The present invention provides methods and routines for developing and optimizing nucleic acid detection assays for use in basic research, clinical research, and for the development of clinical detection assays. [0004]
  • In some embodiments, the present invention provides methods comprising; a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises a forward and a reverse primer sequence for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide A or C, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0005]
  • In other embodiments, the present invention provides methods comprising; a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises a forward and a reverse primer sequence for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide G or T, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0006]
  • In particular embodiments, the present invention provides methods comprising; a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the 5′ region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the 3′ region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide A or C, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0007]
  • In other embodiments, the present invention provides methods comprising a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and b) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the 5′ region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the 3′ region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide G or T, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0008]
  • In particular embodiments, the present invention provides methods comprising a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises a single nucleotide polymorphism, b) determining where on each of the target sequences one or more assay probes would hybridize in order to detect the single nucleotide polymorphism such that a footprint region is located on each of the target sequences, and c) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a base, x is at least 6, N[1] is nucleotide A or C, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0009]
  • In some embodiments, the present invention provides methods comprising a) providing target sequence information for at least Y target sequences, wherein each of the target sequences comprises a single nucleotide polymorphism, b) determining where on each of the target sequences one or more assay probes would hybridize in order to detect the single nucleotide polymorphism such that a footprint region is located on each of the target sequences, and c) processing the target sequence information such that a primer set is generated, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide T or G, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0010]
  • In certain embodiments, the primer set is configured for performing a multiplex PCR reaction that amplifies at least Y amplicons, wherein each of the amplicons is defined by the position of the forward and reverse primers. In other embodiments, the primer set is generated as digital or printed sequence information. In some embodiments, the primer set is generated as physical primer oligonucleotides. [0011]
  • In certain embodiments, N[3]-N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[3]-N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. In other embodiments, the processing comprises initially selecting N[1] for each of the forward primers as the most 3′ A or C in the 5′ region. In certain embodiments, the processing comprises initially selecting N[1] for each of the forward primers as the most 3′ G or T in the 5′ region. In some embodiments, the processing comprises initially selecting N[1] for each of the forward primers as the most 3′ A or C in the 5′ region, and wherein the processing further comprises changing the N[1] to the next most 3′ A or C in the 5′ region for the forward primer sequences that fail the requirement that each of the forward primer's N[2]-N[1]-3′ is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0012]
  • In other embodiments, the processing comprises initially selecting N[1] for each of the reverse primers as the most 3′ A or C in the complement of the 3′ region. In some embodiments, the processing comprises initially selecting N[1] for each of the reverse primers as the most 3′ G or T in the complement of the 3′ region. In further embodiments, the processing comprises initially selecting N[1] for each of the reverse primers as the most 3′ A or C in the 3′ region, and wherein the processing further comprises changing the N[1] to the next most 3′ A or C in the 3′ region for the reverse primer sequences that fail the requirement that each of the reverse primer's N[2]-N[1]-3′ is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0013]
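  • The 3′ end selection described in the two preceding paragraphs can be sketched as follows. This is a minimal illustration, not the software of the invention: the function names, the fixed primer length of 6 (the minimum value of x), and the reading of "complementary" as antiparallel base pairing of the terminal dinucleotides (including a primer pairing with a copy of itself) are all assumptions made for the example.

```python
# Hypothetical sketch of 3' end selection for a forward primer.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def ends_clash(primer_a, primer_b, n=2):
    """True if the last n bases of the two primers can pair antiparallel."""
    return primer_a[-n:] == reverse_complement(primer_b[-n:])


def pick_forward_primer(five_prime_region, chosen, allowed="AC", length=6):
    """Walk from the 3' end of the region toward its 5' end.

    Returns the first fixed-length candidate whose 3' base N[1] is in
    `allowed` and whose N[2]-N[1] dinucleotide does not clash with itself
    or with any primer already in `chosen`; returns None if none is found.
    """
    for end in range(len(five_prime_region), length - 1, -1):
        candidate = five_prime_region[end - length:end]
        if candidate[-1] not in allowed:
            continue  # N[1] must be A or C in this embodiment
        if any(ends_clash(candidate, p) for p in chosen + [candidate]):
            continue  # move on to the next most 3' allowed base
        return candidate
    return None


primer = pick_forward_primer("GGATCCTTGACGTA", chosen=[])
print(primer)
```

  • Passing allowed="GT" gives the corresponding G or T embodiment, and a reverse primer can be handled the same way by passing the complement of the 3′ region, written 5′ to 3′.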
  • In particular embodiments, the footprint region comprises a single nucleotide polymorphism. In some embodiments, the footprint comprises a mutation. In some embodiments, the footprint region for each of the target sequences comprises a portion of the target sequence that hybridizes to one or more assay probes configured to detect the single nucleotide polymorphism. In certain embodiments, the footprint is this region where the probes hybridize. In other embodiments, the footprint further includes additional nucleotides on either end. [0014]
  • In some embodiments, the processing further comprises selecting N[5]-N[4]-N[3]-N[2]-N[1]-3′ for each of the forward and reverse primers such that less than 80 percent homology with an assay component sequence is present. In preferred embodiments, the assay component is a FRET probe sequence. In certain embodiments, the target sequence is about 300-500 base pairs in length, or about 200-600 base pairs in length. In certain embodiments, Y is an integer between 2 and 500, or between 2 and 10,000. [0015]
  • In certain embodiments, the processing comprises selecting x for each of the forward and reverse primers such that each of the forward and reverse primers has a melting temperature with respect to the target sequence of approximately 50 degrees Celsius (e.g., 50 degrees Celsius, or at least 50 degrees Celsius and no more than 55 degrees Celsius). In preferred embodiments, the melting temperature of a primer (when hybridized to the target sequence) is at least 50 degrees Celsius, but at least 10 degrees different than a selected detection assay's optimal reaction temperature. [0016]
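  • One way to picture the selection of x is to fix the 3′ end of a primer and lengthen it toward the 5′ end until its estimated melting temperature reaches the target window. The sketch below is illustrative only. The text does not state which melting temperature model is used, so the Wallace rule (2 degrees per A or T, 4 degrees per G or C) is swapped in here as an assumption, and the function and parameter names are hypothetical.

```python
# Hypothetical sketch: choose primer length x from a melting temperature window.
def wallace_tm(seq):
    """Rough Tm estimate in degrees Celsius for a short oligonucleotide."""
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    return 2 * at + 4 * gc


def extend_to_tm(region, three_prime_end, min_len=6, tm_low=50, tm_high=55):
    """Return the shortest primer ending at `three_prime_end` (exclusive
    index into `region`, read 5'->3') whose estimated Tm lies within
    [tm_low, tm_high], or None if no such primer exists."""
    for x in range(min_len, three_prime_end + 1):
        primer = region[three_prime_end - x:three_prime_end]
        tm = wallace_tm(primer)
        if tm > tm_high:
            return None  # already above the window; longer primers only go higher
        if tm >= tm_low:
            return primer
    return None  # ran out of sequence before reaching tm_low


region = "ATGC" * 5
primer = extend_to_tm(region, three_prime_end=len(region))
print(primer, len(primer), wallace_tm(primer))
```

  • A fuller implementation would also compare the result with the optimal reaction temperature of the selected detection assay, which this sketch leaves out.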
  • In some embodiments, optimized concentrations of the forward and reverse primer pairs are determined for the primer set. In other embodiments, the processing is automated. In further embodiments, the processing is automated with a processor. [0017]
  • In other embodiments, the present invention provides a kit comprising the primer set generated by the methods of the present invention, and at least one other component (e.g., a cleavage agent, polymerase, or INVADER oligonucleotide). In certain embodiments, the present invention provides compositions comprising the primers and primer sets generated by the methods of the present invention. [0018]
  • In particular embodiments, the present invention provides methods comprising; a) providing; i) a user interface configured to receive sequence data, ii) a computer system having stored therein a multiplex PCR primer software application, and b) transmitting the sequence data from the user interface to the computer system, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and c) processing the target sequence information with the multiplex PCR primer pair software application to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide A or C, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0019]
  • In some embodiments, the present invention provides methods comprising; a) providing; i) a user interface configured to receive sequence data, ii) a computer system having stored therein a multiplex PCR primer software application, and b) transmitting the sequence data from the user interface to the computer system, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, and c) processing the target sequence information with the multiplex PCR primer pair software application to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide G or T, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set. [0020]
  • In certain embodiments, the present invention provides systems comprising; a) a computer system configured to receive data from a user interface, wherein the user interface is configured to receive sequence data, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, b) a multiplex PCR primer pair software application operably linked to the user interface, wherein the multiplex PCR primer software application is configured to process the target sequence information to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide A or C, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set, and c) a computer system having stored therein the multiplex PCR primer pair software application, wherein the computer system comprises computer memory and a computer processor. [0021]
  • In other embodiments, the present invention provides systems comprising; a) a computer system configured to receive data from a user interface, wherein the user interface is configured to receive sequence data, wherein the sequence data comprises target sequence information for at least Y target sequences, wherein each of the target sequences comprises; i) a footprint region, ii) a 5′ region immediately upstream of the footprint region, and iii) a 3′ region immediately downstream of the footprint region, b) a multiplex PCR primer pair software application operably linked to the user interface, wherein the multiplex PCR primer software application is configured to process the target sequence information to generate a primer set, wherein the primer set comprises; i) a forward primer sequence identical to at least a portion of the target sequence immediately 5′ of the footprint region for each of the Y target sequences, and ii) a reverse primer sequence identical to at least a portion of a complementary sequence of the target sequence immediately 3′ of the footprint region for each of the at least Y target sequences, wherein each of the forward and reverse primer sequences comprises a nucleic acid sequence represented by 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, wherein N represents a nucleotide base, x is at least 6, N[1] is nucleotide G or T, and N[2]-N[1]-3′ of each of the forward and reverse primers is not complementary to N[2]-N[1]-3′ of any of the forward and reverse primers in the primer set, and c) a computer system having stored therein the multiplex PCR primer pair software application, wherein the computer system comprises computer memory and a computer processor. In certain embodiments, the computer system is configured to return the primer set to the user interface.[0022]
  • DESCRIPTION OF THE FIGURES
  • The following figures form part of the present specification and are included to further demonstrate certain aspects and embodiments of the present invention. The invention may be better understood by reference to one or more of these figures in combination with the description of specific embodiments presented herein. [0023]
  • FIG. 1 shows one embodiment of SNP detection using the INVADER assay in biplex format. [0024]
  • FIG. 2 shows an input target sequence and the result of processing this sequence with systems and routines of the present invention. [0025]
  • FIG. 3 shows an example of a basic work flow for highly multiplexed PCR using the INVADER Medically Associated Panel. [0026]
  • FIG. 4 shows a flow chart outlining the steps that may be performed in order to generate a primer set useful in multiplex PCR. [0027]
  • FIGS. 5-9 show sequences used and data generated in connection with Example 1. [0028]
  • FIGS. 10-17 show sequences used and data generated in connection with Example 2. [0029]
  • FIG. 18 shows one protocol for Multiplex PCR optimization according to the present invention. [0030]
  • FIG. 19 shows certain criteria that can be employed in certain embodiments of the present invention in order to design multiplex primers. [0031]
  • FIG. 20 shows certain PCR primers useful for amplifying various regions of CYP2D6. [0032]
  • FIG. 21 shows certain results from Example 3. [0033]
  • FIG. 22 shows certain results from Example 4. [0034]
  • FIG. 23 shows additional results from Example 4. [0035]
  • DEFINITIONS
  • To facilitate an understanding of the present invention, a number of terms and phrases are defined below: [0036]
  • As used herein, the terms “SNP,” “SNPs” or “single nucleotide polymorphisms” refer to single base changes at a specific location in an organism's (e.g., a human) genome. “SNPs” can be located in a portion of a genome that does not code for a gene. Alternatively, a “SNP” may be located in the coding region of a gene. In this case, the “SNP” may alter the structure and function of the RNA or the protein with which it is associated. [0037]
  • As used herein, the term “allele” refers to a variant form of a given sequence (e.g., including but not limited to, genes containing one or more SNPs). A large number of genes are present in multiple allelic forms in a population. A diploid organism carrying two different alleles of a gene is said to be heterozygous for that gene, whereas a homozygote carries two copies of the same allele. [0038]
  • As used herein, the term “linkage” refers to the proximity of two or more markers (e.g., genes) on a chromosome. [0039]
  • As used herein, the term “allele frequency” refers to the frequency of occurrence of a given allele (e.g., a sequence containing a SNP) in a given population (e.g., a specific gender, race, or ethnic group). Certain populations may contain a given allele within a higher percent of its members than other populations. For example, a particular mutation in the breast cancer gene called BRCA1 was found to be present in one percent of the general Jewish population. In comparison, the percentage of people in the general U.S. population that have any mutation in BRCA1 has been estimated to be between 0.1 and 0.6 percent. Two additional mutations, one in the BRCA1 gene and one in another breast cancer gene called BRCA2, have a greater prevalence in the Ashkenazi Jewish population, bringing the overall risk for carrying one of these three mutations to 2.3 percent. [0040]
  • As used herein, the term “in silico analysis” refers to analysis performed using computer processors and computer memory. For example, “in silico SNP analysis” refers to the analysis of SNP data using computer processors and memory. [0041]
  • As used herein, the term “genotype” refers to the actual genetic make-up of an organism (e.g., in terms of the particular alleles carried at a genetic locus). Expression of the genotype gives rise to an organism's physical appearance and characteristics (the “phenotype”). [0042]
  • As used herein, the term “locus” refers to the position of a gene or any other characterized sequence on a chromosome. [0043]
  • As used herein, the term “disease” or “disease state” refers to a deviation from the condition regarded as normal or average for members of a species, and which is detrimental to an affected individual under conditions that are not inimical to the majority of individuals of that species (e.g., diarrhea, nausea, fever, pain, inflammation, etc.). [0044]
  • As used herein, the term “treatment” in reference to a medical course of action refers to steps or actions taken with respect to an affected individual as a consequence of a suspected, anticipated, or existing disease state, or wherein there is a risk or suspected risk of a disease state. Treatment may be provided in anticipation of or in response to a disease state or suspicion of a disease state, and may include, but is not limited to, preventative, ameliorative, palliative or curative steps. The term “therapy” refers to a particular course of treatment. [0045]
  • The term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, RNA (e.g., rRNA, tRNA, etc.), or precursor. The polypeptide, RNA, or precursor can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., ligand binding, signal transduction, etc.) of the full-length or fragment are retained. The term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and that are present on the mRNA are referred to as 5′ untranslated sequences. The sequences that are located 3′ or downstream of the coding region and that are present on the mRNA are referred to as 3′ untranslated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments included when a gene is transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are generally absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide. Variations (e.g., mutations, SNPs, insertions, deletions) in transcribed portions of genes are reflected in, and can generally be detected in, corresponding portions of the produced RNAs (e.g., hnRNAs, mRNAs, rRNAs, tRNAs). [0046]
  • Where the phrase “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, amino acid sequence and like terms, such as polypeptide or protein are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule. [0047]
  • In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences that are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers that control or influence the transcription of the gene. The 3′ flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage and polyadenylation. [0048]
  • The term “wild-type” refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene. In contrast, the terms “modified,” “mutant,” and “variant” refer to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product. [0049]
  • As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. In this case, the DNA sequence thus codes for the amino acid sequence. [0050]
  • DNA and RNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide or polynucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide or polynucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand. The promoter and enhancer elements that direct transcription of a linked gene are generally located 5′ or upstream of the coding region. However, enhancer elements can exert their effect even when located 3′ of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3′ or downstream of the coding region. [0051]
  • As used herein, the terms “an oligonucleotide having a nucleotide sequence encoding a gene” and “polynucleotide having a nucleotide sequence encoding a gene” mean a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product. The coding region may be present in a cDNA, genomic DNA, or RNA form. When present in a DNA form, the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements. [0052]
  • As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. [0053]
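  • As an illustration of partial versus complete complementarity, the short sketch below scores two equal-length strands, both written 5′ to 3′, by aligning them antiparallel and counting Watson-Crick pairs. The function name and the restriction to equal-length DNA strands are assumptions made for the example.

```python
# Hypothetical sketch: fraction of base-paired positions between two strands.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity(strand_5to3, other_5to3):
    """Fraction of positions that pair when the strands are antiparallel."""
    if len(strand_5to3) != len(other_5to3):
        raise ValueError("strands must have equal length")
    other_3to5 = other_5to3[::-1]  # read the second strand 3'->5'
    paired = sum(PAIR[a] == b for a, b in zip(strand_5to3, other_3to5))
    return paired / len(strand_5to3)


# 5'-AGT-3' against 3'-TCA-5' (written 5'->3' as ACT): complete
print(complementarity("AGT", "ACT"))
# One mismatch out of three positions: partial
print(complementarity("AGT", "ACA"))
```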
  • The term “homology” refers to a degree of complementarity. There may be partial homology or complete homology (i.e., identity). A partially complementary sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid and is referred to using the functional term “substantially homologous.” The term “inhibition of binding,” when used in reference to nucleic acid binding, refers to inhibition of binding caused by competition of homologous sequences for binding to a target sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target that lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target. [0054]
  • The art knows well that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions that promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.). [0055]
  • When used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term “substantially homologous” refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above. [0056]
  • A gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript. cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other. [0057]
  • When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe that can hybridize (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above. [0058]
  • As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids. [0059]
  • As used herein, the term “Tm” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the Tm of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985]). Other references include more sophisticated computations that take structural as well as sequence characteristics into account for the calculation of Tm. [0060]
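  • The simple estimate given above can be computed directly from a sequence. The following is a minimal sketch; the function names are hypothetical, and the result applies only under the stated condition (aqueous solution at 1 M NaCl). Because the estimate depends on base composition alone and not on length, it serves as a rough guide rather than a substitute for the more sophisticated computations mentioned above.

```python
# Hypothetical sketch of the simple estimate Tm = 81.5 + 0.41(% G+C).
def gc_percent(seq):
    """Percentage of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def simple_tm(seq):
    """Estimated melting temperature in degrees Celsius at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


print(round(simple_tm("ATGCATGCATGCATGCATGC"), 1))
```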
  • As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. Those skilled in the art will recognize that “stringency” conditions may be altered by varying the parameters just described either individually or in concert. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences (e.g., hybridization under “high stringency” conditions may occur between homologs with about 85-100% identity, preferably about 70-100% identity). With medium stringency conditions, nucleic acid base pairing will occur between nucleic acids with an intermediate frequency of complementary base sequences (e.g., hybridization under “medium stringency” conditions may occur between homologs with about 50-70% identity). Thus, conditions of “weak” or “low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less. [0061]
  • “High stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed. [0062]
  • “Medium stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 1.0×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed. [0063]
  • “Low stringency conditions” comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5× Denhardt's reagent [50× Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C. when a probe of about 500 nucleotides in length is employed. [0064]
  • The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence,” “sequence identity,” “percentage of sequence identity,” and “substantial identity.” A “reference sequence” is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA sequence given in a sequence listing or may comprise a complete gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity. A “comparison window,” as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman [Smith and Waterman, Adv. Appl. Math. 2:482 (1981)], by the homology alignment algorithm of Needleman and Wunsch [Needleman and Wunsch, J. Mol. Biol. 48:443 (1970)], by the search for similarity method of Pearson and Lipman [Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988)], by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection, and the best alignment (i.e., resulting in the highest percentage of homology over the comparison window) generated by the various methods is selected. The term “sequence identity” means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The term “percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. [0065]
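By way of illustration only, the calculation of “percentage of sequence identity” described above may be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned by one of the cited methods and are supplied as equal-length strings in which a hyphen marks a gap; the function name `percent_identity` and the example sequences are hypothetical and are not part of the described methods.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Assumption: both inputs are already optimally aligned, have equal
    length, and use '-' to mark a gap (an addition or deletion).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window is empty")
    # A position is matched only when both sequences carry the same base;
    # a gap never counts as a match.
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # Matched positions divided by the window size, multiplied by 100.
    return 100.0 * matched / window_size


# Hypothetical 20-position window with one mismatch and one gap.
window_a = "ACGTACGTACGTACGTACGT"
window_b = "ACGTACGAACGTAC-TACGT"
print(percent_identity(window_a, window_b))
```

In this hypothetical window, 18 of the 20 positions are matched, so the function returns 90.0.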
  • As applied to polynucleotides, the term “substantial identity” denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison. The reference sequence may be a subset of a larger sequence, for example, as a splice variant of the full-length sequences. [0066]
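As a non-limiting sketch, the minimum criteria for “substantial identity” of polynucleotides recited above (at least 85 percent sequence identity, a comparison window of at least 20 positions, and deletions or additions totalling 20 percent or less of the reference sequence over the window) may be expressed as a simple check. The function name, the parameter names, and the choice to measure gaps as a count of positions within the window are assumptions made for illustration.

```python
MIN_WINDOW = 20          # comparison window of at least 20 nucleotide positions
MIN_IDENTITY = 85.0      # at least 85 percent sequence identity
MAX_GAP_FRACTION = 0.20  # additions or deletions of 20 percent or less


def is_substantially_identical(
    percent_identity: float, window_size: int, gap_positions: int = 0
) -> bool:
    """Apply the minimum thresholds to precomputed alignment statistics.

    Assumption: gap_positions is the number of added or deleted positions
    within the comparison window, expressed relative to window_size.
    """
    if window_size < MIN_WINDOW:
        return False
    if gap_positions / window_size > MAX_GAP_FRACTION:
        return False
    return percent_identity >= MIN_IDENTITY


print(is_substantially_identical(90.0, 20, gap_positions=1))
print(is_substantially_identical(80.0, 25))
```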
  • As applied to polypeptides, the term “substantial identity” means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least 80 percent sequence identity, preferably at least 90 percent sequence identity, more preferably at least 95 percent sequence identity or more (e.g., 99 percent sequence identity). Preferably, residue positions that are not identical differ by conservative amino acid substitutions. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine. [0067]
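For illustration, the preferred conservative substitution groups listed above may be encoded as sets of one-letter amino acid codes and used to test whether a difference between two aligned residues falls within a preferred group. The one-letter encoding, the names `PREFERRED_GROUPS` and `is_preferred_conservative`, and the treatment of identical residues as not being substitutions are assumptions of this sketch.

```python
# Preferred conservative substitution groups, in one-letter codes:
# valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine,
# alanine-valine, and asparagine-glutamine.
PREFERRED_GROUPS = [
    {"V", "L", "I"},
    {"F", "Y"},
    {"K", "R"},
    {"A", "V"},
    {"N", "Q"},
]


def is_preferred_conservative(residue_a: str, residue_b: str) -> bool:
    """Return True when two different residues share a preferred group."""
    a, b = residue_a.upper(), residue_b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in PREFERRED_GROUPS)


print(is_preferred_conservative("V", "L"))  # same group
print(is_preferred_conservative("A", "L"))  # no shared preferred group
```

Note that alanine and leucine each pair with valine but do not share a preferred group with one another, so the second call returns False.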
  • “Amplification” is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out. [0068]
  • Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (D. L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acid will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (M. Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D. Y. Wu and R. B. Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]). [0069]
  • As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.” [0070]
  • As used herein, the term “sample template” refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below). In contrast, “background template” is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample. [0071]
  • As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer should be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method. [0072]
  • As used herein, the term “probe” or “hybridization probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing, at least in part, to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular sequences. In some preferred embodiments, probes used in the present invention will be labeled with a “reporter molecule,” so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label. [0073]
  • As used herein, the term “target” refers to a nucleic acid sequence or structure to be detected or characterized. [0074]
  • As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis (See e.g., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference), which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified.” [0075]
  • With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications. [0076]
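The statement above that the length of the amplified segment is determined by the relative positions of the primers may be illustrated with a short sketch. The coordinate convention (1-based positions of the 5′ ends of the two primers, both numbered along the same strand of the target) and the function names are assumptions. The second function assumes idealized doubling of the segment in every cycle, which is a simplification not stated in the text; actual yields are lower.

```python
def amplicon_length(forward_5prime: int, reverse_5prime: int) -> int:
    """Length of the amplified segment bounded by the two primers.

    Assumption: both arguments are 1-based coordinates on the same strand
    numbering, marking where the 5' end of each primer anneals.
    """
    if reverse_5prime < forward_5prime:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_5prime - forward_5prime + 1


def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies after a number of cycles, assuming perfect doubling per cycle."""
    return initial_copies * 2 ** cycles


# Hypothetical primer positions on a target sequence.
print(amplicon_length(101, 350))
# A single starting copy after 30 idealized cycles.
print(ideal_copies(1, 30))
```

Moving either primer changes the first result directly, which is the sense in which the segment length is a controllable parameter.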
  • As used herein, the terms “PCR product,” “PCR fragment,” and “amplification product” refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences. [0077]
  • As used herein, the term “amplification reagents” refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.). [0078]
  • As used herein, the term “recombinant DNA molecule” refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques. [0079]
  • As used herein, the term “antisense” is used in reference to RNA sequences that are complementary to a specific RNA sequence (e.g., mRNA). The term “antisense strand” is used in reference to a nucleic acid strand that is complementary to the “sense” strand. The designation (−) (i.e., “negative”) is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., “positive”) strand. [0080]
  • The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature. For example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs that encode a multitude of proteins. However, isolated nucleic acids encoding a polypeptide include, by way of example, such nucleic acid in cells ordinarily expressing the polypeptide where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. The isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form. When an isolated nucleic acid, oligonucleotide or polynucleotide is to be utilized to express a protein, the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded). [0081]
  • As used herein the term “portion” when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence. The fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (e.g., 10 nucleotides, 11, . . . , 20, . . . ). [0082]
  • As used herein, the term “purified” or “to purify” refers to the removal of contaminants from a sample. As used herein, the term “purified” refers to molecules (e.g., nucleic or amino acid sequences) that are removed from their natural environment, isolated or separated. An “isolated nucleic acid sequence” is therefore a purified nucleic acid sequence. “Substantially purified” molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free from other components with which they are naturally associated. [0083]
  • The term “recombinant protein” or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule. [0084]
  • The term “native protein” is used herein to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is, the native protein contains only those amino acids found in the protein as it occurs in nature. A native protein may be produced by recombinant means or may be isolated from a naturally occurring source. [0085]
  • As used herein the term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four consecutive amino acid residues to the entire amino acid sequence minus one amino acid. [0086]
  • The term “Southern blot” refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58 [1989]). [0087]
  • The term “Western blot” refers to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane. The proteins are run on acrylamide gels to separate the proteins, followed by transfer of the protein from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized proteins are then exposed to antibodies with reactivity against an antigen of interest. The binding of the antibodies may be detected by various methods, including the use of labeled antibodies. [0088]
  • The term “test compound” refers to any chemical entity, pharmaceutical, drug, and the like that are tested in an assay (e.g., a drug screening assay) for any desired activity (e.g., including but not limited to, the ability to treat or prevent a disease, illness, sickness, or disorder of bodily function, or otherwise alter the physiological or cellular status of a sample). Test compounds comprise both known and potential therapeutic compounds. A test compound can be determined to be therapeutic by screening using the screening methods of the present invention. A “known therapeutic compound” refers to a therapeutic compound that has been shown (e.g., through animal trials or prior experience with administration to humans) to be effective in such treatment or prevention. [0089]
  • The term “sample” as used herein is used in its broadest sense. A sample suspected of containing a human chromosome or sequences associated with a human chromosome may comprise a cell, chromosomes isolated from a cell (e.g., a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern blot analysis), RNA (in solution or bound to a solid support such as for Northern blot analysis), cDNA (in solution or bound to a solid support) and the like. A sample suspected of containing a protein may comprise a cell, a portion of a tissue, an extract containing one or more proteins and the like. [0090]
  • The term “label” as used herein refers to any atom or molecule that can be used to provide a detectable (preferably quantifiable) effect, and that can be attached to a nucleic acid or protein. Labels include but are not limited to dyes; radiolabels such as 32P; binding moieties such as biotin; haptens such as digoxigenin; luminogenic, phosphorescent or fluorogenic moieties; and fluorescent dyes alone or in combination with moieties that can suppress or shift emission spectra by fluorescence resonance energy transfer (FRET). Labels may provide signals detectable by fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like. A label may be a charged moiety (positive or negative charge) or alternatively, may be charge neutral. Labels can include or consist of nucleic acid or protein sequence, so long as the sequence comprising the label is detectable. [0091]
  • The term “signal” as used herein refers to any detectable effect, such as would be caused or provided by a label or an assay reaction. [0092]
  • As used herein, the term “detector” refers to a system or component of a system, e.g., an instrument (e.g. a camera, fluorimeter, charge-coupled device, scintillation counter, etc) or a reactive medium (X-ray or camera film, pH indicator, etc.), that can convey to a user or to another component of a system (e.g., a computer or controller) the presence of a signal or effect. A detector can be a photometric or spectrophotometric system, which can detect ultraviolet, visible or infrared light, including fluorescence or chemiluminescence; a radiation detection system; a spectroscopic system such as nuclear magnetic resonance spectroscopy, mass spectrometry or surface enhanced Raman spectrometry; a system such as gel or capillary electrophoresis or gel exclusion chromatography; or other detection system known in the art, or combinations thereof. [0093]
  • As used herein, the term “distribution system” refers to systems capable of transferring and/or delivering materials from one entity to another or one location to another. For example, a distribution system for transferring detection panels from a manufacturer or distributor to a user may comprise, but is not limited to, a packaging department, a mail room, and a mail delivery system. Alternately, the distribution system may comprise, but is not limited to, one or more delivery vehicles and associated delivery personnel, a display stand, and a distribution center. In some embodiments of the present invention, interested parties (e.g., detection panel manufacturers) utilize a distribution system to transfer detection panels to users at no cost, at a subsidized cost, or at a reduced cost. [0094]
  • As used herein, the term “at a reduced cost” refers to the transfer of goods or services at a reduced direct cost to the recipient (e.g. user). In some embodiments, “at a reduced cost” refers to transfer of goods or services at no cost to the recipient. [0095]
  • As used herein, the term “at a subsidized cost” refers to the transfer of goods or services, wherein at least a portion of the recipient's cost is deferred or paid by another party. In some embodiments, “at a subsidized cost” refers to transfer of goods or services at no cost to the recipient. [0096]
  • As used herein, the term “at no cost” refers to the transfer of goods or services with no direct financial expense to the recipient. For example, when detection panels are provided by a manufacturer or distributor to a user (e.g. research scientist) at no cost, the user does not directly pay for the tests. [0097]
  • The term “detection” as used herein refers to quantitatively or qualitatively identifying an analyte (e.g., DNA, RNA or a protein) within a sample. The term “detection assay” as used herein refers to a kit, test, or procedure performed for the purpose of detecting an analyte nucleic acid within a sample. Detection assays produce a detectable signal or effect when performed in the presence of the target analyte, and include but are not limited to assays incorporating the processes of hybridization, nucleic acid cleavage (e.g., exo- or endonuclease), nucleic acid amplification, nucleotide sequencing, primer extension, or nucleic acid ligation. [0098]
  • As used herein, the term “functional detection oligonucleotide” refers to an oligonucleotide that is used as a component of a detection assay, wherein the detection assay is capable of successfully detecting (i.e., producing a detectable signal) an intended target nucleic acid when the functional detection oligonucleotide provides the oligonucleotide component of the detection assay. This is in contrast to non-functional detection oligonucleotides, which fail to produce a detectable signal in a detection assay for the particular target nucleic acid when the non-functional detection oligonucleotide is provided as the oligonucleotide component of the detection assay. Determining if an oligonucleotide is a functional oligonucleotide can be carried out experimentally by testing the oligonucleotide in the presence of the particular target nucleic acid using the detection assay. [0099]
  • As used herein, the term “derived from a different subject,” such as samples or nucleic acids derived from different subjects, refers to samples derived from multiple different individuals. For example, a blood sample comprising genomic DNA from a first person and a blood sample comprising genomic DNA from a second person are considered blood samples and genomic DNA samples that are derived from different subjects. A sample comprising five target nucleic acids derived from different subjects is a sample that includes at least five samples from five different individuals. However, the sample may further contain multiple samples from a given individual. [0100]
  • As used herein, the term “treating together”, when used in reference to experiments or assays, refers to conducting experiments concurrently or sequentially, wherein the results of the experiments are produced, collected, or analyzed together (i.e., during the same time period). For example, a plurality of different target sequences located in separate wells of a multiwell plate or in different portions of a microarray are treated together in a detection assay where detection reactions are carried out on the samples simultaneously or sequentially and where the data collected from the assays is analyzed together. [0101]
  • The terms “assay data” and “test result data” as used herein refer to data collected from performance of an assay (e.g., to detect or quantitate a gene, SNP or an RNA). Test result data may be in any form, i.e., it may be raw assay data or analyzed assay data (e.g., previously analyzed by a different process). Collected data that has not been further processed or analyzed is referred to herein as “raw” assay data (e.g., a number corresponding to a measurement of signal, such as a fluorescence signal from a spot on a chip or a reaction vessel, or a number corresponding to measurement of a peak, such as peak height or area, as from, for example, a mass spectrometer, HPLC or capillary separation device), while assay data that has been processed through a further step or analysis (e.g., normalized, compared, or otherwise processed by a calculation) is referred to as “analyzed assay data” or “output assay data”. [0102]
  • As used herein, the term “database” refers to collections of information (e.g., data) arranged for ease of retrieval, for example, stored in a computer memory. A “genomic information database” is a database comprising genomic information, including, but not limited to, polymorphism information (i.e., information pertaining to genetic polymorphisms), genome information (i.e., genomic information), linkage information (i.e., information pertaining to the physical location of a nucleic acid sequence with respect to another nucleic acid sequence, e.g., in a chromosome), and disease association information (i.e., information correlating the presence of or susceptibility to a disease to a physical trait of a subject, e.g., an allele of a subject). “Database information” refers to information to be sent to a database, stored in a database, processed in a database, or retrieved from a database. “Sequence database information” refers to database information pertaining to nucleic acid sequences. As used herein, the term “distinct sequence databases” refers to two or more databases that contain different information than one another. For example, the dbSNP and GenBank databases are distinct sequence databases because each contains information not found in the other. [0103]
  • As used herein the terms “processor” and “central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program. [0104]
  • As used herein, the terms “computer memory” and “computer memory device” refer to any storage media readable by a computer processor. Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVDs), compact discs (CDs), hard disk drives (HDD), and magnetic tape. [0105]
  • As used herein, the term “computer readable medium” refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor. Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks. [0106]
  • As used herein, the term “hyperlink” refers to a navigational link from one document to another, or from one portion (or component) of a document to another. Typically, a hyperlink is displayed as a highlighted word or phrase that can be selected by clicking on it using a mouse to jump to the associated document or document portion. [0107]
  • As used herein, the term “hypertext system” refers to a computer-based informational system in which documents (and possibly other types of data entities) are linked together via hyperlinks to form a user-navigable “web.” [0108]
  • As used herein, the term “Internet” refers to any collection of networks using standard protocols. For example, the term includes a collection of interconnected (public and/or private) networks that are linked together by a set of standard protocols (such as TCP/IP, HTTP, and FTP) to form a global, distributed network. While this term is intended to refer to what is now commonly known as the Internet, it is also intended to encompass variations that may be made in the future, including changes and additions to existing standard protocols or integration with other media (e.g., television, radio, etc). The term is also intended to encompass non-public networks such as private (e.g., corporate) Intranets. [0109]
  • As used herein, the terms “World Wide Web” or “web” refer generally to both (i) a distributed collection of interlinked, user-viewable hypertext documents (commonly referred to as Web documents or Web pages) that are accessible via the Internet, and (ii) the client and server software components which provide user access to such documents using standardized Internet protocols. Currently, the primary standard protocol for allowing applications to locate and acquire Web documents is HTTP, and the Web pages are encoded using HTML. However, the terms “Web” and “World Wide Web” are intended to encompass future markup languages and transport protocols that may be used in place of (or in addition to) HTML and HTTP. [0110]
  • As used herein, the term “web site” refers to a computer system that serves informational content over a network using the standard protocols of the World Wide Web. Typically, a Web site corresponds to a particular Internet domain name and includes the content associated with a particular organization. As used herein, the term is generally intended to encompass both (i) the hardware/software server components that serve the informational content over the network, and (ii) the “back end” hardware/software components, including any non-standard or specialized components, that interact with the server components to perform services for Web site users. [0111]
  • As used herein, the term “HTML” refers to HyperText Markup Language that is a standard coding convention and set of codes for attaching presentation and linking attributes to informational content within documents. HTML is based on SGML, the Standard Generalized Markup Language. During a document authoring stage, the HTML codes (referred to as “tags”) are embedded within the informational content of the document. When the Web document (or HTML document) is subsequently transferred from a Web server to a browser, the codes are interpreted by the browser and used to parse and display the document. Additionally, in specifying how the Web browser is to display the document, HTML tags can be used to create links to other Web documents (commonly referred to as “hyperlinks”). [0112]
  • As used herein, the term “XML” refers to Extensible Markup Language, an application profile that, like HTML, is based on SGML. XML differs from HTML in that: information providers can define new tag and attribute names at will; document structures can be nested to any level of complexity; any XML document can contain an optional description of its grammar for use by applications that need to perform structural validation. XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data, and some of which form markup. Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure, to define constraints on the logical structure and to support the use of predefined storage units. A software module called an XML processor is used to read XML documents and provide access to their content and structure. [0113]
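By way of example only, the role of an XML processor described above (reading an XML document and providing access to its content and structure) may be sketched using the `xml.etree.ElementTree` module of the Python standard library. The document shown, including its tag names, attribute names, and values, is entirely hypothetical and serves only to show provider-defined tags and nested structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with provider-defined tag and attribute names
# and a nested structure.
DOCUMENT = """<assay id="example-1">
  <target name="hypothetical-target">
    <result type="raw">1520</result>
    <result type="analyzed">0.87</result>
  </target>
</assay>"""

# The parser acts as the XML processor: it reads the markup and exposes
# the logical structure as a tree of elements.
root = ET.fromstring(DOCUMENT)

# Access content and structure: the root element, an attribute, and the
# character data of each nested <result> element.
results = {r.get("type"): r.text for r in root.iter("result")}
print(root.tag, root.get("id"))
print(results)
```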
  • As used herein, the term “HTTP” refers to HyperText Transfer Protocol, which is the standard World Wide Web client-server protocol used for the exchange of information (such as HTML documents, and client requests for such documents) between a browser and a Web server. HTTP includes a number of different types of messages that can be sent from the client to the server to request different types of server actions. For example, a “GET” message, which has the format GET <URL>, causes the server to return the document or file located at the specified URL. [0114]
  • As used herein, the term “URL” refers to Uniform Resource Locator that is a unique address that fully specifies the location of a file or other resource on the Internet. The general format of a URL is protocol://machine address:port/path/filename. The port specification is optional, and if none is entered by the user, the browser defaults to the standard port for whatever service is specified as the protocol. For example, if HTTP is specified as the protocol, the browser will use the HTTP default port of 80. [0115]
  • As used herein, the term “PUSH technology” refers to an information dissemination technology used to send data to users over a network. In contrast to the World Wide Web (a “pull” technology), in which the client browser should request a Web page before it is sent, PUSH protocols send the informational content to the user computer automatically, typically based on information pre-specified by the user. [0116]
  • As used herein, the term “communication network” refers to any network that allows information to be transmitted from one location to another. For example, a communication network for the transfer of information from one computer to another includes any public or private network that transfers information using electrical, optical, satellite transmission, and the like. Two or more devices that are part of a communication network such that they can directly or indirectly transmit information from one to the other are considered to be “in electronic communication” with one another. A computer network containing multiple computers may have a central computer (“central node”) that processes information and passes it to one or more sub-computers that carry out specific tasks (“sub-nodes”). Some networks comprise computers that are in “different geographic locations” from one another, meaning that the computers are located in different physical locations (i.e., are not physically the same computer, e.g., are located in different countries, states, cities, rooms, etc.). [0117]
  • As used herein, the term “detection assay component” refers to a component of a system capable of performing a detection assay. Detection assay components include, but are not limited to, hybridization probes, buffers, and the like. [0118]
  • As used herein, the term “detection assay configured for target detection” refers to a collection of assay components that are capable of producing a detectable signal when carried out using the target nucleic acid. For example, a detection assay that has empirically been demonstrated to detect a particular single nucleotide polymorphism is considered a detection assay configured for target detection. [0119]
  • As used herein, the phrase “unique detection assay” refers to a detection assay that has a different collection of detection assay components in relation to other detection assays located on the same detection panel. A unique assay doesn't necessarily detect a different target (e.g. SNP) than other assays on the same detection panel, but it does have at least one difference in the collection of components used to detect a given target (e.g. a unique detection assay may employ a probe sequence that is shorter or longer in length than other assays on the same detection panel). [0120]
  • As used herein, the term “candidate” refers to an assay or analyte, e.g., a nucleic acid, suspected of having a particular feature or property. A “candidate sequence” refers to a nucleic acid suspected of comprising a particular sequence, while a “candidate oligonucleotide” refers to an oligonucleotide suspected of having a property such as comprising a particular sequence, or having the capability to hybridize to a target nucleic acid or to perform in a detection assay. A “candidate detection assay” refers to a detection assay that is suspected of being a valid detection assay. [0121]
  • As used herein, the term “detection panel” refers to a substrate or device containing at least two unique candidate detection assays configured for target detection. [0122]
  • As used herein, the term “valid detection assay” refers to a detection assay that has been shown to accurately predict an association between the detection of a target and a phenotype (e.g. medical condition). Examples of valid detection assays include, but are not limited to, detection assays that, when a target is detected, accurately predict the phenotype (e.g. medical condition) 95%, 96%, 97%, 98%, 99%, 99.5%, 99.8%, or 99.9% of the time. Other examples of valid detection assays include, but are not limited to, detection assays that qualify as and/or are marketed as Analyte-Specific Reagents (i.e. as defined by FDA regulations) or In-Vitro Diagnostics (i.e. approved by the FDA). [0123]
  • As used herein, the term “kit” refers to any delivery system for delivering materials. In the context of reaction assays, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., oligonucleotides, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay etc.) from one location to another. For example, kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials. As used herein, the term “fragmented kit” refers to a delivery system comprising two or more separate containers that each contain a subportion of the total kit components. The containers may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme for use in an assay, while a second container contains oligonucleotides. The term “fragmented kit” is intended to encompass kits containing Analyte specific reagents (ASR's) regulated under section 520(e) of the Federal Food, Drug, and Cosmetic Act, but is not limited thereto. Indeed, any delivery system comprising two or more separate containers that each contain a subportion of the total kit components is included in the term “fragmented kit.” In contrast, a “combined kit” refers to a delivery system containing all of the components of a reaction assay in a single container (e.g., in a single box housing each of the desired components). The term “kit” includes both fragmented and combined kits. [0124]
  • As used herein, the term “information” refers to any collection of facts or data. In reference to information stored or processed using a computer system(s), including but not limited to internets, the term refers to any data stored in any format (e.g., analog, digital, optical, etc.). As used herein, the term “information related to a subject” refers to facts or data pertaining to a subject (e.g., a human, plant, or animal). The term “genomic information” refers to information pertaining to a genome including, but not limited to, nucleic acid sequences, genes, allele frequencies, RNA expression levels, protein expression, phenotypes correlating to genotypes, etc. “Allele frequency information” refers to facts or data pertaining to allele frequencies, including, but not limited to, allele identities, statistical correlations between the presence of an allele and a characteristic of a subject (e.g., a human subject), the presence or absence of an allele in an individual or population, the percentage likelihood of an allele being present in an individual having one or more particular characteristics, etc. [0125]
  • As used herein, the term “assay validation information” refers to genomic information and/or allele frequency information resulting from processing of test result data (e.g. processing with the aid of a computer). Assay validation information may be used, for example, to identify a particular candidate detection assay as a valid detection assay. [0126]
  • DETAILED DESCRIPTION OF THE INVENTION
  • Since its introduction in 1988 (Chamberlain, et al. Nucleic Acids Res., 16:11141 (1988)), multiplex PCR has become a routine means of amplifying multiple genetic loci in a single reaction. This approach has found utility in a number of research, as well as clinical, applications. Multiplex PCR has been described for use in diagnostic virology (Elnifro, et al. Clinical Microbiology Reviews, 13: 559 (2000)), paternity testing (Hidding and Schmitt, Forensic Sci. Int., 113: 47 (2000); Bauer et al., Int. J. Legal Med. 116: 39 (2002)), preimplantation genetic diagnosis (Ouhibi, et al., Curr Womens Health Rep. 1: 138 (2001)), microbial analysis in environmental and food samples (Rudi et al., Int J Food Microbiology, 78: 171 (2002)), and veterinary medicine (Zarlenga and Higgins, Vet Parasitol. 101: 215 (2001)), among others. Most recently, expansion of genetic analysis to whole genome levels, particularly for single nucleotide polymorphisms, or SNPs, has created a need for highly multiplexed PCR capabilities. Comparative genome-wide association and candidate gene studies require the ability to genotype between 100,000-500,000 SNPs per individual (Kwok, Molecular Medicine Today, 5: 538-543 (1999); Kwok, Pharmacogenomics, 1: 231 (2000); Risch and Merikangas, Science, 273: 1516 (1996)). Moreover, SNPs in coding or regulatory regions alter gene function in important ways (Cargill et al. Nature Genetics, 22: 231 (1999); Halushka et al., Nature Genetics, 22: 239 (1999)), making these SNPs useful diagnostic tools in personalized medicine (Hagmann, Science, 285: 21 (1999); Cargill et al. Nature Genetics, 22: 231 (1999); Halushka et al., Nature Genetics, 22: 239 (1999)). Likewise, validating the medical association of a set of SNPs previously identified for their potential clinical relevance as part of a diagnostic panel will mean testing thousands of individuals for thousands of markers at a time. [0127]
  • Despite its broad appeal and utility, several factors complicate multiplex PCR amplification. Chief among these is the phenomenon of PCR or amplification bias, in which certain loci are amplified to a greater extent than others. Two classes of amplification bias have been described. One, referred to as PCR drift, is ascribed to stochastic variation in such steps as primer annealing during the early stages of the reaction (Polz and Cavanaugh, Applied and Environmental Microbiology, 64: 3724 (1998)), is not reproducible, and may be more prevalent when very small amounts of target molecules are being amplified (Walsh et al., PCR Methods and Applications, 1: 241 (1992)). The other, referred to as PCR selection, pertains to the preferential amplification of some loci based on primer characteristics, amplicon length, G-C content, and other properties of the genome (Polz, supra). [0128]
  • Another factor affecting the extent to which PCR reactions can be multiplexed is the inherent tendency of PCR reactions to reach a plateau phase. The plateau phase is seen in later PCR cycles and reflects the observation that amplicon generation moves from exponential to pseudo-linear accumulation and then eventually stops increasing. This effect appears to be due to non-specific interactions between the DNA polymerase and the double stranded products themselves. The molar ratio of product to enzyme in the plateau phase is typically consistent for several DNA polymerases, even when different amounts of enzyme are included in the reaction, and is approximately 30:1 product:enzyme. This effect thus limits the total amount of double-stranded product that can be generated in a PCR reaction such that the number of different loci amplified must be balanced against the total amount of each amplicon desired for subsequent analysis, e.g. by gel electrophoresis, primer extension, etc. [0129]
  • Because of these and other considerations, although multiplexed PCR including 50 loci has been reported (Lindblad-Toh et al., Nature Genet. 4: 381 (2000)), multiplexing is typically limited to fewer than ten distinct products. However, given the need to analyze as many as 100,000 to 450,000 SNPs from a single genomic DNA sample there is a clear need for a means of expanding the multiplexing capabilities of PCR reactions. [0130]
  • The present invention provides methods for substantial multiplexing of PCR reactions by, for example, combining the INVADER assay with multiplex PCR amplification. The INVADER assay provides a detection step and signal amplification that allows very large numbers of targets to be detected in a multiplex reaction. As desired, hundreds to thousands to hundreds of thousands of targets may be detected in a multiplex reaction. [0131]
  • Direct genotyping by the INVADER assay typically uses from 5 to 100 ng of human genomic DNA per SNP, depending on detection platform. For a small number of assays, the reactions can be performed directly with genomic DNA without target pre-amplification; however, with more than 100,000 INVADER assays being developed and an even larger number expected for genome-wide association studies, the amount of sample DNA may become a limiting factor. [0132]
  • Because the INVADER assay provides from 10⁶ to 10⁷ fold amplification of signal, multiplexed PCR in combination with the INVADER assay would use only limited target amplification as compared to a typical PCR. Consequently, the low target amplification level alleviates interference between individual reactions in the mixture and reduces the inhibition of PCR by the accumulation of its products, thus providing for more extensive multiplexing. Additionally, it is contemplated that low amplification levels decrease the probability of target cross-contamination and decrease the number of PCR-induced mutations. [0133]
  • Uneven amplification of different loci presents one of the biggest challenges in the development of multiplexed PCR. A difference in amplification factors between two loci may result in a situation where the signal generated by an INVADER reaction with a slow-amplifying locus is below the limit of detection of the assay, while the signal from a fast-amplifying locus is beyond the saturation level of the assay. This problem can be addressed in several ways. In some embodiments, the INVADER reactions can be read at different time points, e.g., in real-time, thus significantly extending the dynamic range of the detection. In other embodiments, multiplex PCR can be performed under conditions that allow different loci to reach more similar levels of amplification. For example, primer concentrations can be limited, thereby allowing each locus to reach a more uniform level of amplification. In yet other embodiments, concentrations of PCR primers can be adjusted to balance amplification factors of different loci. [0134]
  • The present invention provides for the design and characteristics of highly multiplex PCR including hundreds to thousands of products in a single reaction. For example, the target pre-amplification provided by hundred-plex PCR reduces the amount of human genomic DNA required for INVADER-based SNP genotyping to less than 0.1 ng per assay. The specifics of highly multiplex PCR optimization and a computer program for the primer design are described below. [0135]
  • The following discussion provides a description of certain preferred illustrative embodiments of the present invention and is not intended to limit the scope of the present invention. [0136]
  • I. Multiplex PCR Primer Design [0137]
  • The INVADER assay can be used for the detection of single nucleotide polymorphisms (SNPs) with as little as 10-100 ng of genomic DNA without the need for target pre-amplification. However, with more than 50,000 INVADER assays being developed and the potential for whole genome association studies involving hundreds of thousands of SNPs, the amount of sample DNA becomes a limiting factor for large scale analysis. Due to the sensitivity of the INVADER assay on human genomic DNA (hgDNA) without target amplification, multiplex PCR coupled with the INVADER assay requires only limited target amplification (10³-10⁴) as compared to typical multiplex PCR reactions which require extensive amplification (10⁹-10¹²) for conventional gel detection methods. The low level of target amplification used for INVADER™ detection provides for more extensive multiplexing by avoiding amplification inhibition commonly resulting from target accumulation. [0138]
  • The present invention provides methods and selection criteria that allow primer sets for multiplex PCR to be generated (e.g. that can be coupled with a detection assay, such as the INVADER assay). In some embodiments, software applications of the present invention automate multiplex PCR primer selection, thus allowing highly multiplexed PCR with the primers designed thereby. Using the INVADER Medically Associated Panel (MAP) as a corresponding platform for SNP detection, as shown in Example 2, the methods, software, and selection criteria of the present invention allowed accurate genotyping of 94 of the 101 possible amplicons (~93%) from a single PCR reaction. The original PCR reaction used only 10 ng of hgDNA as template, corresponding to less than 150 pg hgDNA per INVADER assay. [0139]
  • FIG. 1 describes the general principles of the INVADER assay. The INVADER assay allows for the simultaneous detection of two distinct alleles in the same reaction using an isothermal, single addition format. (A) Allele discrimination takes place by “structure specific” cleavage of the Probe, releasing a 5′ flap which corresponds to a given polymorphism. (B) In the second reaction, the released 5′ flap mediates signal generation by cleavage of the appropriate FRET cassette. [0140]
  • FIG. 2 illustrates creation of one of the primer pairs (both a forward and reverse primer) for 101 primer sets from sequences available for analysis on the INVADER Medically Associated Panel using one embodiment of the software application of the present invention. FIG. 2A shows a sample input file of a single entry (e.g. shows target sequence information for a single target sequence containing a SNP that is processed by the method and software of the present invention). The target sequence information in FIG. 2 includes Third Wave Technologies' SNP#, short name identifier, and sequence with the SNP location indicated in brackets. FIG. 2B shows the sample output file of the same entry (e.g. shows the target sequence after being processed by the systems and methods and software of the present invention). The output information includes the sequence of the footprint region (capital letters flanking SNP site, showing region where INVADER assay probes hybridize to this target sequence in order to detect the SNP in the target sequence), forward and reverse primer sequences (bold), and their corresponding Tm's. [0141]
  • In some embodiments, the selection of primers to make a primer set capable of multiplex PCR is performed in automated fashion (e.g. by a software application). Automated primer selection for multiplex PCR may be accomplished employing a software program designed as shown by the flow chart in FIG. 4A. [0142]
  • Multiplex PCR commonly requires extensive optimization to avoid biased amplification of select amplicons and the amplification of spurious products resulting from the formation of primer-dimers. In order to avoid these problems, the present invention provides methods and software applications that provide selection criteria to generate a primer set configured for multiplex PCR, and subsequent use in a detection assay (e.g. INVADER detection assays). [0143]
  • In some embodiments, the methods and software applications of the present invention start with user defined sequences and corresponding SNP locations. In certain embodiments, the methods and/or software application determines a footprint region within the target sequence (the minimal amplicon required for INVADER detection) for each sequence (shown in capital letters in FIG. 2B). The footprint region includes the region where assay probes hybridize, as well as any user defined additional bases extending outward therefrom (e.g. 5 additional bases included on each side of where the assay probes hybridize). Next, primers are designed outward from the footprint region and evaluated against several criteria, including the potential for primer-dimer formation with previously designed primers in the current multiplexing set (See, primers in bold in FIG. 2A, and selection steps in FIG. 4A). This process may be continued, as shown in FIG. 4A, through multiple iterations of the same set of sequences until primers against all sequences in the current multiplexing set can be designed. [0144]
  • Once a primer set is designed for multiplex PCR, this set may be employed as shown in the basic workflow scheme shown in FIG. 3. Multiplex PCR may be carried out, for example, under standard conditions using only 10 ng of hgDNA as template. After 10 min at 95° C., Taq (2.5 units) may be added to a 50 ul reaction and PCR carried out for 50 cycles. The PCR reaction may be diluted and loaded directly onto an INVADER MAP plate (3 ul/well) (See FIG. 3). An additional 3 ul of 15 mM MgCl₂ may be added to each reaction on the INVADER MAP plate and covered with 6 ul of mineral oil. The entire plate may then be heated to 95° C. for 5 min. and incubated at 63° C. for 40 min. FAM and RED fluorescence may then be measured on a Cytofluor 4000 fluorescent plate reader and “Fold Over Zero” (FOZ) values calculated for each amplicon. Results from each SNP may be color coded in a table as “pass” (green), “mis-call” (pink), or “no-call” (white) (See, Example 2 below). [0145]
  • In some embodiments, the number of PCR reactions is from about 1 to about 10 reactions. In some embodiments, the number of PCR reactions is from about 10 to about 50 reactions. In further embodiments, the number of PCR reactions is from about 50 to about 100. In additional embodiments, the number of PCR reactions is from about 100 to about 1,000. In still other embodiments, the number of PCR reactions is greater than 1,000. [0146]
  • The present invention also provides methods to optimize multiplex PCR reactions (e.g. once a primer set is generated, the concentration of each primer or primer pair may be optimized). For example, once a primer set has been generated and used in a multiplex PCR at equal molar concentrations, the primers may be evaluated separately so that the optimum concentration of each primer is determined and the multiplex primer set performs better. [0147]
  • Multiplex PCR reactions are being recognized in the scientific, research, clinical and biotechnology industries as potentially time effective and less expensive means of obtaining nucleic acid information compared to standard, monoplex PCR reactions. Instead of performing only a single amplification reaction per reaction vessel (tube or well of a multi-well plate for example), numerous amplification reactions are performed in a single reaction vessel. [0148]
  • The cost per target is theoretically lowered by eliminating technician time in assay set-up and data analysis, and by the substantial reagent savings (especially enzyme cost). Another benefit of the multiplex approach is that far less target sample is required. In whole genome association studies involving hundreds of thousands of single nucleotide polymorphisms (SNPs), the amount of target or test sample is limiting for large scale analysis, so the concept of performing a single reaction, using one sample aliquot to obtain, for example, 100 results, versus using 100 sample aliquots to obtain the same data set is an attractive option. [0149]
  • To design primers for a successful multiplex PCR reaction, the issue of aberrant interaction among primers should be addressed. The formation of primer dimers, even if only a few bases in length, may inhibit both primers from correctly hybridizing to the target sequence. Further, if the dimers form at or near the 3′ ends of the primers, no amplification or very low levels of amplification will occur, since the 3′ end is required for the priming event. Clearly, the more primers utilized per multiplex reaction, the more aberrant primer interactions are possible. The methods, systems and applications of the present invention help prevent primer dimers in large sets of primers, making the set suitable for highly multiplexed PCR. [0150]
  • When designing primer pairs for numerous sites (for example 100 sites in a multiplex PCR reaction), the order in which primer pairs are designed can influence the total number of compatible primer pairs for a reaction. For example, if a first set of primers is designed for a first target region that happens to be an A/T rich target region, these primers will be A/T rich. If the second target region chosen also happens to be an A/T rich target region, it is far more likely that the primers designed for these two sets will be incompatible due to aberrant interactions, such as primer dimers. If, however, the second target region chosen is not A/T rich, it is much more likely that a primer set can be designed that will not interact with the first A/T rich set. For any given set of input target sequences, the present invention randomizes the order in which primer sets are designed (See, FIG. 4A). Furthermore, in some embodiments, the present invention re-orders the set of input target sequences in a plurality of different, random orders to maximize the number of compatible primer sets for any given multiplex reaction (See, FIG. 4A). [0151]
  • The present invention provides criteria for primer design that minimize 3′ interactions while maximizing the number of compatible primer pairs for a given set of reaction targets in a multiplex design. For primers described as 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, N[1] is an A or C (in alternative embodiments, N[1] is a G or T). N[2]-N[1] of each of the forward and reverse primers designed should not be complementary to N[2]-N[1] of any other oligonucleotide. In certain embodiments, N[3]-N[2]-N[1] should not be complementary to N[3]-N[2]-N[1] of any other oligonucleotide. In preferred embodiments, if these criteria are not met at a given N[1], the next base in the 5′ direction for the forward primer or the next base in the 3′ direction for the reverse primer may be evaluated as an N[1] site. This process is repeated, in conjunction with the target randomization, until all criteria are met for all, or a large majority of, the target sequences (e.g. 95% of target sequences can have primer pairs made for the primer set that fulfill these criteria). [0152]
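The 3′ end criteria above can be illustrated with a minimal sketch. It assumes that "complementary" means the last n bases of one primer equal the reverse complement of the last n bases of another (i.e. the two 3′ ends could anneal antiparallel); the function names and example sequences are hypothetical and are not taken from the software described herein.

```python
# Illustrative sketch only; names and sequences are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def three_prime_end_ok(candidate, pool, n=2, allowed_n1="AC"):
    """Check a candidate primer (5'->3') against primers already in the pool.

    Rule 1: N[1] (the 3' terminal base) must be in `allowed_n1`.
    Rule 2: the last `n` bases must not be able to pair antiparallel with
            the last `n` bases of any pooled primer (assumed reading of
            "complementary"). Use n=3 for the N[3]-N[2]-N[1] variant.
    """
    candidate = candidate.upper()
    if candidate[-1] not in allowed_n1:
        return False
    tail_rc = reverse_complement(candidate[-n:])
    return all(other.upper()[-n:] != tail_rc for other in pool)


pool = ["CCATGGTTACGA"]                          # pooled primer ends in ...GA-3'
print(three_prime_end_ok("ACGTTGCAGTAC", pool))  # ends in AC: compatible
print(three_prime_end_ok("ACGTTGCAGTTC", pool))  # ends in TC, which pairs with GA
```

Restricting N[1] to A or C (or, alternatively, to G or T) means that no two 3′ terminal bases in the set can pair with each other, since A pairs only with T and C only with G.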
  • Another challenge to be overcome in a multiplex primer design is the balance between actual, required nucleotide sequence, sequence length, and the oligonucleotide melting temperature (Tm) constraints. Importantly, since the primers in a multiplex primer set in a reaction should function under the same reaction conditions of buffer, salts and temperature, they need therefore to have substantially similar Tm's, regardless of GC or AT richness of the region of interest. The present invention allows for primer designs which meet minimum Tm and maximum Tm requirements and minimum and maximum length requirements. For example, in the formula for each primer 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, x is selected such that the primer has a predetermined melting temperature (e.g. bases are included in the primer until the primer has a calculated melting temperature of about 50 degrees Celsius). [0153]
  • Often the products of a PCR reaction are used as the target material for another nucleic acid detection means, such as a hybridization-type detection assay, or the INVADER reaction assays for example. Consideration should be given to the location of primer placement to allow for the secondary reaction to successfully occur, and again, aberrant interactions between amplification primers and secondary reaction oligonucleotides should be minimized for accurate results and data. Selection criteria may be employed such that the primers designed for a multiplex primer set do not react (e.g. hybridize with, or trigger reactions) with oligonucleotide components of a detection assay. For example, in order to prevent primers from reacting with the FRET oligonucleotide of a bi-plex INVADER assay, certain homology criteria are employed. In particular, if each of the primers in the set is defined as 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, then N[4]-N[3]-N[2]-N[1]-3′ is selected such that it is less than 90% homologous with the FRET or INVADER oligonucleotides. In other embodiments, N[4]-N[3]-N[2]-N[1]-3′ is selected for each primer such that it is less than 80% homologous with the FRET or INVADER oligonucleotides. In certain embodiments, N[4]-N[3]-N[2]-N[1]-3′ is selected for each primer such that it is less than 70% homologous with the FRET or INVADER oligonucleotides. [0154]
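One possible reading of the homology criterion can be sketched as follows. The sketch assumes that "homologous" means percent identity in the best ungapped alignment of the four 3′ terminal bases against an assay oligonucleotide; this interpretation, the function names, and the example sequences are all assumptions made for illustration.

```python
# Illustrative sketch only; the identity-based reading of "homologous" and
# all sequences below are assumptions, not taken from the description.
def max_window_identity(tail, oligo):
    """Best fraction of matching positions as `tail` slides, ungapped, along `oligo`."""
    tail, oligo = tail.upper(), oligo.upper()
    k = len(tail)
    best = 0.0
    for i in range(len(oligo) - k + 1):
        matches = sum(a == b for a, b in zip(tail, oligo[i:i + k]))
        best = max(best, matches / k)
    return best


def tail_passes(primer, assay_oligos, tail_len=4, threshold=0.90):
    """True if N[4]-N[3]-N[2]-N[1] is below `threshold` identity to every assay oligo."""
    tail = primer[-tail_len:]
    return all(max_window_identity(tail, oligo) < threshold for oligo in assay_oligos)


fret_like = ["AACGAGGCGCAC"]  # hypothetical assay oligonucleotide
print(tail_passes("TTTTTTTTGCAC", fret_like))                  # exact 4-base match
print(tail_passes("TTTTTTTTGCAT", fret_like, threshold=0.80))  # 3 of 4 bases match
print(tail_passes("TTTTTTTTGCAT", fret_like, threshold=0.70))  # 3 of 4 bases match
```

Under this reading a four-base tail can only score 0%, 25%, 50%, 75% or 100%, so the 90% and 80% thresholds both exclude only exact matches, while the 70% threshold also excludes tails that match at three of four positions.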
  • While employing the criteria of the present invention to develop a primer set, some primer pairs may not meet all of the stated criteria (these may be rejected as errors). For example, in a set of 100 targets, 30 are designed and meet all listed criteria; however, set 31 fails. In the method of the present invention, set 31 may be flagged as failing, and the method could continue through the list of 100 targets, again flagging those sets which do not meet the criteria (See FIG. 4A). Once all 100 targets have had a chance at primer design, the method would note the number of failed sets, re-order the 100 targets in a new random order and repeat the design process (See, FIG. 4A). After a configurable number of runs, the set with the most passed primer pairs (the least number of failed sets) is chosen for the multiplex PCR reaction (See FIG. 4A). [0155]
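The flag-and-retry procedure described above can be sketched as a loop over random orderings. In this sketch `try_design` is a hypothetical stand-in for the per-target primer design step, and the toy candidate and conflict tables exist only to show why the design order can matter; none of these names come from the description.

```python
# Illustrative sketch only; try_design, CANDIDATES and CONFLICTS are hypothetical.
import random


def design_run(targets, try_design, rng):
    """One pass over the targets in a random order; returns (pool, failed)."""
    order = list(targets)
    rng.shuffle(order)
    pool, failed = [], []
    for target in order:
        pair = try_design(target, pool)
        if pair is None:
            failed.append(target)        # flag the target and continue
        else:
            pool.append((target, pair))
    return pool, failed


def best_of_runs(targets, try_design, runs=10, seed=0):
    """Repeat the design in new random orders; keep the run with the fewest failures."""
    rng = random.Random(seed)
    best = None
    for _ in range(runs):
        pool, failed = design_run(targets, try_design, rng)
        if best is None or len(failed) < len(best[1]):
            best = (pool, failed)
        if not failed:                   # every target has a primer pair
            break
    return best


# Toy data: target X has two candidate pairs, target Y has one, and
# candidate "a" is incompatible with candidate "c".
CANDIDATES = {"X": ["a", "b"], "Y": ["c"]}
CONFLICTS = {frozenset(("a", "c"))}


def toy_design(target, pool):
    used = [pair for _, pair in pool]
    for cand in CANDIDATES[target]:
        if all(frozenset((cand, u)) not in CONFLICTS for u in used):
            return cand
    return None


pool, failed = best_of_runs(["X", "Y"], toy_design, runs=50)
print(pool, failed)
```

In the toy data, designing X first selects candidate "a" and leaves Y with no compatible pair, whereas designing Y first lets X fall back to candidate "b"; repeated random orderings allow the second outcome to be found.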
  • FIG. 4A shows a flow chart with the basic flow of certain embodiments of the methods and software application of the present invention. In preferred embodiments, the processes detailed in FIG. 4A are incorporated into a software application for ease of use (although, the methods may also be performed manually using, for example, FIG. 4A as a guide). [0156]
  • Target sequences and/or primer pairs are entered into the system shown in FIG. 4A. The first set of boxes show how target sequences are added to the list of sequences that have a footprint determined (See “B” in FIG. 4A), while other sequences are passed immediately into the primer set pool (e.g. PDPass, those sequences that have been previously processed and shown to work together without forming primer dimers or having reactivity to FRET sequences), as well as DimerTest entries (e.g. a pair of primers a user wants to use, but that has not been tested yet for primer dimer or FRET reactivity). In other words, the initial set of boxes leading up to “end of input” sort the sequences so they can be later processed properly. [0157]
  • Starting at “A” in FIG. 4A, the primer pool is basically cleared or “emptied” to start a fresh run. The target sequences are then sent to “B” to be processed, and DimerTest pairs are sent to “C” to be processed. Target sequences are sent to “B”, where a user or software application determines the footprint region for the target sequence (e.g. where the assay probes will hybridize in order to detect the mutation (e.g. SNP) in the target sequence). This region is generally shown in capital letters in figures, such as FIG. 2B. It is important to design this region (which the user may further expand by defining that additional bases past the hybridization region be added) such that the primers that are designed fully encompass this region. In FIG. 4A, the software application INVADER CREATOR is used to design the INVADER oligonucleotide and downstream probes that will hybridize with the target region (although any type of program or system could be used to create any type of probes a user was interested in designing probes for, and thus determining the footprint region for on the target sequence). Thus the core footprint region is then defined by the location of these two assay probes on the target. [0158]
  • Next, the system starts from the 5′ edge of the footprint and travels in the 5′ direction until the first base is reached, or until the first A or C (or G or T) is reached. This is set as the initial starting point for defining the sequence of the forward primer (i.e., this serves as the initial N[1] site). From this initial N[1] site, the sequence of the forward primer is the same as the bases encountered on the target region. For example, if the default size of the primer is set as 12 bases, the system starts with the base selected as N[1] and then adds the next 11 bases found in the target sequence. This 12-mer primer is then tested for melting temperature (e.g., using INVADER CREATOR), and additional bases are added from the target sequence until the melting temperature falls between the default minimum and maximum melting temperatures designated by the user (e.g., at least about 50 degrees Celsius, and not more than 55 degrees Celsius). For example, the system employs the formula 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, where x is initially 12. The system then adjusts x to a higher number (i.e., longer sequences) until the pre-set melting temperature is reached. In certain embodiments, a maximum primer size is employed as a default parameter to serve as an upper limit on the length of the primers designed. In some embodiments, the maximum primer size is about 30 bases (e.g., 29 bases, 30 bases, or 31 bases). In other embodiments, the default settings (e.g., minimum and maximum primer size, and minimum and maximum Tm) can be modified using standard database manipulation tools. [0159]
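  • As an illustration only, the primer-growing step described above can be sketched in Python. The function names (rough_tm, find_n1, grow_forward_primer) are hypothetical, and the Tm estimate (2 degrees per A/T, 4 degrees per G/C) is a simple stand-in assumed for this sketch; it is not the melting temperature calculation performed by INVADER CREATOR. The defaults (12-base start, 30-base maximum, 50 to 55 degree window) are the example values given in the text.

```python
from typing import Callable, Optional


def rough_tm(seq: str) -> float:
    """Stand-in Tm estimate (assumption): 2 degrees per A/T, 4 per G/C."""
    s = seq.upper()
    return 2.0 * (s.count("A") + s.count("T")) + 4.0 * (s.count("G") + s.count("C"))


def find_n1(target: str, footprint_start: int,
            allowed: Optional[str] = None) -> Optional[int]:
    """Walk 5' from the footprint edge to the first base, or to the
    first base found in `allowed` (e.g. "AC"). Returns an index or None."""
    i = footprint_start - 1
    while i >= 0:
        if allowed is None or target[i].upper() in allowed:
            return i
        i -= 1
    return None


def grow_forward_primer(target: str, n1: int, min_len: int = 12,
                        max_len: int = 30, tm_min: float = 50.0,
                        tm_max: float = 55.0,
                        tm_func: Callable[[str], float] = rough_tm) -> Optional[str]:
    """Build 5'-N[x]...N[1]-3' ending at index n1, raising x from min_len
    until the Tm falls inside [tm_min, tm_max]. Returns None on failure."""
    for x in range(min_len, max_len + 1):
        start = n1 - x + 1
        if start < 0:
            return None              # ran off the 5' end of the target
        primer = target[start:n1 + 1]
        tm = tm_func(primer)
        if tm > tm_max:
            return None              # overshot the window without landing in it
        if tm >= tm_min:
            return primer
    return None                      # maximum primer size reached
```

  • Passing the Tm function as a parameter keeps the loop independent of how the melting temperature is actually computed, so a nearest-neighbor calculation could be substituted without changing the rest of the sketch.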
  • The next box in FIG. 4A is used to determine whether the primer that has been designed so far will cause primer-dimer and/or FRET reactivity (e.g., with the other sequences already in the pool). The criteria used for this determination are explained above. If the primer passes this step, the forward primer is added to the primer pool. However, if the forward primer fails these criteria, as shown in FIG. 4A, the starting point (N[1]) is moved one nucleotide in the 5′ direction (or to the next A or C, or next G or T). The system first checks that shifting over leaves enough room on the target sequence to successfully make a primer. If so, the system loops back and checks this new primer for melting temperature. However, if no sequence can be designed, then the target sequence is flagged as an error (e.g., indicating that no forward primer can be made for this target). [0160]
  • This same process is then repeated for designing the reverse primer, as shown in FIG. 4A. If a reverse primer is successfully made, then the pair of primers is put into the primer pool, and the system goes back to “B” (if there are more target sequences to process), or goes on to “C” to test DimerTest pairs. [0161]
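  • The shift-and-retry loop for a single primer can be sketched as follows. This is a minimal sketch under stated assumptions: the compatibility test used here (the last k bases at the 3′ end of one primer being able to pair with any stretch of another primer, or of itself) is a hypothetical stand-in for the primer-dimer and FRET reactivity criteria, and a fixed primer length replaces the melting temperature step so that the loop itself stays visible. All function names are illustrative.

```python
from typing import List, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def ends_can_pair(a: str, b: str, k: int = 4) -> bool:
    """Stand-in test (assumption): the 3' terminal k-mer of `a`
    is complementary to some k-mer within `b`."""
    return reverse_complement(a[-k:]) in b.upper()


def compatible_with_pool(primer: str, pool: List[str], k: int = 4) -> bool:
    """Check the candidate against every pooled primer and against itself."""
    for other in pool + [primer]:
        if ends_can_pair(primer, other, k) or ends_can_pair(other, primer, k):
            return False
    return True


def pick_primer(target: str, n1: int, pool: List[str],
                length: int = 12, k: int = 4) -> Optional[str]:
    """Try a primer ending at N[1]; on failure move N[1] one base in the
    5' direction, as long as enough target remains to build a primer."""
    while n1 - length + 1 >= 0:              # enough room left on the target?
        candidate = target[n1 - length + 1:n1 + 1].upper()
        if compatible_with_pool(candidate, pool, k):
            return candidate
        n1 -= 1                              # shift N[1] toward the 5' end
    return None                              # caller flags the target as an error
```

  • A return value of None corresponds to the error flag described above; the caller would record the target as one for which no primer could be made.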
  • Starting at “C”, FIG. 4A shows how primer pairs that are entered as primers (DimerTest) are processed by the system. If there are no DimerTest pairs, as shown in FIG. 4A, the system goes on to “D”. However, if there are DimerTest pairs, these are tested for primer-dimer and/or FRET reactivity as described above. If a DimerTest pair fails these criteria, it is flagged as an error. If a DimerTest pair passes the criteria, it is added to the primer set pool, and then the system goes back to “C” if there are more DimerTest pairs to be evaluated, or goes on to “D” if there are no more DimerTest pairs to be evaluated. [0162]
  • Starting at “D” in FIG. 4A, the pool of primers that has been created is evaluated. The first step in this section is to examine the number of errors (failures) generated by this particular randomized run of sequences. If there were no errors, this set is the best set and may be outputted to a user. If there are more than zero errors, the system compares this run to any previous runs to see which run resulted in the fewest errors. If the current run has fewer errors, it is designated as the current best set. At this point, the system may go back to “A” to start the run over with another randomized set of the same sequences, or the pre-set maximum number of runs (e.g., 5 runs) may have been reached on this run (e.g., this was the 5th run, and the maximum number of runs was set as 5). If the maximum has been reached, then the current best set is outputted as the best set. This best set of primers may then be used to generate a physical set of oligonucleotides such that a multiplex PCR reaction may be carried out. [0163]
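  • The evaluation at “D” amounts to keeping the run with the fewest errors over a bounded number of randomized runs. The sketch below illustrates that control flow only; best_of_runs and the design_one callback (which returns a primer set for one target, or None on failure) are hypothetical names introduced for this example, and the default of 5 runs is the example value from the text.

```python
import random
from typing import Callable, List, Optional, Tuple


def best_of_runs(targets: List[str],
                 design_one: Callable[[str, List[str]], Optional[str]],
                 max_runs: int = 5,
                 seed: Optional[int] = None) -> Tuple[List[str], List[str]]:
    """Repeat the design in randomized target order and keep the run
    that produced the fewest errors. Returns (best_pool, best_errors)."""
    rng = random.Random(seed)
    best_pool: List[str] = []
    best_errors: Optional[List[str]] = None
    for _ in range(max_runs):
        order = list(targets)
        rng.shuffle(order)                 # a new randomized run ("A")
        pool: List[str] = []               # the pool starts empty each run
        errors: List[str] = []
        for target in order:
            result = design_one(target, pool)
            if result is None:
                errors.append(target)      # flagged as an error
            else:
                pool.append(result)
        if best_errors is None or len(errors) < len(best_errors):
            best_pool, best_errors = pool, errors
        if not errors:
            break                          # an error-free run is the best set
    return best_pool, best_errors if best_errors is not None else []
```

  • Randomizing the order matters because each primer is checked against whatever is already in the pool, so a different processing order can lead to a different number of failures.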
  • Another challenge to be overcome with multiplex PCR reactions is the unequal amplicon concentrations that result from a standard multiplex reaction. The different loci targeted for amplification may each behave differently in the amplification reaction, yielding vastly different concentrations of each of the different amplicon products. The present invention provides methods, systems, software applications, computer systems, and a computer data storage medium that may be used to adjust primer concentrations relative to a first detection assay read (e.g., INVADER assay read), and then, with balanced primer concentrations, to approach substantially equal concentrations of the different amplicons. [0164]
  • The concentrations for various primer pairs may be determined experimentally. In some embodiments, a first run is conducted with all of the primers in equimolar concentrations. Time reads are then conducted. Based upon the time reads, the relative amplification factors for each amplicon are determined. Then, based upon a unifying correction equation, an estimate is obtained of what each primer concentration should be to bring the signals closer together at the same time point. These detection assays can be on arrays of different sizes (e.g., 384-well plates). [0165]
  • It is appreciated that combining the invention with detection assays and arrays of detection assays provides substantial processing efficiencies. Employing a balanced mix of primers or primer pairs created using the invention, a single point read can be carried out so that an average user can obtain great efficiencies in conducting tests that require high sensitivity and specificity across an array of different targets. [0166]
  • Having optimized primer pair concentrations in a single reaction vessel allows the user to conduct amplification for a plurality or multiplicity of amplification targets in a single reaction vessel and in a single step. The yield of the single-step process is then used to successfully obtain test result data for, for example, several hundred assays. For example, each well on a 384-well plate can have a different detection assay thereon. The single-step multiplex PCR reaction amplifies 384 different targets of genomic DNA and provides 384 test results for each plate. Where each well has a plurality of assays, even greater efficiencies can be obtained. [0167]
  • Therefore, the present invention provides the use of the concentration of each primer set in highly multiplexed PCR as a parameter to achieve an unbiased amplification of each PCR product. Any PCR includes primer annealing and primer extension steps. Under standard PCR conditions, a high concentration of primers, on the order of 1 μM, ensures fast kinetics of primer annealing, while the optimal time of the primer extension step depends on the size of the amplified product and can be much longer than the annealing step. By reducing primer concentration, the primer annealing kinetics can become the rate-limiting step, and the PCR amplification factor should then strongly depend on primer concentration, the association rate constant of the primers, and the annealing time. [0168]
  • The binding of primer P with target T can be described by the following model: [0169]
    $$P + T \xrightarrow{k_a} PT \tag{1}$$
  • where $k_a$ is the association rate constant of primer annealing. We assume that annealing occurs at temperatures below the primer melting temperature, so that the reverse reaction can be ignored. [0170]
  • The solution for these kinetics under conditions of primer excess is well known: [0171]
  • $$[PT] = T_0\left(1 - e^{-k_a c t}\right) \tag{2}$$
  • where $[PT]$ is the concentration of target molecules associated with primer, $T_0$ is the initial target concentration, $c$ is the initial primer concentration, and $t$ is the primer annealing time. Assuming that each target molecule associated with primer is replicated to produce a full-size PCR product, the target amplification factor in a single PCR cycle is [0172]
    $$Z = \frac{T_0 + [PT]}{T_0} = 2 - e^{-k_a c t} \tag{3}$$
  • The total PCR amplification factor after $n$ cycles is given by [0173]
  • $$F = Z^n = \left(2 - e^{-k_a c t}\right)^n \tag{4}$$
  • As follows from equation 4, under conditions where the primer annealing kinetics are the rate-limiting step of PCR, the amplification factor should strongly depend on primer concentration. Thus, biased loci amplification, whether it is caused by individual association rate constants, primer extension steps, or any other factors, can be corrected by adjusting the primer concentration for each primer set in the multiplex PCR. The adjusted primer concentrations can also be used to correct biased performance of the INVADER assay used for analysis of PCR pre-amplified loci. Employing this basic principle, the present invention has demonstrated a linear relationship between amplification efficiency and primer concentration and used this equation to balance primer concentrations of different amplicons, resulting in the equal amplification of ten different amplicons in Example 1. This technique may be employed on any size set of multiplex primer pairs. [0174]
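  • To make the dependence on primer concentration concrete, the amplification factor F = (2 − exp(−k_a·c·t))^n can be evaluated numerically and inverted to give the primer concentration needed for a desired factor. The sketch below is illustrative only: the function names are hypothetical, and the numerical values used in any call (rate constant, annealing time, cycle number) are assumptions chosen for the example rather than values from this disclosure.

```python
import math


def amplification_factor(k_a: float, c: float, t: float, n: int) -> float:
    """Total amplification after n cycles: F = (2 - exp(-k_a * c * t)) ** n.
    k_a in 1/(M*s), c in M, t in s (units assumed for this sketch)."""
    z = 2.0 - math.exp(-k_a * c * t)       # per-cycle factor Z
    return z ** n


def primer_conc_for_factor(k_a: float, t: float, n: int, f_target: float) -> float:
    """Invert F for c: c = -ln(2 - F**(1/n)) / (k_a * t).
    Valid only for 1 <= F < 2**n, since Z lies between 1 and 2."""
    z = f_target ** (1.0 / n)
    if not 1.0 <= z < 2.0:
        raise ValueError("target factor must satisfy 1 <= F < 2**n")
    return -math.log(2.0 - z) / (k_a * t)
```

  • For small values of k_a·c·t, exp(−k_a·c·t) is approximately 1 − k_a·c·t, so the per-cycle factor is approximately 1 + k_a·c·t, which is linear in the primer concentration c.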
  • II. Detection Assay Design [0175]
  • The following section describes detection assays that may be employed with the present invention. For example, many different assays may be used to determine the footprint on the target nucleic sequence, and then used as the detection assay run on the output of the multiplex PCR (or the detection assays may be run simultaneously with the multiplex PCR reaction). [0176]
  • There are a wide variety of detection technologies available for determining the sequence of a target nucleic acid at one or more locations. For example, there are numerous technologies available for detecting the presence or absence of SNPs. Many of these techniques require the use of an oligonucleotide to hybridize to the target. Depending on the assay used, the oligonucleotide is then cleaved, elongated, ligated, disassociated, or otherwise altered, wherein its behavior in the assay is monitored as a means for characterizing the sequence of the target nucleic acid. A number of these technologies are described in detail, in Section IV, below. [0177]
  • The present invention provides systems and methods for the design of oligonucleotides for use in detection assays. In particular, the present invention provides systems and methods for the design of oligonucleotides that successfully hybridize to appropriate regions of target nucleic acids (e.g., regions of target nucleic acids that do not contain secondary structure) under the desired reaction conditions (e.g., temperature, buffer conditions, etc.) for the detection assay. The systems and methods also allow for the design of multiple different oligonucleotides (e.g., oligonucleotides that hybridize to different portions of a target nucleic acid or that hybridize to two or more different target nucleic acids) that all function in the detection assay under the same or substantially the same reaction conditions. These systems and methods may also be used to design control samples that work under the experimental reaction conditions. [0178]
  • While the systems and methods of the present invention are not limited to any particular detection assay, the following description illustrates the invention when used in conjunction with the INVADER assay (Third Wave Technologies, Madison Wis.; See e.g., U.S. Pat. Nos. 5,846,717, 5,985,557, 5,994,069, and 6,001,567, PCT Publications WO 97/27214 and WO 98/42873, and de Arruda et al., Expert. Rev. Mol. Diagn. 2(5), 487-496 (2002), all of which are incorporated herein by reference in their entireties) to detect a SNP. The INVADER assay provides ease-of-use and sensitivity levels that, when used in conjunction with the systems and methods of the present invention, find use in detection panels, ASRs, and clinical diagnostics. One skilled in the art will appreciate that specific and general features of this illustrative example are generally applicable to other detection assays. [0179]
  • A. INVADER Assay [0180]
  • The INVADER assay provides means for forming a nucleic acid cleavage structure that is dependent upon the presence of a target nucleic acid and cleaving the nucleic acid cleavage structure so as to release distinctive cleavage products. 5′ nuclease activity, for example, is used to cleave the target-dependent cleavage structure and the resulting cleavage products are indicative of the presence of specific target nucleic acid sequences in the sample. When two strands of nucleic acid, or oligonucleotides, both hybridize to a target nucleic acid strand such that they form an overlapping invasive cleavage structure, as described below, invasive cleavage can occur. Through the interaction of a cleavage agent (e.g., a 5′ nuclease) and the upstream oligonucleotide, the cleavage agent can be made to cleave the downstream oligonucleotide at an internal site in such a way that a distinctive fragment is produced. [0181]
  • The INVADER assay provides detection assays in which the target nucleic acid is reused or recycled during multiple rounds of hybridization with oligonucleotide probes and cleavage of the probes without the need to use temperature cycling (i.e., for periodic denaturation of target nucleic acid strands) or nucleic acid synthesis (i.e., for the polymerization-based displacement of target or probe nucleic acid strands). When a cleavage reaction is run under conditions in which the probes are continuously replaced on the target strand (e.g., through probe-probe displacement, through an equilibrium between probe/target association and disassociation, or through a combination comprising these mechanisms (Reynaldo, et al., J. Mol. Biol. 97: 511-520 [2000])), multiple probes can hybridize to the same target, allowing multiple cleavages, and the generation of multiple cleavage products. [0182]
  • B. Oligonucleotide Design for the INVADER Assay [0183]
  • In some embodiments where an oligonucleotide is designed for use in the INVADER assay to detect a SNP, the sequence(s) of interest are entered into the INVADERCREATOR program (Third Wave Technologies, Madison, Wis.). As described above, sequences may be input for analysis from any number of sources, either directly into the computer hosting the INVADERCREATOR program, or via a remote computer linked through a communication network (e.g., a LAN, Intranet or Internet network). The program designs probes for both the sense and antisense strand. Strand selection is generally based upon the ease of synthesis, minimization of secondary structure formation, and manufacturability. In some embodiments, the user chooses the strand for which sequences are to be designed. In other embodiments, the software automatically selects the strand. By incorporating thermodynamic parameters for optimum probe cycling and signal generation (Allawi and SantaLucia, Biochemistry, 36:10581 [1997]), oligonucleotide probes may be designed to operate at a pre-selected assay temperature (e.g., 63° C.). Based on these criteria, a final probe set (e.g., primary probes for 2 alleles and an INVADER oligonucleotide) is selected. [0184]
  • In some embodiments, the INVADERCREATOR system is a web-based program with secure site access that contains a link to BLAST (available at the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health website) and that can be linked to RNAstructure (Mathews et al., RNA 5:1458 [1999]), a software program that incorporates mfold (Zuker, Science, 244:48 [1989]). RNAstructure tests the proposed oligonucleotide designs generated by INVADERCREATOR for potential uni- and bimolecular complex formation. INVADERCREATOR is open database connectivity (ODBC)-compliant and uses the Oracle database for export/integration. The INVADERCREATOR system was configured with Oracle to work well with UNIX systems, as most genome centers are UNIX-based. [0185]
  • In some embodiments, the INVADERCREATOR analysis is provided on a separate server (e.g., a Sun server) so it can handle analysis of large batch jobs. For example, a customer can submit up to 2,000 SNP sequences in one email. The server passes the batch of sequences on to the INVADERCREATOR software, and, when initiated, the program designs detection assay oligonucleotide sets. In some embodiments, probe set designs are returned to the user within 24 hours of receipt of the sequences. [0186]
  • Each INVADER reaction includes at least two target sequence-specific, unlabeled oligonucleotides for the primary reaction: an upstream INVADER oligonucleotide and a downstream Probe oligonucleotide. The INVADER oligonucleotide is generally designed to bind stably at the reaction temperature, while the probe is designed to freely associate and disassociate with the target strand, with cleavage occurring only when an uncut probe hybridizes adjacent to an overlapping INVADER oligonucleotide. In some embodiments, the probe includes a 5′ flap or “arm” that is not complementary to the target, and this flap is released from the probe when cleavage occurs. In some embodiments, the released flap participates as an INVADER oligonucleotide in a secondary reaction. [0187]
  • The following discussion provides one example of how a user interface for an INVADERCREATOR program may be configured. [0188]
  • The user opens a work screen (FIG. 8), e.g., by clicking on an icon on a desktop display of a computer (e.g., a Windows desktop). The user enters information related to the target sequence for which an assay is to be designed. In some embodiments, the user enters a target sequence. In other embodiments, the user enters a code or number that causes retrieval of a sequence from a database. In still other embodiments, additional information may be provided, such as the user's name, an identifying number associated with a target sequence, and/or an order number. In preferred embodiments, the user indicates (e.g. via a check box or drop down menu) that the target nucleic acid is DNA or RNA. In other preferred embodiments, the user indicates the species from which the nucleic acid is derived. In particularly preferred embodiments, the user indicates whether the design is for monoplex (i.e., one target sequence or allele per reaction) or multiplex (i.e., multiple target sequences or alleles per reaction) detection. When the requisite choices and entries are complete, the user starts the analysis process. In one embodiment, the user clicks a “Go Design It” button to continue. [0189]
  • In some embodiments, the software validates the field entries before proceeding. In some embodiments, the software verifies that any required fields are completed with the appropriate type of information. In other embodiments, the software verifies that the input sequence meets selected requirements (e.g., minimum or maximum length, DNA or RNA content). If entries in any field are not found to be valid, an error message or dialog box may appear. In preferred embodiments, the error message indicates which field is incomplete and/or incorrect. Once a sequence entry is verified, the software proceeds with the assay design. [0190]
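  • A field check of the kind described above might look like the following sketch. The function name, the error message wording, and the length limits (20 and 2000 bases) are assumptions made for illustration; the text does not specify them.

```python
from typing import List


def validate_entry(sequence: str, molecule: str,
                   min_len: int = 20, max_len: int = 2000) -> List[str]:
    """Return a list of error messages naming the offending field;
    an empty list means the entry may proceed to assay design."""
    alphabets = {"DNA": set("ACGT"), "RNA": set("ACGU")}
    alphabet = alphabets.get(molecule)
    if alphabet is None:
        return ["molecule: must be 'DNA' or 'RNA'"]

    errors: List[str] = []
    seq = "".join(sequence.split()).upper()   # drop whitespace and line breaks
    if not seq:
        errors.append("sequence: required field is empty")
    elif not min_len <= len(seq) <= max_len:
        errors.append(f"sequence: length {len(seq)} outside {min_len}-{max_len}")
    bad = sorted(set(seq) - alphabet)
    if bad:
        errors.append(f"sequence: characters not valid for {molecule}: {''.join(bad)}")
    return errors
```

  • Returning every problem found, rather than stopping at the first, matches the preference stated above for messages that indicate which field is incomplete and/or incorrect.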
  • In some embodiments, the information supplied in the order entry fields specifies what type of design will be created. In preferred embodiments, the target sequence and multiplex check box specify which type of design to create. Design options include but are not limited to SNP assay, Multiplexed SNP assay (e.g., wherein probe sets for different alleles are to be combined in a single reaction), Multiple SNP assay (e.g., wherein an input sequence has multiple sites of variation for which probe sets are to be designed), and Multiple Probe Arm assays. [0191]
  • In some embodiments, the INVADERCREATOR software is started via a Web Order Entry (WebOE) process (i.e., through an Intra/Internet browser interface) and these parameters are transferred from the WebOE via applet <param> tags, rather than entered through menus or check boxes. [0192]
  • In the case of Multiple SNP Designs, the user chooses two or more designs to work with. In some embodiments, this selection opens a new screen view (e.g., a Multiple SNP Design Selection view, FIG. 9). In some embodiments, the software creates designs for each locus in the target sequence, scoring each, and presents them to the user in this screen view. The user can then choose any two designs to work with. In some embodiments, the user chooses a first and second design (e.g., via a menu or buttons) and clicks a “Go Design It” button to continue. [0193]
  • To select a probe sequence that will perform optimally at a pre-selected reaction temperature, the melting temperature (Tm) of the SNP to be detected is calculated using the nearest-neighbor model and published parameters for DNA duplex formation (Allawi and SantaLucia, Biochemistry, 36:10581 [1997]). In embodiments wherein the target strand is RNA, parameters appropriate for RNA/DNA heteroduplex formation may be used. Because the assay's salt concentrations are often different than the solution conditions in which the nearest-neighbor parameters were obtained (1M NaCl and no divalent metals), and because the presence and concentration of the enzyme influence optimal reaction temperature, an adjustment should be made to the calculated Tm to determine the optimal temperature at which to perform a reaction. One way of compensating for these factors is to vary the value provided for the salt concentration within the melting temperature calculations. This adjustment is termed a ‘salt correction’. As used herein, the term “salt correction” refers to a variation made in the value provided for a salt concentration for the purpose of reflecting the effect on a Tm calculation for a nucleic acid duplex of a non-salt parameter or condition affecting said duplex. Variation of the values provided for the strand concentrations will also affect the outcome of these calculations. By using a value of 0.5 M NaCl (SantaLucia, Proc Natl Acad Sci U S A, 95:1460 [1998]) and strand concentrations of about 1 mM of the probe and 1 fM target, the algorithm used for calculating probe-target melting temperature has been adapted for use in predicting optimal INVADER assay reaction temperature. For a set of 30 probes, the average deviation between optimal assay temperatures calculated by this method and those experimentally determined is about 1.5° C. [0194]
  • The length of the downstream probe to a given SNP is defined by the temperature selected for running the reaction (e.g., 63° C.). Starting from the position of the variant nucleotide on the target DNA (the target base that is paired to the probe nucleotide 5′ of the intended cleavage site), and adding on the 3′ end, an iterative procedure is used by which the length of the target-binding region of the probe is increased by one base pair at a time until a calculated optimal reaction temperature (Tm plus salt correction to compensate for enzyme effect) matching the desired reaction temperature is reached. The non-complementary arm of the probe is preferably selected to allow the secondary reaction to cycle at the same reaction temperature. The entire probe oligonucleotide is screened using programs such as mfold (Zuker, Science, 244: 48 [1989]) or Oligo 5.0 (Rychlik and Rhoads, Nucleic Acids Res, 17: 8543 [1989]) for the possible formation of dimer complexes or secondary structures that could interfere with the reaction. The same principles are also followed for INVADER oligonucleotide design. Briefly, starting from the position N on the target DNA, the 3′ end of the INVADER oligonucleotide is designed to have a nucleotide not complementary to either allele suspected of being contained in the sample to be tested. The mismatch does not adversely affect cleavage (Lyamichev et al., Nature Biotechnology, 17: 292 [1999]), and it can enhance probe cycling, presumably by minimizing coaxial stabilization effects between the two probes. Additional residues complementary to the target DNA starting from residue N-1 are then added in the 5′ direction until the stability of the INVADER oligonucleotide-target hybrid exceeds that of the probe (and therefore the planned assay reaction temperature), generally by 15-20° C. [0195]
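  • The two iterative extensions described above (probe toward the 3′ end, INVADER oligonucleotide toward the 5′ end) can be sketched as follows. Several simplifications are assumed: the input string is written 5′ to 3′ in the same sense as the oligonucleotides (i.e., complementary to the target strand), with index n at the variant position; the Tm function is a crude 2/4-degree stand-in and not the nearest-neighbor calculation with salt correction; and the probe arm and secondary-structure screening are omitted. All names are hypothetical.

```python
from typing import Callable, Optional, Set


def rough_tm(seq: str) -> float:
    """Stand-in Tm estimate (assumption): 2 degrees per A/T, 4 per G/C."""
    s = seq.upper()
    return 2.0 * (s.count("A") + s.count("T")) + 4.0 * (s.count("G") + s.count("C"))


def extend_probe(strand: str, n: int, reaction_temp: float,
                 tm_func: Callable[[str], float] = rough_tm) -> Optional[str]:
    """Grow the target-binding region from position n toward the 3' end,
    one base at a time, until its calculated temperature reaches reaction_temp."""
    for length in range(1, len(strand) - n + 1):
        region = strand[n:n + length]
        if tm_func(region) >= reaction_temp:
            return region
    return None


def design_invader(strand: str, n: int, probe_temp: float,
                   allele_bases: Set[str], margin: float = 15.0,
                   tm_func: Callable[[str], float] = rough_tm) -> Optional[str]:
    """Grow the INVADER oligonucleotide from N-1 toward the 5' end until it
    is more stable than the probe by `margin`, then append a 3' base that
    matches neither allele at position n."""
    mismatch = next((b for b in "ACGT" if b not in allele_bases), None)
    if mismatch is None:
        return None
    for length in range(1, n + 1):
        region = strand[n - length:n]
        if tm_func(region) >= probe_temp + margin:
            return region + mismatch
    return None
```

  • The mismatched 3′ base is excluded from the stability calculation in this sketch because, by design, it does not pair with the target.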
  • It is one aspect of the assay design that all of the probe sequences may be selected to allow the primary and secondary reactions to occur at the same optimal temperature, so that the reaction steps can run simultaneously. In an alternative embodiment, the probes may be designed to operate at different optimal temperatures, so that the reaction steps are not simultaneously at their temperature optima. [0196]
  • In some embodiments, the software provides the user an opportunity to change various aspects of the design including but not limited to: probe, target and INVADER oligonucleotide temperature optima and concentrations; blocking groups; probe arms; dyes, capping groups and other adducts; individual bases of the probes and targets (e.g., adding or deleting bases from the end of targets and/or probes, or changing internal bases in the INVADER and/or probe and/or target oligonucleotides). In some embodiments, changes are made by selection from a menu. In other embodiments, changes are entered into text or dialog boxes. In preferred embodiments, this option opens a new screen (e.g., a Designer Worksheet view, FIG. 10). [0197]
  • In some embodiments, the software provides a scoring system to indicate the quality (e.g., the likelihood of performance) of the assay designs. In one embodiment, the scoring system includes a starting score of points (e.g., 100 points) wherein the starting score is indicative of an ideal design, and wherein design features known or suspected to have an adverse effect on assay performance are assigned penalty values. Penalty values may vary depending on assay parameters other than the sequences, including but not limited to the type of assay for which the design is intended (e.g., monoplex, multiplex) and the temperature at which the assay reaction will be performed. The following example provides illustrative scoring criteria for use with some embodiments of the INVADER assay based on an intelligence defined by experimentation. Examples of design features that may incur score penalties include but are not limited to the following [penalty values are indicated in brackets, first number is for lower temperature assays (e.g., 62-64° C.), second is for higher temperature assays (e.g., 65-66° C.)]: [0198]
  • 1. [100:100] 3′ end of INVADER oligonucleotide resembles the probe arm: [0199]
    ARM SEQUENCE:             PENALTY AWARDED IF INVADER ENDS IN:
    Arm 1: CGCGCCGAGG         5′ . . . GAGGX or 5′ . . . GAGGXX
    Arm 2: ATGACGTGGCAGAC     5′ . . . CAGACX or 5′ . . . CAGACXX
    Arm 3: ACGGACGCGGAG       5′ GGAGX or 5′ . . . GGAGXX
    Arm 4: TCCGCGCGTCC        5′ . . . GTCCX or 5′ . . . GTCCXX
  • 2. [70:70] a probe has 5-base stretch (i.e., 5 of the same base in a row) containing the polymorphism; [0200]
  • 3. [60:60] a probe has 5-base stretch adjacent to the polymorphism; [0201]
  • 4. [50:50] a probe has 5-base stretch one base from the polymorphism; [0202]
  • 5. [40:40] a probe has 5-base stretch two bases from the polymorphism; [0203]
  • 6. [50:50] probe 5-base stretch is of Gs—additional penalty; [0204]
  • 7. [100:100] a probe has 6-base stretch anywhere; [0205]
  • 8. [90:90] a two or three base sequence repeats at least four times; [0206]
  • 9. [100:100] a degenerate base occurs in a probe; [0207]
  • 10. [60:90] probe hybridizing region is short (13 bases or less for designs 65-67° C.; 12 bases or less for designs 62-64° C.) [0208]
  • 11. [40:90] probe hybridizing region is long (29 bases or more for designs 65-67° C., 28 bases or more for designs 62-64° C.) [0209]
  • 12. [5:5] probe hybridizing region length—per base additional penalty [0210]
  • 13. [80:80] Ins/Del design with poor discrimination in first 3 bases after probe arm [0211]
  • 14. [100:100] calculated INVADER oligonucleotide Tm within 7.5° C. of probe target Tm (designs 65-67° C. with INVADER oligonucleotide ≦70.5° C., designs 62-64° C. with INVADER oligonucleotide ≦69.5° C.) [0212]
  • 15. [20:20] calculated probes Tms differ by more than 2.0° C. [0213]
  • 16. [100:100] a probe has calculated Tm 2° C. less than its target Tm [0214]
  • 17. [10:10] target of one strand 8 bases longer than that of other strand [0215]
  • 18. [30:30] INVADER oligonucleotide has 6-base stretch anywhere—initial penalty [0216]
  • 19. [70:70] INVADER oligonucleotide 6-base stretch is of Gs—additional penalty [0217]
  • 20. [15:15] probe hybridizing region is 14, 15 or 24-28 bases long (65-67° C.) or 13, 14 or 26, 27 bases long (62-64° C.) [0218]
  • 21. [15:15] a probe has a 4-base stretch of Gs containing the polymorphism [0219]
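  • A penalty-based score of this kind can be sketched as follows. Only a subset of the criteria listed above is implemented (items 1, 7, 8, 9, 15, 18 and 19, all of which carry the same penalty at both temperature ranges), the function and variable names are hypothetical, and the choice not to clamp the score at zero is an assumption of this sketch.

```python
import re
from typing import List, Optional, Sequence, Tuple

# Item 1: the arm endings that must not be followed by 1 or 2 bases
# at the 3' end of the INVADER oligonucleotide.
ARM_TAILS = {
    "CGCGCCGAGG": "GAGG",
    "ATGACGTGGCAGAC": "CAGAC",
    "ACGGACGCGGAG": "GGAG",
    "TCCGCGCGTCC": "GTCC",
}


def has_run(seq: str, n: int) -> bool:
    """True if seq contains n identical bases in a row."""
    return re.search(r"([ACGT])\1{%d}" % (n - 1), seq) is not None


def score_design(probes: Sequence[str], invader: str, arm: str,
                 probe_tms: Optional[Sequence[float]] = None,
                 start: int = 100) -> Tuple[int, List[Tuple[str, int]]]:
    """Return (score, penalties) where each penalty is (reason, points)."""
    penalties: List[Tuple[str, int]] = []
    invader = invader.upper()

    tail = ARM_TAILS.get(arm.upper())
    if tail and re.search(re.escape(tail) + r"[ACGT]{1,2}$", invader):
        penalties.append(("INVADER 3' end resembles the probe arm", 100))

    for probe in (p.upper() for p in probes):
        if has_run(probe, 6):
            penalties.append(("probe has 6-base stretch", 100))
        if re.search(r"([ACGT]{2,3})\1{3}", probe):
            penalties.append(("2- or 3-base sequence repeats 4 times", 90))
        if re.search(r"[^ACGT]", probe):
            penalties.append(("degenerate base in probe", 100))

    if has_run(invader, 6):
        penalties.append(("INVADER has 6-base stretch", 30))
        if "GGGGGG" in invader:
            penalties.append(("INVADER 6-base stretch is of Gs", 70))

    if probe_tms and max(probe_tms) - min(probe_tms) > 2.0:
        penalties.append(("probe Tms differ by more than 2.0", 20))

    return start - sum(points for _, points in penalties), penalties
```

  • Returning the list of penalties alongside the score allows an interface to show the reason for each deduction.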
  • In particularly preferred embodiments, temperatures for each of the oligonucleotides in the designs are recomputed and scores are recomputed as changes are made. In some embodiments, score descriptions can be seen by clicking a “descriptions” button. In some embodiments, a BLAST search option is provided. In preferred embodiments, a BLAST search is done by clicking a “BLAST Design” button. In some embodiments, this action brings up a dialog box describing the BLAST process. In preferred embodiments, the BLAST search results are displayed as a highlighted design on a Designer Worksheet. [0220]
  • In some embodiments, a user accepts a design by clicking an “Accept” button. In other embodiments, the program approves a design without user intervention. In preferred embodiments, the program sends the approved design to a next process step (e.g., into production; into a file or database). In some embodiments, the program provides a screen view (e.g., an Output Page, FIG. 11), allowing review of the final designs created and allowing notes to be attached to the design. In preferred embodiments, the user can return to the Designer Worksheet (e.g., by clicking a “Go Back” button) or can save the design (e.g., by clicking a “Save It” button) and continue (e.g., to submit the designed oligonucleotides for production). [0221]
  • In some embodiments, the program provides an option to create a screen view of a design optimized for printing (e.g., a text-only view) or other export (e.g., an Output view, FIG. 12). In preferred embodiments, the Output view provides a description of the design particularly suitable for printing, or for exporting into another application (e.g., by copying and pasting into another application). In particularly preferred embodiments, the Output view opens in a separate window. [0222]
  • The present invention is not limited to the use of the INVADERCREATOR software. Indeed, a variety of software programs are contemplated and are commercially available, including, but not limited to, GCG Wisconsin Package (Genetics Computer Group, Madison, Wis.) and Vector NTI (Informax, Rockville, Md.). Other detection assays may be used in the present invention. [0223]
  • 1. Direct Sequencing Assays [0224]
  • In some embodiments of the present invention, variant sequences are detected using a direct sequencing technique. In these assays, DNA samples are first isolated from a subject using any suitable method. In some embodiments, the region of interest is cloned into a suitable vector and amplified by growth in a host cell (e.g., a bacterium). In other embodiments, DNA in the region of interest is amplified using PCR. [0225]
  • Following amplification, DNA in the region of interest (e.g., the region containing the SNP or mutation of interest) is sequenced using any suitable method, including but not limited to manual sequencing using radioactive marker nucleotides, or automated sequencing. The results of the sequencing are displayed using any suitable method. The sequence is examined and the presence or absence of a given SNP or mutation is determined. [0226]
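  • The final examination step described above can be sketched in code. The following is a minimal illustration only, assuming the sequenced region has already been aligned to a reference of equal length; the function name and the example sequences are hypothetical and are not taken from the specification.

```python
def find_variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


# Hypothetical sequences, for illustration only.
reference = "ATGGCCATTG"
sample = "ATGGCTATTG"
print(find_variants(reference, sample))  # one mismatch at position 6 (C in reference, T in sample)
```

  • An empty list indicates that no difference from the reference was found in the examined region.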
  • 2. PCR Assay [0227]
  • In some embodiments of the present invention, variant sequences are detected using a PCR-based assay. In some embodiments, the PCR assay comprises the use of oligonucleotide primers that hybridize only to the variant or wild type allele (e.g., to the region of polymorphism or mutation). Both sets of primers are used to amplify a sample of DNA. If only the mutant primers result in a PCR product, then the patient has the mutant allele. If only the wild-type primers result in a PCR product, then the patient has the wild type allele. [0228]
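  • The interpretation rule stated above can be written as a small decision function. This is a sketch only: the function name and return strings are hypothetical, and the handling of the two cases the text does not address (both primer sets yield product, or neither does) is an assumption made for illustration.

```python
def call_genotype(mutant_product: bool, wild_type_product: bool) -> str:
    """Interpret an allele-specific PCR result from the two primer sets."""
    if mutant_product and not wild_type_product:
        return "mutant allele"
    if wild_type_product and not mutant_product:
        return "wild-type allele"
    if mutant_product and wild_type_product:
        # Assumption: the text does not cover this case; a sample carrying
        # both alleles would be expected to amplify with both primer sets.
        return "both alleles detected"
    # Assumption: no product from either primer set is treated as a failed assay.
    return "no call"


print(call_genotype(mutant_product=True, wild_type_product=False))
```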
  • 3. Fragment Length Polymorphism Assays [0229]
  • In some embodiments of the present invention, variant sequences are detected using a fragment length polymorphism assay. In a fragment length polymorphism assay, a unique DNA banding pattern based on cleaving the DNA at a series of positions is generated using an enzyme (e.g., a restriction enzyme or a CLEAVASE I [Third Wave Technologies, Madison, Wis.] enzyme). DNA fragments from a sample containing a SNP or a mutation will have a different banding pattern than wild type. [0230]
  • a. RFLP Assay [0231]
  • In some embodiments of the present invention, variant sequences are detected using a restriction fragment length polymorphism assay (RFLP). The region of interest is first isolated using PCR. The PCR products are then cleaved with restriction enzymes known to give a unique length fragment for a given polymorphism. The restriction-enzyme digested PCR products are generally separated by gel electrophoresis and may be visualized by ethidium bromide staining. The length of the fragments is compared to molecular weight markers and fragments generated from wild-type and mutant controls. [0232]
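  • The expected fragment lengths for wild-type and mutant controls can be predicted in silico before running the gel. The sketch below is illustrative only: the sequences are hypothetical, EcoRI (recognition site GAATTC, cutting after the first G) is chosen merely as a familiar example enzyme, and only one strand of a linear product is considered.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases into the recognition site at which
    the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 18-base PCR products; the mutant has a single base change
# inside the recognition site, which abolishes cleavage.
wild_type = "ACGTACGAATTCTTGACC"
mutant = "ACGTACGAGTTCTTGACC"
print(digest(wild_type, "GAATTC", 1))  # two fragments
print(digest(mutant, "GAATTC", 1))     # one uncut fragment
```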
  • b. CFLP Assay [0233]
  • In other embodiments, variant sequences are detected using a CLEAVASE fragment length polymorphism assay (CFLP; Third Wave Technologies, Madison, Wis.; See e.g., U.S. Pat. Nos. 5,843,654; 5,843,669; 5,719,208; and 5,888,780; each of which is herein incorporated by reference). This assay is based on the observation that when single strands of DNA fold on themselves, they assume higher order structures that are highly individual to the precise sequence of the DNA molecule. These secondary structures involve partially duplexed regions of DNA such that single stranded regions are juxtaposed with double stranded DNA hairpins. The CLEAVASE I enzyme is a structure-specific, thermostable nuclease that recognizes and cleaves the junctions between these single-stranded and double-stranded regions. [0234]
  • The region of interest is first isolated, for example, using PCR. In preferred embodiments, one or both strands are labeled. Then, DNA strands are separated by heating. Next, the reactions are cooled to allow intrastrand secondary structure to form. The PCR products are then treated with the CLEAVASE I enzyme to generate a series of fragments that are unique to a given SNP or mutation. The CLEAVASE enzyme treated PCR products are separated and detected (e.g., by denaturing gel electrophoresis) and visualized (e.g., by autoradiography, fluorescence imaging or staining). The length of the fragments is compared to molecular weight markers and fragments generated from wild-type and mutant controls. [0235]
  • 4. Hybridization Assays [0236]
  • In preferred embodiments of the present invention, variant sequences are detected using a hybridization assay. In a hybridization assay, the presence or absence of a given SNP or mutation is determined based on the ability of the DNA from the sample to hybridize to a complementary DNA molecule (e.g., an oligonucleotide probe). A variety of hybridization assays using a variety of technologies for hybridization and detection are available. A description of a selection of assays is provided below. [0237]
  • a. Direct Detection of Hybridization [0238]
  • In some embodiments, hybridization of a probe to the sequence of interest (e.g., a SNP or mutation) is detected directly by visualizing a bound probe (e.g., a Northern or Southern assay; See e.g., Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, NY [1991]). In these assays, genomic DNA (Southern) or RNA (Northern) is isolated from a subject. The DNA or RNA is then cleaved with a series of restriction enzymes that cleave infrequently in the genome and not near any of the markers being assayed. The DNA or RNA is then separated (e.g., on an agarose gel) and transferred to a membrane. A labeled (e.g., by incorporating a radionucleotide) probe or probes specific for the SNP or mutation being detected is allowed to contact the membrane under low, medium, or high stringency conditions. Unbound probe is removed and the presence of binding is detected by visualizing the labeled probe. [0239]
  • b. Detection of Hybridization Using “DNA Chip” Assays [0240]
  • In some embodiments of the present invention, variant sequences are detected using a DNA chip hybridization assay. In this assay, a series of oligonucleotide probes are affixed to a solid support. The oligonucleotide probes are designed to be unique to a given SNP or mutation. The DNA sample of interest is contacted with the DNA “chip” and hybridization is detected. [0241]
  • In some embodiments, the DNA chip assay is a GeneChip (Affymetrix, Santa Clara, Calif.; See e.g., U.S. Pat. Nos. 6,045,996; 5,925,525; and 5,858,659; each of which is herein incorporated by reference) assay. The GeneChip technology uses miniaturized, high-density arrays of oligonucleotide probes affixed to a “chip.” Probe arrays are manufactured by Affymetrix's light-directed chemical synthesis process, which combines solid-phase chemical synthesis with photolithographic fabrication techniques employed in the semiconductor industry. Using a series of photolithographic masks to define chip exposure sites, followed by specific chemical synthesis steps, the process constructs high-density arrays of oligonucleotides, with each probe in a predefined position in the array. Multiple probe arrays are synthesized simultaneously on a large glass wafer. The wafers are then diced, and individual probe arrays are packaged in injection-molded plastic cartridges, which protect them from the environment and serve as chambers for hybridization. [0242]
  • The nucleic acid to be analyzed is isolated, amplified by PCR, and labeled with a fluorescent reporter group. The labeled DNA is then incubated with the array using a fluidics station. The array is then inserted into the scanner, where patterns of hybridization are detected. The hybridization data are collected as light emitted from the fluorescent reporter groups already incorporated into the target, which is bound to the probe array. Probes that perfectly match the target generally produce stronger signals than those that have mismatches. Since the sequence and position of each probe on the array are known, by complementarity, the identity of the target nucleic acid applied to the probe array can be determined. [0243]
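  • The principle that perfectly matched probes generally give stronger signals than mismatched probes can be illustrated with a simple base-calling sketch. The intensities, the function name, and the minimum signal ratio below are all hypothetical values chosen for illustration; they do not describe any vendor's actual analysis software.

```python
def call_base(intensities: dict[str, float], min_ratio: float = 2.0) -> str:
    """Call the base at an interrogated position from four probe intensities.

    Each key is the base a probe is designed to match at the position.
    The strongest probe wins, provided it exceeds the runner-up by
    min_ratio (a hypothetical cutoff); otherwise "N" is returned.
    """
    ranked = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best), (_, runner_up) = ranked[0], ranked[1]
    if runner_up > 0 and best / runner_up < min_ratio:
        return "N"
    return best_base


# Hypothetical fluorescence readings for the four probes at one position.
print(call_base({"A": 120.0, "C": 95.0, "G": 1500.0, "T": 110.0}))
```

  • Because the sequence and position of each probe are known, the winning probe identifies the base in the target by complementarity.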
  • In other embodiments, a DNA microchip containing electronically captured probes (Nanogen, San Diego, Calif.) is utilized (See e.g., U.S. Pat. Nos. 6,017,696; 6,068,818; and 6,051,380; each of which is herein incorporated by reference). Through the use of microelectronics, Nanogen's technology enables the active movement and concentration of charged molecules to and from designated test sites on its semiconductor microchip. DNA capture probes unique to a given SNP or mutation are electronically placed at, or “addressed” to, specific sites on the microchip. Since DNA has a strong negative charge, it can be electronically moved to an area of positive charge. [0244]
  • First, a test site or a row of test sites on the microchip is electronically activated with a positive charge. Next, a solution containing the DNA probes is introduced onto the microchip. The negatively charged probes rapidly move to the positively charged sites, where they concentrate and are chemically bound to a site on the microchip. The microchip is then washed and another solution of distinct DNA probes is added until the array of specifically bound DNA probes is complete. [0245]
  • A test sample is then analyzed for the presence of target DNA molecules by determining which of the DNA capture probes hybridize with complementary DNA in the test sample (e.g., a PCR amplified gene of interest). An electronic charge is also used to move and concentrate target molecules to one or more test sites on the microchip. The electronic concentration of sample DNA at each test site promotes rapid hybridization of sample DNA with complementary capture probes (hybridization may occur in minutes). To remove any unbound or nonspecifically bound DNA from each site, the polarity or charge of the site is reversed to negative, thereby forcing any unbound or nonspecifically bound DNA back into solution away from the capture probes. A laser-based fluorescence scanner is used to detect binding. [0246]
  • In still further embodiments, an array technology based upon the segregation of fluids on a flat surface (chip) by differences in surface tension (ProtoGene, Palo Alto, Calif.) is utilized (See e.g., U.S. Pat. Nos. 6,001,311; 5,985,551; and 5,474,796; each of which is herein incorporated by reference). ProtoGene's technology is based on the fact that fluids can be segregated on a flat surface by differences in surface tension that have been imparted by chemical coatings. Once so segregated, oligonucleotide probes are synthesized directly on the chip by ink-jet printing of reagents. The array with its reaction sites defined by surface tension is mounted on an X/Y translation stage under a set of four piezoelectric nozzles, one for each of the four standard DNA bases. The translation stage moves along each of the rows of the array and the appropriate reagent is delivered to each of the reaction sites. For example, the A amidite is delivered only to the sites where amidite A is to be coupled during that synthesis step and so on. Common reagents and washes are delivered by flooding the entire surface and then removing them by spinning. [0247]
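  • The per-step delivery logic described above (each amidite goes only to the sites that need that base in the current synthesis step) can be sketched as follows. The site labels and probe sequences are hypothetical, and for simplicity the sketch steps through each sequence in the order written, ignoring the chemical direction of synthesis.

```python
def delivery_schedule(probes: dict[str, str]) -> list[dict[str, list[str]]]:
    """For each synthesis step, map each base to the sites that receive it."""
    steps = max(len(seq) for seq in probes.values())
    schedule = []
    for step in range(steps):
        deliveries = {base: [] for base in "ACGT"}
        for site, seq in probes.items():
            # Sites whose probe is already complete receive nothing this step.
            if step < len(seq):
                deliveries[seq[step]].append(site)
        schedule.append(deliveries)
    return schedule


# Hypothetical reaction sites (row/column labels) and probe sequences.
probes = {"r1c1": "ACG", "r1c2": "AGG", "r2c1": "TC"}
for step, deliveries in enumerate(delivery_schedule(probes), start=1):
    print(step, deliveries)
```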
  • DNA probes unique for the SNP or mutation of interest are affixed to the chip using ProtoGene's technology. The chip is then contacted with the PCR-amplified genes of interest. Following hybridization, unbound DNA is removed and hybridization is detected using any suitable method (e.g., by fluorescence de-quenching of an incorporated fluorescent group). [0248]
  • In yet other embodiments, a “bead array” is used for the detection of polymorphisms (Illumina, San Diego, Calif.; See e.g., PCT Publications WO 99/67641 and WO 00/39587, each of which is herein incorporated by reference). Illumina uses a BEAD ARRAY technology that combines fiber optic bundles and beads that self-assemble into an array. Each fiber optic bundle contains thousands to millions of individual fibers depending on the diameter of the bundle. The beads are coated with an oligonucleotide specific for the detection of a given SNP or mutation. Batches of beads are combined to form a pool specific to the array. To perform an assay, the BEAD ARRAY is contacted with a prepared subject sample (e.g., DNA). Hybridization is detected using any suitable method. [0249]
  • c. Enzymatic Detection of Hybridization [0250]
  • In some embodiments of the present invention, hybridization is detected by enzymatic cleavage of specific structures (INVADER assay, Third Wave Technologies; See e.g., U.S. Pat. Nos. 5,846,717; 6,090,543; 6,001,567; 5,985,557; and 5,994,069; each of which is herein incorporated by reference). The INVADER assay detects specific DNA and RNA sequences by using structure-specific enzymes to cleave a complex formed by the hybridization of overlapping oligonucleotide probes. Elevated temperature and an excess of one of the probes enable multiple probes to be cleaved for each target sequence present without temperature cycling. These cleaved probes then direct cleavage of a second labeled probe. The secondary probe oligonucleotide can be 5′-end labeled with a fluorescent dye that is quenched by a second dye or other quenching moiety. Upon cleavage, the de-quenched dye-labeled product may be detected using a standard fluorescence plate reader, or an instrument configured to collect fluorescence data during the course of the reaction (i.e., a “real-time” fluorescence detector, such as an ABI 7700 Sequence Detection System, Applied Biosystems, Foster City, Calif.). [0251]
  • The INVADER assay detects specific mutations and SNPs in unamplified genomic DNA. In an embodiment of the INVADER assay used for detecting SNPs in genomic DNA, two oligonucleotides (a primary probe specific either for a SNP/mutation or wild type sequence, and an INVADER oligonucleotide) hybridize in tandem to the genomic DNA to form an overlapping structure. A structure-specific nuclease enzyme recognizes this overlapping structure and cleaves the primary probe. In a secondary reaction, cleaved primary probe combines with a fluorescence-labeled secondary probe to create another overlapping structure that is cleaved by the enzyme. The initial and secondary reactions can run concurrently in the same vessel. Cleavage of the secondary probe is detected by using a fluorescence detector, as described above. The signal of the test sample may be compared to known positive and negative controls. [0252]
  • In some embodiments, hybridization of a bound probe is detected using a TaqMan assay (PE Biosystems, Foster City, Calif.; See e.g., U.S. Pat. Nos. 5,962,233 and 5,538,848, each of which is herein incorporated by reference). The assay is performed during a PCR reaction. The TaqMan assay exploits the 5′-3′ exonuclease activity of DNA polymerases such as AMPLITAQ DNA polymerase. A probe, specific for a given allele or mutation, is included in the PCR reaction. The probe consists of an oligonucleotide with a 5′-reporter dye (e.g., a fluorescent dye) and a 3′-quencher dye. During PCR, if the probe is bound to its target, the 5′-3′ nucleolytic activity of the AMPLITAQ polymerase cleaves the probe between the reporter and the quencher dye. The separation of the reporter dye from the quencher dye results in an increase of fluorescence. The signal accumulates with each cycle of PCR and can be monitored with a fluorimeter. [0253]
  • In still further embodiments, polymorphisms are detected using the SNP-IT primer extension assay (Orchid Biosciences, Princeton, N.J.; See e.g., U.S. Pat. Nos. 5,952,174 and 5,919,626, each of which is herein incorporated by reference). In this assay, SNPs are identified by using a specially synthesized DNA primer and a DNA polymerase to selectively extend the DNA chain by one base at the suspected SNP location. DNA in the region of interest is amplified and denatured. Polymerase reactions are then performed using miniaturized systems called microfluidics. Detection is accomplished by adding a label to the nucleotide suspected of being at the SNP or mutation location. Incorporation of the label into the DNA can be detected by any suitable method (e.g., if the nucleotide contains a biotin label, detection is via a fluorescently labelled antibody specific for biotin). [0254]
  • III. Detection Assay Production [0255]
  • The present invention provides a high-throughput detection assay production system, allowing for high-speed, efficient production of thousands of detection assays. The high-throughput production systems and methods allow sufficient production capacity to facilitate full implementation of the funnel process described above, allowing comprehensive coverage of all known (and newly identified) markers. [0256]
  • In some embodiments of the present invention, oligonucleotides and/or other detection assay components (e.g., those designed by the INVADERCREATOR software and directed to target sequences analyzed by the in silico systems and methods) are synthesized. In preferred embodiments, oligonucleotide synthesis is performed in an automated and coordinated manner. As discussed in more detail below, in some embodiments, produced detection assays are tested against a plurality of samples representing two or more different individuals or alleles (e.g., samples containing sequences from individuals with different ethnic backgrounds, disease states, etc.) to demonstrate the viability of the assay with different individuals. [0257]
  • In some embodiments, the present invention provides an automated DNA production process. In some embodiments, the automated DNA production process includes an oligonucleotide synthesizer component and an oligonucleotide processing component. In some embodiments, the oligonucleotide production component includes multiple components, including but not limited to, an oligonucleotide cleavage and deprotection component, an oligonucleotide purification component, an oligonucleotide dry down component, an oligonucleotide de-salting component, an oligonucleotide dilute and fill component, and a quality control component. In some embodiments, the automated DNA production process of the present invention further includes automated design software and supporting computer terminals and connections, a product tracking system (e.g., a bar code system), and a centralized packaging component. In some embodiments, the components are combined in an integrated, centrally controlled, automated production system. The present invention thus provides methods of synthesizing several related oligonucleotides (e.g., components of a kit) in a coordinated manner. The automated production systems of the present invention allow large scale automated production of detection assays for numerous different target sequences. [0258]
  • A. Oligonucleotide Synthesis Component [0259]
  • Once a particular oligonucleotide sequence or set of sequences has been chosen, sequences are sent (e.g., electronically) to a high-throughput oligonucleotide synthesizer component. In some preferred embodiments, the high-throughput synthesizer component contains multiple DNA synthesizers. [0260]
  • In some embodiments, the synthesizers are arranged in banks. For example, a given bank of synthesizers may be used to produce one set of oligonucleotides (e.g., for an INVADER or PCR reaction). The present invention is not limited to any one synthesizer. Indeed, a variety of synthesizers are contemplated, including, but not limited to, MOSS EXPEDITE 16-channel DNA synthesizers (PE Biosystems, Foster City, Calif.), OligoPilot (Amersham Pharmacia), the 3900 and 3948 48-Channel DNA synthesizers (PE Biosystems, Foster City, Calif.), and the high-throughput synthesizer described in PCT Publication WO 01/41918. In some embodiments, synthesizers are modified or are wholly fabricated to meet physical or performance specifications particularly preferred for use in the synthesis component of the present invention. In some embodiments, two or more different DNA synthesizers are combined in one bank in order to optimize the quantities of different oligonucleotides needed. This allows for the rapid synthesis (e.g., in less than 4 hours) of an entire set of oligonucleotides (all the oligonucleotide components needed for a particular assay, e.g., for detection of one SNP using an INVADER assay). [0261]
  • In some embodiments the DNA synthesizer component includes at least 100 synthesizers. In other embodiments, the DNA synthesizer component includes at least 200 synthesizers. In still other embodiments, the DNA synthesizer component includes at least 250 synthesizers. In some embodiments, the DNA synthesizers are run 24 hours a day. [0262]
  • 1. Automated Reagent Supply [0263]
  • In some embodiments, the DNA synthesizers in the oligonucleotide synthesis component further comprise an automated reagent supply system. The automated reagent supply system delivers reagents necessary for synthesis to the synthesizers from a central supply area. For example, in some embodiments, acetonitrile is supplied via tubing (e.g., stainless steel tubing) through the automated supply system. De-blocking solution may also be supplied directly to DNA synthesizers through tubing. In some preferred embodiments, the reagent supply system tubing is designed to connect directly to the DNA synthesizers without modifying the synthesizers. Additionally, in some embodiments, the central reagent supply is designed to deliver reagents at a constant and controlled pressure. The amount of reagent circulating in the central supply loop is maintained at 8 to 12 times the level needed for synthesis in order to allow standardized pressure at each instrument. The excess reagent also allows new reagent to be added to the system without shutting down. In addition, the excess of reagent allows different types of pressurized reagent containers to be attached to one system. The excess of reagents in one centralized system further allows for one central system for chemical spills and fire suppression. [0264]
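  • The 8 to 12 times excess described above translates into a simple sizing calculation for the supply loop. The sketch below is illustrative only; the demand figure is a hypothetical value, and the function names are not part of the specification.

```python
def loop_volume_range(demand_liters: float, low: float = 8.0, high: float = 12.0) -> tuple[float, float]:
    """Return the (minimum, maximum) circulating volume for a given synthesis demand.

    low and high are the multiples of demand stated in the text (8x to 12x).
    """
    return demand_liters * low, demand_liters * high


def within_target(circulating_liters: float, demand_liters: float) -> bool:
    """Check whether the circulating volume falls inside the 8x to 12x window."""
    minimum, maximum = loop_volume_range(demand_liters)
    return minimum <= circulating_liters <= maximum


# Hypothetical demand of 2.5 liters of acetonitrile across all synthesizers.
print(loop_volume_range(2.5))
print(within_target(25.0, 2.5))
```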
  • In some embodiments, the DNA synthesis component includes a centralized argon delivery system. The system includes high-pressure argon tanks adjacent to each bank of synthesizers. These tanks are connected to large, main argon tanks for backup. In some embodiments, the main tanks are run in series. In other embodiments, the main tanks are set up in banks. In some embodiments, the system further includes an automated tank switching system. In some preferred embodiments, the argon delivery system further comprises a tertiary backup system to provide argon in the case of failure of the primary and backup systems. [0265]
  • In some embodiments, one or more branched delivery components are used between the reagent tanks and the individual synthesizers or banks of synthesizers. For example, in some embodiments, acetonitrile is delivered through a branched metal structure. Where more than one branched delivery component is used, in preferred embodiments, each branched delivery component is individually pressurized. [0266]
  • The present invention is not limited by the number of branches in the branched delivery component. In preferred embodiments, each branched delivery component contains ten or more branches. Reagent tanks may be connected to the branched delivery components using any number of configurations. For example, in some embodiments, a single reagent tank is matched with a single branched component. In other embodiments, a plurality of reagent tanks is used to supply reagents to one or more branched components. In some such embodiments, the plurality of tanks may be attached to the branched components through a single feed line, wherein one or a subset of the tanks feeds the branched components until empty (or substantially empty), whereby a second tank or subset of tanks is accessed to maintain a continuous supply of reagent to the one or more branched components. To automate the monitoring and switching of tanks, an ultrasonic level sensor may be applied. [0267]
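  • The automated tank switching described above can be sketched as a selection rule driven by level-sensor readings. This is an illustration under stated assumptions: levels are expressed as fill fractions, the "substantially empty" threshold of 0.05 is hypothetical, and the function does not model any particular ultrasonic sensor or valve hardware.

```python
def select_tank(levels: list[float], active: int, empty_threshold: float = 0.05) -> int:
    """Return the index of the tank that should feed the branched component.

    levels holds the fill fraction (0.0 to 1.0) reported for each tank.
    The active tank keeps feeding until it is at or below empty_threshold,
    at which point the next tank with reagent is selected.
    """
    if levels[active] > empty_threshold:
        return active
    for offset in range(1, len(levels)):
        candidate = (active + offset) % len(levels)
        if levels[candidate] > empty_threshold:
            return candidate
    raise RuntimeError("all reagent tanks are empty")


# Hypothetical readings: tank 0 is nearly empty, so the feed switches to tank 1.
print(select_tank([0.02, 1.0], active=0))
```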
  • In some embodiments, each branch of the branched delivery component provides reagent to one synthesizer or to a bank of synthesizers through connecting tubing. In preferred embodiments, tubing is continuous (i.e., provides a direct connection between the delivery branch and the synthesizer). In some preferred embodiments, the tubing comprises an interior diameter of 0.25 inches or less (e.g., 0.125 inches). In some embodiments, each branch contains one or more valves (preferably one). While the valve may be located at any position along the delivery line, in preferred embodiments, the valve is located in close proximity to the synthesizer. In other embodiments, reagent is provided directly to synthesizers without any joints or valves between the branched delivery component and the synthesizers. [0268]
  • In some embodiments, the solvent is contained in a cabinet designed for the safe storage of flammable chemicals (a “flammables cabinet”) and the branched structure is located outside of the cabinet and is fed by the solvent container through a tube passed through the wall of the cabinet. In other embodiments, the reagent and branched system is stored in an explosion proof room or chamber and the solvent is pumped via tubing through the wall of the explosion proof room. In preferred embodiments, all of the tubing from each of the branches is fed through the wall at a single location (e.g., through a single hole in the wall). [0269]
  • The reagent delivery system of the present invention provides several advantages. For example, such a system allows each synthesizer to be turned off (e.g., for servicing) independent of the other synthesizers. Use of continuous tubing reduces the number of joints and couplings, the areas most vulnerable to failure, between the reagent sources and the synthesizers, thereby reducing the potential for leakage or blockage in the system. Use of continuous tubing through inaccessible or difficult-to-access areas reduces the likelihood that repairs or service will be needed in such areas. In addition, fewer valves results in cost savings. [0270]
  • In some embodiments, the branched tubing structure further provides a sight glass. In preferred embodiments, the sight glass is located at the top of the branched delivery structure. The sight glass provides the opportunity for visual and physical sampling of the reagent. For example, in some embodiments, the sight glass includes a sampling valve (e.g., to collect samples for quality control). In some embodiments, the sight glass serves as a trap for gas bubbles, to prevent bubbles from entering the connecting tubing. In other embodiments, the sight glass contains a vent (e.g., a solenoid valve) for de-gassing of the system. In some embodiments, scanning of the sight glass (e.g., spectrophotometrically) and sampling are automated. The automated system provides quality control and feedback (e.g., the presence of contamination). [0271]
  • In other embodiments, the present invention provides a portable reagent delivery system. In some embodiments, the portable reagent delivery system comprises a branched structure connected to solvent tanks that are contained in a flammables cabinet. In preferred embodiments, one reagent delivery system is able to provide sufficient reagent for 40 or more synthesizers. These portable reagent delivery systems of the present invention facilitate the operation of mobile (portable) synthesis facilities. In another embodiment, these portable reagent delivery systems facilitate the operation of flexible synthesis facilities that can be easily re-configured to meet particular needs of individual synthesis projects or contracts. In some embodiments, a synthesis facility comprises multiple portable reagent delivery systems. [0272]
  • 2. Waste Collection [0273]
  • In some embodiments, the DNA synthesis component further comprises a centralized waste collection system. The centralized waste collection system comprises cache pots for central waste collection. In some embodiments, the cache pots include level detectors such that when waste level reaches a preset value, a pump is activated to drain the cache into a central collection reservoir. In preferred embodiments, ductwork is provided to gather fumes from cache pots. The fumes are then vented safely through the roof, avoiding exposure of personnel to harmful fumes. In preferred embodiments, the air handling system provides an adequate amount of air exchange per person to ensure that personnel are not exposed to harmful fumes. The coordinated reagent delivery and waste removal systems increase the safety and health of workers, as well as improving cost savings. [0274]
  • In some embodiments, the solvent waste disposal system comprises a waste transfer system. In some preferred embodiments, the system contains no electronic components. In some preferred embodiments, the system comprises no moving parts. For example, in some embodiments, waste is first collected in a liquid transfer drum designed for the safe storage of flammable waste. In some embodiments, waste is manually poured into the drum through a waste channel. In preferred embodiments, solvent waste is automatically transported (e.g., through tubing) directly from synthesizers to the drum. To drain the liquid transfer drum, argon is pumped from a pressurized gas line into the drum through a first opening, forcing solvent waste out an output channel at a second opening (e.g., through tubing) into a centralized waste collection area. In preferred embodiments, the argon is pumped at low pressure (e.g., 3-10 pounds per square inch (psi), preferably 5 psi or less). In some embodiments, the drum contains a sight glass to visualize the solvent level. In some embodiments, the level is visualized manually and the disposal system is activated when the drum has reached a selected threshold level. In other embodiments, the level is automatically detected and the disposal system is automatically activated when the drum has reached the threshold level. [0275]
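  • For the embodiments in which the drum level is detected automatically, the activation rule can be sketched as below. The 3-10 psi window is the example range given above; the 0.8 fill threshold, the function name, and the returned strings are hypothetical and chosen only for illustration.

```python
def transfer_decision(level: float, argon_psi: float, threshold: float = 0.8) -> str:
    """Decide whether to push waste from the transfer drum to central collection.

    level is the drum fill fraction (0.0 to 1.0); threshold is a hypothetical
    selected level at which the disposal system is activated.
    """
    # Only transfer when the argon line is inside the example 3-10 psi window.
    if not 3.0 <= argon_psi <= 10.0:
        return "hold"
    if level >= threshold:
        return "transfer"
    return "wait"


print(transfer_decision(level=0.85, argon_psi=5.0))
```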
  • The solvent waste transfer system of the present invention provides several advantages over manual collection and complex systems. The solvent waste system of the present invention is intrinsically safe, as it can be designed with no moving or electrical parts. For example, the system described above is suitable for use in Division I/Class I space under EPA regulations. [0276]
  • 3. Centralized Control System [0277]
  • In some embodiments, all of the DNA synthesizers in the synthesis component are attached to a centralized control system. The centralized control system controls all areas of operation, including, but not limited to, power, pressure, reagent delivery, waste, and synthesis. In some preferred embodiments, the centralized control system includes a clean electrical grid with uninterrupted power supply. Such a system minimizes power level fluctuations. In additional preferred embodiments, the centralized control system includes alarms for air flow, status of reagents, and status of waste containers. The alarm system can be monitored from the central control panel. The centralized control system allows additions, deletions, or shutdowns of one synthesizer or one block of synthesizers without disrupting operations of other instruments. The centralized power control allows a user to turn instruments off instrument by instrument, bank by bank, or the entire module. [0278]
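  • The three levels of power control described above (single instrument, bank, or entire module) map naturally onto a small hierarchical data structure. The class, bank names, and instrument names below are hypothetical; the sketch models only the bookkeeping, not any actual control hardware.

```python
class SynthesisModule:
    """Tracks power state for synthesizers grouped into banks (hypothetical model)."""

    def __init__(self, banks: dict[str, list[str]]):
        self.banks = banks
        self.powered = {name: True for members in banks.values() for name in members}

    def turn_off_instrument(self, name: str) -> None:
        self.powered[name] = False

    def turn_off_bank(self, bank: str) -> None:
        for name in self.banks[bank]:
            self.powered[name] = False

    def turn_off_module(self) -> None:
        for name in self.powered:
            self.powered[name] = False

    def running(self) -> list[str]:
        return [name for name, on in self.powered.items() if on]


module = SynthesisModule({"bank_A": ["A1", "A2"], "bank_B": ["B1", "B2"]})
module.turn_off_instrument("A1")   # other instruments keep running
print(module.running())
module.turn_off_bank("bank_B")     # bank_A is unaffected
print(module.running())
```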
  • B. Oligonucleotide Processing Components [0279]
  • In some embodiments, the automated DNA production process further comprises one or more oligonucleotide production components, including, but not limited to, an oligonucleotide cleavage and deprotection component, an oligonucleotide purification component, a dry-down component, a desalting component, a dilution and fill component, and a quality control component. [0280]
  • 1. Oligonucleotide Cleavage and Deprotection [0281]
  • After synthesis is complete, the oligonucleotides are moved to the cleavage and deprotection station. In some embodiments, the transfer of oligonucleotides to this station is automated and controlled by robotic automation. In some embodiments, the entire cleavage and deprotection process is performed by robotic automation. In some embodiments, NH4OH for deprotection is supplied through the automated reagent supply system. [0282]
  • Accordingly, in some embodiments, oligonucleotide deprotection is performed in multi-sample containers (e.g., 96 well covered dishes) in an oven. This method is designed for the high-throughput system of the present invention and is capable of the simultaneous processing of large numbers of samples. This method provides several advantages over the standard method of deprotection in vials. For example, sample handling is reduced (e.g., labeling of vials, dispensing of concentrated NH4OH to individual vials, as well as the associated capping and uncapping of the vials, is eliminated). This reduces the risks of contamination or mislabeling and decreases processing time. Where such methods are used to replace human pipetting of samples and capping of vials, the methods save many labor hours per day. The method also reduces consumable requirements by eliminating the need for vials and pipette tips, reduces equipment needs by eliminating the need for pipettes, and improves worker safety conditions by reducing worker exposure to ammonium hydroxide. The potential for repetitive motion disorders is also reduced. Deprotection in a multi-well plate further has the advantage that the plate can be directly placed on an automated desalting apparatus (e.g., TECAN Robot). [0283]
  • During the development of the present invention, the plate was optimized to be functional and compatible with the deprotection methods. In some embodiments, the plate is designed to be able to hold as much as two milliliters of oligonucleotide and ammonium hydroxide. If deep well plates are used, automated downstream processing steps may need to be altered to ensure that the full volume of sample is extracted from the wells. In some embodiments, the multi-well plates used in the methods of the present invention comprise a tight sealing lid/cover to protect from evaporation, provide for even heating, and are able to withstand temperatures necessary for deprotection. Attempts with initial plates were not successful, having problems with lids that were not suitably sealed and plates that did not withstand deprotection temperatures. [0284]
  • In some embodiments (e.g., processing of target and INVADER oligonucleotides), oligonucleotides are cleaved from the synthesis support in the multi-well plates. In other embodiments (e.g., processing of probe oligonucleotides), oligonucleotides are first cleaved from the synthesis column and then transferred to the plate for deprotection. [0285]
  • 2. Oligonucleotide Purification [0286]
  • In some embodiments, following deprotection and cleavage from the solid support, oligonucleotides are further purified. Any suitable purification method may be employed, including, but not limited to, high pressure liquid chromatography (HPLC) (e.g., using reverse phase C18 and ion exchange), reverse phase cartridge purification, and gel electrophoresis. However, in preferred embodiments, purification is carried out using ion exchange HPLC. [0287]
  • In some embodiments, multiple HPLC instruments are utilized, and integrated into banks (e.g., banks of 8 HPLC instruments). Each bank is referred to as an HPLC module. Each HPLC module consists of an automated injector (e.g., including, but not limited to, Leap Technologies 8-port injector) connected to each bank of automated HPLC instruments (e.g., including, but not limited to, Beckman-Coulter HPLC instruments). The automatic Leap injector can handle four 96-well plates of cleaved and deprotected oligonucleotides at a time. The Leap injector automatically loads a sample onto each of the HPLCs in a given bank. The use of one injector with each bank of HPLC provides the advantage of reducing labor and allowing integrated processing of information. [0288]
  • In some embodiments, oligonucleotides are purified on an ion exchange column using a salt gradient. Any suitable ion exchange functionality or support may be utilized, including but not limited to, Source 15 Q ion exchange resin (Pharmacia). Any suitable salt may be utilized for elution of oligonucleotides from the ion exchange column, including but not limited to, sodium chloride, acetonitrile, and sodium perchlorate. However, in preferred embodiments, a gradient of sodium perchlorate in acetonitrile and sodium acetate is utilized. [0289]
  • In some embodiments, the gradient is run for a sufficient time course to capture a broad range of sizes of oligonucleotides. For example, in some embodiments, the gradient is a 54 minute gradient carried out using the method described in Tables 1 and 2. Table 1 describes the HPLC protocol for the gradient. The time column represents the time of the operation. The module column represents the equipment that controls the operation. The function column represents the function that the HPLC is performing. The value column represents the value of the HPLC function at the time specified in the time column. Table 2 describes the gradient used in HPLC purification. The column temperature is 65° C. Buffer A is 20 mM Sodium Perchlorate, 20 mM Sodium Acetate, 10% Acetonitrile, pH 7.35. Buffer B is 600 mM Sodium Perchlorate, 20 mM Sodium Acetate, 10% Acetonitrile, pH 7.35. [0290]
  • In some embodiments, the gradient is shortened. In preferred embodiments, the gradient is shortened so that a particular gradient range suitable for the elution of a particular oligonucleotide being purified is accomplished in a reduced amount of time. In other preferred embodiments, the gradient is shortened so that a particular gradient range suitable for the elution of any oligonucleotide having a size within a selected size range is accomplished in a reduced amount of time. This latter embodiment provides the advantages that the worker performing HPLC need not have foreknowledge of the size of an oligonucleotide within the selected size range, and the protocol need not be altered for purification of any oligonucleotide having a size within the range. [0291]
  • In a particularly preferred embodiment, the gradient is a 34 minute gradient described in Tables 3 and 4. The parameters and buffer compositions are as described for Tables 1 and 2 above. Reducing the gradient to 34 minutes increases the capacity of synthesis per HPLC instrument and reduces buffer usage by 50% compared to the 54 minute protocol described above. The 34 minute HPLC method of the present invention has the further advantage of being optimized to be able to separate oligonucleotides of a length range of 23-39 nucleotides without any changes in the protocol for the different lengths within the range. Previous methods required changes for every 2-3 nucleotide change in length. In yet other embodiments, the gradient time is reduced even further (e.g., to less than 30 minutes, preferably to less than 20 minutes, and even more preferably, to less than 15 minutes). Any suitable method may be utilized that meets the requirements of the present invention (e.g., able to purify a wide range of oligonucleotide lengths using the same protocol). [0292]
  • In some embodiments, separate sets of HPLC conditions, each selected to purify oligonucleotides within a different size range, may be provided (e.g., may be run on separate HPLCs or banks of HPLCs). Thus, in some embodiments of the present invention, a first bank of HPLCs are configured to purify oligonucleotides using a first set of purification conditions (e.g., for 23-39 mers), while second and third banks are used for the shorter and longer oligonucleotides. Use of this system allows for automated purification without the need to change any parameters from purification to purification and decreases the time required for oligonucleotide production. [0293]
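  • The routing of oligonucleotides to size-specific HPLC banks can be illustrated with a minimal sketch. Only the 23-39 mer range comes from the text; the bank numbers, the function name `select_hplc_bank`, and the assignment of shorter oligonucleotides to bank 2 and longer ones to bank 3 are assumptions made for illustration.

```python
# Size range handled by the first bank, taken from the 23-39 mer example.
PRIMARY_RANGE = (23, 39)


def select_hplc_bank(length_nt):
    """Return a hypothetical bank number for an oligonucleotide length.

    Bank 1 handles 23-39 mers; banks 2 and 3 (assumed numbering) handle
    shorter and longer oligonucleotides respectively.
    """
    if length_nt < 1:
        raise ValueError("oligonucleotide length must be positive")
    low, high = PRIMARY_RANGE
    if length_nt < low:
        return 2
    if length_nt > high:
        return 3
    return 1
```

Because each bank keeps a fixed set of conditions, the only decision made per sample is which bank receives it, and no HPLC parameter changes between runs.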
  • In some embodiments, the HPLC station is equipped with a central reagent supply system. In some embodiments, the central reagent system includes an automated buffer preparation system. The automated buffer preparation system includes large vat carboys that receive pre-measured reagents and water for centralized buffer preparation. The buffers (e.g., a high salt buffer and a low salt buffer) are piped through a circulation loop directly from the central preparation area to the HPLCs. In some embodiments, the conductivity of the solution in the circulation loop is monitored to verify correct content and adequate mixing. In addition, in some embodiments, circulation lines are fitted with venturis for static mixing of the solutions as they are circulated through the piping loop. In still further embodiments, the circulation lines are fitted with 0.05 μm filters for sterilization. [0294]
  • In some preferred embodiments, the HPLC purification step is carried out in a clean room environment. The clean room includes a HEPA filtration system. All personnel in the clean room are outfitted with protective gloves, hair coverings, and foot coverings. [0295]
  • In preferred embodiments, the automated buffer prep system is located in a non-clean room environment and the prepared buffer is piped through the wall into the clean room. [0296]
  • Each purified oligonucleotide is collected into a tube (e.g., a 50-ml conical tube) in a carrying case in the fraction collector. Collection is based on a set method, which is triggered by an absorbance rate change within a predetermined time window. In some embodiments, the method uses a flow rate of 5 ml/min (the maximum rate of the pumps is 10 ml/min) and each column is automatically washed before the injector loads the next sample. [0297]
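  • One way to picture a collection trigger based on an absorbance rate change within a time window is sketched below. The text does not give the trigger logic, so the slope threshold, the sampling format, and the function name `collection_start` are all assumptions; this is an illustration, not the instrument's actual method.

```python
def collection_start(trace, window, min_slope):
    """Return the first time (min) at which collection would be triggered.

    trace     -- list of (time_min, absorbance) pairs sorted by time
    window    -- (start_min, end_min) in which a trigger is allowed
    min_slope -- assumed threshold for absorbance change per minute
    Returns None if no qualifying rise occurs inside the window.
    """
    start, end = window
    for (t0, a0), (t1, a1) in zip(trace, trace[1:]):
        if t1 < start or t1 > end or t1 == t0:
            continue  # outside the allowed window, or a duplicate time point
        if (a1 - a0) / (t1 - t0) >= min_slope:
            return t1
    return None
```

Restricting the trigger to a window keeps early baseline disturbances from starting collection before the product peak is expected.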
    TABLE 1
    54 Minute HPLC Method
    Time (min)  Module     Function   Value   Duration (min)
    0           Pump       %B         22.00   4.0
    0           Det 166-3  Autozero   ON
    0           Det 166-3  Relay ON   3.0     0.10
    4           Pump       %B         37.00   43.00
    47          Pump       %B         100.00  0.50
    47.5        Pump       Flow Rate  7.5     0.00
    50.0        Pump       %B         5.0     0.50
    53.45       Det 166-3  Stop Data
  • [0298]
    TABLE 2
    54 Minute HPLC Method
    Time           Gradient    Flow Rate
    0              5% B/95% A  5 ml/min
    0-4 min        5-22% B     5 ml/min
    4-47 min       22-37% B    5 ml/min
    47-47.5 min    37-100% B   7.5 ml/min
    47.5-50 min    100% B      7.5 ml/min
    50-50.5 min    100-5% B    7.5 ml/min
    50.5-53.5 min  5% B        7.5 ml/min
  • [0299]
    TABLE 3
    34 Minute HPLC Method
    Time (min)  Module     Function   Value   Duration
    0           Pump       %B         26.00   2.0
    0           Det 166-3  Autozero   ON
    0           Det 166-3  Relay ON   3.0     0.10
    2           Pump       %B         36.00   27.00
    29          Pump       %B         100.00  0.50
    29.5        Pump       Flow Rate  7.5     0.00
    32          Pump       %B         5.0     0.50
    33.45       Det 166-3  Stop Data
  • [0300]
    TABLE 4
    34 Minute HPLC Method
    Time           Gradient    Flow Rate
    0              5% B/95% A  5 ml/min
    0-2 min        5-26% B     5 ml/min
    2-29 min       26-36% B    5 ml/min
    29-29.5 min    36-100% B   6.5 ml/min
    29.5-32 min    100% B      7.5 ml/min
    32-32.5 min    100-5% B    7.5 ml/min
    32.5-33.5 min  5% B        7.5 ml/min
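  • The gradient tables can be read as piecewise schedules of %B against time. The sketch below transcribes the breakpoints of Tables 2 and 4 and interpolates between them. Linear ramps between breakpoints are an assumption, and the names `GRADIENT_54`, `GRADIENT_34`, and `percent_b` are illustrative only.

```python
from bisect import bisect_right

# (time_min, percent_B) breakpoints transcribed from Table 2.
GRADIENT_54 = [
    (0.0, 5.0), (4.0, 22.0), (47.0, 37.0), (47.5, 100.0),
    (50.0, 100.0), (50.5, 5.0), (53.5, 5.0),
]

# (time_min, percent_B) breakpoints transcribed from Table 4.
GRADIENT_34 = [
    (0.0, 5.0), (2.0, 26.0), (29.0, 36.0), (29.5, 100.0),
    (32.0, 100.0), (32.5, 5.0), (33.5, 5.0),
]


def percent_b(schedule, t):
    """Interpolate %B at time t (minutes), assuming linear ramps.

    Times before the first or after the last breakpoint are clamped.
    """
    times = [point[0] for point in schedule]
    if t <= times[0]:
        return schedule[0][1]
    if t >= times[-1]:
        return schedule[-1][1]
    i = bisect_right(times, t)  # index of the first breakpoint after t
    (t0, b0), (t1, b1) = schedule[i - 1], schedule[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
```

For example, `percent_b(GRADIENT_34, 15.5)` falls on the 26-36% B segment, halfway between 2 and 29 minutes, and evaluates to 31% B under the linear-ramp assumption.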
  • 3. Dry-Down Component [0301]
  • When the fraction collector is full of eluted oligonucleotides, they are transferred (e.g., by automated robotics or by hand) to a drying station. For example, in some embodiments, the samples are transferred to customized racks for Genevac centrifugal evaporator to be dried down. In preferred embodiments, the Genevac evaporator is equipped with racks designed to be used in both the Genevac and the subsequent desalting step. The Genevac evaporator decreases drying time, relative to other commercially available evaporators, by 60%. [0302]
  • 4. Desalting Component [0303]
  • In some embodiments, following HPLC, oligonucleotides are desalted. In other embodiments, oligonucleotides are not HPLC purified, but instead proceed directly from deprotection to desalting. In some embodiments, the desalting stations have TECAN robot systems for automated desalting. The system employs a rack that has been designed to fit the TECAN robot and the Genevac centrifugal evaporator without transfer to a different rack or holder. The racks are designed to hold the different sizes of desalting columns, such as the NAP-5 and NAP-10 columns. The TECAN robot loads each oligonucleotide onto an individual NAP-5 or NAP-10 column, supplies the buffer, and collects the eluate. If desired, desalted oligonucleotides may be frozen or dried down at this point. [0304]
  • In some embodiments, following desalting, INVADER and target oligonucleotides are analyzed by mass spectroscopy. For example, in some embodiments, a small sample from the desalted oligonucleotide sample is removed (e.g., by a TECAN robot) and spotted on an analysis plate, which is then placed into a mass spectrometer. The results are analyzed and processed by a software routine. Following the analysis, failed oligonucleotides are automatically reordered, while oligonucleotides that pass the analysis are transported to the next processing step. This preliminary quality control analysis removes failed oligonucleotides earlier in the processing, thus resulting in cost savings and improving cycle times. [0305]
  • 5. Oligonucleotide Dilution and Fill Component [0306]
  • In some embodiments, the oligonucleotide production process further includes a dilute and fill module. In some embodiments, each module consists of three automated oligonucleotide dilution and normalization stations. Each station consists of a network-linked computer and an automated robotic system (e.g., including but not limited to Biomek 2000). In one embodiment, the pipetting station is physically integrated with a spectrophotometer to allow machine handling of every step in the process. All manipulations are carried out in a HEPA-filtered environment. Dissolved oligonucleotides are loaded onto the Biomek 2000 deck and the sequence files are transferred into the Biomek 2000. The Biomek 2000 automatically transfers a sample of each oligonucleotide to an optical plate, which the spectrophotometer reads to measure the A260 absorbance. Once the A260 has been determined, an Excel program integrated with the Biomek software uses absorbance and the sequence information to prepare a dilution table for each oligonucleotide. The Biomek employs that dilution table to dilute each oligonucleotide appropriately. The instrument then dispenses oligonucleotides into an appropriate vessel (e.g., 1.5 ml microtubes). [0307]
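  • A dilution table of the kind described can be sketched from the A260 reading and the sequence using the Beer-Lambert law. The text does not specify the calculation, so everything here is an assumption: the per-base extinction coefficients are commonly cited approximate values that ignore nearest-neighbor effects, the path length is taken as 1 cm, and the function names are hypothetical.

```python
# Approximate molar extinction coefficients at 260 nm (M^-1 cm^-1) per base.
# Summing them per base is a simplification and an assumption of this sketch.
EPSILON_260 = {"A": 15400, "C": 7400, "G": 11500, "T": 8700}


def extinction_coefficient(sequence):
    """Sum the per-base coefficients for a DNA sequence."""
    return sum(EPSILON_260[base] for base in sequence.upper())


def stock_concentration_um(a260, sequence, path_cm=1.0):
    """Beer-Lambert: c = A / (epsilon * l), returned in micromolar."""
    return a260 / (extinction_coefficient(sequence) * path_cm) * 1e6


def dilution_row(name, a260, sequence, target_um, final_ul):
    """Return one row of a dilution table: stock and diluent volumes (ul)."""
    stock_um = stock_concentration_um(a260, sequence)
    if stock_um < target_um:
        raise ValueError(f"{name}: stock is below the target concentration")
    stock_ul = final_ul * target_um / stock_um
    return {
        "name": name,
        "stock_um": stock_um,
        "stock_ul": stock_ul,
        "diluent_ul": final_ul - stock_ul,
    }
```

Because the target concentration is a parameter of each row, the same routine can normalize different kit components to different concentrations.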
  • In some preferred embodiments, the automated dilution and fill system is able to dilute different components of a kit (e.g., INVADER and probe oligonucleotides) to different concentrations. In other preferred embodiments, the automated dilution and fill module is able to dilute different components to different concentrations specified by the end user. [0308]
  • 6. Quality Control Component [0309]
  • In some embodiments, oligonucleotides undergo a quality control assay before distribution to the user. The specific quality control assay chosen depends on the final use of the oligonucleotides. For example, if the oligonucleotides are to be used in an INVADER SNP detection assay, they are tested in the assay before distribution. [0310]
  • In some embodiments, each SNP set is tested in a quality control assay utilizing the Beckman Coulter SAGIAN CORE System. In some embodiments, the results are read on a real-time instrument (e.g., an ABI 7700 fluorescence reader). The QC assay uses two no target blanks as negative controls and five untyped genomic samples as targets. For consistency, every SNP set is tested with the same genomic samples. In preferred embodiments, the ADS system is responsible for tracking tubes through the QC module. Thus, in some embodiments, if a tube is missing, the ADS program discards, reorders, or searches for the missing tube. [0311]
  • In some preferred embodiments, the user chooses which QC method to run. The operator then chooses how many sets are needed. Then, in some embodiments, the application auto-selects the correct number of SNPs based on priority and prints output (picklist). If a picklist needs to be regenerated, the operator inputs which picklist they are replacing as well as which sets are not valid. The system auto-selects the valid SNPs plus replacement SNPs and prints output. Additionally, in some embodiments, picklists are manually generated by SNP number. [0312]
  • The auto-selected SNPs are then removed from being listed as available for auto-selection. In some embodiments, the software prints the following items: SNP/Oligo list (picklist), SNP/Oligo layout (rack setup). The operator then takes the picklist into inventory and removes the completed oligonucleotide sets. In some embodiments, a completed set is unavailable. In this case, the operator regenerates a picklist. Then, in preferred embodiments, the missing SNP set or tube is flagged in the system. Once a picklist is full, the oligonucleotides are moved to the next step. [0313]
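  • The picklist steps above (auto-selection by priority, removal from the available pool, and regeneration with replacements) can be sketched as follows. The data layout, the convention that a lower number means higher priority, and the function names are assumptions for illustration.

```python
def generate_picklist(available, n_sets):
    """Select n_sets SNP sets by priority and remove them from the pool.

    available -- dict mapping SNP id to priority (lower = more urgent,
                 an assumed convention); modified in place
    """
    chosen = sorted(available, key=lambda snp: (available[snp], snp))[:n_sets]
    for snp in chosen:
        del available[snp]  # no longer listed as available for auto-selection
    return chosen


def regenerate_picklist(old_picklist, invalid, available):
    """Keep the valid sets of an old picklist and top up with replacements."""
    valid = [snp for snp in old_picklist if snp not in invalid]
    replacements = generate_picklist(available, len(old_picklist) - len(valid))
    return valid + replacements
```

Removing selected sets from the pool at selection time is what prevents two picklists from claiming the same SNP set.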
  • In some embodiments, the operator then takes the rack setup generated by the picklist and loads the rack. Alternatively, a robotic handling system loads the rack. In preferred embodiments, tubes are scanned as they are placed onto the rack. The scan checks to make sure it is the correct tube and displays the location in the rack where the tube is to be placed. [0314]
  • Completed racks are then placed in a holding area to await the robot prep and robot run. Then, in some embodiments, the operator views what racks are in the queue and determines what genomics and reagent stock will be loaded onto the robot. The robot is then programmed to perform a specific method. Additionally, in some embodiments, the robot or operator records genomics and reagents lot numbers. [0315]
  • In preferred embodiments, a carousel location map is printed that outlines where racks are to be placed. The operator then loads the robot carousel according to the method layout. The rack is scanned (e.g., by the operator or by the ADS program). If the rack is not valid for the current robot method, the operator will be informed. The carousel location for the rack is then displayed. The output plates are then scanned (e.g., by the operator or by the ADS program). If the plate is not valid for the current method the operator is informed. The carousel location for the plate is then displayed. [0316]
  • Then, in some embodiments, the robot is run. The robot then places the plates onto heatblocks for a period of time specified in the method. In some embodiments, the robot then scans the plates on the Cytofluor. Output from the cytofluor is read into the database and attached to the output plate record. [0317]
  • In other embodiments, the output is read on the ABI 7700 real time instrument. In some embodiments, the operator loads the plate on to the 7700. Alternatively, in other embodiments, the robot loads the plate onto the ABI 7700. A scan is then started using the 7700 software. When the scan is completed the output file is saved onto a computer hard drive. The operator then starts the application and scans in the plate bar code. The software instructs the user to browse to the saved output file. The software then reads the file into the database and deletes the file (or tells the operator to delete the file). [0318]
  • The plate reader results (e.g., from a Cytofluor or an ABI 7700) are then analyzed (e.g., by a software program or by the operator). Additionally, in some embodiments, the operator reviews the results of the software analysis of each SNP and takes one of several actions. In some embodiments, the operator approves all automated actions. In other embodiments, the operator reviews and approves individual actions. In some embodiments, the operator marks actions as needing additional review. Alternatively, in other embodiments, the operator passes on reviewing anything. Additionally, in some embodiments, the operator overrides all automated actions. [0319]
  • Depending on the results of the QC analysis, one of several actions is next taken. If the software marks an oligonucleotide set ready for Full Fill, the operator discards the diluted Probe/INVADER oligonucleotide mixes and forwards the samples to the packaging module. [0320]
  • If an oligonucleotide set fails quality control, the data is interpreted to determine the cause of the failure. The course of action is determined by such data interpretation. If the software marks an oligonucleotide Reassess Failed Oligonucleotide, no action by the user is required; the reassessment is handled by automation. If the software marks an oligonucleotide Redilute Failed Oligonucleotide, the operator discards the diluted tubes. No other action is required. If the software marks an oligonucleotide Order Target Oligonucleotide, no action by the user is required. In this case, a synthetic target oligonucleotide is ordered for further testing. If the software marks an oligonucleotide Fail Oligo(s) Discard Oligo(s), the operator discards the diluted and un-diluted tubes. No other action is required. If the software marks an oligonucleotide Fail SNP, the operator discards the diluted and un-diluted tubes. No other action is required. If the software marks an oligonucleotide Full SNP Redesign, the operator discards the diluted and un-diluted tubes. No other action is required. If the software marks an oligonucleotide Partial SNP Redesign, the operator discards the diluted tubes and some of the un-diluted tubes. No other action is required. [0321]
  • In some embodiments, the software marks an oligonucleotide Manual Intervention. This step occurs if the operator or software has determined the SNP requires manual attention. This step puts the SNP “on hold” in the tracking system while the operator investigates the source of the failure. [0322]
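  • The QC outcomes described above amount to a lookup from the software's mark to the operator's required actions. The following sketch encodes that mapping as a table. The mark names follow the text; the action strings, the `OPERATOR_ACTIONS` table, and the `operator_actions` function are illustrative assumptions rather than the actual software.

```python
DISCARD_DILUTED = "discard diluted tubes"
DISCARD_UNDILUTED = "discard un-diluted tubes"
DISCARD_SOME_UNDILUTED = "discard some un-diluted tubes"
FORWARD_TO_PACKAGING = "forward samples to packaging module"
HOLD_AND_INVESTIGATE = "place SNP on hold and investigate"

# An empty list means no operator action is required for that mark.
OPERATOR_ACTIONS = {
    "Full Fill": [DISCARD_DILUTED, FORWARD_TO_PACKAGING],
    "Reassess Failed Oligonucleotide": [],  # handled by automation
    "Redilute Failed Oligonucleotide": [DISCARD_DILUTED],
    "Order Target Oligonucleotide": [],  # a synthetic target is ordered
    "Fail Oligo(s) Discard Oligo(s)": [DISCARD_DILUTED, DISCARD_UNDILUTED],
    "Fail SNP": [DISCARD_DILUTED, DISCARD_UNDILUTED],
    "Full SNP Redesign": [DISCARD_DILUTED, DISCARD_UNDILUTED],
    "Partial SNP Redesign": [DISCARD_DILUTED, DISCARD_SOME_UNDILUTED],
    "Manual Intervention": [HOLD_AND_INVESTIGATE],
}


def operator_actions(mark):
    """Return the operator actions for a QC mark, or raise on unknown marks."""
    try:
        return list(OPERATOR_ACTIONS[mark])
    except KeyError:
        raise ValueError(f"unknown QC mark: {mark!r}") from None
```

Keeping the mapping in a single table makes it easy to see that every failure mark requires discarding the diluted tubes except the two handled entirely by automation.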
  • When a set of oligonucleotides (e.g., an INVADER assay set) is completed, the set is transferred to the packaging station. [0323]
  • In some embodiments of the present invention, the produced detection assays are tested against a plurality of samples representing two or more different alleles (samples containing sequences from individuals with different ethnic backgrounds, disease states, etc.) to demonstrate the viability of the assay with different individuals. In preferred embodiments, the produced assays are tested against a sufficient number of alleles (e.g., 100 or more) to identify which members of the population can be tested by the assay and to identify the allele frequency in the population of the genotype for which the assay is designed. In some embodiments, where certain individuals or classes of individuals are not detected by the detection assay, the target sequence of the individuals is characterized to determine whether the intended SNP is not present and/or whether additional mutations are present that prevent the proper detection of the sample. Any such information may be collected and stored in databases. In some embodiments, target selection, in silico analysis, and oligonucleotide design are repeated to generate assays capable of detecting the corresponding sequence of these individuals, as desired. In some embodiments, allele frequency information is stored in a database and made available to users of the detection assays upon request (e.g., made available over a communication network). [0324]
  • C. Packaging Component [0325]
  • In some embodiments, one or more components generated using the system of the present invention are packaged using any suitable means. In some embodiments, the packaging system is automated. In some embodiments, the packaging component is controlled by the centralized control network of the present invention. [0326]
  • D. Centralized Control Network [0327]
  • In some embodiments, the automated DNA production process further comprises a centralized control system. In some embodiments, the centralized control system comprises a computer system. [0328]
  • In some embodiments, the computer system comprises computer memory or a computer memory device and a computer processor. In some embodiments, the computer memory (or computer memory device) and computer processor are part of the same computer. In other embodiments, the computer memory device or computer memory are located on one computer and the computer processor is located on a different computer. In some embodiments, the computer memory is connected to the computer processor through the Internet or World Wide Web. In some embodiments, the computer memory is on a computer readable medium (e.g., floppy disk, hard disk, compact disk, DVD, etc). In other embodiments, the computer memory (or computer memory device) and computer processor are connected via a local network or intranet. In certain embodiments, the computer system comprises a computer memory device, a computer processor, an interactive device (e.g., keyboard, mouse, voice recognition system), and a display system (e.g., monitor, speaker system, etc.). [0329]
  • In preferred embodiments, the systems and methods of the present invention comprise a centralized control system, wherein the centralized control system comprises a computer tracking system. As discussed above, the items to be manufactured (e.g. oligonucleotide probes, targets, etc) are subjected to a number of processing steps (e.g. synthesis, purification, quality control, etc). Also as discussed above, various components of a single order (e.g. one type of SNP detection kit) are manufactured in separate tubes, and may be subjected to a different number of processing steps. Consequently, the present invention provides systems and methods for tracking the location and status of the items to be manufactured such that multiple components of a single order can be separately manufactured and brought back together at the appropriate time. The tracking system and methods of the present invention also allow for increased quality control and production efficiency. [0330]
  • In some embodiments, the computer tracking system comprises a central processing unit (CPU) and a central database. The central database is the central repository of information about manufacturing orders that are received (e.g. SNP sequence to be detected, final dilution requirements, etc), as well as manufacturing orders that have been processed (e.g. processed by software applications that determine optimal nucleic acid sequences, and applications that assign unique identifiers to orders). Manufacturing orders that have been processed may generate, for example, the number and types of oligonucleotides that need to be manufactured (e.g. probe, INVADER oligonucleotide, synthetic target), and the unique identifier associated with the entire order as well as unique identifiers for each component of an order (e.g. probe, INVADER oligonucleotide, etc). In certain embodiments, the components of an order proceed through the manufacturing process in containers that have been labeled with unique identifiers (e.g. bar coded test tubes, color coded test tubes, etc.). [0331]
  • In certain embodiments, the computer tracking system further comprises one or more scanning units capable of reading the unique identifier associated with each labeled container. In some embodiments, the scanning units are portable (e.g. hand held scanner employed by an operator to scan a labeled container). In other embodiments, the scanning units are stationary (e.g. built into each module). In some embodiments, at least one scanning unit is portable and at least one scanning unit is stationary (e.g. hand held human implemented device). [0332]
  • Stationary scanning units may, for example, collect information from the unique identifier on a labeled container (i.e. the labeled container is ‘read’) as it passes through part of one of the production modules. For example, a rack of 100 labeled containers may pass from the purification module to the dilute and fill module on a conveyor belt or other transport means, and the 100 labeled containers may be read by the stationary scanning unit. Likewise, a portable scanning unit may be employed to collect the information from the labeled containers as they pass from one production module to the next, or at different points within a production module. The scanning units may also be employed, for example, to determine the identity of a labeled container that has been tested (e.g. concentration of sample inside container is tested and the identity of the container is determined). [0333]
  • The scanning units are capable of transmitting the information they collect from the labeled containers to a central database. The scanning units may be linked to a central database via wires, or the information may be transmitted to the central database. The central database collects and processes this information such that the location and status of individual orders and components of orders can be tracked (e.g. information about when the order is likely to complete the manufacturing process may be obtained from the system). The central database also collects information from any type of sample analysis performed within each module (e.g. concentration measurements made during dilute and fill module). This sample analysis is correlated with the unique identifiers on each labeled container such that the status of each labeled container is determined. This allows labeled containers that are unsatisfactory to be removed from the production process (e.g. information from the central database is communicated to robotic or human container handlers to remove the unsatisfactory sample). Likewise, containers that are automatically removed from the production process as unsatisfactory may be identified, and this information communicated to a central database (e.g. to update the status of an order, allow a re-order to be generated, etc). Allowing unsatisfactory samples to be removed prevents unnecessary manufacturing steps, and allows the production of a replacement to begin as early as possible. [0334]
  • As mentioned above, the tracking system of the present invention allows the production of single orders that have multiple components that may proceed through different production modules, and/or that may be processed (at least in part) in separate containers. For example, an order may be for the production of an INVADER detection kit. An INVADER detection kit is composed of at least 2 components (the INVADER oligonucleotide, and the downstream probe), and generally includes a second downstream probe (e.g. for a different allele), and one or two synthetic targets so controls may be run (i.e. an INVADER kit may have 5 separate oligonucleotide sequences that need to be generated). The generation of separate sequences, in separate containers, generally necessitates that the tracking system track the location and status of each container, and direct the proper association of completed oligonucleotides into a single container or kit. Providing each container with a unique identifier corresponding to a single type of oligonucleotide (e.g. an INVADER oligonucleotide), and also corresponding to a single order (a SNP detection kit for diagnosing a certain SNP) allows separate, high through-put manufacture of the various components of a kit without confusion as to what components belong with each kit. [0335]
  • Tracking the location and status of the components of a kit (e.g. a kit composed of 5 different oligonucleotides) has many advantages. For example, near the end of the purification module HPLC is employed, and a simple sample analysis may be employed on each sample in each container to determine if a sample is collected in each tube. If no sample is collected after HPLC is performed, the unique identifier on the container, in connection with the central database, identifies the type of sample that should have been produced (e.g. INVADER oligonucleotide) and a re-order is generated. Identification of this particular oligonucleotide allows the manufacturing process for this oligonucleotide to start over from the beginning (e.g. this order gets priority status over other orders to begin the manufacturing process again). Importantly, the other components of the order may continue the manufacturing process without being discarded as part of a defective order (e.g. the manufacturing process may continue for these oligonucleotides up to the point where the defective oligonucleotide is required). Likewise, additional manufacturing resources are not wasted on the defective component (i.e. additional reagents and time are not spent on this portion of the order in further manufacturing steps). [0336]
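  • The re-order behavior described above, in which one failed component restarts while the rest of the order continues, can be sketched with a small in-memory model. The class and field names, the status strings, and the use of a Python dictionary in place of the central database are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Container:
    barcode: str                # unique identifier on the labeled container
    order_id: str               # order (kit) the container belongs to
    component: str              # e.g. "INVADER oligonucleotide"
    module: str = "synthesis"   # last module where the container was scanned
    status: str = "in_process"  # assumed values: in_process, failed, complete


class TrackingDatabase:
    """Hypothetical stand-in for the central database."""

    def __init__(self):
        self.containers = {}
        self.reorders = []  # (order_id, component) pairs to re-manufacture

    def register(self, container):
        self.containers[container.barcode] = container

    def scan(self, barcode, module):
        """Record the module in which a scanning unit read the container."""
        container = self.containers[barcode]
        container.module = module
        return container

    def report_failure(self, barcode):
        """Mark one container failed and queue a re-order for it alone."""
        container = self.containers[barcode]
        container.status = "failed"
        reorder = (container.order_id, container.component)
        self.reorders.append(reorder)
        return reorder

    def order_status(self, order_id):
        """Map each component of an order to its current status."""
        return {
            c.component: c.status
            for c in self.containers.values()
            if c.order_id == order_id
        }
```

Because a failure touches only the record of the failed container, the other components of the same order keep their status and continue through production.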
  • The unique identifier on each of the containers allows the various components of a given order to be grouped together at a step when this is required (likewise, there is no need to group the components of an order in the manufacturing process until it is required). For example, prior to the dilute and fill module, the various components of a single order may be grouped together such that the contents of the proper containers are combined in the proper fashion in the dilute and fill module. This identification and grouping also allows re-orders to ‘find’ the other components of a particular order. This type of grouping, for example, allows the automated mixing, in the dilute and fill stage, of the first and second downstream probes with the INVADER oligonucleotide, all from the same order. This helps prevent human errors in reading containers and accidentally labeling probes intended for one SNP as specific for a different SNP (i.e. this helps prevent components of different kits from being accidentally mixed together). The identification of individual containers not only allows for the proper grouping of the various components of a single order, but also allows for an order to be customized for a particular customer (e.g. a certain concentration or buffer employed in the second dilute and fill procedure). Finally, containers with finished products in them (e.g. containers with probes, and containers with synthetic targets) need to be associated with each other so they are properly assayed in the quality control module, and packaged together as a single kit (otherwise, quality control and/or a final end-user may find false negatives and false positives when attempting to test/use the kit). The ability to track the individual containers allows the components of a kit to be associated together by informing a robot or human operator which tubes belong together. Consequently, final kits are produced with the proper components. Therefore, the tracking systems and methods of the present invention allow high through-put production of kits with many components, while assuring quality production. [0337]
  • E. Example [0338]
  • This Example describes the production of an INVADER assay kit for SNP detection using the automated DNA production system of the present invention. [0339]
  • 1. Oligonucleotide Design [0340]
  • The sequence of the SNP to be detected is first submitted through the automated web-based user interface or through e-mail. The sequences are then transferred to the INVADER CREATOR software. The software designs the upstream INVADER oligonucleotide and downstream probe oligonucleotide. The sequences are returned to the user for inspection. At this point, the sequences are assigned a bar code and entered into the automated tracking system. The bar codes of the probe and INVADER oligonucleotide are linked so that their synthesis, analysis, and packaging can be coordinated. [0341]
  • 2. Oligonucleotide Synthesis [0342]
  • Once the probe and INVADER oligonucleotide sequences have been designed, the sequences are transferred to the synthesis component. The bar codes are read and the sequences are logged into the synthesis module. Each module consists of 14 MOSS EXPEDITE 16-channel DNA synthesizers (PE Biosystems, Foster City, Calif.), that prepare the primary probes, and two ABI 3948 48-Channel DNA synthesizers (PE Biosystems, Foster City, Calif.), that prepare the INVADER oligonucleotides. Synthesizing a set of two primary and INVADER probes takes 3-4 hours. The instruments run 24 h/day. Following synthesis, the automated tracking system reads the bar codes and logs the oligonucleotides as having completed the synthesis module. [0343]
  • The synthesis room is equipped with centralized reagent delivery. Acetonitrile is supplied to the synthesizers through stainless steel tubing. De-blocking solution (3% TCA in methylene chloride) is supplied through Teflon tubing. Tubing is designed to attach to the synthesizers without any modification of the synthesizers. The synthesis room is also equipped with an automated waste removal system. Waste containers are equipped with ventilation and contain sensors that trigger removal of waste through centralized tubing when the cache pots are full. Waste is piped to a centralized storage facility equipped with a blow out wall. The pressure in the synthesis instruments is controlled with argon supplied through a centralized system. The argon delivery system includes local tanks supplied from a centralized storage tank. [0344]
  • During synthesis, the efficiency of each step of the reaction is monitored. If an oligonucleotide fails the synthesis process, it is re-synthesized. The bar coding system scans the container of the oligonucleotide and marks it as being sent back for re-synthesis. [0345]
  • Following synthesis, the oligonucleotides are transported to the cleavage and deprotection station. At this stage, completed oligonucleotides are subjected to a final deprotection step and are cleaved from the solid support used for synthesis. The cleavage and deprotection may be performed manually or through automated robotics. The oligonucleotides are cleaved from the solid support used for synthesis by incubation with concentrated NaOH and collected. The cleavage step takes 12 hours. Following cleavage, the bar code scanner scans the oligonucleotide tubes and logs them as having completed the cleavage and deprotection step. [0346]
  • 3. Purification [0347]
  • Following synthesis and cleavage, probe oligonucleotides are further purified using HPLC. INVADER oligonucleotides are not purified, but instead proceed directly to desalting (see below). [0348]
  • HPLC is performed on instruments integrated into banks (modules) of 8. Each HPLC module consists of a Leap Technologies 8-port injector connected to 8 automated Beckman-Coulter HPLC instruments. The automatic Leap injector can handle four 96-well plates of cleaved and deprotected primary probes at a time. The Leap injector automatically loads a sample onto each of the 8 HPLCs. [0349]
  • Buffers for HPLC purification are produced by the automated buffer preparation system. The buffer prep system is in a general access area. Prepared buffer is then piped through the wall into the clean room (HEPA environment). The system includes large vat carboys that receive premeasured reagents and water for centralized buffer preparation. The buffers are piped from central prep to the HPLCs. The conductivity of the solution in the circulation loop is monitored as a means of verifying both correct content and adequate mixing. The circulation lines are fitted with venturis for static mixing of the solutions; additional mixing occurs as solutions are circulated through the piping loop. The circulation lines are fitted with 0.05 μm filters for sterilization and removal of any residual particulates. [0350]
  • Each purified probe is collected into a 50-ml conical tube in a carrying case in the fraction collector. Collection is based on a set method, which is triggered by an absorbance rate change within a predetermined time window. The HPLC is run at a flow rate of 5-7.5 ml/min (the maximum rate of the pumps is 10 ml/min.) and each column is automatically washed before the injector loads the next sample. The gradient used is described in Tables 3 and 4 and takes 34 minutes to complete (including wash steps to prepare the column for the next sample). When the fraction collector is full of eluted probes, the tubes are transferred manually to customized racks for concentration in a Genevac centrifugal evaporator. The Genevac racks, containing dry oligonucleotide, are then transferred to the TECAN Nap10 column handler for desalting. [0351]
  • 4. Desalting [0352]
  • Following HPLC purification (probe oligonucleotides) or cleavage (INVADER oligonucleotides), oligonucleotides move to the desalting station. The dried oligonucleotides are resuspended in a small volume of water. Desalting steps are performed by a TECAN robot system. The racks used in Genevac centrifugation are also used in the desalting step, eliminating the need for transfer of tubes at this step. The racks are also designed to hold the different sizes of desalting columns, such as the NAP-5 and NAP-10 columns. The TECAN robot loads each oligonucleotide onto an individual NAP-5 or NAP-10 column, supplies the buffer, and collects the eluate. [0353]
  • 5. Dilution [0354]
  • Following desalting, the oligonucleotides are transferred to the dilute and fill module for concentration normalization and dispensing. Each module consists of three automated probe dilution and normalization stations. Each station consists of a network-linked computer and a Biomek 2000 interfaced with a SPECTRAMAX spectrophotometer Model 190 or PLUS 384 (Molecular Devices Corp., Sunnyvale Calif.) in a HEPA-filtered environment. [0355]
  • The probe and INVADER oligonucleotides are transferred onto the Biomek 2000 deck and the sequence files are downloaded into the Biomek 2000. The Biomek 2000 automatically transfers a sample of each oligonucleotide to an optical plate, which the spectrophotometer reads to measure the A260 absorbance. Once the A260 has been determined, an Excel program integrated with the Biomek software uses the measured absorbance and the sequence information to calculate the concentration of each oligonucleotide. The software then prepares a dilution table for each oligonucleotide. The probe and INVADER oligonucleotide are each diluted by the Biomek to a concentration appropriate for their intended use. The instrument then combines and dispenses the probe and INVADER oligonucleotides into 1.5 ml microtubes for each SNP set. The completed set of oligonucleotides contains enough material for 5,000 SNP assays. [0356]
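  • The concentration and dilution-table calculation described above can be illustrated with a minimal sketch. It assumes the Beer-Lambert law with an extinction coefficient estimated as a simple sum of approximate per-base values at 260 nm; the source does not state which model the Excel program uses, and the function names, the 1 cm path length, and the per-base values are illustrative assumptions rather than the system's actual implementation.

```python
# Approximate per-base molar extinction coefficients at 260 nm (M^-1 cm^-1).
# Assumption: a plain base-composition sum; the source does not specify the model.
EXTINCTION_260 = {"A": 15400.0, "C": 7400.0, "G": 11500.0, "T": 8700.0}


def extinction_coefficient(sequence):
    """Estimate the extinction coefficient of an oligonucleotide from its sequence."""
    return sum(EXTINCTION_260[base] for base in sequence.upper())


def concentration_uM(a260, sequence, path_cm=1.0, predilution=1.0):
    """Beer-Lambert: c = A / (epsilon * l), converted from M to micromolar.

    predilution is the fold-dilution applied before the reading (1.0 = none).
    """
    epsilon = extinction_coefficient(sequence)
    return a260 * predilution / (epsilon * path_cm) * 1e6


def dilution_volumes(stock_uM, target_uM, final_ul):
    """Return (stock volume, diluent volume) in microliters for one dilution-table row."""
    if target_uM > stock_uM:
        raise ValueError("stock is too dilute to reach the target concentration")
    stock_ul = final_ul * target_uM / stock_uM
    return stock_ul, final_ul - stock_ul
```

  • For instance, a hypothetical 4-mer ACGT with A260 = 0.43 gives roughly 10 μM under these assumed coefficients, and diluting a 10 μM stock to 2 μM in 100 μl calls for 20 μl of stock and 80 μl of diluent.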
  • If an oligonucleotide fails the dilution step, it is first re-diluted. If it again fails dilution, the oligonucleotide is re-purified or returned for re-synthesis. The progress of the oligonucleotide through the dilution module is tracked by the bar coding system. Oligonucleotides that pass the dilution module are scanned as having completed dilution and are moved to the next module. [0357]
  • 6. Quality Control [0358]
  • Before shipping, the SNP set is subjected to a quality control assay in a SAGIAN CORE System (Beckman Coulter), which is read on an ABI 7700 real time fluorescence reader (PE Biosystems). The QC assay uses two no target blanks as negative controls and five untyped genomic samples as targets. [0359]
  • The quality control assay is performed in segments. In each segment, the operator or automated system performs the following steps: log on; select location; step specific activity; and log off. The ADS system is responsible for tracking tubes. If a tube is missing, existing ADS program routines will be used to discard/reorder/search for the tube. [0360]
  • In the first step, a picklist is generated. The list includes the identity of the SNPs that are being tested and the QC method chosen. The tubes containing the oligonucleotides are selected by the automated software and a copy of the picklist is printed. The tubes are removed from inventory by the operator and scanned with the bar code reader as being removed from inventory. [0361]
  • The operator or the automated system then takes the rack setup generated by the picklist and loads the rack. Tubes are scanned as they are placed onto the rack. The scan checks to make sure it is the correct tube and displays the location in the rack where the tube is to be placed. Completed racks are placed in a holding area to await the robot prep and robot run. [0362]
  • The operator or the automated system then chooses the genomics and reagent stock to be loaded onto the robot. The robot is programmed with the specific method for the SNP set generated. Lot numbers of the genomics and reagents are recorded. Racks are placed in the proper carousel location. After all the carousel locations have been loaded the robot is run. [0363]
  • Plates are then incubated on the robot. The plates are placed onto heatblocks for a period of time specified in the method. The operator then takes the plate and loads it into the ABI 7700. A scan is started using the 7700 software. When the scan is completed the operator transfers the output file onto a Macintosh computer hard drive. The operator then starts the analysis application and scans in the plate bar code. The software instructs the operator to browse to the saved output file. The software then reads the file into the database and deletes the file. [0364]
  • The results of the QC assay are then analyzed. The operator scans the plate in at a workstation PC and reviews the automated analysis. The automated actions are performed using a spreadsheet system. The automated spreadsheet program returns one of the following results: [0365]
  • 1) Mark SNP Oligonucleotide ready for full fill (Operator discards diluted Probe/INVADER mixes. Requires no other action). [0366]
  • 2) ReAssess Failed Oligonucleotide (Requires no action by operator, handled by automation). [0367]
  • 3) Redilute Failed Oligonucleotide (Operator discards diluted tubes. Requires no other action). [0368]
  • 4) Order Target Oligonucleotide (Requires no action by operator, handled by automation). [0369]
  • 5) Fail Oligo(s) Discard Oligo(s) (Operator discards diluted tubes. Operator discards un-diluted tubes. Requires no other action). [0370]
  • 6) Fail SNP (Operator discards diluted tubes. Operator discards un-diluted tubes. Requires no other action). [0371]
  • 7) Full SNP Redesign (Operator discards diluted tubes. Operator discards un-diluted tubes. Requires no other action). [0372]
  • 8) Partial SNP Redesign (Operator discards diluted tubes. Operator discards some un-diluted tubes. Requires no other action). [0374]
  • 9) Manual Intervention (This step occurs if the operator or software has determined the SNP requires manual attention. This step puts the SNP “on hold” in the tracking system). [0375]
  • The operator then views each SNP analysis and either approves all automated actions, approves individual actions, marks actions as needing additional review, passes on reviewing anything, or overrides automated actions. [0376]
  • Once the SNP set has passed the QC analysis, the oligonucleotides are transferred to the packaging station. [0377]
  • In some embodiments, the produced detection assay is screened against a plurality of known sequences designed to represent one or more population groups, e.g., to determine the ability of the detection assay to detect the intended target among the diverse alleles found in the general population. In preferred embodiments, the frequency of occurrence of the SNP allele in each of the one or more population groups is determined using the produced detection assay. Data collected may be used to satisfy regulatory requirements, if the detection assay is to be used as a clinical product. [0378]
  • IV. Sequence Inputs and User Interfaces [0379]
  • Sequences may be input for analysis from any number of sources. In many embodiments, sequence information is entered into a computer. The computer need not be the same computer system that carries out in silico analysis. In some preferred embodiments, candidate target sequences may be entered into a computer linked to a communication network (e.g., a local area network, Internet or Intranet). In such embodiments, users anywhere in the world with access to a communication network may enter candidate sequences at their own locale. In some embodiments, a user interface is provided to the user over a communication network (e.g., a World Wide Web-based user interface), containing entry fields for the information required by the in silico analysis (e.g., the sequence of the candidate target sequence). The use of a Web based user interface has several advantages. For example, by providing an entry wizard, the user interface can ensure that the user inputs the requisite amount of information in the correct format. In some embodiments, the user interface requires that the sequence information for a target sequence be of a minimum length (e.g., 20 or more, 50 or more, 100 or more nucleotides) and be in a single format (e.g., FASTA). In other embodiments, the information can be input in any format and the systems and methods of the present invention edit or alter the input information into a suitable form for analysis. For example, if an input target sequence is too short, the systems and methods of the present invention search public databases for the short sequence, and if a unique sequence is identified, convert the short sequence into a suitably long sequence by adding nucleotides on one or both of the ends of the input target sequence. Likewise, if sequence information is entered in an undesirable format or contains extraneous, non-sequence characters, the sequence can be modified to a standard format (e.g., FASTA) prior to further in silico analysis. 
The user interface may also collect information about the user, including, but not limited to, the name and address of the user. In some embodiments, target sequence entries are associated with a user identification code. [0380]
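  • The input clean-up described above (removing extraneous, non-sequence characters, enforcing a minimum length, and emitting FASTA) might look like the following minimal sketch. The function names, the default minimum of 20 nucleotides, and the 60-column line width are illustrative assumptions; the database search used to extend short sequences is not shown.

```python
import re


def normalize_sequence(raw, min_length=20):
    """Strip header lines and non-sequence characters, then check the length.

    Caveat of this simple approach: stray words that happen to contain the
    letters A, C, G, T or N would be kept as if they were sequence.
    """
    lines = [ln for ln in raw.splitlines() if not ln.lstrip().startswith(">")]
    cleaned = re.sub(r"[^ACGTNacgtn]", "", "".join(lines)).upper()
    if len(cleaned) < min_length:
        raise ValueError(
            f"sequence has {len(cleaned)} nucleotides; at least {min_length} required"
        )
    return cleaned


def to_fasta(seq_id, sequence, width=60):
    """Format a cleaned sequence as a single FASTA record."""
    body = "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))
    return f">{seq_id}\n{body}\n"
```

  • A pasted record with position numbers and spaces, such as "1 acgt acgt" followed by "11 ACGTACGTACGT 99", is reduced to the 20 uppercase bases it contains; an input shorter than the minimum raises an error so that the caller can decide whether to extend it.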
  • In some embodiments, sequences are input directly from assay design software (e.g., the INVADERCREATOR software). [0381]
  • In preferred embodiments, each sequence is given an ID number. The ID number is linked to the target sequence being analyzed to avoid duplicate analyses. For example, if the in silico analysis determines that a target sequence corresponding to the input sequence has already been analyzed, the user is informed and given the option of by-passing in silico analysis and simply receiving previously obtained results. [0382]
  • Web-Ordering Systems and Methods [0383]
  • Users who wish to order detection assays, have detection assays designed, or gain access to databases or other information of the present invention may employ an electronic communication system (e.g., the Internet). In some embodiments, an ordering and information system of the present invention is connected to a public network to allow any user access to the information. In some embodiments, private electronic communication networks are provided. For example, where a customer or user is a repeat customer (e.g., a distributor or large diagnostic laboratory), a full-time dedicated private connection may be provided between a computer system of the customer and a computer system of the systems of the present invention. The system may be arranged to minimize human interaction. For example, in some embodiments, inventory control software is used to monitor the number and type of detection assays in possession of the customer. A query is sent at defined intervals to determine if the customer has the appropriate number and type of detection assay, and if shortages are detected, instructions are sent to design, produce, and/or deliver additional assays to the customer. In some embodiments, the system also monitors inventory levels of the seller and in preferred embodiments, is integrated with production systems to manage production capacity and timing. [0384]
  • In some embodiments, a user-friendly interface is provided to facilitate selection and ordering of detection assays. Because of the hundreds of thousands of detection assays available and/or polymorphisms that the user may wish to interrogate, the user-friendly interface allows navigation through the complex set of options. For example, in some embodiments, a series of stacked databases are used to guide users to the desired products. In some embodiments, the first layer provides a display of all of the chromosomes of an organism. The user selects the chromosome or chromosomes of interest. Selection of the chromosome provides a more detailed map of the chromosome, indicating banding regions on the chromosome. Selection of the desired band leads to a map showing gene locations. One or more additional layers of detail provide base positions of polymorphisms, gene names, genome database identification tags, annotations, regions of the chromosome with pre-existing developed detection assays that are available for purchase, regions where no pre-existing developed assays exist but that are available for design and production, etc. Selecting a region, polymorphism, or detection assay takes the user to an ordering interface, where information is collected to initiate detection assay design and/or ordering. In some embodiments, a search engine is provided, where a gene name, sequence range, polymorphism or other query is entered to more immediately direct the user to the appropriate layer of information. [0385]
  • In some embodiments, the ordering, design, and production systems are integrated with a finance system, where the pricing of the detection assay is determined by one or more factors: whether or not design is required, cost of goods based on the components in the detection assay, special discounts for certain customers, discounts for bulk orders, discounts for re-orders, price increases where the product is covered by intellectual property or contractual payment obligations to third parties, and price selection based on usage. For example, where detection assays are to be used for or are certified for clinical diagnostics rather than research applications, pricing is increased. In some embodiments, the pricing increase for clinical products occurs automatically. For example, in some embodiments, the systems of the present invention are linked to FDA, public publication, or other databases to determine if a product has been certified for clinical diagnostic or ASR use. [0386]
  • EXAMPLES
  • The following examples are provided in order to demonstrate and further illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof. [0387]
  • In the experimental disclosure which follows, the following abbreviations apply: N (normal); M (molar); mM (millimolar); μM (micromolar); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); pmol (picomoles); g (grams); mg (milligrams); μg (micrograms); ng (nanograms); l or L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); DS (dextran sulfate); C (degrees Centigrade); and Sigma (Sigma Chemical Co., St. Louis, Mo.). [0388]
  • Example 1
  • Designing a 10-Plex (Manual): Test for Invader Assays [0389]
  • The following experimental example describes the manual design of amplification primers for a multiplex amplification reaction, and the subsequent detection of the amplicons by the INVADER assay. [0390]
  • Ten target sequences were selected from a set of pre-validated SNP-containing sequences, available in a TWT in-house oligonucleotide order entry database (see FIG. 5). Each target contains a single nucleotide polymorphism (SNP) to which an INVADER assay had been previously designed. The INVADER assay oligonucleotides were designed by the INVADER CREATOR software (Third Wave Technologies, Inc. Madison, Wis.), thus the footprint region in this example is defined as the INVADER “footprint”, or the bases covered by the INVADER and the probe oligonucleotides, optimally positioned for the detection of the base of interest, in this case, a single nucleotide polymorphism (See FIG. 5). About 200 nucleotides of each of the 10 target sequences were analyzed for the amplification primer design analysis, with the SNP base residing about in the center of the sequence. The sequences are shown in FIG. 5. [0391]
  • Criteria of maximum and minimum probe length (defaults of 30 nucleotides and 12 nucleotides, respectively) were defined, as was a range for the probe melting temperature Tm of 50-60° C. In this example, to select a probe sequence that will perform optimally at a pre-selected reaction temperature, the melting temperature (Tm) of the oligonucleotide is calculated using the nearest-neighbor model and published parameters for DNA duplex formation (Allawi and SantaLucia, Biochemistry, 36:10581 [1997], herein incorporated by reference). Because the assay's salt concentrations are often different than the solution conditions in which the nearest-neighbor parameters were obtained (1M NaCl and no divalent metals), and because the presence and concentration of the enzyme influence optimal reaction temperature, an adjustment should be made to the calculated Tm to determine the optimal temperature at which to perform a reaction. One way of compensating for these factors is to vary the value provided for the salt concentration within the melting temperature calculations. This adjustment is termed a ‘salt correction’. The term “salt correction” refers to a variation made in the value provided for a salt concentration for the purpose of reflecting the effect on a Tm calculation for a nucleic acid duplex of a non-salt parameter or condition affecting said duplex. Variation of the values provided for the strand concentrations will also affect the outcome of these calculations. By using a value of 280 nM NaCl (SantaLucia, Proc Natl Acad Sci U S A, 95:1460 [1998], herein incorporated by reference) and strand concentrations of about 10 pM of the probe and 1 fM target, the algorithm used for calculating probe-target melting temperature has been adapted for use in predicting optimal primer design sequences. [0392]
  • Next, the sequence adjacent to the footprint region, both upstream and downstream, was scanned and the first A or C was chosen as the design start such that, for primers described as 5′-N[x]-N[x−1]- . . . -N[4]-N[3]-N[2]-N[1]-3′, N[1] is an A or C. Primer complementarity was avoided by using the rule that N[2]-N[1] of a given oligonucleotide primer should not be complementary to N[2]-N[1] of any other oligonucleotide, and N[3]-N[2]-N[1] should not be complementary to N[3]-N[2]-N[1] of any other oligonucleotide. If these criteria were not met at a given N[1], the next base in the 5′ direction for the forward primer or the next base in the 3′ direction for the reverse primer was evaluated as an N[1] site. In the case of manual analysis, A/C rich regions were targeted in order to minimize the complementarity of 3′ ends. [0393]
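  • The 3′-end rules above can be expressed as a small check, shown here as a minimal sketch. It assumes that “complementary” means antiparallel (reverse-complement) pairing between the 3′-terminal dinucleotides or trinucleotides of two primers, which the text does not spell out; the function names are hypothetical, and primers are written 5′ to 3′.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_ok(candidate, accepted):
    """Check a candidate primer (5'->3') against the primers accepted so far.

    Rule 1: the 3'-terminal base N[1] must be A or C.
    Rule 2: N[2]-N[1] must not be complementary to N[2]-N[1] of any other primer.
    Rule 3: N[3]-N[2]-N[1] must not be complementary to N[3]-N[2]-N[1] of any other.
    Assumption: complementarity is tested as antiparallel pairing.
    """
    if candidate[-1] not in "AC":
        return False
    for other in accepted:
        for k in (2, 3):
            if candidate[-k:] == reverse_complement(other[-k:]):
                return False
    return True
```

  • When a candidate fails, the caller would shift the N[1] position by one base (5′ for a forward primer, 3′ for a reverse primer) and test again, as described above.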
  • In this example, an INVADER assay was performed following the multiplex amplification reaction. Therefore, a section of the secondary INVADER reaction oligonucleotide (the FRET oligonucleotide sequence, see FIG. 2) was also incorporated as a criterion for primer design; the amplification primer sequence should be less than 80% homologous to the specified region of the FRET oligonucleotide. [0394]
  • The output primers for the 10-plex multiplex design are shown in FIG. 5. All primers were synthesized according to standard oligonucleotide chemistry, desalted (by standard methods), quantified by A260 absorbance, and diluted to 50 μM concentrated stock. 10-plex PCR was then carried out using equimolar amounts of primer (0.01 μM/primer) under the following conditions: 100 mM KCl, 3 mM MgCl2, 10 mM Tris pH 8.0, 200 μM dNTPs, 2.5 U Taq DNA polymerase, and 10 ng of human genomic DNA (hgDNA) template in a 50 μl reaction. The reaction was incubated at (94° C./30 sec, 50° C./44 sec) for 30 cycles. After incubation, the multiplex PCR reaction was diluted 1:10 with water and subjected to INVADER analysis using INVADER Assay FRET Detection Plates, 96 well genomic biplex, 100 ng CLEAVASE VIII enzyme. INVADER assays were assembled as 15 μl reactions as follows: 1 μl of the 1:10 dilution of the PCR reaction, 3 μl of PPI mix, 5 μl of 22.5 mM MgCl2, 6 μl of dH2O, covered with 15 μl of CHILLOUT liquid wax. Samples were denatured in the INVADER biplex by incubation at 95° C. for 5 min., followed by incubation at 63° C., and fluorescence was measured on a Cytofluor 4000 at various timepoints. [0395]
  • Using the following criterion to accurately make genotyping calls (FOZ_FAM+FOZ_RED−2>0.6), only 2 of the 10 INVADER assay calls could be made after 10 minutes of incubation at 63° C., and only 5 of the 10 calls could be made following an additional 50 min of incubation at 63° C. (60 min.) (See, FIG. 6A). At the 60 min time point, the variation between the detectable FOZ values is over 100 fold between the strongest signal (FIG. 6A, 41646, FAM_FOZ+RED_FOZ−2=54.2, which is also far outside of the dynamic range of the reader) and the weakest signal (FIG. 6A, 67356, FAM_FOZ+RED_FOZ−2=0.2). Using the same INVADER assays directly against 100 ng of human genomic DNA (where equimolar amounts of each target would be available), all reads could be made within the dynamic range of the reader and variation in the FOZ values was approximately seven fold between the strongest (FIG. 6, 53530, FAM_FOZ+RED_FOZ−2=3.1) and weakest (FIG. 6, 53530, FAM_FOZ+RED_FOZ−2=0.43) of the assays. This suggests that the dramatic discrepancies in FOZ values seen between different amplicons in the same multiplex PCR reaction are a function of biased amplification, and not variability attributable to the INVADER assay. Under these conditions, FOZ values generated by different INVADER assays are directly comparable to one another and can reliably be used as indicators of the efficiency of amplification. [0396]
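  • The call criterion and the fold-variation comparison used above can be written out as a minimal sketch. The function names are hypothetical; the 0.6 threshold and the signal definition (FOZ_FAM+FOZ_RED−2) are taken from the text.

```python
def net_signal(foz_fam, foz_red):
    """Combined signal over background: FOZ_FAM + FOZ_RED - 2."""
    return foz_fam + foz_red - 2.0


def call_can_be_made(foz_fam, foz_red, threshold=0.6):
    """A genotyping call is accepted when the net signal exceeds the threshold."""
    return net_signal(foz_fam, foz_red) > threshold


def fold_variation(net_signals):
    """Ratio between the strongest and weakest net signals in a set of assays."""
    return max(net_signals) / min(net_signals)
```

  • Applied to the net signals quoted above, fold_variation([54.2, 0.2]) is about 271 for the equimolar multiplex PCR products and fold_variation([3.1, 0.43]) is about 7.2 for the genomic DNA controls.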
  • Estimation of amplification factor of a given amplicon using FOZ values. In order to estimate the amplification factor (F) of a given amplicon, the FOZ values of the INVADER assay can be used to estimate amplicon abundance. The FOZ of a given amplicon with unknown concentration at a given time (FOZm) can be directly compared to the FOZ of a known amount of target (e.g. 100 ng of genomic DNA=30,000 copies of a single gene) at a defined point in time (FOZ240, 240 min) and used to calculate the number of copies of the unknown amplicon. In equation 1, FOZm represents the sum of RED_FOZ and FAM_FOZ of an unknown concentration of target incubated in an INVADER assay for a given amount of time (m). FOZ240 represents an empirically determined value of RED_FOZ (using INVADER assay 41646) for a known number of copies of target (e.g. 100 ng of hgDNA≅30,000 copies) at 240 minutes. [0397]
  • F=((FOZm−1)*500/(FOZ240−1))*(240/m)^2   (equation 1a)
  • Although equation 1a is used to determine the linear relationship between primer concentration and amplification factor F, equation 1a′ is used in the calculation of the amplification factor F for the 10-plex PCR (both with equimolar amounts of primer and optimized concentrations of primer), with the value of D representing the dilution factor of the PCR reaction. In the case of a 1:3 dilution of the 50 μl multiplex PCR reaction, D=0.3333. [0398]
  • F=((FOZm−2)*500/(FOZ240−1)*D)*(240/m)^2   (equation 1a′)
  • Although equations 1a and 1a′ will be used in the description of the 10-plex multiplex PCR, a more correct adaptation of this equation was used in the optimization of primer concentrations in the 107-plex PCR. In this case, FOZ240=the average of FAM_FOZ240+RED_FOZ240 over the entire INVADER MAP plate using hgDNA as target (FOZ240=3.42) and the dilution factor D is set to 0.125. [0399]
  • F=((FOZm−2)*500/(FOZ240−2)*D)*(240/m)^2   (equation 1b)
  • It should be noted that in order for the estimation of amplification factor F to be more accurate, FOZ values should be within the dynamic range of the instrument on which the readings are taken. In the case of the Cytofluor 4000 used in this study, the dynamic range was between about 1.5 and about 12 FOZ. [0400]
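  • As a worked illustration, equations 1a, 1a′ and 1b can be transcribed into code. The sketch below is a literal transcription of the equations as printed, including the placement of D as a multiplier (an assumption about how the printed operator order should be read); the function names and the dynamic-range helper are hypothetical.

```python
def amplification_factor_1a(foz_m, m, foz_240):
    """Equation 1a: F = ((FOZm - 1) * 500 / (FOZ240 - 1)) * (240 / m)^2."""
    return ((foz_m - 1.0) * 500.0 / (foz_240 - 1.0)) * (240.0 / m) ** 2


def amplification_factor_diluted(foz_m, m, foz_240, d, baseline=1.0):
    """Equations 1a' (baseline=1) and 1b (baseline=2), transcribed as printed.

    F = ((FOZm - 2) * 500 / (FOZ240 - baseline) * D) * (240 / m)^2
    Assumption: D multiplies the ratio, following the printed operator order.
    """
    return ((foz_m - 2.0) * 500.0 / (foz_240 - baseline) * d) * (240.0 / m) ** 2


def within_dynamic_range(foz, low=1.5, high=12.0):
    """Approximate Cytofluor 4000 range quoted in the text (about 1.5 to about 12 FOZ)."""
    return low <= foz <= high
```

  • For example, with a hypothetical reading of FOZm=2.0 at m=60 and FOZ240=1.7, equation 1a gives F of roughly 11,400; readings outside the range checked by within_dynamic_range would make such an estimate less reliable.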
  • Section 3. Linear Relationship Between Amplification Factor and Primer Concentration. [0401]
  • In order to determine the relationship between primer concentration and amplification factor (F), four distinct uniplex PCR reactions were run using primers 1117-70-17 and 1117-70-18 at concentrations of 0.01 μM, 0.012 μM, 0.014 μM, and 0.020 μM, respectively. The four independent PCR reactions were carried out under the following conditions: 100 mM KCl, 3 mM MgCl2, 10 mM Tris pH 8.0, 200 μM dNTPs using 10 ng of hgDNA as template. Incubation was carried out at (94° C./30 sec., 50° C./20 sec.) for 30 cycles. Following PCR, reactions were diluted 1:10 with water and run under standard conditions using INVADER Assay FRET Detection Plates, 96 well genomic biplex, 100 ng CLEAVASE VIII enzyme. Each 15 μl reaction was set up as follows: 1 μl of 1:10 diluted PCR reaction, 3 μl of the PPI mix SNP# 47932, 5 μl 22.5 mM MgCl2, 6 μl of water, 15 μl of CHILLOUT liquid wax. The entire plate was incubated at 95° C. for 5 min, and then at 63° C. for 60 min at which point a single read was taken on a Cytofluor 4000 fluorescent plate reader. For each of the four different primer concentrations (0.01 μM, 0.012 μM, 0.014 μM, 0.020 μM) the amplification factor F was calculated using equation 1a, with FOZm=the sum of FOZ_FAM and FOZ_RED at 60 minutes, m=60, and FOZ240=1.7. In plotting the primer concentration of each reaction against the log of the amplification factor Log(F), a strong linear relationship was noted (FIG. 7). Using the data points in FIG. 7, the linear relationship between amplification factor and primer concentration is given by equation 2a: [0402]
  • Y=1.684X+2.6837   (equation 2a)
  • Using equation 2, the amplification factor of a given amplicon Log(F)=Y could be manipulated in a predictable fashion using a known concentration of primer (X). In a converse manner, amplification bias observed under conditions of equimolar primer concentrations in multiplex PCR could be measured as the “apparent” primer concentration (X) based on the amplification factor F. In multiplex PCR, values of “apparent” primer concentration among different amplicons can be used to estimate the amount of primer of each amplicon required to equalize amplification of different loci: [0403]
  • X=(Y−2.6837)/1.684 (equation 2b)
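  • Equations 2a and 2b are inverses of one another, which the following minimal Python sketch illustrates. The function names are illustrative assumptions, and X is taken to be in whatever concentration units were used for the fit in FIG. 7 (the text does not restate them alongside the equation).

```python
# Coefficients of the linear fit reported as equation 2a (Y = Log(F), X = primer concentration).
SLOPE = 1.684
INTERCEPT = 2.6837


def log_amplification_factor(primer_conc):
    """Equation 2a: predicted Log(F) for a given primer concentration X."""
    return SLOPE * primer_conc + INTERCEPT


def apparent_primer_conc(log_f):
    """Equation 2b: 'apparent' primer concentration X for an observed Log(F)."""
    return (log_f - INTERCEPT) / SLOPE


# Round trip for the highest concentration tested in the uniplex series.
x = 0.020
y = log_amplification_factor(x)
x_back = apparent_primer_conc(y)
print(f"Log(F) = {y:.5f}, apparent X = {x_back:.4f}")
```

  • Because the same slope and intercept are used in both directions, a Log(F) value predicted by equation 2a maps back to the primer concentration it was computed from.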
  • [0404] Section 4. Calculation of Apparent Primer Concentrations from a Balanced Multiplex Mix.
  • [0405] As described in a previous section, primer concentration can directly influence the amplification factor of a given amplicon. Under conditions of equimolar amounts of primers, FOZm readings can be used to calculate the “apparent” primer concentration of each amplicon using equation 2. Replacing Y in equation 2 with log(F) of a given amplification factor and solving for X gives an “apparent” primer concentration based on the relative abundance of a given amplicon in a multiplex reaction. Using equation 2 to calculate the “apparent” primer concentration of all primers (provided in equimolar concentration) in a multiplex reaction (FIG. 3A) provides a means of normalizing primer sets against each other. In order to derive the relative amounts of each primer that should be added to an “Optimized” multiplex primer mix R, each of the “apparent” primer concentrations should be divided into the maximum apparent primer concentration (Xmax), such that the strongest amplicon is set to a value of 1 and the remaining amplicons to values equal to or greater than 1:
  • R[n]=Xmax/X[n]  (equation 3)
  • [0406] Using the values of R[n] as an arbitrary value of relative primer concentration, the values of R[n] are multiplied by a constant primer concentration to provide working concentrations for each primer in a given multiplex reaction. In the example shown, the amplicon corresponding to SNP assay 41646 has an R[n] value equal to 1. All of the R[n] values were multiplied by 0.01 uM (the original starting primer concentration in the equimolar multiplex PCR reaction) such that the lowest primer concentration is R[n] of 41646, which is set to 1, or 0.01 uM. The remaining primer sets were also proportionally increased as shown in FIG. 8. The results of multiplex PCR with the “optimized” primer mix are described below.
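  • The normalization of equation 3 and the scaling to working concentrations can be sketched as follows. The amplicon names and apparent concentrations are hypothetical placeholders (they are not the values of FIG. 8); only the 0.01 uM base concentration is taken from the text.

```python
def relative_primer_amounts(apparent_conc):
    """Equation 3: R[n] = Xmax / X[n] for each amplicon."""
    x_max = max(apparent_conc.values())
    return {name: x_max / x for name, x in apparent_conc.items()}


def working_concentrations(r_values, base_conc_um=0.01):
    """Scale each R[n] by a constant primer concentration (0.01 uM in the 10-plex example)."""
    return {name: r * base_conc_um for name, r in r_values.items()}


# Hypothetical apparent primer concentrations (uM), for illustration only.
apparent = {"amplicon_A": 0.020, "amplicon_B": 0.010, "amplicon_C": 0.005}

r = relative_primer_amounts(apparent)
conc = working_concentrations(r)

for name in apparent:
    print(f"{name}: R = {r[name]:.2f}, working concentration = {conc[name]:.3f} uM")
```

  • The strongest amplicon (largest apparent concentration) receives R = 1 and therefore the base concentration, while weaker amplicons receive proportionally more primer.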
  • [0407] Section 5. Using Optimized Primer Concentrations in Multiplex PCR, Variation in FOZ's Among 10 INVADER Assays Is Greatly Reduced.
  • Multiplex PCR was carried out as a 10-plex reaction using varying amounts of primer based on the volumes indicated in FIG. 8 (X[max] was SNP41646, setting 1x=0.01 uM/primer). Multiplex PCR was carried out under conditions identical to those used with the equimolar primer mix: 100 mM KCl, 3 mM MgCl, 10 mM Tris pH 8.0, 200 uM dNTPs, 2.5U taq, and 10 ng of hgDNA template in a 50 ul reaction. The reaction was incubated for (94C./30 sec, 50C./44 sec.) for 30 cycles. After incubation, the multiplex PCR reaction was diluted 1:10 with water and subjected to INVADER analysis. Using INVADER Assay FRET Detection Plates (96 well genomic biplex, 100 ng CLEAVASE VIII enzyme), reactions were assembled as 15 ul reactions as follows: 1 ul of the 1:10 dilution of the PCR reaction, 3 ul of the appropriate PPI mix, 5 ul of 22.5 mM MgCl2, and 6 ul of dH2O. An additional 15 ul of CHILLOUT was added to each well, followed by incubation at 95C. for 5 min. Plates were incubated at 63C. and fluorescence measured on a Cytofluor 4000 at 10 min. [0408]
  • Using the following criterion to accurately make genotyping calls (FOZ_FAM+FOZ_RED−2>0.6), all 10 of 10 (100%) INVADER calls can be made after 10 minutes of incubation at 63C. In addition, the values of FAM+RED−2 (an indicator of overall signal generation, directly related to amplification factor (see equation 2)) varied by less than seven-fold between the lowest signal (FIG. 9, 67325, FAM+RED−2=0.7) and the highest (FIG. 9, 47892, FAM+RED−2=4.3). [0409]
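  • The call criterion and the fold variation quoted above can be checked with a short sketch. The function names and the two example FOZ pairs passed to `can_call` are illustrative assumptions; the 0.6 threshold and the 0.7 and 4.3 net signals are taken from the text.

```python
CALL_THRESHOLD = 0.6  # criterion from the text: FOZ_FAM + FOZ_RED - 2 > 0.6


def net_signal(foz_fam, foz_red):
    """Overall signal indicator used in the text: FAM + RED - 2."""
    return foz_fam + foz_red - 2


def can_call(foz_fam, foz_red, threshold=CALL_THRESHOLD):
    """True when the net signal exceeds the genotype-calling threshold."""
    return net_signal(foz_fam, foz_red) > threshold


# Net signals quoted for the weakest (67325) and strongest (47892) assays in FIG. 9.
lowest, highest = 0.7, 4.3
fold_variation = highest / lowest
print(f"fold variation between highest and lowest signal: {fold_variation:.2f}")

# Hypothetical FOZ pairs, for illustration only.
print(can_call(1.5, 1.2))
print(can_call(1.1, 1.2))
```

  • The ratio of 4.3 to 0.7 is about 6.1, consistent with the statement that signals varied by less than seven-fold.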
  • Example 2
  • Design of 101-plex PCR Using the Software Application [0410]
  • [0411] Using the TWT Oligo Order Entry Database, 144 sequences of less than 200 nucleotides in length were obtained, with SNPs annotated using brackets to indicate the SNP position for each sequence (e.g. NNNNNNN[N(wt)/N(mt)]NNNNNNNN). In order to expand sequence data flanking the SNP of interest, sequences were expanded to approximately 1 kB in length (500 nts flanking each side of the SNP) using BLAST analysis. Of the 144 starting sequences, 16 could not be expanded by BLAST, resulting in a final set of 128 sequences expanded to approximately 1 kB length (See, FIG. 10). These expanded sequences were provided to the user in Excel format with the following information for each sequence: (1) TWT Number, (2) Short Name Identifier, and (3) sequence (see FIG. 10). The Excel file was converted to a comma delimited format and used as the input file for INVADER CREATOR Primer Designer v1.3.3 software (this version of the program does not screen for FRET reactivity of the primers, nor does it allow the user to specify the maximum length of the primer). INVADER CREATOR Primer Designer v1.3.3 was run using default conditions (e.g. minimum primer size of 12, maximum of 30), with the exception of Tmlow, which was set to 60C. The output file (see FIG. 10, bottom of each sheet shows footprint region in upper case letters and SNP in brackets) contained 128 primer sets (256 primers, See FIG. 12), four of which were thrown out due to excessively long primer sequences (SNP # 47854, 47889, 54874, 67396), leaving 124 primer sets (248 primers) available for synthesis. The remaining primers were synthesized using standard procedures at the 200 nmol scale and purified by desalting. After synthesis failures, 107 primer sets were available for assembly of an equimolar 107-plex primer mix (214 primers, See FIG. 12). Of the 107 primer sets available for amplification, only 101 were present on the INVADER MAP plate to evaluate amplification factor.
  • Multiplex PCR was carried out as a 101-plex reaction using equimolar amounts of primer (0.025 uM/primer) under the following conditions: 100 mM KCl, 3 mM MgCl, 10 mM Tris pH 8.0, 200 uM dNTPs, and 10 ng of human genomic DNA (hgDNA) template in a 50 ul reaction. After denaturation at 95C. for 10 min, 2.5 units of Taq was added and the reaction incubated for (94C./30 sec, 50C./44 sec.) for 50 cycles. After incubation, the multiplex PCR reaction was diluted 1:24 with water and subjected to INVADER assay analysis using the INVADER MAP detection platform. Each INVADER MAP assay was run as a 6 ul reaction as follows: 3 ul of the 1:24 dilution of the PCR reaction (total dilution 1:8 equaling D=0.125) and 3 ul of 15 mM MgCl2, covered with 6 ul of CHILLOUT. Samples were denatured in the INVADER MAP plate by incubation at 95C. for 5 min., followed by incubation at 63C., and fluorescence was measured on a Cytofluor 4000 (384 well reader) at various timepoints over 160 minutes. Analysis of the FOZ values calculated at 10, 20, 40, 80, and 160 min. shows that correct calls (compared to genomic calls of the same DNA sample) could be made for 94 of the 101 amplicons detectable by the INVADER MAP platform (FIG. 13 and FIG. 14). This provides proof that the INVADER CREATOR Primer Designer software can create primer sets which function in highly multiplex PCR. [0412]
  • [0413] Using the FOZ values obtained throughout the 160 min. time course, amplification factor F and R[n] were calculated for each of the 101 amplicons (FIG. 15). R[nmax] was set at 1.6, which although Low end corrections were made for amplicons which failed to provide sufficient FOZm signal at 160 min., assigning an arbitrary value of 12 for R[n]. High end corrections for amplicons whose FOZm values at the 10 min. read, an R[n] value of 1 was arbitrarily assigned. Optimized primer concentrations of the 101-plex were calculated using the basic principles outlined in the 10-plex example and equation 1b, with an R[n] of 1 corresponding to 0.025 uM primer (see FIG. 15 for various primer concentrations). Multiplex PCR was carried out under the following conditions: 100 mM KCl, 3 mM MgCl, 10 mM Tris pH 8.0, 200 uM dNTPs, and 10 ng of human genomic DNA (hgDNA) template in a 50 ul reaction. After denaturation at 95C. for 10 min, 2.5 units of Taq was added and the reaction incubated for (94C./30 sec, 50C./44 sec.) for 50 cycles. After incubation, the multiplex PCR reaction was diluted 1:24 with water and subjected to INVADER analysis using the INVADER MAP detection platform. Each INVADER MAP assay was run as a 6 ul reaction as follows: 3 ul of the 1:24 dilution of the PCR reaction (total dilution 1:8 equaling D=0.125) and 3 ul of 15 mM MgCl2, covered with 6 ul of CHILLOUT. Samples were denatured in the INVADER MAP plate by incubation at 95C. for 5 min., followed by incubation at 63C., and fluorescence was measured on a Cytofluor 4000 (384 well reader) at various timepoints over 160 minutes. Analysis of the FOZ values was carried out at 10, 20, and 40 min. and compared to calls made directly against the genomic DNA. Shown in FIG. 13 is a comparison between calls made at 10 min. with a 101-plex PCR with the equimolar primer concentrations versus calls that were made at 10 min. with a 101-plex PCR run under optimized primer concentrations. (Additional data for this example are shown in FIGS. 16a, 16b, and 17.) Under equimolar primer concentration, multiplex PCR results in only 50 correct calls at the 10 min time point, whereas under optimized primer concentrations multiplex PCR results in 71 correct calls, resulting in a gain of 21 (42%) new calls. Although all 101 calls could not be made at the 10 min timepoint, 94 calls could be made at the 40 min. timepoint, suggesting that the amplification efficiency of the majority of amplicons had improved. Unlike the 10-plex optimization, which required only a single round of optimization, multiple rounds of optimization may be required for more complex multiplexing reactions to balance the amplification of all loci.
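  • The reported gain in calls at the 10 min read can be reproduced with simple arithmetic, sketched below. The variable names are illustrative; the counts are those stated in the text.

```python
# Correct genotype calls at the 10 min read, as reported for the 101-plex PCR.
calls_equimolar = 50
calls_optimized = 71
total_amplicons = 101

gain = calls_optimized - calls_equimolar
relative_gain = gain / calls_equimolar  # gain expressed relative to the equimolar result

print(f"new calls: {gain} ({relative_gain:.0%} more than with the equimolar mix)")
print(f"fraction called: {calls_equimolar / total_amplicons:.0%} (equimolar), "
      f"{calls_optimized / total_amplicons:.0%} (optimized)")
```

  • Note that the 42% figure is the gain relative to the 50 calls obtained with the equimolar mix, not a percentage of all 101 amplicons.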
  • Example 3
  • Use of the Invader Assay to Determine Amplification Factor of PCR [0414]
  • The INVADER assay can be used to monitor the progress of amplification during PCR reactions, i.e., to determine the amplification factor F that reflects efficiency of amplification of a particular amplicon in a reaction. In particular, the INVADER assay can be used to determine the number of molecules present at any point of a PCR reaction by reference to a standard curve generated from quantified reference DNA molecules. The amplification factor F is measured as a ratio of PCR product concentration after amplification to initial target concentration. This example demonstrates the effect of varying primer concentration on the measured amplification factor. [0415]
  • PCR reactions were conducted for variable numbers of cycles in increments of 5, i.e., 5, 10, 15, 20, 25, 30, so that the progress of the reaction could be assessed using the INVADER assay to measure accumulated product. The reactions were diluted serially to assure that the target amounts did not saturate the INVADER assay, i.e., so that the measurements could be made in the linear range of the assay. INVADER assay standard curves were generated using a dilution series containing known amounts of the amplicon. This standard curve was used to extrapolate the number of amplified DNA fragments in PCR reactions after the indicated number of cycles. The ratio of the number of molecules after a given number of PCR cycles to the number present prior to amplification is used to derive the amplification factor, F, of each PCR reaction. [0416]
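  • The standard-curve calculation described above can be sketched as follows, assuming a linear signal response within the working range of the assay. All numerical values (standard signals, the unknown's signal, the dilution factor, and the starting target concentration) are hypothetical placeholders, and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys against xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Read a concentration off the fitted standard curve."""
    return (signal - intercept) / slope


def amplification_factor(measured_conc, dilution_factor, initial_conc):
    """F = product concentration after PCR / target concentration before PCR."""
    return measured_conc * dilution_factor / initial_conc


# Hypothetical standards: amplicon concentration (fM) and a made-up linear signal.
standard_conc = [1.0, 2.5, 6.25, 15.6, 39.0, 100.0]
standard_signal = [50.0 * c + 120.0 for c in standard_conc]

slope, intercept = fit_line(standard_conc, standard_signal)

# Hypothetical diluted PCR sample: signal 620, overall 2500-fold dilution,
# starting target concentration 0.005 fM.
diluted_conc = concentration_from_signal(620.0, slope, intercept)
f = amplification_factor(diluted_conc, dilution_factor=2500, initial_conc=0.005)
print(f"diluted sample: {diluted_conc:.2f} fM, amplification factor F = {f:.3g}")
```

  • In practice the dilution would be chosen so that the measured signal falls within the range spanned by the standards, as the text notes.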
  • PCR Reactions [0417]
  • PCR reactions were set up using equimolar amounts of primers (e.g., 0.02 μM or 0.1 μM primers, final concentration). Reactions at each primer concentration were set up in triplicate for each level of amplification tested, i.e., 5, 10, 15, 20, 25, and 30 PCR cycles. One master mix was prepared, sufficient for 6 standard PCR reactions (each in triplicate×2 primer concentrations) plus 2 controls×6 tests (5, 10, 15, 20, 25, or 30 cycles of PCR), plus extra reactions to allow for overage. [0418]
  • Serial Dilutions of PCR Reaction Products [0419]
  • In order to ensure that the amount of PCR product added as target to the INVADER assay reactions would not exceed the dynamic range of the real time assay on the PERSEPTIVE BIOSYSTEMS CYTOFLUOR 4000, the PCR reaction products were diluted prior to addition to the INVADER assays. An initial 20-fold dilution was made of each reaction, followed by subsequent five-fold serial dilutions. [0420]
  • [0421] To create standards, amplification products generated with the same primers used in the tests of different numbers of cycles were isolated from non-denaturing polyacrylamide gels using standard methods and quantified using the PICOGREEN assay. A working stock of 200 pM was created, and serial dilutions of these concentration standards were created in dH2O containing tRNA at 30 ng/μl to yield a series with final amplicon concentrations of 0.5, 1, 2.5, 6.25, 15.62, 39, and 100 fM.
  • INVADER Assay Reactions [0422]
  • Appropriate dilutions of each PCR reaction and the no target control were made in triplicate, and tested in standard, singlicate INVADER assay reactions. One master mix was made for all INVADER assay reactions. In all, there were 6 PCR cycle conditions×24 individual test assays [(1 test of triplicate dilutions×2 primer conditions×3 PCR replicates)=18+6 no target controls]. In addition, there were 7 dilutions of the quantified amplicon standards and 1 no target control in the standard series. The standard series was analyzed in replicate on each of two plates, for an additional 32 INVADER assays. The total number of INVADER assays was 6×24+32=176. The master mix included overage for 32 reactions. The INVADER assay master mix comprised the following standard components: FRET buffer/Cleavase XI/Mg/PPI mix for 192 plus 16 wells. [0423]
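  • The assay count above can be tallied as in the sketch below. It assumes that "in replicate on each of two plates" means duplicate wells on each plate, which is the reading that reproduces the stated 32 standard-series assays; the variable names are illustrative.

```python
# Test assays per PCR cycle condition.
dilution_replicates = 3   # triplicate dilutions
primer_conditions = 2     # 0.02 uM and 0.1 uM
pcr_replicates = 3
no_target_controls = 6

assays_per_cycle_condition = (
    dilution_replicates * primer_conditions * pcr_replicates + no_target_controls
)

cycle_conditions = 6      # 5, 10, 15, 20, 25, 30 cycles

# Standard series: 7 standards + 1 no-target control,
# assumed to be run in duplicate on each of two plates.
standard_series_assays = (7 + 1) * 2 * 2

total_assays = cycle_conditions * assays_per_cycle_condition + standard_series_assays
master_mix_wells = total_assays + 32  # 32 extra reactions in the master mix

print(assays_per_cycle_condition, standard_series_assays, total_assays, master_mix_wells)
```

  • The total of 176 assays plus 32 extra reactions gives 208 wells, which matches the "192 plus 16 wells" quoted for the master mix.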
  • The following oligonucleotides were included in the PPI mix. [0424]
  • 0.25 μM INVADER for assay 2 (GAAGCGGCGCCGGTTACCACCA) [0425]
  • 2.5 μM A Probe for assay 2 (CGCGCCGAGGTGGTTGAGCAATTCCAA) [0426]
  • 2.5 μM G Probe for assay 2 (ATGACGTGGCAGACCGGTTGAGCAATTCCA) [0427]
  • All wells were overlaid with 15 μl of mineral oil, incubated at 95° C. for 5 min, then incubated at 63° C. and read at various intervals, e.g., 20, 40, 80, or 160 min, depending on the level of signal generated. The reaction plate was read on a CytoFluor® Series 4000 Fluorescence Multi-Well Plate Reader. The settings used were: 485/20 nm excitation/bandwidth and 530/25 nm emission/bandwidth for F dye detection, and 560/20 nm excitation/bandwidth and 620/40 nm emission/bandwidth for R dye detection. The instrument gain was set for each dye so that the No Target Blank produced between 100-200 Absolute Fluorescence Units (AFUs). [0428]
  • Results: [0429]
  • [0430] FIG. 21 presents the results of the triplicate INVADER assays in a plot of log10 of amplification factor (y-axis) as a function of cycle number (x-axis). The PCR product concentration was estimated from the INVADER assays by extrapolation to the standard curve. The data from the replicate assays were not averaged but instead were presented as multiple, overlapping points in the figure.
  • These results indicate that the PCR reactions were exponential over the range of cycles tested. The use of different primer concentrations resulted in different slopes such that the slope generated from INVADER assay analysis of PCR reactions carried out with the higher primer concentration (0.1 μM) is steeper than that with the lower (0.02 μM) concentration. In addition, the slope obtained using 0.1 μM approaches that anticipated for perfect doubling (0.301). The amplification factors from the PCR reactions at each primer concentration were obtained from the slopes: [0431]
  • For 0.1 μM primers, slope=0.286; amplification factor: 1.93 [0432]
  • For 0.02 μM primers, slope=0.218; amplification factor: 1.65. [0433]
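  • The per-cycle amplification factors listed above follow from the slopes of the log10(F) versus cycle number plots, since a slope s corresponds to a factor of 10 raised to the power s per cycle. A minimal sketch, with an illustrative function name:

```python
import math


def per_cycle_factor(slope_log10):
    """Per-cycle amplification factor implied by the slope of log10(F) vs. cycle number."""
    return 10 ** slope_log10


# Slope expected for perfect doubling at every cycle.
perfect_doubling_slope = math.log10(2)
print(f"slope for perfect doubling: {perfect_doubling_slope:.3f}")

# Slopes reported for the two primer concentrations.
for primer_um, slope in [(0.1, 0.286), (0.02, 0.218)]:
    factor = per_cycle_factor(slope)
    print(f"{primer_um} uM primers: slope = {slope}, per-cycle factor = {factor:.2f}")
```

  • Rounded to two decimal places, the two factors are 1.93 and 1.65, and the doubling slope is 0.301, in agreement with the values given in the text.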
  • The lines do not appear to extend to the origin but rather intercept the X-axis between 0 and 5 cycles, perhaps reflective of errors in estimating the starting concentration of human genomic DNA. [0434]
  • Thus, these data show that primer concentration affects the extent of amplification during the PCR reaction. These data further demonstrate that the INVADER assay is an effective tool for monitoring amplification throughout the PCR reaction. [0435]
  • Example 4
  • Dependence of Amplification Factor on Primer Concentration [0436]
  • This example demonstrates the correlation between amplification factor, F, and primer concentration, c. In this experiment, F was determined for 2 alleles from each of 6 SNPs amplified in monoplex PCR reactions, each at 4 different primer concentrations, hence 6 primer pairs×2 genomic samples×4 primer concentrations=48 PCR reactions. [0437]
  • Whereas the effect of PCR cycle number was tested on a single amplified region, at two primer concentrations, in Example 3, in this example, all test PCR reactions were run for 20 cycles, but the effect of varying primer concentration was studied at 4 different concentration levels: 0.01 μM, 0.025 μM, 0.05 μM, 0.1 μM. Furthermore, this experiment examines differences in amplification of different genomic regions to investigate (a) whether different genomic regions are amplified to different extents (i.e. PCR bias) and (b) how amplification of different genomic regions depends on primer concentration. [0438]
  • As in Example 3, F was measured by generating a standard curve for each locus using a dilution series of purified, quantified reference amplicon preparations. In this case, 12 different reference amplicons were generated: one for each allele of the SNPs contained in the 6 genomic regions amplified by the primer pairs. Each reference amplicon concentration was tested in an INVADER assay, and a standard curve of fluorescence counts versus amplicon concentration was created. PCR reactions were also run on genomic DNA samples, the products diluted, and then tested in an INVADER assay to determine the extent of amplification, in terms of number of molecules, by comparison to the standard curve. [0439]
  • a. Generation of Standard Curves Using Quantified Reference Amplicons [0440]
  • A total of 8 genomic DNA samples isolated from whole blood were screened in standard biplex INVADER assays to determine their genotypes at 24 SNPs in order to identify samples homozygous for the wild-type or variant allele at a total of 6 different loci. [0441]
  • Once these loci were identified, wild-type and variant genomic DNA samples were analyzed in separate PCR reactions with primers flanking the genomic region containing each SNP. At each SNP, one allele reported to FAM dye and one to RED. [0442]
  • Suitable genomic DNA preparations were then amplified in standard individual, monoplex PCR reactions to generate amplified fragments for use as PCR reference standards as described in Example 3. [0443]
  • Following PCR, amplified DNA was gel isolated using standard methods and quantified using the PICOGREEN assay, as described previously. Serial dilutions of these concentration standards were created as follows: [0444]
  • [0445] Each purified amplicon was diluted to create a working stock at a concentration of 200 pM. These stocks were then serially diluted as follows. A working stock solution of each amplicon was prepared with a concentration of 1.25 pM in dH2O containing tRNA at 30 ng/μl. The working stock was diluted in 96-well microtiter plates and then serially diluted to yield the following final concentrations in the INVADER assay: 1, 2.5, 6.25, 15.6, 39, 100, and 250 fM. One plate was prepared for the amplicons to be detected in the INVADER assay using probe oligonucleotides reporting to FAM dye and one plate for those to be tested with probe oligonucleotides reporting to RED dye. All amplicon dilutions were analyzed in duplicate.
  • Aliquots of 100 μl were transferred, in this layout, to 96 well MJ Research plates and denatured for 5 min at 95° C. prior to addition to INVADER assays. [0446]
  • b. PCR Amplification of Genomic Samples at Different Primer Concentrations. [0447]
  • PCR reactions were set up for individual amplification of the 6 genomic regions described in the previous example on each of 2 alleles at 4 different primer concentrations, for a total of 48 PCR reactions. All PCRs were run for 20 cycles. The following primer concentrations were tested: 0.01 μM, 0.025 μM, 0.05 μM, and 0.1 μM. A master mix for all 48 reactions was prepared according to standard procedures, with the exception of the modified primer concentrations, plus overage for an additional 23 reactions (16 reactions were prepared but not used, and overage of 7 additional reactions was prepared). [0448]
  • c. Dilution of PCR Reactions [0449]
  • Prior to analysis by the INVADER assay, it was necessary to dilute the products of the PCR reactions, as described in Examples 1 and 2. Serial dilutions of each of the 48 PCR reactions were made using one 96-well plate for each SNP. The left half of the plate contained the SNPs to be tested with probe oligonucleotides reporting to FAM; the right half, with probe oligonucleotides reporting to RED. The initial dilution was 1:20; subsequent dilutions were 1:5 up to 1:62,500. [0450]
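  • The dilution scheme described above (an initial 1:20 dilution followed by successive 1:5 dilutions up to 1:62,500) can be enumerated with a short sketch. The function name and its parameters are illustrative assumptions.

```python
def dilution_series(initial_fold=20, step_fold=5, max_fold=62500):
    """Cumulative fold dilutions: an initial dilution followed by repeated serial steps."""
    folds = [initial_fold]
    while folds[-1] * step_fold <= max_fold:
        folds.append(folds[-1] * step_fold)
    return folds


series = dilution_series()
print(series)  # [20, 100, 500, 2500, 12500, 62500]
```

  • The series therefore comprises six dilutions per PCR reaction, the initial 1:20 dilution and five successive 1:5 steps.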
  • d. INVADER Assay Analysis of PCR Dilutions and Reference Amplicons [0451]
  • INVADER analysis was carried out on all dilutions of the products of each PCR reaction as well as the indicated dilutions of each quantified reference amplicon (to generate a standard curve for each amplicon) in standard biplex INVADER assays. [0452]
  • All wells were overlaid with 15 μl of mineral oil. Samples were heated to 95° C. for 5 min to denature and then incubated at 64° C. Fluorescence measurements were taken at 40 and 80 minutes in a CytoFluor® 4000 fluorescence plate reader (Applied Biosystems, Foster City, Calif.). The settings used were: 485/20 nm excitation/bandwidth and 530/25 nm emission/bandwidth for F dye detection, and 560/20 nm excitation/bandwidth and 620/40 nm emission/bandwidth for R dye detection. The instrument gain was set for each dye so that the No Target Blank produced between 100-200 Absolute Fluorescence Units (AFUs). The raw data is that generated by the device/instrument used to measure the assay performance (real-time or endpoint mode). [0453]
  • The dependence of lnF on c shown in FIG. 22 demonstrates different amplification rates for the 12 PCRs under the same reaction conditions, although the difference is much smaller within each pair of targets representing the same SNP. The upper plot (22A) illustrates the results obtained from the alleles detected with the INVADER probe oligonucleotide reporting to FAM dye; the lower plot (22B) illustrates those obtained from the alleles reporting to RED (Note: one amplicon expected to report to RED is missing because it mistakenly contained the allele reporting to FAM). The amplification factor strongly depends on c at low primer concentrations, with a trend to plateau at higher primer concentrations. This phenomenon can be explained in terms of the kinetics of primer annealing. At high primer concentrations, fast annealing kinetics ensure that primers are bound to all targets and the maximum amplification rate is achieved; in contrast, at low primer concentrations the primer annealing kinetics become a rate-limiting step, decreasing F. [0454]
  • [0455] This analysis suggests that plotting amplification factor as a function of primer concentration in $\ln(2 - F^{1/n})$ vs. $c$ coordinates should produce a straight line with a slope $-k_a t_a$. Re-plotting of the data shown in FIG. 23 in the $\ln(2 - F^{1/n})$ vs. $c$ coordinates demonstrates the expected linear dependence for low primer concentrations (low amplification factor), which deviates from linearity at 0.1 μM primer concentration ($F$ is $10^5$ or larger) due to a lower than expected amplification factor. The $k_a t_a$ values can be calculated for each PCR using the following equation.
  • $F = z^n = \left(2 - e^{-k_a c t_a}\right)^n$
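  • The relation F = (2 − exp(−k_a·c·t_a))^n and its inversion for the product k_a·t_a can be sketched as follows. The function names and the numerical value chosen for k_a·t_a are hypothetical placeholders; only the 20-cycle count and the 0.025 μM primer concentration correspond to conditions used in this example.

```python
import math


def amplification_factor(ka_ta, c, n):
    """F = z**n with per-cycle factor z = 2 - exp(-ka_ta * c)."""
    z = 2 - math.exp(-ka_ta * c)
    return z ** n


def ka_ta_from_f(f, c, n):
    """Invert the model: ln(2 - F**(1/n)) = -ka_ta * c. Requires F**(1/n) < 2."""
    return -math.log(2 - f ** (1 / n)) / c


# Hypothetical value of k_a * t_a (per uM), for illustration only.
ka_ta = 50.0
c = 0.025   # uM, one of the primer concentrations tested
n = 20      # PCR cycles used in this example

f = amplification_factor(ka_ta, c, n)
recovered = ka_ta_from_f(f, c, n)
print(f"F = {f:.3g}, recovered k_a*t_a = {recovered:.2f}")
```

  • In this model the per-cycle factor z approaches 2 (perfect doubling) as the primer concentration increases and approaches 1 (no amplification) as it goes to zero, which is the plateau behavior described in the text.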
  • Example 5
  • Invader Assay Analysis of 192-Plex PCR Reaction [0456]
  • This example describes the use of the INVADER assay to detect the products of a highly multiplexed PCR reaction designed to amplify 192 distinct loci in the human genome. [0457]
  • Genomic DNA Extraction [0458]
  • [0459] Genomic DNA was isolated from 5 ml of whole blood and purified using the Autopure, manufactured by Gentra Systems, Inc. (Minneapolis, Minn.). The purified DNA was in 500 μl of dH2O.
  • Primer Design [0460]
  • [0461] Forward and reverse primer sets for the 192 loci were designed using Primer Designer, version 1.3.4 (See Primer Design section above, including FIG. 4A). Target sequences used for INVADER designs, with no more than 500 bases flanking the relevant SNP site, were converted into a comma-delimited text file for use as an input file for PrimerDesigner. PrimerDesigner was run using default parameters, with the exception of oligo Tm, which was set at 60° C.
  • Primer Synthesis [0462]
  • Oligonucleotide primers were synthesized using standard procedures in a Polyplex (GeneMachines, San Carlos, Calif.). The scale was 0.2 μmole, desalted only (not purified) on NAP-10 and not dried down. [0463]
  • PCR Reactions [0464]
  • [0465] Two master mixes were created. Master mix 1 contained primers to amplify loci 1-96; master mix 2, 97-192. The mixes were made according to standard procedures and contained standard components. All primers were present at a final concentration of 0.025 μM, with KCl at 100 mM, and MgCl at 3 mM. PCR cycling conditions were as follows in a MJ PTC-100 thermocycler (MJ Research, Waltham, Mass.): 95° C. for 15 min; 94° C. for 30 sec, then 55° C. 44 sec×50 cycles.
  • Following cycling, all 4 PCR reactions were combined and aliquots of 3 μl were distributed into a 384 deep-well plate using a CYBI-well 2000 automated pipetting station (CyBio AG, Jena, Germany). This instrument makes individual reagent additions to each well of a 384-well microplate. The reagents to be added are themselves arrayed in 384-well deep half plates. [0466]
  • INVADER Assay Reactions [0467]
  • INVADER assays were set up using the CYBI-well 2000. Aliquots of 3 μl of the genomic DNA target were added to the appropriate wells. No target controls consisted of 3 μl of Te (10 mM Tris, pH 8.0, 0.1 mM EDTA). The reagents for use in the INVADER assays were standard PPI mixes, buffer, FRET oligonucleotides, and Cleavase VIII enzyme and were added individually to each well by the CYBI-well 2000. [0468]
  • Following the reagent additions, 6 μl of mineral oil were overlaid in each well. The plates were heated in a MJ PTC-200 DNA ENGINE thermocycler (MJ Research) to 95° C. for 5 minutes then cooled to the incubation temperature of 63° C. Fluorescence was read after 20 minutes and 40 minutes using the Safire microplate reader (Tecan, Zurich, Switzerland) with the following settings: 495/5 nm excitation/bandwidth and 520/5 nm emission/bandwidth for F dye detection; and 600/5 nm emission/bandwidth, 575/5 nm excitation/bandwidth; Z position, 5600 μs; number of flashes, 10; lag time, 0; integration time, 40 μsec for R dye detection. Gain was set for F dye at 90 nm and R dye at 120. The raw data is that generated by the device/instrument used to measure the assay performance (real-time or endpoint mode). [0469]
  • Of the 192 reactions, genotype calls could be made for 157 after 20 minutes and 158 after 40 minutes, or a total of 82%. For 88 of the assays, genotyping results were available for comparison from data obtained previously using either monoplex PCR followed by INVADER analysis or INVADER results obtained directly from analysis of genomic DNA. For 69 results, no corroborating genotype results were available. [0470]
  • This example shows that it is possible to amplify more than 150 loci in a single multiplexed PCR reaction. This example further shows that the amount of each amplified fragment generated in such a multiplexed PCR reaction is sufficient to produce discernable genotype calls when used as a target in an INVADER assay. In addition, many of the amplicons generated in this multiplex PCR assay gave high signal, measured as FOZ, in the INVADER assay, while some gave such low signal that no genotype call could be made. Still other amplicons were present at such low levels, or not at all, that they failed to yield any signal in the INVADER assay. [0471]
  • Example 6
  • Optimization of Primer Concentration to Improve Performance of Highly Multiplexed PCR Reactions [0472]
  • Competition between individual reactions in multiplex PCR may aggravate amplification bias and cause an overall decrease in amplification factor compared with uniplex PCR. The dependence of amplification factor on primer concentration can be used to alleviate PCR bias. The variable levels of signal produced from the different loci amplified in the 192-plex PCR of the previous example, taken with the results from Example 3 that show the effect of primer concentration on amplification factor, further suggest that it may be possible to improve the percentage of PCR reactions that generate sufficient target for use in the INVADER assay by modulating primer concentrations. [0473]
  • For example, one particular sample analyzed in Example 5 yielded FOZ results, after a 40 minute incubation in the INVADER assay, of 29.54 FAM and 66.98 RED, while another sample gave FOZ results after 40 min of 1.09 and 1.22, respectively, prompting a determination that there was insufficient signal to generate a genotype call. Modulation of primer concentrations, down in the case of the first sample and up in the case of the second, should make it possible to bring the amplification factors of the two samples closer to the same value. It is envisioned that this sort of modulation may be an iterative process, requiring more than one modification to bring the amplification factors sufficiently close to one another to enable most or all loci in a multiplex PCR reaction to be amplified with approximately equivalent efficiency. [0474]
  • All publications and patents mentioned in the above specification are herein incorporated by reference as if expressly set forth herein. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in relevant fields are intended to be within the scope of the following claims. [0475]
  • 1 759 1 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or t. 1 cactagaccg cctgtcccca agggagcctc agtggggcga cagggtgctc ggcggactcc 60 acctcaggcc ctccccactg ttgctgtgca ttcctgtgca ggtgcatctc tttcttacta 120 actggtattt attaagggag gtgctctgta ggtctggagc ctttccctca tcctttttgc 180 gagtccccac ctttttgttt tttttttttt ctttgaggct cactagagga cgcagaacct 240 tgggagattg atttgcacag aactccccac ctcccacttt tacaatttcc agtttctgat 300 tgaaaatttt agggtttctc cccactgccc ttccctatct ttccttcccc tcaacaccat 360 gaaggaaaaa cacacacggc agggcttttt gtagccctga aggcaacttt agacatttaa 420 aatccagcac tttaatctct tgttctctgt gaatcactat gagaagtgaa tggttttaaa 480 ggctgtaatg ctatgttgga aattggtttg ttttgccttt tattgaaaag gtaagatcat 540 gtgattggaa gaacacaact nttggcttgg gaagaggact ttgctgctga agtgttttct 600 accttctgag tgtgtttaag gcaggatttg gagggaagga ccagcttagg gagagtgtct 660 gagccacagc gtcaggatgg gggaaaccac atgggatcca tcaagttcca gttgaacagg 720 agcaagatca gaacttagga gggcagtgtc agctcccttg ttggctgtca aggaacaccg 780 atctagtaga aacccacttg gttgtgaccc aggtagaggt agatgccata catttgagat 840 atgcgtcctt aaggaacctg acaagcagac tgaagggatg gtaagtgtga cagcctgata 900 agttttctca aagcccagga tacagagcca gtgttttctg taactggaga cctcagttag 960 gccaacttcg aattccagag caacgtagga agtctattca gcagaaactc gacattgttc 1020 a 1021 2 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
2 ataccaaaag taattgtagt actgaatttt gctgtcattt aagccaatgg tttgcactga 60 aactctgtag acaactctga tactgccatt ccctgttctt actgcctaca atgatagtga 120 gcacaccaag tagcaatcac ctgttcattg ttttcttaca tagactttag gtccctatgg 180 tttactaaag gctggcagat aataagtatt caataatatg tcttaaggca ttttaatact 240 ctagatgctc tgaatcctaa tctcaaaagg attaacttta aaatagaagt tagaagaacc 300 aagactatct tgtcaggggt gtattttgag agtggcagac ttttcagtgc ctttccattc 360 atgacacttc ttgaatctct ggcagaacca gccagccgtg ttcacagtgt caaatgaagg 420 gatgtctttg attgcttcca ggtgttcctc agcaccaccg gagggggatg ggtgatcagc 480 cgaatctttg actcgggcta cccatgggac atggtgttca tgacacgctt tcagaacatg 540 ttgagaaatt ccctcccaac nccaattgtg acttggttga tggagcgaaa gataaacaac 600 tggctcaatc atgcaaatta cggcttaata ccagaagaca ggtaaatata atgtgactgc 660 caagggcttt taggaagaag gagcctctgc ctgtccagca gcctatacaa gccaggcagt 720 accacagcaa catggctgaa tgtgtgggaa cacttgatac aaatttgctt gataataaca 780 gctaactgtt cttaagtact cagaaagtga aattatgtat ttcaccttgt cagcaacact 840 ttacgtatta ttataataat ccttttatta tggagaaact gaaacagcaa aattcagcca 900 tttacccaag ctcactgagt agtaagtgaa ctctgtgacc ttggcaagtt acttgatcct 960 cagctgtagc aaccaaaaga gaatgatttg tctatgactt tgttgataaa agaaacacac 1020 t 1021 3 1021 DNA Homo sapiens misc_feature (438)..(438) n is a, c, g, or t 3 cagctgtggg gtcaggaagg gcttgaagta tgggacacta gcctgcccca cctccactct 60 gcagcaccca caggaccacc ctcatgcccc tggcaacagc atgcagggca gctgcaggat 120 ccaggtggga cccagatact atatgaagga gccaccttac ctgctttttg caaagctact 180 gggatggcat aggcaggtcc aatgcccatg atgtcaggtg ggaccccaac cactgcataa 240 gacctcagga ccccaaggat gggaaggccc aactcttctg ccttggacct ccgggccagc 300 aggatggcag ctgccccatc actcacctgg ctagagtttc ctaggggcaa actgttgggg 360 taagaaggca tcggggtggg gatgaggaga tcccagccct cccacttcta ctttgcagag 420 gggcctggtc tattccangt tcccagagta cagcacccag catggccatg gcctgctttc 480 tcatacccct accccggacc agtntcacca gctgtggtag aaccatcttt cttgaaggca 540 ggcttcagtt tggccaggcc ntccatggtg gtgctggggc ggataccctc atcctgggtc 600 acagtgatgc tcctcttggt 
gcccttgtca tcatggaccg tggtggtcac aggcacaatc 660 tcagcttgga aacagccctt gctctgggct cttgctgccc tgccagcacc atggacagcc 720 agcttcagac tcccttgggg ttcccttcct tccctgcccc caacccctat ccatttgggt 780 agacacaagc tcaggctgct aaattcaggg acatgctcga ctttggggga gctctgaggg 840 catggctaag gccttacagg gccttcttca ccatcagccc cagacctcca gatcgtggcc 900 aatcccaacc tcaaaggggg gaaagggtgt ttggaagtgg tgcctccact tagagccctt 960 tgtccaagag ggattaagcc tgcttgattc tctctgctaa actgaggatg gaaccccaga 1020 a 1021 4 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 4 gaggatcaaa gcacctggta caatgcctgg ccagaaagtt gaataatcga atatagctaa 60 cgtcactatt gcaggctggc tatgtgcctg gcggtgttct tagccattta caagtatgaa 120 ctcatttaat cctcataaga tcctgtatga ggtgagtaag ctgttaattc ccttccttgc 180 ccatactctg tgactccaac ccaccacagt tgaatttctc cttatgaatt ataaatcaga 240 aaacggcccc aaattctgtc atgtctaagt gggaaaatgg aagaaggcat tgatttctcc 300 cctactcaag cagaagagaa ttaacctcag tccctgcttt gcccatattc cttccccagg 360 gccccaggaa gaagacatgg aaaaacaata tttccaccaa agtttatttc tctgaaacaa 420 tcaccagttg ctgtcctcta tggcacactg agagccccag gagggtcttt aactcccttc 480 ctcagattat attcatccca gaaatatagc cttggacaat aatttggtta cagcatagtc 540 ccaggaatga ggtcccccaa nttgctaagt tttacatagg ggagactggg aaattcaaag 600 aattggatgg agaaaccata ggatccaaga taatgtcagg gggttgaaga tgttggagag 660 gcatggtagc atcattgagt ttgaatctcc ttctcacttg gagtggaagt tgtaggattc 720 tgcctctagg aaatgtgcca tcctacagaa taaataaaag ggagataatg aggcttcaac 780 ccaacttgcc cccatcgttt gtcactgtaa ccatcccatg ccttaataca gtgatactga 840 aaactccagg gcaccaacaa ctaatacaaa ggaagcacct tcagcctcct ctccacagac 900 atcccacttg gtagaagagg aggatgctcc ttcctgctct taatcctagc aatggcagct 960 taaatcatgc ccttgcctag atcctcatgg aagctcaccc atataataat caagattagt 1020 t 1021 5 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or a. 
5 aggtgcactt tttccaggac ctcctgcaca ggtgtgatat ttagcctgga agcaatgtgt 60 acatggaatg ccctacaggc acaggaggca tccctggaga ctgaatggtg tctgggaaga 120 gtagggccac agagctgagc ccctatggac tgcagcagag ggcctggctc caatcctagc 180 ctaccatatc ccagtcccat gatcgtgagt agtcccatgg gatcaagtgc tctcattcat 240 aaaagaaggg aggtaacagc tgccccactc acgccccagg atcatccggc agtcaaaggg 300 gattcaggtg cttcctggaa gacagagtca caggggaccc tccttttccc agccacccat 360 atcagtccac cttttgggtt ttgaccttta ctatgtggtt ttctagactt ctattgacaa 420 atcctgcttt atggacaggg atgcttttca tttagattgg gggccactcc ccaacatctc 480 atttattttt cacagctctg gtcccatgga gtcttgtttg agtgcaagtg aactgaattt 540 cccaattcct caaaaagagc natagtaata aaaaccataa tagtgacact tacatatgga 600 tagtgctttg tagtttagaa aatgctttca ccaactgatt gccatgacag ccctgagaag 660 taacctactc tacagatgag gagcctagag agagaaagtg actttcctgg gcacataggc 720 ccatgaggtt ctggtgccag cataatagac tagtcaaatt tccagactct ggagtcagac 780 tgcctgagtt caaaccatgg gtcctcttgg tcaggtttta taaccactct aaaactctgt 840 ttgcccatct gtaaagtgag cacaattaca gaatctacct aatagggctg tctgtatgtc 900 aatgggcttg gcctgtgcct gaggaaatgc tanccccatg atcctgcagc catggttagg 960 aaggacatgg cagggaatgg gacctttcac agaccgggct gtggccagca gccagggccg 1020 a 1021 6 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
6 tcatacaact ccttgcagtt catgtaagga ctcggatttt acctggagtg gaaaaagaag 60 cactgaaaga tttgagcagg ggagtaacct gatagcgttt atgtttagtc ctgccacttc 120 gacagataaa cgcaccaatg ggcttgatga gatttaggcc aacccataac cgcccctcaa 180 cttctttcct ttcaatttca aaactcctct atggcttcct ccatctgttc ttccttctga 240 gaagtgctct ctctgcccct ttacagaact aaccacttcg gcaactcctt ggacactttc 300 cttcttgtta ataatttgct ttctccgccc ctcaaaagct tgctgtttct gtaaatcatt 360 acctgtaaga ggaaccgctg ggagtcctgt aaactttagc ccagagcttg gctcctcctc 420 cagaatgtct ccaccaatca aggaaagtgt tttgggccag tcttgctcct ccggattgtc 480 agactgctcc tccctcttct ttagactgcc acgaggaaaa agcagatgtg agaactcaag 540 gttcagggct gctcttctaa naaacaagtc tgccataatc tccatctgtg ttggaatctg 600 ttaactagtg agtacctcat ctcccctcct gtgtaagatt tcctgaactg gcacatctgt 660 tttttgagca aagataacaa acagatgaac aaaaccaaca atcaaaaatg ctgtcattaa 720 agtcttgggc agccaaagtt tctctcagaa tttctcagtt gtgtgatact atctattaag 780 tgatgaggag tatgcacaca caaaaggcta taaatgtagc agctgagttt tcatgttgag 840 ccttttggtg ctatttgatt ttttgaaaaa ctatgtacat gtattaagtt gataaatttt 900 ttttttaatt ttaattgaac cagatgcggt ggctcaagcc tgtaatccca ccactttagg 960 aggctatggt gggcagatgc agatcacttg aggccaggag ttcgagacca gcttggccaa 1020 c 1021 7 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or t. 
7 tatgtgttga atgaaaggct gggtcatatg tgacccttgt gagcagctgt ttccgtggac 60 tgctcctggg tcccctcctc cacccgccct gcctctccca tttcatccta ggaggtgcct 120 gtggccgggc gcagtagctc atgcctgtaa tcccagcact ttgggaggcc gaggcgggcg 180 gaccacctga ggtcaggaat ttgagactag ccggcccaac atggcgaaac cccatctcta 240 ctaaacatac aaaaaattag ccaggcgtcg tggcgggcgc ctgtaatccc agctactcag 300 gaggctgagg caggagaatc gcttgaaccc aggaggcgga gcttgcagtg ggccgagatt 360 gcgccactgc actctagcct gggggacaac agcgaaactc cgtctcaaaa atatatatat 420 atattaatta aataaaaaaa cgaggtgcct tctcctgact ccctgatccc cgcgctctcc 480 agctctgccc tcgcgatcgc tggagccccc tgaggaactc acgcagacgc ggctgcaccg 540 cctcatcaat cccaacttct ncggctatca ggacgccccc tggaagatct tcctgcgcaa 600 agaggtgccg agcacagccg tagccagggg aggggctgaa gcggggcagg ggaggggctg 660 aagcgagcag aggagggtct aggacttggg gagggagccc aggaggacag aaaaaggccg 720 ggctgaaacc aggggtgggg ttacagccgg ggcggaactg catttagggg gcggggccgg 780 gtgtgaagca aggccagggg gcagtcggac agtacccact gaagccccgc ccctgcaggt 840 gttttacccc aaggacagct acagccatcc tgtgcagctt gacctcctgt tccggcaggt 900 gaggtcctgt ctcccctttc tgcctcagtg aactcagcag ggctgtgtgg acgcaaagat 960 gagctagctg caaagcctgc ctctgcatgt tgggatttgg ggtccttgac aggggtgagg 1020 a 1021 8 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
8 ctatatgctt gaaagaattt ataattaaaa ttttttttaa aaaaagagca tgaagacttg 60 cacagcaaga tatcagaaag ctaaatggaa attttcttct tagctatgtg aaagacacag 120 gcagagcacc agatggttca gtagcctgag ttctagaaat aatctcaaca tggtaagagg 180 gtctgtaagc tagcctacac ctatgcgaaa cagggtttta tgcatgggac actattccag 240 tagaaaatgc aggatttgag tagacttcta gagttggttt taaaatgatt taatgtaagg 300 catcaaatct agacaatcag taagagagta acccatacag gctatatttt cacatgttct 360 ataaagtata gtttggtgtc tacagcctgc aaaccacagc caggccccaa atctttcaag 420 ttggcccctg actctttcct gctgtctcca tatgaccgag tatgcactga actatcagcg 480 tttccaggtt cctctccagg caccgcagag tggtggcgct ctcacaaagg catgacagga 540 agacagggtg tgaggttgga nggagagagg ctgtagctga ggaaaagcac agcccatggc 600 attttactgt aatgcctgaa caaatgcact taatgaatat gtggcaaatg taggctcaga 660 agtatcattt ctttcctgta aatgtaaatg ctctccctct gaagttcctg tgggaatggc 720 ttctggattc tgggggtgag tgtggggcca ccctccacga ggcctctgcc tacctgaaag 780 catcattcca tagaccctcc cattgttcac acacagtgga cctaactctc cactttcact 840 ttttcttctg taatagttta taacagtcaa tagaactccc acattagctt ttagggtcat 900 cacagaatac aaaatgttga agatacatat tttatctttt ctatctttct ccttagtatc 960 caggtacact aactctgata ttctaacaga aattatacag acaccatgat caccatcttg 1020 a 1021 9 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
9 ggttcactca cccctcctcc cacctcggca gccctgggat gtcgctgctg actcaggagg 60 aacccgaggt gccgtagcgg ctgctccaat attgcagaag aggttcctca ggcagctctg 120 cccacagccc caagtcacga attccgtgac tccagctcca tcccaggccc cagggtacct 180 ggcccagggt tgtgctgccg cagacttggc ctgtaccatc caggcggcgg tggggagctg 240 gggttggaaa ggcttcttgg agtggactcc tgggtctgtc tgggagacgg ggaggaaggg 300 acactctgaa catcaccagg ggctgctggg gggccctggc cacccccaga gtcagaacag 360 gcaggtgggg caggatctca ggtcatccta tgctacactc agccattgcg tggcccctct 420 cctccctgtg cctggccttt tggccagccc tggggccacc gagaggatgc agcaccgaac 480 cctccaggag cccccagtgc tgccgtctgt gggacaggga caatcccatc cccactgcta 540 ctgtctgtgc tgtgctgggc ncagagctgg acacctccaa ggcccagcgc ccgtagtggc 600 tctcatcatg gacaattcac aggcagatgg tggccagctc tgtggcctgc agggactggg 660 agcggcgcca gaccatctag gccccaacct atctgcatta tcctggaaga cttcctggag 720 gaggcttcta agctgaggcc caaggaccat gtcaggtcta ggactaggac cagtgcaggc 780 cgaggccaga gagacagctg ggcttccagg tagggtcaaa gtgaggtggg cagcaggtgt 840 gggggccagg ggactcgggg acttcctctc cggctgggcc cgcctgacgt gggaggcagc 900 cagggttaat catttccacg aagccttgac cccacctgcc ttggcgctct gctcccgcct 960 cccactgccc ctcaggccag ctcaggagcc atggggcgct gggcctgggt ccccagcccc 1020 t 1021 10 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
10 tttatggcac aaatggggcc gggggcaggc ccaggggcaa ttcaacagga ggcaagagcc 60 cagggctcca gagtggagag acaggaggca gctcagtccc cagaccccag cagagcatct 120 ggggcctcgg ccccactcca gagcttcttc ctgagggagc catgcacagc aatgctggga 180 gagggactga tggggtgggg tcaggcctcc tgccacagag ctgggctgca gagcccagat 240 ggaaagacac agtgaagagc tcaacctcct tccaagctct ccttctcagg gcttcaggtt 300 ccagagcccc aggggagctc ccagccaggg gcagggtcac cttgatattc acaactgggc 360 ttgtgggggc catcttcagt gcaaccgttg tgacaaagtc aagaggctgc ctccctgaag 420 cagacccact gcctacgcca cactgacggt ccagaggccc cctcctgagg gcggccagca 480 aggggcactg tggcagctcc cactgtgcct gtcccagact gggtcagcag gtctctctgg 540 acagcacact gcaccaagta ngcccaccaa aaacgcatca ggtgtggcca tggcccacag 600 taccttcttc attccctgcc tctaacatgt gcggtctgaa tgaattttgt cactcttctg 660 ccatttataa aggagaagac agtgatccaa agctatgcat gtttctgaag ccctcaagga 720 agctcggtgc aggccatcac ttcttttggc agaaggcggg ctgtggtctc tatgtacaca 780 cgcgagcccg ccagtgacgt gcggcagtgc gtggcgtcca ggctgggaca ggggcctttc 840 aagtctcccc agggaccggt gttttctaca acagacaggt gctcccagac cgttggggta 900 caggccaggc cgtctacacc acagtattga gggagctgcg gctgtggcgg ccaccccctg 960 gcagtgcctc tgcagctggg gtgctcccgc tctgggcagg gtcagggggc acgagcaggg 1020 c 1021 11 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or c. 
11 ttaatataaa taggatatca taataaatag aaatcatgcc aggtcagacg cacagcacgc 60 ttggagctca gggttccctg agaccctgac cctaagttct gctgttccct tgccctgggg 120 accagagacg gcctccagtc cccctcaagt acctctgtgt gacctcacaa ggcctcccag 180 ggcctcagat gtgagctgct actctgagct accccagccc cttcttacag acctttaccc 240 agaggaagag cctgggtccc tcagaacctc tgcacctgac ttagcaacct gcccctgccc 300 tacccacctc cacaaacccc tgctgcaggt ccagccatca gaccctggcc atcccaggct 360 gcagggaaga tcacggggaa gagaacgaag aacctaccaa agctttccag gcctctcctc 420 ctcccagtgt cttccttccc aggcctgaag gtggcttctc tgcctcccca agagcctgaa 480 tgccaagtga cctccttctg gaaacttctg ccagattgtt cctatgccca agttctctga 540 tcatcctcaa aagaagacag ncttccatcc cagaggcccc tctctatctt ccactcatca 600 aacttctagg ggacaaggag tcctttggga tcctagcccc tctggcccac ctaagtccca 660 acctaagggg cagcaaaggc acagatggtg ataatttgct gggggctggt ccactcccct 720 gggccctgct gtctcaccct gtggtcaggg ctcttgtaga tgacttgtgt agtttgttca 780 ctgcacaaag tgagcaaggg gccaaaggga caagtagagg cagaagtcca gcccacgctc 840 cccagtccac aatctcccag aggaaggggc accttcttct agctccctcc ctatggaagt 900 ttccactctg ctcagcttca tcacagccca gcccagagtg gagtggactg gccaggcacc 960 ctcggggtct gccagcagcc cccatttggg tttagcgatg ccctgggccc cagccaccct 1020 t 1021 12 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or c. 
12 tgtgacaatc agcaaagccc cacccaggcc cccatctggg atgatgggag agctctggca 60 gatgtcccaa tcctggaggt catccattag gaattaaatt ctccagcctc actctcggct 120 ctttcctact tgttagtagt cttgggatgg tggtagtcag aggcagggac tgaagaggtg 180 agggaatgac agaaccgaca tttaccaggc accagctgta tacattacac atgccatctc 240 ctttaatctg catcacaacc ctgtgagatc agtgctattc ttagacccat ttcacaggtg 300 agcgaactga ggcctttaaa aggttacatc aacctctcaa gatcagacac caaaccatag 360 ttcagctagg tgtcgcaggg gggaatactt attaagtgct aagcactgta tatgtattgg 420 ttcacttaat cctcaacaac cctatgaggt agctcctgtt tagagacccc ctttttttag 480 aggaggaaac taaggcttag agtgcaagag ggaggtcctt tgcgcaaagg catggaggag 540 atttgaattt aggtttaggg ntgggccagg aagggcacgg cagccgttaa aaaaagaggc 600 ccccctggga ggaggggagc tgaaagccct ctccaacacc caccccaatc ctggattcag 660 acacagacat ttctgtgaca tccctaactt cccacctgct acctcaggcc acagcaccca 720 ggcactaggg ctcccctagg caggtttttg aggcatgtat tatttttgca acacggacat 780 acatgtacct cctcctggta ctgcctgggg ctgctgcaat aagttaccct ttccccattc 840 tcatctgtat gtgaagttcc ctggcaaggc caaagcccag ggcatcagaa tgagcttcct 900 gaacaccaca tccaggcata gaagagttgt gtcatacata gctcaaggtt acccagaaca 960 gcaggagatg tggtccagca tttgggcctt gagatccccc cattcatcct cttgattgtc 1020 c 1021 13 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or t. 
13 ccaccaccga ggccgagctg ctggtgtcgg gcgacgagaa ctgcgcctac ttcgaggtgt 60 cggccaagaa gaacaccaac gtggacgaga tgttctacgt gctcttcagc atggccaagc 120 tgccacacga gatgagcccc gccctgcatc gcaagatctc cgtgcagtac ggtgacgcct 180 tccaccccag gcccttctgc atgcgccgcg tcaaggagat ggacgcctat ggcatggtct 240 cgcccttcgc ccgccgcccc agcgtcaaca gtgacctcaa gtacatcaag gccaaggtcc 300 ttcgggaagg ccaggcccgt gagagggaca agtgcaccat ccagtgagcg agggatgctg 360 gggcggggct tggccagtgc cttcagggag gtggccccag atgcccactg tgcgcatctc 420 cccaccgagg ccccggcagc agtcttgttc acagacctta ggcaccagac tggaggcccc 480 cgggcgctgg cctccgcaca ttcgtctgcc ttctcacagc tttcctgagt ccgcttgtcc 540 acagctcctt ggtggtttca nctcctctgt gggaggacac atctctgcag cctcaagagt 600 taggcagaga ctcaagttac accttcctct cctggggttg gaagaaatgt tgatgccaga 660 ggggtgagga ttgctgcgtc atatggagcc tcctgggaca agcctcagga tgaaaaggac 720 acagaaggcc agatgagaaa ggtctcctct ctcctggcat aacacccagc ttggtttggg 780 tggcagctgg gagaacttct ctcccagccc tgcaactctt acgctctggt tcagctgcct 840 ctgcaccccc tcccaccccc agcacacaca caagttggcc cccagctgcg cctgacattg 900 agccagtgga ctctgtgtct gaagggggcg tggccacacc tcctagacca cgcccaccac 960 ttagaccacg cccacctcct gaccgcgttc ctcagcctcc tctcctaggt ccctccgccc 1020 g 1021 14 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
14 gcctatggtg cagggctggc agaggcgggg ccaggattct agcttcccca cacaccagcc 60 ctgtggcatc attcttccca acgtccaaac gtttttccaa gggggagaaa tggactgggt 120 catgtaaaga aatactcatt tttagggctt tttatgtggc cttcaaagca cgttgcaaac 180 aaatcccttt cactcctcag aggaggagcc attaggaagg tagggggcga caggcacagc 240 ctacagcctc tcctcaggag gacagagggg gtcatcgcat ttgagccccc tgcagtcatc 300 tcgggggctc ctgagggtcc aggtccacat gttcgagggt ctgcagcaca tccacggcgc 360 tgtaggactt ccaggcctgc atgttacagc tcttcaggat ggctcccagc tgcctgccag 420 ggcctacttc gaaagtttgg gggaaccccc tgcccttttt cctttcgtat atggcatgca 480 tcgtctgctc ccacttcact ggggagacca gctgctgggc cagcagcttg tggatgtgcc 540 cgggatgcct gtatctatgc ncgtggacgt tggagtagac agaaaccaga ggcttcttaa 600 tgtcgactgc ctttaaagct tgcgtcaggg gctccacggc tggctccatg aggcgggtgt 660 ggaatgcgcc actaaccggc aacatcctgg tgcgtctgaa atgaaactta gaggaattct 720 tctggagaaa ccgtagagcc tggggaagga aggaggtttc agccgagcaa tgtcccagaa 780 atccgccttt acagatctga ccattcacag ggccaaactg ggagggtgac cacaaagaga 840 cccacagctg ctagatgtgg acatgtgacc tgtctgtccc agcaccatcc ccaggcaatt 900 cacttaacat cctggaatct cttctgtccc agccttcaaa taagcacagt tccatctact 960 tcacaacgct gccaggaaga gcaaacccta caaggcatgc aacagtgtct ggtagaggaa 1020 a 1021 15 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
15 acgttatcag gcacaaaccc cctccagaca cctgagcctc ccccacaggc tcccagtgag 60 gagccatcac atgcccaggc cagccgaggg gccctcaggc atggggatct gggcaatggc 120 agcaagctgg gcggggggtg cagccaggat gacagcagat ctgcagggcg gggtcctcgc 180 cccgggccac ctggctgggg ccgaaggtca cagctgcgtc taactgggcc ttgagcagct 240 gaagctgttt cagggcttgc agcacctctg gggtggcccc ggccacaccc cccagcaggt 300 tgtagttctc accagggtcc ttggacaggt catagagcag cgggggctca tgagcagtca 360 gagagctgga ggcgtggcag gcagggtctg cagtggtatc actgtgggca gagcctgggg 420 agggggccaa ttctgtgcac agggcaaggg cgagaggagg ggccagggat ctagggctcc 480 ggggaggggt cagcaggtcg gggggaggga tccacgggga ggggttaccc tgggtgaaga 540 agtgagcctt gtactttcca ntccgcacag caaaaacccc acggacctcg tctgggtagg 600 acgggtagaa gaagagagac tgccgagggc tctgggggca gagtcagggg tcacggggcg 660 gggcaggccc caagcactgc acatacctgg ggctgccagc cctggtggga ggccctggac 720 gtgcaccgct tcttgcccac ccaggaacct gagaggtggc gccacttgga tgccactcag 780 tgcaggaggc actgaggcac agactctcag gcactgccca cactcacccc aggggaaggc 840 caggacaggg gccaaggatc tgggatcagg ggtcaccggc cctaccttgc ctgtgcccag 900 cagcaggggg ctgaggtcaa agccatccaa ggtgacattg ggcagtgggg ccccagccag 960 ggctgccagg gtaggcagca ggtccaggga gctggccagc tcgtgggtca cgcctggggg 1020 c 1021 16 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
16 gaatggtaag aaacattctt cagctcaaga tggtgaccag aggcatccag cactcacttc 60 cttcacaaag gactcaaaca gcaaatgaat aatcacatgt caagtagagc agcttagaaa 120 gaacactgga attcagaggg aaaggacaag gaacttcgga aacatgcaaa gagaatgatg 180 tgaagcagcc ggcccagcca ggatcagctc agatccaaga gaaactgccc aacgtaggga 240 aaaggtaaat gagagatccc cgcaaggctg cattcccacc acagactcct gtggccctag 300 ccacagagag ccccttggcc ctcatgggct ttgagactag tatagagagc cgcctgcatt 360 gttccaaaga gggattttat gatgggtcct acacatcctc tgagacctga gcagctgcag 420 cacagcacca ttttgagagc ccacccctga ccagacccca tcccgccctg gggctcaaca 480 gcccctgcat ctccacatcc atggagtcct gctgacattc cgccatgtcc acccagaagg 540 ctgcagcctc acaatgcagg ntgactgggt ccccagcaat ctagtctaca catgtcctat 600 aacctgggaa tgggtggtgc accacaccag ggaggctgcc cctgggacaa agggagccaa 660 agcccatgtt tcccagagcc gcagagctgc ccgcctggga ccactgccac tgacagcacc 720 cccaccatcc ccccagcagc ggggtcactg tgcacttgtg atatggtttg gctgtgtccc 780 cacccaaatc tcatctccag ttgtaatcca aattgtaatc cccacgtgtc aggggaggga 840 cctggtggga ggtcattgga ttacaggggc ggtttcctcc atgttgttct catgatagtg 900 agtaaattct catgagatct gatggtttta taagtgtttg atagttcctt cttcacacac 960 actctctcct gtcgccatgt gaaaatgtcc ttgcttcccc tttgccttcc gccatgactg 1020 t 1021 17 1021 DNA Homo sapiens misc_feature (508)..(508) n can be g or t. 
17 cacctccctt aactccccag ccatgccccg tgggtatctg ttttcccagt tttgtagatg 60 aaagcacagc tcagagaggt ttactcagtt gcctggagtc acacagtcaa caagtggaga 120 gccagtcatt gaatctggta ccacaaactc ttcctgctgc aacagctgtg cttttgcagg 180 cactgacttt ggaataccct cagctgattc acagggtcct ttgtcctggg gaatggcctt 240 ccctgtctcc ttcagggaaa gggtttcatc cttcagggaa gattcattga atcaggattt 300 gctgggtttt tttcattttt ttttttcatt tctttttttt ttacacgaat gggcttcctg 360 gcccgcattt tgatttgcgc ttgggtttat gaattgagga atcacagtca gccttgggaa 420 ttagttgcaa gataaatatt gcaatcctgg ttaaggactt aagaattgtc acttgtgtgt 480 gtatattgtt gttgttgttg caacggtnct gtgtacgcac ggttacagtg gatcaaattt 540 ggggagttag gaagtggcgt tggtttgtgg ttagacttgg gggaggtgtc gctttcggtt 600 gttggtgtgc tggtggctgt gttcctgtga tatggaatgt actgtctgag aatgtgttca 660 ggggtctgtg gttatgtgga tatgggtgtg tagctgctga tgacatggat ggagggatgt 720 atctgggtgt gtttctgcag aacaagtgat acctgtacca tgtgactttg tcagttccac 780 catgtccagg cacaggtcgg gggggttgtc catggttctg aacgtatctg cccccatttt 840 acagatagga aaccaagact tagagaggcc aagtcatctg cttgaagtca tctagctgag 900 aagcggctga gcctgaaggg aaaccagggc tgccttcaga gtccagcctc ttttccctgc 960 tccccaggaa aggttttagt aacaataaaa ggtttaaatg ccagcaaaag gtctaaacgc 1020 c 1021 18 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
18 ttgccccaca gacaagatga tcccccctgg catgttgtta ggggcaaatt gctgtcctgc 60 tcagagtggc atctttcaat gttgcctcca tcttggccaa gaggtccctg cctcctgatc 120 cggcacagct gagctgaggc agatgtgacc agttttcaag ctaccagccc tgggcagagg 180 aagatgtcaa caattccaga gcagagggaa gaggcacctt ccttgaccac accagtggcc 240 tcctgaagtt ccatgctttt aagagctggg accttgggag gatgattcaa accctcaatt 300 cctcctccct gggaactttt taccaccttt acctatttat caaaatcata ttcatcttta 360 ccatcactgt cactgtaatc tacattccat cacctttatc aggtgctgct gagtacaaag 420 cacttgggat gggagacaca gcactgaatt cacaaacatt ggaccaaact gtttgtcccc 480 atctgggttc atgaggccac ctctttgctc aatccatgcc tcttgccctc agtcaacaag 540 acattcctag agggaaaggg ntgctgctct gggagtcaac ctgagttcct ccctcctggg 600 aagctgggtt ggcaagattc taggacactc acctgcatgg acatcacctc tgtgacaaat 660 gcttacctgt ttctcatctt cagacttggc gatatcaagc ctgttctgga ccatgaccag 720 gctggctcat atctctggtt tagagaaacc tatgaataac tggggacaaa cagactcttt 780 ggtagcagca gacacatgtg atccatcaag atcaaccaag gttgcaactg gagcgtccac 840 tgccagagac ctttggctct tcaagctcgg gacaaaaaag aagactctgt tgtcccttgg 900 taacccagtc cctgcttttg tagctatcac agcagaaagc aactcttcct gaagaccaaa 960 cactcgtcat ccacattcct tgaatggcca atccttccat ctggaggcct ggctcagaaa 1020 g 1021 19 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
19 gtttgatggg acaagatagg acagtggtta agagtgtgac ctcagcagct gactgcctgg 60 gtgtaaagcc taccatgtgg tcaagcacac gggtggctct accacttacc aaccatgtga 120 ccttgggcgg ttaacagccc tgtgactcgg tttccccatc tgaaaagtga ggatcatagc 180 agtatctacc tcctgcggtg gtcggaaggc agaaaagaat tggcacatgt gaaagtactt 240 agcacaggct tggtgcatag caagtcctga ggaaatgtat tcactgtcat cagtttcacc 300 cgctttgaaa ggcaggcaaa gaaagcacct gacaaaacct tttgatcccc cacgccttgt 360 ctcccacacc caggacattc ccctgactcc catcttcacg gacaccgtga tattccggat 420 tgctccgtgg atcatgaccc ccaacatcct gcctcccgtg tcggtgtttg tgtgctggta 480 aggggtgacc ccagcctgga gaggcagcgt ggcagagtgg ccaagggccg agtcagatgg 540 acatgagtct agttcctggc nccgtcactt accactgtgt taccttgagc aactctcttg 600 gcctctctga aatgcccaca tcgtagagtc actgtgagaa ttaaatgaga tgaagcaggc 660 aaagcattta tccaaggccc agcacacagg gtatgctcta aaaataatag ctgccattct 720 gttctcttgc ttaaccctct accaggcagt tagcaacctc ctatgcagtg gaaatgcagc 780 tcatctgact cattcattaa acagactttt attgaccacc tattatgagc taggtccaca 840 acagcaagat gagaaccaag ggaaaaagtg cctgtgatta gatggctagc aacccaaaag 900 ggacccttgg ggtcctcacg tccatcccat cttcatgcca ggcagagctc ttctttgaaa 960 atctgtggag tcagaggtgt aaggcattgg gacaggtggg ggtgagagtt ccccccctca 1020 t 1021 20 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
20 gccctgccct gtagtggctt ctcaatgaat atgtagttgc cttattctca caaacaccag 60 gctttcctca catcagcacc cggtgtgata ggtaagagtg tgtgatacta gaaacgtcag 120 cttatccaaa aatgtatttc tttctctcat gagagcctcg tgagctctcc agcttgctgg 180 aactttctaa gacctaacac ttgccaaatt ccttgcagca attgtctggt ttgtggtacc 240 acaatcgaac ccaccaccct gacgtatttg ctgctcagaa ccaccgatct tccaagttct 300 catcactcca gtgcagctcc tgtgacaaaa ccttccccaa caccattgag cacaagaagc 360 acatcaaagc agaacatgca ggtggagttt gggtaccgcc ggcagagagc gggaggggct 420 tgatggtgta gcctcctggg ccccaccaga aatccccact tctaatagtc tagtgtgatg 480 tgcagtggtc attgcctttg ttctgcccca gcgcacctgt ccgtagcagc agcagtcagt 540 agcagcagct tgagtggcag nggttctcaa acctggaagc gtagcgcagt gtaagctccc 600 accagccctg agtgagagct tgttggggca cctgggaagg gtgtcagcct cagtggtagg 660 caggcctgag tggaaatcct gattccagca cttatcagct acatgacctt ggcaagtgac 720 ttcccttttc tgagcctgtt tccttctctc caggatggca gttattaaaa cctactttgc 780 aggtaaattt ggtgataatc acaacagctg tcagttacag agtgtttcct atgtgcaaga 840 caccatgcta agcacctcgt gtatattttc tcatttcatt ctcacaacat ccctctgagc 900 atccaggcag tctggatcca gatctcatgc tctttaccac tagattgtac aaatatacca 960 taggttataa gattcctggc acttggtaga tgcttgctaa gtattggcca tcgccccaac 1020 c 1021 21 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
21 taaatttctt acaaggtctc ttctagttct aattttttaa aaaatgttat gacctctgcc 60 cagatttttt gtctcactgg aattttatga aatcaaatag tttgtaagtg gaccattata 120 ggactgtttt gcccagttct ttgttgtaag ggtgtttgac cggttgaatc atggtattta 180 aaaaattctt atacaactcc agatctaatg gtaggctaag ttgtggtgat gcttatactc 240 agtgatattg ggtgtgtatt ataagaatga agagagcgga gaacaaacat aaacattaat 300 gttaatgaca aacattaacc caagtacaag gttaatgttt agtcaatata gcaaacatgt 360 aatttacaag attaaaaata attaggcttg tgataaagtc aatgaatttc ctacgtaatt 420 gtaacattag actgttttat tatttgtcct gacattttgc agaatccaag attaattaaa 480 gaaatggttt caagaagagg gtgaatacta taaaaataga cttaccttcc tgaattgagg 540 aattcatcag gaaagcctca ngtgtgcaaa tgagccatcc ttccagaggg aaatttctta 600 gaattatccc acgatttgag ccaaagcact tccgatagaa tttttaacct ctagttggtt 660 ctgctccttc catttttact aatttttaag aaaatactat gacttataat tgtatctgga 720 atgattatca actccttttc atccactgac ttaaatttga ttataaatat gctttacata 780 aagatctaga ccttataatt tgaattcaag tgaattgttg tgactagcat gtaaattatt 840 attatggatt gtaaatctta acataggtag ttctgtgccc ttaaattgat aaaccagtta 900 tctcttgtaa tcatgtgtac taagatatac gtagtaaagt gattgtatca gtttttatca 960 taagcagtca tagttcagat agttcagaag tttagtgtct gctgtttcta ttaggaaagt 1020 g 1021 22 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
22 acttggtgac tttctctgca ccaggtgagc ccctagtcta cactgcactg cacccccccc 60 ccaccccggc gcacgcacac acacacacac acacacacac acacacacac acacaggcat 120 gcacaggccc tcctgtgaga gatagcccta aggagggaac cgtccctaga gccggtcccc 180 agccgctcgg cacttcccgc ccacgcgccc ggtcccacag tgcagcggac cctcactcac 240 cccgcggatg tcccagtacc ccagtgtcat ggacatgatg ctggttggtg tcgattctgc 300 agacaggcct cagctgggct gaactgcgac ctcctctggg gttcccggca cgcaggggct 360 ggacctagcg ccagacccgc cccctcggcc ccgctgcgcc cgccgatctt caaggtcgtc 420 acttccaacc ggccgatctt caaggtcgtc acttccaacc aacaggcgcg ggaggcacgg 480 agcaggttgc tggatcctca ctggctggaa ggagtaagat ccaccgccac ctccgagtgt 540 tcagggagca aggtccggaa ncactaggag gggctcggcc tcgccagctt ccgtagcccc 600 gccccgcccc gctccgcttc ggacctctgc tgggtcccca gggactcggc tgtgcgcgtg 660 agagtaaagc cagatcgtaa gagaaaagtt cttcccccgt ttcttcttct ccggacgtcg 720 cccagccttc tgcctctcgg ctgccgagtt cccacaggct ctgggagact gaggctgcca 780 gggtcagact aaagagaggt ctcagagagt ttaattcaac acttcttggc tactaagtct 840 tagaagtctg atggtgtgct ctctctgctg agttggggag cgtgaatgga ggctatgtca 900 ccgaagctga tagagctcag tctctgttgc agatgctccc gacccttttg cattgggcca 960 gttccccagc tctgagactg ggtccaggct caggaagtgg cctatgtgtc aaggtggatt 1020 c 1021 23 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
23 gaatggtcat ttttgatgtt ttgttgttgt tgctattttc gttgttgagg ataactataa 60 ttttttgtgc caaaaatgtg gcaaaccttt ctatggggaa aacgatagaa atggcactta 120 accctaaccc attggacata atctattatc tgtttttact aaaatccact gaacctgtag 180 aaatcttaga ttaatcagaa acacactctt ttcttgtgct tctcaataaa taattgaatt 240 gtttttgccc aggaattacc cctgagcaac taaaatgttt accttcctgc agttataaaa 300 atctcggtgg gggttgtttt tcagctcctt taactcgtcc atctcgttaa gcatctgatg 360 gacctggaac ttggaggaga ggaacttcag gcgccggtgg gtataggtct tactgtgaaa 420 aataaaatca cataattcca aaaagtttca ggcattcaag aaaaacagtc acaatttcaa 480 aactatcagg acctttatca ttcataggaa ataattgttg gaacaaacct tttagtttac 540 tctgcagtta atcccactga naagtagtgg gctccaaagg cttaatcttt tcaataatgt 600 tggacataag aatgagggag aacttggaaa ggtatcttaa aactcaatgg agagagtgtt 660 attcaaagtt tggggtcagc agattcgagt gtgaatcctg gctcagccag ctgtgtcact 720 ttaggcaagt tacttaagtc atcaaagtct cagctcataa aactggaatt atgaaaataa 780 ccacctcaca gtgaaaagtg taagcaataa aaggaacaat gtgcatgaag ggcttaatac 840 agtgtttgaa catagtaagc atttagtaaa tacttagtct cactatcagt agaagtagta 900 ctagttgttg tttaggtctt gtagtactag ttgttgttgt ttaggtctca ctaaacactt 960 acacaggtcc ttgagcaatt aaagcaagta aaaaattcat atcgtctaag aaggtgtcca 1020 g 1021 24 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
24 gaaagctgag aaagaggcac accaagacta agggaaagag gccgggaagg gtaaaaaggt 60 gaaatgaaaa gaggttggtg aatgactaag aacggttgga taggacaaat aagttccaat 120 gttcgatagc agacgagggt gactacagtt agcaatatat tgtatatttc aaagtagcta 180 gaagacttaa aatgttatca acacatagaa atgaaatata cctaaggtga tgtatccttc 240 aaatacccgg acttgatcat tacacattcc gggcatgtaa aaaacgcttc catgtacccc 300 atttcataaa tatgtaaaat attatgtatc attaaaagaa agaacaaaaa agacagggaa 360 aatgcatatg ctgtgctcca ctcagccaac aaacttctgc tctaagcagg gatattgatt 420 ccaaaggcta gcttgcgttt cttaaaaata attaaaaaca acaacatgtc atttatttca 480 gagctggagg ctagaaataa attactcaaa tctcgcaact atgtaaacta tgaaaatgaa 540 acaagctagt taccttttat ngttcagttt aaaaaagttc ttcttctttg ctcctccatt 600 gcggtcccct tcaagatcca ttccgacctg aagagaaacc gcagctcatt agccaaatgc 660 atgagcctca ggcgcgctgg aggtgagact aacctctagt cccccgtcga agccagagag 720 cagtaagagg gagcgcccgc cgttgatgcc ccagctgctc tggccgcgat gggcactgca 780 ggggctttcc tgtgcgcggg gtctccagca tctccacgaa ggcagagttg ggggtctggc 840 agcgcgttct ggactttgcc cgccgccagt gcgattctcc ctcccggttc cagtcgccgc 900 ggacgatgct tcctcccacc caccgcccgc gggctcagag agcaggtccc cgcaccgcgc 960 gggctgtgcg cgctccgggc aacatggtcc agtgccacta cggtttgggc gctgctccag 1020 g 1021 25 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
25 gtgagttttg aggcttggga gagagctgca aggaggaaga aggaagagaa ataggggaga 60 gacatgggga gagacagtca tgcctacttc ctcagcaggc cagaagcagc atgtgcaggt 120 ggggacccag actctgtact tggacttaaa gtgaaaggct ttccagatat tgtacttacc 180 cctaaggctg acaaaggtgg agcctcaagc ctatagcttt ggatcaagac aattgttcca 240 gttctcctat cccagaaatg ttcctctctc ctaaacctga agtggtcgaa cactttcatc 300 ccttcctcac aaggagggtc aggtgatcag gtaaaggtaa caactaaccc aaacaggaag 360 tgtggccaga tgcttgtata caggtaaggg tgtgatttgg ttgctaattt ctcttcactt 420 ctgggagacc agccccttat aaatcaaact ataggccaga gaggctgcca catgctccca 480 ggctgtttat ttgaagagag acttacatta ggcagtgact cgatgaaggc atgtatgttg 540 gcctcctttg ctgccctcac natctcttcc tgtgacacca cccggctgtt gtctccatag 600 gcaatgttct cagcaatgct gcagtcaaac aggatgggct cctgggacac gatgcccagg 660 tgtgctcgga gccactgaac attcagtcgc tttatttctt tgccatcaag cagctgaaaa 720 caagagttca cagatcaact tcaggaccag cacactttga atgtagcaca attaacatca 780 ttatttctta cactgaaact gccaagttac tgtgagatta aggaaaagtt tgtgtgatta 840 aaatttggat agtgaaggtt aacccaacaa ggtcataatt gtatgccttg aggaactgtc 900 atgtttcctg tgtttcaacc atggtttctg atgtatgcat gtggtaggca gaataatgtt 960 ccctctccca caagacatct gtgtcctaat ccctggatcc tgtgaatgtg ttatgttaca 1020 t 1021 26 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
26 gccttgcctt cccccaggca ggtttgagag gtctgggtct caactgactg gggcagcagg 60 acctcatccc ctccctgccc tacacccagc ctgccccagc cctgcagtct gttgttcctt 120 agtcagggag gagcccaaaa gtgtgaccaa accaagggaa cactcaactt ctggcttcct 180 ccctctttgg gtagccctca agccactgga ctttgaagtc agcaggtaat tctccaaatg 240 gaagaacttt tttttttttt tttaaaagca gagccaagga agccacattt tgagtgatgt 300 ggtttttgaa gaaaaaagaa aaagagatcc cagataaaaa tgatcttatg tgaagggagt 360 aaatggatgc acagaaacag cagcagctcc cgagccacct ggtggagcac aggggccctc 420 cctggcctcc cccaacactg gggctggggt ctgggggctg cccagcaggg tgatgtggct 480 cccttgggcc tgagagcacc ctggagggag ttgaccctgg ggggcaatgt tcccaggacg 540 cagtacctga tatccaagtc ngtcgctgtc tcccgctctg ggctgcagca ggggaggaaa 600 ggcatactga gctctcatgg gagtgaacca tatcctccag gaagatcctg agctccctcc 660 aacccaacat gagcatgcct ttacaatccc ctggacccag tctgtagcca caaatgctgc 720 atagagaggt gtggagagtg gggtgtgccc atcttgggga agcctctgct gcctgaccac 780 gtgggtgtgt gaggagggcc ctggaggacc cagttaagag ggagaatggg gagaggtgcc 840 attggtgcag gctctggggg gaaaacttgt cagatcagga gtatgaagcc cgcaatgtgg 900 ctcctccaga cccagcctct gcattcaggt tggaatgaat aggctgaggt ctgaggctga 960 tacagctgca caaacagctg gggcaaggag tgctctggac agagccaggc caggccaggc 1020 a 1021 27 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
27 tgtcaggcaa gattcaaatc aaaataatta attttaaatg acatgcatac tttttggaga 60 gaaaagtttg ggttacaatt agccaatctg ttaaaactca aagaaatcta atccaaacgt 120 aatacacatg tctgtaccat tttttttagc ctattctctc ttcagactta tacttaatca 180 caaataacat tcttctttct attaattaat tccaaaaact ggctcacagc catatatgac 240 agtcatttat tgctactagg gacataaaat ttctaaataa tcagaaatcc acgttgtcat 300 ttatgaatat tctctctcct tgcaaaccaa aaaaatcatc tttaacctta cctgatagat 360 tttggcatcc ctcattagtt tttctacagg atattctgta ttaaatccat tgcctccaag 420 tatctgcaca gcatcagtag ctaactgatt tgcaatatct ccagcaaatg cctttgcaat 480 agaagcataa taggtatttc gacgaccaga atcaacctcc caagctgctc tctggtaact 540 cattctagct agttcaactt ncattgccat ttcagccagc ataaatgata ttgcttggtg 600 ctagaattaa aaagaaaaaa attaaaggat atttattgag aaaacttaaa agttttttcc 660 tggggctttt tcatttttat agtgacgggg tcttgctatg ttgcccaggc tggtctgcaa 720 ctcctggcct caagcaatcc tcctacttag gcctctcaaa gtgctgagat tacaggcgtg 780 agccactgtg cctgaccttt ttatttttta aacttttcat taacgaattt taggtttata 840 gaagttacac ccagcttcct ctaatgttaa catattacca aaccatagtg ccatgatcga 900 gaacaggaca ttaacactgg tatagtatta acaactaaac tataagcctt actcaaatct 960 ggtcaagttt tctactaatg ttctttttcc accattatac gttgaattta gttatttctt 1020 c 1021 28 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
28 aggtagcggc cacagaagag ccaaaagctc ccgggttggc tggtaaggac accacctcca 60 gctttagccc tctggggcca gccagggtag ccgggaagca gtggtggccc gccctccagg 120 gagcagttgg gccccgcccg ggccagcccc aggagaagga gggcgagggg aggggaggga 180 aaggggagga gtgcctcgcc ccttcgcggc tgccggcgtg ccattggccg aaagttcccg 240 tacgtcacgg cgagggcagt tcccctaaag tcctgtgcac ataacgggca gaacgcactg 300 cgaagcggct tcttcagagc acgggctgga actggcaggc accgcgagcc cctagcaccc 360 gacaagctga gtgtgcagga cgagtcccca ccacacccac accacagccg ctgaatgagg 420 cttccaggcg tccgctcgcg gcccgcagag ccccgccgtg ggtccgcccg ctgaggcgcc 480 cccagccagt gcgctcacct gccagactgc gcgccatggg gcaacccggg aacggcagcg 540 ccttcttgct ggcacccaat ngaagccatg cgccggacca cgacgtcacg caggaaaggg 600 acgaggtgtg ggtggtgggc atgggcatcg tcatgtctct catcgtcctg gccatcgtgt 660 ttggcaatgt gctggtcatc acagccattg ccaagttcga gcgtctgcag acggtcacca 720 actacttcat cacttcactg gcctgtgctg atctggtcat gggcctggca gtggtgccct 780 ttggggccgc ccatattctt atgaaaatgt ggacttttgg caacttctgg tgcgagtttt 840 ggacttccat tgatgtgctg tgcgtcacgg ccagcattga gaccctgtgc gtgatcgcag 900 tggatcgcta ctttgccatt acttcacctt tcaagtacca gagcctgctg accaagaata 960 aggcccgggt gatcattctg atggtgtgga ttgtgtcagg ccttacctcc ttcttgccca 1020 t 1021 29 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
29 cagagccccg ccgtgggtcc gcccgctgag gcgcccccag ccagtgcgct cacctgccag 60 actgcgcgcc atggggcaac ccgggaacgg cagcgccttc ttgctggcac ccaatagaag 120 ccatgcgccg gaccacgacg tcacgcagga aagggacgag gtgtgggtgg tgggcatggg 180 catcgtcatg tctctcatcg tcctggccat cgtgtttggc aatgtgctgg tcatcacagc 240 cattgccaag ttcgagcgtc tgcagacggt caccaactac ttcatcactt cactggcctg 300 tgctgatctg gtcatgggcc tggcagtggt gccctttggg gccgcccata ttcttatgaa 360 aatgtggact tttggcaact tctggtgcga gttttggact tccattgatg tgctgtgcgt 420 cacggccagc attgagaccc tgtgcgtgat cgcagtggat cgctactttg ccattacttc 480 acctttcaag taccagagcc tgctgaccaa gaataaggcc cgggtgatca ttctgatggt 540 gtggattgtg tcaggcctta nctccttctt gcccattcag atgcactggt accgggccac 600 ccaccaggaa gccatcaact gctatgccaa tgagacctgc tgtgacttct tcacgaacca 660 agcctatgcc attgcctctt ccatcgtgtc cttctacgtt cccctggtga tcatggtctt 720 cgtctactcc agggtctttc aggaggccaa aaggcagctc cagaagattg acaaatctga 780 gggccgcttc catgtccaga accttagcca ggtggagcag gatgggcgga cggggcatgg 840 actccgcaga tcttccaagt tctgcttgaa ggagcacaaa gccctcaaga cgttaggcat 900 catcatgggc actttcaccc tctgctggct gcccttcttc atcgttaaca ttgtgcatgt 960 gatccaggat aacctcatcc gtaaggaagt ttacatcctc ctaaattgga taggctatgt 1020 c 1021 30 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
30 ccactccgga gcacctggct ctgccctcag gaactccctg agctttgcac acagggccga 60 gacacctgga tttctctggt tccctgagtg gggccagctt ggaagaattt cccaaagcct 120 attagagcaa cggctgcctc ctgcctgcct ccttgggctg ggcagggctg agggcggagg 180 gagagagaga gagagggagg gggagaggag gaaggaaaaa gttggcaggc cgacagcaca 240 gccgtgtctg catccatcca gaggaggtct gtgtggtgtg gggcgggcca ggagcgaaga 300 gaggccttcc tccctttgtg ctccccccgc cccccggccc tataaatagg cccagcccag 360 gctgtggctc agctctcaga gggaattgag cacccggcag cggtctcagg ccaagccccc 420 tgccagcatg gccagcgagt tcaagaagaa gctcttctgg agggcagtgg tggccgagtt 480 cctggccacg accctctttg tcttcatcag catcggttct gccctgggct tcaaataccc 540 ggtggggaac aaccagacgg nggtccagga caacgtgaag gtgtcgctgg ccttcgggct 600 gagcatcgcc acgctggcgc agagtgtggg ccacatcagc ggcgcccacc tcaacccggc 660 tgtcacactg gggctgctgc tcagctgcca gatcagcatc ttccgtgccc tcatgtacat 720 catcgcccag tgcgtggggg ccatcgtcgc caccgccatc ctctcaggca tcacctcctc 780 cctgactggg aactcgcttg gccgcaatga cgtgagtggg gtgtccctgg gcttgggggg 840 gttctagaat gatgctgaaa ggcactggtt ccatcctctg cccattgtgc agatggggac 900 actgaggaac ggagaggaca agaggttgct ggaggtcacg tagagagctg gggggaagag 960 ctggggctgg aactcagcta tgcatgcctc ccaaagcctg ttttctgcca ggcactgtgg 1020 g 1021 31 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or c. 
31 ctcctcacca gtcctcacca cctctctccc ctgcagctgg ctgatggtgt gaactcgggc 60 cagggcctgg gcatcgagat catcgggacc ctccagctgg tgctatgcgt gctggctact 120 accgaccgga ggcgccgtga ccttggtggc tcagcccccc ttgccatcgg cctctctgta 180 gcccttggac acctcctggc tgtgagtcag gggccctccc agatggaggt gggggaaggg 240 agggcggggg ctggtggggt gccctgccat gggcagccag tgggactccc gacagggctc 300 ttgccattgg gtggaggatg gcgggtcagc gctgggggct gggggcaggg tcctgccctg 360 gagaggagca cagggacctc ctgcccagct tggggtcagc actcctcttt ccctgggtct 420 cattgtcccc caccctgatt gttctctttc tccctccaac ctctccctcc tctcactctc 480 tcttcaccta tgactctctg ccttcgcccc tccctctgtt tctttccctc acagattgac 540 tacactggct gtgggattaa ncctgctcgg tcctttggct ccgcggtgat cacacacaac 600 ttcagcaacc actgggtagg agacccacgg ggggtggggt gggaagcttt ggtgtcccat 660 ggtaagcctg accccaccct cacagtgtcc cttcctgttc tggaggctct gggagacagc 720 cagaggacag gaaatcagga aactgaggcc tgccatgtag aggcaggctg ggggtcacac 780 tgccagcact ttcaggccta gtctctgccc tcccagctcg gccctgcccc atgctgcctg 840 gcctccaggt cttcccagct gcgtggttaa aagtggggct ccaaatcctg gctcagccac 900 tttcgggttt agcatgacct tgcgcagtgt gcttgagctt tggtttcctg agctgcggag 960 ggggatatgg tggtgcccac ctctcagggt ggccgagaag aggaaagggc tcactcccca 1020 t 1021 32 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
32 tttgactccc tgtaccttta agagggaccc ttaaatttaa aaatctattg tatttttttt 60 ttagtagggg tagggaatat ttagggaatt tggaaggggt tatatagttc tttaagaatc 120 aaatagcaca tcttcctgaa aatagcacgt agacaaagtt tttttggaga taaccttagg 180 aatatcgtaa ctctctgatg ccacctccat atgtgatcct atgttgatta taagattttg 240 atcagtggct ttcagacttt tttgactgca acctagaata aaagattcat ttacattgtg 300 acctagaaca cacacacaca cacacactct ctctccgcca ctctcctgca cacagaaatc 360 attgatgctt acaacaattc ttactcttac tatgggtgat ttactttgat atgctctgtt 420 ttttttttca tttacaaaac tgtggattaa ttttttttga catgctaaat tgatctcagt 480 aatagattgt atttattctt ccttagattc ttctttggag cagaataaaa gatctggccc 540 atcagttcac acaggtccag ngggacatgt tcaccctgga ggacacgctg ctaggctacc 600 ttgctgatga cctcacatgg tgtggtgaat tcaacacttc cagtgaggct ctgggccctg 660 tgggattgcc cagggatgtg gagggtgaac agagtgactt ctgctggagg ccctgaatga 720 ttagtgtgga ggacagagcc acaggcaccc atcctgatgc catctatact tatattagtc 780 catttgtgtt gctattaagg aatacctgag gctgcgtaat ttataaagaa aagaggttta 840 tttgactcac agttacgcag gctgtacaag aagtagggta ccagcatcca cttcgggtga 900 aggcctgagg ctgtttccac tcatggagaa ggggaagggg agctggcatt tacagagatc 960 acatggtgag ggaggaaagc aaggagaggt caggggaggt gccaggctgt ttgtaatgac 1020 c 1021 33 1021 DNA Homo sapiens misc_feature (562)..(562) n can be a or g. 
33 tcaaattatc atcgcttttt tatttcagga ttacaccaaa gactgtttcc aacttgactg 60 aggtaggtag tcttggatag actgggggaa ataagtcctg tgggacctcc tgccttaaag 120 aaagcaggcg gagggcccta aaggaaatca ggcaaccaga ccaaaagaat gtggaccagg 180 tggtccatgc tgtgtctctt gtgacccttc ttctccctgc catgtctttt gggagagccc 240 ttgtgttgca aaaatgagag tgtggtggta tggattgggg tttaggcaga acagtactgg 300 ccaagcagcg cctccctgga cctcaatttt ccctctgtgg aatgggctag caatcctggg 360 cctccccagg gcgaaggaaa gaccactcag gaagggcacc gtctggggca ggaaaacgga 420 gtgggttgga tgtatttttt tcacggatgg gcatgaggat gaatgcttgt ccaggccgtg 480 cagcatctgc cttgtgggtc acttctgtgc tccagggagg actcaccatg ggcatttgat 540 tggcagagca gctccgagtc cntccagagc ttcctgcagt caatgatcac cgctgtgggc 600 atccctgagg tcatgtctcg taagtgtggg ctggagggga aactgggtgc cgaggctgac 660 agagcttccc atttcacctt gtgggccctt cccaggcaga gcttcaggtg cccctcttcc 720 cagtcattga tacttagcgg tcctggcccc ctttcctctc cctgctggtg gtattgcacg 780 ccaatgactc ggccagatgc ccagacccct gttcttggtt tacctgcaga atattatctt 840 tgccaccccg cgggatggct caacccactt tcaggatgca ggtctcctaa tagcaacctg 900 atatagcaga aagacccctg ggctgggagt ctgagaccta gttctagccc agccctgaac 960 ctcagtttcc ctttctgtga aacaagaatg ttgaacttga tgattcccaa ttttcctttt 1020 g 1021 34 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
34 gggccaaggc cacaaagtct caggacaagg cagactgcag acccagggga cgtgcgcgga 60 ccggggcttg tttcggtcct gggtgttctc agccttgatg tggacactag cggctctggt 120 gcacttgctc ggaggaagca gccacgtgtg ggtgtcctgg cctcagccgg cagtaaccag 180 cagacacaca gcacggaacc ctccacccta ccaggaagcc caggcaagac cccccagcag 240 tgcatgctga ccccagaccc tggcgacgga tcggagctcc tcggatttgg agtggatcct 300 tacaaatcct gcacactaga cagcagacac aggccctgcc agagccaggg acccgaattt 360 ttgtttggaa aaacactgag gtaagtgggg ggtggctcct gtccaggcag cccggccggt 420 gggacagtgg ggagggtcgg ctccaagccc tcctgagccc tagagggggt gcgggacggg 480 gactcacagg agatgcagga cggcccgaac atagtaattc ctggtaaagg gcccgaacag 540 cttcaccacg gcggtcatgt ncttctgtcc cctgggggag ggaggaaggc gagacggcgc 600 ggctgggcct ctcccactcg ggactccttt gctgccctgc tgaccacccc agggcaccca 660 ggcctctttc ctcccacaaa acacaccggg caggcaccgg ccttggttta cccacaagca 720 ccaaagggtt ggttccggag cctccaagtg agaaaccaag ctccacccaa ccctgtgagc 780 cctgcctggg ccccgcagcc cccggagaga ccccagagca ggaggagact caccagcgct 840 ccatggtgga gcccttcttc ctcttccccc gggggtactc cagcaggcac acaaacacgc 900 ccgccacact gaagccatgt ggttaaggaa cagcccagct cagcctgagg ggccacaggg 960 aactcccttt actgaagaca acacagagag gggcccgagc acggtggctc atgcctggaa 1020 t 1021 35 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or t. 
35 taatacatga aagaagaagc tagtcaatgt ggagctctat tgtgtcccgg gatcaacaaa 60 gacaagatat ctttaaaatc gtcttctaaa tttaccctaa tgtaaaacaa atccaataaa 120 actctaatgt aattttttaa gaatttaaat ttggaataat tccaaagaac aatttttctt 180 aattttctac agccagaata tataccttta aaaaaaatga aaacagagat taactttctc 240 agaattggtt gactcactct ttccttttat ttttcttcca tggaattttc cagttaactt 300 gagaaagtgg aatcgaattc cgatgttgaa ttttccttct ggccccattc atgtggcagg 360 tggtgattca ggtactactg ggggctgctc agacaaacct cctcatcaga catcaagagg 420 ctgttgcacc aggagggccg gtaccgtgtc tagaggtggt cggcatgggg ttggagttgt 480 attacataaa ccctactcca aacaaatgca tggggatgtg gctggagttc cccgttgtct 540 aaccagtgcc aaagggcagg ncggtacctc accccacgtt cttaactatg ggttggcaac 600 atgttcctgg atgtgtttgc tggcacagtg acaggtgcta gcaaccaggg tgttgacaca 660 gtccaactcc atcctcacca ggtcactggc tggaacccct gggggccacc attgcgggaa 720 tcagcctttg aaacgatggc caacagcagc taataataaa ccagtaattt gggatagacg 780 agtagcaaga gggcattggt tggtgggtca ccctccttct cagaacacat tataaaaacc 840 ttccgtttcc acaggattgt ctcccgggct ggcagcaggg ccccagcggc accatgtctg 900 ccctcggagt caccgtggcc ctgctggtgt gggcggcctt cctcctgctg gtgtccatgt 960 ggaggcaggt gcacagcagc tggaatctgc ccccaggccc tttcccgctt cccatcatcg 1020 g 1021 36 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
36 aaaaagagga attaaattgt gtagatgcct ttaaagaaca tttttctagc atctttctac 60 atctttccct aagtggcctc ttgagcccag tcggattttg gttatatgcc atgatagtaa 120 tcataagaat cagttaaaaa tgatccaaaa atgcacgaat acagtcgatt ccctctcatt 180 tattccttgt ggaaaaagaa aaacacaaat cttaaaaact aaagcaagtc agggaagcct 240 ggaaagatac ccagatttga taacatgtta gaaggaaatc caggctaagg aatctcattt 300 tctagctttg atctggttgt cagttgggat ggacttgccc aagtgatggc ccacagaaag 360 gccaaatttc ttgtttttct cctcatcctg tacctctttt ttcattaaga atcctgcctg 420 gaagtttagg tcaaagaggc tgcttggagc aaaatacagt ggtgtctcat cccaaatatt 480 ctccaggcgt ttcttccatc cttccaggat ttgaattcgg gcgtctgctg gagtgtgccc 540 aatgctatat gtcagttgag nttctaagac ttggaagcca cagaaatgca gaatgccact 600 ctgaggatac agaaagcaca gagaggtaag tcaaccaatt ccatgcagtt gtactataaa 660 caacagaagt tggtctgggc ttctcagtaa gacactctga taaggaggcc tcaggcacac 720 tagagaatca gttcagagct agcgtctctc tcttaccctc tacctagccg ttaccaattt 780 tagccttctc aggtgtgttc ttctttaaat gcataaacct tgaaactgtg ccaacctgga 840 tcctttgcca agaaggctgg aagttctgtt actttaggga gtctcagttt cttggcaggt 900 gactcaccaa gacctgcgtg ggtgcatttc tctgcctctc catataacta gatgagtcct 960 ttttttcttt ttcttttttt tttttttttt gaggcagagt ctcgctcggt cgtccaggct 1020 g 1021 37 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
37 ctcatgtagg aaccagcagg ctactagaaa ttaaagttta aatctgggag aaggttagcg 60 ttaagtgtgt gtatacgaga gtcaagccag agaggagggc agtaatgctg tggggttgca 120 tgaaattcac caaaggagag catgcaaagt gaagagggag gcaaatgaag atggagcccc 180 aaggagcact tacatttaaa aatatgggca gaggaagagg aatcgtcaaa ggagactaaa 240 aagtagccaa ggcagggagc atttcaagaa ggagagaaag atccactttg ccatatgctg 300 cagaaagagt ccaacaggtt gagaaatgac agtactcgtg attccaaagg taatgaaaaa 360 aatccccaga attctatgca tgaattaatt acgtgattaa acatacaaat gtactgttct 420 ccaagaaaac tgagctgttt ccatattcag cattgaatac caagatatta ttttcttgtt 480 tgtagagata ttcatgatct aaagagagaa aacacccaga tcaaaatttc aagttgttat 540 taaacatctt cataagctga naattacaga atacagttta agctcacaaa taccaaatag 600 gcatttctaa gttgagaaaa catgaatgat attatactaa cattcattca ttttttcatc 660 attattgtca aggtttcaat tcacatttaa ttttttatta tacatgtcaa agaaatactt 720 gggttccttt cagtctttct ccctttgcac ttcaagtaga aaaagaaaaa aaaaactctc 780 tatagaattt ttaaaaacaa ggattacctc ttctcagtgc cataaaagcc cacatctcga 840 cttaactaga atgaatgtaa gcataaaatc tgccctaccc caaaaaattc ttacctgaaa 900 tccatcttaa ggagtataac ttcagtctat aagtattttt taagtaatca gttagagtgt 960 aagttttgcg actgtcagct gtagcatcat ctgctggttg aaagaaagag ccaaatgttc 1020 a 1021 38 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or g. 
38 agcgttcaga gaaggagcgc aggcagaagt caccgcgggc ggcggagacg cgcgtcctgc 60 accgctgctc cgggcggtgg agtcactcgc cgctggcaag tttcggcccc gagttaaaca 120 ttagtgagcg ccgagcccgc tgggtataaa ggcgccgcgg gcaggctgca gggcaggcgg 180 cgcgggagca ggcgcgcgtg gcgcggggca ctggcatccc ggccgggggg agcccgcgag 240 ggccccctga gggcggtgta gggcgctggg cggcagccgg ggcgcagagt gcggggcccg 300 gaggagccgt gggggagggg aaagggcgcg cggcctcgga tgcgcagacc ctgggccggc 360 gactcgggga ccctgctccc tcttagctaa aaatgacgtc ggcgttcagc tcctccaacc 420 tcacgtggac aggcgaggga accgagaccc agagagggca ggggactttg gcaaactcac 480 acagcccacc gcaggcaact ggaactgaaa cccaggactc cgtctcttgc cagtgaaagt 540 tatgttagga agcagtgagg ngtctaaagc agtatgaaag gcaaagagaa aaggtgattg 600 ttccctcttg aatggccctt ggaagctgag tatctggatt caccctccct agggaatttc 660 ccgattgtct tgcaggctta cacactcatc aagatgacaa aaataatgac agtaacactt 720 atgtggaact tgactttttc ccaggtgctg ctctaagcat ttactgtgtt tgttttacag 780 gaaggaagac tgtacacaga gaataaataa cttggccaag ccattcagct aggaagttgt 840 agatcctaaa ttaagagttc aaggtcttaa tggctactct atgcggcctc tcatagtctt 900 ttcaagggtt ttggagaaga ataaaagatc aggtatggct tctccctccc ccagctctct 960 attgttccct aaaggattat tcattcgttc attcattcct acatcctccc atttattcca 1020 g 1021 39 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
39 ttctgatcag ttttctatgt taaataaata tacatctacc ttgtcagttt agatgactgt 60 actggactcc agtatactgt caaactatac ttgattaatc ctgtattgct ggatacgtgg 120 ggctttctcc ctaccctcca gattttaaat tattgaacaa gtatttatgg aggcctgctg 180 tgagccagga gctgtcctga gccctggaaa cccagcagtg gctgtacaga cctggcccag 240 ctgtcagggg gcacctctaa ggaaaccggg aggcaataat cgtagctccc ttgcagggag 300 gttgtgaagg ctgagtgagg acatctgtgc acctggagca cagtgtgagt gtgaaaccag 360 tgtcagccct tattactgtc aataccatga aggggcggcg ggggcactaa gggtggcagg 420 actcaatatc taggctctgg ggggtgccag agcctgaccg tgcagggtct tctctctccc 480 tccaccctga ctgtgctctg tccccccagg gctggacatc cacttcatcc acgtgaagcc 540 cccccagctg cccgcaggcc ntaccccgaa gcccttgctg atggtgcacg gctggcccgg 600 ctctttctac gagttttata agatcatccc actcctgact gaccccaaga accatggcct 660 gagcgatgag cacgtttttg aagtcatctg cccttccatc cctggctatg gcttctcaga 720 ggcatcctcc aagaagggta cggggctgct agaggttcca taactgcccc gtcctcgcca 780 agggtgggcc cggtgttccc accaggctct ccttccggcg gggtgagcag ggagttggcc 840 cgaggaagct gggaaaggag gggcctgaga ggccggcccc agacacaccg ccctccgggg 900 ctggagatgc cacccctata tttgggctcc aggattcctt cttgcctctg tgagcttttc 960 tgacctccac ctgggggtag gcgggcctga gaaatttcat agaacaccag agggcccaag 1020 g 1021 40 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
40 tttctggatc acgttttcat atattctggt tcagtacatc tatctttgag ttatctttaa 60 tatactgaac cagaacatac aggaatgtga tccagaacat cattggccat cagattttct 120 agtatatgtg atgtgcacct cttataaatt ataattgaat tcactgccat atctccaagg 180 ggtgtcactc ttgtactcca gaagatactg gttatgcaca agaaatcatg cagggacaaa 240 tagacagata ccatttagtg ttttgattta ttctgaggga attttaaatt tgtaatatgt 300 atcttaatca ttaaatattt ttcttaaccc acttttcttt tttcatactg tatctgccaa 360 aaccatttgc tagcatagaa aagagggatt tctttctgta tttctcttag acatttgtat 420 ccagtgtaaa taaacatcct gattttgcaa ctactggcca gtgggatgtt accactgaaa 480 gggatggtaa aaaagaatcg gctgtctttg atgctgtaat ggtttgttcc ggacatcatg 540 tgtatcccaa cctaccaaaa nagtcctttc caggtaaggc caaaatttaa gctgctagcc 600 acataactga caaaaatgaa tatcttgata atgtcttctt ttttctaaaa gtataagcag 660 gttaaattaa aatatacttc tgttatatct aatatgcttg gtgtgttaaa atagcacatt 720 attgtgactg catctattca caaggtcgct tctgttaaag tctttgttta aatatatgac 780 tcaaactgcc atgtatttct cacttttcac tcaggactaa accactttaa aggcaaatgc 840 ttccacagca gggactataa agaaccaggt gtattcaatg gaaagcgtgt cctggtggtt 900 ggcctgggga attcgggctg tgatattgcc acagaactca gccgcacagc agaacaggta 960 ctactccccg ggtactcggg tgactctcgt tactgacaga agagttatta tcgtttgaaa 1020 g 1021 41 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
41 tcaaagaaaa tccaacatta aaatgtatgc cttacgatag gcttgtgttc ttatttgctg 60 ccttctctct ctatgctgtg cagctaggct gtaattttaa atgcatgtct tggattttat 120 tctacaagaa aaggaatgca tctgtttcca ttccttaccc ttggctgggg gataatttta 180 atgttgggtt tgaaccccac gaaagaatgt tatatttgct ctatcttttg gtagaaatta 240 gattggtaac ctcgtaggtc cacaaaagta aactttcact ttaagggaaa atgagtaagc 300 aagtaaatat tgctaggact accactggga aaataattta aaggctatgt cacactggag 360 gttgggtaag tggtttagag gggtgcgggt taagacattc gggggcataa tactaaggag 420 agcatcccca accctaaaca tcttcaaaat gatcagggct tatgggcact atttgacgag 480 cataagaact taataatgtc aagagaaatt ttagacctat ttaatacatt tataagcaag 540 ttttgagcca ggcttagact nttacctgtt cctcttggta ttcatcaacc actgcacaaa 600 atcttgggca cgcctggagt ccagatactt gctgtagtca ctggtgaatg tgccctgtga 660 atggcgcttg tcctcgttca tctgatcagg atcactgagt gggtctgcct gggaagctga 720 gaatgatctg tgaagaacag tgattggtac aacataaatc tctcctcaag agtagactca 780 cttgagaagc atcttcacta caaaatacaa gaccatataa aacagtaagg caggcatcta 840 gagtatttca ataggtagtt tagaaagatc ttccttagct tgtcatgaga atcccttcgt 900 tttagtatag ttgcatacgc tattattctg aattctagaa acatgtttct caactgactt 960 ctttttttct gaaataggat taaacaaatc tttttctact aattaatcta ctcatgatta 1020 t 1021 42 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
42 tctgcctgtc cgtctgcctg tctgtctgcc tgtccatctg tccatctgcc tatccatctg 60 cctgcctgtc tgtcggcctg cctgcctgcc tgtctgtctg ctgcctgtct gtccgtctgc 120 ctgtctgcct gtccgtctgc ctgcctgtcc gtctgcctgt ccgtctgcct gcctgcctgt 180 ctgtctgcct gcctgtctgc ctgcctgtcc gtctgcctgt ccgtctgcct gcctgtctgc 240 ctgcctgtct gcctgtctgc ccgtctgcct gtctgtctgc ctgtccgtct gcctgtctgt 300 ccgtctgtcc atctgcctat ccatctgcct gcctatctgt ctgtccgtct gcctgcctgt 360 ctgtctgcct gtctgcctgt ctgtctgcct gtctgtccat ctgcctatcc atctacctgc 420 ctgcctgtct gcctgtctgt ctgcctgtct gtctgcctgc ctgtctgtct gtctgtctgg 480 ttgcttgtgc atgtgtcccc cagccacagg tcccctccgc tcaggtgatg gacttcctgt 540 ttgagaagtg gaagctctac ngtgaccagt gtcaccacaa cctgagcctg ctgccccctc 600 ccacgggtga gccccccacc cagagccttt cagcctgtgc ctggcctcag cacttcctga 660 gttctcttca tgggaaggtt cctgggtgct tatgcagcct ttgaggaccc cgccaagggg 720 ccctgtcatt cctcaggccc ccaccaccgt gggcaggtga ggtaacgagg taactgagcc 780 acagagctgg ggacttgcct caggccgcag agccaggaaa taacagaacg gtggcattgc 840 cccagaaccg gctgctgctg ctgcccccag gcccagatgg gtaataccac ctacagcccc 900 gtggagtttt cagtgggcag acagtgccag ggcgtggaag ctgggaccca ggggcctggg 960 agggctcggg tggagagtgt atatcatggc ctggacactt ggggtgcagg gagaggatag 1020 g 1021 43 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
43 ctgctttcca aatcagcttg gagagacagg ctgactcctt tccctcttcc tcaggcatcc 60 tctctggcca cgataacagg gtgagctgcc tgggagtcac agctgacggg atggctgtgg 120 ccacaggttc ctgggacagc ttcctcaaaa tctggaactg aggaggctgg agaaagggaa 180 gtggaaggca gtgaacacac tcagcagccc cctgcccgac cccatctcat tcaggtgttc 240 tcttctatat tccgggtgcc attcccacta agctttctcc tttgagggca gtggggagca 300 tgggactgtg cctttgggag gcagcatcag ggacacaggg gcaaagaact gccccatctc 360 ctcccatggc cttccctccc cacagtcctc acagcctctc ccttaatgag caaggacaac 420 ctgcccctcc ccagcccttt gcaggcccag cagacttgag tctgaggccc caggccctag 480 gattcctccc ccagagccac tacctttgtc caggcctggg tggtataggg cgtttggccc 540 tgtgactatg gctctggcac nactagggtc ctggccctct tcttattcat gctttctcct 600 ttttctacct ttttttctct cctaagacac ctgcaataaa gtgtagcacc ctggtacatc 660 tgtgatgttt gccttctact ctcttctgtt ccaaaaagac ccaggtccca tttaagggca 720 gtaatgtgtt acaggtgctg tgataaaggc tgggtactgg atagcttgtg ggcttatggg 780 aggaggcctg agatgggtca gggggagaag gtattcagca ggtggctggg ggactgtgtg 840 cagcagttcg ctatggcctg cctgtggtgc ccatgtgttt gtacgggagg gttagcttga 900 gaaggaatca gattataaaa ggtcttgaat gtcaagccag agagtccaga ctttttccta 960 agggcaatga gaagccattg aggagttctg agcagagtag taacatgatc agttatgctt 1020 c 1021 44 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
44 accattgttc ctgttttgca aagaaggcaa caggctcaga gaaggccagt gcctcgcccc 60 aagacatgct agctctgact aggatgccat gaccacgctg tcccctgccc actacactca 120 cccggtgtgt agccccaagg ctcatagtag gaggggaaga ctccaaggtg acagccacgg 180 acaaactcct catagtccac agggagcagg gggcttgtgg aggagaggaa ctccgggtgg 240 aaaatcacct ggtagtgaaa aagaaggact cagcccaagt gccttattta gctaagccct 300 gagatcccaa ggtggcccag agagggtaaa aagcttgtct agcatcacac agcatgtgtt 360 tggcaggacc aatgttcaaa cccaggtctg cctgcctcag aagccagggt tctttctaac 420 cacagcaata cctttgataa aacttatagg ggaatggagt gtgtgaggcc caggacccaa 480 ccccttccct ctgccgtgcc caacccagcc ctgaccaaat gccctcacct tcaccctgtc 540 ggcactgcta ttgaagaggc ngattcggcg gatggtggtc aggatggggt ctgaggagtc 600 atccagcata ttgtgggtgc acacaggggg gaaagactgc cgctgcagga gccacaagaa 660 gggtaagggg tcatggaagg gacagagaac tccctacttc ctcatgagcc atgcggaccc 720 tgggggagcc aaggagacca caaatgcacc ggacgtgggg caacaaaccc aagtgatcac 780 caggagttgt ggattcccac tagtacaacc tgtaaaggtt ttctttcttt tcttttaaat 840 tattattatt tatttttgag gcggagtctc gctctgtcgc ccaggctgga atgcagtggc 900 acaatctcgg ctcactgcaa gctccacctc ccaggatcat gccattctcc tgcctcagcc 960 tcccgagtag ctggaactac aggcgcctac caccacgccc ggttaatttt ttgtattttt 1020 a 1021 45 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
45 caaactcaca gttggatggc acaacaatta catcctgtgt ggtcagcagt gatggagggg 60 ccgcagagat ttggaaacag gagggacaca ggatacagat ataggagggt agaaggcaga 120 cttcctggag gaggtgaaac ctaacctgag tccctaaggt gataggaagc aggaaccagg 180 gaaggggagc ctattctaac acagtagaag cagcaactgc tgaggtctgg atgaggggac 240 ctcaactgtg gcccaaaacc ccaagttccc attgtggctc tgccaacaac tggctgtgcg 300 acccaggaca agtcctatct ttgcactgtg tctgggtttc cccgtgtgta agatgaggcg 360 gttgctaggt gcttattgga tgcattcctc aagtcccgcc ctccatctcc tattcccctc 420 tcttctggtt tagtgcttta ggaaatgtgg cagaaatctt tttctgcctg tgtctaggaa 480 atcataattc atgctggcgt accctggttg ttgaggtccc tgaatccttg tgcccacact 540 gctgaagact ccttgtgtga nacaagtcag gggacatctg ggtcttgact ccccagatgc 600 tccagctgga ccctgctgcc ctcccttgcc caccctcttc cattgtagat gccaaggggc 660 tgagcgatcc agggaagatc aagcggctgc gttcccaggt gcaggtgagc ttggaggact 720 acatcaacga ccgccagtat gactcgcgtg gccgctttgg agagctgctg ctgctgctgc 780 ccaccttgca gagcatcacc tggcagatga tcgagcagat ccagttcatc aagctcttcg 840 gcatggccaa gattgacaac ctgttgcagg agatgctgct gggaggtccg tgccaagccc 900 aggaggggcg gggttggagt ggggactccc caggagacag gcctcacaca gtgagctcac 960 ccctcagctc cttggcttcc ccactgtgcc gctttgggca agttgcttaa cctgtctgtg 1020 c 1021 46 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
46 tcatttttac acaggatgta cgcgttttga agcacaaaac tctccagtga tcacaggtca 60 tagactgtct gatttttatg tgaaatccca ttttaagagt aaaatataag taacatagta 120 ggctctagtc tataaacaaa gacttctatt tatagtttgt ttgccccctg agccccatct 180 catctgctgg tggcatgcac atgctcttta ttaccagtgc gaatatagct gggaaactaa 240 tgccactcac catacaggat ggttaacatg gacacgggca tgacaaggaa acccagcagc 300 atatcagcta tggcaagtga catcaggaaa tagttggtgg cattctgcag ctttttctct 360 agggacactg ccatgatgac gagtatgttt ccagcaatag ttagaataat cactacggct 420 gtcagtaaag cagaccagtt tttttcctgg agatgaagta aggagagaca cgacggtgag 480 aggcaccctt cacaggaaag gttggttcga ttttcagagt cgactgtcca gttaaatgca 540 tcagaagtgt tagcttctcc ngagttaaag tcattactgt agagcctggt gtcatcattt 600 aattgcatta gggagttcgt agttgagctc aaagaagtat tttcttcaca aagaatatcc 660 atgtctaagc cagaacttgt agcagatgag gtgtagaagg actaacaggt tatagtttct 720 gctcaccatt caccttgatg tacccacact ctgtaacact gaggctggtg tacatgctgt 780 tctcccgggg ctggattttt gtcttccatt attacaatga tagttaaaga actgaactgt 840 ggtggctgta agttttcttc attcacaatt ttaggagagt ccactgtttg gttttattat 900 tttctcacca aaccgaggac aaaaaagcag aatgaacttt tagcatagag gttgcagggt 960 tttttttgag cgctcgggaa gataaatgtc ctggacaaag aagaaaagtt ttataactac 1020 t 1021 47 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
47 gaagattgtg gaaaatgatg gaagattccg gaaagtggtg gaagattcca gaaaatgatg 60 gaagattcca gaaagtgatg aaagattctg gaaagcaatg aaacattcca gaaagtgatg 120 agacagtgat agagtctggt tccaggcgaa gtgggagagg atgggatttg agaagggaat 180 gatccctcct cacacctcta ggatgggaag cttagtggag tgaggggtgg gtaggaggtt 240 acaccctgtg tcctctgtcg ctctgtgcag gaggaggagg cagagaaagg gaagggtcag 300 gaaagccagc ccatgtccca cccccactgg actcaccacg tgatggcagg tgaagccctt 360 catgaccgag gcctcattga ggaactcaat ccgctctcgg agactggctg actcgttgac 420 cgtcttcacc gccacgcggg tctctgcctc acccttgatg atgtccctgg cattgccctc 480 atacaccatg ccgaaggagc cctgccccag ctctcgaagg agggtgatct tctctcgaga 540 cacctcccac tcgtccggca ngtacacaga gcatggaaac actacttctt acttatctac 600 acagcatcct tggaggatcc cttgggggtc tgcagccacc ttccacccaa gccctcaccc 660 aaaccccctc gaaaacactc atgaaatgag ttctgtgatc caggacccat gccgggcact 720 gggcatatgg ccgagaacag gacaggcatc tgcacccatg gagagggcat ggcagagact 780 caaggaagga gccacaactg gtccaagatc ctggccaata tgtcctgagg caaacctgca 840 tccccatcct tcttgtctga tttcagaccc ttgctatgga atgatgctac ttcccacctg 900 agactactgt ttctgcaaag tgccaagggg atggaagaca ggttgtaata ggttggggaa 960 aaaaaaagcc aggatacttg gagctcttcc catgaaaagg tggagtctat ctcaccaccc 1020 c 1021 48 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
48 tgtatttttg tagagatggg gtttcaccat gttggccagg ctggtctcaa actcctggcc 60 tcaagtgatc tgcctgcctc ggcctcccaa agtgcttgga ttacaggtgt gagccactgt 120 acccagcaat ttataaggtt ttaagactca aataactcct tctaaagtga aatgagtctc 180 ctgttgtggt gggaggcaga catcattcaa cttagaggac acagctggaa agcaatgtga 240 gaaactaaga aaagtaacaa gctggtagat tggcatttct gacccatctt cctgcgaagt 300 caggtatcaa ggctttaagt actaatagca cagtacctga tgagagaagc actggaatca 360 aaatttcagc agaggaagga ggtaccaagt gcaactctga aggggcatgc tgaagtgtgc 420 aggggcatgc ccaagagtca agggccttac ctcatcacca tatcgccgat aactcacttc 480 atacagcacg atcagaccat tgggctcctt cggctcctgc cacatcaagt ggacgacgtt 540 gttctcaaag atttcatgcg ncacagggcc aacaatgtca tcagccttgg ctgtaaggag 600 aggaagtgag aggcagggat gtaactcttg gatgagatcc cacttctgcc acctgtccat 660 ggtgcaacct tgggctggtg acgtcatttt cccacaaccc attttcctcg tcagagaacg 720 gacatctaaa actcatccca caagattgtt aggaagatta aatgggttac tttctgcgta 780 taactttttt ttttttttga gacagagtct tgctctgtca cccaggcggg agtgcagtgg 840 tgtattttct aaagtttaca taatgattgc ctatgactca taattttaaa atatgacctg 900 gcatggtggc tcatgcctgt aatcccagca ctttgggagc tcaaggttgg cggaccactt 960 gagctcaggc attggagacc agcctgggca acatggtgga accatctcta ctgaaaatac 1020 a 1021 49 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
49 gactgaggtt cacccgggtg aaggcgctca tgcccccagg tccttgtggg ccccccagca 60 gggacgagtg ggcagccagc tctgctgccc cttgaggccc agtcggggaa gcagaggctg 120 ctgaggatga ggaggcagca gccatggtgg ccctgggcag gctcacctcc tctgcagcaa 180 tgcctgttcg catgtcagca tagcttacag gggcagctgg cgaggtgtcc acgtagctct 240 gacggggaca actcatctgc atggtcatgt agtcaccccg gctgctgggc actgcccggg 300 taggcctgca aatgctagca gccccgggag gtgcagggcc cagtctgccc atctcgaccc 360 cagtgctctc ctgccaggct gccctccgcc cggccccagg tccatcttca tgtactcctc 420 agtgccagtc tcttcctctc tgggagctgg ctggagctgg gatggacacc tgacagaagg 480 tgagctgtgg aaagccaccg ggccagacaa gtagccagac tgatcactcc caaattcaat 540 attgacatat tcccccgggc ncttgggctc tggagggtgc agcaagggct gctgctgctg 600 ctgctgctct cgggcccgag gtaaggtgct ggccttggga tcccccaggg acagcctcgt 660 gggccgggcc aggcggctat tggtctgagc agctgtgtcc acctttcgag gcagatgggg 720 ctgcagaacc tgatggtggg gatgtggaag gctgggctcc agcctagccc cgcagtatcc 780 cccacccagg ctgtcgctgc tggtggaaga ggaagaatca tctgctgttg cagcatagag 840 aaggcgacca gagctagtgg aaaggcggag gtgctgatgc cgggcaccct cctccggctc 900 cccggggcgc tgggtgtgct taaaggatct tggcaatgag tagtaggaga ggactggctt 960 gtgctggggg tcctcagggc cgtagtagca gtcggagggg ctgctggtgt tggagtcccc 1020 c 1021 50 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
50 gattggggat ctggtggaag cggatgaact cccgcacccg cagcatctgt gtgtggtagc 60 gggctgtgcc cgagtacagc cgctggatga tggccgacac gttgccgaag atgctagcat 120 acatgagggc tgggggcgtg ggcacgtggg gccgtcagcc tctgcaggga ccccacccac 180 ccacagggac cctgctcagg ccccgcacca ggtcagtgtc tcagtctcag cgtcgacatg 240 cccacgagac gcccttgtac atctgcgctc cagcacaccc cacccttcag tagtccccgc 300 cctggtgacc cagcccccaa accatgtcac gatggtggcc cctggagtct ctaagttcca 360 gggcctcact ctggcccggc tagcagcctc agtttcctcc aacttgggtt cctccaccgt 420 gggctctccc cgccgcccgc ccctgggcac actcacagcc aatgagcatg acgcagatgg 480 agaagatctt ctctgagttg gtgttgggag agacgttgcc gaagcccaca ctggtgaggc 540 tgctgaaggt gaagtagagc nccgtcacat acttgtcctt gatggagggg ccgcccaggc 600 cgctgctgtt gtagggtttg cctatctggt cgcccaggtt gtgcagccag ccgatgcgtg 660 agtccatgtg tggctgctcc atgttgccga tggcgtacca gatgcaggct agccagtgcg 720 cgatgagcgc aaaggtgcac atgagcaaga acagcacggc cgcgccgtac tctgagtagc 780 gatccagctt ccgcgccacg cgcaccagcc gcagcagccg cgcagtcttc agcagcccga 840 tcagctgggg gacagggaag gggcacattc cgttgatggg gcaagggggg caagggagga 900 ggggaggtgc tgcggccctc agagcgagca tcagaggtca gatccccaaa gacttcctag 960 accctcctcc taagaggtga agcccacact gggcccagca caggtgtctc attaatctta 1020 g 1021 51 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
51 catgctcttg cggaggtcac ccacacgtag catgaagcag aggcggccgt ggcgcagggc 60 gatcaccgca tgcttgctga agatgagggt ctcagccctg cggtgggctt gggcagtctt 120 catgaagatg cagccaagca tgatggcgtt gatcatgagc cccacgatgt tctgcacgat 180 gaggatcagg atggccagtg ggcactcctc agtcaccatg cgccccccaa agccaatagt 240 cacttggacc tcaatggaga aaaggaaggc agacgagaag gagtggatgc tggtgacaca 300 gggctcagca gtgccctcgc tgggggccag gtcaccgtgg gcgaaggcga tgagccacca 360 ggccatggcg aagagcagcc agctgcacag gaaggacatg gtgaagatga gcaatgtgtg 420 tggccacttg aggtccacca gcgtggtgaa cacgtcctgc aggaagcggc cctgctcccg 480 gatgttcttg tgggccacgt tgcagttgcc tttcttggac acaaagcggg ccctccgctg 540 gcgggcacgg tacctgggct nggcagggtc ctctgccagg cgtgtcagca cgtattcctc 600 ggggatgatg cccttgcggg acagcatggc tccggtgacc cccagggagg ggcttccccc 660 atcggaggca cccctcggac gtggcctagg gcctcactgc agagtcctct cggtgggcac 720 cttctcaccc tggggctgca ctcagcctgt gctggcctca cttctgagat aactccccac 780 cagactcttc cttacctcca cctgggtccc acttcacttc ttaataccag cctcaggccg 840 ggcgcggtgg ctcacgcctg taatcccagt acgttgggag gctgaggagg gcagatcact 900 aggtcaggag ttcgagacca gcctgaccaa catggtgaaa ccccatctct actaaaaata 960 caaaagttag ccgggcatgg tggtgcgcac ctgtaatccc agctactcag gaagctgagg 1020 c 1021 52 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
52 cacttcttgg agccacagac gcaaagcagc agccctcggg gattgttctt ccccagccac 60 cggcccagag tgtggctggt caatcgtggg gacccaggac tggctggacg cacagctcta 120 gggcccagta cctcccacag cctctgcagc cttgggcggg ggagaggggt gagccagtcc 180 tgaattgggt tgggaggagc agggacaaaa ataacccagt acaggttcct gctgaggcca 240 gaaatagcat agtgacaagt gccttgtaac accctggatg agcagcaggg ggaggctgag 300 ctgaggctgg cccagcctca caccaggccc tggccgggct acataccaca tggtccgtgt 360 gtacacacgc gtgtgggggg cccgagagac catggctcag gacagggaat ctggagagat 420 gctgaacttg ggcttggcct tggccatggg cacgctgcgc ttgcgcaggg gcccgcgggc 480 tgaggcgagg gtcagagctt ccagtaggct gtggtcctca tcaagctggc gggccgtgca 540 gagtggtgtg ggcactttga nggtgttgcc aaacttggag tagtccacag agtaacgtcc 600 gtcctcctca gctacaatgg gcacaaagcg ctggccccac aggatctcat cggccaggta 660 ggaggtgcgg gcctgggtgg tgatgcccgt ggtttccacc acgccttcca ggatgacgat 720 gatctcgagg tcctggtggt ggtgcaggtc gctgggtgcc aggtcgtaga gtgggctgtt 780 ggcatcaatg acatggtaga tgatcagcgg ggccaccagg aagatgctgt tgccacccac 840 gccgttctcc atggggatgt ccacctggtg gaggggcacc acctcgccct cggggctggt 900 ggtcttgcgt accacctgca tgtggatggt ggcgctgatg atcatgctct tgcggaggtc 960 acccacacgt agcatgaagc agaggcggcc gtggcgcagg gcgatcaccg catgcttgct 1020 g 1021 53 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
53 atctggtatt gtacaacaca tgcaggtaag taactgaaat ctccaggagt tggatgtgta 60 gtattttggg aggagaccag gcttgggcca caaatgaggg cactttgcac tttcatcaaa 120 tccatgtcta ccttgtcaat ctgaataact gagagagggc aggtagatat tttacacctt 180 gaagatttgt tttctggtca tgtaaaaatt aaatataaac aaataaagaa caaagcaaga 240 gagacagaaa aagaaagaga atgagagaca aggaaagatt gtgttggggg gagaagagaa 300 gggtttgccc agctagggca ctaaactttg gattcattct ccaggtttgc cacatcacca 360 tttctttctg tttgctcttc gaggttcttt tcttcctctt cagtctccag ttctgcatgt 420 tggttgagtt tgctggatac agaccaactc aggggcagct ctgccctgct ggctaactcg 480 gccagctctt tggcactaag ggatggggtg ctggtctcat aggtctcatg gaagctgttg 540 tagtcaactt cgtagaaccc ntcctccagg gtcaggacag gtgtgaaccg gtaaccccac 600 aggatctcac tggtgatgta ggagcttcga gcttggcatg tcatccctgc agagagaaga 660 atggaggctt tagcatatgt aagtgtgggc tttccatggc caaggagtca cagagagcca 720 ggaggagtac tgcatgcagc tgttgagact gacctgcata cgatgccaca cttagtaggt 780 gtcattcatg ttgtagacac atgctaatgt gccatggaga ttccaggcct cttaagggag 840 tcctggggaa caatgagaga gtcctggccc acatcaagcc acatttgcct gcatggccat 900 gcacatgcaa aggaaatcaa gtgtgcaaat gcacacaagt tttcgcatgt gcatggctat 960 gtctggtcca ctctgctctg ggagaaccct gaagccatga ctctggcctc ctactgctct 1020 t 1021 54 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or g. 
54 acgggtgccg gtcaagagag gggggcaccc cgtgcctccc taccacacct tctggaagac 60 atagcccccg ctggggcccc agcccacgat ggggtcggag gacggcttcc cgttgatgtt 120 gggctgtgag ttgatggtga ggatgccctg gcggttcacc cgcagcagct cctccttcag 180 caggctggtc tcagccgcca ggggctcatc gttccagggc aggcaagtca cctgggagag 240 acggtgagct ggctggggcg accatcaggt ttggcaccct gagtccctct cacggccccc 300 aacaaagacc cagcctgtct ttgcctccct aagcccttcc aggtggaggt ctcccaactt 360 acccttctcc ctttgccatg tccacagcat ggaggggagg gcacaggatg gggaagtcac 420 agccccgcag cctggcctgc agctggggtc aggccagggg caggggatga accagggtcc 480 ccactccagc atcactcact ttgtgaccat tccggtttgg ttctcccgag aggtaaagaa 540 caaagacttc aaagacactt ncttcactgg tcagctcctc cccccacatc ttcagcagct 600 cctccttggg ggacttgctc ttcaggtaga agaggtagta gtccttcagc tccccaaagg 660 caggggaaga ggaattgccc ctggcagagg ggtgcccaga ggtcagggca cactcctgac 720 agagggcagt gccaccacat gcccaggagg ccattcctgt aaattctgcc cctgactcct 780 cccaggtcaa ccacaagcat gcaaacttct tctgccctcc cgctcccaag aacaaagatg 840 tatttgcaag gaaggtctgc aggccctcac cagcggccgt tagggaactc gtcccactcc 900 tgggtacggt agatgtaact ctttggtctg gaggcccaga agatgggacg tacatcttcc 960 tctcggcgct tggggtgggc gctgagagcc cagggtaggg gacgcctggg tgaggatggg 1020 g 1021 55 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
55 gccacctccc tggattcttg ggctccaaat ctctttggag caattctggc ccagggagca 60 attctctttc cccttcccca ccgcagtcgt caccccgagg tgatctctgc tgtcagcgtt 120 gatcccctga agctaggcag accagaagta acagagaaga aacttttctt cccagacaag 180 agtttgggca agaagggaga aaagtgaccc agcaggaaga acttccaatt cggttttgaa 240 tgctaaactg gcggggcccc caccttgcac tctcgccgcg cgcttcttgg tccctgagac 300 ttcgaacgaa gttgcgcgaa gttttcaggt ggagcagagg ggcaggtccc gaccggacgg 360 cgcccggagc ccgcaaggtg gtgctagcca ctcctgggtt ctctctgcgg gactgggacg 420 agagcggatt gggggtcgcg tgtggtagca ggaggaggag cgcggggggc agaggaggga 480 ggtgctgcgc gtgggtgctc tgaatcccca agcccgtccg ttgagccttc tgtgcctgca 540 gatgctaggt aacaagcgac nggggctgtc cggactgacc ctcgccctgt ccctgctcgt 600 gtgcctgggt gcgctggccg aggcgtaccc ctccaagccg gacaacccgg gcgaggacgc 660 accagcggag gacatggcca gatactactc ggcgctgcga cactacatca acctcatcac 720 caggcagagg tgggtgggac cgcgggaccg attccgggag cgccagtgcc tgcacaccag 780 gagatcctgg ggatgttagg gaaagggatt gtttcttttc cttcgctcta tcccagggca 840 ggacagtatc aggcacttag tcagctctag gtaaatgttt gtacagggca cactctacac 900 aaaatgggta ccttccattt tgtgcaacta cagtcacaga gtcgtgatcc ccagattcag 960 gttccccagg ctggtaggct ggcaatctcc tctcactcac ctcttatggt ttgttgtggt 1020 t 1021 56 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
56 acccagaatc ctgcagtttc tcctgattaa cagctaagta aattctatag cactgtactg 60 aaaatataaa aaatttagaa tatagggctg atcatccctg atcctaagat tgtcctctga 120 agttgatttt cagggtaaat ctttcatatc cactttttaa attgccgatt gtttcttatg 180 aaacaagtag taaaatgtac aaaagaaaaa gaatctagct taaattatag agttcagaca 240 tattttttag taggaggaag aggaatagaa taacaaaata gagtgtgaaa tttggagtaa 300 attgacagat tttcagaata aaatgtttct tttttctctg tacatgttaa aaatatactt 360 tgtattgata ctttcatgtg ccatcactaa tattacatat atagcatatt aaagagtgac 420 attttaaacc attgttaaat tattcaacag ggactaaata ggaatagttt gccaactcca 480 cagctgagga gaagctcagg aacttcagga ttgctacctg ttgaacagtc ttcaaggtgg 540 gatcgtaata atggcaaaag ncctcaccaa gaatttggca tttcaaggta aaatctgcag 600 agccttttaa gaaacttgaa tcaaatgcat ctactttgtt tctgtcaata atgtttcaaa 660 tagttctgga agcagaaagg aatggttgaa gtattttagg tataggacaa catgtgtagt 720 aataatatgg taaaatagag aaactgatta ttaaagagaa gctaatgtgt cttgtcctaa 780 aactttgata ggctgggtac aaaatgtgct ggatccctga gaacatgaga tagtttaggg 840 aaatcaggat caactcagga ctggatgctg gggaagtttt taaatcgata gaagtggcca 900 ttacagggtt agccaccaat ccaatgaata gtatccaaag gtaggtctgc agaattactg 960 acttctgaaa agaggagcac gtttccaagg ctcatcacaa ttgttaggtt taaggtaacc 1020 a 1021 57 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
57 ctcctttaac ataagatata tgggtaagaa aattccaatt taatgatatt caaatatata 60 aatatttgtt gcatcctcag gtttctagtt atgtgttaaa aaaatgatat gttgaaatct 120 cttcaatttt agaagaacct tgttataaag aacagagcta aaaatattag aaccacctgc 180 cctttagtgt aacaaaataa actagccttt ttggtttact taattacagt cttaccatca 240 aaaatatatt ctctaactta aaaaaatact tttttggtaa tatttgatga catttctgat 300 gagagcacat aaaaataaaa caatacttaa agatgtggat ataaaatgct caaggaatca 360 tcatttaaaa acagacggtt cccttattgt ttctgttcat gtcaaaaagc agggtttttt 420 ttttacacag tctctgtagc tcctaggaat ttcatttcta cagcagcttt tggcctgtgg 480 gctgagccac tcttcttttg gaattctgca gcaatttcct caaaagactt tcctttggtt 540 tctggaactt taaaaaatgt naacagggta aaggccagga gcactccagc aaagaggaaa 600 aacacataag gtccacagaa gtcctggata gaaagcaaac acagactttg agttagcagt 660 tttttgaccc tctcttctgt tcagtaaatc tgtggaatat taggctgctt accgcaatgt 720 actggaaaca cagagctaca atgaaattgc aggtccaatt gctgaatgca gctattgcta 780 aagcagcagg acgtggtcct tgactgaaaa actcagccac catgaaccag gggatcgggc 840 ctggcccaat ttcaaagaag ctgacaaaga ggaagatggc tatcatgctc acataactca 900 tccaagagaa cttattctga ggaaaaaaac aaaaacaata gtgggactga gatcatttgg 960 ctgctttttc ctttagctaa gtagcctctg agttcacagg cggcatacaa ctttttctaa 1020 t 1021 58 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
58 atggaataca ggggacgttt aagaagatat ggccacacac tggggccctg agaagtgaga 60 gcttcatgaa aaaaatcagg gaccccagag ttccttggaa gccaagactg aaaccagcat 120 tatgagtctc cgggtcagaa tgaaagaaga aggcctgccc cagtggggtc tgtgaattcc 180 cgggggtgat ttcactcccc ggggctgtcc caggcttgtc cctgctaccc ccacccagcc 240 tttcctgagg cctcaagcct gccaccaagc ccccagctcc ttctccccgc agggacccaa 300 acacaggcct caggactcaa cacagctttt ccctccaacc ccgttttctc tccctcaagg 360 actcagcttt ctgaagcccc tcccagttct agttctatct ttttcctgca tcctgtctgg 420 aagttagaag gaaacagacc acagacctgg tccccaaaag aaatggaggc aataggtttt 480 gaggggcatg gggacggggt tcagcctcca gggtcctaca cacaaatcag tcagtggccc 540 agaagacccc cctcggaatc ngagcaggga ggatggggag tgtgaggggt atccttgatg 600 cttgtgtgtc cccaactttc caaatccccg cccccgcgat ggagaagaaa ccgagacaga 660 aggtgcaggg cccactaccg cttcctccag atgagctcat gggtttctcc accaaggaag 720 ttttccgctg gttgaatgat tctttccccg ccctcctctc gccccaggga catataaagg 780 cagttgttgg cacacccagc cagcagacgc tccctcagca aggacagcag aggaccagct 840 aagagggaga gaagcaacta cagacccccc ctgaaaacaa ccctcagacg ccacatcccc 900 tgacaagctg ccaggcaggt tctcttcctc tcacatactg acccacggct ccaccctctc 960 tcccctggaa aggacaccat gagcactgaa agcatgatcc gggacgtgga gctggccgag 1020 g 1021 59 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
59 gagtcccctc cttactgggg tccctgcccc agcctgaggg gagggaaagc tctgcctaag 60 accgcctgcg tccagagtcc agacctacct ttccacaggc ccctgactcc ttcctccctg 120 gcgatggttc tgtaggcgtc catagtcccg ctgtattttc tgtcgctcct ggatggcccg 180 aggtgtatgc tggcctgaaa tcggaccttc accacatctg tgggctgggc acaggtcacc 240 gccatggctc ctgtggtgca gccggccaaa atccgggtag tgaggctgga gtctgggagg 300 ggcagagaga gtgggccagt gtcccctact aagcagcatt ctgggacatg ctgttctctg 360 cggggctgcc cctgcagctt ccttgatgtc cactcagagc ctcctcataa gcgtccggta 420 ccagcttccc cccgcccctg gctctgcctc tgagtctaga cttccctggt ctcttgaccc 480 acacactttc agccacccct ttggtgttca gggacctggt cactcactgt ccgcgccttt 540 gggggtgtac acctgcttga nggagtcata gaggccgatg cggatggagg cgaagctcat 600 ctggcgctgc aggccggcca ccagcccatt gtaggggctg cagggaccct cagtccgcac 660 catggtcagg atggtgccca gcacgccacg gtactgcacg agccgggccg tctggaccgc 720 ctggttctcc ccctggatct gagggacaat agcagggggt gaggactcag atgggaaggc 780 aagaaggggc tgcgtgcaca ggaaccctgc tggggctggg cctgcctggg ctgggcctga 840 gaacaaccat gctggtcaca gtagaaatca ctggtgtctg cgcagcattt taccattcac 900 aaagcagtat tatacacatg gcttggtgtt tgatcctcag agtaaatcag agggacagat 960 tgtttttccc attttataag tgcttcgtgg cttgcccaag gtcacacagt taattcctta 1020 c 1021 60 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
60 aagaaaatca aacttaactc ggacccagag acattttagt atgtgttgga aactttagca 60 tctggtcacc atcctccaaa gaattatttg gattggaact cggtcagagc tgtcactctt 120 cagctaggaa tctaagagga tcatgtcttg gatgttacgg agtatagaca accaagttcc 180 ctgccctcaa aagcccgatc acttataaga cagcttatgg agctttgaca gagggcagca 240 gttgatggca ttatcctttg aactcatagc ttagttggac tcctactggc ttgtgggacc 300 aaatctttcc ctaccacagt tggctatagc aaaagttgtg aaaaatgcca ctaggatata 360 ctggtgaggg aaaaggaggt ccatttgtag ttatagtata attgaaaaga aaagctctga 420 agaaaactct agcctactct ttttcagccc aaggggaagg cagagcacct gctgacagat 480 gctggcgtag cgagccaggg cgttggcgtt ttcctggata gcgaggctgg atggacactg 540 gtcggcaatc ctcagcacag nacgccactt cccaaagtca acaccatctt tcttgtactg 600 agcacagcgc tctgagaggc catcaagccc tgcaagtcac aaaagagaga aaggcttctt 660 tgtacctttg tacctgatcc atggggcttc taataaaggg aaggagttct ccctttgctt 720 agctttcaat ccactgtgct tgaggattga aaacagccaa gcatatcagc attaatcaca 780 acactgaacc agaagactta gatttaataa atagtgtttt gacatacata ctatctactc 840 catatataga atagaagaaa ccaatagtta atatgatact cattttacaa aggtggaaac 900 tgaagctcct aatggttaag caactttacc aagtttgaat tgctcaagag tgacagagct 960 gggattcaaa ttctgcttag ctaacccaat gttgtgagtt aatgcttgtc tacttgggca 1020 g 1021 61 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
61 gccttagttt tggtaatctg caaaaccaag ggccctgccc tgggtgctgc tctcccagtg 60 caaagtccct aactttggtg tgacccttac cagagtcaag gctgtctggg cctggctcct 120 tgtgacatcc atcgccatct ctccgagggg ctaagaggta gatgctttgg gaggcagaga 180 tgctcctgcc tgctgaggcc tagcacatgc tgtagccttg gagcgtaagg cccgcctgtg 240 gcagcaacgg ctgcttggat ggagagctgc ctgaggctgg cagcccaggg cttctgcact 300 gaaagggctc agcctggcgg ctgctcaaat actctgcccc ctgccatggg gtcagaggca 360 gggcagaaag ggagggtagg ccatgtgggt aacagttgac agggccacgg ggacagagcc 420 atggggcagc cggccacact ctgtgaacat ggggtaggga ttgctgccca gcaggagggg 480 gtgtgcagag ccagcctacc catcttccat tcctcagcct tgtgcgggca gaaagtcacc 540 aggctgcctt ggccacagaa nacttactga aatgcccttg gacagggagg gggtcctaag 600 ggggcctggc ccgcgctggt gcaggtctgg acttgctctt ggaggcaagg ggatccccag 660 tggattttca tctgcagaga ggttcgattt gcatttcata caatccaggg gtctgtatgg 720 aacttgggga aggggtggtg gaggaaggtg gccaactgat caaaaacaaa caaaaaacag 780 gggtatcatt cttaattttg tgactgcaaa gtccaggcct caggcttgct ttgggtgcct 840 ccatgggcat agaccatgac ttccaggctc tggcccaggc ctctccttgg gctcacctgg 900 gagtgacatc cacatgctat gtacttgctg gcacctgcca aagcctgcta aaattagctg 960 gagctggcaa gtgggtcagg gtatggaggg tgccttgtca gaatgccagg tctctcgcca 1020 a 1021 62 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
62 ataaaagccc caggccaggc cccggacact ggtgtcctgg gtcaccgtta gctccaggaa 60 taagtaccct agaacccctc gagaggctgg acactggata gccacagtga ggaggggtgg 120 tgggcagagg gccagtggca ggcacagctg ccctagccag gacccccaag gcccatgtgc 180 ctccttccaa ggtgccccaa gcctgctcgc cttccctgcc cccagcctta gttttggtaa 240 tctgcaaaac caagggccct gccctgggtg ctgctctccc agtgcaaagt ccctaacttt 300 ggtgtgaccc ttaccagagt caaggctgtc tgggcctggc tccttgtgac atccatcgcc 360 atctctccga ggggctaaga ggtagatgct ttgggaggca gagatgctcc tgcctgctga 420 ggcctagcac atgctgtagc cttggagcgt aaggcccgcc tgtggcagca acggctgctt 480 ggatggagag ctgcctgagg ctggcagccc agggcttctg cactgaaagg gctcagcctg 540 gcggctgctc aaatactctg ncccctgcca tggggtcaga ggcagggcag aaagggaggg 600 taggccatgt gggtaacagt tgacagggcc acggggacag agccatgggg cagccggcca 660 cactctgtga acatggggta gggattgctg cccagcagga gggggtgtgc agagccagcc 720 tacccatctt ccattcctca gccttgtgcg ggcagaaagt caccaggctg ccttggccac 780 agaacactta ctgaaatgcc cttggacagg gagggggtcc taagggggcc tggcccgcgc 840 tggtgcaggt ctggacttgc tcttggaggc aaggggatcc ccagtggatt ttcatctgca 900 gagaggttcg atttgcattt catacaatcc aggggtctgt atggaacttg gggaaggggt 960 ggtggaggaa ggtggccaac tgatcaaaaa caaacaaaaa acaggggtat cattcttaat 1020 t 1021 63 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
63 caggcagggt ctgcagtggt atcactgtgg gcagagcctg gggagggggc caattctgtg 60 cacagggcaa gggcgagagg aggggccagg gatctagggc tccggggagg ggtcagcagg 120 tcggggggag ggatccacgg ggaggggtta ccctgggtga agaagtgagc cttgtacttt 180 ccagtccgca cagcaaaaac cccacggacc tcgtctgggt aggacgggta gaagaagaga 240 gactgccgag ggctctgggg gcagagtcag gggtcacggg gcggggcagg ccccaagcac 300 tgcacatacc tggggctgcc agccctggtg ggaggccctg gacgtgcacc gcttcttgcc 360 cacccaggaa cctgagaggt ggcgccactt ggatgccact cagtgcagga ggcactgagg 420 cacagactct caggcactgc ccacactcac cccaggggaa ggccaggaca ggggccaagg 480 atctgggatc aggggtcacc ggccctacct tgcctgtgcc cagcagcagg gggctgaggt 540 caaagccatc caaggtgaca ntgggcagtg gggccccagc cagggctgcc agggtaggca 600 gcaggtccag ggagctggcc agctcgtggg tcacgcctgg gggcaggagg ctggtcagtc 660 actcagttcg ccatcaaggt tggggtggtg gggccagggt tccaaggaga gggcctgcgg 720 actgaccggg agcgatatga cctggccaga aggccaaggc aggctctcgg acaccgccct 780 cgtaggtcgt tccctttcca caccgcaaga gaccggagca gccgcctcgg gacatacgca 840 tggtctcagg tctgggacac aggaggcgct catgagccat ggagccacag cctctgagcc 900 accgagggtg accagtggcc ccacacctct aagtcacaaa gcttgcccgg aggtgcccag 960 catgagcccg gcacctccca ggcctaccaa gaccagctct ctgtgcactg tgtctcctga 1020 c 1021 64 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
64 gtccagcaat gagtcacaga cctatgcacc acctgcaaag gagccagaga aaacaaacgc 60 ccagcgcttt tagcctgaaa atgagaatct ggtttgctgg ggaagataaa gggtgtcgga 120 aaatggctgt tgggtaaatc attgatgtct gccactagga atgaaaggca aatcaggaac 180 tggcacacat gctttcaggg agatggctgc aagggagagg gcaaagactg ggaagttgct 240 tatgtggtgc cagactattt ggaagatcat ggattgcggt gtttgtgttg tgtggtcatc 300 attttgttct ttgtttacag aacagagaaa gtggattgaa caaggacgca tttccccagt 360 acatccacaa catgctgtcc acatctcgtt ctcggtttat cagaaatacc aacgagagcg 420 gtgaagaagt caccaccttt tttgattatg attacggtgc tccctgtcat aaatttgacg 480 tgaagcaaat tggggcccaa ctcctgcctc cgctctactc gctggtgttc atctttggtt 540 ttgtgggcaa catgctggtc ntcctcatct taataaactg caaaaagctg aagtgcttga 600 ctgacattta cctgctcaac ctggccatct ctgatctgct ttttcttatt actctcccat 660 tgtgggctca ctctgctgca aatgagtggg tctttgggaa tgcaatgtgc aaattattca 720 cagggctgta tcacatcggt tattttggcg gaatcttctt catcatcctc ctgacaatcg 780 atagatacct ggctattgtc catgctgtgt ttgctttaaa agccaggacg gtcacctttg 840 gggtggtgac aagtgtgatc acctggttgg tggctgtgtt tgcttctgtc ccaggaatca 900 tctttactaa atgccagaaa gaagattctg tttatgtctg tggcccttat tttccacgag 960 gatggaataa tttccacaca ataatgagga acattttggg gctggtcctg ccgctgctca 1020 t 1021 65 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
65 cggggtccca gaaggtggtt taaggactgg tgtggacaca cacagttctg ttgtctgccc 60 agcggagggg cctcacgggg ggccgttggg agccagattg tcagcttttg gatttaccag 120 ctgtgggtgg cagtgggcgt gtaactcagc atcttgctgc ctcagtttct ctcatctgta 180 aagtggggat aataacattt acctcataaa gttcctgcga ggattcgatg acttgataca 240 tcagttgctt agcacagggc tcagcactca gtacatgttc cctgtcagga aggcagggag 300 gcctcactgg cagcatcagg acatgggaca tcaggacata caccgtggct ctcagggaaa 360 ggaaaaagac cctctcccag gtgtacaagc tcgattctaa acctcatggg accctgcatt 420 gttcgctccc tcattcattc actagtccat gcgtgtactc agtagtggca taagcagact 480 gctcgggtcg gacctgaatt agcctcaccc actctcctcc tacactgtcc ctccccaggg 540 cacattcgcc tcccaggtga ngctggaggg ggacaagttg aaagtggagc gggagatcga 600 tgggggcctg gagaccctgc gcctgaagct gccagctgtg gtgacagctg acctgaggct 660 caacgagccc cgctacgcca cgctgcccaa catcatggtg agcccctggc cagcgggcac 720 tgagggcctg ggggtggcaa gcacattgcc agcccagtgc cccccggtgg tcgcacgtgg 780 ggagggaagg atccaaagga ggtctcgtgc acaggaagcc gtcacctgga gtttggctga 840 tagagagagt ttgctgggtc atctctgcca atactgagag ttcatggggg ctgctttggc 900 tagcagggag ggcttgctgg tatctaggcc agtagaaagc cttcgctggg cagcagaagg 960 tgttcccttt gtcattccag ccagtggaac aagttcactg ggtcatctag gttcattagg 1020 g 1021 66 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
66 tactaaaaat atataaatta gctgggtgtg gtggcatgtg cctgtaatcc caggtacttg 60 ggaggccaag gcaggagtat tgcttgaacc caggaggcag aggttgcagt gagccgagat 120 cgtgccactg cactccagcc tgggcgacag agcgagactc catctcaaaa taaataaata 180 aatataataa aaataaataa acaaataagc ttccttttgc tcattgaccc cagaatccca 240 gagaaaccac acgtcccagc aaccctcgtg gcagaataag ccacagaaaa cagcccaccc 300 taagtgcctc gcctccagca actgaagttg cacgagtcag cacgtgccct tctgtggacc 360 tcagaataga tcccttcata caagggctgc aggagaaagc aggactccca gcaatctctg 420 gggtctgagc tggcctggca agctgcctct ggggctgcca ggaactgcta tctctctgca 480 cagaggtcca atccatacct gcgttgcaaa gatggctctc ttcatcatag tgaagtcttc 540 cttatccagc atcttgttca ngtcgggaag gctcccactg caaggcaagc agggggcatg 600 catgtgagaa cggagtaatg agaggggtta gtcagggcct aggagggcac agggctgagg 660 gtggggcact cacaccagta aggattcata aagcttcctc ccgaactttt ccttcaccgt 720 gttggccgtg tccctggagg aagcagagca acagggtcac atacacacca gctgccattt 780 actgttaggc ttctttagtt agtttgtttg tttattttga gacggagttt ggctcttgtt 840 gcccaggctg gaatgcaatg gcgtgatctc ggctcactgc aacctctgcc tcccaggttc 900 aagcaattct cctgcctcag cctcccgagt agctgggatt acaggcatga gccaccgcgc 960 ccggctaatt ttctattttt agtagagacg gggtttctcc atgttggtca ggctggtctc 1020 a 1021 67 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
67 ccctccccac agtactgtgc agccctggaa tccctgatca acgtgtcagg ctgcagtgcc 60 atcgagaaga cccagaggat gctgagcgga ttctgcccgc acaaggtctc agctggggta 120 aggcatcccc caccctctca cacccaccct gcaccccctc ctgccaaccc tgggctcgct 180 gaagggaagc tggctgaata tccatggtgt gtgtccaccc aggggtgggg ccattgtggc 240 agcagggacg tggccttcgg gatttacagg atctgggctc aagggctcct aactcctacc 300 tgggcctcaa tttccacatc tgtacagtag aggtactaac agtacccacc tcatggggac 360 ttccgtgagg actgaatgag acagtccctg gaaagcccct ggtttgtgcg agtcgtcccg 420 gcctctggcg ttctactcac gtgctgacct ctttgtcctg cagcagtttt ccagcttgca 480 tgtccgagac accaaaatcg aggtggccca gtttgtaaag gacctgctct tacatttaaa 540 gaaacttttt cgcgagggac ngttcaactg aaacttcgaa agcatcatta tttgcagaga 600 caggacctga ctattgaagt tgcagattca tttttctttc tgatgtcaaa aatgtcttgg 660 gtaggcggga aggagggtta gggaggggta aaattcctta gcttagacct cagcctgtgc 720 tgcccgtctt cagcctagcc gacctcagcc ttccccttgc ccagggctca gcctggtggg 780 cctcctctgt ccagggccct gagctcggtg gacccaggga tgacatgtcc ctacacccct 840 cccctgccct agagcacact gtagcattac agtgggtgcc ccccttgcca gacatgtggt 900 gggacaggga cccacttcac acacaggcaa ctgaggcaga cagcagctca ggcacacttc 960 ttcttggtct tatttattat tgtgtgttat ttaaatgagt gtgtttgtca ccgttgggga 1020 t 1021 68 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
68 gtacatacac acccatgtga tacatataca catacccata gtatacaggt aacataaaat 60 tatacacaca caacacaaac acatattatg cacatacgca cataacacac acacacacac 120 ccacatacag gcattgtgaa ctagacacat caccttacaa tctgtggttt actggaagga 180 catggaacaa aaccccccca gccacagcgt ggaagtgccc tctccaggca caagattctg 240 cctccatggg gcgtggtagc agcattgccc acccacccag ggctgagtga gcaggcctgc 300 cccacactgc gcccatgcac agccactcca ggctgcctcc cacactgcct gcaaggaccc 360 cagtggggac tgcaaacggg aagtctgcat ccagggcccc agggagggca ggtggggctc 420 tggagtatag cactttctag aagggaagca ccctcttggt tctgaacgta agtgggtctg 480 ctcacaggga ggggcgtgca gccaccccag gaccccagct gtccaaggag ccagggaaaa 540 cgcacccacg gggcacctac ngctgggagc gcaaagaagg agatggcaaa gacagagaag 600 caggaggcga tggtcttccc gacccacgtc tggggcacct tgtccccata gccgatggtg 660 gtgactgtga cctgcaggga gagggacagt ggtcagccac ggatgggact ggagcctcgg 720 gagggccaac tgcctaaccc aaacccacca ctctgatgag cggagaggcc ggcaagagac 780 cctgaccacc aggacgaccc cgtgtgactc ggcgaaagca ccaggaacag agccgcggga 840 tggcacatgt ctcccaggct ctcggcgtca cacacaaggt atgtcccacc agcacatgta 900 aggagcccag cacccacgaa gggccaggcc tgctggctgg gaacgtgggc ctgggagctc 960 gccccacacc ggctgcctca tctgcctgcc tgtccccagg aggctgggcc cctgggccac 1020 c 1021 69 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
69 agcctgggtg acaagagcaa aactccacat caaaaaaaat aataataaat aaattaatta 60 attaattaaa taaaacaaga gcttttcttt ttgcttaata agagagagtg gtggtggtgc 120 ttttttattc ctgaagatgg gaagtcctct tttgcccact aacctcagaa gaaagggatg 180 aggtgtaccg tacaggggca gtcaccttct cctctgttta gcttccattt tggcctcatg 240 tctaccccaa agttgtagct tagatggggg gaaaattcag aattttgcat agaccatagg 300 tagcaccccc tagaaaaaga atgtttctcc ccagatgtct cccactagta ccctaaccat 360 ctgcttgtct gtctagtgag gacccttgga gggctgctaa aatgatcaag ggttacatgc 420 agcaacacaa catcccccag agggaggtgg tcgatgtcac cggcctgaac cagtcgcacc 480 tctcccagca tctcaacaag ggcaccccta tgaagaccca gaagcgtgcc gctctgtaca 540 cctggtacgt cagaaagcaa ngagagatcc tccgacgtaa gtgttttcat cctgcctctg 600 cctcaacctg aagtgacctt tgccctctca ccccattggc tgcctcagtt tccctttcat 660 cgacaaggcc ttgtgagcac ttggcagata tgaggaaggt ggcaagtaga tttggccttg 720 gtggttgctg tacaatggat tggcttctgt catgttcttc agtcacagcc cccttgctac 780 ccagccagtt gctctgagga gcctgtcagt gtatgcagca taccttaaac tttttggccc 840 ctccttccac ctccttctct ttgaaaccaa gtaggtgaca gagtgaaatg tcttccctga 900 gagaaaaccc agcatctccc cttgatacgt gaccatcagt caatttccaa agaagacatt 960 tcgttgcagt caataatatt gattactatt actgttaatt tcctcctctc tggaaaaagt 1020 a 1021 70 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
70 ctgacttagc tgggtgatat tgggcgggtt tcctctctct ggccgtttcc cacacctgca 60 ggctgggagt ggtgcctgct gcctcctgac agtgctgcag tgagcatcaa gtgagacaag 120 cccatgaaaa ccctctgcag ccccagaatg ccacggaaat gcagcattat tgtattgagc 180 tttgctttga gtttattata tcatcaaaca tattattaaa tgactgagtt gggtgggggg 240 ttggtcaaga gggcctatac aagaccccag gattctgtgg gacctgagat tctagaattc 300 tgccaccctg attccaaagc aagagaagag tctctgacat gatcagggcc agaaaactgg 360 ctggagaggc agacagtaca gtgcgttcat ataaatgact ctaattcagg tggtggcgtg 420 agactgtggg catgtgtgat gtgcaacaga gcaggctggt gtccataagc caacgatggc 480 acagtactca ccttctgggg ggcattgatg actccagtgt tgtagccaaa ctgcagggag 540 ccaagcactg ctcctcccac ngccagcatg aggcgacccg tcagcttctg cggagaaaca 600 aaccacactg ttataggcgt gtctgggagc aggttactac agggcagggc ctggactggc 660 aagtttctgt gttcagatat cttgcctgac tcttggcacc acaccagtct ttctcccagg 720 aaacttggcc aattcctgac cttaggtgcc caaaccagcc tagctgactt caagatactg 780 ggctggccgg gccatttcct ggggagagag gggaagtatg atcttctctc tctgtagcca 840 ggtctcagag agggagaggc tttggattct tgggggtctc atttccctgg tggagccatg 900 cctagggtct ggtggttcta gactctctga ctgggaggcc caggaaccag ccctcctatg 960 cgagggggcc caaattactt ggtaggaata gcacagatat agataggaga agcaccctgg 1020 a 1021 71 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or a. 
71 cataattttt ctcaaactcg gcggacggtt cgtgtttgaa agagaagttg ccattgatgc 60 tgagcggcgg gctgaggggt ccatcaaagg aagggctggt gcaatcagtc agagggcttt 120 caaagaaggg ctccagcgct gcgctgtagg cgtgcggcgg aggcttaacg tggaagacat 180 gggagctgtc catggtaccg taaggcggac tgggcagccc aggcgactgg taggagtagg 240 ggtgtacagg gaaggaagcg ctggccgtcg gcaggtgggg gggcatgtcc tggttctgct 300 caggcagaaa agtccgagga ttgagttgca ggcagcccgc aaccaggttg gtggtgggtt 360 gggataagcc cttgcaaagc gtctgaacga aggagaccag gtctgggctt ttgcctgagc 420 gcaggatctc cgacagagcc cagatgtagt tcttggccaa gcgcagagtc tcgattttgg 480 acagcttctg cgtcttagaa tagcaaggca ccaccttgcg caggttgtct agcgccgcgt 540 tcagtccgtg catgcggttc ngctcccggg cgttagcctt catgcgtctc aatttaaaac 600 gctccaggcg agccttagtc atcttcttct ttttggggcc gcgtctcttg ggcttttgat 660 cgtcatcctc ctcttcctct tcttcctcct cttccaggtc ctcatcttcg tcctcctcct 720 ctcccccgtt cctcagtgag tcctcctctg cgttcatggt ttcgaggtcg tcctccttct 780 tgtctgcctc gtgctcctcg tcctgagaac tgagacactc gtctgtccag cttggaggac 840 cttggggctg aggctcgccc atcagcccac tctcgctgta cgatttggtc atgtttcgat 900 ttcctacatt caacaaggga gaggcaaaca gaaagaaaag cagaaaaacg ctatattcaa 960 aagccagata cgccttcagc ttccactccc taaacctgta caaatgcttg cgaaaagtac 1020 c 1021 72 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
72 ggatttatct agtataacaa accatcggtc tgataataca tatctgatag tgttgctgtg 60 aatataattg aggtaataca tgtaaaagag ctggcacaca aaaagaagct caaaaaattg 120 ttctttcctt accaggtgtt gccctggttc ctgccatatc gctccccaaa ggtgctgtag 180 gagccatcat agtgtttgta gttcaactgt ctctggtaac ctggaaagga agattaacga 240 aacagcacaa tggattaatg tgcatgctga gggtggagaa attactaaaa gtaccttggc 300 ttctcttgtg acatttctta aattttgttg tcatagatta ggagtttctg agccttaaat 360 attttattgg aggttggaga gtggatagtt tccttgaaat taactatcat agcagctatc 420 atagtgagct aagctaatgt atcataatat tcataagtaa ctgaaaccta ctgggaaatc 480 cagttgaaat aacattcaag ttttccctta ctcaagtaat cactcaccag tgttgagata 540 gccaatggcc ttggacttga nctctggagt aagctgctgt gtttcattta gataatccag 600 tacatagatg ttaggagcaa agaggaccat attctgctct ccacagccat agggcatctg 660 gagaagattt tgtgtgtttt gcatggcaga gcctaatatg tctcctagag aatgggagag 720 atgggaagtc ataaagcttg gagattatca tctatcaaag tcattaagca gaaataatta 780 gttgagctta gaaattgaga atttttagga aggatgattc ttccagggat agaagtatga 840 ttgaaagcaa taaacaagcc caaagaagaa gagaagaaag aagttaaaat tatagtatta 900 tttttagtaa atatttatgg gaaataaaaa tagtataata gaagctgtta atgcccggat 960 ccactagggg ctggagactc acccaaaact gagacagaag ctcgggcaga ttcttctacc 1020 a 1021 73 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
73 gaaaacagtc tgggttccct gtggaacatc atttctcaaa actgtatttt ggggcttgct 60 ttctcatttt tcctttccat ttcagatatc cttactgctg tctttgggct ctttaaacac 120 tgcctttttt cctttttcga tcacacccaa aaacttttct caaaaattac atgtaaattt 180 aaaaatttac aaattaaatt taaaattgaa attttaaaaa tcccgactct ccctaatttc 240 aggaagcatg catttattat acataacaag acgtgaaagc cgcaagagtt tcagcctaaa 300 cactgaagac cccgcgaagt gaatccagct gctgctctac aagcagcaac aacaactggg 360 aagccttctc agctacactt cggggcactg gtccaacccc acgcaaaatc cctcgtttcc 420 cttagcgtgg taagacggag cctgacctga gctccaactg tcctatcttt ttcaaatgtt 480 tcaaacttac tgcctttgtt cagcagaacc acgggcacgg tgatgatggt gacaagcgca 540 gcagcaccca gcagtcccag nagaaccttc cacggtgtct gcaagccgag cagatcaagt 600 ccaattagag ggaagcgtgt ggccccagtt tccgtaggag ggtcggggct gctccagagg 660 cagcaggatt tgcaggtggg agtgcgttag aagagggaga ccgcgggctg ggggtggggg 720 tggcgtctgg agtgcgccag ttggagttct ctaaggcggg tgcccttgaa cttgtgcctt 780 cagagcacat tagcgttggt ttctctaccc ctgcccgggt tcgggcgtgc gttctgtgag 840 tggctctccg ggacattcaa agctcgacgc cagggtccta gcagaagcca gggtccgaaa 900 gctaagcgag agctctggga cgtcccttca cctgtcagag ggtggccttg gggcttccgc 960 ctaaggggag tccctggtcc ggtttcgcca gcttttgggc catttgggga gtttggcgaa 1020 g 1021 74 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
74 agaggcacaa gaaattacct tgaaaaaatg gaattcagtg actgccatct aggaaagaca 60 gtgatactgt ccagcagcat gcagttccga gagctcaact cttaggccac cctccctcca 120 ctctactcta ggaacaagga gcattaggtc tgttttctct ccatacacct caatcgctcg 180 tcctctcgtc ttattaaaac acagacacag aaccaaactt tttgacagtt aaagacaaac 240 aattacatct aattaaaatg ctaagagatc ctgagctgtt agagatgagg agagtagata 300 gtatgacctg atcttccccc ctcttttttt tcctttaaca gtattctgtt tcagcataaa 360 gcacactttc tgaagaggtt cctggtggag actggaaatc tgactgtgtc ctgtggcaac 420 acacagtccc ttgcataact ttggcttcag tccctggatc tgtcctttgc agctacgtca 480 ggttccatgg aaggaggaaa gagctggagg gcagtatcac tcagccaaag ctcccatggg 540 gtcccatgct ggcaggataa ngggttcctg ctctaacaca gctagcacct cttcagggac 600 atgcttcctg tccaccacca cttcgtagac atactcagag aaccactcat ctgtcatgca 660 caggtaacct ggagaaaaga acagaagact tatgagtcca gagggcaagg gacaaagagc 720 agaaaccctt tttgtaggat aaacctttta caaaactaat attcatacat atttttcagc 780 tttcccatct gtaatttcat ttaatctaaa tcttattagc aattctgtga agcagatagg 840 acaggcatgg ctctattttt agaaaaatta gaaaaccggg tcttgagtaa ctaggtgatg 900 tgcccaggtc acatggtgag gttcagagct gggccttgga cctaaggcta acaccagatc 960 ctgtactgat gctctcttcc tccgctgcct tggtgatggt gagtgatgac ctgtatacta 1020 g 1021 75 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
75 catacgcaca taacacacac acacacaccc acatacaggc attgtgaact agacacatca 60 ccttacaatc tgtggtttac tggaaggaca tggaacaaaa cccccccagc cacagcgtgg 120 aagtgccctc tccaggcaca agattctgcc tccatggggc gtggtagcag cattgcccac 180 ccacccaggg ctgagtgagc aggcctgccc cacactgcgc ccatgcacag ccactccagg 240 ctgcctccca cactgcctgc aaggacccca gtggggactg caaacgggaa gtctgcatcc 300 agggccccag ggagggcagg tggggctctg gagtatagca ctttctagaa gggaagcacc 360 ctcttggttc tgaacgtaag tgggtctgct cacagggagg ggcgtgcagc caccccagga 420 ccccagctgt ccaaggagcc agggaaaacg cacccacggg gcacctaccg ctgggagcgc 480 aaagaaggag atggcaaaga cagagaagca ggaggcgatg gtcttcccga cccacgtctg 540 gggcaccttg tccccatagc ngatggtggt gactgtgacc tgcagggaga gggacagtgg 600 tcagccacgg atgggactgg agcctcggga gggccaactg cctaacccaa acccaccact 660 ctgatgagcg gagaggccgg caagagaccc tgaccaccag gacgaccccg tgtgactcgg 720 cgaaagcacc aggaacagag ccgcgggatg gcacatgtct cccaggctct cggcgtcaca 780 cacaaggtat gtcccaccag cacatgtaag gagcccagca cccacgaagg gccaggcctg 840 ctggctggga acgtgggcct gggagctcgc cccacaccgg ctgcctcatc tgcctgcctg 900 tccccaggag gctgggcccc tgggccaccg acgttgctgt gcgccggccc ccaggagacc 960 gggagctccc actgaggctg gtcgtcaaca aagagcaggg gctgggatga cgcgctgctt 1020 c 1021 76 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or t. 
76 tcagtttgtc cagtaagatg gggtggtctg tttccaccag gtccagctat ccactggtgg 60 ttctatgggg agcagtgggg gtggttaaag gagctctgtg tggccgggag cggtggctga 120 tgcctgtaat cccagctctt tgggatgcca aggcaggagg atcgcttgag cccaggagtt 180 tgagatcagg ctgggcaata tagtgaaacc ttgtctctac gacaaataaa attagctagg 240 catactggtg gtgcacctgt ggtaccagct ataggggggc gctgagacag gaggattgct 300 tgagctcagg aggttgaggc tgcagtgagc cctgattgtg tcactgcatt ctagcctggg 360 tgacagagtg agaccctgtt taaaaaaaaa aatagaactc tgtgtggctg aggacagctc 420 tccaggggcc cccacactgc cttccaaatt cccctaggcg gctacattgc actagaaact 480 atatccacat caacctgttc acgtctttca tgctgcgagc tgcggccatt ctcagccgag 540 accgtctgct acctcgacct ngcccctacc ttggggacca ggcccttgcg ctgtggaacc 600 aggtgggcat cctccttccg ttcctccaaa tgggaatctt gcttctctgg tgggaccagg 660 aagttctcag tccatttcct atctcctaca ctctccacag tttatctgag ttgggagggt 720 ccctctccaa atgtgtcttg gggtggggga tcaagacaca tttggagagg gaacctccca 780 actcggcctc tgccatcatt taactctccc agcctatcac tcccatactg gaattttccg 840 ttcctctccc tcattatttc acccatcatt gaactttttc accaatgaga gaatccacct 900 gctggcggtg aggcatggca ggatacgaga aagtaagtgg gggtggggat gtggcaggtg 960 ccagtttgtt actaggagac agggtgggag agactagagt ctgggagcag acgtggtaag 1020 a 1021 77 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
77 tgtttccacc aggtccagct atccactggt ggttctatgg ggagcagtgg gggtggttaa 60 aggagctctg tgtggccggg agcggtggct gatgcctgta atcccagctc tttgggatgc 120 caaggcagga ggatcgcttg agcccaggag tttgagatca ggctgggcaa tatagtgaaa 180 ccttgtctct acgacaaata aaattagcta ggcatactgg tggtgcacct gtggtaccag 240 ctataggggg gcgctgagac aggaggattg cttgagctca ggaggttgag gctgcagtga 300 gccctgattg tgtcactgca ttctagcctg ggtgacagag tgagaccctg tttaaaaaaa 360 aaaatagaac tctgtgtggc tgaggacagc tctccagggg cccccacact gccttccaaa 420 ttcccctagg cggctacatt gcactagaaa ctatatccac atcaacctgt tcacgtcttt 480 catgctgcga gctgcggcca ttctcagccg agaccgtctg ctacctcgac ctggccccta 540 ccttggggac caggcccttg ngctgtggaa ccaggtgggc atcctccttc cgttcctcca 600 aatgggaatc ttgcttctct ggtgggacca ggaagttctc agtccatttc ctatctccta 660 cactctccac agtttatctg agttgggagg gtccctctcc aaatgtgtct tggggtgggg 720 gatcaagaca catttggaga gggaacctcc caactcggcc tctgccatca tttaactctc 780 ccagcctatc actcccatac tggaattttc cgttcctctc cctcattatt tcacccatca 840 ttgaactttt tcaccaatga gagaatccac ctgctggcgg tgaggcatgg caggatacga 900 gaaagtaagt gggggtgggg atgtggcagg tgccagtttg ttactaggag acagggtggg 960 agagactaga gtctgggagc agacgtggta agaactaact tgttgaaagt tggaccatac 1020 c 1021 78 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or t. 
78 ccttttattt ttcttccatg gaattttcca gttaacttga gaaagtggaa tcgaattccg 60 atgttgaatt ttccttctgg ccccattcat gtggcaggtg gtgattcagg tactactggg 120 ggctgctcag acaaacctcc tcatcagaca tcaagaggct gttgcaccag gagggccggt 180 accgtgtcta gaggtggtcg gcatggggtt ggagttgtat tacataaacc ctactccaaa 240 caaatgcatg gggatgtggc tggagttccc cgttgtctaa ccagtgccaa agggcaggac 300 ggtacctcac cccacgttct taactatggg ttggcaacat gttcctggat gtgtttgctg 360 gcacagtgac aggtgctagc aaccagggtg ttgacacagt ccaactccat cctcaccagg 420 tcactggctg gaacccctgg gggccaccat tgcgggaatc agcctttgaa acgatggcca 480 acagcagcta ataataaacc agtaatttgg gatagacgag tagcaagagg gcattggttg 540 gtgggtcacc ctccttctca naacacatta taaaaacctt ccgtttccac aggattgtct 600 cccgggctgg cagcagggcc ccagcggcac catgtctgcc ctcggagtca ccgtggccct 660 gctggtgtgg gcggccttcc tcctgctggt gtccatgtgg aggcaggtgc acagcagctg 720 gaatctgccc ccaggccctt tcccgcttcc catcatcggg aacctcttcc agttggaatt 780 gaagaatatt cccaagtcct tcacccgggt aagagaaata gtgttgattt tagggagaat 840 aactcagcaa ttggatctgg tatgtgtgta ttcaactcat ttgcagacaa attgtggttg 900 ttcaatacca gcctgttgtg aattacctga attgatagca tcctggagcg acactcaaaa 960 tgtgtcgcct gtggtgcagc tggagcccgg agcctgcgtg ccaggccccg gaggcccccg 1020 c 1021 79 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
79 aagcagtacc agcagccaga aaccgcataa caaatacatt gggcaattgg gagttgggga 60 tgttgactga acctttgaac ttgactgatc cgaatgccaa attctaattt aaaaagggaa 120 aactggatgt ttgagacata gtgtgatttg gtgaagcaaa caatggactc ccaggttaaa 180 aactctaatt agtctcttct gctgagttcc tgagttaacc gtttgtgtag tggtctccta 240 gtatatttta taacttacaa agctagagga tcaaagcaat tatctagaaa tacacacaaa 300 actcatgttt ggtataaatg tctgaacaat taaccaaact gtcgcaacag cttttttcat 360 tgatttgatc taacactgat atgcctcata gggtcatgag ttgaaaaaac aactctaaag 420 ctattccaca agaagaaaga taatattttt ctaaacaagt gttaggaaaa tgaaaatatg 480 aaagtttctg tttatgctta tttatgaaat ttgcctacct tccaagtgtg tccccaagcc 540 acccaccaaa gaatgatgca ntcattccac caactgcaaa gctggataca gacagggacc 600 agagcatggt gattagttga gcagctgcca cagtctcttc ctcagcccaa ggggttggtt 660 ttgggttcat tgagtatgag attgtgggca gttcatctgt actgttgata acatagttgt 720 tgatagcttt tcggtcatcc agtggaacac ccaaaacatg tctatagtga gatattatta 780 cctaggagat aaagaaaaat agctttacta tttcaaacat tctatgtatt tttgtttttg 840 tctttaaagt gtttgttacg tgtttaaata gtaccatctc aattatgtgt tttatataca 900 tataaacatg gatagatttg tttacagttg gccatatcct ataaaagaaa ggttataaat 960 tacattgcca acaagaacca ggcaggaaca aataaatgaa gggaacatgt aatactttga 1020 t 1021 80 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or t. 
80 atgccctttg gcctaaaccc tggacttgac taagaaatgc agcctccaat gacattgcgg 60 gaaaagggaa tctgggaact tctatgacac aattcagtct tgctgagcat ttggggctaa 120 tatttaactc tgaacatata ttgacatagg caattcttcc ataacagatt catacaaaat 180 ttaaaaatgc atatagaagc cttaattttt atttaaattc ttttatttaa ttgtgtttta 240 gaggcagaga atagtgtgtc tttttttgcc tcttttataa tttttatttt tttttttcat 300 ttttgccact gtctttcttt gcgctttcta gggcattaca tttttctttt ccgttttctc 360 catgtttctt agcgagattc tctaaaaggt tacttctatt tccatcacat catcatctag 420 ctccagcagg cctacttttc ttcatttcct ctattgtatt ttctgctttt cattcttgct 480 gtctgctcct ctctcatcat ccttgcctct gtctgtttaa tcctcctgtc cttcattttc 540 cttttttgcc tctgcattca ncatttctac ttccaatctc cctcctctgc tctttcttct 600 ttcctctgat ctgcagactt gcttctgtcc cctccttctg ttcccctcct ggatgtgtct 660 ttggccaacc tttccttctc tgagacttcg tgttcttgtt ggtagatggg ggctgatact 720 gtaaacatca caaaaataat tgcattgaga acaagtggtt cccatggtgt ccctttgaat 780 gagctcagaa tgcccaggct ccatatgatg caggagacag cactcatgct ggagaggggt 840 ctagacctca gtcacaagac ccaccattcc agaactttgg gactcatctc ttgacaccta 900 ccccctcccc agttagaaac caagaggcgc tgggtcacct gggaagagaa agaatgaatc 960 tgcctttgcc ccagcaagca cgctttcctg ccacattcac ctaaaagtct tttctgagat 1020 c 1021 81 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
81 ccatagcgaa tgttttcagc tatcgtggtg gcaaacaata caggttcctg actcaccaca 60 ccaatgattt cccgtagaaa ccttacattt atggtcctaa tatcctgtcc atcaacactg 120 acctggaata aaaagtaagt gtgactttca tacatttgta attgaaaggg caacatcaga 180 aagatgtgca atgtgactgc tgatcaccgc agggtctagc tcgcatgggt catctcacca 240 tcccctctgt ggggtcatag agcctctgca tcagctggac tgttgtgctc ttcccacagc 300 cactgtttcc aaccagggcc accgtctgcc cactctgcac cttcaggttc agacccttca 360 agatctacca ggacgagtga gaaaaaaact tcaaggcaat tcacagacac aggatatagg 420 aactgactgt tcactaggtt taaatataca tgcacttttt tataatctct acaagaaaac 480 atcagaaact cttcattcaa tagattaatt gttgattaat catttatcac tgtaccttaa 540 cttcttttcg agatgggtaa ntgaagtgaa catttctgaa ttccaaattt cccttaatat 600 tatctggttt gtgcccactc ttcgaatagc tgtcaatact tggcttctaa acagaatcaa 660 attttaagag attactaggt tacaataact acttttagtg atattttgtg gagagctgga 720 taaagtgaca aagaaattga cttaactgga caatctttta gataggtgga tagatggcca 780 actcagactt acattatcaa ttatcttgaa gatttcataa gctgctcctc ttgcatttgc 840 aaatgcttca atgcttggag atgcctgtcc aacactaaaa gccccaatta atacagaaaa 900 gaatacctga ggaatgtgaa gaaaaaccat caggctactg agatagtgac agcaattttt 960 tttcatactt cttctgtctt tttctaacat aggtaattaa aatttaaaat ggcgaggcaa 1020 c 1021 82 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
82 ggacaagagc agggctttaa atgccccata aatatgtgtg gcaaggatga aagcacatag 60 gactcaaaga ggaacaaagg agcagaaagg caggaagagt tggtgctgcc ttcaaaggag 120 agtaggaacg agggcaggtg gtatcaggtg gacctctatg tggtcctggg ttacaaaggt 180 gccaggaaaa agcaagaaat ggaagagtct aaaaagcaat ggaagattgt ggaaaatgat 240 ggaagattcc ggaaagtggt ggaagattcc agaaaatgat ggaagattcc agaaagtgat 300 gaaagattct ggaaagcaat gaaacattcc agaaagtgat gagacagtga tagagtctgg 360 ttccaggcga agtgggagag gatgggattt gagaagggaa tgatccctcc tcacacctct 420 aggatgggaa gcttagtgga gtgaggggtg ggtaggaggt tacaccctgt gtcctctgtc 480 gctctgtgca ggaggaggag gcagagaaag ggaagggtca ggaaagccag cccatgtccc 540 acccccactg gactcaccac ntgatggcag gtgaagccct tcatgaccga ggcctcattg 600 aggaactcaa tccgctctcg gagactggct gactcgttga ccgtcttcac cgccacgcgg 660 gtctctgcct cacccttgat gatgtccctg gcattgccct catacaccat gccgaaggag 720 ccctgcccca gctctcgaag gagggtgatc ttctctcgag acacctccca ctcgtccggc 780 acgtacacag agcatggaaa cactacttct tacttatcta cacagcatcc ttggaggatc 840 ccttgggggt ctgcagccac cttccaccca agccctcacc caaaccccct cgaaaacact 900 catgaaatga gttctgtgat ccaggaccca tgccgggcac tgggcatatg gccgagaaca 960 ggacaggcat ctgcacccat ggagagggca tggcagagac tcaaggaagg agccacaact 1020 g 1021 83 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
83 tcccacctcc tgggcagcct ggtagaggag acattccttt aattcttcct gcctaattta 60 gaggctgggt gggggtctga aggttcactc ccttcacatc atcccactag tctactttgg 120 gaagaattac aggttgttgg agctggaagc cccattctag gcatggtctg aagacctgaa 180 caatcccagg gggtggtgaa gggggcaggg aggagatggg caccacttac catttgaggc 240 cgcccagaga agtccttgcc cttcttaata agcacctcct tggccagctg gtggtggccg 300 acaatcactg tagtcttggt gcccatacga accgaataga tggggccata ttttttctgc 360 agcttgaaga agttgttatg catatggccg tgtctgggga ggaatggcag gctgcccacc 420 aggggcaggg acaggaggct cttggggtac ttggcaccag ggcaccttct cttgggccaa 480 aacaaataag ctagggtaag cagcaagaga gccacgagct cccacatggt ggctgggtgc 540 cggcaggcaa gatagacagc ngtggagtag aagagctgtg gcaactctag ggcacaagga 600 ggccttttaa agggctaccc tgatcttcac cttgactttg tgttatctct tgccttgtgg 660 aaagattctc ctggagccca gccaggcctg agctcatatc cagaagggag agaggcggtg 720 ggagtgaagg cctcctcaag ggctggctca actccagggc aaacctccgg aggaggagct 780 aggtaaggga ggtcagttga tcaccctctg aggagctccc catgcttgaa tgactccaga 840 gtgcgaatgg tatctgggct caggagtcaa ggcttggaac tttccatgtt gcaaaatcaa 900 aatcactgga cagatgacag attcaggagg gtcacaagta gcagggactg ttaaaggtct 960 tttatgcttc tttttttttt tttcagagtc ttgctccatc accaggctgg tgtgcagtgg 1020 t 1021 84 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or a. 
84 caatagctag gctaattctc cccagcagct ttcatggagg acagtagtca ctgcccccat 60 tttccatgaa aagtaacatg aatcctggct gtataagggg cacttactgt gctgggtgct 120 aggctaagtg ctgtacatgc accttctcag tccattagag aagtctaggc tcagagagag 180 gagtggagtg aggattcctt gacccctcag accactgtgg tcctcccatc ccacctcctg 240 ggcagcctgg tagaggagac attcctttaa ttcttcctgc ctaatttaga ggctgggtgg 300 gggtctgaag gttcactccc ttcacatcat cccactagtc tactttggga agaattacag 360 gttgttggag ctggaagccc cattctaggc atggtctgaa gacctgaaca atcccagggg 420 gtggtgaagg gggcagggag gagatgggca ccacttacca tttgaggccg cccagagaag 480 tccttgccct tcttaataag cacctccttg gccagctggt ggtggccgac aatcactgta 540 gtcttggtgc ccatacgaac ngaatagatg gggccatatt ttttctgcag cttgaagaag 600 ttgttatgca tatggccgtg tctggggagg aatggcaggc tgcccaccag gggcagggac 660 aggaggctct tggggtactt ggcaccaggg caccttctct tgggccaaaa caaataagct 720 agggtaagca gcaagagagc cacgagctcc cacatggtgg ctgggtgccg gcaggcaaga 780 tagacagcgg tggagtagaa gagctgtggc aactctaggg cacaaggagg ccttttaaag 840 ggctaccctg atcttcacct tgactttgtg ttatctcttg ccttgtggaa agattctcct 900 ggagcccagc caggcctgag ctcatatcca gaagggagag aggcggtggg agtgaaggcc 960 tcctcaaggg ctggctcaac tccagggcaa acctccggag gaggagctag gtaagggagg 1020 t 1021 85 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or c. 
85 gggtttcctg tttccttttc tgatcattct tacaagttat actcttattt ggaaggccct 60 aaagaaggct tatgaaattc agaagaacaa accaagaaat gatgatattt ttaagataat 120 tatggcaatt gtgcttttct ttttcttttc ctggattccc caccaaatat tcacttttct 180 ggatgtattg attcaactag gcatcatacg tgactgtaga attgcagata ttgtggacac 240 ggccatgcct atcaccattt gtatagctta ttttaacaat tgcctgaatc ctctttttta 300 tggctttctg gggaaaaaat ttaaaagata ttttctccag cttctaaaat atattccccc 360 aaaagccaaa tcccactcaa acctttcaac aaaaatgagc acgctttcct accgcccctc 420 agataatgta agctcatcca ccaagaagcc tgcaccatgt tttgaggttg agtgacatgt 480 tcgaaacctg tccataaagt aattttgtga aagaaggagc aagagaacat tcctctgcag 540 cacttcacta ccaaatgagc nttagctact tttcagaatt gaaggagaaa atgcattatg 600 tggactgaac cgacttttct aaagctctga acaaaagctt ttctttcctt ttgcaacaag 660 acaaagcaaa gccacatttt gcattagaca gatgacggct gctcgaagaa caatgtcaga 720 aactcgatga atgtgttgat ttgagaaatt ttactgacag aaatgcaatc tccctagcct 780 gcttttgtcc tgttattttt tatttccaca taaaggtatt tagaatatat taaatcgtta 840 gaggagcaac aggagatgag agttccagat tgttctgtcc agtttccaaa gggcagtaaa 900 gttttcgtgc cggttttcag ctattagcaa ctgtgctaca cttgcacctg gtactgcaca 960 ttttgtacaa agatatgcta agcagtagtc gtcaagttgc agatcttttt gtgaaattca 1020 a 1021 86 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
86 gggagagagg acctgtgaca ggataaaggg gctgccttat ttaaacctgg aaggaagaac 60 gacagtataa gcttccagga tattaatatc aggctaacat ggacagttaa gagcctttgc 120 caggagatag tatgactgta gttcaatggt gactgagcac ctgggatgtg ctagacacaa 180 gagtgacttc taagggtcac aggagaagct gacgtcaaaa acttcacaca aggggaccct 240 gagaggtcac agaagttcaa gattctgaaa gtagttctgg attccaagga gcaggctggc 300 ttcaccactt ctgacaggct ctgggaagta ggagaaagtt tgcctcaggt tggagagagc 360 agtaggggag agggtggtat ccccaaaggg tcagatttct actcttctgg cacaaagaag 420 aagcagagag gtaaagaata ggtcagtatg agcaagggca actgaccctt tatgacgtag 480 caaaggagtg gcagcaagtt ctgaatgtaa caaattctcc tttccttttt gaaaatgtag 540 aacacattaa caaatgcact ngatcaaact gtggtcaatc agaaatcgct gcacaaaatg 600 tcttcctatt aaataaaaat catacagtgc tttgcatttg aatagtgttc tatactttcc 660 cataattctc tcattagcca ccactgggaa ataccctgtt ataattatac agataaatgt 720 gcaaatgaca gaagaatcaa tttctaaaag aagaaataca aacttttata atgggagaga 780 ggatatattt attatcacta ataaaaaagc atatacttca cctaataaat taatactttg 840 tcactaccaa agttataatt actataacat ttatatataa tatacattta cattaatatt 900 ataaatagta ataaattatg aatgttataa ttacaaatta tgaattttaa aatgtaataa 960 ccataatatc aatactatat tagtgatggt gttatacatt gacacaattt ttttggaaaa 1020 c 1021 87 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
87 ttttctgtag actctccctc cgtttgagct tatctgacat ttgctcgccg tgagatccag 60 gccttgcatt tgtactggac cctgttctta cacaccctga tccagcccac ttgtgtagtc 120 tgggagtctg ggacaacctc cgtccgccct tctagccggg tcactgcagg caagccttgg 180 tgctcttgcc tgcgacgtgg aaatgatgcc tgcctgcagc gctgtatagt gcagagcggg 240 cgaggggcat agggaagtca ctggcacgtg gtatgtgttg gcagggctgc ttctcacccc 300 aaaccaaggg agggacaggc agggaggctg agagcagcgg cttgccctgg agctgtcagg 360 tgggaggcag agggcgggag aggctgtggg ctgcccaggt ctgatccctg acccacttgc 420 cacccgtgcc ctcagttctt ccccaatgga gaggccatct gcacgggctc ggatgacgct 480 tcctgccgct tgtttgacct gcgggcagac caggagctga tctgcttctc ccacgagagc 540 atcatctgcg gcatcacgtc ngtggccttc tccctcagtg gccgcctact attcgctggc 600 tacgacgact tcaactgcaa tgtctgggac tccatgaagt ctgagcgtgt gggtaagggc 660 cagccctggc tgctgcttcc tcagctggaa ggaccctccc cagccctccc tccccattct 720 gtacccccca tcagctccca tttcggactc tcttactgct gtcccttgtc actgggtgac 780 tccacccctg gaatccagta ccccttggtt cccaactagg actgttttcc ctcagtgttg 840 ctctaagcag cctctctcca ctgcccaatg ccatgactgc tccctgccct aggagatctg 900 tggaccatga ctgtccagtc agttctgggt tcctggcatt tcaggggcac ccactgagag 960 gcaagacagc ctcagggaaa catggaatca aggcagaatc aaggagatct ggagtggccc 1020 g 1021 88 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
88 tgcctcaggt aagaaagacc tgggcttccc tggctaaacg catgagtccc taggaggcca 60 ggaaagcccc caaaccccag cttcgggccc tcctccctgg cagtgcttcc tgggccccgg 120 agcctaccca ctgaggactc agtgcaggag ttagggtctg gagagtataa atgatcagag 180 tggctaaaaa tttccaccac ctcccagttc tccaggcatt tgagttgtga actcacctgc 240 tttttctccc atcttggacc cccctgggaa atgtccccct tgcccaagga ctgggctaaa 300 ggcctgggct catgggattt gggactctgc agaggagcag ttcaggggct ggaggctcaa 360 acctccaagc aaggacccct gggctctcat gggccctgtc ccccttccca gcaactaggc 420 taaaggctga aggtcatggg gactcaactc agaagggggg ctcgttagga gctgaggggg 480 gcccctctag gctctcctgg gagcggggac ggggcagggc tccttactgc agaagggtct 540 ccaccacggc tttctggtgg nccgcctcct cagggctgag gttctccagc tctttgagga 600 tgggtggcgt gaagtcttcc ccatcgtcgt ccgtctcgtc ctcggagccc cgagtctccc 660 ccagcccatt gggcagctca gccagctccc ctcgaccgcc gccgcaggac tcccccttgt 720 ccagggggcc ttctccagcc aggaggtagg gccccggctc acccagtgcc tggatcagtg 780 cctctttgct cagccctgac tcgagcaggg ccgccaggag ctccgtctgc agctggctca 840 gtttagaaac catggctcgg ctgccacagg gccacgcggc ccgggtccac cacgctagcc 900 gcctccccca ccgcgtgggt tgcgtttgcc tgccggccgg cagacacaaa ccaaactcct 960 tgcacccact gcccccccaa aaccccacta gccaagccct gtgggcaccc ccaaccccca 1020 a 1021 89 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
89 cctcagcctc ccaagtagct gggactacag gcacgtgcct ccacgcctgg ctaatttttg 60 tacttttagt agagacgggg tttcaccgtg ttggccaggc tggtcttgaa ctcctgacct 120 caagtgatct gcccacctcg gccttccaaa gtgctgggat tacaggcatg agccaccacg 180 cctggcccca gattaccttt ctaaaatctg aatagatttt agaaattcat atggccctaa 240 gagtttcaga gaaacacagg catgcacaca aatgcatgca caaccgatac acacccagac 300 acgcactagg gatctgctca cacaagcagt cgtgcacaca cacagatacg tgcattcaca 360 tgggaacaca ctggcctgca gacaccctca atcacggaaa cacacttgtc ccagagacac 420 atgcagactg caatgcctgc caggcacccc tttcccctgc atccattgac agccaacctc 480 tatcatcatc tcctgctgtg tggggcacag ggcgctcacc gtgggggctc tgcagctgag 540 ccatggtggc catgaagggg ntctgggtca catggctctg cacaggtggc atgagcggct 600 gctggtagga ggggtgcagc ggctgggaga actggacggg ctgcagggtg gtcaggctgc 660 tgcccatgct gttgatgacc ggcacactct gtgcctgcgt ggaggccagg cctggagtgg 720 aaggggaggg aatcagctgg gccccccagt tatatcccac ccctgcccaa gacctcccaa 780 gggcaccacc tctccttccc agagcccgtg gtttggagga gggggcaggg tggtcaggaa 840 acagccctcc actgggacct gccactaatt taagtggctc tggcaagtca ttccccctct 900 ctgagccttt agctctttgt ctaggctagt gggagaggca ggcggtgact tgttcaaaag 960 ttgtcaaact gcggttccct ggagccctgg gttccacagc agtgcaaagg ccatggggtc 1020 a 1021 90 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
90 gtgtagatgc agtagctttt gcctgtggga tgggagggat gggagatgtg tccagaccct 60 cctaggaggc cacatgagtg tgactgttct cggcccaagt ctttctcgtt cctcagagaa 120 tttgcggggc ccctgggcac acaagctgag atccacccag ccctggtccc ttggcaagaa 180 ctgagggaca ggacctggtt ctggggaaaa tgcaggggaa tgtttctccc ttccacagcc 240 cccttgcgag ttaggaggcc ggctcccacc ccagaaggtg gccaggtttt catgccttcc 300 tagagaaagc tggggctcgt ggcctccacc acaggaagac gcagaccctc agaaacaagt 360 ctgtgaagtc acaaccagcc ccagtttaca gatgtgaaac tgaagctcca aaaagtcagg 420 aggtcactga gtggggaggt gatggagtgg gaacagcccc cagatctggc tgaggccgaa 480 gccctggaga gatccccgca aggctccctt agatgcctga cattctgttc ttcctgaagc 540 ctcactccct tctctcctgg ngcagacacg tccccatcag aaggcaccaa cctcaacgcg 600 cccaacagcc tgggtgtcag cgccctgtgt gccatctgcg gggaccgggc cacgggcaaa 660 cactacggtg cctcgagctg tgacggctgc aagggcttct tccggaggag cgtgcggaag 720 aaccacatgt actcctgcag gtgaggagcc tcaatttctt cagctgggaa atgggcacac 780 ttgggctcat ggccccaagg tctgtcttct ccctgagtgg gtaggtccca gagacagctg 840 cccttcaggg ccttcaaggc tcttctggtt ttgtaaaaga ctttgtgaat ccaagaagag 900 catctattct aggaaccaca tttactgatc atcaagctac tggctgccgt ttattgagct 960 cttatcatat gccaggcaca atactaagtc tttgtgtgta tttacccatc cccttgagcc 1020 c 1021 91 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
91 atacaccaaa tttgtttact ttgaatagct ttctttggac agaggaattt tgagtactta 60 atattttttg catatttttc atactttcca tcatgaacat gtatggcttt tacatttagg 120 aagaaataat gctatttttt aaaggaggaa aaaagagaaa agagttggtg cgaataattg 180 aagtaatcta ttatgcagtg tgtgagtaat gaattgatag ataggatcat ctgtagattt 240 caaggagcta taatttcccc tgtaacatgt ttttcaacat ttctctcccc ttttattata 300 aaaaacacaa actctgatct acactccaac aaagtctgct tttatcacaa ggatacttta 360 aacatttgat cattgtgcag aatatttatt ctaaattact gagaccttat tcactaatca 420 tagttttcac aggctttatt ccaaccatat tgatatgtta gttcgagact acggatttaa 480 tacctggatt tctcctctgt gtcttgaagg gaacgttgcc agctgccttg taccagcatt 540 acaaataatc cagccacaaa ntaaatgctt ttcatttctg ctgtctgtca gaacacagaa 600 tgggggtagg gtgagggggg caggcaagga tttttaaaca tgtcaggcta aattaattag 660 atttgactag ataaatatca taagtagaag gaaaaagcta gtgttatcac ttttattctg 720 attatatttt cagcttaatt ttaaatagtg ggttatatta tttccccaga ttttttggag 780 gcaaaaaagg acacaaaaga tgtgttccac cattaagctt tttcattaat gtagggacac 840 ttctgtttaa taattagaag gctcatttcc agactggaaa ttaaaatgtc cacaatcaac 900 atttaaaata cccactgtag atgatatgct acatatggtt agcctgaatg gcaccttatc 960 catcatgcca cccccctcac tatcagtctg gctttcaatt aatagtcctt cacttccaag 1020 c 1021 92 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
92 taggaattgt gcatcaggaa agtgaagagg attgctagac atttagtcct gttataagag 60 cactaaagat ttggcagtca ccaggtatgg agtctcagga ggagcttacc gatggatggg 120 gcatagccat tatatttgcc cgagtccagg gcatctttca ttgcctgggt aacttcaggg 180 tctgtaggca ggtttccaaa cacagtaggg tccccttttt atgggaggaa aacacaaaag 240 gagccaagag gttattctcc catgttcagt actcagactc acccccaact gccatcttct 300 ccaaccagcc tgtgaacatg agagtagagg aggacaatga cagcccctca gtagtgtccc 360 caactcacca atggacaggg aaatcatggt tttgtttgga tttggtttca ccttcatgtt 420 gtccacaatg gctcggatgg ggttgaaagt tttcttggcc atgtctgagg gcctcacaga 480 ccacctggcc tttctgcctt tcatttttcc cggcacagag cttctcccac caacgttgac 540 atgcacgtcc agaattgagg ngaggttgcc tttgctgctc atctgaatca tgtatgggtc 600 catcactagc gaagcctgcg aggggaaaga agttccctgt gatgttgata acatagcgct 660 gggggacaga ggagctacat ttggacctaa acattgggtg acttcactaa aagtgtcttt 720 ccaaactctc tctttatttt tttttctact ttctgttgta aagtagcttt actatgaatg 780 ggggagtttt aagagttttt actgagatgg aaaataaagc aagaacccat tctacttaag 840 taggatttgc tacacgcatc tgcaattcct gtcaaagctt aaccatgctc tatgtgaaac 900 caagaaggaa taagatgaaa attgttcatc agtcaaagca taggttctcc ttcctttcca 960 tgcgagccta tccaagaaaa tctacctaat gcttcttgtc atctgcagag gaccaggaag 1020 a 1021 93 1021 DNA Homo sapiens misc_feature (540)..(540) n can be c or t. 
93 ccaactagaa tacagcttcc tgagaggcag gatcttgact gacttgttca ttcctaattt 60 ctcagcatct agaagaaggt atggcacata gtaggtgctt gttagatact tgctaataaa 120 tggaaataaa catatcccta gttcctattc cagctttttc cctgctgttt tgtcctccat 180 tcttccagca gacaacagga ctagttccct gaccccctgc aggaagctaa caatacccta 240 gcctacttct aagcaaaacg tcgcagcttc aaagactttc catggagggc gatgggctga 300 ggacaatctt gttcttcacg taaaacacag gcccacaatc tcaaatttat aatttaaaaa 360 tatatatact tacaatgtct ctaaaggcac ttatttttct taaaaatcat gtatttgtaa 420 gctgaactat cattttaaca caaaagctat cattcttgct caatggagtc aggctgctct 480 tggagtttct gtcctgggag gaaaaagggc agggtgtagg tacctgatgg ttttccacan 540 gtcgaagcca tccagaggct ttgtgccatt ggtgtgtccc ctggccagct tcacgagtgt 600 tggcagccag tcagagatgt ggatgagctc ccggttcttc acgcccttct gcttcagcaa 660 ggggcttgcc acaaagccca cccctcggac gcctccttcc cacaggctcc attttcttcc 720 tcgaaggggc cagttattac cccctgccaa agtctgccct ccgttatctg aaacacagta 780 aggtcttggc atgaggatga tgttaactct taaatacatt taagaacaga gactgtatgt 840 acattgttac taaatggtgc ttaaataata aaaaaaaaga aaattccttg ccttttccca 900 ccctaaattc ccttttccca ttgacatagc ctttcattat tcagacataa gtaaggccca 960 gtgtgataca tatctacctt taaatcctcc atggagagag ccactggaaa acaaggcagt 1020 c 1021 94 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
94 ctgtggggag cgtggctttg ctactcaatg gcaactggat ttcaagagtt tcaggaaggg 60 tgggggagca agatatcaaa ggctcaagct cactcccctt cgtccagaca gacttttcat 120 tttttgtttg atgaagatta ggaagaaaag agtgaggatt aggcctaatt tactgcctct 180 gtcaaaagcc agcgcagagt agaagggaag ggagtaagtg gattatgaaa agaaaacaaa 240 cggagggaaa gggggccgag gatgaactgc attcagtgat atttatttat ctgattgcaa 300 aaggaaaaga agggatctgt tctaatggtt caccttctta tgaaccctgg agctcccaaa 360 accctggcga agtccttctg acactgctgt gaggtagatc ggagccattc catggctaaa 420 gtgagagagg ccactgcttg agagcagtaa taagggaacc agagataaaa ccccaaatct 480 tggtcttttc taccctgctg ctctcagcct gggccacaga gcctggagaa cactaaggtc 540 tcatcagggt ttgggtggca naaggaatgg aaccagggga gctctctttg ccctaagcac 600 tcactgactg cacaggcaag ccgggtgatg ggtgccccta ccaaagccag cctgctgctc 660 cacggcacct ggacactacc actgagggag gagtgaagtt caaggctggg gtttagaaaa 720 catctctcag acagagagca agaggatggt gaaaacccac ttggtaagga tccctccttg 780 ggtcacatgg cccagtcgtc aggttctgga gggtagagtg tcacagccgg ggaatcccat 840 gggactcatt ctgaacagag gccagaggtt ttccacaggt tctgatcaac agagttgttg 900 cttcttgtcc ttcaggccta agaaactccc caagaagccc tgggaaaaaa agtggagata 960 atagaccctg gggtgaaagg agcaacaggt gcactgaggg gaatgacaga gatcagagac 1020 c 1021 95 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
95 ctttagaaac ggctctaggt tgagaccgcc ggcatggatc tccacctcta ctgcagacac 60 acactggaag gcttcggacc agtcgggctg aggttcggag aagttgcaga cgcagcggaa 120 atcttcatcg tccagctcac aaggttctgg cgtggtcgca gagacgtgca ccagcggcag 180 cagcagcagc aacaagcagg acgcgcgctc ctggggagag agcagaggtc taggaggccc 240 catccaaccc ctgtggctcc cgagtggcac gcgttcgacc ccaagaccct acactcacca 300 tggtcgataa gtcttccgaa cctctgagct ccggacaggc tctggaagtg ctttacgttc 360 tttcctacac agcggcaccc gccggcttcc aggcttcaca cttgtgaact cttcggctgc 420 ctctgacagt ttatgtaatc ctgggatgtc attcagttcc ctcctctgtg aaccctgatc 480 acctccccac ctctcttcct ccgagccagc ccccttcctt tcctggaaat attgcaatga 540 aggatgtttc agggaggggg nccgtaacag gaaggattct gcagggcatc tagggttctg 600 tgtctcctgg cagtgtcctg atgactcagg cgccccaggc ggtgaatgcc ctgttgactc 660 gggagcctaa gccttctctg gtgggtgtgg gaaaaggatg atcctcagtg ccttaggcca 720 gtaccatact ctgcactatc caacccccca atccccctac cttatatccc agagaatcta 780 cttgattcat ttctttgact tcttccttgt cttggtttat gttgatctcc tgccaccaaa 840 tccaagtccc tgaatatcct cagatattta actgcatgtt ttgtggaaga gattgtgaac 900 ctcatctgtt ggcaccaagg ggggtagaat taggttcaag aaaaggaagt tggtctaaag 960 aaaaattccc ccttcctttt tttttccttg ctcctttgat taagtaataa ctttctttct 1020 t 1021 96 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
96 gcagggctca gcctgcctcc ctgctgctga ggcccctacc aaattggaac ccgagtagca 60 ccagggaagc agggcctgca ggggatgcca ttctcacccc tgcctgcaaa acgctgcagt 120 gcccgagtct gctgtgggct ggtgggggaa gggcatcgct aggttggtgg ctgcccccac 180 cccagcacac tccccccatt ctctttagat tgtctcacag ggggacccac ttggttctca 240 ttctgaactt tcagtgaatg gattctgctc cctgccttgc gtgtgtaccc ttgggtggcc 300 tttgcccgta tcttagtctc agtttcctga gtttgggcag gaaggagagg aggggttctg 360 actgatgagt tacctcttct ccctctcccc acctcgcagg gggctcctga gagtgtgatc 420 gagcgctgta gctcagtccg cgtggggagc cgcacagcac ccctgacccc cacctccagg 480 gagcagatcc tggcaaagat ccgggattgg ggctcaggct cagacacgct gcgctgcctg 540 gcactggcca cccgggacgc ncccccaagg aaggaggaca tggagctgga cgactgcagc 600 aagtttgtgc agtacgaggt gggtgcagga gccgattctc cctgcagtac gaggtgggtg 660 caggagccaa gtctccctgc agcagctgag caggtggtag gtcagggatg ggctcaggcc 720 ccgcttgaat ctgccccctc cctacagacg gacctgacct tcgtgggctg cgtaggcatg 780 ctggacccgc cgcgacctga ggtggctgcc tgcatcacac gctgctacca ggcgggcatc 840 cgcgtggtca tgatcacggg ggataacaaa ggcactgccg tggccatctg ccgcaggctt 900 ggcatctttg gggacacgga agacgtggcg ggcaaggcct acacgggccg cgagtttgat 960 gacctcagcc ccgagcagca gcgccaggcc tgccgcaccg cccgctgctt cgcccgcgtg 1020 g 1021 97 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or g. 
97 agcagagaag acaaataata gatactgcga agataggatg attgaagaat gcagtgatat 60 aaatttgggg gaagaggagg gaggcagagc aaagaaattc aaggccttgg ccagacgtaa 120 tgtctcacac cttgtaatcc cagcagtttg ggaggctgag gcaggctgat agcttgtgtc 180 caggagttcg agaccagcct gggcaatcca gcaaaaccct gtgtctacaa aaaaatacaa 240 aaattagcca ggcatggtgg catgcgcctg tggtcccagc tacttgggag gctgaggtgg 300 gagaatcgcc gggacgtcga gattgcagtg agctgagatc gtgccactgc actcctgcct 360 gggtgacaga gcaagaacgt ctcaaagaaa aaacaacaac aacaacaaca acaacaacaa 420 caaaaacaca aggcctgtgg ttgggggaag gttgtaactc taaaaaagac ccatgtggct 480 acagcgaggg acactgggtg taggtagaga taagaagagt gatactcagt tctcacatca 540 cggcggactg aatacaggcc nggggagtga gagaccatcc acccctgtga tctggggcaa 600 gtcaccagcc ctttcagaga agcttccgtc ttctctgcaa aatgggacaa taccttgctt 660 cacaagcttg caaggatcaa aagaactggt agtgggccgg gcgcggtggt tcacgcccgt 720 aatcccagca ctttgggagg tcgaggcagg tggatcactt acttgaggtc acgggttcga 780 gaccagcctg ggcaaaatgg tgaaaccccg tgtctgctaa aaatacaaac attagcctgg 840 cgtggtggca ggtgccagtg atcccagcta ctcgggaggc agaggcagga ggatcgcttg 900 aacccaaggg gtggaggttg cagtgagctg agatcgcgcg ctgcactcca gtctgggcaa 960 cagatcaaga ctgtctcaga aaaaacaaac aaaaaagaac tggtagagga agcgctttgc 1020 a 1021 98 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
98 aaaaaaaaag tggctggaac tgccatcact atcctagaga tggaaggtta ggccaatgct 60 acagcaaggt agctgtggtc agacactaag aatgctcctt ctatctggct gccagccaat 120 ggatctccat tctggaccag cccacgagaa gcaaacctca aaggaaacta atctgaggtc 180 ttagctcaat ctgtggggaa cggcattaaa gcctctccct ctgagtgacc tctgctagct 240 tctctacctc ctgcttcctc atctgcttct gctacacacc cgcacactga aaaccctgta 300 tattgtatga gtcctccctg aaccccacat cagtcctgag gtgcaattct gcctagtcat 360 ctttcctctt ccctcaacag cagcttactt tatgttcttc aagcttcact gaggcctctt 420 ttgcaaatcc tcccagatct cctcagctgg gatggggccc ctctaggctt cctgagcccc 480 atgcttcctc ccttcatggc atctgtcata atgcagtggg attgccatgt aactcccttg 540 actgtctccc caacacagag ntgtacactt cacatctggg cagggtcacc atgactgtgt 600 ccaccattgc cagcttggaa cctggcatac tggcatcagt aaatgtttgc tgaaagaata 660 aatgataaca agctgtcctg cccaccgtga cctttgggag aatgggcata tgcttttgat 720 tacctgcagg gccatcaagg tgttggccag ggcttgacca taggtgtcat ggcagtggac 780 agccagggca gccagaggca cttcctgcat gacagcagat agcatgtctt tcatgatccc 840 tggggtgccc acaccaatgg tgtcccccag ggagatctcg tagcagccca ttgagtagaa 900 cttcttggtg acctaaggaa gcaagcaggc acttggagga tacagaatcc accagccagg 960 ggatccatgc actcagaaga gggggccttt gcctgggcag aacacttctg ggtatgacgc 1020 a 1021 99 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
99 attggccttg ttccccaggg tggagctgtc acaaaataga gtgggaactg tctggctttc 60 agcccaagag aatctgcatg gcaagttgca ttaacaacca ggcatttccg gcagttccca 120 acatttctgg gaattttctc atccaaacga ctgaaagccc actccattct cttgcttctt 180 actcatgctt tctttgtata atggtaatta tgttttaaaa aatcctgggc tatgttgttt 240 catggaacaa tttagaactt attggtcaaa ctctgaagca aaggtatata aaaggtagtt 300 agagatgttt agggaatatt caaagcacat ttttgggtca ctcataattg atctttatat 360 tcatatatgt atatatatat atataacata atgtacccat cttaacatat caaagctaaa 420 ccagtattaa aaacaactga ctatggtcta ttgatacaat atatgatgcc caagtacact 480 cttcattgct actgcatatc taaaatcatt tatttattta tccatccatc aagagtgtat 540 tgagagcctg acaacatacc ngcatcaagc cctggaggtc tttttaaggc tgagccaata 600 tagctatgga taacattcta aaactgatag catattttca tgttttatag tctttccaca 660 gactagttca aaatgaacac tgcctgagag gggctttaag atgactgact agaggtactg 720 gacacctgtt tccccagcaa agaagagcca aaatagcaag tagataatca tactttgaat 780 agacatctaa gagagaatgc tggaattcag cagagaagtg acagaaaaca cctgagatac 840 tgaaggagag ggaggcaagg tagacagcct ggctggaatc agctgggagc ccagagaggg 900 tccctagtga gaggaaaggg taagtgagag attcccagtg gtacatgttc ccatgttgac 960 tgctgaaatc ctagtcataa gagtctctca aaccccaagg accctgaaac tggtattccc 1020 g 1021 100 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
100 gtttaaattt gccgattagt ttcgatgatt caccagtgct tgatgattaa ggggtattgg 60 tgcagtgcca ctgagttgct gttcatagtc tccagtaagg gcagtacaag agaggagaaa 120 agtaaagttg cacatcaggc caatacattt ccatgtccct acagcccatg ggtatttttc 180 tctgaagttt aaaattacag ctcaagaaga tcatatgtat ttatgtaatc tgcctttaac 240 caggccacct tgcttcccta atgctgttgt ttttttccct tcgtttattt atctttaatt 300 gacacctgtt gctattctta tgcctgctca ccttcacata aatgtcagca tccatgcacc 360 atgtatgtca cacacacaca cacacacaca cacacacaca cacacacccc tctaaaagtt 420 ctgatgagta tttgataaat agtagagttt tgaggagaga tggaggaaag tgtttacaag 480 tttaactttt tgaatttgct tttaactctc tgctgttccc tcacctgtaa aatctgcctc 540 atctctgccc ctctttcttc ntgcaaacct cacttctcat agcctcctcc agcagcactg 600 acttctggag attccctgtc agtgaaataa aactggaaag ctggtctcat aataaaagcc 660 caacagttta tgggcaaagc ccaaccacct gtggttcttc aggtgtggtt ttcttgagga 720 gtgcttattt accctgccac attttcctct ctttctctcc aaggaggctt tctctccagg 780 gtggattaag tgaaattatg ctgttactta gggactgatt tacatatttc ttatccctca 840 cactctgggt ttctctatgt tagctacatc taggaaaaaa atggggaaaa aaatcacctt 900 gattggaagt gcagttaatt cctgaaaata aagcctgatc acgagtggta atcacagatc 960 aattagttac tggatcccta gataatgcat ccctgtcatt gtgagacaaa agaggggaaa 1020 g 1021 101 1021 DNA Homo sapiens misc_feature (561)..(561) n can be g or a. 
101 cccaaaatct taggatgctg ccttaaacat catggtagaa taatgtaact agctacccac 60 gatttccttc tttaattcat tttgtgtttt atctccccag gaaagtattt caagcctaaa 120 cctttgggtg aaaagaactc ttgaagtcat gattgcttca cagtttctct cagctctcac 180 tttgggtaag tcagtgccat tagaccaaga tttctcattc tcgcactata gatatttcag 240 actgaaatat ccttgcttgt ctggggctgt cctgcacagg atatctggca gcatccttga 300 cctctacctg caatgtgttc ttccctgggc ttggggtcat ttactttacc tcttggtgtc 360 tccctttcct taagtgtaaa gtgtggatca taatgaccta tttcccagat gcattgtgag 420 gattcaatag catggttcat ggaaagtacc tcatacagtg cttcttggtg catactaagt 480 gctcaataaa gcttagttat tctgattatt attctactac aaaatgggta tactataatg 540 ttgtgagtga gtgtggataa ngtacctagt gggtggcagt cacaaaagag ataaacaata 600 agtcgctgtt tcttcatacg tacttcttac ttttgaaaag atgagaaaag tctgggccat 660 gtcacaaaca ttgccaaaaa taagacaata aaaagcacag ttgtcagagt taaaccacaa 720 cagtaccaaa ctctaccatt tcttttcttt ttctcccact agtgcttctc attaaagaga 780 gtggagcctg gtcttacaac acctccacgg aagctatgac ttatgatgag gccagtgctt 840 attgtcagca aaggtacaca cacctggttg caattcaaaa caaagaagag attgagtacc 900 taaactccat attgagctat tcaccaagtt attactggat tggaatcaga aaagtcaaca 960 atgtgtgggt ctgggtagga acccagaaac ctctgacaga agaagccaag aactgggctc 1020 c 1021 102 1021 DNA Homo sapiens misc_feature (561)..(561) n can be t or c. 
102 gcaggactgc agacatgact catggcaggg tagctgctga ggcacgtccc atctcctttc 60 agttcaggag aggctgtggg aagagggaag aactgagcac acatgaagat ttggcagagg 120 gaggaggcca agtagggagg aagtggaata attgatattg gagccagaca tataatcaga 180 tgaaacctgg gcaaaaccaa acgaggtcca gacataagga gaaggagagc aggcgaaaag 240 gcaatagaga tctgtggcat gagataatcc tatgtccgtg ggattttccc atggatggta 300 caactggcac aggacgatgt tattcctccc ctctggtgaa accaatatgg cagcagaagg 360 cagggagggt ggggaggagg gtgtagtttg tctgcacaag catcatcagc atattttcag 420 gagcttctga gagctgatga aggatcattt gctgcagata ctttatattc actcggtcag 480 ccaacttgta ttgagcaatt gctggggcac agcagtgagt gaggtgcgct acagaaacac 540 agttgaaaag aatctgactt ngccctcaat gaacctgcag tcaagttaga agcacagagg 600 tcaacagaca aataagataa aggcattagt ttctgtactg gagcataaca ccaatactgc 660 cattgctcag aatgtttcta gaacccctaa aagttcagaa ctgtcttcag catcatttca 720 ggagccagac aagaaaacca gtctcatttc tttattgtca tgacctgggt ttgaccagaa 780 acaatattac tcacttggag cacctcactc ctcagatctg gctctagttc taaatatcaa 840 accattctca aatagcaaag ctttgtcacc tccctataca tatctcattt aaatatgtaa 900 aggatctgta ggcaattcca aaaagaaggc tctaaaaata tttaaaaagc aatggtcgta 960 ccttatagtt ttaccttata gtgtatatca ataatagcct tgtaattaaa aaacaatcat 1020 c 1021 103 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
103 caggaagttg ttggtgtttg gatggatgaa tggactaatg gatggatgaa taatagatag 60 atggattgtt gagagagaca gagaagagaa aagccttgcc cccaaaagct cacagactac 120 ttggagagag aagaaagcta cctggaggga gaaccagatg catgaagcag tgcagatgtg 180 gtgcctaatg agtgtgtagt ctggaagggc agcaaaagtc gagtggagtg agaggttcct 240 gtgtcctgga gcactgagta gagactccct catgggggtg aatcttaaag gataaagggg 300 cctctataat gaaaaggagg aggatgggat ttctggtaga ggaaattgct tgagcaaaac 360 ctccaaggtt ggaatgacta tggtgtgttc agggatgtta gcagacccag atgggtggag 420 cgttgagtgt gtgtgtgtag gaaggaagag gggaggtggc tggatgagca cagtgagacc 480 tgatttgatt gagagccttg aacgccacgc tgaataatgg aggcaatggg acgccataga 540 gggcttttga gtagacatat ntcagtgtag aagggtgaat ttcagatttt tagacagaat 600 agagtaagga gaggagctct tagaaatcat ctagtccagg gcttgtggca gagccctgag 660 gttttaagaa ggcatgtcag gggctaccat gacaggcacg gagaggctga gtgaattggg 720 gttcttgcca caattccctt gcctgagatt caacaagagc agctgtatta caatctgtgc 780 aaaatgtcat taggagaaac tagttagtag ctgggcgtgg tggcatgcaa ctgttgtccc 840 agctactcgg gaggctgagg ccggagaatc gcttgaagct gggaggcgga ggttgcagtg 900 agcagagact gtgccactgc actccagcct ggatgacaga gcaagactct gtttcaaaaa 960 aaaaaaaaaa aaaaactagt caggactctt tcagatacaa gtaatagaaa ccaactcaaa 1020 c 1021 104 1021 DNA Homo sapiens misc_feature (561)..(561) n can be a or g. 
104 taccaaaggg caagtaggga aacagaccaa cagagatgtt accttctgaa taattggacc 60 caggaagagg agtgtaacct aagagaggaa gatacttgat tataccagtc tttgtggatg 120 aaaatatcta gcagtattca tagcaaatgc agtaggaagg agagagttaa tcacaaacag 180 aaagtaagca gagagtggga ccaagagtgg ggatgggagt tcagcgagtc actcactaga 240 gtggccagct ctccgccagc tgatcacacc aagagagaag atgatgaggc ccaggcccag 300 agtcactgca gacacagaaa ccttcagggt ctgcatgggg gacagcccag gtgctgcaaa 360 aaatagaaac ttacttgacc cagtttctgt tgctcacccc cagggcaatt ccatttattg 420 cagccacctc tcagtgggtt aaaaggtcct ttatcccagc tccaagggtc tagctcacac 480 cacccactcc caagaaaatg atctttctca aatcaaaccc tcgtcccatg gacctctact 540 cctagagtaa gcctggggaa nccatctccc cagaattagc atcctggctt ccaggtcctc 600 tctaatacag tggggcctct caaggcatcc tctttccttc ctttacctca aagccaccct 660 tatcaggata aagggctcct cactgtcctc tccattgccc ccacggtaac aatgtttgct 720 tccttacttt ctccaactga gcagcttcct attacactgt cttaccacat gtcttaacct 780 ccagtggatc catcctgtga gttatcctac tacttgtgta ccttctacat ctagatctcc 840 catgtgtcct ttcagagctt gtctccatcc cactccacag cccctgcact tccttgggcc 900 ggtcctgttc tgaatcatgt cccactcaga ttcttttccc atgataaaat gaacactcca 960 tttctaaagg gaggctcttg tgcacgctgt gaggagacgt tccccaggaa agttcaagtg 1020 a 1021 105 1021 DNA Homo sapiens misc_feature (614)..(614) n can be c or t. 
105 caaaaggtca ccccacagtc cccactccaa ggcaggttga tagcagggat ctcagggtgc 60 ccatggatca aggactaagt cagagtcggg gtccctcagg ccgagggtaa cgtaggtggt 120 gcctgccagg ctctcctcgc ccaggggggc tgagaatgtc taaacccggg tggctgtgac 180 ccctaggcag agccagccca gcccttgcca gggatggaga ccggcctcga ggaggccaag 240 ccctgggggt ccacaggcct gtgggcttcg gggaggctct gctccctgtg gccctgtgtg 300 gcccaggctg ctgagtcatc agaacctcgg gggcgccgcg ggccccacat tccgcccagg 360 cctctctctg acccccttcc cagcccatct gtgtttttgg aaaacagagc cagagccccc 420 cgcggccctg ccagcttgcg gctgctcacg ctgggactca aatcgcaccc ttctgtcttc 480 aaagtccacc ttcacttcaa agctcggtcc caccccagcc cggcctccac agggccacca 540 cctgcccaca cccaggcccg ctgctgccca gtttcggagg gaccttgggc atcccctgat 600 cctctctaga gcgnggggtt cctggcatgg gcccgttaca catgggtggc tcggtgggtg 660 gtgaggacgg ggctgggaga agatcctggg gaccccatgg tggaggcaat gaggcaccca 720 aaccccaact ccagcgatgg ctgcttccac ggggccctcc gagccctgac cttcaaggtg 780 caagaaaagc tttcaggggc aggggtgagt ggaaggtggg cttcctccct tgccacctgg 840 ggggcgggcc caggacagat gctccgtgag agcacttccc aacctaggcc cagctgtggg 900 gaaggaggga gcaggcggct gggctccagg cagggggaag agttgcctga gaactcaggg 960 agagagggag ggctggggca ccccatgcca gctccagctg cagcaccaga gctcagagca 1020 g 1021 106 1021 DNA Homo sapiens misc_feature (638)..(638) n can be c or t. 
106 attccctgac cagggccctg ggacccaccg cacagctgag ctggcccgag ctgaagagtt 60 gttggagcag cagctggagc tgtaccaggc cctccttgaa gggcaggagg gagcctggga 120 ggcccaagcc ctggtgctca agatccagaa gctgaaggaa cagatgagga ggcaccaaga 180 gagccttgga ggaggtgcct aagtttcccc cagtgcccac agcaccctcc ggcactgaaa 240 atacacgcac cacccaccag gagccttggg atcataaaca ccccagcgtc ttcccaggcc 300 agagaaagtg gaagagacca caaaccgcag gcaattggca ggcagtgggg gagccagggc 360 tctgcagtct tagtcccatt cccctttgat ctcacagcag gcagggcacc caggccttat 420 aggaattcac cctggaccat gccctaaaat aacctcaccc caaatacaat aaagggacga 480 agcacttata gataccacag acacatgtgt ttcattttta gttttgttaa aaaaaaattc 540 tgacaaatca gaaatggggg ttcaggagtg gtggtgatgc aaaagatgga agccatgggg 600 tgggggctgt caggggtggg ggcagtagtg tctccttnac ccccaccctg gtgtcctctc 660 ctgaaggaca gacggtcaca ttccaaaatg ggcgagtctt ctaccgtgtc tgttcaactg 720 agaagaaaac gtagcatggt cagaataagg catgaaaagg ggaaagtgag gcaggaacac 780 acggcacaca tgcagacact ggtgtactgc ctgggttcag aggacggacg tgggggtgag 840 ggaagggatg taatatgatg agagaagaca gaaaccccac ataaaggtca gaaaaacatc 900 ccaacacagc atcaaagacc agggggcatg aaccagtcaa gtgtccatta tgcatcagat 960 gcccatgacc tatgtgatgg gatttaggac aaacacacta aggaacaggg aggacctaaa 1020 g 1021 107 1021 DNA Homo sapiens misc_feature (573)..(573) n can be g or t. 
107 ctctgaaggc ttgcctggtg ctcactcagc ccgtgaagag ggcctgctgg tcctctggag 60 cccacagccc tttgtccaga ggcgactcct aacctttagc aggctctgcc ctaacttaca 120 gtcccaccat tgtctgcccc acatcctgtc tgcctgtctg tgctccattc tggcccatcc 180 taggtgtctc tggctgcaaa gcctttcctg ggctcagcct tctgccttga acgggccctg 240 accatgagtc cccatgtgcc cagcccatac cttttccctg tccagccagg agccaacaca 300 ggcctggagc attgcctgtg gtatggcctg ctcgctgctg ttcccggcct gggtggtcac 360 ggacatgcag aggtggcact cagagtctcg cggcagccat tctcctgtcg gcgaccctgg 420 agatgtgagc attaggggga aagcaggcaa ggccacccta cagaggtgtt tggtttctgt 480 cctccttggt gcattgcagt gggaccacag agggagaggg tcatgcagtg gcagggtagg 540 gggaggagga gagcaggcat tgggctaagg agngggcagt gggctcactt gggccagcgc 600 tgtcatccat ggagcaccgg aggacgaggc ggcagaccag ctggggcagc atgcggccca 660 gcagcgtgtc gagcaggatg acggagtagc gctcagccag gcactggcag atgccgcccg 720 ccaccagagg taccacgcgg cacacctggg ccactgccac agctagcgca ccctggggcg 780 ggggcggaga gaggccagca tgggaccttc acttggcaag cctccactct ctgcccagca 840 cccagctggg cacttcctac gcattccctc attctcttct agaagggagg gcaaggctat 900 tcacaaataa ggacactggg gatcagagag tccaggggat gcaggggact cacacagggt 960 cactgagtgt aggagccagc ttcagaccta cgtctggccc caaaggctct ggcccacagc 1020 t 1021 108 1021 DNA Homo sapiens misc_feature (531)..(531) n can be t or c. 
108 ggagagcagc agctggaggg caggctggga gcgcttgtga gggagaggag ctatggacgt 60 ctgcttctct gccaagggag agagtgaggt aggcctgggc ccgctgactt cagggtgagg 120 ccacagctac tgcagcgctt tttatttatt tatttattta ctgagatgga gtcttgctct 180 gtcacccagg ctggagtgca gtggtgcaat ctcggctcac tgcaacctct gcctcctggg 240 ctgcagtgat tctcctgcgt tcaagtaatt ctcctgcctc ggccttctga gtagttggga 300 ttacaggcat atgccaccac acttggctaa ttttttgtat ttttagtaga aatggggttt 360 caccatgttg gcgaggctgg tctcgaactc ctgacctcaa ggatcctcct gcctcggcct 420 cctaaggtgc tgggattgca ggtgtgagcc accacgtctg gccatactgc agcactttaa 480 aggacggtgt ctttttcttt ctcataaaag agaataggac tttattagca ntggtgcaga 540 cattgtatta cacaggaatg ggtccctagc ttgcacaacc ccagctgagc tttcagcaga 600 taaatcacag cagaaataga atcaccctag gactttcaat caaaagctgg aagtccacct 660 tacagaaaga caaaaagaaa ccccttttta tatcttaaca aagcaatagc tctcaagcag 720 cagagcatct cgaggaagaa agcttgcccg gtcgccatcc catcatgcca gagcgtgcag 780 tgtccaccct tgactacgct ggggaattgc tgattttttg aaaaagctta acttaacaat 840 ttctgatgtc tatcttttag agttctgtat gttcccattt tttattcttc tgaattttga 900 attgcaagta gctgtaaaat ccaatctttg agtgcatggg ggtgggtgtg aggcggggct 960 cagcttcaac cccctgtcct gtaaagcagt ggctggtttt tcctgagccc agccctggga 1020 g 1021 109 1021 DNA Homo sapiens misc_feature (592)..(592) n can be c or t. 
109 cagccatggt tcgcggtgcc ctcggctgcc ctgggccaga gctggggcta gctttcacct 60 tgttgagacc caggactctg tcccccaagc ctgtcttcgc cagcgccttg accccacccc 120 tcatatactg tgtcctggaa aacgtggaca cgggagacca cagccagggc gaggtatcgc 180 ccctccatcc ccccaggccc aatgagaagc agttggccaa ggtgatccag gtggcagagg 240 cagcatcaga cccagtctcc tgtcaggcac caccttgggt gccggtcccc agatgccctg 300 gcggggagtg tgcatgctcc cggagccccc aggtcacccc atgtgagcca ggcccacaga 360 gcttggctct gcaatgcctg ctgggctgct gcccatgctc caccccttct gggaagctaa 420 aagacagccc ttcagtgtcc agagacctgc ctggccttgg agcctgggtt tcacatgccc 480 accgggctgg caggggcact cagctgcctc cagccccggc ggtcaccctg gcattgggtc 540 catctaactg ctccccagtc acaaggcagc tgctccccaa gtctccccaa anctgctggc 600 ccctctagaa gcctctgtcc attcctggag gaccgagggc agcctgcatg ccatcccgca 660 cacagccttc tgtctgggca tcctgccttc acacatgctg cacagggagg aaactcttat 720 accacattcc ttaagcagag actgaagcct ggagccaggc acatggcaca tgctcccacc 780 cacccaggac acactgcggt gtggctgcct ccaggctggc cccctagatt gcgtctgctc 840 ctggcatgga taactggcgc ctttgcctgg ccgttggggc agtgtttgcc ttcccctgtc 900 ggcagcaaat atttactgtc ctccgtctcc aggactctcc aggcctgagc agaccccggg 960 gggatgagtg tggactcagc ggtgctgagg gtagccccct gcccttcggg tcctggtgcc 1020 c 1021 110 1021 DNA Homo sapiens misc_feature (601)..(601) n can be g or a. 
110 ggcagaggca gcatcagacc cagtctcctg tcaggcacca ccttgggtgc cggtccccag 60 atgccctggc ggggagtgtg catgctcccg gagcccccag gtcaccccat gtgagccagg 120 cccacagagc ttggctctgc aatgcctgct gggctgctgc ccatgctcca ccccttctgg 180 gaagctaaaa gacagccctt cagtgtccag agacctgcct ggccttggag cctgggtttc 240 acatgcccac cgggctggca ggggcactca gctgcctcca gccccggcgg tcaccctggc 300 attgggtcca tctaactgct ccccagtcac aaggcagctg ctccccaagt ctccccaaac 360 ctgctggccc ctctagaagc ctctgtccat tcctggagga ccgagggcag cctgcatgcc 420 atcccgcaca cagccttctg tctgggcatc ctgccttcac acatgctgca cagggaggaa 480 actcttatac cacattcctt aagcagagac tgaagcctgg agccaggcac atggcacatg 540 ctcccaccca cccaggacac actgcggtgt ggctgcctcc aggctggccc cctagattgc 600 ntctgctcct ggcatggata actggcgcct ttgcctggcc gttggggcag tgtttgcctt 660 cccctgtcgg cagcaaatat ttactgtcct ccgtctccag gactctccag gcctgagcag 720 accccggggg gatgagtgtg gactcagcgg tgctgagggt agccccctgc ccttcgggtc 780 ctggtgccca gcaggggtcc agcccaggga agagactgag gccaggacag gcagtgttta 840 agcctgagtt tctgggaaag gtagccctgg gcagaacttg ggccgaacgt tggccagtgt 900 ctctctccag ccaggctgtg aggtagctgt ttccaggatg ggcacctttc cacacccagc 960 aatgtggcca ggagccgcca ttcacgggtg cgaccagcag atggcatcag agcctcactt 1020 t 1021 111 1021 DNA Homo sapiens misc_feature (629)..(629) n can be c or t. 
111 agagactgag gccaggacag gcagtgttta agcctgagtt tctgggaaag gtagccctgg 60 gcagaacttg ggccgaacgt tggccagtgt ctctctccag ccaggctgtg aggtagctgt 120 ttccaggatg ggcacctttc cacacccagc aatgtggcca ggagccgcca ttcacgggtg 180 cgaccagcag atggcatcag agcctcactt ttgatgcact ccggccacca gccacgggtc 240 caggttctgg ccaccaccca gggtctgagc agctgcatcc tgcccctgcc gggcactccc 300 gggggctgtg gggcctgtgg gggccctgcc agacactctt gggggctgtg gggggccctg 360 ccaggcactc ccagggacta tgggggctgt ggggggccct gctgggcact ctgaagggca 420 tggggcttag gaatgagagg agctgtctga tgatgatggt gggggcactg cagaggcccc 480 cggcctgctc aggtccagtc tcggccccta agtcaagcct caggccagcc tctcaccagc 540 ctgggtttct cagagggccg ggacaaatgt tctgggtctc taatattcca agaaagcctc 600 tggctggact ctgagcccca cctgcgagnc cctagaatca cagagagcta gggtgagaag 660 accaggggga ctccgtccca ccctcgtcgt ggctgagccc actgtggccg gtggtggacc 720 aggctgtggc ctttgctgag ggtccccagg gcccctgggg gctactgagg ctggaggcca 780 gcggtggcca ggagggtccc tccctcagcc actcaagcca gaaggtcgag tcctggtttc 840 tatgtgagga gggggcttca ggggctggga cctgggggca ccgaaggcct ggagctgggg 900 tccaggcggc tgagggttag tgcgttccca cgctcccctc cgccagcgcc gtgaggagag 960 ggaggtccac tctggaaaga atgtttgagg gcaggggtag acagggtctg ggaacgcgga 1020 g 1021 112 1021 DNA Homo sapiens misc_feature (563)..(563) n can be g or a. 
112 atgcccctcc taacatgaaa gggatttaag caagccaatt gcttatttct gcctgggcca 60 gggaccccag ttcctgacct tctcaagaga tatgaacctg acccttctga gtgtagaact 120 gggctgtggg gccaggagat gtgggtttca atcccaggac ccccactggt ggctgtgcca 180 tcttgagcaa ggcactttgt ttctccgagt ctctatttct tcactggtaa acaaaggcac 240 aaatacctct tcaccacatc ataaggggat taaatgatgt aggaaaaagg atgttgtata 300 gtcgtgcaca tagtagggca gcaggtccag gaggtggacg gcccatccag ggacccagcg 360 gagcagccac ttccccactt ctcaagggtg gtcaccaggt atgtccgcag ggctgccccc 420 tgcccatctc caaggcctga ctggctgatc tcagctacac attggatact aagtcctagg 480 gccagagcca gcagagaggt ttgccttacc ttggaagtgg acgtaggtgt tgaaagccag 540 ggtgctgtcc acactggctc ccntcaggga gcagccagtc ttccatcctg tcacagcctg 600 catgaacctg tcaatcttct cagcagcaac atccagttct gtgaagtcca gagagcgtgg 660 gaggaccaca ggggtataga gagccaggcc ctgcacaaac ggctgcttca ggtgcaggcc 720 tggggctgtg aacacgccca ccaccgtgga cagcagcagc tgggcctggc tatcagccct 780 gccctgggcc actagcaggc cctgtacagc ctgcagggca gacaggacct tgtgcgcatc 840 cagccgggag gtgcagttct tgtccttcca aggaacaccc aggattgcct gtagcctgtc 900 agctgtgtgg tccaaggctc ccagatagag agaggccagg gtgccaaaga cagccgttgg 960 ggagaggacg gtggccccat ggaccacgcc ccatagctca ctgtgcatgc catatatacg 1020 g 1021 113 1021 DNA Homo sapiens misc_feature (551)..(551) n can be a or g. 
113 aggagaggaa gggcgtggaa actggaatga tcctagtggg gtgtcttggc atctcttggc 60 ctcattttcc ccatctgaac catgaagcta aaactagggg atgtggatta aatggttcct 120 acaactactt gcaaggagac cactctgtgt ggttgcaaag aacactttga gaagctgtgt 180 gggaaagttt ccttcctagc agggtagact cagctaactg caggtcatgt ggccattgtg 240 gatgggttgg gagctcaagt ttggggcaga agggaatttt ttttggcagc agagtggcaa 300 gccctgccgc caggcaaact ctgctcttcc tcatcctcag aagcacttgc tcactctgct 360 aaatcaaagt gaaacgcatg tttacagaat attggtccaa aagggtctca gcatctccca 420 ctacccaggg tggcagagcc tcgggccggc cttgctcccc aagaagggct gactggggct 480 ctgtcccctg ccccagggct cgaggtagtg tttacagccc tcatgaacag caaaggcgtg 540 agcctcttcg ncatcatcaa ccctgagatt atcactcgag atgtgagtac aaagcccccc 600 tcaccagccc ctgttcctgg ggagagaggc ccagacagga ttcctggggt gactgggggc 660 tgttggggag acagacagag gggcctctac cagcttggct ccctcctggt ggcctgggag 720 tcagcccagc tcgcccctct ctcctactgc ccctcccttc agggcttcct gctgctgcag 780 atggactttg gcttccctga gcacctgctg gtggatttcc tccagagctt gagctagaag 840 tctccaagga ggtcgggatg gggcttgtag cagaaggcaa gcaccaggct cacagctgga 900 accctggtgt ctcctccagc gatggtggaa gttgggttag gagtacggag atggagattg 960 gctcccaact cctccctatc ctaaaggccc actggcatta aagtgctgta tccaagagct 1020 g 1021 114 1021 DNA Homo sapiens misc_feature (548)..(548) n can be g or a. 
114 ttggatagac tgggggaaat aagtcctgtg ggacctcctg ccttaaagaa agcaggcgga 60 gggccctaaa ggaaatcagg caaccagacc aaaagaatgt ggaccaggtg gtccatgctg 120 tgtctcttgt gacccttctt ctccctgcca tgtcttttgg gagagccctt gtgttgcaaa 180 aatgagagtg tggtggtatg gattggggtt taggcagaac agtactggcc aagcagcgcc 240 tccctggacc tcaattttcc ctctgtggaa tgggctagca atcctgggcc tccccagggc 300 gaaggaaaga ccactcagga agggcaccgt ctggggcagg aaaacggagt gggttggatg 360 tatttttttc acggatgggc atgaggatga atgcttgtcc aggccgtgca gcatctgcct 420 tgtgggtcac ttctgtgctc cagggaggac tcaccatggg catttgattg gcagagcagc 480 tccgagtccg tccagagctt cctgcagtca atgatcaccg ctgtgggcat ccctgaggtc 540 atgtctcnta agtgtgggct ggaggggaaa ctgggtgccg aggctgacag agcttcccat 600 ttcaccttgt gggcccttcc caggcagagc ttcaggtgcc cctcttccca gtcattgata 660 cttagcggtc ctggccccct ttcctctccc tgctggtggt attgcacgcc aatgactcgg 720 ccagatgccc agacccctgt tcttggttta cctgcagaat attatctttg ccaccccgcg 780 ggatggctca acccactttc aggatgcagg tctcctaata gcaacctgat atagcagaaa 840 gacccctggg ctgggagtct gagacctagt tctagcccag ccctgaacct cagtttccct 900 ttctgtgaaa caagaatgtt gaacttgatg attcccaatt ttccttttga ccttgaaatg 960 gtagaatatt tatccctttg aggtgactcg gatggtagac tctcagacac catagcacac 1020 g 1021 115 1021 DNA Homo sapiens misc_feature (544)..(544) n can be g or a. 
115 ggggcagggc tggtggtcag ctggggcggg gtgggagctg gaggtccgtg gtcaccagct 60 gccctgacta atgtcgttac ttgaatataa ccctgtgaag gcaggaacca cgtctgtctg 120 gttcacttcc cacggtggtt gagacatagt gggcactccg gaagtatttg ttgaatgagt 180 gaaagccccg ctgggggaaa ctgggtacag ctctttcctc agtttcccca tctgcactct 240 gggctgaatg ctggggctcc tcccaatctc cctgaagctg gacctgagcc cagtagggac 300 acacagggtc cagccagcgt cctggcttcc tccagggtca tttcatctac aagaatgtct 360 cagaggacct ccccctcccc accttctcgc ccacactgct gggggactcc cgcatgctgt 420 acttctggtt ctctgagcga gtcttccact cgctggccaa ggtagctttc caggatggcc 480 gcctcatgct cagcctgatg ggagacgagt tcaaggtgag tgggtggggc tgggctgcta 540 gggnatccag atggcatgtg gtatgtgtgt gtgtgcacac gcatggggag gagggaggaa 600 actcggaaac ttggtggtgg gcaaaagaac taagctggag caatagcagt gaagtccaga 660 ctgggcacag tggctcacac ctgtaatccc aatcctttgg gaggctgaga tgtagcagga 720 cgaaccgcag acaaaactcc tcagacactg agttaaagaa ggaaagagtt tattcagccg 780 ggagcatggg taagactcct gtctcaagag cggagctctc cgagtgagca attcctgtcc 840 cttttaaggg ctcacaactc taagggggtc tgcatgagag ggtcgtgatc tattgagcaa 900 gtagcaggta cgtgactggg ggctgcatgc accggtaatc agaacgaaac agaacaggac 960 agggattttt acaatgctct ttcatgcaat gtctggaatc tatagataac ataactggtt 1020 a 1021 116 1021 DNA Homo sapiens misc_feature (542)..(542) n can be t or c. 
116 gcaaatccat agagacagaa agcacattta tggttgccag gagctgggaa agggcaggat 60 ggggaatgac tgtttattgg atgtggggct ctattttggg gtgatgagaa tgttctggaa 120 ttaaattcat ggctgcataa cactgtgaac atactaaatg cccctgaatt gtacacttta 180 aaatggttaa agtggcaagt tttcactaag cagtaaatta aattctacta caattttaaa 240 aagactaaaa aataatttaa aaaagattaa atgagataac gcaaaaaagc attatctcga 300 aaatacagct gatattagta taattcttac taagttttaa gagtctaagg tgcaggattc 360 taagtttaaa gggataggct cttttggttt tttggtttag ttatttggtt ttttttttta 420 atccattatc cccacccttg ggaggccccc agcacccagt ctgcactaga ggatggggcc 480 cacctccctt ttctctccag gcccagccac tgaccaccag taccctggcc aggggcaccc 540 tnggtcattg ccctccgtgg cccaaggaag ggaacagaaa caacagccaa gaagacaata 600 gccgccggga agtcctcaca tttctggaga aatagagccc attaatgaat gaagttcctc 660 cagcctgatc ggaggacggg gtgctgggga ggcctgggct aaagggctca cctccagccc 720 ccaccctggc agggccgatg gtacatgctc actcagtgag ggggctccag aggtctgtgg 780 gtacgaaccc aagggctggt gcccaggggc aatcagctta tgtctctgag ccttgggaaa 840 cagtgagggt cagcccggct ccccacgtgc ttctgggcag ctttggtatt ggagcaggtg 900 caaactcggg actagggcag gaccccctga gaggcgactg agcaaggcca tcccgactca 960 tgtttccttg gccctgcccg gggcacagca tcctgcccac atccctgcag ccctggctcc 1020 t 1021 117 1021 DNA Homo sapiens misc_feature (551)..(551) n can be a or g. 
117 gggaactagt gccgccccag ggccccaagg tgggcggttc ggtgattcag agagggcagc 60 tctgtgttag gacacactgg ggccagccag gaagggtgga aaagataggg accagcgtga 120 gcatagaggc taagggacca tgggagctcc aagcgcgctc acagtgggga ccaggtcctg 180 ggggctgggg acaccaggga ggtgaaatac ccctccagcg ggtagggagg gtgggcagag 240 gagggccagc ggccaggcat ttgggagggg ctcctgctct ttgggagagg tggggggccg 300 tgcctgggga tccaagttcc cctctctcca cctgtgctca cctctcctcc gtccccaacc 360 ctgcacaggc aagatcgtgg acgccgtgat tcaggagcac cagccctccg tgctgctgga 420 gctgggggcc tactgtggct actcagctgt gcgcatggcc cgcctgctgt caccaggggc 480 gaggctcatc accatcgaga tcaaccccga ctgtgccgcc atcacccagc ggatggtgga 540 tttcgctggc ntgaaggaca aggtgtgcat gcctgacccg ttgtcagacc tggaaaaagg 600 gccggctgtg ggcagggagg gcatgcgcac tttgtcctcc ccaccaggtg ttcacaccac 660 gttcactgaa aacccactat caccaggccc ctcagtgctt cccagcctgg ggctgaggaa 720 agaccccccc agcagctcag tgagggtctc acagctctgg gtaaactgcc aaggtggcac 780 caggaggggc agggacagag tggggccttg tcatcccaga accctaaaga aaactgatga 840 atgcttgtat gggtgtgtaa agatggcctc ctgtctgtgt gggcgtgggc actgacaggc 900 gctgttgtat aggtgtgtag ggatggcctc ctgtctgtga ggacgtgggc actgacaggc 960 gctgttccag gtcacccttg tggttggagc gtcccaggac atcatccccc agctgaagaa 1020 g 1021 118 1021 DNA Homo sapiens misc_feature (554)..(554) n can be c or t. 
118 agcttcctga gtagctggga ttacaggcac tcacctccac gcccagctaa cttttgtatt 60 tttagtacag atggggtttc accatgttgg tcaggctggt ctcgaactcc tgacctcgtg 120 atccgccctc ctcgcctccc aaagtgctgt gattacagga gtgagccacc gctcctggcc 180 agaaatctct tctttattat gtctactgtc cgttatccaa ctccagaagg taagaacctc 240 cactgataca taaggacttg tataccccac gtgcctgcaa cagtgcttgg cacctagtag 300 gcataccaaa atatataaat gttgaacaaa tgaagaaagt taaagtaaaa ctagaggtcc 360 aaaaatatca caaaagccat ctatggtcgc cttttcccta cctgattttg ctgagtggcc 420 ttacttttca gtcctctaca cagctggaac attaatgaac acagaggggg aagaagtgtg 480 tttactctag gatcacctct caatgggtca cttggcaagg gcatctttgc ttcttcgtca 540 gctccttttg acangggggt gaagggtttt ctgcaccaca ctttgaccac aagcatcacc 600 aatttcactg aacccaacag aaatttggac cctctggggg ctctctgcgt ggcagggccc 660 ttttcttttt ctttgggctt aggctgcaat ttgaaacacc actttcctga gccagcatcc 720 cccttgcagc gctgtcacag ggaggcttag gcagccacgt ggaagccacc taccccgacc 780 tttggcagaa tttccaaaca caacacagta gctttaagtt gattaatttg gaactctgac 840 cttggcccca aaaggtaaga atacataaca aggtatttta ttctcaaaat gtgtcaggat 900 aagaagcact tctgtaaatc gaccttttta aaatagatat aattagattt gcagttgggg 960 gcagtaaaga aagggtctga acagtggata acatgttgag aggttaatta ttaatgggca 1020 g 1021 119 1021 DNA Homo sapiens misc_feature (548)..(548) n can be a or g. 
119 gcagcctgtt gtgccttgtg cctcgaagag gtttggtatc tgccagtttc tccctcgctg 60 tttttatggc tttcaaaagc agaagtagga ggctgagaaa tttctctgtt gaatacctga 120 tttcacaatc aagttaaagg aaaggggaaa agagtattgg tggaagcttc ttaggggagg 180 ggactaataa actgagataa ttctctggtt catggaaggg caaggagtag caaactatga 240 cacattttgc aaatgtatca ccatgcaaat atgcattgtt ttcctgacaa tcgttgtgca 300 gttgatgtcc acattaaaat actggatttt cccacgttag aagaatgttt aaatttagta 360 tatgtgggac aaagtggaag acacacagat ttatacatgc acatactttt cttcattcac 420 ttctttgtac ttaagtttag gaatcttccc acttacagat ggataaatgg gtacaatgaa 480 gggccaatag ccctccctgt ctgtattgag ggtgtgggtc tctaccttgg gtgctgttct 540 ctgcctcngg agctctctgt caattgcagg agcctctgag gagaaaattg acctttcttg 600 gctggggcag agaacatacg gtatgcaggg ttcaggctcc tgacggagtt ggggcaaccc 660 tggagataag ctcacacaac cctgcaagac caggtgctgt taccctagcc aatctcatgg 720 atgaaccaga tcaatgccag atgagctctg cctaaaatga ttttttggtg aactctgaaa 780 agtggaatat tgtttctgta agaatatcca tctgagactc tatctcttgg taataccaac 840 caagagttat cagtttctct ttaaccgaga caccagcaaa gtgcctgctc cagggtactg 900 cccaggggag ccctccattt gtagaatgaa tgagagtcca ggttatgaac agtgcctgga 960 gtgtaggaac accctccttt gcctctttga caggtctgca tcataacact tttttttttt 1020 t 1021 120 1021 DNA Homo sapiens misc_feature (546)..(546) n can be a or g. 
120 gaaataccat attgcatcaa acctaagacg ccatcaagaa taaaaggcac ttttctttac 60 attactaccc agacgcaaac agagctgcca attcaaccat gatgagtcac cagttatagg 120 aggtttgatt tcagagctat aagagtgtat gtcctagaac caatgagcta tcgtagatcc 180 aagaatctac atatctgagt tggaagggct gccagccctt ggggcatgat cttccatcct 240 caaagacttc ttcagatttg aagagcaagg ggaaggactg cctggtgtct taacgaagtg 300 tctcctactc agccagtagg accctgagca ctctggggca tcctggcatc tgttgcccag 360 ctaatggttc ccaccagtca cccgtcccaa cccatgccac catccagtgc ccagcagctc 420 tcagagatac tcacttacta caggagacac actcgttttc tcttagaaag aaacctgcat 480 ggcaggtgca cacggtgttc tgtttctcct ggcctgtagg gagaagtgcg gcacagctaa 540 aggagnagcg cctgcacccc caccccacag gacagaggaa gtgacgaggg acagggtggg 600 ggcggccaga gaggagttgg ttgtcagacc cacagaatac aggaggggga aggaaaggaa 660 gtgccaccgc atggggaagg ggccaacccc tggggtgggg agagggcttg gcctcaggag 720 agctgcgctc acaggagagg tgcacggtcc cattgaggca gaggctgcaa ttgaagcact 780 ggaaaaggtt ttcactccaa taatgccggt actggttctt cctgcagcca cacacggtgt 840 cccggtccac tgtgcaagaa gagatctcca cctgacccat ttctggtgag gggagaagat 900 ggggtatgag tcctgcatcc tcctgtccct gcatcccctt cctgacatac ccctaagtgt 960 gtgtctctgt aatacacact cacatccatg cagtgtccca ccaaaacaca caccttcctg 1020 c 1021 121 1021 DNA Homo sapiens misc_feature (553)..(553) n can be c or a. 
121 agatccagaa gctgaaggaa cagatgagga ggcaccaaga gagccttgga ggaggtgcct 60 aagtttcccc cagtgcccac agcaccctcc ggcactgaaa atacacgcac cacccaccag 120 gagccttggg atcataaaca ccccagcgtc ttcccaggcc agagaaagtg gaagagacca 180 caaaccgcag gcaattggca ggcagtgggg gagccagggc tctgcagtct tagtcccatt 240 cccctttgat ctcacagcag gcagggcacc caggccttat aggaattcac cctggaccat 300 gccctaaaat aacctcaccc caaatacaat aaagggacga agcacttata gataccacag 360 acacatgtgt ttcattttta gttttgttaa aaaaaaattc tgacaaatca gaaatggggg 420 ttcaggagtg gtggtgatgc aaaagatgga agccatgggg tgggggctgt caggggtggg 480 ggcagtagtg tctccttcac ccccaccctg gtgtcctctc ctgaaggaca gacggtcaca 540 ttccaaaatg ggngagtctt ctaccgtgtc tgttcaactg agaagaaaac gtagcatggt 600 cagaataagg catgaaaagg ggaaagtgag gcaggaacac acggcacaca tgcagacact 660 ggtgtactgc ctgggttcag aggacggacg tgggggtgag ggaagggatg taatatgatg 720 agagaagaca gaaaccccac ataaaggtca gaaaaacatc ccaacacagc atcaaagacc 780 agggggcatg aaccagtcaa gtgtccatta tgcatcagat gcccatgacc tatgtgatgg 840 gatttaggac aaacacacta aggaacaggg aggacctaaa gggtttcatg agatcagtac 900 tcactgtagg aggagatgtc tatctcatca ggcagctcac taatattgac ctcaaagcga 960 tcctgcacat cattgaggat cttggcatca ttctcatcgg acacaaatgt gatagccaag 1020 c 1021 122 1021 DNA Homo sapiens misc_feature (551)..(551) n can be c or t. 
122 aggtgtgtgc caccatgcac ggctaatttt tgcattttta gtagagagag ggtttcatcc 60 tgttggccac attggtctta aactcctgac ctcaaataat ccacacgcct tggcctccca 120 aactgctgag attacaggtg taagccattg tgcacttggc cagaatcctc aatattcaca 180 caccactgga gctgttttaa agtttccggc tttctctgcc acatacccca aaattattaa 240 actgatatga ttcaaagtca gtataaagta gtaagaaaag ggtggtcttg tgttaagcat 300 catccatagc ccaattacga atcctcctgt tacataggaa ctcaacactc tgttacacca 360 cagcaaacta aagcttctcc aaaattaaag agactattgg cctacaagtt tcttatccct 420 ccaacttgcc acaccctcac tctcaggtct ctttaccttg gcttaccttg acattgggca 480 tgtatttaga gaagcgctca tattccttgc tgatctgaaa agccaactcc cgagtgtgac 540 acatcaccag nacagacacc ttaggcagga agtatacgga gacatatggt aaatgtagct 600 cttcattatc ccctctaggg aagtgactgt cacaaaaaca cacctgggcc gataataaat 660 gacttcaatt ctgtgatcta aatcatgaac cccacgcttg cgacagaaca tcccccacag 720 ctgtcaggtt gtcaagggta acagaggtca tgtgctcatg gctctgcaag catcatgtag 780 ttaggacaaa aacacccttc ccttatagtc ctaaccaaaa tcccctcccc agcactctcc 840 ccaaatatac ctgcccagta actggctcca gctgttgcag tgtggccaag acaaacactg 900 ctgtctttcc catgcccgac ttggcctggc acaggacatc cattcccaga atggcctgag 960 ggatgcactc atgctggact aaaagttggg gggggaggaa gataaattag acttcagtct 1020 c 1021 123 1021 DNA Homo sapiens misc_feature (569)..(569) n can be g or a. 
123 gtgtaatgta ttagagcaaa tcctcttgat taggcttgag aatggagcca tggagcccca 60 tttttttccc acccttcatg cagtagtgtt taattaaata tttaaaatat ttaatgccct 120 gcacaggcat catttaattg gaatgaacaa ctgctaactg ctggcacagg gctctagaag 180 gccccagata tcagtaattt accactgttt gcttgctctt gggataggaa ggatccgggg 240 atcctagagg aggagctagg gcagttgggt gctggaggag gcacatgggg gctcagcaca 300 gccacttgtt tgccagctgg tggagcagtg tggaactcgc cttcttggga ggaagaaaca 360 cgtctccaga cttccataac aaagtaccca gagttgctgg gctagttaca gttccaatga 420 ccattcctcc ccagcaggat aagcccaggg ccccacccta cctgggtccc ccttctcgcc 480 ccgagggccc tctctcccat cccgtccatc gcgaccaggc aggccactct ccactgagct 540 acacatgacc agggtgcaag cactgggcnt tgttctgtgg gagtaggtct tcatttctgc 600 ttccaggtag cccaggggct gtgtgagcag gaccagtgca gagaggagga agagcagcat 660 ggcctggaga ggtgaacaga aagagaaaag acatgcttat gcttcatgga catggtttag 720 ggcttggctc agcttctaga ggtgacaaga agcccccatt ccctccttct gtcctctgct 780 atggggccta gagcagcagg aatccaaaag cagtttaagg acaaggaggg cacaaggtct 840 ggatggagag catgagttac ccagctggaa ctctgacata ggttgacagc agcatccccc 900 attcccaggt gctcatgtct tcccttcttg tgccttccct tgggcactaa gtttggcaca 960 gtggctagga tgtagcattc ctcactgggg ccatctgtca catcaagaag ggttcattga 1020 g 1021 124 1021 DNA Homo sapiens misc_feature (553)..(553) n can be c or t. 
124 atggcacctg ccctttggca ccccaaggtg gagcccccag cgaccttccc cttccagctg 60 agcattgctg tgggggagag ggggaagacg ggaggaaaga agggagtggt tccatcacgc 120 ctcctcactc ctctcctccc gtcttctcct ctcctgccct tgtctccctg tctcagcagc 180 tccaggggtg gtgtgggccc ctccagcctc ctaggtggtg ccaggccaga gtccaagctc 240 agggacagca gtccctcctg tgggggcccc tgaactgggc tcacatccca cacattttcc 300 aaaccactcc cattgtgagc ctttggtcct ggtggtgtcc ctctggttgt gggaccaaga 360 gcttgtgccc atttttcatc tgaggaagga ggcagcagag gccacgggct ggtctgggtc 420 ccactcacct cccctctcac ctctcttctt cctgggacgc ctctgcctgc cagctctcac 480 ttccctcccc tgacccgcag ggtggctgcg tccttccagg gcctggcctg agggcagggg 540 tggtttgctc ccncttcagc ctccgggggc tggggtcagt gcggtgctaa cacggctctc 600 tctgtgctgt gggacttcca ggcaggcccg caagccgtgt gagccgtcgc agccgtggca 660 tcgttgagga gtgctgtttc cgcagctgtg acctggccct cctggagacg tactgtgcta 720 cccccgccaa gtccgagagg gacgtgtcga cccctccgac cgtgcttccg gtgagggtcc 780 tgggcccctt tcccactctc tagagacaga gaaatagggc ttcgggcgcc cagcgtttcc 840 tgtggcctct gggacctctt ggccagggac aaggacccgt gacttccttg cttgctgtgt 900 ggcccgggag cagctcagac gctggctcct tctgtccctc tgcccgtgga cattagctca 960 agtcactgat cagtcacagg ggtggcctgt caggtcaggc gggcggctca ggcggaagag 1020 c 1021 125 1021 DNA Homo sapiens misc_feature (601)..(601) n can be c or t. 
125 gagctggccc gagctgaaga gttgttggag cagcagctgg agctgtacca ggccctcctt 60 gaagggcagg agggagcctg ggaggcccaa gccctggtgc tcaagatcca gaagctgaag 120 gaacagatga ggaggcacca agagagcctt ggaggaggtg cctaagtttc ccccagtgcc 180 cacagcaccc tccggcactg aaaatacacg caccacccac caggagcctt gggatcataa 240 acaccccagc gtcttcccag gccagagaaa gtggaagaga ccacaaaccg caggcaattg 300 gcaggcagtg ggggagccag ggctctgcag tcttagtccc attccccttt gatctcacag 360 caggcagggc acccaggcct tataggaatt caccctggac catgccctaa aataacctca 420 ccccaaatac aataaaggga cgaagcactt atagatacca cagacacatg tgtttcattt 480 ttagttttgt taaaaaaaaa ttctgacaaa tcagaaatgg gggttcagga gtggtggtga 540 tgcaaaagat ggaagccatg gggtgggggc tgtcaggggt gggggcagta gtgtctcctt 600 nacccccacc ctggtgtcct ctcctgaagg acagacggtc acattccaaa atgggcgagt 660 cttctaccgt gtctgttcaa ctgagaagaa aacgtagcat ggtcagaata aggcatgaaa 720 aggggaaagt gaggcaggaa cacacggcac acatgcagac actggtgtac tgcctgggtt 780 cagaggacgg acgtgggggt gagggaaggg atgtaatatg atgagagaag acagaaaccc 840 cacataaagg tcagaaaaac atcccaacac agcatcaaag accagggggc atgaaccagt 900 caagtgtcca ttatgcatca gatgcccatg acctatgtga tgggatttag gacaaacaca 960 ctaaggaaca gggaggacct aaagggtttc atgagatcag tactcactgt aggaggagat 1020 g 1021 126 1021 DNA Homo sapiens misc_feature (564)..(564) n can be c or a. 
126 tttcaatcaa aagctggaag tccaccttac agaaagacaa aaagaaaccc ctttttatat 60 cttaacaaag caatagctct caagcagcag agcatctcga ggaagaaagc ttgcccggtc 120 gccatcccat catgccagag cgtgcagtgt ccacccttga ctacgctggg gaattgctga 180 ttttttgaaa aagcttaact taacaatttc tgatgtctat cttttagagt tctgtatgtt 240 cccatttttt attcttctga attttgaatt gcaagtagct gtaaaatcca atctttgagt 300 gcatgggggt gggtgtgagg cggggctcag cttcaacccc ctgtcctgta aagcagtggc 360 tggtttttcc tgagcccagc cctgggaggt cgtggtaggt gtggaggctg cagagctcct 420 ccagatgctg ccctcgctgt gcctcacacc agagaggatg gaagtgggct ctggtgtcag 480 actgtggttg agctgagaca gacaaggccg acacagggct gggggcccgt ggtccaccag 540 tggaagtgac tgccgaggaa gggnggtgag gagggcggtg tgggagctga ggcttctttt 600 cagcctggca gctggcgagg gccagggagc aggggaagag cctggtcacc atggtcccag 660 agcccgtctc acttggcttt tcctttgcag ctgaggagga tgagggccag agagggactg 720 tgtgtatgtc ctgcctgggg acccacagcc aggtgatagc agaggtggtt tgaagcccag 780 gcctcccacg ccaacccact ggtcttgctg tttcagcagg gaaggccggg agccctagga 840 gctggggaaa ggcgactgcc cgggtcctgg gtgactcccc acccccagat ccccagctgt 900 catcactggg gcaaggacac attaaactgg tccctgtggg tcaggtctga gtgggggagg 960 acctcccctc cccactgcct cccacagggg cttgtgatgc agggtttcag gaacagggct 1020 g 1021 127 1021 DNA Homo sapiens misc_feature (607)..(607) n can be c or t. 
127 ttggtttttg ttgtattcaa ttctaattat ttattacaca gttaccatcc tttgatgaga 60 tgttactctt catctgtgat tgcttatagt tgttcgcgag cttctgtcca ttggtaatta 120 gaaagtttat ttatatcaag tttaatcttc ctgttaaaaa cagtgttcta atagtcatcc 180 atattaaaat attatatggc agtattaaaa actacaaata ttactcttgg gaatcaaatc 240 atacactgta gcacatcatc tttcttggca atagtactgc tgttgtacac tgatggcctc 300 taacagagaa gaaatcattc cattgaaaga aaagtaacta tcaagaacaa agttggaagt 360 gatgccttaa agctaccggc ccatgtctaa atgtactttt gatttttatt ttattggtta 420 agtagaaatt atttttaatg taatgacagc ccattaataa atgtctcctc tgttgaaggt 480 agggttaatt cagtatgcca ataatccaag agttgtgttt aacttgaaca catataaaac 540 caaagaagaa atgattgtag caacatccca gacatcccaa tatggtgggg acctcacaaa 600 cacattngga gcaattcaat atgcaaggta agttttggtg ctaataggcc aatgttttca 660 taatgtaaaa cattatattt atgtaataaa tatgaaaaag taaggaaaag acaaagaaaa 720 ataatatacc tggtacctaa tttaaatcag aactaataaa gaaaaaaaca tcagagcatt 780 ctatgtcttg aatactttga gaaggcagct gggaaagtta aatctttgat tttaggatat 840 ttataagata tcacatgata tttaaatgaa tttatgtgaa gtaaatgaaa tgagaagacc 900 ttagattaaa acagtaggaa atggggcaat ctgtcataat ttgttaatat tcatcagaga 960 ttcagacaaa ttgagctcat ggatcacttg gtgcaaatta acaaagacca cagaatctta 1020 a 1021 128 1021 DNA Homo sapiens misc_feature (561)..(561) n can be c or t. 
128 tggatctgca gctccagaga agggcctggg tcagatgtca ctgaagccct atggtggcgg 60 aaaggcgaga aatagtgggt tgagattcca agtgcaatcc actgcggctc ctcgctcgcc 120 ctccaggtgg cagcacaacc ctgcgcttcc gaagcccgtt ttctgagcca gacactctcc 180 acgctctggg tatttcggct tctctctccc cacacgccga ccctaggtcg cgcactttct 240 gcctggcaga atttggccga ggatccaaac ccggagcagc ctccagagag cgtgtcgttc 300 acgcggccag catatgctca gagacctcag aggctcagag acctcagggc tggtggtgtg 360 gtcggttgtg accacttgtc cctcggaccg gctccaggaa ccaacctggg gaatgtgtgt 420 aggggaaggg cgggatagac agtgcccgga gcagggaggc gctgaaagac aggaccaagc 480 agcccggcca ccagacccgt tgtgggaacg gaatttcctg gcccccaggg ccacactcgc 540 gtgggaagca tgtcgcggac nctttaaggc gtcatctccc tgtctctccg cccccgcctg 600 ggacaggccg ggacgcccgg gacctgacat ttggaggctc ccaacgtggg agctaaaaat 660 agcagccccg ggttactttg gggcattgct cctctcccaa cccgcgcgcc ggctcgcgag 720 ccgtctcagg ccgctggagt ttccccgggg caagtacacc tggcccgtcc tctcctctca 780 gaccccactg tccagacccg cagagtttaa gatgcttctg cagcccggga tcctagctgg 840 tgggcggagt cctaacacgt gggtgggcgg ggccttttgt tccagggact cttttctcaa 900 aacttcccag tcggaggctg gcgggaaccc gagaggcgtg tctcgccagc cacgcggagg 960 ggcgtggcct cattggcccg ccccaccaac tccagccaaa ctctaaaccc caggcggagg 1020 g 1021
129 42 DNA Artificial Sequence Synthetic 129 cagttgttta tctttcgctc catcaaccaa gtcacaattg gt 42
130 29 DNA Artificial Sequence Synthetic 130 cgcgccgagg agttgggagg gaatttctv 29
131 32 DNA Artificial Sequence Synthetic 131 atgacgtggc agacggttgg gagggaattt cv 32
132 25 DNA Artificial Sequence Synthetic 132 ggcaggcttc agtttggcca ggcca 25
133 30 DNA Artificial Sequence Synthetic 133 atgacgtggc agacctccat ggtggtgctv 30
134 26 DNA Artificial Sequence Synthetic 134 cgcgccgagg ttccatggtg gtgctv 26
135 30 DNA Artificial Sequence Synthetic 135 agcatagtcc caggaatgag gtcccccaat 30
136 36 DNA Artificial Sequence Synthetic 136 atgacgtggc agacgttgct aagttttaca tagggv 36
137 33 DNA Artificial Sequence Synthetic 137 cgcgccgagg attgctaagt tttacatagg ggv 33
138 34 DNA Artificial Sequence Synthetic 138
ccaacacaga tggagattat ggcagacttg tttt 34
139 32 DNA Artificial Sequence Synthetic 139 atgacgtggc agacgttaga agagcagccc tv 32
140 28 DNA Artificial Sequence Synthetic 140 cgcgccgagg cttagaagag cagccctv 28
141 30 DNA Artificial Sequence Synthetic 141 gctgcaccgc ctcatcaatc ccaacttctc 30
142 26 DNA Artificial Sequence Synthetic 142 cgcgccgagg tcggctatca ggacgv 26
143 30 DNA Artificial Sequence Synthetic 143 atgacgtggc agacacggct atcaggacgv 30
144 32 DNA Artificial Sequence Synthetic 144 gcatgacagg aagacagggt gtgaggttgg at 32
145 27 DNA Artificial Sequence Synthetic 145 cgcgccgagg aggagagagg ctgtagv 27
146 31 DNA Artificial Sequence Synthetic 146 atgacgtggc agacgggaga gaggctgtag v 31
147 28 DNA Artificial Sequence Synthetic 147 actgctactg tctgtgctgt gctgggct 28
148 30 DNA Artificial Sequence Synthetic 148 atgacgtggc agacgcagag ctggacaccv 30
149 26 DNA Artificial Sequence Synthetic 149 cgcgccgagg acagagctgg acaccv 26
150 32 DNA Artificial Sequence Synthetic 150 ggtctctctg gacagcacac tgcaccaagt at 32
151 26 DNA Artificial Sequence Synthetic 151 cgcgccgagg agcccaccaa aaacgv 26
152 30 DNA Artificial Sequence Synthetic 152 atgacgtggc agacggccca ccaaaaacgv 30
153 42 DNA Artificial Sequence Synthetic 153 tcctatgccc aagttctctg atcatcctca aaagaagaca gt 42
154 27 DNA Artificial Sequence Synthetic 154 cgcgccgagg acttccatcc cagaggv 27
155 31 DNA Artificial Sequence Synthetic 155 atgacgtggc agacccttcc atcccagagg v 31
156 23 DNA Artificial Sequence Synthetic 156 ctgccrtgcc cttcctggcc cac 23
157 34 DNA Artificial Sequence Synthetic 157 cgcgccgagg tccctaaacc taaattcaaa tctv 34
158 37 DNA Artificial Sequence Synthetic 158 atgacgtggc agacgcccta aacctaaatt caaatcv 37
159 31 DNA Artificial Sequence Synthetic 159 gctgcagaga tgtgtcctcc cacagaggag t 31
160 32 DNA Artificial Sequence Synthetic 160 atgacgtggc agacctgaaa ccaccaagga gv 32
161 28 DNA Artificial Sequence Synthetic 161 cgcgccgagg atgaaaccac caaggagv 28
162 33 DNA Artificial Sequence Synthetic 162 gcctctggtt tctgtctact ccaacgtcca cgt 33
163 29 DNA Artificial Sequence Synthetic 163 cgcgccgagg cgcatagata caggcatcv 29
164 33 DNA Artificial Sequence Synthetic 164 atgacgtggc agacggcata gatacaggca tcv 33
165 26 DNA Artificial Sequence Synthetic 165 gtccgtgggg tttttgctgt gcggat 26
166 33 DNA Artificial Sequence Synthetic 166 atgacgtggc agacgtggaa agtacaaggc tcv 33
167 29 DNA Artificial Sequence Synthetic 167 cgcgccgagg ctggaaagta caaggctcv 29
168 28 DNA Artificial Sequence Synthetic 168 cagaaggctg cagcctcaca atgcaggt 28
169 25 DNA Artificial Sequence Synthetic 169 cgcgccgagg atgactgggt ccccv 25
170 29 DNA Artificial Sequence Synthetic 170 atgacgtggc agacgtgact gggtccccv 29
171 36 DNA Artificial Sequence Synthetic 171 cccaaatttg atccactgta accgtgcgta cacagt 36
172 31 DNA Artificial Sequence Synthetic 172 atgacgtggc agaccaccgt tgcaacaaca v 31
173 27 DNA Artificial Sequence Synthetic 173 cgcgccgagg aaccgttgca acaacav 27
174 37 DNA Artificial Sequence Synthetic 174 gagagttgct caaggtaaca cagtggtaag tgacggt 37
175 31 DNA Artificial Sequence Synthetic 175 atgacgtggc agacggccag gaactagact v 31
176 28 DNA Artificial Sequence Synthetic 176 cgcgccgagg agccaggaac tagactcv 28
177 30 DNA Artificial Sequence Synthetic 177 gcagtcagta gcagcagctt gagtggcaga 30
178 31 DNA Artificial Sequence Synthetic 178 atgacgtggc agaccggttc tcaaacctgg v 31
179 28 DNA Artificial Sequence Synthetic 179 cgcgccgagg tggttctcaa acctggav 28
180 30 DNA Artificial Sequence Synthetic 180 ccctctggaa ggatggctma tttgcacaca 30
181 32 DNA Artificial Sequence Synthetic 181 atgacgtggc agacctgagg ctttcctgat gv 32
182 29 DNA Artificial Sequence Synthetic 182 cgcgccgagg ttgaggcttt cctgatgav 29
183 24 DNA Artificial Sequence Synthetic 183 gcgaggccga gcccctccta gtgt 24
184 30 DNA Artificial Sequence Synthetic 184 atgacgtggc agacgttccg gaccttgctv 30
185 26 DNA Artificial Sequence Synthetic 185 cgcgccgagg cttccggacc ttgctv 26
186 39 DNA Artificial Sequence Synthetic 186 acaaaccttt tagtttactc tgcagttaat cccactgat 39
187 31 DNA Artificial Sequence Synthetic 187 atgacgtggc agacgaagta gtgggctcca v 31
188 28 DNA Artificial Sequence Synthetic 188 cgcgccgagg aaagtagtgg gctccaav 28
189 30 DNA Artificial Sequence Synthetic 189 tgtatgttgg cctcctttgc tgccctcact 30
190 33 DNA Artificial Sequence Synthetic 190 atgacgtggc agacgatctc ttcctgtgac acv 33
191 29 DNA Artificial Sequence Synthetic 191 cgcgccgagg aatctcttcc tgtgacacv 29
192 23 DNA Artificial Sequence Synthetic 192 gcccagagcg ggagacagcg aca 23
193 34 DNA Artificial Sequence Synthetic 193 atgacgtggc agaccgactt ggatatcagg tacv 34
194 31 DNA Artificial Sequence Synthetic 194 cgcgccgagg tgacttggat atcaggtact v 31
195 23 DNA Artificial Sequence Synthetic 195 tcgtggtccg gcgcatggct tca 23
196 26 DNA Artificial Sequence Synthetic 196 cgcgccgagg tattgggtgc cagcav 26
197 29 DNA Artificial Sequence Synthetic 197 atgacgtggc agaccattgg gtgccagcv 29
198 38 DNA Artificial Sequence Synthetic 198 gtgatcattc tgatggtgtg gattgtgtca ggccttaa 38
199 30 DNA Artificial Sequence Synthetic 199 atgacgtggc agaccctcct tcttgcccav 30
200 28 DNA Artificial Sequence Synthetic 200 cgcgccgagg tctccttctt gcccattv 28
201 28 DNA Artificial Sequence Synthetic 201 agcgacacct tcacgttgtc ctggacct 28
202 30 DNA Artificial Sequence Synthetic 202 atgacgtggc agacgccgtc tggttgttcv 30
203 27 DNA Artificial Sequence Synthetic 203 cgcgccgagg accgtctggt tgttccv 27
204 24 DNA Artificial Sequence Synthetic 204 gcggagccaa aggaccgagc aggc 24
205 28 DNA Artificial Sequence Synthetic 205 cgcgccgagg tttaatccca cagccagv 28
206 32 DNA Artificial Sequence Synthetic 206 atgacgtggc agacgttaat cccacagcca gv 32
207 28 DNA Artificial Sequence Synthetic 207 gcgtgtcctc cagggtgaac atgtccct 28
208 30 DNA Artificial Sequence Synthetic 208 atgacgtggc agacgctgga cctgtgtgav 30
209 27 DNA Artificial Sequence Synthetic 209 cgcgccgagg actggacctg tgtgaav 27
210 30 DNA Artificial Sequence Synthetic 210 gcatttgatt gcagagcagc tccgagtcct 30
211 27 DNA Artificial Sequence Synthetic 211 cgcgccgagg atccagagct tcctgcv 27
212 31 DNA Artificial Sequence Synthetic 212 atgacgtggc agacgtccag agcttcctgc v 31
213 27 DNA Artificial Sequence Synthetic 213 gaacagcttc accacggcgg tcatgtt 27
214 30 DNA Artificial Sequence Synthetic 214 atgacgtggc agacgcttct gtcccctggv 30
215 26 DNA Artificial Sequence Synthetic 215 cgcgccgagg acttctgtcc cctggv 26
216 34 DNA Artificial Sequence Synthetic 216 aacccatagt taagaacgtg gggtgaggta ccgc 34
217 25 DNA Artificial Sequence Synthetic 217 cgcgccgagg tcctgccctt tggcv 25
218 29 DNA Artificial Sequence Synthetic 218 atgacgtggc agacacctgc cctttggcv 29
219 35 DNA Artificial Sequence Synthetic 219 gctggagtgt gcccaatgct atatgtcagt tgagt 35
220 34 DNA Artificial Sequence Synthetic 220 atgacgtggc agacgttcta agacttggaa gccv 34
221 30 DNA Artificial Sequence Synthetic 221 cgcgccgagg attctaagac ttggaagccv 30
222 45 DNA Artificial Sequence Synthetic 222 gggaacaatc accttttctc tttgcctttc atactgcttt agact 45
223 28 DNA Artificial Sequence Synthetic 223 cgcgccgagg acctcactgc ttcctaav 28
224 32 DNA Artificial Sequence Synthetic 224 atgacgtggc agacccctca ctgcttccta av 32
225 39 DNA Artificial Sequence Synthetic 225 tttgttccgg acatcatgtg tatcccaacc taccaaaat 39
226 30 DNA Artificial Sequence Synthetic 226 cgcgccgagg aagtcctttc caggtaaggv 30
227 33 DNA Artificial Sequence Synthetic 227 atgacgtggc agacgagtcc tttccaggta agv 33
228 40 DNA Artificial Sequence Synthetic 228 tttgtgcagt ggttgatgaa taccaacagg aacaggtaat 40
229 27 DNA Artificial Sequence Synthetic 229 cgcgccgagg aagtctaagc ctggctv 27
230 31 DNA Artificial Sequence Synthetic 230 atgacgtggc agacgagtct aagcctggct v 31
231 30 DNA Artificial Sequence Synthetic 231 caggctcagg ttgtggtgac actggtcaca 30
232 29 DNA Artificial Sequence Synthetic 232 cgcgccgagg tgtagagctt ccacttctv 29
233 33 DNA Artificial Sequence Synthetic 233 atgacgtggc agaccgtaga gcttccactt ctv 33
234 27 DNA Artificial Sequence Synthetic 234 tggccctgtg actatggctc tggcaca 27
235 30 DNA Artificial Sequence Synthetic 235 atgacgtggc agaccactag ggtcctggcv 30
236 27 DNA Artificial Sequence
Synthetic 236 cgcgccgagg tactagggtc ctggccv 27 237 28 DNA Artificial Sequence Synthetic 237 cccatcctga ccaccatccg ccgaatct 28 238 29 DNA Artificial Sequence Synthetic 238 cgcgccgagg agcctcttca atagcagtv 29 239 32 DNA Artificial Sequence Synthetic 239 atgacgtggc agacggcctc ttcaatagca gv 32 240 32 DNA Artificial Sequence Synthetic 240 ggagtcaaga cccagatgtc ccctgacttg tt 32 241 33 DNA Artificial Sequence Synthetic 241 atgacgtggc agacgtcaca caaggagtct tcv 33 242 30 DNA Artificial Sequence Synthetic 242 cgcgccgagg atcacacaag gagtcttcav 30 243 41 DNA Artificial Sequence Synthetic 243 cgactgtcca gttaaatgca tcagaagtgt tagcttctcc t 41 244 38 DNA Artificial Sequence Synthetic 244 atgacgtggc agacggagtt aaagtcatta ctgtagav 38 245 35 DNA Artificial Sequence Synthetic 245 cgcgccgagg agagttaaag tcattactgt agagv 35 246 25 DNA Artificial Sequence Synthetic 246 gagacacctc ccactcgtcc ggcaa 25 247 28 DNA Artificial Sequence Synthetic 247 cgcgccgagg tgtacacaga gcatggav 28 248 31 DNA Artificial Sequence Synthetic 248 atgacgtggc agaccgtaca cagagcatgg v 31 249 30 DNA Artificial Sequence Synthetic 249 ccaaggctga tgacattgtt ggccctgtgt 30 250 30 DNA Artificial Sequence Synthetic 250 cgcgccgagg acgcatgaaa tctttgagav 30 251 33 DNA Artificial Sequence Synthetic 251 atgacgtggc agacgcgcat gaaatctttg agv 33 252 37 DNA Artificial Sequence Synthetic 252 cactcccaaa ttcaatattg acatattccc ccgggca 37 253 26 DNA Artificial Sequence Synthetic 253 cgcgccgagg tcttgggctc tggagv 26 254 30 DNA Artificial Sequence Synthetic 254 atgacgtggc agacccttgg gctctggagv 30 255 22 DNA Artificial Sequence Synthetic 255 cgcctggcag aggaccctgc ct 22 256 25 DNA Artificial Sequence Synthetic 256 cgcgccgagg aagcccaggt accgv 25 257 29 DNA Artificial Sequence Synthetic 257 atgacgtggc agacgagccc aggtaccgv 29 258 28 DNA Artificial Sequence Synthetic 258 ccgtgcagag tggtgtgggc actttgaa 28 259 28 DNA Artificial Sequence Synthetic 259 cgcgccgagg tggtgttgcc aaacttgv 28 260 31 DNA Artificial Sequence Synthetic 260 atgacgtggc agaccggtgt 
tgccaaactt v 31 261 43 DNA Artificial Sequence Synthetic 261 ggttctcccg agaggtaaag aacaaagact tcaaagacac ttc 43 262 28 DNA Artificial Sequence Synthetic 262 cgcgccgagg tcttcactgg tcagctcv 28 263 30 DNA Artificial Sequence Synthetic 263 atgacgtggc agacgcttca ctggtcagcv 30 264 43 DNA Artificial Sequence Synthetic 264 tgttgaacag tcttcaaggt gggatcgtaa taatggcaaa agt 43 265 29 DNA Artificial Sequence Synthetic 265 cgcgccgagg acctcaccaa gaatttggv 29 266 33 DNA Artificial Sequence Synthetic 266 atgacgtggc agacgcctca ccaagaattt ggv 33 267 53 DNA Artificial Sequence Synthetic 267 cagcaatttc ctcaaaagac tttcctttgg tttctggaac tttaaaaaat gtt 53 268 31 DNA Artificial Sequence Synthetic 268 atgacgtggc agacgaacag ggtaaaggcc v 31 269 28 DNA Artificial Sequence Synthetic 269 cgcgccgagg aaacagggta aaggccav 28 270 26 DNA Artificial Sequence Synthetic 270 ggcccagaag acccccctcg gaatct 26 271 26 DNA Artificial Sequence Synthetic 271 cgcgccgagg agagcaggga ggatgv 26 272 30 DNA Artificial Sequence Synthetic 272 atgacgtggc agacggagca gggaggatgv 30 273 29 DNA Artificial Sequence Synthetic 273 ctccatccgc atcggcctct atgactcct 29 274 28 DNA Artificial Sequence Synthetic 274 cgcgccgagg atcaagcagg tgtacacv 28 275 32 DNA Artificial Sequence Synthetic 275 atgacgtggc agacgtcaag caggtgtaca cv 32 276 29 DNA Artificial Sequence Synthetic 276 ggacactggt cggcaatcct cagcacagt 29 277 29 DNA Artificial Sequence Synthetic 277 atgacgtggc agacgacgcc acttcccav 29 278 25 DNA Artificial Sequence Synthetic 278 cgcgccgagg cacgccactt cccav 25 279 25 DNA Artificial Sequence Synthetic 279 caccaggctg ccttggccac agaaa 25 280 31 DNA Artificial Sequence Synthetic 280 cgcgccgagg tacttactga aatgcccttg v 31 281 34 DNA Artificial Sequence Synthetic 281 atgacgtggc agaccactta ctgaaatgcc cttv 34 282 23 DNA Artificial Sequence Synthetic 282 gcctctgacc ccatggcagg ggt 23 283 29 DNA Artificial Sequence Synthetic 283 cgcgccgagg acagagtatt tgagcagcv 29 284 33 DNA Artificial Sequence Synthetic 284 atgacgtggc agacgcagag tatttgagca gcv 33 
285 20 DNA Artificial Sequence Synthetic 285 gctggggccc cactgcccat 20 286 28 DNA Artificial Sequence Synthetic 286 cgcgccgagg atgtcacctt ggatggcv 28 287 31 DNA Artificial Sequence Synthetic 287 atgacgtggc agacgtgtca ccttggatgg v 31 288 35 DNA Artificial Sequence Synthetic 288 gttcatcttt ggttttgtgg gcaacatgct ggtct 35 289 33 DNA Artificial Sequence Synthetic 289 cgcgccgagg atcctcatct taataaactg cav 33 290 37 DNA Artificial Sequence Synthetic 290 atgacgtggc agacgtcctc atcttaataa actgcav 37 291 30 DNA Artificial Sequence Synthetic 291 gctccacttt caacttgtcc ccctccagct 30 292 29 DNA Artificial Sequence Synthetic 292 atgacgtggc agacgtcacc tgggaggcv 29 293 25 DNA Artificial Sequence Synthetic 293 cgcgccgagg atcacctggg aggcv 25 294 47 DNA Artificial Sequence Synthetic 294 gctctcttca tcatagtgaa gtcttcctta tccagcatct tgttcaa 47 295 31 DNA Artificial Sequence Synthetic 295 cgcgccgagg atgaacaaga tgctggataa v 31 296 34 DNA Artificial Sequence Synthetic 296 atgacgtggc agacgtgaac aagatgctgg atav 34 297 46 DNA Artificial Sequence Synthetic 297 gtcctgtctc tgcaaataat gatgctttcg aagtttcagt tgaaca 46 298 26 DNA Artificial Sequence Synthetic 298 cgcgccgagg tgtccctcgc gaaaav 26 299 28 DNA Artificial Sequence Synthetic 299 atgacgtggc agaccgtccc tcgcgaav 28 300 27 DNA Artificial Sequence Synthetic 300 gccatctcct tctttgcgct cccagct 27 301 25 DNA Artificial Sequence Synthetic 301 cgcgccgagg agtaggtgcc ccgtv 25 302 28 DNA Artificial Sequence Synthetic 302 atgacgtggc agacggtagg tgccccgv 28 303 36 DNA Artificial Sequence Synthetic 303 ggcaggatga aaacacttac gtcggaggat ctctct 36 304 33 DNA Artificial Sequence Synthetic 304 atgacgtggc agacgttgct ttctgacgta ccv 33 305 29 DNA Artificial Sequence Synthetic 305 cgcgccgagg attgctttct gacgtaccv 29 306 25 DNA Artificial Sequence Synthetic 306 ggagccaagc actgctcctc ccact 25 307 29 DNA Artificial Sequence Synthetic 307 atgacgtggc agacggccag catgaggcv 29 308 25 DNA Artificial Sequence Synthetic 308 cgcgccgagg agccagcatg aggcv 25 309 26 DNA Artificial 
Sequence Synthetic 309 cgcgttcagt ccgtgcatgc ggttct 26 310 26 DNA Artificial Sequence Synthetic 310 atgacgtggc agaccgctcc cgggcv 26 311 22 DNA Artificial Sequence Synthetic 311 cgcgccgagg agctcccggg cv 22 312 39 DNA Artificial Sequence Synthetic 312 tggattatct aaatgaaaca cagcagctta ctccagagt 39 313 27 DNA Artificial Sequence Synthetic 313 cgcgccgagg atcaagtcca aggccav 27 314 31 DNA Artificial Sequence Synthetic 314 atgacgtggc agacgtcaag tccaaggcca v 31 315 28 DNA Artificial Sequence Synthetic 315 cggcttgcag acaccgtgga aggttcta 28 316 29 DNA Artificial Sequence Synthetic 316 atgacgtggc agaccctggg actgctggv 29 317 25 DNA Artificial Sequence Synthetic 317 cgcgccgagg tctgggactg ctggv 25 318 28 DNA Artificial Sequence Synthetic 318 ccatggggtc ccatgctggc aggataaa 28 319 28 DNA Artificial Sequence Synthetic 319 cgcgccgagg tgggttcctg ctctaacv 28 320 31 DNA Artificial Sequence Synthetic 320 atgacgtggc agaccgggtt cctgctctaa v 31 321 29 DNA Artificial Sequence Synthetic 321 ctccctgcag gtcacagtca ccaccatct 29 322 27 DNA Artificial Sequence Synthetic 322 cgcgccgagg agctatgggg acaaggv 27 323 31 DNA Artificial Sequence Synthetic 323 atgacgtggc agacggctat ggggacaagg v 31 324 23 DNA Artificial Sequence Synthetic 324 gcctggtccc caaggtaggg gct 23 325 31 DNA Artificial Sequence Synthetic 325 atgacgtggc agaccaggtc gaggtagcag v 31 326 27 DNA Artificial Sequence Synthetic 326 cgcgccgagg aaggtcgagg tagcagv 27 327 26 DNA Artificial Sequence Synthetic 327 ccctaccttg gggaccaggc ccttga 26 328 29 DNA Artificial Sequence Synthetic 328 atgacgtggc agaccgctgt ggaaccagv 29 329 26 DNA Artificial Sequence Synthetic 329 cgcgccgagg tgctgtggaa ccaggv 26 330 43 DNA Artificial Sequence Synthetic 330 gggaggacaa tcctgtggaa aggaaggttt ttataatgtg ttt 43 331 32 DNA Artificial Sequence Synthetic 331 atgacgtggc agacctgaga aggagggtga cv 32 332 28 DNA Artificial Sequence Synthetic 332 cgcgccgagg atgagaagga gggtgacv 28 333 36 DNA Artificial Sequence Synthetic 333 cctgtctgta tccagctttg cagttggtgg aatgaa 36 334 33 
DNA Artificial Sequence Synthetic 334 atgacgtggc agacctgcat cattctttgg tgv 33 335 30 DNA Artificial Sequence Synthetic 335 cgcgccgagg ttgcatcatt ctttggtggv 30 336 44 DNA Artificial Sequence Synthetic 336 ggaaagaaga aagagcagag gagggagatt ggaagtagaa atgt 44 337 29 DNA Artificial Sequence Synthetic 337 cgcgccgagg atgaatgcag aggcaaaav 29 338 32 DNA Artificial Sequence Synthetic 338 atgacgtggc agacctgaat gcagaggcaa av 32 339 55 DNA Artificial Sequence Synthetic 339 ggcacaaacc agataatatt aagggaaatt tggaattcag aaatgttcac ttcat 55 340 32 DNA Artificial Sequence Synthetic 340 cgcgccgagg attacccatc tcgaaaagaa gv 32 341 35 DNA Artificial Sequence Synthetic 341 atgacgtggc agacgttacc catctcgaaa agaav 35 342 25 DNA Artificial Sequence Synthetic 342 tcccaccccc actggactca ccact 25 343 31 DNA Artificial Sequence Synthetic 343 atgacgtggc agacgtgatg gcaggtgaag v 31 344 27 DNA Artificial Sequence Synthetic 344 cgcgccgagg atgatggcag gtgaagv 27 345 26 DNA Artificial Sequence Synthetic 345 ggtgccggca ggcaagatag acagct 26 346 32 DNA Artificial Sequence Synthetic 346 atgacgtggc agacggtgga gtagaagagc tv 32 347 29 DNA Artificial Sequence Synthetic 347 cgcgccgagg agtggagtag aagagctgv 29 348 51 DNA Artificial Sequence Synthetic 348 ggttcagtcc acataatgca ttttctcctt caattctgaa aagtagctaa c 51 349 30 DNA Artificial Sequence Synthetic 349 cgcgccgagg tgctcatttg gtagtgaagv 30 350 34 DNA Artificial Sequence Synthetic 350 atgacgtggc agacggctca tttggtagtg aagv 34 351 24 DNA Artificial Sequence Synthetic 351 cggccactga gggagaaggc cact 24 352 28 DNA Artificial Sequence Synthetic 352 atgacgtggc agacggacgt gatgccgv 28 353 25 DNA Artificial Sequence Synthetic 353 cgcgccgagg agacgtgatg ccgcv 25 354 27 DNA Artificial Sequence Synthetic 354 gggtctccac cacggctttc tggtggt 27 355 28 DNA Artificial Sequence Synthetic 355 atgacgtggc agacgccgcc tcctcagv 28 356 24 DNA Artificial Sequence Synthetic 356 cgcgccgagg accgcctcct cagv 24 357 26 DNA Artificial Sequence Synthetic 357 ctgagccatg gtggccatga agggga 26 358 27 DNA 
Artificial Sequence Synthetic 358 cgcgccgagg ttctgggtca catggcv 27 359 31 DNA Artificial Sequence Synthetic 359 atgacgtggc agacctctgg gtcacatggc v 31 360 28 DNA Artificial Sequence Synthetic 360 ggtgccttct gatggggacg tgtctgct 28 361 30 DNA Artificial Sequence Synthetic 361 atgacgtggc agacgccagg agagaagggv 30 362 27 DNA Artificial Sequence Synthetic 362 cgcgccgagg accaggagag aagggav 27 363 39 DNA Artificial Sequence Synthetic 363 ctgccttgta ccagcattac aaataatcca gccacaaat 39 364 37 DNA Artificial Sequence Synthetic 364 atgacgtggc agacgtaaat gcttttcatt tctgctv 37 365 33 DNA Artificial Sequence Synthetic 365 cgcgccgagg ataaatgctt ttcatttctg ctv 33 366 33 DNA Artificial Sequence Synthetic 366 accaacgttg acatgcacgt ccagaattga ggt 33 367 30 DNA Artificial Sequence Synthetic 367 atgacgtggc agacggaggt tgcctttgcv 30 368 27 DNA Artificial Sequence Synthetic 368 cgcgccgagg agaggttgcc tttgctv 27 369 32 DNA Artificial Sequence Synthetic 369 acactaaggt ctcatcaggg tttgggtggc at 32 370 32 DNA Artificial Sequence Synthetic 370 atgacgtggc agacgaagga atggaaccag gv 32 371 28 DNA Artificial Sequence Synthetic 371 cgcgccgagg aaaggaatgg aaccaggv 28 372 34 DNA Artificial Sequence Synthetic 372 cctagatgcc ctgcagaatc cttcctgtta cgga 34 373 28 DNA Artificial Sequence Synthetic 373 atgacgtggc agaccccccc tccctgav 28 374 25 DNA Artificial Sequence Synthetic 374 cgcgccgagg tccccctccc tgaav 25 375 21 DNA Artificial Sequence Synthetic 375 gcactggcca cccgggacgc t 21 376 25 DNA Artificial Sequence Synthetic 376 cgcgccgagg ccccccaagg aaggv 25 377 29 DNA Artificial Sequence Synthetic 377 atgacgtggc agacgccccc aaggaaggv 29 378 27 DNA Artificial Sequence Synthetic 378 caggggtgga tggtctctca ctcccct 27 379 31 DNA Artificial Sequence Synthetic 379 atgacgtggc agacgggcct gtattcagtc v 31 380 27 DNA Artificial Sequence Synthetic 380 cgcgccgagg cggcctgtat tcagtcv 27 381 31 DNA Artificial Sequence Synthetic 381 tggtgaccct gcccagatgt gaagtgtaca t 31 382 27 DNA Artificial Sequence Synthetic 382 cgcgccgagg 
actctgtgtt ggggagv 27 383 30 DNA Artificial Sequence Synthetic 383 atgacgtggc agacgctctg tgttggggav 30 384 34 DNA Artificial Sequence Synthetic 384 ctcagcctta aaaagacctc cagggcttga tgca 34 385 28 DNA Artificial Sequence Synthetic 385 cgcgccgagg tggtatgttg tcaggctv 28 386 31 DNA Artificial Sequence Synthetic 386 atgacgtggc agaccggtat gttgtcaggc v 31 387 33 DNA Artificial Sequence Synthetic 387 gctggaggag gctatgagaa gtgaggtttg cat 33 388 28 DNA Artificial Sequence Synthetic 388 cgcgccgagg agaagaaaga ggggcagv 28 389 31 DNA Artificial Sequence Synthetic 389 atgacgtggc agacggaaga aagaggggca v 31 390 38 DNA Artificial Sequence Synthetic 390 caatgggacg ccatagaggg cttttgagta gacatatt 38 391 30 DNA Artificial Sequence Synthetic 391 cgcgccgagg atcagtgtag aagggtgaav 30 392 33 DNA Artificial Sequence Synthetic 392 atgacgtggc agacgtcagt gtagaagggt gav 33 393 51 DNA Artificial Sequence Synthetic 393 acacatgtgt ttcattttta gttttgttaa aaaaaaattc tgacaaatca t 51 394 31 DNA Artificial Sequence Synthetic 394 atgacgtggc agacgaaatg ggggttcagg v 31 395 28 DNA Artificial Sequence Synthetic 395 cgcgccgagg aaaatggggg ttcaggav 28 396 29 DNA Artificial Sequence Synthetic 396 ggaggagagc aggcattggg ctaaggagc 29 397 27 DNA Artificial Sequence Synthetic 397 atgacgtggc agacggggca gtgggcv 27 398 23 DNA Artificial Sequence Synthetic 398 cgcgccgagg tgggcagtgg gcv 23 399 36 DNA Artificial Sequence Synthetic 399 gggacccatt cctgtgtaat acaatgtctg caccat 36 400 35 DNA Artificial Sequence Synthetic 400 cgcgccgagg atgctaataa agtcctattc tcttv 35 401 38 DNA Artificial Sequence Synthetic 401 atgacgtggc agacgtgcta ataaagtcct attctctv 38 402 28 DNA Artificial Sequence Synthetic 402 gacagaggct tctagagggg ccagcagt 28 403 31 DNA Artificial Sequence Synthetic 403 atgacgtggc agacgtttgg ggagacttgg v 31 404 28 DNA Artificial Sequence Synthetic 404 cgcgccgagg atttggggag acttgggv 28 405 26 DNA Artificial Sequence Synthetic 405 cctccaggct ggccccctag attgct 26 406 29 DNA Artificial Sequence Synthetic 406 atgacgtggc 
agacgtctgc tcctggcav 29 407 26 DNA Artificial Sequence Synthetic 407 cgcgccgagg atctgctcct ggcatv 26 408 25 DNA Artificial Sequence Synthetic 408 tggactctga gccccacctg cgaga 25 409 33 DNA Artificial Sequence Synthetic 409 atgacgtggc agacccccta gaatcacaga gav 33 410 30 DNA Artificial Sequence Synthetic 410 cgcgccgagg tccctagaat cacagagagv 30 411 24 DNA Artificial Sequence Synthetic 411 gggtgctgtc cacactggct ccct 24 412 29 DNA Artificial Sequence Synthetic 412 atgacgtggc agacgtcagg gagcagccv 29 413 25 DNA Artificial Sequence Synthetic 413 cgcgccgagg atcagggagc agccv 25 414 31 DNA Artificial Sequence Synthetic 414 tcatgaacag caaaggcgtg agcctcttcg t 31 415 29 DNA Artificial Sequence Synthetic 415 cgcgccgagg acatcatcaa ccctgagav 29 416 32 DNA Artificial Sequence Synthetic 416 atgacgtggc agacgcatca tcaaccctga gv 32 417 22 DNA Artificial Sequence Synthetic 417 ggtggggctg ggctgctagg gt 22 418 32 DNA Artificial Sequence Synthetic 418 atgacgtggc agacgatcca gatggcatgt gv 32 419 28 DNA Artificial Sequence Synthetic 419 cgcgccgagg aatccagatg gcatgtgv 28 420 25 DNA Artificial Sequence Synthetic 420 cttgggccac ggagggcaat gacct 25 421 24 DNA Artificial Sequence Synthetic 421 cgcgccgagg aagggtgccc ctgv 24 422 28 DNA Artificial Sequence Synthetic 422 atgacgtggc agacgagggt gcccctgv 28 423 29 DNA Artificial Sequence Synthetic 423 agtgtggtgc agaaaaccct tcaccccct 29 424 33 DNA Artificial Sequence Synthetic 424 atgacgtggc agacgtgtca aaaggagctg acv 33 425 29 DNA Artificial Sequence Synthetic 425 cgcgccgagg atgtcaaaag gagctgacv 29 426 32 DNA Artificial Sequence Synthetic 426 ggtctctacc ttgggtgctg ttctctgcct ct 32 427 29 DNA Artificial Sequence Synthetic 427 cgcgccgagg aggagctctc tgtcaattv 29 428 31 DNA Artificial Sequence Synthetic 428 atgacgtggc agacgggagc tctctgtcaa v 31 429 31 DNA Artificial Sequence Synthetic 429 gtagggagaa gtgcggcaca gctaaaggag t 31 430 28 DNA Artificial Sequence Synthetic 430 atgacgtggc agacgagcgc ctgcaccv 28 431 24 DNA Artificial Sequence Synthetic 431 cgcgccgagg 
aagcgcctgc accv 24 432 43 DNA Artificial Sequence Synthetic 432 gctacgtttt cttctcagtt gaacagacac ggtagaagac tcc 43 433 33 DNA Artificial Sequence Synthetic 433 atgacgtggc agacgcccat tttggaatgt gav 33 434 30 DNA Artificial Sequence Synthetic 434 cgcgccgagg tcccattttg gaatgtgacv 30 435 26 DNA Artificial Sequence Synthetic 435 catgaccagg gtgcaagcac tgggct 26 436 33 DNA Artificial Sequence Synthetic 436 atgacgtggc agacgttgtt ctgtgggagt agv 33 437 30 DNA Artificial Sequence Synthetic 437 cgcgccgagg attgttctgt gggagtaggv 30 438 24 DNA Artificial Sequence Synthetic 438 ggagaggaca ccagggtggg ggtt 24 439 33 DNA Artificial Sequence Synthetic 439 atgacgtggc agacgaagga gacactactg ccv 33 440 29 DNA Artificial Sequence Synthetic 440 cgcgccgagg aaaggagaca ctactgccv 29 441 31 DNA Artificial Sequence Synthetic 441 gcggagagac agggagatga cgccttaaag t 31 442 28 DNA Artificial Sequence Synthetic 442 atgacgtggc agacggtccg cgacatgv 28 443 25 DNA Artificial Sequence Synthetic 443 cgcgccgagg agtccgcgac atgcv 25 444 0 DNA Unknown Intentionally omitted sequence. 444 000 445 0 DNA Unknown Intentionally omitted sequence. 
445 000 446 18 DNA Artificial Sequence Synthetic 446 cgggctaccc atgggaca 18 447 28 DNA Artificial Sequence Synthetic 447 gtcttctggt attaagccgt aatttgca 28 448 26 DNA Artificial Sequence Synthetic 448 cagtntcacc agctgtggta gaacca 26 449 23 DNA Artificial Sequence Synthetic 449 aagaggagca tcactgtgac cca 23 450 31 DNA Artificial Sequence Synthetic 450 tcccttcctc agattatatt catcccagaa a 31 451 27 DNA Artificial Sequence Synthetic 451 tcaaccccct gacattatct tggatcc 27 452 30 DNA Artificial Sequence Synthetic 452 cactccccaa catctcattt atttttcaca 30 453 26 DNA Artificial Sequence Synthetic 453 gtcatggcaa tcagttggtg aaagca 26 454 27 DNA Artificial Sequence Synthetic 454 tcttctttag actgccacga ggaaaaa 27 455 27 DNA Artificial Sequence Synthetic 455 gggagatgag gtactcacta gttaaca 27 456 21 DNA Artificial Sequence Synthetic 456 ccctgaggaa ctcacgcaga c 21 457 21 DNA Artificial Sequence Synthetic 457 gcacctcttt gcgcaggaag a 21 458 20 DNA Artificial Sequence Synthetic 458 agtggtggcg ctctcacaaa 20 459 30 DNA Artificial Sequence Synthetic 459 catttgttca ggcattacag taaaatgcca 30 460 20 DNA Artificial Sequence Synthetic 460 cagggacaat cccatcccca 20 461 28 DNA Artificial Sequence Synthetic 461 gtgaattgtc catgatgaga gccactac 28 462 20 DNA Artificial Sequence Synthetic 462 tgtcccagac tgggtcagca 20 463 24 DNA Artificial Sequence Synthetic 463 gaatgaagaa ggtactgtgg gcca 24 464 27 DNA Artificial Sequence Synthetic 464 ctggaaactt ctgccagatt gttccta 27 465 24 DNA Artificial Sequence Synthetic 465 caaaggactc cttgtcccct agaa 24 466 19 DNA Artificial Sequence Synthetic 466 ggtcctttgc gcaaaggca 19 467 20 DNA Artificial Sequence Synthetic 467 tttcagctcc cctcctccca 20 468 22 DNA Artificial Sequence Synthetic 468 gtctgccttc tcacagcttt cc 22 469 26 DNA Artificial Sequence Synthetic 469 aggtgtaact tgagtctctg cctaac 26 470 16 DNA Artificial Sequence Synthetic 470 agctgctggg ccagca 16 471 28 DNA Artificial Sequence Synthetic 471 caagctttaa aggcagtcga cattaaga 28 472 19 DNA Artificial Sequence Synthetic 
472 gccagggatc tagggctcc 19 473 18 DNA Artificial Sequence Synthetic 473 cccgtcctac ccagacga 18 474 19 DNA Artificial Sequence Synthetic 474 tcctgctgac attccgcca 19 475 19 DNA Artificial Sequence Synthetic 475 ggtgcaccac ccattccca 19 476 31 DNA Artificial Sequence Synthetic 476 gcaatcctgg ttaaggactt aagaattgtc a 31 477 23 DNA Artificial Sequence Synthetic 477 acaaaccaac gccacttcct aac 23 478 0 DNA Unknown Intentionally omitted sequence. 478 000 479 0 DNA Unknown Intentionally omitted sequence. 479 000 480 17 DNA Artificial Sequence Synthetic 480 cagcgtggca gagtggc 17 481 26 DNA Artificial Sequence Synthetic 481 actctacgat gtgggcattt cagaga 26 482 18 DNA Artificial Sequence Synthetic 482 gcgcacctgt ccgtagca 18 483 21 DNA Artificial Sequence Synthetic 483 gccccaacaa gctctcactc a 21 484 37 DNA Artificial Sequence Synthetic 484 gaagagggtg aatactataa aaatagactt accttcc 37 485 31 DNA Artificial Sequence Synthetic 485 ttggctcaaa tcgtgggata attctaagaa a 31 486 16 DNA Artificial Sequence Synthetic 486 ccaccgccac ctccga 16 487 19 DNA Artificial Sequence Synthetic 487 gacccagcag aggtccgaa 19 488 40 DNA Artificial Sequence Synthetic 488 tttcaaaact atcaggacct ttatcattca taggaaataa 40 489 30 DNA Artificial Sequence Synthetic 489 ttttaagata cctttccaag ttctccctca 30 490 0 DNA Unknown Intentionally omitted sequence. 490 000 491 0 DNA Unknown Intentionally omitted sequence. 491 000 492 28 DNA Artificial Sequence Synthetic 492 gacttacatt aggcagtgac tcgatgaa 28 493 27 DNA Artificial Sequence Synthetic 493 cattgctgag aacattgcct atggaga 27 494 19 DNA Artificial Sequence Synthetic 494 ccctggaggg agttgaccc 19 495 22 DNA Artificial Sequence Synthetic 495 gctcagtatg cctttcctcc cc 22 496 0 DNA Unknown Intentionally omitted sequence. 496 000 497 0 DNA Unknown Intentionally omitted sequence. 
497 000 498 17 DNA Artificial Sequence Synthetic 498 gcaacccggg aacggca 17 499 20 DNA Artificial Sequence Synthetic 499 tcgtcccttt cctgcgtgac 20 500 23 DNA Artificial Sequence Synthetic 500 cctgctgacc aagaataagg ccc 23 501 26 DNA Artificial Sequence Synthetic 501 cattggcata gcagttgatg gcttcc 26 502 20 DNA Artificial Sequence Synthetic 502 catcagcatc ggttctgccc 20 503 18 DNA Artificial Sequence Synthetic 503 ggcgatgctc agcccgaa 18 504 25 DNA Artificial Sequence Synthetic 504 tccctctgtt tctttccctc acaga 25 505 25 DNA Artificial Sequence Synthetic 505 ggttgctgaa gttgtgtgtg atcac 25 506 34 DNA Artificial Sequence Synthetic 506 ttccttagat tcttctttgg agcagaataa aaga 34 507 24 DNA Artificial Sequence Synthetic 507 cacaccatgt gaggtcatca gcaa 24 508 20 DNA Artificial Sequence Synthetic 508 gctccaggga ggactcacca 20 509 22 DNA Artificial Sequence Synthetic 509 catgacctca gggatgccca ca 22 510 27 DNA Artificial Sequence Synthetic 510 ggcccgaaca tagtaattcc tggtaaa 27 511 17 DNA Artificial Sequence Synthetic 511 cgagtgggag aggccca 17 512 33 DNA Artificial Sequence Synthetic 512 tgtattacat aaaccctact ccaaacaaat gca 33 513 23 DNA Artificial Sequence Synthetic 513 gccagcaaac acatccagga aca 23 514 28 DNA Artificial Sequence Synthetic 514 cgtttcttcc atccttccag gatttgaa 28 515 26 DNA Artificial Sequence Synthetic 515 acctctctgt gctttctgta tcctca 26 516 0 DNA Unknown Intentionally omitted sequence. 516 000 517 0 DNA Unknown Intentionally omitted sequence. 
517 000 518 22 DNA Artificial Sequence Synthetic 518 caggcaactg gaactgaaac cc 22 519 22 DNA Artificial Sequence Synthetic 519 ctcagcttcc aagggccatt ca 22 520 23 DNA Artificial Sequence Synthetic 520 ggctggacat ccacttcatc cac 23 521 19 DNA Artificial Sequence Synthetic 521 cgtagaaaga gccgggcca 19 522 26 DNA Artificial Sequence Synthetic 522 agaatcggct gtctttgatg ctgtaa 26 523 50 DNA Artificial Sequence Synthetic 523 cttatacttt tagaaaaaag aagacattat caagatattc atttttgtca 50 524 40 DNA Artificial Sequence Synthetic 524 acgagcataa gaacttaata atgtcaagag aaattttaga 40 525 26 DNA Artificial Sequence Synthetic 525 tgactacagc aagtatctgg actcca 26 526 18 DNA Artificial Sequence Synthetic 526 acaggtcccc tccgctca 18 527 19 DNA Artificial Sequence Synthetic 527 ggccaggcac aggctgaaa 19 528 22 DNA Artificial Sequence Synthetic 528 ccagagccac tacctttgtc ca 22 529 32 DNA Artificial Sequence Synthetic 529 ggtgtcttag gagagaaaaa aaggtagaaa aa 32 530 22 DNA Artificial Sequence Synthetic 530 tgaccaaatg ccctcacctt ca 22 531 27 DNA Artificial Sequence Synthetic 531 cacaatatgc tggatgactc ctcagac 27 532 23 DNA Artificial Sequence Synthetic 532 tggttgttga ggtccctgaa tcc 23 533 18 DNA Artificial Sequence Synthetic 533 cagggtccag ctggagca 18 534 22 DNA Artificial Sequence Synthetic 534 gagaggcacc cttcacagga aa 22 535 32 DNA Artificial Sequence Synthetic 535 aagaaaatac ttctttgagc tcaactacga ac 32 536 17 DNA Artificial Sequence Synthetic 536 gaaggagccc tgcccca 17 537 20 DNA Artificial Sequence Synthetic 537 agacccccaa gggatcctcc 20 538 20 DNA Artificial Sequence Synthetic 538 ttcggctcct gccacatcaa 20 539 24 DNA Artificial Sequence Synthetic 539 tgcctctcac ttcctctcct taca 24 540 24 DNA Artificial Sequence Synthetic 540 gtagccagac tgatcactcc caaa 24 541 18 DNA Artificial Sequence Synthetic 541 gcagcagcag cagcagca 18 542 18 DNA Artificial Sequence Synthetic 542 gacgttgccg aagcccac 18 543 24 DNA Artificial Sequence Synthetic 543 agataggcaa accctacaac agca 24 544 17 DNA Artificial Sequence Synthetic 
544 cacaaagcgg gccctcc 17 545 21 DNA Artificial Sequence Synthetic 545 ccccgaggaa tacgtgctga c 21 546 22 DNA Artificial Sequence Synthetic 546 cagtaggctg tggtcctcat ca 22 547 22 DNA Artificial Sequence Synthetic 547 gcccattgta gctgaggagg ac 22 548 23 DNA Artificial Sequence Synthetic 548 gggtgctggt ctcataggtc tca 23 549 25 DNA Artificial Sequence Synthetic 549 agctcctaca tcaccagtga gatcc 25 550 20 DNA Artificial Sequence Synthetic 550 ccggtttggt tctcccgaga 20 551 22 DNA Artificial Sequence Synthetic 551 cccaaggagg agctgctgaa ga 22 552 0 DNA Unknown Intentionally omitted sequence. 552 000 553 0 DNA Unknown Intentionally omitted sequence. 553 000 554 25 DNA Artificial Sequence Synthetic 554 gctcaggaac ttcaggattg ctacc 25 555 39 DNA Artificial Sequence Synthetic 555 agaaacaaag tagatgcatt tgattcaagt ttcttaaaa 39 556 0 DNA Unknown Intentionally omitted sequence. 556 000 557 0 DNA Unknown Intentionally omitted sequence. 557 000 558 25 DNA Artificial Sequence Synthetic 558 cagggtccta cacacaaatc agtca 25 559 24 DNA Artificial Sequence Synthetic 559 ccttctgtct cggtttcttc tcca 24 560 22 DNA Artificial Sequence Synthetic 560 gttcagggac ctggtcactc ac 22 561 16 DNA Artificial Sequence Synthetic 561 ggcctgcagc gccaga 16 562 18 DNA Artificial Sequence Synthetic 562 agggcgttgg cgttttcc 18 563 21 DNA Artificial Sequence Synthetic 563 agggcttgat ggcctctcag a 21 564 24 DNA Artificial Sequence Synthetic 564 agcctaccca tcttccattc ctca 24 565 18 DNA Artificial Sequence Synthetic 565 gccaggcccc cttaggac 18 566 22 DNA Artificial Sequence Synthetic 566 gcttctgcac tgaaagggct ca 22 567 19 DNA Artificial Sequence Synthetic 567 ccacatggcc taccctccc 19 568 17 DNA Artificial Sequence Synthetic 568 gcctgtgccc agcagca 17 569 17 DNA Artificial Sequence Synthetic 569 gcctaccctg gcagccc 17 570 21 DNA Artificial Sequence Synthetic 570 caactcctgc ctccgctcta c 21 571 24 DNA Artificial Sequence Synthetic 571 gccaggttga gcaggtaaat gtca 24 572 23 DNA Artificial Sequence Synthetic 572 ccactctcct cctacactgt 
ccc 23 573 19 DNA Artificial Sequence Synthetic 573 aggcccccat cgatctccc 19 574 29 DNA Artificial Sequence Synthetic 574 gcaaagatgg ctctcttcat catagtgaa 29 575 21 DNA Artificial Sequence Synthetic 575 cgttctcaca tgcatgcccc c 21 576 20 DNA Artificial Sequence Synthetic 576 accaaaatcg aggtggccca 20 577 38 DNA Artificial Sequence Synthetic 577 catcagaaag aaaaatgaat ctgcaacttc aatagtca 38 578 19 DNA Artificial Sequence Synthetic 578 caggacccca gctgtccaa 19 579 19 DNA Artificial Sequence Synthetic 579 cgggaagacc atcgcctcc 19 580 22 DNA Artificial Sequence Synthetic 580 gcacccctat gaagacccag aa 22 581 24 DNA Artificial Sequence Synthetic 581 caaaggtcac ttcaggttga ggca 24 582 23 DNA Artificial Sequence Synthetic 582 tccagtgttg tagccaaact gca 23 583 23 DNA Artificial Sequence Synthetic 583 gtgtggtttg tttctccgca gaa 23 584 16 DNA Artificial Sequence Synthetic 584 ggcaccacct tgcgca 16 585 24 DNA Artificial Sequence Synthetic 585 cgcctggagc gttttaaatt gaga 24 586 37 DNA Artificial Sequence Synthetic 586 gttgaaataa cattcaagtt ttcccttact caagtaa 37 587 28 DNA Artificial Sequence Synthetic 587 cagaatatgg tcctctttgc tcctaaca 28 588 18 DNA Artificial Sequence Synthetic 588 cagcagaacc acgggcac 18 589 24 DNA Artificial Sequence Synthetic 589 ccacacgctt ccctctaatt ggac 24 590 22 DNA Artificial Sequence Synthetic 590 agctggaggg cagtatcact ca 22 591 21 DNA Artificial Sequence Synthetic 591 ggtggacagg aagcatgtcc c 21 592 19 DNA Artificial Sequence Synthetic 592 gaggcgatgg tcttcccga 19 593 19 DNA Artificial Sequence Synthetic 593 ccgtggctga ccactgtcc 19 594 19 DNA Artificial Sequence Synthetic 594 cgagctgcgg ccattctca 19 595 19 DNA Artificial Sequence Synthetic 595 acctggttcc acagcgcaa 19 596 20 DNA Artificial Sequence Synthetic 596 gaccgtctgc tacctcgacc 20 597 26 DNA Artificial Sequence Synthetic 597 caagattccc atttggagga acggaa 26 598 35 DNA Artificial Sequence Synthetic 598 agcagctaat aataaaccag taatttggga tagac 35 599 19 DNA Artificial Sequence Synthetic 599 gtgactccga gggcagaca 19 
600 36 DNA Artificial Sequence Synthetic 600 gtttatgctt atttatgaaa tttgcctacc ttccaa 36 601 23 DNA Artificial Sequence Synthetic 601 ggcagctgct caactaatca cca 23 602 24 DNA Artificial Sequence Synthetic 602 ctgtctgctc ctctctcatc atcc 24 603 24 DNA Artificial Sequence Synthetic 603 ggacagaagc aagtctgcag atca 24 604 37 DNA Artificial Sequence Synthetic 604 tctacaagaa aacatcagaa actcttcatt caataga 37 605 29 DNA Artificial Sequence Synthetic 605 gaagccaagt attgacagct attcgaaga 29 606 20 DNA Artificial Sequence Synthetic 606 gggaagggtc aggaaagcca 20 607 23 DNA Artificial Sequence Synthetic 607 cgagagcgga ttgagttcct caa 23 608 18 DNA Artificial Sequence Synthetic 608 gagccacgag ctcccaca 18 609 22 DNA Artificial Sequence Synthetic 609 ggtagccctt taaaaggcct cc 22 610 0 DNA Unknown Intentionally omitted sequence. 610 000 611 0 DNA Unknown Intentionally omitted sequence. 611 000 612 30 DNA Artificial Sequence Synthetic 612 tgacatgttc gaaacctgtc cataaagtaa 30 613 34 DNA Artificial Sequence Synthetic 613 ggaaagaaaa gcttttgttc agagctttag aaaa 34 614 0 DNA Unknown Intentionally omitted sequence. 614 000 615 0 DNA Unknown Intentionally omitted sequence. 
615 000 616 21 DNA Artificial Sequence Synthetic 616 gctgatctgc ttctcccacg a 21 617 19 DNA Artificial Sequence Synthetic 617 agtcgtcgta gccagcgaa 19 618 21 DNA Artificial Sequence Synthetic 618 gcagggctcc ttactgcaga a 21 619 21 DNA Artificial Sequence Synthetic 619 cacgccaccc atcctcaaag a 21 620 16 DNA Artificial Sequence Synthetic 620 gcacagggcg ctcacc 16 621 18 DNA Artificial Sequence Synthetic 621 cctaccagca gccgctca 18 622 22 DNA Artificial Sequence Synthetic 622 aggctccctt agatgcctga ca 22 623 17 DNA Artificial Sequence Synthetic 623 cagggcgctg acaccca 17 624 27 DNA Artificial Sequence Synthetic 624 atttctcctc tgtgtcttga agggaac 27 625 18 DNA Artificial Sequence Synthetic 625 ctgcccccct caccctac 18 626 23 DNA Artificial Sequence Synthetic 626 cctttcattt ttcccggcac aga 23 627 21 DNA Artificial Sequence Synthetic 627 gggaacttct ttcccctcgc a 21 628 25 DNA Artificial Sequence Synthetic 628 ggagtttctg tcctgggagg aaaaa 25 629 20 DNA Artificial Sequence Synthetic 629 aacactcgtg aagctggcca 20 630 18 DNA Artificial Sequence Synthetic 630 ggccacagag cctggaga 18 631 19 DNA Artificial Sequence Synthetic 631 cggcttgcct gtgcagtca 19 632 19 DNA Artificial Sequence Synthetic 632 gccagccccc ttcctttcc 19 633 22 DNA Artificial Sequence Synthetic 633 acactgccag gagacacaga ac 22 634 23 DNA Artificial Sequence Synthetic 634 ggagcagatc ctggcaaaga tcc 23 635 22 DNA Artificial Sequence Synthetic 635 cgtactgcac aaacttgctg ca 22 636 32 DNA Artificial Sequence Synthetic 636 ggtgtaggta gagataagaa gagtgatact ca 32 637 19 DNA Artificial Sequence Synthetic 637 gctggtgact tgccccaga 19 638 24 DNA Artificial Sequence Synthetic 638 tgtcataatg cagtgggatt gcca 24 639 22 DNA Artificial Sequence Synthetic 639 caagctggca atggtggaca ca 22 640 0 DNA Unknown Intentionally omitted sequence. 640 000 641 0 DNA Unknown Intentionally omitted sequence. 
641 000 642 26 DNA Artificial Sequence Synthetic 642 ttttaactct ctgctgttcc ctcacc 26 643 25 DNA Artificial Sequence Synthetic 643 actgacaggg aatctccaga agtca 25 644 0 DNA Unknown Intentionally omitted sequence. 644 000 645 0 DNA Unknown Intentionally omitted sequence. 645 000 646 0 DNA Unknown Intentionally omitted sequence. 646 000 647 0 DNA Unknown Intentionally omitted sequence. 647 000 648 27 DNA Artificial Sequence Synthetic 648 gacctgattt gattgagagc cttgaac 27 649 29 DNA Artificial Sequence Synthetic 649 acaagccctg gactagatga tttctaaga 29 650 0 DNA Unknown Intentionally omitted sequence. 650 000 651 0 DNA Unknown Intentionally omitted sequence. 651 000 652 0 DNA Unknown Intentionally omitted sequence. 652 000 653 0 DNA Unknown Intentionally omitted sequence. 653 000 654 23 DNA Artificial Sequence Synthetic 654 ggtgatgcaa aagatggaag cca 23 655 26 DNA Artificial Sequence Synthetic 655 ccattttgga atgtgaccgt ctgtcc 26 656 20 DNA Artificial Sequence Synthetic 656 gagagggtca tgcagtggca 20 657 21 DNA Artificial Sequence Synthetic 657 tccggtgctc catggatgac a 21 658 25 DNA Artificial Sequence Synthetic 658 gccatactgc agcactttaa aggac 25 659 28 DNA Artificial Sequence Synthetic 659 ctgctgtgat ttatctgctg aaagctca 28 660 23 DNA Artificial Sequence Synthetic 660 catctaactg ctccccagtc aca 23 661 19 DNA Artificial Sequence Synthetic 661 gccctcggtc ctccaggaa 19 662 19 DNA Artificial Sequence Synthetic 662 ccacccaccc aggacacac 19 663 18 DNA Artificial Sequence Synthetic 663 ccccaacggc caggcaaa 18 664 31 DNA Artificial Sequence Synthetic 664 ggacaaatgt tctgggtctc taatattcca a 31 665 17 DNA Artificial Sequence Synthetic 665 gggtgggacg gagtccc 17 666 24 DNA Artificial Sequence Synthetic 666 gtttgcctta ccttggaagt ggac 24 667 27 DNA Artificial Sequence Synthetic 667 tgctgagaag attgacaggt tcatgca 27 668 0 DNA Unknown Intentionally omitted sequence. 668 000 669 0 DNA Unknown Intentionally omitted sequence. 669 000 670 0 DNA Unknown Intentionally omitted sequence. 
670 000 671 0 DNA Unknown Intentionally omitted sequence. 671 000 672 0 DNA Unknown Intentionally omitted sequence. 672 000 673 0 DNA Unknown Intentionally omitted sequence. 673 000 674 17 DNA Artificial Sequence Synthetic 674 ggcccagcca ctgacca 17 675 25 DNA Artificial Sequence Synthetic 675 cttcttggct gttgtttctg ttccc 25 676 16 DNA Artificial Sequence Synthetic 676 cccgactgtg ccgcca 16 677 23 DNA Artificial Sequence Synthetic 677 gccctttttc caggtctgac aac 23 678 23 DNA Artificial Sequence Synthetic 678 cctctcaatg ggtcacttgg caa 23 679 28 DNA Artificial Sequence Synthetic 679 gtccaaattt ctgttgggtt cagtgaaa 28 680 20 DNA Artificial Sequence Synthetic 680 gaagggccaa tagccctccc 20 681 21 DNA Artificial Sequence Synthetic 681 gccccagcca agaaaggtca a 21 682 22 DNA Artificial Sequence Synthetic 682 ttctcctggc ctgtagggag aa 22 683 21 DNA Artificial Sequence Synthetic 683 ccctcgtcac ttcctctgtc c 21 684 20 DNA Artificial Sequence Synthetic 684 gtgtctcctt cacccccacc 20 685 29 DNA Artificial Sequence Synthetic 685 actttcccct tttcatgcct tattctgac 29 686 0 DNA Unknown Intentionally omitted sequence. 686 000 687 0 DNA Unknown Intentionally omitted sequence. 687 000 688 16 DNA Artificial Sequence Synthetic 688 cgaccaggca ggccac 16 689 19 DNA Artificial Sequence Synthetic 689 gtcctgctca cacagcccc 19 690 0 DNA Unknown Intentionally omitted sequence. 690 000 691 0 DNA Unknown Intentionally omitted sequence. 691 000 692 23 DNA Artificial Sequence Synthetic 692 ggtgatgcaa aagatggaag cca 23 693 26 DNA Artificial Sequence Synthetic 693 ccattttgga atgtgaccgt ctgtcc 26 694 0 DNA Unknown Intentionally omitted sequence. 694 000 695 0 DNA Unknown Intentionally omitted sequence. 695 000 696 0 DNA Unknown Intentionally omitted sequence. 696 000 697 0 DNA Unknown Intentionally omitted sequence. 
697 000 698 19 DNA Artificial Sequence Synthetic 698 cggaatttcc tggccccca 19 699 16 DNA Artificial Sequence Synthetic 699 gtcccggcct gtccca 16 700 537 DNA Homo sapiens misc_feature (275)..(275) n is c or t. 700 aagttagaag aaccaagact atcttgtcag gggtgtattt tgagagtggc agacttttca 60 gtgcctttcc attcatgaca cttcttgaat ctctggcaga accagccagc cgtgttcaca 120 gtgtcaaatg aagggatgtc tttgattgct tccaggtgtt cctcagcacc accggagggg 180 gatgggtgat cagccgaatc tttgactcgg gctacccatg ggacatggtg ttcatgacac 240 gctttcagaa catgttgaga aattccctcc caacnccaat tgtgacttgg ttgatggagc 300 gaaagataaa caactggctc aatcatgcaa attacggctt aataccagaa gacaggtaaa 360 tataatgtga ctgccaaggg cttttaggaa gaaggagcct ctgcctgtcc agcagcctat 420 acaagccagg cagtaccaca gcaacatggc tgaatgtgtg ggaacacttg atacaaattt 480 gcttgataat aacagctaac tgttcttaag tactcagaaa gtgaaattat gtatttc 537 701 18 DNA Homo sapiens 701 cgggctaccc atgggaca 18 702 31 DNA Homo sapiens 702 tctggtatta agccgtaatt tgcatgattg a 31 703 19 DNA Artificial Sequence Synthetic 703 ctgttcttcc tgaagcctc 19 704 17 DNA Artificial Sequence Synthetic 704 ttgaggttgg tgccttc 17 705 19 DNA Artificial Sequence Synthetic 705 aagagtgtat tgagagcct 19 706 20 DNA Artificial Sequence Synthetic 706 tcagccttaa aaagacctcc 20 707 19 DNA Artificial Sequence Synthetic 707 ctcgtcactt cctctgtcc 19 708 16 DNA Artificial Sequence Synthetic 708 gggagaagtg cggcac 16 709 18 DNA Artificial Sequence Synthetic 709 gaccttatgt gtttttcc 18 710 22 DNA Artificial Sequence Synthetic 710 caatttcctc aaaagacttt cc 22 711 20 DNA Artificial Sequence Synthetic 711 aaggacttaa gaattgtcac 20 712 17 DNA Artificial Sequence Synthetic 712 cctcaatcct tcaccgc 17 713 19 DNA Artificial Sequence Synthetic 713 gaggtagtgt ttacagccc 19 714 22 DNA Artificial Sequence Synthetic 714 tcacatctcg agtgataatc tc 22 715 17 DNA Artificial Sequence Synthetic 715 tgatgggaga cgagttc 17 716 18 DNA Artificial Sequence Synthetic 716 tgcacacaca cacatacc 18 717 15 DNA Artificial Sequence Synthetic 717 aggtcccctc cgctc 15 718 16 DNA 
Artificial Sequence Synthetic 718 cacagtggtg ttggac 16 719 18 DNA Artificial Sequence Synthetic 719 atcctgaaga gcaagtcc 18 720 18 DNA Artificial Sequence Synthetic 720 attccggttt ggttctcc 18 721 17 DNA Artificial Sequence Synthetic 721 ctgaaaccca ggactcc 17 722 19 DNA Artificial Sequence Synthetic 722 ggaacaatca ccttttctc 19 723 122 DNA Homo sapiens 723 aggctccctt agatgcctga cattctgttc ttcctgaagc ctcactccct tctctcctgg 60 ctgcagacac gtccccatca gaaggcacca acctcaacgc gcccaacagc ctgggtgtca 120 gc 122 724 122 DNA Homo sapiens 724 taaaatcatt tatttattta tccatccatc aagagtgtat tgagagcctg acaacatacc 60 aggcatcaag ccctggaggt ctttttaagg ctgagccaat atagctatgg ataacattct 120 aa 122 725 91 DNA Homo sapiens 725 ccctcgtcac ttcctctgtc ctgtggggtg ggggtgcagg cgctctctcc tttagctgtg 60 ccgcacttct ccctacaggc caggagaaac a 91 726 122 DNA Homo sapiens 726 cttctgtgga ccttatgtgt ttttcctctt tgctggagtg ctcctggcct ttaccctgtt 60 ctacattttt taaagttcca gaaaccaaag gaaagtcttt tgaggaaatt gctgcagaat 120 tc 122 727 122 DNA Homo sapiens 727 tggttaagga cttaagaatt gtcacttgtg tgtgtatatt gttgttgttg ttgcaacggt 60 gtctgtgtac gcacggttac agtggatcaa atttggggag ttaggaagtg gcgttggttt 120 gt 122 728 91 DNA Homo sapiens 728 cgaggtagtg tttacagccc tcatgaacag caaaggcgtg agcctcttcg agcatcatca 60 accctgagat tatcactcga gatgtgagta c 91 729 91 DNA Homo sapiens 729 ggagacgagt tcaaggtgag tgggtggggc tgggctgcta ggggaatcca gatggcatgt 60 ggtatgtgtg tgtgtgcaca cgcatgggga g 91 730 122 DNA Homo sapiens 730 cagccacagg tcccctccgc tcaggtgatg gacttcctgt ttgagaagtg gaagctctac 60 aggtgaccag tgtcaccaca acctgagcct gctgccccct cccacgggtg agccccccac 120 cc 122 731 122 DNA Homo sapiens 731 agagcaagtc ccccaaggag gagctgctga agatgtgggg ggaggagctg accagtgaag 60 acaagtgtct ttgaagtctt tgttctttac ctctcgggag aaccaaaccg gaatggtcac 120 aa 122 732 122 DNA Homo sapiens 732 ggaactgaaa cccaggactc cgtctcttgc cagtgaaagt tatgttagga agcagtgagg 60 tggtctaaag cagtatgaaa ggcaaagaga aaaggtgatt gttccctctt gaatggccct 120 tg 122 733 19 DNA Artificial Sequence Synthetic 
733 ctgggctggg agcagcctc 19 734 23 DNA Artificial Sequence Synthetic 734 cactcgctgg cctgtttcat gtc 23 735 22 DNA Artificial Sequence Synthetic 735 ctggaatccg gtgtcgaagt gg 22 736 20 DNA Artificial Sequence Synthetic 736 ctcggcccct gcactgtttc 20 737 22 DNA Artificial Sequence Synthetic 737 gaggcaagaa ggagtgtcag gg 22 738 23 DNA Artificial Sequence Synthetic 738 agtcctgtgg tgaggtgacg agg 23 739 15 DNA Artificial Sequence Synthetic 739 ggtagtgagg caggt 15 740 16 DNA Artificial Sequence Synthetic 740 gcttctggta ggggag 16 741 19 DNA Artificial Sequence Synthetic 741 aaataggact aggacctgt 19 742 15 DNA Artificial Sequence Synthetic 742 gggtcccacg gaaat 15 743 12 DNA Artificial Sequence Synthetic 743 catggccacg cg 12 744 13 DNA Artificial Sequence Synthetic 744 ccggcacctc tcg 13 745 14 DNA Artificial Sequence Synthetic 745 ccgtcctcct gcat 14 746 17 DNA Artificial Sequence Synthetic 746 cactctcacc ttctcca 17 747 17 DNA Artificial Sequence Synthetic 747 gttctgtccc gagtatg 17 748 16 DNA Artificial Sequence Synthetic 748 tgcactgttt cccaga 16 749 17 DNA Artificial Sequence Synthetic 749 ctgacctcct ccaacat 17 750 15 DNA Artificial Sequence Synthetic 750 gggctatcac caggt 15 751 17 DNA Artificial Sequence Synthetic 751 ctgacctcct ccaacat 17 752 15 DNA Artificial Sequence Synthetic 752 gggctatcac caggt 15 753 10 DNA Artificial Sequence Synthetic 753 cgcgccgagg 10 754 14 DNA Artificial Sequence Synthetic 754 atgacgtggc agac 14 755 12 DNA Artificial Sequence Synthetic 755 acggacgcgg ag 12 756 11 DNA Artificial Sequence Synthetic 756 tccgcgcgtc c 11 757 22 DNA Artificial Sequence Synthetic 757 gaagcggcgc cggttaccac ca 22 758 27 DNA Artificial Sequence Synthetic 758 cgcgccgagg tggttgagca attccaa 27 759 30 DNA Artificial Sequence Synthetic 759 atgacgtggc agaccggttg agcaattcca 30
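
The listing above pairs short synthetic oligonucleotides with Homo sapiens target fragments; for example, SEQ ID NO: 703 and SEQ ID NO: 704 both match SEQ ID NO: 723. The following Python sketch is illustrative only and is not part of the application as filed. It locates the region of SEQ ID NO: 723 bounded by those two oligonucleotides, under the assumption (not stated in the listing itself) that 703 acts as a forward primer and 704 as a reverse primer. The names `reverse_complement` and `locate_amplicon` are hypothetical helpers introduced for this example.

```python
from typing import Optional

# Illustrative sketch only; not part of the application as filed.
# Sequences are copied from the listing above (SEQ ID NO: 723, 703, 704).
# ASSUMPTION: 703 is used as a forward primer and 704 as a reverse primer.

# SEQ ID NO: 723 as printed in the listing (blocks of ten bases); spaces removed.
TARGET_723 = (
    "aggctccctt agatgcctga cattctgttc ttcctgaagc ctcactccct tctctcctgg "
    "ctgcagacac gtccccatca gaaggcacca acctcaacgc gcccaacagc ctgggtgtca "
    "gc"
).replace(" ", "")

OLIGO_703 = "ctgttcttcctgaagcctc"  # SEQ ID NO: 703
OLIGO_704 = "ttgaggttggtgccttc"    # SEQ ID NO: 704

# Translation table mapping each lowercase base to its complement.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_amplicon(target: str, forward: str, reverse: str) -> Optional[str]:
    """Return the target region from the forward-primer site through the
    reverse-primer site, or None if either primer has no exact match."""
    start = target.find(forward)
    if start < 0:
        return None
    # The reverse primer anneals to the opposite strand, so search the
    # given strand for its reverse complement, downstream of the forward site.
    reverse_site = reverse_complement(reverse)
    site_start = target.find(reverse_site, start)
    if site_start < 0:
        return None
    return target[start:site_start + len(reverse_site)]


if __name__ == "__main__":
    amplicon = locate_amplicon(TARGET_723, OLIGO_703, OLIGO_704)
    print(len(TARGET_723), amplicon)
```

The same check can be repeated for other oligonucleotide and target entries in the listing; the sketch uses exact string matching only and makes no attempt to model mismatches or melting temperatures. Comparing `len(TARGET_723)` with the length declared in the listing (122) is a quick way to catch transcription errors when copying a sequence.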

Claims (1)

We claim:
1. A method comprising: a) providing a sample comprising 150 or more target nucleic acids; b) amplifying said 150 or more target nucleic acids using the polymerase chain reaction under conditions such that each of said target nucleic acids is amplified to generate 150 or more amplified targets; and c) exposing said 150 or more amplified targets to invasive cleavage assay reagents.
US10/321,039 2001-10-12 2002-12-17 Amplification methods and compositions Abandoned US20040014067A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/321,039 US20040014067A1 (en) 2001-10-12 2002-12-17 Amplification methods and compositions
US12/174,277 US7790393B2 (en) 2001-10-12 2008-07-16 Amplification methods and compositions

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US32911301P 2001-10-12 2001-10-12
US36048901P 2001-10-19 2001-10-19
US99815701A 2001-11-30 2001-11-30
US10/321,039 US20040014067A1 (en) 2001-10-12 2002-12-17 Amplification methods and compositions

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US99815701A Continuation-In-Part 2000-11-30 2001-11-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/174,277 Continuation US7790393B2 (en) 2001-10-12 2008-07-16 Amplification methods and compositions

Publications (1)

Publication Number Publication Date
US20040014067A1 (en) 2004-01-22

Family

ID=46298899

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/321,039 Abandoned US20040014067A1 (en) 2001-10-12 2002-12-17 Amplification methods and compositions
US12/174,277 Expired - Lifetime US7790393B2 (en) 2001-10-12 2008-07-16 Amplification methods and compositions

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/174,277 Expired - Lifetime US7790393B2 (en) 2001-10-12 2008-07-16 Amplification methods and compositions

Country Status (1)

Country Link
US (2) US20040014067A1 (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040096874A1 (en) * 2002-04-11 2004-05-20 Third Wave Technologies, Inc. Characterization of CYP 2D6 genotypes
US20060175058A1 (en) * 2005-02-08 2006-08-10 Halliburton Energy Services, Inc. Methods of creating high-porosity propped fractures using reticulated foam
US20060265133A1 (en) * 2003-02-21 2006-11-23 Vision Biosystems Limited Analysis system and procedures
US20070026439A1 (en) * 2005-07-15 2007-02-01 Applera Corporation Fluid processing device and method
US20080131870A1 (en) * 2006-06-09 2008-06-05 Third Wave Technologies, Inc. T-structure invasive cleavage assays, consistent nucleic acid dispensing, and low level target nucleic acid detection
US20090142752A1 (en) * 2006-10-04 2009-06-04 Third Wave Technologies, Inc. Snap-Back Primers And Detectable Hairpin Structures
US20090165323A1 (en) * 2007-12-27 2009-07-02 Daewoo Electronics Corporation Dryer
US20090215043A1 (en) * 2005-05-12 2009-08-27 Third Wave Technologies, Inc. Polymorphic GHSR Nucleic Acids And Uses Thereof
US8361720B2 (en) 2010-11-15 2013-01-29 Exact Sciences Corporation Real time cleavage assay
US8715937B2 (en) 2010-11-15 2014-05-06 Exact Sciences Corporation Mutation detection assay
WO2014158628A1 (en) 2013-03-14 2014-10-02 Hologic, Inc. Compositions and methods for analysis of nucleic acid molecules
US8916344B2 (en) 2010-11-15 2014-12-23 Exact Sciences Corporation Methylation assay
US9464320B2 (en) 2002-01-25 2016-10-11 Applied Biosystems, Llc Methods for placing, accepting, and filling orders for products and services
US11486877B2 (en) 2017-04-10 2022-11-01 Sysmex Corporation Measurement method, measuring apparatus, program, and method for obtaining and displaying qualitative determination result

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013511991A (en) * 2009-11-25 2013-04-11 クアンタライフ, インコーポレイテッド Methods and compositions for detecting genetic material
US20190010543A1 (en) 2010-05-18 2019-01-10 Natera, Inc. Methods for simultaneous amplification of target loci
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
CA2826748C (en) 2011-02-09 2020-08-04 Bio-Rad Laboratories, Inc. Method of detecting variations in copy number of a target nucleic acid
WO2012154555A1 (en) * 2011-05-06 2012-11-15 The General Hospital Corporation Nanocompositions for monitoring polymerase chain reaction (pcr)
US20140074659A1 (en) 2012-09-07 2014-03-13 Oracle International Corporation Ramped ordering for cloud services
US9621435B2 (en) * 2012-09-07 2017-04-11 Oracle International Corporation Declarative and extensible model for provisioning of cloud based services
US10148530B2 (en) 2012-09-07 2018-12-04 Oracle International Corporation Rule based subscription cloning
US9542400B2 (en) 2012-09-07 2017-01-10 Oracle International Corporation Service archive support
US9276942B2 (en) 2012-09-07 2016-03-01 Oracle International Corporation Multi-tenancy identity management system
US9467355B2 (en) 2012-09-07 2016-10-11 Oracle International Corporation Service association model
CN109971852A (en) 2014-04-21 2019-07-05 纳特拉公司 Detect the mutation and ploidy in chromosome segment
US9579606B2 (en) 2014-07-23 2017-02-28 Air Liquide Advanced Technologies U.S. Llc Gas separation membrane module with improved gas seal
US10164901B2 (en) 2014-08-22 2018-12-25 Oracle International Corporation Intelligent data center selection
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
US10142174B2 (en) 2015-08-25 2018-11-27 Oracle International Corporation Service deployment infrastructure request provisioning
US11485996B2 (en) 2016-10-04 2022-11-01 Natera, Inc. Methods for characterizing copy number variation using proximity-litigation sequencing
US10011870B2 (en) 2016-12-07 2018-07-03 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US10648025B2 (en) * 2017-12-13 2020-05-12 Exact Sciences Development Company, Llc Multiplex amplification detection assay II
CA3103811A1 (en) * 2018-06-29 2020-01-02 Covariance Biosciences, Llc Methods and compositions for improved multiplex genotyping and sequencing
US11525159B2 (en) 2018-07-03 2022-12-13 Natera, Inc. Methods for detection of donor-derived cell-free DNA

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4683195A (en) * 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4683202A (en) * 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
US4965188A (en) * 1986-08-22 1990-10-23 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences using a thermostable enzyme
US5538848A (en) * 1994-11-16 1996-07-23 Applied Biosystems Division, Perkin-Elmer Corp. Method for detecting nucleic acid amplification using self-quenching fluorescence probe
US5719028A (en) * 1992-12-07 1998-02-17 Third Wave Technologies Inc. Cleavase fragment length polymorphism
US5843654A (en) * 1992-12-07 1998-12-01 Third Wave Technologies, Inc. Rapid detection of mutations in the p53 gene
US5843669A (en) * 1996-01-24 1998-12-01 Third Wave Technologies, Inc. Cleavage of nucleic acid acid using thermostable methoanococcus jannaschii FEN-1 endonucleases
US5888780A (en) * 1992-12-07 1999-03-30 Third Wave Technologies, Inc. Rapid detection and identification of nucleic acid variants
US5985557A (en) * 1996-01-24 1999-11-16 Third Wave Technologies, Inc. Invasive cleavage of nucleic acids
US5994069A (en) * 1996-01-24 1999-11-30 Third Wave Technologies, Inc. Detection of nucleic acids by multiple sequential invasive cleavages
US6528254B1 (en) * 1999-10-29 2003-03-04 Stratagene Methods for detection of a target nucleic acid sequence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2243353C (en) 1996-01-24 2010-03-30 Third Wave Technologies, Inc. Invasive cleavage of nucleic acids

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4683202A (en) * 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
US4683202B1 (en) * 1985-03-28 1990-11-27 Cetus Corp
US4683195B1 (en) * 1986-01-30 1990-11-27 Cetus Corp
US4683195A (en) * 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4965188A (en) * 1986-08-22 1990-10-23 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences using a thermostable enzyme
US5888780A (en) * 1992-12-07 1999-03-30 Third Wave Technologies, Inc. Rapid detection and identification of nucleic acid variants
US5719028A (en) * 1992-12-07 1998-02-17 Third Wave Technologies Inc. Cleavase fragment length polymorphism
US5843654A (en) * 1992-12-07 1998-12-01 Third Wave Technologies, Inc. Rapid detection of mutations in the p53 gene
US5538848A (en) * 1994-11-16 1996-07-23 Applied Biosystems Division, Perkin-Elmer Corp. Method for detecting nucleic acid amplification using self-quenching fluorescence probe
US5843669A (en) * 1996-01-24 1998-12-01 Third Wave Technologies, Inc. Cleavage of nucleic acid acid using thermostable methoanococcus jannaschii FEN-1 endonucleases
US5846717A (en) * 1996-01-24 1998-12-08 Third Wave Technologies, Inc. Detection of nucleic acid sequences by invader-directed cleavage
US5985557A (en) * 1996-01-24 1999-11-16 Third Wave Technologies, Inc. Invasive cleavage of nucleic acids
US5994069A (en) * 1996-01-24 1999-11-30 Third Wave Technologies, Inc. Detection of nucleic acids by multiple sequential invasive cleavages
US6001567A (en) * 1996-01-24 1999-12-14 Third Wave Technologies, Inc. Detection of nucleic acid sequences by invader-directed cleavage
US6090543A (en) * 1996-01-24 2000-07-18 Third Wave Technologies, Inc. Cleavage of nucleic acids
US6528254B1 (en) * 1999-10-29 2003-03-04 Stratagene Methods for detection of a target nucleic acid sequence
US6548250B1 (en) * 1999-10-29 2003-04-15 Stratagene Methods for detection of a target nucleic acid sequence

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10689692B2 (en) 2002-01-25 2020-06-23 Applied Biosystems, Llc Methods for placing, accepting, and filling orders for products and services
US9464320B2 (en) 2002-01-25 2016-10-11 Applied Biosystems, Llc Methods for placing, accepting, and filling orders for products and services
US20040096874A1 (en) * 2002-04-11 2004-05-20 Third Wave Technologies, Inc. Characterization of CYP 2D6 genotypes
US20060265133A1 (en) * 2003-02-21 2006-11-23 Vision Biosystems Limited Analysis system and procedures
US8396669B2 (en) * 2003-02-21 2013-03-12 Leica Biosystems Melbourne Pty Lt Analysis system and procedures
US20060175058A1 (en) * 2005-02-08 2006-08-10 Halliburton Energy Services, Inc. Methods of creating high-porosity propped fractures using reticulated foam
US7939257B2 (en) 2005-05-12 2011-05-10 Third Wave Technologies, Inc. Polymorphic GHSR nucleic acids and uses thereof
US20090215043A1 (en) * 2005-05-12 2009-08-27 Third Wave Technologies, Inc. Polymorphic GHSR Nucleic Acids And Uses Thereof
US20070026439A1 (en) * 2005-07-15 2007-02-01 Applera Corporation Fluid processing device and method
US7759062B2 (en) 2006-06-09 2010-07-20 Third Wave Technologies, Inc. T-structure invasive cleavage assays, consistent nucleic acid dispensing, and low level target nucleic acid detection
US8354232B2 (en) 2006-06-09 2013-01-15 Third Wave Technologies, Inc. T-structure invasive cleavage assays, consistent nucleic acid dispensing, and low level target nucleic acid detection
US20080131870A1 (en) * 2006-06-09 2008-06-05 Third Wave Technologies, Inc. T-structure invasive cleavage assays, consistent nucleic acid dispensing, and low level target nucleic acid detection
US20100285488A1 (en) * 2006-06-09 2010-11-11 Third Wave Technologies, Inc. T-structure invasive cleavage assays, consistent nucleic acid dispensing, and low level target nucleic acid detection
US8911973B2 (en) 2006-10-04 2014-12-16 Third Wave Technologies, Inc. Snap-back primers and detectable hairpin structures
US8445238B2 (en) 2006-10-04 2013-05-21 Third Wave Technologies, Inc. Snap-back primers and detectable hairpin structures
US20090142752A1 (en) * 2006-10-04 2009-06-04 Third Wave Technologies, Inc. Snap-Back Primers And Detectable Hairpin Structures
US8069582B2 (en) * 2007-12-27 2011-12-06 Daewoo Electronics Corporation Dryer
US20090165323A1 (en) * 2007-12-27 2009-07-02 Daewoo Electronics Corporation Dryer
US8715937B2 (en) 2010-11-15 2014-05-06 Exact Sciences Corporation Mutation detection assay
US10000817B2 (en) 2010-11-15 2018-06-19 Exact Sciences Development Company, Llc Mutation detection assay
US9024006B2 (en) 2010-11-15 2015-05-05 Exact Sciences Corporation Mutation detection assay
US9121071B2 (en) 2010-11-15 2015-09-01 Exact Sciences Corporation Mutation detection assay
US9290797B2 (en) 2010-11-15 2016-03-22 Exact Sciences Corporation Real time cleavage assay
US9376721B2 (en) 2010-11-15 2016-06-28 Exact Sciences Corporation Mutation detection assay
US11845995B2 (en) 2010-11-15 2023-12-19 Exact Sciences Corporation Mutation detection assay
US8916344B2 (en) 2010-11-15 2014-12-23 Exact Sciences Corporation Methylation assay
US10450614B2 (en) 2010-11-15 2019-10-22 Exact Sciences Development Company, Llc Mutation detection assay
US10604793B2 (en) 2010-11-15 2020-03-31 Exact Sciences Development Company, Llc Real time cleavage assay
US8361720B2 (en) 2010-11-15 2013-01-29 Exact Sciences Corporation Real time cleavage assay
US11091812B2 (en) 2010-11-15 2021-08-17 Exact Sciences Development Company, Llc Mutation detection assay
US11685956B2 (en) 2010-11-15 2023-06-27 Exact Sciences Corporation Methylation assay
US11499179B2 (en) 2010-11-15 2022-11-15 Exact Sciences Development Company, Llc Real time cleavage assay
WO2014158628A1 (en) 2013-03-14 2014-10-02 Hologic, Inc. Compositions and methods for analysis of nucleic acid molecules
US11486877B2 (en) 2017-04-10 2022-11-01 Sysmex Corporation Measurement method, measuring apparatus, program, and method for obtaining and displaying qualitative determination result

Also Published As

Publication number Publication date
US20090068664A1 (en) 2009-03-12
US7790393B2 (en) 2010-09-07

Similar Documents

Publication Publication Date Title
US20040014067A1 (en) Amplification methods and compositions
CN107941681B (en) Method for identifying quantitative cellular composition in biological sample
US20030228617A1 (en) Method for predicting autoimmune diseases
US20030175736A1 (en) Expression profile of prostate cancer
KR101421326B1 (en) Composition for predicting prognosis of breast cancer and kit comprising the same
US20040096874A1 (en) Characterization of CYP 2D6 genotypes
KR20160127713A (en) Method for subtyping lymphoma types by means of expression profiling
KR20100095564A (en) Methods and compositions for assessing responsiveness of b-cell lymphoma to treatment with anti-cd40 antibodies
KR20110081807A (en) Genetic variants useful for risk assessment of thyroid cancer
US20110189663A1 (en) Assessment of risk for colorectal cancer
US20230416827A1 (en) Assay for distinguishing between sepsis and systemic inflammatory response syndrome
KR20140140069A (en) Compositions and methods for diagnosis and treatment of pervasive developmental disorder
MXPA05005653A (en) Heart failure gene determination and therapeutic screening.
JP2003144176A (en) Detection method for gene polymorphism
CN1704478A (en) Methods for assessing patients with acute myeloid leukemia
US20030235848A1 (en) Characterization of CYP 2D6 alleles
US20030219784A1 (en) Systems and methods for analysis of agricultural products
US20020137077A1 (en) Genes regulated in activated T cells
KR102624979B1 (en) B4GALT1 variants and their uses
KR100901147B1 (en) A marker for predicting recurrence of a uterine cancer patient treated with radiation therapy, a kit and microarray comprising the same and method for predicting recurrence of a uterine cancer patient after radiation therapy using the marker
US20040161773A1 (en) Subtelomeric DNA probes and method of producing the same
KR101653131B1 (en) Composition or Kit and Method for predicting prognosis of liver cancer
JP2003235573A (en) Diabetic nephropathy marker and its utilization
CA3067730A1 (en) Methods for detection of plasma cell dyscrasia
KR102250063B1 (en) Method for identifying causative genes of tourette syndrome

Legal Events

Date Code Title Description
AS Assignment

Owner name: THIRD WAVE TECHNOLOGIES, INC., WISCONSIN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LYAMICHEV, VICTOR;LUKOWIAK, ANDREW;JARVIS, NANCY;AND OTHERS;REEL/FRAME:014155/0562

Effective date: 20030410

AS Assignment

Owner name: GOLDMAN SACHS CREDIT PARTNERS L.P., AS COLLATERAL

Free format text: SECURITY AGREEMENT;ASSIGNOR:THIRD WAVE TECHNOLOGIES, INC.;REEL/FRAME:021301/0780

Effective date: 20080724

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: CYTYC PRENATAL PRODUCTS CORP., MASSACHUSETTS

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: HOLOGIC, INC., MASSACHUSETTS

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: CYTYC SURGICAL PRODUCTS LIMITED PARTNERSHIP, MASSA

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: THIRD WAVE TECHNOLOGIES, INC., WISCONSIN

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: CYTYC SURGICAL PRODUCTS II LIMITED PARTNERSHIP, MA

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: CYTYC CORPORATION, MASSACHUSETTS

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: BIOLUCENT, LLC, CALIFORNIA

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: R2 TECHNOLOGY, INC., CALIFORNIA

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: SUROS SURGICAL SYSTEMS, INC., INDIANA

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: CYTYC SURGICAL PRODUCTS III, INC., MASSACHUSETTS

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819

Owner name: DIRECT RADIOGRAPHY CORP., DELAWARE

Free format text: TERMINATION OF PATENT SECURITY AGREEMENTS AND RELEASE OF SECURITY INTERESTS;ASSIGNOR:GOLDMAN SACHS CREDIT PARTNERS, L.P., AS COLLATERAL AGENT;REEL/FRAME:024944/0315

Effective date: 20100819