WO2017139354A1 - Detection of nucleic acids - Google Patents

Publication number
WO2017139354A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
region
target nucleic
double
stranded
Prior art date
Application number
PCT/US2017/016977
Other languages
French (fr)
Inventor
Nils Walter
Muneesh Tewari
Alexander JOHNSON-BUCK
Original Assignee
The Regents Of The University Of Michigan
Priority date
Filing date
Publication date
Application filed by The Regents Of The University Of Michigan
Priority to US16/076,853 (US20190048415A1)
Priority to EP22179788.9A (EP4155397A1)
Priority to JP2018541687A (JP2019513345A)
Priority to EP20198764.1A (EP3800249B1)
Priority to CN202111607215.6A (CN114196746A)
Priority to ES17750678T (ES2835101T3)
Priority to EP17750678.9A (EP3414327B1)
Priority to CN201780020624.1A (CN109072205A)
Publication of WO2017139354A1
Priority to US17/319,289 (US20210348230A1)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6816 Hybridisation assays characterised by the detection means
    • C12Q1/6818 Hybridisation assays characterised by the detection means involving interaction of two or more labels, e.g. resonant energy transfer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6832 Enhancement of hybridisation reaction
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837 Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/80 Vectors containing sites for inducing double-stranded breaks, e.g. meganuclease restriction sites

Definitions

  • nucleic acid biomarkers Unfortunately, despite their promise as diagnostic biomarkers, the sensitive and specific detection of nucleic acid biomarkers has proven challenging. In particular, existing techniques for detecting nucleic acids utilize probes that form a
  • thermodynamically stable complex with the target molecule and are thus limited to weak and often unreliable thermodynamic discrimination against background signal, spurious targets, or closely related mutant nucleic acids.
  • the presence of a complementary DNA or RNA strand in the sample severely limits accessibility of the target sequence.
  • a sensitive and specific assay for the amplification-free detection of nucleic acids in minimally treated native biofluids is needed to provide a rapid and reliable identification and/or quantification of nucleic acid biomarkers.
  • a technology for the specific and ultrasensitive detection and counting of single nucleic acids, e.g., dsDNA, lncRNA, methylated DNA, etc.
  • target nucleic acid e.g., dsDNA, lncRNA, methylated DNA, etc.
  • the target nucleic acid is detected by a kinetic "fingerprint" signal produced by the probe-target interaction.
  • SiMREPS Single-Molecule Recognition through Equilibrium Poisson Sampling
  • the technology further comprises use of a protospacer adjacent motif (PAM) oligonucleotide to provide for dCas9 targeting of nucleic acid targets (e.g., lncRNA).
  • PAM protospacer adjacent motif
  • SiMREPS probe e.g., a capture probe and a query probe that are linked to provide an intramolecular probing mechanism (see, e.g., Fig. 5 a and b; see infra).
  • dCas9-mediated melting is followed by observation of the repeated, transient binding of a short fluorescently labeled DNA query probe to the second segment (e.g., query region) of the target nucleic acid that has been made accessible by dCas9-mediated melting.
  • a labeled (e.g., fluorescently labeled) dCas9/gRNA complex provides detection of the target nucleic acid.
  • the SiMREPS probes repeatedly bind to a target sequence (e.g., query region) specifically made accessible by binding of dCas9/gRNA to one or more nucleic acid region(s) adjacent to the target region (e.g., “adjacent regions”).
  • a target sequence e.g., query region
  • nucleic acid region(s) adjacent to the target region e.g., “adjacent regions”
  • the repeated binding of the query probes to the query region provides a unique, continuous kinetic "fingerprint", providing a large number of independent measurements for each observed target molecule.
  • This repeated kinetic sampling affords two main advantages: (1) arbitrarily high discrimination against background signals with increased sampling time, essentially eliminating false positive signals;
  • a complex for providing a detectable fingerprint of a double-stranded target nucleic acid comprising a double-stranded target nucleic acid (e.g., a DNA, an RNA, a DNA/RNA hybrid) comprising a first region adjacent to a second region; a melting component (e.g., an immobilized melting component) interacting with the first region to form a thermodynamically stable complex and provide the second region in a single-stranded form; and a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid.
  • a double-stranded target nucleic acid e.g., a DNA, an RNA, a DNA/RNA hybrid
  • a melting component e.g., an immobilized melting component
  • the query probe hybridizes repeatedly to the second region with a kinetic rate constant k_off that is greater than 0.1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 0.1 min⁻¹. In some embodiments, the query probe hybridizes repeatedly to the second region with a kinetic rate constant k_off that is greater than 1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 1 min⁻¹.
  • data are analyzed, e.g., in some embodiments the fingerprint is detectable by a pattern recognition analysis.
  • Additional embodiments provide a method for providing a detectable fingerprint of a double-stranded target nucleic acid in a sample, the method comprising
  • a double-stranded target nucleic acid immobilizing a double-stranded target nucleic acid to a discrete region of a solid support, said double-stranded target nucleic acid comprising a first region adjacent to a second region and said discrete region of said solid support comprising an immobilized melting component interacting with the first region; providing a query probe that binds repeatedly to the second region to provide a detectable fingerprint; and associating the detectable fingerprint with the double-stranded nucleic acid to identify the double-stranded nucleic acid.
  • Some embodiments comprise analyzing data using pattern recognition or a similar analysis (e.g., machine learning, neural network, supervised and/or unsupervised learning, etc.) to produce or identify the detectable fingerprint of the double stranded nucleic acid.
  • Additional embodiments comprise providing a second melting component that interacts with a third region of the target nucleic acid, said third target region adjacent to the second region.
  • some embodiments comprise providing a second melting component that interacts with a third region of the target nucleic acid, said third target region adjacent to the second region.
  • dCas9/gRNA complex comprising a gRNA complementary to a third region of the target nucleic acid adjacent to the second region of the target nucleic acid.
  • Some embodiments comprise detecting repeated binding of a fluorescently labeled nucleic acid to the second region with a kinetic rate constant k_off that is greater than 0.1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 0.1 min⁻¹. Some embodiments comprise detecting repeated binding of a fluorescently labeled nucleic acid to the second region with a kinetic rate constant k_off that is greater than 1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 1 min⁻¹.
  • Some embodiments comprise calculating an amount or concentration of the double-stranded target nucleic acid in the sample from the detectable fingerprint.
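The statement that repeated kinetic sampling gives increasing discrimination against background with longer sampling time can be illustrated numerically. The following is a minimal sketch, assuming that the number of binding and dissociation events observed per molecule is Poisson distributed with a mean proportional to the observation time. The function name and the two event rates are hypothetical and are not taken from the source.

```python
import math

def poisson_separation(rate_signal, rate_background, minutes):
    """Difference between the mean signal and mean background event counts,
    expressed in units of their combined standard deviation.
    Assumes both counts are Poisson distributed (variance equals mean)."""
    mean_signal = rate_signal * minutes
    mean_background = rate_background * minutes
    return (mean_signal - mean_background) / math.sqrt(mean_signal + mean_background)

# Hypothetical event rates in events per minute (illustrative only).
SIGNAL_RATE = 4.0
BACKGROUND_RATE = 1.0

for t in (1, 2, 5, 10):
    separation = poisson_separation(SIGNAL_RATE, BACKGROUND_RATE, t)
    print(f"{t:>2} min: separation = {separation:.2f}")
```

Under this assumption the separation grows with the square root of the observation time, so quadrupling the acquisition time doubles the separation between specific and background count distributions.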
  • Figure 1 a is a schematic drawing of an embodiment of the nucleic acid detection technology provided herein.
  • Figure 1 b shows exemplary SiMREPS data of fluorescently labeled query probes transiently associating non-specifically to a slide surface.
  • Figure 1 c shows exemplary SiMREPS data of fluorescently labeled query probes transiently binding to a target nucleic acid.
  • the kinetic fingerprints of 1c and 1b are different, thus providing examples of the different kinetic signatures for specific and non-specific binding of query probes.
  • Figure 1 d is a series of histograms indicating the number of query probes counted to have a given number of intensity transitions (Nb+d) in the absence (thick gray bars) or presence (thin lines) of 1 pM target nucleic acid (e.g., a miR-141 microRNA).
  • the four histograms plot data acquired with acquisition times of 1, 2, 5, and 10 minutes.
  • Figure 1 e shows plots of standard curves from SiMREPS assays of five miRNAs, yielding R² values > 0.99.
  • the SiMREPS technology provides high-confidence detection of nucleic acids.
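The number of intensity transitions (Nb+d) and the bound-state dwell times can be extracted from a single-molecule intensity trace. The following is a minimal sketch, assuming a simple intensity threshold separates bound from unbound frames; the source does not specify the trace analysis method, and the function names, trace values, threshold, and frame time are hypothetical.

```python
def count_transitions(intensity, threshold):
    """Return Nb+d, the number of switches between unbound and bound states."""
    states = [value > threshold for value in intensity]
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)

def bound_dwell_times(intensity, threshold, frame_time):
    """Return the duration of every bound interval whose start and end
    were both observed within the trace."""
    states = [value > threshold for value in intensity]
    dwells = []
    start = None
    for i in range(1, len(states)):
        if states[i] and not states[i - 1]:
            start = i  # binding event
        elif not states[i] and states[i - 1] and start is not None:
            dwells.append((i - start) * frame_time)  # dissociation event
            start = None
    return dwells

# Hypothetical trace (arbitrary intensity units), 0.5 min per frame.
trace = [0, 0, 10, 10, 10, 0, 0, 10, 0, 0, 10, 10]
n_bd = count_transitions(trace, threshold=5)
dwells = bound_dwell_times(trace, threshold=5, frame_time=0.5)
mean_bound = sum(dwells) / len(dwells)
# For a single-exponential dwell-time distribution, k_off is approximately
# the reciprocal of the mean bound dwell time.
k_off_estimate = 1.0 / mean_bound
print(n_bd, dwells, k_off_estimate)
```

The final bound interval in the example is excluded from the dwell times because its dissociation is not observed before the trace ends.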
  • Figure 2 b is a dwell time analysis showing the high-confidence single-copy-level discrimination between let-7a (closed circles) and let-7c (open circles).
  • Figure 2 c is a receiver operating characteristic (ROC) plot constructed by varying the τ_on threshold for discriminating between let-7a and let-7c.
  • ROC receiver operating characteristic
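An ROC plot of this kind can be constructed by sweeping a dwell-time threshold across two sets of per-molecule values. The following is a minimal sketch; the dwell-time values are hypothetical, and it is assumed for illustration that the fully matched target (here labeled let-7a) shows the longer bound dwell times.

```python
def roc_points(positives, negatives):
    """Return (false positive rate, true positive rate) pairs obtained by
    sweeping a threshold from high to low over every observed value.
    A molecule is called positive when its value is >= the threshold."""
    thresholds = sorted(set(positives) | set(negatives), reverse=True)
    points = [(0.0, 0.0)]
    for thr in thresholds:
        tpr = sum(v >= thr for v in positives) / len(positives)
        fpr = sum(v >= thr for v in negatives) / len(negatives)
        points.append((fpr, tpr))
    return points

def area_under_curve(points):
    """Trapezoidal area under ROC points ordered by increasing false positive rate."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical bound dwell times in seconds (illustrative only).
let7a_tau = [9.0, 10.5, 11.0, 12.5]
let7c_tau = [3.0, 4.0, 4.5, 9.5]

curve = roc_points(let7a_tau, let7c_tau)
print(curve)
print(area_under_curve(curve))
```

An area close to 1 indicates that a single threshold separates the two populations almost perfectly; an area of 0.5 indicates no discrimination.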
  • Figure 2 d is a Nb+d histogram for the detection of let-7 in crude HeLa cell extract in the presence or absence of the miRCURY let-7 inhibitor.
  • the Nb+d histogram for endogenous hsa-let-7a showed a well-defined peak (thin line) that vanished in the presence of a let-7 inhibitor designed to bind and sequester let-7 family members (thick grey bars).
  • Figure 2 e shows the dwell times for molecules detected in crude HeLa extract using the fluorescent and capture probes for let-7a.
  • the filled and open circles represent two clusters of target molecules classified by k-means clustering of τ_on values, consistent with the expected τ_on distributions for single-nucleotide mutants hsa-let-7a and hsa-let-7c.
  • Figure 2 f shows the quantification of synthetic miR-141 spiked into human serum.
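The k-means classification of dwell times mentioned for Figure 2 e can be sketched for scalar data as follows. This is a minimal illustration of Lloyd's algorithm with two clusters, not the analysis used in the source; the function name, the initialization at the minimum and maximum values, and the dwell-time values are assumptions.

```python
def kmeans_two_clusters(values, iterations=100):
    """Lloyd's algorithm with k=2 on scalar data.
    Returns (centers, labels); label 0 is the cluster seeded at the minimum."""
    centers = [min(values), max(values)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins the nearer center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical bound dwell times in seconds (illustrative only).
tau_on = [3.0, 4.0, 4.5, 10.5, 11.0, 12.5]
centers, labels = kmeans_two_clusters(tau_on)
print(centers, labels)
```

With well-separated populations the two clusters correspond to the short-dwell and long-dwell species; overlapping populations would require a probabilistic classifier or additional kinetic features.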
  • FIG. 3 is a schematic diagram showing an embodiment of the technology comprising a dCas9/gRNA. Genomic target DNA is briefly pre-treated with
  • the guide RNA (gRNA) of the dCas9/gRNA complex hybridizes to the target nucleic acid (e.g., a genomic DNA) at a region adjacent to a query region (e.g., complementary to a query probe).
  • the dCas9/gRNA melts a specific DNA sequence (e.g., the query region) in the target nucleic acid.
  • SiMREPS is used to detect binding of the query probe to the now accessible query region in the target nucleic acid, which is adjacent to the site of target nucleic acid hybridized to the gRNA.
  • Figure 4 is a schematic diagram showing an embodiment of the technology comprising hybridization of two flanking dCas9/gRNA complexes to a target nucleic acid.
  • hybridizing two dCas9/gRNA complexes to flank the query region further improves accessibility of the target nucleic acid (e.g., the query region) to the query probe for SiMREPS-based detection, and, in some embodiments, increases specificity.
  • one dCas9 is biotinylated for capture onto the surface and the other dCas9 is not biotinylated, though in some embodiments the second dCas9 is modified, e.g., biotinylated.
  • Figure 5 a is a schematic showing an embodiment of intramolecular SiMREPS probing in which the query and capture probes are linked on a contiguous
  • Figure 5 b is a schematic showing an embodiment of intramolecular SiMREPS probing in which the non-contiguous query and capture probes are co-localized by an address oligonucleotide.
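Selecting gRNA binding sites adjacent to a query region can be sketched as a search for protospacers followed by a PAM. The following is a minimal illustration that assumes the commonly used Streptococcus pyogenes Cas9 conventions (a 20-nucleotide protospacer followed by an NGG PAM on the same strand); the source does not specify the Cas9 ortholog, PAM sequence, or protospacer length, and the function name and example sequence are hypothetical.

```python
import re

def find_protospacers(sequence, protospacer_length=20):
    """Return (start, protospacer, pam) for every position on the given strand
    where a protospacer of the requested length is followed by an NGG PAM.
    Overlapping sites are reported; the reverse strand is not searched."""
    sequence = sequence.upper()
    # A lookahead makes each match zero-width, so overlapping sites are found.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_length}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(sequence)]

# Hypothetical sequence: 2-nt prefix, 20-nt protospacer, TGG PAM, 4-nt suffix.
example = "TT" + "GACCTGAAGCTTACGATCAG" + "TGG" + "ATTT"
sites = find_protospacers(example)
print(sites)
```

Candidate sites found on either side of a query region could then be compared by their distance from that region when choosing one or two flanking dCas9/gRNA complexes.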
  • nucleic acid or a “nucleic acid sequence” refers to a polymer or oligomer of pyrimidine and/or purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively (See Albert L. Lehninger, Principles of
  • the present technology contemplates any deoxyribonucleotide, ribonucleotide, or peptide nucleic acid component, and any chemical variants thereof, such as methylated, hydroxymethylated, or glycosylated forms of these bases, and the like.
  • the polymers or oligomers may be heterogenous or homogenous in composition, and may be isolated from naturally occurring sources or may be artificially or synthetically produced.
  • the nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states.
  • a nucleic acid or nucleic acid sequence comprises other kinds of nucleic acid structures such as, for instance, a DNA/RNA helix, peptide nucleic acid (PNA), morpholino, locked nucleic acid (LNA), and/or a ribozyme.
  • PNA peptide nucleic acid
  • LNA locked nucleic acid
  • nucleic acid or “nucleic acid sequence” may also encompass a chain comprising non-natural nucleotides, modified nucleotides, and/or non-nucleotide building blocks that can exhibit the same function as natural nucleotides (e.g., “nucleotide analogs”); further, the term “nucleic acid sequence” as used herein refers to an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin, which may be single or double-stranded, and represent the sense or antisense strand.
  • nucleotide analog refers to modified or non-naturally occurring nucleotides including but not limited to analogs that have altered stacking interactions such as 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP); base analogs with alternative hydrogen bonding configurations (e.g., such as Iso-C and Iso-G and other non-standard base pairs described in U.S. Pat. No. 6,001,983 to S. Benner and herein incorporated by reference); non-hydrogen bonding analogs (e.g., non-polar, aromatic nucleoside analogs such as 2,4-difluorotoluene, described by B. A. Schweitzer and E. T.
  • 7-deaza purines i.e., 7-deaza-dATP and 7-deaza-dGTP
  • base analogs with alternative hydrogen bonding configurations e.g., such as Iso-C and Iso-G and other non-standard base pairs described in U
  • Nucleotide analogs include nucleotides having modification on the sugar moiety, such as dideoxy nucleotides and 2'-O-methyl nucleotides. Nucleotide analogs include modified forms of deoxyribonucleotides as well as ribonucleotides.
  • Peptide nucleic acid means a DNA mimic that incorporates a peptide-like polyamide backbone.
  • % sequence identity refers to the percentage of nucleotides or nucleotide analogs in a nucleic acid sequence that is identical with the corresponding nucleotides in a reference sequence after aligning the two sequences and introducing gaps, if necessary, to achieve the maximum percent identity.
  • additional nucleotides in the nucleic acid that do not align with the reference sequence are not taken into account for determining sequence identity.
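The definition of % sequence identity can be expressed as a short calculation over two pre-aligned sequences. The following is a minimal sketch, assuming the alignment has already been performed and gaps are written as "-". The function name and example sequences are hypothetical. Columns in which either sequence has a gap are excluded, which is one reading of the definition.

```python
def percent_identity(aligned_query, aligned_reference, gap="-"):
    """Percentage of query nucleotides that match the reference nucleotide
    in the same alignment column. Columns containing a gap in either
    sequence are ignored, so unaligned extra nucleotides are not counted."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for q, r in zip(aligned_query.upper(), aligned_reference.upper()):
        if q == gap or r == gap:
            continue
        compared += 1
        if q == r:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared

# Hypothetical pre-aligned sequences: one mismatch in seven compared columns.
query = "ACGA-ACGTTT"
reference = "ACGTTACG---"
print(percent_identity(query, reference))
```

The three trailing query nucleotides have no reference counterpart and therefore do not affect the result.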
  • sequence variation refers to differences in nucleic acid sequence between two nucleic acids.
  • a wild-type structural gene and a mutant form of this wild-type structural gene may vary in sequence by the presence of single base substitutions and/or deletions or insertions of one or more nucleotides. These two forms of the structural gene are said to vary in sequence from one another.
  • a second mutant form of the structural gene may exist. This second mutant form is said to vary in sequence from both the wild-type gene and the first mutant form of the gene.
  • the terms “complementary” or “complementarity” are used in reference to polynucleotides (e.g., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules.
  • polynucleotides e.g., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid
  • Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids.
  • the degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
  • complementary refers to the nucleotides of a nucleic acid sequence that can bind to another nucleic acid sequence through hydrogen bonds, e.g., nucleotides that are capable of base pairing, e.g., by Watson-Crick base pairing or other base pairing. Nucleotides that can form base pairs, e.g., that are complementary to one another, are the pairs: cytosine and guanine, thymine and adenine, adenine and uracil, and guanine and uracil. The percentage complementarity need not be calculated over the entire length of a nucleic acid sequence.
  • the percentage of complementarity may be limited to a specific region over which the nucleic acid sequences are base-paired, e.g., starting from a first base-paired nucleotide and ending at a last base-paired nucleotide.
  • the complement of a nucleic acid sequence as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in “antiparallel association.”
  • Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids of the present invention and include, for example, inosine and 7-deazaguanine.
  • duplex stability need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases.
  • Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs.
  • “complementary” refers to a first nucleobase sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identical to the complement of a second nucleobase sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more nucleobases, or that the two sequences hybridize under stringent hybridization conditions.
  • “Fully complementary” means each nucleobase of a first nucleic acid is capable of pairing with each nucleobase at a corresponding position in a second nucleic acid.
  • an oligonucleotide wherein each nucleobase has complementarity to a nucleic acid has a nucleobase sequence that is identical to the complement of the nucleic acid over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more nucleobases.
  • Mismatch means a nucleobase of a first nucleic acid that is not capable of pairing with a nucleobase at a corresponding position of a second nucleic acid.
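The base-pairing rules and the notions of percentage complementarity and mismatch can be sketched as follows. This is a minimal illustration restricted to the pairs listed above (C-G, T-A, A-U, and G-U) and to two strands of equal length in antiparallel association; the function names and example sequences are hypothetical.

```python
# Pairs listed in the definitions: C-G, T-A, A-U, and G-U (either orientation).
PAIRS = {
    ("C", "G"), ("G", "C"),
    ("T", "A"), ("A", "T"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}

def antiparallel_columns(first, second):
    """Pair position i of `first` (5'->3') with the base of `second` (5'->3')
    that faces it when the strands are antiparallel."""
    if len(first) != len(second):
        raise ValueError("this sketch requires strands of equal length")
    return list(zip(first.upper(), reversed(second.upper())))

def percent_complementarity(first, second):
    """Percentage of positions that can form one of the listed base pairs."""
    columns = antiparallel_columns(first, second)
    paired = sum((a, b) in PAIRS for a, b in columns)
    return 100.0 * paired / len(columns)

def mismatch_positions(first, second):
    """Zero-based positions in `first` that cannot pair with `second`."""
    columns = antiparallel_columns(first, second)
    return [i for i, (a, b) in enumerate(columns) if (a, b) not in PAIRS]

print(percent_complementarity("AUGGC", "GCUAU"))  # includes one G-U pair
print(percent_complementarity("AUGGC", "GCCAA"))
print(mismatch_positions("AUGGC", "GCCAA"))
```

A fully complementary pair in the sense defined above is one for which `mismatch_positions` returns an empty list.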
  • hybridization is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) is influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the T_m of the formed hybrid. “Hybridization” methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon. The initial observations of the
  • T_m is used in reference to the “melting temperature.”
  • the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
  • T_m = 81.5 + 0.41 × (% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see, e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)).
  • melting when used in reference to a nucleic acid refers to the dissociation of a double-stranded nucleic acid or region of a nucleic acid into a single-stranded nucleic acid or region of a nucleic acid.
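The T_m estimate given above (81.5 plus 0.41 times the percent G+C content, for a nucleic acid in aqueous solution at 1 M NaCl) can be computed directly from a sequence. The following is a minimal sketch; the function name and example sequence are hypothetical, and the result is taken to be in degrees Celsius, which the source does not state explicitly.

```python
def melting_temperature(sequence):
    """Estimate T_m as 81.5 + 0.41 * (% G+C), the formula quoted for a
    nucleic acid in aqueous solution at 1 M NaCl."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc_count = sum(base in "GC" for base in sequence)
    percent_gc = 100.0 * gc_count / len(sequence)
    return 81.5 + 0.41 * percent_gc

# Hypothetical sequence with 4 of 8 bases being G or C (50 % G+C).
print(melting_temperature("ATGCGCTA"))
```

This form of the estimate depends only on base composition; it does not account for duplex length, mismatches, or salt concentrations other than the stated one.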
  • melting component refers to a substance, molecule (e.g., a biomolecule), or a complex of more than one molecule (e.g., a complex of more than one biomolecule) that interacts with a nucleic acid and melts it, e.g., dissociates double-stranded regions (e.g., secondary structure of a single-stranded nucleic acid, a duplex structure of DNA or of a RNA/DNA hybrid) to provide single-stranded regions, e.g., to provide access to query regions for binding of a query probe.
  • a melting component is a dCas9/gRNA complex (e.g., in some embodiments comprising a biotinylated dCas9 and in some embodiments comprising a non-biotinylated dCas9).
  • the technology is not limited, however, to melting components that comprise dCas9.
  • the technology comprises use of any entity that provides access to a query region by a query probe and allows SiMREPS assay of a target nucleic acid.
  • a “double-stranded nucleic acid” may be a portion of a nucleic acid, a region of a longer nucleic acid, or an entire nucleic acid.
  • a “double-stranded nucleic acid” may be, e.g., without limitation, a double-stranded DNA, a double-stranded RNA, a double-stranded DNA/RNA hybrid, etc.
  • a single-stranded nucleic acid having secondary structure (e.g., base-paired secondary structure) and/or higher order structure comprises a "double-stranded nucleic acid".
  • triplex structures are considered to be "double-stranded".
  • any base-paired nucleic acid is a "double-stranded nucleic acid"
  • ncRNA non-coding RNA
  • npcRNA non-protein-coding RNA
  • nmRNA non-messenger RNA
  • snmRNA small non-messenger RNA
  • fRNA functional RNA
  • small RNA (sRNA) is often used for bacterial ncRNAs.
  • the DNA sequence from which a non-coding RNA is transcribed as the end product is often called an RNA gene or a non-coding RNA gene.
  • Non-coding RNA genes include highly abundant and functionally important RNAs such as transfer RNA (tRNA) and ribosomal RNA (rRNA), as well as RNAs such as snoRNAs, microRNAs, siRNAs, and piRNAs.
  • tRNA transfer RNA
  • rRNA ribosomal RNA
  • long non-coding RNA or “lncRNA” or “long ncRNA” refers to a non-protein coding RNA longer than approximately 200 nucleotides. As used herein, the term is used to distinguish lncRNAs from small regulatory RNAs such as microRNAs (miRNAs), short interfering RNAs (siRNAs), Piwi-interacting RNAs
  • miRNAs microRNAs
  • siRNAs short interfering RNAs
  • PiwHnteracting RNAs small regulatory RNAs
  • RNAs small nucleolar RNAs
  • snoRNAs small nucleolar RNAs
  • other short RNAs other short RNAs
  • miRNA refers to microRNA.
  • miRNA target sequence refers to a miRNA that is to be detected (e.g., in the presence of other nucleic acids).
  • a miRNA target sequence is a variant of a miRNA.
  • siRNAs refers to short interfering RNAs.
  • siRNAs comprise a duplex, or double-stranded region, where each strand of the double-stranded region is about 18 to 25 nucleotides long; the double-stranded region can be as short as 16, and as long as 29, base pairs long, where the length is determined by the antisense strand.
  • siRNAs contain from about two to four unpaired nucleotides at the 3' end of each strand.
  • siRNAs appear to function as key intermediates in triggering RNA interference in invertebrates and in vertebrates, and in triggering sequence-specific RNA degradation during posttranscriptional gene silencing in plants.
  • At least one strand of the duplex or double-stranded region of a siRNA is substantially homologous to or substantially complementary to a target RNA molecule.
  • the strand complementary to a target RNA molecule is the "antisense" strand; the strand homologous to the target RNA molecule is the "sense" strand and is also complementary to the siRNA antisense strand.
  • One strand of the double-stranded region need not be the exact length of the opposite strand; thus, one strand may have at least one fewer nucleotide than the opposite complementary strand, resulting in a "bubble" or at least one unmatched base in the opposite strand.
  • One strand of the double-stranded region need not be exactly complementary to the opposite strand; thus, the strand, preferably the sense strand, may have at least one mismatched base pair.
  • siRNAs may also contain additional sequences; non-limiting examples of such sequences include linking sequences, or loops, which connect the two strands of the duplex region.
  • This form of siRNAs may be referred to as "si-like RNA", "short hairpin siRNA" where the short refers to the duplex region of the siRNA, or "hairpin siRNA".
  • Additional non-limiting examples of additional sequences present in siRNAs include stem and other folded structures.
  • the additional sequences may or may not have known functions; non-limiting examples of such functions include increasing stability of an siRNA molecule, or providing a cellular destination signal.
  • Pre-miRNA or "pre-miR” means a non-coding RNA having a hairpin structure, which is the product of cleavage of a pri-miR by the double-stranded RNA-specific ribonuclease known as Drosha.
  • Stem-loop sequence means an RNA having a hairpin structure and containing a mature miRNA sequence. Pre-miRNA sequences and stem-loop sequences may overlap.
  • stem-loop sequences are found in the miRNA database known as miRBase
  • Pri-miRNA or “pri-miR” means a non-coding RNA having a hairpin structure that is a substrate for the double-stranded RNA-specific ribonuclease Drosha.
  • miRNA precursor means a transcript that originates from a genomic DNA and that comprises a non-coding, structured RNA comprising one or more miRNA sequences.
  • a miRNA precursor is a pre-miRNA.
  • a miRNA precursor is a pri-miRNA.
  • RNA having a non-coding function e.g., a ribosomal or transfer RNA
  • the RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.
  • wild-type refers to a gene or a gene product that has the
  • a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal” or “wild-type” form of the gene.
  • modified or mutant refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • oligonucleotide as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 10 to 15 nucleotides and more preferably at least about 15 to 30 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide.
  • the oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof.
  • an end of an oligonucleotide is referred to as the "5' end” if its 5' phosphate is not linked to the 3' oxygen of a
  • a nucleic acid sequence even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.
  • a first region along a nucleic acid strand is said to be upstream of another region if the 3' end of the first region is before the 5' end of the second region when moving along a strand of nucleic acid in a 5' to 3' direction.
  • When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide points towards the 5' end of the other, the former may be called the "upstream" oligonucleotide and the latter the "downstream" oligonucleotide.
  • When two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5' end is upstream of the 5' end of the second oligonucleotide, and the 3' end of the first oligonucleotide is upstream of the 3' end of the second oligonucleotide, the first oligonucleotide may be called the "upstream" oligonucleotide and the second oligonucleotide may be called the "downstream" oligonucleotide.
  • subject and “patient” refer to any organisms including plants, microorganisms, and animals (e.g., mammals such as dogs, cats, livestock, and humans).
  • sample in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples.
  • a sample may include a specimen of synthetic origin.
  • a biological sample refers to a sample of biological tissue or fluid.
  • a biological sample may be a sample obtained from an animal (including a human); a fluid, solid, or tissue sample; as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste.
  • Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc. Examples of biological samples include sections of tissues, blood, blood fractions, plasma, serum, urine, or samples from other peripheral sources or cell cultures, cell colonies, single cells, or a collection of single cells.
  • a biological sample includes pools or mixtures of the above mentioned samples.
  • a biological sample may be provided by removing a sample of cells from a subject, but can also be provided by using a previously isolated sample.
  • a tissue sample can be removed from a subject suspected of having a disease by conventional biopsy techniques.
  • a blood sample is taken from a subject.
  • a biological sample from a patient means a sample from a subject suspected to be affected by a disease.
  • Environmental samples include environmental material such as surface matter, soil, water, and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the present invention.
  • label refers to any atom or molecule that can be used to provide a detectable (preferably quantifiable) effect, and that can be attached to a nucleic acid or protein.
  • Labels include, but are not limited to, dyes (e.g., fluorescent dyes or moieties); radiolabels such as 32P; binding moieties such as biotin; haptens such as digoxigenin; luminogenic, phosphorescent, or fluorogenic moieties; mass tags; and fluorescent dyes alone or in combination with moieties that can suppress or shift emission spectra by fluorescence resonance energy transfer (FRET).
  • “Support” or “solid support”, as used herein, refers to a matrix on or in which nucleic acid molecules, microparticles, and the like may be immobilized, e.g., to which they may be covalently or noncovalently attached or in or on which they may be partially or completely embedded so that they are largely or entirely prevented from diffusing freely or moving with respect to one another.
  • a "query probe” or “reader probe” is any entity (e.g., molecule, biomolecule, etc.) that recognizes a nucleic acid (e.g., binds to a nucleic acid, e.g., binds specifically to a nucleic acid).
  • the query probe is a protein that recognizes a nucleic acid (e.g., a nucleic acid binding protein, an antibody, antibody fragment, a transcription factor, or any other protein that binds to a particular sequence in a nucleic acid).
  • the query probe is a nucleic acid (e.g., a DNA, an RNA, a nucleic acid comprising DNA and RNA, a nucleic acid comprising modified bases and/or modified linkages between bases; e.g., a nucleic acid as described hereinabove, a nucleic acid aptamer or any other nucleic acid that binds to a particular sequence in a nucleic acid).
  • the query probe is labeled, e.g., with a detectable label such as, e.g., a fluorescent moiety as described herein.
  • the query probe comprises more than one type of molecule (e.g., more than one of a protein, a nucleic acid, a chemical linker or a chemical moiety).
  • a capture probe is a nucleic acid (e.g., a DNA, an RNA, a nucleic acid comprising DNA and RNA, a nucleic acid comprising modified bases and/or modified linkages between bases; e.g., a nucleic acid as described hereinabove).
  • a capture probe is labeled, e.g., with a detectable label such as, e.g., a fluorescent moiety as described herein.
  • the capture probe comprises more than one type of molecule (e.g., more than one of a protein, a nucleic acid, a chemical linker or a chemical moiety).
  • SiMREPS uses total internal reflection fluorescence (TIRF) microscopy, single-molecule visualization, and kinetic analysis of binding and release of fluorescently labeled probes to target molecules (see, e.g., Fig. 1a).
  • Target molecules are quantified by simple, amplification-free, direct counting upon kinetic fingerprint identification that provides for extremely high discrimination between single nucleotide variants, as demonstrated previously for the detection of microRNA (see, e.g., U.S. Pat. App. Ser. No.
  • the technology provided herein provides for the detection of additional forms of nucleic acids, e.g., DNA, mutant DNA, methylated DNA, e.g., in an abundant wild-type background.
  • a combination of, e.g., an epoxy group (on the solid support) and an amino group (dCas9/gRNA complex) is used in some embodiments as a combination of functional groups for immobilization.
  • Surface treatments using various kinds of silane coupling agents are also effective.
  • Other techniques for the attachment of proteins to solid supports and solid surfaces are known in the art.
  • Poisson processes
  • the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of the events occurring in the given time interval if these events occur with a known average rate and independently of the time since the last event.
  • the Poisson distribution can also be used for the number of events in other specified intervals such as distance, area, or volume.
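The Poisson definition above can be illustrated with a minimal sketch. The function name `poisson_pmf` and the numbers used (an average of 2 events per minute observed for 3 minutes) are assumptions chosen for illustration only; they do not come from the specification.

```python
import math

def poisson_pmf(k, rate, interval):
    """Probability of exactly k events in an interval, given a constant average rate."""
    mean = rate * interval  # expected number of events in the interval
    return math.exp(-mean) * mean**k / math.factorial(k)

# Hypothetical numbers: 2 events per minute on average, observed for 3 minutes.
probabilities = [poisson_pmf(k, rate=2.0, interval=3.0) for k in range(60)]

print(f"P(0 events) = {probabilities[0]:.4f}")
print(f"P(6 events) = {probabilities[6]:.4f}")
print(f"sum over k  = {sum(probabilities):.6f}")
```

The same function applies to intervals of distance, area, or volume, provided `rate` is expressed per unit of that interval.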
  • Particular embodiments of the technology are related to detecting a nucleic acid by analyzing the kinetics of the interaction of a query probe with a query region of a target nucleic acid to be detected.
  • a query probe Q e.g., at an equilibrium concentration [Q]
  • a target nucleic acid T e.g., at an equilibrium concentration [T]
  • the kinetic rate constant k_on describes the time-dependent formation of the complex QT comprising the query probe Q hybridized to the query region of the target nucleic acid T.
  • the formation of the QT complex is associated with a second order rate constant that is dependent on the concentration of query probe and has units of M⁻¹ min⁻¹ (or the like)
  • the formation of the QT complex is sufficiently described by a k_on that is a pseudo-first order rate constant associated with the formation of the QT complex.
  • k_on is an apparent ("pseudo") first-order rate constant.
  • the kinetic rate constant k_off describes the time-dependent dissociation of the complex QT into the query probe Q and the target nucleic acid T.
  • Kinetic rates are typically provided herein in units of min⁻¹ or s⁻¹.
  • the "dwell time" of the query probe Q in the bound state (τ_on) is the time interval (e.g., length of time) that the probe Q is hybridized to the query region of the target nucleic acid T during each instance of query probe Q binding to the query region of the target nucleic acid T to form the QT complex.
  • the "dwell time" of the query probe Q in the unbound state (τ_off) is the time interval
  • Dwell times may be provided as averages or weighted averages integrating over numerous binding and non-binding events.
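The relationship between the rate constants and the average dwell times described above can be sketched for a simple two-state (bound/unbound) model. In such a model the mean bound dwell time is the reciprocal of the dissociation rate constant, and the mean unbound dwell time is the reciprocal of the pseudo-first-order association rate constant. The two-state model, the function name, and all numerical values below are assumptions for illustration, not values from the specification.

```python
import random

def mean_dwell_times(k_on, probe_conc, k_off):
    """Return (tau_on, tau_off) in minutes for a two-state binding model.

    k_on       second-order association rate constant, M^-1 min^-1
    probe_conc query probe concentration [Q], M
    k_off      dissociation rate constant, min^-1
    """
    k_on_apparent = k_on * probe_conc  # pseudo-first-order rate constant, min^-1
    tau_on = 1.0 / k_off               # mean time spent in the bound state
    tau_off = 1.0 / k_on_apparent      # mean time spent in the unbound state
    return tau_on, tau_off

# Hypothetical values chosen only to give dwell times of a fraction of a minute.
K_ON = 1.0e8        # M^-1 min^-1
PROBE_CONC = 25e-9  # M
K_OFF = 3.0         # min^-1

tau_on, tau_off = mean_dwell_times(K_ON, PROBE_CONC, K_OFF)

# Individual bound dwell times are exponentially distributed in this model;
# averaging many simulated events should approach tau_on.
rng = random.Random(0)
bound_dwells = [rng.expovariate(K_OFF) for _ in range(20000)]
simulated_tau_on = sum(bound_dwells) / len(bound_dwells)

print(f"tau_on  = {tau_on:.3f} min (simulated mean {simulated_tau_on:.3f} min)")
print(f"tau_off = {tau_off:.3f} min")
```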
  • the repeated, stochastic binding of query probes e.g., detectably labeled query probes (e.g., fluorescent probes), e.g., nucleic acid probes such as DNA or RNA probes
  • Nb+d is the number of binding and dissociation events per unit time; for a Poisson process, the standard deviation in Nb+d increases as the square root of Nb+d.
  • the statistical noise becomes a smaller fraction of Nb+d as the observation time is increased. Accordingly, the observation is lengthened as needed in some embodiments to achieve discrimination between target and off-target binding.
  • An acquisition time of approximately 10 minutes (e.g., approximately 1 to 100 minutes, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 minutes) yields sufficient (e.g., complete) separation of the signal from background distributions of Nb+d, providing for substantially background-free quantification of the target. See, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety.
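The effect of observation time on statistical noise can be sketched as follows, assuming the event count follows Poisson statistics so that its standard deviation is the square root of its mean. The event rate of 3 per minute is a hypothetical value, and `relative_noise` is an illustrative helper rather than part of the described method.

```python
import math

def relative_noise(event_rate_per_min, acquisition_min):
    """Standard deviation of a Poisson event count as a fraction of its mean."""
    expected_events = event_rate_per_min * acquisition_min
    return math.sqrt(expected_events) / expected_events  # equals 1 / sqrt(N)

# Hypothetical rate of 3 binding and dissociation events per minute.
for minutes in (1, 10, 100):
    fraction = relative_noise(3.0, minutes)
    print(f"{minutes:>3} min acquisition: noise is {fraction:.1%} of the expected count")
```

Under this assumption, a tenfold longer acquisition lowers the relative noise by a factor of about 3.2 (the square root of 10).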
  • the probe length is chosen to provide sufficient separation of signal and background peaks on convenient experimental time scales.
  • the kinetics of query probe exchange are related to the number of
  • a short DNA query probe with its complement increases as an approximately exponential function of the number of base pairs formed, while the rate constant of binding is affected only weakly for interactions comprising at least 6 to 7 base pairs.
  • varying query probe length provides for tuning the kinetic behavior to improve discrimination of query probe binding events to the target from background binding.
  • a query (e.g., fluorescent) probe length of 9 nt to 10 nt yields rapid target binding that is distinguished from background signal, as displayed in histograms of intensity
  • the kinetics of binding and dissociation are more closely correlated to probe length than to the melting temperature of the duplex. While some embodiments comprise use of a probe having a length of 9 to 10 nt, the technology is not limited by this length. Indeed, use of probes longer or shorter than 9 to 10 nt is contemplated by the technology, e.g., as discussed throughout.
  • Embodiments of the technology provide for the detection of double-stranded nucleic acids.
  • Some embodiments provide compositions, reaction mixtures, and complexes comprising a plurality of molecules for detecting one or more nucleic acids.
  • Some embodiments of compositions, reaction mixtures, and complexes comprise a nucleic acid (e.g., a target nucleic acid) that is to be detected, identified, quantified, and/or characterized; a solid substrate comprising a dCas9/gRNA complex linked to a solid surface that binds one or more regions of the target with high specificity; and a detectably labeled (e.g., fluorescent) query probe.
  • Some embodiments further comprise a protospacer adjacent motif (PAM) DNA oligonucleotide.
  • the SiMREPS technology exploits the direct binding of a short (6- to 12-nucleotide) fluorescently labeled DNA probe to an unlabeled nucleic acid target (e.g., miRNA) immobilized on a glass surface (Fig. 1a) (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety).
  • SiMREPS finds use in quantifying RNA (see, e.g., U.S. Pat. App. Ser. No.
  • SiMREPS finds use in detecting double-stranded DNA.
  • some embodiments of the technology comprise use of a catalytically inactive (“dead") dCas9 enzyme loaded with a specific guide-RNA (gRNA) to melt dsDNA structure locally in a sequence-specific fashion, providing access for the SiMREPS probe (Fig. 3).
  • Cas9 is an RNA-guided endonuclease that targets and destroys foreign DNA in bacteria using RNA:DNA base-pairing between the gRNA and foreign DNA to provide sequence specificity. Recently, Cas9/gRNA complexes have found use in genome editing (see, e.g., Doudna et al. (2014) "The new frontier of genome engineering with CRISPR-Cas9" Science 346: 6213).
  • Cas9/RNA complexes comprise two RNA molecules: (1) a CRISPR RNA (crRNA), possessing a nucleotide sequence complementary to the target nucleotide sequence; and (2) a trans-activating crRNA (tracrRNA).
  • Cas9 functions as an RNA-guided nuclease that uses both the crRNA and tracrRNA to recognize and cleave a target sequence.
  • a single chimeric guide RNA (sgRNA) mimicking the structure of the annealed crRNA/tracrRNA has become more widely used than crRNA/tracrRNA because the gRNA approach provides a simplified system with only two components (e.g., the Cas9 and the sgRNA).
  • sequence-specific binding to a nucleic acid can be guided by a natural dual-RNA complex (e.g., comprising a crRNA, a tracrRNA, and Cas9) or a chimeric single-guide RNA (e.g., a sgRNA and Cas9).
  • As used herein, the targeting region of a crRNA (2-RNA system) or a sgRNA
  • DNA targeting specificity is determined by two factors: 1) a DNA sequence matching the gRNA targeting sequence and 2) a protospacer adjacent motif (PAM) directly downstream of the target sequence.
  • Some Cas9/gRNA complexes recognize a DNA sequence comprising a protospacer adjacent motif (PAM) sequence and the adjacent approximately 20 bases complementary to the gRNA.
  • Canonical PAM sequences are NGG or NAG for Cas9 from Streptococcus pyogenes and NNNNGATT for the Cas9 from Neisseria meningitidis.
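As a rough illustration of the two targeting factors described above, the following sketch scans one DNA strand for candidate sites consisting of a 20-nucleotide sequence followed directly downstream by an NGG PAM. The function name, the default 20-nucleotide length, and the toy sequence are assumptions for illustration; the sketch scans only the strand given, does not search the reverse complement, and does not model the NAG or NNNNGATT PAMs mentioned above unless a different `pam_regex` is supplied.

```python
import re

def find_pam_adjacent_targets(dna, protospacer_len=20, pam_regex="[ACGT]GG"):
    """Return (start, protospacer, pam) for each site followed directly by a PAM.

    Only the strand passed in is scanned (5' to 3').
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Toy sequence (hypothetical): 20 A's followed by a TGG PAM and a short tail.
toy_sequence = "A" * 20 + "TGG" + "CCCC"
for start, protospacer, pam in find_pam_adjacent_targets(toy_sequence):
    print(f"position {start}: target {protospacer} PAM {pam}")
```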
  • Cas9 cleaves the DNA sequence via an intrinsic nuclease activity.
  • the dCas9/gRNA complex binds to a target nucleic acid with a sequence specificity provided by the gRNA, but does not cleave the nucleic acid.
  • the dCas9/gRNA "melts" the target sequence to provide single-stranded regions of the target nucleic acid in a sequence-specific manner (see, e.g., Qi et al. (2013) "Repurposing
  • the Cas9 from Streptococcus pyogenes is presently the most commonly used; some of the other Cas9 proteins have high levels of sequence identity with the S. pyogenes Cas9 and use the same guide RNAs. Others are more diverse, use different gRNAs, and recognize different PAM sequences as well (the 2-5 nucleotide sequence specified by the protein which is adjacent to the sequence specified by the RNA). Chylinski et al. classified Cas9 proteins from a large group of bacteria (RNA Biology 10:5, 1-12; 2013), and a large number of Cas9 proteins are listed in supplementary FIG. 1 and supplementary table 1 thereof, which are incorporated by reference herein.
  • the technology described herein encompasses the use of a dCas9 derived from any Cas9 protein (e.g., as listed above) and their corresponding guide RNAs or other guide RNAs that are compatible.
  • the Cas9 from Streptococcus thermophilus LMD-9 CRISPR1 system has been shown to function in human cells (see, e.g., Cong et al. (2013) Science 339: 819). Additionally, Jinek showed in vitro that Cas9 orthologs from S. thermophilus and L. innocua can be guided by a dual S. pyogenes gRNA to cleave target plasmid DNA.
  • the present technology comprises the Cas9 protein from S. pyogenes, either as encoded in bacteria or codon-optimized for expression in mammalian cells, containing mutations at D10, E762, H983, or D986 and H840 or N863, e.g., D10A/D10N and H840A/H840N/H840Y, to render the nuclease portion of the protein catalytically inactive.
  • substitutions at these positions are, in some embodiments, alanine (Nishimasu (2014) Cell 156: 935-949) or, in some embodiments, other residues, e.g., glutamine, asparagine, tyrosine, serine, or aspartate, e.g., E762Q, H983N, H983Y, D986N, N863D, N863S, or N863H.
  • the sequence of one S. pyogenes dCas9 protein that finds use in the technology provided herein is described in US20160010076, which is incorporated herein by reference in its entirety.
  • the dCas9 used herein is at least about 50% identical to the sequence of S. pyogenes Cas9, e.g., at least 50% identical to the following sequence of dCas9 comprising the D10A and H840A substitutions (SEQ ID NO: 1).
  • Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
  • Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
  • Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
  • the technology comprises use of a nucleotide sequence that is approximately 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to a nucleotide sequence that encodes a protein described by SEQ ID NO: 1.
  • the dCas9 used herein is at least about 50% identical to the sequence of the catalytically inactive S. pyogenes Cas9, i.e., at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to SEQ ID NO: 1, wherein the mutations at D10 and H840, e.g., D10A/D10N and H840A/H840N/H840Y are maintained.
  • any differences from SEQ ID NO: 1 are in non-conserved regions, as identified by sequence alignment of sequences set forth in Chylinski et al., RNA Biology 10:5, 1-12; 2013 (e.g., in supplementary FIG. 1 and supplementary table 1 thereof); Esvelt et al., Nat Methods. 2013 November; 10(11): 1116-21 and Fonfara et al., Nucl. Acids Res. (2014) 42 (4): 2577-2590. [Epub ahead of print 2013 Nov. 22]
  • sequences are aligned for optimal comparison purposes (gaps are introduced in one or both of a first and a second amino acid or nucleic acid sequence as required for optimal alignment, and non-homologous sequences can be disregarded for comparison purposes).
  • the length of a reference sequence aligned for comparison purposes is at least 50% (in some
  • corresponding positions are then compared.
  • when a position in the first sequence is occupied by the same nucleotide or residue as the corresponding position in the second sequence, then the molecules are identical at that position.
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
  • the comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
  • the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package, using a Blosum 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
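The alignment-then-compare procedure described above can be sketched with a small Needleman-Wunsch global alignment. This is a simplified illustration, not the GAP program: it assumes a flat match/mismatch score and a linear gap penalty instead of the Blosum 62 matrix and the gap and gap-extend penalties named above, and it reports identity as identical positions divided by the total number of alignment columns, which is one of several conventions. The function names and toy sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            aligned_a.append(a[i - 1]); aligned_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1]); aligned_b.append("-"); i -= 1
        else:
            aligned_a.append("-"); aligned_b.append(b[j - 1]); j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))

def percent_identity(a, b):
    """Identical aligned positions as a percentage of all alignment columns."""
    x, y = needleman_wunsch(a, b)
    if not x:
        return 0.0
    identical = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * identical / len(x)

# Toy sequences (hypothetical), differing by one inserted residue.
print(needleman_wunsch("GATTACA", "GATTTACA"))
print(percent_identity("GATTACA", "GATTTACA"))
```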
  • a dCas9/gRNA-based approach for detecting double-stranded nucleic acids (e.g., double-stranded genomic DNA) using SiMREPS probes.
  • Embodiments of the technology comprise capturing unlabeled genomic DNA targets on a glass or fused silica surface using a gRNA-loaded, enzymatically dead dCas9 enzyme or enzymes that bind one or more segments of the target with high specificity and stability (e.g., forming approximately 20 base pairs at the site
  • a biotinylated dCas9 comprising a guide RNA (gRNA) comprising an appropriate sequence captures a target nucleic acid.
  • dCas9, with the help of the gRNA, melts the DNA, which produces a complementary non-template DNA strand that is single-stranded and accessible to the query probe.
  • the technology comprises use of a second non-biotinylated dCas9 comprising a second gRNA that binds to the target nucleic acid (Fig. 4).
  • the second dCas9/gRNA binds to the target nucleic acid at a distance from the first dCas9/gRNA that is approximately the size of the query probe, e.g., 5 to 30 nucleotides. That is, the biotinylated dCas9/gRNA (capture dCas9/gRNA) and second dCas9/gRNA bind to regions adjacent to the region of the target nucleic acid to which the query probe binds.
  • the two dCas9/gRNA complexes melt the region of nucleic acid between them to provide a single-stranded query region accessible for query probe binding. Accordingly, the spacing between the two dCas9/gRNA complexes is appropriate for the binding of a query probe between them.
  • the technology provides a method for detecting a double-stranded nucleic acid.
  • a nucleic acid e.g., a genomic DNA
  • dCas9/gRNA e.g., at or near ambient (“room") temperature
  • the guide RNA (gRNA) of the dCas9/gRNA complex hybridizes to the target nucleic acid (e.g., a genomic DNA) at a region adjacent to a query region (e.g., a region of the target nucleic acid that is complementary to a query probe).
  • the dCas9/gRNA melts a specific DNA sequence (e.g., the query region) in the target nucleic acid.
  • Methods comprise capturing the dCas9 (e.g., a biotinylated dCas9) onto a slide surface (e.g., by biotin-avidin interaction).
  • SiMREPS is used to detect binding of a query probe to the query region in the target nucleic acid, which is adjacent to the site of target nucleic acid hybridized to the gRNA. See, e.g., Fig. 3.
  • two dCas9/gRNA complexes are used to make the query region of a target nucleic acid accessible to a query probe.
  • hybridizing two dCas9/gRNA complexes to flank the query region improves accessibility of the target nucleic acid (e.g., the query region) to the query probe for SiMREPS-based detection.
  • One dCas9 is biotinylated for capture onto the surface; the other (e.g., second) dCas9 is optionally biotinylated (in preferred embodiments, the second dCas9 is not biotinylated). See, e.g., Fig. 4.
  • the space between the regions bound by the two dCas9/gRNA complexes bound to the target nucleic acid provides appropriate space for binding of the query probe to the query region of the target nucleic acid.
  • the detectable (e.g., fluorescent) query probe produces a fluorescence emission signal when it is close to the surface of the solid support (e.g., within about 100 nm of the surface of the solid support).
  • the query probes quickly diffuse and thus are not individually detected; accordingly, when in the unbound state, the query probes produce a low level of diffuse background fluorescence.
  • detection of bound query probes comprises use of total internal reflection fluorescence microscopy (TIRF), HiLo microscopy (see, e.g.,
  • the observation comprises monitoring fluorescence emission at a number of discrete locations on the solid support where the target nucleic acids are immobilized (e.g., by being specifically bound to the dCas9/gRNA attached to the surface), e.g., at a number of fluorescent "spots" that blink, e.g., that can be in "on" and "off" states.
  • the presence of fluorescence emission (spot is "on") and absence of fluorescence emission (spot is "off") at each discrete location are recorded.
  • Each spot "blinks", e.g., a spot alternates between "on" and "off" states, respectively, as a query probe binds to the immobilized target nucleic acid at that spot and as the query probe dissociates from the immobilized target nucleic acid at that spot.
  • the data collected provide for the determination of the number of times a query probe binds to each immobilized target (e.g., the number of times each spot blinks "on") and a measurement of the amount of time a query probe remains bound (e.g., the length of time a spot remains "on" before turning "off").
  • the query probe comprises a fluorescent label having an emission wavelength. Detection of fluorescence emission at the emission wavelength of the fluorescent label indicates that the query probe is bound to an immobilized target nucleic acid. Binding of the query probe to the target nucleic acid is a "binding event".
  • a binding event has a fluorescence emission having a measured intensity greater than a defined threshold. For example, in some embodiments a binding event has a fluorescence intensity that is above the background fluorescence intensity (e.g., the fluorescence intensity observed in the absence of a target nucleic acid).
  • a binding event has a fluorescence intensity that is at least 1, 2, 3, 4 or more standard deviations above the background fluorescence intensity (e.g., the fluorescence intensity observed in the absence of a target nucleic acid). In some embodiments, a binding event has a fluorescence intensity that is at least 2 standard deviations above the background fluorescence intensity (e.g., the fluorescence intensity observed in the absence of a target nucleic acid). In some embodiments, a binding event has a fluorescence intensity that is at least 1.5, 2, 3, 4, or 5 times the background fluorescence intensity (e.g., the mean fluorescence intensity observed in the absence of a target nucleic acid).
  • detecting fluorescence at the emission wavelength of the fluorescent probe that has an intensity above the defined threshold indicates that a binding event has occurred (e.g., at a discrete location on the solid support where a target nucleic acid is immobilized). Also, in some embodiments detecting fluorescence at the emission wavelength of the fluorescent probe that has an intensity above the defined threshold (e.g., at least 2 standard deviations greater than background intensity) indicates that a binding event has started.
  • detecting an absence of fluorescence at the emission wavelength of the fluorescent probe that has an intensity above the defined threshold indicates that a binding event has ended (e.g., the query probe has dissociated from the target nucleic acid).
  • the length of time between when the binding event started and when the binding event ended e.g., the length of time that fluorescence at the emission wavelength of the fluorescent probe having an intensity above the defined threshold (e.g., at least 2 standard deviations greater than
  • a “transition” refers to the binding and dissociation of a query probe to the target nucleic acid (e.g., an on/off event).
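  • The threshold-based definition of a binding event given above can be sketched in code. This is a minimal illustration under stated assumptions, not the analysis method of the disclosure: it assumes a single-location intensity trace sampled at a fixed frame interval, a separate background trace recorded in the absence of target, and a threshold of the background mean plus two standard deviations. The names `background_threshold`, `find_binding_events`, and `frame_s`, and all numbers, are hypothetical.

```python
from statistics import mean, stdev

def background_threshold(background, n_sd=2.0):
    """Threshold = mean background intensity + n_sd standard deviations."""
    return mean(background) + n_sd * stdev(background)

def find_binding_events(trace, threshold, frame_s=1.0):
    """Return (start_s, end_s, dwell_s) for each run of frames above threshold.

    An event starts at the first frame above the threshold and ends at the
    first later frame at or below it. A run still in progress at the end of
    the trace is dropped because its end was not observed.
    """
    events = []
    start = None
    for i, value in enumerate(trace):
        above = value > threshold
        if above and start is None:
            start = i  # binding event started
        elif not above and start is not None:
            events.append((start * frame_s, i * frame_s, (i - start) * frame_s))
            start = None  # binding event ended
    return events

# Illustrative numbers only (arbitrary intensity units).
background = [100, 102, 98, 101, 99, 100, 103, 97]
trace = [100, 99, 250, 260, 255, 101, 98, 240, 245, 100]
thr = background_threshold(background)
events = find_binding_events(trace, thr, frame_s=0.5)
```

  • In this sketch each tuple holds the start time, end time, and duration of one binding event; the duration is the interval during which the intensity stayed above the threshold.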
  • Methods according to the technology comprise counting the number of query probe binding events that occur at each discrete location on the solid support during a defined time interval that is the "acquisition time" (e.g., a time interval that is tens to hundreds to thousands of seconds, e.g., 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 seconds; e.g., 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 minutes; e.g., 1, 1.5, 2, 2.5, or 3 hours).
  • the acquisition time is approximately 1 to 10 seconds to 1 to 10 minutes (e.g., approximately 1 to 100 seconds, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 seconds, e.g., 1 to 100 minutes, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 minutes).
  • the length of time the query probe remains bound to the target nucleic acid during a binding event is the "dwell time" of the binding event.
  • the number of binding events detected during the acquisition time and/or the lengths of the dwell times recorded for the binding events is/are characteristic of a query probe binding to a target nucleic acid and thus provide an indication that the target nucleic acid is immobilized at said discrete location and thus that the target nucleic acid is present in the sample.
  • Binding of the query probe to the immobilized target nucleic acid and/or dissociation of the query probe from the immobilized target nucleic acid is/are monitored (e.g., using a light source to excite the fluorescent probe and detecting fluorescence emission from a bound query probe, e.g., using a fluorescence microscope) and/or recorded during a defined time interval (e.g., during the acquisition time).
  • the number of times the query probe binds to the nucleic acid during the acquisition time and/or the length of time the query probe remains bound to the nucleic acid during each binding event and the length of time the query probe remains unbound to the nucleic acid between each binding event are determined, e.g., by the use of a computer and software (e.g., to analyze the data using a hidden Markov model and Poisson statistics).
  • control samples are measured (e.g., in absence of target). Fluorescence detected in a control sample is "background fluorescence” or “background (fluorescence) intensity” or “baseline”.
  • data comprising measurements of fluorescence intensity at the emission wavelength of the query probe are recorded as a function of time.
  • the number of binding events and the dwell times of binding events are determined from the data (e.g., by
  • transitions e.g., binding and dissociation of a query probe
  • a threshold number of transitions is used to discriminate the presence of a target nucleic acid at a discrete location on the solid support from background signal, non-target nucleic acid, and/or spurious binding of the query probe.
  • a number of transitions greater than 10 recorded during the acquisition time indicates the presence of a target nucleic acid at the discrete location on the solid support.
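  • The transition-count criterion above can be illustrated as follows. This is a sketch under assumptions rather than the disclosed implementation: it assumes the fluorescence trace has already been reduced to one bound/unbound flag per frame, and it counts every change of state (binding or dissociation) as one transition. The function names and example traces are hypothetical.

```python
def count_transitions(states):
    """Count changes between bound (True) and unbound (False) in consecutive frames."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)

def has_target(states, min_transitions=10):
    """Apply the example criterion: more than `min_transitions` transitions."""
    return count_transitions(states) > min_transitions

# Illustrative traces: a repeatedly blinking location and a single spurious binding.
blinking = [False, True] * 8                         # 15 changes of state
spurious = [False] * 10 + [True] * 3 + [False] * 3   # 2 changes of state
```

  • With these example traces, `has_target(blinking)` is true and `has_target(spurious)` is false, which mirrors the use of a threshold number of transitions to reject background signal and spurious binding.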
  • a distribution of the number of transitions for each immobilized target is determined - e.g., the number of transitions is counted for each immobilized nucleic acid target observed. In some embodiments a histogram is produced. In some embodiments, characteristic parameters of the distribution are determined, e.g., the mean, median, peak, shape, etc. of the distribution are determined. In some embodiments, data and/or parameters (e.g., fluorescence data (e.g., fluorescence data in the time domain), kinetic data, characteristic parameters of the distribution, etc.) are analyzed by algorithms that recognize patterns and regularities in data, e.g., using artificial intelligence, pattern recognition, machine learning, statistical inference, neural nets, etc.
  • the analysis comprises use of a frequentist analysis and in some embodiments the analysis comprises use of a Bayesian analysis.
  • pattern recognition systems are trained using known "training" data (e.g., using supervised learning) and in some embodiments algorithms are used to discover previously unknown patterns (e.g., unsupervised learning). See, e.g., Duda, et al. (2001) Pattern classification (2nd edition), Wiley, New York; Bishop (2006) Pattern Recognition and Machine Learning, Springer.
  • a correlation coefficient relating event number and elapsed time greater than 0.95 when calculated from the probability of a transition event occurring as a function of time at a discrete location on the solid support indicates the presence of a target nucleic acid at said discrete location on the solid support.
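  • One way to read the correlation criterion above is as a test of how evenly transitions are spread over the acquisition time. The sketch below assumes that reading: it computes the Pearson correlation between the ordinal number of each transition and the time at which it occurred, so that repeated, regularly spaced binding gives a value near 1. The interpretation, the function names, and the example times are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)

def event_time_correlation(event_times):
    """Correlate event number (1, 2, 3, ...) with elapsed time of each event."""
    numbers = list(range(1, len(event_times) + 1))
    return pearson_r(numbers, event_times)

# Illustrative transition times in seconds.
steady = [10, 21, 29, 41, 50, 59, 71, 80, 90, 101]   # evenly spread events
bursty = [1, 2, 3, 4, 5, 6, 7, 8, 9, 500]            # a burst, then one late event
```

  • Under this assumed reading, the evenly spread example exceeds the 0.95 criterion and the bursty example does not.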
  • dwell times of bound query probe (τon) and unbound query probe (τoff) are used to identify the presence of a target nucleic acid in a sample and/or to distinguish a sample comprising a target nucleic acid from a sample comprising a non-target nucleic acid and/or not comprising the target nucleic acid.
  • the τon for a target nucleic acid is greater than the τon for a non-target nucleic acid; and, the τoff for a target nucleic acid is smaller than the τoff for a non-target nucleic acid.
  • measuring τon and τoff for a negative control and for a sample indicates the presence or absence of the target nucleic acid in the sample.
  • a plurality of τon and τoff values is determined for each of a plurality of spots imaged on a solid support, e.g., for a control (e.g., positive and/or negative control) and a sample suspected of comprising a target nucleic acid.
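  • The comparison of bound and unbound dwell times between a sample and a negative control can be sketched as below. This is an illustrative simplification, not the disclosed analysis: it assumes per-frame bound/unbound flags, uses the arithmetic mean of complete runs as the dwell time estimate, and discards the first and last runs because the observation window truncates them. The names `mean_dwell_times` and `resembles_target` are hypothetical.

```python
from itertools import groupby

def mean_dwell_times(states, frame_s=1.0):
    """Return (tau_on, tau_off): mean bound and unbound run lengths in seconds."""
    runs = [(state, sum(1 for _ in group)) for state, group in groupby(states)]
    interior = runs[1:-1]  # first and last runs are cut off by the window
    on = [n * frame_s for state, n in interior if state]
    off = [n * frame_s for state, n in interior if not state]
    if not on or not off:
        raise ValueError("trace has no complete bound and unbound runs")
    return sum(on) / len(on), sum(off) / len(off)

def resembles_target(sample, control):
    """True if the sample has a longer bound and a shorter unbound dwell time
    than the negative control; both arguments are (tau_on, tau_off) pairs."""
    return sample[0] > control[0] and sample[1] < control[1]

# Illustrative traces (one flag per frame).
sample_states = [False] * 2 + ([True] * 4 + [False] * 2) * 3 + [True]
control_states = [False] * 3 + ([True] * 1 + [False] * 6) * 3 + [True]
sample_taus = mean_dwell_times(sample_states)
control_taus = mean_dwell_times(control_states)
```

  • Here the sample location shows longer bound intervals and shorter unbound intervals than the control, so `resembles_target(sample_taus, control_taus)` is true.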
  • Methylated DNA is a marker for many states of health and disease, including, for example, identifying patients at higher risk of colorectal cancer based on presence of specific methylated loci. Methylated DNA also provides the basis of a diagnostic test for the early detection of colorectal cancer as well as pre-cancerous adenomas and dysplastic lesions.
  • Detection of methylated DNA at specific loci is currently performed by sodium bisulfite treatment of DNA, which deaminates unmethylated cytosines to produce uracil in the DNA, followed by PCR using distinct primer sets that selectively amplify methylated DNA fragments (e.g., 5-mC or 5-hmC, which are protected from conversion by Na bisulfite) or the unmethylated fragments (where primers are designed to bind to the anticipated converted sequence containing uracils in place of cytosines).
  • Another approach is to perform next generation sequencing of bisulfite treated and/or untreated DNA to infer methylated bases.
  • nucleic acid modifications are detected (e.g., modified bases, nucleotide analogs, etc.).
  • epigenetic modifications of nucleic acids that influence gene expression are detected.
  • methylation of DNA is detected.
  • the technology finds use in detecting (e.g., identifying the presence or absence of) nucleotide analogs, nucleotide bases - or nucleosides and/or nucleotides comprising bases - other than adenine, thymine, guanosine, cytosine, and uracil.
  • the technology finds use in detecting, identifying, and/or quantifying a nucleotide, nucleoside, and/or a base including but not limited to, e.g., 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxycytosine (5caC), N(6)-methyladenosine (m(6)A), pseudouridine (Ψ), dihydrouridine (D), inosine (I), 7-methylguanosine (m7G), hypoxanthine, xanthine, 2,6-diaminopurine, and 6,8-diaminopurine.
  • the technology comprises detecting methylated DNA by SiMREPS, which comprises analyzing samples that are treated with bisulfite and samples that have not been treated with bisulfite.
  • SiMREPS query probes that distinguish sequences expected from conversion of unmethylated cytosines to uracil from sequences expected if cytosine(s) at a given locus are not converted to uracil (due to methylation).
  • both query probes are provided in the sample chamber at the same time and each probe comprises a different fluorophore.
  • bisulfite reagent is provided in real-time while imaging a SiMREPS experiment with both query probes present (e.g., one probe that binds to the methylated sequence and a second probe that binds to the uracil-converted sequence, each with a separate fluorophore), where DNA fragments blinking in one color would shift to blinking in the other color based on conversion.
  • Such an assay provides greater accuracy and precision in the measurements than comparing a bisulfite-treated aliquot of sample to an untreated aliquot, which is required for current PCR or next generation sequencing-based approaches.
  • the technology is a multiplexed analysis using microfluidics and/or multi-spectral imaging, for example.
  • the technology comprises use of other reagents that convert additional chemical markers on DNA or RNA into modifications that are easily detected by SiMREPS.
  • the technology finds use in detecting a combination of mutant DNA, methylated DNA, and microRNA biomarkers on the same platform.
  • a technology finds use in the detection of colorectal cancer and advanced adenoma, e.g., by analyzing a stool sample.
  • the technology provides for the detection of mutant DNA (e.g., detecting 1 mutant molecule in a background of 1,000,000 wild-type molecules; e.g., detecting KRAS mutant DNA in a background of wild-type DNA).
  • the technology also provides a technology for measuring all three types of markers (methylated DNA, miRNA, and/or mutant DNA).
  • the technology finds use in analyzing nucleic acids in a buffer solution; the technology finds use in analyzing nucleic acids in a matrix extracted from stool.
  • An average stool weighs 200 g and comprises approximately 10 million diploid-genome equivalents of human DNA.
  • a sensitivity of 1:1,000,000 provides detection of as few as 10 mutant molecules in DNA extracted from a typical whole stool. This is 10,000-fold more sensitive than current clinically-used KRAS assays and 100-fold more sensitive than best-performing research-grade methods. See, e.g., Domagala et al.
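  • The arithmetic behind the figure of 10 mutant molecules can be written out as a short worked example. The first two numbers are taken from the text; the last two sensitivities are not stated in the text and are only what the quoted 10,000-fold and 100-fold comparisons would imply. The variable names are hypothetical.

```python
# Figures stated in the text.
genome_equivalents_per_stool = 10_000_000   # human DNA in an average 200 g stool
wild_type_per_mutant = 1_000_000            # sensitivity of 1:1,000,000

# Fewest mutant molecules detectable in DNA extracted from a whole stool.
min_detectable_mutants = genome_equivalents_per_stool // wild_type_per_mutant

# Sensitivities implied (not stated) by the 10,000-fold and 100-fold comparisons.
clinical_wild_type_per_mutant = wild_type_per_mutant // 10_000   # 1 in 100
research_wild_type_per_mutant = wild_type_per_mutant // 100      # 1 in 10,000

print(min_detectable_mutants, clinical_wild_type_per_mutant, research_wild_type_per_mutant)
```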
  • Colonoscopy is the dominant screening approach for colorectal cancer (CRC) in the U.S., despite its invasiveness, high cost, low patient compliance, and risk of complications.
  • New diagnostics, such as stool-based colorectal cancer screening, are limited by low sensitivity for detecting advanced adenomas (AA), removal of which prevents CRC.
  • the currently best-performing stool-based test analyzes stool DNA (mutant DNA and methylation) and occult blood, but is technically complex, expensive ($500/test), and challenged by limited ability to detect rare mutant DNA alleles in a high background of wild-type DNA.
  • the detection technology described herein provides a new technology for low-cost, rapid measurement of rare mutant DNA alleles in stool with extraordinary analytic specificity, with concurrent measurement of occult blood (e.g., using a microRNA marker of occult blood) and methylated DNA on a single platform.
  • the technology provides increased sensitivity for detecting advanced adenoma at a >10-fold lower cost than the current state-of-the-art.
  • the technology provides quantification of rare mutant DNA alleles with orders of magnitude higher specificity than current methods, leading to significantly sensitized AA detection.
  • adenoma-defining mutations such as in the APC gene are detected.
  • markers detected on the platform include microRNA (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety), a stool occult blood marker (e.g., a microRNA marker of stool occult blood), and methylated DNA.
  • the nucleic acid to be detected, characterized, quantified, and/or identified is a RNA (e.g., a lncRNA, e.g., a non-protein coding RNA longer than approximately 200 nucleotides).
  • the secondary structure of a nucleic acid (e.g., a RNA, e.g., a lncRNA)
  • SiMREPS detection.
  • the technology comprises use of a dCas9/gRNA that recognizes and melts a secondary structured target in an RNA using an auxiliary PAM oligonucleotide (e.g., a PAMmer).
  • the technology comprises use of two dCas9/gRNA complexes that recognize and melt a secondary structured target in an RNA using an auxiliary PAM oligonucleotide (e.g., a PAMmer).
  • the dCas9 binds to single-stranded RNA targets matching the gRNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide (a "PAM-presenting oligonucleotide" or "PAMmer").
  • PAMmers provide for the site-specific binding of dCas9/gRNA to single-stranded RNA targets (e.g., lncRNA).
  • the technology provides for the use of PAMmers to direct dCas9 to bind to specific RNA targets and melt secondary structure to make query regions available for the binding of query probes in a SiMREPS assay. See, e.g., O'Connell et al.
  • compositions comprising a dCas9, a gRNA (e.g., a dCas9/gRNA complex), and a PAMmer.
Detection of microRNA
  • the nucleic acid to be detected, characterized, quantified, and/or identified is a microRNA.
  • microRNAs (miRNA or µRNA)
  • miRNAs are single-stranded RNA molecules of approximately 21 to 23 nucleotides in length that regulate gene expression. miRNAs are encoded by genes from whose DNA they are transcribed, but miRNAs are not translated into protein (see, e.g., Carrington et al,
  • the SiMREPS technology provides for identifying and/or counting genomic aberrations (e.g., other than simple point mutations) in a DNA sample based on detecting unpaired regions after comparative genome hybridization.
  • SiMREPS finds use to determine the overall degree of genomic instability in a sample, e.g., as evidenced by presence of deletions and insertions compared to a reference, normal DNA sample.
  • Exemplary embodiments comprise providing a normal DNA (for example, wild-type DNA from normal blood cells of a patient with a solid tumor like lung cancer) and mixing it with matched tumor DNA (or even circulating cell-free DNA), with fragmentation of the DNA so that it is present as fragments and is immobilized onto a surface, potentially through end modification (e.g., biotinylation) or other approaches. Most of the DNA will form hybridized DNA segments, but areas of deletion or insertion are present as duplexes where one strand bulges out.
  • the normal DNA and the tumor DNA are provided in ratios appropriate for efficient detection.
  • Embodiments provide a SiMREPS-based affinity reagent that is used to detect these unmatched regions, and counting them provides a measure of genomic instability, which finds use, e.g., in some embodiments as a biomarker of cancer risk.
  • the technology finds use in pre-natal screening for chromosomal abnormalities.
  • the affinity reagent is a single-stranded DNA binding protein, a Holliday junction recombinase modified not to cleave nucleic acid, e.g., for identification of balanced chromosomal translocations, etc.
  • the technology finds use in the detection of microsatellite repeat aberrations in patients with microsatellite-unstable colorectal cancer, for example. Accordingly, embodiments comprise providing a panel of SiMREPS probes corresponding to different microsatellite loci that show differential kinetic binding properties depending on whether the microsatellite repeats have expanded or not in a sample. In some embodiments, the technology comprises a comparative DNA hybridization approach in which hybridization of an expanded microsatellite repeat sequence to one without expansion generates an unpaired segment that is detected using a SiMREPS-based approach, using a DNA probe or other effective affinity
  • the technology comprises detection of proteins and other analytes with SiMREPS by incorporation of a bifunctional affinity reagent that binds to the target analyte and comprises a nucleic acid that can be counted using a SiMREPS reader probe.
  • some embodiments comprise the use of an antibody linked to a short DNA oligonucleotide, such that if the target protein analyte were immobilized onto the surface of a slide (for example by simple drying onto the surface, or other nonspecific or specific capture methods), the binding of the antibody to the target protein analyte is measured using a SiMREPS-based reading of the conjugated DNA.
  • Affinity reagents include DNA or RNA binding proteins, aptamers, antibodies and antibody fragments, linked to a DNA barcode.
  • Embodiments provide probes that are not nucleic acids.
  • embodiments provide an antibody or other affinity reagent that has a binding interaction with a target analyte that has a stability amenable for SiMREPS, e.g., a transient association that provides a "blinking" signal and, in some embodiments, a kinetic binding
  • the non-nucleic acid query probe is engineered to weaken its binding relative to the non-engineered version, e.g., to provide a binding and/or association that is less thermodynamically stable.
  • embodiments comprise target analytes and query probes that are proteins, e.g., where one binding partner is the affinity reagent and the other is the target analyte being measured.
  • embodiments comprise the use of aptamers binding any ligand, lectins binding glycosylated proteins, proteins or other molecules binding lipids, etc.
  • the technology comprises the use of any binding pair with transient binding behavior suitable for detection on the SiMREPS platform, e.g., that produces a kinetic
  • the technology provides a capture probe and a query probe that are linked to provide an intramolecular probing mechanism (see, e.g., Fig. 5 a and b).
  • the probe is asymmetric, so that when the target nucleic acid binds to the capture probe (e.g., with thermodynamic stability, e.g., irreversibly), the target nucleic acid undergoes transient binding with the query probe.
  • the transient binding and dissociation of the query probe yields a time-dependent change in donor fluorophore intensity or FRET whose kinetics are sensitive to the sequence of the target nucleic acid.
  • an address strand binds the Query/Capture complex to the surface, to provide rigidity that exerts control over the transient binding kinetics, and to provide a means to immobilize many different Query/Capture sequences to different regions of the imaging surface (e.g., as in a DNA microarray).
  • intramolecular SiMREPS probing provides faster acquisition.
  • binding of the query probe is rapid because the binding is an intramolecular hybridization reaction after the target nucleic acid binds to the capture probe.
  • imaging times are reduced compared to the other embodiments of the SiMREPS technology. For instance, the same number of binding and dissociation events occur in 1 to 10 seconds in some embodiments of intramolecular SiMREPS experiments as occur in 10 minutes in other embodiments of the SiMREPS technology.
  • the intramolecular SiMREPS technology provides for the parallelization of experiments through spatial segregation. In some embodiments, the intramolecular SiMREPS technology reduces the concentrations of query probe that provide efficient detection.
  • the intramolecular SiMREPS technology provides a platform that, in some embodiments, comprises many different Capture/Query probes immobilized within different regions of the imaging surface in a manner specified by the Address strand.
  • a standard microarray chip containing thousands of distinct sequences; these sequences could serve as the Address strands for immobilization of SiMREPS Capture/Query probes, permitting SiMREPS assays of thousands of target sequences (microRNAs, lncRNAs, DNA converted to single-stranded form) on a single chip.
  • the Address strands are not related in sequence to any of the targets, since interaction occurs indirectly through the query and capture probes. Indeed, the Address strands are not required to be related to the targets.
  • the query and capture probes comprise affinity reagents other than DNA sequences, such as the dCas9/gRNA complexes discussed elsewhere in this application.
  • Embodiments provide control of exposure of the fluorophores to excitation sources, e.g., to reduce or minimize photobleaching prior to analysis.
  • kinetic signatures provide a correction mechanism to identify and correct false positive detections resulting from, e.g., deposit of a Capture/Query probe on the wrong part of the imaging surface (outside of its Address region).
  • Embodiments also provide a technology in which false positives are minimized or reduced by splitting the Query and Capture probes into two non-contiguous probes that co-localize upon binding to the Address sequence (see, e.g., Fig. 5 b).
  • a nucleic acid comprises a fluorescent moiety (e.g., a fluorogenic dye, also referred to as a "fluorophore” or a "fluor”).
  • Examples of compounds that may be used as the fluorescent moiety include but are not limited to xanthene, anthracene, cyanine, porphyrin, and coumarin dyes.
  • xanthene dyes that find use with the present technology include but are not limited to fluorescein, 6-carboxyfluorescein (6-FAM), 5-carboxyfluorescein (5-FAM), 5- or 6-carboxy-4,7,2',7'-tetrachlorofluorescein (TET), 5- or 6-carboxy-4',5',2',4',5',7'-hexachlorofluorescein (HEX), 5' or 6'-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein (JOE), 5-carboxy-2',4',5',7'-tetrachlorofluorescein (ZOE), rhodol, rhodamine, tetramethylrhodamine (TAMRA), 4,7-dichlorotetramethyl rhodamine, rhodamine X (ROX), and Texas Red.
  • Examples of cyanine dyes include but are not limited to Cy 3, Cy 3B, Cy 3.5, Cy 5, Cy 5.5, Cy 7, and Cy 7.5.
  • Other fluorescent moieties and/or dyes that find use with the present technology include but are not limited to energy transfer dyes, composite dyes, and other aromatic compounds that give fluorescent signals.
  • the fluorescent moiety comprises a quantum dot.
  • the fluorescent moiety comprises a fluorescent protein (e.g., a green fluorescent protein (GFP), a modified derivative of GFP (e.g., a GFP comprising S65T, an enhanced GFP (e.g., comprising F64L)), or others known in the art such as, e.g., blue fluorescent protein (e.g., EBFP, EBFP2, Azurite, mKalama1), cyan fluorescent protein (e.g., ECFP, Cerulean, CyPet, mTurquoise2), and yellow fluorescent protein derivatives (e.g., YFP, Citrine, Venus, YPet)).
  • the fluorescent protein may be covalently or noncovalently bonded to one or more query and/or capture probes.
  • Fluorescent dyes include, without limitation, d-Rhodamine acceptor dyes including Cy 5, dichloro[R110], dichloro[R6G], dichloro[TAMRA], dichloro[ROX] or the like; fluorescein donor dyes including fluorescein, 6-FAM, 5-FAM, or the like; Acridine including Acridine orange, Acridine yellow, Proflavin, pH 7, or the like; Aromatic Hydrocarbons including 2-Methylbenzoxazole, Ethyl p-dimethylaminobenzoate, Phenol, Pyrrole, benzene, toluene, or the like;
  • Arylmethine Dyes including Auramine O, Crystal violet, Crystal violet, glycerol, Malachite Green or the like;
  • Coumarin dyes including 7-Methoxycoumarin-4-acetic acid, Coumarin 1, Coumarin 30, Coumarin 314, Coumarin 343, Coumarin 6 or the like;
  • Cyanine Dyes including 1,1'-diethyl-2,2'-cyanine iodide, Cryptocyanine, Indocarbocyanine (C3) dye, Indodicarbocyanine (C5) dye,
  • Indotricarbocyanine (C7) dye, Oxacarbocyanine (C3) dye, Oxadicarbocyanine (C5) dye, Oxatricarbocyanine (C7) dye, Pinacyanol iodide, Stains all, Thiacarbocyanine (C3) dye, ethanol, Thiacarbocyanine (C3) dye, n-propanol, Thiadicarbocyanine (C5) dye,
  • Dipyrrin dyes including N,N'-Difluoroboryl-1,9-dimethyl-5-(4-iodophenyl)-dipyrrin, N,N'-Difluoroboryl-1,9-dimethyl-5-[(4-(2-trimethylsilylethynyl), N,N'-Difluoroboryl-1,9-dimethyl-5-phenydipyrrin, or the like;
  • Merocyanines including 4-(dicyanomethylene)-2-methyl-6-(p-dimethylaminostyryl)-4H-pyran (DCM), acetonitrile, 4-(dicyanomethylene)-2-methyl-6-(p-dimethylaminostyryl)-4H-pyran (DCM), methanol, 4-Dimethylamino-4'-nitrostilbene, Merocyanine 540, or the like;
  • Miscellaneous Dyes including 4',6-Diamidino-2-phenylindole (DAPI), dimethylsulfoxide, 7-Benzylamino-4-nitrobenz-2-oxa-1,3-diazole, dansyl glycine, dansyl glycine, dioxane, Hoechst 33258, DMF, Hoechst 33258, Lucifer yellow CH, Piroxicam, Quinine sulfate, Quinine sulfate, Squarylium dye III, or the like; Oligophenylenes including 2,5-Diphenyloxazole (PPO), Biphenyl, POPOP, p-Quaterphenyl, p-Terphenyl, or the like;
  • Oxazines including Cresyl violet perchlorate, Nile Blue, methanol, Nile Red, ethanol, Oxazine 1, Oxazine 170, or the like;
  • Polycyclic Aromatic Hydrocarbons including 9,10-Bis(phenylethynyl)anthracene, 9,10-Diphenylanthracene, Anthracene, Naphthalene, Perylene, Pyrene, or the like;
  • polyene/polynes including 1,2-diphenylacetylene, 1,4-diphenylbutadiene, 1,4-diphenylbutadiyne, 1,6-Diphenylhexatriene, Beta-carotene, Stilbene, or the like;
  • Redox-active Chromophores including Anthraquinone, Azobenzene, Benzoquinone, Ferrocene, Riboflavin, Tris(2,2'-bipyridyl)ruthenium(II), Tetrapyrrole, Bilirubin, Chlorophyll a, diethyl ether,
  • Chlorophyll a, methanol, Chlorophyll b, Diprotonated-tetraphenylporphyrin, Hematin, Magnesium octaethylporphyrin, Magnesium octaethylporphyrin (MgOEP), Magnesium phthalocyanine (MgPc), PrOH, Magnesium phthalocyanine (MgPc), pyridine,
  • Magnesium tetramesitylporphyrin (MgTMP)
  • Rhodamine 123, Rhodamine 6G, Rhodamine B, Rose bengal, Sulforhodamine 101, or the like; or mixtures or combination thereof or synthetic derivatives thereof.
  • xanthene derivatives such as fluorescein, rhodamine, Oregon green, eosin, and Texas red; cyanine derivatives such as cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, and merocyanine; naphthalene derivatives (dansyl and prodan derivatives); coumarin derivatives; oxadiazole derivatives such as pyridyloxazole, nitrobenzoxadiazole, and benzoxadiazole; pyrene derivatives such as cascade blue;
  • oxazine derivatives such as Nile red, Nile blue, cresyl violet, and oxazine 170; acridine derivatives such as proflavin, acridine orange, and acridine yellow; arylmethine derivatives such as auramine, crystal violet, and malachite green; and tetrapyrrole derivatives such as porphin, phthalocyanine, bilirubin.
  • the fluorescent moiety is a dye that is xanthene, fluorescein, rhodamine, BODIPY, cyanine, coumarin, pyrene, phthalocyanine, phycobiliprotein, ALEXA FLUOR® 350, ALEXA FLUOR® 405, ALEXA FLUOR® 430, ALEXA FLUOR® 488, ALEXA FLUOR® 514, ALEXA FLUOR® 532, ALEXA FLUOR® 546, ALEXA FLUOR® 555, ALEXA FLUOR® 568, ALEXA FLUOR® 594, ALEXA FLUOR® 610, ALEXA FLUOR® 633, ALEXA FLUOR® 647, ALEXA FLUOR® 660, ALEXA FLUOR® 680, ALEXA FLUOR® 700, ALEXA FLUOR®
  • the label is one available from ATTO-TEC GmbH (Am Eichenhang 50, 57076 Siegen, Germany), e.g., as described in U.S. Pat. Appl. Pub. Nos. 20110223677, 20110190486, 20110172420, 20060179585, and 20030003486; and in U.S. Pat. No.
  • dyes having emission maxima outside these ranges may be used as well.
  • dyes with emission maxima between 500 nm and 700 nm have the advantage of being in the visible spectrum and can be detected using existing photomultiplier tubes.
  • the broad range of available dyes allows selection of dye sets that have emission wavelengths that are spread across the detection range. Detection systems capable of distinguishing many dyes are known in the art.
Samples
  • nucleic acids are isolated from a biological sample containing a variety of other components, such as proteins, lipids, and non-template nucleic acids.
  • Nucleic acid template molecules can be obtained from any material (e.g., cellular material (live or dead), extracellular material, viral material, environmental samples (e.g., metagenomic samples), synthetic material (e.g., amplicons such as provided by PCR or other amplification technologies)), obtained from an animal, plant, bacterium, archaeon, fungus, or any other organism.
  • Biological samples for use in the present technology include viral particles or preparations thereof.
  • Nucleic acid molecules can be obtained directly from an organism or from a biological sample obtained from an organism, e.g., from blood, urine, cerebrospinal fluid, seminal fluid, saliva, sputum, stool, hair, sweat, tears, skin, and tissue.
  • Exemplary samples include, but are not limited to, whole blood, lymphatic fluid, serum, plasma, buccal cells, sweat, tears, saliva, sputum, hair, skin, biopsy, cerebrospinal fluid (CSF), amniotic fluid, seminal fluid, vaginal excretions, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluids, intestinal fluids, fecal samples, and swabs, aspirates (e.g., bone marrow, fine needle, etc.), washes (e.g., oral, nasopharyngeal, bronchial, bronchialalveolar, optic, rectal, intestinal, vaginal, epidermal, etc.), and/or other specimens.
  • Any tissue or body fluid specimen may be used as a source for nucleic acid for use in the technology, including forensic specimens, archived specimens, preserved specimens, and/or specimens stored for long periods of time, e.g., fresh-frozen, methanol/acetic acid fixed, or formalin-fixed paraffin embedded (FFPE) specimens and samples.
  • Nucleic acid template molecules can also be isolated from cultured cells, such as a primary cell culture or a cell line. The cells or tissues from which template nucleic acids are obtained can be infected with a virus or other intracellular pathogen.
  • a sample can also be total RNA extracted from a biological specimen, a cDNA library, viral, or genomic DNA.
  • a sample may also be isolated DNA from a non-cellular origin, e.g., amplified/isolated DNA that has been stored in a freezer.
  • Nucleic acid molecules can be obtained, e.g., by extraction from a biological sample, e.g., by a variety of techniques such as those described by Maniatis, et al. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y. (see, e.g., pp. 280-281).
  • the technology provides for the size selection of nucleic acids, e.g., to remove very short fragments or very long fragments.
  • the technology is used to identify a nucleic acid in situ.
  • embodiments of the technology provide for the identification of a nucleic acid directly in a tissue, cell, etc. (e.g., after permeabilizing the tissue, cell, etc.) without extracting the nucleic acid from the tissue, cell, etc.
  • the technology is applied in vivo, ex vivo, and/or in vitro.
  • the sample is a crude sample, a minimally treated cell lysate, or a biofluid lysate.
  • the nucleic acid is detected in a crude lysate without nucleic acid purification.
  • kits for the detection of a nucleic acid are provided.
  • a kit comprising a solid support (e.g., a microscope slide, a bead, a coverslip, an avidin (e.g., streptavidin)-conjugated microscope slide or coverslip, a solid support comprising a zero mode waveguide array, or the like), a dCas9/gRNA (e.g., comprising a biotinylated dCas9), and a query probe as described herein.
  • Some embodiments further provide a non-biotinylated dCas9/gRNA.
  • kits for multiplex detection comprise two or more query probes each comprising a sequence
  • kits comprise one or more positive controls and/or one or more negative controls. Some embodiments comprise a series of controls having known concentrations, e.g., to produce a standard curve of concentrations.
  • Systems according to the technology comprise, e.g., a solid support (e.g., a microscope slide, a coverslip, an avidin (e.g., streptavidin)-conjugated microscope slide or coverslip, a solid support comprising a zero mode waveguide array, or the like), a dCas9/gRNA (e.g., comprising a biotinylated dCas9), and a query probe as described herein. Some embodiments further provide a non-biotinylated dCas9/gRNA.
  • Some embodiments further comprise a fluorescence microscope comprising an illumination configuration to excite bound query probes (e.g., a prism-type total internal reflection fluorescence (TIRF) microscope, an objective-type TIRF microscope, a near-TIRF or HiLo microscope, a confocal laser scanning microscope, a zero-mode waveguide, and/or an illumination configuration capable of parallel monitoring of a large area of the slide or coverslip (> 100 μm²) while restricting illumination to a small region of space near the surface).
  • Some embodiments comprise a fluorescence detector, e.g., a detector comprising an intensified charge coupled device (ICCD), an electron-multiplying charge coupled device (EM-CCD), a complementary metal-oxide-semiconductor (CMOS), a photomultiplier tube (PMT), an avalanche photodiode (APD), and/or another detector capable of detecting fluorescence emission from single chromophores.
  • embodiments comprise a computer and software encoding instructions for the computer to perform.
  • Some embodiments comprise optics, such as lenses, mirrors, dichroic mirrors, optical filters, etc., e.g., to detect fluorescence selectively within a specific range of wavelengths or multiple ranges of wavelengths.
  • computer-based analysis software is used to translate the raw data generated by the detection assay (e.g., the presence, absence, or amount of one or more nucleic acids (e.g., one or more biomarkers)) into data of predictive value for a clinician.
  • the clinician can access the predictive data using any suitable means.
  • a computer system upon which embodiments of the present technology may be implemented.
  • a computer system includes a bus or other communication mechanism for communicating information and a processor coupled with the bus for processing information.
  • the computer system includes a memory, which can be a random access memory (RAM) or other dynamic storage device, coupled to the bus, and instructions to be executed by the processor. Memory also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by the processor.
  • the computer system can further include a read only memory (ROM) or other static storage device coupled to the bus for storing static information and instructions for the processor.
  • a storage device such as a magnetic disk or optical disk, can be provided and coupled to the bus for storing information and instructions.
  • the computer system is coupled via the bus to a display, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), for displaying information to a computer user.
  • An input device can be coupled to the bus for communicating information and command selections to the processor.
  • a cursor control such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to the processor and for controlling cursor movement on the display.
  • This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • a computer system can perform embodiments of the present technology.
  • results can be provided by the computer system in response to the processor executing one or more sequences of one or more instructions contained in the memory.
  • Such instructions can be read into the memory from another computer-readable medium, such as a storage device.
  • Execution of the sequences of instructions contained in the memory can cause the processor to perform the methods described herein.
  • hard-wired circuitry can be used in place of or in combination with software instructions to implement the present teachings.
  • implementations of the present technology are not limited to any specific combination of hardware circuitry and software.
  • non-volatile media can include, but are not limited to, optical or magnetic disks, such as a storage device.
  • volatile media can include, but are not limited to, dynamic memory.
  • transmission media can include, but are not limited to, coaxial cables, copper wire, and fiber optics, including the wires that comprise the bus.
  • Computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, PROM, and EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.
  • Various forms of computer readable media can be involved in carrying one or more sequences of one or more instructions to the processor for execution.
  • the instructions can initially be carried on the magnetic disk of a remote computer.
  • the remote computer can load the instructions into its dynamic memory and send the instructions over a network connection (e.g., a LAN, a WAN, the internet, a telephone line).
  • a local computer system can receive the data and transmit it to the bus.
  • the bus can carry the data to the memory, from which the processor retrieves and executes the instructions.
  • the instructions received by the memory may optionally be stored on a storage device either before or after execution by the processor.
  • instructions configured to be executed by a processor to perform a method are stored on a computer-readable medium.
  • the computer-readable medium can be a device that stores digital information.
  • a computer-readable medium includes a compact disc read-only memory (CD-ROM) as is known in the art for storing software.
  • the computer-readable medium is accessed by a processor suitable for executing instructions configured to be executed.
  • some embodiments of the technology provided herein further comprise functionalities for collecting, storing, and/or analyzing data (e.g., presence, absence, concentration of a nucleic acid).
  • some embodiments contemplate a system that comprises a processor, a memory, and/or a database for, e.g., storing and executing instructions, analyzing fluorescence, image data, performing calculations using the data, transforming the data, and storing the data.
  • an algorithm applies a statistical model (e.g., a Poisson model or hidden Markov model) to the data.
  • an equation comprising variables representing the presence, absence, concentration, amount, or sequence properties of multiple nucleic acids produces a value that finds use in making a diagnosis or assessing the presence or qualities of a nucleic acid.
  • this value is presented by a device, e.g., by an indicator related to the result (e.g., an LED, an icon on a display, a sound, or the like).
  • a device stores the value, transmits the value, or uses the value for additional calculations.
  • an equation comprises variables representing the presence, absence, concentration, amount, or sequence properties of one or more of a methylated locus in genomic DNA, a microRNA, a mutant gene biomarker, or a chromosomal aberration.
  • the present technology provides the further benefit that a clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data.
  • the data are presented directly to the clinician in its most useful form.
  • the clinician is then able to utilize the information to optimize the care of a subject.
  • the present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and/or subjects.
  • a sample is obtained from a subject and submitted to a profiling service (e.g., a clinical lab at a medical facility, genomic profiling business, etc.), located in any part of the world (e.g., in a country different than the country where the subject resides or where the information is ultimately used) to generate raw data.
  • the subject may visit a medical center to have the sample obtained and sent to the profiling center or subjects may collect the sample themselves and directly send it to a profiling center.
  • the information may be directly sent to the profiling service by the subject (e.g., an information card containing the information may be scanned by a computer and the data transmitted to a computer of the profiling center using electronic communication systems).
  • Once received by the profiling service, the sample is processed and a profile is produced that is specific for the diagnostic or prognostic information desired for the subject.
  • the profile data are then prepared in a format suitable for interpretation by a treating clinician. For example, rather than providing raw expression data, the prepared format may represent a diagnosis or risk assessment for the subject, along with recommendations for particular treatment options.
  • the data may be displayed to the clinician by any suitable method.
  • the profiling service generates a report that can be printed for the clinician (e.g., at the point of care) or displayed to the clinician on a computer monitor.
  • the information is first analyzed at the point of care or at a regional facility.
  • the raw data are then sent to a central processing facility for further analysis and/or to convert the raw data to information useful for a clinician or patient.
  • the central processing facility provides the advantage of privacy (all data are stored in a central facility with uniform security protocols), speed, and uniformity of data analysis.
  • the central processing facility can then control the fate of the data following treatment of the subject. For example, using an electronic communication system, the central facility can provide data to the clinician, the subject, or researchers.
  • the subject is able to access the data using the electronic communication system.
  • the subject may choose further intervention or counseling based on the results.
  • the data are used for research use.
  • the data may be used to further optimize the inclusion or elimination of markers as useful indicators of a particular condition associated with the disease.

Abstract

Provided herein is technology relating to detecting and identifying nucleic acids and particularly, but not exclusively, to compositions, methods, kits, and systems for detecting, identifying, and quantifying target nucleic acids with high confidence at single-molecule resolution.

Description

DETECTION OF NUCLEIC ACIDS
This application claims priority to United States provisional patent application serial number 62/293,589, filed February 10, 2016, which is incorporated herein by reference in its entirety.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with government support under grant GM062357 awarded by the U.S. National Institutes of Health, and under grant W911NF-12-1-0420 awarded by the U.S. Navy, Office of Naval Research. The government has certain rights in the invention.
FIELD
Provided herein is technology relating to detecting and identifying nucleic acids and particularly, but not exclusively, to compositions, methods, kits, and systems for detecting, identifying, and quantifying target nucleic acids with high confidence at single-molecule resolution.
BACKGROUND
Early detection is critical to the effective treatment of many diseases, especially cancer. Research related to identifying detectable biomarkers associated with early-stage disease has indicated that nucleic acids provide highly specific biomarkers of cancer and other maladies. For example, cancer cell-derived double-stranded DNA (dsDNA) together with secondary structured long non-coding RNA (lncRNA) have recently emerged as sensitive and specific biomarkers of cancer and other diseases in crude human biofluids such as blood, urine, and sputum.
However, despite their promise as diagnostic biomarkers, the sensitive and specific detection of nucleic acid biomarkers has proven challenging. In particular, existing techniques for detecting nucleic acids utilize probes that form a thermodynamically stable complex with the target molecule and are thus limited to weak and often unreliable thermodynamic discrimination against background signal, spurious targets, or closely related mutant nucleic acids. In addition, the presence of a complementary DNA or RNA strand in the sample severely limits accessibility of the target sequence. Thus, a sensitive and specific assay for the amplification-free detection of nucleic acids in minimally treated native biofluids is needed to provide a rapid and reliable identification and/or quantification of nucleic acid biomarkers.
SUMMARY
Accordingly, provided herein is a technology for the specific and ultrasensitive detection and counting of single nucleic acid (e.g., dsDNA, lncRNA, methylated DNA, etc.) target molecules based on the transient binding of short labeled probes to a target nucleic acid. In some embodiments, the target nucleic acid is detected by a kinetic "fingerprint" signal produced by the probe-target interaction. This Single-Molecule Recognition with Equilibrium Poisson Sampling (SiMREPS) technology provides for the sensitive detection of both single-stranded nucleic acids (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety) and double-stranded nucleic acids. In some embodiments, detection of double-stranded nucleic acids comprises use of dCas9-guided capture and DNA melting. In some embodiments, this technology comprises the capture of unlabeled targets on a glass or fused silica surface using a guide RNA (gRNA)-loaded, catalytically inactive ("dead") dCas9 enzyme or enzymes that bind one or more segments of the target nucleic acid with high specificity. In some embodiments (e.g., for detecting lncRNA), the technology further comprises use of a protospacer adjacent motif (PAM) oligonucleotide to provide for dCas9 targeting of nucleic acid targets (e.g., lncRNA).
In some embodiments, the technology comprises use of an intramolecular SiMREPS probe, e.g., a capture probe and a query probe that are linked to provide an intramolecular probing mechanism (see, e.g., Fig. 5 a and b; see infra).
More broadly, in some embodiments, one or more complexes (e.g., one or more dCas9/gRNA complexes) recognize specific segments (e.g., nucleotide sequences) in a target nucleic acid to immobilize the target nucleic acid to a surface. Furthermore, embodiments provide that these same or other complexes and/or complementary oligonucleotides melt double-stranded regions of the target nucleic acid to provide access for the binding of a labeled query probe (e.g., for binding of the query probe to a second segment, e.g., a query region). Surface capture (and, in some embodiments, melting of one or more double-stranded regions) is followed by observation of the repeated, transient binding of a short fluorescently labeled DNA query probe to the second segment (e.g., query region) of the target nucleic acid that has been made accessible by dCas9-mediated melting. Furthermore, in addition to using dCas9/gRNA for immobilization and/or exposure of target sequences (e.g., one or more query regions), embodiments also provide a technology in which a labeled (e.g., fluorescently labeled) dCas9/gRNA complex provides detection of the target nucleic acid. In particular, the dCas9/gRNA complex provides a query probe for SiMREPS, since the dwell time of dCas9/gRNA on a DNA sequence is sensitive to the number of base pairs formed between the gRNA and the target DNA sequence. Embodiments of the technology provide that the number of base pairs formed between the gRNA and the target DNA is tuned to promote rapid dissociation from mutant sequences but slow dissociation from wild-type sequences or vice versa. Furthermore, engineered dCas9 proteins provide the appropriate kinetics and sequence specificity (see, e.g., Kleinstiver et al. (2016) "High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects" Nature 529: 490-495; Slaymaker et al. (2015) "Rationally engineered Cas9 nucleases with improved specificity" Science 351: 84-8, describing engineered Cas9 proteins that interact more weakly with the DNA backbone than native Cas9, thus reducing off-target effects).
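The dwell-time discrimination described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each molecule is classified by comparing its mean bound-state dwell time to a fixed threshold. The function names, the 12 s threshold, and the example dwell times are hypothetical and are not taken from this disclosure.

```python
from statistics import mean

def mean_dwell_time(dwell_times_s):
    """Average bound-state dwell time (seconds) for one molecule."""
    return mean(dwell_times_s)

def classify_by_dwell(dwell_times_s, threshold_s):
    """Label a molecule 'slow' (long dwell) or 'fast' (short dwell).

    Which label corresponds to wild type or mutant depends on how the
    probe was tuned; that mapping is an assumption left to the user.
    """
    return "slow" if mean_dwell_time(dwell_times_s) >= threshold_s else "fast"

# Hypothetical dwell times (seconds) for two observed molecules.
molecule_a = [21.0, 30.5, 18.2, 25.1]  # long-lived binding events
molecule_b = [4.1, 6.0, 3.2, 5.5]      # short-lived binding events

labels = [classify_by_dwell(m, threshold_s=12.0) for m in (molecule_a, molecule_b)]
```

In practice the threshold would be chosen from control measurements on known sequences rather than fixed in advance.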
In particular embodiments described herein, the SiMREPS probes repeatedly bind to a target sequence (e.g., query region) specifically made accessible by binding of dCas9/gRNA to one or more nucleic acid region(s) adjacent to the target region (e.g., "adjacent regions"). The repeated binding of the query probes to the query region provides a unique, continuous kinetic "fingerprint", providing a large number of independent measurements for each observed target molecule. This repeated kinetic sampling affords two main advantages: (1) arbitrarily high discrimination against background signals with increased sampling time, essentially eliminating false positive signals; and (2) exquisite sensitivity to subtle differences in the identity of the molecular target, allowing for discrimination of slightly different biomarkers (e.g., differing by only one DNA/RNA base of a disease-related mutation, a methylation pattern or other chemical marks) with very high confidence. In addition, as a single-molecule technique, this approach detects a small amount of target molecule (such as derived from a single cell) in the presence of a large excess of a closely related, but spurious (e.g., mutant) target. In contrast to techniques requiring PCR amplification (the current standard for low-abundance targets), this technique utilizes no target amplification and hence requires pre-treatment of biological samples only with, e.g., dCas9/gRNA at ambient temperature prior to target detection, avoiding the introduction of sampling bias. In addition, as a direct detection technique, it does not lose any chemical marks on the target as amplification-based detection approaches often do. The technique finds use, for example, in the diagnosis of cancer from circulating tumor DNA and lncRNA in human blood serum. The technology finds use also in drug and antibiotic resistance gene detection in pathogens and human tumor cells.
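The first advantage, discrimination that improves with sampling time, can be sketched numerically under a simplifying assumption: binding and dissociation events on a genuine target and on background each follow a Poisson process with a constant rate. The rates below (3.0 and 0.5 events per minute) are hypothetical, and the separation score is an illustrative figure of merit rather than a quantity defined in this disclosure.

```python
import math

def separation_score(rate_target, rate_background, minutes):
    """Difference in expected event counts divided by the combined
    Poisson standard deviation; larger means cleaner discrimination."""
    mu_t = rate_target * minutes       # expected events on a true target
    mu_b = rate_background * minutes   # expected events from background
    return (mu_t - mu_b) / math.sqrt(mu_t + mu_b)

# Hypothetical rates in events per minute; observation times in minutes.
scores = {t: separation_score(3.0, 0.5, t) for t in (1, 2, 5, 10)}
```

Under this assumption the score scales with the square root of the observation time, so a tenfold longer acquisition improves the separation by a factor of about 3.2.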
Accordingly, provided herein are embodiments of a complex for providing a detectable fingerprint of a double-stranded target nucleic acid, the complex comprising a double-stranded target nucleic acid (e.g., a DNA, an RNA, a DNA/RNA hybrid) comprising a first region adjacent to a second region; a melting component (e.g., an immobilized melting component) interacting with the first region to form a thermodynamically stable complex and provide the second region in a single-stranded form; and a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid. In some embodiments, the double-stranded nucleic acid is a single-stranded nucleic acid that comprises a region having a double-stranded secondary structure. In some embodiments, the melting component comprises a dCas9. In some embodiments, the melting component comprises a single-stranded binding protein. In some embodiments, the melting component is a protein and in some embodiments the melting component is a nucleic acid. Embodiments comprise use of a protein that binds to double-stranded nucleic acids (e.g., double-stranded DNA, double-stranded RNA, a double-stranded DNA/RNA hybrid) and/or a melting component (e.g., a protein or a nucleic acid) to dissociate a double-stranded nucleic acid (e.g., a region of a nucleic acid) to provide a single-stranded nucleic acid. In particular embodiments, the melting component comprises a dCas9/gRNA complex comprising a gRNA hybridized to the first region. In some embodiments, the melting component comprises a PAMmer, e.g., provided in trans to the target nucleic acid.
In some embodiments, the query probe hybridizes repeatedly to the second region with a kinetic rate constant koff that is greater than 0.1 min⁻¹ and/or a kinetic rate constant kon that is greater than 0.1 min⁻¹. In some embodiments, the query probe hybridizes repeatedly to the second region with a kinetic rate constant koff that is greater than 1 min⁻¹ and/or a kinetic rate constant kon that is greater than 1 min⁻¹. In some embodiments, the query probe is a fluorescently labeled nucleic acid that hybridizes repeatedly to the second region with a kinetic rate constant koff that is greater than 0.1 min⁻¹ and/or a kinetic rate constant kon that is greater than 0.1 min⁻¹. In some embodiments, the query probe is a fluorescently labeled nucleic acid that hybridizes repeatedly to the second region with a kinetic rate constant koff that is greater than 1 min⁻¹ and/or a kinetic rate constant kon that is greater than 1 min⁻¹. In some embodiments, the melting component comprises a dCas9 that is immobilized to a substrate.
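As a minimal sketch of how the rate-constant thresholds above might be checked, the following assumes single-exponential dissociation, for which the rate constant can be estimated as the reciprocal of the mean bound-state dwell time. The function name and the dwell times are hypothetical. Applying the same calculation to unbound dwell times would give an apparent binding rate in the same units.

```python
def rate_constant_per_min(dwell_times_s):
    """Estimate a first-order rate constant (min^-1) as the reciprocal
    of the mean dwell time, assuming single-exponential kinetics."""
    mean_dwell_s = sum(dwell_times_s) / len(dwell_times_s)
    return 60.0 / mean_dwell_s  # convert s^-1 to min^-1

# Hypothetical bound-state dwell times in seconds (mean = 20 s).
bound_dwells_s = [20.0, 25.0, 15.0, 20.0]

k_off = rate_constant_per_min(bound_dwells_s)
exceeds_0_1 = k_off > 0.1  # threshold of 0.1 min^-1 recited above
exceeds_1 = k_off > 1.0    # threshold of 1 min^-1 recited above
```

A mean dwell time of 20 s corresponds to an estimated koff of 3 min⁻¹, which satisfies both thresholds.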
In some embodiments, data are analyzed, e.g., in some embodiments the fingerprint is detectable by a pattern recognition analysis.
Additional embodiments comprise a second melting component interacting with a third region of the target nucleic acid adjacent to the second region of the target nucleic acid. In some embodiments, the second melting component comprises a dCas9. In some embodiments, the second melting component comprises a single-stranded binding protein. In some embodiments, the second melting component is a protein and in some embodiments the second melting component is a nucleic acid. In particular embodiments, the second melting component comprises a dCas9/gRNA complex comprising a gRNA hybridized to the third region. In some embodiments, the melting component comprises a PAMmer, e.g., provided in trans to the target nucleic acid. In some embodiments, the first and second melting components bind approximately 5 to 15 nucleotides apart on the target nucleic acid, e.g., to provide access to the query region by a query probe.
The technology finds use in the detection, identification, and/or quantification of nucleic acids, e.g., in some embodiments, the target nucleic acid comprises a mutation, a single nucleotide polymorphism, or a modified base.
Additional embodiments provide a method for providing a detectable fingerprint of a double-stranded target nucleic acid in a sample, the method comprising immobilizing a double-stranded target nucleic acid to a discrete region of a solid support, said double-stranded target nucleic acid comprising a first region adjacent to a second region and said discrete region of said solid support comprising an immobilized melting component interacting with the first region; providing a query probe that binds repeatedly to the second region to provide a detectable fingerprint; and associating the detectable fingerprint with the double-stranded nucleic acid to identify the double-stranded nucleic acid. Some embodiments comprise analyzing data using pattern recognition or a similar analysis (e.g., machine learning, neural network, supervised and/or unsupervised learning, etc.) to produce or identify the detectable fingerprint of the double-stranded nucleic acid.
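One very simple form of pattern recognition on query probe binding data is to count transitions between a low-intensity (unbound) and a high-intensity (bound) state in a fluorescence trace. The sketch below uses a fixed intensity threshold as a stand-in for the statistical models mentioned elsewhere in this document (e.g., a hidden Markov model). The function names, threshold, minimum transition count, and trace values are all hypothetical.

```python
def count_transitions(intensity, threshold):
    """Count crossings between the unbound state (below threshold)
    and the bound state (at or above threshold)."""
    states = [value >= threshold for value in intensity]
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)

def looks_like_target(intensity, threshold, min_transitions):
    """Accept a trace only if it shows enough binding/dissociation events."""
    return count_transitions(intensity, threshold) >= min_transitions

# Hypothetical intensity trace (arbitrary units), one value per frame.
trace = [10, 12, 95, 98, 11, 9, 102, 97, 99, 10, 12, 96, 8]

n_transitions = count_transitions(trace, threshold=50)
accepted = looks_like_target(trace, threshold=50, min_transitions=5)
```

A fixed threshold is sensitive to noise near the cutoff, which is one reason a real analysis would more likely idealize traces with a statistical model before counting.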
In some embodiments of methods, the melting component comprises a dCas9. In some embodiments of methods, the melting component comprises a single-stranded binding protein. In some embodiments of methods, the melting component is a protein and in some embodiments the melting component is a nucleic acid. In particular method embodiments, the melting component comprises a dCas9/gRNA complex comprising a gRNA hybridized to the first region. In some embodiments of methods, the melting component comprises a PAMmer, e.g., provided in trans to the target nucleic acid.
Additional embodiments comprise providing a second melting component that interacts with a third region of the target nucleic acid, said third target region adjacent to the second region. For example, some embodiments comprise providing a dCas9/gRNA complex comprising a gRNA complementary to a third region of the target nucleic acid adjacent to the second region of the target nucleic acid. Related embodiments comprise providing conditions sufficient for the melting component to provide the second region in a single-stranded form, e.g., buffered pH conditions, temperature control, solution components (e.g., salts, counterions, cofactors, etc.), etc.
Some embodiments comprise detecting repeated binding of the query probe to the second region with a kinetic rate constant koff that is greater than 0.1 min⁻¹ and/or a kinetic rate constant kon that is greater than 0.1 min⁻¹. Some embodiments comprise detecting repeated binding of the query probe to the second region with a kinetic rate constant koff that is greater than 1 min⁻¹ and/or a kinetic rate constant kon that is greater than 1 min⁻¹. Some embodiments comprise detecting repeated binding of a fluorescently labeled nucleic acid to the second region with a kinetic rate constant koff that is greater than 0.1 min⁻¹ and/or a kinetic rate constant kon that is greater than 0.1 min⁻¹. Some embodiments comprise detecting repeated binding of a fluorescently labeled nucleic acid to the second region with a kinetic rate constant koff that is greater than 1 min⁻¹ and/or a kinetic rate constant kon that is greater than 1 min⁻¹.
Some embodiments comprise calculating an amount or concentration of the double-stranded target nucleic acid in the sample from the detectable fingerprint.
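One way such a calculation could work is with a standard curve: the number of molecules accepted as targets is measured for controls of known concentration, a straight line is fitted, and the line is inverted for an unknown sample. The sketch below assumes a linear response. The concentrations, counts, and function names are hypothetical and chosen so the arithmetic is easy to follow.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_count(count, slope, intercept):
    """Invert the standard curve to estimate concentration."""
    return (count - intercept) / slope

# Hypothetical standards: concentration (pM) and accepted-molecule counts.
standards_pM = [0.0, 1.0, 2.0, 5.0, 10.0]
counts = [5.0, 25.0, 45.0, 105.0, 205.0]

slope, intercept = fit_line(standards_pM, counts)
unknown_pM = concentration_from_count(65.0, slope, intercept)
```

With these made-up standards the fit gives 20 counts per pM above a background of 5 counts, so a sample yielding 65 counts is estimated at 3 pM.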
Embodiments of the technology provide a system for the detection of a double-stranded nucleic acid. In some embodiments, the system comprises a solid support comprising an immobilized melting component, a detectably labeled query probe that binds repeatedly to the double-stranded nucleic acid, a fluorescence detector, and a software component configured to perform pattern recognition analysis of query probe binding data. Embodiments of systems comprise compositions described herein and/or comprise components (e.g., a computer, processor, etc.) to perform methods as described herein.
The technology finds use in embodiments of a method for calculating a predictor that a subject has or is at risk of having a cancer. For example, embodiments of said methods comprise determining the presence of a microRNA biomarker, determining the presence of a mutation, and determining the presence of a modified base in genomic DNA. In some embodiments, the predictor is a value calculated from variables associated with the presence of a microRNA biomarker, the presence of a mutation, and the presence of a modified base in genomic DNA.
Related embodiments provide a complex for detecting a target nucleic acid, the complex comprising a target nucleic acid comprising a first region adjacent to a second region; a detectably labeled capture probe hybridized to the first region; and a query probe labeled with a quencher or fluorescent acceptor compatible with the label of the capture probe, wherein the query probe hybridizes repeatedly to the second region with a kinetic rate constant koff that is greater than 0.1 min⁻¹ and/or a kinetic rate constant kon that is greater than 0.1 min⁻¹.
Additional embodiments will be apparent to persons skilled in the relevant art based on the teachings contained herein.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features, aspects, and advantages of the present technology will become better understood with regard to the following drawings:
Figure 1 a is a schematic drawing of an embodiment of the nucleic acid detection technology provided herein.
Figure 1 b shows exemplary SiMREPS data of fluorescently labeled query probes transiently associating non-specifically to a slide surface.
Figure 1 c shows exemplary SiMREPS data of fluorescently labeled query probes transiently binding to a target nucleic acid. The kinetic fingerprints of 1c and 1b are different, thus providing examples of the different kinetic signatures for specific and non-specific binding of query probes.
Figure 1 d is a series of histograms indicating the number of query probes counted to have a given number of intensity transitions (Nb+d) in the absence (thick gray bars) or presence (thin lines) of 1 pM target nucleic acid (e.g., a miR-141 microRNA). The four histograms plot data acquired with acquisition times of 1, 2, 5, and 10 minutes.
Figure 1 e shows plots of standard curves from SiMREPS assays of five miRNAs, yielding R² values > 0.99. The SiMREPS technology provides high-confidence detection of nucleic acids.
Figure 2 a is a plot showing that the fluorescent query probe for let-7a exhibits long lifetimes of binding to let-7a (τon = 23.3 ± 8.3 s) but much more transient binding to let-7c (τon = 4.7 ± 3.0 s) due to a single mismatch in the let-7c sequence relative to let-7a (underlined "G").
Figure 2 b is a dwell time analysis showing the high-confidence single-copy-level discrimination between let-7a (closed circles) and let-7c (open circles).
Figure 2 c is a receiver operating characteristic (ROC) plot constructed by varying the τon threshold for discriminating between let-7a and let-7c.
Figure 2 d is a Nb+d histogram for the detection of let-7 in crude HeLa cell extract in the presence or absence of the miRCURY let-7 inhibitor. The Nb+d histogram for endogenous hsa-let-7a showed a well-defined peak (thin line) that vanished in the presence of a let-7 inhibitor designed to bind and sequester let-7 family members (thick grey bars).
Figure 2 e shows the dwell times for molecules detected in crude HeLa extract using the fluorescent and capture probes for let-7a. The filled and open circles represent two clusters of target molecules classified by k-means clustering of τon values, consistent with the expected τon distributions for single-nucleotide mutants hsa-let-7a and hsa-let-7c.
Figure 2 f shows the quantification of synthetic miR-141 spiked into human serum.
Figure 3 is a schematic diagram showing an embodiment of the technology comprising a dCas9/gRNA. Genomic target DNA is briefly pre-treated with dCas9/gRNA. During this time, the guide RNA (gRNA) of the dCas9/gRNA complex hybridizes to the target nucleic acid (e.g., a genomic DNA) at a region adjacent to a query region (e.g., complementary to a query probe). The dCas9/gRNA melts a specific DNA sequence (e.g., the query region) in the target nucleic acid. After capture of the biotinylated dCas9 onto a slide surface (e.g., by biotin-avidin interaction), SiMREPS is used to detect binding of the query probe to the now accessible query region in the target nucleic acid, which is adjacent to the site of target nucleic acid hybridized to the gRNA.
Figure 4 is a schematic diagram showing an embodiment of the technology comprising hybridization of two flanking dCas9/gRNA complexes to a target nucleic acid. In some embodiments, hybridizing two dCas9/gRNA complexes to flank the query region further improves accessibility of the target nucleic acid (e.g., the query region) to the query probe for SiMREPS-based detection, and, in some embodiments, increases specificity. In the embodiment shown in the figure, one dCas9 is biotinylated for capture onto the surface and the other dCas9 is not biotinylated, though in some embodiments the second dCas9 is modified, e.g., biotinylated.
Figure 5 a is a schematic showing an embodiment of intramolecular SiMREPS probing in which the query and capture probes are linked on a contiguous oligonucleotide.
Figure 5 b is a schematic showing an embodiment of intramolecular SiMREPS probing in which the non-contiguous query and capture probes are co-localized by an address oligonucleotide.
It is to be understood that the figures are not necessarily drawn to scale, nor are the objects in the figures necessarily drawn to scale in relationship to one another. The figures are depictions that are intended to bring clarity and understanding to various embodiments of apparatuses, systems, and methods disclosed herein. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. Moreover, it should be appreciated that the drawings are not intended to limit the scope of the present teachings in any way.
DETAILED DESCRIPTION
Provided herein is technology relating to detecting and identifying nucleic acids and particularly, but not exclusively, to compositions, methods, kits, and systems for detecting, identifying, and quantifying target nucleic acids with high confidence at single-molecule resolution.
In this detailed description of the various embodiments, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the embodiments disclosed. One skilled in the art will appreciate, however, that these various embodiments may be practiced with or without these specific details. In other instances, structures and devices are shown in block diagram form. Furthermore, one skilled in the art can readily appreciate that the specific sequences in which methods are presented and performed are illustrative and it is contemplated that the sequences can be varied and still remain within the spirit and scope of the various embodiments disclosed herein.
All literature and similar materials cited in this application, including but not limited to, patents, patent applications, articles, books, treatises, and internet web pages are expressly incorporated by reference in their entirety for any purpose. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which the various embodiments described herein belong. When definitions of terms in incorporated references appear to differ from the definitions provided in the present teachings, the definition provided in the present teachings shall control. The section headings used herein are for organizational purposes only and are not to be construed as limiting the described subject matter in any way.
Definitions
To facilitate an understanding of the present technology, a number of terms and phrases are defined below. Additional definitions are set forth throughout the detailed description.
Throughout the specification and claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise. The phrase "in one embodiment" as used herein does not necessarily refer to the same embodiment, though it may. Furthermore, the phrase "in another embodiment" as used herein does not necessarily refer to a different embodiment, although it may. Thus, as described below, various embodiments of the invention may be readily combined, without departing from the scope or spirit of the invention.
In addition, as used herein, the term "or" is an inclusive "or" operator and is equivalent to the term "and/or" unless the context clearly dictates otherwise. The term "based on" is not exclusive and allows for being based on additional factors not described, unless the context clearly dictates otherwise. In addition, throughout the specification, the meaning of "a", "an", and "the" includes plural references. The meaning of "in" includes "in" and "on."
As used herein, a "nucleic acid" or a "nucleic acid sequence" refers to a polymer or oligomer of pyrimidine and/or purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively (See Albert L. Lehninger, Principles of Biochemistry, at 793-800 (Worth Pub. 1982)). The present technology contemplates any deoxyribonucleotide, ribonucleotide, or peptide nucleic acid component, and any chemical variants thereof, such as methylated, hydroxymethylated, or glycosylated forms of these bases, and the like. The polymers or oligomers may be heterogeneous or homogeneous in composition, and may be isolated from naturally occurring sources or may be artificially or synthetically produced. In addition, the nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states. In some embodiments, a nucleic acid or nucleic acid sequence comprises other kinds of nucleic acid structures such as, for instance, a DNA/RNA helix, peptide nucleic acid (PNA), morpholino, locked nucleic acid (LNA), and/or a ribozyme. Hence, the term "nucleic acid" or "nucleic acid sequence" may also encompass a chain comprising non-natural nucleotides, modified nucleotides, and/or non-nucleotide building blocks that can exhibit the same function as natural nucleotides (e.g., "nucleotide analogs"); further, the term "nucleic acid sequence" as used herein refers to an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin, which may be single or double-stranded, and represent the sense or antisense strand.
The term "nucleotide analog" as used herein refers to modified or non-naturally occurring nucleotides including but not limited to analogs that have altered stacking interactions such as 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP); base analogs with alternative hydrogen bonding configurations (e.g., such as Iso-C and Iso-G and other non-standard base pairs described in U.S. Pat. No. 6,001,983 to S. Benner and herein incorporated by reference); non-hydrogen bonding analogs (e.g., non-polar, aromatic nucleoside analogs such as 2,4-difluorotoluene, described by B. A. Schweitzer and E. T. Kool, J. Org. Chem., 1994, 59, 7238-7242, B. A. Schweitzer and E. T. Kool, J. Am. Chem. Soc., 1995, 117, 1863-1872; each of which is herein incorporated by reference); "universal" bases such as 5-nitroindole and 3-nitropyrrole; and universal purines and pyrimidines (such as "K" and "P" nucleotides, respectively; P. Kong, et al., Nucleic Acids Res., 1989, 17, 10373-10383, P. Kong et al., Nucleic Acids Res., 1992, 20, 5149-5152). Nucleotide analogs include nucleotides having modification on the sugar moiety, such as dideoxy nucleotides and 2'-O-methyl nucleotides. Nucleotide analogs include modified forms of deoxyribonucleotides as well as ribonucleotides.
"Peptide nucleic acid" means a DNA mimic that incorporates a peptide-like polyamide backbone.
As used herein, the term "% sequence identity" refers to the percentage of nucleotides or nucleotide analogs in a nucleic acid sequence that is identical with the corresponding nucleotides in a reference sequence after aligning the two sequences and introducing gaps, if necessary, to achieve the maximum percent identity. Hence, in case a nucleic acid according to the technology is longer than a reference sequence, additional nucleotides in the nucleic acid, that do not align with the reference sequence, are not taken into account for determining sequence identity. Methods and computer programs for alignment are well known in the art, including blastn, Align 2, and FASTA.
The terms "homology" and "homologous" refer to a degree of identity. There may be partial homology or complete homology. A partially homologous sequence is one that is less than 100% identical to another sequence. The term "sequence variation" as used herein refers to differences in nucleic acid sequence between two nucleic acids. For example, a wild-type structural gene and a mutant form of this wild-type structural gene may vary in sequence by the presence of single base substitutions and/or deletions or insertions of one or more nucleotides. These two forms of the structural gene are said to vary in sequence from one another. A second mutant form of the structural gene may exist. This second mutant form is said to vary in sequence from both the wild-type gene and the first mutant form of the gene.
As used herein, the terms "complementary" or "complementarity" are used in reference to polynucleotides (e.g., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules. For example, the sequence "5'-A-G-T-3'" is complementary to the sequence "3'-T-C-A-5'." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
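The base-pairing example above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the DNA-only lookup table are assumptions made for this sketch and are not part of the described technology.

```python
# Watson-Crick partners for DNA bases (illustrative assumption: DNA only).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(seq_5_to_3: str) -> str:
    """Return the complementary strand, written 3'->5', position by position."""
    return "".join(DNA_COMPLEMENT[base] for base in seq_5_to_3.upper())


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand written in the usual 5'->3' direction."""
    return complement_3_to_5(seq_5_to_3)[::-1]


# 5'-A-G-T-3' pairs with 3'-T-C-A-5', which reads 5'-A-C-T-3'.
print(complement_3_to_5("AGT"))   # TCA
print(reverse_complement("AGT"))  # ACT
```

Writing the complement in both orientations makes the antiparallel relationship explicit, which is where orientation errors most often arise.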
In some contexts, the term "complementarity" and related terms (e.g., "complementary", "complement") refers to the nucleotides of a nucleic acid sequence that can bind to another nucleic acid sequence through hydrogen bonds, e.g., nucleotides that are capable of base pairing, e.g., by Watson-Crick base pairing or other base pairing. Nucleotides that can form base pairs, e.g., that are complementary to one another, are the pairs: cytosine and guanine, thymine and adenine, adenine and uracil, and guanine and uracil. The percentage complementarity need not be calculated over the entire length of a nucleic acid sequence. The percentage of complementarity may be limited to a specific region over which the nucleic acid sequences are base-paired, e.g., starting from a first base-paired nucleotide and ending at a last base-paired nucleotide. The complement of a nucleic acid sequence as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in "antiparallel association." Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids of the present invention and include, for example, inosine and 7-deazaguanine.
Complementarity need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs.
Thus, in some embodiments, "complementary" refers to a first nucleobase sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identical to the complement of a second nucleobase sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more nucleobases, or that the two sequences hybridize under stringent hybridization conditions. "Fully complementary" means each nucleobase of a first nucleic acid is capable of pairing with each nucleobase at a corresponding position in a second nucleic acid. For example, in certain embodiments, an oligonucleotide wherein each nucleobase has complementarity to a nucleic acid has a nucleobase sequence that is identical to the complement of the nucleic acid over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more nucleobases.
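As an illustration of a percentage complementarity computed over a region, the following Python sketch counts base pairs between two equal-length regions aligned in antiparallel orientation, using the pairs listed above (C-G, T-A, A-U, and G-U). It is a simplified sketch under stated assumptions: the function names are hypothetical, the two regions are assumed to be ungapped and of equal length, and no gapped alignment is attempted.

```python
# Base pairs named in the definitions above (order-independent).
PAIRS = {frozenset("CG"), frozenset("TA"), frozenset("AU"), frozenset("GU")}


def can_pair(a: str, b: str) -> bool:
    """True if bases a and b form one of the listed pairs."""
    return frozenset((a.upper(), b.upper())) in PAIRS


def percent_complementarity(first_5_to_3: str, second_5_to_3: str) -> float:
    """Percent of positions that pair when the regions are aligned antiparallel.

    Assumes equal-length, ungapped regions (an assumption of this sketch).
    """
    if not first_5_to_3 or len(first_5_to_3) != len(second_5_to_3):
        raise ValueError("regions must be non-empty and of equal length")
    antiparallel = second_5_to_3[::-1]  # read the second strand 3'->5'
    paired = sum(can_pair(a, b) for a, b in zip(first_5_to_3, antiparallel))
    return 100.0 * paired / len(first_5_to_3)


print(percent_complementarity("AGTC", "GACT"))  # 100.0 (fully complementary)
print(percent_complementarity("AGTC", "GACA"))  # 75.0 (one mismatch in four)
```

A position that does not pair corresponds to a "mismatch" in the sense defined in this section.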
"Mismatch" means a nucleobase of a first nucleic acid that is not capable of pairing with a nucleobase at a corresponding position of a second nucleic acid.
As used herein, the term "hybridization" is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the Tm of the formed hybrid. "Hybridization" methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon. The initial observations of the "hybridization" process by Marmur and Lane, Proc. Natl. Acad. Sci. USA 46:453 (1960) and Doty et al., Proc. Natl. Acad. Sci. USA 46:461 (1960) have been followed by the refinement of this process into an essential tool of modern biology.
As used herein, the term "Tm" is used in reference to the "melting temperature." The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 0.41 * (% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see, e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi and SantaLucia, Biochemistry 36: 10581-94 (1997)) include more sophisticated computations which account for structural, environmental, and sequence characteristics to calculate Tm. For example, in some embodiments these computations provide an improved estimate of Tm for short nucleic acid probes and targets (e.g., as used in the examples).
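As a worked illustration of the simple estimate above, the following Python sketch computes the % G+C of a sequence and applies Tm = 81.5 + 0.41 * (% G+C). The function names are hypothetical, and the result is assumed to be in degrees Celsius, which is the conventional unit for this formula but is not stated explicitly above. The sketch does not implement the more sophisticated computations mentioned for short probes.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    seq = seq.upper()
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


def estimate_tm(seq: str) -> float:
    """Simple Tm estimate: 81.5 + 0.41 * (% G+C), assumed 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


# A sequence with 50% G+C gives 81.5 + 0.41 * 50 = 102.0
print(f"{estimate_tm('GCGCATAT'):.1f}")
```

Because the formula depends only on base composition, two probes with the same % G+C receive the same estimate regardless of length or sequence order, which is one reason the nearest-neighbor style computations cited above are preferred for short probes.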
As used herein, the term "melting" when used in reference to a nucleic acid refers to the dissociation of a double-stranded nucleic acid or region of a nucleic acid into a single-stranded nucleic acid or region of a nucleic acid.
As used herein, the term "melting component" refers to a substance, molecule (e.g., a biomolecule), or a complex of more than one molecule (e.g., a complex of more than one biomolecule) that interacts with a nucleic acid and melts it, e.g., dissociates double-stranded regions (e.g., secondary structure of a single-stranded nucleic acid, a duplex structure of DNA or of a RNA/DNA hybrid) to provide single-stranded regions, e.g., to provide access to query regions for binding of a query probe. In exemplary embodiments, a melting component is a dCas9/gRNA complex (e.g., in some embodiments, comprising a biotinylated dCas9 and in some embodiments comprising a non-biotinylated dCas9). The technology is not limited, however, to melting components that comprise dCas9. The technology comprises use of any entity that provides access to a query region by a query probe and allows SiMREPS assay of a target nucleic acid.
As used herein, a "double-stranded nucleic acid" may be a portion of a nucleic acid, a region of a longer nucleic acid, or an entire nucleic acid. A "double-stranded nucleic acid" may be, e.g., without limitation, a double-stranded DNA, a double-stranded RNA, a double-stranded DNA/RNA hybrid, etc. A single-stranded nucleic acid having secondary structure (e.g., base-paired secondary structure) and/or higher order structure comprises a "double-stranded nucleic acid". For example, triplex structures are considered to be "double-stranded". In some embodiments, any base-paired nucleic acid is a "double-stranded nucleic acid".
As used herein, a "non-coding RNA" or "ncRNA" is a functional RNA molecule that is not translated into a protein. Less-frequently used synonyms are non-protein-coding RNA (npcRNA), non-messenger RNA (nmRNA), small non-messenger RNA (snmRNA), and functional RNA (fRNA). The term small RNA (sRNA) is often used for bacterial ncRNAs. The DNA sequence from which a non-coding RNA is transcribed as the end product is often called an RNA gene or a non-coding RNA gene. Non-coding RNA genes include highly abundant and functionally important RNAs such as transfer RNA (tRNA) and ribosomal RNA (rRNA), as well as RNAs such as snoRNAs, microRNAs, siRNAs, and piRNAs. The number of ncRNAs encoded within the human genome is unknown; however, recent transcriptomic and bioinformatic studies suggest the existence of thousands of ncRNAs. Since most of the newly identified ncRNAs have not been validated for their function, it is possible that many are non-functional.
As used herein, the term "long non-coding RNA" or "lncRNA" or "long ncRNA" refers to a non-protein coding RNA longer than approximately 200 nucleotides. As used herein, the term is used to distinguish lncRNAs from small regulatory RNAs such as microRNAs (miRNAs), short interfering RNAs (siRNAs), Piwi-interacting RNAs (piRNAs), small nucleolar RNAs (snoRNAs), and other short RNAs.
As used herein, the term "miRNA" refers to microRNA. As used herein, the term "miRNA target sequence" refers to a miRNA that is to be detected (e.g., in the presence of other nucleic acids). In some embodiments, a miRNA target sequence is a variant of a miRNA.
The term "siRNAs" refers to short interfering RNAs. In some embodiments, siRNAs comprise a duplex, or double-stranded region, where each strand of the double-stranded region is about 18 to 25 nucleotides long; the double-stranded region can be as short as 16, and as long as 29, base pairs long, where the length is determined by the antisense strand. Often siRNAs contain from about two to four unpaired nucleotides at the 3' end of each strand. SiRNAs appear to function as key intermediates in triggering RNA interference in invertebrates and in vertebrates, and in triggering sequence-specific RNA degradation during posttranscriptional gene silencing in plants. At least one strand of the duplex or double-stranded region of a siRNA is substantially homologous to or substantially complementary to a target RNA molecule. The strand complementary to a target RNA molecule is the "antisense" strand; the strand homologous to the target RNA molecule is the "sense" strand and is also complementary to the siRNA antisense strand. One strand of the double-stranded region need not be the exact length of the opposite strand; thus, one strand may have at least one fewer nucleotides than the opposite complementary strand, resulting in a "bubble" or at least one unmatched base in the opposite strand. One strand of the double-stranded region need not be exactly complementary to the opposite strand; thus, the strand, preferably the sense strand, may have at least one mismatched base pair.
siRNAs may also contain additional sequences; non-limiting examples of such sequences include linking sequences, or loops, which connect the two strands of the duplex region. This form of siRNAs may be referred to as "si-like RNA", "short hairpin siRNA" where the short refers to the duplex region of the siRNA, or "hairpin siRNA".
Additional non-limiting examples of additional sequences present in siRNAs include stem and other folded structures. The additional sequences may or may not have known functions; non-limiting examples of such functions include increasing stability of an siRNA molecule, or providing a cellular destination signal.
"Pre-miRNA" or "pre-miR" means a non-coding RNA having a hairpin structure, which is the product of cleavage of a pri-miR by the double-stranded RNA-specific ribonuclease known as Drosha.
"Stem-loop sequence" means an RNA having a hairpin structure and containing a mature miRNA sequence. Pre-miRNA sequences and stem-loop sequences may overlap. Examples of stem-loop sequences are found in the miRNA database known as miRBase (available at the worldwide web at microrna.sanger.ac.uk).
"Pri-miRNA" or "pri-miR" means a non-coding RNA having a hairpin structure that is a substrate for the double-stranded RNA-specific ribonuclease Drosha.
"miRNA precursor" means a transcript that originates from a genomic DNA and that comprises a non-coding, structured RNA comprising one or more miRNA sequences. For example, in certain embodiments a miRNA precursor is a pre-miRNA. In certain embodiments, a miRNA precursor is a pri-miRNA.
The term "gene" refers to a DNA sequence that comprises control and coding sequences necessary for the production of an RNA having a non-coding function (e.g., a ribosomal or transfer RNA), a polypeptide or a precursor. The RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.
The term "wild-type" refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In contrast, the term "modified," "mutant," or "polymorphic" refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
The term "oligonucleotide" as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 10 to 15 nucleotides and more preferably at least about 15 to 30 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof.
Because mononucleotides are reacted to make oligonucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage, an end of an oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends. A first region along a nucleic acid strand is said to be upstream of another region if the 3' end of the first region is before the 5' end of the second region when moving along a strand of nucleic acid in a 5' to 3' direction.
When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide points towards the 5' end of the other, the former may be called the "upstream" oligonucleotide and the latter the "downstream" oligonucleotide. Similarly, when two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5' end is upstream of the 5' end of the second oligonucleotide, and the 3' end of the first oligonucleotide is upstream of the 3' end of the second oligonucleotide, the first oligonucleotide may be called the "upstream" oligonucleotide and the second oligonucleotide may be called the "downstream" oligonucleotide.
As used herein, the terms "subject" and "patient" refer to any organisms including plants, microorganisms, and animals (e.g., mammals such as dogs, cats, livestock, and humans).
The term "sample" in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples. A sample may include a specimen of synthetic origin.
As used herein, a "biological sample" refers to a sample of biological tissue or fluid. For instance, a biological sample may be a sample obtained from an animal (including a human); a fluid, solid, or tissue sample; as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc. Examples of biological samples include sections of tissues, blood, blood fractions, plasma, serum, urine, or samples from other peripheral sources or cell cultures, cell colonies, single cells, or a collection of single cells. Furthermore, a biological sample includes pools or mixtures of the above mentioned samples. A biological sample may be provided by removing a sample of cells from a subject, but can also be provided by using a previously isolated sample. For example, a tissue sample can be removed from a subject suspected of having a disease by conventional biopsy techniques. In some embodiments, a blood sample is taken from a subject. A biological sample from a patient means a sample from a subject suspected to be affected by a disease.
Environmental samples include environmental material such as surface matter, soil, water, and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the present invention.
The term "label" as used herein refers to any atom or molecule that can be used to provide a detectable (preferably quantifiable) effect, and that can be attached to a nucleic acid or protein. Labels include, but are not limited to, dyes (e.g., fluorescent dyes or moieties); radiolabels such as 32P; binding moieties such as biotin; haptens such as digoxigenin; luminogenic, phosphorescent, or fluorogenic moieties; mass tags; and fluorescent dyes alone or in combination with moieties that can suppress or shift emission spectra by fluorescence resonance energy transfer (FRET). Labels may provide signals detectable by fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, characteristics of mass or behavior affected by mass (e.g., MALDI time-of-flight mass spectrometry; fluorescence polarization), and the like. A label may be a charged moiety (positive or negative charge) or, alternatively, may be charge neutral. Labels can include or consist of nucleic acid or protein sequence, so long as the sequence comprising the label is detectable.
"Support" or "solid support", as used herein, refers to a matrix on or in which nucleic acid molecules, microparticles, and the like may be immobilized, e.g., to which they may be covalently or noncovalently attached or in or on which they may be partially or completely embedded so that they are largely or entirely prevented from diffusing freely or moving with respect to one another.
As used herein, "moiety" refers to one of two or more parts into which something may be divided, such as, for example, the various parts of an oligonucleotide, a molecule, a chemical group, a domain, a probe, etc.
As used herein, a "query probe" or "reader probe" is any entity (e.g., molecule, biomolecule, etc.) that recognizes a nucleic acid (e.g., binds to a nucleic acid, e.g., binds specifically to a nucleic acid). In exemplary embodiments, the query probe is a protein that recognizes a nucleic acid (e.g., a nucleic acid binding protein, an antibody, antibody fragment, a transcription factor, or any other protein that binds to a particular sequence in a nucleic acid). In some other exemplary embodiments, the query probe is a nucleic acid (e.g., a DNA, an RNA, a nucleic acid comprising DNA and RNA, a nucleic acid comprising modified bases and/or modified linkages between bases; e.g., a nucleic acid as described hereinabove, a nucleic acid aptamer or any other nucleic acid that binds to a particular sequence in a nucleic acid). In some embodiments, the query probe is labeled, e.g., with a detectable label such as, e.g., a fluorescent moiety as described herein. In some embodiments, the query probe comprises more than one type of molecule (e.g., more than one of a protein, a nucleic acid, a chemical linker or a chemical moiety).
As used herein, a "capture probe" is any entity (e.g., molecule, biomolecule, etc.) that recognizes a nucleic acid (e.g., binds to a nucleic acid, e.g., binds specifically to a nucleic acid). In exemplary embodiments, the capture probe is a protein that recognizes a nucleic acid (e.g., a nucleic acid binding protein, an antibody, a fragment of an antibody, a transcription factor, or any other protein that binds to a particular sequence in a nucleic acid). In some other exemplary embodiments, a capture probe is a nucleic acid (e.g., a DNA, an RNA, a nucleic acid comprising DNA and RNA, a nucleic acid comprising modified bases and/or modified linkages between bases; e.g., a nucleic acid as described hereinabove). In some embodiments, a capture probe is labeled, e.g., with a detectable label such as, e.g., a fluorescent moiety as described herein. In some embodiments, the capture probe comprises more than one type of molecule (e.g., more than one of a protein, a nucleic acid, a chemical linker or a chemical moiety).
Description
Provided herein are embodiments of a technique for the specific and ultrasensitive detection of single nucleic acids. As previously described (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety), SiMREPS uses total internal reflection fluorescence (TIRF) microscopy, single-molecule visualization, and kinetic analysis of binding and release of fluorescently labeled probes to target molecules (see, e.g., Fig. 1a). Target molecules are quantified by simple, amplification-free, direct counting upon kinetic fingerprint identification that provides for exquisite discrimination between single nucleotide variants, as demonstrated previously for the detection of microRNA (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety). The technology provided herein provides for the detection of additional forms of nucleic acids, e.g., DNA, mutant DNA, methylated DNA, e.g., in an abundant wild-type background.
Existing techniques for nucleic acid detection utilize probes that form a thermodynamically stable complex with the target molecule, and are thus limited to weak and often unreliable thermodynamic discrimination against background signal or spurious targets. In contrast, the technology described herein utilizes probes that repeatedly bind to the target molecule at the query region and related methods to record the large number of independent binding events that occur for each observed target molecule. This repeated kinetic sampling provides a unique kinetic "fingerprint" for the target and provides for a highly specific and sensitive detection of nucleic acids. In some embodiments, the technology provides for the discrimination of two nucleic acid molecules that differ by as few as one nucleotide. In some embodiments, the technology provides for the discrimination of two nucleic acid molecules when one of the two nucleic acid molecules is present in a large excess (e.g., 10x; 100x; 1000x; 10,000x; or 1,000,000x or more in excess). See, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety.
In some embodiments, a labeled nucleic acid is detected, e.g., using an instrument to detect a signal produced by the label. For instance, some embodiments comprise use of a detectably labeled (e.g., fluorescently labeled) query probe and a detector of fluorescence emission such as a fluorescence microscopy technique. In some embodiments, the technology finds use as a diagnostic tool for identifying mutant or aberrantly expressed nucleic acid targets in biological samples. See, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety.
In some embodiments, this approach involves the capture of unlabeled nucleic acids by a dCas9/gRNA complex linked to a solid support (e.g., glass or fused silica) and melting of a query region, followed by observation of the repeated, transient binding of a short detectably labeled (e.g., fluorescently labeled) nucleic acid (e.g., DNA) query probe to the query region.
In some embodiments, the dCas9/gRNA complex is attached or fixed to a solid support. In some embodiments, the dCas9/gRNA complex comprises a moiety that provides for the immobilization of the dCas9/gRNA complex to a solid support by interaction of the moiety with a second moiety attached to the solid support. The dCas9/gRNA complex may be fixed directly or indirectly to a solid support.
Any of a variety of materials may be used as a support for the dCas9/gRNA complex, e.g., matrices or particles made of nitrocellulose, nylon, glass, polyacrylate, mixed polymers, polystyrene, silane polypropylene, and magnetically attractable materials. A planar surface is a preferred support for imaging by microscopy as described herein. A dCas9/gRNA complex may be immobilized by linking it directly to the solid support, e.g., by using any of a variety of covalent linkages, chelation, or ionic interaction, or may be immobilized by linking it indirectly via one or more linkers joined to the support. In some embodiments, the linker is a nucleic acid; in some embodiments, the linker is a nucleic acid comprising one or more nucleotides that is/are not intended to hybridize (e.g., that do not hybridize) to the target nucleic acid capture region but that are intended to act as a spacer between the dCas9/gRNA complex and its solid support.
In some embodiments, the dCas9/gRNA complex comprises a biotin group (e.g., the dCas9/gRNA complex is biotinylated) and the solid support comprises a streptavidin group (e.g., attached to the solid support by a linker moiety, e.g., a polyethylene glycol (PEG) linker). The specific interaction of the biotin and streptavidin thus immobilizes the capture probe to the solid support (Figs. 1a, 3, and 4).
Various other chemical methods can be employed for the immobilization of a dCas9/gRNA complex to a solid support. An example of such a method is to use a combination of a maleimide group and a thiol (-SH) group. In this method, a thiol (-SH) group is bonded to a dCas9/gRNA complex, and the solid support comprises a maleimide group. Accordingly, the thiol group of the dCas9/gRNA complex reacts with the maleimide group on the solid support to form a covalent bond, whereby the dCas9/gRNA complex is immobilized. Introduction of the maleimide group can utilize a process of first allowing a reaction between a glass substrate and an aminosilane coupling agent and then introducing the maleimide group onto the glass substrate by a reaction of the amino group with an EMCS reagent (N-(6-maleimidocaproyloxy)succinimide, available from Dojindo). Introduction of the thiol group to a DNA can be carried out using 5'-Thiol-Modifier C6 (available from Glen Research) when the DNA is synthesized by an automatic DNA synthesizer.
Instead of the above-described combination of a thiol group and a maleimide group, a combination of, e.g., an epoxy group (on the solid support) and an amino group (dCas9/gRNA complex), is used in some embodiments as a combination of functional groups for immobilization. Surface treatments using various kinds of silane coupling agents are also effective. Other techniques for the attachment of proteins to solid supports and solid surfaces are known in the art.
Poisson processes
Embodiments of the technology are related to single-molecule recognition by recording the characteristic kinetics of a probe (e.g., a query probe) binding to a target (e.g., a query region). In particular embodiments, this process is a Poisson process. A Poisson process is a continuous-time stochastic process that counts the number of events and the time that events (e.g., transient binding of a detectably labeled (e.g., fluorescent) query probe to an immobilized target) occur in a given time interval. The time interval between each pair of consecutive events has an exponential distribution and each interval is assumed to be independent of other intervals. The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of the events occurring in the given time interval if these events occur with a known average rate and independently of the time since the last event. The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area, or volume.
A Poisson distribution is a special case of the general binomial distribution where the number of trials n is large, the probability of success p is small, and the product np = λ is moderate. In a Poisson process, the probability that the number of events N is j at any arbitrary time t follows the Poisson probability distribution P_j(t):

$$P_j(t) = \frac{e^{-\lambda t} (\lambda t)^j}{j!}, \quad j = 0, 1, 2, \ldots$$

That is, the number N of events that occur up to time t has a Poisson distribution with parameter λt. Statistical and mathematical methods relevant to Poisson processes and Poisson distributions are known in the art. See, e.g., "Stochastic Processes (i): Poisson Processes and Markov Chains" in Statistics for Biology and Health - Statistical Methods in Bioinformatics (Ewens and Grant, eds.), Springer (New York, 2001), page 129 et seq., incorporated herein by reference in its entirety. Software packages such as Matlab and R may be used to perform mathematical and statistical methods associated with Poisson processes, probabilities, and distributions.
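As a purely illustrative sketch (not part of the disclosed methods), the Poisson probability of observing exactly j events by time t for an event rate λ can be evaluated as shown below. The function name `poisson_pmf` and the example rate of 2 events per minute are assumptions chosen for illustration only.

```python
import math

def poisson_pmf(j, rate, t):
    """Probability of exactly j events by time t for a Poisson process.

    Evaluates exp(-rate*t) * (rate*t)**j / j! in log space so that large j
    does not overflow.
    """
    if j < 0:
        return 0.0
    mean = rate * t
    if mean == 0:
        return 1.0 if j == 0 else 0.0
    return math.exp(-mean + j * math.log(mean) - math.lgamma(j + 1))

# Hypothetical example: 2 binding events per minute, observed for 10 minutes
probabilities = [poisson_pmf(j, 2.0, 10.0) for j in range(200)]
total = sum(probabilities)                                   # close to 1
mean_events = sum(j * p for j, p in enumerate(probabilities))  # close to rate * t
```

Summing the probabilities over a sufficiently wide range of j recovers a total near 1 and a mean near the product of rate and time, which is a quick sanity check on any implementation.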
Kinetics of detection
Particular embodiments of the technology are related to detecting a nucleic acid by analyzing the kinetics of the interaction of a query probe with a query region of a target nucleic acid to be detected. For the interaction of a query probe Q (e.g., at an equilibrium concentration [Q]) with a target nucleic acid T (e.g., at an equilibrium concentration [T]), the kinetic rate constant k_on describes the time-dependent formation of the complex QT comprising the query probe Q hybridized to the query region of the target nucleic acid T. In particular embodiments, while the formation of the QT complex is associated with a second-order rate constant that is dependent on the concentration of query probe and has units of M^-1 min^-1 (or the like), the formation of the QT complex is sufficiently described by a k_on that is a pseudo-first-order rate constant associated with the formation of the QT complex. Thus, as used herein, k_on is an apparent ("pseudo") first-order rate constant.
Likewise, the kinetic rate constant k_off describes the time-dependent dissociation of the complex QT into the query probe Q and the target nucleic acid T. Kinetic rates are typically provided herein in units of min^-1 or s^-1. The "dwell time" of the query probe Q in the bound state (τ_on) is the time interval (e.g., length of time) that the probe Q is hybridized to the query region of the target nucleic acid T during each instance of query probe Q binding to the query region of the target nucleic acid T to form the QT complex. The "dwell time" of the query probe Q in the unbound state (τ_off) is the time interval (e.g., length of time) that the probe Q is not hybridized to the query region of the target nucleic acid T between each instance of query probe Q binding to the query region of the target nucleic acid T to form the QT complex (e.g., the time the query probe Q is dissociated from the target nucleic acid T between successive binding events of the query probe Q to the target nucleic acid T). Dwell times may be provided as averages or weighted averages integrating over numerous binding and non-binding events.
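The following minimal sketch illustrates how averaged dwell times might be converted into apparent rate constants. It assumes single-exponential dwell-time distributions, under which the dissociation rate constant is the reciprocal of the mean bound-state dwell time and the pseudo-first-order association rate constant is the reciprocal of the mean unbound-state dwell time; that relationship is a standard kinetic assumption rather than a statement from this disclosure. The function name `estimate_rates` and the dwell-time values are hypothetical.

```python
def estimate_rates(bound_dwells_s, unbound_dwells_s):
    """Estimate apparent rate constants (s^-1) from dwell times given in seconds.

    Assumes single-exponential kinetics, so each rate constant is the
    reciprocal of the corresponding mean dwell time.
    """
    if not bound_dwells_s or not unbound_dwells_s:
        raise ValueError("at least one dwell time is required for each state")
    tau_on = sum(bound_dwells_s) / len(bound_dwells_s)       # mean bound-state dwell time
    tau_off = sum(unbound_dwells_s) / len(unbound_dwells_s)  # mean unbound-state dwell time
    return {
        "tau_on": tau_on,
        "tau_off": tau_off,
        "k_off": 1.0 / tau_on,   # dissociation rate constant
        "k_on": 1.0 / tau_off,   # apparent (pseudo-first-order) association rate constant
    }

# Hypothetical dwell times for a single immobilized target molecule
example = estimate_rates([4.0, 6.0, 5.0], [20.0, 30.0, 10.0])
```

A simple arithmetic mean is used here; a weighted average or an exponential fit to the dwell-time histogram could be substituted where the data warrant it.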
Further, in some embodiments, the repeated, stochastic binding of query probes (e.g., detectably labeled query probes (e.g., fluorescent probes), e.g., nucleic acid probes such as DNA or RNA probes) to immobilized targets is modeled as a Poisson process occurring with constant probability per unit time and in which the standard deviation in the number of binding and dissociation events per unit time (Nb+d) increases as (Nb+d)^(1/2). Thus, the statistical noise becomes a smaller fraction of Nb+d as the observation time is increased. Accordingly, the observation is lengthened as needed in some embodiments to achieve discrimination between target and off-target binding. And, as the acquisition time is increased, the signal and background peaks in the Nb+d histogram become increasingly separated and the width of the signal distribution increases as the square root of Nb+d, consistent with kinetic Monte Carlo simulations. An acquisition time of approximately 10 minutes (e.g., approximately 1 to 100 minutes, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 minutes) yields sufficient (e.g., complete) separation of the signal from background distributions of Nb+d, providing for substantially background-free quantification of the target. See, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety.
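The square-root scaling described above can be illustrated numerically. The sketch below assumes ideal Poisson counting statistics for both the signal and the background, and it uses hypothetical event rates (3 and 0.3 events per minute) that are not taken from the disclosure. The helper names `relative_noise` and `peak_separation` are likewise illustrative.

```python
import math

def relative_noise(expected_events):
    """Poisson standard deviation as a fraction of the mean: sqrt(N) / N."""
    return math.sqrt(expected_events) / expected_events

def peak_separation(signal_rate, background_rate, minutes):
    """Distance between the mean counts in units of the summed standard deviations."""
    mu_signal = signal_rate * minutes
    mu_background = background_rate * minutes
    return (mu_signal - mu_background) / (math.sqrt(mu_signal) + math.sqrt(mu_background))

# Hypothetical rates in events per minute (assumptions, not measured values)
SIGNAL_RATE = 3.0
BACKGROUND_RATE = 0.3

for minutes in (1, 10, 100):
    noise = relative_noise(SIGNAL_RATE * minutes)
    separation = peak_separation(SIGNAL_RATE, BACKGROUND_RATE, minutes)
    print(f"{minutes:>3} min: relative noise {noise:.3f}, peak separation {separation:.2f}")
```

Under these assumptions the relative noise falls, and the peak separation grows, in proportion to the square root of the acquisition time, so a 100-fold longer acquisition gives a 10-fold larger separation.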
Further, in some embodiments the probe length is chosen to provide sufficient separation of signal and background peaks on convenient experimental time scales. In particular, the kinetics of query probe exchange are related to the number of complementary bases between the query probe and target nucleic acid. For instance, in some embodiments, the lifetime of the interaction of a short DNA query probe with its complement increases as an approximately exponential function of the number of base pairs formed, while the rate constant of binding is affected only weakly for interactions comprising at least 6 to 7 base pairs. Thus, varying query probe length provides for tuning the kinetic behavior to improve discrimination of query probe binding events to the target from background binding. In particular, a query (e.g., fluorescent) probe length of 9 nt to 10 nt (providing theoretical Tm values of 17.5°C to 25°C) yields rapid target binding that is distinguished from background signal, as displayed in histograms of intensity transitions per candidate molecule in the presence and absence of target. See, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety. Further, in some embodiments the kinetics of binding and dissociation are more closely correlated to probe length than to the melting temperature of the duplex. While some embodiments comprise use of a probe having a length of 9 to 10 nt, the technology is not limited by this length. Indeed, use of probes longer or shorter than 9 to 10 nt is contemplated by the technology, e.g., as discussed throughout.
Although the disclosure herein refers to certain illustrated embodiments, it is to be understood that these embodiments are presented by way of example and not by way of limitation.
Detection of double-stranded nucleic acids
Embodiments of the technology provide for the detection of double-stranded nucleic acids. Some embodiments provide compositions, reaction mixtures, and complexes comprising a plurality of molecules for detecting one or more nucleic acids. Some embodiments of compositions, reaction mixtures, and complexes comprise a nucleic acid (e.g., a target nucleic acid) that is to be detected, identified, quantified, and/or characterized; a solid substrate comprising a dCas9/gRNA complex linked to a solid surface that binds one or more regions of the target with high specificity; and a detectably labeled (e.g., fluorescent) query probe. Some embodiments further comprise a protospacer adjacent motif (PAM) DNA oligonucleotide.
The SiMREPS technology exploits the direct binding of a short (6-12-nucleotide) fluorescently labeled DNA probe to an unlabeled nucleic acid target (e.g., miRNA) immobilized on a glass surface (Fig. 1a) (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety). Using TIRF microscopy (Walter et al. (2008) "Do-it-yourself guide: how to use the modern single-molecule toolkit" Nat Methods 5: 475-89), both non-specific surface binding (Fig. 1b) and specific binding to the immobilized target (Fig. 1c) are detected. However, equilibrium binding of the probe to target yields a distinctive kinetic signature, or fingerprint, that can achieve ultra-high discrimination against background binding (compare Fig. 1b with Fig. 1c). Since the transient binding of probes to an immobilized target resembles a Poisson process, the standard deviation in the number of binding and dissociation events (Nb+d) increases as (Nb+d)^(1/2). As experimental acquisition time is increased, the signal and background peaks in histograms of Nb+d are progressively better resolved (Fig. 1d) (see, e.g., Johnson-Buck et al. (2015) "Kinetic fingerprinting to identify and count single nucleic acids" Nat Biotechnol 33: 730-2), allowing for arbitrarily high discrimination between target and off-target binding.
SiMREPS finds use in quantifying RNA (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety). In particular, previous experiments quantified four human miRNAs that are dysregulated in cancer and other diseases. Discrimination (e.g., specificity = 1) was achieved for all target-probe pairs and standard curves showed linear dependence on target concentration (Fig. 1e). A single fluorescent probe discriminates with highest specificity between two microRNAs differing by a single nucleotide (Fig. 2a-c). SiMREPS detects targets in complex biological matrices (Fig. 2d, e). Also, the prostate cancer biomarker hsa-miR- 1418 was detected in a serum sample after spiking-in varying target concentrations. The measured concentration was strongly correlated with the nominal spiked-in concentration (Fig. 2f, R > 0.999, slope = 1.07).
One challenge in applying the SiMREPS technology to the detection of double-stranded nucleic acids is that the transient binding of a query probe to the target nucleic acid (e.g., at the query region) competes with association of the complementary strand to the query region (e.g., dsDNA) at the target locus. Accordingly, provided herein is a technology in which SiMREPS finds use in detecting double-stranded DNA. To overcome this challenge, some embodiments of the technology comprise use of a catalytically inactive ("dead") dCas9 enzyme loaded with a specific guide-RNA (gRNA) to melt dsDNA structure locally in a sequence-specific fashion, providing access for the SiMREPS probe (Fig. 3). Related embodiments comprise use of a protein that binds to double-stranded nucleic acids (e.g., double-stranded DNA, double-stranded RNA, a double-stranded DNA/RNA hybrid) and/or a melting component (e.g., a protein or a nucleic acid) to dissociate a double-stranded nucleic acid (e.g., a region of a nucleic acid) to provide a single-stranded nucleic acid.
dCas9/gRNA complexes
The technology comprises use of a sequence-specific nucleic acid binding component (e.g., molecule, biomolecule, or complex of one or more molecules and/or biomolecules) to immobilize nucleic acids and/or convert double-stranded nucleic acids (e.g., regions of double-stranded nucleic acids) to single-stranded regions. In exemplary embodiments, the sequence-specific nucleic acid binding component comprises an enzymatically inactive, or "dead", Cas9 protein ("dCas9") and a guide RNA ("gRNA"). While nucleic acid-binding molecules such as the clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (Cas) (CRISPR/Cas) system have been used extensively for genome editing in cells of various types and species, recombinant and engineered nucleic acid-binding proteins find use in the present technology to melt double-stranded nucleic acids and provide single-stranded nucleic acids for probe binding. The Cas9 protein was discovered as a component of the bacterial adaptive immune system (see, e.g., Barrangou et al. (2007) "CRISPR provides acquired resistance against viruses in prokaryotes" Science 315: 1709-1712). Cas9 is an RNA-guided endonuclease that targets and destroys foreign DNA in bacteria using RNA:DNA base-pairing between the gRNA and foreign DNA to provide sequence specificity. Recently, Cas9/gRNA complexes have found use in genome editing (see, e.g., Doudna et al. (2014) "The new frontier of genome engineering with CRISPR-Cas9" Science 346: 6213).
Accordingly, some Cas9/RNA complexes comprise two RNA molecules: (1) a CRISPR RNA (crRNA), possessing a nucleotide sequence complementary to the target nucleotide sequence; and (2) a trans-activating crRNA (tracrRNA). In this mode, Cas9 functions as an RNA-guided nuclease that uses both the crRNA and tracrRNA to recognize and cleave a target sequence. Recently, a single chimeric guide RNA (sgRNA) mimicking the structure of the annealed crRNA/tracrRNA has become more widely used than crRNA/tracrRNA because the sgRNA approach provides a simplified system with only two components (e.g., the Cas9 and the sgRNA). Thus, sequence-specific binding to a nucleic acid can be guided by a natural dual-RNA complex (e.g., comprising a crRNA, a tracrRNA, and Cas9) or a chimeric single-guide RNA (e.g., a sgRNA and Cas9) (see, e.g., Jinek et al. (2012) "A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity" Science 337: 816-821).
As used herein, the targeting region of a crRNA (2-RNA system) or a sgRNA (single guide system) is referred to as the "guide RNA" (gRNA). In some embodiments, the gRNA comprises, consists of, or essentially consists of 10 to 50 bases, e.g., 15 to 40 bases, e.g., 15 to 30 bases, e.g., 15 to 25 bases (e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 bases). Methods are known in the art for determining the length of the gRNA that provides the most efficient target recognition for a Cas9. See, e.g., Lee et al. (2016) "The Neisseria meningitidis CRISPR-Cas9 System Enables Specific Genome Editing in Mammalian Cells" Molecular Therapy, 19 January 2016; doi: 10.1038/mt.
Accordingly, in some embodiments the gRNA is a short synthetic RNA comprising a "scaffold" sequence for Cas9-binding and a user-defined approximately 20-nucleotide "targeting" sequence that is complementary to the nucleic acid target.
In some embodiments, DNA targeting specificity is determined by two factors: 1) a DNA sequence matching the gRNA targeting sequence and 2) a protospacer adjacent motif (PAM) directly downstream of the target sequence. Some Cas9/gRNA complexes recognize a DNA sequence comprising a protospacer adjacent motif (PAM) sequence and the adjacent approximately 20 bases complementary to the gRNA. Canonical PAM sequences are NGG or NAG for Cas9 from Streptococcus pyogenes and NNNNGATT for the Cas9 from Neisseria meningitidis. Following DNA recognition by hybridization of the gRNA to the DNA target sequence, Cas9 cleaves the DNA sequence via an intrinsic nuclease activity. For genome editing and other purposes, the CRISPR/Cas system from S. pyogenes has been used most often. Using this system, one can target a given target nucleic acid (e.g., for editing or other manipulation) by designing a gRNA having a nucleotide sequence complementary to an approximately 20-base DNA sequence 5'-adjacent to the PAM. Methods are known in the art for determining the PAM sequence that provides the most efficient target recognition for a Cas9. See, e.g., Zhang et al. (2013) "Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis" Molecular Cell 50: 488-503; Lee et al., supra.
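The targeting rule described above (an approximately 20-base sequence located immediately 5' of a PAM) can be sketched as a simple sequence scan. The example below is a minimal illustration that assumes the S. pyogenes NGG PAM, a fixed 20-base target length, and a search of only the strand supplied; the function name `find_ngg_sites` and the example DNA sequence are hypothetical and do not correspond to any sequence in this disclosure.

```python
import re

def find_ngg_sites(dna, target_len=20):
    """List candidate target sites that lie immediately 5' of an NGG PAM.

    Only the strand given in `dna` is scanned. Returns a list of
    (start_index, target_sequence, pam) tuples with 0-based start indices.
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs are all reported
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        start = pam_start - target_len
        if start >= 0:  # skip PAMs without a full-length target upstream
            sites.append((start, dna[start:pam_start], match.group(1)))
    return sites

# Hypothetical sequence: 2 flanking bases, a 20-base target, a CGG PAM, 2 flanking bases
example_sites = find_ngg_sites("TTGACCTGAAGCTAGCATCGATCGGAA")
```

A fuller design tool would also scan the reverse complement, support other PAMs (e.g., NAG or NNNNGATT), and screen candidates for off-target matches; those steps are omitted here for brevity.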
The present technology comprises use of a catalytically inactive form of Cas9 ("dead Cas9" or "dCas9"), in which point mutations are introduced that disable the nuclease activity. In some embodiments, the dCas9 protein is from S. pyogenes. In some embodiments, the dCas9 protein comprises mutations at, e.g., D10, E762, H983, and/or D986; and at H840 and/or N863, e.g., at D10 and H840, e.g., D10A or D10N and H840A or H840N or H840Y. In some embodiments, the dCas9 is provided as a fusion protein comprising a functional domain for attaching the dCas9 to a solid surface (e.g., an epitope tag, linker peptide, etc.).
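The substitution notation used above (e.g., D10A, meaning that the aspartate at position 10 is replaced by alanine) can be made concrete with a short sketch. The helper `apply_substitution` is a hypothetical name, and the 16-residue fragment is assumed for illustration to be the N-terminus of the S. pyogenes protein with aspartate at position 10, as implied by the D10A notation.

```python
import re

def apply_substitution(protein, mutation):
    """Apply a point substitution written as <wild type><1-based position><replacement>."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, replacement = match.group(1), match.group(3)
    position = int(match.group(2))
    if position < 1 or position > len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[position - 1]}"
        )
    return protein[:position - 1] + replacement + protein[position:]

# Assumed N-terminal fragment (one-letter code) with Asp at position 10
fragment = "MDKKYSIGLDIGTNSV"
mutant = apply_substitution(fragment, "D10A")
```

Checking the wild-type residue before substituting guards against numbering offsets, which commonly arise when a tag or linker is fused to the N-terminus.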
The dCas9/gRNA complex binds to a target nucleic acid with a sequence specificity provided by the gRNA, but does not cleave the nucleic acid. In this form, the dCas9/gRNA "melts" the target sequence to provide single-stranded regions of the target nucleic acid in a sequence-specific manner (see, e.g., Qi et al. (2013) "Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression" Cell 152(5): 1173-83).
Furthermore, while the Cas9/gRNA system and dCas9/gRNA system initially targeted sequences adjacent to a PAM, the dCas9/gRNA system as used herein has been engineered to target any nucleotide sequence for binding. Also, Cas9 and dCas9 orthologs encoded by compact genes (e.g., Cas9 from Staphylococcus aureus) are known (see, e.g., Ran et al. (2015) "In vivo genome editing using Staphylococcus aureus Cas9" Nature 520: 186-191), which improves the cloning and manipulation of the Cas9 components in vitro. A number of bacteria express Cas9 protein variants. The Cas9 from Streptococcus pyogenes is presently the most commonly used; some of the other Cas9 proteins have high levels of sequence identity with the S. pyogenes Cas9 and use the same guide RNAs. Others are more diverse, use different gRNAs, and recognize different PAM sequences (the 2-5 nucleotide sequence specified by the protein that is adjacent to the sequence specified by the RNA) as well. Chylinski et al. classified Cas9 proteins from a large group of bacteria (RNA Biology 10:5, 1-12; 2013), and a large number of Cas9 proteins are listed in supplementary FIG. 1 and supplementary table 1 thereof, which are incorporated by reference herein. Additional Cas9 proteins are described in Esvelt et al., Nat Methods. 2013 November; 10(11):1116-21 and Fonfara et al., "Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems." Nucleic Acids Res. 2013 Nov. 22. [Epub ahead of print] doi:10.1093/nar/gkt1074.
Cas9, and thus dCas9, molecules of a variety of species find use in the technology described herein. While the S. pyogenes and S. thermophilus Cas9 molecules are widely used, Cas9 molecules of, derived from, or based on the Cas9 proteins of other species listed herein find use in embodiments of the technology. Accordingly, the technology provides for the replacement of S. pyogenes and S. thermophilus Cas9 and dCas9 molecules with Cas9 and dCas9 molecules from other species, e.g.:
GenBank Acc No. Bacterium
303229466 Veillonella atypica ACS-134-V-Col7a
34762592 Fusobacterium nucleatum subsp. vincentii
374307738 Filifactor alocis ATCC 35896
320528778 Solobacterium moorei F0204
291520705 Coprococcus catus GD-7
42525843 Treponema denticola ATCC 35405
304438954 Peptoniphilus duerdenii ATCC BAA-1640
224543312 Catenibacterium mitsuokai DSM 15897
24379809 Streptococcus mutans UA159
15675041 Streptococcus pyogenes SF370
16801805 Listeria innocua Clip11262
116628213 Streptococcus thermophilus LMD-9
323463801 Staphylococcus pseudintermedius ED99
352684361 Acidaminococcus intestini RyC_MR95
302336020 Olsenella uli DSM 7084
366983953 Oenococcus kitaharae DSM 17330
310286728 Bifidobacterium bifidum S17
258509199 Lactobacillus rhamnosus GG
300361537 Lactobacillus gasseri JV-V03
169823755 Finegoldia magna ATCC 29328
47458868 Mycoplasma mobile 163K
284931710 Mycoplasma gallisepticum str. F
363542550 Mycoplasma ovipneumoniae SC01
384393286 Mycoplasma canis PG 14
71894592 Mycoplasma synoviae 53
238924075 Eubacterium rectale ATCC 33656
116627542 Streptococcus thermophilus LMD-9
315149830 Enterococcus faecalis TX0012
315659848 Staphylococcus lugdunensis M23590
160915782 Eubacterium dolichum DSM 3991
336393381 Lactobacillus coryniformis subsp. torquens
310780384 Ilyobacter polytropus DSM 2926
325677756 Ruminococcus albus 8
187736489 Akkermansia muciniphila ATCC BAA-835
117929158 Acidothermus cellulolyticus 11B
189440764 Bifidobacterium longum DJO10A
283456135 Bifidobacterium dentium Bd1
38232678 Corynebacterium diphtheriae NCTC 13129
187250660 Elusimicrobium minutum Pei191
319957206 Nitratifractor salsuginis DSM 16511
325972003 Sphaerochaeta globus str. Buddy
261414553 Fibrobacter succinogenes subsp. succinogenes
60683389 Bacteroides fragilis NCTC 9343
256819408 Capnocytophaga ochracea DSM 7271
90425961 Rhodopseudomonas palustris BisB18
373501184 Prevotella micans F0438
294674019 Prevotella ruminicola 23
365959402 Flavobacterium columnare ATCC 49512
312879015 Aminomonas paucivorans DSM 12260
83591793 Rhodospirillum rubrum ATCC 11170
294086111 Candidatus Puniceispirillum marinum IMCC1322
121608211 Verminephrobacter eiseniae EF01-2
344171927 Ralstonia syzygii R24
159042956 Dinoroseobacter shibae DFL 12
288957741 Azospirillum sp. B510
92109262 Nitrobacter hamburgensis X14
148255343 Bradyrhizobium sp. BTAi1
34557790 Wolinella succinogenes DSM 1740
218563121 Campylobacter jejuni subsp. jejuni
291276265 Helicobacter mustelae 12198
229113166 Bacillus cereus Rock1-15
222109285 Acidovorax ebreus TPSY
189485225 uncultured Termite group 1
182624245 Clostridium perfringens D str.
220930482 Clostridium cellulolyticum H10
154250555 Parvibaculum lavamentivorans DS-1
257413184 Roseburia intestinalis L1-82
218767588 Neisseria meningitidis Z2491
15602992 Pasteurella multocida subsp. multocida
319941583 Sutterella wadsworthensis 3_1
254447899 gamma proteobacterium HTCC5015
54296138 Legionella pneumophila str. Paris
331001027 Parasutterella excrementihominis YIT 11859
34557932 Wolinella succinogenes DSM 1740
118497352 Francisella novicida U112
The technology described herein encompasses the use of a dCas9 derived from any Cas9 protein (e.g., as listed above) and their corresponding guide RNAs or other guide RNAs that are compatible. The Cas9 from Streptococcus thermophilus LMD-9 CRISPR1 system has been shown to function in human cells (see, e.g., Cong et al. (2013) Science 339: 819). Additionally, Jinek et al. showed in vitro that Cas9 orthologs from S. thermophilus and L. innocua can be guided by a dual S. pyogenes gRNA to cleave target plasmid DNA. In some embodiments, the present technology comprises the Cas9 protein from S. pyogenes, either as encoded in bacteria or codon-optimized for expression in mammalian cells, containing mutations at D10, E762, H983, or D986 and H840 or N863, e.g., D10A/D10N and H840A/H840N/H840Y, to render the nuclease portion of the protein catalytically inactive; substitutions at these positions are, in some embodiments, alanine (Nishimasu et al. (2014) Cell 156: 935-949) or, in some embodiments, other residues, e.g., glutamine, asparagine, tyrosine, serine, or aspartate, e.g., E762Q, H983N, H983Y, D986N, N863D, N863S, or N863H. The sequence of one S. pyogenes dCas9 protein that finds use in the technology provided herein is described in US20160010076, which is incorporated herein by reference in its entirety.
For example, in some embodiments, the dCas9 used herein is at least about 50% identical to the sequence of S. pyogenes Cas9, e.g., at least 50% identical to the following sequence of dCas9 comprising the D10A and H840A substitutions (SEQ ID NO: 1).
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
In some embodiments, the technology comprises use of a nucleotide sequence that is approximately 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to a nucleotide sequence that encodes a protein described by SEQ ID NO: 1.
In some embodiments, the dCas9 used herein is at least about 50% identical to the sequence of the catalytically inactive S. pyogenes Cas9, i.e., at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100% identical to SEQ ID NO: 1, wherein the mutations at D10 and H840, e.g., D10A/D10N and H840A/H840N/H840Y are maintained.
In some embodiments, any differences from SEQ ID NO: 1 are in non-conserved regions, as identified by sequence alignment of sequences set forth in Chylinski et al., RNA Biology 10:5, 1-12; 2013 (e.g., in supplementary FIG. 1 and supplementary table 1 thereof); Esvelt et al., Nat Methods. 2013 November; 10(11): 1116-21 and Fonfara et al., Nucl. Acids Res. (2014) 42 (4): 2577-2590. [Epub ahead of print 2013 Nov. 22] doi: 10.1093/nar/gkt1074, and wherein the mutations at D10 and H840, e.g., D10A/D10N and H840A/H840N/H840Y are maintained.
To determine the percent identity of two sequences, the sequences are aligned for optimal comparison purposes (gaps are introduced in one or both of a first and a second amino acid or nucleic acid sequence as required for optimal alignment, and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is at least 50% of the length of the reference sequence (in some embodiments, about 50%, 55%, 60%, 65%, 70%, 75%, 85%, 90%, 95%, or 100% of the length of the reference sequence is aligned). The nucleotides or residues at corresponding positions are then compared. When a position in the first sequence is occupied by the same nucleotide or residue as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. For purposes of the present application, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package, using a Blosum 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
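As an illustration of the identity calculation described above, the following is a minimal Python sketch that computes percent identity from two sequences that have already been aligned, with gaps written as "-". The function name, the gap character, and the use of the full alignment length as the denominator are assumptions made for illustration. The alignment step itself (e.g., the Needleman and Wunsch algorithm as implemented in GAP with a Blosum 62 matrix) is not reproduced here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: gap columns count toward the denominator, so every gap
    that had to be introduced lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column is identical only if both residues match and neither is a gap.
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical 10-column alignment: 8 identical columns, 1 gap, 1 mismatch.
print(percent_identity("MDKKYSIGLA", "MDKKYS-GLD"))  # 80.0
```

Other denominators (e.g., the length of the shorter sequence or the number of ungapped columns) give different values for the same alignment, so the convention should be stated whenever an identity threshold is applied.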
Accordingly, provided herein is a dCas9/gRNA-based approach for detecting double-stranded nucleic acids (e.g., double-stranded genomic DNA) using SiMREPS probes. Embodiments of the technology comprise capturing unlabeled genomic DNA targets on a glass or fused silica surface using a gRNA-loaded, enzymatically dead dCas9 enzyme or enzymes that bind one or more segments of the target with high specificity and stability (e.g., forming approximately 20 base pairs at the site complementary to the gRNA; Fig. 3). Surface capture of nucleic acid targets by biotinylated dCas9 is followed by observing repeated, transient binding of the SiMREPS probe to a second segment of the target that has been made accessible by dCas9/gRNA-mediated melting.
Thus, in some embodiments a biotinylated dCas9 comprising a guide RNA (gRNA) comprising an appropriate sequence captures a target nucleic acid. In addition, dCas9, with the help of the gRNA, melts the DNA, which produces a complementary non-template DNA strand that is single-stranded and accessible to the query probe. In some embodiments, the technology comprises use of a second non-biotinylated dCas9 comprising a second gRNA that binds to the target nucleic acid (Fig. 4). The second dCas9/gRNA binds to the target nucleic acid at a distance from the first dCas9/gRNA that is approximately the size of the query probe, e.g., 5 to 30 nucleotides. That is, the biotinylated dCas9/gRNA (capture dCas9/gRNA) and second dCas9/gRNA bind to regions adjacent to the region of the target nucleic acid to which the query probe binds. The two dCas9/gRNA complexes melt the region of nucleic acid between them to provide a single-stranded query region accessible for query probe binding. Accordingly, the spacing between the two dCas9/gRNA complexes is appropriate for the binding of a query probe between them.
Methods
In some embodiments, the technology provides a method for detecting a double-stranded nucleic acid. For example, in some embodiments a target nucleic acid (e.g., a genomic DNA) is briefly pre-treated with dCas9/gRNA (e.g., at or near ambient ("room") temperature). Next, the guide RNA (gRNA) of the dCas9/gRNA complex hybridizes to the target nucleic acid (e.g., a genomic DNA) at a region adjacent to a query region (e.g., a region of the target nucleic acid that is complementary to a query probe). The dCas9/gRNA melts a specific DNA sequence (e.g., the query region) in the target nucleic acid. Methods comprise capturing the dCas9 (e.g., a biotinylated dCas9) onto a slide surface (e.g., by biotin-avidin interaction). Following capture and immobilization of the dCas9/gRNA-target nucleic acid to the surface, SiMREPS is used to detect binding of a query probe to the query region in the target nucleic acid, which is adjacent to the site of target nucleic acid hybridized to the gRNA. See, e.g., Fig. 3.
In some embodiments, two dCas9/gRNA complexes are used to make the query region of a target nucleic acid accessible to a query probe. In some embodiments, hybridizing two dCas9/gRNA complexes to flank the query region improves accessibility of the target nucleic acid (e.g., the query region) to the query probe for SiMREPS-based detection. One dCas9 is biotinylated for capture onto the surface; the other (e.g., second) dCas9 is optionally biotinylated (in preferred embodiments, the second dCas9 is not biotinylated). See, e.g., Fig. 4. The space between the regions bound by the two dCas9/gRNA complexes bound to the target nucleic acid provides appropriate space for binding of the query probe to the query region of the target nucleic acid.
In some embodiments, the detectable (e.g., fluorescent) query probe produces a fluorescence emission signal when it is close to the surface of the solid support (e.g., within about 100 nm of the surface of the solid support). When unbound, query probes quickly diffuse and thus are not individually detected; accordingly, when in the unbound state, the query probes produce a low level of diffuse background fluorescence. Consequently, in some embodiments detection of bound query probes comprises use of total internal reflection fluorescence microscopy (TIRF), HiLo microscopy (see, e.g., US20090084980, EP2300983 B1, WO2014018584 A1, incorporated herein by reference), confocal scanning microscopy, or other technologies comprising illumination schemes that illuminate (e.g., excite) only those query probe molecules near or on the surface of the solid support. Thus, in some embodiments, only query probes that are bound to an immobilized target near or on the surface produce a point-like emission signal (e.g., a "spot") that can be confirmed as originating from a single molecule.
In general terms, the observation comprises monitoring fluorescence emission at a number of discrete locations on the solid support where the target nucleic acids are immobilized (e.g., by being specifically bound to the dCas9/gRNA attached to the surface), e.g., at a number of fluorescent "spots" that blink, e.g., that can be in "on" and "off" states. The presence of fluorescence emission (spot is "on") and absence of fluorescence emission (spot is "off") at each discrete location (e.g., at each "spot" on the solid support) are recorded. Each spot "blinks" - e.g., a spot alternates between "on" and "off" states, respectively, as a query probe binds to the immobilized target nucleic acid at that spot and as the query probe dissociates from the immobilized target nucleic acid at that spot.
The data collected provide for the determination of the number of times a query probe binds to each immobilized target (e.g., the number of times each spot blinks "on") and a measurement of the amount of time a query probe remains bound (e.g., the length of time a spot remains "on" before turning "off").
In some embodiments, the query probe comprises a fluorescent label having an emission wavelength. Detection of fluorescence emission at the emission wavelength of the fluorescent label indicates that the query probe is bound to an immobilized target nucleic acid. Binding of the query probe to the target nucleic acid is a "binding event". In some embodiments of the technology, a binding event has a fluorescence emission having a measured intensity greater than a defined threshold. For example, in some embodiments a binding event has a fluorescence intensity that is above the background fluorescence intensity (e.g., the fluorescence intensity observed in the absence of a target nucleic acid). In some embodiments, a binding event has a fluorescence intensity that is at least 1, 2, 3, 4 or more standard deviations above the background fluorescence intensity (e.g., the fluorescence intensity observed in the absence of a target nucleic acid). In some embodiments, a binding event has a fluorescence intensity that is at least 2 standard deviations above the background fluorescence intensity (e.g., the fluorescence intensity observed in the absence of a target nucleic acid). In some embodiments, a binding event has a fluorescence intensity that is at least 1.5, 2, 3, 4, or 5 times the background fluorescence intensity (e.g., the mean fluorescence intensity observed in the absence of a target nucleic acid).
Accordingly, in some embodiments detecting fluorescence at the emission wavelength of the fluorescent probe that has an intensity above the defined threshold (e.g., at least 2 standard deviations greater than background intensity) indicates that a binding event has occurred (e.g., at a discrete location on the solid support where a target nucleic acid is immobilized). Also, in some embodiments detecting fluorescence at the emission wavelength of the fluorescent probe that has an intensity above the defined threshold (e.g., at least 2 standard deviations greater than background intensity) indicates that a binding event has started. Accordingly, in some embodiments detecting an absence of fluorescence at the emission wavelength of the fluorescent probe that has an intensity above the defined threshold (e.g., at least 2 standard deviations greater than background intensity) indicates that a binding event has ended (e.g., the query probe has dissociated from the target nucleic acid). The length of time between when the binding event started and when the binding event ended (e.g., the length of time that fluorescence at the emission wavelength of the fluorescent probe having an intensity above the defined threshold (e.g., at least 2 standard deviations greater than background intensity) is detected) is the dwell time of the binding event. A "transition" refers to the binding and dissociation of a query probe to the target nucleic acid (e.g., an on/off event).
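The threshold-based definition of a binding event and its dwell time can be sketched in Python as follows. This is a minimal illustration, not the analysis software of the technology: the function name, the constant frame time, and the choice to discard an event that is still "on" when acquisition ends are assumptions, and the intensity values are hypothetical.

```python
from statistics import mean, stdev


def binding_events(trace, background, n_sd=2.0, frame_time=1.0):
    """Return (start_time, dwell_time) for each binding event at one spot.

    trace      -- fluorescence intensity per frame at the spot
    background -- intensities recorded in the absence of target (control)
    n_sd       -- threshold in standard deviations above mean background
    frame_time -- seconds per frame (assumed constant)
    """
    threshold = mean(background) + n_sd * stdev(background)
    events = []
    start = None
    for i, value in enumerate(trace):
        if value > threshold and start is None:
            start = i                      # binding event starts
        elif value <= threshold and start is not None:
            events.append((start * frame_time, (i - start) * frame_time))
            start = None                   # binding event ends
    # An event still "on" at the end of acquisition is dropped because its
    # dwell time was not fully observed (an assumption of this sketch).
    return events


background = [10, 12, 11, 9, 10, 11, 10, 12]           # hypothetical control
trace = [10, 11, 50, 52, 49, 10, 11, 48, 51, 10]       # hypothetical spot
print(binding_events(trace, background, frame_time=0.5))
```

Changing `n_sd` corresponds to the alternative thresholds listed above (e.g., 1, 2, 3, 4 or more standard deviations above background).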
Methods according to the technology comprise counting the number of query probe binding events that occur at each discrete location on the solid support during a defined time interval that is the "acquisition time" (e.g., a time interval that is tens to hundreds to thousands of seconds, e.g., 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 seconds; e.g., 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 minutes; e.g., 1, 1.5, 2, 2.5, or 3 hours). In some embodiments, the acquisition time is approximately 1 to 10 seconds to 1 to 10 minutes (e.g., approximately 1 to 100 seconds, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 seconds, e.g., 1 to 100 minutes, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, or 100 minutes).
Further, the length of time the query probe remains bound to the target nucleic acid during a binding event is the "dwell time" of the binding event. The number of binding events detected during the acquisition time and/or the lengths of the dwell times recorded for the binding events is/are characteristic of a query probe binding to a target nucleic acid and thus provide an indication that the target nucleic acid is immobilized at said discrete location and thus that the target nucleic acid is present in the sample.
Binding of the query probe to the immobilized target nucleic acid and/or dissociation of the query probe from the immobilized target nucleic acid is/are monitored (e.g., using a light source to excite the fluorescent probe and detecting fluorescence emission from a bound query probe, e.g., using a fluorescence microscope) and/or recorded during a defined time interval (e.g., during the acquisition time). The number of times the query probe binds to the nucleic acid during the acquisition time and/or the length of time the query probe remains bound to the nucleic acid during each binding event and the length of time the query probe remains unbound to the nucleic acid between each binding event (e.g., the "dwell times" in the bound and unbound states, respectively) are determined, e.g., by the use of a computer and software (e.g., to analyze the data using a hidden Markov model and Poisson statistics).
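One way a hidden Markov model can be applied to such data is to assign each frame of an intensity trace to an unbound ("off") or bound ("on") state. The following Python sketch uses the Viterbi algorithm for a two-state model. It is a simplified illustration under stated assumptions: the state means, the shared standard deviation, and the probability of staying in a state are supplied by the caller rather than fitted to the data, emissions are taken to be Gaussian, and the function name is hypothetical.

```python
import math


def viterbi_two_state(trace, means, sd, p_stay=0.95):
    """Most likely off (0) / on (1) state path for an intensity trace.

    Assumptions: Gaussian emissions with a shared standard deviation,
    a symmetric transition matrix, and equal initial state probabilities.
    """
    def log_emit(x, mu):
        # Log of a Gaussian density; terms common to both states are omitted.
        return -0.5 * ((x - mu) / sd) ** 2

    log_t = [[math.log(p_stay), math.log(1 - p_stay)],
             [math.log(1 - p_stay), math.log(p_stay)]]
    score = [log_emit(trace[0], means[0]), log_emit(trace[0], means[1])]
    back = []
    for x in trace[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            # Best predecessor state for arriving in state s at this frame.
            best_prev = max((0, 1), key=lambda p: score[p] + log_t[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_t[best_prev][s]
                             + log_emit(x, means[s]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


trace = [10, 11, 50, 52, 49, 10, 11, 48, 51, 10]   # hypothetical intensities
print(viterbi_two_state(trace, means=(10.0, 50.0), sd=3.0))
```

In practice the model parameters would typically be estimated from the data (e.g., by expectation-maximization) before the state path is computed; that step is outside the scope of this sketch.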
In some embodiments, control samples are measured (e.g., in absence of target). Fluorescence detected in a control sample is "background fluorescence" or "background (fluorescence) intensity" or "baseline".
In some embodiments, data comprising measurements of fluorescence intensity at the emission wavelength of the query probe are recorded as a function of time. In some embodiments, the number of binding events and the dwell times of binding events (e.g. for each immobilized nucleic acid) are determined from the data (e.g., by determining the number of times and the lengths of time the fluorescence intensity is above a threshold background fluorescence intensity). In some embodiments, transitions (e.g., binding and dissociation of a query probe) are counted for each discrete location on the solid support where a target nucleic acid is immobilized. In some embodiments, a threshold number of transitions is used to discriminate the presence of a target nucleic acid at a discrete location on the solid support from background signal, non-target nucleic acid, and/or spurious binding of the query probe. In some embodiments, a number of transitions greater than 10 recorded during the acquisition time indicates the presence of a target nucleic acid at the discrete location on the solid support.
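The use of a transition-count threshold can be sketched as below. The sketch assumes that each trace has already been reduced to a sequence of "off" (0) and "on" (1) states and counts every change of state as one transition, which is one possible reading of the definition above. The function names and example traces are hypothetical.

```python
def count_transitions(states):
    """Count changes between off (0) and on (1) in an idealized trace."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def spots_with_target(state_traces, min_transitions=10):
    """Return indices of spots whose transition count exceeds the threshold."""
    return [
        i for i, states in enumerate(state_traces)
        if count_transitions(states) > min_transitions
    ]


# Hypothetical idealized traces for three spots.
blinking = [0, 1] * 8          # 15 transitions: repeated binding
sticky = [0] * 5 + [1] * 11    # 1 transition: a single long-lived binding
dark = [0] * 16                # 0 transitions: no binding
print(spots_with_target([blinking, sticky, dark]))  # [0]
```

The "sticky" trace shows why counting transitions is more selective than measuring total "on" time: a probe that binds once and stays bound produces a bright spot but very few transitions.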
In some embodiments, a distribution of the number of transitions for each immobilized target is determined - e.g., the number of transitions is counted for each immobilized nucleic acid target observed. In some embodiments a histogram is produced. In some embodiments, characteristic parameters of the distribution are determined, e.g., the mean, median, peak, shape, etc. of the distribution are determined. In some embodiments, data and/or parameters (e.g., fluorescence data (e.g., fluorescence data in the time domain), kinetic data, characteristic parameters of the distribution, etc.) are analyzed by algorithms that recognize patterns and regularities in data, e.g., using artificial intelligence, pattern recognition, machine learning, statistical inference, neural nets, etc. In some embodiments, the analysis comprises use of a frequentist analysis and in some embodiments the analysis comprises use of a Bayesian analysis. In some embodiments, pattern recognition systems are trained using known "training" data (e.g., using supervised learning) and in some embodiments algorithms are used to discover previously unknown patterns (e.g., unsupervised learning). See, e.g., Duda, et al. (2001) Pattern classification (2nd edition), Wiley, New York; Bishop (2006) Pattern Recognition and Machine Learning, Springer.
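A histogram of transition counts and a few characteristic parameters of the distribution can be computed with the Python standard library, as in the following sketch. The function name, the returned keys, and the example counts are hypothetical, and the "peak" is taken here to be the most frequent count, which is an assumption.

```python
from collections import Counter
from statistics import mean, median, mode


def summarize_transition_counts(counts):
    """Histogram and summary parameters for per-spot transition counts."""
    histogram = dict(sorted(Counter(counts).items()))
    return {
        "histogram": histogram,   # transition count -> number of spots
        "mean": mean(counts),
        "median": median(counts),
        "peak": mode(counts),     # most frequent count (assumed definition)
    }


# Hypothetical counts: four spots with few transitions, five with many.
counts = [0, 1, 1, 2, 14, 15, 15, 16, 15]
summary = summarize_transition_counts(counts)
print(summary["histogram"])
print(summary["median"], summary["peak"])
```

A bimodal histogram such as this one, with one population near zero and another well above the transition threshold, is the kind of separation between background and target that the preceding paragraphs describe.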
Pattern recognition (e.g., using training sets, supervised learning, unsupervised learning, and analysis of unknown samples) associates identified patterns with nucleic acids such that particular patterns provide a "fingerprint" of particular nucleic acids that find use in detection, quantification, and identification of nucleic acids.
In some embodiments, the distribution produced from a target nucleic acid is significantly different than a distribution produced from a non-target nucleic acid or the distribution produced in the absence of a target nucleic acid. In some embodiments, a mean number of transitions is determined for the plurality of immobilized target nucleic acids. In some embodiments, the mean number of transitions observed for a sample comprising a target nucleic acid is approximately linearly related as a function of time and has a positive slope (e.g., the mean number of transitions increases approximately linearly as a function of time).
In some embodiments, the data are treated using statistics (e.g., Poisson statistics) to determine the probability of a transition occurring as a function of time at each discrete location on the solid support. In some particular embodiments, a relatively constant probability of a transition event occurring as a function of time at a discrete location on the solid support indicates the presence of a target nucleic acid at said discrete location on the solid support. In some embodiments, a correlation coefficient relating event number and elapsed time is calculated from the probability of a transition event occurring as a function of time at a discrete location on the solid support. In some embodiments, a correlation coefficient relating event number and elapsed time greater than 0.95 when calculated from the probability of a transition event occurring as a function of time at a discrete location on the solid support indicates the presence of a target nucleic acid at said discrete location on the solid support.
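One simple reading of the correlation criterion is a Pearson correlation coefficient between the event number (1, 2, 3, ...) and the time at which each event was observed: events that occur with a roughly constant probability per unit time accumulate approximately linearly, giving a coefficient near 1. The sketch below implements that reading. The function name and the example event times are hypothetical, and the exact calculation used in any given embodiment may differ.

```python
import math


def event_time_correlation(event_times):
    """Pearson correlation between event number (1, 2, ...) and event time."""
    n = len(event_times)
    if n < 3:
        raise ValueError("need at least three events")
    numbers = range(1, n + 1)
    mean_n = sum(numbers) / n
    mean_t = sum(event_times) / n
    cov = sum((k - mean_n) * (t - mean_t) for k, t in zip(numbers, event_times))
    var_n = sum((k - mean_n) ** 2 for k in numbers)
    var_t = sum((t - mean_t) ** 2 for t in event_times)
    if var_t == 0:
        raise ValueError("event times are all identical")
    return cov / math.sqrt(var_n * var_t)


# Hypothetical event times in seconds at two spots.
steady = [10, 21, 29, 41, 50, 61, 69, 80]   # events spread evenly in time
bursty = [1, 2, 3, 4, 5, 6, 7, 300]         # a burst, then a long silence
for name, times in (("steady", steady), ("bursty", bursty)):
    r = event_time_correlation(times)
    print(name, round(r, 3), "target-like" if r > 0.95 else "not target-like")
```

Under this reading, the evenly spaced events exceed the 0.95 cutoff while the bursty trace does not.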
In some embodiments, dwell times of bound query probe (τon) and unbound query probe (τoff) are used to identify the presence of a target nucleic acid in a sample and/or to distinguish a sample comprising a target nucleic acid from a sample comprising a non-target nucleic acid and/or not comprising the target nucleic acid. For example, the τon for a target nucleic acid is greater than the τon for a non-target nucleic acid; and, the τoff for a target nucleic acid is smaller than the τoff for a non-target nucleic acid. In some embodiments, measuring τon and τoff for a negative control and for a sample indicates the presence or absence of the target nucleic acid in the sample. In some embodiments, a plurality of τon and τoff values is determined for each of a plurality of spots imaged on a solid support, e.g., for a control (e.g., positive and/or negative control) and a sample suspected of comprising a target nucleic acid. In some embodiments, a mean τon and/or τoff is determined for each of a plurality of spots imaged on a solid support, e.g., for a control (e.g., positive and/or negative control) and a sample suspected of comprising a target nucleic acid. In some embodiments, a plot of τon versus τoff (e.g., mean τon and τoff, time-averaged τon and τoff, etc.) for all imaged spots indicates the presence or absence of the target nucleic acid in the sample.
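Mean bound and unbound dwell times for a single spot can be estimated from an idealized state trace, as in the following Python sketch. The function name and the constant frame time are assumptions, as is the decision to ignore the first and last segments of the trace because their true durations are cut off by the acquisition window.

```python
from statistics import mean


def mean_dwell_times(states, frame_time=1.0):
    """Mean bound and unbound dwell times (seconds) for one spot.

    states -- idealized trace of 0 (unbound) and 1 (bound) per frame.
    Only dwells bounded by a transition on both sides are used.
    """
    runs = []                      # [state, length in frames]
    for s in states:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    interior = runs[1:-1]          # drop the truncated first and last runs
    on = [n * frame_time for s, n in interior if s == 1]
    off = [n * frame_time for s, n in interior if s == 0]
    tau_on = mean(on) if on else None
    tau_off = mean(off) if off else None
    return tau_on, tau_off


# Hypothetical idealized trace recorded at 0.5 s per frame.
states = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
print(mean_dwell_times(states, frame_time=0.5))
```

Computing this pair of values for every spot in a sample and in a negative control gives the points for the plot of bound versus unbound dwell time described above.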
Applications
The technology finds use in the detection of nucleic acids, e.g., single-stranded and double-stranded nucleic acids. Accordingly, the technology provides for the detection of various forms of DNA (e.g., DNA comprising modified bases, e.g., methylated DNA, unmethylated DNA) and RNA (e.g., lncRNA, miRNA, etc.), e.g., to provide multi-nucleic acid detection on a common platform. Accordingly, the technology finds use in exemplary applications such as, e.g., detecting one or more of microRNAs, RNAs, mutant DNA alleles, wild-type DNA alleles, and locus-specific methylated DNAs, all on the SiMREPS platform. Detection of these types of nucleic acids occurs, in some embodiments, in the same sample.
For example, the technology finds use in detecting methylated DNA. Methylated DNA is a marker for many states of health and disease, including, for example, identifying patients at higher risk of colorectal cancer based on presence of specific methylated loci. Methylated DNA also provides the basis of a diagnostic test for the early detection of colorectal cancer as well as pre-cancerous adenomas and dysplastic lesions. Detection of methylated DNA at specific loci is currently performed by sodium bisulfite treatment of DNA, which deaminates unmethylated cytosines to produce uracil in the DNA, followed by PCR using distinct primer sets that selectively amplify methylated DNA fragments (e.g., 5-mC or 5-hmC, which are protected from conversion by Na bisulfite) or the unmethylated fragments (where primers are designed to bind to the anticipated converted sequence containing uracils in place of cytosines). Another approach is to perform next generation sequencing of bisulfite treated and/or untreated DNA to infer methylated bases. Although bisulfite conversion followed by PCR or next generation sequencing are commonly used approaches, they suffer from significant limitations. Surprisingly, some of these limitations of bisulfite based technologies provide advantages for detection of methylated DNA using the SiMREPS technology provided herein. First, the bisulfite treatment not only deaminates cytosines but also randomly fragments the DNA into shorter fragments, creating challenges for designing PCR primers, especially as the sequence length gets very short. In addition, following bisulfite conversion, the regions of DNA that have undergone deamination at cytosines are no longer complementary and therefore not double-stranded, which makes PCR somewhat more challenging as only one strand can be amplified with any given PCR primer set.
A SiMREPS approach, on the other hand, fundamentally benefits from the conversion of double-stranded nucleic acids to single-stranded nucleic acids, as well as from fragmentation of nucleic acids to short fragments.
In some embodiments, the presence or absence of nucleic acid modifications is detected (e.g., modified bases, nucleotide analogs, etc.). For example, in some embodiments, epigenetic modifications of nucleic acids that influence gene expression are detected. In some embodiments, methylation of DNA is detected. In some embodiments, the technology finds use in detecting (e.g., identifying the presence or absence of) nucleotide analogs, nucleotide bases - or nucleosides and/or nucleotides comprising bases - other than adenine, thymine, guanine, cytosine, and uracil. For example, in some embodiments the technology finds use in detecting, identifying, and/or quantifying a nucleotide, nucleoside, and/or a base including but not limited to, e.g., 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxycytosine (5caC), N(6)-methyladenosine (m(6)A), pseudouridine (Ψ), dihydrouridine (D), inosine (I), 7-methylguanosine (m7G), hypoxanthine, xanthine, 2,6-diaminopurine, and 6,8-diaminopurine.
In an exemplary embodiment, detecting methylated DNA by SiMREPS comprises analyzing samples that are treated with bisulfite and samples that have not been treated with bisulfite. Embodiments provide using SiMREPS query probes that distinguish sequences expected from conversion of unmethylated cytosines to uracil from sequences expected if cytosine(s) at a given locus are not converted to uracil (due to methylation). In some embodiments, both query probes are provided in the sample chamber at the same time and each probe comprises a different fluorophore. In some embodiments, bisulfite reagent is provided in real-time while imaging a SiMREPS experiment with both query probes present (e.g., one probe that binds to the methylated sequence and a second probe that binds to the uracil-converted sequence, each with a separate fluorophore), where DNA fragments blinking in one color would shift to blinking in the other color based on conversion. Such an assay provides greater accuracy and precision in the measurements than comparing a bisulfite-treated aliquot of sample to an untreated aliquot, which is required for current PCR or next generation sequencing-based approaches. In some embodiments, the technology is a multiplexed analysis using microfluidics and/or multi-spectral imaging, for example. In addition to bisulfite modification, the technology comprises use of other reagents that convert additional chemical markers on DNA or RNA into modifications that are easily detected by SiMREPS.
In some embodiments, the technology finds use in detecting a combination of mutant DNA, methylated DNA, and microRNA biomarkers on the same platform. For example, such a technology finds use in the detection of colorectal cancer and advanced adenoma, e.g., by analyzing a stool sample. The technology provides for the detection of mutant DNA (e.g., detecting 1 mutant molecule in a background of 1,000,000 wild-type molecules; e.g., detecting KRAS mutant DNA in a background of wild-type DNA).
Moreover, the technology also provides a technology for measuring all three types of markers (methylated DNA, miRNA, and/or mutant DNA). The technology finds use in analyzing nucleic acids in a buffer solution; the technology finds use in analyzing nucleic acids in a matrix extracted from stool. An average stool weighs 200 g and comprises approximately 10 million diploid-genome equivalents of human DNA. A sensitivity of 1:1,000,000 provides detection of as few as 10 mutant molecules in DNA extracted from a typical whole stool. This is 10,000-fold more sensitive than current clinically-used KRAS assays and 100-fold more sensitive than best-performing research-grade methods. See, e.g., Domagala et al. (2012) "KRAS mutation testing in colorectal cancer as an example of the pathologist's role in personalized targeted therapy: a practical approach" Pol J Pathol 63(3): 145-64; Gerecke et al. (2013) "Ultrasensitive detection of unknown colon cancer-initiating mutations using the example of the Adenomatous polyposis coli gene" Cancer Prev Res (Phila). 6(9): 898-907. This dramatic advance in performance is provided by the arbitrary specificity for allele discrimination that is inherent to the kinetic fingerprinting of the SiMREPS technology.
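The relationship between the stated sensitivity and the number of detectable molecules is a simple ratio, shown here as a short Python calculation. The function name is hypothetical, and the sketch treats each diploid-genome equivalent as one molecule, as the text does.

```python
def detectable_mutant_molecules(genome_equivalents: int,
                                wild_type_per_mutant: int) -> float:
    """Number of mutant molecules present at the stated detection ratio."""
    return genome_equivalents / wild_type_per_mutant


# Values from the text: about 10 million genome equivalents per stool,
# and a sensitivity of 1 mutant molecule per 1,000,000 wild-type molecules.
print(detectable_mutant_molecules(10_000_000, 1_000_000))  # 10.0
```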
Detecting biomarkers for cancer
Colonoscopy is the dominant screening approach for colorectal cancer (CRC) in the U.S., despite its invasiveness, high cost, low patient compliance, and risk of complications. New diagnostics, such as stool-based colorectal cancer screening, are limited by low sensitivity for detecting advanced adenomas (AA), removal of which prevents CRC. The currently best-performing stool-based test analyzes stool DNA (mutant DNA and methylation) and occult blood, but is technically complex, expensive ($500/test), and challenged by limited ability to detect rare mutant DNA alleles in a high background of wild-type DNA.
The detection technology described herein provides a new technology for low-cost, rapid measurement of rare mutant DNA alleles in stool with exquisite analytic specificity, with concurrent measurement of occult blood (e.g., using a microRNA marker of occult blood) and methylated DNA on a single platform. The technology provides increased sensitivity for detecting advanced adenoma at a >10-fold lower cost than the current state-of-the-art.
The technology provides quantification of rare mutant DNA alleles with orders of magnitude higher specificity than current methods, leading to significantly sensitized AA detection. In some embodiments, adenoma-defining mutations such as in the APC gene are detected.
The technology provides for the measurement of multiple stool biomarker types, all on a single platform. In some embodiments, in addition to mutant DNA, markers detected on the platform include microRNA (see, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety), a stool occult blood marker (e.g., a microRNA marker of stool occult blood), and methylated DNA.
Detection of RNA
In some embodiments, the nucleic acid to be detected, characterized, quantified, and/or identified (e.g., the target nucleic acid) is an RNA (e.g., an lncRNA, e.g., a non-protein coding RNA longer than approximately 200 nucleotides). Embodiments provide that the secondary structure of a nucleic acid (e.g., an RNA, e.g., an lncRNA) is melted to provide access to query regions by query probes for SiMREPS detection. For example, in some embodiments the technology comprises use of a dCas9/gRNA that recognizes and melts a secondary structured target in an RNA using an auxiliary PAM oligonucleotide (e.g., a PAMmer). In some embodiments the technology comprises use of two dCas9/gRNA complexes that recognize and melt a secondary structured target in an RNA using an auxiliary PAM oligonucleotide (e.g., a PAMmer).
In some embodiments, the dCas9 binds to single-stranded RNA targets matching the gRNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide (a "PAM-presenting oligonucleotide" or "PAMmer"). Accordingly, in some embodiments PAMmers provide for the site-specific binding of dCas9/gRNA to single-stranded RNA targets (e.g., lncRNA). Furthermore, the technology provides for the use of PAMmers to direct dCas9 to bind to specific RNA targets and melt secondary structure to make query regions available for the binding of query probes in a SiMREPS assay. See, e.g., O'Connell et al. 2014 "Programmable RNA recognition and cleavage by CRISPR/Cas9" Nature 516: 263-6. Thus, embodiments provide compositions comprising a dCas9, a gRNA (e.g., a dCas9/gRNA complex), and a PAMmer.
Detection of microRNA
In some embodiments, the nucleic acid to be detected, characterized, quantified, and/or identified (e.g., the target nucleic acid) is a microRNA. microRNAs (miRNA or μRNA) are single-stranded RNA molecules of approximately 21 to 23 nucleotides in length that regulate gene expression. miRNAs are encoded by genes from whose DNA they are transcribed, but miRNAs are not translated into protein (see, e.g., Carrington et al., 2003, which is hereby incorporated by reference). The genes encoding miRNAs are much longer than the processed mature miRNA molecule. See, e.g., U.S. Pat. App. Ser. No. 14/589,467, incorporated herein by reference in its entirety.
Detection of genomic aberrations
In some embodiments, the SiMREPS technology provides for identifying and/or counting genomic aberrations (e.g., other than simple point mutations) in a DNA sample based on detecting unpaired regions after comparative genome hybridization. In some embodiments, SiMREPS finds use to determine the overall degree of genomic instability in a sample, e.g., as evidenced by presence of deletions and insertions compared to a reference, normal DNA sample. Exemplary embodiments comprise providing a normal DNA (for example, wild-type DNA from normal blood cells of a patient with a solid tumor like lung cancer) and mixing it with matched tumor DNA (or even circulating cell-free DNA), with fragmentation of the DNA so that it is present as fragments and is immobilized onto a surface, potentially through end modification (e.g., biotinylation) or other approaches. Most of the DNA will form hybridized DNA segments, but areas of deletion or insertion are present as duplexes where one strand bulges out. In some embodiments, the normal DNA and the tumor DNA are provided in ratios appropriate for efficient detection. Embodiments provide a SiMREPS-based affinity reagent that is used to detect these unmatched regions, and counting them provides a measure of genomic instability, which finds use, e.g., in some embodiments as a biomarker of cancer risk. In some embodiments, the technology finds use in pre-natal screening for chromosomal abnormalities. In some embodiments, the affinity reagent is a single-stranded DNA binding protein, a Holliday junction recombinase modified not to cleave nucleic acid (e.g., for identification of balanced chromosomal translocations), etc.
In some embodiments, the technology finds use in the detection of microsatellite repeat aberrations in patients with microsatellite-unstable colorectal cancer, for example. Accordingly, embodiments comprise providing a panel of SiMREPS probes corresponding to different microsatellite loci that show differential kinetic binding properties depending on whether the microsatellite repeats have expanded or not in a sample. In some embodiments, the technology comprises a comparative DNA hybridization approach in which hybridization of an expanded microsatellite repeat sequence to one without expansion generates an unpaired segment that is detected using a SiMREPS-based approach, using a DNA probe or other effective affinity SiMREPS reader reagent.
Bifunctional affinity reagents
In some embodiments, the technology comprises detection of proteins and other analytes with SiMREPS by incorporation of a bifunctional affinity reagent that binds to the target analyte and comprises a nucleic acid that can be counted using a SiMREPS reader probe. As an example, some embodiments comprise the use of an antibody linked to a short DNA oligonucleotide, such that if the target protein analyte is immobilized onto the surface of a slide (for example by simple drying onto the surface, or other nonspecific or specific capture methods), the binding of the antibody to the target protein analyte is measured using a SiMREPS-based reading of the conjugated DNA. This allows multiple antibodies to be multiplexed and distinguished by different DNA sequence "barcodes" linked to the various antibodies. Embodiments provide that the samples are of a large variety of types, including even cells or cell lysates. Affinity reagents include DNA or RNA binding proteins, aptamers, antibodies, and antibody fragments, linked to a DNA barcode.
Non-nucleic acid SiMREPS probes
Embodiments provide probes that are not nucleic acids. For example, embodiments provide an antibody or other affinity reagent that has a binding interaction with a target analyte that has a stability amenable for SiMREPS, e.g., a transient association that provides a "blinking" signal and, in some embodiments, a kinetic binding fingerprint. In some embodiments, the non-nucleic acid query probe is engineered to weaken its binding relative to the non-engineered version, e.g., to provide a binding and/or association that is less thermodynamically stable. In addition to antibodies, embodiments comprise target analytes and query probes that are proteins, e.g., where one binding partner is the affinity reagent and the other is the target analyte being measured. Embodiments comprise the use of aptamers binding any ligand, lectins binding glycosylated proteins, proteins or other molecules binding lipids, etc. The technology comprises the use of any binding pair with transient binding behavior suitable for detection on the SiMREPS platform, e.g., that produces a kinetic fingerprint.
Intramolecular SiMREPS probing
In some embodiments, the technology provides a capture probe and a query probe that are linked to provide an intramolecular probing mechanism (see, e.g., Fig. 5 a and b). The probe is asymmetric, so that when the target nucleic acid binds (e.g., with thermodynamic stability, e.g., irreversibly) to the capture sequence, the target nucleic acid undergoes transient binding with the query probe. The transient binding and dissociation of the query probe yields a time-dependent change in donor fluorophore intensity or FRET whose kinetics are sensitive to the sequence of the target nucleic acid. In some embodiments, an address strand binds the Query/Capture complex to the surface, to provide rigidity that exerts control over the transient binding kinetics, and to provide a means to immobilize many different Query/Capture sequences to different regions of the imaging surface (e.g., as in a DNA microarray).
In some embodiments, intramolecular SiMREPS probing provides faster acquisition. In particular, binding of the query probe is rapid because the binding is an intramolecular hybridization reaction after the target nucleic acid binds to the capture probe. Furthermore, in some embodiments, imaging times are reduced compared to the other embodiments of the SiMREPS technology. For instance, the same number of binding and dissociation events occur in 1 to 10 seconds in some embodiments of intramolecular SiMREPS experiments as occur in 10 minutes in other embodiments of the SiMREPS technology. The intramolecular SiMREPS technology provides for the parallelization of experiments through spatial segregation. In some embodiments, the intramolecular SiMREPS technology reduces the concentrations of query probe that provide efficient detection. The intramolecular SiMREPS technology provides a platform that, in some embodiments, comprises many different Capture/Query probes immobilized within different regions of the imaging surface in a manner specified by the Address strand. As an example, one might use a standard microarray chip containing thousands of distinct sequences; these sequences could serve as the Address strands for immobilization of SiMREPS Capture/Query probes, permitting SiMREPS assays of thousands of target sequences (microRNAs, lncRNAs, DNA converted to single-stranded form) on a single chip.
Notably, in some embodiments, the Address strands are not related in sequence to any of the targets, since interaction occurs indirectly through the query and capture probes. Indeed, the Address strands are not required to be related to the targets.
Furthermore, embodiments provide that the query and capture probes comprise affinity reagents other than DNA sequences, such as the dCas9/gRNA complexes discussed elsewhere in this application.
Embodiments provide control of exposure of the fluorophores to excitation sources, e.g., to reduce or minimize photobleaching prior to analysis. In some embodiments, kinetic signatures provide a correction mechanism to identify and correct false positive detections resulting from, e.g., deposit of a Capture/Query probe on the wrong part of the imaging surface (outside of its Address region). Embodiments also provide a technology in which false positives are minimized or reduced by splitting the Query and Capture probes into two non-contiguous probes that co-localize upon binding to the Address sequence (see, e.g., Fig. 5 b).
Fluorescent moieties
In some embodiments, a nucleic acid comprises a fluorescent moiety (e.g., a fluorogenic dye, also referred to as a "fluorophore" or a "fluor"). A wide variety of fluorescent moieties is known in the art and methods are known for linking a fluorescent moiety to a nucleotide prior to incorporation of the nucleotide into an oligonucleotide and for adding a fluorescent moiety to an oligonucleotide after synthesis of the oligonucleotide.
Examples of compounds that may be used as the fluorescent moiety include but are not limited to xanthene, anthracene, cyanine, porphyrin, and coumarin dyes.
Examples of xanthene dyes that find use with the present technology include but are not limited to fluorescein, 6-carboxyfluorescein (6-FAM), 5-carboxyfluorescein (5-FAM), 5- or 6-carboxy-4,7,2',7'-tetrachlorofluorescein (TET), 5- or 6-carboxy-4'5'2'4'5'7'-hexachlorofluorescein (HEX), 5'- or 6'-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein (JOE), 5-carboxy-2',4',5',7'-tetrachlorofluorescein (ZOE), rhodol, rhodamine, tetramethylrhodamine (TAMRA), 4,7-dichlorotetramethyl rhodamine (DTAMRA), rhodamine X (ROX), and Texas Red. Examples of cyanine dyes that may find use with the present invention include but are not limited to Cy 3, Cy 3B, Cy 3.5, Cy 5, Cy 5.5, Cy 7, and Cy 7.5. Other fluorescent moieties and/or dyes that find use with the present technology include but are not limited to energy transfer dyes, composite dyes, and other aromatic compounds that give fluorescent signals. In some embodiments, the fluorescent moiety comprises a quantum dot.
In some embodiments, the fluorescent moiety comprises a fluorescent protein (e.g., a green fluorescent protein (GFP), a modified derivative of GFP (e.g., a GFP comprising S65T, an enhanced GFP (e.g., comprising F64L)), or others known in the art such as, e.g., blue fluorescent protein (e.g., EBFP, EBFP2, Azurite, mKalama1), cyan fluorescent protein (e.g., ECFP, Cerulean, CyPet, mTurquoise2), and yellow fluorescent protein derivatives (e.g., YFP, Citrine, Venus, YPet)). Embodiments provide that the fluorescent protein may be covalently or noncovalently bonded to one or more query and/or capture probes.
Fluorescent dyes include, without limitation, d-Rhodamine acceptor dyes including Cy 5, dichloro[R110], dichloro[R6G], dichloro[TAMRA], dichloro[ROX] or the like; fluorescein donor dyes including fluorescein, 6-FAM, 5-FAM, or the like; Acridine including Acridine orange, Acridine yellow, Proflavin, pH 7, or the like; Aromatic Hydrocarbons including 2-Methylbenzoxazole, Ethyl p-dimethylaminobenzoate, Phenol, Pyrrole, benzene, toluene, or the like; Arylmethine Dyes including Auramine O, Crystal violet, Crystal violet, glycerol, Malachite Green or the like; Coumarin dyes including 7-Methoxycoumarin-4-acetic acid, Coumarin 1, Coumarin 30, Coumarin 314, Coumarin 343, Coumarin 6 or the like; Cyanine Dyes including 1,1'-diethyl-2,2'-cyanine iodide, Cryptocyanine, Indocarbocyanine (C3) dye, Indodicarbocyanine (C5) dye, Indotricarbocyanine (C7) dye, Oxacarbocyanine (C3) dye, Oxadicarbocyanine (C5) dye, Oxatricarbocyanine (C7) dye, Pinacyanol iodide, Stains all, Thiacarbocyanine (C3) dye, ethanol, Thiacarbocyanine (C3) dye, n-propanol, Thiadicarbocyanine (C5) dye, Thiatricarbocyanine (C7) dye, or the like; Dipyrrin dyes including N,N'-Difluoroboryl-1,9-dimethyl-5-(4-iodophenyl)-dipyrrin, N,N'-Difluoroboryl-1,9-dimethyl-5-[(4-(2-trimethylsilylethynyl), N,N'-Difluoroboryl-1,9-dimethyl-5-phenyldipyrrin, or the like; Merocyanines including 4-(dicyanomethylene)-2-methyl-6-(p-dimethylaminostyryl)-4H-pyran (DCM), acetonitrile, 4-(dicyanomethylene)-2-methyl-6-(p-dimethylaminostyryl)-4H-pyran (DCM), methanol, 4-Dimethylamino-4'-nitrostilbene, Merocyanine 540, or the like; Miscellaneous Dyes including 4',6-Diamidino-2-phenylindole (DAPI), dimethylsulfoxide, 7-Benzylamino-4-nitrobenz-2-oxa-1,3-diazole, Dansyl glycine, Dansyl glycine, dioxane, Hoechst 33258, DMF, Hoechst 33258, Lucifer yellow CH, Piroxicam, Quinine sulfate, Quinine sulfate, Squarylium dye III, or the like; Oligophenylenes including 2,5-Diphenyloxazole (PPO), Biphenyl, POPOP, p-Quaterphenyl, p-Terphenyl, or the like; Oxazines including Cresyl violet perchlorate, Nile Blue, methanol, Nile Red, ethanol, Oxazine 1, Oxazine 170, or the like; Polycyclic Aromatic Hydrocarbons including 9,10-Bis(phenylethynyl)anthracene, 9,10-Diphenylanthracene, Anthracene, Naphthalene, Perylene, Pyrene, or the like; polyene/polyynes including 1,2-diphenylacetylene, 1,4-diphenylbutadiene, 1,4-diphenylbutadiyne, 1,6-Diphenylhexatriene, Beta-carotene, Stilbene, or the like; Redox-active Chromophores including Anthraquinone, Azobenzene, Benzoquinone, Ferrocene, Riboflavin, Tris(2,2'-bipyridyl)ruthenium(II), Tetrapyrrole, Bilirubin, Chlorophyll a, diethyl ether, Chlorophyll a, methanol, Chlorophyll b, Diprotonated-tetraphenylporphyrin, Hematin, Magnesium octaethylporphyrin, Magnesium octaethylporphyrin (MgOEP), Magnesium phthalocyanine (MgPc), PrOH, Magnesium phthalocyanine (MgPc), pyridine, Magnesium tetramesitylporphyrin (MgTMP), Magnesium tetraphenylporphyrin (MgTPP), Octaethylporphyrin, Phthalocyanine (Pc), Porphin, ROX, TAMRA, Tetra-t-butylazaporphine, Tetra-t-butylnaphthalocyanine, Tetrakis(2,6-dichlorophenyl)porphyrin, Tetrakis(o-aminophenyl)porphyrin, Tetramesitylporphyrin (TMP), Tetraphenylporphyrin (TPP), Vitamin B12, Zinc octaethylporphyrin (ZnOEP), Zinc phthalocyanine (ZnPc), pyridine, Zinc tetramesitylporphyrin (ZnTMP), Zinc tetramesitylporphyrin radical cation, Zinc tetraphenylporphyrin (ZnTPP), or the like; Xanthenes including Eosin Y, Fluorescein, basic ethanol, Fluorescein, ethanol, Rhodamine 123, Rhodamine 6G, Rhodamine B, Rose bengal, Sulforhodamine 101, or the like; or mixtures or combination thereof or synthetic derivatives thereof.
Several classes of fluorogenic dyes and specific compounds are known that are appropriate for particular embodiments of the technology: xanthene derivatives such as fluorescein, rhodamine, Oregon green, eosin, and Texas red; cyanine derivatives such as cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, and merocyanine; naphthalene derivatives (dansyl and prodan derivatives); coumarin derivatives; oxadiazole derivatives such as pyridyloxazole, nitrobenzoxadiazole, and benzoxadiazole; pyrene derivatives such as cascade blue; oxazine derivatives such as Nile red, Nile blue, cresyl violet, and oxazine 170; acridine derivatives such as proflavin, acridine orange, and acridine yellow; arylmethine derivatives such as auramine, crystal violet, and malachite green; and tetrapyrrole derivatives such as porphin, phthalocyanine, and bilirubin. In some embodiments the fluorescent moiety is a dye that is xanthene, fluorescein, rhodamine, BODIPY, cyanine, coumarin, pyrene, phthalocyanine, phycobiliprotein, ALEXA FLUOR® 350, ALEXA FLUOR® 405, ALEXA FLUOR® 430, ALEXA FLUOR® 488, ALEXA FLUOR® 514, ALEXA FLUOR® 532, ALEXA FLUOR® 546, ALEXA FLUOR® 555, ALEXA FLUOR® 568, ALEXA FLUOR® 594, ALEXA FLUOR® 610, ALEXA FLUOR® 633, ALEXA FLUOR® 647, ALEXA FLUOR® 660, ALEXA FLUOR® 680, ALEXA FLUOR® 700, ALEXA FLUOR® 750, or a squaraine dye. In some embodiments, the label is a fluorescently detectable moiety as described in, e.g., Haugland (September 2005) MOLECULAR PROBES HANDBOOK OF FLUORESCENT PROBES AND RESEARCH CHEMICALS (10th ed.), which is herein incorporated by reference in its entirety.
In some embodiments the label (e.g., a fluorescently detectable label) is one available from ATTO-TEC GmbH (Am Eichenhang 50, 57076 Siegen, Germany), e.g., as described in U.S. Pat. Appl. Pub. Nos. 20110223677, 20110190486, 20110172420, 20060179585, and 20030003486; and in U.S. Pat. No. 7,935,822, all of which are incorporated herein by reference (e.g., ATTO 390, ATTO 425, ATTO 465, ATTO 488, ATTO 495, ATTO 514, ATTO 520, ATTO 532, ATTO Rho6G, ATTO 542, ATTO 550, ATTO 565, ATTO Rho3B, ATTO Rho11, ATTO Rho12, ATTO Thio12, ATTO Rho101, ATTO 590, ATTO 594, ATTO Rho13, ATTO 610, ATTO 620, ATTO Rho14, ATTO 633, ATTO 647, ATTO 647N, ATTO 655, ATTO Oxa12, ATTO 665, ATTO 680, ATTO 700, ATTO 725, ATTO 740).
One of ordinary skill in the art will recognize that dyes having emission maxima outside these ranges may be used as well. In some cases, dyes with emission between 500 nm and 700 nm have the advantage of being in the visible spectrum and can be detected using existing photomultiplier tubes. In some embodiments, the broad range of available dyes allows selection of dye sets that have emission wavelengths that are spread across the detection range. Detection systems capable of distinguishing many dyes are known in the art.
Samples
In some embodiments, nucleic acids (e.g., DNA or RNA) are isolated from a biological sample containing a variety of other components, such as proteins, lipids, and non-template nucleic acids. Nucleic acid template molecules can be obtained from any material (e.g., cellular material (live or dead), extracellular material, viral material, environmental samples (e.g., metagenomic samples), synthetic material (e.g., amplicons such as provided by PCR or other amplification technologies)), obtained from an animal, plant, bacterium, archaeon, fungus, or any other organism. Biological samples for use in the present technology include viral particles or preparations thereof. Nucleic acid molecules can be obtained directly from an organism or from a biological sample obtained from an organism, e.g., from blood, urine, cerebrospinal fluid, seminal fluid, saliva, sputum, stool, hair, sweat, tears, skin, and tissue. Exemplary samples include, but are not limited to, whole blood, lymphatic fluid, serum, plasma, buccal cells, sweat, tears, saliva, sputum, hair, skin, biopsy, cerebrospinal fluid (CSF), amniotic fluid, seminal fluid, vaginal excretions, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluids, intestinal fluids, fecal samples, swabs, aspirates (e.g., bone marrow, fine needle, etc.), washes (e.g., oral, nasopharyngeal, bronchial, bronchoalveolar, optic, rectal, intestinal, vaginal, epidermal, etc.), and/or other specimens.
Any tissue or body fluid specimen may be used as a source for nucleic acid for use in the technology, including forensic specimens, archived specimens, preserved specimens, and/or specimens stored for long periods of time, e.g., fresh-frozen, methanol/acetic acid fixed, or formalin-fixed paraffin embedded (FFPE) specimens and samples. Nucleic acid template molecules can also be isolated from cultured cells, such as a primary cell culture or a cell line. The cells or tissues from which template nucleic acids are obtained can be infected with a virus or other intracellular pathogen. A sample can also be total RNA extracted from a biological specimen, a cDNA library, viral, or genomic DNA. A sample may also be isolated DNA from a non-cellular origin, e.g., amplified/isolated DNA that has been stored in a freezer.
Nucleic acid molecules can be obtained, e.g., by extraction from a biological sample, e.g., by a variety of techniques such as those described by Maniatis, et al. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y. (see, e.g., pp. 280-281).
In some embodiments, the technology provides for the size selection of nucleic acids, e.g., to remove very short fragments or very long fragments.
In some embodiments, the technology is used to identify a nucleic acid in situ. In particular, embodiments of the technology provide for the identification of a nucleic acid directly in a tissue, cell, etc. (e.g., after permeabilizing the tissue, cell, etc.) without extracting the nucleic acid from the tissue, cell, etc. In some embodiments of the technology related to in situ detection, the technology is applied in vivo, ex vivo, and/or in vitro. In some embodiments, the sample is a crude sample, a minimally treated cell lysate, or a biofluid lysate. In some embodiments, the nucleic acid is detected in a crude lysate without nucleic acid purification.
Kits
Some embodiments are related to kits for the detection of a nucleic acid. For instance, some embodiments provide a kit comprising a solid support (e.g., a microscope slide, a bead, a coverslip, an avidin (e.g., streptavidin)-conjugated microscope slide or coverslip, a solid support comprising a zero mode waveguide array, or the like), a dCas9/gRNA (e.g., comprising a biotinylated dCas9), and a query probe as described herein. Some embodiments further provide a non-biotinylated dCas9/gRNA.
Some embodiments further provide software in a computer-readable format or downloadable from the internet for the collection and analysis of query probe binding events and dwell times as described herein. In some embodiments, kits for multiplex detection comprise two or more query probes each comprising a sequence complementary to distinct query regions of one or more target nucleic acids and each comprising a different fluorescent moiety. In some embodiments, query probes are complementary to query regions of one or more nucleic acid targets. Some embodiments of kits comprise one or more positive controls and/or one or more negative controls. Some embodiments comprise a series of controls having known concentrations, e.g., to produce a standard curve of concentrations.
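As an illustration of how a series of controls of known concentration might be turned into a standard curve, the following minimal sketch fits a straight line to control data by ordinary least squares and inverts it for an unknown sample. The linear relationship between accepted molecule counts and concentration, the function names, the units, and all numbers are assumptions made for illustration; they are not taken from the described embodiments.

```python
# Minimal standard-curve sketch (hypothetical data; assumes a linear response).

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_count(count, slope, intercept):
    """Invert the fitted line to estimate concentration from a count."""
    return (count - intercept) / slope

# Hypothetical controls: known concentrations and counts per field of view.
control_conc = [0.0, 1.0, 2.0, 4.0]        # arbitrary concentration units
control_counts = [5.0, 25.0, 45.0, 85.0]   # made-up counts

slope, intercept = fit_line(control_conc, control_counts)
estimate = concentration_from_count(65.0, slope, intercept)
print(slope, intercept, estimate)
```

In this sketch the zero-concentration control plays the role of a negative control, so the intercept absorbs any background count before the unknown is interpolated.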
Systems
Some embodiments of the technology provide systems for the detection and quantification of a target nucleic acid. Systems according to the technology comprise, e.g., a solid support (e.g., a microscope slide, a coverslip, an avidin (e.g., streptavidin)-conjugated microscope slide or coverslip, a solid support comprising a zero mode waveguide array, or the like), a dCas9/gRNA (e.g., comprising a biotinylated dCas9), and a query probe as described herein. Some embodiments further provide a non-biotinylated dCas9/gRNA.
Some embodiments further comprise a fluorescence microscope comprising an illumination configuration to excite bound query probes (e.g., a prism-type total internal reflection fluorescence (TIRF) microscope, an objective-type TIRF microscope, a near-TIRF or HiLo microscope, a confocal laser scanning microscope, a zero-mode waveguide, and/or an illumination configuration capable of parallel monitoring of a large area of the slide or coverslip (> 100 μm²) while restricting illumination to a small region of space near the surface). Some embodiments comprise a fluorescence detector, e.g., a detector comprising an intensified charge coupled device (ICCD), an electron-multiplying charge coupled device (EM-CCD), a complementary metal-oxide-semiconductor (CMOS), a photomultiplier tube (PMT), an avalanche photodiode (APD), and/or another detector capable of detecting fluorescence emission from single chromophores. Some embodiments comprise a computer and software encoding instructions for the computer to perform.
Some embodiments comprise optics, such as lenses, mirrors, dichroic mirrors, optical filters, etc., e.g., to detect fluorescence selectively within a specific range of wavelengths or multiple ranges of wavelengths.
For example, in some embodiments, computer-based analysis software is used to translate the raw data generated by the detection assay (e.g., the presence, absence, or amount of one or more nucleic acids (e.g., one or more biomarkers)) into data of predictive value for a clinician. The clinician can access the predictive data using any suitable means.
For instance, some embodiments comprise a computer system upon which embodiments of the present technology may be implemented. In various embodiments, a computer system includes a bus or other communication mechanism for communicating information and a processor coupled with the bus for processing information. In various embodiments, the computer system includes a memory, which can be a random access memory (RAM) or other dynamic storage device, coupled to the bus, and instructions to be executed by the processor. Memory also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by the processor. In various embodiments, the computer system can further include a read only memory (ROM) or other static storage device coupled to the bus for storing static information and instructions for the processor. A storage device, such as a magnetic disk or optical disk, can be provided and coupled to the bus for storing information and instructions.
In various embodiments, the computer system is coupled via the bus to a display, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), for displaying information to a computer user. An input device, including alphanumeric and other keys, can be coupled to the bus for communicating information and command selections to the processor. Another type of user input device is a cursor control, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to the processor and for controlling cursor movement on the display. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
A computer system can perform embodiments of the present technology.
Consistent with certain implementations of the present technology, results can be provided by the computer system in response to the processor executing one or more sequences of one or more instructions contained in the memory. Such instructions can be read into the memory from another computer-readable medium, such as a storage device. Execution of the sequences of instructions contained in the memory can cause the processor to perform the methods described herein. Alternatively, hard-wired circuitry can be used in place of or in combination with software instructions to implement the present teachings. Thus, implementations of the present technology are not limited to any specific combination of hardware circuitry and software.
The term "computer-readable medium" as used herein refers to any medium that participates in providing instructions to the processor for execution. Such a medium can take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Examples of non-volatile media can include, but are not limited to, optical or magnetic disks, such as a storage device. Examples of volatile media can include, but are not limited to, dynamic memory. Examples of transmission media can include, but are not limited to, coaxial cables, copper wire, and fiber optics, including the wires that comprise the bus.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, PROM, and EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.
Various forms of computer readable media can be involved in carrying one or more sequences of one or more instructions to the processor for execution. For example, the instructions can initially be carried on the magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a network connection (e.g., a LAN, a WAN, the internet, a telephone line). A local computer system can receive the data and transmit it to the bus. The bus can carry the data to the memory, from which the processor retrieves and executes the instructions. The instructions received by the memory may optionally be stored on a storage device either before or after execution by the processor. In accordance with various embodiments, instructions configured to be executed by a processor to perform a method are stored on a computer-readable medium. The computer-readable medium can be a device that stores digital information. For example, a computer-readable medium includes a compact disc read-only memory (CD-ROM) as is known in the art for storing software. The computer-readable medium is accessed by a processor suitable for executing instructions configured to be executed.
In accordance with such a computer system, some embodiments of the technology provided herein further comprise functionalities for collecting, storing, and/or analyzing data (e.g., presence, absence, concentration of a nucleic acid). For example, some embodiments contemplate a system that comprises a processor, a memory, and/or a database for, e.g., storing and executing instructions, analyzing fluorescence, image data, performing calculations using the data, transforming the data, and storing the data. In some embodiments, an algorithm applies a statistical model (e.g., a Poisson model or hidden Markov model) to the data.
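As a simplified illustration of the kind of analysis such software might perform on query probe binding events and dwell times, the sketch below idealizes a single-molecule intensity trace into bound and unbound states and extracts the number of transitions and the dwell times. A fixed intensity threshold is used here as a stand-in for the hidden Markov model mentioned above; the threshold, the frame time, the example trace, and the function names are all hypothetical.

```python
# Simplified kinetic-fingerprint sketch: thresholding stands in for an HMM.

def idealize(trace, threshold):
    """Map each intensity value to 1 (probe bound) or 0 (unbound)."""
    return [1 if value > threshold else 0 for value in trace]

def count_transitions(states):
    """Count binding plus dissociation events (changes of state)."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)

def dwell_times(states, frame_time_s):
    """Return (bound_dwells, unbound_dwells) in seconds.

    The first and last runs are included even though they are cut off
    by the observation window, so they are only lower bounds.
    """
    bound, unbound = [], []
    if not states:
        return bound, unbound
    run_state, run_len = states[0], 1
    for state in states[1:]:
        if state == run_state:
            run_len += 1
        else:
            (bound if run_state else unbound).append(run_len * frame_time_s)
            run_state, run_len = state, 1
    (bound if run_state else unbound).append(run_len * frame_time_s)
    return bound, unbound

# Hypothetical intensity trace (arbitrary units), 0.5 s per frame.
trace = [10, 12, 95, 102, 98, 11, 9, 101, 99, 10]
states = idealize(trace, threshold=50)
n_events = count_transitions(states)
bound, unbound = dwell_times(states, frame_time_s=0.5)
print(states, n_events, bound, unbound)
```

A trace could then be accepted or rejected as a genuine target by comparing the event count and the dwell time distributions with limits set from control experiments; those acceptance limits are not specified here.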
Many diagnostics involve determining the presence of, or a nucleotide sequence of, one or more nucleic acids (e.g., a nucleic acid biomarker). Thus, in some embodiments, an equation comprising variables representing the presence, absence, concentration, amount, or sequence properties of multiple nucleic acids produces a value that finds use in making a diagnosis or assessing the presence or qualities of a nucleic acid. As such, in some embodiments this value is presented by a device, e.g., by an indicator related to the result (e.g., an LED, an icon on a display, a sound, or the like). In some embodiments, a device stores the value, transmits the value, or uses the value for additional calculations. In some embodiments, an equation comprises variables representing the presence, absence, concentration, amount, or sequence properties of one or more of a methylated locus in genomic DNA, a microRNA, a mutant gene biomarker, or a chromosomal aberration.
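One way such an equation could be realized, offered only as a hedged sketch, is a weighted sum of the measured variables passed through a logistic function, with the resulting value mapped to an indicator state. The variable names, weights, intercept, and threshold below are hypothetical placeholders chosen for illustration; the specification does not prescribe a particular functional form or particular values.

```python
import math

# Hypothetical weights and intercept, for illustration only (not from the source).
WEIGHTS = {
    "mirna_present": 1.2,              # 1 if the microRNA biomarker is detected, else 0
    "mutation_present": 2.0,           # 1 if the mutant gene biomarker is detected, else 0
    "methylated_locus_fraction": 1.5,  # fraction of target molecules methylated, 0..1
}
INTERCEPT = -3.0


def predictor(measurements):
    """Combine the measured variables into a single value between 0 and 1."""
    z = INTERCEPT + sum(WEIGHTS[name] * float(measurements[name]) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def indicator(value, threshold=0.5):
    """Map the value to a simple indicator state (e.g., an LED or display icon)."""
    return "flag" if value >= threshold else "no flag"


sample = {"mirna_present": 1, "mutation_present": 1, "methylated_locus_fraction": 0.0}
value = predictor(sample)
state = indicator(value)
```

A device could store or transmit `value` and display `state`; any real weights would have to be fitted and validated on clinical data.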
Thus, in some embodiments, the present technology provides the further benefit that a clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data. The data are presented directly to the clinician in their most useful form. The clinician is then able to utilize the information to optimize the care of a subject. The present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and/or subjects. For example, in some embodiments of the present technology, a sample is obtained from a subject and submitted to a profiling service (e.g., a clinical lab at a medical facility, genomic profiling business, etc.), located in any part of the world (e.g., in a country different from the country where the subject resides or where the information is ultimately used) to generate raw data. Where the sample comprises a tissue or other biological sample, the subject may visit a medical center to have the sample obtained and sent to the profiling center, or subjects may collect the sample themselves and directly send it to a profiling center. Where the sample comprises previously determined biological information, the information may be directly sent to the profiling service by the subject (e.g., an information card containing the information may be scanned by a computer and the data transmitted to a computer of the profiling center using electronic communication systems). Once received by the profiling service, the sample is processed and a profile is produced that is specific for the diagnostic or prognostic information desired for the subject. The profile data are then prepared in a format suitable for interpretation by a treating clinician.
For example, rather than providing raw expression data, the prepared format may represent a diagnosis or risk assessment for the subject, along with recommendations for particular treatment options. The data may be displayed to the clinician by any suitable method. For example, in some embodiments, the profiling service generates a report that can be printed for the clinician (e.g., at the point of care) or displayed to the clinician on a computer monitor. In some embodiments, the information is first analyzed at the point of care or at a regional facility. The raw data are then sent to a central processing facility for further analysis and/or to convert the raw data to information useful for a clinician or patient. The central processing facility provides the advantage of privacy (all data are stored in a central facility with uniform security protocols), speed, and uniformity of data analysis. The central processing facility can then control the fate of the data following treatment of the subject. For example, using an electronic communication system, the central facility can provide data to the clinician, the subject, or researchers. In some embodiments, the subject is able to access the data using the electronic communication system. The subject may choose further intervention or counseling based on the results. In some embodiments, the data are used for research. For example, the data may be used to further optimize the inclusion or elimination of markers as useful indicators of a particular condition associated with the disease.
All publications and patents mentioned in the above specification are herein incorporated by reference in their entirety for all purposes. Various modifications and variations of the described compositions, methods, and uses of the technology will be apparent to those skilled in the art without departing from the scope and spirit of the technology as described. Although the technology has been described in connection with specific exemplary embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be within the scope of the following claims.

Claims

WE CLAIM:
1. A complex for providing a detectable fingerprint of a double-stranded target nucleic acid, the complex comprising:
a) a double-stranded target nucleic acid comprising a first region adjacent to a second region;
b) an immobilized melting component interacting with the first region to form a thermodynamically stable complex and provide the second region in a single-stranded form; and
c) a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid.
2. The complex of claim 1 wherein the melting component comprises a dCas9.
3. The complex of claim 1 wherein the melting component comprises a dCas9/gRNA complex comprising a gRNA hybridized to the first region.
4. The complex of claim 1 wherein the melting component comprises a PAMmer.
5. The complex according to claim 1 wherein the query probe is a fluorescently labeled nucleic acid that hybridizes repeatedly to the second region with a kinetic rate constant k_off that is greater than 0.1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 0.1 min⁻¹.
6. The complex according to claim 2 wherein the dCas9 is immobilized to a substrate.
7. The complex according to claim 1 wherein the fingerprint is detectable by a pattern recognition analysis.
8. The complex according to claim 1 further comprising a second melting component interacting with a third region of the target nucleic acid adjacent to the second region of the target nucleic acid.
9. The complex according to claim 1 wherein the target nucleic acid comprises a mutation, single nucleotide polymorphism, or a modified base.
10. The complex of claim 8 wherein the first and second melting components bind approximately 5 to 15 nucleotides apart on the target nucleic acid.
11. A method for providing a detectable fingerprint of a double-stranded target nucleic acid in a sample, the method comprising:
a) immobilizing a double-stranded target nucleic acid to a discrete region of a solid support, said double-stranded target nucleic acid comprising a first region adjacent to a second region and said discrete region of said solid support comprising an immobilized melting component interacting with the first region;
b) providing a query probe that binds repeatedly to the second region to provide a detectable fingerprint; and
c) associating the detectable fingerprint with the double-stranded nucleic acid to identify the double-stranded nucleic acid.
12. The method of claim 11 comprising analyzing data using pattern recognition to produce or identify the detectable fingerprint of the double-stranded nucleic acid.
13. The method of claim 11 wherein the melting component comprises a dCas9.
14. The method of claim 11 further comprising providing a second dCas9/gRNA complex comprising a gRNA complementary to a third region of the target nucleic acid adjacent to the second region of the target nucleic acid.
15. The method according to claim 11 comprising providing conditions sufficient for the melting component to provide the second region in a single-stranded form.
16. The method of claim 11 comprising detecting repeated binding of the detectably labeled query probe to the second region of the target nucleic acid, wherein the kinetic rate constant k_off is greater than 1 min⁻¹ and/or the kinetic rate constant k_on is greater than 1 min⁻¹.
17. The method of claim 16 further comprising calculating an amount or concentration of the double-stranded target nucleic acid in the sample from the detectable fingerprint.
18. A system for the detection of a double-stranded nucleic acid, the system comprising a solid support comprising an immobilized melting component, a detectably labeled query probe that binds repeatedly to the double-stranded nucleic acid, a fluorescence detector, and a software component configured to perform pattern recognition analysis of query probe binding data.
19. A method for calculating a predictor that a subject has or is at risk of having a cancer, the method comprising one or more of determining the presence of a microRNA biomarker, determining the presence of a mutation, and/or determining the presence of a modified base in genomic DNA.
20. The method of claim 19 wherein the predictor is a value calculated from variables associated with one or more of the presence of a microRNA biomarker, the presence of a mutation, and/or the presence of a modified base in genomic DNA.
21. A complex for detecting a target nucleic acid, the complex comprising:
1) a target nucleic acid comprising a first region adjacent to a second region;
2) a detectably labeled capture probe hybridized to the first region;
3) a query probe labeled with a quencher or fluorescent acceptor compatible with the label of the capture probe,
wherein the query probe hybridizes repeatedly to the second region with a kinetic rate constant k_off that is greater than 0.1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 0.1 min⁻¹.
22. A system for providing a detectable fingerprint of a double-stranded target nucleic acid, the system comprising:
a) an immobilized melting component capable of interacting with a first region of the double-stranded target nucleic acid to form a thermodynamically stable complex and provide a second region of the double-stranded target nucleic acid in a single-stranded form; and
b) a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid.
23. The system of claim 22 wherein the melting component comprises a dCas9.
24. The system of claim 22 wherein the melting component comprises a dCas9/gRNA complex comprising a gRNA complementary to the first region.
25. The system of claim 22 wherein the melting component comprises a PAMmer.
26. The system according to claim 22 wherein the query probe is a fluorescently labeled nucleic acid that hybridizes repeatedly to the second region with a kinetic rate constant k_off that is greater than 0.1 min⁻¹ and/or a kinetic rate constant k_on that is greater than 0.1 min⁻¹.
27. The system according to claim 23 wherein the dCas9 is immobilized to a substrate.
28. The system according to claim 22 wherein the fingerprint is detectable by a pattern recognition analysis.
29. The system according to claim 22 further comprising a second melting component interacting with a third region of the target nucleic acid adjacent to the second region of the target nucleic acid.
30. The system according to claim 22 wherein the target nucleic acid comprises a mutation, single nucleotide polymorphism, or a modified base.
31. The system of claim 29 wherein the first and second melting components bind approximately 5 to 15 nucleotides apart on the target nucleic acid.
32. The system of claim 22 further comprising a fluorescence detector.
33. The system of claim 22 further comprising a software component configured to perform pattern recognition analysis of query probe binding data.
34. The system of claim 22 further comprising a software component configured to calculate an amount or concentration of the double-stranded target nucleic acid in the sample from the detectable fingerprint.
35. The system of claim 22 further comprising an indicator or display for providing a result related to the presence, absence, concentration, amount, or sequence properties of the target nucleic acid.
36. A kit for providing a result related to the presence, absence, concentration, amount, or sequence properties of a target nucleic acid, the kit comprising a solid support, a dCas9/gRNA, and a query probe.
37. The kit of claim 36 further comprising software on a computer-readable format for the collection and analysis of query probe binding events and/or dwell times.
38. The kit of claim 36 further comprising one or more positive controls and/or one or more negative controls.
39. The kit of claim 36 wherein the solid support comprises a microscope slide, a bead, or a coverslip.
40. The kit of claim 36 wherein the solid support comprises avidin or streptavidin.
41. The kit of claim 36 wherein the solid support comprises a zero mode waveguide array.
42. Use of a system for providing a detectable fingerprint of a double-stranded target nucleic acid, the system comprising:
a) an immobilized melting component capable of interacting with a first region of the double-stranded target nucleic acid to form a thermodynamically stable complex and provide a second region of the double-stranded target nucleic acid in a single-stranded form; and
b) a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid.
43. Use of a method to provide a detectable fingerprint of a double-stranded target nucleic acid in a sample, the method comprising:
a) immobilizing a double-stranded target nucleic acid to a discrete region of a solid support, said double-stranded target nucleic acid comprising a first region adjacent to a second region and said discrete region of said solid support comprising an immobilized melting component interacting with the first region;
b) providing a query probe that binds repeatedly to the second region to provide a detectable fingerprint; and
c) associating the detectable fingerprint with the double-stranded nucleic acid to identify the double-stranded nucleic acid.
44. Use of a complex to provide a detectable fingerprint of a double-stranded target nucleic acid in a sample, the complex comprising:
a) a double-stranded target nucleic acid comprising a first region adjacent to a second region;
b) an immobilized melting component interacting with the first region to form a thermodynamically stable complex and provide the second region in a single-stranded form; and
c) a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid.
45. Use of a kit to provide a result related to the presence, absence, concentration, amount, or sequence properties of a target nucleic acid, the kit comprising a solid support, a dCas9/gRNA, and a query probe.
46. A data structure comprising a fingerprint of a double-stranded target nucleic acid in a sample produced according to a method comprising:
a) immobilizing a double-stranded target nucleic acid to a discrete region of a solid support, said double-stranded target nucleic acid comprising a first region adjacent to a second region and said discrete region of said solid support comprising an immobilized melting component interacting with the first region;
b) providing a query probe that binds repeatedly to the second region to provide a detectable fingerprint; and
c) associating the detectable fingerprint with the double-stranded nucleic acid to identify the double-stranded nucleic acid.
47. A data structure comprising a fingerprint of a double-stranded target nucleic acid in a sample produced using a system comprising:
a) an immobilized melting component capable of interacting with a first region of the double-stranded target nucleic acid to form a thermodynamically stable complex and provide a second region of the double-stranded target nucleic acid in a single-stranded form; and
b) a query probe that binds repeatedly to the second region to provide a detectable fingerprint associated with the double-stranded target nucleic acid.
48. A data structure comprising a fingerprint of a double-stranded target nucleic acid in a sample produced using a complex comprising:
a) a double-stranded target nucleic acid comprising a first region adjacent to a second region;
b) an immobilized melting component interacting with the first region to form a thermodynamically stable complex and provide the second region in a single-stranded form; and
PCT/US2017/016977 2016-02-10 2017-02-08 Detection of nucleic acids WO2017139354A1 (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
US16/076,853 US20190048415A1 (en) 2016-02-10 2017-02-08 Detection of nucleic acids
EP22179788.9A EP4155397A1 (en) 2016-02-10 2017-02-08 Detection of nucleic acids
JP2018541687A JP2019513345A (en) 2016-02-10 2017-02-08 Nucleic acid detection
EP20198764.1A EP3800249B1 (en) 2016-02-10 2017-02-08 Detection of nucleic acids
CN202111607215.6A CN114196746A (en) 2016-02-10 2017-02-08 Detection of nucleic acids
ES17750678T ES2835101T3 (en) 2016-02-10 2017-02-08 Nucleic acid detection
EP17750678.9A EP3414327B1 (en) 2016-02-10 2017-02-08 Detection of nucleic acids
CN201780020624.1A CN109072205A (en) 2016-02-10 2017-02-08 The detection of nucleic acid
US17/319,289 US20210348230A1 (en) 2016-02-10 2021-05-13 Detection of nucleic acids

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662293589P 2016-02-10 2016-02-10
US62/293,589 2016-02-10

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US16/076,853 A-371-Of-International US20190048415A1 (en) 2016-02-10 2017-02-08 Detection of nucleic acids
US17/319,289 Continuation US20210348230A1 (en) 2016-02-10 2021-05-13 Detection of nucleic acids

Publications (1)

Publication Number Publication Date
WO2017139354A1 (en) 2017-08-17

Family

ID=59563387

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/016977 WO2017139354A1 (en) 2016-02-10 2017-02-08 Detection of nucleic acids

Country Status (6)

Country Link
US (2) US20190048415A1 (en)
EP (3) EP3800249B1 (en)
JP (3) JP2019513345A (en)
CN (2) CN109072205A (en)
ES (1) ES2835101T3 (en)
WO (1) WO2017139354A1 (en)

WO2022149568A1 (en) * 2021-01-05 2022-07-14 学校法人 川崎学園 Transcriptional product in cells of organism including human, transfected rna, and seal for purifying complex thereof

Also Published As

Publication number Publication date
CN109072205A (en) 2018-12-21
ES2835101T3 (en) 2021-06-21
JP2023153898A (en) 2023-10-18
EP3800249B1 (en) 2022-06-22
EP3800249A1 (en) 2021-04-07
JP7327826B2 (en) 2023-08-16
EP3414327A4 (en) 2019-10-16
US20210348230A1 (en) 2021-11-11
EP3414327B1 (en) 2020-09-30
EP4155397A1 (en) 2023-03-29
CN114196746A (en) 2022-03-18
EP3414327A1 (en) 2018-12-19
JP2019513345A (en) 2019-05-30
US20190048415A1 (en) 2019-02-14
JP2021129581A (en) 2021-09-09

Similar Documents

Publication Publication Date Title
US20210348230A1 (en) Detection of nucleic acids
US20230070399A1 (en) Methods and systems for processing time-resolved signal intensity data
AU2019200289B2 (en) Transposition into native chromatin for personal epigenomics
AU2017200433B2 (en) Multivariate diagnostic assays and methods for using same
CN104372080B (en) polynucleotide mapping and sequencing
CN105899680A (en) Nucleic acid probe and method of detecting genomic fragments
US20210318296A1 (en) Intramolecular kinetic probes
CA3088467A1 (en) Biomarker panel and methods for detecting microsatellite instability in cancers
WO2014114189A1 (en) Methods and compositions for detecting target snp
Wu et al. Multiplexed discrimination of microRNA single nucleotide variants through triplex molecular beacon sensors
US20210155972A1 (en) Targeted rare allele crispr enrichment
CA3150283A1 (en) Selective enrichment

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application; Ref document number: 17750678; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase; Ref document number: 2018541687; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase; Ref country code: DE
WWE WIPO information: entry into national phase; Ref document number: 2017750678; Country of ref document: EP
ENP Entry into the national phase; Ref document number: 2017750678; Country of ref document: EP; Effective date: 20180910