How it works

General requirements for primers

Because the primers are intended for Sanger sequencing, the following settings matter for producing usable primers. The default values can be changed if needed.

  • maximum insert size for a primer: 700 bp

  • minimum insert size for a primer: 200 bp

  • distance of primers from exon borders: 40 bp

These are the settings used for Primer3:

  • optimal primer size: 20 bp

  • minimum primer size: 20 bp

  • maximum primer size: 22 bp

  • optimal temperature: 60°C

  • minimum temperature: 58°C

  • maximum temperature: 62°C

  • maximum poly-X (maximum allowable length of a mononucleotide repeat): 5

  • GC clamp (required number of consecutive Gs and Cs at the 3’ end of both the left and right primer): 1
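As a rough illustration, the values above could be collected as follows. The dictionary keys are Primer3's own global tag names; the constant names, the `override_settings` helper, and the way the package actually stores its defaults are assumptions made for this sketch.

```python
# General requirements (constant names are illustrative, not the package's own).
MAX_INSERT_SIZE = 700      # bp
MIN_INSERT_SIZE = 200      # bp
EXON_BORDER_DISTANCE = 40  # bp

# Primer3 settings, keyed by Primer3's global tag names.
PRIMER3_SETTINGS = {
    "PRIMER_OPT_SIZE": 20,
    "PRIMER_MIN_SIZE": 20,
    "PRIMER_MAX_SIZE": 22,
    "PRIMER_OPT_TM": 60.0,
    "PRIMER_MIN_TM": 58.0,
    "PRIMER_MAX_TM": 62.0,
    "PRIMER_MAX_POLY_X": 5,
    "PRIMER_GC_CLAMP": 1,
}


def override_settings(**changes):
    """Return a copy of the defaults with selected tags replaced (hypothetical helper)."""
    unknown = set(changes) - set(PRIMER3_SETTINGS)
    if unknown:
        raise KeyError(f"unknown Primer3 tags: {sorted(unknown)}")
    return {**PRIMER3_SETTINGS, **changes}
```

Returning a copy keeps the defaults untouched, so one run with relaxed settings does not affect the next.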

VariantPrimerGenerator

A variant in HGVS nomenclature is used to generate primers for the specific position. If the variant is located in an exon, primers are generated for the whole exon. If not, primers are generated based on the genomic position of the variant.

The general flow of this code is as follows:

  1. Check whether the given reference genome has been downloaded; if not, download it.

  2. Validate the given mutation with Mutalyzer and convert it from the coding position to the genomic position.

  3. Extract the transcript number (NM) and transcript-specific information from the NCBI RefSeq database.

  4. Find the sequence position based on the gene information and genomic position of the variant.

  5. Check whether the mutation is in an exon and whether this exon fits within the maximum insert size.

    5a. The mutation is in an exon and the exon fits within the insert size: generate primers using PrimerExon.

    5b. The mutation is in an exon but the exon is larger than the maximum insert size: generate primers using PrimerGenomic.

    5c. The mutation is outside the exon borders: generate primers using PrimerGenomic.
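The branching in step 5 can be sketched as a small function. This is an illustration only: `choose_generator` is a hypothetical name, exons are assumed to be `(start, end)` tuples with inclusive coordinates, and the sketch compares the bare exon length with the maximum insert size, since the text does not say whether the 40 bp border distance is included in that check.

```python
def choose_generator(variant_pos, exons, max_insert_size=700):
    """Return the name of the primer generator for a genomic variant position.

    exons: list of (start, end) tuples, inclusive coordinates (assumed format).
    """
    for start, end in exons:
        if start <= variant_pos <= end:
            exon_length = end - start + 1
            if exon_length <= max_insert_size:
                return "PrimerExon"     # case 5a: exon fits in the insert
            return "PrimerGenomic"      # case 5b: exon too large
    return "PrimerGenomic"              # case 5c: variant outside all exons


exons = [(100, 300), (1000, 2000)]
print(choose_generator(150, exons))
print(choose_generator(1500, exons))
print(choose_generator(500, exons))
```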

ExonPrimerGenerator

If primers for a specific exon in a transcript are needed, they can be generated by stating the transcript number (NM-number) and the exon number.

The general flow of this code is as follows:

  1. Check whether the given reference genome has been downloaded; if not, download it.

  2. Extract transcript-specific information from the NCBI RefSeq database.

  3. Check whether the exon exists in this transcript.

  4. Determine the exon start and end position based on the strand the gene is located on.

  5. Generate primers and either write them to a file or return them as a list.
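Steps 3 and 4 might look roughly like the sketch below. It assumes the transcript information provides exon starts and ends sorted by ascending genomic position, so that on the minus strand exon 1 is the last entry in the list. The function name and this input format are assumptions, not the package's actual interface.

```python
def exon_coordinates(exon_starts, exon_ends, strand, exon_number):
    """Return (start, end) of an exon, numbered 1-based in transcript order.

    exon_starts / exon_ends: lists sorted by ascending genomic position
    (assumed format). strand: "+" or "-".
    """
    if len(exon_starts) != len(exon_ends):
        raise ValueError("exon_starts and exon_ends differ in length")
    count = len(exon_starts)
    # Step 3: the requested exon must exist in this transcript.
    if not 1 <= exon_number <= count:
        raise ValueError(f"transcript has {count} exons, got exon {exon_number}")
    # Step 4: on the minus strand, transcript order is reversed.
    if strand == "+":
        index = exon_number - 1
    elif strand == "-":
        index = count - exon_number
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return exon_starts[index], exon_ends[index]
```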

GenePrimerGenerator

This option generates primers for every exon in a transcript using PrimerExon.

The general flow of this code is as follows:

  1. Check whether the given reference genome has been downloaded; if not, download it.

  2. Extract transcript-specific information from the NCBI RefSeq database.

  3. Iterate over the exons and generate exon-specific primers using PrimerExon.

GenomicPositionPrimerGenerator

This is the simplest way to generate primers, since it only requires a start and end position, the chromosome, and the name of the reference genome.

The general flow of this code is as follows:

  1. Check whether the given reference genome has been downloaded; if not, download it.

  2. Check whether the region between the start and stop position lies within the minimum and maximum insert size.

  3. Generate primers based on positions.
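The size check in step 2 can be sketched as below, using the default limits listed under the general requirements. The function name is hypothetical and the coordinates are assumed to be inclusive. What the package does when the check fails is not described here, so the sketch only reports the result.

```python
def insert_size_ok(start, end, min_insert_size=200, max_insert_size=700):
    """Return True if the region start..end (inclusive) fits the insert size limits."""
    if end < start:
        raise ValueError("end position must not be smaller than start position")
    size = end - start + 1
    return min_insert_size <= size <= max_insert_size


print(insert_size_ok(1000, 1499))  # 500 bp region
print(insert_size_ok(1000, 1050))  # 51 bp region
print(insert_size_ok(1000, 2000))  # 1001 bp region
```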

Resources used

Primer3

Primer3 is available both as an online tool and as a Python package. A separate manual explains all of its variables.

The package generates primers from a given genomic sequence and a dictionary containing all required variables.

genomepy

genomepy allows easy installation and use of genomes in Python. Here it is used to download the required version of the human genome (hg19 or hg38) and to extract the sequences for which the primers are generated.

hgvs.parser

The hgvs parser is a package to handle biological sequence variants based on the HGVS nomenclature. Here it is used to parse a variant in