3 Function overview
Function | GenoM | Pedigree | Pairs | AgePrior | SeqList | Reusable output | Plot function |
---|---|---|---|---|---|---|---|
sequoia | Y | + | + | Pedigree, SeqList | SeqListSummary | ||
GetMaybeRel | Y | + | + | Pairs | PlotRelPairs | ||
CalcOHLLR | Y | Y | + | + | SeqListSummary | ||
CalcPairLL | Y | + | Y | + | + | PlotPairLL | |
MakeAgePrior | + | AgePrior | PlotAgePrior | ||||
SimGeno | Y | GenoM | SnpStats | ||||
PedCompare | YY | PlotPedComp | |||||
ComparePairs | Y+ | + | Pairs | PlotRelPairs |
3.1 Input
GenoConvert
Read in genotype data from PLINK file, Colony file, or many user-specified formats, and return a matrix in sequoia’s (or Colony) format. Does not support vcf yet.LHConvert
Extract sex and birth year from PLINK file; optionally recode sex to 1=female, 2=male, check consistency with other LifeHistData, or combine family ID and individual ID into FID__IID.CheckGeno
Check that the provided genotype matrix is in the correct format, and check for low call rate samples and SNPsSnpStats
Calculate per-SNP allele frequency, missingness, and, if a pedigree provided, number of Mendelian errors.CalcMaxMismatch
Calculate the maximum expected number of mismatches for duplicate samples, and Mendelian errors for parent-offspring pairs and parent-parent-offspring trios.
3.2 Simulate
SimGeno
Simulate genotype data for independent SNPs. Specify pedigree, founder MAF, call rate, proportion of non-genotyped parents, genotyping error & error model.MkGenoErrors
Add genotyping errors and missingness to genotype data; more fine-scale control than with SimGeno.
3.3 Ageprior
MakeAgePrior
For various categories of pairwise relatives (R), calculate age-difference (A) based probability ratios \(P(A|R) / P(A) = P(R|A) / P(R)\), or how much likelier a relationship is given the age difference. It applies corrections when the skeleton-pedigree contains few/no pairs with known age difference for some relationships.PlotAgePrior
Visualise the age-difference based prior probability ratios as a heatmap.
3.4 Pedigree reconstruction
sequoia
Main function to run parentage assignment and full pedigree reconstruction, calls many of the other functions.GetMaybeRel
Identify pairs of individuals likely to be related, but not assigned as such in the provided pedigree. Either search only for potential parent-offspring pairs, or for all 1st and 2nd degree relatives.
3.5 Pedigree check
These functions can be applied to any pedigree, not just pedigrees reconstructed by sequoia. Required input between brackets
SummarySeq (1 pedigree)
Graphical overview of the assignment rate, the proportion dummy parents, sibship sizes, parental LLR distributions, and Mendelian errors, \(+\) tables with pedigree summary statistics.CalcOHLLR (pedigree + genotypes)
Count opposite homozygous (OH) loci between parent-offspring pairs and Mendelian errors (ME) between parent-parent-offspring trios, and calculate the parental log-likelihood ratios (LLR).EstConf (pedigree + genotypes)
Estimate assignment error rate (false positives & false negatives). Using a reference pedigree, repeatedly simulate genotype data, run sequoia, and compare inferred to reference pedigree.getAssignCat (1 pedigree)
Identify which individuals are genotyped, and which can potentially be substituted by a dummy individual. ‘Dummifiable’ are those non-genotyped individuals with at least 2 genotyped offspring, or at least 1 genotyped offspring and 1 genotyped parent.PedCompare (2 pedigrees)
Compare 2 pedigrees, e.g. field and genetically inferred, or reference and inferred-from-simulated-data. Matches dummy parents to non-genotyped parents.CalcRped (1 pedigree) This is a wrapper for
kinship()
in packagekinship2
.
3.6 Pairwise relationships
CalcPairLL (pairs + genotypes) For each pair, calculate the log10-likelihoods of being various different types of relative, or unrelated.
GetRelCat (1 pedigree)
Determine the relationship between individual X and all other individuals in the pedigree, going up to 1 or 2 generations back.GetRelM (pedigree or pairs) Generate a matrix with all pairwise relationships from a pedigree or dataframe with pairs
ComparePairs (1 or 2 pedigrees)
Compare, count and identify different types of relative pairs between two pedigrees, or within one pedigree. 2
- PlotRelPairs (matrix) plot pairwise relationships between all individuals, as by Colony.
3.7 Miscellaneous
PedPolish
Ensure all parents & all genotyped individuals are included, remove duplicates, rename columns, and replace 0 by NA or v.v. Can also generate ‘filler’ parents for software that requires individuals to have either 0 or 2 parents, never 1.getGenerations For each individual in a pedigree, count the number of generations since its most distant pedigree founder.
ErrToM
Generate a matrix with the probabilities of observed genotypes (columns) conditional on actual genotypes (rows), or return a function to generate such matrices. The error matrix can be used as input forsequoia
andCalcOHLLR
, the error function as input forSimGeno
writeSeq
Write the list with sequoia output in human-readable format, either as a folder with .txt files, or as a many-tabbed excel file. The latter uses R packagexlsx
, which requires java and can (therefore) be cumbersome to install.writeColumns
write data.frame or matrix to a text file, using white space padding to keep columns aligned.FindFamilies
Add a column with family IDs (FIDs) to a pedigree, with each number denoting a cluster of connected individuals.PedStripFID
Reverse the joining of FID and IID in GenoConvert and LHConvert** CalcBYprobs** Estimate the probability that individual i with unknown birth year is born in year y, based on the birthyears of its parents and offspring.
The matrix returned by
DyadCompare
is a subset of the matrix returned here using default settings.↩︎