Heterozygosity outliers are typically removed from genetic association analyses. This function returns either a vector of heterozygosity outliers to remove (samples more than 3 SD above or below the mean heterozygosity), or a data frame with heterozygosity scores for all samples.
ukb_gen_het(data, all.het = FALSE)
data: A UKB dataset created with ukb_df.
all.het: Set all.het = TRUE to return heterozygosity scores for all samples. The default, all.het = FALSE, returns a vector of sample IDs for individuals more than 3 SD above or below the mean heterozygosity.
A vector of IDs if all.het = FALSE (default), or a data frame with ID, heterozygosity and PCA-corrected heterozygosity if all.het = TRUE.
UKB have published full details of genotyping and quality control for the interim genotype data.
if (FALSE) {
# Heterozygosity outliers (+/-3SD)
outlier_het_ids <- ukb_gen_het(my_ukb_data)
# Retrieve all raw and PCA-corrected heterozygosity scores
ukb_het <- ukb_gen_het(my_ukb_data, all.het = TRUE)
}
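The +/-3SD rule described above can be illustrated on simulated data. This is a minimal sketch, not the package's implementation: the data frame `scores` and its columns `eid` and `het` are hypothetical stand-ins for per-sample heterozygosity scores, and the rule is applied to the raw score only.

```r
# Simulated stand-in for per-sample heterozygosity scores.
# The column names `eid` and `het` are hypothetical, chosen for illustration.
set.seed(1)
scores <- data.frame(
  eid = 1:1000,
  het = c(rnorm(998, mean = 0.19, sd = 0.005), 0.10, 0.30)
)

# Flag samples more than 3 standard deviations from the mean.
het_mean <- mean(scores$het)
het_sd <- sd(scores$het)
is_outlier <- abs(scores$het - het_mean) > 3 * het_sd

# Vector of IDs to remove, analogous to the default return value.
outlier_ids <- scores$eid[is_outlier]

# Drop the flagged samples before running an association analysis.
scores_clean <- scores[!scores$eid %in% outlier_ids, ]
```

The same filtering step would apply to a vector of IDs returned by ukb_gen_het, assuming the dataset carries a matching sample ID column.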