ukbtools

ukbtools: Manipulate and Explore UK Biobank Data

UKB Dataframe

Functions to wrangle the UKB data into a dataframe with meaningful column names.

ukb_df()

Reads a UK Biobank phenotype fileset and returns a single dataset.

ukb_df_field()

Makes a UKB data-field to variable name table for reference or lookup.

ukb_df_full_join()

Recursively joins a list of UKB datasets

ukb_df_duplicated_name()

Checks for duplicated names within a UKB dataset

ukb_centre()

Inserts UKB centre names into data

ukb_context()

Demographics of a UKB sample subset
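To illustrate the idea behind ukb_df_full_join(), here is a minimal base R sketch of a recursive full join. It is not the package's implementation. It folds a pairwise outer merge over a list of toy datasets, and it assumes that every dataset shares the participant identifier column `eid`. The helper name `full_join_all` and the toy columns are hypothetical.

```r
# Three toy datasets sharing the participant identifier column `eid`
# (assumption: every dataset to be joined carries this column).
ukb_a <- data.frame(eid = c(1, 2, 3), sex = c("F", "M", "F"))
ukb_b <- data.frame(eid = c(2, 3, 4), bmi = c(24.1, 30.5, 27.8))
ukb_c <- data.frame(eid = c(1, 4), smoker = c(TRUE, FALSE))

# Hypothetical helper: fold a pairwise full (outer) join over the list.
# merge(..., all = TRUE) keeps rows that appear in either input and
# fills the columns missing for a participant with NA.
full_join_all <- function(datasets, by = "eid") {
  Reduce(function(x, y) merge(x, y, by = by, all = TRUE), datasets)
}

joined <- full_join_all(list(ukb_a, ukb_b, ukb_c))
print(joined)
```

Each participant who appears in any of the inputs gets one row in the result, so variables from separately downloaded filesets end up side by side.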

Genetic Metadata

Functions to query the associated genetic sample QC information.

ukb_gen_read_fam()

Reads a PLINK format fam file

ukb_gen_read_sample()

Reads an Oxford format sample file

ukb_gen_rel_count()

Relatedness count

ukb_gen_related_with_data()

Subset of the UKB relatedness dataframe restricted to samples with data on the variable of interest

ukb_gen_samples_to_remove()

Related samples (with data on the variable of interest) to remove

ukb_gen_sqc_names()

Sample QC column names

ukb_gen_write_bgenie()

Writes a BGENIE format phenotype or covariate file.

ukb_gen_write_plink()

Writes a PLINK format phenotype or covariate file
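For orientation, the two input formats read by ukb_gen_read_fam() and ukb_gen_read_sample() can be parsed with base R alone. The sketch below does not call ukbtools. It writes two toy files to temporary paths and reads them back, assuming the standard layouts: a PLINK fam file has six whitespace-separated columns and no header, and an Oxford sample file has a header line followed by a column-type line. The column names given to the fam data are hypothetical.

```r
# Toy PLINK .fam file: six whitespace-separated columns, no header row
# (family ID, individual ID, paternal ID, maternal ID, sex, phenotype).
fam_path <- tempfile(fileext = ".fam")
writeLines(c("1001 1001 0 0 1 -9",
             "1002 1002 0 0 2 -9"), fam_path)

fam <- read.table(
  fam_path,
  header = FALSE,
  # Hypothetical column names chosen for this sketch.
  col.names = c("fid", "iid", "paternal_id", "maternal_id", "sex", "phenotype"),
  # Keep IDs and the phenotype as text so codes such as -9 are not altered.
  colClasses = c(rep("character", 4), "integer", "character")
)

# Toy Oxford .sample file: line 1 holds the column names, line 2 holds the
# column types, and the data rows start on line 3.
sample_path <- tempfile(fileext = ".sample")
writeLines(c("ID_1 ID_2 missing sex",
             "0 0 0 D",
             "1001 1001 0 1",
             "1002 1002 0 2"), sample_path)

# Read the names from line 1, then skip both header lines when reading data.
sample_names <- scan(sample_path, what = character(), nlines = 1, quiet = TRUE)
samples <- read.table(sample_path, skip = 2, header = FALSE,
                      col.names = sample_names)

print(fam)
print(samples)
```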

Disease Diagnoses

Functions to query the UKB hospital episodes statistics.

ukb_icd_code_meaning()

Retrieves the description for an ICD code.

ukb_icd_diagnosis()

Retrieves diagnoses for an individual.

ukb_icd_freq_by()

Frequency of an ICD diagnosis by a target variable

ukb_icd_keyword()

Retrieves diagnoses whose description contains a keyword.

ukb_icd_prevalence()

Returns the prevalence for an ICD diagnosis

Datasets

ukbcentre

UKB assessment centre

icd10chapters

International Classification of Diseases Revision 10 (ICD-10) chapters

icd10codes

International Classification of Diseases Revision 10 (ICD-10) codes

icd9chapters

International Classification of Diseases Revision 9 (ICD-9) chapters

icd9codes

International Classification of Diseases Revision 9 (ICD-9) codes

Defunct

The genetic metadata functions were written to retrieve genetic metadata from the phenotype file for the interim genotype release. The fields they retrieved became obsolete when the full genotyping results were released at the end of 2017. With the release of genotypes for the full sample (500K individuals), sample QC (ukb_sqc_v2.txt) and relatedness (ukbA_rel_sP.txt) data are now supplied as separate files. The contents of these files, along with all other genetic files, are described in UKB Resource 531.

ukb_gen_excl()

Sample exclusions

ukb_gen_excl_to_na()

Inserts NA into phenotype for genetic metadata exclusions

ukb_gen_het()

Heterozygosity outliers

ukb_gen_meta()

Genetic metadata

ukb_gen_pcs()

Genetic principal components

ukb_gen_rel()

Creates a table of related individuals

ukb_gen_write_plink_excl()

Writes a PLINK format file for combined exclusions