Reads returned data

bio_return(project_dir, return = NULL)

Arguments

project_dir

Path to the enclosing directory of a UKB project.

return

An integer indicating which UKB return to read, e.g., 3388 for PGxPOP returned allele and phenotype calls.
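A minimal sketch of how the two arguments might be supplied. The directory path is a hypothetical placeholder, and the list element names `calls` and `sumstats` are assumed from the description of return 1701. The calls are guarded so the snippet does nothing unless `bio_return()` is loaded and the directory exists.

```r
# Hypothetical project directory; replace with the path to your own UKB project.
project_dir <- "/path/to/ukb_project"

# Guarded so that this sketch is harmless when bio_return() is not loaded
# or the placeholder directory does not exist.
if (exists("bio_return") && dir.exists(project_dir)) {
  # Return 3388: data frame of PGxPOP diplotype and phenotype calls.
  pgx <- bio_return(project_dir, return = 3388)

  # Return 1701: list of two data frames (element names assumed).
  cnv <- bio_return(project_dir, return = 1701)
  str(cnv$calls)
  str(cnv$sumstats)
}
```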

Value

For return 3388, a data frame of PGxPOP-called diplotypes and phenotypes, for both the imputed and integrated (imputed plus exome) data. See McInnes et al. (2020), Pharmacogenetics at scale: An analysis of the UK Biobank.

For return 1701, a list of two data frames: calls and sumstats. These hold CNV calls for the full UK Biobank, analysed with Affymetrix Power Tools followed by PennCNV. For methods, see Kendall et al., Biol Psychiatry, 2017.

In sumstats, the column Filter indicates whether a person passed (1) or failed (0) the filtering criteria: call_rate>0.96 & NumCNV<31 & WF>-0.03 & WF<0.03 & LRR_SD<0.35.

calls includes the following columns:

  • f.eid: the ID (specific to project 14421)

  • chr: chromosome number (only autosomes are included)

  • start / end: position on the chromosome in bp, according to hg19.

  • Type: copy number (0,1 = deletions, 3,4 = duplications)

  • Size: length of the CNV in base pairs

  • Probe: number of SNP probes within the CNV (we have retained only CNVs covered with 10 or more probes)

  • Conf: confidence call for the CNV, according to PennCNV

  • Pathogenic_CNVs: CNVs in 92 regions described in the Supplementary material (Supplementary Table 1) of Owen D, Bracher-Smith M, Kendall KM, Rees E, Einon M, Escott-Price V, Owen MJ, O'Donovan MC, Kirov G. Effects of pathogenic CNVs on physical traits in participants of the UK Biobank. BMC Genomics. 2018 Dec 4;19(1):867. doi: 10.1186/s12864-018-5292-7. PMID: 30509170. The calls have been checked manually and the criteria for accepting a CNV are listed in the same paper: Supplementary Table 2 (typically >50% of the critical interval).

  • N_genes_hit: the number of genes within the CNV (can be intronic)

  • Call_rate / LRR_SD / WF / Num_CNV: indicate the quality control measures for the individual carrying the CNV, identical to those in the “Summary_statistics.dat” file.

  • Density: the number of base pairs per probe within the CNV. We recommend accepting a CNV only if it has no more than 20,000 bp per probe.

  • Filter: CNVs pass (1) if they were called on good arrays (call_rate>0.96 & NumCNV<31 & WF>-0.03 & WF<0.03 & LRR_SD<0.35) and had a density of <20,000 bp per probe; otherwise they are filtered out (0).
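The filtering logic described above can be sketched on toy data. This is an illustration under stated assumptions, not the procedure used to build the return. All values are made up, the QC column names follow the criteria string in the Value section, Density is assumed to equal Size divided by Probe, and Filter is read as 1 = pass, as in sumstats.

```r
# Toy per-person QC table (made-up values; column names assumed).
sumstats <- data.frame(
  f.eid     = c(1, 2, 3),
  call_rate = c(0.99, 0.95, 0.98),
  NumCNV    = c(5, 10, 40),
  WF        = c(0.01, 0.00, -0.01),
  LRR_SD    = c(0.20, 0.30, 0.25)
)

# Array-level criteria as quoted in the documentation (1 = pass, 0 = fail).
sumstats$Filter <- with(
  sumstats,
  as.integer(call_rate > 0.96 & NumCNV < 31 &
               WF > -0.03 & WF < 0.03 & LRR_SD < 0.35)
)

# Toy CNV calls (made-up values).
calls <- data.frame(
  f.eid = c(1, 1, 2),
  Size  = c(150000, 600000, 200000),
  Probe = c(15, 12, 20)
)

# Assumption: Density is base pairs per probe, i.e. Size / Probe.
calls$Density <- calls$Size / calls$Probe

# A call passes if its carrier's array passed QC and Density < 20,000 bp.
good_array <- sumstats$Filter[match(calls$f.eid, sumstats$f.eid)] == 1
calls$Filter <- as.integer(good_array & calls$Density < 20000)

# Keep only the calls that pass.
calls_pass <- calls[calls$Filter == 1, ]
```

With real data, the Filter columns already present in calls and sumstats should be preferred; recomputing them as above is mainly useful for checking one's understanding of the criteria.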