Last updated: 2025-02-24
Checks: 7 0
Knit directory: organoid_oxygen_eqtl/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20250224)
was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 2b45f01. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish
or
wflow_git_commit
). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .DS_Store
Ignored: analysis/figure/
Ignored: code/.DS_Store
Unstaged changes:
Modified: analysis/figure3.Rmd
Modified: analysis/shared_functions_style_items.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (analysis/figure4.Rmd
) and HTML
(docs/figure4.html
) files. If you’ve configured a remote
Git repository (see ?wflow_git_remote
), click on the
hyperlinks in the table below to view the files as they were in that
past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 2b45f01 | Ben Umans | 2025-02-24 | updated for reviews |
Rmd | f0eaf05 | Ben Umans | 2024-09-05 | organized for upload to github |
html | f0eaf05 | Ben Umans | 2024-09-05 | organized for upload to github |
This page describes steps used to compare eQTLs to disease gene results.
library(Seurat)
Attaching SeuratObject
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(pals)
library(RColorBrewer)
library(mashr)
Loading required package: ashr
library(udr)
library(knitr)
library(ggrepel)
library(gridExtra)
Attaching package: 'gridExtra'
The following object is masked from 'package:dplyr':
combine
library(MatrixEQTL)
library(vroom)
Attaching package: 'vroom'
The following objects are masked from 'package:readr':
as.col_spec, col_character, col_date, col_datetime, col_double,
col_factor, col_guess, col_integer, col_logical, col_number,
col_skip, col_time, cols, cols_condense, cols_only, date_names,
date_names_lang, date_names_langs, default_locale, fwf_cols,
fwf_empty, fwf_positions, fwf_widths, locale, output_column,
problems, spec
source("analysis/shared_functions_style_items.R")
Import GWAS Catalog data and separate the mapped genes.
gwas <- read.delim(file = "/project/gilad/umans/references/gwas/gwas_catalog_v1.0.2-associations_e110_r2023-09-10.tsv", quote = "", fill = FALSE) %>%
dplyr::select(!c("DATE.ADDED.TO.CATALOG", "PUBMEDID" ,"FIRST.AUTHOR", "DATE", "JOURNAL","LINK", "STUDY", "INITIAL.SAMPLE.SIZE", "REPLICATION.SAMPLE.SIZE", "PLATFORM..SNPS.PASSING.QC.", "P.VALUE..TEXT.", "X95..CI..TEXT.", "GENOTYPING.TECHNOLOGY", "MAPPED_TRAIT_URI"))
gwas_formatted <- gwas %>%
mutate(MAPPED_GENE=if_else(str_detect(MAPPED_GENE, pattern = "No mapped genes"), "", MAPPED_GENE)) %>%
rowwise() %>%
mutate(gene=strsplit(MAPPED_GENE, split="[ - ]")[[1]][1], upstream_gene=strsplit(MAPPED_GENE, split=" - ")[[1]][1], downstream_gene=strsplit(MAPPED_GENE, split=" - ")[[1]][2]) %>%
mutate(nearest_gene= case_when(UPSTREAM_GENE_DISTANCE<DOWNSTREAM_GENE_DISTANCE ~ upstream_gene,
UPSTREAM_GENE_DISTANCE>DOWNSTREAM_GENE_DISTANCE ~ downstream_gene,
is.na(UPSTREAM_GENE_DISTANCE)|is.na(DOWNSTREAM_GENE_DISTANCE) ~ gene)) %>%
mutate(nearest_gene=gsub("([a-zA-Z]),", "\\1 ", nearest_gene)) %>%
mutate(nearest_gene=gsub(",", "", nearest_gene)) %>%
mutate(nearest_gene=str_trim(nearest_gene)) %>%
mutate(trait=strsplit(DISEASE.TRAIT, split=", ")[[1]][1]) %>%
add_count(trait) %>%
ungroup()
gwas_formatted <- readRDS(file = "/project/gilad/umans/references/gwas/gwas_catalog_v1.0.2-associations_e110_r2023-09-10.RDS")
In cases where the GWAS reported a gene for a given SNP, use the reported gene. Otherwise, assign the nearest gene as the one that corresponds to the associated SNP.
gwas_formatted <- gwas_formatted %>%
mutate(`REPORTED.GENE.S.`=if_else(str_detect(`REPORTED.GENE.S.`, pattern = "NR"), NA, `REPORTED.GENE.S.`)) %>%
mutate(`REPORTED.GENE.S.`=if_else(str_detect(`REPORTED.GENE.S.`, pattern = "intergenic"), NA, `REPORTED.GENE.S.`)) %>%
mutate(`REPORTED.GENE.S.`=if_else(str_detect(`REPORTED.GENE.S.`, pattern = "-"), NA, `REPORTED.GENE.S.`)) %>%
mutate(`REPORTED.GENE.S.`= na_if(`REPORTED.GENE.S.`, "")) %>%
mutate(nearest_gene= case_when(!is.na(`REPORTED.GENE.S.`) ~ `REPORTED.GENE.S.`,
UPSTREAM_GENE_DISTANCE<DOWNSTREAM_GENE_DISTANCE ~ upstream_gene,
UPSTREAM_GENE_DISTANCE>DOWNSTREAM_GENE_DISTANCE ~ downstream_gene,
is.na(UPSTREAM_GENE_DISTANCE)|is.na(DOWNSTREAM_GENE_DISTANCE) ~ gene
)) %>%
mutate(nearest_gene=str_extract(nearest_gene, "[^, ]+")) %>%
mutate(nearest_gene=gsub(",", "", nearest_gene)) %>%
mutate(nearest_gene=str_trim(nearest_gene))
I next curated traits that were reasonably brain-related. This includes addiction-related traits, but not dietary preference traits.
filtered_traits <- read_csv(file = "/project/gilad/umans/references/gwas/gwas_catalog_v1.0.2-associations_e110_r2023-09-10_organoid-related_n15.csv", col_names = FALSE) %>% pull(X1)
Rows: 512 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): X1
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
I next exclude traits with fewer than 15 associations or more than 500 associations, and further filter consider only genome-wide significant associations.
gwas_gene_lists <- split(gwas_formatted %>% filter(trait %in% filtered_traits) %>% filter(n>15) %>% filter(n<500) %>% filter(PVALUE_MLOG>7.3) %>% pull(nearest_gene), gwas_formatted %>% filter(trait %in% filtered_traits) %>% filter(n>15) %>% filter(n<500) %>% filter(PVALUE_MLOG>7.3) %>% pull(trait))
gwas_snps <- gwas_formatted %>% filter(trait %in% filtered_traits) %>% filter(n>15) %>% filter(n<500) %>%
filter(PVALUE_MLOG>7.3) %>% unite("snp", CHR_ID:CHR_POS, sep = ":") %>% pull(snp) %>% unique()
Note that for some common traits there are many GWAS studies:
gwas_formatted %>%
filter(trait %in% filtered_traits) %>%
filter(n>15) %>% filter(n<500) %>% filter(PVALUE_MLOG>7.3) %>%
select(STUDY.ACCESSION, trait) %>% distinct() %>% group_by(trait) %>% summarize(studies=n()) %>%
arrange(desc(studies)) %>% head()
# A tibble: 6 × 2
trait studies
<chr> <int>
1 Alzheimer's disease 35
2 Parkinson's disease 24
3 Bipolar disorder 18
4 Amyotrophic lateral sclerosis 15
5 Major depressive disorder 14
6 Alzheimer's disease (late onset) 12
And, of course, there are a number of very similar traits in this large compendium.
Next, I import significant eGenes from GTEx cerebral cortex tissue samples.
gtex_cortex_signif <- read.table(file = "/project/gilad/umans/references/gtex/GTEx_Analysis_v8_eQTL/Brain_Cortex.v8.egenes.txt.gz", header = TRUE, sep = "\t", stringsAsFactors = FALSE) %>% filter(qval<0.05) %>% pull(gene_name)
gtex_frontalcortex_signif <- read.table(file = "/project/gilad/umans/references/gtex/GTEx_Analysis_v8_eQTL/Brain_Frontal_Cortex_BA9.v8.egenes.txt.gz", header = TRUE, sep = "\t", stringsAsFactors = FALSE) %>% filter(qval<0.05) %>% pull(gene_name)
gtex_cortex_signif <- unique(c(gtex_cortex_signif, gtex_frontalcortex_signif))
And import the eQTL results.
mash_by_condition_output <- readRDS(file = "/project2/gilad/umans/oxygen_eqtl/output/combined_mash-by-condition_EE_fine_reharmonized_newmash_032024.rds") %>% ungroup()
I classified eGenes from the organoid dataset by whether they were significant under normoxia and whether they are responsive to manipulating oxygen. These two binary classifications result in 4 groups: (1) shared effects in all conditions, detectable under normoxia; (2) dynamic and detectable under normoxia; (3) dynamic and not detectable under normoxia; and (4) shared effects under all conditions but not detectable under normoxia. Implicitly, group 4 effects needed additional treatment conditions to detect them not because they’re responsive to treatment but because of the additional power we get.
Here, I plot the number of eGenes of each these four classes that are the nearest genes to a significant GWAS variant in each cell type.
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
ggplot(aes(x=factor(class, levels = c("class1", "class4", "class2", "class3")), y=egene_gwas)) + geom_boxplot(outlier.shape = NA, aes(color=class), linewidth=0.4) +
xlab("eGene class (see note)") +
ylab("Number of eGenes in GWAS") +
theme_light() +
geom_point(aes(group=source, color=source), position = position_jitter(width = 0.2, height = 0), size=0.5) +
scale_color_manual(values=c(manual_palette_fine, class_colors)) +
xlab("eGene class (see note)") +
theme(legend.position="none")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Version | Author | Date |
---|---|---|
f0eaf05 | Ben Umans | 2024-09-05 |
What if we looked at fraction of the eGenes in that category instead of total numbers?
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique())),
egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n()) %>%
ggplot(aes(x=factor(class, levels = c("class1", "class4", "class2", "class3")), y=egene_in_gwas_fraction)) + geom_boxplot(outlier.shape = NA, aes(color=class), linewidth=0.4) +
xlab("eGene class (see note)") +
ylab("Number of eGenes in GWAS") +
theme_light() +
geom_point(aes(group=source, color=source), position = position_jitter(width = 0.2, height = 0), size=0.5) +
scale_color_manual(values=c(manual_palette_fine, class_colors)) +
xlab("eGene class (see note)") +
theme(legend.position="none")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Basically the same proportion of each category.
Here, I show the numbers plotted above and perform a Wilcoxon test comparing the dynamic/not-detected-in-normoxia (“class3”) GWAS-associated eGenes with the non-dynamic/detected-in-normoxia (“class1”) GWAS-associated eGenes.
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
group_by(class) %>%
summarize(median(egene_gwas)) %>% kable()
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
class | median(egene_gwas) |
---|---|
class1 | 28 |
class2 | 4 |
class3 | 9 |
class4 | 11 |
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by( class) %>%
summarise(egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n())
# A tibble: 4 × 2
class egene_in_gwas_fraction
<chr> <dbl>
1 class1 0.332
2 class2 0.288
3 class3 0.306
4 class4 0.318
Of course, there’s a greater number of eQTLs in GWAS among the “standard” eQTLs, because there are more of them:
wilcox.test(mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class1),
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class3), paired = TRUE)
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Warning in wilcox.test.default(mash_by_condition_output %>%
filter(sig_anywhere) %>% : cannot compute exact p-value with ties
Wilcoxon signed rank test with continuity correction
data: mash_by_condition_output %>% filter(sig_anywhere) %>% mutate(class = case_when(control10_lfsr < 0.1 & allsharing ~ "class1", control10_lfsr < 0.1 & !allsharing ~ "class2", sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3", sig_anywhere & allsharing ~ "class4")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class1) and mash_by_condition_output %>% filter(sig_anywhere) %>% mutate(class = case_when(control10_lfsr < 0.1 & allsharing ~ "class1", control10_lfsr < 0.1 & !allsharing ~ "class2", sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3", sig_anywhere & allsharing ~ "class4")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class3)
V = 84, p-value = 0.007896
alternative hypothesis: true location shift is not equal to 0
But the proportion (i.e., the yield) is the same:
wilcox.test(mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n()) %>%
pivot_wider(names_from = class, values_from = egene_in_gwas_fraction) %>% pull(class1),
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n()) %>%
pivot_wider(names_from = class, values_from = egene_in_gwas_fraction) %>% pull(class3), paired = TRUE)
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Wilcoxon signed rank exact test
data: mash_by_condition_output %>% filter(sig_anywhere) %>% mutate(class = case_when(control10_lfsr < 0.1 & allsharing ~ "class1", control10_lfsr < 0.1 & !allsharing ~ "class2", sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3", sig_anywhere & allsharing ~ "class4")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n()) %>% pivot_wider(names_from = class, values_from = egene_in_gwas_fraction) %>% pull(class1) and mash_by_condition_output %>% filter(sig_anywhere) %>% mutate(class = case_when(control10_lfsr < 0.1 & allsharing ~ "class1", control10_lfsr < 0.1 & !allsharing ~ "class2", sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3", sig_anywhere & allsharing ~ "class4")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n()) %>% pivot_wider(names_from = class, values_from = egene_in_gwas_fraction) %>% pull(class3)
V = 64, p-value = 0.2163
alternative hypothesis: true location shift is not equal to 0
Overall (ie, not per-cell type):
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
select(gene, class) %>%
distinct() %>%
group_by(class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
kable()
class | egene_gwas |
---|---|
class1 | 291 |
class2 | 59 |
class3 | 148 |
class4 | 175 |
The same numbers can be calculated excluding genes that have already been identified as eGenes in GTEx cerebral cortex tissue.
mash_by_condition_output %>%
filter(sig_anywhere) %>%
filter(gene %not_in% gtex_cortex_signif) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
group_by(class) %>%
summarize(median(egene_gwas))
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
# A tibble: 4 × 2
class `median(egene_gwas)`
<chr> <int>
1 class1 15
2 class2 2
3 class3 5
4 class4 8
Overall (ie, not per-cell type):
mash_by_condition_output %>%
filter(sig_anywhere) %>%
filter(gene %not_in% gtex_cortex_signif) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
select(gene, class) %>%
distinct() %>%
group_by(class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
kable()
class | egene_gwas |
---|---|
class1 | 171 |
class2 | 32 |
class3 | 96 |
class4 | 108 |
To test whether the “dynamic” vs “standard” eGenes (defined solely by comparisons of effect size) are more likely to be present in GWAS gene lists, we group and test them as we did for comparison to GTEx. Note this yields a similar conclusion as above, where we broke eGenes down by whether they were dynamic and detectable at baseline.
wilcox.test(mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(allsharing ~ "standard",
hypoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hyperoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hypoxia_hyperoxia_shared & sharing_contexts==1 ~ "dynamic",
sharing_contexts==2 ~ "standard",
sharing_contexts==0 ~ "dynamic"
)) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
filter(class %in% c("dynamic")) %>% pull(egene_gwas),
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(allsharing ~ "standard",
hypoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hyperoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hypoxia_hyperoxia_shared & sharing_contexts==1 ~ "dynamic",
sharing_contexts==2 ~ "standard",
sharing_contexts==0 ~ "dynamic"
)) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
filter(class %in% c("standard")) %>% pull(egene_gwas), paired=TRUE, alternative = "two.sided")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Wilcoxon signed rank exact test
data: mash_by_condition_output %>% filter(sig_anywhere) %>% mutate(class = case_when(allsharing ~ "standard", hypoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hyperoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hypoxia_hyperoxia_shared & sharing_contexts == 1 ~ "dynamic", sharing_contexts == 2 ~ "standard", sharing_contexts == 0 ~ "dynamic")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% filter(class %in% c("dynamic")) %>% pull(egene_gwas) and mash_by_condition_output %>% filter(sig_anywhere) %>% mutate(class = case_when(allsharing ~ "standard", hypoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hyperoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hypoxia_hyperoxia_shared & sharing_contexts == 1 ~ "dynamic", sharing_contexts == 2 ~ "standard", sharing_contexts == 0 ~ "dynamic")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% filter(class %in% c("standard")) %>% pull(egene_gwas)
V = 0, p-value = 0.0002441
alternative hypothesis: true location shift is not equal to 0
mash_by_condition_output_coarse <- readRDS(file = "/project2/gilad/umans/oxygen_eqtl/output/combined_mash-by-condition_EE_coarse_reharmonized_newmash_032024.rds") %>% ungroup()
Again, I plot the number of eGenes of each these four classes that are the nearest genes to a significant GWAS variant in each cell type.
mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
ggplot(aes(x=factor(class, levels = c("class1", "class4", "class2", "class3")), y=egene_gwas)) + geom_boxplot(outlier.shape = NA, aes(color=class), linewidth=0.4) +
xlab("eGene class (see note)") +
ylab("Number of eGenes in GWAS") +
theme_light() +
geom_point(aes(group=source, color=source), position = position_jitter(width = 0.2, height = 0), size=0.5) +
scale_color_manual(values=c(manual_palette_coarse, class_colors)) +
xlab("eGene class (see note)") +
theme(legend.position="none")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Here, I show the numbers plotted above and perform a Wilcoxon test comparing the dynamic/not-detected-in-normoxia (“class3”) GWAS-associated eGenes with the non-dynamic/detected-in-normoxia (“class1”) GWAS-associated eGenes.
mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
group_by(class) %>%
summarize(median(egene_gwas)) %>% kable()
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
class | median(egene_gwas) |
---|---|
class1 | 37 |
class2 | 4 |
class3 | 14 |
class4 | 15 |
wilcox.test(mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class1),
mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class3), paired = TRUE)
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Warning in wilcox.test.default(mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>% : cannot compute exact p-value with ties
Wilcoxon signed rank test with continuity correction
data: mash_by_condition_output_coarse %>% filter(sig_anywhere) %>% mutate(class = case_when(control10_lfsr < 0.1 & allsharing ~ "class1", control10_lfsr < 0.1 & !allsharing ~ "class2", sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3", sig_anywhere & allsharing ~ "class4")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class1) and mash_by_condition_output_coarse %>% filter(sig_anywhere) %>% mutate(class = case_when(control10_lfsr < 0.1 & allsharing ~ "class1", control10_lfsr < 0.1 & !allsharing ~ "class2", sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3", sig_anywhere & allsharing ~ "class4")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% pivot_wider(names_from = class, values_from = egene_gwas) %>% pull(class3)
V = 27, p-value = 0.03429
alternative hypothesis: true location shift is not equal to 0
Note that, in the coarsely-clustered data, more GWAS-associated eGenes appear non-dynamic (ie, have equivalent responses across oxygen conditions) and fewer appear context-specific.
The same numbers can be calculated excluding genes that have already been identified as eGenes in GTEx cerebral cortex tissue.
mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
filter(gene %not_in% gtex_cortex_signif) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
group_by(class) %>%
summarize(median(egene_gwas))
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
# A tibble: 4 × 2
class `median(egene_gwas)`
<chr> <dbl>
1 class1 20
2 class2 2
3 class3 9
4 class4 10
What if we looked at fraction of the eGenes in that category instead of total numbers?
mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique())),
egene_in_gwas_fraction = (sum(gene %in% (unlist(gwas_gene_lists) %>% unique())))/n()) %>%
ggplot(aes(x=factor(class, levels = c("class1", "class4", "class2", "class3")), y=egene_in_gwas_fraction)) + geom_boxplot(outlier.shape = NA, aes(color=class), linewidth=0.4) +
xlab("eGene class (see note)") +
ylab("Number of eGenes in GWAS") +
theme_light() +
geom_point(aes(group=source, color=source), position = position_jitter(width = 0.2, height = 0), size=0.5) +
scale_color_manual(values=c(manual_palette_fine, class_colors)) +
xlab("eGene class (see note)") +
theme(legend.position="none")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Basically the same proportion of each category.
To test whether the “dynamic” vs “standard” eGenes (defined solely by comparisons of effect size) are more likely to be present in GWAS gene lists, we group and test them as we did for comparison to GTEx. Note this yields a similar conclusion as above, where we broke eGenes down by whether they were dynamic and detectable at baseline.
wilcox.test(mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(allsharing ~ "standard",
hypoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hyperoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hypoxia_hyperoxia_shared & sharing_contexts==1 ~ "dynamic",
sharing_contexts==2 ~ "standard",
sharing_contexts==0 ~ "dynamic"
)) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% filter(class %in% c("dynamic")) %>% pull(egene_gwas),
mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(allsharing ~ "standard",
hypoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hyperoxia_normoxia_shared & sharing_contexts==1 ~ "dynamic",
hypoxia_hyperoxia_shared & sharing_contexts==1 ~ "dynamic",
sharing_contexts==2 ~ "standard",
sharing_contexts==0 ~ "dynamic"
)) %>%
complete(source, class) %>%
group_by(source, class, .drop=FALSE) %>%
summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% filter(class %in% c("standard")) %>% pull(egene_gwas), paired=TRUE, alternative = "two.sided")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Wilcoxon signed rank exact test
data: mash_by_condition_output_coarse %>% filter(sig_anywhere) %>% mutate(class = case_when(allsharing ~ "standard", hypoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hyperoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hypoxia_hyperoxia_shared & sharing_contexts == 1 ~ "dynamic", sharing_contexts == 2 ~ "standard", sharing_contexts == 0 ~ "dynamic")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% filter(class %in% c("dynamic")) %>% pull(egene_gwas) and mash_by_condition_output_coarse %>% filter(sig_anywhere) %>% mutate(class = case_when(allsharing ~ "standard", hypoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hyperoxia_normoxia_shared & sharing_contexts == 1 ~ "dynamic", hypoxia_hyperoxia_shared & sharing_contexts == 1 ~ "dynamic", sharing_contexts == 2 ~ "standard", sharing_contexts == 0 ~ "dynamic")) %>% complete(source, class) %>% group_by(source, class, .drop = FALSE) %>% summarise(egene_gwas = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>% filter(class %in% c("standard")) %>% pull(egene_gwas)
V = 0, p-value = 0.01563
alternative hypothesis: true location shift is not equal to 0
Finally, compare the topic-interacting eQTL genes to the GWAS gene list. Obtain the topic 7-correlated eGenes the same as shown previously:
crm_signif <- vroom("/project2/gilad/umans/oxygen_eqtl/topicqtl/outputs/topics15/all_genes_merged_fine_fasttopics_15_topics.cellregmap.sighits.tsv")
Rows: 289 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (2): GENE_HGNC, VARIANT_ID
dbl (3): P_CELLREGMAP, P_BONF, q
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
crm_iegenes <- crm_signif$GENE_HGNC
crm_betas <- vroom(paste0("/project2/gilad/umans/oxygen_eqtl/topicqtl/outputs/topics15/fasttopics_fine_15_topics.", crm_iegenes, ".cellregmap.betas.tsv"))
Rows: 3094323 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (3): PSEUDOCELL, GENE_HGNC, VARIANT_ID
dbl (2): BETA_GXC, BETA_G
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
crm_betas_wide <- dplyr::select(crm_betas, c(PSEUDOCELL, BETA_GXC, GENE_HGNC, VARIANT_ID)) %>%
unite(TOPIC_QTL, GENE_HGNC, VARIANT_ID, sep="_") %>%
pivot_wider(id_cols=PSEUDOCELL, names_from=TOPIC_QTL, values_from=BETA_GXC)
topic_loadings <- vroom("/project2/gilad/umans/oxygen_eqtl/topicqtl/pseudocell_loadings_k15.tsv")
Rows: 10707 Columns: 16
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (1): pseudocell
dbl (15): k1, k2, k3, k4, k5, k6, k7, k8, k9, k10, k11, k12, k13, k14, k15
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
crm_betas_loadings <- left_join(crm_betas_wide, topic_loadings, by=c("PSEUDOCELL"="pseudocell"))
beta_topic_corrs_matrix <- cor(dplyr::select(crm_betas_loadings, -c(PSEUDOCELL, paste0("k", seq(15)))),
dplyr::select(crm_betas_loadings, paste0("k", seq(15))))
beta_topic_corrs_p <- apply(dplyr::select(crm_betas_loadings, -c(PSEUDOCELL, paste0("k", seq(15)))) %>% as.matrix(), MARGIN = 2, FUN = function(x) cor.test(x, crm_betas_loadings$k7)$p.value )
k7_egenes <- beta_topic_corrs_matrix %>% as.data.frame() %>%
dplyr::select(k7) %>%
bind_cols(beta_topic_corrs_p) %>%
rownames_to_column(var = "gene_snp") %>%
separate(gene_snp, into = c("genename", "snp"), sep = "_") %>%
mutate(snp_short = str_sub(snp, end = -5)) %>%
filter(`...2` < (0.05/289)) %>% pull(genename)
New names:
• `` -> `...2`
Now, intersect with GWAS genes:
sum(k7_egenes %in% (unlist(gwas_gene_lists) %>% unique()))
[1] 55
If we exclude the genes in GTEx cortex samples:
length(intersect(setdiff(k7_egenes, gtex_cortex_signif), (unlist(gwas_gene_lists) %>% unique())))
[1] 31
How many are not included in our pseudobulk-identified set of hypoxia-specific eGenes?
hypoxia_egenes <- mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(allsharing ~ "allsharing",
hypoxia_normoxia_shared & sharing_contexts==1 ~ "hyperoxia_specific",
hyperoxia_normoxia_shared & sharing_contexts==1 ~ "hypoxia_specific",
hypoxia_hyperoxia_shared & sharing_contexts==1 ~ "normoxia_specific",
sharing_contexts==2 ~ "partial_shared",
sharing_contexts==0 ~ "alldiff"
)) %>%
filter(class=="hypoxia_specific") %>%
pull(gene) %>%
unique()
any_egenes <- mash_by_condition_output %>%
filter(sig_anywhere) %>%
pull(gene) %>%
unique()
length(setdiff(k7_egenes, hypoxia_egenes))
[1] 214
length(setdiff(k7_egenes, any_egenes))
[1] 150
Here, I import datasets downloaded from SCHEMA, BipEx, Epi25, DDD, and SFARI Genes.
schema <- read.table(file = "/project2/gilad/umans/oxygen_eqtl/data/external/SCHEMA_gene_results.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# convert to gene symbols
bipex <- read.table(file = "/project2/gilad/umans/oxygen_eqtl/data/external/BipEx_gene_results_new.tsv", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# convert to gene symbols
epi25 <- read.table(file = "/project2/gilad/umans/oxygen_eqtl/data/external/Epi25_gene_results.tsv", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# convert to gene symbols
ddd <- read.csv(file = "/project2/gilad/umans/oxygen_eqtl/data/external/DDD_gene_results.csv", header = TRUE, stringsAsFactors = FALSE)
sfari <- read.csv(file = "/project2/gilad/umans/oxygen_eqtl/data/external/SFARI-Gene_genes.csv", header = TRUE, stringsAsFactors = FALSE)
Change the ensembl gene names in the SCHEMA, BipEx, and Epi25 sets:
library(GenomicRanges)
gene.gr <- readRDS('/project/gilad/umans/brainchromatin/data_files/rds/AllGenes_GenomicRanges.RDS')
convertGeneIDs <- function(genelist, id.mapping.table) {
return(mapValues(x = genelist, mapping.table = id.mapping.table))
}
mapValues <- function(x,mapping.table) {
out <- mapping.table[x]
if(any(is.na(out))) {
na <- sum(is.na(out))
cat(sprintf("Warning: mapping returned %s NA values\n", as.character(na)))
}
return(out)
}
name2ensembl <- names(gene.gr)
# Something funny about this one, unsure why yet
name2ensembl['CDR1'] <- 'ENSG00000281508'
names(name2ensembl) <- elementMetadata(gene.gr)[ ,'gene_name']
ensembl2name <- elementMetadata(gene.gr)[ ,'gene_name']
names(ensembl2name) <- names(gene.gr)
schema$gene_name <- unname(convertGeneIDs(schema$gene_id, ensembl2name))
Warning: mapping returned 222 NA values
bipex$gene_name <- unname(convertGeneIDs(bipex$gene_id, ensembl2name))
Warning: mapping returned 684 NA values
epi25$gene_name <- unname(convertGeneIDs(epi25$gene_id, ensembl2name))
Warning: mapping returned 28 NA values
Compile results into a list.
rare_gene_list <- list(bipex= bipex %>%
filter(group=="Bipolar Disorder") %>%
filter(damaging_missense_fisher_gnom_non_psych_pval<0.05|ptv_fisher_gnom_non_psych_pval<0.05) %>%
drop_na(gene_name) %>%
pull(gene_name),
epi25 = epi25 %>% filter(group=="EPI") %>%
filter(ptv_pval<0.05) %>%
drop_na(gene_name) %>%
pull(gene_name),
schema = schema %>% filter(Qmeta<0.05) %>%
drop_na(gene_name) %>%
pull(gene_name),
sfari_asd = sfari %>% filter(gene.score==1) %>%
pull(gene.symbol),
sfari_syndromic = sfari %>% filter(syndromic==1) %>%
pull(gene.symbol),
ddd= ddd %>% filter(significant) %>% pull(symbol)
)
Pleiotropy in psychiatric genetics is well documented. Are the genes pulled from each of these sources largely unique?
length(unique(unlist(rare_gene_list)))/length(unlist(rare_gene_list))
[1] 0.8132296
Yes, about 81% of the genes here are not redundant across studies.
Compile the same results into a data frame.
rare_gene_frame <- rbind(data.frame(source= "bipex", gene= bipex %>%
filter(group=="Bipolar Disorder") %>%
filter(damaging_missense_fisher_gnom_non_psych_pval<0.05|ptv_fisher_gnom_non_psych_pval<0.05) %>%
drop_na(gene_name) %>%
pull(gene_name)),
data.frame(source="epi25", gene = epi25 %>% filter(group=="EPI") %>%
filter(ptv_pval<0.05) %>%
drop_na(gene_name) %>%
pull(gene_name)),
data.frame(source= "schema", gene = schema %>% filter(Qmeta<0.05) %>%
drop_na(gene_name) %>%
pull(gene_name)),
data.frame(source= "sfari_asd", gene = sfari %>% filter(gene.score==1) %>%
pull(gene.symbol)),
data.frame(source="sfari_syndromic", gene = sfari %>% filter(syndromic==1) %>%
pull(gene.symbol)),
data.frame(source= "ddd", gene= ddd %>% filter(significant) %>% pull(symbol)))
Now, just like above for GWAS genes, compare each eGene class to rare variant genes in each cell type.
mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_rare = sum(gene %in% (unlist(rare_gene_frame$gene) %>% unique()))) %>%
ggplot(aes(x=factor(class, levels = c("class1", "class4", "class2", "class3")), y=egene_rare)) +
geom_boxplot(outlier.shape = NA, aes(color=class)) +
ylab("Rare variant genes with eQTLs") +
theme_light() +
geom_point(aes(group=source, color=source), position = position_jitter(width = 0.2, height = 0)) +
scale_color_manual(values=c(manual_palette_fine, class_colors)) +
xlab("eGene class (see note)") +
theme(legend.position="none")
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
Version | Author | Date |
---|---|---|
f0eaf05 | Ben Umans | 2024-09-05 |
# additionally, compare to GWAS results as above, but first restricting to rare variant-associated genes.
mash_by_condition_output %>%
filter(sig_anywhere) %>%
filter(gene %in% rare_gene_frame$gene) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
group_by(source, class) %>%
summarise(egene_rare = sum(gene %in% (unlist(gwas_gene_lists) %>% unique()))) %>%
ggplot(aes(x=factor(class, levels = c("class1", "class4", "class2", "class3")), y=egene_rare)) +
geom_boxplot(outlier.shape = NA, aes(color=class), linewidth=0.4) +
xlab("eGene class (see note") +
ylab("Rare variant genes with GWAS eQTLs") +
theme_light() +
scale_color_manual(values=c(manual_palette_fine, class_colors)) +
theme(legend.position="none") +
geom_point(aes(group=source, color=source), position = position_jitter(width = 0.2, height = 0), size=0.5)
`summarise()` has grouped output by 'source'. You can override using the
`.groups` argument.
The total number of rare variant genes with an eQTL is:
mash_by_condition_output %>%
filter(sig_anywhere) %>%
filter(gene %in% rare_gene_frame$gene) %>%
pull(gene) %>%
unique() %>%
length()
[1] 225
The total number of rare variant genes with an eQTL that correspond to one of our GWAS traits is:
mash_by_condition_output %>%
filter(sig_anywhere) %>%
filter(gene %in% rare_gene_frame$gene) %>%
filter(gene %in% (unlist(gwas_gene_lists) %>% unique())) %>%
pull(gene) %>%
unique() %>%
length()
[1] 123
Rare variant genes that are eGenes in our dataset allow us to ask about the phenotypic consequences of modulating expression levels of genes for which loss of function has an identified phenotype, linked together by a SNP. Here, we can join those together, bridging the GWAS results and the rare variant results by our eQTL results. Note that this is of course an underestimate of this analysis, since we’re only using those SNPs chosen as lead variants by mash which are themselves GWAS hits. There may be other equivalent or secondary SNPs masked by mash here that still provide information.
inner_join(x=mash_by_condition_output %>%
# filter(sig_anywhere) %>%
filter(control10_lfsr < 0.2 | stim1pct_lfsr < 0.2 | stim21pct_lfsr < 0.2 ) %>%
filter(gene %in% rare_gene_frame$gene) %>%
mutate(class=case_when(control10_lfsr < 0.2 & allsharing ~ "class1",
control10_lfsr < 0.2 & !allsharing ~ "class2",
control10_lfsr > 0.2 & !allsharing ~ "class3",
allsharing ~ "class4")) %>%
separate(gene_snp, into = c("genename", "snp"), sep = "_") %>%
mutate(snp_short = str_sub(snp, end = -5)) %>%
select(gene, snp, snp_short, source, control10_lfsr:stim21pct_beta, class),
y=gwas_formatted %>% filter(trait %in% filtered_traits) %>% filter(n>15) %>% filter(n<500) %>%
filter(PVALUE_MLOG>7.3) %>% unite("snp", CHR_ID:CHR_POS, sep = ":") %>% select(nearest_gene, REPORTED.GENE.S., trait, n, snp, STUDY.ACCESSION, PVALUE_MLOG, UPSTREAM_GENE_DISTANCE, DOWNSTREAM_GENE_DISTANCE) ,
by=join_by(snp_short==snp), relationship = "many-to-many") %>%
mutate(across(ends_with("lfsr"), signif)) %>%
mutate(across(ends_with("beta"), signif)) %>%
select(gene, source, class, snp, nearest_gene, STUDY.ACCESSION, trait) %>%
left_join(x=., y=rare_gene_frame, by=join_by(gene==gene), relationship = "many-to-many")
gene source.x class snp nearest_gene STUDY.ACCESSION
1 NRXN3 Glut class1 14:78164146:C:G NRXN3 GCST90267280
2 CUL3 InhGNRH class3 2:224608546:C:T CUL3 GCST90239693
3 CTTNBP2 InhThalamic class4 7:117883655:A:G CTTNBP2 GCST005324
4 CTTNBP2 InhThalamic class4 7:117883655:A:G CTTNBP2 GCST008810
5 CTTNBP2 InhThalamic class4 7:117883655:A:G CTTNBP2 GCST007474
6 CTTNBP2 InhThalamic class4 7:117883655:A:G CTTNBP2 GCST007462
7 CTTNBP2 InhThalamic class4 7:117883655:A:G CTTNBP2 GCST007085
8 SCN1A NeuronOther class1 2:166142257:C:G SCN3A GCST007343
9 SCN1A NeuronOther class1 2:166142257:C:G SCN3A GCST007343
10 SCN1A NeuronOther class1 2:166142257:C:G SCN3A GCST007343
11 SCN1A NeuronOther class1 2:166142257:C:G SCN3A GCST007343
12 GNAO1 RG class1 16:56227517:T:G GNAO1 GCST007982
13 POU3F3 RGcycling class1 2:104849547:G:A PANTR1 GCST90012894
14 POU3F3 RGcycling class1 2:104849547:G:A PANTR1 GCST90012894
15 POU3F3 RGcycling class1 2:104849547:G:A PANTR1 GCST90131905
16 POU3F3 RGcycling class1 2:104849547:G:A PANTR1 GCST90131905
trait
1 Age when finished full-time education (standard GWA)
2 Risk-taking behavior (multivariate analysis)
3 Depressive symptoms
4 Smoking initiation (ever regular vs never regular)
5 Smoking initiation (ever regular vs never regular)
6 Age of smoking initiation (MTAG)
7 Smoking status
8 Epilepsy
9 Epilepsy
10 Epilepsy
11 Epilepsy
12 Sleep duration
13 Brain shape (segment 15)
14 Brain shape (segment 15)
15 Whole brain restricted directional diffusion (multivariate analysis)
16 Whole brain restricted directional diffusion (multivariate analysis)
source.y
1 sfari_asd
2 sfari_asd
3 epi25
4 epi25
5 epi25
6 epi25
7 epi25
8 epi25
9 sfari_asd
10 sfari_syndromic
11 ddd
12 ddd
13 sfari_syndromic
14 ddd
15 sfari_syndromic
16 ddd
Finally, we can look specifically for cases in which our data link a GWAS SNP to a different gene than was either reported in the GWAS or closest to the associated locus.
full_join(x=mash_by_condition_output %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
separate(gene_snp, into = c("genename", "snp"), sep = "_") %>%
mutate(snp = str_sub(snp, end = -5)) %>%
filter(snp %in% unique(gwas_snps)) %>%
select(gene, snp, source, control10_lfsr:stim21pct_beta, class),
y=gwas_formatted %>% filter(trait %in% filtered_traits) %>% filter(n>15) %>% filter(n<500) %>%
filter(PVALUE_MLOG>7.3) %>% unite("snp", CHR_ID:CHR_POS, sep = ":") %>% select(nearest_gene, REPORTED.GENE.S., trait, n, snp, STUDY.ACCESSION, PVALUE_MLOG) ,
by=join_by(snp==snp), relationship = "many-to-many") %>%
mutate(across(ends_with("lfsr"), signif)) %>%
mutate(across(ends_with("beta"), signif)) %>%
filter(gene != nearest_gene) %>%
select(gene) %>%
# distinct() %>% # remove this to get the number of associations (some pleiotoropic genes will be duplicated)
dim()
[1] 19 1
# this gets the number of genes with a changed interpretation
(7 genes)
full_join(x=mash_by_condition_output_coarse %>%
filter(sig_anywhere) %>%
mutate(class=case_when(control10_lfsr < 0.1 & allsharing ~ "class1",
control10_lfsr < 0.1 & !allsharing ~ "class2",
sig_anywhere & control10_lfsr > 0.1 & !allsharing ~ "class3",
sig_anywhere & allsharing ~ "class4")) %>%
separate(gene_snp, into = c("genename", "snp"), sep = "_") %>%
mutate(snp = str_sub(snp, end = -5)) %>%
filter(snp %in% unique(gwas_snps)) %>%
select(gene, snp, source, control10_lfsr:stim21pct_beta, class),
y=gwas_formatted %>% filter(trait %in% filtered_traits) %>% filter(n>15) %>% filter(n<500) %>%
filter(PVALUE_MLOG>7.3) %>% unite("snp", CHR_ID:CHR_POS, sep = ":") %>% select(nearest_gene, REPORTED.GENE.S., trait, n, snp, STUDY.ACCESSION, PVALUE_MLOG) ,
by=join_by(snp==snp), relationship = "many-to-many") %>%
mutate(across(ends_with("lfsr"), signif)) %>%
mutate(across(ends_with("beta"), signif)) %>%
filter(gene != nearest_gene) %>%
select(gene) %>%
# distinct() %>% # remove this to get the number of associations (some pleiotoropic genes will be duplicated)
dim()
[1] 25 1
# this gets the number of genes with a changed interpretation
(9 genes)
Here, I plot a couple of example eGenes identified above as cases of changed or updated interpretation of a GWAS SNP’s regulatory target based on our data.
genotypes <- read_table("/project2/gilad/umans/oxygen_eqtl/data/MatrixEQTL/snps/YRI_genotypes_maf10hwee-6_full/chr1_for_matrixeqtl.snps") %>% pivot_longer(cols=starts_with("NA"), names_to = "individual", values_to = "snp")
for (i in 2:22){
genotypes <- rbind(genotypes, read_table(paste0("/project2/gilad/umans/oxygen_eqtl/data/MatrixEQTL/snps/YRI_genotypes_maf10hwee-6_full/chr",i,"_for_matrixeqtl.snps", sep="")) %>% pivot_longer(cols=starts_with("NA"), names_to = "individual", values_to = "snp"))
}
genotypes <- genotypes %>% mutate(ID = str_sub(ID, end = -5))
pseudo_data <- readRDS(file = "/project2/gilad/umans/oxygen_eqtl/output/pseudo_fine_quality_filtered_nomap_qtl_20240305.RDS")
pseudo_data <- filter.pseudobulk(pseudo_data, threshold = 20)
As a first example, the SNP located at 17:75826364 (rs2008012), which is associated with a white matter structural phenotype (uncinate fasciculus thickness). The nearest gene to this SNP is UNK. Note that this QTL is not significant by our lfsr threshold of 0.1 (lfsr 0.26).
p1 <- make_boxplot(celltype = "Immature", condition = "control10", testsnp = "17:75826364", testgene = "H3F3B") + geom_boxplot(alpha=0, linewidth=0.4) + theme_light() + theme(legend.position="none") + ylab("") + theme(axis.text.x=element_text(size=6), axis.text.y=element_blank(), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())
p2 <- make_boxplot(celltype = "Immature", condition = "stim1pct", testsnp = "17:75826364", testgene = "H3F3B") + geom_boxplot(alpha=0, linewidth=0.4) + theme_light() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_blank(), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())
p3 <- make_boxplot(celltype = "Immature", condition = "stim21pct", testsnp = "17:75826364", testgene = "H3F3B") + geom_boxplot(alpha=0, linewidth=0.4) + theme_light() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_blank(), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank())
grid.arrange(p1, p2, p3, nrow=1)
p1 <- make_boxplot(celltype = "IPcycling", condition = "control10", testsnp = "17:75826364", testgene = "H3F3B") + geom_boxplot(alpha=0, linewidth=0.4) + theme_classic() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_blank(), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_blank())
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
NA19153 = col_double(),
NA19207 = col_double(),
NA18913 = col_double(),
NA18853 = col_double(),
NA18856 = col_double(),
NA19093 = col_double(),
NA19210 = col_double(),
NA19190 = col_double(),
NA18502 = col_double(),
NA18517 = col_double(),
NA18507 = col_double(),
NA18501 = col_double(),
NA18489 = col_double(),
NA19102 = col_double(),
NA19128 = col_double()
)
p2 <- make_boxplot(celltype = "IPcycling", condition = "stim1pct", testsnp = "17:75826364", testgene = "H3F3B") + geom_boxplot(alpha=0, linewidth=0.4) + theme_classic() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_blank(), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_blank())
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
NA19153 = col_double(),
NA19207 = col_double(),
NA18913 = col_double(),
NA18853 = col_double(),
NA18856 = col_double(),
NA19093 = col_double(),
NA19144 = col_double(),
NA19210 = col_double(),
NA19190 = col_double(),
NA18502 = col_double(),
NA18517 = col_double(),
NA18507 = col_double(),
NA18501 = col_double(),
NA18489 = col_double(),
NA19102 = col_double(),
NA19128 = col_double()
)
p3 <- make_boxplot(celltype = "IPcycling", condition = "stim21pct", testsnp = "17:75826364", testgene = "H3F3B") + geom_boxplot(alpha=0, linewidth=0.4) + theme_classic() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_blank(), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_blank())
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
NA19153 = col_double(),
NA19207 = col_double(),
NA18913 = col_double(),
NA18853 = col_double(),
NA18856 = col_double(),
NA19093 = col_double(),
NA19210 = col_double(),
NA19190 = col_double(),
NA18502 = col_double(),
NA18517 = col_double(),
NA18507 = col_double(),
NA18519 = col_double(),
NA18501 = col_double(),
NA18489 = col_double(),
NA19102 = col_double(),
NA19128 = col_double()
)
grid.arrange(p1, p2, p3, nrow=1)
For the gene PPIA (cyclophilin A). Here, lfsr<0.2.
p1 <- make_boxplot(celltype = "NeuronOther", condition = "stim1pct", testsnp = "7:44768492", testgene = "PPIA") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none") + ylab("")
p2 <- make_boxplot(celltype = "NeuronOther", condition = "stim21pct", testsnp = "7:44768492", testgene = "PPIA") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
p3 <- make_boxplot(celltype = "NeuronOther", condition = "control10", testsnp = "7:44768492", testgene = "PPIA") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
grid.arrange(p3, p1, p2, nrow=1)
The following example genes do have significant effects in at least one condition (lfsr<0.1):
p1 <- make_boxplot(celltype = "Immature", condition = "stim1pct", testsnp = "14:103551768", testgene = "CKB") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none") + ylab("")
p2 <- make_boxplot(celltype = "Immature", condition = "stim21pct", testsnp = "14:103551768", testgene = "CKB") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
p3 <- make_boxplot(celltype = "Immature", condition = "control10", testsnp = "14:103551768", testgene = "CKB") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
grid.arrange(p3, p1, p2, nrow=1)
COA8, a cytochrome c oxidase interacting factor:
p1 <- make_boxplot(celltype = "InhThalamic", condition = "control10", testsnp = "14:103551768", testgene = "COA8") + geom_boxplot(alpha=0, linewidth=0.4) + theme_light() + theme(legend.position="none") + ylab("") + theme(axis.text.x=element_text(size=6), axis.text.y=element_text(size=6), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_blank())
p2 <- make_boxplot(celltype = "InhThalamic", condition = "stim1pct", testsnp = "14:103551768", testgene = "COA8") + geom_boxplot(alpha=0, linewidth=0.4) + theme_light() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_text(size=6), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_blank())
p3 <- make_boxplot(celltype = "InhThalamic", condition = "stim21pct", testsnp = "14:103551768", testgene = "COA8") + geom_boxplot(alpha=0, linewidth=0.4) + theme_light() + theme(legend.position="none") + ylab("")+ theme(axis.text.x=element_text(size=6), axis.text.y=element_text(size=6), plot.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_blank())
grid.arrange(p1, p2, p3, nrow=1)
The gene RPS26, which has a significant QTL effect in multiple cell types. This SNP is associated with cognitive phenotypes.
p1 <- make_boxplot(celltype = "NeuronOther", condition = "stim1pct", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none") + ylab("")
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
NA19153 = col_double(),
NA19207 = col_double(),
NA18913 = col_double(),
NA19098 = col_double(),
NA19093 = col_double(),
NA19144 = col_double(),
NA19143 = col_double(),
NA18502 = col_double(),
NA18508 = col_double(),
NA18517 = col_double(),
NA18507 = col_double(),
NA18501 = col_double(),
NA18489 = col_double(),
NA19102 = col_double(),
NA18511 = col_double()
)
p2 <- make_boxplot(celltype = "NeuronOther", condition = "stim21pct", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
NA19153 = col_double(),
NA19207 = col_double(),
NA18913 = col_double(),
NA19098 = col_double(),
NA18853 = col_double(),
NA19093 = col_double(),
NA19144 = col_double(),
NA19143 = col_double(),
NA19190 = col_double(),
NA18502 = col_double(),
NA18508 = col_double(),
NA18517 = col_double(),
NA18507 = col_double(),
NA18501 = col_double(),
NA18489 = col_double(),
NA19102 = col_double(),
NA18511 = col_double(),
NA19128 = col_double()
)
p3 <- make_boxplot(celltype = "NeuronOther", condition = "control10", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
── Column specification ────────────────────────────────────────────────────────
cols(
ID = col_character(),
NA19153 = col_double(),
NA19207 = col_double(),
NA18913 = col_double(),
NA19098 = col_double(),
NA18856 = col_double(),
NA19093 = col_double(),
NA19144 = col_double(),
NA19143 = col_double(),
NA18502 = col_double(),
NA18508 = col_double(),
NA18517 = col_double(),
NA18507 = col_double(),
NA18501 = col_double(),
NA18489 = col_double(),
NA19102 = col_double(),
NA18511 = col_double()
)
grid.arrange(p3, p1, p2, nrow=1)
p1 <- make_boxplot(celltype = "RGcycling", condition = "stim1pct", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none") + ylab("")
p2 <- make_boxplot(celltype = "RGcycling", condition = "stim21pct", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
p3 <- make_boxplot(celltype = "RGcycling", condition = "control10", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
grid.arrange(p3, p1, p2, nrow=1)
p1 <- make_boxplot(celltype = "GliaProg", condition = "stim1pct", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none") + ylab("")
p2 <- make_boxplot(celltype = "GliaProg", condition = "stim21pct", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
p3 <- make_boxplot(celltype = "GliaProg", condition = "control10", testsnp = "12:56007301", testgene = "RPS26") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
grid.arrange(p3, p1, p2, nrow=1)
And PLRG1, an important and pleiotropic regulator, which has a significant eQTL effect only under normoxia that is suppressed under oxygen stress. This SNP is strongly associated with stroke risk (GCST005838, GCST005843).
p1 <- make_boxplot(celltype = "CorticalHem", condition = "stim1pct", testsnp = "4:154580036", testgene = "PLRG1") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none") + ylab("")
p2 <- make_boxplot(celltype = "CorticalHem", condition = "stim21pct", testsnp = "4:154580036", testgene = "PLRG1") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
p3 <- make_boxplot(celltype = "CorticalHem", condition = "control10", testsnp = "4:154580036", testgene = "PLRG1") + geom_boxplot(alpha=0) + theme_classic() + theme(legend.position="none")+ ylab("")
grid.arrange(p3, p1, p2, nrow=1)
sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C
[4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GenomicRanges_1.50.2 GenomeInfoDb_1.34.9 IRanges_2.32.0
[4] S4Vectors_0.36.2 BiocGenerics_0.44.0 vroom_1.6.4
[7] MatrixEQTL_2.3 gridExtra_2.3 ggrepel_0.9.4
[10] knitr_1.45 udr_0.3-153 mashr_0.2.79
[13] ashr_2.2-63 RColorBrewer_1.1-3 pals_1.8
[16] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.0
[19] dplyr_1.1.4 purrr_1.0.2 readr_2.1.4
[22] tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.4
[25] tidyverse_2.0.0 SeuratObject_4.1.4 Seurat_4.4.0
[28] workflowr_1.7.0
loaded via a namespace (and not attached):
[1] plyr_1.8.9 igraph_1.5.1 lazyeval_0.2.2
[4] sp_2.1-3 splines_4.2.0 listenv_0.9.1
[7] scattermore_1.2 digest_0.6.34 invgamma_1.1
[10] htmltools_0.5.7 SQUAREM_2021.1 fansi_1.0.5
[13] magrittr_2.0.3 tensor_1.5 cluster_2.1.3
[16] ROCR_1.0-11 tzdb_0.4.0 globals_0.16.2
[19] matrixStats_1.1.0 timechange_0.2.0 spatstat.sparse_3.0-3
[22] colorspace_2.1-0 xfun_0.41 RCurl_1.98-1.13
[25] crayon_1.5.2 callr_3.7.3 jsonlite_1.8.7
[28] progressr_0.14.0 spatstat.data_3.0-3 survival_3.3-1
[31] zoo_1.8-12 glue_1.6.2 polyclip_1.10-0
[34] gtable_0.3.4 zlibbioc_1.44.0 XVector_0.38.0
[37] leiden_0.4.2 future.apply_1.11.1 maps_3.4.0
[40] abind_1.4-5 scales_1.2.1 mvtnorm_1.2-3
[43] spatstat.random_3.1-3 miniUI_0.1.1.1 Rcpp_1.0.12
[46] viridisLite_0.4.2 xtable_1.8-4 reticulate_1.34.0
[49] bit_4.0.5 mapproj_1.2.11 truncnorm_1.0-9
[52] htmlwidgets_1.6.2 httr_1.4.7 ellipsis_0.3.2
[55] ica_1.0-3 farver_2.1.1 pkgconfig_2.0.3
[58] sass_0.4.7 uwot_0.1.16 deldir_1.0-6
[61] utf8_1.2.4 labeling_0.4.3 tidyselect_1.2.0
[64] rlang_1.1.3 reshape2_1.4.4 later_1.3.0
[67] munsell_0.5.0 tools_4.2.0 cachem_1.0.8
[70] cli_3.6.1 generics_0.1.3 ggridges_0.5.3
[73] evaluate_0.23 fastmap_1.1.1 yaml_2.3.5
[76] goftest_1.2-3 bit64_4.0.5 processx_3.8.2
[79] fs_1.6.3 fitdistrplus_1.1-8 RANN_2.6.1
[82] pbapply_1.5-0 future_1.33.1 nlme_3.1-157
[85] whisker_0.4.1 mime_0.12 compiler_4.2.0
[88] rstudioapi_0.17.1 plotly_4.10.0 png_0.1-8
[91] spatstat.utils_3.0-4 bslib_0.5.1 stringi_1.8.1
[94] highr_0.9 ps_1.7.5 lattice_0.20-45
[97] Matrix_1.6-3 vctrs_0.6.4 pillar_1.9.0
[100] lifecycle_1.0.4 spatstat.geom_3.2-7 lmtest_0.9-40
[103] jquerylib_0.1.4 RcppAnnoy_0.0.19 bitops_1.0-7
[106] data.table_1.14.8 cowplot_1.1.1 irlba_2.3.5.1
[109] httpuv_1.6.5 patchwork_1.1.3 R6_2.5.1
[112] promises_1.2.0.1 KernSmooth_2.23-20 parallelly_1.37.0
[115] codetools_0.2-18 dichromat_2.0-0.1 assertthat_0.2.1
[118] MASS_7.3-56 rprojroot_2.0.3 withr_2.5.2
[121] sctransform_0.4.1 GenomeInfoDbData_1.2.9 parallel_4.2.0
[124] hms_1.1.3 grid_4.2.0 rmarkdown_2.26
[127] Rtsne_0.16 git2r_0.30.1 mixsqp_0.3-48
[130] getPass_0.2-2 spatstat.explore_3.0-6 shiny_1.7.5.1
[133] rmeta_3.0