E-Locus - Institutional Repository of the University of Crete

Identifier

000457260

Title

Bayesian causal feature selection from observational and limited experimental data

Alternative Title

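Μπεϋζιανή αιτιακή επιλογή χαρακτηριστικών από παρατηρησιακά και περιορισμένα πειραματικά δεδομένα (Greek; English: Bayesian causal feature selection from observational and limited experimental data)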

Author

Λελόβα, Κωνσταντίνα

Thesis advisor

Τριανταφύλλου, Σοφία

Reviewer

Τσαμαρδίνος, Ιωάννης
Καμαριανάκης, Ιωάννης

Abstract

In medical research, a central goal is selecting the variables that yield an optimal predictive model or help uncover associations between treatment, outcome, and pre-treatment variables. One of the most crucial challenges doctors face, however, is selecting the treatment that will optimize an individual patient's outcome. This objective can be framed as a feature selection problem: predicting post-intervention outcomes from pre-intervention variables. Experimental data from randomized controlled trials allow unbiased estimation of the probability of post-treatment outcomes, but such data have limited sample sizes and may be underpowered to estimate conditional effects accurately. Observational data contain many more samples, but in most realistic cases confounding variables make it difficult to establish causal relationships. Identifying a set of appropriate covariates and adjusting for them to mitigate confounding bias is therefore not always possible from observational data alone. This thesis argues that combining experimental and observational data can improve the prediction of the post-intervention outcome and lead to unbiased estimation of the conditional treatment effect. We propose a Bayesian feature selection method for finding the Markov boundary from observational data. Using the concepts of feature selection, Bayesian inference, and Bayesian regression, we then extend a recently proposed method [40] that combines large observational and limited experimental data to identify adjustment sets and improve the estimation of causal effects for a target population. That method was developed for multinomial distributions with Dirichlet priors, which admit closed-form solutions; we present its extension to data sets with both binary and continuous explanatory variables when the outcome is binary or ordinal.
In healthcare settings, ordinal data are of great importance because they allow nuanced measurement of patient outcomes, yet a significant gap exists in effective methods for predicting post-intervention outcomes in this case. We test our method on simulated data under different conditions. Results indicate that our method (a) accurately identifies the correct Markov boundary in both the binary and the ordinal case, even when applied to small observational data sets, and (b) performs strongly in identifying the optimal set Z that, when included in a model, yields the best prediction of the post-intervention outcome P(Y | do(X), Z). The experiments used limited experimental samples together with large observational samples. With ordinal outcomes, a larger experimental data set is needed than in the binary case.
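
As an illustration of the multinomial setting with Dirichlet priors that the abstract mentions, the following is a minimal sketch and not the thesis's implementation. It assumes a discrete outcome Y and compares two hypotheses for a candidate set Z: either the observational conditional P(Y | X, Z) equals the experimental P(Y | do(X), Z), so observational counts can serve as prior knowledge for the experimental data, or the two differ. The function names `log_marginal` and `posterior_same`, the uniform prior, the equal prior odds, and the example counts are all assumptions made for this sketch.

```python
from math import lgamma, exp

def log_marginal(counts, alpha):
    """Log marginal likelihood of an observed sequence with the given
    per-category counts under a Dirichlet(alpha) prior (closed form)."""
    total = lgamma(sum(alpha)) - lgamma(sum(alpha) + sum(counts))
    for n_k, a_k in zip(counts, alpha):
        total += lgamma(a_k + n_k) - lgamma(a_k)
    return total

def posterior_same(obs_counts, exp_counts, prior=1.0):
    """Posterior probability (equal prior odds, an assumption) that the
    experimental data follow the same conditional distribution as the
    observational data.

    Both arguments map a configuration of (X, Z) to a list of counts
    of Y in that configuration.
    """
    log_same = 0.0  # experimental data scored with the observational posterior
    log_diff = 0.0  # experimental data scored with the flat prior only
    for config, n_exp in exp_counts.items():
        flat = [prior] * len(n_exp)
        n_obs = obs_counts.get(config, [0] * len(n_exp))
        informed = [a + n for a, n in zip(flat, n_obs)]
        log_same += log_marginal(n_exp, informed)
        log_diff += log_marginal(n_exp, flat)
    # Clamp the exponent to avoid overflow for very decisive comparisons.
    return 1.0 / (1.0 + exp(min(log_diff - log_same, 700.0)))

# Hypothetical counts of binary Y for X = 0 and X = 1 (Z omitted for brevity).
obs = {(0,): [800, 200], (1,): [300, 700]}      # large observational sample
exp_agree = {(0,): [16, 4], (1,): [6, 14]}      # small trial, same proportions
exp_disagree = {(0,): [4, 16], (1,): [14, 6]}   # small trial, reversed proportions

print(posterior_same(obs, exp_agree))
print(posterior_same(obs, exp_disagree))
```

In this toy setup, a high posterior suggests that the large observational sample can be pooled with the small trial for the chosen Z, while a low posterior suggests residual confounding. The thesis's extension to continuous covariates and ordinal outcomes would not have this closed form and is not covered by the sketch.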

Language

English

Issue date

2023-07-19

Collection

School/Department--School of Sciences and Engineering--Department of Mathematics and Applied Mathematics--Post-graduate theses

Type of Work--Post-graduate theses

Permanent Link

https://elocus.lib.uoc.gr//dlib/8/8/b/metadata-dlib-1689319668-892667-8183.tkl
