Full metadata record
DC Field | Value | Language
dc.contributor | Department of Applied Mathematics | en_US
dc.contributor.advisor | Wong, Kin Yau (AMA) | en_US
dc.creator | Feng, Jiahui | -
dc.publisher | Hong Kong Polytechnic University | en_US
dc.rights | All rights reserved | en_US
dc.title | Association tests with incomplete covariates and high-dimensional auxiliary variables | en_US
dcterms.abstract | In many clinical and epidemiological studies, investigators are interested in testing for an association between an outcome variable and covariates of interest. Such analyses are often complicated by missing data. When variables of interest are missing for some subjects, it is desirable to use observed auxiliary variables, which are sometimes high-dimensional, to impute or predict the missing values and thereby improve statistical efficiency. Although many methods have been developed for prediction using high-dimensional variables, it is challenging to perform valid inference based on the predicted values. In this dissertation, we propose novel association testing methods involving missing data with the goal of detecting relevant predictors for outcomes of interest. | en_US
dcterms.abstract | We first focus on parametric models and develop an association test for an outcome variable and a partially missing covariate, where the missing values can be predicted using a set of high-dimensional auxiliary variables. The proposed analysis consists of a model selection step and a testing step. Specifically, in the first step, we select a subset of auxiliary variables and fit a regression model of the covariate of interest against the selected features. In the second step, we perform the score test for the covariate in the outcome model under the full likelihood, which includes both the outcome model and the missing covariate model. We then extend the proposed method to a class of semiparametric transformation models for potentially right-censored survival outcomes. We propose a supremum test, where we consider multiple choices of transformation functions, perform an individual score test under each outcome model, and take the supremum of the individual test statistics as the proposed test statistic. We show that the proposed testing procedure improves the test performance when the outcome model is unknown. | en_US
dcterms.abstract | The validity and advantages of the proposed methods are demonstrated both theoretically and numerically. We establish the asymptotic properties of the proposed test statistics under regularity conditions and show the validity of the tests under data-driven model selection procedures. We evaluate the proposed methods through extensive simulation studies and show their superior performance over some existing methods. Real data analyses are carried out on major cancer genomic studies. | en_US
dcterms.extent | ix, 139 pages : color illustrations | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.educationalLevel | All Doctorate | en_US
dcterms.LCSH | Multivariate analysis | en_US
dcterms.LCSH | Missing observations (Statistics) | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.accessRights | open access | en_US

Files in This Item:
File | Description | Size | Format
6651.pdf | For All Users | 988.61 kB | Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12195