Motivated by an integrative study of multi-omics data, we are interested in estimating the structure of the sparse cross correlation matrix of two high-dimensional random vectors. We recast the problem as a multiple testing problem and propose a new method to estimate the sparse structure of the cross correlation matrix. Specifically, we test the correlation coefficients simultaneously and threshold them while controlling the false discovery rate (FDR) at a predetermined level α. We then apply the proposed method and an alternative adaptive thresholding procedure by Cai and Liu (2016) to the integrative analysis of the protein expression data (X) and the mRNA expression data (Y) in the TCGA breast cancer cohort. By varying the FDR level α, we show that the new procedure is consistently more efficient than the alternative in estimating the sparse structure of the cross correlation matrix.
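To make the idea of thresholding by simultaneous testing concrete, the following is a minimal sketch and not the procedure proposed in this paper. It assumes a Fisher z-transformation to obtain a p-value for each cross correlation coefficient and the Benjamini-Hochberg step-up rule to control FDR at level α; both choices, and the function names `pearson` and `fdr_threshold_cross_correlation`, are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fdr_threshold_cross_correlation(x_vars, y_vars, alpha=0.05):
    """Return a p x q matrix of cross correlations, zeroed where H0 is kept.

    x_vars: list of p variables, each a list of n observations.
    y_vars: list of q variables, each a list of n observations (n > 3).
    Illustrative assumption: Fisher z p-values + Benjamini-Hochberg.
    """
    n = len(x_vars[0])
    tests = []
    for i, x in enumerate(x_vars):
        for j, y in enumerate(y_vars):
            r = pearson(x, y)
            # Keep atanh finite when |r| is numerically equal to 1.
            r_safe = max(min(r, 1 - 1e-12), -1 + 1e-12)
            # Under H0: rho = 0, sqrt(n - 3) * atanh(r) is approximately N(0, 1).
            z = math.sqrt(n - 3) * math.atanh(r_safe)
            # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
            p_value = math.erfc(abs(z) / math.sqrt(2))
            tests.append((p_value, i, j, r))

    # Benjamini-Hochberg: find the largest rank k with p_(k) <= alpha * k / m.
    tests.sort(key=lambda t: t[0])
    m = len(tests)
    k = 0
    for rank, (p_value, _, _, _) in enumerate(tests, start=1):
        if p_value <= alpha * rank / m:
            k = rank

    # Keep the k most significant correlations; set the rest to zero.
    estimate = [[0.0] * len(y_vars) for _ in x_vars]
    for _, i, j, r in tests[:k]:
        estimate[i][j] = r
    return estimate
```

The nonzero entries of the returned matrix form the estimated sparse structure. Lowering `alpha` makes the rule more conservative, so fewer entries survive.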
Yin Cao and Kwangok Seo contributed equally to this research. This research was supported by the National Research Foundation of Korea (NRF-2019R1F1A1056779).
1 Corresponding author: Department of Mathematics, Ajou University, 206 World cup-ro, Yeongtong-gu, Gyeonggi-do 16499, Korea. E-mail: shahn@ajou.ac.kr