An improved association analysis pipeline for tumor susceptibility variant in haplotype amplification area.
10.12122/j.issn.1673-4254.2020.10.16
- Author:
Yu GENG 1; Rongrong YANG 2; Jing ZHANG 1
Author Information
1. School of Health Management, Jinzhou Medical University, Jinzhou 121001, China.
2. School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China.
- Publication Type: Journal Article
- Keywords:
cancer genomics;
haplotype amplification;
rare variants;
variant association method
- MeSH:
Algorithms;
Cluster Analysis;
Gene Amplification;
Gene Frequency;
Haplotypes;
Humans;
Neoplasms/genetics*;
Polymorphism, Single Nucleotide
- From:
Journal of Southern Medical University
2020;40(10):1493-1499
- Country: China
- Language: Chinese
- Abstract:
OBJECTIVE: Haplotype amplification of germline variants is thought to indicate potential selective advantages and susceptibility to clonal expansion, and has become an important signature in the search for cancer susceptibility genes. Here we propose an improved association method that fully considers haplotype amplification status.
METHODS: The haplotype amplification status was estimated from the variant allelic frequencies. We applied a permutation test to the variant allelic frequencies to divide the candidate variants into multiple groups. A likelihood clustering method was then used to establish the neighborhood system of the hidden Markov random field framework. A filtering pipeline, consisting of a Wilson's interval filter and a false discovery rate controller, was introduced to further refine the candidate variants. The final candidate set, together with the haplotype amplification status, was collapsed into weighted virtual sites for association tests.
RESULTS: In simulation tests on a series of datasets, we compared the type Ⅰ error rates at different minor allele frequencies; they stably fell within 2%, suggesting good robustness of the algorithm. In addition, we compared the proposed method with 5 other published association approaches in terms of type Ⅰ and type Ⅱ error rates; the resulting error rates were all within 2%, demonstrating significant advantages and good statistical performance of the proposed method.
CONCLUSIONS: The proposed method can accurately identify tumor susceptibility variants in haplotype amplification areas, with good robustness and stability.
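
The two filters named in the methods, a Wilson's interval filter and a false discovery rate controller, can be illustrated with a minimal Python sketch. The abstract does not specify how either filter is parameterized, so the details here are assumptions: the Wilson score interval is computed on each variant's allelic frequency (alternate reads over read depth) and compared against a hypothetical reference value of 0.5 for an unamplified heterozygous site, and the Benjamini-Hochberg step-up procedure stands in for the false discovery rate controller. The function names, the 0.5 reference, and the sample read counts and p-values are all illustrative, not taken from the paper.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    ) / denom
    return centre - half, centre + half


def vaf_deviates(alt_reads, depth, reference=0.5, z=1.96):
    """True if the Wilson interval of the allelic frequency excludes `reference`.

    Assumption: 0.5 is used as the expected frequency of an unamplified
    heterozygous germline variant; the paper may use a different criterion.
    """
    low, high = wilson_interval(alt_reads, depth, z)
    return not (low <= reference <= high)


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level `alpha`.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= k/m * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical (alt_reads, depth) pairs for three candidate variants.
    candidates = [(50, 100), (80, 100), (12, 100)]
    kept = [c for c in candidates if vaf_deviates(*c)]
    print("variants passing the Wilson filter:", kept)

    # Hypothetical per-variant p-values from an association test.
    p_values = [0.001, 0.03, 0.035, 0.5]
    print("rejected under BH:", benjamini_hochberg(p_values))
```

Chaining the two steps in this order (interval filter first, then multiple-testing control on the survivors) mirrors the order in which the methods list them, but how the paper actually couples them to the clustering and collapsing steps is not stated in the abstract.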