1. Accelerating next generation sequencing data analysis: an evaluation of optimized best practices for Genome Analysis Toolkit algorithms
Karl R. FRANKE ; Erin L. CROWGEY
Genomics & Informatics 2020;18(1):e10-
Advancements in next generation sequencing (NGS) technologies have significantly increased the translational use of genomics data in the medical field as well as the demand for computational infrastructure capable of processing that data. To enhance the current understanding of software and hardware used to compute large scale human genomic datasets (NGS), the performance and accuracy of optimized versions of GATK algorithms, including Parabricks and Sentieon, were compared to the results of the original application (GATK v4.1.0, Intel x86 CPUs). Parabricks was able to process a 50× whole-genome sequencing library in under 3 h and Sentieon finished in under 8 h, whereas GATK v4.1.0 needed nearly 24 h. These results were achieved while maintaining greater than 99% accuracy and precision compared to stock GATK. Sentieon’s somatic pipeline achieved similar results, also greater than 99%. Additionally, the IBM POWER9 CPU performed well on bioinformatic workloads when tested with 10 different tools for alignment/mapping.
3. Strong concordance between RNA structural and single nucleotide variants identified via next generation sequencing techniques in primary pediatric leukemia and patient-derived xenograft samples
Sonali P. BARWE ; Anilkumar GOPALAKRISNAPILLAI ; Nitin MAHAJAN ; Todd E. DRULEY ; E. Anders KOLB ; Erin L. CROWGEY
Genomics & Informatics 2020;18(1):e6-
Acute leukemia is the most common pediatric malignancy and comprises diverse subtypes with varying prognoses and treatment outcomes. New and targeted treatment options are warranted for this disease. Patient-derived xenograft (PDX) models are increasingly being used for preclinical testing of novel treatment modalities. A novel approach involving targeted error-corrected RNA sequencing using the ArcherDX HemeV2 kit was employed to compare 25 primary pediatric acute leukemia samples and their corresponding PDX samples. The comparison revealed high concordance for single nucleotide variants and gene fusions, whereas other complex structural variants were not as consistent. The presence of gene fusions representing the major driver mutations at similar allelic frequencies in PDX samples compared to primary samples, and over multiple passages, confirms the utility of PDX models for preclinical drug testing. Characterization and tracking of these novel cryptic fusions and exonal variants in PDX models is critical in assessing response to potential new therapies.
5. The PNPLA3 rs738409 Variant but not MBOAT7 rs641738 is a Risk Factor for Nonalcoholic Fatty Liver Disease in Obese U.S. Children of Hispanic Ethnicity
Sana MANSOOR ; Anshu MAHESHWARI ; Matthew Di GUGLIELMO ; Katryn FURUYA ; Makala WANG ; Erin CROWGEY ; Zarela MOLLE-RIOS ; Zhaoping HE
Pediatric Gastroenterology, Hepatology & Nutrition 2021;24(5):455-469
Purpose:
The rs641738 C>T polymorphism in membrane-bound O-acyltransferase domain-containing protein 7 (MBOAT7) is implicated, along with the rs738409 C>G polymorphism in patatin-like phospholipase domain-containing protein 3 (PNPLA3), in nonalcoholic fatty liver disease (NAFLD). The association of these polymorphisms with NAFLD was investigated in Hispanic children with obesity.
Methods:
Obese children with and without NAFLD were enrolled at a pediatric tertiary care health system and genotyped for MBOAT7 rs641738 C>T and PNPLA3 rs738409 C>G. NAFLD was characterized by the ultrasonographic presence of hepatic steatosis along with persistently elevated liver enzymes. Genetic variants and demographic and biochemical data were analyzed for the effects on NAFLD.
Results:
Among 126 enrolled subjects, 84 in the case group had NAFLD and 42 in the control group did not. The two groups had similar demographic distributions. NAFLD was associated with abnormal liver enzymes and elevated triglycerides and cholesterol (p<0.05). Children with NAFLD had a higher percentage of the PNPLA3 GG genotype (70.2% versus 31.0% in non-NAFLD) and a lower percentage of the MBOAT7 TT genotype (4.8% versus 16.7% in non-NAFLD) (p<0.05). PNPLA3 rs738409 C>G had an additive effect in NAFLD; however, MBOAT7 rs641738 C>T had no effect alone or synergistically with the PNPLA3 polymorphism. NAFLD risk increased 3.7-fold in subjects carrying the PNPLA3 GG genotype and decreased in those carrying the MBOAT7 TT genotype.
Conclusion:
In Hispanic children with obesity, the PNPLA3 rs738409 C>G polymorphism increased the risk for NAFLD. The role of the MBOAT7 rs641738 variant in NAFLD is less evident.
7. A Systems Biology Approach for Studying Heterotopic Ossification: Proteomic Analysis of Clinical Serum and Tissue Samples.
Erin L CROWGEY ; Jennifer T WYFFELS ; Patrick M OSBORN ; Thomas T WOOD ; Laura E EDSBERG
Genomics, Proteomics & Bioinformatics 2018;16(3):212-220
Heterotopic ossification (HO) refers to the abnormal formation of bone in soft tissue. Although some of the underlying processes of HO have been described, there are currently no clinical tests using validated biomarkers for predicting HO formation. As such, the diagnosis is made radiographically after HO has formed. To identify potential and novel biomarkers for HO, we used isobaric tags for relative and absolute quantitation (iTRAQ) and high-throughput antibody arrays to produce a semi-quantitative proteomics survey of serum and tissue from subjects with (HO+) and without (HO-) heterotopic ossification. The resulting data were then analyzed using a systems biology approach. We found that serum samples from subjects experiencing traumatic injuries with resulting HO have a different proteomic expression profile compared to those from the matched controls. Subsequent quantitative ELISA identified five blood serum proteins that were differentially regulated between the HO+ and HO- groups. Compared to HO- samples, the amount of insulin-like growth factor I (IGF1) was up-regulated in HO+ samples, whereas a lower amount of osteopontin (OPN), myeloperoxidase (MPO), runt-related transcription factor 2 (RUNX2), and growth differentiation factor 2 or bone morphogenetic protein 9 (BMP-9) was found in HO+ samples (Welch two sample t-test; P < 0.05). These proteins, in combination with potential serum biomarkers previously reported, are key candidates for a serum diagnostic panel that may enable early detection of HO prior to radiographic and clinical manifestations.
Adult; Aged; Aged, 80 and over; Biomarkers; metabolism; Case-Control Studies; Female; Humans; Male; Middle Aged; Ossification, Heterotopic; blood; diagnosis; metabolism; Proteome; analysis; Proteomics; methods; Systems Biology; methods; Young Adult