INTEGRATIVE ‘OMIC’ APPROACH TOWARDS UNDERSTANDING THE NATURE OF HUMAN DISEASES
Peterlin B*, Maver A
*Corresponding Author: Professor Borut Peterlin, M.D., Ph.D., Institute of Medical Genetics, Department of Gynecology and Obstetrics, University Medical Centre Ljubljana, 3, Šlajmerjeva Street, Ljubljana 1000, Slovenia; Tel./Fax: +386(0)1-540-1137; E-mail: borut.peterlin@guest.arnes.si
page: 45

MATERIALS AND METHODS

The route to constructing the initial database used for subsequent integration depends strongly on the disease of interest. While some common disorders have been examined in numerous ‘omic’ studies covering several biological layers, source data for other diseases may be scarce. The search for data sources should begin with a review of the literature published to date. Once the investigator is familiar with the studies performed, the published reports and the tables in their supplemental materials may be used to extract lists of genes or other genomic features with significant alterations. A crucial step in obtaining high-quality data is the inspection of data sets deposited in public repositories, which tend to be highly specialized for the biological layer under investigation. Data from genome-wide association studies may be extracted from dbGaP (http://www.ncbi.nlm.nih.gov/gap); epigenomic, transcriptomic and methylomic data from ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) or Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/); and next generation sequencing data from the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) and the Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) [4-7]. After all sources have been examined, a database of features [genes, mRNAs, microRNAs (miRNAs), CpG islands, proteins and others] with significant alterations in the chosen disease should be prepared for each included study. We also advise collecting information, such as significance values and fold change values, on which the prioritization of features for each biological layer will be based in later steps. If this information is not available, all significant alterations in a given study will carry the same importance in the integration.
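As a rough illustration of the per-study collection described above, the following Python sketch stores each significant alteration as a record and derives a weight for later prioritization. The `Signal` schema, the field names and the use of -log10(p) as a weight are assumptions made for this example only; the text does not prescribe a particular data structure or weighting.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    """One significant alteration reported by a single study (hypothetical schema)."""
    study: str                           # identifier of the source study
    layer: str                           # biological layer, e.g. "transcriptome"
    feature: str                         # gene, mRNA, miRNA, CpG island, protein ...
    p_value: Optional[float] = None      # significance value, if reported
    fold_change: Optional[float] = None  # fold change, if reported


def signal_weight(signal: Signal) -> float:
    """Return a weight used to prioritize signals within one study.

    Assumption: -log10(p) is used when a p value is available. When it is
    missing, the signal receives the constant weight 1.0, so that all
    alterations of such a study carry the same importance.
    """
    if signal.p_value is None:
        return 1.0
    # Guard against p == 0, which some tools report for very small values.
    return -math.log10(max(signal.p_value, 1e-300))
```

Fold change could be used in the same way where no significance value is reported; that choice is left open here.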
In the following section, significant results from the various study types are collectively referred to as “signals” for clarity.
Data Integration. Before data can be integrated, they must be reduced to a common denominator. As genetic information grows more heterogeneous, tying biological information to gene-level annotation becomes increasingly difficult. Genomic variation and methylation patterns are two examples of information that is prohibitively difficult to associate with genes in any straightforward manner, as such alterations occur within genes, between genes or across several genes. For this reason, we opted for an integration based on the genomic position of features originating from the various data sources. This requires the signals from all databases to be converted to their genomic positions and projected onto the genome assembly backbone. This step removes the need for difficult annotation conversions before the final integration and greatly simplifies the synthesis of heterogeneous data. After the signals have been positioned on the genomic backbone, the complete assembly is divided into bins of equal size. For each study, each bin is assigned a score that depends on the scores of the alterations residing in that segment of the genome. The bins are then ordered by score within each study and their ranks determined. Integration is achieved by calculating the non-parametric rank product for each bin from the ranks that the bin attained in each data source, as we have previously described [8]. A lower final rank product signifies that the bin attained high ranks on several separate biological layers [9]. Such bins therefore represent genomic regions where signals accumulate on various biological levels, and are thus regions of interest for further investigation.
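The binning and rank product steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in [8]: the bin width `BIN_SIZE`, the summation of signal scores within a bin, the average-rank treatment of ties and the use of the geometric mean of ranks (which orders bins identically to the raw product) are all choices made for the example. For brevity, only bins that contain a signal in at least one study are ranked, and a bin without signals in a given study is scored 0 for that study.

```python
from collections import defaultdict
from math import prod

BIN_SIZE = 1_000_000  # assumed bin width in base pairs


def bin_scores(signals, bin_size=BIN_SIZE):
    """Sum signal scores per (chromosome, bin index) for one study.

    `signals` is an iterable of (chromosome, position, score) tuples.
    """
    scores = defaultdict(float)
    for chrom, pos, score in signals:
        scores[(chrom, pos // bin_size)] += score
    return dict(scores)


def rank_bins(scores, all_bins):
    """Rank bins by descending score (rank 1 = highest score).

    Bins with equal scores share the average of the ranks they span.
    """
    ordered = sorted(all_bins, key=lambda b: -scores.get(b, 0.0))
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and scores.get(ordered[j], 0.0) == scores.get(ordered[i], 0.0):
            j += 1
        avg = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        for b in ordered[i:j]:
            ranks[b] = avg
        i = j
    return ranks


def rank_product(per_study_scores):
    """Combine per-study bin scores into one rank product per bin.

    `per_study_scores` is a list of dicts as returned by `bin_scores`.
    A lower value means the bin ranked highly in several studies.
    """
    all_bins = sorted(set().union(*per_study_scores))
    rank_tables = [rank_bins(s, all_bins) for s in per_study_scores]
    k = len(rank_tables)
    return {b: prod(t[b] for t in rank_tables) ** (1 / k) for b in all_bins}
```

A bin that holds the top score in every study obtains a rank product of 1, whereas a bin that ranks highly in only one study is pulled down by its poor ranks elsewhere.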
Finally, a permutation test may be employed to determine the significance of signal accumulation in each bin [8]. A detailed overview of the process is shown in Figure 1.
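One generic way to set up such a permutation test is sketched below in Python. It is an assumption-laden illustration rather than the procedure of [8]: ranks are shuffled independently within each study to break any positional agreement between studies, the null distribution is pooled over all bins and permutations, and the function name `permutation_p_values` and its parameters are hypothetical.

```python
import random
from bisect import bisect_right
from math import prod


def permutation_p_values(rank_tables, n_perm=1000, seed=0):
    """Estimate a p value for the rank product of each bin.

    `rank_tables[s][b]` is the rank of bin b in study s; all inner lists
    must have the same length. The p value of a bin is the fraction of
    permuted rank products that are at least as small as the observed one.
    """
    rng = random.Random(seed)
    n_bins = len(rank_tables[0])
    observed = [prod(t[b] for t in rank_tables) for b in range(n_bins)]
    at_least_as_small = [0] * n_bins
    for _ in range(n_perm):
        # Shuffle each study independently (rng.sample returns a shuffled copy).
        shuffled = [rng.sample(t, len(t)) for t in rank_tables]
        null = sorted(prod(t[b] for t in shuffled) for b in range(n_bins))
        for b in range(n_bins):
            at_least_as_small[b] += bisect_right(null, observed[b])
    total = n_perm * n_bins
    # Add-one correction keeps the estimate strictly above zero.
    return [(c + 1) / (total + 1) for c in at_least_as_small]
```

With many bins, the resulting p values would normally still need correction for multiple testing; that step is outside the scope of this sketch.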




 

 


 Copyright © Balkan Journal of Medical Genetics 2006