
INTEGRATIVE ‘OMIC’ APPROACH TOWARDS UNDERSTANDING THE NATURE OF HUMAN DISEASES
Peterlin B*, Maver A
*Corresponding Author: Professor Borut Peterlin, M.D., Ph.D., Institute of Medical Genetics, Department
of Gynecology and Obstetrics, University Medical Centre Ljubljana, 3, Šlajmerjeva Street, Ljubljana 1000,
Slovenia; Tel./Fax: +386(0)1-540-1137; E-mail: borut.peterlin@guest.arnes.si
page: 45
MATERIALS AND METHODS
The route to constructing the initial database used for
subsequent integration depends heavily on the disease of
interest. While some common disorders have been examined
in several ‘omic’ studies covering multiple biological
levels, source data for other diseases may be scarce.
The search for data sources should begin with an overview
of the literature published to date. Once the investigator
is familiar with the studies performed, the published
reports and the tables in their supplemental materials may
be used to extract lists of genes or other genomic
features with significant alterations.
A crucial step in obtaining high-quality data sources is
inspection of the datasets stored in public data
repositories. These tend to be highly specialized for the
biological layer under investigation. Genomic data from
genome-wide association studies may be extracted from
dbGaP (http://www.ncbi.nlm.nih.gov/gap); epigenomic,
transcriptomic and methylomic data from ArrayExpress
(http://www.ebi.ac.uk/arrayexpress/) or Gene Expression
Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/); and next
generation sequencing data from the European Nucleotide
Archive (ENA; http://www.ebi.ac.uk/ena/) and the Sequence
Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) [4-7].
After all the sources have been investigated, a database
of features [genes, mRNAs, microRNAs (miRNAs), CpG islands,
proteins and others] with significant alterations in the
chosen disease should be compiled for each included study.
We also advise collecting information such as significance
values and fold change values, on which the prioritization
of features for each biological layer will be based in
later steps. If this information is not available, all
significant alterations in a given study will carry the
same importance in integration. In the following section,
significant results from the various study types are
collectively referred to as “signals” for clarity.
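The per-study feature database described above can be pictured as a list of simple records. The following is a minimal sketch only: the `Signal` fields and the `signal_weight` function are hypothetical names introduced for illustration, and using -log10(p) as the prioritization weight is an assumption, not a scheme stated in the text. The one behaviour taken from the text is that signals lacking significance information all receive the same weight.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    """One significant alteration reported by one study (hypothetical schema)."""
    study: str
    layer: str                          # e.g. "transcriptome", "methylome"
    feature: str                        # gene symbol, miRNA name, CpG island ID, ...
    p_value: Optional[float] = None     # significance value, if reported
    fold_change: Optional[float] = None  # fold change, if reported


def signal_weight(signal: Signal) -> float:
    """Weight used to prioritize signals within a single study.

    Assumption: -log10(p) when a p-value is available. Without one, every
    signal gets the same weight (1.0), so all alterations count equally.
    """
    if signal.p_value is None:
        return 1.0
    # Guard against p == 0, for which the logarithm is undefined.
    return -math.log10(max(signal.p_value, 1e-300))
```

Because ranking is later performed separately within each study, weights only need to be comparable among signals of the same study.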
Data Integration. Before data can be integrated, they have
to be reduced to a common denominator. Owing to the
increasing heterogeneity of genetic information, tying
biological information to gene-level annotation is becoming
increasingly difficult. Genomic variation and methylation
patterns are two examples of information that is
prohibitively difficult to associate with genes in any
straightforward manner, as such alterations occur within
genes, between genes or across several genes. For this
reason, we opted for an integration based on the genomic
position of features originating from the various data
sources. This requires the signals from all databases to be
converted to their genomic positions and projected onto the
genome assembly backbone. This step makes it possible to
omit the difficult annotation conversion steps that would
otherwise be required before final integration, greatly
simplifying the synthesis of heterogeneous data.
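Projection onto the genomic backbone amounts to replacing each feature name with its coordinates. The sketch below assumes a pre-built lookup table; `FEATURE_COORDS`, `Locus` and `project` are hypothetical names, and the coordinates are invented placeholders rather than real annotation. In practice the table would be derived from an annotation resource matching the chosen genome assembly.

```python
from typing import Dict, List, NamedTuple, Tuple


class Locus(NamedTuple):
    chrom: str
    start: int
    end: int


# Placeholder coordinates for illustration only; not real annotation.
FEATURE_COORDS: Dict[str, Locus] = {
    "GENE_A": Locus("chr1", 1_200_000, 1_250_000),
    "MIR_B": Locus("chr1", 1_230_000, 1_230_080),
    "CPG_C": Locus("chr2", 500_000, 501_000),
}


def project(features: List[str]) -> Tuple[List[Locus], List[str]]:
    """Return loci for features with known coordinates, plus the names that could not be placed."""
    placed: List[Locus] = []
    unplaced: List[str] = []
    for name in features:
        locus = FEATURE_COORDS.get(name)
        if locus is None:
            unplaced.append(name)
        else:
            placed.append(locus)
    return placed, unplaced
```

Returning the unplaced names separately makes it easy to see how many signals are lost at this step, whatever their biological layer.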
After signals are positioned on the genomic backbone, the
complete assembly is divided into bins of equal size. For
each study, each bin is assigned a score that depends on
the scores of the alterations residing in that segment of
the genome. The bins are then ordered by score within each
study and their ranks calculated. Integration is achieved
by calculating the non-parametric rank product for each bin
from the ranks it attained in each data source, as we have
previously described [8]. A lower final rank product
signifies that a bin attained higher ranks on several
separate biological layers [9]. These bins therefore
represent genomic regions where signals accumulate on
various biological levels, and thus regions of interest for
further investigation. Finally, a permutation test may be
employed to determine the significance of signal
accumulation in each bin [8]. A detailed overview of the
process is shown in Figure 1.
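The binning, ranking, rank product and permutation steps can be sketched as follows. This is an illustration under stated assumptions, not the implementation described in [8]: a bin is scored by the highest non-negative signal score it contains, tied bins share an average rank, the rank product is normalized as a geometric mean, and the permutation test shuffles ranks across bins within each study. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Bin = Tuple[str, int]             # (chromosome, bin index)
Signal = Tuple[str, int, float]   # (chromosome, position, non-negative score)


def bin_scores(signals: List[Signal], bin_size: int) -> Dict[Bin, float]:
    """Score each bin; assumption: a bin keeps the highest score among its signals."""
    scores: Dict[Bin, float] = defaultdict(float)
    for chrom, pos, score in signals:
        key = (chrom, pos // bin_size)
        scores[key] = max(scores[key], score)
    return dict(scores)


def rank_bins(scores: Dict[Bin, float], all_bins: List[Bin]) -> Dict[Bin, float]:
    """Rank 1 is the highest score; bins without signals score 0; ties share the average rank."""
    ordered = sorted(all_bins, key=lambda b: -scores.get(b, 0.0))
    ranks: Dict[Bin, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and scores.get(ordered[j], 0.0) == scores.get(ordered[i], 0.0):
            j += 1
        average = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for b in ordered[i:j]:
            ranks[b] = average
        i = j
    return ranks


def rank_product(rank_tables: List[Dict[Bin, float]], all_bins: List[Bin]) -> Dict[Bin, float]:
    """Geometric mean of a bin's ranks across studies; lower means consistently high-ranking."""
    k = len(rank_tables)
    return {b: math.exp(sum(math.log(t[b]) for t in rank_tables) / k) for b in all_bins}


def permutation_p(rank_tables: List[Dict[Bin, float]], all_bins: List[Bin],
                  n_perm: int = 1000, seed: int = 0) -> Dict[Bin, float]:
    """Empirical p-value per bin: share of permuted rank products at or below the observed one."""
    rng = random.Random(seed)
    observed = rank_product(rank_tables, all_bins)
    hits = {b: 0 for b in all_bins}
    for _ in range(n_perm):
        shuffled = []
        for table in rank_tables:
            values = [table[b] for b in all_bins]
            rng.shuffle(values)  # break the link between bins and ranks in this study
            shuffled.append(dict(zip(all_bins, values)))
        permuted = rank_product(shuffled, all_bins)
        for b in all_bins:
            if permuted[b] <= observed[b] + 1e-12:
                hits[b] += 1
    return {b: (hits[b] + 1) / (n_perm + 1) for b in all_bins}


# Toy example: two studies, invented positions and scores.
studies = [
    [("chr1", 150, 5.0), ("chr1", 1200, 2.0)],
    [("chr1", 180, 4.0), ("chr2", 50, 1.0)],
]
scored = [bin_scores(s, bin_size=1000) for s in studies]
all_bins = sorted({b for s in scored for b in s})
ranks = [rank_bins(s, all_bins) for s in scored]
rp = rank_product(ranks, all_bins)
best = min(rp, key=rp.get)  # bin with the lowest rank product
```

In the toy data, only the first bin of chr1 carries a top-ranked signal in both studies, so it receives the lowest rank product. In a real analysis `all_bins` would cover the whole assembly rather than only the bins that contain signals, and the bin size would need to be chosen to suit the resolution of the input data.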