- Contributed Components
- Installation Instructions
- User Interface Documentation:
- Use Case Workflow
- Contact Information
Download the complete package as a zip file:
dbi_tju_feb_23.zip
Contributed Components:
- Promoter Analysis and Interactive Network Toolset (PAINT):
Highly parallel gene expression analysis has led to analysis of gene regulation, in particular
co-regulation, at a system level. PAINT was developed to provide the biologist a computational tool
to integrate functional genomics data, for example from microarray-based gene expression analysis,
with genomic sequence data to carry out transcriptional regulatory network analysis, TRNA. TRNA
combines bioinformatics, used to identify and analyze gene regulatory regions, and statistical
significance testing, used to rank the likelihood of the involvement of individual transcription
factors, with visualization tools to identify transcription factors likely to play a role in the
biology under study. In addition this tool can output results in several different formats for use
with modeling and simulation tools.
The project is conceived and implemented as an automated, modular, scalable, extensible,
integrative framework of software tools. PAINT's modular architecture, in combination with BioSPICE's
Dashboard and OAA framework, allows a biologist to take advantage of existing as well as future
BioSPICE Agents.
The current Dashboard release of PAINT consists of the following modules:
- PAINT Feasnetbuilder:
Takes an annotated gene list in SBML2 format and produces an output of genes and Transcriptional
Regulatory Elements interaction.
- Feasnet Adapter:
Converts genes and TREs interaction in SBML2 format to Feasnet Object that can be used by
PAINT FeasnetViewer.
- PAINT FeasnetViewer:
Takes a Feasnet Object of interest, and optionally a reference Feasnet to analyze the relative
significance of TRE occurrence in the gene list of interest against the given reference. The module
has a visualization component, and an output to the PtPlot module.
For more information, refer to
this OMICS publication. PAINT is also available
online.
- CloneUpdater:
Gene annotation is one of the most important data classes used in the postgenomic era. CloneUpdater
provides the biologist an easy to use path to the most "up to date" annotation of genes and associated
reagents such as EST clones. It is compatible with many different types of identifiers and provides a
large number of options for updating preexisting annotations or adding new annotations to user-provided
identifiers from many different databases including UniGene, LocusLink, RefSeq, and more. Processing more
than 50 identifiers per second, CloneUpdater is fast enough to be used for one gene identifier or tens
of thousands. In addition, CloneUpdater has the important ability to find identifiers that represent
the same gene and thereby highlight redundancies in large reagent collections.
In the context of analyzing gene regulatory networks, CloneUpdater can be used as a preprocessor
for PAINT: it can eliminate redundant clones, and update annotations that PAINT can process.
This Dashboard release of CloneUpdater features a GUI that offers the same features as the online version. More information about CloneUpdater is
available at the same URL.
- Metaclustering Toolbox:
A large diversity of clustering algorithms are available for data analysis, all of which generate
results which are specific to the algorithm used or even to the individual iteration. MetaCluster was
developed to help biologists mine several different clustering results for those data relationships
that are insensitive to the clustering methods used. Metaclustering provides a computational tool which
co-analyses diverse clustering results to highlight the relationships that are stable across algorithms.
These method-independent co-clustering results provide the strongest evidence for biological significance.
For more information on the MetaCluster algorithm, please refer to
this publication (PDF).
The MetaCluster Toolbox included in this Dashboard contribution
features a GUI that takes Timeseries as input (such as microarray expression data), allows the user to
cluster the data interactively, and produces a SBML2 output of the clustering results.
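To make the idea of method-independent co-clustering concrete, here is a minimal sketch. It is not the MetaCluster algorithm itself (see the publication for that); it only illustrates one generic way to find pairs of genes that stay together across clustering results. The gene names, cluster labels, and the function names `co_association` and `stable_pairs` are hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster.

    Each labeling is a dict mapping item -> cluster label. All labelings
    are assumed to cover the same items.
    """
    items = sorted(labelings[0])
    scores = {}
    for a, b in combinations(items, 2):
        together = sum(1 for labels in labelings if labels[a] == labels[b])
        scores[(a, b)] = together / len(labelings)
    return scores


def stable_pairs(labelings, threshold=1.0):
    """Pairs co-clustered in at least `threshold` of the clusterings."""
    scores = co_association(labelings)
    return sorted(pair for pair, score in scores.items() if score >= threshold)


# Two hypothetical clustering results for four genes.
kmeans = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
complete_linkage = {"g1": 2, "g2": 2, "g3": 2, "g4": 0}

print(stable_pairs([kmeans, complete_linkage]))
```

Cluster labels are compared only within a single result, so the two algorithms do not need to agree on label numbering.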
[ Table of Contents ]
Installation instructions:
- Download the zip file dbi_release_jan29_04.zip, which contains
all the NetBeans modules, sample input/output files and the use case documentation,
and unzip it in a local directory.
- From within BioSPICE, using Tools -> Update Center, choose "Install Manually Downloaded
Modules", as shown below:

- Click on Next, and choose the downloaded and unzipped NetBeans modules:

- Continue with the installation process following the onscreen prompts.
[ Table of Contents ]
User Interface Documentation:
In this section, a brief UI documentation is provided; for more detailed information about the
algorithms please refer to the URL/publication of each of the modules given in the
Contributed Components section.
The following User Interface documentation was prepared by running a real data set of biological interest
through the Dashboard workflow. The starting point is the cDNA microarray expression data consisting of
309 differentially expressed genes from Transcriptome Analysis
during Hemoglobinization of Human Erythroleukemia (K562) cells, at six timepoints. From this dataset,
a subset of 30 genes were considered for further analysis
presented in this demonstration.
The biological question we examine is "What are the TREs that characterize the 30 genes subset
relative to the 309 differentially expressed genes?"
- Metaclustering Toolbox:
Input: Microarray expression data (Timeseries format).
Output: Cluster membership for each identifier (SBML2).
On the Dashboard, Metaclustering Toolbox starts up the GUI as shown in Fig. 1(a), with the
data obtained from the workflow in the Timeseries format. Currently, the Metaclustering Toolbox
implements the clustering algorithms as shown in the Fig. 1(a).
The user can choose any of the clustering algorithms, and specify the number of clusters expected.
Each choice of clustering opens a new tabbed pane of clustering results as shown in the
Fig. 2(a). Using the metaclustering algorithm requires at least two clustering results. If the user
selects Meta-clustering from the Clustering Tools menu, the Metaclustering Options
dialog box is presented as shown in the Fig. 2(a). This way, the user can continue with
clustering the data with different algorithms/options until a satisfactory result is obtained.
After a reasonable clustering result is obtained, the user may choose to quit the Metaclustering Toolbox
and pass on the clustering results to the next component in the workflow. To do this, the specific
tabbed pane in which desired results are present has to be selected (for example, C_4 in the
Fig. 1(c)). At this point, choosing Output to SBML2 and Quit option from the File menu
quits the Metaclustering Toolbox, and sends the clustering results out. Please note that closing the
Metaclustering Toolbox window forcibly does not send out the clustering results, and may result in
abnormal termination of the workflow.
The cluster information is stored in the <annotation> node of the
<species>, where species is the gene identifier from the Timeseries data.
Within the annotation, the value attribute in <dbi:user-def
name="metacluster"> indicates the cluster membership. Please
refer to the sample output file below for specific details.
Sample input file: demo_30_metacluster_in.timeseries
Sample output file: demo_30_metacluster_out.xml
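As a rough illustration of the annotation layout described above, the following sketch reads the cluster membership back out of an SBML2 document. The embedded sample is hypothetical and is not the content of demo_30_metacluster_out.xml. In particular, the `dbi` namespace URI and the use of the species `id` attribute as the gene identifier are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical SBML2 fragment. The "dbi" namespace URI and the use of the
# species "id" attribute for the gene identifier are assumptions.
SAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2"
      xmlns:dbi="http://example.org/dbi" level="2" version="1">
  <model id="demo">
    <listOfSpecies>
      <species id="gene_A" compartment="cell">
        <annotation><dbi:user-def name="metacluster" value="1"/></annotation>
      </species>
      <species id="gene_B" compartment="cell">
        <annotation><dbi:user-def name="metacluster" value="2"/></annotation>
      </species>
    </listOfSpecies>
  </model>
</sbml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def cluster_membership(xml_text):
    """Map each species id to the value of its 'metacluster' user-def entry."""
    membership = {}
    for species in ET.fromstring(xml_text).iter():
        if local_name(species.tag) != "species":
            continue
        for node in species.iter():
            if local_name(node.tag) == "user-def" and node.get("name") == "metacluster":
                membership[species.get("id")] = node.get("value")
    return membership


print(cluster_membership(SAMPLE))
```

Matching on local tag names keeps the sketch independent of the actual namespace URI used in the real output file.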
Fig. 1(a) | Fig. 1(b) | Fig. 1(c)
[ Table of Contents ]
- Clone Updater:
Input: Gene list, with optional annotation (SBML2).
Output: Gene list, with new/updated annotation (SBML2).
CloneUpdater starts up with the GUI as shown in Fig. 2(a). By this point, the SBML2 input
is parsed for existing annotation, if any. For this demonstration, we have only one annotation
item: metacluster, associated with each clone. Since this is not a standard UniGene
annotation element, we keep the default value, no change. For more information
on using CloneUpdater please refer to the online documentation at the
CloneUpdater page.
The next step is to choose the organism to which the clones belong, using the Choose an
Organism drop down list in the main window (Fig. 2(a)). At this point, the user is
presented with the option of fetching more annotation information (from the UniGene database) for the
selected clones, using the Add New Headers button. The user is expected to enter the number
of desired new annotation elements in the box provided, and then click Add New Headers.
In this example, we chose 3 new headers, as shown in the Fig. 2(b). The drop down
list in the Define New Headers dialog box lets the user choose desired annotation information
for each of the clones in the list.
After the desired annotation is chosen, clicking on the Continue button submits the request
for processing. As shown in Fig. 2(c), it takes about 3 seconds to process a list of 30 clones.
A drop down list lets the user select All Clones, Distinct Clones (default), Redundant
Clones or Clones not in UniGene to be sent to the next component in the Dashboard workflow.
Clicking the Output to SBML2 and Quit button quits CloneUpdater and returns control to
the Dashboard.
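The output categories can be pictured with a small sketch. This is only an assumed interpretation of the options (clones that map to the same gene cluster are treated as redundant, and the first one seen is kept as the distinct representative); it is not CloneUpdater's actual logic. The clone identifiers, cluster identifiers, and the function name `partition_clones` are hypothetical.

```python
def partition_clones(clone_to_cluster):
    """Split clones into distinct, redundant, and unmapped lists.

    `clone_to_cluster` maps a clone identifier to a gene cluster
    identifier, or to None when no cluster was found for the clone.
    """
    seen_clusters = set()
    distinct, redundant, unmapped = [], [], []
    for clone, cluster in clone_to_cluster.items():
        if cluster is None:
            unmapped.append(clone)       # analogous to "Clones not in UniGene"
        elif cluster in seen_clusters:
            redundant.append(clone)      # another clone already represents this gene
        else:
            seen_clusters.add(cluster)
            distinct.append(clone)       # first clone seen for this gene
    return distinct, redundant, unmapped


# Hypothetical identifiers, for illustration only.
clones = {"clone_1": "Hs.1", "clone_2": "Hs.1", "clone_3": "Hs.2", "clone_4": None}
distinct, redundant, unmapped = partition_clones(clones)
print(distinct, redundant, unmapped)
```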
In the output file, for each of the <species>, <annotation> tags are
added or updated as specified. For more details on the format, please refer
to the sample output file below.
Sample input file: demo_30_metacluster_out.xml
Sample output file: demo_30_cloneupdater_out.xml
Fig. 2(a) | Fig. 2(b) | Fig. 2(c)
[ Table of Contents ]
- PAINT FeasnetBuilder:
Input: Annotated gene list (SBML2).
Output: Gene-TF interaction data (SBML2).
PAINT Feasnet Builder starts up with the GUI as shown in the Fig. 3(a), with the clone list
received from the workflow.
PAINT currently maintains a promoter database for Mouse, Human and Rat.
In the Organism select box, the user is expected to choose the organism to which the
input clones belong.
This module uses TRANSFAC Professional by
Cognia Corporation to find known TREs. The user is expected to
be registered with Cognia Corporation to be able to use the BIOBASE match. However, we can
provide a limited time authorization to use this program on our server upon e-mail request. Please note
that this is intended for demo purposes only.
For our example, we chose an Upstream Length of 5000 basepairs and a
Core similarity threshold of 0.9. Clicking on the Send Request button posts
the request to the web based PAINT. This process takes about 30-40 seconds to complete. The
Output to SBML2 and Quit button is activated as soon as the processing is complete
(Fig. 3(b)). Clicking this button closes PAINT Feasnet Builder and sends the gene-Transcription
Factor data to the downstream module on the Dashboard.
In the output, each TRE on a gene is represented as a <reaction>,
with Transcription Factor in the <listOfReactants>, and gene
identifier in the <listOfProducts>. For more details, please
refer to the sample output below.
Sample input file: demo_30_cloneupdater_out.xml
Sample output file: demo_30_hs_core0.9_5000bp_nocomp.xml
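The reaction-based encoding described above can be read back into a list of (transcription factor, gene) pairs. The sketch below assumes standard SBML Level 2 `speciesReference` elements inside `listOfReactants` and `listOfProducts`. The embedded sample and its identifiers are hypothetical and are not taken from the sample output file.

```python
import xml.etree.ElementTree as ET

# Hypothetical SBML2 fragment: each reaction links one TF to one gene.
SAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="demo">
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants><speciesReference species="TF_X"/></listOfReactants>
        <listOfProducts><speciesReference species="gene_A"/></listOfProducts>
      </reaction>
      <reaction id="r2">
        <listOfReactants><speciesReference species="TF_X"/></listOfReactants>
        <listOfProducts><speciesReference species="gene_B"/></listOfProducts>
      </reaction>
      <reaction id="r3">
        <listOfReactants><speciesReference species="TF_Y"/></listOfReactants>
        <listOfProducts><speciesReference species="gene_A"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def gene_tf_pairs(xml_text):
    """Return (tf, gene) tuples, one per reactant/product combination."""
    pairs = []
    for reaction in ET.fromstring(xml_text).iter():
        if local_name(reaction.tag) != "reaction":
            continue
        tfs, genes = [], []
        for child in reaction:
            if local_name(child.tag) == "listOfReactants":
                tfs = [ref.get("species") for ref in child]
            elif local_name(child.tag) == "listOfProducts":
                genes = [ref.get("species") for ref in child]
        pairs.extend((tf, gene) for tf in tfs for gene in genes)
    return pairs


print(gene_tf_pairs(SAMPLE))
```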
Fig. 3(a) | Fig. 3(b)
[ Table of Contents ]
- Feasnet Adapter:
Input: Gene-TF interaction data (SBML2).
Output: Gene-TF interaction data (Feasnet Object).
Feasnet Adapter module converts gene-TF interaction in SBML2 format to Feasnet format that
the Feasnet Viewer module can use. This operation doesn't require any user interaction.
[ Table of Contents ]
- PAINT Feasnet Viewer:
Input: Gene-TF interaction data of gene list (Feasnet),
Gene-TF interaction data of reference (Feasnet) (optional),
clustering information of gene list (SBML2) (optional).
Output: TRE over-representation/under-representation, relative
to the specified reference (PlotML).
The Feasnet Viewer module is the analysis and visualization component of
PAINT. Currently, the module can:
- Analyze the significance of the TREs present - TRE over/under representation -
for each cluster in the gene list, relative to a reference (such as all
the genes in a microarray experiment, or entire genome).
- Display the Gene-TF interaction as a "matrix".
- Export the five most significantly over-represented or most significantly
under-represented TREs across all the clusters.
Feasnet Viewer starts up with the GUI as shown in the Fig. 5(a).
Clicking the View Feasnet image button computes analytical p-values
for each TRE occurrence on the given genes relative to the specified reference,
and displays the image in a tabbed pane as shown in the Fig. 5(a).
For these significance values to be meaningful, note that the intended reference
gene list should also use the same parameters for finding TRE occurrence as the
desired gene list. For example, in the sample files below, both the reference
(demo_309_hs_core0.9_5000bp_nocomp.xml) and the demo_30_hs_core0.9_5000bp_nocomp.xml
use the same Upstream Length (5000) and Core Threshold (0.9) parameters.
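This page does not say which statistical test the Feasnet Viewer uses. As an illustration only, a hypergeometric tail probability is one common way to ask whether a TRE occurs in a gene list more or less often than expected from a reference list. The function names and all counts below are hypothetical.

```python
from math import comb


def over_representation_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k TRE-bearing genes.

    N: genes in the reference, K: reference genes carrying the TRE,
    n: genes in the list of interest, k: list genes carrying the TRE.
    """
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def under_representation_p(k, n, K, N):
    """P(X <= k): chance of seeing at most k TRE-bearing genes."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(0, min(k, n, K) + 1))
    return hits / total


# Hypothetical counts: a 30-gene list drawn from a 309-gene reference,
# where 40 reference genes and 8 list genes carry the TRE.
p_over = over_representation_p(k=8, n=30, K=40, N=309)
p_under = under_representation_p(k=8, n=30, K=40, N=309)
print(p_over, p_under)
```

Both counts must come from promoter scans run with the same parameters, which is why the target and reference files need matching Upstream Length and Core Threshold settings.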
In the Feasnet image, each "dot" indicates the presence of the specific TRE (along columns)
on the promoter of the given gene (along rows). Over representation is indicated by different
shades of red (the brighter the red color, the greater the TRE is over represented), and under
representation by cyan. The gray shades mean that the TRE occurrence in this gene cluster is not
significantly different from a randomly picked gene cluster.
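The layout of the Feasnet image (genes along rows, TREs along columns, a dot where a TRE is present) can be mimicked in plain text. This is a sketch of the data structure only and omits the significance coloring. The pair list and the function names are hypothetical.

```python
def presence_matrix(pairs):
    """Build a gene-by-TRE boolean matrix from (tre, gene) pairs."""
    genes = sorted({gene for _, gene in pairs})
    tres = sorted({tre for tre, _ in pairs})
    present = set(pairs)
    rows = [[(tre, gene) in present for tre in tres] for gene in genes]
    return genes, tres, rows


def render(genes, rows):
    """Draw one text row per gene: 'o' where the TRE is present, '.' elsewhere."""
    lines = []
    for gene, row in zip(genes, rows):
        lines.append(gene.ljust(8) + " ".join("o" if hit else "." for hit in row))
    return "\n".join(lines)


# Hypothetical gene-TRE interactions.
pairs = [("TRE_1", "gene_A"), ("TRE_2", "gene_A"), ("TRE_2", "gene_B")]
genes, tres, rows = presence_matrix(pairs)
print("columns:", tres)
print(render(genes, rows))
```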
Fig. 5(b) is the same result as Fig. 5(a), scrolled to the right most end. The colored
bar to the left of the gene identifiers indicates the cluster membership (generated from MetaClustering
Toolbox).
After the image is generated, clicking on Export to PtPlot button (Fig. 5(a))
prepares the statistics of five most over/under represented TREs across all the gene clusters.
Closing the window sends the PlotML data to the next component (typically, PtPlot module) on the
Dashboard.
Fig. 5(a) | Fig. 5(b)
Typically, the significance of a TRE varies across the gene clusters. Fig. 5(c) shows the
significance score (log scale of probabilities) of five most over/under-represented TREs across all the
gene clusters. A positive value in the plot indicates over-representation, and negative value indicates
under-representation. For example, the TRE COREBINDINGFACTOR_Q6 (blue color bars) is significantly
over-represented in cluster 3 (significance score ~1.5), compared to the other two clusters (significance
scores < 0).
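One plausible reading of a signed, log-scale significance score is sketched below. The exact formula used by the Feasnet Viewer is not given on this page, so the definition here (negative log10 of the over-representation p-value, or log10 of the under-representation p-value, whichever tail is smaller) is an assumption, as are the function name and the example p-values.

```python
import math


def signed_score(p_over, p_under):
    """Positive for over-representation, negative for under-representation.

    Assumed definition: take the smaller of the two tail probabilities and
    report its log10 magnitude with the matching sign. Both p-values must
    be greater than zero.
    """
    if p_over <= p_under:
        return -math.log10(p_over)
    return math.log10(p_under)


# Hypothetical p-values for one TRE in two clusters.
print(signed_score(0.03, 0.99))   # over-represented, positive score
print(signed_score(0.90, 0.01))   # under-represented, negative score
```

Under this assumed definition, a score near 1.5 corresponds to an over-representation p-value of roughly 0.03.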

Fig. 5(c)
Sample input files:
Sample output file: None (directly connected to PtPlot module).
[ Table of Contents ]
Use Case Workflow:
The biological context of the following use case is given at the beginning of the
UI Documentation section. This section provides a walk through of the workflow shown in the
Fig. 6(b). Our input files are
demo_30_metacluster_in.timeseries (microarray expression data), and
demo_309_hs_core0.9_5000bp_nocomp.xml.
Before starting the workflow, please make sure that you have:
- The latest NetBeans modules of Metacluster, CloneUpdater, PAINT Feasnet Builder, Feasnet Adapter,
and PAINT Feasnet Viewer installed. From the BioSPICE Dashboard main menu, select BioSPICE --> Analyzer
Table, which brings up the workflow area and the list of installed analyzers. Select each of the
above analyzers, and refer to the Description section. The text of the Description section should
have the timestamp (3.10pm Feb 20), as seen in Fig. 6(a). If this is not the case,
please try reinstalling the modules that do not have this timestamp.
 Fig. 6(a)
- The SBML to laidout Dot module is hardcoded to execute C:\Program Files\ATT\Graphviz\bin\dot.exe.
To use this module, install the GraphViz program. We
observed that the graphs generated are more appealing using the twopi.exe program instead of the
hardcoded dot.exe. Replacing the dot.exe in the C:\Program Files\ATT\Graphviz directory
with twopi.exe (located in the same directory) produces a graph similar to the one shown in
Fig. 6(d).
- A valid TRANSFAC Professional account.
This is required to use the PAINT Feasnet Builder module. If you don't have an account, we can provide a
temporary username/password on our server, upon email request, which is valid
for demo purposes only.
- Internet connection. This is required for PAINT Feasnet Builder and CloneUpdater modules.
Constructing the Workflow:
Select BioSPICE --> Analyzer Table from the main menu. This brings up the workspace (where
we can construct the workflow), and an Analyzer Table - which lists the available analyzers.
Construct the workflow as shown in the Fig. 6(b). This is done by dragging and dropping the
appropriate analyzer modules from the analyzer table, and adding source/destination/intermediate
documents as shown.
When making connections from the Feasnet Adapters to PAINT FeasnetViewer, please
set the parameter mappings as shown in Fig. 6(b).i and Fig. 6(b).ii.
 Fig. 6(b)
 Fig. 6(b).i
 Fig. 6(b).ii
Running the Workflow:
After constructing the workflow as shown in the Fig. 6(b) on BioSPICE Dashboard, select
Workflow --> Start from the workflow menu. The following sequence of GUI operations was performed:
- Metacluster: [ UI Documentation ]
- Select Clustering Tools --> K-Means from the main menu.
- Enter the number 3 when prompted to enter the expected number of clusters, and
click OK
- Select Clustering Tools --> Average-linkage from the main menu.
- Enter the number 3 when prompted to enter the expected number of clusters, and
click OK
- Select Clustering Tools --> Complete-linkage from the main menu.
- Enter the number 3 when prompted to enter the expected number of clusters, and
click OK
- Select Clustering Tools --> Meta-clustering from the main menu.
- In the Metacluster options dialog, select C_1 (K-Means) and
C_3 (Complete Linkage) (hold Control key for multiple selection), and enter
3 in the Number of clusters expected text box, and click OK.
- Make sure that the tabbed pane titled C_4 is selected, and choose
Output to SBML2 and Quit from the File menu. This operation closes the
Metacluster Toolbox dialog.
- CloneUpdater: [ UI Documentation ]
- Select Homo Sapiens (human) from the Choose an Organism drop down list.
- Enter 3 in the text box provided in the New Headers section.
- Click on Add New Headers button, which brings up Define New Headers dialog.
- In the Define New Headers dialog, select cluster_id, gene and express
respectively, from the three drop down lists provided. Click on OK.
- Click on the Continue button to submit a CloneUpdater request. For our dataset of 30
genes, it takes about 3 seconds for processing.
- In the Results dialog that comes up, select Distinct Clones in the drop down list,
and click on Output to SBML2 and Quit.
- This closes the CloneUpdater dialog box.
- PAINT Feasnet Builder: [ UI Documentation ]
- Select Homo Sapiens (Human) from the Organism drop down list.
- Select 5000 from the Upstream Length drop down list.
- Type in the User name and Password in the text boxes provided.
- Select 0.9 from the Core similarity threshold drop down list.
- Click on Send Request to submit the PAINT Feasnet Builder request. This process
takes about one minute for our dataset of 30 genes.
- After the processing is complete, the Output to SBML2 and Quit button is activated.
Click on this button to close PAINT Feasnet Builder.
- Feasnet Adapter:
- No user interface is provided, as no interaction is required.
- PAINT Feasnet Viewer: [ UI Documentation ]
- When using a Reference Feasnet for TRE significance analysis, explicit
parameter mapping is required. The output from the original gene list we started
out with (demo_30_metacluster_in.timeseries) should be mapped to the
FEASNETVIEWER_FEASNET_INPUT_TARGET (or more precisely, the output from the
Feasnet Adapter module for which the input is from our original gene list should
be mapped to the FEASNETVIEWER_FEASNET_INPUT_TARGET) and the output from the Reference
feasnet (in our example, demo_309_hs_core0.9_5000bp_nocomp.xml) should be mapped
to the FEASNETVIEWER_FEASNET_INPUT_REFERENCE parameter, using the Dashboard's
parameter editor.
- Click on View Feasnet Image button to generate the feasnet image, with the significance
values. This image can be saved in JPG format by selecting File --> Save from the menu.
- Click on Export Plot to generate PlotML of TRE significance values across the gene
clusters.
- Close this window to continue the workflow with PlotML output.
PAINT to Pathway Builder:
The workflow shown in the Fig. 6(c) generates
demo_30_pathwaybuilder_in.xml that is the input to Pathway
Builder module. Recall that the input files demo_30_hs_core0.9_5000bp_nocomp.xml and demo_30_gene_tre_layout.dot are generated
from the workflow shown in the Fig. 6(b).
 Fig. 6(c)
Start Pathway Builder by choosing File --> New... from the BioSPICE Dashboard main menu. In the
Choose Template step, click on BioSPICE --> Pathway Builder Pathway. Click the Next >
button, and choose a name (such as demo_30_pathway) for the Pathway. Click on Finish button
to open the Pathway Builder GUI.
In the Pathway Builder GUI, select File --> Open... from the main menu, and choose
demo_30_pathwaybuilder_in that was created from the workflow in the Fig. 6(c). You should
see the candidate regulatory network from PAINT as shown in Fig. 6(d).
 Fig. 6(d)
[ Table of Contents ]
Contact Information:
If you have any questions please contact Praveen Chakravarthula (praveenc@mail.dbi.tju.edu) and
Dr. James Schwaber (james.schwaber@jefferson.edu).