- Scope of this Database
The Plant Gene Expression Database (PED) provides a subset of the publicly available Affymetrix expression data from Arabidopsis in pre-analyzed form. Several statistical methods are used for identifying differentially expressed genes (DEGs), and correlation and clustering techniques for co-regulation analyses. To provide high-confidence data, the database is restricted to data sets with two or more replicates. Most data analysis routines and online tools in PED are based on R and BioConductor resources.
- Search PED
A. Gene-Wise Search
Gene-wise searches can be performed via the Genome Cluster Database (GCD) using the following query types:
The hyperlinks on the subsequent GCD query result page redirect to the following expression information:
B. Treatment-Based Search
The 'Differential Expression Search Page' allows searching for differentially expressed genes (DEGs) by specific treatments and filtering by various quantitative values to obtain candidate gene lists with strategies that resemble typical microarray analysis routines. To perform queries on this page, first select an Experiment Set (e.g. GSE2473 or ME00325) and a Comparison/Contrast ID (e.g. 1) of interest from the Experiment Summary Page. Other important values to select on this page are the fold change level (in log2), the false discovery rate (adjusted p-value) of the LIMMA method (Smyth, 2004) and the normalization type (MAS5 or RMA; Qin et al, 2006; Irizarry et al, 2003). For instance, the settings [fold change ≥ 1 OR ≤ -1] AND [adjusted p-value ≤ 0.05] will return all DEGs that, in response to the selected treatment, show at least a two-fold change in either direction (a log2 fold change of 1 corresponds to a fold change of 2) at a false discovery rate of at most 0.05. The returned DEG result pages are fully integrated with the co-expression data from several correlation and clustering methods.
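The example filter settings can be illustrated with a minimal Python sketch. It is not the code used by PED; the record layout, the function name `filter_degs` and the gene identifiers and values are invented for illustration only.

```python
from typing import NamedTuple


class DegRecord(NamedTuple):
    """One row of a hypothetical DEG result table."""
    gene_id: str
    log2_fold_change: float
    adjusted_p_value: float


def filter_degs(records, min_abs_log2_fc=1.0, max_adj_p=0.05):
    """Keep records with |log2 fold change| >= threshold AND adjusted p <= cutoff."""
    return [
        r for r in records
        if abs(r.log2_fold_change) >= min_abs_log2_fc
        and r.adjusted_p_value <= max_adj_p
    ]


# Invented example values, not taken from the database.
records = [
    DegRecord("gene_a", 1.8, 0.003),   # up-regulated, significant
    DegRecord("gene_b", -1.2, 0.04),   # down-regulated, significant
    DegRecord("gene_c", 0.4, 0.001),   # significant, but change too small
    DegRecord("gene_d", 2.5, 0.20),    # large change, but not significant
]

selected = filter_degs(records)
print([r.gene_id for r in selected])  # ['gene_a', 'gene_b']
```

Taking the absolute value of the log2 fold change covers both halves of the [≥ 1 OR ≤ -1] condition in a single comparison.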
C. Compound-Based Search
The compound-based access via ChemMine is under construction.
- Gene Expression Pages
The gene expression pages provide, for every gene of interest, an expression summary across all experiment sets in the database, as well as detailed information on the normalized expression levels and the differential expression analysis results from the LIMMA (Smyth, 2004) and Rank Product (Hong et al, 2006) methods. All expression data are available for the MAS5 (Qin et al, 2006) and RMA (Irizarry et al, 2003) normalization algorithms. A link at the top of each page switches from the default view of the MAS5 data to the RMA normalized data. Detailed descriptions of the different data fields are provided in the form of pop-up help windows that appear when hovering the cursor over them. An expansion system allows zooming into the raw expression data and the differential analysis results by clicking on the black triangles next to the list of experiment sets. The 'Ratio' expansion opens the DEG data view, while the 'Int' expansion displays the data for comparing expression levels. The corresponding experiment definitions used in the DEG analysis of each treatment series can be downloaded via the EXP DEF links.
- Correlation and Cluster Pages
The link "Correlation Data" at the top of the gene expression pages opens the corresponding correlation cluster page for a given gene, while the adjacent cluster links (e.g. CL4(134)) open the same correlation cluster pages for specific clusters. The correlation page makes it possible to identify, for a gene of interest, its most positively or negatively co-regulated neighbors by providing for every gene on the ATH1 array the Pearson and Spearman correlation profiles against all other genes. These values are calculated across all expression sets in the database. A paging function at the top of the correlation cluster pages allows efficient navigation through the complex data matrices, which can also be downloaded into local applications. The integrated cluster information on these pages contains four separate Hierarchical Threshold Clustering (HTC) data sets. These were calculated using the Pearson and Spearman correlation coefficients, in their signed and absolute forms, as distance measures (PCC, PCCa, SCC and SCCa). Clicking a cluster link restricts the view to the genes of the cluster of interest.
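The difference between the four measures can be illustrated with a minimal Python sketch. It is not the PED implementation: the function names and expression values are invented, and the conversion of a coefficient r into a distance as 1 - r (signed) or 1 - |r| (absolute) is an assumption, since the exact formula used for the HTC data sets is not stated here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_distance(r, absolute=False):
    """Assumed conversion of a correlation coefficient into a distance."""
    return 1 - (abs(r) if absolute else r)


# Invented expression profiles across five samples.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_y = [10.0, 8.0, 6.0, 4.0, 2.0]   # perfectly anti-correlated with gene_x
gene_z = [1.0, 4.0, 9.0, 16.0, 25.0]  # monotonic but non-linear in gene_x

r_xy = pearson(gene_x, gene_y)
print(correlation_distance(r_xy), correlation_distance(r_xy, absolute=True))
print(pearson(gene_x, gene_z), spearman(gene_x, gene_z))
```

Under this assumption, the signed forms place anti-correlated genes far apart, whereas the absolute forms treat strongly negative and strongly positive correlations as equally close, so inversely regulated genes can fall into the same cluster.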
- Expression Profile Plots
To evaluate the quality of expression clusters or to visualize the expression patterns for custom gene sets across all samples in the database, an expression profile plotting tool can be accessed from the correlation cluster pages. Clicking the plotting icons or selecting genes via the available checkboxes plots the MAS5 normalized expression profiles for the selected gene sets in raw and centered form. The corresponding biosamples of the plotted data points are available on the Experiment Summary Page. Alternatively, the plotting tool can be accessed through the R/BioC Tools page.
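The difference between the raw and centered forms can be sketched as follows. The sketch assumes that centering means subtracting each gene's mean expression across samples; the plotting tool may use a different centering method (e.g. the median), and the function name and values are invented.

```python
def center_profile(profile):
    """Subtract the profile's mean so that the values vary around zero."""
    mean = sum(profile) / len(profile)
    return [value - mean for value in profile]


# Invented expression values for one gene across three samples.
raw_profile = [6.0, 8.0, 10.0]
centered_profile = center_profile(raw_profile)
print(centered_profile)  # [-2.0, 0.0, 2.0]
```

Centering removes differences in absolute expression level, so that genes with similar patterns but different baselines overlay in the same plot.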
- R/BioC Tools
The R/BioC Tools page provides access to various R and BioConductor data analysis resources. Currently, this service is restricted to the expression profile plotting tool. Other functions, such as clustering and functional enrichment utilities, will be made available in the near future.
- Download Options
Extensive download options for import into local spreadsheet programs are available at the top of all query result pages for intensity, DEG, correlation and cluster data.