Index of /norm

Icon  Name                                     Last modified      Size  Description
[DIR] Parent Directory                                            -
[   ] E_coli_v3_Build_1_norm.tar.gz            28-Feb-2007 15:15  50M
[   ] E_coli_v3_Build_2_norm.tar.gz            28-Feb-2007 15:15  61M
[   ] E_coli_v3_Build_3_norm.tar.gz            07-Sep-2007 19:05  65M
[TXT] E_coli_v4_Build_2_norm.probe_data.txt    05-Sep-2007 14:02  8.0M
[   ] E_coli_v4_Build_2_norm.tar.gz            28-Aug-2007 17:33  61M
[   ] E_coli_v4_Build_3_norm.tar.gz            29-Oct-2007 17:51  64M
[   ] E_coli_v4_Build_4_norm.tar.gz            20-Dec-2007 12:32  74M
[   ] E_coli_v4_Build_5.tar.gz                 30-Oct-2008 01:16  90M
[   ] E_coli_v4_Build_5_affy_cdf.tar.gz        30-Oct-2008 01:16  90M
[   ] E_coli_v4_Build_6.tar.gz                 03-Sep-2009 11:12  112M
[   ] S_cerevisiae_v3_Build_1_norm.tar.gz      28-Feb-2007 15:15  68M
[   ] S_oneidensis_v3_Build1_norm.tar.gz       28-Feb-2007 15:15  2.9M
[   ] S_oneidensis_v4_Build1_norm.tar.gz       24-Aug-2007 14:52  3.0M
[   ] S_oneidensis_v4_Build_2.tar.gz           23-Jun-2008 13:01  42M
[DIR] helper_scripts/                          17-Oct-2007 18:59  -
[   ] yg_s98_v3_Build_2_norm.tar.gz            30-Sep-2008 19:37  26M

This file describes the normalized compendium dumps from M3D.

--- EXPRESSION DATA ---
You should find six files with expression data in them. The
naming convention for these files is Compendium_chipsMprobesN.tab,
where M is the number of chips in the file and N is the number
of probe sets in the file.  You will find three different numbers of
probe sets, which from smallest to largest correspond to:
genes only; genes + intergenic regions; genes + intergenic regions +
control probes.  The remaining three files have "avg" preceding
the compendium name and "exps" rather than "chips" in the name.  These
three files contain the average of the replicates for experiments that
have replicates.
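
The naming convention above can be parsed mechanically. The following is
a minimal Python sketch, not part of the M3D distribution. The function
names, the exact spelling of the "avg" prefix (assumed here to be "avg"
with an optional underscore), and the probe-set counts in the usage note
are assumptions made for illustration.

```python
import re

# Pattern inferred from the naming convention in this README.
# ASSUMPTION: the "avg" prefix is written "avg" or "avg_" before the name.
NAME_RE = re.compile(
    r"^(?P<avg>avg_?)?(?P<compendium>.+)_"
    r"(?P<unit>chips|exps)(?P<m>\d+)probes(?P<n>\d+)\.tab$"
)

# Tiers listed in the README, ordered from fewest to most probe sets.
TIERS = [
    "genes only",
    "genes + intergenic regions",
    "genes + intergenic regions + control probes",
]


def parse_dump_name(filename):
    """Split an expression file name into its documented parts."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError("not an expression data file name: " + filename)
    return {
        "compendium": match.group("compendium"),
        "averaged": match.group("avg") is not None,
        "unit": match.group("unit"),          # "chips" or "exps"
        "columns": int(match.group("m")),     # M in the convention
        "probe_sets": int(match.group("n")),  # N in the convention
    }


def label_tiers(filenames):
    """Map each file name to a tier by ranking the distinct N values."""
    parsed = {name: parse_dump_name(name) for name in filenames}
    counts = sorted({info["probe_sets"] for info in parsed.values()})
    if len(counts) != len(TIERS):
        raise ValueError("expected exactly three distinct probe set counts")
    tier_of = dict(zip(counts, TIERS))
    return {name: tier_of[info["probe_sets"]] for name, info in parsed.items()}
```

For example, with made-up counts, parse_dump_name("E_coli_v4_Build_2_chips213probes4297.tab")
reports 213 chips and 4297 probe sets, and label_tiers ranks a set of six
file names into the three tiers by their N values.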

--- PROBE INFORMATION ---
In each dump, you will find a file of the form 

Compendium.probe_set_descriptions 

This file contains additional names
for each probe set. It contains the probe set name, the locus (the standard
gene name used for the species, for example b0123 for E. coli and SO0123 for 
Shewanella), the common name, and a friendly name. The friendly name is 
the common name if the gene has one; otherwise it is the locus.
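
The friendly-name rule can be written as a one-line fallback. This is a
minimal Python sketch of the rule as stated above, not the code M3D uses;
the function name and the handling of blank strings as "no common name"
are assumptions.

```python
def friendly_name(locus, common_name):
    """Return the common name when the gene has one, otherwise the locus.

    ASSUMPTION: a missing common name appears as None or as an empty or
    whitespace-only string.
    """
    if common_name is not None and common_name.strip():
        return common_name.strip()
    return locus
```

With illustrative inputs, friendly_name("b0002", "thrA") gives "thrA",
while friendly_name("b0123", "") falls back to "b0123".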

For the E_coli_v4_Build_2_norm data set we have included a separate file 
containing probe_set --> probe mappings and sequences:

  E_coli_v4_Build_2_norm.probe_data.txt


--- CHIP INFORMATION ---
In each dump, you will find a file of the form 

Compendium.experiment_descriptions 

This file contains basic condition information for each experiment and 
chip in the compendium.


--- STRUCTURED EXPERIMENTAL METADATA ---
Compendia that are version 4 or later have an additional file

Compendium.experiment_feature_descriptions 

This file contains curated, detailed condition information for each experiment.
For each experiment, you will find many rows.  Each row corresponds to one
feature of the experiment.  Each feature has a defined unit and type
that are enforced across all experiments in M3D.  For example, all experiments
in the database that use glucose contain the glucose value as a real number
in mM.  In general, we have tried to use mM for all chemicals in the database,
so that it is easier to combine and merge the different chemicals to calculate 
the amount of certain important atoms like sulfur and nitrogen.  Chemically 
undefined constituents like yeast extract are provided in their commonly used
units.
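
Because chemical features share mM as a unit, the amount of a given atom
can be summed across compounds. The following Python sketch illustrates
the idea under stated assumptions: the feature names, the in-memory
layout (feature name mapped to a value and unit), and the example values
are hypothetical and do not reflect the actual layout of the
experiment_feature_descriptions file. Only the atom counts per molecule
come from the chemical formulas.

```python
# Nitrogen and sulfur atoms per molecule, from the chemical formulas:
# glucose C6H12O6, ammonium chloride NH4Cl, ammonium sulfate (NH4)2SO4,
# magnesium sulfate MgSO4.
# ASSUMPTION: these feature names are hypothetical, not M3D's own.
ATOMS_PER_MOLECULE = {
    "glucose": {"N": 0, "S": 0},
    "ammonium_chloride": {"N": 1, "S": 0},
    "ammonium_sulfate": {"N": 2, "S": 1},
    "magnesium_sulfate": {"N": 0, "S": 1},
}


def atom_concentration_mM(features, atom):
    """Sum the mM concentration of one atom over an experiment's features.

    `features` maps a feature name to a (value, unit) pair. Features that
    are not in mM (for example yeast extract) or whose composition is not
    listed above are skipped rather than guessed.
    """
    total = 0.0
    for name, (value, unit) in features.items():
        if unit != "mM" or name not in ATOMS_PER_MOLECULE:
            continue
        total += value * ATOMS_PER_MOLECULE[name][atom]
    return total


# Hypothetical experiment, values chosen for illustration only.
example = {
    "glucose": (22.2, "mM"),
    "ammonium_sulfate": (10.0, "mM"),
    "ammonium_chloride": (5.0, "mM"),
    "yeast_extract": (5.0, "g/L"),
}
nitrogen_mM = atom_concentration_mM(example, "N")  # 10*2 + 5*1 = 25.0
sulfur_mM = atom_concentration_mM(example, "S")    # 10*1 = 10.0
```

Skipping undefined constituents keeps the total a lower bound on the
defined sources instead of mixing incompatible units.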

--- HELPER SCRIPTS ---
The helper_scripts directory contains a script for parsing the normalized data into MATLAB.