Wigwams in MATLAB

Load one of the datasets
----------------------------------

>> load ../datasets/ArabdidopsisAbioticShoot

Other available datasets:

ArabidopsisAbioticRoot
EColiDiauxie
EColiDiauxieLarge
HomoSapiensMS
YeastCellCycle
YeastCellCycleLarge
YeastStress
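
Before loading, it can help to check which variables a dataset file holds. The following is a minimal sketch, assuming the datasets are stored as .mat files in ../datasets (the extension and the variable name datasetFile are assumptions, not part of Wigwams):

```matlab
% Assumption: datasets are .mat files under ../datasets, as the load
% command above suggests. datasetFile is an illustrative name.
datasetFile = fullfile('..', 'datasets', 'YeastStress.mat');

if exist(datasetFile, 'file') == 2
    % List the variables stored in the file without loading them
    info = whos('-file', datasetFile);
    fprintf('%s contains: %s\n', datasetFile, strjoin({info.name}, ', '));
    % Load the variables into the workspace
    load(datasetFile);
else
    warning('Dataset file not found: %s', datasetFile);
end
```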

Run Wigwams
-----------

MultiHyperAll(dataset, geneList, pseudo_set_sizes, job, score_threshold)

-dataset = the dataset you have loaded
-geneList = the indices of the genes you wish to run Wigwams on
-pseudo_set_sizes = the default is [50:50:250], i.e. 50, 100, 150, 200 and 250
-job = a name for your output, e.g. '_job1'
-score_threshold = if you are using the differential expression scores, the cutoff above which genes are considered differentially expressed (DE); otherwise, pass 'no'

>> MultiHyperAll(ArabdidopsisAbioticShoot, 1:6, [50:50:250], '_genes-1-6', 'no')
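
The arguments above can be prepared in a short script. This is only a sketch: the batching of genes into groups of 6 and the variable names gene_indices, batch_size and jobs are illustrative assumptions, and whether outputs from separate jobs are combined later is not covered here. The MultiHyperAll call is left commented out so the snippet runs without a loaded dataset.

```matlab
% Default pseudo-set sizes: the colon expression expands to 50, 100, ..., 250
pseudo_set_sizes = 50:50:250;

% Hypothetical batching: 18 genes in groups of 6 (illustrative numbers only)
gene_indices = 1:18;
batch_size   = 6;
n_batches    = ceil(numel(gene_indices) / batch_size);
jobs         = cell(n_batches, 1);

for b = 1:n_batches
    first_idx = (b - 1) * batch_size + 1;
    last_idx  = min(b * batch_size, numel(gene_indices));
    geneList  = gene_indices(first_idx:last_idx);

    % Job label in the same style as the example above, e.g. '_genes-1-6'
    jobs{b} = sprintf('_genes-%d-%d', geneList(1), geneList(end));

    % Uncomment once the dataset is loaded and Wigwams is on the path:
    % MultiHyperAll(ArabdidopsisAbioticShoot, geneList, pseudo_set_sizes, jobs{b}, 'no');
end

disp(jobs);
```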

Run the pruning procedure
-------------------------

pruning2(dataset, pvalue_cutoff, correlation_cutoff, significant_difference)

-dataset = the dataset you have loaded, as before
-pvalue_cutoff = only log p-values below this value are considered
-correlation_cutoff = the correlation cutoff below which a pair of clusters is deemed dissimilar
-significant_difference = the amount by which one log p-value must be smaller than another to be considered significantly stronger

>> pruning2(ArabdidopsisAbioticShoot, -10, 0.75, 25)
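
To illustrate how the two log p-value parameters read, here is a small sketch with made-up numbers. It is not the code inside pruning2; the log p-values are hypothetical, and whether pruning2 uses a strict or non-strict comparison is an assumption.

```matlab
% Illustrative values only; these are not Wigwams output
pvalue_cutoff          = -10;
significant_difference = 25;
logp_a = -60;   % hypothetical log p-value of cluster A
logp_b = -30;   % hypothetical log p-value of cluster B

% A log p-value is considered only if it lies below the cutoff
a_considered = logp_a < pvalue_cutoff;
b_considered = logp_b < pvalue_cutoff;

% A counts as significantly stronger than B if its log p-value is smaller
% by at least significant_difference (the ">=" is an assumption)
a_stronger = (logp_b - logp_a) >= significant_difference;

fprintf('A considered: %d, B considered: %d, A significantly stronger: %d\n', ...
        a_considered, b_considered, a_stronger);
```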

Export the overlaps
-------------------

overlap_function(dataset, pseudo_set_sizes, final_genes, score_threshold)

dataset, pseudo_set_sizes and score_threshold are as before.
final_genes = the gene indices kept by the pruning stage. To find them, open 'final_genes.txt' in the 'output' folder. Each cell in the second column says something like 'gene number (e.g. 6) pruned 75 genes', which means that the gene with that number (e.g. 6) has been kept, along with the cluster it seeded. To get the final gene indices, copy all the gene numbers from that column into a vector:

>> final_genes = [copy and paste here];

>> overlap_function(ArabdidopsisAbioticShoot, [50:50:250], 6, 'no')
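
As an alternative to copying by hand, the gene numbers can be pulled out of the second-column text with a regular expression. This is a sketch under assumptions: the example strings below are hypothetical, the exact wording in 'final_genes.txt' may differ, and the kept gene number is assumed to be the first integer in each cell.

```matlab
% Hypothetical second-column entries; inspect the real final_genes.txt
% and adjust the pattern if the wording differs.
second_column = {'6 pruned 75 genes'; '14 pruned 3 genes'; '27 pruned 0 genes'};

% Take the first run of digits in each cell as the kept gene index
tokens = regexp(second_column, '\d+', 'match', 'once');

% Convert the matched text to numbers and store as a row vector
final_genes = str2double(tokens)';

disp(final_genes);
```

The resulting final_genes vector can then be passed as the third argument of overlap_function.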

overlap_function creates three files:

output.eps = plots of all significant clusters
overlap_table.txt = all clusters, the datasets each is significantly co-opted in, the number of those datasets, and the size of each cluster
page_two.txt = each cluster and its gene members

