Discussion

We denote the results of running the original data on the non-weighted version as results A, the results of running the noised data on the non-weighted version as results B, and the results of running the noised data on the weighted versions as results C.

Before implementing, we expected results C to be more similar to results A than results B are to results A.

Our expectations were only half fulfilled. After running both versions of the implemented algorithm, we observed that the attached version improved the M measure significantly.

Surprisingly, the unattached version of the algorithm not only failed to improve M, but also produced a worse M than that of the noised non-weighted run.

Another interesting observation is that, in the unattached version of the weighted TNoM, most of the genes with a low TNoM score had correspondingly poor P-values (close to 1, which means that the TNoM score is not significant).

A possible explanation for these results is that some of our basic assumptions are wrong:

  1. The unattached version is unidirectional (it assumes that the correct position of all pluses is the left side of the vector), which reduces the validity of 50% of all vectors.
  2. In the unattached version, the unlimited repetition of weights could be wrong. In the unidirectional case it even enlarges the fraction of vectors with the wanted TNoM score or lower, which yields a worse (less significant) P-value.
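
The effect described in item 1 can be illustrated with a minimal sketch of the standard, non-weighted TNoM score on a small label vector. This is not the weighted implementation discussed in this report. The function names (tnom_unidirectional, tnom_bidirectional, p_value) are hypothetical, and the P-value is computed by brute-force enumeration of all label arrangements with the same number of pluses, which is an assumption made only to keep the example short.

```python
from itertools import combinations

def tnom_unidirectional(labels):
    """Minimum number of misclassifications over all thresholds,
    assuming every '+' belongs to the left of the threshold."""
    labels = list(labels)
    n = len(labels)
    best = n
    for t in range(n + 1):
        # Errors: '-' labels left of the threshold, '+' labels right of it.
        errors = labels[:t].count('-') + labels[t:].count('+')
        best = min(best, errors)
    return best

def tnom_bidirectional(labels):
    """Same score, but also allowing '+' to belong on the right side."""
    flipped = ['-' if x == '+' else '+' for x in labels]
    return min(tnom_unidirectional(labels), tnom_unidirectional(flipped))

def p_value(labels, score_fn):
    """Fraction of all arrangements with the same number of '+' labels
    whose score is less than or equal to the observed score."""
    labels = list(labels)
    n, k = len(labels), labels.count('+')
    observed = score_fn(labels)
    hits = total = 0
    for plus_positions in combinations(range(n), k):
        vec = ['-'] * n
        for i in plus_positions:
            vec[i] = '+'
        total += 1
        if score_fn(vec) <= observed:
            hits += 1
    return hits / total

# A perfectly separated vector whose pluses sit on the right-hand side.
v = "---+++"
print(tnom_bidirectional(v), p_value(v, tnom_bidirectional))    # 0 0.1
print(tnom_unidirectional(v), p_value(v, tnom_unidirectional))  # 3 1.0
```

In this toy case the bidirectional score treats the vector as perfectly separated, while the unidirectional score assigns it the worst possible score and a P-value of 1. The sketch only illustrates the orientation problem of item 1; it does not model weights or their repetition.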

Our final conclusion from the results is that the attached version is indeed a good improvement over the original TNoM algorithm, yielding a new algorithm that is more reliable and more resistant to noise. The algorithm shows the greatest improvement with gene qualities 4-9 (inclusive). We expect that most real-life data will have a weight distribution in that range, and that the algorithm is therefore suitable for real-life needs.