Standard deviations
Aug. 23rd, 2006 01:50 pm
The main conclusion is to finish the random files issue.
TODO: Run several shuffling series of 10 shufflings each. Choose a word from the top of the occurrences list. The means for that word should not differ by more than 5%, and the standard deviations should not differ by more than 15%. If this does not hold, increase the number of shuffles in each series to 15, and so forth.
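A rough sketch of how this TODO could be automated, just to pin the procedure down. Everything named here is my own assumption: the helper names, the number of series (5), the cap on series length, and above all the reading of "differ by no more than X%" as (max - min) / mean across the series. The real word counting runs on the shuffled files; here a word is simply a tuple of adjacent tokens.

```python
import random
import statistics

def count_word_after_shuffle(tokens, word, rng):
    """Shuffle a copy of `tokens` and count adjacent occurrences of `word` (a tuple)."""
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    k = len(word)
    return sum(
        1
        for i in range(len(shuffled) - k + 1)
        if tuple(shuffled[i:i + k]) == word
    )

def relative_spread(values):
    """(max - min) / mean: one possible reading of 'differ by no more than X%'."""
    m = statistics.mean(values)
    return (max(values) - min(values)) / m if m else float("inf")

def find_series_length(tokens, word, n_series=5, start=10, step=5,
                       max_len=200, seed=0):
    """Grow the series length until means agree within 5% and stds within 15%.

    Returns the first series length that passes, or None if `max_len`
    (an assumed safety cap) is reached first.
    """
    rng = random.Random(seed)
    n = start
    while n <= max_len:
        means, stds = [], []
        for _ in range(n_series):
            counts = [count_word_after_shuffle(tokens, word, rng) for _ in range(n)]
            means.append(statistics.mean(counts))
            stds.append(statistics.stdev(counts))
        if relative_spread(means) <= 0.05 and relative_spread(stds) <= 0.15:
            return n
        n += step  # 10 -> 15 -> 20 and so forth
    return None
```

With a stricter or looser definition of "differ" only relative_spread needs to change.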
TEN distinguishes two different measures of standard deviation - the standard deviation of a series (he calls it the individual measurement dispersion) and the standard deviation of averages. This notation is new to me, although I do understand that, given K sampling series, the standard deviation between the means of those series will be much lower than the standard deviation within a series. I just don't understand why we need such a parameter and when it can be used. TEN argues that sigma_averages = sigma_sample/sqrt(N), where N is the number of samples in each series.
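As far as I can tell, the standard deviation of averages is what is usually called the standard error of the mean: it says how precisely the mean itself is known, as opposed to how much a single measurement scatters. A quick numerical check of sigma_averages = sigma_sample/sqrt(N), assuming independent Gaussian samples (my assumption; the shuffling counts need not be Gaussian). The function name and the parameter values are made up for illustration.

```python
import math
import random
import statistics

def std_of_series_means(n_series, n_per_series, sigma, seed=0):
    """Draw `n_series` series of `n_per_series` Gaussian samples each
    and return the standard deviation between the series means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_series):
        series = [rng.gauss(0.0, sigma) for _ in range(n_per_series)]
        means.append(statistics.mean(series))
    return statistics.stdev(means)

sigma, n = 2.0, 25                      # sigma_sample and N (illustrative values)
observed = std_of_series_means(n_series=2000, n_per_series=n, sigma=sigma)
predicted = sigma / math.sqrt(n)        # sigma_averages according to the formula
print(f"observed {observed:.3f}, predicted {predicted:.3f}")
```

The two numbers should agree to within a few percent, and increasing N shrinks both by the same sqrt(N) factor.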
(no subject)
Aug. 22nd, 2006 12:22 pm
There are different techniques of shuffling. Total shuffling of all the codons will destroy the local nonuniformity of amino acid composition. It is true that at the present stage of the research we are interested in showing the general significance of the third codon position in terms of combinatorial bias (CB), but it seems natural that CB changes along the protein. The question is whether we should use windowed shuffling in order to preserve the nonuniformity of the codon distribution. Windowed shuffling is shuffling within a window (say, 30 codons long). The next window starts 5 (or 10) codons further along. What is the statistical justification for such a shuffling?
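To make the idea concrete, here is a minimal sketch of one possible reading of windowed shuffling: slide a window along the codon list and shuffle the codons inside each window in place, so that overlapping windows are applied one after another. The function name, the in-place sequential treatment of overlaps, and the handling of short sequences are all my assumptions, not a settled method.

```python
import random

def windowed_shuffle(codons, window=30, step=5, seed=None):
    """Shuffle codons locally: each window of `window` codons is shuffled,
    then the window moves `step` codons further along the sequence.

    The overall codon composition is preserved exactly; local composition
    is preserved only approximately, because overlapping windows let a
    codon drift along the sequence.
    """
    rng = random.Random(seed)
    out = list(codons)          # work on a copy, leave the input untouched
    if len(out) <= window:
        rng.shuffle(out)        # sequence shorter than one window: plain shuffle
        return out
    # If (len - window) is not a multiple of `step`, the last few codons
    # fall outside every window and stay in place.
    for start in range(0, len(out) - window + 1, step):
        chunk = out[start:start + window]
        rng.shuffle(chunk)
        out[start:start + window] = chunk
    return out

# Usage with a toy sequence of 100 labelled "codons"
toy = [f"c{i}" for i in range(100)]
shuffled = windowed_shuffle(toy, window=30, step=5, seed=1)
```

Non-overlapping windows (step equal to window) would keep every codon inside its original block, which may be easier to justify statistically.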
Thesis mess
Jun. 22nd, 2006 09:35 am
Writing a thesis is disgusting work. Actually doing the research is fun (despite all the failures), but putting together two years of work... More than that, we found something very suspicious, and with so little time left, all the remaining tests trouble me very much... Summary: write your thesis report throughout the whole research period...