Mathematics | Hypergeometric Distribution model

Last Updated : 09 Jan, 2024
The Hypergeometric Distribution Model is used to estimate the number of faults initially resident in a program at the start of the test or debugging process, based on the hypergeometric distribution. Let $C_{i-1}$ be the cumulative number of errors already detected by test instances $t_1, t_2, \ldots, t_{i-1}$, and let $N_i$ be the number of errors newly detected by test instance $t_i$. Assumptions:
  1. A program initially contains m faults when the test phase starts.
  2. A test consists of a number of test instances, each of which is a set of pairs of input data and output data. In practice, the collection of test operations performed in a day or a week is treated as one test instance. The test instances are denoted by $t_i$ for $i = 1, 2, \ldots, n$.
  3. Detected faults are not removed between test instances.
Therefore, by the last assumption, the same fault can be experienced in several test instances. Let $W_i$ be the number of faults experienced by test instance $t_i$. Some of the $W_i$ faults may already be counted in $C_{i-1}$; the remaining faults are the newly detected ones. If $n_i$ is an observed value of $N_i$, then $n_i \leq W_i$. Each fault experienced by $t_i$ therefore falls into one of two categories:
  1. Newly discovered faults
  2. Rediscovered faults
If we assume that the number of newly detected faults $N_i$ follows a hypergeometric distribution, then the probability of obtaining exactly $n_i$ newly detected faults among $W_i$ faults is,

    $$P(N_i=n_i)=\frac{\binom{m-C_{i-1}}{n_i}\binom{C_{i-1}}{W_i-n_i}}{\binom{m}{W_i}}$$

where

    $$C_{i-1}= \sum_{k=1}^{i-1}n_k, \; C_0=0, \; n_0=0 $$

and

    $$\max\{0, W_i-C_{i-1}\}\leq n_i\leq \min\{W_i, m-C_{i-1}\}$$

for all i. Since $N_i$ is assumed to be hypergeometrically distributed, the expected number of newly detected faults during the interval $[t_{i-1}, t_i]$ is,

    $$E(N_i)=\frac{(m-C_{i-1})W_i}{m}$$

and the expected value of $C_i$ is given by,

    $$E(C_i)=m\left [1- \prod_{j=1}^i (1-p_j)  \right ]$$

where

    $$p_i=\frac{W_i}{m}, \; i=1, 2, \ldots$$
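
The formulas above can be checked numerically with a short Python sketch. It is an illustration under stated assumptions, not a reference implementation: the function names (`hgdm_pmf`, `expected_cumulative`, `expected_cumulative_closed_form`) and the sample values of $m$, $C_{i-1}$ and $W_i$ are made up for this example, the $W_i$ are treated as known constants, the support of $N_i$ is taken as $\max\{0, W_i-C_{i-1}\} \leq n_i \leq \min\{W_i, m-C_{i-1}\}$, and the mean of $N_i$ is taken as the standard hypergeometric mean $(m-C_{i-1})W_i/m$.

```python
from math import comb, prod


def hgdm_pmf(n_i, m, c_prev, w_i):
    """P(N_i = n_i) for m initial faults, c_prev = C_{i-1}, w_i = W_i."""
    lo = max(0, w_i - c_prev)  # at most c_prev of the w_i faults can be rediscoveries
    hi = min(w_i, m - c_prev)  # only m - c_prev faults are still undetected
    if not lo <= n_i <= hi:
        return 0.0
    return comb(m - c_prev, n_i) * comb(c_prev, w_i - n_i) / comb(m, w_i)


def expected_cumulative(m, w):
    """E(C_1), ..., E(C_n) from E(C_i) = E(C_{i-1}) + (m - E(C_{i-1})) * W_i / m."""
    out, c = [], 0.0
    for w_i in w:
        c += (m - c) * w_i / m
        out.append(c)
    return out


def expected_cumulative_closed_form(m, w):
    """E(C_i) = m * (1 - prod_{j=1..i} (1 - W_j / m))."""
    return [m * (1 - prod(1 - w_j / m for w_j in w[: i + 1])) for i in range(len(w))]


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from any data set).
    m, c_prev, w_i = 50, 10, 5
    probs = [hgdm_pmf(n, m, c_prev, w_i) for n in range(w_i + 1)]
    mean = sum(n * p for n, p in enumerate(probs))
    print(f"total probability   = {sum(probs):.6f}")
    print(f"mean of N_i         = {mean:.6f}")
    print(f"(m - C_(i-1)) W_i/m = {(m - c_prev) * w_i / m:.6f}")

    w = [10, 20, 10]  # assumed W_1, W_2, W_3 for m = 100
    print(expected_cumulative(100, w))
    print(expected_cumulative_closed_form(100, w))
```

The recurrence and the closed form should agree up to floating-point rounding, because subtracting both sides of the recurrence from $m$ gives $m - E(C_i) = (m - E(C_{i-1}))(1 - p_i)$, which unrolls into the product.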


