Constrained Bayesian Method of Composite Hypotheses Testing: Singularities and Capabilities

ABSTRACT: The paper deals with the constrained Bayesian method (CBM) for testing composite hypotheses. It is shown that, similarly to the cases when CBM is optimal for testing simple and multiple hypotheses in parallel and sequential experiments, it keeps its optimal properties when testing composite hypotheses. In particular, it easily, without special effort, overcomes Lindley's paradox, which arises when testing a simple hypothesis versus a composite one. CBM is compared with the Bayesian test in the classical case and in the case when the a priori probabilities are chosen in a special manner for overcoming Lindley's paradox. The superiority of CBM over these tests is demonstrated by simulation. The validity of the theoretical reasoning is supported by extensive computation of different characteristics of the considered methods.


INTRODUCTION
In [1], constrained Bayesian methods (CBM) of statistical hypotheses testing were offered. CBM have all the positive characteristics of the well-known classical approaches: like Fisher's test, they use a data-dependent measure for making a decision; like the Jeffreys test, they use a posteriori probabilities; and, like the Neyman-Pearson approach, they compute Type I and Type II error probabilities [2]. The parallel and sequential methods for testing many (two or more) and multiple simple hypotheses are considered in [2-9]. There, CBM is also compared with the Fisher, Jeffreys, Neyman-Pearson, Berger (parallel and sequential), Wald (sequential), Bonferroni and step-up and step-down methods of multiple hypotheses testing, and the comparison shows the unique properties of this method. In particular, it gives more logical and reliable decisions, using the information contained in the sample more completely than the existing methods do. Continuing the development and examination of CBM, below we offer its application to testing composite hypotheses.
A lot of works are dedicated to the problem of testing composite hypotheses. In particular, a compact and exhaustive review of the history of the development of the theory of hypotheses testing (including composite ones) is given in [10]. For this case, Neyman and Pearson replace the density at a single parameter value with the maximum of the density over all parameters in that hypothesis. The maximum likelihood ratio test rejects the null hypothesis for large values of sup_{θ∈Θ_A} f_θ(x) / sup_{θ∈Θ_0} f_θ(x). This is a useful method for finding good testing procedures. Stein's method [11,12] integrates the density over θ using special measures to obtain the density of the maximal invariant statistic, which can then be used to analyze the problems; for example, to find the uniformly most powerful invariant test. The Bayesian approach to hypothesis testing uses the integrated likelihood ratio for making a decision.
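The maximum likelihood ratio test just described can be sketched numerically. The sketch below is a minimal illustration for a normal location family; the parameter grids, σ and the simulated data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def glr_statistic(x, theta0_grid, thetaA_grid, sigma=1.0):
    """Maximum likelihood ratio: sup over Theta_A divided by sup over Theta_0."""
    def sup_loglik(grid):
        # maximize the sample log-likelihood over the candidate parameter grid
        return max(norm.logpdf(x, loc=t, scale=sigma).sum() for t in grid)
    # compute the ratio on the log scale for numerical stability
    return np.exp(sup_loglik(thetaA_grid) - sup_loglik(theta0_grid))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50)                  # data generated under theta = 0
lr = glr_statistic(x, theta0_grid=[0.0],
                   thetaA_grid=np.linspace(0.5, 3.0, 26))
```

The test rejects the null hypothesis when the statistic exceeds a critical value chosen for the desired significance level.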
Here π_A is the density of the parameter θ under the alternative supposition, and π₀ is the density of the parameter under the null supposition. In the Bayesian statement, the a priori probability p₀ that the null hypothesis is true is also necessary. Hence it is clear that, in the Bayesian approach, the problems of proper choice of the densities π₀ and π_A and of the a priori probability arise. To overcome this problem, "reference" priors π_A and π₀ were recommended in [13] for many common problems. They must be carefully chosen so that the priors do not overwhelm the data. Other approaches include developing priors using imaginary prior data, and using part of the data as a training sample [14,15]. The conditioning strategy proposed in [16] for the simple versus simple hypothesis was generalized to testing a precise null hypothesis versus a composite alternative hypothesis in [17]. In the latter paper, it is shown that the conditional frequentist method of testing a precise hypothesis can be made virtually equivalent to Bayesian testing. The method was adapted to sequential testing involving precise and composite hypotheses in [18]. Testing of a composite null hypothesis versus a composite alternative when both have a related invariance structure is considered in [19]. Conditional frequentist tests were developed there that allow reporting data-dependent error probabilities. It was shown that the resulting tests are also Bayesian tests, because the reported frequentist error probabilities are also the posterior probabilities of the hypotheses under definite choices of the prior distribution. The new procedures were illustrated in a variety of applications to model selection and testing of multivariate hypotheses. That paper considered the case when both hypotheses are composite, the situation that arises most frequently in practice. The Bayesian test for simple versus two-sided hypotheses in the multivariate case was developed in [20]. The Bayesian and the p-value approaches were
compared. A better approximation was obtained by the Bayesian approach, because the p-value lies in the range of the Bayesian measures of evidence. Testing of simple versus composite alternative hypotheses concerning the mean vector of an asymptotically normal distribution was given in [21].
The numerical algorithm used to compute the inverses was considered there. Different weighting strategies and local asymptotic powers of different test procedures were compared. As a result, it was inferred that no test is uniformly dominated by another; the ranking of the powers varies significantly with the direction of the alternative, and it differs depending on whether the alternative is fixed or local. For simple and composite null hypotheses, the likelihood ratio test (LR test), Wald's test and Rao's score test were derived and turned out to have simple representations in [22]. The asymptotic distribution of the corresponding test statistics under the null hypothesis was stated, and the asymptotic optimality of the LR test in the case of a simple null hypothesis was shown. A power study was performed to measure and compare the quality of the tests for both simple and composite null hypotheses.
The problem of testing simple versus composite hypotheses using the Bayesian approach was considered in [23]. A method of choosing the prior distribution of hypotheses, for the case of a simple basic hypothesis versus a composite alternative hypothesis, was offered for avoiding the so-called Lindley's paradox. The essence of the paradox is the following: for any fixed prior probability of the basic hypothesis, the posterior probability of this hypothesis can be made as close to one as desired, whatever the data, for a sufficiently large prior value of some parameter of the considered distribution under the alternative hypothesis. Similar behavior was found in [24,25] when a uniform distribution over a finite interval was chosen for the parameter under investigation under the alternative hypothesis (though the same type of result holds with any other distributional assumption). Namely, the posterior probability of the basic hypothesis tends to one as the size of the interval increases. The method offered in [23] was concretized for the cases of normal and uniform distributions of the mathematical expectation parameter under a composite alternative hypothesis.
Below we apply CBM to composite hypotheses testing for the cases considered in [23]. The results of the investigation show the high quality of CBM in these cases as well. In particular, it preserves all the positive properties that it has when testing simple and multiple hypotheses. Moreover, it naturally overcomes the problem of Lindley's paradox arising in the classical methods based on the likelihood ratio. These facts are shown theoretically and on the basis of computation of concrete examples taken from [23]. The application of CBM to composite hypotheses testing is described in Section 2. Lindley's paradox in classical hypothesis testing and in CBM is considered in Section 3. Computation results for some examples are presented in Section 4. The discussion of the problem and the conclusions made are given in Sections 5 and 6, respectively.

CBM FOR TESTING COMPOSITE HYPOTHESES
Let us consider the problem of testing composite hypotheses. We assume that the data are represented by x, which has density f_θ(x) for parameter θ. Let us test the hypotheses

H_i : θ ∈ Θ_i, i = 1, ..., S (S ≥ 2), (1)

where the Θ_i are disjoint subsets of the parameter space, i.e. Θ_i ∩ Θ_j = ∅, i ≠ j.
For testing hypotheses (1) there are different possible ways of action. As was mentioned above, one possible way was offered by Neyman and Pearson. They replaced the density at a single parameter value with the maximum of the density over all parameters in that hypothesis, i.e., instead of the density f_θ(x), the supremum sup_{θ∈Θ_i} f_θ(x) is used. Stein's method integrates the density over θ using special measures to obtain the density of the maximal invariant statistic, which can then be used to analyze the problems. The Bayesian approach uses, instead of the density f_θ(x), the integrated density ∫_{Θ_i} f_θ(x) π_i(θ) dθ, where π_i is the density of the parameter conditional on it being in the hypothesis H_i. As was mentioned in [10], the Neyman and Pearson method is an extremely useful method for finding good testing procedures, and Stein's method allows us to find the uniformly most powerful invariant test, though there arises the problem of finding the proper densities π_i, i = 1, ..., S. When testing composite hypotheses (1) using CBM, we can use one of the possible decision rules of the different statements of CBM, depending on the final aim [1,4,9]. In these decision rules, one must use the integrated densities in place of the simple ones.

LINDLEY'S PARADOX AND CBM
The likelihood ratio tests are based on the ratios of the densities f(x | H_i), and it is logically clear that the bigger the difference among these densities, the bigger the evidence for making a proper decision on the basis of the observation results. When the integrated likelihood ratio is used for testing composite hypotheses, unfortunately, it is not possible to make a proper decision for distanced densities for all possible densities π_i, i = 1, ..., S. For example, when testing two hypotheses, null and alternative, the integrated likelihood ratio is

B = ∫ f_θ(x) π_A(θ) dθ / ∫ f_θ(x) π₀(θ) dθ.

The Bayesian approach to hypothesis testing uses this ratio, and the posterior probability that the null hypothesis is true is calculated as

p(H₀ | x) = [1 + ((1 − p₀)/p₀) B]⁻¹,

where p₀ is the a priori probability that the null hypothesis is true.
It is obvious that the posterior odds depend heavily on p₀ and on the individual priors π_A and π₀. It is known that, as the prior π_A becomes increasingly flat, the posterior odds usually approach 0 [10,23]. In [13], "reference" priors π_A and π₀ were recommended so that the priors do not overwhelm the data. Other approaches recommend using part of the data as a training sample. In [14], averaging over training samples was offered for choosing so-called "intrinsic" priors. A number of methods for finding the priors were reviewed in [15]. For choosing p₀, in [13] it was proposed to take p₀ = 1/2, which means that the posterior odds are equal to the Bayes factor.
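The integrated likelihood ratio and the resulting posterior probability of the null can be sketched as follows. This is a minimal illustration for a point null with a normal data model; the function names, the prior `prior_A` over the parameter and all numeric values are illustrative assumptions, not notation fixed by the paper.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def posterior_null(xbar, n, mu0, prior_A, sigma, p0):
    """Posterior of a point null H0: mu = mu0 versus a composite alternative
    whose prior density over mu is prior_A."""
    sd = sigma / np.sqrt(n)                       # std. dev. of the sample mean
    m0 = norm.pdf(xbar, loc=mu0, scale=sd)        # likelihood under the null
    # integrated likelihood under H_A: integral of f(xbar | mu) * prior_A(mu) dmu
    mA, _ = quad(lambda mu: norm.pdf(xbar, loc=mu, scale=sd) * prior_A(mu),
                 -np.inf, np.inf)
    bayes_factor = mA / m0                        # integrated likelihood ratio B
    return 1.0 / (1.0 + (1.0 - p0) / p0 * bayes_factor)

# prior over mu under the alternative: N(0, 2^2), an arbitrary illustrative choice
post = posterior_null(xbar=0.3, n=20, mu0=0.0,
                      prior_A=lambda mu: norm.pdf(mu, 0.0, 2.0),
                      sigma=1.0, p0=0.5)
```

Flattening `prior_A` (increasing its variance) drives the integrated likelihood under the alternative toward zero, which is exactly the sensitivity discussed above.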
The Bayesian rule of testing hypotheses computes the posterior probability of the null hypothesis on the basis of some data D = {x₁, x₂, ..., x_n} and, if it is close to one, accepts the null hypothesis; otherwise it accepts the alternative hypothesis. It is known that, when both the null and the alternative hypotheses have the same dimension, i.e. when both are simple or both are composite, the solution is easily obtained; but when the null hypothesis is simple and the alternative composite, the posterior probability of the null hypothesis may be rather misleading. To show this fact, the following examples were considered in [23]. The first of them consists in the following. Let D = {x₁, x₂, ..., x_n} be a random sample of a random variable with normal distribution. The null and alternative hypotheses are

H₀: µ = µ₀ versus H_A: µ ∼ N(µ₁, σ₁²),

with a priori probabilities p(H₀) = p and p(H_A) = 1 − p.
The arithmetic mean x̄ of the observations is a sufficient statistic, and the posterior probability of the null hypothesis is

p(H₀ | x̄) = p m₀(x̄) / [p m₀(x̄) + (1 − p) m_A(x̄)], (3)

where m₀(x̄) is the density of x̄ under H₀, i.e. the normal density with mean µ₀ and variance σ²/n, and m_A(x̄) is the marginal density of x̄ under H_A, i.e. the normal density with mean µ₁ and variance σ₁² + σ²/n. The final form of (3) is

p(H₀ | x̄) = [1 + ((1 − p)/p) · (σ²/n)^{1/2}/(σ₁² + σ²/n)^{1/2} · exp{(x̄ − µ₀)²/(2σ²/n) − (x̄ − µ₁)²/(2(σ₁² + σ²/n))}]⁻¹. (4)

It is easy to see from (4) that, for any fixed x̄ and p, the right-hand side of (4) tends to one as σ₁² increases, i.e. for a sufficiently large prior variance σ₁², independently of the data values and the a priori probability p, the null hypothesis will always be accepted. This fact is called Lindley's statistical paradox. Similar behavior was mentioned in [24,25] for a uniform distribution of p(µ | H_A) when the size of the interval increases. To avoid this misleading result, it was suggested in [23] not to choose a fixed value of p, but to choose it depending on the form of p(µ | H_A). In particular, we must choose the prior probability that maximizes the amount of missing information about θ. Application of this methodology to the considered case (3) gives posterior (5) for the null hypothesis, which does not tend to one as σ₁² increases.
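The Lindley effect described for (4) is easy to reproduce numerically. The sketch below uses the closed-form marginals of the normal example; all numeric values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def post_h0(xbar, n, mu0, mu1, sigma2, sigma1_sq, p):
    """Posterior of H0 from the two marginal densities of the sample mean."""
    m0 = norm.pdf(xbar, mu0, np.sqrt(sigma2 / n))              # marginal under H0
    mA = norm.pdf(xbar, mu1, np.sqrt(sigma1_sq + sigma2 / n))  # marginal under H_A
    return p * m0 / (p * m0 + (1 - p) * mA)

# xbar = 1.0 is about 3.2 standard errors from mu0 = 0, i.e. the data speak
# against H0, yet for FIXED data the posterior of H0 climbs toward one as the
# prior variance sigma1^2 grows:
posts = [post_h0(xbar=1.0, n=10, mu0=0.0, mu1=0.0, sigma2=1.0,
                 sigma1_sq=v, p=0.5) for v in (1.0, 1e2, 1e4, 1e8)]
```

The sequence `posts` is monotonically increasing toward one, regardless of how strongly the data contradict the null.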
In the second example, under the alternative hypothesis the mathematical expectation is uniformly distributed over a finite interval of length A. The posterior probability of H₀ in this case is given in [23] as expression (6), which, for fixed p, tends to one as A increases.
In this case, the proper choice of the a priori probability in accordance with the above-mentioned methodology gives a result which, substituted into (6), yields a posterior that does not depend on the arbitrary constant A.
Thus, a method of choosing the proper priors for overcoming Lindley's paradox was offered in [23]. But let us imagine the situation when the priors are known on the basis of previous experience or are defined on the basis of a preliminary investigation of the problem. Such existing knowledge is very important, and it is not reasonable to ignore it, as ignoring it can lead to misleading conclusions.
Let us apply one of the CBM statements, for example Task 1, with a restriction on the averaged probability of acceptance of true hypotheses [1,4,9]. For two hypotheses, the restriction has the form

p · P(x ∈ Γ₀ | H₀) + (1 − p) · P(x ∈ Γ_A | H_A) ≥ 1 − α, (10)

where Γ₀ and Γ_A are the acceptance regions of hypotheses H₀ and H_A, respectively, and α is the chosen significance level. For the first example considered above, the regions Γ₀ and Γ_A are defined by inequalities (11) and (12). It is evident that, for any fixed x̄ and p, the left-hand sides of the inequalities defining Γ₀ and Γ_A tend to zero as σ₁² increases. If λ is fixed (as it is in the classical Bayesian statement, where λ = 1), then for a large prior variance σ₁² the inequality in (11) will always be satisfied and that in (12) will never be satisfied, i.e. hypothesis H₀ will always be accepted.
In the considered case, the Lagrange multiplier λ must be chosen so that condition (10) is satisfied. This means that the average power of the test will always be equal to the chosen level 1 − α. The value of λ compensates the influence of σ₁² on the conditions in (11) and (12), and its value depends on the value of α. When σ₁² increases, the value of λ decreases so that condition (10) remains satisfied. Therefore, the sizes of the hypotheses acceptance regions Γ₀ and Γ_A are kept the same for the preservation of condition (10), and a situation like Lindley's paradox never arises.
For the given values of µ₀, µ₁, σ², σ₁² and α, the value of λ is defined from condition (10). That means that, to concrete values of µ₀, µ₁, σ², σ₁² and α, there corresponds a concrete value of λ. Under this condition, it is evident that, when x̄ → µ₀, the probability of satisfying the inequality in (11) increases while the same probability in (12) decreases, i.e. the probability that hypothesis H₀ will be accepted increases and the probability that H_A will be accepted decreases, independently of the value of σ₁². On the other hand, when x̄ → µ₁, the circumstance is diametrically opposite: the probability of satisfying the inequality in (12) increases while that in (11) decreases. This means that the probability of acceptance of H_A increases and the probability of acceptance of H₀ decreases.
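The interplay between λ and restriction (10) can be sketched with a small Monte Carlo experiment. The region form used below, accept H_i when λ · prior_i · m_i(x̄) ≥ prior_j · m_j(x̄), is an assumed reading of the CBM decision rule, and all parameter values and sample sizes are illustrative.

```python
import numpy as np
from scipy.stats import norm

def regions(xbar, lam, p, mu0, mu1, sigma2, sigma1_sq, n):
    g0 = p * norm.pdf(xbar, mu0, np.sqrt(sigma2 / n))
    gA = (1 - p) * norm.pdf(xbar, mu1, np.sqrt(sigma1_sq + sigma2 / n))
    return lam * g0 >= gA, lam * gA >= g0      # (xbar in Gamma_0, xbar in Gamma_A)

def avg_accept_true(lam, p, mu0, mu1, sigma2, sigma1_sq, n, m=20_000, seed=1):
    """Monte Carlo estimate of the left-hand side of restriction (10)."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(mu0, np.sqrt(sigma2 / n), m)         # sample means under H0
    mu = rng.normal(mu1, np.sqrt(sigma1_sq), m)          # mu drawn from the prior
    x1 = rng.normal(mu, np.sqrt(sigma2 / n))             # sample means under H_A
    in0, _ = regions(x0, lam, p, mu0, mu1, sigma2, sigma1_sq, n)
    _, inA = regions(x1, lam, p, mu0, mu1, sigma2, sigma1_sq, n)
    return p * in0.mean() + (1 - p) * inA.mean()

def solve_lambda(alpha, **kw):
    lo, hi = 1e-4, 1e4                  # both regions grow with lam, so the
    for _ in range(60):                 # constraint is monotone in lam
        mid = np.sqrt(lo * hi)          # bisection on the log scale
        lo, hi = (mid, hi) if avg_accept_true(mid, **kw) < 1 - alpha else (lo, mid)
    return hi

kw = dict(p=0.5, mu0=0.0, mu1=1.0, sigma2=1.0, sigma1_sq=5.0, n=10)
lam = solve_lambda(alpha=0.05, **kw)
```

Because λ is recomputed from (10) for every σ₁², the averaged acceptance level stays at 1 − α, which is the mechanism that blocks the Lindley-type degeneration.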
Similarly to [23], let us introduce δ_x̄ and δ₁, the distances in standard units of the sample mean and of the prior mean, respectively, from the null hypothesis. In these terms, the inequalities defining Γ₀ and Γ_A take the forms (13) and (14).
When either σ₁² or n increases, (13) and (14) can be rewritten in a form where ε = (σ²/n)/(σ₁² + σ²/n) is a very small value. From here it is evident that, for fixed σ², σ₁² and n, the only important features are the distances in standard units δ_x̄ and δ₁ from the sample mean and from the prior mean to the null hypothesis. Under the null hypothesis, δ_x̄ will be moderate, (δ_x̄² − δ₁²) will not be too large, and therefore fulfillment of the inequality defining the region Γ₀ is more probable, i.e. it is more probable to accept H₀.
Under the alternative hypothesis, δ_x̄ will increase as n increases, the exponent in (16) will increase, and the probability of fulfilling the inequality defining the region Γ_A (see (16)) increases, i.e. acceptance of H_A is more probable.
Analyzing (13) and (14), we arrive at the following conclusion. Suppose that x̄ → µ₀. Then, for fixed σ², σ₁² and n, P(x ∈ Γ₀) → 1 and P(x ∈ Γ_A) → 0. Suppose now that x̄ → µ₁. Then, for fixed σ², σ₁² and n, P(x ∈ Γ₀) → 0 and P(x ∈ Γ_A) → 1. Let us consider the second example, when, under the alternative hypothesis, the mathematical expectation is uniformly distributed in a finite interval. In this case, the decision-making regions are defined by inequalities (19) and (20). Similarly, for the given µ₀, σ² and n, in the classical Bayesian rule λ = 1 and, when A increases, the inequality in (19) will always be satisfied, while the inequality in (20), on the contrary, will not be satisfied. That means that, for fixed p, the probability of the inequality in (19) tends to one as A increases and the probability of the inequality in (20) tends to zero. In other words, when A increases, the null hypothesis will always be accepted.
In the constrained Bayesian task, for given α, when A increases the value of λ decreases in such a way that condition (10) remains satisfied. For given p, µ₀, σ², n, A and α, the value of λ is defined from condition (10). The smaller the difference between x̄ and µ₀, i.e. the smaller (x̄ − µ₀)², the bigger the probability of satisfying the inequality in (19), which means that the probability of acceptance of H₀ increases. On the other hand, the bigger the divergence between x̄ and µ₀, i.e. the bigger (x̄ − µ₀)², the bigger the probability of the inequality in (20) and the smaller that of the inequality in (19), which means that the probability of acceptance of H_A increases.
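The uniform-alternative version of the paradox can also be checked numerically. In the sketch below the mean under H_A is taken uniform on an interval of length A centred at µ₀; this parametrization and all numbers are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def post_h0_uniform(xbar, n, mu0, A, sigma2, p):
    """Posterior of the point null H0: mu = mu0 when, under H_A,
    mu ~ Uniform(mu0 - A/2, mu0 + A/2)."""
    sd = np.sqrt(sigma2 / n)
    m0 = norm.pdf(xbar, mu0, sd)
    # integrated likelihood under H_A: average of f(xbar | mu) over the interval;
    # points=[xbar] tells quad where the narrow likelihood peak sits
    mA, _ = quad(lambda mu: norm.pdf(xbar, mu, sd) / A,
                 mu0 - A / 2, mu0 + A / 2, points=[xbar])
    return p * m0 / (p * m0 + (1 - p) * mA)

# fixed data, growing interval length A: the posterior of H0 climbs toward one
posts = [post_h0_uniform(xbar=1.0, n=10, mu0=0.0, A=A, sigma2=1.0, p=0.5)
         for A in (4.0, 40.0, 400.0, 4000.0)]
```

For a fixed sample, widening the interval deflates the integrated likelihood under H_A roughly like 1/A, so the posterior of H₀ tends to one, exactly the behavior the text attributes to the classical rule with λ = 1.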

COMPUTATION RESULTS
Let us present the computation results for the considered examples, demonstrating the theoretically investigated properties of the Bayesian method with arbitrary a priori probabilities, the Bayesian method with the a priori probabilities offered by Bernardo, and CBM.
The following notations are used below in addition to the ones introduced above: r is the risk function computed by ratio (9); α₀ is the probability of incorrect rejection of the basic hypothesis; α₁ is the probability of incorrect rejection of the alternative hypothesis; β₀ is the probability of incorrect acceptance of the basic hypothesis; β₁ is the probability of incorrect acceptance of the alternative hypothesis. At λ > 1, p₀₁ is the probability of making no decision because the observed value belongs to the intersection of the acceptance regions of the tested hypotheses, at validity of H₀, and p₁₁ is the same probability at validity of H₁. At λ < 1, p₀₂ is the probability of making no decision because the observed value belongs to neither of the acceptance regions of the tested hypotheses, at validity of H₀, and p₁₂ is the same probability at validity of H₁.
The results given below were obtained by simulation of 10,000 experiments. In each experiment, decisions were made on the basis of the means x̄₀ and x̄₁, computed at validity of hypotheses H₀ and H₁, respectively, with sample sizes equal to n and a priori probability p, and the appropriate error probabilities were computed.
Example 1. The results of computation of the Lagrange multiplier, risk function and error probabilities depending on the variance of the alternative hypothesis in CBM are given in Table 1. Figures 1a, b and c were constructed from these results. All computations for this paper were realized in MATLAB by programs developed by the author of the paper. The codes of these programs are given in the Appendix.
From the computation results, the following is evident. When the variance of the alternative hypothesis increases, the Lagrange multiplier in the decision rule changes: in the beginning it increases, achieves its maximum and then decreases. For the considered example, it increased from 0.4615 for σ₁² = 1 up to the maximum value 9.7657 for σ₁² = 50 and then decreased to 1.0898 for σ₁² = 500. The change of σ₁² from 1 to 500 caused the value of the probability of rejection of the null hypothesis α₀ to fall from 0.0857 to practically zero (0.0004) for σ₁² = 100, after which it increased a little, up to 0.0054 for σ₁² = 500. In this case, the value of the probability of rejection of the alternative hypothesis α₁ increased from 0.0143 for σ₁² = 1 up to 0.0996 for σ₁² = 100 and then decreased to 0.0946 for σ₁² = 500. The probabilities of incorrect acceptance of the basic and alternative hypotheses, β₀ and β₁, in the beginning increased from 0.0024 and 0.0211 up to 0.3147 and 0.3375, respectively, for σ₁² = 30, and then decreased to 0.0965 and 0.0062, respectively, for σ₁² = 500. When λ > 1, the probabilities p₀₁ and p₁₁ of making no decision because the observed value belongs to the intersection of the acceptance regions, at validity of H₀ and H₁, change similarly: from 0.0154 and 0.0119, respectively, for σ₁² = 3 up to 0.3360 and 0.2162 for σ₁² = 30, and then they decrease to 0.0008 and 0.0019 for σ₁² = 500. When λ < 1, the probability p₀₂ of making no decision because the observed value belongs to neither acceptance region, at validity of H₀, decreases from 0.0646 for σ₁² = 1 to 0.0259 for σ₁² = 2, and the probability p₁₂ of the same event at validity of H₁ increases from 0.0119. In accordance with expression (9), the value of the risk function lies in the interval between β₀ and β₁.
As was mentioned above, 10,000 random numbers distributed by the appropriate distribution laws were used for the simulation. Let us explain the essence of the computed values of the considered probabilities from the point of view of decision-making. Let us consider the case σ₁² = 500. In this case, for λ = 1.0898, decisions about the validity of the tested hypotheses were made 10,000 times. Among them, the null hypothesis was incorrectly rejected 54 times, and the alternative hypothesis 946 times; the null hypothesis was incorrectly accepted 965 times, and the alternative hypothesis 62 times. No decision was made 27 times, because the information contained in the sample was not sufficient for making a simple decision with the given reliability: among them, 8 times both hypotheses were supposed to be true and 19 times both hypotheses were supposed to be false. These results seem quite good for such a noisy alternative hypothesis. The results of computation of the Lagrange multiplier, risk function and error probabilities depending on the sample size are given in Table 2. The dependences of the considered probabilities on the sample size n are shown in Figure 2.
Hence the following is obvious. When the sample size n increases from 1 to 50, the Type II error probabilities for both hypotheses, i.e. the probabilities of incorrect acceptance of both hypotheses, and the risk function, which is the averaged value of these probabilities, decrease; they tend to zero (see Figure 2a). The Type I error probabilities for both hypotheses, i.e. the probabilities of incorrect rejection of both hypotheses, change in such a manner that their sum weighted by the a priori probabilities is not more than α, i.e. restriction (10) in the statement of the constrained Bayesian task is satisfied. The changes have the following character. The probability of rejection of the basic hypothesis decreases and the probability of rejection of the alternative hypothesis increases when the sample size n increases from 1 up to 10. After that, i.e. when n increases from 10 up to 50, the probability of rejection of the basic hypothesis increases and the probability of rejection of the alternative hypothesis decreases (see Figure 2a). These two probabilities are equal to each other when n = 37. The described situation can be explained completely from the point of view of information theory. In particular, the more information about the studied phenomenon there is, the more reliable the decision made in relation to this phenomenon must be. In the considered case, the increase of information, i.e. the increase of the sample size n, causes the decrease of the Type II error probabilities while, by the condition of the considered problem, the Type I errors stay within the chosen restriction (10). Small or large values of the variance of the alternative hypothesis correspond to an increase of information about the tested hypotheses. Therefore, to small or large values of the variance of the alternative hypothesis correspond small values of the Type II errors, while the weighted sum of the Type I error probabilities satisfies restriction (10) (see Figure 1a).
Constrained Bayesian methods of hypotheses testing have the following specificity [1,9]: when the Lagrange multiplier λ differs from one, there appear in the observation space a region of no acceptance of either hypothesis and a region of supposition of the validity of both hypotheses. In the considered case, p₀₁ and p₁₁ are the probabilities of supposing both hypotheses valid when the basic and the alternative hypothesis, respectively, is true, and p₀₂ and p₁₂ are the probabilities of accepting neither hypothesis when the basic and the alternative hypothesis, respectively, is true. When n changes from 1 up to 50, the value of λ changes: in the beginning it is more than one and, after n = 22, it becomes less than one (see Figure 2c). Therefore, in the computed values (see Table 2), up to n = 25, p₀₂ and p₁₂ are equal to zero, and after that p₀₁ and p₁₁ become equal to zero. The increase of n corresponds to an increase of the existing information about the tested hypotheses, and to the increasing information corresponds an increase of the information distance between the tested hypotheses. The increase of the information distance logically causes a decrease of the intersection of the hypotheses acceptance regions because of restriction (10), and the decrease of the intersection causes a decrease of the probabilities p₀₁ and p₁₁. When the information distance increases beyond some value (determined by restriction (10)), the acceptance regions of the tested hypotheses no longer intersect, and a further increase of the distance causes the appearance of a region that belongs to neither acceptance region. The bigger the distance, the bigger this latter region, i.e. the bigger the sum of the probabilities p₀₂ and p₁₂. The changes of the probabilities p₀₁, p₁₁, p₀₂ and p₁₂ given in Figure 2b correspond completely to this logic.
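The four decision outcomes described above (accept H₀ only, accept H_A only, suppose both valid, accept neither) can be tallied in a simulation sketch. The region form, accept H_i when λ · prior_i · m_i ≥ prior_j · m_j, is an assumed reading of the CBM rule, and all parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def tally(xbar, lam, p, mu0, mu1, sigma2, sigma1_sq, n):
    """Classify each simulated sample mean into one of the four CBM outcomes."""
    g0 = p * norm.pdf(xbar, mu0, np.sqrt(sigma2 / n))
    gA = (1 - p) * norm.pdf(xbar, mu1, np.sqrt(sigma1_sq + sigma2 / n))
    in0, inA = lam * g0 >= gA, lam * gA >= g0
    return {"H0_only": np.mean(in0 & ~inA), "HA_only": np.mean(inA & ~in0),
            "both": np.mean(in0 & inA), "neither": np.mean(~in0 & ~inA)}

rng = np.random.default_rng(2)
x0 = rng.normal(0.0, np.sqrt(1.0 / 10), 10_000)   # sample means simulated under H0
common = dict(p=0.5, mu0=0.0, mu1=1.0, sigma2=1.0, sigma1_sq=5.0, n=10)
wide = tally(x0, lam=2.0, **common)     # lam > 1: regions intersect
narrow = tally(x0, lam=0.5, **common)   # lam < 1: a gap appears between regions
```

With λ > 1 the two inequalities can hold simultaneously but never both fail, so the "neither" outcome is impossible; with λ < 1 it is the "both" outcome that is impossible, which mirrors the roles of p₀₁, p₁₁ and p₀₂, p₁₂ above.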
If there is no other objective reason, the choice of the a priori probability p in accordance with Bernardo solely for overcoming Lindley's paradox, on the one hand, unjustifiably and significantly increases the probability of incorrect rejection of the basic hypothesis (α₀) and the probability of incorrect acceptance of the alternative hypothesis (β₁) and, on the other hand, unjustifiably and significantly decreases the probability of incorrect rejection of the alternative hypothesis (α₁) and the probability of incorrect acceptance of the basic hypothesis (β₀) (see p_opt in the lines of Table 2 at n = 1 and n = 5).
The results of computation of the Lagrange multiplier, risk function and error probabilities depending on the a priori probability are given in Table 3. Figures 3a, b and c were constructed from these results. Hence the following is clear. With increasing a priori probability p of the basic hypothesis and, accordingly, decreasing a priori probability (1 − p) of the alternative hypothesis, the probability β₀ of acceptance of the basic hypothesis when it is not true increases up to p = 0.65 and then decreases. At the same time, the probability β₁ of acceptance of the alternative hypothesis when it is not true decreases (see Figure 3a). The risk function is the sum of these two probabilities weighted by the a priori probabilities (see (9)). Therefore, the curve of the risk function follows the form of β₁ up to p = 0.5 and then the form of β₀ (see Figure 3a).
As was investigated in [1], a change of the a priori probability affects the decision made. This influence is not as significant as in the classical Bayesian test, but it is still present in the constrained test too, especially when p is very small (< 0.1) or very big (> 0.8) (see Table 3 and Figure 3a). When p changes from 0.1 up to 0.99, the probability α₀ of rejection of the basic hypothesis when it is true decreases up to p = 0.8 and then increases. At the same time, the probability α₁ of rejection of the alternative hypothesis when it is true increases from zero up to 0.4949, so that restriction (10) in the statement of the task is fulfilled. (Remark: here it must be taken into account that α₀ and α₁ from Table 3 contain the probabilities of making no decision, i.e. the probabilities p₀₁, p₁₁, p₀₂ and p₁₂, and, after their exclusion, restriction (10) is fulfilled.)
The Lagrange multiplier λ changes depending on p: when p changes from 0.1 up to 0.99, λ increases from 0.1235 up to 9.8921 for p = 0.65 and then decreases to 0.0059 (see Figure 3c). The probabilities of no decision p₀₁, p₁₁, p₀₂ and p₁₂ change depending on λ. When λ < 1, which corresponds to p < 0.1 and p > 0.8, p₀₁ and p₁₁ are equal to zero. When 0.1 ≤ p ≤ 0.8, p₀₁ and p₁₁ increase in the beginning and then decrease. The probabilities p₀₂ and p₁₂ are equal to zero when 0.1 ≤ p ≤ 0.8, i.e. when λ > 1; they decrease to the left of this interval and increase to the right of it.
Let us present the simulation results for the considered example obtained by the classical Bayesian method and with Bernardo's correction. Computation results of the risk function and the error probabilities depending on the variance of the alternative hypothesis and on the sample size are given in Tables 4 and 5, respectively, for both methods. The appropriate graphs are shown in Figures 4 and 5, respectively. From these results, the following is evident. The property of (4) that, for a sufficiently large prior variance and a given a priori probability p, the null hypothesis will always be accepted is theoretically right. But, when the decision is made on the basis of real data (or of modeled results, as in the considered case), the situation changes. It is true that the coefficient in (4) tends to zero as σ1² increases. However, when the variance σ1² is very big, the arithmetic mean x̄ of the observation results (for a small number of observations n), under the validity of hypothesis H1, differs significantly from the mathematical expectations µ0 and µ1, and the exponent in the denominator is significantly smaller than the exponent in the numerator. The result of their division is so big that its multiplication by the coefficient exceeds one, and the inverse quantity is less than one. Therefore, in such cases hypothesis H1 is accepted (see the columns α0 = β1 and α1 = β0 in Table 4, which give the results of modeling for the classical Bayesian method). For this reason, for real data the classical Bayesian method gives considerably better results than Bernardo's modification (see Table 4 and Figure 4). In the latter method, the probability of rejection of the basic hypothesis and the probability of acceptance of the alternative hypothesis converge to one as σ1² increases, which is incorrect. The advantage of the classical Bayesian method over Bernardo's correction depending on the sample size is evident from the data given in Table 5 and Figure 5.

Table 5: Computation Results of the Risk Function and Error Probabilities Depending on the Sample Size
Initial values m = 10,000, p = 0.5
Computed values for classical Bayesian method
Computed values for Bernardo's correction
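The two regimes discussed here (acceptance of H0 for fixed data as the prior variance grows, and acceptance of H1 when the data are actually generated under a diffuse alternative) can be reproduced with a small numerical sketch. The normal/normal marginal likelihood used below and the specific numbers (µ0 = µ1 = 3, σ² = 1, n = 5) are illustrative assumptions, not the paper's formula (4) itself.

```python
import math

def npdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def post_prob_h0(xbar, n, mu0, mu1, sigma2, sigma1_2, p):
    """Posterior probability of H0: mu = mu0 against H1: mu ~ N(mu1, sigma1_2),
    given the mean xbar of n observations with known variance sigma2."""
    se2 = sigma2 / n                        # variance of the sample mean
    m0 = npdf(xbar, mu0, se2)               # likelihood of xbar under H0
    m1 = npdf(xbar, mu1, sigma1_2 + se2)    # marginal likelihood under H1
    return p * m0 / (p * m0 + (1 - p) * m1)

# Lindley's paradox: for a FIXED sample mean, inflating the prior variance of H1
# pushes the posterior probability of H0 towards one.
probs = [post_prob_h0(xbar=3.9, n=5, mu0=3.0, mu1=3.0,
                      sigma2=1.0, sigma1_2=v, p=0.5)
         for v in (1.0, 10.0, 100.0, 1000.0)]

# But if the data really come from a diffuse H1, xbar is typically far from mu0,
# and H1 wins despite the huge prior variance:
far = post_prob_h0(xbar=20.0, n=5, mu0=3.0, mu1=3.0,
                   sigma2=1.0, sigma1_2=1000.0, p=0.5)
```

The first computation shows the paradox for a fixed x̄; the second shows why, in simulation under a diffuse H1, the classical Bayesian test nevertheless accepts H1, in line with the behavior reported in Table 4.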
The computation results of the risk function and the error probabilities depending on the a priori probability in the Bayesian test are given in Table 6; the appropriate graphs are shown in Figure 6. Hence it is evident that, when p increases, α0 (which coincides with β1) decreases and α1 (which coincides with β0) increases; moreover, the bigger σ1² is, the faster these changes are. That is, to a bigger value of the variance σ1² (which defines the indefiniteness in the alternative hypothesis H1) there corresponds a smaller rejection probability of the basic hypothesis H0 and a bigger rejection probability of the alternative hypothesis H1 for one and the same p. In other words, for a given value of p, when the variance σ1² increases, the frequency of rejection of the basic hypothesis H0 decreases, i.e. it is accepted more frequently, and the frequency of rejection of the alternative hypothesis H1 increases, i.e. it is rejected more frequently. This completely corresponds to the theoretical property of this test noted above.
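The dependence on p can be checked by direct simulation. The sketch below estimates the frequency α0 of rejecting H0 when it is true for two values of p, under assumed normal hypotheses (µ0 = µ1 = 3, σ² = 1, σ1² = 5, n = 5); the number of runs m = 2000 and the seed are arbitrary choices, not the paper's settings.

```python
import math
import random

def npdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def rejects_h0(xbar, p, n=5, mu0=3.0, mu1=3.0, sigma2=1.0, s1=5.0):
    """Bayesian test: reject H0 when its posterior weight is the smaller one."""
    se2 = sigma2 / n
    w0 = p * npdf(xbar, mu0, se2)             # posterior weight of H0
    w1 = (1 - p) * npdf(xbar, mu1, s1 + se2)  # posterior weight of H1
    return w0 < w1

def alpha0(p, m=2000, n=5, mu0=3.0, sigma2=1.0, seed=7):
    """Monte Carlo estimate of P(reject H0 | H0 true)."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2 / n)                # sd of the sample mean under H0
    rejected = sum(rejects_h0(rng.gauss(mu0, sd), p) for _ in range(m))
    return rejected / m

a_small_p, a_big_p = alpha0(0.2), alpha0(0.8)
```

In agreement with the behavior described above, the rejection frequency of H0 drops sharply as p grows (here roughly 0.5 at p = 0.2 against roughly 0.01 at p = 0.8).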
The choice of the a priori probability p in accordance with Bernardo's condition does not change the natural behavior of the investigated probabilities (see the cases p = 0.1640 for σ1² = 5 and p = 0.0753 for σ1² = 30). It only avoids the effect of the Lindley's paradox for a fixed x̄.
Dependences of the risk function and the error probabilities on the variance of the alternative hypothesis in the considered tests are shown in Figure 7. As was mentioned above, for the classical and Bernardo methods the following equalities hold: α0 = β1 and α1 = β0. In CBM these probabilities differ from each other. Hence the optimality of CBM in comparison with the considered tests is obvious.
Example 2. To save space, we present similar computation results for the second example with minimal explanations, because the behavior and the relations of the results obtained by the considered tests are similar to those of the first example.
On the basis of the results given in Tables 13 and 14 and in Figures 14 and 15, the following is evident. The probability of rejection of the basic hypothesis when it is true (which is the same as the probability of acceptance of the alternative hypothesis when it is false) tends to one as the variance of the basic hypothesis increases. At the same time, the probability of rejection of the alternative hypothesis when it is true (which is the same as the probability of acceptance of the basic hypothesis when it is false) tends to zero. For Bernardo's correction, the first of these probabilities practically does not change and the second one tends to one. It is not difficult to see that this behavior is incorrect because, with increasing indefiniteness in relation to the basic hypothesis, the probability of its rejection must tend to one and the probability of its acceptance must tend to zero, as they do in the classical Bayesian method. The behavior of CBM coincides in principle with that of the classical Bayesian test, but it is more cautious: with increasing σ², the indefiniteness in relation to both hypotheses increases, and it becomes more and more difficult to make a correct decision about the tested hypotheses. Accordingly, the regions of acceptance of both hypotheses increase (see Γ0 and Γ1) and, consequently, the region of their intersection increases too. Therefore, the probabilities p01 and p11 of supposing both hypotheses to be true increase. Because α0 → 0.1 and α1 → 0, we obtain p01 → 0.9 and p11 → 1; at the same time, the risk function r → 1. Because of the increasing acceptance regions of both hypotheses, the regions which belong to neither acceptance region become empty, and hence the corresponding probabilities p02 and p12 become equal to zero.
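The interplay between λ, the overlap of the acceptance regions (reflected in p01, p11) and the no-decision gap (reflected in p02, p12) can be illustrated with a deliberately simplified two-hypothesis rule: each hypothesis is kept while its posterior probability is at least 1/(1 + λ). This threshold form is an illustrative assumption, not the paper's exact CBM regions.

```python
def cbm_outcome(post0, lam):
    """Toy CBM-style decision for two hypotheses with posterior probabilities
    post0 and 1 - post0: keep Hi while its posterior is >= 1/(1 + lam).
    lam > 1 lowers the bar below 1/2, so both hypotheses can survive
    (the overlap counted by p01, p11); lam < 1 raises the bar above 1/2,
    so both can fail (the no-decision gap counted by p02, p12)."""
    t = 1.0 / (1.0 + lam)
    in0, in1 = post0 >= t, (1.0 - post0) >= t
    if in0 and in1:
        return "both"       # point lies in the intersection of the regions
    if not (in0 or in1):
        return "neither"    # point falls in the gap between the regions
    return "H0" if in0 else "H1"
```

For example, with λ = 3 an ambiguous point such as post0 = 0.5 lies in both acceptance regions, while with λ = 0.2 the same point belongs to neither; only λ = 1 reproduces the usual Bayesian split at 1/2.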
The graphs of the dependences of the risk function and the error probabilities of the considered methods on the length of the domain of definition of the uniformly distributed random variable under the alternative hypothesis are given below. The optimality of CBM in comparison with the other two tests is obvious (similarly to the first example, shown in Figure 7).

DISCUSSION
The computation results show that choosing the a priori probabilities on the basis of a preliminarily stated aim does not make sense, because it degrades the results of the test. A priori probabilities must be chosen on the basis of objective information [14,26], and the more objectively they are chosen, the better the final results. Choosing the a priori probabilities on the basis of the same sample that is used for making the decision is not a good solution either, because such probabilities are subjective: they are determined by a concrete sample, i.e. by the information contained in that sample. The purpose of the a priori probabilities in the Bayesian statement is to decrease the subjective factor (which is present in a concrete sample, since the sample is determined by concrete values of the random component) on the basis of previous knowledge and experience. If there is no objective information for determining the a priori probabilities of the tested hypotheses, it is better to assign equal probabilities to all hypotheses.

CONCLUSION
The optimality of CBM for testing composite hypotheses is shown. In particular, it is shown that the risk of the Lindley's paradox does not exist for it. The superiority of CBM in comparison with the Bayesian test, with both the classical and Bernardo's choice of the a priori probabilities of the tested hypotheses, is shown by theoretical consideration supported by computation results of many characteristics for different practical examples. The necessity of testing composite hypotheses arises very often in solving practical problems; in particular, such tests are very important in many medical and biological applications. It is shown that CBM keeps its optimal properties at testing composite hypotheses and therefore gives great opportunities for making reliable decisions in theory and practice.

Figure 1: Dependences of Lagrange multiplier, risk function and error probabilities on the variance of the alternative hypothesis in the constrained Bayesian test.

Figure 2: Dependences of Lagrange multiplier, risk function and error probabilities on the sample size in the constrained Bayesian test.

Figure 3: Dependences of Lagrange multiplier, risk function and error probabilities on the a priori probability in the constrained Bayesian test.

Figure 4: Dependences of the risk function and error probabilities on the variance of the alternative hypothesis in the Bayesian and Bernardo tests.

Figure 5: Dependences of the Lagrange multiplier, the risk function and error probabilities on the sample size in the Bayesian and Bernardo tests.

Figure 6: Dependences of the risk function and error probabilities on the a priori probability in the Bayesian test.

Figure 7: Dependences of the risk function and error probabilities on the variance of the alternative hypothesis in the considered tests for example 1.

Table 8:
Figure 8: Dependences of the Lagrange multiplier, the risk function and error probabilities on the length of the domain of definition of uniformly distributed random variable at the alternative hypothesis in the constrained Bayesian test.

Figure 9: Dependences of the Lagrange multiplier, the risk function and error probabilities on the sample size in the constrained Bayesian test.

Figure 10: Dependences of the Lagrange multiplier, the risk function and error probabilities on the a priori probability in the constrained Bayesian test.

Figure 11: Dependences of the risk function and error probabilities on the interval of definition of uniformly distributed random variable in the Bayesian and Bernardo tests.

Figure 12: Dependences of the risk function and error probabilities on the sample size in the Bayesian and Bernardo tests.

Table 12: Computation Results of the Risk Function and Error Probabilities in the Bayesian and Bernardo Tests Depending on the a Priori Probability
Initial values m = 10,000, n = 5, µ0 = 3, σ² = 1, A = 10

Figure 13: Dependences of the risk function and error probabilities on the a priori probability in the Bayesian and Bernardo tests.

Table 13: Computation Results of the Risk Function and Error Probabilities in the Bayesian and Bernardo Tests Depending on the Variance of the Basic Hypothesis
Initial values m = 10,000, n = 5, µ0 = 3, p = 0.5, A = 10, α = 0.05
Computed values for classical Bayesian method
Computed values for Bernardo's correction

Figure 14: Dependences of the risk function and error probabilities in the Bayesian and Bernardo tests depending on the variance of the basic hypothesis.
Table 14: Computation Results of the Lagrange Multiplier, the Risk Function and Error Probabilities of CBM Depending on the Variance of the Basic Hypothesis
Initial values m = 10,000, n = 5, µ0 = 3, p = 0.5, A = 10, α = 0.05

Figure 15: Dependences of the risk function and error probabilities of CBM depending on the variance of the basic hypothesis. a) Probabilities of incorrect decision making. b) Probabilities of not making a decision.

Figure 16: Dependences of the risk function and error probabilities on the length of the domain of definition of the uniformly distributed random variable with the alternative hypothesis in the considered tests for example 2.
Pearson or Stein's methods, depending on the existing information and the desired properties of the decisions to be made. The integrated densities are used in the examples considered below with the aim of studying CBM on the Lindley's paradox problem.

Table 1: Computation Results of Lagrange Multiplier, Risk Function and Error Probabilities Depending on the Variance of the Alternative Hypothesis
Initial values m = 10,000

Table 2: Computation Results of Lagrange Multiplier, Risk Function and Error Probabilities Depending on the Sample Size
Initial values m = 10,000, α = 0.05
*These two lines of computation results are obtained for the optimal (in accordance with Bernardo, for overcoming the Lindley's paradox) values of the a priori probability at the appropriate sample sizes.

Table 3: Computation Results of Lagrange Multiplier, Risk Function and Error Probabilities Depending on the a Priori Probability
Initial values m = 10,000, n = 5, α = 0.05

Table 4: Computation Results of the Risk Function and Error Probabilities Depending on the Variance of the Alternative Hypothesis
Initial values m = 10,000, n = 5

Table 7: Computation Results of the Lagrange Multiplier, the Risk Function and Error Probabilities of CBM Depending on the Length of the Domain of Definition of the Uniformly Distributed Random Variable under the Alternative Hypothesis
Initial values m = 10,000, n = 5