Hardy-Weinberg Equilibrium as Foundational

The Hardy-Weinberg Principle explains how random mating can produce and maintain a population in equilibrium, that is, with constant genotypic proportions. The Hardy-Weinberg formula is in constant use as a basis for developing population genetics theory. Here we give a complete description of a model which can sustain equilibrium under a general mating system, thereby giving a much broader basis on which to develop population genetics. It was S. N. Bernstein who first showed how Mendel’s first law could be justified simply on the basis of observations of populations in equilibrium. We show how the model can be applied to exploring the change in incidence of a genetic disorder.


INTRODUCTION
Penrose gives the conventional introduction to the Hardy-Weinberg Principle of population genetics [1]. He uses it in his superb exposition of the then current knowledge of the aetiology of mental defect. He derives the Hardy-Weinberg formula giving the frequencies of genotypes in a population in equilibrium; for genotypes $AA$, $Aa$ and $aa$, with allele frequencies $p$ and $q = 1 - p$, these are
$$\{p^2,\ 2pq,\ q^2\}. \tag{1}$$
Penrose uses (1), inter alia, to explain how, in the case of recessive inheritance, the great majority of defectives would be derived from normal parents.
The usual textbook derivation of (1) takes as foundational Mendel's coefficients of heredity (for example, that the relative frequency of $aa$ offspring in $Aa \times Aa$ parental crossings is $\tfrac14$), together with random mating between individuals in the parental generation. Mendel's experimental approach had relied on the collation of outcomes of single pollination events. Around 1920 S. N. Bernstein (see Seneta [6]), using axiomatic reasoning, derived Mendel's first law of inheritance (see Bernstein, 1942 for the mathematical derivation): that is, the structure of the Mendelian coefficients of heredity [5]. By contrast with Mendel's experimental approach, Bernstein postulated a population with many members which was maintained in equilibrium by a system of random mating. From these basic assumptions Bernstein derived Mendel's coefficients of heredity, which we put in matrix form in the section on notation below.
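The textbook route just described can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes genotypes indexed 0 = AA, 1 = Aa, 2 = aa, uses the standard Mendelian offspring proportions, and the helper name `next_generation` and the starting frequencies are hypothetical choices made only for illustration.

```python
from fractions import Fraction as Fr

HALF, QUARTER = Fr(1, 2), Fr(1, 4)

# Mendel's coefficients of heredity: offspring proportions (AA, Aa, aa)
# for each ordered parental pair, genotypes indexed 0 = AA, 1 = Aa, 2 = aa.
MENDEL = {
    (0, 0): (1, 0, 0),
    (0, 1): (HALF, HALF, 0),
    (0, 2): (0, 1, 0),
    (1, 0): (HALF, HALF, 0),
    (1, 1): (QUARTER, HALF, QUARTER),
    (1, 2): (0, HALF, HALF),
    (2, 0): (0, 1, 0),
    (2, 1): (0, HALF, HALF),
    (2, 2): (0, 0, 1),
}

def next_generation(parents):
    """Offspring genotype frequencies under random mating (hypothetical helper)."""
    offspring = [Fr(0), Fr(0), Fr(0)]
    for (i, j), coeffs in MENDEL.items():
        pair_freq = parents[i] * parents[j]  # random mating: product of parental frequencies
        for k in range(3):
            offspring[k] += pair_freq * coeffs[k]
    return tuple(offspring)

# Illustrative parental frequencies that are NOT in Hardy-Weinberg form;
# the frequency of allele A is 1/10 + (2/5)/2 = 3/10.
parents = (Fr(1, 10), Fr(2, 5), Fr(1, 2))
gen1 = next_generation(parents)
gen2 = next_generation(gen1)
```

With these inputs `gen1` is in the form $\{p^2, 2pq, q^2\}$ with $p = 3/10$, and `gen2` equals `gen1`, which is the sense in which random mating both produces and maintains the equilibrium.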
Stark & Seneta [7] put Bernstein's result in a more general context, and gave a more transparent derivation, by modifying the mating system. The population is still assumed to be maintained in equilibrium, which thus becomes foundational, but is not necessarily in the form of (1). A more elaborate set of conditions is needed to specify the equilibrium than is required in a model of random mating. However, there is a considerable gain in generality and realism beyond that provided by the Hardy-Weinberg model.
The first purpose of this note is to delineate visually the region of equilibrium, which varies according to the gene frequency. It is three-dimensional, with simpler form when the gene frequency is less than or equal to $\tfrac14$. Thus, although the basic nature of the system is simple, the full description of the region is less so. This is an extension of work of Stark [8].
We begin by introducing notation which enables a convenient description of the mating system without resort to cumbersome tables. This is followed by descriptions of the admissible regions of equilibrium. Then we point out the highly restrictive nature of the model of random mating.
Our final section is a discussion of how Hardy-Weinberg equilibrium, as described by (1), can be achieved under mating which is not necessarily random. Our illustration is in terms of Tay-Sachs disease. We demonstrate by this example how the model of equilibrium specified by (7) below can provide a broader basis for developing many of the concepts treated by Penrose [1].

NOTATION
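We deal only with a single autosomal locus with two alleles $U$ and $T$ with frequencies in the population $q$ and $p$ ($q + p = 1$). Throughout, $q$ remains constant because this is guaranteed by the nature of the selected mating system. A set of frequencies of genotypes $\{UU, UT, TT\}$ can be represented in terms of $q$ and a measure of departure from Hardy-Weinberg (HW) form, $F$, as, say, $\{q^2 + Fpq,\ 2pq - 2Fpq,\ p^2 + Fpq\}$. These will vary according to $F$ and will be denoted generally by $\{f_0, f_1, f_2\}$. The population is maintained in discrete generations according to the mating scheme
$$\begin{pmatrix} UU \times UU & UU \times UT & UU \times TT \\ UT \times UU & UT \times UT & UT \times TT \\ TT \times UU & TT \times UT & TT \times TT \end{pmatrix} \tag{2}$$
with commensurate pairing frequencies given by the matrix
$$C = \begin{pmatrix} f_{00} & f_{01} & f_{02} \\ f_{10} & f_{11} & f_{12} \\ f_{20} & f_{21} & f_{22} \end{pmatrix}. \tag{3}$$
$C$ is symmetric, that is $f_{ij} = f_{ji}$, with row and column sums $\{f_0, f_1, f_2\}$. This triplet of sums is the parental frequency distribution.
Below we use $C$ in the extended (row vector) form
$$\mathbf{c} = (f_{00},\ f_{01},\ f_{02},\ f_{10},\ f_{11},\ f_{12},\ f_{20},\ f_{21},\ f_{22}). \tag{4}$$
To follow the progression of generations we need Mendel's coefficients of heredity, given in matrix form, with rows in the order of (4) and columns giving the proportions of $UU$, $UT$ and $TT$ offspring, by
$$M = \begin{pmatrix} 1 & 0 & 0 \\ \tfrac12 & \tfrac12 & 0 \\ 0 & 1 & 0 \\ \tfrac12 & \tfrac12 & 0 \\ \tfrac14 & \tfrac12 & \tfrac14 \\ 0 & \tfrac12 & \tfrac12 \\ 0 & 1 & 0 \\ 0 & \tfrac12 & \tfrac12 \\ 0 & 0 & 1 \end{pmatrix}. \tag{5}$$
Then the frequency distribution of juveniles is calculated from
$$\mathbf{c} M. \tag{6}$$
The population is in equilibrium, that is, the distribution of juveniles is the same as that of adults, if and only if matrix $C$ has, in addition to the properties given above, the special property
$$f_{11} = 4 f_{02}. \tag{7}$$
The notation used here is a modified version of that given in Stark & Seneta [7,9]. Identity (7) allows for non-random mating (NRM) as well as random mating (RM).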
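The equilibrium condition can be illustrated with a short computation. This is a sketch under stated assumptions rather than the authors' own formulation: it takes the equilibrium identity to be $f_{11} = 4 f_{02}$ (which follows from equating the juvenile and adult $UU$ frequencies under Mendel's coefficients), and the function names `genotype_freqs`, `mating_matrix` and `juveniles`, as well as the numerical inputs, are hypothetical.

```python
from fractions import Fraction as Fr

H, Q4 = Fr(1, 2), Fr(1, 4)

# Mendel's coefficients: rows in the order UUxUU, UUxUT, UUxTT, UTxUU, ..., TTxTT;
# columns are the proportions of UU, UT and TT offspring.
MENDEL = [
    (1, 0, 0), (H, H, 0), (0, 1, 0),
    (H, H, 0), (Q4, H, Q4), (0, H, H),
    (0, 1, 0), (0, H, H), (0, 0, 1),
]

def genotype_freqs(q, F):
    """Parental distribution {f0, f1, f2} for allele frequency q and departure F."""
    p = 1 - q
    return [q * q + F * p * q, 2 * p * q - 2 * F * p * q, p * p + F * p * q]

def mating_matrix(q, F, f11, f01, f02=None):
    """Symmetric C with row sums {f0, f1, f2}; f02 defaults to f11/4 (assumed identity)."""
    f0, f1, f2 = genotype_freqs(q, F)
    if f02 is None:
        f02 = f11 / 4
    f00 = f0 - f01 - f02
    f12 = f1 - f01 - f11
    f22 = f2 - f02 - f12
    return [[f00, f01, f02], [f01, f11, f12], [f02, f12, f22]]

def juveniles(C):
    """Juvenile distribution: C flattened to a row vector times Mendel's matrix."""
    flat = [x for row in C for x in row]
    return [sum(c * m[k] for c, m in zip(flat, MENDEL)) for k in range(3)]

q = Fr(3, 10)
adults = genotype_freqs(q, 0)
balanced = mating_matrix(q, 0, Fr(1, 5), Fr(1, 50))                   # f02 = f11/4
unbalanced = mating_matrix(q, 0, Fr(1, 5), Fr(1, 50), f02=Fr(1, 25))  # f02 != f11/4
```

Here `juveniles(balanced)` reproduces `adults` even though `balanced` is not the random-mating matrix, whereas `juveniles(unbalanced)` does not, although both matrices have the same row sums.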

REGIONS OF ADMISSIBLE POINTS
Mating matrix $C$, displayed in (3), is determined by combinations of $q$, $F$, $f_{11}$, and $f_{01}$. We first fix $q$, then consider sets of $F$, $f_{11}$, and $f_{01}$ as points in 3-dimensional space by giving their coordinates with reference to the orthogonal axes shown in Figure 1. Without loss of generality we take $0 < q \le \tfrac12$ throughout. The shape of the region of admissible points is governed by the elements $f_{ij}$ of $C$: that they sum to 1, etc. So, while the requirements are basically simple, they require detailed description. The regions are of three main types, depending on $q$: (i) $\tfrac14 < q < \tfrac12$; (ii) $q \le \tfrac14$; (iii) $q = \tfrac12$. We describe these separately.
(i) $\tfrac14 < q < \tfrac12$. Points $(F, f_{11}, f_{01})$ on and between the bounding planes are admissible, that is to say correspond to admissible $C$. The faces of the region are planar, the planes defined by $f_{11} = 0$ ($\Pi_1$) and $f_{01} = 0$ ($\Pi_2$) being two main ones. The remaining planes, $\Pi_3$, $\Pi_4$ and $\Pi_5$, on which $f_{00}$, $f_{12}$ and $f_{22}$ respectively vanish, are defined by the equations
$$f_{01} + \tfrac14 f_{11} = q^2 + Fpq, \tag{8}$$
$$f_{01} + f_{11} = 2pq(1 - F), \tag{9}$$
$$f_{01} + \tfrac34 f_{11} = 2pq - p^2 - 3Fpq. \tag{10}$$
Figure 2 is a schematic of this region. The coordinates of the vertices are given in Table 1. Other points of reference, not shown on Figure 2, are: O $(-q/p, 0, 0)$; B $((p - 2q)/(3p), 0, 0)$; N $((2p - q)/(3p), 0, 0)$. The distance between O and V is $1/p$, that between O and B is $1/(3p)$, that between O and N is $2/(3p)$, and therefore that between B and N is $1/(3p)$. The bounding planes, identified by their vertices, are as follows: $\Pi_1$: AQVZ; $\Pi_2$: QVDE; $\Pi_3$: AZDE; $\Pi_4$: VDZ; $\Pi_5$: AQE. In Figure 2, the plane shaded in lime-green is $\Pi_2$, that is when $f_{01} = 0$. To assist with orientation, the line segment QV lies along the $F$ axis, so that the corner letters of $\Pi_2$ lie in the $F$-$f_{11}$ plane.
(ii) $q \le \tfrac14$. In Figure 3, the plane shaded in lime-green is $\Pi_2$, that is when $f_{01} = 0$. To assist with orientation, the line segment AV lies along the $F$ axis.
(iii) $q = \tfrac12$. This region is shown in Figure 4. The planes are: $\Pi_1$: ANVZ; $\Pi_2$: NVD; $\Pi_3$: AZD; $\Pi_4$: VDZ; $\Pi_5$: AND. When $q = \tfrac12$, points D and E coalesce, as do Q and N. When $q = \tfrac14$, points A, E and O coalesce. In Figure 4, the plane shaded in lime-green is $\Pi_2$, that is when $f_{01} = 0$. To assist with orientation, the line segment NV lies along the $F$ axis.
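Whether a given point $(F, f_{11}, f_{01})$ lies in the region can be tested directly, since a point is admissible exactly when every entry of the implied matrix $C$ is non-negative. The sketch below makes two assumptions that are not spelled out in the surviving text: that the equilibrium identity is $f_{11} = 4 f_{02}$, and that V sits at $F = 1$ (which follows from O at $-q/p$ and the stated distance $1/p$). The function names are hypothetical.

```python
from fractions import Fraction as Fr

def matrix_entries(q, F, f11, f01):
    """Distinct entries of the symmetric mating matrix implied by (q, F, f11, f01)."""
    p = 1 - q
    f0 = q * q + F * p * q
    f1 = 2 * p * q * (1 - F)
    f2 = p * p + F * p * q
    f02 = Fr(f11) / 4            # assumed form of the equilibrium identity
    f12 = f1 - f01 - f11
    return {"f00": f0 - f01 - f02, "f01": f01, "f02": f02,
            "f11": f11, "f12": f12, "f22": f2 - f02 - f12}

def admissible(q, F, f11, f01):
    """A point is admissible when no entry of C is negative."""
    return all(v >= 0 for v in matrix_entries(q, F, f11, f01).values())

def axis_points(q):
    """F-coordinates of the reference points O, B, N and V on the F axis."""
    p = 1 - q
    return {"O": -q / p, "B": (p - 2 * q) / (3 * p),
            "N": (2 * p - q) / (3 * p), "V": Fr(1)}

pts = axis_points(Fr(3, 10))
```

For $q = 3/10$ the spacings between the returned points agree with the distances $1/p$, $1/(3p)$ and $2/(3p)$ quoted above. Under the same assumptions, `admissible` reports O as outside the region for $q = 3/10$ but inside it for $q = 1/5$, which is in keeping with O being a point of reference only in case (i) and coalescing with A at $q = \tfrac14$.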

HARDY-WEINBERG PROPORTIONS AND RANDOM MATING
The fact that Hardy-Weinberg proportions can be maintained by non-random as well as random mating is supported by Figure 5, which displays the section of the region depicted by Figure 2 corresponding to $F = 0$, when $q = \tfrac{3}{10}$. Provided the mating system follows the restrictions applying to matrix $C$ outlined above, any point within and on the edges of the triangle gives an admissible system. This shows the highly restrictive nature of random mating, which corresponds to a single point within the triangle. The $(f_{11}, f_{01})$ coordinates of the points in the diagram are: L: (0, 0); M: (0.36, 0); N: (0, 0.09); random mating: (0.1764, 0.0378).
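The coordinates quoted for Figure 5 can be reproduced in a few lines. This is an illustrative sketch only: it assumes the equilibrium identity $f_{11} = 4 f_{02}$ and the random-mating frequencies $f_{ij} = f_i f_j$, and the function names are hypothetical.

```python
from fractions import Fraction as Fr

def hw_admissible(q, f11, f01):
    """Admissibility of (f11, f01) when F = 0, i.e. parents in Hardy-Weinberg form."""
    p = 1 - q
    f02 = Fr(f11) / 4                 # assumed equilibrium identity
    f00 = q * q - f01 - f02
    f12 = 2 * p * q - f01 - f11
    f22 = p * p - f02 - f12
    return min(f00, f01, f02, f11, f12, f22) >= 0

def random_mating_point(q):
    """(f11, f01) under random mating: f11 = (2pq)^2 and f01 = q^2 * 2pq."""
    p = 1 - q
    het = 2 * p * q
    return (het * het, q * q * het)

q = Fr(3, 10)
vertices = {"L": (Fr(0), Fr(0)), "M": (Fr(36, 100), Fr(0)), "N": (Fr(0), Fr(9, 100))}
rm = random_mating_point(q)
```

`rm` evaluates to (0.1764, 0.0378) as exact fractions, the three vertices pass `hw_admissible`, and a point just beyond M, such as (0.37, 0), fails it.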

APPLICATION
The trinity of a population in equilibrium under random mating with Hardy-Weinberg proportions is a standard of monographs in human genetics. Penrose [1, page 103] writes: "The concept of gene frequency is of fundamental importance in the genetics of wild populations, which human populations resemble much more closely than selected breeds of laboratory animals. The idea is essential to the mathematical study of evolution because many of the processes of natural selection can be expressed in terms of progressive increase or decrease in gene frequencies. In the shorter-term problems of human populations, the concept is also indispensable. The elementary theoretical results were discovered independently by Hardy, Pearson and Weinberg."
As can be seen from the preceding sections, the Hardy-Weinberg Principle is less elementary than commonly supposed. Figure 5 shows that random mating is associated with a single point in the wider set of matings consistent with Hardy-Weinberg proportions. Furthermore, a more general equilibrium is sustainable even without natural selection.
Penrose has a section entitled "Amaurotic Idiocy" [1]. He includes, in particular, infantile amaurotic idiocy, which he says is known as Tay-Sachs disease. This disorder is the focus of our application. Penrose notes the then recently established connection with hexosaminidase A reported by O'Brien [10]. Penrose included Tay-Sachs disease in the category of cerebromacular degeneration [1, pages 163, 382]. He notes: "Slome's (1933) analysis of published sibships clearly indicates that it is due to a single autosomal gene in spite of a slight excess of female cases." [11] Since then extensive research has mapped the relevant HEXA gene to chromosome 15. O'Brien writes: "The widespread absence of hexosaminidase A in the tissues, the normal or increased activity of other lysosomal hydrolases, the normal or increased activity of hexosaminidase A in other gangliosidoses, and the partial reduction of this enzyme in the serum of individuals heterozygous for the Tay-Sachs gene, suggest that the absence of hexosaminidase A is the fundamental enzymic defect in Tay-Sachs disease." [10] The finding of O'Brien led to a test for Tay-Sachs heterozygotes. Lewis, who is herself Ashkenazi, gives an account of her own experience [12, pp. 126-130].
For the purposes of this paper our population is the people of Ashkenazi Jewish background, that is, Jews whose ancestors came from central or eastern Europe. Westman [13] gives additional information about the Ashkenazi community. Lewis states that, before genetic testing, about 1 in 3,600 Ashkenazi newborn were diagnosed with Tay-Sachs disease. Using Hardy-Weinberg proportions we take the gene frequency in the population to be $q = \sqrt{1/3600} = 1/60$. Lewis notes that, as a consequence of screening, the incidence of Tay-Sachs in the Ashkenazi population has been substantially reduced. Just for the purpose of illustration, we suppose the incidence now to be $1/36000$, although it is lower than that.
Under random mating, the former mating matrix, (3), is
$$C = (f_i f_j), \qquad \{f_0, f_1, f_2\} = \{q^2,\ 2pq,\ p^2\} = \left\{\tfrac{1}{3600},\ \tfrac{118}{3600},\ \tfrac{3481}{3600}\right\}. \tag{11}$$
Because of the programme to reduce the incidence of Tay-Sachs in the population, the frequency of marriage between carriers has been reduced. Our model, incorporating incidence $1/36000$, is given by mating matrix (12). Note that the gene frequency is $1/60$ and the population is in Hardy-Weinberg form in both (11) and (12). Neither model is fully valid in that children with Tay-Sachs die long before reproductive age. Consequently there is natural selection against the gene on this account, but its reduction is slow because of the mode of inheritance. Mating matrix (12) could be replaced by many others which satisfy equation (7).
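The arithmetic behind this application can be sketched as follows. The random-mating matrix is fully determined by $q = 1/60$; the second matrix is only one hypothetical example of a matrix with the same Hardy-Weinberg row sums and a reduced carrier-by-carrier frequency, and is not the authors' matrix (12). Two assumptions are made for illustration: the equilibrium identity is taken as $f_{11} = 4 f_{02}$, and the incidence $1/36000$ is read as the share of births from carrier-by-carrier unions, $f_{11}/4$. The helper name `with_margins` and the choice of $f_{01}$ are likewise hypothetical.

```python
from fractions import Fraction as Fr

q = Fr(1, 60)                      # frequency of the Tay-Sachs allele
p = 1 - q
f = [q * q, 2 * p * q, p * p]      # Hardy-Weinberg parental distribution

# Random mating: pairing frequencies are products of the parental frequencies.
random_mating = [[fi * fj for fj in f] for fi in f]

def with_margins(f, f11, f01):
    """Symmetric matrix with row sums f and f02 = f11/4 (assumed equilibrium identity)."""
    f02 = f11 / 4
    f00 = f[0] - f01 - f02
    f12 = f[1] - f01 - f11
    f22 = f[2] - f02 - f12
    return [[f00, f01, f02], [f01, f11, f12], [f02, f12, f22]]

# Hypothetical alternative: carrier-by-carrier frequency chosen so that f11/4 = 1/36000,
# with f01 kept (arbitrarily) at its random-mating value.
reduced = with_margins(f, Fr(1, 9000), f[0] * f[1])
```

Both matrices have row sums $\{1/3600, 118/3600, 3481/3600\}$ and satisfy $f_{11} = 4 f_{02}$, so both leave the population in Hardy-Weinberg form with $q = 1/60$; they differ in how often carriers are paired with carriers. As the text notes, neither is fully realistic, because affected children do not survive to reproductive age.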
Sebro et al. show the importance of taking account of the stratification of populations in epidemiological research [14].

Figure 1: Orthogonal axes used to specify coordinates $F$, $f_{11}$, and $f_{01}$, for given $q$.

Figure 2: Schematic illustration of the bounding region of admissible sets of $F$, $f_{11}$ and $f_{01}$ for $\tfrac14 < q < \tfrac12$.

Figure 3: Schematic illustration of the bounding region of admissible sets of $F$, $f_{11}$ and $f_{01}$ for $q \le \tfrac14$.

Figure 4: Schematic illustration of the bounding region of admissible sets of $F$, $f_{11}$ and $f_{01}$ for $q = \tfrac12$.

Figure 5: The region of admissible combinations of values of $f_{11}$ and $f_{01}$ when $q = \tfrac{3}{10}$ and $F = 0$, that is for a population in Hardy-Weinberg form. The letters L, M, and N identify the vertices. The point corresponding to random mating is shown within the region. Points within and on the edges are admissible.