Siberian Branch of the Russian Academy of Sciences
Genetic Foundations of Biodiversity
One explanation for the similarities and differences among homologous chromosomal fragments typed in a population is the supposition that the degree of similarity between such fragments is determined by the proximity of their origins. For an autosome, as opposed to the Y chromosome or mitochondrial DNA, the origins of different sites on a chromosomal fragment do not necessarily coincide. In this context, the concept of an origin is uniquely defined only in relation to a specific site of an autosome. For example, when we speak of the common origin of the carriers of a specific mutation, we mean the tree along which the mutation site was inherited from a common ancestor. The purpose of the present research is an algorithm that reconstructs this tree for a specific autosomal site on the basis of adjacent marker haplotypes. The model of a site's origin on a set of marker haplotypes is a directed graph (a tree). The set of haplotypes defines the nodes of the tree. The edges are directed from the ancestor node to the descendant node inheriting the site. Thus, the terminal nodes are the known haplotypes typed in the population, while the interior nodes and the root node are unknown ancestral haplotypes. The length of an edge is the number of generations between its nodes. For each edge of the tree, the following stochastic processes determine the probability of transition to the descendant's haplotype given the ancestor's haplotype: mutation of alleles at the loci, and recombination with haplotypes from the population retaining the site. First, the best-fitting binary tree, the one that maximizes the likelihood of the observation (i.e., the sampled set of haplotypes), is determined. Then, on the basis of the maximum likelihood ratio test, the most parsimonious tree is defined as the tree with maximal likelihood among the trees that have the minimal number of nodes and show a statistically insignificant decrease in likelihood compared with the best-fitting binary tree. The algorithm is implemented as a Visual C++ program, S-TREE, version 1.0.
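The selection rule for the most parsimonious tree can be illustrated with a minimal Python sketch. It is not taken from S-TREE: the names `CandidateTree` and `most_parsimonious` are hypothetical, the test statistic is assumed to be twice the drop in log-likelihood relative to the best-fitting binary tree, and the significance threshold `critical_value` is left to the caller because the abstract does not state how it is obtained.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CandidateTree:
    """Hypothetical summary of one reconstructed tree."""
    name: str              # label used only for this illustration
    n_nodes: int           # number of nodes in the tree
    log_likelihood: float  # log-likelihood of the sampled haplotypes


def most_parsimonious(best_binary: CandidateTree,
                      candidates: Iterable[CandidateTree],
                      critical_value: float) -> CandidateTree:
    """Pick the tree with the fewest nodes whose likelihood drop is insignificant.

    Assumption: a drop is "insignificant" when 2 * (logL_best - logL_tree)
    does not exceed the caller-supplied critical_value.
    """
    acceptable = [best_binary]  # the reference tree is always acceptable
    for tree in candidates:
        statistic = 2.0 * (best_binary.log_likelihood - tree.log_likelihood)
        if statistic <= critical_value:
            acceptable.append(tree)
    fewest = min(tree.n_nodes for tree in acceptable)
    smallest = [tree for tree in acceptable if tree.n_nodes == fewest]
    # Among the smallest acceptable trees, keep the one with maximal likelihood.
    return max(smallest, key=lambda tree: tree.log_likelihood)
```

With a best binary tree of 9 nodes and log-likelihood -40.0, two 7-node candidates at -41.0 and -40.5, a 5-node candidate at -50.0, and a threshold of 3.84, this sketch returns the 7-node tree with log-likelihood -40.5: the 5-node tree is rejected because its likelihood drop is too large.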
This version is limited by the following simplifying assumptions: 1) recombination of the haplotypes of a T-tree is limited to no more than one recombination on each side of the site per generation; 2) mutations of alleles follow the model of unconditional frequencies; 3) the haplotypes recombine with random haplotypes of a stationary population at linkage equilibrium. Input data and parameters: the haplotype sample, allele frequencies in the population, the mutation rate for each locus, recombination rates between loci, and the site location. The application of the algorithm to multilocus haplotypes found in unrelated Iraqi-Jewish families carrying a deletion in the GPIIIa gene is described.
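To make the three simplifying assumptions concrete, the following Python sketch computes a single-generation transition probability for the marker loci on one side of the site. It is an illustration under stated assumptions, not the S-TREE implementation. The function names and argument layout are hypothetical. Loci are ordered outward from the site. "Unconditional frequencies" is read here as: a mutated locus draws its new allele from the population allele frequencies regardless of the previous allele. A crossover in a given interval is assumed to replace every locus distal to it with alleles from a random population haplotype, whose probability under linkage equilibrium is the product of allele frequencies.

```python
from typing import Dict, Sequence


def locus_transition(anc: str, desc: str, mu: float, freq: Dict[str, float]) -> float:
    """Per-locus mutation step (assumed reading of 'unconditional frequencies')."""
    unchanged = (1.0 - mu) if anc == desc else 0.0
    return unchanged + mu * freq[desc]


def one_side_transition(ancestor: Sequence[str],
                        descendant: Sequence[str],
                        mut_rates: Sequence[float],
                        rec_rates: Sequence[float],
                        allele_freqs: Sequence[Dict[str, float]]) -> float:
    """Probability of the descendant's alleles on one side of the site, one generation.

    rec_rates[k] is the recombination rate of the interval just proximal to
    locus k (for k = 0, the interval between the site and the first locus).
    At most one crossover per side is allowed, so the cases are mutually
    exclusive: crossover in interval 0..m-1, or no crossover at all.
    """
    m = len(ancestor)
    p_no_crossover = 1.0 - sum(rec_rates)
    total = 0.0
    for k in range(m + 1):  # k = number of proximal loci kept from the ancestor
        weight = p_no_crossover if k == m else rec_rates[k]
        kept = 1.0
        for i in range(k):
            kept *= locus_transition(ancestor[i], descendant[i],
                                     mut_rates[i], allele_freqs[i])
        replaced = 1.0
        for i in range(k, m):
            # Alleles from a random haplotype at linkage equilibrium.
            replaced *= allele_freqs[i][descendant[i]]
        total += weight * kept * replaced
    return total
```

For two loci with allele frequencies {A: 0.6, B: 0.4} and {1: 0.5, 2: 0.5}, no mutation, and recombination rates 0.1 and 0.2, the probability that haplotype (A, 1) is passed on unchanged is 0.1 · 0.3 + 0.2 · 0.5 + 0.7 = 0.83 in this sketch. An edge spanning several generations would require composing such one-generation steps, which is omitted here.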
Note. Abstracts are published as submitted by the authors.
© 1996-2000, Siberian Branch of the Russian Academy of Sciences, Novosibirsk