Yang Z. Molecular evolution: a statistical approach (Oxford, 2014). - CONTENTS
Yang Z. Molecular evolution: a statistical approach. - Oxford: Oxford University Press, 2014. - xv, 492 p.: ill. - Bibliogr.: p.450-487. - Ind.: p.488-492. - ISBN 978-0-19-960260-5
 

Shelf location: 02 | Branch of the GPNTB SB RAS | Novosibirsk

Contents
 
1  Models of nucleotide substitution ............................ 1
   1.1  Introduction ............................................ 1
   1.2  Markov models of nucleotide substitution and distance
        estimation .............................................. 4
        1.2.1  The JC69 model ................................... 4
        1.2.2  The K80 model .................................... 7
        1.2.3  HKY85, F84, TN93, etc. ........................... 9
        1.2.4  The transition/transversion rate ratio .......... 13
   1.3  Variable substitution rates across sites ............... 15
   1.4  Maximum likelihood estimation of distance .............. 17
        1.4.1  The JC69 model .................................. 18
        1.4.2  The K80 model ................................... 22
        1.4.3  Likelihood ratio test of substitution models .... 22
        1.4.4  Profile and integrated likelihood methods ....... 24
   1.5  Markov chains and distance estimation under general
        models ................................................. 26
        1.5.1  Markov chains ................................... 26
        1.5.2  Distance under the unrestricted (UNREST) model .. 27
        1.5.3  Distance under the general time-reversible
               model ........................................... 29
   1.6  Discussions ............................................ 32
        1.6.1  Distance estimation under different
               substitution models ............................. 32
        1.6.2  Limitations of pairwise comparison .............. 32
   1.7  Problems ............................................... 33
2  Models of amino acid and codon substitution ................. 35
   2.1  Introduction ........................................... 35
   2.2  Models of amino acid replacement ....................... 35
        2.2.1  Empirical models ................................ 35
        2.2.2  Mechanistic models .............................. 39
        2.2.3  Among-site heterogeneity ........................ 39
   2.3  Estimation of distance between two protein sequences ... 40
        2.3.1  The Poisson model ............................... 40
        2.3.2  Empirical models ................................ 41
        2.3.3  Gamma distances ................................. 41
   2.4  Models of codon substitution ........................... 42
        2.4.1  The basic model ................................. 42
        2.4.2  Variations and extensions ....................... 44
   2.5  Estimation of dS and dN ................................ 47
        2.5.1  Counting methods ................................ 47
        2.5.2  Maximum likelihood method ....................... 55
        2.5.3  Comparison of methods ........................... 57
        2.5.4  More distances and interpretation of the dN/dS
               ratio ........................................... 58
        2.5.5  Estimation of dS and dN in comparative
               genomics ........................................ 61
        2.5.6  Distances based on the physical-site
               definition ...................................... 63
        2.5.7  Utility of the distance measures ................ 65
   2.6  Numerical calculation of the transition probability
        matrix ................................................. 65
   2.7  Problems ............................................... 68
3  Phylogeny reconstruction: overview .......................... 70
   3.1  Tree concepts .......................................... 70
        3.1.1  Terminology ..................................... 70
        3.1.2  Species trees and gene trees .................... 79
        3.1.3  Classification of tree reconstruction methods ... 81
   3.2  Exhaustive and heuristic tree search ................... 82
        3.2.1  Exhaustive tree search .......................... 82
        3.2.2  Heuristic tree search ........................... 82
        3.2.3  Branch swapping ................................. 84
        3.2.4  Local peaks in the tree space ................... 86
        3.2.5  Stochastic tree search .......................... 88
   3.3  Distance matrix methods ................................ 88
        3.3.1  Least-squares method ............................ 89
        3.3.2  Minimum evolution method ........................ 91
        3.3.3  Neighbour-joining method ........................ 91
   3.4  Maximum parsimony ...................................... 95
        3.4.1  Brief history ................................... 95
        3.4.2  Counting the minimum number of changes on
               a tree .......................................... 95
        3.4.3  Weighted parsimony and dynamic programming ...... 96
        3.4.4  Probabilities of ancestral states ............... 99
        3.4.5  Long-branch attraction .......................... 99
        3.4.6  Assumptions of parsimony ....................... 100
   3.5  Problems .............................................. 101
4  Maximum likelihood methods ................................. 102
   4.1  Introduction .......................................... 102
   4.2  Likelihood calculation on tree ........................ 102
        4.2.1  Data, model, tree, and likelihood .............. 102
        4.2.2  The pruning algorithm .......................... 103
        4.2.3  Time reversibility, the root of the tree, and
               the molecular clock ............................ 107
        4.2.4  A numerical example: phylogeny of apes ......... 108
        4.2.5  Amino acid, codon, and RNA models .............. 110
        4.2.6  Missing data, sequence errors, and alignment
               gaps ........................................... 110
   4.3  Likelihood calculation under more complex models ...... 114
        4.3.1  Mixture models for variable rates among sites .. 114
        4.3.2  Mixture models for pattern heterogeneity
               among sites .................................... 122
        4.3.3  Partition models for combined analysis of
               multiple datasets .............................. 123
        4.3.4  Nonhomogeneous and nonstationary models ........ 125
   4.4  Reconstruction of ancestral states .................... 125
        4.4.1  Overview ....................................... 125
        4.4.2  Empirical and hierarchical Bayesian
               reconstruction ................................. 127
        4.4.3  Discrete morphological characters .............. 130
        4.4.4  Systematic biases in ancestral reconstruction .. 131
   4.5  Numerical algorithms for maximum likelihood
        estimation ............................................ 133
        4.5.1  Univariate optimization ........................ 134
        4.5.2  Multivariate optimization ...................... 136
   4.6  ML optimization in phylogenetics ...................... 138
        4.6.1  Optimization on a fixed tree ................... 138
        4.6.2  Multiple local peaks on the likelihood
               surface for a fixed tree ....................... 139
        4.6.3  Search in the tree space ....................... 140
        4.6.4  Approximate likelihood method .................. 143
   4.7  Model selection and robustness ........................ 144
        4.7.1  Likelihood ratio test applied to rbcL dataset .. 144
        4.7.2  Test of goodness of fit and parametric
               bootstrap ...................................... 146
        4.7.3  Diagnostic tests to detect model violations .... 147
        4.7.4  Akaike information criterion (AIC and AICc) .... 148
        4.7.5  Bayesian information criterion ................. 149
        4.7.6  Model adequacy and robustness .................. 150
   4.8  Problems .............................................. 151
5  Comparison of phylogenetic methods and tests on trees ...... 153
   5.1  Statistical performance of tree reconstruction
        methods ............................................... 153
        5.1.1  Criteria ....................................... 154
        5.1.2  Performance .................................... 156
   5.2  Likelihood ............................................ 157
        5.2.1  Contrast with conventional parameter
               estimation ..................................... 157
        5.2.2  Consistency .................................... 158
        5.2.3  Efficiency ..................................... 159
        5.2.4  Robustness ..................................... 163
   5.3  Parsimony ............................................. 165
        5.3.1  Equivalence with misbehaved likelihood models .. 165
        5.3.2  Equivalence with well-behaved likelihood
               models ......................................... 168
        5.3.3  Assumptions and justifications ................. 169
   5.4  Testing hypotheses concerning trees ................... 171
        5.4.1  Bootstrap ...................................... 172
        5.4.2  Interior-branch test ........................... 177
        5.4.3  K-H test and related tests ..................... 178
        5.4.4  Example: phylogeny of apes ..................... 179
        5.4.5  Indexes used in parsimony analysis ............. 180
   5.5  Problems .............................................. 181
6  Bayesian theory ............................................ 182
   6.1  Overview .............................................. 182
   6.2  The Bayesian paradigm ................................. 183
        6.2.1  The Bayes theorem .............................. 183
        6.2.2  The Bayes theorem in Bayesian statistics ....... 184
        6.2.3  Classical versus Bayesian statistics ........... 189
   6.3  Prior ................................................. 197
        6.3.1  Methods of prior specification ................. 197
        6.3.2  Conjugate priors ............................... 198
        6.3.3  Flat or uniform priors ......................... 199
        6.3.4  The Jeffreys priors ............................ 200
        6.3.5  The reference priors ........................... 202
   6.4  Methods of integration ................................ 203
        6.4.1  Laplace approximation .......................... 203
        6.4.2  Mid-point and trapezoid methods ................ 204
        6.4.3  Gaussian quadrature ............................ 205
        6.4.4  Marginal likelihood calculation for JC69
               distance estimation ............................ 206
        6.4.5  Monte Carlo integration ........................ 210
        6.4.6  Importance sampling ............................ 210
   6.5  Problems .............................................. 212
7  Bayesian computation (MCMC) ................................ 214
   7.1  Markov chain Monte Carlo .............................. 214
        7.1.1  Metropolis algorithm ........................... 214
        7.1.2  Asymmetrical moves and proposal ratio .......... 218
        7.1.3  The transition kernel .......................... 219
        7.1.4  Single-component Metropolis-Hastings
               algorithm ...................................... 220
        7.1.5  Gibbs sampler .................................. 221
   7.2  Simple moves and their proposal ratios ................ 221
        7.2.1  Sliding window using the uniform proposal ...... 222
        7.2.2  Sliding window using the normal proposal ....... 223
        7.2.3  Bactrian proposal .............................. 223
        7.2.4  Sliding window using the multivariate normal
               proposal ....................................... 224
        7.2.5  Proportional scaling ........................... 225
        7.2.6  Proportional scaling with bounds ............... 226
   7.3  Convergence, mixing, and summary of MCMC .............. 226
        7.3.1  Convergence and tail behaviour ................. 226
        7.3.2  Mixing efficiency, jump probability, and step
               length ......................................... 230
        7.3.3  Validating and diagnosing MCMC algorithms ...... 241
        7.3.4  Potential scale reduction statistic ............ 242
        7.3.5  Summary of MCMC output ......................... 243
   7.4  Advanced Monte Carlo methods .......................... 244
        7.4.1  Parallel tempering (MC³) ....................... 245
        7.4.2  Trans-model and trans-dimensional MCMC ......... 247
        7.4.3  Bayes factor and marginal likelihood ........... 256
   7.5  Problems .............................................. 260
8  Bayesian phylogenetics ..................................... 263
   8.1  Overview .............................................. 263
        8.1.1  Historical background .......................... 263
        8.1.2  A sketch MCMC algorithm ........................ 264
        8.1.3  The statistical nature of phylogeny
               estimation ..................................... 264
   8.2  Models and priors in Bayesian phylogenetics ........... 266
        8.2.1  Priors on branch lengths ....................... 266
        8.2.2  Priors on parameters in substitution models .... 269
        8.2.3  Priors on tree topology ........................ 276
   8.3  MCMC proposals in Bayesian phylogenetics .............. 279
        8.3.1  Within-tree moves .............................. 279
        8.3.2  Cross-tree moves ............................... 281
        8.3.3  NNI for unrooted trees ......................... 284
        8.3.4  SPR for unrooted trees ......................... 287
        8.3.5  TBR for unrooted trees ......................... 289
        8.3.6  Subtree swapping ............................... 291
        8.3.7  NNI for rooted trees ........................... 292
        8.3.8  SPR on rooted trees ............................ 293
        8.3.9  Node slider .................................... 294
   8.4  Summarizing MCMC output ............................... 295
   8.5  High posterior probabilities for trees ................ 296
        8.5.1  High posterior probabilities for trees or
               splits ......................................... 296
        8.5.2  Star tree paradox .............................. 298
        8.5.3  Fair coin paradox, fair balance paradox, and
               Bayesian model selection ....................... 300
        8.5.4  Conservative Bayesian phylogenetics ............ 305
   8.6  Problems .............................................. 306
9  Coalescent theory and species trees ........................ 308
   9.1  Overview .............................................. 308
   9.2  The coalescent model for a single species ............. 309
        9.2.1  The backward time machine ...................... 309
        9.2.2  Fisher-Wright model and the neutral
               coalescent ..................................... 309
        9.2.3  A sample of n genes ............................ 312
        9.2.4  Simulating the coalescent ...................... 315
        9.2.5  Estimation of θ from a sample of DNA
               sequences ...................................... 316
   9.3  Population demographic process ........................ 320
        9.3.1  Homogeneous and nonhomogeneous Poisson
               processes ...................................... 321
        9.3.2  Deterministic population size change ........... 322
        9.3.3  Nonparametric population demographic models .... 323
   9.4  Multispecies coalescent, species trees and gene
        trees ................................................. 325
        9.4.1  Multispecies coalescent ........................ 325
        9.4.2  Species tree-gene tree conflict ................ 331
        9.4.3  Estimation of species trees .................... 335
        9.4.4  Migration ...................................... 343
   9.5  Species delimitation .................................. 349
        9.5.1  Species concept and species delimitation ....... 349
        9.5.2  Simple methods for analysing genetic data ...... 351
        9.5.3  Bayesian species delimitation .................. 352
        9.5.4  The impact of guide tree, prior, and
               migration ...................................... 355
        9.5.5  Pros and cons of Bayesian species
               delimitation ................................... 358
   9.6  Problems .............................................. 359
10 Molecular clock and estimation of species divergence
   times ...................................................... 361
   10.1 Overview .............................................. 361
   10.2 Tests of the molecular clock .......................... 363
        10.2.1 Relative-rate tests ............................ 363
        10.2.2 Likelihood ratio test .......................... 364
        10.2.3 Limitations of molecular clock tests ........... 365
        10.2.4 Index of dispersion ............................ 366
   10.3 Likelihood estimation of divergence times ............. 366
        10.3.1 Global clock model ............................. 366
        10.3.2 Local clock model .............................. 367
        10.3.3 Heuristic rate-smoothing methods ............... 368
        10.3.4 Uncertainties in calibrations .................. 370
        10.3.5 Dating viral divergences ....................... 372
        10.3.6 Dating primate divergences ..................... 373
   10.4 Bayesian estimation of divergence times ............... 375
        10.4.1 General framework .............................. 375
        10.4.2 Approximate calculation of likelihood .......... 376
        10.4.3 Prior on evolutionary rates .................... 377
        10.4.4 Prior on divergence times and fossil
               calibrations ................................... 378
        10.4.5 Uncertainties in time estimates ................ 382
        10.4.6 Dating viral divergences ....................... 384
        10.4.7 Application to primate and mammalian
               divergences .................................... 385
   10.5 Perspectives .......................................... 388
   10.6 Problems .............................................. 389
11 Neutral and adaptive protein evolution ..................... 390
   11.1 Introduction .......................................... 390
   11.2 The neutral theory and tests of neutrality ............ 391
        11.2.1 The neutral and nearly neutral theories ........ 391
        11.2.2 Tajima's D statistic ........................... 393
        11.2.3 Fu and Li's D, and Fay and Wu's H statistics ... 394
        11.2.4 McDonald-Kreitman test and estimation of
               selective strength ............................. 395
        11.2.5 Hudson-Kreitman-Aguadé test .................... 397
   11.3 Lineages undergoing adaptive evolution ................ 398
        11.3.1 Heuristic methods .............................. 398
        11.3.2 Likelihood method .............................. 399
   11.4 Amino acid sites undergoing adaptive evolution ........ 400
        11.4.1 Three strategies ............................... 400
        11.4.2 Likelihood ratio test of positive selection
               under random-site models ....................... 402
        11.4.3 Identification of sites under positive
               selection ...................................... 405
        11.4.4 Positive selection at the human MHC ............ 406
   11.5 Adaptive evolution affecting particular sites and
        lineages .............................................. 408
        11.5.1 Branch-site test of positive selection ......... 408
        11.5.2 Other similar models ........................... 409
        11.5.3 Adaptive evolution in angiosperm phytochromes .. 410
   11.6 Assumptions, limitations, and comparisons ............. 411
        11.6.1 Assumptions and limitations of current
               methods ........................................ 412
        11.6.2 Comparison of methods for detecting positive
               selection ...................................... 413
   11.7 Adaptively evolving genes ............................. 414
   11.8 Problems .............................................. 416
12 Simulating molecular evolution ............................. 418
   12.1 Introduction .......................................... 418
   12.2 Random number generator ............................... 418
   12.3 Generation of discrete random variables ............... 420
        12.3.1 Inversion method for sampling from a general
               discrete distribution .......................... 420
        12.3.2 The alias method for sampling from a discrete
               distribution ................................... 421
        12.3.3 Discrete uniform distribution .................. 422
        12.3.4 Binomial distribution .......................... 423
        12.3.5 The multinomial distribution ................... 423
        12.3.6 The Poisson distribution ....................... 423
        12.3.7 The composition method for mixture
               distributions .................................. 424
   12.4 Generation of continuous random variables ............. 424
        12.4.1 The inversion method ........................... 425
        12.4.2 The transformation method ...................... 425
        12.4.3 The rejection method ........................... 425
        12.4.4 Generation of a standard normal variate using
               the polar method ............................... 428
        12.4.5 Gamma, beta, and Dirichlet variables ........... 430
   12.5 Simulation of Markov processes ........................ 430
        12.5.1 Simulation of the Poisson process .............. 430
        12.5.2 Simulation of the nonhomogeneous Poisson
               process ........................................ 431
        12.5.3 Simulation of discrete-time Markov chains ...... 433
        12.5.4 Simulation of continuous-time Markov chains .... 435
   12.6 Simulating molecular evolution ........................ 436
        12.6.1 Simulation of sequences on a fixed tree ........ 436
        12.6.2 Simulation of random trees ..................... 439
   12.7 Validation of the simulation program .................. 439
   12.8 Problems .............................................. 440
Appendices .................................................... 442
Appendix A. Functions of random variables ..................... 442
Appendix B. The delta technique ............................... 446
Appendix C. Phylogenetic software ............................. 448
References .................................................... 450
Index ......................................................... 488
Reader note: The asterisk next to a heading indicates
a more difficult or technical section/problem.


Document modified: Wed Feb 27 14:28:12 2019.