Bibliography

1
Hirotugu Akaike.
A new look at the statistical model identification.
IEEE Transactions on Automatic Control, AC-19(6):716-723, December 1974.

2
S. F. Altschul, J. C. Wootton, E. Zaslavsky, and Y. K. Yu.
The construction and use of log-odds substitution scores for multiple sequence alignment.
PLoS Comput. Biol., 6(7):e1000852, Jul 2010.

3
D. Richard Hipp.
SQLite home page.
https://www.sqlite.org.
Accessed: 2017-08-14.

4
F. Johansson and H. Toh.
A comparative study of conservation and variation scores.
BMC Bioinformatics, 11:388, Jul 2010.

5
K. Katoh, K. Kuma, T. Miyata, and H. Toh.
Improvement in the accuracy of multiple sequence alignment program MAFFT.
Genome Inform., 16(1):22-33, 2005.

6
K. Katoh, K. Kuma, H. Toh, and T. Miyata.
MAFFT version 5: improvement in accuracy of multiple sequence alignment.
Nucleic Acids Res., 33(2):511-518, 2005.

7
K. Katoh, K. Misawa, K. Kuma, and T. Miyata.
MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.
Nucleic Acids Res., 30(14):3059-3066, Jul 2002.

8
K. Katoh and H. Toh.
Recent developments in the MAFFT multiple sequence alignment program.
Brief. Bioinformatics, 9(4):286-298, Jul 2008.

9
V. Lefort, R. Desper, and O. Gascuel.
FastME 2.0: A Comprehensive, Accurate, and Fast Distance-Based Phylogeny Inference Program.
Mol. Biol. Evol., 32(10):2798-2800, Oct 2015.

10
X. S. Liu and W. L. Guo.
Robustness of the residue conservation score reflecting both frequencies and physicochemistries.
Amino Acids, 34(4):643-652, May 2008.

11
G. McLachlan and D. Peel.
Finite Mixture Models.
Wiley, 2000.

12
G. Schwarz.
Estimating the dimension of a model.
Annals of Statistics, 6(2):461-464, 1978.

13
J. D. Thompson, T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins.
The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
Nucleic Acids Res., 25(24):4876-4882, Dec 1997.

14
J. D. Thompson, D. G. Higgins, and T. J. Gibson.
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res., 22(22):4673-4680, Nov 1994.

15
J. D. Thompson, A. Muller, A. Waterhouse, J. Procter, G. J. Barton, F. Plewniak, and O. Poch.
MACSIMS: multiple alignment of complete sequences information management system.
BMC Bioinformatics, 7:318, Jun 2006.

16
S. M. Thompson.
Constructing and refining multiple sequence alignments with PileUp, SeqLab, and the GCG suite.
Curr. Protoc. Bioinformatics, Chapter 3:Unit 3.6, Feb 2003.

17
W. S. Valdar.
Scoring residue conservation.
Proteins, 48(2):227-241, Aug 2002.

18
N. Wicker, D. Dembele, W. Raffelsberger, and O. Poch.
Density of points clustering, application to transcriptomic data analysis.
Nucleic Acids Res., 30(18):3992-4000, Sep 2002.

19
N. Wicker, G. R. Perrin, J. C. Thierry, and O. Poch.
Secator: a program for inferring protein subfamilies from phylogenetic trees.
Mol. Biol. Evol., 18(8):1435-1441, Aug 2001.

20
D. D. Womble.
GCG: The Wisconsin Package of sequence analysis programs.
Methods Mol. Biol., 132:3-22, 2000.