Readings – Statistical Physics and Information Theory
Statistical Physics, Entropy, and Information Theory – Essential Books
- Tsallis, Constantino (2009). Introduction to Nonextensive Statistical Mechanics: Approaching a Complex World, New York: Springer. (Google Books)
More details on the statistical physics and information theory books, including abstracts, descriptions, and key articles in which the authors present their main points.
Information Theory – Seminal Papers
- Kullback, S., & Leibler, R.A. (1951). On Information and Sufficiency. Ann. Math. Statist., 22 (1), 79-86. (Full PDF available online.) A small numeric sketch of entropy and K-L divergence follows this list.
- Shannon, C.E. (1948). A Mathematical Theory of Communication. The Bell System Technical Journal, XXVII (3), 379-423 (July 1948).
- Shannon, C.E. (1951). Prediction and Entropy of Printed English. The Bell System Technical Journal, 30 (1), 50-64.
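The following is a minimal numeric illustration of the two quantities these papers introduce – Shannon entropy and Kullback-Leibler divergence. It is a sketch in Python/NumPy with made-up example distributions, not code from either paper.

```python
# Minimal numeric illustration (not from the papers themselves): Shannon entropy
# of a discrete distribution and the Kullback-Leibler divergence D(P || Q).
# The distribution values are invented purely for illustration.
import numpy as np

def shannon_entropy(p, base=2.0):
    """H(P) = -sum_i p_i log p_i, ignoring zero-probability outcomes."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p)) / np.log(base)

def kl_divergence(p, q, base=2.0):
    """D(P || Q) = sum_i p_i log(p_i / q_i); assumes q_i > 0 wherever p_i > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask])) / np.log(base)

if __name__ == "__main__":
    p = [0.5, 0.25, 0.25]   # hypothetical "true" distribution
    q = [1/3, 1/3, 1/3]     # uniform reference distribution
    print("H(P)      =", shannon_entropy(p))   # 1.5 bits
    print("D(P || Q) =", kl_divergence(p, q))  # positive; zero only when P == Q
```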
Information Theory – Review Papers
- Burnham, K.P., & Anderson, D.R. (2001). Kullback-Leibler information as a basis for strong inference in ecological studies. Wildlife Research, 28, 111-119. (Reviews information-theoretic concepts and methods – including K-L – in the context of practical applications to experimental data rather than as a deeply mathematical treatment; good for understanding the basics.)
Cluster Variation Method – Essential Papers
- Kikuchi, R. (1951). A theory of cooperative phenomena. Phys. Rev. 81, 988-1003.
- Kikuchi, R., & Brush, S.G. (1967). Improvement of the Cluster-Variation Method. J. Chem. Phys., 47, 195. Online at: http://jcp.aip.org/jcpsa6/v47/i1/p195_s1
- Pelizzola, A. (2005). Cluster variation method in statistical physics and probabilistic graphical models. J. Phys. A: Math. Gen., 38 (33), R309. Available online as PDF; see also the abstract on the Cluster Variation Methods page.
- Yedidia, J.S., Freeman, W.T., & Weiss, Y. (2003). Understanding Belief Propagation and Its Generalizations. In Exploring Artificial Intelligence in the New Millennium (ISBN 1558608117), Chap. 8, pp. 239-269, January 2003 (Science & Technology Books). Also available as Mitsubishi Electric Research Laboratories Technical Report TR-2001-22: http://www.merl.com/reports/docs/TR2001-22.pdf. A minimal sum-product (belief propagation) sketch follows this list.
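As a companion to the Yedidia-Freeman-Weiss chapter, here is a minimal sum-product (belief propagation) sketch on a three-node chain of binary variables, where BP is exact. The potentials are invented for illustration, a single pairwise potential is shared by both edges, and this is only a toy version of the general algorithms discussed in the chapter (and of the Bethe/Kikuchi free-energy view behind them).

```python
# A minimal sum-product (belief propagation) sketch on a small chain of binary
# variables.  On a tree (here, a chain) BP gives exact marginals; all potential
# values below are made-up illustrative numbers.
import numpy as np

def chain_bp_marginals(node_pot, pair_pot):
    """Exact single-node marginals for a chain MRF via forward/backward messages.

    node_pot : (N, 2) array of node potentials phi_i(x_i)
    pair_pot : (2, 2) array of pairwise potentials psi(x_i, x_{i+1}), shared by all edges
    """
    N = node_pot.shape[0]
    fwd = np.ones((N, 2))   # fwd[i] = message arriving at node i from node i-1
    bwd = np.ones((N, 2))   # bwd[i] = message arriving at node i from node i+1
    for i in range(1, N):
        m = (node_pot[i - 1] * fwd[i - 1]) @ pair_pot
        fwd[i] = m / m.sum()                      # normalize for numerical stability
    for i in range(N - 2, -1, -1):
        m = pair_pot @ (node_pot[i + 1] * bwd[i + 1])
        bwd[i] = m / m.sum()
    beliefs = node_pot * fwd * bwd
    return beliefs / beliefs.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    node_pot = np.array([[2.0, 1.0], [1.0, 1.0], [1.0, 3.0]])  # three binary nodes
    pair_pot = np.array([[2.0, 1.0], [1.0, 2.0]])              # favors agreement
    print(chain_bp_marginals(node_pot, pair_pot))
```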
Additional Cluster Variation Method Readings
Statistical Mechanics – Essential Papers
- To be filled in.
General Interesting Work on Entropy
- Tsallis, Constantino (2011). The Nonadditive Entropy Sq and Its Applications in Physics and Elsewhere: Some Remarks (Review Paper). Entropy 2011, 13, 1765-1804; doi:10.3390/e13101765
- Baez, John C., Fritz, Tobias, and Leinster, Tom (2011). A Characterization of Entropy in Terms of Information Loss, Entropy 2011, 13, 1945-1957; doi:10.3390/e13111945. See also: A characterization of entropy (blog entry).
- Pressé, Steve, Ghosh, Kingshuk, Lee, Julian, & Dill, Ken A. (2013). Nonadditive Entropies Yield Probability Distributions with Biases Not Warranted by the Data. Phys. Rev. Lett., 111, 180604. doi:10.1103/PhysRevLett.111.180604
- What Is Entropy & Information Gain? A nice little introductory discussion of entropy, decision trees, information gain, and related topics; worth rereading. A short sketch of Shannon entropy, Tsallis S_q, and information gain follows this list.
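A short sketch tying together several items above: Shannon entropy, the Tsallis nonadditive entropy S_q (which recovers the Shannon form as q → 1), and the information gain of a decision-tree split. Probabilities and class counts are invented for illustration.

```python
# Illustrative sketch (not from any of the papers above): Shannon entropy,
# the Tsallis nonadditive entropy S_q, and the information gain of a split.
import numpy as np

def shannon(p):
    """Shannon entropy in nats, ignoring zero-probability outcomes."""
    p = np.asarray(p, float); p = p[p > 0]
    return -np.sum(p * np.log(p))

def tsallis(p, q):
    """S_q = (1 - sum_i p_i^q) / (q - 1); recovers the Shannon form as q -> 1."""
    p = np.asarray(p, float); p = p[p > 0]
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def information_gain(parent_counts, child_counts):
    """Reduction in (Shannon) entropy from splitting the parent node into children."""
    parent = np.asarray(parent_counts, float)
    h_parent = shannon(parent / parent.sum())
    n = parent.sum()
    h_children = sum((np.sum(c) / n) * shannon(np.asarray(c, float) / np.sum(c))
                     for c in child_counts)
    return h_parent - h_children

if __name__ == "__main__":
    p = [0.6, 0.3, 0.1]
    print("Shannon                    :", shannon(p))
    print("Tsallis q=1.001 (~ Shannon):", tsallis(p, 1.001))
    print("Tsallis q=2                :", tsallis(p, 2.0))
    # Decision-tree style split: parent [9+, 5-] split into [6+, 1-] and [3+, 4-]
    print("Information gain           :", information_gain([9, 5], [[6, 1], [3, 4]]))
```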
Nonlinear Forecasting Methods
- Sugihara, George, & May, Robert M. (1990). Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344, 734-741 (19 April 1990). A toy nearest-neighbor forecasting sketch follows this list.
- Leon, Florin, & Zaharia, Mihai H. (2010). Stacked heterogeneous neural networks for time series forecasting. Hindawi Publishing Corporation: Mathematical Problems in Engineering, 2010, Article ID 373648. doi:10.1155/2010/373648
- Langton, Chris G. (1990) Computation at the Edge of Chaos: Phase Transitions and Emergent Computation, Physica D, 42, 12-37.
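A toy sketch in the spirit of Sugihara & May's idea of forecasting from nearby states in a delay embedding. This is not their exact simplex-projection algorithm; the embedding dimension, neighbor count, and test series below are arbitrary choices for illustration.

```python
# Nearest-neighbor forecasting in a delay embedding: embed a scalar time series
# in delay coordinates, find the nearest "library" neighbors of the current
# state, and forecast by averaging where those neighbors went next.
import numpy as np

def delay_embed(x, dim, tau=1):
    """Rows are delay vectors [x_t, x_{t-tau}, ..., x_{t-(dim-1)tau}]."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)][::-1])

def nn_forecast(x, dim=3, k=4, horizon=1):
    """Forecast x[t+horizon] from the k nearest neighbors of the latest state (tau = 1)."""
    emb = delay_embed(x, dim)
    library, target = emb[:-horizon], emb[-1]     # drop states with no observed future
    dists = np.linalg.norm(library - target, axis=1)
    nbrs = np.argsort(dists)[:k]
    futures = x[nbrs + (dim - 1) + horizon]       # value `horizon` steps ahead of each neighbor
    return futures.mean()

if __name__ == "__main__":
    t = np.arange(400)
    x = np.sin(0.3 * t) + 0.05 * np.random.default_rng(0).standard_normal(400)
    print("forecast   :", nn_forecast(x, dim=3, k=4, horizon=1))
    print("noise-free :", np.sin(0.3 * 400))
```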
Dissipative Systems
- Willems, Jan C. (1972). Dissipative dynamical systems, Part I: General theory. Archive for Rational Mechanics and Analysis, 45 (5), 321-351.
- James, Matthew R., & Gough, John (2009). Quantum Dissipative Systems and Feedback Control Design by Interconnection. Preprint.
- Ingber, L. (1998). Data mining and knowledge discovery via statistical mechanics in nonlinear stochastic systems. White Paper. Actually has little to do w/ knowledge discovery OR data mining in the sense now used (text mining, etc.), but is a useful reprise of Ingber’s work in simulated annealing.
Must-Read List – Not Prioritized
Informal Resources – Tutorials, PPT Presentations, Historical
- Yu, B. (2008). Tutorial: Information Theory and Statistics. ICMLA 2008, San Diego. (PDF) Nice.
- Castellani, Tommaso, & Cavagna, Andrea (2005). Spin-glass theory for pedestrians. J. Stat. Mech. (2005) P05012. doi:10.1088/1742-5468/2005/05/P05012. A very comprehensive introductory (graduate-level) tutorial on spin glasses, with three distinct approaches covered. Well worth the read.
- Shannon’s Information Theory, by Lê Nguyên Hoang. A lovely intro-level and intuitive tutorial (with nice graphics) on Shannon’s information theory and its relationship to the Boltzmann concept of entropy. Very pleasant and well worth the read.
- The Essential Message: Claude Shannon and the Making of Information Theory, by Erico Marui Guizzo. A study of Shannon and the thinking that led him to information theory.
- Schreiber, Thomas. Nonlinear Prediction (Web-based tutorial on predictive methods).
- Markov chain applications: von Hilgers, Philipp, & Langville, Amy N., The Five Greatest Applications of Markov Chains.
- Letter, digram, and trigram frequencies have been tabulated by cryptologists and can be found, for example, in Secret and Urgent by Fletcher Pratt, Blue Ribbon Books, 1939. A small sketch of estimating letter and digram frequencies follows this list.
- Yaglom, I.M., & Yaglom, A.M. (1973). Probability and Information. Moscow; published by Hindustan Pub. Co. Has a reference to Pratt. (Google Books)
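A small sketch connecting the last two items: estimating letter and digram frequencies from a text sample (the kind of tables Pratt and the Yagloms discuss) and reading the digram counts as a first-order Markov chain over letters. The sample string is arbitrary; real tables require a large corpus.

```python
# Letter and digram frequencies from a toy text sample, plus first-order Markov
# transition estimates P(b | a) = count(ab) / count(a).
from collections import Counter

text = "the quick brown fox jumps over the lazy dog".replace(" ", "")

letter_counts = Counter(text)
digram_counts = Counter(a + b for a, b in zip(text, text[1:]))

# Note: normalizing by the overall letter count (rather than the count of the
# letter as a predecessor) is a slight approximation, fine for a sketch.
transitions = {
    digram: count / letter_counts[digram[0]]
    for digram, count in digram_counts.items()
}

print("Most common letters :", letter_counts.most_common(3))
print("Most common digrams :", digram_counts.most_common(3))
print("P('h' after 't')    :", transitions.get("th", 0.0))
```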
Dynamic Systems
- Liberzon, D. (2000). Nonlinear feedback systems perturbed by noise: steady-state probability distributions and optimal control. IEEE Trans. Automatic Control, 45 (6), 1116-1130. (PDF)
Get back to this one
Information Maximization
- Bell, A.J., & Sejnowski, T.J. (1995). An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7, 1129-1159. (PDF) Classic paper; a compact infomax sketch follows this list.
- Linsker, R. (1989). An application of the principle of maximum information preservation to linear systems. In Advances in Neural Information Processing Systems 1, ed. D.S. Touretzky (Morgan Kaufmann, San Mateo, CA). (PDF) Classic paper.
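A compact sketch of the infomax idea from Bell & Sejnowski, using the natural-gradient form of the update (Amari's variant) rather than the original rule. Sources, mixing matrix, learning rate, and iteration counts are arbitrary illustrative choices, not the authors' settings.

```python
# Infomax-style blind source separation on synthetic data: two super-Gaussian
# (Laplacian) sources are linearly mixed, and an unmixing matrix W is learned
# with the natural-gradient update dW ∝ (I + (1 - 2y) u^T) W, y = sigmoid(W x).
import numpy as np

rng = np.random.default_rng(0)

n_samples = 5000
S = rng.laplace(size=(2, n_samples))        # unknown sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])      # unknown mixing matrix
X = A @ S                                   # observed mixtures

W = np.eye(2)                               # unmixing matrix to be learned
lr = 0.01
batch = 100

for epoch in range(50):
    for start in range(0, n_samples, batch):
        x = X[:, start:start + batch]
        u = W @ x                           # candidate source estimates
        y = 1.0 / (1.0 + np.exp(-u))        # logistic nonlinearity
        # Natural-gradient infomax update, averaged over the batch.
        dW = (np.eye(2) + (1.0 - 2.0 * y) @ u.T / x.shape[1]) @ W
        W += lr * dW

# If separation worked, W @ A should be close to a scaled permutation matrix.
print(np.round(W @ A, 2))
```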