Feeding America: the historic American cookbook dataset. East Lansing: Michigan State University Libraries Special Collections. Accessed: 2018-07-23.


Ken Albala. Cookbooks as historical documents. In Jeffrey M. Pilcher, editor, The Oxford Handbook of Food History, Oxford Handbooks Online. Oxford University Press, 2012.


James Abello, Peter Broadwell, and Timothy R Tangherlini. Computational folkloristics. Communications of the ACM, 55(7):60–70, 2012.


Alberto Acerbi and R Alexander Bentley. Biases in cultural transmission shape the turnover of popular traits. Evolution and Human Behavior, 35(3):228–236, 2014.


Alberto Acerbi, Stefano Ghirlanda, and Magnus Enquist. The logic of fashion cycles. PLoS ONE, 7(3):e32541, 2012.


Maria José Afanador-Llach, James Baker, Adam Crymble, Víctor Gayol, Martin Grandjean, Jennifer Isasi, François Dominic Laramée, Zoe LeBlanc, Matthew Lincoln, Sarah Melton, Jose Antonio Motilla, Joshua G. Ortiz Baco, Sofia Papastamkou, Jessica Parr, Marie Puren, Riva Quiroga, Antonio Rojas Castro, Anna-Maria Sichani, Anandi Silva Knuppel, Amanda Visconti, and Brandon Walsh. 2019 Programming Historian Deposit release. November 2019. doi:10.5281/zenodo.3525082.


Apoorv Agarwal, Anup Kotalwar, and Owen Rambow. Automatic extraction of social networks from literary text: a case study on Alice in Wonderland. In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP 2013), 1202–1208. Nagoya, Japan, 2013.


David W. Aha, Dennis Kibler, and Marc K. Albert. Instance-based learning algorithms. Machine Learning, 6:37–66, 1991.


Edoardo M. Airoldi, Annelise G. Anderson, Stephen E. Fienberg, and Kiron K. Skinner. Who wrote Ronald Reagan's radio addresses? Bayesian Analysis, 1:289–319, June 2006. doi:10.1214/06-BA110.


R. Alberich, J. Miro-Julia, and F. Rossello. Marvel Universe looks almost like a real social network. 2002.


Shlomo Argamon. Interpreting Burrows's Delta: Geometric and probabilistic foundations. Literary and Linguistic Computing, 23(2):131–147, 2008.


Jeffrey Arnold. American Civil War Battle Data. March 2018. doi:10.6084/m9.figshare.1515995.v14.


Taylor Arnold and Lauren Tilton. Humanities Data in R. Exploring Networks, Geospatial Data, Images, and Text. Springer, 2015.


Mark Aronoff and Kristen Fudeman. What is Morphology? Blackwell, 2005.


Sheldon Axler. Linear Algebra Done Right. Springer, New York, 2 edition, February 2004. ISBN 978-0-387-98258-8.


Ralf Harald Baayen. Analyzing linguistic data. A practical introduction to Statistics using R. Cambridge University Press, 2008.


Herbert Barry and Aylene S Harper. Evolution of unisex names. Names, 30(1):15–22, 1982.


Thierry Bertin-Mahieux, Daniel P.W. Ellis, Brian Whitman, and Paul Lamere. The million song dataset. In Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR 2011). 2011.


J. Binongo and W. Smith. The application of principal components analysis to stylometry. Literary and Linguistic Computing, 14(4):445–466, 1999.


Steven Bird, Ewan Klein, and Edward Loper. Natural Language Processing with Python. O'Reilly Media, Inc., 1st edition, 2009. ISBN 0596516495, 9780596516499.


Christopher M Bishop. Pattern Recognition and Machine Learning. Springer, New York, NY, 2007. ISBN 0-387-31073-8 978-0-387-31073-2.


David M. Blei and John D. Lafferty. Dynamic Topic Models. In Proceedings of the 23rd International Conference on Machine Learning, 113–120. Pittsburgh, PA, 2006. ACM.


David M. Blei and John D. Lafferty. A Correlated Topic Model of Science. The Annals of Applied Statistics, 1(1):17–35, 2007. doi:10.1214/07-AOAS114.


David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993–1022, 2003.


Sharon Block and David Newman. What, Where, When, and Sometimes Why: Data Mining Two Decades of Women's History Abstracts. Journal of Women's History, 23(1):81–109, 2011.


Katherine Bode. Reading by Numbers: Recalibrating the Literary Field. Anthem Press, New York, July 2012. ISBN 978-0-85728-454-9.


Christine L Borgman. Scholarship in the digital age: Information, infrastructure, and the Internet. MIT Press, 2010.


William Brooks. Philippe Quinault, Dramatist. Peter Lang, 2009.


Wray Buntine. Estimating Likelihoods for Topic Models. In Advances in Machine Learning, First Asian Conference on Machine Learning, 51–64. 2009.


Wray Buntine. Hca 0.63. 2016.


Wray L. Buntine and Swapnil Mishra. Experiments with Non-parametric Topic Models. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, 881–890. New York, NY, USA, 2014. ACM. doi:10.1145/2623330.2623691.


John Burrows. Computation into criticism. A Study of Jane Austen's novels and an experiment in methods. Clarendon Press, 1987.


John Burrows. 'Delta': A measure of stylistic difference and a guide to likely authorship. Literary and Linguistic Computing, 17(3):267–287, 2002.


Bob Carpenter. Integrating Out Multinomial Parameters in Latent Dirichlet Allocation and Naive Bayes for Collapsed Gibbs Sampling. Technical Report, LingPipe, Inc., September 2010.


George Casella and Roger L. Berger. Statistical Inference. Duxbury Press, 2 edition, June 2001. ISBN 0-534-24312-6.


Karin Knorr Cetina. Epistemic cultures: How the sciences make knowledge. Harvard University Press, 2009.


Allison June-Barlow Chaney and David M. Blei. Visualizing Topic Models. In ICWSM. 2012.


Kenneth W. Church and William A. Gale. Poisson mixtures. Natural Language Engineering, 1(02):163–190, June 1995. doi:10.1017/S1351324900000139.


Eric Clarke and Nicholas Cook, editors. Empirical Musicology: Aims, Methods, Prospects. Oxford University Press, 2004.


Tanya Clement and Stephen McLaughlin. Measured Applause: Toward a Cultural Analysis of Audio Collections. Journal of Cultural Analytics, pages 11058, May 2016. doi:10.22148/16.002.


Kevin M. Clermont and Emily Sherwin. A Comparative View of Standards of Proof. The American Journal of Comparative Law, 50(2):243–275, 2002.


Jim Collins. Bring on the Books for Everybody: How Literary Culture Became Popular Culture. Duke University Press, Durham, NC, June 2010. ISBN 978-0-8223-4588-6.


Nicholas Cook. Beyond the score: Music as performance. Oxford University Press, 2013.


National Research Council. Measuring Racial Discrimination. The National Academies Press, 2004. ISBN 978-0-309-46923-4. doi:10.17226/10887.


Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley-Interscience, Hoboken, NJ, 2 edition, 2006. ISBN 978-0-471-24195-9.


Hugh Craig. Stylistic Analysis and Authorship Studies. In Susan Schreibman, Ray Siemens, and John Unsworth, editors, A Companion to Digital Humanities, Blackwell Companions to Literature and Culture. Blackwell Publishing Professional, Oxford, hardcover edition, December 2004.


Sara Graça Da Silva and Jamshid J Tehrani. Comparative phylogenetic analyses uncover the ancient roots of Indo-European folktales. Royal Society Open Science, 3(1):150645, 2016.


Walter Daelemans and Antal Van den Bosch. Memory-Based Language Processing. Studies in Natural Language Processing. Oxford University Press, 2005.


Amy J Devitt. Generalizing about genre: new conceptions of an old concept. College Composition and Communication, 44(4):573–586, 1993.


Persi Diaconis and Brian Skyrms. Ten Great Ideas about Chance. Princeton University Press, 2017.


David K. Elson, Nicholas Dames, and Kathleen R. McKeown. Extracting social networks from literary fiction. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 138–147. Uppsala, Sweden, 2010.


Robert Escarpit. Sociologie de la littérature. P.U.F., 1958.


Stefan Evert, Thomas Proisl, Fotis Jannidis, Isabella Reger, Steffen Pielström, Christof Schöch, and Thorsten Vitt. Understanding and explaining Delta measures for authorship attribution. Digital Scholarship in the Humanities, 32:ii4–ii16, 2017.


Rita Felski. Uses of Literature. Wiley-Blackwell, Oxford, 2008. ISBN 978-1-4051-4724-8.


David Hackett Fischer. Albion's Seed: Four British Folkways in America. Oxford University Press, 1989. ISBN 978-0-19-506905-1.


Roman Frigg and Charlotte Werndl. Entropy: A Guide for the Perplexed. In Probabilities in Physics. Oxford University Press, Oxford, New York, December 2011.


Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, 2 edition, July 2003. ISBN 1-58488-388-X.


Sean Gerrish and David M. Blei. How they vote: Issue-adjusted models of legislative behavior. In Advances in Neural Information Processing Systems, 2753–2761. 2012.


Chris Glynn, Surya T. Tokdar, Brian Howard, and David L. Banks. Bayesian Analysis of Dynamic Linear Topic Models. Bayesian Analysis, 2018.


Alex Gourevitch. Quitting Work but Not the Job: Liberty and the Right to Strike. Perspectives on Politics, 14(2):307–323, June 2016. doi:10.1017/S1537592716000049.


Ian N. Gregory and Paul S. Ell. Historical GIS: Technologies, Methodologies, and Scholarship. Cambridge University Press, Cambridge ; New York, 2007. ISBN 978-0-521-67170-5.


S. Gries. Statistics for Linguistics with R. A Practical Introduction. De Gruyter Mouton, 2013.


Jack Grieve. Quantitative authorship attribution: an evaluation of techniques. Literary and Linguistic Computing, 22(3):251–270, 2007. doi:10.1093/llc/fqm020.


Charles M. Grinstead and Laurie J. Snell. Introduction to Probability. American Mathematical Society, n. p., 2 edition, 2012. ISBN 978-0-8218-9414-9.


Philip Guo. Python Is Now the Most Popular Introductory Teaching Language at Top U.S. Universities. July 2014.


Ian Hacking. An Introduction to Probability and Inductive Logic. Cambridge University Press, July 2001. ISBN 0-521-77501-9.


Nicholas Hammond. Highly Irregular: Defining Tragicomedy in Seventeenth-Century France. In Subha Mukherji and Raphael Lyne, editors, Early Modern Tragicomedy, pages 76–83. DS Brewer, 2007.


N. Katherine Hayles. How We Think: Digital Media and Contemporary Technogenesis. University of Chicago Press, 2012. ISBN 0-226-32142-8.


Gregor Heinrich. Parameter Estimation for Text Analysis. Technical Report, vsonix GmbH, University of Leipzig, September 2009. Version 2.9.


Joseph Henrich, Steven J. Heine, and Ara Norenzayan. Most people are not WEIRD. Nature, 466(7302):29, 2010.


J Berenike Herrmann, Karina van Dalen-Oskam, and Christof Schöch. Revisiting style, a key concept in literary studies. Journal of Literary Theory, 9(1):25–52, 2015.


Peter D. Hoff. A First Course in Bayesian Statistical Methods. Springer, New York, 2009. ISBN 0-387-92299-7.


Richard Hoggart. The Uses of Literacy: Aspects of Working-Class Life with Special References to Publications and Entertainments. Chatto and Windus, London, 1957.


David I. Holmes. Authorship attribution. Computers and the Humanities, 28(2):87–106, 1994.


David I. Holmes. The evolution of stylometry in Humanities scholarship. Literary and Linguistic Computing, 13(3):111–117, 1998.


David L. Hoover. Corpus Stylistics, Stylometry, and the Styles of Henry James. Style, 2007.


Michael Hout. Getting the most out of the GSS income measures. Technical Report 101, National Opinion Research Center Washington, 2004.


Magnus Huber. The Old Bailey Proceedings, 1674-1834: evaluating and annotating a corpus of 18th- and 19th-century spoken English. In Anneli Meurman-Solin and Arja Nurmi, editors, Annotating Variation and Change (Studies in Variation, Contacts and Change in English 1). University of Helsinki, 2007.


Yohei Igarashi. Statistical Analysis at the Birth of Close Reading. New Literary History, 46(3):485–504, 2015. doi:10.1353/nlh.2015.0023.


Kosuke Imai. Quantitative Social Science. An Introduction. Princeton University Press, 2018.


Bill James. Battling Expertise with the Power of Ignorance. April 2010.


Fotis Jannidis and Gerhard Lauer. Burrows's Delta and its use in German literary history. In M. Erlin and L. Tatlock, editors, Distant Readings. Topologies of German Culture in the Long Nineteenth Century, pages 29–54. Camden House, 2014.


Edwin Thompson Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, Cambridge, UK ; New York, NY, 2003. ISBN 978-0-521-59271-0.


Matthew L. Jockers. Text Analysis with R for Students of Literature. Springer, 2014.


Eric Jones, Travis Oliphant, Pearu Peterson, and others. SciPy: open source scientific tools for Python. 2001–2017. [Online; accessed 2017-09-09].


Lyle V. Jones. The Collected Works of John W. Tukey: Philosophy and Principles of Data Analysis 1949-1964. Volume 3. CRC Press, 1986.


Patrick Juola. Authorship Attribution. Foundations and Trends® in Information Retrieval, 1(3):233–334, 2006. doi:10.1561/1500000005.


Patrick Juola and Stephen Ramsay. Six Septembers: Mathematics for the Humanist. Zea Books, Lincoln, NE, 2017. doi:10.13014/K2D21VHX.


Dan Jurafsky and James H. Martin. Speech and Language Processing. Prentice Hall, 3 edition, in press.


Joseph B. Kadane. Principles of Uncertainty. Chapman and Hall/CRC, Boca Raton, FL, 2011. ISBN 978-1-4398-6161-5.


Folgert Karsdorp, Mike Kestemont, Christof Schöch, and Antal van den Bosch. The Love Equation: Computational Modeling of Romantic Relationships in French Classical Drama. In Mark A. Finlayson, Ben Miller, Antonio Lieto, and Remi Ronfard, editors, 6th Workshop on Computational Models of Narrative (CMN 2015), volume 45 of OpenAccess Series in Informatics (OASIcs), 98–107. Dagstuhl, Germany, 2015. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.


Jason Kessler. Scattertext: a browser-based tool for visualizing how corpora differ. In Proceedings of ACL 2017, System Demonstrations, 85–90. Association for Computational Linguistics, 2017.


Mike Kestemont. Function words in authorship attribution: from black magic to theory? In Proceedings of the 3rd Workshop on Computational Linguistics for Literature, 59–66. Association for Computational Linguistics, 2014.


Mike Kestemont, Sara Moens, and Jeroen Deploige. Collaborative authorship in the twelfth century: a stylometric study of Hildegard of Bingen and Guibert of Gembloux. Digital Scholarship in the Humanities, 30(2):199–224, 2015.


Mike Kestemont, Justin Stover, Moshe Koppel, Folgert Karsdorp, and Walter Daelemans. Authenticating the writings of Julius Caesar. Expert Systems with Applications, 63:86–96, November 2016. doi:10.1016/j.eswa.2016.06.029.


Anne Kelly Knowles and Amy Hillier, editors. Placing history: how maps, spatial data, and GIS are changing historical scholarship. ESRI, Inc., 2008.


Moshe Koppel, Jonathan Schler, and Shlomo Argamon. Computational methods in authorship attribution. Journal of the American Society for Information Science and Technology, 60(1):9–26, 2009.


Peter Krafft, Juston Moore, Bruce Desmarais, and Hannah M. Wallach. Topic-partitioned multinetwork embeddings. In Advances in Neural Information Processing Systems, 2807–2815. 2012.


Bruce Kraig. The Oxford encyclopedia of food and drink in America. Volume 1. Oxford University Press, 2013.


Bruno Latour. Why Has Critique Run out of Steam? From Matters of Fact to Matters of Concern. Critical Inquiry, 30(2):225–248, 2004.


Benjamin E. Lauderdale and Tom S. Clark. Scaling politically meaningful dimensions using texts and votes. American Journal of Political Science, 58(3):754–771, 2014.


Peter M. Lee. Bayesian Statistics: An Introduction. Wiley, 4 edition, September 2012. ISBN 1-118-33257-1.


Stanley Lieberson. A Matter of Taste: How Names, Fashions, and Culture Change. Yale University Press, New Haven, Connecticut, 2000.


Stanley Lieberson and Freda B Lynn. Popularity as a taste: an application to the naming process. Onoma, 38:235–276, 2003.


Jan Longone. Feeding America: the historic American cookbook project. Accessed: 2018-07-23.


Max Louwerse, Sterling Hutchinson, and Zhiqiang Cai. The Chinese route argument: predicting the longitude and latitude of cities in China and the Middle East using statistical linguistic frequencies. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 34. 2012.


Max Louwerse and Nick Benesh. Representing spatial structure through maps and language: Lord of the Rings encodes the spatial structure of Middle Earth. Cognitive Science, 36(8):1556–1569, 2012. doi:10.1111/cogs.12000.


Magnus Huber, Magnus Nissel, and Karin Puga. Old Bailey Corpus 2.0. 2016.


Christopher D. Manning and Hinrich Schütze. Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge, Massachusetts, 1999.


Eric Matthes. Python crash course: a hands-on, project-based introduction to programming. No Starch Press, San Francisco, California, 2016.


Andrew Kachites McCallum. MALLET: a machine learning for language toolkit. 2002.


Willard McCarty. Knowing... modelling in literary studies. In Susan Schreibman and Ray Siemens, editors, A Companion to Digital Literary Studies. Blackwell, 2004.


Willard McCarty. Modeling: a study in words and meanings. In Susan Schreibman, Ray Siemens, and John Unsworth, editors, A Companion to Digital Humanities. Blackwell, 2004.


Willard McCarty. Humanities Computing. Palgrave Macmillan, 2005.


Wes McKinney. Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference, 51–56. 2012.


Wes McKinney. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Beijing, first edition, 2012.


Wes McKinney. Python for Data Analysis. Second Edition. O'Reilly, 2017.


David Mimno. Computational Historiography: Data Mining in a Century of Classics Journals. ACM Journal of Computing in Cultural Heritage, 5(1):3:1–3:19, April 2012.


David Mimno. Topic Regression. Ph.D. Thesis, University of Massachusetts Amherst, 2012.


Janet Mitchell. Cookbooks as a social and historical document: a Scottish case study. Food Service Technology, 1:13–23, 2001.


Ryan Mitchell. Web Scraping with Python. Collecting Data from the Modern Web. O'Reilly Media, 2015.


Franco Moretti. Network theory, plot analysis. New Left Review, 68:80–102, 2011.


Frederick Mosteller. A Statistical Study of the Writing Styles of the Authors of "The Federalist" Papers. Proceedings of the American Philosophical Society, 131:132–140, June 1987.


Frederick Mosteller and David L Wallace. Inference and Disputed Authorship: The Federalist. Addison-Wesley, Reading, MA, 1964.


Charles Muller. Étude de statistique lexicale: le vocabulaire du théâtre de Pierre Corneille. Larousse, Paris, 1967.


Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press, Cambridge, MA, August 2012. ISBN 978-0-262-01802-9.


Mark Newman. Networks. An Introduction. Oxford University Press, New York, NY, USA, 2010.


Dong Nguyen, Rilana Gravel, Dolf Trieschnigg, and Theo Meder. "How old do you think I am?" A study of language and age in Twitter. In ICWSM. 2013.


Andrea Nini. An authorship analysis of the Jack the Ripper letters. Digital Scholarship in the Humanities, 33:621–636, January 2018. doi:10.1093/llc/fqx065.


Rebecca J. Passonneau and Bob Carpenter. The benefits of a model of annotation. Transactions of the Association for Computational Linguistics, 2:311–326, 2014.


F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.


Andrew Pickering. The Mangle of Practice: Time, Agency, and Science. University of Chicago Press, Chicago, 1995. ISBN 978-0-226-66803-1.


Elena Pierazzo. Digital Scholarly Editing. Theories, Models and Methods. Ashgate, 2015.


Jonathan K. Pritchard, Matthew Stephens, and Peter Donnelly. Inference of Population Structure Using Multilocus Genotype Data. Genetics, 155:945–959, June 2000.


Janice A. Radway. Reading the Romance: Women, Patriarchy, and Popular Literature. The University of North Carolina Press, Chapel Hill, revised edition, 1991. ISBN 978-0-8078-4349-9.


Janice A. Radway. A Feeling for Books: The Book-of-the-Month Club, Literary Taste, and Middle-Class Desire. University of North Carolina Press, Chapel Hill, 1999. ISBN 978-0-8078-4830-2.


Stephen Ramsay. The Hermeneutics of Screwing Around; or What You Do with A Million Books. In Kevin Kee, editor, Pastplay: Teaching and Learning History with Technology, pages 111–120. The University of Michigan Press, Ann Arbor, March 2014.


Josyula R. Rao, Pankaj Rohatgi, and others. Can pseudonymity really guarantee privacy? In USENIX Security Symposium, 85–96. 2000.


Allen B. Riddell. How to Read 22,198 Journal Articles: Studying the History of German Studies with Topic Models. In Matt Erlin and Lynne Tatlock, editors, Distant Readings: Topologies of German Culture in the Long Nineteenth Century, pages 91–114. Camden House, Rochester, New York, January 2014.


David Robinson. The Incredible Growth of Python | Stack Overflow. September 2017.


Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas Griffiths, Padhraic Smyth, and Mark Steyvers. Learning author-topic models from text corpora. ACM Transactions on Information Systems (TOIS), 28(1):1–38, 2010.


Edward W. Said. Orientalism. Pantheon Books, New York, 1978.


Richard Schneirov, Shelton Stromquist, and Nick Salvatore, editors. The Pullman Strike and the Crisis of the 1890s: Essays on Labor and Politics. Working class in American history. University of Illinois Press, Urbana, 1999. ISBN 978-0-252-06755-6 978-0-252-02447-4.


Alexandra Schofield, Måns Magnusson, and David Mimno. Pulling out the stops: Rethinking stopword removal for topic models. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, volume 2, 432–436. 2017.


Alexandra Schofield and David Mimno. Comparing apples to apple: The effects of stemmers on topic models. Transactions of the Association for Computational Linguistics, 4:287–300, 2016.


Susan Schreibman, Ray Siemens, and John Unsworth, editors. A Companion to Digital Humanities. Blackwell, 2004.


Christof Schöch. Topic Modeling Genre: An Exploration of French Classical and Enlightenment Drama. Digital Humanities Quarterly, 2017.


Fabrizio Sebastiani. Machine Learning in automated text categorization. ACM Computing Surveys, 34(1):1–47, 2002.


William H. Sewell Jr. The Political Unconscious of Social and Cultural History, or, Confessions of a Former Quantitative Historian. In The Politics of Method in the Human Sciences: Positivism and Its Epistemological Others. Duke University Press Books, 2005.


Cosma Shalizi. Graphs, Trees, Materialism, Fishing. In John Holbo and Jonathan Goodwin, editors, Reading Graphs, Maps, and Trees: Responses to Franco Moretti. Parlor Press, 2011.


Jonathon Shlens. A tutorial on principal component analysis. CoRR, 2014. arXiv:1404.1100.


Ray Siemens and Susan Schreibman, editors. A Companion to Digital Literary Studies. Blackwell, 2008.


Herbert A. Simon. On a Class of Skew Distribution Functions. Biometrika, 42(3/4):425–440, 1955. doi:10.2307/2333389.


Efstathios Stamatatos. A survey of modern authorship attribution methods. Journal of the American Society for Information Science and Technology, 60(3):538–556, 2009.


Efstathios Stamatatos. On the robustness of authorship attribution based on character n-gram features. Journal of Law and Policy, 21(2):421–439, 2013.


Constantina Stamou. Stylochronometry: stylistic development, sequence of composition, and relative dating. Literary and Linguistic Computing, 23(2):181–199, 2008.


John Stephens and Robyn McCallum. Retelling stories, framing culture: traditional story and metanarratives in children's literature. Routledge, 2013.


Matthew Stephens. Dealing with label switching in mixture models. Journal of The Royal Statistical Society, Series B, 62:795–809, 2000.


Ida Storm, Holly Nicol, Georgia Broughton, and Timothy R. Tangherlini. Folklore tracks: historical GIS and folklore collection in 19th century Denmark. In Proceedings of the International Digital Humanities Symposium. Växjö, Sweden, 2017.


Lucy Suchman, Jeanette Blomberg, Julian E Orr, and Randall Trigg. Reconstructing technologies as social practice. American Behavioral Scientist, 43(3):392–408, 1999.


Christina A Sue and Edward E Telles. Assimilation and gender in naming. American Journal of Sociology, 112(5):1383–1415, 2007.


Timothy R. Tangherlini. Danish Folktales, Legends and Other Stories. University of Washington Press, Seattle, 2013.


Timothy R. Tangherlini and Peter M. Broadwell. Sites of (re)collection: creating the Danish folklore nexus. Journal of Folklore Research, pages 223–247, 2014.


Stephan Thernstrom, Ann Orlov, and Oscar Handlin, editors. Harvard Encyclopedia of American Ethnic Groups. Belknap, Cambridge, MA, 1980.


John B. Thompson. Merchants of Culture: The Publishing Business in the Twenty-First Century. Polity, 2 edition, 2012. ISBN 978-0-7456-6361-6.


Gaye Tuchman. Edging Women Out: Victorian Novelists, Publishers, and Social Change. Yale University Press, New Haven, 1989. ISBN 0300043163.


George Tzanetakis, Ajay Kapur, W Andrew Schloss, and Matthew Wright. Computational ethnomusicology. Journal of Interdisciplinary Music Studies, 1(2):1–24, 2007.


Anne Ubersfeld, Frank Collins, Paul Perron, and Patrick Debbèche. Reading Theatre. Toronto Studies in Semiotics and Communication Series. University of Toronto Press, Toronto, 1999.


Daniel W. VanArsdale. Chain letter evolution. barnowl/chain-letter/evolution.html, 2019. Accessed: 2019-10-19.


Jake Vanderplas. Python Data Science Handbook. Essential Tools for Working with Data. O'Reilly Media, 2016.


Stéfan van der Walt, S Chris Colbert, and Gael Varoquaux. The NumPy array: a structure for efficient numerical computation. Computing in Science & Engineering, 13(2):22–30, 2011.


Hadley Wickham and Garrett Grolemund. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly, 2017.


Raymond Williams. The Long Revolution. Chatto & Windus, London, 1961.


Vanessa Williams. Georgia voting rights activists move to block a plan to close two-thirds of polling places in a majority black county. August 2018.


Greg Wilson, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy K. Teal. Good enough practices in scientific computing. PLOS Computational Biology, June 2017. doi:10.1371/journal.pcbi.1005510.


Art Winslow. The Fiction Atop the Fiction. September 2015.


Robert Young. White Mythologies: Writing History and the West. Routledge, London ; New York, 1990.


G. Udny Yule. The statistical study of literary vocabulary. Cambridge University Press, 1944.


He Zhao, Lan Du, Wray Buntine, and Gang Liu. MetaLDA: A Topic Model that Efficiently Incorporates Meta Information. In 2017 IEEE International Conference on Data Mining (ICDM), 635–644. November 2017. doi:10.1109/ICDM.2017.73.


Statistics Canada Government of Canada. Previous standard - Race (ethnicity). July 1998.


Statistics Canada Government of Canada. Visible minority of person. December 2015.


Ministère de l'Éducation Nationale et de la Jeunesse. Projets de programmes de seconde et de première du lycée général et technologique. November 2018.


R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013.


Peter Van Kranenburg, Martine De Bruin, and Anja Volk. Documenting a song culture: the Dutch Song Database as a resource for musicological research. International Journal on Digital Libraries, September 2017. doi:10.1007/s00799-017-0228-4.