# Humanities Data Analysis: Case Studies with Python


````{margin}
```{image} images/bookcover.jpg
```
````

_Humanities Data Analysis: Case Studies with Python_ is a practical guide to
data-intensive humanities research using the Python programming language. The book,
written by [Folgert Karsdorp](https://www.karsdorp.io), [Mike
Kestemont](http://mikekestemont.github.io/) and [Allen
Riddell](https://www.ariddell.org/), was originally published with Princeton University
Press in 2021 (for a printed version of the book, see the [publisher's
website](https://press.princeton.edu/books/hardcover/9780691172361/humanities-data-analysis)),
and is now available as an Open Access interactive Jupyter Book.

The book begins with an overview of the place of data science in the humanities, and
proceeds to cover data carpentry: the essential techniques for gathering, cleaning,
representing, and transforming textual and tabular data. Then, drawing from real-world,
publicly available data sets that cover a variety of scholarly domains, the book delves
into detailed case studies. Focusing on textual data analysis, the authors explore such
diverse topics as network analysis, genre theory, onomastics, literacy, author
attribution, mapping, stylometry, topic modeling, and time series analysis. Exercises and
resources for further reading are provided at the end of each chapter.

What is the book about?

```{grid}
:gutter: 3

:::{grid-item-card} Parsing and Manipulating Data ⛏️
:columns: 6
Learn how to effectively gather, read, store and parse different data formats, such as
{ref}`CSV <sec-getting-data-csv>`, {ref}`XML <sec-getting-data-xml>`,
{ref}`HTML <sec-getting-data-html>`, {ref}`PDF <sec-getting-data-pdf>`, and 
{ref}`JSON <sec-getting-data-json>` data.
:::

:::{grid-item-card} Modeling and Data Representation 🚀
:columns: 6
Construct {ref}`Vector Space Models <chp-vector-space-model>` for texts and represent
data in a {ref}`tabular <chp-working-with-data>` format. Learn how to use these and other
representations (such as {ref}`topics <chp-topic-models>`) to assess similarities and
distances between texts. 
:::

:::{grid-item-card} Creating Sophisticated Visualizations 📈
:columns: 6
Tell visual stories through data visualizations of
{ref}`character networks <chp-getting-data>`, 
{ref}`patterns of cultural change <chp-working-with-data>`, 
{ref}`statistical distributions <chp-statistics-essentials>`, and
{ref}`(shifts in) geographical distributions <chp-map-making>`. 
:::

:::{grid-item-card} Working on Real-World Case Studies 🌎
:columns: 6
Work on real-world case studies using publicly available data sets. Dive into the world of 
{ref}`historical cookbooks <chp-introduction-cook-books>`, 
{ref}`French drama <chp-vector-space-model>`, 
{ref}`Danish folktale collections <chp-map-making>`,
{ref}`the Tate art gallery <chp-topic-models>`, 
{ref}`mysterious medieval manuscripts <chp-stylometry>`, and many more. 
:::
```

### Accompanying Data 
The book draws on a large number of datasets. These datasets are published online
under the DOI ``10.5281/zenodo.891264`` and can be downloaded from
https://doi.org/10.5281/zenodo.891264.
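
For readers who prefer to locate the data from a script, the following is a minimal sketch of how the DOI above maps to its resolver address. The helper names `doi_to_url` and `resolve_doi` are hypothetical and not part of the book's code. `resolve_doi` needs network access, and it assumes the server accepts requests from Python's default client, which may not always hold.

```python
from urllib.parse import quote
from urllib.request import urlopen

# DOI of the accompanying datasets, as given in the text above.
DOI = "10.5281/zenodo.891264"


def doi_to_url(doi: str) -> str:
    """Build the https://doi.org resolver address for a DOI (hypothetical helper)."""
    return "https://doi.org/" + quote(doi.strip(), safe="/")


def resolve_doi(doi: str, timeout: float = 30.0) -> str:
    """Follow the DOI redirect and return the final landing-page URL.

    Requires network access; the request may be rejected by some servers.
    """
    with urlopen(doi_to_url(doi), timeout=timeout) as response:
        return response.geturl()


print(doi_to_url(DOI))
```

Calling `resolve_doi(DOI)` should return the address of the record's landing page, from which the data files can be downloaded by hand.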

### Citing HDA

If you use _Humanities Data Analysis_ in an academic publication, please cite the original
publication:

````{tab-set-code} 

```{code-block} APA
Karsdorp, F., Kestemont, M., & Riddell, A. (2021). Humanities Data Analysis: Case Studies
with Python. Princeton University Press. 
```

```{code-block} bibtex
@book{hda,
  author = {Folgert Karsdorp and Mike Kestemont and Allen Riddell},
  title = {Humanities Data Analysis: Case Studies with Python},
  publisher = {Princeton University Press},
  isbn = {9780691172361},
  year = {2021}
}
```
````

