01 Hello Quarto Notebooks

Notebooks for Academics

Carlos Granell

GEOTEC, Universitat Jaume I

May 2, 2024

What do Notebooks mean?

Why do scientists use notebooks?

Field notes

The handwritten field notebook is a traditional research tool in science

Faraday’s use of notebooks

  • Faraday recorded roughly 30,000 experiments

  • Laboratory notebooks, idea books, loose slips, retrieval sheets, work sheets

  • Notebook entries numbered from 1 (Aug 1832) to 16,041 (Mar 1860)

Does the field notebook need improvement?

How do scientists use notebooks?

Literate programming (Knuth 1984)

Code embedded within the program’s documentation as opposed to documentation embedded within code.

Prose & code together

Electronic documents give reproducible research a new meaning

  • Merge publication with underlying computational analysis

  • Executable digital notebook

  • Be open & help others

  • Document for future self

Dynamic documentation

Any time that the underlying data, analysis, or code change…

  • figures and tables are automatically generated

  • results in text are automatically rendered

  • final content is automatically updated
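The idea behind the list above can be sketched in a few lines of Python. This is only an illustration, not how any particular notebook tool works internally: the function name `render_result` and the sample measurements are hypothetical. The point is that the sentence is computed from the data, so changing the data changes the text without any copying or pasting.

```python
from statistics import mean


def render_result(measurements):
    """Return the results sentence, computed from the data instead of typed by hand."""
    return (
        f"We analysed {len(measurements)} measurements "
        f"with a mean of {mean(measurements):.2f}."
    )


# Hypothetical data: the second call simulates a revision that adds one value.
first_draft = render_result([2.0, 4.0, 6.0])
after_revision = render_result([2.0, 4.0, 6.0, 8.0])
```

Here the count and the mean reported in `after_revision` differ from `first_draft` only because the input changed; the text itself was never edited.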

Dynamic documentation

Constant content change

Advantages (source)

(1) Eliminate human error in copying and pasting results

We found that half of all published psychology papers that use null-hypothesis significance testing (NHST) contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion.

(Nuijten et al. 2015)
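To make the kind of inconsistency described in the quote concrete, here is a simplified sketch of such a check. It is not the procedure used by Nuijten et al.: it assumes a two-sided z-test (so there are no degrees of freedom), and the function names and the rounding tolerance are illustrative assumptions. It recomputes the p-value from the test statistic and compares it with the reported one.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def is_consistent(z, reported_p, decimals=2):
    """True if the reported p-value matches the recomputed one up to rounding."""
    recomputed = two_sided_p_from_z(z)
    return abs(recomputed - reported_p) <= 0.5 * 10 ** (-decimals)


# Hypothetical reported results.
ok = is_consistent(z=1.96, reported_p=0.05)
mismatch = is_consistent(z=2.5, reported_p=0.05)
```

In a notebook the reported value is produced by the same code that runs the test, so this class of mismatch cannot arise from retyping a number.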

Advantages (source)

(2) Easy revisions and specification of desired figures and tables

When revisions are requested, one may have to constantly modify tables and figures by hand, which creates a strong incentive not to rerun analyses because it would mean re-pasting and re-illustrating all the numbers and figures in an article.

Advantages (source)

(3) Promote computational reproducibility

Easy verification and replication of research findings

While programming environments may seem counter-intuitive for writing papers, they ultimately prevent mistakes and save time.

Advantages: Save time?

Doing reproducible science with notebooks does not take more time overall

It just reallocates where you spend that time.

Current state of affairs (source)

Most computational science is born in notebooks

  • Peer-review and publication workflows don’t support notebooks as research outputs
  • The more complex scenarios involve a lot of manual handling to bring the project to journal submission
  • Often during this process reproducibility is lost, or takes a back seat to the formatting requirements
  • Final submission rarely captures all computations, which are, at best, relegated to supplementary materials

and ends in PDF or Word documents

What if notebooks became main drivers of research, from the initial conception to the journal submission stage?

What do Notebooks for Academics mean?

Academics can use notebooks for

Research

articles, reports

presentations

books, PhD theses

websites, interactive dashboards

blog posts

Teaching

lab documents

presentations

textbooks, manuals

websites, interactive materials

blog posts

How many tools do you use to produce these types of results?

Hello Quarto

One source, many output formats

What is in a notebook?

A notebook is a (cloud-based/local) dynamic document composed of cells, which is used for literate programming

Each cell may contain:

  • narrative/text/documentation, or
  • executable code, or
  • results as code output (charts, tables, plots, maps, …)
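The cell model above can be sketched as a small data structure. This is a toy illustration under stated assumptions, not the format of any real notebook system: the `Cell` class, its `kind` values, and `run_notebook` are hypothetical names. Code cells are executed in order in a shared namespace and whatever they print is stored as their output.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class Cell:
    kind: str          # "text" or "code"
    source: str        # narrative text or executable code
    output: str = ""   # filled in when a code cell is run


def run_notebook(cells):
    """Execute code cells top to bottom, sharing variables between cells."""
    namespace = {}
    for cell in cells:
        if cell.kind == "code":
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec(cell.source, namespace)
            cell.output = buffer.getvalue()
    return cells


notebook = [
    Cell("text", "We sum three measurements."),
    Cell("code", "total = 1 + 2 + 3"),
    Cell("code", "print(total)"),
]
run_notebook(notebook)
```

The third cell can print `total` because the second cell defined it earlier in the same namespace, which is what lets prose, code and results sit together in one document.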

What is in a Quarto notebook?

A Quarto notebook is a (cloud-based/local) dynamic document (.qmd) composed of cells, which is used for literate programming

Each cell may contain:

  • narrative/text/documentation in Markdown format, or
  • executable code (R, Python, Julia and Observable), or
  • results as code output (charts, tables, plots, maps, …)
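As a rough illustration of what a .qmd file looks like, the following Python sketch builds the text of a minimal Quarto document: a YAML header, a Markdown section, and one executable Python cell. The title, the section text and the helper name `build_qmd` are assumptions made for the example. The cell fence is assembled from single backticks only so that the example can be shown inside a code block.

```python
FENCE = "`" * 3  # three backticks, built here to avoid nesting code fences


def build_qmd(title):
    """Return the text of a minimal Quarto document with one Python cell."""
    lines = [
        "---",
        f'title: "{title}"',
        "format: html",
        "---",
        "",
        "## Results",
        "",
        "The total below is recomputed on every render.",
        "",
        FENCE + "{python}",
        "#| echo: false",
        "values = [1, 4, 9]",
        "print(sum(values))",
        FENCE,
        "",
    ]
    return "\n".join(lines)


qmd_text = build_qmd("Hello Quarto")
```

Writing `qmd_text` to a file such as `hello.qmd` would give a document that mixes narrative and code in a single plain-text source.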

Quarto (.qmd)

Quarto is an open-source scientific and technical publishing system built on Pandoc, and can:

  • be authored in your favourite code editor
  • render from qmd or Jupyter notebook to PDF, Word, HTML, slides, web pages, blog posts, books, etc.
  • execute code in R, Python, Julia and Observable
  • apply journal styles to your outputs with Quarto extensions
  • publish to GitHub Pages, Netlify, etc.
  • orchestrate multiple inputs and outputs with Quarto projects
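Rendering one source to several formats can be sketched with the `quarto render` command. The snippet below only builds the command lines and runs them if the `quarto` executable is found on the PATH; the file name `paper.qmd`, the chosen formats and the helper names are assumptions for illustration.

```python
import shutil
import subprocess


def render_command(source, output_format):
    """Build the command that renders one source file to one output format."""
    return ["quarto", "render", source, "--to", output_format]


def render_all(source, formats):
    """Render the source to each format; return False if quarto is not installed."""
    if shutil.which("quarto") is None:
        return False
    for fmt in formats:
        subprocess.run(render_command(source, fmt), check=True)
    return True


# Hypothetical source file rendered as a web page, a Word document and slides.
commands = [render_command("paper.qmd", fmt) for fmt in ("html", "docx", "revealjs")]
```

Calling `render_all("paper.qmd", ["html", "docx", "revealjs"])` on a machine with Quarto installed would produce the three outputs from the same source file.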

References

Canfield, Michael R. 2011. Field Notes on Science & Nature. Harvard University Press. https://www.hup.harvard.edu/books/9780674057579.
Claerbout, JF, and M Karrenbach. 1992. “Electronic Documents Give Reproducible Research a New Meaning.” In SEG Technical Program Expanded Abstracts 1992, 601–4. Society of Exploration Geophysicists. https://doi.org/10.1190/1.1822162.
Knuth, DE. 1984. “Literate Programming.” The Computer Journal 27 (2): 97–111. https://doi.org/10.1093/comjnl/27.2.97.
Nuijten, Michèle B., Chris H. J. Hartgerink, Marcel A. L. M. van Assen, Sacha Epskamp, and Jelte M. Wicherts. 2015. “The Prevalence of Statistical Reporting Errors in Psychology (1985–2013).” Behavior Research Methods 48 (4): 1205–26. https://doi.org/10.3758/s13428-015-0664-2.
Quintana, Daniel. 2020. “Five Things about Open and Reproducible Science That Every Early Career Researcher Should Know.” Open Science Framework. https://doi.org/10.17605/OSF.IO/DZTVQ.
Tweney, Ryan D, and Christopher D Ayala. 2015. “Memory and the Construction of Scientific Meaning: Michael Faraday’s Use of Notebooks and Records.” Memory Studies 8 (4): 422–39. https://doi.org/10.1177/1750698015587149.