Source for The Good Research Code Handbook

  • By Patrick Mineault
  • Last update: Jan 9, 2023
  • Comments: 15

The Good Research Code Handbook source

This is the source for the Good Research Code Handbook, by Patrick Mineault. The website lives at goodresearch.dev. The book is built with Jupyter Book, using a highly customized theme based on tufte.css.

Reporting issues with the website

Please report any issues via the Issues tab.

Building the book

Recreate the conda environment with:

conda env create --name cb --file environment.yml

Run jupyter-book build . to build.

Use netlify deploy --prod to deploy.

Citing this book

10.5281/zenodo.5796873

Patrick J Mineault & The Good Research Code Handbook Community (2021). The Good Research Code Handbook. Zenodo. doi:10.5281/zenodo.5796873

Github

https://github.com/patrickmineault/codebook

Comments(15)

  • 1

    Sphinx module problem

    Hi, I am trying to build this Jupyter Book, and I get the error below even after running pip install sphinx and python -m pip install sphinxext-opengraph.

    The conf file already has these lines:

    sphinx:
      extra_extensions:
      - sphinxext.opengraph
    

    so I don't know what the problem is?

    Extension error:
    Could not import extension sphinxext.opengraph (exception: No module named 'sphinxext')
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sphinx/registry.py", line 425, in load_extension
        mod = import_module(extname)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'sphinxext'
    
  • 2

    Error installing required packages and building book

    Hi, running into a few problems trying to set up the theme:

    Conda version: 4.11.0

    conda create --name cb --file environment.yml gives: CondaValueError: could not parse 'name: cb' in: environment.yml

    Seems like the command should be conda env create -n cb --file environment.yml:

    • the code in the README is missing env
    • the name flag is passed here in its short form, -n (the long form, --name, is also accepted)

    Running the modified command, I still get:

    Collecting package metadata (repodata.json): done
    Solving environment: failed
    
    ResolvePackageNotFound: 
      - sqlite==3.36.0=hc218d9a_0
      - perl==5.20.3.1=2
      - ld_impl_linux-64==2.35.1=h7274673_9
      - ca-certificates==2021.5.30=ha878542_0
      - xorg-libxt==1.2.1=h7f98852_2
      - libiconv==1.16=h516909a_0
      - mkl_fft==1.3.0=py38h42c9631_2
      - libxcb==1.14=h7b6447c_0
      - openssl==1.1.1l=h7f98852_0
      - libglib==2.68.3=h3e27bee_0
      - giflib==5.2.1=h36c2ea0_2
      - ncurses==6.2=he6710b0_1
      - gdk-pixbuf==2.42.6=h04a7f16_0
      - numexpr==2.7.3=py38h22e1b3c_1
      - xorg-libxext==1.3.4=h7f98852_1
      - readline==8.1=h27cfd23_0
      - qt==5.9.7=h5867ecd_1
      - latexmk==4.55=pl5.20.3_0
      - cycler==0.10.0=py38_0
      - xz==5.2.5=h7b6447c_0
      - numpy-base==1.20.3=py38h74d4b33_0
      - scikit-learn==0.24.2=py38ha9443f7_0
      - tornado==6.1=py38h27cfd23_0
      - harfbuzz==2.4.0=h37c48d4_1
      - pixman==0.38.0=h516909a_1003
      - lz4-c==1.9.3=h295c915_1
      - gobject-introspection==1.68.0=py38h2109141_1
      - glib-tools==2.68.3=h9c3ff4c_0
      - librsvg==2.50.3=hfa39831_1
      - libgcc-ng==11.1.0=hc902ee8_8
      - fribidi==1.0.10=h36c2ea0_0
      - libgfortran4==7.5.0=ha8ba4b0_17
      - zlib==1.2.11=h7b6447c_3
      - pkg-config==0.29.2=h36c2ea0_1008
      - sip==4.19.13=py38he6710b0_0
      - scipy==1.6.2=py38had2a1c9_1
      - matplotlib-base==3.4.2=py38hab158f2_0
      - libffi==3.3=he6710b0_2
      - zstd==1.4.9=haebb681_0
      - xorg-libsm==1.2.2=h470a237_5
      - xorg-kbproto==1.0.7=h7f98852_1002
      - gst-plugins-base==1.14.0=h8213a91_2
      - libgomp==11.1.0=hc902ee8_8
      - pandas==1.3.2=py38h8c16a72_0
      - libxml2==2.9.12=h03d6c58_0
      - glib==2.68.3=h9c3ff4c_0
      - lcms2==2.12=h3be6417_0
      - freetype==2.10.4=h5ab3b9f_0
      - kiwisolver==1.3.1=py38h2531618_0
      - jbig==2.1=h7f98852_2003
      - graphviz==2.42.3=h0511662_0
      - imagemagick==7.0.11_12=pl5320h0cb4662_0
      - python==3.8.11=h12debd9_0_cpython
      - gettext==0.19.8.1=h0b5b191_1005
      - libstdcxx-ng==9.3.0=hd4cf53a_17
      - libtool==2.4.6=h58526e2_1007
      - pip==21.0.1=py38h06a4308_0
      - xorg-libice==1.0.10=h7f98852_0
      - xorg-libxpm==3.5.13=h7f98852_0
      - expat==2.4.1=h2531618_2
      - cairo==1.16.0=h18b612c_1001
      - xorg-xproto==7.0.31=h7f98852_1007
      - bzip2==1.0.8=h7f98852_4
      - bottleneck==1.3.2=py38heb32a55_1
      - fftw==3.3.9=h27cfd23_1
      - icu==58.2=he6710b0_3
      - graphite2==1.3.13=h58526e2_1001
      - libuuid==1.0.3=h1bed415_2
      - pyqt==5.9.2=py38h05f1152_4
      - libpng==1.6.37=hbc83047_0
      - pango==1.42.4=h7062337_4
      - _openmp_mutex==4.5=1_gnu
      - pcre==8.45=h295c915_0
      - openjpeg==2.4.0=hb52868f_1
      - brotli==1.0.9=he6710b0_2
      - gstreamer==1.14.0=h28cd5cc_2
      - libwebp-base==1.2.0=h27cfd23_0
      - jpeg==9d=h36c2ea0_0
      - matplotlib==3.4.2=py38h06a4308_0
      - libwebp==1.2.0=h3452ae3_0
      - mkl_random==1.2.2=py38h51133e4_0
      - xorg-libx11==1.7.2=h7f98852_0
      - xorg-libxrender==0.9.10=h7f98852_1003
      - fontconfig==2.13.1=h6c09931_0
      - pillow==8.3.1=py38h2c7a002_0
      - xorg-xextproto==7.3.0=h7f98852_1002
      - libtiff==4.2.0=h85742a9_0
      - certifi==2021.5.30=py38h578d9bd_0
      - _libgcc_mutex==0.1=conda_forge
      - mkl-service==2.4.0=py38h7f8727e_0
      - libgfortran-ng==7.5.0=ha8ba4b0_17
      - ghostscript==9.54.0=h9c3ff4c_1
      - intel-openmp==2021.3.0=h06a4308_3350
      - blas==1.0=mkl
      - mkl==2021.3.0=h06a4308_520
      - dbus==1.13.18=hb2f20db_0
      - tk==8.6.10=hbc83047_0
      - xorg-renderproto==0.11.1=h7f98852_1002
      - numpy==1.20.3=py38hf144106_0
    

    And when I try to build the Jupyter Book, this is the error:

    Running Jupyter-Book v0.12.1
    Source Folder: /Users/k/codebook
    Config Path: /Users/k/codebook/_config.yml
    Output Path: /Users/k/codebook/_build/html
    Running Sphinx v4.3.1
    
    Extension error:
    Could not import extension sphinxext.opengraph (exception: No module named 'sphinxext')
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sphinx/registry.py", line 429, in load_extension
        mod = import_module(extname)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'sphinxext'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/jupyter_book/sphinx.py", line 114, in build_sphinx
        app = Sphinx(
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sphinx/application.py", line 237, in __init__
        self.setup_extension(extension)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sphinx/application.py", line 394, in setup_extension
        self.registry.load_extension(self, extname)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sphinx/registry.py", line 432, in load_extension
        raise ExtensionError(__('Could not import extension %s') % extname,
    sphinx.errors.ExtensionError: Could not import extension sphinxext.opengraph (exception: No module named 'sphinxext')
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.9/bin/jupyter-book", line 8, in <module>
        sys.exit(main())
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/jupyter_book/cli/main.py", line 323, in build
        builder_specific_actions(
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/jupyter_book/cli/main.py", line 535, in builder_specific_actions
        raise RuntimeError(_message_box(msg, color="red", doprint=False)) from result
    RuntimeError: 
    ===============================================================================
    
    There was an error in building your book. Look above for the cause.
    
    ===============================================================================
    

    Any idea what's wrong? Thank you!
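
One possible cause, judging only from the paths in the traceback: jupyter-book is running under /Library/Frameworks/Python.framework/Versions/3.9, so if the pip that installed sphinxext-opengraph belongs to a different interpreter or environment, the extension will not be importable there. A minimal sketch for checking this, meant to be run with the same Python that launches jupyter-book. The helper name extension_available is hypothetical and not part of Sphinx or Jupyter Book.

```python
import importlib.util
import sys


def extension_available(name: str) -> bool:
    """Return True if `name` can be found by the running interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # find_spec raises when a parent package (here `sphinxext`) is missing.
        return False


if __name__ == "__main__":
    # Show which interpreter is actually performing the import.
    print("Python executable:", sys.executable)
    print(
        "sphinxext.opengraph importable:",
        extension_available("sphinxext.opengraph"),
    )
```

If the printed executable is not the one inside the conda environment, installing with that interpreter's own pip (python -m pip install sphinxext-opengraph) is the usual remedy. This is an assumption about the setup, not a confirmed diagnosis.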

  • 3

    command to recover environment from yml file did not work

    conda create --name recoveredenv --file environment.yml did not work for me. What worked instead was: conda env create --name recoveredenv --file environment.yml

    This might need to be changed in the handbook.

    Thanks!

  • 4

    Patch 3

    Changed experienced to experience based on tenses (figured it might have been a typo based on the double-up of the word experienced) and updated the sentence so that it reads in the same tense.

  • 5

    Talk about making conda environment.yml and pip freeze output more generic

    Oftentimes, the environment you get through exporting environment.yml or pip freeze is too specific; it doesn't work on other systems. Write about alternative ways of freezing the environment that are more flexible.

    e.g. https://github.com/patrickmineault/codebook/issues/15
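
One way to loosen an exported environment is to keep package names and versions but drop the platform-specific build strings (the hc218d9a_0 part of sqlite==3.36.0=hc218d9a_0). The sketch below is only an illustration of that idea. The function name loosen_pin is hypothetical, and conda offers related options of its own (conda env export --no-builds and conda env export --from-history).

```python
def loosen_pin(spec: str) -> str:
    """Drop the build string from a conda pin, keeping name and version.

    Handles `name=version=build` (environment.yml style) and
    `name==version=build` (as printed in solver error messages).
    """
    spec = spec.strip()
    if "==" in spec:
        name, rest = spec.split("==", 1)
        sep = "=="
    elif "=" in spec:
        name, rest = spec.split("=", 1)
        sep = "="
    else:
        return spec  # no version pin at all, nothing to loosen
    version = rest.split("=", 1)[0]  # discard everything after the version
    return f"{name}{sep}{version}"


if __name__ == "__main__":
    pins = ["sqlite==3.36.0=hc218d9a_0", "numpy=1.20.3=py38hf144106_0", "pip"]
    for pin in pins:
        print(pin, "->", loosen_pin(pin))
```

Dropping build strings does not make an environment portable by itself: packages that exist only on one platform (for example ld_impl_linux-64) still have to be removed by hand or avoided by exporting only the explicitly requested packages.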

  • 6

    PDF version

    Hiya,

    I really love the codebook! It would be nice to have a PDF version. It seems jupyter-book allows you to do this: https://jupyterbook.org/advanced/pdf.html

  • 7

    Update pipelines.md

    • add some more tool suggestions for digital notes
    • fix inconsistencies in Makefile example
    • improve comment on tabs in Makefiles

    Also, beyond Software carpentry, what do you think of linking to https://makefiletutorial.com/ as well?

  • 8

    Tag not hidden from code

    In the testing chapter, one of the code cells has some tags that should probably be hidden in the rendered version so as not to cause any confusion for the reader. The problematic cell is this one: https://github.com/patrickmineault/codebook/blob/f4cc9b7371dea93a250b8efdc4db37ed1cbb9840/testing.md?plain=1#L100

    Thanks for a really nice book!

  • 9

    Update tidy.md

    Typo fix.

    NB: it would be easier to make changes if your Markdown files had a line limit of 80 characters (or some other high two-digit number), instead of one line per paragraph :)

  • 10

    consistency in setup.md

    Make references to package directory/name consistent. Looks like you used to call it code then renamed it to src but missed a few.

    Btw, when renaming the package, should the name= in setup.py be updated as well? (You write that just renaming the directory is sufficient.)

  • 11

    env export vs export env

    On the website, when exporting your Python environment, you list the command as:

    conda export env > environment.yml

    This didn't work for me, and the Conda webpage says to use:

    conda env export > environment.yml

    This worked for me.

  • 12

    Differentiate when to use assertions vs exceptions

    Hi,

    I really liked your talk on test & code, and I'm working through the book at the moment.

    One comment/question I have is that I found your suggestions on when to assert vs. raise an exception a little vague. From my understanding, assertions are great in tests (they're the main mechanism for implementing them) but should not be in production code. Production code should raise a ValueError, NotImplementedError, SomeCustomError... something more representative than AssertionError.

    For example, the exercise https://github.com/patrickmineault/codebook/blob/083bf3840bebee35e99203c87ab00b265841a392/testing.md?plain=1#L433-L435 might lead us to believe that you should have asserts in your production code rather than a more appropriate exception.

    In the "testing to maintain your sanity" section, you mix tests at runtime (the data is bad) and tests that can be done at implementation time (you're loading the data incorrectly), which adds to the vagueness about when to use asserts. https://github.com/patrickmineault/codebook/blob/083bf3840bebee35e99203c87ab00b265841a392/testing.md?plain=1#L29-L33

    Thanks again for the ideas, I'll be implementing some of them!
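
A minimal sketch of the distinction described in this comment, assuming a hypothetical analysis function mean_firing_rate that does not come from the book. The function validates its inputs with explicit exceptions, while assert is reserved for the tests. One concrete reason for the split is that Python removes assert statements when run with the -O flag, so input checks written as asserts can silently disappear.

```python
def mean_firing_rate(spike_counts, duration_s):
    """Hypothetical example: mean spikes per second across trials."""
    # Runtime validation: bad inputs raise a specific, descriptive exception.
    if duration_s <= 0:
        raise ValueError(f"duration_s must be positive, got {duration_s}")
    if len(spike_counts) == 0:
        raise ValueError("spike_counts must not be empty")
    return sum(spike_counts) / (len(spike_counts) * duration_s)


def test_mean_firing_rate():
    # Tests use assert to state the expected behavior.
    assert mean_firing_rate([2, 4], duration_s=2.0) == 1.5


def test_rejects_nonpositive_duration():
    # The test checks that the specific exception is raised.
    try:
        mean_firing_rate([1, 2], duration_s=0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for duration_s=0")


if __name__ == "__main__":
    test_mean_firing_rate()
    test_rejects_nonpositive_duration()
    print("all tests passed")
```

With pytest, the second test would more commonly be written with pytest.raises(ValueError). The try/except form is used here only to keep the sketch free of dependencies.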

  • 13

    Mention testing Jupyter notebooks?

    Came across this package to run jupyter notebooks with pytest.

    https://pypi.org/project/nbmake/

    Playing around with it to test someone else's demos:

    https://github.com/Remi-Gau/GLMsingle/blob/fb9f0a9781c8d575abfd7742505593b8006ad876/.github/workflows/run_demos_python.yml#L42

    Maybe worth mentioning.

  • 14

    Discuss documenting alternative analyses

    From a reader:

    If one would like to document exploratory work, say trying different algorithms or data munging approaches before committing to one, how would you suggest keeping track of the whole thing, even if in the end the solution is just a fraction of the original process/code written? Different git branches?

    Discuss git branching, gigantum, others.

  • 15

    Discuss data management solutions

    From a reader:

    One issue that pops up frequently is data "versioning": you go through a series of data manipulations/cleaning steps until you reach a final clean set that you use for manuscript analysis. Sometimes the path from the very raw acquired data to that clean data set is very poorly documented.

    Some solutions:

    • datalad
    • dvc
    • git-annex
    • git-lfs

    Also discuss sharing these versions, e.g. through dryad, figshare, OSF, etc.
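
To make the idea of a data "version" concrete, the sketch below records a SHA-256 checksum for every file in a data directory, so that two snapshots of the data can be compared. It is a bare-bones illustration under stated assumptions, not a substitute for the tools listed above, and the function names sha256_of_file and build_manifest are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


if __name__ == "__main__":
    import json
    import sys

    # Usage: python manifest.py path/to/data > manifest.json
    print(json.dumps(build_manifest(sys.argv[1]), indent=2))
```

Committing the resulting manifest next to the analysis code documents which exact files a result was computed from, even when the data themselves live elsewhere.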