Simplifying Jupyter integration by eliminating heterogenous lists

2nd article in Adding Jupyter Notebook (and Rust) to Gatsby
Software, Philosophy
2019-12-23

Sometimes the best solution is less solution.

In programming, one of the most common tasks is transforming lists from one shape to another. This website boils down to a complicated way to take a list of content (represented by markdown files) and display it.

Heterogenous lists

It is much easier to operate on a list if every item has the same shape. This website started as purely markdown files, and much of the code makes that assumption. After integrating Jupyter there was suddenly another type of content, meaning that a list of posts could contain markdown or jupyter items. Every place a post could possibly appear had to handle both cases, which quickly becomes impractical to maintain.
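To make that cost concrete, here is a minimal sketch in which a consumer of a mixed list has to branch on the item type. The `kind` field and both item shapes are hypothetical and only stand in for this site's real data.

```javascript
// Hypothetical post shapes: the field names are assumptions for illustration.
const posts = [
  { kind: "markdown", frontmatter: { title: "Hello markdown" } },
  { kind: "jupyter", metadata: { title: "Hello notebook" } },
];

// Every consumer of the list has to know about every content type.
function titleOf(post) {
  switch (post.kind) {
    case "markdown":
      return post.frontmatter.title;
    case "jupyter":
      return post.metadata.title;
    default:
      throw new Error(`Unknown post kind: ${post.kind}`);
  }
}

const titles = posts.map(titleOf);
console.log(titles);
```

The same switch would have to be repeated for excerpts, dates, thumbnails, and anything else a page wants to show.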

Homogenous lists

The first strategy attempted was to create a virtual third type of content and pass it around, bypassing graphql queries. Gatsby's createPages first iterated over all markdown files and created a data structure for each, then iterated over all jupyter notebooks and created similarly shaped data. As it turns out, gatsby performs significant work during the query, and bypassing it broke thumbnails.
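The idea behind that first strategy can be sketched with plain data: convert each source type into one common shape, then work with a single homogeneous list. The input fields (`frontmatter`, `metadata`, `html`) and the helper names are assumptions, not the code this site actually ran.

```javascript
// Hypothetical inputs: the field names are assumptions, not real query results.
const markdownFiles = [
  { frontmatter: { title: "A post", date: "2019-12-01" }, html: "<p>hi</p>" },
];
const notebooks = [
  { metadata: { title: "A notebook", date: "2019-12-20" }, html: "<pre>1 + 1</pre>" },
];

// Convert each source type into one common shape...
const fromMarkdown = (md) => ({
  title: md.frontmatter.title,
  date: md.frontmatter.date,
  html: md.html,
});
const fromNotebook = (nb) => ({
  title: nb.metadata.title,
  date: nb.metadata.date,
  html: nb.html,
});

// ...so downstream code sees a single list, here sorted newest first.
const allPosts = [
  ...markdownFiles.map(fromMarkdown),
  ...notebooks.map(fromNotebook),
].sort((a, b) => b.date.localeCompare(a.date));

console.log(allPosts.map((post) => post.title));
```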

The second strategy attempted was to create a new node type based on the common denominator. In gatsby's onCreateNode, markdown and jupyter nodes were converted to a CommonNode, which would be the new post type. This also broke thumbnails, for as yet unknown reasons.
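A rough sketch of what such an onCreateNode hook might look like follows. `CommonNode` is the name from the text and `JupyterNotebook` is the type name visible in the node dump later in this post. The copied fields (`sourceType`, `title`) and where the title is read from are assumptions, and this is not the code that was actually tried.

```javascript
// Sketch of an onCreateNode hook. In gatsby-node.js it would be exposed as
// `exports.onCreateNode = onCreateNode`.
const SOURCE_TYPES = ["MarkdownRemark", "JupyterNotebook"];

function onCreateNode({ node, actions, createNodeId, createContentDigest }) {
  if (!SOURCE_TYPES.includes(node.internal.type)) return;

  const fields = {
    sourceType: node.internal.type,
    // Hypothetical: where each source keeps its title is an assumption.
    title: (node.frontmatter || node.metadata || {}).title || null,
  };

  const commonNode = {
    ...fields,
    id: createNodeId(`${node.id} >>> CommonNode`),
    parent: node.id,
    children: [],
    internal: {
      type: "CommonNode",
      contentDigest: createContentDigest(fields),
    },
  };

  actions.createNode(commonNode);
  actions.createParentChildLink({ parent: node, child: commonNode });
}
```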

After sinking significant time into these without getting very far, a third strategy is being pursued: leaving the markdown posts as they are and embedding jupyter into them. This follows the beaten path, sticking to what many other users already do. Few people appear to have multiple post types, while many customize markdown rendering.

Beaten Path - Markdown plugins

Customizing markdown transformation has first-party documentation, with the added benefit of keeping the surface area of gatsby that the site depends on narrow (useful for transitioning to some more stable publishing platform in the future while retaining content).

https://www.danielworsnup.com/blog/how-to-build-a-simple-markdown-plugin-for-your-gatsby-site/

https://github.com/zestedesavoir/zmarkdown/tree/master/packages/remark-custom-blocks

The goal is to be able to use syntax similar to: {% jupyter ./rust-notebook.ipynb %}
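A tag like this can be picked out of a text node with a regular expression. The sketch below is one possible pattern. The exact grammar (optional whitespace inside the braces, no spaces in the path) and the function name are assumptions.

```javascript
// Matches tags such as {% jupyter ./rust-notebook.ipynb %} and captures the path.
// The whitespace rules and the "no spaces in paths" restriction are assumptions.
const JUPYTER_TAG = /\{%\s*jupyter\s+(\S+)\s*%\}/g;

// Returns the relative path from every jupyter tag found in `text`.
function findJupyterTags(text) {
  return Array.from(text.matchAll(JUPYTER_TAG), (match) => match[1]);
}

const foundTags = findJupyterTags("Intro {% jupyter ./rust-notebook.ipynb %} outro");
console.log(foundTags);
```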

Hot reload bug

Error while running GraphQL query: The "path" argument must be of type string. Received type undefined

Please gatsby, why can I not do anything without something else breaking! There is a tracking bug for this.

Thumbnails in frontmatter are breaking with an error querying thumbnail relativepath. Moving back minor versions until one works...

gatsby version   works
2.18.17          no
2.17.17          no
2.16.5           error

There were some notes in that bug, attempting to follow along...

Forking gatsby for local development, following this guide:

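# Clone and build gatsby, then watch the core package for changes
cd ~/projects
git clone https://github.com/gatsbyjs/gatsby
cd gatsby
yarn && yarn bootstrap
cd packages/gatsby
yarn watch

# tweak graphql-runner according to intuition

# In a second terminal (yarn watch keeps running), from the website repo:
cd site
rm -r node_modules
yarn
yarn global add gatsby-dev-cli
# Point gatsby-dev at the fork and copy the locally built gatsby package into the site
~/.yarn/bin/gatsby-dev --set-path-to-repo ~/projects/gatsby
~/.yarn/bin/gatsby-dev --packages gatsby

yarn develop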

The start of graphql-runner now looks like this:

const stackTrace = require(`stack-trace`)

const GraphQLRunner = require(`../query/graphql-runner`)
const errorParser = require(`../query/error-parser`).default

const { emitter } = require(`../redux`)

module.exports = (store, reporter) => {
  let runner = new GraphQLRunner(store)
  // Replace the runner with a fresh one whenever one of these events fires
  ;[
    `DELETE_CACHE`,
    `CREATE_NODE`,
    `DELETE_NODE`,
    `DELETE_NODES`,
    `SET_SCHEMA_COMPOSER`,
    `SET_SCHEMA`,
    `ADD_FIELD_TO_NODE`,
    `ADD_CHILD_NODE_TO_PARENT_NODE`,
  ].forEach(eventType => {
    emitter.on(eventType, event => {
      runner = new GraphQLRunner(store)
    })
  })
  return (query, context) =>
    // ...

And lo and behold, it works. The progress meter is a bit broken, and it takes longer than I'd prefer, but now gatsby no longer crashes every time I modify a markdown file! PR created.
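Stripped of gatsby specifics, the pattern in that snippet is "rebuild on change": keep the runner in a reassignable variable and replace it whenever a relevant event fires. The sketch below illustrates it with stand-ins. `createEmitter`, `Runner`, and `createRunner` are hypothetical and are not gatsby's emitter or its GraphQLRunner.

```javascript
// Minimal stand-in for an event emitter; gatsby's real emitter is not used here.
function createEmitter() {
  const handlers = {};
  return {
    on(type, handler) {
      (handlers[type] = handlers[type] || []).push(handler);
    },
    emit(type, event) {
      (handlers[type] || []).forEach((handler) => handler(event));
    },
  };
}

// Hypothetical runner that snapshots state when it is constructed.
class Runner {
  constructor(store) {
    this.snapshot = store.version;
  }
  run() {
    return `ran against version ${this.snapshot}`;
  }
}

function createRunner(store, emitter, eventTypes) {
  let runner = new Runner(store);
  eventTypes.forEach((eventType) => {
    // Throw the old runner away whenever relevant state changes.
    emitter.on(eventType, () => {
      runner = new Runner(store);
    });
  });
  // The closure reads `runner` at call time, so it always uses the newest one.
  return () => runner.run();
}

const store = { version: 1 };
const emitter = createEmitter();
const run = createRunner(store, emitter, ["CREATE_NODE", "SET_SCHEMA"]);

const before = run();
store.version = 2;
const stale = run(); // no event fired yet, so the old snapshot is still used
emitter.emit("CREATE_NODE", {});
const after = run();
console.log(before, "|", stale, "|", after);
```

Rebuilding on every event is simple but not free, which fits the observation that the patched build takes longer.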

Loading file

The code is able to identify the tag; now to locate the file.

Borrowing from gatsby-remark-images, the rough process is:

  1. Visit all text nodes and match the relative filename
  2. Look up the parent file node of this node and get its absolute path
  3. Construct the absolute path of the jupyter file
  4. Look up the jupyter node (via gatsby-transformer-ipynb) from the files array
  5. Modify the markdown node to include the rendered html output
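The first three steps might be sketched as below. The arguments `markdownAST`, `markdownNode`, and `getNode` are the ones gatsby passes to remark plugins, and `dir` is a field on File nodes. The tag pattern, the hand-rolled `visitText` walker (used instead of a library visitor to keep the sketch self-contained), and the sample data are assumptions.

```javascript
const path = require("path");

// Assumed tag syntax: {% jupyter ./relative/path.ipynb %}
const JUPYTER_TAG = /\{%\s*jupyter\s+(\S+)\s*%\}/;

// Depth-first walk over a markdown AST, calling `visitor` on each text node.
function visitText(node, visitor) {
  if (node.type === "text") visitor(node);
  (node.children || []).forEach((child) => visitText(child, visitor));
}

// Steps 1-3: find tagged text nodes and resolve each notebook's absolute path.
function findNotebookPaths({ markdownAST, markdownNode, getNode }) {
  const found = [];
  const parentNode = getNode(markdownNode.parent); // File node of the .md file
  if (!parentNode || !parentNode.dir) return found;

  visitText(markdownAST, (textNode) => {
    const match = JUPYTER_TAG.exec(textNode.value);
    if (!match) return;
    found.push({
      textNode,
      absolutePath: path.join(parentNode.dir, match[1]),
    });
  });
  return found;
}

// Hypothetical sample data standing in for what gatsby would provide.
const sampleNodes = {
  "file-1": { id: "file-1", dir: "/site/content/blog/03-rusty-jupyter" },
};
const sampleAST = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [{ type: "text", value: "{% jupyter ./rust-notebook.ipynb %}" }],
    },
  ],
};
const results = findNotebookPaths({
  markdownAST: sampleAST,
  markdownNode: { parent: "file-1" },
  getNode: (id) => sampleNodes[id],
});
console.log(results.map((result) => result.absolutePath));
```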

After copy-pasting some code and bodging around, let's see what happens...

It has found the file node, and oh boy~

Result of find:  { id: '2768ffe7-33e1-5a11-84bf-1958aff0f540',
  children:
   [ '2768ffe7-33e1-5a11-84bf-1958aff0f540 >>> JupyterNotebook' ],
  parent: null,
  internal:
   { contentDigest: '725590375e70b810fa67ce16aef62011',
     type: 'File',
     mediaType: 'application/octet-stream',
     description: 'File "content/blog/03-rusty-jupyter/rust-notebook.ipynb"',
     counter: 95,
     owner: 'gatsby-source-filesystem' },
  sourceInstanceName: 'content',
  absolutePath:
   '/home/username/projects/specific.solutions.limited/site/content/blog/03-rusty-jupyter/rust-notebook.ipynb',
   ...

By iterating over children we can find the jupyter node, which has this convenient html property, and voilà:

[2]
1 + 1
2
[5]
println!("Hello {}!", "world");
Hello world!
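The child lookup described above might look roughly like this sketch. It relies on the node shape from the dump (a File node whose `children` are ids, with a `JupyterNotebook` child) and on the `html` property mentioned in the text. The function name `embedNotebook` and the sample data are assumptions.

```javascript
// Look up the File node, find its JupyterNotebook child, and embed its html.
function embedNotebook({ textNode, absolutePath, files, getNode }) {
  const fileNode = files.find((file) => file.absolutePath === absolutePath);
  if (!fileNode) return false;

  // File nodes list their children by id, so each one is resolved via getNode.
  const notebook = fileNode.children
    .map((childId) => getNode(childId))
    .find((child) => child && child.internal.type === "JupyterNotebook");
  if (!notebook || !notebook.html) return false;

  // Turn the text node into a raw html node in place.
  textNode.type = "html";
  textNode.value = notebook.html;
  return true;
}

// Hypothetical sample data shaped like the node dump above.
const notebookNode = {
  id: "file-nb >>> JupyterNotebook",
  internal: { type: "JupyterNotebook" },
  html: "<pre>1 + 1</pre>",
};
const sampleFiles = [
  {
    id: "file-nb",
    absolutePath: "/site/content/rust-notebook.ipynb",
    children: [notebookNode.id],
  },
];
const sampleTextNode = { type: "text", value: "{% jupyter ./rust-notebook.ipynb %}" };

const embedded = embedNotebook({
  textNode: sampleTextNode,
  absolutePath: "/site/content/rust-notebook.ipynb",
  files: sampleFiles,
  getNode: (id) => (id === notebookNode.id ? notebookNode : undefined),
});
console.log(embedded, sampleTextNode);
```

Returning `false` when the file or notebook is missing leaves the original tag text untouched, which makes a broken path easy to spot on the rendered page.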
