Jupyter Notebooks to markdown and html with Pandoc
For several months now, the universal document converter pandoc has
had support for Jupyter Notebooks. This means that with a single call,
you can convert .ipynb files to any of the output formats that Pandoc
supports (and vice-versa!). This post is a quick exploration of what this
looks like.
Note that for this post, we’re using Pandoc version 2.7.3. Some of what’s below is hard to interpret without actually opening the files that Pandoc creates. For the sake of this blog post, I’m going to stick with the raw text output here, though you can expand the outputs if you wish. I recommend copying and pasting some of these commands if you’d like to try them on your own.
from subprocess import run as sbrun
from subprocess import PIPE, CalledProcessError
from pathlib import Path
from IPython.display import HTML, Markdown

# A helper function to capture errors and outputs
def run(cmd, *args, **kwargs):
    try:
        out = sbrun(cmd.split(), *args, stderr=PIPE, stdout=PIPE, check=True, **kwargs)
        out = out.stdout.decode()
        if len(out) > 1:
            print(out)
    except CalledProcessError as e:
        print(e.stderr.decode())
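If you would rather get the converted text back as a string (for example, to hand it to IPython.display.Markdown) than print it, a variant of the helper might look like the sketch below. The name run_capture is hypothetical and not part of the original post. It uses shlex.split rather than str.split so that quoted arguments containing spaces survive, and it assumes a POSIX-style shell syntax for the command string.

```python
import shlex
import subprocess
import sys


def run_capture(cmd):
    """Run a command line and return its stdout as text.

    Hypothetical variant of the helper above: it returns the output
    instead of printing it. If the command exits with an error, the
    text written to stderr is returned instead.
    """
    try:
        result = subprocess.run(
            shlex.split(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,
        )
    except subprocess.CalledProcessError as err:
        return err.stderr.decode()
    return result.stdout.decode()


# Demo that does not need pandoc: ask the current Python interpreter to print.
demo = run_capture(f"{shlex.quote(sys.executable)} -c 'print(42)'")
print(demo)
```

Returning stderr on failure keeps the sketch short, but it does mean the caller cannot tell success from failure by the return value alone; re-raising the exception would be the stricter choice.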
Our base notebook
First off, let’s take a look at our base notebook. We’ll convert this document to both Markdown and HTML using Pandoc.
The notebook will be fairly minimal in order to make it easier to inspect its contents. It has a collection of markdown with mixed content, as well as code cells with various outputs.
See this link for the notebook we’ll use.
.ipynb to markdown
Let’s try converting this notebook to markdown. This should preserve as much information as possible about the input Jupyter notebook, including all markdown cells, cell metadata, and the outputs of code cells.
A few pandoc options
Here are a few pandoc options that are relevant to our use-case:
--resource-path defines the path where Pandoc will look for resources that are linked in the notebook. This allows us to discover images etc. that are in a different folder from where we are invoking pandoc.
--extract-media is a path where images and other media will be extracted at conversion time. Any links to images etc. should point to files at this path in the output format.
-s (or --standalone) tells Pandoc that the output should be a “standalone” format. This does different things depending on the output, such as adding a header if converting to HTML.
-o is the output file, and implicitly the output file type (e.g., markdown).
-t is the type of output file if we want to override the default (e.g., GitHub-flavored markdown vs. Pandoc markdown).
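To make the role of each option concrete, here is a small sketch that assembles a pandoc argument list from them. The function name build_pandoc_cmd and its parameter names are made up for illustration; only the pandoc flags themselves come from the list above. Building a list (rather than one long string) also avoids having to split the command on spaces later.

```python
def build_pandoc_cmd(source, resource_path=None, extract_media=None,
                     standalone=True, to=None, output=None):
    """Assemble a pandoc argument list from the options discussed above.

    Hypothetical helper: every parameter simply maps to one pandoc flag.
    """
    cmd = ["pandoc", str(source)]
    if resource_path:
        cmd.append(f"--resource-path={resource_path}")
    if extract_media:
        cmd.append(f"--extract-media={extract_media}")
    if standalone:
        cmd.append("-s")
    if to:
        cmd += ["-t", to]       # explicit output format
    if output:
        cmd += ["-o", str(output)]  # output file; format inferred from extension
    return cmd


cmd = build_pandoc_cmd(
    "pandoc_ipynb/inputs/notebooks.ipynb",
    resource_path="inputs",
    extract_media="outputs/images",
    to="gfm",
)
print(" ".join(cmd))
```

The resulting list can be passed directly to subprocess.run.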
Converting to GitHub-flavored markdown
Let’s start by converting to GitHub-flavored markdown. By not specifying an output file with -o, we’ll cause Pandoc to print the result to the screen, which we’ll display here.
# ipynb -> gfmd
run(f'pandoc pandoc_ipynb/inputs/notebooks.ipynb --resource-path=inputs -s --extract-media=outputs/images -t gfm')
<div class="cell markdown">
# Here's a demo notebook
This is a demo notebook to play around with the pandoc ipynb support
## Markdown
As it is markdown, you can embed images, HTML, etc into your posts\!
![](outputs/images/ca17e56d65946db885db7f8f50a9605a6a94e6a7.jpg)
Here's one \(inline_{math}\) and
\[
math^{blocks}
\]
``` python
def my_functino():
    mystring = "you can also include python cells"
    return mystring
```
</div>
<div class="cell markdown" data-tags="[&quot;heresatag&quot;]">
# Code cells
## Matplotlib output with metadata
The below code cell has some metadata attached to it. It also outputs a
figure. Both should be included in the output format.
</div>
<div class="cell code" data-execution_count="7" data-slideshow="{&quot;slide_type&quot;:&quot;subslide&quot;}" data-tags="[&quot;mytag&quot;,&quot;parameters&quot;]">
``` python
from matplotlib import rcParams, cycler
import matplotlib.pyplot as plt
import numpy as np
plt.ion()
data = np.random.rand(2, 1000) * 100
fig, ax = plt.subplots()
ax.scatter(*data, s=data[1], c=data[0])
```
<div class="output execute_result" data-execution_count="7">
<matplotlib.collections.PathCollection at 0x7f6e8d6269e8>
</div>
<div class="output display_data">
![](outputs/images/e843a737607d119ec5b2750a2bb737c915f1b6e8.png)
</div>
</div>
<div class="cell markdown">
## DataFrames
</div>
<div class="cell code" data-execution_count="8">
``` python
import pandas as pd
pd.DataFrame([['hi', 'there'], ['this', 'is'], ['a', 'DataFrame']], columns=['Word A', 'Word B'])
```
<div class="output execute_result" data-execution_count="8">
```
Word A Word B
0 hi there
1 this is
2 a DataFrame
```
</div>
</div>
<div class="cell markdown">
# Bibliography
Let's test the bibliography here
Testing this \[bibliography @holdgraf\_rapid\_2016\]
@holdgraf\_evidence\_2014
</div>
<div class="cell markdown">
### The actual bibliography
The bibliography will be placed at the end of the file
</div>
Note that cells are divided by hard-coded <div>s, and cell-level metadata (such as tags) is encoded within the HTML (e.g. data-tags). Also note that we haven’t gotten the bibliography to render, probably because we didn’t enable the citeproc processor on pandoc (we’ll try that later).
Finally, note that there’s no notebook-level metadata in this output because GFM doesn’t support a YAML header.
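Because the cell metadata ends up in ordinary HTML attributes, it can be read back with the standard library. The sketch below collects the cell type and tags from each cell div. The class name CellTagCollector and the sample string are hand-written for illustration, and the sample assumes that the quotes inside data-tags are escaped as &quot; entities in the raw file.

```python
import json
from html.parser import HTMLParser


class CellTagCollector(HTMLParser):
    """Collect (cell type, tags) for every <div class="cell ..."> start tag."""

    def __init__(self):
        super().__init__()
        self.cells = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div" and "cell" in classes:
            # The second class is the cell type ("markdown" or "code").
            cell_type = next((c for c in classes if c != "cell"), None)
            # data-tags holds a JSON list; a missing attribute means no tags.
            tags = json.loads(attrs.get("data-tags") or "[]")
            self.cells.append((cell_type, tags))


# Hand-written sample shaped like the output above (an assumption, not
# a copy of a real file).
sample = (
    '<div class="cell markdown" data-tags="[&quot;heresatag&quot;]">\n'
    "# Code cells\n"
    "</div>\n"
    '<div class="cell code" data-execution_count="7" '
    'data-tags="[&quot;mytag&quot;,&quot;parameters&quot;]">\n'
    "</div>\n"
)

collector = CellTagCollector()
collector.feed(sample)
print(collector.cells)
```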
To pandoc-flavored markdown
# ipynb -> pandoc md
run(f'pandoc pandoc_ipynb/inputs/notebooks.ipynb --resource-path=inputs -s --extract-media=outputs/images -t markdown')
Now we’ve got something a little bit cleaner without all the hard-coded HTML. The ::: fences are how Pandoc-flavored markdown denotes different divs, and cell-level metadata is encoded similarly to how it was in GFM.
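As a rough illustration of how the ::: fences carry the same information as the <div> classes, the sketch below pulls the class names out of each opening fence. The function name fenced_div_classes and the sample text are hand-written assumptions; the exact attributes Pandoc emits for a given notebook may differ.

```python
import re

# Matches an opening fence such as `::: {.cell .code execution_count="7"}`
# and captures the attribute text between the outermost braces.
FENCE_OPEN = re.compile(r"^:{3,}\s*\{(?P<attrs>.*)\}\s*$")


def fenced_div_classes(markdown_text):
    """Return the list of class names for each opening fenced div."""
    found = []
    for line in markdown_text.splitlines():
        match = FENCE_OPEN.match(line)
        if match:
            # Classes are written as .name and separated by whitespace.
            classes = re.findall(r"(?<!\S)\.([\w-]+)", match.group("attrs"))
            found.append(classes)
    return found


# Hand-written sample in the style of Pandoc fenced divs (an assumption).
sample = "\n".join([
    "::: {.cell .markdown}",
    "# Code cells",
    ":::",
    "",
    '::: {.cell .code execution_count="7"}',
    "``` python",
    "print('hi')",
    "```",
    ":::",
])

print(fenced_div_classes(sample))
```

Closing fences are bare ::: lines with no braces, so they are skipped.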
.ipynb to HTML
Next let’s try converting .ipynb to HTML. This should let us view the notebook as a web page as well as include all of the extra metadata inside the HTML elements. We’ll start with a vanilla HTML conversion. Note that we don’t need to specify an output format at all: when no output file or -t option is given, Pandoc prints HTML to the screen. (When writing to a file with -o, Pandoc infers the output type from the file extension, e.g. .html.)
# ipynb -> HTML
run(f'pandoc pandoc_ipynb/inputs/notebooks.ipynb --resource-path=inputs -s --extract-media=outputs/images')
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>notebooks</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
</style>
<style>
code.sourceCode > span { display: inline-block; line-height: 1.25; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode { white-space: pre; position: relative; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
code.sourceCode { white-space: pre-wrap; }
code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<div class="cell markdown">
<h1 id="heres-a-demo-notebook">Here's a demo notebook</h1>
<p>This is a demo notebook to play around with the pandoc ipynb support</p>
<h2 id="markdown">Markdown</h2>
<p>As it is markdown, you can embed images, HTML, etc into your posts!</p>
<p><img src="outputs/images/ca17e56d65946db885db7f8f50a9605a6a94e6a7.jpg" /></p>
<p>Here's one <span class="math inline"><em>i</em><em>n</em><em>l</em><em>i</em><em>n</em><em>e</em><sub><em>m</em><em>a</em><em>t</em><em>h</em></sub></span> and</p>
<p><br /><span class="math display"><em>m</em><em>a</em><em>t</em><em>h</em><sup><em>b</em><em>l</em><em>o</em><em>c</em><em>k</em><em>s</em></sup></span><br /></p>
<div class="sourceCode" id="cb1"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1"></a><span class="kw">def</span> my_functino():</span>
<span id="cb1-2"><a href="#cb1-2"></a> mystring <span class="op">=</span> <span class="st">"you can also include python cells"</span></span>
<span id="cb1-3"><a href="#cb1-3"></a> <span class="cf">return</span> mystring</span></code></pre></div>
</div>
<div class="cell markdown" data-tags="[&quot;heresatag&quot;]">
<h1 id="code-cells">Code cells</h1>
<h2 id="matplotlib-output-with-metadata">Matplotlib output with metadata</h2>
<p>The below code cell has some metadata attached to it. It also outputs a figure. Both should be included in the output format.</p>
</div>
<div class="cell code" data-execution_count="7" data-slideshow="{&quot;slide_type&quot;:&quot;subslide&quot;}" data-tags="[&quot;mytag&quot;,&quot;parameters&quot;]">
<div class="sourceCode" id="cb2"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1"></a><span class="im">from</span> matplotlib <span class="im">import</span> rcParams, cycler</span>
<span id="cb2-2"><a href="#cb2-2"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt</span>
<span id="cb2-3"><a href="#cb2-3"></a><span class="im">import</span> numpy <span class="im">as</span> np</span>
<span id="cb2-4"><a href="#cb2-4"></a>plt.ion()</span>
<span id="cb2-5"><a href="#cb2-5"></a></span>
<span id="cb2-6"><a href="#cb2-6"></a>data <span class="op">=</span> np.random.rand(<span class="dv">2</span>, <span class="dv">1000</span>) <span class="op">*</span> <span class="dv">100</span></span>
<span id="cb2-7"><a href="#cb2-7"></a>fig, ax <span class="op">=</span> plt.subplots()</span>
<span id="cb2-8"><a href="#cb2-8"></a>ax.scatter(<span class="op">*</span>data, s<span class="op">=</span>data[<span class="dv">1</span>], c<span class="op">=</span>data[<span class="dv">0</span>])</span></code></pre></div>
<div class="output execute_result" data-execution_count="7">
<pre><code>&lt;matplotlib.collections.PathCollection at 0x7f6e8d6269e8&gt;</code></pre>
</div>
<div class="output display_data">
<p><img src="outputs/images/e843a737607d119ec5b2750a2bb737c915f1b6e8.png" /></p>
</div>
</div>
<div class="cell markdown">
<h2 id="dataframes">DataFrames</h2>
</div>
<div class="cell code" data-execution_count="8">
<div class="sourceCode" id="cb4"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1"></a><span class="im">import</span> pandas <span class="im">as</span> pd</span>
<span id="cb4-2"><a href="#cb4-2"></a>pd.DataFrame([[<span class="st">'hi'</span>, <span class="st">'there'</span>], [<span class="st">'this'</span>, <span class="st">'is'</span>], [<span class="st">'a'</span>, <span class="st">'DataFrame'</span>]], columns<span class="op">=</span>[<span class="st">'Word A'</span>, <span class="st">'Word B'</span>])</span></code></pre></div>
<div class="output execute_result" data-execution_count="8">
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Word A</th>
<th>Word B</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>hi</td>
<td>there</td>
</tr>
<tr>
<th>1</th>
<td>this</td>
<td>is</td>
</tr>
<tr>
<th>2</th>
<td>a</td>
<td>DataFrame</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="cell markdown">
<h1 id="bibliography">Bibliography</h1>
<p>Let's test the bibliography here</p>
<p>Testing this [bibliography @holdgraf_rapid_2016]</p>
<p>@holdgraf_evidence_2014</p>
</div>
<div class="cell markdown">
<h3 id="the-actual-bibliography">The actual bibliography</h3>
<p>The bibliography will be placed at the end of the file</p>
</div>
</body>
</html>
This time our math rendered properly, along with everything else except for the bibliography. Let’s get that working now.
We’ve included a bibliography with our input file. With this (and using the citeproc citation style), we can use pandoc-citeproc to automatically render a bibliography within each page. To do so, we’ve used the following extra options:
--bibliography specifies the path to a BibTeX file.
-f ipynb+citations tells Pandoc that our input format has citations in it. Before, the ipynb was inferred from the input extension. Now we’ve made it explicit as well.
# ipynb -> HTML with citations
run(f'pandoc pandoc_ipynb/inputs/notebooks.ipynb -f ipynb+citations --bibliography pandoc_ipynb/inputs/references.bib --resource-path=inputs -s --extract-media=outputs/images')
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>notebooks</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
</style>
<style>
code.sourceCode > span { display: inline-block; line-height: 1.25; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode { white-space: pre; position: relative; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
code.sourceCode { white-space: pre-wrap; }
code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<div class="cell markdown">
<h1 id="heres-a-demo-notebook">Here's a demo notebook</h1>
<p>This is a demo notebook to play around with the pandoc ipynb support</p>
<h2 id="markdown">Markdown</h2>
<p>As it is markdown, you can embed images, HTML, etc into your posts!</p>
<p><img src="outputs/images/ca17e56d65946db885db7f8f50a9605a6a94e6a7.jpg" /></p>
<p>Here's one <span class="math inline"><em>i</em><em>n</em><em>l</em><em>i</em><em>n</em><em>e</em><sub><em>m</em><em>a</em><em>t</em><em>h</em></sub></span> and</p>
<p><br /><span class="math display"><em>m</em><em>a</em><em>t</em><em>h</em><sup><em>b</em><em>l</em><em>o</em><em>c</em><em>k</em><em>s</em></sup></span><br /></p>
<div class="sourceCode" id="cb1"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1"></a><span class="kw">def</span> my_functino():</span>
<span id="cb1-2"><a href="#cb1-2"></a> mystring <span class="op">=</span> <span class="st">"you can also include python cells"</span></span>
<span id="cb1-3"><a href="#cb1-3"></a> <span class="cf">return</span> mystring</span></code></pre></div>
</div>
<div class="cell markdown" data-tags="[&quot;heresatag&quot;]">
<h1 id="code-cells">Code cells</h1>
<h2 id="matplotlib-output-with-metadata">Matplotlib output with metadata</h2>
<p>The below code cell has some metadata attached to it. It also outputs a figure. Both should be included in the output format.</p>
</div>
<div class="cell code" data-execution_count="7" data-slideshow="{&quot;slide_type&quot;:&quot;subslide&quot;}" data-tags="[&quot;mytag&quot;,&quot;parameters&quot;]">
<div class="sourceCode" id="cb2"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1"></a><span class="im">from</span> matplotlib <span class="im">import</span> rcParams, cycler</span>
<span id="cb2-2"><a href="#cb2-2"></a><span class="im">import</span> matplotlib.pyplot <span class="im">as</span> plt</span>
<span id="cb2-3"><a href="#cb2-3"></a><span class="im">import</span> numpy <span class="im">as</span> np</span>
<span id="cb2-4"><a href="#cb2-4"></a>plt.ion()</span>
<span id="cb2-5"><a href="#cb2-5"></a></span>
<span id="cb2-6"><a href="#cb2-6"></a>data <span class="op">=</span> np.random.rand(<span class="dv">2</span>, <span class="dv">1000</span>) <span class="op">*</span> <span class="dv">100</span></span>
<span id="cb2-7"><a href="#cb2-7"></a>fig, ax <span class="op">=</span> plt.subplots()</span>
<span id="cb2-8"><a href="#cb2-8"></a>ax.scatter(<span class="op">*</span>data, s<span class="op">=</span>data[<span class="dv">1</span>], c<span class="op">=</span>data[<span class="dv">0</span>])</span></code></pre></div>
<div class="output execute_result" data-execution_count="7">
<pre><code>&lt;matplotlib.collections.PathCollection at 0x7f6e8d6269e8&gt;</code></pre>
</div>
<div class="output display_data">
<p><img src="outputs/images/e843a737607d119ec5b2750a2bb737c915f1b6e8.png" /></p>
</div>
</div>
<div class="cell markdown">
<h2 id="dataframes">DataFrames</h2>
</div>
<div class="cell code" data-execution_count="8">
<div class="sourceCode" id="cb4"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1"></a><span class="im">import</span> pandas <span class="im">as</span> pd</span>
<span id="cb4-2"><a href="#cb4-2"></a>pd.DataFrame([[<span class="st">'hi'</span>, <span class="st">'there'</span>], [<span class="st">'this'</span>, <span class="st">'is'</span>], [<span class="st">'a'</span>, <span class="st">'DataFrame'</span>]], columns<span class="op">=</span>[<span class="st">'Word A'</span>, <span class="st">'Word B'</span>])</span></code></pre></div>
<div class="output execute_result" data-execution_count="8">
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Word A</th>
<th>Word B</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>hi</td>
<td>there</td>
</tr>
<tr>
<th>1</th>
<td>this</td>
<td>is</td>
</tr>
<tr>
<th>2</th>
<td>a</td>
<td>DataFrame</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="cell markdown">
<h1 id="bibliography">Bibliography</h1>
<p>Let's test the bibliography here</p>
<p>Testing this <span class="citation" data-cites="holdgraf_rapid_2016">(bibliography Holdgraf et al. 2016)</span></p>
<p><span class="citation" data-cites="holdgraf_evidence_2014">Holdgraf et al. (2014)</span></p>
</div>
<div class="cell markdown">
<h3 id="the-actual-bibliography">The actual bibliography</h3>
<p>The bibliography will be placed at the end of the file</p>
</div>
<div id="refs" class="references" role="doc-bibliography">
<div id="ref-holdgraf_evidence_2014">
<p>Holdgraf, Christopher Ramsay, Wendy de Heer, Brian N. Pasley, and Robert T. Knight. 2014. “Evidence for Predictive Coding in Human Auditory Cortex.” In <em>International Conference on Cognitive Neuroscience</em>. Brisbane, Australia, Australia: Frontiers in Neuroscience.</p>
</div>
<div id="ref-holdgraf_rapid_2016">
<p>Holdgraf, Christopher Ramsay, Wendy de Heer, Brian N. Pasley, Jochem W. Rieger, Nathan Crone, Jack J. Lin, Robert T. Knight, and Frédéric E. Theunissen. 2016. “Rapid Tuning Shifts in Human Auditory Cortex Enhance Speech Intelligibility.” <em>Nature Communications</em> 7 (May): 13654. <a href="https://doi.org/10.1038/ncomms13654">https://doi.org/10.1038/ncomms13654</a>.</p>
</div>
</div>
</body>
</html>
Now we’ve got citations at the bottom of the page, and in-line references interspersed in the text. Pretty cool!
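One way to sanity-check a conversion like this is to confirm that every key cited in the text has a matching entry in the reference list. The sketch below does that with the standard library, relying on the data-cites attributes and ref- prefixed ids visible in the output above. The class name CitationChecker is made up for illustration, and the sample is a trimmed, hand-edited excerpt of that output.

```python
from html.parser import HTMLParser


class CitationChecker(HTMLParser):
    """Record cited keys and the keys that have a reference-list entry."""

    def __init__(self):
        super().__init__()
        self.cited = set()
        self.listed = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "citation" in classes:
            # data-cites may hold several space-separated keys.
            self.cited.update((attrs.get("data-cites") or "").split())
        elif tag == "div" and (attrs.get("id") or "").startswith("ref-"):
            self.listed.add(attrs["id"][len("ref-"):])


# Trimmed excerpt of the HTML shown above.
sample = """
<p>Testing this <span class="citation" data-cites="holdgraf_rapid_2016">(bibliography Holdgraf et al. 2016)</span></p>
<p><span class="citation" data-cites="holdgraf_evidence_2014">Holdgraf et al. (2014)</span></p>
<div id="refs" class="references" role="doc-bibliography">
<div id="ref-holdgraf_evidence_2014"></div>
<div id="ref-holdgraf_rapid_2016"></div>
</div>
"""

checker = CitationChecker()
checker.feed(sample)
missing = checker.cited - checker.listed
print("cited keys without a reference entry:", sorted(missing))
```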
Wrapping up
It seems like we can get pretty far with converting .ipynb files into
various flavors of markdown or HTML. My guess is that things will get a bit
trickier if we try to do this with more complex cell outputs or metadata,
but it’s a good start. Using Pandoc also means that it would be relatively
straightforward to convert notebooks into LaTeX, PDF, or even Microsoft Word
format. I’ll try to dig into this more in the future.
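As a starting point for that, the sketch below builds one pandoc command per target format and relies on Pandoc inferring the output type from the file extension given to -o. The function name commands_for_formats and the outputs folder are hypothetical, nothing here has been run against the demo notebook, and PDF output additionally needs a LaTeX engine installed (by default).

```python
from pathlib import Path


def commands_for_formats(notebook, out_dir, extensions=(".tex", ".docx", ".pdf")):
    """Return one pandoc argument list per output extension.

    Hypothetical helper: the output format is not passed explicitly,
    it is left for Pandoc to infer from each output file's extension.
    """
    notebook = Path(notebook)
    out_dir = Path(out_dir)
    return [
        ["pandoc", str(notebook), "-s", "-o", str(out_dir / (notebook.stem + ext))]
        for ext in extensions
    ]


cmds = commands_for_formats("pandoc_ipynb/inputs/notebooks.ipynb", "outputs")
for cmd in cmds:
    print(" ".join(cmd))
```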