Brains - Data Science - Open Science

Five things I learned at SciPy 2016


I’ve finally decompressed after my first go-around with SciPy. For those who haven’t heard of this conference before, SciPy is an annual meeting where members of the scientific community get together to discuss their love of Python, scientific programming, and open science. It draws both academics and people from industry, which makes it a unique place to see how software interfaces with scientific research. (if you’re interested the full set of.. Read More

The beauty of computational efficiency: The Fast Fourier Transform.

Time to compute an FFT as a function of signal length, and color-coded by the number of factors for that length.

When we discuss “computational efficiency”, you often hear people throw around phrases like $O(n^2)$ or $O(n \log n)$. We talk about them in the abstract, and it can be hard to appreciate what these distinctions mean and how important they are. So let’s take a quick look at what computational efficiency looks like in the context of a very famous algorithm: the Fourier Transform. A short primer on the Fourier Transform: Briefly,.. Read More
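To make the $O(n^2)$ versus $O(n \log n)$ distinction a bit more concrete, here is a minimal sketch (my own illustration, not code from the post). It assumes NumPy is available; `naive_dft` is a hypothetical helper that evaluates the discrete Fourier transform directly, and the result is checked against NumPy's `np.fft.fft`.

```python
import numpy as np


def naive_dft(x):
    """Evaluate the discrete Fourier transform directly (hypothetical helper).

    Building and applying the n-by-n matrix of complex exponentials
    costs O(n^2) operations, which is what the FFT avoids.
    """
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    k = np.arange(n)
    # Entry (j, k) is exp(-2*pi*i*j*k / n)
    basis = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return basis @ x


# Arbitrary test signal; the length 256 is just an illustrative choice
rng = np.random.default_rng(0)
signal = rng.standard_normal(256)

dft_result = naive_dft(signal)
fft_result = np.fft.fft(signal)

# Both approaches compute the same transform, only the cost differs
same = np.allclose(dft_result, fft_result)
print(same)
```

Wrapping each call in a timer for a range of signal lengths is one way to watch the two curves diverge as the signal grows.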

Could the Brexit have happened due to random chance?

A simulation of one Brexit vote scenario.

As a scientist, watching the Brexit vote was a little bit painful. Though probably not for the reason you’re thinking. No, it wasn’t the politics that bothered me, but the method for making such an incredibly important decision. Let me explain… Scientists are a bit obsessed with the concept of error. In the context of collecting data and analyzing it, this takes the form of our “confidence” in the results… Read More

Coding tools in Windows for science

In teaching, tutoring, and collaborating with other scientists, I often come across people who want to get better at coding, but have a Windows machine. While Windows does a lot of things well, coding and data analysis in a language like Python is not its strong point. Most notably, here are the main problems that people often run into: The file system is structured differently. Windows machines have their.. Read More

NIH Fellowship Success Rate Analysis


This is a static version of an IPython notebook. The raw notebook can be found here. It is also cross-posted with the Berkeley Science Review blog. NIH Fellowship Success Rates: As I’m entering the final years of graduate school, I’ve been applying for a few typical “pre-doc” fellowships. One of these is the NRSA, which is notorious for requiring you to wade through forests of bureaucratic documents (seriously, their “guidelines”.. Read More

Blogging in WordPress with IPython (Jupyter) notebooks


Note – this is a static version of a Jupyter notebook. You can find the original here. Update – To make this easier, I’ve put together a (very) short Python script to automatically convert a notebook to HTML, append a date tag to it, and store it in an html folder. Simply call: python ./path/to/my/ntbk.ipynb. It also includes a .css file with the proper stylings. You can find it.. Read More

Craigslist data analysis


This post is a static version of an IPython notebook. That notebook, as well as several others, can be found here. In the last post I showed how to use a simple Python bot to scrape data from Craigslist. This is a quick follow-up to take a peek at the data. Note – data that you scrape from Craigslist is pretty limited. They tend to clear out old posts,.. Read More

Querying Craigslist with Python


Note, you can find the nbviewer version of this notebook here. Overview: In this notebook, I’ll show you how to make a simple query on Craigslist using some nifty Python modules. You can take advantage of all the structured data that exists on webpages to collect interesting datasets.
In [1]: import pandas as pd
        %pylab inline
Populating the interactive namespace from numpy and matplotlib
First we need to figure out how.. Read More

Correlation and coherence, what’s the difference?


Note – you can find the nbviewer of this post here. Coherence vs. Correlation – a simple simulation: A big question that I’ve always wrestled with is the difference between correlation and coherence. Intuitively, I think of these two things as very similar to one another. Correlation is a way to determine the extent to which two variables covary (normalized to be between -1 and 1). Coherence is similar, but.. Read More
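Here is a small sketch of one well-known way the two measures can disagree (my own illustration, not the simulation from the notebook). It assumes NumPy and SciPy are available. The sampling rate, the 10 Hz frequency, the noise level, and the quarter-cycle phase shift are all made-up values: both signals carry the same oscillation, but one is shifted by 90 degrees, so the pointwise correlation is close to zero while the magnitude-squared coherence at 10 Hz stays high.

```python
import numpy as np
from scipy import signal

# All of these parameters are illustrative assumptions
fs = 200.0                      # sampling rate in Hz
freq = 10.0                     # shared oscillation frequency in Hz
t = np.arange(0, 20, 1 / fs)    # 20 seconds of data

rng = np.random.default_rng(0)
# Same oscillation in both signals, but y lags x by a quarter cycle
x = np.sin(2 * np.pi * freq * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * freq * t + np.pi / 2) + 0.5 * rng.standard_normal(t.size)

# Pearson correlation compares the signals sample by sample
r = np.corrcoef(x, y)[0, 1]

# Magnitude-squared coherence, estimated with Welch's method per frequency
f, coh = signal.coherence(x, y, fs=fs, nperseg=400)
coh_at_freq = coh[np.argmin(np.abs(f - freq))]

print(f"correlation: {r:.3f}")
print(f"coherence at {freq:g} Hz: {coh_at_freq:.3f}")
```

The takeaway from this toy case is that coherence is computed per frequency and ignores a constant phase lag, while correlation is a single number that a phase lag can wipe out.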

Some useful resources on understanding PCA

I was going to write out a quick post about what Principal Component Analysis is, but then I realized that the internet is already full of these kinds of things. So rather than adding yet another explanation to the mix, here’s a list of useful links if you want to learn more about PCA: A relatively in-depth and less technical explanation that I’ve found useful: A blog post from.. Read More