# PREDICTABLY NOISY

Brains - Data Science - Open Science

## Five things I learned at SciPy 2016

I’ve finally decompressed after my first go-around with SciPy. For those who haven’t heard of this conference before, SciPy is an annual meeting where members of the scientific community get together to discuss their love of Python, scientific programming, and open science. It spans both academics and people from industry, making it a unique place in terms of how software interfaces with scientific research. (if you’re interested the full set of.. Read More

## The beauty of computational efficiency: The Fast Fourier Transform.

When we discuss “computational efficiency”, you often hear people throw around phrases like $O(n^2)$ or $O(n \log n)$. We talk about them in the abstract, and it can be hard to appreciate what these distinctions mean and how important they are. So let’s take a quick look at what computational efficiency looks like in the context of a very famous algorithm: the Fast Fourier Transform. A short primer on the Fourier Transform: Briefly,.. Read More
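To make the $O(n^2)$ versus $O(n \log n)$ distinction concrete, here is a minimal sketch (not the code from the post itself). It assumes NumPy is installed; `naive_dft` is a hypothetical helper name that evaluates the discrete Fourier transform directly, which takes on the order of $n^2$ operations, and the result is checked against `np.fft.fft`, which uses a fast algorithm.

```python
import numpy as np


def naive_dft(x):
    """Direct evaluation of the discrete Fourier transform (hypothetical helper)."""
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    k = np.arange(n)
    # n-by-n matrix of complex exponentials. Building it and multiplying
    # by x both cost on the order of n^2 operations.
    basis = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return basis @ x


# Illustrative input: 256 samples of random noise (an arbitrary choice).
rng = np.random.default_rng(0)
samples = rng.standard_normal(256)

slow_result = naive_dft(samples)
fast_result = np.fft.fft(samples)

# Both approaches compute the same transform; only the cost differs.
results_match = np.allclose(slow_result, fast_result)
```

Timing both functions for increasing input lengths (for example with `timeit`) is one way to see the gap widen as the input grows.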

## Could the Brexit have happened due to random chance?

As a scientist, watching the Brexit vote was a little bit painful. Though probably not for the reason you’re thinking. No, it wasn’t the politics that bothered me, but the method for making such an incredibly important decision. Let me explain… Scientists are a bit obsessed with the concept of error. In the context of collecting data and analyzing it, this takes the form of our “confidence” in the results… Read More

## Coding tools in Windows for science

In teaching, tutoring, and collaborating with other scientists, I often come across people who want to get better at coding, but have a Windows machine. While Windows does a lot of stuff well, coding and data analysis with a language like Python is not its strong point. Here are the main problems that people often run into: The file system is structured differently. Windows machines have their.. Read More

## NIH Fellowship Success Rate Analysis

This is a static version of an IPython notebook. The raw notebook can be found here. It is also cross-posted with the Berkeley Science Review blog. NIH Fellowship Success Rates: As I’m entering the final years of graduate school, I’ve been applying for a few typical “pre-doc” fellowships. One of these is the NRSA, which is notorious for requiring you to wade through forests of bureaucratic documents (seriously, their “guidelines”.. Read More

## Blogging in WordPress with IPython (Jupyter) notebooks

Note – this is a static version of a Jupyter notebook. You can find the original here. Update – To make this easier, I’ve put together a (very) short Python script to automatically convert a notebook to HTML, append a date tag to it, and store it in an html folder. Simply call: `python jupyter_to_html.py ./path/to/my/ntbk.ipynb`. It also includes a .css file with the proper stylings. You can find it.. Read More
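The conversion step described above could look roughly like the sketch below. This is not the actual `jupyter_to_html.py` script. The function names, the ISO date format of the tag, and the default `html` output folder are all assumptions made for illustration. It relies only on the standard library plus the `jupyter nbconvert` command-line tool being installed.

```python
import subprocess
from datetime import date
from pathlib import Path


def dated_stem(notebook_path, day):
    """Build an output base name such as 'ntbk_2016-07-01' (assumed tag format)."""
    return f"{Path(notebook_path).stem}_{day.isoformat()}"


def convert_notebook(notebook_path, out_dir="html", day=None):
    """Convert a notebook to HTML inside out_dir and return the output path.

    Hypothetical helper: the folder name and date tag are illustrative.
    """
    day = day or date.today()
    base_name = dated_stem(notebook_path, day)
    # nbconvert writes <base_name>.html into the requested output directory.
    subprocess.run(
        [
            "jupyter", "nbconvert",
            "--to", "html",
            "--output", base_name,
            "--output-dir", str(out_dir),
            str(notebook_path),
        ],
        check=True,
    )
    return Path(out_dir) / f"{base_name}.html"
```

Calling `convert_notebook("./path/to/my/ntbk.ipynb")` would then leave a dated HTML file in the `html` folder, ready to paste into a post.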

## Craigslist data analysis

This post is a static version of an IPython notebook. That notebook as well as several others can be found here. In the last post I showed how to use a simple Python bot to scrape data from Craigslist. This is a quick follow-up to take a peek at the data. Note – data that you scrape from Craigslist is pretty limited. They tend to clear out old posts,.. Read More

## Querying Craigslist with Python

Note, you can find the nbviewer version of this notebook here. Overview: In this notebook, I’ll show you how to make a simple query on Craigslist using some nifty Python modules. You can take advantage of all the structured data that exists on webpages to collect interesting datasets. The first cell runs `import pandas as pd` and `%pylab inline`, which populates the interactive namespace from numpy and matplotlib. First we need to figure out how.. Read More

## Correlation and coherence, what’s the difference?

Note – you can find the nbviewer version of this post here. Coherence vs. Correlation – a simple simulation: A big question that I’ve always wrestled with is the difference between correlation and coherence. Intuitively, I think of these two things as very similar to one another. Correlation is a way to determine the extent to which two variables covary (normalized to be between -1 and 1). Coherence is similar, but.. Read More
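One way to see a difference between the two measures is with a small simulation of my own (not the one from the post). The sketch below assumes NumPy and SciPy are installed. The sampling rate, the 10 Hz frequency, the noise level, and the quarter-cycle lag are arbitrary illustrative choices. Two noisy sinusoids share a frequency but are offset in phase, so their plain correlation is close to zero while their coherence at that frequency stays high.

```python
import numpy as np
from scipy import signal

fs = 1000.0                      # sampling rate in Hz (illustrative)
t = np.arange(0, 10, 1 / fs)     # 10 seconds of samples
rng = np.random.default_rng(0)

# Two signals with a shared 10 Hz rhythm; y lags x by a quarter cycle.
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + np.pi / 2) + 0.5 * rng.standard_normal(t.size)

# Pearson correlation: a single number between -1 and 1.
r = np.corrcoef(x, y)[0, 1]

# Magnitude-squared coherence: one value between 0 and 1 per frequency.
freqs, coh = signal.coherence(x, y, fs=fs, nperseg=1000)
coh_at_10hz = coh[np.argmin(np.abs(freqs - 10.0))]
```

Here `r` summarizes the whole time series in one value, while `coh` is a spectrum, so plotting `coh` against `freqs` shows where in frequency the two signals are related.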

## Some useful resources on understanding PCA

I was going to write out a quick post about what Principal Components Analysis is, but then I realized that the internet is already full of these kinds of things. So rather than adding yet another explanation to the mix, here’s a list of useful links if you want to learn more about PCA: A relatively in-depth and less technical explanation that I’ve found useful: http://www.sccg.sk/~haladova/principal_components.pdf A blog post from.. Read More