Thursday, October 20, 2016

Stemgraphic, a new visualization tool

PyData Carolinas 2016

At PyData Carolinas 2016 I presented the talk Stemgraphic: A Stem-and-Leaf Plot for the Age of Big Data.


The stem-and-leaf plot is one of the most powerful tools not found in a data scientist’s or statistician’s toolbox. Go back thirty-some years and the exact opposite was true. What happened to the stem-and-leaf plot? Finding the answer led me to design and implement an improved graphical version of the stem-and-leaf plot as a Python package. As a companion to the talk, a printed research paper was provided to the audience (a PDF is now available through

The talk

Thanks to the organizers of PyData Carolinas, videos of all the talks and tutorials have been posted on YouTube. At just 30 minutes, the talk is a great way to learn more about stemgraphic and the history of the stem-and-leaf plot for EDA work. This updated version does include the animated intro sequence, but unfortunately the sound was recorded from the microphone and not the mixer. You can see the intro sequence with higher audio and video quality on the main page of the website below.

I've created a website for stemgraphic, where I'll be posting tutorials and demos of some of the more advanced features. In particular, they will show how stemgraphic can be used in a data science pipeline, as a data wrangling tool, as an intermediary to big data on HDFS, as a visual validation for building models, and as a superior distribution plot, particularly when faced with non-uniform distributions or distributions showing a high degree of skewness (long tails).
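For readers who have never met the classic text version of the plot, here is a minimal sketch of one. It is not stemgraphic's implementation or API: the function name stem_and_leaf is made up for illustration, and it assumes non-negative integers with the last digit used as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a text stem-and-leaf plot.

    Assumes non-negative integers; the last digit is the leaf and the
    remaining digits form the stem (hypothetical helper, not stemgraphic).
    """
    if not values:
        return []
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(int(value), 10)
        rows[stem].append(leaf)
    lines = []
    # Walk every stem in range so gaps in the data stay visible as empty rows.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


for line in stem_and_leaf([12, 15, 21, 27, 27, 34, 50]):
    print(line)
```

Each row keeps the actual digits of the data, which is what separates the plot from a histogram: the shape of the distribution and the individual values are visible at the same time.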

GitHub Repo

Francois Dion

Tuesday, October 11, 2016

PyData Carolinas 2016 Tutorial: Datascience on the web

PyData Carolinas 2016

Don Jennings and I presented a tutorial at PyData Carolinas 2016: Datascience on the web.

The plan was as follows:


Learn to deploy your research as a web application. You have been using Jupyter and Python to do some interesting research, build models, and visualize results. In this tutorial, you’ll learn how to easily go from a notebook to a Flask web application which you can share.


Jupyter is a great notebook environment for Python-based data science and exploratory data analysis. You can share the notebooks via a GitHub repository, as HTML, or even on the web using something like JupyterHub. But how can we turn the work we have done in the notebook into a real web application?
In this tutorial, you will learn how to structure your notebook for web deployment, create a skeleton Flask application, add a model, and add a visualization. We assume you have some experience with Jupyter and Python, but no previous web application experience.
Bring your laptop and you will be able to do all of these hands-on things:
  1. get to the virtual environment
  2. review the Jupyter notebook
  3. refactor for reuse
  4. create a basic Flask application
  5. bring in the model
  6. add the visualization
  7. profit!
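Steps 3 to 5 can be sketched roughly as follows. This is not the tutorial's actual code: predict() is a hypothetical stand-in for the trained model, and the /predict route and the JSON field name "features" are assumptions made for illustration.

```python
def predict(features):
    """Hypothetical stand-in for the notebook's model: the mean of the inputs."""
    values = [float(x) for x in features]
    if not values:
        raise ValueError("at least one feature is required")
    return sum(values) / len(values)


def create_app():
    """Build a skeleton Flask application around the reusable predict() helper."""
    # Imported here so predict() stays usable from a notebook without Flask.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/")
    def index():
        return "Model service is running"

    @app.route("/predict", methods=["POST"])
    def predict_route():
        # Expect a JSON body such as {"features": [1, 2, 3]} (assumed format).
        payload = request.get_json(silent=True)
        features = payload.get("features", []) if isinstance(payload, dict) else []
        try:
            result = predict(features)
        except (TypeError, ValueError) as exc:
            return jsonify(error=str(exc)), 400
        return jsonify(prediction=result)

    return app
```

Calling create_app().run(debug=True) starts Flask's development server. Keeping predict() as a plain function is the point of the "refactor for reuse" step: the same code can be called from the notebook and from the web application.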
Now that it has been presented, the artifacts are a GitHub repo and a YouTube video.

GitHub Repo

After the fact

The unrefactored notebook is here while the refactored one is here.
Once you run through the whole refactored notebook, you will have train and test sets saved in data/ and a trained model in trained_models/. To make these available in the tutorial directory, you will have to run the script. On a Unix-like environment (Mac, Linux, etc.):
chmod a+x
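As a rough illustration of how a trained model can end up in trained_models/ and be read back by a web application, here is a minimal sketch using pickle from the standard library. The serialization format, the file name model.pkl, and both helper names are assumptions; the notebook may well do this differently.

```python
import pickle
from pathlib import Path


def save_model(model, directory="trained_models", name="model.pkl"):
    """Pickle a model object into the given directory and return the file path."""
    path = Path(directory) / name
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(model, handle)
    return path


def load_model(path):
    """Load a previously pickled model (only unpickle files you created yourself)."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)
```

The application would then call load_model() once at startup rather than retraining on every request.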


The whole session is now on YouTube: Francois Dion & Don Jennings Datascience on the web

Francois Dion

Thursday, October 6, 2016

Improving your communications: Professional Audio-Video Production on Linux

Pro AV on Linux

I'll be presenting on Professional Audio-Video Production on Linux next week at TriLUG.

From concept to finished product, it has never been easier to obtain professional results when it comes to audio-video production on Linux.

We will cover some of the hardware that should be part of your production suite, from microphones to jog wheels, and highlight some of the top tools for animation, audio, broadcasting, effects, modeling, music, transcoding and video. We will also go beyond the usual suspects and introduce some tools that might not be typically used for AV production.
By the end of the presentation, you will have all the tools you need to improve the quality of your communications, for your personal enjoyment, your career, or your business.


Thursday, 13 October 2016 - 7:00pm to 9:00pm
The Frontier, 800 Park Offices Drive, Durham, NC
Francois Dion

Wednesday, October 5, 2016

Something For Your Mind, Polymath Podcast episode 2

A is for Anomaly

In this episode, "A is for Anomaly", our first of the alphabetical episodes, we cover financial fraud, the Roman quaestores, outliers, PDFs and EKGs. Bleep... Bleep... Bleep...
"so perhaps this is not the ideal way of keeping track of 15 individuals..."

Something for your mind is available on


Francois Dion
P.S. There is a bit more detail on this podcast as a whole on LinkedIn.

Friday, September 30, 2016

5 music things

5 in 5

I like to cover 5 things in 5 minutes for lightning talks. Or one thing. At the local
Python user group, sometimes questions or other circumstances turn these 5
in 5 more into a 5 in 10-15...

5 Music Things

Eventually, after a year or two, I'll revisit a subject. I recently noticed that I had
not talked about music-related things in almost two and a half years, so I put together
5 quick Jupyter notebooks and presented them. Interestingly enough, none of
these 5 things were covered back then. The GitHub repo includes edited versions
of the notebooks, based on the interactions at the meeting during my presentation.
Requirements: all of the notebooks require the following:
pip install jupyter

1 - Audio

2 - libROSA

Here we will need to pip install matplotlib and numpy, and of course librosa.

3 - music21

pip install music21
You'll need some external programs: LilyPond and MuseScore.
You also need launch scripts for each of them. On a Mac, use the provided
launch scripts in the mac/ folder of this repo. Make sure you chmod a+x them.
Change the path in the notebook to reflect your own user path.

4 - python-sonic

pip install python-sonic
You'll need one external program, Sonic Pi, and you must start it before running
through the notebook.

5 - pyKnon

pip install pyknon
You'll need one external program: timidity

easily installed:

  • in Linux with apt-get install timidity
  • on a Mac with brew install timidity
This was mostly an excuse to demo that external command line tools like timidity
or sox can be used here.
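Driving an external command line tool from Python can look roughly like the sketch below. It is not taken from the notebook: the helper names are made up, and demo.mid in the usage note is a placeholder file name. It only relies on timidity accepting a MIDI file path as its argument.

```python
import shutil
import subprocess


def build_player_command(midi_path, player="timidity"):
    """Return the argument list used to play a MIDI file with an external tool."""
    return [player, str(midi_path)]


def play_midi(midi_path, player="timidity"):
    """Play a MIDI file with the external player; return True on success."""
    # Check that the program is installed before trying to launch it.
    if shutil.which(player) is None:
        print(f"{player} not found on PATH; install it first")
        return False
    completed = subprocess.run(build_player_command(midi_path, player), check=False)
    return completed.returncode == 0
```

After a notebook has written a MIDI file, play_midi("demo.mid") would hand it to timidity. Passing the command as a list avoids shell quoting problems with file names that contain spaces.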

Have fun!
@f_dion - francois(dot)dion(at)gmail(dot)com

P.S.: GitHub repo at: but for some strange reason, GitHub will not render the first (0-StartHere) notebook. This blog post is basically that notebook, putting things in context.

Sunday, September 25, 2016

Something for your mind: Polymath Podcast Episode 001

Two topics will be covered:

Chipmusic, limitations and creativity

NumFOCUS (Open code = better science)

The NumFOCUS interview was recorded at PyData Carolinas 2016. There will be a future episode covering the keynotes, tutorials, talks and lightning talks later this year. This interview was really more about open source and less about PyData.

The episode concludes with Learn more, on Claude Shannon and Harry Nyquist.

Something for your mind is available on


Francois Dion

Sunday, September 18, 2016

Something for your mind: Polymath Podcast launched

Some episodes will have more Art content, some will have more Business content, some will have more Science content, and some will be a nice blend of different things. But for sure, the show will live up to its name and provide you with “something for your mind”. It might raise more questions than it answers, and that is fine too.

Episode 000
Listen to Something for your mind on

Francois Dion