Analyzing Sleep with Sleep Cycle App and R

I have been tracking my sleep for almost two years now using my Fitbit. I started with the Fitbit Ultra and then moved on to the Fitbit One after it came out. In October 2013 I found out about the Sleep Cycle app for the iPhone. For weeks, Sleep Cycle was listed as the best-selling health app in Germany, where it currently (as of January 2014) sits in second place. The program promises to wake you up in the morning without feeling tired. This is indeed possible if the alarm goes off during light sleep rather than deep sleep. After reading all the positive reviews on the App Store I decided to give it a try.

The app promises to wake you up within a window of up to 30 minutes before the alarm you set, if it detects movement in the morning. Even more important to me than the smart alarm itself was the possibility to collect some data while sleeping. In the morning you are presented with a chart of last night's sleep pattern:
Sleep Cycle Screenshot of last night

The app also lets you export the database as a comma-separated file containing the time you went to bed, the time you woke up, sleep quality in %, wake-up mood and user-defined sleep notes. This gives you the opportunity to do some more analysis. I decided to fire up R and create my own charts.

So far I have used the app to track 100 nights of sleep and decided to peek into the data. Let’s take a look at how long I slept each night:
Sleep Duration over time

It looks like the longer I slept the higher the sleep quality is. A scatter plot of the data gives:
Sleep time vs. sleep quality

The chart also takes the sleep notes into consideration. You can clearly see that sleeping away from home results in lower sleep quality. The same applies to exercising (note: I tagged a night with exercising when I worked out late in the evening). In contrast, taking 3 mg of melatonin increased the sleep quality.

Averaging the sleep quality by month shows that January was worse than the previous month. One explanation is a vacation I took, during which I did not sleep well at all.
Average sleep quality by month
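
The monthly averaging itself is simple to sketch. The snippet below uses Python rather than the R used for the charts, assumes the nights are available as (date, quality) pairs, and works on invented sample values rather than on the actual export:

```python
from collections import defaultdict
from datetime import date


def average_quality_by_month(nights):
    """Average sleep quality per (year, month).

    `nights` is an iterable of (date, quality_percent) pairs. This layout is
    an assumption, not the app's actual export format.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for day, quality in nights:
        bucket = totals[(day.year, day.month)]
        bucket[0] += quality
        bucket[1] += 1
    return {month: total / count for month, (total, count) in sorted(totals.items())}


# Invented sample values, for illustration only.
sample = [
    (date(2013, 12, 30), 80),
    (date(2013, 12, 31), 70),
    (date(2014, 1, 1), 60),
]
monthly = average_quality_by_month(sample)
```

Keying the result by (year, month) keeps December 2013 and a later December apart once more than one year of data has been collected.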

The R code for the data wrangling and the charts:

Berlin Marathon 2014 Participants

After the 2013 Berlin Marathon sold out in less than four hours, the organizers decided to alter the registration process for 2014. First there was a pre-registration phase, followed by a random selection from the pool of registrants to receive a spot. Those who were selected had to register by November 11th, 2013. Any spots that were not confirmed by the 11th would be offered to pre-registered candidates according to the order in which they were randomly selected.

The event website publishes a list of all registered participants. Being curious how many runners are on the list, I had two options: 1. go through the entire list and count, or 2. download the entire list and let the computer do the math. I chose the latter. If you have checked the website already, you saw that it presents only part of the list at a time and loads more asynchronously while you scroll down. I wrote a little Python script that queries the data and saves the JSON response to a CSV file.
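
The general shape of such a script can be sketched as follows. This is not the original script: the real endpoint and its JSON layout are not shown here, so `fetch_page` is a hypothetical callable that stands in for the HTTP request, and the sample records are invented.

```python
import csv
import io


def collect_rows(fetch_page, page_size=100):
    """Request pages until an empty page comes back and return all rows.

    `fetch_page(offset, limit)` is a hypothetical callable returning a list
    of dicts; it stands in for the site's JSON endpoint.
    """
    rows = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        rows.extend(page)
        offset += len(page)
    return rows


def write_csv(rows, handle):
    """Write a list of dicts as CSV, using the keys of the first row as header."""
    if not rows:
        return
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Stand-in for the real endpoint: three invented records served two at a time.
_fake_records = [
    {"nation": "GER", "yob": 1970},
    {"nation": "GBR", "yob": 1980},
    {"nation": "DEN", "yob": 1968},
]


def fake_fetch(offset, limit):
    return _fake_records[offset:offset + limit]


buffer = io.StringIO()
all_rows = collect_rows(fake_fetch, page_size=2)
write_csv(all_rows, buffer)
```

Passing the fetch function in as a parameter keeps the paging loop testable without touching the network; for the real site it would wrap an HTTP request and a JSON decode.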

Checking the file we see there are only 16,707 participants so far. Sure, there are spots sold to agencies or given to sponsors, but how many will be handed out in the second wave? Until they announce the results, let’s look at the data:

The distribution of the year of birth shows a bimodal pattern. Most runners were born in the late 60s or early 70s, and there is a second spike around 1980.
Looking at the top 10 participating nations, it is no surprise that Germany stands out by far, followed by Great Britain and Denmark. The only non-European country in the top 10 is the United States of America:

The code for generating the images:

Update: The official registration period is over and 23,286 runners have signed up.

Five years of Weight Tracking

After I moved back from New Jersey in June 2008 I started to track my body weight more seriously. My routine usually consists of getting up, finishing my morning bathroom routine and then stepping on my scale. That way I try to ensure that the conditions for each weighing are as similar as possible. I recorded my weight on paper and eventually put everything into a spreadsheet for further analysis.

In January 2011 I upgraded my bathroom scale to the Withings WiFi Body Scale. That way I could automate the process of tracking my weight by just stepping on it. No more writing on paper and eventually transferring everything into spreadsheets.

The people at Withings provide an API so external services can access your weight data. A nice way to get better charts is through Trendweight. Just link your Withings account to the web site and they will generate nice JavaScript charts of your weight, body fat and lean mass. Another great feature is the export functionality, which gives you a comma-separated version of your data including trend values for total weight as well as fat %, fat mass, and lean mass.

Using the exported data we can fire up R to look at the data:

The generated chart shows my weight loss in 2009, when I started cycling again after years of absence. From my lowest point in the fall of 2009 I gradually gained weight throughout 2010 and 2011 until 2012, when I hit the 100 kg mark around Christmas time:

Body Weight 2008 - 2013

That was enough and I decided it was time to start working out again. So far, I am on a good downward trend. I have to keep up that momentum. My next goal is to get between 80 kg and 85 kg and then maintain my weight. The color of the dots reflects my body fat percentage and there seems to be a strong correlation between body fat and my actual weight:

Body Weight vs. Body Fat Percentage

Doing a simple linear regression gives us an adjusted R-squared of 0.8419.
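
For readers who want to see what goes into that number, here is a minimal sketch of a one-predictor least-squares fit and its adjusted R-squared, written in plain Python instead of R. The sample values are invented and do not reproduce the figure above; which variable is regressed on which is also an assumption.

```python
def adjusted_r_squared(xs, ys):
    """Fit y = a + b*x by least squares and return (a, b, adjusted R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    # One predictor (p = 1): adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
    adjusted = 1 - (1 - r2) * (n - 1) / (n - 2)
    return intercept, slope, adjusted


# Invented values: weight in kg and body fat in %, for illustration only.
weights = [82.0, 86.5, 90.0, 94.0, 99.5]
body_fat = [18.0, 20.5, 21.0, 24.0, 25.5]
intercept, slope, adjusted = adjusted_r_squared(weights, body_fat)
```

The adjustment penalizes the plain R-squared for the number of predictors, so with a single predictor and many observations the two values are nearly identical.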

Downloading Fitbit Data using Google Spreadsheets

One of the most important features in quantified self is the ability to export your data in an open format. Fitbit lets you download your personal data if you subscribe to a premium membership. Alternatively, they provide an API that allows developers to interact with Fitbit data in their own applications, products and services.

In a blog post, Mark Levitt shows how to export your Fitbit data into Google Spreadsheets. I explored the API myself, adding and removing some of the fields to get more insight into the data.

In a future post I will delve into the data in order to understand some of my own physical activity patterns.

Update 14 November 2014: I removed the Active Score since it has been dropped from the Fitbit API.

Getting the Raspberry Pi temperature from the command-line

If you are overclocking your Raspberry Pi or you are just curious how hot this little guy gets, there are two ways to get the internal temperature, assuming you are running Raspbian as your operating system.

Method 1:
$ /opt/vc/bin/vcgencmd measure_temp
This gives you the temperature in degrees Celsius: temp=54.1'C

Method 2:
If you need the temperature to be more precise (e.g. for storing it in a database or for further processing), use the following command.
$ cat /sys/class/thermal/thermal_zone0/temp
This will give you the temperature in millidegrees Celsius: 54072
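
If you want to process either value further, both outputs are easy to turn into a number. A minimal Python sketch, using the two sample outputs shown above as input (the function names are made up for this example):

```python
import re


def parse_vcgencmd(output):
    """Extract degrees Celsius from a string such as "temp=54.1'C"."""
    match = re.search(r"temp=([0-9.]+)'C", output)
    if match is None:
        raise ValueError(f"unexpected vcgencmd output: {output!r}")
    return float(match.group(1))


def millidegrees_to_celsius(raw):
    """Convert the sysfs reading (millidegrees Celsius) to degrees Celsius."""
    return int(raw.strip()) / 1000.0


from_vcgencmd = parse_vcgencmd("temp=54.1'C")
from_sysfs = millidegrees_to_celsius("54072\n")
```

On a Raspberry Pi the input strings would come from running the command or reading the file instead of being hard-coded.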

From my personal experience the temperature ranges from about 50°C to 55°C and I have never seen my Raspberry Pi running over 58°C.

RaspberryPi Temperature

Charting Sunrise and Sunset in Highcharts

To visually enhance my temperature logging I added some JavaScript that computes sunrise and sunset for the 24h, 28h, weekly and monthly charts. I then use this information to plot vertical bands on the chart that indicate the effect of the sun on temperature (and humidity):

To add the bands to your Highcharts chart, just get the sunrise and sunset values for a particular day and push them onto xAxis.plotBands.
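
The structure of those band entries can be sketched independently of the sunrise calculation. The snippet below is in Python rather than the JavaScript used on the page and only builds the configuration data; the sunrise and sunset times are hypothetical placeholders, and shading the daylight hours (rather than the night) is an assumption.

```python
import json
from datetime import datetime, timezone


def to_highcharts_ms(moment):
    """Highcharts datetime axes expect milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def daylight_bands(sun_times, color="#FCFFC5"):
    """Build one xAxis.plotBands entry per (sunrise, sunset) pair."""
    return [
        {"from": to_highcharts_ms(rise), "to": to_highcharts_ms(sunset), "color": color}
        for rise, sunset in sun_times
    ]


# Hypothetical sunrise/sunset times; real values come from the sun calculation.
sun_times = [
    (
        datetime(2013, 6, 1, 3, 0, tzinfo=timezone.utc),
        datetime(2013, 6, 1, 19, 30, tzinfo=timezone.utc),
    ),
]
x_axis = {"type": "datetime", "plotBands": daylight_bands(sun_times)}
x_axis_json = json.dumps(x_axis)
```

Each entry needs only `from`, `to` and `color`; the resulting JSON has the same shape as the object the chart options expect under `xAxis`.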

The resulting chart:

Gathering and Charting Temperatures using RRDTool and Highcharts

tl;dr Check out the charts on my RaspberryPi

For quite a long time I was looking for a way to monitor and record the temperature and humidity in my apartment. What was missing was a convenient, preferably wireless solution. After receiving my RaspberryPi I started to look into it more intensively.

USB-WDE1 Receiver

The USB Weather Data Receiver USB-WDE1 wirelessly receives data from various ELV weather sensors at 868 MHz. The receiver is connected to a USB port on the computer, so no additional power supply is required. The data is transmitted via a simple serial ASCII protocol, which is well documented by ELV. The RaspberryPi running Raspbian is used for the data acquisition, which keeps power consumption very low while remaining completely flexible.

The USB interface of the USB-WDE1 is implemented with the CP2102 USB-to-serial converter from Silicon Labs. The kernel module responsible for accessing the device, CP2101, is included in any modern Linux distribution. After you connect the USB-WDE1, the corresponding messages should appear in the kernel log:

$ dmesg
usb 1-3.1: Product: ELV USB WDE1 weather data receiver
usb 1-3.1: Manufacturer: Silicon Labs

The udev subsystem then creates a corresponding device file, usually /dev/ttyUSB0. To a Linux application this device behaves like a serial port and can therefore be accessed with any terminal program such as minicom. If you connect other USB-to-serial converters to the RaspberryPi, the device may instead be called /dev/ttyUSB1 or similar. It is important to set the baud rate to 9600 bits/s.

A simple and universal way to print the data supplied by the receiver on the terminal is the tool socat, which should also be available in any Linux distribution. You may have to install it via the package manager. Run the following command to print the incoming data:

socat /dev/ttyUSB0,b9600 STDOUT

Each line represents a complete data set consisting of 25 semicolon-separated fields. The first three fields are fixed, followed by the measured temperatures (°C) of eight sensors and their humidity values (%). The next fields show the temperature (°C), humidity (%), wind speed (km/h), precipitation (rocker tips) and rain sensor state (0/1) of the combination sensor. Since I do not have a combination sensor I won’t focus on those values. The last field with the fixed value of 0 indicates the end of the record.
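
The field layout described above translates directly into a small parser. This is a sketch in Python, not the script used on the Pi; the example line is constructed for illustration, and accepting a decimal comma in the numbers is an assumption about the receiver's number format.

```python
def parse_wde1_line(line):
    """Split one receiver line into temperatures and humidities for sensors 1-8.

    Field positions follow the description above: 3 fixed fields, 8
    temperatures, 8 humidities, 5 combination-sensor fields, 1 end marker.
    Empty fields (no sensor) become None.
    """
    fields = line.strip().split(";")
    if len(fields) != 25:
        raise ValueError(f"expected 25 fields, got {len(fields)}")

    def number(text):
        # Accepting a decimal comma is an assumption about the number format.
        return float(text.replace(",", ".")) if text else None

    temperatures = [number(f) for f in fields[3:11]]
    humidities = [number(f) for f in fields[11:19]]
    return temperatures, humidities


# Constructed example: only sensor 1 reports (21.5 degrees C, 48 % humidity).
# The contents of the three leading fields are placeholders.
example_fields = ["$1", "1", ""] + ["21,5"] + [""] * 7 + ["48"] + [""] * 7 + [""] * 5 + ["0"]
temps, hums = parse_wde1_line(";".join(example_fields))
```

Checking the field count first means a line that was cut off during transmission is rejected instead of being stored with shifted values.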

Gathering Data With RRDtool

Now that I could receive temperature as well as humidity from the sensors, I needed a way to store the information. For this I chose the RRDTool package. It’s a circular database (RR in RRD stands for Round Robin) that lets you store a predefined amount of data. Right after creation the database is as big as it will ever get and contains only “unknown” data. RRDTool is a widely used open source package that has a somewhat steep learning curve in places but offers a comprehensive set of features. It works on multiple platforms including Linux and Windows and has a large, active support community.
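
The round-robin idea can be illustrated with a fixed-size buffer in Python. This only demonstrates why the storage never grows; RRDTool itself does considerably more (for example consolidating values over time), and the size of 5 and the readings are arbitrary example values.

```python
from collections import deque

# A fixed-size buffer: once full, each new value pushes out the oldest one.
# It starts out filled with "unknown" entries (None), like a fresh database.
archive = deque([None] * 5, maxlen=5)

# Arbitrary example readings.
for reading in [20.1, 20.3, 20.2, 20.6, 20.9, 21.0, 21.4]:
    archive.append(reading)

latest = list(archive)  # holds only the five most recent readings
```

After seven readings the buffer still has exactly five slots, and the two oldest readings are gone.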

The ‘rrdtool create’ command is used to set up the database. Here’s the bash script I used to set it up:

Once I had set up the database I needed something to read all of the sensors every 5 minutes and place the data in the DB. For this I run a little script:

The script remains in an infinite loop while socat receives data from the sensors. After a complete line has been received, rrdtool updates the database.
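
The update step boils down to one `rrdtool update` call per received line. A minimal Python sketch of how that call could be assembled (this is not the original script, and the order of the values is assumed to match the data sources defined when the database was created):

```python
def rrd_update_args(rrd_file, values):
    """Build the argument list for `rrdtool update`.

    "N" stands for "now" and "U" marks an unknown value. The order of
    `values` must match the data sources in the database (assumed here).
    """
    rendered = ["U" if v is None else f"{v:g}" for v in values]
    return ["rrdtool", "update", rrd_file, "N:" + ":".join(rendered)]


# Example values: one temperature, one missing sensor, one humidity.
args = rrd_update_args("temperatures.rrd", [21.5, None, 48])
# To actually run it: subprocess.run(args, check=True)
```

Writing "U" for a missing sensor keeps the positions of the remaining values intact, so a sensor that drops out does not shift the other readings into the wrong data source.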


Once you have some data in your temperatures.rrd database it is time to create some charts. rrdtool comes with a built-in graphics engine that makes it easy to create charts. One drawback, though, is that the generated charts do not look very appealing:

Another reason to avoid pre-generated graphics is that creating them takes a lot of CPU power, which is a very limited resource on the RaspberryPi. In the beginning I created the charts every five minutes. Later I changed the schedule to every hour for the monthly and yearly periods. Even then I was not totally happy with the result. After looking for alternatives on the Internet I stumbled upon Highcharts, a charting library written in pure JavaScript. This approach delegates the chart generation to the client side. Therefore I export the data from RRDTool to an XML file:

This results in a bunch of XML files. I use the jQuery.get method to fetch the contents of the XML files. In the success callback function, I parse the returned values, add the results to the series members of the options object, and create the chart:
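
The parsing step is the same regardless of where it runs. As an illustration, here is a Python sketch that turns exported rows into the [timestamp, value] pairs a Highcharts series expects. It assumes the row layout written by `rrdtool xport` (`<row><t>…</t><v>…</v></row>`); the sample document is invented and shortened.

```python
import math
import xml.etree.ElementTree as ET


def xport_to_series(xml_text):
    """Turn <row><t>..</t><v>..</v></row> entries into [ms, value] points.

    The element names assume the `rrdtool xport` layout. Only the first
    value column is read, and rows whose value is NaN are skipped.
    """
    root = ET.fromstring(xml_text)
    points = []
    for row in root.iter("row"):
        timestamp = int(row.findtext("t"))
        value = float(row.findtext("v"))
        if math.isnan(value):
            continue
        points.append([timestamp * 1000, value])  # Highcharts wants milliseconds
    return points


# Invented, shortened sample document.
sample_xml = """
<xport>
  <data>
    <row><t>1357000200</t><v>2.15e+01</v></row>
    <row><t>1357000500</t><v>NaN</v></row>
    <row><t>1357000800</t><v>2.17e+01</v></row>
  </data>
</xport>
"""
series = xport_to_series(sample_xml)
```

Dropping the NaN rows leaves a gap in the data; whether the chart should show the gap or connect the neighbouring points is a display choice.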

The resulting chart:
Last week's temperature charted with Highcharts

Setting up Dynamic DNS on the Raspberry Pi

Once you have set up your Raspberry Pi, chances are that you want to access it from a remote machine or host a little web site on it. The problem is that your provider usually gives you a dynamic IP, which changes every time you connect to the Internet. In Germany most (A|V)DSL providers reset your connection every 24h. The solution is dynamic DNS (DDNS), which automatically updates the name server in the Domain Name System (DNS). Here is how you set it up using NoIP as the provider (this guide originally used DynDNS).

The first step is to obtain a free subdomain from the provider. To do so, register an account with the provider and check the little box Create my hostname later. Once you have activated the account and logged into the website, click on Hosts/Redirects. To create a new subdomain, click on Add a Host and get creative. As host type choose DNS Host (A).

After setting up your host you can update your IP using curl:


To update your IP every time you boot your Raspberry Pi you can set up a cron job: crontab -e


After you have successfully updated your IP address you have to set up your router to open the desired ports and forward requests to the Raspberry Pi in your local network.

Update: DynDNS no longer offers free accounts.

A (not so) safe betting strategy for winning at roulette

One time I was on a trip to Budapest with a couple of friends. While roaming the streets we were passing by a casino and my friend insisted that there was a perfect strategy that would only lead to winning at roulette tables. Curious as I was I had him explain his theory. The system basically works as follows:

First, you place a coin on red. If red wins, take your winnings and start over. Otherwise, you double your bet after every loss, so that the first win recovers all previous losses plus a profit equal to the original stake. If there were no constraints this could actually work. I usually get suspicious when I hear “guaranteed wins” in the context of gambling. My first doubt was that the chances of red and black are not fifty percent each, since there is also the zero (on a single-zero wheel red covers 18 of 37 pockets, about 48.6%). Another issue is that roulette tables usually have a betting limit.

I thought the best way to convince my friend that his system was not as perfect as he thought was to simulate the whole process and show him the outcome.

Our gambler starts with a budget of 1000 coins and bets 1 coin initially. He also has a winning target: once he reaches it, he leaves the table and goes home happy. The table limit is 1200 coins per bet. For simplicity I also assume that if zero comes up it counts as a lost bet.
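
The setup above can be sketched as a small Monte Carlo simulation. The version below is a Python illustration with the stated parameters, not the code behind the graph. It assumes a single-zero wheel and, as a further assumption, lets the gambler bet whatever he has left when the doubled stake exceeds his remaining budget.

```python
import random


def play_session(rng, budget=1000, target=1100, base_bet=1, table_limit=1200):
    """Play Martingale on red until the target is reached or the money is gone.

    Assumes a single-zero wheel (18 red pockets out of 37) and that the
    gambler bets his remaining budget when he cannot afford the doubled stake.
    """
    next_bet = base_bet
    while 0 < budget < target:
        bet = min(next_bet, table_limit, budget)
        if rng.random() < 18 / 37:  # red wins; black and zero lose
            budget += bet
            next_bet = base_bet
        else:
            budget -= bet
            next_bet = bet * 2
    return budget >= target


def success_rate(target, trials=300, seed=42):
    """Estimate the probability of reaching `target` by simulation."""
    rng = random.Random(seed)
    wins = sum(play_session(rng, target=target) for _ in range(trials))
    return wins / trials


rates = {target: success_rate(target) for target in (1100, 1500, 2000)}
```

The number of trials is kept small so the example runs quickly; a smoother curve needs more trials and more target values.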

Roulette Martingale

The graph clearly shows that the higher the gambler sets his target, the lower the probability of reaching it. If you want to play with the system, alter some parameters or extend it to a different betting strategy, here is my code:

Results of the St. Pat’s 10 Miler and 5K

Recently I ran the St. Pat’s 10 Miler in Atlantic City, NJ. It was my first official running event ever and I enjoyed it a lot.

Shortly after the race the official results were posted on the Internet. The data included not only the numbers and times of the participants but also their gender and age. The finisher time distribution shows that most runners finished in around 90 minutes:

Finisher Time Distribution of the St. Pat's 10 Miler and 5K 2008

How does age affect the finishing time?

Finisher Time by Gender and Age of the St. Pat's 10 Miler and 5K 2008

The code to generate the images: