An ETL import graph is build on logical dependencies of the jobs to each other. So typically a SQL transformation job depends on all the previous jobs that create the tables used in the query. But once there are a certain number of jobs, dependencies often get a bit more complicated and some of them become redundant in the process. A simple example can be seen in the dependency graph from figure, where the three red edges are redundant.

Continue reading

To speed up the ETL data pipeline, you should try to run jobs in parallel. Obviously, not all jobs can run at the same time in most cases, since there are dependency constraints between the jobs and limits of the servers capacity (number of processors and/or IO bandwidth). So assuming the server allows you to run n jobs in parallel, often there is the situation that the dependencies give you the option to run any of a set of m different jobs with m > n.

Continue reading

Once you have set-up a web server like Apache or nginx running on the Raspberry Pi it is time to create a website. From here there a several options: A CMS that relies on a database, some purely manual crafted pages or a static pages generated by a script. I chose the latter for some reasons. Static sites have a lot of advantages: no database to slow requests down offer greater security, as they do not contain dynamic content, so are immune to the most common attacks flat, text files, makes them ideal to be used with version control systems, such as Git low footprint on the server as serving raw html files But there also some limitations:

Continue reading

I have been using Munin to monitor the health of my Raspberry Pi for while now. As I have more devices installed in my network I was looking for a way to monitor these devices as well. As Munin uses a client-server model you are required to install the Munin node on the device to be monitored. Every five minutes the Munin server polls its clients for the values and creates charts using RRDTool.

Continue reading

After collecting some photovoltaic data using PikoPy and a some readings from the residential meter it was time to put everything together. The data is collected by a couple of scripts triggered by a cronjob every five minutes. $ crontab -l */5 * * * * python /home/solarpi/kostal_piko.py */5 * * * * python /home/solarpi/collect_meter.py */15 * * * * python /home/solarpi/collect_weather.py The results are then written into a SQLite database.

Continue reading

The first step of my plan, building a Raspberry Pi based photovoltaic monitoring solution, is finished. I created a python package that works with the Kostal Piko 5.5 inverter (and theoretically should work with other Kostal inverters as well) and offers a clean interface for accessing the data: import pikopy #create a new piko instance p = Piko('host', 'username', 'password') #get current power print p.get_current_power() #get voltage from string 1 print p.

Continue reading

I have been carrying my Fitbit One for a little over two years with me and it keeps tracking my daily steps. It also tracks my distance covered by multiplying those steps using the stride length which you can either provide explicitly or implicitly setting your heights. In the winter of 2012 I bought my first ~Garmin Forerunner 410~ (replaced by a Garmin Forerunner 920XT) GPS watch to help me track my running (and other outdoor) activities.

Continue reading

A friend of mine had a photovoltaic system (consisting of 14 solar panels) installed on his rooftop last year. As I was looking for another raspberry pi project I convinced him I would setup a reliable monitoring solution that will lead him to an access to the data in real-time data. The current setup comes with an inverter by the company Kostal. The Kostal Piko 5.5 runs an internal web server showing statistics like current power, daily energy, total energy plus specific information for each string.

Continue reading

I have been tracking my sleep for almost two years now using my Fitbit. I started with the Fitbit Ultra and then moved on the the Fitbit One after it came out. In October 2013 I found out about the Sleep Cycle (Link) app for the iPhone. For weeks, Sleep Cycle was listed as the best-selling health app in Germany, where currently (as of January 2014) it is in second place.

Continue reading

After the 2013 Berlin Marathon sold out in less than four hours, the organizers decided to alter the registration process for 2014. First there was a pre-registration phase followed by a random selection from the pool of registrants to receive a spot. Those who were selected had to register until November 11th, 2013. Any spots that were not confirmed till the 11th would be offered to pre-registered candidates according to the order in which they were randomly selected.

Continue reading

Author's picture

Christian Stade-Schuldt

Data Engineer @ HERE IoT innovation lab| Full-time geek | Cyclist | Learning from data

Data Engineer @ HERE

Berlin, Germany