When dealing with geospatial data it is sometimes useful to have a grid at hand that represents the given data. One way to create such a grid is to use geohashes. Geohashes are a hierarchical spatial data structure that subdivides space into grid-shaped buckets, one of the many applications of the Z-order curve and of space-filling curves in general. A geohash is an encoded character string computed from geographic coordinates.
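To make the idea concrete, here is a minimal sketch of the encoding (my own illustration, not code from the post): longitude and latitude bits are interleaved, and every five bits map to one base32 character, so precision grows with each added character.

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=6):
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits, bit_count, even, chars = 0, 0, True, []
    while len(chars) < precision:
        # alternate between a longitude bit and a latitude bit
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        bits <<= 1
        if val >= mid:
            bits |= 1
            rng[0] = mid
        else:
            rng[1] = mid
        even = not even
        bit_count += 1
        if bit_count == 5:              # five bits -> one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

print(geohash(52.52, 13.405))           # a geohash starting with "u33" (central Berlin)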

Continue reading

On October 5th the PostgreSQL Global Development Group announced the release of PostgreSQL 10. It comes with a tremendous amount of new features: table partitioning, logical replication, improved parallel queries, stronger password hashing, durable hash indexes, and more. A nice list, including explanations, can be found on Robert Haas’ blog. This post explains how to upgrade to the latest version of PostgreSQL on macOS using Homebrew. At the time of this writing I was using macOS 10.

Continue reading

After three great days at PyCon US 2017 in Portland, OR, Hendrik and I decided to participate in the development sprints following the conference. The code sprints are an essential part of PyCon and a chance to meet some of the maintainers and contributors of various open source projects. For us it was the first time attending a code sprint. The day before the sprints there was a session helping people set up Git and Python (including virtual environments) and get familiar with version control.

Continue reading

Recently, I set up Jupyter Notebooks on a server at work. The idea was to create an environment where every team member could run analyses using Python and share the results with the rest of the team. After reading the documentation, I found out that the Jupyter Notebook web application comes with a Contents API, so I quickly put together a little Munin script that collects some statistics about the current notebooks. The graph shows the total number of notebooks on the server as well as the number of currently open notebooks.
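One way such a plugin could look, as a hedged sketch rather than the post's actual script: the base URL, the absence of an auth token, and the use of /api/sessions to count open notebooks are my assumptions.

import sys
import requests

BASE = "http://localhost:8888"

def count_notebooks(path=""):
    # recursively count .ipynb files via the Contents API
    total = 0
    listing = requests.get(BASE + "/api/contents/" + path).json()
    for item in listing["content"]:
        if item["type"] == "notebook":
            total += 1
        elif item["type"] == "directory":
            total += count_notebooks(item["path"])
    return total

if len(sys.argv) > 1 and sys.argv[1] == "config":
    # Munin calls the plugin with "config" to learn about the graph
    print("graph_title Jupyter notebooks")
    print("total.label total notebooks")
    print("open.label open notebooks")
else:
    print("total.value %d" % count_notebooks())
    print("open.value %d" % len(requests.get(BASE + "/api/sessions").json()))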

Continue reading

An ETL import graph is built on the logical dependencies of the jobs on each other. So typically a SQL transformation job depends on all the previous jobs that create the tables used in its query. But once there are a certain number of jobs, dependencies often get a bit more complicated and some of them become redundant in the process. A simple example can be seen in the dependency graph figure in the post, where the three red edges are redundant.
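Pruning such edges amounts to computing the transitive reduction of the DAG: an edge is redundant if another path already connects its endpoints. A sketch using networkx (an assumption on my part; the post may implement this differently):

import networkx as nx

g = nx.DiGraph([
    ("extract_a", "transform"), ("extract_b", "transform"),
    ("transform", "report"),
    ("extract_a", "report"),    # redundant: already implied via transform
])

reduced = nx.transitive_reduction(g)

# the edges that can be dropped without changing the ordering constraints
print(sorted(set(g.edges()) - set(reduced.edges())))
# -> [('extract_a', 'report')]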

Continue reading

To speed up the ETL data pipeline, you should try to run jobs in parallel. Obviously, in most cases not all jobs can run at the same time, since there are dependency constraints between the jobs and limits on the server's capacity (number of processors and/or I/O bandwidth). So assuming the server allows you to run n jobs in parallel, there is often the situation that the dependencies give you the option to run any of a set of m different jobs, with m > n.
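One common heuristic for this m > n situation, sketched below with a made-up job graph (not necessarily the post's approach): among the ready jobs, prefer those with the longest chain of dependents, since delaying them delays everything downstream. For simplicity the sketch runs batches in lockstep.

from functools import lru_cache

deps = {                        # job -> jobs it depends on
    "load_a": set(), "load_b": set(),
    "transform": {"load_a", "load_b"},
    "report": {"transform"},
}
children = {j: set() for j in deps}
for job, parents in deps.items():
    for p in parents:
        children[p].add(job)

@lru_cache(maxsize=None)
def chain(job):
    # length of the longest chain of jobs that depend on this one
    return 1 + max((chain(c) for c in children[job]), default=0)

n, done = 2, set()              # n parallel slots
while done != set(deps):
    ready = [j for j in deps if j not in done and deps[j] <= done]
    batch = sorted(ready, key=chain, reverse=True)[:n]
    print("running:", batch)
    done |= set(batch)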

Continue reading

Once you have set up a web server like Apache or nginx on the Raspberry Pi, it is time to create a website. From here there are several options: a CMS that relies on a database, purely hand-crafted pages, or static pages generated by a script. I chose the latter for several reasons. Static sites have a lot of advantages:

- no database to slow requests down
- greater security: since they contain no dynamic content, they are immune to the most common attacks
- flat text files, which makes them ideal for use with version control systems such as Git
- a low footprint on the server, as only raw HTML files are served

But there are also some limitations:

Continue reading

I have been using Munin to monitor the health of my Raspberry Pi for a while now. As I have more devices installed in my network, I was looking for a way to monitor these devices as well. Since Munin uses a client-server model, you are required to install a Munin node on each device to be monitored. Every five minutes the Munin server polls its clients for values and creates charts using RRDtool.
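The polling itself is a simple plain-text exchange over TCP port 4949, which you can reproduce by hand. A short sketch (the hostname is an assumption for illustration):

import socket

sock = socket.create_connection(("raspberrypi.local", 4949))
f = sock.makefile("rw")
print(f.readline().strip())     # greeting banner from the node
f.write("list\n")               # ask which plugins the node offers
f.flush()
print(f.readline().strip())     # e.g. "cpu df memory uptime ..."
f.write("quit\n")
f.flush()
sock.close()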

Continue reading

After collecting some photovoltaic data using PikoPy and some readings from the residential meter, it was time to put everything together. The data is collected by a couple of scripts triggered by cron every few minutes:

$ crontab -l
*/5 * * * * python /home/solarpi/kostal_piko.py
*/5 * * * * python /home/solarpi/collect_meter.py
*/15 * * * * python /home/solarpi/collect_weather.py

The results are then written into a SQLite database.
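A minimal sketch of that last step (database path, table name, and schema are my assumptions, not the post's actual code):

import sqlite3
import time

conn = sqlite3.connect("/home/solarpi/solarpi.db")
conn.execute("""CREATE TABLE IF NOT EXISTS readings
                (timestamp INTEGER, current_power REAL)""")
reading = 1234.5                # value would come from the inverter
conn.execute("INSERT INTO readings VALUES (?, ?)",
             (int(time.time()), reading))
conn.commit()
conn.close()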

Continue reading

The first step of my plan, building a Raspberry Pi based photovoltaic monitoring solution, is finished. I created a Python package that works with the Kostal Piko 5.5 inverter (and in theory should work with other Kostal inverters as well) and offers a clean interface for accessing the data:

from pikopy import Piko   # import path assumed; the teaser only shows "import pikopy"

# create a new Piko instance
p = Piko('host', 'username', 'password')

# get current power
print p.get_current_power()

# get voltage from string 1 (method name assumed; the teaser cuts off here)
print p.get_string1_voltage()

Continue reading

Christian Stade-Schuldt

Data Engineer @ HERE IoT innovation lab | Full-time geek | Cyclist | Learning from data

Berlin, Germany