ZosimoLab


ZosimoLab by Thilo Weber

View the Project on GitHub magnetilo/zosimolab


ZosimoLab is a data and knowledge company by Thilo Weber.

Data is the new gold! 
You just need an alchemist.

It’s possible. Ask Zosimo the Alchemist.

Services



Example Projects


Parsing Unstructured PDFs & Information Extraction from Documents

Description:

For Demokratis, I helped make legislative consultation PDFs more accessible and interactive by using AI to parse them into structured data. I followed an evaluation-driven development approach, which helps to navigate the landscape of innumerable possible parsing solutions.

Methods:

Specials:

Similar approaches can be used to bring a wide range of real-world data and documents into a structured form, providing the basis for a variety of data analysis and processing use cases.

Technology:

Python, LlamaParse, OpenAI APIs, MLflow, difflib

Difflib HTML comparison
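As a rough illustration of evaluation-driven parsing, the sketch below scores parser outputs against a hand-checked reference with `difflib` and renders an HTML comparison. It is a minimal sketch, not the project's actual pipeline: the function names, the sample text, and the two "parsers" are hypothetical.

```python
import difflib


def parse_similarity(reference: str, parsed: str) -> float:
    """Return a 0..1 similarity score between reference text and parser output."""
    # Compare line by line so that dropped or altered lines lower the score.
    matcher = difflib.SequenceMatcher(None, reference.splitlines(), parsed.splitlines())
    return matcher.ratio()


def html_diff(reference: str, parsed: str) -> str:
    """Render a side-by-side HTML table highlighting the differences."""
    return difflib.HtmlDiff().make_table(
        reference.splitlines(),
        parsed.splitlines(),
        fromdesc="reference",
        todesc="parsed",
    )


# Hypothetical example data: one hand-checked reference and two parser outputs.
reference = "Art. 1 Purpose\nThis act regulates X.\nArt. 2 Scope"
candidates = {
    "parser_a": "Art. 1 Purpose\nThis act regulates X.\nArt. 2 Scope",
    "parser_b": "Art. 1 Purpose\nThis act regulates\nArt. 2 Scope",
}

# Score every candidate and pick the one closest to the reference.
scores = {name: parse_similarity(reference, text) for name, text in candidates.items()}
best = max(scores, key=scores.get)
```

Scoring every candidate against the same reference makes it possible to compare parsing solutions by a number rather than by eye, while the HTML table helps to inspect where a given parser went wrong.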



Energy Efficiency Monitoring and Root Cause Analysis

Description:

At geoimpact, I developed machine learning models for predicting the yearly heat and electricity demands of buildings. In the context of a Master’s thesis, we used these models to estimate the actual reduction in yearly heat consumption achieved by different renovation measures (roof/facade insulation, window replacement, …). This framework can be used to monitor the effects of different measures applied to buildings in a portfolio or a region.

Methods:

Specials:

Apart from efficiency monitoring, such a framework can be used for a general root cause analysis of physical processes. For example, I introduced the framework to a friend who works for a large chemical company. He quickly gained a lot of valuable insights into their production processes from it. He has since been known as “the data leech” at his company.

Technology:

Python, Scikit-Learn, TensorFlow, MLflow, PostgreSQL, Kubernetes

Framework diagram

Partial dependences for different input features
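To make the idea of a partial dependence concrete, here is a minimal pure-Python sketch: a feature is forced to each value of a grid for all buildings, and the model's predictions are averaged. The toy model, the feature names, and the numbers are hypothetical stand-ins, not the trained models or data from the project.

```python
from statistics import mean
from typing import Callable, Sequence

Row = dict[str, float]


def partial_dependence(
    predict: Callable[[Row], float],
    rows: Sequence[Row],
    feature: str,
    grid: Sequence[float],
) -> list[float]:
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the feature in every row, keep all other inputs as observed.
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(mean(preds))
    return curve


# Hypothetical stand-in for a trained heat-demand model (kWh per year).
def toy_heat_model(row: Row) -> float:
    return row["floor_area_m2"] * (120.0 - 40.0 * row["roof_insulated"])


buildings = [
    {"floor_area_m2": 100.0, "roof_insulated": 0.0},
    {"floor_area_m2": 200.0, "roof_insulated": 1.0},
]

# Compare the average predicted demand without and with roof insulation.
without, with_insulation = partial_dependence(
    toy_heat_model, buildings, "roof_insulated", [0.0, 1.0]
)
estimated_saving = without - with_insulation
```

The difference between the two averaged predictions gives a model-based estimate of the effect of a measure. With a real model, scikit-learn's inspection tools would be the more natural choice than this hand-rolled loop.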



Energy and CO2 Monitoring for Municipalities

Description:

At geoimpact, in collaboration with the Federal Office of Energy, we developed the public web application Energy Reporter, which monitors the progress of all Swiss municipalities in the energy transition. I was responsible for designing and deploying the methodology and the data pipeline that imports public datasets and updates the six indicators every week. The indicators of Energy Reporter are published as open data. Building on Energy Reporter, I further developed a comprehensive energy and CO2 monitoring system with over fifty fine-grained indicators, including the scope 1 and 2 CO2 emissions per municipality and year.

Methods:

This project involved researching and processing many different public data sources. In particular, the two indicators for electricity consumption and renewable electricity production rely on statistical models that each build on more than ten different data sources.

Specials:

Energy Reporter is widely used by the public, especially by municipal officers and the media. Its open data is used by many Swiss media outlets such as SRF, 20 min, Watson, and many local newspapers. It uses an unconventional data visualization method that encourages users to interactively compare different municipalities, inspired by the Quartett card game.

Technology:

Python, Pandas, Scikit-Learn, PostgreSQL, Kubernetes, Hangfire

Energie Reporter



Model-based Signal Processing and Sparse Priors

Description:

In my Master’s thesis at ETH Zurich, I used statistical signal processing methods to separate positional eye movement measurements into different types (saccades, smooth pursuit, and fixation eye movements). I developed a novel approach to processing eye movement signals based on estimating signals in a mechanistic physiological model of the eye muscles. Apart from signal separation, the framework can also estimate the neural inputs to the eye muscles from the positional measurements.

Methods:

Specials:

My thesis made two main contributions: first, the use of a new sparsity prior within the factor graph framework, which made it possible to set the sparsity level appropriately; second, the use of existing mechanistic eye movement models, which have been developed since the 1980s, to solve this problem.

Technology:

Matlab




Machine Learning-based Lead Generator (or Recommender System)

Description:

At geoimpact, I developed a lead generator that suggests promising buildings for the sale of a renewable energy product based on a list of past sales of the product.

Methods:

Specials:

We tested this method with a company selling photovoltaic systems. The project was stopped after the testing phase, as the cold acquisition process was too tedious. Here, I learned that a method can be theoretically very elaborate, but in the end it still needs to fit well into an end-to-end business workflow in order to be practical.

Technology:

Python, Scikit-Learn
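One simple way to suggest leads from a list of past sales is to rank candidate buildings by their similarity to the average past sale. The sketch below shows this idea in plain Python. It is an assumption for illustration only: the source does not state the actual method, and the feature names, building ids, and values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_leads(past_sales, candidates):
    """Sort candidate ids by similarity to the average past sale."""
    n = len(past_sales)
    # Column-wise mean of the past sales' feature vectors.
    centroid = [sum(col) / n for col in zip(*past_sales)]
    scored = {cid: cosine(vec, centroid) for cid, vec in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical, already-normalized features: [roof area, owner-occupied, building age].
past_sales = [[0.9, 1.0, 0.2], [0.8, 1.0, 0.3]]
candidates = {
    "building_a": [0.85, 1.0, 0.25],
    "building_b": [0.1, 0.0, 0.9],
}
leads = rank_leads(past_sales, candidates)
```

The most similar buildings come first in `leads`, so a sales team could work through the list from the top.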




Time-to-Event Prediction and Survival Analysis

Description:

At geoimpact, I developed multiple models for estimating the renovation pressure of a building. The underlying problem of estimating the time until a certain event happens has a variety of applications beyond renovation rates, for example in industry, medicine, churn prediction (in e-commerce and human resources), and more.

Methods:

There are different ways to mathematically express such a pressure. We explored two approaches:

Specials:

The renovation pressure model has been integrated by different real estate companies into their workflows and services.

Technology:

Python, lifelines, TensorFlow Probability
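For readers new to survival analysis, the following pure-Python sketch shows the classic Kaplan-Meier estimator, which handles buildings that have not been renovated yet (censored observations). It is a textbook illustration with hypothetical data, not the renovation pressure model itself. In practice a library such as lifelines would be used instead.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each observed event time.

    durations: time until the event or until observation ended (censoring).
    observed:  True if the event (e.g. a renovation) happened, False if censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t]:
            # Multiply by the share of at-risk subjects that survive time t.
            survival *= 1.0 - events[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: years since construction until renovation or last observation.
years = [10, 20, 20, 30, 40]
renovated = [True, True, False, True, False]
curve = kaplan_meier(years, renovated)
```

Each entry of `curve` gives the estimated probability that a building is still unrenovated after the given number of years. The steeper the drop at a given age, the higher the pressure to renovate around that age.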




Image Segmentation from Classification Labels only

Description:

At geoimpact, I developed a model for detecting photovoltaic (PV) systems in satellite images and estimating their area. The plan was to extract a Swiss-wide database of all installed PV systems and their installed capacity from aerial and satellite images.

Methods:

Specials:

Soon after we started the project, the Swiss Federal Office of Energy published an open dataset containing all registered PV systems, which made our project basically obsolete. Here I learned that there are often different ways to acquire a specific dataset. Sometimes there are simpler and more accurate acquisition methods than rather complex image detection approaches.

Technology:

Python, PyTorch, OpenCV




About


ca. 300: Zosimos of Panopolis (also known by the Latin name Zosimus Alchemista) was an alchemist living in the south of then Roman Egypt who wrote the oldest known books on alchemy. Alchemists strove to understand external and internal nature. In particular, one quest was the transformation of copper into gold through a process of purification. This process of purifying external nature was thought to be mirrored by a process of purifying internal nature, where the quest was to draw the spirit from its bondage with bodies.

1718: Zosimo was a farmer living in Sicily who was elected king by the people. The people of Sicily were at the time controlled by Spanish oppressors. Zosimo started a revolution that brought the power back to the people, at least for a short period of time. I like to think of him as a social transformer or alchemist. There is a book written about his story by Andrea Camilleri as well as a song by the band Il Coccodrillo.

2024: The first alchemists’ striving to understand and purify external nature evolved into a deep understanding of and ability to control physical nature, resulting in trains, cars, electrical devices and many other useful things. In a new age of data and artificial intelligence, we arrive at understanding and controlling another level of nature: the level of thoughts and information. ZosimoLab was founded to assist you in realizing the vast possibilities hidden on this level of thoughts and information - and thereby transforming your data into gold.

About me: My name is Thilo Weber. I have an MSc in electrical engineering and information processing from ETH Zurich and several years of experience in realizing data science and machine learning projects in industry. Through ZosimoLab, I want to share my passion for understanding the world through data with you, and help you to create great machine learning solutions. Further, I would also like to share my process of inner alchemy by embedding the current progress in data science and artificial intelligence in a psychological and philosophical understanding.




Partners

To create optimal solutions on larger data projects, I work with dedicated implementation partners:



Blog

I write about my projects, knowledge of the world, and self-knowledge on Medium.



Contact
