Christian Scheible

I am a data scientist at Trusted Shops, based in Cologne, Germany. Previously, I worked at StepStone and at the Institute for Natural Language Processing (IMS) at the University of Stuttgart.

I write software that enables computers to automatically extract information from natural language and other data sources using machine learning. I work mainly in Java and Python.

In the past, I have worked primarily on text analysis. This includes semantic analysis, discourse processing, and sentiment analysis.

I am used to building tools from scratch, but I also work with natural language processing and machine learning libraries.



An open-source Java tool for detecting quoted speech in text. It provides a greedy algorithm for fast processing and sampling-based inference for higher accuracy.

Bayesian Naive Bayes

A Python implementation of two-class Naive Bayes with additional priors. Inference is done using Gibbs sampling. I used this model for unsupervised sentiment analysis.

Chess ratings

A Python tool that automatically predicts the strength of chess players. It uses an SVM to perform classification, ranking, and regression.


Sentiment relevance

Sentiment relevance captures whether a piece of text contains an opinion or not. This dataset of movie reviews contains close to 4000 sentences annotated for sentiment relevance.

Chess commentary

This corpus consists of annotated chess games that were posted online. I used this dataset to automatically predict the strength of chess players.

Deep semantic analogies

A collection of word analogy problems (e.g., man is to woman as king is to ?) in multiple languages.

Selected publications

(full list available here)

Contact me

christian (at)

professional GitHub | personal GitHub
