Talks and publications

Conference talks and lectures

A Deep Dive Into Dispatching Techniques
Jonathan Müller

At the core of an interpreter is a loop that iterates over instructions and executes them in order. This requires dispatching: based on the current instruction, it needs to select different code. A fast interpreter requires a fast instruction dispatcher, but so does everything else that needs to switch over a fixed set of different options.

This talk investigates at various dispatching techniques, from simple switch statements to jump tables. We'll look at performance analysis tools, benchmarks, and lots and lots of assembly code, in order to learn ways to trick the compiler into generating the assembly code that we actually want.

Even if you don't need to actually write an interpreter or other dispatcher, you will learn a lot about optimization.

Express your expectations: A fast, compliant JSON pull parser for writing robust applications
Jonathan Müller

There are, by now, several well-established C++ JSON libraries, for example, boost.JSON, rapidjson, and simdjson. C++ developers can choose between DOM parsers, SAX parsers, and pull parsers. DOM parsers are by design slow and use a lot of memory, SAX parsers are clumsy to use and the only well-known pull parser simdjson does not fully validate JSON documents and also has high non-constant memory usage. Our open-source JSON parser fills the gap between the existing parser libraries. It is a fully validating, fast, pull parser with O(1) memory usage.

Its main contribution, however, is the API design. All existing parsers verify that a parsed document is valid JSON. But most applications require the data to have a specific structure, for example, that an object has specific required keys while other keys may be optional. Their associated values in turn are expected to be, for example, strings, objects or arrays. Currently, developers need to implement their own checks and their own error handling on top of the existing parser APIs.

Our API forces developers to express these semantical constraints, providing automatic error handling in return. The resulting code concisely documents the required JSON structure and always handles errors correctly. We have found this to be extremely useful in practice.

This talk will show the JSON parser API in practice, compare it to the established parsers, and will demonstrate some elegant generic programming C++ techniques to beginners and intermediate C++ developers.

C++ Features You Might Not Know
Jonathan Müller

C++ is a big language — the upcoming C++23 standard will be over 2000 pages long. This talk will cover some obscure features you might not know. We will cover strange syntax like commutative array indexing and complicated declarators, surprising cases of undefined behavior in frequently used operators contrasted with a surprising lack of undefined behavior in operations that really shouldn’t work, overlooked language facilities — some of them actually useful, and half-forgotten standard library functions — some of them for good reason.

For each feature, we will talk about the what, the why, and how you can use it to write better (or much, much worse) C++ programs.

The New Library On The Block: A strong library foundation for your next project
Jonathan Müller

We at think-cell have given many conceptual talks about iterators, ranges, string formatting, and generic programming in the past. Now, we would like to present the library that is the foundation of our code base and that lets us write code the way we like it: Short, elegant, and to the point.

Did you know that std::move can be called on a const reference? And that you cannot move out of that const reference but will silently copy? Wouldn’t it be nice to get a compiler error in that case? Our range implementation is more powerful and easier to use than std::ranges. We treat generator functions as ranges, too. With the same syntax as iterator-based ranges, you often do not need to know the difference. We extend the standard range algorithms (and our own additional algorithms) with return type arguments that express common programming patterns elegantly: How often did you call std::ranges::find_if and compare the result to an end iterator? Wouldn’t it be shorter and more elegant to write something like tc::find_if<tc::return_bool>?

The talk will focus on actual usage, rather than conceptual ideas. The library is open-source, header-only and thus easy to integrate into any project. For any new project, it will be a significant head start over the standard library. We strive for our library to be close to the standard library in names and conventions, so it is compatible with std ranges and algorithms and you can mix and match it with other libraries.

Nobody Can Program Correctly. A Practical and Interactive Guide to Debugging C++ Code
Sebastian Theophil

We like to write code but—despite our best efforts—we make mistakes. Our program will contain bugs. Sometimes, we don’t write what we mean to write, sometimes we don’t understand an aspect of our programming language and at other times we lack—or fail to consider—some critical information about our program’s system environment. As a result, our program will not behave correctly. What do we do now? In this talk, I would like to take you through the entire debugging process, starting with a program that crashes. What do we do next? Which questions do we have to ask? What information do we need? What can we do to find the cause of the crash? Which tools can help us in this quest, and, last but not least, what can we do to make sure this bug never happens again? Thanks to real-world examples that we have encountered—and debugged—at think-cell over the years, you will learn how to reproduce, locate, understand and fix even the most difficult bugs.

Typescripten — Generating type-safe JavaScript bindings for emscripten
Sebastian Theophil

WebAssembly has become a very popular target platform for C++ developers. Thanks to emscripten, porting native applications to WebAssembly is easy — as long as the application only uses the browser as a display and input device. However, emscripten does not provide type-safe wrappers to the standard JavaScript APIs such as the DOM manipulation API, let alone any other JavaScript API provided by existing web applications. Our open source tool “typescripten” has been built on top of three powerful technologies to close this gap. It uses TypeScript interface definition files and the TypeScript compiler API to create type-safe C++ wrappers based on emscripten. TypeScript already has interface definition files for over 7000 JavaScript libraries that you can now use safely from C++. We strive to design our C++ wrappers such that the C++ code using them looks similar to the equivalent TypeScript or JavaScript code. However, the peculiar semantics of prototype-based Javascript and Typescript are often difficult to translate into a type-based language such as C++. I will discuss the challenges we faced and choices we made when designing this framework.

Windows, macOS and the Web: Lessons from cross-platform development at think-cell
Sebastian Theophil

For twelve years, think-cell had been a Windows-only software company and our codebase of approximately 700k lines of code had accumulated many unintentional platform dependencies. Six years ago, we decided to port our application to the Mac. This change has affected every part of our development process: the project organization, build system and the way we program in C++ today. The commonly used cross-platform libraries such as Qt and boost were good tools to build on, but by themselves were not enough. For many concepts, such as mutexes, semaphores or shared memory, they only offer a common interface to platform-specific objects with very different semantics and lifetimes. We wanted light-weight, platform-independent C++ abstractions with identical semantics for rendering, internationalization, file I/O, mouse event handling, RPC calls and error reporting. Developing these was challenging, firstly, because we had to define which semantics our application needed and, secondly, we had to implement them on each platform. This was not an easy process but I would argue it has improved the quality of our code very much. By now, we have moved on to the next challenge and have started to move some functionality to web applications. We wanted to reuse our existing code-base of course, and that meant writing web applications in expressive, type-safe C++. Definitely an advantage in our book! We have built our web applications using emscripten, but we generate type-safe C++ bindings from any TypeScript interface definition. In my talk, I will give you an overview of the C++ abstractions we have implemented, focusing on the cross-platform problem areas where common semantics were hard to define due to limitations of either one of the operating systems, and of course I will show you our tools that let us write web application in C++.

The C++ rvalue lifetime disaster
Arno Schödl

Rvalue references have been with us since C++11. They have originally been introduced to make moving objects more efficient: the object an rvalue reference references is assumed to go out of scope soon and thus may have its resources scavenged without harm. The C++ standard library, for example std::cref or std::ranges, makes use of yet another aspect of rvalue references: since they go out of scope soon, it is assumed unsafe to hold on to them beyond the scope of the current function, while lvalue references are considered safe. We, too, found this assumption to be very useful for smart memory management, in particular in generic code.
Unfortunately, the C++ language itself violates this assumption. Rvalues bind to const&. This means that innocent-looking functions silently convert rvalues to lvalue references, hiding any lifetime limitation of the rvalues. Temporary lifetime extension is meant to make binding a temporary to a reference safe by extending the lifetime of the temporary. But this only works as long as the temporary is a prvalue, and already breaks with rvalue references, let alone spuriously generated lvalue ones. These problems are not merely theoretical. We have had hard-to-find memory corruption in our code because of these problems. In this talk, I will describe the problems in detail, present our library-only approach to mitigate the problems, and finally, make an impossible-to-ever-get-into-the-standard proposal of how to put things right.

Better C++ Ranges
Arno Schödl

Ranges have been in the C++ standard for while now, and find their way into modern codebases. At think-cell, we have been developing and using our own range library for 20 years, which is built on top of the standard and is compatible with it, but goes beyond it in many aspects. Range adaptors are often stacked, a filter on top of a transform on top of etc. To make such a stack efficient, iterators are not good enough. We use a new concept that is more efficient and at the same time compatible with iterators so library users can keep using iterators as they did before. The standard library is very strict about the distinction between containers and views, and range adaptors are views which must maintain a reference to the data being adapted. Instead, we allow range adaptors to hold data themselves to make them self-contained and lazily evaluated at the same time. The standard iterator model only allows external iteration. However, internal iteration is often much easier to implement than external iteration. For many applications, internal iteration is completely adequate and more efficient than external iteration. Therefore, we bring internal iteration into the range family, to the point that the library user may not know or care which kind of iteration is being used. Standard algorithms return iterations and use the end iterator to signal some singleton state. By customizing return values, we can make our code more terse and expressive, for example eliminating these dreaded iterator end checks. These features combined make ranges an excellent tool for text formatting. We can use these ranges to represent the values to be formatted, conceptually turning them into lazily evaluated strings. These can be used just like regular strings are used today: in function returns; as standard algorithm input; embedded into other, equally lazily evaluated strings; and so on, before they are finally expanded for display. By choosing the right interfaces, we can optimize this expansion at compile-time, allowing both nice syntax and a performance which is very close to manually optimized code.

Range-Based Text Formatting - For a Future Range-Based Standard Library
Arno Schödl

Text formatting has been a favorite problem of C++ library authors for a long time. In this talk, I want to convince you that the combination of ranges with a bit of metaprogramming makes for a very elegant solution to the text formatting problem. We introduce a form of ranges with internal iteration, which are generating their elements one by one rather than exposing external iterators. We can use these generator ranges to represent the values to be formatted, conceptually turning them into lazily evaluated strings. These can be used just like regular strings are used today: in function returns; as standard algorithm input; embedded into other, equally lazily evaluated strings; and so on, before they are finally expanded into a container. By choosing the right interfaces, we can optimize this expansion to the point that it is only 15% slower than creating the equivalent string using hand-optimized special-case code.

Why Iterators Got It All Wrong — and what we should use instead
Arno Schödl

You understand iterators, right? How would you describe them? "Iterators are used to point into sequences of elements." Sounds good? More recently, the concept of ranges has been introduced to mean anything that exposes iterators. In particular, ranges include range adaptors for lazily transforming or filtering sequences of elements, and they, too, have iterators.
All good? Unfortunately, not. The iterator concept, which we have been using since the advent of C++, is fundamentally flawed. In particular, some iterators must behave differently depending on whether they are meant to point at an element or at a boundary between elements. So elements and boundaries are really two distinct concepts. In this talk, I will convince you that the problem is real and has practical implications, make a proposal on how to fix it and show how the solution not only fixes the problem but makes for clearer code and prevents mistakes.

From Iterators To Ranges — The Upcoming Evolution Of the Standard Library
Arno Schödl

Pairs of iterators are ubiquitous throughout the C++ library. It is generally accepted that combining such a pair into a single entity usually termed Range delivers more concise and readable code. Defining the precise semantics of such Range concept proves surprisingly tricky, however. Theoretical considerations conflict with practical ones. Some design goals are mutually incompatible altogether.

A Practical Approach to Error Handling
Arno Schödl

Every program may encounter errors, some originating from internal bugs in the program, others coming from the environment the program is operating in. Ignoring all errors will make the program utterly unreliable, while treating every conceivable one introduces lots of extra complexity with little benefit. At think-cell, we have been using and refining our own principled approach to error handling, which we have not seen elsewhere. This talk describes our method, so that in your next project you can write more reliable software with less effort.

C++ Weekly episode – Interview on CppCast
Arno Schödl

Rob Irving’s and Jason Turner’s CppCast of January 24th, 2018 interviewing Arno on the think-cell range library and C++ development at think-cell in general.

Developing a simple PowerPoint Add-in, how hard can it be?
Valentin Ziegler

Office development is widely associated with boring VBA Macros, or hacky JavaScript. Many developers Valentin has talked to were surprised to learn that think-cell's code base has a million lines of C++ code, and that we had to create some generic libraries along the way to keep it that short. We strive to implement the most simple user interface on top of PowerPoint that enables users to create great-looking slides in little time, and for this we need to use very powerful tools. Let us show you how we apply state-of-the-art algorithms to solve layout and placement problems, and what challenges we had to overcome for seamless integration with the host application.

Industrial Strength Software Hacking
Simon McPartlin

Software patching is a powerful but potentially risky method for fixing bugs, adding functionality, and improving the usability or performance of software. This lecture looks at patching software where the source code is unavailable, an activity commonly referred to as hacking. We discuss why and when such activity may be necessary before looking in detail at the design and implementation of robust patches. We finish off by describing various tools and techniques that can be used to find suitable patching locations.

std::cout is out — Why iostreams must go
Sebastian Theophil

We have recently begun to port our software to other platforms. In this process, we discovered the world of pain that are iostreams and C-style I/O. In this talk we give a short overview over why we have banished iostreams from our code base and what we have replaced them with.

C++ Memory Model
Valentin Ziegler and Fabio Fracassi

The C++ memory model defines how multiple threads interact with memory and shared data, enabling developers to reason about concurrent code in a platform independent way. The talk explains multi-threaded executions and data races in C++, how concurrent code is affected by compiler and hardware optimizations, and how to avoid undefined behavior by using locks and atomic operations. Then it focuses on different memory orders for atomic operations, their guarantees and performance implications.

C++ vs. Java
Valentin Ziegler and Fabio Fracassi

We love C++ and use it daily. In this talk we explain why – despite its reputation for complexity – C++ is conceptually a much better language than Java. Why? Because C++ knows value semantics. Because C++ has undefined behaviour. Because C++ does not enforce garbage collection. Because with C++ we can write code that is both abstract and efficient.

Scientific publications

An Efficient Algorithm for Scatter Chart Labeling
Sebastian Theophil and Arno Schödl
Proceedings of AAAI 2006

This paper presents an efficient algorithm for a new variation of the point feature labeling problem. The goal is to position the largest number of point labels such that they do not intersect each other or their points. First we present an algorithm using a greedy algorithm with limited lookahead. We then present an algorithm that iteratively regroups labels, calling the first algorithm on each group, thereby identifying a close to optimal labeling order. The presented algorithm is being used in a commercial product to label charts, and our evaluation shows that it produces results far superior to those of other labeling algorithms.

A Smart Algorithm for Column Chart Labeling
Sebastian Müller and Arno Schödl
Proceedings of SMART GRAPHICS 2005

This paper presents a smart algorithm for labeling column charts and their derivatives. To efficiently solve the problem, we separate it into two sub-problems. We first present a geometric algorithm to solve the problem of finding a good labeling for the labels of a single column, given that some other columns have already been labeled. We then present a strategy for finding a good order in which columns should be labeled, which repeatedly uses the first algorithm for some limited lookahead. The presented algorithm is being used in a commercial product to label charts, and has shown in practice to produce satisfactory results.

Graphcut Textures: Image and Video Synthesis Using Graph Cuts
Vivek Kwatra, Arno Schödl, Irfan Essa, Greg Turk and Aaron Bobick
Proceedings of SIGGRAPH 2003

In this paper we introduce a new algorithm for image and video texture synthesis. In our approach, patch regions from a sample image or video are transformed and copied to the output and then stitched together along optimal seams to generate a new (and typically larger) output. In contrast to other techniques, the size of the patch is not chosen a-priori, but instead a graph cut technique is used to determine the optimal patch region for any given offset between the input and output texture. Unlike dynamic programming, our graph cut technique for seam optimization is applicable in any dimension. We specifically explore it in 2D and 3D to perform video texture synthesis in addition to regular image synthesis.

Controlled Animation of Video Sprites
Arno Schödl and Irfan Essa
Proceedings of SCA 2002

In this paper we present a new approach for generating controlled animations of video sprites. Video sprites are animations created by rearranging recorded video frames of a moving object. With our technique, the user can specify animations using a flexible cost function, which is automatically optimized by repeated replacement of video sprite subsequences.

Video Textures
Arno Schödl, Richard Szeliski, David H. Salesin and Irfan Essa
Proceedings of SIGGRAPH 2000

This paper introduces a new type of medium, called a video texture, which has qualities somewhere between those of a photograph and a video. A video texture provides a continuous infinitely varying stream of images. While the individual frames of a video texture may be repeated from time to time, the video sequence as a whole is never repeated exactly. Video textures can be used in place of digital photos to infuse a static image with dynamic qualities and explicit action.

Want to know more?

If you have any questions regarding working at think-cell, our job openings or events, please feel free to contact our colleague Julia Zhachuk.
+49 30 6664731-81