

Faculty of Computer Science

Academic Talks

Learning Sampling-Based 6D Object Pose Estimation

PhD defence by Dipl.-Medieninf. Alexander Krull (Institut für Künstliche Intelligenz, Professur für Computer Vision)

16th Jan 2018, 4.00 PM, APB 1004 (Ratssaal)

The task of 6D object pose estimation, i.e. estimating an object's position (three degrees of freedom) and orientation (three degrees of freedom) from images, is an essential building block of many modern applications, such as robotic grasping, autonomous driving, or augmented reality. Automatic pose estimation systems have to overcome a variety of visual ambiguities, including texture-less objects, clutter, and occlusion. Since many applications demand real-time performance, the efficient use of computational resources is an additional challenge. In this thesis, we take a probabilistic stance on overcoming these issues. We build on a highly successful automatic pose estimation framework based on predicting pixel-wise correspondences between the camera coordinate system and the local coordinate system of the object. These dense correspondences are used to generate a pool of hypotheses, which in turn serve as a starting point in a final search procedure. We present three systems that each use probabilistic modeling and sampling to improve upon different aspects of the framework.

The goal of the first system, System I, is to enable pose tracking, i.e. estimating the pose of an object in a sequence of frames instead of a single image. By including information from previous frames, tracking systems can resolve many visual ambiguities and reduce computation time. System I is a particle filter (PF) approach. The PF represents its belief about the pose in each frame by propagating a set of samples through time. Our system uses the process of hypothesis generation from the original framework as part of a proposal distribution that efficiently concentrates samples in the appropriate areas.

In System II, we focus on the problem of evaluating the quality of pose hypotheses. This task plays an essential role in the final search procedure of the original framework. We use a convolutional neural network (CNN) to assess the quality of a hypothesis by comparing rendered and observed images. To train the CNN, we view it as part of an energy-based probability distribution in pose space. This probabilistic perspective allows us to train the system under the maximum likelihood paradigm. We use a sampling approach to approximate the required gradients. The resulting system for pose estimation yields superior results, in particular for highly occluded objects.

In System III, we take the idea of machine learning a step further. Instead of learning to predict a hypothesis quality measure to be used in a search procedure, we present a way of learning the search procedure itself. We train a reinforcement learning (RL) agent, termed PoseAgent, to steer the search process and make optimal use of a given computational budget. PoseAgent dynamically decides which hypothesis should be refined next, and which one should ultimately be output as the system's estimate. Since the search procedure includes discrete non-differentiable choices, training the system via gradient descent is not straightforward. To solve this problem, we model PoseAgent's behavior as a stochastic policy governed by a CNN. This allows us to use a sampling-based stochastic policy gradient training procedure.

We believe that some of the ideas developed in this thesis, such as the sampling-driven, probabilistically motivated training of a CNN for the comparison of images, or the search procedure implemented by PoseAgent, have the potential to be applied in fields beyond pose estimation as well.
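The predict/weight/resample cycle behind System I can be illustrated in a few lines. This is a toy one-dimensional sketch of the generic particle filter scheme the abstract describes, not the thesis's 6D pose tracker: each particle is a scalar "pose", the motion model is a random walk, and the likelihood is a Gaussian around the observed value (all of these modeling choices are illustrative assumptions).

```python
import math
import random

def particle_filter_step(particles, motion_noise, observation, obs_noise):
    """One predict/weight/resample cycle of a generic particle filter."""
    # Predict: propagate each sample through the (toy) motion model.
    predicted = [p + random.gauss(0.0, motion_noise) for p in particles]
    # Weight: score each sample against the current observation
    # with a Gaussian likelihood.
    weights = [math.exp(-0.5 * ((p - observation) / obs_noise) ** 2)
               for p in predicted]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample: draw a new particle set proportionally to the weights,
    # concentrating samples in high-likelihood regions.
    return random.choices(predicted, weights=weights, k=len(particles))

random.seed(0)
particles = [random.uniform(-5.0, 5.0) for _ in range(500)]
for observation in [0.0, 0.5, 1.0, 1.5]:  # a slowly moving target
    particles = particle_filter_step(particles, 0.2, observation, 0.5)
estimate = sum(particles) / len(particles)  # posterior mean as pose estimate
```

The key point mirrored from the abstract is the role of the proposal/resampling step: samples survive in proportion to how well they explain the observation, so the belief concentrates in the appropriate areas over time.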

Learning to Predict Dense Correspondences For 6D Pose Estimation

PhD defence by Dipl.-Medieninf. Eric Brachmann (Institut für Software und Multimediatechnik; Professur für Computergraphik und Visualisierung)

17th Jan 2018, 10.00 AM, APB 1004 (Ratssaal)

Object pose estimation is an important problem in computer vision with applications in robotics, augmented reality and many other areas. An established strategy for object pose estimation consists of, firstly, finding correspondences between the image and the object's reference frame, and, secondly, estimating the pose from outlier-free correspondences using Random Sample Consensus (RANSAC). The first step, namely finding correspondences, is difficult because object appearance varies depending on perspective, lighting and many other factors. Traditionally, correspondences have been established using handcrafted methods like sparse feature pipelines. In this thesis, we introduce a dense correspondence representation for objects, called object coordinates, which can be learned. By learning object coordinates, our pose estimation pipeline adapts to various aspects of the task at hand. It works well for diverse object types, from small objects to entire rooms, for varying object attributes, like textured or texture-less objects, and for different input modalities, like RGB-D or RGB images. The concept of object coordinates allows us to easily model and exploit uncertainty as part of the pipeline, such that even repeating structures or areas with little texture can contribute to a good solution.

Although we can train object coordinate predictors independently of the full pipeline and achieve good results, training the pipeline in an end-to-end fashion is desirable: it enables the object coordinate predictor to adapt its output to the specificities of the subsequent steps of the pose estimation pipeline. Unfortunately, the RANSAC component of the pipeline is non-differentiable, which prohibits end-to-end training. Adopting techniques from reinforcement learning, we introduce Differentiable Sample Consensus (DSAC), a formulation of RANSAC that allows us to train the pose estimation pipeline in an end-to-end fashion by minimizing the expectation of the final pose error.

Hypothesis Generation for Object Pose Estimation: From Local Sampling to Global Reasoning

PhD defence by Dipl.-Medieninf. Frank Michel (Institut für Künstliche Intelligenz, Professur für Computer Vision)

18th Jan 2018, 10.30 AM, APB 3027

Dependable Systems Leveraging New ISA Extensions

PhD defence by M. Sc. Dmitrii Kuvaiskii (Institut für Systemarchitektur, Professur für Systems Engineering)

22nd Jan 2018, 8.30 AM, APB 1004 (Ratssaal)

Unpredictable hardware faults and software bugs lead to application crashes, incorrect computations, unavailability of internet services, data losses, malfunctioning components, and consequently financial losses or even loss of life. In particular, faults in microprocessors (CPUs) and memory corruption bugs are among the major unresolved issues of today. CPU faults may result in benign crashes and, more problematically, in silent data corruptions that can lead to catastrophic consequences, silently propagating from component to component and finally shutting down the whole system. Similarly, memory corruption bugs (memory-safety vulnerabilities) may result in a benign application crash, but may also be exploited by a malicious hacker to gain control over the system or leak confidential data.

Approximate Data Analytics Systems

PhD defence by M. Sc. Do Le Quoc (Institut für Systemarchitektur, Professur für Systems Engineering)

22nd Jan 2018, 10.45 AM, APB 1004 (Ratssaal)

Today, more and more online services make use of big data analytics systems to extract useful information from publicly available digital data. The data normally arrives as a continuous stream at high speed and in huge volumes, and the cost of handling this massive data can be significant. Providing interactive latency in processing the data is often impractical, because the data grows exponentially, even faster than Moore's law predicts. To overcome this problem, approximate computing has recently emerged as a promising solution. Approximate computing is based on the observation that many modern applications are amenable to an approximate, rather than an exact, output. Unlike traditional computing, approximate computing tolerates lower accuracy to achieve lower latency by computing over a partial subset instead of the entire input data.

In this thesis, we design and implement approximate computing techniques for processing and interacting with high-speed, large-scale data with low latency and efficient utilization of resources. To achieve these goals, we have designed and built the following approximate data analytics systems: (1) StreamApprox, a data stream analytics system for approximate computing; (2) IncApprox, a data analytics system for incremental approximate computing; (3) PrivApprox, a data stream analytics system for privacy-preserving approximate computing; and (4) ApproxJoin, a system for approximate distributed joins. Our evaluation, based on micro-benchmarks and real-world case studies, shows that these systems achieve significant performance gains over state-of-the-art systems while tolerating negligible accuracy loss in the analytics output. In addition, our systems allow users to systematically trade off accuracy against throughput/latency, and they require no or only minor modifications to existing applications.
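The accuracy-for-latency trade-off at the heart of approximate computing can be illustrated with the simplest possible instance. This is a toy sketch (the actual systems use far more sophisticated stratified and online sampling over distributed streams): process only a random fraction of the input and accept a small, bounded error in the aggregate in exchange for touching much less data.

```python
import random
import statistics

def approximate_mean(stream, sample_fraction, seed=0):
    """Estimate the mean of a data stream from a Bernoulli sample.

    Each element is kept with probability `sample_fraction`, so only
    about that fraction of the stream is ever processed. Returns the
    estimate and the number of elements actually sampled.
    """
    rng = random.Random(seed)
    sample = [x for x in stream if rng.random() < sample_fraction]
    return statistics.fmean(sample), len(sample)

# Toy usage: estimate the mean of 100,000 values from ~1% of them.
data = range(100_000)           # true mean is 49999.5
estimate, n_sampled = approximate_mean(data, sample_fraction=0.01)
```

Sampling roughly 1,000 of 100,000 elements cuts the work by two orders of magnitude, while the standard error of the estimate shrinks only with the square root of the sample size; the `sample_fraction` knob is exactly the kind of user-facing accuracy/latency trade-off the abstract describes.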


Last modified: 16th Jan 2018, 2.03 PM
Author: Webmaster