TUD Logo

TUD Home » ... » Results of student work » Major Theses » Joachim Staib

Computer Graphics

Major Thesis of Joachim Staib

Interactive Analysis and Visualization of Ambiguities of Non-Negaitve Matrix Factorizations

Chair for computer graphics and visualization


Student: Joachim Staib
Advisor: Dipl.-Inf. Marcel Spehr
Professor: Prof. Dr. Stefan Gumhold

Motivation

Matrix factorization techniques are valuable for analyzing and reducing huge amounts of data by representing them as a linear superposition of a small number of basis signals. When the input data is composed of only non-negative values, non-negative matrix factorization (NMF) techniques preserve this property in the factorization result. In general NMF is not unique, i.e. more than one factorization for a given data set exists. The goal of this work is to provide an interactive browser that allows the visual analysis of the non-convex ambiguity space of all possible NMFs. Instead of changing the factorization algorithm to reflect different possibly fuzzy factorization requirements a human user actively involved in the process of finding a desired solution.

Description

The foundation of the browser is a sampling of the solution set for a given data set. For this a parameterization of this space is derived and a new sampling algorithm that can handle high dimensions is developed. The implemented browser works on image data. It enumerates various solutions and lets the user search for a factorization with desired properties. The browser provides the user with a tree oriented navigation structure where only important representatives of the solution set are shown. In combination with various visualizations of the sampled ambiguity space it becomes possible to quickly find a desired factorization for a given dataset.

Results

The work deals with three main problems:
  1. A parameterization of the set of all valid factorizations for a given data set,
  2. the sampling of this high dimensional and non-convex set and
  3. a visualization by means of a browser
The basis of the parameterization is an initial factorization of the data set from which other valid solutions can be generated with a transformation. Because of the high number of non-linear constraints of such a transformation the parameter set is described as a strong membership oracle. Given a parameter point, it tests whether a valid transformation can be created and returns "true" in this case and "false" otherwise. The sampling of this set is accomplished by a new sampling algorithm which approximates the set by a list of overlapping cuboids.
cuboid sampling
Outline of the approximation. (a) Beginning from a known starting point, the distance to the border of the valid space is searched. (b) Using the found border points a cuboid is fitted. (c) New points which serve as center points for future iterations are generated on the surface of the newly created cuboid. (d) Finally the space is approximated by a list of such cuboids.
From this representation of the solution space, points are generated uniformely. Their visualization in the browser is accomplished by first clustering the points and thus generating a low number of representative solutions, which are then presented to the user. By selecting one of these representatives a new sampling can be started in the local area of the selected solution. Furthermore, certain parts of the solution which already meet desired criteria can be fixed. These options allow for a hierarchical navigation structure which is the foundation of many efficient search algorithms. For the orientation inside of the solution space, different visualizations a supported, for example parallel coordinates or scatter plots, that allow the user to put the selected solution in context to the solution set as a whole.
screenshot
Screenshot of the browser. The left panel contains an overview of the tree oriented search. Beginning with a sampling of the complete solution set (Complete) the third representative solution was used to start another sampling in its surrounding. Both samplings are presented as a list of filtered solutions. The right side shows different visualizations of the solution set.

Download

Last modified: 17th Apr 2012, 3.21 PM
Author: Dr.-Ing. Wilfried Mascolus