2015

Perception of Strength and Power of Realistic Male Characters
Wellerdiek, A.C., Breidt, M., Geuss, M.N.,
Streuber, S., Kloos, U.,
Black, M.J. and Mohler, B.J.
In
ACM SIGGRAPH Symposium on Applied Perception,
ACM,
2015.
Abstract:
We investigated the influence of body shape and pose on the perception of physical strength and social power for male virtual characters. In the first experiment, participants judged the physical strength of varying body shapes, derived from a statistical 3D body model. Based on these ratings, we determined three body shapes (weak, average, and strong) and animated them with a set of power poses for the second experiment. Participants rated how strong or powerful they perceived virtual characters of varying body shapes that were displayed in different poses. Our results show that perception of physical strength was mainly driven by the shape of the body. However, the social attribute of power was influenced by an interaction between pose and shape. Specifically, the effect of pose on power ratings was greater for weak body shapes. These results demonstrate that a character with a weak shape can be perceived as more powerful when in a high-power pose.

Towards Probabilistic Volumetric Reconstruction using Ray Potentials
In
3D Vision (3DV), 2015 3rd International Conference on,
October
2015.
Abstract:
This paper presents a novel probabilistic foundation for volumetric 3-d reconstruction. We formulate the problem as inference in a Markov random field, which accurately captures the dependencies between the occupancy and appearance of each voxel, given all input images. Our main contribution is an approximate highly parallelized discrete-continuous inference algorithm to compute the marginal distributions of each voxel's occupancy and appearance. In contrast to the MAP solution, marginals encode the underlying uncertainty and ambiguity in the reconstruction. Moreover, the proposed algorithm allows for a Bayes optimal prediction with respect to a natural reconstruction loss. We compare our method to two state-of-the-art volumetric reconstruction algorithms on three challenging aerial datasets with LIDAR ground truth. Our experiments demonstrate that the proposed algorithm compares favorably in terms of reconstruction accuracy and the ability to expose reconstruction uncertainty.

FlowCap: 2D Human Pose from Optical Flow
In
German Conference on Pattern Recognition (GCPR),
2015.
Abstract:
We estimate 2D human pose from video using only optical flow. The key insight is that dense optical flow can provide information about 2D body pose. Like range data, flow is largely invariant to appearance, but unlike depth it can be computed directly from monocular video. We demonstrate that body parts can be detected from dense flow using the same random forest approach used by the Microsoft Kinect. Unlike range data, however, when people stop moving, there is no optical flow and they effectively disappear. To address this, our FlowCap method uses a Kalman filter to propagate body part positions and velocities over time and a regression method to predict 2D body pose from part centers. No range sensor is required and FlowCap estimates 2D human pose from monocular video sources containing human motion. Such sources include hand-held phone cameras and archival television video. We demonstrate 2D body pose estimation in a range of scenarios and show that the method works with real-time optical flow. The results suggest that optical flow shares invariances with range data that, when complemented with tracking, make it valuable for pose estimation.
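The propagation step lends itself to a compact sketch. Below is a minimal 1D constant-velocity Kalman filter of the kind that could carry a part coordinate and its velocity across frames; it is an illustrative toy with made-up noise parameters, not FlowCap's actual implementation.

```python
# Minimal 1D constant-velocity Kalman filter, sketching how a body-part
# coordinate and its velocity could be propagated between frames.
# Illustrative toy only; the noise parameters q (process) and r (measurement)
# are made up for this example.

def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Track position and velocity from noisy position measurements."""
    x, v = measurements[0], 0.0          # state: position, velocity
    P = [[1.0, 0.0], [0.0, 1.0]]         # 2x2 state covariance
    estimates = []
    for z in measurements:
        # Predict: x' = x + v*dt (constant-velocity motion model)
        x = x + v * dt
        p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + q
        # Update with the position measurement z
        s = p00 + r                      # innovation covariance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - x                        # innovation
        x, v = x + k0 * y, v + k1 * y
        P = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
        estimates.append((x, v))
    return estimates
```

On a noise-free linear track the velocity estimate converges toward the true slope within a handful of frames, which is what lets tracked parts persist when the flow vanishes.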

Linking Objects to Actions: Encoding of Target Object and Grasping Strategy in Primate Ventral Premotor Cortex
Vargas-Irwin, C.E., Franquemont, L.,
Black, M.J. and Donoghue, J.P.
Journal of Neuroscience,
35(30):10888-10897,
July
2015.
Abstract:
Neural activity in ventral premotor cortex (PMv) has been associated with the process of matching perceived objects with the motor commands needed to grasp them. It remains unclear how PMv networks can flexibly link percepts of objects affording multiple grasp options into a final desired hand action. Here, we use a relational encoding approach to track the functional state of PMv neuronal ensembles in macaque monkeys through the process of passive viewing, grip planning, and grasping movement execution. We used objects affording multiple possible grip strategies. The task included separate instructed delay periods for object presentation and grip instruction. This approach allowed us to distinguish responses elicited by the visual presentation of the objects from those associated with selecting a given motor plan for grasping. We show that PMv continuously incorporates information related to object shape and grip strategy as it becomes available, revealing a transition from a set of ensemble states initially most closely related to objects, to a new set of ensemble patterns reflecting unique object-grip combinations. These results suggest that PMv dynamically combines percepts, gradually navigating toward activity patterns associated with specific volitional actions, rather than directly mapping perceptual object properties onto categorical grip representations. Our results support the idea that PMv is part of a network that dynamically computes motor plans from perceptual information.
Significance Statement: The present work demonstrates that the activity of groups of neurons in primate ventral premotor cortex reflects information related to visually presented objects, as well as the motor strategy used to grasp them, linking individual objects to multiple possible grips. PMv could provide useful control signals for neuroprosthetic assistive devices designed to interact with objects in a flexible way.

The Fertilized Forests Decision Forest Library
In
ACM Multimedia (ACMMM) Open Source Software Competition,
October
2015.
Abstract:
Since the introduction of Random Forests in the 80's they have been a frequently used statistical tool for a variety of machine learning tasks. Many different training algorithms and model adaptions demonstrate the versatility of the forests. This variety resulted in a fragmentation of research and code, since each adaption requires its own algorithms and representations.
In 2011, Criminisi and Shotton developed a unifying Decision Forest model for many tasks. By identifying the reusable parts and specifying clear interfaces, we extend this approach to an object-oriented representation and implementation. This has the great advantage that research on specific parts of the Decision Forest model can be done 'locally' by reusing well-tested and high-performance components.
Our fertilized forests library is open source and easy to extend. It provides components allowing for parallelization up to node optimization level to exploit modern many core architectures. Additionally, the library provides consistent and easy-to-maintain interfaces to C++, Python and Matlab and offers cross-platform and cross-interface persistence.

Active Learning for Efficient Sampling of Control Models of Collectives
Schiendorfer, A.,
Lassner, C., Anders, G., Reif, W. and Lienhart, R.
In
International Conference on Self-adaptive and Self-organizing Systems (SASO),
September
2015.
Abstract:
Many large-scale systems benefit from an organizational structure to provide for problem decomposition. A pivotal problem solving setting is given by hierarchical control systems familiar from hierarchical task networks. If these structures can be modified autonomously by, e.g., coalition formation and reconfiguration, adequate decisions on higher levels require a faithful abstracted model of a collective of agents. An illustrative example is found in calculating schedules for a set of power plants organized in a hierarchy of Autonomous Virtual Power Plants. Functional dependencies over the combinatorial domain, such as the joint costs or rates of change of power production, are approximated by repeatedly sampling input-output pairs and substituting the actual functions by piecewise linear functions. However, if the sampled data points are weakly informative, the resulting abstracted high-level optimization introduces severe errors. Furthermore, obtaining additional point labels amounts to solving computationally hard optimization problems. Building on prior work, we propose to apply techniques from active learning to maximize the information gained by each additional point. Our results show that significantly better allocations in terms of cost-efficiency (up to 33.7 % reduction in costs in our case study) can be found with fewer but carefully selected sampling points using Decision Forests.
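The selection idea can be sketched generically. The following query-by-committee toy (a bootstrap ensemble of linear fits standing in for the Decision Forests, and a simple quadratic standing in for the expensive plant model) illustrates picking the next sample where the surrogate models disagree most; it is an expository sketch, not the paper's method.

```python
import random

# Generic active-learning sketch (query-by-committee): choose the next sample
# point where an ensemble of cheap surrogate fits disagrees most. The linear
# fits and toy data are stand-ins, not the paper's Decision Forests.

def fit_line(points):
    """Least-squares line y = a*x + b through (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:                       # degenerate bootstrap sample
        return 0.0, sy / n
    a = (n * sxy - sx * sy) / denom
    return a, (sy - a * sx) / n

def query_most_uncertain(labeled, candidates, n_models=20, seed=0):
    """Return the candidate input with the highest ensemble disagreement."""
    rng = random.Random(seed)
    models = [fit_line([rng.choice(labeled) for _ in labeled])
              for _ in range(n_models)]
    def variance(x):
        preds = [a * x + b for a, b in models]
        mean = sum(preds) / len(preds)
        return sum((p - mean) ** 2 for p in preds) / len(preds)
    return max(candidates, key=variance)
```

Each queried input would then be labeled by solving the underlying (computationally hard) plant optimization and added to the piecewise-linear abstraction.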

Active Learning for Abstract Models of Collectives
Schiendorfer, A.,
Lassner, C., Anders, G., Reif, W. and Lienhart, R.
In
3rd Workshop on Self-optimisation in Organic and Autonomic Computing Systems (SAOS),
March
2015.
Abstract:
Organizational structures such as hierarchies provide an effective means to deal with the increasing complexity found in large-scale energy systems. In hierarchical systems, the concrete functions describing the subsystems can be replaced by abstract piecewise linear functions to speed up the optimization process. However, if the data points are weakly informative the resulting abstracted optimization problem introduces severe errors and exhibits bad runtime performance. Furthermore, obtaining additional point labels amounts to solving computationally hard optimization problems. Therefore, we propose to apply methods from active learning to search for informative inputs. We present first results experimenting with Decision Forests and Gaussian Processes that motivate further research. Using points selected by Decision Forests, we could reduce the average mean-squared error of the abstract piecewise linear function by one third.

Norm-induced entropies for decision forests
IEEE Winter Conference on Applications of Computer Vision (WACV),
January
2015.
Abstract:
The entropy measurement function is a central element of decision forest induction. The Shannon entropy and other generalized entropies such as the Rényi and Tsallis entropies are designed to fulfill the Khinchin-Shannon axioms. Whereas these axioms are appropriate for physical systems, they do not necessarily model well the artificial system of decision forest induction.
In this paper, we show that when omitting two of the four axioms, every norm induces an entropy function. The remaining two axioms are sufficient to describe the requirements for an entropy function in the decision forest context. Furthermore, we introduce and analyze the p-norm-induced entropy and show its relations to existing entropies as well as to various heuristics commonly used for decision forest training.
In experiments with classification, regression and the recently introduced Hough forests, we show how the discrete and differential forms of the new entropy can be used for forest induction and how the functions can easily be fine-tuned. The experiments indicate that the impact of the entropy function is limited; however, tuning it can be a simple and useful post-processing step for optimizing decision forests for high-performance applications.
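As a hedged illustration of the core idea (the paper's exact p-norm-induced entropy may include additional normalization not reproduced here), any p-norm of a class distribution q yields an impurity-like measure: ||q||_p equals 1 exactly when q is pure and is smallest at the uniform distribution, so 1 - ||q||_p behaves like an entropy, with p = 2 closely related to the Gini impurity 1 - Σ q_i².

```python
import math

# Illustrative norm-based impurity (an expository assumption, not the paper's
# exact definition), shown alongside the Shannon entropy for comparison.

def pnorm_impurity(q, p=2.0):
    """1 - ||q||_p: zero for a pure distribution, maximal at uniform (p > 1)."""
    return 1.0 - sum(qi ** p for qi in q) ** (1.0 / p)

def shannon(q):
    """Shannon entropy in nats."""
    return -sum(qi * math.log(qi) for qi in q if qi > 0)
```

Both measures rank a uniform class distribution as most impure and a pure one as least impure, which is the property split selection in a decision tree actually relies on.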

Discrete Optimization for Optical Flow
In
German Conference on Pattern Recognition (GCPR),
2015.
Abstract:
We propose to look at large-displacement optical flow from a discrete point of view. Motivated by the observation that sub-pixel accuracy is easily obtained given pixel-accurate optical flow, we conjecture that computing the integral part is the hardest piece of the problem. Consequently, we formulate optical flow estimation as a discrete inference problem in a conditional random field, followed by sub-pixel refinement. Naive discretization of the 2D flow space, however, is intractable due to the resulting size of the label set. In this paper, we therefore investigate three different strategies, each able to reduce computation and memory demands by several orders of magnitude. Their combination allows us to estimate large-displacement optical flow both accurately and efficiently and demonstrates the potential of discrete optimization for optical flow. We obtain state-of-the-art performance on MPI Sintel and KITTI.
Joint 3D Object and Layout Inference from a single RGB-D Image
In
German Conference on Pattern Recognition (GCPR),
2015.
Abstract:
Inferring 3D objects and the layout of indoor scenes from a single RGB-D image captured with a Kinect camera is a challenging task. Towards this goal, we propose a high-order graphical model and jointly reason about the layout, objects and superpixels in the image. In contrast to existing holistic approaches, our model leverages detailed 3D geometry using inverse graphics and explicitly enforces occlusion and visibility constraints for respecting scene properties and projective geometry. We cast the task as MAP inference in a factor graph and solve it efficiently using message passing. We evaluate our method with respect to several baselines on the challenging NYUv2 indoor dataset using 21 object categories. Our experiments demonstrate that the proposed method is able to infer scenes with a large degree of clutter and occlusions.

Smooth Loops for Unconstrained Video
Sevilla-Lara, L.,
Wulff, J., Sunkavalli, K. and Shechtman, E.
In
Computer Graphics Forum (Proceedings of EGSR),
2015.
Abstract:
Converting unconstrained video sequences into videos that loop seamlessly is an extremely challenging problem. In this work, we take the first steps towards automating this process by focusing on an important subclass of videos containing a single dominant foreground object. Our technique makes two novel contributions over previous work: first, we propose a correspondence-based similarity metric to automatically identify a good transition point in the video where the appearance and dynamics of the foreground are most consistent. Second, we develop a technique that aligns both the foreground and background about this transition point using a combination of global camera path planning and patch-based video morphing. We demonstrate that this allows us to create natural, compelling, loopy videos from a wide range of videos collected from the internet.
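The transition-point idea admits an abstract sketch. Assuming each frame is summarized by a feature vector (the paper uses a correspondence-based metric instead of the plain Euclidean one below), a good cut point minimizes both appearance mismatch and the mismatch in frame-to-frame dynamics:

```python
import math

# Toy loop-transition search: find the frame pair (i, j) whose appearance AND
# motion into the next frame agree best. The per-frame feature vectors and the
# Euclidean metric are illustrative stand-ins for the paper's
# correspondence-based similarity.

def best_transition(feats, min_gap=4):
    best, best_cost = None, float("inf")
    for i in range(len(feats) - 1):
        for j in range(i + min_gap, len(feats) - 1):
            appearance = math.dist(feats[i], feats[j])
            # compare motion into the next frame at both candidate cut points
            motion_i = [b - a for a, b in zip(feats[i], feats[i + 1])]
            motion_j = [b - a for a, b in zip(feats[j], feats[j + 1])]
            cost = appearance + math.dist(motion_i, motion_j)
            if cost < best_cost:
                best, best_cost = (i, j), cost
    return best, best_cost
```

On a perfectly periodic feature sequence this search recovers a cut whose span is a multiple of the period, i.e. a seamless loop.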

Human Pose as Context for Object Detection
British Machine Vision Conference,
September
2015.
To Appear.
Abstract:
Detecting small objects in images is a challenging problem, particularly when they are often occluded by hands or other body parts. Recently, joint modelling of human pose and objects has been proposed to improve both pose estimation as well as object detection. These approaches, however, focus on explicit interaction with an object and lack the flexibility to combine both modalities when interaction is not obvious. We therefore propose to use human pose as additional context information for object detection. To this end, we represent an object category by a tree model and train regression forests that localize parts of an object for each modality separately. Predictions of the two modalities are then combined to detect the bounding box of the object. We evaluate our approach on three challenging datasets which vary in the amount of object interactions and the quality of automatically extracted human poses.

Joint 3D Estimation of Vehicles and Scene Flow
In
Proc. of the ISPRS Workshop on Image Sequence Analysis (ISA),
2015.
Abstract:
Three-dimensional reconstruction of dynamic scenes is an important prerequisite for applications like mobile robotics or autonomous driving. While much progress has been made in recent years, imaging conditions in natural outdoor environments are still very challenging for current reconstruction and recognition methods. In this paper, we propose a novel unified approach which reasons jointly about 3D scene flow as well as the pose, shape and motion of vehicles in the scene. Towards this goal, we incorporate a deformable CAD model into a slanted-plane conditional random field for scene flow estimation and enforce shape consistency between the rendered 3D models and the parameters of all superpixels in the image. The association of superpixels to objects is established by an index variable which implicitly enables model selection. We evaluate our approach on the challenging KITTI scene flow dataset in terms of object and scene flow estimation. Our results provide a proof of concept and demonstrate the usefulness of our method.

Map-Based Probabilistic Visual Self-Localization
Brubaker, M.A.,
Geiger, A. and Urtasun, R.
IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI),
2015.
Abstract:
Accurate and efficient self-localization is a critical problem for autonomous systems. This paper describes an affordable solution to vehicle self-localization which uses odometry computed from two video cameras and road maps as the sole inputs. The core of the method is a probabilistic model for which an efficient approximate inference algorithm is derived. The inference algorithm is able to utilize distributed computation in order to meet the real-time requirements of autonomous systems in some instances. Because of the probabilistic nature of the model the method is capable of coping with various sources of uncertainty including noise in the visual odometry and inherent ambiguities in the map (e.g., in a Manhattan world). By exploiting freely available, community developed maps and visual odometry measurements, the proposed method is able to localize a vehicle to 4m on average after 52 seconds of driving on maps which contain more than 2,150km of drivable roads.

Dyna: A Model of Dynamic Human Shape in Motion
ACM Transactions on Graphics, (Proc. SIGGRAPH),
34(4):120:1-120:14,
August
2015.
Abstract:
To look human, digital full-body avatars need to have soft tissue deformations like those of real people. We learn a model of soft-tissue deformations from examples using a high-resolution 4D capture system and a method that accurately registers a template mesh to sequences of 3D scans. Using over 40,000 scans of ten subjects, we learn how soft tissue motion causes mesh triangles to deform relative to a base 3D body model. Our Dyna model uses a low-dimensional linear subspace to approximate soft-tissue deformation and relates the subspace coefficients to the changing pose of the body. Dyna uses a second-order auto-regressive model that predicts soft-tissue deformations based on previous deformations, the velocity and acceleration of the body, and the angular velocities and accelerations of the limbs. Dyna also models how deformations vary with a person’s body mass index (BMI), producing different deformations for people with different shapes. Dyna realistically represents the dynamics of soft tissue for previously unseen subjects and motions. We provide tools for animators to modify the deformations and apply them to new stylized characters.
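The second-order auto-regressive dynamics admit a generic sketch (the symbols and grouping below are illustrative placeholders, not the paper's exact parameterization): soft-tissue subspace coefficients δ_t are predicted from their two previous values plus a linear term in the body's motion features.

```latex
% Generic AR(2) form; the matrices A_1, A_2, B and the motion-feature vector
% x_t (body velocity/acceleration, limb angular rates) are placeholders.
\delta_t = A_1\,\delta_{t-1} + A_2\,\delta_{t-2} + B\,x_t
```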
3D Object Class Detection in the Wild
Pepik, B., Stark, M.,
Gehler, P., Ritschel, T. and Schiele, B.
In
Conference on Computer Vision and Pattern Recognition Workshops (CVPRW),
IEEE,
2015.
Metric Regression Forests for Correspondence Estimation
Pons-Moll, G., Taylor, J., Shotton, J., Hertzmann, A. and Fitzgibbon, A.
International Journal of Computer Vision,
pages 1-13,
2015.

From Scans to Models: Registration of 3D Human Shapes Exploiting Texture Information
PhD thesis.
University of Padova,
March
2015.
Abstract:
New scanning technologies are increasing the importance of 3D mesh data, and of algorithms that can reliably register meshes obtained from multiple scans. Surface registration is important e.g. for building full 3D models from partial scans, identifying and tracking objects in a 3D scene, creating statistical shape models.
Human body registration is particularly important for many applications, ranging from biomedicine and robotics to the production of movies and video games; but obtaining accurate and reliable registrations is challenging, given the articulated, non-rigidly deformable structure of the human body.
In this thesis, we tackle the problem of 3D human body registration.
We start by analyzing the current state of the art, and find that: a) most registration techniques rely only on geometric information, which is ambiguous on flat surface areas; b) there is a lack of adequate datasets and benchmarks in the field. We address both issues.
Our contribution is threefold. First, we present a model-based registration technique for human meshes that combines geometry and surface texture information to provide highly accurate mesh-to-mesh correspondences. Our approach estimates scene lighting and surface albedo, and uses the albedo to construct a high-resolution textured 3D body model that is brought into registration with multi-camera image data using a robust matching term.
Second, by leveraging our technique, we present FAUST (Fine Alignment Using Scan Texture), a novel dataset collecting 300 high-resolution scans of 10 people in a wide range of poses. FAUST is the first dataset providing both real scans and automatically computed, reliable "ground-truth" correspondences between them.
Third, we explore possible uses of our approach in dermatology. By combining our registration technique with a melanocytic lesion segmentation algorithm, we propose a system that automatically detects new or evolving lesions over almost the entire body surface, thus helping dermatologists identify potential melanomas.
We conclude this thesis investigating the benefits of using texture information to establish frame-to-frame correspondences in dynamic monocular sequences captured with consumer depth cameras. We outline a novel approach to reconstruct realistic body shape and appearance models from dynamic human performances, and show preliminary results on challenging sequences captured with a Kinect.

Shape Models of the Human Body for Distributed Inference
PhD thesis.
Brown University,
May
2015.
Abstract:
In this thesis we address the problem of building shape models of the human body, in 2D and 3D, which are realistic and efficient to use. We focus our efforts on the human body, which is highly articulated and has interesting shape variations, but the approaches we present here can be applied to generic deformable and articulated objects. To address efficiency, we constrain our models to be part-based and have a tree-structured representation with pairwise relationships between connected parts. This allows the application of methods for distributed inference based on message passing. To address realism, we exploit recent advances in computer graphics that represent the human body with statistical shape models learned from 3D scans.
We introduce two articulated body models: a 2D model, named Deformable Structures (DS), which is a contour-based model parameterized for 2D pose and projected shape, and a 3D model, named Stitchable Puppet (SP), which is a mesh-based model parameterized for 3D pose, pose-dependent deformations and intrinsic body shape.
We have successfully applied the models to interesting and challenging problems in computer vision and computer graphics, namely pose estimation from static images, pose estimation from video sequences, and pose and shape estimation from 3D scan data. This advances the state of the art in human pose and shape estimation and suggests that carefully defined realistic models can be important for computer vision. More work at the intersection of vision and graphics is thus encouraged.

The Stitched Puppet: A Graphical Model of 3D Human Shape and Pose
In
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015,
June
2015.
Abstract:
We propose a new 3D model of the human body that is both realistic and part-based. The body is represented by a graphical model in which nodes of the graph correspond to body parts that can independently translate and rotate in 3D as well as deform to capture pose-dependent shape variations. Pairwise potentials define a “stitching cost” for pulling the limbs apart, giving rise to the stitched puppet model (SPM). Unlike existing realistic 3D body models, the distributed representation facilitates inference by allowing the model to more effectively explore the space of poses, much like existing 2D pictorial structures models. We infer pose and body shape using a form of particle-based max-product belief propagation. This gives the SPM the realism of recent 3D body models with the computational advantages of part-based models. We apply the SPM to two challenging problems involving estimating human shape and pose from 3D data. The first is the FAUST mesh alignment challenge (http://faust.is.tue.mpg.de/), where ours is the first method to successfully align all 3D meshes. The second involves estimating pose and shape from crude visual hull representations of complex body movements.

Efficient Sparse-to-Dense Optical Flow Estimation using a Learned Basis and Layers
In
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015,
June
2015.
Abstract:
We address the elusive goal of estimating optical flow both accurately and efficiently by adopting a sparse-to-dense approach. Given a set of sparse matches, we regress to dense optical flow using a learned set of full-frame basis flow fields. We learn the principal components of natural flow fields using flow computed from four Hollywood movies. Optical flow fields are then compactly approximated as a weighted sum of the basis flow fields. Our new PCA-Flow algorithm robustly estimates these weights from sparse feature matches. The method runs in under 300ms/frame on the MPI-Sintel dataset using a single CPU and is more accurate and significantly faster than popular methods such as LDOF and Classic+NL. The results, however, are too smooth for some applications. Consequently, we develop a novel sparse layered flow method in which each layer is represented by PCA-Flow. Unlike existing layered methods, estimation is fast because it uses only sparse matches. We combine information from different layers into a dense flow field using an image-aware MRF. The resulting PCA-Layers method runs in 3.6s/frame, is significantly more accurate than PCA-Flow and achieves state-of-the-art performance in occluded regions on MPI-Sintel.
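The regression step can be sketched in a few lines. Under the simplifying assumptions of a plain (non-robust) least-squares fit and a random matrix standing in for the learned PCA basis, recovering dense flow from observations at a few matched pixels looks like:

```python
import numpy as np

# Sparse-to-dense sketch (illustrative; the paper uses a robust estimator and
# a basis learned from movie flow, not random numbers): model flow over
# n_pixels as a weighted sum of k basis flow fields, and recover the weights
# by least squares from flow observed only at sparsely matched pixels.

rng = np.random.default_rng(0)
n_pixels, k = 500, 8
basis = rng.standard_normal((n_pixels, k))     # columns: basis flow fields
w_true = rng.standard_normal(k)
dense_flow = basis @ w_true                    # ground-truth dense flow

obs = rng.choice(n_pixels, size=40, replace=False)  # sparse match locations
w, *_ = np.linalg.lstsq(basis[obs], dense_flow[obs], rcond=None)
reconstruction = basis @ w                     # dense flow everywhere
```

With many more observations than basis coefficients (here 40 vs. 8), the weights are heavily over-determined, which is what makes the fit cheap and stable.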

Pose-Conditioned Joint Angle Limits for 3D Human Pose Reconstruction
In
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015,
June
2015.
Abstract:
The estimation of 3D human pose from 2D joint locations is central to many vision problems involving the analysis of people in images and video. To address the fact that the problem is inherently ill posed, many methods impose a prior over human poses. Unfortunately these priors admit invalid poses because they do not model how joint limits vary with pose. Here we make two key contributions. First, we collected a motion capture dataset that explores a wide range of human poses. From this we learn a pose-dependent model of joint limits that forms our prior. The dataset and the prior will be made publicly available. Second, we define a general parameterization of body pose and a new, multi-stage method to estimate 3D pose from 2D joint locations that uses an over-complete dictionary of human poses. Our method shows good generalization while avoiding impossible poses. We quantitatively compare our method with recent work and show state-of-the-art results on 2D-to-3D pose estimation using the CMU mocap dataset. We also show superior results on manual annotations on real images and automatic part-based detections on the Leeds sports pose dataset.

Object Scene Flow for Autonomous Vehicles
In
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015,
June
2015.
Abstract:
This paper proposes a novel model and dataset for 3D scene flow estimation with an application to autonomous driving. Taking advantage of the fact that outdoor scenes often decompose into a small number of independently moving objects, we represent each element in the scene by its rigid motion parameters and each superpixel by a 3D plane as well as an index to the corresponding object. This minimal representation increases robustness and leads to a discrete-continuous CRF where the data term decomposes into pairwise potentials between superpixels and objects. Moreover, our model intrinsically segments the scene into its constituting dynamic components. We demonstrate the performance of our model on existing benchmarks as well as a novel realistic dataset with scene flow ground truth. We obtain this dataset by annotating 400 dynamic scenes from the KITTI raw data collection using detailed 3D CAD models for all vehicles in motion. Our experiments also reveal novel challenges which can't be handled by existing methods.

Displets: Resolving Stereo Ambiguities using Object Knowledge
In
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2015,
June
2015.
Abstract:
Stereo techniques have witnessed tremendous progress over the last decades, yet some aspects of the problem still remain challenging today. Striking examples are reflecting and textureless surfaces which cannot easily be recovered using traditional local regularizers. In this paper, we therefore propose to regularize over larger distances using object-category specific disparity proposals (displets) which we sample using inverse graphics techniques based on a sparse disparity estimate and a semantic segmentation of the image. The proposed displets encode the fact that objects of certain categories are not arbitrarily shaped but typically exhibit regular structures. We integrate them as non-local regularizer for the challenging object class 'car' into a superpixel based CRF framework and demonstrate its benefits on the KITTI stereo evaluation.

Consensus Message Passing for Layered Graphical Models
Jampani*, V., Eslami*, S.M.A., Tarlow, D., Kohli, P. and Winn, J.
In
Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS),
JMLR Workshop and Conference Proceedings,
Vol. 38,
pages 425-433,
May
2015.
Abstract:
Generative models provide a powerful framework for probabilistic reasoning. However, in many domains their use has been hampered by the practical difficulties of inference. This is particularly the case in computer vision, where models of the imaging process tend to be large, loopy and layered. For this reason bottom-up conditional models have traditionally dominated in such domains. We find that widely-used, general-purpose message passing inference algorithms such as Expectation Propagation (EP) and Variational Message Passing (VMP) fail on the simplest of vision models. With these models in mind, we introduce a modification to message passing that learns to exploit their layered structure by passing 'consensus' messages that guide inference towards good solutions. Experiments on a variety of problems show that the proposed technique leads to significantly more accurate inference results, not only when compared to standard EP and VMP, but also when compared to competitive bottom-up conditional models.
Efficient Facade Segmentation using Auto-Context
In
Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on,
IEEE,
pages 1038-1045,
January
2015.
Abstract:
In this paper we propose a system for the problem of facade segmentation. Building facades are highly structured images, and consequently most methods proposed for this problem aim to exploit this strong prior information. We describe a system that is almost domain independent and consists of standard segmentation methods. A sequence of boosted decision trees is stacked using auto-context features and learned using the stacked generalization technique. We find that this, albeit standard, technique performs better than, or on par with, all previously published empirical results on all available facade benchmark datasets. The proposed method is simple to implement, easy to extend, and very efficient at test-time inference.
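The stacking scheme in the abstract can be illustrated with a minimal sketch. Everything here is a hedged stand-in: the base learner is a soft nearest-centroid classifier instead of boosted decision trees, "context" is a one-dimensional neighbour average of the previous stage's probability map, and pixels are assumed to be in scanline order.

```python
import numpy as np

def soft_centroid_classifier(X_train, y_train, X):
    """Toy probabilistic pixel classifier: softmax over negative squared
    distances to per-class feature means (a stand-in for the boosted
    decision trees used in the paper)."""
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    d = ((X[:, None, :] - means[None]) ** 2).sum(axis=-1)
    p = np.exp(-d)
    return p / p.sum(axis=1, keepdims=True)

def auto_context_stack(X_train, y_train, X_test, n_stages=2):
    """Auto-context stacking: each stage appends the spatially averaged
    class-probability map of the previous stage as extra per-pixel
    features."""
    def context(P):  # average each pixel's neighbours along the scanline
        return (np.roll(P, 1, axis=0) + np.roll(P, -1, axis=0)) / 2
    F_tr, F_te = X_train, X_test
    P_te = None
    for _ in range(n_stages):
        P_tr = soft_centroid_classifier(F_tr, y_train, F_tr)
        P_te = soft_centroid_classifier(F_tr, y_train, F_te)
        F_tr = np.hstack([X_train, context(P_tr)])
        F_te = np.hstack([X_test, context(P_te)])
    return P_te.argmax(axis=1)
```

A faithful implementation would train each stage on held-out predictions, as in the stacked generalization technique the abstract mentions; this sketch only shows the feature augmentation loop.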

Spike train SIMilarity Space (SSIMS): A framework for single neuron and ensemble data analysis
Vargas-Irwin, C.E., Brandman, D.M., Zimmermann, J.B., Donoghue, J.P. and
Black, M.J.
Neural Computation,
27(1):1-31,
January
2015.
Abstract:
We present a method to evaluate the relative similarity of neural spiking patterns by combining spike train distance metrics with dimensionality reduction. Spike train distance metrics provide an estimate of similarity between activity patterns at multiple temporal resolutions. Vectors of pair-wise distances are used to represent the intrinsic relationships between multiple activity patterns at the level of single units or neuronal ensembles. Dimensionality reduction is then used to project the data into concise representations suitable for clustering analysis as well as exploratory visualization. Algorithm performance and robustness are evaluated using multielectrode ensemble activity data recorded in behaving primates. We demonstrate how Spike train SIMilarity Space (SSIMS) analysis captures the relationship between goal directions for an 8-directional reaching task and successfully segregates grasp types in a 3D grasping task in the absence of kinematic information. The algorithm enables exploration of virtually any type of neural spiking (time series) data, providing similarity-based clustering of neural activity states with minimal assumptions about potential information encoding models.
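The pipeline the abstract describes — spike train distance metrics, vectors of pairwise distances, then dimensionality reduction — can be sketched as follows. This is an illustrative toy, not the authors' implementation: it uses a van Rossum-style exponential-kernel distance and plain PCA in place of the metrics and embeddings explored in the paper.

```python
import numpy as np

def spike_distance(s1, s2, tau=10.0, t_max=200.0, dt=1.0):
    """Van Rossum-style spike train distance: convolve each train with an
    exponential kernel (tau sets the temporal resolution) and take the
    Euclidean norm of the difference of the filtered traces."""
    t = np.arange(0.0, t_max, dt)
    def filtered(spikes):
        r = np.zeros_like(t)
        for ts in spikes:
            r += (t >= ts) * np.exp(-np.maximum(t - ts, 0.0) / tau)
        return r
    return np.linalg.norm(filtered(s1) - filtered(s2)) * np.sqrt(dt)

def ssims_embedding(trains, n_components=2):
    """Represent each train by its vector of distances to all other
    trains, then project with PCA (standing in for t-SNE here)."""
    n = len(trains)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = spike_distance(trains[i], trains[j])
    X = D - D.mean(axis=0)            # rows = pairwise-distance vectors
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T, D
```

Trains with similar timing end up close in the embedded space, which is the property the clustering analyses in the paper rely on.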

The Informed Sampler: A Discriminative Approach to Bayesian Inference in Generative Computer Vision Models
In
Special Issue on Generative Models in Computer Vision and Medical Imaging,
Computer Vision and Image Understanding, Elsevier,
Vol. 136,
pages 32-44,
July
2015.
Abstract:
Computer vision is hard because of a large variability in lighting, shape, and texture; in addition, the image signal is non-additive due to occlusion. Generative models promised to account for this variability by accurately modelling the image formation process as a function of latent variables with prior beliefs. Bayesian posterior inference could then, in principle, explain the observation. While intuitively appealing, generative models for computer vision have largely failed to deliver on that promise due to the difficulty of posterior inference. As a result, the community has favored efficient discriminative approaches. We still believe in the usefulness of generative models in computer vision, but argue that we need to leverage existing discriminative or even heuristic computer vision methods.
We implement this idea in a principled way in our informed sampler and in careful experiments demonstrate it on challenging models which contain renderer programs as their components. The informed sampler, using simple discriminative proposals based on existing computer vision technology achieves dramatic improvements in inference. Our approach enables a new richness in generative models that was out of reach with existing inference technology.
2014
Segmentation of Biomedical Images Using Active Contour Model with Robust Image Feature and Shape Prior
Yeo, S.Y., Xie, X., Sazonov, I. and Nithiarasu, P.
International Journal for Numerical Methods in Biomedical Engineering,
30(2):232-248,
2014.
Abstract:
In this article, a new level set model is proposed for the segmentation of biomedical images. The image energy of the proposed model is derived from a robust image gradient feature, which gives the active contour a global representation of the geometric configuration, making it more robust to image noise, weak edges, and initial configurations. Statistical shape information is incorporated using a nonparametric shape density distribution, which allows the shape model to handle relatively large shape variations. Segmentations of various shapes from both synthetic and real images demonstrate the robustness and efficiency of the proposed method.

Automatic 4D Reconstruction of Patient-Specific Cardiac Mesh with 1-to-1 Vertex Correspondence from Segmented Contour Lines
Lim, C.W., Su, Y.,
Yeo, S.Y., Ng, G.M., Nguyen, V.T., Zhong, L., Tan, R.S., Poh, K.K. and Chai, P.
PLOS ONE,
9(4),
2014.
Abstract:
We propose an automatic algorithm for the reconstruction of patient-specific cardiac mesh models with 1-to-1 vertex correspondence. In this framework, a series of 3D meshes depicting the endocardial surface of the heart at each time step is constructed, based on a set of border delineated magnetic resonance imaging (MRI) data of the whole cardiac cycle. The key contribution in this work involves a novel reconstruction technique to generate a 4D (i.e., spatial–temporal) model of the heart with 1-to-1 vertex mapping throughout the time frames. The reconstructed 3D model from the first time step is used as a base template model and then deformed to fit the segmented contours from the subsequent time steps. A method to determine a tree-based connectivity relationship is proposed to ensure robust mapping during mesh deformation. The novel feature is the ability to handle intra- and inter-frame 2D topology changes of the contours, which manifests as a series of merging and splitting of contours when the images are viewed either in a spatial or temporal sequence. Our algorithm has been tested on five acquisitions of cardiac MRI and can successfully reconstruct the full 4D heart model in around 30 minutes per subject. The generated 4D heart model conforms very well with the input segmented contours and the mesh element shape is of reasonably good quality. The work is important in the support of downstream computational simulation activities.

Left Ventricle Segmentation by Dynamic Shape Constrained Random Walk
Yang, X., Su, Y., Wan, M.,
Yeo, S.Y., Lim, C., Wong, S.T., Zhong, L. and Tan, R.S.
In
Annual International Conference of the IEEE Engineering in Medicine and Biology Society,
2014.
Abstract:
Accurate and robust extraction of the left ventricle (LV) cavity is a key step for quantitative analysis of cardiac function. In this study, we propose an improved LV cavity segmentation method that incorporates a dynamic shape constraint into the weighting function of the random walks algorithm. The method involves an iterative process that updates an intermediate result to the desired solution. The shape constraint restricts the solution space of the segmentation result, which increases the robustness of the algorithm to misleading information that emanates from noise, weak boundaries, and clutter. Our experiments on real cardiac magnetic resonance images demonstrate that the proposed method obtains better segmentation performance than the standard method.

Evaluation of feature-based 3-d registration of probabilistic volumetric scenes
Restrepo, M.I.,
Ulusoy, A.O. and Mundy, J.L.
ISPRS Journal of Photogrammetry and Remote Sensing,
Vol. 98,
pages 1-18,
2014.
Abstract:
Automatic estimation of world surfaces from aerial images has seen much attention and progress in recent years. Among current modeling technologies, probabilistic volumetric models (PVMs) have evolved as an alternative representation that can learn geometry and appearance in a dense and probabilistic manner. Recent progress, in terms of storage and speed, achieved in the area of volumetric modeling opens the opportunity to develop new frameworks that make use of the PVM to pursue the ultimate goal of creating an entire map of the earth, where one can reason about the semantics and dynamics of the 3-d world. Aligning 3-d models collected at different time-instances constitutes an important step for successful fusion of large spatio-temporal information. This paper evaluates how effectively probabilistic volumetric models can be aligned using robust feature-matching techniques, while considering different scenarios that reflect the kind of variability observed across aerial video collections from different time instances. More precisely, this work investigates variability in terms of discretization, resolution and sampling density, errors in the camera orientation, and changes in illumination and geographic characteristics. All results are given for large-scale, outdoor sites. In order to facilitate the comparison of the registration performance of PVMs to that of other 3-d reconstruction techniques, the registration pipeline is also carried out using the Patch-based Multi-View Stereo (PMVS) algorithm. Registration performance is similar for scenes that have favorable geometry and the appearance characteristics necessary for high-quality reconstruction. In scenes containing trees, such as a park, or many buildings, such as a city center, registration performance is significantly more accurate when using the PVM.

Image-based 4-d Reconstruction Using 3-d Change Detection
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Lecture Notes in Computer Science,
pages 31-45,
September
2014.
Abstract:
This paper describes an approach to reconstruct the complete history of a 3-d scene over time from imagery. The proposed approach avoids rebuilding 3-d models of the scene at each time instant. Instead, the approach employs an initial 3-d model which is continuously updated with changes in the environment to form a full 4-d representation. This updating scheme is enabled by a novel algorithm that infers 3-d changes with respect to the model at one time step from images taken at a subsequent time step. This algorithm can effectively detect changes even when the illumination conditions between image collections are significantly different. The performance of the proposed framework is demonstrated on four challenging datasets in terms of 4-d modeling accuracy as well as quantitative evaluation of 3-d change detection.

Human Pose Estimation from Video and Inertial Sensors
Ph.D. Thesis,
2014.
Abstract:
The analysis and understanding of human movement is central to many applications
such as sports science, medical diagnosis and movie production. The ability to
automatically monitor human activity in security sensitive areas such as airports,
lobbies or borders is of great practical importance. Furthermore, automatic
pose estimation from images facilitates the processing
and understanding of the massive digital libraries available on the Internet.
We build upon a model based approach where the human shape is modelled with a surface mesh
and the motion is parametrized by a kinematic chain. We then seek the pose
of the model that best explains the available observations coming from different sensors.
In a first scenario, we consider a calibrated multi-view setup in an indoor studio. To obtain very accurate
results, we propose a novel tracker that combines information coming from video and a
small set of Inertial Measurement Units (IMUs). We do so by locally optimizing a joint
energy consisting of a term that measures the likelihood of the video data and a term
for the IMU data. This is the first work to successfully combine video and IMU
information for full-body pose estimation. Compared to commercial marker-based systems,
the proposed solution is more cost-efficient and less intrusive for the user.
In a second scenario, we relax the assumption of an indoor studio and we tackle outdoor scenes
with background clutter, illumination changes, large recording volumes and difficult motions
of people interacting with objects. Again, we combine information from video and IMUs.
Here we employ a particle based optimization approach
that allows us to be more robust to tracking failures. To satisfy the orientation constraints
imposed by the IMUs, we derive an analytic Inverse Kinematics (IK) procedure to sample from the manifold
of valid poses. The generated hypotheses come from a lower-dimensional manifold, and therefore the computational
cost can be reduced. Experiments on challenging sequences suggest the proposed tracker can be applied
to motion capture in outdoor scenarios. Furthermore, the proposed IK sampling procedure can be used
to integrate any kind of constraints derived from the environment.
Finally, we consider the most challenging scenario: pose estimation from monocular images.
Here, we argue that estimating the pose to the same degree of accuracy as in an engineered environment is
too ambitious with current technology. Therefore, we propose to extract meaningful semantic information about
the pose directly from image features in a discriminative fashion. In particular, we introduce posebits
which are semantic pose descriptors about the geometric relationships between parts in the body.
The experiments
show that the intermediate step of inferring posebits from images can improve pose estimation from
monocular imagery. Furthermore, posebits can be very useful as input feature for many computer vision
algorithms.

MoSh: Motion and Shape Capture from Sparse Markers
ACM Transactions on Graphics (Proc. SIGGRAPH Asia),
33(6):220:1-220:13,
November
2014.
Abstract:
Marker-based motion capture (mocap) is widely criticized as producing lifeless animations. We argue that important information about body surface motion is present in standard marker sets but is lost in extracting a skeleton. We demonstrate a new approach called MoSh (Motion and Shape capture), that automatically extracts this detail from mocap data. MoSh estimates body shape and pose together using sparse marker data by exploiting a parametric model of the human body. In contrast to previous work, MoSh solves for the marker locations relative to the body and estimates accurate body shape directly from the markers without the use of 3D scans; this effectively turns a mocap system into an approximate body scanner. MoSh is able to capture soft tissue motions directly from markers
by allowing body shape to vary over time. We evaluate the effect of different marker sets on pose and shape accuracy and propose a new sparse marker set for capturing soft-tissue motion. We illustrate MoSh by recovering body shape, pose, and soft-tissue motion from archival mocap data and using this to produce animations with subtlety and realism. We also show soft-tissue motion retargeting to new characters and show how to magnify the 3D deformations of soft tissue to create animations with appealing exaggerations.

Probabilistic Progress Bars
Kiefel, M., Schuler, C. and Hennig, P.
In
Conference on Pattern Recognition (GCPR),
Springer,
Vol. 8753,
Lecture Notes in Computer Science,
pages 331-341,
September
2014.
Abstract:
Predicting the time at which the integral over a stochastic process reaches a target level is of interest in many applications. Often, such computations have to be made at low cost, in real time. As an intuitive example that captures many features of this problem class, we choose progress bars, a ubiquitous element of computer user interfaces. These predictors are usually based on simple point estimators with no error modelling. This leads to fluctuating behaviour that confuses the user. It also does not provide a distributional prediction (risk values), which is crucial for many other application areas. We construct and empirically evaluate a fast, constant-cost algorithm using a Gauss-Markov process model which provides more information to the user.
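The idea of replacing a point estimate with a distributional one can be sketched with a minimal drift-diffusion model. This is a hedged stand-in, not the Gauss-Markov machinery of the paper: progress is modelled as Brownian motion with positive drift, whose hitting time of the target level is inverse-Gaussian distributed.

```python
import numpy as np

def eta_distribution(times, progress, target=1.0):
    """Fit a drift-plus-noise (Wiener) model to observed progress and
    return the mean and standard deviation of the predicted finish time.
    The hitting time of the target level under this model follows an
    inverse-Gaussian distribution; positive drift is assumed."""
    times = np.asarray(times, float)
    progress = np.asarray(progress, float)
    dt = np.diff(times)
    dp = np.diff(progress)
    mu = dp.sum() / dt.sum()                             # ML drift estimate
    sigma2 = ((dp - mu * dt) ** 2 / dt).sum() / len(dt)  # diffusion estimate
    gap = target - progress[-1]                          # work remaining
    mean_eta = gap / mu                                  # IG mean: gap/drift
    var_eta = gap * sigma2 / mu ** 3                     # IG variance
    return times[-1] + mean_eta, np.sqrt(var_eta)
```

The standard deviation is exactly the "risk value" missing from a naive point-estimate progress bar: perfectly linear progress yields zero predicted uncertainty, fluctuating progress yields a wide finish-time distribution.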

Modeling Blurred Video with Layers
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Vol. 8694,
Lecture Notes in Computer Science,
pages 236-252,
September
2014.
Abstract:
Videos contain complex spatially-varying motion blur due to the combination of object motion, camera motion, and depth variation with finite shutter speeds. Existing methods to estimate optical flow, deblur the images, and segment the scene fail in such cases. In particular, boundaries between differently moving objects cause problems, because here the blurred images are a combination of the blurred appearances of multiple surfaces. We address this with a novel layered model of scenes in motion. From a motion-blurred video sequence, we jointly estimate the layer segmentation and each layer's appearance and motion. Since the blur is a function of the layer motion and segmentation, it is completely determined by our generative model. Given a video, we formulate the optimization problem as minimizing the pixel error between the blurred frames and images synthesized from the model, and solve it using gradient descent. We demonstrate our approach on synthetic and real sequences.

Optical Flow Estimation with Channel Constancy
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Vol. 8689,
Lecture Notes in Computer Science,
pages 423-438,
September
2014.
Abstract:
Large motions remain a challenge for current optical flow algorithms. Traditionally, large motions are addressed using multi-resolution representations like Gaussian pyramids. To deal with large displacements, many pyramid levels are needed and, if an object is small, it may be invisible at the highest levels. To address this, we decompose images using a channel representation (CR) and replace the standard brightness constancy assumption with a descriptor constancy assumption. CRs can be seen as an over-segmentation of the scene into layers based on some image feature. If the appearance of a foreground object differs from the background, then its descriptor will be different and they will be represented in different layers. We create a pyramid by smoothing these layers, without mixing foreground and background or losing small objects. Our method estimates more accurate flow than the baseline on the MPI-Sintel benchmark, especially for fast motions and near motion boundaries.
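The channel decomposition step can be sketched as below. This is a minimal illustration assuming Gaussian basis functions over raw intensity; the paper's channel representations may use other basis functions and image features.

```python
import numpy as np

def channel_encode(img, n_channels=8, width=None):
    """Encode image intensities into overlapping channels using Gaussian
    basis functions: a soft, layered over-segmentation by intensity.
    Each layer can then be smoothed independently (e.g. into a pyramid)
    without mixing foreground and background appearance."""
    img = np.asarray(img, float)
    centers = np.linspace(img.min(), img.max(), n_channels)
    if width is None:
        width = centers[1] - centers[0]   # controls overlap of neighbours
    # shape (*img.shape, n_channels): one soft layer per channel center
    return np.exp(-0.5 * ((img[..., None] - centers) / width) ** 2)
```

A dark foreground object and a bright background end up concentrated in different layers, which is what lets the pyramid smoothing preserve small objects.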

Tracking using Multilevel Quantizations
Hong, Z.,
Wang, C., Mei, X., Prokhorov, D. and Tao, D.
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Vol. 8694,
Lecture Notes in Computer Science,
pages 155-171,
September
2014.
Abstract:
Most object tracking methods exploit only a single quantization of an image space: pixels, superpixels, or bounding boxes, each of which has advantages and disadvantages. It is highly unlikely that a common optimal quantization level, suitable for tracking all objects in all environments, exists. We therefore propose a hierarchical appearance representation model for tracking, based on a graphical model that exploits shared information across multiple quantization levels. The tracker aims to find the most probable position of the target by jointly classifying the pixels and superpixels and obtaining the best configuration across all levels. The motion of the bounding box is taken into consideration, while Online Random Forests are used to provide pixel- and superpixel-level quantizations and are progressively updated on the fly. By appropriately considering the multilevel quantizations, our tracker exhibits not only excellent performance in handling non-rigid object deformation, but also robustness to occlusions. A quantitative evaluation is conducted on two benchmark datasets: a non-rigid object tracking dataset (11 sequences) and the CVPR2013 tracking benchmark (50 sequences). Experimental results show that our tracker overcomes various tracking challenges and is superior to a number of other popular tracking methods.

OpenDR: An Approximate Differentiable Renderer
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Vol. 8695,
Lecture Notes in Computer Science,
pages 154-169,
September
2014.
Abstract:
Inverse graphics attempts to take sensor data and infer 3D geometry, illumination, materials, and motions such that a graphics renderer could realistically reproduce the observed scene. Renderers, however, are designed to solve the forward process of image synthesis. To go in the other direction, we propose an approximate differentiable renderer (DR) that explicitly models the relationship between changes in model parameters and image observations. We describe a publicly available OpenDR framework that makes it easy to express a forward graphics model and then automatically obtain derivatives with respect to the model parameters and to optimize over them. Built on a new autodifferentiation package and OpenGL, OpenDR provides a local optimization method that can be incorporated into probabilistic programming frameworks. We demonstrate the power and simplicity of programming with OpenDR by using it to solve the problem of estimating human body shape from Kinect depth and RGB data.
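The core idea — a forward renderer that also exposes derivatives of pixels with respect to model parameters, enabling gradient-based fitting — can be sketched with a toy one-dimensional "renderer". Everything here (the Gaussian-blob scene, the single parameter) is an invented illustration, not the OpenDR API.

```python
import numpy as np

def render(center, width=1.5, n_pix=32):
    """Toy forward renderer: a 1-D Gaussian blob at `center`."""
    px = np.arange(n_pix)
    return np.exp(-0.5 * ((px - center) / width) ** 2)

def d_render_d_center(center, width=1.5, n_pix=32):
    """Analytic derivative of every pixel w.r.t. the model parameter:
    the key quantity a differentiable renderer exposes."""
    px = np.arange(n_pix)
    return render(center, width, n_pix) * (px - center) / width ** 2

def fit_center(observed, init=10.0, lr=0.5, steps=200):
    """Inverse graphics by gradient descent on the pixel error,
    using the renderer's derivatives (chain rule on sum of squares)."""
    c = init
    for _ in range(steps):
        residual = render(c) - observed
        grad = 2.0 * (residual * d_render_d_center(c)).sum()
        c -= lr * grad
    return c
```

The same pattern, with the derivatives supplied by autodifferentiation instead of a hand-derived formula, is what lets a model-fitting problem be posed as local optimization over renderer parameters.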

Intrinsic Video
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Vol. 8690,
Lecture Notes in Computer Science,
pages 360-375,
September
2014.
Abstract:
Intrinsic images such as albedo and shading are valuable for later stages of visual processing. Previous methods for extracting albedo and shading use either single images or images together with depth data. Instead, we define intrinsic video estimation as the problem of extracting temporally coherent albedo and shading from video alone. Our approach exploits the assumption that albedo is constant over time while shading changes slowly. Optical flow aids in the accurate estimation of intrinsic video by providing temporal continuity as well as putative surface boundaries. Additionally, we find that the estimated albedo sequence can be used to improve optical flow accuracy in sequences with changing illumination. The approach makes only weak assumptions about the scene, and we show that it substantially outperforms existing single-frame intrinsic image methods. We evaluate this quantitatively on synthetic sequences as well as on challenging natural sequences with complex geometry, motion, and illumination.
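Under the stated assumptions — albedo constant over time, shading varying — a bare-bones decomposition can be sketched in the log domain, where the multiplicative image model I = A * S becomes additive. This toy uses a temporal median for the albedo; the actual method additionally exploits optical flow and spatial priors, which are omitted here.

```python
import numpy as np

def intrinsic_video(frames, eps=1e-6):
    """Toy intrinsic video decomposition: in the log domain the albedo is
    the time-constant component (temporal median over registered frames),
    and the shading is the per-frame residual."""
    logI = np.log(np.asarray(frames, float) + eps)   # I = A * S
    log_albedo = np.median(logI, axis=0)             # constant over time
    shading = np.exp(logI - log_albedo)              # per-frame remainder
    return np.exp(log_albedo), shading
```

On a synthetic sequence where every frame is the same albedo image scaled by a global illumination factor, the median recovers the albedo exactly and the residuals recover the per-frame shading.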

Robot Arm Pose Estimation through Pixel-Wise Part Classification
Bohg, J.,
Romero, J., Herzog, A. and Schaal, S.
In
IEEE International Conference on Robotics and Automation (ICRA) 2014,
pages 3143-3150,
June
2014.
Abstract:
We propose to frame the problem of marker-less robot arm pose estimation as a pixel-wise part classification problem. As input, we use a depth image in which each pixel is classified to be either from a particular robot part or the background. The classifier is a random decision forest trained on a large number of synthetically generated and labeled depth images. From all the training samples ending up at a leaf node, a set of offsets is learned that votes for relative joint positions. Pooling these votes over all foreground pixels and subsequent clustering gives us an estimate of the true joint positions. Due to the intrinsic parallelism of pixel-wise classification, this approach can run in super real-time and is more efficient than previous ICP-like methods. We quantitatively evaluate the accuracy of this approach on synthetic data. We also demonstrate that the method produces accurate joint estimates on real data despite being purely trained on synthetic data.
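The vote-pooling step — foreground pixels, classified into robot parts, cast position-plus-offset votes for joint locations — can be sketched as below. The mean-per-part pooling is a simplification of the clustering described in the abstract, and all names and data are hypothetical.

```python
import numpy as np

def vote_joint_positions(coords, labels, offsets):
    """Pool per-pixel votes: each foreground pixel classified as a robot
    part casts the vote pixel_position + learned_offset for that part's
    joint; the estimate is the mean vote per part (a stand-in for the
    clustering of votes used in the paper)."""
    joints = {}
    for part in np.unique(labels):
        if part < 0:              # -1 marks background pixels: no votes
            continue
        votes = coords[labels == part] + offsets[part]
        joints[int(part)] = votes.mean(axis=0)
    return joints
```

In the full system the labels come from a random decision forest evaluated independently at every depth pixel, which is what makes the approach trivially parallel and fast.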

Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points
In
German Conference on Pattern Recognition (GCPR),
Springer,
Lecture Notes in Computer Science,
pages 1-13,
September
2014.
Abstract:
Hand motion capture has been an active research topic in recent years, following the success of full-body pose tracking. Despite similarities, hand tracking proves to be more challenging, characterized by a higher dimensionality, severe occlusions and self-similarity between fingers.
For this reason, most approaches rely on strong assumptions, like hands in isolation or expensive multi-camera systems, that limit the practical use. In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera. Our approach combines a generative model with collision detection and discriminatively learned salient points. We quantitatively evaluate our approach on 14 new sequences with challenging interactions.

Human Pose Estimation with Fields of Parts
In
Computer Vision – ECCV 2014,
Springer International Publishing,
Vol. 8693,
Lecture Notes in Computer Science,
pages 331-346,
September
2014.
Abstract:
This paper proposes a new formulation of the human pose estimation problem. We present the Fields of Parts model, a binary Conditional Random Field model designed to detect human body parts of articulated people in single images.
The Fields of Parts model is inspired by the idea of Pictorial Structures: it models local appearance and the joint spatial configuration of the human body. However, the underlying graph structure is entirely different. The idea is simple: we model the presence and absence of a body part at every possible position, orientation, and scale in an image with a binary random variable. This results in a vast number of random variables; however, we show that approximate inference in this model is efficient. Moreover, we can encode the very same appearance and spatial structure as in Pictorial Structures models.
This approach allows us to combine ideas from segmentation and pose estimation into a single model. The Fields of Parts model can use evidence from the background, include local color information, and is connected more densely than a kinematic chain structure. On the challenging Leeds Sports Pose dataset we improve over the Pictorial Structures counterpart by 5.5% in terms of Average Precision of Keypoints (APK).

Discovering Object Classes from Activities
In
European Conference on Computer Vision,
Springer International Publishing,
Vol. 8694,
Lecture Notes in Computer Science,
pages 415-430,
September
2014.
Abstract:
In order to avoid an expensive manual labeling process, or to learn object classes autonomously without human intervention, object discovery techniques have been proposed that extract visually similar objects from weakly labelled videos. However, the problem of discovering small or medium-sized objects is largely unexplored. We observe that videos with activities involving human-object interactions can serve as weakly labelled data for such cases. Since neither object appearance nor motion is distinct enough to discover objects in these videos, we propose a framework that samples from a space of algorithms and their parameters to extract sequences of object proposals. Furthermore, we model similarity of objects based on appearance and functionality, which is derived from human and object motion. We show that functionality is an important cue for discovering objects from activities and demonstrate the generality of the model on three challenging RGB-D and RGB datasets.

Automated Detection of New or Evolving Melanocytic Lesions Using a 3D Body Model
In
Medical Image Computing and Computer-Assisted Intervention (MICCAI),
Springer International Publishing,
Vol. 8673,
Lecture Notes in Computer Science,
pages 593-600,
September
2014.
Abstract:
Detection of new or rapidly evolving melanocytic lesions is crucial for early diagnosis and treatment of melanoma. We propose a fully automated pre-screening system for detecting new lesions, or changes in existing ones on the order of 2-3 mm, over almost the entire body surface. Our solution is based on a multi-camera 3D stereo system. The system captures 3D textured scans of a subject at different times and then brings these scans into correspondence by aligning them with a learned, parametric, non-rigid 3D body model. This means that captured skin textures are in accurate alignment across scans, facilitating the detection of new or changing lesions. The integration of lesion segmentation with a deformable 3D body model is a key contribution that makes our approach robust to changes in illumination and subject pose.

A freely-moving monkey treadmill model
Foster, J., Nuyujukian, P.,
Freifeld, O., Gao, H., Walker, R., Ryu, S., Meng, T., Murmann, B.,
Black, M. and Shenoy, K.
J. of Neural Engineering,
11(4):046020,
2014.
Abstract:
Objective: Motor neuroscience and brain-machine interface (BMI) design are based on examining how the brain controls voluntary movement, typically by recording neural activity and behavior from animal models. Recording technologies used with these animal models have traditionally limited the range of behaviors that can be studied, and thus the generality of science and engineering research. We aim to design a freely-moving animal model using neural and behavioral recording technologies that do not constrain movement.
Approach: We have established a freely-moving rhesus monkey model employing technology that transmits neural activity from an intracortical array using a head-mounted device and records behavior through computer vision using markerless motion capture. We demonstrate the flexibility and utility of this new monkey model, including the first recordings from motor cortex while rhesus monkeys walk quadrupedally on a treadmill.
Main results: Using this monkey model, we show that multi-unit threshold-crossing neural activity encodes the phase of walking and that the average firing rate of the threshold crossings covaries with the speed of individual steps. On a population level, we find that neural state-space trajectories of walking at different speeds have similar rotational dynamics in some dimensions that evolve at the step rate of walking, yet robustly separate by speed in other state-space dimensions.
Significance: Freely-moving animal models may allow neuroscientists to examine a wider range of behaviors and can provide a flexible experimental paradigm for examining the neural mechanisms that underlie movement generation across behaviors and environments. For BMIs, freely-moving animal models have the potential to aid prosthetic design by examining how neural encoding changes with posture, environment, and other real-world context changes. Understanding this new realm of behavior in more naturalistic settings is essential for the overall progress of basic motor neuroscience and for the successful translation of BMIs to people with paralysis.
Modeling the Human Body in 3D: Data Registration and Human Shape Representation
PhD thesis.
Brown University, Department of Computer Science,
May
2014.

Can I recognize my body’s weight? The influence of shape and texture on the perception of self
Piryankova, I., Stefanucci, J.,
Romero, J., de la Rosa, S.,
Black, M. and Mohler, B.
ACM Transactions on Applied Perception for the Symposium on Applied Perception,
11(3):13:1-13:18,
2014.
Abstract:
The goal of this research was to investigate women’s sensitivity to changes in their perceived weight by altering the body mass index (BMI) of the participants’ personalized avatars displayed on a large-screen immersive display. We created the personalized avatars with a full-body 3D scanner that records both the participants’ body geometry and texture. We altered the weight of the personalized avatars to produce changes in BMI while keeping height, arm length and inseam fixed and exploited the correlation between body geometry and anthropometric measurements encapsulated in a statistical body shape model created from thousands of body scans. In a 2x2 psychophysical experiment, we investigated the relative importance of visual cues, namely shape (own shape vs. an average female body shape with equivalent height and BMI to the participant) and texture (own photo-realistic texture or checkerboard pattern texture) on the ability to accurately perceive own current body weight (by asking them ‘Is the avatar the same weight as you?’). Our results indicate that shape (where height and BMI are fixed) had little effect on the perception of body weight. Interestingly, the participants perceived their body weight veridically when they saw their own photo-realistic texture and significantly underestimated their body weight when the avatar had a checkerboard patterned texture. The range that the participants accepted as their own current weight was approximately a 0.83 to −6.05 BMI% change tolerance range around their perceived weight. Both the shape and the texture had an effect on the reported similarity of the body parts and the whole avatar to the participant’s body. This work has implications for new measures for patients with body image disorders, as well as researchers interested in creating personalized avatars for games, training applications or virtual reality.

Advanced Structured Prediction
Nowozin, S.,
Gehler, P.V., Jancsary, J. and Lampert, C.H.
Advanced Structured Prediction.
Neural Information Processing Series,
MIT Press,
2014.
Abstract:
The goal of structured prediction is to build machine learning models that predict relational information that itself has structure, such as being composed of multiple interrelated parts. These models, which reflect prior knowledge, task-specific relations, and constraints, are used in fields including computer vision, speech recognition, natural language processing, and computational biology. They can carry out such tasks as predicting a natural language sentence, or segmenting an image into meaningful components.
These models are expressive and powerful, but exact computation is often intractable. A broad research effort in recent years has aimed at designing structured prediction models and approximate inference and learning procedures that are computationally efficient. This volume offers an overview of this recent research in order to make the work accessible to a broader research community. The chapters, by leading researchers in the field, cover a range of topics, including research trends, the linear programming relaxation approach, innovations in probabilistic modeling, recent theoretical progress, and resource-aware learning.