make motion detection rather straightforward
makes things more difficult:
we require prior information
estimate plausible poses using probabilities
sufficiently general to admit all possible motions
versus
strong enough to resolve ambiguities
activity specific models from motion capturing
Problem: Pose and motion data is extremely high dimensional, difficult to visualize and expensive to compute on.
approximate the posterior probability distribution over human poses
or motions given image observations
\(p(x_{1:t}|z_{1:t}) = p(z_{1:t}|x_{1:t})p(x_{1:t}) / p(z_{1:t})\)
states \(x_{1:t}\), observations \(z_{1:t}\), time \(t\)
computing the posterior distribution is intractable
limited range of motion in each joint
detected poses need to satisfy valid biomechanics
can be used to capture plausibility of pose estimates
every new pose equals the old pose with some added noise
\(y_{t+1} = y_{t} + \eta\)
\(y_{t+1} = y_{t} + \kappa(y_t - y_{t-1}) + \eta\)
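The two smoothness priors above can be simulated in a few lines. This is a minimal sketch, not a reference implementation: the function name, the Gaussian form of \(\eta\), the zero initial velocity and all default values are assumptions made for illustration.

```python
import numpy as np

def sample_smooth_trajectory(y0, steps, kappa=0.0, noise_std=0.05, seed=0):
    """Sample poses from y_{t+1} = y_t + kappa * (y_t - y_{t-1}) + eta.

    kappa = 0 gives the first-order model (random walk around the old pose);
    kappa > 0 adds a velocity term (second-order model).
    """
    rng = np.random.default_rng(seed)
    y0 = np.asarray(y0, dtype=float)
    poses = [y0, y0.copy()]  # assumption: zero initial velocity
    for _ in range(steps):
        eta = rng.normal(0.0, noise_std, size=y0.shape)  # assumption: Gaussian noise
        prev, cur = poses[-2], poses[-1]
        poses.append(cur + kappa * (cur - prev) + eta)
    return np.stack(poses[1:])  # shape (steps + 1, D)

random_walk = sample_smooth_trajectory(np.zeros(3), steps=100)
second_order = sample_smooth_trajectory(np.zeros(3), steps=100, kappa=0.8)
```

With `noise_std=0` both models keep the pose constant, which shows that all motion in these priors comes from noise and from velocity carried over from earlier steps.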
collected using off-line motion capturing
\(\mathbb{D} = \{y^{(i)}\}_{i=1,\dots,N}\)
\(y^{(i)} \in \mathbb{R}^D\)
\(N\) poses \(y\), each consisting of \(D\) joint angles
a motion is a sequence of \(T\) poses: \(m = (y_1,\dots,y_T)\)
activities exhibit strong regularities
\(\rightarrow\) data from a single activity is likely to form clusters in the high-dimensional pose space
\(\rightarrow\) eigen-poses can be constructed to reduce complexity
linear combination of mean motion and eigen-motions characterized by scalar coefficients
\[m \approx \mu + \sum_{j=1}^{B} x_j b_j\]
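One common way to obtain a mean motion \(\mu\) and eigen-motions \(b_j\) is PCA on a set of flattened training motions. The sketch below assumes this route; the function names and the synthetic data are hypothetical and only illustrate how the coefficients \(x_j\) encode and decode a motion.

```python
import numpy as np

def fit_eigen_motions(motions, num_basis):
    """motions: (num_motions, length) array, one flattened motion per row."""
    mu = motions.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(motions - mu, full_matrices=False)
    return mu, vt[:num_basis]

def encode(motion, mu, basis):
    """Project a motion onto the eigen-motions to get coefficients x_j."""
    return basis @ (motion - mu)

def decode(coeffs, mu, basis):
    """Reconstruct m ~ mu + sum_j x_j b_j."""
    return mu + coeffs @ basis

# Synthetic stand-in for motion capture data: 20 motions that lie in a
# 2-D affine subspace of a 50-D space (assumption for illustration).
rng = np.random.default_rng(0)
true_basis = rng.normal(size=(2, 50))
motions = 3.0 + rng.normal(size=(20, 2)) @ true_basis

mu, basis = fit_eigen_motions(motions, num_basis=2)
x = encode(motions[0], mu, basis)
reconstruction = decode(x, mu, basis)
```

Because the synthetic motions are exactly two-dimensional, two coefficients reconstruct them up to numerical precision; real motion data would need more eigen-motions and leave a residual.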
periodic motions follow a cyclic trajectory in the high-dimensional pose space
linear models require many dimensions to appropriately span the data
nonlinear manifolds can model those structures better
univariate \(\rightarrow\) multivariate \(\rightarrow\) processes
[drawings]
\(f \sim GP(m, k)\)
function \(f\) is distributed as a GP with mean function \(m\) and covariance function \(k\)
this generalizes the multivariate Gaussian distribution: any finite set of function values is jointly Gaussian
\(f \sim \mathcal{N}(\mu_{1:n}, \sigma_{1:n,1:n})\)
\(\mu_i = m(x_i)\quad\) \(\sigma_{ij} = k(x_i, x_j)\)
\(k(x,x') = \alpha \exp\left(-\frac{\gamma}{2} \|x-x'\|^2\right) + \beta \delta(x,x')\)
Hyperparameters \(\theta=\{\alpha, \gamma, \beta\}\)
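To make the covariance function concrete, the following sketch builds the kernel matrix for a set of inputs and draws one function from the GP prior. It assumes a zero mean function and reads \(\delta(x,x')\) as "same training index", so the \(\beta\) term ends up on the diagonal; the function name and hyperparameter values are placeholders.

```python
import numpy as np

def kernel_matrix(x, alpha=1.0, gamma=1.0, beta=1e-4):
    """K_ij = alpha * exp(-gamma/2 * ||x_i - x_j||^2) + beta * [i == j].

    x: (n, q) array of n input points.
    """
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    return alpha * np.exp(-0.5 * gamma * sq_dist) + beta * np.eye(len(x))

x = np.linspace(0.0, 5.0, 50)[:, None]
K = kernel_matrix(x, alpha=1.0, gamma=2.0, beta=1e-4)

# Draw f ~ N(0, K) via the Cholesky factor (assumption: zero mean function).
rng = np.random.default_rng(0)
f = np.linalg.cholesky(K) @ rng.normal(size=len(x))
```

Larger \(\gamma\) shortens the length scale and yields wigglier samples, \(\alpha\) scales their amplitude, and \(\beta\) adds independent noise per point.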
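\(p(Y|\{x^{(i)}\}, \theta) = \prod_{d=1}^{D} \frac{1}{\sqrt{(2\pi)^N |K|}} \exp\left(-\frac{1}{2} y_d^T K^{-1} y_d\right)\)
training tuples of vectors \(\{(x^{(i)}, y^{(i)})\}_{i=1:N}\), \(y_d\) being the vector of the \(d\)th elements of all \(y^{(i)}\), \(K\) the \(N \times N\) kernel matrix with \(K_{ij} = k(x^{(i)}, x^{(j)})\)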
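The likelihood is usually evaluated in log form, which turns the product over output dimensions into a sum. A minimal sketch, assuming zero-mean GPs that share one kernel matrix \(K\) across all \(D\) dimensions; the function name is hypothetical.

```python
import numpy as np

def gp_log_likelihood(Y, K):
    """log p(Y | X, theta) for D independent zero-mean GPs sharing K.

    Y: (N, D) matrix whose columns are the vectors y_d.
    K: (N, N) kernel matrix.
    """
    n, d = Y.shape
    _, log_det = np.linalg.slogdet(K)      # log |K|, numerically stable
    k_inv_y = np.linalg.solve(K, Y)        # K^{-1} y_d for every column d
    quad = (Y * k_inv_y).sum()             # sum_d y_d^T K^{-1} y_d
    return -0.5 * (d * n * np.log(2.0 * np.pi) + d * log_det + quad)

Y = np.ones((3, 2))
K = 2.0 * np.eye(3)
ll = gp_log_likelihood(Y, K)
```

Maximizing this quantity with respect to \(\theta\) fits the hyperparameters when the inputs are known.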
utilizes Gaussian processes to predict samples from latent variables
main feature: predictive distribution
unsupervised, we only know the observations and not latent space
optimization happens through evaluating for correct latent space \(\rightarrow\) pose space mapping
initialized with broad Gaussians
GPLVM treats the training poses as independent samples and therefore ignores temporal relations
the latent space loses its intuitive meaning: consecutive poses are not necessarily close to each other in latent space
smooth pose trajectories \(\rightarrow\) smooth latent trajectories
required for accurate predictions and tracking
GPDM is initialized using a GP prior over latent sequences
weighted sum over individual models with side information available
2 layers, neurons connected between the layers but not within
visible units represent the observation, hidden units the latent space
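Because units are only connected across the two layers, all hidden units are conditionally independent given the visible units and vice versa, so each layer can be sampled in one block. The sketch below shows one such Gibbs step for a binary RBM. Binary units are an assumption made for brevity (real-valued pose data would typically call for Gaussian visible units), and all names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gibbs_step(v, W, b_vis, c_hid, rng):
    """One block-Gibbs step of a binary RBM.

    v: (num_visible,) binary vector, W: (num_visible, num_hidden) weights,
    b_vis / c_hid: visible and hidden biases.
    """
    p_h = sigmoid(v @ W + c_hid)                    # p(h_j = 1 | v), factorizes over j
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_vis)                  # p(v_i = 1 | h), factorizes over i
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 4))              # 6 visible, 4 hidden units
v0 = rng.integers(0, 2, size=6).astype(float)
v1, h = gibbs_step(v0, W, np.zeros(6), np.zeros(4), rng)
```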
extension of RBMs to handle time-series data
added temporal input and autoregression: past n inputs influence current input and hidden layers
autoregressive weights model short-term temporal structure
hidden units model longer-term, higher level structure
\(M\ddot{y} = f_{joints} + f_{gravity} + f_{contact} + a\quad\) mass \(M\), acceleration \(\ddot{y}\), forces \(f\)
physically plausible motions: e.g. balance or interactions
better generalization: e.g. walking vs. walking while carrying heavy object
no need for a lot of motion capture data for training
while very promising, not yet very well researched
models like this are widely used in games
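A heavily reduced illustration of the equation of motion above: a single point mass under gravity and a ground contact force, integrated with semi-implicit Euler. There are no joints in this toy, so \(f_{joints}\) and the extra term \(a\) are omitted. The spring-damper contact model and every parameter value are assumptions chosen for illustration.

```python
def simulate_drop(height=1.0, mass=1.0, dt=1e-3, steps=3000,
                  g=9.81, stiffness=5e3, damping=50.0):
    """Drop a point mass onto the ground at y = 0 and return its heights."""
    y, v = height, 0.0
    trajectory = []
    for _ in range(steps):
        f_gravity = -mass * g
        # Penalty contact (assumption): spring-damper that acts only while the
        # mass penetrates the ground, and only pushes upward.
        f_contact = max(0.0, -stiffness * y - damping * v) if y < 0.0 else 0.0
        accel = (f_gravity + f_contact) / mass   # M * y'' = sum of forces
        v += dt * accel                          # semi-implicit Euler: velocity first
        y += dt * v                              # then position with the new velocity
        trajectory.append(y)
    return trajectory

trajectory = simulate_drop()
```

The mass bounces a few times and comes to rest slightly below zero, where the contact spring balances gravity. No motion data is involved; the behaviour follows from the forces alone.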
This: ahoereth.github.io/motion-models
All images from Visual Analysis of Humans (ch.10) and the respective references
Neil Lawrence on GPLVMs @ Google: youtu.be/DS853uA0u4I
Interactive visualizations by Neil Lawrence: github.com/lawrennd/oxford
seems to be broken, fixed version available on github.com/ahoereth/motion-models/tree/lawrennd