Gaze Estimation Models
PyEtSimul provides three families of gaze estimation algorithms: polynomial models, the Stampe (1993) biquadratic model, and homography normalization.
Polynomial Models
Polynomial models map pupil-corneal reflection (P-CR) vectors to gaze coordinates using polynomial regression. Given the P-CR vector \((x, y)\) in image coordinates, the gaze position \((g_x, g_y)\) is estimated as:
\[ g_x = \sum_i a_i \, \phi_i(x, y), \qquad g_y = \sum_j b_j \, \psi_j(x, y) \]
where \(\phi_i\) and \(\psi_j\) are polynomial terms in \(x\) and \(y\), and \(a_i\), \(b_j\) are coefficients determined through calibration.
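Because the model is linear in its coefficients, calibration reduces to an ordinary least-squares problem. The following is a minimal NumPy sketch of that idea, not PyEtSimul's implementation: the function names, the feature set (1, x, y, xy, x^2, y^2), and the synthetic calibration data are all assumptions made for illustration.

```python
import numpy as np


def second_order_features(x, y):
    """Illustrative feature set (an assumption): 1, x, y, xy, x^2, y^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])


def fit_polynomial(pcr, gaze, phi=second_order_features, psi=second_order_features):
    """Least-squares fit of the coefficients a (for g_x) and b (for g_y)."""
    a, *_ = np.linalg.lstsq(phi(pcr[:, 0], pcr[:, 1]), gaze[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(psi(pcr[:, 0], pcr[:, 1]), gaze[:, 1], rcond=None)
    return a, b


def predict_gaze(pcr, a, b, phi=second_order_features, psi=second_order_features):
    """Evaluate g_x = sum_i a_i phi_i(x, y) and g_y = sum_j b_j psi_j(x, y)."""
    gx = phi(pcr[:, 0], pcr[:, 1]) @ a
    gy = psi(pcr[:, 0], pcr[:, 1]) @ b
    return np.column_stack([gx, gy])


# Synthetic 3x3 calibration grid of P-CR vectors (made-up values).
grid = np.linspace(-1.0, 1.0, 3)
xs, ys = np.meshgrid(grid, grid)
pcr = np.column_stack([xs.ravel(), ys.ravel()])

# Made-up ground-truth coefficients used to generate noise-free gaze targets.
true_a = np.array([960.0, 800.0, 15.0, 5.0, -20.0, 3.0])
true_b = np.array([540.0, 10.0, 450.0, -4.0, 2.0, -12.0])
features = second_order_features(pcr[:, 0], pcr[:, 1])
gaze = np.column_stack([features @ true_a, features @ true_b])

a, b = fit_polynomial(pcr, gaze)
pred = predict_gaze(pcr, a, b)
```

Passing different callables for `phi` and `psi` gives an asymmetric model, where the X and Y coordinates use different polynomial terms.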
Built-in Polynomial Models
PyEtSimul includes seven built-in polynomial models:
Hennessey (2008) [1] — Polynomial with cross-terms:
Hoorman (2008) [2] — Linear polynomial (different features for X/Y):
Cerrolaza (2008) symmetric [3] — Second-order polynomial (same features for X/Y):
Cerrolaza (2008) asymmetric [3] — Second-order with different features for X/Y:
Second Order — Full second-order polynomial with all cross-terms:
Zhu Ji (2005) [4] — Asymmetric polynomial:
Blignaut Wium (2013) [5] — High-order polynomial:
Custom Polynomial Registration
Any custom polynomial formulation can be registered without modifying the framework using
PolynomialDescriptor and register_polynomial(). Once registered, custom models are
fully integrated into the data generation and evaluation pipeline.
See the Template for Custom Gaze Estimation Models guide for a complete template.
Stampe (1993) Biquadratic Model
The Stampe (1993) model [6] uses a two-stage calibration:
Stage 1 — Biquadratic polynomial (5 terms, no cross-term):
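\[ X = a_0 + a_1 x + a_2 y + a_3 x^2 + a_4 y^2, \qquad Y = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 y^2 \]
where \((x, y)\) are the P-CR feature coordinates and \((X, Y)\) is the stage-1 gaze estimate.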
Stage 2 — Per-quadrant corner correction removes residual nonlinearity at the screen corners. The screen is divided into 4 quadrants relative to the centroid \((X_c, Y_c)\) of the calibration grid. For each quadrant \(q\):
where \(c_x^{(q)}\) and \(c_y^{(q)}\) are fit via least-squares over all calibration points in that quadrant.
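The two stages can be sketched in NumPy as follows. This is a hedged illustration rather than PyEtSimul's code. Stage 1 fits the five terms 1, x, y, x^2, y^2 by least squares. For stage 2 the sketch assumes the product-form correction from Stampe's paper, applied to offsets from the centroid, so that the X correction is c_x^(q) (X - X_c)(Y - Y_c) and likewise for Y. The exact form used by the library may differ, and all function names here are hypothetical.

```python
import numpy as np


def biquadratic_features(x, y):
    """Stage-1 terms: 1, x, y, x^2, y^2 (no cross-term)."""
    return np.column_stack([np.ones_like(x), x, y, x**2, y**2])


def quadrant_index(est, centroid):
    """Quadrant 0..3 from the sign of the offsets from the centroid."""
    right = (est[:, 0] >= centroid[0]).astype(int)
    upper = (est[:, 1] >= centroid[1]).astype(int)
    return right + 2 * upper


def fit_stampe(pcr, targets):
    # Stage 1: least-squares biquadratic fit, one coefficient column per axis.
    F = biquadratic_features(pcr[:, 0], pcr[:, 1])
    coef, *_ = np.linalg.lstsq(F, targets, rcond=None)
    est = F @ coef

    # Stage 2 (assumed form): per-quadrant gains on the product of offsets.
    centroid = targets.mean(axis=0)
    d = est - centroid
    prod = d[:, 0] * d[:, 1]
    quad = quadrant_index(est, centroid)
    corr = np.zeros((4, 2))
    for q in range(4):
        m = quad == q
        denom = np.sum(prod[m] ** 2)
        if denom > 0:
            # Closed-form 1-D least squares: c = sum(p * r) / sum(p^2).
            residual = targets[m] - est[m]
            corr[q] = (prod[m][:, None] * residual).sum(axis=0) / denom
    return coef, centroid, corr


def predict_stampe(pcr, coef, centroid, corr):
    est = biquadratic_features(pcr[:, 0], pcr[:, 1]) @ coef
    d = est - centroid
    quad = quadrant_index(est, centroid)
    return est + corr[quad] * (d[:, 0] * d[:, 1])[:, None]


# Synthetic 5x5 calibration grid; the xy terms are what stage 1 cannot fit.
grid = np.linspace(-1.0, 1.0, 5)
xs, ys = np.meshgrid(grid, grid)
pcr = np.column_stack([xs.ravel(), ys.ravel()])
x, y = pcr[:, 0], pcr[:, 1]
targets = np.column_stack([100 * x + 5 * x**2 + 8 * x * y,
                           100 * y + 5 * y**2 - 6 * x * y])

coef, centroid, corr = fit_stampe(pcr, targets)
stage1 = biquadratic_features(x, y) @ coef
stage2 = predict_stampe(pcr, coef, centroid, corr)
err_stage1 = np.sum((targets - stage1) ** 2)
err_stage2 = np.sum((targets - stage2) ** 2)
```

On this synthetic data `err_stage2` is smaller than `err_stage1`, because each quadrant's gain is fit to the residual that the biquadratic stage leaves behind.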
Homography Normalization
Homography normalization [7] and its degraded variants are gaze estimation methods for uncalibrated setups (e.g., unknown camera and setup parameters). They use multiple corneal reflections to estimate a normalizing projective planar transformation that compensates for head pose.
Given corneal reflection positions \(\mathbf{p}_i\) in the image and corresponding reference positions \(\mathbf{p}'_i\), the method estimates a normalizing homography matrix \(\mathbf{H}\) such that:
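\[ \mathbf{p}'_i \sim \mathbf{H} \, \mathbf{p}_i \]
where \(\sim\) denotes equality up to scale and the points are expressed in homogeneous coordinates. The mapping from normalized pupil coordinates to gaze can then be learned through user calibration.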
The implementation includes:
RANSAC-based homography estimation to handle outliers
Optional Gaussian Process regression for residual error correction
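This method requires at least four light sources to compute the homography: a planar homography has eight degrees of freedom, and each corneal reflection contributes two constraints.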
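To make the four-point requirement concrete, here is a minimal sketch of homography estimation with the direct linear transform (DLT) in NumPy. It is not the library's implementation: it omits the RANSAC outlier handling and the Gaussian Process correction listed above, skips coordinate pre-normalization, and assumes the bottom-right entry of the matrix is nonzero so it can be scaled to 1. The function names and the sample glint coordinates are hypothetical.

```python
import numpy as np


def estimate_homography(src, dst):
    """DLT estimate of H such that dst ~ H @ src, from 4 or more point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if len(src) < 4 or len(src) != len(dst):
        raise ValueError("need at least 4 matching point pairs")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] != 0


def apply_homography(H, pts):
    """Map 2-D points through H and divide by the homogeneous coordinate."""
    pts = np.asarray(pts, dtype=float)
    hom = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return hom[:, :2] / hom[:, 2:3]


# Made-up example: four glint positions and a known ground-truth homography.
H_true = np.array([[1.2, 0.1, 0.05],
                   [-0.05, 0.9, -0.03],
                   [0.01, 0.02, 1.0]])
glints = np.array([[0.1, 0.2], [0.9, 0.15], [0.85, 0.8], [0.15, 0.9]])
reference = apply_homography(H_true, glints)

H = estimate_homography(glints, reference)
pupil_normalized = apply_homography(H, np.array([[0.5, 0.5]]))
```

With exactly four correspondences the system has a unique solution up to scale; additional light sources overdetermine it, which is what makes outlier rejection possible.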