NeuralNet Tracker - OpenTrack

The NeuralNet tracker uses deep learning models to detect and track your face without requiring any physical markers. It’s the most convenient tracker to set up - just point your webcam at your face and start tracking.

How It Works

The tracker uses two neural networks:

Localizer network: Quickly finds your face in the camera frame
Pose estimator network: Accurately determines head position and orientation from the face region

Both networks are optimized ONNX models that run efficiently on CPU using ONNX Runtime.

The neural network models are included with OpenTrack. The default model (head-pose-0.4-big-int8.onnx) is quantized to INT8 for faster inference while maintaining good accuracy.

Requirements

Hardware

Webcam: Any standard webcam (640x480 or higher)
CPU: Modern multi-core processor (the tracker is CPU-optimized)
RAM: At least 4GB system memory

Software

ONNX Runtime (included with OpenTrack)
OpenCV (included with OpenTrack)
Trained pose estimation model (included)

No special hardware or physical markers required - just your face and a webcam!

Setup Instructions

Select Camera

In OpenTrack tracker settings:

Camera: Select your webcam
Resolution: 640x480 or higher
Force FPS: 30-60 (higher is smoother)
Use MJPEG: Enable if supported

Configure Field of View

Set your camera’s horizontal field of view:

Field of View: 56 degrees (typical for most webcams)

The FOV affects depth estimation accuracy. Most webcams are 50-65 degrees.
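To see why the FOV setting matters for depth, here is a minimal sketch based on the standard pinhole camera model. The function names and the 150mm real face width are illustrative assumptions, not values taken from OpenTrack.

```python
import math

def focal_length_px(image_width_px: int, hfov_deg: float) -> float:
    """Pinhole focal length in pixels from the horizontal field of view."""
    return (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)

def estimate_depth_mm(face_width_px: float, image_width_px: int,
                      hfov_deg: float, real_face_width_mm: float = 150.0) -> float:
    """Distance to the face from its apparent width (similar triangles).

    real_face_width_mm is an assumed average, used only for illustration.
    """
    f = focal_length_px(image_width_px, hfov_deg)
    return f * real_face_width_mm / face_width_px

# The same 160-pixel-wide face yields different depths for different FOV settings.
for fov in (50.0, 56.0, 65.0):
    print(fov, round(estimate_depth_mm(160, 640, fov)))
```

A wider configured FOV gives a shorter focal length, so the same face crop is interpreted as being closer. An FOV setting that does not match the camera therefore scales all depth estimates.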

Set Head Offset

Configure the offset from face detection point to head rotation center:

Offset Forward: 200mm (typical distance from face to neck pivot)
Offset Up: 0mm
Offset Right: 0mm

Or use automatic calibration:

Click Start in calibration section
Rotate your head while keeping body still
System calculates optimal offset
Click Stop when satisfied

Adjust Performance Settings

Optimize for your system:

Number of Threads: 1-4 (more threads = faster, but diminishing returns)
ROI Filter Alpha: 1.0 (lower = smoother ROI transitions)
ROI Zoom: 1.0 (how much to zoom into face region)

Start Tracking

Click Start in OpenTrack
Position your face in camera view
The tracker will automatically detect and track your face
Green overlay shows detected face region

The tracker may take 1-2 seconds to initially detect your face. Once locked on, tracking is continuous.

Configuration Options

Camera Settings

Option	Default	Description
Camera Name	-	Select your webcam
Force Resolution	0 (auto)	Set specific resolution
Field of View	56°	Camera horizontal FOV
Force FPS	Default	Lock framerate (30, 60, 90, etc.)
Use MJPEG	false	Enable MJPEG compression

Head Position Offset

Option	Default	Description
Offset Forward	200mm	Distance from detected face point to neck pivot
Offset Up	0mm	Vertical offset
Offset Right	0mm	Horizontal offset

The forward offset (typically 150-250mm) is crucial for accurate translation tracking. It represents the distance from the face plane to your neck’s rotation center.
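The effect of the offset can be sketched as rotating a head-local offset vector by the estimated head rotation and adding it to the detected face position. The axis layout (x = right, y = up, z = forward) and the function names below are assumptions made for illustration. OpenTrack's internal conventions may differ.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(q, v):
    """Rotate vector v by the unit quaternion q = (w, x, y, z)."""
    w, u = q[0], q[1:]
    t = tuple(2.0 * c for c in cross(u, v))
    ut = cross(u, t)
    return tuple(v[i] + w * t[i] + ut[i] for i in range(3))

def head_center(face_pos_mm, rotation_q, offset_mm=(0.0, 0.0, 200.0)):
    """Rotation center = face position + rotated head-local offset.

    offset_mm is (right, up, forward) in this sketch.
    """
    rotated = rotate(rotation_q, offset_mm)
    return tuple(p + o for p, o in zip(face_pos_mm, rotated))

# Facing the camera: the pivot sits 200mm behind the face point.
print(head_center((0.0, 0.0, 600.0), (1.0, 0.0, 0.0, 0.0)))

# Head turned 90 degrees about the vertical axis: the offset swings sideways.
s = math.sqrt(0.5)
print(head_center((0.0, 0.0, 600.0), (s, 0.0, s, 0.0)))
```

Because the offset is rotated with the head, a wrong forward distance turns pure head rotation into apparent sideways or vertical translation, which is why this value deserves careful calibration.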

Neural Network Settings

Option	Default	Description
PoseNet File	head-pose-0.4-big-int8.onnx	Neural network model file
Number of Threads	1	CPU threads for inference (1-4 recommended)
Show Network Input	false	Display preprocessed input to neural network

Filtering and Smoothing

Option	Default	Description
Internal Filter Enabled	true	Enable built-in Kalman filtering
ROI Filter Alpha	1.0	Region-of-interest smoothing (0-1, lower = smoother)
ROI Zoom	1.0	Zoom factor for face region extraction
Deadzone Size	1.0	Circular deadzone size (mm)
Deadzone Hardness	1.5	Deadzone transition sharpness

Understanding the Internal Filter

The internal Kalman filter smooths the raw pose estimates from the neural network. It helps reduce jitter while maintaining responsiveness. The filter uses:

Unscented transform for non-linear pose spaces
Quaternion representation for rotations
Velocity prediction for smooth motion

Disable this filter only if you want to use OpenTrack’s external filters exclusively.
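To illustrate the idea of velocity prediction, here is a deliberately simplified sketch: a linear, one-dimensional constant-velocity Kalman filter. It is not the tracker's unscented quaternion filter. The class name, the noise parameters, and their defaults are assumptions chosen for the example.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter tracking position and velocity along one axis."""

    def __init__(self, process_var=1.0, measurement_var=1.0):
        self.pos, self.vel = 0.0, 0.0
        # Symmetric 2x2 covariance stored as three numbers; large = "unknown".
        self.p00, self.p01, self.p11 = 1e3, 0.0, 1e3
        self.q = process_var        # assumed acceleration noise variance
        self.r = measurement_var    # assumed measurement noise variance

    def step(self, measurement, dt):
        # Predict: advance the position using the current velocity estimate.
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (2.0 * self.p01 + dt * self.p11) + self.q * dt**4 / 4.0
        p01 = self.p01 + dt * self.p11 + self.q * dt**3 / 2.0
        p11 = self.p11 + self.q * dt**2
        # Update: blend in the measurement according to the Kalman gain.
        innovation = measurement - self.pos
        s = p00 + self.r
        k0, k1 = p00 / s, p01 / s
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p11 = p11 - k1 * p01
        return self.pos

# A steady value of 10.0 with alternating +/-1.0 measurement noise.
kf = ConstantVelocityKalman1D()
noisy = [10.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(200)]
smoothed = [kf.step(z, dt=1.0 / 30.0) for z in noisy]
print(noisy[-1], smoothed[-1])
```

Raising the measurement variance relative to the process variance makes the output smoother but slower to follow real motion, which is the same jitter-versus-responsiveness trade-off described above.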

Advanced Features

Automatic Face Localization

The tracker combines a two-stage detection pipeline with ROI tracking and automatic recovery:

Coarse search: The localizer network scans the full frame at multiple scales
Fine tracking: Once a face is found, the pose estimator focuses on that region
ROI tracking: Subsequent frames only process the region around the last known face position
Recovery: If face is lost, returns to full-frame search

This approach provides both fast initial detection and efficient continuous tracking.
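The control flow described above can be sketched as a small state machine. The two callables stand in for the localizer and pose estimator networks. Their signatures, and the class itself, are hypothetical and exist only to show the search, track, and recover cycle.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height

class FaceTrackerLoop:
    """Runs the coarse search only when no region of interest is known."""

    def __init__(self,
                 localize: Callable[[object], Optional[Box]],
                 estimate_pose: Callable[[object, Box], Optional[Tuple[object, Box]]]):
        self.localize = localize            # hypothetical full-frame face finder
        self.estimate_pose = estimate_pose  # hypothetical pose estimator on a crop
        self.roi: Optional[Box] = None

    def process(self, frame):
        if self.roi is None:
            # Coarse search: scan the whole frame for a face.
            self.roi = self.localize(frame)
            if self.roi is None:
                return None
        # Fine tracking: only the region around the last known face is processed.
        result = self.estimate_pose(frame, self.roi)
        if result is None:
            # Recovery: drop the ROI so the next frame triggers a full search.
            self.roi = None
            return None
        pose, self.roi = result  # the estimator also reports the updated face box
        return pose
```

In this sketch the expensive full-frame search runs only on the first frame and after a tracking loss, while every other frame costs a single pose estimation on a small crop.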

Adaptive ROI Filtering

The region of interest is filtered over time to prevent jittery crops:

ROI_new = ROI_old * (1 - alpha) + ROI_detected * alpha

Lower alpha values create smoother ROI transitions but may lag behind fast movements.
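The formula above is a per-component exponential moving average. A minimal sketch, assuming the ROI is represented as an (x, y, width, height) tuple:

```python
def smooth_roi(roi_old, roi_detected, alpha):
    """Blend the previous ROI with the newly detected one.

    alpha = 1.0 follows the detection immediately; smaller values smooth more.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(old * (1.0 - alpha) + new * alpha
                 for old, new in zip(roi_old, roi_detected))

previous = (100.0, 100.0, 200.0, 200.0)
detected = (120.0, 100.0, 200.0, 200.0)
print(smooth_roi(previous, detected, 1.0))  # jumps straight to the detection
print(smooth_roi(previous, detected, 0.5))  # moves halfway toward it
```

With the default alpha of 1.0 the filter is effectively disabled, so smoothing only takes effect once the value is lowered.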

Deadzone Filter

A circular deadzone filter reduces noise from small movements:

Creates a “dead zone” around the current position
Small movements within the zone are dampened
Large movements pass through normally
Useful for steady aiming in games
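One way such a deadzone could work is sketched below. The gain formula is an assumption chosen to illustrate the roles of size and hardness, not the formula OpenTrack uses: movement much smaller than the size is almost fully suppressed, movement much larger passes almost unchanged, and hardness controls how sharp the transition is.

```python
import math

def apply_deadzone(current, target, size_mm=1.0, hardness=1.5):
    """Move from current toward target, damping small steps.

    The step is scaled by gain = d^h / (d^h + size^h), where d is the
    Euclidean distance between the two points (illustrative formula).
    """
    delta = [t - c for c, t in zip(current, target)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0 or size_mm <= 0.0:
        return tuple(target)  # nothing to damp, or deadzone disabled
    gain = dist**hardness / (dist**hardness + size_mm**hardness)
    return tuple(c + d * gain for c, d in zip(current, delta))

origin = (0.0, 0.0, 0.0)
print(apply_deadzone(origin, (0.1, 0.0, 0.0)))   # tiny jitter is mostly absorbed
print(apply_deadzone(origin, (50.0, 0.0, 0.0)))  # a large move passes nearly intact
```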

Neural Network Models

Default Model

head-pose-0.4-big-int8.onnx

Quantized to INT8 for fast CPU inference
Input: 129x129 grayscale face crop
Output: Head pose (3D position + rotation quaternion + uncertainty)
Inference time: ~15-30ms on modern CPU

Custom Models

You can use custom ONNX models:

Train your own face pose estimation model
Export to ONNX format
Place in OpenTrack’s model directory
Select in PoseNet File dropdown

Custom models must match the expected input/output format. See the model adapter code for interface details.
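Before dropping a custom model into OpenTrack, it can help to inspect its inputs and outputs with the ONNX Runtime Python API. The sketch below is a generic inspection helper, not part of OpenTrack. The expected shape (1, 1, 129, 129) is an assumption derived from the 129x129 grayscale input described above, and the actual tensor layout should be checked against the model adapter code.

```python
def shape_matches(actual, expected):
    """Compare two shapes; None or symbolic (string) dimensions act as wildcards."""
    if len(actual) != len(expected):
        return False
    return all(a is None or isinstance(a, str) or e is None or a == e
               for a, e in zip(actual, expected))

def describe_model(path, threads=1):
    """Return the names, shapes, and types of a model's inputs and outputs."""
    import onnxruntime as ort  # imported lazily so the helper above works without it

    options = ort.SessionOptions()
    options.intra_op_num_threads = threads  # CPU threads used for inference
    session = ort.InferenceSession(path, sess_options=options,
                                   providers=["CPUExecutionProvider"])
    return {
        "inputs": [(i.name, i.shape, i.type) for i in session.get_inputs()],
        "outputs": [(o.name, o.shape, o.type) for o in session.get_outputs()],
    }

# Assumed layout for illustration: batch, channel, height, width.
ASSUMED_INPUT_SHAPE = (1, 1, 129, 129)
print(shape_matches(["batch", 1, 129, 129], ASSUMED_INPUT_SHAPE))
print(shape_matches([1, 3, 224, 224], ASSUMED_INPUT_SHAPE))
```

Calling describe_model on the bundled default model and on a custom model, then comparing the two results, is a quick way to spot a mismatch before selecting the file in the PoseNet File dropdown.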

Performance Optimization

Adjust Thread Count

Start with 1 thread and increase to 2-4 if CPU usage is low:

1 thread: ~30-40 FPS on modern CPU
2 threads: ~50-60 FPS (diminishing returns)
4 threads: ~60-70 FPS (limited gains)

More threads increase CPU usage significantly.
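When comparing thread counts on your own machine, a rough timing helper like the following sketch can turn per-frame latency into an FPS figure. The function names are illustrative, and step stands for whatever per-frame work you want to measure.

```python
import time

def latency_to_fps(latency_ms: float) -> float:
    """Upper bound on frames per second for a given per-frame latency."""
    return 1000.0 / latency_ms

def measure_fps(step, iterations: int = 100) -> float:
    """Call step() repeatedly and return the average calls per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")

# A 25 ms inference time caps the tracker at 40 frames per second.
print(latency_to_fps(25.0))
# Example with a stand-in workload in place of real inference.
print(measure_fps(lambda: sum(range(1000)), iterations=50) > 0)
```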

Optimize Resolution

Higher camera resolution improves face detection but reduces FPS:

320x240: Very fast, may miss faces at distance
640x480: Good balance (recommended)
1280x720: Better detection range, slower

Reduce ROI Zoom

Lower ROI zoom (0.8-0.9) processes fewer pixels:

May increase FPS slightly
Risk of losing face if it moves quickly
Only use if you need every bit of performance

Disable Preview

Turn off video preview when not needed:

Saves CPU cycles for drawing
Minimal but measurable FPS improvement

Troubleshooting

Face not detected

Lighting issues:

Ensure face is well-lit
Avoid backlighting (window behind you)
Add front lighting if needed

Camera positioning:

Face should be roughly centered in frame
Keep face within 30-100cm from camera
Don’t rotate face more than 60° from camera

Settings:

Increase camera resolution
Adjust camera exposure settings
Try different ROI zoom values

Jittery or unstable tracking

Enable internal filter (if disabled)
Reduce ROI filter alpha (e.g., 0.5-0.7)
Use OpenTrack’s Accela filter
Increase deadzone size
Improve lighting for better face detection
Ensure stable camera mounting

Tracking loses face easily

Increase ROI zoom to 1.2-1.5
Increase ROI filter alpha for faster response
Improve lighting conditions
Move closer to camera
Reduce extreme head rotations

Low framerate / High CPU usage

Reduce the number of threads if CPU usage is too high (try 1-2)
Lower the camera resolution or use a lower-resolution camera mode
Enable MJPEG compression
Close background applications
Disable the “Show Network Input” option

Translation tracking incorrect

Recalibrate head offset (forward distance is critical)
Verify FOV setting matches your camera
Check that face is detected at correct size
Ensure camera is at eye level

Advantages and Limitations

Advantages

No markers or special hardware
Easy setup - just point and track
Works in normal lighting
No wearing anything on head
Good for casual use
Constantly improving with better models

Limitations

Higher latency than marker tracking (~50-80ms)
More CPU intensive
Requires good lighting
Limited rotation range (±70°)
Less accurate than IR point tracking
Can be affected by facial expressions

Comparison with Other Trackers

Feature	NeuralNet	ArUco	PointTracker
Marker required	No	Yes (paper)	Yes (IR LEDs)
Setup difficulty	Very Easy	Easy	Medium
Hardware cost	Very Low	Very Low	Medium
Accuracy	Good	Good	Excellent
Latency	Medium (50-80ms)	Low (20-30ms)	Very Low (10-20ms)
CPU usage	Medium-High	Low	Low
Rotation range	±70°	±60°	±90°

Tips for Best Results

Lighting: Use soft, diffuse front lighting - avoid harsh shadows on face
Camera: Position at eye level, 50-80cm away, angled slightly downward
Background: Keep background simple to help face detection
Movement: Start with small movements to establish tracking before large rotations
Calibration: Take time to properly calibrate the forward offset
Filtering: Use OpenTrack’s Accela filter for smoother gaming experience

Technical Details

Model Architecture

The default model uses:

MobileNet-based backbone for efficiency
Regression head for pose parameters
Uncertainty estimation for filter confidence
INT8 quantization for faster CPU inference (up to about 4x, depending on hardware)

Coordinate Systems

Face coordinates: Detected face center in camera space
Head coordinates: Rotation center after applying offset
Output: Standard OpenTrack 6DOF (X, Y, Z, Yaw, Pitch, Roll)
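Since the model outputs a rotation quaternion while the output is expressed as yaw, pitch, and roll, a conversion step is needed somewhere in between. The sketch below uses the common Tait-Bryan Z-Y-X formulas, with yaw about z, pitch about y, and roll about x. This axis assignment is an assumption for illustration and may not match OpenTrack's internal convention.

```python
import math

def quaternion_to_euler_deg(w, x, y, z):
    """Convert a quaternion to (yaw, pitch, roll) in degrees, Z-Y-X order."""
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    # Clamp to guard against rounding slightly outside asin's domain.
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    return tuple(math.degrees(a) for a in (yaw, pitch, roll))

# A rotation of 30 degrees about the z axis.
half = math.radians(30.0) / 2.0
print(quaternion_to_euler_deg(math.cos(half), 0.0, 0.0, math.sin(half)))
```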

Build Requirements

For building with NeuralNet support:

# ONNX Runtime required
# See BUILD.md in tracker-neuralnet directory
