Understanding RetinaFace Landmarks and Confidence Scores

What the five facial landmarks mean, how the confidence score works, and how to use both together to filter out weak detections.

Beyond a bounding box around each face, RetinaFace returns two further outputs for every detection: five landmarks and a confidence score. Knowing what each one represents helps you build cleaner, more reliable results into whatever you are working on.

The five landmarks

For every detected face, RetinaFace marks five key points: the centre of the left eye, the centre of the right eye, the tip of the nose, the left corner of the mouth and the right corner of the mouth. These are returned as pixel coordinates within the image. Whether "left" means the left side of the image or the subject's own left can differ between implementations, so check the convention of the one you use. Five points may sound minimal, but they pin down the in-plane orientation and rough geometry of a face well enough for alignment and basic sanity checks.

Why landmarks are useful

Landmarks turn a flat rectangle into something you can reason about. With the eye positions you can tell whether a head is tilted and rotate the crop to level it. With all five points you can align faces consistently so that downstream steps see them in a standard pose. They also help you filter: a detection whose landmarks fall in an implausible arrangement is often a false positive worth discarding. In short, the landmarks are the raw material for alignment, normalisation and sanity checks.
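
As a minimal sketch of the tilt check described above, the angle of the line joining the two eyes can be computed directly from their coordinates. The function name and the (x, y) tuple layout are assumptions for illustration, not part of any particular RetinaFace package; image coordinates are assumed to have y increasing downward.

```python
import math

def roll_angle_degrees(left_eye, right_eye):
    """Angle of the eye line relative to the horizontal, in degrees.

    Both arguments are (x, y) pixel coordinates. Here "left_eye" is
    assumed to be the eye with the smaller x value in the image.
    With y increasing downward, a positive result means the eye on
    the right of the image sits lower than the one on the left.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

# Eyes at the same height: the head is level.
level = roll_angle_degrees((100.0, 120.0), (160.0, 120.0))

# Right-hand eye lower than the left-hand one: the head is tilted.
tilted = roll_angle_degrees((100.0, 120.0), (160.0, 140.0))
```

Rotating the crop about the midpoint between the eyes by this angle levels the face. The sign that a given imaging library expects for that rotation varies, so it is worth confirming on a test image.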

The confidence score

Alongside each face comes a confidence score between zero and one. It expresses how strongly the model believes the region is a face. A sharp, well-lit face sits near one; a blurry, tiny or partly hidden face sits lower. The score is not a calibrated probability, so a value of 0.9 does not mean that nine in ten such detections are real faces, but in practice it serves as a useful ranking of how trustworthy each detection is.

Choosing a threshold

The score becomes powerful the moment you set a threshold and keep only detections above it. Where you set that line is a deliberate trade-off. A high threshold keeps only the surest faces and minimises false positives, at the risk of dropping faint ones. A low threshold catches more faces, including weak ones, at the risk of letting some non-faces through. There is no universally correct value — it depends on whether your application would rather miss a face or accept a wrong one.
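
The thresholding step can be sketched in a few lines. The dictionary layout below (a "score" key on each detection) and the sample values are hypothetical, chosen only for illustration; adapt the field names to whatever your RetinaFace implementation returns.

```python
def filter_by_score(detections, threshold=0.5):
    """Keep only detections whose confidence is at or above the threshold."""
    return [d for d in detections if d["score"] >= threshold]

# Hypothetical detections, reduced to the field this example needs.
detections = [
    {"id": "clear_face", "score": 0.98},
    {"id": "blurry_face", "score": 0.62},
    {"id": "poster_in_background", "score": 0.31},
]

strict = filter_by_score(detections, threshold=0.9)   # fewest false positives
lenient = filter_by_score(detections, threshold=0.3)  # fewest missed faces
```

With these sample values the strict setting keeps one detection and the lenient one keeps all three, which is the trade-off in miniature.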

Using landmarks and scores together

The two outputs are strongest in combination. Use the confidence score as your first filter to remove the obviously weak detections, then use the landmark geometry as a second check to catch the occasional false positive that slipped through with a deceptively high score. This two-stage approach (score first, geometry second) tends to produce cleaner results than relying on either signal alone.
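
One way to sketch the score-then-geometry idea is shown below. The geometry test is a deliberately simple heuristic chosen for this example, not a rule defined by RetinaFace: it requires every landmark to lie inside the box and the eyes, nose and mouth to appear in top-to-bottom order. The field names ("score", "box", "landmarks") and the landmark keys are assumptions.

```python
def landmarks_plausible(det):
    """Heuristic sanity check on the five landmarks of one detection."""
    x1, y1, x2, y2 = det["box"]
    pts = det["landmarks"]

    # Every landmark should fall inside the bounding box.
    inside = all(x1 <= x <= x2 and y1 <= y <= y2 for x, y in pts.values())

    # For a roughly upright face: eyes above nose, nose above mouth
    # (y grows downward in image coordinates).
    eye_y = (pts["left_eye"][1] + pts["right_eye"][1]) / 2
    mouth_y = (pts["mouth_left"][1] + pts["mouth_right"][1]) / 2
    upright_order = eye_y < pts["nose"][1] < mouth_y

    return inside and upright_order

def two_stage_filter(detections, threshold=0.5):
    """Stage 1: drop low scores. Stage 2: drop implausible geometry."""
    confident = [d for d in detections if d["score"] >= threshold]
    return [d for d in confident if landmarks_plausible(d)]

# Hypothetical sample data.
face = {
    "score": 0.97,
    "box": (50, 40, 150, 170),
    "landmarks": {
        "left_eye": (80, 90), "right_eye": (120, 90), "nose": (100, 115),
        "mouth_left": (85, 140), "mouth_right": (115, 140),
    },
}
# High score, but the "mouth" sits above the "eyes".
odd = {
    "score": 0.92,
    "box": (200, 200, 260, 260),
    "landmarks": {
        "left_eye": (210, 250), "right_eye": (250, 250), "nose": (230, 230),
        "mouth_left": (215, 210), "mouth_right": (245, 210),
    },
}
weak = dict(face, score=0.2)  # plausible geometry, low score

kept = two_stage_filter([face, odd, weak])
```

Only the first detection survives both stages here. Note that the upright-order check would also reject genuinely upside-down or strongly rotated faces, so it suits data where heads are roughly upright; relax or replace it otherwise.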

Practical tips

Start with a moderate threshold and adjust it against your own images rather than trusting a number you read somewhere. Visualise both the boxes and the landmarks during development so you can see what you are accepting and rejecting. And remember that the right settings for a tightly controlled portrait app differ from those for messy real-world photos — tune to your data, and let the landmarks and confidence score do the quiet work of keeping your detections honest.
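
When tuning against your own images, it can help to see how many detections survive at several candidate thresholds before settling on one. The sketch below assumes you have already collected the scores into a plain list; the function name and the sample scores are made up for illustration.

```python
def count_kept(scores, thresholds):
    """Map each candidate threshold to the number of scores at or above it."""
    return {t: sum(s >= t for s in scores) for t in thresholds}

# Hypothetical scores gathered from a batch of your own images.
scores = [0.99, 0.85, 0.60, 0.30]
summary = count_kept(scores, thresholds=[0.3, 0.5, 0.7, 0.9])

for threshold, kept in summary.items():
    print(f"threshold {threshold:.1f}: {kept} detections kept")
```

The counts alone do not say which threshold is right. Pair them with a visual check of the boxes and landmarks near each cut-off to see which detections you are actually gaining or losing.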