This model can detect faces, eyes, irises, ears, noses, mouths and even glasses. It is based on YOLOv6 nano, which provides high speed (170+ FPS).
Number of classes: 7 (face, eye, iris, ear, nose, mouth, glasses)
System requirements: Inference time on CPU: 15 ms (HP Laptop 15-DA0042NH, Intel(R) Core(TM) i7-8550U CPU). Inference time on GPU: around 5 ms (GeForce GTX 1050 Ti).
Model description: I made a YOLOv6 nano object detection model. It is primarily a face detector, but besides the face it can also detect glasses, eyes, mouth, nose, and even iris and ears. I call it an extended face detector because the detection is extended with the classes listed above.
The advantages of this solution are:
Very fast: on an Nvidia GeForce GTX 1050 Ti the inference time is around 5 ms.
The inference time is quite good even on CPU: using OpenCV on an Intel(R) Core(TM) i7-8550U CPU, it is 15 ms.
It's small and fast enough to run at a reasonable speed on mobile devices.
It immediately returns regions of the face, so for simpler use cases it is not necessary to run a separate landmark detector on the face.
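Because the detector returns face parts as separate boxes, a common first step is to group them per face. The following is a minimal sketch of one way to do that, not the model's own post-processing. The Detection record and its field names are hypothetical, since the actual output format depends on how the model is run. Each part is assigned to the first face box that contains the part's center. Ears can lie partly outside the face box, so this simple test may miss them.

```python
from collections import namedtuple

# Hypothetical detection record; the real model output format may differ.
Detection = namedtuple("Detection", "label x1 y1 x2 y2 score")


def center(det):
    """Return the (x, y) center of a detection box."""
    return ((det.x1 + det.x2) / 2.0, (det.y1 + det.y2) / 2.0)


def contains(face, point):
    """Return True if the point lies inside the face box."""
    px, py = point
    return face.x1 <= px <= face.x2 and face.y1 <= py <= face.y2


def group_by_face(detections):
    """Group non-face detections under the face box that contains their center."""
    faces = [d for d in detections if d.label == "face"]
    groups = [{"face": f, "parts": []} for f in faces]
    for det in detections:
        if det.label == "face":
            continue
        for group in groups:
            if contains(group["face"], center(det)):
                group["parts"].append(det)
                break  # assign each part to the first matching face only
    return groups
```

Parts whose center falls outside every face box are dropped, which also filters out some stray detections.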
Some problems that can be solved with the help of this model:
It detects multiple parts of the head, so simple head gaze tracking can be achieved using OpenCV's solvePnP.
It can also detect the iris, so simple eye gaze tracking can be built on top of it.
It returns a bounding rectangle for the eye, which can be used to implement blink detection.
It returns a bounding rectangle for the mouth, which can be used to implement yawn detection.
Since it also detects ears, it is possible to detect whether an ear is covered by something. (This could be a valid use case for driver monitoring development.)
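The blink, yawn and eye gaze ideas above could be sketched with simple box geometry. This is only an illustration under stated assumptions: boxes are (x1, y1, x2, y2) tuples in pixels, the eye box is assumed to get flatter as the eye closes, and the thresholds 0.25 and 0.7 are hypothetical values that would need tuning on real data.

```python
def box_openness(box):
    """Height-to-width ratio of an (x1, y1, x2, y2) box; 0.0 for a degenerate box."""
    x1, y1, x2, y2 = box
    width = x2 - x1
    height = y2 - y1
    if width <= 0 or height <= 0:
        return 0.0
    return height / width


def is_eye_closed(eye_box, threshold=0.25):
    """Treat a flat eye box as a closed eye (threshold is an assumed value)."""
    return box_openness(eye_box) < threshold


def is_yawning(mouth_box, threshold=0.7):
    """Treat a tall mouth box as a yawn (threshold is an assumed value)."""
    return box_openness(mouth_box) > threshold


def horizontal_gaze(eye_box, iris_box):
    """Iris center offset inside the eye box: -1.0 (left edge) to 1.0 (right edge)."""
    ex1, _, ex2, _ = eye_box
    ix1, _, ix2, _ = iris_box
    eye_width = ex2 - ex1
    if eye_width <= 0:
        return 0.0
    iris_cx = (ix1 + ix2) / 2.0
    eye_cx = (ex1 + ex2) / 2.0
    offset = (iris_cx - eye_cx) / (eye_width / 2.0)
    return max(-1.0, min(1.0, offset))
```

For a real blink or yawn detector, these per-frame values would typically be smoothed over several frames before a decision is made.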