The Face Anti-Spoofing feature prevents false facial verification in which a photo, a video, or another substitute is presented in place of an authorized person's face.
It can defend against video attacks, a sophisticated way to trick the system that usually requires a looped video of the victim's face.
The solution is not based on a pre-existing tracking algorithm; I wrote a completely new one. It is a special-purpose tracker: it is not suitable for tracking arbitrary objects, only well-structured objects that the underlying model has been trained on. In this case I trained the model on faces, but it could equally have been the structure of a bee or something similar.
It can be used in access control systems or in online proctoring systems for student authentication, and more generally in any solution that uses facial recognition and needs to detect fraudsters.
The solution is based on a YOLOv4 model that detects fake and real faces. Detection is not continuous; it runs on a separate thread at certain intervals. Between detections, the detected faces are tracked by my tracking solution on the main thread.
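The split between an interval-driven detector thread and a main tracking thread could be sketched roughly as follows. This is a minimal illustration using only the C++ standard library, not the actual implementation: `Detection`, `LatestDetections`, `DetectFn`, and `detectorLoop` are hypothetical names, and the detector is represented by a callback instead of a real YOLOv4 call.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical detection result: a face box plus the real/spoof label.
struct Detection {
    float x, y, w, h;
    bool isReal;
};

// Thread-safe slot that holds only the most recent detector output.
class LatestDetections {
public:
    void publish(std::vector<Detection> dets) {
        std::lock_guard<std::mutex> lock(mutex_);
        detections_ = std::move(dets);
        ++version_;
    }

    // Copies the detections into `out` and returns true only if something
    // newer than `seenVersion` has been published since the last fetch.
    bool fetchIfNewer(std::uint64_t& seenVersion, std::vector<Detection>& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (version_ == seenVersion) return false;
        seenVersion = version_;
        out = detections_;
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::vector<Detection> detections_;
    std::uint64_t version_ = 0;
};

// Stand-in for the detector call (assumption: it returns all faces found).
using DetectFn = std::function<std::vector<Detection>()>;

// Body of the detector thread: run the slow detector, publish the result,
// then sleep for the configured interval until asked to stop.
void detectorLoop(LatestDetections& slot, const std::atomic<bool>& stop,
                  std::chrono::milliseconds interval, const DetectFn& detect) {
    assert(detect && "detect callback must be set");
    while (!stop.load()) {
        slot.publish(detect());
        std::this_thread::sleep_for(interval);
    }
}
```

On the main thread, each frame would call `fetchIfNewer`: when it returns true, the tracker is re-seeded from the fresh detections; otherwise the tracker simply advances the existing faces. In a real pipeline the callback would also need access to the current frame, which is left out here for brevity.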
The properties of the detector model are as follows:
Tiny YOLOv4 version of the model:
mAP (mean average precision), per class:
class_id = 0, name = Spoof, ap = 99.33%
class_id = 1, name = Real, ap = 99.41%
for conf_thresh = 0.25, precision = 0.96, recall = 0.98, F1-score = 0.97
for conf_thresh = 0.25, average IoU = 85.73%
IoU threshold = 50%, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.993667, or 99.37%
Inference time using CPU: 55 ms (on HP Laptop 15-DA0042NH (Processor: Intel(R) Core(TM) i7-8550U CPU))
Standard YOLOv4 version of the model:
mAP (mean average precision), per class:
class_id = 0, name = Spoof, ap = 99.74%
class_id = 1, name = Real, ap = 99.66%
for conf_thresh = 0.25, precision = 0.99, recall = 0.98, F1-score = 0.99
for conf_thresh = 0.25, average IoU = 88.39%
IoU threshold = 50%, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.996980, or 99.70%
Inference time using CPU: 500 ms (on HP Laptop 15-DA0042NH (Processor: Intel(R) Core(TM) i7-8550U CPU))
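For readers less familiar with these metrics, the sketch below shows the standard definitions of IoU, F1-score, and mAP as the mean of per-class AP. It is an illustration of the formulas only, not the evaluation code that produced the figures above; `Box`, `iou`, `f1Score`, and `meanAveragePrecision` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Axis-aligned box: top-left corner plus width and height.
struct Box {
    double x, y, w, h;
};

// Intersection over Union of two boxes, in the range [0, 1].
double iou(const Box& a, const Box& b) {
    const double ix = std::max(0.0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    const double iy = std::max(0.0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    const double inter = ix * iy;
    const double uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0 ? inter / uni : 0.0;
}

// F1-score: harmonic mean of precision and recall.
double f1Score(double precision, double recall) {
    const double sum = precision + recall;
    return sum > 0.0 ? 2.0 * precision * recall / sum : 0.0;
}

// mAP computed as the plain mean of the per-class AP values.
double meanAveragePrecision(const std::vector<double>& perClassAp) {
    assert(!perClassAp.empty());
    return std::accumulate(perClassAp.begin(), perClassAp.end(), 0.0) /
           static_cast<double>(perClassAp.size());
}
```

Applied to the Tiny YOLOv4 figures, `f1Score(0.96, 0.98)` is about 0.97 and `meanAveragePrecision({99.33, 99.41})` is 99.37, which agrees with the reported values. For the standard model the per-class mean is 99.70; the F1-score recomputed from the rounded precision and recall is about 0.985, which is consistent with the reported 0.99 given that the inputs are themselves rounded.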
The detector recognises the following three cases as spoofing:
Print attack: When someone tries to spoof the system by using a photo.
Video presentation attack: When someone tries to spoof the system by using a video recording.
3D face mask attack: When someone tries to fool the system with a 3D face replica.
What you see in this video is the raw output of the detector, without pre- or post-processing, tracking, filtering, visualization, etc.
Inputs (video or image sources the system can process):
MJPEG stream
RTSP stream
USB camera devices
video files (avi, mp4, mkv formats supported)
standalone image files (.png, .jpg formats supported)
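One way an application might map a source string onto the input types listed above is sketched below. This is an assumption-laden illustration, not the system's actual input handling: `InputKind` and `classifyInput` are hypothetical, a purely numeric string is assumed to mean a USB camera index, and an HTTP(S) URL is assumed to carry an MJPEG stream.

```cpp
#include <algorithm>
#include <cctype>
#include <initializer_list>
#include <string>

enum class InputKind { UsbCamera, RtspStream, MjpegStream, VideoFile, ImageFile, Unknown };

static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

static bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Classifies a source string by its scheme or file extension (case-insensitive).
InputKind classifyInput(const std::string& source) {
    const std::string s = toLower(source);
    if (s.empty()) return InputKind::Unknown;
    // Assumption: a purely numeric string such as "0" is a USB camera index.
    if (std::all_of(s.begin(), s.end(),
                    [](unsigned char c) { return std::isdigit(c) != 0; })) {
        return InputKind::UsbCamera;
    }
    if (startsWith(s, "rtsp://")) return InputKind::RtspStream;
    // Assumption: HTTP(S) URLs are treated as MJPEG streams.
    if (startsWith(s, "http://") || startsWith(s, "https://")) return InputKind::MjpegStream;
    for (const char* ext : {".avi", ".mp4", ".mkv"}) {
        if (endsWith(s, ext)) return InputKind::VideoFile;
    }
    for (const char* ext : {".png", ".jpg"}) {
        if (endsWith(s, ext)) return InputKind::ImageFile;
    }
    return InputKind::Unknown;
}
```

Classifying up front lets the caller decide early whether to open a capture device, a network stream, or a file, and to treat a standalone image as a single-frame job.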
Outputs:
Processed video frame
The faces in the frame (bounding boxes)
For each face:
Unique Tracking ID (when processing a video file, the same ID on each frame belongs to the same person)
5 facial landmark points
State of the face: real or fake
The system can also write the processed video to a video file.
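The per-frame outputs listed above could be represented by a data structure along these lines. The type and field names (`FaceResult`, `FrameResult`, `FaceState`, `countFakes`) are hypothetical and chosen only to mirror the list; the actual interface of the system may differ, and the processed frame image itself is left out.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point2f {
    float x, y;
};

struct Rect {
    float x, y, width, height;
};

enum class FaceState { Real, Fake };

// Everything reported for a single face in one frame.
struct FaceResult {
    std::uint64_t trackingId;          // same ID on each frame = same person
    Rect boundingBox;                  // face location in the frame
    std::array<Point2f, 5> landmarks;  // 5 facial landmark points
    FaceState state;                   // real face or fake
};

// All faces found in one processed frame.
struct FrameResult {
    std::vector<FaceResult> faces;
};

// Example consumer: counts the faces flagged as fake in a frame.
std::size_t countFakes(const FrameResult& frame) {
    return static_cast<std::size_t>(
        std::count_if(frame.faces.begin(), frame.faces.end(),
                      [](const FaceResult& f) { return f.state == FaceState::Fake; }));
}
```

An access control integration might, for example, deny entry whenever `countFakes` is non-zero for the frame being evaluated.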
The demo video was recorded on an HP Laptop 15-DA0042NH (Processor: Intel(R) Core(TM) i7-8550U CPU, RAM: 8 GB). The system used 500 MB of RAM and CPU usage was 65% during the recording. The input video was captured with a Xiaomi CMSXJ22A web camera at 1080p resolution. During recording, the processing speed stayed above 50 FPS. The system can maintain this speed on this hardware when processing a single face; with multiple faces it may be slower. The visualization was added to the video afterwards. It can also be rendered live, but that may slow down processing.
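The relationship between the 50 FPS figure and the detector inference times can be made concrete with a little arithmetic. The helpers below are hypothetical and only illustrate the calculation; the source does not state which model variant was used in the demo, so both inference times are considered.

```cpp
#include <cmath>

// Time available per frame, in milliseconds, at a given frame rate.
double frameBudgetMs(double fps) {
    return 1000.0 / fps;
}

// Number of frames that elapse while one detector inference is running.
int framesPerInference(double inferenceMs, double fps) {
    return static_cast<int>(std::ceil(inferenceMs / frameBudgetMs(fps)));
}
```

At 50 FPS the budget is 20 ms per frame. A 55 ms inference (Tiny YOLOv4 on CPU) therefore spans 3 frames and a 500 ms inference (standard YOLOv4 on CPU) spans 25 frames, which is consistent with running the detector at intervals on its own thread while the tracker handles every frame.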
The system is written entirely in C++ and uses the following libraries/technologies: