Comparison of SealNet with PrimNet
To evaluate our software against a previously developed facial recognition system, PrimNet, we trained and tested PrimNet and SealNet models using the same data and parameters. To further ensure fairness, we computed the Rank-1 F1-Score for each model at every threshold value in 0.01 increments and report the results of the run with the highest score. The Rank-5 scores use the same threshold as the best-performing Rank-1 run, with a loosened criterion for classifying a probe as a True Positive. We used F1-Scores because the test set was imbalanced: only 74 photos were of in-set seals, compared with 571 photos with no corresponding seal in the gallery. Because the F1-Score accounts for both false positives and false negatives, it is better suited than accuracy to imbalanced datasets like ours. Baseline accuracy is the accuracy the model would achieve if all probes were rejected. TPR, or true positive rate, is the most intuitive measure of model performance and shows the proportion of in-set probes that are correctly classified at a given threshold.
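The threshold sweep and the metrics described above can be illustrated with a minimal Python sketch. It is not the code used in this study. The function names, the input layout (one top match score and one ranked list of gallery identities per probe, with `None` marking an out-of-set probe), and the decision to count an accepted but misidentified in-set probe as a false negative are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def count_outcomes(
    scores: Sequence[float],
    ranked_ids: Sequence[Sequence[str]],
    true_ids: Sequence[Optional[str]],
    threshold: float,
    k: int = 1,
) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for open-set identification at one threshold.

    Assumed convention (hypothetical, not taken from the paper):
    - a probe is accepted when its top match score >= threshold;
    - true_ids[i] is None when the probe has no seal in the gallery;
    - an accepted in-set probe is a TP if its true identity is among
      the top-k candidates, otherwise it is counted as an FN.
    """
    tp = fp = fn = 0
    for score, ranked, true_id in zip(scores, ranked_ids, true_ids):
        accepted = score >= threshold
        if true_id is None:
            if accepted:
                fp += 1  # out-of-set probe wrongly matched to the gallery
        elif accepted and true_id in ranked[:k]:
            tp += 1
        else:
            fn += 1  # in-set probe rejected or misidentified
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def true_positive_rate(tp: int, fn: int) -> float:
    """TPR = TP / (TP + FN), the share of in-set probes correctly identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def best_threshold(
    scores: Sequence[float],
    ranked_ids: Sequence[Sequence[str]],
    true_ids: Sequence[Optional[str]],
) -> Tuple[float, float]:
    """Sweep thresholds 0.00, 0.01, ..., 1.00 and keep the best Rank-1 F1."""
    best_t, best_f1 = 0.0, -1.0
    for i in range(101):
        t = i / 100
        f1 = f1_score(*count_outcomes(scores, ranked_ids, true_ids, t, k=1))
        if f1 > best_f1:  # ties keep the lowest threshold
            best_t, best_f1 = t, f1
    return best_t, best_f1


def baseline_accuracy(n_in_set: int, n_out_of_set: int) -> float:
    """Accuracy obtained by rejecting every probe."""
    return n_out_of_set / (n_in_set + n_out_of_set)
```

Under these assumptions, a Rank-5 score would reuse the threshold returned by `best_threshold` and call `count_outcomes` with `k=5`. With the probe counts reported here, `baseline_accuracy(74, 571)` is 571/645, or about 0.885, which shows why accuracy alone says little on this dataset: a model that rejects every probe already scores highly.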