New datasets for biometric research on multimodal and interoperable performance launched by NIST
NIST has launched new datasets, consisting of fingerprints, facial photographs, and iris scans, to help biometrics researchers evaluate the performance of access control identity verification systems, according to an announcement by the institution.
The data consists of three databases of files, all stripped of identifying information and available from the NIST website. The three special databases are numbered SD 300, SD 301, and SD 302, and are intended to be the first batch in an expanding collection of resources.
The data is sourced from various collection processes, but two of the databases include data gathered during IARPA’s Nail to Nail (N2N) Fingerprint Challenge, which NIST helped design and operate.
“This all gets back to reproducible research,” said NIST computer scientist Greg Fiumara. “The data will help anyone who is interested in testing the error rates of biometric identification systems.”
SD 301 is the first multimodal biometric dataset that NIST has ever released, according to the announcement.
“This opens up possibilities for types of multimodal research that haven’t been done before,” Fiumara said. “We want to get more secure and more accurate identification, as multimodal systems are harder to spoof.”
SD 302 is made up of fingerprint data from several hundred people captured on eight different devices, some of which are prototypes and some of which are commercially available. Data collected during the N2N challenge includes fingerprints taken with contactless devices, as well as latent prints, which are not often available in realistically and expertly collected form, according to Fiumara.
All of the individuals whose data is found in SD 301 and SD 302 have formally consented to have their biometric data included. SD 300 is made up of data taken from 900 old ink cards, the subjects of which are now deceased. This dataset makes it possible to evaluate how well modern systems can produce results from legacy hard-copy records, which will continue to be used out of necessity by the criminal justice system.
The origin of biometric datasets, and whether people with data included in them consented to share their data, has been an increasingly contentious issue this year.
The SDs are all stored with archival-grade lossless compression, which Fiumara says is an improvement over past research datasets. Each dataset comes with a user’s guide containing information about the collection of the data and other details for researchers.