World’s largest biometrics database leverages Big Data architecture
MapR-DB is an enterprise-grade NoSQL database in Hadoop that allows enterprises to run operational and analytical workloads together in a single cluster. Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures of individual machines, or of racks of machines, are commonplace and should therefore be handled automatically in software by the framework.
MapR-DB was used to build the biometric database with the aim of verifying a person’s identity within 200 milliseconds. The Aadhaar registry includes an iris scan, digital fingerprints, a digital photo, and text-based data for every resident. Approximately three to five megabytes of biometric data are collected per person, which maps to a total of 10-15 petabytes of data. The Aadhaar database therefore conforms with the classic definition of a Big Data system.
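The storage figures above can be checked with a rough calculation. The sketch below is not taken from MapR or UIDAI; it assumes about 1 billion registrants (the scale the platform is said to target) and, as a further assumption, three stored copies of each record, which is a common default in Hadoop-style clusters. The function name `total_petabytes` is illustrative only.

```python
# Back-of-the-envelope storage estimate for the figures quoted above.
MB = 10**6   # bytes per megabyte (decimal units)
PB = 10**15  # bytes per petabyte (decimal units)

# Assumption: about 1 billion registrants, the scale the platform targets.
REGISTRANTS = 1_000_000_000


def total_petabytes(mb_per_person, people, replicas=1):
    """Total storage in petabytes for `people` records kept in `replicas` copies."""
    return mb_per_person * MB * people * replicas / PB


if __name__ == "__main__":
    for mb in (3, 5):
        raw = total_petabytes(mb, REGISTRANTS)
        # Assumption: three copies of each record (a common Hadoop default).
        replicated = total_petabytes(mb, REGISTRANTS, replicas=3)
        print(f"{mb} MB/person: {raw:.0f} PB raw, {replicated:.0f} PB with 3 copies")
```

Under these assumptions the raw data comes to 3-5 petabytes, and roughly 9-15 petabytes once three copies are kept, which is in the neighborhood of the quoted total. The article does not say how the 10-15 petabyte figure was derived, so this is only one possible reading.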
Biometrics Research Group, Inc., publisher of BiometricUpdate.com, defines Big Data as a term used to describe large and complex data sets that can provide insightful conclusions when analyzed in a meaningful way.
Big Data is typically defined as a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. Relational database management systems and desktop statistics and visualization packages often have difficulty handling Big Data. The work instead requires “massively parallel” software running on tens, hundreds, or even thousands of servers. Big Data must be processed with advanced analytic tools and algorithms to reveal meaningful information. MapR-DB is such an advanced analytic tool.
“Multiple challenges include storage – analytics to make sure the data is accurate, security, and very high-volumes of authentications,” MapR co-founder and CEO John Schroeder told Forbes. “It also had to be implemented in a very economical way. Enrollment is on inexpensive laptops, and the low bandwidth and resilient technology must be able to work with the registrations coming in from areas of low connectivity.”
Indeed, between 60,000 and 80,000 small laptops with the Aadhaar system installed are used in remote villages for enrollment and verification. Laptops, generators, chairs and tables are transported to the villages via donkeys, and the systems are then set up in each remote location. There are currently 150,000 operators and supervisors who are trained and certified to operate an enrollment station. On average, each station enrolls about 50 people per day, resulting in approximately one million new enrollments every day.
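The enrollment numbers can be related to one another with simple arithmetic. The sketch below uses only the figures quoted in this article; the function names are hypothetical and the result is an estimate, not an official count of active stations.

```python
import math


def stations_needed(daily_enrollments, per_station_per_day):
    """Stations that must be active to reach a given daily enrollment total."""
    return math.ceil(daily_enrollments / per_station_per_day)


def days_to_enroll(people, daily_enrollments):
    """Days needed to enroll `people` at a steady daily rate."""
    return math.ceil(people / daily_enrollments)


if __name__ == "__main__":
    # About 50 people per station per day and one million enrollments per day.
    print("Active stations implied:", stations_needed(1_000_000, 50))
    # The additional 100 million residents targeted later in the article.
    print("Days to enroll 100 million:", days_to_enroll(100_000_000, 1_000_000))
```

At 50 people per station, one million daily enrollments implies roughly 20,000 stations operating on a given day, well below the 60,000 to 80,000 laptops deployed. At the same pace, enrolling a further 100 million residents would take about 100 days.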
The entire technology architecture behind Aadhaar is based on principles of openness, linear scalability, strong security, and most importantly vendor neutrality. The backbone of the Aadhaar technology was developed using open architecture, design scalability, 2048-bit PKI encryption for data security and computing capacity that allows for over 600 trillion biometric matches to be processed every day.
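To put the daily matching capacity in perspective, the short sketch below converts it to a per-second rate. It also checks one plausible reading of where a figure of that size could come from: comparing each new enrollment against every existing record to detect duplicates. That reading is an assumption, not something stated in the article, and the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def matches_per_second(matches_per_day):
    """Average matching rate implied by a daily total."""
    return matches_per_day / SECONDS_PER_DAY


def dedup_comparisons(new_enrollments, existing_records):
    """Comparisons needed if every new enrollment is checked against every record."""
    return new_enrollments * existing_records


if __name__ == "__main__":
    daily_matches = 600 * 10**12  # "over 600 trillion" matches per day
    rate = matches_per_second(daily_matches)
    print(f"Average rate: {rate / 1e9:.1f} billion matches per second")

    # Assumption: one million new enrollments per day, each compared
    # against roughly 600 million existing records.
    print("Comparisons per day:", dedup_comparisons(1_000_000, 600_000_000))
```

The daily figure works out to roughly 6.9 billion matches per second on average. One million new enrollments checked against about 600 million existing records would also require 600 trillion comparisons per day, which matches the quoted capacity under that assumption.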
Schroeder sees the implementation of such a Big Data storage architecture as a starting point that will allow India to have advantages in terms of delivering healthcare, insurance and other social services. “Aadhaar is a huge leap-frog over the U.S. where social security is just a number,” he told Forbes. “We don’t have the validation and biometric identification to match the person.”
Schroeder revealed his company spent six and a half years developing the Aadhaar platform. He believes the platform is “revolutionary” because it is designed to accept and store data from over 1 billion registrants.
Aadhaar, the world’s largest universal Civil ID program, is the biometric database used by the Indian government to provide social services. To date, Aadhaar has issued 630 million Aadhaar numbers and has enrolled approximately 850 million people. The database is actively used to monitor school attendance, issue natural gas subsidies to India’s rural poor, and send wages directly to people’s bank accounts. The Indian government spends US$50 billion every year on direct subsidies, such as food coupons for rice and cooking gas.
The Aadhaar system, a landmark legacy project of India’s previous Congress Party government, also provides identification to people who do not have birth certificates. Under Prime Minister Narendra Modi, the Indian government is now seeking to massively expand the program. In its first budget, the government allocated US$340 million to speed the Aadhaar registration process. The new government’s objective is now to enroll 100 million more residents with Aadhaar in an effort to expand social programs offered through the service.
Last autumn, India’s Home Ministry declared that Aadhaar would facilitate “anytime, anywhere, anyhow” authentication for beneficiaries. To make this goal possible, the government proposed a “Digital India” project, which would be tasked with providing citizens with a “cradle-to-grave” digital identity. Modi’s government also proposed using Aadhaar to issue bank accounts to all Indian households, and using the database as a means of identification for healthcare insurance beneficiaries in order to launch its newly proposed universal healthcare program. The government was also reportedly exploring the use of Aadhaar to assist in the issuance of passports, mobile phone SIM cards, and pension payments. India’s government has also been considering merging the database with its National Population Register (NPR). The NPR is a comprehensive identity database used for India’s census and is maintained by the Home Ministry, while the UIDAI maintains Aadhaar.
Future plans call for Aadhaar to be used for digital signatures, electronic documents and digital locker services. Aadhaar could also be used for college and university certificates, as well as credit registries.
With Aadhaar projected to become more integral to social service delivery, its stability and scalability are key. MapR-DB ensures that the world’s largest biometric database is able to effectively leverage Big Data architecture.