Date: 2023-09-01

Meta made a couple of significant computer vision announcements this week. It introduced a proposed fairness benchmark, and it made a vision model open source.

In both cases, the parent of Facebook wants to insinuate itself deeper into the fabric of AI development.

Meta has proposed FACET as the standard for image classification and semantic segmentation “at unprecedented scale.” How much, if at all, Facebook benefits from this is an open question. The company famously swore off facial recognition for the social media service.

Presumably referring to its corporate self, Meta in an announcement said, “we have a responsibility to ensure that our AI systems are fair and equitable.”

Anyone using AI computer vision “may” have a bad experience because of their demographics, not because biometric recognition and related tasks are inherently complex.

FACET, an acronym only a human could dream up, stands for FAirness in Computer Vision EvaluaTion. It is designed to better evaluate vision models for visual grounding, instance segmentation, detection and classification.

There are 50,000 people recorded in 32,000 images in the FACET database, according to Meta. Each image is labeled for demographic attributes by expert human annotators. The company did not explain how it defines “expert.”

Other labels cover physical attributes, such as perceived skin tone and hair style, and “person-related” classes such as doctor and basketball player.
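To make the idea of a labeled fairness benchmark concrete, here is a minimal sketch of how per-group labels could be used to compare a classifier's recall across demographic groups. It is an illustration only: the group names, the record layout, the function names and the toy data are all hypothetical, and this is not presented as FACET's schema or as the metric Meta reports.

```python
from collections import defaultdict

def recall_by_group(records, target_class):
    """Recall for one class, split by group.

    records: iterable of (group, true_class, predicted_class) tuples.
    The layout is hypothetical, chosen only for this illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, truth, predicted in records:
        if truth != target_class:
            continue  # recall only counts people who truly belong to the class
        totals[group] += 1
        if predicted == target_class:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}

def largest_gap(per_group):
    """Difference between the best-served and worst-served group."""
    values = list(per_group.values())
    return max(values) - min(values)

# Toy, made-up records: (perceived skin tone group, true class, predicted class).
records = [
    ("lighter", "doctor", "doctor"),
    ("lighter", "doctor", "doctor"),
    ("lighter", "basketball player", "doctor"),
    ("darker", "doctor", "doctor"),
    ("darker", "doctor", "basketball player"),
    ("darker", "basketball player", "basketball player"),
]

per_group = recall_by_group(records, "doctor")
gap = largest_gap(per_group)
print(per_group, gap)
```

A gap near zero would suggest the model finds doctors about equally well in both groups; a large gap is the kind of disparity a benchmark of this sort is meant to surface.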

The company says there are also labels for 69,000 masks from the SA-1B database. That stands for Segment Anything 1 Billion, and it was designed to train general-purpose object segmentation models on images from the wild.

The announcement, a six-minute read, goes into much more detail.

It also explains that Meta is expanding DINOv2 and making it open source. The computer vision model was trained using self-supervised learning to create universal features. It is covered by the Apache 2.0 license.

DINOv2-derived dense prediction models have been released for semantic image segmentation and monocular depth estimation.

This will give the AI community “greater flexibility to explore its capabilities on downstream tasks,” according to the company.

