Many of Google’s machine learning efforts are open-sourced so that developers can take advantage of the latest advancements. The newest release covers semantic image segmentation, the technology behind the Pixel 2’s single-lens Portrait Mode.
This deep learning model assigns a semantic label to every pixel in an image. Those labels identify categories like road, sky, person, or dog, and in turn distinguish which parts of a picture are the background and which are the foreground.
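To make the idea concrete, here is a minimal sketch (not the DeepLab release itself) of what a per-pixel segmentation map looks like and how a foreground/background mask can be derived from it. The label IDs and class set below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical label IDs for a handful of classes (illustrative only).
LABELS = {0: "background", 1: "road", 2: "sky", 3: "person", 4: "dog"}
FOREGROUND = {3, 4}  # classes treated as foreground subjects

# A tiny 4x4 "segmentation map": one class ID per pixel, as a real
# model would output for every pixel of a full-resolution photo.
seg_map = np.array([
    [2, 2, 2, 2],
    [2, 3, 3, 2],
    [1, 3, 3, 1],
    [1, 1, 4, 1],
])

# Boolean mask: True where the pixel belongs to a foreground class.
# A portrait-style effect would blur everything outside this mask.
fg_mask = np.isin(seg_map, list(FOREGROUND))

print(fg_mask.astype(int))
```

In a real pipeline the segmentation map would come from the model’s per-pixel predictions rather than being hand-written, but the masking step works the same way.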
Applied to photography, the latter is leveraged in the Pixel 2’s Portrait Mode to create shallow depth-of-field effects with only one physical lens. This use demands particular precision in “pinpointing the outline of objects,” or being able to distinguish where a person ends and the background begins.
Assigning these semantic labels requires pinpointing the outline of objects, and thus imposes much stricter localization accuracy requirements than other visual entity recognition tasks such as image-level classification or bounding box-level detection. This is made possible in DeepLab-v3+ thanks to a decoder module that improves performance especially along object boundaries.
Open-sourced on Monday (via The Verge), this semantic image segmentation model makes possible the features seen on the Pixel 2 and Pixel 2 XL. Implemented in TensorFlow, the release also includes model training and evaluation code. Google notes that these accuracy levels were unimaginable five years ago, and were made possible by advances in hardware, methods, and datasets. Read more from 9to5google.com…
thumbnail courtesy of 9to5google.com