We faced several challenges. The deep-learning models need to be
shipped as part of the operating system, taking up valuable NAND
storage space. They also need to be loaded into RAM and require
significant compute time on the GPU and/or CPU. Unlike
cloud-based services, which can dedicate their resources
entirely to a vision problem, on-device computation must share
these system resources with other running applications.
Finally, the computation must be efficient enough to process a
large Photos library in a reasonably short amount of time, but
without significant power usage or thermal increase.
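To make the storage and memory pressure concrete, here is a back-of-envelope sketch of the cost of shipping a model's weights on-device. The parameter count and byte widths below are illustrative assumptions, not figures from the actual system; the point is simply that weight precision directly scales both the NAND footprint and the RAM needed to load the model.

```python
def model_footprint_bytes(num_params: int, bytes_per_weight: int) -> int:
    """Storage (and resident RAM) needed just for the model weights."""
    return num_params * bytes_per_weight

# A hypothetical 20M-parameter detection network:
params = 20_000_000
fp32_bytes = model_footprint_bytes(params, 4)  # float32 weights
int8_bytes = model_footprint_bytes(params, 1)  # 8-bit quantized weights

print(f"float32: {fp32_bytes / 1e6:.0f} MB")  # 80 MB
print(f"int8:    {int8_bytes / 1e6:.0f} MB")  # 20 MB
```

Quantizing weights from 32-bit floats to 8 bits cuts this footprint by 4x, which is one common lever for fitting deep models within an operating-system storage budget.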