The edge inference conversation has been dominated by latency. Read any survey paper, attend any infrastructure conference, and the opening argument is nearly always the same: cloud inference ...