AI has made significant progress in recent years, with systems matching human capabilities across diverse tasks. However, the real challenge lies not only in building these models but in deploying them efficiently in everyday use cases. This is where AI inference takes center stage, emerging as a critical focus for researchers and technology leaders.