Microsoft has unveiled Phi-3, a new, smaller AI model aimed at reducing costs for customers who may find large language models (LLMs) unaffordable.
In a statement released Tuesday, Microsoft highlighted that Phi-3-mini demonstrates remarkable performance, outperforming models twice its size across various language, coding, and mathematical benchmarks.
Designed to handle simpler tasks, these smaller AI models are accessible to companies with limited resources. For instance, Microsoft explained that Phi-3 could assist businesses in summarizing lengthy documents and extracting valuable insights from market research reports.
Training Methodology of Phi-3
Eric Boyd, Corporate Vice President of Microsoft Azure AI Platform, described Phi-3-mini as comparable in capability to LLMs like GPT-3.5, albeit in a more compact form. He explained that developers employed a “curriculum” approach in training Phi-3, drawing inspiration from how children learn through bedtime stories and simplified language structures.
Boyd elaborated, “We took a list of more than 3,000 words and asked an LLM to make ‘children’s books’ to teach Phi.” He underscored that Phi-3 builds upon the knowledge acquired by its predecessors, including Phi-2, released in December. While Phi-2 matched the performance of larger models like Llama 2, Phi-3 surpasses its predecessor, delivering responses comparable to those of models ten times its size.
Implications and Cost Reduction
With reduced processing requirements, smaller AI models allow big tech providers to offer more affordable solutions to customers. Microsoft anticipates that the accessibility of these new models will enable broader integration of AI in scenarios where the larger models were previously financially prohibitive.
While Microsoft asserts that employing the new models will be “substantially cheaper” than using larger counterparts like GPT-4, specific pricing details were not provided.
Competitors in the Field
Other tech giants, including Google and Meta, have also introduced small AI models targeting simpler tasks such as document summarization and coding assistance. Google’s Gemma 2B and 7B, along with Meta’s Llama 3 8B, cater to tasks like chatbots and coding support, while Anthropic’s Claude 3 Haiku summarizes dense research papers.
The introduction of Phi-3 marks a significant step in democratizing access to AI technologies, making them more accessible and affordable to a wider range of users worldwide.