Mistral AI's Open Source Initiative

Arthur Mensch, Co-Founder and CEO, Mistral

Welcome, everyone. I'm Arthur Mensch, CEO of Mistral AI, and I'm excited to share what we've been up to over the last six months. Our ambition at Mistral AI is to develop frontier models, the foundational models behind the AI revolution, and put them in the hands of real-world application makers. We believe in giving developers deep access, and our approach is to release open-weight models and open-source software. We started Mistral AI in May with the goal of giving AI builders in France and Europe deeper access to AI models, because we believe open source is the key to accelerating technology adoption. Our team has grown to 18 people, and we quickly set out to recreate the entire stack needed to train large language models: data pipelines, training code, and evaluation and inference pipelines.

In September, we released Mistral 7B, a model with 7 billion parameters. At that size, it's small enough to run on a smartphone, and the community has adopted it for a range of applications, including running it on an iPhone 15 at reading speed. Mistral 7B outperforms models such as Llama 2 13B, which is almost twice its size.
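To see why a 7-billion-parameter model fits on a phone, a rough back-of-envelope calculation of the weight memory at different precisions helps. This is an illustrative sketch, not official Mistral sizing; real deployments also need memory for activations and the KV cache:

```python
# Rough weight-memory footprint for a 7B-parameter model.
# Illustrative arithmetic only; not official Mistral numbers.
params = 7_000_000_000

def weights_gb(bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return params * bytes_per_param / 1024**3

print(f"fp16 : {weights_gb(2):.1f} GiB")    # two bytes per weight
print(f"int8 : {weights_gb(1):.1f} GiB")    # one byte per weight
print(f"4-bit: {weights_gb(0.5):.1f} GiB")  # half a byte per weight
```

At 4-bit quantization the weights come to a little over 3 GiB, which is why community quantizations of Mistral 7B can run on recent smartphones.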

One of our key premises is that generative AI can be trained more efficiently than what large companies have been doing: we demonstrated that with a fraction of the compute resources, we could train competitive and useful models. We strongly believe in the power of open source to make AI truly useful. Open source gives developers deep access to models, letting them tweak and customize them for specific applications.

We also understand the challenges of serving large language models, particularly around memory. Mistral 7B addresses this with a sliding-window attention architecture inspired by Longformer, reducing memory pressure and making inference more efficient. We are actively working on making Mistral 7B available on the major hyperscalers, and we have seen companies and open-source projects adopt it as a replacement for existing APIs.

Looking ahead, we plan to release new open-source models, training on the Scaleway SuperPod, which is working well for us. Our team of scientists and engineers is exploring new scientific frontiers, aiming for better reasoning, larger memory capacity, and more efficient training. On the business side, we're developing a hosted solution, a self-deployed platform, and optimized, verticalized models.

Our roadmap is packed for the next year, and we're excited about the possibilities. We've seen derivative works and collaborations that improve and extend Mistral 7B, showcasing the power of open source and community contributions. We welcome creative ideas, new datasets, and community engagement to make Mistral stronger and better.