[AI Meta Llama-3.1] Introduction

Explore the powerful features of the Meta Llama-3.1 AI model, including its available model sizes, key capabilities, custom deployment options, and cost optimization strategies.

This open-source AI model can be fine-tuned, distilled, and deployed anywhere. The latest instruction-tuned models are available in 8B, 70B, and 405B versions.
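For example, here is a minimal sketch of running the 8B instruction-tuned variant locally with Hugging Face transformers. It assumes you have accepted the Llama 3.1 license for the gated repository, have transformers, torch, and accelerate installed, and have enough GPU memory for the weights; the repository id shown is the commonly used one, but verify it against your provider.

```python
# Minimal sketch: run the 8B instruction-tuned model with Hugging Face transformers.
# Assumes a recent transformers release (chat-style message lists supported by the
# text-generation pipeline), accelerate for device_map="auto", and access to the
# gated Llama 3.1 repository (the repo id below may differ for your provider).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain model distillation in one sentence."},
]

out = generator(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])  # the newly generated assistant turn
```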

Models

  • 405B: flagship foundation model, supporting the broadest range of use cases.
  • 70B: high-performing, cost-effective model, supporting various use cases.
  • 8B: lightweight, ultra-fast model that can run anywhere.
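A rough rule of thumb when choosing between these sizes: weight memory scales linearly with parameter count, at about 2 bytes per parameter in 16-bit precision and proportionally less with 8-bit or 4-bit quantization; KV cache, activations, and serving overhead come on top. A small back-of-the-envelope sketch:

```python
# Back-of-the-envelope weight memory: billions of parameters times bytes per
# parameter gives GB directly (1e9 params x 1 byte = 1 GB). This excludes the
# KV cache, activations, and serving overhead, which add more on top.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

for params_b in (8, 70, 405):
    estimates = ", ".join(
        f"{dtype}: ~{params_b * bytes_per:,.0f} GB"
        for dtype, bytes_per in BYTES_PER_PARAM.items()
    )
    print(f"{params_b}B -> {estimates}")
# 8B fits comfortably on a single modern GPU in 16-bit; 405B needs a multi-GPU
# node even with aggressive quantization.
```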

Key Capabilities

  • Tool usage (see the sketch after this list)
  • Multilingual agents
  • Complex reasoning
  • Coding assistance
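In practice, tool usage means prompting the model to emit a structured call that your application executes and then feeding the result back for a final answer. The sketch below is a generic zero-shot pattern, not the official Llama 3.1 tool-calling format; the `get_weather` tool and the `chat(messages) -> str` helper are illustrative placeholders for whichever Llama 3.1 endpoint you use.

```python
# Generic zero-shot tool-use loop (illustrative; not the official Llama 3.1
# tool-calling prompt format). `chat(messages) -> str` is a placeholder for
# any chat-completion call against a Llama 3.1 model.
import json

def get_weather(city: str) -> str:
    return f"22°C and clear in {city}"  # stand-in; a real tool would call an API

TOOLS = {"get_weather": get_weather}

SYSTEM = (
    "You can call tools. To call one, reply with ONLY a JSON object such as "
    '{"tool": "get_weather", "arguments": {"city": "Paris"}}. '
    "Otherwise answer the user directly."
)

def answer(user_question: str, chat) -> str:
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_question}]
    reply = chat(messages)
    try:
        call = json.loads(reply)
        result = TOOLS[call["tool"]](**call["arguments"])
    except (json.JSONDecodeError, KeyError, TypeError):
        return reply  # no tool call; the model answered directly
    messages += [{"role": "assistant", "content": reply},
                 {"role": "user", "content": f"Tool result: {result}"}]
    return chat(messages)  # let the model turn the tool result into a final answer
```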

Make Llama Your Own

Build faster with our open ecosystem by choosing from a range of differentiated products and services that support your use case.

  • Inference: Choose between real-time and batch inference services. Download the model weights to further optimize cost per token.
  • Fine-tuning, Distillation, and Deployment: Adapt and improve the model for your applications with synthetic data, and deploy it locally or in the cloud.
  • RAG and Tool Usage: Use Llama system components and extend the model with zero-shot tool use and retrieval-augmented generation (RAG) to build agent-like applications.
  • Synthetic Data Generation: Use high-quality synthetic data generated by the 405B model to improve specialized models for specific use cases (see the collection sketch after this list).
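A common shape for the synthetic-data and distillation workflow is: prompt a strong teacher model (for example the 405B) for high-quality responses, store them as instruction/response pairs, and fine-tune a smaller model on the result. Below is a minimal sketch of the collection step, where `chat(messages) -> str` is a placeholder for your 405B endpoint and the file layout is illustrative.

```python
# Sketch of the data-collection half of synthetic data generation / distillation:
# query a strong teacher model (e.g. Llama 3.1 405B behind any chat endpoint) and
# write instruction/response pairs as JSONL for later fine-tuning of a smaller model.
# `chat(messages) -> str` is a placeholder for your 405B endpoint.
import json

SEED_TASKS = [
    "Explain retrieval-augmented generation to a new engineer.",
    "Write a Python function that deduplicates a list while preserving order.",
]

def collect(chat, path: str = "synthetic_train.jsonl") -> None:
    with open(path, "w", encoding="utf-8") as f:
        for task in SEED_TASKS:
            response = chat([{"role": "user", "content": task}])
            f.write(json.dumps({"instruction": task, "response": response}) + "\n")
```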

Quick Start with Partners

[Image: 405B models available through partner platforms]

Model Evaluation

Performance was measured on over 150 benchmark datasets covering multiple languages, supplemented with extensive human evaluations.

[Image: Llama 3.1 vs. Llama 3 benchmark comparison]

[Image: Llama 3.1 vs. other models benchmark comparison]

Model Pricing

As of 2024-07-23 12:00 PST, publicly listed pricing for hosted Llama 3.1 inference APIs is quoted per million tokens. The table below will be updated as more pricing information becomes available.

[Image: Llama 3.1 inference API pricing per million tokens]
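Per-million-token pricing makes request costs straightforward to estimate: multiply input and output token counts by their respective rates and divide by one million. A small sketch with placeholder rates (the prices below are illustrative, not actual Llama 3.1 hosting quotes):

```python
# Per-request cost estimate under per-million-token pricing.
# The example rates are placeholders, not actual Llama 3.1 hosting prices.
def request_cost(input_tokens: int, output_tokens: int,
                 usd_per_m_input: float, usd_per_m_output: float) -> float:
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply at $3 / $5 per million tokens:
print(f"${request_cost(2_000, 500, 3.0, 5.0):.4f}")  # -> $0.0085
```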

