Llama 3.1 70B vs Llama 3 70B: Which is Better?

Introduction

On July 23, 2024, Meta released its newest flagship model, Llama 3.1 405B, together with two smaller variants: Llama 3.1 70B and Llama 3.1 8B. This release came just three months after the introduction of Llama 3. While Llama 3.1 405B outperforms GPT-4 and Claude 3 Opus in most benchmarks, making it the most powerful open-source model available, it may not be the optimal choice for many real-world applications because of its slow generation time and high Time to First Token (TTFT).

For developers looking to integrate these models into production or self-host them, Llama 3.1 70B emerges as a more practical alternative. But how does it compare to its predecessor, Llama 3 70B? Is it worth upgrading if you're already using Llama 3 70B in production?

In this blog post, we'll conduct a detailed comparison between Llama 3.1 70B and Llama 3 70B, examining their performance, efficiency, and suitability for various use cases. Our goal is to help you make an informed decision about which model best fits your needs.

Overview

  • Llama 3.1 70B: Best for tasks requiring extensive context, long-form content generation, and complex document analysis.
  • Llama 3 70B: Excels in speed, making it ideal for real-time interactions and quick-response applications.
  • Benchmark Performance: Llama 3.1 70B outperforms Llama 3 70B in most benchmarks, particularly in mathematical reasoning.
  • Speed Trade-Off: Llama 3 70B is significantly faster, with lower latency and quicker token generation.
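
To make the speed terms above concrete, here is a minimal sketch of how Time to First Token and token generation speed could be measured from a streamed response. The function name `measure_stream` and the `StreamStats` type are our own illustrative names, not part of any Llama SDK, and the token stream is assumed to be a lazy iterable that only sends the request once iteration begins.

```python
import time
from typing import Callable, Iterable, NamedTuple


class StreamStats(NamedTuple):
    ttft_s: float        # seconds from request start to the first token
    tokens_per_s: float  # generation speed measured after the first token
    n_tokens: int        # total tokens received


def measure_stream(tokens: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> StreamStats:
    """Consume a token stream and time it.

    `tokens` is a hypothetical stand-in for a streaming API response:
    any lazy iterable that yields tokens as the model produces them.
    """
    start = clock()
    first = None
    last = start
    n = 0
    for _ in tokens:
        last = clock()
        if first is None:
            first = last  # arrival time of the first token
        n += 1
    if first is None:
        raise ValueError("stream produced no tokens")
    decode_time = last - first
    # Speed is undefined for a single token, so report 0.0 in that case.
    rate = (n - 1) / decode_time if decode_time > 0 else 0.0
    return StreamStats(first - start, rate, n)
```

Running the same prompt through both models and comparing the two `StreamStats` results gives a like-for-like view of latency (`ttft_s`) and throughput (`tokens_per_s`). The `clock` parameter exists only so the timing logic can be checked with a fake clock.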

Llama 3 70B vs Llama 3.1 70B

Basic Comparison

Here's a basic comparison between the two models.

Specification            Llama 3.1 70B        Llama 3 70B
Parameters               70 billion           70 billion
Price (input tokens)     $0.9 / 1M tokens     $0.9 / 1M tokens
Price (output tokens)    $0.9 / 1M tokens     $0.9 / 1M tokens
Context window           128K                 8K
Max output tokens        4096                 2048
Supported inputs         Text                 Text
Function calling         Yes                  Yes
Knowledge cutoff date    December 2023        December 2023

The 16x larger context window (128K vs. 8K tokens) and doubled output limit (4096 vs. 2048 tokens) give Llama 3.1 70B a substantial edge in handling longer and more complex tasks, even though both models share the same parameter count, pricing, and knowledge cutoff date. These expanded capabilities make Llama 3.1 70B more versatile for a wide range of applications.
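
As a rough illustration of what these numbers mean in practice, the sketch below encodes the table as two specs and checks whether a request fits each model, along with its cost. The names `ModelSpec`, `fits`, and `cost_usd` are hypothetical helpers of ours. We also assume that "128K" and "8K" mean 128 × 1024 and 8 × 1024 tokens, and that the context window must hold the prompt and the generated output together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_window: int      # tokens; assumes "K" means 1024
    max_output_tokens: int
    usd_per_m_input: float   # price per 1M input tokens
    usd_per_m_output: float  # price per 1M output tokens


# Values taken from the comparison table above.
LLAMA_3_1_70B = ModelSpec("Llama 3.1 70B", 128 * 1024, 4096, 0.9, 0.9)
LLAMA_3_70B = ModelSpec("Llama 3 70B", 8 * 1024, 2048, 0.9, 0.9)


def fits(spec: ModelSpec, prompt_tokens: int, output_tokens: int) -> bool:
    """True if the request respects the output cap and the context window.

    Assumption: prompt and output share one context window.
    """
    return (output_tokens <= spec.max_output_tokens
            and prompt_tokens + output_tokens <= spec.context_window)


def cost_usd(spec: ModelSpec, prompt_tokens: int, output_tokens: int) -> float:
    """Request cost in USD at the per-million-token prices in the spec."""
    return (prompt_tokens * spec.usd_per_m_input
            + output_tokens * spec.usd_per_m_output) / 1_000_000


if __name__ == "__main__":
    # Example: a 20,000-token document with a 1,000-token summary.
    for spec in (LLAMA_3_1_70B, LLAMA_3_70B):
        print(spec.name, fits(spec, 20_000, 1_000),
              round(cost_usd(spec, 20_000, 1_000), 4))
```

Under these assumptions, a 20,000-token prompt with a 1,000-token answer fits only Llama 3.1 70B. Because the prices are identical, the same request costs about $0.019 on either model, so the upgrade decision comes down to context needs and speed rather than price.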