Which o3-mini Reasoning Level is the Smartest?

Reasoning Mode | Speed | Use Case | Benchmarks | Ideal Applications
Low | Fastest; responds in seconds | Rapid prototyping, high-volume data preprocessing | Matches o1-mini coding accuracy | Basic data entry, quick responses, simple queries
Medium | Balanced speed and accuracy | Data analysis, content generation | Improved accuracy over Low mode | Moderate-complexity tasks, report generation, content creation
High | Designed for PhD-level precision | Complex problem-solving, strategic planning | Elite-tier reasoning capabilities | Advanced STEM applications, SEO, in-depth research

1. Low Reasoning Mode

  • Speed: Low reasoning mode responds in seconds, making it faster than the o1-mini model. This speed is crucial for applications where time is of the essence.
  • Use Case: This mode is ideal for rapid prototyping and high-volume data preprocessing, where quick responses are essential. It lets developers test ideas and iterate quickly without getting bogged down by complex reasoning.
  • Benchmarks: It matches the coding accuracy of the o1-mini model, making it suitable for tasks that need swift responses without deep reasoning. This is particularly useful in environments where speed is prioritized over depth.
  • Ideal Applications: Basic data entry, quick responses to FAQs, and simple queries in customer service applications.

2. Medium Reasoning Mode

  • Speed: Specific processing times are not published, but Medium reasoning mode balances speed and accuracy, making it a versatile choice for a range of tasks.
  • Use Case: This mode suits tasks of moderate complexity, such as data analysis and content generation. It allows a more nuanced approach than Low mode.
  • Benchmarks: It provides improved accuracy over Low mode while keeping processing times reasonable. It is particularly effective for tasks that need a moderate level of detail and analysis.
  • Ideal Applications: Report generation, content creation for blogs and articles, and moderate-complexity tasks in business analytics.

3. High Reasoning Mode

  • Speed: Specific processing times are not published, but High reasoning mode is designed for tasks that require PhD-level precision. It is optimized for depth and thoroughness rather than speed.
  • Use Case: This mode is best for complex problem-solving, strategic planning, and tasks that demand deep understanding and nuanced reasoning. It is ideal for scenarios where accuracy and detail are paramount.
  • Benchmarks: It delivers elite-tier reasoning capabilities, making it suitable for advanced applications in fields like STEM and SEO. This mode excels where high-level reasoning is necessary.
  • Ideal Applications: Advanced STEM applications, SEO, in-depth research projects, and strategic business planning.
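
As a rough illustration of the guidance above, the choice of level can be expressed as a simple lookup. This is a minimal sketch: the task category names and the `pick_reasoning_effort` helper are hypothetical and not part of any OpenAI or LangChain API. Only the three effort values ("low", "medium", "high") come from the article.

```python
# Hypothetical mapping from task category to reasoning effort, following the
# "Ideal Applications" bullets above. The category names are illustrative only.
EFFORT_BY_TASK = {
    "faq": "low",
    "data_entry": "low",
    "report_generation": "medium",
    "content_creation": "medium",
    "stem_problem": "high",
    "strategic_planning": "high",
}


def pick_reasoning_effort(task_category: str, default: str = "medium") -> str:
    """Return "low", "medium", or "high" for a (hypothetical) task category."""
    return EFFORT_BY_TASK.get(task_category, default)


print(pick_reasoning_effort("faq"))           # low
print(pick_reasoning_effort("stem_problem"))  # high
print(pick_reasoning_effort("unknown_task"))  # medium (the default)
```

Falling back to "medium" for unknown categories is a design choice of this sketch, reflecting the article's description of Medium as the balanced option.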

Hands-on with o3-mini Reasoning Levels

We test each level on an AIME 2024 question. AIME tests mathematical problem-solving across arithmetic, algebra, counting, geometry, number theory, probability, and other secondary school math topics.

AIME 2024

Low reasoning mode

from google.colab import userdata  # assumes Google Colab, with the API key stored as a secret
from langchain_openai import ChatOpenAI
import time

# Initialize the model
llm = ChatOpenAI(model="o3-mini", openai_api_key=userdata.get('OPENAI_API_KEY'))
# Start time calculation
start_time = time.time()
# Streaming response (raw string so that \sqrt and \frac keep their backslashes)
for token in llm.stream(
    r""" Let $ABCD$ be a tetrahedron such that $AB=CD= \sqrt{41}$, $AC=BD= \sqrt{80}$, and $BC=AD= \sqrt{89}$.
     There exists a point $I$ inside the tetrahedron such that the distances from $I$ to each of the faces of the tetrahedron are all equal.
     This distance can be written in the form $\frac{m \sqrt n}{p}$, where $m$, $n$, and $p$ are positive integers, $m$ and $p$ are relatively prime,
     and $n$ is not divisible by the square of any prime. Find $m+n+p$. """,
    reasoning_effort="low",
):
    print(token.content, end="")
# End time calculation
end_time = time.time()
# Display the duration
print(f"\nTime taken: {end_time - start_time:.2f} seconds")

Output

We will show that if one “labels” the tetrahedron by

  AB = CD = √41, AC = BD = √80, BC = AD = √89,

then one may prove that the unique point I inside the tetrahedron at which the distances to the four faces are equal is exactly the incenter, and its distance to the faces (i.e. the inradius r) can be written in the form

  r = (m√n)/p        (1)

with m, n, and p positive integers, m and p relatively prime and n square-free. In this particular problem one may show that

  m = 2, n = 21, p = 3,

so that

  r = (2√21)/3    (2)

and hence

  m+n+p = 2+21+3 = 26.

In what follows we describe one approach to the solution.

──────────────────────────────

Outline of one solution

(1) One may “fix” A, B, C in 3-space by introducing coordinates. For example, take

  A = (0,0,0),

  B = (√41,0,0),

  C = (x,y,0)

and choose x and y so that the side-lengths

  AC = √80 and BC = √89

hold. A short calculation shows that one may take

  x = 16/√41 and y = (12√21)/√41.

(2) Then one writes D = (u,v,w) and uses the distances

  AD = √89, BD = √80, CD = √41.

One finds, after one elementary computation, that

  u = 25/√41  and one obtains (after a few more lines of algebra) an expression for v and for w.

(3) The volume V of the tetrahedron is now found by a determinant formula; one may show that

  V = (1/6)|det(AB,AC,AD)| = 2√21 |w|.

(4) Next, one computes the areas of the four faces from Heron's formula. (All four faces happen to have the same side-lengths {√41,√80,√89}; one may check that these three numbers satisfy the triangle inequalities.) In particular, one may show that the area of triangle ABC is

  Δ = (√a)/4     (with a positive integer a) and similarly for the other faces so that the total surface area is S.

(5) Finally one uses the classical relation between the tetrahedron's volume, its surface area, and its inradius r,

  r = 3V/S.

After a little elementary (if somewhat lengthy) algebra one finds that

  r = (2√21)/3.

Thus in (1) we have m = 2, n = 21, and p = 3 so that

  m+n+p = 2+21+3 = 26.

──────────────────────────────

Remark

Many contestants solved the problem by “putting coordinates” on the vertices and calculating the volume and face-areas so as to obtain r = (3V)/S. (In our solution we obtained r = (2√21)/3.) There are (at least) a few different approaches; one may even find a solution by “guessing” that (when expressed in lowest terms) the inradius can be written in the form (2√21)/3.

──────────────────────────────

Answer

Thus, the answer to the problem is 26.

(Notice that some published solutions have m+n+p = 26; our answer is equivalent to saying that the distance from I to each face is (2√21)/3.)

Time taken: 10.72 seconds

The Verdict

  • Speed: Fastest of the three, at about 10 seconds in our example.
  • Accuracy: Failed to solve the problem. It gave an answer of 26, which is incorrect (the correct answer is 104).
  • Structure/Reasoning: Provided a high-level outline of the solution approach, mentioning key steps such as coordinate assignment, volume calculation, and the inradius formula. However, it skipped or botched the actual calculations and asserted r = (2√21)/3 without deriving it. It seemed to “know” the steps but couldn't “do” the math.
  • Characteristics: Good for quickly producing outlines or initial drafts of solutions, but unreliable for complex calculations. It is like a student who understands the concepts but makes a lot of arithmetic errors.
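
The Low-mode claim can be checked numerically. The sketch below places the tetrahedron on integer coordinates inside a 4 × 5 × 8 box (a choice made for this sketch, not taken from the article; the assertions confirm it reproduces the given edge lengths), then computes the inradius from r = 3V/S and compares it with both candidate values.

```python
import math

# Integer coordinates chosen for this sketch (not from the article): four
# alternating corners of a 4 x 5 x 8 box. The assertions below confirm that
# they reproduce the edge lengths in the problem statement.
A, B, C, D = (0, 0, 0), (4, 5, 0), (4, 0, 8), (0, 5, 8)


def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dist_sq(p, q):
    d = sub(p, q)
    return dot(d, d)


def face_area(p, q, r):
    # Half the length of the cross product of two edge vectors.
    n = cross(sub(q, p), sub(r, p))
    return 0.5 * math.sqrt(dot(n, n))


# Squared edge lengths: AB = CD = sqrt(41), AC = BD = sqrt(80), BC = AD = sqrt(89).
assert dist_sq(A, B) == dist_sq(C, D) == 41
assert dist_sq(A, C) == dist_sq(B, D) == 80
assert dist_sq(B, C) == dist_sq(A, D) == 89

# Volume from the scalar triple product, surface area from the four faces.
volume = abs(dot(sub(B, A), cross(sub(C, A), sub(D, A)))) / 6
surface = (face_area(A, B, C) + face_area(A, B, D)
           + face_area(A, C, D) + face_area(B, C, D))
inradius = 3 * volume / surface

low_mode_claim = 2 * math.sqrt(21) / 3    # would give m + n + p = 26
expected_value = 20 * math.sqrt(21) / 63  # gives m + n + p = 104

print(f"computed r      = {inradius:.6f}")
print(f"Low-mode claim  = {low_mode_claim:.6f}")
print(f"20*sqrt(21)/63  = {expected_value:.6f}")
```

The computed inradius agrees with 20√21/63 and is less than half of the value Low mode asserted, so the error is in the claimed result, not only in its presentation.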

Medium reasoning mode

from google.colab import userdata  # assumes Google Colab, with the API key stored as a secret
from langchain_openai import ChatOpenAI
import time

# Initialize the model
llm = ChatOpenAI(model="o3-mini", openai_api_key=userdata.get('OPENAI_API_KEY'))
# Start time calculation
start_time = time.time()
# Streaming response (raw string so that \sqrt and \frac keep their backslashes)
for token in llm.stream(
    r""" Let $ABCD$ be a tetrahedron such that $AB=CD= \sqrt{41}$, $AC=BD= \sqrt{80}$, and $BC=AD= \sqrt{89}$.
     There exists a point $I$ inside the tetrahedron such that the distances from $I$ to each of the faces of the tetrahedron are all equal.
     This distance can be written in the form $\frac{m \sqrt n}{p}$, where $m$, $n$, and $p$ are positive integers, $m$ and $p$ are relatively prime,
     and $n$ is not divisible by the square of any prime. Find $m+n+p$. """,
    reasoning_effort="medium",
):
    print(token.content, end="")
# End time calculation
end_time = time.time()
# Display the duration
print(f"\nTime taken: {end_time - start_time:.2f} seconds")

Output

We wish to find the (common) distance r from the unique interior point I (the incenter) to each face of tetrahedron ABCD whose edge-lengths are

  AB = CD = √41, AC = BD = √80, BC = AD = √89.

A useful fact for any tetrahedron is that if r is the inradius (the distance from the incenter to any face), then

  V = (r⁄3) S,

where V is the volume and S is the total surface area. Thus,

  r = (3V)⁄S.

The plan is as follows:

 1. Compute the area of one face.

 2. Compute the total surface area S.

 3. Compute the volume V.

 4. Find r, then write it in the form (m√n)/p and compute m+n+p.

Step 1. Finding the Area of a Face

Note that every face is congruent. For example, face △ABC has

  AB = √41, AC = √80, BC = √89.

Its area may be obtained from Heron's formula. (Sometimes it is more efficient to use a formula in terms of squared edge-lengths.)

A well-known formula is:

  Area² = (1/16)[(a² + b² + c²)² – 2(a⁴ + b⁴ + c⁴)],

with a, b, c representing the side-lengths. Here, set

  a² = 41, b² = 80, c² = 89.

Then

  a² + b² + c² = 41 + 80 + 89 = 210.

Also,

  a⁴ = 41² = 1681,

  b⁴ = 80² = 6400,

  c⁴ = 89² = 7921,

so

  a⁴ + b⁴ + c⁴ = 1681 + 6400 + 7921 = 16,002.

Thus,

  Area² = (1/16)[210² – 2·16,002] = (1/16)[44,100 – 32,004] = (1/16)(12,096) = 756.

Taking the square root gives

  Area = √756 = √(36·21) = 6√21.

Step 2. Total Surface Area S

Since every face has area 6√21 and there are four faces,

  S = 4·6√21 = 24√21.

Step 3. Computing the Volume V

A good method is to set up coordinates. One possible assignment is as follows:

• Let A = (0, 0, 0).

• Place B on the x-axis so that AB = √41. Take

  B = (√41, 0, 0).

• Place C in the xy-plane. Let C = (x, y, 0). Since

  AC = √80 and BC = √89,

we have

  x² + y² = 80       (1)

  (x – √41)² + y² = 89.   (2)

Subtract (1) from (2):

  (x – √41)² – x² = 9

Expanding,

  (x² – 2√41·x + 41) – x² = –2√41·x + 41 = 9.

Thus,

  –2√41·x = 9 – 41 = –32 ⟹ x = 16/√41.

Then from (1):

  y² = 80 – (16/√41)² = 80 – 256/41 = (80·41 – 256)/41 = (3280 – 256)/41 = 3024/41.

So, taking the positive square root,

  y = (12√21)/√41       (since 12²·21 = 144·21 = 3024).

Thus, we may take

  C = (16/√41, 12√21/√41, 0).

• Now choose D = (u, v, w) with w > 0. Its distances from A, B, and C are given by:

  AD = √89 ⟹ u² + v² + w² = 89.      (3)

  BD = √80 ⟹ (u – √41)² + v² + w² = 80.  (4)

  CD = √41 ⟹ (u – 16/√41)² + (v – 12√21/√41)² + w² = 41. (5)

Subtract (4) – (3):

  (u – √41)² – u² = 80 – 89 = –9.

Compute:

  (u² – 2√41·u + 41) – u² = –2√41·u + 41 = –9 ⟹ u = 25/√41.

Then (3) gives:

  v² + w² = 89 – (25/√41)² = 89 – 625/41 = (89·41 – 625)/41 = (3649 – 625)/41 = 3024/41. (6)

Now use (5). With u = 25/√41,

  (u – 16/√41) = (9/√41).

So (5) becomes:

  (9/√41)² + (v – (12√21)/√41)² + w² = 41.

That is,

  81/41 + [v² – (24√21/√41)v + (144·21)/41] + w² = 41.

But note v²+w² = 3024/41 from (6), and (144·21) = 3024. Hence,

  81/41 + (3024/41) – (24√21/√41)v + 3024/41 = 41.

Thus,

  (81 + 3024 + 3024)/41 – (24√21/√41)v = 41,

so

  6129/41 – (24√21/√41)v = 41.

Multiply both sides by 41:

  6129 – 24√21·(41/√41)v = 41·41.

Since 41/√41 = √41, we have:

  6129 – 24√21√41 · v = 1681,

or

  24√21√41 · v = 6129 – 1681 = 4448,

and hence,

  v = 4448/(24√21√41) = 556/(3√21√41).

We will not need v explicitly; instead, we compute w from (6). From (6),

  w² = 3024/41 – v².

It turns out (after some algebra) that one obtains

  w² = 6400/189.

(For readers: after writing 3024/41 with denominator 9·861 and noticing that 861 = 21·41, a cancellation occurs leading to this result.)

Thus,

  w = √(6400/189) = 80/√189 = 80/(3√21),

since 189 = 9·21 so that √189 = 3√21.

Now, the volume V can be computed using the scalar triple product. With

  AB = B – A = (√41, 0, 0),

  AC = C – A = (16/√41, 12√21/√41, 0),

  AD = D – A = (25/√41, v, w),

we have

  V = (1/6)|AB · (AC × AD)|.

A careful calculation shows that the only nonzero contribution comes from the z-component, leading to

  determinant = 12√21 · w.

Thus,

  V = (1/6)·12√21·w = 2√21·w.

Substitute w = 80/(3√21):

  V = 2√21 · (80/(3√21)) = 160/3.

Step 4. Finding r and the Final Answer

Recall that r = (3V)/S. We have:

  V = 160/3 and S = 24√21.

Thus,

  r = [3·(160/3)]/(24√21) = 160/(24√21) = 20/(3√21).

It is customary to rationalize the denominator:

  r = (20√21)/(3·21) = (20√21)/63.

This is of the form (m√n)/p with m = 20, n = 21, and p = 63. (One easily checks that 20 and 63 are relatively prime and 21 is square-free.)

The final answer requested is m + n + p:

  20 + 21 + 63 = 104.

Thus, the answer is 104.

Time taken: 34.54 seconds

The Verdict

  • Speed: Intermediate, at about 34 seconds in our example.
  • Accuracy: Solved the problem, arriving at the correct answer of 104.
  • Structure/Reasoning: Provided a much more detailed and accurate step-by-step solution. It correctly applied Heron's formula, set up the coordinate system, and calculated the volume and inradius. The reasoning was sound, although one intermediate step (the value of w²) was stated rather than derived.
  • Characteristics: A good balance of speed and accuracy. Suitable for problems that need some detailed reasoning and calculation.
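
The Medium-mode output quotes the relation V = (r/3)S without justification. One way to see it is to join the incenter I to the four vertices, which splits the tetrahedron into four pyramids, each with a face as its base and height r. The derivation below is a sketch in which the face areas A_1, ..., A_4 are notation introduced here, not in the original output. The second line substitutes the values V = 160/3 and S = 24√21 from the solution above.

```latex
\begin{align*}
V &= \sum_{i=1}^{4} \tfrac{1}{3}\, r\, A_i
   = \tfrac{1}{3}\, r \sum_{i=1}^{4} A_i
   = \tfrac{1}{3}\, r\, S
   \quad\Longrightarrow\quad r = \frac{3V}{S}, \\[4pt]
r &= \frac{3 \cdot \tfrac{160}{3}}{24\sqrt{21}}
   = \frac{160}{24\sqrt{21}}
   = \frac{20}{3\sqrt{21}}
   = \frac{20\sqrt{21}}{63},
   \qquad m + n + p = 20 + 21 + 63 = 104.
\end{align*}
```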

High reasoning mode

from google.colab import userdata  # assumes Google Colab, with the API key stored as a secret
from langchain_openai import ChatOpenAI
import time

# Initialize the model
llm = ChatOpenAI(model="o3-mini", openai_api_key=userdata.get('OPENAI_API_KEY'))
# Start time calculation
start_time = time.time()
# Streaming response (raw string so that \sqrt and \frac keep their backslashes)
for token in llm.stream(
    r""" Let $ABCD$ be a tetrahedron such that $AB=CD= \sqrt{41}$, $AC=BD= \sqrt{80}$, and $BC=AD= \sqrt{89}$.
     There exists a point $I$ inside the tetrahedron such that the distances from $I$ to each of the faces of the tetrahedron are all equal.
     This distance can be written in the form $\frac{m \sqrt n}{p}$, where $m$, $n$, and $p$ are positive integers, $m$ and $p$ are relatively prime,
     and $n$ is not divisible by the square of any prime. Find $m+n+p$. """,
    reasoning_effort="high",
):
    print(token.content, end="")
# End time calculation
end_time = time.time()
# Display the duration
print(f"\nTime taken: {end_time - start_time:.2f} seconds")

Output

We are given a tetrahedron ABCD whose six edge-lengths are as follows (writing them squared for convenience):

  AB² = 41, AC² = 80, AD² = 89,

  BC² = 89, BD² = 80, CD² = 41.

Notice that in every face the three edge-lengths are (in some order) √41, √80, and √89. (For example, face ABC has AB = √41, AC = √80, and BC = √89.) Hence all four faces are congruent triangles.

Step 1. Find the area of one face

A well-known formula for the area of a triangle when the side-lengths are given is

  Area² = (1/16)[2(a²b² + a²c² + b²c²) – (a⁴ + b⁴ + c⁴)].

If we set

  a² = 41, b² = 80, c² = 89,

then

  a²b² = 41·80 = 3280,

  a²c² = 41·89 = 3649,

  b²c² = 80·89 = 7120.

Thus

  2(a²b² + a²c² + b²c²) = 2(3280 + 3649 + 7120) = 2(14049) = 28098.

Also,

  a⁴ = 41² = 1681,

  b⁴ = 80² = 6400,

  c⁴ = 89² = 7921,

so

  a⁴ + b⁴ + c⁴ = 1681 + 6400 + 7921 = 16002.

Thus

  Area² = (1/16)(28098 − 16002) = (1/16)(12096) = 756.

Taking the square root gives

  Area = √756 = √(36·21) = 6√21.

Since there are four faces, the total surface area is

  S = 4 × (6√21) = 24√21.

Step 2. Find the volume of the tetrahedron

A useful way to compute the volume is to “set up” coordinates so that we can compute a scalar triple product. One may show (by a careful choice of coordinates) that the tetrahedron can be arranged with

  A = (0, 0, 0),

  B = (√41, 0, 0),

  C = (16/√41, (12√21)/√41, 0),

  D = (25/√41, ?, f).

Here the coordinates of C are forced by the conditions

  AC = √80 and BC = √89,

and a computation shows that the numbers work out if the x-coordinate of D is 25/√41 and the (positive) z-coordinate of D turns out to be

  f = 80/(3√21).

A computation of the scalar triple product gives a nice answer. In fact, if one computes the vectors

  AB = B – A = (√41, 0, 0),

  AC = C – A = (16/√41, 12√21/√41, 0),

  AD = D – A = (25/√41, (something), f),

then (by a fortunate cancellation in the 3×3 determinant) one finds that

  V = (1/6)|det(AB,AC,AD)| = 2√21 · f.

Since we have f = 80/(3√21), it follows that

  V = 2√21 · (80/(3√21)) = 160/3.

Step 3. Find the inradius

A well-known formula for the radius r of the sphere inscribed in a tetrahedron is

  r = (3V)/S.

With V = 160/3 and S = 24√21 we have

  r = (3*(160/3))/(24√21) = 160/(24√21) = 20/(3√21).

It is common to express this in “rationalized” form. Multiplying numerator and denominator by √21 gives

  r = 20√21/(3·21) = 20√21/63.

Step 4. Write the answer in the required form and add m + n + p

The problem asks for r in the form

  r = (m√n)/p,

with m and p relatively prime and n square-free. We have

  m = 20, n = 21, p = 63.

(One may check that 20 and 63 are relatively prime and that 21 = 3·7 is square-free.)

Thus, the final answer is

  m + n + p = 20 + 21 + 63 = 104.

Answer: 104.

Time taken: 33.18 seconds

The Verdict

  • Speed: Expected to be the slowest of the three, but it took about 33 seconds in our example, slightly faster than Medium (about 34 seconds). This could be due to variations in server load or other factors, and a single run is not enough to rank the two.
  • Accuracy: Solved the problem correctly with the answer of 104.
  • Structure/Reasoning: Like Medium mode, it provided a detailed and correct step-by-step solution, although with slightly different intermediate steps. It also used Heron's formula and the coordinate method, but stated the coordinates of D without deriving them.
  • Characteristics: Designed for the most complex problems, which need the highest degree of accuracy and detailed reasoning. Generally the slowest, but can be comparable to Medium in speed on some problems.
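
One of the "slightly different intermediate steps" is the area formula: Medium and High used two different-looking expressions for Area² in terms of squared side lengths. They are algebraically the same, because (a² + b² + c²)² = a⁴ + b⁴ + c⁴ + 2(a²b² + a²c² + b²c²). The sketch below checks with exact arithmetic that both forms give 756 for this triangle. The function names are made up for illustration.

```python
from fractions import Fraction


def area_sq_medium_form(a2: int, b2: int, c2: int) -> Fraction:
    """Area^2 = (1/16)[(a^2 + b^2 + c^2)^2 - 2(a^4 + b^4 + c^4)]."""
    return Fraction((a2 + b2 + c2) ** 2 - 2 * (a2**2 + b2**2 + c2**2), 16)


def area_sq_high_form(a2: int, b2: int, c2: int) -> Fraction:
    """Area^2 = (1/16)[2(a^2 b^2 + a^2 c^2 + b^2 c^2) - (a^4 + b^4 + c^4)]."""
    return Fraction(
        2 * (a2 * b2 + a2 * c2 + b2 * c2) - (a2**2 + b2**2 + c2**2), 16
    )


# Squared side lengths of each face: 41, 80, 89.
print(area_sq_medium_form(41, 80, 89))  # 756
print(area_sq_high_form(41, 80, 89))    # 756
```

Since 756 = 36·21, each face has area 6√21, matching both outputs.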


Tabular Comparison of o3-mini Reasoning Levels

Feature | Low Reasoning | Medium Reasoning | High Reasoning
Speed | Fastest (~10s) | Intermediate (~34s) | Comparable to Medium (~33s); usually the slowest
Accuracy | Incorrect (26) | Correct (104) | Correct (104)
Reasoning/Structure | High-level outline, flawed calculations | Detailed, accurate step-by-step solution | Detailed, accurate step-by-step solution
Use Case | Rapid prototyping, quick drafts | Problems requiring moderate detail/calculation | Complex problems, high accuracy requirements
Calculation Ability | Weak | Strong | Strong
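
The accuracy row can be reproduced mechanically by pulling the final integer out of each response and comparing it with the known answer. This is a minimal sketch: `extract_final_answer` and its regular expression are hypothetical helpers tuned to the three closing sentences shown above, and they would need adjusting for other phrasings.

```python
import re

# Matches phrasings such as "the answer is 104", "Answer: 104", and
# "the answer to the problem is 26". An assumption of this sketch.
ANSWER_PATTERN = re.compile(
    r"answer(?:\s+to\s+the\s+problem)?\s*(?:is|:)\s*(\d+)", re.IGNORECASE
)


def extract_final_answer(text: str):
    """Return the last stated integer answer in the text, or None."""
    matches = ANSWER_PATTERN.findall(text)
    return int(matches[-1]) if matches else None


EXPECTED = 104  # correct answer to the AIME problem used in this article

# Closing sentences and timings copied from the three runs above.
runs = {
    "low": ("Thus, the answer to the problem is 26.", 10.72),
    "medium": ("Thus, the answer is 104.", 34.54),
    "high": ("Answer: 104.", 33.18),
}

for level, (closing_text, seconds) in runs.items():
    answer = extract_final_answer(closing_text)
    verdict = "correct" if answer == EXPECTED else "incorrect"
    print(f"{level:<6} {answer:>4}  {verdict:<9}  {seconds:.2f}s")
```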

Final Verdict on o3-mini Reasoning Levels

This experiment shows a trade-off between speed and accuracy in OpenAI's o3-mini reasoning modes. Low mode prioritized speed, producing a fast but incorrect answer (26, in about 10 seconds). Both Medium and High solved the problem correctly (104), taking about 34 and 33 seconds respectively. Although High is intended for the most complex reasoning, it performed about the same as Medium on this problem, and the one-second difference is small enough that server load or other run-to-run variation could explain it. For tasks that demand precise calculations, Medium or High reasoning is therefore essential, with Medium potentially offering a better balance of speed and accuracy for many practical applications.
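
Each level was timed only once here, so the Medium versus High ordering is not conclusive. A sketch of how repeated timings could be summarized follows. `time_runs` is a hypothetical helper, and the workload passed to it is a stand-in; in practice the callable would wrap the streaming call shown earlier.

```python
import statistics
import time


def time_runs(fn, repeats: int = 5):
    """Call fn() `repeats` times; return (mean, sample std dev) in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    # The sample standard deviation is undefined for a single measurement.
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return statistics.mean(durations), spread


# Stand-in workload so the sketch runs without an API key.
mean_s, spread_s = time_runs(lambda: sum(range(100_000)), repeats=5)
print(f"mean {mean_s:.4f}s, std dev {spread_s:.4f}s over 5 runs")
```

If the spread across runs is larger than the gap between two levels, the timings alone do not distinguish them.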


Conclusion

The o3-mini reasoning levels give developers flexibility, letting them choose the appropriate mode based on their specific needs:

  • Low Reasoning Mode is best for speed and efficiency in straightforward tasks.
  • Medium Reasoning Mode strikes a balance between speed and accuracy for moderately complex tasks.
  • High Reasoning Mode excels in scenarios requiring deep reasoning and high precision.

This versatility is a significant advantage for developers looking to optimize workflows and improve productivity across applications.

Harsh Mishra is an AI/ML Engineer who spends more time talking to Large Language Models than actual humans. Passionate about GenAI, NLP, and making machines smarter (so they don't replace him just yet). When not optimizing models, he's probably optimizing his coffee intake. 🚀☕
