YOLO (You Only Look Once) has been a leading real-time object detection framework, with each iteration improving upon previous versions. The latest version, YOLO v12, introduces advancements that significantly enhance accuracy while maintaining real-time processing speeds. This article explores the key innovations in YOLO v12, highlighting how it surpasses previous versions while minimizing computational costs without compromising detection efficiency.
What’s New in YOLO v12?
Previously, YOLO models relied on Convolutional Neural Networks (CNNs) for object detection due to their speed and efficiency. However, YOLO v12 incorporates attention mechanisms, a concept widely known and used in Transformer models, which allow it to recognize patterns more effectively. While attention mechanisms have historically been too slow for real-time object detection, YOLO v12 successfully integrates them while maintaining YOLO's speed, resulting in an attention-centric YOLO framework.
Key Improvements Over Previous Versions
1. Attention-Centric Framework
YOLO v12 combines the power of attention mechanisms with CNNs, resulting in a model that is both faster and more accurate. Unlike its predecessors, which relied solely on CNNs, YOLO v12 introduces optimized attention modules to improve object recognition without adding unnecessary latency.
2. Superior Performance Metrics
Comparing performance metrics across different YOLO versions and real-time detection models shows that YOLO v12 achieves higher accuracy while maintaining low latency.
- The mAP (Mean Average Precision) values on datasets like COCO show YOLO v12 outperforming YOLO v11 and YOLO v10 while maintaining comparable speed.
- The model achieves a remarkable 40.6% mAP while processing images in just 1.64 milliseconds on an Nvidia T4 GPU. This performance is superior to YOLO v10 and YOLO v11 without sacrificing speed.
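As a refresher on what mAP is built from: each prediction is matched to ground truth by Intersection over Union (IoU), and predictions above an IoU threshold count as true positives. Below is a minimal, self-contained IoU sketch in plain Python, purely illustrative; real COCO evaluation uses pycocotools, not this toy function.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes overlapping by half their width share 50 of 150 total units:
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.3333333333333333
```

At an IoU threshold of 0.5, this pair would not count as a match; mAP averages the resulting precision over recall levels, classes, and (for COCO) IoU thresholds from 0.5 to 0.95.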
3. Outperforming Non-YOLO Models
YOLO v12 not only surpasses previous YOLO versions; it also outperforms other real-time object detection frameworks, such as RT-DETR and RT-DETR v2. These alternative models have higher latency yet fail to match YOLO v12's accuracy.
Computational Efficiency Improvements
One of the major concerns with integrating attention mechanisms into YOLO models was their high computational cost and memory inefficiency. YOLO v12 addresses these issues through several key innovations:
1. Flash Attention for Memory Efficiency
Traditional attention mechanisms consume a large amount of memory, making them impractical for real-time applications. YOLO v12 adopts FlashAttention, a technique that reduces memory consumption and speeds up inference.
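The details of FlashAttention are beyond this article, but its core idea, computing exact attention block by block with an online softmax so the full N x N score matrix is never materialized, can be illustrated with a toy NumPy sketch (my own illustration of the streaming-softmax trick, not YOLO v12's kernel):

```python
import numpy as np

def naive_attention(q, k, v):
    """Standard softmax attention: materializes the full (n, n) score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def blocked_attention(q, k, v, block=4):
    """Identical result, but streams over key/value blocks with an online softmax,
    keeping only per-row running max and sum instead of the full score matrix."""
    d = q.shape[-1]
    out = np.zeros_like(q)
    running_max = np.full(q.shape[0], -np.inf)
    running_sum = np.zeros(q.shape[0])
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T / np.sqrt(d)                   # scores for this block only
        new_max = np.maximum(running_max, s.max(axis=-1))
        correction = np.exp(running_max - new_max)  # rescale earlier partial sums
        p = np.exp(s - new_max[:, None])
        out = out * correction[:, None] + p @ vb
        running_sum = running_sum * correction + p.sum(axis=-1)
        running_max = new_max
    return out / running_sum[:, None]

rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 16, 8))
print(np.allclose(naive_attention(q, k, v), blocked_attention(q, k, v)))  # True
```

The math is exact, so the speed and memory win comes entirely from never storing the full score matrix, which is why the real kernel is tied to specific GPU (CUDA) setups.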
2. Area Attention for Lower Computation Cost
To further optimize efficiency, YOLO v12 employs Area Attention, which focuses only on relevant areas of an image instead of processing the entire feature map. This approach dramatically reduces computation costs while retaining accuracy.
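The savings can be sketched in NumPy: split the flattened feature map into a few areas and attend only within each one, so the score matrix shrinks from N x N to several (N/l) x (N/l) blocks. This is a toy sketch of the concept under that simplification, not YOLO v12's actual module:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def area_attention(x, num_areas=4):
    """Self-attention restricted to `num_areas` equal slices of the token sequence."""
    n, d = x.shape
    area = n // num_areas
    out = np.empty_like(x)
    for i in range(0, n, area):
        xa = x[i:i + area]                # tokens belonging to one area
        scores = xa @ xa.T / np.sqrt(d)   # (n/l, n/l) instead of (n, n)
        out[i:i + area] = softmax(scores) @ xa
    return out

x = np.random.default_rng(0).standard_normal((64, 16))
print(area_attention(x).shape)  # (64, 16)

# Score-matrix entries: global attention 64 * 64 = 4096,
# area attention with 4 areas: 4 * 16 * 16 = 1024, a 4x reduction.
```

With l areas the attention cost drops roughly by a factor of l, at the price of no cross-area interaction inside that layer.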
3. R-ELAN for Optimized Feature Processing
YOLO v12 also introduces R-ELAN (Re-Engineered ELAN), which optimizes feature propagation, making the model more efficient at handling complex object detection tasks without increasing computational demands.
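One ingredient of R-ELAN is a block-level residual shortcut whose branch output is damped by a small scaling factor, which helps attention-heavy blocks train stably. Below is a generic NumPy sketch of that scaled-residual idea only; the full R-ELAN aggregation layout is defined in the YOLO v12 paper, and the `scale` value here is illustrative:

```python
import numpy as np

def scaled_residual_block(x, transform, scale=0.01):
    """Block-level residual: the branch output is multiplied by a small `scale`,
    so initially the block behaves close to the identity mapping."""
    return x + scale * transform(x)

x = np.random.default_rng(0).standard_normal((8, 32))
y = scaled_residual_block(x, np.tanh)  # any bounded branch works for the demo

# Because tanh is bounded by 1 and scale is 0.01, the output stays near the input:
print(np.abs(y - x).max() < 0.02)  # True
```

Starting near the identity lets gradients flow through the shortcut from the first step, while the branch gradually learns a useful correction.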
YOLO v12 Model Variants
YOLO v12 comes in five variants, catering to different applications:
- N (Nano) & S (Small): Designed for real-time applications where speed is crucial.
- M (Medium): Balances accuracy and speed, suitable for general-purpose tasks.
- L (Large) & XL (Extra Large): Optimized for high-precision tasks where accuracy is prioritized over speed.
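In the Ultralytics API the variant is just a suffix in the weights filename (yolo12n.pt, yolo12s.pt, and so on). Here is a small, hypothetical helper reflecting the guidance above; `pick_variant` and its rules are my own illustration, not part of the library:

```python
VARIANTS = ["n", "s", "m", "l", "x"]  # nano, small, medium, large, extra-large

def pick_variant(realtime: bool, high_precision: bool) -> str:
    """Hypothetical helper: map deployment constraints to a YOLO v12 weights file."""
    if realtime and not high_precision:
        suffix = "n"   # nano: fastest, lowest accuracy
    elif high_precision and not realtime:
        suffix = "x"   # extra-large: most accurate, slowest
    else:
        suffix = "m"   # medium: balanced default
    return f"yolo12{suffix}.pt"

print(pick_variant(realtime=True, high_precision=False))   # yolo12n.pt
print(pick_variant(realtime=False, high_precision=True))   # yolo12x.pt
```

The returned filename could then be passed wherever the snippets below use `model="yolo12s.pt"`.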
Let's Compare the YOLO v11 and YOLO v12 Models
We'll experiment with the YOLO v11 and YOLO v12 small models to understand their performance across various tasks like object counting, heatmaps, and speed estimation.
1. Object Counting
YOLO v11
import cv2
from ultralytics import solutions

cap = cv2.VideoCapture("highway.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)), int(cap.get(cv2.CAP_PROP_FPS)))

# Define region points
region_points = [(20, 1500), (1080, 1500), (1080, 1460), (20, 1460)]  # Lower rectangle region counting

# Video writer (MP4 format)
video_writer = cv2.VideoWriter("object_counting_output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# Init ObjectCounter
counter = solutions.ObjectCounter(
    show=False,  # Disable internal window display
    region=region_points,
    model="yolo11s.pt",
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = counter.count(im0)

    # Resize to fit screen (optional: scale down for large videos)
    im0_resized = cv2.resize(im0, (640, 360))  # Adjust resolution as needed

    # Show the resized frame
    cv2.imshow("Object Counting", im0_resized)
    video_writer.write(im0)

    # Press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
video_writer.release()
cv2.destroyAllWindows()
Output
YOLO v12
import cv2
from ultralytics import solutions

cap = cv2.VideoCapture("highway.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)), int(cap.get(cv2.CAP_PROP_FPS)))

# Define region points
region_points = [(20, 1500), (1080, 1500), (1080, 1460), (20, 1460)]  # Lower rectangle region counting

# Video writer (MP4 format)
video_writer = cv2.VideoWriter("object_counting_output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# Init ObjectCounter
counter = solutions.ObjectCounter(
    show=False,  # Disable internal window display
    region=region_points,
    model="yolo12s.pt",
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = counter.count(im0)

    # Resize to fit screen (optional: scale down for large videos)
    im0_resized = cv2.resize(im0, (640, 360))  # Adjust resolution as needed

    # Show the resized frame
    cv2.imshow("Object Counting", im0_resized)
    video_writer.write(im0)

    # Press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
video_writer.release()
cv2.destroyAllWindows()
Output
2. Heatmaps
YOLO v11
import cv2
from ultralytics import solutions

cap = cv2.VideoCapture("mall_arial.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("heatmap_output_yolov11.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# If you want to apply object counting + heatmaps, you can pass region points.
# region_points = [(20, 400), (1080, 400)]  # Define line points
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360)]  # Define region points
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360), (20, 400)]  # Define polygon points

# Init heatmap
heatmap = solutions.Heatmap(
    show=True,  # Display the output
    model="yolo11s.pt",  # Path to the YOLO11 model file
    colormap=cv2.COLORMAP_PARULA,  # Colormap of heatmap
    # region=region_points,  # If you want to do object counting with heatmaps, you can pass region_points
    # classes=[0, 2],  # If you want to generate a heatmap for specific classes, i.e. person and car
    # show_in=True,  # Display in counts
    # show_out=True,  # Display out counts
    # line_width=2,  # Adjust the line width for bounding boxes and text display
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = heatmap.generate_heatmap(im0)
    im0_resized = cv2.resize(im0, (w, h))
    video_writer.write(im0_resized)

cap.release()
video_writer.release()
cv2.destroyAllWindows()
Output
YOLO v12
import cv2
from ultralytics import solutions

cap = cv2.VideoCapture("mall_arial.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("heatmap_output_yolov12.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# If you want to apply object counting + heatmaps, you can pass region points.
# region_points = [(20, 400), (1080, 400)]  # Define line points
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360)]  # Define region points
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360), (20, 400)]  # Define polygon points

# Init heatmap
heatmap = solutions.Heatmap(
    show=True,  # Display the output
    model="yolo12s.pt",  # Path to the YOLO12 model file
    colormap=cv2.COLORMAP_PARULA,  # Colormap of heatmap
    # region=region_points,  # If you want to do object counting with heatmaps, you can pass region_points
    # classes=[0, 2],  # If you want to generate a heatmap for specific classes, i.e. person and car
    # show_in=True,  # Display in counts
    # show_out=True,  # Display out counts
    # line_width=2,  # Adjust the line width for bounding boxes and text display
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = heatmap.generate_heatmap(im0)
    im0_resized = cv2.resize(im0, (w, h))
    video_writer.write(im0_resized)

cap.release()
video_writer.release()
cv2.destroyAllWindows()
Output
3. Speed Estimation
YOLO v11
import cv2
import numpy as np
from ultralytics import solutions

cap = cv2.VideoCapture("cars_on_road.mp4")
assert cap.isOpened(), "Error reading video file"

# Capture video properties
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("speed_management_yolov11.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# Define speed region points (adjust for your video resolution)
speed_region = [(300, h - 200), (w - 100, h - 200), (w - 100, h - 270), (300, h - 270)]

# Initialize SpeedEstimator
speed = solutions.SpeedEstimator(
    show=False,  # Disable internal window display
    model="yolo11s.pt",  # Path to the YOLO model file
    region=speed_region,  # Pass region points
    # classes=[0, 2],  # Optional: filter specific object classes (e.g., cars, trucks)
    # line_width=2,  # Optional: adjust the line width
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break

    # Estimate speed and draw bounding boxes
    out = speed.estimate_speed(im0)

    # Draw the speed region on the frame
    cv2.polylines(out, [np.array(speed_region)], isClosed=True, color=(0, 255, 0), thickness=2)

    # Resize the frame to fit the screen
    im0_resized = cv2.resize(out, (1280, 720))  # Resize for better screen fit

    # Show the resized frame
    cv2.imshow("Speed Estimation", im0_resized)
    video_writer.write(out)

    # Press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
video_writer.release()
cv2.destroyAllWindows()
Output
YOLO v12
import cv2
import numpy as np
from ultralytics import solutions

cap = cv2.VideoCapture("cars_on_road.mp4")
assert cap.isOpened(), "Error reading video file"

# Capture video properties
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("speed_management_yolov12.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# Define speed region points (adjust for your video resolution)
speed_region = [(300, h - 200), (w - 100, h - 200), (w - 100, h - 270), (300, h - 270)]

# Initialize SpeedEstimator
speed = solutions.SpeedEstimator(
    show=False,  # Disable internal window display
    model="yolo12s.pt",  # Path to the YOLO model file
    region=speed_region,  # Pass region points
    # classes=[0, 2],  # Optional: filter specific object classes (e.g., cars, trucks)
    # line_width=2,  # Optional: adjust the line width
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break

    # Estimate speed and draw bounding boxes
    out = speed.estimate_speed(im0)

    # Draw the speed region on the frame
    cv2.polylines(out, [np.array(speed_region)], isClosed=True, color=(0, 255, 0), thickness=2)

    # Resize the frame to fit the screen
    im0_resized = cv2.resize(out, (1280, 720))  # Resize for better screen fit

    # Show the resized frame
    cv2.imshow("Speed Estimation", im0_resized)
    video_writer.write(out)

    # Press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
video_writer.release()
cv2.destroyAllWindows()
Output
Also Read: Top 30+ Computer Vision Models For 2025
Expert Opinions on YOLOv11 and YOLOv12
Muhammad Rizwan Munawar, Computer Vision Engineer at Ultralytics
“YOLOv12 introduces flash attention, which enhances accuracy, but it requires careful CUDA setup. It’s a solid step forward, especially for complex detection tasks, though YOLOv11 remains faster for real-time needs. In short, choose YOLOv12 for accuracy and YOLOv11 for speed.”
LinkedIn Post – Is YOLOv12 really a state-of-the-art model? 🤪
Muhammad Rizwan recently tested YOLOv11 and YOLOv12 side by side to break down their real-world performance. His findings highlight the trade-offs between the two models:
- Frames Per Second (FPS): YOLOv11 maintains an average of 40 FPS, while YOLOv12 lags behind at 30 FPS. This makes YOLOv11 the better choice for real-time applications where speed is critical, such as traffic monitoring or live video feeds.
- Training Time: YOLOv12 takes about 20% longer to train than YOLOv11. On a small dataset with 130 training images and 43 validation images, YOLOv11 completed training in 0.009 hours, while YOLOv12 needed 0.011 hours. While this might seem minor for small datasets, the difference becomes significant for larger-scale projects.
- Accuracy: Both models achieved similar accuracy after fine-tuning for 10 epochs on the same dataset. YOLOv12 did not dramatically outperform YOLOv11 in terms of accuracy, suggesting the newer model's gains lie more in architectural enhancements than in raw detection precision.
- Flash Attention: YOLOv12 introduces flash attention, a powerful mechanism that speeds up and optimizes attention layers. However, there is a catch: this feature is not natively supported on the CPU, and enabling it with CUDA requires careful version-specific setup. For teams without powerful GPUs or those working on edge devices, this can become a roadblock.
The PC specifications used for testing:
- GPU: NVIDIA RTX 3050
- CPU: Intel Core i5-10400 @ 2.90GHz
- RAM: 64 GB
The model specifications:
- Models: YOLO11n.pt and YOLOv12n.pt
- Image size: 640 for inference
Conclusion
YOLO v12 marks a significant leap forward in real-time object detection, combining CNN speed with Transformer-like attention mechanisms. With improved accuracy, lower computational costs, and a range of model variants, YOLO v12 is poised to redefine the landscape of real-time vision applications. Whether for autonomous vehicles, security surveillance, or medical imaging, YOLO v12 sets a new standard for real-time object detection efficiency.
What's Next?
- YOLO v13 Possibilities: Will future versions push attention mechanisms even further?
- Edge Device Optimization: Can Flash Attention or Area Attention be optimized for lower-power devices?
To help you better understand the differences, I have attached code snippets and output results in the comparison section. These examples illustrate how both YOLOv11 and YOLOv12 perform in real-world scenarios, from object counting to speed estimation and heatmaps. I'm excited to see how you all perceive this new release! Are the improvements in accuracy and attention mechanisms enough to justify the trade-offs in speed? Or do you think YOLOv11 still holds its ground for most applications?