CV VideoPlayer: Once and For All | by Daniel Tomer | Dec, 2024

A Python video player package made for computer vision research

Image by author

When developing computer vision algorithms, the journey from concept to working implementation often involves countless iterations of watching, analyzing, and debugging video frames. As I dove deeper into computer vision projects, I found myself repeatedly writing the same boilerplate code for video visualization and debugging.

At some point, I decided enough was enough, so I created CV VideoPlayer, a Python-based open-source video player package designed specifically for computer vision practitioners, to solve this problem once and for all.

CV VideoPlayer “Double frame mode” with added visualizations and keyboard shortcuts. Image by author

If you’ve ever developed an algorithm for video analysis, you’ve probably written some version of the following code to help you visualize and debug it:

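import cv2

cap = cv2.VideoCapture("<video_path>")
while True:
    ret, frame = cap.read()
    if not ret:  # end of the video or a failed read
        break
    algo_output = some_video_analysis_algorithm(frame)
    frame_to_display = visualizer(frame, algo_output)
    cv2.imshow("debug", frame_to_display)  # imshow needs a window name
    cv2.waitKey()  # wait for a key press before moving to the next frame

cap.release()
cv2.destroyAllWindows()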

But in almost all projects I’ve worked on, this code was rarely enough. As the project went on, I found myself adding more and more functionality to help me understand what was going on.

For instance:

  • Navigation through the video back and forth, frame by frame.
  • The ability to record the output to a file.
  • Supporting sources other than a simple video file (frame folder, stream, remote storage, etc.)
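
The first item, stepping back and forth frame by frame, mostly comes down to keeping a frame index that never leaves the valid range. Here is a minimal sketch, assuming the frame count is known up front; `FrameCursor` and its methods are hypothetical names used for illustration and are not part of any library:

```python
class FrameCursor:
    """Tracks the current frame index and keeps it inside [0, num_frames - 1]."""

    def __init__(self, num_frames: int) -> None:
        if num_frames <= 0:
            raise ValueError("num_frames must be positive")
        self.num_frames = num_frames
        self.index = 0

    def step(self, delta: int) -> int:
        """Move by delta frames (negative goes backwards), clamping at both ends."""
        self.index = max(0, min(self.num_frames - 1, self.index + delta))
        return self.index


cursor = FrameCursor(num_frames=100)
cursor.step(+1)    # one frame forward
cursor.step(-5)    # clamped at the first frame
cursor.step(+250)  # clamped at the last frame
```

With OpenCV, such an index could then be used to seek before reading, for example with `cap.set(cv2.CAP_PROP_POS_FRAMES, index)`, although how accurate seeking is depends on the video container and codec.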

But the thing that annoyed me the most was the lack of interactivity. With this kind of code, the visualization is created before rendering and cannot change once displayed. And, while this is okay for simple algorithms, for the more complex ones there is just way too much information needed for each frame. Without the ability to decide, on the fly, what you want to display, you find yourself running the same video again and again, each time with different visualization parameters.

This course of was tedious and exhausting.

Image by author

CV VideoPlayer was born from the need for a simple, customizable solution for interactively rendering videos and frames. It allows any number of overlays, sidebars, or any other frame edits, each of which can be easily switched on and off by the user during run time. Let’s see an example of how this is done:

Installation

We start by installing the package using pip install cvvideoplayer

Playing vanilla video

We can then import the video player and run an unedited video with the following code:

from cvvideoplayer import create_video_player

VIDEO_OR_FRAME_FOLDER_PATH = "<add local path here>"

video_player = create_video_player(video_source=VIDEO_OR_FRAME_FOLDER_PATH)
video_player.run()

This will open the video player and allow you to play the video with the spacebar or step through it with the arrow keys. It will also add some default built-in frame-edit callbacks, which we will elaborate on in the following section.

Image by author

To add custom-built visualization to the video, we can use the frame_edit_callbacks argument of the create_video_player constructor function like so:

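from cvvideoplayer import create_video_player
# FitFrameToScreen, FrameInfoOverlay and KeyMapOverlay must be imported from
# the package as well; check its documentation for the exact module path.

VIDEO_OR_FRAME_FOLDER_PATH = "<add local path here>"

video_player = create_video_player(
    video_source=VIDEO_OR_FRAME_FOLDER_PATH,
    frame_edit_callbacks=[
        FitFrameToScreen(),
        FrameInfoOverlay(),
        KeyMapOverlay(),
    ]
)
video_player.run()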

When unspecified, the default list will be exactly the one in the example above.
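
One natural way to think about such a list is as a chain: each enabled callback receives the frame produced by the previous one. The sketch below illustrates only that idea; it is not the package's implementation, and `ToggleableEdit` and `apply_edits` are hypothetical names. Frames are plain strings here so the example runs without any dependencies:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToggleableEdit:
    """A frame edit that can be switched on and off (hypothetical stand-in)."""
    name: str
    edit: Callable[[str], str]
    enabled: bool = True


def apply_edits(frame: str, edits: List[ToggleableEdit]) -> str:
    """Pass the frame through every enabled edit, in list order."""
    for e in edits:
        if e.enabled:
            frame = e.edit(frame)
    return frame


edits = [
    ToggleableEdit("upper", str.upper),
    ToggleableEdit("brackets", lambda f: f"[{f}]"),
]
both_on = apply_edits("frame", edits)      # both edits applied
edits[0].enabled = False                   # switch the first edit off
second_only = apply_edits("frame", edits)  # only the brackets edit applied
```

Because every edit carries its own flag, switching one off at run time changes the next rendered frame without touching the rest of the chain.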

Built-in callbacks

There are a bunch of built-in callbacks to use, such as:

  • FitFrameToScreen: Automatically resizes the frame to fit the screen size.
  • FrameInfoOverlay: Prints the frame number and original frame resolution in the top left corner.
  • KeyMapOverlay: Automatically detects and prints all available keyboard shortcuts (including those added by the user).
  • DetectionCsvPlotter: Plots bounding boxes specified in a CSV with the following header: frame_id, label, x1, y1, width, height, score
  • FrameNormlizer: Allows the user to adjust the dynamic range of the image.
  • HistogramEqulizer: Applies histogram equalization to the frame.

And more are added with each version.
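
As an illustration of the CSV layout that DetectionCsvPlotter reads, the sketch below groups detections by frame using only the standard library. It assumes the columns are named frame_id, label, x1, y1, width, height, score with no spaces after the commas, and that x1, y1 is the top-left corner of the box; neither detail is confirmed here. `load_detections` is a hypothetical helper, not part of the package:

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, Tuple

# (label, x1, y1, width, height, score)
Detection = Tuple[str, float, float, float, float, float]


def load_detections(csv_file) -> Dict[int, List[Detection]]:
    """Read detection rows from an open text file and group them by frame_id."""
    per_frame: Dict[int, List[Detection]] = defaultdict(list)
    for row in csv.DictReader(csv_file):
        per_frame[int(row["frame_id"])].append((
            row["label"],
            float(row["x1"]), float(row["y1"]),
            float(row["width"]), float(row["height"]),
            float(row["score"]),
        ))
    return dict(per_frame)


# In-memory sample standing in for a real CSV file
sample = io.StringIO(
    "frame_id,label,x1,y1,width,height,score\n"
    "0,car,10,20,50,30,0.9\n"
    "0,person,100,40,20,60,0.8\n"
    "1,car,12,21,50,30,0.95\n"
)
detections = load_detections(sample)
```

Grouping by frame up front means a per-frame lookup is a single dictionary access while the video plays.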

Creating a custom callback

Here is where the usefulness of the package shines. To add your own custom visualization, you create a new class that inherits from BaseFrameEditCallback and implements the edit_frame method, for example:

from typing import List, Optional

import numpy as np
from cvvideoplayer import KeyFunction
# BaseFrameEditCallback is also provided by the cvvideoplayer package

class MyCallback(BaseFrameEditCallback):
    def __init__(
        self,
        enable_by_default: bool = True,
        enable_disable_key: Optional[str] = None,
        additional_keyboard_shortcuts: Optional[List[KeyFunction]] = None,
        **any_other_needed_params
    ):
        super().__init__(
            enable_by_default,
            enable_disable_key,
            additional_keyboard_shortcuts
        )

    def edit_frame(
        self,
        video_player: "VideoPlayer",
        frame: np.ndarray,
        frame_num: int,
        original_frame: np.ndarray,
    ) -> np.ndarray:
        """
        This function receives the displayed frame and should return it
        after it has been altered in any way desirable by the user

        Args:
            video_player: an instance of VideoPlayer
            frame: the frame to be edited and displayed
            frame_num: the number of the current frame
            original_frame: the frame before any alterations

        Returns: the edited frame
        """
        frame = add_any_visualizations(frame)
        return frame

Additionally, you can add setup and teardown methods by overriding these methods in the parent class:

class MyCallback(BaseFrameEditCallback):
    ...
    def setup(self, video_player: "VideoPlayer", frame) -> None:
        """
        Optionally configure more parameters according to the
        first incoming frame
        """

    def teardown(self) -> None:
        """
        Optionally define how the callback should close when the
        video player is closed
        """
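
To make the order of these hooks concrete, here is a toy driver that calls setup once with the first frame, edit_frame for every frame, and teardown at the end. The exact moments at which CV VideoPlayer invokes these methods are an assumption based on the docstrings above; `FakeCallback` and `run_player` are hypothetical stand-ins that do not use the package, and frames are plain strings:

```python
from typing import List, Optional


class FakeCallback:
    """Stand-in for a frame-edit callback; records the order of the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self.width: Optional[int] = None

    def setup(self, frame: str) -> None:
        # Configure something from the first frame, here just its length.
        self.width = len(frame)
        self.calls.append("setup")

    def edit_frame(self, frame: str, frame_num: int) -> str:
        self.calls.append(f"edit_frame:{frame_num}")
        return f"{frame_num}:{frame}"

    def teardown(self) -> None:
        self.calls.append("teardown")


def run_player(frames: List[str], callback: FakeCallback) -> List[str]:
    """Toy loop: setup on the first frame, edit every frame, always tear down."""
    shown = []
    try:
        for frame_num, frame in enumerate(frames):
            if frame_num == 0:
                callback.setup(frame)
            shown.append(callback.edit_frame(frame, frame_num))
    finally:
        callback.teardown()
    return shown


cb = FakeCallback()
shown = run_player(["aaaa", "bbbb"], cb)
```

Putting teardown in a `finally` clause is a design choice of this sketch: resources such as open files get released even if an edit raises midway through the video.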

For each callback, CV VideoPlayer allows you to add custom keyboard shortcuts that can change the visualization it does at run time.

The most basic shortcut is enabling/disabling the callback, and it is created using the enable_disable_key parameter like so:

my_callback = MyCallback(
    enable_disable_key="ctrl+a"
)

The string passed here can be any combination of modifiers (ctrl, alt, and shift) with a letter or number, for example: “ctrl+alt+s”, “g”, “shift+v”, “ctrl+1”, and so on.
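
A shortcut string of this form can be split into a set of modifiers and a single key. The parser below is only a sketch of that idea, not the package's own parsing code; `parse_shortcut` is a hypothetical helper, and accepting exactly one letter or digit as the key is an assumption taken from the description above:

```python
from typing import FrozenSet, Tuple

MODIFIERS = frozenset({"ctrl", "alt", "shift"})


def parse_shortcut(shortcut: str) -> Tuple[FrozenSet[str], str]:
    """Split a string such as 'ctrl+alt+s' into its modifiers and its key."""
    parts = [p.strip().lower() for p in shortcut.split("+")]
    *mods, key = parts
    unknown = set(mods) - MODIFIERS
    if unknown:
        raise ValueError(f"unknown modifier(s): {sorted(unknown)}")
    if len(key) != 1 or not key.isalnum():
        raise ValueError(f"key must be a single letter or digit, got {key!r}")
    return frozenset(mods), key


parsed = parse_shortcut("ctrl+alt+s")
```

Validating the modifiers up front means a misspelling such as "crtl+1" raises a ValueError instead of silently producing a shortcut that can never fire.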

To add shortcuts that change the visualization itself, you can override the additional_keyboard_shortcuts property, which returns a list of KeyFunction dataclass instances.

from typing import List

from cvvideoplayer import KeyFunction

class MyCallback(BaseFrameEditCallback):
    ...
    @property
    def additional_keyboard_shortcuts(self) -> List[KeyFunction]:
        return [
            KeyFunction(
                key="alt+r",
                function=self.a_function_to_modify_the_visualization,
                description="what this does"
            )
        ]

A KeyFunction is constructed using three arguments:

  • The key argument: Same as for enable_disable_key. The string passed here can be any combination of modifiers (ctrl, alt, and shift) with a letter or number, for example: “ctrl+alt+s”, “g”, “shift+v”, “ctrl+1”
  • The description argument: This is used by the KeyMapOverlay callback to print all the available shortcuts on the screen.
  • The function argument: Should be a function that accepts no arguments.

In many cases, the KeyFunction will receive a function that toggles some boolean attribute of the callback, which will change something that the edit_frame method does. So something like:

from typing import List

from cvvideoplayer import KeyFunction

class MyCallback(BaseFrameEditCallback):
    ...
    @property
    def additional_keyboard_shortcuts(self) -> List[KeyFunction]:
        return [
            KeyFunction(
                key="alt+r",
                function=self.a_function_to_modify_the_visualization,
                description="what this does"
            )
        ]

    def a_function_to_modify_the_visualization(self) -> None:
        # Flip the flag that edit_frame checks before drawing
        self._draw_something = not self._draw_something
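
The toggle pattern can be tried out without the package. In the sketch below, `KeyFunctionStub` is a local stand-in that mirrors the three arguments described above, and `press` is a hypothetical dispatcher; how CV VideoPlayer routes key presses internally is not covered in this post:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KeyFunctionStub:
    """Local stand-in mirroring the key / function / description arguments."""
    key: str
    function: Callable[[], None]
    description: str


class OverlayCallback:
    """Toy callback whose edit_frame output depends on a toggleable flag."""

    def __init__(self) -> None:
        self._draw_something = True

    @property
    def additional_keyboard_shortcuts(self) -> List[KeyFunctionStub]:
        return [
            KeyFunctionStub(
                key="alt+r",
                function=self.toggle_drawing,
                description="toggle the extra drawing",
            )
        ]

    def toggle_drawing(self) -> None:
        self._draw_something = not self._draw_something

    def edit_frame(self, frame: str) -> str:
        # Frames are plain strings here to keep the sketch dependency-free
        return frame + " +overlay" if self._draw_something else frame


def press(callback: OverlayCallback, key: str) -> None:
    """Run every shortcut of the callback that is bound to the pressed key."""
    for shortcut in callback.additional_keyboard_shortcuts:
        if shortcut.key == key:
            shortcut.function()


cb = OverlayCallback()
before = cb.edit_frame("frame")  # flag starts enabled
press(cb, "alt+r")
after = cb.edit_frame("frame")   # flag is now disabled
```

Since the bound function takes no arguments, any state it needs has to live on the callback instance, which is why a bound method is the natural thing to pass.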

Many times, I found myself wanting to compare two different visualizations side by side. For example, comparing two detectors, or an algorithm’s output with the original unmodified frame, and so on.

To do that, I added double_frame_mode, which can be turned on by:

video_player = create_video_player(
    ...
    double_frame_mode=True
)

The video at the beginning of this blog is an example of what this mode looks like.

In this mode, you can use “ctrl+1” and “ctrl+2” to decide which frame visualization you want to control with the keyboard.

By default, both frames will have the same callbacks available, but if you want different callbacks for the right frame, you can use the right_frame_callbacks argument to give the right frame a different set of callbacks (the left frame will have the ones passed to the frame_edit_callbacks argument):

video_player = create_video_player(
    ...
    double_frame_mode=True,
    right_frame_callbacks=[callback1, callback2, ...]
)
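
Side-by-side display essentially means running two different edits on the same source frame and joining the results horizontally. Here is a dependency-free sketch, with frames as lists of pixel rows; `render_double_frame` is a hypothetical helper and says nothing about how the package composes its window:

```python
from typing import Callable, List

Frame = List[List[int]]  # rows of grayscale pixel values


def render_double_frame(
    frame: Frame,
    left_edit: Callable[[Frame], Frame],
    right_edit: Callable[[Frame], Frame],
) -> Frame:
    """Apply a different edit to each copy of the frame and join them row by row."""
    left = left_edit([row[:] for row in frame])
    right = right_edit([row[:] for row in frame])
    if len(left) != len(right):
        raise ValueError("both sides must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def invert(f: Frame) -> Frame:
    """Example edit: invert 8-bit grayscale values."""
    return [[255 - p for p in row] for row in f]


def identity(f: Frame) -> Frame:
    """Example edit: leave the frame untouched."""
    return f


source = [[0, 10], [20, 30]]
combined = render_double_frame(source, identity, invert)
```

With NumPy image arrays of equal height, the same join is a single `np.hstack((left, right))` call. Each side gets its own copy of the frame so that an edit on one side cannot leak into the other.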

I hope this tool comes in handy for all of you. If you have any ideas on how to improve it, please let me know in the issues tab on the project’s GitHub page, and don’t forget to leave a star while you’re at it 🙂