5 Pillars for a Hyper-Optimized AI Workflow

Now, let's explore each pillar.

In every AI project there is a certain goal we want to achieve, and ideally, a set of metrics we want to optimize.

These metrics can include:

  • Predictive quality metrics: Accuracy, F1-Score, Recall, Precision, etc.
  • Cost metrics: Actual $ amount, FLOPS, Size in MB, etc.
  • Performance metrics: Training speed, inference speed, etc.

We can choose one metric as our "north star" or create an aggregate metric. For example:

  • 0.7 × F1-Score + 0.3 × (1 / Inference Time in ms)
  • 0.6 × AUC-ROC + 0.2 × (1 / Training Time in hours) + 0.2 × (1 / Cloud Compute Cost in $)
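As a quick illustration, a minimal sketch of such an aggregate score in Python might look like the following (the weights and metric names come from the first hypothetical formula above):

```python
# A minimal sketch of an aggregate "north star" metric, using the
# hypothetical weights from the first formula above.

def aggregate_score(f1_score: float, inference_time_ms: float) -> float:
    """Combine predictive quality and speed into a single number to optimize."""
    return 0.7 * f1_score + 0.3 * (1 / inference_time_ms)

# Example: compare two candidate workflows by their aggregate score.
candidate_a = aggregate_score(f1_score=0.86, inference_time_ms=120)
candidate_b = aggregate_score(f1_score=0.82, inference_time_ms=35)
print(candidate_a, candidate_b)  # pick the configuration with the higher score
```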

There's a great short video by Andrew Ng where he explains the topic of a Single Number Evaluation Metric.

Once we have an agreed-upon metric to optimize and a set of constraints to satisfy, our goal is to build a workflow that maximizes this metric while satisfying our constraints.

In the world of Data Science and AI development, interactivity is key.

As AI Engineers (or whatever title we Data Scientists go by these days), we need to build code that works bug-free across different scenarios.

Unlike traditional software engineering, our role extends beyond writing code that "just" works. A significant part of our work involves examining the data and inspecting our models' outputs and the results of various processing steps.

The most common environment for this kind of interactive exploration is Jupyter Notebooks.

Working inside a notebook allows us to test different implementations, experiment with new APIs, inspect the intermediate results of our workflows, and make decisions based on our observations. This is the core of the second pillar.

However, as much as we enjoy these benefits in our day-to-day work, notebooks can often contain notoriously bad code that can only be executed in a non-trivial order.

In addition, some exploratory parts of the notebook might not be relevant for production settings, making it unclear how these can effectively be shipped to production.

"Production-Ready" can mean different things in different contexts. For one team, it might mean serving results within a specified timeframe. For another, it could refer to the service's uptime (SLA). And for yet another, it might mean the code, model, or workflow has undergone sufficient testing to ensure reliability.

These are all important aspects of shipping reliable products, and the specific requirements may vary from place to place. Since my exploration is focused on the "meta" aspect of building AI workflows, I want to discuss a common denominator across these definitions: wrapping our workflow as a serviceable API and deploying it to an environment where it can be queried by external applications or users.

This means we need a way to abstract the complexity of our codebase into a clearly defined interface that can be used across various use-cases. Let's consider an example:

Imagine a complex RAG (Retrieval-Augmented Generation) system over PDF files that we've developed. It may contain 10 different components, each consisting of hundreds of lines of code.

However, we can still wrap them into a simple API with just two main functions:

  1. upload_document(file: PDF) -> document_id: str
  2. query_document(document_id: str, query: str, output_format: str) -> response: str

This abstraction allows users to:

  1. Upload a PDF document and receive a unique identifier.
  2. Ask questions about the document using natural language.
  3. Specify the desired format for the response (e.g., Markdown, JSON, Pandas DataFrame).

By providing this clean interface, we've effectively hidden the complexities and implementation details of our workflow.
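As a rough sketch of what such an interface could look like (the function bodies here are hypothetical placeholders, not the actual system):

```python
# A minimal sketch of the two-function interface described above.
# The internal steps (parsing, chunking, indexing, retrieval, generation)
# are stand-ins hidden behind the public API.
import uuid

_documents: dict[str, str] = {}  # document_id -> indexed content (simplified stand-in for a real store)


def upload_document(file_path: str) -> str:
    """Parse and index a PDF, returning a unique document identifier."""
    document_id = str(uuid.uuid4())
    # In the real workflow this would run parsing, chunking, embedding, indexing, etc.
    _documents[document_id] = f"indexed contents of {file_path}"
    return document_id


def query_document(document_id: str, query: str, output_format: str = "markdown") -> str:
    """Answer a natural-language question about a previously uploaded document."""
    context = _documents[document_id]  # stand-in for retrieval
    answer = f"Answer to '{query}' based on: {context}"  # stand-in for generation
    # A real implementation would also format the answer according to output_format.
    return answer


doc_id = upload_document("report.pdf")
print(query_document(doc_id, "What are the key findings?"))
```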

Having a systematic way to convert arbitrarily complex workflows into deployable APIs is our third pillar.

In addition, we would ideally want to establish a methodology that ensures our iterative, day-to-day work stays in sync with our production code.

This means that if we make a change to our workflow, whether fixing a bug, adding a new implementation, or even tweaking a configuration, we should be able to deploy those changes to our production environment with just the click of a button.

Another crucial aspect of our methodology is maintaining a Modular & Extensible codebase.

This means we can add new implementations and test them against existing ones that occupy the same logical step, without modifying our existing code or overwriting other configurations.

This approach aligns with the open-closed principle, where our code is open for extension but closed for modification. It allows us to:

  1. Introduce new implementations alongside existing ones
  2. Easily compare the performance of different approaches
  3. Maintain the integrity of our existing working solutions
  4. Extend our workflow's capabilities without risking the stability of the whole system

Let's look at a toy example:

Image by the Author

In this example, we can see (pseudo) code that is modular and configurable. This way, we can easily add new configurations and test their performance:

Image by the Author
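As a hypothetical sketch of what such modular, configurable code might look like (the component names and implementations here are placeholders, not the author's original example), each logical step maps names to interchangeable implementations selected via a config:

```python
# A hypothetical sketch of modular, configurable code: each logical step
# maps names to interchangeable implementations, selected via a config dict.

def bm25_retriever(query: str) -> list[str]:
    return ["chunk retrieved with BM25"]          # stand-in implementation


def dense_retriever(query: str) -> list[str]:
    return ["chunk retrieved with dense search"]  # stand-in implementation


RETRIEVERS = {"bm25": bm25_retriever, "dense": dense_retriever}
LLMS = {
    "model_a": lambda prompt: f"[model_a] {prompt}",  # stand-ins for real LLM clients
    "model_b": lambda prompt: f"[model_b] {prompt}",
}


def run_workflow(query: str, config: dict) -> str:
    """Instantiate one concrete workflow from the available implementations."""
    retriever = RETRIEVERS[config["retriever"]]
    llm = LLMS[config["llm"]]
    context = retriever(query)
    return llm(f"Answer '{query}' using {context}")


# New configurations can be added and tested without touching existing code:
config_a = {"retriever": "bm25", "llm": "model_a"}
config_b = {"retriever": "dense", "llm": "model_b"}
print(run_workflow("What are the key findings?", config_a))
print(run_workflow("What are the key findings?", config_b))
```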

Once our code consists of several competing implementations & configurations, we enter a state that I like to call a "superposition of workflows". In this state, we can instantiate and execute a workflow using a specific set of configurations.
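One way to picture this superposition (again, a hypothetical sketch rather than any specific tooling) is as the cross-product of all available options, any one of which can be instantiated on demand:

```python
# A hypothetical illustration of a "superposition of workflows": the space of
# all combinations of component options, any one of which can be instantiated.
from itertools import product

options = {
    "retriever": ["bm25", "dense"],
    "llm": ["model_a", "model_b"],
    "chunk_size": [256, 512],
}

# Every combination is a workflow we *could* run; executing one "collapses" the superposition.
all_configs = [dict(zip(options, combo)) for combo in product(*options.values())]
print(len(all_configs))  # 8 candidate workflows
print(all_configs[0])    # e.g. {'retriever': 'bm25', 'llm': 'model_a', 'chunk_size': 256}
```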

What if we take modularity and extensibility a step further? What if we apply this approach to entire sections of our workflow?

So now, instead of configuring this LLM or that retriever, we can configure our entire preprocessing, training, or evaluation steps.

Let's look at an example:

Image by the Author

Here we see our entire ML workflow. Now, let's add a new Data Prep implementation and zoom into it:

Image by the Author

When we work in this hierarchical and visual way, we can select a section of our workflow to improve and add a new implementation with the same input/output interface as the existing one.

We can then "zoom in" to that specific section, focusing solely on it without worrying about the rest of the project. Once we're satisfied with our implementation, we can start testing it alongside other configurations in our workflow.
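As a hypothetical illustration of this hierarchical idea (plain Python, not any particular framework), an entire section such as data preparation can itself be a swappable unit with a fixed input/output contract:

```python
# A hypothetical sketch of hierarchical modularity: an entire workflow section
# (here, "data prep") is itself a swappable unit with a fixed input/output contract.
from typing import Callable


def basic_data_prep(raw_rows: list[dict]) -> list[dict]:
    """Existing implementation: minimal cleaning."""
    return [row for row in raw_rows if row.get("text")]


def advanced_data_prep(raw_rows: list[dict]) -> list[dict]:
    """New implementation with the same interface: deduplication + normalization."""
    seen, prepared = set(), []
    for row in raw_rows:
        text = (row.get("text") or "").strip().lower()
        if text and text not in seen:
            seen.add(text)
            prepared.append({**row, "text": text})
    return prepared


# The new implementation is added alongside the existing one, not instead of it.
DATA_PREP: dict[str, Callable[[list[dict]], list[dict]]] = {
    "basic": basic_data_prep,
    "advanced": advanced_data_prep,
}


def run_ml_workflow(raw_rows: list[dict], config: dict) -> list[dict]:
    """Top-level workflow: each section is selected by name from the config."""
    prepared = DATA_PREP[config["data_prep"]](raw_rows)
    # ...training and evaluation sections would be selected the same way...
    return prepared


print(run_ml_workflow([{"text": "Hello "}, {"text": "hello"}], {"data_prep": "advanced"}))
```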

This approach unlocks several benefits:

  1. Reduced mental overload: Focus on one section at a time, providing clarity and reducing complexity in decision-making.
  2. Easier collaboration: A modular structure simplifies task delegation to teammates or AI assistants, with clear interfaces for each component.
  3. Reusability: These encapsulated implementations can be used in different projects, potentially without modification to their source code.
  4. Self-documentation: Visualizing entire workflows and their components makes it easier to understand the project's structure and logic without diving into unnecessary details.

These are the 5 pillars that I've found to form the foundation of a "hyper-optimized AI workflow":

  1. Metric-Based Optimization: Define and optimize clear, project-specific metrics to guide decision-making and workflow improvements.
  2. Interactive Developer Experience: Utilize tools for iterative coding & data inspection, like Jupyter Notebooks.
  3. Production-Ready Code: Wrap complete workflows into deployable APIs and keep development and production code in sync.
  4. Modular & Extensible Code: Structure code to easily add, swap, and test different implementations.
  5. Hierarchical & Visual Structures: Organize projects into visual, hierarchical components that can be independently developed and easily understood at various levels of abstraction.

In the upcoming blog posts, I'll dive deeper into each of these pillars, providing more detailed insights, practical examples, and tools to help you implement these principles in your own AI projects.

Specifically, I intend to introduce the methodology and tools I've built on top of DAGWorks Inc's* Hamilton framework and my own packages: Hypster and HyperNodes (still in its early days).

Stay tuned for more!

*I’m not affiliated with or employed by DAGWorks Inc.