2024 in Review: What I Got Right, Where I Was Wrong, and Bolder Predictions for 2025 | by Leonie Monigatti | Dec, 2024

What I got right (and wrong) about trends in 2024, and daring to make bolder predictions for the year ahead

AI Buzzword and Trend Bingo (Image by the author)

In 2023, building AI-powered applications felt full of promise, but the challenges were already starting to show. By 2024, we began experimenting with techniques to tackle the hard realities of making them work in production.

Last year, I reviewed the biggest trends in AI in 2023 and made predictions for 2024. This year, instead of a timeline, I want to focus on key themes: What trends emerged? Where did I get it wrong? And what can we expect for 2025?

If I had to summarize the AI space in 2024, it would be the “Captain, it’s Wednesday” meme. The number of major releases this year was overwhelming. I don’t blame anyone in this space who is feeling exhausted towards the end of this year. It’s been a crazy ride, and it has been hard to keep up. Let’s review key themes in the AI space and see if I correctly predicted them last year.

Evaluations

Let’s start by looking at some generative AI solutions that made it to production. There aren’t many. As a survey by A16Z revealed in 2024, companies are still hesitant to deploy generative AI in customer-facing applications. Instead, they feel more confident using it for internal tasks, like document search or chatbots.

So, why aren’t there that many customer-facing generative AI applications in the wild? Probably because we are still figuring out how to evaluate them properly. This was one of my predictions for 2024.

Much of the evaluation work involved using another LLM to evaluate the output of an LLM (LLM-as-a-judge). While the approach may be clever, it’s also imperfect due to added cost, introduction of bias, and unreliability.
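To make the LLM-as-a-judge idea concrete, here is a minimal sketch. It assumes a hypothetical `call_llm` callable standing in for whatever model client you use; the prompt wording, the 1-to-5 scale, and all function names are my assumptions for illustration, not a standard API. The judge is stubbed out so the example runs offline.

```python
import re
from typing import Callable, Optional

# Assumed prompt wording and scale; adapt both to your own use case.
JUDGE_PROMPT = (
    "You are grading an answer.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Rate the answer from 1 (poor) to 5 (excellent). Reply as 'Score: <n>'."
)


def judge(question: str, answer: str, call_llm: Callable[[str], str]) -> Optional[int]:
    """Ask a judge model for a score and parse it; return None if unparseable."""
    reply = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"Score:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def fake_judge(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call, so the sketch runs offline."""
    return "Score: 4" if "Paris" in prompt else "I cannot decide."


good = judge("What is the capital of France?", "Paris", fake_judge)
bad = judge("What is the capital of France?", "Lyon", fake_judge)
print(good, bad)
```

The `None` branch is there on purpose: a judge model can return something you cannot parse, which is one face of the unreliability mentioned above. Every call to the judge is also an extra model call, which is where the added cost comes from.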

Looking back, I anticipated we would see this issue solved this year. However, looking at the landscape today, despite evaluation being a major topic of discussion, we still haven’t found a reliable way to evaluate generative AI solutions effectively. Although I think LLM-as-a-judge is the only way we are able to evaluate generative AI solutions at scale, this shows how early we are in this field.

Multimodality

Although this one might have been obvious to many of you, I didn’t have this on my radar for 2024. With the releases of GPT4, Llama 3.2, and ColPali, multimodal foundation models were a big trend in 2024. While we, developers, were busy figuring out how to make LLMs work in our existing pipelines, researchers were already one step ahead. They were already building foundation models that could handle more than one modality.

“There is *absolutely no way in hell* we will ever reach human-level AI without getting machines to learn from high-bandwidth sensory inputs, such as vision.” (Yann LeCun)

Take PDF parsing as an example of multimodal models’ usefulness beyond text-to-image tasks. ColPali’s researchers avoided the difficult steps of OCR and layout extraction by using visual language models (VLMs). Systems like ColPali and ColQwen2 process PDFs as images, extracting information directly without pre-processing or chunking. This is a reminder that simpler solutions often come from changing how you frame the problem.

Multimodal models are a bigger shift than they might seem. Document search across PDFs is just the beginning. Multimodality in foundation models will unlock entirely new possibilities for applications across industries. With more modalities, AI is no longer just about language; it’s about understanding the world.

Fine-tuning open-weight models and quantization

Open-weight models are closing the performance gap to closed models. Fine-tuning them gives you a performance boost while still being lightweight. Quantization makes these models smaller and more efficient (see also Green AI) to run anywhere, even on small devices. Quantization pairs well with fine-tuning, especially since fine-tuning language models is inherently challenging (see QLoRA).
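If quantization feels abstract, the core idea fits in a few lines. The sketch below uses a simple symmetric 8-bit "absmax" scheme in plain Python; this is an illustrative choice on my part and not the 4-bit format that QLoRA uses. The function names and sample weights are made up for the example.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric absmax quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest absolute weight onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in q]


weights = [0.42, -1.27, 0.0, 0.89]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now needs one byte plus a shared scale instead of a 32-bit float, and the rounding error per weight stays within half a quantization step. That trade of a little precision for a lot of memory is what makes running models on small devices feasible.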

Together, these trends make it clear that the future isn’t just bigger models; it’s smarter ones.

Great visual summary by Maxime Labonne. Also, check out his blog if you are interested in fine-tuning LLMs.

I don’t think I explicitly mentioned this one, and I only wrote a small piece on it in the second quarter of 2024. So, I will not give myself a point here.

AI agents

This year, AI agents and agentic workflows gained much attention, as Andrew Ng predicted at the beginning of the year. We saw Langchain and LlamaIndex move into incorporating agents, CrewAI gained a lot of momentum, and OpenAI came out with Swarm. This is another topic I hadn’t seen coming since I didn’t look into it.

“I think AI agentic workflows will drive massive AI progress this year, perhaps even more than the next generation of foundation models.” (Andrew Ng)

Screenshot from Google Trends for the term “AI agents” in 2024.

Despite the massive interest in AI agents, they can be controversial. First, there is still no clear definition of “AI agent” and its capabilities. Are AI agents just LLMs with access to tools, or do they have other specific capabilities? Second, they come with added latency and cost. I’ve read many comments saying that agent systems aren’t suitable for production systems because of this.

But I think we have already been seeing some agentic pipelines in production with lightweight workflows, such as routing user queries to specific function calls. I think we will continue to see agents in 2025. Hopefully, we will get a clearer definition and picture.
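Here is roughly what such a lightweight routing workflow could look like. This is a sketch under assumptions: the two tools and their names are hypothetical, and `choose_tool` uses a keyword rule as a stand-in for the step where a real system would ask an LLM to pick a tool.

```python
from typing import Callable


def search_docs(query: str) -> str:
    """Hypothetical tool: look something up in internal documentation."""
    return f"search_docs({query!r})"


def get_order_status(query: str) -> str:
    """Hypothetical tool: fetch the status of a customer order."""
    return f"get_order_status({query!r})"


# Registry of the functions the router is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_docs": search_docs,
    "get_order_status": get_order_status,
}


def choose_tool(query: str) -> str:
    """Stand-in for the LLM routing step: a real system would ask a model
    to pick a tool name; a keyword rule keeps this sketch runnable offline."""
    return "get_order_status" if "order" in query.lower() else "search_docs"


def route(query: str) -> str:
    """Pick a tool for the query, then dispatch to the matching function."""
    tool_name = choose_tool(query)
    return TOOLS[tool_name](query)


print(route("Where is my order 123?"))
print(route("How do I reset my password?"))
```

With a single routing decision per query, the extra latency and cost stay small, which may be why this kind of workflow is easier to put into production than a fully autonomous multi-step agent.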

RAG isn’t de*d and retrieval goes mainstream

Retrieval-Augmented Generation (RAG) gained significant attention in 2023 and remained a key topic in 2024, with many new variants emerging. However, it remains a topic of debate. Some argue it’s becoming obsolete with long-context models, while others question whether it’s even a new idea. While I think the criticism of the terminology is justified, I think the concept is here to stay (for a little while at least).

All the different RAG variants

Whenever a new long-context model is released, some people predict it will be the end of RAG pipelines. I don’t think that’s going to happen. This whole discussion deserves a blog post of its own, so I’m not going into depth here and will save it for another one. Let me just say that I don’t think it’s one or the other. They are complements: we will probably be using long-context models together with RAG pipelines.

Also, having a database in applications is not a new concept. The term ‘RAG,’ which refers to retrieving information from a knowledge source to enhance an LLM’s output, has faced criticism. Some argue it’s merely a rebranding of techniques long used in other fields, such as software engineering. While I think we will probably part from the term in the long run, the technique is here to stay.
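Stripped of the branding, the technique really is small. The following sketch shows the retrieve-then-augment step with a toy word-overlap retriever; the function names, the prompt template, and the two-passage knowledge source are assumptions for illustration, and the final call to an LLM is left out.

```python
def retrieve(query: str, knowledge: list[str], top_k: int = 1) -> list[str]:
    """Rank passages by how many query words they share (a toy retriever)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        knowledge,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, knowledge: list[str]) -> str:
    """Augment the user question with retrieved context before generation."""
    context = "\n".join(retrieve(query, knowledge))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


knowledge = [
    "ColPali processes PDF pages as images.",
    "BM25 is a keyword search ranking function.",
]
prompt = build_prompt("What is BM25?", knowledge)
print(prompt)
```

Whatever ends up in `prompt` is what the LLM would see. Swapping the toy retriever for a database query or a vector search changes the quality of the context, not the shape of the pipeline.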

Despite predictions of RAG’s demise, retrieval remains a cornerstone of AI pipelines. While I may be biased by my work in retrieval, it felt like this topic became more mainstream in AI this year. It started with many discussions around keyword search (BM25) as a baseline for RAG pipelines. It then evolved into a larger discussion around dense retrieval models, such as ColBERT or ColPali.
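For readers who have only seen BM25 as a library call, here is a compact sketch of the scoring function in plain Python. It assumes whitespace tokenization and the commonly used defaults `k1 = 1.5` and `b = 0.75`; the sample documents are made up, and a production system would use a tested search library instead.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates via k1; b normalizes for document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "retrieval augmented generation with keyword search",
    "fine-tuning open-weight language models",
    "dense retrieval with late interaction models",
]
scores = bm25_scores("keyword search", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Documents that share no term with the query score zero, which is both the strength of BM25 as a cheap baseline and the gap that dense retrieval models try to close.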

Knowledge graphs

I completely missed this topic because I’m not too familiar with it. Knowledge graphs in RAG systems (e.g., Graph RAG) were another big topic. Since all I can say about knowledge graphs at this moment is that they seem to be a powerful external knowledge source, I’ll keep this section short.