China has carried out it once more with its AI fashions and this time the blow is larger and higher! Baidu – a Chinese language AI firm, just lately launched two massive language fashions (LLMs) – ERNIE 4.5 & X1. Claiming to carry out higher than OpenAI’s newest & best mannequin thus far – GPT-4.5, these fashions are extra cost-efficient than DeepSeek-R1! The fashions appear too good to be true – providing top quality at a fraction of the worth. On this weblog, we’ll discover the ERNIE 4.5 & X1 fashions, consider their benchmark outcomes, and see how they carry out in real-world functions. So, let’s start.
What are ERNIE 4.5 & X1?
ERNIE 4.5 & X1 are the 2 newest multimodal LLMs developed by the main Chinese language tech firm Baidu, specializing in web companies, synthetic intelligence, and autonomous driving. It’s best recognized for its dominant search engine in China and developments in AI-driven improvements. Baidu launched its first LLM, ERNIE 3.0 Titan, again in December 2021. After that, it has launched just a few extra fashions, whereas working concurrently to construct extra strong LLMs. The results of all of the analysis and steady efforts is ERNIE 4.5 & X1.
ERNIE 4.5
ERNIE 4.5 is a multimodal basis mannequin able to understanding and integrating numerous knowledge varieties, together with textual content, photographs, audio, and video. This numerous modeling strategy enhances its means to grasp and generate totally different sorts of content material.
Listed below are a few of the key options of ERNIE 4.5:
- ERNIE 4.5 exhibits complete enhancements in understanding, era, reasoning, and reminiscence over its predecessor, ERNIE 4.0.
- It exhibits nice talents in hallucination prevention, logical reasoning, and coding, making it adept at dealing with advanced duties with greater accuracy.
- The mannequin even performs higher than OpenAI’s GPT-4.5 in a number of benchmarks, whereas it solely prices 1% of what it prices to make use of GPT-4.5!
ERNIE X1
ERNIE X1 is designed as a deep-thinking reasoning mannequin with multimodal capabilities. It’s a primary of its variety deep considering mannequin launched by Baidu. Listed below are a few of its key options:
- ERNIE X1 excels in understanding context, planning its thought course of, reflecting on its response, and evolving over time.
- It’s able to autonomously using numerous instruments for duties equivalent to superior search, picture understanding, and sophisticated calculations.
- The mannequin delivers efficiency on par with DeepSeek-R1 however at half the worth, providing a cheap resolution for enterprises in search of superior AI capabilities.
Tips on how to Entry ERNIE 4.5 & X1?
You possibly can entry ERNIE 4.5 & X1 both by means of their AI Chatbot – ERNIE Bot, or by means of APIs.
Entry through Bot:
Each the fashions are freely accessible to particular person customers on Baidu’s ERNIE Bot platform. Nevertheless, the registration for ERNIE Bot is presently restricted to Chinese language nationals.
Entry through API:
- Head to Baidu AI Cloud’s MaaS platform, Qianfan
- Create your account on the platform to get began.
At the moment, the platform can’t be accessed by all customers. Additionally, solely ERNIE 4.5 is out there through API, whereas ERNIE X1 will quickly be made accessible on the platform.
ERNIE 4.5 & X1 Efficiency Test
On this part, we’ll learn the way these fashions carry out at duties involving multimedia, reasoning, doc evaluation, and extra. Because the mannequin interface solely helps Chinese language language, and the account creation is proscribed to Chinese language nationals, we are going to take a look at some examples of how persons are utilizing the 2 fashions, and the outputs they’ve obtained. We shall be masking a few of the most typical use instances of ERNIE 4.5 & X1 we’ve got discovered on-line, together with:
- Reasoning with Picture Evaluation
- Doc Evaluation and Summarization
- Audio Evaluation
- Creativity and Picture Technology
Job 1: Reasoning + Picture Evaluation
On this activity, the mannequin was requested to resolve a mathematical downside which was given to it within the type of a picture.
Mannequin used: ERNIE 4.5
Output:
Similar to most different multimodal LLMs, ERNIE 4.5 shortly analyses the video and solves the issue within the picture. It takes all of the questions within the picture one after the other, and at last summarizes all of them. The velocity and accuracy of its efficiency make it a great tool for college kids, educators, researchers, and professionals who require quick and correct problem-solving.
Job 2: Doc Evaluation + Summarization
Right here, the mannequin was given a doc and it needed to summarize the knowledge on a specific matter from that doc.
Mannequin used: ERNIE 4.5
Output:
The mannequin permits you to add a number of information of assorted varieties, abruptly. It’s able to processing information of various varieties, together with docs, PDFs, PPTs, Excel sheets, and extra. From the uploaded information, you’ll be able to choose the one (or extra) that you just want to question the chatbot about and the mannequin shortly summarizes the subject. Its fast processing of a number of information may be very helpful for duties like analysis evaluation, authorized doc evaluate, monetary knowledge extraction, and company reporting.
Job 3: Audio Evaluation
For this activity, the mannequin needed to analyze the given audio and discover its supply.
Mannequin used: ERNIE 4.5
Output:
Audio evaluation is a function that not one of the in style AI chatbots have integrated inside their interface, making ERNIE 4.5, the primary of its variety. The mannequin shortly analyzes the clip, determines its supply, after which even goes on to explain the importance of the clip. Its fast evaluation and the detailed description, make it a precious software for duties like real-time transcription, voice-based search, deepfake detection, and sentiment evaluation throughout media, customer support, schooling, and legislation enforcement.
Job 4: Creativity + Picture Technology
For this activity, the mannequin needed to analyze a room and counsel potential decorations that may improve its general attraction. It then needed to generate an up to date picture of the room.
Mannequin used: ERNIE X1
Output:
The mannequin shortly processes the picture. It then suggests the potential enhancements to the room’s decor to boost the general attraction. Lastly, it generates the picture of the room with all of the instructed enhancements. This function is a superb addition for duties like inside designing, house renovation planning, actual property staging, and digital decor visualization.
Word: We’ve taken the examples from this put up on X.
Baidu’s ERNIE 4.5 & X1: Pricing
Each ERNIE 4.5 & X1 have all of the options, and much more, in comparison with the highest fashions by OpenAI, DeepSeek, Grok, Claude, and many others. Here’s a pricing breakdown of the 2 fashions:
Mannequin | Enter Value(per million tokens) | Output Value(per million tokens) | Availability |
ERNIE 4.5 | $0.55 | $2.20 | Obtainable |
ERNIE X1 | $0.28 | $1.10 | Not but accessible |
In comparison with different prime fashions, ERNIE 4.5 & X1 are considerably cheaper, making them a precious asset within the development of generative AI.

ERNIE 4.5 & X1: Commonplace Benchmark Outcomes
We’ve already seen the options, capabilities, and the pricing of the newest ERNIE fashions. Now let’s take a look at some efficiency numbers of those fashions towards prime fashions like GPT-4.5, GPT-4o, DeepSeek-R1, and extra.
The graph under compares ERNIE 4.5 and GPT-4o throughout a number of benchmarks that check multimodal AI efficiency.

The graph exhibits that:
- ERNIE 4.5 outperforms GPT-4o in most multimodal duties.
- The common rating for ERNIE 4.5 is 77.77, which is greater than GPT-4o’s 73.92.
- ERNIE 4.5 has a big edge in MathVista and DocVQA, exhibiting higher math reasoning and document-based question-answering abilities.
- Each fashions carry out equally in OCRBench and MMMU, however ERNIE 4.5 nonetheless has a slight benefit.
The subsequent graph compares ERNIE 4.5, DeepSeek V3 – Chat, GPT-4o, and GPT-4.5 throughout a number of benchmarks for text-based reasoning and problem-solving.

Listed below are some key takeaways from the graph:
- ERNIE 4.5 leads the pack with a median rating of 79.6, narrowly surpassing DeepSeek V3 – Chat at 79.14.
- It performs effectively throughout basic information, reasoning, and programming benchmarks equivalent to MMLU-Professional, GSM8K, and HumanEval+.
- GPT-4o and DeepSeek V3 additionally display robust outcomes, with DeepSeek V3 performing competitively in Chinese language benchmarks like CMMLU.
- ERNIE 4.5 excels in GSM8K (math) and C-Eval (basic reasoning), though DeepSeek V3 may be very shut in efficiency.
Future Impression
The race to be the highest LLM is heating up and Baidu’s ERNIE 4.5 & X1 introduce critical competitors for OpenAI, DeepSeek, Anthropic, and Meta. With Chinese language AI labs delivering fashions that rival or surpass Western AI at a fraction of the associated fee, firms shall be compelled to innovate sooner and decrease their prices to remain aggressive.
All these developments will lastly result in:
- Quicker AI developments throughout all main AI analysis facilities.
- Extra reasonably priced AI for companies and builders.
- A brand new period of multimodal AI functions, increasing past conventional text-based AI.
Conclusion
Baidu’s ERNIE 4.5 & X1 fashions should not simply one other set of AI fashions – they’re business disruptors. Their superior multimodal and reasoning capabilities, low pricing, and deep integration into China’s digital ecosystem, sign an influence shift within the international AI market.
If this pattern continues, we might see a bigger scale AI democratisation and outreach throughout numerous industries. This might additionally push many western firms to launch cheaper fashions. Not solely would this add to competitiveness out there, however would additionally be certain that the customers get essentially the most worth for his or her cash.
Ceaselessly Requested Questions
A. ERNIE 4.5 & X1 are the newest massive language fashions (LLMs) developed by Baidu, designed to rival prime AI fashions like OpenAI’s GPT-4.5 and DeepSeek-R1. ERNIE 4.5 is a multimodal basis mannequin, whereas ERNIE X1 is a deep-thinking reasoning mannequin with superior capabilities.
A. ERNIE 4.5 is optimized for multimodal understanding, able to processing textual content, photographs, audio, and video with excessive accuracy. ERNIE X1, however, is designed for deep-thinking reasoning, excelling in context understanding, planning, and problem-solving with self-reflection.
A. Baidu ERNIE 4.5 outperforms GPT-4.5 in a number of benchmarks, significantly in reasoning, multimodal understanding, and hallucination prevention, whereas costing only one% of GPT-4.5’s value. ERNIE X1 delivers DeepSeek-R1 stage efficiency at half the associated fee, making them extremely aggressive AI options.
A. ERNIE 4.5: Enter price $0.55 per 1M tokens, output price $2.20 per 1M tokens.
ERNIE X1: Enter price $0.28 per 1M tokens, output price $1.10 per 1M tokens.
The ERNIE X1 mannequin isn’t but accessible through API however shall be quickly.
A. You possibly can entry these fashions by means of:
1. ERNIE Bot (AI Chatbot) at yiyan.baidu.com (Solely accessible for Chinese language customers).
2. Baidu AI Cloud’s MaaS platform, Qianfan, for API entry (presently solely ERNIE 4.5 is out there).
Login to proceed studying and luxuriate in expert-curated content material.