NVIDIA Team Sweeps KDD Cup 2024 Data Science Competition

Team NVIDIA triumphed at the Amazon KDD Cup 2024, securing first place Friday across all five competition tracks.

The team — consisting of NVIDIANs Ahmet Erdem, Benedikt Schifferer, Chris Deotte, Gilberto Titericz, Ivan Sorokin and Simon Jegou — demonstrated its prowess in generative AI, winning in categories that included text generation, multiple-choice questions, named entity recognition, ranking and retrieval.

The competition, themed “Multi-Task Online Shopping Challenge for LLMs,” asked participants to solve a variety of challenges using limited datasets.

“The new trend in LLM competitions is that they don’t give you training data,” said Deotte, a senior data scientist at NVIDIA. “They give you 96 example questions — not enough to train a model — so we came up with 500,000 questions on our own.”

Deotte explained that the NVIDIA team generated a wide variety of questions by writing some themselves, using a large language model to create others and transforming existing e-commerce datasets.

“Once we had our questions, it was straightforward to use existing frameworks to fine-tune a language model,” he said.

The competition organizers hid the test questions to ensure participants couldn’t exploit previously known answers. This approach encourages models that generalize well to any question about e-commerce, proving a model’s capability to handle real-world scenarios effectively.

Despite these constraints, Team NVIDIA’s innovative approach outperformed all rivals by using Qwen2-72B, a just-released LLM with 72 billion parameters, fine-tuned on eight NVIDIA A100 Tensor Core GPUs with QLoRA, a technique for efficiently fine-tuning large models on limited datasets.

About the KDD Cup 2024

The KDD Cup, organized by the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining, or ACM SIGKDD, is a prestigious annual competition that promotes research and development in the field.

This year’s challenge, hosted by Amazon, focused on mimicking the complexities of online shopping, with the goal of making it a more intuitive and satisfying experience using large language models. Organizers used the test dataset ShopBench — a benchmark that replicates the massive challenge of online shopping with 57 tasks and about 20,000 questions derived from real-world Amazon shopping data — to evaluate participants’ models.

The ShopBench benchmark focused on four key shopping skills, plus a fifth “all-in-one” challenge:

  1. Shopping Concept Understanding: Interpreting complex shopping concepts and terminology.
  2. Shopping Knowledge Reasoning: Making informed decisions with shopping knowledge.
  3. User Behavior Alignment: Understanding dynamic customer behavior.
  4. Multilingual Abilities: Shopping across languages.
  5. All-Around: Solving all tasks from the previous tracks in a unified solution.

NVIDIA’s Winning Solution

NVIDIA’s winning solution involved creating a single model for each track.

The team fine-tuned the just-released Qwen2-72B model using eight NVIDIA A100 Tensor Core GPUs for about 24 hours. The GPUs provided fast and efficient processing, significantly reducing the time required for fine-tuning.

First, the team generated training datasets based on the provided examples and synthesized additional data using Llama 3 70B hosted on build.nvidia.com.
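The article doesn’t include the team’s data-generation code, but this step can be sketched against the OpenAI-compatible API exposed by build.nvidia.com. The prompt, product title and sampling settings below are illustrative assumptions, not the team’s actual pipeline:

```python
# Minimal sketch (not the team's pipeline): generate a synthetic shopping
# question from a product title with Llama 3 70B on build.nvidia.com,
# using the OpenAI-compatible endpoint.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # requires an NVIDIA API key
)

def generate_question(product_title: str) -> str:
    """Ask the LLM to write one multiple-choice question about a product."""
    response = client.chat.completions.create(
        model="meta/llama3-70b-instruct",
        messages=[{
            "role": "user",
            "content": (
                "Write one multiple-choice question about this e-commerce "
                f"product, with four options and the correct answer:\n{product_title}"
            ),
        }],
        temperature=0.7,
        max_tokens=256,
    )
    return response.choices[0].message.content

print(generate_question("Stainless Steel Insulated Water Bottle, 32 oz"))
```

Running a loop like this over existing e-commerce data is one way to scale a handful of example questions into hundreds of thousands of training samples.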

Next, they employed QLoRA (Quantized Low-Rank Adaptation) to train on the data created in the first step. QLoRA modifies only a small subset of the model’s weights, allowing efficient training and fine-tuning.
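As a rough illustration of what QLoRA fine-tuning looks like in practice (not the team’s training script), the sketch below loads a 4-bit quantized base model with bitsandbytes and attaches low-rank adapters through the PEFT library. The model name, target modules and hyperparameters are assumptions:

```python
# Minimal QLoRA sketch: quantize the base weights to 4-bit and train only
# small low-rank adapter matrices on top of them.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "Qwen/Qwen2-72B-Instruct"  # assumed variant of Qwen2-72B

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # base weights stored in 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)  # tokenizes the synthetic questions
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,  # illustrative values
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```

Because only the adapter weights are updated, a 72-billion-parameter model can be fine-tuned on a single eight-GPU node rather than a large training cluster.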

The model was then quantized with AWQ 4-bit — making it smaller and able to run on a system with less storage and memory — and run with the vLLM inference library to generate predictions on the test datasets using four NVIDIA T4 Tensor Core GPUs within the time constraints.
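A minimal sketch of that inference setup with vLLM, assuming a hypothetical AWQ-quantized checkpoint path and illustrative sampling settings:

```python
# Minimal vLLM sketch: load an AWQ 4-bit checkpoint and shard it across
# four GPUs with tensor parallelism to generate predictions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="path/to/qwen2-72b-awq",  # hypothetical AWQ-quantized checkpoint
    quantization="awq",             # weights are AWQ 4-bit
    tensor_parallel_size=4,         # split the model across 4 GPUs (e.g., T4s)
    dtype="float16",                # T4 GPUs lack bfloat16 support
)

sampling = SamplingParams(temperature=0.0, max_tokens=128)  # greedy decoding
outputs = llm.generate(
    ["Which of the following products is a kitchen appliance? ..."],
    sampling,
)
print(outputs[0].outputs[0].text)
```

Quantization plus tensor parallelism is what makes a 72B-parameter model fit on four 16 GB T4 GPUs and still answer roughly 20,000 questions within the competition’s time limit.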

This approach secured the top spot in each individual track and overall first place in the competition — a clean sweep for NVIDIA for the second year in a row.

The team plans to submit a detailed paper on its solution next month and to present its findings at KDD 2024 in Barcelona.
