Introduction
Recently I've been working on the domain-specific fine-tuning of several LLMs. The first and probably the most important part of this job is to collect, scrape, and clean textual data to feed the LLM. I noticed that my code was becoming messy and repetitive, because for every source I identified I was writing a script from scratch that had a lot in common with other scripts in my codebase. I was not following the "Don't repeat yourself" (DRY) principle at all. This is why I decided to implement the Template Design Pattern and make my codebase more elegant and efficient.
The Template Design Pattern
I won't repeat here what a design pattern is or how design patterns are classified by their function, since I've written many articles on the subject. If you are interested in reading my earlier articles on this topic, I'll leave some references at the end.
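As a quick reminder of the pattern's general shape, here is a minimal sketch in Python. It is not the code from my project: the names `SourcePipeline`, `InMemorySource`, `collect`, and `clean` are hypothetical and chosen only for illustration. The base class fixes the order of the steps once, and each source only supplies the parts that differ.

```python
from abc import ABC, abstractmethod


class SourcePipeline(ABC):
    """Hypothetical base class that defines the skeleton of the work."""

    def run(self) -> list[str]:
        # The template method: the same sequence of steps for every source.
        raw = self.collect()
        # Skip blank documents, then apply the cleaning step to the rest.
        return [self.clean(doc) for doc in raw if doc.strip()]

    @abstractmethod
    def collect(self) -> list[str]:
        """Source-specific step: return the raw documents."""

    def clean(self, text: str) -> str:
        # Shared default step: collapse runs of whitespace.
        # Subclasses may override this if a source needs special handling.
        return " ".join(text.split())


class InMemorySource(SourcePipeline):
    """Toy source used only for illustration (no real scraping)."""

    def collect(self) -> list[str]:
        return ["  Hello   world ", "", "Template\tmethod\n"]


if __name__ == "__main__":
    print(InMemorySource().run())
```

With this layout, supporting a new source should only require a new subclass that implements `collect`, while the shared cleaning logic lives in one place.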
In this article, I'll show you an example related to data processing. Let's say that in our project we have to deal with different kinds of data that we want to analyze. Some of these data are…