5 AI Projects You Can Build This Weekend (with Python) | by Shaw Talebi | Oct, 2024

The biggest mistake beginners make when thinking of project ideas is starting with the question, “How can I use this new tech?” While this can be a fine way to learn a new tool, there is a better way.

Good project ideas start with the question, “What problem can I solve?” This not only makes for a nice story when sharing with potential employers, but solving problems is also how you translate technical skills into value.

The following projects all take this problem-first approach. You can take these ideas and implement them directly or (even better) use them as inspiration for solving a problem that you are personally facing.

An effective yet time-consuming part of applying for jobs is adapting your resume to different job descriptions. While automating this task would have been an advanced project a few years ago, with today’s large language models, it is as simple as an API call.

Here’s a step-by-step breakdown of how to implement such an automation.

  1. Create a markdown version of your resume (Note: ChatGPT can do this for you).
  2. Experiment with different prompt templates that take your markdown resume and a job description and output a new resume in markdown.
  3. Use OpenAI’s Python API to prompt GPT-4o-mini to rewrite your resume dynamically.
  4. Convert the markdown file to HTML and then to PDF with the markdown and pdfkit libraries, respectively.

Libraries: openai, markdown, pdfkit

While we could readily use ChatGPT for this, the upside of implementing this with Python is that we can easily scale up the process. Here’s some starter code for Step 3.

import openai
openai.api_key = "your_sk"

# prompt (assuming md_resume and job_description have been defined)
prompt = f"""
I have a resume formatted in Markdown and a job description.
Please adapt my resume to better align with the job requirements while
maintaining a professional tone. Tailor my skills, experiences, and
achievements to highlight the most relevant points for the position.
Ensure that my resume still reflects my unique qualifications and strengths
but emphasizes the skills and experiences that match the job description.

### Here is my resume in Markdown:
{md_resume}

### Here is the job description:
{job_description}

Please modify the resume to:
- Use keywords and phrases from the job description.
- Adjust the bullet points under each role to emphasize relevant skills and achievements.
- Make sure my experiences are presented in a way that matches the required qualifications.
- Maintain clarity, conciseness, and professionalism throughout.

Return the updated resume in Markdown format.

"""

# make api call
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ],
    temperature=0.25
)

# extract response
resume = response.choices[0].message.content

Note: ChatGPT is super helpful for writing short code snippets (and prompts) like this. If you get stuck, try it for Step 4.
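
For reference, here is a minimal sketch of what Step 4 might look like. It assumes the markdown and pdfkit packages are installed and that the wkhtmltopdf binary (which pdfkit wraps) is available on your system. The function names and the bare-bones HTML template are placeholders of mine, not part of the starter code above.

```python
def wrap_html(body_html: str, title: str = "Resume") -> str:
    """Wrap an HTML fragment in a minimal standalone page (hypothetical template)."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset='utf-8'><title>{title}</title></head>\n"
        f"<body>{body_html}</body></html>"
    )


def markdown_to_pdf(md_text: str, pdf_path: str) -> None:
    """Convert Markdown text to HTML, then write that HTML out as a PDF."""
    # Imported here so wrap_html stays usable without these packages installed.
    import markdown
    import pdfkit  # needs the wkhtmltopdf binary on your system

    body_html = markdown.markdown(md_text)  # Markdown -> HTML fragment
    pdfkit.from_string(wrap_html(body_html), pdf_path)  # HTML page -> PDF file
```

With the rewritten resume text in hand, a call like `markdown_to_pdf(resume, "resume_new.pdf")` would produce the file. Adding a stylesheet to the template is the natural next step if the default look is too plain.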

Although I love adding technical talks to my YouTube “watch later” playlist, it might be a while before I watch them (if I ever get around to it 😅). A project that can help with this is a tool that watches the videos for me and generates concise summaries with key points.

Here’s one way to do that:

  1. Extract YouTube video ID from video link using regex
  2. Use video ID to extract transcript using youtube-transcript-api
  3. Experiment with different ChatGPT prompts that effectively summarize the transcript
  4. Use OpenAI’s Python API to automate the process

Libraries: openai, youtube-transcript-api

From a technical perspective, this is very similar to the first project. A key difference, however, is that we will need to automatically extract video transcripts, which we can feed into the LLM.

Here’s some starter code for that.

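import re
from youtube_transcript_api import YouTubeTranscriptApi

youtube_url = "video link here"

# extract video ID with regex
video_id_regex = r'(?:v=|/)([0-9A-Za-z_-]{11}).*'
match = re.search(video_id_regex, youtube_url)

# keep the captured ID, or None if the link did not match
video_id = match.group(1) if match else None

# extract transcript
# (assumes a pre-1.0 release of youtube-transcript-api, where get_transcript
# returns a list of dicts; newer releases use YouTubeTranscriptApi().fetch(video_id))
transcript = YouTubeTranscriptApi.get_transcript(video_id)
text_list = [transcript[i]['text'] for i in range(len(transcript))]
transcript_text = '\n'.join(text_list)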
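
Steps 3 and 4 could then be sketched roughly as follows. This assumes version 1.0 or later of the openai package and an OPENAI_API_KEY environment variable. The instruction wording and the function names are just one starting point of mine, so expect to iterate on the prompt.

```python
SUMMARY_INSTRUCTIONS = (
    "Summarize the following video transcript. "
    "Start with a one-paragraph overview, then list the key points as bullets."
)


def build_summary_messages(transcript_text: str) -> list:
    """Build the chat messages for a transcript-summary request."""
    user_content = f"{SUMMARY_INSTRUCTIONS}\n\n### Transcript:\n{transcript_text}"
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_content},
    ]


def summarize_transcript(transcript_text: str, model: str = "gpt-4o-mini") -> str:
    """Send the transcript to the chat completions endpoint and return the summary."""
    # Imported here so the prompt builder above has no dependencies.
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable
    response = client.chat.completions.create(
        model=model,
        messages=build_summary_messages(transcript_text),
        temperature=0.25,
    )
    return response.choices[0].message.content
```

Keeping the prompt construction separate from the API call makes it easy to try different prompts without touching the request code. Very long videos may exceed the model's context window, in which case the transcript would need to be split and summarized in parts.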

My watch later playlist is not the only place I hoard technical information. Another cache is my desktop, which is riddled with (118) research papers. Since manually reviewing these papers would be (very) time-consuming, let’s see how AI can help.

One could build a tool that analyzes the contents of each PDF on my desktop and organizes them into folders based on topics. Text embeddings can translate each paper into a dense vector representation, from which similar articles could be clustered using a traditional machine learning algorithm like K-Means.

Here’s a more detailed breakdown:

  1. Read the abstract of each research article using PyMuPDF
  2. Use the sentence-transformers library to translate abstracts into text embeddings and store them in a Pandas DataFrame
  3. Use your favorite clustering algorithm from sklearn to group the embeddings based on similarity
  4. Create folders for each cluster and move the files into the appropriate folder.

Libraries: PyMuPDF, sentence_transformers, pandas, sklearn
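
Step 1 is the least standardized part, since papers lay out their abstracts differently. One rough heuristic, sketched below, is to take the text between an "Abstract" heading and the "Introduction" heading on the first page, and fall back to the start of the page otherwise. The regex, the 3000-character cap, and the function names are assumptions of mine and will not fit every paper.

```python
import re


def find_abstract(first_page_text: str, max_chars: int = 3000) -> str:
    """Return the text between an 'Abstract' heading and the introduction.

    Falls back to the start of the page when no 'Abstract' heading is found.
    """
    pattern = re.compile(
        r"abstract[\s:.\-]*(.+?)(?=\n\s*(?:1\.?\s+)?introduction\b|\Z)",
        flags=re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(first_page_text)
    text = match.group(1) if match else first_page_text
    # Collapse line breaks and repeated spaces, then cap the length.
    return " ".join(text.split())[:max_chars]


def read_abstract(pdf_path: str) -> str:
    """Pull the first page's text with PyMuPDF and apply the heuristic above."""
    import fitz  # PyMuPDF, imported here so find_abstract has no dependencies

    with fitz.open(pdf_path) as doc:
        return find_abstract(doc[0].get_text())
```

Calling `read_abstract` on each PDF path gives the list of abstracts to embed in the next step.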

The key step for this project is generating the text embeddings. Here’s a code snippet for doing that with sentence_transformers.

from sentence_transformers import SentenceTransformer

# load embedding model
model = SentenceTransformer("all-MiniLM-L6-v2")

# store abstracts in a list
abstract_list = ["abstract 1", "abstract 2"]

# calculate embeddings
embeddings = model.encode(abstract_list)
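
From there, Steps 3 and 4 might look something like the sketch below. It assumes scikit-learn is installed. The cluster count, the `cluster_<label>` folder naming, and the function names are arbitrary choices of mine that you would tune for your own files.

```python
import shutil
from pathlib import Path


def cluster_embeddings(embeddings, n_clusters: int = 5, seed: int = 0) -> list:
    """Assign each embedding to a K-Means cluster and return the labels."""
    # Imported here so the file-moving helper below works without scikit-learn.
    from sklearn.cluster import KMeans

    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(embeddings).tolist()


def move_into_cluster_folders(pdf_paths, labels, out_dir) -> None:
    """Move each PDF into out_dir/cluster_<label>/, creating folders as needed."""
    for pdf_path, label in zip(pdf_paths, labels):
        folder = Path(out_dir) / f"cluster_{label}"
        folder.mkdir(parents=True, exist_ok=True)
        shutil.move(str(pdf_path), str(folder / Path(pdf_path).name))
```

Since this moves files for real, it is worth trying it on a copy of the folder first. The numbered folders can be renamed by hand once you see which topics ended up together.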

A couple of months ago, I helped a company create a basic RAG system for a set of technical reports. One of the challenges with searching such reports is that key information is often presented in plots and figures rather than text.

One way to incorporate this visual information into the search process is to use a multimodal embedding model to represent text and images in a shared space.

Here’s a basic breakdown:

  1. Given a PDF, chunk it into sections and extract the images using PyMuPDF
  2. Use a multimodal embedding model (e.g. nomic-ai/nomic-embed-text-v1.5) to represent the chunks and images as dense vectors and store them in a dataframe
  3. Repeat for all PDFs in the knowledge base
  4. Given a user query, pass it through the same embedding model used for the knowledge base
  5. Compute the cosine similarity score between the query embedding and every item in the knowledge base
  6. Return top k results

Libraries: PyMuPDF, transformers, pandas, sklearn
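
To make Steps 5 and 6 concrete, here is a small sketch of the scoring and ranking logic. It uses only the standard library so the arithmetic is easy to follow. For a real knowledge base, the vectorized `cosine_similarity` function in `sklearn.metrics.pairwise` would be the more practical choice. The function names and toy vectors are illustrative.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def top_k(query_embedding, item_embeddings, k: int = 3) -> list:
    """Return (index, score) pairs for the k items most similar to the query."""
    scores = [
        (i, cosine_similarity(query_embedding, emb))
        for i, emb in enumerate(item_embeddings)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy example: three 2-D "embeddings" and a query pointing along the first axis.
items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k([1.0, 0.0], items, k=2)
```

The returned indices can be used to look up the matching rows (text chunk or image, filename, page number) in the dataframe.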

The most important part of this project is how the PDFs are chunked. The simplest way would be to use a fixed character count with some overlap between chunks. It is also helpful to capture metadata such as filename and page number for each chunk.

Here’s some basic boilerplate code to do that (courtesy of ChatGPT). If you get stuck, try asking it to extract the images.

import fitz  # PyMuPDF

def extract_text_chunks(pdf_path, chunk_size, overlap_size):
    # Open the PDF file
    pdf_document = fitz.open(pdf_path)
    chunks = []

    # Iterate through each page in the PDF
    for page_num in range(len(pdf_document)):
        page = pdf_document[page_num]
        page_text = page.get_text()

        # Split the text from the current page into chunks with overlap
        start = 0
        while start < len(page_text):
            end = start + chunk_size
            chunk = page_text[start:end]

            # Store the page number with the chunk
            chunks.append((page_num + 1, chunk))
            # Move to the next chunk with the overlap
            start += chunk_size - overlap_size

    return chunks

# Parameters for extraction
pdf_path = "your_file.pdf"
chunk_size = 1000  # Size of each text chunk in characters
overlap_size = 200  # Overlap size in characters (must be smaller than chunk_size)

text_chunks = extract_text_chunks(pdf_path, chunk_size, overlap_size)

# Display the chunks with page numbers
for i, (page_number, chunk) in enumerate(text_chunks):
    print(f"Chunk {i + 1} (Page {page_number}):\n{chunk}\n{'-' * 50}")
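
For the image side, a sketch along these lines may help. It relies on PyMuPDF's `get_images` and `extract_image` calls. The output folder, the file naming scheme, and the function names are my own placeholders.

```python
from pathlib import Path


def image_filename(pdf_path: str, page_number: int, index: int, ext: str) -> str:
    """Build a name like 'report_p3_img1.png' that records where the image came from."""
    return f"{Path(pdf_path).stem}_p{page_number}_img{index}.{ext}"


def extract_images(pdf_path: str, out_dir: str) -> list:
    """Save every embedded image in the PDF to out_dir; return (page_number, path) pairs."""
    import fitz  # PyMuPDF, imported here so image_filename has no dependencies

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = []
    with fitz.open(pdf_path) as pdf_document:
        for page_num, page in enumerate(pdf_document, start=1):
            for index, img in enumerate(page.get_images(full=True), start=1):
                xref = img[0]  # first tuple item is the image's cross-reference number
                image = pdf_document.extract_image(xref)  # dict with "image" bytes and "ext"
                name = image_filename(pdf_path, page_num, index, image["ext"])
                out_path = Path(out_dir) / name
                out_path.write_bytes(image["image"])
                saved.append((page_num, str(out_path)))
    return saved
```

One caveat: this only finds embedded raster images. Plots drawn as vector graphics will not show up this way, and rendering the page to an image (for example with `page.get_pixmap()`) is one possible workaround.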

Over the past year, I’ve helped nearly 100 businesses and individuals build AI projects. By far, the most common project people ask about is a document question-answering system. Building on the previous project, we can implement this in a straightforward way.

If we’ve already chunked and stored our documents in a DataFrame, we can convert the multimodal search tool into a multimodal RAG system.

Here are the steps:

  1. Perform a search over the knowledge base (like the one created in Project 4)
  2. Combine user query with top k search results and pass them to a multimodal model.
  3. Create a simple Gradio user interface for the QA system.

Libraries: PyMuPDF, transformers, pandas, sklearn, together/openai, Gradio

Note: Llama 3.2 Vision is free until 2025 via Together AI’s API

This project essentially combines projects 2 and 4. However, it includes the essential component of a user interface. For that, we can use a dashboarding tool like Gradio, which allows us to create a chat UI with a few lines of code.

Here’s an example snippet adapted from Gradio’s docs for doing this.

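import gradio as gr

def generate_response(message, history):
    """
    Your code for generating a response
    """
    # placeholder: echo the user's text until your own RAG code is plugged in
    # (with multimodal=True, message is a dict with "text" and "files" keys)
    response = message["text"]
    return response

demo = gr.ChatInterface(
    fn=generate_response,
    examples=[{"text": "Hello", "files": []}],
    title="Echo Bot",
    multimodal=True)

demo.launch()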
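
As for what goes inside the response function, Step 2 (combining the query with the search results) might be sketched like this. It only covers the text chunks and assumes each search result is a dict with "filename", "page", and "text" keys, which is a format I made up for illustration. Retrieved images would need to be attached separately, in whatever way your chosen multimodal model's API expects.

```python
def build_rag_prompt(query: str, results: list) -> str:
    """Combine the user query with retrieved chunks into a single prompt.

    `results` is assumed to be a list of dicts with "filename", "page",
    and "text" keys (a hypothetical format for the search output).
    """
    context_blocks = [
        f"[{i}] {r['filename']} (page {r['page']}):\n{r['text']}"
        for i, r in enumerate(results, start=1)
    ]
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"### Context:\n{context}\n\n"
        f"### Question:\n{query}"
    )
```

Numbering the chunks and including the filename and page makes it possible to ask the model to cite its sources, which helps when checking answers against the original reports.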

Thanks to tools like ChatGPT and Cursor, it’s never been easier to build AI projects fast. Things that used to block me for hours (if not days) a few years ago can now be resolved in minutes with advanced coding assistants.

My parting advice is to use these tools to learn faster and be bold in your project choices. For projects, find problems and time-box the implementation into a weekend.

Drop your questions in the comments 🙂