In part 1 of this series we discussed agents, and used tools from LangGraph and Tavily to build a minimal agent that can research, write, review and revise short articles. That's great for a demo, but what if we actually want to read those articles outside of a notebook? Or, more ambitiously, what if we want to make this agent into a tool that might actually be useful to someone learning about a new subject? This has the potential to become a full stack project, but here I'll focus on just one interesting element: giving our system the ability to upload essays to Google Docs. Recall that we also save the intermediate steps that the agent takes in getting to the final answer, so it's probably worth making a record of those as well.
In response to a question or topic prompt, our agent produces a long list of output. At a minimum, we'd like to dump this into a Google Doc with a title and timestamp. We'd also like to control where in Google Drive this doc is written, and ideally have the option to create and name folders so that our essays can be stored logically. We won't focus too much on formatting here, although this is certainly possible using the Google Docs API. We're more interested in just getting the text into a place where someone would actually read it. Formatting could be a follow up, or simply left to the preference of the reader.
Once we have a docs connection set up, there's a whole host of more advanced things we could do with our essay. What about using an LLM to reformat it for a presentation and uploading that into a Google Slides deck? Or scraping some referenced data source and uploading that to Google Sheets? We could add this functionality as tools inside the control flow of our agent and have it decide what to do. Clearly there are a lot of options here, but it's good to start small.
Let's start by writing some code to interact with Google Docs in some basic ways. Some setup is required first: you will need a Google Cloud account and a new project. You'll then need to enable the Google Drive and Google Docs APIs. To create some credentials for this project, we will be using a service account, which can be set up using the instructions here. This process will create a private key in a .json file, which you store on your local machine. Next, it's a good idea to make a "master folder" for this project in your Google Drive. When that's done, you can add your service account to this folder and give it write permissions. Now your service account has the authorization to programmatically interact with the contents of that folder.
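When sharing the master folder, the address to add is the service account's email, which a standard service account key file stores under `client_email`. A minimal sketch for reading it is below. The helper name `service_account_email` is hypothetical and not part of the project code.

```python
import json
from pathlib import Path


def service_account_email(credentials_path: str) -> str:
    """Return the email address stored in a service account key file."""
    key_data = json.loads(Path(credentials_path).read_text())
    # Service account key files carry the account's address under "client_email".
    return key_data["client_email"]
```

The returned address is the one to paste into the Drive sharing dialog for the master folder.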
from abc import ABC, abstractmethod
from typing import Any

from google.oauth2 import service_account
from googleapiclient.discovery import build

# path to your .json credentials file
from research_assist.gsuite.base.config import CREDENTIALS


class GSuiteService(ABC):
    """
    An abstract base class for G Suite services.

    This class defines the structure for any G Suite service implementation,
    requiring subclasses to specify the scopes and service creation logic.

    Attributes:
        credential_path (str): The path to the credentials file.
        SCOPES (list): The scopes required for the service.
    """

    def __init__(self) -> None:
        """
        Initializes the GSuiteService with the credential path and scopes.
        """
        # The name of the file containing your credentials
        self.credential_path = CREDENTIALS
        self.SCOPES = self.get_scopes()

    @abstractmethod
    def get_scopes(self) -> list[str]:
        """
        Retrieves the scopes required for the G Suite service.

        Returns:
            list[str]: A list of scopes required for the service.
        """
        raise NotImplementedError("Subclasses must implement this method.")

    @abstractmethod
    def get_service(self, credentials: Any) -> Any:
        """
        Creates and returns the service object for the G Suite service.

        Args:
            credentials (Any): The credentials to use for the service.

        Returns:
            Any: The service object for the G Suite service.
        """
        raise NotImplementedError("Subclasses must implement this method.")

    def build(self) -> Any:
        """
        Builds the G Suite service using the provided credentials.

        Returns:
            Any: The constructed service object.
        """
        # Get credentials into the desired format
        creds = service_account.Credentials.from_service_account_file(
            self.credential_path, scopes=self.SCOPES
        )
        service = self.get_service(creds)
        return service


class GoogleDriveService(GSuiteService):
    """
    A service class for interacting with Google Drive API.

    Inherits from GSuiteService and implements the methods to retrieve
    the required scopes and create the Google Drive service.

    Methods:
        get_scopes: Returns the scopes required for Google Drive API.
        get_service: Creates and returns the Google Drive service object.
    """

    def get_scopes(self) -> list[str]:
        """
        Retrieves the scopes required for the Google Drive service.

        Returns:
            list[str]: A list containing the required scopes for Google Drive API.
        """
        SCOPES = ["https://www.googleapis.com/auth/drive"]
        return SCOPES

    def get_service(self, creds: Any) -> Any:
        """
        Creates and returns the Google Drive service object.

        Args:
            creds (Any): The credentials to use for the Google Drive service.

        Returns:
            Any: The Google Drive service object.
        """
        # "build" here is the module-level googleapiclient function, not the method above
        return build("drive", "v3", credentials=creds, cache_discovery=False)
The code is set up like this because there are many GSuite APIs (drive, docs, sheets, slides etc.) that we might want to use in future. They would all inherit from GSuiteService and have their get_service and get_scopes methods overridden with the specific details of that API.
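To make the inheritance pattern concrete, here is a sketch of what a Docs subclass could look like. It is an illustration rather than the project's actual GoogleDocsService: the class names ending in `Sketch` are hypothetical, and the base class is reduced to a stand-in with only the two abstract hooks so the snippet is self-contained. The scope URL and the `"docs"`, `"v1"` arguments are the standard ones for the Google Docs API.

```python
from abc import ABC, abstractmethod
from typing import Any


class GSuiteServiceSketch(ABC):
    """Stand-in for the GSuiteService base class, reduced to its two hooks."""

    @abstractmethod
    def get_scopes(self) -> list[str]: ...

    @abstractmethod
    def get_service(self, credentials: Any) -> Any: ...


class GoogleDocsServiceSketch(GSuiteServiceSketch):
    """Hypothetical Docs subclass: only the scopes and API name/version differ."""

    def get_scopes(self) -> list[str]:
        return ["https://www.googleapis.com/auth/documents"]

    def get_service(self, credentials: Any) -> Any:
        # Imported lazily so the sketch can be read without the client library installed.
        from googleapiclient.discovery import build

        return build("docs", "v1", credentials=credentials, cache_discovery=False)
```

Everything about credential loading stays in the base class, so adding Sheets or Slides would mean writing another subclass of roughly this size.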
Once this is all set up, you're ready to interact with drive. This is a great article showing some of the main ways of doing so.
In our implementation, the way we'll interact with drive is via methods of GoogleDriveHelper, which creates an instance of GoogleDriveService on initialization. We start by giving it the name of our master folder:
from research_assist.gsuite.drive.GoogleDriveHelper import GoogleDriveHelper

# name of the master folder shared with the service account
master_folder_name = "ai_assistant_research_projects"
drive_helper = GoogleDriveHelper(f"{master_folder_name}")
Now let's say we want to create a project about the Voyager series of space probes, for example. We can get organized by setting up a folder for that inside the master folder:
project_folder_id = drive_helper.create_new_folder("voyager")
This creates the folder and returns its ID, which we can use to create a document there. There might be multiple versions of this project, so we can also make relevant subfolders:
version_folder_id = drive_helper.create_new_folder(
    "v1",
    parent_folder_id=project_folder_id
)
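The internals of create_new_folder are not shown here, but a helper like it could plausibly be built on the Drive v3 `files().create` call with the folder MIME type. The sketch below is an assumption about how that might look, not the project's code. The function names `folder_metadata` and `create_folder` are hypothetical, and `drive_service` is expected to be the object returned by building the Drive service.

```python
from typing import Any, Optional

# MIME type that tells Drive to create a folder rather than a file
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def folder_metadata(folder_name: str, parent_folder_id: Optional[str] = None) -> dict[str, Any]:
    """Build the metadata body for creating a Drive folder."""
    metadata: dict[str, Any] = {"name": folder_name, "mimeType": FOLDER_MIME_TYPE}
    if parent_folder_id is not None:
        # Without "parents", the folder lands in the caller's Drive root.
        metadata["parents"] = [parent_folder_id]
    return metadata


def create_folder(
    drive_service: Any, folder_name: str, parent_folder_id: Optional[str] = None
) -> str:
    """Create the folder and return its ID."""
    folder = (
        drive_service.files()
        .create(body=folder_metadata(folder_name, parent_folder_id), fields="id")
        .execute()
    )
    return folder.get("id")
```

Keeping the metadata in its own function makes it easy to inspect what will be sent before any API call is made.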
Now we're ready to make a blank document, which we can also do with the drive service:
final_report_id = drive_helper.create_basic_document(
    "final report", parent_folder_id=version_folder_id
)
Under the hood, the drive helper is running the following code, which passes some metadata indicating that we want to make a document to the create method of the object returned by googleapiclient.discovery.build (i.e. what comes out of running GoogleDriveService().build()):
document_metadata = {
    "name": document_name,
    "mimeType": "application/vnd.google-apps.document",
    "parents": [parent_folder_id],
}
# make the document
doc = (
    self.drive_service.files()
    .create(body=document_metadata, fields="id")
    .execute()
)
doc_id = doc.get("id")
As you might imagine, the Google Drive API has a lot of different functionality and options that we're not covering here. The most comprehensive python wrapper for it that I've found is this one, which would be a good place to start if you want to explore further.
Now that we've made a blank document, let's fill it with the final essay! This is where the GoogleDocsService and GoogleDocsHelper come in. GoogleDocsService is very similar to GoogleDriveService, and also inherits from GSuiteService as we discussed in section 2. GoogleDocsHelper contains some tools to write text and images to Google Docs. They're very basic right now, but that's all we need for this project.
We can first use the agent we built in part 1 to write an essay about Voyager:
from research_assist.researcher.Agent import ResearchAgent, load_secrets
from langchain_openai import ChatOpenAI
from tavily import TavilyClient

secrets = load_secrets()
model = ChatOpenAI(
    model="gpt-4o-mini", temperature=0, api_key=secrets["OPENAI_API_KEY"]
)
tavily = TavilyClient(api_key=secrets["TAVILY_API_KEY"])

# pass the chat model defined above (the original snippet referenced an undefined "llm")
agent = ResearchAgent(model, tavily)
agent.run_task(
    task_description="The Voyager missions: What did we learn?",
    max_revisions=3
)
Recall that the various outputs of the agent are stored in its memory, which can be explored with the following. In the code, you can see that we're using "user_id = 1" as a placeholder here, but in an application that has multiple users this id would allow the model to access the correct memory store.
memories = agent.in_memory_store.search(("1", "memories"))
The final report text can be found here, with the key names corresponding to the AgentState that we discussed in part 1. It's at index -3 because it's followed by a call to the editor node (which said yes) and the accept node, which right now just returns "True". The accept node could easily be extended to actually write this report to a doc automatically.
final_essay = agent.in_memory_store.search(("1", "memories"))[-3].dict()["value"][
    "memory"
]["write"]["draft"]
Let's see how we can put this text in a Google Doc. Recall that in section 2 we made a blank document with doc_id. There are two basic methods of GoogleDocsHelper which can do this. The first is designed to provide a title and basic metadata, which is just the date and time at which the document was written. The second will paste some text into the document.
The code shows how to control aspects of the position and formatting of the text, which can be a bit confusing. We define a list of requests containing instructions like insertText. When we insert text, we need to provide the index at which to start the insertion, which corresponds to a position in the document.
import datetime
from typing import Any, Dict, List


# method of GoogleDocsHelper
def create_doc_template_header(self, document_title: str, doc_id: str) -> int:
    """
    Creates a header template for the document,
    including the title and the current date.

    Args:
        document_title (str): The title of the document.
        doc_id (str): The ID of the document to update.

    Returns:
        int: The index after the inserted header.
    """
    # add template header
    title = f"""
{document_title}
"""
    template = f"""
Written on {datetime.date.today()} at {datetime.datetime.now().strftime("%H:%M:%S")}
"""
    requests: List[Dict[str, Any]] = [
        # the template goes in first; the title is then inserted above it
        {
            "insertText": {
                "location": {
                    "index": 1,
                },
                "text": template,
            }
        },
        {
            "insertText": {
                "location": {
                    "index": 1,
                },
                "text": title,
            }
        },
        {
            "updateParagraphStyle": {
                "range": {
                    "startIndex": 1,
                    "endIndex": len(title),
                },
                "paragraphStyle": {
                    "namedStyleType": "TITLE",
                    "spaceAbove": {"magnitude": 1.0, "unit": "PT"},
                    "spaceBelow": {"magnitude": 1.0, "unit": "PT"},
                },
                "fields": "namedStyleType,spaceAbove,spaceBelow",
            }
        },
        {
            "updateParagraphStyle": {
                "range": {
                    "startIndex": len(title) + 1,
                    "endIndex": len(title) + len(template),
                },
                "paragraphStyle": {
                    "namedStyleType": "SUBTITLE",
                    "spaceAbove": {"magnitude": 1.0, "unit": "PT"},
                    "spaceBelow": {"magnitude": 1.0, "unit": "PT"},
                },
                "fields": "namedStyleType,spaceAbove,spaceBelow",
            }
        },
    ]
    result = (
        self.docs_service.documents()
        .batchUpdate(documentId=doc_id, body={"requests": requests})
        .execute()
    )
    end_index = len(title) + len(template) + 1
    return end_index
# method of GoogleDocsHelper
def write_text_to_doc(self, start_index: int, text: str, doc_id: str) -> int:
    """
    Writes text to the document at the specified index.

    Args:
        start_index (int): The index at which to insert the text.
        text (str): The text to insert.
        doc_id (str): The ID of the document to update.

    Returns:
        int: The index after the inserted text.
    """
    end_index = start_index + len(text) + 1
    requests: List[Dict[str, Any]] = [
        {
            "insertText": {
                "location": {
                    "index": start_index,
                },
                "text": text,
            }
        },
        {
            "updateParagraphStyle": {
                "range": {
                    "startIndex": start_index,
                    "endIndex": start_index + len(text),
                },
                "paragraphStyle": {
                    "namedStyleType": "NORMAL_TEXT",
                    "spaceAbove": {"magnitude": 1.0, "unit": "PT"},
                    "spaceBelow": {"magnitude": 1.0, "unit": "PT"},
                },
                "fields": "namedStyleType,spaceAbove,spaceBelow",
            }
        },
    ]
    result = (
        self.docs_service.documents()
        .batchUpdate(documentId=doc_id, body={"requests": requests})
        .execute()
    )
    return end_index
You can learn more about how indices are defined here. When making multiple insertText calls, it appears to be easier to write the last piece of text first. For example, in create_doc_template_header, template (which is the metadata that's supposed to appear below the title) appears first in the list at index 1. Then we write title at index 1. This results in title appearing first in the document and template appearing below. Note how we also need to specify the startIndex and endIndex of the paragraphStyle blocks in order to change the formatting of the text.
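The effect of inserting the last piece first can be checked without touching the API. The toy model below treats the document body as a plain string in which index 1 is the first position, loosely mirroring the Docs convention. It is only an illustration of the index bookkeeping: `apply_inserts` is a hypothetical helper and the title and timestamp strings are made-up sample values.

```python
def apply_inserts(body: str, inserts: list[tuple[int, str]]) -> str:
    """Apply (index, text) insertions in order to a toy document body.

    Index 1 maps to the start of the body, loosely mirroring how the
    first writable position in a Google Doc body is index 1.
    """
    for index, text in inserts:
        position = index - 1
        body = body[:position] + text + body[position:]
    return body


title = "\nVoyager report\n"
template = "\nWritten on 2024-01-01 at 12:00:00\n"

# Last piece first: both inserts target index 1, so no offsets need tracking.
reverse_order = apply_inserts("", [(1, template), (1, title)])

# Top-to-bottom order: the second index must account for the first insert.
forward_order = apply_inserts("", [(1, title), (1 + len(title), template)])

# Both strategies produce the title followed by the template.
assert reverse_order == forward_order == title + template
```

Writing in reverse means every insert can use the same index, which is why it feels simpler than tracking a running offset.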
Both methods in the code above return the end index of the current block of text so that it can be used as the start index of subsequent blocks to be appended. If you intend to get more creative with the style and formatting of documents, this guide will likely help.
Now that we've seen the underlying code, we can call it to write our final report to a document.
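One caveat when computing end indices with len(): the Docs API measures indices in UTF-16 code units, while Python's len() counts code points. The two agree for most text but diverge for characters outside the Basic Multilingual Plane, such as emoji. The helper below is a small sketch for computing the length the API would see. The name `utf16_length` is hypothetical and it is not used in the project code.

```python
def utf16_length(text: str) -> int:
    """Number of UTF-16 code units in `text`."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" avoids the byte-order mark.
    return len(text.encode("utf-16-le")) // 2
```

If an essay might contain such characters, swapping len(text) for a function like this when computing end indices should keep subsequent inserts aligned.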
from research_assist.gsuite.docs.GoogleDocsHelper import GoogleDocsHelper

docs_helper = GoogleDocsHelper()

# ID of the blank document created with the drive helper earlier
doc_id = final_report_id

# add the document title
title_end_index = docs_helper.create_doc_template_header(
    "voyager final report", doc_id
)
# add the text
doc_end_index = docs_helper.write_text_to_doc(
    start_index=title_end_index, text=final_essay, doc_id=doc_id
)
Great! Now we have all the tools of docs at our disposal to edit, format and share the report that our agent generated. Interestingly, the agent formatted the text as markdown, which is supported by Google Docs, but I was unable to find a way to get the document to automatically recognize this and convert the markdown into nice headers and subheaders. No doubt there is a way to do that and it would make the reports look much nicer.
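One possible workaround, offered as a rough sketch rather than a tested solution, is to handle the headings ourselves: strip the leading `#` markers, insert the plain text, and apply the HEADING_1 to HEADING_6 named styles over the matching ranges. The function `markdown_heading_requests` below is hypothetical. It only handles `#`-style headings (no bold, lists or links) and uses len() for indices, so it assumes the text contains no characters outside the Basic Multilingual Plane.

```python
import re
from typing import Any

HEADING_PATTERN = re.compile(r"^(#{1,6})\s+(.*)$")


def markdown_heading_requests(markdown: str, start_index: int) -> list[dict[str, Any]]:
    """Build Docs batchUpdate requests that insert `markdown` with styled headings."""
    plain_lines: list[str] = []
    heading_ranges: list[tuple[int, int, int]] = []  # (start, end, level)
    cursor = start_index
    for line in markdown.splitlines():
        match = HEADING_PATTERN.match(line)
        if match:
            line = match.group(2)
            # +1 covers the newline that terminates the paragraph
            heading_ranges.append((cursor, cursor + len(line) + 1, len(match.group(1))))
        plain_lines.append(line)
        cursor += len(line) + 1

    text = "\n".join(plain_lines) + "\n"

    def style_request(start: int, end: int, style: str) -> dict[str, Any]:
        return {
            "updateParagraphStyle": {
                "range": {"startIndex": start, "endIndex": end},
                "paragraphStyle": {"namedStyleType": style},
                "fields": "namedStyleType",
            }
        }

    requests = [
        {"insertText": {"location": {"index": start_index}, "text": text}},
        # reset everything to body text first, then promote the headings
        style_request(start_index, cursor, "NORMAL_TEXT"),
    ]
    for start, end, level in heading_ranges:
        requests.append(style_request(start, end, f"HEADING_{level}"))
    return requests
```

The returned list could be passed as the "requests" field of a batchUpdate body in the same way as in write_text_to_doc.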
After running the code above, the doc should look something like this.
We should be able to write all the information that's stored in the agent memory to docs, which will allow us to easily browse through the results of each stage. A somewhat hacky way to do this is as follows:
from collections import defaultdict

memories = agent.in_memory_store.search(("1", "memories"))

# this is needed because we may call some nodes multiple times
# and we want to keep track of this so that we can make new documents
# for each call
seen_keys = set()
iterations = defaultdict(int)

# folder id where we want to write the documents
# (placeholder: set this to the ID of the target folder)
folder_id = f"{folder_id}"

for m in memories:
    data = m.dict()["value"]["memory"]
    available_keys = data.keys()
    node_key = list(available_keys)[0]
    unique_node_key = node_key + "_00"
    if unique_node_key in seen_keys:
        iterations[node_key] += 1
        unique_node_key = unique_node_key.replace("_00", "") + "_{:02d}".format(
            iterations[node_key]
        )

    print("-" * 20)
    print("Creating doc {}".format(unique_node_key))

    # get the text
    text = data[node_key][list(data[node_key].keys())[0]]

    # the tavily research output is a list, so convert it to a string
    if isinstance(text, list):
        text = "\n\n".join(text)

    # if anything else is not a string (e.g. the output of the accept node)
    # convert it to a string
    if not isinstance(text, str):
        text = str(text)

    # create document (using the drive helper defined earlier)
    report_id = drive_helper.create_basic_document(
        unique_node_key, parent_folder_id=folder_id
    )

    # create header
    end_index = docs_helper.create_doc_template_header(unique_node_key, report_id)

    # fill document
    end_index = docs_helper.write_text_to_doc(
        start_index=end_index, text=text, doc_id=report_id
    )

    seen_keys.add(unique_node_key)
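The naming logic in that loop can be isolated into a small function, which makes it easier to test away from the Google APIs. The sketch below produces the same "_00", "_01" suffixes using a Counter. The function name `unique_node_names` is hypothetical and is not part of the project code.

```python
from collections import Counter


def unique_node_names(node_keys: list[str]) -> list[str]:
    """Suffix each node key with a zero-padded count of its earlier occurrences."""
    seen = Counter()
    names = []
    for key in node_keys:
        # First visit to a node gets _00, the second _01, and so on.
        names.append(f"{key}_{seen[key]:02d}")
        seen[key] += 1
    return names
```

Because the suffix is derived from a per-node count, it avoids the string replacement step and does not depend on node names being free of "_00".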
The loop over the agent's memories is going to make 7 documents, and we'll take a look at some example screenshots below.
The initial plan outlines the structure of the report. It's interesting that the model seems to favor lots of short sections, which I think is appropriate given the prompt request to make it concise and digestible to a general readership.