An end-to-end guide covering integration with the Sleeper API, creation of a Streamlit UI, and deployment through AWS CDK
It's embarrassing how much time I spend thinking about my fantasy football team.
Managing a squad means processing a firehose of information: injury reports, expert projections, upcoming bye weeks, and favorable matchups. And it's not just the volume of information, but the ephemerality. If your star RB tweaks a hamstring during Wednesday practice, you had better not be basing lineup decisions off of Tuesday's report.
This is why general-purpose chatbots like Anthropic's Claude and OpenAI's ChatGPT are essentially useless for fantasy football recommendations, as they are limited to a static training corpus that cuts off months, even years, in the past.
For instance, if we ask Claude 3.5 Sonnet who the current best running back is, we see names like Christian McCaffrey, Breece Hall, and Travis Etienne, who have had injury-riddled or otherwise disappointing seasons so far in 2024. There is no mention of Saquon Barkley or Derrick Henry, the clear frontrunners at this stage. (Though to Claude's credit, it discloses its limitations.)
Apps like Perplexity are more accurate because they do access a search engine with up-to-date information. However, they of course have no knowledge of my entire roster situation, the state of our league's playoff picture, or the nuances of our keeper rules.
There is an opportunity to tailor a fantasy football-focused Agent with tools and personalized context for each user.
Let’s dig into the implementation.
Architecture Overview
The heart of the chatbot will be a LangGraph Agent based on the ReAct framework. We'll give it access to tools that integrate with the Sleeper API for common operations like checking league standings, rosters, player stats, expert analysis, and more.
In addition to the LangGraph API server, our backend will include a small Postgres database and Redis cache, which are used to manage state and route requests. We'll use Streamlit for a simple but effective UI.
For development, we can run all of these components locally via Docker Compose, but I'll also provide the infrastructure-as-code (IaC) to deploy a scalable stack with AWS CDK.
Sleeper API Integration
Sleeper graciously exposes a public, read-only API that we can tap into for user & league details, including a full list of players, rosters, and draft information. Though it's not documented explicitly, I also found some GraphQL endpoints that provide crucial statistics, projections, and, perhaps most valuable of all, recent expert analysis by NFL reporters.
I created a simple API client to access the various methods, which you can find here. The one trick that I wanted to highlight is the requests-cache library. I don't want to be a greedy client of Sleeper's freely available datasets, so I cache responses in a local SQLite database with a basic TTL mechanism.
Not only does this minimize the amount of redundant API traffic bombarding Sleeper's servers (reducing the chance that they blacklist my IP address), but it significantly reduces latency for my clients, making for a better UX.
Setting up and using the cache is dead simple, as you can see in this snippet:
import requests_cache
from urllib.parse import urljoin
from typing import Union, Optional
from pathlib import Path

class SleeperClient:
    def __init__(self, cache_path: str = '../.cache'):
        # config
        self.cache_path = cache_path
        # cache responses in a local SQLite database with a 24-hour TTL
        self.session = requests_cache.CachedSession(
            Path(cache_path) / 'api_cache',
            backend='sqlite',
            expire_after=60 * 60 * 24,
        )
        ...

    def _get_json(self, path: str, base_url: Optional[str] = None) -> dict:
        url = urljoin(base_url or self.base_url, path)
        return self.session.get(url).json()

    def get_player_stats(self, player_id: Union[str, int], season: Optional[int] = None, group_by_week: bool = False):
        return self._get_json(
            f'stats/nfl/player/{player_id}?season_type=regular&season={season or self.nfl_state["season"]}{"&grouping=week" if group_by_week else ""}',
            base_url=self.stats_url,
        )
So running something like self.session.get(url) first checks the local SQLite cache for an unexpired response to that specific request. If it's found, we can skip the API call and simply read from the database.
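If you want to confirm the cache is doing its job, requests-cache marks each response with a from_cache flag. A minimal sketch, assuming the SleeperClient above and Sleeper's public NFL state endpoint:

client = SleeperClient()
first = client.session.get('https://api.sleeper.app/v1/state/nfl')
second = client.session.get('https://api.sleeper.app/v1/state/nfl')
# the second call is served from SQLite until the 24-hour TTL expires
print(first.from_cache, second.from_cache)  # False True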
Defining the Tools
I want to turn the Sleeper API client into a handful of key functions that the Agent can use to inform its responses. Because these functions will effectively be invoked by the LLM, I find it important to annotate them clearly and ask for simple, flexible arguments.
For example, Sleeper's APIs often ask for numeric player IDs, which makes sense for a programmatic interface. However, I want to abstract that concept away from the LLM and just have it input player names for these functions. To ensure some additional flexibility and allow for things like typos, I implemented a basic "fuzzy search" method to map player name searches to their associated player ID.
# file: fantasy_chatbot/league.py

def get_player_id_fuzzy_search(self, player_name: str) -> tuple[str, str]:
    # will need a simple search engine to go from player name to player id without needing exact matches.
    # returns the player_id and matched player name as a tuple
    nearest_name = process.extract(query=player_name, choices=self.player_names, scorer=fuzz.WRatio, limit=1)[0]
    return self.player_name_to_id[nearest_name[0]], self.player_names[nearest_name[2]]

# example usage in a tool
def get_player_news(self, player_name: Annotated[str, "The player's name."]) -> str:
    """
    Get recent news about a player for the most up-to-date analysis and injury status.
    Use this whenever naming a player in a potential deal, as you should always have the
    correct context for a recommendation.
    If sources are provided, include markdown-based link(s)
    (e.g. [Rotoballer](https://www.rotoballer.com/player-news/saquon-barkley-has-historic-night-sunday/1502955) )
    at the bottom of your response to provide proper attribution
    and allow the user to learn more.
    """
    player_id, player_name = self.get_player_id_fuzzy_search(player_name)
    # news
    news = self.client.get_player_news(player_id, limit=3)
    player_news = f"Recent News about {player_name}\n\n"
    for n in news:
        player_news += f"**{n['metadata']['title']}**\n{n['metadata']['description']}"
        if analysis := n['metadata'].get('analysis'):
            player_news += f"\n\nAnalysis:\n{analysis}"
        if url := n['metadata'].get('url'):
            # markdown link to source
            player_news += f"\n[{n['source'].capitalize()}]({url})\n\n"
    return player_news
This is better than a simple map of name to player ID because it allows for misspellings and other typos, e.g. saquon → Saquon Barkley.
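Under the hood this is just a couple of lines with a fuzzy-matching library. A small sketch, assuming the process and fuzz used above come from rapidfuzz:

from rapidfuzz import fuzz, process

player_names = ['Saquon Barkley', 'Breece Hall', 'Derrick Henry']
# extract returns (match, score, index) tuples, best match first
match, score, index = process.extract(query='saquon', choices=player_names, scorer=fuzz.WRatio, limit=1)[0]
print(match, index)  # Saquon Barkley 0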
I created a number of useful tools based on these principles:
- Get League Status (standings, current week, no. of playoff teams, etc.)
- Get Roster for Team Owner
- Get Player News (up-to-date articles / analysis about the player)
- Get Player Stats (weekly points scored this season with matchups)
- Get Player Current Owner (crucial for proposing trades)
- Get Best Available at Position (the waiver wire)
- Get Player Rankings (performance so far, broken down by position)
You can probably think of a few more functions that would be useful to add, like details about recent transactions, league head-to-heads, and draft information.
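As an illustration, a recent-transactions tool would follow the same pattern. A sketch under stated assumptions: Sleeper does expose a transactions endpoint per league and round, but get_transactions is a hypothetical client wrapper here and the field handling is purely illustrative:

def get_recent_transactions(self, limit: Annotated[int, "Max transactions to return."] = 5) -> str:
    """Summarize the league's most recent waiver claims and trades."""
    # get_transactions is a hypothetical wrapper around Sleeper's transactions endpoint
    txns = self.client.get_transactions(self.league_id, self.current_week)[:limit]
    lines = [f"{t['type']}: adds {t.get('adds')}, drops {t.get('drops')}" for t in txns]
    return "\n".join(lines)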
LangGraph Agent
The impetus for this entire project was an opportunity to learn the LangGraph ecosystem, which may be becoming the de facto standard for building agentic workflows.
I've hacked together agents from scratch in the past, and I wish I had known about LangGraph at the time. It's not just a thin wrapper around the various LLM providers; it provides immense utility for building, deploying, & monitoring complex workflows. I'd encourage you to check out the Introduction to LangGraph course by LangChain Academy if you're interested in diving deeper.
As mentioned before, the graph itself is based on the ReAct framework, which is a popular and effective way to get LLMs to interact with external tools like those defined above.
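If all you needed were the vanilla ReAct loop, LangGraph's prebuilt helper gets you there in a few lines. A minimal sketch, assuming an OpenAI chat model and the tool wrapper described above (the project's actual graph adds the memory node below):

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
# bound methods from the Sleeper tool wrapper serve directly as tools
tools = [league.get_player_news, league.get_player_stats]
agent = create_react_agent(llm, tools)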
I've also added a node to persist long-term memories about each user, so that information can be persisted across sessions. I want our agent to "remember" things like users' concerns, preferences, and previously recommended trades, as this isn't a feature that's implemented particularly well in the chatbots I've seen. In graph form, it looks like this:
Pretty simple, right? Again, you can check out the full graph definition in the code, but I'll highlight the write_memory node, which is responsible for writing & updating a profile for each user. This allows us to track key interactions while being efficient about token use.
def write_memory(state: MessagesState, config: RunnableConfig, store: BaseStore):
    """Reflect on the chat history and save a memory to the store."""
    # get the username from the config
    username = config["configurable"]["username"]
    # retrieve existing memory if available
    namespace = ("memory", username)
    existing_memory = store.get(namespace, "user_memory")
    # format the memories for the instruction
    if existing_memory and existing_memory.value:
        memory_dict = existing_memory.value
        formatted_memory = (
            f"Team Name: {memory_dict.get('team_name', 'Unknown')}\n"
            f"Current Concerns: {memory_dict.get('current_concerns', 'Unknown')}\n"
            f"Other Details: {memory_dict.get('other_details', 'Unknown')}"
        )
    else:
        formatted_memory = None
    system_msg = CREATE_MEMORY_INSTRUCTION.format(memory=formatted_memory)
    # invoke the model to produce structured output that matches the schema
    new_memory = llm_with_structure.invoke([SystemMessage(content=system_msg)] + state['messages'])
    # overwrite the existing user profile
    key = "user_memory"
    store.put(namespace, key, new_memory)
These memories are surfaced in the system prompt, where I also gave the LLM basic details about our league and how I want it to handle common user requests.
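Concretely, surfacing the memory can be as simple as formatting the saved profile into the system prompt on each turn. A hypothetical sketch (the actual prompt text lives in the repo; build_system_message is an illustrative helper, not code from the project):

SYSTEM_PROMPT_TEMPLATE = """You are a fantasy football assistant for a Sleeper league.
Ground every recommendation in current data from your tools.

What you remember about this user:
{memory}
"""

def build_system_message(store: BaseStore, username: str) -> SystemMessage:
    # pull the profile written by write_memory and inject it into the prompt
    memory = store.get(("memory", username), "user_memory")
    content = SYSTEM_PROMPT_TEMPLATE.format(memory=memory.value if memory else "Nothing yet.")
    return SystemMessage(content=content)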
Streamlit UI and Demo
I'm not a frontend developer, so the UI leans heavily on Streamlit's components and familiar chatbot patterns. Users enter their Sleeper username, which is used to look up their available leagues and persist memories across threads.
I also added a few bells and whistles, like implementing token streaming so that users get instant feedback from the LLM. The other important piece is a "research pane", which surfaces the results of the Agent's tool calls so that users can inspect the raw data that informs each response.
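Token streaming maps neatly onto st.write_stream, which renders any generator of text chunks as they arrive. A rough sketch, where stream_tokens is a hypothetical wrapper around the LangGraph server's streaming endpoint:

import streamlit as st

prompt = st.chat_input("Ask about your league")
if prompt:
    with st.chat_message("assistant"):
        # st.write_stream displays chunks incrementally and returns the full text
        full_response = st.write_stream(stream_tokens(prompt))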
Here's a quick demo.
Deployment
For development, I recommend deploying the components locally via the provided docker-compose.yml file. This will expose the API locally at http://localhost:8123, so you can rapidly test changes and connect to it from a local Streamlit app.
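Once the containers are up, a quick smoke test confirms the API is reachable; /ok is the same health endpoint the compose healthcheck polls (shown below):

import requests

# host port 8123 maps to port 8000 inside the container
print(requests.get("http://localhost:8123/ok").status_code)  # expect 200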
I've also included IaC for an AWS CDK-based deployment that I use to host the app on the internet. Most of the resources are defined here. Notice the parallels between the docker-compose.yml and the CDK code related to the ECS setup:
Snippet from docker-compose.yml for the LangGraph API container:
# from docker-compose.yml
langgraph-api:
  image: "fantasy-chatbot"
  ports:
    - "8123:8000"
  healthcheck:
    test: curl --request GET --url http://localhost:8000/ok
    timeout: 1s
    retries: 5
    interval: 5s
  depends_on:
    langgraph-redis:
      condition: service_healthy
    langgraph-postgres:
      condition: service_healthy
  env_file: "../.env"
  environment:
    REDIS_URI: redis://langgraph-redis:6379
    POSTGRES_URI: postgres://postgres:postgres@langgraph-postgres:5432/postgres?sslmode=disable
And here is the analogous setup in the CDK stack:
// file: fantasy-football-agent-stack.ts
const apiImageAsset = new DockerImageAsset(this, 'apiImageAsset', {
  directory: path.join(__dirname, '../../fantasy_chatbot'),
  file: 'api.Dockerfile',
  platform: assets.Platform.LINUX_AMD64,
});

const apiContainer = taskDefinition.addContainer('langgraph-api', {
  containerName: 'langgraph-api',
  image: ecs.ContainerImage.fromDockerImageAsset(apiImageAsset),
  portMappings: [{
    containerPort: 8000,
  }],
  environment: {
    ...dotenvMap,
    REDIS_URI: 'redis://127.0.0.1:6379',
    POSTGRES_URI: 'postgres://postgres:postgres@127.0.0.1:5432/postgres?sslmode=disable'
  },
  logging: ecs.LogDrivers.awsLogs({
    streamPrefix: 'langgraph-api',
  }),
});

apiContainer.addContainerDependencies(
  {
    container: redisContainer,
    condition: ecs.ContainerDependencyCondition.HEALTHY,
  },
  {
    container: postgresContainer,
    condition: ecs.ContainerDependencyCondition.HEALTHY,
  },
)
Aside from some subtle differences, it's effectively a 1:1 translation, which is always something I look for when comparing local environments to "prod" deployments. The DockerImageAsset is a particularly useful resource, as it handles building and deploying (to ECR) the Docker image during synthesis.
Note: Deploying the stack to your AWS account via npm run cdk deploy WILL incur costs. In this demo code I have not included any password protection on the Streamlit app, meaning anyone who has the URL can use the chatbot! I highly recommend adding some additional security if you plan to deploy it yourself.
Takeaways
You want to keep your tools simple. This app does a lot, but it is still missing some key functionality, and it will start to break down if I simply add more tools. In the future, I want to break up the graph into task-specific sub-components, e.g. a "News Analyst" Agent and a "Statistician" Agent.
Traceability and debugging are more important with Agent-based apps than traditional software. Despite significant advancements in models' ability to produce structured outputs, LLM-based function calling is still inherently less reliable than conventional programs. I used LangSmith extensively for debugging.
In an age of commoditized language models, there is no substitute for reliable reporters. We're at a point where you can put together a reasonable chatbot in a weekend, so how do products differentiate themselves and build moats? This app (or any other like it) would be useless without access to high-quality reporting from analysts and experts. In other words, the Ian Rapoports and Matthew Berrys of the world are more valuable than ever.
Repo
All images, unless otherwise noted, are by the author.