This is a living repo gathering the most important evals, monitoring tooling, and test cases for the continuous development and improvement of agents. We'll add monitoring and evaluation tooling plus standardized capability test cases (e.g. function calling, agent communication) to a basic agent application.
Check out the Weave Workspace here!
- Install `requirements_verbose.txt` in your environment (for Mac Silicon)
- Set up `benchmark.env` in `./configs` with the required API key (`WANDB_API_KEY`) and optional keys (`HUGGINGFACEHUB_API_TOKEN`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)
- Set the variables in `general_config.yaml` accordingly - set Entity and Project (device is CPU only for now); see the config-loading sketch after this list
- Set `setup = True` the first time you run to extract the data and generate the dataset
- Adjust the chat model, embedding model, judge model, prompts, and params as you want
- Run `main.py` with different configs, or run `streamlit run chatbot.py` to track interactions with an already deployed model
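For orientation, here is a minimal sketch of how the env file and the YAML config might be loaded before the run. Only the file paths and `WANDB_API_KEY` come from the steps above; the YAML keys shown are hypothetical.

```python
# Minimal sketch (assumption: main.py wires up the configs roughly like this).
import os

import yaml
from dotenv import load_dotenv  # python-dotenv

# Load API keys from ./configs/benchmark.env (WANDB_API_KEY is required;
# HUGGINGFACEHUB_API_TOKEN / OPENAI_API_KEY / ANTHROPIC_API_KEY are optional).
load_dotenv("./configs/benchmark.env")
assert os.environ.get("WANDB_API_KEY"), "WANDB_API_KEY is needed for Weave tracking"

# Load entity, project, models, prompts, and params from the central config.
with open("./configs/general_config.yaml") as f:
    config = yaml.safe_load(f)

print(config.get("entity"), config.get("project"))  # hypothetical key names
```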
- `main.py` - contains the main application flow and serves as an example of bringing everything together
- `setup.py` - contains utility functions for the RAG model `RagModel(weave.Model)` and the data extraction and dataset generation functions
- `evaluatie.py` - contains the `weave.flow.scorer.Scorer` classes to evaluate correctness, hallucination, and retrieval performance (see the first sketch after this file list)
- `./configs` - the configs of the project
- `./configs/benchmark.env` - should contain env vars for your W&B account and the model providers you want to use (HuggingFace, OpenAI, Anthropic, Mistral, etc.)
- `./configs/requirements.txt` - environment file to install the dependencies needed to run the RAG model
- `./configs/sources_urls.csv` - a CSV listing all the websites and PDFs that should be considered by the RAG model
- `./configs/general_config.yaml` - the central config file with models, prompts, and params
- `annotate.py` - can be run with `streamlit run annotate.py` to annotate existing datasets, or to fetch datasets based on production function calls, annotate them, and save them as a new dataset
- `chatbot.py` - can be run with `streamlit run chatbot.py` to serve the RAG model from Weave and track the questions asked to it (see the second sketch below)
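To make the file descriptions more concrete, here is a hedged sketch of the `weave.Model` / `weave.flow.scorer.Scorer` pattern behind `setup.py` and `evaluatie.py`. The field names, the `predict` signature, and the scoring logic are illustrative assumptions, not the repo's actual implementation.

```python
# Illustrative sketch only; field names, signatures, and logic are assumptions.
import weave
from weave.flow.scorer import Scorer


class RagModel(weave.Model):
    chat_model: str
    system_prompt: str

    @weave.op()
    def predict(self, question: str) -> dict:
        # The real model retrieves context and calls the chat model here.
        return {"answer": "stub answer", "context": []}


class CorrectnessScorer(Scorer):
    @weave.op()
    def score(self, output: dict, question: str) -> dict:
        # The real scorers would use the judge model to grade correctness,
        # hallucination, or retrieval quality.
        return {"correct": len(output["answer"]) > 0}
```

And a rough sketch of the `chatbot.py` flow: pull a published `RagModel` from Weave, serve it in a Streamlit chat, and let Weave trace each question. The entity/project string and the `RagModel:latest` reference are assumptions.

```python
# Hedged sketch of serving the published model; names and refs are assumptions.
import streamlit as st
import weave

weave.init("my-entity/my-project")          # assumption: entity/project from general_config.yaml
model = weave.ref("RagModel:latest").get()  # assumption: a previously published RagModel

question = st.chat_input("Ask the RAG model a question")
if question:
    with st.chat_message("user"):
        st.write(question)
    result = model.predict(question)        # each call is traced by Weave
    with st.chat_message("assistant"):
        st.write(result)
```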