A research methods innovation project.
We're developing retrieval-augmented generation (RAG) approaches to bolster internal knowledge management.
We're using Poetry for dependency management. Run the following commands to install dependencies and set up pre-commit hooks:

```
poetry install
poetry install --with lint
poetry install --with test
poetry run pre-commit install
```
To start an environment in your terminal:

```
poetry env use python3.11
poetry shell
```
To add a new package, use `poetry add`:

```
poetry add package-name
```
```
data/                # Raw and processed datasets used for the project
documentation/       # Documentation
dsp_nesta_brain/
├── notebooks/       # Jupyter notebooks for exploration and experimentation
├── pipeline/        # Data processing and analysis pipelines
├── getters/         # Getter functions to fetch data from S3 or other sources
└── utils/           # Utility scripts and helper functions
eval/                # Evaluation metrics and Langfuse
front_end/           # Constants and functions needed for the Streamlit app (project-specific)
google_api/          # Interacting with Google Drive
lgraph/              # LangGraph experiments
llm/                 # LLM and LangChain use
retrieval/           # RAG retrieval
└── db/              # Vector database setup and maintenance
    ├── ingest/      # Vector database ingestion (one file for each project)
    └── schema/      # Vector database schema and setup (one file for each project)
scraping/            # Web-scraping and PDF parsing
└── pdf/             # PDF parsing
topic_model/         # Topic modelling and visualisation
```
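The ingestion files under `retrieval/db/ingest/` are per-project, but the usual RAG ingestion pattern (split documents into overlapping chunks, embed each chunk, write to the vector store) can be sketched roughly as below. This is a generic illustration, not the project's actual code: `embed_fn` and the dict-based store stand in for whatever embedding model and vector database this repo uses.

```python
# Generic RAG ingestion sketch. `embed_fn` and the dict "store" are
# hypothetical placeholders for the project's real embedder and vector DB.
from typing import Callable, List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> List[str]:
    """Split text into overlapping character chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start : start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks


def ingest(text: str, embed_fn: Callable[[str], List[float]], store: dict) -> int:
    """Embed each chunk and write it to the store; returns the chunk count."""
    for i, chunk in enumerate(chunk_text(text)):
        store[i] = {"text": chunk, "vector": embed_fn(chunk)}
    return len(store)
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both sides; chunk size and overlap are tuning choices, not fixed values.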
Keep project-related data in the `data/` folder for local prototyping. When submitting code for a PR review, store the data on S3 and add getter functions in `getters/`.
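A getter usually just builds the S3 key and loads the object into a DataFrame. A minimal sketch, assuming `boto3`/`pandas`-style access — the bucket name, key layout, and function names here are illustrative, not the project's real ones:

```python
# Illustrative getter sketch: the bucket name and key layout are hypothetical.
from io import BytesIO

BUCKET = "dsp-nesta-brain"  # hypothetical bucket name


def s3_key(dataset: str, version: str = "latest") -> str:
    """Build the S3 key for a named dataset (illustrative layout)."""
    return f"data/{version}/{dataset}.csv"


def get_dataset(dataset: str, version: str = "latest"):
    """Download a CSV from S3 and return it as a pandas DataFrame."""
    # Deferred imports so the key helper stays usable without AWS dependencies.
    import boto3
    import pandas as pd

    obj = boto3.client("s3").get_object(Bucket=BUCKET, Key=s3_key(dataset, version))
    return pd.read_csv(BytesIO(obj["Body"].read()))
```

Keeping the key-building logic in a small helper makes the data layout easy to change in one place and lets reviewers see exactly what a PR reads from S3.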
Feel free to add other folders (e.g. for Streamlit apps).