Git repository connector.
Clones or updates a Git repository and walks files within a configurable
subdirectory. Extracts rich metadata from git log: per-file commit
timestamps, authors, and the HEAD commit SHA. Automatically generates
web-browsable URLs for GitHub and GitLab repositories.
§Configuration
[connectors.git.platform]
url = "https://github.com/acme/platform.git"
branch = "main"
root = "docs/"
include_globs = ["**/*.md"]
shallow = true
§Cache Directory
Cloned repos are cached locally (default: alongside the SQLite DB in
data/.git-cache/<url-hash>/). Subsequent syncs update the cached clone with git fetch followed by a hard reset.
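The update step described above can be sketched as two git invocations run inside the cache directory. This is a minimal illustration, not the connector's actual code: the function name `update_commands`, the remote name `origin`, and the exact arguments are assumptions. The commands are only built here, not executed.

```rust
use std::path::Path;
use std::process::Command;

/// Build (without running) the two git commands that refresh a cached clone:
/// a fetch of the configured branch, then a hard reset onto the fetched tip.
/// Assumption: the remote is named `origin`.
fn update_commands(cache_dir: &Path, branch: &str) -> [Command; 2] {
    let mut fetch = Command::new("git");
    fetch.current_dir(cache_dir).arg("fetch").arg("origin").arg(branch);

    let mut reset = Command::new("git");
    reset
        .current_dir(cache_dir)
        .arg("reset")
        .arg("--hard")
        .arg(format!("origin/{branch}"));

    [fetch, reset]
}
```

A caller would run each command with `status()` and stop at the first failure. A hard reset discards any local changes in the cache, which is acceptable only because the cache directory is treated as disposable.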
§Metadata Extraction
For each file, the connector extracts:
- updated_at: last commit timestamp from git log -1 --format=%ct
- author: last commit author name from git log -1 --format=%an
- source_url: web URL (GitHub/GitLab blob link) for the file
- metadata_json: JSON with git_sha and repo_url
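As a rough sketch of how the per-file lookups above might be assembled and parsed, the helpers below build the `git log` argument list and convert `%ct` output to epoch seconds. The helper names and the `--` path separator are assumptions for illustration; the module's private functions may differ.

```rust
/// Arguments for a `git log` call that prints one field of the last commit
/// touching `path`. `format` is a pretty-format placeholder such as "%ct"
/// (committer timestamp, Unix epoch) or "%an" (author name).
fn last_commit_field_args(format: &str, path: &str) -> Vec<String> {
    vec![
        "log".to_string(),
        "-1".to_string(),
        format!("--format={format}"),
        "--".to_string(), // separates options from the file path
        path.to_string(),
    ]
}

/// Parse the stdout of `git log -1 --format=%ct` into Unix epoch seconds.
/// Returns None when the output is empty (for example, a file with no
/// commits) or is not a valid integer.
fn parse_commit_time(stdout: &str) -> Option<i64> {
    stdout.trim().parse::<i64>().ok()
}
```

Returning `Option` lets the caller decide on a fallback timestamp instead of failing the whole scan on one file.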
§Web URL Generation
The connector auto-detects GitHub and GitLab URLs and generates browsable blob links:
| Input URL | Generated URL |
|---|---|
| git@github.com:org/repo.git | https://github.com/org/repo/blob/<sha>/<path> |
| https://github.com/org/repo.git | https://github.com/org/repo/blob/<sha>/<path> |
| git@gitlab.com:org/repo.git | https://gitlab.com/org/repo/-/blob/<sha>/<path> |
| Other | git://<url>/<path> |
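The mapping in the table above can be expressed as a small function. This is a hypothetical sketch named `web_url_sketch`, not the module's private build_web_url; its signature and its handling of hosts other than github.com and gitlab.com are assumptions based only on the table rows.

```rust
/// Map a git remote URL, commit SHA, and file path to a browsable link,
/// following the table rows: GitHub, GitLab, then the generic fallback.
fn web_url_sketch(remote: &str, sha: &str, path: &str) -> String {
    // Split the remote into (host, repo path) for the SSH and HTTPS forms.
    let parsed = if let Some(rest) = remote.strip_prefix("git@") {
        rest.split_once(':') // git@host:org/repo.git
    } else if let Some(rest) = remote.strip_prefix("https://") {
        rest.split_once('/') // https://host/org/repo.git
    } else {
        None
    };

    if let Some((host, repo)) = parsed {
        // Drop the trailing ".git" if present.
        let repo = repo.strip_suffix(".git").unwrap_or(repo);
        match host {
            "github.com" => return format!("https://github.com/{repo}/blob/{sha}/{path}"),
            "gitlab.com" => return format!("https://gitlab.com/{repo}/-/blob/{sha}/{path}"),
            _ => {}
        }
    }

    // "Other" row: no known web front end, so fall back to git://<url>/<path>.
    format!("git://{remote}/{path}")
}
```

Pinning the link to the commit SHA rather than the branch name keeps the URL pointing at the exact content that was indexed, even after the branch moves.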
Structs§
- GitConnector: A Git connector instance that implements the Connector trait.
Functions§
- build_globset 🔒: Build a GlobSet from a list of glob pattern strings.
- build_web_url 🔒: Attempt to build a web-browsable URL from the git remote URL.
- file_to_source_item 🔒: Convert a file in the cloned repo to a SourceItem.
- git_clone 🔒: Clone a Git repository into the cache directory.
- git_file_last_author 🔒: Get the last commit author name for a specific file.
- git_file_last_commit_time 🔒: Get the last commit timestamp (Unix epoch) for a specific file.
- git_head_sha 🔒: Get the HEAD commit SHA of a repository.
- git_pull 🔒: Update an existing cached repository via fetch + hard reset.
- scan_git: Scan a Git repository and produce SourceItems.
- short_hash 🔒: Generate a short (12-char) SHA-256 hash of input, used for cache directory naming.