dataio.scripts.sync_dataset_documentation

Sync dataset documentation (README.md and manifest files) from the S3 filestore to the database.

This script fetches README.md and manifest files from the S3 filestore and caches their contents in the datasets table for faster access.
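The fetch-and-cache flow described above might look roughly like the following. This is a minimal sketch, not the script's actual implementation: the `sync_dataset` helper, the `readme` and `manifest` column names, the `manifest.json` file name, and the `<dataset_id>/<file>` key layout are all assumptions made for illustration. A plain dict stands in for the S3 filestore and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical file names and column names; the source only says
# "README.md and manifest files" and "the datasets table".
DOC_FILES = {"readme": "README.md", "manifest": "manifest.json"}


def sync_dataset(
    conn: sqlite3.Connection,
    fetch: Callable[[str], Optional[str]],
    dataset_id: str,
    dry_run: bool = False,
) -> dict:
    """Fetch documentation files for one dataset and cache them in `datasets`."""
    # `fetch` stands in for an S3 read; it returns None when the object is missing.
    contents = {col: fetch(f"{dataset_id}/{name}") for col, name in DOC_FILES.items()}
    if not dry_run:
        # Only write to the database when this is not a dry run.
        conn.execute(
            "UPDATE datasets SET readme = ?, manifest = ? WHERE id = ?",
            (contents["readme"], contents["manifest"], dataset_id),
        )
        conn.commit()
    return contents


# In-memory stand-ins for the filestore and the database.
store = {"DS_EXAMPLE01/README.md": "# Example dataset"}
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (id TEXT PRIMARY KEY, readme TEXT, manifest TEXT)")
conn.execute("INSERT INTO datasets (id) VALUES ('DS_EXAMPLE01')")

preview = sync_dataset(conn, store.get, "DS_EXAMPLE01", dry_run=True)
synced = sync_dataset(conn, store.get, "DS_EXAMPLE01")
row = conn.execute(
    "SELECT readme, manifest FROM datasets WHERE id = 'DS_EXAMPLE01'"
).fetchone()
```

Passing the fetch function in as a parameter keeps the sketch testable without network access; the real script presumably reads through its S3 client instead.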

Usage:

# Sync all datasets
uv run python -m dataio.scripts.sync_dataset_documentation

# Sync specific dataset
uv run python -m dataio.scripts.sync_dataset_documentation --dataset DS_EXAMPLE01

# Dry run (show what would be synced)
uv run python -m dataio.scripts.sync_dataset_documentation --dry-run
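The two flags shown in the usage examples could be parsed along these lines. This is a sketch that assumes the script uses `argparse`; the `build_parser` helper, the help strings, and the default of `None` for `--dataset` are illustrative assumptions, and only the flag names `--dataset` and `--dry-run` come from the usage text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="sync_dataset_documentation",
        description="Sync dataset documentation from S3 to the database.",
    )
    # Omitting --dataset is assumed to mean "sync all datasets".
    parser.add_argument(
        "--dataset",
        default=None,
        help="Sync only this dataset ID (default: all datasets).",
    )
    # argparse exposes --dry-run as the attribute `dry_run`.
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Show what would be synced without writing anything.",
    )
    return parser


args = build_parser().parse_args(["--dataset", "DS_EXAMPLE01", "--dry-run"])
```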

Module Contents

Functions

get_database_url

Build database URL from environment variables.

get_s3_client

Initialize S3 client.

main

Data

API

dataio.scripts.sync_dataset_documentation.logger

‘getLogger(…)’

dataio.scripts.sync_dataset_documentation.get_database_url() -> str

Build database URL from environment variables.
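As an illustration of building a database URL from environment variables, a function like this might be written as follows. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`), their defaults, and the `postgresql://` scheme are assumptions; the source does not say which variables or database driver the real function uses, so the helper is named `build_database_url` here to keep it distinct.

```python
import os
from typing import Mapping
from urllib.parse import quote


def build_database_url(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a connection URL from (hypothetical) environment variables."""
    user = env.get("DB_USER", "postgres")
    password = env.get("DB_PASSWORD", "")
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "5432")
    name = env.get("DB_NAME", "dataio")
    # Percent-encode credentials so characters like "@" or ":" cannot
    # be mistaken for URL delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{name}"
    )


example = build_database_url(
    {"DB_USER": "app", "DB_PASSWORD": "p@ss", "DB_HOST": "db.internal", "DB_NAME": "dataio"}
)
```

Accepting the environment as a mapping makes the function easy to exercise with a plain dict instead of mutating `os.environ`.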

dataio.scripts.sync_dataset_documentation.get_s3_client()

Initialize S3 client.

dataio.scripts.sync_dataset_documentation.main()