dml_util.aws.s3
S3 storage utilities.
Classes
S3Store: S3 Store for DML
- class dml_util.aws.s3.S3Store(bucket=<factory>, prefix=None, client=<factory>)
Bases: object
S3 Store for DML
- Parameters:
bucket (str) – S3 bucket name. Defaults to the value of the environment variable “DML_S3_BUCKET”.
prefix (str) – S3 prefix. Defaults to the value of the environment variable “DML_S3_PREFIX”.
client (boto3.client, optional) – Boto3 S3 client. Defaults to a new client created using the get_client function.
Notes
If prefix is not provided, it defaults to the value of the DML_S3_PREFIX environment variable with “/data” appended.
Leading and trailing slashes are stripped from prefix. To use a prefix such as “/foo/”, pass the full URI directly. For example, to put data at “s3://my-bucket//foo/bar”, call S3Store().put(data, uri="s3://my-bucket//foo/bar").
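The slash handling described in these notes can be illustrated with a minimal sketch. The helpers normalize_prefix and build_uri are hypothetical names introduced here for illustration; they are not part of dml_util and only mimic the documented behavior.

```python
def normalize_prefix(prefix: str) -> str:
    """Hypothetical helper: drop leading and trailing slashes, as the notes describe."""
    return prefix.strip("/")


def build_uri(bucket: str, prefix: str, name: str) -> str:
    """Hypothetical helper: join bucket, normalized prefix, and name into an S3 URI."""
    key = f"{prefix}/{name}" if prefix else name
    return f"s3://{bucket}/{key}"


# A prefix of "/foo/" is normalized to "foo", so the leading slash is lost.
uri_from_prefix = build_uri("my-bucket", normalize_prefix("/foo/"), "bar")

# To keep the leading slash in the key, the full URI has to be spelled out.
explicit_uri = "s3://my-bucket//foo/bar"
```

Under these assumptions the two URIs differ, which is why a key that begins with a slash has to be addressed by its full URI.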
Examples
>>> s3 = S3Store(bucket="my-bucket", prefix="my-prefix")
>>> s3.put(data=b"Hello, World!", name="greeting.txt")
Resource(uri='s3://my-bucket/my-prefix/greeting.txt')
>>> s3.ls(recursive=True)
['s3://my-bucket/my-prefix/greeting.txt']
>>> s3.get("greeting.txt")
b'Hello, World!'
>>> s3.exists("greeting.txt")
True
>>> s3.rm("greeting.txt")
>>> s3.exists("greeting.txt")
False
>>> s3.put_js({"key": "value"}, name="data")
Resource(uri='s3://my-bucket/my-prefix/data.json')
>>> s3.get_js("data")
{'key': 'value'}
>>> s3.tar(dml, path="my_data", excludes=["*.tmp"])
Resource(uri='s3://my-bucket/my-prefix/my_data.tar')
>>> s3.untar("s3://my-bucket/my-prefix/my_data.tar", dest="my_data")
# Extracts the tar archive to the local directory "my_data"
>>> s3.cd("new-prefix")
S3Store(bucket='my-bucket', prefix='my-prefix/new-prefix')
>>> s3.cd("..")  # Go back to the previous prefix
S3Store(bucket='my-bucket', prefix='')
- bucket: str
- client: client
- ls(s3_root=None, *, recursive=False, lazy=False)
List objects in the S3 bucket.
- Parameters:
s3_root (str, optional) – Name or S3 root to list. Defaults to s3://<bucket>/<prefix>/.
recursive (bool) – If True, list all objects recursively. Defaults to False.
lazy (bool) – If True, return a generator. Defaults to False.
- Returns:
A generator of S3 URIs if lazy is True, otherwise a list of S3 URIs.
- Return type:
generator or list
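The lazy and recursive flags can be pictured with the sketch below. It is an assumption about how such a listing might be built on boto3's list_objects_v2 paginator, not the actual S3Store.ls implementation. The function names are hypothetical, and the use of Delimiter="/" for non-recursive listing is a guess.

```python
from typing import Iterator, List, Union


def iter_s3_uris(client, bucket: str, prefix: str, recursive: bool = False) -> Iterator[str]:
    """Hypothetical helper: yield S3 URIs one page at a time."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    if not recursive:
        # Assumption: a delimiter limits results to objects directly under the prefix.
        kwargs["Delimiter"] = "/"
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(**kwargs):
        # Pages with no matching objects carry no "Contents" entry.
        for obj in page.get("Contents", []):
            yield f"s3://{bucket}/{obj['Key']}"


def list_s3_uris(
    client, bucket: str, prefix: str, recursive: bool = False, lazy: bool = False
) -> Union[Iterator[str], List[str]]:
    """Hypothetical helper: return a generator when lazy, otherwise a list."""
    uris = iter_s3_uris(client, bucket, prefix, recursive)
    return uris if lazy else list(uris)
```

With lazy=True nothing is requested from S3 until the generator is consumed, which can help when a prefix holds many objects.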
- parse_uri(name_or_uri)
Parse a URI or name into bucket and key.
Examples
>>> s3 = S3Store(bucket="my-bucket", prefix="my-prefix")
>>> s3.parse_uri("s3://my-other-bucket/my-key")
('my-other-bucket', 'my-key')
>>> s3.parse_uri("my-key")
('my-bucket', 'my-prefix/my-key')
>>> s3.parse_uri(Resource("s3://my-other-bucket/my-key"))
('my-other-bucket', 'my-key')
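The resolution rules shown in these examples can be approximated by the standalone sketch below. It is not the library's code: parse_name_or_uri and ResourceStandIn are hypothetical names, and the stand-in class only assumes that a Resource exposes a uri attribute, as its repr in the examples suggests.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class ResourceStandIn:
    """Hypothetical stand-in for Resource; assumes only a `uri` attribute."""
    uri: str


def parse_name_or_uri(
    name_or_uri: Union[str, ResourceStandIn],
    default_bucket: str,
    default_prefix: str,
) -> Tuple[str, str]:
    """Resolve a bare name, an s3:// URI, or a resource into (bucket, key)."""
    if isinstance(name_or_uri, ResourceStandIn):
        text = name_or_uri.uri
    else:
        text = name_or_uri
    if text.startswith("s3://"):
        # Split on the first slash only, so a key such as "/foo/bar" keeps its slash.
        bucket, _, key = text[len("s3://"):].partition("/")
        return bucket, key
    # Bare names are placed under the store's own bucket and prefix.
    key = f"{default_prefix}/{text}" if default_prefix else text
    return default_bucket, key
```

Full URIs bypass the store's bucket and prefix entirely, while bare names are resolved relative to them.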
- prefix: str = None