# open

## lazycogs.open

```python
lazycogs.open(
    href: str,
    *,
    datetime: str | None = None,
    bbox: tuple[float, float, float, float],
    crs: str | CRS,
    resolution: float,
    filter: str | dict[str, Any] | None = None,
    ids: list[str] | None = None,
    bands: list[str] | None = None,
    chunks: dict[str, int] | None = None,
    sortby: list[str] | None = None,
    nodata: float | None = None,
    dtype: str | dtype | None = None,
    mosaic_method: type[MosaicMethodBase] | None = None,
    time_period: str = 'P1D',
    store: ObjectStore | None = None,
    max_concurrent_reads: int = 32,
    path_from_href: Callable[[str], str] | None = None,
    duckdb_client: DuckdbClient | None = None,
) -> DataArray
```
Open a mosaic of STAC items as a lazy `(time, band, y, x)` `DataArray`.

Synchronous entry point. Works in both regular Python scripts and Jupyter notebooks. When called from inside a running event loop (e.g. a Jupyter kernel), the coroutine is dispatched to a background thread with its own event loop, so the caller does not need to `await` it. Use `open_async` directly if you are already in an async context and want to skip the thread overhead.

`href` must be a path to a geoparquet file (`.parquet` or `.geoparquet`) or, when `duckdb_client` is provided, to a hive-partitioned parquet directory.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `href` | `str` | Path to a geoparquet file (`.parquet` or `.geoparquet`) or, when `duckdb_client` is provided, to a hive-partitioned parquet directory. | required |
| `datetime` | `str \| None` | RFC 3339 datetime or range. | `None` |
| `bbox` | `tuple[float, float, float, float]` | Bounding box of the requested output area. | required |
| `crs` | `str \| CRS` | Target output CRS. | required |
| `resolution` | `float` | Output pixel size in units of `crs`. | required |
| `filter` | `str \| dict[str, Any] \| None` | CQL2 filter expression (text string or JSON dict) forwarded to DuckDB queries. | `None` |
| `ids` | `list[str] \| None` | STAC item IDs to restrict the search to. | `None` |
| `bands` | `list[str] \| None` | Asset keys to include. If `None`, all discovered bands are included. | `None` |
| `chunks` | `dict[str, int] \| None` | Chunk sizes passed to dask when chunking the output. | `None` |
| `sortby` | `list[str] \| None` | Sort keys forwarded to DuckDB queries. | `None` |
| `nodata` | `float \| None` | No-data fill value for output arrays. | `None` |
| `dtype` | `str \| dtype \| None` | Output array dtype. | `None` |
| `mosaic_method` | `type[MosaicMethodBase] \| None` | Mosaic method class (not instance) to use. | `None` |
| `time_period` | `str` | ISO 8601 duration string controlling how items are grouped into time steps. | `'P1D'` |
| `store` | `ObjectStore \| None` | Pre-configured obstore `ObjectStore` used to read assets. | `None` |
| `max_concurrent_reads` | `int` | Maximum number of COG reads to run concurrently per chunk. See `open_async` for details. | `32` |
| `path_from_href` | `Callable[[str], str] \| None` | Optional callable mapping each asset href to the path actually opened for reading. | `None` |
| `duckdb_client` | `DuckdbClient \| None` | Optional `DuckdbClient` used for index queries; required when `href` is a hive-partitioned parquet directory. | `None` |
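The `time_period` grouping can be illustrated with a small sketch. This is not lazycogs's implementation; `floor_to_period` is a hypothetical helper covering only two durations, assuming items are bucketed by the start of their time step:

```python
from datetime import datetime


def floor_to_period(ts: datetime, period: str) -> datetime:
    """Floor a timestamp to the start of its time step for two common
    ISO 8601 durations (an assumed subset: P1D and PT1H).

    Items whose timestamps floor to the same value would land in the
    same time step of the output array.
    """
    if period == "P1D":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "PT1H":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported period: {period}")
```

For example, two scenes acquired at 10:02 and 13:45 on the same day share a time step under `'P1D'` but occupy different steps under `'PT1H'`.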
Returns:

| Type | Description |
|---|---|
| `DataArray` | Lazy `(time, band, y, x)` `DataArray`. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If an argument value is invalid. |
## lazycogs.open_async (async)

```python
async open_async(
    href: str,
    *,
    datetime: str | None = None,
    bbox: tuple[float, float, float, float],
    resolution: float,
    crs: str | CRS,
    filter: str | dict[str, Any] | None = None,
    ids: list[str] | None = None,
    bands: list[str] | None = None,
    chunks: dict[str, int] | None = None,
    sortby: list[str] | None = None,
    nodata: float | None = None,
    dtype: str | dtype | None = None,
    mosaic_method: type[MosaicMethodBase] | None = None,
    time_period: str = 'P1D',
    store: ObjectStore | None = None,
    max_concurrent_reads: int = 32,
    path_from_href: Callable[[str], str] | None = None,
    duckdb_client: DuckdbClient | None = None,
) -> DataArray
```
Open a mosaic of STAC items as a lazy `(time, band, y, x)` `DataArray`.

Async entry point, suitable for use with `await` in Jupyter notebooks and other async contexts. For synchronous scripts, use `open`.

`href` must be a path to a geoparquet file (`.parquet` or `.geoparquet`) or, when `duckdb_client` is provided, to a hive-partitioned parquet directory.

Phase 0 work (runs at call time):

- Query the geoparquet index via DuckDB to discover bands and unique time steps (applying `bbox`, `datetime`, `filter`, and `ids` so the time axis contains no empty slices).
- Compute the output grid (affine transform + coordinate arrays).
- Create one `StacBackendArray` per band, wrapped in a `LazilyIndexedArray` -- no pixel I/O yet.
- Assemble an `xr.Dataset`, convert it to an `xr.DataArray`, and optionally chunk with dask.
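The second Phase 0 step (computing the output grid) can be sketched as follows. This is an illustrative, assumed implementation for a north-up grid; `output_grid` is not part of the lazycogs API:

```python
import numpy as np


def output_grid(bbox, resolution):
    """Compute an affine transform and pixel-center coordinate arrays
    for a north-up grid covering bbox at the given resolution.

    bbox is (minx, miny, maxx, maxy) in the target CRS. The transform
    uses the conventional six coefficients (a, b, c, d, e, f), where
    x = a*col + b*row + c and y = d*col + e*row + f.
    """
    minx, miny, maxx, maxy = bbox
    width = int(round((maxx - minx) / resolution))
    height = int(round((maxy - miny) / resolution))
    # North-up: x increases with column, y decreases with row.
    transform = (resolution, 0.0, minx, 0.0, -resolution, maxy)
    # Pixel-center coordinates, offset by half a pixel from the edges.
    xs = minx + resolution * (np.arange(width) + 0.5)
    ys = maxy - resolution * (np.arange(height) + 0.5)
    return transform, xs, ys
```

The `xs` and `ys` arrays become the `x` and `y` coordinates of the output `DataArray`.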
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `href` | `str` | Path to a geoparquet file (`.parquet` or `.geoparquet`) or, when `duckdb_client` is provided, to a hive-partitioned parquet directory. | required |
| `datetime` | `str \| None` | RFC 3339 datetime or range. | `None` |
| `bbox` | `tuple[float, float, float, float]` | Bounding box of the requested output area. | required |
| `crs` | `str \| CRS` | Target output CRS. | required |
| `resolution` | `float` | Output pixel size in units of `crs`. | required |
| `filter` | `str \| dict[str, Any] \| None` | CQL2 filter expression (text string or JSON dict) forwarded to DuckDB queries. | `None` |
| `ids` | `list[str] \| None` | STAC item IDs to restrict the search to. | `None` |
| `bands` | `list[str] \| None` | Asset keys to include. If `None`, all discovered bands are included. | `None` |
| `chunks` | `dict[str, int] \| None` | Chunk sizes passed to dask when chunking the output. | `None` |
| `sortby` | `list[str] \| None` | Sort keys forwarded to DuckDB queries. | `None` |
| `nodata` | `float \| None` | No-data fill value for output arrays. | `None` |
| `dtype` | `str \| dtype \| None` | Output array dtype. | `None` |
| `mosaic_method` | `type[MosaicMethodBase] \| None` | Mosaic method class (not instance) to use. | `None` |
| `time_period` | `str` | ISO 8601 duration string controlling how items are grouped into time steps. | `'P1D'` |
| `store` | `ObjectStore \| None` | Pre-configured obstore `ObjectStore` used to read assets. | `None` |
| `max_concurrent_reads` | `int` | Maximum number of COG reads to run concurrently per chunk. Items are processed in batches of this size, which bounds peak in-flight memory when a chunk overlaps many files. Mosaic methods that support early exit (including the default) can stop before all batches are read. | `32` |
| `path_from_href` | `Callable[[str], str] \| None` | Optional callable mapping each asset href to the path actually opened for reading, e.g. deriving a NASA LPDAAC HTTPS proxy URL from an S3 asset href. | `None` |
| `duckdb_client` | `DuckdbClient \| None` | Optional `DuckdbClient` used for index queries; required when `href` is a hive-partitioned parquet directory. | `None` |
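A `path_from_href` callable is a plain function from asset href to readable path. The NASA LPDAAC case mentioned above might look like the following sketch; the proxy host and bucket layout here are assumptions for illustration, not taken from lazycogs:

```python
def lpdaac_https_from_s3(href: str) -> str:
    """Rewrite an S3 asset href to an HTTPS proxy URL.

    Illustrative mapping, e.g.
      s3://lp-prod-protected/FILE.tif
      -> https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/FILE.tif
    Hrefs that are not S3 URLs pass through unchanged.
    """
    prefix = "s3://"
    if href.startswith(prefix):
        return "https://data.lpdaac.earthdatacloud.nasa.gov/" + href[len(prefix):]
    return href
```

Passing such a function as `path_from_href` lets the index store S3 hrefs while reads go through the HTTPS endpoint.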
Returns:

| Type | Description |
|---|---|
| `DataArray` | Lazy `(time, band, y, x)` `DataArray`. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If an argument value is invalid. |
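The batching behaviour behind `max_concurrent_reads` can be sketched as follows. This is an illustrative helper, not lazycogs's reader, and it omits the early-exit behaviour described in the parameter table:

```python
import asyncio


async def read_in_batches(hrefs, read_one, max_concurrent_reads=32):
    """Run an async read over each href, at most max_concurrent_reads
    at a time.

    Processing in fixed-size batches bounds peak in-flight memory when
    a chunk overlaps many files: at most one batch of reads is awaited
    concurrently.
    """
    results = []
    for start in range(0, len(hrefs), max_concurrent_reads):
        batch = hrefs[start:start + max_concurrent_reads]
        # All reads in a batch run concurrently; batches run sequentially.
        results.extend(await asyncio.gather(*(read_one(h) for h in batch)))
    return results
```

A mosaic method with early exit would check between batches whether the output is already fully resolved and skip the remaining reads.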