Search by AOI
This tutorial shows how to get data from an area of interested provided in a geojson file.
Requirements for this demo:
To start, include the packages that we are going to need:
from pystac_client import Client
import geopandas as gpd
import rasterio
import requests
import json
Then, let´s create a connection to the archive API
auth_payload = json.dumps({
"client_id": "<API_CREDENTIAL_ID>",
"client_secret": "<API_CREDENTIAL_SECRET>",
"audience": "https://api.satellogic.com/",
"grant_type": "client_credentials"
})
auth_headers = {
'Content-Type': 'application/json'
}
auth_response = requests.request("POST", "https://login.satellogic.com/oauth/token", headers=auth_headers, data=auth_payload)
access_token = auth_response.json()["access_token"]
headers = {"authorizationToken": f"Bearer {access_token}"}
archive = Client.open("https://api.satellogic.com/archive/stac", headers=headers)
Replacing
Now, we are going to search by AOI so we need the AOI defined as a GeoJson polygon. This can be done in the code directly like this:
# The geojson defining the AOI
aoi = {
"coordinates": [
[
[
2.1147711862296035,
41.38373806805163
],
[
2.1147711862296035,
41.37709414091424
],
[
2.130796005839642,
41.37709414091424
],
[
2.130796005839642,
41.38373806805163
],
[
2.1147711862296035,
41.38373806805163
]
]
],
"type": "Polygon"
}
Or we can directly import a GeoJson file into memory as follows:
Using any of those methods, the variable aoi
now contains the coordinates of the vertices of the poylgon of our AOI. Now it is time to query to the archive:
items = archive.search(
intersects = aoi,
collections=["quickview-visual"],
datetime="2023-07-01/2023-09-01",
).item_collection()
By using the item_colletion() method we get all the elements as a pystac object. If they are not needed we can use item_collection_as_dict() to get a Python dict instead.
Now, items
is a FeatureCollection
that contains all the matching items. Items can be accesed with the bracket operator. For example:
We can transform the FeatureCollection into a GeoPandas dataframe for example:
gdf = gpd.GeoDataFrame.from_features(items, crs=f"epsg:{items[0].properties['proj:epsg']}" )
gdf['id'] = [x.id for x in items]
gdf.set_index('id', inplace=True)
gdf['capture_date'] = [gpd.pd.to_datetime(f"{x[0]}T{x[1]}") for x in gdf.index.str.split("_")]
Given that we searched on an area in a long period of time, it is possible that the we have data from multiple dates, satellites a.k.a captures. If we look into the data, we can for example group it by capture using the satl:outcome_id
property and count how many tiles per capture matched our AOI.
gdf.groupby([pd.Grouper(key="capture_date", freq="D"),"satl:outcome_id"]).aggregate(tiles=("platform","count"))
capture_date satl:outcome_id tiles
2023-08-05 a914b272-3e59-4551-a690-8c4ff0ad8147--157143 6
2023-08-08 2328a86c-892a-4f21-a95e-45b0bad8128d--157143 6
2023-08-12 9487f2f4-1154-4083-a138-2c998340a794--157143 6
2023-08-17 2f39d9a4-0953-4919-9ed3-286f3195f29f--157271 3
2023-08-19 de325a55-1b3c-44d9-9e28-1f5ab323a5ae--157143 6
2023-08-21 6e3679db-6d81-43bc-9f37-0fcafb4e6cd0--157143 6
2023-08-27 54eddfb1-93ab-41b3-83ae-9d11479a53b3--157143 6
2023-08-28 59bd62cf-9dee-4604-90f6-21c4bedd83e1--157143 6
At this stage, we can for example filter those captures with a certain cloud cover, or discard data that is not useful for the analysis. In this example we will go directly to the final stage: mosaic all the tiles for each capture together and save them as a GeoTiff file on the disk to further exploration:
Let's first defain a mosaicing function. It will need rasterio
def mosaic_from_items(items, outfile, asset_name="analytic"):
src_files_to_mosaic = []
for item in items:
url = item.assets[asset_name].href
src = rasterio.open(url)
src_files_to_mosaic.append(src)
mosaic, out_trans = merge(src_files_to_mosaic, indexes=[1,2,3])
meta = {"driver": "GTiff",
"height": mosaic.shape[1],
"width": mosaic.shape[2],
"transform": out_trans,
"crs": item.properties['proj:epsg'],
"count": mosaic.shape[0],
"dtype": mosaic.dtype
}
with rasterio.open(outfile, "w", **meta ) as dest:
dest.write(mosaic,)
return mosaic
And then for each capture, call the mosaicing function
for (capture_date, outcome_id), group_df in gdf.groupby([pd.Grouper(key="capture_date", freq="D"),"satl:outcome_id"]):
subItems = ItemCollection([x for x in items if x.id in group_df.index])
fname = capture_date.strftime("%Y%m%d")+".tiff"
mosaic_from_items(subItems, fname)
The result is a GeoTiff file on the specified folder, with the date of each capture as the filename.