Visualising satellite data using Google Colab

Having spent a few hours reading documentation and having an ongoing conversation with ChatGPT, I’m getting the hang of the HDF5 file structure and can now visualise some multispectral data in Google Colab:


from google.colab import drive
drive.mount('/content/drive')

import h5py
import numpy as np
import matplotlib.pyplot as plt

# Get the HDF5 file from your Google Drive
file_id = '/content/drive/MyDrive/DATA/file_name.he5'

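with h5py.File(file_id, "r") as f:
    # List all top-level groups
    print("Keys: %s" % list(f.keys()))
    a_group_key = list(f.keys())[0]

    # List what the first key contains (member names, if it is a group)
    data = list(f[a_group_key])
    print("Members of %s: %s" % (a_group_key, data))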

# This lists the top-level groups (keys) in the HDF5 file and gives a first idea of the datasets inside them. The picture will become more detailed as we go along.

# Open the HDF5 file
with h5py.File(file_id, 'r') as f:
    # Open the data field.
    # The path is hard-coded for now, as I know from HDFView that this is the one I want to look at; really we want to find it programmatically.
    data_field = f['/path_to/Data Fields/Cube']
    # Print the attributes, data type and shape of the data field
    print(f'Attributes: {dict(data_field.attrs)}')
    print(f'Dtype: {data_field.dtype}')
    print(f'Shape: {data_field.shape}')
# This prints the attributes, data type and shape (typically rows and columns) of the data cube we are examining.

# Open the HDF5 file
with h5py.File(file_id, 'r') as f:
    # Open the data field
    data_field = f['/path_to/Data Fields/Cube']
    # Read the data as a 2D array (this reshape assumes the dataset is
    # already rows x columns; it raises an error for a 3D cube)
    data = np.array(data_field[:]).reshape(data_field.shape[0], data_field.shape[1])
    # Rescale the values linearly to the 0-255 range
    data = np.uint8(np.interp(data, (data.min(), data.max()), (0, 255)))
    # Create an 8x8 inch figure (800x800 pixels at 100 dpi)
    fig = plt.figure(figsize=(8, 8))
    # Add the data to the figure
    plt.imshow(data, cmap='viridis')
    # Display the figure
    plt.show()

Satellite data plotted with viridis colourmap
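
The reshape in the plotting code above only works when the dataset is already 2D. If the cube turns out to have a third (band) dimension, one band has to be picked before plotting. Here is a rough sketch of how that might look, using a synthetic array rather than real satellite data. The helper name band_to_uint8 and the (bands, rows, columns) axis order are my assumptions; the real axis order depends on the product and should be checked against the printed shape.

```python
import numpy as np


def band_to_uint8(cube, band=0, band_axis=0):
    """Pick one band from a 3D cube and rescale it linearly to 0-255."""
    # np.take selects a single index along the chosen axis, leaving a 2D plane
    plane = np.take(np.asarray(cube), band, axis=band_axis).astype(np.float64)
    lo, hi = plane.min(), plane.max()
    if hi == lo:
        # A constant band has no contrast to stretch; return all zeros
        return np.zeros(plane.shape, dtype=np.uint8)
    return np.uint8(np.interp(plane, (lo, hi), (0, 255)))


# Synthetic cube: 3 bands of 4x5 pixels (axis order is an assumption)
cube = np.arange(60, dtype=np.float32).reshape(3, 4, 5)
image = band_to_uint8(cube, band=1)
print(image.shape, image.dtype)
```

The resulting 2D uint8 array can be passed to plt.imshow in the same way as before.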

Next steps involve developing a way of traversing the HDF5 group hierarchy, so that I can identify relevant data fields within the file, since they’re not explicitly identified as ‘image files’. This can be done using h5py functions. Another thing to explore is GDAL: once I’ve identified the correct data in the geolocation fields, it should become possible to output GeoTIFFs or UE-friendly PNG files with geolocation metadata.
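
As a first stab at the traversal, h5py groups have a visititems method that calls a function for every group and dataset below them. The sketch below collects every dataset with at least two dimensions, on the assumption that those are the likely image candidates. The function name find_image_like_datasets and the demo file layout are made up for illustration and do not reflect any real product.

```python
import h5py
import numpy as np


def find_image_like_datasets(h5_file, min_ndim=2):
    """Return (path, shape, dtype) for each dataset with at least min_ndim dimensions."""
    found = []

    def visitor(name, obj):
        # visititems calls this for every group and dataset below the root
        if isinstance(obj, h5py.Dataset) and obj.ndim >= min_ndim:
            found.append((name, obj.shape, obj.dtype))
        # Returning None keeps the traversal going

    h5_file.visititems(visitor)
    return found


# Small in-memory demo file with a made-up layout (nothing is written to disk)
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as demo:
    demo.create_dataset("Swath/Data Fields/Cube", data=np.zeros((4, 5)))
    demo.create_dataset("Swath/Geolocation Fields/Latitude", data=np.zeros((4, 5)))
    demo.create_dataset("Info/Count", data=np.arange(3))
    for path, shape, dtype in find_image_like_datasets(demo):
        print(path, shape, dtype)
```

With a real file, the same function could be called inside the with h5py.File(file_id, 'r') as f: block, and the printed paths used in place of the hard-coded one.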

Here are some other matplotlib colourmaps applied to the same dataset.
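
For comparing colourmaps, something along these lines draws the same array once per colourmap, side by side. It is only a sketch: the helper name plot_with_colormaps is hypothetical, the random array stands in for the scaled data from the plotting step, and the colourmap names are just a few of matplotlib's built-ins, not necessarily the ones in the images here.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_with_colormaps(data, cmap_names):
    """Show the same 2D array once per colourmap, side by side."""
    fig, axes = plt.subplots(
        1, len(cmap_names), figsize=(4 * len(cmap_names), 4), squeeze=False
    )
    for ax, name in zip(axes[0], cmap_names):
        image = ax.imshow(data, cmap=name)
        ax.set_title(name)
        ax.axis("off")
        # One colourbar per panel, so the value-to-colour mapping is visible
        fig.colorbar(image, ax=ax, shrink=0.8)
    return fig


# Synthetic stand-in for the scaled uint8 data array
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
fig = plot_with_colormaps(demo, ["viridis", "magma", "inferno", "gray"])
plt.show()
```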

It’s all pretty crude at this point – just figuring out how this stuff might work.
