
Melbourne Visit

Melbourne skyline
Irony

Between 12 and 15 February I made my first visit to the Swinburne Centre for Astrophysics and Supercomputing – somewhat ironically housed within the old Art building on the Swinburne campus in Hawthorn. Appropriate for an Art-Science Residency, I guess!

Here’s a gallery of shots to give an idea of what I saw there – it was a great opportunity to meet up in person with Chris and some of the amazing team there. I’ll add more detail soon.

Vizwall
View to City
New_Lobby
iDavie Brain
Hawthorn Vista
Dawn Balloons
Melbourne Aerial

ChatGPT Perspectives

Interior life. Stable Diffusion image. Credit: Peter Morse

This post is a more general discussion about ChatGPT and related systems – it’s important to cut through the hype surrounding LLMs and absorb the sober scientific and cultural perspectives and questions around these systems and their capabilities. The impact of these systems on knowledge work and creative work is going to be huge in the near future – so it’s time to start understanding their context and implications.

S. Wolfram : What Is ChatGPT Doing … and Why Does It Work?

This is an excellent technical discussion about how Neural Nets (NNs) work, with interesting questions about the internal ‘black box’ goings-on – that are in general quite inscrutable. Wolfram is arguing for a rigorous scientific understanding of NNs, as they seem principally to have arisen as engineering exercises – things that work, but no-one really understands exactly why (‘lore’ in Wolfram’s estimation). This is a sharp counterpoint to the feuilleton hype about AI (which is, really, ‘Machine Learning’, or ‘Machine Representation’, as it is not ‘truly’ intelligent or aware). He makes interesting points about interfacing something like ChatGPT with Wolfram Alpha, which is a kind of computational knowledge engine, and argues convincingly that an interface between the two systems could solve many of the factual errors confabulated by the LLM, and provide something much more powerful in combination: a system that is ‘factually’ connected to the ‘world’ – and perhaps even capable of causal inference as a result.

The discussion touches upon several interesting philosophical/theoretical areas concerning the construction and emergence of language and discourse.

Human language—and the processes of thinking involved in generating it—have always seemed to represent a kind of pinnacle of complexity. And indeed it’s seemed somewhat remarkable that human brains—with their network of a “mere” 100 billion or so neurons (and maybe 100 trillion connections) could be responsible for it. Perhaps, one might have imagined, there’s something more to brains than their networks of neurons—like some new layer of undiscovered physics. But now with ChatGPT we’ve got an important new piece of information: we know that a pure, artificial neural network with about as many connections as brains have neurons is capable of doing a surprisingly good job of generating human language. (Wolfram, 2023)

Of great interest to me is the possibility of what one might call ’empirical semiotics’ or ‘computational semiotics’ – where semiotic generation and analysis (semiosis) could be underpinned by computational forms of emergence, categorisation and logic.

The success of ChatGPT is, I think, giving us evidence of a fundamental and important piece of science: it’s suggesting that we can expect there to be major new “laws of language”—and effectively “laws of thought”—out there to discover. In ChatGPT—built as it is as a neural net—those laws are at best implicit. But if we could somehow make the laws explicit, there’s the potential to do the kinds of things ChatGPT does in vastly more direct, efficient—and transparent—ways. (Wolfram, 2023)

Presumably many of these ‘laws’ are already uncovered, or at least hinted at, by research in NLP and computational language models – but it appears to be a very contentious field, especially with regard to any formulation of what ‘intelligence’ is.

If there is one constant in the field of artificial intelligence it is exaggeration: There is always breathless hype and scornful naysaying. It is helpful to occasionally take stock of where we stand. (Browning & LeCun, 2022b)

To me it seems important to understand LLM cognates and extensions in multimodal systems – that intelligent systems can draw inferences across visual, audial, somatic and other sensory modalities beyond the textual and linguistic (Browning & LeCun, 2022a).

The underlying problem isn’t the AI. The problem is the limited nature of language. Once we abandon old assumptions about the connection between thought and language, it is clear that these systems are doomed to a shallow understanding that will never approximate the full-bodied thinking we see in humans. In short, despite being among the most impressive AI systems on the planet, these AI systems will never be much like us. (Browning & LeCun, 2022a)

The field of semiotics has a significant body of work covering these differing/interdependent signifying regimes – but not a great deal of it is computationally reducible, as it has been more in the form of ‘literary criticism’ or humanities ‘theory’ (including in my own research background). This is clearly inadequate for a scientific approach, as it is far too qualitative – more ‘top down’ than ‘bottom up’ – reminiscent of debates concerning symbolic reasoning vs (what might be termed) ‘emergent’ reasoning. However, there are examples in the work of C.S. Peirce, M.A.K. Halliday and others that may be useful for thinking about this domain. In the area of cognitive neuroscience/neuroanthropology I immediately think of the work of T. Deacon, A. Damasio, D. Deutsch, J. Hawkins and others who synthesise this inter-disciplinary domain of knowledge into useful ways of thinking about what might constitute an intelligent system and how it might emerge.

For non-specialists like me there is much to absorb – that can inform ways of critically engaging with this novel technology.

It is terribly important not to be naive about this stuff (AI), as it will have (and has already had) transformational impacts upon personhood and upon economic, political and natural systems, for good and for bad. It is hard to imagine a future in which a self-aware, agentive machine intelligence is more than science ‘fiction’ – it sounds absurd, but perhaps it isn’t.

Imagine a world where people’s online images, text, music, voice recordings, videos, and code get gathered largely without consent to train AI models, and sold back to them for $10 a month. We’re already there but imagine something beyond that – and assume it’s incredible…

…Here’s a thought experiment: imagine an AGI system that advises taxing billionaires at a rate of 95 percent and redistributing their wealth for the benefit of humanity. Will it ever be hooked into the banking system to effect its recommended changes? No, it will not. Will those minding the AGI actually carry out those orders? Again, no.

No one with wealth and power is going to cede authority to software, or allow it to take away even some of their wealth and power, no matter how “smart” it is. No VIP wants AGI dictating their diminishment. And any AGI that gives primarily the powerful and wealthy more power and wealth, or maintains the status quo, is not quite what we’d describe as a technology that, as OpenAI puts it, benefits all of humanity. (Claburn, T., The Register, 2023)

We don’t have a good definition of intelligence – so it seems best to define it operationally (as Friston et al., 2022 do). At this stage the take-away is that LLMs are clearly what the label says: they are language models, not artificial intelligences – they are, literally, Rhetorical Devices.

LLMs statistically parameterise a huge amount of ‘knowledge’ about linguistic representations of the world – based upon their massive set of ‘training’ data. These representations are information, signs, similes, metaphors, metonyms, synecdoches – abstractions that can exhibit indefiniteness: degrees of epistemic and ontic undecidability, or infinite regression. Uncertainty.

LLMs seem to respond dialogically, perhaps following chains of reasoning akin to the vectors in ‘meaning space’ that Wolfram discusses (‘embeddings’ – see, for example, t-SNE or word2vec dimensionality-reduction plots). These dialogues can also be guided by user interaction via the chatbot query interface through ‘chain-of-thought’ reasoning – which demonstrably improves model performance (even, it seems, when the model performs what might be the equivalent of ‘self-talk’).
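
To make the idea of a ‘meaning space’ a little more concrete, here is a minimal sketch (my own aside, not from Wolfram’s article) of projecting a handful of word vectors down to 2D with t-SNE. The vectors here are random stand-ins – in practice they would come from a pre-trained word2vec/GloVe-style model.

# Minimal sketch: project a few word vectors into 2D with t-SNE.
# The 'vectors' dict is a random stand-in - in practice you would load real
# embeddings from a pre-trained word2vec/GloVe model.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

words = ["river", "lake", "ocean", "satellite", "orbit", "telescope"]
rng = np.random.default_rng(0)
vectors = {w: rng.normal(size=300) for w in words}

X = np.stack([vectors[w] for w in words])
coords = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(X)  # perplexity must be < n_samples

plt.figure(figsize=(6, 6))
plt.scatter(coords[:, 0], coords[:, 1])
for (x, y), w in zip(coords, words):
    plt.annotate(w, (x, y))
plt.title("Word vectors reduced to 2D with t-SNE (illustrative only)")
plt.show()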

Larger models seem to improve inferential reasoning – yet presumably there will be drawbacks or limits to a scale-only approach, not least the prodigious amounts of compute required, with their concomitant use of electricity and consequent carbon impacts.

Are they ’emulations’ or ‘simulations’? What would this distinction imply?*** To me, it indicates that it (an LLM) is a map, not an actor; a palimpsest, not an agent.

A counterpoint.

At no point with chatGPT is there any self-initiation. Purpose and curiosity, at this juncture, seem very much a human property. Whether this will continue to be the case, time will tell.

 

Graph explaining Dunning-Kruger curve, reinterpreted for ChatGPT.
Credit: Maurizio Bisogni

 

For our current purposes we can be reasonably confident about the ongoing necessity of the ‘human-in-the-loop’ approach we’re pursuing. For the time being. And with which humans?

 

 


*ChatGPT apparently implements this type of response training interface – thumbs-up/thumbs-down.

***Thanks to my colleague P. Bourke for drawing this distinction to my attention.

References

Altman, S., n.d. Planning for AGI and beyond [WWW Document]. OpenAI. URL https://openai.com/blog/planning-for-agi-and-beyond#SamAltman (accessed 3.10.23).

Ananthaswamy, A., 2023. In AI, is bigger always better? Nature 615, 202–205. https://doi.org/10.1038/d41586-023-00641-w

Browning, Jacob, and Yann LeCun. “What AI Can Tell Us About Intelligence,” June 16, 2022. https://www.noemamag.com/what-ai-can-tell-us-about-intelligence.

Browning, Jacob, and Yann LeCun. “AI And The Limits Of Language,” August 23, 2022. https://www.noemamag.com/ai-and-the-limits-of-language.

Claburn, T., n.d. OpenAI CEO heralds AGI no one in their right mind would want [WWW Document]. URL https://www.theregister.com/2023/02/27/openai_ceo_agi/ (accessed 3.10.23).

Daull, Xavier, Patrice Bellot, Emmanuel Bruno, Vincent Martin, and Elisabeth Murisasco. “Complex QA and Language Models Hybrid Architectures, Survey.” arXiv, February 17, 2023. http://arxiv.org/abs/2302.09051.

Deacon, Terrence W. Incomplete Nature: How Mind Emerged from Matter. WW Norton & Company, 2011.

Deacon, Terrence W. The Symbolic Species: The Co-Evolution of Language and the Brain. WW Norton & Company, 1998. ISBN: 9780393049916

Dennett, D.C. Consciousness Explained. Little, Brown, 2017. ISBN: 0-316-18065-3

Deutsch, David. The Beginning of Infinity: Explanations That Transform The World. Penguin UK, 2011. ISBN: 9780140278163

Friston, Karl J, Maxwell J D Ramstead, Alex B Kiefer, Alexander Tschantz, Christopher L Buckley, Mahault Albarracin, Riddhi J Pitliya, et al. “Designing Ecosystems of Intelligence from First Principles,” December 2022. https://doi.org/10.48550/arXiv.2212.01354.

Halliday, Michael Alexander Kirkwood. Language as Social Semiotic. London Arnold, 1978. ISBN:9780713159677

Hawkins, Jeff. A Thousand Brains: A New Theory of Intelligence. Hachette UK, 2021. https://doi.org/10.26613/esic.6.1.282

Huang, Jiaxin, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, and Jiawei Han. “Large Language Models Can Self-Improve.” arXiv, October 25, 2022. http://arxiv.org/abs/2210.11610.

Ouyang, Long, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al. “Training Language Models to Follow Instructions with Human Feedback.” arXiv, March 4, 2022. http://arxiv.org/abs/2203.02155.

Pearl, J., Mackenzie, D., 2018. The Book of Why: The New Science of Cause and Effect. Penguin UK. ISBN: 9780141982410

Savage, N., 2023. Why artificial intelligence needs to understand consequences. Nature. https://doi.org/10.1038/d41586-023-00577-1

Tanaka-Ishii, K., 2010. Semiotics of Programming. Cambridge University Press. ISBN:9780521516556

Tenachi, W., Ibata, R., Diakogiannis, F.I., 2023. Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws. https://doi.org/10.48550/arXiv.2303.03192

Massimini, M., Tononi, G., 2018. Sizing Up Consciousness: Towards an Objective Measure of the Capacity for Experience. Oxford University Press. ISBN:9780198728443

Wolfram, Stephen. “What Is ChatGPT Doing … and Why Does It Work?,” February 14, 2023. https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/.

Mirror, Mirror on the Wall: Working with AI

Mirror, mirror on the wall – image generated by Stable Diffusion. Image credit: Peter Morse

It’s important when working with the current generation of AI systems (principally Large Language Models or LLMs) to approach them with caution and circumspection. The reason for this is essentially human fallibility – both by the end-user and by the system itself.

The computer science behind generative AI is moving so fast that innovations emerge every month. How researchers choose to use them will dictate their, and our, future.

(Stokel-Walker and Van Noorden, 2023: 214-16)

The system I’m currently using – ChatGPT – is genuinely revolutionary in terms of how helpful it is for programming assistance – it has saved me hours and hours of time working out approaches I can take to programming problems. The underpinning technology is, in fact, several years old, but the way it has been implemented and made available is what sets it apart (expect many more of these sorts of systems over the next few years).

But it has also hallucinated things that simply don’t exist – and presented them in a convincing matter-of-fact way:

Q: How can I import a multispectral geotiff into Unreal Engine 5 and break it into different layers for selected wavebands all correctly geolocated?

ChatGPT: To import a multispectral GeoTIFF into Unreal Engine 5 and break it into different layers for selected wavebands, you will need to use a plugin or script that is capable of handling multispectral imagery. One example is the “GeoTIFF Import” plugin, which is available on the Unreal Engine Marketplace. This plugin allows you to import a GeoTIFF file and separate it into different layers for each band.

Once you’ve imported the GeoTIFF, you can use the “Landscape” tool to create a terrain that corresponds to the image. You can then use the “Landscape Layer Blend” node in the material editor to control the blending between the different layers of the image.

To correctly geolocate the image, you will need to use the “World Aligned Blocking” feature, which allows you to align the image with the terrain using GPS coordinates. In the Unreal Engine, you can use the “World Settings” panel to set the GPS coordinates for your image.

Alternatively, you can use Python scripts that are designed to handle the import of multispectral GeoTIFFs and separate them into different layers, such as rasterio and gdal libraries. You can then use these layers as textures in the Unreal Engine to create a 3D representation of the multispectral data with accurate geolocation.

I’ve highlighted the plausible falsehoods – for instance, there is no “GeoTIFF Import” plugin for Unreal Engine in the UE Marketplace. There are off-the-shelf solutions for this such as Landscaping, Cesium and Terraform Pro (amongst others out there on the interwebs) – and it would be great if UE could directly import GeoTIFFs via a native plugin. But there isn’t one – instead ChatGPT hallucinates a solution. It sounds plausible – just like using a ‘World Aligned Blocking’ feature, or simply using the ‘World Settings’ panel to set GPS coordinates.

Essentially the system is confabulating. Large Language Models are pejoratively referred to as ‘stochastic parrots’ – they act basically as ‘token prediction’ machines, where there is no awareness of ongoing context within a given text generated by the system, and certainly no concept of inference or common sense. Long passages of text or programming are simply realisations of the likelihood of these generated token streams on a larger scale than individual words**, articulated within the interpretive reception of the ‘user’, who may perceive ‘seeming’ coherence:

We say seemingly coherent because coherence is in fact in the eye of the beholder. Our human understanding of coherence derives from our ability to recognize interlocutors’ beliefs [30, 31] and intentions [23, 33] within context [32]. That is, human language use takes place between individuals who share common ground and are mutually aware of that sharing (and its extent), who have communicative intents which they use language to convey, and who model each others’ mental states as they communicate. As such, human communication relies on the interpretation of implicit meaning conveyed between individuals….

Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind. It can’t have been, because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that. This can seem counter-intuitive given the increasingly fluent qualities of automatically generated text, but we have to account for the fact that our perception of natural language text, regardless of how it was generated, is mediated by our own linguistic competence and our predisposition to interpret communicative acts as conveying coherent meaning and intent, whether or not they do [89, 140]. The problem is, if one side of the communication does not have meaning, then the comprehension of the implicit meaning is an illusion arising from our singular human understanding of language (independent of the model).

(Bender and Gebru, 2021:616)

Nevertheless, even with these caveats, the system provides a valuable and useful distillation of a hugely broad-range of knowledge, and can present it to the end user in an actionable way. This has been demonstrated by my use of it in exploring approaches toward Python programming for the manipulation of GIS data. It has been a kind of dialogue – as it has provided useful suggestions, clarified the steps taken in the programming examples it has supplied, and helped me correct processes that do not work.

But it is not a dialogue with an agent – seeming more akin to a revealing mirror, or a complex echo, from which I can bounce back and forth ideas, attempting to discern a truth for my questions. This brings with it a variety of risks, depending upon the context and domain in which it is applied:

The fundamental problem is that GPT-3 learned about language from the Internet: Its massive training dataset included not just news articles, Wikipedia entries, and online books, but also every unsavory discussion on Reddit and other sites. From that morass of verbiage—both upstanding and unsavory—it drew 175 billion parameters that define its language. As Prabhu puts it: “These things it’s saying, they’re not coming out of a vacuum. It’s holding up a mirror.” Whatever GPT-3’s failings, it learned them from humans. (Strickland, 2023)

Moving beyond this current state, the path to ‘true’ AI – human-level AI, AGI (Artificial General Intelligence) and ASI (Artificial Super-Intelligence) – may be shortish (20 years) or longish (50 years), but given the current pace of development, my impression is that it will be measured in decades, not centuries. Domain experts have already mapped out research programs that encompass many of the conceptual and scientific breakthroughs that need to be made for this to occur (Hutter, 2005; LeCun, 2022), neatly adumbrated by Friston et al. (2022):

Academic research as well as popular media often depict both AGI and ASI as singular and monolithic AI systems, akin to super-intelligent, human individuals. However, intelligence is ubiquitous in natural systems—and generally looks very different from this. Physically complex, expressive systems, such as human beings, are uniquely capable of feats like explicit symbolic communication or mathematical reasoning. But these paradigmatic manifestations of intelligence exist along with, and emerge from, many simpler forms of intelligence found throughout the animal kingdom, as well as less overt forms of intelligence that pervade nature. (p.4)

…AGI and ASI will emerge from the interaction of intelligences networked into a hyper-spatial web or ecosystem of natural and artificial intelligence. We have proposed active inference as a technology uniquely suited to the collaborative design of an ecosystem of natural and synthetic sensemaking, in which humans are integral participants—what we call shared intelligence. The Bayesian mechanics of intelligent systems that follows from active inference led us to define intelligence operationally, as the accumulation of evidence for an agent’s generative model of their sensed world—also known as self-evidencing. (p.19)

In the meantime, it is the role of the human interlocutor to establish the inferential framework within which we work with these systems. It is remarkable that what until recently seemed like science-fictional concepts are now available for use.

A critical awareness of machine learning and machine intelligence capabilities seems to me a prudent mindset to develop for any engagement with technology that interfaces with Earth observation systems – indeed, any observational system – because it is up to us human beings to develop frameworks for designing goals for these systems, to develop the capacity to interrogate and understand them in accessible ways, to discern objective and/or consensual truth, and to deploy them for good.
~
For argument’s sake – here are some hallucinated images of bushfires taken from a satellite, created using Stable Diffusion 1.5. Who’s to say they aren’t real images of real places? How would you be able to tell?
~
Fake Satellite I
Fake Satellite II
Fake Satellite III
Fake Satellite IV
Fake Satellite V
Fake Satellite VI
Notes:
**This is a supposition I have made that may or may not be correct (I don’t know, so I am rephrasing this/correcting this as my knowledge increases). Tokens are word fragments or components, and, apparently (see next posts) each is added singly on a stochastic basis – but presumably the probabilistic value is informed by larger-scale probabilistic structures than individual words in sequence. There must be syntagmatic and paradigmatic values at play.
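
As a concrete illustration of tokenisation (an aside of mine, not part of the original post), OpenAI’s open-source tiktoken library can show how a sentence is split into sub-word tokens – the encoding name below is an assumption about which model family you would be targeting:

# Illustrative sketch: how a sentence breaks into sub-word tokens.
# Requires `pip install tiktoken`; the encoding name is an assumption.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Hyperspectral imaging satellites orbit the Earth."
token_ids = enc.encode(text)

print(len(token_ids), "tokens")
for t in token_ids:
    # decode each token id back to its text fragment
    print(t, repr(enc.decode([t])))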

References:
Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. Virtual Event Canada: ACM, 2021. https://doi.org/10.1145/3442188.3445922.
Friston, Karl J, Maxwell J D Ramstead, Alex B Kiefer, Alexander Tschantz, Christopher L Buckley, Mahault Albarracin, Riddhi J Pitliya, et al. “Designing Ecosystems of Intelligence from First Principles,” December 2022. https://doi.org/10.48550/arXiv.2212.01354.
Hutter, Marcus. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Germany: Springer, 2005.
LeCun, Yann. “A Path Towards Autonomous Machine Intelligence Version 0.9.2, 2022-06-27,” July 2022. https://openreview.net/forum?id=BZ5a1r-kVsf&fbclid=IwAR2lFOl7Uy0lyfYbOya3sExmULL9uZopKuPbSJuNBqLYB2G9aob4j32fxqQ.
Stokel-Walker, Chris, and Richard Van Noorden. “What ChatGPT and Generative AI Mean for Science.” Nature 614, no. 7947 (February 6, 2023): 214–16. https://doi.org/10.1038/d41586-023-00340-6.
Strickland, Eliza. “OpenAI’s GPT-3 Speaks! (Kindly Disregard Toxic Language).” Accessed February 7, 2023. https://spectrum.ieee.org/open-ais-powerful-text-generating-tool-is-ready-for-business.

Visualising Phillip Island

One objective (amongst many) that Chris and I have identified is to start by looking at islands. The reason behind this is simple: an island presents a defined and constrained geographic area – not too big, not too small (depending on the island, of course) – and I’ve had a lot of recent experience working with island data on my recent Macquarie Island project with my colleague Dr Gina Moore. It provides some practical limits to the amount of data we have to work with, whilst engaging many of the technical issues one might have to address.

With this in mind, we’ve started looking at Phillip Island (near Melbourne, Victoria) and Kangaroo Island (South Australia). Both fall under a number of satellite paths, from which we can isolate multispectral and hyperspectral data. I’ll add some details about these in a later post – as there are clearly issues around access and rights that need to be absorbed, explored and understood.

For a first attempt, here is a Google Colab script for looking at some Phillip Island data from the XXX Satellite:

# Import necessary libraries
import os
from osgeo import gdal, gdal_array
from google.colab import drive
import numpy as np
import matplotlib.pyplot as plt
# Mount Google Drive
drive.mount('/content/drive')
#this will ask for permissions - you will need to login through your google account in a pop-up window
# Open the multispectral GeoTIFF file
#set the file path to the folder with the relevant data in it on your google drive (mount this first via the panel to the left of this one - it is called 'drive' and appears as a folder)
file_path = '/content/drive/MyDrive/DATA/XXX_multispectral.tif'
#set a variable for your path and the file you open
src_ds = gdal.Open(file_path)
#use gdal to get some characteristics of the data in the file
print("Projection: ", src_ds.GetProjection())  # get projection
print("Columns:", src_ds.RasterXSize)  # number of columns
print("Rows:", src_ds.RasterYSize)  # number of rows
print("Band count:", src_ds.RasterCount)  # number of bands
print("GeoTransform", src_ds.GetGeoTransform()) #affine transform
# Use gdalinfo command to print and save information about the raster file - this is extracted from the geotiff itself
info = gdal.Info(file_path)
print(info)
if not os.path.exists("/content/drive/MyDrive/DATA/OUTPUT"):
    os.makedirs("/content/drive/MyDrive/DATA/OUTPUT")
info_file = os.path.join("/content/drive/MyDrive/DATA/OUTPUT", "raster_info.txt")
with open(info_file, "w") as f:
    f.write(info)
# Retrieve the band count and band metadata
data_array = src_ds.GetRasterBand(1).ReadAsArray()
print(data_array.shape)  # print explicitly - a bare expression mid-cell won't display in Colab
band_count = src_ds.RasterCount
# Loop through each band and display in a matplotlib image
for i in range(1, band_count+1):
    band = src_ds.GetRasterBand(i)
    minval, maxval = band.ComputeRasterMinMax()
    data_array = band.ReadAsArray()
    plt.figure(figsize=(16, 9))
    plt.imshow(data_array, vmin=minval, vmax=maxval)
    plt.colorbar(anchor=(0, 0.3), shrink=0.5)
    plt.title("Band {} Data\n Min value: {} Max value: {}".format(i, minval, maxval))
    plt.suptitle("Raster data information")
    band_description = band.GetDescription()
    metadata = band.GetMetadata_Dict()
    geotransform = src_ds.GetGeoTransform()
    top_left_x = geotransform[0]
    top_left_y = geotransform[3]
    w_e_pixel_res = geotransform[1]
    n_s_pixel_res = geotransform[5]
    x_size = src_ds.RasterXSize
    y_size = src_ds.RasterYSize
    bottom_right_x = top_left_x + (w_e_pixel_res * x_size)
    bottom_right_y = top_left_y + (n_s_pixel_res * y_size)
    coordinates = ["Top left corner: ({},{})".format(top_left_x,top_left_y),"Bottom right corner:({},{})".format(bottom_right_x,bottom_right_y)]
    if band_description:
        metadata_list = ["Band description: {}".format(band_description)]
    else:
        metadata_list = ["Band description is not available"]
    if metadata:
        metadata_list += ["{}: {}".format(k, v) for k, v in metadata.items()]
    else:
        metadata_list += ["Metadata is not available"]
    plt.annotate("\n".join(coordinates+metadata_list), (0,0), (0, -50), xycoords='axes fraction', textcoords='offset points', va='top')
    plt.savefig("/content/drive/MyDrive/DATA/OUTPUT/Band_{}_Data.png".format(i))  # savefig doesn't accept vmin/vmax - those belong to imshow
    plt.show()

 

This works well enough for a plot – but it’s an open question (a debate, even) whether it is best/easiest to use GDAL or rasterio, the friendlier wrapper around GDAL originally developed at Mapbox. Tests will tell. And there is Pandas too. They all have pros and cons. Try it yourself and let us know.
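
For comparison, here’s a rough sketch of what the equivalent inspection-and-plot step might look like with rasterio – untested here, so treat it as a starting point rather than a drop-in replacement (the file path is the same placeholder as above):

# Rough rasterio equivalent of the GDAL inspection/plot step above - a sketch only.
import rasterio
import matplotlib.pyplot as plt

file_path = '/content/drive/MyDrive/DATA/XXX_multispectral.tif'  # same placeholder file as above

with rasterio.open(file_path) as src:
    print("CRS:", src.crs)
    print("Size:", src.width, "x", src.height)
    print("Band count:", src.count)
    print("Transform:", src.transform)

    # plot each band, much as in the GDAL version
    for i in range(1, src.count + 1):
        band = src.read(i)
        plt.figure(figsize=(16, 9))
        plt.imshow(band)
        plt.colorbar(shrink=0.5)
        plt.title("Band {} (rasterio)".format(i))
        plt.show()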

I’m looking into ways of sharing the colabs more directly for those who are interested – that’s the whole point.

Visualising Satellite data using Google Colab

Having spent a few hours reading documentation and having an ongoing conversation with chatGPT, I’m getting the hang of the hdf5 file structure and can now visualise some multispectral data in Google Colab:


from google.colab import drive
drive.mount('/content/drive')

import h5py
import numpy as np
import matplotlib.pyplot as plt

# Get the HDF5 file from your Google Drive
file_id = '/content/drive/MyDrive/DATA/file_name.he5'

with h5py.File(file_id, "r") as f:
    # List all groups
    print("Keys: %s" % f.keys())
    a_group_key = list(f.keys())[0]

    # Get the data
    data = list(f[a_group_key])

#This gives us some idea about the groups/keys in the hdf file and some idea about the datasets contained therein - but will become more detailed as we go along

# Open the HDF5 file
with h5py.File(file_id, 'r') as f:
    # Open the data field
    # currently this is hard-coded, as I know from HDFView that this is the path I want to look at - but really we want to find this programmatically.
    data_field = f['/path_to/Data Fields/Cube']
    # Print the attributes, data type and shape of the data field
    print(f'Attributes: {data_field.attrs}')
    print(f'Dtype: {data_field.dtype}')
    print(f'Shape: {data_field.shape}')
#This gives us some idea about the data cube we are examining - such as its attributes, data type and shape (typically rows and columns) - it'll print them to output

# Open the HDF5 file
with h5py.File(file_id, 'r') as f:
    # Open the data field
    data_field = f['/path_to/Data Fields/Cube']
    # Get the data and reshape it to 2D
    data = np.array(data_field[:]).reshape(data_field.shape[0], data_field.shape[1])
    # Rescale the data values to the 0-255 (8-bit) range for display
    data = np.uint8(np.interp(data, (data.min(), data.max()), (0, 255)))

# Create a figure with the specified size
fig = plt.figure(figsize=(8, 8))
# Add the data to the figure
plt.imshow(data, cmap='viridis')
# Display the figure
plt.show()

Satellite data plotted with viridis colourmap

Next steps involve developing a way of iteratively traversing the hdf5 directory structure, so that I can identify relevant data fields within the file – they’re not explicitly identified as ‘image files’. This can be done using h5py functions. Another thing to explore is GDAL: once I’ve identified the correct data in geolocation fields, it should become possible to output geotiffs or UE-friendly png files with geolocation metadata.
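
As a first pass, h5py’s visititems() will walk the whole hierarchy – here’s a minimal sketch (the file path is the same placeholder as above, and which datasets count as ‘image-like’ still needs to be judged from their shapes):

# Sketch: walk the HDF5 hierarchy and list every group and dataset, so that
# candidate image/data-cube fields can be found programmatically rather than in HDFView.
import h5py

file_id = '/content/drive/MyDrive/DATA/file_name.he5'  # same placeholder as above

def report(name, obj):
    # visititems calls this for every group and dataset in the file
    if isinstance(obj, h5py.Dataset):
        print("Dataset:", name, "shape:", obj.shape, "dtype:", obj.dtype)
    else:
        print("Group:  ", name)

with h5py.File(file_id, 'r') as f:
    f.visititems(report)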

Here’s some other matplotlib colourmaps applied to the same dataset.

It’s all pretty crude at this point – just figuring out how this stuff might work.

ChatGPT, Google Colab and Hyperspectral Data

It occurs to me that, rather than going to all the hassle of figuring out how to write all the code for Colab myself, I can use my ChatGPT assistant to do it for me:

Q: write a script for google colab to extract images from an hdf5 file located upon my google drive


Chat GPT:

Here is an example script that demonstrates how to extract images from an HDF5 file located on your Google Drive using the h5py library in Python, and the PyDrive library to interact with Google Drive:

# Install the PyDrive library
!pip install -U PyDrive

# Import the necessary libraries
import h5py
import numpy as np
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Get the HDF5 file from your Google Drive
file_id = '<YOUR FILE ID>'
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('hyperspectral_image_data.hdf5')

# Open the HDF5 file
with h5py.File('hyperspectral_image_data.hdf5', 'r') as hdf5_file:
    # Access the dataset containing the images
    images = hdf5_file['images']
    # Extract an image by its index
    image = np.array(images[0])
    print(image)

Make sure to replace <YOUR FILE ID> with the actual file ID of the HDF5 file on your Google Drive. You can find the file ID by opening the file in Google Drive and looking at the URL – the file ID is the string of letters and numbers after the last forward slash.

You can also use the gdal_translate command with the PyDrive library as well, just add the following lines of code to download the file and then use gdal_translate command to convert to tiff or geotiff

# Get the HDF5 file from your Google Drive
file_id = '<YOUR FILE ID>'
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('hyperspectral_image_data.hdf5')

# Use gdal_translate
!gdal_translate -of GTiff HDF5:"hyperspectral_image_data.hdf5"://dataset output.tif

You can then use any image processing library to open and read the image.


Lots of interesting things to unpack in this advice – and, as I expected, plenty of errors to debug in the code.

So, as a first step, let’s see if we can just get access to the file and some basic metadata.

# Install the PyDrive library
!pip install -U PyDrive
# Import the necessary libraries
import h5py
import numpy as np
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authenticate and create the PyDrive client
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# Get the HDF5 file from your Google Drive
file_id = '/content/drive/MyDrive/DATA/XXXXXXX.he5'
with h5py.File(file_id, "r") as f:
    # List all groups
    print("Keys: %s" % f.keys())
    a_group_key = list(f.keys())[0]
    # Get the data
    data = list(f[a_group_key])

 

Success! We can see the HDF keys returned below:

Keys: <KeysViewHDF5 ['HDFEOS', 'HDFEOS INFORMATION', 'Info', 'KDP_AUX']>

These match the top-level directory structure revealed by HDFView.

This is the first step in identifying the file contents, before we can drill down into the file structure to identify a whole bunch of parameters about the datacube that we can do interesting things with 🙂

Working with ChatGPT & Hyperspectral Data

Thus far I have been playing around with HDFView to extract hdf5 image data – but I am very interested in working with advanced machine learning systems and developing processes to interface them with Unreal Engine and online data sources. It seems the most tractable way to do this is through Python – which is pretty much the lingua franca of scientific computing.

As I am a neophyte when it comes to hyperspectral image data, let’s use ChatGPT to find out what might be an ‘easy’ or ‘best’ way to access image data within an hdf5 file:

Q: what is the best way to extract images from an hdf5 file containing hyperspectral image data?

ChatGPT:

One way to extract images from an HDF5 file containing hyperspectral image data is to use the h5py library in Python. This library allows you to open the HDF5 file and access the data stored within it, including the images. You can then use the numpy library to manipulate and extract the images as needed. Here is an example of how you might extract an image from an HDF5 file using h5py:

import h5py
import numpy as np

# Open the HDF5 file
with h5py.File('hyperspectral_image_data.hdf5', 'r') as hdf5_file:
    # Access the dataset containing the images
    images = hdf5_file['images']
    # Extract an image by its index
    image = np.array(images[0])
    print(image)

Another way is to use the gdal library in python, specifically with the gdal_translate utility, which can convert HDF5 to other raster formats like tiff or geotiff, and then use any image processing library to open and read the image.

gdal_translate -of GTiff HDF5:"file.hdf5"://dataset output.tif

In this way you will be able to use any library of your preference to read and extract the images.


I have no idea currently whether this will work, so let’s look up the relevant documentation:

Both look promising – and have saved me a heap of time looking for ways to do it!

The first outputs images as a NumPy array – meaning that we can examine/export each image by its index – which would be useful for selecting certain λ (wavelength) values and conducting operations upon them.

The second uses GDAL (Geospatial Data Abstraction Library), which provides powerful utilities for the translation of geospatial metadata – enabling correct geolocation of the hyperspectral image data, for instance.

So perhaps a combination of both will be useful as we proceed.
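
For the GDAL route, the gdal_translate utility is also exposed through the Python bindings as gdal.Translate, so something along these lines should work – the subdataset path after ‘://’ is a placeholder I’d need to confirm by running gdal.Info() on the file first:

# Sketch: convert one HDF5 subdataset to GeoTIFF via GDAL's Python bindings.
# The subdataset path is a placeholder - inspect the file with gdal.Info() first.
from osgeo import gdal

src = 'HDF5:"hyperspectral_image_data.he5"://path_to/Data_Fields/Cube'
dst = 'output.tif'

gdal.Translate(dst, src, format='GTiff')

# quick check of what was produced
print(gdal.Info(dst))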

But of course, any code generated by ChatGPT or OpenAI Codex or other AI assistants must be taken with several grains of salt. For instance, a recent Stanford study shows that users may write more insecure code when working with an AI code assistant (https://doi.org/10.48550/arXiv.2211.03622). Perhaps there are whole API calls and phrases that are hallucinated? I simply don’t know at this stage.

So – my next step will be to fire up a python environment – probably Google Colab or Anaconda and see what happens.

A nice overview of Codex here:

OpenAI Codex Live Demo


Scope: Hyperspectral Imaging, Earth Model, Unreal Engine

Our initial scope is to examine how we can use hyperspectral satellite data within Unreal Engine (UE).

Hyperspectral Imaging

Wikipedia entry for Hyperspectral Imaging:

Hyperspectral imaging collects and processes information from across the electromagnetic spectrum.[1] The goal of hyperspectral imaging is to obtain the spectrum for each pixel in the image of a scene, with the purpose of finding objects, identifying materials, or detecting processes.[2][3] There are three general branches of spectral imagers. There are push broom scanners and the related whisk broom scanners (spatial scanning), which read images over time, band sequential scanners (spectral scanning), which acquire images of an area at different wavelengths, and snapshot hyperspectral imaging, which uses a staring array to generate an image in an instant.

Whereas the human eye sees color of visible light in mostly three bands (long wavelengths – perceived as red, medium wavelengths – perceived as green, and short wavelengths – perceived as blue), spectral imaging divides the spectrum into many more bands. This technique of dividing images into bands can be extended beyond the visible. In hyperspectral imaging, the recorded spectra have fine wavelength resolution and cover a wide range of wavelengths. Hyperspectral imaging measures continuous spectral bands, as opposed to multiband imaging which measures spaced spectral bands.[4]

Hyperspectral imaging satellites are equipped with special imaging sensors that collect data as a set of raster images, typically visualised as a stack, with each layer representing a discrete wavelength captured by the sensor array.

HyperspectralCube.jpg
By Dr. Nicholas M. Short, Sr. - NASA Link

This represents a 3-dimensional hyperspectral data cube, consisting of x,y coordinates that correspond to Earth-coordinates within a defined geodetic reference system, and the λ coordinate which corresponds to spectral wavelength.
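
In array terms the cube is simply a 3D array, so pulling out the layer nearest a wavelength of interest is a basic indexing operation. Here’s a hedged sketch – the (band, y, x) ordering and the wavelengths array are assumptions that will vary between products:

# Sketch: select the band nearest a target wavelength from a hyperspectral cube.
# Assumes the cube is ordered (band, y, x) and that a 1-D array of band-centre
# wavelengths (in nm) is available - both are product-dependent assumptions.
import numpy as np

cube = np.random.rand(150, 512, 512)        # stand-in for a real data cube
wavelengths = np.linspace(400, 2500, 150)   # stand-in band centres (nm)

target_nm = 865.0                           # e.g. a near-infrared band
band_index = int(np.argmin(np.abs(wavelengths - target_nm)))

layer = cube[band_index]                    # 2-D image at ~865 nm
print("Nearest band:", band_index, "at", round(wavelengths[band_index], 1), "nm", layer.shape)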

Earth Model

An Earth Model means precisely that – a model of the Earth!

More specifically, the Earth is understood as a highly complex three-dimensional shape. It can be simply represented as a sphere, or more accurately as an oblate spheroid, reflecting the difference between the equatorial and polar diameters caused by the rotation of the planet.

The scientific study of the shape of the Earth is called Geodesy. Arising from this is the World Geodetic System, about which Wikipedia notes:

The World Geodetic System (WGS) is a standard used in cartography, geodesy, and satellite navigation including GPS. The current version, WGS 84, defines an Earth-centered, Earth-fixed coordinate system and a geodetic datum, and also describes the associated Earth Gravitational Model (EGM) and World Magnetic Model (WMM). The standard is published and maintained by the United States National Geospatial-Intelligence Agency.[1]

Most relevant to creating a digital model of the Earth is the Earth-centred, Earth-fixed coordinate system (or ECEF):

The Earth-centered, Earth-fixed coordinate system (acronym ECEF), also known as the geocentric coordinate system, is a cartesian spatial reference system that represents locations in the vicinity of the Earth (including its surface, interior, atmosphere, and surrounding outer space) as X, Y, and Z measurements from its center of mass.[1][2] Its most common use is in tracking the orbits of satellites and in satellite navigation systems for measuring locations on the surface of the Earth, but it is also used in applications such as tracking crustal motion.

The distance from a given point of interest to the center of Earth is called the geocentric distance, R = (X² + Y² + Z²)^0.5, which is a generalization of the geocentric radius, R₀, not restricted to points on the reference ellipsoid surface. The geocentric altitude is a type of altitude defined as the difference between the two aforementioned quantities: h′ = R − R₀;[3] it is not to be confused for the geodetic altitude.

Conversions between ECEF and geodetic coordinates (latitude and longitude) are discussed at geographic coordinate conversion.
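
As a concrete, hedged example of that conversion, the standard geodetic-to-ECEF formulae for the WGS 84 ellipsoid fit in a few lines of Python – this is the textbook formulation rather than code from any particular library (in practice something like pyproj would do this for you):

# Geodetic (lat, lon, height) -> ECEF (X, Y, Z) on the WGS 84 ellipsoid.
# Textbook formulation, for illustration only.
import math

def geodetic_to_ecef(lat_deg, lon_deg, h_m):
    a = 6378137.0              # WGS 84 semi-major axis (m)
    f = 1 / 298.257223563      # WGS 84 flattening
    e2 = f * (2 - f)           # first eccentricity squared

    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    N = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)   # prime vertical radius of curvature

    X = (N + h_m) * math.cos(lat) * math.cos(lon)
    Y = (N + h_m) * math.cos(lat) * math.sin(lon)
    Z = (N * (1 - e2) + h_m) * math.sin(lat)
    return X, Y, Z

# e.g. a point near Phillip Island, Victoria (illustrative coordinates)
X, Y, Z = geodetic_to_ecef(-38.49, 145.23, 0.0)
R = math.sqrt(X**2 + Y**2 + Z**2)   # geocentric distance, as described above
print(X, Y, Z, R)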

Unreal Engine

Unreal Engine 5.1 provides a comprehensive system for managing ECEF georeferenced data, including ‘flat’ and ’round’ planet projections:

https://docs.unrealengine.com/5.1/en-US/georeferencing-a-level-in-unreal-engine/

In addition to this, there are a variety of plugins available for managing georeferenced data.

For this research we are evaluating two initial approaches:

Cesium for Unreal: https://cesium.com/platform/cesium-for-unreal/

Landscaping Plugin:  https://landscaping.ludicdrive.com

More soon.

Introduction

Hi and welcome to our creative research journal for Playable Earth: Disrupting the status quo of Earth Observation Visualisation through immersion and interaction.

I’m going to keep this pretty informal – basically a blog with posts about our research progress, with links and references to relevant materials.

Chris and I have been meeting regularly since the beginning of the project in late November 2022, and will continue posting throughout the duration of the project until late June 2023. We’ve agreed to run the project on a 0.5 basis, allowing us to extend the residency from the usual 3-month full-time duration to ~6 months half-time. This fits in well with our various commitments to other projects and facilitates some cross-hybridisation.

Broadly speaking, the objective of our research is to explore novel ways of using a computer game engine – specifically Unreal Engine – in the deployment and visualisation of Earth Observation data. Our interest is in exploring various aspects of ‘immersive’ visualisation through XR (eXtended Reality). XR is a catch-all abbreviation for an array of technologies including augmented reality (AR) and virtual reality (VR).

An important caveat here is that we need to exemplify the utility of ‘immersive’ media. It’s a vague term that does not explicitly identify the utility of the form – if there is one, beyond obvious entertainment applications. My intuition is that there is, and that it is not terribly complicated – but we need to articulate knowledge about prospective users, human visual perception, semiotics and human-computer interaction as well. More about this in a later post.

Modern computer game engines provide a robust platform for this kind of research, subtended by the huge financial resources of the games industry and its investment in the technological infrastructure of software and compute resources, which are developing at a startling rate. Interaction with advanced machine learning systems – AI – is obvious and imperative. It makes sense to use game engines for sci-artistic visualisation, as they provide many useful features consolidated within a systematic approach to programming, content development and human-computer interaction (HCI).

Some difficulties arise in interfacing the data formats and approaches commonly used by creative industries technologies (CIT) with the more specialised forms used by scientific visualisation and visual analytics.

Furthermore, game engines can enable the development of novel aesthetic approaches towards visualisation that often fall outside the remit of strictly ‘scientific’ visualisation conventions. Thus a prospectively fertile intersection of art and science emerges – they can become much more conversant modalities for eliciting knowledge and insight about the world.

As an artist who has worked extensively with Unreal Engine on a number of personal projects, and as someone with a PhD in computational geophysics focusing upon scientific data visualisation, I find the opportunity to work with Chris and the SmartSat CRC at Swinburne timely and exciting – truly a way to draw together these strands of art and science in some blue-sky research.

We also express our gratitude to the Australian Network for Art and Technology (ANAT), CEO Melissa Delaney and her staff for facilitating this novel approach to supporting this research – we hope it paves the way for future initiatives of this type.

Playable Earth :: ANAT Bespoke Residency

Playable Earth

Disrupting the status quo of Earth Observation Visualisation through immersion and interaction.

This project aims to explore novel ways of visualising and interacting with satellite-originated Earth observation data and analytics using modern game development environments. The goal is to develop novel modalities for extended reality (XR) display and interrogation of selected data, including near real-time observations. The project will be open-ended and speculative, and will document its progress through this creative research journal and video demonstrations. It will also consider the potential of using platforms such as the National Broadband Network and 5G to distribute scientific knowledge and enquiry. The project will be informed by the Capability Demonstrators under development by the SmartSat Cooperative Research Centre, which focus on enhancing disaster resilience, water quality monitoring and situational awareness across the Indian, Pacific and Southern Ocean regions.

ANAT Bespoke Residency

This is the first pilot bespoke residency established between ANAT and a host institution. The residency is hosted by the SmartSat CRC at the Swinburne University Centre for Astrophysics and Supercomputing.

People:

Dr Peter Morse (Computational researcher and experimental media artist)

Bio: https://www.petermorse.com.au/about/

Prof Christopher Fluke (SmartSat Professorial Chair, Swinburne University of Technology)

Bio: https://www.swinburne.edu.au/research/our-research/access-our-research/find-a-researcher-or-supervisor/researcher-profile/?id=cfluke