O wad some Pow’r the giftie gie us
To see oursels as ithers see us!
It wad frae mony a blunder free us,
An’ foolish notion:
What airs in dress an’ gait wad lea’e us,
An’ ev’n devotion!
Unreal Engine is a robust platform for exploring how to develop applications in XR. So, a few weeks ago I worked through the documentation and various tutorials for developing an Augmented Reality app using Unreal Engine (UE) 5.1.
It’s confusing to say the least – as there is lots of advice available coming from different quarters, and the processes of compilation and deployment are quite complicated. Unnecessarily complicated, in my opinion – but I understand why such requirements are involved.
UE comes with an AR Template that can cross-compile for both iOS and Android. My initial thought – for ease of use – was to go and buy a cheapish Android device (tablet or phone) and deploy locally on that. But of course it isn’t that simple – there are a whole bunch of technical requirements for the platform e.g. Android OS version, CPU, GPU etc – and these must be matched with the correct Android SDK. So, a whole lot of time required to figure all that stuff out – surely there is an easier way?
There is, to a degree, on iOS, as the range of models is much more limited, and iOS seems more standardised and slightly simpler to target – despite a whole bunch of security requirements, such as developer signing certificates. Plus I already have a few iOS devices around to test on – my iPhone XR and an iPad Pro.
The next problem is that I require a recent Apple computer on which to run UE and XCode in a recent version of MacOS. These days, as I mainly develop using Windows 11 (for graphics) and Ubuntu Linux 22.04 LTS (for machine learning), this presented a problem. I have a bunch of MacPro 5,1 towers that I’ve kept as a render farm – but the most up-to-date one is a mid-2010 model, with a Metal-capable Radeon RX580 GPU that was running MacOS 10.14 Mojave. This was the last OS officially supported for the ‘Classic’ Mac Pro desktop (cMP) in 2018, last updated in 2019. Lots of software was becoming ‘non-updateable’ on it, not only because it is a 13-year-old computer, but because Mojave was no longer officially supported by Apple.
So what to do?
Fortunately there is a community of hackers who have developed work-arounds for supporting newer OS versions on unsupported machines, ranging from dosdude1’s easy-to-use MacOS patchers, to OCLP, to Martin Lo’s cMP OpenCore package. Having read around the subject and having contemplated updating my old machines for the last few years (!), I decided to take the plunge and use Martin Lo’s package, as it is specifically targeted at Mac Pro ‘cheese-graters’ like mine. It involves minimal patching of the underlying OS (unlike OCLP): it is principally a modified bootloader, and is relatively easy to install and uninstall without bricking your machine. There’s even a very friendly and helpful Facebook group for technical discussions.
Happily it’s worked perfectly. That means I can develop as usual in Windows and use my ancient machine to deploy on iOS devices. So – I’ve got the UE 5.1 AR Template Demo compiled on MacOS Monterey 12.6.1 and XCode 14.2, running on my old cheese-grater cMP – and installed on my 2015 iPad Pro running iOS 16.3. It runs fine. Problem solved.
I have no loyalty to any particular platform - they're just systems with which to make things, and they each have their strengths and weaknesses. If possible I try and avoid vendor lock-in by exploring opensource approaches - but some degree of it seems to be unavoidable. I'm all for open standards and cross-platform compatibility. Having said this UE is my current engine of choice (though I keep abreast of Unity and Godot), and it works well enough with a broad opensource/source-available ecosystem.
Here’s a shot of it showing the default architectural model, on a placemat on a kitchen table. It works fine and has been useful for learning – but looks kind of clunky.
The following is a screenshot of the impressive AR app by Handbuilt Creative – demonstrating a photorealistic animated Psittacosaurus. There is obviously a lot of work behind this, and it indicates the level of detail that can be achieved. Of course, I would expect more recent devices would be capable of a lot more – but I don’t have the budget to just pop out and buy new tech, unfortunately, so it’s a matter of working with what I have available.
Things have progressed a great deal from here (thanks to a bunch of developments announced at GDC 2023) – I’ll cover them in a future post.
Here’s a summary of the steps to get UE 5.1 AR working, courtesy of ChatGPT:
Provisioning for iOS
If you want to distribute your iOS app made with Unreal Engine 5.1 through the Apple App Store, you will need to create a new certificate and provisioning profile for each version of your app. This is because each version of your app is treated as a separate and distinct application by Apple.
To publish your app to the App Store, you will need to create an App ID, a Distribution Certificate, and a Provisioning Profile for each version of your app. You will also need to configure your Unreal Engine 5.1 project to use these credentials for building and packaging your app.
Here are the steps you need to undertake on your Apple Developer account to create the necessary certificates and provisioning profiles for your Unreal Engine 5.1 app, as well as how to enter these details into Unreal Engine:
1. Create an App ID:
Log in to your Apple Developer account.
Click on “Certificates, Identifiers & Profiles” in the sidebar.
Click on “Identifiers” and then click the “+” button to create a new App ID.
Choose “App IDs” from the “Register” dropdown menu, then select “iOS App” as the App ID type.
Enter a name for your App ID and a unique Bundle ID (e.g. com.mycompany.myapp).
Click “Continue” and review your App ID details, then click “Submit”.
2. Create a Distribution Certificate:
In the “Certificates, Identifiers & Profiles” section, click on “Certificates”.
Click on the “+” button to create a new certificate.
Select “App Store and Ad Hoc” as the certificate type and click “Continue”.
Follow the instructions to create a certificate signing request (CSR) using Keychain Access on your Mac.
Upload the CSR and click “Continue”.
Download the distribution certificate and install it on your Mac by double-clicking the downloaded file.
3. Create a Provisioning Profile:
In the “Certificates, Identifiers & Profiles” section, click on “Provisioning Profiles”.
Click on the “+” button to create a new profile.
Select “App Store” as the provisioning profile type and click “Continue”.
Choose the App ID you created in step 1 and click “Continue”.
Choose the distribution certificate you created in step 2 and click “Continue”.
Give your provisioning profile a name and click “Continue”.
Download the provisioning profile and install it on your Mac by double-clicking the downloaded file.
4. Enter the details into Unreal Engine:
Open your Unreal Engine project and go to the “Project Settings” section.
Under “Platforms”, select “iOS”.
Enter your Bundle ID in the “Bundle Identifier” field.
Click the “Import” button next to the “Certificate” field and select the distribution certificate you created in step 2.
Click the “Import” button next to the “Provisioning Profile” field and select the provisioning profile you created in step 3.
Save your settings and build your iOS app in Unreal Engine.
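Before importing a downloaded profile into UE, it can save a failed build to check that the profile really covers the Bundle ID you typed in. The following is only a minimal sketch, not part of the ChatGPT summary above. It assumes the .mobileprovision file is a signed container with a readable XML property list embedded in it, and that the list carries an `Entitlements` dictionary with an `application-identifier` of the form `<TeamID>.<bundle id>`. The helper names and the example identifiers are hypothetical.

```python
import plistlib


def read_embedded_plist(raw: bytes) -> dict:
    """Pull the XML property list out of a signed .mobileprovision blob."""
    start = raw.find(b"<?xml")
    end = raw.find(b"</plist>")
    if start == -1 or end == -1:
        raise ValueError("no embedded property list found")
    return plistlib.loads(raw[start:end + len(b"</plist>")])


def profile_matches_bundle(profile: dict, bundle_id: str) -> bool:
    """True if the profile's App ID covers bundle_id (a trailing * is a wildcard)."""
    app_id = profile["Entitlements"]["application-identifier"]
    # Assumed layout: "<TeamID>.<bundle id or wildcard pattern>"
    _, _, pattern = app_id.partition(".")
    if pattern.endswith("*"):
        return bundle_id.startswith(pattern[:-1])
    return pattern == bundle_id


if __name__ == "__main__":
    # Stand-in for Path("MyApp.mobileprovision").read_bytes() (hypothetical file name):
    # some binary wrapper bytes around an XML plist.
    fake_profile = (
        b"\x30\x82wrapper"
        + plistlib.dumps({
            "Name": "Demo Profile",
            "Entitlements": {"application-identifier": "ABCDE12345.com.mycompany.myapp"},
        })
        + b"\x00signature"
    )
    profile = read_embedded_plist(fake_profile)
    print(profile["Name"], profile_matches_bundle(profile, "com.mycompany.myapp"))
```

If the check fails, the usual culprit is a Bundle ID typed into the UE project settings that differs from the one registered in step 1.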
A big moment in the history of AI, with potentially huge ramifications. As the redoubtable Jensen Huang put it – it’s AI’s ‘iPhone moment’. I’d agree with this.
A couple of discussions to absorb (ignore the gushy breathlessness and hype – there are some useful insights) :
I’m putting these links up as a spur to discussion, as they cover very important developments in AI and computing. Of course, the scholarly and scientific analyses of these will come later – so for now, take everything with several large grains of salt. Nevertheless, it is an instance where the rate of technological change far exceeds the rate of scientific analysis, ethical competence or legal constraint.
As these developments have only occurred within the last 24hrs or so, they have enabled me to rethink some of the approaches I’ve been taking for our residency – that I’ll detail in a future post. Exciting times!
<propaganda>GPT-4 announcement video – slow panning shots of youngish people in nice rooms discussing important issues, nodding and smiling and agreeing as they type and look at screens. Apparently, these are the people working upon, deciding (ahem, asking about) your future. They’re the same as you, aren’t they? So that’s OK. But you have no real idea who they are or why they are doing this in a mix of stock imagery and apparently ‘real’ interviews. A benign info-utopia of cognitive enhancement awaits, extrapolated from everything that can be scraped from the internet (including your lifetime musings anywhere, thoughts, pictures, publications) – and that ~may~ solve your problems and ~maybe~ some other people.</propaganda>
A bit more reality:
Yep, it’s amazingly useful for programming. Really, there’s a lot to unpack in this – too much.
It’s hard to overstate the significance of this for the compute, media, cultural and economic ‘First-World’ landscape. This is the first of what will be an economic and colonial race between huge corporations and state entities for the most capable systems – it is inherently evolutionary.
Perhaps most interesting, but unsurprising, is that the system demonstrates ‘emergence’ – that there are unpredictable, non-linear capabilities that form from scale, systematic retraining and self-artifice (recursion) – as well as the evolutionary ergodicity of systems that self-organise [1] – its capacity for survival (self-optimisation). It doesn’t need to be conscious or self-aware to do that – it’s an emergent property of parahuman complexity. Arguably, its creators are already its servants.
<irony>if GPT-4 aced the Bar Exam, then presumably it would objectively aid in its own legislation. </irony>
It’s hard to keep up with all the implications of these language models, as the rate of improvement is so astonishing – and they have such serious implications.
Below is an apparently informed but slightly breathless discussion that is interesting – now that GPT-4 has been released – with some remarkable capabilities demonstrating advanced multimodal inference and human-level common-sense.
So watch it skeptically.
For the time being, I’ll leave these here and return to programming and visualisation issues in the next few posts – assisted, of course, by my new advanced agent. And re-reading Nick Bostrom’s book ‘Superintelligence: Paths, Dangers, Strategies’, wondering what GPT-10 will be like.
It certainly won’t be an LLM. It will be a LWM.
A Large World Model – it will need to be ‘embodied’ to converge to the human.
Will it have the will to will it?
Update: An interesting (and slightly alarming) addendum that is worth absorbing:
This will definitely create a lot of discussion and contention, because the ramifications are so significant.
ChatGPT explains: “Evolutionary ergodicity is a term used to describe a property of complex systems that self-organize and evolve over time. Ergodicity refers to a property of a system in which its statistical properties are invariant over time. In other words, a system that is ergodic will have the same behavior on average, regardless of when or how it is observed. In the context of evolution, ergodicity implies that the system will eventually explore all possible states and outcomes. Therefore, evolutionary ergodicity means that a self-organizing system will eventually explore all possible ways of surviving and adapting to its environment over time. Evolutionary ergodicity is an important property for complex systems because it allows them to explore new possibilities and adapt to changing environments. It is particularly relevant in the field of evolutionary biology, where it helps explain how complex organisms can adapt and evolve over time.” (2023-03-15, 10.51.49pm)
Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y.T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M.T., Zhang, Y., 2023. Sparks of Artificial General Intelligence: Early experiments with GPT-4.
Between 12 and 15 February I made my first visit to the Swinburne Centre for Astrophysics and Supercomputing – somewhat ironically housed within the old Art building on Swinburne Campus in Hawthorn. Appropriate for an Art-Science Residency I guess!
Here’s a gallery of shots to give an idea of what I saw there – it was a great opportunity to meet up in person with Chris and some of the amazing team there. I’ll add more detail soon.
This post is a more general discussion about ChatGPT and related systems – it’s important to cut through the hype surrounding LLMs and absorb the sober scientific and cultural perspectives and questions around these systems and their capabilities. The impact of these systems on knowledge work and creative work is going to be huge in the near future – so time to start understanding their context and implications.
This is an excellent technical discussion about how Neural Nets (NNs) work, with interesting questions about the internal ‘black box’ goings-on – that are in general quite inscrutable. Wolfram is arguing for a rigorous scientific understanding of NNs, as they seem principally to have arisen as engineering exercises – things that work, but no-one really understands exactly why (‘lore’ in Wolfram’s estimation). This is a sharp counterpoint to the feuilleton hype about AI (which is, really, ‘Machine Learning’, or ‘Machine Representation’, as it is not ‘truly’ intelligent or aware). He makes interesting points about interfacing something like ChatGPT with Wolfram Alpha, which is a kind of computational knowledge engine, and argues convincingly that an interface between the two systems could solve many of the factual errors confabulated by the LLM, and provide something much more powerful in combination: a system that is ‘factually’ connected to the ‘world’ – and perhaps even capable of causal inference as a result.
The discussion touches upon several interesting philosophical/theoretical areas concerning the construction and emergence of language and discourse.
Human language—and the processes of thinking involved in generating it—have always seemed to represent a kind of pinnacle of complexity. And indeed it’s seemed somewhat remarkable that human brains—with their network of a “mere” 100 billion or so neurons (and maybe 100 trillion connections) could be responsible for it. Perhaps, one might have imagined, there’s something more to brains than their networks of neurons—like some new layer of undiscovered physics. But now with ChatGPT we’ve got an important new piece of information: we know that a pure, artificial neural network with about as many connections as brains have neurons is capable of doing a surprisingly good job of generating human language. (Wolfram, 2023)
Of great interest to me is the possibility of what one might call ‘empirical semiotics’ or ‘computational semiotics’ – where semiotic generation and analysis (semiosis) could be underpinned by computational forms of emergence, categorisation and logic.
The success of ChatGPT is, I think, giving us evidence of a fundamental and important piece of science: it’s suggesting that we can expect there to be major new “laws of language”—and effectively “laws of thought”—out there to discover. In ChatGPT—built as it is as a neural net—those laws are at best implicit. But if we could somehow make the laws explicit, there’s the potential to do the kinds of things ChatGPT does in vastly more direct, efficient—and transparent—ways. (Wolfram, 2023)
Presumably many of these ‘laws’ are already uncovered or at least hinted-at by research in NLP and computational language models – but it appears very much a contentious field – especially with regard to any formulation of what ‘intelligence’ is.
If there is one constant in the field of artificial intelligence it is exaggeration: There is always breathless hype and scornful naysaying. It is helpful to occasionally take stock of where we stand. (Browning & LeCun, 2022b)
The underlying problem isn’t the AI. The problem is the limited nature of language. Once we abandon old assumptions about the connection between thought and language, it is clear that these systems are doomed to a shallow understanding that will never approximate the full-bodied thinking we see in humans. In short, despite being among the most impressive AI systems on the planet, these AI systems will never be much like us. (Browning & LeCun, 2022a)
The field of semiotics has a significant body of work covering these differing/interdependent signifying regimes – but not a great deal that is computationally reducible, as it has been more in the form of ‘literary criticism’ or humanities ‘theory’ (including in my own research background). This is clearly inadequate for a scientific approach as it is far too qualitative – more ‘top down’ than ‘bottom up’ – reminiscent of debates concerning symbolic reasoning vs (what might be termed) ‘emergent’ reasoning. However, there are examples in the work of C.S. Peirce, M.A.K. Halliday and others that may be useful for thinking about this domain. In the area of cognitive neuroscience/neuroanthropology I immediately think of the work of T. Deacon, A. Damasio, D. Deutsch, J. Hawkins and others that synthesise this inter-disciplinary domain of knowledge into useful ways of thinking about what might constitute an intelligent system and how it might emerge.
For non-specialists like me there is much to absorb – that can inform ways of critically engaging with this novel technology.
It is terribly important to not be naive about this stuff (AI), as it will have (and already has had) transformational impacts upon personhood, economic, political and natural systems, for good and for bad. It is hard to imagine a future when a self-aware, agentive machine intelligence is more than a science ‘fiction.’ It sounds absurd, but perhaps it isn’t.
Imagine a world where people’s online images, text, music, voice recordings, videos, and code get gathered largely without consent to train AI models, and sold back to them for $10 a month. We’re already there but imagine something beyond that – and assume it’s incredible…
…Here’s a thought experiment: imagine an AGI system that advises taxing billionaires at a rate of 95 percent and redistributing their wealth for the benefit of humanity. Will it ever be hooked into the banking system to effect its recommended changes? No, it will not. Will those minding the AGI actually carry out those orders? Again, no.
No one with wealth and power is going to cede authority to software, or allow it to take away even some of their wealth and power, no matter how “smart” it is. No VIP wants AGI dictating their diminishment. And any AGI that gives primarily the powerful and wealthy more power and wealth, or maintains the status quo, is not quite what we’d describe as a technology that, as OpenAI puts it, benefits all of humanity. (Claburn, T., The Inquirer, 2023)
We don’t have a good definition of intelligence – so it seems best to define it operationally (as Friston et al. (2022) do). At this stage the take-away is that LLMs are clearly what the label says: they are language models, not artificial intelligences – they are, literally, RhetoricalDevices.
LLMs statistically parameterise a huge amount of ‘knowledge’ about linguistic representations of the world – based upon their massive set of ‘training’ data. These terms are information, signs, similes, metaphors, metonyms, synecdoches – abstractions that can exhibit indefiniteness: degrees of epistemic and ontic undecidability or infinite regression. Uncertainty.
LLMs seem to respond dialogically, perhaps following chains of reasoning akin to the vectors in ‘meaning space’ that Wolfram discusses (‘embeddings’ – examples of t-SNE or word2vec dimensional reduction plots). These dialogues can also be guided by user interaction via the Chatbot query interface through ‘chain-of-thought’ reasoning – which demonstrably improves the model performance (even, it seems, when the model performs what might be the equivalent of ‘self-talk’).
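To make the idea of vectors in ‘meaning space’ a little more concrete, here is a toy sketch of one common way of comparing embeddings: cosine similarity, which measures how nearly two vectors point in the same direction. The three-dimensional vectors and the word list are invented purely for illustration. Real embeddings are learned from data and have hundreds or thousands of dimensions, and nothing here is taken from Wolfram’s article.

```python
import math

# Hypothetical, hand-made "embeddings" (illustration only, not from any real model)
EMBEDDINGS = {
    "parrot": [0.9, 0.1, 0.0],
    "cockatoo": [0.8, 0.2, 0.1],
    "satellite": [0.0, 0.2, 0.9],
    "telescope": [0.1, 0.3, 0.8],
}


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word):
    """Return the other word whose vector is most similar to word's vector."""
    return max(
        (w for w in EMBEDDINGS if w != word),
        key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]),
    )


if __name__ == "__main__":
    for w in EMBEDDINGS:
        print(w, "->", nearest(w))
```

In this made-up space the two birds sit close together and the two instruments sit close together, which is the kind of neighbourhood structure that the dimensional reduction plots try to show in two dimensions.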
Larger models seem to improve inferential reasoning – yet presumably there will be drawbacks or limits in a scale-only approach. Not least amongst these being the prodigious amounts of compute required, and their concomitant use of electricity and consequent carbon-impacts.
Are they ‘emulations’ or ‘simulations’? What would this distinction imply?*** To me, it indicates that it (an LLM) is a map, not an actor; a palimpsest, not an agent.
A counterpoint.
At no point with ChatGPT is there any self-initiation. Purpose and curiosity, at this juncture, seem very much a human property. Whether this will continue to be the case, time will tell.
For our current purposes we can be reasonably confident about the ongoing necessity of the ‘human-in-the-loop’ approach we’re pursuing. For the time being. And with which humans?
*ChatGPT apparently implements this type of response training interface – thumbs-up/thumbs-down.
***Thanks to my colleague P. Bourke for drawing this distinction to my attention.
Daull, Xavier, Patrice Bellot, Emmanuel Bruno, Vincent Martin, and Elisabeth Murisasco. “Complex QA and Language Models Hybrid Architectures, Survey.” arXiv, February 17, 2023. http://arxiv.org/abs/2302.09051.
Deacon, Terrence W. Incomplete Nature: How Mind Emerged from Matter. WW Norton & Company, 2011.
Deacon, Terrence W. The Symbolic Species: The Co-Evolution of Language and the Brain. WW Norton & Company, 1998. ISBN:9780393049916
Deutsch, David. The Beginning of Infinity: Explanations That Transform The World. Penguin UK, 2011. ISBN: 9780140278163
Friston, Karl J, Maxwell J D Ramstead, Alex B Kiefer, Alexander Tschantz, Christopher L Buckley, Mahault Albarracin, Riddhi J Pitliya, et al. “Designing Ecosystems of Intelligence from First Principles,” December 2022. https://doi.org/10.48550/arXiv.2212.01354.
Halliday, Michael Alexander Kirkwood. Language as Social Semiotic. London Arnold, 1978. ISBN:9780713159677
Huang, Jiaxin, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, and Jiawei Han. “Large Language Models Can Self-Improve.” arXiv, October 25, 2022. http://arxiv.org/abs/2210.11610.
Ouyang, Long, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al. “Training Language Models to Follow Instructions with Human Feedback.” arXiv, March 4, 2022. http://arxiv.org/abs/2203.02155.
Pearl, J., Mackenzie, D., 2018. The Book of Why: The New Science of Cause and Effect. Penguin UK. ISBN: 9780141982410
Tanaka-Ishii, K., 2010. Semiotics of Programming. Cambridge University Press. ISBN:9780521516556
Tenachi, W., Ibata, R., Diakogiannis, F.I., 2023. Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws. https://doi.org/10.48550/arXiv.2303.03192
Massimini, M., Tononi, G., 2018. Sizing Up Consciousness: Towards an Objective Measure of the Capacity for Experience. Oxford University Press. ISBN:9780198728443
It’s important when working with the current generation of AI systems (principally Large Language Models or LLMs) to approach them with caution and circumspection. The reason for this is essentially human fallibility – both by the end-user and by the system itself.
The computer science behind generative AI is moving so fast that innovations emerge every month. How researchers choose to use them will dictate their, and our, future.
The system I’m currently using – ChatGPT – is genuinely revolutionary in terms of how helpful it is for programming assistance – it has saved me hours and hours of time working out some approaches I can develop to programming problems. The underpinning technology is, in fact, several years old, but the way it has been implemented and its availability is what sets it apart (expect many more of these sorts of systems over the next few years).
But it has also hallucinated things that simply don’t exist – and presented them in a convincing matter-of-fact way:
Q: How can I import a multispectral geotiff into Unreal Engine 5 and break it into different layers for selected wavebands all correctly geolocated?
ChatGPT: To import a multispectral GeoTIFF into Unreal Engine 5 and break it into different layers for selected wavebands, you will need to use a plugin or script that is capable of handling multispectral imagery. One example is the “GeoTIFF Import” plugin, which is available on the Unreal Engine Marketplace. This plugin allows you to import a GeoTIFF file and separate it into different layers for each band.
Once you’ve imported the GeoTIFF, you can use the “Landscape” tool to create a terrain that corresponds to the image. You can then use the “Landscape Layer Blend” node in the material editor to control the blending between the different layers of the image.
To correctly geolocate the image, you will need to use the “World Aligned Blocking” feature, which allows you to align the image with the terrain using GPS coordinates. In the Unreal Engine, you can use the “World Settings” panel to set the GPS coordinates for your image.
Alternatively, you can use Python scripts that are designed to handle the import of multispectral GeoTIFFs and separate them into different layers, such as rasterio and gdal libraries. You can then use these layers as textures in the Unreal Engine to create a 3D representation of the multispectral data with accurate geolocation.
I’ve highlighted the plausible falsehoods – for instance, there is no GeoTIFF import plugin for Unreal Engine in the UE Marketplace. There are off-the-shelf solutions for this such as Landscaping, Cesium and Terraform Pro (amongst others out there on the interwebs) – and it would be great if UE could directly import GeoTIFFs via a native plugin. But there isn’t – instead ChatGPT hallucinates a solution. It sounds plausible – just like using a ‘World Aligned Blocking’ feature, or simply using the ‘World Settings’ panel to set GPS coordinates.
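For contrast with the invented ‘World Settings’ route: as far as I understand it, the georeferencing of a GeoTIFF travels in the file’s own metadata, typically as an affine transform from pixel indices to map coordinates, which libraries such as GDAL and rasterio (both of which do exist) can read. Below is a minimal sketch of that mapping using the six-coefficient ordering that GDAL documents. The example raster and its coordinates are hypothetical, and the sketch says nothing about splitting bands or getting the result into UE.

```python
from typing import Tuple

# GDAL-style affine geotransform, in this order:
# (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_world(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Map a pixel position (col, row) to map coordinates (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Hypothetical north-up raster with 10 m pixels whose top-left corner sits at
# (500000, 6100000) in some projected coordinate system. The pixel height is
# negative because row numbers increase southwards.
EXAMPLE_GT: GeoTransform = (500000.0, 10.0, 0.0, 6100000.0, 0.0, -10.0)

if __name__ == "__main__":
    print(pixel_to_world(EXAMPLE_GT, 0, 0))  # top-left corner of the raster
    print(pixel_to_world(EXAMPLE_GT, 3, 2))  # 3 pixels east, 2 pixels south
```

Every band of a multispectral image shares this one transform, which is why separating the wavebands and keeping them ‘all correctly geolocated’ are really two separate problems.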
Essentially the system is confabulating. Large Language Models are pejoratively referred to as ‘stochastic parrots’ – they act basically as ‘token prediction’ machines, where there is no awareness of on-going context within a given text generated by the system, and certainly no concept of inference or common-sense. Long passages of text or programming are simply realisations of the likelihood of these generated token streams on a larger scale than individual words**, articulated within the interpretive reception of the ‘user’ that may perceive ‘seeming’ coherence:
We say seemingly coherent because coherence is in fact in the eye of the beholder. Our human understanding of coherence derives from our ability to recognize interlocutors’ beliefs [30, 31] and intentions [23, 33] within context [32]. That is, human language use takes place between individuals who share common ground and are mutually aware of that sharing (and its extent), who have communicative intents which they use language to convey, and who model each others’ mental states as they communicate. As such, human communication relies on the interpretation of implicit meaning conveyed between individuals….
Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind. It can’t have been, because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that. This can seem counter-intuitive given the increasingly fluent qualities of automatically generated text, but we have to account for the fact that our perception of natural language text, regardless of how it was generated, is mediated by our own linguistic competence and our predisposition to interpret communicative acts as conveying coherent meaning and intent, whether or not they do [89, 140]. The problem is, if one side of the communication does not have meaning, then the comprehension of the implicit meaning is an illusion arising from our singular human understanding of language (independent of the model).
Nevertheless, even with these caveats, the system provides a valuable and useful distillation of a hugely broad-range of knowledge, and can present it to the end user in an actionable way. This has been demonstrated by my use of it in exploring approaches toward Python programming for the manipulation of GIS data. It has been a kind of dialogue – as it has provided useful suggestions, clarified the steps taken in the programming examples it has supplied, and helped me correct processes that do not work.
But it is not a dialogue with an agent – seeming more akin to a revealing mirror, or a complex echo, from which I can bounce back and forth ideas, attempting to discern a truth for my questions. This brings with it a variety of risks, depending upon the context and domain in which it is applied:
The fundamental problem is that GPT-3 learned about language from the Internet: Its massive training dataset included not just news articles, Wikipedia entries, and online books, but also every unsavory discussion on Reddit and other sites. From that morass of verbiage—both upstanding and unsavory—it drew 175 billion parameters that define its language. As Prabhu puts it: “These things it’s saying, they’re not coming out of a vacuum. It’s holding up a mirror.” Whatever GPT-3’s failings, it learned them from humans.
Moving beyond this current state, the path to ‘true’ AI, human-level AI, AGI (Artificial General Intelligence) and ASI (Artificial Super-Intelligence), may be shortish (20 years) or longish (50 years) – but given the current pace of development, my impression is that it will be measured in decades, not centuries. Domain experts have already mapped out research programs that encompass many of the conceptual and scientific breakthroughs that need to be made for this to occur (Hutter, 2005; LeCun, 2022), neatly adumbrated by Friston et al. (2022):
Academic research as well as popular media often depict both AGI and ASI as singular and monolithic AI systems, akin to super-intelligent, human individuals. However, intelligence is ubiquitous in natural systems—and generally looks very different from this. Physically complex, expressive systems, such as human beings, are uniquely capable of feats like explicit symbolic communication or mathematical reasoning. But these paradigmatic manifestations of intelligence exist along with, and emerge from, many simpler forms of intelligence found throughout the animal kingdom, as well as less overt forms of intelligence that pervade nature. (p.4)
…AGI and ASI will emerge from the interaction of intelligences networked into a hyper-spatial web or ecosystem of natural and artificial intelligence. We have proposed active inference as a technology uniquely suited to the collaborative design of an ecosystem of natural and synthetic sensemaking, in which humans are integral participants—what we call shared intelligence. The Bayesian mechanics of intelligent systems that follows from active inference led us to define intelligence operationally, as the accumulation of evidence for an agent’s generative model of their sensed world—also known as self-evidencing. (p.19)
In the meantime, it is the role of the human interlocutor to establish the inferential framework with which we work with these systems. It is remarkable that what until recently seemed like science-fictional concepts are now available for use.
A critical awareness of machine learning and machine intelligence capabilities seems to me to be a prudent mindset to develop for any engagement with technology that interfaces with Earth observation systems – indeed, any observational system, because it is up to us human beings to develop frameworks for designing goals for these systems, and developing the capacity to interrogate and understand them in accessible ways, discern objective and/or consensual truth and to deploy them for good.
~
For argument’s sake – here are some hallucinated images of bushfires taken from a satellite, created using Stable Diffusion 1.5. Who’s to say they aren’t real images of real places? How would you be able to tell?
~
Notes:
**This is a supposition I have made that may or may not be correct (I don’t know, so I am rephrasing this/correcting this as my knowledge increases). Tokens are word fragments or components, and, apparently (see next posts) each is added singly on a stochastic basis – but presumably the probabilistic value is informed by larger-scale probabilistic structures than individual words in sequence. There must be syntagmatic and paradigmatic values at play.
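As a way of picturing ‘each token added singly on a stochastic basis’, here is a deliberately tiny sketch: a hand-written table of next-token probabilities, sampled one token at a time. The table, its numbers and the function name are all made up for illustration. The important difference is that this toy only looks at the previous token, whereas, as I understand it, a real LLM computes the probabilities for the next token from the whole preceding context, which is roughly where the larger-scale structure supposed above would come in.

```python
import random

# Toy table of P(next token | current token). Hypothetical numbers, illustration only.
BIGRAMS = {
    "<s>": {"the": 0.7, "a": 0.3},
    "the": {"parrot": 0.5, "model": 0.5},
    "a": {"parrot": 0.6, "model": 0.4},
    "parrot": {"speaks": 0.8, "</s>": 0.2},
    "model": {"speaks": 0.6, "</s>": 0.4},
    "speaks": {"</s>": 1.0},
}


def generate(rng: random.Random, max_tokens: int = 10) -> list:
    """Sample tokens one at a time until the end marker or max_tokens is reached."""
    tokens = ["<s>"]
    while tokens[-1] != "</s>" and len(tokens) < max_tokens:
        options = BIGRAMS[tokens[-1]]
        # Draw one candidate at random, weighted by its probability
        next_token = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        tokens.append(next_token)
    return tokens


if __name__ == "__main__":
    rng = random.Random(2023)  # seeded so that a run can be repeated
    for _ in range(3):
        print(" ".join(generate(rng)))
```

Run with different seeds, the same table yields different but always locally plausible sequences, which is the ‘stochastic’ part of the stochastic parrot.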
References:
Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. Virtual Event Canada: ACM, 2021. https://doi.org/10.1145/3442188.3445922.
Friston, Karl J, Maxwell J D Ramstead, Alex B Kiefer, Alexander Tschantz, Christopher L Buckley, Mahault Albarracin, Riddhi J Pitliya, et al. “Designing Ecosystems of Intelligence from First Principles,” December 2022. https://doi.org/10.48550/arXiv.2212.01354.
Hutter, Marcus. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Germany: Springer, 2005.
Stokel-Walker, Chris, and Richard Van Noorden. “What ChatGPT and Generative AI Mean for Science.” Nature 614, no. 7947 (February 6, 2023): 214–16. https://doi.org/10.1038/d41586-023-00340-6.