Monday, October 31, 2016

SEMREM: The Search for Extraterrestrial, Morphically-REsonating Mathematicians

An interesting idea came up in an email thread with my dad Ted Goertzel, his friend Bill McNeely, and my son Zar Goertzel…

Suppose that morphic resonance works – so that when a pattern arises somewhere in the universe, it then becomes more likely to appear other places in the universe.   Suppose that, like quantum entanglement, it operates outside the scope of special relativity – so that when a pattern occurs on this side of the universe, its probability of occurrence is immediately increased way on the other side of the universe. 

(As with quantum entanglement, the language of causation is not really the best one to use here – rather than saying “pattern X occurring here increases the odds of pattern Y occurring there”, it’s better to say “in our universe, the odds of the same pattern occurring in two distant locations, sometimes with a time lag, are higher than one would expect based on straightforward independence assumptions” – this has the same empirical consequences and less needless metaphysical baggage.   I’ve pointed this out here.)
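The non-causal phrasing can be written as a simple inequality. The notation is introduced here purely for illustration: A_X and B_X stand for the events that pattern X shows up at two distant locations A and B within some chosen time window.

```latex
P(A_X \cap B_X) \;>\; P(A_X)\,P(B_X)
```

Under plain independence the two sides would be equal; the morphic resonance hypothesis, stated this way, is only the claim that the left side is larger, with no statement about which occurrence "causes" the other.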

Suppose also that the physical universe contains multiple intelligent species and civilizations, flung all over the place – scattered across our galaxy and/or multiple galaxies.

It would follow that when one intelligent civilization creates a certain pattern, other civilizations halfway across the galaxy or universe would have a higher probability of happening upon that same pattern.   And perhaps there would be an increasing-returns type dynamic here: once half the intelligent civilizations in the universe have manifested a certain pattern, the odds of the rest coming to manifest it would be much higher.

But what kinds of patterns would be most likely to get propagated in this way?   A pattern highly specific to Earthly life would not be likely to get picked up by gas-cloud aliens in some other galaxy – because morphic resonance, if it works, would only mildly increase the odds of a pattern being found in one location or context, based on it already having been found in another.    Most likely its mechanism of action would involve slightly nudging the internal stochastic dynamics of existing processes – and there is a limit to how much change can be enacted via such nudging.   If the odds of a certain Earthly pattern being formed in the world of the gas-cloud aliens is very low, morphic resonance probably isn’t going to help.

Probably the most amenable patterns for morphic resonance based cross-intelligent-civilization transmission would be the most abstract ones, the ones that are of interest to as many different intelligent civilizations as possible, regardless of their particular cultural or physical  or psychological makeup.    Mathematics would seem the best candidate.

So, if this hypothesis is right, then mathematical theorems and structures that have already been discovered by alien civilizations elsewhere, would be especially easy for us to find – we would find ourselves somehow mysteriously/luckily guided to finding them.

It’s not hard to imagine how we might test this hypothesis.   What if we built a giant AGI mathematical theorem prover, and set it to searching for new mathematical theorems, proofs and structures in a variety of different regions of math-space?   Based on all this activity, it would be able to develop a reasonably decent estimator of how difficult it should be, on average, to discover new theorems and proofs in a certain area of mathematics.

Suppose this AGI mathematician then finds that certain areas of mathematics are unreasonably easy for it – that in these areas, it often seems to “just get lucky” in finding the right mathematical patterns, without having to try as hard as its general experience would lead it to suspect.   These areas of mathematics would be the prime suspects for the role of “focus area of the intergalactic, cross-species community of morphically resonating mathematicians.”
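To make the "unreasonably easy" test a bit more concrete, here is a minimal sketch, assuming the prover logs a predicted and an observed search effort for each area of mathematics. The function names, the z-score threshold and all of the numbers are hypothetical, chosen only for illustration; a real estimator would be far more elaborate.

```python
from statistics import mean, stdev

def easiness_scores(predicted_effort, observed_effort):
    """Z-score, per area, of how far observed effort fell below prediction."""
    residuals = {area: predicted_effort[area] - observed_effort[area]
                 for area in predicted_effort}
    mu = mean(residuals.values())
    sigma = stdev(residuals.values())
    return {area: (r - mu) / sigma for area, r in residuals.items()}

def suspect_areas(predicted_effort, observed_effort, threshold=1.5):
    """Areas where the prover got lucky far more than its general experience suggests."""
    scores = easiness_scores(predicted_effort, observed_effort)
    return sorted(area for area, z in scores.items() if z > threshold)

# Made-up effort figures (e.g. search steps per theorem found).
predicted = {"number theory": 100, "topology": 120, "combinatorics": 90,
             "category theory": 110, "analysis": 105, "algebra": 95}
observed = {"number theory": 98, "topology": 123, "combinatorics": 91,
            "category theory": 40, "analysis": 104, "algebra": 97}

suspects = suspect_areas(predicted, observed)
print(suspects)
```

With these invented figures only one area stands out from the others. Of course, an anomaly like this would at most mark an area as worth a closer look; a badly calibrated estimator would produce the same signal.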

Suppose the AGI mathematician is trying to solve some problem, and has to choose between two potential strategies, A and B.   If A lies in a region of math-space that seems to have lots of morphic resonance going on, then on the whole it’s going to be a better place to look than B.    But of course, every alien species is going to be reasoning the same way.   So without any explicit communication, the community of mathematically-reasoning species (which will probably  mostly be AGIs of some form or another, since it’s unlikely evolved organisms are going to be nearly as good at math as AGIs) will tend to help each other and collectively explore math-space.

This is an utterly different form of Search for Extraterrestrial Intelligence – I’ll label it the “Search for Extraterrestrial Morphically-REsonating Mathematicians”, or SEMREM.  

As soon as we have some highly functional AGI theorem-provers at our disposal, work on SEMREM can begin!

P.S. -- After reading the above, Damien Broderick pointed out that species doing lots of math but badly could pollute the morphic math-space, misdirecting all the other species around the cosmos.   Perhaps this will be the cause of some future intergalactic warfare --- AI destroyer-bots will be sent out to nuke the species polluting the morphic math metaverse with wrong equations or inept, roundabout proofs ... or, more optimistically, to properly educate them in the ways of post-Singularity mathemagic...

Saturday, October 29, 2016

Symbiobility

I want to call attention here to a concept that seems to get insufficient attention: “symbiobility”, or amenability to symbiosis.

The word “symbiobility” appears to have been used quite infrequently, according to Google; but I haven’t found any alternative with the same meaning and more common usage.   The phrase “symbiotic performance” is more commonly used in biology and seems to mean about the same thing, but it’s not very concise or euphonious.

What I mean by symbiobility is: The ability to enter into symbiotic unions with other entities.

In evolutionary theory (and the theory of evolutionary computation) one talks sometimes about the “evolution of evolvability” – where “evolvability” means the ability to be improved via mutation and crossover.   Similarly, it is important to think about the evolution and symbiogenesis of symbiobility.

There are decent (though still a bit speculative) arguments that symbiogenesis has been a major driver of biological evolution on Earth, perhaps even as critical as mutation, crossover and selection.  Wikipedia gives a conservative review of the biology of symbiogenesis.  Schwemmler has outlined a much more sweeping perspective on the role of symbiogenesis, including a symbiogenesis-based analysis of the nature of cancer; I reviewed his book in 2002.

One can think about symbiobility fairly generally, on various levels of complex systems.   For instance,

  •  Carbon-based compounds often have a high degree of symbiobility – they can easily be fused with other compounds to form larger compounds.  
  • Happily married couples in which both partners are extraverts also have a high degree of symbiobility, in the sense that they can be relatively easily included in larger social groups (without dissolving but also without withdrawing into isolation).


These usages could be considered a bit metaphorical, but no more so than many uses of the term “evolution.”

One of the weaknesses of most Artificial Life research, I would suggest, is that the Alife formalisms created have inadequate symbiobility.   I have been thinking about this a fair bit lately due to musing about how to build an algorithmic-chemistry-type system in OpenCog (see my blog post on Cogistry).    A big challenge there is to design an algorithmic-chemical (“codelet”) formalism so that the emergent systems of codelets (“codenets”) will have a reasonably high degree of symbiobility.  

My hope with Cogistry is to achieve symbiobility via using very powerful and flexible methods (e.g. probabilistic logic) to figure out how to merge two entities A and B into a new entity symbiotically combining A and B.   This requires that A and B be composed in a way that enables the logic engine in use to draw conclusions about how to best compose A and B, based on a reasonable amount of resource usage.

In terms of the Maximum Pattern Creation Principle I have written about recently, it seems that symbiogenesis is often a powerful way for a system to carry out high-speed high-volume pattern creation.   In ideal cases the symbiotic combination of A and B can carry out basically the same sorts of pattern creation that A and B can, plus new ones besides.


As the world gets more and more connected and complex, each of us acts more and more as a part of larger networks (culminating in the so-called “Global Brain”).   This means that symbiobility is a more and more important characteristic for all of us to cultivate – along with evolvability generally, which is a must in a world so rapidly and dramatically changing.

Thursday, October 27, 2016

MaxPat: The Maximum Pattern Creation Principle


I will argue here that, in natural environments (I’ll explain what this means below), intelligent agents will tend to change in ways that locally maximize the amount of pattern created.    I will refer to this putative principle as MaxPat.

The argument I present here is fairly careful, but still is far from a formal proof.  I think a formal proof could be constructed along the lines of this argument, but obviously it would acquire various conditions and caveats along the route to full formalization.

What I mean by “locally maximize” is, roughly: If an intelligent agent in a natural environment has multiple possible avenues it may take, on the whole it will tend to take the one that involves more pattern creation (where “degree of patternment” is measured relative to its own memory’s notion of simplicity, a measure that is argued to be correlated with the measurement of simplicity that is implicit in the natural environment).

This is intended to have roughly the same conceptual form as the Maximum Entropy Production Principle (MEPP), and there may in fact be some technical relationship between the two principles as well.   I will indicate below that maximizing pattern creation also involves maximizing entropy in a certain sense, though this sense is complexly related to the sort of entropy involved in MEPP.

Basic Setting: Stable Systems and Natural Environments

The setting in which I will consider MaxPat is a universe that contains a large number of small “atomic” entities (atoms, particles, whatever), which exist in space and time, and are able to be assembled (or to self-assemble) into larger entities.   Some of these larger entities are what I’ll call Stable Systems (or SS’s), i.e. they can persist over time.   A Stable System may be a certain pattern of organization of small entities, i.e. some or all of the specific small entities comprising it may change over time, and the Stable System may still be considered the same system.  (Note also that an SS as I conceive it here need not be permanent; stability is not an absolute concept...)

By a “natural environment” I mean one in which most Stable Systems are forming via heavily stochastic processes of evolution and self-organization, rather than e.g. by highly concerted processes of planning and engineering.  

In a natural environment, systems will tend to build up incrementally.   Small SS’s will build up from atomic entities.   Then larger SS’s will build up from small SS’s and atomic entities, etc.    Due to the stochastic nature of SS formation, all else equal, smaller combinations will be more likely to get formed than bigger ones.  On the other hand, if a bigger SS does get formed eventually, if it happens to be highly stable it may still stay around a while.

To put it a little more explicitly: The odds of an SS surviving in a messy stochastic world are going to depend on various factors, including its robustness and its odds of getting formed.   If formation is largely stochastic and evolutionary there will be a bias toward smaller SS’s, and toward SS’s that can be built up hierarchically via combination of previous ones…  Thus there will be a bias toward survival of SS’s that can be merged with others into larger SS’s….   If a merger of S1 and S2 generally leads to S3 so that the imprint of S1 and S2 can still be seen in the observations produced by S3 (a kind of syntax-semantics continuity) then we have a set of observations with hierarchical patterns in it…

Intelligent Agents Observing Natural Environments

Now, consider the position of an intelligent agent in a natural environment, collecting observations, and making hypotheses about what future observations it might collect.

Suppose the agent has two hypotheses about what kind of SS might have generated the observations it has made so far: a big SS of type X, or a small SS of type Y.   All else equal, it should prefer the hypothesis Y, because (according to the ideas outlined above) small SS’s are more likely to form in its (assumed natural) environment.   That is, in Bayesian terms, the prior probability of small SS’s should be considered greater.
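In Bayesian terms this is just the odds form of Bayes' rule. In the notation below, which is introduced here for illustration, D is the agent's observational data, and X and Y are the two hypotheses about the generating SS:

```latex
\frac{P(Y \mid D)}{P(X \mid D)}
  \;=\;
  \frac{P(D \mid Y)}{P(D \mid X)} \cdot \frac{P(Y)}{P(X)},
\qquad
P(Y) > P(X) \ \text{ and } \ P(D \mid Y) = P(D \mid X)
  \;\Longrightarrow\;
  P(Y \mid D) > P(X \mid D).
```

"All else equal" corresponds to the two likelihoods being the same, in which case the larger prior on the small SS decides the matter.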

Suppose the agent has memory capacity that is quite limited compared to the number of observations it has to process.  Then the SS’s it observes and conjectures have to be saved in its memory, but some of them will need to be forgotten as time passes; and compressing the SS’s it does remember will be important for it to make the most of its limited memory capacity.   Roughly speaking the agent will do better to adopt a memory code in which the SS’s that occur more often, and have a higher probability of being relevant to the future, get a shorter code.
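One standard way to get such a code is to give each item a length of about -log2(p) bits, where p is its relative frequency. The sketch below does only that; it uses observed frequency as a stand-in for "probability of being relevant to the future", which is a simplifying assumption, and the SS labels and counts are invented.

```python
import math
from collections import Counter

def code_lengths(observations):
    """Shannon-style code length in whole bits for each observed SS type.

    Frequent types get short codes, rare types get long ones.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return {ss: math.ceil(-math.log2(c / total)) for ss, c in counts.items()}

def average_length(observations):
    """Mean number of bits per observation under the code above."""
    lengths = code_lengths(observations)
    return sum(lengths[ss] for ss in observations) / len(observations)

# Hypothetical observation log: small SS's turn up more often than large ones.
log = ["small"] * 8 + ["medium"] * 4 + ["large"] * 2 + ["huge"] * 2

lengths = code_lengths(log)
print(lengths, average_length(log))
```

A fixed-length code for four types would need 2 bits per observation, so the frequency-aware code saves memory exactly when the environment is biased toward some SS's over others.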

So, concision in the agent’s internal “computational model” should end up corresponding roughly to concision in the natural environment’s “computational model.”

The agent should then estimate that the most likely future observation-sets will be those that are most probable given the system’s remembered observational data, conditioned on the understanding that those generated by smaller SS’s will be more likely.  

To put it more precisely and more speculatively: I conjecture that, if one formalizes all this and does the math a bit, it will turn out that: The most probable observation-sets O will be the ones minimizing some weighted combination of

  • Kullback-Leibler distance between: A) the distribution over entity-combinations on various scales that O demonstrates, and B) the distribution over entity combinations on various scales that’s implicit in the agent’s remembered observational data
  •  The total size of the estimated-likely set of SS generators for O


As KL distance is relative entropy, this is basically a “statistical entropy/information based on observations” term, and then an “algorithmic information” type term reflecting a prior assumption that more simply generated things are more likely.
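A minimal sketch of what such a weighted combination could look like follows. The direction of the KL term, the use of a plain sum for "total size", the weights and the toy distributions are all assumptions made for illustration; the conjecture above does not pin any of them down.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits. p and q map the same keys to probabilities,
    and q must be positive wherever p is positive."""
    return sum(pv * math.log2(pv / q[k]) for k, pv in p.items() if pv > 0)

def weighted_cost(obs_dist, memory_dist, generator_sizes, w_kl=1.0, w_size=0.1):
    """Lower cost means a more probable observation-set under the conjecture.

    obs_dist:        distribution over entity-combinations shown by O
    memory_dist:     distribution implicit in the agent's remembered data
    generator_sizes: sizes of the SS's estimated to have generated O
    """
    statistical_term = kl_divergence(obs_dist, memory_dist)
    algorithmic_term = sum(generator_sizes)
    return w_kl * statistical_term + w_size * algorithmic_term

memory = {"pair": 0.5, "triple": 0.25, "chain": 0.25}
familiar = {"pair": 0.5, "triple": 0.25, "chain": 0.25}
unfamiliar = {"pair": 0.25, "triple": 0.25, "chain": 0.5}

cost_familiar = weighted_cost(familiar, memory, generator_sizes=[2, 3])
cost_unfamiliar = weighted_cost(unfamiliar, memory, generator_sizes=[2, 3])
cost_big_generators = weighted_cost(familiar, memory, generator_sizes=[20, 30])
print(cost_familiar, cost_unfamiliar, cost_big_generators)
```

The first term penalizes observation-sets that look statistically unlike what the agent remembers; the second penalizes ones that need large generators. Either penalty alone is enough to raise the cost.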

Now, what does this mean in terms of “pattern theory” -- in which a pattern in X is a function that is simpler than X but (at least approximately) produces X?   If one holds the degree of approximation equal, then the simpler the function is, the more “intense” it is said to be as a pattern.
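As a rough illustration of "simpler than X but produces X", one can use an off-the-shelf compressor as a crude stand-in for simplicity. This is a swapped-in proxy, not the formal definition from pattern theory: the zlib-compressed bytes play the role of the function, decompression reproduces X exactly, and the normalization used for intensity is an assumption of this sketch.

```python
import random
import zlib

def pattern_intensity(data: bytes) -> float:
    """Fraction of size saved by the compressed description, clipped at 0.

    Reproduction is exact, so the degree of approximation is held equal
    and only the simplicity of the description varies.
    """
    if not data:
        raise ValueError("data must be non-empty")
    compressed = zlib.compress(data, 9)
    if zlib.decompress(compressed) != data:
        raise RuntimeError("description failed to reproduce the data")
    return max(0.0, 1.0 - len(compressed) / len(data))

# A highly regular entity versus a pseudo-random one of the same size.
regular = b"ab" * 500
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1000))

print(pattern_intensity(regular), pattern_intensity(noisy))
```

The regular sequence admits a very short description and so scores close to 1, while the pseudo-random one admits essentially none and scores at or near 0.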

In the present case, the most probable observation-sets will be ones that are the most intense patterns relative to the background knowledge of the agent’s memory.  They will be the ones that are most concise to express in terms of the agent’s memory, since the agent is expressing smaller SS generators more concisely in its memory, overall.

Intelligent Agents Acting In Natural Environments

Now let us introduce the agent’s actions into the picture. 

If an agent, in interaction with a natural environment, has multiple possible avenues of action, then ones involving setting up smaller SS’s will on the whole be more appealing to the agent than ones involving setting up larger SS’s.

Why?  Because they will involve less effort -- and we can assume the system has limited energetic resources and hence wants to conserve effort. 

Therefore, the agent’s activity will be more likely to result in possible scenarios with more patterns than in ones with fewer patterns.   That is -- the agent’s actions will, roughly speaking, tend to lead to maximal pattern generation -- conditioned on the constraints of moving in the direction of the agent’s goals according to the agent’s “judgment.”

MaxPat

So, what we have concluded is that: Given the various avenues open to it at a certain point in time, an intelligent agent in a natural environment will tend to choose actions that locally maximize the amount of pattern it understands itself to create (i.e., that maximize the amount of pattern created, where “pattern intensity” is measured relative to the system’s remembered observations, and its knowledge of various SS’s in the world with various levels of complexity.)    

This is what I call the Maximum Pattern Creation Principle – MaxPat.

If the agent has enough observations in its memory, and has a good enough understanding of which SS’s in the world are small and which are not, then measuring pattern intensity relative to the agent will be basically the same as measuring pattern intensity relative to the world.  So a corollary is that: A sufficiently knowledgeable agent in a natural environment will tend to choose actions that lead to locally maximum pattern creation, where pattern intensity is measured relative to the environment itself.


There is nothing tremendously philosophically surprising here; however, I find it useful to spell these conceptually plain things out in detail sometimes, so I can more cleanly use them as ingredients in other ideas.    And of course, going from something that is conceptually plain to a real rigorous proof can still be a huge amount of work; this is a task I have not undertaken here.