Tuesday, May 18, 2010

Questioning the Answers

Archimedes
Why would computers deprive us of insight? It's not like it means anything to them...

Surreal story time! The setting: Cornell University. Fellow scientists Hod Lipson and Steve Strogatz find themselves thinking about our scientific future very differently in the final story of WNYC Radiolab's recent Limits episode. In the relatively short concluding segment, "Limits of Science", Dr. Strogatz voices concern about the implications of automated science as we learn about Dr. Lipson's jaw-dropping robotic scientist project, Eureqa.



I can relate to Steve Strogatz's concern about our seemingly imminent scientific uselessness. But is there actually anything imminent here? Science is the language we use to describe the universe for ourselves. Scientific meaning originates with us, the humans who cooperate to create the modal language of science. What are human language or 'meaning' to the Eureqa bot but extra steps to repackage the formula into a less precise, linguistically bound representation? If one considers mathematics to be the most concise scientific description of phenomena, hasn't the robot already had the purest insight?

Given the sentiments expressed by Dr. Strogatz and Radiolab's hosts Jad and Robert, it's easy to draw comparisons between Eureqa and Deep Thought (the computer that famously answered "42" in The Hitchhiker's Guide to the Galaxy). Author Douglas Adams was a brilliant satirist as much as a prescient predictor of our eventual technological capacity (insofar as Deep Thought is like Eureqa). The unfathomably simplistic answer of "42" and the resulting quandary that faced the receivers of the Answer to Life, the Universe, and Everything in HHGTTG are partially intended to make us aware that we are limited in our ability to comprehend.

More importantly, it shows that meaning is not inherent in an answer. 42 is the answer to countless questions (e.g. "What is six times seven?") and Douglas Adams perhaps chose it bearing this fact in mind. Consider that if the answer Deep Thought gave had been a calculus equation 50,000 pages long, the full insight of his satire might be lost on us; it's easy to assume that an answer so complicated is correspondingly meaningful, when in fact the complex answer is no more inherently accurate or useful in application than the answer of 42.
Deep Thought
The Eureqa software doesn't think about how human understanding is affected by the discovery of the formulas that best describe correlations in the data set. When Newton observed natural phenomena and eventually discovered the law that now bears his name, "F = ma", he reached the same conclusion as the robot; the difference is that Newton was a human-concerned machine as well as a physical observer. He ascribed broader meaning to the formula by associating the observed correlation with systems that are important for human minds: the scientific language of physics, and consequently engineering and technology. A robotic scientist doesn't interface with these other complex language systems, and therefore does not consider the potential applications of its discoveries (for the moment, at least).
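To make "finding the formula that best describes the data" concrete, here is a minimal sketch. It is not Eureqa's actual algorithm: the measurements are made up, the pool of candidate formulas is written by hand (a real symbolic regression system generates and refines candidates on its own), and the names `observations`, `candidates`, and `mean_squared_error` are my own for illustration.

```python
import random

# Hypothetical measurements: (mass, acceleration, force) triples generated
# from F = m * a plus a little noise. Made-up data, not Eureqa's input.
random.seed(0)
observations = []
for _ in range(50):
    m = random.uniform(1.0, 10.0)
    a = random.uniform(0.5, 5.0)
    observations.append((m, a, m * a + random.gauss(0.0, 0.01)))

# A tiny, fixed pool of candidate formulas (an assumption of this sketch).
candidates = {
    "m + a": lambda m, a: m + a,
    "m * a": lambda m, a: m * a,
    "m / a": lambda m, a: m / a,
    "m * a * a": lambda m, a: m * a * a,
}

def mean_squared_error(formula):
    """Average squared gap between a formula's prediction and the measured force."""
    return sum((formula(m, a) - f) ** 2 for m, a, f in observations) / len(observations)

# Score every candidate and keep the one with the smallest error.
scores = {name: mean_squared_error(fn) for name, fn in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # the program reports the winning formula and stops
```

Notice what the sketch leaves out: nothing in it knows what mass or force is, or why anyone would care. It scores, it picks, it halts.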

Eureqa doesn't experience "Eureka!" insight because it isn't like Archimedes the man, who was so thrilled by his bathtub discovery of water displacement that legend remembers him running naked through the streets of Syracuse. He realized that his discovery could be of incalculable importance to human understanding. It is from this kind of associative realization that the overwhelming sense of profound insight emerges. When Eureqa reaches a conclusion about the phenomena it is observing, it displays the final formula and quietly rests, having already discovered everything that is inherently meaningful. It does not think to ask why the conclusion matters, nor can it tell as much to its human partners.

"Why?" is a tough question; the right answer depends on context. Physicist Richard Feynman, in his 1983 BBC interview series "Fun to Imagine", takes time for an aside during a question on magnetism. When asked "Why do magnets repel each other?", Feynman stops to remind the interviewer and the audience of a critical distinction in scientific or philosophical thinking: "why" is always relative.
 
"I really can't do a good job, any job, of explaining magnetic force in terms of something else that you're more familiar with, because I don't understand it in terms of anything else that you're more familiar with." - Dr. Feynman

Meaning is not inherent or discoverable; meaning is learned.

Saturday, May 15, 2010

Making Virtual Sense of the Physical World

You'll remember everything. Not just the kind of memory you're used to; you'll remember life in a sense you never thought possible.

Wearable technology is already accessible and available to augment anyone's memory. By recording sensory data we would otherwise forget, digital devices enhance memory somewhat as the neurological condition synesthesia does: through automatic, passive gathering of contextual 'sense data' about our everyday life experiences. During recollection, the extra contextual information stimulates significantly more brain activity, and accordingly yields marked improvements in accuracy.

This week, Britain's BBC2 Eyewitness showed off research by Martin Conway [Leeds University]: MRI brain scan images of patients using Girton Labs Cambridge UK's "SenseCam", a passive accessory that takes pictures when triggered by changes in the environment, capturing momentary memory aids.

The BBC2 Eyewitness TV segment on the SenseCam as a memory aid:

The scientists' interpretation of the brain imaging studies seems to indicate that vividness and clarity of recollection are significantly enhanced for device users, even with only the fragmentary visual snapshots from the SenseCam. One can easily imagine how a device that can also record smells, sounds, humidity, temperature, bio-statistics, and so on could drastically alter the way we remember everyday life!

Given this seemingly inevitable technological destiny, we may feel the limits of human memory changing dramatically in the near future. Data scientists are uniquely positioned to see this coming; a recent book by former Microsoft researchers Gordon Bell and Jim Gemmell, Total Recall: How the E-Memory Revolution Will Change Everything, begins its hook with "What if you could remember everything? Soon, if you choose, you will be able to conveniently and affordably record your whole life in minute detail."

When improvements in digital interfacing allow us to use the feedback from our data-collecting devices effortlessly and in real time, we might even develop new senses.



A hypothetical example: my SkipSenser device can passively detect infrared radiation from my environment and relay this information, immediately and unobtrusively, to my brain (perhaps first imagine a visual gauge in a retinal display). By simply going through my day-to-day life and experiencing the fluctuations in the infrared radiation of my familiar environments, I will naturally begin to develop a sense for the infrared radiation being picked up by the device. In this hypothetical I might develop over time an acute sense of "heat awareness", fostered by the unceasing and incredibly precise measurements of the SkipSenser.

Of course I'm not limited to infrared radiation for my SkipSenser; hypothetically anything detectable can stimulate a new sense. The digital device acts as an aid or a proxy for the body's limited analog sense detectors (eyes, ears, skin, i.e. our evolutionary legacy hardware) and also adds new sense detectors, allowing the plastic brain to adapt itself to new sensory input. I could specialize my auditory cortex, subtly sensing the characteristics of sound waves as they pass through the air, discovering patterns and insights previously thought too complex for normal human awareness. I could even allow all of my human senses to slowly atrophy in favor of cultivating a set of entirely unfamiliar senses, literally changing my perception to fit some future paradigm.


NASA Interferometer Images

Augmenting our sensory systems isn't new; it's what humans are naturally selected for. Generally speaking, 'tool' or 'technology' implies augmentation. If you drive a car, your brain has to adapt to the feel of the steering wheel, the pressure needed to push the pedals, the spatial dimensions of the vehicle, the gauges in the dashboard. While you learned how to drive a car (or ride a bike), your brain was building a neural network by associatively structuring neurons, working hard to find a system good enough to both A) accurately handle these new arbitrary input parameters and B) process the information at a rate that allows you to respond in a timely fashion (i.e. drive without crashing). That ability to restructure based on sensory feedback is the essence of neuroplasticity; it's how humans specialize, and how humanity shows such diverse talent as a species.


That diversity of talent seems set to explode because here's what is new: digital sensors that are easy to use, increasingly accessible, and beginning to surpass human faculties. Integrated devices like the SenseCam continue to add functionality while shrinking in size and in the effort they require, and now offer a cost-benefit balance that appeals not only to the disabled, but to the everyman.

There may be no limits to the range of possible perception. Depending on your metaphysical standpoint, this might also mean there are no limits to the range of possible realities.

Wednesday, May 5, 2010

Vanishing Words Tell Illuminating Tales

The Library of Congress set up a deal a few weeks ago to acquire Twitter's complete archive of public messages. It's not a particularly impressive number of bytes by itself, but it's a goldmine for computational analysis. And that academic potential is behind the government wanting to obtain what might seem like a vast cacophony of meaningless chatter.

In the WNYC Radiolab podcast released today, "Vanishing Words", Jad and Robert look at linguistic computation: specifically, the idea that you can identify and predict dementia using word analysis of personal history, say a collection of letters or diary entries. Or, if you're Agatha Christie, crime novels. If you've got a minute, let Jad Abumrad & Robert Krulwich tell you about this:



Working with Jad's mention of "the age of Twitter": online services like Twitter, Facebook, Google, and so on are quite earnestly working with words as scientific data; it's a core element of staying competitive in their business. Computational language analysis is a fascinating field, and luckily it also seems to have powerful economic incentive.

Word data is probably still the easiest way to directly get highly personalized information about a person (e.g. a status update, a tweet). Facebook Data Scientists, for example, work primarily to teach computer models to interpret the words used in Facebook status updates into meaningful demographic data. The computers gather information and the scientists pick out interesting patterns so that better, more personalized advertising can be served. Better targeted ads translate to actual interest in ads, which translates to business.

Computational research and analysis (like the studies mentioned in this Radiolab podcast) is exploding commercially and academically, like a virtual internet gold rush. Supply is growing exponentially as hundreds of millions of people use online services to communicate publicly. Demand is blowing up too, because we're realizing, like these scientists discovering something deeply personal about Agatha Christie, just how much we can learn from a simple collection of words.
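For a feel of what "word analysis" can mean in practice, here is a minimal sketch that compares two passages by vocabulary richness (distinct words divided by total words) and by how often vague, indefinite words appear. It is only an illustration, not the method used in the research discussed on the podcast; the two sample passages, the `VAGUE_WORDS` list, and the function name `vocabulary_stats` are all made up for this example.

```python
import re

# Hypothetical list of indefinite words to count (an assumption of this sketch).
VAGUE_WORDS = {"thing", "something", "anything"}

def vocabulary_stats(text):
    """Return token count, vocabulary richness, and vague-word count for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {"tokens": 0, "richness": 0.0, "vague": 0}
    return {
        "tokens": len(words),
        # Distinct words divided by total words (the type-token ratio).
        "richness": len(set(words)) / len(words),
        "vague": sum(1 for w in words if w in VAGUE_WORDS),
    }

# Two invented passages standing in for an "early" and a "late" piece of writing.
early = ("The inspector examined the revolver, the ledger, and the torn "
         "envelope before questioning the gardener.")
late = ("She looked at the thing and then the thing was gone, and something "
        "about the thing was something odd.")

early_stats = vocabulary_stats(early)
late_stats = vocabulary_stats(late)
print(early_stats)
print(late_stats)
```

One caveat: the raw ratio tends to fall as a text gets longer, so any real comparison would need samples of similar length, and a lot more than two sentences.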

It's exciting to consider how much we may be able to learn about ourselves using non-contextual information. Words unrelated to each other in everyday usage still form patterns unseen on a larger scale. Everything you do leaves a mark on the world, and soon we may be able to better understand our markings and appreciate our histories holistically.

I imagine the future like learning the answers to questions we never thought to ask.

Edit 5/11/10: Agatha Christie also wrote dozens of diary entries and notes about her books, and these may also have shown signs of dementia. (via @JadAbumrad "Agatha Christie's deranged notebooks (interesing to read after the latest @wnycradiolab podcast) - http://bit.ly/ar2smX")

Edit 5/14/10: For an interesting exemplar of Facebook linguistic data-mining, see their Gross National Happiness trend index. The study describing the methodology used is cited below the chart.
