{"id":228,"date":"2012-01-12T10:20:15","date_gmt":"2012-01-12T18:20:15","guid":{"rendered":"http:\/\/facingthesing.wpengine.com\/?p=228"},"modified":"2014-12-06T14:46:53","modified_gmt":"2014-12-06T22:46:53","slug":"value-is-complex-and-fragile","status":"publish","type":"post","link":"https:\/\/intelligenceexplosion.com\/sv\/2012\/value-is-complex-and-fragile\/","title":{"rendered":"Value is Complex and Fragile"},"content":{"rendered":"<p><span class=\"dropcap\">O<\/span>ne day, my friend Niel asked his <a 
href=\"http:\/\/en.wikipedia.org\/wiki\/Virtual_assistant\">virtual assistant<\/a> in India to find him a bike he could buy that day. She sent him a list of bikes for sale from all over the world. Niel said, \u201cNo, I need one I can buy <em>in Oxford<\/em> today; it has to be local.\u201d So she sent him a long list of bikes available in Oxford, most of them expensive. Niel clarified that he wanted an inexpensive bike. So she sent him a list of children\u2019s bikes. He clarified that he needed a local, inexpensive bike that fit an adult male. So she sent him a list of adult bikes in Oxford needing repair.<\/p>\n<p>Usually humans understand each other\u2019s desires better than this. Our evolved <a href=\"http:\/\/lesswrong.com\/lw\/rl\/the_psychological_unity_of_humankind\/\">psychological unity<\/a> causes us to share a common sense and common desires. Ask me to find you a bike, and I\u2019ll <em>assume<\/em> you want one in working condition, that fits your size, is not made of gold, etc.\u2014even though you didn\u2019t actually <em>say<\/em> any of that.<\/p>\n<p>But a different mind architecture, one that didn\u2019t evolve with us, won\u2019t share our common sense. It wouldn\u2019t know what <em>not<\/em> to do. How do you make a cake? \u201cDon\u2019t use squid. Don\u2019t use gamma radiation. Don\u2019t use Toyotas.\u201d The list of what <em>not<\/em> to do is endless.<\/p>\n<p>Some people think an advanced AI will be some kind of super-butler, doing whatever they ask with incredible <a href=\"http:\/\/facingthesing.wpengine.com\/2011\/playing-taboo-with-intelligence\/\">efficiency<\/a>. But it\u2019s more accurate to imagine an <a href=\"http:\/\/lesswrong.com\/lw\/ld\/the_hidden_complexity_of_wishes\/\">Outcome Pump<\/a>: a non-sentient device that makes some outcomes more probable and other outcomes less probable. (The Outcome Pump isn\u2019t magic, though. 
If you ask it for an outcome that is <em>too<\/em> improbable, it will break.)<\/p>\n<p>Now, suppose your mother is trapped in a burning building. You\u2019re in a wheelchair, so you can\u2019t directly help. But you do have the <a href=\"http:\/\/lesswrong.com\/lw\/ld\/the_hidden_complexity_of_wishes\/\">Outcome Pump<\/a>:<\/p>\n<blockquote><p>You cry \u201cGet my mother out of the building!\u201d . . . and press Enter.<\/p>\n<p>For a moment it seems like nothing happens. You look around, waiting for the fire truck to pull up, and rescuers to arrive\u2014or even just a strong, fast runner to haul your mother out of the building\u2014<\/p>\n<p>BOOM! With a thundering roar, the gas main under the building explodes. As the structure comes apart, in what seems like slow motion, you glimpse your mother\u2019s shattered body being hurled high into the air, traveling fast, rapidly increasing its distance from the former center of the building.<\/p><\/blockquote>\n<p>Luckily, the Outcome Pump has a Regret Button, which rolls back time. You hit it and try again. \u201cGet my mother out of there <em>without blowing up the building<\/em>,\u201d you say, and press Enter.<\/p>\n<p>So your mother falls out the window and breaks her neck.<\/p>\n<p>After a dozen more hits of the Regret Button, you tell the Outcome Pump:<\/p>\n<blockquote><p>Within the next ten minutes, move my mother (defined as the woman who shares half my genes and gave birth to me) so that she is sitting comfortably in this chair next to me, with no physical or mental damage.<\/p><\/blockquote>\n<p>You watch as all thirteen firemen rush the house at once. One of them happens to find your mother quickly and bring her to safety. All the rest die or suffer crippling injuries. The one fireman sets your mother down in the chair, then turns around to survey his dead and suffering colleagues. 
You got what you wished for, but you didn\u2019t get what you <em>wanted<\/em>.<\/p>\n<p>The problem is that your brain is not large enough to contain statements specifying every possible detail of what you want and don\u2019t want. How did you know you wanted your mother to escape the building in good health <em>without<\/em> killing or maiming a dozen firemen? It wasn\u2019t because your brain contained anywhere the statement \u201cI want my mother to escape the building in good health without killing and maiming a dozen firemen.\u201d Instead, you <em>saw<\/em> your mother escape the building in good health while a dozen firemen were killed or maimed, and you realized, \u201cOh, shit. I don\u2019t want <em>that<\/em>.\u201d Or you might have been able to imagine that specific scenario and realize, \u201cOh, no, I don\u2019t want that.\u201d But nothing so specific was written anywhere in your brain before it happened, or before you imagined the scenario. It couldn\u2019t be; your brain doesn\u2019t have room.<\/p>\n<p>But you can\u2019t afford to sit there, Outcome Pump in hand, imagining millions of possible outcomes and noticing which ones you do and don\u2019t want. Your mother will die <a href=\"http:\/\/lesswrong.com\/lw\/ld\/the_hidden_complexity_of_wishes\/\">before you have time to do that<\/a>.<\/p>\n<blockquote><p>What if her head is crushed, leaving her body? What if her body is crushed, leaving only her head? What if there\u2019s a cryonics team waiting outside, ready to suspend the head? Is a frozen head a person? Is Terry Schiavo a person? How much is a chimpanzee worth?<\/p><\/blockquote>\n<p>Still, your brain isn\u2019t <em>infinitely<\/em> complex. There is some finite set of statements that could describe the system that determines the judgments you would make. 
If we understood how every synapse and neurotransmitter and protein of the brain worked, and we had a complete map of your brain, then an AI could at least in principle compute which judgments you would make about a finite set of possible outcomes.<\/p>\n<p>The moral is that <a href=\"http:\/\/lesswrong.com\/lw\/ld\/the_hidden_complexity_of_wishes\/\">there is no safe wish smaller than an entire human value system<\/a>:<\/p>\n<blockquote><p>There are too many possible paths through Time. You can\u2019t visualize all the roads that lead to the destination you give the [Outcome Pump]. \u201cMaximizing the distance between your mother and the center of the building\u201d can be done even more effectively by detonating a nuclear weapon. . . . Or, at higher levels of [Outcome Pump] intelligence, doing something that neither you nor I would think of, just like a chimpanzee wouldn\u2019t think of detonating a nuclear weapon. You can\u2019t visualize all the paths through time, any more than you can program a chess-playing machine by hardcoding a move for every possible board position.<\/p>\n<p>And real life is far more complicated than chess. You cannot predict, in advance, which of your values will be needed to judge the path through time that the [Outcome Pump] takes. Especially if you wish for something longer-term or wider-range than rescuing your mother from a burning building.<\/p>\n<p>. . . The only safe [AI is an AI] that shares all your judgment criteria, and at that point, you can just say \u201cI wish for you to do what I should wish for.\u201d<\/p><\/blockquote>\n<p>There is a cottage industry of people who propose the One Simple Principle that will make AI do what we want. <a href=\"http:\/\/lesswrong.com\/lw\/lp\/fake_fake_utility_functions\/\">None of them will work<\/a>. 
We act not for the sake of <a href=\"http:\/\/lesswrong.com\/lw\/lb\/not_for_the_sake_of_happiness_alone\/\">happiness<\/a> or <a href=\"http:\/\/lesswrong.com\/lw\/65w\/not_for_the_sake_of_pleasure_alone\/\">pleasure<\/a> alone. What we value is <a href=\"http:\/\/lesswrong.com\/lw\/y3\/value_is_fragile\/\">highly complex<\/a>. Evolution gave you <a href=\"http:\/\/lesswrong.com\/lw\/l3\/thou_art_godshatter\/\">a thousand shards of desire<\/a>. (To see what a mess this makes in your neurobiology, read the first two chapters of <a href=\"http:\/\/www.amazon.com\/Neuroscience-Preference-Choice-Cognitive-Mechanisms\/dp\/0123814316\/\"><em>Neuroscience of Preference and<\/em> <em>Choice<\/em><\/a>.)<\/p>\n<p>This is also why moral philosophers have spent thousands of years <em>failing<\/em> to find a simple set of principles that, if enacted, would create a world we want. Every time someone proposes a small set of moral principles, <a href=\"http:\/\/commonsenseatheism.com\/wp-content\/uploads\/2011\/11\/Muehlhauser-Helm-The-Singularity-and-Machine-Ethics-draft.pdf\">somebody else shows where the holes are<\/a>. Leave something out, even something that seems trivial, and things can go <a href=\"http:\/\/lesswrong.com\/lw\/y3\/value_is_fragile\/\">disastrously wrong<\/a>:<\/p>\n<blockquote><p>Consider the <a href=\"http:\/\/lesswrong.com\/lw\/xr\/in_praise_of_boredom\/\">incredibly important human value of \u201cboredom\u201d<\/a>\u2014our desire not to do \u201cthe same thing\u201d over and over and over again. 
You can imagine a mind that contained <em>almost<\/em> the whole specification of human value, almost all the morals and metamorals, but left out <em>just this one thing<\/em>\u2014<\/p>\n<p>\u2014and so it spent until the end of time, and until the farthest reaches of its light cone, replaying a single highly optimized experience, over and over and over again.<\/p>\n<p>Or imagine a mind that contained almost the whole specification of which sort of feelings humans most enjoy\u2014but not the idea that those feelings had important <em>external referents<\/em>. So that the mind just went around <em>feeling<\/em> like it had made an important discovery, <em>feeling<\/em> it had found the perfect lover, <em>feeling<\/em> it had helped a friend, but not actually <em>doing<\/em> any of those things, having become its own <a href=\"http:\/\/en.wikipedia.org\/wiki\/Experience_machine\">experience machine<\/a>. And if the mind pursued those feelings <em>and<\/em> <em>their referents<\/em>, it would be a good future and true; but because this <em>one dimension<\/em> of value was left out, the future became something dull. Boring and repetitive, because although this mind <em>felt<\/em> that it was encountering experiences of incredible novelty, this feeling was in no wise true.<\/p>\n<p>Or the converse problem: an agent that contains all the aspects of human value, <em>except<\/em> the valuation of subjective experience. So that the result is a nonsentient optimizer that goes around making genuine discoveries, but the discoveries are not savored and enjoyed, because there is no one there to do so . . .<\/p>\n<p>Value isn\u2019t just complicated, it\u2019s <em>fragile<\/em>. There is <em>more than<\/em> <em>one dimension<\/em> of human value, where <em>if just that one thing is lost<\/em>, the Future becomes null. A <em>single<\/em> blow and <em>all<\/em> value shatters. 
Not every <em>single<\/em> blow will shatter <em>all<\/em> value\u2014but more than one possible \u201csingle blow\u201d will do so.<\/p><\/blockquote>\n<p>You can see where this is going. Since we\u2019ve never decoded an entire human value system, we don\u2019t know what values to give an AI. We don\u2019t know what wish to make. If we created superhuman AI tomorrow, we could only give it a disastrously incomplete value system, and then it would go on to do things we don\u2019t want, because it would be doing what we <em>wished for<\/em> instead of what we <em>wanted<\/em>.<\/p>\n<p>Right now, we only know how to build AIs that optimize for something <em>other<\/em> than what we want. We only know how to build dangerous AIs. Worse, we\u2019re learning how to make AIs <em>safe<\/em> much more slowly than we\u2019re learning how to make AIs <em>powerful<\/em>, because we\u2019re devoting more resources to the problems of AI capability than we are to the problems of AI safety.<\/p>\n<p>The clock is ticking. <a href=\"http:\/\/facingthesing.wpengine.com\/2011\/superstition-in-retreat\/\">AI is coming<\/a>. And we are not ready.<\/p>","protected":false},"excerpt":{"rendered":"<p>One day, my friend Niel asked his virtual assistant in India to find him a bike he could buy that day. 
She sent him a list of&hellip;  <a href=\"https:\/\/intelligenceexplosion.com\/sv\/2012\/value-is-complex-and-fragile\/\">continue reading<\/a> &raquo;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[],"class_list":["post-228","post","type-post","status-publish","format-standard","hentry","category-chapter"],"_links":{"self":[{"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/posts\/228","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/comments?post=228"}],"version-history":[{"count":0,"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/posts\/228\/revisions"}],"wp:attachment":[{"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/media?parent=228"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/categories?post=228"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/intelligenceexplosion.com\/sv\/wp-json\/wp\/v2\/tags?post=228"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}