
Beyond The Turing Test: Assessing the Intelligence of Modern Machines

In the sci-fi psychological thriller Ex Machina, a mind-bending series of interactions with an advanced artificial general intelligence drives the young software-developer protagonist into a fit of existential doubt, pulling at his skin and harming himself to make sure that he is actually human. The premise of the film, an A.I. android manipulating the emotions and interpersonal relationships of the people working on it in order to serve its own desires, seemed more fiction than science when it premiered in 2014. Since then, though, at least two different A.I. models have been described as “sentient” by people working on them, and the natural-language ability of chatbots like ChatGPT has fooled roughly a third of participants in one of the largest Turing Tests ever conducted.

But does a machine’s ability to pass a Turing Test even matter now? 

A.I. researchers have said for years that the seminal test Alan Turing proposed in 1950 to gauge the “intelligence” of machines is now obsolete. However, it’s misleading to think of the Turing Test as a true measure of intelligence as we might understand it today. Turing designed the pass/fail test simply to determine whether a machine could generate human language well enough to fool a person into thinking they were chatting with another human via text.

A.I. models have never come closer to passing the traditional Turing Test than today’s pre-trained transformer models. Yet it’s still up for debate whether any single chatbot has truly passed it. Earlier this year, more than 1.5 million people participated in a social Turing Test called “Human or Not,” hosted by Israel-based AI21 Labs. After a month and more than 10 million conversations between humans and bots, researchers concluded that 32% of participants were unable to determine whether they were chatting with another human or a bot.

Researchers are now considering new frameworks for gauging machine intelligence, grounded in psychology, neurology and cognitive development. 

Should Human-Like Machines Be Our Goal? The Case for Artificial Capable Intelligence

In 2020, Rohit Prasad, now Amazon’s SVP and Head Scientist for Artificial General Intelligence, wrote an opinion piece arguing that the Turing Test is a misleading milestone.

“Instead of obsessing about making AIs indistinguishable from humans, our ambition should be building AIs that augment human intelligence and improve our daily lives in a way that is equitable and inclusive.”

Instead, Prasad said, developers should strive to build the most capable A.I. assistant possible, one that can process complex requests, plot a course through multiple tasks, and execute those tasks to achieve the intended outcome. Three years later, this is still a tall order, even for highly integrated A.I. systems like Google’s Bard. Current models are great at predicting what should come next in a sequence, but they still can’t interpret and plan for implicit circumstances as effectively as even a human toddler. Today’s models still need a lot of hand-holding.
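
To make “predicting what should come next” concrete, here is a minimal sketch of greedy next-token prediction, assuming the small open-source GPT-2 model and the Hugging Face transformers library; this is an illustrative stand-in, not how Bard or ChatGPT are actually built or served, but the underlying mechanism of scoring one token at a time is the same.

    # A minimal sketch of next-token prediction with a small open model (GPT-2).
    # Assumes the Hugging Face `transformers` and `torch` packages are installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Alan Turing proposed his famous test in the year"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, sequence, vocabulary)

    # The model assigns a score to every token in its vocabulary for the position
    # after the prompt; greedy decoding simply takes the highest-scoring one.
    next_token_id = int(logits[0, -1].argmax())
    print(tokenizer.decode(next_token_id))

Every longer response is produced by repeating this single step, one token at a time, which helps explain why planning a course through multiple tasks still takes so much scaffolding around the model.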

The goal, for Prasad, is an A.I. that exhibits human-like attributes of intelligence, such as common sense and self-supervision, combined with a machine’s efficiency at searching for information, recalling facts and executing tasks.

Prasad says that A.I. should be “ambient.” It should anticipate your needs, accomplish tasks and fade into the background when not needed.  

Under this view of A.I., the relative intelligence of the machine is essentially moot, as long as it possesses the capability to understand complex requests, account for contingencies, plan tasks methodically, and execute them. 

Mustafa Suleyman, co-founder of Google’s DeepMind A.I. division, shares a similar view on how we should think about machine intelligence. In his book, The Coming Wave: Technology, Power, and the Twenty-first Century’s Greatest Dilemma, Suleyman argues that our goal should be to build “Artificial Capable Intelligence” (ACI) and that we need a modernized Turing Test.

Suleyman says that the classic Turing Test “doesn’t tell us anything about what the system can do or understand, anything about whether it has established complex inner monologues or can engage in planning over abstract time horizons, which is key to human intelligence.”

Suleyman’s take on the test challenges an A.I. to turn a seed investment of $100,000 into $1 million. To do so, the bot must research an e-commerce business opportunity, generate blueprints for a product, find a manufacturer on a site like Alibaba, and then sell the item on a digital marketplace.

Suleyman thinks A.I. will be able to pass such a test within two to three years. It helps that the internet is rife with information about how to make money (especially through digital commerce) that A.I. can use as training data. But elsewhere in the A.I. research world, scientists are digging deeper into machine intelligence to find out whether A.I. can use human-like cognitive skills to learn things it has never been trained on.

Testing Abstract Reasoning in A.I. 

As the old adage goes, “give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime.” For the most part, A.I. models have been given their fish: trillions of data points, or “tokens,” scraped from every corner of the internet. An alternative framework challenges A.I. to catch the fish itself.

In July, researchers led by UCLA psychologist Taylor Webb published a study showing that GPT-3 performed better than a group of undergraduates on certain exercises designed to test the use of analogies to solve problems.

“Analogy is central to human reasoning,” Webb said. “We think of it as being one of the major things that any kind of machine intelligence would need to demonstrate.”

However, it wasn’t a runaway victory for GPT-3. While it scored well on some tests, it gave “absurd” answers on others.

One test was modeled on an exercise often used in child psychology. The subject hears a story about a magical genie that transfers gems from one bottle to another, and is then challenged to transfer gumballs from one bowl to another using posterboard and a cardboard tube. Drawing hints from the story, children can often pass the test easily. GPT-3 was less capable, suggesting solutions that were elaborate but physically impossible to execute, with unnecessary steps and no clear mechanism for actually transferring the gumballs.

The example hints at one of A.I.’s greatest misconceptions: memorization versus intelligence. 

When OpenAI announced the release of GPT-4, much was made of the model’s ability to pass a laundry list of rigorous academic tests, including the bar exam and medical licensing exams. The headlines stoked fear in the professional world, as white-collar workers imagined a world of A.I. doctors, biologists and lawyers. Suspicions soon arose, however, that the model passed the tests in large part because it was trained on them, or at least on similar questions with similar answers. Such training gives the A.I. the same advantage as a human with a photographic memory: it can process a question, quickly consult an expansive memory of everything it has ever learned, and present the most relevant response, without the need for any critical thinking.

Follow-up studies showed that GPT-4 performed significantly worse on questions that required knowledge of the world after 2021, suggesting that much of A.I.’s “intelligence” is really just its ability to surface relevant information it has already seen. Experts describe this shortfall, the ability to recognize only patterns it has encountered before, as an A.I. being “brittle.”

So can A.I. learn in the way that we do, through abstract reasoning? 

In 2019, Google A.I. researcher François Chollet published the framework for the “Abstraction and Reasoning Corpus” (ARC), which seeks to measure machine intelligence by a machine’s ability to accomplish complex tasks with limited prior knowledge. The framework strips away the possibility that signs of intelligence displayed by an A.I. are merely a product of its ability to consult its training data and reproduce previous solutions.

In 2020, Chollet ran the Abstraction and Reasoning Challenge, an open competition inviting A.I. builders to develop a model that could solve reasoning tasks it had never seen before, given only a handful of demonstrations to learn from.
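
To give a sense of what those tasks look like, here is a minimal sketch of the ARC task format and scoring, assuming the JSON layout used in Chollet’s public ARC repository; the file name and the trivial “solver” below are hypothetical placeholders for illustration.

    import json

    # Each ARC task is a JSON file containing a few "train" demonstration pairs
    # and one or more "test" pairs. Every grid is a list of rows of integers
    # 0-9, with each integer standing for one colored cell.
    with open("training/0a2b3c4d.json") as f:  # hypothetical task file name
        task = json.load(f)

    demonstrations = task["train"]  # the handful of examples a solver may learn from

    def solve(input_grid):
        # Placeholder: a real entry must infer the transformation rule from the
        # demonstrations alone. Echoing the input back will almost always fail.
        return input_grid

    solved = all(solve(pair["input"]) == pair["output"] for pair in task["test"])
    print("solved" if solved else "failed")

Scoring is unforgiving: a predicted output grid either matches the expected grid exactly or the task counts as a failure, which is part of why most submissions to the 2020 challenge scored zero.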

The outcome of the challenge could best be described as a partial success. The vast majority of submissions did not solve a single task, and even the best-performing model was still wrong roughly 80% of the time.

Should the ARC become the new standard benchmark for machine intelligence? The fact that no model has solved all of the challenge’s problem sets suggests the ARC benchmark is still an aspirational goal, and Chollet has said he wouldn’t expect an A.I. to achieve human-level aptitude for “many years.”

Moving Goalposts for Measuring Machine Intelligence

The goalposts for machine intelligence have clearly moved since the arrival of ChatGPT made the traditional Turing Test seem quaint. Yet we are still left with the questions of what constitutes an “intelligent” machine, whether intelligence inherently means human-like, and whether an intelligent, human-like machine is a worthy goal. 

Those like Amazon’s Rohit Prasad and Google’s Mustafa Suleyman believe that intelligent, human-like machines shouldn’t be our goal; rather, we should strive above all to build capable machines.

Meanwhile, researchers like François Chollet and Taylor Webb see machine intelligence as a prerequisite to building capable machines.  

Yet another school of thought leapfrogs all of these contemporary discussions, arguing that a modern Turing Test should consider whether an A.I. possesses a psyche, desires, and free will, in order to prevent an A.I. apocalypse. Would a machine need to be intelligent in order to have a psyche?

The breakdown of the traditional Turing Test has left us with countless more questions, and it’s likely that the study of machine intelligence going forward will increasingly resemble our research into the mysteries of our biological black box: the human brain. 
