How logic can help AI models tell more truth, according to AWS



AWS distinguished scientist Byron Cook makes the case for “automated reasoning.”

Amazon AWS

The term “reasoning” is a familiar metaphor in today’s artificial intelligence (AI) technology, often used to describe the verbose outputs generated by so-called reasoning AI models such as OpenAI’s o1 or DeepSeek AI’s R1.

Another kind of reasoning, one perhaps closer to the real thing, is quietly taking root in the most advanced AI applications.

Also: Will AI think like humans? We’re not even close – and we’re asking the wrong question

Recently, Amazon AWS distinguished scientist Byron Cook made the case for what is called “automated reasoning,” also known as “symbolic AI” or, more abstrusely, “formal verification.” 

It is an area of study as old as the artificial intelligence field, and, said Cook, it is rapidly merging with generative AI to form an exciting new hybrid, sometimes termed “neuro-symbolic AI,” which combines the best of automated reasoning and large language models.  

[Slide: terms for automated reasoning]

Amazon AWS

Cook gave a talk about automated reasoning at the AWS Financial Services Symposium in New York this May.

By whatever name you call it, automated reasoning refers to algorithms that search for statements or assertions about the world that can be verified as true using logic. The idea is that all knowledge should rest rigorously on what can be logically asserted.

Also: AI will boost the value of human creativity in financial services, says AWS

As Cook put it, “Reasoning takes a model and lets us talk accurately about all possible data it can produce.”

Cook gave a brief snippet of code as an example that demonstrates how automated reasoning achieves that rigorous validation.

As Cook explained to his audience, an instruction loop in a piece of computer code can be predicted — with certainty — to stop running at some point based on the conditions established in its statements. So, the question, “Can this loop run forever?” can be answered with logical analysis. 

[Slide: code loop example]

Amazon AWS

In Cook’s example, two variables, X and Y, are integers; Y is positive, and X is greater than Y. Y is repeatedly subtracted from X, reducing the value of X. Eventually, subtracting Y from X will make X smaller than Y. At that point, the conditions of the code loop have been violated, and the loop will terminate. 

The simple fact — that eventually X will be smaller than Y — can be inferred logically without exhaustively running the code loop itself. That’s perhaps the most important element of automated reasoning, a principle that Cook returned to repeatedly: Automated reasoning can answer fundamental questions about something with logic rather than with exhaustive trial and error.
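Cook's slide can be sketched in a few lines of Python (the variable names and preconditions follow the description above; this is an illustration, not AWS's actual example code):

```python
# Cook's example loop: requires y > 0 and x > y on entry.
def loop(x: int, y: int) -> int:
    assert y > 0 and x > y          # preconditions from the example
    while x > y:                    # loop guard
        x = x - y                   # x strictly decreases by the positive y
    return x
```

A prover reasons about this symbolically: `x` acts as a "ranking function" that shrinks by a fixed positive amount on every iteration and is bounded below by `y`, so termination follows from the logic alone, without ever running the loop.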

“That’s what symbolic AI is,” said Cook. “We find arguments, step by step, and we can check them mechanically using the foundations of mathematical logic to make sure each statement is true. And then automated reasoning is the algorithmic search for arguments of that form.”

Such step-by-step solutions go back to the dawn of AI in the late 1950s, said Cook. In fact, in 1959, a top-of-the-line IBM machine, the 704, ran a form of automated reasoning to prove theorems of Whitehead and Russell’s famous Principia Mathematica.

But there’s been a lot of progress since then, Cook told the audience. “The tools keep getting remarkably better” through new algorithms.

Also: What is DeepSeek AI? Is it safe? Here’s everything you need to know

AWS has been using automated reasoning for a decade now, said Cook, to achieve real-world tasks such as guaranteeing delivery of AWS services according to SLAs, or verifying network security. 

Translating a problem into terms that can be logically evaluated step by step, like the code loop, is all that’s needed. 

For example, network security very often involves statements that are either absolutely true or absolutely false, explained Cook, which means that they can be tested in the same way as the code loop to determine automatically whether conditions are met or violated.

[Slide: typical security questions]

Amazon AWS

“When you look at the questions [AWS] customers ask, they use lots of words like, ‘for all,’ and ‘always,’ and ‘never’,” said Cook, such as “Is my data always encrypted at rest and in transit?” 

“These are universal statements; they range over very large, if not intractably large, if not infinite, sets,” said Cook. “It’s not possible to exhaustively test any policy to know such absolutes. The number of lifetimes of the sun it would take to exhaustively test all possible authorization requests would take 92,686 digits to write down” — not practical, in other words.

Using automated reasoning, AWS’s Identity and Access Management tool IAM Analyzer, which has been available for free for four years, “can solve the same problem in seconds,” said Cook. “That’s the value proposition of reasoning and mathematical logic as opposed to exhaustive testing.”
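The logic-instead-of-enumeration idea can be made concrete with a toy satisfiability solver. The sketch below is my own minimal DPLL procedure — not Zelkova or IAM Analyzer — and it proves a universal policy property, "every allowed request is encrypted," by showing that the policy conjoined with the property's negation has no satisfying assignment:

```python
def assign(clauses, lit):
    """Simplify a CNF clause set under the assumption that lit is true."""
    out = []
    for c in clauses:
        if lit in c:
            continue                 # clause already satisfied: drop it
        if -lit in c:
            c = c - {-lit}           # literal falsified: remove it
        out.append(frozenset(c))
    return out

def satisfiable(clauses):
    """Minimal DPLL: True iff the CNF clause set has a satisfying assignment."""
    if not clauses:
        return True
    if frozenset() in clauses:
        return False                 # empty clause: contradiction
    for c in clauses:                # unit propagation
        if len(c) == 1:
            return satisfiable(assign(clauses, next(iter(c))))
    lit = next(iter(next(iter(clauses))))   # branch on an arbitrary literal
    return satisfiable(assign(clauses, lit)) or satisfiable(assign(clauses, -lit))

# Variables: 1 = "request is allowed", 2 = "request is encrypted".
# Policy: allowed -> encrypted, encoded as the clause (-1 or 2).
# To prove "every allowed request is encrypted", show that
# policy AND allowed AND NOT encrypted is unsatisfiable.
proved = not satisfiable([frozenset({-1, 2}), frozenset({1}), frozenset({-2})])
```

The solver never enumerates all requests; it searches for a logical argument, and the unsatisfiability result is a proof that the property holds for every possible request.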

Cook argued that the power of automated reasoning means it will increasingly be “a form of artificial super-intelligence.”

Also: OpenAI’s o1 lies more than any major AI model. Why that matters

“For some time, we have had a form of artificial super-intelligence, if you will, it just spoke JSON,” said Cook. Automated reasoning has been used to “solve open math conjectures,” the stuff that “grabs headlines,” he said.

“We are solving in milliseconds or seconds or hours what humans could never solve in, like, a hundred lifetimes.”

Other uses at AWS include proving the correctness of open-source code developed by AWS and even “proving the correctness of AWS’s front door,” meaning evaluating whether to allow or disallow requests for access to AWS that come in from clients as frequently as two billion times a second.

[Slide: proving the correctness of AWS]

Amazon AWS

Cook said all of these applications — the IAM Analyzer, the code proving, the AWS access authorization, and numerous other tools and services — draw upon an internal automated reasoning infrastructure at AWS called Zelkova, which can translate policies into mathematical formulas.

A lot of the momentum for automated reasoning and Zelkova has come from the financial services industry, said Cook.

“We’ve had really nice partnerships with folks like Goldman, Bridgewater,” said Cook, citing the investment bank and the hedge fund. The technology has helped those clients’ teams “deploy faster, and, actually, save a lot of money.”

Also: AI has grown beyond human knowledge, says Google’s DeepMind unit

(John Kain, who is head of market development efforts in financial services for AWS, recently spoke to ZDNET about the use of automated reasoning for financial clients.)

The future of automated reasoning is melding it with generative AI, a synthesis referred to as neuro-symbolic.

On the most basic level, it’s possible to translate from natural-language terms into formulas that can be rigorously analyzed using logic by Zelkova. 

In that way, Gen AI can be a way for a non-technical individual to frame their goal in informal, natural language terms, and then have automated reasoning take that and implement it rigorously. The two disciplines can be combined to give non-logicians access to formal proofs, in other words.
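A toy version of that translation step might look like the following, where a lookup table stands in for the LLM that maps plain English to a checkable predicate (everything here — the rule strings, the policy shape — is an illustrative assumption, not AWS's API):

```python
# Hypothetical sketch: the table stands in for the LLM translation step;
# a real system would emit formal logic for a prover, not a Python lambda.
def nl_to_predicate(rule: str):
    table = {
        "data is always encrypted at rest":
            lambda policy: all(r["encrypted"] for r in policy["resources"]),
        "no resource is publicly readable":
            lambda policy: not any(r.get("public", False)
                                   for r in policy["resources"]),
    }
    return table[rule]

policy = {"resources": [{"name": "bucket-a", "encrypted": True},
                        {"name": "bucket-b", "encrypted": True, "public": False}]}
always_encrypted = nl_to_predicate("data is always encrypted at rest")
```

The division of labor is the point: the generative model handles the informal language, and the formal machinery does the rigorous checking.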

Also: What Apple’s controversial research paper really tells us about LLMs

As Cook put it: “You’re an expert in financial services, in immigration law; with automated reasoning checks, we give an individual the ability to encode that, and here are the rules derived.”

The other reason for a hybrid is to deal with the limitations of generative AI that have become apparent, especially what are called hallucinations or confabulations, the tendency for large language models (LLMs) to produce false assertions, sometimes wildly so.

“People got super excited about them [LLMs], and now they’re beginning to realize that, oh, wait, some of these things have limitations,” said Cook. “You can’t just force infinite data into these things, and they’ll just always get better.”

Scholars, especially critics of the current generative AI approach, have long discussed the idea of a hybrid neuro-symbolic approach. Noted gen AI skeptic Gary Marcus has suggested that gen AI needs something like formal logic to ground it in truth. 

Also: With AI models clobbering every benchmark, it’s time for human evaluation

There is even a venture-backed startup named Symbolica whose mission statement implies it will surpass what it sees as the limitations of LLMs.

Cook offered a practical example of the hybrid approach: checking the veracity of chat bots. 

“In a chat bot, you have questions and answers, and you want to know, is it true?” said Cook. Automated reasoning allows you to evaluate statements according to formal logic.

An example is an offering from AWS currently in preview, announced at AWS re:Invent, called Automated Reasoning Checks. The program can take a chatbot’s natural-language output and convert it into formal logic that can then be verified.

Cook used a chat with a bank loan chatbot as an example. A person asks how long it should take to get approval for their loan application. The chatbot responds with a series of statements, including a promise of “1 business day of approval.”

The automated reasoning works to verify whether those answers from the bot are true. 
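A toy version of that verification step is sketched below. The regex extraction and the three-to-five-day policy bound are assumptions for illustration only; Automated Reasoning Checks builds a genuine logical model from the customer's policy documents rather than pattern-matching:

```python
import re

# Assumed bank rule for illustration: approvals take 3 to 5 business days.
POLICY_MIN_DAYS, POLICY_MAX_DAYS = 3, 5

def claim_is_valid(answer: str) -> bool:
    """Extract an 'N business day(s)' claim and check it against the rule."""
    m = re.search(r"(\d+)\s*business day", answer)
    if m is None:
        return False                 # no checkable claim found
    days = int(m.group(1))
    return POLICY_MIN_DAYS <= days <= POLICY_MAX_DAYS
```

Under the assumed rule, the bot's “1 business day” promise would be flagged as false, while an answer of “4 business days” would verify.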

[Slide: chatbot correction]

Amazon AWS

Explained Cook, “In the background, what we’re doing is we’re taking the natural language text, we’re mapping it into mathematical logic, we’re proving or disproving the correctness of the statements, and then we’re providing witnesses so you can, as a customer, pull on that, the log of the argument, that the property is true, but in a way that could be audited.”

Cook said automated reasoning will become even more important in an age of agentic AI. “Where things are headed is, we’re hearing more and more about agents; on the hype curve, this is sort of the new, new entry,” he said. 

Also: Google’s new AI tool Opal turns prompts into apps, no coding required

“If you are going to allow natural language to be converted into action that makes one-way-door decisions on your behalf with your money, with your reputation, with your career, with your code, that correctness is going to be absolutely paramount. With agentic AI, we’re allowing mere mortals to essentially write and execute distributed systems.”

Agentic AI consists of many AI systems operating in parallel, and it should be handled the way automated reasoning has handled other distributed-systems work at AWS, he argued.

For example, in the case of AWS’s S3 storage system, the internal tool, Zelkova, was used to “prove the correctness of the distributed systems design,” he said.

“S3 [Amazon’s object storage] under the hood is hundreds of protocols,” Cook explained. “Assuming all the machines are speaking the protocols correctly, then you will get strong consistency — collectively, we will get the correct outcome.”

He explained that the same group voting approach, a kind of wisdom of the crowd, can be harnessed to verify agents’ actions. 
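One simple way to make the group-voting idea concrete is a strict-majority quorum over independent agent outputs (my own sketch, not AWS's mechanism):

```python
from collections import Counter

def quorum(votes):
    """Strict-majority vote over independent agent outputs: returns the
    winning answer, or None when no answer clears a majority."""
    if not votes:
        return None
    winner, count = Counter(votes).most_common(1)[0]
    return winner if count * 2 > len(votes) else None
```

An action such as `quorum(["allow", "allow", "deny"])` would go forward only because a majority of agents agreed; a split vote yields no decision, which is exactly the kind of property automated reasoning can then verify over the whole protocol.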

Also: Hacker slips malicious ‘wiping’ command into Amazon’s Q AI coding assistant – and devs are worried

“That’s the sort of thing we can show very quickly and very easily with automated reasoning.” 

Cook expressed optimism that the merger of automated reasoning and gen AI will continue to make progress.

“I’m glad to be alive and I’m glad to be a practitioner in this field right now,” he said. “Because these branches are really very quickly actually coming back together now.”

Those wishing to explore the topic further may want to start with Cook’s introductory blog post on automated reasoning from 2021.

