September 25, 2024
Can AI do math yet?
Today, leading large language models (LLMs) will attempt to generate advanced mathematical visualizations, walk you through the steps of an analysis, and derive insights from datasets. Many mathematical queries put to leading chatbots still end in hallucinations, outright failures, and similarly frustrating outcomes, but the range of usable tools and response types is expanding. According to OpenAI, its next release should be even more promising.

Anticipating GPT “Strawberry” and more of the latest on AI analysis systems

If your role involves financial analysis or modeling, you may have been tempted to ask ChatGPT to solve a math problem or analyze a spreadsheet for you. Chances are you were disappointed in the results, or simply did not trust them enough to use without redoing the work yourself. According to several experts we spoke with, that’s changing. In short, AI might soon be able to do math more reliably.

Most followers of AI news have heard that LLMs are increasingly able to pass tests of human competency in law, business and other academic fields, most famously by passing the bar exam at higher rates than human test takers. Thanks to large-scale reinforcement learning and improved “chain of thought” reasoning, OpenAI thinks its new model will more convincingly meet the needs of users and developers building for analytical use cases. This includes use cases in finance, such as quantitative modeling, forecasting, data parsing and historical analysis.

Specifically, the upcoming “Strawberry” release of GPT-4o brings an updated reasoning technique to market: the LLM can ‘think about’ how it will solve a problem and plan the steps before actually generating text. During that process, which may take upward of one minute, the model iterates through several attempts to structure the problem (not unlike a human analyst or consultant) before proceeding to generate answers. Whereas LLMs have traditionally generated a response in one shot, OpenAI claims that breaking problems down into components will let its new model unlock new types of use cases.
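To make the "plan first, then solve" idea concrete, here is a minimal sketch in Python. It is only an analogy: it does not call any model and says nothing about how OpenAI's system works internally. The `Step` class, the `solve` helper and the revenue figures are all hypothetical, chosen to show a question ("what is next year's revenue at the historical growth rate?") being broken into small steps that each build on the previous result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    """One named step of a plan; `run` reads earlier results from the state."""
    name: str
    run: Callable[[Dict[str, float]], float]


def solve(plan: List[Step], inputs: Dict[str, float]) -> Tuple[Dict[str, float], List[Tuple[str, float]]]:
    """Execute the steps in order, recording an auditable trace of each result."""
    state = dict(inputs)
    trace = []
    for step in plan:
        state[step.name] = step.run(state)
        trace.append((step.name, state[step.name]))
    return state, trace


# Hypothetical plan: derive a growth rate, then project one year forward.
plan = [
    Step("growth_multiple", lambda s: s["end_revenue"] / s["start_revenue"]),
    Step("cagr", lambda s: s["growth_multiple"] ** (1 / s["years"]) - 1),
    Step("next_year_revenue", lambda s: s["end_revenue"] * (1 + s["cagr"])),
]

# Illustrative inputs only: revenue grew from 100 to 121 over two years.
state, trace = solve(plan, {"start_revenue": 100.0, "end_revenue": 121.0, "years": 2})

for name, value in trace:
    print(f"{name}: {value:.4f}")
```

The point of the trace is that each intermediate result can be inspected on its own, which is much harder with a single one-shot answer.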

When it comes to tackling quantitative and conceptual problems, the results are fairly impressive. According to OpenAI, the new release outperforms its predecessors on benchmarks, especially in areas that may be of interest to investors and analysts: scores in subjects such as math, law, econometrics, logic and “global facts” all improve on the order of 20-30%.

Source: OpenAI

OpenAI is not the only research lab that has pushed forward in this direction. Google DeepMind has also notched significant achievements, including a silver-medal-level score at the International Mathematical Olympiad, with the expectation of beating out top-scoring humans by next year. Solving these problems requires both a high degree of critical thinking and conceptual prowess, suggesting that AI systems are developing the ability to match humans in solving the most complex analytical problems.

Meanwhile, Anthropic’s Claude 4 (which has not yet been announced) is rumored to be targeting similar capabilities, but with a skew toward more “responsible” answers. In other words, its developers are building in reasoning methods to minimize the risk of hallucination, overconfidence and unverified sources making their way into what is supposed to be a more empirical path of reasoning.

Productizing Analytical Reasoning for Finance

GenAI is now likely to play a larger role in analytical work sooner than many anticipated. With that, the question arises of how these capabilities will be made useful for personal and professional use cases, in fields such as private equity or investment banking. Chatbots may work well for students, but as we discussed in our post on AI use cases, the largest productivity gains will come from tools and applications that natively embed LLMs into existing high-value workflows.

Most users do not want to think about how to program or prompt engineer models to make raw outputs helpful; instead, application developers such as Keye will do the work for professionals like investors, consultants, researchers and operators who rely on analytical work to make decisions and recommendations. Keye works with OpenAI and other model developers to assist in producing the fundamental analyses required for due diligence and investment decisions.

Historically, AI startups might have been limited in their ability to provide value further down the analysis chain. The expectation would have been that AI could summarize ideas from text files in a data room, or report on the state of an industry by referencing reports available online. Retrieval-augmented generation (RAG), discussed in our last post, enhanced this by allowing for more context-specific interpretations of data using proprietary data sets and some guided reasoning. However, the underlying analysis was still done by a model with relatively weak reasoning ability.
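For readers who have not seen RAG in practice, the retrieval step can be sketched in a few lines of Python. This is a toy under stated assumptions: it ranks documents with plain word-count cosine similarity rather than the embedding models production systems typically use, and the `DATA_ROOM` contents, `retrieve` and `build_prompt` are hypothetical names invented for illustration. The resulting prompt would then be passed to whichever model the application uses.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: List[str], k: int = 1) -> str:
    """Place the retrieved passages ahead of the question as context."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical data-room snippets, invented for this example.
DATA_ROOM = [
    "Revenue in FY2023 was 12.4 million dollars, up 18 percent on FY2022 revenue.",
    "The company leases office space in Austin under a five year agreement.",
    "Customer churn was 6 percent, concentrated in the SMB segment.",
]

print(build_prompt("What was revenue in FY2023?", DATA_ROOM))
```

Retrieval narrows what the model sees, but the reasoning over that context is still only as good as the model doing it, which is the limitation described above.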

Now, those same techniques can be combined with reasoning-based models to allow for iterative reasoning within processes like due diligence, sourcing, strategic planning and modeling. For daily operations, the new class of reasoning-based systems could automate and improve the accuracy of financial modeling and risk analysis. By generating iterative models, products based on these systems could offer analyses and predictions that adapt to new data as it becomes available, ensuring that stakeholders have the most current analysis at their disposal.
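As a simple picture of a forecast that adapts to new data, consider exponential smoothing, where each new actual nudges the running forecast. This is our own illustrative choice of technique, not a description of what any reasoning model does; the function names, the `alpha` weight and the sample figures are assumptions.

```python
from typing import List


def update_forecast(previous_forecast: float, new_actual: float, alpha: float = 0.5) -> float:
    """Blend the newest actual with the prior forecast (simple exponential smoothing)."""
    return alpha * new_actual + (1 - alpha) * previous_forecast


def rolling_forecasts(actuals: List[float], alpha: float = 0.5) -> List[float]:
    """Seed the forecast with the first actual, then update once per new data point."""
    forecast = actuals[0]
    history = [forecast]
    for actual in actuals[1:]:
        forecast = update_forecast(forecast, actual, alpha)
        history.append(forecast)
    return history


# Illustrative monthly revenue figures.
print(rolling_forecasts([100.0, 110.0, 120.0]))
```

A higher `alpha` makes the forecast react faster to recent data; a lower one smooths out noise.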

Takeaways for Buyers of AI in Finance

Even without robust mathematical and analytical abilities, AI was already a powerful co-pilot for getting up to speed, surfacing insights and summarizing ideas. With the ability to go deeper on tasks like modeling, due diligence and rapid, high-confidence calculations, AI will go from being one of a dozen tools to a core engine of analysis work streams.

For firm leaders in PE, VC and similar fields, AI continues its transition from a nice-to-have to a foundational technology that investors must at least have a perspective on. As we discussed in our piece on return on investment in AI, firms should be looking to adopt solutions that can unlock specific use cases and deliver broad-based productivity gains.

With the ability to do analysis at a high degree of integrity, AI is now delivering those gains and competitive advantages to firms. Here's what that might look like:

  1. Streamlined Workflows: AI significantly streamlines financial workflows by automating time-intensive tasks such as data entry and model building.
  2. Error Reduction: Traditional financial modeling is prone to human error, from mistyped numbers to incorrectly copied formulas. AI's new computational precision matches that of humans, providing a valuable second set of eyes.
  3. Enhanced Collaboration: With AI's capability to expedite the forecasting process, financial teams can enhance their scenario planning. This speed facilitates rapid iterations of financial models based on varying assumptions, allowing teams to adapt swiftly to changing market conditions and to explore multiple strategic directions with ease.
  4. Management of Large Data Volumes: Especially when built into innovative analysis systems, AI excels in handling extensive datasets that overwhelm traditional spreadsheets. This capability is crucial when dealing with vast amounts of transactional data, enabling analysts to manage and analyze large-scale data efficiently without the constraints of spreadsheet software.
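On the last point, the reason code-based analysis scales past a spreadsheet is that records can be processed one at a time instead of being loaded into a grid. A minimal sketch, assuming a hypothetical CSV export with `account` and `amount` columns:

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable


def total_by_account(rows: Iterable[Dict[str, str]]) -> Dict[str, float]:
    """Accumulate amounts per account while reading one row at a time."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["account"]] += float(row["amount"])
    return dict(totals)


# Tiny in-memory sample; in practice the reader would wrap an open file.
SAMPLE = "account,amount\nsales,100.50\nrefunds,-20.00\nsales,49.50\n"
totals = total_by_account(csv.DictReader(io.StringIO(SAMPLE)))
print(totals)
```

Because `csv.DictReader` yields rows lazily, memory use stays flat no matter how many transactions the file contains.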

There are risks to relying exclusively on AI, but with potential applications growing both broader and deeper, the barrier to experimentation has never been lower.

Still, AI-generated models and forecasts will need to be rigorously verified by experienced analysts to ensure their accuracy. While AI can reduce the incidence of human error, its outputs are based on the data and parameters it has been trained on, which can sometimes lead to skewed or incorrect analyses if not properly supervised.
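Part of that verification can itself be automated by recomputing derived figures independently and flagging anything that does not reconcile. The sketch below assumes a hypothetical model output with `revenue`, `costs`, `gross_profit` and `gross_margin` fields; a real review would cover far more than these two identities.

```python
import math
from typing import Dict, List


def check_model(model: Dict[str, float], rel_tol: float = 1e-6) -> List[str]:
    """Recompute derived figures from the inputs and report any mismatches."""
    issues = []

    expected_profit = model["revenue"] - model["costs"]
    if not math.isclose(model["gross_profit"], expected_profit, rel_tol=rel_tol):
        issues.append(f"gross_profit is {model['gross_profit']}, expected {expected_profit}")

    # The margin is checked against the recomputed profit, not the reported one.
    expected_margin = expected_profit / model["revenue"]
    if not math.isclose(model["gross_margin"], expected_margin, rel_tol=rel_tol):
        issues.append(f"gross_margin is {model['gross_margin']}, expected {expected_margin}")

    return issues


# Hypothetical AI-generated output with an inconsistent margin.
reported = {"revenue": 200.0, "costs": 150.0, "gross_profit": 50.0, "gross_margin": 0.30}
for issue in check_model(reported):
    print("Needs review:", issue)
```

Checks like this catch arithmetic inconsistencies, not flawed assumptions, so they complement an experienced analyst's review rather than replace it.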

Lastly, as teams explore AI adoption, "back office" concerns about operational costs, privacy, security and ethics are still slowing the buying and onboarding process. Buyers should be working closely with both early-stage and enterprise solutions to figure out the best approach for them. As always, the team at Keye is ready to have those complex and nuanced conversations with investors who are ready to see for themselves what the latest models and AI applications are capable of. Send us a note at founders@keye.co.
