Top AI Models for Stock, ETF, and Crypto Trading Ranked in New Benchmark
New benchmark shows Qwen2.5-72B leads in stock trading, while other LLMs excel in crypto and ETFs. Plus a chat with AnalystAI's Ata Onat on building AI research agents.
Happy New Year! I’m Matt, and welcome back to AI Street!
BENCHMARKS
Top AI Models for Stock, ETF, and Crypto Trading, According to a New Benchmark
A group of academic researchers, led by Stevens Institute of Technology in collaboration with Columbia University, Harvard University, and The Fin AI, released a new paper evaluating how well different AI models trade stocks, crypto, and ETFs.
The framework, called InvestorBench, tested 13 different large language models on their ability to trade financial assets.
The system tests each AI model's performance across different time periods: stocks from mid-2020 to mid-2021, cryptocurrencies throughout most of 2023, and ETFs from mid-2019 to late 2020.
The researchers built what they call a "memory system" based on an earlier framework called FINMEM to help AI make trading decisions, incorporating both memory storage and reflection mechanisms that help explain trading decisions. During an initial training period, the system fed each AI model market data and financial news to spot patterns between information and subsequent price movements.
The system has three memory layers: short-term (14 days), medium-term (90 days), and long-term (365 days), mimicking how human traders might weigh recent versus older market information. During the actual testing phase, the 13 AI models used this memory system to make daily trading decisions across three types of assets.
Each morning, the AI models would analyze current market conditions and news, compare these to patterns stored in their memory, and issue a simple command: buy, sell, or hold.
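To make the layered-memory idea concrete, here is a minimal Python sketch of how events could be sorted into 14-, 90-, and 365-day layers and turned into a daily buy, sell, or hold call. This is an illustration under stated assumptions, not the InvestorBench or FINMEM code: the class and function names, the sentiment score, the layer weights, and the 0.1 thresholds are all hypothetical, and the simple scoring rule stands in for the language model's actual reasoning.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Window lengths (in days) for each layer, as described in the article.
# Insertion order matters: narrowest window first.
LAYER_WINDOWS = {"short": 14, "medium": 90, "long": 365}


@dataclass
class MemoryEvent:
    day: date
    text: str
    sentiment: float  # hypothetical score in [-1, 1]; not from the paper


@dataclass
class LayeredMemory:
    events: list = field(default_factory=list)

    def add(self, event: MemoryEvent) -> None:
        self.events.append(event)

    def recall(self, today: date) -> dict:
        """Group events by the narrowest layer whose window contains them."""
        layers = {name: [] for name in LAYER_WINDOWS}
        for event in self.events:
            age = (today - event.day).days
            if age < 0:
                continue  # ignore anything dated after today
            for name, window in LAYER_WINDOWS.items():
                if age <= window:
                    layers[name].append(event)
                    break  # events older than 365 days fall out entirely
        return layers


def decide(memory: LayeredMemory, today: date) -> str:
    """Placeholder for the model's daily call: 'buy', 'sell', or 'hold'."""
    layers = memory.recall(today)
    # Assumed weights that favor recent information; purely illustrative.
    weights = {"short": 0.6, "medium": 0.3, "long": 0.1}
    score = 0.0
    for name, events in layers.items():
        if events:
            average = sum(e.sentiment for e in events) / len(events)
            score += weights[name] * average
    if score > 0.1:
        return "buy"
    if score < -0.1:
        return "sell"
    return "hold"


if __name__ == "__main__":
    today = date(2021, 1, 15)
    memory = LayeredMemory()
    memory.add(MemoryEvent(today - timedelta(days=3), "earnings beat", 0.8))
    memory.add(MemoryEvent(today - timedelta(days=60), "guidance cut", -0.4))
    print(decide(memory, today))
```

In the real framework the decision comes from the model reading retrieved memories alongside the day's news; the weighted average here only shows where that step would sit.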
The best-performing AI model for stocks, Alibaba’s Qwen2.5-72B-Instruct, achieved a 46% return with strong risk-adjusted performance, compared with a 34% return for a simple buy-and-hold strategy. For cryptocurrency trading, OpenAI’s GPT-o1-preview led the pack, while Writer’s Palmyra-Fin-70B showed the best results in ETF trading.
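For readers curious how a comparison like this is typically computed, below is a small Python sketch of cumulative return and an annualized Sharpe ratio, a common risk-adjusted measure. The daily return series are made up for illustration and are not data from the paper. The zero risk-free rate and the 252-trading-day annualization are assumptions, and the paper's exact metric definitions may differ.

```python
import math
import statistics

TRADING_DAYS = 252  # common annualization convention; an assumption here


def cumulative_return(daily_returns):
    """Compound simple daily returns into one total return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualized Sharpe ratio: mean excess return over its standard deviation."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        return float("nan")  # undefined when returns never vary
    return statistics.mean(excess) / spread * math.sqrt(TRADING_DAYS)


if __name__ == "__main__":
    # Made-up daily returns, for illustration only (not data from the paper).
    strategy = [0.010, -0.004, 0.006, 0.002, -0.001]
    buy_and_hold = [0.008, -0.010, 0.007, 0.001, -0.003]
    for name, series in [("strategy", strategy), ("buy-and-hold", buy_and_hold)]:
        print(f"{name}: return={cumulative_return(series):.2%}, "
              f"sharpe={sharpe_ratio(series):.2f}")
```

A higher cumulative return alone does not settle the comparison; the Sharpe ratio shows how much volatility was taken on to earn it.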
"Large language models are amazing. They are good at many things, but for some tasks, they're not so good, so that's why we choose to do an agentic framework, which combines different components together," Jimin Huang, founder of Fin AI, told me.
Fin AI was established in February 2024 to foster transparency and collaboration in an industry where companies often resist sharing their AI performance metrics.
Hedge funds, banks, and startups have approached Fin AI to learn from their research, but these conversations often become one-way streets, with firms unwilling to share their own findings or performance data in return.
Interestingly, the research found that larger AI models generally performed better than smaller ones, though size wasn't the only factor. Some specialized financial models underperformed despite their focus on market data, while general-purpose AI sometimes excelled at trading decisions.
As I’ve noted, there’s a real need for AI benchmarks, since it’s challenging for non-experts to understand these platforms’ use cases and how well they perform. My hunch is that AI adoption won’t really take off until the industry can agree on some uniform rankings.
AI STREET MARKETS EDITION
Sunday, Jan. 12 is the launch of AI Street Markets Edition, where I’ll be using a range of AI tools and platforms to examine market trends.
👉️ Sign up here to start receiving the new Sunday Edition! 👈️
AI AGENTS
Building AI Agents for Investment Research with AnalystAI’s Ata Onat
Ata Onat, a former private equity senior associate, was working with EMP Belstar SuperFreeze in Southeast Asia, researching and investing in industrial real estate across Singapore, Vietnam, Indonesia, and Malaysia, a time-consuming and expensive process.
Each deal involved commercial due diligence, writing investment memos, building decks, and real estate-specific work such as analyzing supply-demand dynamics, comparable facilities, lease terms, and the latest transactions. PE firms in Asia often pay research companies as much as $100,000 for a single, one-time market report.
During that time, Ata taught himself how to code and began building internal tools for the fund. Colleagues jokingly called him the “AI Guy,” but when his AI-driven research became a critical part of closing a deal, he realized this side project had greater potential. In June 2024, Ata founded AnalystAI, which develops AI agents for private markets.
In this conversation, Ata walks me through his thought process for building an agent. Step-by-step, he demonstrates how to create an agentic workflow for short opportunities in public markets. The discussion highlights practical techniques such as extracting insights from financial reports (10-Ks, 10-Qs, etc.), generating investment ideas using AI, and filtering actionable insights for deeper analysis.
Ata, who’s based in New York City, explains how iterative testing and domain expertise can refine these tools to produce results that investors can rely on.
MORE AI + FINANCE NEWS
REGULATION
Italy Fines OpenAI $15.6M for Privacy Breach
Italy's data protection agency fined ChatGPT maker OpenAI 15 million euros ($15.58 million) after closing an investigation into use of personal data by the generative artificial intelligence application. (Reuters)
a16z Partner to Lead Trump's AI Policy
Sriram Krishnan, until recently a general partner at Andreessen Horowitz (a16z), will serve as senior policy advisor for AI at the White House Office of Science and Technology Policy. (TechCrunch)
OCC Plans Generative AI Expansion
The banking regulator says in an RFI that it wants to “mature and develop” its AI and machine-learning capabilities. (FedScoop)
ECONOMY
Fed's Daly Sees AI Lifting Productivity
San Francisco Fed President Mary Daly sees echoes of the early 2000s productivity gains in the AI boom. (Bloomberg)
ADOPTION
LSEG Rolls Out AI-Driven Tool for Microsoft Teams
LSEG announced the launch of its new Meeting Prep tool, which lets bankers and analysts automate the bulk of their research gathering by leveraging a generative-AI chat-based interface. (WatersTechnology) $
How Wall Street Banks Are Embracing AI
Wall Street banks are proving that generative AI is here to stay and the tech is not just a fad. (Business Insider) $
Wealth Management Use Cases For AI
Rob Pettman, president at TIFIN, an AI and innovation platform for the wealth sector, on the latest AI trends. (WealthBriefing)
The TRADE Predictions Series 2025
Experts from Nasdaq, Millennium, Linedata, FINBOURNE, LTX, Google Cloud, Data Boiler, KCX, and ION Markets share their views on AI in 2025.
Banks Plan 2025 AI Push to Cut Junior Work
2024 has been about exploring use cases for AI, but investment banks plan more projects over the next 12 months. (Financial News)
FUNDRAISING
S&P Global Backs AI Direct Indexing Platform Brooklyn
Brooklyn AI Research, which offers AI-powered direct indexing across equities and fixed income, raised strategic funding from S&P Global Ventures and Atypical Ventures (amount undisclosed). The company's platform serves RIAs and asset managers with over $2 trillion in assets. (Press Release)
ADOPTION
CBO: AI in Early Stages, Economic Impact Unclear
A new CBO report says that we’re in AI's early stages, with significant uncertainty about its long-term economic impact. This seems to validate what many financial executives have been saying: AI adoption is happening more gradually and systematically than headlines might suggest.
Current Adoption is Limited: Only 5% of U.S. businesses meaningfully use AI in their operations, though adoption is higher in information and professional services sectors.
Cost Remains a Major Barrier: Training costs for advanced AI models have escalated from tens to hundreds of millions of dollars, with some projections hitting $1 trillion by decade's end. Customization costs remain prohibitive for many businesses.
Employment Impact More Nuanced Than Expected: About 90% of businesses using AI report no employment changes or plans to change. While 80% of U.S. workers could see at least 10% of their tasks affected by AI, only 19% could have half or more of their tasks impacted.
Economic Effects Still Unclear: Early evidence suggests AI-adopting firms become more productive, but economy-wide impacts remain uncertain. The CBO notes that, like electricity and computing, AI's full economic impact may take decades to materialize.
Federal Budget Implications Mixed: AI could affect federal revenues and spending through both economic changes and government adoption, but the direction and magnitude of these effects remain unclear.
IN CASE YOU MISSED IT
Recent "Five Minutes with" Interviews
Ex-JPM’s Tucker Balch on scaling investment analysis with AI.
USC's Matthew Shaffer on using ChatGPT to estimate “core earnings.”
Moody’s Sergio Gago on scaling AI at the enterprise level.
RavenPack | Bigdata.com’s Aakarsh Ramchandani on AI and NLP.
RESOURCES
Non-Technical Guide to AI Agents
The World Economic Forum put out a non-technical report on agents. It has helpful examples of how we’re already using AI in our everyday lives, such as spam filters and Roombas, and explains how agents are part of this evolution.