People who claim that language models are ineffective at financial research are being intentionally obtuse.
And no, I’m not talking about going to ChatGPT and asking your financial questions. ChatGPT and other language models are NOT trained to give you access to real-time financial information. At best, they can try to search the web and give you an answer. At worst, they’ll hallucinate, outright lie to you, and give you information that USED to be true… 200 days ago.
No. I’m talking about a purpose-built platform for financial analysis. A platform that can look at real-time financial information and sift through millions of records in seconds — where traditional analysis would take days or longer.
Don’t understand the difference between the two? Let me explain.
How to use AI for financial analysis?
If you want to use AI for financial analysis, you need a system built specifically for that purpose.
An ordinary LLM relies solely on its training data. While it may be accurate in some regards, it’s often wrong in many others, so we cannot rely on what it says about stocks.
To make the LLM reliable, we have to augment it by feeding it real-time financial information and applying advanced prompt engineering.
By doing so, you get accurate, reliable answers backed by data.
Pic: An LLM answering “what stocks increased their revenue every quarter for the past 8 quarters”
Luckily, you don’t have to do this work to use AI for trading and investing. I already did, and you can try it yourself for free.
The AI Chat within my platform NexusTrade can:

* Create algorithmic trading strategies
* Analyze the fundamentals of watchlists or individual stocks
* Find novel investment opportunities and patterns in the market
This article will explain how you can use LLMs for financial analysis and algorithmic trading. We’ll start from collecting the data to using the data to perform advanced, comprehensive financial research.
While this article will focus on financial research, these lessons can be applied to other aspects, such as using LLMs for backtesting and algorithmic trading.
Pic: Using Aurora to create algorithmic trading strategies
So if you want to build your own LLM system for financial analysis, here’s EXACTLY what you have to do, step-by-step.
Step 1) Obtain real-time financial data
The most important step in using LLMs for financial analysis is obtaining a source of real-time financial data.
“Financial information” is a broad term. It can include technical indicators, fundamental indicators, news sources, and more. For this article, we’ll focus on prices and fundamental indicators.
Why these specifically? Because they’re the most important for long-term stock trends.
Indicators such as revenue and net income tell us how much money a company makes and how much of that results in a profit. Stock prices reveal trends and trading ranges. Combining the two gives us the foundation for a powerful AI stock assistant.
From my research, some of the best sources of data include:

* SimFin: The best bang for your buck. SimFin allows bulk downloads for fundamentals and has an extremely wide range of fundamentals for stocks, including sources to the reports.
* Polygon: An extremely comprehensive data source. Probably your best bet for intraday stock and crypto data in one centralized location. Includes bulk downloads for stock data and an easy-to-use API.
* EODHD: Another comprehensive source of data. Includes additional data such as news, insider transactions, macroeconomic data, and more.
Feel free to combine data sources that fit your use case. Key considerations include API request limits, bulk download options, specific data availability, and cost. If you want to get creative, alternative sources like StockNewsAPI can provide sentiment data.
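To make this concrete, here’s a minimal sketch of pulling daily price bars over REST. It follows the shape of Polygon’s aggregates endpoint, but treat the exact path, parameters, and the `POLYGON_API_KEY` environment variable as assumptions to verify against the vendor’s current documentation.

```python
# A minimal sketch of fetching daily OHLCV bars from a REST data vendor.
# Endpoint shape follows Polygon's aggregates API; verify against their docs.
import os
import requests

API_KEY = os.environ["POLYGON_API_KEY"]  # assumption: key stored in an env var

def fetch_daily_bars(ticker: str, start: str, end: str) -> list[dict]:
    """Fetch daily price bars for `ticker` between two ISO dates."""
    url = (
        f"https://api.polygon.io/v2/aggs/ticker/{ticker}"
        f"/range/1/day/{start}/{end}"
    )
    resp = requests.get(url, params={"apiKey": API_KEY, "adjusted": "true"})
    resp.raise_for_status()
    return resp.json().get("results", [])

bars = fetch_daily_bars("AAPL", "2024-01-01", "2024-06-30")
print(f"Fetched {len(bars)} daily bars")
```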
What data sources are most important for you for algorithmic trading? Leave a comment below! 💬👇🏾
Once we’ve identified the data we need, we’ll have to store it in a way that makes sense for our application.
Step 2) Organize, store, and sync the data
Organizing and storing the data is critical — the wrong decisions early on can lock you into a pattern that’s difficult to change later.
Trust me, I know from experience. I originally stored the data all in MongoDB, but when I found it too slow (and complicated) for complex queries, I had to do weeks of work migrating everything to BigQuery.
Other suitable options include Postgres and TimescaleDB. Your objective should be extremely fast reads, and serverless platforms like BigQuery make this process easier.
Pic: A subset of my BigQuery schema
Once the database is set up, create jobs to upload and sync the data. SimFin makes this straightforward with bulk downloads, while other APIs may require more effort but follow the same principles.
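As an illustration, here’s a hedged sketch of such a sync job in Python: it loads a bulk fundamentals CSV into BigQuery using the official client library. The file name and the `my_project.stocks.income_statements` table ID are placeholders for your own schema, and the semicolon delimiter is an assumption based on SimFin’s bulk exports.

```python
# A sketch of a nightly sync job: load a bulk fundamentals CSV into BigQuery.
import pandas as pd
from google.cloud import bigquery

def sync_fundamentals(csv_path: str, table_id: str) -> None:
    """Load a bulk fundamentals CSV into a BigQuery table, replacing its contents."""
    df = pd.read_csv(csv_path, sep=";")  # assumption: SimFin bulk files are ;-delimited
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE")
    job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
    job.result()  # block until the load job finishes

# Placeholder file and table names; adapt to your own dataset.
sync_fundamentals("simfin_income_quarterly.csv", "my_project.stocks.income_statements")
```

Run a job like this on a schedule (cron, Cloud Scheduler, etc.) so the database stays in sync with the vendor’s latest data.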
Now, we’re ready for the fun part: teaching the LLM to query for financial information.
Step 3) Prompt engineering for real-time financial analysis
3a) Select a model
Weaker models, like GPT-4o-mini, are only suitable for basic analysis and creating simple trading strategies. Stronger models, such as GPT-4o, Claude 3.5 Sonnet, or GPT-o1, are more expensive, but better for tasks requiring complex reasoning, like generating SQL queries for complex analysis.
Pic: The different models you can choose in NexusTrade
Once we’ve selected our model, we need to build a system prompt.
Did you know that instead of building an entire platform from scratch, you can use NexusTrade for free? Try it out and let me know your thoughts! I’m always welcome to feedback, both positive and negative!
3b) Create a prompt
Pic: The system prompt for the financial analysis use-case in NexusTrade (excluding examples)
Building a system prompt is a lot more complicated than one might anticipate. We need our model to answer a wide array of different questions involving the database. Here’s how we can enable it to do so.
The architecture of a system prompt can be thought of in the following way:

* Role: Defines the LLM’s identity and goals.
* Context/Constraints: Explains schemas, response formatting, and any limitations.
* Examples: Most importantly, examples give us pairs of desired inputs and outputs. They enable few-shot prompting, where the model learns the desired format from the context of the conversation.
Pic: Showcasing an example conversation that is injected into the conversation
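Here’s a minimal sketch of how those three parts might be assembled into a message list for a chat-completions API. The role text, schema description, and example pair below are illustrative placeholders, not NexusTrade’s actual prompt.

```python
# A sketch of assembling a system prompt from Role, Context, and Examples.
# All strings below are illustrative assumptions.
ROLE = "You are a financial analysis assistant that answers questions by writing SQL."
CONTEXT = (
    "Tables: price_data(ticker, date, close), "
    "fundamentals(ticker, fiscal_quarter, revenue, net_income). "
    "Always respond in JSON with keys 'thought' and 'sql'."
)
EXAMPLES = [
    {
        "user": "What was AAPL's revenue last quarter?",
        "assistant": '{"thought": "Look up the latest quarter.", '
                     '"sql": "SELECT revenue FROM fundamentals WHERE ticker = \'AAPL\' '
                     'ORDER BY fiscal_quarter DESC LIMIT 1"}',
    },
]

def build_messages(user_question: str) -> list[dict]:
    """Combine role, context, and few-shot examples into a chat message list."""
    messages = [{"role": "system", "content": f"{ROLE}\n\n{CONTEXT}"}]
    for ex in EXAMPLES:  # few-shot pairs injected into the conversation
        messages.append({"role": "user", "content": ex["user"]})
        messages.append({"role": "assistant", "content": ex["assistant"]})
    messages.append({"role": "user", "content": user_question})
    return messages
```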
For a more sophisticated setup, we’ll store some examples in a database and fetch them at runtime. To do this:

1. We store a list of examples and the vectorized version of each in a database.
2. When a user sends a request, we transform it into a vector using the embeddings API from OpenAI.
3. We pull all examples from the database and measure the similarity between the input and each example.
4. We pick the most similar ones and inject those examples into the conversation.
5. We then generate our response.
This process is known as retrieval-augmented generation. It’s very useful for things such as generating queries because there is a very wide array of possible query outcomes. It allows us to give our model more information without overloading our system prompt.
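Here’s a sketch of that retrieval loop using OpenAI’s embeddings API and cosine similarity. The in-memory example list stands in for the database lookup described above.

```python
# A sketch of retrieval-augmented example selection: embed the request,
# score it against stored example embeddings, keep the top matches.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    """Vectorize text with OpenAI's embeddings API."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def top_k_examples(request: str, examples: list[dict], k: int = 3) -> list[dict]:
    """Return the k stored examples most similar to the request.

    Each example dict is assumed to hold its precomputed vector under "embedding".
    """
    q = embed(request)

    def cosine(v: np.ndarray) -> float:
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

    scored = sorted(examples, key=lambda ex: cosine(ex["embedding"]), reverse=True)
    return scored[:k]
```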
Finally, the most important aspect of the model is the response format.
Our examples show the model how to respond. As the screenshots above illustrate, our model responds in JSON, with a thought process and a SQL query. The query is extracted from the JSON and executed against the database.
Pic: The result of executing a SQL query (generated by the LLM) against the database
The output of the query is then transformed into a displayable format, such as markdown, and shown to the user.
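A hedged sketch of that final leg: parse the model’s JSON (the shape shown in the docstring is an assumption matching the screenshots), execute the extracted SQL against BigQuery, and render the rows as a markdown table.

```python
# A sketch of turning the LLM's JSON response into a user-facing answer.
import json
from google.cloud import bigquery

def render_llm_answer(raw_response: str) -> str:
    """Parse the model's JSON, run its SQL, and return a markdown table.

    Assumed response shape: {"thought": "...", "sql": "SELECT ..."}
    """
    payload = json.loads(raw_response)
    rows = bigquery.Client().query(payload["sql"]).result()
    df = rows.to_dataframe()
    return df.to_markdown(index=False)  # displayable table for the chat UI
```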
Putting this all together, the diagram below shows the entire process, from the user sending a message to us getting a response.
Pic: The process of getting real-time information from an LLM
With this process of gathering financial data, storing and syncing the data regularly, and complex prompt engineering, we’ve successfully built a single use-case for using AI for financial analysis!
However, this is just one use case. Imagine building a half-dozen more, each serving a different purpose.
That’s exactly what I’ve already done.
Step 4) Repeat for each of the unique functionalities that you want to support
In this article, we focused on building an LLM for querying stock data. But a comprehensive platform like NexusTrade supports:

* Developing custom indicators
* Creating algorithmic trading strategies
* Testing our strategies on historical data
* Building watchlists of our favorite stocks
* And more
When chatting with the LLM, the system needs to figure out which prompt is most relevant. To do this, we’ll have to create a “Prompt Router”.
Pic: Building a “Prompt Router” that forwards a request to the most relevant prompt
This router examines the request, compares it against our list of prompts, and decides which one is most relevant.
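One simple way to implement such a router is to have a cheap model classify the request into one of the supported prompt names. The prompt names and model choice below are illustrative assumptions, not NexusTrade’s actual routing logic.

```python
# A minimal sketch of a prompt router: classify the request, then dispatch.
from openai import OpenAI

client = OpenAI()
PROMPTS = ["financial_analysis", "strategy_creation", "backtesting", "watchlists"]

def route(request: str) -> str:
    """Return the name of the prompt most relevant to the user's request."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # a cheap model is enough for classification
        messages=[
            {
                "role": "system",
                "content": (
                    f"Classify the user's request into exactly one of: "
                    f"{', '.join(PROMPTS)}. Respond with the name only."
                ),
            },
            {"role": "user", "content": request},
        ],
    )
    choice = resp.choices[0].message.content.strip()
    return choice if choice in PROMPTS else "financial_analysis"  # safe fallback
```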
Furthermore, because of the complexity and nuance of many of these processes, such as creating a portfolio of trading strategies, some of them need to be split across multiple prompts.
With all of this being said, I hope to have shown you the amount of work required to build a purpose-built AI assistant for financial analysis and algorithmic trading.
And this doesn’t touch on the auxiliary work: building the trading platform that can create and backtest strategies, deploying those strategies to the market, connecting with APIs to determine whether the market is open, and making sure you can go from paper-trading to real-trading seamlessly.
And the end result is a comprehensive platform that empowers retail investors. Anybody with a computer can now log in to a free app and use AI for financial research and algorithmic trading.
And once you take the step, your portfolio will thank you. Mine already has.
Stop relying on outdated methods and ineffective tools. Take control of your portfolio today with NexusTrade — the only platform that combines real-time financial data with AI-driven insights. Click here to try it for free and see the difference for yourself.