Challenges and Opportunities for Generative AI for Private Market Analysts
Dr James Ravenscroft | 13th June 2023
The recent wave of generative AI solutions built on transformer-based large language models (LLMs) has renewed interest in AI and machine learning systems and in what they can do for businesses.
At Filament Syfter we’ve been tracking the state-of-the-art in natural language processing systems since 2016. We were early adopters of transformer-based LLMs and have been experimenting with them and understanding how they can be used to help private market analysts with sourcing and monitoring use cases since they were popularised by BERT in 2018.
Over the last five years, we have accumulated a large amount of experience and expertise around this family of models and their strengths and weaknesses. In this article, we discuss opportunities where LLMs can accelerate private market analysis and outline risks and some use cases where these technologies are not suitable or need to be paired with other systems to yield valuable results.
Anyone who has played with ChatGPT is likely surprised by its breadth and seeming depth of knowledge. However, ChatGPT’s propensity to hallucinate, i.e. make up answers, is also infamous (if you haven’t already, try asking it to write a short bio about you). With the overnight success of these tools, many end users are aware and relatively accepting of hallucination in some instances. For example, you might use ChatGPT to write the first draft of an email and manually proofread the result because it still saves you 15 minutes versus writing the message from scratch.
At Filament, we understand that analysts need reliable and up-to-date information about private markets so that they can give partners what they need to make sound investment decisions. If a tool sometimes makes up answers, analysts who use it to find critical company information are likely to spend just as long proofreading and verifying the facts as if they had looked them up using traditional search tools. In the worst case, incorrect or fabricated information could be used to make investment decisions.
LLM hallucinations of this nature recently landed a lawyer in hot water when he presented made-up citations in a lawsuit.
There have been incremental improvements between versions of LLMs (like the jump between GPT-3.5 and GPT-4), and it is also possible to structure your question in a way that reduces the likelihood of a hallucination (for example, by asking the model to explain its reasoning). However, these incremental solutions do not eliminate the dangers of hallucination altogether. Counter-intuitively, they may make it more likely that false information is passed along to stakeholders, because the hallucinations that remain are more convincing and therefore harder to catch when proofreading.
Another promising avenue of investigation is the pairing of fact-checking models with LLMs. Over the last few years, several AI-powered fact-checking models and benchmarks have been developed. Recent work has shown that fact-checking capabilities can be added to LLMs, allowing them to proofread and correct generated outputs on the fly. These technical developments show promise but are still very early, and at Filament, we are starting to experiment with some of these techniques.
The risk of hallucination can also be reduced, or even eliminated, by integrating the LLM with an external knowledge source and using it in a semantic search pattern. We’ll focus more on this in our next edition.
Hallucination remains a fundamental challenge when working with generative transformer-based LLMs, and at Filament Syfter, we encourage caution when working with LLM-generated texts, especially in commercially sensitive contexts where it is essential that facts and figures are correct.
On the other hand, there are some use cases where hallucination is less critical (like drafting a document to be manually edited) and others where the negative impact of hallucination can be neutralised by applying rules in post-processing (see “Acceleration of Classification Use Cases” below).
Opportunity: Acceleration of Classification Use Cases
Classification is a type of machine learning problem where we attempt to assign labels to an input automatically. The classic example that most people are familiar with is spam filtering, where the labels are ‘spam’ and ‘not spam’. Over the last seven years, we have worked with a number of our private equity and corporate finance clients to build classification models that enrich their data universe, answering questions such as “Which sector does a company belong to?” or “Do those new Bloomberg articles about a portfolio company have a negative sentiment?” Historically, this process has required large volumes of annotated data from our clients. However, LLMs present two exciting opportunities for accelerating this process and reducing the amount of manually labelled training data required.
Few Shot Classification
Few-shot classification uses models to carry out the classification task with only a small number of manually labelled examples. Taking the company sector classification example, an LLM can be given a small number of companies’ elevator pitch summaries and associated sector labels and asked to assign labels to a set of unlabelled summaries. Used like this, we can ask the model to simply respond with the name of the sector for each company and use rules to ensure that the outputs adhere to our requirements, neatly side-stepping the hallucination predicament.
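As a rough illustration of the pattern described above, the following minimal Python sketch builds a few-shot prompt and then applies a rule that only accepts answers from a fixed list of sectors. The sector names, the prompt wording and the `llm` callable (any function that takes a prompt string and returns the model's text) are assumptions made for this example and do not describe a particular product or API.

```python
from typing import Callable, Optional

# Hypothetical label set, chosen for this example only.
ALLOWED_SECTORS = {"Healthcare", "Fintech", "Logistics"}


def build_few_shot_prompt(examples: list[tuple[str, str]], summary: str) -> str:
    """Assemble a prompt from labelled (summary, sector) pairs plus one unlabelled summary."""
    lines = [
        "Assign each company to exactly one sector from: "
        + ", ".join(sorted(ALLOWED_SECTORS)) + ".",
        "Respond with the sector name only.",
        "",
    ]
    for text, sector in examples:
        lines += [f"Company: {text}", f"Sector: {sector}", ""]
    lines += [f"Company: {summary}", "Sector:"]
    return "\n".join(lines)


def validate_label(raw_output: str) -> Optional[str]:
    """Return the matching allowed label, or None if the output is not a known sector."""
    cleaned = raw_output.strip().strip(".").casefold()
    for label in ALLOWED_SECTORS:
        if label.casefold() == cleaned:
            return label
    return None  # anything outside the label set is rejected, not trusted


def classify(summary: str, examples: list[tuple[str, str]],
             llm: Callable[[str], str]) -> Optional[str]:
    """Run one few-shot classification; `llm` is an assumed prompt -> text function."""
    return validate_label(llm(build_few_shot_prompt(examples, summary)))
```

Because anything outside the allowed set comes back as `None`, a made-up sector name can be routed to a retry or to manual review rather than being written into the dataset.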
We usually find that custom, purpose-built AI models trained on our clients’ proprietary data perform significantly better than LLMs in “few-shot” mode. However, in cases where existing data is very limited or non-existent, few-shot learning can provide a jumping-off point and a more accurate purpose-built model can be built in a later phase of the project. Few-shot performance may also be made more reliable and consistent by generating multiple variations on the input prompt, running the model several times and taking the consensus answer.
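The consensus idea mentioned above can be sketched as a simple majority vote. This is an illustrative sketch only: the `llm` callable, the `min_agreement` threshold and the way prompt variations are produced are all assumptions, and the right threshold would need to be tuned on real data.

```python
from collections import Counter
from typing import Callable, Optional


def consensus_label(prompts: list[str], llm: Callable[[str], str],
                    allowed: set[str], min_agreement: float = 0.5) -> Optional[str]:
    """Ask the same question via several prompt variations and keep the majority answer.

    Returns None when no allowed label wins more than `min_agreement` of all runs.
    """
    votes: Counter = Counter()
    for prompt in prompts:
        answer = llm(prompt).strip()
        if answer in allowed:  # ignore outputs outside the label set
            votes[answer] += 1
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count / len(prompts) > min_agreement else None
```

Returning `None` on a split vote is a deliberate choice here: an ambiguous company can be handed to an analyst rather than being given a label the model is not consistent about.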
Synthetic Data Generation
In this use case, we flip hallucination on its head and make it an asset rather than a challenge. LLMs can be prompted to generate fake information that can be used for training downstream models. Continuing with the company sector classification example, an LLM can be given some example company descriptions from a given operating sector and asked to generate more. These synthetic documents can then be used as part of a traditional training pipeline to create a purpose-built model, reducing the need for manual annotation by busy analysts.
We note that with this approach, it is important that the generated data is still reviewed and explored to make sure that it is free from problematic biases and representative of the real data that the model will need to run on in production. Filament has experience working with and reviewing synthetic data; this process is usually much faster and less intensive than creating new datasets from scratch.
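To make the generate-then-review workflow more concrete, here is a minimal sketch of a generation prompt and some basic automated screening. The prompt wording, the function names and the `min_words` threshold are assumptions for illustration. Checks like these only remove obvious problems (very short texts, copies of the seed examples, duplicates) and are a complement to manual review for bias and representativeness, not a replacement for it.

```python
from collections import Counter


def build_generation_prompt(sector: str, seed_descriptions: list[str], n: int) -> str:
    """Ask for n new fictional company descriptions in the style of the seed examples."""
    lines = [f"Here are example descriptions of companies in the {sector} sector:"]
    lines += [f"- {d}" for d in seed_descriptions]
    lines.append(f"Write {n} new, fictional company descriptions in the same style, one per line.")
    return "\n".join(lines)


def screen_synthetic(rows: list[tuple[str, str]], seeds: list[str],
                     min_words: int = 5) -> list[tuple[str, str]]:
    """Drop very short texts, copies of seed examples and duplicate generations."""
    seen = {s.strip().casefold() for s in seeds}
    kept = []
    for text, label in rows:
        key = text.strip().casefold()
        if len(key.split()) < min_words or key in seen:
            continue
        seen.add(key)
        kept.append((text.strip(), label))
    return kept


def label_counts(rows: list[tuple[str, str]]) -> Counter:
    """Count examples per label so that imbalances are visible before training."""
    return Counter(label for _, label in rows)
```

The rows that survive screening can then be sampled for manual review before they are added to a training pipeline.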
Opportunity: Structured Information Extraction
LLMs can also be used to identify and extract key information from existing documents. The task of extracting names of people, places, companies and other items of interest from documents is known as Named Entity Recognition (NER). Similarly, identifying relationships between these named entities is known as Relationship Extraction. LLMs can be used to find and extract entities and their relationships and to format that information in a structured way that can be ingested into relational databases or knowledge graphs.
Hallucination still presents an obstacle in this use case, but the effect can be reduced using additional rules and logic as a post-processing step applied to the LLM output. For example, if multiple documents cite the same person as the CEO of a company, we can be confident that the model has made the proper connection.
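The cross-document check in the CEO example can be sketched as a small post-processing rule. This sketch assumes the LLM output has already been parsed into `(doc_id, subject, relation, object)` tuples; that tuple layout, the relation name `ceo_of` and the `min_documents` threshold are hypothetical choices made for this example.

```python
from collections import defaultdict


def corroborated_relations(extractions: list[tuple[str, str, str, str]],
                           min_documents: int = 2) -> dict[tuple[str, str, str], int]:
    """Keep (subject, relation, object) triples found in at least `min_documents` documents.

    Each extraction is (doc_id, subject, relation, object). Repeats within the
    same document count only once.
    """
    support: dict[tuple[str, str, str], set[str]] = defaultdict(set)
    for doc_id, subject, relation, obj in extractions:
        support[(subject, relation, obj)].add(doc_id)
    return {
        triple: len(docs)
        for triple, docs in support.items()
        if len(docs) >= min_documents
    }


extracted = [
    ("doc1", "Jane Doe", "ceo_of", "ACME"),
    ("doc2", "Jane Doe", "ceo_of", "ACME"),
    ("doc2", "Jane Doe", "ceo_of", "ACME"),  # repeated within the same document
    ("doc3", "John Roe", "ceo_of", "ACME"),  # only one supporting document
]
confirmed = corroborated_relations(extracted)
```

In this example only the Jane Doe triple is kept, with a support count of 2. Triples that fall below the threshold do not have to be discarded; they could be queued for human validation instead.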
LLMs are an exciting and powerful new family of technologies, and many businesses and investors are beginning to explore the opportunities they unlock. Hallucination, where an LLM makes up answers to questions, can be a significant challenge when working with this technology. Techniques for controlling LLMs are still in early development, and Filament Syfter recommends caution when working with generative outputs for some use cases.
However, we have also shown that there are a number of exciting opportunities where hallucination is less of a concern and where LLMs may unlock new value for private market investors and analysts.
Hallucination High-Risk Use Cases:
- Question answering and fact finding, e.g. “What was ACME’s EBITDA last year?”: LLM-generated answers to factoid questions like these have a high likelihood of being hallucinated (made up), even if the real answer is available to the model.
- Explanations, e.g. “Why is ACME’s revenue so low for FY 2016/2017?”: LLMs can hallucinate explanations too, inventing plausible yet incorrect reasons that would have to be cross-checked against other data sources.
Hallucination Medium-Risk Use Cases:
- Summarisation, e.g. “Give me a one-sentence summary of this news article”: hallucination can be mitigated by checking that the summary is faithful to the source document and conveys the main points.
- Drafting, e.g. “Write the first draft of a report on ACME Corp”: hallucination can be mitigated by proofreading and editing the draft document before sending it on to the intended audience.
Hallucination Low-Risk Use Cases:
- Few-shot classification and synthetic data generation: in the former, hallucination is mitigated by constraining the desired output and applying validation rules. In the latter, hallucination is an intended effect that allows the model to generate synthetic training data that can be used to train a traditional purpose-built classifier model.
- Information extraction: this use case could be considered an “offline” alternative to question answering, where answers are filtered through layers of software and human validation before being presented to the user. These additional quality assurance measures can help to ensure the veracity of answers.
In our next edition, we will discuss the challenges and opportunities around knowledge retrieval and question-answering for LLMs.