Author: Gloria Yi Qiao
Since the development and advancement of large language models (“LLMs”), we have discussed and debated their application in domain verticals such as legal tech in depth. Along the way, we have heard many misconceptions about how LLMs work in a specific domain. Here are five common ones.
1. You can use LLMs for legal research.
Our friends at Levidow, Levidow & Oberman tried that, and it didn’t turn out well for them. We must always remind ourselves of the nature of LLMs. They are context-driven systems built on a technique called natural language generation (“NLG”). They respond to a prompt by reproducing patterns learned from their training data and generating a contextually plausible, relevant-sounding answer. By definition, then, LLMs are not accurate or authoritative research resources per se. They often hallucinate, producing answers that seem correct in context but are actually made up, because all they are doing is predicting patterns from information they have already seen. If you want to conduct research with accurate and up-to-date data, ChatGPT or similar interfaces by themselves won’t suffice. They can give you pointers, but you must verify every detail they generate.
2. Anyone can use LLMs for any field.
Not true. LLMs are a textbook case of the “garbage in, garbage out” cliché. Half the battle is providing 1) the right prompt and 2) accurate training data. A specific prompt matters because it determines the focus and the level of detail of the response. The training data determines the knowledge and capabilities of the LLM. Without either, good luck generating anything with relevant and accurate domain expertise. For example, you can ask ChatGPT to write you a master supply agreement for automotive suppliers. Would you get a passable-looking supply agreement? Maybe. Would it contain all the specific provisions that only seasoned supply chain professionals and lawyers know about in the automotive supply chain? Highly unlikely. You must train the model on domain-specific data, and craft the prompt accordingly, to achieve that; the sketch below shows the difference a specific prompt makes.
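To make the prompt point concrete, here is a minimal sketch using the OpenAI Python SDK. The model name and the list of clauses are illustrative assumptions on our part, not a tested production setup:

```python
# Minimal sketch with the OpenAI Python SDK (v1+); model name and
# clause list are illustrative assumptions, not a vetted recipe.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A vague prompt leaves focus and detail entirely up to the model.
vague_prompt = "Write me a master supply agreement."

# A domain-specific prompt pins down the parties, governing law, and
# the clauses a supply-chain lawyer would actually look for.
specific_prompt = (
    "Draft a master supply agreement between an automotive OEM and a "
    "Tier 1 supplier, governed by Michigan law. Include clauses on "
    "just-in-time delivery, tooling ownership, PPAP quality standards, "
    "capacity commitments, and IP indemnification."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; substitute whatever you use
    messages=[{"role": "user", "content": specific_prompt}],
)
print(response.choices[0].message.content)
```

Even with a prompt this specific, the output still needs domain-trained data behind it and expert review in front of it.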
3. Large companies have lots of data, so they must be better at using LLMs and building the next generation of AI.
Maybe, but not always. It is true that LLMs themselves must be trained on vast amounts of data. However, the amount of data needed for a specific application of an LLM has dropped dramatically. With a much smaller dataset and some techniques such as simulation (more on that another day), even startups can build highly accurate models without amassing huge datasets or enormous compute for training (see the sketch below). On the flip side, a company with a huge amount of data but without the knowledge or will to mine it is like an owner sitting on a gold mine without the right tools. Would that owner generate gold without lifting a finger? Probably not. In that sense, small startups are often more nimble and efficient at building domain expertise than large companies holding piles of data they don’t know how to classify, cleanse, or train with. A company’s size matters far less than how efficiently it leverages LLMs with the data and tools it already possesses.
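As a rough illustration of how little plumbing a small, curated dataset needs, here is a hedged sketch of launching a fine-tuning job with the OpenAI Python SDK. The file name, dataset contents, and base model are hypothetical; this is one common approach, not a description of our own pipeline:

```python
# Hypothetical sketch: fine-tune a base model on a small, curated set
# of domain examples (a JSONL file of {"messages": [...]} chat records).
from openai import OpenAI

client = OpenAI()

# A few hundred well-labeled examples, not millions of documents.
training_file = client.files.create(
    file=open("automotive_supply_examples.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",  # assumed base model; check current offerings
)
print(job.id, job.status)
```

The hard part is not the API call; it is curating those few hundred examples with real domain expertise, which is exactly where a focused team can outrun a data-rich but unfocused one.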
4. LLMs will cause lawyers to lose their jobs.
Again, not necessarily; it depends on what lawyers are doing and want to do in the future. If a person’s sole job is to compile agreements from samples, then that person needs to think hard about what she or he wants to do next. But if the person has best-in-class judgment and common sense, and knows how to find the sample agreements and logically define the key parameters, then with LLMs she or he can work 10x faster and take on more sophisticated and creative tasks, instead of repeating what she or he has done every day for the past 10 years. LLMs may assist with, or even replace, many routine tasks like compiling agreements from samples, but they are not built to tackle non-routine work that requires judgment, common sense, and creativity. While LLMs can automate tasks, they cannot replace human ingenuity or solve complex legal problems with innovative solutions. We urge our team members, customers, and partners to see LLMs as powerful tools for improving efficiency, reducing tedious manual work, and freeing them up for more fun, creative tasks, rather than as a threat to their job security. Ride the wave; don’t get crushed.
5. You must lose privacy if you want to use GPT.
Definitely not true. Granted, if you are like the Samsung employees who copied and pasted company-confidential information into ChatGPT and asked it for solutions, then of course there is no privacy. However, there are ways to leverage GPT without risking privacy and data security: “cleanse” your data (remove any sensitive or confidential information) before it ever reaches GPT, craft specific and accurate prompts, and get a response back without exposing any of your private data. We call it your “private GPT”. Sound intriguing? Reach out to us to learn more.
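We won’t unpack our full pipeline here, but as a minimal sketch of the “cleanse before you send” idea, consider redacting obvious identifiers before the text ever leaves your environment. The patterns and placeholders below are purely illustrative; a production system would add named-entity recognition to catch people, companies, and deal terms that regexes miss:

```python
import re

# Illustrative patterns only; real redaction also needs NER for names,
# companies, and deal-specific terms, not just regexes.
REDACTIONS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[CARD]"),
]

def cleanse(text: str) -> str:
    """Replace sensitive tokens with placeholders before any API call."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

prompt = cleanse(
    "Email jane.doe@acme.com about the supply deal; her SSN is 123-45-6789."
)
# Only the cleansed prompt is sent to the external model.
```

The principle generalizes: keep the raw confidential text inside your perimeter, and let only the sanitized prompt cross the wire.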
In conclusion, LLMs have real limitations: they demand careful prompt engineering and a sizable amount of training data, and they occasionally lack accuracy and common sense. However, proper data training, prompt engineering, and subject matter expertise can overcome many of these flaws and turn LLMs into excellent tools for domain automation such as legal. Here at Trusli, we strive to take legal automation to the next level by combining our expertise in machine learning, our subject matter expertise in legal and procurement, and our ability to iterate fast and train datasets specifically for your company. Let’s leverage LLMs to make humans work smarter and with more joy.