Boundless Potential of LLMs: Steering Through the Labyrinth of Online Experimentation

Generative AI companies are taking a daring approach to quality assurance by releasing large language models (LLMs) directly onto the internet. Instead of traditional testing phases, they rely on the online community to uncover bugs and glitches. This method turns every user into an unintentional beta tester, helping to identify the quirks and flaws of these LLMs as they’re used. It’s an unpredictable journey where errors are discovered in real-time, without the safety net of controlled testing environments.

Ethics and accuracy often take a backseat in this fast-paced release of generative AI. It’s like handing out fireworks: they can be spectacular, but also unpredictable and potentially dangerous without proper constraints. Recently, Mistral released its 7B model under an Apache 2.0 license, raising concerns about potential misuse because the license carries no explicit usage restrictions. Minor tweaks to parameters can lead to significantly different outcomes, and biases inherent in the training data carry through to the model, making it crucial to recognize and mitigate them. For example, Common Crawl, which gathers data with a web crawler based on Apache Nutch, makes up a significant portion of the training data for LLMs like GPT-3 and LLaMA. Without comprehensive quality-control measures on that data, the responsibility for selecting high-quality training material falls on developers, underscoring the need for ethical AI deployment.
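To make “selecting high-quality data” concrete, here is a minimal, hypothetical sketch of the kind of heuristic filters a developer might run over web-scraped documents before training. The function names and thresholds are illustrative assumptions, not any provider’s actual pipeline; real curation involves far more (deduplication, language identification, toxicity screening, and so on).

```python
# Hypothetical sketch: heuristic quality filters for web-scraped training text.
# The thresholds and rules below are illustrative only, not any provider's real pipeline.

def looks_high_quality(text: str) -> bool:
    words = text.split()
    if len(words) < 50:                       # too short to be useful prose
        return False
    if len(set(words)) / len(words) < 0.3:    # highly repetitive, likely spam
        return False
    alpha_ratio = sum(c.isalpha() for c in text) / max(len(text), 1)
    if alpha_ratio < 0.6:                     # mostly markup, tables, or noise
        return False
    return True

def filter_corpus(documents: list[str]) -> list[str]:
    """Keep only documents that pass the simple heuristics above."""
    return [doc for doc in documents if looks_high_quality(doc)]

if __name__ == "__main__":
    samples = ["too short to keep", "lorem ipsum " * 100]
    print([looks_high_quality(s) for s in samples])  # [False, False]: both fail the heuristics
```

Even toy filters like these embed editorial judgments about what “quality” means, which is exactly where bias can creep in.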

Developing ethical software should be mandatory, not optional. Yet safeguards are minimal if developers disregard ethical guidelines, placing the burden not just on developers but also on policymakers and organizations to ensure fair and unbiased AI applications. Responsibility for errors and misuse of LLMs is also a significant legal and ethical question: if an LLM makes a critical error, it’s unclear whether liability lies with the LLM provider, the service utilizing the LLM, or the user.

There’s a suggestion to create a “no-llm-index” option that would let content creators keep their content from being processed by LLMs, similar to the “noindex” rule for search engines. LLMs also cannot readily comply with data deletion requests under laws like the CCPA or GDPR because they don’t store data in a traditional database; instead, they generate responses based on patterns learned from their training data, which makes it difficult to pinpoint and remove specific information.
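No standardized “no-llm-index” directive exists today; the closest mechanism creators have is blocking known AI-related crawlers in robots.txt (Common Crawl’s CCBot and OpenAI’s GPTBot both advertise user agents for this). The sketch below, using Python’s standard urllib.robotparser, only checks crawl-time permission for a given user agent; it does not cover content already in training sets or deletion requests, and the domain used is a placeholder.

```python
# Sketch of the closest existing opt-out mechanism: robots.txt rules aimed at
# known AI-related crawlers (e.g., CCBot for Common Crawl, GPTBot for OpenAI).
# A "no-llm-index" rule as proposed in the article does not exist yet.
from urllib.robotparser import RobotFileParser

def crawler_allowed(site: str, user_agent: str, path: str = "/") -> bool:
    """Return True if the site's robots.txt permits this user agent on the path."""
    parser = RobotFileParser()
    parser.set_url(f"{site.rstrip('/')}/robots.txt")
    parser.read()  # fetch and parse the live robots.txt
    return parser.can_fetch(user_agent, f"{site.rstrip('/')}{path}")

if __name__ == "__main__":
    # example.com is just a placeholder domain.
    for bot in ("CCBot", "GPTBot"):
        print(bot, crawler_allowed("https://example.com", bot))
```

Because this only governs future crawling, it does nothing for data already baked into a trained model, which is the gap the proposed “no-llm-index” option is meant to highlight.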

Legal challenges loom as the industry navigates these issues. In 2015, a U.S. appeals court ruled that Google’s book scanning for Google Books constituted “fair use” because it was highly transformative and didn’t replace the original market. However, the situation with LLMs is different and more complex, as lawsuits have emerged questioning the compensation for content creators whose work feeds these algorithms. Companies like OpenAI, Microsoft, GitHub, and Meta are already facing legal complications, particularly regarding the reproduction of copyrighted open-source software.

As expectations for LLMs are still evolving, it’s difficult to determine when these AI systems fail or hallucinate, unlike the tangible event of an app crash. Policymakers and technologists must develop frameworks to balance innovation with ethical and legal standards.

Different countries are beginning to address these issues. For instance, China’s National Information Security Standardization Technical Committee has proposed detailed rules on generative AI, and President Biden’s Executive Order on Safe, Secure, and Trustworthy AI signals that other governments are likely to follow suit.

Once the AI genie is out of the bottle, reversing course is challenging. LLMs require vast amounts of training data, most of it scraped from the freely available internet, and recreating such datasets from scratch would be difficult. Training these models solely on high-quality data is possible, but it raises questions about what constitutes high-quality data and who gets to define it. The real question is whether LLM providers will keep forming committees and pushing responsibility onto users, or whether they will take substantial action to address these challenges.

Amit Verma is the head of engineering/AI labs and a founding member at Neuron7.
