Grok This
Responsible AI is a Critical Success Factor
The announcement that the platform formerly known as Twitter launched its own AI model, Grok, wasn’t surprising. It did, however, raise alarms, especially after it was introduced with promises that it would “answer questions with a bit of wit” and that it “has a rebellious streak”, not to mention that “A unique and fundamental advantage of Grok is that it has real-time knowledge of the world via the 𝕏 (twitter) platform.” That should make for some interesting answers. And dangerous ones.
Because, after all, “Artificial Intelligence” is the wrong moniker; it should be “Statistical Decision Making.” That is all a Generative AI model does: it creates images and strings words together based on probabilities. It is a very powerful statistical inference engine, a really good autocomplete. So if a model is trained on bad, toxic, and biased data, it will inevitably spew out garbage and filth.
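To make the "really good autocomplete" point concrete, here is a toy sketch of my own, not how Grok or any production model is actually built. It counts which word follows which in a tiny, deliberately skewed training set (the corpus and the function names are hypothetical) and then "predicts" the next word purely from those counts.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def next_word_probabilities(follows, word):
    """Turn raw follow-counts into probabilities for the next word."""
    counts = follows.get(word.lower(), Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def autocomplete(follows, word):
    """Return the most likely next word, or None if the word was never seen."""
    probs = next_word_probabilities(follows, word)
    return max(probs, key=probs.get) if probs else None


# Hypothetical, deliberately skewed training data (an assumption for illustration).
skewed_corpus = [
    "borrowers are risky",
    "borrowers are risky",
    "borrowers are risky",
    "borrowers are reliable",
]

model = train_bigrams(skewed_corpus)
print(next_word_probabilities(model, "are"))
print(autocomplete(model, "are"))
```

The toy model has no opinion of its own: three of the four training sentences say "risky", so that is what it completes with. Real models are vastly larger, but the dependence on the training data is the same in kind.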
This spurred me to go back to a blog post I wrote recently that summarized the key takeaways from a forum hosted by the Mortgage Bankers Association (MBA) and MISMO titled: Artificial Intelligence—Promise and Peril for Mortgage Lending.
One of these takeaways was “Responsible AI as a Critical Success Factor”. The Grok announcement has compelled me to elaborate on this concept, as these principles must be considered anytime an AI-powered solution is developed, deployed, monitored, and managed, irrespective of industry and use case.
Before we delve into Responsible AI, let’s make sure that we understand that it should be just one of the components of the foundation of anyone’s AI compliance strategy. This foundation is based on three elements:
- Responsible (ethical) AI principles
- Risk management (and identification)
- Risk monitoring, reporting, and communication
Part I (this Blog Post) will focus on Responsible AI, while Part II will focus on Risk Management and Monitoring of AI-powered solutions.
Part I: Responsible AI
By analyzing vast amounts of historical transaction data with deep learning techniques, a Gen-AI-powered solution learns intricate patterns and relationships that might escape human detection. It then generates synthetic data, helping professionals understand the types of activities that might otherwise go unnoticed. In Financial Services, there are multiple use cases:
- Customer Due Diligence: Know Your Customer—KYC;
- Deep Retrieval: Analyze large volumes of contracts, POs, and invoices to provide insights and extract key information for financial reporting;
- Q&A (Dialogue): Generate an answer to a question or a prompt by training the model on a specific set of relevant content; and
- Summarization: Summarize key insights from accounting pronouncements, financial reports, and financial news that impact policies, processes, and reporting.
And we’re just scratching the surface.
Deploying any form of AI, whether Machine Learning models or Gen-AI solutions, creates challenges when it comes to measuring and mitigating concerns about fairness, bias, toxicity, and IP. The AI solution must respect the law, and must respect equity, privacy, and fairness. Foundational principles for deploying and using Gen-AI in a responsible manner are crucial in enabling AI’s trusted use.
So where to start? Having spent most of my career deploying technology in the Financial Services vertical, and in particular the last two years in the US mortgage banking market, I have found that we can take some of the lessons that the regulators (yes, the regulators!) have published to ensure that AI is used responsibly. In early 2022, the Federal Housing Finance Agency (FHFA) published an Advisory Bulletin related to AI and Machine Learning: “ARTIFICIAL INTELLIGENCE/MACHINE LEARNING RISK MANAGEMENT.”
The FHFA framework is a great baseline to start from when designing and deploying any type of AI-powered system, especially one predicated on Gen-AI. The FHFA’s Core Ethical Principles to enable Responsible AI systems include:
- Transparency: Provide visibility into where, how, and why AI/ML is used, e.g., in mortgage lending for consumer-facing risk models that pose Fair-Lending risk.
- Accountability: Ensure there is human responsibility assigned for all AI/ML outcomes: designate a go-to person for all AI output and outcomes.
- Fairness, Equity, Diversity & Inclusion: Eliminate explicit biases in AI/ML systems across social, economic, political, or cultural dimensions to ensure fair and equitable outcomes across different groups. For example, mitigate the risk of generating content that is toxic, offensive, disturbing, or inappropriate; or AI that makes biased decisions based on the data it was trained on—this is where Grok will fail miserably, IMHO.
- Privacy & Security: Respect and protect privacy rights and data, for example, Intellectual Property: reproductions of copyrighted content, or unauthorized use of copyrighted content to generate answers; and Plagiarism: was the content generated by a person or a machine—it’s hard to detect without… another machine.
- Reliability & Safety: Monitor to ensure the AI/ML system functions as intended, and that misrepresentations are eliminated, like false information that sounds plausible (a Gen-AI “hallucination”, also known as the veracity problem).
For a Gen-AI model to function within a Responsible AI framework, a set of processes should be put in place to ensure that the results it generates are correct and fair. In essence, implement a control structure predicated on the above Core Ethical Principles:
- Incorporate humans-in-the-loop: validate the consistency of responses, and continually train the model—this is a key point for any AI-related implementation, as outputs from these models cannot fully be trusted.
- Enable source transparency: ensure the answer can be traced back to the source data; in other words, know where the data that trained the model comes from, and what that data is.
- Summarize the output: eliminate bias and protect end users of the data from potential veracity problems by ensuring that the model does not advise, or make decisions—let the human-in-the-loop make the final call.
- Execute timely, comprehensive audits: continually train the model by validating the responses with expert human resources.
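For readers who think in code, the control structure above could be sketched roughly as follows. This is a minimal illustration under my own assumptions, not a reference implementation and not anything prescribed by the FHFA: the class names, the `human_review` function, the reviewer ID, and the sample answers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelAnswer:
    question: str
    text: str
    sources: list = field(default_factory=list)  # documents the answer traces back to


@dataclass
class ReviewDecision:
    answer: ModelAnswer
    approved: bool
    reviewer: str  # the accountable human
    note: str = ""


audit_log = []  # every decision is kept so it can be audited later


def has_traceable_sources(answer):
    """Source transparency: an answer must cite at least one source."""
    return len(answer.sources) > 0


def human_review(answer, reviewer, approve):
    """Human-in-the-loop: the model only summarizes; a named person makes the call."""
    if not has_traceable_sources(answer):
        decision = ReviewDecision(answer, False, reviewer, "rejected: no traceable source")
    else:
        decision = ReviewDecision(answer, approve, reviewer)
    audit_log.append(decision)
    return decision


def audit_summary(log):
    """Audit: share of answers a human approved, plus the ones that were rejected."""
    approved = sum(1 for d in log if d.approved)
    rejected = [d for d in log if not d.approved]
    rate = approved / len(log) if log else 0.0
    return rate, rejected


# Hypothetical usage: one sourced answer, one unsourced answer.
sourced = ModelAnswer("What changed in the policy?", "Section 3 was revised.", ["policy_v2.pdf"])
unsourced = ModelAnswer("Is this loan compliant?", "Yes.")
human_review(sourced, "j.doe", approve=True)
human_review(unsourced, "j.doe", approve=True)

rate, rejected = audit_summary(audit_log)
print(f"approval rate: {rate:.0%}, rejected: {len(rejected)}")
```

Note that the unsourced answer is rejected even though the reviewer tried to approve it: in this sketch, traceability is a hard gate, and the human decision only applies to answers that pass it.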
Oh, and one more thing about Grok: it promises to “…also answer spicy questions that are rejected by most other AI systems.” There’s a reason smarter and better trained AI systems reject these “spicy” questions.
Note: this blog post was written by a real human and does not contain content generated by ChatGPT or any other Generative-AI platform.