In this episode, our host, Alec Crawford, interviews Karin Golde, an Amazon alumna, linguistics expert, and AI professional. Karin shares her background in linguistics and the contributions of linguists like Barbara Partee, then explains the concerns and risks associated with large language models (LLMs) and the challenges of training data. She discusses different approaches to improving the accuracy of LLMs and the ethical considerations of their use. Karin also highlights the issue of "ghost workers" in AI and the need for fair labor practices. She provides insights into the future of AI, the role of regulation, and the importance of using AI for good. Karin concludes with advice for those interested in AI and stresses the importance of building diverse data teams.
Key Takeaways
Large language models (LLMs) have the potential to amplify existing societal problems and biases, making it crucial to address issues of fairness and equity in their development and deployment.
Training data for LLMs often contains biases and undesirable content, which can lead to inaccurate or biased outputs. Careful data curation and fine-tuning techniques are necessary to mitigate these risks.
The ethical use of AI requires ongoing monitoring, risk management, and a deep understanding of the impact on individuals and society. It is important to involve diverse stakeholders and subject matter experts in the development and regulation of AI systems.
Building a data team with diverse skill sets (rather than titles), including expertise in statistics, machine learning, engineering, linguistics, sociology, and domain knowledge, is essential for successful AI projects.
To use AI for good, it is crucial to understand the real needs of the target audience and the potential impact of AI systems. Collaboration, education, and a focus on specific use cases can help ensure positive outcomes.
References in the show:
Karin Golde's startup:
https://www.westvalley.ai/
Ghost Work book:
https://ghostwork.info/
The letter from Senator Markey to big tech companies regarding ghost work:
https://www.markey.senate.gov/news/press-releases/sen-markey-rep-jayapal-lead-colleagues-in-demanding-answers-from-ai-companies-on-use-of-underpaid-overworked-data-workers
"Mystery AI Hype Theater 3000" from the DAIR Institute:
https://peertube.dair-institute.org/
"Human-in-the-Loop Machine Learning" by Robert Monarch:
https://www.manning.com/books/human-in-the-loop-machine-learning
AI for Good: Amazon Alumna and Linguist Karin Golde