AI chatbots are incredibly convincing — even when they’re wrong
“Why is Italian a more difficult language than English?” I asked Google Bard this week. What I received back was well-structured, convincing, and factually wrong.
Now, Google’s Bard uses its own language model, the Language Model for Dialogue Applications or LaMDA, and it was specifically trained on dialogue. Contrast this with ChatGPT or Bing, which use OpenAI’s GPT-4 model. I’m sure GPT will give different responses.
But I’m interested in how these language models have been built and trained. They are designed to produce highly likely, human-sounding responses based on vast quantities of training data. That’s why they sound remarkably convincing (if a bit flat and wordy at times).
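Here is a toy sketch in Python of what “likely” means in this context. It is not how LaMDA or GPT-4 work internally (those are large neural networks); it is a simple word-pair counter, and the three-sentence training text is made up for illustration. The point it shows is narrow: a model that picks the most frequent continuation will repeat whatever its training text says most often, true or not.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical training text: the false claim appears more often than the true one.
corpus = [
    "english is a romance language",
    "english is a romance language",
    "english is a germanic language",
]

model = train_bigrams(corpus)

# The model has no notion of truth, only of frequency.
print(most_likely_next(model, "a"))
```

With this made-up corpus, the word chosen after “a” is whichever one appeared there most often, which is exactly why fluent output is no guarantee of accuracy.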
But what is the reference data? We need citations or references to know. Unlike current search engines, which point us towards resources that we can then assess (“that’s a person’s blog… this is an advertorial for a language course… oh, here’s a definitive resource from a university…”), these chat interfaces don’t (or can’t?) provide reference links to sources for verification.
So how did Bard do?
- “English is also a Romance language.” False. English is a Germanic language. It does frequently borrow from Latin, though.
- “there are a few letters that are not used in English, such as the letter ‘c’ with a cedilla (ç).” Completely false. Italian does not use ç, unlike other Romance languages such as French, Portuguese, or Catalan.
- “Italian has a number of verb tenses that do not exist in English.” Mostly false. Italian has one tense that doesn’t exist in English: the passato remoto, the remote past tense used mostly in Tuscan Italian and literature.
- “Italian has a different alphabet than English.” Entirely false. Italian uses the Roman alphabet just like English does. Native Italian words don’t use the letters x, y, w, k, and j, but these letters still exist in Italian to spell adopted foreign words.
- Critical and missing: Italian, unlike English or Spanish, frequently uses consonanti doppie. These double consonants have distinct pronunciations that can be challenging for English speakers to hear and pronounce. The semantic differences are stark: cane/canne (dog, reeds); pene/penne (penis, feathers); caro/carro (beloved, wagon); ano/anno (anus, year). In fact, given the hilarity of getting them wrong (a challenge I have every December 31!), I’d recommend starting here.
Chat AIs are designed to be convincing and to respond in human-like prose. They sound definitive, but they are trained on massive amounts of data and give their answers without the sourcing or citations that could help validate them. Professionals need to approach their responses with a critical eye.
Google Bard’s full reply:
Italian is not necessarily a more difficult language than English. In fact, it is often considered to be one of the easier languages for English speakers to learn. This is because Italian is a Romance language, which means that it is descended from Latin. English is also a Romance language, so there are many similarities between the two languages. For example, many Italian words are similar to English words, and the grammar is also similar.
However, there are some aspects of Italian that can be challenging for English speakers. For example, Italian has a different alphabet than English, and some of the sounds are not familiar to English speakers. Additionally, Italian has a number of verb tenses that do not exist in English.
Overall, Italian is a beautiful and rewarding language to learn. It is not necessarily more difficult than English, but it does require some effort and dedication. With a little bit of practice, you will be speaking Italian like a native in no time!
Here are some of the challenges that English speakers may face when learning Italian:
- The alphabet: Italian uses the Latin alphabet, but there are a few letters that are not used in English, such as the letter “c” with a cedilla (ç). Additionally, some letters are pronounced differently in Italian than in English. For example, the letter “c” is pronounced like the “ch” in “church” before the vowels “e” and “i,” but like the “k” in “king” before other vowels.
- The sounds: Italian has a number of sounds that are not found in English, such as the rolled “r” and the throaty “c” sound. These sounds can be difficult for English speakers to pronounce correctly.
- The grammar: Italian grammar is similar to English grammar in some ways, but there are also some important differences. For example, Italian has a number of verb tenses that do not exist in English. Additionally, Italian nouns have gender, which means that they must be used with the correct article (masculine or feminine).
- The vocabulary: Italian has a large number of words that are similar to English words, but there are also many words that are completely different. This can make it difficult to learn new vocabulary.
Despite these challenges, Italian is a beautiful and rewarding language to learn. With a little bit of effort and dedication, you will be speaking Italian like a native in no time!
- “Why ChatGPT and Bing Chat are so good at making things up”, Ars Technica, 4/6/2023.
when we ask why ChatGPT confabulates, it’s difficult to pinpoint an exact technical answer. And because there is a “black box” element of the neural network weights, it’s very difficult (if not impossible) to predict their exact output given a complex prompt.
- “On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?”, Dziri, Milton, et al.
Our study reveals that the standard benchmarks consist of >60% hallucinated responses, leading to models that not only hallucinate but even amplify hallucinations. Our findings raise important questions on the quality of existing datasets and models trained using them.