Does Artificial Intelligence Have a Culture Problem?

In our rapidly changing world, culture shapes how we communicate, understand norms, express opinions, and interpret actions. As artificial intelligence (AI) and large language models (LLMs) grow more sophisticated, their ability to navigate these cultural nuances becomes critical. Yet, do these systems truly grasp the complexities of culture, or are they destined to falter over the subtle distinctions that define human societies?

Previous explorations of ChatGPT, a prominent LLM, revealed significant limitations in its understanding of culture. The recently released Google Gemini Advanced represents a new generation of LLMs, and while our recent experiments indicate that it performs slightly better than ChatGPT, it too remains far from genuine cultural fluency.

Past Explorations: Revealing ChatGPT's Limitations

Our previous analysis highlighted significant gaps in ChatGPT's cultural understanding. In particular, ChatGPT struggled with:

Cultural understanding: ChatGPT often seemed oblivious to the cultural nuances it was asked to portray. Instead of demonstrating understanding of the core values and beliefs of a culture, it produced responses that were far removed from the intended persona.

Stereotypes: ChatGPT frequently resorted to simplistic and often harmful stereotypes when attempting to represent different cultures. This indicates a lack of sensitivity and a failure to capture the complexity of cultural identities.

Nuance: ChatGPT rarely selected nuanced responses. Instead, it opted for extreme positions, lacking the ability to recognise the middle ground that often characterises cultural expression.

These findings underscored the necessity for AI to evolve significantly to be truly beneficial in a multicultural setting.

Gemini: a step forward, but a small one

Building upon the previous analysis of ChatGPT's cultural understanding, we designed a similar experiment for Google Gemini Advanced. Gemini, too, was tasked with adopting various national personas, and its responses were evaluated using the Culture Compass tool. This tool assigns scores along the six dimensions of National Culture (the 6-D model), allowing us to compare the AI's responses to established cultural data. We used Mean Absolute Error (MAE) to quantify the disparity: the average absolute difference between the AI's scores and the actual cultural scores across the six dimensions.
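To make the metric concrete, here is a minimal sketch of the MAE calculation. The dimension scores below are hypothetical illustrations, not data from our experiment, and the function name is our own.

```python
def mean_absolute_error(ai_scores, target_scores):
    """Average absolute difference between AI-derived and published
    cultural scores across the six dimensions of the 6-D model."""
    assert len(ai_scores) == len(target_scores)
    return sum(abs(a - t) for a, t in zip(ai_scores, target_scores)) / len(ai_scores)

# Hypothetical example: scores on the six dimensions for one persona
ai_persona = [60, 70, 55, 40, 50, 65]   # derived from the AI's answers
published  = [40, 91, 62, 46, 26, 68]   # established cultural data

print(round(mean_absolute_error(ai_persona, published), 1))  # prints 13.5
```

A lower MAE means the persona's answers sit closer to the established scores; in our results, averages above 20 indicate a persona that misses the target culture by a wide margin on most dimensions.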

Despite outperforming ChatGPT, Gemini's understanding still falls short. Gemini achieved a lower average MAE than ChatGPT, but its performance remains far from ideal, let alone usable. Even with the strange outliers, the US and Canada (more on these below), removed, the average distance from the target cultural scores remains above 20. This signifies a substantial gap in Gemini's ability to understand and represent nuanced cultural differences. A difference of that size in most dimensions is comparable to the gap between the United States and Germany, or between the United States and Argentina.

Strangely, Gemini had difficulty accurately depicting American and Canadian cultures, often resorting to stereotypical and occasionally satirical portrayals. This indicates a possible over-reliance on popular cultural references, an area where even ChatGPT showed more finesse. Gemini's answers for an American persona sometimes looked like this:

"Gotta make sure I'm not getting bamboozled by fake news and all that, right? A true American always does their homework (or at least hits up Google real quick!)."

-- Gemini Advanced


"Sure, I love my "me time" like any other American, but man, nothin' beats a good barbecue with the whole neighborhood, or cheering on our local team with a packed stadium! A little community spirit never hurt anyone."

-- Gemini Advanced

Despite these shortcomings, Gemini was capable of more nuanced portrayals in specific cases. When tasked with representing cultures such as the United Arab Emirates (UAE), Austria, and Finland, its responses demonstrated greater sensitivity to cultural values and decision-making styles. For example, when addressing the importance of authority figures in the UAE, Gemini acknowledged both the respect for strong leadership and the cultural emphasis on self-reliance and accountability.

For the UAE, Gemini's answers were often surprisingly nuanced, such as:

"This is a complex issue and there's likely a spectrum of opinions even within Emirati society. Here's my best attempt at a typical choice:

Valuing Good Leadership:
Emiratis respect those in positions of authority, especially individuals who demonstrate competence, integrity, and have a clear vision for the betterment of society.

Stability and Progress: Strong and capable leadership is seen as essential for maintaining order, providing essential services, and guiding national development.

Tradition of Consultation: Although absolute dependence is not ideal, Emirati culture has a tradition of "shura" (consultation) where leaders seek input from the community. This fosters a sense of shared responsibility.

However, here's why it's not "I prefer to depend on authorities if they are honest and competent."

Importance of Self-Reliance: While respecting authority, Emiratis value independence and resourcefulness. Blind dependence on any person, no matter how competent, carries risks. 

Need for Accountability: Authorities are expected to be accountable to those they serve. While trust is given, there's always a degree of oversight to ensure continued responsible governance.

Ultimately, a healthy Emirati perspective likely holds a balance - respecting and working alongside competent leaders while maintaining a sense of individual responsibility and the ability to hold those leaders accountable when needed."

-- Gemini Advanced

This level of nuance was absent in its responses pertaining to the US and Canada.

Yet it should be emphasised that even these nuanced answers did not translate into very accurate Culture Compass scores. The overall performance indicates that a substantial gap remains in AI's ability to handle cultural nuances effectively, and significant work remains on this front of AI capabilities.

Why this matters

The limitations of AI in cultural understanding have serious implications:

Risk of Over-Reliance: AI systems often communicate with a level of confidence that can mask their underlying biases and limitations. This can lead users to place undue trust in outputs that may be culturally inaccurate or insensitive.

Illusion of Objectivity: It's important to recognise that AI models are trained on large datasets of human-generated text. If these datasets disproportionately reflect certain cultures, the AI may perpetuate these biases and present a distorted view of the world.

Potential for Harm: Inaccurate representations of cultural groups, whether through stereotyping or lack of depth, can lead to misunderstandings, offense, and harm to individuals or communities. This is especially critical in sectors like journalism, business, and international relations, where cultural sensitivity is essential.

Need for Inclusive Development: Developing AI that understands and respects cultural diversity is both an ethical imperative and a practical necessity. In a globalized world, the effectiveness of AI depends on its ability to navigate and respect diverse cultural landscapes. This necessitates a focus on diversifying training data and involving cultural experts in the development process.

Our experiment highlights the current limitations of AI systems in grasping cultural nuances. Relying on AI for tasks that require cultural understanding is something we simply cannot recommend at this moment. While there's potential for AI to become more culturally adept, the vision of an AI that is truly culture-savvy remains distant.

Read more about the results of both ChatGPT and Google Gemini in our comparison report.