Misogyny Muncher

AI Chatbot in Slack for Misogyny Detection

Content Moderation Support for Groups in Online Messaging Platforms


4 August 2024

"About four-in-ten (42%) working women in the U.S. have faced discrimination on the job because of their gender."

"The increased collaborations in online platforms such as Slack have become central to workplace communication."

"Sexism and misogyny harm the efficiency and wellbeing of employees."


Project

We aim to address the prevalence of misogyny by developing a digital solution that empowers HR personnel, company owners, employee teams and individual staff to uphold inclusive discourse. Our approach uses a large language model (LLM) to support content moderation in mitigating sexism and misogyny in online group chats. The model is connected to a chatbot we built in Slack, giving users a working prototype to interact with.

With the help of generative AI technology, this harmful impact can be minimized while giving human moderators, participating groups, and individuals an opportunity to respond appropriately to offensive speech, both implicit and explicit, which they often struggle to identify in cases such as subtle misogyny or gaslighting.

The client for this project, Vividhata Pty Ltd, is a certified Social Trader organization specializing in helping enterprises foster and maintain equitable, inclusive and diverse internal cultures. The prototype developed in the project will serve as a demonstrator for the client's customers.



Project Team

David Fuchs - HTW Berlin
Karenina Schröder - HTW Berlin
Ella Klamm - HTW Berlin
Alver Remolar - UTS Sydney



Project Results

Initially, we focused on developing user personas, conducted research on content moderation and LLM applications, and investigated existing datasets on sexist or offensive speech. This groundwork allowed us to create a Minimum Viable Product (MVP) and outline a long-term vision for our AI chatbot.

Development Process

1. Initial Prototype
  • Created a dummy dataset.
  • Built a minimal prototype of the chatbot in Slack.
  • Chose Llama 3 8B as our preferred LLM.
  • Tested the dummy dataset and generated a confusion matrix.
Figure: Confusion matrix
2. Iteration and Evaluation
  • Improved the product with each iteration.
  • Identified and utilized the ISEP dataset (available on Kaggle).
  • Evaluated the model using a confusion matrix.
  • Achieved satisfactory results for the MVP, so further training was deemed unnecessary.
3. Integration and Testing
  • Connected the LLM to the chatbot using the Groq API and Slack API.
  • Tested various temperatures and prompts.
  • Opted for a few-shot learning approach.
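
The following is a minimal sketch of what this few-shot classification call via the Groq API could look like. The model ID corresponds to Llama 3 8B as hosted on Groq; the prompt wording, examples, output format and temperature are illustrative assumptions rather than our exact implementation:

# Sketch of a few-shot classification call via the Groq API.
# Prompt text, examples and temperature are illustrative assumptions.
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

FEW_SHOT = [
    {"role": "system", "content": (
        "You classify Slack messages as misogynistic or not. Answer with a "
        'JSON object: {"misogynistic": true/false, "probability": 0-1, '
        '"explanation": "..."}'
    )},
    # A few labeled examples steer the model (few-shot prompting):
    {"role": "user", "content": "Women can't handle technical work."},
    {"role": "assistant", "content": '{"misogynistic": true, "probability": 0.97, '
                                     '"explanation": "Asserts incompetence based on gender."}'},
    {"role": "user", "content": "Great job on the release, team!"},
    {"role": "assistant", "content": '{"misogynistic": false, "probability": 0.02, '
                                     '"explanation": "Neutral praise with no gendered content."}'},
]

def classify(message: str) -> str:
    response = client.chat.completions.create(
        model="llama3-8b-8192",  # Llama 3 8B hosted on Groq
        messages=FEW_SHOT + [{"role": "user", "content": message}],
        temperature=0.2,  # low temperature for stable labels
    )
    return response.choices[0].message.content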

Figure: Example

Current Capabilities

Our chatbot, integrated into Slack, recognizes misogyny with:

  • Accuracy: Over 90%
  • Precision: 89%
  • Recall: 94%
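
These figures come from the confusion-matrix evaluation described above. A minimal sketch of how such metrics can be computed with scikit-learn, using illustrative labels (1 = misogynistic, 0 = not):

# Sketch of the confusion-matrix evaluation; labels are illustrative.
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # gold labels from the dataset
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]  # labels returned by the LLM

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
print(confusion_matrix(y_true, y_pred))
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2f}")
print(f"Precision: {precision_score(y_true, y_pred):.2f}")
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")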

The chatbot can detect even subtle misogyny by considering the context of conversations. When a potentially misogynistic message is flagged, the user receives a direct message with:

  • The flagged message
  • A probability score
  • A brief explanation of why the message might be considered misogynistic
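
A minimal sketch of how this direct message could be sent with the Slack Web API (slack_sdk); the classifier result fields are assumptions carried over from the classification sketch above:

# Sketch of the direct-message notification via the Slack Web API.
# The result fields and message wording are illustrative assumptions.
import json
import os

from slack_sdk import WebClient

slack = WebClient(token=os.environ["SLACK_BOT_TOKEN"])

def notify_user(user_id: str, flagged_text: str, result_json: str) -> None:
    result = json.loads(result_json)
    # Open (or reuse) a DM conversation with the author of the message.
    dm = slack.conversations_open(users=user_id)
    slack.chat_postMessage(
        channel=dm["channel"]["id"],
        text=(
            f":warning: Your message was flagged:\n> {flagged_text}\n"
            f"Probability: {result['probability']:.0%}\n"
            f"Why: {result['explanation']}"
        ),
    )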

Figure: HR private channel

Additional Features

HR Management Tools
  • Statistics on individuals and teams are available as a CSV file (see the sketch after this list).
  • Notifications are sent to a private Slack channel for HR to ensure important information is not overlooked.
Real-time Evaluation
  • Developed a Chrome extension for live message evaluation.
  • Implemented a traffic light system (red, orange, green) to visualize the evaluation and probability scores for users.
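
The CSV export mentioned above could be produced roughly as follows; the aggregation structure and column names are assumptions, not our exact schema:

# Sketch of the per-user statistics export; field names are assumptions.
import csv

# e.g. flag counts aggregated per Slack user ID
stats = {
    "U123": {"messages": 240, "flagged": 3},
    "U456": {"messages": 180, "flagged": 0},
}

with open("misogyny_stats.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["user_id", "messages", "flagged", "flag_rate"])
    for user_id, s in stats.items():
        rate = s["flagged"] / s["messages"] if s["messages"] else 0.0
        writer.writerow([user_id, s["messages"], s["flagged"], f"{rate:.3f}"])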

Our chatbot can identify misogynistic content not only in English but also in Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese and Korean, although we have only tested it in English and German. It provides real-time feedback and useful analytics to improve workplace communication. By leveraging advanced LLM technology and user-friendly interfaces, we aim to create a safer, more respectful workplace for everyone.


Future Work

Limitations and Next Steps

We take the context of a conversation into account by passing the last n messages to the model. Due to time constraints, however, the bot does not yet consistently avoid flagging a non-misogynistic message that appears in the context of misogynistic ones. Despite this inconsistent flagging, the chatbot can still distinguish the current, non-misogynistic text from the misogynistic message history. We recommend testing this further with different prompt templates and comparing the results; a sketch of one such template follows below.
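
For reference, this is a minimal sketch of how the last n messages could be passed as context while instructing the model to classify only the current message. The window size and prompt framing are assumptions, and exactly the kind of template worth varying in such tests:

# Sketch of building a context window from the channel history.
# Window size and prompt framing are illustrative assumptions.
N_CONTEXT = 5

def build_messages(history: list[str], current: str) -> list[dict]:
    context = "\n".join(history[-N_CONTEXT:])
    return [
        {"role": "system", "content": (
            "Classify ONLY the current message as misogynistic or not. "
            "The preceding messages are context and must not be flagged themselves."
        )},
        {"role": "user", "content": f"Context:\n{context}\n\nCurrent message:\n{current}"},
    ]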

Additionally, our real-time evaluation (the Chrome extension) is only an initial prototype that checks whether the word being typed appears in a list of "bad words". The next step would therefore be to connect the Chrome extension to our Slackbot and the LLM.

Furthermore, the following is a list of recommendations that our team has accumulated over the course of the project, also drawing on the advice of our academic supervisors and client:

Completing Features and Improving Other Technical Aspects
User Interface and Experience
  • Connect the traffic-light Chrome extension to our code and LLM
    • to make it platform-independent, so that users are warned before they send a message that it could contain misogyny, which has proven to be an effective way of preventing hateful messages (Warner et al., 2024).
    • while weighing the pitfalls of real-time detection/censorship: it can only work with a very reliable and accurate model (e.g., what about false positives?), which requires thorough model evaluation and sufficient real-time computing power to achieve the desired results.
  • Equip the Slackbot with interactive feedback
    • a thumbs-up or thumbs-down on each classification message, so users can indicate whether the bot classified the message correctly (→ reinforcement learning; see the sketch at the end of this section).
  • Direct messaging with the bot
    • have it send an initial message explaining what it can and cannot do (that is: it can check a message for misogynistic content, but cannot chat in general), or customize the way it responds further, etc.
Data Storage, Access, and Security
  • Include data privacy protection and data organization (how is the data represented and made available to the person who will work with it?)
  • Server space/computing resources: work out how hosting can be provided so that small businesses without their own servers or technical expertise can also use our product.
Understanding the Context Better
Complex Meanings, Conversational Sequences, and Socio-behavioural Patterns
  • Find a better solution to the context problem (so that messages that are not misogynistic are not falsely reported as misogynistic).
  • Categorize misogynistic messages to identify the most frequently discussed types of misogynistic content. Use this analysis to highlight the issue through a newsletter or workshop.
Multi-lingual and Multi-cultural Support
  • It would be helpful if the bot could be trained and adjusted by its users to the cultural differences of the various contexts it is used in, because "Detecting hate speech is difficult because the definition of hate speech varies, depending on a complex intersection of the topic of the assertion, the context, the timing, outside events, and the identity of speaker and recipient". The same applies to different languages: the bot currently works well for languages that are well represented on the internet, but more tests are needed to evaluate how well it functions outside of English.
Hate Speech Beyond Misogyny and Sexism
  • Consider intersectionality (misogyny does not occur in isolation; it is combined with other forms of discrimination and hate).
Alternative Machine Learning Approach
  • Create an even more accurate model by fine-tuning, supported by extensive, structured tests and model-performance comparisons (which would require more computing power).
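
As a starting point for the interactive-feedback recommendation above, here is a minimal sketch of thumbs-up/down buttons using Slack's Bolt framework and Block Kit; the action IDs and the logging step are assumptions:

# Sketch of thumbs-up/down feedback buttons with Slack's Bolt framework.
# Action IDs and the logging step are illustrative assumptions.
import os

from slack_bolt import App

app = App(token=os.environ["SLACK_BOT_TOKEN"],
          signing_secret=os.environ["SLACK_SIGNING_SECRET"])

# Appended to the blocks of each classification message the bot posts:
FEEDBACK_BLOCK = {
    "type": "actions",
    "elements": [
        {"type": "button", "text": {"type": "plain_text", "text": "👍"},
         "action_id": "feedback_correct"},
        {"type": "button", "text": {"type": "plain_text", "text": "👎"},
         "action_id": "feedback_wrong"},
    ],
}

@app.action("feedback_correct")
def on_correct(ack, body):
    ack()
    # Store the confirmation for later fine-tuning or prompt updates.
    print("classification confirmed by", body["user"]["id"])

@app.action("feedback_wrong")
def on_wrong(ack, body):
    ack()
    # Store the misclassification report for later review and retraining.
    print("misclassification reported by", body["user"]["id"])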