At the recent Cloudera Sessions event in Munich, Germany, Paige Roberts, Syncsort’s Big Data Product Marketing Manager, had a chat with Katharine Jarmul, founder of KJamistan data science consultancy, and author of Data Wrangling with Python from O’Reilly. She had just given an excellent presentation on the implications of GDPR for the European data science community. Part 3 dives into the position of being one of the women in tech, the challenges of creating an inclusive company culture, and how bias doesn’t only exist in machine learning data sets.
In the first part of the interview, we talked about the importance of being able to explain your machine learning models – not just to comply with regulations like GDPR, but also to make the models more useful.
In part 2, Katharine Jarmul went beyond the basic requirements of GDPR to discuss some of the important ethical drivers behind studying the data fed to machine learning models. A biased data set can have a huge impact in a world increasingly driven by machine learning.
Paige Roberts: I know, I’m probably a little obsessive about it, but one of the things I do is look around at every event and note the ratio of women to men. And I must say, the percentage of women at this event is a little low.
Katharine Jarmul: Yeah.
So, do you find yourself in that situation a lot? Do you get that, “I’m the only woman in the room” feeling?
I would say that one of the biggest problems I see in terms of women in technology is not that there aren’t a lot of amazing women interested in tech; it’s that it’s difficult for a lot of really talented women in tech to get recognized and promoted.
It feels like women have to be twice as good, to be recognized as half as good.
Yeah. And I think we’re finding out now that there are a lot of other minority groups who find it difficult as well, such as women of color. Maybe you have to work four times as hard. We see this exponential thing, and when you’re at an event where it’s mainly executives, or people who have worked their way up for a while, then you just tend to see fewer women, and that’s really sad. I don’t see it as a pipeline problem. I know a lot of people talk about it as a pipeline problem, and yeah, okay, we could have a better pipeline.
Yeah, we need a few more women graduating, but that’s not the problem. The problem is they don’t get as far as they should once they graduate.
Exactly, and maybe eventually they leave because they are tired of not being promoted, having somebody else promoted over them, not getting the cool projects so they can shine.
And some of it is just cultural in tech companies. You get that exclusionary feeling. I had a conversation recently, somebody I was talking to… Oh, I was talking to Tobi Bosede. She’s a woman of color, and she’s a machine learning engineer who did a presentation at Strata. She said something along the lines of, the guys I work with say, “Let’s go play basketball after work.” And everybody on the team does. She’s thinking, “I don’t even like basketball. I don’t really want to go play basketball with the guys after work, but I still feel left out.”
Yeah, I get that. It’s difficult to make a good team culture that’s inclusive. I think you must really work for it. I know some great team leads who are doing things that help, but I think especially if say, you’re a white guy that didn’t grow up with a lot of diversity in your family or your neighborhood, it might be more difficult for you to learn how to create that culture. You must work for it. It’s not just going to happen.
It’s almost like a biased data set in your life. You don’t recognize bias in yourself, until you stop and think about it. It doesn’t just jump out and make itself known.
I did an interview with Neha Narkhede, she’s the CTO at Confluent, and she was talking about hiring bias. Even as a woman of color herself, when hiring, she catches herself doing it, and must stop and think, and deliberately avoid bias. It’s in your own head. You think, I should know better.
Yeah, yeah. And I think these unconscious biases are things that we have, as humans. We all have some affinity bias, right? So, if somebody is like me, I’m going to automatically think that their ideas are clearer. They think like me, so I can more easily see their point. That’s fine, but one of the things that helps teams grow is having arguments, …
Having different points of view, and accepting that, “Okay, this guy thinks completely different from me, but maybe he’s got a point.”
I find myself doing the thing where I think, “Why did they disagree with me? How could they?”
They’re wrong, obviously. [laughing]
[laughing] Especially when I notice that I’m doing it like that, I say, “Okay, I need to sit down and think through this. Is there perhaps a kernel of truth here? Or something that bothers me because it doesn’t necessarily fit into my world view? And should I, perhaps, poke at that a little bit, and figure it out?”
Stop and think, introspect.
That’s a good word. I like that.
We have our own mental models, and we need to question the bias in them, too.
Be sure to check out part 4 of this interview where we’ll discuss some of the work Ms. Jarmul is doing in the areas of anonymization so that data can be repurposed without violating privacy, and creating artificial data sets that have the kind of random noise that makes real data sets so problematic.
For a look at 5 key Big Data trends in the coming year, check out our report, 2018 Big Data Trends: Liberate, Integrate & Trust.
Katharine Jarmul on If Ethics is Not None
Katharine Jarmul on PyData Amsterdam Keynote on Ethical Machine Learning