Why Do I Need A Suitable Diversity Data Ontology For DEI Work?
If you work for one of the thousands of companies worldwide making pledges and commitments to support diversity, equity, and inclusion (DEI) in the workplace, you know that an early step on your journey is to collect data so that you can measure the current representation and experience of your workforce, set practical but ambitious (ideally!) goals for your progress on a quarterly or annual basis, and track progress against those goals.
To measure equity and experience in your organization accurately, you need both identity (“diversity”) data and experiential (“inclusion”) data; together they let you assess the fairness (“equity”) of its processes. Diversity data, and how you collect it, makes an enormous difference in your ability to draw meaning from the data and use it effectively.
Now, to gain insight from your data, you have to think carefully about the data ontology (or underlying structure) you’re creating with your data.
What the hell is a data ontology?
Put simply, an ontology is a model or framework you use to understand your data. It groups and organizes the data in a meaningful way so that you can draw usable (and that part’s important!) information from what you’ve gathered. In the case of demographic data (that’s the data about people’s identities), you will want to build a model for your organization that lets you identify meaningful organizational trends while still letting you examine experience in a granular way.
Why do I need to think about this?
Now, you might be thinking, “why do I have to care so much about the ontology of the data I’m collecting?” If you think about how many people and systems use this data, it becomes a little more obvious:
Compliance reporting and analysis
External “diversity reports”
Pay equity audits
Inclusion and experience surveys
Employee communications
ERG planning and strategy
Each of these use cases requires data in a different format, and you need to be able to tell compelling truths and stories with your diversity data for each of these audiences. In other words, this data is used in complex ways, so putting forethought into how you structure it will save you many headaches (and lots of manual work) down the road.
A well-created data ontology balances the variety of tradeoffs you’re making.
We recognize that many companies are just starting to consider collecting data about the diversity (or balance) and inclusion (or equitable experience) of their workforce. That is exactly the right time to consider which data ontology will best allow you to understand and respond to potential inequities within your workforce. When choosing a model for your diversity data, you’ll want to consider a variety of factors.
Usability: Collect What You’ll Action
No matter which structure you choose for your data, you must be thoughtful about what data you collect. Providing data involves trust, so employees must see the value of providing it reflected in organizational action. That leads to our first rule: don’t collect data that you aren’t willing to act upon.
Interoperability: Existing Ontologies & Reporting
What existing conventions or data ontologies could you draw from? In many countries, governmental and compliance reporting for the most commonly collected data (e.g., gender, race/ethnicity) requires specific data structures. You don’t need to adopt the exact ontology your governing body uses. Still, to avoid significant manual reporting (or double data collection), it’s essential to design data labels that can be easily expanded or combined to conform to legal reporting requirements.
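One way to keep internal labels combinable into a required reporting format is to store an explicit roll-up mapping alongside the data. The sketch below is only an illustration: the label names, the reporting categories, and the roll_up helper are all hypothetical and are not taken from any real compliance scheme.

```python
from collections import Counter

# Hypothetical internal labels mapped to a hypothetical, coarser
# reporting scheme. Replace both sides with your own ontology and
# the categories your reporting body actually requires.
REPORTING_ROLLUP = {
    "Physical disability": "Disabled",
    "Sensory disability": "Disabled",
    "Cognitive or learning disability": "Disabled",
    "No disability": "Not disabled",
    "Prefer not to say": "Not disclosed",
}


def roll_up(responses, mapping):
    """Count responses per reporting category.

    Raises ValueError if any internal label has no reporting category,
    so gaps in the mapping surface before a report is produced.
    """
    unmapped = sorted({r for r in responses if r not in mapping})
    if unmapped:
        raise ValueError(f"No reporting category for labels: {unmapped}")
    return Counter(mapping[r] for r in responses)


# Illustrative responses only; not real data.
responses = [
    "Physical disability",
    "Sensory disability",
    "No disability",
    "No disability",
    "Cognitive or learning disability",
    "Prefer not to say",
]
report = roll_up(responses, REPORTING_ROLLUP)
```

Keeping the mapping in one place means the detailed labels stay available for internal analysis, while the combined categories can be produced on demand without collecting the data a second time.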
Size: Group Labels
It’s essential to consider the size of the sub-groups within each category of data you decide to collect (for example, which response options exist under the option to identify as “Disabled”). When deciding on the number of labels (and thus the size of the groups you’re creating), optimize for groups that are large enough to show natural variation but small enough to show meaningful differences; ideally, each group should have at least 5 people, which also protects anonymity. It can be tempting to develop expansive lists of options (e.g., 7+ gender options), but in most companies this makes groups too small to analyze meaningfully. In this example, most companies would be best served by providing 3 gender options, with the third covering all trans & gender-non-conforming individuals (who would likely benefit from similar interventions if issues with their experience were discovered in a particular workplace).
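The group-size guideline above can be checked mechanically before any results are shared. This is a minimal sketch, assuming a minimum group size of 5 as suggested in the text; the summarize_groups helper and the option labels are hypothetical placeholders.

```python
from collections import Counter

# Minimum group size suggested in the text; adjust to your own policy.
MIN_GROUP_SIZE = 5


def summarize_groups(labels, min_size=MIN_GROUP_SIZE):
    """Split group counts into reportable groups and suppressed labels.

    Groups with fewer than min_size members are withheld so that
    individuals cannot be singled out in a report.
    """
    counts = Counter(labels)
    reportable = {label: n for label, n in counts.items() if n >= min_size}
    suppressed = sorted(label for label, n in counts.items() if n < min_size)
    return reportable, suppressed


# Illustrative responses only; not real data.
labels = ["Option A"] * 12 + ["Option B"] * 7 + ["Option C"] * 3
reportable, suppressed = summarize_groups(labels)
```

If many labels end up suppressed, that is a signal the list of options is too fine-grained for the size of the organization and that some labels may need to be combined.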
Progressiveness: Social Convention & Language
What data you collect isn’t determined only by what you’ll use. You should also consider what’s socially appropriate (and, of course, legal!) to collect in the regions you plan to collect data from. Social convention, along with your organization’s cultural readiness, should play an important role not only in deciding what to collect but also in setting the tone of the language you’ll use for labels. When choosing labels you have to choose, for example, between the most progressive language available and the language that’s familiar to the largest portion of your audience (in most cases, choose the most familiar!).
Understanding how to develop a helpful data ontology is crucial to collecting diversity data in a practical and actionable way.
A thoughtfully constructed ontology gives you a shared language for describing the groups within your organization and a helpful window into their experience. When you begin to build a DEI strategy, taking the time to consider the tradeoffs inherent in your data model will pay dividends in the overall success of your program.