Data Science Can’t Fix Hiring (Yet)
Recruiting managers desperately need new tools, because the existing ones—unstructured interviews, personality tests, personal referrals—aren’t very effective.
The newest development in hiring, which is both promising and worrying, is the rise of data science–driven algorithms to find and assess job candidates. By my count, more than 100 vendors are creating and selling these tools to companies. Unfortunately, data science, which is still in its infancy when it comes to recruiting and hiring, is not yet the panacea employers hope for.
Vendors of these new tools promise they will help reduce the role that social bias plays in hiring. And the algorithms can indeed help identify good job candidates who would previously have been screened out for lack of a certain education or social pedigree. But these tools may also identify and promote the use of predictive variables that are (or should be) troubling.
Because most data scientists seem to know so little about the context of employment, their tools are often worse than nothing. For instance, an astonishing percentage build their models by simply looking at attributes of the “best performers” in workplaces and then identifying which job candidates have the same attributes. They use anything that’s easy to measure: facial expressions, word choice, comments on social media, and so forth. But a failure to check for any real difference between high-performing and low-performing employees on these attributes limits their usefulness. Furthermore, scooping up data from social media or the websites people have visited also raises important questions about privacy. True, the information can be accessed legally; but the individuals who created the postings didn’t intend or authorize them to be used for such purposes. Furthermore, is it fair that something you posted as an undergrad can end up driving your hiring algorithm a generation later?
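To make the point about checking for real differences concrete, here is a minimal sketch in Python. The employee records, attribute names, and helper functions are all hypothetical, invented for illustration; they do not describe any vendor's actual model. The sketch shows how an attribute can look like a hallmark of top performers and still fail to distinguish them from low performers.

```python
def attribute_rate(employees, attribute):
    """Share of employees in a group who have the given attribute."""
    if not employees:
        raise ValueError("group is empty")
    hits = sum(1 for e in employees if attribute in e["attributes"])
    return hits / len(employees)


def rate_gap(high, low, attribute):
    """Difference in attribute prevalence: high performers minus low performers."""
    return attribute_rate(high, attribute) - attribute_rate(low, attribute)


# Hypothetical, made-up records (not real data).
high_performers = [
    {"id": 1, "attributes": {"smiles_often", "structured_answers"}},
    {"id": 2, "attributes": {"smiles_often", "structured_answers"}},
    {"id": 3, "attributes": {"smiles_often"}},
    {"id": 4, "attributes": {"structured_answers"}},
]
low_performers = [
    {"id": 5, "attributes": {"smiles_often"}},
    {"id": 6, "attributes": {"smiles_often", "structured_answers"}},
    {"id": 7, "attributes": {"smiles_often"}},
    {"id": 8, "attributes": set()},
]

# Looking only at the best performers, both attributes appear in 75% of them.
smile_rate_high = attribute_rate(high_performers, "smiles_often")
structured_rate_high = attribute_rate(high_performers, "structured_answers")

# Comparing against low performers shows that only one attribute separates the groups.
smile_gap = rate_gap(high_performers, low_performers, "smiles_often")
structured_gap = rate_gap(high_performers, low_performers, "structured_answers")

print(f"smiles_often gap: {smile_gap:.2f}")
print(f"structured_answers gap: {structured_gap:.2f}")
```

In this toy data, a model built from the best performers alone would treat both attributes as equally promising, even though one of them is just as common among the weakest employees. A real analysis would need far more observations and a proper statistical test.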
Another problem with machine learning approaches is that few employers collect the large volumes of data—number of hires, performance appraisals, and so on—that the algorithms require to make accurate predictions. Although vendors can theoretically overcome that hurdle by aggregating data from many employers, they don’t really know whether individual company contexts are so distinct that predictions based on data from the many are inaccurate for the one.
Yet another issue is that all analytic approaches to picking candidates are backward looking, in the sense that they are based on outcomes that have already happened. (Algorithms are especially reliant on past experiences in part because building them requires lots and lots of observations—many years’ worth of job performance data even for a large employer.) As Amazon learned, the past may be very different from the future you seek. It discovered that the hiring algorithm it had been working on since 2014 gave lower scores to women—even to attributes associated with women, such as participating in women’s studies programs—because historically the best performers in the company had disproportionately been men. So the algorithm looked for people just like them. Unable to fix that problem, the company stopped using the algorithm in 2017. Nonetheless, many other companies are pressing ahead.
The underlying challenge for data scientists is that hiring is simply not like trying to predict, say, when a ball bearing will fail—a question for which any predictive measure might do. Hiring is so consequential that it is governed not just by legal frameworks but by fundamental notions of fairness. The fact that some criterion is associated with good job performance is necessary but not sufficient for using it in hiring.
Take a variable that data scientists have found to have predictive value: commuting distance to the job. According to the data, people with longer commutes suffer higher rates of attrition. However, commuting distance is governed by where you live—which is governed by housing prices, relates to income, and also relates to race. Picking whom to hire on the basis of where they live most likely has an adverse impact on protected groups such as racial minorities.
Unless no other criterion predicts at least as well as the one being used—and that is extremely difficult to determine in machine learning algorithms—companies violate the law if they use hiring criteria that have adverse impacts. Even then, to stay on the right side of the law, they must show why the criterion creates good performance. That might be possible in the case of commuting time, but—at least for the moment—it is not for facial expressions, social media postings, or other measures whose significance companies cannot demonstrate.
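One common way to screen for adverse impact is to compare selection rates across groups. The sketch below uses the "four-fifths" rule of thumb from the U.S. Uniform Guidelines on Employee Selection Procedures, under which a group whose selection rate falls below 80% of the highest group's rate is generally treated as evidence of adverse impact. That rule is not discussed above and is brought in here as an assumption for illustration. The group labels, counts, and function names are hypothetical, and a flag from a check like this is a prompt for further review, not a legal conclusion.

```python
def selection_rate(selected, applicants):
    """Fraction of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratios(groups):
    """Each group's selection rate divided by the highest group's rate.

    `groups` maps a group label to a (selected, applicants) pair.
    """
    rates = {name: selection_rate(s, a) for name, (s, a) in groups.items()}
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any selections")
    return {name: rate / best for name, rate in rates.items()}


def flagged_groups(groups, threshold=0.8):
    """Groups whose impact ratio falls below the threshold (four-fifths by default)."""
    ratios = impact_ratios(groups)
    return sorted(name for name, ratio in ratios.items() if ratio < threshold)


# Hypothetical outcome of screening on commuting distance (made-up counts).
outcomes = {
    "group_a": (30, 100),  # 30% selected
    "group_b": (18, 100),  # 18% selected
}

ratios = impact_ratios(outcomes)
flags = flagged_groups(outcomes)
print(ratios)
print(flags)
```

Running a check like this on the employer's own applicants, rather than relying on a vendor's results from elsewhere, is the kind of in-house observation the argument above calls for.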
In the end, the drawback to using algorithms is that we’re trying to use them on the cheap: building them by looking only at best performers rather than all performers, using only measures that are easy to gather, and relying on vendors’ claims that the algorithms work elsewhere rather than observing the results with our own employees. Not only is there no free lunch here, but you might be better off skipping the cheap meal altogether.
Peter Cappelli is the George W. Taylor Professor of Management at the Wharton School and a director of its Center for Human Resources. He is the author of several books, including Will College Pay Off? A Guide to the Most Important Financial Decision You’ll Ever Make (PublicAffairs, 2015).
Expanding the Pool
Goldman Sachs is a people-centric business—every day our employees engage with our clients to find solutions to their challenges. As a consequence, hiring extraordinary talent is vital to our success and can never be taken for granted. In the wake of the 2008 financial crisis we faced a challenge that was, frankly, relatively new to our now 150-year-old firm. For decades investment banking had been one of the most sought-after, exciting, and fast-growing industries in the world. That made sense—we were growing by double digits and had high returns, which meant that opportunity and reward were in great supply. However, the crash took some of the sheen off our industry; both growth and returns moderated. And simultaneously, the battle for talent intensified—within and outside our industry. Many of the candidates we were pursuing were heading off to Silicon Valley, private equity, or start-ups. Furthermore, we were no longer principally looking for a specialized cadre of accounting, finance, and economics majors: New skills, especially coding, were in huge demand at Goldman Sachs—and pretty much everywhere else. The wind had shifted from our backs to our faces, and we needed to respond.
Not long ago the firm relied on a narrower set of factors for identifying “the best” students, such as school, GPA, major, leadership roles, and relevant experience—the classic résumé topics. No longer. We decided to replace our hiring playbook with emerging best practices for assessment and recruitment, so we put together a task force of senior business leaders, PhDs in industrial and organizational psychology, data scientists, and experts in recruiting. Some people asked, “Why overhaul a recruiting process that has proved so successful?” and “Don’t you already have many more qualified applicants than available jobs?” These were reasonable questions. But often staying successful is about learning and changing rather than sticking to the tried-and-true.
Each year we hire up to 3,000 summer interns and nearly as many new analysts directly from campuses. In our eyes, these are the firm’s future leaders, so it made sense to focus our initial reforms there. They involved two major additions to our campus recruiting strategy—video interviews and structured interviewing.
Asynchronous video interviews.
Traditionally we had flown recruiters and business professionals to universities for first-round interviews. The schools would give us a set date and number of time slots to meet with students. That is most definitely not a scalable model. It restricted us to a smaller number of campuses and only as many students as we could squeeze into a limited schedule. It also meant that we tended to focus on top-ranked schools. The number of qualified candidates at a school became more important than which students were the most talented, regardless of their school. However, we knew that candidates didn’t have to attend Harvard, Princeton, or Oxford to excel at Goldman Sachs; our leadership ranks were already rich with people from other schools. What’s more, as we’ve built offices in new cities and geographic locations, we’ve needed to recruit at more schools located in those areas. Video interviews allow us to do that.
At a time when companies were just beginning to experiment with digital interviewing, we decided to use “asynchronous” video interviews—in which candidates record their answers to interview questions—for all first-round interactions with candidates. Our recruiters record standardized questions and send them to students, who have three days to return videos of their answers. This can be done on a computer or a mobile device. Our recruiters and business professionals review the videos to narrow the pool and then invite the selected applicants to a Goldman Sachs office for final-round, in-person interviews. (To create the video platform, we partnered with a company and built our own digital solution around its product.)
This approach has had a meaningful impact in two ways. First, with limited effort, we can now spend more time getting to know the people who apply for jobs at Goldman Sachs. In 2015, the year before we rolled out this platform, we interviewed fewer than 20% of all our campus applicants; in 2018 almost 40% of the students who applied to the firm participated in a first-round interview. Second, we now encounter talent from places we previously didn’t get to. In 2015 we interviewed students from 798 schools around the world, compared with 1,268 for our most recent incoming class. In the United States, where the majority of our student hires historically came from “target schools,” the opposite is now true. The top of our recruiting funnel is wider, and the output is more diverse.
Being a people-driven business, we have worked hard to ensure that the video interviews don’t feel cold and impersonal. They are only one component of a broader process that makes up the Goldman Sachs recruitment experience. We still regularly send Goldman professionals to campuses to engage directly with students at informational sessions, “coffee chats,” and other recruiting events. But now our goal is much more to share information than to assess candidates, because we want people to understand the firm and what it offers before they tell us why they want an internship or a job.
We also want them to be as well prepared as possible for our interview process. Our goal is a level playing field. To help achieve it, we’ve created tip sheets and instructions on preparing for a video interview. Because the platform doesn’t allow videos to be edited once they’ve been recorded, we offer a practice question before the interview begins and a countdown before the questions are asked. We also give students a formal channel for escalating issues should technical problems arise, though that rarely occurs.
We’re confident that this approach has created a better experience for recruits. It uses a medium they’ve grown up with (video), and most important, they can do their interviews when they feel fresh and at a time that works with their schedule. (Our data shows that they prefer Thursday or Sunday night—whereas our previous practice was to interview during working hours.) We suspected that if the process was a turnoff for applicants, we would see a dip in the percentage who accepted our interviews and our offers. That hasn’t happened.
Structured questioning and assessments.
How can you create an assessment process that not only helps select top talent but focuses on specific characteristics associated with success? Define it, structure it, and don’t deviate from it. Research shows that structured interviews are effective at assessing candidates and helping predict job performance. So we ask candidates about specific experiences they’ve had that are similar to situations they may face at Goldman Sachs (“Tell me about a time when you were working on a project with someone who was not completing his or her tasks”) and pose hypothetical scenarios they might encounter in the future (“In an elevator, you overhear confidential information about a coworker who is also a friend. The friend approaches you and asks if you’ve heard anything negative about him recently. What do you do?”).
Essentially, we are focused less on past achievements and more on understanding whether a candidate has qualities that will positively affect our firm and our culture. Our structured interview questions are designed to assess candidates on 10 core competencies, including analytical thinking and integrity, which we know correlate with long-term success at the firm. They are evaluated on six competencies in the first round; if they progress, they’re assessed on the remaining four during in-person interviews.
We have a rotating library of questions for each competency, along with a rubric for interviewers that explains how to rate responses on a five-point scale from “outstanding” to “poor.” We also train our interviewers to conduct structured interviews, provide them with prep materials immediately before they interview a candidate, and run detailed calibration meetings using all the candidate data we’ve gathered throughout the recruiting process to ensure that certain interviewers aren’t introducing grade inflation (or deflation). We’re experimenting with prehire assessment tests to be paired with these interviews; we already offer a technical coding and math exam for applicants to our engineering organization.
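As a rough illustration of what a calibration check for grade inflation or deflation might look like, here is a minimal Python sketch. It is not a description of the firm's actual process. The numeric mapping of the five-point scale (5 for “outstanding,” 1 for “poor”), the tolerance value, the interviewer labels, and the ratings are all assumptions made up for the example.

```python
from statistics import mean


def interviewer_offsets(ratings):
    """How far each interviewer's average rating sits from the overall average.

    `ratings` maps an interviewer label to a list of scores on a 1-5 scale
    (assumed here: 5 = "outstanding", 1 = "poor").
    """
    all_scores = [score for scores in ratings.values() for score in scores]
    if not all_scores:
        raise ValueError("no ratings provided")
    overall = mean(all_scores)
    return {name: mean(scores) - overall for name, scores in ratings.items() if scores}


def calibration_outliers(ratings, tolerance=1.0):
    """Interviewers whose average deviates from the overall average by more than `tolerance`."""
    offsets = interviewer_offsets(ratings)
    return {name: off for name, off in offsets.items() if abs(off) > tolerance}


# Hypothetical ratings (made-up numbers).
ratings = {
    "interviewer_1": [3, 4, 3, 2],
    "interviewer_2": [5, 5, 5, 5],  # consistently high: possible grade inflation
    "interviewer_3": [3, 3, 2, 4],
}

outliers = calibration_outliers(ratings)
print(outliers)
```

A gap in averages alone does not prove inflation, since one interviewer may simply have seen stronger candidates. That is one reason to bring the rest of the candidate data into the calibration discussion before drawing conclusions.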
We decided not to pilot these changes and instead rolled them out en masse, because we realized that buy-in would come from being able to show results quickly—and because we know that no process is perfect. Indeed, what I love most about our new approach is that we’ve turned our recruiting department into a laboratory for continuous learning and refinement. With more than 50,000 candidate video recordings, we’re now sitting on a treasure trove of data that will help us conduct insightful analyses and answer questions necessary to run our business: Are we measuring the right competencies? Should some be weighted more heavily than others? What about the candidates’ backgrounds? Which interviewers are most effective? Does a top-ranked student at a state school create more value for us than an average student from the Ivy League? We already have indications that students recruited from the new schools in our pool perform just as well as students from our traditional ones—and in some cases are more likely to stay longer at the firm.
What’s next for our recruiting efforts? We receive almost 500,000 applications each year. From this pool we hire approximately 3%. We believe that many of the other 97% could be very successful at Goldman Sachs. As a result, picking the right 3% is less about just the individual and increasingly about matching the right person to the right role. That match may be made straight out of college or years later. We’re experimenting with résumé-reading algorithms that will help candidates identify the business departments best suited to their skills and interests. We’re looking at how virtual reality might help us better educate students about working in our offices and in our industry. And we’re evaluating various tools and tests to bring even more data into the hiring decision process. Can I imagine a future in which companies rely exclusively on machines and algorithms to rate résumés and interviews? Maybe, for some. But I don’t see us ever eliminating the human element at Goldman Sachs; it’s too deeply embedded in our culture, in the work we do, and in what we believe drives success.
I’m excited to see where this journey takes us. Our 2019 campus class is shaping up to be the most diverse ever—and it’s composed entirely of people who were selected through rigorous, objective assessments. There’s no way we aren’t better off as a result.
Dane E. Holmes is the global head of human capital management at Goldman Sachs.