Category: Data Science

Worst practices in managing data science teams

Or, sometimes we learn best by learning what not to do

LeBron and JR meme

Organizations, especially public education agencies, are always looking for best practices. I’m a big believer in this approach — the world is complex, there are many variables, and starting the search for solutions with practices that others in your field have found to work is smart. But sometimes best practices have the effect of deterring change because they feel too far out of reach.

We could never get there! We can’t even get X right!

That’s why in this post I’m going to talk about worst practices. Building data science teams in public agencies is an immense challenge. The competition for talent is fierce. Demonstrating return on the investment in analysts can be difficult when your organization isn’t in the business of selling ads. It is often hard to gauge how different the outcomes of the organization would have been without analysts in the first place.

From my work at the Wisconsin Department of Public Instruction, to the Strategic Data Project at the Center for Education Policy Research at Harvard University, to conversations and consultations with education analysts across the country (which led me to co-author a whole book), I’ve been part of building and training education agency data science teams. And I’ve seen the struggles and triumphs.

So without further ado, let me cover what I see as some of the worst practices in doing this work and, if you see these in your organization, what you can do to start moving away from them.

Have one analyst per language

This is a big one for agencies. It is easy to justify letting analysts select their own set of tools for data analysis. It is hard enough to recruit without also limiting candidates to a specific set of tools. Agencies struggle with underinvestment in staff training for analysts (see below), so the thought of retraining a team of analysts is daunting. And, analysts, well, we’re a picky bunch with lots of opinions on the tools we have come to know and love.

But, when you step back and take an agency perspective you see the problems.

  1. Errors abound. Data analysis is hard work. Mistakes happen. Without someone else to quality control (QC) your analysis and code, mistakes are much more likely to get out.
  2. Stunted growth. Analysts will be happy at first not having to learn new tools, but without true collaborators it will be harder and harder to learn new skills, push the boundaries, and build efficiencies in the tools they use.
  3. Continuity. Where does all the SAS code go when your last SAS programmer quits? How will you run those reports?
  4. Budgets. It’s not just the license fees for SPSS, SAS, and Stata that add up. What about training materials? Books? Training time? You can’t achieve any economies of scale.
  5. Isolation. Your analysts won’t feel like a team; they’ll feel isolated from one another. The long-term effects are lower job satisfaction and, probably, higher turnover.

The best organizations standardize on a common language across the team. To achieve this, they set clear expectations about learning the standard tools used on the team, but they also provide plenty of support to new analysts to get up to speed on those tools.

Never read code

Reading and reviewing code is crucial, and not just for quality control. Teams need to be able to seamlessly read each other’s code to share ideas, develop norms, and collaborate. And they need to do so outside of high-stakes quality control reviews. When teams don’t read code:

  1. Silos emerge. Analysts will adopt different coding styles or slightly varying solutions to the same problem, which creates friction for collaboration over time.
  2. Innovation slows. QC of analysis code is a difficult task, and while doing it an analyst is unlikely to absorb new ideas or new patterns for solving problems. So the benefits of code reading go unrealized.
  3. Continuity is threatened. Even if everyone uses the same language, radically different dialects can emerge in the absence of code reading. An analyst may leave behind an R script that no one else quite understands (see the sketch after this list).

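To make the “dialects” point concrete, here is a minimal, hypothetical sketch: two analysts produce the same summary in R in styles that look nothing alike. The data frame and column names are invented purely for illustration.

```r
# A hypothetical illustration of coding "dialects": two analysts compute the
# same summary (mean score by school) in very different styles.
# The data frame `scores` and its columns are invented for this sketch.
scores <- data.frame(
  school = c("A", "A", "B", "B"),
  score  = c(78, 85, 91, 88)
)

# Analyst 1: base R, formula interface
by_school_base <- aggregate(score ~ school, data = scores, FUN = mean)

# Analyst 2: tidyverse pipeline
library(dplyr)
by_school_dplyr <- scores %>%
  group_by(school) %>%
  summarise(mean_score = mean(score))
```

Neither style is wrong. The problem is that, without a habit of reading each other’s code, the team never converges on a shared style, and each script stays obvious to its author and opaque to everyone else.
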
Reading code is a great way to sustain professional development even when budgets are tight. Most analysts are trained in a solo environment — individual projects in graduate courses. So building the habit of reading code is a management challenge on three fronts: creating the expectation of reading code, protecting time to read code, and getting the tools that make code sharing easy. Teams that do it, though, gain the benefit of more standard code, more sustainable professional development, and a more collegial workplace.

Recruit one “superstar” and give them free rein

Speaking of collegial — here’s the number one way to keep your team from developing. I am not the first one to write about this, and I won’t be the last. This is a real problem in the way software (and data analysis is now software) has historically been designed, and in the ways people think software should be designed.

It’s tempting to identify one person with great analytic skills and put your hardest challenges to them. It’s hard to find, recruit, and train analytic talent, so when you think your team has a superstar it may feel like you have to maximize their output. But here’s why it won’t work:

  1. What superstar? Most organizations are bad at identifying analytic superstars. Too often the “superstar” label goes to the person who most assertively takes it. But, the most confident analyst may not be the best. The best programmers I’ve ever worked with were also the most careful, cautious, and unassuming people you’d ever meet. But their code always opened my eyes.
  2. Toxic culture. You might get lucky, and the “superstar” you identify is magnanimous in their role and uses it for good. But it is equally, if not more, likely that (a) you’ve misidentified the superstar on your team and alienated your best analyst, and (b) team culture and discussions are now dominated by a single person — cutting down on the debate, collaboration, and professional growth the rest of the team needs.
  3. Fragility. What happens if your superstar leaves?

You want great talent on your teams and you want to create room for that talent to be put to good use. But, the superstar team model, while easier on you as a manager in the short run, creates serious risks and probably won’t last.

Leave analysts to self-direct their learning

Data nerds, generally, have already done a lot of self-teaching and self-learning. Learning the tools of the trade, and keeping up to date on them, is often part of the job and a necessary skill. But turning over all responsibility for guiding analysts’ professional development to the analysts themselves is damaging for four reasons:

  1. It’s not for everyone. Some analysts don’t want to manage their learning and their work. Or they aren’t sure what is best for themselves or the organization. Or they don’t feel comfortable asking for time to learn.
  2. It’s a strategic miss. Analysts may choose to learn things that help them professionally, but only marginally benefit the organization. This makes professional development function more like a perk than a strategic investment. Ideally, professional learning is linked to the priorities and workload of the agency.
  3. It’s inefficient. Just as each analyst using a different language creates silos, so does each analyst learning in different directions. Unless you have a staff with the ability and time to train each other, the whole will be less than the sum of its parts.
  4. It slows mastery. Too often, self-directed learning isn’t disciplined about mastering what analysts actually need and is instead directed toward trying new things that seem interesting and fun — trust me on this, I’m as guilty of it as anyone.

Switching away from self-directed learning is hard! It requires thinking carefully about the work analysts will do and holding firm in directing professional learning toward mastery of the tasks that work requires.

Exclude subject matter experts

This is a big one. Data reign supreme in agencies, and the mantra of “what gets measured gets done” gives analysts an outsized voice in policy discussions. Sometimes teams forget, or choose not, to work with experts who use other types of evidence — qualitative data, experience, direct feedback. Ignoring practitioners and subject matter experts when doing a data analysis may feel more efficient in terms of output — think of all the Tableau dashboards you can create if you’re not in meetings! But in the long run it keeps data analysis teams from achieving their goal: to aid decision-makers in setting policy and practice by using data. This is because it:

  1. Creates blind spots. There are other ways of knowing things than using statistics to count, tabulate, and compare them. Data cannot capture all the important aspects of something as complex as education or policing. So excluding knowledge drawn from other methodologies, or from practitioners’ experience, creates dangerous blind spots.
  2. Damages credibility. Unless the chief of your organization is a supercomputer, there are probably lots of ways they receive information that are not grounded in quantitative data. If the data don’t line up with what others see, and those others weren’t involved in the work, they will resist your recommendations and challenge your results. Instead of making allies you’ll be creating opposition.

This is a topic I’ve thought about a lot, and I’ve written elsewhere about the antidemocratic threat that excluding other forms of knowledge poses to society at large. I feel passionately about it. But even if you’re not worried on a global scale, these are serious risks to your organization that will catch up to you.

Instead, work to create the time and space for your analysts to get out and talk to practitioners and spend time collaborating with subject matter experts. In the long run you’ll get better answers and you’ll be building an organization that learns together.

Keep it internal

Organizations invest a lot in their data teams — infrastructure, recruitment, software tools. But, your team needs a steady flow of fresh ideas, new perspectives, and sharing of solutions too. Sometimes an analyst becomes a manager and has the time and energy to do this. But most often, analysts are left to read blog posts and watch YouTube videos and talk together to find out what is best. This underinvestment will keep your staff from growing their skills and will slow the adoption of, you guessed it, best practices.

The best organizations do the internal work but also periodically bring in outside expertise to coach them on what they are doing well, identify where they could improve, and bring a fresh perspective. Great teams are those that are not afraid of being vulnerable, showing their work, and sitting down to think about ways they can improve.

Agencies often think this type of training and review must be too expensive, but there is a growing number of experts available to do just this kind of work for your agency: experts who will work with you to identify the training your team can benefit from the most, train your staff, co-develop solutions to your toughest problems alongside your staff, and provide ongoing coaching as you and your team implement your projects.

If you think your team is ready to bring in some outside help to take steps away from these worst practices, and to really institutionalize the best practices you have in place, get in touch. This is just the kind of training, coaching, and application co-development that Civilytics will be happy to help you out with.