The #1 Skill That Holds Back (Most) Data Scientists | by Shaw Talebi
Back in grad school, the Physics department hosted weekly colloquia, where guest speakers would come to present their research. The typical story was that most of the audience understood the first slide (title) and maybe the second (agenda) but got lost after that.
The same happens in data science when non-technical stakeholders sit through presentations from (most) data scientists. “It made sense until you started talking about train-test splits and AUCs,” they might say.
While this might seem like an unavoidable reality of data science, I’ve learned that explaining these topics more effectively is not only possible but essential for advancing a data science career.
Here, I share the key communication tips I’ve used to get promoted, land clients, and explain AI to 100k+ people.
The importance of communication may surprise some and meet resistance from others. So, allow me to explain this a little more.
Data scientists don’t typically solve their own problems; rather, they solve other people’s problems (i.e., those of their stakeholders). This is how data scientists generate value in a business context.
Therefore, the amount of value a data scientist provides is directly proportional to how effectively they can collaborate with their non-technical stakeholders. To put it plainly, if stakeholders don’t understand and adopt your solution, it provides zero value.
Some might think that communication is one of those skills you either have or don’t have. This, of course, is false. Communication (like any other skill) can and must be developed through practice.
For instance, I started this journey as an overly technical physics grad student, but after 5 years of actively giving presentations, writing articles, making YouTube videos, hosting events, interviewing entrepreneurs, and doing technical consultations, I now get praised (and paid) for my communication skills. If I can do it, you can too.
The following are the communication tips I use most often. Although I’m focusing on technical presentations here, these tips broadly apply to conversations, writing, and beyond.
An upside of developing this skill as a data scientist is that the bar is so low that even becoming a decent communicator can put you ahead of most of your peers (I am living proof of that).
The most powerful way to communicate is through storytelling. Our brains are wired for stories [1]. So, the more you can use them, the better.
When I say “story,” you might think of the textbook definition, i.e., an account of imaginary or real people and events [2]. However, I mean it in a broader sense, which I picked up from the book The Storyteller’s Secret [1].
There, the author defines a story as any 3-part narrative. Some examples of this are:
- Status quo. Problem. Solution.
- What? Why? How?
- What? So what? What now?
Here’s what the first example above looks like in action.
AI has taken the business world by storm (status quo). While its potential is clear, translating it into specific value-generating business applications remains a challenge (problem). Here I discuss 5 AI success stories to help guide its development in your business (solution).
Data science is full of abstract ideas that bear little resemblance to our daily lives (e.g., features, overfitting, curse of dimensionality). A powerful way to make these abstract ideas relatable is through specific examples.
Let’s demonstrate the power of examples by example. Suppose a stakeholder asks you, “What’s a feature?”
Your instinct might be to give a definition, i.e., “Features are what we use to make predictions.” However, this is a pretty vague statement.
A simple way to clarify this is by following up the general definition with a specific example like, “For example, the features in our customer churn model are Account Age and Number of Logins in Past 90 Days.”
Although examples are powerful, sometimes they don’t get the job done. This is where analogies come in. Analogies are powerful because they map the familiar to the unfamiliar.
For instance, the other day, I found myself explaining Mechanistic Interpretability to a non-technical client. This is a big, scary term (even for data scientists), so here’s how I explained it.
Modern AIs like ChatGPT are powerful, but we don’t really know how they work under the hood. The idea with Mechanistic Interpretability is to look under the hood to find out what different parts of the model do.
By comparing an LLM (unfamiliar) to a car engine (familiar), this abstract concept becomes much more digestible.
In a sea of ideas and words, numbers tend to stand out. This makes them an effective way to convey information.
For example, I’m using this technique to structure the 7 communication tips in this article. However, this goes beyond the typical internet listicle you might see.
Another way to use numbered lists is when making multiple points in the flow of communication. For example, I want to make 2 points here: 1) numbers stand out to us, and 2) they provide a clear way to structure information.
The reason (IMO) this works so well is because numbers like 1, 2, 3, etc., are such basic and familiar concepts that they require little cognitive effort to process.
“I didn’t have time to write a short letter, so I wrote a long one instead.” (often attributed to Mark Twain)
This is the most fundamental principle of effective communication. Your audience only has a finite amount of attention to give you. Therefore, as communicators, we need to be economical when spending our audience’s attention.
While you might think fewer words mean less time, the opposite is often true. Distilling ideas down to their essentials takes many iterations.
This can mean cutting down the number of slides in a presentation, the number of elements on each slide, and even the number of characters used in the title.
Here are some heuristics I use in a business context:
- Keep talks to 20 min or less (~10 slides or fewer)
- Don’t have more than 3–5 elements per slide
- Make bullet points as short as possible (down to the character)
A corollary of less is more is pictures over words. Our brains work harder to process text than images, so conveying ideas through pictures is an unreasonably effective way to preserve people’s limited attention while still making the point.
Here is the fine-tuning analogy from Tip 3, compared to a visual representation of the same idea.
This highlights the power of data visualizations. Although this topic deserves a dedicated article, it shares the foundational principle of less is more.
This final tip was a game-changer for me. Before, I tended to rush through presentations. This was likely a result of nerves and just trying to get it over with. Eventually, however, I realized that slowing my pace and using a calmer tone made the nerves subside naturally.
Slowing down has the added benefit of improving the audience’s experience. A rushed talk can feel like getting blasted by a fire hose, while a well-paced one is like a soothing stream. Consequently, a short, rushed talk is more painful than a long, well-paced one.
While the tips above can yield quick improvements to one’s communication, their impact will be limited if the communication is not tailored to the audience. This highlights the importance of empathy.
Empathy means seeing things from someone else’s perspective. It is essential for effective communication because it provides the context for framing all aspects of your presentation.
The more you can put yourself in the audience’s shoes, the more effectively you can speak to what they care about and understand.
Most data scientists’ limiting factor is not their technical skills but their ability to communicate effectively. Developing this skill is one of the best ways for data science professionals to advance their careers and make a greater impact.
Here, I shared 7 tips that have been most helpful to me in improving my communication. If you have tips that have helped you, drop them in the comments 🙂