Expand Your Data Science Toolkit with Our Latest Math and Stats Must-Reads | by TDS Editors | Apr, 2024

April 27, 2024


Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.

The fundamental principles of math that data scientists use in their day-to-day work may have been around for centuries, but that doesn’t mean we should approach the topic as if we only learn it once and then store away our knowledge in some dusty mental attic. Practical approaches, tools, and use cases evolve all the time, and with them comes the need to stay up to date.

This week, we’re thrilled to share a strong lineup of recent math and stats must-reads, covering a wide range of questions and applications. From leveraging (very) small datasets to presenting linear regressions in accessible, engaging ways, we’re sure you’ll find something new and useful to explore. Let’s dive in!

N-of-1 Trials and Analyzing Your Own Fitness Data
The idea behind N-of-1 studies is that you can draw meaningful insights even when the data you’re using is based on input from a single person. It has far-reaching potential for designing individualized healthcare strategies, or, in the case of Merete Lutz’s fascinating project, establishing meaningful connections between alcohol consumption and sleep quality.

How Reliable Are Your Time Series Forecasts, Really?
Making long-term predictions is easy; making accurate long-term predictions is, well, less so. Bradley Stephen Shaw recently shared a useful guide to help you determine the reliability horizon of your forecasts through the effective use of cross-validation, visualization, and statistical hypothesis testing.

Building a Math Application with LangChain Agents
Despite the major strides LLMs have made in the past couple of years, math remains an area they struggle with. In her latest hands-on tutorial, Tahreem Rasul unpacks the challenges we face when we try to make these models execute mathematical and statistical operations, and outlines a solution for building an LLM-based math app using LangChain agents, OpenAI, and Chainlit.

A Proof of the Central Limit Theorem
It’s always a joy to see an abstract concept take concrete shape and, along the way, become much more accessible and intuitive for learners. That’s precisely what Sachin Date accomplishes in his latest deep dive, which shows us the inner workings of the central limit theorem, “one of the most far-reaching and delightful theorems in statistical science,” through the example of… candy!

8 Plots for Explaining Linear Regression to a Layman
Even if you, a professional data scientist or ML engineer, fully grasp the implications of your statistical analyses, chances are many of your colleagues and other stakeholders won’t. This is where strong visualizations can make a major difference, as Conor O’Sullivan demonstrates with eight different residual, weight, effect, and SHAP plots that explain linear regression models effectively.
