How data teams can get better at estimating lead times


Hi there Reader,

It's been a couple of months since I last emailed you about a data-related topic. As a reminder, you signed up to this newsletter either via one of my blog posts on data or via a Big Book of R update post.

I had a great question from an analyst that spurred me to write this latest one (shared below in full).


To get better at estimating lead times, you need to change how you provide your estimates. You need to identify the key risks and figure out how to test them early.

Estimating how long it takes to deliver an analysis can be, and often is, very difficult to do. This is especially true where the projects are complex, novel and ad-hoc.

I cover this topic in my book Project Management Fundamentals for Data Analysts (which you should have a copy of, but if you don't, you can get it here for free as a newsletter subscriber :) ). In short, the total time it takes to complete a task is made up of both the task duration and the lead time.

An analyst reached out to me and asked if I had any tips for getting better at estimating lead times, given that longer-term projects can run into all sorts of roadblocks. Getting access to the data, data quality issues, weird coding bugs, interpreting results – just a few of the unknowns that can crop up.

The usual way you provide estimates (and how people usually want them) is for you to give an on-the-spot timeline for how long it will take to do something. Maybe you’re able to ask a few more questions but you’re still left with a lot of uncertainty.

The way to change this is to think about the key assumptions you’re being asked to make and how you can test them early. For example, let’s say you’re asked to do an analysis using a datasource you’ve never worked with and you’ll likely need to use a modelling technique you’re unfamiliar with.

Before kicking off a “one month” analysis (that turns into a 6-month-long project), it would be worth taking a couple of days to assess the data quality and research the modelling technique.

Is the data standardized, documented and up-to-date? Is the modelling technique widely used, with a ton of reference material, many elements you're already familiar with, and loads of Stack Overflow solutions and examples?
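Part of that "couple of days" assessment can be timeboxed into a short profiling script. Here's a minimal sketch in Python with pandas – the dataset, column names and dates are all invented for illustration, and in practice you'd point the same checks at the real datasource:

```python
import pandas as pd

# A toy dataset standing in for the unfamiliar datasource
# (all column names and values here are purely illustrative).
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4, None],
    "amount": [100.0, 250.0, 250.0, 80.0, 60.0],
    "record_date": pd.to_datetime(
        ["2020-01-15", "2020-03-02", "2020-03-02", "2019-11-20", "2020-02-10"]
    ),
})

# 1. Missingness: what share of each column is null?
missing_share = df.isna().mean()

# 2. Duplicates: how many fully repeated rows?
n_duplicates = df.duplicated().sum()

# 3. Recency: how stale is the newest record?
#    (fixed "today" used here to keep the example deterministic)
days_stale = (pd.Timestamp("2024-06-01") - df["record_date"].max()).days

print(missing_share)
print(n_duplicates)
print(days_stale)
```

A few minutes with output like this tells you whether you're looking at a tidy, current table or a patchwork of stale, duplicated records – exactly the signal you need before committing to a timeline.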

The potential roadblocks now pose very little risk, and you can give a more confident estimate of a shorter time frame.

What if the opposite happens? Is the data quality all over the place? Does it look like it's being generated by disparate systems? Are the latest records very old? Are there a lot of confusing variables and no documentation?

And the modelling technique you’ve found is not great for the type of data you have, even if it were cleaned? You can only find obscure references to it and the examples are all in software you’re not at all familiar with?

The major roadblocks are now very clear and you can more confidently estimate a much longer timeline. You can think of the steps you’ll need to overcome them and what alternatives there may be, ideally in consultation with the stakeholder.

Identify the key risks, test them early.

> Bookmark the blog post

Thanks and have a great rest of your week,

Oscar.

Oscar Baruffa

I'm an educator, blogger, and coach who loves to talk about business & entrepreneurship. Subscribe and join over 1,000 newsletter readers every week!
