Reclaiming Agile Estimation Principles: Part 1
Begin reading this series to understand how story points emerged as measures of complexity rather than time. Discover the rationale behind relative estimation, and learn how establishing a reliable baseline can strengthen your team’s estimation practices.
Rediscovering the Purpose of Agile Story Points for Scrum Teams
In the early stages of Agile practice, story points emerged as a way to measure complexity rather than time. Many teams today overlook this original intent and treat story points as a proxy for hours or days. That view runs against the early Agile and Scrum practitioner literature that popularized story points. Historical discussions of story points describe them as relative measures of uncertainty, scope, and effort influenced by difficulty and risk. These discussions focus on the nature of complexity rather than the duration of tasks. In their authentic form, story points serve as markers that help teams gauge how far they must travel into uncertain territory, not as timers that count down until a task finishes. Over time, this foundational principle became blurred, and many practitioners lost the anchor that once kept story points aligned with complexity-based thinking.
The Original Intent of Story Points
The Scrum Guide and the Agile Manifesto do not themselves define story points, and neither ties estimates to hours of work. The practice comes from early Agile writing on relative estimation, and the descriptions this series draws on share a consistent emphasis on complexity. In them, story points do not relate directly to hours of work. Instead, they represent a scale of difficulty and unknown factors. Teams start from a simple baseline story that everyone understands, and that baseline provides a stable reference point. More complex stories sit farther away on a conceptual scale, which works like a measure of distance rather than clock time. The central idea is that some features are more complex than others, that complexity creates uncertainty, and that uncertainty makes exact effort hard to predict. The early Agile thought leaders saw this uncertainty as something to embrace. They believed that measuring complexity could help teams focus on refinement and learning. They did not intend for story points to predict how many hours a team member would spend completing a task.
A modern team might find this idea puzzling at first. Consider a new Agile team that reads about story points for the first time. They ask themselves why they cannot just equate a five-point story to five hours. Their confusion seems natural because so many have come to treat story points as hidden time estimates. Yet this thinking reduces the practice to a simplistic conversion. Such a conversion undermines the concept of complexity. A small task might take several hours, yet remain simple because its steps are known and stable. Another task might require fewer hours but demand intense research and experimentation. Trying to force both tasks into a time-based model causes teams to lose sight of what story points actually measure.
The Agile Manifesto and related guidance emphasize adaptability, learning, and responding to change. Complexity-based estimation supports these values. By treating story points as a relative measure of complexity and uncertainty, teams stay flexible. They avoid rigid assumptions about how long something will take. Instead, they acknowledge that some work items may have intricate dependencies or unfamiliar technologies. This acknowledgment allows for improved planning conversations. It encourages a shared language for discussing why a certain story feels more challenging than another. Rather than presenting a hard deadline, story points help the team think about what they do and do not know. This mental shift encourages open discussion and continuous learning.
Understanding Complexity & Time
To better understand the differences between complexity and time, consider a brief anecdote. A new Agile team sits around a table, ready to estimate their first set of user stories. They pick a small, well-understood piece of work as their baseline, agreeing that it is a simple reference story. They assign it a low number of points to represent minimal complexity. Next, they face a more complex story involving unfamiliar APIs and uncertain integration points. Someone asks if this new story should be five points because they think it might take about five hours. The team’s Scrum Master responds with a question: Is this truly about hours, or is it about how uncertain and difficult the work seems compared to the baseline story? After a pause, the team realizes that complexity involves hidden risks and unknowns, not just the passing of time. This realization shifts their perspective. Instead of searching for a direct correlation to hours, they compare the complexity of the two stories as if measuring a longer distance. A story that feels more complex stands further out on the conceptual map, even if no one can say exactly how many hours it will require.
This example highlights the true intent behind story points. They serve as a relative frame of reference. They encourage teams to calibrate their understanding of complexity against a known standard. The resulting estimates do not guarantee precise timelines. Instead, they enable more accurate comparisons between different pieces of work. By reinforcing the concept that story points measure complexity rather than time, teams can restore the original purpose of this practice. They can return to a mindset rooted in Agile values, focus on uncertainty and learning, and set aside the idea that story points must translate into hours. Doing so lays a strong foundation for the posts that follow, where the challenges and misinterpretations surrounding story points will come into sharper focus.
Reference Stories & Baselines
As teams deepen their understanding of complexity-based estimation, it becomes essential to establish a stable point of reference. This reference is commonly a simple, well-understood user story that anchors the relative sizing of all subsequent work. Known as a baseline story, it guides the team’s perception of complexity, ensuring that when they compare other stories, they measure them against a familiar and shared standard. By introducing a baseline, the team no longer relies on abstract ideas of difficulty or uncertain guesswork. Instead, members orient their discussions around a known quantity that everyone understands. This approach prevents the drifting interpretations that arise when teams estimate complexity in a vacuum.
Without a fixed reference, complexity assessments can shift subtly over time. A story that once felt moderate in difficulty may later seem more or less complex as perspectives, technologies, or team compositions change. By grounding estimates in a baseline story, the team holds a conceptual anchor that keeps them aligned. This anchor reduces the likelihood of stories ballooning into inflated figures or collapsing into arbitrary ratings. Instead, it establishes a stable footing for consistent comparisons. Much like how a navigator sets a course by referencing known landmarks, a Scrum team uses a baseline story to maintain a steady interpretation of complexity, sprint after sprint.
Teams that incorporate baseline stories find that their calibration improves. During estimation sessions, when someone proposes that a new feature’s complexity is about equal to the baseline, the rest of the team understands what this means. They share a mental image of that baseline story’s complexity and can judge how the new feature measures up. This collective understanding empowers them to identify subtle differences. Is the new feature slightly simpler, roughly the same, or more complex? By agreeing on a baseline, the team’s dialogue shifts from uncertain speculation toward meaningful relative sizing. Estimation becomes less about guessing hours and more about placing new work along a familiar gradient of complexity. This structured comparison allows the team to maintain coherence and consistency in their estimates, even as they tackle unfamiliar challenges.
Establishing a Reference Story
For this approach to work, the team must carefully choose a baseline story. Ideally, it should be one that everyone knows well—something stable and representative of a manageable level of complexity. Consider a simple login functionality that the team has already built many times. Everyone understands that it involves a few well-defined steps: displaying a login form, validating credentials, and granting access. There are no hidden surprises, no unusual integrations, and minimal uncertainty. Such a story is an ideal candidate for anchoring complexity estimates. If this baseline story feels like a known distance—let us say it “measures” a small number of points on the complexity scale—then other stories can be evaluated by asking how far they lie from this known reference.
A baseline story offers a tangible anchor that connects the abstract concept of complexity to something familiar. Without it, the conversation can become unmoored, drifting into subjective debates that lack a shared frame of reference. By introducing a baseline, the team installs a mental landmark that each member can return to whenever a new story’s complexity comes into question. This approach encourages the group to discuss uncertainty in specific terms. When a new feature appears more involved than the login story—perhaps it integrates with multiple external services—the team sees it as standing farther along the complexity continuum. By calibrating against a known baseline, they maintain accuracy and reduce the risk of gradual estimation drift.
Defining the Team's Baseline
Choosing a suitable baseline depends on simplicity and familiarity. A simple story that everyone understands ensures that the team’s mental models align. For example, a login feature is both simple and stable. It is unlikely to undergo dramatic changes, and its complexity level is well established. Selecting this as the baseline grounds the estimation process. When the team later evaluates a story that retrieves complex data from unfamiliar systems, they ask whether that new story feels more complex than the stable login baseline. The answer clarifies its relative position on the complexity scale.
Why focus so heavily on a known, stable story? Consider that every team member carries their own interpretation of how difficult a story should be. Without a baseline, these interpretations can pull in different directions. One member might overestimate difficulty due to past bad experiences, while another underestimates because of personal expertise. A familiar baseline story levels the field by providing a reference that everyone trusts. Even if there is minor disagreement, the team now debates complexity relative to a shared anchor rather than vague impressions.
However, imagine a more advanced scenario in which a team selects a baseline story that is inherently more complex than a simple login feature. Instead of starting with a low-complexity anchor, they choose a story worth roughly eight points on their complexity scale. Perhaps this baseline involves integrating a payment gateway with multiple third-party APIs, handling intricate error conditions, and conforming to strict security protocols. While not a trivial piece of work, the team finds this scenario representative of complexity they regularly encounter. Each member understands that this baseline includes a tapestry of interactions and unknowns. Everyone agrees that this story, with its extensive integrations and potential failure points, sets a clear standard for what “significantly complex” means.
Later, when they pick up a story that requires adding a new filter option to an existing search interface, they naturally ask how complex it feels compared to their eight-point baseline. They see that the new story does not involve external services or complicated security logic. It might still require a few database queries and a small user interface adjustment, but it stands far below the complexity threshold set by their baseline. Compared to the payment gateway integration, this simpler feature resembles a slight bump in the road rather than sprawling, uncertain terrain. After a brief discussion, the team recognizes that relative to their eight-point reference, this new story might rate only a two or three: it has fewer unknowns and involves less intricate work. Because they started from a well-understood baseline of higher complexity, sizing simpler stories becomes more intuitive. The team estimates each new story by mentally stepping down from the demanding baseline they already agreed upon, which aligns everyone’s understanding around a shared measure of difficulty.
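As a rough illustration of stepping down from an eight-point baseline, the short Python sketch below snaps a team’s relative judgment to the nearest value on a point scale. It is only a sketch under stated assumptions: the modified Fibonacci scale, the Story type, the size_against_baseline function, and the 0.3 ratio are all hypothetical and do not come from any Scrum or Agile source. The ratio stands in for the team’s shared sense of how complex the search filter feels next to the payment gateway baseline. It is not a conversion to hours.

```python
from dataclasses import dataclass

# Assumed point scale: a modified Fibonacci sequence, a common convention
# that this article does not prescribe.
POINT_SCALE = (1, 2, 3, 5, 8, 13, 21)


@dataclass(frozen=True)
class Story:
    title: str
    points: int


def size_against_baseline(baseline: Story, relative_complexity: float) -> int:
    """Return the scale value closest to baseline.points * relative_complexity.

    relative_complexity is the team's judgment of how complex the new story
    feels compared to the baseline (1.0 means "about the same"). It is a
    hypothetical input for illustration, not a measure of hours.
    """
    if relative_complexity <= 0:
        raise ValueError("relative_complexity must be positive")
    target = baseline.points * relative_complexity
    # Pick the nearest scale value; ties resolve to the smaller value because
    # min() returns the first minimum and the scale is ascending.
    return min(POINT_SCALE, key=lambda p: abs(p - target))


baseline = Story("Payment gateway integration", 8)
# The team judges the search filter to be roughly a third as complex.
search_filter_points = size_against_baseline(baseline, 0.3)
print(search_filter_points)  # prints 2
```

The arithmetic matters less than the habit it encodes: every estimate starts from the same agreed anchor, and the result lands on a coarse scale rather than implying precision the team does not have.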
It is essential, however, that the team commits to a single anchor for its comparisons. Whichever anchor the team chooses, whether a simple baseline or a more demanding one, it must fit the team’s unique attributes, such as the complexity of the product domain, the maturity of the technical ecosystem, and the collective experience of the team’s members. By selecting a single anchor that reflects these factors, the team ensures that its point estimates remain coherent and relevant over time.
For further insight into the foundational materials and thought leadership that shaped the concepts in this series, readers may consult the official Scrum Guide at https://scrumguides.org/, which defines the framework’s core elements and principles. The Agile Manifesto, available at https://agilemanifesto.org/, offers a concise statement of values and principles guiding Agile methods. Additional perspective on complexity-based estimation can be found through organizations like the Scrum Alliance at https://www.scrumalliance.org/ and expert commentary in blogs and articles, such as those by Mike Cohn at https://www.mountaingoatsoftware.com/blog/.
Conclusion to Part 1
Establishing and using a baseline story sets the stage for keeping complexity-based estimation consistent over time. By anchoring all judgments to a familiar, stable reference, the team avoids the pitfalls of drifting interpretations. This consistency is crucial for preserving the integrity of story point estimation. Future posts will explore what happens when teams fail to maintain these anchors, and how the absence of strong baselines opens the door to misinterpretation. For now, with a baseline in place, the team takes a step toward ensuring that complexity remains at the heart of their estimation practices.