Case Study · Measurement

Using UX measurement to prioritize workflow improvements

Method: SUS + feedback cycles
Featured study: CCMS development pilot
Window: Nov 2023–Jul 2025
Scale: Up to 40 respondents

A structured UX measurement program helped a platform team prioritize improvements across a tangled authoring ecosystem. It turned scattered feedback into a baseline for deciding what to fix, what to validate, and where automation might help next.

The problem

The platform team supported a loosely connected authoring ecosystem that had grown over more than a decade. Writers depended on the tools every day, but the workflow was difficult to navigate: tools had been added, workarounds had become permanent, and technical debt had accumulated as urgent fixes were layered on top of older process gaps.

The team knew users were frustrated, but the signals were scattered across tickets, Slack questions, support conversations, and anecdotal feedback. What was missing was a structured way to decide what to fix first, identify low-hanging fruit, and prove whether development and workflow improvements were making the experience better.

The approach

The approach combined standardized usability scoring, structured feedback, and repeated evaluation cycles. Measurement was not the end goal. It was the mechanism for deciding where to focus, validating whether changes helped, and showing the value of improvements over time.

The program could be reused across related workflow studies, including CCMS development cycles, article publishing, and other authoring and publishing workflows.

1. Establish a baseline: Use the System Usability Scale (SUS) to create a consistent starting point for comparison across cycles.
2. Capture context: Collect direct user feedback to understand why users scored the experience the way they did, not just what the score was.
3. Identify priority improvements: Use quantitative scores alongside qualitative feedback to separate recurring friction from one-off complaints, and focus changes where they would matter most.
4. Repeat across cycles to measure change: Run multiple evaluation rounds to track whether workflow changes moved the experience, and distinguish genuine improvement from sample or population effects.
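For readers unfamiliar with how a SUS baseline is produced, the following is a minimal sketch of the standard SUS scoring rule: ten items answered on a 1 to 5 scale, odd items contributing (answer - 1), even items contributing (5 - answer), and the sum multiplied by 2.5. The function names and the sample answers are hypothetical illustrations, not the team's actual tooling or study data.

```python
from statistics import mean


def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1-5) on the 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += answer - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - answer
    return total * 2.5


def cycle_score(all_responses):
    """Average the per-respondent SUS scores for one evaluation cycle."""
    return mean(sus_score(r) for r in all_responses)


# Hypothetical answers for illustration only (not data from the study).
example_cycle = [
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # neutral on every item
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],  # most favorable possible answers
    [2, 4, 2, 4, 2, 4, 2, 4, 2, 4],  # mostly unfavorable answers
]

if __name__ == "__main__":
    for answers in example_cycle:
        print(sus_score(answers))
    print(round(cycle_score(example_cycle), 1))
```

Scoring every cycle with the same rule is what makes the per-cycle averages comparable over time.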

What the CCMS pilot data showed

The clearest longitudinal data came from the CCMS development pilot, where SUS scores were tracked across four cycles from November 2023 to July 2025.

Key finding: In the CCMS pilot, repeated measurement showed improvement, then revealed why the later dip was not simple regression: newer users were struggling with the broader toolchain.

SUS scores improved from 39.0 in cycle 1 to 57.0 in cycle 3, a 46% increase. In cycle 4, the score dropped to 48.9 as the sample expanded from 14 to 40 respondents. That dip was not a straightforward regression: the broader sample included newer users who rated the experience much lower. Users with more than six months in the system averaged 68.3; those with less tenure averaged 47.3. The program made this nuance visible, distinguishing a population effect from a performance decline.
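The arithmetic behind these figures can be sketched as follows. The cycle 1 and cycle 3 scores come from the text above; the per-respondent records, the function names, and the representation of tenure in months are hypothetical and only illustrate how splitting a sample by tenure can expose a population effect.

```python
from statistics import mean


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


def mean_by_tenure(records, threshold_months=6):
    """Split (tenure_months, sus_score) records at a tenure threshold.

    Returns the overall mean plus the mean for each segment, or None
    for a segment with no respondents.
    """
    records = list(records)
    experienced = [score for months, score in records if months > threshold_months]
    newer = [score for months, score in records if months <= threshold_months]
    return {
        "overall": mean(score for _, score in records),
        "experienced": mean(experienced) if experienced else None,
        "newer": mean(newer) if newer else None,
    }


# Figures reported in the case study: cycle 1 -> cycle 3.
improvement = percent_change(39.0, 57.0)

# Hypothetical respondent records for illustration only (not study data).
sample = [(12, 70.0), (2, 45.0), (1, 50.0), (8, 65.0)]
segments = mean_by_tenure(sample)

if __name__ == "__main__":
    print(f"Cycle 1 to cycle 3: {improvement:.1f}% increase")
    print(segments)
```

When a later cycle adds many low-tenure respondents, the overall mean can fall even if neither segment's own average has declined, which is why the split is worth reporting alongside the headline score.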

What became visible

Once measurement was consistent and repeated, patterns emerged that hadn't been visible through anecdotal feedback. System performance — speed and reliability — had a clear effect on user perception and drove score shifts between cycles more than any single feature change.

Task-level data provided a parallel signal. Compared with the incumbent tool, the new system showed an 86% task attempt rate with 100% completion on attempted tasks. The incumbent's higher attempt rate masked a lower overall success rate: in the new system, users tended to skip tasks they didn't know how to complete rather than attempt them and fail. Task satisfaction ratings also trended upward across the evaluation period, indicating directional improvement even while overall SUS scores remained in the lower range.
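One way to read the attempt and completion figures together is as a product: overall success is the share of tasks attempted multiplied by the share of attempts completed. The sketch below assumes that simple model. The new-system inputs come from the text; the incumbent inputs are hypothetical placeholders, since the study's incumbent figures are not given here.

```python
def overall_success_rate(attempt_rate, completion_rate_on_attempts):
    """Share of all tasks that ended in success, given rates in [0, 1]."""
    for rate in (attempt_rate, completion_rate_on_attempts):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return attempt_rate * completion_rate_on_attempts


# New system, as reported: 86% attempted, 100% of attempts completed.
new_system = overall_success_rate(0.86, 1.00)

# HYPOTHETICAL incumbent values, chosen only to show how a higher
# attempt rate can coexist with a lower overall success rate.
incumbent_example = overall_success_rate(0.95, 0.80)

if __name__ == "__main__":
    print(f"New system overall success: {new_system:.0%}")
    print(f"Incumbent (hypothetical) overall success: {incumbent_example:.0%}")
```

Under this model, comparing attempt rates alone would be misleading; the combined rate is the fairer basis for comparing two tools.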

The work established a repeatable basis for future evaluation: a consistent scoring method, a feedback collection process, and a cycle cadence that any team could run forward. Without that structure, the next round of improvements would have no baseline to measure against.

The impact

The program gave the platform team a practical way to prioritize development work across the authoring ecosystem. Instead of relying only on tickets, anecdotes, or the loudest pain points, the team could identify recurring friction, connect user feedback to platform changes, and show whether improvements were moving the experience in the right direction.

It also made a deeper issue visible: experienced users had learned how to navigate the toolchain, while newer users experienced the same workflow as significantly harder to understand. That distinction helped separate general usability problems from onboarding, workflow-complexity, and toolchain-friction issues — and pointed toward different interventions for each.

Structured workflow evidence like this matters beyond any single platform project. Before introducing automation or AI assistance into a content workflow, teams need to understand what users are actually trying to do, where effort accumulates, and which interventions would reduce friction rather than add to it. Measurement is how you build that picture.
