Case study

Evaluating AI-assisted authoring in a real enterprise environment

I led a pilot to evaluate Oxygen Positron — an AI-assisted authoring add-on for Oxygen XML Editor — within a structured DITA content environment. The work went beyond testing a tool: I designed prompt-based workflows, built an evaluation framework, identified real operational constraints, and shaped a strategic decision about where AI fit in our authoring ecosystem.

Focus: AI-assisted authoring, prompt design, tool evaluation
Environment: Oxygen XML Editor, DITA, enterprise content system
Approach: Structured pilot with prompt workflows and evaluation criteria
Outcome: Informed strategic direction for AI tooling adoption

The problem

Technical writers were under mounting pressure from frequent release cycles while working in structured DITA environments with strict content requirements. There was plenty of noise around AI, but very little of it was practical for writers working at the document level.

The question wasn't whether AI could generate text. It was whether AI could produce structured, valid, insertable content that actually fit into an enterprise authoring workflow — and whether that was sustainable at scale.

What we were trying to solve

  • Writers under pressure from rapid release cycles
  • DITA requirements that generic AI tools couldn't meet
  • No integrated AI tooling designed for structured content
  • Unclear whether AI assistance was viable or just promising

What I actually built

Rather than simply testing the tool, I built a small system around it — designing repeatable prompt workflows, establishing evaluation criteria, and collecting structured feedback from pilot participants.

Prompt workflow design

  • Structured prompts for summarization, short descriptions, and DITA transformation
  • Designed for repeatability, not one-off use
  • Prompt as a product: designed, versioned, tested (see the sketch below)
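
To make "prompt as a product" concrete, here is a minimal sketch of how a versioned prompt definition could be modeled. The field names, version string, and prompt wording are illustrative assumptions, not the pilot's actual library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptSpec:
    """A prompt treated as a versioned, testable artifact (illustrative)."""
    name: str         # stable identifier, e.g. "short_description"
    version: str      # bumped whenever the wording or target model changes
    model: str        # the model this version was validated against
    template: str     # prompt text with placeholders
    acceptance: str   # what a passing output must look like

# Hypothetical entry for a DITA short-description workflow
SHORT_DESCRIPTION = PromptSpec(
    name="short_description",
    version="1.2.0",
    model="gpt-4",
    template=(
        "Summarize the following DITA topic in one plain-text sentence "
        "suitable for a <shortdesc> element:\n\n{topic_text}"
    ),
    acceptance="one sentence, no markup, under 25 words",
)

def render(spec: PromptSpec, **values: str) -> str:
    """Fill the template's placeholders to produce the final prompt."""
    return spec.template.format(**values)
```

Pinning the wording, the target model, and the acceptance criteria together in one artifact is what makes regression checks possible when the underlying model changes.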

Evaluation framework

  • Multiple prompts tested against structured criteria
  • Scoring across correctness, usability, and DITA validity
  • Feedback collected systematically from pilot participants (scoring sketch below)
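
A minimal sketch of the scoring step. The criteria match the bullets above; the 1-5 scale and the example ratings are assumptions, not the pilot's exact rubric.

```python
from statistics import mean

# The three criteria named above; the 1-5 rating scale is an assumption.
CRITERIA = ("correctness", "usability", "dita_validity")

def score_prompt(ratings: list[dict[str, int]]) -> dict[str, float]:
    """Average participant ratings per criterion, plus an overall mean."""
    scores = {c: mean(r[c] for r in ratings) for c in CRITERIA}
    scores["overall"] = mean(scores.values())
    return scores

# Hypothetical ratings from three pilot participants for one prompt
ratings = [
    {"correctness": 5, "usability": 4, "dita_validity": 5},
    {"correctness": 4, "usability": 4, "dita_validity": 5},
    {"correctness": 5, "usability": 3, "dita_validity": 4},
]
print(score_prompt(ratings))
# e.g. {'correctness': 4.67, 'usability': 3.67, 'dita_validity': 4.67,
#       'overall': 4.33} (rounded)
```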

Operational assessment

  • Mapped real-world constraints beyond capability testing
  • Evaluated licensing, maintenance, and governance requirements
  • Identified what "scaling this" would actually require
[Screenshot] AI Positron Assistant inside Oxygen XML Editor — the Source Analysis and Editorial Pass workflows were designed and tested as repeatable prompt sets against structured DITA content.

[Screenshot] The full Positron options panel — a broader set of capabilities explored during the pilot, spanning content generation, rewriting, review, and translation workflows.

What the pilot uncovered

The pilot produced real findings — not just about Positron, but about what AI-assisted authoring requires at an organizational level.

License dependency

Positron required Oxygen Enterprise licenses, creating a cost barrier before any writer could participate.

Model instability

Underlying model updates broke existing prompts, introducing silent regressions with no clear ownership path.
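
This is exactly the failure a prompt regression suite would need to surface. A sketch of what one check could look like; `generate()` is a hypothetical stand-in for whatever completion API the tooling wraps, and the acceptance checks are illustrative.

```python
def generate(prompt: str, model: str) -> str:
    """Hypothetical stand-in for the completion API the tooling wraps."""
    raise NotImplementedError("wire this to the real completion endpoint")

def check_short_description(output: str) -> list[str]:
    """Acceptance checks pinned alongside the prompt's version."""
    problems = []
    if "<" in output or ">" in output:
        problems.append("contains markup; <shortdesc> text must be plain")
    if len(output.split()) > 25:
        problems.append("longer than 25 words")
    return problems

def regression_test(prompt: str, model: str) -> None:
    """Re-run a pinned prompt after a model update; fail loudly
    instead of letting the output drift silently."""
    problems = check_short_description(generate(prompt, model))
    if problems:
        raise AssertionError(f"prompt regressed on {model}: {problems}")
```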

Proxy incompatibility

Our internal AI proxy broke Positron's built-in functionality, resulting in 404 errors and missing features.

Prompt maintenance burden

Every prompt required design, testing, and ongoing maintenance. No governance model existed for managing a prompt library at scale.
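
For illustration, the minimum a governance record per prompt would need to capture: an owner, a review cadence, and the model the prompt was last validated against. Every field and value below is hypothetical.

```python
from datetime import date, timedelta

# Hypothetical manifest entry; without something like this,
# a broken prompt has no ownership path.
PROMPT_MANIFEST = {
    "short_description": {
        "owner": "docs-tools",             # who fixes it when it breaks
        "version": "1.2.0",                # pinned prompt wording
        "last_validated_model": "gpt-4",   # model it was tested against
        "last_reviewed": "2024-01-15",     # ISO date of last review
        "review_cadence_days": 90,         # scheduled re-evaluation
    },
}

def needs_review(entry: dict, today: date) -> bool:
    """Flag prompts whose scheduled re-evaluation is overdue."""
    last = date.fromisoformat(entry["last_reviewed"])
    return today - last > timedelta(days=entry["review_cadence_days"])
```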

[Chart] Average prompt scores by workflow type: grammar correction (4.66) and short descriptions (4.56) scored highest; generating questions (3.78) scored lowest. Formatting and editorial prompts outperformed generative tasks (outlines, questions), shaping the scope of what was recommended for adoption.

The strategic decision

The pilot revealed that AI-assisted authoring at enterprise scale requires sustained investment: in prompt governance, tooling maintenance, evaluation infrastructure, and organizational readiness. Given the pace of change in AI and competing priorities, the decision was made to defer full Positron adoption in favor of a lighter-weight approach — a custom AI chat interface built directly into our CMS, with lower overhead and tighter control.

Pilot phase: Prompt workflows designed and tested with structured evaluation criteria
Findings: Capability confirmed — but operational sustainability required resources the organization wasn't positioned to commit
Decision: Defer Positron rollout. Build lightweight AI tooling inside the CMS with lower overhead and clearer ownership.

What this revealed about AI in structured environments

Prompts are products

Each prompt requires design, testing, and ongoing maintenance. Scaling prompt-driven workflows means accepting a real operational cost — one that needs ownership and governance to survive.

Structured content changes the rules

In DITA, "good text" is not enough. Output must be valid, insertable, and context-aware. AI tools built for general writing struggle at this boundary without significant scaffolding.
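
One way to make "valid and insertable" enforceable is a validation gate between the model and the document. A minimal sketch using lxml; the allowed-element set is illustrative, since real enforcement would validate against the DITA DTD or schema.

```python
from lxml import etree

# Elements the insertion point accepts -- illustrative, not a full
# DITA content model.
ALLOWED_AT_INSERTION_POINT = {"p", "ul", "note", "shortdesc"}

def is_insertable(ai_output: str) -> bool:
    """Reject AI output that is not well-formed XML or not allowed
    at the insertion point. 'Good text' alone never passes this gate."""
    try:
        root = etree.fromstring(ai_output.encode("utf-8"))
    except etree.XMLSyntaxError:
        return False  # not even well-formed XML
    return root.tag in ALLOWED_AT_INSERTION_POINT

assert is_insertable("<p>Back up the database before upgrading.</p>")
assert not is_insertable("Back up the database before upgrading.")
assert not is_insertable("<h1>Not a DITA element</h1>")
```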

The gap is in the workflow

The capability existed. The system around it didn't. Without evaluation infrastructure, governance, and tooling stability, even strong AI output cannot be reliably operationalized.

The question was never whether AI could write. It was whether the organization was ready to maintain the system that makes AI output trustworthy and repeatable.