CASE STUDY · BANCO GALICIA · 2026

Turning the voice of the customer into an actionable backlog

Design and deployment of a monthly system that reads hundreds of CSAT comments, classifies them with a proprietary AI-assisted taxonomy, and delivers them as a prioritized backlog to each bank squad. The project became the foundation of a new Digital Vision squad.

THE PROBLEM

Hundreds of CSAT comments each month were lost in spreadsheets and ad-hoc reports, with no systematic way to turn them into action.

THE SOLUTION

A Product × Issue taxonomy, a reproducible monthly process with AI assistance, and a monthly report with a squad-prioritized backlog.

THE IMPACT

Methodology adopted bank-wide, trend tracking that detects early signals, and the foundation of the new Digital Vision squad.

THE PROBLEM

Thousands of voices, zero synthesis

Banco Galicia receives hundreds of open-text comments each month in its CSAT surveys: raw text where customers describe what's happening to them, what frustrates them, what they're missing. This material had everything needed to properly prioritize the product roadmap, but it lived scattered, without thematic classification, without an explicit link to the satisfaction score, and with no way to turn it into action.

The UX team read what they could, manually, case by case. Product and business made decisions with partial information. Feedback about the same pain points recurred for months without being recognized as a pattern. Nothing was systematic: feedback was lost in spreadsheets, one-off reports, and team memory.

The question that sparked the project was simple: What would happen if we could read the hundreds of critical comments each month, classify them with a common taxonomy, and deliver them to each squad as a prioritized backlog with evidence?

THE METHODOLOGY

From raw comment to prioritized backlog

The system takes feedback from Qualtrics (survey platform) and the in-app CSAT, classifies it with a proprietary Product × Issue taxonomy, cross-references it with the numerical satisfaction score, and translates it into a monthly report with a squad-prioritized backlog. AI assists in the analysis and synthesis steps; classification and prioritization decisions remain human.

Flow diagram: Qualtrics and In-App CSAT feed the Comment Classification step. Its output, together with Contactability and Qualitative User Contact, is synthesized in Report Creation, which produces Improvement Proposals, Follow-up, and Results Socialization.

Combining quantitative data (CSAT score) and qualitative reading (comments) turns the score into an X-ray of the real experience.

13 PRODUCTS

App · Cards · Loans · Payments · Transfers · Identity/Login · Accounts & Upgrade · Investments · Loyalty · PDS · Customer Service · Checks · PFM

5 ISSUE TYPES

Functionality · Robustness · Offer · Service · Accessibility

A REAL MONTH

How one trend report detected a new critical front

In November 2025, the month-over-month trend surfaced something an isolated snapshot would have missed: Payments (QR, DEBIN, debits) jumped from 0.7% to 10.5% of critical comments in a single month. The method flagged the trend before it escalated.

+9.86 pp

Jump in the relative weight of Payments as a critical issue between October and November 2025

71.7 %

CSAT Top 2 Box December 2025 (scores 4–5 out of total)

−3.5 pp

CSAT drop from November to December, detected in the trend report
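For reference, Top 2 Box and its month-over-month change can be computed as in the sketch below. It assumes a 1-5 CSAT scale, and the score lists are invented for illustration; they are not the bank's data.

```python
def top2box(scores):
    """Percentage of responses scoring 4 or 5 (assumes a 1-5 scale)."""
    if not scores:
        raise ValueError("no scores to summarize")
    return 100.0 * sum(1 for s in scores if s >= 4) / len(scores)


# Illustrative samples only.
november = [5, 4, 4, 5, 2, 4, 1, 5]   # 6 of 8 responses are 4 or 5
december = [5, 4, 2, 5, 2, 4, 1, 3]   # 4 of 8 responses are 4 or 5

# The change is expressed in percentage points, not as a relative percentage.
delta_pp = top2box(december) - top2box(november)
print(f"{top2box(december):.1f}% ({delta_pp:+.1f} pp)")  # 50.0% (-25.0 pp)
```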

The finding translated directly into backlog: prioritize error fixes in QR and DEBIN, ensure operation traceability, review app overnight maintenance windows. The responsible squad received the evidence before Payments issues became the dominant conversation.

THE DECISIONS

Five decisions that defined the system

I led the project end to end: taxonomy design, process definition, report format, AI assistance implementation, and coordination with data and product for adoption. These are the decisions that most shaped the outcome.

01

Link feedback to the CSAT score from the first step

Most text analysis tools treat comments as an isolated corpus. The key decision here was joining each comment to its numerical score before any classification. This shows not only what customers talk about, but also which topics carry the most dissatisfaction and deserve real prioritization.
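A minimal sketch of that join, in Python. The field names (`response_id`, `text`), the sample rows, and the "score of 2 or lower" cut-off for critical comments are all assumptions made for illustration, not the bank's schema or definition.

```python
# Hypothetical export: CSAT score keyed by survey response.
scores = {"r1": 2, "r2": 5, "r3": 1, "r4": 4}

# Hypothetical open-text comments from the same surveys.
comments = [
    {"response_id": "r1", "text": "QR payment failed twice"},
    {"response_id": "r2", "text": "Transfers are quick"},
    {"response_id": "r3", "text": "Cannot log in at night"},
    {"response_id": "r5", "text": "Comment with no matching score"},
]


def link_scores(comments, scores):
    """Attach the CSAT score to each comment before any classification.

    Comments without a matching score are returned separately so they can
    be reviewed instead of being silently dropped.
    """
    linked, unmatched = [], []
    for comment in comments:
        score = scores.get(comment["response_id"])
        if score is None:
            unmatched.append(comment)
        else:
            linked.append({**comment, "score": score})
    return linked, unmatched


linked, unmatched = link_scores(comments, scores)

# Assumed threshold: a comment counts as critical when its score is 1 or 2.
critical = [c for c in linked if c["score"] <= 2]
```

Keeping the unmatched comments visible is a design choice in this sketch: a comment that cannot be tied to a score cannot be weighed by dissatisfaction, so it is better flagged than lost.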

02

A Product × Issue taxonomy common across the bank

I defined 13 products and 5 issue types (Functionality, Robustness, Offer, Service, Accessibility) applicable to any comment. That shared grid is what allows an App squad and a Cards squad to speak the same language when prioritizing, and lets the reports compare apples to apples month after month.
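One way to encode the grid so that every label is checked against it is sketched below. The product and issue names come from the taxonomy above; the `Label` class and the validation approach are illustrative assumptions, not the system's actual implementation.

```python
from dataclasses import dataclass

PRODUCTS = (
    "App", "Cards", "Loans", "Payments", "Transfers", "Identity/Login",
    "Accounts & Upgrade", "Investments", "Loyalty", "PDS",
    "Customer Service", "Checks", "PFM",
)
ISSUE_TYPES = ("Functionality", "Robustness", "Offer", "Service", "Accessibility")


@dataclass(frozen=True)
class Label:
    """One cell of the Product x Issue grid (hypothetical helper)."""
    product: str
    issue: str

    def __post_init__(self):
        # Reject anything outside the shared taxonomy so reports stay comparable.
        if self.product not in PRODUCTS:
            raise ValueError(f"Unknown product: {self.product!r}")
        if self.issue not in ISSUE_TYPES:
            raise ValueError(f"Unknown issue type: {self.issue!r}")


# Every valid combination: 13 products x 5 issue types = 65 cells.
GRID = [Label(p, i) for p in PRODUCTS for i in ISSUE_TYPES]
```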

03

AI where it scales, human where it decides

AI assistance is concentrated on volume analysis, dynamic chart generation, and pattern synthesis over already-classified comments. Classification itself and prioritization stay in human hands, because expert judgment still matters more than the model there. It's not about being wedded to AI: it's about using it where it performs best.

04

Monthly trend tracking, not just one-time snapshots

Reports aren't isolated snapshots: each month is compared against the previous one with percentage-point deltas. That is what made it possible to detect the Payments jump in November, or the return of Transfers as a critical front in December, before they became visible crises.
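The comparison can be sketched as follows: compute each product's share of critical comments per month, then subtract. The counts are invented for illustration (they are not the figures from the real reports), and the function names are hypothetical.

```python
from collections import Counter


def shares(labels):
    """Share of critical comments per product, in percent."""
    if not labels:
        raise ValueError("no labels to summarize")
    counts = Counter(labels)
    total = sum(counts.values())
    return {product: 100.0 * n / total for product, n in counts.items()}


def pp_deltas(previous, current):
    """Percentage-point change in share for every product seen in either month."""
    prev, curr = shares(previous), shares(current)
    return {
        product: round(curr.get(product, 0.0) - prev.get(product, 0.0), 2)
        for product in sorted(set(prev) | set(curr))
    }


# Illustrative months with 100 critical comments each.
october = ["App"] * 50 + ["Cards"] * 30 + ["Transfers"] * 19 + ["Payments"] * 1
november = ["App"] * 45 + ["Cards"] * 30 + ["Transfers"] * 15 + ["Payments"] * 10

deltas = pp_deltas(october, november)
biggest_riser = max(deltas, key=deltas.get)
print(biggest_riser, deltas[biggest_riser])  # Payments 9.0
```

A product missing from one month is treated as a 0% share, so a front that appears or disappears still shows up as a delta.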

05

Output designed for decision, not for analysis

The final deliverable isn't a dashboard: it's an actionable backlog by front (Robustness, Functionality, Offer, Accessibility), with textual examples traced by HOST_ID and concrete recommendations. Each squad uses it directly for their roadmap, without an intermediate interpretation layer.
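A rough sketch of how classified comments could be rolled up into evidence rows for that backlog. The rows and HOST_ID values are invented, and ordering by volume and then by mean score is an assumption for illustration: in the real process, prioritization stays with people.

```python
from collections import defaultdict

# Hypothetical classified critical comments.
classified = [
    {"HOST_ID": "h-001", "product": "Payments", "issue": "Robustness", "score": 1},
    {"HOST_ID": "h-002", "product": "Payments", "issue": "Robustness", "score": 2},
    {"HOST_ID": "h-003", "product": "Payments", "issue": "Robustness", "score": 1},
    {"HOST_ID": "h-004", "product": "Transfers", "issue": "Functionality", "score": 2},
    {"HOST_ID": "h-005", "product": "Transfers", "issue": "Functionality", "score": 2},
    {"HOST_ID": "h-006", "product": "App", "issue": "Accessibility", "score": 1},
]


def build_backlog(rows, max_examples=2):
    """Group comments by (product, issue) and summarize each cluster."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["product"], row["issue"])].append(row)

    items = []
    for (product, issue), members in groups.items():
        items.append({
            "product": product,
            "issue": issue,
            "comments": len(members),
            "mean_score": round(sum(m["score"] for m in members) / len(members), 2),
            # Keep IDs so each item traces back to the original comments.
            "examples": [m["HOST_ID"] for m in members[:max_examples]],
        })

    # Assumed ordering: largest clusters first, lower mean score breaks ties.
    items.sort(key=lambda item: (-item["comments"], item["mean_score"]))
    return items


backlog = build_backlog(classified)
```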

IMPACT

What changed

  • New Digital Vision squad: the project served as the foundation for creating a dedicated cell to read and prioritize the voice of the customer bank-wide.
  • Methodology adopted as standard: the Product × Issue taxonomy became the internal reference for any qualitative feedback analysis.
  • Early trend detection: the month-over-month tracking captured focus shifts like the Payments jump (+9.86 pp) or the return of Transfers as a critical front, before they were obvious.
  • Reproducible prioritized backlog: the product team went from discussing isolated examples to deciding on quantified clusters with CSAT impact and traceability to the original comment.
  • New shared language: different squads discuss priorities using the same grid, making it possible to compare the relative weight of dissatisfaction across products and issue types.

LEARNINGS

What I take away

The designer's role changed in this project. It wasn't about designing screens: it was about designing how the organization listens to its customers. The value of design lay in defining what question the system answered, how to present the output to make it actionable, and what structural decisions to make about how to model feedback.

Generative AI doesn't replace qualitative reading; it makes it scalable. The team still reads comments, but now reads the most representative ones from each cluster instead of random samples. Human judgment is applied where it matters: in fine-grained classification and prioritization.

And perhaps the strongest lesson: AI projects that work aren't those that use the best model; they're the ones with the best-formulated business question. Linking feedback to CSAT from the first step and setting up a month-over-month trend was what turned a nice text analysis into a real prioritization tool.