The Mid-Semester Survey
I designed a feedback system, scaled it across twenty-plus course sections, and learned the hard way that more data is not more feedback.
The Problem
We asked everyone whether the course worked — except the people taking it.
This started, like a lot of my ideas, in a team conversation about what fully online education actually needs to be good. We were talking about how we measured course effectiveness, and I noticed something that bothered me: we asked everyone — administrators, professors, instructional designers — whether an online course was working. Everyone except the students who were actually inside it. And on the rare occasion we did ask them, we asked at the end, once the course was already over and nothing could be changed for them.
End-of-semester evaluations have a design flaw nobody names: they can only ever help the next group. The students who fill them out are gone by the time anyone reads them. I wanted feedback that could help the people who gave it — while there was still time to act.
End-of-course evaluations arrive too late to help the students who wrote them. I wanted a feedback loop short enough to close mid-flight.
The Design
A survey students actually had to take — three times.
I pitched a non-graded survey that ran three times a term — after week 4, after week 9, and in the module before the final exam. To make sure we actually got responses instead of the usual trickle, completing each one unlocked the student's progress through the course. It was anonymous, and every wave asked the same questions on purpose, so we could track how a cohort's confidence shifted from week 4 to week 9 even without knowing who was who.
The team liked it enough to challenge me to build it, and it became a mandate across the courses we owned. We each pitched it to our professors, who were game — with one recurring hesitation about locking student progress, which most came around on. It went live in more than twenty sections across seven courses. The instrument itself was thorough: eight Likert questions covering everything from navigation to challenge to enjoyment, plus two open-ended prompts.
Thorough turned out to be the problem.
What came back
A mountain of data, and no time to climb it.
Here is what one term produced across those seven courses:
That volume is the whole story. We had built an instrument thorough enough to measure everything — and in doing so, produced a haystack no one on the team could search during the term it mattered. The honest truth, and the reason this is a case study and not a victory lap: we almost never managed to read the feedback in time to act on it mid-course. The system designed to close the loop couldn't be parsed fast enough to close it.
It gets worse when you look at what the eight Likert questions actually told us. I recently went back and analyzed the full dataset — the thing we never had time to do while it was live:
Every Likert answer, all courses, one term
11,200+ responses · scaled to the largest bar
86% of answers were "agree" or "strongly agree." Six of the eight questions averaged between 4.3 and 4.5 out of 5. A scale that never moves can't tell you what to fix — it just tells you people are basically fine, which you already assumed.
Eleven thousand data points, and they nearly all said the same agreeable thing. The Likert half of the survey was, in hindsight, mostly noise dressed up as rigor.
The twist
The open-ended answers weren't noise. We just couldn't mine them.
When I finally analyzed the ~2,800 free-text comments, the real finding emerged: only about 8% were throwaway ("n/a," "none"). More than nine in ten said something real — and they clustered into consistent, actionable themes that showed up across completely unrelated courses.
What students asked us to improve
share of comments mentioning each theme · a comment can touch more than one
One request came up again and again, in course after course: clearer instructions and short video walkthroughs of assignments. That is a specific, buildable fix — exactly the kind of thing a professor could act on mid-course, if only we'd surfaced it in time.
In the students' own words:
The signal was there the whole time. The instrument just buried it under eleven thousand Likert scores and fifty-seven thousand words, and handed the whole pile to a team that didn't have a spare week to dig.
What it taught me
The fix isn't more effort. It's subtraction.
The instinct, when a feedback system doesn't get used, is to promise you'll try harder next semester — carve out the time, finally read it all. That's the wrong lesson, and I've made my peace with that. The right lesson is that a survey optimized for completeness is optimized against action. Every question you add is more data to parse before anyone can do anything, and past a certain point the thoroughness is what guarantees the feedback dies unread.
So the redesign I'm building isn't bigger. It's smaller — ten questions down to two. One open-ended question, tightly aimed at the single most actionable thing ("What's one change that would make this course clearer or easier to succeed in right now?") instead of the vague "what are you enjoying." And at most one Likert question — and if we keep one, the data says keep the one that actually varied: confidence with the ed-tech tools, the only question that ever dipped below the ceiling. Narrow the aperture, and 57,000 words becomes something a person can read in an afternoon and act on that week.
The most valuable thing this system ever produced wasn't a piece of feedback. It was the lesson about feedback itself — and that lesson only shows up if you're willing to look honestly at a thing you built and admit where it fell short of its own goal.
The bigger idea
This is the whole game in content strategy.
Every analytics dashboard ever built has the same failure mode as my first survey: it measures everything it can, and in doing so drowns the two or three numbers you'd actually change your behavior over. The core discipline of content strategy isn't gathering more data — it's the ruthless judgment to find the actionable signal and ignore the vanity metrics that just tell you you're basically fine. I didn't read that in a book. I built a system at real scale, watched it fail in exactly that way, and drew the conclusion the hard way.
And the analysis on this page is the point made twice. The insight that mattered was never in the 11,000 Likert scores — it was in the open-ended answers, the qualitative layer most dashboards can't even capture. Knowing which question actually carries the signal, and being able to go read 2,800 comments and come back with "they want video walkthroughs," is the exact muscle I'm sharpening formally right now with my Google Analytics certification. The certification is the quantitative vocabulary. This project is where I learned the instinct.
Give me a firehose of audience feedback and the question "so what do we actually do?" — that's not a stretch for me. It's the thing I've already done, at scale, and learned from.