You are standing in a pro shop or hovering over a checkout page with a shoe that costs as much as eight sets of strings, and a question you have probably asked yourself before: how do I know the review I just read came from someone who actually wore the thing? Most of what passes for a tennis shoe review online is an unboxing with adjectives. The good ones are rare, and the difference is not enthusiasm — it is method. So this piece is not a review of a single model. It is the protocol we use to produce one, written so you can judge any shoe review, ours included, by what it can and cannot prove.
The short version of our verdict: a trustworthy footwear review is one whose testing protocol you can reconstruct from the text, and whose author tells you where the test stopped working. Everything below is that protocol, including the parts where the honest answer is "it depends."
What we measure on the bench before anyone laces up
Before a shoe touches a court we record what can be recorded without opinion. These numbers do not tell you how a shoe plays, but they anchor the subjective notes that follow and let us compare across models without relying on memory.
- Weight, per shoe, in a single size (US men's 10.5), on a scale with 0.1 g resolution. We weigh both shoes, because left and right routinely differ by 3-8 g, which matters when a brand advertises a figure to the gram.
- Stack height, heel and forefoot, measured with calipers at fixed points after the insole is removed and reseated. The published "heel-to-toe drop" is often a design target, not a measured result on the production unit.
- Outsole pattern and rubber callout — whether the compound is the brand's durability rubber, and where it sits on the wear zones. We do not have a lab durometer, so we report hardness as relative across our tested set, never as an absolute Shore A figure. That is a limit, and we name it.
- Torsional and longitudinal flex, by hand, scored on a repeatable 1-5 scale by the same two people. Subjective, but consistent within our own set.
This stage takes maybe forty minutes. Its entire purpose is to keep the playtest honest later — when a tester says a shoe "feels heavy," we can check whether that is 30 grams of real mass or a tester having an off day.
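As a rough illustration of how the bench numbers can be stored and used for that "feels heavy" check, here is a minimal Python sketch. The `BenchRecord` class, its field names, and the `weight_gap_g` helper are hypothetical and invented for this example; they are not a description of any tool we actually run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchRecord:
    """Bench measurements for one pair in one size (hypothetical schema)."""
    model: str
    left_g: float             # left shoe weight, grams
    right_g: float            # right shoe weight, grams
    heel_stack_mm: float      # caliper reading at the heel point
    forefoot_stack_mm: float  # caliper reading at the forefoot point

    @property
    def mean_weight_g(self) -> float:
        """Average of the two shoes, rounded to the scale's 0.1 g resolution."""
        return round((self.left_g + self.right_g) / 2, 1)

    @property
    def asymmetry_g(self) -> float:
        """Absolute left/right weight difference in grams."""
        return round(abs(self.left_g - self.right_g), 1)

    @property
    def measured_drop_mm(self) -> float:
        """Heel-to-toe drop as measured on this unit, not the published figure."""
        return round(self.heel_stack_mm - self.forefoot_stack_mm, 1)


def weight_gap_g(a: BenchRecord, b: BenchRecord) -> float:
    """Mean per-shoe weight difference, a minus b, in grams."""
    return round(a.mean_weight_g - b.mean_weight_g, 1)
```

With two records on file, `weight_gap_g(heavy_feeling_shoe, reference_shoe)` gives a figure to set against the tester's impression: a gap near 30 g suggests real mass, while a gap of a few grams points back at the tester or the day.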
The court protocol: hours, surfaces, and a fixed drill set
A single hitting session tells you about first impressions and almost nothing about durability, break-in, or how a midsole behaves once it has compressed. Our standard is a minimum of 12 court hours per shoe across at least two surface types — typically hard court and one of clay or carpet — logged with date, surface, duration, and a one-line note each session.
Within those hours we run a fixed block so that every shoe faces the same demands:
- Lateral load: side-to-side suicides and a defensive-slide drill, to stress the medial and lateral outsole edges and the upper's lockdown.
- Linear acceleration and stop: baseline-to-net sprints with a hard plant, to load the forefoot and test toe-drag protection.
- Sustained rally play: at least 90 minutes of point play per session, because foot volume changes as feet swell over a match and a shoe that fits at minute five can bind at minute eighty.
We log discomfort in anatomical terms, not mood. "Pressure across the vamp after 40 minutes" is useful. "Felt great" is not. When a hot spot appears, we note where, when, and whether it resolved after break-in or persisted — because the difference decides whether it is a flaw or a fitting issue.
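The logging standard above (date, surface, duration, a one-line note, with a floor of 12 court hours on at least two surfaces) is simple enough to check mechanically. The sketch below is one way that could look in Python; the `Session` class and the `protocol_gaps` function are hypothetical names made up for illustration, and the thresholds are just the ones stated in this section.

```python
from dataclasses import dataclass
from datetime import date

MIN_COURT_HOURS = 12.0  # minimum total court time per shoe
MIN_SURFACES = 2        # minimum number of distinct surface types


@dataclass(frozen=True)
class Session:
    """One logged court session (hypothetical schema)."""
    day: date
    surface: str   # e.g. "hard", "clay", "carpet"
    minutes: int   # total court time for the session
    note: str      # one-line note, ideally in anatomical terms


def protocol_gaps(sessions: list[Session]) -> list[str]:
    """Return a list of unmet protocol minimums; empty means the log qualifies."""
    gaps = []
    hours = sum(s.minutes for s in sessions) / 60
    surfaces = {s.surface.strip().lower() for s in sessions}
    if hours < MIN_COURT_HOURS:
        gaps.append(
            f"{hours:.1f} court hours logged, {MIN_COURT_HOURS:.0f} required"
        )
    if len(surfaces) < MIN_SURFACES:
        gaps.append(
            f"{len(surfaces)} surface type(s) logged, {MIN_SURFACES} required"
        )
    return gaps
```

A log of six two-hour sessions that are all on hard court would clear the hours floor but still come back with one gap, which is the kind of shortfall a published review should state rather than hide.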
Three ways to evaluate a shoe, and what each one actually proves
Not all review methods claim the same things. Here is how the common approaches compare on the criteria that decide whether a verdict survives contact with your own feet.
| Criterion | Unboxing / first-impression | Single court session | Multi-session protocol |
|---|---|---|---|
| Catches break-in changes | No | Rarely | Yes |
| Detects outsole/midsole wear | No | No | Partially (12+ hrs) |
| Separates flaw from fit issue | No | Sometimes | Usually |
| Surface-dependent behavior | No | One surface only | Two-plus surfaces |
| Repeatable across reviewers | Low | Low | Moderate |
| Time cost | Minutes | ~2 hours | ~3 weeks |
The unboxing review is not worthless — it tells you about materials, finish, and out-of-box weight. It simply cannot speak to anything that emerges over time, which is most of what you are paying a premium for. The single session catches gross fit problems and immediate comfort. Only the multi-session protocol can tell you whether the midsole foam that felt responsive on day one is still responsive on day ten, and that is precisely the property mid-to-high-end shoes charge for.
The part where the honest answer is "it depends"
Here is what no protocol fixes, and where every footwear review — ours included — should lower its voice.
Fit is the largest uncontrolled variable. Our primary tester has a medium-width foot and a standard arch, runs neutral, and lands midfoot. If you have a wide forefoot, a high arch, or you are a heavy heel-striker, our comfort scores transfer poorly. A shoe we flag for vamp pressure may fit your narrower foot perfectly; a shoe we praise for lockdown may strangle a high-volume foot. We can describe the shape of a shoe — where it is roomy, where it is snug, how the last is built — and that description is more portable than the score attached to it. Read the shape notes, not the number.
Sample size is one shoe, usually one tester. Manufacturing varies. The pair we tested is not guaranteed identical to yours, particularly for cushioning foams that settle differently. When we can put a second tester in a second pair, we say so; most of the time we cannot, and a single-unit, single-foot result is not a population study. We will not dress it up as one.
Surface changes the answer. An outsole that grips beautifully on gritty hard court can feel skittish on a slick indoor hard court or refuse to slide on dry clay. Durability claims also depend on abrasiveness: the same compound that lasts a season indoors can shred in months on coarse outdoor courts. A single durability figure with no surface attached is close to meaningless.
Play style loads shoes unevenly. An aggressive slider destroys lateral outsole edges that a flat-footed retriever barely touches. Our wear photos show our wear pattern. Yours will differ in proportion to how you move.
What we can stand behind, and what we can't
We can report, with confidence: measured weight and stack, where a shoe runs roomy or tight on a known foot shape, how lockdown holds up over a long match, where discomfort appeared and whether break-in resolved it, and relative outsole wear after a counted number of hours on a named surface.
We cannot honestly report: absolute rubber hardness without a durometer, energy return without force-plate equipment, long-term durability beyond our test window without extrapolating, or how the shoe will feel on a foot unlike our tester's. When we publish, those gaps appear in the text, not in a footnote nobody reads.
Who this protocol serves — and who it doesn't
This kind of review is built for the 3.0-4.5 player who already knows the current lines, has a medium-width foot and standard support needs, and wants comparison points before committing real money. That reader's foot lines up closely with our tester's, so the protocol's blind spots cost the least and our results transfer well.
It serves you less well if you have an atypical foot, a diagnosed pronation pattern, or a history of a specific injury. In those cases the most rigorous protocol in the world is no substitute for a fitting with someone who watches you move. Method has limits, and the foot in front of a fitter beats the foot in a review every time.
Back to the question you started with
So: how do you know a tennis shoe review actually tested the shoe? Not by the rating, and not by the confidence of the prose. You know by whether you can reconstruct the test (the hours, the surfaces, the foot, the measurements) and by whether the writer told you where the test ran out of road. A review that names its protocol and admits its limits is worth more than a glowing one that hides both. That is the arithmetic the whole piece turns on: a shoe that costs eight sets of strings deserves a review that did at least twelve hours of work, and said so.
Evidence grade for the central claim — that protocol transparency, not score confidence, predicts a useful footwear review: Moderate. It rests on repeatable internal method and known variance in foot fit and manufacturing, but not on a controlled study comparing review methods against player outcomes, which we have not run and have not seen published.