
Six tweets, six scores: a walkthrough of what actually moves the number

9 min read · by Isaiah Dupree

Every score on this site comes from the same engine. Here's what six common tweet patterns actually score, and what the numbers are telling you to change.

These are real scores from the live API on May 9, 2026. Paste any of these tweets into the demo on the home page and you should see the same numbers back.

Pattern matters more than wording. The contrarian-diagnosis framework will outscore an unstructured tweet on the same topic every time, because the algorithm rewards reply rate, and contrarian framings drive replies.

01 · Contrarian diagnosis with reply trigger

Most local businesses don't have a lead problem. They have a follow-up speed problem. The first three minutes after a missed call decide whether you book or lose the job. What's your floor?
Decision
Publish
Quality score
59/100
Algorithm score
74/100

What this pattern does

  • Sets up a common belief (lead problem)
  • Reveals the actual bottleneck (follow-up speed)
  • Anchors with a specific number (three minutes)
  • Ends with a question that demands a real answer

Why it scored what it scored

  • Reply rate predicted at 12%, the highest in this set. Replies are weighted ~3× likes by Phoenix.
  • Profile-click probability is elevated because the tweet implies the author has data on this.
  • Composite score 44: high clarity, strong hook, real reply trigger.
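The reply weighting above can be sketched in a few lines. This is an illustrative model only: the weight values and function names are assumptions, not the live engine.

```python
# Hypothetical reply-weighted engagement signal. The ~3x reply weight
# comes from the post; everything else here is an illustrative stand-in.
REPLY_WEIGHT = 3.0  # a reply counts roughly 3x a like
LIKE_WEIGHT = 1.0

def weighted_engagement(predicted_likes: float, predicted_replies: float) -> float:
    """Combine predicted like and reply rates into one weighted signal."""
    return LIKE_WEIGHT * predicted_likes + REPLY_WEIGHT * predicted_replies

# A 12% reply rate beats a tweet with twice the likes but few replies:
contrarian = weighted_engagement(predicted_likes=0.10, predicted_replies=0.12)
checklist = weighted_engagement(predicted_likes=0.20, predicted_replies=0.04)
```

This is why the contrarian example tops the set: under a 3× reply weight, 0.12 replies contributes more than 0.20 likes.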

02 · Bookmarkable checklist

5 questions I ask before automating any SME workflow: 1) what's the manual baseline? 2) what fails in 1 of 100 cases? 3) who reviews the AI output? 4) how do I roll back? 5) what does the operator see when it breaks?
Decision
Publish
Quality score
48/100
Algorithm score
58/100

What this pattern does

  • Numbered list (saveable)
  • Five concrete sub-questions, each with one verb
  • Domain-specific (SME automation): narrow audience but high relevance

Why it scored what it scored

  • Bookmark probability is elevated — checklists get saved.
  • Reply probability is lower than the contrarian's; there's no question to react to.
  • Engagement caps at follower-graph reach because nothing invites response.

How to fix it

  • End with a 6th question: "Anything missing? What's your blocker?"
  • Typically adds ~12 points of pre_publish score from the reply-trigger gain.
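The fix can be sketched as a trivial heuristic. The real engine's rules aren't public; this only mirrors the "end with a question" advice and uses the ~12-point figure above as an assumed bonus.

```python
# Illustrative reply-trigger check -- a stand-in, not the live scorer.
def has_reply_trigger(tweet: str) -> bool:
    """True if the tweet closes on a direct question."""
    return tweet.rstrip().endswith("?")

def reply_trigger_bonus(tweet: str) -> int:
    """Assumed ~12-point pre_publish bonus for a closing question."""
    return 12 if has_reply_trigger(tweet) else 0

checklist = "5 questions I ask before automating any SME workflow: 1) ..."
fixed = checklist + " Anything missing? What's your blocker?"
```

Scoring `checklist` gives no bonus; `fixed` picks up the full reply-trigger gain.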

03 · Build-in-public receipt

Spent the week debugging a calibration loop. Root cause was 1 line: the migration was idempotent but the rollback wasn't. Three things broke that the tutorial didn't warn me about.
Decision
Publish
Quality score
47/100
Algorithm score
58/100

What this pattern does

  • Specific verb ("debugging"), specific time window ("the week")
  • Specific failure mode ("idempotent migration, non-idempotent rollback")
  • Promises three things broke without listing them — pulls dwell time

Why it scored what it scored

  • Dwell-time signal carries this tweet — readers who relate stop and read.
  • But there's no question, no contrarian frame. Phoenix score sits at 58: solid but not breakout.
  • Composite 36 — clarity is okay, hook is okay, reply trigger is missing.

How to fix it

  • Add the three failures as a numbered list, OR end with "What broke that you didn't see coming?"
  • Either change typically lifts pre_publish into the high 50s.

04 · Generic AI hype

AI is changing everything in 2026. Every business needs to embrace it to stay competitive in this fast-evolving landscape.
Decision
Don't post
Quality score
0/100
Algorithm score
56/100
Hard blocker: ai-tell:'in this fast-evolving'

What this pattern does

  • Generic claim with no specifics
  • Hits an AI-tell phrase: "in this fast-evolving"
  • Closes on a cliché: "stay competitive"

Why it scored what it scored

  • Hard rejected on AI-tell phrase. Pre-publish score forced to 0.
  • Composite collapses to 6.9 because clarity, specificity, and hook all score 1-2.
  • The phrase "in this fast-evolving" is detected because it matches hundreds of GPT-style openings.

How to fix it

  • Replace with a specific claim about a specific tool: "Claude 4.7 cut my SEO research from 4h to 12 min. Here's the prompt."
  • Specificity + tool name + outcome moves this from rejected to publish.

05 · Engagement bait

RT if you agree that automation is the future of small business.
Decision
Don't post
Quality score
0/100
Algorithm score
45/100
Hard blocker: bait:'rt if'

What this pattern does

  • Asks for an action ("RT")
  • Vague claim
  • No reason to engage beyond the ask

Why it scored what it scored

  • Hard rejected on bait phrase. Pre-publish score 0.
  • Phoenix score is also genuinely low (44.6); bait patterns drive a 5%+ "not interested" rate.
  • X actively throttles tweets that ask for retweets. Even if the gate passed, the algorithm wouldn't push it.

How to fix it

  • Drop the RT ask entirely. Make a specific contrarian claim.
  • If you want shares, write something share-worthy — don't ask.

06 · Em-dash AI signature

Build season is on — here is what shipped this week. Three things broke at step 4.
Decision
Don't post
Quality score
0/100
Algorithm score
56/100
Hard blocker: em-dash

What this pattern does

  • Em-dash character (U+2014) in the body
  • Otherwise: specific verb ("shipped"), specific count ("three things")

Why it scored what it scored

  • Rejected on em-dash. Phoenix is fine at 56, composite is fine at 40.5; the body would have published.
  • Em-dash is one of the strongest AI-generation signals from 2024+. ChatGPT and Claude both default to em-dashes; the human X audience has learned to spot them.
  • Even if your tweet wasn't AI-generated, readers will assume it was.

How to fix it

  • Replace with a period, comma, or colon. Three options:
      · "Build season is on. Here is what shipped this week."
      · "Build season is on, and here's what shipped this week."
      · "Build season is on: here's what shipped this week."
  • Any of these scores the same as the original would have, minus the em-dash penalty.

Patterns that consistently score well

Across the 6 examples and hundreds of others I've scored:

  1. Contrarian diagnosis with a question. Highest ceiling. Reply trigger is the highest-leverage single dimension.
  2. Numbered checklists with one ask at the end. High bookmarks AND replies. The ask matters — without it, replies collapse.
  3. Build-in-public receipts with a hook. Specifics drive dwell time. Specific number + specific verb + specific failure mode.

Patterns that consistently score badly

  1. Generic AI-flavored hype. Detected by 25 phrase patterns. Composite drops to single digits.
  2. Engagement bait. Hard-blocked. The algo throttles these in addition to the gate rejecting them.
  3. Em-dashes. One character can sink an otherwise good tweet. Replace with period, comma, or colon.

What the score is and isn't

The score is a snapshot of your tweet's text against measurable patterns. It's not telling you whether this exact tweet will hit 100k impressions; that depends on your follower base, post timing, topic relevance, and account history, which we don't fully model. It IS telling you whether the tweet has the structural signals the algorithm rewards.

Use the score to A/B drafts. Write three versions, score each, ship the highest-scoring one. The score gap between "decent" and "strong" (~15-20 points) usually predicts a 2-3× engagement difference in practice.
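The A/B workflow is a one-liner once you have a scorer. The toy scorer below is purely illustrative (it only rewards the two signals this post keeps coming back to: a closing question and specific numbers); in practice you'd call the live API instead.

```python
# Sketch of the draft A/B loop: score every version, ship the winner.
def best_draft(drafts: list[str], scorer) -> tuple[int, str]:
    """Return (score, text) for the highest-scoring draft."""
    return max((scorer(d), d) for d in drafts)

# Toy scorer, NOT the real engine: rewards a reply trigger and specificity.
def toy_scorer(tweet: str) -> int:
    score = 40
    if tweet.rstrip().endswith("?"):       # closing question
        score += 12
    if any(ch.isdigit() for ch in tweet):  # a specific number
        score += 8
    return score

drafts = [
    "AI is changing how I do SEO research.",
    "Claude cut my SEO research from 4h to 12 min.",
    "Claude cut my SEO research from 4h to 12 min. What's your baseline?",
]
winner_score, winner = best_draft(drafts, toy_scorer)
```

Even this toy version reproduces the post's ranking: the version with a number and a closing question wins.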

Score your own draft →