Scorecards are AI-powered evaluation tools that help managers and revenue operations teams assess the quality of sales calls. By defining custom evaluation criteria, you can have AI automatically analyze meeting transcripts and score how well reps performed, providing consistent coaching feedback at scale.
What are Scorecards?
Scorecards are structured templates that define what to evaluate in customer conversations. Unlike Talking Points (which guide reps during calls), Scorecards analyze calls after they're complete to help managers understand coaching opportunities.
Important: Even if a scorecard is limited to a specific deal stage (via preconditions), the unit being scored is still the meeting. Teams often use stage-specific scorecards as “contributions” to a deal-level framework (e.g., MEDDICC), without mistaking the scorecards for the framework itself.
Key benefits:
- Consistent evaluation: Every call is scored using the same criteria
- AI-powered assessment: Automatic analysis of call transcripts
- Scalable coaching: Evaluate 100% of calls, not just those you have time to review
- Data-driven insights: Track performance trends over time
- Customizable criteria: Define what matters for your sales process
How Scorecards Work
Setup Phase
- RevOps or managers create scorecards defining evaluation criteria
- Questions are configured with different types (yes/no, scale, free-form)
- AI prompts are written to guide how AI should assess each criterion
- Preconditions are set (optional) to apply scorecards to specific call types
After a Meeting
- Meeting ends: Transcript and recording are available
- AI analyzes: Bigmind AI reviews the transcript against scorecard criteria
- Scores are generated: Each question is answered/scored automatically
- Managers review: View the scorecard results and provide additional feedback
- Coaching happens: Use scorecard data to coach reps on improvements
Scorecard Structure
Template
A scorecard template is the top-level container that defines:
- Name: E.g., "Discovery Call Quality", "Demo Effectiveness"
- Status: Active or inactive
- Final Score:
- Manual: Manager provides overall assessment
- Calculated: Weighted average of question scores
- Preconditions: When this scorecard should be used
- Object type: What to evaluate (meeting/session)
- Visibility: Everyone or specific users
Questions
Each scorecard contains evaluation questions that assess different aspects of the call:
Question configuration:
- Name: What to evaluate (e.g., "Did the rep establish rapport?", "How well did they handle objections?")
- Type: How to score it (see Answer types for full guidance)
- Yes/No: Binary pass/fail — best for checklist items
- Range: Numeric scale (e.g., 1–5 or 1–10) — best for quality ratings
- Percentage: A 0–100 value — best for coverage or proportion criteria
- Single select: One option from a labeled list — best for maturity levels or quality tiers
- Multi-select: Multiple options — best for capturing what happened
- Open-ended: Free-form text — best for qualitative summaries and coaching notes
- Method: How the answer is determined
- Manual: Manager provides the assessment
- Agentic (AI): AI analyzes and scores automatically
- Agent prompt: Instructions for AI on how to evaluate (for agentic questions)
- Weight: Importance in final calculated score (if using calculated scoring)
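Putting the template and question pieces together, here is a minimal sketch of a scorecard expressed as plain data. The field names and structure are illustrative assumptions for this doc, not Bigmind's actual schema:

```python
# Illustrative only — field names and structure are assumptions, not Bigmind's actual schema.
discovery_scorecard = {
    "name": "Discovery Call Quality",
    "status": "active",
    "final_score": "calculated",        # or "manual"
    "object_type": "meeting",
    "visibility": "everyone",
    "preconditions": {"deal_stage": "Discovery"},
    "questions": [
        {
            "name": "Did the rep identify who has budget authority?",
            "type": "yes_no",
            "method": "agentic",
            "agent_prompt": "Look for direct questions about budget owners or approval authority.",
            "weight": 0.3,
        },
        {
            "name": "Rate the quality of pain discovery (1–5)",
            "type": "range",
            "min": 1,
            "max": 5,
            "method": "agentic",
            "agent_prompt": "Score 1 if no pain was uncovered ... 5 if clear, quantified pain tied to business impact.",
            "weight": 0.7,
        },
    ],
}
```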
Answer types
Each question has an answer type that determines how AI (or a manager) responds to it and whether it factors into a calculated final score. Picking the right type isn't just cosmetic — it affects AI accuracy, scoring reliability, and coaching value.
Here's a quick reference before diving in:
| Type | Best for | Contributes to calculated score? |
|---|---|---|
| Yes / No | Checklist items, binary behaviors | Yes (Yes = 1, No = 0) |
| Range | Quality ratings on a numeric scale | Yes (normalized within min–max) |
| Percentage | Coverage or proportion-based criteria | Yes (0–100 stored, normalized to 0–1) |
| Single select | Labeled quality levels, maturity stages | Yes, if options have numeric values |
| Multi-select | Capturing what happened (topics, objections) | No — use for descriptive capture only |
| Open-ended | Qualitative summaries, coaching notes | No |
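The right-hand column boils down to whether an answer can be reduced to a number between 0 and 1. Here's a rough sketch of that reduction, assuming the normalization described in the table above (the product's exact formula may differ):

```python
# Rough sketch of how each answer type could reduce to a 0–1 value for calculated scoring.
# Assumes the min–max normalization described above; the exact product formula may differ.
def normalize(answer_type, answer, min_val=None, max_val=None, option_values=None):
    if answer_type == "yes_no":
        return 1.0 if answer else 0.0                        # Yes = 1, No = 0
    if answer_type == "range":
        return (answer - min_val) / (max_val - min_val)      # normalized within min–max
    if answer_type == "percentage":
        return answer / 100                                  # stored 0–100, scored 0–1
    if answer_type == "single_select" and option_values:
        values = list(option_values.values())                # e.g. {"Poor": 1, ..., "Excellent": 4}
        return (option_values[answer] - min(values)) / (max(values) - min(values))
    return None                                              # multi-select and open-ended are excluded

print(normalize("range", 4, min_val=1, max_val=5))           # 0.75
print(normalize("single_select", "Good",
                option_values={"Poor": 1, "Fair": 2, "Good": 3, "Excellent": 4}))  # ≈ 0.67
```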
Yes / No
A binary pass/fail: the behavior either happened or it didn't. The simplest and most AI-reliable type.
Use it when the thing you're evaluating is clearly observable and has no meaningful middle ground. If AI can answer it by looking for a specific moment in the transcript, Yes/No is the right call.
Good fit:
- "Did the rep set an agenda at the start of the call?"
- "Was a next meeting or next step confirmed before ending?"
- "Did the rep identify who has budget authority?"
- "Did pricing come up in this conversation?"
Avoid it when the behavior has degrees of quality. "How well did the rep handle objections?" is not a Yes/No question — it deserves a Range.
This is the right default for checklist-style MEDDIC/SPICED/BANT criteria. If your methodology asks "was X covered?", Yes/No is almost always the answer.
Range
A numeric score within a custom scale you define (e.g., 1–5, 1–10, 0–3). The score is automatically normalized within your min/max when calculating a final score, so a 4 on a 1–5 scale and an 8 on a 1–10 scale carry comparable weight.
Use it when quality exists on a spectrum and you want to rate how well something was done, not just whether it happened.
Good fit:
- "Rate the quality of pain discovery (1–5)"
- "How clearly did the rep articulate ROI? (1–10)"
- "Objection handling quality (1–5)"
- "Demonstration relevance to stated pain (1–5)"
Avoid it when the criterion has a clean binary answer — use Yes/No instead. Also avoid if you don't have a clear rubric for what each number means, since AI will score inconsistently without one.
Important — always define your scale in the AI prompt. Don't just ask AI to "rate 1–5". Tell it what each level means. For example:
"Score 1 if no objection handling occurred. Score 2 if the rep acknowledged the objection but didn't address it. Score 3 if addressed but unconvincingly. Score 4 if handled well with some evidence. Score 5 if handled confidently with specific proof points or references."
Tip: Keep your scale consistent across a scorecard. Mixing 1–3, 1–5, and 1–10 on the same scorecard makes results harder to interpret. Pick one scale and use it throughout.
Percentage
A value from 0–100 where the answer is naturally a proportion or completion rate. Stored as 0–100 and automatically normalized to 0–1 for calculated scoring.
Use it when the criterion you're measuring has a natural "out of 100%" interpretation — not when you just want a scale.
Good fit:
- Framework coverage estimates: "How complete is the rep's MEDDICC coverage for this deal? (0–100%)"
- Proportional behaviors: "What percentage of the meeting was the rep in discovery mode vs. pitching?"
- Readiness or confidence scores: "How ready does this deal appear to close? (0–100%)"
Avoid it when you just want to rate quality on a scale. For that, Range with a defined rubric gives AI better guidance. Percentage works best when AI is being asked to estimate a coverage or proportion, not assign a quality grade.
Single select
Pick exactly one answer from a predefined set of labeled options. You configure the options yourself — including optional colors for visual identification.
Use it when you want consistent, human-readable labels instead of raw numbers, especially in manager review workflows where categories communicate more than a score.
Good fit:
- "Rep's engagement approach: Aggressive / Neutral / Consultative / Trusted Advisor"
- "Business case maturity: None / Early / Developed / Compelling"
- "Champion strength: Weak / Developing / Strong"
- "Overall call quality: Below expectations / Met expectations / Exceeded expectations"
Avoid it when the answer is binary (use Yes/No) or when you need narrative context (use Open-ended).
If you want single-select answers to factor into a calculated final score, assign numeric values to each option (e.g., Poor = 1, Fair = 2, Good = 3, Excellent = 4). AI selects reliably from labeled lists — just make sure your prompt clearly defines what each option represents so it can distinguish between them.
Multi-select
Pick one or more answers from a predefined set. You can optionally cap the number of selections. Use it to capture everything that applied, not just the best match.
Use it when multiple things can be simultaneously true and you want to record all of them.
Good fit:
- "Which objections came up? (Pricing / Timing / Competitor / Internal buy-in / Technical fit)"
- "Which MEDDICC elements were explicitly covered in this call?"
- "What topics were discussed? (Renewal / Expansion / Support / Executive alignment)"
- "Which product areas were mentioned? (Core product / Integrations / Enterprise tier / Analytics)"
Avoid it when only one answer is correct, or when you're planning to roll this question into a calculated score. Multi-select answers don't reduce cleanly to a number, so they're excluded from score calculations. If this question needs to count toward the final score, use single-select or Yes/No instead.
Multi-select is best suited for descriptive capture — recording what happened, not rating how well something was done. It pairs well with an open-ended follow-up for context.
Open-ended
A free-form text answer. AI writes a qualitative summary, or a manager types their own assessment.
Use it when you need narrative context that can't be reduced to a number or category.
Good fit:
- "What were the key pain points uncovered in this call?"
- "Summarize the rep's approach to closing"
- "What specific coaching feedback would you give this rep?"
- "What evidence did the rep provide for the business case?"
- "What are the main risks to this deal based on this conversation?"
Avoid it when the question needs to factor into a calculated final score — open-ended answers are excluded from score calculations entirely.
Open-ended questions add the most value in two scenarios. First, as context providers alongside scored questions — asking AI to explain its rating or cite specific evidence from the transcript. Second, as standalone qualitative captures for things AI summarizes well but can't meaningfully rate, like surfacing the prospect's stated priorities or identifying blockers mentioned in passing.
Setting Up Scorecards
1. Create a Scorecard
- Navigate to Settings → Coaching → Scorecards
- Click "Create Scorecard"
- Provide a name (e.g., "Enterprise Discovery Quality")
- Choose final score method:
- Manual: For qualitative overall assessment
- Calculated: For data-driven scoring
- Set status to Active when ready to use
- Save the scorecard
2. Add Evaluation Questions
For each aspect you want to evaluate:
- Click "Add Question"
- Write the evaluation criterion (e.g., "Did the rep uncover the economic buyer?")
- Choose question type: See the Answer types section above for detailed guidance. In short:
- Yes/No for binary checklist items
- Range for quality scales — always define your rubric in the AI prompt
- Percentage for coverage or proportion-based criteria
- Single select for labeled quality levels or maturity stages
- Multi-select for capturing what happened (topics, objections, covered elements)
- Open-ended for qualitative summaries and coaching context
- Select method:
- Agentic: For objective criteria AI can assess from transcript
- Manual: For subjective criteria needing human judgment
- Write AI prompt (for agentic questions): Guide AI on how to evaluate
- Example: "Review the transcript and determine if the sales rep explicitly identified who has budget authority. Look for direct questions about budget owners or discussions about who approves deals of this size."
- Set weight (for calculated scoring): Higher weights for more important criteria
3. Configure Final Scoring
If using Manual final score:
- Write the final score question (e.g., "Overall call quality rating")
- Managers will provide this after reviewing individual questions
If using Calculated final score:
- Set weights for each question
- Higher weights = more impact on final score
- Example: Rapport (10%), Discovery (40%), Objection Handling (30%), Next Steps (20%)
- Final score = weighted average of all question scores
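A worked example of that weighted average, using the example weights above and answers already normalized to 0–1 (see Answer types). This is a sketch only; the product's exact rounding and display may differ:

```python
# Weighted-average sketch using the example weights above.
# Each answer is assumed to be normalized to 0–1 already (see Answer types).
answers = {
    "Rapport":            (0.10, 1.0),    # Yes/No answered Yes
    "Discovery":          (0.40, 0.75),   # 4 on a 1–5 range → (4 - 1) / (5 - 1)
    "Objection Handling": (0.30, 0.50),   # 3 on a 1–5 range
    "Next Steps":         (0.20, 1.0),    # Yes/No answered Yes
}

total_weight = sum(w for w, _ in answers.values())
final_score = sum(w * s for w, s in answers.values()) / total_weight
print(f"{final_score:.0%}")  # 75%
```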
4. Set Preconditions (Optional)
Control when scorecards apply:
- All meetings: Evaluate every call
- Deal-based: Only for calls associated with certain deals
- Filter by deal stage, amount, type, etc.
- Example: "Only score discovery calls for Enterprise deals"
- Account-based: Only for calls with certain accounts
- User-based: Only for calls by specific reps or teams
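Conceptually, preconditions are a filter that decides which scorecards run for a given meeting. A minimal sketch of that idea follows; the field names and matching rules are illustrative, not the product's implementation:

```python
# Illustrative precondition matching — not Bigmind's actual logic or field names.
def scorecard_applies(scorecard, meeting):
    pre = scorecard.get("preconditions") or {}
    if not pre:                                                    # no preconditions → all meetings
        return True
    deal = meeting.get("deal") or {}
    if "deal_stage" in pre and deal.get("stage") != pre["deal_stage"]:
        return False
    if "min_amount" in pre and deal.get("amount", 0) < pre["min_amount"]:
        return False
    if "users" in pre and meeting.get("owner") not in pre["users"]:
        return False
    return True

meeting = {"deal": {"stage": "Discovery", "amount": 80_000}, "owner": "alex@example.com"}
card = {"preconditions": {"deal_stage": "Discovery", "min_amount": 50_000}}
print(scorecard_applies(card, meeting))  # True
```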
Using Scorecards
AI Evaluation Process
After a meeting ends:
- Scorecard is triggered: Based on preconditions, relevant scorecards are identified
- AI reviews transcript: Each agentic question is evaluated
- AI reads the full transcript
- Follows the evaluation prompt for each question
- Provides scores and reasoning
- Results are stored: Scores are saved with reasoning/evidence
- Managers are notified: New scorecards are ready for review
Manager Review
Managers can:
- View scorecard results: See all evaluated questions and scores
- Read AI reasoning: Understand why AI scored each item as it did
- Override if needed: Adjust scores based on manager judgment
- Add manual answers: Complete any manual-only questions
- Provide overall assessment: Add final comments or rating
- Mark as verified: Indicate the scorecard has been reviewed
Coaching with Scorecards
Use scorecard data to drive coaching:
- One-on-one reviews: Discuss specific calls and scores
- Trend analysis: Track rep improvement over time
- Team benchmarks: Compare performance across the team
- Identify patterns: Find common strengths and weaknesses
- Targeted training: Focus coaching on lowest-scoring areas
Scorecard examples
Discovery call scorecard
Mostly Yes/No for binary MEDDIC checks, with Range for subjective quality and Open-ended for coaching context:
- Rapport established: Did the rep build a personal connection before diving into business? — Yes/No
- Pain identified: Quality of pain and impact discovery — Range (1–5) — define in prompt: 1 = no pain uncovered, 5 = clear, quantified pain tied to business impact
- Budget discussed: Did the rep bring up budget or cost expectations? — Yes/No
- Economic buyer identified: Did the rep confirm who has budget authority? — Yes/No
- Timeline established: Did the rep uncover a concrete timeline or deadline? — Yes/No
- Topics covered: Which MEDDIC elements were explicitly addressed? — Multi-select (Metrics / Economic buyer / Decision criteria / Decision process / Identify pain / Champion)
- MEDDIC coverage: What percentage of MEDDIC elements does the rep appear to have captured? — Percentage
- Next step secured: Was a concrete next step agreed before ending the call? — Yes/No
- Key takeaways: What were the most important pain points and context uncovered? — Open-ended
Demo call scorecard
Mix of binary checks and quality ratings; open-ended for evidence capture:
- Agenda set: Did the rep state a clear agenda at the start? — Yes/No
- Discovery before demo: Did the rep ask clarifying questions before demoing features? — Yes/No
- Feature relevance: How well did the demo focus on the prospect's stated pain rather than a generic walkthrough? — Range (1–5)
- Value articulation: How clearly did the rep connect what they showed to business value? — Range (1–5)
- Prospect engagement: Did the prospect ask substantive questions or react positively? — Yes/No
- Objections raised: Which objections came up during the demo? — Multi-select (Pricing / Complexity / Timeline / Competitor / Internal buy-in / Technical fit)
- Objection handling quality: How well did the rep address concerns raised? — Range (1–5)
- Call to action: Was a clear next step proposed and agreed? — Yes/No
- Coaching notes: What one thing should this rep do differently in their next demo? — Open-ended
Negotiation call scorecard
Single select works well here for maturity and posture assessments:
- Value reinforced before negotiating: Did the rep anchor on value before any pricing discussion? — Yes/No
- Negotiation posture: How did the rep approach the negotiation? — Single select (Gave in immediately / Held firm without explanation / Traded thoughtfully / Anchored on value throughout)
- Concession strategy: Did the rep trade concessions for something in return (timeline, expansion, reference)? — Yes/No
- Deal structure quality: How well did the proposed terms reflect mutual benefit? — Range (1–5)
- Closing attempt: Did the rep explicitly ask for the business or a commitment? — Yes/No
- Path to close: Was there a clear, agreed-upon path to a signed contract? — Yes/No
- Deal risk summary: What are the main risks to closing this deal based on the conversation? — Open-ended
Customer success check-in scorecard
Health scoring and expansion signals benefit from Percentage and Single select:
- Relationship check: Did the CSM make a personal connection before diving into business? — Yes/No
- Value delivered: Did the conversation cover business value the customer is getting? — Yes/No
- Challenges surfaced: Did the CSM proactively ask about issues or blockers? — Yes/No
- Renewal health: Based on tone and content, how healthy does this account appear? — Single select (At risk / Neutral / Healthy / Advocate)
- Renewal confidence: How confident does renewal appear based on this conversation? — Percentage
- Expansion discussed: Were growth or upsell opportunities explored? — Yes/No
- Action items confirmed: Were clear follow-ups and owners agreed before ending? — Yes/No
- Key signals: What positive or negative signals about this account came up in the call? — Open-ended
Best Practices
Designing Effective Scorecards
- Focus on behaviors: Evaluate what reps do, not just outcomes
- Make criteria specific: Vague questions yield inconsistent scoring
- Balance quantity: 8–12 questions is usually enough
- Mix question types: Yes/No for checklists, scales for quality
- Align with methodology: Reflect your sales process and training
Writing AI Prompts
- Be explicit: Tell AI exactly what to look for in the transcript
- Provide context: Explain why this matters
- Give examples: Show what good vs bad looks like
- Define edge cases: Handle ambiguous situations
- Test and refine: Review AI scores and improve prompts
Example AI prompt:
Evaluate whether the sales rep successfully identified the economic buyer (the person with budget authority to approve this purchase). Look for:
- Direct questions about who controls the budget
- Discussion of approval processes
- Identification of specific individuals with financial authority
Score Yes if the rep explicitly identified a named individual with budget authority. Score No if they only identified influencers or technical buyers without confirming budget authority.
Managing the Review Process
- Review regularly: Don't let scorecards pile up
- Spot check AI: Periodically verify AI scoring accuracy
- Use for coaching: Don't just score; actually coach from the data
- Track trends: Look at patterns over time, not just individual calls
- Adjust criteria: Refine questions as your process evolves
Coaching with Data
- Focus on growth: Celebrate improvements, not just problems
- Be specific: Reference actual call examples and scores
- Find patterns: "You're consistently strong on rapport but missing next steps"
- Set goals: Use scores to create measurable improvement targets
- Share best practices: Highlight high-scoring calls as examples
Calculated vs Manual Scoring
When to Use Calculated Scoring
- Objective criteria: When evaluation is mostly factual
- High volume: When you need to score many calls
- Consistency: When you want standardized scoring
- Trending: When you want to track metrics over time
- Example: Discovery call checklists (Did they ask about budget? Yes=1, No=0)
When to Use Manual Scoring
- Subjective assessment: When nuance and judgment matter
- Complex evaluation: When multiple factors interact
- Coaching focus: When the goal is learning, not metrics
- Quality over quantity: When reviewing select important calls
- Example: Overall sales skill assessment by experienced managers
Hybrid Approach
Many teams use both:
- Calculated scores for tactical execution (Did they cover the checklist?)
- Manual scores for strategic assessment (How effective was their approach?)
- Combine for comprehensive evaluation
Troubleshooting
AI scores seem inaccurate:
- Refine the AI prompt to be more specific
- Add examples of good vs bad to the prompt
- Check that the transcript quality is good
- Consider making the question manual if too subjective
- Review several AI-scored calls to identify patterns
Scorecards not being generated:
- Check that scorecard status is "Active"
- Verify preconditions match the meeting/deal
- Ensure the meeting has a transcript
- Check that at least some questions are set to "Agentic"
Final score calculation is wrong:
- Verify weights are set correctly for all questions
- Check that weights sum to 100% (or intended total)
- Ensure question types support numeric scoring — Yes/No, Range, Percentage, and Single select (with numeric values) contribute to calculated scores; Open-ended and Multi-select do not
- If open-ended questions are weighted, their answers will always be excluded — move scoring intent to a Range or Yes/No question instead
- Review the calculation formula in settings
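If the math still looks off, a quick audit of the scorecard definition usually surfaces the culprit. Here's a rough sketch of such a check, using the illustrative structure from the earlier sketches rather than the product's real schema:

```python
# Rough audit of a scorecard definition (illustrative field names from the earlier sketches).
SCORABLE_TYPES = {"yes_no", "range", "percentage", "single_select"}

def audit_scorecard(scorecard):
    weighted = [q for q in scorecard["questions"] if q.get("weight")]
    for q in weighted:
        if q["type"] not in SCORABLE_TYPES:
            print(f"- '{q['name']}' is {q['type']}: excluded from calculated scores despite its weight")
    total = sum(q["weight"] for q in weighted if q["type"] in SCORABLE_TYPES)
    print(f"- Weight across scorable questions: {total}")   # compare with your intended total

audit_scorecard({"questions": [
    {"name": "Key takeaways", "type": "open_ended", "weight": 0.2},
    {"name": "Next step secured", "type": "yes_no", "weight": 0.8},
]})
# - 'Key takeaways' is open_ended: excluded from calculated scores despite its weight
# - Weight across scorable questions: 0.8
```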
Too many scorecards per call:
- Refine preconditions to be more specific
- Consider consolidating similar scorecards
- Use "User-based" preconditions to assign specific scorecards to specific teams
- Deactivate unused scorecards
Scorecards vs Talking Points
| Aspect | Talking Points | Scorecards |
|---|---|---|
| When used | During the meeting (real-time) | After the meeting (post-call) |
| Purpose | Guide reps through conversation | Evaluate call quality |
| User | Sales reps | Managers/RevOps |
| Output | Discovery answers, CRM data | Performance scores, coaching insights |
| Focus | What information to gather | How well the rep performed |
Use Together: Talking Points help reps execute well; Scorecards help managers ensure they did.
Related Documentation
- Talking Points - Guide reps during meetings
- Sidekick Overview - Learn about meeting capture
- AI Agents - Understand AI evaluation
- Concepts - How these concepts differ
- Frameworks - Deal-level qualification across the sales cycle
