How to Evaluate AI Vendors (Without Getting Burned)
20 Questions to Ask Before You Sign
Every AI vendor sounds amazing in the demo. The demo data is perfect, the models are magical, and your problems will vanish in 30 days. Then you sign, and reality hits. Here are the 20 questions I ask AI vendors before my clients sign anything—questions that separate real solutions from expensive science projects.
Why This Matters Now
AI vendors are multiplying like rabbits. Every software company is now an "AI company." Every consultant offers "AI transformation." Some are legitimate. Most are rebranded existing tools with a ChatGPT wrapper.
The cost of getting this wrong is high:
- Direct costs: $50K-$500K+ in implementation fees
- Opportunity cost: 6-18 months lost on a failed implementation
- Team morale: Another "strategic initiative" that goes nowhere
- Political cost: Your reputation when you championed the vendor
These questions help you avoid expensive mistakes.
Category 1: The Reality Check Questions
Start here. These questions quickly separate real solutions from vaporware.
Question 1: "Show me production data from a similar client"
Not demo data. Not synthetic data. Real production results from a client in your industry, with your data complexity, at your scale.
What you're looking for:
- Specific accuracy numbers (not "highly accurate")
- Edge cases and how the system handles them
- Performance on messy data, not clean demo data
Red flags:
- "We can't share that due to confidentiality" (they should have anonymized examples ready)
- Only showing perfect results, no discussion of failures or limitations
- "Your data is unique, we'll need to train specifically for you" (translation: we haven't solved this before)
Example Scenario: The Invoice Processing Disaster
(Hypothetical scenario based on common patterns)
A client almost bought an AI invoice processing tool. The demo was perfect—extracted every field flawlessly. When asked to see production accuracy on invoices from construction subcontractors (the client's use case), the vendor hemmed and hawed. Turned out their system worked great on standardized invoices from big companies, but accuracy dropped to 60% on the handwritten, inconsistent formats construction subs send. Would have been a $120K waste.
Question 2: "What's your model's failure mode?"
Every AI system fails sometimes. The question is how it fails, and whether that's acceptable for your use case.
What you're looking for:
- Honest discussion of limitations
- Specific failure rates and scenarios
- How the system flags low-confidence predictions
- What happens when it's wrong
Red flags:
- "Our AI is 99.9% accurate" (nothing is 99.9% accurate on real-world data)
- No discussion of edge cases or failure modes
- Can't explain what happens when the model is uncertain
Question 3: "Can I test this on my actual data before signing?"
Not a demo. Not a trial. A real proof-of-concept on your data.
What you're looking for:
- Willingness to do a paid POC with clear success criteria
- Realistic timeline (2-4 weeks for most use cases)
- Defined deliverables and metrics
Red flags:
- "Just sign the contract, we'll figure it out in implementation"
- Requiring 12-month commitment before testing on your data
- POC priced above the $10K-$25K range (proving value should be cheap)
Category 2: The Technical Due Diligence Questions
Even if the demo looks good, you need to understand what's under the hood.
Question 4: "Is this actually AI or just business rules?"
A shocking number of "AI" vendors are just using if-then logic with fancy UIs.
How to probe:
- Ask them to explain the model architecture
- Ask how it handles scenarios not in the training data
- Ask if it improves over time and how
Red flags:
- Vague answers about "proprietary algorithms"
- Can't explain how the system learns or adapts
- The "AI" requires you to configure extensive business rules
Question 5: "What happens to my data?"
Critical question. Some vendors use your data to train models they sell to your competitors.
What you're looking for:
- Clear data privacy policy in the contract
- Confirmation your data isn't used to train shared models
- Where data is stored (US vs international)
- How data is encrypted at rest and in transit
- Retention policies and deletion guarantees
Red flags:
- "We use your data to improve our models" (means they're training on your data)
- Can't specify where data is stored
- No clear data deletion policy when you churn
Example Scenario: The Competitive Intelligence Leak
(Hypothetical scenario based on common patterns)
A company in a competitive industry almost used an AI vendor whose TOS allowed them to train on customer data. The vendor had three of the company's direct competitors as customers. That means the company's data (pricing, customer patterns, strategic initiatives) would have effectively been shared with competitors through the model. Always read the data usage terms.
Question 6: "What's your model latency and throughput?"
Speed matters. An AI that takes 30 seconds per prediction might be fine for monthly reports, terrible for customer-facing applications.
What you're looking for:
- Actual response times at your expected volume
- Whether processing is batch or real-time
- How performance scales with data volume
Red flags:
- Can't give specific performance numbers
- "It depends" without any benchmarks
- Demo is suspiciously fast (might be cached/pre-computed)
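When a vendor offers API access for testing, you can measure latency yourself rather than trust their numbers. Here's a minimal sketch: it times repeated calls and reports median and p95 latency. The `fake_predict` stub is a placeholder, not any real vendor's API; swap in an actual request against their test endpoint.

```python
import time
import statistics

def benchmark(predict, payloads, warmup=3):
    """Time repeated calls to a prediction function and report
    median (p50) and p95 latency in milliseconds."""
    for p in payloads[:warmup]:  # warm up caches/connections first
        predict(p)
    latencies = []
    for p in payloads:
        start = time.perf_counter()
        predict(p)
        latencies.append((time.perf_counter() - start) * 1000)
    latencies.sort()
    p50 = statistics.median(latencies)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return p50, p95

# Stand-in for the vendor's API call -- replace with a real request.
def fake_predict(payload):
    time.sleep(0.01)  # simulate ~10 ms of model latency
    return {"label": "ok"}

p50, p95 = benchmark(fake_predict, [{"doc": i} for i in range(50)])
print(f"p50={p50:.1f} ms  p95={p95:.1f} ms")
```

Run it at your expected request volume, not the vendor's. A wide gap between p50 and p95 is exactly the "it depends" the sales team won't quantify.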
Question 7: "Can I export my data and models if I leave?"
Vendor lock-in is expensive. You need an exit strategy.
What you're looking for:
- Data export in standard formats (CSV, JSON, not proprietary)
- Whether you can export model weights (if you paid for custom training)
- Transition period and support for migration
Red flags:
- No data export capability
- Export only in proprietary format
- "Why would you want to leave?" (hostile to the question)
Category 3: The Economics Questions
The sticker price isn't the real price. Dig into total cost of ownership.
Question 8: "What's the all-in cost for year one?"
Get everything on the table:
- Base platform fee
- Implementation/setup fee
- Training data labeling costs
- Integration costs
- Per-user or per-transaction fees
- Support costs
- Required professional services
Red flags:
- "Pricing is custom, we'll figure it out"
- Implementation costs more than the annual platform fee
- Surprise fees for things that should be standard (API access, data export, etc.)
Question 9: "How does pricing scale with my usage?"
You're buying this to grow. Make sure the pricing model doesn't punish success.
What you're looking for:
- Clear pricing tiers
- Volume discounts
- Predictable scaling (not exponential cost increases)
Red flags:
- Per-transaction pricing with no caps
- Pricing jumps 10x at certain thresholds
- Can't give you an estimate for 2x or 10x your current volume
Example Scenario: The Per-Transaction Trap
(Hypothetical scenario based on common patterns)
A company signed with an AI vendor at $0.10 per transaction. Seemed cheap—they were processing 10,000 transactions/month, so $1,000/month. Two years later, they'd scaled to 500,000 transactions/month. Cost: $50,000/month = $600K/year, with no volume discounts, and they were locked in for another year. Rebuilding custom for $40K with hosting at $400/month would have paid for itself in under a month.
Question 10: "What's included in support, and what costs extra?"
Support is where vendors hide costs.
What you're looking for:
- Response time SLAs in the contract
- Whether support is included or extra
- Who you actually talk to (offshore tier 1 vs. engineers)
- Whether model retraining is included or extra
Red flags:
- Premium support costs 50%+ of the platform fee
- No SLAs in the contract
- Model updates or retraining cost thousands per iteration
Category 4: The Integration Questions
The AI is useless if it doesn't fit into your workflow.
Question 11: "How does this integrate with our existing systems?"
Be specific. Name your actual systems.
What you're looking for:
- Pre-built integrations with your ERP/CRM/data warehouse
- API documentation you can actually read
- Realistic integration timeline
Red flags:
- "We can integrate with anything" (vague = expensive)
- No API documentation available until after you sign
- Every integration is custom professional services
Question 12: "Who builds the integrations, and what's it cost?"
Can your team do it, or do you need to pay the vendor's professional services team at $300/hour?
What you're looking for:
- Self-service integration tools
- Clear documentation
- Fixed-price integration packages
Red flags:
- Only vendor can build integrations
- Time-and-materials pricing for integration work
- Integration costs more than the platform
Question 13: "What happens when our data schema changes?"
Your business evolves. The AI needs to keep up.
What you're looking for:
- How model retraining works when data changes
- Whether you can update mappings yourself
- Cost and timeline for adapting to schema changes
Red flags:
- Any schema change requires full reimplementation
- Model retraining costs $20K+ per iteration
- 3-month lead time for changes
Category 5: The Vendor Viability Questions
The technology might be great, but will the vendor exist in two years?
Question 14: "How long have you been in business?"
Not a dealbreaker if they're new, but you need to know what you're getting into.
What you're looking for:
- Company history and funding
- Customer count and retention rates
- Whether they're profitable or burning cash
Red flags:
- Series A startup with 6 months of runway asking for 3-year contracts
- Can't name 10 production customers
- Customer case studies are all from 6+ months ago
Question 15: "What's your customer retention rate?"
If customers aren't renewing, that tells you something.
What you're looking for:
- 80%+ retention rate (logo retention, not dollar retention)
- Willingness to share the number
- Explanation of why customers leave
Red flags:
- Won't share retention numbers
- Can't explain churn
- <60% retention (means 40%+ of their customers leave each year)
Question 16: "Can I talk to three customers who've been using this for 12+ months?"
Not cherry-picked references. Real customers, ideally in your industry, who've been through the full implementation and renewal cycle.
Questions to ask references:
- What surprised you during implementation?
- What didn't work as expected?
- How much time did it actually take vs. what they promised?
- What did it actually cost vs. what they quoted?
- Would you buy it again knowing what you know now?
Example Scenario: What References Actually Tell You
(Hypothetical scenario based on common patterns)
References for a company evaluating an AI forecasting tool revealed discrepancies. The vendor said implementation takes 6 weeks; three references all said it took 4-6 months and required significant data cleanup first. The vendor claimed 95% accuracy; references reported 70-80% on their actual data. The tool still might have been worth buying, but realistic expectations let the company negotiate the price down 40% to account for the longer implementation.
Category 6: The Contract Questions
Read the contract. Seriously. Here's what to look for.
Question 17: "What's the contract term, and what's the out?"
Don't sign multi-year contracts for unproven technology.
What you're looking for:
- 12-month initial term with annual renewals
- Performance-based out clauses
- Reasonable termination notice (30-90 days)
Red flags:
- 3-year minimum commitment
- Auto-renewal with no opt-out window
- Termination requires 6+ months notice
- Penalties for early termination exceed remaining contract value
Question 18: "What are your uptime SLAs and remedies?"
If the system is down, what happens?
What you're looking for:
- 99%+ uptime SLA in the contract
- Specific remedies (credits, not just apologies)
- Defined downtime (what counts as an outage)
Red flags:
- No uptime SLA in contract
- "Best effort" language
- Remedies are capped at trivial amounts
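"99% uptime" sounds impressive until you convert it into hours. A quick sketch of the arithmetic, using an average-length month:

```python
# Translate an uptime SLA percentage into allowed downtime per month,
# so "99%" stops sounding impressive.
HOURS_PER_MONTH = 730  # average month: 365.25 days * 24 h / 12

def allowed_downtime_hours(sla_pct):
    """Hours of downtime per month the SLA still permits."""
    return HOURS_PER_MONTH * (1 - sla_pct / 100)

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime -> {allowed_downtime_hours(sla):.2f} h/month allowed down")
```

A 99% SLA permits roughly 7.3 hours of downtime every month without breaching the contract. Decide what your use case can tolerate, then negotiate the number and the remedies to match.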
Question 19: "What happens if you get acquired or shut down?"
Startups get acquired. Companies fail. You need protection.
What you're looking for:
- Source code escrow for mission-critical systems
- Data export rights that survive company changes
- Transition assistance terms
Red flags:
- No continuity provisions in contract
- Won't discuss the scenario
- "We're not going anywhere" (famous last words)
Question 20: "Can we do a 90-day pilot before full commitment?"
The ultimate risk mitigation.
What you're looking for:
- Fixed-price pilot with clear success metrics
- Option to expand or terminate based on results
- Pilot cost credited toward full contract if you proceed
Red flags:
- "No pilots, full contract only"
- Pilot costs $50K+ (should be cheap to prove value)
- Vague pilot success criteria
How to Use These Questions
Don't just rapid-fire all 20 questions in a single meeting. Use them strategically:
First Meeting: Reality Check (Q1-3)
Verify they can actually solve your problem with your data. If they can't answer these convincingly, stop here.
Second Meeting: Technical Due Diligence (Q4-7)
Dig into what's under the hood. Bring your technical team. If the technology doesn't hold up, stop here.
Third Meeting: Economics & Integration (Q8-13)
Understand the true cost and implementation complexity. If the economics don't work, stop here.
Fourth Meeting: Vendor Viability & References (Q14-16)
Verify the vendor will be around and can actually deliver. Talk to references separately.
Contract Review: Legal & Risk (Q17-20)
Negotiate terms that protect you. Don't sign until you're comfortable with outs and risk mitigation.
Good Signs vs Red Flags Summary
| Good Signs | Red Flags |
|---|---|
| Shows production data and discusses failures openly | Only perfect demo data, won't discuss limitations |
| Offers paid POC with clear success metrics | Requires long-term contract before testing your data |
| Transparent pricing with predictable scaling | Vague costs, surprise fees, exponential scaling |
| 80%+ customer retention, shares numbers | Won't share retention, can't explain churn |
| References confirm timeline and cost accuracy | References reveal 3x time and cost vs. sales pitch |
| 12-month terms with performance outs | 3-year lock-in with punitive termination clauses |
| Data export and portability guaranteed | Proprietary formats, no export capability |
| Specific SLAs with meaningful remedies | "Best effort" with no guarantees |
The Bottom Line
Most AI vendors are selling futures, not solutions. Your job is to figure out which ones can actually deliver.
These 20 questions separate:
- Real technology from rebranded business rules
- Proven solutions from science projects
- Fair pricing from vendor lock-in traps
- Sustainable vendors from startups that won't exist in 18 months
Don't get dazzled by the demo. Do the diligence. Ask hard questions. Talk to references. Read the contract.
And remember: the best AI solution might be building custom with someone who knows your business, rather than buying a generic platform that kind of fits.
Need an Honest Second Opinion?
I help CFOs and finance leaders evaluate AI vendors and alternative solutions. I've seen enough vendor pitches to know what's real and what's marketing.
Sometimes the AI vendor is the right choice. Often, a custom-built solution delivers better results at 90% lower cost. I'll tell you honestly which makes sense for your situation.
Let's review the vendor proposals you're considering. I'll help you ask the right questions, interpret the answers, and make a decision you won't regret. No sales pitch—just straight talk from someone who builds this stuff and knows what actually works.