Use AI to Fight Spam, Bots, and Abuse at Scale

March 9, 2026
7 min read

Most platforms put AI in front of users. We put it underneath them. Here's how we built a multi-layer AI detection system that users never have to think about.

There's a version of AI integration that most platforms have landed on: surface a chatbot, slap a "powered by AI" badge on something, and call it a feature. Users notice. They also notice when it doesn't actually help them.

We think about AI differently at Carpathian. The most valuable place to put it isn't in front of your users, it's underneath them. Doing the unglamorous work they never have to think about. Protecting them without interrupting them. That's the philosophy behind how we've built AI into our own platform, and it's the same infrastructure we make available to you.

The Unglamorous Problem Nobody Talks About

Every platform deals with spam. Contact forms get hammered with gibberish. Signup flows get abused by automated scripts. API endpoints get probed by unfamiliar IPs. It's not a headline-worthy problem, but left unmanaged it quietly degrades everything: support queues, infrastructure costs, data quality, and the experience of the real people actually trying to use your product.

Rule-based filters don't hold up. They require constant maintenance, miss anything slightly novel, and create their own false-positive problem. Hiring someone to manually review submissions doesn't scale. And the usual answer, third-party spam services, means your users' data leaves your infrastructure before you've even decided what to do with it.

We wanted AI that was built and experienced differently: a multi-layer detection system that runs entirely on our own infrastructure, using the same AI models we host for customers. No third-party data sharing. No manual review queue. No rules to maintain.

How It Works

Layer 1: Heuristic Linguistic Analysis

The first layer is fast and deterministic. It runs on every form submission and signup before anything touches a model, catching the obvious cases cheaply.

It evaluates:

  • Character entropy — Shannon entropy of the character distribution. Real English text falls within a predictable range. Random strings don't.
  • Vowel ratio — Natural language has a consistent vowel-to-consonant balance. Text that skews far outside that range is almost always generated.
  • Consonant runs — Long sequences of consecutive consonants rarely appear in real words.
  • Word coverage — What percentage of submitted words appear in a standard English word list. Legitimate messages score high. Bot-generated text doesn't.
  • Bigram frequency — Common two-letter combinations in English follow predictable patterns. Unusual distributions get flagged.
  • Keyboard mashing detection — QWERTY-adjacent key sequences and repeated characters that indicate automated input rather than real typing.
  • User agent analysis — Known bot signatures, headless browsers, and suspicious client fingerprints get caught here.

Each check produces a weighted score. The weights are configurable per field, because a name field with high entropy is more suspicious than a message body with the same score. The output is a confidence score plus a specific set of flags explaining what triggered. If the combined score is high enough, the submission is blocked immediately; if it's borderline, it moves to layer two.
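To make the scoring concrete, here is a minimal sketch of two of the checks above (character entropy and vowel ratio) combined into a weighted score with flags. The thresholds, weights, and function names are illustrative, not Carpathian's production values.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def vowel_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are vowels."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in "aeiou" for ch in letters) / len(letters)

def heuristic_score(text: str, weights=None) -> tuple[float, list[str]]:
    """Combine checks into a weighted spam score plus explanatory flags.
    Thresholds here are rough illustrations of 'outside the range of
    real English text', not tuned production values."""
    weights = weights or {"entropy": 0.5, "vowels": 0.5}
    score, flags = 0.0, []
    if shannon_entropy(text) > 4.3:  # English prose usually sits near 4 bits/char
        score += weights["entropy"]
        flags.append("high_entropy")
    vr = vowel_ratio(text)
    if vr < 0.25 or vr > 0.55:       # natural English is roughly 0.35-0.45
        score += weights["vowels"]
        flags.append("vowel_ratio_out_of_range")
    return score, flags
```

A real implementation would add the remaining checks (consonant runs, word coverage, bigram frequency, and so on) as further weighted terms, with per-field weight tables.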

This layer catches 80–90% of obvious spam before a model ever sees it, keeping inference costs low and latency down.

Layer 2: LLM-Powered Review

Submissions that pass heuristic analysis but still look questionable go to an isolated large language model running on our infrastructure via Ollama. Submissions never leave our network and are never viewed by a third party.

The model receives the full submission in context (name, email, subject, message body, and any heuristic flags raised) and returns a structured JSON response with one of three decisions:

  • Approve — The submission looks legitimate.
  • Flag — Something is off, but it's not clearly spam. Notify the admin.
  • Reject — The submission is spam, fraudulent, or abusive.

Each decision includes a confidence score and a written explanation. That explanation matters: it gives the admin team real context when reviewing edge cases, and it builds a labeled dataset that improves the system over time.

The system is designed to fail open. If the model is slow or temporarily unavailable, the submission gets approved rather than blocked. Blocking a legitimate user because your spam filter had a bad moment is a much worse outcome than letting one spam message through. You can clean up spam. You can't undo a lost customer.
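The review-plus-fail-open flow can be sketched as a small wrapper. Here `model_client` is a hypothetical stand-in for a call to Ollama's API; the exact prompt, response schema, and names are assumptions for illustration. The key behavior is that any error or malformed response results in approval, never a block.

```python
import json

VALID_DECISIONS = {"approve", "flag", "reject"}

def review_submission(submission: dict, model_client) -> dict:
    """Ask the model for a structured decision; fail open on any error.
    `model_client` is a hypothetical callable wrapping an Ollama request."""
    prompt = (
        "Classify this form submission as approve, flag, or reject. "
        'Respond with JSON: {"decision": ..., "confidence": ..., "explanation": ...}\n'
        + json.dumps(submission)
    )
    try:
        raw = model_client(prompt)
        result = json.loads(raw)
        if result.get("decision") not in VALID_DECISIONS:
            raise ValueError("malformed model response")
        return result
    except Exception:
        # Fail open: a slow or broken model must never block a real user.
        return {
            "decision": "approve",
            "confidence": 0.0,
            "explanation": "model unavailable; failing open",
        }
```

Because the fail-open path lives in one place, a timeout, a connection error, and a garbled response all degrade the same way: the submission goes through and the incident is left to logging.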

Automated Account Moderation

When the AI review layer evaluates a new signup, the platform takes automatic action based on the confidence score:

  • High confidence (60%+) — The account gets permanently suspended. The admin receives a full report with the model's reasoning.
  • Medium confidence (30–60%) — The account gets a 30-day temporary suspension, giving the team time to manually review without permanently blocking someone who might be legitimate.
  • Low confidence (below 30%) — The account stays active. The flag is logged and the admin gets notified, but no action is taken against the user.
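The three tiers reduce to a simple mapping from confidence score to action. This sketch mirrors the thresholds above; the function and action names are illustrative.

```python
def moderation_action(confidence: float) -> str:
    """Map an AI spam-confidence score in [0.0, 1.0] to an account action.
    Thresholds mirror the tiers described above; names are illustrative."""
    if confidence >= 0.60:
        return "permanent_suspension"      # high confidence: suspend, send full report
    if confidence >= 0.30:
        return "temporary_suspension_30d"  # medium: 30-day hold pending manual review
    return "log_and_notify"                # low: account stays active, admin notified
```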

The system also tracks invitation chains. If a user invites someone who later gets flagged, that relationship is recorded, helping surface patterns where a single actor is spinning up multiple accounts through the referral system.

Every decision feeds into a labeled training dataset. The model gets better over time because it has more real examples of what spam and legitimate users look like on this specific platform.

Progressive IP Blocking

Repeated failures from the same IP trigger escalating responses:

  • 3 failures within 1 hour — 1-hour temporary block
  • 5 failures within 24 hours — 24-hour temporary block
  • 10+ failures within 7 days — Permanent block

Temporary blocks expire automatically. A legitimate user who trips a false positive waits out the block and tries again; they aren't permanently locked out. Permanent blocks require manual removal. If an IP has failed 10+ times in a week, it's almost certainly not a real person.
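The escalation policy above can be expressed as a small in-memory sketch. A production version would persist the timestamps (for example in Redis) rather than a module-level dict; the names here are illustrative.

```python
import time
from typing import Optional

HOUR, DAY, WEEK = 3600, 86400, 604800
failures: dict[str, list[float]] = {}  # ip -> failure timestamps (in-memory sketch)

def record_failure(ip: str, now: Optional[float] = None) -> Optional[str]:
    """Record a failure for `ip` and return the block to apply, if any.
    Checks the most severe tier first so escalation wins over lighter blocks."""
    now = time.time() if now is None else now
    events = failures.setdefault(ip, [])
    events.append(now)
    # Keep only events inside the longest window we care about.
    events[:] = [t for t in events if now - t <= WEEK]
    if len(events) >= 10:
        return "permanent"
    if len([t for t in events if now - t <= DAY]) >= 5:
        return "block_24h"
    if len([t for t in events if now - t <= HOUR]) >= 3:
        return "block_1h"
    return None
```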

Geo-Aware AI Firewall

Every AI inference endpoint sits behind a geo-aware software firewall. When a new IP makes its first request to a model instance, the system performs a GeoIP lookup, logs the IP with geographic context, sets it to pending, and notifies the instance admin. The admin then decides whether to approve, block, or continue monitoring.

The firewall runs in two modes: monitor (all requests allowed, unknown IPs logged) and enforce (unknown IPs blocked by default until approved). It integrates with a Redis-backed rate limiter: each AI instance has configurable requests-per-minute limits enforced through a sliding window. If an IP triggers three or more blocks within a five-minute window, the instance auto-locks and the admin gets an alert.
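The sliding-window limiter works by keeping each IP's request timestamps and evicting anything older than the window. This is a minimal in-memory stand-in for the Redis-backed version; in production the timestamps would typically live in a shared store (for example a Redis sorted set) so every app server sees the same window.

```python
import time
from collections import deque
from typing import Optional

class SlidingWindowLimiter:
    """In-memory sketch of a per-IP sliding-window rate limiter.
    Class and parameter names are illustrative, not the production API."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits: dict[str, deque] = {}

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True if the request is within the limit, recording it."""
        now = time.time() if now is None else now
        q = self.hits.setdefault(ip, deque())
        while q and now - q[0] > self.window:  # evict entries outside the window
            q.popleft()
        if len(q) >= self.max_requests:
            return False  # over limit: reject without recording
        q.append(now)
        return True
```

Unlike a fixed-minute counter, the window slides continuously, so a burst straddling a minute boundary can't double the effective limit.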

What We Learned

Put AI where users don't have to think about it. The best AI integration is invisible. Users don't see the spam filter working; they just never see the spam. That's the goal.

Heuristics first, models second. Fast deterministic checks handle the obvious cases. The LLM handles the ones that require judgment. Stacking them this way keeps costs low and quality high.

Fail open, not closed. Every AI-dependent component is designed to approve on failure. An overloaded model should never be the reason a real customer gets blocked.

Confidence thresholds beat binary decisions. Hard approve/reject boundaries create too many false positives or false negatives. A three-tier system with graduated thresholds lets you be aggressive on obvious abuse while staying careful on edge cases.

Log everything with context. Every detection result, model decision, IP block, and security event gets logged with full reasoning attached. This makes debugging tractable, gives the team visibility, and generates the training data needed to keep improving.

The Same Infrastructure, Available to You

The models powering this system (spam detection, content moderation, text classification, natural language processing) are available to Carpathian customers today. We host open-source and fine-tuned LLMs on our own compute infrastructure and provide OpenAI-compatible API access with token-based billing and the same rate limiting described above.

If you're building something and want AI that works quietly in the background — protecting your users, improving your data quality, and scaling without a manual review queue — get in touch or learn more about Carpathian AI.

The goal was never to impress your users with AI. It was to build something they never have to think about.

About the Author

Samuel Malkasian

Founder | Carpathian AI