AWS Managed AI Services

The Pre-Trained Toolkit You Don't Have to Train
AWS has a layer of AI services that sit below Bedrock and SageMaker in complexity but are, honestly, the ones most teams will reach for first in production. These are fully managed, pre-trained, API-only services. No datasets, no training jobs, no model selection — you send data in, you get intelligence back. They cover NLP, computer vision, speech, search, recommendations, and forecasting. If you're a software engineer wiring up an application, these are your building blocks.
The mental model: SageMaker is "I'll build and train my own model." Bedrock is "I'll use a foundation model via API." Managed AI services are "I just need sentiment analysis on this text — give me a REST endpoint." They're the most constrained and the easiest to adopt.
The Language Services
Amazon Comprehend — NLP as an API
Comprehend does natural language processing. You hand it text, it gives you back structured data: entities (people, places, organizations, dates), key phrases, sentiment (positive, negative, neutral, mixed), syntax (parts of speech), and language detection. It's the Swiss Army knife for text analysis when you don't want to think about models.
The killer feature most people overlook: PII detection and redaction. Comprehend can scan text — support tickets, emails, product reviews, social media posts — and identify personally identifiable information (names, addresses, SSNs, credit card numbers, etc.), then redact it. If you're building a search pipeline and need to strip PII before indexing, Comprehend is the answer. Not Textract, not Kendra — Comprehend.
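To make the redaction flow concrete, here's a minimal sketch with boto3. The `detect_pii_entities` API returns character offsets rather than redacted text, so the splicing is up to you; the `redact_pii` helper name is ours, and a real call assumes AWS credentials are configured.

```python
def redact_pii(text, client=None, language_code="en"):
    """Replace every PII span Comprehend finds with a [TYPE] placeholder."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("comprehend")
    resp = client.detect_pii_entities(Text=text, LanguageCode=language_code)
    # Splice right-to-left so earlier character offsets stay valid as we edit.
    for ent in sorted(resp["Entities"], key=lambda e: e["BeginOffset"], reverse=True):
        text = text[: ent["BeginOffset"]] + f"[{ent['Type']}]" + text[ent["EndOffset"] :]
    return text
```

For batch workloads, Comprehend's asynchronous PII jobs can also emit redacted output directly, so you don't have to splice offsets yourself.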
There's also Comprehend Medical, a specialized variant trained on clinical text. It extracts medical entities like medications, diagnoses, dosages, and procedures from unstructured clinical notes. If you're in healthcare and need to parse doctor's notes into structured data, this is the service.
A common architecture pattern: pipe audio through Transcribe to get text, then feed that text into Comprehend for sentiment analysis. This is how you'd analyze customer service call recordings at scale — Transcribe handles speech-to-text, Comprehend handles the NLP.
Amazon Translate — Neural Machine Translation
Neural machine translation, on demand. You send it text in one language, you get it back in another. It supports dozens of languages and can auto-detect the source language. It's not doing rule-based translation — it uses deep learning models that handle context, idioms, and grammar far better than older approaches.
Use cases: localizing application content, translating user-generated content in real time, enabling multilingual customer support, or preprocessing foreign-language documents before feeding them to Comprehend for analysis.
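The API surface is correspondingly small. A minimal sketch (the `translate_to` helper name is ours; `source="auto"` asks the service to detect the input language, and a real call assumes AWS credentials are configured):

```python
def translate_to(text, target="en", source="auto", client=None):
    """Translate text; returns (translated text, detected/declared source language)."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("translate")
    resp = client.translate_text(
        Text=text,
        SourceLanguageCode=source,   # "auto" = let Translate detect the language
        TargetLanguageCode=target,
    )
    return resp["TranslatedText"], resp["SourceLanguageCode"]
```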
Not much else to say — it does one thing well.
The Document and Speech Services
Amazon Textract — OCR That Actually Understands Structure
Textract goes beyond traditional OCR. Yes, it extracts printed text and handwriting from scanned documents. But the real value is that it understands document structure — forms (key-value pairs), tables (rows and columns), and layout elements. If you scan an invoice, Textract doesn't just give you a blob of text — it gives you "Invoice Number: 12345" as a structured key-value pair and "Line Items" as a table you can programmatically iterate.
This is the service for document processing pipelines: mortgage applications, tax forms, insurance claims, medical records. Any scenario where humans are currently reading paper documents and typing data into systems.
Important distinction: Textract extracts text from images and documents. If you then need to analyze that text for entities, sentiment, or PII — that's Comprehend's job. They chain together: Textract extracts, Comprehend analyzes.
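Textract's FORMS output arrives as a flat list of linked blocks (KEY_VALUE_SET, WORD) that you walk by ID, which is the part people underestimate. A minimal sketch of turning that graph into a plain dict (the `extract_form_fields` helper is ours; a real call assumes AWS credentials are configured):

```python
def extract_form_fields(document_bytes, client=None):
    """Return {key: value} pairs from a scanned form via Textract FORMS analysis."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("textract")
    blocks = client.analyze_document(
        Document={"Bytes": document_bytes}, FeatureTypes=["FORMS"]
    )["Blocks"]
    by_id = {b["Id"]: b for b in blocks}

    def text_of(block):
        # Concatenate the WORD children linked to a KEY or VALUE block.
        words = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                words += [by_id[i]["Text"] for i in rel["Ids"]
                          if by_id[i]["BlockType"] == "WORD"]
        return " ".join(words)

    fields = {}
    for b in blocks:
        if b["BlockType"] == "KEY_VALUE_SET" and "KEY" in b.get("EntityTypes", []):
            value_text = ""
            for rel in b.get("Relationships", []):
                if rel["Type"] == "VALUE":  # follow the link from KEY to its VALUE block
                    value_text = " ".join(text_of(by_id[i]) for i in rel["Ids"])
            fields[text_of(b)] = value_text
    return fields
```

Note that `analyze_document` with inline bytes is the synchronous path for single-page documents; multi-page PDFs go through the asynchronous job APIs against S3.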
Amazon Transcribe — Speech to Text
Transcribe converts audio to text using automatic speech recognition (ASR). Upload an audio file or stream audio in real time, and it produces a transcript with timestamps, speaker identification (diarization), and punctuation.
Key capabilities worth knowing:
PII redaction: Transcribe can automatically identify and redact PII in transcriptions. Think call center recordings where customers give credit card numbers.
Toxicity detection: it can flag toxic content in audio, useful for content moderation in user-generated audio platforms.
Custom vocabularies: you can feed it domain-specific terms (product names, technical jargon) so it transcribes them correctly.
Transcribe Medical: a HIPAA-eligible variant for medical transcription — clinical conversations, doctor-patient interactions.
The audio-to-insight pipeline is one of the most tested architectures: Transcribe (audio → text) → Comprehend (text → sentiment/entities). You'll see this pattern in practically every call center analytics solution on AWS.
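Kicking off the speech-to-text half of that pipeline is one API call. A sketch that enables both speaker diarization and PII redaction (the helper name and the two-speaker assumption are ours; a real call assumes AWS credentials and an audio file in S3):

```python
def start_redacted_transcription(job_name, media_s3_uri, client=None):
    """Start an async transcription job with speaker labels and PII redaction."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("transcribe")
    client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_s3_uri},
        LanguageCode="en-US",
        # Redact PII in the output transcript (e.g. credit card numbers on calls).
        ContentRedaction={"RedactionType": "PII", "RedactionOutput": "redacted"},
        # Diarization: label who said what, assuming a two-party call.
        Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2},
    )
    return job_name
```

The job is asynchronous: you poll `get_transcription_job` until it completes, then fetch the transcript JSON from the URI in the job's `Transcript` field.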
Amazon Polly — Text to Speech
Polly goes the other direction: text in, lifelike speech out. It supports dozens of languages and voices, in both standard and neural variants; the neural voices use deep learning to produce markedly more natural, human-sounding speech.
Use cases: accessibility features, voice-enabled applications, IVR systems, e-learning platforms that need audio narration, or generating spoken versions of articles. You can also use SSML (Speech Synthesis Markup Language) to control pronunciation, emphasis, pauses, and speaking rate.
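A minimal sketch of generating narration (the `narrate_to_mp3` helper and the choice of the `Joanna` voice are ours; a real call assumes AWS credentials are configured):

```python
def narrate_to_mp3(text, out_path, voice="Joanna", client=None):
    """Synthesize text with a neural voice and save the MP3 audio stream."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("polly")
    resp = client.synthesize_speech(
        Text=text, VoiceId=voice, OutputFormat="mp3", Engine="neural"
    )
    with open(out_path, "wb") as f:
        f.write(resp["AudioStream"].read())  # AudioStream is a streaming body
    return out_path
```

To use SSML controls, wrap the input in `<speak>` tags and pass `TextType="ssml"` alongside the text.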
The Conversation Service
Amazon Lex — Chatbot Brains
Lex is the service behind Alexa's conversational capabilities, exposed as an API you can build on. It combines automatic speech recognition (ASR) and natural language understanding (NLU) to build conversational interfaces — chatbots and voice bots.
You define intents (what the user wants to do, like "BookHotel" or "CheckBalance"), utterances (the ways they might phrase it), and slots (the parameters you need to collect, like dates or room types). Lex handles the dialog management — guiding the conversation, asking follow-up questions, validating input — and integrates with Lambda for fulfillment logic.
Use cases: FAQ bots for technical support, HR benefits chatbots, order status bots, or any scenario where you need a conversational interface that routes to backend logic. The key distinction from Q Business: Lex is a toolkit for building custom bots with defined conversation flows. Q Business is a managed RAG assistant over enterprise data. Different tools for different problems.
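At runtime, a single turn of a text conversation is one call to the Lex V2 runtime. A sketch (the `ask_bot` helper is ours; `bot_id` and `bot_alias_id` are placeholders for the identifiers Lex assigns to your bot, and a real call assumes AWS credentials):

```python
def ask_bot(user_text, bot_id, bot_alias_id, session_id, client=None):
    """Send one turn of text to a Lex V2 bot and return its reply messages."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("lexv2-runtime")
    resp = client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId="en_US",
        sessionId=session_id,  # reuse the same id across turns to keep dialog state
        text=user_text,
    )
    return [m["content"] for m in resp.get("messages", [])]
```

Lex tracks slot-filling progress server-side per `sessionId`, which is why the same id must be reused for every turn of one conversation.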
The Intelligence Services
Amazon Rekognition — Computer Vision
Rekognition does image and video analysis. You send it an image, it gives you back labels (objects, scenes, activities), faces (detection, comparison, search), text in images, celebrities, and content moderation scores (unsafe content detection).
Core capabilities:
Object and scene detection: "this image contains a dog, a park, and a frisbee"
Face detection and analysis: detects faces and estimates attributes (age range, emotion, glasses, etc.)
Face comparison and search: compare faces across images, or search a collection of indexed faces
Content moderation: detects explicit, suggestive, or violent content — critical for any user-generated content platform
Celebrity recognition: identifies well-known individuals
Text in images: reads text that appears in photos (signs, license plates, etc.)
Custom Labels: you can train custom image classification models within Rekognition if the built-in labels don't cover your domain
What Rekognition does NOT do: it doesn't handle multilingual translation, document structure extraction, or any NLP tasks. It's purely visual. If someone asks about enabling multilingual experiences, that's Translate. If you need to extract form data from a scanned document, that's Textract.
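The content-moderation path is typical of how Rekognition is wired into an upload flow. A minimal gate (the `moderation_labels` helper and the 80% threshold are ours; a real call assumes AWS credentials):

```python
def moderation_labels(image_bytes, min_confidence=80.0, client=None):
    """Return moderation label names Rekognition flags above the confidence threshold."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("rekognition")
    resp = client.detect_moderation_labels(
        Image={"Bytes": image_bytes}, MinConfidence=min_confidence
    )
    # An empty list means nothing was flagged; otherwise route to review/reject.
    return [label["Name"] for label in resp["ModerationLabels"]]
```

For images already in S3 you'd pass `Image={"S3Object": {...}}` instead of raw bytes; the same shape applies to `detect_labels` and the face APIs.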
Amazon Kendra — Enterprise Search
Kendra is a managed enterprise search service powered by ML. You point it at your data sources (S3, SharePoint, Confluence, databases, websites — similar connector model to Q Business), and it builds a search index that understands natural language queries.
The difference from traditional keyword search: Kendra uses ML to understand the intent behind a query, not just match keywords. Ask "how do I reset my password?" and it finds the relevant IT article even if those exact words don't appear.
Kendra also serves as a retriever in RAG architectures. If you're building a retrieval-augmented generation pipeline, Kendra can be the retrieval layer that feeds relevant documents to a foundation model for answer generation. It's not the only option (OpenSearch, pgvector, Pinecone all work), but it's the managed, enterprise-connectors-included option.
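Querying an existing index is a single call. A sketch of pulling titles and excerpts — roughly what you'd feed to a foundation model as retrieval context (the `search_index` helper is ours; `index_id` is the id of an index you've already built, and a real call assumes AWS credentials):

```python
def search_index(index_id, question, client=None):
    """Run a natural-language query and return (title, excerpt) pairs."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("kendra")
    resp = client.query(IndexId=index_id, QueryText=question)
    return [
        (item["DocumentTitle"]["Text"], item["DocumentExcerpt"]["Text"])
        for item in resp["ResultItems"]
    ]
```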
Amazon Personalize — Recommendation Engine
Personalize is a managed recommendation service. You provide it three types of data: users (demographics, attributes), items (catalog — products, content, whatever you're recommending), and interactions (clicks, purchases, views, ratings). It trains custom models on your data and serves real-time recommendations via API.
Use cases: "customers who bought this also bought," personalized homepage content, re-ranking search results by user preference, personalized email campaigns. Any scenario where you need to surface the most relevant items for a specific user based on their behavior history.
It's fully managed — you don't pick algorithms or tune hyperparameters. You feed it data, it builds models, you call the API. A common prep pattern: if your data lives in multiple sources, use SageMaker Data Wrangler to prepare and transform it before importing into Personalize.
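Serving recommendations at runtime is a call to the separate `personalize-runtime` client against a deployed campaign. A sketch (the `recommend_items` helper is ours; `campaign_arn` is the ARN of a campaign you've already trained and deployed, and a real call assumes AWS credentials):

```python
def recommend_items(campaign_arn, user_id, n=10, client=None):
    """Fetch the top-n item ids Personalize recommends for this user."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("personalize-runtime")
    resp = client.get_recommendations(
        campaignArn=campaign_arn, userId=user_id, numResults=n
    )
    return [item["itemId"] for item in resp["itemList"]]
```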
Amazon Forecast — Time-Series Prediction
Forecast is a managed time-series forecasting service. You give it historical time-series data (sales figures, web traffic, inventory levels, server capacity usage), and it produces predictions about future values.
Use cases that define the service:
Retail demand planning: predict product demand to vary inventory and pricing by store location
Supply chain planning: predict raw goods requirements for manufacturing
Resource planning: forecast staffing needs, advertising spend, energy consumption, server capacity
Operational planning: predict web traffic, AWS usage, IoT sensor patterns
Like Personalize, it's a no-ML-experience-required service. You provide the data, Forecast selects the best algorithm (or ensemble of algorithms) automatically, trains models, and serves predictions.
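Reading predictions back also goes through a dedicated query-side client. A sketch (the `demand_forecast` helper is ours; `forecast_arn` is the ARN of a forecast you've already generated, the `item_id` filter key assumes the default retail dataset schema, and a real call assumes AWS credentials):

```python
def demand_forecast(forecast_arn, item_id, client=None):
    """Query one item's forecast; returns quantile bands such as p10/p50/p90."""
    if client is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        client = boto3.client("forecastquery")  # query-side client, not "forecast"
    resp = client.query_forecast(
        ForecastArn=forecast_arn,
        Filters={"item_id": item_id},  # filter key must match your dataset schema
    )
    return resp["Forecast"]["Predictions"]
```

The quantile bands are the point: p50 is the median prediction, while p10/p90 bound it, which is what lets you trade off under- vs over-stocking.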
The Service Selection Cheat Sheet
The most common source of confusion is knowing which service handles which task. Here's the decision tree:
I need to... → Use this
Extract sentiment, entities, key phrases from text → Comprehend
Detect/redact PII in text → Comprehend
Analyze clinical/medical text → Comprehend Medical
Analyze images or video (faces, objects, moderation) → Rekognition
Extract structured data from scanned documents → Textract
Convert audio to text → Transcribe
Convert text to speech → Polly
Translate text between languages → Translate
Build a chatbot with defined conversation flows → Lex
Add enterprise search over company data → Kendra
Generate personalized recommendations → Personalize
Predict future values from time-series data → Forecast
Common Architecture Patterns
These services are designed to compose together. The patterns you'll see most often:
Call center analytics: Audio → Transcribe → Comprehend (sentiment, entities, PII redaction). This is probably the single most common multi-service pipeline.
Document processing: Scanned documents → Textract (extract text + structure) → Comprehend (classify, extract entities, redact PII) → store in database or search index.
Content moderation: User-uploaded images → Rekognition (moderation labels) → approve/reject. User-uploaded text → Comprehend (sentiment + custom classification) → flag toxic content.
Enterprise search with PII compliance: Documents → Comprehend (detect and redact PII) → Kendra (index clean documents) → search interface.
Multilingual support pipeline: Foreign-language text → Translate → Comprehend (now in English, can do sentiment/entities) → respond.
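The glue for the call center pattern is thinner than it sounds: a finished Transcribe job hands you a URI to a JSON transcript, and Comprehend takes it from there. A sketch of that handoff (the `call_sentiment` helper and the truncation limit are ours; `detect_sentiment` caps input at roughly 5 KB, so long calls need chunking, and a real call assumes AWS credentials):

```python
import json
import urllib.request

def call_sentiment(transcript_uri, comprehend=None):
    """Read a finished Transcribe job's output JSON and score overall sentiment."""
    if comprehend is None:
        import boto3  # lazy import so the helper can be unit-tested with a stub client
        comprehend = boto3.client("comprehend")
    with urllib.request.urlopen(transcript_uri) as resp:
        data = json.load(resp)
    # Transcribe's output JSON nests the full transcript under results.transcripts.
    text = data["results"]["transcripts"][0]["transcript"]
    result = comprehend.detect_sentiment(Text=text[:4500], LanguageCode="en")
    return result["Sentiment"], result["SentimentScore"]
```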
The Bottom Line
These managed services exist to solve a specific problem: you need AI capabilities but you don't want to become an ML team. No training data, no model selection, no infrastructure management — just API calls and results. They're the foundation of countless production systems on AWS, and for good reason. Know what each one does, know which ones compose together, and you can architect an AI-powered application without ever touching a Jupyter notebook.