Glossary of AI, ML and automation terms

Authors: Karol Jurewicz & Michael Jan Rogocki · Last updated:

A glossary of the terms that come up when implementing AI, automation and data analysis in companies — explained simply, without unnecessary jargon.

Each definition is written for business owners and managers, not data scientists. Where a term ties to a broader topic, we link to the relevant article in the cm-opti knowledge base.

A

A/B testing — a method of comparing two variants (e.g. of a process, interface or model) on real data to see which delivers better results. → What is data analysis and BI?
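
To show how "which variant delivers better results" can be checked rather than eyeballed, here is a minimal sketch of a two-proportion z-test, one common way to compare two conversion rates. The function name and the traffic figures are invented for illustration, and a real experiment would also need a sample size planned in advance.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("Cannot test: no variation in the data")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: variant A converts 120 of 2,400 visitors, variant B 156 of 2,400.
z, p_value = two_proportion_z_test(120, 2400, 156, 2400)
significant = p_value < 0.05
```

A small p-value suggests the difference is unlikely to be pure chance; the 0.05 cut-off is a convention, not a rule.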

Accuracy — a model-quality metric in ML: the percentage of correct predictions relative to all cases. Not always sufficient — with imbalanced data it can be misleading (see: precision, recall, F1 score).

ADAS (Advanced Driver Assistance Systems) — driver-assistance systems that use Computer Vision and sensors to analyze a vehicle's surroundings in real time. Typical functions: recognizing lanes, pedestrians and road signs, and automatic emergency braking. → AI use cases in business — catalog

Agentic AI — the newest layer of the AI pyramid, built on top of Generative AI. Systems that combine a language model with access to tools, contextual memory and autonomy in carrying out complex tasks. → What is AI?

Agentic workflow — a workflow in which an AI agent carries out a sequence of tasks on its own: it takes input data, makes decisions, uses tools (e.g. sends emails, queries databases) and reports the result. Unlike simple automation, an agent responds to context and exceptions. → What is RAG and an AI agent?

Agile — an iterative approach to project management based on short cycles (iterations), fast feedback and continuous adaptation of the plan to reality. → What is process optimization?

AI (Artificial Intelligence) — the ability of computer systems to perform tasks that require processing data in a way that ultimately resembles human reasoning: pattern recognition, classification, forecasting. → What is AI?

AI agent — an artificial intelligence system that carries out multi-step tasks on its own: it searches for information, compares data, performs actions in company systems. Unlike a chatbot, an agent doesn't just answer — it acts. → What is RAG and an AI agent?

AI Engineer — a specialist who builds production infrastructure for AI solutions: API, monitoring, scaling, retraining. Responsible for making the model work not in a research notebook but in the company's system — stably and securely.

AI hallucinations — a situation in which an AI model generates information that looks credible but is false. One of the main challenges in deploying LLMs. → What is AI?

AI Software Engineering — a discipline that combines software engineering with knowledge of AI/ML models. It covers designing, building and deploying systems based on artificial intelligence.

AI strategy — a plan defining where and how a company will use artificial intelligence: which processes to automate, what data to collect, in what order to implement, how much it will cost and how to measure results. Without a strategy, AI implementations are chaotic and don't deliver lasting results.

Annotation — manually labeling training data (e.g. marking defects on images, tagging document categories). The quality of annotation directly affects the quality of the model. → What is Computer Vision?

Anonymization — the irreversible removal of personal data from a dataset, so that a specific individual cannot be identified. Often necessary when training AI models on customer data. It differs from pseudonymization, which is reversible. → What is RAG and an AI agent?

API (Application Programming Interface) — an interface through which systems communicate with one another. It allows, for example, data to be passed automatically from a CRM to an accounting system without human involvement. → What is systems integration?

Audit trail — an automatic record of who did what, and when, in the system. Essential in regulated industries and in AI implementations — it makes it possible to reconstruct the basis on which the system made a decision.

Auto-scaling — automatically adjusting resources (servers, compute power) to the current load. When traffic grows, the system adds resources; when it falls, it scales down. The company doesn't pay for idle capacity.

Automation — shifting repetitive, predictable activities from a human to a system. A spectrum ranging from simple Excel macros to AI agents carrying out multi-stage tasks. → What is automation?

AWS (Amazon Web Services) — the largest cloud platform in the world. It offers services from data storage (S3), through servers (EC2), to ready-made AI/ML tools (SageMaker, Bedrock). cm-opti builds solutions on the AWS cloud.

B

Backlog — an ordered list of tasks and requirements to be delivered in a project. Items at the top have the highest priority. A backlog is a living list: it changes as new information emerges.

BaFin (Bundesanstalt für Finanzdienstleistungsaufsicht) — the German federal financial supervisory authority, the counterpart of Poland's KNF. It regulates banks, insurers and capital markets in Germany. An important context for AI implementations in the financial sector on the German market. → AI use cases in business — catalog

Balanced scorecard — a strategic-management tool that measures a company's performance across four perspectives: financial, customer, internal processes and growth. It helps translate strategy into measurable goals. → What is data analysis and BI?

Batch processing — processing data in batches (e.g. once a day, once an hour), as opposed to real-time processing. Cheaper, but with a delay. → What is data analysis and BI?

BERT (Bidirectional Encoder Representations from Transformers) — a language model developed by Google that analyzes text in both directions at once. The foundation of many NLP use cases in companies: document classification, sentiment analysis, semantic search. → What is OCR and NLP?

Bias — a systematic error in an AI model's results stemming from imbalances in the training data. If a model learned from data in which certain groups were underrepresented, it can reproduce those imbalances in its decisions. Especially important in recruitment, credit scoring and insurance. → What is AI?

Big Data — datasets too large or too complex to process with traditional tools. Characterized by the 3 Vs: Volume, Velocity (the speed of inflow) and Variety. → What is AI?

Bill of Lading — the basic document in sea transport, confirming that cargo has been received on board a vessel. It contains the sender's and recipient's details, a description of the goods and the terms of carriage. In sea freight forwarding, one of the documents processed by OCR + NLP systems. → AI use cases in business — catalog

BIM (Building Information Modeling) — a digital model of a building or infrastructure containing data on geometry, materials, installations and schedule. Combined with AI, it makes it possible to track construction progress, detect deviations from the plan and simulate scenarios. → AI use cases in business — catalog

Bottleneck — a point in a process that slows down the entire workflow. Identifying bottlenecks is the first step in optimization — because improving anything else won't help as long as the bottleneck exists. → What is process optimization?

BPA (Business Process Automation) — the automation of business processes beyond individual tasks — it covers the entire workflow, from start to finish. → What is automation?

BPMN (Business Process Model and Notation) — a graphical standard for modeling business processes. It lets you draw a workflow in a way that's understandable to both the business and IT. → What is process optimization?

Business case — a document or analysis that justifies an investment: how much the implementation costs, how much it will save or earn, how quickly it pays back (payback period, ROI) and what the risks are. The basis for every decision about an AI implementation.

Business Intelligence (BI) — a set of technologies, processes and tools for transforming raw data into information that supports business decision-making. → What is data analysis and BI?

C

CAC (Customer Acquisition Cost) — the cost of acquiring one new customer (marketing budget / number of new customers). Compared with CLV — if CAC > CLV, the company loses money on every new customer. → What is process optimization?
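
The formula above can be written out as a short calculation. This is only a sketch: the budget, the number of customers and the CLV value are made-up figures.

```python
def customer_acquisition_cost(marketing_budget: float, new_customers: int) -> float:
    """CAC = marketing budget / number of new customers."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return marketing_budget / new_customers

# Hypothetical quarter: 50,000 spent on marketing, 200 new customers won.
cac = customer_acquisition_cost(marketing_budget=50_000, new_customers=200)

# Hypothetical forecast of the revenue one customer brings over the whole relationship.
clv = 900.0

# If CAC exceeds CLV, every new customer costs more than they bring in.
acquisition_pays_off = clv > cac
```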

Chain-of-thought (CoT) — a prompting technique in which you ask the AI model to show its reasoning step by step, instead of giving just the answer. It improves the quality of answers to complex questions, because it forces a structure of thinking.

Change Management — managing change in an organization. It covers preparing people, processes and company culture for the implementation of new solutions. → What is process optimization?

Chatbot — a computer program that holds a conversation with a user. Unlike an AI agent, a chatbot answers questions but doesn't carry out tasks in systems on its own. → What is RAG and an AI agent?

Chunking — splitting a document into smaller fragments (chunks) before storing it in a vector database. The quality of chunking determines the quality of RAG answers. → What is RAG and an AI agent?
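
A minimal sketch of one simple chunking strategy: fixed-size chunks counted in words, with an overlap so that a sentence cut at a boundary still appears whole in one of the neighbors. The function name and default sizes are assumptions; production systems often split on headings, paragraphs or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("Need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks

# Tiny example: 10 "words", chunks of 4 words with 1 word of overlap.
sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text(sample, chunk_size=4, overlap=1)
```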

Churn prediction — forecasting a customer's departure based on their behavior (e.g. a drop in activity, complaints). An application of predictive analytics and ML. → What is data analysis and BI?

CI/CD (Continuous Integration / Continuous Deployment) — the practice of automating the testing and deployment of code. Every change to the system is automatically tested and — if the tests pass — deployed to production. It shortens the time from writing code to a working solution and reduces the risk of errors.

Classification (Machine Learning) — a task in which a model assigns input data to one of several predefined categories. Examples: document → invoice/complaint/order, photo → product OK/defect, email → urgent/standard. One of the most common AI use cases in companies. → What is AI?

Cloud computing — a model of providing IT resources (servers, databases, AI tools) over the internet, without the need to own your own infrastructure. Companies pay for what they use and can scale resources up or down in minutes. → What is systems integration?

Clustering — an ML technique that automatically groups data into sets (clusters) based on similarity, without predefined categories. Use cases: customer segmentation, document grouping, anomaly detection. It differs from classification in that the model discovers the groups itself rather than assigning to predefined ones. → What is AI?

CLV / LTV (Customer Lifetime Value) — the forecasted value of the revenue a customer will generate over the entire period of their relationship with the company. It helps assess how much it's worth investing to acquire and retain a customer. → What is data analysis and BI?

CMR (International Consignment Note) — the standard transport document in road haulage, governed by the CMR Convention. It contains the sender's and recipient's details, a description of the cargo and the terms of carriage. In logistics, one of the documents most frequently processed by OCR + NLP systems. → AI use cases in business — catalog

CNN (Convolutional Neural Network) — a type of neural network designed to process images. It automatically detects visual features (edges, textures, shapes) without manual programming. → What is Computer Vision?

Co-development — a collaboration model in which an external partner builds a solution together with the client's team while transferring knowledge and competencies. The goal: over time the company takes over maintenance and development of the solution on its own. → External firm or in-house AI team?

Compliance — a company operating in line with applicable regulations and standards (GDPR, EU AI Act, industry norms). In AI implementations, compliance isn't an “add-on” — it's a requirement from day one.

Computer Vision — the field of AI concerned with machines “seeing” — analyzing and interpreting images and video. Use cases: quality control, product classification, safety. → What is Computer Vision?

Confidence score — an indicator of an AI system's certainty about the correctness of a result (0–100%). It makes it possible to separate confident results from those requiring human verification. → What is OCR and NLP?

Confusion matrix — a table showing how a classification model gets things wrong: how many cases it correctly recognized (true positive, true negative) and how many it got wrong (false positive, false negative). Precision, recall and F1 score are calculated from it. → What is AI?
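
As a rough illustration of how accuracy, precision, recall and F1 score follow from the four counts, here is a minimal sketch. The function name and the example counts are invented for illustration.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive common quality metrics from the four confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged as positive, how much really was positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that really was positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical imbalanced case: 1,000 invoices, only 20 of them fraudulent.
metrics = classification_metrics(tp=5, fp=5, fn=15, tn=975)
```

With these counts accuracy is 98%, yet recall is only 25%: the model misses three out of four fraud cases. This is why accuracy alone can mislead on imbalanced data.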

Context window — the maximum amount of text a language model can process in a single query. Measured in tokens (for English text, 1 token ≈ 3/4 of a word). → What is RAG and an AI agent?

Continuous Improvement — an approach that assumes processes are never “finished” — they can always be improved. Carried out in short cycles: change → measure → adjust. The foundation of Lean and Kaizen. → What is process optimization?

Conversion rate — the percentage of people who took a desired action (e.g. a purchase, a sign-up, a click) relative to all visitors. A basic KPI in sales and marketing. → What is data analysis and BI?

CRISP-DM — a methodology for delivering data science/ML projects: Business Understanding → Data Understanding → Data Preparation → Modeling → Evaluation → Deployment. → What is AI?

CRM (Customer Relationship Management) — a system for managing customer relationships. It stores the history of contacts, sales and complaints. One of the key data sources for BI. → What is systems integration?

Customer journey — the customer's full path from their first contact with the company through purchase and after-sales service. Mapping the customer journey helps identify where the customer runs into problems — and where automation or AI can improve the experience.

Cycle time — the time from the start to the end of a single run of a process. A basic operational KPI — the shorter it is, the more efficient the process. → What is process optimization?

D

Dashboard — an interactive visual panel presenting key indicators (KPIs) in real time. It lets you make decisions based on data, not hunches. → What is data analysis and BI?

Data augmentation — a technique for enlarging a training set by modifying existing data (e.g. rotating, cropping, changing the brightness of images). Used in Computer Vision and NLP. → What is Computer Vision?

Data catalog — a central register describing what data a company has, where it's located, what it means and who owns it. Without a catalog, teams lose hours searching for data that already exists.

Data drift — a change in the distribution of data over time, causing a drop in an ML model's effectiveness. It requires monitoring and retraining. → What is Computer Vision?

Data Engineer — a specialist responsible for preparing data: they build pipelines, clean and transform data, combine sources. Without a Data Engineer, a Data Scientist has nothing to work on.

Data extraction — automatically pulling specific information from documents (e.g. invoice number, amount, date, counterparty name). It combines OCR (reading the text) with NLP (understanding what the text means). → What is OCR and NLP?

Data governance — a set of rules, processes and responsibilities for managing data in a company: who has access, how data is classified, how long it's stored, who's accountable for quality. The foundation of BI and AI implementations.

Data lake — a repository that stores data in raw form (structured and unstructured) without prior processing. → What is data analysis and BI?

Data lineage — tracing the path of data from source to report: where it comes from, how it was transformed, where it ended up. It helps you understand why a figure on a dashboard looks the way it does.

Data normalization — the process of unifying the format of data from different sources (e.g. dates in DD.MM.YYYY vs MM/DD/YYYY format, customer names with typos). Necessary before analysis or model training.
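
The date example can be sketched with the Python standard library. The list of accepted formats is an assumption: here a date written with slashes is treated as MM/DD/YYYY, which would have to be confirmed for each data source.

```python
from datetime import datetime

# Assumed input formats, tried in order. Slashes are read as US-style MM/DD/YYYY.
KNOWN_FORMATS = ("%d.%m.%Y", "%m/%d/%Y", "%Y-%m-%d")

def normalize_date(raw: str) -> str:
    """Convert a date in any known format to ISO 8601 (YYYY-MM-DD)."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # Not this format, try the next one.
    raise ValueError(f"Unrecognized date format: {raw!r}")

normalized = [normalize_date(d) for d in ["31.12.2024", "12/31/2024", " 2024-12-31 "]]
```

Values that match none of the formats raise an error instead of being guessed, so they can be reviewed by a person.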

Data pipeline — an automated flow of data from a source (e.g. CRM, IoT, files) through processing (cleaning, transformation) to a destination (data warehouse, dashboard, AI model). The backbone of every BI and ML implementation.

Data quality — the degree to which data is complete, correct, consistent and current. Poor data quality is the most common cause of failure in AI and BI projects — “garbage in, garbage out.”

Data Science — an interdisciplinary field combining statistics, programming and domain knowledge to extract knowledge from data.

Data Scientist — a specialist combining statistics, programming and business knowledge. They analyze data, build predictive models, design experiments. They focus on the question “what can the data tell us?”

Data warehouse — a central repository of processed, structured data, optimized for analysis and reporting. → What is data analysis and BI?

Deep Learning (DL) — a subset of Machine Learning that uses multi-layer neural networks. The foundation of technologies such as image, speech and natural-language recognition. → What is AI?

Demand forecasting — using historical data and ML models to predict future demand for products or services. It makes it possible to better plan purchasing, production and staffing. → What is data analysis and BI?

Descriptive analytics — the lowest level of analytics: it answers the question “what happened?” based on historical data. Dashboards, reports, KPIs. → What is data analysis and BI?

Diagnostic analytics — the second level: it answers the question “why did it happen?” by analyzing root causes (drill-down, correlations). → What is data analysis and BI?

Digital transformation — changing the way a company operates using technology: from digitizing documents, through automating processes, to implementing AI and analytics. It's not an IT project — it's a change in the company's operating model, requiring the involvement of people, processes and organizational culture. → What is automation?

Digital twin — a virtual replica of a physical object, process or system, fed with real-time data. It enables simulations, scenario testing and predictive maintenance.

Digitalization — moving processes into digital form — not just documents, but the way of working. A customer places an order online, the system automatically generates a quote, the status is available in a panel. The next step after digitization. → What is automation?

Digitization (capture) — moving information into digital form (e.g. scanning a document). The first step before digitalization and automation — a scan alone automates nothing, but it enables further processing. → What is automation?

Discovery (workshop) — the first stage of advisory collaboration: conversations with the client's team, mapping processes, identifying bottlenecks and opportunities. The goal is to understand, not to sell a solution. → What is process optimization?

DMAIC — a Six Sigma methodology: Define → Measure → Analyze → Improve → Control. A structured, data-based way of improving processes. → What is process optimization?

Docker — a platform for containerizing applications. It lets you run a system in an isolated environment, independent of the infrastructure. → What is systems integration?

Document classification — automatically assigning a document to a category (invoice, complaint, order, general correspondence) based on its content, without manual review. One of the most popular first AI projects in companies. → What is OCR and NLP?

DPA (Data Processing Agreement) — a contract under which a company entrusts an external partner with the processing of personal data. The GDPR (Article 28) requires it in any collaboration where an external partner processes a client's personal data. It defines the purpose, scope, obligations and the right to audit. → External firm or in-house AI team?

Drill-down — an analytical technique of moving from a general level to a detailed one (e.g. company revenue → regional revenue → customer revenue). → What is data analysis and BI?

Due diligence — a detailed review of a company's documentation (contracts, finances, regulations, liabilities) carried out before a transaction, merger or acquisition. Traditionally it takes days of a legal team's work. AI (RAG + NLP) shortens this process by automatically searching the document base and extracting key clauses. → AI use cases in business — catalog

E

Edge computing — processing data on the device (e.g. a camera) instead of sending it to the cloud. It reduces latency and bandwidth requirements. Used in Computer Vision. → What is Computer Vision?

EMA (European Medicines Agency) — the EU agency responsible for evaluating and supervising medicinal products in Europe. A key regulatory context for AI use cases in pharma — from molecule discovery to production quality control. → AI use cases in business — catalog

Embedding — a representation of text (or an image) as a numerical vector. It lets a machine compare meanings, not the literal wording of words. The foundation of RAG. → What is RAG and an AI agent?
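
To illustrate what "comparing meanings" looks like in practice, here is a minimal sketch of cosine similarity, a common way to measure how close two embedding vectors are. The three-dimensional vectors are toy values invented for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy vectors, not produced by any real model.
invoice = [0.9, 0.1, 0.0]
bill = [0.8, 0.2, 0.1]
holiday = [0.0, 0.2, 0.9]

close = cosine_similarity(invoice, bill)
distant = cosine_similarity(invoice, holiday)
```

In this toy setup "invoice" and "bill" point in almost the same direction, while "invoice" and "holiday" do not, even though none of the words share any letters in common order.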

ERP (Enterprise Resource Planning) — an integrated system for managing an enterprise's resources: finance, warehouse, production, HR. A central source of operational data. → What is systems integration?

ETL (Extract, Transform, Load) — the process of acquiring data from sources, transforming it and loading it into a data warehouse or analytical system. → What is data analysis and BI?
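
The three steps can be sketched end to end with the Python standard library. Everything here is illustrative: the CSV content, the `sales` table and the cleaning rules are assumptions, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: stray spaces and one missing amount.
RAW_CSV = """customer,amount
 Acme ,100.50
Globex,
Initech,200
"""

def extract(source: str) -> list[dict]:
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(source)))

def transform(rows: list[dict]) -> list[tuple]:
    """Transform: drop incomplete rows, trim names, convert amounts to numbers."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append((row["customer"].strip(), float(row["amount"])))
    return cleaned

def load(rows: list[tuple], conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```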

EU AI Act — an EU regulation governing artificial intelligence systems. It classifies AI use cases by risk level and imposes corresponding requirements. → What is AI?

Explainability / XAI (Explainable AI) — an AI system's ability to explain why it made a specific decision. The EU AI Act sets transparency requirements for high-risk systems (e.g. in medicine, finance, recruitment). In practice: the model doesn't just say “rejected” but indicates which variables influenced the decision. → What is AI?

F

F1 score — a model-quality metric that combines precision and recall into a single value (the harmonic mean). Useful when both false alarms and misses matter.

FastAPI — a modern Python framework for building APIs. Fast, with automatic documentation. Used to expose AI models as services (microservices).

Feature engineering — the process of creating and selecting features (variables) that an ML model learns from. Key to the quality of predictions. → What is Computer Vision?

Feedback loop — a mechanism in which an AI model's results are rated by users, and those ratings come back as training data, improving the model. The more a company uses the system, the better it works.

Few-shot learning — a technique in which a model learns to perform a task from a few examples given in the prompt, without full training. Use cases: rapid prototyping, classification with minimal data. → What is RAG and an AI agent?
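
A minimal sketch of what "a few examples given in the prompt" can look like. The function only assembles the prompt text; sending it to a language model is left out, and the instruction wording, labels and sample messages are invented for illustration.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], new_text: str) -> str:
    """Assemble a prompt that shows labeled examples before the new case."""
    lines = ["Classify each message as 'urgent' or 'standard'.", ""]
    for text, label in examples:
        lines.append(f"Message: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The new case ends with an empty label for the model to fill in.
    lines.append(f"Message: {new_text}")
    lines.append("Label:")
    return "\n".join(lines)

# Hypothetical examples.
examples = [
    ("Production line 3 has stopped, please call back now.", "urgent"),
    ("Could you send me last month's invoice copy?", "standard"),
]
prompt = build_few_shot_prompt(examples, "Our warehouse system is down.")
```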

Fine-tuning — further training an existing AI model on specific company data so it handles tasks in a particular domain better. Cheaper and faster than training from scratch. → What is RAG and an AI agent?

Function calling — a mechanism that lets a language model (LLM) call external tools: check stock levels, send an email, pull data from a CRM. The foundation of AI agents — without it the model only generates text; with it, it can act. → What is RAG and an AI agent?
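
The mechanism can be sketched as follows: the model returns a structured request naming a tool and its arguments, and the application executes it. The JSON shape, the `check_stock` tool and the inventory data are simplified assumptions, not the exact format of any particular vendor's API.

```python
import json

def check_stock(sku: str) -> dict:
    """Hypothetical tool. A real one would query the warehouse system."""
    inventory = {"A-100": 42, "B-200": 0}
    return {"sku": sku, "in_stock": inventory.get(sku, 0)}

# Only tools listed here can be called by the model.
TOOLS = {"check_stock": check_stock}

def execute_tool_call(call_json: str) -> dict:
    """Parse the model's tool request and run the matching function."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# Assumed model output: a request to call a tool instead of plain text.
model_output = '{"name": "check_stock", "arguments": {"sku": "A-100"}}'
result = execute_tool_call(model_output)
```

Keeping an explicit list of allowed tools means the model can only trigger actions the company has deliberately exposed.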

G

Gap analysis — a comparison of a company's current state (how things are) with the target state (how we want them to be). The gaps are specific areas for improvement that can be turned into projects.

Gartner Hype Cycle — a model describing a technology's life cycle: from inflated enthusiasm, through disillusionment, to productive use. → What is AI?

GDPR — the General Data Protection Regulation — EU law governing the processing of personal data. It imposes obligations on every company that processes the data of customers, employees or users. Crucial in implementations of AI, RAG and analytics. → What is RAG and an AI agent?

Gemba — Japanese for “the actual place”: the place where the work happens. The related “go and see” principle says that understanding a process requires going there, not analyzing it from behind a desk. → What is process optimization?

General Terms of Insurance (OWU) — a document defining the scope of insurance cover, exclusions, the parties' obligations and the claims-handling procedures. One of the documents most frequently searched by RAG systems in insurance companies. → AI use cases in business — catalog

Generative AI (GenAI) — AI models that don't just analyze data but generate new content: text, code, reports, images. Examples: ChatGPT, Claude, Gemini. → What is AI?

Git / GitHub — a version-control system (Git) and a platform for collaborating on code (GitHub). A standard in software engineering — every change to the code is tracked, reversible and reviewable.

GMP (Good Manufacturing Practice) — a set of rules and procedures ensuring that products (medicines, food, cosmetics) are manufactured and controlled to established quality standards. In pharma it's a regulatory requirement — AI supports GMP compliance, among other things through automatic quality control on packaging lines. → AI use cases in business — catalog

Go-live — the moment a solution is launched in the production environment, with real users and data. It's not the end of the project — it's the start of post-implementation care.

GPT (Generative Pre-trained Transformer) — a family of language models developed by OpenAI. “Generative” means the model generates text. “Pre-trained” means it learned on a huge dataset before being fine-tuned to specific tasks. ChatGPT is a conversational interface built on the GPT model. Other LLM families include Claude (Anthropic) and Gemini (Google). → What is AI?

GPU (Graphics Processing Unit) — a processor originally designed for graphics, now widely used to train and run AI models thanks to its capacity for massively parallel computation. → What is AI?

Ground truth — the “reference truth” — a set of correctly labeled data against which an AI model's results are compared. The quality of the ground truth determines the quality of the model's evaluation. → What is Computer Vision?

Grounding — a technique of “anchoring” an AI model's answers in specific data sources (documents, knowledge bases) to limit hallucinations. RAG is the most popular form of grounding.

Guardrails — rules and filters placed on an AI model that limit the range of its responses (e.g. “don't answer off-topic questions,” “don't reveal personal data,” “escalate to a human when confidence < 70%”). Crucial in company implementations.

H

HOAI (Honorarordnung für Architekten und Ingenieure) — the German fee regulation for architects and engineers. It sets the rules for remuneration for design services in construction. One of the documents that RAG systems work on in construction companies operating on the German market. → AI use cases in business — catalog

Hugging Face — a platform and community providing thousands of pretrained AI models (NLP, CV, audio). It lets you download a ready-made model and adapt it (fine-tuning) to company tasks instead of training from scratch.

Human-in-the-loop — an approach in which AI processes data automatically, but unusual results or those with a low confidence score are routed to a human for verification. → What is OCR and NLP?
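
A minimal sketch of the routing rule described above: results above a confidence threshold are accepted automatically, the rest go to a person. The 0.70 threshold, the class name and the sample results are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class ExtractionResult:
    document_id: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the AI system

def route(result: ExtractionResult, threshold: float = 0.70) -> str:
    """Decide whether a result is accepted automatically or checked by a human."""
    return "auto_accept" if result.confidence >= threshold else "human_review"

# Hypothetical OCR results: one clean read and one poor-quality scan.
results = [
    ExtractionResult("inv-001", "1 250.00 EUR", 0.97),
    ExtractionResult("inv-002", "l 25O.OO EUR", 0.41),
]

queues = {"auto_accept": [], "human_review": []}
for r in results:
    queues[route(r)].append(r.document_id)
```

The threshold is a business decision: raising it sends more documents to people but lets fewer errors through.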

Hybrid cloud — a combination of public cloud (e.g. AWS) with on-premise infrastructure (the company's own servers). Used when some data has to stay local (e.g. regulatory requirements) while the rest takes advantage of the cloud's flexibility.

Hyperparameter — a setting of an ML model that isn't learned from the data but chosen by a human before training. Examples: how fast the model learns (learning rate), how many times it passes over the data (the number of epochs), how large the data batches are (batch size). The choice of hyperparameters affects the quality of results.

I

IDP (Intelligent Document Processing) — a combination of OCR, NLP and ML in a single system that automatically reads documents, extracts data, classifies it and enters it into company systems. The successor to simple OCR. → What is OCR and NLP?

Image classification — a Computer Vision task of assigning an image to one of several predefined categories (e.g. “product OK” vs “defect”). → What is Computer Vision?

Incident management — the process of responding to failures and problems in production systems: detection, diagnosis, repair, communication with users, root-cause analysis, prevention of recurrence.

Inference — the process in which a trained AI model processes new data and generates a result (a prediction, classification, answer). This is the model's “work” after training, and in cloud deployments it is typically the recurring cost you pay for.

IoT (Internet of Things) — a network of physical devices (sensors, cameras, machines) connected to the internet and exchanging data. In companies: machine monitoring, shipment tracking, environmental measurements. IoT data feeds BI dashboards and predictive models. → What is systems integration?

IPA (Intelligent Process Automation) — process automation with AI elements: systems not only perform tasks but analyze, classify and make decisions. → What is automation?

iPaaS (Integration Platform as a Service) — a cloud platform for integrating systems without writing code from scratch (e.g. Zapier, Make, n8n). → What is systems integration?

K

Kaizen — the Japanese philosophy of continuous improvement in small steps. In practice: regular process reviews and the implementation of minor improvements. → What is process optimization?

Kanban — a visual method for managing workflow. Tasks are represented as cards moved between columns (e.g. “to do → in progress → done”). It helps limit the number of tasks in progress and detect bottlenecks. → What is process optimization?

KNF (Polish Financial Supervision Authority) — the Polish authority supervising the financial market (banks, insurance, funds, the capital market). For AI implementations in the financial sector in Poland, it requires transparency and explainability of the models. → AI use cases in business — catalog

KPI (Key Performance Indicator) — a measurable value showing whether a process or a company is heading in the right direction. → What is data analysis and BI?

KSeF (National e-Invoicing System) — the Polish system for the electronic circulation of structured invoices. It changes how invoices are issued, sent and stored in Poland. A context for AI use cases in accounting — automatic classification and processing of documents has to account for the KSeF format. → AI use cases in business — catalog

Kubernetes — a system for managing containers (Docker) at scale. It automates the deployment, scaling and management of applications. → What is systems integration?

L

LangChain — a framework for building applications based on LLMs: chatbots, AI agents, RAG systems. It connects language models with tools, memory and databases. → What is RAG and an AI agent?

Latency — the time from sending a query to receiving a response. In AI systems: the time from supplying input data to generating a result. Critical in real-time use cases (e.g. CV on a production line).

Lead scoring — a method in which an ML model assesses a prospect's probability of buying based on their behavior (website visits, opening emails, industry, company size). It helps salespeople focus on the contacts with the greatest potential. → AI use cases in business — catalog

Lead time — the total time from the moment an order is placed (or a process begins) to its completion. It includes work time, waiting and transport. A key KPI in process optimization. → What is process optimization?

Lean Management — a management approach focused on eliminating waste (muda) and maximizing value for the customer. → What is process optimization?

Least privilege — a security rule: every user and system should have access only to what's necessary to perform their task — no more. Crucial when working with external partners and in AI implementations that operate on company data.

LLM (Large Language Model) — a large language model trained on enormous text datasets. It can generate text, answer questions, translate, write code. Examples: GPT-4, Claude, Gemini. → What is RAG and an AI agent?

Low-hanging fruit — tasks or processes that can be improved quickly, cheaply and with a visible effect. The first step in optimization and automation — instead of starting with the hardest problem, you start with the one that gives the fastest return. → What is process optimization?

M

Machine Learning (ML) — a subset of AI: algorithms that optimize their behavior based on training data, instead of operating by rigid rules. → What is AI?

Managed service — a collaboration model in which, after implementation, an external partner takes over ongoing monitoring, maintenance and development of the solution for a fixed, predictable cost. The company doesn't have to build internal maintenance competencies. → External firm or in-house AI team?

Master data management (MDM) — managing a company's master data (customers, products, suppliers, locations) so it's consistent across all systems. Without MDM, the same data lives in different versions in the CRM, the ERP and spreadsheets.

Maturity model — a framework assessing what stage a company is at in a given area (e.g. data maturity, AI maturity). It helps set priorities: don't roll out predictive analytics if you don't yet have consistent data.

MFA (Multi-Factor Authentication) — a login method that requires a second factor in addition to a password (e.g. an SMS code, an authenticator app, a fingerprint). A security standard for accessing company systems and sensitive data.

Microservices — an architecture in which an application is made up of small, independent services communicating via APIs. It makes scaling and development easier. → What is systems integration?

Middleware — a software layer that mediates between systems. It translates data from one format into another so systems can “talk” to each other. → What is systems integration?

ML Engineering — an engineering discipline concerned with building, deploying and maintaining Machine Learning models in a production environment.

MLOps — a set of practices combining Machine Learning with DevOps: automating the training, deployment, monitoring and retraining of ML models in production. The equivalent of CI/CD for AI models.

Model degradation — the gradual decline in an AI model's quality over time, caused by changes in the data, processes or business environment. This is why an AI implementation isn't a one-off project — it needs care.

Model monitoring — continuously tracking the quality of an AI model's performance after deployment: whether accuracy is dropping, whether the input data has changed (data drift), whether users are satisfied with the results. Without monitoring, a model “ages” quietly.

MTBF (Mean Time Between Failures) — the average operating time between one failure and the next. A key indicator in maintenance and service: the higher it is, the more reliable the machine or system. MTBF data feeds predictive-maintenance models. → AI use cases in business — catalog

MTTR (Mean Time To Repair) — the average time needed to repair a failure, measured from the moment it is reported to the restoration of operation. Together with MTBF it forms the pair of indicators that maintenance management rests on. → AI use cases in business — catalog
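
As a rough illustration of how the two indicators are derived, the sketch below computes MTBF and MTTR from a made-up failure log. The function name, the 720-hour observation window and the repair times are assumptions for illustration only, and the code uses one common convention: MTBF = uptime / number of failures.

```python
def mtbf_mttr(period_hours, repair_hours):
    """Return (MTBF, MTTR) in hours for one machine.

    period_hours: length of the observation window
    repair_hours: repair duration of each failure in that window
    """
    failures = len(repair_hours)
    if failures == 0:
        raise ValueError("no failures recorded in this period")
    downtime = sum(repair_hours)
    mtbf = (period_hours - downtime) / failures  # average uptime per failure
    mttr = downtime / failures                   # average repair duration
    return mtbf, mttr


# Hypothetical month: 720 hours observed, three failures.
mtbf, mttr = mtbf_mttr(720, [4, 6, 2])
print(f"MTBF = {mtbf} h, MTTR = {mttr} h")
```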

Multi-agent system — an architecture in which several AI agents collaborate, each responsible for a different part of the process. One agent reads the inquiry, a second prepares a quote, a third books an appointment — instead of one system doing everything. → What is RAG and an AI agent?

Multimodal models — AI models that process different types of data at once: text, image, sound. They can, for example, analyze a document by combining OCR, NLP and layout analysis. → What is OCR and NLP?

MVP (Minimum Viable Product) — the minimum version of a product sufficient to test a business hypothesis. In AI projects: the first working model with a limited scope, deployed to users to verify the value before full development.

N

NDA (Non-Disclosure Agreement) — a confidentiality agreement between a company and an external partner. It defines what information is confidential and what the consequences of disclosing it are. A standard in any collaboration where a partner has access to a company's data or processes.

NER (Named Entity Recognition) — an NLP task of recognizing proper names, dates, amounts, addresses and other entities in text. Used in extracting data from documents. → What is OCR and NLP?

NLP (Natural Language Processing) — a set of AI technologies that enable machines to analyze, classify and generate text in human language. → What is OCR and NLP?

No-code / Low-code — platforms that make it possible to build applications and automations with no (or minimal) coding. Examples: n8n, Make, Power Automate. → What is systems integration?

NoSQL — a family of databases designed to store unstructured or semi-structured data (JSON documents, graphs, key-value). A complement to traditional SQL databases, not a replacement.

O

OCR (Optical Character Recognition) — a technology that turns an image of text (a scan, a photo) into editable, searchable text. → What is OCR and NLP?

OEE (Overall Equipment Effectiveness) — an indicator of how effectively machines are used in production. It combines three factors: availability × performance × quality. OEE = 100% means production with no downtime, no speed losses and no defects. → What is process optimization?
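
The multiplication above can be sketched in a few lines. The three input values are invented for illustration, and each factor is assumed to be expressed as a fraction between 0 and 1.

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness as a fraction between 0 and 1."""
    for name, value in (("availability", availability),
                        ("performance", performance),
                        ("quality", quality)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return availability * performance * quality


# Hypothetical line: 90% availability, 95% performance, 98% quality.
score = oee(0.90, 0.95, 0.98)
print(f"OEE = {score:.1%}")
```

Even with three factors that each look good on their own, the combined result is noticeably lower, which is why OEE is reported as one number.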

Offboarding — the process of ending a relationship with an employee or external partner: revoking access, handing over documentation, verifying that the knowledge and code are complete. A well-run offboarding protects the company from losing knowledge and data.

Onboarding — the process of bringing a new employee or external partner on board: transferring knowledge about the company, processes, systems and data. The better-documented the processes, the faster and cheaper the onboarding. → What is process optimization?

OTIF (On Time In Full) — a logistics indicator: what percentage of deliveries arrive on time and in full quantity. A low OTIF means problems in the supply chain. → What is process optimization?

Outsourcing — delegating the delivery of tasks or processes to an external partner instead of doing them in-house. In the context of AI and automation: entrusting the diagnosis, build and implementation of a solution to an advisory-and-implementation firm. → External firm or in-house AI team?

Overfitting — a situation in which an ML model has “memorized” the training data too well and performs poorly on new data. Like a student who crammed the answers by heart but doesn't understand the subject. → What is AI?

P

Pareto (the 80/20 principle) — a rule of thumb: roughly 80% of effects come from 20% of causes. In optimization: focus on the 20% of tasks that generate 80% of the problems. → What is process optimization?
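
A minimal sketch of a Pareto analysis: sort the causes by frequency and keep the top ones until they cover the chosen share of all cases. The complaint categories, the counts and the function name are made up for illustration.

```python
def vital_few(counts, threshold=0.8):
    """Return the top causes that together cover `threshold` of all cases."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cases to analyze")
    selected, covered = [], 0
    # Walk through causes from most to least frequent.
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered / total >= threshold:
            break
        selected.append(cause)
        covered += n
    return selected, covered / total


complaints = {
    "late delivery": 50,
    "wrong item": 30,
    "damaged box": 8,
    "invoice error": 7,
    "other": 5,
}
causes, share = vital_few(complaints)
print(causes, f"{share:.0%}")
```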

PDCA (Plan-Do-Check-Act) — a continuous-improvement cycle: plan a change, implement it, measure the effect, adjust. Repeat. A basic Lean and Kaizen tool. → What is process optimization?

Personalization — tailoring content, an offer or an experience to a specific user based on their data and behavior. In e-commerce: product recommendations. In customer service: prioritizing cases. It requires data, segmentation and often ML. → What is data analysis and BI?

Pilot (pilot deployment) — launching a solution on a limited scope (e.g. one department, one type of document, one production line) to test how it works before a full rollout.

PoC (Proof of Concept) — a quick test of a concept on a limited scope of data/processes, to check whether an idea for an AI/automation implementation makes sense before the company invests in a full solution. → What is AI?

Precision — a model-quality metric: what percentage of the results marked as positive are actually correct. High precision = few false alarms.

Predictive analytics — the third level of analytics maturity (after descriptive and diagnostic analytics): it answers the question “what will happen?” using statistical and ML models. → What is data analysis and BI?

Predictive maintenance — using sensor data and ML models to predict machine failures before they happen. It makes it possible to schedule inspections and part replacements based on actual condition, not the calendar. An application of Computer Vision + IoT + ML. → What is Computer Vision?

Prescriptive analytics — the highest level of analytics maturity: it answers the question “what should we do?” through scenario simulation and decision optimization. → What is data analysis and BI?

Process audit — a systematic review of how a company actually operates: which processes are mapped, where the bottlenecks are, what data is collected, what can be improved. The starting point before any AI or automation implementation. → What is process optimization?

Process mapping — a visual representation of the steps, decisions and flows in a business process. The first step in optimization — you can't improve what you can't see. → What is process optimization?

Process mining — a technique for analyzing data from IT systems (logs, timestamps) to automatically discover how processes actually run — as opposed to how they're described in the documentation. → What is process optimization?

Process optimization — systematically improving the way a company operates: identifying bottlenecks, eliminating unnecessary steps, standardization, measuring results. → What is process optimization?

Prompt — an instruction or question directed at an AI model. The quality of the prompt directly affects the quality of the answer. A well-constructed prompt includes context, the task, the expected answer format and constraints. → What is RAG and an AI agent?

Prompt chaining — a technique of linking several prompts into a sequence, where the output of one becomes the input of the next. It lets you break a complex task into simpler steps (e.g. first extract the data → then analyze it → finally write the report).
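
The extract → analyze → report sequence can be sketched as a simple loop. Here `fake_model` is a hypothetical stand-in for a real LLM call, and the prompt templates are invented for illustration.

```python
def run_chain(steps, call_model, initial_input):
    """Feed the output of each prompt into the next one."""
    text = initial_input
    for template in steps:
        prompt = template.format(input=text)  # plug the previous output in
        text = call_model(prompt)
    return text


def fake_model(prompt):
    # Stand-in for a real model: it only wraps the prompt so the flow is visible.
    return f"<answer to: {prompt}>"


steps = [
    "Extract the key figures from this text: {input}",
    "Analyze these figures: {input}",
    "Write a short report based on this analysis: {input}",
]
print(run_chain(steps, fake_model, "Q3 sales notes"))
```

Keeping each step as a separate prompt makes it easier to inspect intermediate results and to fix one step without touching the others.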

Prompt engineering — the skill of designing prompts that draw the best results out of an AI model. It includes techniques such as chain-of-thought, few-shot, role prompting and structured output. A key competency in deploying LLMs in a company.

Pseudonymization — replacing data that identifies a person (name, national ID number) with pseudonyms or codes, while keeping the ability to re-identify the person using separately stored additional information. Unlike anonymization, it's reversible. The GDPR names it as a recommended data-protection measure and still treats pseudonymized data as personal data.
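
A minimal sketch of the idea, assuming the lookup table is kept in memory: the `Pseudonymizer` class and the `P-` code format are hypothetical, and a real deployment would store the table separately and restrict access to it.

```python
import secrets


class Pseudonymizer:
    """Keeps a two-way lookup table so pseudonyms can be reversed."""

    def __init__(self):
        self._code_by_value = {}
        self._value_by_code = {}

    def pseudonymize(self, value):
        # Reuse the existing code so the same person always gets the same pseudonym.
        if value not in self._code_by_value:
            code = "P-" + secrets.token_hex(4)
            while code in self._value_by_code:  # extremely unlikely collision
                code = "P-" + secrets.token_hex(4)
            self._code_by_value[value] = code
            self._value_by_code[code] = value
        return self._code_by_value[value]

    def reidentify(self, code):
        # Only whoever holds the lookup table can perform this step.
        return self._value_by_code[code]


p = Pseudonymizer()
code = p.pseudonymize("Jan Kowalski")
print(code, "->", p.reidentify(code))
```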

Python — the dominant programming language in AI, Data Science and ML. Its ecosystem of libraries (pandas, scikit-learn, PyTorch, TensorFlow) makes it the industry standard. At cm-opti we build solutions mainly in Python, but we match the technology to the problem, including R for statistical analytics and no-code/low-code platforms (n8n, Make) for quick prototypes and MVPs.

PyTorch — a deep-learning framework from Meta AI. It dominates in research and, increasingly, in production deployments. Flexible, with a large community. → What is AI?

R

R — a programming language specializing in statistics, data analysis and visualization. Strong in exploratory analytics, statistical modeling and reporting. Used where deep statistical analysis matters, rather than building a production system.

RAG (Retrieval-Augmented Generation) — an architecture that combines retrieving information from a company knowledge base with answer generation by an LLM. It lets AI answer based on the company's documents, not just its training data. → What is RAG and an AI agent?

ReAct (Reason + Act) — an AI agent's pattern of operation: the model “thinks” (plans the next step), performs an action, observes the result and decides what's next. → What is RAG and an AI agent?
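
The think → act → observe cycle can be sketched as a loop. In this illustration `scripted_decide` is a hypothetical stand-in for the language model and `check_stock` is an invented tool; a real agent would replace both.

```python
def react_loop(decide, tools, question, max_steps=5):
    """Alternate between reasoning (decide) and acting (tools) until done."""
    observations = []
    for _ in range(max_steps):
        thought, action, arg = decide(question, observations)  # reason
        if action == "finish":
            return arg
        result = tools[action](arg)                            # act
        observations.append((action, arg, result))             # observe
    return None  # gave up after max_steps


def scripted_decide(question, observations):
    # Stand-in for an LLM: look up the stock once, then answer.
    if not observations:
        return ("I need the stock level", "check_stock", "bolt M8")
    _, _, stock = observations[-1]
    return ("I have what I need", "finish", f"{stock} units in stock")


tools = {"check_stock": lambda item: {"bolt M8": 120}.get(item, 0)}
print(react_loop(scripted_decide, tools, "How many M8 bolts do we have?"))
```

The `max_steps` limit is a design choice: it keeps an agent from looping forever when it never reaches a final answer.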

Real-time processing — processing data immediately as it appears, without delay. Required, for example, in CV on a production line or KPI monitoring. More expensive than batch processing, but it gives a current picture.

Reasoning model — a type of language model that carries out an internal, step-by-step reasoning process before giving an answer. Unlike standard LLMs, which generate an answer right away, a reasoning model “thinks” longer but delivers more accurate results in tasks requiring logic, math and multi-step analysis. Examples: OpenAI o1, o3. More expensive and slower than a standard model, but useful where precision matters more than speed.

Recall — a model-quality metric: what percentage of the actually positive cases the model correctly detected. High recall = few misses. Crucial when a miss is costly (e.g. a product defect, a disease).
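
To make the difference between precision and recall concrete, the sketch below computes both from a small set of made-up labels (1 = defect, 0 = OK). The data and the function name are illustrative only.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)       # correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)   # false alarms
    fn = sum(1 for t, p in pairs if t and not p)   # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # what was actually the case
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]  # what the model said
precision, recall = precision_recall(y_true, y_pred)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here the model raises one false alarm and misses one real defect, so both metrics come out at 0.75.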

Regression — an ML technique that predicts a numerical value (e.g. a price, temperature, delivery time) based on historical data. Unlike classification, the output is a number, not a category. Use cases: sales forecasting, cost estimation, predicting lead times. → What is AI?
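
A minimal sketch of the simplest case, a straight line fitted by least squares. The distances and delivery times are invented, and real projects would normally use a library such as scikit-learn rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: route length (km) vs. delivery time (hours).
distance_km = [10, 20, 30, 40]
delivery_h = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(distance_km, delivery_h)
predicted = slope * 50 + intercept  # forecast for a 50 km route
print(f"predicted delivery time: {predicted:.1f} h")
```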

Reinforcement learning — a type of ML in which a system learns by trial and error, receiving a “reward” for good decisions and a “penalty” for bad ones. Used in recommendation systems, games, robotics and process optimization. The foundation of many agentic-AI systems. → What is AI?

Responsible AI — an approach to designing and deploying AI systems that accounts for ethics, transparency, fairness and safety. It covers: avoiding bias, explainability of decisions (XAI), privacy protection and human oversight. The EU AI Act formalizes many of these principles in EU law. → What is AI?

REST API — an architectural style for APIs, typically built on the HTTP protocol. The most popular way for systems to communicate over the internet. → What is systems integration?

Retraining — training an AI model again on new data when monitoring shows a drop in quality. It can be scheduled (e.g. quarterly) or triggered by an alert.

Roadmap — an implementation plan laid out over time: what we do first (quick wins), what comes in the following months, what requires preparation. It lets the client see the whole path, not just the next step.

ROI (Return on Investment) — a measure of how much a company gains relative to how much it spent. A key indicator for assessing whether an implementation is worthwhile. → What is process optimization?
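
As a quick worked example, ROI is commonly calculated as (gain − cost) / cost. The amounts below are made up for illustration.

```python
def roi(gain, cost):
    """Return on investment as a fraction: 0.5 means a 50% return."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical project: cost 100,000, measurable gain 150,000.
print(f"ROI = {roi(150_000, 100_000):.0%}")
```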

Role prompting (persona) — a technique in which you give the AI model the role of an expert (e.g. “You are an experienced financial analyst...”). It helps the model adjust the tone, level of detail and perspective of its answer.

Rollback — reverting a deployment to a previous, working version in case of problems. A well-designed system always has a rollback plan.

RPA (Robotic Process Automation) — automating repetitive tasks using “bots” that mimic a human's actions in systems (clicking, copying, pasting). → What is automation?

S

SaaS (Software as a Service) — software available over the internet on a subscription model, without installation on the company's servers. → What is systems integration?

Scrum — a project-management framework based on short cycles (sprints, usually 2 weeks). Each sprint ends with a working increment of the product. Popular in IT and AI implementations. → What is process optimization?

Segmentation — in Computer Vision: dividing an image into regions (e.g. separating the product from the background). In business: dividing customers into groups by behavior. → What is Computer Vision?

Semantic search — search based on meaning, not on an exact keyword match. The question “how do I dismiss an employee” will find a document about the “procedure for terminating an employment contract,” even though the words don't overlap. The foundation of RAG systems. It requires a vector database and embeddings. → What is RAG and an AI agent?
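
A toy sketch of how the matching works: documents and the query are represented as vectors, and the closest document by cosine similarity wins. The three-number vectors below are invented by hand; in a real system they would come from an embedding model and have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index):
    """Return the title whose vector is most similar to the query."""
    return max(index, key=lambda title: cosine(query_vec, index[title]))


# Pretend embeddings (made up for illustration).
index = {
    "Procedure for terminating an employment contract": [0.9, 0.1, 0.0],
    "Holiday leave policy": [0.2, 0.9, 0.1],
    "Invoice approval workflow": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands for "how do I dismiss an employee"

print(search(query, index))
```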

Sentiment analysis — an NLP technique of automatically recognizing emotion and attitude in text (positive / neutral / negative). Use cases: monitoring customer opinions, complaint analysis, social media. → What is OCR and NLP?

Serverless — a model in which the company doesn't manage servers — the code runs automatically in response to events (e.g. a new document → OCR processing). You pay only for execution time, not for maintaining a server. Example: AWS Lambda.

Single source of truth — a single, central source of data used by all systems and teams. It eliminates discrepancies and the “which data is current?” question. → What is data analysis and BI?

Six Sigma — a process-improvement methodology based on the statistical reduction of variation. The goal: a maximum of 3.4 defects per million opportunities (DPMO). → What is process optimization?
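
Six Sigma projects usually express quality as defects per million opportunities (DPMO). The sketch below shows the arithmetic; the inspection numbers are invented for illustration.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    opportunities = units * opportunities_per_unit
    if opportunities <= 0:
        raise ValueError("there must be at least one opportunity")
    return defects * 1_000_000 / opportunities


# Hypothetical batch: 1,000 units, 5 things that can go wrong on each, 17 defects.
print(f"DPMO = {dpmo(17, 1000, 5):.0f}")
```

A result in the thousands, as in this made-up batch, is far from the 3.4 target and shows how demanding the Six Sigma goal is.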

SLA (Service Level Agreement) — an agreement defining measurable service-quality parameters (e.g. response time < 2h, system availability 99.9%). It lets you hold a provider accountable for results, not effort.

SOP (Standard Operating Procedure) — a standardized instruction for carrying out a process. It ensures repeatability and independence from any one person. → What is process optimization?

Sprint — a short work cycle in Scrum (usually 2 weeks), after which the team delivers a working part of the solution. It lets you quickly verify the direction and gather feedback.

SQL — a query language for databases. It lets you retrieve, filter, join and aggregate data. A basic tool for a data analyst. → What is data analysis and BI?
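
A short example of the four operations in one query (retrieve, filter, join, aggregate), run against an in-memory SQLite database from Python. The table names, columns and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL, status TEXT
);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Borealis');
INSERT INTO orders VALUES
    (1, 1, 1200.0, 'paid'),
    (2, 1,  300.0, 'paid'),
    (3, 2,  500.0, 'paid'),
    (4, 2,  900.0, 'cancelled');
""")

rows = conn.execute("""
SELECT c.name, COUNT(*) AS order_count, SUM(o.amount) AS revenue
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id   -- join
WHERE o.status = 'paid'                       -- filter
GROUP BY c.name                               -- aggregate per customer
ORDER BY revenue DESC
""").fetchall()

for name, order_count, revenue in rows:
    print(name, order_count, revenue)
```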

SSO (Single Sign-On) — a mechanism that lets a user log in once and access many systems without re-entering a password. It makes work easier and improves security (one strong password instead of ten weak ones).

Staging — a test environment that mirrors production but doesn't affect real data. Used for final testing before go-live.

Stakeholder — a person or group that influences a project or is affected by it. Identifying stakeholders at the start of an implementation prevents a situation where a key person finds out about the project on the day of go-live.

Streaming — processing data the moment it's created, without gathering it into batches. Used in monitoring, alerts, IoT. Tools: Apache Kafka, AWS Kinesis.

Supervised learning — a type of ML in which a model learns from labeled data (e.g. images marked “defect” / “OK”). The most common type in business use cases. → What is AI?

Synthetic data — artificially generated data that mimics the structure and distribution of real data. Used when there's too little real data to train a model, or when the real data contains sensitive information (GDPR). It doesn't replace real data, but complements it. → What is AI?

System prompt — an instruction set at the start of a conversation with an AI model, defining its behavior, tone, constraints and context. The user doesn't see it, but it affects every answer. The foundation of chatbot and AI-agent implementations.

Systems integration — connecting a company's IT tools (CRM, ERP, warehouse, e-commerce) so that data flows automatically, without manual copying. → What is systems integration?

T

TAT (Turnaround Time) — the time to handle a case from intake to resolution. A key KPI in customer service, technical support and complaint processes. → What is process optimization?

TCO (Total Cost of Ownership) — the total cost of owning a solution: not just purchase/implementation, but also maintenance, updates, training, infrastructure. Crucial when comparing the “build it ourselves” vs “buy it” options.

Temperature — a parameter controlling an AI model's “creativity.” A low temperature (0–0.3) = predictable, repeatable, safe answers. A high one (0.7–1.0) = more varied, creative, but less certain answers. In business use cases, usually low.
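
A simplified sketch of what the parameter does under the hood: the model's raw scores for candidate words are divided by the temperature before being turned into probabilities. The scores below are invented, and real models work with tens of thousands of candidates. A temperature of exactly 0 is usually handled as a special case (always pick the top candidate), so this sketch rejects it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0 in this sketch")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate words
for t in (0.2, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the low setting almost all probability lands on the top candidate, which is why answers become repeatable; at the higher setting the alternatives get a real chance of being picked.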

Text classification — an NLP task of assigning a document to a category (e.g. invoice, complaint, order) based on its content. → What is OCR and NLP?

Time to value — the time from the start of a project to the moment the company sees the first measurable benefits. In a well-planned AI implementation: weeks, not months. A key argument for choosing an iterative approach (pilot → scale) over a “big-bang implementation.”

TMS (Transport Management System) — a system for managing transport: route planning, order management, shipment tracking, settlements with carriers. One of the key systems in logistics companies, which AI enriches with order automation and route optimization. → AI use cases in business — catalog

Tokenization — the process of splitting text into units (tokens) processed by an AI model. One token is about 3/4 of a word in English. → What is AI?
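
The 3/4 rule of thumb can be used for rough estimates of prompt size and cost. The sketch below is only an approximation for English text: exact counts depend on the tokenizer of the specific model, and the function name is hypothetical.

```python
import math


def estimate_tokens(text, words_per_token=0.75):
    """Rough token estimate: about 0.75 words per token in English."""
    words = len(text.split())
    return math.ceil(words / words_per_token)


print(estimate_tokens("the quick brown fox jumps over"))
```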

Training (model training) — the process in which an AI model processes training data and optimizes its parameters to perform a task as well as possible. It requires data, compute power and time. The result: a trained model ready for inference.

Transfer learning — a technique of taking a model trained on a large dataset and further training it on a smaller dataset specific to the task. It drastically reduces data requirements. → What is Computer Vision?

Transformer — a neural-network architecture that forms the foundation of modern AI models (LLMs, generative models). The key innovation: the attention mechanism. → What is AI?

U

Unsupervised learning — a type of ML in which a model looks for patterns in unlabeled data. Use cases: customer segmentation, anomaly detection, clustering. → What is AI?

V

Value Stream Mapping — a Lean technique of mapping the entire value stream — from the customer's order to delivery of the product/service — marking value-adding and non-value-adding activities. → What is process optimization?

Vector database — a database that stores text as numerical vectors (embeddings), enabling semantic search — by meaning, not by keywords. The foundation of RAG. → What is RAG and an AI agent?

Vendor lock-in — dependence on a single technology provider, making it hard to switch or migrate. A risk when choosing closed platforms, proprietary data formats or solutions without an export option. → What is systems integration?

VOB (Vergabe- und Vertragsordnung für Bauleistungen) — the German procurement and contract regulations for construction works. It sets the rules for tenders, contract terms and the acceptance of work. Together with HOAI, it forms the foundation of the documentation that RAG systems work on in construction on the German market. → AI use cases in business — catalog

W

Webhook — a mechanism in which one system automatically notifies another of an event (e.g. a new order → a notification to the warehouse). A “push” action instead of constant polling. → What is systems integration?

WMS (Warehouse Management System) — a system for managing a warehouse: receipts, dispatches, locations, stock levels, order picking. One of the key tools in logistics, which AI integrates with the ERP, the e-commerce platform and the transport system. → AI use cases in business — catalog

Workflow — an ordered flow of work: a sequence of steps, responsible people and decisions needed to carry a task from start to finish. → What is automation?

Z

Zero-shot learning — a technique in which an AI model performs a task with no examples at all — based solely on a description in the prompt. Example: “classify this document as an invoice, an order or a complaint” — the model manages without prior training on those categories. → What is RAG and an AI agent?

Related articles in the cm-opti Knowledge base