AI & Data Glossary for the Social Sector

The definitive glossary for understanding artificial intelligence (AI) in the social sector.

1. Algorithm: A finite set of well-defined instructions to solve a problem or perform a computation.
2. Analytics: The discovery, interpretation, and communication of meaningful patterns in data.
3. Artificial Intelligence (AI): The simulation of human intelligence processes by machines, especially computer systems.
4. Artificial Neural Network (ANN): A computing system inspired by biological neural networks, consisting of interconnected layers of nodes (“neurons”) that process data via weighted connections.
5. API (Application Programming Interface): A set of routines, protocols, and tools for building software and applications, enabling different systems to communicate.
6. A/B Testing: A method of comparing two versions of something (A and B) to determine which performs better under controlled conditions.
7. AutoML: Automated Machine Learning; techniques and tools that automate the end-to-end process of applying machine learning to real-world problems.
8. Backpropagation: A training algorithm for neural networks that calculates the gradient of the loss function and updates weights via gradient descent.
9. Batch Processing: Executing a series of jobs on a computer without manual intervention, processing data in large groups (“batches”).
10. Big Data: Extremely large and complex data sets that traditional data processing applications cannot handle efficiently.
11. BI (Business Intelligence): Technologies, applications, and practices for the collection, integration, analysis, and presentation of business information.
12. Black-Box Model: A model whose internal workings are not visible or interpretable by the user.
13. Blockchain: A distributed, decentralized ledger that records transactions across many computers in a way that makes them difficult to alter retroactively.
14. Boosting: An ensemble technique that combines weak learners sequentially to create a strong learner, by focusing on the errors of prior models.
15. Chatbot: A software application that simulates human conversation through text or voice interactions, often using NLP.
16. Classification: A supervised learning task of predicting a discrete label for input data.
17. Clustering: An unsupervised learning technique that groups data points based on similarity.
18. CNN (Convolutional Neural Network): A class of deep neural networks, most commonly applied to analyzing visual imagery, using convolutional layers to detect patterns.
19. Cohort Analysis: A form of behavioral analytics that, rather than treating all users in a dataset as one unit, breaks them into related groups (cohorts) for analysis.
20. Computer Vision: A field of AI that enables computers to interpret and process visual data from the world (images, video).
21. Confusion Matrix: A table used to evaluate classification models, showing true vs. predicted labels (TP, TN, FP, FN).
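
A minimal sketch of how a confusion matrix is computed, assuming scikit-learn is available; the labels here are made up for illustration:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for a binary classifier
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))
```
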
22. Cross-Validation: A model validation technique for assessing how the results of a statistical analysis will generalize, by partitioning data into complementary subsets.
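
A short illustration of k-fold cross-validation, assuming scikit-learn and its built-in Iris toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Split the data into 5 folds; each fold serves once as the held-out set
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 folds
```
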
23. Data Cleaning: The process of detecting and correcting (or removing) corrupt or inaccurate records from a dataset.
24. Data Engineering: The practice of designing and building systems for collecting, storing, and analyzing data at scale.
25. Data Governance: The overall management of the availability, usability, integrity, and security of data within an organization.
26. Data Lake: A centralized repository that allows you to store all structured and unstructured data at any scale.
27. Data Mart: A subset of a data warehouse, usually oriented to a specific business line or team.
28. Data Mining: The practice of examining large pre-existing databases to generate new information.
29. Data Pipeline: A set of processes that move data from source to destination, applying transformations along the way.
30. Data Privacy: The proper handling, processing, storage, and usage of personal data.
31. Data Provenance: The record of the origin and transformations applied to data, ensuring traceability.
32. Data Quality: The measure of data’s condition, including accuracy, completeness, reliability, and relevance.
33. Data Science: An interdisciplinary field that uses scientific methods, algorithms, and systems to extract knowledge and insights from data.
34. Data Warehouse: A central repository of integrated data from multiple sources, structured for query and analysis.
35. Data Wrangling: The process of cleaning, structuring, and enriching raw data into a desired format for better decision making.
36. Decision Tree: A flowchart-like structure used for classification and regression, where each internal node represents a test on a feature.
37. Deep Learning: A subset of machine learning involving neural networks with many layers that can learn representations of data with multiple levels of abstraction.
38. Dimensionality Reduction: Techniques (e.g., PCA, t-SNE) to reduce the number of variables under consideration by obtaining a set of principal variables.
39. Dropout: A regularization technique for neural networks that randomly “drops out” units during training to prevent overfitting.
40. EDA (Exploratory Data Analysis): An approach to analyzing data sets to summarize their main characteristics, often using visual methods.
41. Embedding: A learned representation for categorical variables or items (e.g., words) as vectors in a continuous vector space.
42. Ensemble Learning: Methods that combine multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent algorithms alone.
43. Epoch: One complete pass through the entire training dataset during model training.
44. ETL (Extract, Transform, Load): The process of extracting data from sources, transforming it for analysis, and loading it into a target database or warehouse.
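
A minimal ETL sketch using pandas; the file, table, and column names below are hypothetical:

```python
import sqlite3

import pandas as pd

# Extract: read raw donation records (hypothetical CSV)
raw = pd.read_csv("donations_raw.csv")

# Transform: drop incomplete rows and standardize a date column
clean = raw.dropna(subset=["amount"])
clean["donated_at"] = pd.to_datetime(clean["donated_at"])

# Load: write the cleaned table into a local SQLite database
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("donations", conn, if_exists="replace", index=False)
```
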
45. Explainable AI (XAI): Techniques that make the outputs of AI models understandable to humans.
46. Feature Engineering: The process of using domain knowledge to create features that make machine learning algorithms work better.
47. Feature Importance: Metrics that assign a score to input features based on how useful they are at predicting a target variable.
48. Feature Selection: The process of selecting a subset of relevant features for model construction.
49. Federated Learning: A distributed approach to machine learning where the model is trained across multiple decentralized devices holding local data samples.
50. Fine-Tuning: The process of taking a pre-trained model and adapting it to a new, related task by continuing training on new data.
51. Forecasting: The process of making predictions about the future based on past and present data.
52. GAN (Generative Adversarial Network): A class of neural networks where two networks (generator and discriminator) compete, enabling generation of realistic data.
53. GPU (Graphics Processing Unit): A specialized processor optimized for parallel computations, widely used for training deep learning models.
54. Graph Database: A database that uses graph structures with nodes, edges, and properties to represent and store data.
55. GUI (Graphical User Interface): A user interface that allows users to interact with electronic devices through graphical icons.
56. Hadoop: An open-source framework for distributed storage and processing of large data sets using the MapReduce programming model.
57. Hyperparameter: A configuration parameter external to the model that cannot be estimated from data and must be set prior to training.
58. Hyperparameter Tuning: The process of searching for the optimal hyperparameter values for a learning algorithm.
59. Imbalanced Data: A dataset where the classes are not represented equally.
60. Inference: The process of using a trained model to make predictions on new data.
61. Input Layer: The first layer in a neural network that receives the input data.
62. Instance: A single data point or record in a dataset.
63. Integration Testing: Testing combined parts of an application to determine if they function together correctly.
64. Internet of Things (IoT): The network of physical objects embedded with sensors, software, and other technologies to connect and exchange data with other devices and systems.
65. Jupyter Notebook: An open-source web application for creating and sharing documents that contain live code, equations, visualizations, and narrative text.
66. K-Nearest Neighbors (KNN): A simple algorithm that stores all available cases and classifies new cases based on a similarity measure.
67. KPI (Key Performance Indicator): A measurable value that demonstrates how effectively an organization is achieving key objectives.
68. L1/L2 Regularization: Techniques that add a penalty to the loss function based on the magnitude (L1) or square (L2) of model coefficients to prevent overfitting.
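
In code form, the two penalties look like this; a NumPy sketch, where `lam1` and `lam2` are penalty-strength hyperparameters chosen by the modeler:

```python
import numpy as np

def regularized_loss(base_loss, weights, lam1=0.0, lam2=0.0):
    """Add L1 (absolute value) and L2 (squared) penalties to a base loss."""
    l1_penalty = lam1 * np.sum(np.abs(weights))
    l2_penalty = lam2 * np.sum(weights ** 2)
    return base_loss + l1_penalty + l2_penalty
```
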
69. Label Encoding: Converting categorical text values into numeric labels that models can interpret.
70. Large Language Model (LLM): A deep learning model, often transformer-based, trained on massive text corpora to understand and generate human language.
71. Latent Variable: A variable that is not directly observed but is inferred from other variables.
72. Latency: The time delay between an input being processed and the corresponding output.
73. Layer: A collection of neurons in a neural network; includes input, hidden, and output layers.
74. Learning Curve: A plot of model learning performance over experience or time (e.g., training iterations).
75. Learning Rate: A hyperparameter that controls how much to change the model in response to the estimated error each time the model weights are updated.
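
The role of the learning rate is easiest to see in the basic gradient descent update, sketched here for a single step; the gradient values are a stand-in for whatever backpropagation (entry 8) computes:

```python
import numpy as np

def gradient_descent_step(weights, gradient, learning_rate=0.01):
    # Move the weights a small step against the gradient of the loss
    return weights - learning_rate * gradient

w = np.array([0.5, -1.2])
grad = np.array([0.1, -0.4])  # hypothetical gradient of the loss at w
w = gradient_descent_step(w, grad)
```
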
76. Linear Regression: A statistical method for modeling the relationship between a scalar response and one or more explanatory variables.
77. Logistic Regression: A statistical method for binary classification that models the probability of a binary outcome.
78. Log Loss (Cross-Entropy Loss): A performance metric for classification models measuring the distance between predicted probabilities and actual labels.
79. Looker: A data exploration and business intelligence platform.
80. Loss Function: A function that maps values of one or more variables onto a real number representing some “cost” associated with those values.
81. LSTM (Long Short-Term Memory): A type of recurrent neural network capable of learning long-term dependencies.
82. Machine Learning (ML): A subset of AI that gives computers the ability to learn from data without being explicitly programmed.
83. Macro-Average: Averaging performance metrics independently for each class and then taking the average, treating all classes equally.
84. Magnitude: The size or length of a vector; often used in the context of embeddings.
85. MapReduce: A programming model for processing large data sets with a parallel, distributed algorithm.
86. Markov Chain: A stochastic process where the next state depends only on the current state, not on the sequence of events that preceded it.
87. Masked Language Model: A language model trained by hiding (“masking”) some tokens and predicting them from context.
88. Mean Absolute Error (MAE): The average of absolute differences between predicted and actual values.
89. Mean Squared Error (MSE): The average of squared differences between predicted and actual values.
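
MAE (entry 88), MSE, and RMSE (entry 139) differ only in how errors are aggregated; a NumPy sketch with made-up values:

```python
import numpy as np

actual = np.array([3.0, 5.0, 2.0])
predicted = np.array([2.5, 5.5, 2.0])

errors = predicted - actual
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error
rmse = np.sqrt(mse)            # root mean squared error (entry 139)
```
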
90. MediaPipe: A cross-platform framework for building multimodal (e.g., video, audio) ML pipelines.
91. Metadata: Data that provides information about other data (e.g., creation date, author, format).
92. Microservice: An architectural style that structures an application as a collection of loosely coupled services.
93. Model Compression: Techniques to reduce the size of a trained model for deployment on resource-constrained devices.
94. Model Explainability: The degree to which a human can understand the cause of a decision made by a model.
95. Model Persistence: Saving a trained model to disk for later reuse.
96. Model Serving: Deploying a trained model so that it can respond to inference requests.
97. Monte Carlo Simulation: A computational algorithm that uses random sampling to obtain numerical results for probabilistic systems.
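
A classic Monte Carlo example is estimating π by sampling random points in a unit square and counting how many land inside the quarter circle:

```python
import random

n = 1_000_000
inside = sum(
    1 for _ in range(n)
    if random.random() ** 2 + random.random() ** 2 <= 1.0
)
# The fraction of points inside the quarter circle approximates pi/4
print(4 * inside / n)
```
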
98. Multicollinearity: A situation in which two or more predictor variables in a multiple regression model are highly correlated.
99. Multilabel Classification: A classification task where each instance may be assigned multiple labels.
100. Multivariate Analysis: The examination of more than two variables to determine relationships and patterns.
101. Naive Bayes: A family of probabilistic classifiers based on applying Bayes’ theorem with strong independence assumptions between features.
102. Natural Language Processing (NLP): The field of AI focused on the interaction between computers and human (natural) languages.
103. Neural Network: A network of interconnected nodes organized in layers that can learn complex patterns in data.
104. NLG (Natural Language Generation): The process of automatically generating coherent text from data.
105. NLP Pipeline: A sequence of processing steps (tokenization, parsing, tagging) to analyze textual data.
106. Node: A basic unit of a data structure, such as a linked list or tree, or a point in a graph.
107. Normalization: Scaling numeric data to a common range, often [0, 1], to improve model performance.
108. Object Detection: A computer vision task of identifying and localizing objects within an image.
109. OCR (Optical Character Recognition): Technology to convert different types of documents, such as scanned paper documents, into editable and searchable data.
110. One-Hot Encoding: Representing categorical variables as binary vectors.
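
For example, with pandas; the column and its values are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"region": ["north", "south", "north", "east"]})

# Each category becomes its own binary (0/1) column
print(pd.get_dummies(df, columns=["region"]))
```
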
111. Online Learning: A model training paradigm where the model is updated incrementally as new data arrives.
112. Outlier: An observation point that is distant from other observations, possibly indicating variability in measurement or experimental error.
113. Overfitting: When a model learns training data too well, capturing noise and failing to generalize to new data.
114. Parameter: An internal configuration variable of a model learned from data (e.g., weights in a neural network).
115. PCA (Principal Component Analysis): A technique to reduce dimensionality by transforming to a new set of variables (principal components) ordered by variance.
116. Precision: The ratio of true positive predictions to the total predicted positives.
117. Predictive Modeling: The process of using data and statistical algorithms to predict outcomes.
118. Precision-Recall Curve: A plot of precision vs. recall for different thresholds, useful for imbalanced datasets.
119. Principal Component: A new variable constructed as a linear combination of original variables in PCA.
120. Probability Distribution: A function that describes the likelihood of each outcome in an experiment.
121. Productionalization: The process of deploying and integrating a model into a live production environment.
122. Programmatic Access: Interacting with software or data via code rather than manually.
123. Propensity Score: The probability of assignment to a particular treatment given covariates, used in causal inference.
124. PR Curve: Precision-Recall curve.
125. Python: A high-level, interpreted programming language widely used in data science and AI.
126. R-Squared: A statistical measure representing the proportion of variance for a dependent variable explained by independent variables in a regression model.
127. Random Forest: An ensemble learning method using multiple decision trees to improve predictive accuracy and control overfitting.
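
A minimal random forest sketch, assuming scikit-learn and its Iris toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 decision trees vote on each prediction
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```
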
128. RapidMiner: A data science platform for building predictive models.
129. Recommender System: An information filtering system that predicts user preferences for items.
130. Recall: The ratio of true positive predictions to the total actual positives.
131. Regression: A statistical method for modeling relationships between variables.
132. Reinforcement Learning: A type of machine learning where agents learn to make decisions by performing actions and receiving rewards.
133. Repeatability: The degree to which an experiment or measurement yields the same results under unchanged conditions.
134. Reproducibility: The ability to duplicate the results of a study using the same data and methods.
135. ResNet: A deep neural network architecture with “skip connections” that mitigate the vanishing gradient problem.
136. REST (Representational State Transfer): An architectural style for designing networked applications using stateless communication.
137. Return on Investment (ROI): A performance measure used to evaluate the efficiency of an investment, calculated as (gain − cost) / cost.
138. ROC Curve: Receiver Operating Characteristic curve; a plot of true positive rate vs. false positive rate at various thresholds.
139. Root Mean Squared Error (RMSE): The square root of the average of squared differences between predicted and actual values.
140. Sampling: The process of selecting a subset of data from a population for analysis.
141. Scalability: The capability of a system to handle growing amounts of work by adding resources.
142. Schema: The structure that defines the organization of data in a database.
143. Score Function: A function that assigns a numerical score to potential outputs of a model.
144. Script: A file containing a sequence of instructions executed by a program.
145. Search Algorithm: An algorithm for retrieving information stored within some data structure or calculated in the search space of a problem domain.
146. Semantic Segmentation: A computer vision task that assigns a class label to each pixel in an image.
147. Sensitivity Analysis: The study of how uncertainty in model output can be apportioned to different sources of uncertainty in model input.
148. Sentiment Analysis: The use of NLP to identify and extract subjective information from text.
149. Sequence Model: Models (e.g., RNN, LSTM) designed to handle sequential data.
150. Sigmoid Function: An S-shaped activation function used in neural networks, mapping inputs to (0, 1).
151. Similarity Measure: A metric that quantifies the similarity between two data objects.
152. Simplex Algorithm: A popular algorithm for the numerical solution of linear programming problems.
153. Skewness: A measure of the asymmetry of the probability distribution of a real-valued variable.
154. SLAM (Simultaneous Localization and Mapping): A technique used in robotics and autonomous vehicles to build a map of an unknown environment while simultaneously keeping track of an agent’s location.
155. Softmax Function: An activation function that converts a vector of values into a probability distribution.
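
Both the sigmoid (entry 150) and softmax functions map raw model outputs to probabilities; a NumPy sketch of each:

```python
import numpy as np

def sigmoid(x):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtracting the max improves numerical stability
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

print(sigmoid(0.0))                        # 0.5
print(softmax(np.array([2.0, 1.0, 0.1])))  # probabilities summing to 1
```
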
156. Software as a Service (SaaS): A software licensing model where access is provided on a subscription basis and hosted centrally.
157. Source Code: The human-readable instructions that define what a program does.
158. Spectral Clustering: A clustering technique using the eigenvalues of a similarity matrix to reduce dimensions before clustering.
159. SQL (Structured Query Language): A domain-specific language used in programming for managing relational databases.
160. Stack: A data structure that follows the Last In, First Out (LIFO) principle.
161. Standardization: Scaling data to have zero mean and unit variance.
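
Normalization (entry 107) and standardization are both rescalings of the same data; a NumPy sketch of each:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])

# Normalization: rescale to the range [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())

# Standardization: rescale to zero mean and unit variance
standardized = (x - x.mean()) / x.std()
```
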
162. Statistical Significance: An indication that an observed result or relationship is unlikely to be due to random chance alone.
163. Stepwise Regression: A method of fitting regression models by adding or removing predictors based on statistical criteria.
164. Stop Word: Commonly used words (e.g., “the,” “is”) that are often removed in NLP preprocessing.
165. Stratified Sampling: A sampling method that divides the population into strata and samples from each stratum.
166. Stream Processing: Real-time processing of data in motion rather than at rest.
167. Structured Data: Data that adheres to a pre-defined data model and is easily searchable.
168. Supervised Learning: ML tasks where models are trained on labeled data.
169. Support Vector Machine (SVM): A supervised learning model that finds the hyperplane that best separates classes in feature space.
170. Survey Analysis: The practice of examining survey data to extract insights.
171. Swarm Intelligence: The collective behavior of decentralized systems, natural or artificial.
172. TF-IDF (Term Frequency-Inverse Document Frequency): A numerical statistic intended to reflect how important a word is to a document in a collection or corpus.
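
A short TF-IDF sketch, assuming a recent scikit-learn; the tiny corpus is made up for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "food security programs in rural areas",
    "education programs for rural youth",
    "urban food distribution",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(corpus)  # documents x vocabulary matrix
print(vectorizer.get_feature_names_out())
```
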
173. TCO (Total Cost of Ownership): The total cost of purchasing, operating, and maintaining a system over its life cycle.
174. Tensor: A multi-dimensional array used by deep learning frameworks.
175. TensorFlow: An open-source library for dataflow and differentiable programming, commonly used for deep learning.
176. Testing Set: A subset of data used to assess the performance of a fully trained model.
177. Time Series Analysis: Techniques for analyzing time-ordered data points to extract meaningful statistics and trends.
178. Tokenization: The process of breaking text into individual words or subwords (tokens).
179. TP/FP/TN/FN: True Positive, False Positive, True Negative, False Negative; the components of the confusion matrix.
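
These four counts are the raw material for precision (entry 116) and recall (entry 130); a sketch with made-up counts:

```python
tp, fp, tn, fn = 40, 10, 35, 15  # hypothetical counts

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of all actual positives, how much was found
accuracy = (tp + tn) / (tp + fp + tn + fn)
```
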
180. Transfer Learning: Reusing a pre-trained model on a new, related problem, often requiring less data.
181. Tree Ensemble: An ensemble of decision trees (e.g., random forest, gradient boosting).
182. Training Set: The portion of data used to fit model parameters.
183. Tuning: The process of optimizing model hyperparameters.
184. Underfitting: When a model is too simple to capture underlying patterns in data, resulting in poor performance.
185. Unstructured Data: Data that does not adhere to a pre-defined model (e.g., text, images).
186. Unsupervised Learning: ML tasks where models find patterns in unlabeled data.
187. Validation Set: A subset of data used to tune hyperparameters and prevent overfitting.
188. Variance: The variability of model predictions for different training data.
189. Vector: A one-dimensional array of numbers representing features or embeddings.
190. Video Analytics: The process of applying computer vision to extract meaningful information from video.
191. Virtual Assistant: A software agent that can perform tasks or services for an individual based on commands or questions.
192. Visualization: The graphical representation of data or model outputs.
193. Voice Recognition: The ability of a system to identify and process human speech.
194. Web Scraping: Automated extraction of information from websites.
195. Weight Decay: A regularization technique that adds a penalty proportional to the magnitude of weights to the loss function.
196. Word Embedding: A representation of words as continuous vectors capturing semantic meaning.
197. Word2Vec: A two-layer neural network that produces word embeddings by predicting context words.
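
A minimal Word2Vec sketch, assuming the Gensim library is available; the tiny corpus is illustrative only, and real training needs far more text:

```python
from gensim.models import Word2Vec

sentences = [
    ["clean", "water", "access"],
    ["water", "sanitation", "program"],
    ["access", "to", "education"],
]

# vector_size is the embedding dimension; window is the context size
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1)
print(model.wv["water"])  # the learned 50-dimensional embedding
```
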
198. XGBoost: An optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable.
199. XML (eXtensible Markup Language): A markup language that defines a set of rules for encoding documents in a format readable by both humans and machines.
200. Zero-Shot Learning: The ability of a model to correctly make predictions on classes it has not seen during training.
