AI & Data Glossary for the Social Sector

The definitive glossary for understanding artificial intelligence (AI) in the social sector.

1. Algorithm: A finite set of well-defined instructions to solve a problem or perform a computation.
2. Analytics: The discovery, interpretation, and communication of meaningful patterns in data.
3. Artificial Intelligence (AI): The simulation of human intelligence processes by machines, especially computer systems.
4. Artificial Neural Network (ANN): A computing system inspired by biological neural networks, consisting of interconnected layers of nodes (“neurons”) that process data via weighted connections.
5. API (Application Programming Interface): A set of routines, protocols, and tools for building software and applications, enabling different systems to communicate.
6. A/B Testing: A method of comparing two versions of something (A and B) to determine which performs better under controlled conditions.
7. AutoML: Automated Machine Learning; techniques and tools that automate the end-to-end process of applying machine learning to real-world problems.
8. Backpropagation: A training algorithm for neural networks that calculates the gradient of the loss function and updates weights via gradient descent.
9. Batch Processing: Executing a series of jobs on a computer without manual intervention, processing data in large groups (“batches”).
10. Big Data: Extremely large and complex data sets that traditional data processing applications cannot handle efficiently.
11. BI (Business Intelligence): Technologies, applications, and practices for the collection, integration, analysis, and presentation of business information.
12. Black-Box Model: A model whose internal workings are not visible or interpretable by the user.
13. Blockchain: A distributed, decentralized ledger that records transactions across many computers in a way that makes them difficult to alter retroactively.
14. Boosting: An ensemble technique that combines weak learners sequentially to create a strong learner, by focusing on the errors of prior models.
15. Chatbot: A software application that simulates human conversation through text or voice interactions, often using NLP.
16. Classification: A supervised learning task of predicting a discrete label for input data.
17. Clustering: An unsupervised learning technique that groups data points based on similarity.
18. CNN (Convolutional Neural Network): A class of deep neural networks, most commonly applied to analyzing visual imagery, using convolutional layers to detect patterns.
19. Cohort Analysis: A form of behavioral analytics that, rather than treating all users in a dataset as one unit, breaks them into related groups (cohorts) for analysis.
20. Computer Vision: A field of AI that enables computers to interpret and process visual data from the world (images, video).
21. Confusion Matrix: A table used to evaluate classification models, showing true vs. predicted labels (TP, TN, FP, FN).
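
A minimal sketch of how a confusion matrix is computed, assuming scikit-learn is available; the labels here are made up for illustration:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for a binary classifier
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))
```
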
22. Cross-Validation: A model validation technique for assessing how the results of a statistical analysis will generalize, by partitioning data into complementary subsets.
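
A short illustration of k-fold cross-validation, assuming scikit-learn and its built-in Iris toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Split the data into 5 folds; each fold serves once as the held-out set
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 folds
```
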
23. Data Cleaning: The process of detecting and correcting (or removing) corrupt or inaccurate records from a dataset.
24. Data Engineering: The practice of designing and building systems for collecting, storing, and analyzing data at scale.
25. Data Governance: The overall management of the availability, usability, integrity, and security of data within an organization.
26. Data Lake: A centralized repository that allows you to store all structured and unstructured data at any scale.
27. Data Mart: A subset of a data warehouse, usually oriented to a specific business line or team.
28. Data Mining: The practice of examining large pre-existing databases to generate new information.
29. Data Pipeline: A set of processes that move data from source to destination, applying transformations along the way.
30. Data Privacy: The proper handling, processing, storage, and usage of personal data.
31. Data Provenance: The record of the origin and transformations applied to data, ensuring traceability.
32. Data Quality: The measure of data’s condition, including accuracy, completeness, reliability, and relevance.
33. Data Science: An interdisciplinary field that uses scientific methods, algorithms, and systems to extract knowledge and insights from data.
34. Data Warehouse: A central repository of integrated data from multiple sources, structured for query and analysis.
35. Data Wrangling: The process of cleaning, structuring, and enriching raw data into a desired format for better decision making.
36. Decision Tree: A flowchart-like structure used for classification and regression, where each internal node represents a test on a feature.
37. Deep Learning: A subset of machine learning involving neural networks with many layers that can learn representations of data with multiple levels of abstraction.
38. Dimensionality Reduction: Techniques (e.g., PCA, t-SNE) to reduce the number of variables under consideration by obtaining a set of principal variables.
39. Dropout: A regularization technique for neural networks that randomly “drops out” units during training to prevent overfitting.
40. EDA (Exploratory Data Analysis): An approach to analyzing data sets to summarize their main characteristics, often using visual methods.
41. Embedding: A learned representation for categorical variables or items (e.g., words) as vectors in a continuous vector space.
42. Ensemble Learning: Methods that combine multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent algorithms alone.
43. Epoch: One complete pass through the entire training dataset during model training.
44. ETL (Extract, Transform, Load): The process of extracting data from sources, transforming it for analysis, and loading it into a target database or warehouse.
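
A minimal ETL sketch using pandas; the file, table, and column names below are hypothetical:

```python
import sqlite3

import pandas as pd

# Extract: read raw donation records (hypothetical CSV)
raw = pd.read_csv("donations_raw.csv")

# Transform: drop incomplete rows and standardize a date column
clean = raw.dropna(subset=["amount"])
clean["donated_at"] = pd.to_datetime(clean["donated_at"])

# Load: write the cleaned table into a local SQLite database
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("donations", conn, if_exists="replace", index=False)
```
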
45. Explainable AI (XAI): Techniques that make the outputs of AI models understandable to humans.
46. Feature Engineering: The process of using domain knowledge to create features that make machine learning algorithms work better.
47. Feature Importance: Metrics that assign a score to input features based on how useful they are at predicting a target variable.
48. Feature Selection: The process of selecting a subset of relevant features for model construction.
49. Federated Learning: A distributed approach to machine learning where the model is trained across multiple decentralized devices holding local data samples.
50. Fine-Tuning: The process of taking a pre-trained model and adapting it to a new, related task by continuing training on new data.
51. Forecasting: The process of making predictions about the future based on past and present data.
52. GAN (Generative Adversarial Network): A class of neural networks where two networks (generator and discriminator) compete, enabling generation of realistic data.
53. GPU (Graphics Processing Unit): A specialized processor optimized for parallel computations, widely used for training deep learning models.
54. Graph Database: A database that uses graph structures with nodes, edges, and properties to represent and store data.
55. GUI (Graphical User Interface): A user interface that allows users to interact with electronic devices through graphical icons.
56. Hadoop: An open-source framework for distributed storage and processing of large data sets using the MapReduce programming model.
57. Hyperparameter: A configuration parameter external to the model that cannot be estimated from data and must be set prior to training.
58. Hyperparameter Tuning: The process of searching for the optimal hyperparameter values for a learning algorithm.
59. Imbalanced Data: A dataset where the classes are not represented equally.
60. Inference: The process of using a trained model to make predictions on new data.
61. Input Layer: The first layer in a neural network that receives the input data.
62. Instance: A single data point or record in a dataset.
63. Integration Testing: Testing combined parts of an application to determine if they function together correctly.
64. Internet of Things (IoT): The network of physical objects embedded with sensors, software, and other technologies to connect and exchange data with other devices and systems.
65. Jupyter Notebook: An open-source web application for creating and sharing documents that contain live code, equations, visualizations, and narrative text.
66. K-Nearest Neighbors (KNN): A simple algorithm that stores all available cases and classifies new cases based on a similarity measure.
67. KPI (Key Performance Indicator): A measurable value that demonstrates how effectively an organization is achieving key objectives.
68. L1/L2 Regularization: Techniques that add a penalty to the loss function based on the magnitude (L1) or square (L2) of model coefficients to prevent overfitting.
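
In code form, the two penalties look like this; a NumPy sketch, where `lam1` and `lam2` are penalty-strength hyperparameters chosen by the modeler:

```python
import numpy as np

def regularized_loss(base_loss, weights, lam1=0.0, lam2=0.0):
    """Add L1 (absolute value) and L2 (squared) penalties to a base loss."""
    l1_penalty = lam1 * np.sum(np.abs(weights))
    l2_penalty = lam2 * np.sum(weights ** 2)
    return base_loss + l1_penalty + l2_penalty
```
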
69. Label Encoding: Converting categorical text values into numeric labels that models can interpret.
70. Large Language Model (LLM): A deep learning model, often transformer-based, trained on massive text corpora to understand and generate human language.
71. Latent Variable: A variable that is not directly observed but is inferred from other variables.
72. Latency: The time delay between an input being processed and the corresponding output.
73. Layer: A collection of neurons in a neural network; includes input, hidden, and output layers.
74. Learning Curve: A plot of model learning performance over experience or time (e.g., training iterations).
75. Learning Rate: A hyperparameter that controls how much to change the model in response to the estimated error each time the model weights are updated.
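
The role of the learning rate is easiest to see in the basic gradient descent update, sketched here for a single step; the gradient values are a stand-in for whatever backpropagation (entry 8) computes:

```python
import numpy as np

def gradient_descent_step(weights, gradient, learning_rate=0.01):
    # Move the weights a small step against the gradient of the loss
    return weights - learning_rate * gradient

w = np.array([0.5, -1.2])
grad = np.array([0.1, -0.4])  # hypothetical gradient of the loss at w
w = gradient_descent_step(w, grad)
```
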
76. Linear Regression: A statistical method for modeling the relationship between a scalar response and one or more explanatory variables.
77. Logistic Regression: A statistical method for binary classification that models the probability of a binary outcome.
78. Log Loss (Cross-Entropy Loss): A performance metric for classification models measuring the distance between predicted probabilities and actual labels.
79. Looker: A data exploration and business intelligence platform.
80. Loss Function: A function that maps values of one or more variables onto a real number representing some “cost” associated with those values.
81. LSTM (Long Short-Term Memory): A type of recurrent neural network capable of learning long-term dependencies.
82. Machine Learning (ML): A subset of AI that gives computers the ability to learn from data without being explicitly programmed.
83. Macro-Average: Averaging performance metrics independently for each class and then taking the average, treating all classes equally.
84. Magnitude: The size or length of a vector; often used in the context of embeddings.
85. MapReduce: A programming model for processing large data sets with a parallel, distributed algorithm.
86. Markov Chain: A stochastic process where the next state depends only on the current state, not on the sequence of events that preceded it.
87. Masked Language Model: A language model trained by hiding (“masking”) some tokens and predicting them from context.
88. Mean Absolute Error (MAE): The average of absolute differences between predicted and actual values.
89. Mean Squared Error (MSE): The average of squared differences between predicted and actual values.
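
MAE (entry 88), MSE, and RMSE (entry 139) differ only in how errors are aggregated; a NumPy sketch with made-up values:

```python
import numpy as np

actual = np.array([3.0, 5.0, 2.0])
predicted = np.array([2.5, 5.5, 2.0])

errors = predicted - actual
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error
rmse = np.sqrt(mse)            # root mean squared error (entry 139)
```
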
90. MediaPipe: A cross-platform framework for building multimodal (e.g., video, audio) ML pipelines.
91. Metadata: Data that provides information about other data (e.g., creation date, author, format).
92. Microservice: An architectural style that structures an application as a collection of loosely coupled services.
93. Model Compression: Techniques to reduce the size of a trained model for deployment on resource-constrained devices.
94. Model Explainability: The degree to which a human can understand the cause of a decision made by a model.
95. Model Persistence: Saving a trained model to disk for later reuse.
96. Model Serving: Deploying a trained model so that it can respond to inference requests.
97. Monte Carlo Simulation: A computational algorithm that uses random sampling to obtain numerical results for probabilistic systems.
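
A classic Monte Carlo example is estimating π by sampling random points in a unit square and counting how many land inside the quarter circle:

```python
import random

n = 1_000_000
inside = sum(
    1 for _ in range(n)
    if random.random() ** 2 + random.random() ** 2 <= 1.0
)
# The fraction of points inside the quarter circle approximates pi/4
print(4 * inside / n)
```
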
98. Multicollinearity: A situation in which two or more predictor variables in a multiple regression model are highly correlated.
99. Multilabel Classification: A classification task where each instance may be assigned multiple labels.
100. Multivariate Analysis: The examination of more than two variables to determine relationships and patterns.
101. Naive Bayes: A family of probabilistic classifiers based on applying Bayes’ theorem with strong independence assumptions between features.
102. Natural Language Processing (NLP): The field of AI focused on the interaction between computers and human (natural) languages.
103. Neural Network: A network of interconnected nodes organized in layers that can learn complex patterns in data.
104. NLG (Natural Language Generation): The process of automatically generating coherent text from data.
105. NLP Pipeline: A sequence of processing steps (tokenization, parsing, tagging) to analyze textual data.
106. Node: A basic unit of a data structure, such as a linked list or tree, or a point in a graph.
107. Normalization: Scaling numeric data to a common range, often [0, 1], to improve model performance.
108. Object Detection: A computer vision task of identifying and localizing objects within an image.
109. OCR (Optical Character Recognition): Technology to convert different types of documents, such as scanned paper documents, into editable and searchable data.
110. One-Hot Encoding: Representing categorical variables as binary vectors.
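
For example, with pandas; the column and its values are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"region": ["north", "south", "north", "east"]})

# Each category becomes its own binary (0/1) column
print(pd.get_dummies(df, columns=["region"]))
```
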
111. Online Learning: A model training paradigm where the model is updated incrementally as new data arrives.
112. Outlier: An observation point that is distant from other observations, possibly indicating variability in measurement or experimental error.
113. Overfitting: When a model learns training data too well, capturing noise and failing to generalize to new data.
114. Parameter: An internal configuration variable of a model learned from data (e.g., weights in a neural network).
115. PCA (Principal Component Analysis): A technique to reduce dimensionality by transforming to a new set of variables (principal components) ordered by variance.
116. Precision: The ratio of true positive predictions to the total predicted positives.
117. Predictive Modeling: The process of using data and statistical algorithms to predict outcomes.
118. Precision-Recall Curve: A plot of precision vs. recall for different thresholds, useful for imbalanced datasets.
119. Principal Component: A new variable constructed as a linear combination of original variables in PCA.
120. Probability Distribution: A function that describes the likelihood of each outcome in an experiment.
121. Productionalization: The process of deploying and integrating a model into a live production environment.
122. Programmatic Access: Interacting with software or data via code rather than manually.
123. Propensity Score: The probability of assignment to a particular treatment given covariates, used in causal inference.
124. PR Curve: Precision-Recall curve.
125. Python: A high-level, interpreted programming language widely used in data science and AI.
126. R-Squared: A statistical measure representing the proportion of variance for a dependent variable explained by independent variables in a regression model.
127. Random Forest: An ensemble learning method using multiple decision trees to improve predictive accuracy and control overfitting.
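
A minimal random forest sketch, assuming scikit-learn and its Iris toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 decision trees vote on each prediction
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```
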
128. RapidMiner: A data science platform for building predictive models.
129. Recommender System: An information filtering system that predicts user preferences for items.
130. Recall: The ratio of true positive predictions to the total actual positives.
131. Regression: A statistical method for modeling relationships between variables.
132. Reinforcement Learning: A type of machine learning where agents learn to make decisions by performing actions and receiving rewards.
133. Repeatability: The degree to which an experiment or measurement yields the same results under unchanged conditions.
134. Reproducibility: The ability to duplicate the results of a study using the same data and methods.
135. ResNet: A deep neural network architecture with “skip connections” that mitigate the vanishing gradient problem.
136. REST (Representational State Transfer): An architectural style for designing networked applications using stateless communication.
137. Return on Investment (ROI): A performance measure used to evaluate the efficiency of an investment, calculated as (gain − cost) / cost.
138. ROC Curve: Receiver Operating Characteristic curve; a plot of true positive rate vs. false positive rate at various thresholds.
139. Root Mean Squared Error (RMSE): The square root of the average of squared differences between predicted and actual values.
140. Sampling: The process of selecting a subset of data from a population for analysis.
141. Scalability: The capability of a system to handle growing amounts of work by adding resources.
142. Schema: The structure that defines the organization of data in a database.
143. Score Function: A function that assigns a numerical score to potential outputs of a model.
144. Script: A file containing a sequence of instructions executed by a program.
145. Search Algorithm: An algorithm for retrieving information stored within some data structure or calculated in the search space of a problem domain.
146. Semantic Segmentation: A computer vision task that assigns a class label to each pixel in an image.
147. Sensitivity Analysis: The study of how uncertainty in model output can be apportioned to different sources of uncertainty in model input.
148. Sentiment Analysis: The use of NLP to identify and extract subjective information from text.
149. Sequence Model: Models (e.g., RNN, LSTM) designed to handle sequential data.
150. Sigmoid Function: An S-shaped activation function used in neural networks, mapping inputs to (0, 1).
151. Similarity Measure: A metric that quantifies the similarity between two data objects.
152. Simplex Algorithm: A popular algorithm for the numerical solution of linear programming problems.
153. Skewness: A measure of the asymmetry of the probability distribution of a real-valued variable.
154. SLAM (Simultaneous Localization and Mapping): A technique used in robotics and autonomous vehicles to build a map of an unknown environment while simultaneously keeping track of an agent’s location.
155. Softmax Function: An activation function that converts a vector of values into a probability distribution.
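
Both the sigmoid (entry 150) and softmax functions map raw model outputs to probabilities; a NumPy sketch of each:

```python
import numpy as np

def sigmoid(x):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtracting the max improves numerical stability
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

print(sigmoid(0.0))                        # 0.5
print(softmax(np.array([2.0, 1.0, 0.1])))  # probabilities summing to 1
```
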
156. Software as a Service (SaaS): A software licensing model where access is provided on a subscription basis and hosted centrally.
157. Source Code: The human-readable instructions that define what a program does.
158. Spectral Clustering: A clustering technique using the eigenvalues of a similarity matrix to reduce dimensions before clustering.
159. SQL (Structured Query Language): A domain-specific language used in programming for managing relational databases.
160. Stack: A data structure that follows the Last In, First Out (LIFO) principle.
161. Standardization: Scaling data to have zero mean and unit variance.
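
Normalization (entry 107) and standardization are both rescalings of the same data; a NumPy sketch of each:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])

# Normalization: rescale to the range [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())

# Standardization: rescale to zero mean and unit variance
standardized = (x - x.mean()) / x.std()
```
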
162. Statistical Significance: An indication that an observed result or relationship is unlikely to be due to random chance alone.
163. Stepwise Regression: A method of fitting regression models by adding or removing predictors based on statistical criteria.
164. Stop Word: Commonly used words (e.g., “the,” “is”) that are often removed in NLP preprocessing.
165. Stratified Sampling: A sampling method that divides the population into strata and samples from each stratum.
166. Stream Processing: Real-time processing of data in motion rather than at rest.
167. Structured Data: Data that adheres to a pre-defined data model and is easily searchable.
168. Supervised Learning: ML tasks where models are trained on labeled data.
169. Support Vector Machine (SVM): A supervised learning model that finds the hyperplane that best separates classes in feature space.
170. Survey Analysis: The practice of examining survey data to extract insights.
171. Swarm Intelligence: The collective behavior of decentralized systems, natural or artificial.
172. TF-IDF (Term Frequency-Inverse Document Frequency): A numerical statistic intended to reflect how important a word is to a document in a collection or corpus.
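
A short TF-IDF sketch, assuming a recent scikit-learn; the tiny corpus is made up for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "food security programs in rural areas",
    "education programs for rural youth",
    "urban food distribution",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(corpus)  # documents x vocabulary matrix
print(vectorizer.get_feature_names_out())
```
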
173. TCO (Total Cost of Ownership): The total cost of purchasing, operating, and maintaining a system over its life cycle.
174. Tensor: A multi-dimensional array used by deep learning frameworks.
175. TensorFlow: An open-source library for dataflow and differentiable programming, commonly used for deep learning.
176. Testing Set: A subset of data used to assess the performance of a fully trained model.
177. Time Series Analysis: Techniques for analyzing time-ordered data points to extract meaningful statistics and trends.
178. Tokenization: The process of breaking text into individual words or subwords (tokens).
179. TP/FP/TN/FN: True Positive, False Positive, True Negative, False Negative; the components of the confusion matrix.
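
These four counts are the raw material for precision (entry 116) and recall (entry 130); a sketch with made-up counts:

```python
tp, fp, tn, fn = 40, 10, 35, 15  # hypothetical counts

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of all actual positives, how much was found
accuracy = (tp + tn) / (tp + fp + tn + fn)
```
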
180. Transfer Learning: Reusing a pre-trained model on a new, related problem, often requiring less data.
181. Tree Ensemble: An ensemble of decision trees (e.g., random forest, gradient boosting).
182. Training Set: The portion of data used to fit model parameters.
183. Tuning: The process of optimizing model hyperparameters.
184. Underfitting: When a model is too simple to capture underlying patterns in data, resulting in poor performance.
185. Unstructured Data: Data that does not adhere to a pre-defined model (e.g., text, images).
186. Unsupervised Learning: ML tasks where models find patterns in unlabeled data.
187. Validation Set: A subset of data used to tune hyperparameters and prevent overfitting.
188. Variance: The variability of model predictions for different training data.
189. Vector: A one-dimensional array of numbers representing features or embeddings.
190. Video Analytics: The process of applying computer vision to extract meaningful information from video.
191. Virtual Assistant: A software agent that can perform tasks or services for an individual based on commands or questions.
192. Visualization: The graphical representation of data or model outputs.
193. Voice Recognition: The ability of a system to identify and process human speech.
194. Web Scraping: Automated extraction of information from websites.
195. Weight Decay: A regularization technique that adds a penalty proportional to the magnitude of weights to the loss function.
196. Word Embedding: A representation of words as continuous vectors capturing semantic meaning.
197. Word2Vec: A two-layer neural network that produces word embeddings by predicting context words.
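
A minimal Word2Vec sketch, assuming the Gensim library is available; the tiny corpus is illustrative only, and real training needs far more text:

```python
from gensim.models import Word2Vec

sentences = [
    ["clean", "water", "access"],
    ["water", "sanitation", "program"],
    ["access", "to", "education"],
]

# vector_size is the embedding dimension; window is the context size
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1)
print(model.wv["water"])  # the learned 50-dimensional embedding
```
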
198. XGBoost: An optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable.
199. XML (eXtensible Markup Language): A markup language that defines a set of rules for encoding documents in a format readable by both humans and machines.
200. Zero-Shot Learning: The ability of a model to correctly make predictions on classes it has not seen during training.
