Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

Bonferroni Correction(also: Bonferroni Adjustment): The Bonferroni correction is a statistical adjustment that controls the family-wise error rate when multiple hypothesis tests are performed on the same data. It divides the target significance threshold (commonly 0.05) by the number of comparisons, so with three pairwise tests…
Chernoff Faces(also: Chernoff's Faces): A visualisation technique introduced by Herman Chernoff in 1973 that represents multivariate data by mapping each data variable to a facial feature — eye size, eye spacing, nose length, mouth curvature, face shape, and so on — producing one cartoon face per data sample. The idea…
Click-Time Distribution(also: Timing Profile, Click Precision): A statistical model of when a switch user activates their switch relative to a target timing event, used to characterize the precision and consistency of a user's motor control. In the Nomon interface, the click-time distribution measures how accurately a user clicks when a…
Cohen's Kappa(also: Kappa Statistic, Kappa Coefficient): A statistical measure of inter-rater reliability that accounts for agreement occurring by chance, used to assess the consistency between two raters coding qualitative data. Values range from -1 to 1, where 1 indicates perfect agreement, 0 indicates agreement no better…
Cumulative Link Mixed Model(also: CLMM, Ordinal Mixed Model): A statistical model for analysing ordinal outcome data (such as Likert-scale ratings) that includes both fixed effects (experimental conditions) and random effects (participants, stimuli). CLMMs use a link function — commonly logit — to relate ordered categorical responses to…
DBSCAN(also: Density-Based Spatial Clustering of Applications with Noise): A density-based clustering algorithm introduced by Ester, Kriegel, Sander, and Xu (1996) that groups data points located in dense neighbourhoods and labels sparse points as noise. Unlike k-means, DBSCAN does not require the user to specify the number of clusters in advance and…
Dynamic Bayesian Network(also: DBN, Temporal Bayesian Network): A probabilistic graphical model that represents sequences of variables over time, extending standard Bayesian networks to handle temporal relationships. In accessibility and affective computing contexts, Dynamic Bayesian Networks are used to model how facial expressions, head…
Friedman Test(also: Friedman Rank Test): The Friedman test is a non-parametric statistical test used to detect differences across three or more related samples, for example the same participants rating three interface conditions. It ranks each participant's responses across conditions and tests whether the rank sums…
Gaussian Mixture Model(also: GMM): A Gaussian Mixture Model (GMM) is a probabilistic model that represents data as a weighted combination of multiple Gaussian (normal) distributions. Each component Gaussian has its own mean and covariance, allowing GMMs to model complex, multimodal distributions. In speech…
Growth mixture model(also: GMM, Latent class growth model): A statistical method that identifies unobserved subpopulations (latent classes) within a dataset based on distinct patterns of change over time. In accessibility research, growth mixture models can reveal that a seemingly homogeneous user group actually contains distinct…
Inter-Annotator Agreement(also: IAA, Inter-rater agreement, Inter-coder agreement): A statistical measure of how consistently two or more human annotators assign the same label to the same data item, widely used in NLP, computer vision, and AI dataset construction as a proxy for label quality. Common measures include Cohen's kappa, Fleiss' kappa, and…
Krippendorff's Alpha(also: Krippendorff Alpha, Kalpha): A statistical measure of inter-rater agreement used to assess how consistently two or more coders classify the same qualitative data. Developed by Klaus Krippendorff, the metric handles any number of coders, any level of measurement (nominal, ordinal, interval, ratio), and…
Long Tail(also: Long-tail Distribution, Long-tail Participation): A statistical distribution in which a small number of items or participants account for the majority of the total, while a very long list of lower-frequency items collectively make up the remainder. The term was popularised by Chris Anderson in reference to online retail, but it…
Minimum Clinically Important Difference(also: MCID, Minimal Clinically Important Difference): The smallest change in a measurement that is perceived as beneficial or meaningful from a clinical perspective. MCID thresholds help researchers and clinicians distinguish statistically significant changes from clinically meaningful improvements. In digital health and assistive…
Multivariate Data(also: Multivariate Dataset, High-Dimensional Data): Data in which each observation or sample has more than two measured variables (dimensions). Analysing multivariate data is a core task in statistics, science, and business intelligence, but presenting it accessibly is difficult: traditional charts effectively show two or three…
Pareto Principle(also: 80/20 Rule, Law of the Vital Few): The empirical observation, named after economist Vilfredo Pareto, that in many systems roughly 80% of effects come from 20% of causes. In crowdsourcing and volunteer communities, the principle predicts that a small number of top contributors produce the majority of output, while…
Pointwise Mutual Information(also: PMI): A statistical measure used in natural language processing to quantify the strength of association between two words based on how much more frequently they co-occur in a corpus than would be expected by chance. PMI is calculated as the logarithm of the ratio of the observed…
Probabilistic Sampling(also: Random Sampling, Statistical Sampling): A sampling method in which every member of a population has a known, non-zero probability of being selected for the sample. In accessibility evaluation, probabilistic sampling of web pages allows auditors to make statistically valid generalisations about the overall…
Psychometric validation(also: Psychometric evaluation, Instrument validation): The process of establishing that a measurement instrument (such as a questionnaire or scale) possesses adequate reliability (consistency of measurement), criterion validity (correlation with established measures), and construct validity (measuring the intended theoretical…
Randomization Test(also: Randomisation Test, Permutation Test): A randomization test (also called a permutation test) is a non-parametric statistical test that computes a p-value by re-shuffling the observed data many times under the null hypothesis and asking how often the re-shuffled data produce a test statistic as extreme as the one…
Representative Sampling(also: Representative Page Sampling): In web accessibility auditing, the practice of selecting a subset of pages from a website that statistically reflects the full site, so that evaluation findings can be generalised to pages not directly audited. WCAG-EM requires that a 'representative sample' be included…
Shannon Entropy(also: Information Entropy, Source Entropy): A measure of the average uncertainty or unpredictability associated with a set of possible outcomes, defined by Claude Shannon as H = -Σ p(x) log₂ p(x), where p(x) is the probability of each outcome. In the context of interface evaluation, entropy quantifies how much uncertainty…
Signal Detection Theory(also: SDT): A statistical framework used to measure the accuracy of a system or person in distinguishing between the presence and absence of a target signal amid noise. In accessibility and assistive technology research, Signal Detection Theory is used to evaluate how well detection systems…
Spearman correlation(also: Spearman rank correlation, Spearman's rho): A non-parametric statistical measure of the strength and direction of the monotonic relationship between two ranked variables, ranging from -1 to +1. In accessibility evaluation research, Spearman correlation is used to assess how well automated metrics (such as Word Error Rate…
Statistical Graph(also: Statistical Chart, Data Graph, Quantitative Graph): A visual representation of numerical or statistical data using geometric elements such as lines, bars, points, or areas to convey patterns, trends, relationships, and comparisons. Common types include line graphs (showing trends over time), bar charts (comparing categories), pie…
Stratified Sampling(also: Stratified Random Sampling): Stratified sampling is a statistical technique that divides a population into non-overlapping subgroups (strata) that share some characteristic, then draws a random sample from each stratum. In accessibility evaluation, stratified sampling is used to pick test pages by first…
TrueSkill(also: TrueSkill Bayesian Rating): A Bayesian skill-rating algorithm developed at Microsoft Research (Herbrich et al., 2007) that models each player or option as a normal distribution with a mean skill μ and uncertainty σ, updating both after each pairwise match. Originally designed for matchmaking in competitive…
Wilcoxon Signed-Rank Test(also: Wilcoxon Test): The Wilcoxon signed-rank test is a non-parametric alternative to the paired t-test, used to compare two related samples when the data are ordinal or not normally distributed. It ranks the absolute differences between paired observations and tests whether the sum of positive and…
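
Two of the entries above state their arithmetic outright: the Bonferroni correction divides the target significance threshold by the number of comparisons, and Shannon entropy is H = -Σ p(x) log₂ p(x). The following is a minimal Python sketch of both calculations. The function names `bonferroni_threshold` and `shannon_entropy` are illustrative choices, not part of the glossary, and the four-outcome distribution is an assumed example.

```python
import math


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-test significance threshold: alpha divided by the number of comparisons."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def shannon_entropy(probabilities: list[float]) -> float:
    """H = -sum(p * log2(p)) in bits; zero-probability outcomes contribute nothing."""
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Three pairwise tests at a family-wise threshold of 0.05.
per_test_alpha = bonferroni_threshold(0.05, 3)
print(round(per_test_alpha, 4))  # 0.0167

# Four equally likely outcomes (an assumed example) carry 2 bits of uncertainty.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
```

With three pairwise tests, each individual p-value would need to fall below roughly 0.0167 to count as significant at a family-wise level of 0.05.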
