Categorical - Binary recodings
Recoding categorical variables into binary ones or vice versa. A collection of macros for converting categorical data into binary data or back; for example, converting a categorical multiple response set (MRC) into a dichotomous multiple response set (MRD), or the reverse. This need arises frequently when processing survey data.
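The MRC/MRD conversion can be illustrated with a minimal Python sketch. It is not the macros' code: the function names, the use of None for an empty slot, and the list-of-lists layout are assumptions made for this example only.

```python
def mrc_to_mrd(mrc_rows, categories):
    """Turn categorical multiple-response rows into 0/1 indicator rows.

    Each MRC row lists the chosen category codes (None = empty slot);
    each MRD row has one 0/1 flag per category, in the order of `categories`.
    """
    mrd_rows = []
    for row in mrc_rows:
        chosen = {code for code in row if code is not None}
        mrd_rows.append([1 if c in chosen else 0 for c in categories])
    return mrd_rows


def mrd_to_mrc(mrd_rows, categories, width):
    """Turn 0/1 indicator rows back into rows of category codes, padded to `width`."""
    mrc_rows = []
    for row in mrd_rows:
        codes = [c for c, flag in zip(categories, row) if flag == 1]
        if len(codes) > width:
            raise ValueError("more responses than MRC variables")
        mrc_rows.append(codes + [None] * (width - len(codes)))
    return mrc_rows


# Three respondents, up to three answers each, response list 1..3.
mrc = [[3, 1, None], [2, None, None], [None, None, None]]
mrd = mrc_to_mrd(mrc, [1, 2, 3])
back = mrd_to_mrc(mrd, [1, 2, 3], width=3)
print(mrd)
print(back)
```

Note that the round trip returns the codes in response-list order, so the original order of entry within an MRC row is not preserved.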
Multiple Response tools
Tools for multiple response sets. One macro is designed to fix a categorical multiple response set (MRC). Another macro supplies dichotomous multiple response sets (MRD) with “no answer” variables. A further macro adds data to, or removes data from, a categorical multiple response set by consulting other variables that share the same response list. Two other macros create a multiple response set out of a string variable (it can be handy to enter the responses to a multiple choice question first into a single string variable).
Series Response tools
Tools for series of items. A collection of macros for a “simple matrix question”, i.e. a series of variables with a common pool of alternative responses (single response series, SRS); for example, a set of items each scored on a rating scale or ranked. One of the macros handles data that respondents ranked: it converts the variables into a categorical multiple response set or back. Another macro is intended for the more general task of translating values and variables into each other, as well as for calculations on repeated values. The third macro is for the situation where respondents rated not all items but only those they had chosen earlier, and the rating data were entered in a packed (quickened) mode.
Some horizontal operations. A collection of macros performing useful operations (such as sorting, ranking or counting unique values) within cases, horizontally. The input file stays intact because no transposing is applied.
Derandomizing of tasks. If the same tasks (stimuli such as questionnaire questions, specimens being tested, or medical treatments) were offered to different respondents in a different sequence, and the data were entered in that order of exposure (the “order of trials”), the macro restructures the data into a unified “order of tasks” in which each variable contains the data of only one task.
Weighting groups. Achieving the desired proportions of respondent groups by univariate or multivariate (rim) weighting. You can set the total N, impose restrictions on the weights of individual cells or cases, weight several subsamples in parallel, and take initial weights into account.
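As an illustration of what “within cases, horizontally” means, here is a small Python sketch that counts unique values and ranks values inside a single case. The function names and the treatment of None as a missing value are assumptions of the example, and giving ties their average rank is one common convention, not necessarily the macros' default.

```python
def count_unique(row):
    """Number of distinct non-missing values in one case."""
    return len({v for v in row if v is not None})


def rank_within_case(row):
    """Rank the values of one case (1 = smallest); ties get the average rank."""
    valid = sorted(v for v in row if v is not None)
    ranks = {}
    for v in set(valid):
        first = valid.index(v) + 1   # position of the first occurrence
        count = valid.count(v)       # number of tied values
        ranks[v] = first + (count - 1) / 2
    return [None if v is None else ranks[v] for v in row]


case = [5, 2, 5, None, 9]
n_unique = count_unique(case)
case_ranks = rank_within_case(case)
print(n_unique, case_ranks)
```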
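The restructuring from “order of trials” to “order of tasks” can be sketched in Python as follows. The data layout (one list with the task number shown at each trial, one list with the response at each trial) is an assumption of the example, not the macro's actual input format.

```python
def derandomize(order_row, data_row, n_tasks):
    """Rearrange one respondent's data from trial order into task order.

    order_row[k] is the task number (1..n_tasks) presented at trial k,
    data_row[k] is the response recorded at trial k.
    """
    by_task = [None] * n_tasks
    for task, value in zip(order_row, data_row):
        by_task[task - 1] = value
    return by_task


# The respondent saw task 3 first, then task 1, then task 2.
task_ordered = derandomize([3, 1, 2], [7, 4, 9], n_tasks=3)
print(task_ordered)
```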
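Rim weighting is commonly carried out by iterative proportional fitting (raking): the weights are rescaled to match one variable's target shares at a time, and the cycle is repeated until all margins agree. The Python sketch below shows only this core idea; the function names and data layout are assumptions, and the options listed above (total N, weight restrictions, parallel subsamples, initial weights) are not implemented.

```python
def rake(cases, targets, max_cycles=200, tol=1e-12):
    """Return case weights whose weighted margins match the target shares.

    cases   : list of dicts, one per respondent
    targets : {variable: {category: target share}}, shares summing to 1
    """
    weights = [1.0] * len(cases)
    for _ in range(max_cycles):
        largest_change = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            current = {}
            for case, w in zip(cases, weights):
                current[case[var]] = current.get(case[var], 0.0) + w
            for i, case in enumerate(cases):
                # Scale each category so that it reaches its target share.
                factor = shares[case[var]] * total / current[case[var]]
                weights[i] *= factor
                largest_change = max(largest_change, abs(factor - 1.0))
        if largest_change < tol:
            break
    return weights


def weighted_share(cases, weights, var, category):
    """Weighted proportion of cases falling into one category."""
    hit = sum(w for c, w in zip(cases, weights) if c[var] == category)
    return hit / sum(weights)


sample = [
    {"sex": "m", "age": "young"}, {"sex": "m", "age": "young"},
    {"sex": "m", "age": "old"}, {"sex": "f", "age": "young"},
    {"sex": "f", "age": "old"}, {"sex": "f", "age": "old"},
]
goal = {"sex": {"m": 0.4, "f": 0.6}, "age": {"young": 0.5, "old": 0.5}}
w = rake(sample, goal)
print(round(weighted_share(sample, w, "sex", "m"), 6))
```

Every category combination in the example is populated; with empty cells the procedure may converge slowly or not reach the targets at all.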
Categorical into Contrast
Categorical variables into contrast variables. Creates contrast variables from categorical variables (three types to choose from), as well as their interaction variables. Contrast variables are needed above all when one has to analyse the influence of qualitative factors with methods designed for quantitative input (e.g. linear regression).
Various proximity measures. Calculation of some pairwise measures of proximity or association (similarities, distances, correlations) that are not available in SPSS. Among them are Gower similarity, for comparing respondents on quantitative and qualitative characteristics at once; Canberra distance, which is well suited to comparing respondents by their responses to a ranking question; and the tetrachoric and biserial correlation coefficients.
Differences inside or between matrices. The macros compute a matrix of distances between matrices of proximity coefficients, such as correlation or distance matrices (rather than between variables or cases), or between columns inside such matrices. These comparisons can help a researcher, for example before a cluster or a factor analysis.
Fitting variables to a matrix of coefficients. The macros modify the variables' values so that the strength of the relations among the variables matches a user-specified matrix (correlation, covariance, or cross-product). An option guarding against heteroscedasticity makes it possible to obtain homoscedastic relationships.
Cumulative curves. Macros related to the analysis of cumulative distributions. One of them compares subsamples, via cluster analysis, by the shape of the cumulative distribution of variables. Another macro, intended for marketing, analyses data of the so-called price sensitivity meter (PSM).
Clustering criteria. Computation of indices, such as Calinski–Harabasz, Davies–Bouldin, Ratkowsky–Lance, C-Index, correlation, Gamma statistic, Dunn, Silhouette statistic (several types), AIC, BIC, which help in choosing the better partition, specifically in deciding how many clusters to extract in a cluster analysis.
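The text does not name the three contrast types, so the Python sketch below uses two widely known codings, indicator (dummy) and deviation (effect) coding, purely as an illustration. The function names and the choice of reference category are assumptions of the example.

```python
def indicator_coding(values, reference):
    """Dummy coding: one 0/1 variable per non-reference category."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == lev else 0 for lev in levels] for v in values]
    return levels, rows


def deviation_coding(values, reference):
    """Effect coding: like dummy coding, but the reference category is -1 everywhere."""
    levels = sorted(set(values) - {reference})
    rows = []
    for v in values:
        if v == reference:
            rows.append([-1] * len(levels))
        else:
            rows.append([1 if v == lev else 0 for lev in levels])
    return levels, rows


region = ["a", "b", "c", "a"]
levels, dummy = indicator_coding(region, reference="c")
_, effect = deviation_coding(region, reference="c")
print(levels, dummy, effect)
```

A categorical variable with k categories yields k - 1 contrast variables, which can then enter a linear regression as ordinary quantitative predictors.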
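Two of the measures named above can be sketched in a few lines of Python using their textbook definitions. The function names and argument layout are assumptions of the example; the Gower version covers only scale and nominal variables, with no weights and no missing values.

```python
def canberra(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def gower(x, y, kinds, ranges):
    """Gower similarity for a mix of 'scale' and 'nominal' variables.

    ranges[i] is the observed range of scale variable i (unused for nominal).
    """
    scores = []
    for a, b, kind, rng in zip(x, y, kinds, ranges):
        if kind == "nominal":
            scores.append(1.0 if a == b else 0.0)
        else:
            scores.append(1.0 - abs(a - b) / rng)
    return sum(scores) / len(scores)


# Two respondents' rankings of three items.
d = canberra([1, 2, 3], [3, 2, 1])
# Two respondents described by age, sex and a 1-5 rating.
s = gower((30, "f", 2), (40, "f", 5),
          kinds=("scale", "nominal", "scale"), ranges=(50, None, 4))
print(d, round(s, 4))
```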
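As an example of such an index, the Calinski–Harabasz criterion is the ratio of between-cluster to within-cluster sum of squares, each divided by its degrees of freedom; higher values indicate a better partition. The Python sketch below follows that standard definition; the function name and data layout are assumptions of the example.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    n = len(points)
    dims = len(points[0])
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    k = len(groups)
    grand = [sum(p[d] for p in points) / n for d in range(dims)]
    between = 0.0
    within = 0.0
    for members in groups.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        between += len(members) * sum((centroid[d] - grand[d]) ** 2 for d in range(dims))
        within += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dims))
    return (between / (k - 1)) / (within / (n - k))


pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = calinski_harabasz(pts, [0, 0, 1, 1])   # left pair vs right pair
poor = calinski_harabasz(pts, [0, 1, 0, 1])   # bottom pair vs top pair
print(good, poor)
```

The partition that separates the two distant pairs scores far higher than the one that mixes them, which is the comparison a researcher would make across candidate cluster solutions.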
Euclidean space tools
Euclidean corrections and conversions. Macros for matrices of proximities that must be laid out in Euclidean or metric space. You can convert similarities (of a covariance/correlation type or interpretation) into distances, or vice versa, in a geometrically correct way, and correct similarities or dissimilarities that do not fully satisfy the space into ones that do.
Instruments facilitating work. Macros that are not connected with a specific analysis or processing task but rather serve to speed up various kinds of work through syntax. One of them is an alternative to “SPSS Production Facility”, accelerating the production of tables etc.
Regular clouds. Creating multivariate data with a regular, nonrandom structure. In particular, such data can be regarded as entirely unclustered, unlike data generated randomly. Useful as model data when exploring the behaviour of a statistical algorithm, for example cluster analysis.
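One geometrically consistent conversion treats a similarity of the covariance or correlation type as a scalar product, so the law of cosines gives the squared distance d² = s_ii + s_jj − 2·s_ij; for correlations this reduces to d = sqrt(2(1 − r)). The Python sketch below applies that formula. That the macros use exactly this formula is an assumption, and the function name is hypothetical.

```python
import math


def similarities_to_distances(s):
    """Convert a scalar-product type similarity matrix into Euclidean distances."""
    n = len(s)
    return [
        # max(..., 0.0) guards against tiny negative values from rounding.
        [math.sqrt(max(s[i][i] + s[j][j] - 2 * s[i][j], 0.0)) for j in range(n)]
        for i in range(n)
    ]


corr = [[1.0, 0.5], [0.5, 1.0]]
cov = [[4.0, 1.0], [1.0, 2.0]]
d_corr = similarities_to_distances(corr)
d_cov = similarities_to_distances(cov)
print(d_corr[0][1], d_cov[0][1])
```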
Generate random clusters
Random clustered data. Creation of random data broken up into clusters. The clusters can be made demarcated (clear) or intersecting (fuzzy), round or elongated, and their sizes and spatial closeness can be regulated. A separate macro randomly rotates data in space.
Neighbourhood chains. From data showing pairwise relationships within a set of objects, the macro extracts which object each given object refers to “in the first place” or “most strongly”. In this way a trajectory of sequential references is built. It is shown in the form of a table (adjacency list) and a dendrogram.
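The idea of following each object's strongest reference can be sketched in Python as below. The input is assumed to be a square matrix of similarities (larger = stronger relationship), and the function names are hypothetical; the macro's actual rules for ties and output are not reproduced.

```python
def strongest_references(sim):
    """For each object, the index of the other object it relates to most strongly."""
    n = len(sim)
    return [
        max((j for j in range(n) if j != i), key=lambda j: sim[i][j])
        for i in range(n)
    ]


def chain_from(start, refs):
    """Follow the references from `start` until an object repeats."""
    chain = [start]
    while refs[chain[-1]] not in chain:
        chain.append(refs[chain[-1]])
    return chain


sim = [
    [0, 9, 1, 2],
    [9, 0, 6, 3],
    [1, 6, 0, 4],
    [2, 3, 4, 0],
]
refs = strongest_references(sim)   # adjacency list: object i -> refs[i]
chain = chain_from(3, refs)
print(refs, chain)
```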
Make Paired samples
Pairing cases of two samples. Cases of two samples or sets are paired optimally, so that the sum of the within-pair differences is minimized. The Hungarian algorithm is used to match the elements of the two arrays into pairs.
Procrustes analysis. Procrustes analysis of two configurations finds the way to superpose two clouds of points in space as closely as possible, given that each point in one cloud corresponds by design to a point in the other. The residual amount of mismatch indicates how different the configurations were initially. The analysis is used for comparing shapes and for juxtaposing ordinations (for example factor loading matrices, to detect identical factors).
Adding latents as lines to data cloud. The macros show the principal components or discriminants of the data on a scatterplot of the data, in the form of lines tiled with points that represent the scores on these latents.
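The optimum that the pairing aims at can be shown on a tiny example. The Python sketch below is deliberately not the Hungarian algorithm: it finds the same minimum-cost pairing by exhaustive search over all permutations, which is only feasible for very small samples, whereas the Hungarian algorithm reaches that optimum in polynomial time. The function name and the use of absolute differences on a single variable are assumptions of the example.

```python
from itertools import permutations


def best_pairing_bruteforce(a, b):
    """Pair each element of `a` with a distinct element of `b`,
    minimizing the sum of absolute within-pair differences.

    Exhaustive search, equal-sized lists only; for illustration.
    """
    best_cost = float("inf")
    best_perm = None
    for perm in permutations(range(len(b))):
        cost = sum(abs(a[i] - b[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm)), best_cost


sample_a = [1.0, 5.0, 9.0]
sample_b = [8.5, 1.5, 4.0]
pairs, total_diff = best_pairing_bruteforce(sample_a, sample_b)
print(pairs, total_diff)
```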
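For two-dimensional configurations the best rotation has a closed form, which makes a compact illustration possible. The Python sketch below centres both clouds and rotates one onto the other; it covers translation and rotation only (no scaling or reflection), and the function names are assumptions of the example rather than the macro's interface.

```python
import math


def centre(points):
    """Shift a 2-D cloud so that its centroid is at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(p[0] - cx, p[1] - cy) for p in points]


def procrustes_rotation_2d(target, moving):
    """Rotate `moving` onto `target` (point i matches point i).

    Returns the rotation angle in radians and the residual sum of squares.
    """
    x = centre(target)
    y = centre(moving)
    a = sum(xt[0] * ym[0] + xt[1] * ym[1] for xt, ym in zip(x, y))
    b = sum(xt[1] * ym[0] - xt[0] * ym[1] for xt, ym in zip(x, y))
    theta = math.atan2(b, a)   # angle maximizing the summed scalar products
    c, s = math.cos(theta), math.sin(theta)
    fitted = [(c * ym[0] - s * ym[1], s * ym[0] + c * ym[1]) for ym in y]
    residual = sum((f[0] - xt[0]) ** 2 + (f[1] - xt[1]) ** 2
                   for f, xt in zip(fitted, x))
    return theta, residual


# The target is the moving cloud rotated by 30 degrees and shifted.
moving = [(0, 0), (2, 0), (2, 1), (0, 3)]
phi = math.radians(30)
target = [(math.cos(phi) * px - math.sin(phi) * py + 5,
           math.sin(phi) * px + math.cos(phi) * py - 2) for px, py in moving]
theta, residual = procrustes_rotation_2d(target, moving)
print(round(math.degrees(theta), 6), round(residual, 12))
```

A residual near zero means the two configurations are identical up to position and orientation; a large residual signals genuinely different shapes.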
Impute missing data
Imputation of missing data. The macros perform hot-deck imputation of missing values, borrowing valid values from cases that resemble the cases with missing data on some background characteristics. A separate macro performs an arbitrary, user-defined borrowing of values from some cases by other cases.
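A minimal Python sketch of hot-deck imputation: a case with a missing value borrows the value of a randomly chosen donor that matches it exactly on the background characteristics. Exact matching, the use of None for missing values, and the function name are assumptions of the example; the macros' own similarity rules are not reproduced.

```python
import random


def hot_deck(cases, target, background, rng):
    """Fill missing `target` values from donors with identical background values.

    Returns new case dicts; the input cases are left unchanged.
    Cases without any matching donor stay missing.
    """
    donors = {}
    for case in cases:
        if case[target] is not None:
            key = tuple(case[b] for b in background)
            donors.setdefault(key, []).append(case[target])
    result = []
    for case in cases:
        filled = dict(case)
        if filled[target] is None:
            pool = donors.get(tuple(case[b] for b in background))
            if pool:
                filled[target] = rng.choice(pool)
        result.append(filled)
    return result


data = [
    {"sex": "f", "age": "young", "income": 30},
    {"sex": "f", "age": "young", "income": None},
    {"sex": "m", "age": "old", "income": 50},
    {"sex": "m", "age": "old", "income": 70},
    {"sex": "m", "age": "old", "income": None},
    {"sex": "m", "age": "young", "income": None},
]
imputed = hot_deck(data, "income", ["sex", "age"], random.Random(0))
print([c["income"] for c in imputed])
```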
Updated MATRIX - END MATRIX functions
Functions for MATRIX – END MATRIX. A collection of useful statistical, mathematical, restructuring and other functions for a matrix session in SPSS.