sparse matrix of transactions
Returns the result from Node-FPGrowth or a summary of support and strong associations
Formats an array of transactions into a sparse-matrix-like format for Apriori/Eclat
CSV data of transactions
{values - unique list of all values, valuesMap - map of values and labels, transactions - formatted sparse array}
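As a rough illustration of the returned shape, the sketch below maps each distinct item to an integer index and rewrites every transaction as a list of those indices. The function name `toSparseTransactions`, the use of a `Map`, and the index-based encoding are assumptions made for this example, not the library's confirmed implementation.

```javascript
// Hypothetical helper: the name and the exact output encoding are assumptions.
function toSparseTransactions(rows) {
  const valuesMap = new Map(); // item label -> integer index
  const transactions = rows.map((row) =>
    row.map((item) => {
      // Assign the next free index the first time an item is seen.
      if (!valuesMap.has(item)) valuesMap.set(item, valuesMap.size);
      return valuesMap.get(item);
    })
  );
  // values lists each unique item once, in first-seen order.
  return { values: [...valuesMap.keys()], valuesMap, transactions };
}
```

With the input `[['milk', 'bread'], ['bread', 'eggs']]`, this sketch yields the transactions `[[0, 1], [1, 2]]`.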
Used to test variance and bias of a prediction
Array of accuracy calculations
Provides train/test indices to split data into train/test sets. Splits the dataset into k consecutive folds. Each fold is then used once as the validation set while the k - 1 remaining folds form the training set.
array of data to split
returns dataset split into k consecutive folds
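The fold logic described above can be sketched as follows. The names `kFold` and `trainValidationSplits`, the default of three folds, and the rule of giving any leftover items to the earliest folds are assumptions for illustration only.

```javascript
// Illustrative sketch: split data into k consecutive folds.
function kFold(data, k = 3) {
  const folds = [];
  const base = Math.floor(data.length / k);
  const extra = data.length % k; // the first `extra` folds get one more item
  let start = 0;
  for (let i = 0; i < k; i++) {
    const size = base + (i < extra ? 1 : 0);
    folds.push(data.slice(start, start + size));
    start += size;
  }
  return folds;
}

// Use each fold once for validation; the other k - 1 folds form the training set.
function trainValidationSplits(folds) {
  return folds.map((validation, i) => ({
    train: folds.filter((_, j) => j !== i).flat(),
    validation,
  }));
}
```

For seven items and `k = 3`, this sketch produces folds of sizes 3, 2 and 2.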
Used to test variance and bias of a prediction with parameter tuning
Array of accuracy calculations
Provides train/test indices to split data into train/test sets. Splits the dataset into k consecutive folds. Each fold is then used once as the validation set while the k - 1 remaining folds form the training set.
array of data to split
returns dataset split into k consecutive folds
Split arrays into random train and test subsets
array of data to split
returns training and test arrays either as an object or arrays
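A minimal sketch of a random split, assuming a Fisher-Yates shuffle followed by a cut at the requested test fraction. The function name `trainTestSplit`, the option names `testSize` and `random`, and the object return shape are hypothetical choices for this example.

```javascript
// Illustrative sketch: shuffle a copy of the data, then cut it into test and train parts.
function trainTestSplit(data, { testSize = 0.2, random = Math.random } = {}) {
  const shuffled = data.slice(); // leave the caller's array untouched
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const testCount = Math.round(shuffled.length * testSize);
  return {
    test: shuffled.slice(0, testCount),
    train: shuffled.slice(testCount),
  };
}
```

Passing a custom `random` function makes the split reproducible, which helps when comparing models on the same partition.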
Mean Absolute Deviation (MAD) indicates the absolute size of the errors
numerical samples
estimated values
MAD
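As a sketch of the calculation, MAD can be computed as the mean of the absolute differences between each actual and its estimate. The function name and argument order below are assumptions.

```javascript
// Illustrative sketch: mean of |actual - estimate| over all samples.
function meanAbsoluteDeviation(actuals, estimates) {
  const totalAbsError = actuals.reduce(
    (sum, actual, i) => sum + Math.abs(actual - estimates[i]),
    0
  );
  return totalAbsError / actuals.length;
}

// Absolute errors are 2, 2 and 3, so the result is 7 / 3 (about 2.33).
console.log(meanAbsoluteDeviation([10, 20, 30], [12, 18, 33]));
```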
MAD over Mean Ratio (MMR) - The MAD/Mean ratio is an alternative to the MAPE that is better suited to intermittent and low-volume data. Percentage errors cannot be calculated when the actual value equals zero, and they can take on extreme values with low-volume data. These issues are magnified when MAPEs are averaged over multiple time series. The MAD/Mean ratio addresses this by dividing the MAD by the mean, which rescales the error so that it is comparable across time series of varying scales
numerical samples
estimated values
MMR
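The ratio can be sketched as the MAD divided by the mean of the actual values. The function name is hypothetical, and using the mean of the actuals (rather than of the estimates) as the denominator is an assumption.

```javascript
// Illustrative sketch: MAD divided by the mean of the actuals.
function madMeanRatio(actuals, estimates) {
  const n = actuals.length;
  const mad =
    actuals.reduce((sum, a, i) => sum + Math.abs(a - estimates[i]), 0) / n;
  const mean = actuals.reduce((sum, a) => sum + a, 0) / n;
  return mad / mean;
}
```

Unlike a per-sample percentage error, this stays defined when an individual actual is zero, as long as the mean of the actuals is non-zero.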
MAPE (Mean Absolute Percent Error) measures the size of the error in percentage terms
numerical samples
estimated values
MAPE
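A minimal sketch of MAPE, assuming the result is returned as a fraction (multiply by 100 for a percentage) and that no actual value is zero. The function name is hypothetical.

```javascript
// Illustrative sketch: mean of |actual - estimate| / |actual|.
// Assumes every actual is non-zero; a zero actual would divide by zero.
function meanAbsolutePercentError(actuals, estimates) {
  const total = actuals.reduce(
    (sum, a, i) => sum + Math.abs((a - estimates[i]) / a),
    0
  );
  return total / actuals.length;
}
```

For actuals `[10, 20, 40]` and estimates `[12, 18, 44]`, the per-sample errors are 20%, 10% and 10%, so the sketch returns roughly 0.133.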
Mean Forecast Error (MFE) measures the bias of a forecast
numerical samples
estimated values
MFE (bias)
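The bias can be sketched as the mean of the signed errors. The sign convention used here (actual minus estimate, so a negative result means the forecast runs high) and the function name are assumptions.

```javascript
// Illustrative sketch: mean of the signed errors (actual - estimate).
function meanForecastError(actuals, estimates) {
  const total = actuals.reduce((sum, a, i) => sum + (a - estimates[i]), 0);
  return total / actuals.length;
}

// Signed errors are -2, 2 and -3, so the result is -1.
console.log(meanForecastError([10, 20, 30], [12, 18, 33]));
```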
MAD over Mean Ratio (MMR) - The MAD/Mean ratio is an alternative to the MAPE that is better suited to intermittent and low-volume data. Percentage errors cannot be calculated when the actual value equals zero, and they can take on extreme values with low-volume data. These issues are magnified when MAPEs are averaged over multiple time series. The MAD/Mean ratio addresses this by dividing the MAD by the mean, which rescales the error so that it is comparable across time series of varying scales
numerical samples
estimated values
MMR
Mean Squared Error (MSE) is the average of the squared differences between the estimated and actual values
numerical samples
estimated values
MSE
Transforms features by scaling each feature to a given range. This estimator scales and translates each feature individually such that it is in the given range on the training set, i.e. between zero and one.
array of integers or floats
This function returns two functions that can min-max scale new inputs and reverse scale new outputs
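A minimal sketch of a scaler that returns a pair of functions, assuming a target range of zero to one. The names `minMaxScaler`, `scale` and `descale` are hypothetical.

```javascript
// Illustrative sketch: learn min and max from the training values,
// then return functions that scale into [0, 1] and back again.
function minMaxScaler(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min || 1; // avoid dividing by zero for constant input
  return {
    scale: (x) => (x - min) / span,
    descale: (y) => y * span + min,
  };
}
```

Values outside the training range scale to numbers below 0 or above 1, since the statistics come only from the training set.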
Standardize features by removing the mean and scaling to unit variance
Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. Mean and standard deviation are then stored to be used on later data using the transform method.
Standardization of a dataset is a common requirement for many machine learning estimators: they might behave badly if the individual features do not more or less look like standard normally distributed data (e.g. Gaussian with 0 mean and unit variance)
array of integers or floats
This function returns two functions that can standard scale new inputs and reverse scale new outputs
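The stored-statistics idea can be sketched as below. The names `standardScaler`, `scale` and `descale` are hypothetical, and dividing by n (population standard deviation) is an assumption; an implementation may divide by n - 1 instead.

```javascript
// Illustrative sketch: compute mean and standard deviation once,
// then reuse them to scale later inputs and to reverse the scaling.
function standardScaler(values) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  // Population variance (divide by n); this choice is an assumption.
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  const std = Math.sqrt(variance) || 1; // avoid dividing by zero
  return {
    scale: (x) => (x - mean) / std,
    descale: (z) => z * std + mean,
  };
}
```

For the training values `[2, 4, 4, 4, 5, 5, 7, 9]`, the mean is 5 and the population standard deviation is 2, so `scale(9)` gives 2.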
Tracking Signal - Used to pinpoint forecasting models that need adjustment
numerical samples
estimated values
trackingSignal
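A sketch using the common definition of the tracking signal: the running sum of forecast errors divided by the MAD. The function name and the error sign convention (actual minus estimate) are assumptions.

```javascript
// Illustrative sketch: sum of signed errors divided by the mean absolute deviation.
function trackingSignal(actuals, estimates) {
  const errors = actuals.map((a, i) => a - estimates[i]);
  const sumOfErrors = errors.reduce((sum, e) => sum + e, 0);
  const mad =
    errors.reduce((sum, e) => sum + Math.abs(e), 0) / errors.length;
  return sumOfErrors / mad;
}
```

A value far from zero in either direction suggests the forecast is consistently too high or too low.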
The adjusted coefficient of determination measures how well a multiple regression equation fits the sample data. It is closely related to the coefficient of determination (also known as R^2) that is used to test the results of a simple regression equation.
the number of independent variables in the regression equation
adjusted r^2 for multiple linear regression
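The standard adjustment penalizes r^2 for each additional independent variable. The sketch below assumes the usual formula, 1 - (1 - r^2)(n - 1) / (n - k - 1), and hypothetical parameter names.

```javascript
// Illustrative sketch of the usual adjusted r^2 formula.
// rSquared: unadjusted coefficient of determination
// sampleSize: number of observations (n)
// independentVariables: number of predictors in the regression (k)
function adjustedRSquared(rSquared, sampleSize, independentVariables) {
  const n = sampleSize;
  const k = independentVariables;
  return 1 - ((1 - rSquared) * (n - 1)) / (n - k - 1);
}
```

With r^2 = 0.9, 11 observations and 2 independent variables, the adjusted value is about 0.875.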
The adjusted coefficient of determination measures how well a multiple regression equation fits the sample data. It is closely related to the coefficient of determination (also known as R^2) that is used to test the results of a simple regression equation.
the number of independent variables in the regression equation
adjusted r^2 for multiple linear regression
Converts z-score into the probability
Number of standard deviations from the mean.
p - p-value
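One way to sketch this conversion is through the standard normal cumulative distribution function. The version below interprets "the probability" as the lower-tail cumulative probability, which is an assumption, and it approximates the error function with the Abramowitz and Stegun formula 7.1.26. All function names are hypothetical.

```javascript
// Approximate error function (Abramowitz and Stegun 7.1.26).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Lower-tail probability P(Z <= z) for a standard normal variable.
function zScoreToProbability(z) {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Two-sided p-value for the same z-score.
function twoSidedPValue(z) {
  return 2 * (1 - zScoreToProbability(Math.abs(z)));
}
```

Under this interpretation, a z-score of 1.96 maps to a cumulative probability of about 0.975 and a two-sided p-value of about 0.05.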
The coefficient of correlation, R, indicates how well the given data fits a line or a curve.
numerical samples
estimated values
R
In statistics, the coefficient of determination, denoted R^2 or r^2 and pronounced "R squared", is the proportion of the variance in the dependent variable that is predictable from the independent variable(s). Compares the distance of the estimated values to the mean, \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
numerical samples
estimated values
r^2
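Following the description above, the sketch below computes r^2 as the ratio of explained variation (estimates against the mean of the actuals) to total variation (actuals against their mean). The function name is hypothetical, and this form is an assumption: it matches the more common 1 - SS_res / SS_tot only for a least-squares fit with an intercept.

```javascript
// Illustrative sketch: explained sum of squares over total sum of squares.
function coefficientOfDetermination(actuals, estimates) {
  const mean = actuals.reduce((sum, a) => sum + a, 0) / actuals.length;
  const explained = estimates.reduce((sum, e) => sum + (e - mean) ** 2, 0);
  const total = actuals.reduce((sum, a) => sum + (a - mean) ** 2, 0);
  return explained / total;
}
```

For actuals `[1, 2, 3, 4]` and estimates `[1.1, 1.9, 3.2, 3.8]`, the explained variation is 4.5 and the total variation is 5, giving roughly 0.9.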
The errors (residuals) from actuals and estimates
numerical samples
estimated values
errors (residuals)
returns a safe column name / url slug from a string
Mean Absolute Deviation (MAD) indicates the absolute size of the errors
numerical samples
estimated values
MAD
MAPE (Mean Absolute Percent Error) measures the size of the error in percentage terms
numerical samples
estimated values
MAPE
Mean Forecast Error (MFE) measures the bias of a forecast
numerical samples
estimated values
MFE (bias)
Mean Squared Error (MSE) is the average of the squared differences between the estimated and actual values
numerical samples
estimated values
MSE
returns a matrix of values by combining arrays into a matrix
a matrix of column values
returns an array of vectors as an array of arrays
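One plausible reading of this pivot is a transpose: several column arrays of equal length are combined so that each output row holds one value from every column. The sketch below follows that reading; the function name and the equal-length requirement are assumptions.

```javascript
// Illustrative sketch: combine column arrays into rows (a transpose).
// Assumes all input arrays have the same length.
function pivotVectors(vectors) {
  const rowCount = vectors.length ? vectors[0].length : 0;
  return Array.from({ length: rowCount }, (_, r) =>
    vectors.map((column) => column[r])
  );
}

// Two columns of three values become three rows of two values:
// [[1, 4], [2, 5], [3, 6]]
console.log(pivotVectors([[1, 2, 3], [4, 5, 6]]));
```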
The coefficient of correlation, R, indicates how well the given data fits a line or a curve.
numerical samples
estimated values
R
The adjusted coefficient of determination measures how well a multiple regression equation fits the sample data. It is closely related to the coefficient of determination (also known as R^2) that is used to test the results of a simple regression equation.
the number of independent variables in the regression equation
adjusted r^2 for multiple linear regression
The coefficient of determination, r^2, indicates how well the given data fits a line or a curve.
r^2
Creates an array of numbers (positive and/or negative) progressing from start up to, but not including, end. If end is not specified, it is set to start, and start is then set to 0. If end is less than start, a zero-length range is created unless a negative step is specified.
The start of the range.
The end of the range.
The value to increment or decrement by.
Returns a new range array.
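The rules above can be sketched as follows. This is an illustrative reimplementation with a default step of 1, not the code of `_.range` or of this library.

```javascript
// Illustrative sketch of the range rules described above.
function range(start, end, step = 1) {
  if (end === undefined) {
    // Only one bound given: treat it as the end and start from 0.
    end = start;
    start = 0;
  }
  // A step of 0 would divide by zero, so fall back to 1 when sizing the array.
  const length = Math.max(Math.ceil((end - start) / (step || 1)), 0);
  return Array.from({ length }, (_, i) => start + i * step);
}

console.log(range(4));         // [0, 1, 2, 3]
console.log(range(0, 20, 5));  // [0, 5, 10, 15]
console.log(range(5, 1));      // []
console.log(range(5, 1, -1));  // [5, 4, 3, 2]
```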
This method is like _.range except that it populates values in descending order.
The start of the range.
The end of the range.
The value to increment or decrement by.
Returns the new array of numbers.
Returns an array of the squared differences of two arrays
Squared difference of left minus right array
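A minimal sketch of the element-wise calculation, assuming both arrays have the same length. The function name is hypothetical.

```javascript
// Illustrative sketch: (left[i] - right[i])^2 for each index i.
function squaredDifference(left, right) {
  return left.map((value, i) => (value - right[i]) ** 2);
}

// Differences are -2, 0 and 3, so the result is [4, 0, 9].
console.log(squaredDifference([1, 2, 3], [3, 2, 0]));
```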
The standard error of the estimate is a measure of the accuracy of predictions made with a regression line. Compares the estimate to the actual value
numerical samples
estimated values
Standard Error of the Estimate
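For a simple regression line, the standard error of the estimate is commonly computed as the square root of the residual sum of squares divided by n - 2. The sketch below assumes that form; the function name and the n - 2 degrees of freedom are assumptions, and a multiple regression would use a different divisor.

```javascript
// Illustrative sketch: sqrt(sum((actual - estimate)^2) / (n - 2)).
// The n - 2 divisor assumes a simple regression with slope and intercept.
function standardErrorOfEstimate(actuals, estimates) {
  const residualSumOfSquares = actuals.reduce(
    (sum, a, i) => sum + (a - estimates[i]) ** 2,
    0
  );
  return Math.sqrt(residualSumOfSquares / (actuals.length - 2));
}
```

For actuals `[1, 2, 3, 4]` and estimates `[1.1, 1.9, 3.2, 3.8]`, the residual sum of squares is 0.1, giving a standard error of roughly 0.224.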
Calculates the z score of each value in the sample, relative to the sample mean and standard deviation.
An array-like object containing the sample data.
The z-scores, standardized by mean and standard deviation of input array
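The per-value calculation can be sketched as below. The function name is hypothetical, and dividing by n (population standard deviation) is an assumption; an implementation may use n - 1 instead.

```javascript
// Illustrative sketch: z = (x - mean) / standardDeviation for each value.
function zScores(sample) {
  const n = sample.length;
  const mean = sample.reduce((sum, x) => sum + x, 0) / n;
  // Population standard deviation (divide by n); this choice is an assumption.
  const std = Math.sqrt(
    sample.reduce((sum, x) => sum + (x - mean) ** 2, 0) / n
  );
  return sample.map((x) => (x - mean) / std);
}
```

For the sample `[2, 4, 4, 4, 5, 5, 7, 9]`, the mean is 5 and the population standard deviation is 2, so the first and last z-scores are -1.5 and 2.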
Tracking Signal - Used to pinpoint forecasting models that need adjustment
numerical samples
estimated values
trackingSignal
Calculates the z score of each value in the sample, relative to the sample mean and standard deviation.
An array-like object containing the sample data.
The z-scores, standardized by mean and standard deviation of input array
returns association rule learning results
https://github.com/alexisfacques/Node-FPGrowth