import {ColumnVectorizer} from 'modelscript/src/ColumnVectorizer.mjs'
public class | source

ColumnVectorizer

Class that creates sparse matrices from a corpus.

Constructor Summary

Public Constructor
public

constructor(options: Object): this

Creates a new instance for classifying text data for machine learning.

Member Summary

Public Members
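public data
public limitedFeatures
public matrix
public maxFeatures
public replacer
public sortedWordCount
public tokens
public vectors
public wordCountMap
public wordMap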

Method Summary

Public Methods
public

evaluate(testString: String): number[][]

Returns a new matrix with word counts in columns.

public

evaluateString(testString: String): Object

Returns a map of corpus words to their counts in the test string.

public

fit_transform(options: Object)

Fits the vectorizer to the corpus and transforms it into column vectors: a sparse matrix in which each row represents one document, each column represents one word in the corpus, and each cell holds the number of times that word appears in that document.

public

get_limited_features(options: *)

Returns limited sets of dependent features or all dependent features sorted by word count

public

get_tokens(): String[]

Returns a distinct array of all tokens

public

get_vector_array(): String[]

Returns array of arrays of strings for dependent features from sparse matrix word map

Public Constructors

public constructor(options: Object): this source

Creates a new instance for classifying text data for machine learning.

Params:

Name     Type    Attribute               Description
options  Object  optional, default: {}

Return:

this

Example:

const dataset = new ms.nlp.ColumnVectorizer(csvData);

Public Members

public data source

public limitedFeatures source

public matrix source

public maxFeatures source

public replacer source

public sortedWordCount source

public tokens source

public vectors source

public wordCountMap source

public wordMap source

Public Methods

public evaluate(testString: String): number[][] source

Returns a new matrix with word counts in columns.

Params:

Name        Type    Attribute  Description
testString  String

Return:

number[][]

sparse matrix row for new classification predictions

Example:

ColumnVectorizer.evaluate('I would rate everything Great, views Great, food Great') => [ [ 0, 1, 3, 0, 0, 0, 0, 0, 1 ] ]
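
The mapping from a test string to a single-row count matrix can be illustrated with a minimal standalone sketch. It is not the modelscript implementation: `evaluateSketch` and `featureColumns` are hypothetical names, the tokenizer is a plain lowercase split, and no stemming is applied.

```javascript
// Hypothetical helper, not part of modelscript: maps a string onto fixed feature columns.
function evaluateSketch(testString, features) {
  // Lowercase the text and split on anything that is not a letter or digit.
  const words = testString.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  // Produce one count per feature column, in the order given by `features`.
  const row = features.map(feature => words.filter(word => word === feature).length);
  // Wrap the row so the result has the same number[][] shape that evaluate() documents.
  return [row];
}

// Assumed feature columns, chosen only for illustration.
const featureColumns = ['good', 'food', 'great', 'cold'];
console.log(evaluateSketch('I would rate everything Great, views Great, food Great', featureColumns));
// → [ [ 0, 1, 3, 0 ] ]
```

Each position in the returned row lines up with one feature column, which is why the same feature order must be used for training data and for new strings.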

public evaluateString(testString: String): Object source

Returns a map of corpus words to their counts in the test string.

Params:

Name        Type    Attribute  Description
testString  String

Return:

Object

object of corpus words with counts

Example:

ColumnVectorizer.evaluateString('I would rate everything Great, views Great, food Great') => { realli: 0,
good: 0,
definit: 0,
recommend: 0,
wait: 0,
staff: 0,
rude: 0,
great: 3,
view: 1,
food: 1,
not: 0,
cold: 0,
took: 0,
forev: 0,
seat: 0,
time: 0,
prompt: 0,
attent: 0,
bland: 0,
flavor: 0,
kind: 0 }
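
A rough sketch of how such a word map could be built follows. It assumes a known vocabulary and a simple lowercase tokenizer; `evaluateStringSketch` is a hypothetical name, and the stemming visible in the keys above (for example `realli`, `definit`) is deliberately left out.

```javascript
// Hypothetical helper, not part of modelscript: counts vocabulary words in a string.
function evaluateStringSketch(testString, vocabulary) {
  // Start every vocabulary word at zero so absent words still appear in the result.
  const counts = {};
  for (const word of vocabulary) counts[word] = 0;
  // Lowercase the text and split on anything that is not a letter or digit.
  const words = testString.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  for (const word of words) {
    // Count only words that belong to the vocabulary; ignore everything else.
    if (Object.prototype.hasOwnProperty.call(counts, word)) counts[word] += 1;
  }
  return counts;
}

console.log(evaluateStringSketch('I would rate everything Great, views Great, food Great', ['great', 'food', 'cold']));
// → { great: 3, food: 1, cold: 0 }
```

Words outside the vocabulary are dropped, so the returned object always has exactly one key per vocabulary word.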

public fit_transform(options: Object) source

Fits the vectorizer to the corpus and transforms it into column vectors: a sparse matrix in which each row represents one document, each column represents one word in the corpus, and each cell holds the number of times that word appears in that document.

Params:

Name          Type      Attribute  Description
options       Object
options.data  Object[]             array of corpus data
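
To make the fit-and-transform idea concrete, here is a minimal standalone sketch of a bag-of-words count matrix. It is an illustration under stated assumptions, not the library's code: `fitTransformSketch` is a hypothetical name, the corpus is assumed to be a plain array of strings, and stemming and stop-word removal are omitted.

```javascript
// Hypothetical helper, not part of modelscript: builds a vocabulary and a count matrix.
function fitTransformSketch(documents) {
  // Lowercase each document and split on anything that is not a letter or digit.
  const tokenize = text => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const tokenized = documents.map(tokenize);
  // "Fit": collect the distinct words of the whole corpus, in first-seen order.
  const vocabulary = [...new Set(tokenized.flat())];
  // "Transform": one row per document, one column per vocabulary word.
  const matrix = tokenized.map(words =>
    vocabulary.map(term => words.filter(word => word === term).length)
  );
  return { vocabulary, matrix };
}

const fitted = fitTransformSketch(['great food', 'cold food, rude staff', 'great great view']);
console.log(fitted.vocabulary);
// → [ 'great', 'food', 'cold', 'rude', 'staff', 'view' ]
console.log(fitted.matrix);
// → [ [ 1, 1, 0, 0, 0, 0 ], [ 0, 1, 1, 1, 1, 0 ], [ 2, 0, 0, 0, 0, 1 ] ]
```

Most cells are zero because each document uses only a few of the corpus words, which is what makes the matrix sparse.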

public get_limited_features(options: *) source

Returns limited sets of dependent features or all dependent features sorted by word count

Params:

Name                 Type    Attribute  Description
options              *
options.maxFeatures  number             max number of features
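
Limiting features by word count can be sketched as a sort followed by a slice. The sketch below is an assumption about the general technique rather than the library's implementation; `limitedFeaturesSketch` and the `[word, count]` pair format are hypothetical.

```javascript
// Hypothetical helper, not part of modelscript: keeps the most frequent words.
function limitedFeaturesSketch(wordCounts, maxFeatures) {
  // Turn { word: count } into [word, count] pairs sorted by descending count.
  const sorted = Object.entries(wordCounts).sort((a, b) => b[1] - a[1]);
  // With no limit given, return every feature; otherwise keep the top `maxFeatures`.
  return maxFeatures ? sorted.slice(0, maxFeatures) : sorted;
}

const exampleCounts = { great: 3, food: 2, cold: 1, view: 1 };
console.log(limitedFeaturesSketch(exampleCounts, 2));
// → [ [ 'great', 3 ], [ 'food', 2 ] ]
console.log(limitedFeaturesSketch(exampleCounts).length);
// → 4
```

Capping the feature count keeps the matrix narrow, at the cost of discarding rare words.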

public get_tokens(): String[] source

Returns a distinct array of all tokens

Return:

String[]

returns a distinct array of all tokens

public get_vector_array(): String[] source

Returns array of arrays of strings for dependent features from sparse matrix word map

Return:

String[]

returns array of dependent features for DataSet column matrices