import {UpperConfidenceBound} from 'modelscript/src/ReinforcedLearning.mjs'

public class

UpperConfidenceBound

Extends:

ReinforcedLearningBase → UpperConfidenceBound

Implementation of the Upper Confidence Bound algorithm

Constructor Summary

Public Constructor
public	constructor(options: Object): this Creates a new instance of the Upper Confidence Bound (UCB) algorithm.

Member Summary

Public Members
public	numbers_of_selections
public	sums_of_rewards
public	total_reward

Method Summary

Public Methods
public	learn(ucbRow: Object, getBound: Function): this Single-step training method
public	predict(): number Returns the next action based on the upper confidence bound
public	train(ucbRow: Object | Object[], getBound: Function): this Training method for upper confidence bound calculations

Inherited Summary

From class ReinforcedLearningBase
public	bounds
public	getBound
public	iteration
public	last_selected
public	total_reward
public	learn() interface instance method for reinforced learning step
public	predict() interface instance method for reinforced prediction step
public	train() interface instance method for reinforced training step

Public Constructors

public constructor(options: Object): this

Creates a new instance of the Upper Confidence Bound (UCB) algorithm. UCB is based on the principle of optimism in the face of uncertainty: choose each action as if the environment (here, the bandit) is as favorable as is plausibly possible.

Override:

ReinforcedLearningBase#constructor

Params:

Name	Type	Attribute	Description
options	Object	optional default: {}

Return:

this

Example:

const dataset = new ms.ml.UpperConfidenceBound({bounds:10});

See:

http://banditalgs.com/2016/09/18/the-upper-confidence-bound-algorithm/
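The optimism principle described above can be illustrated with the classic UCB1 index: an arm's average reward plus an exploration bonus that shrinks the more often the arm has been tried. This is a minimal sketch, not modelscript's code; the helper name `ucbIndex`, its parameters, and the constant 2 in the bonus are assumptions taken from the textbook formulation.

```javascript
// Hypothetical helper, not part of modelscript: textbook UCB1 index for one arm.
function ucbIndex(sumOfRewards, timesSelected, round) {
  // An arm that was never tried is maximally uncertain, so treat it as maximally promising.
  if (timesSelected === 0) return Infinity;
  const averageReward = sumOfRewards / timesSelected;
  // Exploration bonus: shrinks as the arm is tried more, grows slowly with the round count.
  const bonus = Math.sqrt((2 * Math.log(round)) / timesSelected);
  return averageReward + bonus;
}

// Two arms with the same average reward (0.5) but very different sample sizes.
const wellKnown = ucbIndex(50, 100, 200);
const barelyTried = ucbIndex(1, 2, 200);
console.log(barelyTried > wellKnown); // the less-explored arm gets the higher index
```

Because both arms have the same observed average, the difference in index comes entirely from the bonus term, which is what makes the algorithm "optimistic" about poorly explored arms.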

Public Members

public numbers_of_selections

public sums_of_rewards

public total_reward

Override:

ReinforcedLearningBase#total_reward

Public Methods

public learn(ucbRow: Object, getBound: Function): this

Single-step training method

Override:

ReinforcedLearningBase#learn

Params:

Name	Type	Attribute	Description
ucbRow	Object		row of bound selections
getBound	Function	optional default: this.getBound	function that selects the value in ucbRow for a given selection

Return:

this
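A single learning step can be pictured as: look up the reward for the selected bound in the row, then update the counters. The sketch below is an assumption-laden stand-in, not the library's implementation. The field names mirror the members listed on this page, but the array storage, the `createState` and `learnStep` helpers, and the reading of `getBound` as "map a selection to a key of the row" are all hypothetical.

```javascript
// Hypothetical state object; modelscript's internal storage may differ.
function createState(bounds) {
  return {
    numbers_of_selections: new Array(bounds).fill(0),
    sums_of_rewards: new Array(bounds).fill(0),
    total_reward: 0,
    iteration: 0,
    last_selected: [],
  };
}

// Hypothetical single step: record the reward observed for the selected bound.
function learnStep(state, ucbRow, selected, getBound = (bound) => bound) {
  // Assumption: getBound turns a selection into the key that holds its reward in the row.
  const reward = ucbRow[getBound(selected)];
  state.numbers_of_selections[selected] += 1;
  state.sums_of_rewards[selected] += reward;
  state.total_reward += reward;
  state.last_selected.push(selected);
  state.iteration += 1;
  return state;
}

const state = createState(3);
learnStep(state, { 0: 0, 1: 1, 2: 0 }, 1);
learnStep(state, { ad_0: 0, ad_1: 0, ad_2: 1 }, 2, (bound) => `ad_${bound}`);
```

The second call shows why a selector function is useful: the row's keys do not have to be the bare selection numbers.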

public predict(): number

Returns the next action based on the upper confidence bound

Override:

ReinforcedLearningBase#predict

Return:

number

the selected bound
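Choosing the next action amounts to computing an upper confidence index for every bound and returning the one with the largest value. The following is a rough sketch of that selection rule using the textbook UCB1 bonus. The function `selectBound` and its array arguments are hypothetical, and the exact constants modelscript uses may differ.

```javascript
// Hypothetical stand-in for predict(); both arrays are indexed by bound.
function selectBound(numbersOfSelections, sumsOfRewards, iteration) {
  let best = 0;
  let bestIndex = -Infinity;
  for (let i = 0; i < numbersOfSelections.length; i++) {
    const n = numbersOfSelections[i];
    // Untried bounds get an infinite index so that each one is explored at least once.
    const index = n === 0
      ? Infinity
      : sumsOfRewards[i] / n + Math.sqrt((2 * Math.log(iteration + 1)) / n);
    if (index > bestIndex) {
      bestIndex = index;
      best = i;
    }
  }
  return best;
}

// Bound 1 has never been tried, so it is selected first.
const firstPick = selectBound([5, 0, 5], [3, 0, 1], 10);
// All bounds tried equally often: the highest average reward wins.
const laterPick = selectBound([5, 5, 5], [1, 4, 2], 15);
```

Ties are broken in favor of the lowest bound number here, which is an arbitrary choice made for this sketch.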

public train(ucbRow: Object | Object[], getBound: Function): this

Training method for upper confidence bound calculations; accepts a single row or an array of rows

Override:

ReinforcedLearningBase#train

Params:

Name	Type	Attribute	Description
ucbRow	Object | Object[]		row or rows of bound selections
getBound	Function	optional default: this.getBound	function that selects the value in ucbRow for a given selection

Return:

this
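Training over many rows can be thought of as repeating "select a bound, observe its reward in the current row, update the counters" once per row. The standalone sketch below assumes that reading. `trainUcb` is a hypothetical function rather than the library's `train`, it uses the textbook UCB1 bonus, and the synthetic rows (where only bound 2 ever pays out) are made up for illustration.

```javascript
// Hypothetical end-to-end loop; accepts one row or an array of rows, like train() above.
function trainUcb(rows, bounds, getBound = (bound) => bound) {
  const counts = new Array(bounds).fill(0);
  const sums = new Array(bounds).fill(0);
  let totalReward = 0;
  const batch = Array.isArray(rows) ? rows : [rows];

  batch.forEach((row, t) => {
    // Select the bound with the highest upper confidence index.
    let selected = 0;
    let bestIndex = -Infinity;
    for (let i = 0; i < bounds; i++) {
      const index = counts[i] === 0
        ? Infinity
        : sums[i] / counts[i] + Math.sqrt((2 * Math.log(t + 1)) / counts[i]);
      if (index > bestIndex) {
        bestIndex = index;
        selected = i;
      }
    }
    // Observe the reward for that bound in this row and update the statistics.
    const reward = row[getBound(selected)];
    counts[selected] += 1;
    sums[selected] += reward;
    totalReward += reward;
  });

  return { counts, sums, totalReward };
}

// Synthetic data: bound 2 always rewards 1, the others never do.
const rows = Array.from({ length: 200 }, () => ({ 0: 0, 1: 0, 2: 1 }));
const result = trainUcb(rows, 3);
console.log(result.counts, result.totalReward);
```

On this data the loop should quickly concentrate its selections on bound 2 while still revisiting the other bounds occasionally, which is the exploration the confidence bonus is meant to provide.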