The ML.TRAINING_INFO function
This document describes the ML.TRAINING_INFO function, which lets you see information about the training iterations of a model.
You can run ML.TRAINING_INFO while the CREATE MODEL statement for the target model is running, or you can wait until after the CREATE MODEL statement completes. If you run ML.TRAINING_INFO before the first training iteration of the CREATE MODEL statement completes, the query returns a Not found error.
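As a sketch of checking progress while training is still running, a query along the following lines summarizes how many iterations have completed so far in each training run. It assumes a hypothetical model named mydataset.mymodel in your default project, and it uses only the training_run, iteration, and duration_ms columns. Until the first training iteration completes, the query fails with the Not found error.

```sql
-- Progress check for a hypothetical model that might still be training.
-- mydataset.mymodel is a placeholder name, not a real model.
SELECT
  training_run,
  MAX(iteration) AS latest_iteration,       -- iterations are numbered from 0
  SUM(duration_ms) AS total_duration_ms     -- time spent on completed iterations
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
GROUP BY
  training_run
ORDER BY
  training_run;
```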
Syntax
ML.TRAINING_INFO(MODEL `project_id.dataset.model`)
Arguments
ML.TRAINING_INFO takes the following arguments:
- project_id: Your project ID.
- dataset: The BigQuery dataset that contains the model.
- model: The name of the model.
Output
ML.TRAINING_INFO returns the following columns:
- training_run: an INT64 value that contains the training run identifier for the model. The value in this column is 0 for a newly created model. If you retrain the model using the warm_start argument of the CREATE MODEL statement, this value is incremented.
- iteration: an INT64 value that contains the iteration number of the training run. The value for the first iteration is 0. This value is incremented for each additional iteration in the training run.
- loss: a FLOAT64 value that contains the loss metric calculated after an iteration on the training data:
  - For logistic regression models, this is log loss.
  - For linear regression models, this is mean squared error.
  - For multiclass logistic regressions, this is cross-entropy log loss.
  - For explicit matrix factorization models, this is mean squared error calculated over the seen input ratings.
  - For implicit matrix factorization models, the loss is calculated using the following formula:
    $$ Loss = \sum_{u, i} c_{ui}(p_{ui} - x^T_u y_i)^2 + \lambda(\sum_u ||x_u||^2 + \sum_i ||y_i||^2) $$
    For more information about what the variables mean, see Feedback types.
- eval_loss: a FLOAT64 value that contains the loss metric calculated on the holdout data. For k-means models, ML.TRAINING_INFO doesn't return an eval_loss column. If the DATA_SPLIT_METHOD argument is NO_SPLIT, then all entries in the eval_loss column are NULL.
- learning_rate: a FLOAT64 value that contains the learning rate in this iteration.
- duration_ms: an INT64 value that contains how long the iteration took, in milliseconds.
- cluster_info: an ARRAY<STRUCT> value that contains the fields centroid_id, cluster_radius, and cluster_size. ML.TRAINING_INFO computes cluster_radius and cluster_size with standardized features. This column is only returned for k-means models.
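To illustrate how these columns fit together, the following sketch lists the training and holdout loss for each iteration of the most recent training run. It assumes a hypothetical model named mydataset.mymodel that is neither a k-means model nor a time series model, so that the loss and eval_loss columns are present.

```sql
-- Loss per iteration for the latest training run of a hypothetical model.
-- mydataset.mymodel is a placeholder name.
SELECT
  training_run,
  iteration,
  loss,        -- loss on the training data
  eval_loss,   -- loss on the holdout data; NULL when DATA_SPLIT_METHOD is NO_SPLIT
  duration_ms
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
WHERE
  -- Keep only the most recent run, which matters after a warm_start retrain.
  training_run = (
    SELECT MAX(training_run)
    FROM ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
  )
ORDER BY
  iteration;
```

Ordering by iteration makes it easier to see whether loss and eval_loss are still decreasing or have started to diverge.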
Permissions
You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions in order to run ML.TRAINING_INFO.
Limitations
ML.TRAINING_INFO is subject to the following limitations:
- ML.TRAINING_INFO doesn't support imported TensorFlow models.
- For time series models, ML.TRAINING_INFO only returns three columns: training_run, iteration, and duration_ms. It doesn't expose the training information per iteration, or per time series if multiple time series are forecasted at once. The duration_ms value is the total time cost for the entire process.
Example
The following example retrieves training information from the model mydataset.mymodel in your default project:
SELECT * FROM ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
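For a k-means model, the cluster_info column is an array, so it is often more convenient to flatten it into one row per cluster. The following sketch does that with UNNEST. The model name mydataset.my_kmeans_model is hypothetical, and the query only applies to k-means models because other model types don't return cluster_info.

```sql
-- One row per iteration and cluster for a hypothetical k-means model.
-- mydataset.my_kmeans_model is a placeholder name.
SELECT
  t.training_run,
  t.iteration,
  c.centroid_id,
  c.cluster_radius,   -- computed with standardized features
  c.cluster_size      -- computed with standardized features
FROM
  ML.TRAINING_INFO(MODEL `mydataset.my_kmeans_model`) AS t,
  UNNEST(t.cluster_info) AS c
ORDER BY
  t.training_run,
  t.iteration,
  c.centroid_id;
```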
What's next
- For information about model evaluation, see BigQuery ML model evaluation overview.
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.