Tutorial

Let’s go through the different ways of interacting with the dataset. Through the Python API, you can pull dataframes of the trial logs, scikit-optimize objects that detail the Gaussian-process-based search history, and the resulting models themselves.

For the purpose of this tutorial, we’ll be working with the ResNet portion of the dataset. However, the steps are the same for VGG and DenseNet.

Let’s start by loading and initializing the dataset class.

>>> from crossedwires.cifar10 import ResNet50Dataset
>>> dataset = ResNet50Dataset()

Trial Logs

There are multiple logs available as part of the dataset. They are returned as pandas DataFrame objects.

The first is the Weights and Biases export, returned by the wandb_dataframe method. This is the preferred dataframe because it contains all of the information in one place. Use it to look up the name of the trial whose model you want to pull. You can filter it as needed to isolate the names required to load specific models.

>>> dataset.wandb_dataframe()
                    Name  accuracy_diff  pt_test_acc  tf_test_acc     State Notes  ... time_since_restore  time_this_iter_s time_total_s  timestamp  timesteps_since_restore  training_iteration
0    dual_train_85721cc8       0.452007     0.570107       0.1181  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
1    dual_train_f7099482       0.435161     0.551261       0.1161  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
2    dual_train_fcdc666a       0.430048     0.562548       0.1325  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
3    dual_train_21b65dba       0.422199     0.522199       0.1000  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
4    dual_train_807e7310       0.409086     0.559586       0.1505  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
..                   ...            ...          ...          ...       ...   ...  ...                ...               ...          ...        ...                      ...                 ...
395  dual_train_5ac692ce       0.000800     0.098300       0.0991  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
396  dual_train_f416de3a       0.000732     0.574932       0.5742  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
397  dual_train_54dc1862       0.000569     0.436069       0.4355  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
398  dual_train_5efd9218       0.000316     0.559484       0.5598  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN
399  dual_train_98577868       0.000137     0.605537       0.6054  finished     -  ...                NaN               NaN          NaN        NaN                      NaN                 NaN

[400 rows x 37 columns]
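As a minimal sketch of the filtering step described above, the snippet below builds a small stand-in dataframe from four of the rows and four of the columns shown in the output, then selects trial names with a boolean mask. The thresholds are arbitrary illustration values, and the stand-in exists only so the example runs without the dataset; in practice you would start from the dataframe returned by dataset.wandb_dataframe().

```python
import pandas as pd

# Stand-in for dataset.wandb_dataframe(): four rows and four columns copied
# from the output above, so the example runs without downloading anything.
df = pd.DataFrame(
    {
        "Name": [
            "dual_train_85721cc8",
            "dual_train_f7099482",
            "dual_train_f416de3a",
            "dual_train_98577868",
        ],
        "accuracy_diff": [0.452007, 0.435161, 0.000732, 0.000137],
        "pt_test_acc": [0.570107, 0.551261, 0.574932, 0.605537],
        "tf_test_acc": [0.1181, 0.1161, 0.5742, 0.6054],
    }
)

# Keep trials where the two frameworks nearly agree and the PyTorch test
# accuracy is above 0.5. Both thresholds are arbitrary illustration values.
mask = (df["accuracy_diff"] < 0.001) & (df["pt_test_acc"] > 0.5)
trial_names = df.loc[mask, "Name"].tolist()
print(trial_names)
```

The resulting list of names can then be passed, one at a time, to the model-loading methods covered later in this tutorial.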

The Ray Tune integration also generates log files, but they are split up by the separate spaces searched. They are accessible using this command:

# Pass num to return a particular space; with num=None, all spaces are returned
>>> dataset.ray_tune_dataframes(num=None)
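Because the Ray Tune logs are split by search space, it can be convenient to stack them into a single dataframe. The helper below is a sketch only: it assumes that ray_tune_dataframes(num=None) returns a list of pandas DataFrames, which the text above does not confirm, and the function name combine_spaces and the added "space" column are hypothetical names chosen for illustration.

```python
import pandas as pd


def combine_spaces(frames):
    """Stack per-space DataFrames into one, recording each row's source space.

    ASSUMPTION: `frames` is an iterable of DataFrames, one per search space,
    in the order the spaces were searched.
    """
    frames = list(frames)
    combined = pd.concat(
        frames,
        keys=list(range(len(frames))),  # 0, 1, 2, ... identifies the space
        names=["space", "row"],
    )
    # Move the outer index level into an ordinary "space" column and
    # discard the per-frame row numbers.
    return combined.reset_index(level="space").reset_index(drop=True)


# Hypothetical usage, valid only if the method returns a list of DataFrames:
# combined = combine_spaces(dataset.ray_tune_dataframes(num=None))
```

If the method turns out to return a different container, such as a dictionary, the same pd.concat call can be adapted by passing the dictionary keys as keys.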

Search Objects

As part of the integration with scikit-optimize, each overlapping space searched generates an OptimizeResult object. These track the Gaussian processes which are responsible for the optimization, and have helpful utilities to understand the trajectory of the search. In addition, the surrogate models used to guide the search are contained here.

# Pass num to return a particular space; with num=None, all spaces are returned
>>> opt_results = dataset.optimizer_results(num=None)
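One simple way to read the trajectory of a search is the running best objective value. The sketch below relies only on the func_vals attribute that scikit-optimize records on every OptimizeResult (the objective value at each evaluated point). The helper name best_so_far is hypothetical, and the usage line assumes that opt_results is a sequence of OptimizeResult objects. scikit-optimize minimizes its objective, so lower values are better; which quantity served as the objective in these searches is not stated here.

```python
from itertools import accumulate


def best_so_far(result):
    """Running minimum of the objective values stored in an OptimizeResult.

    Only `result.func_vals` is read, so any object exposing that attribute
    works.
    """
    return list(accumulate((float(v) for v in result.func_vals), min))


# Hypothetical usage, assuming opt_results is a sequence of OptimizeResult:
# trajectory = best_so_far(opt_results[0])
```

A trajectory that flattens early suggests the search found its best point quickly, while a steadily decreasing one suggests later trials were still improving.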

The plotting utilities for these objects can be found in the skopt.plots module.

Here’s an example showing the partial dependence plots generated with skopt.plots.plot_objective:

[Figure: partial dependence plots]
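A minimal sketch of producing such plots for every search space follows. It assumes scikit-optimize and matplotlib are installed and that the results passed in are OptimizeResult objects; the wrapper name plot_objectives is hypothetical. The import sits inside the function so that defining the helper does not itself require scikit-optimize.

```python
def plot_objectives(results, n_points=40):
    """Draw one partial dependence figure per OptimizeResult.

    Returns the list of axes objects produced by skopt.plots.plot_objective,
    in the same order as `results`.
    """
    # Imported lazily: only calling the helper requires scikit-optimize.
    from skopt.plots import plot_objective

    return [plot_objective(result, n_points=n_points) for result in results]


# Hypothetical usage, assuming opt_results is a sequence of OptimizeResult:
# axes = plot_objectives(opt_results)
# import matplotlib.pyplot as plt; plt.show()
```

Lowering n_points makes each figure faster to draw at the cost of a coarser partial dependence estimate.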

Trained Models

The key piece of the dataset! Let’s interact with the actual trained models that were generated through the hyperparameter searches. To pull a model, you need the name of its trial, which can be found in the dataframe returned by the wandb_dataframe method of the main Dataset class.

Once you have the name, you can load the model in either framework and interact with it from there.

PyTorch Example

>>> torch_model = dataset.get_pytorch_model('dual_train_85721cc8')
>>> torch_model
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
........
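Once a PyTorch model is loaded, a common first step is to switch it to inference mode and check its size. The helper below is a sketch that uses only standard torch.nn.Module methods (eval and parameters) and tensor attributes (numel and requires_grad); the name describe_model is hypothetical, and nothing here is specific to this dataset.

```python
def describe_model(model):
    """Put a loaded PyTorch model in inference mode and count its parameters."""
    # eval() disables dropout and makes batch-norm layers use their stored
    # running statistics instead of per-batch statistics.
    model.eval()
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return {"total": total, "trainable": trainable}


# Hypothetical usage with the model loaded above:
# print(describe_model(torch_model))
```

Calling eval() before any forward pass matters for this architecture because the printout above shows BatchNorm2d layers with track_running_stats=True.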

TensorFlow Example

>>> tensorflow_model = dataset.get_tensorflow_model('dual_train_85721cc8')
>>> tensorflow_model.summary()
Model: "resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 3, 32, 32)]  0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 3, 38, 38)    0           input_1[0][0]
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 64, 16, 16)   9472        conv1_pad[0][0]
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 64, 16, 16)   256         conv1_conv[0][0]
__________________________________________________________________________________________________
conv1_relu (Activation)         (None, 64, 16, 16)   0           conv1_bn[0][0]
__________________________________________________________________________________________________
........
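A quick smoke test for the loaded Keras model is to run a dummy batch through it. The sketch below reads the model's input_shape property, which for the summary above would be (None, 3, 32, 32), builds a zero-filled batch of that shape, and takes the arg-max of the predictions. It assumes a single-input model whose predict method returns one row of class scores per example; the helper names dummy_batch_for and predicted_classes are hypothetical.

```python
import numpy as np


def dummy_batch_for(model, batch_size=2):
    """Zero-filled float32 batch shaped like the model's input.

    ASSUMPTION: `model.input_shape` is a single tuple whose first entry is
    the (unspecified) batch dimension.
    """
    _, *feature_shape = model.input_shape
    return np.zeros((batch_size, *feature_shape), dtype=np.float32)


def predicted_classes(model, batch):
    """Index of the highest score per example, from model.predict.

    ASSUMPTION: predictions have shape (batch_size, num_classes).
    """
    scores = np.asarray(model.predict(batch))
    return scores.argmax(axis=1)


# Hypothetical usage with the model loaded above:
# batch = dummy_batch_for(tensorflow_model)
# print(predicted_classes(tensorflow_model, batch))
```

A zero-filled batch only confirms that the shapes line up; meaningful predictions require real, correctly preprocessed images in the same channels-first layout.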