[crazy][hobby][spam] Automated Reverse Engineering
k
gmkarl at gmail.com
Sun Jan 16 04:39:33 PST 2022
[after a number of psychotic breaks] The training loop runs now, though
it is likely not running very effectively. For the notebook to run at
the moment, an uncommitted change is needed:
 # compute loss
 loss = optax.softmax_cross_entropy(logits,
     flax.training.common_utils.onehot(labels, logits.shape[-1]))
-padding_mask = decoder_attention_mask
+padding_mask = batch['decoder_attention_mask']
 loss = (loss * padding_mask).sum() / padding_mask.sum()
More information about the cypherpunks mailing list