2 parents fe35475 + bed1f85 commit d45d0fb
README.md
@@ -671,7 +671,7 @@ accuracy numbers.
 * If you are pre-training from scratch, be prepared that pre-training is
   computationally expensive, especially on GPUs. If you are pre-training from
   scratch, our recommended recipe is to pre-train a `BERT-Base` on a single
-  [preemptable Cloud TPU v2](https://cloud.google.com/tpu/docs/pricing), which
+  [preemptible Cloud TPU v2](https://cloud.google.com/tpu/docs/pricing), which
   takes about 2 weeks at a cost of about $500 USD (based on the pricing in
   October 2018). You will have to scale down the batch size when only training
   on a single Cloud TPU, compared to what was used in the paper. It is
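The passage touched by this commit advises scaling down the batch size when pre-training on a single Cloud TPU. As a rough illustration (not part of the commit itself), a single-TPU run might look like the sketch below, using the flags of this repo's `run_pretraining.py`; the bucket paths are placeholders, and `--train_batch_size=128` is an assumed reduction from the paper's 256, not a value the README prescribes.

```shell
# Hypothetical invocation: paths are placeholders, and the batch size of 128
# is an assumed scale-down from the paper's 256 for a single TPU v2.
# The flags themselves come from run_pretraining.py in this repo.
python run_pretraining.py \
  --input_file=gs://your_bucket/tf_examples.tfrecord \
  --output_dir=gs://your_bucket/pretraining_output \
  --do_train=True \
  --bert_config_file=bert_config.json \
  --train_batch_size=128 \
  --max_seq_length=128 \
  --max_predictions_per_seq=20 \
  --num_train_steps=1000000 \
  --num_warmup_steps=10000 \
  --learning_rate=1e-4 \
  --use_tpu=True \
  --tpu_name=$TPU_NAME
```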