Sebastian Borgeaud (Google) on RSI: "With synthetic data, you use a strong model to generate the synthetic data, and then you run smaller-scale ablations to validate the effect of the synthetic data. One really interesting question is whether you can actually generate synthetic data to make a model that you want to train in the future better than the model that generated the synthetic data in the first place. We spend a lot of time thinking about this and doing research in this direction."