🔗
If it were necessary to license published materials for training large language models, it would necessarily limit the viability of those models to those companies which could afford the significant expense.
As opposed to the “usual” way of training with data scraped from the web…a legally-contested practice.
It reminds me of when Craig Federighi said something like “We don’t need to train ML algorithms on our users’ photos when we can just buy images elsewhere.”
Me? I approve.