The line of defense the AI crowd uses against copyright problems is that the model is not large enough to actually store all of its original input, so, they argue, it must be using "intelligence" to generate the text or pictures...
...It is just that nobody can tell what is really recorded in the AI model. We have no way to know what is hidden there, or how high the risk of vile behavior is. It is a bit like a the-house-has-not-burnt-down-yet situation, except nobody has any idea how to check the fire safety.
That the model forgot most of its input does not mean it forgot the specific text that will cause copyright problems.
How much space data takes after compression is only easy to calculate for lossless compression. What compression ratios are possible for lossy compression is an open research question. If you consider the Johnson–Lindenstrauss lemma and how it has been used in real-world applications, there are grounds to suspect that AI models forget a lot less than the AI crowd currently assumes.
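To see why the Johnson–Lindenstrauss lemma matters here, here is a minimal sketch (my own toy demo, with dimensions and tolerances I picked for illustration, not anything from a real AI model): a random projection squashes 10,000-dimensional points down to 1,000 dimensions, a 10x "lossy compression," yet the distances between the points barely change. Structure can survive far more compression than intuition suggests.

```python
import numpy as np

# Toy Johnson–Lindenstrauss demo: project n points from d=10,000
# dimensions down to k=1,000 and check that pairwise distances survive.
rng = np.random.default_rng(0)
n, d, k = 20, 10_000, 1_000

points = rng.normal(size=(n, d))
# Random Gaussian projection, scaled by 1/sqrt(k) so expected
# squared norms are preserved.
projection = rng.normal(size=(d, k)) / np.sqrt(k)
projected = points @ projection

def pairwise_dists(x):
    """Euclidean distances between all pairs of rows."""
    diffs = x[:, None, :] - x[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))

orig = pairwise_dists(points)
proj = pairwise_dists(projected)

# Compare distances for distinct pairs only (skip the zero diagonal).
mask = ~np.eye(n, dtype=bool)
ratios = proj[mask] / orig[mask]
print(ratios.min(), ratios.max())  # ratios stay close to 1
```

The point of the sketch: even after throwing away 90% of the dimensions, the geometry of the data is nearly intact. If a model's weights act like such a lossy-but-structure-preserving compression, "the model is too small to store the input" is a much weaker argument than it sounds.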
I think a ban on AI-generated content on AFF makes sense. People have been banned for posting stories stolen from other people to pad their hit counts, so why should we not worry about a tool that lets people generate infinite text?