Want to hear something funny? In my legal case against LAION e.V., the opposing party's lawyers argue that, according to the new EU AI law, it should be documented which works were used to train a generative AI. From this they conclude that training with these works should be expressly legal as long as it is documented. What do you think? #ai #ki #legal #law #aiact
Why did you sue LAION? Did they use some of your data? If yes, did you try to find a middle ground?
Allow me to share a legal perspective that probably explains their reasoning: Under German copyright law, data mining of publicly available data is permissible even without the copyright owner's consent. If the mining happens for non-commercial purposes, the case ends there. Commercial mining has to respect a machine-readable opt-out declared before training. This is all based on EU legislation (the Digital Single Market Directive). The AI Act confirms that these laws apply to generative AI models. However, and this is what the opponent seems to refer to, the AI Act also aims at transparency about the training data. So where data mining is permissible by law and the data found its way into a model (and you learned about it), you are currently stuck. Today, I believe that discussions about compensation make more sense in this environment than discussions about "don't do that". Think of it as a case of compulsory licensing: "If I have to accept this use, then I at least want to be paid." Is this the end of it? I don't think so. The questions go way beyond the law and have to be discussed in society to define the playbook of the future. But it is a realistic status quo report, I am afraid.
I think they are using sophistry. Saying that a need to document sources (taking inventory of stolen goods) legitimizes ownership or free use of the source (keeping stolen goods because they were inventoried) is nonsense. "I've documented what I have stolen, and by doing so, now legally own what I have stolen." Nope. Possession isn't 9/10ths of the law. Owners don't transfer ownership because their goods are documented in a list of stolen goods.
It's an effective strategy. I like to steal a handful of oranges every time I walk into my local grocery store, and the manager always lets me do it as long as I tell him first. On a serious note, to me the point of documenting trained works is to be able to cross-reference and validate whether the materials were properly licensed. This just sounds like another chapter in GenAI's never-ending story of trying to cut out those annoying middlemen known as "consent" and "ethics."
They argue like that, but would that mean they also offer the documentation of the sources? Until now, I thought that was not possible at all... My opinion: the argument is of course bullshit, but documentation would be a great first step.
Logically, the argument does not stand. In the same fashion, documenting where you got all your counterfeit money from does not make it legal. As for training AIs, it is a quagmire. I understand the AI trainers' need for data. I also understand the original creators of the texts, who see their work generate billions without getting anything out of it. Some IP must be paid for, somehow, maybe like in the music industry?
But of course! If the thief provides you the list of items he borrowed from you, will never return them, but is selling them without your permission or compensation, all is OK! : ) The EU is a farcical political structure that works directly against the interests of its population, across way too many areas. But bad things fall apart, sooner or later. Maybe AI will accelerate it? : )
I will rob banks, but as long as I keep a spreadsheet of which banks I hit, it's perfectly legal. 😉
BS. I hope (for them) they have better arguments … Transparency obligations for copyrighted material in the AI Act, if this is what they refer to, should allow copyright holders to exercise their rights "including through state of the art technologies and reservation of rights expressed pursuant to Article 4(3) of Directive etc." In other words, they should allow rightsholders to opt out of the training. They will have to provide detailed lists of the data trained on. Based on this, rightsholders will be able to opt out.