20 Comments
Apr 16, 2023 · Liked by Timothy B Lee

The "almost exact copy" problem seems solvable? It should be possible to construct a similarity score between a generated image and any of the source images, based on some set of criteria.
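
A minimal sketch of what such a similarity score could look like, assuming images are already loaded as same-sized grids of grayscale pixel values. The function names (`cosine_similarity`, `closest_source`, `is_near_copy`) and the 0.98 threshold are hypothetical, chosen only for illustration; a real system would more likely compare learned embeddings or perceptual hashes than raw pixels.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: an image is a grid of grayscale values (rows of numbers).
Image = Sequence[Sequence[float]]


def flatten(image: Image) -> List[float]:
    """Turn a 2D pixel grid into a flat vector."""
    return [float(pixel) for row in image for pixel in row]


def cosine_similarity(a: Image, b: Image) -> float:
    """Cosine of the angle between two images viewed as pixel vectors."""
    va, vb = flatten(a), flatten(b)
    if len(va) != len(vb):
        raise ValueError("images must have the same dimensions")
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    # An all-black image has zero norm; treat it as dissimilar to everything.
    return dot / norm if norm else 0.0


def closest_source(generated: Image, sources: Sequence[Image]) -> Tuple[int, float]:
    """Return (index, score) of the source image most similar to the output."""
    scores = [cosine_similarity(generated, source) for source in sources]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def is_near_copy(generated: Image, sources: Sequence[Image],
                 threshold: float = 0.98) -> bool:
    """Flag the output if any source scores at or above the (hypothetical) threshold."""
    _, score = closest_source(generated, sources)
    return score >= threshold
```

An app could call `is_near_copy(output, training_images)` after generation and discard anything that is flagged. Where to set the threshold, and which criteria go into the score, is the hard part that this sketch leaves open.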

I think you can build the app to reject copies after they are generated, but I don’t think you can build a model that doesn’t copy.

Great article! I think the big entertainment companies are the 800lb gorilla, waiting in the wings. I wouldn't want to be on the other side of Disney's copyright lawyers. Disney also has the connections in government to lobby for favourable legislation. I think it's also a little telling that the entertainment companies haven't been making a big stink about it. Disney would love a tool that spits out an endless stream of Spider-man colouring books.

Wow, this article really highlights the complexities of copyright issues in the AI industry! Excellent work, Tim! It's fascinating how Stable Diffusion's ability to generate new images based on latent representations can lead to potential copyright infringement, even when not intended.

It's definitely concerning that if plaintiffs win these lawsuits, the entire generative AI industry could be thrown into chaos, potentially giving even more power to big tech companies like Google, Microsoft, and Meta. I can't help but wonder how this will impact the future of AI startups and the industry as a whole.

What are your thoughts on how these copyright issues might be resolved in a way that still fosters innovation and competition in the AI space?

author

Hi Matt! I did a note recently exploring how these lawsuits might end. Check it out. https://substack.com/profile/101111787-timothy-b-lee/note/c-14517300

Will we be seeing any more technical aspects of AI and LLMs? I would love to see stories about how to fine-tune an open-source model. It would also be cool to see stories about how to write interfaces between LLMs and other software.

author

I do want to write some stories focusing on technical aspects of AI, but probably not at the level of how to fine-tune open-source models. One story I really want to write is an explainer of how transformers and the attention mechanism work, but first I have to figure it out myself.

One thing that might be fun and appropriate for your target audience: at the end of each post, perhaps you could provide links to other resources.

The lawsuits can't come quick enough! Shut this job-killing thievery down ASAP!

Doesn't this rest on the same (old) legal battle over "copying" vs "inspired by"? And don't we have lots of data points (case law) on this? At some point, if I (pre-AI) paint something that is obviously in the style of, say, Thomas Kinkade, and he sues me, the court will have to determine 1) which features of the art are covered by his intellectual property and 2) whether or not what I drew was close enough to one of his works to violate that. Same with any trademarked characters: you can draw Batman or Mickey Mouse with certain traits (that are public domain) but not with others (that are still covered). Any case has to analyze that and determine "this is 92% the same, and we think the threshold is 90%, therefore you're infringing." I recognize that's a nerdy computer-person way of thinking about it, but it boils down to the core of the case.

Except that for these AI image models, the judge/jury doesn't have to "think" about it - the software/model will tell you. For your example of the Ann Lotz image above, the model can take in 2 images and tell you how different they are, along a number of different axes! Difference in pixels, shades, even a numeric slider (perhaps a rough estimate, given the imprecision of models at this time) of how much of a given style or source material was incorporated. Certainly I think it's fair to say that the generated picture is sufficiently similar to the original to count as a recreation (and I think an AI image processor - perhaps even a facial recognition one - would label it as highly similar), but this is a thing that copyright infringement cases have always had to deal with.

Again, maybe it's my computer nerd reaction of "we've already designed an object oriented template for this type of case!" or "this case seems like everyone is pointedly ignoring that in order to generate lots of lawyer legal fees" but it seems like this is something that is already well established in previous cases. The change is that it's now software doing the remixing instead of the artist's brain, but I don't see how that changes the legal precedent. Then again, there's a reason that I Am Not A Lawyer.

You glossed over it, but for me the first question is whether this is all-or-nothing, whether Stable Diffusion *itself* is legal. Whereas this comment implies that specific images it produces may have different legal standing, which makes sense to me.

Perhaps Stable Diffusion is just a new tool, like a new paintbrush. I can produce whatever I want with it in the privacy of my own home, whether that's a Disney character or an "original" work whose influences are not immediately evident. But depending on what I made, that may constrain my ability to post it online, to sell it, etc.

I certainly hope no court is going to take the stance that SD itself (and therefore a lot of other things!) is illegal - it seems obvious to me that shouldn't be at stake, but it wouldn't be the first time the courts have surprised me.

author

I think there are two things you could say. One is that Stability AI infringed copyright by training SD with copyrighted images. The other is that SD is a derivative work of the training images. The former would make Stability AI liable but might not prevent re-distribution of SD. The latter would implicate both. I don't think either claim is obviously crazy.

I don't know a single human artist that was not influenced by copyrighted work. This is true for visual art. This is true for music. This is even true for writing code.

The first does seem crazy to me. "Training" on copyrighted images just physically means "having the images on your hard drive and running them through some software." That is definitely legal, right? It's legally equivalent to looking at copyrighted art while you're in art school, or watching copyrighted movies in film school. When they graduate, sure, they draw something that looks like a Picasso or direct a movie that is like a Bond movie, but surely that's not illegal? The legal crux of the issue is still "how close is it?"

The 2nd makes more sense, in that the software is like a work of art itself, but software is just 1's and 0's - I still think it's crazy. Perhaps the positions of those 1's and 0's were influenced by copyrighted material (just like the positions of the neurons in my brain), but producing art (even art that is generative art!) that has taken copyrighted work as inspiration cannot (in my humble opinion) be illegal by default - it has to be contingent on how similar the output is. If I print out 10 Thomas Kinkade paintings on my color printer, then put them in a blender and then splash them on a wall, Pollock-style, for some modern performance art installation, that does not violate the copyright. Unless I had a bad blender and it didn't sufficiently chop up one of the pictures and it's obvious to viewers what it was.

You are assuming that "taking inspiration" is an accurate description of what SD is doing. In fact, "inspired by" and "derived from" are two very different claims both artistically and legally.

Artistically and legally, I agree - but image generation tools of this type show us that the two concepts are perhaps on the same spectrum. Certainly I think there should be a clear legal line between the two, with "amount of source material incorporated" as a major component.

Comment deleted

First: let me say I'm only addressing legality. The philosophical and moral questions are a separate topic in my opinion.

Ok, you're starting out by saying "stealing", which is kind of the issue at hand: is it legally stealing? I'm a programmer too - I "steal" people's code all the time. What matters is how s - is it just a Stack Overflow post showing how to interact with some API that I use for an example - or is it something real, where I take thousands of lines of specific task-code and use it to make my company money?

AI has no legal existence. AI can take no legal actions. AI cannot steal, and it cannot not-steal. Humans - using tools - take legal actions. If I ask my tool for code and it is protected, and then I use it in some product against the copyright - then that's illegal, just like if I used some other tool called "copy-paste." It is legally the same principle. (in my opinion)

In the same way, SD is not the artist. It is a tool used by a human, like a paintbrush or Photoshop. Every piece of artwork created, by Da Vinci's paintbrush or SD's prompt system, is the creation/inspiration of some human, or a collaboration of some humans: SD's programmers and the prompter and yes, whoever created the original images that serve as the inspiration for both every human artist and SD's weighted models. Just like with all human-created art, for the purposes of "stealing" or "copyright infringement" a court will have to determine if the % of the work (yes, inherently subjective) is sufficiently high. I do not believe the process should be treated legally differently than a human artist using filters or other various tools in Photoshop.
