One More Thing About Perplexity – Music Technology Policy

Perplexity’s motion to dismiss in its defense against the New York Times and Chicago Tribune tries to narrow the cases, not end them. (Chicago Tribune Co. LLC v. Perplexity AI Inc. (1:25-cv-10094) and The New York Times Co. v. Perplexity AI Inc. (1:25-cv-10106).) It asks the court to dismiss the publishers’ direct-infringement claims based on allegedly infringing answers, the contributory and vicarious infringement claims, and, in the Times case, the trademark claims. Perplexity says those counts fail at the pleading stage because the challenged outputs are generated only after a user prompt, so the relevant “volitional” act is a user’s request rather than Perplexity’s own conduct; because plaintiffs allegedly provide only a small number of output examples rather than work-by-work allegations across the full asserted catalog; because merely providing an automated tool is not enough to plead contributory liability; and because the Times’ trademark allegations do not plausibly show actionable use beyond nominative reference. Perplexity also says trimming those claims would leave the “core issue of fair use” for later stages.

As Perplexity describes its platform, it is an “answer engine” that interprets a user’s question, searches the internet in real time, gathers information from “top-tier” or other authoritative sources, and returns a concise, conversational summary with links to the underlying sources and related questions. It characterizes the process as automated retrieval-augmented generation: retrieve relevant documents, combine them with the prompt, and have an LLM produce a natural-language response. Perplexity insists the system is designed to synthesize facts, not simply regurgitate verbatim source text.

In its briefing and public framing, Perplexity also positions the case as a dispute about the future of search and access to information, not simply about copying. The company repeatedly characterizes itself as a next-generation search engine whose function is to help users locate and understand information scattered across the open web. In that framing, the publishers’ lawsuit is portrayed as an effort by large news organizations to extend copyright beyond expression and into control over factual information and reporting topics. Perplexity suggests that allowing such claims to proceed would risk turning copyright into a mechanism for gatekeeping facts—effectively letting major publishers assert quasi-exclusive control over the informational value of their journalism.

Perplexity’s narrative echoes a familiar technology-sector theme that goes back to early internet debates about linking and search indexing: facts are not copyrightable, and tools that help users discover and synthesize information are part of the infrastructure of the open web. From that perspective, the platform describes its summaries as transformative tools that help users navigate the internet more efficiently while still pointing readers to the underlying sources through links and citations. The implication is that the publishers’ theory would undermine the traditional role of search engines and other information intermediaries by treating automated summarization as unlawful appropriation rather than as a legitimate method of organizing and communicating publicly available knowledge.

Because, you know–information wants to be free.

Now I’m just a country lawyer from Texas and I freely admit that I am not as smart as these city fellers who work for Perplexity. But this just sounds like a huge heaping rasher of crap to me.

I asked Perplexity to do something that is distinctly unsearch-like: Write me a story about Philip Dru Administrator in the style of James Fenimore Cooper. That story does not exist. Here’s what I got back, not a blinking cursor, but rather a “fascinating mix”:

Perplexity responded by generating an original creative narrative, not by retrieving or synthesizing search results. This output raises significant tension with several of Perplexity’s core arguments in its motion to dismiss.

Perplexity’s Characterization of Its Own Technology

Throughout the motion, Perplexity characterizes its “answer engine” as a tool that “searches for, acquires, and synthesizes information from a wide range of relevant sources” and provides “efficient access to ideas and facts that no one owns.” Perplexity emphasizes that its system uses retrieval-augmented generation (“RAG”) to “cull facts in real time from numerous sources” and then “condenses and summarizes those facts using an LLM.” In its telling of its own tall tale, the answer engine “automatically generates an easy-to-understand, natural-language response, incorporating facts and ideas gathered from sources on the internet.” It stresses that users rely on Perplexity “to obtain up-to-date, succinct, factual answers to a nearly limitless number of questions.”

Perplexity also argues it is not directly liable for copyright infringement because it lacks “volitional conduct”: it says it is merely an automated system that passively responds to user prompts, analogous to Cablevision’s remote DVR, where “the copies at issue were ‘made automatically upon (a) customer’s command.’” The motion frames Perplexity as akin to a passive intermediary that “automatically undertakes actions that result in” outputs and argues that “the mere ownership, construction, or supervision of the machine or system will not establish volitional conduct.”

Why This Screenshot Is Potentially Inconsistent With Perplexity’s Motion

The Philip Dru output undermines Perplexity’s framing in several key respects:

This is not search or information retrieval. I did not ask Perplexity to find or summarize existing information. I asked Perplexity to write an original creative story in a specific literary style. The resulting output, a multi-paragraph fictional narrative with invented scenes, dialogue, and descriptive prose, is not “facts and ideas gathered from sources on the internet.” It is generative creative content, which is fundamentally different from the search-and-summarize function Perplexity describes in its motion.

It contradicts the MTD’s “factual answers” framing. Perplexity’s lawyers consistently describe its product as providing “factual answers” and “access to ideas and facts.” But writing a story in another author’s style is not answering a factual question; it is an exercise of creative generation that goes well beyond the retrieval-and-synthesis paradigm Perplexity presents to the Court. It’s almost…whatchamacallit…deceptive.

It complicates the “passive conduit” / Cablevision analogy. Perplexity’s volitional conduct argument rests on the premise that it is like a DVR system that merely copies pre-existing content at a user’s direction.  But in the Philip Dru example, Perplexity is not copying or retrieving any pre-existing content — it is creating new expressive content. In fact, it is creating new expressive content that by definition could not exist, since Cooper died long before Philip Dru Administrator was published. Call me crazy, but the capacity to generate original creative works suggests a system with considerably more agency and volition than a passive copying mechanism. Perplexity’s argument that it engages in “no more volitional conduct than Cablevision did when it curated programming available for copying” is harder to sustain when the system is generating novel prose rather than retrieving and reproducing existing material.  No RAG search is going to return that product because it does not exist, or it didn’t until Perplexity wrote it.

It undermines the “user presses the button” narrative. Perplexity argues that “direct liability attaches only to ‘the person who actually presses the button’” and that users are the proximate cause of any output.  But when Perplexity generates an original story, the creative choices — the prose style, the invented dialogue, the narrative structure — are made by Perplexity’s system, not by the user. I just provided a prompt, but the expressive content of the response originates from Perplexity’s LLM. This makes it harder, if not impossible, to argue that Perplexity is merely a conduit executing user commands without any independent creative or volitional role.

Practical Significance

To be clear, Perplexity’s motion is specifically focused on the narrow question of whether the Complaints adequately allege direct infringement for outputs that allegedly reproduce Plaintiffs’ copyrighted content.  The Philip Dru screenshot does not directly implicate the copyright claims at issue. However, it is the kind of evidence that could be used to challenge Perplexity’s broader characterization of itself as a search-like tool lacking agency or volition, which is central to its legal theory. If Perplexity’s system can be prompted to generate original creative works — rather than merely retrieve and summarize existing information — that capacity undercuts the analogy to passive automated systems like Cablevision’s DVR and suggests the system exercises a degree of creative agency that is difficult to square with the “passive conduit” narrative.

Bond, James Bond

Then I asked Perplexity to rewrite the Cooper story in the style of Ian Fleming. Again, no blinking cursor. My idea was an “excellent choice,” which it was, but not for the reason Perplexity thought:

The Ian Fleming revision deepens the tension with Perplexity’s self-characterization in several important ways that go beyond even the first Philip Dru example.

The “Revision” Problem

In the motion, Perplexity’s lawyers describe their client’s answer engine as a tool that “searches for, acquires, and synthesizes information from a wide range of relevant sources” using the retrieval-augmented generation we’ve discussed before to “cull facts in real time from numerous sources” and then “condenses and summarizes those facts using an LLM.” This is likely a RAG pipeline that draws on both closed vector indexes and live retrieval. Vector indexes are databases of numerical embeddings representing previously ingested documents that allow an AI system to quickly retrieve semantically similar text, while live retrieval pulls fresh web pages in real time through a search engine or crawler before feeding them to the model.
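For readers who want to see the vector-index idea in miniature, here is a rough sketch. It is my own toy illustration, not Perplexity's code: the word-count "embedding," the `ToyVectorIndex` class, and the `build_prompt` helper are all hypothetical names made up for this example, and a real system would use learned dense embeddings and an actual LLM call.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector.
    Real systems use learned dense vectors instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorIndex:
    """Hypothetical index: stores each ingested document with its vector."""

    def __init__(self, documents):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def search(self, query, k=2):
        """Return up to k documents with nonzero similarity, best first."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), doc) for doc, vec in self.entries]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


def build_prompt(question, retrieved):
    """Combine retrieved text with the user's question for the model."""
    context = "\n".join(f"- {doc}" for doc in retrieved) or "(no sources retrieved)"
    return f"Sources:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    index = ToyVectorIndex([
        "The court dismissed the copyright claim",
        "The recipe calls for two cups of flour",
        "A remote DVR records television programs",
    ])
    question = "copyright claim in court"
    print(build_prompt(question, index.search(question)))
```

In this toy version, a query only pulls back documents that already sit in the index and share words with the question. When nothing matches, the prompt goes to the model with no sources at all, and whatever comes back has to come from the model itself.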

The system is presented by the lawyers as one that “automatically generates an easy-to-understand, natural-language response, incorporating facts and ideas gathered from sources on the internet.” Users supposedly “rely on Perplexity’s AI tool to obtain up-to-date, succinct, factual answers to a nearly limitless number of questions.”

But asking Perplexity to revise its own previously generated original story in the style of a different author has nothing whatsoever to do with any of these described functions. There is no internet search happening, no retrieval of facts, no synthesis of source material, and probably no RAG because there’s nothing to update. The system is performing a purely generative, creative task (rewriting prose in the distinctive style of Ian Fleming) that is entirely untethered from the search-and-summarize paradigm Perplexity’s very smart lawyers present to the Court.

Specific Inconsistencies in the MTD

No RAG, no retrieval, no “facts.” Perplexity’s motion emphasizes the RAG process as central to how the answer engine works: the system receives a prompt, retrieves “relevant content,” combines the input with retrieved documents, and provides the combined data to an LLM. But when a user asks Perplexity to revise a story it already wrote in the style of a different author (here, Ian Fleming), there is no factual retrieval step. There is nothing to retrieve this time, either. The system is exercising literary judgment: analyzing stylistic elements associated with Fleming’s prose (short declarative sentences, brand-name specificity, understated violence, etc.) and applying them to reshape existing text. That is creative authorship, not information retrieval.

It demonstrates iterative creative agency. My Philip Dru example already showed Perplexity generating original creative content. But the Ian Fleming revision goes further: it shows Perplexity editing its own creative work based on an aesthetic directive. This is not a system that passively “executes” a user command the way a DVR records a pre-existing television program. It is a system making a series of complex creative choices (what to keep, what to change, how to modulate tone, diction, and pacing derived from its own work) that are attributable to the system itself, not to me. I said “revise in the style of Ian Fleming”; every particular creative decision about how to do that was made by Perplexity’s LLM with no prompting from me.

It undermines the Cablevision analogy even more forcefully. Perplexity argues it “engages in no more volitional conduct than Cablevision did when it curated programming available for copying, and provided ‘access to a system that automatically produces copies (of that content) on command.’” But Cablevision’s DVR made copies of pre-existing television programming. It did not rewrite a sitcom in the style of a different showrunner. The Ian Fleming revision demonstrates that Perplexity’s system is not merely copying or retrieving anything; it is performing a sophisticated creative transformation. That is a fundamentally different kind of “automated” activity than Cablevision’s DVR, and it makes the analogy significantly harder to sustain. In fact, you might even say it’s not automated at all.

It complicates the “user presses the button” narrative. Perplexity’s motion insists that “direct liability attaches only to ‘the person who actually presses the button’” and that a user’s prompt is the proximate cause of the output.  But I just typed “revise in the style of Ian Fleming.” I did not make any of the substantive creative decisions that went into the revised text. I did not decide which sentences to restructure, which adjectives to swap, or how to modulate the narrative voice. All of those expressive choices — the very substance of the output — originated from Perplexity’s system. The gap between my instruction and the system’s creative output is far wider than the gap between a Cablevision customer pressing “record” and the DVR making a bit-for-bit copy of a broadcast.

The Broader Implication

What the Ian Fleming revision (and I could keep that up all day) illustrates is that Perplexity’s answer engine cannot possibly be merely a search tool or a factual retrieval system. It is a general-purpose generative AI capable of original creative authorship, stylistic mimicry, and iterative revision of its own creative work. Perplexity’s motion to dismiss carefully frames the product as a “novel and groundbreaking AI tool” that helps users “obtain up-to-date, succinct, factual answers” by “gathering insights from top-tier sources.” But examples like mine reveal capabilities that go far beyond the lawyers’ framing to the Court, capabilities that are difficult to reconcile with the characterization of Perplexity as a passive, automated conduit lacking volitional conduct.

While this example, like the first one, does not directly involve the reproduction of the plaintiffs’ copyrighted content at issue in the case, it provides additional ammunition for plaintiffs to argue that Perplexity’s self-description is strategically incomplete and that the system exercises a degree of creative agency that belies the “passive intermediary” framing central to its motion to dismiss. 

So those smart city fellers were likely describing somebody’s product, just not Perplexity’s.

And There’s Just One More Thing That’s Been Bothering Me

There’s something about these examples that is nagging me. The ability of Perplexity’s system to write in the distinctive styles of James Fenimore Cooper and Ian Fleming — and to revise between those styles on command — strongly implies that the underlying LLM was trained on a broad corpus of literary texts, including works by those authors. This has significant implications in light of the arguments Perplexity makes in its motion.

What the Motion Says (and Doesn’t Say) About Training Data

Perplexity’s motion is notably careful — dare I say even evasive — about the question of training data. It acknowledges, by quoting the complaints, that the LLMs “upon which Perplexity’s AI products are built” can “generate answers to questions about information that is included in their training data” and that “

Perplexity expressly notes that its motion does not challenge the Plaintiffs’ Count I claims, which target “Perplexity’s alleged own — i.e., volitional — acts with respect to using Plaintiffs’ works as inputs to develop its answer engine.” The training data issue is, in other words, deliberately sidestepped.

What the Philip Dru and Ian Fleming Examples Reveal

The capacity to perform these tasks is strong circumstantial evidence of broad-based training on copyrighted literary works, for several reasons:

Stylistic mimicry requires ingestion of the source material. To write “in the style of Ian Fleming,” the LLM must have internalized Fleming’s distinctive prose characteristics: the clipped, declarative sentences, the brand-name specificity, the detached observations of violence, the sensory precision. That knowledge does not come from RAG retrieval of internet search results in real time. It comes from having been trained on a substantial sample of Fleming’s prose, which consists almost entirely of the copyrighted James Bond novels and short stories. The same is true for James Fenimore Cooper’s ornate, 19th-century descriptive style.

Knowledge of obscure works implies breadth of training corpus. Philip Dru: Administrator is a relatively obscure 1912 novel by Colonel Edward Mandell House, a political advisor to Woodrow Wilson. He ain’t no James Bond. The fact that Perplexity’s system can generate original fiction featuring this character suggests the LLM was trained on a corpus broad enough to include this text. If the training data encompasses a work this obscure, it almost certainly encompasses a vast range of more prominent literary works as well.

The revision capability implies deep stylistic encoding. The fact that the system can revise its own output from one author’s style to another’s demonstrates that it has not merely memorized surface-level patterns but has developed what amounts to an internal model of each author’s distinctive voice. This level of stylistic sophistication is consistent with training on large volumes of each author’s work — not merely a sentence or two referenced in a Wikipedia article about the author.

The Bartz Connection

In Bartz v. Anthropic, the core allegation is that Anthropic trained its Claude model on a massive corpus of copyrighted books, enabling it to reproduce and generate content derived from those works. The Philip Dru and Ian Fleming examples would support an analogous inference here: that the LLM underlying Perplexity’s answer engine was trained on a similarly broad corpus of copyrighted literary texts. This matters for several reasons:

It reinforces the plausibility of Plaintiffs’ Count I claims (which Perplexity does not challenge in this motion) that Perplexity used copyrighted works as training inputs.  If the system can mimic specific authors’ styles, those authors’ works were very likely in the training data — and so, presumably, were Plaintiffs’ journalistic works. Just a guess.

It also undermines Perplexity’s framing of its product as primarily a retrieval tool. The motion emphasizes that the answer engine “searches for, acquires, and synthesizes information from a wide range of relevant sources” using RAG.  But the Philip Dru and Ian Fleming examples demonstrate capabilities that have nothing to do with retrieval and everything to do with the generative power of an LLM trained on a broad corpus of copyrighted texts. The RAG-centric description of the product is, at best, incomplete.

But I almost forgot…that nagging loose end that’s been bothering me, that I just can’t quite resolve in my simple mind, is that it complicates the volitional conduct argument, don’t it? Just a bit. If Perplexity’s LLM was trained on copyrighted works — and that training is what enables it to generate stylistically distinctive creative prose on request — then the “volitional conduct” at issue is not just the automated response to a user query. It also includes Perplexity’s deliberate choice to train on those works, which is the foundation for the system’s generative capabilities. Wouldn’t that be Perplexity’s deliberate commercial choice? Perplexity tries to cabin the volitional conduct analysis to the moment of output generation, but examples like mine inexorably draw the Court’s attention back to the training process, where Perplexity’s own choices — about what data to ingest and how to train the model — are unambiguously volitional. That was it, I thought I was almost done but then this other loose end just kept nagging me.

Just a Simple Mind for a Simple Man

The Philip Dru and Ian Fleming examples are consistent with the inference that Perplexity’s LLM was trained on a broad and diverse corpus of copyrighted literary works, much like the training practices alleged in Bartz v. Anthropic. While Perplexity’s motion strategically avoids the training data question by focusing only on the output-side claims (Counts II and III), these examples highlight the difficulty of maintaining a clean separation between the “input” and “output” sides of the system. The generative capabilities on display — stylistic mimicry, creative authorship, iterative revision — are the direct product of training choices, and they undercut Perplexity’s effort to characterize its answer engine as a passive, retrieval-oriented tool.  And that could make you nervous enough to start playing with a loafer tassel.

Or at least that’s the way I see it and I’m not as smart as the city fellers. I’m probably wrong.
