{"id":6894,"date":"2025-03-28T07:52:25","date_gmt":"2025-03-28T12:52:25","guid":{"rendered":"https:\/\/garyborders.com\/pages\/?p=6894"},"modified":"2025-03-28T07:52:25","modified_gmt":"2025-03-28T12:52:25","slug":"shadow-library-apparently-pirated-my-book","status":"publish","type":"post","link":"https:\/\/garyborders.com\/pages\/shadow-library-apparently-pirated-my-book\/","title":{"rendered":"&#8216;Shadow Library&#8217; Apparently Pirated My Book!"},"content":{"rendered":"<p>I was sitting in a Dallas doctor\u2019s office the other day, reading <em>The Atlantic<\/em> on my phone since I forgot to bring a book. This was about a week before that venerable magazine \u2014 founded in 1857 \u2014 made headlines when its editor-in-chief was accidentally added to a group chat on Signal, an encrypted messaging app, in which key members of the administration\u2019s team, including the vice president, CIA director, director of national intelligence and secretary of defense, were planning an attack on Houthi rebels in Yemen. Elect a clown. Expect a circus.<\/p>\n<p><em>The Atlantic<\/em> is now owned by Laurene Powell Jobs, widow of Apple co-founder Steve Jobs and a noted philanthropist. I have subscribed to it for decades and happily continue to do so. We need strong, independent journalism more than ever.<\/p>\n<p>The <a href=\"https:\/\/www.theatlantic.com\/technology\/archive\/2025\/03\/libgen-meta-openai\/682093\/\">article<\/a> that got my attention was titled <em>The Unbelievable Scale of AI\u2019s Pirated-Books Problem<\/em>, by Alex Reisner. He is a computer programmer who has written extensively about generative artificial intelligence, which was made famous by ChatGPT and is now used in search engines, customer service and elsewhere. It has rapidly become ubiquitous. 
For example, if you use Google to look up something, the answer at the top of the page is often generated by that company\u2019s version of AI.<\/p>\n<p>I am OK with that. I often use ChatGPT in research, carefully checking the citations and links it provides. I don\u2019t use it to <em>write<\/em> these pieces. That would be cheating.<\/p>\n<p>What I and many others take issue with is how these large language models are being trained. Reisner\u2019s article points out that court documents he obtained indicate that Meta, the company that owns Facebook, pirated millions of books and research papers to train its flagship AI model, Llama 3. It did so by using LibGen, a notorious buccaneer of copyrighted materials. Founded in Russia, it is known as a \u201cshadow library.\u201d As one of several lawsuits against LibGen <a href=\"https:\/\/howtobe247.com\/libgen-publishers-sue-infamous-shadow-library-over-pirated-books\/#:~:text=It%20was%20ordered%20to%20shut,still%20accessible%20through%20alternate%20domains.\">states<\/a>:<\/p>\n<p><em>\u201cLibgen enables users to download, for free, fiction and non-fiction books (among other types of works), including educational textbooks, instead of buying or renting lawful copies or checking them out from a legitimate library. Defendants have absolutely no legal justification for what they do and operate in complete and knowing defiance of the rule of law.\u201d<\/em><\/p>\n<p>Rather than paying authors to use their work, LibGen steals it. Meta reportedly decided to use LibGen to train Llama 3. Now Meta is being sued by several authors, including Sarah Silverman and Junot D\u00edaz, for copyright infringement. OpenAI, which owns ChatGPT, is also being sued for copyright infringement by the Authors Guild, <em>The New York Times<\/em> and others. 
<a href=\"https:\/\/garyborders.com\/pages\/shadow-library-apparently-pirated-my-book\/libgen-image\/\" rel=\"attachment wp-att-6896\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-6896 alignright\" src=\"https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/LibGen-image-300x220.jpg\" alt=\"\" width=\"300\" height=\"220\" srcset=\"https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/LibGen-image-300x220.jpg 300w, https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/LibGen-image-600x440.jpg 600w, https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/LibGen-image.jpg 668w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a>(After the <em>Atlantic<\/em> article was published, OpenAI said the models powering ChatGPT <em>now<\/em> weren\u2019t developed using the LibGen datasets, and that it hasn\u2019t used them since 2021.) So, while I use ChatGPT and accept that AI is going to increasingly be part of the intellectual fabric of society, these companies are making billions of dollars in profit off other people\u2019s work. The originators of the works should be compensated. I don\u2019t know how that would pan out but am hopeful the courts will side with the creators.<\/p>\n<p>Reisner also provided a search bar in his article so one could search for an author in LibGen, with some caveats. There is no way of knowing what specific content Meta and OpenAI used to train their models. Just because a particular title is in LibGen doesn\u2019t necessarily mean it was used to train one of those AI models.<\/p>\n<p>With that caveat in mind, I typed my name into Reisner\u2019s search bar. A total of 36 results popped up, most having nothing to do with me. 
But the top two results were a book I wrote that was published 19 years ago by the University of Texas Press: <em>A Hanging in Nacogdoches: Murder, Race, Politics, and Polemics in Texas\u2019s Oldest Town, 1870-1916.<\/em><\/p>\n<p>LibGen has stolen my book! Could I be entitled to compensation, as the lawyer commercials say? Doubtful. I don\u2019t even know who to sue. LibGen is pretty shady as to ownership. From what I gather, it constantly changes domains and mirror sites to evade lawsuits from publishers, such as its chief foe Elsevier, a major publisher of academic journals and books.<\/p>\n<p><a href=\"https:\/\/garyborders.com\/pages\/shadow-library-apparently-pirated-my-book\/hanging-cover-for-col\/\" rel=\"attachment wp-att-6895\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-6895 alignright\" src=\"https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/Hanging-cover-for-col-300x172.jpg\" alt=\"\" width=\"300\" height=\"172\" srcset=\"https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/Hanging-cover-for-col-300x172.jpg 300w, https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/Hanging-cover-for-col-600x344.jpg 600w, https:\/\/garyborders.com\/pages\/wp-content\/uploads\/2025\/03\/Hanging-cover-for-col.jpg 672w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> The Hanging book (my shorthand title) is still in print, thanks to print-on-demand technology. It still sells regularly in Nacogdoches, usually to tourists drawn to the title. The Nacogdoches T-Shirt Company (formerly Bosslight) kindly stocks it. It can be found on Amazon and other book sites.<\/p>\n<p>The Hanging books can also be found in a box in my garage. I buy a box a year wholesale from UT Press to keep the Nacogdoches store stocked and to fill (very) occasional requests from someone who wants a signed copy. 
Every October, UT Press deposits a royalty payment in my bank account, usually enough to take my Beautiful Mystery Companion out for Mexican food. The royalty check won\u2019t cover the margaritas, however.<\/p>\n<p>What the heck? I asked ChatGPT if it uses LibGen. Here was its response:<\/p>\n<p><em>LibGen is an unauthorized repository of copyrighted content, including books and academic papers, often shared without the permission of publishers or authors. OpenAI, the company behind ChatGPT, follows strict legal and ethical guidelines and does <strong>not<\/strong> use data from illegal or pirated sources like LibGen in training or in generating responses.<\/em><\/p>\n<p><em>Instead, ChatGPT is trained on a mixture of licensed data, publicly available information, and data created by human trainers. When it comes to academic or scientific topics, it provides summaries or explanations based on general knowledge, not from pirated PDFs or paywalled articles.<\/em><\/p>\n<p>OK, then. I guess the courts eventually will have the final say.<\/p>","protected":false},"excerpt":{"rendered":"<p>I was sitting in a Dallas doctor\u2019s office the other day, reading The Atlantic on my phone since I forgot to bring a book. 
This was about a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[81,38],"tags":[],"class_list":["post-6894","post","type-post","status-publish","format-standard","hentry","category-81","category-columns"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/posts\/6894","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/comments?post=6894"}],"version-history":[{"count":2,"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/posts\/6894\/revisions"}],"predecessor-version":[{"id":6898,"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/posts\/6894\/revisions\/6898"}],"wp:attachment":[{"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/media?parent=6894"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/categories?post=6894"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/garyborders.com\/pages\/wp-json\/wp\/v2\/tags?post=6894"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}