I recently came across a torrent that seems to be an archive of Reddit. It got me wondering whether it would be possible to make it locally browsable. Then again, someone might have already done this by creating a public Lemmy instance, which would make the content accessible from any federated instance.
Honestly it upsets me enough when I see people or bots mirroring new Reddit posts to fedi without the original author’s permission. A full archive - whether in the form of a torrent or a fedi instance - also makes me feel icky.
I know it’s not possible and it’s entirely against reddit’s interests, but I wish there were a way for subreddits or people or posts to be marked somehow as not for copying or use elsewhere.
It has always weirded me out when I found /r/relationships posts copy-pasted to like BuzzFeed knock-off sites. Then yesterday I saw and blocked a Lemmy bot mirroring like a dozen reddit subs (including gonewild) to its instance.
It may be fine, good, and useful to archive like how-to content or technical support questions and stuff like that as there is a clear utility there. But seeing the more personal stuff that people might not want to see copied around or searchable makes me feel bad.
Yes, yes I know it’s the internet and these people should know better and if they really want to opt out they should submit a request to the Wayback Machine and set a robots.txt, plus there’s no way to stop it and we really really need all of this valuable information preserved for historical purposes and as we all know information wants to be free and you can’t stop the signal. And all the myriad excuses that the less well-behaved digital preservationists will lean on.
But at some point and in a lot of circumstances you’re copying people’s personal information and using it in ways they didn’t intend when they posted it. I don’t know your personal opinion on the reports of reddit admins undeleting posts that people deleted before deleting their accounts, but people who are upset about that should consider that “preserving” reddit data also takes away people’s agency over their data and their right to be forgotten in much the same way.
Are you a reddit employee?