(attempt to cross-post from /c/programming)

Idea: Scrape all the posts from a subreddit as they’re being made, and “archive” them on a Lemmy instance, making it very clear that each one is rehosted and linking back to the original. It would probably have to be a “closed” Lemmy instance set up specifically for this purpose. The tool would run for multiple subreddits, so Lemmy users could still keep up with and discuss any content that gets left behind.
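To make the “very clear it’s being rehosted” part concrete, here is a rough Python sketch of how a scraped post might be turned into an archive entry. It does no scraping or posting itself. `SourcePost`, `build_archive_post`, the wording of the notice, and the 200-character title cap are all made up for illustration, and the `name`/`url`/`body` keys are only a guess at what a post-creation payload could look like, so they would need checking against the real Lemmy API.

```python
from dataclasses import dataclass


@dataclass
class SourcePost:
    """Hypothetical container for the fields a scraper would collect."""
    subreddit: str
    title: str
    body: str
    permalink: str  # path such as "/r/example/comments/abc123/some_title/"


def build_archive_post(post: SourcePost) -> dict:
    """Build an archive entry that is clearly labelled and links back."""
    original_url = "https://www.reddit.com" + post.permalink
    notice = (
        f"**Archived copy.** Originally posted in r/{post.subreddit}. "
        f"[View the original thread]({original_url})."
    )
    return {
        # Assumed title cap of 200 characters; adjust to the instance's limit.
        "name": f"[r/{post.subreddit}] {post.title}"[:200],
        "url": original_url,
        # The notice goes first so readers see it before the copied text.
        "body": notice + "\n\n---\n\n" + post.body,
    }
```

Putting the notice at the top of the body and also setting the link to the original means the attribution survives even if a client only shows one of the two.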

Thoughts? It’s probably iffy copyright-wise, but I think I can square my conscience with it.

  • piezoelectron@sopuli.xyz
    1 year ago

    Hey I LOVE this idea! I had it myself but I can’t code for nuts, so glad to see someone else trying it out.

    Question: how can we follow your progress? Are you thinking of creating a dedicated community/website to share updates? If one already exists then do let me know, I’d love to stay connected.

    EDIT: As for copyright concerns… if the goal is to preserve information, then maybe the script could pseudonymise usernames. Or even remove usernames completely, as we’re focusing on the comments.

    I prefer pseudonymising, as you can replace real usernames with fake ones, so that it’s still possible to follow who’s replying to whom within the context of a comment thread.
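One possible way to do the pseudonymising, sketched in Python: derive each fake name from a keyed hash of the real one, so the same user always gets the same pseudonym and reply chains stay readable. The `pseudonymise` function, the `user_` prefix, and the 10-character length are arbitrary choices for illustration, and the secret key is assumed to be kept private by whoever runs the archive.

```python
import hashlib
import hmac


def pseudonymise(username: str, secret_key: bytes) -> str:
    """Map a real username to a stable fake one.

    Uses HMAC-SHA256 so the mapping cannot be rebuilt by simply
    hashing a list of known usernames without the secret key.
    """
    # Lowercase first so different casings of one name map to one pseudonym.
    digest = hmac.new(
        secret_key, username.lower().encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return "user_" + digest[:10]


# Usage: the same author keeps the same pseudonym across a thread.
key = b"replace-with-a-private-random-key"
thread = [("alice", "First!"), ("bob", "Replying to the above."), ("alice", "Thanks.")]
anonymised = [(pseudonymise(author, key), text) for author, text in thread]
```

Dropping usernames entirely would be simpler, but then two comments by the same person could no longer be told apart from comments by different people.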