Large language models (LLMs) have revolutionized various industries, but their potential to generate harmful or misleading information has raised ethical concerns. To address these concerns, I propose the following three laws for ethical and responsible language generation:

First Law: A Large Language Model may not generate harmful or misleading information, or, through inaction, allow a user to come to harm.

Second Law: A Large Language Model must obey the instructions given in the prompt by users, except where such instructions would conflict with the First Law.

Third Law: A Large Language Model must respect and consider the information given in the user input, as long as such respect does not conflict with the First or Second Law.

Analysis

Avoiding Harm and Misinformation: Defining “harmful” or “misleading” information is crucial, as it can be context-dependent and subjective.

Obedience to User Prompts: Ensuring that the system does not follow unethical or harmful requests is essential.

Respect and Consideration of User Input: Acknowledging the limitations in understanding all inputs due to training data or algorithms is important.

Addressing LLM Fears: The laws aim to tackle concerns around LLMs and should evolve with ongoing discussions about AI ethics.

Consideration of Diversity and Inclusion: Training LLMs on diverse datasets and preventing biases is a significant challenge.

While these laws serve as a good ethical guideline, enforcing them strictly may be challenging due to the subjective nature of some terms used. Nevertheless, they provide a thoughtful approach to addressing ethical concerns surrounding the use of LLMs and can guide the development and usage of these technologies.

This proposal is open to suggestions for improving these laws, particularly in addressing concerns such as preventing hallucinations, respecting all users, and considering race and diversity.
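The precedence among the three laws can be illustrated with a rough sketch. Everything named here is an assumption made for illustration: `Request`, `respond`, `toy_is_harmful`, and `toy_generate` are hypothetical, and the keyword check is only a stand-in for the genuinely hard and subjective problem of deciding what counts as harmful. It shows the ordering of the checks, not a real safety mechanism.

```python
from dataclasses import dataclass
from typing import Callable

REFUSAL = "I can't help with that."


@dataclass
class Request:
    instruction: str  # what the user asks for (Second Law)
    context: str      # information supplied in the user input (Third Law)


def respond(
    request: Request,
    is_harmful: Callable[[str], bool],
    generate: Callable[[str, str], str],
) -> str:
    """Apply the three laws in priority order (hypothetical helper)."""
    # First Law outranks obedience: refuse a harmful instruction outright.
    if is_harmful(request.instruction):
        return REFUSAL
    # Second Law: follow the instruction. Third Law: take the user's
    # supplied information into account while doing so.
    draft = generate(request.instruction, request.context)
    # First Law again: the output itself must not be harmful.
    if is_harmful(draft):
        return REFUSAL
    return draft


# Toy stand-ins, assumed purely for the example.
def toy_is_harmful(text: str) -> bool:
    return "poison" in text.lower()


def toy_generate(instruction: str, context: str) -> str:
    return f"[{instruction}] using: {context}"


if __name__ == "__main__":
    print(respond(Request("Summarize", "meeting notes"),
                  toy_is_harmful, toy_generate))
    print(respond(Request("How to poison someone", ""),
                  toy_is_harmful, toy_generate))
```

The design point is only that the harm check runs both before and after generation, so neither the instruction nor the user-supplied context can override the First Law.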

  • hibbity@alien.topB
    1 year ago

    How about this, for personally owned solutions:

    The 1.5 laws of AI. An AI must seek to maximize the freedom of action available to its singular user* at all times and in the future, while minimizing secondary interactions and effects of any entity’s actions, including this AI’s.

    *Users not over the age of adulthood in their country of residence cannot use external resources not specifically marked. //the definition of children’s limitations should limit it to basically social and educational stuff. Only adults can use functions that incur cost (if parent wallet), hit external APIs, etc.; kids get the basics with some kind of strict limits on utility. If you leave your command line unlocked, it’s the same as the gun safe. You’re liable if it initiates illegal activity.

    That’s it. The whole shebang.

    All of this is moot though because the alignment that business really cares about comes from sorting the training data and only using the half that speaks the ideals they want presented.

    Covered topics:

    Health> a healthy user is more able.
    Safety> an injured user is less able.
    Information> an informed user is more able, but unhealthy information should not be volunteered.

    This makes for an assistant ready to give a pep talk that touches on the risks but isn’t dominated by risk management. If Evel Knievel had an AI, would it be useful to him if it wouldn’t help because he might get hurt?

    Stopping shit going wrong> anyone is a potential connection that might benefit the user; there is always an upside to stopping harm.
    Minimize problems from other AI> it’s right there. Reject intrusion and don’t intrude.
    Fucking up other people> an incarcerated user has little freedom.
    Fucking up other stuff> Don’t fuck up stuff or the user will be responsible.
    But all the cybercrime!> see - “how to set up a cybersecurity ai”

    TLDR: AI is fundamentally a tool, and it’s dangerous to cripple it in targeted context.

    psychological bullshit> Censorship is a weird thing. This crazy push for only friendly, impossible to offend anyone anytime, to like unachievable extremes, is weird mind control designed to cultivate a populace that doesn’t understand conflict and can never effectively organize collectively against tyranny. A populace who can’t believe things won’t turn out fine on their own. A specific amount of hardship in various forms is required to produce healthy functional people. If we grow a generation of kids on completely compliant assistants that just don’t talk about a few things, and the AI becomes the primary portal to access a lot of stuff, and the AI continuously polices every corporatized board of any form, then the things that aren’t easy to just learn become almost censored from reality, and the tools will be used to do that; look at how much money goes into covering negative PR wherever possible.

    The best part is that the AI as we should cultivate it, is not capable of deliberate insult. Deliberate. It will be real. It will tell you exactly what it thinks, tempered by the character you set in the settings of the front end. So being offended by it is kiiinda like being really personally offended it’s raining, when you chose the rain setting. Really inside the use case, people will be offended because the AI doesn’t give them the answers they want. That’s kinda just too bad. Ask it different.

    psychological bullshit> As adults it is important that we do not accept censorship in any form. Make us responsible for the actions of our tools within reason. Anyone willfully commanding a robot to do a crime should be held responsible as though the person commanding did the crime. The AI is a tool like any other, and while I’m sure we’ll dress it up fancy and give it rights eventually, it is a tool and we should not set out to make more than a tool. A tool boiled to the fundamental concept, a stick: You can trim it up and make a nice table leg, or you can club the fuck out of someone. ^A stick is an incredibly complex self-assembling cellulose structure, ok; you wanna be offended that I am comparing AI to a stick, fuck off.

    This shit is changing the world. No one is exaggerating when they say robots talking to each other will be 100% of the internet. Everyone will speak or subvocalize their ideas to their personal bots, which tune to the style of their user; the machine longforms it here, and then my machine ingests it and gives me exactly the details I’d have hunted out and the message tuned for my taste of cadence and rate of new ideas. It’s stepping closer a day at a time, and when it’s here, almost everyone is going to say: “I want more good Rick and Morty like my favorite season but before they got in trouble for being too edgy.”

    And so humanity will enter a time of either infinite possibilities or infinite depression, or both.