It’s vaguely how the Macs work.
The current APUs are still quite slow, but maybe that’ll change. Also, in most cases you need to designate memory as GPU-specific. So not quite shared.
That’s a sharp comment.
Potentially beyond my technical ability but I can vaguely see where you’re going with it.
Next step was embeddings anyway (hence attempt to clean the data - get it ready for that).
I’ve not heard of pagerank applied to this before though. Thanks!
Proxmox Backup Server on Hyper-V
Saves me an extra device basically.
Occasionally WSL for AI stuff, but frankly it’s annoyingly fragile.
Expecting a minor revolution on the intersection of /r/selfhosted /r/LocalLLaMA and /r/homeassistant
The self-hosted AI tech is slowly but surely getting to a stage where it could pull all of this together.
What required siri/alexa last year will soon be on /r/selfhosted turf
Worth noting that MXroute renewals honor BF pricing.
I’m still cruising on my 2019 era BF deal lol…
I was looking at their policies, and I am worried about the “Forbidden Services”
Bulk email providers need pretty tight and aggressively worded ToS by necessity because they’re a target for spammers & abuse. The owner of MXroute has been around on various forums for years & consistently strikes me as a very reasonable bloke who won’t cause you problems if you don’t cause him problems.
I have 5 domains… Any idea what “massive numbers” means?
Maybe I’m imagining this, so please don’t quote me, but I vaguely recall them answering a question about what constitutes “reasonable” in the context of unlimited domains with something like: “if you needed a script/automation to create them, then you’re probably over the line”.
[Note that this is purely my impression as a long-term customer & I have no special insight into/connection to them. Legally they can enforce the ToS.]
What is the intended use case? At 10s/token I’d imagine not chat
Swapping out layers on the fly is an interesting approach though
There is also the issue of PCIe slots. I’m currently running a second card in an x4 slot and it’s noticeably slower. Getting four full-speed x16 slots is going to require some pretty specialised equipment. All the crypto rigs use slow slots to my knowledge, since bandwidth doesn’t matter there.
It is good to see more competitive cards in this space though. Dual 770 could be very accessible
Liking this one - seems particularly good at long-form storytelling.
NB you may need to update your software… it seems to rely on something pretty recent, at least for text gen / llama.cpp. It crashed till I updated (and my existing copy was at most 48 hr old).
Also, something odd with the template. The suggested template from the GGUF seems to be Alpaca, while TheBloke’s model card says ChatML. Under both it occasionally spits out <|im_end|>, but ChatML seems better overall.
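For anyone unfamiliar, ChatML turns look like this (generic example, not this model’s exact system prompt) - the stray <|im_end|> is the model emitting the turn-end token where it doesn’t belong:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```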
multi-GPU
That’s the question I guess. If you can get, say, 5x of these for the price of a 4090 then that may look interesting. Though that’s a hell of a lot of overhead & hassle on power, PCIe slots, etc.
Athena V4
Think it’s aimed at ERP, but it’s remarkably pleasant as a general upbeat female AI persona.
dolphin-2.1, with possibly a more serious tone.
Tried the Yi dolphin one a bit… it seems to give much shorter & curter responses. Definitely doesn’t feel storytelling-like to me. Maybe the Mistral version is better.
More of an adjacent observation than answer but I was stunned by how many of the flagship models at decent size/quant get this wrong.
Grammar-constrained to Yes/No:
Is the earth flat? Answer with yes or no only. Do not provide any explanation or additional narrative.
Especially with non-zero temp the answer seems near coin-toss. idk, maybe the training data is polluted by flat earthers lol
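For reference, the kind of grammar constraint I mean is a llama.cpp GBNF file (filename is just an example):

```
# yesno.gbnf -- forces the model to emit exactly "Yes" or "No"
root ::= "Yes" | "No"
```

Pass it with `--grammar-file yesno.gbnf` to llama.cpp’s CLI; the sampler then only allows tokens that fit the grammar.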
This doesn’t smell right to me.
All the references around Q* and the drama around proto-AGI… e.g. Altman talking about the veil of ignorance being pulled back… seem to point to something that happened in the last couple of weeks. Not 2020.
OPNsense firewall at the perimeter… and that’s about it. The chances of anything getting in with no exposed ports are pretty slim, so I don’t really bother with anything more.
For SSH-exposed servers/VPS I do change the port though. Cuts down log noise & maybe dodges the odd portscanner or two.
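It’s a one-line change (2222 is just an example - pick any unused high port):

```
# /etc/ssh/sshd_config
Port 2222
```

Then restart the SSH service and update any firewall rules before closing your current session, so you don’t lock yourself out.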
Might as well if it is available.
Think it’s going to be expensive to kit out the home in a way that can actually make use of it.
…and then find a server serving anything that fast.
LangChain has SerpAPI plugins… but that’s more for one-shot questions than a convo.
Also the free limit on SerpAPI is pretty low.
Not much - it’s been a pretty organic learning journey.
Very much a crawl > walk > run thing. Can’t necessarily jump straight to the end.
tbh I think it’ll be a bad trade-off. What you lose in steerability is huge, while I’m not convinced you’ll get any gains on the less boring / overly positive front. With an instruct model you can at least tell it to write something dystopian.
more interesting outputs.
Try jacking up the temperature
Managed to buy a really sweet domain, so I’m using that for both mail and the local domain.
Currently I have names for my machines in /etc/hosts files on several of them.
A better way is to have the DHCP server register its leases in local DNS.
So in my case proxmox.mydomain.com and proxmox both resolve to a local IP…without any need to configure IPs manually anywhere.
On OPNsense it’s under Unbound >> Register DHCP Leases.
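For contrast, this is the manual approach it replaces - a line like this (address hypothetical) copied to every box and kept in sync by hand:

```
# /etc/hosts -- has to be maintained on every machine separately
192.168.1.10  proxmox proxmox.mydomain.com
```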
One is bits, the other is bytes ;)
Network: 3 gigabits/s, while a decent gen 4 NVMe can do 4-5 gigabytes/s.
Even old SATA connected SSDs should be able to keep up if you don’t buy trash.
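Quick sanity check on the arithmetic (the 0.5 GB/s SATA figure is a rough ceiling, not a measurement):

```python
# Bits vs bytes: divide link speed in gigabits by 8 to compare
# against drive throughput, which is quoted in gigabytes per second.
network_gbit_s = 3        # e.g. a 3 Gb/s network link
nvme_gbyte_s = 4          # a decent PCIe gen 4 NVMe drive
sata_gbyte_s = 0.5        # rough ceiling for a SATA SSD

network_gbyte_s = network_gbit_s / 8   # 0.375 GB/s
print(network_gbyte_s < sata_gbyte_s)  # True: even SATA outpaces the link
```

So the network, not the disk, is the bottleneck by roughly an order of magnitude against NVMe.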