• 0 Posts
  • 9 Comments
Cake day: October 27th, 2023

  • There’s no real growth opportunity for you at this job. No way for you to learn / level up anything but your confidence and independence.

    It’s unusual, but not that unusual on the lower pay end. Normally a company would hire someone a little more senior to do the unsupervised-team-of-one thing. But they went cheap.

    It sounds like they’ve made a bet that this product is worth $X, and doing it faster, better, or more robustly won’t impact that.

    It’s probably better for you to finish the product and leave, but only for the story and the closure. You should be searching for your next job now. Find something where you’ll be part of a team - you’ll learn a lot more.

  • When I did my PhD, starting around 20 years ago, things were different. Feature engineering within a data domain was a pretty common way to specialize. Neural networks were old-fashioned function approximators that went out with shoulder pads. The future was structured Bayes nets, or whatever was going to replace conditional random fields.

    I’d listen to my PIs talk about how things were different - how I was leaning too much on the power of models to just learn things, and I had to focus more on the precise choice of model. When I pointed out that given the right feature engineering, the models basically performed the same, they’d dismiss that, saying I was leaving a lot on the table by not deriving a more fit-to-purpose model.

    These days, I watch the junior modelers I supervise work, and I urge them to at least look at the data, because you can’t really understand the problem you’re working on without getting your hands on the data. But they’d rather focus on aggregate performance metrics than examine their outliers. After all, with large datasets, you may not even be able to look at all your outliers to recognize patterns. And how could you do as good a job as one of today’s giant networks?
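
    Concretely, the habit looks something like this minimal sketch - the data, the planted failure mode, and the thresholds are all made up for illustration:

      import numpy as np

      rng = np.random.default_rng(0)

      # Synthetic regression output: mostly-good predictions plus a planted
      # failure cluster (assumed, purely for illustration).
      y_true = rng.normal(0.0, 1.0, 1_000)
      y_pred = y_true + rng.normal(0.0, 0.1, 1_000)
      bad = rng.choice(1_000, size=10, replace=False)
      y_pred[bad] += 3.0  # a small cluster the aggregate metric will gloss over

      # The aggregate view: one comforting number.
      print(f"MSE: {np.mean((y_true - y_pred) ** 2):.3f}")

      # The outlier view: pull the worst cases and actually read them.
      err = np.abs(y_true - y_pred)
      for i in np.argsort(err)[::-1][:10]:
          print(f"idx={i}  true={y_true[i]:+.2f}  pred={y_pred[i]:+.2f}")

    The point isn’t the numbers; it’s that the worst-k list surfaces a shared pattern (here, the planted cluster) that the single summary statistic smooths away.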

    Then there are LLMs, where you may be handed something that can already basically solve your problem. That flips even more ways of working on their heads.

    But the fact is, these patterns repeat.

    You’re going to study something that won’t be as relevant in 20 years. That’s the way of all rapidly moving fields. And it’s okay. All code ever written will one day be dust. Even if we don’t like to admit it, the same is true of every publication. In industry or academia, you build things that move us forward, slowly, in fits and starts. That’s the broader game you opt to play in this field.

    ML will change. So will everything else.


  • I think there’s a lot of bias in how you’re looking at the data. In particular, for someone trained to deal with noise, you’re attributing your observations to signal, not noise. What’s the acceptance rate at these conferences these days? It’s so low it beggars belief. The review window is as short as ever (the briefest in academia), but the raw number of submissions has exploded. There’s no serious way to stack rank that much data without multiple evaluations per paper, and that’s too hard / expensive. So the end ranking is largely noise, probably only weakly correlated with the “true” ranking that would be determined by a million ML profs doing nothing but reviewing papers all the time.
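
    You can sanity-check that claim with a toy simulation. A minimal sketch, under assumed (made-up but not unreasonable) numbers: a few reviews per paper, reviewer noise on the same order as the quality spread, and a ~25% acceptance rate:

      import numpy as np
      from scipy.stats import spearmanr

      rng = np.random.default_rng(0)
      n_papers, n_reviews = 10_000, 3   # assumed: submissions, reviews per paper
      noise_sd = 2.0                    # assumed: reviewer noise vs. a quality spread of 1.0
      accept_rate = 0.25                # assumed acceptance rate

      # Each paper has a latent "true" quality; each review is quality + noise.
      quality = rng.normal(0.0, 1.0, n_papers)
      reviews = quality[:, None] + rng.normal(0.0, noise_sd, (n_papers, n_reviews))
      mean_score = reviews.mean(axis=1)

      # How well does the review-based ranking track the true ranking?
      rho, _ = spearmanr(quality, mean_score)

      # What fraction of the truly-top papers actually make the cut?
      cut = np.quantile(mean_score, 1 - accept_rate)
      truly_top = quality >= np.quantile(quality, 1 - accept_rate)
      recovered = (mean_score[truly_top] >= cut).mean()

      print(f"rank correlation with true quality: {rho:.2f}")
      print(f"share of truly-top papers accepted: {recovered:.2f}")

    Under these assumptions, a big chunk of the “truly top” papers land below the cutoff - near the boundary, the decision is mostly noise.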

    You have failed in the narrow sense that you didn’t earn the laurels you needed to achieve your career goals. But that isn’t to say that someone else, with their ability but your luck, would have fared any differently. You gambled and didn’t win the grand prize.

    Please don’t buy into the myth that those of us who’ve gotten luckier are so well-served by propagating - that this field is a serious meritocracy. There’s just way too little signal and way too much noise to take that belief seriously.

    At least you got something. You’ll get about the same money, but fewer t-shirts and snacks. And you’ll have to dress better.


  • Feedback: as others have noted, unless you live in a country where therapy isn’t regulated, this is a ticking legal time bomb. If you think no one at a company like BetterHelp has thought about building this, you’re wrong. The issues are:

    • Where does legal liability land when someone self-harms?
    • Where does your training set come from? Existing therapy-log datasets may not be usable the way you’ve used them if the legal agreements clients signed never mentioned LLM training as a possible use case.
    • What’s your core hypothesis? That generating therapist-language has therapeutic value? Remember that the whole profession is about indirectly developing a model of what’s going on in someone else’s head, and then leading them, based on that model, to a conclusion the therapist thinks may help them (this is reductionist and overly simplistic, but I think the point holds). This level of modeling and indirection seems poorly suited to a language-generator.



  • If you can’t understand the proofs, then you’re taking what this educator says on faith. You may also have a less sophisticated idea of when to apply which methods. Your ability to evaluate new results / methods / etc. may be compromised by your inability to evaluate them in a principled way - the kind of evaluation that understanding their underpinnings makes possible.

    On the other hand, it’s a rare work day when you derive a genuinely new method, or actually leverage the proofs and their underlying methodologies.

    All in all, it’s like saying “you can do software engineering without understanding theory of computation”. You totally can, and can do it well, but you’ll have some blind spots that you won’t be able to efficiently address / speak to your peers about.

    There’s no one right answer. There’s the right answer for you.