I have a dataset with two columns, x and y, something like the one shown below. I need to predict y for values of x greater than 60. The predicted curve must continue the increasing trend that the data shows up to x = 60.
I have tried polynomial regression and SVR, but the fitted curves decline for x greater than 60. I have also tried fitting y = a ln(x) + b, but the R2 score is only 0.94. What model can I train for this purpose, or how can I improve the R2 score by regressing over a more appropriate logarithmic function?
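For reference, the two-parameter logarithmic fit mentioned in the question can be sketched as an ordinary linear regression on ln(x). This is only a minimal sketch: the original dataset is not reproduced here, so the `x` and `y` arrays are synthetic placeholders, and the helper names `fit_log_curve` and `r2_score` are made up for illustration.

```python
import numpy as np

def fit_log_curve(x, y):
    """Least-squares fit of y = a*ln(x) + b. Returns (a, b)."""
    # The model is linear in ln(x), so a degree-1 polyfit is enough.
    a, b = np.polyfit(np.log(x), y, deg=1)
    return a, b

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# ASSUMPTION: synthetic stand-in data, not the dataset from the question.
rng = np.random.default_rng(0)
x = np.arange(1, 61, dtype=float)
y = 3.0 * np.log(x) + 2.0 + rng.normal(scale=0.3, size=x.size)

a, b = fit_log_curve(x, y)
r2 = r2_score(y, a * np.log(x) + b)

# Extrapolate beyond x = 60; with a > 0 the curve keeps increasing.
x_new = np.arange(61, 101, dtype=float)
y_new = a * np.log(x_new) + b
```

Because a ln(x) + b is monotonic in x, its extrapolation cannot turn downward the way a polynomial or SVR fit can, provided the fitted a is positive.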
Not a direct answer, but be aware that overfitting is a risk here too. You might get an R2 of 0.99+ and still have a horrendous extrapolation (for example, with a high-degree polynomial, as you already saw). 0.94 with only two parameters does not sound too bad to me.
Maximizing R2 and eyeballing the extrapolation is not really a valid approach. You should use a goodness-of-fit measure that accounts for model complexity, such as adjusted R2, AIC, or BIC. You could also implement a simple validation by leaving out the last x% of your data when fitting and then looking at the error on that held-out part.
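The hold-out idea could look roughly like the sketch below: fit on the first part of the data, then score each candidate model only on the trailing points it never saw. The data is again synthetic (an assumption, since the real dataset is not available here), and `tail_holdout_rmse` plus the `fit_*` / `predict_*` helpers are hypothetical names chosen for this example.

```python
import numpy as np

def tail_holdout_rmse(x, y, fit, predict, holdout_frac=0.2):
    """Fit on the leading part of the data, return RMSE on the trailing part."""
    n_test = max(1, int(round(holdout_frac * len(x))))
    x_train, y_train = x[:-n_test], y[:-n_test]
    x_test, y_test = x[-n_test:], y[-n_test:]
    params = fit(x_train, y_train)
    residuals = y_test - predict(x_test, params)
    return float(np.sqrt(np.mean(residuals ** 2)))

# Candidate 1: y = a*ln(x) + b (two parameters).
def fit_log(x, y):
    return np.polyfit(np.log(x), y, 1)

def predict_log(x, params):
    return np.polyval(params, np.log(x))

# Candidate 2: degree-5 polynomial in x (six parameters).
def fit_poly(x, y):
    return np.polyfit(x, y, 5)

def predict_poly(x, params):
    return np.polyval(params, x)

# ASSUMPTION: synthetic stand-in data, sorted by x, not the real dataset.
rng = np.random.default_rng(0)
x = np.arange(1, 61, dtype=float)
y = 3.0 * np.log(x) + 2.0 + rng.normal(scale=0.3, size=x.size)

rmse_log = tail_holdout_rmse(x, y, fit_log, predict_log)
rmse_poly = tail_holdout_rmse(x, y, fit_poly, predict_poly)
print(f"log fit tail RMSE:  {rmse_log:.3f}")
print(f"poly fit tail RMSE: {rmse_poly:.3f}")
```

Holding out the tail rather than a random subset matters here, because the goal is extrapolation: a random split would mostly test interpolation, which a flexible model can pass while still extrapolating badly.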
I also have to agree that it looks somewhat piecewise. Without knowing the generating process, the correct continuation could be anything.