Hello all,
I’m new to data analysis and data mining. I have a CSV file with 100k entries in a single column.
The values are as follows:

0
1
1
1
0
0
1
1
0
1
1
1
.


1
1
0
0
Based on this data, can I predict the 100,001st value? Will it be 0 or 1? If so, what is the best method for it? I’m learning Python and have tried gradient boosting, support vector machines (SVM), and basic neural networks, but I haven’t been able to get it to work.

  • finite_user_names@alien.topB
    1 year ago

    There are a lot of ways you _could_ predict this. The simplest is to predict the modal value (whichever of 0 or 1 occurs most often), which is reasonable if the values are IID (independent and identically distributed). If there is some generating process that goes through different phases, and each observation corresponds to a timestep, you might consider fitting a hidden Markov model (HMM). Others have mentioned sliding windows, and these are fine as well if the data really does have sequential structure.
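
    As a rough sketch of the two simplest options above, the snippet below predicts the modal value and, separately, predicts from a sliding window by counting what followed each window earlier in the sequence. The function names, the window length `k=3`, and the toy list (the first twelve rows shown in the question) are assumptions for illustration only, and this is a plain counting approach, not an HMM.

```python
from collections import Counter, defaultdict


def modal_prediction(values):
    """Predict the most frequent value (the IID baseline)."""
    return Counter(values).most_common(1)[0][0]


def window_prediction(values, k=3):
    """Predict the next value from the last k values.

    Counts which value followed each length-k window earlier in the
    sequence, then returns the most common follower of the final window.
    Falls back to the modal value if that window was never followed by
    anything in the data.
    """
    followers = defaultdict(Counter)
    for i in range(len(values) - k):
        window = tuple(values[i:i + k])
        followers[window][values[i + k]] += 1

    last_window = tuple(values[-k:])
    if last_window not in followers:
        return modal_prediction(values)
    return followers[last_window].most_common(1)[0][0]


# Toy data: the first twelve rows shown in the question (an assumption;
# the real file has 100k rows).
sample = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1]

print(modal_prediction(sample))        # 1 (eight ones, four zeros)
print(window_prediction(sample, k=3))  # 0 (the only earlier 1,1,1 was followed by 0)
```

    With only twelve rows the window counts are far too sparse to trust; on the full 100k rows the same counts would be much more stable, if there is any structure to find.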

    How are you trying to use gradient boosting or SVM here? If you predict each value in isolation, the best you can do is predict the modal value, since the model has no further information to work with.
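
    One way to check whether anything beats the modal value is a simple hold-out comparison: learn the mode and the window counts on the first part of the sequence, then score both on the rest. This is only a sketch; `holdout_accuracy`, the 80/20 split, and `k=3` are hypothetical choices, not something from the original post.

```python
from collections import Counter, defaultdict


def holdout_accuracy(values, k=3, train_fraction=0.8):
    """Return (baseline_accuracy, window_accuracy) on the held-out tail.

    The baseline always guesses the training mode. The window predictor
    guesses the most common follower of the previous k values as seen in
    the training part, falling back to the mode for unseen windows.
    """
    split = int(len(values) * train_fraction)
    train = values[:split]

    mode = Counter(train).most_common(1)[0][0]

    followers = defaultdict(Counter)
    for i in range(len(train) - k):
        followers[tuple(train[i:i + k])][train[i + k]] += 1

    baseline_hits = 0
    window_hits = 0
    for i in range(split, len(values)):
        actual = values[i]
        context = tuple(values[i - k:i])  # the k true values just before position i
        counts = followers.get(context)
        guess = counts.most_common(1)[0][0] if counts else mode
        baseline_hits += (mode == actual)
        window_hits += (guess == actual)

    n_test = len(values) - split
    return baseline_hits / n_test, window_hits / n_test


# Synthetic, perfectly periodic data (an assumption, chosen so the
# difference between the two predictors is obvious).
periodic = [0, 1, 1] * 100
baseline_acc, window_acc = holdout_accuracy(periodic)
print(baseline_acc, window_acc)
```

    If the two accuracies come out about the same on the real data, that suggests the values behave like independent draws, and no choice of model will do better than the mode.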