1

Topic: Classification and probability theory

To keep it short, I will make up a task and adapt it to my case. Suppose I predict a list of the 5 books that a user is most likely to buy, out of 24 possible books. I have a model that produces a forecast from the user's features. It works reasonably well, but it does not account for how the popularity of books changes over time, and that change is considerable. Suppose I trained on June 2015 and need to make a forecast for June 2016. For some user the forecast is:

[0.00273501 0.00273501 0.21609817 0.00411186 0.00925567 0.1917159 0.04742743 0.00293886 0.00331788 0.01592568 0.00443959 0.00344877 0.04274225 0.01281825 0.00274041 0.01408142 0.00274985 0.12125014 0.01134686 0.00730866 0.00296034 0.10991038 0.04232292 0.12561873]

The position in the array identifies the book. The array is then sorted and the 5 books with the highest probabilities are taken, but that is beside the point.

In 2015 the counts per book were:

[1 0 3878 5 2347 40 531 226 131 0 0 46 2709 61 3 22 7 279 4248 183 7 5488 5513 10163]

In 2016 the counts changed to:

[0 0 6731 9 1872 8 211 216 152 290 33 956 1205 245 4 19 8 2933 4684 152 3 5082 8147 9031]

I would like to correct the forecast for the change in popularity. For example, the third book almost doubled in popularity. It would seem that I should divide the 2016 count by the 2015 count and multiply the forecast probabilities by the result. But that does not work, because for the 6th book the coefficient comes out unfairly extreme (40 vs. 8). That is, the total count also has to be taken into account somehow. The question is: how?
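To make the problem with the plain ratio concrete, here is a minimal sketch using the two count arrays from the post. The smoothing with a pseudo-count `alpha` is only an assumption for illustration (it is not something proposed in the thread, and the value 50 is arbitrary); it shows one way the totals could be brought in so that books with tiny counts do not get extreme coefficients.

```python
# Purchase counts per book, copied from the post (position = book).
counts_2015 = [1, 0, 3878, 5, 2347, 40, 531, 226, 131, 0, 0, 46,
               2709, 61, 3, 22, 7, 279, 4248, 183, 7, 5488, 5513, 10163]
counts_2016 = [0, 0, 6731, 9, 1872, 8, 211, 216, 152, 290, 33, 956,
               1205, 245, 4, 19, 8, 2933, 4684, 152, 3, 5082, 8147, 9031]


def naive_ratio(old, new):
    """2016 count divided by 2015 count; None where the 2015 count is zero."""
    return [n / o if o > 0 else None for o, n in zip(old, new)]


def smoothed_ratio(old, new, alpha=50.0):
    """Ratio of smoothed shares of the total.

    alpha is a hypothetical pseudo-count added to every book in both years;
    it pulls the ratio of rarely bought books toward 1.
    """
    k = len(old)
    total_old = sum(old) + alpha * k
    total_new = sum(new) + alpha * k
    return [((n + alpha) / total_new) / ((o + alpha) / total_old)
            for o, n in zip(old, new)]


naive = naive_ratio(counts_2015, counts_2016)
smoothed = smoothed_ratio(counts_2015, counts_2016)

for book in (3, 6, 10):  # 1-based book numbers, as in the post
    i = book - 1
    print(book, counts_2015[i], counts_2016[i], naive[i], round(smoothed[i], 3))
```

With the plain ratio, book 10 (0 copies in 2015) has no defined coefficient at all, and book 6 swings strongly on a few dozen copies; the smoothed version is defined for every book and is less extreme for the small ones.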

2

Re: Classification and probability theory

Hello, Gattaka, you wrote:

G> I would like to correct the forecast for the change in popularity. For example, the third book almost doubled in popularity. It would seem that I should divide the 2016 count by the 2015 count and multiply the forecast probabilities by the result. But that does not work, because for the 6th book the coefficient comes out unfairly extreme (40 vs. 8). That is, the total count also has to be taken into account somehow. The question is: how?

Through conditional probability. In the simplest case the posterior is a product of factors (the overall popularity of the book * the popularity according to your model). The hardest part is obtaining the normalizing factor so that the result is still a probability. As a rough approximation, it is enough to normalize the resulting products so that they sum to 1 for the given user. If that does not work in practice, you will have to think further.
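A minimal sketch of the "product of factors, then normalize per user" idea, using the forecast and the 2016 counts from the first post. Using the raw 2016 share as the "overall popularity" factor is an assumption about what the reply means; the function name `reweight` is made up for this example.

```python
# Model forecast for one user and 2016 counts, copied from the first post.
forecast = [0.00273501, 0.00273501, 0.21609817, 0.00411186, 0.00925567,
            0.1917159, 0.04742743, 0.00293886, 0.00331788, 0.01592568,
            0.00443959, 0.00344877, 0.04274225, 0.01281825, 0.00274041,
            0.01408142, 0.00274985, 0.12125014, 0.01134686, 0.00730866,
            0.00296034, 0.10991038, 0.04232292, 0.12561873]
counts_2016 = [0, 0, 6731, 9, 1872, 8, 211, 216, 152, 290, 33, 956,
               1205, 245, 4, 19, 8, 2933, 4684, 152, 3, 5082, 8147, 9031]


def reweight(model_probs, popularity_counts):
    """Multiply model probabilities by popularity shares and renormalize."""
    total = sum(popularity_counts)
    shares = [c / total for c in popularity_counts]
    products = [p * s for p, s in zip(model_probs, shares)]
    z = sum(products)  # normalizing factor for this user
    return [x / z for x in products]


posterior = reweight(forecast, counts_2016)

# Indices (0-based) of the 5 books with the highest corrected probability.
top5 = sorted(range(len(posterior)), key=lambda i: posterior[i], reverse=True)[:5]
print(top5)
```

Note that in this sketch the 6th book (index 5), which the model ranks second, drops out of the top 5 because only 8 copies were bought in 2016.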

3

Re: Classification and probability theory

Hello, 3141566=Z, you wrote:

Z> Hello, Gattaka, you wrote:

G>> I would like to correct the forecast for the change in popularity. For example, the third book almost doubled in popularity. It would seem that I should divide the 2016 count by the 2015 count and multiply the forecast probabilities by the result. But that does not work, because for the 6th book the coefficient comes out unfairly extreme (40 vs. 8). That is, the total count also has to be taken into account somehow. The question is: how?

Z> Through conditional probability. In the simplest case the posterior is a product of factors (the overall popularity of the book * the popularity according to your model). The hardest part is obtaining the normalizing factor so that the result is still a probability. As a rough approximation, it is enough to normalize the resulting products so that they sum to 1 for the given user. If that does not work in practice, you will have to think further.

Aha, so suppose there is a very rare book, but my model knows for sure that the user will buy it and gives a forecast of 0.9565 or so. The book's overall popularity is extremely small, 0.004. I multiply and get 0.003826, i.e. the overall probability completely drowns out my model. Is the normalizing factor supposed to fix that? That is, if the model's forecast for the book is large enough, we still take it. Or do I misunderstand?

And a second question: what if I simply fit a linear regression on two variables, the model's forecast and the overall popularity? Is that not done?
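The arithmetic in the rare-book example can be checked with a small sketch. Only 0.9565 and 0.004 come from the post; the other two books and their numbers are invented for illustration. The sketch relies on one fact: normalization divides every product by the same constant, so it cannot change the order of the books.

```python
# Book 0 is the rare book from the post; books 1 and 2 are hypothetical.
model = [0.9565, 0.0300, 0.0135]      # model forecast for one user
popularity = [0.004, 0.600, 0.396]    # overall popularity shares

products = [m * p for m, p in zip(model, popularity)]
z = sum(products)                     # normalizing factor
normalized = [x / z for x in products]


def ranking(scores):
    """Book indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


print(products[0])
print(ranking(model), ranking(products), ranking(normalized))
```

In this toy case the rare book is first by the model alone and last after multiplying by popularity, both before and after normalization. So, at least in this sketch, the normalizing factor makes the numbers sum to 1 but does not by itself rescue a confident model forecast.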

4

Re: Classification and probability theory

Hello, Gattaka, you wrote:

G> I would like to correct the forecast for the change in popularity. For example, the third book almost doubled in popularity. It would seem that I should divide the 2016 count by the 2015 count and multiply the forecast probabilities by the result. But that does not work, because for the 6th book the coefficient comes out unfairly extreme (40 vs. 8). That is, the total count also has to be taken into account somehow. The question is: how?

Try including this coefficient as a feature. In general, if your result depends on time, you need to add time-dependent features. For example, compute the already existing features over some trailing window (say, the last year), wherever that makes sense. As a rule, a model lets you estimate the contribution of each feature to the result, so you will be able to see which features turned out to be useful and which did not.
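One possible reading of "features over a trailing window" is sketched below: for each month, every book's share of all purchases over the preceding `window` months. The function name, the input layout (one list of per-book counts per month) and the sample numbers are assumptions made for this example, not details from the thread.

```python
def trailing_share(monthly_counts, window=12):
    """Per-month popularity features from a trailing window.

    monthly_counts: list of months, each a list of purchase counts per book.
    For month t the shares are computed from months [t - window, t), i.e.
    the current month itself is excluded so the feature uses only the past.
    """
    n_books = len(monthly_counts[0])
    features = []
    for t in range(len(monthly_counts)):
        past = monthly_counts[max(0, t - window):t]
        totals = [sum(month[b] for month in past) for b in range(n_books)]
        grand_total = sum(totals)
        if grand_total == 0:
            features.append([0.0] * n_books)  # no history available yet
        else:
            features.append([x / grand_total for x in totals])
    return features


# Hypothetical data: 3 months, 2 books.
monthly = [[1, 1], [3, 1], [0, 4]]
print(trailing_share(monthly, window=2))
```

The resulting per-book share can then be attached to each training row for the corresponding month, alongside the user features.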

5

Re: Classification and probability theory

Hello, akochnev, you wrote:

A> Try including this coefficient as a feature.
A> In general, if your result depends on time, you need to add time-dependent features. For example, compute the already existing features over some trailing window (say, the last year), wherever that makes sense.
A> As a rule, a model lets you estimate the contribution of each feature to the result, so you will be able to see which features turned out to be useful and which did not.

Already tried, and the result even got worse. I had pinned my hopes on it. For now the only positive result comes from bluntly hard-coding if-rules and tweaking the predictions by hand, but that is terribly non-general.

6

Re: Classification and probability theory

Hello, Gattaka, you wrote:

G> Already tried, and the result even got worse. I had pinned my hopes on it. For now the only positive result comes from bluntly hard-coding if-rules and tweaking the predictions by hand, but that is terribly non-general.

It would be interesting to understand why. What is the model based on? Trees, networks? How did the result change for the example from the original post? In my experience, if a model's output contradicts a specific, obvious example, you need to look at it very carefully: maybe an error crept in somewhere during feature construction?

7

Re: Classification and probability theory

Hello, akochnev, you wrote:

A> Hello, Gattaka, you wrote:

G>> Already tried, and the result even got worse. I had pinned my hopes on it. For now the only positive result comes from bluntly hard-coding if-rules and tweaking the predictions by hand, but that is terribly non-general.

A> It would be interesting to understand why. What is the model based on? Trees, networks?

Trees.

A> How did the result change for the example from the original post?

I did not look at this specific example; it is quite possible that nothing changed for it. My map@7 metric was something like 0.822704318474, and after adding the features it became 0.810530596046. That is, it changed, but only slightly for the worse, so I figured I would never find where exactly it got worse. Though, on reflection, I should have looked. I will try...

A> In my experience, if a model's output contradicts a specific, obvious example, you need to look at it very carefully: maybe an error crept in somewhere during feature construction?

As I understand it, the model simply does not pick up the dependence. I have 12 months and 22 features; if there were 70 or 700 months, it would pick it up.
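For reference, a minimal sketch of how a map@7 score like the ones quoted above might be computed. The thread does not say which exact definition is used; this follows one common convention (average precision over the top k predictions, divided by min(number of actual items, k)), so treat it as an assumption. The names `apk` and `mapk` are chosen for this example.

```python
def apk(actual, predicted, k=7):
    """Average precision at k for one user.

    actual: collection of books the user really bought.
    predicted: list of books ordered from most to least likely.
    """
    if not actual:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for rank, book in enumerate(predicted[:k], start=1):
        if book in actual and book not in seen:
            hits += 1
            score += hits / rank  # precision at this rank
            seen.add(book)
    return score / min(len(actual), k)


def mapk(actual_lists, predicted_lists, k=7):
    """Mean of apk over all users."""
    scores = [apk(a, p, k) for a, p in zip(actual_lists, predicted_lists)]
    return sum(scores) / len(scores)


# Hypothetical users: the first gets a hit at rank 1, the second at rank 2.
print(mapk([[2], [2]], [[2, 5, 1], [5, 2, 1]]))
```

Computing `apk` per user before and after adding the new features would also show which users got worse, rather than only the change in the overall average.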