<![CDATA[Programmer's Town - Algorithms]]>
http://www.progtown.com/
Sat, 08 Apr 2017 06:55:00 +0000PunBB<![CDATA[Correlations and classifications]]>
http://www.progtown.com/topic2056956-correlations-and-classifications.html
Greetings all. Unfortunately I never took the relevant courses, so I am only weakly familiar with the terminology. A vigorous search on Wikipedia showed the usual thing: you need to know half the answer to get the other half. The setup, in general, is this. There are on the order of 10^3 different products, which can be bought in more or less arbitrary combinations. We have data on roughly 10^5 purchases. Each purchase is a vector of 10^3 cells, where cell i holds the quantity of product i that was bought (in our case this quantity is always an integer, but that is not very important). Naturally, most values in such a vector equal zero. The distribution tail rolls off very quickly: 99% of customers buy 1 product, 0.9% buy 2 products (i.e. 90% of the remainder), 0.09% buy 3, and 4 or more give negligible amounts. Now we want to look at correlations between sales of various products, that is, test hypotheses of the form "buyers of product X often buy product Y" or "buyers of product X1 never buy product X2". Worse, there can also be regularities like "the purchase volume of product Y correlates well with the sum of the purchase volumes of X1 and X2". What I remember from my university course hints that I should run 10^3 * 10^3 convolutions to find all pairwise correlations, which looks computationally bulky. Besides, it is not quite clear how to normalize them to get comparable numbers. Perhaps there is a more optimal method to discover regularities without writing out individual hypotheses with specific product numbers? Where can one read about solving such tasks?]]>Sat, 08 Apr 2017 06:55:00 +0000http://www.progtown.com/topic2056956-correlations-and-classifications.html<![CDATA[File sorting 100]]>
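A minimal sketch of the point that the 10^3 x 10^3 pairwise correlations are just one matrix product, so the computation is not actually bulky. The data below is random and only illustrates the shapes involved; numpy is assumed:

```python
import numpy as np

# Sketch: all pairwise Pearson correlations between product columns as one
# matrix product. With ~1e5 purchases x ~1e3 products this is a single
# (1000 x 100000) @ (100000 x 1000) multiply -- cheap on modern hardware.
rng = np.random.default_rng(0)
X = (rng.random((1000, 50)) < 0.05).astype(float)   # purchases x products

Xc = X - X.mean(axis=0)                  # center each product column
cov = Xc.T @ Xc / (X.shape[0] - 1)       # products x products covariance
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)            # normalized to [-1, 1] (Pearson)
# corr[i, j] quantifies "buyers of product i also buy product j";
# np.corrcoef(X, rowvar=False) computes the same matrix directly.
```

The normalization by the per-product standard deviations is what makes the numbers comparable across products of very different sales volumes.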
http://www.progtown.com/topic2056957-file-sorting-100.html
Greetings all. I was given a test problem: sort a text file line by line, with a file size of 100 and only 10 of memory. I have never solved such tasks before. I became curious how much time such a sort should take for a solution close to the ideal.]]>Sat, 08 Apr 2017 06:33:00 +0000http://www.progtown.com/topic2056957-file-sorting-100.html<![CDATA[Time series prediction]]>
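The classic answer to "file much larger than memory" is an external merge sort: sort chunks that fit in memory, spill them as runs, then k-way merge the runs. A minimal sketch, assuming newline-terminated lines and using a line budget in place of a real byte budget:

```python
import heapq
import itertools
import tempfile

def external_sort(src, dst, chunk_lines=100_000):
    # Phase 1: cut the input into sorted runs that each fit in memory,
    # spilling every run to its own temporary file.
    runs = []
    with open(src) as f:
        while True:
            chunk = list(itertools.islice(f, chunk_lines))
            if not chunk:
                break
            chunk.sort()
            run = tempfile.TemporaryFile("w+")
            run.writelines(chunk)
            run.seek(0)
            runs.append(run)
    # Phase 2: k-way merge of all runs with a heap; heapq.merge reads the
    # run files lazily, so memory stays bounded by one line per run.
    with open(dst, "w") as out:
        out.writelines(heapq.merge(*runs))
    for run in runs:
        run.close()
```

With run size close to the memory budget this does two sequential passes over the data, so the running time is dominated by disk throughput: roughly (read 100 + write 100) twice, which is the usual "close to ideal" estimate.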
http://www.progtown.com/topic2056958-time-series-prediction.html
I need an alternative to the Kalman filter, or at least to linear prediction. I have a few critical cases where linearity is not enough:

1. A bird (almost a material point) flies through the frame. Its path is close to a straight line but consists of small waves: up-down-up-down. It flies behind a tree, and the Kalman filter carries it on along the straight line corresponding to the last wave. If we are lucky, the bird emerges from behind the tree at the right point and the Kalman filter produces a prediction near it. If not, the bird keeps flying straight while the filter, having lost the target on the crest of the last wave, heads off into the sky and the bird is lost.

2. A car drives along a straight line, then starts to turn on a smooth arc, and in the middle of the turn disappears from view for a second. The Kalman filter leaves on a tangent while the car continues to turn along the arc.

I can, of course, take the last couple of seconds of the path and approximate them with a linear (practically Kalman) or quadratic function. With that I immediately manage to catch turning cars more precisely than Kalman does. But birds with their waves cannot be caught that way. Are there algorithms/models that can build complex non-linear paths? For example, remember the wavy character of a bird's motion and replay it for a few seconds while the bird is hidden behind a tree; or understand that the car began to turn, i.e. that straight-line motion ended and motion along an arc of some radius began. Note that I am not demanding detection of the onset of the maneuver, which is the hard part; I need to build the motion model once the motion has already been going on for some time. There are numerous models in statistics that account for seasonality, day-night cycles and so on, with all the influences written into them. But I have no such dependencies: objects can move arbitrarily both in space (those same birds) and in a plane (people, cars).
The path must also be built and predicted per frame, i.e. in the image plane. Is there any general-purpose approximator that builds a motion model? It seems to me that high-degree polynomials can behave completely arbitrarily outside the observations, and experiments with those same quadratic functions show instability of the solution.]]>Fri, 07 Apr 2017 11:52:00 +0000http://www.progtown.com/topic2056958-time-series-prediction.html<![CDATA[Check of existence of an element]]>
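One hedged idea for the "bird" case, sketched below: fit the recent track with a line plus a single sinusoid by least squares, grid-searching the frequency, then extrapolate that model through the occlusion. This is only an illustration of the principle (replay the oscillation instead of a tangent), not a standard tracking algorithm; the model form, the frequency grid, and the synthetic data are all assumptions:

```python
import numpy as np

def fit_wave(t, y, omegas=np.linspace(0.5, 10.0, 200)):
    # For each candidate frequency w, fit y = a + b*t + c*sin(wt) + d*cos(wt)
    # by linear least squares; keep the frequency with the smallest residual.
    best = None
    for w in omegas:
        A = np.column_stack([np.ones_like(t), t, np.sin(w * t), np.cos(w * t)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        err = float(np.sum((A @ coef - y) ** 2))
        if best is None or err < best[0]:
            best = (err, w, coef)
    return best[1], best[2]

def predict(t, w, coef):
    a, b, c, d = coef
    return a + b * t + c * np.sin(w * t) + d * np.cos(w * t)

# Synthetic "bird": straight drift plus a small wave, observed for 4 s.
t = np.linspace(0.0, 4.0, 120)
y = 0.3 * t + 0.5 * np.sin(3.0 * t)
w, coef = fit_wave(t, y)
y_hat = predict(5.0, w, coef)   # extrapolate 1 s "behind the tree"
```

A pure line or quadratic extrapolated from the last wave would miss here; the fitted sinusoid keeps oscillating through the occlusion. For 2D tracks the same fit is applied to x(t) and y(t) independently.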
http://www.progtown.com/topic2056959-check-of-existence-of-an-element.html
Good afternoon. Suggest how, as in a DBMS, to solve the problem of a unique field value that is not a key, the field type being int32 or int64. A simple array of counters, or a huge bit field, will not fit; probably the simplest option is a full index on a B-tree, or a set on a B-tree. Perhaps there is a data structure that quickly (ideally O(1)) allows inserting a record, deleting it, and checking whether such a record exists, and naturally with minimal storage overhead. My input is positive integers; the order in which they are laid out in the structure does not matter to me, and I do not need sequential traversal. I only need to know whether such a number has already occurred or not. Perhaps a DAWG could be adapted here, assuming the alphabet is {0, 1} and the numbers are "strings" over this alphabet.]]>Mon, 03 Apr 2017 05:36:00 +0000http://www.progtown.com/topic2056959-check-of-existence-of-an-element.html<![CDATA[Optimization of the Hungarian algorithm]]>
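The DAWG-over-{0,1} idea is essentially a bitwise (binary) trie. A minimal sketch with insert, delete and membership in O(word size) steps, where memory grows with the number of stored paths rather than with the full 2^32 range; a real implementation would pack the nodes far more compactly than Python dicts:

```python
class BitTrie:
    """Sketch: integers stored as 32-bit strings over {0,1} in a binary
    trie. Illustration of the idea only, not a production structure."""
    BITS = 32

    def __init__(self):
        self.root = {}

    def _path(self, x):
        # Bits of x from most to least significant.
        return ((x >> i) & 1 for i in reversed(range(self.BITS)))

    def add(self, x):
        node = self.root
        for b in self._path(x):
            node = node.setdefault(b, {})
        node["end"] = True

    def __contains__(self, x):
        node = self.root
        for b in self._path(x):
            if b not in node:
                return False
            node = node[b]
        return "end" in node

    def discard(self, x):
        # Walk down remembering the path, then prune empty nodes upward.
        node, path = self.root, []
        for b in self._path(x):
            if b not in node:
                return                      # x was never stored: no-op
            path.append((node, b))
            node = node[b]
        node.pop("end", None)
        for parent, b in reversed(path):
            if parent[b]:                   # still shared with other keys
                break
            del parent[b]
```

In practice a plain hash set (or, on disk, exactly the B-tree index mentioned in the post) usually wins, but the trie shows that the {0,1}-alphabet idea is coherent and keeps memory proportional to the data.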
http://www.progtown.com/topic2056960-optimization-of-the-hungarian-algorithm.html
It solves the assignment problem: in Russian and in English. What actually interests me is the assignment problem, which my code currently solves with the Hungarian algorithm. The best solution is O(n^3), which suddenly became too slow for me. On the other hand, I have additional information about the data. To stay within known terms, I will continue the example from the Russian Wikipedia: suppose Peter charges too much for jobs B and C, so much that his services for them will certainly not be claimed. How can this be exploited to accelerate the computation? The first idea was a sparse matrix; I even found source code for this on GitHub, but it runs no faster, and is less exact besides.

And no, I should still describe the actual task, to better explain the character of the restrictions. The task is multitarget tracking. The problem is this: the objects found on the current frame (given by rectangles) must be assigned to the tracks of objects already found earlier in the video (also rectangles). We compute the distance between each track and each new object, build a cost matrix, and solve the minimization problem. All done, all good. But then we come across high-resolution video with large snowflakes, each of which gets detected. The objects turn out to be many: more than a thousand, even two thousand, and the Hungarian algorithm with its cubic asymptotics grinds to a halt. We do have a heuristic: the distance an object can travel between adjacent frames cannot exceed some number of pixels. So it seems we should not need to minimize globally (all tracks against all objects); for each track it should suffice to consider only the objects in some neighborhood of it. It would seem that sparse matrices and the sparse version of the Hungarian algorithm could help, but in practice things only became worse, and low-level optimization will not solve this in one kick.
Hence the question: does anyone know solutions of the assignment problem for sparse data, or with similar restrictions? P.S. Regarding alternative methods of solving the multitarget tracking task, I am in the know: all the energy-minimization approaches and so on. Right now I am interested in a solution along the lines described above.]]>Sun, 02 Apr 2017 00:33:00 +0000http://www.progtown.com/topic2056960-optimization-of-the-hungarian-algorithm.html<![CDATA[for a range of dates]]>
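One way to exploit the pixel gate, sketched below under the assumption that scipy is available: the gate makes most track/detection pairs impossible, so the bipartite graph falls apart into small connected components, and the O(n^3) Hungarian algorithm can be run per component, where n is tiny. The function names and the grid demo are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def gated_assignment(tracks, dets, gate):
    """Split the gated track/detection graph into independent components,
    then solve each component with the Hungarian algorithm. tracks and
    dets are (N, 2) arrays of positions; gate is the max displacement."""
    nt, nd = len(tracks), len(dets)
    d = np.linalg.norm(tracks[:, None, :] - dets[None, :, :], axis=2)
    ti, di = np.nonzero(d <= gate)          # only in-gate edges exist
    graph = coo_matrix((np.ones(len(ti)), (ti, di + nt)),
                       shape=(nt + nd, nt + nd))
    n_comp, labels = connected_components(graph, directed=False)
    matches = []
    for c in range(n_comp):
        ts = np.where(labels[:nt] == c)[0]
        ds = np.where(labels[nt:] == c)[0]
        if len(ts) == 0 or len(ds) == 0:
            continue
        sub = d[np.ix_(ts, ds)]
        sub[sub > gate] = gate * 1000.0     # softly forbid out-of-gate pairs
        r, cidx = linear_sum_assignment(sub)
        for i, j in zip(r, cidx):
            if d[ts[i], ds[j]] <= gate:
                matches.append((int(ts[i]), int(ds[j])))
    return matches

# Demo: 49 well-separated tracks, detections jittered by < gate pixels.
xs, ys = np.meshgrid(np.arange(7), np.arange(7))
tracks = np.column_stack([xs.ravel() * 100.0, ys.ravel() * 100.0])
rng = np.random.default_rng(1)
dets = tracks + rng.uniform(-2.0, 2.0, size=tracks.shape)
m = gated_assignment(tracks, dets, gate=5.0)
```

When the gate keeps component sizes around k, the cost drops from O(n^3) to roughly O(n/k * k^3) plus the gating itself, which can be made near-linear with a spatial grid instead of the full distance matrix used here for brevity.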
http://www.progtown.com/topic2056961-for-a-range-of-dates.html
There is an interval given in the form of dates, as two fields. I need to compute something for the given range such that later, by comparing two of them, it is possible to tell whether they intersect.]]>Fri, 24 Mar 2017 04:04:00 +0000http://www.progtown.com/topic2056961-for-a-range-of-dates.html<![CDATA[and . Programming]]>
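For a plain pairwise check, no precomputed key is needed at all: two closed ranges intersect iff each one starts no later than the other ends. A minimal sketch:

```python
from datetime import date

def intervals_overlap(a_start, a_end, b_start, b_end):
    # Standard O(1) test for closed intervals [a_start, a_end] and
    # [b_start, b_end]: they intersect iff neither lies wholly before
    # the other.
    return a_start <= b_end and b_start <= a_end

assert intervals_overlap(date(2017, 1, 1), date(2017, 1, 31),
                         date(2017, 1, 20), date(2017, 2, 10))
assert not intervals_overlap(date(2017, 1, 1), date(2017, 1, 10),
                             date(2017, 1, 11), date(2017, 2, 1))
```

A derived key only becomes useful when one range must be tested against many stored ranges at once; then an interval tree or a DB index on (start, end) is the usual answer.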
http://www.progtown.com/topic2053979-and-programming.html
Greetings! This question has interested me for a long time. Here is what it is about. Suppose we have knowledge in the form of a traditional algorithm, in addition to a training sample. How can such an algorithm be driven into the trained model, or crossed with it? I will explain with the cats example: http://rsdn.org/forum/humour/6708031.1 (author: de Niro, date: 24.02 20:02). The model that was trained there recognizes a circle as an eye. Suppose we need to specify: "this circle is not an eye, but a nose". Or, harder, "a single circle in the middle of a muzzle is a nose".]]>Fri, 03 Mar 2017 04:07:00 +0000http://www.progtown.com/topic2053979-and-programming.html<![CDATA[Filling of triangles with]]>
http://www.progtown.com/topic2053980-filling-of-triangles-with.html
I implemented triangle filling with Bresenham's algorithm as follows. From the top vertex, two Bresenham line walks are launched toward the two lower vertices; for each Y I draw a horizontal line (the fastest operation for the hardware and for video memory) between X1 and X2. When one of the vertices is reached, the parameters of the corresponding walk are adjusted and we flood the second part of the triangle until the bottommost vertex is reached. Now the question. There is the possibility to draw in gradations of color, and I would like to try adding anti-aliasing: that is, to paint the start and end points of the horizontal lines not with the full color but with one of the gradations between the full color and the background color. Are there any ready implementations of this I could look at?]]>Tue, 28 Feb 2017 05:15:00 +0000http://www.progtown.com/topic2053980-filling-of-triangles-with.html<![CDATA[Image search in the image with adjustable probability of coincidence]]>
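One simple way to get the described edge gradation, sketched below: keep the exact (fractional) X intersections of the edges with the scanline, and give the boundary pixels a color proportional to how much of each pixel the span actually covers. Grayscale values and the helper names are illustrative assumptions; the same idea per channel works for RGB:

```python
import math

def blend(bg, fg, a):
    # Linear blend of two 8-bit gray values with coverage a in [0, 1].
    return round(bg + (fg - bg) * a)

def fill_span_aa(row, x_left, x_right, fg, bg):
    """Fill one scanline span [x_left, x_right] (exact, floating-point
    edge intersections). Interior pixels get the full color; the two
    boundary pixels get a gradation proportional to their coverage."""
    xl = math.floor(x_left)
    xr = math.ceil(x_right) - 1
    for x in range(max(xl, 0), min(xr, len(row) - 1) + 1):
        # Fraction of pixel [x, x+1) covered by the span.
        cov = min(x + 1.0, x_right) - max(float(x), x_left)
        row[x] = blend(bg, fg, max(0.0, min(1.0, cov)))

row = [0] * 10
fill_span_aa(row, 2.25, 7.75, fg=255, bg=0)
# Interior pixels 3..6 get full color; boundary pixels 2 and 7 get 75%.
```

This means tracking the edge X as a fixed-point or floating value alongside the integer Bresenham walk (or switching the edge stepping to DDA), since the integer X alone does not carry the coverage fraction.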
http://www.progtown.com/topic2053981-image-search-in-the-image-with-adjustable-probability-of-coincidence.html
Suggest a fast algorithm for searching for a sample image inside a much larger image. Ideally the accuracy of the search should be adjustable as well. Pixel-exact coincidence is not mandatory and most likely will not happen, since the source images can be of different quality. Meanwhile I found this paper: http://www.wseas.us/e-library/conferenc … TH-091.pdf The approach is quite interesting, but it does not suit color RGB images well, and reducing the RGB images to that form means a big loss of data. Ideally there would be some function which, for large blocks, say 100x100 pixels, produces a certain (numeric) string, and such that this string does not differ much under insignificant shifts of the block within the big image. For example, for a shift of 50 pixels the code would change by no more than 50%: say the block had code "1234567890"; we shift it left and receive 6789055555. Then I can break the initial template into these codes, and the source image into these block codes, and reduce everything, taking possible shifts into account, to substring search in a string. If anyone can, point me in the necessary direction, and to where appropriate papers on the subject can be found. Thanks.]]>Wed, 22 Feb 2017 08:00:00 +0000http://www.progtown.com/topic2053981-image-search-in-the-image-with-adjustable-probability-of-coincidence.html<![CDATA[To reduce the task to .?]]>
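One family of functions with the requested "similar block, similar code" behavior are perceptual hashes. A sketch of the simplest one, the average hash: shrink the block, threshold each cell against the mean brightness, and use Hamming distance between fingerprints as the adjustable match tolerance. numpy and the synthetic data are assumptions; for RGB one can hash each channel or a luminance plane:

```python
import numpy as np

def ahash(block, hash_side=8):
    """Average hash: downsample the block to hash_side x hash_side and
    threshold against the mean, giving a 64-bit fingerprint. Low-quality
    or slightly shifted variants of the same patch give nearby hashes."""
    h, w = block.shape
    # Crude box downsample (assumes h and w divisible by hash_side).
    small = block.reshape(hash_side, h // hash_side,
                          hash_side, w // hash_side).mean(axis=(1, 3))
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming(a, b):
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(64, 64)).astype(float)
noisy = np.clip(patch + rng.normal(0, 10, patch.shape), 0, 255)  # degraded copy
other = rng.integers(0, 256, size=(64, 64)).astype(float)        # unrelated
```

The search then hashes sliding (or grid) blocks of the big image and accepts candidates whose Hamming distance to a template block is below a chosen threshold, which is exactly the adjustable accuracy knob from the question.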
http://www.progtown.com/topic2053982-to-reduce-the-task-to.html
Is it possible to reduce the optimization of the following task to some mathematical model? The task: there is a set of elements. The elements have a set of attributes, and each attribute can have one or several values. Select the elements satisfying a query. The data rarely changes, so there is the possibility to perform arbitrary precomputation: reduce it to a system of equations, a matrix, etc. For example: there are countries and languages, and there is a list of people for whom it is recorded which countries each one visited and which languages each one speaks. Query: select the people who visited one of the given countries and speak one of the given languages. It is clear that there is direct search, or selection on an index with post-processing of what was already selected, and so on. I would like to optimize the task, for example to immediately discard the elements that do not satisfy the given set of attribute values (the query).]]>Mon, 20 Feb 2017 07:16:00 +0000http://www.progtown.com/topic2053982-to-reduce-the-task-to.html<![CDATA[From a multidimensional array to receive a tree.]]>
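One precomputation that matches "discard non-matching elements at once" is a bitmask per attribute value: a query then ORs the masks within one attribute and ANDs across attributes, touching every element only at the final decode step. A sketch on the countries/languages example (the data is invented):

```python
from collections import defaultdict

def build_masks(elements, attr):
    # Precompute, for every value of one attribute, a bitmask of the
    # elements having that value. Since the data rarely changes, the
    # masks can be rebuilt offline; a query is then pure bit arithmetic.
    masks = defaultdict(int)
    for i, e in enumerate(elements):
        for v in e[attr]:
            masks[v] |= 1 << i
    return masks

people = [
    {"countries": {"FR", "DE"}, "languages": {"en", "fr"}},
    {"countries": {"US"},       "languages": {"en"}},
    {"countries": {"DE", "JP"}, "languages": {"de", "ja"}},
]
by_country = build_masks(people, "countries")
by_lang = build_masks(people, "languages")

def query(countries, languages):
    # OR the masks inside one attribute, AND across attributes.
    c = 0
    for x in countries:
        c |= by_country.get(x, 0)
    l = 0
    for x in languages:
        l |= by_lang.get(x, 0)
    hits = c & l
    return [i for i in range(len(people)) if hits >> i & 1]
```

This is the same model relational databases use under the name bitmap index; for millions of elements, compressed bitmaps (e.g. roaring bitmaps) keep the AND/OR operations fast and the storage small.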
http://www.progtown.com/topic2049867-from-a-multidimentional-array-to-receive-a-tree.html
Good afternoon. There is a data structure obtained by sampling from a DB with a hierarchical structure, say city - district - device; for simplicity denote them by numeric values. We receive a set of tuples which, as I understand it, are not necessarily ordered:

1 1 1
1 1 2
1 2 3
1 3 3
2 4 6
2 4 7
2 4 8
3 5 9
1 5 6

From this a tree should be produced: city 1 contains districts 1, 2, 3 and 5 (with devices 1 2, 3, 3 and 6 respectively), city 2 contains district 4 with devices 6 7 8, and city 3 contains district 5 with device 9. The question is this: it seems to me the task is standard and a name for it has surely already been invented, or must it be solved independently? I am interested in performance, since the tree comes out to the order of several thousand nodes.]]>Thu, 12 Jan 2017 13:21:00 +0000http://www.progtown.com/topic2049867-from-a-multidimentional-array-to-receive-a-tree.html<![CDATA[How to search for algorithms on graphs?]]>
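The transformation itself is just a linear-pass grouping, sometimes described as building a nested map (or a trie of paths); no sorting of the tuples is required. A sketch on the tuples from the post:

```python
def build_tree(rows):
    # One linear pass over the (city, district, device) tuples, inserting
    # each path into a nested dict. O(n) regardless of input order;
    # several thousand rows is trivial.
    tree = {}
    for city, district, device in rows:
        tree.setdefault(city, {}).setdefault(district, []).append(device)
    return tree

rows = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (1, 3, 3), (2, 4, 6),
        (2, 4, 7), (2, 4, 8), (3, 5, 9), (1, 5, 6)]
tree = build_tree(rows)
```

The same pattern generalizes to any fixed depth by chaining `setdefault` once per level, and in SQL terms it is simply GROUP BY over the path columns.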
http://www.progtown.com/topic2049868-how-to-search-for-algorithms-on-columns.html
Normally a graph is defined as an ordered pair of two sets: the set of vertices of this graph (its nodes) and the set of connections of this graph (its edges; arcs = directed edges). I need not a simple graph, but a graph with vertices of different types (probably this is called "colored vertices"). And not only the graph, but also an algorithm for searching a directed graph for a subgraph with a given coloring (in my specific case not an arbitrary subgraph, but a chain). Searching for a chain in a colored directed acyclic graph. You will say: sit down and think. I sat, I thought. Brute force: enumerate all vertices of the graph, and for each one check whether there is a path starting at this vertex that coincides with the chain completely. Two things I do not like in such a solution: 1) when a new vertex (with its connections) is added to the graph, everything has to be rechecked, which is unduly labor-consuming; 2) in general, the full search looks unduly labor-consuming. There is the inspiring Knuth-Morris-Pratt algorithm, in which the amount of work decreases (there, true, it is not a graph and a chain, but simply two strings). You will say: take the chain, attach its different nodes to the vertex newly inserted into the graph, and check the presence of both tails of the chain in the graph. But here too, it seems to me, work can be saved if the graph is additionally labeled in advance. What is the algorithm I need called, and how should I search for it?]]>Wed, 11 Jan 2017 10:11:00 +0000http://www.progtown.com/topic2049868-how-to-search-for-algorithms-on-columns.html<![CDATA[To calculate a logarithmic scale]]>
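One standard improvement over the brute force is to memoize "chain suffix starting at position i can begin at vertex v", which is dynamic programming on the DAG: every (vertex, position) pair is solved exactly once, and the stored table is precisely the kind of advance labeling the post asks about. A minimal sketch (the toy graph is invented; the memoized recursion relies on the graph being acyclic):

```python
from functools import lru_cache

def find_chain(adj, color, chain):
    """Does the vertex-colored directed acyclic graph contain a path
    whose vertex colors spell `chain`? O((V + E) * len(chain)) thanks to
    memoization, instead of re-walking paths from every start vertex."""
    @lru_cache(maxsize=None)
    def match(v, i):
        # True iff chain[i:] can be spelled by a path starting at v.
        if color[v] != chain[i]:
            return False
        if i == len(chain) - 1:
            return True
        return any(match(u, i + 1) for u in adj.get(v, ()))
    return any(match(v, 0) for v in color)

adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
color = {0: "r", 1: "g", 2: "b", 3: "r"}
```

For the incremental-update concern: when a new vertex is inserted, only the memo entries of vertices from which it is reachable can change, so with the table stored per vertex the recheck stays local instead of global. The general topic to search for is "labeled path queries" or pattern matching in labeled graphs.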
http://www.progtown.com/topic2049869-to-calculate-a-logarithmic-scale.html
Given: a certain set of values with a distribution that looks logarithmic. Question: how can one approximately compute the logarithm base at which the logarithmic plot will be as close as possible to linear?... <<RSDN@Home 1.0.0 alpha 5 rev. 0 on Windows 8 6.2.9200.0>>]]>Thu, 05 Jan 2017 11:28:00 +0000http://www.progtown.com/topic2049869-to-calculate-a-logarithmic-scale.html<![CDATA[Cogwheels, involutes and all, all, all]]>
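A note worth making first: changing the logarithm base only rescales the plot, since log_b x = ln x / ln b, so no base is "more linear" than another. What the task really asks for is the slope of a least-squares fit y = a ln x + b, from which the equivalent base exp(1/a) falls out. A sketch:

```python
import math

def fit_log(xs, ys):
    # Least-squares fit of y = a*ln(x) + b. If the data follow
    # y = log_B x, then a = 1/ln(B), so the base is B = exp(1/a).
    n = len(xs)
    lx = [math.log(x) for x in xs]
    mx = sum(lx) / n
    my = sum(ys) / n
    a = (sum((u - mx) * (v - my) for u, v in zip(lx, ys))
         / sum((u - mx) ** 2 for u in lx))
    b = my - a * mx
    return a, b

# Data generated as log base 2 should recover base ~2.
xs = [1, 2, 4, 8, 16, 32]
ys = [math.log2(x) for x in xs]
a, b = fit_log(xs, ys)
base = math.exp(1.0 / a)
```

If the "values" are raw samples rather than (x, y) pairs, the same fit applies to the empirical CDF or to rank-vs-value points, but the base still only sets the scale of the axis.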
http://www.progtown.com/topic2049870-cogwheels-evolvents-and-everything-all-all.html
Hello! I got the task of writing a gear-drawing program. I read about gears, at first understood nothing, then understood only that not everything is so simple. Moreover, different documents walk around the terminology so differently that things remain unclear. And the greater part of the materials is devoted to strength calculations, to choosing a material for a given load, and to other such subtleties; how to calculate the geometry is normally passed over in silence. I only remembered that the cogwheel of smaller diameter is called a pinion, and the larger one a wheel. What interests me is an algorithm: feed in a pack of parameters, and get on the output an array of arcs and segments, or something like that. Separately, I also became curious: is there anything on calculating cogwheels of arbitrary shape, for various kinds of non-uniform transmission of motion? There, for example, for mating a pair of toothed "squares"? Presumably everything there is different from the round-wheel case? I found some plugin in Python and am trying to understand it; the theory there is rather thin, just calculations. So far it is not clear what is done there and why.]]>Tue, 27 Dec 2016 17:04:00 +0000http://www.progtown.com/topic2049870-cogwheels-evolvents-and-everything-all-all.html<![CDATA[Classification and probability theory]]>
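For the geometry question: the standard tooth flank of an ordinary gear is the involute of a circle, x = r(cos t + t sin t), y = r(sin t - t cos t), unrolled from the base circle of radius r. A sketch that only samples this flank curve; a full tooth additionally needs the mirrored flank, the tip and root arcs, and rotated copies of the tooth around the wheel:

```python
import math

def involute_points(base_radius, t_max, n=50):
    # Sample the involute of a circle for t in [0, t_max]:
    #   x = r (cos t + t sin t),  y = r (sin t - t cos t).
    # t_max is chosen so the curve reaches the addendum (tip) circle.
    pts = []
    for i in range(n + 1):
        t = t_max * i / n
        x = base_radius * (math.cos(t) + t * math.sin(t))
        y = base_radius * (math.sin(t) - t * math.cos(t))
        pts.append((x, y))
    return pts

pts = involute_points(base_radius=10.0, t_max=0.6)
```

A useful property for checking: a point at parameter t lies at distance r*sqrt(1 + t^2) from the center, which is how t_max is solved for a given tip radius. For non-round "toothed squares", the governing idea is different: the two pitch curves must roll on each other without slipping, and the tooth profile is generated from that rolling, which is why the round-wheel formulas do not carry over directly.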
http://www.progtown.com/topic2047003-classification-and-probability-theory.html
To keep it short, I will invent a toy task and adapt my question to it. Assume I predict the list of 5 books that a user will buy with the greatest probability, out of 24 possible books. I have a model which gives a forecast from the user's features; it works reasonably well, but it does not account for the change of popularity of the books over time, and that change is considerable. Assume I trained on month 06 of 2015 and must make the forecast for month 06 of 2016. For some user the forecast comes out as: [0.00273501 0.00273501 0.21609817 0.00411186 0.00925567 0.1917159 0.04742743 0.00293886 0.00331788 0.01592568 0.00443959 0.00344877 0.04274225 0.01281825 0.00274041 0.01408142 0.00274985 0.12125014 0.01134686 0.00730866 0.00296034 0.10991038 0.04232292 0.12561873] The position in the array is the book. The array is then sorted and the 5 with the greatest probabilities are taken. But that is not the essence. In 2015 the books sold in the following amounts: [1 0 3878 5 2347 40 531 226 131 0 0 46 2709 61 3 22 7 279 4248 183 7 5488 5513 10163] In 2016 these amounts changed: [0 0 6731 9 1872 8 211 216 152 290 33 956 1205 245 4 19 8 2933 4684 152 3 5082 8147 9031] And I would like to correct the forecast for this change of popularity. For example, the third book almost doubled in popularity. It would seem one should divide the 2016 amount by the 2015 amount and multiply the forecast probabilities by the result. But it cannot be done so simply, since for the 6th book the coefficient comes out unfairly large (40/8). I.e. the total numbers must also somehow be taken into account. The question is: how is this done?]]>Mon, 19 Dec 2016 08:28:00 +0000http://www.progtown.com/topic2047003-classification-and-probability-theory.html
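One common correction, sketched below as an assumption rather than a known-correct answer for this model: reweight each predicted probability by the change in the book's share of all purchases, with additive (Laplace) smoothing so that small counts like 40 vs 8 cannot produce wild coefficients, then renormalize back to a probability vector. The smoothing strength alpha is a hypothetical tuning knob:

```python
import numpy as np

def popularity_correction(probs, counts_old, counts_new, alpha=50.0):
    # Reweight by the change in each book's smoothed *share* of sales,
    # so both the per-book counts and the changed totals are accounted
    # for; alpha dampens ratios computed from tiny counts.
    probs = np.asarray(probs, dtype=float)
    old = np.asarray(counts_old, dtype=float)
    new = np.asarray(counts_new, dtype=float)
    share_old = (old + alpha) / (old.sum() + alpha * len(old))
    share_new = (new + alpha) / (new.sum() + alpha * len(new))
    adjusted = probs * share_new / share_old
    return adjusted / adjusted.sum()        # back to a distribution

probs = [0.00273501, 0.00273501, 0.21609817, 0.00411186, 0.00925567,
         0.1917159, 0.04742743, 0.00293886, 0.00331788, 0.01592568,
         0.00443959, 0.00344877, 0.04274225, 0.01281825, 0.00274041,
         0.01408142, 0.00274985, 0.12125014, 0.01134686, 0.00730866,
         0.00296034, 0.10991038, 0.04232292, 0.12561873]
c15 = [1, 0, 3878, 5, 2347, 40, 531, 226, 131, 0, 0, 46, 2709, 61, 3, 22,
       7, 279, 4248, 183, 7, 5488, 5513, 10163]
c16 = [0, 0, 6731, 9, 1872, 8, 211, 216, 152, 290, 33, 956, 1205, 245, 4,
       19, 8, 2933, 4684, 152, 3, 5082, 8147, 9031]
adjusted = popularity_correction(probs, c15, c16)
```

Using shares rather than raw counts handles the different yearly totals, and the smoothing keeps the 40-to-8 book from getting an extreme coefficient; the principled version of the same move is a Bayesian prior-shift (reweighting by the ratio of class priors), which this approximates.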