51

Re: The compact notation of binary number designation

In general, the optimal algorithm depends on how the incoming values are distributed over the range.
Here, for example, is a [length + value] variant:
We split the values into ranges whose width is a multiple of 4 bits, i.e.: 0..15, 16..270, 271..4365...
First we write the length in these units; the length field itself is a multiple of 2 bits, i.e.: 00 = 1, 01 = 2, 10 = 3, 1100 = 4, 1101 = 5...
After the length come the N*4 bits of the number.
The number 100500 is written as 1101 0001 1000 1000 1001 0100 -> 24 bits.
I am too lazy to convert the number '127213652436512428173512214698378623721714738352374' to binary, but it takes 196 bits.
There is some redundancy (for example, the number 1 takes 6 bits), but you cannot get anywhere without it :)
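
For illustration, the [length + value] layout can be sketched in Python roughly as follows. This is only a sketch under assumptions: the name `encode_len_value` is made up, the value is stored as-is in N nibbles (no per-range offset is subtracted), and the length field is read as two-bit groups in which `11` means "add 3 and continue".

```python
def encode_len_value(n: int) -> str:
    """Return the code for n >= 0 as a string of '0'/'1' characters."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    # Number of 4-bit groups needed for the value (at least one, so 0 fits).
    nibbles = max(1, (n.bit_length() + 3) // 4)
    # Length field: 00 = 1, 01 = 2, 10 = 3, 1100 = 4, 1101 = 5, ...
    # Every "11" group adds 3 to the length, the last group holds the rest.
    escapes, rest = divmod(nibbles - 1, 3)
    length_field = "11" * escapes + format(rest, "02b")
    # Value field: the number itself, zero-padded to N*4 bits.
    value_field = format(n, "0{}b".format(nibbles * 4))
    return length_field + value_field


print(encode_len_value(100500))   # 110100011000100010010100
print(len(encode_len_value(1)))   # 6
```

With these assumptions the sketch reproduces the 24 bits for 100500, the 6 bits for the number 1 and the 196 bits for the long number quoted in the post.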

52

Re: The compact notation of binary number designation

alekcvp wrote:

The number 100500 is written as 1101 0001 1000 1000 1001 0100 -> 24 bits.

Actually: 1101 0000 0111 0111 1000 1000

53

Re: The compact notation of binary number designation

As I already wrote.
I would not go below a byte, it only leads to grief.
In each byte I would treat the high bit as an end-of-number flag and the low seven bits as significant. Then a number 0-127 fits in one byte, 0..128*128-1 in two bytes, and so on.
The two-byte codes for the range 0..127 I would keep for myself as escape codes.
If losing one bit out of eight is a pity, one could take words (or double words) with a flag bit. Then the loss on the end flag is 1/16, 1/32, 1/64. That is good. But then the range 0-127 is encoded in two bytes: a small loss if small numbers are rare, a big one if there are many of them.
Besides, if numbers in the range 0..32000 are rare, then overall it may pay to put the flag not on a word but on a double word, or even a quad word. Or even on a chunk of some other width.
Depending on how large the numbers are on average, it will pay to use chunks of different length. For switching between chunk widths there are the aforementioned escape codes.
The character codes in UTF are probably arranged in a similar way. I do not know for sure. It seems to me that in the context of this task one could look at how character encodings are designed, and at the libraries that work with them.
It seems to me that noticeably more complex coding algorithms give no noticeable advantage on the one hand, and make reading and writing harder on the other.
But without statistics on which numbers are frequent in the stream and which are rare, the question of the optimal coding cannot be settled correctly anyway.
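
A minimal Python sketch of the byte-with-a-flag scheme described in this post. Assumptions, not taken from the post: the names `encode_varint` and `decode_varint`, the choice to emit the low seven bits first, and the convention that a set high bit marks the last byte.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode n >= 0 in 7-bit groups; the high bit is set on the last byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n == 0:
            out.append(low7 | 0x80)  # end-of-number flag
            return bytes(out)
        out.append(low7)             # more bytes follow


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one number starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80:              # flag found: the number is complete
            return value, pos


print(len(encode_varint(127)), len(encode_varint(128)))  # 1 2
```

In this sketch the two bytes `05 80` also decode to 5, although `encode_varint` never produces them. Such redundant two-byte forms of 0..127 are the spare codes the post suggests reserving as escape codes.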

54

Re: The compact notation of binary number designation

Vladimir Baskakov wrote:

I would not go below a byte, it only leads to grief.
In each byte I would treat the high bit as an end-of-number flag and the low seven bits as significant. Then a number 0-127 fits in one byte, 0..128*128-1 in two bytes, and so on.

Why do you work with bytes? In a bit stream it makes absolutely no difference.
In general, if we take the Levenshtein code and similar ones (with a unary prefix), the basic
assumption is that smaller numbers occupy fewer bits.
But if the histogram of the input numbers is known in advance and has
a certain skew (large numbers occur more often than small ones), then
Omega*, Levenshtein* and the others lose their point. One has to look for other
approaches.
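
As an illustration of the family of codes mentioned here, below is a small Python sketch of the Elias omega code, which is assumed to be the "Omega" meant in the post (the Levenshtein code is built in a similar recursive way and is not shown). The function name `elias_omega` is made up for the example.

```python
def elias_omega(n: int) -> str:
    """Elias omega code of n >= 1 as a string of '0'/'1' characters."""
    if n < 1:
        raise ValueError("Elias omega is defined for n >= 1")
    code = "0"                      # terminating zero
    while n > 1:
        binary = format(n, "b")     # binary form, always starts with '1'
        code = binary + code        # prepend the current group
        n = len(binary) - 1         # next, encode the length of that group
    return code


for value in (1, 2, 3, 4, 16):
    print(value, elias_omega(value))

# Small numbers are cheap, large ones cost more than a fixed 64-bit field.
print(len(elias_omega(2 ** 63)))    # 76
```

The last line shows the point made in the post: if large values dominate the input, a code of this kind loses to a plain fixed-width record.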

55

Re: The compact notation of binary number designation

Mayton, it is convenient for the processor to work with bytes. A byte-based code may be imperfect. But it is not much worse than a good one on a bit string. And paying a little efficiency for convenience of operation is, in my opinion, quite reasonable in practice.
And I agree with the opinion stated earlier in the thread: without any idea of what the input array looks like, it is impossible to estimate the efficiency of a code.
As with sorting: for each algorithm one can pick an input set on which it runs longer.
That is, if the task is treated as a mind game, then a bit map it is; but as soon as you lean towards practical use, it is bytes, words, double and quad words, that is how it goes.
Well, that is my opinion. I respect those who are capable of more abstract, less mundane thinking.

56

Re: The compact notation of binary number designation

Well then, let's put it to the test. On the input we feed a uniform stream of naturals 0..2^64-1.
And we see which coding gets the average below 60.
For a 64-bit number the probability of having 64 bits is one half, more than 62 bits is 3/4, more than 61 is 7/8, > 60 is 15/16...
And so on. In general, the average bit length of a number from a uniform stream is a little less than the full width.
However you contrive. If the stream differs strongly from uniform and contains a lot of small values, then of course that is another matter. But, as was rightly noted above, this non-uniformity and skew then have to be described, before staging a battle of codings in which the most compact one wins.
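
The arithmetic in this post can be checked exactly with a few lines of Python. The helper names `mean_bit_length` and `prob_longer_than` are made up for the example, and the bit length of 0 is taken as 0, as Python's `int.bit_length` does.

```python
from fractions import Fraction


def mean_bit_length(width: int) -> Fraction:
    """Average bit length of a number drawn uniformly from 0..2**width - 1."""
    # Exactly 2**(k-1) numbers have bit length k (k >= 1); 0 has bit length 0.
    weighted = sum(k * 2 ** (k - 1) for k in range(1, width + 1))
    return Fraction(weighted, 2 ** width)


def prob_longer_than(width: int, k: int) -> Fraction:
    """Probability that a uniform width-bit number needs more than k bits."""
    # Numbers that need more than k bits are exactly those >= 2**k.
    return Fraction(2 ** width - 2 ** k, 2 ** width)


print(prob_longer_than(64, 63), prob_longer_than(64, 62))  # 1/2 3/4
print(float(mean_bit_length(64)))                          # 63.0
```

The exact mean is 63 + 2^-64 bits, so even an ideal variable-length record of the bare value cannot average below about 63 bits on a uniform 64-bit stream, and any length information comes on top of that.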

57

Re: The compact notation of binary number designation

Vladimir Baskakov wrote:

Well then, let's put it to the test. On the input we feed a uniform stream of naturals 0..2^64-1.
And we see which coding gets the average below 60.
For a 64-bit number the probability of having 64 bits is one half, more than 62 bits is 3/4, more than 61 is 7/8, > 60 is 15/16...
And so on. In general, the average bit length of a number from a uniform stream is a little less than the full width.

These are slightly different tasks. The original task was: to use the minimum necessary number of bits to write a number. I.e. not to write 32-bit numbers in 64 bits and so on, while still being able to write a number of any width.
And you want to compress a stream of 64-bit numbers with a uniform distribution, which is already a specific kind of compression, and of random data [in general] at that. Different tasks have different solutions, and besides, the second task [in the case of truly random data with a uniform distribution] [most likely] has no solution.

58

Re: The compact notation of binary number designation

alekcvp wrote:

to compress a stream of 64-bit numbers with a uniform distribution ... [in the case of truly random data with a uniform distribution] [most likely] has no solution

If a stream of "truly random data with a uniform distribution" is given to us, it is given in some notation (a "bit notation", since a computer has only bits). The question "can it be compressed" makes sense only when two notations are specified: the initial one and the compressed one.
So why can the initial notation not be compressed? Because all notations are in general equally compact, or because the initial one is always the most compact?

59

Re: The compact notation of binary number designation

Let's make the statement precise.
For example: create a bit code for the natural numbers 1..N that lets them be written to a stream and read back in arbitrary quantity and order, such that the total bit length over the whole set is minimal? Or what? Or with the additional restriction that the bit sequence for a small number be noticeably shorter than for a very large one?
Or an additional minimax requirement: that the maximum length of the encoding sequence over the whole set be as small as possible? For the latter, coding at equal width is best.
---- The criteria ... are not set precisely, hence the different interpretations.

60

Re: The compact notation of binary number designation

For a coding of an arbitrary natural number one can construct a function R(n): how many binary digits are needed to represent the number n.
Which integral, or maximum, or whatever else of this function do we optimize over the whole set of possible codings? It is clear that the simplest coding is: as many ones as the number, then a zero. With such codes one can write any natural number, and r(n) = n. With that in mind,
the wittiest coding, in my opinion, cannot consistently do better than log_2(N).
Which integral of the width function corresponds to the intuitive ideal of a compact representation of a number?

61

Re: The compact notation of binary number designation

Basically, I can offer a simple coding with r(n) = A * log_B(n) + C.
For example, take X. We encode n in the system with base 2^X-1: if X=2, that is ternary.
To each digit of this coding we assign its binary code, X bits wide.
Thus one bit sequence remains unused, and we use it as the end-of-number marker.
So here we have arrived at a logarithmic metric with constants A, B, C. Is that good or not? And why? In what sense are other "codings" better or worse? More practical? Basically, the "byte with a marker" variant gives the same metric.
The larger X is, the fewer superfluous bits there are on large numbers, but the more on small ones, something like that.

62

Re: The compact notation of binary number designation

FXS wrote:

the dictionary {"", "", "", ""... "M-bit range"} should have the prefix property and be optimal (for our task), that is, for example, the bit values
0, 10, 110, 1110... (___) 0
-- should be assigned in it according to our a priori expectations of how often we will write the different numbers.

-- Here (more precisely, one remark above) I had the inspiration to assign the shortest prefix word "0" to the designation of the "one-bit range", the set of numbers occupying one bit, of which there are only two: 0 and 1... That is not compact at all. And if we bluntly invert the order of the prefix words, we get a notation with the same record size for words over the whole range from "one-bit" to "M-bit"... And then why not use the most ordinary notation with a fixed-size number?

63

Re: The compact notation of binary number designation

Vladimir Baskakov wrote:

a simple coding with r(n) = A * log_B(n) + C.
For example, take X

Personally, I did not understand how r and X are related here.

64

Re: The compact notation of binary number designation

FXS wrote:

skipped...
If a stream of "truly random data with a uniform distribution" is given to us, it is given in some notation (a "bit notation", since a computer has only bits). The question "can it be compressed" makes sense only when two notations are specified: the initial one and the compressed one.
So why can the initial notation not be compressed? Because all notations are in general equally compact, or because the initial one is always the most compact?

Compression as a process is the removal of redundancy from data. In a random stream of 64-bit numbers with uniformly distributed values there is no redundancy by definition. Accordingly, it does not lend itself to lossless compression.
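
A quick experiment with Python's standard `zlib` module illustrates the claim. It is only a sketch: a seeded pseudo-random generator stands in for "truly random" data, and the block sizes and the seed are arbitrary choices made for the example.

```python
import random
import zlib

rng = random.Random(12345)
# 8192 "64-bit numbers" worth of pseudo-random bytes (65536 bytes).
random_block = bytes(rng.getrandbits(8) for _ in range(8 * 8192))
# A block of the same size that is pure repetition.
redundant_block = b"0123456789abcdef" * 4096

random_packed = zlib.compress(random_block, 9)
redundant_packed = zlib.compress(redundant_block, 9)

print(len(random_block), len(random_packed))        # packed is not smaller
print(len(redundant_block), len(redundant_packed))  # packed is far smaller
```

The repetitive block shrinks to a small fraction of its size, while the pseudo-random block comes out slightly larger than it went in because of the container overhead.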

65

Re: The compact notation of binary number designation

alekcvp, apparently your statement "a random stream of 64-bit numbers with uniformly distributed values"
-- splices together a description of the distribution (namely: uniformly distributed random integers in the range from 0 to 2^64-1) and a description of the notation (namely: the binary representation of the number, padded with zeros on the left to the full 64 bits). It is interesting how such a nontrivial construction made of related things gets the label "there is no redundancy by definition"...

66

Re: The compact notation of binary number designation

FXS wrote:

it is interesting how such a nontrivial construction made of related things gets the label "there is no redundancy by definition"...

Well, a simple example: when compressing a text we do not have to store every letter and punctuation mark it consists of. After all, any text contains repetitions: words or at least syllables. So one can build a dictionary of these repetitions, sort them by frequency of occurrence and encode them with bit sequences of different length: the more often an entry occurs, the shorter its sequence (Huffman code).
In the case of an absolutely random sequence (of any width), repeating groups of numbers will either not occur at all, or there will be a minimal number of them of minimal length, so it cannot be compressed this way.
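
To make the Huffman remark concrete, here is a small Python sketch that computes only the code lengths a Huffman tree assigns to each symbol. The function name `huffman_code_lengths` and the toy frequency tables are made up for the example.

```python
import heapq
from typing import Dict


def huffman_code_lengths(freqs: Dict[str, int]) -> Dict[str, int]:
    """Return the Huffman code length (in bits) for every symbol."""
    # Heap items: (weight, tie-breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    counter = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes all their symbols one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, counter, syms1 + syms2))
        counter += 1
    return lengths


print(huffman_code_lengths({"a": 5, "b": 2, "c": 1, "d": 1}))
print(huffman_code_lengths({"a": 1, "b": 1, "c": 1, "d": 1}))
```

With skewed frequencies the frequent symbol gets a 1-bit code and the rare ones 3 bits. With equal frequencies every symbol gets 2 bits, the same as a plain fixed-width record, which is the situation described for the uniform random stream.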

67

Re: The compact notation of binary number designation

FXS wrote:

alekcvp, apparently your statement "a random stream of 64-bit numbers with uniformly distributed values"
-- splices together a description of the distribution and a description of the notation

Read it as "a random stream of numbers in the range from 0 to 2^64 - 1".

68

Re: The compact notation of binary number designation

alekcvp wrote:

In the case of an absolutely random sequence (of any width), repeating groups of numbers will either not occur at all, or there will be a minimal number of them of minimal length, so it cannot be compressed this way.

-- There will be repetitions too, and, generally speaking, of any length. And it cannot be compressed because the dictionary has to be saved as well, and it "eats up" the whole gain from compression (this echoes the neighbouring thread about "the generated dictionary").
And if the dictionary could be sent (stored) at a "parallel discount rate", one could make a "dictionary" of a single word (= the text being sent). Then the "archive" would be 1 bit long.

69

Re: The compact notation of binary number designation

alekcvp wrote:

Read it as "a random stream of numbers in the range from 0 to 2^64 - 1".

Then your claim is that the notation "binary representation of the numbers, padded with zeros on the left to the full 64 bits, with these records placed back to back (without prefixes and separators, which are not needed)" is the most compact one (in the sense that no more compact notation exists).

70

Re: The compact notation of binary number designation

FXS wrote:

skipped...
Personally, I did not understand how r and X are related here.

For the case X=2:
2^X-1 = 3.
We encode the number in the ternary system.
Each digit of the code is replaced:
0 with 00
1 with 01
2 with 10
11 is the stop bits.
The length of the code in the ternary system for a number N is log_3(N) digits, each digit is encoded by two bits, so A=2, and the stop bits are also two, so C=2.
X=3: two to the power X is 8, so we encode in the base-7 system:
3 * log_7(n) + 3.
Clever codings are certainly better. But by how much? That is the question.
In practice I would use this code to encode the length of the number's bit sequence, and leave the number itself as is.
0 - 01 11 0
1 - 01 11 1
2 - 10 11 10
-- In practice the benefit from it is balanced out by the harm. So in total the header will be 6*2+2 = 14 bits... That seems like little, but such long integers ... For 64 bits it is 4*2+2 = 10 bits. That is more than 10% of loss...
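
The scheme from this post can be sketched in Python as follows. The names `to_digits`, `stop_code` and `length_prefixed` are made up for the example, and treating the number 0 as the single digit 0 is an assumption.

```python
from typing import List


def to_digits(n: int, base: int) -> List[int]:
    """Digits of n >= 0 in the given base, most significant first."""
    digits = []
    while True:
        n, d = divmod(n, base)
        digits.append(d)
        if n == 0:
            return digits[::-1]


def stop_code(n: int, x: int = 2) -> str:
    """Write n in base 2**x - 1, x bits per digit, then x ones as the stop."""
    base = 2 ** x - 1
    body = "".join(format(d, "0{}b".format(x)) for d in to_digits(n, base))
    return body + "1" * x            # the all-ones group is never a digit


def length_prefixed(n: int, x: int = 2) -> str:
    """Header = stop_code(bit length of n), followed by the raw bits of n."""
    bits = format(n, "b")
    return stop_code(len(bits), x) + bits


for value in (0, 1, 2):
    print(value, length_prefixed(value))   # 01110, 01111, 101110
print(len(stop_code(64)))                  # 10
```

The printed codes match the three examples in the post, and the header for a 64-bit number comes out at the 4*2+2 = 10 bits mentioned there.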

71

Re: The compact notation of binary number designation

alekcvp;
Now imagine that the task is to transmit "a random stream of numbers in the range from 0 to K * 2^64 - 1", where K=0.55... Then the "standard bit notation" is obviously (to me, at least) not the most compact...

72

Re: The compact notation of binary number designation

alekcvp wrote:

In the case of an absolutely random sequence

Only an infinite sequence can be absolutely random. In any finite one, including a randomly taken subsequence, one can look for regularities. It is not certain that you will get lucky, but the chance is not zero.
P.S. Another matter is that:
1. It is nonsense all the same: the problem with the (original) technology is not applicability but practicality;
2. 512 (or even 4096) bits is ... anyway.

73

Re: The compact notation of binary number designation

Basil A. Sidorov wrote:

In any finite one, including a randomly taken subsequence, one can look for regularities. It is not certain that you will get lucky, but the chance is not zero.

You can look for individual regularities and even find them, but the overhead of encoding them eats up the whole gain from the operation.

FXS wrote:

alekcvp;
Now imagine that the task is to transmit "a random stream of numbers in the range from 0 to K * 2^64 - 1", where K=0.55... Then the "standard bit notation" is obviously (to me, at least) not the most compact...

Well, only with a record that uses a fractional number of bits per number, and even then it is not certain that it will be more optimal here. I am too lazy to work it out, to be honest :)
In all other cases the 64-bit record will still be more optimal.
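
The "fractional number of bits per number" idea can be worked out with a short Python sketch: several values are packed into one big integer in mixed radix. The names `pack`, `unpack` and `packed_bits` are made up for the example, and K = 0.55 is approximated by the integer range size 55 * 2^64 // 100.

```python
from typing import List


def pack(values: List[int], range_size: int) -> int:
    """Combine values from 0..range_size-1 into a single integer."""
    acc = 0
    for v in values:
        acc = acc * range_size + v
    return acc


def unpack(acc: int, range_size: int, count: int) -> List[int]:
    """Inverse of pack for a known number of values."""
    values = []
    for _ in range(count):
        acc, v = divmod(acc, range_size)
        values.append(v)
    return values[::-1]


def packed_bits(range_size: int, count: int) -> int:
    """Bits needed for the largest integer pack() can produce."""
    return (range_size ** count - 1).bit_length()


RANGE = 55 * 2 ** 64 // 100          # K = 0.55
for count in (1, 8, 64):
    total = packed_bits(RANGE, count)
    print(count, total, total / count)
```

One number alone still needs 64 bits, but eight numbers packed together need 506 bits instead of 512, which is about 63.25 bits per number. The price is big-integer arithmetic on every read and write.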

74

Re: The compact notation of binary number designation

I do not know about others, but to me it is fairly obvious that any compression algorithm (lossy or lossless) produces a deterministic sequence. Yes, without performing the compression we do not know the result, but repeated compression produces the same result.
What changes if the same algorithm (the compression scheme plus the parameters of a specific mode) processes the same data for a million participants? Nothing.
How many participants are not merely willing but actually able to invent a new compression algorithm adapted to a specific sequence? And to repeat this feat every second?
To be realistic: strictly zero. Which, in fact, is what makes the original idea pointless.

75

Re: The compact notation of binary number designation

Basil A. Sidorov, adapting a compression algorithm to a specific sequence (by choosing the value of some parameter in it) is still not at all called "inventing a new compression algorithm".