Tags: c, character-encoding, binary, computer-science, decoding

Decimal to binary in computers


I've looked for the answer in many places, but they all say the same thing. Whenever someone explains how to convert decimal numbers to binary, the technique of repeated division by two is shown.

If we declare a number in a program (e.g. int x = 229), this doesn't make any sense, as the computer doesn't know what this number is and can't divide it by two.

From what I understand, when an int is declared, the computer initially treats the digits as simple characters. To get the binary number, the only thing that makes sense to me would be something like this (a small C sketch of the same idea follows the list):

  • The computer uses the ASCII table to recognize the symbols (2 - 2 - 9)
  • It takes the symbol 9, finds "00111001" (57 in the ASCII table), which is mapped to its real binary value "1001" (9) [57 - 48]
  • It takes the first 2 (the tens digit), finds "00110010" (50), binary value "10" (2), but knowing it is the second symbol it multiplies it by "1010" (10) and obtains "10100" (20)
  • It adds "10100" to "1001" = 11101 (29)
  • It takes the second 2 (the hundreds digit), finds "10" (2), but knowing it is the third symbol it multiplies it by "1100100" (100) and obtains "11001000" (200)
  • It adds "11001000" to "11101" = 11100101 (229)
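To sanity-check this reasoning, here is a minimal C sketch of exactly that scheme (the names digits and place_value are mine, purely for illustration): each ASCII code is reduced to its digit value by subtracting 48 ('0'), multiplied by its place value, and the partial results are summed, all with ordinary binary integer arithmetic.

    #include <stdio.h>

    int main(void)
    {
        const char *digits = "229";          /* the character sequence '2' '2' '9' */
        int place_value = 1;                 /* 1, 10, 100, ... in binary: 1, 1010, 1100100 */
        int result = 0;

        /* walk the string from the rightmost symbol to the leftmost */
        for (int i = 2; i >= 0; i--) {
            int value = digits[i] - '0';     /* e.g. '9' = 00111001 (57) -> 1001 (9) */
            result += value * place_value;   /* binary multiplication and addition */
            place_value *= 10;
        }

        printf("%d\n", result);              /* prints 229, stored internally as 11100101 */
        return 0;
    }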

Am I on the right track?

This, and the inverse conversion (binary to decimal), would resemble something like the C functions atoi and itoa, but performed entirely with binary arithmetic, exploiting a small knowledge base (the ASCII table, the binary values of 10, 100, 1000, etc.).
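For comparison, here is a rough sketch of what such atoi-like and itoa-like routines could look like when written only in terms of integer (binary) arithmetic; str_to_int and int_to_str are illustrative names, not the real library functions, and they handle non-negative values only.

    #include <stdio.h>

    /* atoi-like: left to right, multiply the running total by 10 and add each digit */
    int str_to_int(const char *s)
    {
        int result = 0;
        while (*s >= '0' && *s <= '9') {
            result = result * 10 + (*s - '0');   /* all done with binary arithmetic */
            s++;
        }
        return result;
    }

    /* itoa-like: repeated division by 10 yields the decimal digits right to left */
    void int_to_str(int n, char *out)
    {
        char tmp[12];
        int i = 0;
        do {
            tmp[i++] = (char)('0' + n % 10);     /* remainder is the next digit */
            n /= 10;
        } while (n > 0);
        while (i > 0)                            /* reverse into the output buffer */
            *out++ = tmp[--i];
        *out = '\0';
    }

    int main(void)
    {
        char buf[12];
        printf("%d\n", str_to_int("229"));       /* 229 */
        int_to_str(229, buf);
        printf("%s\n", buf);                     /* "229" */
        return 0;
    }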

I want to clarify that:

  • I have already looked into topics such as floating-point arithmetic and binary-coded decimal (used in calculators)
  • I know that decimals are only useful for human understanding
  • Ints were chosen in the example for their simplicity

The question is not related to how numbers are stored but rather to how they are interpreted.

Thank you!


Solution

After some research I came to these conclusions:

  • The code we write is nothing more than a series of characters, which will be parsed by the compiler and transformed into instructions for the CPU.
  • Characters are themselves stored as bits, but there is no need to go deeper into that here.
  • These instructions can be written as a sequence of hex numbers for readability, but we are always talking about a bit sequence.
  • Going down to a lower-level language such as assembly, the point is the same: the text (our assembly code) will be converted into machine instructions by the assembler.
  • The CPU itself doesn't contain logic to convert the bits of the character sequence "2-2-9" into 11100101; a conversion must be done first.
  • In a scenario like C code -> ASM -> machine language, this conversion takes place before the machine language is generated.
  • If no one has implemented a method for this conversion (which has nothing to do with the base-10-to-base-2 conversion taught in school) and we have no library and no external tools that do the conversion for us, then the conversion is done as indicated in this answer: Assembler - write binary in assembly from decimal (a C sketch of the same shift-and-add idea follows this list).
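As an illustration of that last point (this is not the code from the linked answer, just the same idea expressed in C): the multiplication by 10 that the parser needs can itself be reduced to shifts and additions, so the whole text-to-binary conversion only uses operations the CPU provides natively. The sketch assumes a non-negative sequence of decimal digits.

    #include <stdio.h>

    /* Parse a string of decimal digits into a binary integer using only
     * shifts and additions: n * 10 == (n << 3) + (n << 1). */
    unsigned parse_decimal(const char *s)
    {
        unsigned n = 0;
        while (*s >= '0' && *s <= '9') {
            n = (n << 3) + (n << 1);   /* n = n * 10, done with shifts and adds */
            n += (unsigned)(*s - '0'); /* add the value of the current digit */
            s++;
        }
        return n;
    }

    int main(void)
    {
        printf("%u\n", parse_decimal("229"));   /* prints 229 (11100101 internally) */
        return 0;
    }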