Adding two integers in CPython

Ignoring cheap operations (arithmetic, logical, bits) and some not-so-cheap operations (branching), this is the operational cost of adding two integers in CPython.

  1. 1 (hot) indirect read: read bytecode buffer. (NEXTOP macro)

  2. 1 jump-table jump: instruction selection (BINARY_ADD in this case).

  3. 2 (hot) indirect reads: pop the evaluation stack. (POP macro, TOP macro)

  4. 2 indirect reads: RTTI check for int type. (PyInt_CheckExact macro)

  5. 2 indirect reads: extract int value. (PyInt_AS_LONG macro)

  6. 1 unsigned long add: the actual addition (maybe).

  7. 1 overflow check. Four paths:

    1. “Fast” path: (cached int object)

      1. 1 range check: determine if value is cached.

      2. 1 (warm) global read: read the cached object.

      3. 1 indirect increment: reference count.

    2. “Slow” path: (not cached; int object ‘free list’ not full)

      1. 1 global read: use free list.

      2. 1 global write: advance free list.

      3. 1 indirect write: set object type. (PyObject_INIT macro)

      4. 1 indirect write: initialize reference count. (_Py_NewReference macro)

      5. 1 indirect write: set integer value.

    3. “Slower” path: (not cached; free list full / not allocated)

      1. 1 memory allocation: allocate new PyIntObject free list.

      2. 1 indirect write: chain new free list to current (full) free list.

      3. 1 global write: set new free list as current free list.

      4. 1000 (hot) indirect writes: link PyIntObjects to each other.

      5. Go to “Slow” path.

    4. “Slowest” path: (addition overflow)

      1. 2 indirect reads: RTTI check.

      2. 1 double-indirect reads: read number method table.

      3. 1 triple-indirect reads: read addition method from table. (NB_BINOP)

      4. 1 quadruple-indirect call: go to int addition. (int_add)

      5. 2 indirect reads: RTTI check for int type. (CONVERT_TO_LONG)

      6. 2 indirect reads: extract int value.

      7. 1 unsigned long add: the actual addition (again).

      8. 1 overflow check (again).

      9. 1 global read: read PyLong_Type number method table.

      10. 1 indirect read: read long addition method.

      11. 1 doubly-indirect call: go to long addition. (long_add)

      12. Convert ints to longs: (CONVERT_BINOP macro)

        1. 2 indirect reads: RTTI check for int type.

        2. 2 indirect reads: extract int value (third time). (PyInt_AS_LONG)

        3. Allocate temporary long objects: (in PyLong_FromLong)

          1. Quite extensive, not going into detail.
      13. 1 object allocation: for addition result.

        1. Quite extensive, not going into detail.
      14. 4* indirect reads & writes: arbitrary-precision addition.

      15. 2 indirect decrements: refcount for temporary longs. (Py_DECREF)

      16. 2 indirect function calls: object deallocation. (_Py_Dealloc)

        1. Quite extensive, not going into detail.
  8. 2 indirect decrements: refcount for popped stack values. (Py_DECREF)

  9. (Possible) 2 indirect function calls: object deallocation. (_Py_Dealloc)

    1. (Returns int to free list)
  10. 1 (hot) indirect write: set top of evaluation stack. (SET_TOP macro)