nickwitha_k (he/him)


  • In an abstract sense, they do mean the same thing but, in a technical sense (the one most relevant to programming), they do not.

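    The standard Python bool type is a subclass of the integer type. In CPython, that means True and False are full Python objects (two singletons) rather than fixed-width 4-byte (int32) or 8-byte (int64) values; sys.getsizeof(True) reports 28 bytes on a typical 64-bit build.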
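
    A quick way to check this in the interpreter, as a minimal sketch. The variable names are just for illustration, and the exact size reported is a CPython implementation detail that can differ between builds:

```python
import sys

# bool is a subclass of int, so True and False behave as 1 and 0.
is_subclass = issubclass(bool, int)
total = True + True  # integer arithmetic works on bools

# sys.getsizeof reports the size of the whole Python object in bytes,
# including the object header, not just the payload.
size_of_true = sys.getsizeof(True)

print(is_subclass, total, size_of_true)
```

    The printed size is larger than a raw 4- or 8-byte integer because it covers the whole object, header included.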

    The numpy.bool_ type is something closer to a native C boolean: inside an array, each element is stored in 1 byte.
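
    The 1-byte storage can be seen from the array's own attributes. A small sketch, with a made-up three-element array:

```python
import numpy as np

# A small boolean array; dtype is given explicitly for clarity.
flags = np.array([True, False, True], dtype=np.bool_)

itemsize = flags.itemsize  # bytes used per element
nbytes = flags.nbytes      # bytes used by the whole data buffer

print(flags.dtype, itemsize, nbytes)
```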

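    So, memory-wise, the two are laid out differently, and one cannot simply be dropped into the slot of the other. An array of numpy.bool_ is a packed buffer of 1-byte values, while a list of Python bools holds pointer-sized references (8 bytes each on a 64-bit build) to the True and False objects. Python itself is memory-safe, so mixing the two up does not cause a buffer overflow or illegal memory access; the mismatch shows up as a type difference instead. A numpy.bool_ is not an instance of bool, and an identity check against True (x is True) fails for a numpy.bool_ value even when it is truthy. Raw-memory problems such as reading past a buffer only come into play when the data is handed to C code that assumes the wrong element width.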
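
    For what it's worth, the difference between the two types is easy to see from Python itself. A minimal sketch (np_flag and py_flag are illustrative names):

```python
import numpy as np

np_flag = np.array([True])[0]  # indexing a bool array yields a numpy.bool_ scalar
py_flag = True                 # the built-in Python bool

same_value = bool(np_flag == py_flag)   # the values compare equal
is_py_bool = isinstance(np_flag, bool)  # but numpy.bool_ is not a Python bool
is_identical = np_flag is True          # and it is not the True singleton

print(type(np_flag).__name__, same_value, is_py_bool, is_identical)
```

    Code that checks isinstance(x, bool) or x is True will therefore treat a numpy.bool_ differently from a Python bool, even though both are truthy.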

    What about converting on the fly? Well, that can be done, but it comes at a performance cost: every function that can accept a numpy.bool_ now has to perform additional type checking, validation, and conversion on every single call. That adds up quickly when processing data at the scales where numpy is called for.
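
    To make the conversion cost concrete, here is a rough sketch of what converting on the fly looks like. as_py_bool is a hypothetical helper written for this example, not part of numpy or the standard library:

```python
import numpy as np

def as_py_bool(value):
    """Hypothetical helper: normalize a Python bool or numpy.bool_ to a Python bool."""
    # This check and conversion run on every call, which is the overhead in question.
    if isinstance(value, (bool, np.bool_)):
        return bool(value)
    raise TypeError(f"expected a boolean, got {type(value).__name__}")

flags = np.array([True, False, True])

# Element-by-element conversion: one Python-level call per value.
converted = [as_py_bool(f) for f in flags]

# Bulk conversion: ndarray.tolist() returns built-in Python bools in one call.
bulk = flags.tolist()

print(converted, bulk)
```

    Both approaches give the same list of Python bools. The per-element version pays for a function call and a type check on each value, which is why converting once at the boundary is usually the cheaper option.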