Comparison of Python `is` Operator and `==` Operator

This article was written in September 2020. It is a small report on the interesting parts of the CPython interpreter for the “Computer Architecture Theory 2” subject of my university. The writing may be a bit messy. If you are interested, please read. :)

Summary

This report reads the CPython source code and runs small code experiments to understand the internal implementation of CPython’s comparison operators and identity operators.

I found that comparison operators such as == and < are compiled to the COMPARE_OP bytecode. CPython calls the tp_richcompare function of the type of the first or the second operand, and tp_richcompare in turn invokes special methods such as __eq__ (for the built-in types this is handled in C without calling Python-level special methods). If a class does not define the special methods __eq__ and __ne__, a default tp_richcompare function is used for == and !=. For ease of use, I have compiled a list of the built-in types’ richcompare functions. The details and the analysis process follow in the article.

Finally, I give a simple analysis of the caching mechanism for small numbers in CPython’s int type. I found that CPython caches the int objects in the range [-5, 256]; the analysis is in the section “Other interesting findings”.

Introduction

In the Python language, there is a pair of identity operators, `is` and `is not`.

They determine whether two objects are the same instance, that is, whether the two objects live at the same memory address. As shown below:

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a == b
True
>>> a is b
False
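As a small sketch of the same idea: id() returns an object’s identity (in CPython, its memory address), so a is b gives the same answer as comparing id(a) with id(b) while both objects are alive. The name c below is an extra variable added only for illustration.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name bound to the same object as a

print(a == b)          # True: the contents are equal
print(a is b)          # False: two distinct objects
print(c is a)          # True: one object, two names
print(id(a) == id(c))  # True: identical identity
```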

However, there is a very unintuitive situation that confuses many people:

>>> a = 1
>>> b = 1
>>> a is b
True
>>> a == b
True
>>> a = 114514
>>> b = 114514
>>> a is b
False
>>> a == b
True

Since the `is` operator does not exist in many other languages, many beginners confuse it with the == operator, and many materials do not explain the difference thoroughly, so I want to study how both are implemented.

This analysis is based on the master branch of the CPython source code, that is, the 3.10 development version as of 2020-08-20. I found that the implementation is quite different from the current stable version, CPython 3.8, so please pay attention to the version difference. Now let’s get to the topic.

Analysis process

First, let’s pull the latest source code and compile Python. Then, in the interactive shell, we use the dis module to look at the bytecode that CPython generates for `is` and ==:

>>> import dis
>>> def test():
...     a = 1
...     b = 1
...     a is b
...     a == b
...
>>> dis.dis(test)
  2           0 LOAD_CONST               1 (1)
              2 STORE_FAST               0 (a)

  3           4 LOAD_CONST               1 (1)
              6 STORE_FAST               1 (b)

  4           8 LOAD_FAST                0 (a)
             10 LOAD_FAST                1 (b)
             12 IS_OP                    0
             14 POP_TOP

  5          16 LOAD_FAST                0 (a)
             18 LOAD_FAST                1 (b)
             20 COMPARE_OP               2 (==)
             22 POP_TOP
             24 LOAD_CONST               0 (None)
             26 RETURN_VALUE

From the bytecode we can see that, in the latest 3.10 development version, `is` is handled by the IS_OP instruction (oparg = 0) and == is handled by the COMPARE_OP instruction (oparg = 2).
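The same check can be scripted instead of read off the listing. The sketch below uses only the standard dis module; the helper opnames and the two tiny functions are names made up for illustration. On CPython 3.9 and later the first list should contain IS_OP, while on 3.8 and earlier both operators show up as COMPARE_OP, and the surrounding opcodes vary between versions.

```python
import dis


def identity(a, b):
    return a is b


def equality(a, b):
    return a == b


def opnames(func):
    """Return the opcode names of a function's bytecode, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


print(opnames(identity))
print(opnames(equality))
```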

is operator

Following this clue, I searched the source code for IS_OP and found the implementation in Python/ceval.c:

case TARGET(IS_OP): {
    PyObject *right = POP();
    PyObject *left = TOP();
    int res = (left == right)^oparg;
    PyObject *b = res ? Py_True : Py_False;
    Py_INCREF(b);
    SET_TOP(b);
    Py_DECREF(left);
    Py_DECREF(right);
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    FAST_DISPATCH();
}

The core of IS_OP is very simple. Every Python object is represented in C as a PyObject, so left (the operand on the left of the operator) and right (the operand on the right) are just pointers. The `is` operator therefore only compares the two pointers: if they are equal (the memory address is the same), the result is Py_True, otherwise it is Py_False. The comparison is XORed with oparg, which inverts the result for `is not`.
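The logic of this opcode can be modelled in a few lines of Python. This is only a sketch: is_op is a hypothetical helper, and id() stands in for the C pointer value, which is how CPython happens to implement it.

```python
def is_op(left, right, oparg):
    """Python model of IS_OP: pointer equality XOR oparg."""
    same_object = id(left) == id(right)  # stands in for the C pointer comparison
    return bool(same_object ^ oparg)     # oparg 0 models `is`, oparg 1 models `is not`


x = object()
y = object()

print(is_op(x, x, 0), x is x)
print(is_op(x, y, 0), x is y)
print(is_op(x, y, 1), x is not y)
```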

== operator

What about the == operator? I searched the source code for COMPARE_OP and found the implementation in Python/ceval.c:

case TARGET(COMPARE_OP): {
    assert(oparg <= Py_GE);
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *res = PyObject_RichCompare(left, right, oparg);
    SET_TOP(res);
    Py_DECREF(left);
    Py_DECREF(right);
    if (res == NULL)
        goto error;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();
}

Here the PyObject_RichCompare function is called with the operand on the left of the operator as left, the operand on the right as right, and the rich comparison opcode (for example Py_EQ) as oparg. Let’s check the source code of this function (located at Objects/object.c):

/* Perform a rich comparison with object result.  This wraps do_richcompare()
   with a check for NULL arguments and a recursion check. */

PyObject *
PyObject_RichCompare(PyObject *v, PyObject *w, int op)
{
    PyThreadState *tstate = _PyThreadState_GET();

    assert(Py_LT <= op && op <= Py_GE);
    if (v == NULL || w == NULL) {
        if (!_PyErr_Occurred(tstate)) {
            PyErr_BadInternalCall();
        }
        return NULL;
    }
    if (_Py_EnterRecursiveCall(tstate, " in comparison")) {
        return NULL;
    }
    PyObject *res = do_richcompare(tstate, v, w, op);
    _Py_LeaveRecursiveCall(tstate);
    return res;
}
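The recursion check in this function can be observed from Python. In the sketch below, two self-referential lists are compared; each level of the list comparison has to compare the first elements, which is the same comparison again, so CPython is expected to stop with a RecursionError instead of overflowing the C stack. The exact error message may differ between versions.

```python
a = []
a.append(a)  # a contains itself
b = []
b.append(b)  # b contains itself

try:
    a == b   # comparing a[0] with b[0] is the comparison a == b again
except RecursionError as exc:
    outcome = f"RecursionError: {exc}"
else:
    outcome = "no error"

print(outcome)
```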

The body of this function is mostly sanity checks (NULL arguments and recursion depth). The key step is the call to do_richcompare, so we continue reading:

/* Perform a rich comparison, raising TypeError when the requested comparison
   operator is not supported. */
static PyObject *
do_richcompare(PyThreadState *tstate, PyObject *v, PyObject *w, int op)
{
    richcmpfunc f;
    PyObject *res;
    int checked_reverse_op = 0;

    if (!Py_IS_TYPE(v, Py_TYPE(w)) &&
        PyType_IsSubtype(Py_TYPE(w), Py_TYPE(v)) &&
        (f = Py_TYPE(w)->tp_richcompare) != NULL) { // <- 1st case
        checked_reverse_op = 1;
        res = (*f)(w, v, _Py_SwappedOp[op]);
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if ((f = Py_TYPE(v)->tp_richcompare) != NULL) { // <- 2nd case
        res = (*f)(v, w, op);
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if (!checked_reverse_op && (f = Py_TYPE(w)->tp_richcompare) != NULL) { // <- 3rd case
        res = (*f)(w, v, _Py_SwappedOp[op]);
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    /* If neither object implements it, provide a sensible default
       for == and !=, but raise an exception for ordering. */
    switch (op) { // <- other situations
    case Py_EQ:
        res = (v == w) ? Py_True : Py_False;
        break;
    case Py_NE:
        res = (v != w) ? Py_True : Py_False;
        break;
    default:
        _PyErr_Format(tstate, PyExc_TypeError,
                      "'%s' not supported between instances of '%.100s' and '%.100s'",
                      opstrings[op],
                      Py_TYPE(v)->tp_name,
                      Py_TYPE(w)->tp_name);
        return NULL;
    }
    Py_INCREF(res);
    return res;
}

This function is relatively long, so let’s analyze it part by part.

Rich compare

This function uses tp_richcompare and the constants Py_EQ and Py_NE. We can find the relevant definitions in Include/object.h:

/* Rich comparison opcodes */
#define Py_LT 0
#define Py_LE 1
#define Py_EQ 2
#define Py_NE 3
#define Py_GT 4
#define Py_GE 5

Type objects in CPython store their behavior in fields called tp slots. Searching the Python C-API documentation (https://docs.python.org/3/c-api/typeobj.html), we can find which special methods tp_richcompare corresponds to.

| PyTypeObject Slot | Type | special methods/attrs | Info |
|---|---|---|---|
| `tp_richcompare` | `richcmpfunc` | `__lt__`, `__le__`, `__eq__`, `__ne__`, `__gt__`, `__ge__` | X G |

The documentation of tp_richcompare gives its signature:

PyObject *tp_richcompare(PyObject *self, PyObject *other, int op);

The first parameter is guaranteed to be an instance of the type that is defined by PyTypeObject.

The function should return the result of the comparison (usually Py_True or Py_False). If the comparison is undefined, it must return Py_NotImplemented, if another error occurred it must return NULL and set an exception condition.

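So I put together the following table:

| op | op method | op arg |
|---|---|---|
| `<` | `__lt__` | `Py_LT = 0` |
| `<=` | `__le__` | `Py_LE = 1` |
| `==` | `__eq__` | `Py_EQ = 2` |
| `!=` | `__ne__` | `Py_NE = 3` |
| `>` | `__gt__` | `Py_GT = 4` |
| `>=` | `__ge__` | `Py_GE = 5` |

We can override any one or more of the above methods to overload the corresponding operators.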
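As an illustration of overriding only some of these methods, here is a minimal sketch with a hypothetical Version class that defines just __eq__ and __lt__. The > operator still works because CPython retries the comparison with the operands swapped, and != works because the default __ne__ inverts __eq__; <= has no implementation on either side, so it fails.

```python
class Version:
    """Hypothetical example class that overloads only == and <."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


old, new = Version(1), Version(2)

print(old < new)   # True: Version.__lt__(old, new)
print(new > old)   # True: retried as Version.__lt__(old, new) with swapped operands
print(old != new)  # True: the default __ne__ inverts __eq__

try:
    old <= new     # neither __le__ nor the swapped __ge__ is implemented
except TypeError as exc:
    le_error = str(exc)
    print(le_error)
```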

Each object in Python is associated with a type. There is a tp_richcompare function pointer in the type to determine the behavior of rich compare between objects.

By calling the tp_richcompare function of a given type, CPython runs the special methods (defined by Python code in general) to do the comparisons.

For example, you can define the special method __eq__ in a custom class. The example below shows that the == operator then calls it:

>>> class MyClass:
...     def __eq__(self, o):
...         print('__eq__(==) method is called!')
...         return True
...
>>> a = MyClass()
>>> b = MyClass()
>>> a == b
__eq__(==) method is called!
True

The built-in types usually have __eq__ and __ne__. Some of them implement all the comparisons, including __lt__. We can see that by running some simple code:

>>> x = 1
>>> x.__lt__(2)
True
>>> x = {1: 1}
>>> x.__lt__({2: 2})
NotImplemented

Many of their special methods are implemented in C (for example, the long_richcompare() function of CPython for the int type). I will list these implementations in the Conclusion.
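A C-implemented special method can also answer NotImplemented, which hands the comparison to the other operand. The sketch below shows this for int and float: int’s comparison does not handle a float argument, so the operator falls back to float’s comparison with the operator swapped.

```python
print((1).__lt__(2.0))  # NotImplemented: int's comparison only handles ints
print((2.0).__gt__(1))  # True: float's comparison accepts an int
print(1 < 2.0)          # True: the operator falls back to the swapped float comparison
```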

You may wonder what difference it makes that __lt__ of the built-in types such as int is implemented in C. For int, the < operator performs better than calling __lt__ explicitly, because the operator reaches the C function directly without any Python-level method invocation. Compare the times below: x < 2 does not need a method invocation, but x.__lt__(2) does.

>>> import statistics, timeit
>>> statistics.mean(timeit.repeat("x < 2", setup="x=1", repeat=100, globals=globals()))
0.05961373900000126
>>> statistics.mean(timeit.repeat("x.__lt__(2)", setup="x=1", repeat=100, globals=globals()))
0.11898647200000483

Now let’s look at the three if statements in the do_richcompare function above. They correspond to three cases.

The first case

if (!Py_IS_TYPE(v, Py_TYPE(w)) &&
    PyType_IsSubtype(Py_TYPE(w), Py_TYPE(v)) &&
    (f = Py_TYPE(w)->tp_richcompare) != NULL) {
    checked_reverse_op = 1;
    res = (*f)(w, v, _Py_SwappedOp[op]);
    if (res != Py_NotImplemented)
        return res;
    Py_DECREF(res);
}

Here v and w are of different types and w’s class is a subclass of v’s class. If w’s type has a richcompare method, that method is called first, with the operands swapped. Here is an example:

>>> class A:
...     pass
...
>>> class B(A):
...     def __eq__(self, o):
...         print('Do eq richcompare in B')
...         return True
...
>>> a = A()
>>> b = B()
>>> a == b
Do eq richcompare in B
True
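The priority of the subclass becomes visible when both classes define __eq__ and both return NotImplemented. In the sketch below (Base, Derived and the calls list are illustrative names), the subclass method runs first, then the base class method, and finally the pointer comparison of the switch branch decides the result.

```python
calls = []  # records the order in which the __eq__ methods run


class Base:
    def __eq__(self, other):
        calls.append("Base.__eq__")
        return NotImplemented


class Derived(Base):
    def __eq__(self, other):
        calls.append("Derived.__eq__")
        return NotImplemented


result = Base() == Derived()

print(calls)   # ['Derived.__eq__', 'Base.__eq__']
print(result)  # False: both declined, so the identity fallback decides
```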

The second case

if ((f = Py_TYPE(v)->tp_richcompare) != NULL) {
    res = (*f)(v, w, op);
    if (res != Py_NotImplemented)
        return res;
    Py_DECREF(res);
}

This case is reached when v and w are of the same type, or w’s class is not a subclass of v’s class, or w has no richcompare method (or it returned Py_NotImplemented in the first case). If v’s type has a richcompare method, that method is called. Let’s do an experiment:

>>> class A:
...     def __eq__(self, o):
...         print('Do eq richcompare in A')
...
>>> class B:
...     pass
...
>>> class C(A):
...     pass
...
>>> class D(B):
...     def __eq__(self, o):
...         print('Do eq richcompare in D')
...
>>> a = A()
>>> b = B()
>>> c = C()
>>> d = D()
>>> a == b
Do eq richcompare in A
>>> a == c
Do eq richcompare in A
>>> a == d
Do eq richcompare in A

The third case

if (!checked_reverse_op && (f = Py_TYPE(w)->tp_richcompare) != NULL) {
    res = (*f)(w, v, _Py_SwappedOp[op]);
    if (res != Py_NotImplemented)
        return res;
    Py_DECREF(res);
}

Here w’s class is not a subclass of v’s class, and v’s richcompare method is missing or returned Py_NotImplemented (for example because v’s class neither defines nor inherits a custom __eq__). If w’s type has a richcompare method, it is called with the operands swapped. We continue to test with the classes defined in the previous example:

>>> c == d
Do eq richcompare in A
>>> b == d
Do eq richcompare in D
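The third case is also what lets a custom class on the right-hand side cooperate with a built-in type on the left. In this sketch, Celsius is a hypothetical class; int’s own comparison declines the unknown type, so CPython gives Celsius.__eq__ a chance with the operands swapped.

```python
class Celsius:
    """Hypothetical wrapper that knows how to compare itself with plain numbers."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __eq__(self, other):
        if isinstance(other, Celsius):
            return self.degrees == other.degrees
        if isinstance(other, (int, float)):
            return self.degrees == other
        return NotImplemented


print(int.__eq__(20, Celsius(20)))  # NotImplemented: int declines the unknown type
print(20 == Celsius(20))            # True: falls through to Celsius.__eq__
print(20 == Celsius(25))            # False
```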

Other situations

Next, the function enters into a switch branch:

/* If neither object implements it, provide a sensible default
   for == and !=, but raise an exception for ordering. */
switch (op) {
case Py_EQ:
    res = (v == w) ? Py_True : Py_False;
    break;
case Py_NE:
    res = (v != w) ? Py_True : Py_False;
    break;
default:
    _PyErr_Format(tstate, PyExc_TypeError,
                  "'%s' not supported between instances of '%.100s' and '%.100s'",
                  opstrings[op],
                  Py_TYPE(v)->tp_name,
                  Py_TYPE(w)->tp_name);
    return NULL;
}

If none of the three cases above returns a result, the function falls through to the switch statement. For == and != it compares the pointers, so the result is the same as for `is` and `is not`; for any other operator it raises a TypeError directly.

Because every type is initialized with a default tp_richcompare (for classes it is object_richcompare, whose mechanism is described below), this switch branch is reached only after the tp_richcompare calls above have returned Py_NotImplemented.

We can do the following experiment to verify.

>>> class A:
...     pass
...
>>> class B:
...     pass
...
>>> a = A()
>>> b = B()
>>> a == b # called tp_richcompare in 2nd and 3rd case but tp_richcompare returns Py_NotImplemented, so it returns in switch branch case Py_EQ
False
>>> a is b # get the same return value as ==
False
>>> a != b # called tp_richcompare in 2nd and 3rd case but tp_richcompare returns Py_NotImplemented, so it returns in switch branch case Py_NE
True
>>> a is not b # get the same return value as !=
True
>>> a > b # called tp_richcompare in 2nd and 3rd case but tp_richcompare returns Py_NotImplemented, so it returns in switch branch case default (will throw a type error)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '>' not supported between instances of 'A' and 'B'

The default implementation of richcompare

Why can we compare instances of class A and class B with == and != even though neither class defines __eq__ or __ne__, while the other operators fail? I found the relevant code in Objects/typeobject.c:

PyTypeObject PyBaseObject_Type = {
    ...
    object_richcompare,                         /* tp_richcompare */
    ...
};

Now we know that all classes use the built-in object_richcompare function by default. Looking at this function, we can see that Py_EQ and Py_NE are implemented by default:

static PyObject *
object_richcompare(PyObject *self, PyObject *other, int op)
{
    PyObject *res;

    switch (op) {

    case Py_EQ:
        /* Return NotImplemented instead of False, so if two
           objects are compared, both get a chance at the
           comparison.  See issue #1393. */
        res = (self == other) ? Py_True : Py_NotImplemented;
        Py_INCREF(res);
        break;

    case Py_NE:
        /* By default, __ne__() delegates to __eq__() and inverts the result,
           unless the latter returns NotImplemented. */
        if (Py_TYPE(self)->tp_richcompare == NULL) {
            res = Py_NotImplemented;
            Py_INCREF(res);
            break;
        }
        res = (*Py_TYPE(self)->tp_richcompare)(self, other, Py_EQ);
        if (res != NULL && res != Py_NotImplemented) {
            int ok = PyObject_IsTrue(res);
            Py_DECREF(res);
            if (ok < 0)
                res = NULL;
            else {
                if (ok)
                    res = Py_False;
                else
                    res = Py_True;
                Py_INCREF(res);
            }
        }
        break;

    default:
        res = Py_NotImplemented;
        Py_INCREF(res);
        break;
    }

    return res;
}

We can see that case Py_EQ is just a pointer comparison: if the pointers are the same it returns Py_True, otherwise it returns Py_NotImplemented. When it returns Py_NotImplemented, do_richcompare moves on to the next candidate in priority order.

But it should be noted that in Py_NE the function tries to call Py_EQ if tp_richcompare has been implemented:

res = (*Py_TYPE(self)->tp_richcompare)(self, other, Py_EQ);
if (res != NULL && res != Py_NotImplemented) {
    int ok = PyObject_IsTrue(res);
    Py_DECREF(res);
    if (ok < 0)
        res = NULL;
    else {
        if (ok)
            res = Py_False;
        else
            res = Py_True;
        Py_INCREF(res);
    }
}

This means that if __ne__ has not been overridden, the function calls the type’s Py_EQ comparison and inverts the result: if the result is true it returns Py_False, otherwise it returns Py_True. A Py_NotImplemented result is passed through unchanged.
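This delegation is easy to observe from Python. In the following sketch, Point is a hypothetical class that defines only __eq__; the != operator still works because the inherited default __ne__ inverts the result of __eq__. When __eq__ answers NotImplemented, so does __ne__, and the comparison ends in the pointer comparison of the switch branch.

```python
class Point:
    """Hypothetical class that defines __eq__ but not __ne__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)

print(p == q)            # True
print(p != q)            # False: default __ne__ inverts __eq__
print(p != Point(3, 4))  # True
print(p != "text")       # True: both sides decline, so identity decides
```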

Rules of do_richcompare

After the above analysis and experiments, I believe that you have a very clear understanding of its implementation. Let me organize a table of rules below.

Python do_richcompare(v, w, op)

  • In the table, ○ means that the class has a non-NULL tp_richcompare and it returns Py_True or Py_False. (tp_richcompare implemented)
  • In the table, × means that the class’s tp_richcompare is NULL or that tp_richcompare returns Py_NotImplemented. (tp_richcompare not implemented)

| Priority | v’s class (Cv) | w’s class (Cw) | Do |
|---|---|---|---|
| 0 | base class of w’s class | subclass of v’s class | `Py_TYPE(w)->tp_richcompare` |
| 1 | ○ | × | `Py_TYPE(v)->tp_richcompare` |
| 2 | × | ○ | `Py_TYPE(w)->tp_richcompare` |
| 3 | × | × | switch branch |
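The priority rules can also be written down as a pure-Python model. This is only a sketch of the dispatch order: rich_compare, call_slot and the two lookup tables are hypothetical names, and looking up the special method on the type stands in for calling the tp_richcompare slot.

```python
# Special-method name for each operator, and the operator used after swapping operands.
METHOD = {"<": "__lt__", "<=": "__le__", "==": "__eq__",
          "!=": "__ne__", ">": "__gt__", ">=": "__ge__"}
SWAPPED = {"<": ">", "<=": ">=", "==": "==", "!=": "!=", ">": "<", ">=": "<="}


def call_slot(obj, other, op):
    """Stand-in for calling Py_TYPE(obj)->tp_richcompare(obj, other, op)."""
    method = getattr(type(obj), METHOD[op], None)
    return NotImplemented if method is None else method(obj, other)


def rich_compare(v, w, op):
    checked_reverse_op = False
    # Priority 0: w's type is a proper subclass of v's type, so w goes first.
    if type(v) is not type(w) and issubclass(type(w), type(v)):
        checked_reverse_op = True
        res = call_slot(w, v, SWAPPED[op])
        if res is not NotImplemented:
            return res
    # Priority 1: ask v.
    res = call_slot(v, w, op)
    if res is not NotImplemented:
        return res
    # Priority 2: ask w with swapped operands, unless that already happened.
    if not checked_reverse_op:
        res = call_slot(w, v, SWAPPED[op])
        if res is not NotImplemented:
            return res
    # Priority 3: identity fallback for == and !=, TypeError for ordering.
    if op == "==":
        return v is w
    if op == "!=":
        return v is not w
    raise TypeError(f"'{op}' not supported between instances of "
                    f"'{type(v).__name__}' and '{type(w).__name__}'")


print(rich_compare(1, 2.5, "<"))             # True: int declines, float answers
print(rich_compare(object(), object(), "=="))  # False: identity fallback
```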

Other interesting findings

Now we have a clear picture of how Python’s `is` and == are implemented. But do you still remember the puzzling code at the beginning?

>>> a = 1
>>> b = 1
>>> a is b
True
>>> a == b
True
>>> a = 114514
>>> b = 114514
>>> a is b
False
>>> a == b
True

Both are integers, so is there any essential difference between 1 and 114514? Let’s do an experiment:

for i in map(str, range(-10, 260)):
    a = int(i)
    b = int(i)
    print(i, a is b)

According to the output, a is b is True for the numbers in the range [-5, 256] and False for the others. This is obviously related to the implementation of Python’s int type. In Python 3, every int, whatever its magnitude, is a PyLongObject.
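Instead of reading the bounds off the printed lines, the experiment can compute them. The sketch below is specific to CPython and describes an implementation detail, so the printed bounds may differ on other interpreters or versions; is_shared is a hypothetical helper name.

```python
def is_shared(n):
    """Return True if building the value n twice yields the same object."""
    # Going through str() creates the ints at run time, so the compiler
    # cannot merge them into a single constant.
    a = int(str(n))
    b = int(str(n))
    return a is b


shared = [n for n in range(-10, 260) if is_shared(n)]
if shared:
    print(min(shared), max(shared))
```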

So let’s read Objects/longobject.c. In the _PyLong_Init function, a small_ints array is filled in and stored in the interpreter state:

for (Py_ssize_t i=0; i < NSMALLNEGINTS + NSMALLPOSINTS; i++) {
    sdigit ival = (sdigit)i - NSMALLNEGINTS;
    int size = (ival < 0) ? -1 : ((ival == 0) ? 0 : 1);

    PyLongObject *v = _PyLong_New(1);
    if (!v) {
        return -1;
    }

    Py_SET_SIZE(v, size);
    v->ob_digit[0] = (digit)abs(ival);

    tstate->interp->small_ints[i] = v;
}

Read the definition of this array:

    /* Small integers are preallocated in this array so that they
       can be shared.
       The integers that are preallocated are those in the range
       -_PY_NSMALLNEGINTS (inclusive) to _PY_NSMALLPOSINTS (not inclusive).
    */
    PyLongObject* small_ints[_PY_NSMALLNEGINTS + _PY_NSMALLPOSINTS];

The size is _PY_NSMALLNEGINTS + _PY_NSMALLPOSINTS, and the covered range is [-5, 257):

/* interpreter state */

#define _PY_NSMALLPOSINTS           257
#define _PY_NSMALLNEGINTS           5

In CPython, int objects created from a C long go through the PyLong_FromLong function, so let’s take a look at it:

/* Create a new int object from a C long int */

PyObject *
PyLong_FromLong(long ival)
{
    PyLongObject *v;
    unsigned long abs_ival;
    unsigned long t;  /* unsigned so >> doesn't propagate sign bit */
    int ndigits = 0;
    int sign;

    if (IS_SMALL_INT(ival)) {
        return get_small_int((sdigit)ival);
    }

    if (ival < 0) {
        /* negate: can't write this as abs_ival = -ival since that
           invokes undefined behaviour when ival is LONG_MIN */
        abs_ival = 0U-(unsigned long)ival;
        sign = -1;
    }
    else {
        abs_ival = (unsigned long)ival;
        sign = ival == 0 ? 0 : 1;
    }

    /* Fast path for single-digit ints */
    if (!(abs_ival >> PyLong_SHIFT)) {
        v = _PyLong_New(1);
        if (v) {
            Py_SET_SIZE(v, sign);
            v->ob_digit[0] = Py_SAFE_DOWNCAST(
                abs_ival, unsigned long, digit);
        }
        return (PyObject*)v;
    }

#if PyLong_SHIFT==15
    /* 2 digits */
    if (!(abs_ival >> 2*PyLong_SHIFT)) {
        v = _PyLong_New(2);
        if (v) {
            Py_SET_SIZE(v, 2 * sign);
            v->ob_digit[0] = Py_SAFE_DOWNCAST(
                abs_ival & PyLong_MASK, unsigned long, digit);
            v->ob_digit[1] = Py_SAFE_DOWNCAST(
                  abs_ival >> PyLong_SHIFT, unsigned long, digit);
        }
        return (PyObject*)v;
    }
#endif

    /* Larger numbers: loop to determine number of digits */
    t = abs_ival;
    while (t) {
        ++ndigits;
        t >>= PyLong_SHIFT;
    }
    v = _PyLong_New(ndigits);
    if (v != NULL) {
        digit *p = v->ob_digit;
        Py_SET_SIZE(v, ndigits * sign);
        t = abs_ival;
        while (t) {
            *p++ = Py_SAFE_DOWNCAST(
                t & PyLong_MASK, unsigned long, digit);
            t >>= PyLong_SHIFT;
        }
    }
    return (PyObject *)v;
}

The function uses the macro IS_SMALL_INT to check whether the number is in the range [-5, 256]. If it is, it calls get_small_int and returns the result:

if (IS_SMALL_INT(ival)) {
    return get_small_int((sdigit)ival);
}

Looking at the get_small_int function, we can now be sure that a number in this range already exists in small_ints, so CPython does not allocate a new object for it.

static PyObject *
get_small_int(sdigit ival)
{
    assert(IS_SMALL_INT(ival));
    PyInterpreterState *interp = _PyInterpreterState_GET();
    PyObject *v = (PyObject*)interp->small_ints[ival + NSMALLNEGINTS];
    Py_INCREF(v);
    return v;
}

At this point, the strange question is answered: CPython caches the numbers stored in small_ints and reuses those objects directly instead of allocating memory for them again.

>>> a = 256
>>> b = 256
>>> c = 257
>>> d = 257
>>> id(a)
1636339312944
>>> id(b)
1636339312944
>>> a is b
True
>>> id(c)
1636346998896
>>> id(d)
1636346999024
>>> c is d
False

It can be seen that this is indeed the case.
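To make the mechanism concrete, here is a toy Python model of the cache. It is a sketch, not CPython’s code: Boxed is a hypothetical stand-in for PyLongObject, and from_long imitates only the small-int shortcut of PyLong_FromLong, with the same two constants as the C source.

```python
NSMALLNEGINTS = 5
NSMALLPOSINTS = 257


class Boxed:
    """Hypothetical stand-in for a heap-allocated PyLongObject."""

    def __init__(self, value):
        self.value = value


# Preallocate one shared object for every value in [-5, 257).
small_ints = [Boxed(i - NSMALLNEGINTS)
              for i in range(NSMALLNEGINTS + NSMALLPOSINTS)]


def is_small_int(ival):
    return -NSMALLNEGINTS <= ival < NSMALLPOSINTS


def from_long(ival):
    """Return the shared object for small values, a fresh object otherwise."""
    if is_small_int(ival):
        return small_ints[ival + NSMALLNEGINTS]
    return Boxed(ival)


print(from_long(256) is from_long(256))  # True: served from the cache
print(from_long(257) is from_long(257))  # False: two fresh objects
```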

Conclusion

This study analyzed the differences and connections between `is` and ==. In general:

  • `is` compares the memory addresses of the two objects, that is, whether they are the same object.
  • == is a rich comparison. Unless the object’s type overrides tp_richcompare, it compares the memory addresses of the two objects by default, which gives the same result as `is`.

Python’s commonly used built-in types such as int, str, list, and dict all provide their own implementation of tp_richcompare.

I searched the source code and found the following implementations (the list may be incomplete).

| Python type | Richcmp method |
|---|---|
| `array.array` | `array_richcompare` |
| `bytearray` | `bytearray_richcompare` |
| `bytes` | `bytes_richcompare` |
| `cell` | `cell_richcompare` |
| `code` | `code_richcompare` |
| `collections.deque` | `deque_richcompare` |
| `collections.OrderedDict` | `odict_richcompare` |
| `complex` | `complex_richcompare` |
| `datetime.timezone` | `timezone_richcompare` |
| `dict` | `dict_richcompare` |
| `dict_items` | `dictview_richcompare` |
| `dict_keys` | `dictview_richcompare` |
| `float` | `float_richcompare` |
| `frozenset` | `set_richcompare` |
| `instancemethod` | `instancemethod_richcompare` |
| `int` | `long_richcompare` |
| `list` | `list_richcompare` |
| `mappingproxy` | `mappingproxy_richcompare` |
| `method` | `method_richcompare` |
| `method-wrapper` | `wrapper_richcompare` |
| `range` | `range_richcompare` |
| `re.Pattern` | `pattern_richcompare` |
| `set` | `set_richcompare` |
| `slice` | `slice_richcompare` |
| `str` | `PyUnicode_RichCompare` |
| `tuple` | `tuplerichcompare` |
| `weakref` | `weakref_richcompare` |

The Python language has accelerated my development cycle. Personally, I like Python very much now, but I had not delved into many of its details. After reading the CPython source code this time, I understood many things that used to puzzle me, and I benefited a lot.