Python performance comparison for creating sets - set() vs. {} literal [duplicate]












15
















This question already has an answer here:




  • Why is faster than list()?

    5 answers



  • Why is `{*l}` faster than `set(l)` - python sets (not really only for sets, for all sequences)

    1 answer




A discussion following this question left me wondering, so I decided to run a few tests and compare the creation time of set((x,y,z)) vs. {x,y,z} for creating sets in Python (I'm using Python 3.7).



I compared the two methods using time and timeit.
Both were consistent* with the following results:



test1 = """
my_set1 = set((1, 2, 3))
"""
print(timeit(test1))


Result: 0.30240735499999993



test2 = """
my_set2 = {1,2,3}
"""
print(timeit(test2))


Result: 0.10771795900000003



So the second method was almost 3 times faster than the first.
This was quite a surprising difference to me.
What is happening under the hood to optimize the performance of the set literal over the set() method in such a way? Which would be advisable for which cases?



* Note: I only show the results of the timeit tests since they are averaged over many samples, and thus perhaps more reliable, but the results when testing with time showed similar differences in both cases.





Edit: I'm aware of this similar question and though it answers certain aspects of my original question, it didn't cover all of it. Sets were not addressed in the question, and as empty sets do not have a literal syntax in python, I was curious how (if at all) set creation using a literal would differ from using the set() method. Also, I wondered how the handling of the tuple parameter in set((x,y,z) happens behind the scenes and what is its possible impact on runtime.
The great answer by coldspeed helped clear things up.










share|improve this question















marked as duplicate by Martijn Pieters python
Users with the  python badge can single-handedly close python questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Jan 17 at 16:36


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.














  • 1





    Related stackoverflow.com/questions/36674083/…

    – snakecharmerb
    Dec 30 '18 at 14:43











  • Yeah, empty set literals don't exist. Non-empty ones do, and you'll see that the answer given to the other question is largely applicable to yours. Let's hope nobody asks a question about tuple literals vs tuple(...).

    – Andras Deak
    Dec 30 '18 at 15:13








  • 1





    @AndrasDeak The two questions are definitely related but I am not quite sure they are duplicates. That question does not address when set() is more appropriate than the literal construction/comprehension syntax, which seems to be the underlying X in this XY problem. I would not close this myself but I would not lose any sleep if it was closed.

    – coldspeed
    Dec 30 '18 at 15:17













  • This is, essentially, the same question as vs list(). The factors that make literal syntax faster are exactly the same.

    – Martijn Pieters
    Jan 17 at 16:35






  • 1





    Fun times with modern Python: It has an "empty set literal", the one-eyed monkey operator: {*()}. It uses unpacking generalizations with an empty tuple (which is a singleton on CPython, so no tuple construction actually occurs) to impose the necessary context so Python sees a set being constructed, rather than a dict.

    – ShadowRanger
    Feb 19 at 21:34
















15
















This question already has an answer here:




  • Why is faster than list()?

    5 answers



  • Why is `{*l}` faster than `set(l)` - python sets (not really only for sets, for all sequences)

    1 answer




A discussion following this question left me wondering, so I decided to run a few tests and compare the creation time of set((x,y,z)) vs. {x,y,z} for creating sets in Python (I'm using Python 3.7).



I compared the two methods using time and timeit.
Both were consistent* with the following results:



test1 = """
my_set1 = set((1, 2, 3))
"""
print(timeit(test1))


Result: 0.30240735499999993



test2 = """
my_set2 = {1,2,3}
"""
print(timeit(test2))


Result: 0.10771795900000003



So the second method was almost 3 times faster than the first.
This was quite a surprising difference to me.
What is happening under the hood to optimize the performance of the set literal over the set() method in such a way? Which would be advisable for which cases?



* Note: I only show the results of the timeit tests since they are averaged over many samples, and thus perhaps more reliable, but the results when testing with time showed similar differences in both cases.





Edit: I'm aware of this similar question and though it answers certain aspects of my original question, it didn't cover all of it. Sets were not addressed in the question, and as empty sets do not have a literal syntax in python, I was curious how (if at all) set creation using a literal would differ from using the set() method. Also, I wondered how the handling of the tuple parameter in set((x,y,z) happens behind the scenes and what is its possible impact on runtime.
The great answer by coldspeed helped clear things up.










share|improve this question















marked as duplicate by Martijn Pieters python
Users with the  python badge can single-handedly close python questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Jan 17 at 16:36


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.














  • 1





    Related stackoverflow.com/questions/36674083/…

    – snakecharmerb
    Dec 30 '18 at 14:43











  • Yeah, empty set literals don't exist. Non-empty ones do, and you'll see that the answer given to the other question is largely applicable to yours. Let's hope nobody asks a question about tuple literals vs tuple(...).

    – Andras Deak
    Dec 30 '18 at 15:13








  • 1





    @AndrasDeak The two questions are definitely related but I am not quite sure they are duplicates. That question does not address when set() is more appropriate than the literal construction/comprehension syntax, which seems to be the underlying X in this XY problem. I would not close this myself but I would not lose any sleep if it was closed.

    – coldspeed
    Dec 30 '18 at 15:17













  • This is, essentially, the same question as vs list(). The factors that make literal syntax faster are exactly the same.

    – Martijn Pieters
    Jan 17 at 16:35






  • 1





    Fun times with modern Python: It has an "empty set literal", the one-eyed monkey operator: {*()}. It uses unpacking generalizations with an empty tuple (which is a singleton on CPython, so no tuple construction actually occurs) to impose the necessary context so Python sees a set being constructed, rather than a dict.

    – ShadowRanger
    Feb 19 at 21:34














15












15








15


0







This question already has an answer here:




  • Why is faster than list()?

    5 answers



  • Why is `{*l}` faster than `set(l)` - python sets (not really only for sets, for all sequences)

    1 answer




A discussion following this question left me wondering, so I decided to run a few tests and compare the creation time of set((x,y,z)) vs. {x,y,z} for creating sets in Python (I'm using Python 3.7).



I compared the two methods using time and timeit.
Both were consistent* with the following results:



test1 = """
my_set1 = set((1, 2, 3))
"""
print(timeit(test1))


Result: 0.30240735499999993



test2 = """
my_set2 = {1,2,3}
"""
print(timeit(test2))


Result: 0.10771795900000003



So the second method was almost 3 times faster than the first.
This was quite a surprising difference to me.
What is happening under the hood to optimize the performance of the set literal over the set() method in such a way? Which would be advisable for which cases?



* Note: I only show the results of the timeit tests since they are averaged over many samples, and thus perhaps more reliable, but the results when testing with time showed similar differences in both cases.





Edit: I'm aware of this similar question and though it answers certain aspects of my original question, it didn't cover all of it. Sets were not addressed in the question, and as empty sets do not have a literal syntax in python, I was curious how (if at all) set creation using a literal would differ from using the set() method. Also, I wondered how the handling of the tuple parameter in set((x,y,z) happens behind the scenes and what is its possible impact on runtime.
The great answer by coldspeed helped clear things up.










share|improve this question

















This question already has an answer here:




  • Why is faster than list()?

    5 answers



  • Why is `{*l}` faster than `set(l)` - python sets (not really only for sets, for all sequences)

    1 answer




A discussion following this question left me wondering, so I decided to run a few tests and compare the creation time of set((x,y,z)) vs. {x,y,z} for creating sets in Python (I'm using Python 3.7).



I compared the two methods using time and timeit.
Both were consistent* with the following results:



test1 = """
my_set1 = set((1, 2, 3))
"""
print(timeit(test1))


Result: 0.30240735499999993



test2 = """
my_set2 = {1,2,3}
"""
print(timeit(test2))


Result: 0.10771795900000003



So the second method was almost 3 times faster than the first.
This was quite a surprising difference to me.
What is happening under the hood to optimize the performance of the set literal over the set() method in such a way? Which would be advisable for which cases?



* Note: I only show the results of the timeit tests since they are averaged over many samples, and thus perhaps more reliable, but the results when testing with time showed similar differences in both cases.





Edit: I'm aware of this similar question and though it answers certain aspects of my original question, it didn't cover all of it. Sets were not addressed in the question, and as empty sets do not have a literal syntax in python, I was curious how (if at all) set creation using a literal would differ from using the set() method. Also, I wondered how the handling of the tuple parameter in set((x,y,z) happens behind the scenes and what is its possible impact on runtime.
The great answer by coldspeed helped clear things up.





This question already has an answer here:




  • Why is faster than list()?

    5 answers



  • Why is `{*l}` faster than `set(l)` - python sets (not really only for sets, for all sequences)

    1 answer








python python-3.x performance set






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 30 '18 at 15:20







yuvgin

















asked Dec 30 '18 at 13:27









yuvginyuvgin

951421




951421




marked as duplicate by Martijn Pieters python
Users with the  python badge can single-handedly close python questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Jan 17 at 16:36


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.









marked as duplicate by Martijn Pieters python
Users with the  python badge can single-handedly close python questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Jan 17 at 16:36


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.










  • 1





    Related stackoverflow.com/questions/36674083/…

    – snakecharmerb
    Dec 30 '18 at 14:43











  • Yeah, empty set literals don't exist. Non-empty ones do, and you'll see that the answer given to the other question is largely applicable to yours. Let's hope nobody asks a question about tuple literals vs tuple(...).

    – Andras Deak
    Dec 30 '18 at 15:13








  • 1





    @AndrasDeak The two questions are definitely related but I am not quite sure they are duplicates. That question does not address when set() is more appropriate than the literal construction/comprehension syntax, which seems to be the underlying X in this XY problem. I would not close this myself but I would not lose any sleep if it was closed.

    – coldspeed
    Dec 30 '18 at 15:17













  • This is, essentially, the same question as vs list(). The factors that make literal syntax faster are exactly the same.

    – Martijn Pieters
    Jan 17 at 16:35






  • 1





    Fun times with modern Python: It has an "empty set literal", the one-eyed monkey operator: {*()}. It uses unpacking generalizations with an empty tuple (which is a singleton on CPython, so no tuple construction actually occurs) to impose the necessary context so Python sees a set being constructed, rather than a dict.

    – ShadowRanger
    Feb 19 at 21:34














  • 1





    Related stackoverflow.com/questions/36674083/…

    – snakecharmerb
    Dec 30 '18 at 14:43











  • Yeah, empty set literals don't exist. Non-empty ones do, and you'll see that the answer given to the other question is largely applicable to yours. Let's hope nobody asks a question about tuple literals vs tuple(...).

    – Andras Deak
    Dec 30 '18 at 15:13








  • 1





    @AndrasDeak The two questions are definitely related but I am not quite sure they are duplicates. That question does not address when set() is more appropriate than the literal construction/comprehension syntax, which seems to be the underlying X in this XY problem. I would not close this myself but I would not lose any sleep if it was closed.

    – coldspeed
    Dec 30 '18 at 15:17













  • This is, essentially, the same question as vs list(). The factors that make literal syntax faster are exactly the same.

    – Martijn Pieters
    Jan 17 at 16:35






  • 1





    Fun times with modern Python: It has an "empty set literal", the one-eyed monkey operator: {*()}. It uses unpacking generalizations with an empty tuple (which is a singleton on CPython, so no tuple construction actually occurs) to impose the necessary context so Python sees a set being constructed, rather than a dict.

    – ShadowRanger
    Feb 19 at 21:34








1




1





Related stackoverflow.com/questions/36674083/…

– snakecharmerb
Dec 30 '18 at 14:43





Related stackoverflow.com/questions/36674083/…

– snakecharmerb
Dec 30 '18 at 14:43













Yeah, empty set literals don't exist. Non-empty ones do, and you'll see that the answer given to the other question is largely applicable to yours. Let's hope nobody asks a question about tuple literals vs tuple(...).

– Andras Deak
Dec 30 '18 at 15:13







Yeah, empty set literals don't exist. Non-empty ones do, and you'll see that the answer given to the other question is largely applicable to yours. Let's hope nobody asks a question about tuple literals vs tuple(...).

– Andras Deak
Dec 30 '18 at 15:13






1




1





@AndrasDeak The two questions are definitely related but I am not quite sure they are duplicates. That question does not address when set() is more appropriate than the literal construction/comprehension syntax, which seems to be the underlying X in this XY problem. I would not close this myself but I would not lose any sleep if it was closed.

– coldspeed
Dec 30 '18 at 15:17







@AndrasDeak The two questions are definitely related but I am not quite sure they are duplicates. That question does not address when set() is more appropriate than the literal construction/comprehension syntax, which seems to be the underlying X in this XY problem. I would not close this myself but I would not lose any sleep if it was closed.

– coldspeed
Dec 30 '18 at 15:17















This is, essentially, the same question as vs list(). The factors that make literal syntax faster are exactly the same.

– Martijn Pieters
Jan 17 at 16:35





This is, essentially, the same question as vs list(). The factors that make literal syntax faster are exactly the same.

– Martijn Pieters
Jan 17 at 16:35




1




1





Fun times with modern Python: It has an "empty set literal", the one-eyed monkey operator: {*()}. It uses unpacking generalizations with an empty tuple (which is a singleton on CPython, so no tuple construction actually occurs) to impose the necessary context so Python sees a set being constructed, rather than a dict.

– ShadowRanger
Feb 19 at 21:34





Fun times with modern Python: It has an "empty set literal", the one-eyed monkey operator: {*()}. It uses unpacking generalizations with an empty tuple (which is a singleton on CPython, so no tuple construction actually occurs) to impose the necessary context so Python sees a set being constructed, rather than a dict.

– ShadowRanger
Feb 19 at 21:34












1 Answer
1






active

oldest

votes


















30














(This is in response to code that has now been edited out of the initial question) You forgot to call the functions in the second case. Making the appropriate modifications, the results are as expected:



test1 = """
def foo1():
my_set1 = set((1, 2, 3))
foo1()
"""
timeit(test1)
# 0.48808742000255734




test2 = """
def foo2():
my_set2 = {1,2,3}
foo2()
"""
timeit(test2)
# 0.3064506609807722




Now, the reason for the difference in timings is because set() is a function call requiring a lookup into the symbol table, whereas the {...} set construction is an artefact of the syntax, and is much faster.



The difference is obvious when observing the disassembled byte code.



import dis

dis.dis("set((1, 2, 3))")
1 0 LOAD_NAME 0 (set)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE




dis.dis("{1, 2, 3}")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_SET 3
8 RETURN_VALUE


In the first case, a function call is made by the instruction CALL_FUNCTION on the tuple (1, 2, 3) (which also comes with its own overhead, although minor—it is loaded as a constant via LOAD_CONST), whereas in the second instruction is just a BUILD_SET call, which is more efficient.



Re: your question regarding the time taken for tuple construction, we see this is actually negligible:



timeit("""(1, 2, 3)""")
# 0.01858693000394851

timeit("""{1, 2, 3}""")
# 0.11971827200613916


Tuples are immutable, so the compiler optimises this operation by loading it as a constant—this is called constant folding (you can see this clearly from the LOAD_CONST instruction above), so the time taken is negligible. This is not seen with sets are they are mutable (Thanks to @user2357112 for pointing this out).





For larger sequences, we see similar behaviour. {..} syntax is faster at constructing sets using set comprehensions as opposed to set() which has to build the set from a generator.



timeit("""set(i for i in range(10000))""", number=1000)
# 0.9775058150407858

timeit("""{i for i in range(10000)}""", number=1000)
# 0.5508635920123197


For reference, you can also use iterable unpacking on more recent versions:



timeit("""{*range(10000)}""", number=1000)
# 0.7462548640323803


Interestingly, however, set() is faster when called directly on range:



timeit("""set(range(10000))""", number=1000)
# 0.3746800610097125


This happens to be faster than the set construction. You will see similar behaviour for other sequences (such as lists).



My recommendation would be to use the {...} set comprehension when constructing set literals, and as an alternative to passing a generator comprehension to set(); and instead use set() to convert an existing sequence/iterable to a set.






share|improve this answer





















  • 1





    He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

    – Daniel Mesejo
    Dec 30 '18 at 13:32






  • 2





    @DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

    – coldspeed
    Dec 30 '18 at 13:34








  • 1





    @DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

    – spectras
    Dec 30 '18 at 13:36








  • 2





    Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

    – user2357112
    Dec 31 '18 at 0:40






  • 1





    @coldspeed: {1, 2, 3} is mutable.

    – user2357112
    Dec 31 '18 at 6:14


















1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









30














(This is in response to code that has now been edited out of the initial question) You forgot to call the functions in the second case. Making the appropriate modifications, the results are as expected:



test1 = """
def foo1():
my_set1 = set((1, 2, 3))
foo1()
"""
timeit(test1)
# 0.48808742000255734




test2 = """
def foo2():
my_set2 = {1,2,3}
foo2()
"""
timeit(test2)
# 0.3064506609807722




Now, the reason for the difference in timings is because set() is a function call requiring a lookup into the symbol table, whereas the {...} set construction is an artefact of the syntax, and is much faster.



The difference is obvious when observing the disassembled byte code.



import dis

dis.dis("set((1, 2, 3))")
1 0 LOAD_NAME 0 (set)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE




dis.dis("{1, 2, 3}")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_SET 3
8 RETURN_VALUE


In the first case, a function call is made by the instruction CALL_FUNCTION on the tuple (1, 2, 3) (which also comes with its own overhead, although minor—it is loaded as a constant via LOAD_CONST), whereas in the second instruction is just a BUILD_SET call, which is more efficient.



Re: your question regarding the time taken for tuple construction, we see this is actually negligible:



timeit("""(1, 2, 3)""")
# 0.01858693000394851

timeit("""{1, 2, 3}""")
# 0.11971827200613916


Tuples are immutable, so the compiler optimises this operation by loading it as a constant—this is called constant folding (you can see this clearly from the LOAD_CONST instruction above), so the time taken is negligible. This is not seen with sets are they are mutable (Thanks to @user2357112 for pointing this out).





For larger sequences, we see similar behaviour. {..} syntax is faster at constructing sets using set comprehensions as opposed to set() which has to build the set from a generator.



timeit("""set(i for i in range(10000))""", number=1000)
# 0.9775058150407858

timeit("""{i for i in range(10000)}""", number=1000)
# 0.5508635920123197


For reference, you can also use iterable unpacking on more recent versions:



timeit("""{*range(10000)}""", number=1000)
# 0.7462548640323803


Interestingly, however, set() is faster when called directly on range:



timeit("""set(range(10000))""", number=1000)
# 0.3746800610097125


This happens to be faster than the set construction. You will see similar behaviour for other sequences (such as lists).



My recommendation would be to use the {...} set comprehension when constructing set literals, and as an alternative to passing a generator comprehension to set(); and instead use set() to convert an existing sequence/iterable to a set.






share|improve this answer





















  • 1





    He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

    – Daniel Mesejo
    Dec 30 '18 at 13:32






  • 2





    @DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

    – coldspeed
    Dec 30 '18 at 13:34








  • 1





    @DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

    – spectras
    Dec 30 '18 at 13:36








  • 2





    Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

    – user2357112
    Dec 31 '18 at 0:40






  • 1





    @coldspeed: {1, 2, 3} is mutable.

    – user2357112
    Dec 31 '18 at 6:14
















30














(This is in response to code that has now been edited out of the initial question) You forgot to call the functions in the second case. Making the appropriate modifications, the results are as expected:



test1 = """
def foo1():
my_set1 = set((1, 2, 3))
foo1()
"""
timeit(test1)
# 0.48808742000255734




test2 = """
def foo2():
my_set2 = {1,2,3}
foo2()
"""
timeit(test2)
# 0.3064506609807722




Now, the reason for the difference in timings is because set() is a function call requiring a lookup into the symbol table, whereas the {...} set construction is an artefact of the syntax, and is much faster.



The difference is obvious when observing the disassembled byte code.



import dis

dis.dis("set((1, 2, 3))")
1 0 LOAD_NAME 0 (set)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE




dis.dis("{1, 2, 3}")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_SET 3
8 RETURN_VALUE


In the first case, a function call is made by the instruction CALL_FUNCTION on the tuple (1, 2, 3) (which also comes with its own overhead, although minor—it is loaded as a constant via LOAD_CONST), whereas in the second instruction is just a BUILD_SET call, which is more efficient.



Re: your question regarding the time taken for tuple construction, we see this is actually negligible:



timeit("""(1, 2, 3)""")
# 0.01858693000394851

timeit("""{1, 2, 3}""")
# 0.11971827200613916


Tuples are immutable, so the compiler optimises this operation by loading it as a constant—this is called constant folding (you can see this clearly from the LOAD_CONST instruction above), so the time taken is negligible. This is not seen with sets are they are mutable (Thanks to @user2357112 for pointing this out).





For larger sequences, we see similar behaviour. {..} syntax is faster at constructing sets using set comprehensions as opposed to set() which has to build the set from a generator.



timeit("""set(i for i in range(10000))""", number=1000)
# 0.9775058150407858

timeit("""{i for i in range(10000)}""", number=1000)
# 0.5508635920123197


For reference, you can also use iterable unpacking on more recent versions:



timeit("""{*range(10000)}""", number=1000)
# 0.7462548640323803


Interestingly, however, set() is faster when called directly on range:



timeit("""set(range(10000))""", number=1000)
# 0.3746800610097125


This happens to be faster than the set construction. You will see similar behaviour for other sequences (such as lists).



My recommendation would be to use the {...} set comprehension when constructing set literals, and as an alternative to passing a generator comprehension to set(); and instead use set() to convert an existing sequence/iterable to a set.






share|improve this answer





















  • 1





    He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

    – Daniel Mesejo
    Dec 30 '18 at 13:32






  • 2





    @DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

    – coldspeed
    Dec 30 '18 at 13:34








  • 1





    @DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

    – spectras
    Dec 30 '18 at 13:36








  • 2





    Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

    – user2357112
    Dec 31 '18 at 0:40






  • 1





    @coldspeed: {1, 2, 3} is mutable.

    – user2357112
    Dec 31 '18 at 6:14














30












30








30







(This is in response to code that has now been edited out of the initial question) You forgot to call the functions in the second case. Making the appropriate modifications, the results are as expected:



test1 = """
def foo1():
my_set1 = set((1, 2, 3))
foo1()
"""
timeit(test1)
# 0.48808742000255734




test2 = """
def foo2():
my_set2 = {1,2,3}
foo2()
"""
timeit(test2)
# 0.3064506609807722




Now, the reason for the difference in timings is because set() is a function call requiring a lookup into the symbol table, whereas the {...} set construction is an artefact of the syntax, and is much faster.



The difference is obvious when observing the disassembled byte code.



import dis

dis.dis("set((1, 2, 3))")
1 0 LOAD_NAME 0 (set)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE




dis.dis("{1, 2, 3}")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_SET 3
8 RETURN_VALUE


In the first case, a function call is made by the instruction CALL_FUNCTION on the tuple (1, 2, 3) (which also comes with its own overhead, although minor—it is loaded as a constant via LOAD_CONST), whereas in the second instruction is just a BUILD_SET call, which is more efficient.



Re: your question regarding the time taken for tuple construction, we see this is actually negligible:



timeit("""(1, 2, 3)""")
# 0.01858693000394851

timeit("""{1, 2, 3}""")
# 0.11971827200613916


Tuples are immutable, so the compiler optimises this operation by loading it as a constant—this is called constant folding (you can see this clearly from the LOAD_CONST instruction above), so the time taken is negligible. This is not seen with sets are they are mutable (Thanks to @user2357112 for pointing this out).





For larger sequences, we see similar behaviour. {..} syntax is faster at constructing sets using set comprehensions as opposed to set() which has to build the set from a generator.



timeit("""set(i for i in range(10000))""", number=1000)
# 0.9775058150407858

timeit("""{i for i in range(10000)}""", number=1000)
# 0.5508635920123197


For reference, you can also use iterable unpacking on more recent versions:



timeit("""{*range(10000)}""", number=1000)
# 0.7462548640323803


Interestingly, however, set() is faster when called directly on range:



timeit("""set(range(10000))""", number=1000)
# 0.3746800610097125


This happens to be faster than the set construction. You will see similar behaviour for other sequences (such as lists).



My recommendation would be to use the {...} set comprehension when constructing set literals, and as an alternative to passing a generator comprehension to set(); and instead use set() to convert an existing sequence/iterable to a set.






share|improve this answer















(This is in response to code that has now been edited out of the initial question) You forgot to call the functions in the second case. Making the appropriate modifications, the results are as expected:



test1 = """
def foo1():
my_set1 = set((1, 2, 3))
foo1()
"""
timeit(test1)
# 0.48808742000255734




test2 = """
def foo2():
my_set2 = {1,2,3}
foo2()
"""
timeit(test2)
# 0.3064506609807722




Now, the reason for the difference in timings is because set() is a function call requiring a lookup into the symbol table, whereas the {...} set construction is an artefact of the syntax, and is much faster.



The difference is obvious when observing the disassembled byte code.



import dis

dis.dis("set((1, 2, 3))")
1 0 LOAD_NAME 0 (set)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE




dis.dis("{1, 2, 3}")
1 0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (2)
4 LOAD_CONST 2 (3)
6 BUILD_SET 3
8 RETURN_VALUE


In the first case, a function call is made by the instruction CALL_FUNCTION on the tuple (1, 2, 3) (which also comes with its own overhead, although minor—it is loaded as a constant via LOAD_CONST), whereas in the second instruction is just a BUILD_SET call, which is more efficient.



Re: your question regarding the time taken for tuple construction, we see this is actually negligible:



timeit("""(1, 2, 3)""")
# 0.01858693000394851

timeit("""{1, 2, 3}""")
# 0.11971827200613916


Tuples are immutable, so the compiler optimises this operation by loading it as a constant—this is called constant folding (you can see this clearly from the LOAD_CONST instruction above), so the time taken is negligible. This is not seen with sets are they are mutable (Thanks to @user2357112 for pointing this out).





For larger sequences, we see similar behaviour. {..} syntax is faster at constructing sets using set comprehensions as opposed to set() which has to build the set from a generator.



timeit("""set(i for i in range(10000))""", number=1000)
# 0.9775058150407858

timeit("""{i for i in range(10000)}""", number=1000)
# 0.5508635920123197


For reference, you can also use iterable unpacking on more recent versions:



timeit("""{*range(10000)}""", number=1000)
# 0.7462548640323803


Interestingly, however, set() is faster when called directly on range:



timeit("""set(range(10000))""", number=1000)
# 0.3746800610097125


This happens to be faster than the set construction. You will see similar behaviour for other sequences (such as lists).



My recommendation would be to use the {...} set comprehension when constructing set literals, and as an alternative to passing a generator comprehension to set(); and instead use set() to convert an existing sequence/iterable to a set.







share|improve this answer














share|improve this answer



share|improve this answer








edited Dec 31 '18 at 6:19

























answered Dec 30 '18 at 13:30









coldspeedcoldspeed

133k23142225




133k23142225








  • 1





    He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

    – Daniel Mesejo
    Dec 30 '18 at 13:32






  • 2





    @DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

    – coldspeed
    Dec 30 '18 at 13:34








  • 1





    @DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

    – spectras
    Dec 30 '18 at 13:36








  • 2





    Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

    – user2357112
    Dec 31 '18 at 0:40






  • 1





    @coldspeed: {1, 2, 3} is mutable.

    – user2357112
    Dec 31 '18 at 6:14














  • 1





    He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

    – Daniel Mesejo
    Dec 30 '18 at 13:32






  • 2





    @DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

    – coldspeed
    Dec 30 '18 at 13:34








  • 1





    @DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

    – spectras
    Dec 30 '18 at 13:36








  • 2





    Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

    – user2357112
    Dec 31 '18 at 0:40






  • 1





    @coldspeed: {1, 2, 3} is mutable.

    – user2357112
    Dec 31 '18 at 6:14








1




1





He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

– Daniel Mesejo
Dec 30 '18 at 13:32





He is also creating a tuple and then pass it to the set function, does the tuple creation times count?

– Daniel Mesejo
Dec 30 '18 at 13:32




2




2





@DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

– coldspeed
Dec 30 '18 at 13:34







@DanielMesejo Perhaps, but I cannot be sure. In this case, maybe not as I believe python interns (caches) tuples, so after the first couple of timeits that might not cause much of a difference in timings. But in theory, yes, it will contribute.

– coldspeed
Dec 30 '18 at 13:34






1




1





@DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

– spectras
Dec 30 '18 at 13:36







@DanielMesejo might be wrong but from the bytecode, it seems it does not create the tuple, it is built as a constant when python code is initially parsed. It just LOAD_CONSTs the tuple. The overhead comes after, building the set from that tuple.

– spectras
Dec 30 '18 at 13:36






2




2





Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

– user2357112
Dec 31 '18 at 0:40





Python doesn't intern the (1, 2, 3) tuple. You're looking at constant folding, not interning.

– user2357112
Dec 31 '18 at 0:40




1




1





@coldspeed: {1, 2, 3} is mutable.

– user2357112
Dec 31 '18 at 6:14





@coldspeed: {1, 2, 3} is mutable.

– user2357112
Dec 31 '18 at 6:14





Popular posts from this blog

Bressuire

Cabo Verde

Gyllenstierna