Removing multiple keys from a dictionary safely

I know that to remove an entry 'key' from my dictionary d safely, you do:

if 'key' in d:
    del d['key']

However, I need to remove multiple entries from a dictionary safely. I was thinking of defining the entries in a tuple, as I will need to do this more than once.

entitiesToRemove = ('a', 'b', 'c')
for x in entitiesToRemove:
    if x in d:
        del d[x]

However, I was wondering if there is a smarter way to do this?

    • Retrieval time from a dictionary is nearly O(1) because of hashing. Unless you are removing a significant proportion of the entries, I don't think you will do much better.
    • If you can spare a bit of memory, you can do for x in set(d) & set(entities_to_remove): del d[x]. This will probably only be more efficient if entities_to_remove is "large".
d = {'some':'data'}
entriesToRemove = ('any', 'iterable')
for k in entriesToRemove:
    d.pop(k, None)
    • This. This is the clever Pythonista's choice. dict.pop() eliminates the need for key existence testing. Excellent.
    • For what it's worth, I think .pop() is bad and unpythonic, and would prefer the accepted answer over this one.
    • A staggering number of people appear unbothered by this :) I don't mind the extra line for existence checking personally, and it's significantly more readable unless you already know about pop(). On the other hand if you were trying to do this in a comprehension or inline lambda this trick could be a big help. I'd also say that it's important, in my opinion, to meet people where they are. I'm not sure that "bad and unpythonic" is going to give the people who are reading these answers the practical guidance they are looking for.
    • There is a particularly good reason to use this. While adding an extra line may improve "readability" or "clarity", it also adds an extra lookup to the dictionary. This method is the removal equivalent of doing setdefault. If implemented correctly (and I'm sure it is), it only does one lookup into the hash-map that is the dict, instead of two.
    • Personally I would be concerned with correctness and maintainability first, and speed only if it is proven to be insufficiently fast. The speed difference between these operations is going to be trivial when zoomed out to the application level. It may be the case that one is faster, but I expect that in real world usage you will neither notice nor care, and if you do notice and care, you will be better served rewriting in something more performant than Python.
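
For readers weighing the comments above, here is a minimal sketch of the pop-based approach wrapped in a reusable function. The name discard_keys and the sample data are hypothetical, chosen only for illustration; this is one way to package the loop, not the answer author's own code.

```python
def discard_keys(d, keys):
    """Remove each key in `keys` from `d` in place, ignoring absent keys."""
    for key in keys:
        # pop() with a default never raises KeyError,
        # so no separate membership test is needed.
        d.pop(key, None)
    return d


settings = {"a": 1, "b": 2, "c": 3}
discard_keys(settings, ("a", "c", "missing"))
print(settings)  # {'b': 2}
```

Returning the dict is optional; it is done here only so the call can be chained or inspected directly.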

Using Dict Comprehensions

final_dict = {key: t[key] for key in t if key not in [key1, key2]}

where key1 and key2 are to be removed.

In the example below, keys "b" and "c" are to be removed; they are kept in the keys list.

>>> a
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
>>> keys = ["b", "c"]
>>> print({key: a[key] for key in a if key not in keys})
{'a': 1, 'd': 4}
    • new dictionary? list comprehension? You should adjust the answer to the person asking the question ;)
    • This solution has a serious performance hit when the variable holding the dict has further use in the program. In other words, a dict from which keys have been deleted is much more efficient than a newly created dict with the retained items.
    • This also has performance issues when the list of keys is too big, as searches take O(n). The whole operation is O(mn), where m is the number of keys in the dict and n the number of keys in the list. I suggest using a set {key1, key2} instead, if possible.
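
The set suggestion in the last comment could look roughly like the sketch below. The variable names keys_to_drop and remaining are made up for the example; the point is only that membership tests against a set are O(1) on average, so the comprehension is O(m) instead of O(mn).

```python
a = {"a": 1, "b": 2, "c": 3, "d": 4}
keys_to_drop = {"b", "c"}  # a set gives average O(1) membership tests

# Builds a new dict; `a` itself is left unchanged.
remaining = {key: value for key, value in a.items() if key not in keys_to_drop}
print(remaining)  # {'a': 1, 'd': 4}
```

Note that this still creates a new dictionary, so it only fits cases where the original does not need to be modified in place.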

Why not like this:

entries = ('a', 'b', 'c')
the_dict = {'b': 'foo'}

def entries_to_remove(entries, the_dict):
    for key in entries:
        if key in the_dict:
            del the_dict[key]

A more compact version was provided by mattbornski using dict.pop()

    • Adding this for people coming from a search engine. If keys are known (when safety is not an issue), multiple keys can be deleted in one line like this del dict['key1'], dict['key2'], dict['key3']
    • Depending on the number of keys you're deleting, it might be more efficient to use for key in set(the_dict) & set(entries): and bypass the key in dict test.

One solution is to use the map and filter functions.

Python 2:

d={"a":1,"b":2,"c":3}
l=("a","b","d")
map(d.__delitem__, filter(d.__contains__,l))
print(d)

Python 3:

d={"a":1,"b":2,"c":3}
l=("a","b","d")
list(map(d.__delitem__, filter(d.__contains__,l)))
print(d)

You get:

{'c': 3}
    • This doesn't work for me with Python 3.4. After d={"a":1,"b":2,"c":3}, l=("a","b","d") and map(d.__delitem__, filter(d.__contains__,l)), print(d) still gives {'a': 1, 'b': 2, 'c': 3}.
    • @Risadinha Use list(map(d.__delitem__, filter(d.__contains__, l))). In Python 3, map returns an iterator, so nothing is deleted until it is consumed.
    • or deque(map(...), maxlen=0) to avoid building a list of None values; first import with from collections import deque
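
The deque variant from the last comment can be sketched as follows, reusing the d and l data from the answer (l is renamed keys here for readability). It relies only on collections.deque from the standard library.

```python
from collections import deque

d = {"a": 1, "b": 2, "c": 3}
keys = ("a", "b", "d")

# filter() keeps only the keys present in d; map() deletes them lazily.
deletions = map(d.__delitem__, filter(d.__contains__, keys))

# A deque with maxlen=0 exhausts the iterator while storing nothing.
deque(deletions, maxlen=0)
print(d)  # {'c': 3}
```

Compared with list(map(...)), this avoids building a throwaway list of None values, at the cost of being less obvious to readers who have not seen the idiom.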

If you also needed to retrieve the values for the keys you are removing, this would be a pretty good way to do it:

valuesRemoved = [d.pop(k, None) for k in entitiesToRemove]

You could of course still do this just for the removal of the keys from d, but you would be unnecessarily creating the list of values with the list comprehension. It is also a little unclear to use a list comprehension just for the function's side effect.

    • Or if you wanted to keep the deleted entries as a dictionary: valuesRemoved = dict((k, d.pop(k, None)) for k in entitiesToRemove) and so on.
    • You can leave out the assignment to a variable. Either way, it's the shortest and most Pythonic solution and should be marked as the correct answer, IMHO.
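
One caveat with collecting removed entries via pop(k, None) is that missing keys show up as None in the result. A possible variation, assuming you only want entries that actually existed, is sketched below; the names removed and keys_to_remove are illustrative.

```python
d = {"a": 1, "b": 2, "c": 3}
keys_to_remove = ("a", "c", "x")  # "x" is not in d

# The `if k in d` filter runs before pop(), so absent keys are skipped
# and the result holds no placeholder values.
removed = {k: d.pop(k) for k in keys_to_remove if k in d}

print(removed)  # {'a': 1, 'c': 3}
print(d)        # {'b': 2}
```

This does two lookups per key instead of one, which is the trade-off for a clean result dictionary.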

Found a solution with pop and map

d = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'b', 'c']
list(map(d.pop, keys))
print(d)

The output of this:

{'d': 'valueD'}

I'm answering this question late because I think it may help anyone who searches for the same thing in the future.

Update

The above code will raise a KeyError if a key does not exist in the dict.

DICTIONARY = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'l', 'c']

def remove_key(key):
    try:
        DICTIONARY.pop(key)
    except KeyError:
        pass  # key was missing; ignore it, or do any other action

list(map(remove_key, keys))
print(DICTIONARY)

output:

{'b': 'valueB', 'd': 'valueD'}

I have no problem with any of the existing answers, but I was surprised to not find this solution:

keys_to_remove = ['a', 'b', 'c']
my_dict = {k: v for k, v in zip("a b c d e f g".split(' '), [0, 1, 2, 3, 4, 5, 6])}

for k in keys_to_remove:
    try:
        del my_dict[k]
    except KeyError:
        pass

assert my_dict == {'d': 3, 'e': 4, 'f': 5, 'g': 6}

Note: I stumbled across this question coming from here. And my answer is related to this answer.
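
A related variation, not part of the answer above, is to swap the try/except block for contextlib.suppress from the standard library. The sketch below uses a smaller sample dictionary than the answer does.

```python
from contextlib import suppress

my_dict = {"a": 0, "b": 1, "d": 3}
keys_to_remove = ["a", "b", "c"]  # "c" is not in my_dict

for k in keys_to_remove:
    # suppress(KeyError) swallows only the missing-key error raised by del.
    with suppress(KeyError):
        del my_dict[k]

print(my_dict)  # {'d': 3}
```

The behaviour is the same as the explicit try/except; the context manager just states the intent in one line.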


Why not:

entriestoremove = (2,5,1)
for e in entriestoremove:
    if e in d:
        del d[e]

I don't know what you mean by "smarter way". Surely there are other ways, maybe with dictionary comprehensions:

entriestoremove = (2,5,1)
newdict = {k: v for k, v in d.items() if k not in entriestoremove}

inline

import functools

#: note: key 'c' is not in d
d = {"a": "avalue", "b": "bvalue", "d": "dvalue"}

entitiesToREmove = ('a', 'b', 'c')

#: python2
map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove)

#: python3

list(map(lambda x: functools.partial(d.pop, x, None)(), entitiesToREmove))

print(d)
# output: {'d': 'dvalue'}

Some timing tests for CPython 3 show that a simple for loop is the fastest way, and it's quite readable. Wrapping it in a function doesn't add much overhead either:

timeit results (10k iterations):

  • all(x.pop(v) for v in r) # 0.85
  • all(map(x.pop, r)) # 0.60
  • list(map(x.pop, r)) # 0.70
  • all(map(x.__delitem__, r)) # 0.44
  • del_all(x, r) # 0.40
  • <inline for loop>(x, r) # 0.35

def del_all(mapping, to_remove):
    """Remove list of elements from mapping."""
    for key in to_remove:
        del mapping[key]

For small iterations, doing that 'inline' was a bit faster, because of the overhead of the function call. But del_all is lint-safe, reusable, and faster than all the python comprehension and mapping constructs.
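
The original benchmark script is not shown, so the following is only a rough sketch of how such a comparison might be set up with timeit. The helpers pop_all and bench, and the sizes used, are assumptions made for this example; absolute numbers will differ by machine.

```python
import timeit


def del_all(mapping, to_remove):
    """Remove every key in to_remove; all keys must be present."""
    for key in to_remove:
        del mapping[key]


def pop_all(mapping, to_remove):
    """Remove every key in to_remove, ignoring missing keys."""
    for key in to_remove:
        mapping.pop(key, None)


def bench(func, size=1000, number=1000):
    """Time func on a fresh dict each run so every key is still present."""
    keys = list(range(size // 2))

    def run():
        data = dict.fromkeys(range(size))
        func(data, keys)

    return timeit.timeit(run, number=number)


for func in (del_all, pop_all):
    print(f"{func.__name__}: {bench(func):.3f}s")
```

Each timed run includes the cost of rebuilding the dictionary, so the figures are only useful for comparing the two functions against each other.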


I think using the fact that the keys can be treated as a set is the nicest way if you're on python 3:

def remove_keys(d, keys):
    to_remove = set(keys)
    filtered_keys = d.keys() - to_remove
    filtered_values = map(d.get, filtered_keys)
    return dict(zip(filtered_keys, filtered_values))

Example:

>>> remove_keys({'k1': 1, 'k3': 3}, ['k1', 'k2'])
{'k3': 3}

It would be nice to have full support for set methods for dictionaries (and not the unholy mess we're getting with Python 3.9) so that you could simply "remove" a set of keys. However, as long as that's not the case, and you have a large dictionary with potentially a large number of keys to remove, you might want to know about the performance. So, I've written some code that builds something large enough for meaningful comparisons: a 100,000 x 1000 matrix, so roughly 100,000,000 items in total.

from itertools import product
from time import perf_counter

# make a complete worksheet 100000 * 1000
start = perf_counter()
prod = product(range(1, 100000), range(1, 1000))
cells = {(x,y):x for x,y in prod}
print(len(cells))

print(f"Create time {perf_counter()-start:.2f}s")
clock = perf_counter()
# remove everything above row 50,000

keys = product(range(50000, 100000), range(1, 100))

# for x,y in keys:
#     del cells[x, y]

for n in map(cells.pop, keys):
    pass

print(len(cells))
stop = perf_counter()
print(f"Removal time {stop-clock:.2f}s")

Dictionaries with 10 million items or more are not unusual in some settings. Comparing the two methods on my local machine, I see a slight improvement when using map and pop, presumably because of fewer function calls, but both take around 2.5s. This pales in comparison to the time required to create the dictionary in the first place (55s), or to the cost of including existence checks within the loop. If missing keys are likely, it's best to create a set that is the intersection of the dictionary keys and your filter:

keys = cells.keys() & keys

In summary: del is already heavily optimised, so don't worry about using it.
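
On a much smaller scale, the intersection step might look like the sketch below. The 4 x 3 grid and the wanted list are invented for illustration; the technique relies on dict.keys() returning a set-like view whose & operator accepts any iterable of hashable items.

```python
# A tiny stand-in for the worksheet: 4 rows x 3 columns = 12 cells.
cells = {(x, y): x for x in range(1, 5) for y in range(1, 4)}
wanted = [(3, 1), (4, 2), (9, 9)]  # (9, 9) is not in cells

# The result is a new set, so deleting from cells while looping is safe.
present = cells.keys() & wanted

for key in present:
    del cells[key]

print(len(cells))  # 10
```

Because the missing key is filtered out up front, the loop can use a plain del with no existence check.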


I'm late to this discussion, but for anyone else who finds it: one solution is to create a list of keys, like so.

k = ['a','b','c','d']

Then use pop() in a list comprehension, or a for loop, to iterate over the keys and pop them one at a time. Note that the comprehension returns a list of the removed values, not a new dictionary.

removed_values = [dictionary.pop(x, 'n/a') for x in k]

The 'n/a' is the default value returned when a key does not exist.
