How to split a list into evenly sized chunks? [Answered]

Answer #1: Split a list into evenly sized chunks

Here’s a generator that yields the chunks you want:

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]

If you’re using Python 2, you should use xrange() instead of range():

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in xrange(0, len(lst), n):
        yield lst[i:i + n]

Also you can simply use list comprehension instead of writing a function, though it’s a good idea to encapsulate operations like this in named functions so that your code is easier to understand. Python 3:

[lst[i:i + n] for i in range(0, len(lst), n)]

Python 2 version:

[lst[i:i + n] for i in xrange(0, len(lst), n)]

Answer #2: Split a list into evenly sized chunks

If you want something super simple:

def chunks(l, n):
    n = max(1, n)
    return (l[i:i+n] for i in range(0, len(l), n))

Use xrange() instead of range() in the case of Python 2.x

Answer #3: How to split a list into evenly sized chunks

I know this is kind of old but nobody yet mentioned numpy.array_split:

import numpy as np

lst = range(50)
np.array_split(lst, 5)
# [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
#  array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
#  array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29]),
#  array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39]),
#  array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])]

Answer #4: Split a list into evenly sized chunks

Directly from the (old) Python documentation (recipes for itertools):

from itertools import izip, chain, repeat

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

The current version, as suggested by J.F.Sebastian:

#from itertools import izip_longest as zip_longest # for Python 2.x
from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

I guess Guido’s time machine works—worked—will work—will have worked—was working again.

These solutions work because [iter(iterable)]*n (or the equivalent in the earlier version) creates one iterator, repeated n times in the list. izip_longest then effectively performs a round-robin of “each” iterator; because this is the same iterator, it is advanced by each such call, resulting in each such zip-roundrobin generating one tuple of n items.

Answer #5: Split a list into evenly sized chunks

I’m surprised nobody has thought of using iter‘s two-argument form:

from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

Demo:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn’t pad; if you want padding, a simple variation on the above will suffice:

from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

Demo:

>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

Like the izip_longest-based solutions, the above always pads. As far as I know, there’s no one- or two-line itertools recipe for a function that optionally pads. By combining the above two approaches, this one comes pretty close:

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval == _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)

Demo:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

I believe this is the shortest chunker proposed that offers optional padding.

As Tomasz Gandor observed, the two padding chunkers will stop unexpectedly if they encounter a long sequence of pad values. Here’s a final variation that works around that problem in a reasonable way:

_no_padding = object()
def chunk(it, size, padval=_no_padding):
    it = iter(it)
    chunker = iter(lambda: tuple(islice(it, size)), ())
    if padval == _no_padding:
        yield from chunker
    else:
        for ch in chunker:
            yield ch if len(ch) == size else ch + (padval,) * (size - len(ch))

Demo:

>>> list(chunk([1, 2, (), (), 5], 2))
[(1, 2), ((), ()), (5,)]
>>> list(chunk([1, 2, None, None, 5], 2, None))
[(1, 2), (None, None), (5, None)]

Answer #6: Split a list into evenly sized chunks

Simple yet elegant

L = range(1, 1000)
print [L[x:x+10] for x in xrange(0, len(L), 10)]

or if you prefer:

def chunks(L, n): return [L[x: x+n] for x in xrange(0, len(L), n)]
chunks(L, 10)

Hope you learned something from this post. The primary source of this article is StackOverflow.

Follow Programming Articles for more!

About ᴾᴿᴼᵍʳᵃᵐᵐᵉʳ

Linux and Python enthusiast, in love with open source since 2014, Writer at programming-articles.com, India.

View all posts by ᴾᴿᴼᵍʳᵃᵐᵐᵉʳ →