author Raymond Hettinger <python@rcn.com> 2007-05-28 05:23:22 (GMT)
committer Raymond Hettinger <python@rcn.com> 2007-05-28 05:23:22 (GMT)
commit 1749a1353217329b5cba90c502377b5cb05fe40b
tree e1d4e2b195dd906167af6517265d0513b5352d37 /Doc
parent a0fcb9384ead24c412b93a4de903788eb5828dbe
Explain when groupby() issues a new group.
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/lib/libitertools.tex 8
1 files changed, 8 insertions, 0 deletions
diff --git a/Doc/lib/libitertools.tex b/Doc/lib/libitertools.tex
index ac6028b..e2f0f0e 100644
--- a/Doc/lib/libitertools.tex
+++ b/Doc/lib/libitertools.tex
@@ -138,6 +138,13 @@ by functions or loops that truncate the stream.
identity function and returns the element unchanged. Generally, the
iterable needs to already be sorted on the same key function.
+ The operation of \function{groupby()} is similar to the \code{uniq} filter
+ in \UNIX{}. It generates a break or new group every time the value
+ of the key function changes (which is why it is usually necessary
+ to have sorted the data using the same key function). That behavior
+ differs from SQL's GROUP BY which aggregates common elements regardless
+ of their input order.
+
The returned group is itself an iterator that shares the underlying
iterable with \function{groupby()}. Because the source is shared, when
the \function{groupby} object is advanced, the previous group is no
@@ -147,6 +154,7 @@ by functions or loops that truncate the stream.
\begin{verbatim}
groups = []
uniquekeys = []
+ data = sorted(data, key=keyfunc)
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)
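
The behavior this patch documents can be checked with a short, self-contained sketch. It is an illustration only and is not part of the commit: the sample `data` list and the first-letter `keyfunc` are assumptions made up for the example. It shows that `groupby()` starts a new group whenever the key changes (much like `uniq`), that sorting with the same key function first gives one group per key, and that a group iterator yields nothing once the `groupby` object has been advanced past it.

```python
from itertools import groupby

# Hypothetical sample data and key function, chosen only for illustration.
data = ["apple", "avocado", "banana", "apricot", "blueberry"]

def keyfunc(word):
    """Group words by their first letter."""
    return word[0]

# Unsorted input: a new group starts every time the key value changes,
# so the same key can appear more than once.
unsorted_groups = [(k, list(g)) for k, g in groupby(data, keyfunc)]

# Sorting on the same key function first yields exactly one group per key.
# sorted() is stable, so items keep their relative order within each group.
sorted_groups = [
    (k, list(g)) for k, g in groupby(sorted(data, key=keyfunc), keyfunc)
]

# The group iterator shares the underlying iterable with groupby():
# once groupby is advanced, the previous group is no longer visible.
grouper = groupby(sorted(data, key=keyfunc), keyfunc)
first_key, first_group = next(grouper)
second_key, second_group = next(grouper)   # advances past the first group
stale_items = list(first_group)            # nothing left to read
live_items = list(second_group)            # current group is still readable

print(unsorted_groups)
print(sorted_groups)
print(stale_items, live_items)
```

This is why the patched example stores each group with `list(g)` inside the loop: the items have to be copied out before the next iteration advances the shared iterator.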