author Barney Gale <barney.gale@gmail.com> 2023-11-13 17:15:56 (GMT)
committer GitHub <noreply@github.com> 2023-11-13 17:15:56 (GMT)
commit cf67ebfb315ce36175f3d425249d7c6560f6d0d5 (patch)
tree 3007eaa7164eba027714b9752aecea60627e6de6 /Doc
parent babb787047e0f7807c8238d3b1a3128dac30bd5c (diff)
GH-72904: Add `glob.translate()` function (#106703)
Add `glob.translate()` function that converts a pathname with shell wildcards to a regular expression. The regular expression is used by pathlib to implement `match()` and `glob()`.

This function differs from `fnmatch.translate()` in that wildcards do not match path separators by default, and that a `*` pattern segment matches precisely one path segment. When *recursive* is set to true, `**` pattern segments match any number of path segments, and `**` cannot appear outside its own segment.

In pathlib, this change speeds up directory walking (because `_make_child_relpath()` does less work), makes path objects smaller (they don't need a `_lines` slot), and removes the need for some gnarly code.

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
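
The difference from `fnmatch.translate()` described in the commit message can be illustrated with a short sketch. It is not part of the commit: the `matches` helper and the `results` dictionary are hypothetical names chosen for this example, `seps="/"` is passed explicitly so the outcome does not depend on the host platform, and the `hasattr` guard is there because `glob.translate()` only exists on Python 3.13 and later.

```python
import fnmatch
import glob
import re


def matches(regex, path):
    """Return True if the translated pattern matches the whole path."""
    return re.match(regex, path) is not None


# fnmatch.translate() has no notion of path separators: "*" crosses "/".
fnmatch_crosses_sep = matches(fnmatch.translate("*.txt"), "foo/bar.txt")

results = {}
if hasattr(glob, "translate"):  # only available on Python 3.13 and later
    # A "*" segment stands for exactly one path segment.
    one = glob.translate("*/*.txt", seps="/")
    # With recursive=True, a "**" segment stands for any number of segments.
    deep = glob.translate("**/*.txt", recursive=True, seps="/")
    results = {
        "one_segment": matches(one, "foo/bar.txt"),
        "two_segments": matches(one, "foo/bar/baz.txt"),
        "recursive": matches(deep, "foo/bar/baz.txt"),
    }

print(fnmatch_crosses_sep, results)
```

On an interpreter that provides `glob.translate()`, the `*/*.txt` pattern should accept `foo/bar.txt` and reject `foo/bar/baz.txt`, while the recursive `**/*.txt` pattern should accept the deeper path. The `fnmatch` pattern accepts `foo/bar.txt` because it treats `/` like any other character.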
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/glob.rst 39
-rw-r--r-- Doc/whatsnew/3.13.rst 7
2 files changed, 46 insertions, 0 deletions
diff --git a/Doc/library/glob.rst b/Doc/library/glob.rst
index 0e4cfe7..8e76d2d 100644
--- a/Doc/library/glob.rst
+++ b/Doc/library/glob.rst
@@ -145,6 +145,45 @@ default. For example, consider a directory containing :file:`card.gif` and
>>> glob.glob('.c*')
['.card.gif']
+
+.. function:: translate(pathname, *, recursive=False, include_hidden=False, seps=None)
+
+ Convert the given path specification to a regular expression for use with
+ :func:`re.match`. The path specification can contain shell-style wildcards.
+
+ For example:
+
+ >>> import glob, re
+ >>>
+ >>> regex = glob.translate('**/*.txt', recursive=True, include_hidden=True)
+ >>> regex
+ '(?s:(?:.+/)?[^/]*\\.txt)\\Z'
+ >>> reobj = re.compile(regex)
+ >>> reobj.match('foo/bar/baz.txt')
+ <re.Match object; span=(0, 15), match='foo/bar/baz.txt'>
+
+ Path separators and segments are meaningful to this function, unlike
+ :func:`fnmatch.translate`. By default wildcards do not match path
+ separators, and ``*`` pattern segments match precisely one path segment.
+
+ If *recursive* is true, the pattern segment "``**``" will match any number
+ of path segments. If "``**``" occurs in any position other than a full
+ pattern segment, :exc:`ValueError` is raised.
+
+ If *include_hidden* is true, wildcards can match path segments that start
+ with a dot (``.``).
+
+ A sequence of path separators may be supplied to the *seps* argument. If
+ not given, :data:`os.sep` and :data:`~os.altsep` (if available) are used.
+
+ .. seealso::
+
+ :meth:`pathlib.PurePath.match` and :meth:`pathlib.Path.glob` methods,
+ which call this function to implement pattern matching and globbing.
+
+ .. versionadded:: 3.13
+
+
.. seealso::
Module :mod:`fnmatch`
diff --git a/Doc/whatsnew/3.13.rst b/Doc/whatsnew/3.13.rst
index 9f9239a..81e133b 100644
--- a/Doc/whatsnew/3.13.rst
+++ b/Doc/whatsnew/3.13.rst
@@ -183,6 +183,13 @@ doctest
:attr:`doctest.TestResults.skipped` attributes.
(Contributed by Victor Stinner in :gh:`108794`.)
+glob
+----
+
+* Add :func:`glob.translate` function that converts a path specification with
+ shell-style wildcards to a regular expression.
+ (Contributed by Barney Gale in :gh:`72904`.)
+
io
--