author | Barney Gale <barney.gale@gmail.com> | 2023-11-13 17:15:56 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-11-13 17:15:56 (GMT) |
commit | cf67ebfb315ce36175f3d425249d7c6560f6d0d5 (patch) | |
tree | 3007eaa7164eba027714b9752aecea60627e6de6 /Doc | |
parent | babb787047e0f7807c8238d3b1a3128dac30bd5c (diff) | |
GH-72904: Add `glob.translate()` function (#106703)
Add `glob.translate()` function that converts a pathname with shell wildcards to a regular expression. The regular expression is used by pathlib to implement `match()` and `glob()`.
This function differs from `fnmatch.translate()` in that wildcards do not match path separators by default, and that a `*` pattern segment matches precisely one path segment. When *recursive* is set to true, `**` pattern segments match any number of path segments, and `**` cannot appear outside its own segment.
In pathlib, this change speeds up directory walking (because `_make_child_relpath()` does less work), makes path objects smaller (they don't need a `_lines` slot), and removes the need for some gnarly code.
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/glob.rst | 39 | ||||
-rw-r--r-- | Doc/whatsnew/3.13.rst | 7 |
2 files changed, 46 insertions, 0 deletions
diff --git a/Doc/library/glob.rst b/Doc/library/glob.rst
index 0e4cfe7..8e76d2d 100644
--- a/Doc/library/glob.rst
+++ b/Doc/library/glob.rst
@@ -145,6 +145,45 @@ default. For example, consider a directory containing :file:`card.gif` and
    >>> glob.glob('.c*')
    ['.card.gif']
 
+
+.. function:: translate(pathname, *, recursive=False, include_hidden=False, seps=None)
+
+   Convert the given path specification to a regular expression for use with
+   :func:`re.match`. The path specification can contain shell-style wildcards.
+
+   For example:
+
+      >>> import glob, re
+      >>>
+      >>> regex = glob.translate('**/*.txt', recursive=True, include_hidden=True)
+      >>> regex
+      '(?s:(?:.+/)?[^/]*\\.txt)\\Z'
+      >>> reobj = re.compile(regex)
+      >>> reobj.match('foo/bar/baz.txt')
+      <re.Match object; span=(0, 15), match='foo/bar/baz.txt'>
+
+   Path separators and segments are meaningful to this function, unlike
+   :func:`fnmatch.translate`. By default wildcards do not match path
+   separators, and ``*`` pattern segments match precisely one path segment.
+
+   If *recursive* is true, the pattern segment "``**``" will match any number
+   of path segments. If "``**``" occurs in any position other than a full
+   pattern segment, :exc:`ValueError` is raised.
+
+   If *include_hidden* is true, wildcards can match path segments that start
+   with a dot (``.``).
+
+   A sequence of path separators may be supplied to the *seps* argument. If
+   not given, :data:`os.sep` and :data:`~os.altsep` (if available) are used.
+
+   .. seealso::
+
+     :meth:`pathlib.PurePath.match` and :meth:`pathlib.Path.glob` methods,
+     which call this function to implement pattern matching and globbing.
+
+   .. versionadded:: 3.13
+
+
 .. seealso::
 
    Module :mod:`fnmatch`
diff --git a/Doc/whatsnew/3.13.rst b/Doc/whatsnew/3.13.rst
index 9f9239a..81e133b 100644
--- a/Doc/whatsnew/3.13.rst
+++ b/Doc/whatsnew/3.13.rst
@@ -183,6 +183,13 @@ doctest
   :attr:`doctest.TestResults.skipped` attributes.
   (Contributed by Victor Stinner in :gh:`108794`.)
 
+glob
+----
+
+* Add :func:`glob.translate` function that converts a path specification with
+  shell-style wildcards to a regular expression.
+  (Contributed by Barney Gale in :gh:`72904`.)
+
 io
 --
 
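The difference from `fnmatch.translate()` described in the commit message can be illustrated with a short sketch. It is not part of the commit: the helper names `fnmatch_matches` and `glob_matches` are hypothetical, `seps` is pinned to `'/'` so the result does not depend on the platform, and the `glob.translate()` call is guarded because the function is only available from Python 3.13.

```python
import fnmatch
import glob
import re

# glob.translate() was added in Python 3.13; guard so the sketch also
# imports cleanly on older interpreters.
HAVE_GLOB_TRANSLATE = hasattr(glob, "translate")


def fnmatch_matches(pattern, path):
    """Match using fnmatch.translate(), which ignores path separators."""
    return re.match(fnmatch.translate(pattern), path) is not None


def glob_matches(pattern, path, **kwargs):
    """Match using glob.translate() with '/' as the only separator.

    Hypothetical helper: extra keyword arguments (recursive,
    include_hidden) are passed straight through to glob.translate().
    """
    regex = glob.translate(pattern, seps="/", **kwargs)
    return re.match(regex, path) is not None


if __name__ == "__main__":
    # fnmatch lets '*' run across the '/' separator.
    print("fnmatch *.txt vs foo/bar.txt:",
          fnmatch_matches("*.txt", "foo/bar.txt"))

    if HAVE_GLOB_TRANSLATE:
        # A wildcard stays inside a single path segment.
        print("glob *.txt vs foo/bar.txt:",
              glob_matches("*.txt", "foo/bar.txt"))
        print("glob */*.txt vs foo/bar.txt:",
              glob_matches("*/*.txt", "foo/bar.txt"))
        # With recursive=True, a '**' segment spans any number of segments.
        print("glob **/*.txt vs foo/bar/baz.txt:",
              glob_matches("**/*.txt", "foo/bar/baz.txt", recursive=True))
        # Dot-prefixed segments need include_hidden=True to match a wildcard.
        print("glob *.txt vs .notes.txt:",
              glob_matches("*.txt", ".notes.txt"))
        print("glob *.txt vs .notes.txt (include_hidden):",
              glob_matches("*.txt", ".notes.txt", include_hidden=True))
```

On Python 3.13 or later, the `fnmatch` helper should accept `foo/bar.txt` for the pattern `*.txt`, while the `glob` helper should reject it and accept it only for `*/*.txt` or, with `recursive=True`, for `**/*.txt`.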