path: root/Lib/pathlib
Commit message | Author | Age | Files | Lines
* GH-128520: Divide pathlib ABCs into three classes (#128523) | Barney Gale | 2025-01-11 | 3 | -60/+74
| | | | | | | | | | | | In the private pathlib ABCs, rename `PurePathBase` to `JoinablePath`, and split `PathBase` into `ReadablePath` and `WritablePath`. This improves the API fit for read-only virtual filesystems. The split of `PathBase` entails a similar split of `CopyWorker` (implements copying) and the test cases in `test_pathlib_abc`. In a later patch, we'll make `WritablePath` inherit directly from `JoinablePath` rather than `ReadablePath`. For a couple of reasons, this isn't quite possible yet.
* GH-127381: pathlib ABCs: remove `PathBase.move()` and `move_into()` (#128337) | Barney Gale | 2025-01-04 | 2 | -35/+30
| | | | | These methods combine `_delete()` and `copy()`, but `_delete()` isn't part of the public interface, and it's unlikely to be added until the pathlib ABCs are made official, or perhaps even later.
* GH-127381: pathlib ABCs: remove uncommon `PurePathBase` methods (#127853) | Barney Gale | 2024-12-29 | 3 | -67/+5
| | | | | | | | Remove `PurePathBase.relative_to()` and `is_relative_to()` because they don't account for *other* being an entirely different kind of path, and they can't use `__eq__()` because it's not on the `PurePathBase` interface. Remove `PurePathBase.drive`, `root`, `is_absolute()` and `as_posix()`. These are all too specific to local filesystems.
* GH-127381: pathlib ABCs: remove `PathBase.stat()` (#128334) | Barney Gale | 2024-12-29 | 2 | -30/+13
| | | | | | | Remove the `PathBase.stat()` method. Its use of the `os.stat_result` API, with its 10 mandatory fields and low-level types, makes it an awkward fit for virtual filesystems. We'll look to add a `PathBase.info` attribute later - see GH-125413.
* GH-127807: pathlib ABCs: move private copying methods to dedicated class (#127810) | Barney Gale | 2024-12-22 | 3 | -248/+261
| | | | | | | | | | | Move 9 private `PathBase` attributes and methods into a new `CopyWorker` class. Change `PathBase.copy` from a method to a `CopyWorker` instance. The methods remain private in the `CopyWorker` class. In future we might make some/all of them public so that user subclasses of `PathBase` can customize the copying process (in particular, reading and writing of metadata), but we'd need to make `PathBase` public first.
* GH-127807: pathlib ABCs: remove a few private attributes (#127851) | Barney Gale | 2024-12-22 | 2 | -56/+64
| | | | | From `PurePathBase` delete `_globber`, `_stack` and `_pattern_str`, and from `PathBase` delete `_glob_selector`. This helps avoid an unpleasant surprise for users who try to use these names.
* GH-127807: pathlib ABCs: remove `PurePathBase._raw_paths` (#127883) | Barney Gale | 2024-12-22 | 3 | -40/+38
| | | | | Remove the `PurePathBase` initializer, and make `with_segments()` and `__str__()` abstract. This allows us to drop the `_raw_paths` attribute, and also the `Parser.join()` protocol method.
* GH-127807: pathlib ABCs: remove `PathBase._unsupported_msg()` (#127855) | Barney Gale | 2024-12-12 | 3 | -35/+41
| | | | | This method helped us customise the `UnsupportedOperation` message depending on the type. But we're aiming to make `PathBase` a proper ABC soon, so `NotImplementedError` is the right exception to raise there.
* GH-127381: pathlib ABCs: remove remaining uncommon `PathBase` methods (#127714) | Barney Gale | 2024-12-12 | 2 | -55/+27
| | | | | | | | | | | | | | | | | | Remove the following methods from `pathlib._abc.PathBase`: - `expanduser()` - `hardlink_to()` - `touch()` - `chmod()` - `lchmod()` - `owner()` - `group()` - `from_uri()` - `as_uri()` These operations aren't regularly supported in virtual filesystems, so they don't win a place in the `PathBase` interface. (Some of them probably don't deserve a place in `Path` :P.) They're quasi-abstract (except `lchmod()`), and they're not called by other `PathBase` methods.
* GH-127381: pathlib ABCs: remove `PathBase.samefile()` and rarer `is_*()` (#127709) | Barney Gale | 2024-12-11 | 2 | -88/+66
| | | | | | | | | | Remove `PathBase.samefile()`, which is fairly specific to the local FS, and relies on `stat()`, which we're aiming to remove from `PathBase`. Also remove `PathBase.is_mount()`, `is_junction()`, `is_block_device()`, `is_char_device()`, `is_fifo()` and `is_socket()`. These rely on POSIX file type numbers that we're aiming to remove from the `PathBase` API.
* GH-127456: pathlib ABCs: add protocol for path parser (#127494) | Barney Gale | 2024-12-09 | 2 | -54/+24
| | | | | | | | | | Change the default value of `PurePathBase.parser` from `ParserBase()` to `posixpath`. As a result, user subclasses of `PurePathBase` and `PathBase` use POSIX path syntax by default, which is very often desirable. Move `pathlib._abc.ParserBase` to `pathlib._types.Parser`, and convert it to a runtime-checkable protocol. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
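A runtime-checkable protocol lets a plain module such as `posixpath` satisfy the parser interface without subclassing anything. The sketch below is an illustration only: `ParserLike` and `VirtualPath` are hypothetical names, and the member list (`sep`, `split`, `splitext`, `normcase`) is an assumption rather than the exact definition in `pathlib._types`.

```python
import ntpath
import posixpath
from typing import Protocol, runtime_checkable


@runtime_checkable
class ParserLike(Protocol):
    """Hypothetical parser protocol; the member list is an assumption."""

    sep: str

    def split(self, path: str) -> tuple[str, str]: ...
    def splitext(self, path: str) -> tuple[str, str]: ...
    def normcase(self, path: str) -> str: ...


class VirtualPath:
    """Hypothetical path type that uses POSIX syntax unless overridden."""

    parser = posixpath  # default parser, mirroring the commit's description

    def __init__(self, text):
        self.text = text

    @property
    def name(self):
        # Delegate syntax decisions to whatever parser the class carries.
        return self.parser.split(self.text)[1]


# isinstance() only checks that the members exist, so modules qualify too.
posix_ok = isinstance(posixpath, ParserLike)
nt_ok = isinstance(ntpath, ParserLike)
str_ok = isinstance("not a parser", ParserLike)
base_name = VirtualPath("archive/docs/readme.txt").name
```

A subclass could set `parser = ntpath` to switch syntax without touching any other method.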
* GH-127381: pathlib ABCs: remove `PathBase.unlink()` and `rmdir()` (#127736) | Barney Gale | 2024-12-08 | 2 | -41/+18
| | | | | | | Virtual filesystems don't always make a distinction between deleting files and empty directories, and sometimes support deleting non-empty directories in a single operation. Here we remove `PathBase.unlink()` and `rmdir()`, leaving `_delete()` as the sole deletion method, now made abstract. I hope to drop the underscore prefix later on.
* GH-127381: pathlib ABCs: remove `PathBase.resolve()` and `absolute()` (#127707) | Barney Gale | 2024-12-06 | 1 | -63/+1
| | | | | | | | | Remove our implementation of POSIX path resolution in `PathBase.resolve()`. This functionality is rather fragile and isn't necessary in most cases. It depends on `PathBase.stat()`, which we're looking to remove. Also remove `PathBase.absolute()`. Many legitimate virtual filesystems lack the notion of a 'current directory', so it's wrong to include in the basic interface.
* GH-127381: pathlib ABCs: remove `PathBase.rename()` and `replace()` (#127658) | Barney Gale | 2024-12-06 | 2 | -36/+18
| | | | These methods are obviated by `PathBase.move()`, which can move directories and supports any `PathBase` object as a target.
* GH-125413: Revert addition of `pathlib.Path.scandir()` method (#127377) | Barney Gale | 2024-12-05 | 2 | -10/+9
| | | | | | | | | | Remove documentation for `pathlib.Path.scandir()`, and rename the method to `_scandir()`. In the private pathlib ABCs, make `iterdir()` abstract and call it from `_scandir()`. It's not worthwhile to add this method at the moment - see discussion: https://discuss.python.org/t/ergonomics-of-new-pathlib-path-scandir/71721 Co-authored-by: Steve Dower <steve.dower@microsoft.com>
* GH-127381: pathlib ABCs: remove `PathBase.cwd()` and `home()` (#127427) | Barney Gale | 2024-11-30 | 2 | -15/+17
| | | | | These classmethods presume that the user has retained the original `__init__()` signature, which may not be the case. Also, many virtual filesystems don't provide current or home directories.
* GH-127381: pathlib ABCs: remove `PathBase.lstat()` (#127382) | Barney Gale | 2024-11-29 | 2 | -10/+9
| | | | | | Remove the `PathBase.lstat()` method, which is a trivial variation of `stat()`. No user-facing changes because the pathlib ABCs are still private.
* pathlib ABCs: tighten up `resolve()` and `absolute()` (#126611) | Barney Gale | 2024-11-09 | 1 | -9/+14
| | | | | | | | | | | | In `PathBase.resolve()`, raise `UnsupportedOperation` if a non-POSIX path parser is used (our implementation uses `posixpath._realpath()`, which produces incorrect results for non-POSIX path flavours.) Also tweak code to call `self.absolute()` upfront rather than supplying an emulated `getcwd()` function. Adjust `PathBase.absolute()` to work somewhat like `resolve()`. If a POSIX path parser is used, we treat the root directory as the current directory. This is the simplest useful behaviour for concrete path types without a current directory cursor.
* pathlib ABCs: support initializing paths with no arguments (#126608) | Barney Gale | 2024-11-09 | 1 | -9/+7
| | | | | | | | In the past I've equivocated about whether to require at least one argument in the `PurePathBase` (and `PathBase`) initializer, and what the default should be if we make it optional. I now have a local use case that has persuaded me to make it optional and default to the empty string (a `zipp.Path`-like class that treats relative and absolute paths similarly.) Happily this brings the base class more in line with `PurePath` and `Path`.
* pathlib ABCs: defer path joining (#126409) | Barney Gale | 2024-11-05 | 2 | -41/+43
| | | | | | | | Defer joining of path segments in the private `PurePathBase` ABC. The new behaviour matches how the public `PurePath` class handles path segments. This removes a hard-to-grok difference between the ABCs and the main classes. It also slightly reduces the size of `PurePath` objects by eliminating a `_raw_path` slot.
* GH-126363: Speed up pattern parsing in `pathlib.Path.glob()` (#126364) | Barney Gale | 2024-11-04 | 1 | -14/+27
| | | | | | | | | | | | | | The implementation of `Path.glob()` does rather a hacky thing: it calls `self.with_segments()` to convert the given pattern to a `Path` object, and then peeks at the private `_raw_path` attribute to see if pathlib removed a trailing slash from the pattern. In this patch, we make `glob()` use a new `_parse_pattern()` classmethod that splits the pattern into parts while preserving information about any trailing slash. This skips the cost of creating a `Path` object, and avoids some path anchor normalization, which makes `Path.glob()` slightly faster. But mostly it's about making the code less naughty. Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
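The idea of splitting a pattern into parts while remembering a trailing slash can be sketched in a few lines. This is not the private `_parse_pattern()` classmethod itself, whose exact return value is not shown in the commit message; `parse_pattern` and its `(parts, ends_with_sep)` result are hypothetical.

```python
def parse_pattern(pattern, sep="/"):
    """Split a glob pattern into parts and note whether it ended with sep.

    Hypothetical helper: no path object is created, so nothing can
    silently normalise the trailing separator away.
    """
    if not pattern:
        raise ValueError("empty pattern")
    parts = [part for part in pattern.split(sep) if part]
    ends_with_sep = pattern.endswith(sep)
    return parts, ends_with_sep


dirs_only = parse_pattern("src/**/")      # trailing slash: directories only
python_files = parse_pattern("src/*.py")  # no trailing slash
```

A caller can use the boolean to restrict matches to directories, which is the information the old approach recovered by peeking at `_raw_path`.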
* GH-125413: pathlib ABCs: use `scandir()` to speed up `walk()` (#126262) | Barney Gale | 2024-11-01 | 1 | -10/+12
| | | | | | | Use the new `PathBase.scandir()` method in `PathBase.walk()`, which greatly reduces the number of `PathBase.stat()` calls needed when walking. There are no user-facing changes, because the pathlib ABCs are still private and `Path.walk()` doesn't use the implementation in its superclass.
* GH-125413: pathlib ABCs: use `scandir()` to speed up `glob()` (#126261) | Barney Gale | 2024-11-01 | 1 | -13/+1
| | | | | | | Use the new `PathBase.scandir()` method in `PathBase.glob()`, which greatly reduces the number of `PathBase.stat()` calls needed when globbing. There are no user-facing changes, because the pathlib ABCs are still private and `Path.glob()` doesn't use the implementation in its superclass.
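The saving described in the `scandir()` commits comes from directory entries that already carry file-type information. The following sketch uses only `os.listdir()` and `os.scandir()` from the standard library to contrast the two approaches; the helper names are hypothetical and it does not reproduce the pathlib code.

```python
import os
import tempfile


def subdirs_via_stat(directory):
    """One stat()-style query per child, issued through os.path.isdir()."""
    return sorted(
        name
        for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )


def subdirs_via_scandir(directory):
    """Use DirEntry.is_dir(), which can often answer from the directory
    listing itself without an extra system call."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "docs"))
    os.mkdir(os.path.join(tmp, "tests"))
    with open(os.path.join(tmp, "setup.py"), "w") as handle:
        handle.write("# placeholder\n")
    stat_result = subdirs_via_stat(tmp)
    scandir_result = subdirs_via_scandir(tmp)
```

Both functions return the same names; the difference lies in how many filesystem queries are needed to get there.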
* GH-125413: Add `pathlib.Path.scandir()` method (#126060) | Barney Gale | 2024-11-01 | 2 | -1/+19
| | | | | Add `pathlib.Path.scandir()` as a trivial wrapper of `os.scandir()`. This will be used to implement several `PathBase` methods more efficiently, including methods that provide `Path.copy()`.
* GH-125069: Fix inconsistent joining in `WindowsPath(PosixPath(...))` (#125156) | Barney Gale | 2024-10-13 | 1 | -2/+2
| | | | | | | | | | `PurePath.__init__()` incorrectly uses the `_raw_paths` of a given `PurePath` object with a different flavour, even though the procedure to join path segments can differ between flavours. This change makes the `_raw_paths`-enabled deferred joining apply _only_ when the path flavours match. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
* GH-119518: Stop interning strings in pathlib (GH-123356) | Barney Gale | 2024-09-02 | 1 | -2/+1
| | | | Remove `sys.intern(str(x))` calls when normalizing a path in pathlib. This speeds up `str(Path('foo/bar'))` by about 10%.
* gh-118761: Speedup pathlib import by deferring shutil (#123520) | Daniel Hollas | 2024-09-01 | 1 | -2/+4
| | | | Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
* GH-73991: Prune `pathlib.Path.copy()` and `copy_into()` arguments (#123337) | Barney Gale | 2024-08-26 | 1 | -33/+19
| | | | | | | | | | | | | | | | Remove *ignore* and *on_error* arguments from `pathlib.Path.copy[_into]()`, because these arguments are under-designed. Specifically: - *ignore* is appropriated from `shutil.copytree()`, but it's not clear how it should apply when the user copies a non-directory. We've changed the callback signature from the `shutil` version, but I'm not confident the new signature is as good as it can be. - *on_error* is a generalisation of `shutil.copytree()`'s error handling, which is to accumulate exceptions and raise a single `shutil.Error` at the end. It's not obvious which solution is better. Additionally, these arguments may be challenging to implement in future user subclasses of `PathBase`, which might utilise a native recursive copying method.
* GH-73991: Make `pathlib.Path.delete()` private. (#123315) | Barney Gale | 2024-08-26 | 2 | -66/+21
| | | | | | | | Per feedback from Paul Moore on GH-123158, it's better to defer making `Path.delete()` public than ship it with under-designed error handling capabilities. We leave a remnant `_delete()` method, which is used by `move()`. Any functionality not needed by `move()` is deleted.
* GH-73991: Add `pathlib.Path.copy_into()` and `move_into()` (#123314) | Barney Gale | 2024-08-26 | 1 | -0/+31
| | | | | | | | | | | | These two methods accept an *existing* directory path, onto which we join the source path's base name to form the final target path. A possible alternative implementation is to check for directories in `copy()` and `move()` and adjust the target path, which is done in several `shutil` functions. This behaviour is helpful in a shell context, but less so in a stored program that explicitly specifies destinations. For example, a user that calls `Path('foo.py').copy('bar.py')` might not imagine that `bar.py/foo.py` would be created, but under the alternative implementation this will happen if `bar.py` is an existing directory.
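The "join the base name onto an existing directory" rule can be shown with `shutil.copyfile()`, which is available on all supported Python versions. This is a sketch under stated assumptions: `copy_file_into` is a hypothetical helper, it handles regular files only, and the explicit `is_dir()` check is my reading of "existing directory" rather than confirmed behaviour.

```python
import shutil
import tempfile
from pathlib import Path


def copy_file_into(source, target_dir):
    """Copy source to target_dir / source.name and return the new path."""
    source = Path(source)
    target_dir = Path(target_dir)
    if not target_dir.is_dir():
        # Assumption: refuse anything that is not already a directory.
        raise NotADirectoryError(f"{target_dir} is not an existing directory")
    target = target_dir / source.name
    shutil.copyfile(source, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "foo.py")
    src.write_text("print('hi')\n")
    dest_dir = Path(tmp, "backup")
    dest_dir.mkdir()
    copied = copy_file_into(src, dest_dir)
    copied_name = copied.relative_to(tmp).as_posix()
    copied_text = copied.read_text()
```

Keeping this separate from a plain copy means the caller states explicitly whether the destination is a final path or a containing directory.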
* GH-73991: Add `pathlib.Path.move()` (#122073) | Barney Gale | 2024-08-25 | 1 | -1/+20
| | | | | Add a `Path.move()` method that moves a file or directory tree, and returns a new `Path` instance pointing to the target. This method is similar to `shutil.move()`, except that it doesn't accept a *copy_function* argument, and it doesn't check whether the destination is an existing directory.
* GH-122890: Fix low-level error handling in `pathlib.Path.copy()` (#122897) | Barney Gale | 2024-08-24 | 1 | -16/+42
| | | | | | | | | Give unique names to our low-level FD copying functions, and try each one in turn. Handle errors appropriately for each implementation: - `fcntl.FICLONE`: suppress `EBADF`, `EOPNOTSUPP`, `ETXTBSY`, `EXDEV` - `posix._fcopyfile`: suppress `EBADF`, `ENOTSUP` - `os.copy_file_range`: suppress `ETXTBSY`, `EXDEV` - `os.sendfile`: suppress `ENOTSOCK`
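The "try each one in turn" structure, with a per-implementation set of suppressed error codes, might look roughly like this. The strategies here are fakes that operate on in-memory streams; `copy_with_fallbacks`, `fake_clone`, `fake_range_copy` and `read_write_copy` are hypothetical names, and the real code works on file descriptors.

```python
import errno
import io


def copy_with_fallbacks(source, target, strategies, fallback):
    """Try each (function, suppressed_errnos) pair, then the fallback.

    Returns the name of the function that performed the copy.
    """
    for copy_func, suppressed in strategies:
        try:
            copy_func(source, target)
        except OSError as exc:
            if exc.errno not in suppressed:
                raise  # unexpected failure: let the caller see it
            continue  # this fast path is unavailable here, try the next
        return copy_func.__name__
    fallback(source, target)
    return fallback.__name__


def fake_clone(source, target):
    # Stand-in for a reflink-style clone that fails across devices.
    raise OSError(errno.EXDEV, "cross-device clone not possible")


def fake_range_copy(source, target):
    # Stand-in for a kernel-side range copy that is also unavailable.
    raise OSError(errno.EXDEV, "cross-device range copy not possible")


def read_write_copy(source, target):
    # Portable last resort: read everything, write everything.
    target.write(source.read())


source = io.BytesIO(b"payload")
target = io.BytesIO()
used = copy_with_fallbacks(
    source,
    target,
    [(fake_clone, {errno.EBADF, errno.EXDEV}), (fake_range_copy, {errno.EXDEV})],
    read_write_copy,
)
```

Tying each suppressed set to one implementation avoids masking an error that is only benign for a different copy mechanism.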
* GH-73991: Disallow copying directory into itself via `pathlib.Path.copy()` (#122924) | Barney Gale | 2024-08-23 | 1 | -6/+37
|
* GH-120754: Disable buffering in Path.read_bytes (#122111) | Cody Maloney | 2024-08-16 | 1 | -1/+1
| `Path.read_bytes()` is used to read a whole file. Buffering (BufferedIO) is geared toward making small, possibly interleaved reads and writes efficient, which adds no value in this case.

On my Mac, running the benchmark:

```python
import pyperf
from pathlib import Path


def read_all(all_paths):
    # Read every file in the given collection.
    for p in all_paths:
        p.read_bytes()


def read_file(path_obj):
    # Read a single file in full.
    path_obj.read_bytes()


# Sanity-check that the benchmark runs from a CPython checkout.
all_rst = list(Path("Doc").glob("**/*.rst"))
all_py = list(Path(".").glob("**/*.py"))
assert all_rst, "Should have found rst files"
assert all_py, "Should have found python source files"

runner = pyperf.Runner()
runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst"))
runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst"))
```

before:

```python
.....................
read_file_small: Mean +- std dev: 6.80 us +- 0.07 us
.....................
read_file_large: Mean +- std dev: 10.8 us +- 0.2 us
```

after:

```python
.....................
read_file_small: Mean +- std dev: 5.67 us +- 0.05 us
.....................
read_file_large: Mean +- std dev: 9.77 us +- 0.52 us
```
* GH-73991: Rework `pathlib.Path.copytree()` into `copy()` (#122369) | Barney Gale | 2024-08-11 | 4 | -99/+63
| | | | | | | | | | Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.) Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying non-directories. This simplifies the interface for users, and nicely complements the upcoming `move()` and `delete()` methods (which will also accept any type of file.) Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
* GH-73991: Rework `pathlib.Path.rmtree()` into `delete()` (#122368) | Barney Gale | 2024-08-07 | 2 | -28/+40
| | | | | | Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting non-directories. This simplifies the interface for users, and nicely complements the upcoming `move()` and `copy()` methods (which will also accept any type of file.)
* Fix duplicated words 'begins with a' in pathlib docstring (#122732) | Виталий Дмитриев | 2024-08-06 | 1 | -1/+1
|
* GH-73991: Support preserving metadata in `pathlib.Path.copytree()` (#121438) | Barney Gale | 2024-07-20 | 1 | -2/+6
| | | | | Add *preserve_metadata* keyword-only argument to `pathlib.Path.copytree()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`.
* GH-73991: Add `pathlib.Path.rmtree()` (#119060) | Barney Gale | 2024-07-20 | 2 | -0/+60
| | | | | | | | | | | Add a `Path.rmtree()` method that removes an entire directory tree, like `shutil.rmtree()`. The signature of the optional *on_error* argument matches the `Path.walk()` argument of the same name, but differs from the *onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within pathlib is probably more important. In the private pathlib ABCs, we add an implementation based on `walk()`. Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
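A walk-based removal with a single-argument error callback can be sketched on top of `os.walk()`. This is only an illustration of the approach: `remove_tree` is a hypothetical function, and raising when `on_error` is `None` is an assumption about the default rather than documented behaviour.

```python
import os
import tempfile


def remove_tree(root, on_error=None):
    """Delete a directory tree bottom-up, reporting each OSError."""

    def report(exc):
        # on_error receives the OSError instance, as Path.walk() callbacks do.
        if on_error is None:
            raise exc
        on_error(exc)

    def attempt(func, path):
        try:
            func(path)
        except OSError as exc:
            report(exc)

    # topdown=False yields children before parents, so each directory is
    # already empty by the time it is removed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False, onerror=report):
        for name in filenames:
            attempt(os.unlink, os.path.join(dirpath, name))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # Symlinks to directories are unlinked, never descended into.
            attempt(os.unlink if os.path.islink(path) else os.rmdir, path)
    attempt(os.rmdir, root)


with tempfile.TemporaryDirectory() as tmp:
    tree = os.path.join(tmp, "tree")
    os.makedirs(os.path.join(tree, "a", "b"))
    with open(os.path.join(tree, "a", "file.txt"), "w") as handle:
        handle.write("data")
    remove_tree(tree)
    tree_exists_after = os.path.exists(tree)

    errors = []
    remove_tree(os.path.join(tmp, "missing"), on_error=errors.append)
```

Passing `errors.append` collects failures instead of stopping at the first one, which is the kind of policy the callback leaves to the caller.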
* GH-73991: Support preserving metadata in `pathlib.Path.copy()` (#120806) | Barney Gale | 2024-07-06 | 3 | -5/+137
| | | | | Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied. Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
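The intersection step can be illustrated with plain sets. The class and function below are hypothetical stand-ins (`MemoryNode`, `copy_metadata`), the attribute names drop the leading underscore, and the metadata keys are invented for the example; only the "intersect, read, then write" flow comes from the description above.

```python
class MemoryNode:
    """Hypothetical stand-in for a path object with dict-backed metadata."""

    def __init__(self, readable=(), writable=(), metadata=None):
        self.readable_metadata = frozenset(readable)
        self.writable_metadata = frozenset(writable)
        self.metadata = dict(metadata or {})

    def read_metadata(self, keys):
        # Return only the requested keys that are actually present.
        return {key: self.metadata[key] for key in keys if key in self.metadata}

    def write_metadata(self, metadata):
        self.metadata.update(metadata)


def copy_metadata(source, target):
    """Transfer only what the source can read and the target can store."""
    keys = source.readable_metadata & target.writable_metadata
    if keys:
        target.write_metadata(source.read_metadata(keys))
    return keys


source = MemoryNode(
    readable={"times", "mode", "xattrs"},
    metadata={"times": (1.0, 2.0), "mode": 0o644, "xattrs": {"user.tag": b"x"}},
)
target = MemoryNode(writable={"times", "mode"})
copied_keys = copy_metadata(source, target)
```

Because `xattrs` is not writable on the target, it is never read from the source, which is the read/write minimisation the commit describes.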
* GH-73991: Support copying directory symlinks on older Windows (#120807) | Barney Gale | 2024-07-03 | 4 | -23/+38
| | | | | Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and fall back to the `PathBase.copy()` implementation.
* GH-73991: Add `pathlib.Path.copytree()` (#120718) | Barney Gale | 2024-06-23 | 1 | -0/+30
| | | | | | | | | | | | | | | | | | | | Add `pathlib.Path.copytree()` method, which recursively copies one directory to another. This differs from `shutil.copytree()` in the following respects: 1. Our method has a *follow_symlinks* argument, whereas shutil's has a *symlinks* argument with an inverted meaning. 2. Our method lacks something like a *copy_function* argument. It always uses `Path.copy()` to copy files. 3. Our method lacks something like an *ignore_dangling_symlinks* argument. Instead, users can filter out dangling symlinks with *ignore*, or ignore exceptions with *on_error*. 4. Our *ignore* argument is a callable that accepts a single path object, whereas shutil's accepts a path and a list of child filenames. 5. We add an *on_error* argument, which is a callable that accepts an `OSError` instance. (`Path.walk()` also accepts such a callable). Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
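Point 4 is easiest to see side by side. The sketch below uses the real `shutil.copytree()` *ignore* hook, which receives a directory and its child names, and adapts a single-path predicate to it. `adapt_single_path_ignore`, `is_bytecode` and `copied_files` are hypothetical helpers written for this illustration.

```python
import shutil
import tempfile
from pathlib import Path


def shutil_style_ignore(directory, names):
    """shutil convention: given a directory and child names, return the
    names to skip."""
    return [name for name in names if name.endswith(".pyc")]


def adapt_single_path_ignore(predicate):
    """Wrap a one-argument path predicate so shutil.copytree() can call it."""

    def ignore(directory, names):
        return [name for name in names if predicate(Path(directory, name))]

    return ignore


def is_bytecode(path):
    # Single-path style: one decision per path object.
    return path.suffix == ".pyc"


def copied_files(ignore):
    """Copy a small tree with the given callback and list what arrived."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "src")
        (src / "pkg").mkdir(parents=True)
        (src / "pkg" / "mod.py").write_text("x = 1\n")
        (src / "pkg" / "mod.pyc").write_bytes(b"\x00")
        dst = Path(tmp, "dst")
        shutil.copytree(src, dst, ignore=ignore)
        return sorted(
            p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file()
        )


two_arg_result = copied_files(shutil_style_ignore)
one_arg_result = copied_files(adapt_single_path_ignore(is_bytecode))
```

Both callbacks filter the same files; the single-path form just moves the per-child loop out of user code.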
* GH-73991: Add follow_symlinks argument to `pathlib.Path.copy()` (#120519) | Barney Gale | 2024-06-19 | 3 | -9/+37
| | | | | | | Add support for not following symlinks in `pathlib.Path.copy()`. On Windows we add the `COPY_FILE_COPY_SYMLINK` flag if following symlinks is disabled. If the source is a symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on older versions of Windows, which we note in the docs. No news as `copy()` was only just added.
* GH-73991: Add `pathlib.Path.copy()` (#119058) | Barney Gale | 2024-06-14 | 3 | -0/+184
| | | | | | | | | | | | | | | Add a `Path.copy()` method that copies the content of one file to another. This method is similar to `shutil.copyfile()` but differs in the following ways: - Uses `fcntl.FICLONE` where available (see GH-81338) - Uses `os.copy_file_range` where available (see GH-81340) - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`. The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata. Incorporates code from GH-81338 and GH-93152. Co-authored-by: Eryk Sun <eryksun@gmail.com>
* GH-116380: Move pathlib-specific code from `glob` to `pathlib._abc`. (#120011) | Barney Gale | 2024-06-07 | 1 | -2/+30
| | | | | In `glob._Globber`, move pathlib-specific methods to `pathlib._abc.PathGlobber` and replace them with abstract methods. Rename `glob._Globber` to `glob._GlobberBase`. As a result, the `glob` module is no longer befouled by code that can only ever apply to pathlib. No change of behaviour.
* pathlib ABCs: remove duplicate `realpath()` implementation. (#119178) | Barney Gale | 2024-06-05 | 1 | -59/+28
| | | | | | | | | Add private `posixpath._realpath()` function, which is a generic version of `realpath()` that can be parameterised with string tokens (`sep`, `curdir`, `pardir`) and query functions (`getcwd`, `lstat`, `readlink`). Also add support for limiting the number of symlink traversals. In the private `pathlib._abc.PathBase` class, call `posixpath._realpath()` and remove our re-implementation of the same algorithm. No change to any public APIs, either in `posixpath` or `pathlib`. Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
* GH-119169: Implement `pathlib.Path.walk()` using `os.walk()` (#119573) | Barney Gale | 2024-05-29 | 2 | -2/+34
| | | | | | For silly reasons, pathlib's generic implementation of `walk()` currently resides in `glob._Globber`. This commit moves it into `pathlib._abc.PathBase.walk()` where it really belongs, and makes `pathlib.Path.walk()` call `os.walk()`.
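Delegating to `os.walk()` amounts to a thin wrapper that converts each directory string into a path object. The function below is a hedged sketch of that idea, not the CPython implementation; `walk_paths` is a hypothetical name, while the keyword names mirror the public `Path.walk()` signature.

```python
import os
import tempfile
from pathlib import Path


def walk_paths(root, top_down=True, on_error=None, follow_symlinks=False):
    """Yield (Path, dirnames, filenames) triples produced by os.walk()."""
    for dirpath, dirnames, filenames in os.walk(
        root, topdown=top_down, onerror=on_error, followlinks=follow_symlinks
    ):
        # dirnames is passed through unchanged, so callers can still prune
        # it in place when walking top-down.
        yield Path(dirpath), dirnames, filenames


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "pkg", "sub").mkdir(parents=True)
    Path(tmp, "pkg", "mod.py").write_text("x = 1\n")
    walked = [
        (path.relative_to(tmp).as_posix(), sorted(dirs), sorted(files))
        for path, dirs, files in walk_paths(tmp)
    ]
```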
* GH-82805: Fix handling of single-dot file extensions in pathlib (#118952) | Barney Gale | 2024-05-25 | 2 | -18/+50
| | | | | | | | | | | | | | | pathlib now treats "`.`" as a valid file extension (suffix). This brings it in line with `os.path.splitext()`. In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method that splits a path into a `(root, ext)` pair, like `os.path.splitext()`. This method is called by `PurePathBase.stem`, `suffix`, etc. In a future version of pathlib, we might make these base classes public, and so users will be able to define their own `splitext()` method to control file extension splitting. In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes` properties that don't use `splitext()`, which avoids computing the path base name twice.
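The `os.path.splitext()` behaviour that pathlib is being aligned with can be checked directly. The example calls only `os.path.splitext()`, because the value of `PurePath.suffix` for a name like `trailing.` depends on the Python version in use; `split_suffix` is a hypothetical wrapper.

```python
import os.path


def split_suffix(name):
    """Return (stem, suffix) for a file name, following os.path.splitext()."""
    stem, suffix = os.path.splitext(name)
    return stem, suffix


examples = ["report.txt", "archive.tar.gz", "trailing.", ".bashrc"]
results = {name: split_suffix(name) for name in examples}
```

Here `results["trailing."]` is `("trailing", ".")`, so a lone trailing dot counts as a suffix, while the leading dot in `.bashrc` does not start one.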
* GH-119113: Raise `TypeError` from `pathlib.PurePath.with_suffix(None)` (#119124) | Barney Gale | 2024-05-19 | 1 | -6/+4
| | | Restore behaviour from 3.12 when `path.with_suffix(None)` is called.
* gh-119049: Defer `import warnings` in `pathlib._local` (#119111) | Kirill Podoprigora | 2024-05-17 | 1 | -1/+1
|