Introduce checked functions for pushing, joining, and switching encodings (#23)

* Implement push_checked on Encoding and Utf8Encoding traits alongside their unix and windows implementations
* Implement push_checked for PathBuf and Utf8PathBuf; add join_checked to Path and Utf8Path
* Implement push_checked for TypedPathBuf and Utf8TypedPathBuf
* Implement join_checked for TypedPath, Utf8TypedPath, TypedPathBuf, and Utf8TypedPathBuf
* Implement with_encoding_checked for non-utf8 and utf8 paths; add is_valid method to component traits and implementations
* Implement with_unix_encoding_checked and with_windows_encoding_checked functions
* Bump version to 0.8.0 and update changelog & readme to reference new functionality
* Add is_valid to Path/Utf8Path
chipsenkbeil authored Feb 25, 2024
1 parent a886829 commit 2eadf1b
Showing 24 changed files with 1,633 additions and 37 deletions.
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.8.0] - 2024-02-24

* Add `push_checked` function, which ensures that any path added to an existing `PathBuf` or `TypedPathBuf` must abide by the following rules:
    1. It cannot be an absolute path. Only relative paths are allowed.
    2. In the case of Windows, it cannot start with a prefix like `C:`.
    3. All normal components of the path must contain only valid characters.
    4. If parent directory (`..`) components are present, they must not result in a path traversal attack (impacting the current path).
* Add `join_checked` function, which ensures that any path joined with an existing path follows the rules of `push_checked`
* Add `with_encoding_checked` function to ensure that the resulting path from an encoding conversion is still valid
* Add `with_unix_encoding_checked` and `with_windows_encoding_checked` functions as shortcuts to `with_encoding_checked`
* Add `is_valid` to `Component` and `Utf8Component` traits alongside `Path` and `Utf8Path` to indicate if a component/path is valid for the given encoding
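
The four `push_checked` rules listed above can be sketched with the standard library alone. This is a minimal illustration and not the crate's implementation: the names `RuleViolation`, `is_sep`, and `check_pushed_path` are hypothetical, paths are plain `&str` values, and the set of forbidden Windows characters is an assumption.

```rust
/// Hypothetical error type for this sketch, one variant per rule.
#[derive(Debug, PartialEq, Eq)]
enum RuleViolation {
    Absolute,
    Prefix,
    InvalidCharacter,
    Traversal,
}

/// Hypothetical helper: `/` always separates, `\` only in Windows mode.
fn is_sep(c: char, windows: bool) -> bool {
    c == '/' || (windows && c == '\\')
}

/// Checks `path` against the four rules without touching any existing path.
/// `windows` selects Windows-style separators, prefixes, and forbidden
/// characters (an assumed set, not taken from the crate).
fn check_pushed_path(path: &str, windows: bool) -> Result<(), RuleViolation> {
    // Rule 1: the pushed path must be relative.
    if path.starts_with(|c: char| is_sep(c, windows)) {
        return Err(RuleViolation::Absolute);
    }

    // Rule 2: on Windows, reject a leading drive prefix such as `C:`.
    let bytes = path.as_bytes();
    if windows && bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
        return Err(RuleViolation::Prefix);
    }

    // Number of normal components the pushed path has contributed so far.
    let mut depth = 0usize;
    for component in path.split(|c: char| is_sep(c, windows)) {
        match component {
            // Empty segments and `.` do not change the depth.
            "" | "." => {}
            ".." => {
                // Rule 4: `..` may only cancel components of the pushed path itself.
                if depth == 0 {
                    return Err(RuleViolation::Traversal);
                }
                depth -= 1;
            }
            name => {
                // Rule 3: normal components must contain only valid characters.
                let forbidden = |c: char| c == '\0' || (windows && ":?*\"<>|".contains(c));
                if name.contains(forbidden) {
                    return Err(RuleViolation::InvalidCharacter);
                }
                depth += 1;
            }
        }
    }
    Ok(())
}
```

Under these assumptions, `abc/../def` passes because its `..` only cancels `abc`, while a bare `..` is rejected because it would climb out of the path being extended.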

## [0.7.1] - 2024-02-15

* Support `wasm` family for compilation
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "typed-path"
description = "Provides typed variants of Path and PathBuf for Unix and Windows"
-version = "0.7.1"
+version = "0.8.0"
edition = "2021"
authors = ["Chip Senkbeil <[email protected]>"]
categories = ["development-tools", "filesystem", "os"]
67 changes: 66 additions & 1 deletion README.md
@@ -18,7 +18,7 @@ Unix and Windows.

```toml
[dependencies]
-typed-path = "0.7"
+typed-path = "0.8"
```

As of version `0.7`, this library also supports `no_std` environments that
@@ -124,6 +124,51 @@ fn main() {
}
```

### Checking paths

When working with user-defined paths, an additional layer of defense is needed to guard against [path traversal attacks](https://owasp.org/www-community/attacks/Path_Traversal) and other risks.

To that end, you can use `PathBuf::push_checked` and `Path::join_checked` (and equivalents) to ensure that the paths being created do not alter pre-existing paths in unexpected ways.

```rust
use typed_path::{CheckedPathError, Path, PathBuf, UnixEncoding};

fn main() {
    let path = Path::<UnixEncoding>::new("/etc");

    // A valid path can be joined onto the existing one
    assert_eq!(path.join_checked("passwd"), Ok(PathBuf::from("/etc/passwd")));

    // An invalid path will result in an error
    assert_eq!(
        path.join_checked("/sneaky/replacement"),
        Err(CheckedPathError::UnexpectedRoot)
    );

    let mut path = PathBuf::<UnixEncoding>::from("/etc");

    // Pushing a relative path that contains parent directory references that cannot be
    // resolved within the path is an error, as this is treated as a path traversal attack!
    assert_eq!(
        path.push_checked(".."),
        Err(CheckedPathError::PathTraversalAttack)
    );
    assert_eq!(path, PathBuf::from("/etc"));

    // Pushing an absolute path will fail with an error
    assert_eq!(
        path.push_checked("/sneaky/replacement"),
        Err(CheckedPathError::UnexpectedRoot)
    );
    assert_eq!(path, PathBuf::from("/etc"));

    // Pushing a relative path that is safe will succeed
    assert!(path.push_checked("abc/../def").is_ok());
    assert_eq!(path, PathBuf::from("/etc/abc/../def"));
}
```
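
The example above shows that a failed `push_checked` leaves the path unchanged. One way to get that behavior is to validate the whole input before mutating the buffer. The sketch below illustrates the idea on a plain `String` with Unix-style separators; `push_all_or_nothing` is a hypothetical name, the error is a bare `&'static str`, and nothing here is taken from the crate's internals.

```rust
/// Hypothetical all-or-nothing push onto a Unix-style path held in a `String`.
/// The buffer is only modified once every component has been accepted.
fn push_all_or_nothing(current: &mut String, path: &str) -> Result<(), &'static str> {
    // Reject absolute input before looking at anything else.
    if path.starts_with('/') {
        return Err("unexpected root");
    }

    // Validation pass: `..` may only cancel components of `path` itself.
    let mut depth = 0usize;
    for part in path.split('/') {
        match part {
            "" | "." => {}
            ".." => depth = depth.checked_sub(1).ok_or("path traversal")?,
            _ => depth += 1,
        }
    }

    // Mutation pass: reached only when validation succeeded.
    if !current.is_empty() && !current.ends_with('/') {
        current.push('/');
    }
    current.push_str(path);
    Ok(())
}
```

Because every early return happens before the first write to `current`, callers can keep using the buffer after an error without having to restore it.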

### Converting between encodings

There may be times in which you need to convert between encodings such as when
@@ -156,6 +201,26 @@ fn main() {
}
```

As with the *checked* variants for pushing and joining paths, we can also ensure that paths created by changing encodings are still valid:

```rust
use typed_path::{CheckedPathError, Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

fn main() {
    // Convert from Unix to Windows
    let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/foo.txt");
    let windows_path = unix_path.with_encoding_checked::<Utf8WindowsEncoding>().unwrap();
    assert_eq!(windows_path, Utf8Path::<Utf8WindowsEncoding>::new(r"\tmp\foo.txt"));

    // Converting from Unix to Windows will fail if there are characters that are valid in Unix but not in Windows
    let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/|invalid|/foo.txt");
    assert_eq!(
        unix_path.with_encoding_checked::<Utf8WindowsEncoding>(),
        Err(CheckedPathError::InvalidFilename),
    );
}
```
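
To make the idea of a checked conversion concrete, here is a rough standard-library sketch of rebuilding a Unix path with Windows separators while rejecting filenames that Windows would not accept. The function name `unix_to_windows_checked`, the constant `WINDOWS_FORBIDDEN`, and the exact character set are assumptions for illustration. Unlike the crate, this sketch reports every failure as the same kind of error and does not distinguish an embedded prefix such as `c:` from other invalid names.

```rust
/// Characters this sketch treats as invalid inside a Windows filename
/// (an assumed set; the crate's own list may differ).
const WINDOWS_FORBIDDEN: &[char] = &['\\', '/', ':', '?', '*', '"', '<', '>', '|', '\0'];

/// Hypothetical conversion of a Unix-style path to a Windows-style path,
/// failing if any normal component contains a forbidden character.
fn unix_to_windows_checked(unix_path: &str) -> Result<String, String> {
    let mut out = String::new();

    // A leading `/` becomes a leading `\` (a root without a drive prefix).
    if unix_path.starts_with('/') {
        out.push('\\');
    }

    for name in unix_path.split('/').filter(|s| !s.is_empty()) {
        // `.` and `..` carry over unchanged; only normal names are checked.
        if name != "." && name != ".." && name.contains(WINDOWS_FORBIDDEN) {
            return Err(format!("invalid filename: {}", name));
        }
        if !out.is_empty() && !out.ends_with('\\') {
            out.push('\\');
        }
        out.push_str(name);
    }

    Ok(out)
}
```

The shape matches the README example: `/tmp/foo.txt` converts cleanly, while a component such as `|invalid|` stops the conversion with an error instead of producing a path that Windows cannot represent.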

### Typed Paths

In the above examples, we were using paths where the encoding (Unix or Windows)
31 changes: 31 additions & 0 deletions src/common/errors.rs
@@ -18,3 +18,34 @@ impl fmt::Display for StripPrefixError {

#[cfg(feature = "std")]
impl std::error::Error for StripPrefixError {}

/// An error returned when a path violates checked criteria.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum CheckedPathError {
    /// When a normal component contains invalid characters for the current encoding.
    InvalidFilename,

    /// When a path component that represents a parent directory is provided such that the original
    /// path would be escaped to access arbitrary files.
    PathTraversalAttack,

    /// When a path component that represents a prefix is provided after the start of the path.
    UnexpectedPrefix,

    /// When a path component that represents a root is provided after the start of the path.
    UnexpectedRoot,
}

impl fmt::Display for CheckedPathError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::InvalidFilename => write!(f, "path contains invalid filename"),
            Self::PathTraversalAttack => write!(f, "path attempts to escape original path"),
            Self::UnexpectedPrefix => write!(f, "path contains unexpected prefix"),
            Self::UnexpectedRoot => write!(f, "path contains unexpected root"),
        }
    }
}

#[cfg(feature = "std")]
impl std::error::Error for CheckedPathError {}
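
Implementing both `Display` and `std::error::Error` is what lets an error like this flow through `?` into a `Box<dyn Error>`. The sketch below shows that pattern with a stand-in enum, since it is meant to compile without the crate. `SketchPathError`, `check`, and `resolve_upload` are hypothetical names, the `/srv/uploads` directory is made up, and the check itself is deliberately crude.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for an error enum shaped like the one in this diff (hypothetical).
#[derive(Clone, Debug, PartialEq, Eq)]
enum SketchPathError {
    PathTraversal,
    UnexpectedRoot,
}

impl fmt::Display for SketchPathError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::PathTraversal => write!(f, "path attempts to escape original path"),
            Self::UnexpectedRoot => write!(f, "path contains unexpected root"),
        }
    }
}

impl Error for SketchPathError {}

/// Crude placeholder check: rejects absolute input and a leading `..`.
fn check(path: &str) -> Result<(), SketchPathError> {
    if path.starts_with('/') {
        Err(SketchPathError::UnexpectedRoot)
    } else if path.split('/').next() == Some("..") {
        Err(SketchPathError::PathTraversal)
    } else {
        Ok(())
    }
}

/// Because `SketchPathError` implements `Error`, `?` converts it into `Box<dyn Error>`.
fn resolve_upload(name: &str) -> Result<String, Box<dyn Error>> {
    check(name)?;
    Ok(format!("/srv/uploads/{}", name))
}
```

A caller that needs the specific variant back can still recover it with `downcast_ref::<SketchPathError>()` on the boxed error.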
9 changes: 9 additions & 0 deletions src/common/non_utf8.rs
@@ -14,6 +14,7 @@ pub use parser::ParseError;
pub use path::*;
pub use pathbuf::*;

use crate::common::errors::CheckedPathError;
use crate::no_std_compat::*;
use crate::private;

@@ -33,4 +34,12 @@ pub trait Encoding<'a>: private::Sealed {

    /// Pushes a byte slice (`path`) onto an existing path (`current_path`)
    fn push(current_path: &mut Vec<u8>, path: &[u8]);

    /// Like [`Encoding::push`], but enforces several new rules:
    ///
    /// 1. `path` cannot contain a prefix component.
    /// 2. `path` cannot contain a root component.
    /// 3. `path` cannot contain invalid filename bytes.
    /// 4. `path` cannot contain parent components such that the current path would be escaped.
    fn push_checked(current_path: &mut Vec<u8>, path: &[u8]) -> Result<(), CheckedPathError>;
}
22 changes: 22 additions & 0 deletions src/common/non_utf8/components/component.rs
@@ -64,6 +64,28 @@ pub trait Component<'a>:
    /// * `UnixComponent::Normal("here.txt")` - `is_current() == false`
    fn is_current(&self) -> bool;

    /// Returns true if this component is valid. A component can only be invalid if it represents a
    /// normal component with bytes that are disallowed by the encoding.
    ///
    /// # Examples
    ///
    /// ```
    /// use typed_path::{Component, UnixComponent, WindowsComponent};
    ///
    /// assert!(UnixComponent::RootDir.is_valid());
    /// assert!(UnixComponent::ParentDir.is_valid());
    /// assert!(UnixComponent::CurDir.is_valid());
    /// assert!(UnixComponent::Normal(b"abc").is_valid());
    /// assert!(!UnixComponent::Normal(b"\0").is_valid());
    ///
    /// assert!(WindowsComponent::RootDir.is_valid());
    /// assert!(WindowsComponent::ParentDir.is_valid());
    /// assert!(WindowsComponent::CurDir.is_valid());
    /// assert!(WindowsComponent::Normal(b"abc").is_valid());
    /// assert!(!WindowsComponent::Normal(b"|").is_valid());
    /// ```
    fn is_valid(&self) -> bool;

    /// Returns size of component in bytes
    fn len(&self) -> usize;

127 changes: 126 additions & 1 deletion src/common/non_utf8/path.rs
@@ -10,7 +10,9 @@ use core::{cmp, fmt};
pub use display::Display;

use crate::no_std_compat::*;
-use crate::{Ancestors, Component, Components, Encoding, Iter, PathBuf, StripPrefixError};
+use crate::{
+    Ancestors, CheckedPathError, Component, Components, Encoding, Iter, PathBuf, StripPrefixError,
+};

/// A slice of a path (akin to [`str`]).
///
@@ -252,6 +254,25 @@ where
        !self.is_absolute()
    }

    /// Returns `true` if the path is valid, meaning that all of its components are valid.
    ///
    /// See [`Component::is_valid`]'s documentation for more details.
    ///
    /// # Examples
    ///
    /// ```
    /// use typed_path::{Path, UnixEncoding};
    ///
    /// // NOTE: A path cannot be created on its own without a defined encoding
    /// assert!(Path::<UnixEncoding>::new("foo.txt").is_valid());
    /// assert!(!Path::<UnixEncoding>::new("foo\0.txt").is_valid());
    /// ```
    ///
    /// [`Component::is_valid`]: crate::Component::is_valid
    pub fn is_valid(&self) -> bool {
        self.components().all(|c| c.is_valid())
    }

    /// Returns `true` if the `Path` has a root.
    ///
    /// * On Unix ([`UnixPath`]), a path has a root if it begins with `/`.
@@ -666,6 +687,35 @@ where
        buf
    }

    /// Creates an owned [`PathBuf`] with `path` adjoined to `self`, checking the `path` to ensure
    /// it is safe to join. _When dealing with user-provided paths, this is the preferred method._
    ///
    /// See [`PathBuf::push_checked`] for more details on what it means to adjoin a path safely.
    ///
    /// # Examples
    ///
    /// ```
    /// use typed_path::{CheckedPathError, Path, PathBuf, UnixEncoding};
    ///
    /// // NOTE: A path cannot be created on its own without a defined encoding
    /// let path = Path::<UnixEncoding>::new("/etc");
    ///
    /// // A valid path can be joined onto the existing one
    /// assert_eq!(path.join_checked("passwd"), Ok(PathBuf::from("/etc/passwd")));
    ///
    /// // An invalid path will result in an error
    /// assert_eq!(path.join_checked("/sneaky/replacement"), Err(CheckedPathError::UnexpectedRoot));
    /// ```
    pub fn join_checked<P: AsRef<Path<T>>>(&self, path: P) -> Result<PathBuf<T>, CheckedPathError> {
        self._join_checked(path.as_ref())
    }

    fn _join_checked(&self, path: &Path<T>) -> Result<PathBuf<T>, CheckedPathError> {
        let mut buf = self.to_path_buf();
        buf.push_checked(path)?;
        Ok(buf)
    }

    /// Creates an owned [`PathBuf`] like `self` but with the given file name.
    ///
    /// See [`PathBuf::set_file_name`] for more details.
@@ -878,6 +928,81 @@ where
        }
    }

    /// Like [`with_encoding`], creates an owned [`PathBuf`] like `self` but with a different
    /// encoding. Additionally, checks to ensure that the produced path will be valid.
    ///
    /// # Note
    ///
    /// As part of the process of converting between encodings, the path will need to be rebuilt.
    /// This involves [`pushing and checking`] each component, which may result in differences in
    /// the resulting path such as resolving `.` and `..` early or other unexpected side effects.
    ///
    /// [`pushing and checking`]: PathBuf::push_checked
    /// [`with_encoding`]: Path::with_encoding
    ///
    /// # Examples
    ///
    /// ```
    /// use typed_path::{CheckedPathError, Path, UnixEncoding, WindowsEncoding};
    ///
    /// // Convert from Unix to Windows
    /// let unix_path = Path::<UnixEncoding>::new("/tmp/foo.txt");
    /// let windows_path = unix_path.with_encoding_checked::<WindowsEncoding>().unwrap();
    /// assert_eq!(windows_path, Path::<WindowsEncoding>::new(r"\tmp\foo.txt"));
    ///
    /// // Converting from Windows to Unix will drop any prefix
    /// let windows_path = Path::<WindowsEncoding>::new(r"C:\tmp\foo.txt");
    /// let unix_path = windows_path.with_encoding_checked::<UnixEncoding>().unwrap();
    /// assert_eq!(unix_path, Path::<UnixEncoding>::new(r"/tmp/foo.txt"));
    ///
    /// // Converting from Unix to Windows with invalid filename characters like `|` should fail
    /// let unix_path = Path::<UnixEncoding>::new("/|invalid|/foo.txt");
    /// assert_eq!(
    ///     unix_path.with_encoding_checked::<WindowsEncoding>(),
    ///     Err(CheckedPathError::InvalidFilename),
    /// );
    ///
    /// // Converting from Unix to Windows with unexpected prefix embedded in path should fail
    /// let unix_path = Path::<UnixEncoding>::new("/path/c:/foo.txt");
    /// assert_eq!(
    ///     unix_path.with_encoding_checked::<WindowsEncoding>(),
    ///     Err(CheckedPathError::UnexpectedPrefix),
    /// );
    /// ```
    pub fn with_encoding_checked<U>(&self) -> Result<PathBuf<U>, CheckedPathError>
    where
        U: for<'enc> Encoding<'enc>,
    {
        let mut path = PathBuf::new();

        // Root, current, and parent components are handled specially by converting them to
        // the destination encoding's equivalent; everything else is pushed using the checked
        // variant, which ensures that the destination encoding is respected
        for component in self.components() {
            if component.is_root() {
                path.push(
                    <<<U as Encoding>::Components as Components>::Component as Component>::root()
                        .as_bytes(),
                );
            } else if component.is_current() {
                path.push(
                    <<<U as Encoding>::Components as Components>::Component as Component>::current(
                    )
                    .as_bytes(),
                );
            } else if component.is_parent() {
                path.push(
                    <<<U as Encoding>::Components as Components>::Component as Component>::parent()
                        .as_bytes(),
                );
            } else {
                path.push_checked(component.as_bytes())?;
            }
        }

        Ok(path)
    }

    /// Converts a [`Box<Path>`](Box) into a
    /// [`PathBuf`] without copying or allocating.
    pub fn into_path_buf(self: Box<Path<T>>) -> PathBuf<T> {