Introduce checked functions for pushing, joining, and switching encodings #23

Merged
merged 10 commits into from
Feb 25, 2024
12 changes: 12 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.8.0] - 2024-02-24

* Add `push_checked` function, which ensures that any path added to an existing `PathBuf` or `TypedPathBuf` must abide by the following rules:
1. It cannot be an absolute path. Only relative paths allowed.
2. In the case of Windows, it cannot start with a prefix like `C:`.
3. All normal components of the path must contain only valid characters.
4. If parent directory (..) components are present, they must not result in a path traversal attack (impacting the current path).
* Add `join_checked` function, which ensures that any path joined with an existing path follows the rules of `push_checked`
* Add `with_encoding_checked` function to ensure that the resulting path from an encoding conversion is still valid
* Add `with_unix_encoding_checked` and `with_windows_encoding_checked` functions as shortcuts to `with_encoding_checked`
* Add `is_valid` to `Component` and `Utf8Component` traits alongside `Path` and `Utf8Path` to indicate if a component/path is valid for the given encoding

## [0.7.1] - 2024-02-15

* Support `wasm` family for compilation
Expand Down
2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
[package]
name = "typed-path"
description = "Provides typed variants of Path and PathBuf for Unix and Windows"
version = "0.7.1"
version = "0.8.0"
edition = "2021"
authors = ["Chip Senkbeil <[email protected]>"]
categories = ["development-tools", "filesystem", "os"]
Expand Down
67 changes: 66 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ Unix and Windows.

```toml
[dependencies]
typed-path = "0.7"
typed-path = "0.8"
```

As of version `0.7`, this library also supports `no_std` environments that
Expand Down Expand Up @@ -124,6 +124,51 @@ fn main() {
}
```

### Checking paths

When working with user-defined paths, an additional layer of defense is needed to guard against [path traversal attacks](https://owasp.org/www-community/attacks/Path_Traversal) and other risks.

To that end, you can use `PathBuf::push_checked` and `Path::join_checked` (and equivalents) to ensure that the paths being created do not alter pre-existing paths in unexpected ways.

```rust
use typed_path::{CheckedPathError, Path, PathBuf, UnixEncoding};

fn main() {
let path = Path::<UnixEncoding>::new("/etc");

// A valid path can be joined onto the existing one
assert_eq!(path.join_checked("passwd"), Ok(PathBuf::from("/etc/passwd")));

// An invalid path will result in an error
assert_eq!(
path.join_checked("/sneaky/replacement"),
Err(CheckedPathError::UnexpectedRoot)
);

let mut path = PathBuf::<UnixEncoding>::from("/etc");

// Pushing a relative path whose parent directory references cannot be
// resolved within that path is an error, as it is treated as a path
// traversal attack!
assert_eq!(
path.push_checked(".."),
Err(CheckedPathError::PathTraversalAttack)
);
assert_eq!(path, PathBuf::from("/etc"));

// Pushing an absolute path will fail with an error
assert_eq!(
path.push_checked("/sneaky/replacement"),
Err(CheckedPathError::UnexpectedRoot)
);
assert_eq!(path, PathBuf::from("/etc"));

// Pushing a relative path that is safe will succeed
assert!(path.push_checked("abc/../def").is_ok());
assert_eq!(path, PathBuf::from("/etc/abc/../def"));
}
```
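
To make the rules behind the checked variants more concrete, here is a minimal std-only sketch of the same idea for Unix-style string paths. It is not the crate's implementation: the `CheckError` enum and the `check_unix_push` function are hypothetical names used only for illustration, and the order in which the real `push_checked` evaluates its rules is an assumption.

```rust
/// Hypothetical error type that mirrors some variants of `CheckedPathError`.
#[derive(Debug, PartialEq, Eq)]
pub enum CheckError {
    InvalidFilename,
    PathTraversalAttack,
    UnexpectedRoot,
}

/// Hypothetical helper: decides whether a Unix-style `path` would be safe to
/// append to an existing path, without looking at the existing path at all.
pub fn check_unix_push(path: &str) -> Result<(), CheckError> {
    // Rule: the pushed path must be relative.
    if path.starts_with('/') {
        return Err(CheckError::UnexpectedRoot);
    }

    // Number of normal components seen so far that a `..` could cancel out.
    let mut depth: usize = 0;

    for part in path.split('/') {
        match part {
            // Empty segments (from `//`) and `.` do not change the depth.
            "" | "." => {}
            // Rule: `..` may only undo components introduced by `path` itself.
            ".." => {
                if depth == 0 {
                    return Err(CheckError::PathTraversalAttack);
                }
                depth -= 1;
            }
            // Rule: normal components may not contain a NUL character.
            name => {
                if name.contains('\0') {
                    return Err(CheckError::InvalidFilename);
                }
                depth += 1;
            }
        }
    }

    Ok(())
}
```

In this sketch, `abc/../def` is accepted because the `..` only cancels `abc`, while a lone `..` is rejected because it would reach into the path being extended.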

### Converting between encodings

There may be times in which you need to convert between encodings such as when
Expand Down Expand Up @@ -156,6 +201,26 @@ fn main() {
}
```

As with the *checked* variants for pushing and joining paths, we can also ensure that paths created by changing encodings are still valid:

```rust
use typed_path::{CheckedPathError, Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

fn main() {
// Convert from Unix to Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/foo.txt");
let windows_path = unix_path.with_encoding_checked::<Utf8WindowsEncoding>().unwrap();
assert_eq!(windows_path, Utf8Path::<Utf8WindowsEncoding>::new(r"\tmp\foo.txt"));

// Converting from Unix to Windows will fail if the path contains characters that are valid on Unix but not on Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/|invalid|/foo.txt");
assert_eq!(
unix_path.with_encoding_checked::<Utf8WindowsEncoding>(),
Err(CheckedPathError::InvalidFilename),
);
}
```
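
The reason a conversion can fail is that each component has to be valid in the destination encoding. The following std-only sketch illustrates that idea for Unix-to-Windows conversion of string paths. It is not how the crate does it: `is_valid_windows_name` and `unix_to_windows_checked` are hypothetical helpers, and the reserved character set is taken from the commonly documented Windows file naming rules, so the exact set checked by `with_encoding_checked` may differ.

```rust
/// Hypothetical helper: returns `true` if `name` contains none of the
/// characters that Windows reserves in file names and no control characters.
pub fn is_valid_windows_name(name: &str) -> bool {
    const RESERVED: &[char] = &['<', '>', ':', '"', '/', '\\', '|', '?', '*'];
    !name
        .chars()
        .any(|c| RESERVED.contains(&c) || (c as u32) < 32)
}

/// Hypothetical helper: rebuilds a Unix-style path with `\` separators,
/// returning `None` as soon as a component is not valid on Windows.
pub fn unix_to_windows_checked(path: &str) -> Option<String> {
    let mut out = String::new();

    // A leading `/` becomes a leading `\` (a root without a drive prefix).
    if path.starts_with('/') {
        out.push('\\');
    }

    let mut first = true;
    for part in path.split('/').filter(|p| !p.is_empty()) {
        if !is_valid_windows_name(part) {
            return None;
        }
        if !first {
            out.push('\\');
        }
        out.push_str(part);
        first = false;
    }

    Some(out)
}
```

Unlike the crate, this sketch collapses every failure into `None` rather than distinguishing between invalid filenames and unexpected prefixes.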

### Typed Paths

In the above examples, we were using paths where the encoding (Unix or Windows)
Expand Down
31 changes: 31 additions & 0 deletions src/common/errors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,3 +18,34 @@ impl fmt::Display for StripPrefixError {

#[cfg(feature = "std")]
impl std::error::Error for StripPrefixError {}

/// An error returned when a path violates checked criteria.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum CheckedPathError {
/// When a normal component contains invalid characters for the current encoding.
InvalidFilename,

/// When a path component that represents a parent directory is provided such that the original
/// path would be escaped to access arbitrary files.
PathTraversalAttack,

/// When a path component that represents a prefix is provided after the start of the path.
UnexpectedPrefix,

/// When a path component that represents a root is provided after the start of the path.
UnexpectedRoot,
}

impl fmt::Display for CheckedPathError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::InvalidFilename => write!(f, "path contains invalid filename"),
Self::PathTraversalAttack => write!(f, "path attempts to escape original path"),
Self::UnexpectedPrefix => write!(f, "path contains unexpected prefix"),
Self::UnexpectedRoot => write!(f, "path contains unexpected root"),
}
}
}

#[cfg(feature = "std")]
impl std::error::Error for CheckedPathError {}
9 changes: 9 additions & 0 deletions src/common/non_utf8.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ pub use parser::ParseError;
pub use path::*;
pub use pathbuf::*;

use crate::common::errors::CheckedPathError;
use crate::no_std_compat::*;
use crate::private;

Expand All @@ -33,4 +34,12 @@ pub trait Encoding<'a>: private::Sealed {

/// Pushes a byte slice (`path`) onto the an existing path (`current_path`)
fn push(current_path: &mut Vec<u8>, path: &[u8]);

/// Like [`Encoding::push`], but enforces several new rules:
///
/// 1. `path` cannot contain a prefix component.
/// 2. `path` cannot contain a root component.
/// 3. `path` cannot contain invalid filename bytes.
/// 4. `path` cannot contain parent components such that the current path would be escaped.
fn push_checked(current_path: &mut Vec<u8>, path: &[u8]) -> Result<(), CheckedPathError>;
}
22 changes: 22 additions & 0 deletions src/common/non_utf8/components/component.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,28 @@ pub trait Component<'a>:
/// * `UnixComponent::Normal("here.txt")` - `is_current() == false`
fn is_current(&self) -> bool;

/// Returns true if this component is valid. A component can only be invalid if it represents a
/// normal component with bytes that are disallowed by the encoding.
///
/// # Examples
///
/// ```
/// use typed_path::{Component, UnixComponent, WindowsComponent};
///
/// assert!(UnixComponent::RootDir.is_valid());
/// assert!(UnixComponent::ParentDir.is_valid());
/// assert!(UnixComponent::CurDir.is_valid());
/// assert!(UnixComponent::Normal(b"abc").is_valid());
/// assert!(!UnixComponent::Normal(b"\0").is_valid());
///
/// assert!(WindowsComponent::RootDir.is_valid());
/// assert!(WindowsComponent::ParentDir.is_valid());
/// assert!(WindowsComponent::CurDir.is_valid());
/// assert!(WindowsComponent::Normal(b"abc").is_valid());
/// assert!(!WindowsComponent::Normal(b"|").is_valid());
/// ```
fn is_valid(&self) -> bool;

/// Returns size of component in bytes
fn len(&self) -> usize;

Expand Down
127 changes: 126 additions & 1 deletion src/common/non_utf8/path.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@ use core::{cmp, fmt};
pub use display::Display;

use crate::no_std_compat::*;
use crate::{Ancestors, Component, Components, Encoding, Iter, PathBuf, StripPrefixError};
use crate::{
Ancestors, CheckedPathError, Component, Components, Encoding, Iter, PathBuf, StripPrefixError,
};

/// A slice of a path (akin to [`str`]).
///
Expand Down Expand Up @@ -252,6 +254,25 @@ where
!self.is_absolute()
}

/// Returns `true` if the path is valid, meaning that all of its components are valid.
///
/// See [`Component::is_valid`]'s documentation for more details.
///
/// # Examples
///
/// ```
/// use typed_path::{Path, UnixEncoding};
///
/// // NOTE: A path cannot be created on its own without a defined encoding
/// assert!(Path::<UnixEncoding>::new("foo.txt").is_valid());
/// assert!(!Path::<UnixEncoding>::new("foo\0.txt").is_valid());
/// ```
///
/// [`Component::is_valid`]: crate::Component::is_valid
pub fn is_valid(&self) -> bool {
self.components().all(|c| c.is_valid())
}

/// Returns `true` if the `Path` has a root.
///
/// * On Unix ([`UnixPath`]), a path has a root if it begins with `/`.
Expand Down Expand Up @@ -666,6 +687,35 @@ where
buf
}

/// Creates an owned [`PathBuf`] with `path` adjoined to `self`, checking the `path` to ensure
/// it is safe to join. _When dealing with user-provided paths, this is the preferred method._
///
/// See [`PathBuf::push_checked`] for more details on what it means to adjoin a path safely.
///
/// # Examples
///
/// ```
/// use typed_path::{CheckedPathError, Path, PathBuf, UnixEncoding};
///
/// // NOTE: A path cannot be created on its own without a defined encoding
/// let path = Path::<UnixEncoding>::new("/etc");
///
/// // A valid path can be joined onto the existing one
/// assert_eq!(path.join_checked("passwd"), Ok(PathBuf::from("/etc/passwd")));
///
/// // An invalid path will result in an error
/// assert_eq!(path.join_checked("/sneaky/replacement"), Err(CheckedPathError::UnexpectedRoot));
/// ```
pub fn join_checked<P: AsRef<Path<T>>>(&self, path: P) -> Result<PathBuf<T>, CheckedPathError> {
self._join_checked(path.as_ref())
}

fn _join_checked(&self, path: &Path<T>) -> Result<PathBuf<T>, CheckedPathError> {
let mut buf = self.to_path_buf();
buf.push_checked(path)?;
Ok(buf)
}

/// Creates an owned [`PathBuf`] like `self` but with the given file name.
///
/// See [`PathBuf::set_file_name`] for more details.
Expand Down Expand Up @@ -878,6 +928,81 @@ where
}
}

/// Like [`with_encoding`], creates an owned [`PathBuf`] like `self` but with a different
/// encoding. Additionally, checks to ensure that the produced path will be valid.
///
/// # Note
///
/// As part of the process of converting between encodings, the path will need to be rebuilt.
/// This involves [`pushing and checking`] each component, which may result in differences in
/// the resulting path such as resolving `.` and `..` early or other unexpected side effects.
///
/// [`pushing and checking`]: PathBuf::push_checked
/// [`with_encoding`]: Path::with_encoding
///
/// # Examples
///
/// ```
/// use typed_path::{CheckedPathError, Path, UnixEncoding, WindowsEncoding};
///
/// // Convert from Unix to Windows
/// let unix_path = Path::<UnixEncoding>::new("/tmp/foo.txt");
/// let windows_path = unix_path.with_encoding_checked::<WindowsEncoding>().unwrap();
/// assert_eq!(windows_path, Path::<WindowsEncoding>::new(r"\tmp\foo.txt"));
///
/// // Converting from Windows to Unix will drop any prefix
/// let windows_path = Path::<WindowsEncoding>::new(r"C:\tmp\foo.txt");
/// let unix_path = windows_path.with_encoding_checked::<UnixEncoding>().unwrap();
/// assert_eq!(unix_path, Path::<UnixEncoding>::new(r"/tmp/foo.txt"));
///
/// // Converting from Unix to Windows with invalid filename characters like `|` should fail
/// let unix_path = Path::<UnixEncoding>::new("/|invalid|/foo.txt");
/// assert_eq!(
/// unix_path.with_encoding_checked::<WindowsEncoding>(),
/// Err(CheckedPathError::InvalidFilename),
/// );
///
/// // Converting from Unix to Windows with unexpected prefix embedded in path should fail
/// let unix_path = Path::<UnixEncoding>::new("/path/c:/foo.txt");
/// assert_eq!(
/// unix_path.with_encoding_checked::<WindowsEncoding>(),
/// Err(CheckedPathError::UnexpectedPrefix),
/// );
/// ```
pub fn with_encoding_checked<U>(&self) -> Result<PathBuf<U>, CheckedPathError>
where
U: for<'enc> Encoding<'enc>,
{
let mut path = PathBuf::new();

// Root, current, and parent components are handled specially by converting them to the
// destination type; every other component is pushed using the checked variant, which
// ensures that the destination encoding is respected
for component in self.components() {
if component.is_root() {
path.push(
<<<U as Encoding>::Components as Components>::Component as Component>::root()
.as_bytes(),
);
} else if component.is_current() {
path.push(
<<<U as Encoding>::Components as Components>::Component as Component>::current(
)
.as_bytes(),
);
} else if component.is_parent() {
path.push(
<<<U as Encoding>::Components as Components>::Component as Component>::parent()
.as_bytes(),
);
} else {
path.push_checked(component.as_bytes())?;
}
}

Ok(path)
}

/// Converts a [`Box<Path>`](Box) into a
/// [`PathBuf`] without copying or allocating.
pub fn into_path_buf(self: Box<Path<T>>) -> PathBuf<T> {
Expand Down