Skip to content
/ satya Public

SATYA - High Performance Data Validation for Python written in Rust

License

Notifications You must be signed in to change notification settings

justrach/satya

Repository files navigation

Satya Logo

Satya (सत्य)

PyPI version License: Apache 2.0 Python Versions

SATYA - High Performance Data Validation for Python

Satya (सत्य) is the Sanskrit word for truth and reality, embodying our commitment to data integrity and validation. Just as truth is fundamental and unwavering, Satya ensures your data validation is reliable, fast, and efficient. 🚀

Satya is a blazingly fast data validation library for Python, powered by Rust. Early benchmarks show it performing up to 134x faster than Pydantic for large-scale validation tasks.

Key Features:

  • Lightning fast validation (134x faster than Pydantic in initial benchmarks)
  • Stream processing support for handling large datasets
  • Rust-powered core with a Pythonic API
  • Support for nested models and complex types
  • Compatible with standard Python type hints
  • Minimal memory overhead

Quick Start:

from satya import Model, Field

class User(Model):
    id: int = Field(description="User ID")
    name: str = Field(description="User name")
    email: str = Field(description="Email address")
    active: bool = Field(default=True)

Example 2:

from typing import Optional
from satya import Model, Field, List

# Enable pretty printing for this module
Model.PRETTY_REPR = True

class User(Model):
    id: int
    name: str = Field(default='John Doe')
    signup_ts: Optional[str] = Field(required=False)  # Using str for datetime
    friends: List[int] = Field(default=[])

external_data = {'id': '123', 'signup_ts': '2017-06-01 12:22', 'friends': [1, '2', b'3']}
validator = User.validator()
result = validator.validate(external_data)
user = User(**result.value)
print(user)
#> User(id=123, name='John Doe', signup_ts='2017-06-01 12:22', friends=[1, 2, 3])
print(user.id)
#> 123

🚀 Performance

Our benchmarks show significant performance improvements over existing solutions:

Satya Benchmark Results

📊 Large Dataset Processing (5M records)

  • Satya: 207,321 items/second
  • Pydantic: 72,302 items/second
  • Speed improvement: 2.9x
  • Memory usage: Nearly identical (Satya: 158.2MB, Pydantic: 162.5MB)

🌐 Web Service Benchmark (10,000 requests)

  • Satya: 177,790 requests/second
  • Pydantic: 1,323 requests/second
  • Average latency improvement: 134.4x
  • P99 latency improvement: 134.4x

Note: All benchmarks were run on identical hardware using standardized test cases. Your results may vary depending on your specific use case and data structure complexity.

🎯 Key Features

  • 🏃‍♂️ Lightning Fast: Up to 134x faster than Pydantic
  • 🌊 Stream Processing: Efficient handling of large datasets
  • 🦀 Rust-Powered: High-performance core with zero-cost abstractions
  • 🐍 Pythonic API: Familiar interface for Python developers
  • 🎯 Type Support: Full compatibility with Python type hints
  • 📦 Minimal Overhead: Efficient memory usage

Why Satya?

While Pydantic has revolutionized data validation in Python and inspired this project, there are use cases where raw performance is critical. Satya (सत्य) brings the power of truth to your data validation by:

  • Leveraging Rust's zero-cost abstractions for core validation logic
  • Implementing efficient batch processing with minimal overhead
  • Minimizing Python object creation through smart memory management
  • Reducing memory allocations with Rust's ownership model
  • Providing truthful, precise error messages that pinpoint validation issues

Ideal Use Cases:

  • High-throughput API services
  • Real-time data processing
  • Large dataset validation
  • Stream processing applications
  • Performance-critical microservices

Installation:

pip install satya

Current Status:

Satya is currently in alpha (v0.2.1). While the core functionality is stable, we're actively working on:

  • Expanding type support
  • Adding more validation features
  • Improving error messages
  • Enhancing documentation
  • Additional performance optimizations

Acknowledgments:

Special thanks to the Pydantic project, which has set the standard for Python data validation and heavily influenced Satya's API design. While we've focused on raw performance, Pydantic's elegant API and comprehensive feature set remain a major inspiration.

💝 Open Source Spirit

Note to Data Validation Library Authors: Feel free to incorporate our performance optimizations into your libraries! We believe in making the Python ecosystem faster for everyone. All we ask is for appropriate attribution to Satya under our MIT license. Together, we can make data validation blazingly fast for all Python developers!

🤝 Contributing

We welcome contributions of all kinds! Whether you're fixing bugs, improving documentation, or sharing new performance optimizations, here's how you can help:

  • 🐛 Report issues and bugs
  • 💡 Suggest new features or optimizations
  • 📝 Improve documentation
  • 🔧 Submit pull requests
  • 📊 Share benchmarks and use cases

Check out our CONTRIBUTING.md for guidelines.

License:

Apache 2.0

Note: Performance numbers are from initial benchmarks and may vary based on use case and data structure complexity.

Contact:

Remember: Satya is designed for scenarios where validation performance is critical. For general use cases, especially where features and ecosystem compatibility are more important than raw speed, Pydantic remains an excellent choice.

About

SATYA - High Performance Data Validation for Python written in Rust

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published