Releases · PrefectHQ/prefect
The Release is Bright and Full of Features
Changelog
0.5.3
Released May 7, 2019
Features
Enhancements
- `Flow` now has optional `storage` keyword - #936
- Flow `environment` argument now defaults to a `CloudEnvironment` - #936
- `Queued` states accept `start_time` arguments - #955
- Add new `Bytes` and `Memory` storage classes for local testing - #956, #961
- Add new `LocalEnvironment` execution environment for local testing - #957
- Add new `Aborted` state for Flow runs which are cancelled by users - #959
- Added an `execute-cloud-flow` CLI command for working with cloud deployed flows - #971
- Add new `flows.run_on_schedule` configuration option for affecting the behavior of `flow.run` - #972
- Allow for Tasks with `manual_only` triggers to be root tasks - #667
- Allow compression of serialized flows - #993
- Allow for serialization of user written result handlers - #623
- Allow for state to be serialized in certain triggers and cache validators - #949
- Add new `filename` keyword to `flow.visualize` for automatically saving visualizations - #1001 (see the sketch after this list)
- Add new `LocalStorage` option for storing Flows locally - #1006
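As a rough illustration of the new `filename` keyword on `flow.visualize` (#1001), a minimal sketch; the flow and task names are hypothetical, and the rendered output format depends on the local graphviz installation:

```python
from prefect import Flow, task

@task
def say_hello():
    print("hello")

with Flow("visualize-example") as flow:
    say_hello()

# Instead of opening an interactive view, write the rendered graph to disk
# (graphviz picks the extension, e.g. "visualize-example.pdf").
flow.visualize(filename="visualize-example")
```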
Task Library
- None
Fixes
- Fix Docker storage not pulling correct flow path - #968
- Fix `run_flow` loading to decode properly by using cloudpickle - #978
- Fix Docker storage for handling flow names with spaces and weird characters - #969
- Fix non-deterministic issue with mapping in the DaskExecutor - #943
Breaking Changes
- Remove `flow.id` and `task.id` attributes - #940
- Removed old WIP environments - #936 (Note: Changes from #936 regarding environments don't break any Prefect code because environments weren't used yet outside of Cloud.)
- Update `flow.deploy` and `client.deploy` to use `set_schedule_active` kwarg to match Cloud - #991
- Removed `Flow.generate_local_task_ids()` - #992
Contributors
- None
Unredacted: The 0.5.2 Release
0.5.2
Released April 19, 2019
Features
- Implement two new triggers that allow for specifying bounds on the number of failures or successes - #933
Enhancements
- `DaskExecutor(local_processes=True)` supports timeouts - #886
- Calling `Secret.get()` from within a Flow context raises an informative error - #927
- Add new keywords to `Task.set_upstream` and `Task.set_downstream` for handling keyed and mapped dependencies - #823 (see the sketch after this list)
- Downgrade default logging level to "INFO" from "DEBUG" - #935
- Add start times to queued states - #937
- Add `is_submitted` to states - #944
- Introduce new `ClientFailed` state - #938
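A rough sketch of the keyed-dependency keywords on `Task.set_upstream` / `Task.set_downstream` (#823); the task classes below are hypothetical and this assumes the imperative API of the 0.5.x line:

```python
from prefect import Flow, Task

class ProduceNumber(Task):
    def run(self):
        return 1

class AddOne(Task):
    def run(self, x):
        return x + 1

produce, add_one = ProduceNumber(), AddOne()

with Flow("keyed-dependencies") as flow:
    # the result of `produce` is passed to `add_one` as its `x` argument
    add_one.set_upstream(produce, key="x")
```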
Task Library
- Add task for sending Slack notifications via Prefect Slack App - #932
Fixes
- Fix issue with timeouts behaving incorrectly with unpickleable objects - #886
- Fix issue with Flow validation being performed even when eager validation was turned off - #919
- Fix issue with downstream tasks with `all_failed` triggers running if an upstream Client call fails in Cloud - #938
Breaking Changes
- Remove `prefect make user config` from cli commands - #904
- Change `set_schedule_active` keyword in Flow deployments to `set_schedule_inactive` to match Cloud - #941
Contributors
- None
It Takes a Village
0.5.1
Released April 4, 2019
Features
- API reference documentation is now versioned - #270
- Add `S3ResultHandler` for handling results to / from S3 buckets - #879
- Add ability to use `Cached` states across flow runs in Cloud - #885
Enhancements
- Bump to latest version of `pytest` (4.3) - #814
- `Client.deploy` accepts optional `build` kwarg for avoiding building Flow environment - #876
- Bump `distributed` to 1.26.1 for enhanced security features - #878
- Local secrets automatically attempt to load secrets as JSON - #883
- Add task logger to context for easily creating custom logs during task runs - #884 (see the sketch after this list)
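For the task logger added to context (#884), a minimal sketch of emitting a custom log line from inside a task run; the task and flow names are hypothetical:

```python
import prefect
from prefect import Flow, task

@task
def log_progress():
    # the runner places a pre-configured logger into prefect.context
    logger = prefect.context.get("logger")
    logger.info("A custom message emitted during the task run")

with Flow("logging-example") as flow:
    log_progress()

flow.run()
```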
Task Library
- Add `ParseRSSFeed` for parsing a remote RSS feed - #856
- Add tasks for working with Docker containers and images - #864
- Add task for creating a BigQuery table - #895
Fixes
- Only checkpoint tasks if running in cloud - #839, #854
- Adjusted small flake8 issues for names, imports, and comparisons - #849
- Fix bug preventing `flow.run` from properly using cached tasks - #861
- Fix tempfile usage in `flow.visualize` so that it runs on Windows machines - #858
- Fix compatibility issue caused by a bug in Python 3.5.2 - #857
- Fix issue in which `GCSResultHandler` was not pickleable - #879
- Fix issue with automatically converting callables and dicts to tasks - #894
Breaking Changes
- Change the call signature of `Dict` task from `run(**task_results)` to `run(keys, values)` - #894
Contributors
Open Source Launch!
0.5.0
Released March 24, 2019
Features
- Add `checkpoint` option for individual `Task`s, as well as a global `checkpoint` config setting for storing the results of Tasks using their result handlers - #649
- Add `defaults_from_attrs` decorator to easily construct `Task`s whose attributes serve as defaults for `Task.run` - #293 (see the sketch after this list)
- Environments follow new hierarchy (PIN-3) - #670
- Add `OneTimeSchedule` for one-time execution at a specified time - #680
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- Pre-populate `prefect.context` with various formatted date strings during execution - #704
- Add ability to overwrite task attributes such as "name" when calling tasks in the functional API - #717
- Release Prefect Core under the Apache 2.0 license - #762
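A sketch of the `defaults_from_attrs` decorator (#293), assuming it lives at `prefect.utilities.tasks` as in later 0.x releases; the task class and attribute names are hypothetical:

```python
from prefect import Task
from prefect.utilities.tasks import defaults_from_attrs

class Greet(Task):
    def __init__(self, greeting="hello", **kwargs):
        self.greeting = greeting
        super().__init__(**kwargs)

    @defaults_from_attrs("greeting")
    def run(self, greeting=None):
        # if no `greeting` is passed at runtime, self.greeting is used instead
        return f"{greeting}, world"
```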
Enhancements
- Refactor all `State` objects to store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- Add `google.cloud.storage` as an optional extra requirement so that the `GCSResultHandler` can be exposed better - #626
- Add a `start_time` check for Scheduled flow runs, similar to the one for Task runs - #605
- Project names can now be specified for deployments instead of IDs - #633
- Add a `createProject` mutation function to the client - #633
- Add timestamp to auto-generated API docs footer - #639
- Refactor `Result` interface into `Result` and `SafeResult` - #649
- The `manual_only` trigger will pass if `resume=True` is found in context, which indicates that a `Resume` state was passed - #664
- Added DockerOnKubernetes environment (PIN-3) - #670
- Added Prefect docker image (PIN-3) - #670
- `defaults_from_attrs` now accepts a splatted list of arguments - #676
- Add retry functionality to `flow.run(on_schedule=True)` for local execution - #680
- Add `helper_fns` keyword to `ShellTask` for pre-populating helper functions to commands - #681
- Convert a few DEBUG level logs to INFO level logs - #682
- Added DaskOnKubernetes environment (PIN-3) - #695
- Load `context` from Cloud when running flows - #699
- Add `Queued` state - #705
- `flow.serialize()` will always serialize its environment, regardless of `build` - #696
- `flow.deploy()` now raises an informative error if your container cannot deserialize the Flow - #711
- Add `_MetaState` as a parent class for states that modify other states - #726
- Add `flow` keyword argument to `Task.set_upstream()` and `Task.set_downstream()` - #749
- Add `is_retrying()` helper method to all `State` objects - #753
- Allow for state handlers which return `None` - #753 (see the sketch after this list)
- Add daylight saving time support for `CronSchedule` - #729
- Add `idempotency_key` and `context` arguments to `Client.create_flow_run` - #757
- Make `EmailTask` more secure by pulling credentials from secrets - #706
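To illustrate state handlers that return `None` (#753): a handler used purely for side effects no longer needs to hand back the new state explicitly. A minimal sketch; the handler and task names below are hypothetical:

```python
from prefect import Flow, task

def report_failure(task, old_state, new_state):
    if new_state.is_failed():
        print(f"{task.name} entered a Failed state")
    # returning None leaves the proposed new state unchanged
    return None

@task(state_handlers=[report_failure])
def flaky():
    raise ValueError("boom")

with Flow("state-handlers") as flow:
    flaky()
```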
Task Library
- Add `GCSUpload` and `GCSDownload` for uploading / retrieving string data to / from Google Cloud Storage - #673
- Add `BigQueryTask` and `BigQueryInsertTask` for executing queries against BigQuery tables and inserting data - #678, #685
- Add `FilterTask` for filtering out lists of results - #637
- Add `S3Download` and `S3Upload` for interacting with data stored on AWS S3 - #692
- Add `AirflowTask` and `AirflowTriggerDAG` tasks to the task library for running individual Airflow tasks / DAGs - #735
- Add `OpenGitHubIssue` and `CreateGitHubPR` tasks for interacting with GitHub repositories - #771
- Add Kubernetes tasks for deployments, jobs, pods, and services - #779
- Add Airtable tasks - #803
- Add Twitter tasks - #803
- Add `GetRepoInfo` for pulling GitHub repository information - #816
Fixes
- Fix edge case in doc generation in which some `Exception`s' call signature could not be inspected - #513
- Fix bug in which exceptions raised within flow runner state handlers could not be sent to Cloud - #628
- Fix issue wherein heartbeats were not being called on a fixed interval - #669
- Fix issue wherein code blocks inside of method docs couldn't use `**kwargs` - #658
- Fix bug in which Prefect-generated Keys for S3 buckets were not properly converted to strings - #698
- Fix next line after Docker Environment push/pull from overwriting progress bar - #702
- Fix issue with `JinjaTemplate` not being pickleable - #710
- Fix issue with creating secrets from JSON documents using the Core Client - #715
- Fix issue with deserialization of JSON secrets unnecessarily calling `json.loads` - #716
- Fix issue where `IntervalSchedules` didn't respect daylight saving time after serialization - #729
Breaking Changes
- Remove the `BokehRunner` and associated webapp - #609
- Rename `ResultHandler` methods from `serialize` / `deserialize` to `write` / `read` - #612
- Refactor all `State` objects to store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- `Client.create_flow_run` now returns a string instead of a `GraphQLResult` object to match the API of `deploy` - #630
- `flow.deploy` and `client.deploy` require a `project_name` instead of an ID - #633
- Upstream state results now take precedence for task inputs over `cached_inputs` - #591
- Rename `Match` task (used inside control flow) to `CompareValue` - #638
- `Client.graphql()` now returns a response with up to two keys (`data` and `errors`). Previously the `data` key was automatically selected - #642
- `ContainerEnvironment` was changed to `DockerEnvironment` - #670
- The environment `from_file` was moved to `utilities.environments` - #670
- Removed `start_tasks` argument from `FlowRunner.run()` and `check_upstream` argument from `TaskRunner.run()` - #672
- Remove support for Python 3.4 - #671
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- Remove `make_return_failed_handler` as `flow.run` now returns all task states - #693
- Refactor Airflow migration tools into a single `AirflowTask` in the task library for running individual Airflow tasks - #735
- `name` is now required on all Flow objects - #732
- Separate installation "extras" packages into multiple, smaller extras - #739
...
Version 0.4.1
Major Features
- Add ability to run scheduled flows locally via `on_schedule` kwarg in `flow.run()` - #519
- Allow tasks to specify their own result handlers, ensure inputs and outputs are stored only when necessary, and ensure no raw data is sent to the database - #587
Minor Features
- Allow for building `ContainerEnvironment`s locally without pushing to registry - #514
- Make mapping more robust when running children tasks multiple times - #541
- Always prefer `cached_inputs` over upstream states, if available - #546
- Add hooks to `FlowRunner.initialize_run()` for manipulating task states and contexts - #548
- Improve state-loading strategy for Prefect Cloud - #555
- Introduce `on_failure` kwarg to Tasks and Flows for user-friendly failure callbacks - #551 (see the sketch after this list)
- Include `scheduled_start_time` in context for Flow runs - #524
- Add GitHub PR template - #542
- Allow flows to be deployed to Prefect Cloud without a project id - #571
- Introduce serialization schemas for ResultHandlers - #572
- Add new `metadata` attribute to States for managing user-generated results - #573
- Add new `JSONResultHandler` for serializing small bits of data without external storage - #576
- Use `JSONResultHandler` for all Parameter caching - #590
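A minimal sketch of the new `on_failure` callback (#551), assuming the documented `fn(task, state)` signature; the callback and task names below are hypothetical:

```python
from prefect import Flow, task

def alert(task, state):
    # invoked only when the task finishes in a Failed state
    print(f"{task.name} failed: {state.message}")

@task(on_failure=alert)
def risky():
    raise RuntimeError("something went wrong")

with Flow("failure-callbacks") as flow:
    risky()
```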
Fixes
- Fixed `flow.deploy()` attempting to access a nonexistent string attribute - #503
- Ensure all logs make it to the logger service in deployment - #508, #552
- Fix a situation where `Paused` tasks would be treated as `Pending` and run - #535
- Ensure errors raised in state handlers are trapped appropriately in Cloud Runners - #554
- Ensure unexpected errors raised in FlowRunners are robustly handled - #568
- Fixed non-deterministic errors in mapping caused by clients resolving futures of other clients - #569
- Older versions of Prefect will now ignore fields added by newer versions when deserializing objects - #583
- Result handler failures now result in clear task run failures - #575
- Fix issue deserializing old states with empty metadata - #590
- Fix issue serializing `cached_inputs` - #594
Breaking Changes
- Move `prefect.client.result_handlers` to `prefect.engine.result_handlers` - #512
- Removed `inputs` kwarg from `TaskRunner.run()` - #546
- Moves the `start_task_ids` argument from `FlowRunner.run()` to `Environment.run()` - #544, #545
- Convert `timeout` kwarg from `timedelta` to `integer` - #540
- Remove `timeout` kwarg from `executor.wait` - #569
- Serialization of States will ignore any result data that hasn't been processed - #581
- Removes `VersionedSchema` in favor of implicit versioning: serializers will ignore unknown fields and the `create_object` method is responsible for recreating missing ones - #583
- Convert and rename `CachedState` to a successful state named `Cached`, and also remove the superfluous `cached_result` attribute - #586
Version 0.4.0
Major Features
- Add support for Prefect Cloud - #374, #406, #473, #491
- Add versioned serialization schemas for `Flow`, `Task`, `Parameter`, `Edge`, `State`, `Schedule`, and `Environment` objects - #310, #318, #319, #340
- Add ability to provide `ResultHandler`s for storing private result data - #391, #394, #430
- Support depth-first execution of mapped tasks and tracking of both the static "parent" and dynamic "children" via `Mapped` states - #485
Minor Features
- Add new `TimedOut` state for task execution timeouts - #255
- Use timezone-aware dates throughout Prefect - #325
- Add `description` and `tags` arguments to `Parameters` - #318
- Allow edge `key` checks to be skipped in order to create "dummy" flows from metadata - #319
- Add new `names_only` keyword to `flow.parameters` - #337
- Add utility for building GraphQL queries and simple schemas from Python objects - #342
- Add links to downloadable Jupyter notebooks for all tutorials - #212
- Add `to_dict` convenience method for `DotDict` class - #341
- Refactor requirements to a custom `ini` file specification - #347
- Refactor API documentation specification to `toml` file - #361
- Add new SQLite tasks for basic SQL scripting and querying - #291
- Executors now pass `map_index` into the `TaskRunner`s - #373
- All schedules support `start_date` and `end_date` parameters - #375 (see the sketch after this list)
- Add `DateTime` marshmallow field for timezone-aware serialization - #378
- Adds ability to put variables into context via the config - #381
- Adds new `client.deploy` method for adding new flows to the Prefect Cloud - #388
- Add `id` attribute to `Task` class - #416
- Add new `Resume` state for resuming from `Paused` tasks - #435
- Add support for heartbeats - #436
- Add new `Submitted` state for signaling that `Scheduled` tasks have been handled - #445
- Add ability to add custom environment variables and copy local files into `ContainerEnvironment`s - #453
- Add `set_secret` method to Client for creating and setting the values of user secrets - #452
- Refactor runners into `CloudTaskRunner` and `CloudFlowRunner` classes - #431
- Added functions for loading default `engine` classes from config - #477
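A sketch of the `start_date` / `end_date` parameters now supported by all schedules (#375), assuming the `prefect.schedules.IntervalSchedule` class of the 0.x line; the specific dates are illustrative only:

```python
from datetime import datetime, timedelta
from prefect.schedules import IntervalSchedule

schedule = IntervalSchedule(
    start_date=datetime(2019, 1, 1),
    interval=timedelta(hours=1),
    end_date=datetime(2019, 2, 1),
)
# the next three scheduled times (empty once end_date has passed)
print(schedule.next(3))
```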
Fixes
- Fixed issue with `GraphQLResult` reprs - #374
- `CronSchedule` produces expected results across daylight savings time transitions - #375
- `utilities.serialization.Nested` properly respects `marshmallow.missing` values - #398
- Fixed issue in capturing unexpected mapping errors during task runs - #409
- Fixed issue in `flow.visualize()` so that mapped flow states can be passed and colored - #387
- Fixed issue where `IntervalSchedule` was serialized at "second" resolution, not lower - #427
- Fixed issue where `SKIP` signals were preventing multiple layers of mapping - #455
- Fixed issue with multi-layer mapping in `flow.visualize()` - #454
- Fixed issue where Prefect Cloud `cached_inputs` weren't being used locally - #434
- Fixed issue where `Config.set_nested` would have an error if the provided key was nested deeper than an existing terminal key - #479
- Fixed issue where `state_handlers` were not called for certain signals - #494
Breaking Changes
- Remove `NoSchedule` and `DateSchedule` schedule classes - #324
- Change `serialize()` method to use schemas rather than custom dict - #318
- Remove `timestamp` property from `State` classes - #305
- Remove the custom JSON encoder library at `prefect.utilities.json` - #336
- `flow.parameters` now returns a set of parameters instead of a dictionary - #337
- Renamed `to_dotdict` -> `as_nested_dict` - #339
- Moved `prefect.utilities.collections.GraphQLResult` to `prefect.utilities.graphql.GraphQLResult` - #371
- `SynchronousExecutor` now does not do depth first execution for mapped tasks - #373
- Renamed `prefect.utilities.serialization.JSONField` -> `JSONCompatible`, removed its `max_size` feature, and no longer automatically serialize payloads as strings - #376
- Renamed `prefect.utilities.serialization.NestedField` -> `Nested` - #376
- Renamed `prefect.utilities.serialization.NestedField.dump_fn` -> `NestedField.value_selection_fn` for clarity - #377
- Local secrets are now pulled from `secrets` in context instead of `_secrets` - #382
- Remove Task and Flow descriptions, Flow project & version attributes - #383
- Changed `Schedule` parameter from `on_or_after` to `after` - #396
- Environments are immutable and return `dict` keys instead of `str`; some arguments for `ContainerEnvironment` are removed - #398
- `environment.run()` and `environment.build()`; removed the `flows` CLI and replaced it with a top-level CLI command, `prefect run` - #400
- The `set_temporary_config` utility now accepts a single dict of multiple config values, instead of just a key/value pair, and is located in `utilities.configuration` - #401
- Bump `click` requirement to 7.0, which changes underscores to hyphens at CLI - #409
- `IntervalSchedule` rejects intervals of less than one minute - #427
- `FlowRunner` returns a `Running` state, not a `Pending` state, when flows do not finish - #433
- Remove the `task_contexts` argument from `FlowRunner.run()` - #440
- Remove the leading underscore from Prefect-set context keys - #446
- Removed throttling tasks within the local cluster - #470
- Even `start_tasks` will not run before their state's `start_time` (if the state is `Scheduled`) - #474
- `DaskExecutor`'s "processes" keyword argument was renamed "local_processes" - #477
- Removed the `mapped` and `map_index` kwargs from `TaskRunner.run()`. These values are now inferred automatically - #485
- The `upstream_states` dictionary used by the Runners only includes `State` values, not lists of `States`. The use case that required lists of `States` is now covered by the `Mapped` state. - #485
Version 0.3.3
Major Features
- Refactor `FlowRunner` and `TaskRunner` into modular `Runner` pipelines - #260, #267
- Add configurable `state_handlers` for `FlowRunners`, `Flows`, `TaskRunners`, and `Tasks` - #264, #267
- Add Gmail and Slack notification state handlers w/ tutorial - #274, #294
Minor Features
- Add a new method `flow.get_tasks()` for easily filtering flow tasks by attribute - #242 (see the sketch after this list)
- Add new `JinjaTemplateTask` for easily rendering jinja templates - #200
- Add new `PAUSE` signal for halting task execution - #246
- Add new `Paused` state corresponding to `PAUSE` signal, and new `pause_task` utility - #251
- Add ability to timeout task execution for all executors except `DaskExecutor(processes=True)` - #240
- Add explicit unit test to check Black formatting (Python 3.6+) - #261
- Add ability to set local secrets in user config file - #231, #274
- Add `is_skipped()` and `is_scheduled()` methods for `State` objects - #266, #278
- Adds `now()` as a default `start_time` for `Scheduled` states - #278
- `Signal` classes now pass arguments to underlying `State` objects - #279
- Run counts are tracked via `Retrying` states - #281
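A rough sketch of filtering with the new `flow.get_tasks()` (#242); the task names are hypothetical, and the exact set of filter keywords supported is an assumption:

```python
from prefect import Flow, task

@task(name="extract")
def extract():
    return [1, 2, 3]

@task(name="load")
def load(data):
    print(data)

with Flow("etl") as flow:
    load(extract())

# returns the tasks whose attributes match the given filter
extract_tasks = flow.get_tasks(name="extract")
```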
Fixes
- Flow consistently raises if passed a parameter that doesn't exist - #149
Breaking Changes
Version 0.3.2
Major Features
- Local parallelism with `DaskExecutor` - #151, #186
- Resource throttling based on `tags` - #158, #186
- `Task.map` for mapping tasks - #186 (see the sketch after this list)
- Added `AirFlow` utility for importing Airflow DAGs as Prefect Flows - #232
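A minimal sketch of `Task.map` (#186): mapping spawns one dynamic task run per element of the input; the task and flow names are hypothetical:

```python
from prefect import Flow, task

@task
def add_ten(x):
    return x + 10

with Flow("mapping-example") as flow:
    # produces one task run per element: 11, 12, 13
    results = add_ten.map([1, 2, 3])

flow.run()
```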
Minor Features
- Use Netlify to deploy docs - #156
- Add changelog - #153
- Add `ShellTask` - #150
- Base `Task` class can now be run as a dummy task - #191
- New `return_failed` keyword to `flow.run()` for returning failed tasks - #205
- Some minor changes to `flow.visualize()` for visualizing mapped tasks and coloring nodes by state - #202
- Added new `flow.replace()` method for swapping out tasks within flows - #230
- Add `debug` kwarg to `DaskExecutor` for optionally silencing dask logs - #209
- Update `BokehRunner` for visualizing mapped tasks - #220
- Env var configuration settings are typed - #204
- Implement `map` functionality for the `LocalExecutor` - #233
Fixes
- Fix issue with Versioneer not picking up git tags - #146
- `DotDicts` can have non-string keys - #193
- Fix unexpected behavior in assigning tags using contextmanagers - #190
- Fix bug in initialization of Flows with only `edges` - #225
- Remove "bottleneck" when creating pipelines of mapped tasks - #224
Breaking Changes
Version 0.3.1
Version 0.3.0
Major Features
- BokehRunner - #104, #128
- Control flow: `ifelse`, `switch`, and `merge` - #92 (see the sketch after this list)
- Set state from `reference_tasks` - #95, #137
- Add flow `Registry` - #90
- Output caching with various `cache_validators` - #84, #107
- Dask executor - #82, #86
- Automatic input caching for retries, manual-only triggers - #78
- Functional API for `Flow` definition
- `State` classes
- `Signals` to transmit `State`
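A sketch of the new control-flow tasks (`ifelse`, `switch`, `merge`, #92), assuming the `prefect.tasks.control_flow` module path used in later releases; the task names below are hypothetical:

```python
from prefect import Flow, task
from prefect.tasks.control_flow import ifelse, merge

@task
def check_condition():
    return True

@task
def on_true():
    return "took the true branch"

@task
def on_false():
    return "took the false branch"

with Flow("branching") as flow:
    cond = check_condition()
    true_branch = on_true()
    false_branch = on_false()
    # only one branch runs; merge picks up whichever branch produced a result
    ifelse(cond, true_branch, false_branch)
    result = merge(true_branch, false_branch)
```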
Minor Features
- Add custom syntax highlighting to docs - #141
- Add `bind()` method for tasks to call without copying - #132
- Cache expensive flow graph methods - #125
- Docker environments - #71
- Automatic versioning via Versioneer - #70
- `TriggerFail` state - #67
- State classes - #59
Fixes
- None
Breaking Changes
- None