-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce file upload api #5977
Conversation
@@ -45,5 +47,8 @@ export abstract class BaseResource<M extends Model> { | |||
return new this(this.model, blob.get()); | |||
} | |||
|
|||
abstract delete(transaction?: Transaction): Promise<Result<undefined, Error>>; | |||
abstract delete( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored so delete always takes an Authenticator
instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, left mostly aesthetic comments
// Cloud storage logic. | ||
|
||
getPublicUrl(auth: Authenticator): string { | ||
// TODO(2024-07-01 flav) Remove once we introduce AuthenticatorWithWorkspace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this todo really is not needed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep around since this check really bothers me and clutters a lot the code.
front/lib/resources/base_resource.ts
Outdated
@@ -28,7 +30,7 @@ export abstract class BaseResource<M extends Model> { | |||
this.id = blob.id; | |||
} | |||
|
|||
static async fetchById<T extends BaseResource<M>, M extends Model>( | |||
static async fetchByInternalId<T extends BaseResource<M>, M extends Model>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we rename it I would call it fetchByModelId
to be even more explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point.
front/lib/resources/file_resource.ts
Outdated
|
||
static async fetchById( | ||
auth: Authenticator, | ||
id: FileId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we generaly use fileId
for sId
arguments
front/lib/resources/file_resource.ts
Outdated
static async makeNew( | ||
blob: Omit<CreationAttributes<FileModel>, "status" | "sId"> | ||
) { | ||
const fileId = makeDustFileId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not introduce new terminology, this is an sId
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
front/lib/resources/file_resource.ts
Outdated
} | ||
} | ||
|
||
function makeDustFileId(): FileId { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's modify generateModelSId to take an optional prefix instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to do it at first, but was not sure we were okay with it 🔥 !
|
||
res.status(200).json({ file: fileRes.toJSON(auth) }); | ||
return; | ||
} catch (error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catch should be around external functions only. Let's narrow the catch to the relevant call. If that can come from maybeApplyPreProcessing then this catch is not valid
}); | ||
} | ||
|
||
const preProcessingRes = await maybeApplyPreProcessing(auth, fileRes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fine of "...Res" for resources. Should be just "file" no since ...Res is already used everywhere for results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree but file was taken by formiddable, will update.
}); | ||
} | ||
|
||
const fileRes = await FileResource.fetchById(auth, fileId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's call it "file"
front/lib/api/files/preprocessing.ts
Outdated
|
||
export async function maybeApplyPreProcessing( | ||
auth: Authenticator, | ||
fileRes: FileResource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file
front/lib/api/files/preprocessing.ts
Outdated
auth: Authenticator, | ||
fileRes: FileResource | ||
): Promise<Result<undefined, Error>> { | ||
const processing = processingPerContentType[fileRes.contentType]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we introduce the useCase logic here even if there is only one possible value for useCase? 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM \o/
* Checkout file upload endpoint related code * Limit file upload to 1. * Tmp * Implement FileResource * ✨ * Fix google-cloud/storage issue * Add unique sId index * Fix migration * 👕 * Make file upload rate limit more aggressive * Enforce upload window * Ensure the uploaded file matches the requirements * ✂️ * Use fromidable to adhere to file requirements * s/fetchByInternalId/fetchByModelId * Move try/catch + move markAsFailed for preprocessing * s/fileRes/file * Remove types for FileId * 🙈 * Implement use cases for pre-processing
Description
This PR goes one step further than the original attempt at introducing a file API in #5953.
Summary
While working on vision, we identified a race condition in the process of uploading content fragment files. Currently, we create the conversation or post the message with the content fragment before uploading the associated file, which causes issues. As we plan to use URLs for vision and need to resize images (increasing upload time), the existing logic requires refactoring.
Key Changes
This PR establishes a new set of endpoints aimed at facilitating file uploads, laying the groundwork for future expansion to support various file types and use cases. The key additions include:
/files
Endpoint:/files/:fileId
Endpoint:Pre-processing is currently handled within this endpoint due to the following reasons:
In the future, this approach may need to be revised for larger files (a few gigabytes).
The concept of use cases is introduced, although not yet implemented. Future updates may apply different pre-processing based on the use case. The only supported use case at present is
conversation
, aligning with existing content fragment logic.Memory and Performance Considerations
Known Issues and Updates
google-cloud/storage
has a known issue with streams on Node.js > 21.x. Consequently, this PR includes a version bump. The update primarily changes a few types without affecting behavior.Risk
This endpoint is not yet used.
Deploy Plan