Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add accessors for enzyme associated data #724

Merged
merged 9 commits into from
Jan 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/misc/constants.jl
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ whatever purposes.
"""
const constants = (
default_reaction_bound = 1e3,
default_gene_product_bound = 1e3,
tolerance = 1e-6,
sampling_keep_iters = 100,
sampling_size = 1000,
Expand Down
99 changes: 53 additions & 46 deletions src/reconstruction/enzyme_constrained.jl
Original file line number Diff line number Diff line change
@@ -1,52 +1,51 @@
"""
$(TYPEDSIGNATURES)

Wrap a model into a [`EnzymeConstrainedModel`](@ref), following the structure given by
GECKO algorithm (see [`EnzymeConstrainedModel`](@ref) documentation for details).
Wrap a model into a [`EnzymeConstrainedModel`](@ref), following the structure
given by GECKO algorithm (see [`EnzymeConstrainedModel`](@ref) documentation for
details). Multiple capacity constraints can be placed on the model using the
kwargs.

# Arguments

- `reaction_isozymes` is a function that returns a vector of [`Isozyme`](@ref)s
for each reaction, or `nothing` if the reaction is not enzymatic.
- `gene_product_bounds` is a function that returns lower and upper bound for
concentration for a given gene product (specified by the same string gene ID as in
`reaction_isozymes`), as `Tuple{Float64,Float64}`.
- `gene_product_molar_mass` is a function that returns a numeric molar mass of
a given gene product specified by string gene ID.
- `gene_product_mass_group` is a function that returns a string group identifier for a
given gene product, again specified by string gene ID. By default, all gene
products belong to group `"uncategorized"` which is the behavior of original
- a `model` that implements the accessors `gene_product_molar_mass`,
`reaction_isozymes`, `gene_product_lower_bound`, `gene_product_upper_bound`.
- `gene_product_mass_group` is a dict that returns a vector of gene IDs
associated with each a named capacity constraint. By default, all gene
products belong to group `"uncategorized"`, which is the behavior of original
GECKO.
- `gene_product_mass_group_bound` is a function that returns the maximum mass for a given
mass group.
- `gene_product_mass_group_bound` is a dict that returns the capacity
limitation for a given named capacity constraint.

# Example
```
ecmodel = make_enzyme_constrained_model(
model;
gene_product_mass_group = Dict(
"membrane" => ["e1", "e2"],
"total" => ["e1", "e2", "e3"],
),
gene_product_mass_group_bound = Dict(
"membrane" => 0.2,
"total" => 0.5,
),
)
```

Alternatively, all function arguments may be also supplied as dictionaries that
provide the same data lookup.
# Notes
Reactions with no turnover number data, or non-enzymatic reactions that should
be ignored, must have `nothing` in the `gene_associations` field of the
associated reaction.
"""
function make_enzyme_constrained_model(
model::AbstractMetabolicModel;
reaction_isozymes::Union{Function,Dict{String,Vector{Isozyme}}},
gene_product_bounds::Union{Function,Dict{String,Tuple{Float64,Float64}}},
gene_product_molar_mass::Union{Function,Dict{String,Float64}},
gene_product_mass_group::Union{Function,Dict{String,String}} = _ -> "uncategorized",
gene_product_mass_group_bound::Union{Function,Dict{String,Float64}},
gene_product_mass_group::Dict{String,Vector{String}} = Dict(
"uncategorized" => genes(model),
),
gene_product_mass_group_bound::Dict{String,Float64} = Dict("uncategorized" => 0.5),
)
ris_ =
reaction_isozymes isa Function ? reaction_isozymes :
(rid -> get(reaction_isozymes, rid, nothing))
gpb_ =
gene_product_bounds isa Function ? gene_product_bounds :
(gid -> gene_product_bounds[gid])
gpmm_ =
gene_product_molar_mass isa Function ? gene_product_molar_mass :
(gid -> gene_product_molar_mass[gid])
gmg_ =
gene_product_mass_group isa Function ? gene_product_mass_group :
(gid -> gene_product_mass_group[gid])
gmgb_ =
gene_product_mass_group_bound isa Function ? gene_product_mass_group_bound :
(grp -> gene_product_mass_group_bound[grp])
# ...it would be nicer to have an overload for this, but kwargs can't be used for dispatch
gpb_(gid) = (gene_product_lower_bound(model, gid), gene_product_upper_bound(model, gid))

gpmm_(gid) = gene_product_molar_mass(model, gid)

columns = Vector{Types._EnzymeConstrainedReactionColumn}()
coupling_row_reaction = Int[]
Expand All @@ -60,7 +59,8 @@ function make_enzyme_constrained_model(
gene_row_lookup = Dict{Int,Int}()

for i = 1:n_variables(model)
isozymes = ris_(rids[i])
isozymes = reaction_isozymes(model, rids[i])

if isnothing(isozymes)
push!(
columns,
Expand Down Expand Up @@ -104,7 +104,7 @@ function make_enzyme_constrained_model(
length(coupling_row_gene_product)
end
(row_idx, stoich / kcat)
end for (gene, stoich) in isozyme.stoichiometry if
end for (gene, stoich) in isozyme.gene_product_stoichiometry if
haskey(gene_name_lookup, gene)
)

Expand All @@ -129,11 +129,13 @@ function make_enzyme_constrained_model(
# prepare enzyme capacity constraints
mg_gid_lookup = Dict{String,Vector{String}}()
for gid in gids[coupling_row_gene_product]
mg = gmg_(gid)
if haskey(mg_gid_lookup, mg)
push!(mg_gid_lookup[mg], gid)
else
mg_gid_lookup[mg] = [gid]
for (mg, mg_gids) in gene_product_mass_group # each gid can belong to multiple mass groups
gid ∉ mg_gids && continue
if haskey(mg_gid_lookup, mg)
push!(mg_gid_lookup[mg], gid)
else
mg_gid_lookup[mg] = [gid]
end
end
end
coupling_row_mass_group = Vector{Types._EnzymeConstrainedCapacity}()
Expand All @@ -142,7 +144,12 @@ function make_enzyme_constrained_model(
mms = gpmm_.(gs)
push!(
coupling_row_mass_group,
Types._EnzymeConstrainedCapacity(grp, idxs, mms, gmgb_(grp)),
Types._EnzymeConstrainedCapacity(
grp,
idxs,
mms,
gene_product_mass_group_bound[grp],
),
)
end

Expand Down
51 changes: 30 additions & 21 deletions src/reconstruction/simplified_enzyme_constrained.jl
Original file line number Diff line number Diff line change
@@ -1,44 +1,50 @@

"""
$(TYPEDSIGNATURES)

Construct a model with a structure given by sMOMENT algorithm; returns a
[`SimplifiedEnzymeConstrainedModel`](@ref) (see the documentation for details).

# Arguments

- `reaction_isozyme` parameter is a function that returns a single
[`Isozyme`](@ref) for each reaction, or `nothing` if the reaction is not
enzymatic. If the reaction has multiple isozymes, use
[`simplified_enzyme_constrained_isozyme_speed`](@ref) to select the fastest one, as recommended by
the sMOMENT paper.
- `gene_product_molar_mass` parameter is a function that returns a molar mass
of each gene product as specified by sMOMENT.
- a `model` that implements the accessors `gene_product_molar_mass`,
`reaction_isozymes`.
- `total_enzyme_capacity` is the maximum "enzyme capacity" in the model.

Alternatively, all function arguments also accept dictionaries that are used to
provide the same data lookup.
# Notes
The SMOMENT algorithm only uses one [`Isozyme`](@ref) per reaction. If multiple
isozymes are present the "fastest" isozyme will be used. This is determined
based on maximum kcat (forward or backward) divided by mass of the isozyme.

Reactions with no turnover number data, or non-enzymatic reactions that should
be ignored, must have `nothing` in the `gene_associations` field of the
associated reaction.
"""
function make_simplified_enzyme_constrained_model(
model::AbstractMetabolicModel;
reaction_isozyme::Union{Function,Dict{String,Isozyme}},
gene_product_molar_mass::Union{Function,Dict{String,Float64}},
total_enzyme_capacity::Float64,
)
ris_ =
reaction_isozyme isa Function ? reaction_isozyme :
(rid -> get(reaction_isozyme, rid, nothing))
gpmm_ =
gene_product_molar_mass isa Function ? gene_product_molar_mass :
(gid -> gene_product_molar_mass[gid])
# helper function to rank the isozymes by relative speed
speed_enzyme(model, isozyme) =
max(isozyme.kcat_forward, isozyme.kcat_backward) / sum(
count * gene_product_molar_mass(model, gid) for
(gid, count) in isozyme.gene_product_stoichiometry
)

# helper function to return the fastest isozyme or nothing
ris_(model, rid) = begin
isozymes = reaction_isozymes(model, rid)
isnothing(isozymes) && return nothing
argmax(isozyme -> speed_enzyme(model, isozyme), isozymes)
end

columns = Vector{Types._SimplifiedEnzymeConstrainedColumn}()

(lbs, ubs) = bounds(model)
rids = variables(model)

for i = 1:n_variables(model)
isozyme = ris_(rids[i])

isozyme = ris_(model, rids[i])

if isnothing(isozyme)
# non-enzymatic reaction (or a totally ignored one)
push!(
Expand All @@ -48,7 +54,10 @@ function make_simplified_enzyme_constrained_model(
continue
end

mw = sum(gpmm_(gid) * ps for (gid, ps) in isozyme.stoichiometry)
mw = sum(
gene_product_molar_mass(model, gid) * ps for
(gid, ps) in isozyme.gene_product_stoichiometry
)

if min(lbs[i], ubs[i]) < 0 && isozyme.kcat_backward > constants.tolerance
# reaction can run in reverse
Expand Down
3 changes: 3 additions & 0 deletions src/types/Gene.jl
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@ Base.@kwdef mutable struct Gene
id::String
name::Maybe{String} = nothing
sequence::Maybe{String} = nothing
product_molar_mass::Maybe{Float64} = nothing
product_lower_bound::Float64 = 0
product_upper_bound::Float64 = constants.default_gene_product_bound
notes::Notes = Notes()
annotations::Annotations = Annotations()
end
Expand Down
8 changes: 4 additions & 4 deletions src/types/GeneAssociations.jl
Original file line number Diff line number Diff line change
Expand Up @@ -8,21 +8,21 @@ stoichiometry or turnover numbers.
$(TYPEDFIELDS)
"""
Base.@kwdef mutable struct Isozyme
stoichiometry::Dict{String,Float64}
annotation::Annotations = Annotations()
gene_product_stoichiometry::Dict{String,Float64}
kcat_forward::Maybe{Float64} = nothing
kcat_backward::Maybe{Float64} = nothing
annotation::Annotations = Annotations()
end

"""
$(TYPEDSIGNATURES)

A convenience constructor for [`Isozyme`](@ref) that takes a string gene
reaction rule and converts it into the appropriate format. Assumes the
stoichiometry for each subunit is 1.
`gene_product_stoichiometry` for each subunit is 1.
"""
Isozyme(gids::Vector{String}; kwargs...) =
Isozyme(; stoichiometry = Dict(gid => 1.0 for gid in gids), kwargs...)
Isozyme(; gene_product_stoichiometry = Dict(gid => 1.0 for gid in gids), kwargs...)

"""
const GeneAssociations
Expand Down
31 changes: 31 additions & 0 deletions src/types/accessors/AbstractMetabolicModel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -369,3 +369,34 @@ By default, it should be safe to do nothing.
function precache!(a::AbstractMetabolicModel)::Nothing
nothing
end

"""
$(TYPEDSIGNATURES)

Return the molar mass of translated gene with ID `gid`.
"""
gene_product_molar_mass(model::AbstractMetabolicModel, gid::String)::Maybe{Float64} =
nothing

"""
$(TYPEDSIGNATURES)

Return all the [`Isozyme`](@ref)s associated with a `model` and reaction `rid`.
"""
reaction_isozymes(model::AbstractMetabolicModel, rid::String)::Maybe{Vector{Isozyme}} =
nothing

"""
$(TYPEDSIGNATURES)

Return the lower bound of the gene product concentration associated with the `model` and gene `gid`.
"""
gene_product_lower_bound(model::AbstractMetabolicModel, gid::String)::Float64 = 0.0

"""
$(TYPEDSIGNATURES)

Return the upper bound of the gene product concentration associated with the `model` and gene `gid`.
"""
gene_product_upper_bound(model::AbstractMetabolicModel, gid::String)::Float64 =
constants.default_gene_product_bound
4 changes: 2 additions & 2 deletions src/types/accessors/ModelWrapper.jl
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,8 @@ end

@inherit_model_methods_fn AbstractModelWrapper () unwrap_model () variables metabolites stoichiometry bounds balance objective reactions n_reactions reaction_variables coupling n_coupling_constraints coupling_bounds genes n_genes precache!

@inherit_model_methods_fn AbstractModelWrapper (rid::String,) unwrap_model (rid,) reaction_gene_associations reaction_subsystem reaction_stoichiometry reaction_annotations reaction_notes
@inherit_model_methods_fn AbstractModelWrapper (rid::String,) unwrap_model (rid,) reaction_gene_associations reaction_subsystem reaction_stoichiometry reaction_annotations reaction_notes reaction_isozymes

@inherit_model_methods_fn AbstractModelWrapper (mid::String,) unwrap_model (mid,) metabolite_formula metabolite_charge metabolite_compartment metabolite_annotations metabolite_notes

@inherit_model_methods_fn AbstractModelWrapper (gid::String,) unwrap_model (gid,) gene_annotations gene_notes
@inherit_model_methods_fn AbstractModelWrapper (gid::String,) unwrap_model (gid,) gene_annotations gene_notes gene_product_molar_mass gene_product_lower_bound gene_product_upper_bound
40 changes: 38 additions & 2 deletions src/types/models/ObjectModel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -173,7 +173,10 @@ function Accessors.reaction_gene_associations(
id::String,
)::Maybe{GeneAssociationsDNF}
isnothing(model.reactions[id].gene_associations) && return nothing
[collect(keys(rga.stoichiometry)) for rga in model.reactions[id].gene_associations]
[
collect(keys(rga.gene_product_stoichiometry)) for
rga in model.reactions[id].gene_associations
]
end

"""
Expand Down Expand Up @@ -297,6 +300,38 @@ Accessors.gene_name(m::ObjectModel, gid::String) = m.genes[gid].name
"""
$(TYPEDSIGNATURES)

Return the molar mass of translated gene with ID `gid`.
"""
Accessors.gene_product_molar_mass(model::ObjectModel, gid::String) =
model.genes[gid].product_molar_mass

"""
$(TYPEDSIGNATURES)

Return the [`Isozyme`](@ref)s associated with the `model` and reaction `rid`.
"""
Accessors.reaction_isozymes(model::ObjectModel, rid::String) =
model.reactions[rid].gene_associations

"""
$(TYPEDSIGNATURES)

Return the lower bound of the gene product concentration associated with the `model` and gene `gid`.
"""
Accessors.gene_product_lower_bound(model::ObjectModel, gid::String) =
model.genes[gid].product_lower_bound

"""
$(TYPEDSIGNATURES)

Return the upper bound of the gene product concentration associated with the `model` and gene `gid`.
"""
Accessors.gene_product_upper_bound(model::ObjectModel, gid::String) =
model.genes[gid].product_upper_bound

"""
$(TYPEDSIGNATURES)

Convert any `AbstractMetabolicModel` into a `ObjectModel`. Note, some data loss
may occur since only the generic interface is used during the conversion
process. Additionally, assume the stoichiometry for each gene association is 1.
Expand Down Expand Up @@ -354,7 +389,8 @@ function Base.convert(::Type{ObjectModel}, model::AbstractMetabolicModel)
upper_bound = ubs[i],
gene_associations = isnothing(rgas) ? nothing :
[
Isozyme(; stoichiometry = Dict(k => 1.0 for k in rga)) for rga in rgas
Isozyme(; gene_product_stoichiometry = Dict(k => 1.0 for k in rga)) for
rga in rgas
],
notes = reaction_notes(model, rid),
annotations = reaction_annotations(model, rid),
Expand Down
Loading