Document what Nix *is* #6420

Merged
merged 124 commits into from
Aug 4, 2022
Commits
523359d
WIP: Document the design of Nix
Ericson2314 Nov 24, 2020
a2b3160
Briefly describe the digest of a store path
Ericson2314 Nov 24, 2020
e64633f
Flesh out TOC
Ericson2314 Nov 24, 2020
a210504
Apply suggestions from code review
Ericson2314 Mar 21, 2022
e3a0209
Move the bits on relocating store entires to the end
Ericson2314 Mar 21, 2022
678d75b
Start on the derivations section
Ericson2314 Mar 21, 2022
f5386d7
Fix stub file's name
Ericson2314 Mar 21, 2022
a04340f
Update doc/manual/src/design/overview.md
Ericson2314 Mar 22, 2022
75c5191
Update doc/manual/src/design/overview.md
Ericson2314 Mar 22, 2022
cdb0bf3
Update doc/manual/src/design/overview.md
Ericson2314 Mar 22, 2022
e308602
Update doc/manual/src/design/store/drvs/drvs.md
Ericson2314 Apr 2, 2022
4e2d5ae
doc: Store entry -> store object
Ericson2314 Apr 19, 2022
838ba26
Rename files after store entry -> store object rename
Ericson2314 Apr 19, 2022
1bbad62
doc: File system data -> file system object, to match Nix
Ericson2314 Apr 19, 2022
5f4d2ac
Improve store object section
Ericson2314 Apr 19, 2022
b4df351
Relocability -> relocation in store object title
Ericson2314 Apr 19, 2022
55b437b
Improve store path section
Ericson2314 Apr 19, 2022
b98dc3b
store objects, better opining sentances
Ericson2314 Apr 19, 2022
e4eea5e
Include abstract syntax based on the thesis for FSOs
Ericson2314 Apr 19, 2022
4e4bbd9
Improve store objects session more
Ericson2314 Apr 19, 2022
c86c1ec
Make refernces sneak preview more concise
Ericson2314 Apr 19, 2022
0737094
Add draft "Rosetta stone" by @fricklerhandwerk and stub commentary
Ericson2314 Apr 22, 2022
0eae4bf
reword overview with clear terminology
fricklerhandwerk Apr 21, 2022
327ccd3
only use generic build system terminology
fricklerhandwerk Apr 21, 2022
804e8bd
indicate sequence with "then"
fricklerhandwerk Apr 22, 2022
23ee0b2
correctly use comma for nesting
fricklerhandwerk Apr 22, 2022
51e6bed
do not mention implementation details
fricklerhandwerk Apr 22, 2022
89a7c95
Apply suggestions from code review
Ericson2314 Apr 22, 2022
b387d80
remove sentence for chapter transition
fricklerhandwerk Apr 24, 2022
34ea74c
reword introductory section
fricklerhandwerk Apr 26, 2022
7598126
remove separate meta-section, add architecture diagram
fricklerhandwerk Apr 27, 2022
d300337
address Nix language consistently as configuration language
fricklerhandwerk Apr 27, 2022
39f0117
design -> architecture, add motivation
fricklerhandwerk Apr 27, 2022
c8c1b70
reword section on Nix store
fricklerhandwerk Apr 24, 2022
7b5c00f
add concrete store examples, reword note on file system
fricklerhandwerk Apr 27, 2022
070c854
fix grammar
fricklerhandwerk Apr 28, 2022
5f96a0b
associated operations are not collected
fricklerhandwerk Apr 28, 2022
610ddf4
reword introduction to rosetta stone, add links
fricklerhandwerk Apr 25, 2022
ca5ebf6
revert build plan/step distinction, reorder rows
fricklerhandwerk Apr 28, 2022
40efe5b
build instrcution: Task -> function
fricklerhandwerk Apr 28, 2022
a145007
component -> store object, realisation -> build
fricklerhandwerk Apr 28, 2022
e5e4859
move git comparison to related work
fricklerhandwerk Apr 28, 2022
90fc5b4
reword file system objects
fricklerhandwerk Apr 28, 2022
fb2ec7e
reword section on references
fricklerhandwerk Apr 28, 2022
5fda995
formalize file system objects
fricklerhandwerk Apr 28, 2022
07d490f
stores can also delete objects
fricklerhandwerk Apr 28, 2022
e90586c
add motivation for references
fricklerhandwerk Apr 28, 2022
b5ca3d1
reword details on keeping closure property
fricklerhandwerk Apr 29, 2022
b01bb65
Fix rel path in doc
Ericson2314 Apr 29, 2022
3d8f2f5
Fix manual TOC links
Ericson2314 Apr 28, 2022
1ba6d8f
remove incomplete section: building
fricklerhandwerk May 3, 2022
96876b1
remove incomplete section: related work
fricklerhandwerk May 3, 2022
7cec9ee
remove incomplete section: relocatability
fricklerhandwerk May 3, 2022
b18852e
remove incomplete section: content-addressed objects
fricklerhandwerk May 3, 2022
3bd125e
remove incomplete section: nix archives
fricklerhandwerk May 3, 2022
ad8c2ed
remove incomplete section: input/content-addressing
fricklerhandwerk May 3, 2022
d3effd0
update architecture diagram
fricklerhandwerk May 3, 2022
87523f0
match grammatical case to arrow direction
fricklerhandwerk May 10, 2022
902638c
build step -> build rule
fricklerhandwerk May 3, 2022
2a8532f
build rule -> build task
fricklerhandwerk May 9, 2022
689b32a
clarify relation of tasks and plans
fricklerhandwerk May 12, 2022
75ce324
use singular for class names consistently
fricklerhandwerk May 4, 2022
68d2601
architecture overview: add link to Nix expression language reference
fricklerhandwerk May 10, 2022
ef81276
architecture overview: add link to command line reference
fricklerhandwerk May 10, 2022
0e63b9b
add link from overview to store section
fricklerhandwerk May 4, 2022
25926c5
Nix store does not underly literally everything
fricklerhandwerk May 11, 2022
2303f84
revert to "build plan" in overview diagram
fricklerhandwerk May 20, 2022
4639b36
use reference links for URLs
fricklerhandwerk May 20, 2022
7c3bca1
revert to build plans in top-level overview
fricklerhandwerk May 20, 2022
d5eea66
introduce build tasks
fricklerhandwerk May 26, 2022
b6b112b
use reference links for URLs
fricklerhandwerk May 26, 2022
e72a787
beautify rosetta table
fricklerhandwerk May 26, 2022
207992a
introduce store and store objects without file system details
fricklerhandwerk May 11, 2022
b84f2bd
introduce mapping to Unix files and processes
fricklerhandwerk May 11, 2022
4eb11d4
fix grammar for clarity
fricklerhandwerk May 19, 2022
4adb660
clarify first sentence on store objects
fricklerhandwerk May 26, 2022
db8703b
use reference links for URLs
fricklerhandwerk May 26, 2022
445f753
replace pseudo code by diagrams
fricklerhandwerk May 26, 2022
4341849
move closure property to discussion references
fricklerhandwerk May 26, 2022
843288a
add subsections for objects and references
fricklerhandwerk May 26, 2022
e63a768
use reference links for URLs
fricklerhandwerk May 26, 2022
7b7e4c6
use singular to match section heading
fricklerhandwerk May 26, 2022
3794618
add commas between output values
fricklerhandwerk Jun 2, 2022
80de4a4
operations diagram: store' to the right
fricklerhandwerk Jun 2, 2022
195aa28
references are added according to build task
fricklerhandwerk Jun 2, 2022
7993ba1
constrain garbage collection scope
fricklerhandwerk Jun 2, 2022
a90fc62
make clear that file system is for processes
fricklerhandwerk Jun 2, 2022
19d8a5d
move first mention of file system object before diagram
fricklerhandwerk Jun 2, 2022
93f721b
remove draft on derivations
fricklerhandwerk Jun 21, 2022
84ddfbf
remove diagonal from operations diagram
fricklerhandwerk Jun 8, 2022
f632816
add explanation and examples of file system objects
fricklerhandwerk Jun 8, 2022
fa7ad45
explain store directory
fricklerhandwerk Jun 9, 2022
1681f4e
better explain reference scanning
fricklerhandwerk Jun 13, 2022
9c54481
paths -> path
fricklerhandwerk Jun 21, 2022
c10dccc
make example a simple list
fricklerhandwerk Jun 21, 2022
631ca18
reword notes on copying
fricklerhandwerk Jun 21, 2022
7c656d9
simplify description of diagram
fricklerhandwerk Jun 21, 2022
ec43977
store: match chapter introduction to outline
fricklerhandwerk Jun 9, 2022
348432f
store: add concept map
fricklerhandwerk Jun 9, 2022
d8b2f9f
make concept map more compact
fricklerhandwerk Jun 9, 2022
475a332
make concept map even more compact
fricklerhandwerk Jun 9, 2022
a28d687
concept map: put closure as it is in the chapter
fricklerhandwerk Jun 9, 2022
c345345
concept map: align hights
fricklerhandwerk Jun 9, 2022
def80d5
add subsections to table of contents
fricklerhandwerk Jun 8, 2022
fe4c0b8
fix typo
fricklerhandwerk Jul 12, 2022
de5dea4
use correct Nix entity
fricklerhandwerk Jul 13, 2022
5a5a956
note customized base32
fricklerhandwerk Jul 13, 2022
bac8623
use "build plan" consistently
fricklerhandwerk Jul 13, 2022
9cabba1
mention hard links
fricklerhandwerk Jul 13, 2022
29c0625
hashes: truncate -> reduce, mention SHA-256
fricklerhandwerk Jul 13, 2022
0228eb8
add Java example on manual dependency declaration
fricklerhandwerk Jul 13, 2022
db6faf4
clarify what store objects can be
fricklerhandwerk Jul 28, 2022
00a7eae
add file system object to table of contents
fricklerhandwerk Aug 4, 2022
b7309ce
move architecture to the end
fricklerhandwerk Aug 4, 2022
3df1ee2
clarify what explicitly declaring certain dependencies means
fricklerhandwerk Aug 4, 2022
8cec32e
fix directory tree renderings
fricklerhandwerk Aug 4, 2022
cc3a5f4
use correct mdBook syntax for callouts
fricklerhandwerk Aug 4, 2022
b631742
fix page rendering
fricklerhandwerk Aug 4, 2022
bc11885
Merge remote-tracking branch 'upstream/master' into doc-what-is-nix
Ericson2314 Aug 4, 2022
b74a3f5
Fix gitignore
Ericson2314 Aug 4, 2022
b430a67
Remove sections within from SUMMARY
Ericson2314 Aug 4, 2022
016d7a8
Fix rosetta stone file name
Ericson2314 Aug 4, 2022
6f6498f
Remove header fragments which is not needd
Ericson2314 Aug 4, 2022
39d32ac
Add disclaimer that arch section is WIP and links may rot
Ericson2314 Aug 4, 2022
2 changes: 1 addition & 1 deletion .gitignore
@@ -22,7 +22,7 @@ perl/Makefile.config
/doc/manual/src/SUMMARY.md
/doc/manual/src/command-ref/new-cli
/doc/manual/src/command-ref/conf-file.md
-/doc/manual/src/expressions/builtins.md
+/doc/manual/src/language/builtins.md

# /scripts/
/scripts/nix-profile.sh
6 changes: 5 additions & 1 deletion doc/manual/local.mk
@@ -1,5 +1,9 @@
ifeq ($(doc_generate),yes)

MANUAL_SRCS := \
$(call rwildcard, $(d)/src, *.md) \
$(call rwildcard, $(d)/src, */*.md)

# Generate man pages.
man-pages := $(foreach n, \
nix-env.1 nix-build.1 nix-shell.1 nix-store.1 nix-instantiate.1 \
@@ -97,7 +101,7 @@ doc/manual/generated/man1/nix3-manpages: $(d)/src/command-ref/new-cli
done
@touch $@

-$(docdir)/manual/index.html: $(MANUAL_SRCS) $(d)/book.toml $(d)/anchors.jq $(d)/custom.css $(d)/src/SUMMARY.md $(d)/src/command-ref/new-cli $(d)/src/command-ref/conf-file.md $(d)/src/language/builtins.md $(call rwildcard, $(d)/src, *.md)
+$(docdir)/manual/index.html: $(MANUAL_SRCS) $(d)/book.toml $(d)/anchors.jq $(d)/custom.css $(d)/src/SUMMARY.md $(d)/src/command-ref/new-cli $(d)/src/command-ref/conf-file.md $(d)/src/language/builtins.md
$(trace-gen) RUST_LOG=warn mdbook build doc/manual -d $(DESTDIR)$(docdir)/manual

endif
6 changes: 6 additions & 0 deletions doc/manual/src/SUMMARY.md.in
@@ -59,6 +59,12 @@
@manpages@
- [Files](command-ref/files.md)
- [nix.conf](command-ref/conf-file.md)
- [Architecture](architecture/architecture.md)
  - [Store](architecture/store/store.md)
    - [Closure](architecture/store/store/closure.md)
    - [Build system terminology](architecture/store/store/build-system-terminology.md)
    - [Store Path](architecture/store/path.md)
    - [File System Object](architecture/store/fso.md)
- [Glossary](glossary.md)
- [Contributing](contributing/contributing.md)
- [Hacking](contributing/hacking.md)
79 changes: 79 additions & 0 deletions doc/manual/src/architecture/architecture.md
@@ -0,0 +1,79 @@
# Architecture

*(This chapter is unstable and a work in progress. Incoming links may rot.)*

This chapter describes how Nix works.
Contributor

This may sound like splitting hairs, but instead of the word "Nix", the manuals should say explicitly what the term refers to, even if it is to the detriment of brevity.[1] Depending on the context, I saw it used to mean:

  • Nix core concepts / paradigm / model
  • Nix CLI tools
  • Nix expression language
  • the implementation of Nix (i.e., NixOS/nix)
  • the Nix ecosystem as a whole (i.e., NixOS/nix implementation, Nixpkgs, NixOS, even NixOps)
  • Nixpkgs (rarely, but still)
  • (there are probably more...)

Taking the very next sentence as an example:

> It should help users understand why Nix behaves as it does, and it should help developers understand how to modify Nix and how to write similar tools.

  • "Nix behaves": I presume this alludes to the Nix model and/or the implementation
  • "to modify Nix": To modify a Nix expression, the behavior of the Nix CLI tools, the store object in the Nix store - or something else?

I guess the "Architecture" section is about the model itself, and if so, it would be prudent to start with a short explanation of what it is about. (It would be nice to have a specification.) Below are two ways I tried to define "Nix model" to myself:

  1. A purely functional deployment model
    ... then explain what purity[2], "functional" (this from PR "Greatly expand architecture section, including splitting into abstract vs concrete model" #6877), and "deployment model" mean.
    (Took it directly from the thesis; the omission of the word "software" is deliberate.)

  2. A (new) paradigm of building anything that requires assembly[3], and managing the build results. The official term for "build result" is store object because they are kept in the Nix store.


[1]: The definition of "Nix" has been a subject of contention on many online forums, but personally, it even made me stop trying out Nix in the beginning and give up on certain parts of it a couple of times. I found the tutorials, posts, manuals, etc. confusing because sometimes there were sudden context switches, and not knowing what other aspects of Nix there are, I couldn't connect the dots (heh, still can't most of the time).

[2]: Thesis, page 21 (PDF page 29), 2nd paragraph from bottom.

[3]: My first try was "A (new) paradigm of building digital artifacts (where the input can be anything that requires assembly), and managing the build results", but the term "digital artifact" seems just as overloaded as "component" or "package".

It should help users understand why Nix behaves as it does, and it should help developers understand how to modify Nix and how to write similar tools.
Contributor

I don't love this introduction although I know it is lifted from the Gazelle documentation. I, at least personally, feel like the concepts described are essential to using Nix/NixOS/Nixpkgs as a developer. Maybe I am over-estimating the difficulty a little, but I think if you want to do anything more with Nix than have a neat roll-back-able package manager, understanding the basic concepts that this document provides is critical. If you want to start hacking on Nix itself or take inspiration for your own tools, I think there's a lot more to know.

My opinion is that this introduction might push people away. I think it is good for all Nix users, not just the curious. Perhaps we could add a bit in the middle "developers understand how to leverage Nix", then replace the existing "developers" with "those interested", or remove that last bit entirely. To me, this document is about using and understanding Nix and its tooling and ecosystem rather than the gory details of its internal operation. Like you'd read this to understand how to modify Nixpkgs and how to write your own similar expressions, instead of how to write your own build system programming language. Hopefully this makes sense.

Contributor

> if you want to do anything more with Nix than have a neat roll-back-able package manager, understanding the basic concepts that this document provides is critical

I thought so, too. After discussing with experienced people a lot, and doing studies with actual users, I have to disagree. We instead need better learning material, and this is why we will put work into nix.dev. I'm still convinced we have to front-load learners with specific knowledge instead of handing them inscrutable examples and templates, as Nix is different from what they usually know. But it has to be the right things, and user-oriented knowledge is different from the conceptual background presented here.

> To me, this document is about using and understanding Nix and its tooling

There should be some very condensed variant of this in the nix.dev guides. Right now it's not a priority for me. Would you be interested in giving it a try? The depth presented here should also be available on the onboarding path, but clearly optional. I'm convinced it's not really needed to use Nix effectively, but eventually studies will have to answer that question.

> If you want to start hacking on Nix itself or take inspiration for your own tools, I think there's a lot more to know.

Yes, absolutely. This architecture chapter should be the highest level of abstraction to start with. Eventually we should have paths into the source code, with additional prose sections discussing more detailed concepts, and then linking to the code to help readers discover the implementation.


## Overview

Nix consists of [hierarchical layers][layer-architecture].

```
+-----------------------------------------------------------------+
| Nix |
Member

Suggested change
| Nix |
| packaging system |

Tbh this feels a bit absurd, but where do we draw the line of what terminology is "too specific" for the introduction?

By the same reasoning, we should perhaps replace evaluates to by computes?

Contributor

> Tbh this feels a bit absurd, but where do we draw the line of what terminology is "too specific" for the introduction?

Exactly, this is why it is that way and nothing else.

> By the same reasoning, we should perhaps replace evaluates to by computes?

We use "evaluate" because that is what pure languages do. That term is also understood by people not into programming language theory, as it is self-explanatory - evaluation produces a value.

| [ command line interface ]------, |
| | | |
| evaluates | |
| | manages |
| V | |
| [ configuration language ] | |
| | | |
| +-----------------------------|-------------------V-----------+ |
| | store evaluates to | |
Member

Suggested change
| | store evaluates to | |
| | process and data layer | | |
| | evaluates to | |

Contributor

Maybe we should rather rename the "persistence layer" in the table to "store"?

| | | | |
| | referenced by V builds | |
| | [ build input ] ---> [ build plan ] ---> [ build result ] | |
Member

Suggested change
| | [ build input ] ---> [ build plan ] ---> [ build result ] | |
| | [ build input ] ---> [ build plan ] ---> [ outputs ] | |

Isn't "output" the technical term for this, or should "result" encompass more than just the output? If so, what?
Example: nix-store -q --outputs

We should have a small glossary to which we then limit our use of terminology.

I wish markdown had `set -u` 😉 (nounset)

Contributor

> Isn't "output" the technical term for this

It's the Nix term for this, as is derivation for what we call build task here. We deliberately use generic terms in the overview, as none of that is specific to Nix.

> We should have a small glossary to which we then limit our use of terminology.

We have that as the Rosetta Stone. The idea is to introduce Nix specific terminology in the subsequent sections, where we also describe how Nix implements the concepts in principle. We could move the table to the overview to accommodate newcomers and experienced Nix users, see #6420 (comment).

Member

> as none of that is specific to Nix.

Then Nix should be package manager and store should be data and process layer.

> Rosetta Stone

This doesn't scale to encompass terminology that is specific to Nix, so we should still have a glossary.

Would it make sense to have a translation of the entire diagram? Seems like a good way to learn the new terminology.

Member

I do find the duplication of terminology very concerning.
I think I understand the desire for making the early explanation more accessible, but it seems that you'll extend the learning curve the longer you hold on to the generic terms. It also makes the chapter unsuitable as reference material, which may be ok?

Contributor
fricklerhandwerk Jul 13, 2022

> Then Nix should be package manager and store should be data and process layer.

Sounds reasonable, please add a suggestion, considering the following:

  • data and process layer reads a bit clunky, although it's the correct term
    • should we call it "store" generically instead, in the table?
  • how do we still clarify this diagram refers to how Nix does things?

> This doesn't scale to encompass terminology that is specific to Nix, so we should still have a glossary.

I hope this would be covered by additional sections, see the colored outline comment above. Is there anything that would still be missing?
Not against a glossary, it just needs enough substance. Maybe abbreviations such as drv or FSO, which we currently don't use and should not use in the official writing, but people may find helpful to decipher. On the other hand those may be NixOS Wiki material.

> Would it make sense to have a translation of the entire diagram? Seems like a good way to learn the new terminology.

Would take too much head space for my taste. Or maybe just link to the respective pages behind the table, not mentioning it in the table of contents? Sounds like detail work though, not relevant for this PR.

Member

Humans are really good at memorizing things in places, so I would guess that familiarizing oneself with the general diagram and then learning the technical terms in the same "place" would be highly effective. (I'm not a cognitive psychologist though)

You'd have to put them close together for this correspondence to be noticed; not one at the start and one at the end.

> Is there anything that would still be missing?

  • Terminology that is more specific, like "store path", "store hash", "store directory".
  • Terminology that builds on top of the core architecture, such as "input", which has a very specific meaning in Nixpkgs

One could argue that the latter is out of scope for this document, but in practice, all terminology throughout the ecosystem shares the same namespace. We should recognize that and have a single place to look all of them up, and disambiguate them where necessary. (e.g. "input", which is already conflicting with a term used in this chapter)

> how do we still clarify this diagram refers to how Nix does things?

The context would imply this, but it'd be helpful to have a caption such as "An overview of the Nix architecture using generic terminology", which then contrasts well with "An overview of the Nix architecture using its specific terms".

Contributor

> Terminology that is more specific, like "store path", "store hash", "store directory".

We have that in the table of contents, and it is entirely Nix-specific.

> One could argue that the latter is out of scope for this document, but in practice, all terminology throughout the ecosystem shares the same namespace.

Agreed on potential confusion. Suggestions to do anything about that except always referring to "build input" in generic terms, because that is different from just "input"?

> You'd have to put them close together for this correspondence to be noticed; not one at the start and one at the end.

I agree with the learning conditions premise, but cannot picture a specific change. Feel free to make a pull request on the fork against this branch.

| | | |
| +-------------------------------------------------------------+ |
+-----------------------------------------------------------------+
```

At the top is the [command line interface](../command-ref/command-ref.md), translating from invocations of Nix executables to interactions with the underlying layers.

Below that is the [Nix expression language](../expressions/expression-language.md), a [purely functional][purely-functional-programming] configuration language.
Comment on lines +32 to +34
Contributor

When I started thinking about the term "Nix", this is the hierarchy that helped me put things into perspective:


       ┌────────── Nix MODEL ──────────┐
       │                               │
       │ (This would be a spec, but    │
       │  it is the implementation     │
       │  itself, right?)              │
       │                               │
       └───────────────┬───────────────┘
                       │
                       │
                       │
                       ▼
┌───────────── Nix IMPLEMENTATION ──────────┐
│                 ("the core")              │
│                                           │    e.g., modules/c++ files
│         Implementation of the model       │          implementing flakes
│ (C++ code defining the building blocks of │
│  Nix: Nix lexer & parser, behaviour of    │
│  the Nix store, store objects and actions │
│  etc.)                                    │
│                                           │
└──────────────────────┬────────────────────┘
                       │
                       │
                       │
                       ▼
┌─────────────── Nix FRONTEND ──────────────┐            
│                                           │
│   + CLI tools that use the "core" libs    │    e.g., `nix flakes` commands
│                                           │
│   + Nix lang                              │
│     (Well, it is interpreted by the CLI   │
│      tools, but this is what is used to   │
│      interact with the system. Could be   │
│      wrong.)                              │
│                                           │
└───────────────────────────────────────────┘

It is used to compose expressions which ultimately evaluate to self-contained *build plans*, used to derive *build results* from referenced *build inputs*.
Contributor

As someone who has managed to wrap their head around Nix's confusing usage of terminology, I wonder if a pointer here to the rosetta stone would be good, marked for experienced Nix users. It is something I referred to several times while writing this review to make sure my understanding was correct, as this guide (for good reason) uses terms I'm unfamiliar with.

Maybe it should be *build plans* (or derivation closures) too.

Contributor

Agreed with the problem. I want to keep this essential section as concise as possible, though. Therefore definitely will not add variants in parentheses.

@tpwrules @Ericson2314 What about moving the Rosetta Stone to the end of the overview section?


The command line and Nix language are what users interact with most.

> **Note**
> The Nix language itself does not have a notion of *packages* or *configurations*.
> As far as we are concerned here, the inputs and results of a build plan are just data.

Underlying these is the [Nix store](./store/store.md), a mechanism to keep track of build plans, data, and references between them.
It can also execute build plans to produce new data.

A build plan is a series of *build tasks*.
Each build task has a special build input which is used as *build instructions*.
The result of a build task can be input to another build task.

```
+-----------------------------------------------------------------------------------------+
| store                                                                                   |
|                 .................................................                       |
|                 : build plan                                    :                       |
|                 :                                               :                       |
|  [ build input ]-----instructions-,                             :                       |
|                 :                 |                             :                       |
|                 :                 v                             :                       |
|  [ build input ]----------->[ build task ]--instructions-,      :                       |
|                 :                                        |      :                       |
|                 :                                        |      :                       |
|                 :                                        v      :                       |
|                 :                                 [ build task ]----->[ build result ]  |
|  [ build input ]-----instructions-,                      ^      :                       |
|                 :                 |                      |      :                       |
|                 :                 v                      |      :                       |
|  [ build input ]----------->[ build task ]---------------'      :                       |
|                 :                 ^                             :                       |
|                 :                 |                             :                       |
|  [ build input ]------------------'                             :                       |
|                 :                                               :                       |
|                 :                                               :                       |
|                 :...............................................:                       |
|                                                                                         |
+-----------------------------------------------------------------------------------------+
```
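
The relationship between build inputs, build instructions, and build tasks shown in the diagram can be illustrated with a minimal Python sketch. It is not how Nix is implemented: the store is modelled as a plain dictionary, build instructions are ordinary Python callables, and every name in it (`run_build_plan`, `BUILD_INPUTS`, `TASKS`, and the entry names) is hypothetical. It only shows that a task can run once its instructions and inputs exist, and that a task's result can serve as input or as instructions for a later task.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_build_plan(tasks, build_inputs):
    """Run each build task once its instructions and inputs are in the store.

    `tasks` maps a task name to (instructions_name, [input_names]).
    The returned dictionary is the toy "store" after all tasks have run.
    """
    store = dict(build_inputs)  # copy, so the caller's inputs stay untouched
    # Each task depends on its instructions and on its other inputs.
    graph = {name: {instr, *inputs} for name, (instr, inputs) in tasks.items()}
    for name in TopologicalSorter(graph).static_order():
        if name in tasks:  # plain build inputs are already in the store
            instr, inputs = tasks[name]
            store[name] = store[instr](*(store[i] for i in inputs))
    return store


# Hypothetical build inputs: data and instructions live side by side.
BUILD_INPUTS = {
    "a.txt": b"hello",
    "b.txt": b"world",
    "sep": b" ",
    "upper-all": lambda *parts: [p.upper() for p in parts],
    "make-joiner": lambda sep: (lambda parts: sep.join(parts)),
}

# Hypothetical build plan: the result of "joiner" is used as instructions,
# the result of "words" is used as an ordinary input.
TASKS = {
    "joiner": ("make-joiner", ["sep"]),
    "words": ("upper-all", ["a.txt", "b.txt"]),
    "result": ("joiner", ["words"]),
}

if __name__ == "__main__":
    print(run_build_plan(TASKS, BUILD_INPUTS)["result"])
```

The topological ordering stands in for the arrows of the diagram: no task runs before everything it points back to is available.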

[layer-architecture]: https://en.m.wikipedia.org/wiki/Multitier_architecture#Layers
[purely-functional-programming]: https://en.m.wikipedia.org/wiki/Purely_functional_programming
69 changes: 69 additions & 0 deletions doc/manual/src/architecture/store/fso.md
@@ -0,0 +1,69 @@
# File System Object
Member

Is it intended that this file isn't directly reachable from the outline?


The Nix store uses a simple file system model for the data it holds in [store objects](store.md#store-object).

Every file system object is one of the following:

- File: an executable flag, and arbitrary data for contents
- Directory: mapping of names to child file system objects
- [Symbolic link][symlink]: may point anywhere.

We call a store object's outermost file system object the *root*.

    data FileSystemObject
      = File { isExecutable :: Bool, contents :: Bytes }
      | Directory { entries :: Map FileName FileSystemObject }
      | SymLink { target :: Path }
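
For readers more at home in Python, the same three-case model can be sketched as follows. This is only an illustration of the data model described above, not code from Nix; the class names mirror the definition, while `list_paths` and the example tree are hypothetical. As in the model, hard links are not represented: every object below the root has exactly one parent and one name.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Tuple, Union


@dataclass(frozen=True)
class File:
    contents: bytes
    is_executable: bool = False


@dataclass(frozen=True)
class Symlink:
    target: str  # may point anywhere, even outside the root


@dataclass(frozen=True)
class Directory:
    entries: Dict[str, "FileSystemObject"] = field(default_factory=dict)


FileSystemObject = Union[File, Directory, Symlink]


def list_paths(obj: FileSystemObject, prefix: str = ".") -> Iterator[Tuple[str, str]]:
    """Yield (path, kind) for `obj` and, recursively, for everything below it."""
    if isinstance(obj, File):
        yield prefix, "executable file" if obj.is_executable else "file"
    elif isinstance(obj, Symlink):
        yield prefix, f"symlink -> {obj.target}"
    else:
        yield prefix, "directory"
        for name, child in sorted(obj.entries.items()):
            yield from list_paths(child, f"{prefix}/{name}")


# Hypothetical root object: a directory holding an executable and a relative symlink.
root = Directory({
    "bin": Symlink("share/bin"),
    "share": Directory({"bin": Directory({"hello": File(b"#!/bin/sh\n", True)})}),
})

if __name__ == "__main__":
    for path, kind in list_paths(root):
        print(path, kind)
```

A bare `File` or `Symlink` can be passed to `list_paths` just as well, which corresponds to a root that is not a directory.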

Contributor

It should be clarified somewhere here that the model does not conceptualize hard links, although they can legally appear in the file system representation of a store. Each FSO which is not the root has exactly one parent and one name.

Contributor

Added.

Examples:

- a directory with contents

/nix/store/<hash>-hello-2.10
├── bin
│   └── hello
└── share
    ├── info
    │   └── hello.info
    └── man
        └── man1
            └── hello.1.gz

- a directory with relative symlink and other contents

/nix/store/<hash>-go-1.16.9
├── bin -> share/go/bin
├── nix-support/
└── share/

- a directory with absolute symlink

/nix/store/d3k...-nodejs
└── nix_node -> /nix/store/f20...-nodejs-10.24.

A bare file or symlink can be a root file system object.
Examples:

/nix/store/<hash>-hello-2.10.tar.gz

/nix/store/4j5...-pkg-config-wrapper-0.29.2-doc -> /nix/store/i99...-pkg-config-0.29.2-doc

Symlinks pointing outside of their own root or to a store object without a matching reference are allowed, but might not function as intended.
Examples:

- an arbitrarily symlinked file may change or not exist at all

/nix/store/<hash>-foo
└── foo -> /home/foo

- if a symlink to a store path was not automatically created by Nix, it may be invalid or get invalidated when the store object is deleted

Contributor

Not sure what "automatically created by Nix" means here. Do you mean GC roots?

This might confuse derivation writers as unless you play nasty games with string contexts or randomly choose a store path out of thin air, creating a symlink in a derivation using a simple ln -s that points to another store path will create a reference, guaranteeing that the pointed-to path exists (if not the particular file linked within it) and cannot be deleted.

Contributor

@fricklerhandwerk Jul 13, 2022

"Automatically" means through Nix mechanisms.

Both will render to the same file system object:

this will not create a reference

ln -s /nix/store/<hash>-my-derivation

this will create a reference

ln -s ${my-derivation}

See #4280 (comment)

@tpwrules Do you have a suggestion to put it concisely and be more precise?


/nix/store/<hash>-bar
└── bar -> /nix/store/abc...-foo

Nix file system objects do not support [hard links][hardlink]:
each file system object which is not the root has exactly one parent and one name.
However, as store objects are immutable, an underlying file system can use hard links for optimization.

[symlink]: https://en.m.wikipedia.org/wiki/Symbolic_link
[hardlink]: https://en.m.wikipedia.org/wiki/Hard_link
105 changes: 105 additions & 0 deletions doc/manual/src/architecture/store/path.md
@@ -0,0 +1,105 @@
# Store Path

Nix implements [references](store.md#reference) to [store objects](store.md#store-object) as *store paths*.

Store paths are pairs of

- a 20-byte [digest](#digest) for identification
- a symbolic name for people to read.

Example:

- digest: `b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z`
- name: `firefox-33.1`

A store path is rendered to a file system path as the concatenation of

- [store directory](#store-directory)
- path-separator (`/`)
- [digest](#digest) rendered in a custom variant of [base-32](https://en.m.wikipedia.org/wiki/Base32) (20 arbitrary bytes become 32 ASCII characters)
- hyphen (`-`)
- name

Example:

      /nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1
      |--------| |------------------------------| |----------|
    store directory           digest                  name
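
The rendering steps listed above can be sketched in Python. This is an illustration under stated assumptions rather than the reference implementation: the alphabet (digits and lowercase letters without `e`, `o`, `t`, `u`) and the digit order (the digest read as a little-endian number, most significant 5-bit group first) are assumptions about the "custom variant of base-32", and the function names are hypothetical.

```python
# Assumed alphabet of the base-32 variant: 32 characters, omitting e, o, t, u.
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(data: bytes) -> str:
    """Encode bytes as 5-bit groups, most significant group first.

    Assumption: the bytes are interpreted as a little-endian integer.
    """
    n_chars = (len(data) * 8 + 4) // 5  # ceil(number of bits / 5)
    value = int.from_bytes(data, "little")
    return "".join(
        NIX_BASE32_ALPHABET[(value >> (5 * i)) & 0x1F]
        for i in reversed(range(n_chars))
    )


def render_store_path(store_dir: str, digest: bytes, name: str) -> str:
    """Concatenate store directory, '/', encoded digest, '-', and name."""
    if len(digest) != 20:
        raise ValueError("a store path digest is 20 bytes long")
    return f"{store_dir}/{nix_base32(digest)}-{name}"
```

With a 20-byte digest the encoding uses 160 / 5 = 32 characters, which matches the length of the digest part in the example.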

## Store Directory

Every [store](./store.md) has a store directory.

If the store has a [file system representation](./store.md#files-and-processes), this directory contains the store’s [file system objects](#file-system-object), which can be addressed by [store paths](#store-path).

This means a store path is not just derived from the referenced store object itself, but depends on the store the store object is in.

> **Note**
> The store directory defaults to `/nix/store`, but is in principle arbitrary.

It is important which store a given store object belongs to:
Files in the store object can contain store paths, and processes may read these paths.
Nix can only guarantee [referential integrity](store/closure.md) if store paths do not cross store boundaries.

Therefore one can only copy store objects to a different store if

- the source and target stores' directories match

or

- the store object in question has no references, that is, contains no store paths.
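
The two conditions can be expressed as a single predicate. The sketch below is only a restatement of the rule in Python; the function name `can_copy` and its parameters are hypothetical and do not correspond to a Nix API.

```python
from typing import AbstractSet


def can_copy(source_store_dir: str,
             target_store_dir: str,
             references: AbstractSet[str]) -> bool:
    """Return True if a store object may be copied between two stores.

    Copying is allowed if both stores use the same store directory,
    or if the store object contains no store paths at all.
    """
    return source_store_dir == target_store_dir or not references
```

For example, `can_copy("/nix/store", "/opt/store", set())` holds for a store object without references, while the same call with a non-empty set of references does not.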

A store object that does contain store paths cannot be copied to a store with a different store directory.
Instead, it has to be rebuilt, together with all its dependencies.
It is in general not enough to replace the store directory string in file contents, as this may render executables unusable by invalidating their internal offsets or checksums.

Contributor

Suggested change
It is in general not enough to replace the store directory string in file contents, as this may render executables unusable by invalidating their internal offsets or checksums.
It is in general not enough to replace the store directory string in file contents, as it's not possible to make room for a different-length string in all file formats, particularly executables.

I don't like the mention of checksums here because Nix already can and does replace store paths with equal-length ones (for content-addressing and system.replaceRuntimeDependencies) and we just accept the rare possibility of checksum trouble.

Contributor

I tried to encode exactly this without going into too much detail. This is why it says "in general" and "may render unusable". It is still worth mentioning the checksums, because even if the trouble is rare, it's real and conceptually built in.


## Digest

In a [store path](#store-path), the [digest][digest] is the output of a [cryptographic hash function][hash] of either all *inputs* involved in building the referenced store object or its actual *contents*.

Store objects are therefore said to be either [input-addressed](#input-addressing) or [content-addressed](#content-addressing).

> **Historical Note**
> The 20-byte restriction exists because digests were originally [SHA-1][sha-1] hashes.
> Nix now uses [SHA-256][sha-256], and longer hashes are still reduced to 20 bytes for compatibility.

[digest]: https://en.m.wiktionary.org/wiki/digest#Noun
[hash]: https://en.m.wikipedia.org/wiki/Cryptographic_hash_function
[sha-1]: https://en.m.wikipedia.org/wiki/SHA-1
[sha-256]: https://en.m.wikipedia.org/wiki/SHA-256

### Reference scanning

When a new store object is built, Nix scans its file contents for store paths to construct its set of references.

The special format of a store path's [digest](#digest) allows reliably detecting it among arbitrary data.
Nix uses the [closure](store.md#closure) of build inputs to derive the list of allowed store paths, to avoid false positives.

This way, scanning files captures run time dependencies without the user having to declare them explicitly.
Doing it at build time and persisting references in the store object avoids repeating this time-consuming operation.
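
A simplified version of this scan can be sketched as follows. It assumes that searching for the digest part of each candidate store path is sufficient, and it takes file contents as in-memory byte strings; `scan_references` and `digest_part` are hypothetical names for illustration.

```python
from typing import Iterable, Set


def digest_part(store_path: str) -> str:
    """Extract the rendered digest from '<store dir>/<digest>-<name>'."""
    base_name = store_path.rsplit("/", 1)[-1]
    return base_name.split("-", 1)[0]


def scan_references(file_contents: Iterable[bytes],
                    candidates: Iterable[str]) -> Set[str]:
    """Return the candidate store paths whose digest occurs in any file.

    `candidates` plays the role of the allowed store paths derived from
    the closure of build inputs, which keeps false positives out.
    """
    files = list(file_contents)
    found = set()
    for path in candidates:
        needle = digest_part(path).encode()
        if any(needle in data for data in files):
            found.add(path)
    return found
```

Restricting the search to known candidates means that a random byte sequence that merely looks like a digest is never reported as a reference.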

> **Note**
> In practice, it is sometimes still necessary for users to declare certain dependencies explicitly, if they are to be preserved in the build result's closure.
> This depends on the specifics of the software to build and run.
>
> For example, Java programs are compressed after compilation, which obfuscates any store paths they may refer to and prevents Nix from automatically detecting them.

## Input Addressing

Input addressing means that the digest derives from how the store object was produced, namely its build inputs and build plan.

To compute the hash of a store object one needs a deterministic serialisation, i.e., a binary string representation which only changes if the store object changes.

Nix has a custom serialisation format called Nix Archive (NAR).

Contributor

As an aside, it would be really nice to document NAR's binary format somewhere else than the PDF of the thesis.

Contributor

Yes, absolutely. I had typed out the algorithm in Haskell syntax at some point (haven't checked against the code if it's still exactly the same), but we're not at the level of detail here. Feel free to open a pull request!

-- assumes: import qualified Data.Map as Map
serialise fso = str "nix-archive-1" ++ serialise' fso

serialise' fso = str "(" ++ serialise'' fso ++ str ")"

serialise'' (File isExecutable contents) =
  str "type" ++ str "regular" ++ exec ++ str "contents" ++ str contents
  where exec = if isExecutable
               then str "executable" ++ str ""
               else []  -- non-executable files emit nothing here

serialise'' (SymLink target) =
  str "type" ++ str "symlink" ++ str "target" ++ str target

serialise'' (Directory entries) =
  str "type" ++ str "directory" ++ concatMap serialiseEntry (Map.toAscList entries)
  where
    -- entries are emitted in ascending order of their names
    serialiseEntry (name, fso) =
      str "entry" ++ str "("
      ++ str "name" ++ str name
      ++ str "node" ++ serialise' fso
      ++ str ")"

str s = int (length s) ++ pad s
int n = _ -- 64-bit little-endian representation of number `n`
pad s = _ -- byte sequence `s`, padded with 0s to a multiple of 8 bytes
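
For experimentation, the same serialisation can be written as runnable Python. This is a sketch that follows the outline above and has not been checked against the Nix source code; the tagged-tuple representation of file system objects (`("file", is_executable, contents)`, `("symlink", target)`, `("directory", entries)`) and all function names are assumptions made for this example.

```python
import struct


def nar_int(n: int) -> bytes:
    """64-bit little-endian representation of `n`."""
    return struct.pack("<Q", n)


def nar_str(s: bytes) -> bytes:
    """Length prefix, then `s` padded with zero bytes to a multiple of 8."""
    padding = (-len(s)) % 8
    return nar_int(len(s)) + s + b"\0" * padding


def serialise(fso) -> bytes:
    return nar_str(b"nix-archive-1") + _node(fso)


def _node(fso) -> bytes:
    return nar_str(b"(") + _body(fso) + nar_str(b")")


def _body(fso) -> bytes:
    kind = fso[0]
    if kind == "file":
        _, is_executable, contents = fso
        # only executable files carry the extra marker and empty string
        exec_marker = nar_str(b"executable") + nar_str(b"") if is_executable else b""
        return (nar_str(b"type") + nar_str(b"regular") + exec_marker
                + nar_str(b"contents") + nar_str(contents))
    if kind == "symlink":
        _, target = fso
        return nar_str(b"type") + nar_str(b"symlink") + nar_str(b"target") + nar_str(target)
    if kind == "directory":
        _, entries = fso
        out = nar_str(b"type") + nar_str(b"directory")
        for name in sorted(entries):  # sorted names make the output deterministic
            out += (nar_str(b"entry") + nar_str(b"(")
                    + nar_str(b"name") + nar_str(name)
                    + nar_str(b"node") + _node(entries[name])
                    + nar_str(b")"))
        return out
    raise ValueError(f"unknown file system object kind: {kind!r}")
```

Sorting directory entries by name is what makes the result independent of the order in which entries were created, so equal file system objects always serialise to equal byte strings.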


Store object references of this sort can *not* be validated from the content of the store object.
Rather, a cryptographic signature has to be used to indicate that someone is vouching for the store object really being produced from a build plan with that digest.

## Content Addressing

Content addressing means that the digest derives from the store object's contents, namely its file system objects and references.
If one knows content addressing was used, one can recalculate the reference and thus verify the store object.
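
The verification step can be illustrated with a toy digest. The sketch below is deliberately simplified and is not how Nix computes store path digests: it hashes an already serialised byte string with SHA-256 and reduces the result to 20 bytes by XOR-folding, which is an assumption about what "reduced to 20 bytes" means. The function names are hypothetical.

```python
import hashlib


def fold_to_20_bytes(full_hash: bytes) -> bytes:
    """Reduce a longer hash to 20 bytes by XOR-ing bytes onto each other.

    Assumption: this stands in for the reduction of longer hashes
    to 20 bytes; the exact scheme is not specified in this document.
    """
    out = bytearray(20)
    for i, byte in enumerate(full_hash):
        out[i % 20] ^= byte
    return bytes(out)


def toy_content_digest(serialised_contents: bytes) -> bytes:
    """Toy digest that depends only on the serialised contents."""
    return fold_to_20_bytes(hashlib.sha256(serialised_contents).digest())


def verify_content_addressed(claimed_digest: bytes,
                             serialised_contents: bytes) -> bool:
    """Recompute the digest from the contents and compare with the claim."""
    return toy_content_digest(serialised_contents) == claimed_digest
```

No such check exists for input-addressed store objects, because their digest is not a function of the contents; there, trust has to come from a signature.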

Contributor

Slight bit of an aside, but one thing I never understood about content addressing: what says that I actually used a given build plan to produce a store object with this digest? Don't you still need someone to vouch for the fact that this build plan produced that output? Or is the content-addressed digest also a function of the input digest?

Contributor

@Ericson2314 told me there is a database entry in the store that maps content hashes to input hashes. Appears to be a not well-known feature. I suppose reproducing the build would verify the mapping, but that's obviously not guaranteed to succeed, as the build may not be bytewise reproducible.

This is an important detail, but not at this stage. See #6420 (comment) for the outline I envision. The digests should have their own chapter with more structure and details. Please open a pull request if you want to tackle that.

Content addressing is currently only used for the special cases of source files and "fixed-output derivations", where the contents of a store object are known in advance.
Content addressing of build results is still an [experimental feature subject to some restrictions](https://github.com/tweag/rfcs/blob/cas-rfc/rfcs/0062-content-addressed-paths.md).
