Change Position to Int #162

Open
jamesdbrock opened this issue Mar 24, 2022 · 3 comments

Comments

@jamesdbrock
Member

Delete Position { column :: Int, line :: Int } and replace it with Int representing the position index from the beginning of the input. For String, the position index would be in units of CodePoints.
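To illustrate what "units of CodePoints" means for a `String` index, here is a small sketch. The module name and the `sample` value are made up for illustration; the two `length` functions are the ones from `Data.String.CodePoints` and `Data.String.CodeUnits`.

```purescript
module IndexUnitsSketch where

import Data.String.CodePoints as CodePoints
import Data.String.CodeUnits as CodeUnits

-- The emoji lies outside the Basic Multilingual Plane, so it is a single
-- code point but takes two UTF-16 code units.
sample :: String
sample = "😀!"

-- Number of code points: the emoji and "!" each count once.
codePointCount :: Int
codePointCount = CodePoints.length sample

-- Number of UTF-16 code units: the emoji counts twice.
codeUnitCount :: Int
codeUnitCount = CodeUnits.length sample
```

Here `codePointCount` is 2 and `codeUnitCount` is 3, so a position index of 1 would point at the "!" under the proposal, not at the second half of the emoji.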

Delete the updatePosString and updatePosSingle functions.

-- | Updates a `Position` by adding the columns and lines in `String`.
updatePosString :: Position -> String -> String -> Position
updatePosString pos before after = case uncons before of
  Nothing -> pos
  Just { head, tail } -> do
    let
      newPos
        | String.null tail = updatePosSingle pos head after
        | otherwise = updatePosSingle pos head tail
    updatePosString newPos tail after

-- | Updates a `Position` by adding the columns and lines in a
-- | single `CodePoint`.
updatePosSingle :: Position -> CodePoint -> String -> Position
updatePosSingle (Position { line, column }) cp after = case fromEnum cp of
  10 -> Position { line: line + 1, column: 1 } -- "\n"
  13 ->
    case codePointAt 0 after of
      Just nextCp | fromEnum nextCp == 10 -> Position { line, column } -- "\r\n" lookahead
      _ -> Position { line: line + 1, column: 1 } -- "\r"
  9 -> Position { line, column: column + 8 - ((column - 1) `mod` 8) } -- "\t" Who says that one tab is 8 columns?
  _ -> Position { line, column: column + 1 }

updatePosString assumes (through updatePosSingle) that tab stops fall every 8 columns, and the library user has no way to change that behavior. So I think updatePosString has always been fundamentally broken.

We want to provide a way to track the line and column during the parse so that

  1. We can write indentation-sensitive parsers.
  2. We can report the line and column in a ParseError.
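One possible way to serve both goals with an `Int` position is to derive the line and column on demand from the index, with the tab width supplied by the caller. The following is only a sketch under assumptions: `lineColumnAt`, `LineColumn`, and `Scan` are hypothetical names, not library API, and "\r\n" is handled by counting the "\r" as the line break and ignoring an immediately following "\n", which differs slightly from the lookahead in updatePosSingle.

```purescript
module LineColumnSketch where

import Prelude

import Data.Enum (fromEnum)
import Data.Foldable (foldl)
import Data.String.CodePoints (CodePoint, take, toCodePointArray)

-- | Hypothetical result type, not part of the library.
type LineColumn = { line :: Int, column :: Int }

-- | Fold state: the current line and column, plus whether the previous
-- | code point was "\r".
type Scan = { line :: Int, column :: Int, afterCR :: Boolean }

-- | 1-based line and column of the code point at `index` in `input`,
-- | with a tab stop every `tabWidth` columns (assumed to be positive).
lineColumnAt :: Int -> Int -> String -> LineColumn
lineColumnAt tabWidth index input =
  { line: final.line, column: final.column }
  where
  final :: Scan
  final = foldl step { line: 1, column: 1, afterCR: false } consumed

  -- Only the code points before `index` affect the result.
  consumed :: Array CodePoint
  consumed = toCodePointArray (take index input)

  step :: Scan -> CodePoint -> Scan
  step s cp = case fromEnum cp of
    10
      | s.afterCR -> s { afterCR = false } -- "\n" of "\r\n", already counted
      | otherwise -> { line: s.line + 1, column: 1, afterCR: false } -- "\n"
    13 -> { line: s.line + 1, column: 1, afterCR: true } -- "\r"
    9 -> s { column = s.column + tabWidth - ((s.column - 1) `mod` tabWidth), afterCR = false } -- "\t"
    _ -> s { column = s.column + 1, afterCR = false }
```

For example, `lineColumnAt 4 3 "\tab\n"` gives line 1, column 7, while `lineColumnAt 8 3 "\tab\n"` gives line 1, column 11. The cost is a rescan of the input up to the index, which seems acceptable for error reporting but might be too slow for an indentation-sensitive parser that asks for the column often.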

The Text.Parsing.Indent module is used by some packages so we should try to keep it.

@jamesdbrock
Member Author

Text.Parsing.Indent is based on

https://hackage.haskell.org/package/indents-0.3.3/docs/Text-Parsec-Indent.html

but the author of the indents library seems to have changed their mind and later versions are quite different

https://hackage.haskell.org/package/indents-0.5.0.1/docs/Text-Parsec-Indent-Explicit.html

@jamesdbrock
Member Author

Reporting the line and column in a ParseError depends on what the indentation algorithm is, which should be defined by an indentation-sensitive parser.

I like the idea of an indentation-sensitive parser which is expressed as a transformer of ParserT.

https://hackage.haskell.org/package/indents-0.5.0.1/docs/Text-Parsec-Indent.html

So we have something like an IndentT transformer which contains the line and column state.

In the event of a parsing failure, the ParseError must also convey the line and column state of the IndentT.

@jamesdbrock
Member Author

jamesdbrock commented Mar 24, 2022

How can we get the indentation level out of IndentParser state and include it in a ParseError?

I really don’t want to parameterize ParseError but maybe that would be the best way.

withPos could include a region which adds the indentation information to the error message string. For that, region would have to be changed to pass the current ParseState to the function:

region :: forall m s a. Monad m => (ParseState -> ParseError -> ParseError) -> ParserT s m a -> ParserT s m a
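To make the shape of that change concrete, here is a toy sketch. Everything in it is a simplified stand-in and an assumption: `Parser`, `ParseState`, and `ParseError` below are not the library's real definitions, the position is a plain `Int` as proposed in this issue, and `withStartIndex` is a made-up helper. The state handed to the function is the state at the start of the region, since this toy parser does not keep the state at the point of failure.

```purescript
module RegionSketch where

import Prelude

import Data.Either (Either(..))
import Data.Tuple (Tuple)

-- | Simplified stand-ins for the library types; all hypothetical.
type ParseState s = { input :: s, position :: Int }

data ParseError = ParseError String Int

-- | A toy parser: a function from a state to either an error or a
-- | result paired with the new state.
newtype Parser s a = Parser (ParseState s -> Either ParseError (Tuple a (ParseState s)))

-- | Like `region`, but the error-rewriting function also receives the
-- | state that was current when the wrapped parser started.
region :: forall s a. (ParseState s -> ParseError -> ParseError) -> Parser s a -> Parser s a
region context (Parser p) = Parser \state ->
  case p state of
    Left err -> Left (context state err)
    Right ok -> Right ok

-- | Prefix the error message with the index at which the region began.
-- | An indentation-aware variant could add the indentation level instead.
withStartIndex :: forall s a. Parser s a -> Parser s a
withStartIndex = region \state (ParseError msg pos) ->
  ParseError ("In region starting at index " <> show state.position <> ": " <> msg) pos
```

With this shape ParseError itself stays unparameterized; the extra context only ends up in the message string.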

For starters I think we should change the definition of ParseError from ParseError String Position to ParseError String ParseState.

This was referenced Apr 1, 2022