Skip to content

nsu-syspro/haskell-applicative-parser

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

37 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Haskell: Applicative parser

points bar

Guidelines

Guidelines

When solving the homework, strive to create not just code that works, but code that is readable and concise. Try to write small functions which perform just a single task, and then combine those smaller pieces to create more complex functions.

Don’t repeat yourself: write one function for each logical task, and reuse functions as necessary.

Don't be afraid to introduce new functions where you see fit.

Sources

Each task has corresponding source file in src directory where you should implement the solution.

Building

All solutions should compile without warnings with following command:

stack build

Testing

You can and should run automated tests before pushing solution to GitHub via

stack test --test-arguments "-p TaskX"

where X in TaskX should be number of corresponding Task to be tested.

So to run all test for the first task you should use following command:

stack test --test-arguments "-p Task1"

You can also run tests for all tasks with just

stack test

Debugging

For debugging you should use GHCi via stack:

stack ghci

You can then load your solution for particular task using :load TaskX command.

Here is how to load Task1 in GHCi:

$ stack ghci
ghci> :load Task1
[1 of 1] Compiling Task1 ( .../src/Task1.hs, interpreted )
Ok, one module loaded.

Note: if you updated solution, it can be quickly reloaded in the same GHCi session with :reload command

ghci> :reload

Preface

In this assignment you will implement applicative parser from scratch and use it together with custom library of "parser combinators" to parse a couple of different formats, like JSON.

It is recommended for tasks to be implemented in order.

This should give you the fundamental understanding of how industry-standard parser libraries (such as parsec, megaparsec and attoparsec) are implemented and how to efficiently use them in real applications.

To learn more in-depth about how to use these libraries and other more advanced features of Haskell, check out tutorial Parser Combinators in Haskell.

Task 1 (3 points)

Parser

In file src/Parser.hs you will find the base for the parser:

-- | Parser of value of type @a@
newtype Parser a = Parser { runParser :: Input -> Parsed a }

In essence parser is a function that processes its input string and either returns parsed value in a more structured format together with remaining string to parse, or fails with some parsing error, ideally indicating where the parsing failed and why (the quality of error reporting directly correlates with ease of use of your parser).

So to keep track of current position and remaining input to parse, parser needs to receive and return both position and the string that is being parsed:

-- | Parser input encapsulating remaining string to be parsed with current position
type Input = Position String

-- | Parsing result of value of type @a@
data Parsed a =
    Parsed a Input           -- ^ Successfully parsed value of type @a@ with remaining input to be parsed
  | Failed [Position Error]  -- ^ Failed to parse value of type @a@ with accumulated list of errors

Note that instead of just single error, Failed constructor accepts list of errors, each with its own position where the error occurred. This will be useful for situations, when parser will have to try several possible alternatives and all of them failed to parse.

The data type Position a encapsulates annotation of value of type a with position:

-- | Value annotated with position of parsed input starting from 0
data Position a = Position Int a

Finally, Error data type represents supported parsing errors:

-- | Parsing error
data Error =
    Unexpected Char -- ^ Unexpected character
  | EndOfInput      -- ^ Unexpected end of input

Tip

Feel free to add more error kinds to simplify debugging for yourself!

Automated tests only check the fact of success or failure, not the exact errors that are reported (see parseMaybe below).

Runners

Instead of exposing runParser and its implementation details to the end user, it is always a good idea to provide separate public functions that will run parser on particular string.

The first function that you should implement is parse, which runs given Parser on provided string, absolving the user from wrapping the string into Input for runParser:

parse :: Parser a -> String -> Parsed a

The second function to implement is parseMaybe, which hides the details further by discarding errors and the rest of the input to be parsed, resulting in simple Maybe value:

parseMaybe :: Parser a -> String -> Maybe a

Applicative

Next you need to implement instances of Functor, Applicative and Alternative for Parser.

Note

When implementing Alternative instance (particularly (<|>) operator), make sure to accumulate errors if both parsers fail. It is also useful to deduplicate errors while accumulating them.

satisfy

The most basic building block for more complex parsers is function satisfy (for you to implement) which parses single character if it satisfies given predicate:

satisfy :: (Char -> Bool) -> Parser Char

Example:

>>> parse (satisfy (>= 'b')) "foo"
Parsed 'f' (Position 1 "oo")
>>> parse (satisfy (>= 'b')) "bar"
Parsed 'b' (Position 1 "ar")
>>> parse (satisfy (>= 'b')) "abc"
Failed [Position 0 (Unexpected 'a')]
>>> parse (satisfy (>= 'b')) ""
Failed [Position 0 EndOfInput]

Important

The constructor for Parser and runParser are intentionally not exposed outside of src/Parser.hs to encourage you to use only satisfy and Applicative/Alternative instances to implement all of the other functions below.

Parser combinators

Now it is time to implement your own library of parser combinators in src/ParserCombinators.hs, where we have already provided you with a skeleton of usual combinator ideas. However, you should improve and extend this library as you solve the rest of the tasks.

Integers

The last part of this first task is to implement two integer parsers in src/Task1.hs:

  • nat for natural numbers
    nat :: Parser Integer
    Example:
    >>> parse nat "0"
    Parsed 0 (Input 1 "")
    >>> parse nat "123"
    Parsed 123 (Input 3 "")
    >>> parse nat "-123"
    Failed [PosError 0 (Unexpected '-')]
    >>> parse nat "abc"
    Failed [PosError 0 (Unexpected 'a')]
    >>> parse nat "123abc"
    Parsed 123 (Input 3 "abc")
  • int for integer numbers
    int :: Parser Integer
    Example:
    >>> parse int "0"
    Parsed 0 (Input 1 "")
    >>> parse int "123"
    Parsed 123 (Input 3 "")
    >>> parse int "-123"
    Parsed (-123) (Input 4 "")
    >>> parse int "abc"
    Failed [PosError 0 (Unexpected 'a')]
    >>> parse int "123abc"
    Parsed 123 (Input 3 "abc")

Task 2 (3 points)

The second task is to construct parser for dates in one of three common formats given as BNF:

date ::= dotFormat | hyphenFormat | usFormat

dotFormat ::= day "." month "." year
hyphenFormat ::= day "-" month "-" year
usFormat ::= monthName " " usDay " " year

usDay ::= nonZeroDigit | "1" digit | "2" digit | "30" | "31"
day ::= "0" nonZeroDigit | "1" digit | "2" digit | "30" | "31"
month ::= "0" nonZeroDigit | "10" | "11" | "12"
year ::= number

number ::= digit | number digit
digit ::= "0" | nonZeroDigit
nonZeroDigit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

monthName ::= "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"

Here are examples of valid dates:

  • 01.01.2012 (dot format)
  • 12.12.2012 (dot format)
  • 01-01-2012 (hyphen format)
  • 12-12-2012 (hyphen format)
  • Jan 1 2012 (US format)
  • Dec 12 2012 (US format)

Note that because the formats are given as context-free grammar, they cannot distinguish between months that contain different number of days and detect leap years (that is not the job of parser after all).

So the following dates are also considered valid by grammar:

  • Feb 31 2012
  • 01.01.00000000

Date

In src/Task2.hs you will find following data type Date to represent dates as well as newtype wrappers for each date component:

data Date = Date Day Month Year
  deriving (Show, Eq)

newtype Day   = Day   Int deriving (Show, Eq)
newtype Month = Month Int deriving (Show, Eq)
newtype Year  = Year  Int deriving (Show, Eq)

Parsing

Your task is to implement the following parser date:

date :: Parser Date

Example:

>>> parse date "01.01.2012"
Parsed (Date (Day 1) (Month 1) (Year 2012)) (Input 10 "")
>>> parse date "12.12.2012"
Parsed (Date (Day 12) (Month 12) (Year 2012)) (Input 10 "")
>>> parse date "12-12-2012"
Parsed (Date (Day 12) (Month 12) (Year 2012)) (Input 10 "")
>>> parse date "Dec 12 2012"
Parsed (Date (Day 12) (Month 12) (Year 2012)) (Input 11 "")
>>> parse date "Jan 1 2012"
Parsed (Date (Day 1) (Month 1) (Year 2012)) (Input 10 "")
>>> parse date "Feb 31 2012"
Parsed (Date (Day 31) (Month 2) (Year 2012)) (Input 11 "")
>>> parse date "12/12/2012"
Failed [PosError 2 (Unexpected '/'),PosError 0 (Unexpected '1')]

Task 3 (4 points)

The last task is to construct parser for JSON format.

Important

You will need to implement the full JSON grammar as it is described at https://www.json.org.

Pay close attention to whitespace handling, number format (with fraction and exponent) and escape sequences.

JSON

In src/Task3.hs you will find following data type JValue to represent JSON values:

data JValue =
    JObject [(String, JValue)]
  | JArray [JValue]
  | JString String
  | JNumber Double
  | JBool Bool
  | JNull
 deriving (Show, Eq)

And in directory samples you can find some JSON samples which you can use to test your implementation, as well as their expected rendered format in .expected files (see section Rendering below).

Parsing

Your task is to implement the following parser json:

json :: Parser JValue

Example:

>>> parse json "{}"
Parsed (JObject []) (Input 2 "")
>>> parse json "null"
Parsed JNull (Input 4 "")
>>> parse json "true"
Parsed (JBool True) (Input 4 "")
>>> parse json "3.14"
Parsed (JNumber 3.14) (Input 4 "")
>>> parse json "{{}}"
Failed [PosError 0 (Unexpected '{'),PosError 1 (Unexpected '{')]

Rendering

Also in src/Task3.hs we provided a set of rendering utilities, which should help you debug and verify correctness of your parser.

For example, to parse file samples/wiki.json using your parser into Parsed JValue, you can run following command in stack ghci:

>>> parseJSONFile "samples/wiki.json"
Parsed (JObject [("first_name",JString "John"),("last_name",JString "Smith"),...) (Position 460 "")

Or to get a nicely rendered representation which can be directly compared with samples/wiki.json.expected:

>>> renderJSONFile "samples/wiki.json" >>= putStrLn
{"first_name": "John", "last_name": "Smith", ...}

All .expected files directly correspond to the format in which render function displays given JValue, so you can use it as a reference.

Note that when rendering JValue directly in GHCi, you will end up with escaped version of rendered string. To avoid that, you can pass this string to function putStrLn which will print you unescaped representation:

>>> render (JArray [JString "abc"])
"[\"abc\"]"
>>> putStrLn (render (JArray [JString "abc"]))
["abc"]

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Haskell 100.0%