parboiled
Elegant parsing in Java and Scala - lightweight, easy-to-use, powerful.
I'd like to create some helper rules that take one rule and add some features to it, for example enforcing that string literals need to be quoted, or adding token position tracking to the token rules / ADTs.
I tried the following syntax (and quite a few permutations).
def quoted[T](rl: Rule1[T]) = rule {
  '"' ~ rl ~ '"'
}
It compiles fine, but as soon as I wire it up, e.g.,
def NodeObjPathEntry: Rule1[CNodeObjPathEntry] = rule {
  WhiteSpace ~ quoted(IdentifierStringUnwrapped) ~ ':' ~ (NodeObjArray | NodeObjObj) ~> CNodeObjPathEntry
}
with the sub-rules:
def IdentifierStringUnwrapped: Rule1[String] = rule {
  clearSB() ~ IdentifierChars ~ push(sb.toString)
}
def IdentifierChars = rule {
  Alpha ~ appendSB() ~ zeroOrMore(AlphaNum ~ appendSB())
}
I get: Illegal rule call: quoted[this.String](this.IdentifierStringUnwrapped)
I could commit to an alternative approach: mix in the primitive token parsers, and then create the variants I need. But I really wanna figure out what is going on.
Source: (StackOverflow)
This is a question both specific to the parboiled parser framework, and to BNF/PEG in general.
Let's say I have the fairly simple regular expression
^\\s*([A-Za-z_][A-Za-z_0-9]*)\\s*=\\s*(\\S+)\\s*$
which represents the pseudo-EBNF of
<line> ::= <ws>? <identifier> <ws>? '=' <nonwhitespace> <ws>?
<ws> ::= (' ' | '\t' | {other whitespace characters})+
<identifier> ::= <identifier-head> <identifier-tail>
<identifier-head> ::= <letter> | '_'
<identifier-tail> ::= (<letter> | <digit> | '_')*
<letter> ::= ('A'..'Z') | ('a'..'z')
<digit> ::= '0'..'9'
<nonwhitespace> ::= ___________
How would you define nonwhitespace (one or more characters that aren't whitespace) in EBNF?
For those of you familiar with the Java parboiled library, how could you implement a rule that defines nonwhitespace?
Source: (StackOverflow)
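For the nonwhitespace question above, a PEG usually expresses "any character that is not whitespace" as a negative lookahead followed by "any character"; the Grappa snippet further down this page uses that shape in oneOrMore(testNot(wsp()), ANY). The sketch below is not parboiled code. It only uses java.util.regex to show what the regular expression from the question accepts, which can serve as a reference when testing a grammar. The class name LineMatcher and the method parse are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of parboiled: applies the regex from the
// question and extracts the identifier and the non-whitespace value.
class LineMatcher {
    // Same pattern as in the question; \S+ is the <nonwhitespace> part.
    private static final Pattern LINE =
        Pattern.compile("^\\s*([A-Za-z_][A-Za-z_0-9]*)\\s*=\\s*(\\S+)\\s*$");

    // Returns {identifier, value}, or null when the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] kv = parse("  answer = 42  ");
        System.out.println(kv[0] + " -> " + kv[1]);
    }
}
```

A line such as "a = b c" is rejected, because the value may not contain whitespace.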
I'm attempting to create a simple XML parser using the parboiled Java library.
The following code attempts to use a variable to verify that the closing tag contains the same identifier as the opening tag.
class SimpleXmlParser2 extends BaseParser<Object> {
    Rule Expression() {
        StringVar id = new StringVar();
        return Sequence(OpenElement(id), ElementContent(), CloseElement(id));
    }
    Rule OpenElement(StringVar id) {
        return Sequence('<', Identifier(), ACTION(id.set(match())), '>');
    }
    Rule CloseElement(StringVar id) {
        return Sequence("</", id.get(), '>');
    }
    Rule ElementContent() {
        return ZeroOrMore(NoneOf("<>"));
    }
    Rule Identifier() {
        return OneOrMore(CharRange('A', 'z'));
    }
}
The above, however, fails with the error message org.parboiled.errors.GrammarException: 'null' cannot be automatically converted to a parser Rule when I create the ParseRunner.
It would appear that I have a basic misunderstanding of how variables should be used in parboiled. Can anyone help me resolve this?
Source: (StackOverflow)
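A likely reading of the error in the question above is a timing issue: id.get() in CloseElement appears to be evaluated while the rule tree is being built, before any input has been matched, so it yields null. The sketch below is plain Java without any parboiled classes and only illustrates that difference between reading a value at build time and reading it later. Holder and EagerVsDeferred are hypothetical names, and the mapping to parboiled's behaviour is an assumption.

```java
import java.util.function.Supplier;

// Hypothetical illustration in plain Java (no parboiled classes): a value read
// while a structure is being built differs from one read when it is used.
class EagerVsDeferred {
    // Minimal stand-in for a mutable variable holder.
    static final class Holder {
        private String value;            // null until set
        void set(String v) { value = v; }
        String get() { return value; }
    }

    public static void main(String[] args) {
        Holder id = new Holder();

        String eager = id.get();               // read at build time: still null
        Supplier<String> deferred = id::get;   // read later, when asked

        id.set("tag");                         // stands in for "during the parse"

        System.out.println(eager);             // the build-time snapshot
        System.out.println(deferred.get());    // the value at use time
    }
}
```

Under that assumption, the closing-tag check would need to run as an action at parse time (comparing the matched text with the stored identifier) rather than being passed as a rule argument.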
I have created a parser class for the parboiled framework according to this simple example:
package my.package;
import org.parboiled.BaseParser;
import org.parboiled.annotations.BuildParseTree;
@BuildParseTree
public class QueryParser extends BaseParser<Object> {
//some rules
}
If I try to create parser as shown in the example
QueryParser parser = Parboiled.createParser(QueryParser.class);
I get an exception at that line:
java.lang.ClassCastException: my.package.QueryParser$$parboiled cannot be cast to org.parboiled.BaseParser
at org.parboiled.Parboiled.createParser(Parboiled.java:56)
...
I'm really not doing anything special that is not done in the example. The only difference is that the parser and the class calling it are in different projects, but I can't imagine why this should matter. The dependencies between the projects (which are Eclipse plugin projects) should be alright.
Can anyone tell what I'm doing wrong or where the mistake could be?
Source: (StackOverflow)
I'm working on a program that uses cglib, included as part of a large package of dependencies (version 2.1_3), and have written a new feature using parboiled processor to do some markdown to html conversion.
The problem arises with a dependency conflict.
If I do nothing, all of my tests for the parboiled feature fail, with messages along the lines of:
java.lang.IncompatibleClassChangeError: org/parboiled/transform/ParserClassNode
And if I include the following exclusion
<exclusion>
    <groupId>asm</groupId>
    <artifactId>asm</artifactId>
</exclusion>
where my big dependency is declared, all of the parboiled tests will pass, but most of the others will fail, with messages like
Caused by: java.lang.NoClassDefFoundError: Could not initialize class net.sf.cglib.proxy.Enhancer
I am using pegdown 1.4.1
Any suggestions? Browsing the internet seems to suggest using a new version of asm (4.0 or later, the one in my project at the moment is 1.5.3) may help, but trying to exclude the asm I have and import the later one didn't help.
Source: (StackOverflow)
So I've been trying to use parboiled2 for the last few weeks now, and it is possibly the most difficult dependency to add to a build that I have come across in my entire life. My current error is a compile (sbt assembly) error:
[error] missing or invalid dependency detected while loading class file 'Prepender.class'.
[error] Could not access type PrependAux in package shapeless,
[error] because it (or its dependencies) are missing. Check your build definition for
[error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
[error] A full rebuild may help if 'Prepender.class' was compiled against an incompatible version of shapeless.
[error] .../Main.scala:56: could not find implicit value for parameter prepender: spray.routing.Prepender[shapeless.HNil,shapeless.::[String,shapeless.HNil]]
[error] path(searchSegment / Segment)(title => get(responder(complete(
[error] ^
It seems that it is simply impossible to make Spray and Parboiled2 play nice together.
I've tried sbt clean and removing my target directories. My build file is basically this:
resolvers ++= Seq(
  "spray repo" at "http://repo.spray.io"
)
val akkaV = "2.3.6"
val sprayV = "1.3.2"
libraryDependencies ++= Seq(
  // If I comment this line, everything works fine.
  "org.parboiled" %% "parboiled" % "2.0.1" withSources() withJavadoc(),
  //
  "org.scalacheck" %% "scalacheck" % "1.12.1" % "test" withSources() withJavadoc(),
  "org.specs2" %% "specs2-core" % "2.4.15" % "test" withSources() withJavadoc(),
  "org.specs2" %% "specs2-scalacheck" % "2.4.15" % "test" withSources() withJavadoc(),
  "org.scalaz" %% "scalaz-core" % "7.1.0" withSources() withJavadoc(),
  //
  "io.spray" %% "spray-json" % "1.3.1" withSources() withJavadoc(),
  "io.spray" %% "spray-can" % sprayV withSources() withJavadoc(),
  "io.spray" %% "spray-routing" % sprayV withSources() withJavadoc(),
  "io.spray" %% "spray-testkit" % sprayV % "test" withSources() withJavadoc(),
  //
  "com.typesafe.akka" %% "akka-actor" % akkaV withSources() withJavadoc(),
  "com.typesafe.akka" %% "akka-testkit" % akkaV % "test" withSources() withJavadoc()
)
scalaVersion := "2.11.4"
javaOptions ++= Seq("-target", "1.8", "-source", "1.8")
My sbtVersion is 0.13.6, and my sbt-assembly version is 0.12.0.
Before upgrading to 2.11 and upgrading my specs2 dependencies I got: parboiled2 and Spray cause conflicting cross-version suffixes
Source: (StackOverflow)
I'm new to PEG parsing and trying to write a simple parser to parse out an expression like: "term1 OR term2 anotherterm" ideally into an AST that would look something like:
                 OR
      -----------|---------
      |                   |
   "term1"       "term2 anotherterm"
I'm currently using Grappa (https://github.com/fge/grappa) but it's not matching even the more basic expression "term1 OR term2". This is what I have:
package grappa;
import com.github.fge.grappa.annotations.Label;
import com.github.fge.grappa.parsers.BaseParser;
import com.github.fge.grappa.rules.Rule;
public class ExprParser extends BaseParser<Object> {
    @Label("expr")
    Rule expr() {
        return sequence(terms(), wsp(), string("OR"), wsp(), terms(), push(match()));
    }
    @Label("terms")
    Rule terms() {
        return sequence(whiteSpaces(),
            join(term()).using(wsp()).min(0),
            whiteSpaces());
    }
    @Label("term")
    Rule term() {
        return sequence(oneOrMore(character()), push(match()));
    }
    Rule character() {
        return anyOf(
            "0123456789" +
            "abcdefghijklmnopqrstuvwxyz" +
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
            "-_");
    }
    @Label("whiteSpaces")
    Rule whiteSpaces() {
        return join(zeroOrMore(wsp())).using(sequence(optional(cr()), lf())).min(0);
    }
}
Can anyone point me in the right direction?
Source: (StackOverflow)
I want to use Parboiled to parse a string that should turn a similar source into different types.
Specifically, I am trying to parse an input of words separated by the same separator into the equivalent of (List[String], String), where the last word is the second element of the tuple.
For example, "a.bb.ccc.dd.e" should be parsed to (["a", "bb", "ccc", "dd"], "e").
A simplified version of my code is as follows:
case class Foo(s: String)
case class Bar(fs: List[Foo], f: Foo)
object FooBarParser extends Parser {
  val SEPARATOR = "."
  def letter: Rule0 = rule { "a" - "z" }
  def word: Rule1[String] = rule { oneOrMore(letter) ~> identity }
  def foo = rule { word ~~> Foo }
  def foos = rule { zeroOrMore(foo, separator = SEPARATOR) }
  def bar = foos ~ SEPARATOR ~ foo ~~> Bar
}
object TestParser extends App {
  val source = "aaa.bbb.ccc"
  val parseResult = ReportingParseRunner(FooBarParser.bar).run(source)
  println(parseResult.result)
}
This prints None, so clearly I am doing something wrong. Is Parboiled capable of parsing this?
Source: (StackOverflow)
I would like to use parboiled2 to parse multiple CSV lines instead of a single CSV String. The result would be something like:
val parser = new CSVRecordParser(fieldSeparator)
io.Source.fromFile("my-file").getLines().map(line => parser.record.run(line))
where CSVRecordParser is my parboiled parser of CSV records. The problem is that, from what I've tried, I cannot do this because parboiled parsers require the input in the constructor, not in the run method. Thus, I can either create a new parser for each line, which is not good, or find a way to pass each new input to the same parser. I tried to hack the parser a bit by making the input a variable and wrapping the parser in another object:
object CSVRecordParser {
  private object CSVRecordParserWrapper extends Parser with StringBuilding {
    val textBase = CharPredicate.Printable -- '"'
    val qTextData = textBase ++ "\r\n"
    var input: ParserInput = _
    var fieldDelimiter: Char = _
    def record = rule { zeroOrMore(field).separatedBy(fieldDelimiter) ~> (Seq[String] _) }
    def field = rule { quotedField | unquotedField }
    def quotedField = rule {
      '"' ~ clearSB() ~ zeroOrMore((qTextData | '"' ~ '"') ~ appendSB()) ~ '"' ~ ows ~ push(sb.toString)
    }
    def unquotedField = rule { capture(zeroOrMore(textData)) }
    def textData = textBase -- fieldDelimiter
    def ows = rule { zeroOrMore(' ') }
  }
  def parse(input: ParserInput, fieldDelimiter: Char): Result[Seq[String]] = {
    CSVRecordParserWrapper.input = input
    CSVRecordParserWrapper.fieldDelimiter = fieldDelimiter
    wrapTry(CSVRecordParserWrapper.record.run())
  }
}
and then just call CSVRecordParser.parse(input, separator) when I want to parse a line. Besides the fact that this is horrible, it doesn't work, and I often get strange errors related to previous usages of the parser. I know this is not the way I should write a parser using parboiled2, and I was wondering what the best way is to achieve this with the library.
Source: (StackOverflow)
The docs for parboiled2 mention the following to get results:
https://github.com/sirthias/parboiled2#access-to-parser-results
val parser = new MyParser(input)
val result = parser.rootRule.run()
However, I get a compilation error when attempting what seems to be that approach.
Here is the outline of the parser:
case class CsvParser(input: ParserInput, delimiter: String = ",") extends Parser {
  ..
  def file = zeroOrMore(line) ~ EOI
}
The code to attempt to run it
val in = new StringBasedParserInput(readFile(fname))
val p = new CsvParser(in)
println(p.toString)
p.file.run
But the "run" is not accepted:
Error:(81, 12) too few argument lists for macro invocation
p.file.run
^
Source: (StackOverflow)
I have created a PEG using the Parboiled library for Java.
I based it off of this example.
It works fine, but now I need to actually create the AST.
My question is how do I do this using the library?
After looking around on Google for a bit and looking at the examples on the Github, I see that you are intended to use push, pop, swap, etc. to create the AST but I am having trouble figuring out how to do this with my parser. My parser is similar enough to the Java one that if you can help me understand how it would work for the Java one, I could adapt it to mine.
Source: (StackOverflow)
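The push/pop idea mentioned in the question above can be illustrated without the library: operands are pushed as leaf nodes, and each operator pops its two operands and pushes one combined node, so that a single root node remains at the end. The sketch below is plain Java with a java.util.Deque standing in for the parser's value stack; ValueStackSketch, Node, pushLeaf and combine are hypothetical names, not parboiled API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch in plain Java (no parboiled classes) of the push/pop idea:
// operands are pushed as leaves; an operator pops two nodes and pushes one.
class ValueStackSketch {
    // Minimal AST node: a label plus optional children.
    static final class Node {
        final String label;
        final Node left;
        final Node right;
        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
        @Override public String toString() {
            return left == null ? label : "(" + label + " " + left + " " + right + ")";
        }
    }

    private final Deque<Node> stack = new ArrayDeque<>();

    void pushLeaf(String text) {
        stack.push(new Node(text, null, null));
    }

    // Pops the right operand first because it was pushed last.
    void combine(String operator) {
        Node right = stack.pop();
        Node left = stack.pop();
        stack.push(new Node(operator, left, right));
    }

    Node result() {
        return stack.peek();
    }

    public static void main(String[] args) {
        // Mimics the order of actions while matching "1*2+3".
        ValueStackSketch s = new ValueStackSketch();
        s.pushLeaf("1");
        s.pushLeaf("2");
        s.combine("*");
        s.pushLeaf("3");
        s.combine("+");
        System.out.println(s.result()); // prints (+ (* 1 2) 3)
    }
}
```

In a grammar, the pushLeaf calls would sit after the rules that match operands and the combine calls after the rules that match an operator plus its right-hand side.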
I have the following program, which executes a parser. This is developed in grappa (a fork of parboiled)
package com.test;
import org.parboiled.Parboiled;
import org.parboiled.parserunners.BasicParseRunner;
import org.parboiled.parserunners.ParseRunner;
import org.parboiled.support.ParsingResult;
public final class SampleRun
{
    public static void main(final String... args)
    {
        // 1. create a parser
        final TestGrammar parser = Parboiled.createParser(TestGrammar.class);
        // 2. create a runner
        final ParseRunner<String> runner
            = new BasicParseRunner<String>(parser.oneLine());
        // 3. collect the result
        @SuppressWarnings("deprecation")
        final ParsingResult<String> result
            = runner.run("sno101 snamegowtham");
        // 4. success or not?
        System.out.println(result.isSuccess());
    }
}
TestGrammar
package com.test;
import com.github.parboiled1.grappa.parsers.EventBusParser;
import org.parboiled.Rule;
import org.parboiled.support.Var;
import java.util.HashMap;
import java.util.Map;
public class TestGrammar
    extends EventBusParser<String>
{
    protected final Map<String, String> collectedValues
        = new HashMap<String, String>();
    protected final Var<String> var = new Var<String>();
    Rule key()
    {
        return sequence(
            firstOf(ignoreCase("sno"), ignoreCase("sname")),
            var.set(match().toLowerCase()),
            !collectedValues.containsKey(var.get())
        );
    }
    Rule separator()
    {
        return optional(anyOf(":-*_ "));
    }
    Rule value()
    {
        return sequence(
            oneOrMore(testNot(wsp()), ANY),
            collectedValues.put(var.get(), match()) == null
        );
    }
    Rule oneLine()
    {
        return join(sequence(key(), separator(), value()))
            .using(oneOrMore(wsp()))
            .min(2);
    }
}
But, I am getting the following error when I try to execute the above program.
Exception in thread "main" java.lang.NoSuchMethodError: org.objectweb.asm.tree.ClassNode.<init>(I)V
at org.parboiled.transform.ParserClassNode.<init>(ParserClassNode.java:50)
at org.parboiled.transform.ParserTransformer.extendParserClass(ParserTransformer.java:93)
at org.parboiled.transform.ParserTransformer.transformParser(ParserTransformer.java:63)
at org.parboiled.Parboiled.createParser(Parboiled.java:64)
at com.test.SampleRun.main(SampleRun.java:15)
I have the following maven dependencies
- grappa-1.0.4.jar
- asm-debug-all-5.0.3.jar
- guava-18.0.jar
- jitescript-0.4.0.jar
Here is my pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.test</groupId>
    <artifactId>parboiledprogram</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
            <groupId>com.github.parboiled1</groupId>
            <artifactId>grappa</artifactId>
            <version>1.0.4</version>
        </dependency>
    </dependencies>
</project>
Note: I am using Eclipse Juno Service Release 2
Some attempts that didn't work
I have noticed a particular icon shown for asm-debug-all-5.0.3.jar; I am not sure what this icon means in Eclipse Juno.
Also, in the pom.xml of the dependency jitescript-0.4.0.jar, I have noticed a relocation of the org.objectweb.asm package. However, the relocated classes also contain the ClassNode(int) constructor. jitescript was last updated on 16-Apr-2014, whereas asm-debug-all-5.0.3 was updated on 24-May-2014.
I have tried removing jitescript.jar, updating the Maven project, and cleaning and rebuilding it, but to no avail.
I have also tested this in Eclipse Kepler without Maven, by manually including all the dependencies listed above, and I still get the same error. This suggests that the problem lies not with Maven but with something else.
Source: (StackOverflow)
I wrote the following hello-world parboiled2 parser:
class MyParser(val input: ParserInput) extends Parser {
  /*
    Expr     <- Sum
    Sum      <- Product (('+') Product)*
    Product  <- Value (('*') Value)*
    Value    <- Constant | '(' Expr ')'
    Constant <- [0-9]+
  */
  def Expr: Rule1[Int] = rule { Sum }
  def Sum: Rule1[Int] = rule { oneOrMore(Product).separatedBy(" + ") ~> ((products: Seq[Int]) => products.sum) }
  def Product: Rule1[Int] = rule { oneOrMore(Value).separatedBy(" * ") ~> ((values: Seq[Int]) => values.product) }
  def Value: Rule1[Int] = rule { Constant | ('(' ~ Expr ~ ')') }
  def Constant: Rule1[Int] = rule { capture(oneOrMore(Digit)) ~> ((digits: String) => digits.toInt) }
}
This works mostly as expected, e.g. it successfully parses "1 + 2" as 3.
If I give it invalid input such as "1 + (2", I would expect the parse to fail. But it actually succeeds, with 1 as the result.
It looks like parboiled2 is only parsing part of the input, and ignoring the remainder that it cannot parse. Is this expected behaviour? Is there any way to force the parser to parse the whole input and fail if it cannot do so?
Source: (StackOverflow)
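The behaviour described in the question above is typical of PEG-style parsers: a rule succeeds as soon as it matches a prefix of the input, unless the top-level rule also demands the end of input (the CSV parser outline earlier on this page ends its root rule with ~ EOI for that reason). The sketch below is a hand-written plain-Java parser for the same small grammar, not parboiled2 code; PrefixVsWholeInput, parsePrefix and parseWhole are hypothetical names.

```java
// Hypothetical plain-Java sketch (not parboiled2): a PEG-style parser matches a
// prefix of the input unless the top-level rule also demands end of input.
class PrefixVsWholeInput {
    private final String input;
    private int pos = 0;

    PrefixVsWholeInput(String input) { this.input = input; }

    // Sum <- Product (" + " Product)*
    Integer sum() {
        Integer total = product();
        if (total == null) return null;
        while (true) {
            int mark = pos;
            if (!literal(" + ")) break;
            Integer next = product();
            if (next == null) { pos = mark; break; } // backtrack, keep the prefix
            total += next;
        }
        return total;
    }

    // Product <- Value (" * " Value)*
    Integer product() {
        Integer total = value();
        if (total == null) return null;
        while (true) {
            int mark = pos;
            if (!literal(" * ")) break;
            Integer next = value();
            if (next == null) { pos = mark; break; }
            total *= next;
        }
        return total;
    }

    // Value <- Constant | '(' Sum ')'
    Integer value() {
        int mark = pos;
        Integer c = constant();
        if (c != null) return c;
        if (literal("(")) {
            Integer inner = sum();
            if (inner != null && literal(")")) return inner;
        }
        pos = mark;
        return null;
    }

    // Constant <- [0-9]+ (no overflow handling in this sketch)
    Integer constant() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        return pos > start ? Integer.valueOf(input.substring(start, pos)) : null;
    }

    boolean literal(String s) {
        if (input.startsWith(s, pos)) { pos += s.length(); return true; }
        return false;
    }

    boolean atEnd() { return pos == input.length(); }

    // Accepts any matching prefix, like the parser in the question.
    static Integer parsePrefix(String text) {
        return new PrefixVsWholeInput(text).sum();
    }

    // Additionally requires that nothing is left over.
    static Integer parseWhole(String text) {
        PrefixVsWholeInput p = new PrefixVsWholeInput(text);
        Integer r = p.sum();
        return (r != null && p.atEnd()) ? r : null;
    }

    public static void main(String[] args) {
        System.out.println(parsePrefix("1 + (2")); // prints 1
        System.out.println(parseWhole("1 + (2"));  // prints null
        System.out.println(parseWhole("1 + 2"));   // prints 3
    }
}
```

The only difference between the two entry points is the end-of-input check, which plays the role that an end-of-input marker plays in a grammar's root rule.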
Some classes in the parboiled framework have a generic type parameter, e.g. the class BaseParser. In its documentation it says:
Type Parameters:
V - the type of the parser values
which I really do not find sufficient as documentation. The documentation in the other classes is similar or missing. Even in the wiki I didn't find information on that; the examples there just use Object.
Can anyone explain to me what this parameter is used for and what are valid types to hand over?
Source: (StackOverflow)
I need to parse an equation and then apply it to values. For example, I would like to parse
(x+4)*y
and then apply it to an array of values for x and y. I am able to use the calculator example to evaluate individual equations like
(3+4)*5
but I am not sure how to make this generic. I am using Parboiled for Java. Any direction would be highly appreciated.
Source: (StackOverflow)
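One way to make the evaluation in the question above generic is to separate parsing from evaluation: parse the text once into a tree (here, a tree of small functions) and then evaluate that tree for each set of variable values. The sketch below is a hand-written plain-Java parser rather than a parboiled grammar, so it only shows the shape of the result that parser actions would need to build. EquationSketch, Expr, compile and eval are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical plain-Java sketch (not parboiled): parse the expression once into
// a tree of functions, then evaluate that tree for many variable bindings.
class EquationSketch {
    // A compiled expression maps variable bindings to a numeric value.
    interface Expr extends ToDoubleFunction<Map<String, Double>> {}

    private final String src;
    private int pos = 0;

    private EquationSketch(String text) { this.src = text.replace(" ", ""); }

    // Parses the whole text or throws IllegalArgumentException.
    static Expr compile(String text) {
        EquationSketch p = new EquationSketch(text);
        Expr e = p.sum();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return e;
    }

    // Convenience for the two variables used in the question.
    static double eval(Expr e, double x, double y) {
        Map<String, Double> vars = new HashMap<>();
        vars.put("x", x);
        vars.put("y", y);
        return e.applyAsDouble(vars);
    }

    private boolean at(char c) { return pos < src.length() && src.charAt(pos) == c; }

    // sum <- product (('+' | '-') product)*
    private Expr sum() {
        Expr left = product();
        while (at('+') || at('-')) {
            boolean plus = src.charAt(pos++) == '+';
            Expr l = left;
            Expr r = product();
            if (plus) left = v -> l.applyAsDouble(v) + r.applyAsDouble(v);
            else left = v -> l.applyAsDouble(v) - r.applyAsDouble(v);
        }
        return left;
    }

    // product <- atom (('*' | '/') atom)*
    private Expr product() {
        Expr left = atom();
        while (at('*') || at('/')) {
            boolean times = src.charAt(pos++) == '*';
            Expr l = left;
            Expr r = atom();
            if (times) left = v -> l.applyAsDouble(v) * r.applyAsDouble(v);
            else left = v -> l.applyAsDouble(v) / r.applyAsDouble(v);
        }
        return left;
    }

    // atom <- number | variable | '(' sum ')'
    private Expr atom() {
        if (at('(')) {
            pos++;
            Expr inner = sum();
            if (!at(')')) throw new IllegalArgumentException("Expected ')' at " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (pos == start) throw new IllegalArgumentException("Unexpected input at " + pos);
        String token = src.substring(start, pos);
        if (Character.isDigit(token.charAt(0))) {
            double value = Double.parseDouble(token);
            return vars -> value;
        }
        // Looked up at evaluation time; an unbound variable fails with a NullPointerException.
        return vars -> vars.get(token);
    }

    public static void main(String[] args) {
        Expr e = compile("(x+4)*y");
        double[][] inputs = { {3, 5}, {1, 2} };
        for (double[] in : inputs) {
            System.out.println(eval(e, in[0], in[1]));
        }
    }
}
```

With x = 3 and y = 5 this evaluates to 35.0, the same value as (3+4)*5 from the question, and the compiled expression can be reused for every further pair of values.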