I have a MySQL schema like the one below:
data: {
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT '' COMMENT 'the name',
`content` text COMMENT 'something',
}
Now I want to extract some information from it: the field name, the type, and the comment, if any. The result should look like this:
["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ]
My code is:
parse data [
    any [
        thru {`} copy field to {`} {`}
        thru some space copy field-type to [ {(} | space]
        (comm: "")
        opt [ thru {COMMENT} thru some space thru {'} copy comm to {'}]
        (repend temp field repend temp field-type either comm [ repend temp comm ][ repend temp ""])
    ]
]
But I get something like this instead:
["id" "int" "the name" "content" "text" "something"]
I know the opt [...] line is not right. What I want to express is this: if the COMMENT keyword is found first, extract the comment; if a line feed is found first, continue with the next iteration of the loop. But I don't know how to express that. Can anyone help?
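On why the original attempt goes astray: thru {COMMENT} is not confined to the current line, so for a column without a comment (such as id) it runs ahead to the next COMMENT in the input and swallows the following column on the way. A minimal sketch of that behaviour, assuming Rebol 2 and a cut-down, hypothetical sample string (sample is not part of the original data):

```rebol
; Hypothetical two-column sample: the first column has no COMMENT.
sample: {`id` int(10),^/`name` varchar(10) COMMENT 'the name',}

parse/all sample [
    thru "`" copy field to "`" skip
    ; thru scans forward across the newline to the COMMENT on the next line
    thru "COMMENT" thru "'" copy comm to "'"
    to end
]

print [field comm] ; prints: id the name
```

This is why id ends up paired with the comment that belongs to name, and why name itself goes missing from the result.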
Where possible, I much favour building up a set of grammar rules from positive terms that match the target input; I find this more literate, precise, flexible and easier to debug. In your snippet above, we can identify five core components:
space: use [space][
    space: charset "^-^/ "
    [some space]
]
word: use [letter][
    letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
    [some letter]
]
id: use [letter][
    letter: complement charset "`"
    [some letter]
]
number: use [digit][
    digit: charset "0123456789"
    [some digit]
]
string: use [char][
    char: complement charset "'"
    [any [some char | "''"]]
]
With the terms defined, writing a rule that describes the grammar of the input is relatively trivial:
result: collect [
    parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
        opt space
        some [
            (field: type: none comment: copy "")
            "`" copy field id "`"
            space
            copy type word opt ["(" number ")"]
            any [
                space [
                    "COMMENT" space "'" copy comment string "'"
                    | word | "'" string "'" | number
                ]
            ]
            opt space "," (keep reduce [field type comment])
            opt space
        ]
    ]
]
As an added bonus, this also validates the input: parsed? is true only when the whole of data matches the rules.
if parsed? [new-line/all/skip result true 3]
One wee application of new-line to smarten things up a little should yield:
== [
    "id" "int" ""
    "name" "varchar" "the name"
    "content" "text" "something"
]