DANISH FIELD GRAMMAR IN TYPED PROLOG
Henrik Rue
UNI-C, Danish Computing Center for Research and Education
Vermundsgade 5, DK 2100 Ø, Copenhagen, Denmark

ABSTRACT

This paper describes a field grammar for Danish and its implementation
in a Prolog version with predeclared types. In comparison to the usual
S -> NP VP schema, this kind of grammar, where the first rule is
S -> CNF FF NF CF, enhances analysis efficiency because the fields
specify constituents and syntactic function at the same time. The field
grammar tradition is outlined, and an overview is given of the major
rules of the Prolog program which implements the grammar.

FIELD GRAMMAR

A Syntactic Strategy

In terms of computational linguistics, field grammar may be viewed as a
syntactic strategy which offers the user the immediate constituents
while at the same time giving, at least in part, their syntactic
functions and the functional sentence perspective. Field grammar
furthermore facilitates the handling of discontinuous constituents, as
will be shown.

Background

The field grammar of the Danish linguist Paul Diderichsen adequately
describes constituent structure in Danish, while at the same time
capturing both topicalization and syntactic roles. Diderichsen's grammar
"Elementær Dansk Grammatik" (1946) was developed from the 1940's onwards
with the intention that it should be used as a common framework for
grammar teaching in secondary school as well as at university level.
This grammar has since served as one cornerstone of Danish grammatical
thought.

Diderichsen's grammar is distinguished by a high degree of
formalization, and it is one of the aims of the work presented in this
paper to see how much of the original formalism can be implemented
directly as a Prolog program, and whether it is necessary to make
substantial changes in the definition and inventory of fields in order
to make an executable program.

Prolog Dialect

The Prolog dialect used is the Danish prototype of Borland's
TurboProlog. This is a typed Prolog, and may be termed a hybrid between
Prolog and Pascal. When seeing a sample grammar written in this dialect,
one is impressed by the clarity it achieves: grammatical structures are
statically described in the declaration of types. The dynamic part,
which enables one to get at these structures, consists of the rules of
the program. A further aim of this work, then, is to explore whether
this clarity will also prevail in an elaborate grammar program.

Other Purposes

Apart from the purpose implicit in these aims, we believe that field
theory offers a sound (read: economic) starting point for a great
variety of parsing purposes. As mentioned, the theory offers a
combination of constituent structure analysis with syntactic and
thematic analysis.

This will hold not only for the Scandinavian languages, but presumably
also for other Germanic languages such as English, where one might
abandon the S -> NP VP schema in favour of something along the lines of
the SVC, SVA, SV, SVO etc. clause patterns of Quirk et al. (1972).

In the work presented here, however, there is no exploitation of the
topicalization facilities offered by the grammar.

A DANISH FIELD GRAMMAR

According to Diderichsen, the Danish sentence structure has four major
fields: the connector field, the fundament field, the nexus field and
the content field.

The four types are present in main sentences

    S -> CONN FF NF CF

and three of them in subordinate ones:

    SS -> CONN S-NF CF

where all fields except the nexus field (NF or S-NF) may be empty.

The CONN is the field for conjunctions.

The FF (for Fundament Field, which is the Danish topicalization device)
may contain any complete constituent, which is there as a result of a
movement from its field in the sentence: 'Moderen giver drengen gaven'
vs. 'Gaven giver moderen drengen' ('The mother gives the boy the gift'),
where the second version differs in its thematic content only: it
stresses the direct object as the theme.

The NF, for Nexus Field, contains a finite verb form, a possible
subject, plus adverbials modifying the verb; the internal structure of
the nexus field differs in main and subordinate clauses.

The CF, for Content Field, contains two possible non-finite verb forms,
the objects and predicates, plus adverbial and other modifiers.

The Grammar Declaration

So far the project has implemented field analysis of both main and
subordinate sentences. However, not all topicalizations are handled yet:
in questions, the fundament field may be empty too, but this is not
incorporated in the program, as it remains to be seen whether an
analysis with the finite verb topicalized, that is, moved into the
fundament field, would be more fit for the purpose.

Clause structure

The following declarations describe main and subordinate clauses and
furthermore the internal structure of the major fields:

    S        = s( CONN, FUNDF, NEXUSF, CONTENTF );
               nil;
               s_s( CONN, NEXUSF_S, CONTENTF )

    CONN     = nil;
               konj( KONJ )

    FUNDF    = fundf_n( NOMINAL );      /* No nil */
               fundf_a( ADVERBIAL );
               fundf_i( INF );
               fundf_c( CONTENTF )

    NEXUSF   = nexusf( FINIT, SUBJ, NADV )

    NEXUSF_S = nexusf_s( SUBJ, NADV, FINIT )

    CONTENTF = nil;
               contentf( INFFLD, OBJFLD, CADVFLD )

These are the major fields. They may in turn be divided into subfields:

    INFFLD = nil;
             inffld( INF1, INF2 )

means that Danish has a possibility of two auxiliaries (the finite + one
non-finite), and implicitly that if INF2 is filled, then this will be
the content verb. This treatment is not quite adequate, actually, but it
follows Diderichsen's schema.

    OBJFLD = nil;
             objfld( NOMINAL, PREPG, NOMINAL )

is the object field, which at the moment contains a quick-and-dirty
solution to the problem that the indirect object may be expressed by a
prepositional phrase in Danish, the solution being the incorporation of
an unwarranted PREP subfield.

It should be noted in passing that the connector field in Diderichsen's
formalism is one of the places where the system will not be able to hold
on to the original. This field is part of the schemata not only for
sentences, but also for noun and adverbial phrases, where it may
contain, among other things, a preposition. The system thus has to
distinguish between the two types of connector fields in order to avoid
the generation of spurious analysis results.

Discontinuous Verbal Particles

In Danish some verbs are either prefixed or obligatorily constructed
with a particle, a preposition actually, which moves to the end of the
sentence with all finite forms: 'oplade' ('charge') but 'han lader
batteriet op' ('he charges the battery'); 'lukke op' ('open up') but
'han lukker døren op' ('he opens the-door up'). The same phenomenon
exists in German: 'Peter gab sein rauchen auf'. This is one of the
places where field grammar shows its force as a syntactic strategy,
because the phenomenon of discontinuity is handled in a straightforward
way at the first level of analysis:

    CADVFLD = nil;
              cadvfld( CADF, CADF )

with

    CADF = nil;
           prep( PREP );
           cadf( ADVERBIAL )

where CADF is the field for, among other things, content adverbials, but
also for disjunct verbal
particles. These are accommodated by splitting the original Diderichsen
subfield for content adverbials into two further subfields, one of which
will contain the verbal particle (if any), the other the regular content
adverbials. This is sufficient for the declaration of the grammar; how
our analysis handles the various fields will be shown in a later
section.

Phrasal structure

Syntagmatic structures are also divided into fields. As the system
stands, this is implemented for adverbial phrases, but not yet for noun
phrases. These are at the moment structured in a way that is pretty much
along the NP -> Det AdjP N lines. As regards adverbials, the structure
given is only one of several possible:

    NOMINAL   = nil;
                nominal( ART, ADJEKTIVAL, SUBKERN,
                         PREPP, CS )

    ADVERBIAL = nil;
                adverbial( CONN,
                           DEGREEF, SITUATF, ADVKERN,
                           PREPP, CS )

The CS is a symbol representing subordinate sentences, which have the
form:

    CS = nil;
         cs( S, SYNT )

where S is the field structure, and SYNT the corresponding syntactical
structure of the subordinate sentence represented by the token of the
symbol type CS.

Verb phrases, on the other hand, do not exist as such. Instead we have:

    FINIT   = finit( VERB, VERB, TEMPG )
    INFINIT = infinit( VERB, VERB, TEMPG )
    VERB    = Symbol

which means that a verb, whether it be finite or non-finite, is
described by a structure which consists of 1) the verbal form itself as
it is found in the sentence (the first 'VERB'), 2) a lexical unit (the
second 'VERB', which will be found as a result of the analysis of the
sentence, and which will leave the fields for non-finite forms empty)
and 3) a complex description, TEMPG, of tense, aspect, voice, modality
and the telic/atelic property of the situation described by the verb.
This TEMPG is also used of the sentence as a whole.

In this way a 'FINIT' in a sentence will have either an auxiliary, a
finite verb form missing the verbal prefix, or the full, finite form of
the content verb in the first 'VERB' slot when field analysis is carried
out. The result of the syntactical analysis which follows will be in the
second 'VERB' slot.

Syntax

The system also comprises a syntactic part, based on traditional school
grammar:

    SYNT = synt( SUBJ, VERB, NADV, SUBJPRED,
                 OBJ, OBJPRED, IOBJ, CADV,
                 TEMPG )

where NADV and CADV are the adverbial modifiers of the nexus field and
the content field respectively. The other mnemonics should be
self-evident.

The Dictionary

As the dictionary of the system has not been given much attention yet,
and as it works on a purely ad hoc basis, it will not be treated in this
paper.

ANALYSIS

Analysis runs in two steps, one carrying out the field analysis, the
other handling the syntactical interpretation of the result of the field
analysis.

Field Analysis

Field analysis is carried out by a call to the following major rule:

    is_s( I, O, s( CONN, FUNDF, NEXUSF, CONTENTF ) ):-
        is_forb( I, I1, CONN, FEATC ),
        FEATC <> subord,
        is_fundf( I1, I2, FUNDF ),
        is_nexusf( I2, I3, NEXUSF ),
        is_contentf( I3, O, CONTENTF ).

which applies the following rules in order to succeed (or fail):

    is_fundf( I, O, fundf_n( NOMINAL ) ):-
        is_nomen( I, O, NOMINAL ), I <> O.

    is_fundf( I, O, fundf_a( ADVERBIAL ) ):-
        is_adverbial( I, O, ADVERBIAL, _ ),
        I <> O.

    is_nexusf( I, O, nexusf( FINIT, NOMINAL, ADVERBIAL ) ):-
        is_finit( I, I1, FINIT ),
        is_nomen( I1, I2, NOMINAL, _, _ ),
        is_adverbial( I2, O, ADVERBIAL, _ ).

and

    is_contentf( I, O, contentf( INFFLD, OBJFLD, CADVFLD ) ):-
        is_inffld( I, I1, INFFLD ),
        is_objfld( I1, I2, OBJFLD ),
        is_cadvfld( I2, O, CADVFLD ),
        I <> O.

    is_contentf( I, I, nil ).

As a consequence of having a possible nil-filling for a major field, the
content field, it becomes necessary to explode the number of rules which
identify and collect compound verb forms; in other words, what is gained
in the simplicity of the grammar is lost again in the number of rules.

Discontinuous Verbal Particles

As an example of the rules handling the major fields, we shall take a
look at the rule which picks out discontinuous verbal particles.

The rules which handle the adverbial subfield of the content field
contain a specification for the particles, as they allow for the class
of prepositional adverbs:

    is_cadvfld( I, O, cadvfld( PREPG, C_ADVERBIAL ) ):-
        is_advprep( I, I1, PREPG ),
        is_c_adverbial( I1, O, C_ADVERBIAL ),
        I <> O.

    is_cadvfld( I, O, cadvfld( C_ADVERBIAL, PREPG ) ):-
        is_c_adverbial( I, I1, C_ADVERBIAL ),
        is_advprep( I1, O, PREPG ),
        not_nom( O ), I <> O.

The prepositional adverbs are then picked up by the rule:

    is_advprep( I, O, prep( PREP ) ):-
        fronttoken( I, PREP, O ),
        dic_prep( X ), X = PREP.

which in fact is an ad hoc rule to circumvent the restrictions posed on
the system by the typing facility. During syntactic analysis the
disjunct particles are collected with the verb by the rule
'extract_disco_vpart', as will be demonstrated in the following.

Syntactic Analysis

There is one major clause for syntactic analysis, 'is_syn', which is
called by the top level analysis clause 'start':

    start:-
        write("Skriv en sætning"), nl,         /* "Write a sentence" */
        readln( Line ),
        is_s( Line, "", S ),
        is_syn( S, SYNT ),
        nl, write("Feltanalyse:"), nl,         /* "Field analysis:" */
        skriv_s( S, 0 ), nl,
        nl, write("Syntaktisk analyse:"), nl,  /* "Syntactic analysis:" */
        skriv( SYNT, 0 ), nl, fail.

    is_syn( S, SYNT ):-
        extract_vg( S, VERB1, TEMPG ),
        extract_disco_vpart( VERB1, S, VERB ),
        extract_advg( S, NADV, CADV ),
        interpret_nominals( S, VERB, SUBJ,
                            SUBJPRED, OBJ,
                            OBJPRED, IOBJ ),
        collect_synt( VERB, NADV, SUBJ,
                      SUBJPRED, OBJ, OBJPRED,
                      IOBJ, CADV, TEMPG, SYNT ).

    is_syn( nil, nil ).

The claim was that field grammar facilitates syntactic analysis, and we
shall now endeavour to support this claim by looking at the handling of
the noun phrases.

The major rule is 'interpret_nominals', which has the form:

    interpret_nominals(
        s( _, FUNDF, NEXUSF, CONTENTF ),
        VERB, SUBJ, SUBJPRED,
        OBJ, OBJPRED, IOBJ ):-
        syn_nomfund( FUNDF, NEXUSF, CONTENTF,
                     VERB, SUBJ, SUBJPRED,
                     OBJ, OBJPRED, IOBJ ).

For transitive verbs the following version of a 'syn_nomfund' rule
generates the filler in the fundament field as subject, and two fillers
to the object and indirect object slots; if there is only one filler in
the object subfield this will be the object:

    syn_nomfund(
        fundf_n( FUNDFN_I ),
        nexusf( _, nil, _ ),
        CONTENTF,
        VERB, subj( FUNDFN_O ), nil,
        OBJS, nil, IOBJS ):-
        trans_verb( VERB, DITRANS ),
        check_sentcomp( FUNDFN_I, FUNDFN_O ),
        extract_obj( nil, DITRANS, CONTENTF,
                     OBJS, IOBJS ), !.

where the interesting call is the one to 'extract_obj', where the
following will match (the 'check_sentcomp' in the following rules should
be disregarded, as it has nothing to do with the analysis of the
arguments proper; it only activates a syntactic analysis of a possible
clausal complement to the given nominal kernels):
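
As a separate illustration of the discontinuous particle handling
described earlier: the definition of 'extract_disco_vpart' is not shown
above, so the following is only a minimal sketch of the idea, written in
standard Prolog (for instance SWI-Prolog) and not in the typed Turbo
Prolog dialect. It assumes that a sentence fragment is a list of tokens
rather than a string, so 'fronttoken' is not needed. The predicates
'dic_adv', 'particle_of', 'particle_verb' and 'disco_verb' and the toy
lexicon are hypothetical names introduced here, and the 'not_nom' and
'I <> O' checks of the original rules are left out.

```prolog
% Illustrative sketch in standard Prolog; NOT the paper's Turbo Prolog code.
% A sentence fragment is a list of tokens. As in the rules above, each
% predicate takes an input I and returns the unconsumed rest O.

% Hypothetical toy lexicon (assumed for this example only).
dic_prep(op).                          % particle 'op' ('up')
dic_adv(nu).                           % adverb 'nu' ('now')
particle_verb(lader,  op, oplade).     % finite form + particle -> lexical unit
particle_verb(lukker, op, 'lukke op').

% A prepositional adverb: one token that the lexicon lists as a preposition.
is_advprep([W|O], O, prep(W)) :- dic_prep(W).

% A content adverbial: one adverb token, or nothing at all (nil).
is_c_adverbial([W|O], O, cadf(W)) :- dic_adv(W).
is_c_adverbial(I, I, nil).

% The two-slot field: particle then adverbial, or adverbial then particle.
is_cadvfld(I, O, cadvfld(P, A)) :-
    is_advprep(I, I1, P),
    is_c_adverbial(I1, O, A).
is_cadvfld(I, O, cadvfld(A, P)) :-
    is_c_adverbial(I, I1, A),
    is_advprep(I1, O, P).

% The particle may sit in either slot of the field.
particle_of(cadvfld(prep(P), _), P).
particle_of(cadvfld(_, prep(P)), P).

% Rejoin a finite form with the particle found in the field.
disco_verb(Finite, CAdvFld, Lexeme) :-
    particle_of(CAdvFld, P),
    particle_verb(Finite, P, Lexeme).
```

With these clauses, the query
is_cadvfld([op], [], F), disco_verb(lader, F, V)
first yields F = cadvfld(prep(op), nil) and V = oplade, which corresponds
to the example 'han lader batteriet op'. Because the particle is already
isolated in its own slot by the field analysis, the recombination step
needs only a lexicon lookup.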