Destiny Project Notebook, Folio 5

Toward handling general Boolean queries

October 1998

I have been thinking about extending the knowledge marshalling algorithm completed last month. The algorithm as it stands works well for queries that are simple conjunctives (a series of conditions, all of which must be true for something to satisfy the query). Extending the process to handle queries formed from general Boolean expressions (involving arbitrarily complex constructions using "and", "or", and "not") has entailed many false starts and crumpled sheets of diagrams.

It was necessary to define a language in which I would express such queries. An easy task, since I am still confining myself to the realm of combinations of query items (phrases) that look like knowledge items with variables in place of some of the subjects. Single query items can be inverted by prefacing them with the unary operator ".NOT.". Two query items can be conjoined using the operator ".AND.", or disjoined using ".OR.", with the operator placed between the two items. In addition, the use of parentheses to enclose complex expressions allows operators to treat such expressions as a single query item, as in:

<%2><is a><car>.AND.(<%2><color><white>.OR.<%2><color><green>.OR.<%2><color><silver>)

Following the usual conventions, .NOT. has higher precedence than .AND., which has higher precedence than .OR..

Bill was kind enough to tutor me in LALR and recursive descent parsing during one of our evenings together this month. That and readings in the "Dragon Book" (the standard text on the principles of compiler design) have enabled me to formulate the few simple rules necessary to parse this language:

Translated roughly, the first rule means "An expression that may or may not involve an 'or' is composed either of a single expression that may or may not involve an 'and' or a pair of expressions separated by the '.OR.' operator, the first of which may or may not involve an 'and' and the second of which may or may not involve an 'or'". Similarly, the second rule means "An expression that may or may not involve an 'and' is composed either of a single expression that may or may not be a simple query expression or a pair of expressions separated by the '.AND.' operator, the first of which may or may not be a simple query expression and the second of which may or may not involve an 'and'". The third rule means "An expression which may of may not be a simple query expression is formed either of a single query item, or by the '.NOT.' operator followed by an expression which may or may not be a simple query expression, or by an expression, enclosed in parentheses, which may or may not involve an 'or'". Query items themselves are irreducible in the context of this discussion; they are the basic building blocks of which queries are made.

Three Phases of Analysis (from the Dragon Book)
1 Linear (Lexical) Analysis Breaks character stream into an ordered list of tokens.
2 Hierarchical Analysis (parsing, syntax) Groups tokens into nested collections with collective meaning.
3 Semantic Analysis Performs checks to ensure that the token collections fit together meaningfully.

 

Language Elements
<subject>
knowledge subject Something about which a memory can know something
<%n>
variable Stands in for a set of subjects for which membership is unknown at the time of the query.
(
begin subexpression Marks the start of the specification of a subexpression. Subexpressions are evaluated prior to expressions involving them.
)
end subexpression Completes a subexpression.
.AND.
and operator Conjoins two subexpressions.
.OR.
or operator Disjoins two subexpressions.
.NOT.
not operator Negates a subexpression.

Another question is that of what structure to use to represent the relationships between the query elements. Some sort of tree structure, certainly. What is easiest to get out of the parsing process is what I call the Query Syntax tree, a single tree with nodes for each of the operators and leaves for the query items, each of which appears once:

[Note that in tree form, AND and OR nodes can represent the conjunction or disjunction of more than two subexpressions.]

To resolve the questions of which subjects qualify for membership in the various sets represented by the variables, however, Variable Pool trees centered on the variables are more appropriate:

Each Variable Pool tree has the same structure as the Query Syntax tree, but elements are removed that do not relate to the particular variable . In fact, a Variable Pool tree for a particular variable can be formed by a three-step process:

  1. Make a copy of the Query Syntax tree
  2. Remove all knowledge items that do not mention the variable
  3. Collapse nodes, according to the following rules:

To determine the most efficient way to process the query, some kind of cost estimates must be formed. Initially, I had planned on estimating costs from the Query Syntax tree. The problem with that approach, however, is that the Query Syntax tree doesn't "know" whether branches have variable pools in common. In database terms, an inner join looks the same as an outer join, even though there is a huge cost difference. "Narrowing" of the pools (elimination of candidates), including recosting of specific query associations, is governed by the Variable Pool trees.


Speech and Emotion

4 November 1998

Internally, when they are not observed by players, NPCs could use a formal language to ease processing requirements of the game engine.

In general, the words of a language must be able to express:

Linguistic modifiers must also be able to express:

Hard problems include:

Individuals have different default modifier intensity assumptions both in what they put out and in what they apply to the communications they receive (in the absence of more explicit modifier intensities). In fact, mismatches in these assumptions between speaker and listener are the source of many intriguing misunderstandings, and even humor.

Bill had the idea of having the Orcs, townspeople, etc. "speak" in formal languages, perhaps derived from the common (internal) formal language. Players, with modest effort, could learn these formal languages if they wished (or hire translators, or, depending on the technology level, buy universal translators to take along with them) to communicate with NPCs or listen in on NPCs' conversations.


Characteristic Emotions or Attitudes from the D & D handbook tables for generating non-player characters

(Parenthetic remarks ours)

Personality


More on Speech and Emotion

4 November 1998

We should start with a small subset of emotions, but which ones?

Emotions, while perceptible to the listener, are not necessarily directed at the listener. They reflect the state of the communicator.

Emotional content is more universal (easier to read, at a rough level) among entities in the world than is factual content (when expressed with language, tone of voice, body language, etc.).

Conscious vs. subconscious content. What one wishes to express differs or conflicts with what one expresses in other channels. Some channels are less susceptible to conscious manipulation.

Another guide to the emotions most needed in communication would be the acronyms and emoticons most commonly used in online chat rooms, e.g. <g>, LOL, :-). For more on emoticons, see The Unofficial Smiley Dictionary.

One could perhaps devise a genetic algorithm to develop the ability to learn other formal languages among the denizens of the world.


Constructing Query Trees

7 November 1998

Here is a query to be parsed:

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
$
NOT
A
AND
B
OR
C
AND
NOT
D
AND
(
(
NOT
E
)
OR
F
)
$

The construction of the Query Syntax tree, step by step as each token is encountered:

0
Start of query. Create an initial subtree node. Initialize the focus at that node.
1
NOT operator is a unary operator. Create a node for the NOT operator, attaching it as a child of the focus operator. Set the focus to the new node.
2
Create a leaf to hold the query item (A). Attach it as child to the focus node. The focus node is a NOT, which is satisfied by having one child, so set focus to the parent node of the current focus node.
3

AND operator is higher precedence than operator at focus. Create a node for the AND operator, inserting it between the focus operator and its most recent child. Set the focus to the new node.

4

Create a leaf to hold the query item (B). Attach it as child to the focus node. Because the focus node is not satisfied, the focus stays where it is.

[AND and OR operators can parent any number of children, so they are never satisfied.]

5

OR operator has lower precedence than focus operator, so climb tree until another OR or some lower-precedence operator is encountered. In this case, the initial subtree operator gets the focus because it is lower in precedence than an OR.

OR operator is higher precedence than operator at focus. Create a node for the OR operator, attaching it as a child of the focus operator. Set the focus to the new node.

6
Create a leaf to hold the query item (C). Attach it as child to the focus node. Because the focus node is not satisfied, the focus stays where it is.
7
AND operator is higher precedence than operator at focus. Create a node for the AND operator, inserting it between the focus operator and its most recent child. Set the focus to the new node.
8
NOT operator is a unary operator. Create a node for the NOT operator, attaching it as a child of the focus operator. Set the focus to the new node.
9
Create a leaf to hold the query item (D). Attach it as child to the focus node. The focus node is a NOT, which is satisfied by having one child, so set focus to the parent node of the current focus node.
10
AND operator is the same as the focus operator. No new node need be created, and the focus remains unchanged.
11
Open parenthesis signals the beginning of a subtree. Create a subtree node and attach it as a child of the focus node. Set the focus to the new node.
12
Open parenthesis signals the beginning of a subtree. Create a subtree node and attach it as a child of the focus node. Set the focus to the new node.
13
NOT operator is a unary operator. Create a node for the NOT operator, attaching it as a child of the focus operator. Set the focus to the new node.
14
Create a leaf to hold the query item (E). Attach it as child to the focus node. The focus node is a NOT, which is satisfied by having one child, so set focus to the parent node of the current focus node.
15

Close parenthesis satisfies a subtree node. Trace up the tree to the nearest subtree node. In this case we don't have to move anywhere because the focus is already at a subtree node.

Collapse the subtree node by assigning its child to its parent and removing the node. Set the focus at the parent of the removed node.

16

OR operator is lower in precedence that the focus operator, so we would like to climb the tree until we find an operator with precedence or OR or less, but we are not permitted to climb out of a subtree node without having encountered a close parenthesis. For this reason the focus remains at the subreee node.

Create a node for the OR operator, inserting it between the focus operator and its most recent child. Set the focus to the new node.

17
Create a leaf to hold the query item (F). Attach it as child to the focus node. Because the focus node is not satisfied, the focus stays where it is.
18

Close parenthesis satisfies a subtree node. Trace up the tree to the nearest subtree node.

Collapse the subtree node by assigning its child to its parent and removing the node. Set the focus at the parent of the removed node.

19

End of query. Climb the tree to the top. If any subtree nodes are encountered along the way (that are not at the top), then error because of unbalanced parentheses.

Remove the initial subtree node, setting the top-of-tree pointer to the child of the removed node.


Emotion Research Results
letter from Chris to Bill

11 November 1998

According to a psychology text of Elsa's, all emotions can be characterized in three primary ways: in their intensity, in their complexity, and in their degree of pleasantness or unpleasantness.

The autonomic (involuntary) nervous system expresses emotions in one of two ways. In a sympathetic expression, the pupils dilate; the eyes, nose, and throat become drier; the heart beats faster, breathing becomes more rapid, bowel and urination control tighten, the skin cools, and sweat production increases. In a parasympathetic reaction, the opposite effects are evidenced. In addition, there is some evidence that after an extreme expression of either expression type, there is a subsequent period characterized by the other expression type.

In studies of infants, Watson (1930) identified three "unlearned" emotions, developmentally the earliest emotions: rage, fear, and joy. Before these emotions are acquired there is a prior developmental stage in which activity can be differentiated only by degree of general excitement.

Plutchik (1980) identified eight "primary" emotions: fear, surprise, sadness, disgust, anger, anticipation, joy, and acceptance. These are typically arranged in this order around a circle.

This arrangement places complementary emotions opposite one another (probably no accident, and probably a reliable indicator as to the artificiality of the arrangement). According to Plutchik, other emotions can be described in a "paint-mixing" model using the eight primaries. Awe = fear + surprise, disappointment = surprise + sadness, remorse = sadness + disgust, contempt = disgust + anger, aggression = anger + anticipation, optimism = anticipation + joy, love = joy + acceptance, submission = acceptance + fear. Tertiary combinations are also possible: jealousy = love + anger + fear was given as an example. The model seems rife with inconsistency: if jealousy is love + anger + fear and love = joy + acceptance then given that submission = acceptance + fear then jealousy = love + submission + anger, which seems implausible. I'm also unconvinced that aggression is an emotion (a behavior type, yes, but an emotion?). In this respect, acceptance and anticipation also seem on the fringes of what an emotion is, rather than central.

It does seem that some emotions are more anticipatory and others are more reactive, so the notion of anticipation is not without its place in some taxonomy of emotions, it just isn't an emotion by itself. Perhaps attempting to build a taxonomy of emotions should be one of our first steps. (Nothing difficult about that, is there?)


Emotion Research Results
letter from Bill to Chris

12 November 1998

Wow! Now this is the kind of thing we can use! Given, it's all just theory and not nearly complex enough to be "real," (as well as being rife with the inconsistencies you mentioned), but it gives us ideas as to how to create our own models.

It seems that Plutchik's model tries to define emotions in terms of four variables: the Fear <--> Anger variable, the Anticipation <--> Surprise variable, the Joy <--> Sadness variable, and the Acceptance <--> Disgust variable.

His "paint-mixing" is really a matter of having the variables set at different levels. While I believe that this is the "kind" of model that we want, the specific variables are not what we want.

I agree with you on the notion of "agression." Agression may be a strategical behavior, but it is not an emotion per se. Rather, it is a means of acting according to one's emotions.

Another thing that the particular variable set that he has created does not take into account is that some of the variables have a strong association with other variables that are otherwise not indicated. In particular, if someone is feeling any degree of fear or anger, there is good reason to believe that the entity is not in a joyful state.

Nevertheless, it shows that a great number of secondary emotions can be generated from a rather small set of primary emotions.

Regarding anticipation, I suspect that emotions related to anticipation are as a result of foiled plans, or successful implementation of plans, or a predeliction to create plans. I would not expect to see 'anticipation-related' emotions until there exists some degree of either planning or prediction-making. As such, they should be rare in infants.

Hmm... If an emotion does not manifest itself in an infant, can it be considered a "primal" emotion? Perhaps all 'anticipatory-related' emotions are simply complex emotions or emotions that are directed at certain objects? (Love seems to be one such 'directed' emotion. It is an emotion, true, but it is not felt in a vacuum; only as directed towards the object of love or hate).


Back to the Destiny documents

Back to the Previous Folio of the Notebook

Onward to the Next Folio of the Notebook